* [RFC PATCH] ARM: smp: Fix the CPU hotplug race with scheduler.
  From: Santosh Shilimkar @ 2011-06-20  9:23 UTC
  To: linux-arm-kernel

The current ARM CPU hotplug code suffers from a couple of race conditions
in the CPU online path with the scheduler.
The ARM CPU hotplug code doesn't wait for the hot-plugged CPU to be marked
active (as part of cpu_notify() by the CPU which brought it up) before
enabling interrupts. So we end up with a couple of race conditions:
 1) Interrupts are enabled even before the CPU is marked as active.
 2) The newly plugged CPU is marked as active but is not marked online yet.

When an interrupt happens before the cpu_active bit is set, the scheduler
can't schedule the woken thread which is bound to that newly onlined CPU
and selects a fallback runqueue. Secondly, marking a CPU active before it
is online is also not desirable behaviour.

Fix these race conditions.

Signed-off-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Russell King <rmk+kernel@arm.linux.org.uk>
---
On a v3.0 kernel I started seeing lock-ups and random crashes when CPU
online/offline was attempted aggressively. With git bisect I reached
Peter's commit e4a52bcb9a18142d79e231b6733cabdbf2e67c1f [sched: Remove
rq->lock from the first half of ttwu()], which was also reported by Marc
Zyngier. But even after using the follow-up fix from Peter, d6aa8f85f163
[sched: Fix ttwu() for __ARCH_WANT_INTERRUPTS_ON_CTXSW], I was still
seeing issues in the hotplug path.

So as an experiment I pushed the interrupt enabling on the newly plugged
CPU down to after it is marked as online. This made things better and much
more stable, but occasionally I was still seeing lock-ups.

With the above as background I looked at the arch/x86/ code and convinced
myself that the experimental hack could be the right fix. While doing this
I came across a commit from Thomas, fd8a7de177b [x86: cpu-hotplug: Prevent
softirq wakeup on wrong CPU], which fixed race 2) on the x86 architecture.

In this patch I have folded possible fixes for both race conditions in the
ARM hotplug code, as mentioned in the changelog. Hopefully I am not
introducing any new race with this patch, hence the RFC.

 arch/arm/kernel/smp.c |   18 +++++++++++-------
 1 files changed, 11 insertions(+), 7 deletions(-)

diff --git a/arch/arm/kernel/smp.c b/arch/arm/kernel/smp.c
index 344e52b..84373a9 100644
--- a/arch/arm/kernel/smp.c
+++ b/arch/arm/kernel/smp.c
@@ -302,13 +302,6 @@ asmlinkage void __cpuinit secondary_start_kernel(void)
 	platform_secondary_init(cpu);
 
 	/*
-	 * Enable local interrupts.
-	 */
-	notify_cpu_starting(cpu);
-	local_irq_enable();
-	local_fiq_enable();
-
-	/*
 	 * Setup the percpu timer for this CPU.
 	 */
 	percpu_timer_setup();
@@ -322,6 +315,17 @@ asmlinkage void __cpuinit secondary_start_kernel(void)
 	 */
 	set_cpu_online(cpu, true);
 
+	while (!cpumask_test_cpu(smp_processor_id(), cpu_active_mask))
+		cpu_relax();
+
+	/*
+	 * Enable local interrupts.
+	 */
+	notify_cpu_starting(cpu);
+	local_irq_enable();
+	local_fiq_enable();
+
 	/*
	 * OK, it's off to the idle thread for us
	 */
-- 
1.6.0.4
* [RFC PATCH] ARM: smp: Fix the CPU hotplug race with scheduler.
  From: Russell King - ARM Linux @ 2011-06-20  9:50 UTC
  To: linux-arm-kernel

On Mon, Jun 20, 2011 at 02:53:59PM +0530, Santosh Shilimkar wrote:
> The current ARM CPU hotplug code suffers from a couple of race conditions
> in the CPU online path with the scheduler.
> The ARM CPU hotplug code doesn't wait for the hot-plugged CPU to be marked
> active (as part of cpu_notify() by the CPU which brought it up) before
> enabling interrupts.

Hmm, why not just move the set_cpu_online() call before notify_cpu_starting()
and add the wait after the set_cpu_online() ?
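For readers following along, here is a rough sketch of the reordering being
suggested, written against the tail of secondary_start_kernel() in
arch/arm/kernel/smp.c. It is illustrative only, not a tested patch, and the
surrounding context is approximate:

	/*
	 * Sketch of the suggested ordering: publish the CPU as online
	 * first, wait until the booting CPU's notifiers have also marked
	 * it active, and only then run the starting notifiers and enable
	 * interrupts on this CPU.
	 */
	set_cpu_online(cpu, true);

	/* Wait for the CPU that brought us up to mark us active. */
	while (!cpumask_test_cpu(cpu, cpu_active_mask))
		cpu_relax();

	notify_cpu_starting(cpu);
	local_irq_enable();
	local_fiq_enable();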
* [RFC PATCH] ARM: smp: Fix the CPU hotplug race with scheduler.
  From: Russell King - ARM Linux @ 2011-06-20 10:14 UTC
  To: linux-arm-kernel

On Mon, Jun 20, 2011 at 10:50:53AM +0100, Russell King - ARM Linux wrote:
> On Mon, Jun 20, 2011 at 02:53:59PM +0530, Santosh Shilimkar wrote:
> > The current ARM CPU hotplug code suffers from a couple of race conditions
> > in the CPU online path with the scheduler.
> > The ARM CPU hotplug code doesn't wait for the hot-plugged CPU to be marked
> > active (as part of cpu_notify() by the CPU which brought it up) before
> > enabling interrupts.
>
> Hmm, why not just move the set_cpu_online() call before notify_cpu_starting()
> and add the wait after the set_cpu_online() ?

Actually, the race is caused by the CPU being marked online (and therefore
available for the scheduler) but not yet active (the CPU asking this one
to boot hasn't run the online notifiers yet.)

This, I feel, is a fault of generic code. If the CPU is not ready to have
processes scheduled on it (because migration is not initialized) then we
shouldn't be scheduling processes on the new CPU yet.

In any case, this should close the window by ensuring that we don't receive
an interrupt in the online-but-not-active case. Can you please test?

 arch/arm/kernel/smp.c |    8 +++++++-
 1 files changed, 7 insertions(+), 1 deletions(-)

diff --git a/arch/arm/kernel/smp.c b/arch/arm/kernel/smp.c
index 344e52b..e34d750 100644
--- a/arch/arm/kernel/smp.c
+++ b/arch/arm/kernel/smp.c
@@ -318,9 +318,15 @@ asmlinkage void __cpuinit secondary_start_kernel(void)
 	smp_store_cpu_info(cpu);
 
 	/*
-	 * OK, now it's safe to let the boot CPU continue
+	 * OK, now it's safe to let the boot CPU continue.  Wait for
+	 * the CPU migration code to notice that the CPU is online
+	 * before we continue.
 	 */
+	local_irq_disable();
 	set_cpu_online(cpu, true);
+	while (!cpumask_test_cpu(cpu, cpu_active_mask))
+		cpu_relax();
+	local_irq_enable();
 
 	/*
	 * OK, it's off to the idle thread for us
* [RFC PATCH] ARM: smp: Fix the CPU hotplug race with scheduler.
  From: Santosh Shilimkar @ 2011-06-20 10:28 UTC
  To: linux-arm-kernel

On 6/20/2011 3:44 PM, Russell King - ARM Linux wrote:
> On Mon, Jun 20, 2011 at 10:50:53AM +0100, Russell King - ARM Linux wrote:
>> On Mon, Jun 20, 2011 at 02:53:59PM +0530, Santosh Shilimkar wrote:
>>> The current ARM CPU hotplug code suffers from a couple of race conditions
>>> in the CPU online path with the scheduler.
>>> The ARM CPU hotplug code doesn't wait for the hot-plugged CPU to be marked
>>> active (as part of cpu_notify() by the CPU which brought it up) before
>>> enabling interrupts.
>>
>> Hmm, why not just move the set_cpu_online() call before notify_cpu_starting()
>> and add the wait after the set_cpu_online() ?
>
> Actually, the race is caused by the CPU being marked online (and therefore
> available for the scheduler) but not yet active (the CPU asking this one
> to boot hasn't run the online notifiers yet.)
>
The scheduler uses the active mask, not the online mask. For the scheduler,
a CPU is ready for migration as soon as it is marked active, and that is
the reason interrupts should never be enabled before the CPU is marked
active in the online path.

> This, I feel, is a fault of generic code. If the CPU is not ready to have
> processes scheduled on it (because migration is not initialized) then we
> shouldn't be scheduling processes on the new CPU yet.
>
> In any case, this should close the window by ensuring that we don't receive
> an interrupt in the online-but-not-active case. Can you please test?
>
No, it doesn't work. I still get the crash. The important point here is
not to enable interrupts before the CPU is marked as online and active.

Regards
Santosh
* [RFC PATCH] ARM: smp: Fix the CPU hotplug race with scheduler. 2011-06-20 10:28 ` Santosh Shilimkar @ 2011-06-20 10:35 ` Russell King - ARM Linux 2011-06-20 10:45 ` Santosh Shilimkar 2011-06-20 10:44 ` Russell King - ARM Linux 1 sibling, 1 reply; 31+ messages in thread From: Russell King - ARM Linux @ 2011-06-20 10:35 UTC (permalink / raw) To: linux-arm-kernel On Mon, Jun 20, 2011 at 03:58:03PM +0530, Santosh Shilimkar wrote: > On 6/20/2011 3:44 PM, Russell King - ARM Linux wrote: >> On Mon, Jun 20, 2011 at 10:50:53AM +0100, Russell King - ARM Linux wrote: >>> On Mon, Jun 20, 2011 at 02:53:59PM +0530, Santosh Shilimkar wrote: >>>> The current ARM CPU hotplug code suffers from couple of race conditions >>>> in CPU online path with scheduler. >>>> The ARM CPU hotplug code doesn't wait for hot-plugged CPU to be marked >>>> active as part of cpu_notify() by the CPU which brought it up before >>>> enabling interrupts. >>> >>> Hmm, why not just move the set_cpu_online() call before notify_cpu_starting() >>> and add the wait after the set_cpu_online() ? >> >> Actually, the race is caused by the CPU being marked online (and therefore >> available for the scheduler) but not yet active (the CPU asking this one >> to boot hasn't run the online notifiers yet.) >> > Scheduler uses the active mask and not online mask. For schedules CPU > is ready for migration as soon as it is marked as active and that's > the reason, interrupts should never be enabled before CPU is marked > as active in online path. > >> This, I feel, is a fault of generic code. If the CPU is not ready to have >> processes scheduled on it (because migration is not initialized) then we >> shouldn't be scheduling processes on the new CPU yet. >> >> In any case, this should close the window by ensuring that we don't receive >> an interrupt in the online-but-not-active case. Can you please test? >> > No it doesn't work. I still get the crash. The important point > here is not to enable interrupts before CPU is marked > as online and active. But we can't do that. ^ permalink raw reply [flat|nested] 31+ messages in thread
* [RFC PATCH] ARM: smp: Fix the CPU hotplug race with scheduler.
  From: Santosh Shilimkar @ 2011-06-20 10:45 UTC
  To: linux-arm-kernel

On 6/20/2011 4:05 PM, Russell King - ARM Linux wrote:
> On Mon, Jun 20, 2011 at 03:58:03PM +0530, Santosh Shilimkar wrote:
>> On 6/20/2011 3:44 PM, Russell King - ARM Linux wrote:

[...]

>> The scheduler uses the active mask, not the online mask. For the scheduler,
>> a CPU is ready for migration as soon as it is marked active, and that is
>> the reason interrupts should never be enabled before the CPU is marked
>> active in the online path.
>>
>>> In any case, this should close the window by ensuring that we don't receive
>>> an interrupt in the online-but-not-active case. Can you please test?
>>>
>> No, it doesn't work. I still get the crash. The important point here is
>> not to enable interrupts before the CPU is marked as online and active.
>
> But we can't do that.

Why is that? Is it because of the delay calibration, or because the hotplug
starting notifiers need to be called with interrupts enabled?

Regards
Santosh
* [RFC PATCH] ARM: smp: Fix the CPU hotplug race with scheduler.
  From: Santosh Shilimkar @ 2011-06-20 11:42 UTC
  To: linux-arm-kernel

On 6/20/2011 4:15 PM, Santosh Shilimkar wrote:
> On 6/20/2011 4:05 PM, Russell King - ARM Linux wrote:
>> On Mon, Jun 20, 2011 at 03:58:03PM +0530, Santosh Shilimkar wrote:

[...]

>>> No, it doesn't work. I still get the crash. The important point here is
>>> not to enable interrupts before the CPU is marked as online and active.
>>
>> But we can't do that.
>
> Why is that? Is it because of the delay calibration, or because the hotplug
> starting notifiers need to be called with interrupts enabled?
>
BTW, how is ARM different from x86 here? The x86 code seems to do something
similar to what my patch is trying to do for ARM. Some pointers would help
me understand why we can't delay the interrupt enabling in the ARM hotplug
code.

Regards
Santosh
* [RFC PATCH] ARM: smp: Fix the CPU hotplug race with scheduler.
  From: Russell King - ARM Linux @ 2011-06-20 10:44 UTC
  To: linux-arm-kernel

On Mon, Jun 20, 2011 at 03:58:03PM +0530, Santosh Shilimkar wrote:
> No, it doesn't work. I still get the crash. The important point here is
> not to enable interrupts before the CPU is marked as online and active.

What is the crash (in full please)?

Do we know what interrupt is causing it?
* [RFC PATCH] ARM: smp: Fix the CPU hotplug race with scheduler.
  From: Santosh Shilimkar @ 2011-06-20 10:47 UTC
  To: linux-arm-kernel

On 6/20/2011 4:14 PM, Russell King - ARM Linux wrote:
> On Mon, Jun 20, 2011 at 03:58:03PM +0530, Santosh Shilimkar wrote:
>> No, it doesn't work. I still get the crash. The important point here is
>> not to enable interrupts before the CPU is marked as online and active.
>
> What is the crash (in full please)?
>
> Do we know what interrupt is causing it?

Yes. It's because of an interrupt and the CPU active-online race.
Here is the crash log..

[   21.025451] CPU1: Booted secondary processor
[   21.025451] CPU1: Unknown IPI message 0x1
[   21.029113] Switched to NOHz mode on CPU #1
[   21.029174] BUG: spinlock lockup on CPU#1, swapper/0, c06220c4
[   21.029235] [<c0064704>] (unwind_backtrace+0x0/0xf4) from [<c028edc8>] (do_raw_spin_lock+0xd0/0x164)
[   21.029266] [<c028edc8>] (do_raw_spin_lock+0xd0/0x164) from [<c00cc3c4>] (tick_do_update_jiffies64+0x3c/0x118)
[   21.029296] [<c00cc3c4>] (tick_do_update_jiffies64+0x3c/0x118) from [<c00ccb04>] (tick_check_idle+0xb0/0x110)
[   21.029327] [<c00ccb04>] (tick_check_idle+0xb0/0x110) from [<c00a29cc>] (irq_enter+0x68/0x70)
[   21.029327] [<c00a29cc>] (irq_enter+0x68/0x70) from [<c00623c4>] (ipi_timer+0x24/0x40)
[   21.029357] [<c00623c4>] (ipi_timer+0x24/0x40) from [<c0051368>] (do_local_timer+0x54/0x70)
[   21.029388] [<c0051368>] (do_local_timer+0x54/0x70) from [<c048a09c>] (__irq_svc+0x3c/0x120)
[   21.029388] Exception stack(0xef87bf78 to 0xef87bfc0)
[   21.029388] bf60:                                     00000000 00026ec0
[   21.029418] bf80: c0622080 ffff7483 c0622080 ffff7483 ef87a000 00000000 c0622080 411fc092
[   21.029418] bfa0: c063a4f0 00000000 00000001 ef87bfc0 c0482e08 c0482b0c 60000113 ffffffff
[   21.029449] [<c048a09c>] (__irq_svc+0x3c/0x120) from [<c0482b0c>] (calibrate_delay+0x8c/0x1d4)
[   21.029479] [<c0482b0c>] (calibrate_delay+0x8c/0x1d4) from [<c0482e08>] (secondary_start_kernel+0x110/0x1ac)
[   21.029510] [<c0482e08>] (secondary_start_kernel+0x110/0x1ac) from [<c0070ee4>] (platform_cpu_die+0x34/0x54)
[   22.021362] CPU1: failed to come online
[   23.997955] CPU1: failed to come online
[   25.000122] BUG: spinlock lockup on CPU#0, kthreadd/663, efa27e64
[   25.006408] [<c0064704>] (unwind_backtrace+0x0/0xf4) from [<c028edc8>] (do_raw_spin_lock+0xd0/0x164)
[   25.015808] [<c028edc8>] (do_raw_spin_lock+0xd0/0x164) from [<c048985c>] (_raw_spin_lock_irqsave+0x4c/0x58)
[   25.025848] [<c048985c>] (_raw_spin_lock_irqsave+0x4c/0x58) from [<c008ba24>] (complete+0x1c/0x5c)
[   25.035095] [<c008ba24>] (complete+0x1c/0x5c) from [<c00baf78>] (kthread+0x68/0x90)
[   25.042968] [<c00baf78>] (kthread+0x68/0x90) from [<c005dfdc>] (kernel_thread_exit+0x0/0x8)
* [RFC PATCH] ARM: smp: Fix the CPU hotplug race with scheduler.
  From: Russell King - ARM Linux @ 2011-06-20 11:13 UTC
  To: linux-arm-kernel

On Mon, Jun 20, 2011 at 04:17:58PM +0530, Santosh Shilimkar wrote:
> Yes. It's because of an interrupt and the CPU active-online race.

I don't see that as a conclusion from this dump.

> Here is the crash log..
> [   21.025451] CPU1: Booted secondary processor
> [   21.025451] CPU1: Unknown IPI message 0x1
> [   21.029113] Switched to NOHz mode on CPU #1
> [   21.029174] BUG: spinlock lockup on CPU#1, swapper/0, c06220c4

That's the xtime seqlock. We're trying to update the xtime from CPU1,
which is not yet online and not yet active. That's fine, we're just
spinning on the spinlock here, waiting for the other CPUs to release it.

But what this is saying is that the other CPUs aren't releasing it.
The cpu hotplug code doesn't hold the seqlock either. So who else is
holding this lock, causing CPU1 to time out on it.

The other thing is that this is only supposed to trigger after about
one second:

	u64 loops = loops_per_jiffy * HZ;
	for (i = 0; i < loops; i++) {
		if (arch_spin_trylock(&lock->raw_lock))
			return;
		__delay(1);
	}

which from the timings you have at the beginning of your printk lines
is clearly not the case - it's more like 61us.

Are you running with those h/w timer delay patches?
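For readers doing the arithmetic, the 61us figure presumably comes from the
printk timestamps in the log above: 21.029174 s - 21.029113 s = 0.000061 s,
i.e. about 61 microseconds between the "Switched to NOHz" message and the
lockup warning, whereas loops_per_jiffy * HZ iterations of a trylock plus
__delay(1) are meant to keep the detector spinning for roughly a full second
before it warns.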
* [RFC PATCH] ARM: smp: Fix the CPU hotplug race with scheduler. 2011-06-20 11:13 ` Russell King - ARM Linux @ 2011-06-20 11:25 ` Santosh Shilimkar 2011-06-20 11:40 ` Russell King - ARM Linux 0 siblings, 1 reply; 31+ messages in thread From: Santosh Shilimkar @ 2011-06-20 11:25 UTC (permalink / raw) To: linux-arm-kernel On 6/20/2011 4:43 PM, Russell King - ARM Linux wrote: > On Mon, Jun 20, 2011 at 04:17:58PM +0530, Santosh Shilimkar wrote: >> Yes. It's because of interrupt and the CPU active-online >> race. > > I don't see that as a conclusion from this dump. > >> Here is the chash log.. >> [ 21.025451] CPU1: Booted secondary processor >> [ 21.025451] CPU1: Unknown IPI message 0x1 >> [ 21.029113] Switched to NOHz mode on CPU #1 >> [ 21.029174] BUG: spinlock lockup on CPU#1, swapper/0, c06220c4 > > That's the xtime seqlock. We're trying to update the xtime from CPU1, > which is not yet online and not yet active. That's fine, we're just > spinning on the spinlock here, waiting for the other CPUs to release > it. > > But what this is saying is that the other CPUs aren't releasing it. > The cpu hotplug code doesn't hold the seqlock either. So who else is > holding this lock, causing CPU1 to time out on it. > > The other thing is that this is only supposed to trigger after about > one second: > > u64 loops = loops_per_jiffy * HZ; > for (i = 0; i< loops; i++) { > if (arch_spin_trylock(&lock->raw_lock)) > return; > __delay(1); > } > > which from the timings you have at the beginning of your printk lines > is clearly not the case - it's more like 61us. > > Are you running with those h/w timer delay patches? Nope. Regards Santosh ^ permalink raw reply [flat|nested] 31+ messages in thread
* [RFC PATCH] ARM: smp: Fix the CPU hotplug race with scheduler. 2011-06-20 11:25 ` Santosh Shilimkar @ 2011-06-20 11:40 ` Russell King - ARM Linux 2011-06-20 11:51 ` Santosh Shilimkar 2011-06-20 14:23 ` Russell King - ARM Linux 0 siblings, 2 replies; 31+ messages in thread From: Russell King - ARM Linux @ 2011-06-20 11:40 UTC (permalink / raw) To: linux-arm-kernel On Mon, Jun 20, 2011 at 04:55:43PM +0530, Santosh Shilimkar wrote: > On 6/20/2011 4:43 PM, Russell King - ARM Linux wrote: >> On Mon, Jun 20, 2011 at 04:17:58PM +0530, Santosh Shilimkar wrote: >>> Yes. It's because of interrupt and the CPU active-online >>> race. >> >> I don't see that as a conclusion from this dump. >> >>> Here is the chash log.. >>> [ 21.025451] CPU1: Booted secondary processor >>> [ 21.025451] CPU1: Unknown IPI message 0x1 >>> [ 21.029113] Switched to NOHz mode on CPU #1 >>> [ 21.029174] BUG: spinlock lockup on CPU#1, swapper/0, c06220c4 >> >> That's the xtime seqlock. We're trying to update the xtime from CPU1, >> which is not yet online and not yet active. That's fine, we're just >> spinning on the spinlock here, waiting for the other CPUs to release >> it. >> >> But what this is saying is that the other CPUs aren't releasing it. >> The cpu hotplug code doesn't hold the seqlock either. So who else is >> holding this lock, causing CPU1 to time out on it. >> >> The other thing is that this is only supposed to trigger after about >> one second: >> >> u64 loops = loops_per_jiffy * HZ; >> for (i = 0; i< loops; i++) { >> if (arch_spin_trylock(&lock->raw_lock)) >> return; >> __delay(1); >> } >> >> which from the timings you have at the beginning of your printk lines >> is clearly not the case - it's more like 61us. >> >> Are you running with those h/w timer delay patches? > Nope. Ok. So loops_per_jiffy must be too small. My guess is you're using an older kernel without 71c696b1 (calibrate: extract fall-back calculation into own helper). The delay calibration code used to start out by setting: loops_per_jiffy = (1<<12); This will shorten the delay right down, and that's probably causing these false spinlock lockup bug dumps. Arranging for IRQs to be disabled across the delay calibration just avoids the issue by preventing any spinlock being taken. The reason that CPU#0 also complains about spinlock lockup is that for some reason CPU#1 never finishes its calibration, and so the loop also times out early on CPU#0. Of course, fiddling with this global variable in this way is _not_ a good idea while other CPUs are running and using that variable. We could also do with implementing trigger_all_cpu_backtrace() to get backtraces from the other CPUs when spinlock lockup happens... ^ permalink raw reply [flat|nested] 31+ messages in thread
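As a rough back-of-the-envelope illustration of "shorten the delay right
down" (the exact numbers depend on HZ and CPU speed, so treat this as an
estimate only): with the old fallback of loops_per_jiffy = 1 << 12 = 4096
and, say, HZ = 100, the lockup detector spins for only 4096 * 100, about
4.1e5, iterations of trylock plus __delay(1), which completes far sooner
than the intended ~1 s; and if loops_per_jiffy were still zero, the inner
loop would not execute at all, so the warning would fire immediately.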
* [RFC PATCH] ARM: smp: Fix the CPU hotplug race with scheduler. 2011-06-20 11:40 ` Russell King - ARM Linux @ 2011-06-20 11:51 ` Santosh Shilimkar 2011-06-20 12:19 ` Russell King - ARM Linux 2011-06-20 14:23 ` Russell King - ARM Linux 1 sibling, 1 reply; 31+ messages in thread From: Santosh Shilimkar @ 2011-06-20 11:51 UTC (permalink / raw) To: linux-arm-kernel On 6/20/2011 5:10 PM, Russell King - ARM Linux wrote: > On Mon, Jun 20, 2011 at 04:55:43PM +0530, Santosh Shilimkar wrote: >> On 6/20/2011 4:43 PM, Russell King - ARM Linux wrote: >>> On Mon, Jun 20, 2011 at 04:17:58PM +0530, Santosh Shilimkar wrote: >>>> Yes. It's because of interrupt and the CPU active-online >>>> race. >>> >>> I don't see that as a conclusion from this dump. >>> >>>> Here is the chash log.. >>>> [ 21.025451] CPU1: Booted secondary processor >>>> [ 21.025451] CPU1: Unknown IPI message 0x1 >>>> [ 21.029113] Switched to NOHz mode on CPU #1 >>>> [ 21.029174] BUG: spinlock lockup on CPU#1, swapper/0, c06220c4 >>> >>> That's the xtime seqlock. We're trying to update the xtime from CPU1, >>> which is not yet online and not yet active. That's fine, we're just >>> spinning on the spinlock here, waiting for the other CPUs to release >>> it. >>> >>> But what this is saying is that the other CPUs aren't releasing it. >>> The cpu hotplug code doesn't hold the seqlock either. So who else is >>> holding this lock, causing CPU1 to time out on it. >>> >>> The other thing is that this is only supposed to trigger after about >>> one second: >>> >>> u64 loops = loops_per_jiffy * HZ; >>> for (i = 0; i< loops; i++) { >>> if (arch_spin_trylock(&lock->raw_lock)) >>> return; >>> __delay(1); >>> } >>> >>> which from the timings you have at the beginning of your printk lines >>> is clearly not the case - it's more like 61us. >>> >>> Are you running with those h/w timer delay patches? >> Nope. > > Ok. So loops_per_jiffy must be too small. My guess is you're using an > older kernel without 71c696b1 (calibrate: extract fall-back calculation > into own helper). > I am on V3.0-rc3+(latest mainline) and the above commit is already part of it. > The delay calibration code used to start out by setting: > > loops_per_jiffy = (1<<12); > > This will shorten the delay right down, and that's probably causing these > false spinlock lockup bug dumps. > > Arranging for IRQs to be disabled across the delay calibration just avoids > the issue by preventing any spinlock being taken. > > The reason that CPU#0 also complains about spinlock lockup is that for > some reason CPU#1 never finishes its calibration, and so the loop also > times out early on CPU#0. > I am not sure but what I think is happening is as soon as interrupts start firing, as part of IRQ handling, scheduler will try to enqueue softIRQ thread for newly booted CPU since it sees that it's active and ready. But that's failing and both CPU's eventually lock-up. But I may be wrong here. > Of course, fiddling with this global variable in this way is _not_ a good > idea while other CPUs are running and using that variable. > > We could also do with implementing trigger_all_cpu_backtrace() to get > backtraces from the other CPUs when spinlock lockup happens... Any pointers on the other question about "why we need to enable interrupts before the CPU is ready?" Regards Santosh ^ permalink raw reply [flat|nested] 31+ messages in thread
* [RFC PATCH] ARM: smp: Fix the CPU hotplug race with scheduler. 2011-06-20 11:51 ` Santosh Shilimkar @ 2011-06-20 12:19 ` Russell King - ARM Linux 2011-06-20 12:27 ` Santosh Shilimkar 0 siblings, 1 reply; 31+ messages in thread From: Russell King - ARM Linux @ 2011-06-20 12:19 UTC (permalink / raw) To: linux-arm-kernel On Mon, Jun 20, 2011 at 05:21:48PM +0530, Santosh Shilimkar wrote: > On 6/20/2011 5:10 PM, Russell King - ARM Linux wrote: >> On Mon, Jun 20, 2011 at 04:55:43PM +0530, Santosh Shilimkar wrote: >>> On 6/20/2011 4:43 PM, Russell King - ARM Linux wrote: >>>> On Mon, Jun 20, 2011 at 04:17:58PM +0530, Santosh Shilimkar wrote: >>>>> Yes. It's because of interrupt and the CPU active-online >>>>> race. >>>> >>>> I don't see that as a conclusion from this dump. >>>> >>>>> Here is the chash log.. >>>>> [ 21.025451] CPU1: Booted secondary processor >>>>> [ 21.025451] CPU1: Unknown IPI message 0x1 >>>>> [ 21.029113] Switched to NOHz mode on CPU #1 >>>>> [ 21.029174] BUG: spinlock lockup on CPU#1, swapper/0, c06220c4 >>>> >>>> That's the xtime seqlock. We're trying to update the xtime from CPU1, >>>> which is not yet online and not yet active. That's fine, we're just >>>> spinning on the spinlock here, waiting for the other CPUs to release >>>> it. >>>> >>>> But what this is saying is that the other CPUs aren't releasing it. >>>> The cpu hotplug code doesn't hold the seqlock either. So who else is >>>> holding this lock, causing CPU1 to time out on it. >>>> >>>> The other thing is that this is only supposed to trigger after about >>>> one second: >>>> >>>> u64 loops = loops_per_jiffy * HZ; >>>> for (i = 0; i< loops; i++) { >>>> if (arch_spin_trylock(&lock->raw_lock)) >>>> return; >>>> __delay(1); >>>> } >>>> >>>> which from the timings you have at the beginning of your printk lines >>>> is clearly not the case - it's more like 61us. >>>> >>>> Are you running with those h/w timer delay patches? >>> Nope. >> >> Ok. So loops_per_jiffy must be too small. My guess is you're using an >> older kernel without 71c696b1 (calibrate: extract fall-back calculation >> into own helper). >> > I am on V3.0-rc3+(latest mainline) and the above commit is already > part of it. > >> The delay calibration code used to start out by setting: >> >> loops_per_jiffy = (1<<12); >> >> This will shorten the delay right down, and that's probably causing these >> false spinlock lockup bug dumps. >> >> Arranging for IRQs to be disabled across the delay calibration just avoids >> the issue by preventing any spinlock being taken. >> >> The reason that CPU#0 also complains about spinlock lockup is that for >> some reason CPU#1 never finishes its calibration, and so the loop also >> times out early on CPU#0. >> > I am not sure but what I think is happening is as soon as interrupts > start firing, as part of IRQ handling, scheduler will try to > enqueue softIRQ thread for newly booted CPU since it sees that > it's active and ready. But that's failing and both CPU's > eventually lock-up. But I may be wrong here. Even if that happens, there is NO WAY that the spinlock lockup detector should report lockup in anything under 1s. >> Of course, fiddling with this global variable in this way is _not_ a good >> idea while other CPUs are running and using that variable. >> >> We could also do with implementing trigger_all_cpu_backtrace() to get >> backtraces from the other CPUs when spinlock lockup happens... > > Any pointers on the other question about "why we need to enable > interrupts before the CPU is ready?" 
To ensure that things like the delay loop calibration and twd calibration
can run, though that looks like it'll run happily enough with the boot
CPU updating jiffies.

However, I'm still not taking your patch because I believe it's just
papering over the real issue, which is not as you describe.

You first need to work out why the spinlock lockup detection is firing
after just 61us rather than the full 1s and fix that.

You then need to work out whether you really do have spinlock lockup,
and if so, why. Implementing trigger_all_cpu_backtrace() may help to
find out what CPU#0 is doing, though we can only do that with IRQs on,
and so it would be fragile.

We can test whether CPU#0 is going off to do something else while CPU#1
is being brought up, by adding a preempt_disable() / preempt_enable()
in __cpu_up() to prevent the wait-for-cpu#1-online being preempted by
other threads - I suspect you'll still see spinlock lockup on the
xtime seqlock on CPU#1 though. That would suggest a coherency issue.

Finally, how are you provoking this - and what kernel configuration are
you using?
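A minimal sketch of the diagnostic described above, assuming the usual shape
of __cpu_up() in arch/arm/kernel/smp.c (boot_secondary() followed by a wait
for the online bit); the real function has timeout handling and error paths
that are omitted here, so this is illustrative only and not a fix:

	/*
	 * Diagnostic only: keep the boot CPU from being preempted while
	 * it waits for the secondary CPU to appear in the online mask,
	 * to rule out the waiter wandering off to do something else.
	 */
	preempt_disable();
	ret = boot_secondary(cpu, idle);
	if (ret == 0) {
		/* Wait for the secondary CPU to come online. */
		while (!cpu_online(cpu))
			cpu_relax();
	}
	preempt_enable();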
* [RFC PATCH] ARM: smp: Fix the CPU hotplug race with scheduler.
  From: Santosh Shilimkar @ 2011-06-20 12:27 UTC
  To: linux-arm-kernel

On 6/20/2011 5:49 PM, Russell King - ARM Linux wrote:
> On Mon, Jun 20, 2011 at 05:21:48PM +0530, Santosh Shilimkar wrote:
>> On 6/20/2011 5:10 PM, Russell King - ARM Linux wrote:

[...]

>>
>> Any pointers on the other question about "why we need to enable
>> interrupts before the CPU is ready?"
>
> To ensure that things like the delay loop calibration and twd calibration
> can run, though that looks like it'll run happily enough with the boot
> CPU updating jiffies.
>
I guessed it and had the same point as above. Calibration will still work.

> However, I'm still not taking your patch because I believe it's just
> papering over the real issue, which is not as you describe.
>
> You first need to work out why the spinlock lockup detection is firing
> after just 61us rather than the full 1s and fix that.
>
This is possibly because of my script, which doesn't wait for 1 second.

> You then need to work out whether you really do have spinlock lockup,
> and if so, why. Implementing trigger_all_cpu_backtrace() may help to
> find out what CPU#0 is doing, though we can only do that with IRQs on,
> and so it would be fragile.
>
> We can test whether CPU#0 is going off to do something else while CPU#1
> is being brought up, by adding a preempt_disable() / preempt_enable()
> in __cpu_up() to prevent the wait-for-cpu#1-online being preempted by
> other threads - I suspect you'll still see spinlock lockup on the
> xtime seqlock on CPU#1 though. That would suggest a coherency issue.
>
> Finally, how are you provoking this - and what kernel configuration are
> you using?

Latest mainline kernel with omap2plus_defconfig and the simple script below
to trigger the failure.

-------------
while true
do
	echo 0 > /sys/devices/system/cpu/cpu1/online
	echo 1 > /sys/devices/system/cpu/cpu1/online
done

Regards
Santosh
* [RFC PATCH] ARM: smp: Fix the CPU hotplug race with scheduler. 2011-06-20 12:27 ` Santosh Shilimkar @ 2011-06-20 12:57 ` Russell King - ARM Linux 0 siblings, 0 replies; 31+ messages in thread From: Russell King - ARM Linux @ 2011-06-20 12:57 UTC (permalink / raw) To: linux-arm-kernel On Mon, Jun 20, 2011 at 05:57:01PM +0530, Santosh Shilimkar wrote: > On 6/20/2011 5:49 PM, Russell King - ARM Linux wrote: >> On Mon, Jun 20, 2011 at 05:21:48PM +0530, Santosh Shilimkar wrote: >>> On 6/20/2011 5:10 PM, Russell King - ARM Linux wrote: > > [...] > >>> >>> Any pointers on the other question about "why we need to enable >>> interrupts before the CPU is ready?" >> >> To ensure that things like the delay loop calibration and twd calibration >> can run, though that looks like it'll run happily enough with the boot >> CPU updating jiffies. >> > I guessed it and had same point as above. Calibration will still > work. > >> However, I'm still not taking your patch because I believe its just >> papering over the real issue, which is not as you describe. >> >> You first need to work out why the spinlock lockup detection is firing >> after just 61us rather than the full 1s and fix that. >> > This is possibly because of my script which doesn't wait for 1 > second. How could a userspace script affect the internal behaviour of spin_lock() and the spinlock lockup detector? > Latest mainline kernel with omap2plus_defconfig and below simple script > to trigger the failure. > > ------------- > while true > do > echo 0 > /sys/devices/system/cpu/cpu1/online > echo 1 > /sys/devices/system/cpu/cpu1/online > done Thanks, I'll give it a go here and see if I can debug it further. ^ permalink raw reply [flat|nested] 31+ messages in thread
* [RFC PATCH] ARM: smp: Fix the CPU hotplug race with scheduler.
  From: Russell King - ARM Linux @ 2011-06-20 14:23 UTC
  To: linux-arm-kernel

On Mon, Jun 20, 2011 at 12:40:19PM +0100, Russell King - ARM Linux wrote:
> Ok. So loops_per_jiffy must be too small. My guess is you're using an
> older kernel without 71c696b1 (calibrate: extract fall-back calculation
> into own helper).

Right, this commit above helps show the problem - and it's fairly subtle.

It's a race condition. Let's first look at the spinlock debugging code.
It does this:

static void __spin_lock_debug(raw_spinlock_t *lock)
{
	u64 i;
	u64 loops = loops_per_jiffy * HZ;

	for (;;) {
		for (i = 0; i < loops; i++) {
			if (arch_spin_trylock(&lock->raw_lock))
				return;
			__delay(1);
		}
		/* print warning */
	}
}

If loops_per_jiffy is zero, we never try to grab the spinlock, because
we never enter the inner for loop. We immediately print a warning,
and re-execute the outer loop for ever, resulting in the CPU locking up
in this condition.

In theory, we should never see a zero loops_per_jiffy value, because it
represents the number of loops __delay() needs to delay by one jiffy and
clearly zero makes no sense.

However, calibrate_delay() does this (which x86 and ARM call on secondary
CPU startup):

calibrate_delay()
{
	...
	if (preset_lpj) {
	} else if ((!printed) && lpj_fine) {
	} else if ((loops_per_jiffy = calibrate_delay_direct()) != 0) {
	} else {
		/* approximation/convergence stuff */
	}
}

Now, before 71c696b, this used to be:

	} else {
		loops_per_jiffy = (1<<12);

So the window between calibrate_delay_direct() returning and setting
loops_per_jiffy to zero, and the re-initialization of loops_per_jiffy
was relatively short (maybe even the compiler optimized away the zero
write.)

However, after 71c696b, this now does:

	} else {
		if (!printed)
			pr_info("Calibrating delay loop... ");
+		loops_per_jiffy = calibrate_delay_converge();

So, as loops_per_jiffy is not local to this function, the compiler has
to write out that zero value, before calling calibrate_delay_converge(),
and loops_per_jiffy only becomes non-zero _after_ calibrate_delay_converge()
has returned. This opens the window and allows the spinlock debugging
code to explode.

This patch closes the window completely, by writing to loops_per_jiffy
only when we have a real value for it.

This allows me to boot 3.0.0-rc3 on Versatile Express (4 CPU) whereas
without this it fails with spinlock lockup and rcu problems.

 init/calibrate.c |   14 ++++++++------
 1 files changed, 8 insertions(+), 6 deletions(-)

diff --git a/init/calibrate.c b/init/calibrate.c
index 2568d22..aae2f40 100644
--- a/init/calibrate.c
+++ b/init/calibrate.c
@@ -245,30 +245,32 @@ static unsigned long __cpuinit calibrate_delay_converge(void)
 
 void __cpuinit calibrate_delay(void)
 {
+	unsigned long lpj;
 	static bool printed;
 
 	if (preset_lpj) {
-		loops_per_jiffy = preset_lpj;
+		lpj = preset_lpj;
 		if (!printed)
 			pr_info("Calibrating delay loop (skipped) "
 				"preset value.. ");
 	} else if ((!printed) && lpj_fine) {
-		loops_per_jiffy = lpj_fine;
+		lpj = lpj_fine;
 		pr_info("Calibrating delay loop (skipped), "
 			"value calculated using timer frequency.. ");
-	} else if ((loops_per_jiffy = calibrate_delay_direct()) != 0) {
+	} else if ((lpj = calibrate_delay_direct()) != 0) {
 		if (!printed)
 			pr_info("Calibrating delay using timer "
 				"specific routine.. ");
 	} else {
 		if (!printed)
 			pr_info("Calibrating delay loop... ");
-		loops_per_jiffy = calibrate_delay_converge();
+		lpj = calibrate_delay_converge();
 	}
 	if (!printed)
 		pr_cont("%lu.%02lu BogoMIPS (lpj=%lu)\n",
-			loops_per_jiffy/(500000/HZ),
-			(loops_per_jiffy/(5000/HZ)) % 100, loops_per_jiffy);
+			lpj/(500000/HZ),
+			(lpj/(5000/HZ)) % 100, lpj);
+	loops_per_jiffy = lpj;
 	printed = true;
 }
* [RFC PATCH] ARM: smp: Fix the CPU hotplug race with scheduler.
  From: Santosh Shilimkar @ 2011-06-20 14:54 UTC
  To: linux-arm-kernel

On 6/20/2011 7:53 PM, Russell King - ARM Linux wrote:
> On Mon, Jun 20, 2011 at 12:40:19PM +0100, Russell King - ARM Linux wrote:
>> Ok. So loops_per_jiffy must be too small. My guess is you're using an
>> older kernel without 71c696b1 (calibrate: extract fall-back calculation
>> into own helper).

[...]

> This patch closes the window completely, by writing to loops_per_jiffy
> only when we have a real value for it.
>
> This allows me to boot 3.0.0-rc3 on Versatile Express (4 CPU) whereas
> without this it fails with spinlock lockup and rcu problems.
>
>  init/calibrate.c |   14 ++++++++------
>  1 files changed, 8 insertions(+), 6 deletions(-)
>
I am away from my board now. Will test this change.
btw, the online-active race is still open even with this patch applied,
and should be fixed.

Regards
Santosh
* [RFC PATCH] ARM: smp: Fix the CPU hotplug race with scheduler. 2011-06-20 14:54 ` Santosh Shilimkar @ 2011-06-20 15:01 ` Russell King - ARM Linux 2011-06-20 15:10 ` Santosh Shilimkar 2011-06-21 9:08 ` Santosh Shilimkar 1 sibling, 1 reply; 31+ messages in thread From: Russell King - ARM Linux @ 2011-06-20 15:01 UTC (permalink / raw) To: linux-arm-kernel On Mon, Jun 20, 2011 at 08:24:33PM +0530, Santosh Shilimkar wrote: > I am away from my board now. Will test this change. > btw, the online-active race is still open even with this patch close > and should be fixed. I have yet to see any evidence of that race - I've been running your test loop for about an hour so far on Versatile Express and nothing yet. That's not to say that we shouldn't wait for the active mask to become true before calling schedule(), but I don't think its as big a deal as you're suggesting it is. ^ permalink raw reply [flat|nested] 31+ messages in thread
* [RFC PATCH] ARM: smp: Fix the CPU hotplug race with scheduler.
  From: Santosh Shilimkar @ 2011-06-20 15:10 UTC
  To: linux-arm-kernel

On 6/20/2011 8:31 PM, Russell King - ARM Linux wrote:
> On Mon, Jun 20, 2011 at 08:24:33PM +0530, Santosh Shilimkar wrote:
>> I am away from my board now. Will test this change.
>> btw, the online-active race is still open even with this patch applied,
>> and should be fixed.
>
> I have yet to see any evidence of that race - I've been running your
> test loop for about an hour so far on Versatile Express and nothing
> yet.
>
In that case my script was just exposing the calibrate() code race
condition.

> That's not to say that we shouldn't wait for the active mask to become
> true before calling schedule(), but I don't think its as big a deal as
> you're suggesting it is.

I don't have the expertise to deliberately trigger that exact
online-to-active race, but I am sure the race needs to be fixed. Maybe
Thomas can suggest a test case to expose it. From his fix for x86, it
appears that the race was indeed hit in some sequence.

Regards,
Santosh
* [RFC PATCH] ARM: smp: Fix the CPU hotplug race with scheduler.
  From: Santosh Shilimkar @ 2011-06-21  9:08 UTC
  To: linux-arm-kernel

Russell,

On 6/20/2011 8:24 PM, Santosh Shilimkar wrote:
> On 6/20/2011 7:53 PM, Russell King - ARM Linux wrote:
>> On Mon, Jun 20, 2011 at 12:40:19PM +0100, Russell King - ARM Linux wrote:

[....]

>> So, as loops_per_jiffy is not local to this function, the compiler has
>> to write out that zero value, before calling calibrate_delay_converge(),
>> and loops_per_jiffy only becomes non-zero _after_ calibrate_delay_converge()
>> has returned. This opens the window and allows the spinlock debugging
>> code to explode.
>>
>> This patch closes the window completely, by writing to loops_per_jiffy
>> only when we have a real value for it.
>>
>> This allows me to boot 3.0.0-rc3 on Versatile Express (4 CPU) whereas
>> without this it fails with spinlock lockup and rcu problems.
>>
>>  init/calibrate.c |   14 ++++++++------
>>  1 files changed, 8 insertions(+), 6 deletions(-)
>>
> I am away from my board now. Will test this change.

I have tested your change and it seems to fix the crash I was observing.
Are you planning to send this fix for -rc5?

> btw, the online-active race is still open even with this patch applied,
> and should be fixed.
>
The only remaining problem is waiting for the active mask before marking
the CPU online. Shall I refresh my patch with only that change then?

Regards
Santosh
* [RFC PATCH] ARM: smp: Fix the CPU hotplug race with scheduler. 2011-06-21 9:08 ` Santosh Shilimkar @ 2011-06-21 10:00 ` Russell King - ARM Linux 2011-06-21 10:17 ` Santosh Shilimkar 0 siblings, 1 reply; 31+ messages in thread From: Russell King - ARM Linux @ 2011-06-21 10:00 UTC (permalink / raw) To: linux-arm-kernel On Tue, Jun 21, 2011 at 02:38:34PM +0530, Santosh Shilimkar wrote: > Russell, > > On 6/20/2011 8:24 PM, Santosh Shilimkar wrote: >> On 6/20/2011 7:53 PM, Russell King - ARM Linux wrote: >>> So, as loops_per_jiffy is not local to this function, the compiler has >>> to write out that zero value, before calling calibrate_delay_converge(), >>> and loops_per_jiffy only becomes non-zero _after_ >>> calibrate_delay_converge() >>> has returned. This opens the window and allows the spinlock debugging >>> code to explode. >>> >>> This patch closes the window completely, by only writing to >>> loops_per_jiffy >>> only when we have a real value for it. >>> >>> This allows me to boot 3.0.0-rc3 on Versatile Express (4 CPU) whereas >>> without this it fails with spinlock lockup and rcu problems. >>> >>> init/calibrate.c | 14 ++++++++------ >>> 1 files changed, 8 insertions(+), 6 deletions(-) >>> >> I am away from my board now. Will test this change. > Have tested your change and it seems to fix the crash I > was observing. Are you planning to send this fix for rc5? Yes. I think sending CPUs into infinite loops in the spinlock code is definitely sufficiently serious that it needs to go to Linus ASAP. It'd be nice to have a tested-by line though. >> btw, the online-active race is still open even with this patch close >> and should be fixed. >> > The only problem remains is waiting for active mask before > marking CPU online. Shall I refresh my patch with only > this change then ? I already have that as a separate change. ^ permalink raw reply [flat|nested] 31+ messages in thread
* [RFC PATCH] ARM: smp: Fix the CPU hotplug race with scheduler. 2011-06-21 10:00 ` Russell King - ARM Linux @ 2011-06-21 10:17 ` Santosh Shilimkar 2011-06-21 10:19 ` Russell King - ARM Linux 0 siblings, 1 reply; 31+ messages in thread From: Santosh Shilimkar @ 2011-06-21 10:17 UTC (permalink / raw) To: linux-arm-kernel On 6/21/2011 3:30 PM, Russell King - ARM Linux wrote: > On Tue, Jun 21, 2011 at 02:38:34PM +0530, Santosh Shilimkar wrote: >> Russell, >> >> On 6/20/2011 8:24 PM, Santosh Shilimkar wrote: >>> On 6/20/2011 7:53 PM, Russell King - ARM Linux wrote: >>>> So, as loops_per_jiffy is not local to this function, the compiler has >>>> to write out that zero value, before calling calibrate_delay_converge(), >>>> and loops_per_jiffy only becomes non-zero _after_ >>>> calibrate_delay_converge() >>>> has returned. This opens the window and allows the spinlock debugging >>>> code to explode. >>>> >>>> This patch closes the window completely, by only writing to >>>> loops_per_jiffy >>>> only when we have a real value for it. >>>> >>>> This allows me to boot 3.0.0-rc3 on Versatile Express (4 CPU) whereas >>>> without this it fails with spinlock lockup and rcu problems. >>>> >>>> init/calibrate.c | 14 ++++++++------ >>>> 1 files changed, 8 insertions(+), 6 deletions(-) >>>> >>> I am away from my board now. Will test this change. >> Have tested your change and it seems to fix the crash I >> was observing. Are you planning to send this fix for rc5? > > Yes. I think sending CPUs into infinite loops in the spinlock code is > definitely sufficiently serious that it needs to go to Linus ASAP. > It'd be nice to have a tested-by line though. > Sure. >>> btw, the online-active race is still open even with this patch close >>> and should be fixed. >>> >> The only problem remains is waiting for active mask before >> marking CPU online. Shall I refresh my patch with only >> this change then ? > > I already have that as a separate change. Can you point me to both of these commits so that I have them in my tree for testing. Thanks for help. Regards Santosh ^ permalink raw reply [flat|nested] 31+ messages in thread
* [RFC PATCH] ARM: smp: Fix the CPU hotplug race with scheduler. 2011-06-21 10:17 ` Santosh Shilimkar @ 2011-06-21 10:19 ` Russell King - ARM Linux 2011-06-21 10:21 ` Santosh Shilimkar 0 siblings, 1 reply; 31+ messages in thread From: Russell King - ARM Linux @ 2011-06-21 10:19 UTC (permalink / raw) To: linux-arm-kernel On Tue, Jun 21, 2011 at 03:47:04PM +0530, Santosh Shilimkar wrote: > On 6/21/2011 3:30 PM, Russell King - ARM Linux wrote: >> On Tue, Jun 21, 2011 at 02:38:34PM +0530, Santosh Shilimkar wrote: >>> Russell, >>> >>> On 6/20/2011 8:24 PM, Santosh Shilimkar wrote: >>>> On 6/20/2011 7:53 PM, Russell King - ARM Linux wrote: >>>>> So, as loops_per_jiffy is not local to this function, the compiler has >>>>> to write out that zero value, before calling calibrate_delay_converge(), >>>>> and loops_per_jiffy only becomes non-zero _after_ >>>>> calibrate_delay_converge() >>>>> has returned. This opens the window and allows the spinlock debugging >>>>> code to explode. >>>>> >>>>> This patch closes the window completely, by only writing to >>>>> loops_per_jiffy >>>>> only when we have a real value for it. >>>>> >>>>> This allows me to boot 3.0.0-rc3 on Versatile Express (4 CPU) whereas >>>>> without this it fails with spinlock lockup and rcu problems. >>>>> >>>>> init/calibrate.c | 14 ++++++++------ >>>>> 1 files changed, 8 insertions(+), 6 deletions(-) >>>>> >>>> I am away from my board now. Will test this change. >>> Have tested your change and it seems to fix the crash I >>> was observing. Are you planning to send this fix for rc5? >> >> Yes. I think sending CPUs into infinite loops in the spinlock code is >> definitely sufficiently serious that it needs to go to Linus ASAP. >> It'd be nice to have a tested-by line though. >> > Sure. > >>>> btw, the online-active race is still open even with this patch close >>>> and should be fixed. >>>> >>> The only problem remains is waiting for active mask before >>> marking CPU online. Shall I refresh my patch with only >>> this change then ? >> >> I already have that as a separate change. > Can you point me to both of these commits so that I have > them in my tree for testing. I won't be committing the init/calibrate.c change to a git tree - it isn't ARM stuff so it goes in patch form. ^ permalink raw reply [flat|nested] 31+ messages in thread
* [RFC PATCH] ARM: smp: Fix the CPU hotplug race with scheduler. 2011-06-21 10:19 ` Russell King - ARM Linux @ 2011-06-21 10:21 ` Santosh Shilimkar 2011-06-21 10:26 ` Russell King - ARM Linux 0 siblings, 1 reply; 31+ messages in thread From: Santosh Shilimkar @ 2011-06-21 10:21 UTC (permalink / raw) To: linux-arm-kernel On 6/21/2011 3:49 PM, Russell King - ARM Linux wrote: > On Tue, Jun 21, 2011 at 03:47:04PM +0530, Santosh Shilimkar wrote: [...] >>> I already have that as a separate change. >> Can you point me to both of these commits so that I have >> them in my tree for testing. > > I won't be committing the init/calibrate.c change to a git tree - it > isn't ARM stuff so it goes in patch form. Patches with change log would be fine as well. Regards Santosh ^ permalink raw reply [flat|nested] 31+ messages in thread
* [RFC PATCH] ARM: smp: Fix the CPU hotplug race with scheduler. 2011-06-21 10:21 ` Santosh Shilimkar @ 2011-06-21 10:26 ` Russell King - ARM Linux 2011-06-21 20:16 ` Stephen Boyd 0 siblings, 1 reply; 31+ messages in thread From: Russell King - ARM Linux @ 2011-06-21 10:26 UTC (permalink / raw) To: linux-arm-kernel On Tue, Jun 21, 2011 at 03:51:00PM +0530, Santosh Shilimkar wrote: > On 6/21/2011 3:49 PM, Russell King - ARM Linux wrote: >> On Tue, Jun 21, 2011 at 03:47:04PM +0530, Santosh Shilimkar wrote: > > [...] > >>>> I already have that as a separate change. >>> Can you point me to both of these commits so that I have >>> them in my tree for testing. >> >> I won't be committing the init/calibrate.c change to a git tree - it >> isn't ARM stuff so it goes in patch form. > > Patches with change log would be fine as well. The answer is not at the moment, but maybe soon. ^ permalink raw reply [flat|nested] 31+ messages in thread
* [RFC PATCH] ARM: smp: Fix the CPU hotplug race with scheduler. 2011-06-21 10:26 ` Russell King - ARM Linux @ 2011-06-21 20:16 ` Stephen Boyd 2011-06-21 23:10 ` Russell King - ARM Linux 0 siblings, 1 reply; 31+ messages in thread From: Stephen Boyd @ 2011-06-21 20:16 UTC (permalink / raw) To: linux-arm-kernel On 06/21/2011 03:26 AM, Russell King - ARM Linux wrote: > On Tue, Jun 21, 2011 at 03:51:00PM +0530, Santosh Shilimkar wrote: >> On 6/21/2011 3:49 PM, Russell King - ARM Linux wrote: >>> I won't be committing the init/calibrate.c change to a git tree - it >>> isn't ARM stuff so it goes in patch form. >> Patches with change log would be fine as well. > The answer is not at the moment, but maybe soon. Should we send those two patches to the stable trees as well? They seem to fix issues with cpu onlining that have existed for a long time. -- Sent by an employee of the Qualcomm Innovation Center, Inc. The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum. ^ permalink raw reply [flat|nested] 31+ messages in thread
* [RFC PATCH] ARM: smp: Fix the CPU hotplug race with scheduler. 2011-06-21 20:16 ` Stephen Boyd @ 2011-06-21 23:10 ` Russell King - ARM Linux 2011-06-22 0:06 ` Stephen Boyd 0 siblings, 1 reply; 31+ messages in thread From: Russell King - ARM Linux @ 2011-06-21 23:10 UTC (permalink / raw) To: linux-arm-kernel On Tue, Jun 21, 2011 at 01:16:47PM -0700, Stephen Boyd wrote: > On 06/21/2011 03:26 AM, Russell King - ARM Linux wrote: > > On Tue, Jun 21, 2011 at 03:51:00PM +0530, Santosh Shilimkar wrote: > >> On 6/21/2011 3:49 PM, Russell King - ARM Linux wrote: > >>> I won't be committing the init/calibrate.c change to a git tree - it > >>> isn't ARM stuff so it goes in patch form. > >> Patches with change log would be fine as well. > > The answer is not at the moment, but maybe soon. > > Should we send those two patches to the stable trees as well? They seem > to fix issues with cpu onlining that have existed for a long time. Looks to me like the problem was introduced for 2.6.39-rc1, so we should probably get the fix into the 2.6.39-stable tree too. ^ permalink raw reply [flat|nested] 31+ messages in thread
* [RFC PATCH] ARM: smp: Fix the CPU hotplug race with scheduler. 2011-06-21 23:10 ` Russell King - ARM Linux @ 2011-06-22 0:06 ` Stephen Boyd 2011-06-22 10:06 ` Russell King - ARM Linux 0 siblings, 1 reply; 31+ messages in thread From: Stephen Boyd @ 2011-06-22 0:06 UTC (permalink / raw) To: linux-arm-kernel On 06/21/2011 04:10 PM, Russell King - ARM Linux wrote: > On Tue, Jun 21, 2011 at 01:16:47PM -0700, Stephen Boyd wrote: >> On 06/21/2011 03:26 AM, Russell King - ARM Linux wrote: >>> On Tue, Jun 21, 2011 at 03:51:00PM +0530, Santosh Shilimkar wrote: >>>> On 6/21/2011 3:49 PM, Russell King - ARM Linux wrote: >>>>> I won't be committing the init/calibrate.c change to a git tree - it >>>>> isn't ARM stuff so it goes in patch form. >>>> Patches with change log would be fine as well. >>> The answer is not at the moment, but maybe soon. >> Should we send those two patches to the stable trees as well? They seem >> to fix issues with cpu onlining that have existed for a long time. > Looks to me like the problem was introduced for 2.6.39-rc1, so we > should probably get the fix into the 2.6.39-stable tree too. Are we talking about the loops_per_jiffy problem or the cpu_active problem? I would think the cpu_active problem has been there since SMP support was added to ARM and the loops_per_jiffy problem has been there (depending on the compiler) since 8a9e1b0 ([PATCH] Platform SMIs and their interferance with tsc based delay calibration, 2005-06-23). So pretty much every stable tree would want both of these patches. -- Sent by an employee of the Qualcomm Innovation Center, Inc. The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum. ^ permalink raw reply [flat|nested] 31+ messages in thread
* [RFC PATCH] ARM: smp: Fix the CPU hotplug race with scheduler. 2011-06-22 0:06 ` Stephen Boyd @ 2011-06-22 10:06 ` Russell King - ARM Linux 0 siblings, 0 replies; 31+ messages in thread From: Russell King - ARM Linux @ 2011-06-22 10:06 UTC (permalink / raw) To: linux-arm-kernel On Tue, Jun 21, 2011 at 05:06:58PM -0700, Stephen Boyd wrote: > On 06/21/2011 04:10 PM, Russell King - ARM Linux wrote: > > On Tue, Jun 21, 2011 at 01:16:47PM -0700, Stephen Boyd wrote: > >> On 06/21/2011 03:26 AM, Russell King - ARM Linux wrote: > >>> On Tue, Jun 21, 2011 at 03:51:00PM +0530, Santosh Shilimkar wrote: > >>>> On 6/21/2011 3:49 PM, Russell King - ARM Linux wrote: > >>>>> I won't be committing the init/calibrate.c change to a git tree - it > >>>>> isn't ARM stuff so it goes in patch form. > >>>> Patches with a change log would be fine as well. > >>> The answer is: not at the moment, but maybe soon. > >> Should we send those two patches to the stable trees as well? They seem > >> to fix issues with cpu onlining that have existed for a long time. > > Looks to me like the problem was introduced for 2.6.39-rc1, so we > > should probably get the fix into the 2.6.39-stable tree too. > > Are we talking about the loops_per_jiffy problem or the cpu_active > problem? I would think the cpu_active problem has been there since SMP > support was added to ARM and the loops_per_jiffy problem has been there > (depending on the compiler) since 8a9e1b0 ([PATCH] Platform SMIs and > their interferance with tsc based delay calibration, 2005-06-23). The cpu_active problem hasn't actually caused any symptoms on ARM, so it's low priority. It's only a problem which should be sorted in -stable _if_ someone reports that it has caused a problem. Up until Santosh's patch, no one had done so, and I've not seen any problems on any of my ARM SMP platforms coming from it. As for loops_per_jiffy, it isn't a problem before the commit ID I pointed out - I've checked the assembly, and the compiler optimizes away the initialization of loops_per_jiffy to zero - the first write is when it's set to (1<<12). Take a moment to think about this: if ((loops_per_jiffy = 0) != 0) { } else { loops_per_jiffy = 1<<12; ... } Any compiler worth talking about is going to optimize away the initial constant write to loops_per_jiffy there, provided loops_per_jiffy is not volatile. So, although it's not desirable for older kernels to have their lpj overwritten in this way, it doesn't cause the spinlock debugging code to explode. This can be shown to be correct because there hasn't been any problem with ARM secondary CPU bringup until recently. Plus, the previous version of the code requires significant changes to sort the problem out. So, the lpj patch will only sensibly apply to 2.6.39-rc1 and later, and so it's only going to be submitted for 2.6.39-stable. For previous kernels, the risks of changing it outweigh, by several orders of magnitude, any benefit coming from the change. ^ permalink raw reply [flat|nested] 31+ messages in thread
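To make the compiler argument above concrete, here is a self-contained sketch, not a copy of init/calibrate.c, of the two shapes being compared. calibrate_delay_direct() is reduced to a constant-zero stub, which is effectively what it is on ARM without a usable current-timer hook, and calibrate_delay_converge() is a trivial stub standing in for the real out-of-line function that the compiler cannot see into.

unsigned long loops_per_jiffy;  /* non-volatile global, shared by all CPUs */

/* On ARM this effectively returns a constant zero. */
static unsigned long calibrate_delay_direct(void) { return 0; }

/* Trivial stub; in the kernel this is a separate, non-trivial function. */
static unsigned long calibrate_delay_converge(void) { return 1UL << 12; }

/*
 * Pre-2.6.39 shape: the zero assigned in the condition is immediately
 * overwritten on the only live path, so the compiler may drop it as a
 * dead store and the global is never left at zero mid-calibration.
 */
static void calibrate_old_shape(void)
{
        if ((loops_per_jiffy = calibrate_delay_direct()) != 0) {
                /* direct calibration succeeded - nothing more to do */
        } else {
                loops_per_jiffy = 1 << 12;
                /* ... shift-and-test loop refines the value ... */
        }
}

/*
 * 2.6.39+ shape: when the callee is out of line (as it is in the kernel),
 * the zero has to reach the global before the call because the callee
 * could read it, and the global only becomes non-zero after the call
 * returns - that is the window the spinlock debugging code can fall into.
 */
static void calibrate_new_shape(void)
{
        if ((loops_per_jiffy = calibrate_delay_direct()) != 0) {
                /* direct calibration succeeded - nothing more to do */
        } else {
                loops_per_jiffy = calibrate_delay_converge();
        }
}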
* [RFC PATCH] ARM: smp: Fix the CPU hotplug race with scheduler. 2011-06-20 9:50 ` Russell King - ARM Linux 2011-06-20 10:14 ` Russell King - ARM Linux @ 2011-06-20 10:19 ` Santosh Shilimkar 1 sibling, 0 replies; 31+ messages in thread From: Santosh Shilimkar @ 2011-06-20 10:19 UTC (permalink / raw) To: linux-arm-kernel On 6/20/2011 3:20 PM, Russell King - ARM Linux wrote: > On Mon, Jun 20, 2011 at 02:53:59PM +0530, Santosh Shilimkar wrote: >> The current ARM CPU hotplug code suffers from couple of race conditions >> in CPU online path with scheduler. >> The ARM CPU hotplug code doesn't wait for hot-plugged CPU to be marked >> active as part of cpu_notify() by the CPU which brought it up before >> enabling interrupts. > > Hmm, why not just move the set_cpu_online() call before notify_cpu_starting() > and add the wait after the set_cpu_online() ? I think that's what the patch is doing. Do you mean calling the hotplug notifier chain immediately after the CPU is marked as online? Something like below: set_cpu_online(cpu, true); notify_cpu_starting(cpu); while (!cpumask_test_cpu(smp_processor_id(), cpu_active_mask)) cpu_relax(); /* * Enable local interrupts. */ local_irq_enable(); local_fiq_enable(); Regards, Santosh ^ permalink raw reply [flat|nested] 31+ messages in thread
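For completeness, the ordering proposed above can be read in isolation as the sketch below. This is illustrative only, not the actual arch/arm/kernel/smp.c change: the wrapper name is invented, the includes are approximate, and the open-coded cpumask test is written with the equivalent cpu_active() helper from <linux/cpumask.h>. Whether the CPU_STARTING notifiers should run before or after the online marking is exactly what is being debated in this thread, so treat the ordering as the proposal under discussion rather than a settled conclusion.

#include <linux/cpu.h>          /* notify_cpu_starting() */
#include <linux/cpumask.h>      /* set_cpu_online(), cpu_active() */
#include <linux/irqflags.h>     /* local_irq_enable() */
#include <asm/irqflags.h>       /* local_fiq_enable() on ARM */
#include <asm/processor.h>      /* cpu_relax() */

/* Invented wrapper showing the proposed ordering on the newly booted CPU. */
static void secondary_mark_online_and_wait(unsigned int cpu)
{
        /* Let the rest of the system see this CPU as online... */
        set_cpu_online(cpu, true);

        /* ...run the CPU_STARTING notifier chain on it... */
        notify_cpu_starting(cpu);

        /* ...and spin until the booting CPU has also marked it active. */
        while (!cpu_active(cpu))
                cpu_relax();

        /* Only now is it considered safe to start taking interrupts here. */
        local_irq_enable();
        local_fiq_enable();
}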
Thread overview: 31+ messages

2011-06-20  9:23 [RFC PATCH] ARM: smp: Fix the CPU hotplug race with scheduler Santosh Shilimkar
2011-06-20  9:50 ` Russell King - ARM Linux
2011-06-20 10:14 ` Russell King - ARM Linux
2011-06-20 10:28 ` Santosh Shilimkar
2011-06-20 10:35 ` Russell King - ARM Linux
2011-06-20 10:45 ` Santosh Shilimkar
2011-06-20 11:42 ` Santosh Shilimkar
2011-06-20 10:44 ` Russell King - ARM Linux
2011-06-20 10:47 ` Santosh Shilimkar
2011-06-20 11:13 ` Russell King - ARM Linux
2011-06-20 11:25 ` Santosh Shilimkar
2011-06-20 11:40 ` Russell King - ARM Linux
2011-06-20 11:51 ` Santosh Shilimkar
2011-06-20 12:19 ` Russell King - ARM Linux
2011-06-20 12:27 ` Santosh Shilimkar
2011-06-20 12:57 ` Russell King - ARM Linux
2011-06-20 14:23 ` Russell King - ARM Linux
2011-06-20 14:54 ` Santosh Shilimkar
2011-06-20 15:01 ` Russell King - ARM Linux
2011-06-20 15:10 ` Santosh Shilimkar
2011-06-21  9:08 ` Santosh Shilimkar
2011-06-21 10:00 ` Russell King - ARM Linux
2011-06-21 10:17 ` Santosh Shilimkar
2011-06-21 10:19 ` Russell King - ARM Linux
2011-06-21 10:21 ` Santosh Shilimkar
2011-06-21 10:26 ` Russell King - ARM Linux
2011-06-21 20:16 ` Stephen Boyd
2011-06-21 23:10 ` Russell King - ARM Linux
2011-06-22  0:06 ` Stephen Boyd
2011-06-22 10:06 ` Russell King - ARM Linux
2011-06-20 10:19 ` Santosh Shilimkar