* [PATCH 0/2] ARM: SMP: Fix cpu hotplug problems @ 2012-05-21 9:45 Hui Wang 2012-05-21 9:45 ` [PATCH 1/2] ARM: SMP: move mm_cpumask clearing to the __cpu_die() Hui Wang 0 siblings, 1 reply; 6+ messages in thread From: Hui Wang @ 2012-05-21 9:45 UTC (permalink / raw) To: linux-arm-kernel In the RT kernel, current ARM SMP hotplug code will produce two call trace, these two patches can solve call trace problems in the RT kernel, and they are not harmful to standard kernel as well. To consistent between standard kernel and RT kernel, i want to apply these two patches to standard kernel, thus they will be applied to RT kernel automatically in future. Please kindly to help review these two patches. Thanks, Hui. ^ permalink raw reply [flat|nested] 6+ messages in thread
* [PATCH 1/2] ARM: SMP: move mm_cpumask clearing to the __cpu_die() 2012-05-21 9:45 [PATCH 0/2] ARM: SMP: Fix cpu hotplug problems Hui Wang @ 2012-05-21 9:45 ` Hui Wang 2012-05-21 9:45 ` [PATCH 2/2] ARM: SMP: use per cpu state to replace Hui Wang 0 siblings, 1 reply; 6+ messages in thread From: Hui Wang @ 2012-05-21 9:45 UTC (permalink / raw) To: linux-arm-kernel __cpu_disable() will run on the dying cpu and the calling context is irq disabled, this will produce call trace in RT kernel like this: BUG: sleeping function called from invalid context at linux/kernel/rtmutex.c:707 pcnt: 0 0 in_atomic(): 0, irqs_disabled(): 128, pid: 1652, name: kstop/1 [<800413a0>] (unwind_backtrace+0x0/0xe4) from [<804ff198>] (__rt_spin_lock+0x30/0x5c) [<804ff198>] (__rt_spin_lock+0x30/0x5c) from [<804ff304>] (rt_read_lock+0x2c/0x3c) [<804ff304>] (rt_read_lock+0x2c/0x3c) from [<8003f890>] (__cpu_disable+0x50/0xa0) [<8003f890>] (__cpu_disable+0x50/0xa0) from [<804f5ad4>] (take_cpu_down+0x10/0x54) [<804f5ad4>] (take_cpu_down+0x10/0x54) from [<800a5ea0>] (stop_cpu+0xb0/0x110) [<800a5ea0>] (stop_cpu+0xb0/0x110) from [<8007e838>] (worker_thread+0x1bc/0x240) [<8007e838>] (worker_thread+0x1bc/0x240) from [<80082770>] (kthread+0x7c/0x84) [<80082770>] (kthread+0x7c/0x84) from [<8003b214>] (kernel_thread_exit+0x0/0x8) Clearing mm_cpumask for dying cpu is to save tlb/cache flush when this cpu is online again and needs to flush tlb/cache for the mm area. That is to say as long as we clear this flag before the cpu is online again, it will not bring trouble for us. So move this code from __cpu_disable() to __cpu_die(), since __cpu_disable() is irq disabled context while __cpu_die() is irq enabled context, this change will solve the call trace in the RT kernel. Signed-off-by: Hui Wang <jason77.wang@gmail.com> --- arch/arm/kernel/smp.c | 21 +++++++++++---------- 1 files changed, 11 insertions(+), 10 deletions(-) diff --git a/arch/arm/kernel/smp.c b/arch/arm/kernel/smp.c index 8f46446..05fed61 100644 --- a/arch/arm/kernel/smp.c +++ b/arch/arm/kernel/smp.c @@ -130,7 +130,6 @@ static void percpu_timer_stop(void); int __cpu_disable(void) { unsigned int cpu = smp_processor_id(); - struct task_struct *p; int ret; ret = platform_cpu_disable(cpu); @@ -154,19 +153,12 @@ int __cpu_disable(void) percpu_timer_stop(); /* - * Flush user cache and TLB mappings, and then remove this CPU - * from the vm mask set of all processes. + * Flush user cache and TLB mappings, and later in the __cpu_die(), + * remove this CPU from the vm mask set of all processes. */ flush_cache_all(); local_flush_tlb_all(); - read_lock(&tasklist_lock); - for_each_process(p) { - if (p->mm) - cpumask_clear_cpu(cpu, mm_cpumask(p->mm)); - } - read_unlock(&tasklist_lock); - return 0; } @@ -178,6 +170,15 @@ static DECLARE_COMPLETION(cpu_died); */ void __cpu_die(unsigned int cpu) { + struct task_struct *p; + + read_lock(&tasklist_lock); + for_each_process(p) { + if (p->mm) + cpumask_clear_cpu(cpu, mm_cpumask(p->mm)); + } + read_unlock(&tasklist_lock); + if (!wait_for_completion_timeout(&cpu_died, msecs_to_jiffies(5000))) { pr_err("CPU%u: cpu didn't die\n", cpu); return; -- 1.7.6 ^ permalink raw reply related [flat|nested] 6+ messages in thread
* [PATCH 2/2] ARM: SMP: use per cpu state to replace 2012-05-21 9:45 ` [PATCH 1/2] ARM: SMP: move mm_cpumask clearing to the __cpu_die() Hui Wang @ 2012-05-21 9:45 ` Hui Wang 2012-05-21 12:33 ` Russell King - ARM Linux 0 siblings, 1 reply; 6+ messages in thread From: Hui Wang @ 2012-05-21 9:45 UTC (permalink / raw) To: linux-arm-kernel CPU hotplug will bring following call trace in the RT kernel: BUG: sleeping function called from invalid context at linux/kernel/rtmutex.c:707 pcnt: 1 0 in_atomic(): 1, irqs_disabled(): 128, pid: 0, name: swapper [<800413a0>] (unwind_backtrace+0x0/0xe4) from [<804ff214>] (rt_spin_lock+0x30/0x5c) [<804ff214>] (rt_spin_lock+0x30/0x5c) from [<8005a848>] (complete+0x1c/0x54) [<8005a848>] (complete+0x1c/0x54) from [<804f59f8>] (cpu_die+0x34/0x70) [<804f59f8>] (cpu_die+0x34/0x70) from [<8003b840>] (cpu_idle+0x54/0xd8) [<8003b840>] (cpu_idle+0x54/0xd8) from [<104f9ecc>] (0x104f9ecc) To avoid this call trace, we use per cpu variable to replace completion, and it is safe for this modification since all reference of per cpu_state variable is in the preempt disabled context. Signed-off-by: Hui Wang <jason77.wang@gmail.com> --- arch/arm/kernel/smp.c | 23 ++++++++++++++++++----- 1 files changed, 18 insertions(+), 5 deletions(-) diff --git a/arch/arm/kernel/smp.c b/arch/arm/kernel/smp.c index 05fed61..c63e2ff 100644 --- a/arch/arm/kernel/smp.c +++ b/arch/arm/kernel/smp.c @@ -122,6 +122,9 @@ int __cpuinit __cpu_up(unsigned int cpu) } #ifdef CONFIG_HOTPLUG_CPU +/* State of each CPU during hotplug phases */ +DEFINE_PER_CPU(int, cpu_state) = { 0 }; + static void percpu_timer_stop(void); /* @@ -171,6 +174,7 @@ static DECLARE_COMPLETION(cpu_died); void __cpu_die(unsigned int cpu) { struct task_struct *p; + unsigned long timeout; read_lock(&tasklist_lock); for_each_process(p) { @@ -179,10 +183,16 @@ void __cpu_die(unsigned int cpu) } read_unlock(&tasklist_lock); - if (!wait_for_completion_timeout(&cpu_died, msecs_to_jiffies(5000))) { - pr_err("CPU%u: cpu didn't die\n", cpu); - return; - } + timeout = jiffies + msecs_to_jiffies(5000); + + while (per_cpu(cpu_state, cpu) != CPU_DEAD) { + if (time_after(jiffies, timeout)) { + pr_err("CPU%u: cpu didn't die\n", cpu); + return; + } + cpu_relax(); + } + printk(KERN_NOTICE "CPU%u: shutdown\n", cpu); if (!platform_cpu_kill(cpu)) @@ -207,7 +217,7 @@ void __ref cpu_die(void) mb(); /* Tell __cpu_die() that this CPU is now safe to dispose of */ - complete(&cpu_died); + per_cpu(cpu_state, cpu) = CPU_DEAD; /* * actual CPU shutdown procedure is at least platform (if not @@ -285,6 +295,9 @@ asmlinkage void __cpuinit secondary_start_kernel(void) * the CPU migration code to notice that the CPU is online * before we continue - which happens after __cpu_up returns. */ +#ifdef CONFIG_HOTPLUG_CPU + per_cpu(cpu_state, cpu) = CPU_ONLINE; +#endif set_cpu_online(cpu, true); complete(&cpu_running); -- 1.7.6 ^ permalink raw reply related [flat|nested] 6+ messages in thread
* [PATCH 2/2] ARM: SMP: use per cpu state to replace 2012-05-21 9:45 ` [PATCH 2/2] ARM: SMP: use per cpu state to replace Hui Wang @ 2012-05-21 12:33 ` Russell King - ARM Linux 2012-05-21 13:26 ` Thomas Gleixner 0 siblings, 1 reply; 6+ messages in thread From: Russell King - ARM Linux @ 2012-05-21 12:33 UTC (permalink / raw) To: linux-arm-kernel On Mon, May 21, 2012 at 05:45:31PM +0800, Hui Wang wrote: > CPU hotplug will bring following call trace in the RT kernel: > BUG: sleeping function called from invalid context at linux/kernel/rtmutex.c:707 > pcnt: 1 0 in_atomic(): 1, irqs_disabled(): 128, pid: 0, name: swapper > [<800413a0>] (unwind_backtrace+0x0/0xe4) from [<804ff214>] (rt_spin_lock+0x30/0x5c) > [<804ff214>] (rt_spin_lock+0x30/0x5c) from [<8005a848>] (complete+0x1c/0x54) > [<8005a848>] (complete+0x1c/0x54) from [<804f59f8>] (cpu_die+0x34/0x70) > [<804f59f8>] (cpu_die+0x34/0x70) from [<8003b840>] (cpu_idle+0x54/0xd8) > [<8003b840>] (cpu_idle+0x54/0xd8) from [<104f9ecc>] (0x104f9ecc) > > To avoid this call trace, we use per cpu variable to replace > completion, and it is safe for this modification since all reference > of per cpu_state variable is in the preempt disabled context. This is silly. Why is RT preventing things that work perfectly well in the standard kernel from being used in the RT kernel? Being able to call complete() from atomic contexts is one of the fundamentals that RT seems to be breaking here. > Signed-off-by: Hui Wang <jason77.wang@gmail.com> > --- > arch/arm/kernel/smp.c | 23 ++++++++++++++++++----- > 1 files changed, 18 insertions(+), 5 deletions(-) > > diff --git a/arch/arm/kernel/smp.c b/arch/arm/kernel/smp.c > index 05fed61..c63e2ff 100644 > --- a/arch/arm/kernel/smp.c > +++ b/arch/arm/kernel/smp.c > @@ -122,6 +122,9 @@ int __cpuinit __cpu_up(unsigned int cpu) > } > > #ifdef CONFIG_HOTPLUG_CPU > +/* State of each CPU during hotplug phases */ > +DEFINE_PER_CPU(int, cpu_state) = { 0 }; > + > static void percpu_timer_stop(void); > > /* > @@ -171,6 +174,7 @@ static DECLARE_COMPLETION(cpu_died); > void __cpu_die(unsigned int cpu) > { > struct task_struct *p; > + unsigned long timeout; > > read_lock(&tasklist_lock); > for_each_process(p) { > @@ -179,10 +183,16 @@ void __cpu_die(unsigned int cpu) > } > read_unlock(&tasklist_lock); > > - if (!wait_for_completion_timeout(&cpu_died, msecs_to_jiffies(5000))) { > - pr_err("CPU%u: cpu didn't die\n", cpu); > - return; > - } > + timeout = jiffies + msecs_to_jiffies(5000); > + > + while (per_cpu(cpu_state, cpu) != CPU_DEAD) { > + if (time_after(jiffies, timeout)) { > + pr_err("CPU%u: cpu didn't die\n", cpu); > + return; > + } > + cpu_relax(); > + } > + > printk(KERN_NOTICE "CPU%u: shutdown\n", cpu); > > if (!platform_cpu_kill(cpu)) > @@ -207,7 +217,7 @@ void __ref cpu_die(void) > mb(); > > /* Tell __cpu_die() that this CPU is now safe to dispose of */ > - complete(&cpu_died); > + per_cpu(cpu_state, cpu) = CPU_DEAD; > > /* > * actual CPU shutdown procedure is at least platform (if not > @@ -285,6 +295,9 @@ asmlinkage void __cpuinit secondary_start_kernel(void) > * the CPU migration code to notice that the CPU is online > * before we continue - which happens after __cpu_up returns. > */ > +#ifdef CONFIG_HOTPLUG_CPU > + per_cpu(cpu_state, cpu) = CPU_ONLINE; > +#endif > set_cpu_online(cpu, true); > complete(&cpu_running); > > -- > 1.7.6 > > ^ permalink raw reply [flat|nested] 6+ messages in thread
* [PATCH 2/2] ARM: SMP: use per cpu state to replace 2012-05-21 12:33 ` Russell King - ARM Linux @ 2012-05-21 13:26 ` Thomas Gleixner 2012-05-22 2:16 ` Hui Wang 0 siblings, 1 reply; 6+ messages in thread From: Thomas Gleixner @ 2012-05-21 13:26 UTC (permalink / raw) To: linux-arm-kernel On Mon, 21 May 2012, Russell King - ARM Linux wrote: > On Mon, May 21, 2012 at 05:45:31PM +0800, Hui Wang wrote: > > CPU hotplug will bring following call trace in the RT kernel: > > BUG: sleeping function called from invalid context at linux/kernel/rtmutex.c:707 > > pcnt: 1 0 in_atomic(): 1, irqs_disabled(): 128, pid: 0, name: swapper > > [<800413a0>] (unwind_backtrace+0x0/0xe4) from [<804ff214>] (rt_spin_lock+0x30/0x5c) > > [<804ff214>] (rt_spin_lock+0x30/0x5c) from [<8005a848>] (complete+0x1c/0x54) > > [<8005a848>] (complete+0x1c/0x54) from [<804f59f8>] (cpu_die+0x34/0x70) > > [<804f59f8>] (cpu_die+0x34/0x70) from [<8003b840>] (cpu_idle+0x54/0xd8) > > [<8003b840>] (cpu_idle+0x54/0xd8) from [<104f9ecc>] (0x104f9ecc) > > > > To avoid this call trace, we use per cpu variable to replace > > completion, and it is safe for this modification since all reference > > of per cpu_state variable is in the preempt disabled context. > > This is silly. Why is RT preventing things that work perfectly well in > the standard kernel from being used in the RT kernel? The silliness is in some of the users of waitqueues. waitqueues are protected by a spinlock, which we cannot convert to a raw spinlock on rt because some of the wq users run insane callbacks from the wakeup context with interrupts disabled..... > Being able to call complete() from atomic contexts is one of the > fundamentals that RT seems to be breaking here. Yeah, I have a simpler waitqueue variant (no callbacks) which I want to use for completions. Thanks, tglx ^ permalink raw reply [flat|nested] 6+ messages in thread
* [PATCH 2/2] ARM: SMP: use per cpu state to replace 2012-05-21 13:26 ` Thomas Gleixner @ 2012-05-22 2:16 ` Hui Wang 0 siblings, 0 replies; 6+ messages in thread From: Hui Wang @ 2012-05-22 2:16 UTC (permalink / raw) To: linux-arm-kernel Thanks for review and glad to know this will be fixed from the root cause. Hi Russell, What about my first patch, the tasklist_lock is not easy changed to the raw lock in the RT kernel since this lock is used by various subsystems, and change this lock to raw lock will affect RT performance. Regards, Hui. Thomas Gleixner wrote: > On Mon, 21 May 2012, Russell King - ARM Linux wrote: > > >> On Mon, May 21, 2012 at 05:45:31PM +0800, Hui Wang wrote: >> >>> CPU hotplug will bring following call trace in the RT kernel: >>> BUG: sleeping function called from invalid context at linux/kernel/rtmutex.c:707 >>> pcnt: 1 0 in_atomic(): 1, irqs_disabled(): 128, pid: 0, name: swapper >>> [<800413a0>] (unwind_backtrace+0x0/0xe4) from [<804ff214>] (rt_spin_lock+0x30/0x5c) >>> [<804ff214>] (rt_spin_lock+0x30/0x5c) from [<8005a848>] (complete+0x1c/0x54) >>> [<8005a848>] (complete+0x1c/0x54) from [<804f59f8>] (cpu_die+0x34/0x70) >>> [<804f59f8>] (cpu_die+0x34/0x70) from [<8003b840>] (cpu_idle+0x54/0xd8) >>> [<8003b840>] (cpu_idle+0x54/0xd8) from [<104f9ecc>] (0x104f9ecc) >>> >>> To avoid this call trace, we use per cpu variable to replace >>> completion, and it is safe for this modification since all reference >>> of per cpu_state variable is in the preempt disabled context. >>> >> This is silly. Why is RT preventing things that work perfectly well in >> the standard kernel from being used in the RT kernel? >> > > The silliness is in some of the users of waitqueues. waitqueues are > protected by a spinlock, which we cannot convert to a raw spinlock on > rt because some of the wq users run insane callbacks from the wakeup > context with interrupts disabled..... > > >> Being able to call complete() from atomic contexts is one of the >> fundamentals that RT seems to be breaking here. >> > > Yeah, I have a simpler waitqueue variant (no callbacks) which I want > to use for completions. > > Thanks, > > tglx > > ^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2012-05-22 2:16 UTC | newest] Thread overview: 6+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2012-05-21 9:45 [PATCH 0/2] ARM: SMP: Fix cpu hotplug problems Hui Wang 2012-05-21 9:45 ` [PATCH 1/2] ARM: SMP: move mm_cpumask clearing to the __cpu_die() Hui Wang 2012-05-21 9:45 ` [PATCH 2/2] ARM: SMP: use per cpu state to replace Hui Wang 2012-05-21 12:33 ` Russell King - ARM Linux 2012-05-21 13:26 ` Thomas Gleixner 2012-05-22 2:16 ` Hui Wang
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).