* [Kgdb-bugreport] [PATCH] ARM: change definition of cpu_relax() for ARM11MPCore
2010-03-09 17:59 ` Will Deacon
@ 2010-03-10 22:06 ` Jason Wessel
2010-03-11 2:47 ` DDD
` (3 more replies)
0 siblings, 4 replies; 13+ messages in thread
From: Jason Wessel @ 2010-03-10 22:06 UTC (permalink / raw)
To: linux-arm-kernel
Will Deacon wrote:
>> Clearly, kgdb is using atomic_set()/atomic_read() in a way which does not
>> match this documentation - it's certainly missing the barriers as required
>> by the above quoted paragraphs.
>>
>> Let me repeat: atomic_set() and atomic_read() are NOT atomic. There's
>> nothing atomic about them. All they do is provide a pair of accessors
>> to the underlying value in the atomic type. They are no different to
>> declaring a volatile int and reading/writing it directly.
Clearly the docs state this.
>
> Indeed. I'm not familiar enough with KGDB internals to dive in and look at all the
> potential barrier conflicts, so I'll submit a patch that addresses the one that's
> bitten me so far. Maybe it will motivate somebody else to take a closer look!
>
Do you think you can try the patch below?
It seems we might not need to change to using the atomic_add_return(0,...) because using the atomic_inc() and atomic_dec() will end up using the memory barriers.
I would certainly rather fix kgdb vs mucking with the internals of cpu_relax().
Jason.
--- a/kernel/kgdb.c
+++ b/kernel/kgdb.c
@@ -580,14 +580,13 @@ static void kgdb_wait(struct pt_regs *re
* Make sure the above info reaches the primary CPU before
* our cpu_in_kgdb[] flag setting does:
*/
- smp_wmb();
- atomic_set(&cpu_in_kgdb[cpu], 1);
+ atomic_inc(&cpu_in_kgdb[cpu]);
/* Disable any cpu specific hw breakpoints */
kgdb_disable_hw_debug(regs);
/* Wait till primary CPU is done with debugging */
- while (atomic_read(&passive_cpu_wait[cpu]))
+ while (atomic_add_return(0, &passive_cpu_wait[cpu]))
cpu_relax();
kgdb_info[cpu].debuggerinfo = NULL;
@@ -598,7 +597,7 @@ static void kgdb_wait(struct pt_regs *re
arch_kgdb_ops.correct_hw_break();
/* Signal the primary CPU that we are done: */
- atomic_set(&cpu_in_kgdb[cpu], 0);
+ atomic_dec(&cpu_in_kgdb[cpu]);
touch_softlockup_watchdog_sync();
clocksource_touch_watchdog();
local_irq_restore(flags);
@@ -1493,7 +1492,7 @@ acquirelock:
* spin_lock code is good enough as a barrier so we don't
* need one here:
*/
- atomic_set(&cpu_in_kgdb[ks->cpu], 1);
+ atomic_inc(&cpu_in_kgdb[ks->cpu]);
#ifdef CONFIG_SMP
/* Signal the other CPUs to enter kgdb_wait() */
@@ -1505,7 +1504,7 @@ acquirelock:
* Wait for the other CPUs to be notified and be waiting for us:
*/
for_each_online_cpu(i) {
- while (!atomic_read(&cpu_in_kgdb[i]))
+ while (!atomic_add_return(0, &cpu_in_kgdb[i]))
cpu_relax();
}
@@ -1528,7 +1527,7 @@ acquirelock:
kgdb_info[ks->cpu].debuggerinfo = NULL;
kgdb_info[ks->cpu].task = NULL;
- atomic_set(&cpu_in_kgdb[ks->cpu], 0);
+ atomic_dec(&cpu_in_kgdb[ks->cpu]);
if (!kgdb_single_step) {
for (i = NR_CPUS-1; i >= 0; i--)
@@ -1538,7 +1537,7 @@ acquirelock:
* from the debugger.
*/
for_each_online_cpu(i) {
- while (atomic_read(&cpu_in_kgdb[i]))
+ while (atomic_add_return(0, &cpu_in_kgdb[i]))
cpu_relax();
}
}
@@ -1742,11 +1741,11 @@ EXPORT_SYMBOL_GPL(kgdb_unregister_io_mod
*/
void kgdb_breakpoint(void)
{
- atomic_set(&kgdb_setting_breakpoint, 1);
+ atomic_inc(&kgdb_setting_breakpoint);
wmb(); /* Sync point before breakpoint */
arch_kgdb_breakpoint();
wmb(); /* Sync point after breakpoint */
- atomic_set(&kgdb_setting_breakpoint, 0);
+ atomic_dec(&kgdb_setting_breakpoint);
}
EXPORT_SYMBOL_GPL(kgdb_breakpoint);
^ permalink raw reply [flat|nested] 13+ messages in thread
* [Kgdb-bugreport] [PATCH] ARM: change definition of cpu_relax() for ARM11MPCore
2010-03-10 22:06 ` [Kgdb-bugreport] " Jason Wessel
@ 2010-03-11 2:47 ` DDD
2010-03-11 13:53 ` Will Deacon
2010-03-11 13:29 ` Will Deacon
` (2 subsequent siblings)
3 siblings, 1 reply; 13+ messages in thread
From: DDD @ 2010-03-11 2:47 UTC (permalink / raw)
To: linux-arm-kernel
Jason Wessel wrote:
> Will Deacon wrote:
>
>>> Clearly, kgdb is using atomic_set()/atomic_read() in a way which does not
>>> match this documentation - it's certainly missing the barriers as required
>>> by the above quoted paragraphs.
>>>
>>> Let me repeat: atomic_set() and atomic_read() are NOT atomic. There's
>>> nothing atomic about them. All they do is provide a pair of accessors
>>> to the underlying value in the atomic type. They are no different to
>>> declaring a volatile int and reading/writing it directly.
>
>
> Clearly the docs state this.
>
>> Indeed. I'm not familiar enough with KGDB internals to dive in and look at all the
>> potential barrier conflicts, so I'll submit a patch that addresses the one that's
>> bitten me so far. Maybe it will motivate somebody else to take a closer look!
>>
>
>
> Do you think you can try the patch below?
>
> It seems we might not need to change to using the atomic_add_return(0,...) because using the atomic_inc() and atomic_dec() will end up using the memory barriers.
>
Hi Jason,
Maybe we should initial the atomic_t variable before we using such as
atomic_inc/dec() directly.
Dongdong.
--- a/kernel/kgdb.c
+++ b/kernel/kgdb.c
@@ -227,6 +227,17 @@ kgdb_post_primary_code(struct pt_regs *regs, int
e_vector, int err_code)
return;
}
+static void kgdb_initial_atomic_var()
+{
+ int i;
+ for (i = NR_CPUS-1; i >= 0; i--) {
+ atomic_set(&passive_cpu_wait[i], 0);
+ atomic_set(&cpu_in_kgdb[i], 0);
+ }
+
+ atomic_set(&kgdb_setting_breakpoint, 0);
+}
+
/**
* kgdb_disable_hw_debug - Disable hardware debugging while we in
kgdb.
* @regs: Current &struct pt_regs.
@@ -1619,6 +1630,7 @@ static void kgdb_register_callbacks(void)
{
if (!kgdb_io_module_registered) {
kgdb_io_module_registered = 1;
+ kgdb_initial_atomic_var();
kgdb_arch_init();
#ifdef CONFIG_MAGIC_SYSRQ
register_sysrq_key('g', &sysrq_gdb_op);
> I would certainly rather fix kgdb vs mucking with the internals of cpu_relax().
>
>
> Jason.
>
> --- a/kernel/kgdb.c
> +++ b/kernel/kgdb.c
> @@ -580,14 +580,13 @@ static void kgdb_wait(struct pt_regs *re
> * Make sure the above info reaches the primary CPU before
> * our cpu_in_kgdb[] flag setting does:
> */
> - smp_wmb();
> - atomic_set(&cpu_in_kgdb[cpu], 1);
> + atomic_inc(&cpu_in_kgdb[cpu]);
>
> /* Disable any cpu specific hw breakpoints */
> kgdb_disable_hw_debug(regs);
>
> /* Wait till primary CPU is done with debugging */
> - while (atomic_read(&passive_cpu_wait[cpu]))
> + while (atomic_add_return(0, &passive_cpu_wait[cpu]))
> cpu_relax();
>
> kgdb_info[cpu].debuggerinfo = NULL;
> @@ -598,7 +597,7 @@ static void kgdb_wait(struct pt_regs *re
> arch_kgdb_ops.correct_hw_break();
>
> /* Signal the primary CPU that we are done: */
> - atomic_set(&cpu_in_kgdb[cpu], 0);
> + atomic_dec(&cpu_in_kgdb[cpu]);
> touch_softlockup_watchdog_sync();
> clocksource_touch_watchdog();
> local_irq_restore(flags);
> @@ -1493,7 +1492,7 @@ acquirelock:
> * spin_lock code is good enough as a barrier so we don't
> * need one here:
> */
> - atomic_set(&cpu_in_kgdb[ks->cpu], 1);
> + atomic_inc(&cpu_in_kgdb[ks->cpu]);
>
> #ifdef CONFIG_SMP
> /* Signal the other CPUs to enter kgdb_wait() */
> @@ -1505,7 +1504,7 @@ acquirelock:
> * Wait for the other CPUs to be notified and be waiting for us:
> */
> for_each_online_cpu(i) {
> - while (!atomic_read(&cpu_in_kgdb[i]))
> + while (!atomic_add_return(0, &cpu_in_kgdb[i]))
> cpu_relax();
> }
>
> @@ -1528,7 +1527,7 @@ acquirelock:
>
> kgdb_info[ks->cpu].debuggerinfo = NULL;
> kgdb_info[ks->cpu].task = NULL;
> - atomic_set(&cpu_in_kgdb[ks->cpu], 0);
> + atomic_dec(&cpu_in_kgdb[ks->cpu]);
>
> if (!kgdb_single_step) {
> for (i = NR_CPUS-1; i >= 0; i--)
> @@ -1538,7 +1537,7 @@ acquirelock:
> * from the debugger.
> */
> for_each_online_cpu(i) {
> - while (atomic_read(&cpu_in_kgdb[i]))
> + while (atomic_add_return(0, &cpu_in_kgdb[i]))
> cpu_relax();
> }
> }
> @@ -1742,11 +1741,11 @@ EXPORT_SYMBOL_GPL(kgdb_unregister_io_mod
> */
> void kgdb_breakpoint(void)
> {
> - atomic_set(&kgdb_setting_breakpoint, 1);
> + atomic_inc(&kgdb_setting_breakpoint);
> wmb(); /* Sync point before breakpoint */
> arch_kgdb_breakpoint();
> wmb(); /* Sync point after breakpoint */
> - atomic_set(&kgdb_setting_breakpoint, 0);
> + atomic_dec(&kgdb_setting_breakpoint);
> }
> EXPORT_SYMBOL_GPL(kgdb_breakpoint);
>
>
> ------------------------------------------------------------------------------
> Download Intel® Parallel Studio Eval
> Try the new software tools for yourself. Speed compiling, find bugs
> proactively, and fine-tune applications for parallel performance.
> See why Intel Parallel Studio got high marks during beta.
> http://p.sf.net/sfu/intel-sw-dev
> _______________________________________________
> Kgdb-bugreport mailing list
> Kgdb-bugreport at lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/kgdb-bugreport
>
^ permalink raw reply [flat|nested] 13+ messages in thread
* [Kgdb-bugreport] [PATCH] ARM: change definition of cpu_relax() for ARM11MPCore
2010-03-10 22:06 ` [Kgdb-bugreport] " Jason Wessel
2010-03-11 2:47 ` DDD
@ 2010-03-11 13:29 ` Will Deacon
2010-03-11 14:51 ` Will Deacon
2010-03-16 17:26 ` Will Deacon
3 siblings, 0 replies; 13+ messages in thread
From: Will Deacon @ 2010-03-11 13:29 UTC (permalink / raw)
To: linux-arm-kernel
Hello Jason,
> Do you think you can try the patch below?
Sure - I'll give it a go this afternoon.
> It seems we might not need to change to using the atomic_add_return(0,...) because using the
> atomic_inc() and atomic_dec() will end up using the memory barriers.
The fix I posted to the LKML here:
http://lkml.org/lkml/2010/3/9/260
fixed the issue for me, but your patch seems more general [although maybe overkill?].
> I would certainly rather fix kgdb vs mucking with the internals of cpu_relax().
I think that's the general consensus. I was worried about changing the general KGDB
code to fix an ARM specific problem, but it turns out that the problem is an issue of
semantics.
I'll let you know whether your patch fixes the problems for me.
Will
^ permalink raw reply [flat|nested] 13+ messages in thread
* [Kgdb-bugreport] [PATCH] ARM: change definition of cpu_relax() for ARM11MPCore
2010-03-11 2:47 ` DDD
@ 2010-03-11 13:53 ` Will Deacon
0 siblings, 0 replies; 13+ messages in thread
From: Will Deacon @ 2010-03-11 13:53 UTC (permalink / raw)
To: linux-arm-kernel
> Maybe we should initial the atomic_t variable before we using such as
> atomic_inc/dec() directly.
>
> Dongdong.
>
>
> --- a/kernel/kgdb.c
> +++ b/kernel/kgdb.c
> @@ -227,6 +227,17 @@ kgdb_post_primary_code(struct pt_regs *regs, int
> e_vector, int err_code)
> return;
> }
>
> +static void kgdb_initial_atomic_var()
> +{
> + int i;
> + for (i = NR_CPUS-1; i >= 0; i--) {
> + atomic_set(&passive_cpu_wait[i], 0);
> + atomic_set(&cpu_in_kgdb[i], 0);
> + }
> +
> + atomic_set(&kgdb_setting_breakpoint, 0);
> +}
> +
Given that passive_cpu_wait and cpu_in_kgdb are static, I think those guys
are alright as-is. kgdb_setting_breakpoint should probably be reset though.
Will
^ permalink raw reply [flat|nested] 13+ messages in thread
* [Kgdb-bugreport] [PATCH] ARM: change definition of cpu_relax() for ARM11MPCore
2010-03-10 22:06 ` [Kgdb-bugreport] " Jason Wessel
2010-03-11 2:47 ` DDD
2010-03-11 13:29 ` Will Deacon
@ 2010-03-11 14:51 ` Will Deacon
2010-03-16 17:26 ` Will Deacon
3 siblings, 0 replies; 13+ messages in thread
From: Will Deacon @ 2010-03-11 14:51 UTC (permalink / raw)
To: linux-arm-kernel
> Do you think you can try the patch below?
<snip>
> --- a/kernel/kgdb.c
> +++ b/kernel/kgdb.c
> @@ -580,14 +580,13 @@ static void kgdb_wait(struct pt_regs *re
> * Make sure the above info reaches the primary CPU before
> * our cpu_in_kgdb[] flag setting does:
> */
> - smp_wmb();
> - atomic_set(&cpu_in_kgdb[cpu], 1);
> + atomic_inc(&cpu_in_kgdb[cpu]);
>
> /* Disable any cpu specific hw breakpoints */
> kgdb_disable_hw_debug(regs);
>
> /* Wait till primary CPU is done with debugging */
> - while (atomic_read(&passive_cpu_wait[cpu]))
> + while (atomic_add_return(0, &passive_cpu_wait[cpu]))
> cpu_relax();
>
> kgdb_info[cpu].debuggerinfo = NULL;
> @@ -598,7 +597,7 @@ static void kgdb_wait(struct pt_regs *re
> arch_kgdb_ops.correct_hw_break();
>
> /* Signal the primary CPU that we are done: */
> - atomic_set(&cpu_in_kgdb[cpu], 0);
> + atomic_dec(&cpu_in_kgdb[cpu]);
> touch_softlockup_watchdog_sync();
> clocksource_touch_watchdog();
> local_irq_restore(flags);
> @@ -1493,7 +1492,7 @@ acquirelock:
> * spin_lock code is good enough as a barrier so we don't
> * need one here:
> */
> - atomic_set(&cpu_in_kgdb[ks->cpu], 1);
> + atomic_inc(&cpu_in_kgdb[ks->cpu]);
>
> #ifdef CONFIG_SMP
> /* Signal the other CPUs to enter kgdb_wait() */
> @@ -1505,7 +1504,7 @@ acquirelock:
> * Wait for the other CPUs to be notified and be waiting for us:
> */
> for_each_online_cpu(i) {
> - while (!atomic_read(&cpu_in_kgdb[i]))
> + while (!atomic_add_return(0, &cpu_in_kgdb[i]))
> cpu_relax();
> }
>
> @@ -1528,7 +1527,7 @@ acquirelock:
>
> kgdb_info[ks->cpu].debuggerinfo = NULL;
> kgdb_info[ks->cpu].task = NULL;
> - atomic_set(&cpu_in_kgdb[ks->cpu], 0);
> + atomic_dec(&cpu_in_kgdb[ks->cpu]);
>
> if (!kgdb_single_step) {
> for (i = NR_CPUS-1; i >= 0; i--)
> @@ -1538,7 +1537,7 @@ acquirelock:
> * from the debugger.
> */
> for_each_online_cpu(i) {
> - while (atomic_read(&cpu_in_kgdb[i]))
> + while (atomic_add_return(0, &cpu_in_kgdb[i]))
> cpu_relax();
> }
> }
> @@ -1742,11 +1741,11 @@ EXPORT_SYMBOL_GPL(kgdb_unregister_io_mod
> */
> void kgdb_breakpoint(void)
> {
> - atomic_set(&kgdb_setting_breakpoint, 1);
> + atomic_inc(&kgdb_setting_breakpoint);
> wmb(); /* Sync point before breakpoint */
> arch_kgdb_breakpoint();
> wmb(); /* Sync point after breakpoint */
> - atomic_set(&kgdb_setting_breakpoint, 0);
> + atomic_dec(&kgdb_setting_breakpoint);
> }
> EXPORT_SYMBOL_GPL(kgdb_breakpoint);
Appears to work, but it's hard to be 100% with these sorts of things.
I ran the basic testsuite, the fork tests and the breakpoint tests as mentioned
in drivers/misc/kgdbts.c. The patch might introduce more barriers than we really
need, but this isn't performance-critical code and without it, KGDB is broken, so:
Tested-by: Will Deacon <will.deacon@arm.com>
Cheers,
Will
^ permalink raw reply [flat|nested] 13+ messages in thread
* [Kgdb-bugreport] [PATCH] ARM: change definition of cpu_relax() for ARM11MPCore
2010-03-10 22:06 ` [Kgdb-bugreport] " Jason Wessel
` (2 preceding siblings ...)
2010-03-11 14:51 ` Will Deacon
@ 2010-03-16 17:26 ` Will Deacon
2010-03-16 18:52 ` Jason Wessel
3 siblings, 1 reply; 13+ messages in thread
From: Will Deacon @ 2010-03-16 17:26 UTC (permalink / raw)
To: linux-arm-kernel
Hi Jason,
> Do you think you can try the patch below?
>
> It seems we might not need to change to using the atomic_add_return(0,...) because using the
> atomic_inc() and atomic_dec() will end up using the memory barriers.
>
> I would certainly rather fix kgdb vs mucking with the internals of cpu_relax().
>
>
> Jason.
>
> --- a/kernel/kgdb.c
> +++ b/kernel/kgdb.c
> @@ -580,14 +580,13 @@ static void kgdb_wait(struct pt_regs *re
> * Make sure the above info reaches the primary CPU before
> * our cpu_in_kgdb[] flag setting does:
> */
> - smp_wmb();
> - atomic_set(&cpu_in_kgdb[cpu], 1);
> + atomic_inc(&cpu_in_kgdb[cpu]);
>
> /* Disable any cpu specific hw breakpoints */
> kgdb_disable_hw_debug(regs);
>
> /* Wait till primary CPU is done with debugging */
> - while (atomic_read(&passive_cpu_wait[cpu]))
> + while (atomic_add_return(0, &passive_cpu_wait[cpu]))
> cpu_relax();
>
> kgdb_info[cpu].debuggerinfo = NULL;
> @@ -598,7 +597,7 @@ static void kgdb_wait(struct pt_regs *re
> arch_kgdb_ops.correct_hw_break();
>
> /* Signal the primary CPU that we are done: */
> - atomic_set(&cpu_in_kgdb[cpu], 0);
> + atomic_dec(&cpu_in_kgdb[cpu]);
> touch_softlockup_watchdog_sync();
> clocksource_touch_watchdog();
> local_irq_restore(flags);
> @@ -1493,7 +1492,7 @@ acquirelock:
> * spin_lock code is good enough as a barrier so we don't
> * need one here:
> */
> - atomic_set(&cpu_in_kgdb[ks->cpu], 1);
> + atomic_inc(&cpu_in_kgdb[ks->cpu]);
>
> #ifdef CONFIG_SMP
> /* Signal the other CPUs to enter kgdb_wait() */
> @@ -1505,7 +1504,7 @@ acquirelock:
> * Wait for the other CPUs to be notified and be waiting for us:
> */
> for_each_online_cpu(i) {
> - while (!atomic_read(&cpu_in_kgdb[i]))
> + while (!atomic_add_return(0, &cpu_in_kgdb[i]))
> cpu_relax();
> }
>
> @@ -1528,7 +1527,7 @@ acquirelock:
>
> kgdb_info[ks->cpu].debuggerinfo = NULL;
> kgdb_info[ks->cpu].task = NULL;
> - atomic_set(&cpu_in_kgdb[ks->cpu], 0);
> + atomic_dec(&cpu_in_kgdb[ks->cpu]);
>
> if (!kgdb_single_step) {
> for (i = NR_CPUS-1; i >= 0; i--)
> @@ -1538,7 +1537,7 @@ acquirelock:
> * from the debugger.
> */
> for_each_online_cpu(i) {
> - while (atomic_read(&cpu_in_kgdb[i]))
> + while (atomic_add_return(0, &cpu_in_kgdb[i]))
> cpu_relax();
> }
> }
> @@ -1742,11 +1741,11 @@ EXPORT_SYMBOL_GPL(kgdb_unregister_io_mod
> */
> void kgdb_breakpoint(void)
> {
> - atomic_set(&kgdb_setting_breakpoint, 1);
> + atomic_inc(&kgdb_setting_breakpoint);
> wmb(); /* Sync point before breakpoint */
> arch_kgdb_breakpoint();
> wmb(); /* Sync point after breakpoint */
> - atomic_set(&kgdb_setting_breakpoint, 0);
> + atomic_dec(&kgdb_setting_breakpoint);
> }
> EXPORT_SYMBOL_GPL(kgdb_breakpoint);
>
What's the status on this patch? Russell has applied the ARM SMP code to his
tree now, so it would be good to get this patch into KGDB to avoid the deadlock
on ARM11MPCore.
I only ask because I can't see it in -next [although I see kgdb.c has moved!].
Cheers,
Will
^ permalink raw reply [flat|nested] 13+ messages in thread
* [Kgdb-bugreport] [PATCH] ARM: change definition of cpu_relax() for ARM11MPCore
2010-03-16 17:26 ` Will Deacon
@ 2010-03-16 18:52 ` Jason Wessel
0 siblings, 0 replies; 13+ messages in thread
From: Jason Wessel @ 2010-03-16 18:52 UTC (permalink / raw)
To: linux-arm-kernel
On 03/16/2010 12:26 PM, Will Deacon wrote:
> Hi Jason,
>
>
> What's the status on this patch? Russell has applied the ARM SMP code to his
> tree now, so it would be good to get this patch into KGDB to avoid the deadlock
> on ARM11MPCore.
>
>
Linux-next presently reflects the state of the outstanding pull request
I have to Linus. I have yet to hear back any word as to what happened
with my pull request.
> I only ask because I can't see it in -next [although I see kgdb.c has moved!].
>
It lives in kernel/debug/debug_core.c, assuming the code in linux-next
gets pulled into Linus's tree. I already have an updated version of the
patch for the new structure, but not pushed to linux-next while my pull
request is in limbo.
Jason.
^ permalink raw reply [flat|nested] 13+ messages in thread
* [PATCH] ARM: change definition of cpu_relax() for ARM11MPCore
@ 2010-04-12 17:23 Will Deacon
2010-04-12 17:32 ` Russell King - ARM Linux
2010-04-19 14:39 ` Will Deacon
0 siblings, 2 replies; 13+ messages in thread
From: Will Deacon @ 2010-04-12 17:23 UTC (permalink / raw)
To: linux-arm-kernel
Linux expects that if a CPU modifies a memory location, then that
modification will eventually become visible to other CPUs in the system.
On an ARM11MPCore processor, loads are prioritised over stores so it is
possible for a store operation to be postponed if a polling loop immediately
follows it. If the variable being polled indirectly depends on the outstanding
store [for example, another CPU may be polling the variable that is pending
modification] then there is the potential for deadlock if interrupts are
disabled. This deadlock occurs in the KGDB testsuire when executing on an
SMP ARM11MPCore configuration.
This patch changes the definition of cpu_relax() to smp_mb() for ARMv6 cores,
forcing the write buffer to drain while inside a polling loop on an SMP system.
If the Kernel is not compiled for SMP support, this will expand to a barrier()
as before.
Cc: KGDB Mailing List <kgdb-bugreport@lists.sourceforge.net>
Cc: Russell King - ARM Linux <linux@arm.linux.org.uk>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Jason Wessel <jason.wessel@windriver.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
---
arch/arm/include/asm/processor.h | 9 +++++++++
1 files changed, 9 insertions(+), 0 deletions(-)
diff --git a/arch/arm/include/asm/processor.h b/arch/arm/include/asm/processor.h
index 6a89567..da2078e 100644
--- a/arch/arm/include/asm/processor.h
+++ b/arch/arm/include/asm/processor.h
@@ -91,7 +91,16 @@ extern void release_thread(struct task_struct *);
unsigned long get_wchan(struct task_struct *p);
+#if __LINUX_ARM_ARCH__ == 6
+/*
+ * The store buffer on ARM11MPCore is not guaranteed to drain when the CPU
+ * is performing aggresive loads. This is usually the case in a polling loop,
+ * so we add a memory barrier to allow any pending stores to complete.
+ */
+#define cpu_relax() smp_mb()
+#else
#define cpu_relax() barrier()
+#endif
/*
* Create a new kernel thread
--
1.6.3.3
^ permalink raw reply related [flat|nested] 13+ messages in thread
* [PATCH] ARM: change definition of cpu_relax() for ARM11MPCore
2010-04-12 17:23 [PATCH] ARM: change definition of cpu_relax() for ARM11MPCore Will Deacon
@ 2010-04-12 17:32 ` Russell King - ARM Linux
2010-04-15 17:36 ` [Kgdb-bugreport] " George G. Davis
2010-04-19 14:39 ` Will Deacon
1 sibling, 1 reply; 13+ messages in thread
From: Russell King - ARM Linux @ 2010-04-12 17:32 UTC (permalink / raw)
To: linux-arm-kernel
On Mon, Apr 12, 2010 at 06:23:58PM +0100, Will Deacon wrote:
> This patch changes the definition of cpu_relax() to smp_mb() for ARMv6 cores,
> forcing the write buffer to drain while inside a polling loop on an SMP system.
> If the Kernel is not compiled for SMP support, this will expand to a barrier()
> as before.
Linus asked how expensive (in terms of power rather than performance)
this was; so far that question has remained unanswered. Can someone
please answer his question?
^ permalink raw reply [flat|nested] 13+ messages in thread
* [Kgdb-bugreport] [PATCH] ARM: change definition of cpu_relax() for ARM11MPCore
2010-04-12 17:32 ` Russell King - ARM Linux
@ 2010-04-15 17:36 ` George G. Davis
2010-04-15 21:03 ` Jamie Lokier
2010-04-16 13:54 ` Will Deacon
0 siblings, 2 replies; 13+ messages in thread
From: George G. Davis @ 2010-04-15 17:36 UTC (permalink / raw)
To: linux-arm-kernel
Hi,
On Mon, Apr 12, 2010 at 06:32:47PM +0100, Russell King - ARM Linux wrote:
> On Mon, Apr 12, 2010 at 06:23:58PM +0100, Will Deacon wrote:
> > This patch changes the definition of cpu_relax() to smp_mb() for ARMv6 cores,
> > forcing the write buffer to drain while inside a polling loop on an SMP system.
> > If the Kernel is not compiled for SMP support, this will expand to a barrier()
> > as before.
If I've followed these threads [1][2] correctly, this ARM11 MPCore issue
was discovered while running the "KGDB: internal test suite" (KGDB_TESTS)
and that problem is resolved via "kgdb: use atomic_inc and atomic_dec
instead of atomic_set" [3]. If so, isn't the original ARM11 MPCore KGDB
cpu_relax() issue just a red herring? Shouldn't any polling loops
which depend on specific (hardware) write/read order implement appropriate
barriers rather than rely on cpu_relax() to guarantee order? If
perhaps there are indeed other cases where cpu_relax() is being used
incorrectly, then maybe those should be fixed instead? Just curious...
> Linus asked how expensive (in terms of power rather than performance)
> this was; so far that question has remained unanswered. Can someone
> please answer his question?
Thanks!
--
Regards,
George
[1] http://lists.infradead.org/pipermail/linux-arm-kernel/2010-March/010691.html
[2] http://lists.infradead.org/pipermail/linux-arm-kernel/2010-March/011076.html
[3] http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff_plain;h=ae6bf53e0255c8ab04b6fe31806e318432570e3c
^ permalink raw reply [flat|nested] 13+ messages in thread
* [Kgdb-bugreport] [PATCH] ARM: change definition of cpu_relax() for ARM11MPCore
2010-04-15 17:36 ` [Kgdb-bugreport] " George G. Davis
@ 2010-04-15 21:03 ` Jamie Lokier
2010-04-16 13:54 ` Will Deacon
1 sibling, 0 replies; 13+ messages in thread
From: Jamie Lokier @ 2010-04-15 21:03 UTC (permalink / raw)
To: linux-arm-kernel
George G. Davis wrote:
> Hi,
>
> On Mon, Apr 12, 2010 at 06:32:47PM +0100, Russell King - ARM Linux wrote:
> > On Mon, Apr 12, 2010 at 06:23:58PM +0100, Will Deacon wrote:
> > > This patch changes the definition of cpu_relax() to smp_mb() for ARMv6 cores,
> > > forcing the write buffer to drain while inside a polling loop on an SMP system.
> > > If the Kernel is not compiled for SMP support, this will expand to a barrier()
> > > as before.
>
> If I've followed these threads [1][2] correctly, this ARM11 MPCore issue
> was discovered while running the "KGDB: internal test suite" (KGDB_TESTS)
> and that problem is resolved via "kgdb: use atomic_inc and atomic_dec
> instead of atomic_set" [3]. If so, isn't the original ARM11 MPCore KGDB
> cpu_relax() issue just a red herring? Shouldn't any polling loops
> which depend on specific (hardware) write/read order implement appropriate
> barriers rather than rely on cpu_relax() to guarantee order?
Note that the need to force the write buffer to drain is _not_ an
ordering issue. It's a buffer draining issue. :-)
I'm not sure if Linux smp_wmb() guarantees to ensure prior writes will
be visible to other CPUs in a short time, or if it only guarantees
write order.
-- Jamie
^ permalink raw reply [flat|nested] 13+ messages in thread
* [Kgdb-bugreport] [PATCH] ARM: change definition of cpu_relax() for ARM11MPCore
2010-04-15 17:36 ` [Kgdb-bugreport] " George G. Davis
2010-04-15 21:03 ` Jamie Lokier
@ 2010-04-16 13:54 ` Will Deacon
1 sibling, 0 replies; 13+ messages in thread
From: Will Deacon @ 2010-04-16 13:54 UTC (permalink / raw)
To: linux-arm-kernel
Hi George,
> Hi,
>
> On Mon, Apr 12, 2010 at 06:32:47PM +0100, Russell King - ARM Linux wrote:
> > On Mon, Apr 12, 2010 at 06:23:58PM +0100, Will Deacon wrote:
> > > This patch changes the definition of cpu_relax() to smp_mb() for ARMv6 cores,
> > > forcing the write buffer to drain while inside a polling loop on an SMP system.
> > > If the Kernel is not compiled for SMP support, this will expand to a barrier()
> > > as before.
>
> If I've followed these threads [1][2] correctly, this ARM11 MPCore issue
> was discovered while running the "KGDB: internal test suite" (KGDB_TESTS)
> and that problem is resolved via "kgdb: use atomic_inc and atomic_dec
> instead of atomic_set" [3].
Yes, I encountered the issue in the KGDB testsuite. However, the patch [3]
in mainline doesn't resolve the issue as pointed out here:
http://lkml.org/lkml/2010/4/8/214
> If so, isn't the original ARM11 MPCore KGDB
> cpu_relax() issue just a red herring? Shouldn't any polling loops
> which depend on specific (hardware) write/read order implement appropriate
> barriers rather than rely on cpu_relax() to guarantee order? If
> perhaps there are indeed other cases where cpu_relax() is being used
> incorrectly, then maybe those should be fixed instead? Just curious...
As pointed out by Jamie Lokier, this isn't about ordering, it's about
ensuring that stores eventually make it to memory.
According to Linus:
`Linux does expect that if some other CPU modifies a memory location,
then we _will_ see that modification eventually. If the CPU needs help to
do so, then cpu_relax() needs to do that.'
Since the ARM11MPCore doesn't make this guarantee in hardware, then we
have to fix it up in software. We can either fix up every polling loop in
the Kernel individually or [as suggested above] modify cpu_relax() to
do the work for us. Note that we only need to modify cpu_relax() for this
particular core; other ARM SMP variants behave more sanely.
Thanks,
Will
^ permalink raw reply [flat|nested] 13+ messages in thread
* [PATCH] ARM: change definition of cpu_relax() for ARM11MPCore
2010-04-12 17:23 [PATCH] ARM: change definition of cpu_relax() for ARM11MPCore Will Deacon
2010-04-12 17:32 ` Russell King - ARM Linux
@ 2010-04-19 14:39 ` Will Deacon
1 sibling, 0 replies; 13+ messages in thread
From: Will Deacon @ 2010-04-19 14:39 UTC (permalink / raw)
To: linux-arm-kernel
Hi Russell,
> On Mon, Apr 12, 2010 at 06:23:58PM +0100, Will Deacon wrote:
> > This patch changes the definition of cpu_relax() to smp_mb() for ARMv6 cores,
> > forcing the write buffer to drain while inside a polling loop on an SMP system.
> > If the Kernel is not compiled for SMP support, this will expand to a barrier()
> > as before.
>
> Linus asked how expensive (in terms of power rather than performance)
> this was; so far that question has remained unanswered. Can someone
> please answer his question?
I gave my opinion on the potential performance impact here:
http://sourceforge.net/mailarchive/message.php?msg_name=000001cada64%24b9f84200%242de8c600%24%40deacon%40arm.com
To summarise: in the context of polling loops, I believe that the
performance impact of adding an smp_mb() to cpu_relax() for v6 SMP
is minimal. If cpu_relax() is used in other scenarios, the performance
impact can only be reviewed on a case-by-case basis.
Are there any other issues that need to be considered, or can I
submit this to the patch system?
Thanks,
Will
^ permalink raw reply [flat|nested] 13+ messages in thread
end of thread, other threads:[~2010-04-19 14:39 UTC | newest]
Thread overview: 13+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2010-04-12 17:23 [PATCH] ARM: change definition of cpu_relax() for ARM11MPCore Will Deacon
2010-04-12 17:32 ` Russell King - ARM Linux
2010-04-15 17:36 ` [Kgdb-bugreport] " George G. Davis
2010-04-15 21:03 ` Jamie Lokier
2010-04-16 13:54 ` Will Deacon
2010-04-19 14:39 ` Will Deacon
-- strict thread matches above, loose matches on Subject: below --
2010-03-09 16:06 Will Deacon
2010-03-09 16:22 ` Russell King - ARM Linux
2010-03-09 16:35 ` Will Deacon
2010-03-09 16:49 ` Russell King - ARM Linux
2010-03-09 17:59 ` Will Deacon
2010-03-10 22:06 ` [Kgdb-bugreport] " Jason Wessel
2010-03-11 2:47 ` DDD
2010-03-11 13:53 ` Will Deacon
2010-03-11 13:29 ` Will Deacon
2010-03-11 14:51 ` Will Deacon
2010-03-16 17:26 ` Will Deacon
2010-03-16 18:52 ` Jason Wessel
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.