* [PATCH v2 0/3] Introduce atomic MMIO register clear-set
@ 2013-08-20 16:48 Ezequiel Garcia
2013-08-20 16:48 ` [PATCH v2 1/3] ARM: Introduce atomic MMIO clear/set Ezequiel Garcia
` (2 more replies)
0 siblings, 3 replies; 10+ messages in thread
From: Ezequiel Garcia @ 2013-08-20 16:48 UTC (permalink / raw)
To: linux-arm-kernel
Here's the v2 for the introduction of a generic thread-safe clear-set
register access.
Following the v1, I've added a proper barrier to ensure proper ordering,
between the writel and the spin_unlock.
It's worth noticing a different approach has been proposed (several times)
for getting thread-safe register access, through the use of regmap_update_bits
(just as it's done in mfd/syscon).
However, as noted in [1], such solution has been rejected (several times)
for two reasons:
* It's way more heavy-weight than necessary
* It cannot be used in very early scenarios,
such as a clocksource initialization.
Using this API it's possible -for instance- to add support for Armada 370/XP
in the orion_wdt driver. That work is ready, and it's been hold until we
decide a proper solution for shared-register access.
[1] http://www.spinics.net/lists/arm-kernel/msg266494.html
Ezequiel Garcia (3):
ARM: Introduce atomic MMIO clear/set
clocksource: orion: Use atomic access for shared registers
watchdog: orion: Use atomic access for shared registers
arch/arm/include/asm/io.h | 5 +++++
arch/arm/kernel/io.c | 13 +++++++++++++
drivers/clocksource/time-orion.c | 9 ++-------
drivers/watchdog/orion_wdt.c | 8 ++------
4 files changed, 22 insertions(+), 13 deletions(-)
--
1.8.1.5
^ permalink raw reply [flat|nested] 10+ messages in thread
* [PATCH v2 1/3] ARM: Introduce atomic MMIO clear/set
2013-08-20 16:48 [PATCH v2 0/3] Introduce atomic MMIO register clear-set Ezequiel Garcia
@ 2013-08-20 16:48 ` Ezequiel Garcia
2013-08-20 16:55 ` Ezequiel Garcia
` (2 more replies)
2013-08-20 16:48 ` [PATCH v2 2/3] clocksource: orion: Use atomic access for shared registers Ezequiel Garcia
2013-08-20 16:48 ` [PATCH v2 3/3] watchdog: " Ezequiel Garcia
2 siblings, 3 replies; 10+ messages in thread
From: Ezequiel Garcia @ 2013-08-20 16:48 UTC (permalink / raw)
To: linux-arm-kernel
Some SoC have MMIO regions that are shared across orthogonal
subsystems. This commit implements a possible solution for the
thread-safe access of such regions through a spinlock-protected API
with clear-set semantics.
Concurrent access is protected with a single spinlock for the
entire MMIO address space. While this protects shared-registers,
it also serializes access to unrelated/unshared registers.
---
Based on a similar approach suggested by Russel King:
http://archive.arm.linux.org.uk/lurker/message/20130618.113606.d7d4fe4b.en.html
Changes from v1:
* Added an io barrier iowmb() as suggested by Will Deacon,
to ensure the writel gets completed before the spin_unlock().
Signed-off-by: Ezequiel Garcia <ezequiel.garcia@free-electrons.com>
---
arch/arm/include/asm/io.h | 5 +++++
arch/arm/kernel/io.c | 13 +++++++++++++
2 files changed, 18 insertions(+)
diff --git a/arch/arm/include/asm/io.h b/arch/arm/include/asm/io.h
index d070741..3cea1f0 100644
--- a/arch/arm/include/asm/io.h
+++ b/arch/arm/include/asm/io.h
@@ -36,6 +36,11 @@
#define isa_bus_to_virt phys_to_virt
/*
+ * Atomic MMIO-wide IO clear/set
+ */
+extern void atomic_io_clear_set(u32 clear, u32 set, void __iomem *reg);
+
+/*
* Generic IO read/write. These perform native-endian accesses. Note
* that some architectures will want to re-define __raw_{read,write}w.
*/
diff --git a/arch/arm/kernel/io.c b/arch/arm/kernel/io.c
index dcd5b4d..b2a53a3 100644
--- a/arch/arm/kernel/io.c
+++ b/arch/arm/kernel/io.c
@@ -1,6 +1,19 @@
#include <linux/export.h>
#include <linux/types.h>
#include <linux/io.h>
+#include <linux/spinlock.h>
+
+static DEFINE_SPINLOCK(__io_lock);
+
+void atomic_io_clear_set(u32 clear, u32 set, void __iomem *reg)
+{
+ spin_lock(&__io_lock);
+ writel((readl(reg) & ~clear) | set, reg);
+ /* ensure the write get done before unlocking */
+ __iowmb();
+ spin_unlock(&__io_lock);
+}
+EXPORT_SYMBOL(atomic_io_clear_set);
/*
* Copy data from IO memory space to "real" memory space.
--
1.8.1.5
^ permalink raw reply related [flat|nested] 10+ messages in thread
* [PATCH v2 2/3] clocksource: orion: Use atomic access for shared registers
2013-08-20 16:48 [PATCH v2 0/3] Introduce atomic MMIO register clear-set Ezequiel Garcia
2013-08-20 16:48 ` [PATCH v2 1/3] ARM: Introduce atomic MMIO clear/set Ezequiel Garcia
@ 2013-08-20 16:48 ` Ezequiel Garcia
2013-08-20 16:48 ` [PATCH v2 3/3] watchdog: " Ezequiel Garcia
2 siblings, 0 replies; 10+ messages in thread
From: Ezequiel Garcia @ 2013-08-20 16:48 UTC (permalink / raw)
To: linux-arm-kernel
Replace the driver-specific thread-safe shared register API
by the recently introduced atomic_io_clear_set().
Signed-off-by: Ezequiel Garcia <ezequiel.garcia@free-electrons.com>
---
drivers/clocksource/time-orion.c | 9 ++-------
1 file changed, 2 insertions(+), 7 deletions(-)
diff --git a/drivers/clocksource/time-orion.c b/drivers/clocksource/time-orion.c
index ecbeb68..4334ea7 100644
--- a/drivers/clocksource/time-orion.c
+++ b/drivers/clocksource/time-orion.c
@@ -35,20 +35,15 @@
#define ORION_ONESHOT_MAX 0xfffffffe
static void __iomem *timer_base;
-static DEFINE_SPINLOCK(timer_ctrl_lock);
/*
* Thread-safe access to TIMER_CTRL register
* (shared with watchdog timer)
*/
-void orion_timer_ctrl_clrset(u32 clr, u32 set)
+static void orion_timer_ctrl_clrset(u32 clr, u32 set)
{
- spin_lock(&timer_ctrl_lock);
- writel((readl(timer_base + TIMER_CTRL) & ~clr) | set,
- timer_base + TIMER_CTRL);
- spin_unlock(&timer_ctrl_lock);
+ atomic_io_clear_set(timer_base + TIMER_CTRL, clr, set);
}
-EXPORT_SYMBOL(orion_timer_ctrl_clrset);
/*
* Free-running clocksource handling.
--
1.8.1.5
^ permalink raw reply related [flat|nested] 10+ messages in thread
* [PATCH v2 3/3] watchdog: orion: Use atomic access for shared registers
2013-08-20 16:48 [PATCH v2 0/3] Introduce atomic MMIO register clear-set Ezequiel Garcia
2013-08-20 16:48 ` [PATCH v2 1/3] ARM: Introduce atomic MMIO clear/set Ezequiel Garcia
2013-08-20 16:48 ` [PATCH v2 2/3] clocksource: orion: Use atomic access for shared registers Ezequiel Garcia
@ 2013-08-20 16:48 ` Ezequiel Garcia
2 siblings, 0 replies; 10+ messages in thread
From: Ezequiel Garcia @ 2013-08-20 16:48 UTC (permalink / raw)
To: linux-arm-kernel
Since the timer control register is shared with the clocksource driver,
use the recently introduced atomic_io_clear_set() to access such register.
Signed-off-by: Ezequiel Garcia <ezequiel.garcia@free-electrons.com>
---
drivers/watchdog/orion_wdt.c | 8 ++------
1 file changed, 2 insertions(+), 6 deletions(-)
diff --git a/drivers/watchdog/orion_wdt.c b/drivers/watchdog/orion_wdt.c
index 4ea5fcc..de35ae9 100644
--- a/drivers/watchdog/orion_wdt.c
+++ b/drivers/watchdog/orion_wdt.c
@@ -73,9 +73,7 @@ static int orion_wdt_start(struct watchdog_device *wdt_dev)
writel(~WDT_INT_REQ, BRIDGE_CAUSE);
/* Enable watchdog timer */
- reg = readl(wdt_reg + TIMER_CTRL);
- reg |= WDT_EN;
- writel(reg, wdt_reg + TIMER_CTRL);
+ atomic_io_clear_set(wdt_reg + TIMER_CTRL, 0, WDT_EN);
/* Enable reset on watchdog */
reg = readl(RSTOUTn_MASK);
@@ -98,9 +96,7 @@ static int orion_wdt_stop(struct watchdog_device *wdt_dev)
writel(reg, RSTOUTn_MASK);
/* Disable watchdog timer */
- reg = readl(wdt_reg + TIMER_CTRL);
- reg &= ~WDT_EN;
- writel(reg, wdt_reg + TIMER_CTRL);
+ atomic_io_clear_set(wdt_reg + TIMER_CTRL, WDT_EN, 0);
spin_unlock(&wdt_lock);
return 0;
--
1.8.1.5
^ permalink raw reply related [flat|nested] 10+ messages in thread
* [PATCH v2 1/3] ARM: Introduce atomic MMIO clear/set
2013-08-20 16:48 ` [PATCH v2 1/3] ARM: Introduce atomic MMIO clear/set Ezequiel Garcia
@ 2013-08-20 16:55 ` Ezequiel Garcia
2013-08-20 21:08 ` Russell King - ARM Linux
2013-08-21 12:24 ` Will Deacon
2 siblings, 0 replies; 10+ messages in thread
From: Ezequiel Garcia @ 2013-08-20 16:55 UTC (permalink / raw)
To: linux-arm-kernel
On Tue, Aug 20, 2013 at 01:48:25PM -0300, Ezequiel Garcia wrote:
> Some SoC have MMIO regions that are shared across orthogonal
> subsystems. This commit implements a possible solution for the
> thread-safe access of such regions through a spinlock-protected API
> with clear-set semantics.
>
> Concurrent access is protected with a single spinlock for the
> entire MMIO address space. While this protects shared-registers,
> it also serializes access to unrelated/unshared registers.
>
> ---
> Based on a similar approach suggested by Russel King:
> http://archive.arm.linux.org.uk/lurker/message/20130618.113606.d7d4fe4b.en.html
>
> Changes from v1:
>
> * Added an io barrier iowmb() as suggested by Will Deacon,
> to ensure the writel gets completed before the spin_unlock().
Argh! of course the above goes below the SOB.
>
> Signed-off-by: Ezequiel Garcia <ezequiel.garcia@free-electrons.com>
> ---
> arch/arm/include/asm/io.h | 5 +++++
> arch/arm/kernel/io.c | 13 +++++++++++++
> 2 files changed, 18 insertions(+)
>
> diff --git a/arch/arm/include/asm/io.h b/arch/arm/include/asm/io.h
> index d070741..3cea1f0 100644
> --- a/arch/arm/include/asm/io.h
> +++ b/arch/arm/include/asm/io.h
> @@ -36,6 +36,11 @@
> #define isa_bus_to_virt phys_to_virt
>
> /*
> + * Atomic MMIO-wide IO clear/set
> + */
> +extern void atomic_io_clear_set(u32 clear, u32 set, void __iomem *reg);
> +
> +/*
> * Generic IO read/write. These perform native-endian accesses. Note
> * that some architectures will want to re-define __raw_{read,write}w.
> */
> diff --git a/arch/arm/kernel/io.c b/arch/arm/kernel/io.c
> index dcd5b4d..b2a53a3 100644
> --- a/arch/arm/kernel/io.c
> +++ b/arch/arm/kernel/io.c
> @@ -1,6 +1,19 @@
> #include <linux/export.h>
> #include <linux/types.h>
> #include <linux/io.h>
> +#include <linux/spinlock.h>
> +
> +static DEFINE_SPINLOCK(__io_lock);
> +
> +void atomic_io_clear_set(u32 clear, u32 set, void __iomem *reg)
> +{
> + spin_lock(&__io_lock);
> + writel((readl(reg) & ~clear) | set, reg);
> + /* ensure the write get done before unlocking */
> + __iowmb();
> + spin_unlock(&__io_lock);
> +}
> +EXPORT_SYMBOL(atomic_io_clear_set);
>
> /*
> * Copy data from IO memory space to "real" memory space.
> --
> 1.8.1.5
>
--
Ezequiel Garc?a, Free Electrons
Embedded Linux, Kernel and Android Engineering
http://free-electrons.com
^ permalink raw reply [flat|nested] 10+ messages in thread
* [PATCH v2 1/3] ARM: Introduce atomic MMIO clear/set
2013-08-20 16:48 ` [PATCH v2 1/3] ARM: Introduce atomic MMIO clear/set Ezequiel Garcia
2013-08-20 16:55 ` Ezequiel Garcia
@ 2013-08-20 21:08 ` Russell King - ARM Linux
2013-08-21 14:36 ` Ezequiel Garcia
2013-08-21 12:24 ` Will Deacon
2 siblings, 1 reply; 10+ messages in thread
From: Russell King - ARM Linux @ 2013-08-20 21:08 UTC (permalink / raw)
To: linux-arm-kernel
On Tue, Aug 20, 2013 at 01:48:25PM -0300, Ezequiel Garcia wrote:
> Based on a similar approach suggested by Russel King:
Russell please.
We Russells get upset when our names are incorrectly spelt, just like
others get upset if they end up with extra letters in their names, or
you confuse Steven vs Stephen. Or even dare call a Deborah "Debs"
(I did that once and the result was not particularly nice!)
> +void atomic_io_clear_set(u32 clear, u32 set, void __iomem *reg)
> +{
> + spin_lock(&__io_lock);
> + writel((readl(reg) & ~clear) | set, reg);
> + /* ensure the write get done before unlocking */
> + __iowmb();
> + spin_unlock(&__io_lock);
> +}
> +EXPORT_SYMBOL(atomic_io_clear_set);
Some comments - neither of them you _have_ to act on:
1. writel((readl(reg) & ~mask) | (set & mask), reg) could be deemed
to give better semantics - consider that if you don't look at the
implementation, how do you know what the result of setting a bit
in both the set & clear masks would be?
2. A historical note, that back in the 1980s with things like the BBC
micro, this kind of operation was defined:
new_value = (old_value & mask) ^ value
which has the flexibility of being able to set, clear or toggle any
bit. I'm not saying that's a good interface, I'm merely pointing
out that the problem of being able to set and clear bits is nothing
new and other solutions are available. :)
3. Would it be better to separate these by having atomic_io_clear() and
atomic_io_set() functions?
Just some things to think about; I have no overall preference here.
^ permalink raw reply [flat|nested] 10+ messages in thread
* [PATCH v2 1/3] ARM: Introduce atomic MMIO clear/set
2013-08-20 16:48 ` [PATCH v2 1/3] ARM: Introduce atomic MMIO clear/set Ezequiel Garcia
2013-08-20 16:55 ` Ezequiel Garcia
2013-08-20 21:08 ` Russell King - ARM Linux
@ 2013-08-21 12:24 ` Will Deacon
2013-08-21 14:22 ` Ezequiel Garcia
2 siblings, 1 reply; 10+ messages in thread
From: Will Deacon @ 2013-08-21 12:24 UTC (permalink / raw)
To: linux-arm-kernel
Hi Ezequiel,
Cheers for the v2. I've been thinking about how to improve the performance
of this operation and I ended up completely changing my mind about how it
should be implemented :)
Comments and questions below...
On Tue, Aug 20, 2013 at 05:48:25PM +0100, Ezequiel Garcia wrote:
> diff --git a/arch/arm/kernel/io.c b/arch/arm/kernel/io.c
> index dcd5b4d..b2a53a3 100644
> --- a/arch/arm/kernel/io.c
> +++ b/arch/arm/kernel/io.c
> @@ -1,6 +1,19 @@
> #include <linux/export.h>
> #include <linux/types.h>
> #include <linux/io.h>
> +#include <linux/spinlock.h>
> +
> +static DEFINE_SPINLOCK(__io_lock);
> +
> +void atomic_io_clear_set(u32 clear, u32 set, void __iomem *reg)
First off, what exactly is this function supposed to guarantee? Do you simply
require thread-safe device access, or do you also require completion of the
the device writes?
Since the latter is device-specific (probably requiring some read-backs in
the driver), I assume you actually just need to ensure that multiple drivers
don't trip over each other. In which case, this can be *much* lighter
weight.
> +{
> + spin_lock(&__io_lock);
Is this function likely to be used in irq context? If so, better disable
IRQs here.
> + writel((readl(reg) & ~clear) | set, reg);
Going back to my earlier comments, this will assemble to something like:
dmb
ldr [device]
dsb
...
dsb
outer_sync
str [device]
dmb
If my assumption about completion is correct, you actually just want:
dmb
ldr [device]
...
str [device]
dmb
which can be done by using the _relaxed variants of readl/writel.
> + /* ensure the write get done before unlocking */
> + __iowmb();
And then, despite my previous suggestion, you can drop this line too.
> + spin_unlock(&__io_lock);
> +}
> +EXPORT_SYMBOL(atomic_io_clear_set);
Now, the only downside with this approach is that you need explicit barriers
in the driver if you want to enforce ordering with respect to normal,
cacheable buffers to be consumed/populated by the device.
What do you think? An alternative would be just to relax the readl and rely
on the dsb preceeding the writel, then add atomic_io_clear_set_relaxed
to implement what I described above, but it depends on how you envisage
this helper being used.
Will
^ permalink raw reply [flat|nested] 10+ messages in thread
* [PATCH v2 1/3] ARM: Introduce atomic MMIO clear/set
2013-08-21 12:24 ` Will Deacon
@ 2013-08-21 14:22 ` Ezequiel Garcia
2013-08-21 16:28 ` Will Deacon
0 siblings, 1 reply; 10+ messages in thread
From: Ezequiel Garcia @ 2013-08-21 14:22 UTC (permalink / raw)
To: linux-arm-kernel
Will,
On Wed, Aug 21, 2013 at 01:24:24PM +0100, Will Deacon wrote:
>
> Cheers for the v2. I've been thinking about how to improve the performance
> of this operation and I ended up completely changing my mind about how it
> should be implemented :)
>
> Comments and questions below...
>
Sure! Much appreciated...
> On Tue, Aug 20, 2013 at 05:48:25PM +0100, Ezequiel Garcia wrote:
> > diff --git a/arch/arm/kernel/io.c b/arch/arm/kernel/io.c
> > index dcd5b4d..b2a53a3 100644
> > --- a/arch/arm/kernel/io.c
> > +++ b/arch/arm/kernel/io.c
> > @@ -1,6 +1,19 @@
> > #include <linux/export.h>
> > #include <linux/types.h>
> > #include <linux/io.h>
> > +#include <linux/spinlock.h>
> > +
> > +static DEFINE_SPINLOCK(__io_lock);
> > +
> > +void atomic_io_clear_set(u32 clear, u32 set, void __iomem *reg)
>
> First off, what exactly is this function supposed to guarantee? Do you simply
> require thread-safe device access, or do you also require completion of the
> the device writes?
>
Indeed, thread-safe device access would be enough, unless I'm missing
something.
> Since the latter is device-specific (probably requiring some read-backs in
> the driver), I assume you actually just need to ensure that multiple drivers
> don't trip over each other. In which case, this can be *much* lighter
> weight.
>
Ok...
> > +{
> > + spin_lock(&__io_lock);
>
> Is this function likely to be used in irq context? If so, better disable
> IRQs here.
>
No, I don't think this API is aimed particularly at irq-context.
That said, it's a generic API that can be used anywhere, yet users are expected
to be aware of the performance impact (or at least I hope that).
And speaking about performance, I appreciate any performance tunings,
but I guess we can all agree this API is not meant for hot-paths.
> > + writel((readl(reg) & ~clear) | set, reg);
>
> Going back to my earlier comments, this will assemble to something like:
>
> dmb
> ldr [device]
> dsb
> ...
> dsb
> outer_sync
> str [device]
> dmb
>
> If my assumption about completion is correct, you actually just want:
>
> dmb
> ldr [device]
> ...
> str [device]
> dmb
>
> which can be done by using the _relaxed variants of readl/writel.
>
I see.
> > + /* ensure the write get done before unlocking */
> > + __iowmb();
>
> And then, despite my previous suggestion, you can drop this line too.
>
Ok... I'm not sure I understand why using relaxed variants allows us to drop this.
Maybe you can (try) to enlighten me?
> > + spin_unlock(&__io_lock);
> > +}
> > +EXPORT_SYMBOL(atomic_io_clear_set);
>
> Now, the only downside with this approach is that you need explicit barriers
> in the driver if you want to enforce ordering with respect to normal,
> cacheable buffers to be consumed/populated by the device.
>
> What do you think? An alternative would be just to relax the readl and rely
> on the dsb preceeding the writel, then add atomic_io_clear_set_relaxed
> to implement what I described above, but it depends on how you envisage
> this helper being used.
>
Mmm.. it's not easy to foresee, for we have only a scarce bunch of planned
usage for the API. I guess any simple shared-register access would want
to use this, typically just to enable/disable something in some 'control'
register.
Russell, what do you think?
--
Ezequiel Garc?a, Free Electrons
Embedded Linux, Kernel and Android Engineering
http://free-electrons.com
^ permalink raw reply [flat|nested] 10+ messages in thread
* [PATCH v2 1/3] ARM: Introduce atomic MMIO clear/set
2013-08-20 21:08 ` Russell King - ARM Linux
@ 2013-08-21 14:36 ` Ezequiel Garcia
0 siblings, 0 replies; 10+ messages in thread
From: Ezequiel Garcia @ 2013-08-21 14:36 UTC (permalink / raw)
To: linux-arm-kernel
Russell,
On Tue, Aug 20, 2013 at 10:08:39PM +0100, Russell King - ARM Linux wrote:
> On Tue, Aug 20, 2013 at 01:48:25PM -0300, Ezequiel Garcia wrote:
> > Based on a similar approach suggested by Russel King:
>
> Russell please.
>
> We Russells get upset when our names are incorrectly spelt, just like
> others get upset if they end up with extra letters in their names, or
> you confuse Steven vs Stephen. Or even dare call a Deborah "Debs"
> (I did that once and the result was not particularly nice!)
>
Ouch... sorry about that!
> > +void atomic_io_clear_set(u32 clear, u32 set, void __iomem *reg)
> > +{
> > + spin_lock(&__io_lock);
> > + writel((readl(reg) & ~clear) | set, reg);
> > + /* ensure the write get done before unlocking */
> > + __iowmb();
> > + spin_unlock(&__io_lock);
> > +}
> > +EXPORT_SYMBOL(atomic_io_clear_set);
>
> Some comments - neither of them you _have_ to act on:
>
> 1. writel((readl(reg) & ~mask) | (set & mask), reg) could be deemed
> to give better semantics - consider that if you don't look at the
> implementation, how do you know what the result of setting a bit
> in both the set & clear masks would be?
>
> 2. A historical note, that back in the 1980s with things like the BBC
> micro, this kind of operation was defined:
>
> new_value = (old_value & mask) ^ value
>
> which has the flexibility of being able to set, clear or toggle any
> bit. I'm not saying that's a good interface, I'm merely pointing
> out that the problem of being able to set and clear bits is nothing
> new and other solutions are available. :)
>
> 3. Would it be better to separate these by having atomic_io_clear() and
> atomic_io_set() functions?
>
> Just some things to think about; I have no overall preference here.
Indeed, I don't have any strong opinions on any of the above.
However, I'm a bit inclined to your proposal in (1), which coincides with:
int regmap_update_bits(struct regmap *map, unsigned int reg,
unsigned int mask, unsigned int val)
This looks a lot less intuitive to me, but more flexible.
Any other thoughts?
--
Ezequiel Garc?a, Free Electrons
Embedded Linux, Kernel and Android Engineering
http://free-electrons.com
^ permalink raw reply [flat|nested] 10+ messages in thread
* [PATCH v2 1/3] ARM: Introduce atomic MMIO clear/set
2013-08-21 14:22 ` Ezequiel Garcia
@ 2013-08-21 16:28 ` Will Deacon
0 siblings, 0 replies; 10+ messages in thread
From: Will Deacon @ 2013-08-21 16:28 UTC (permalink / raw)
To: linux-arm-kernel
On Wed, Aug 21, 2013 at 03:22:04PM +0100, Ezequiel Garcia wrote:
> On Wed, Aug 21, 2013 at 01:24:24PM +0100, Will Deacon wrote:
> > On Tue, Aug 20, 2013 at 05:48:25PM +0100, Ezequiel Garcia wrote:
> > > diff --git a/arch/arm/kernel/io.c b/arch/arm/kernel/io.c
> > > index dcd5b4d..b2a53a3 100644
> > > --- a/arch/arm/kernel/io.c
> > > +++ b/arch/arm/kernel/io.c
> > > @@ -1,6 +1,19 @@
> > > #include <linux/export.h>
> > > #include <linux/types.h>
> > > #include <linux/io.h>
> > > +#include <linux/spinlock.h>
> > > +
> > > +static DEFINE_SPINLOCK(__io_lock);
> > > +
> > > +void atomic_io_clear_set(u32 clear, u32 set, void __iomem *reg)
> >
> > First off, what exactly is this function supposed to guarantee? Do you simply
> > require thread-safe device access, or do you also require completion of the
> > the device writes?
> >
>
> Indeed, thread-safe device access would be enough, unless I'm missing
> something.
Great!
> > > + spin_lock(&__io_lock);
> >
> > Is this function likely to be used in irq context? If so, better disable
> > IRQs here.
> >
>
> No, I don't think this API is aimed particularly at irq-context.
>
> That said, it's a generic API that can be used anywhere, yet users are expected
> to be aware of the performance impact (or at least I hope that).
>
> And speaking about performance, I appreciate any performance tunings,
> but I guess we can all agree this API is not meant for hot-paths.
Well, device accesses in interrupt handlers are very common, so if you need
to co-exist with other users of the device, you could decide to use this
API. I think it's a good idea to use the _irqsave variant here.
> > > + /* ensure the write get done before unlocking */
> > > + __iowmb();
> >
> > And then, despite my previous suggestion, you can drop this line too.
> >
>
> Ok... I'm not sure I understand why using relaxed variants allows us to drop this.
>
> Maybe you can (try) to enlighten me?
Sure. Firstly, an __iowmb() is only required for ensuring *endpoint* ordering
between normal, cacheable memory and device memory. i.e. you write a normal
cacheable buffer, then you poke a device to go and read it. That is why
there is this barrier in writel, before the actual device write: it ensures
that the buffer is visible to the device before it sees the control write.
My original, rushed reading of your patch lead me to think that this was all
about endpoint ordering and you were trying to guarantee some sort of
immediacy of the device write by using a spinlock, but now I see that's not
the case. Given that the device is never going to go and read the spinlock
value, you can drop the __iowmb() regardless.
> > Now, the only downside with this approach is that you need explicit barriers
> > in the driver if you want to enforce ordering with respect to normal,
> > cacheable buffers to be consumed/populated by the device.
> >
> > What do you think? An alternative would be just to relax the readl and rely
> > on the dsb preceeding the writel, then add atomic_io_clear_set_relaxed
> > to implement what I described above, but it depends on how you envisage
> > this helper being used.
> >
>
> Mmm.. it's not easy to foresee, for we have only a scarce bunch of planned
> usage for the API. I guess any simple shared-register access would want
> to use this, typically just to enable/disable something in some 'control'
> register.
Right, but if that control register causes the device to go off and read a
normal, cacheable buffer then the driver would need explicit barriers if we
exclusively use relaxed accessors here. The more I think about it, the more
I like the idea of having two versions:
atomic_io_clear_set
spin_lock
writel(readl_relaxed(...)
spin_unlock
atomic_io_clear_set_relaxed
spin_lock
writel_relaxed(readl_relaxed(...))
spin_unlock
That way, the two versions match the semantics of the simpler I/O accessors
from which they are built and drivers can use the relaxed variant as an
optimisation.
Hope that helps, and sorry for the confusion.
Will
^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2013-08-21 16:28 UTC | newest]
Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2013-08-20 16:48 [PATCH v2 0/3] Introduce atomic MMIO register clear-set Ezequiel Garcia
2013-08-20 16:48 ` [PATCH v2 1/3] ARM: Introduce atomic MMIO clear/set Ezequiel Garcia
2013-08-20 16:55 ` Ezequiel Garcia
2013-08-20 21:08 ` Russell King - ARM Linux
2013-08-21 14:36 ` Ezequiel Garcia
2013-08-21 12:24 ` Will Deacon
2013-08-21 14:22 ` Ezequiel Garcia
2013-08-21 16:28 ` Will Deacon
2013-08-20 16:48 ` [PATCH v2 2/3] clocksource: orion: Use atomic access for shared registers Ezequiel Garcia
2013-08-20 16:48 ` [PATCH v2 3/3] watchdog: " Ezequiel Garcia
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).