* [PATCH v4 0/2] clocksource/drivers/dw_apb_timer: improve performance a bit
@ 2015-11-25 16:01 Jisheng Zhang
2015-11-25 16:01 ` [PATCH v4 1/2] clocksource/drivers/dw_apb_timer: Inline apbt_readl and apbt_writel Jisheng Zhang
` (2 more replies)
0 siblings, 3 replies; 4+ messages in thread
From: Jisheng Zhang @ 2015-11-25 16:01 UTC (permalink / raw)
To: daniel.lezcano, tglx, arnd; +Cc: linux-kernel, linux-arm-kernel, Jisheng Zhang
These two patches try to improve the dw_apb_timer clocksource/clockevent
performance.

These patches depend on the apbt_readl return value fix patch:

http://lists.infradead.org/pipermail/linux-arm-kernel/2015-November/388250.html

since v3:
 - fix commit msg: we measured 4096 rounds of function calls, so add this
   information to the commit msg to avoid confusion.

since v2:
 - only use the relaxed versions in the critical path
 - inline apbt_readl/apbt_writel
 - add some performance numbers

since v1:
 - correct the wrong sentence in the commit msg about writel performance on
   CA9 with outer L2 cache. Thanks to Arnd for pointing this out in another
   thread.

Jisheng Zhang (2):
  clocksource/drivers/dw_apb_timer: Inline apbt_readl and apbt_writel
  clocksource/drivers/dw_apb_timer: Use {readl|writel}_relaxed in
    critical path

 drivers/clocksource/dw_apb_timer.c | 30 +++++++++++++++++++++---------
 1 file changed, 21 insertions(+), 9 deletions(-)
--
2.6.2
* [PATCH v4 1/2] clocksource/drivers/dw_apb_timer: Inline apbt_readl and apbt_writel
From: Jisheng Zhang @ 2015-11-25 16:01 UTC (permalink / raw)
To: daniel.lezcano, tglx, arnd; +Cc: linux-kernel, linux-arm-kernel, Jisheng Zhang
It seems gcc automatically inlines apbt_writel() for us, but
apbt_readl() isn't inlined. This patch marks both of them inline to get
a trivial performance improvement. Time spent on 4096 rounds of
__apbt_read_clocksource() calls on the Marvell BG4CT platform:

before the patch: 1275240ns on average
after the patch:  1263240ns on average

so we get about a 1% performance improvement.

Signed-off-by: Jisheng Zhang <jszhang@marvell.com>
---
drivers/clocksource/dw_apb_timer.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/clocksource/dw_apb_timer.c b/drivers/clocksource/dw_apb_timer.c
index 3a6d9db..114696b 100644
--- a/drivers/clocksource/dw_apb_timer.c
+++ b/drivers/clocksource/dw_apb_timer.c
@@ -49,13 +49,13 @@ clocksource_to_dw_apb_clocksource(struct clocksource *cs)
 	return container_of(cs, struct dw_apb_clocksource, cs);
 }

-static u32 apbt_readl(struct dw_apb_timer *timer, unsigned long offs)
+static inline u32 apbt_readl(struct dw_apb_timer *timer, unsigned long offs)
 {
 	return readl(timer->base + offs);
 }

-static void apbt_writel(struct dw_apb_timer *timer, u32 val,
-		 unsigned long offs)
+static inline void apbt_writel(struct dw_apb_timer *timer, u32 val,
+			       unsigned long offs)
 {
 	writel(val, timer->base + offs);
 }
--
2.6.2
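
The commit message reports the time for "4096 rounds" of a call, averaged over several runs, but the measurement code itself is not part of the thread. The following is only a minimal user-space sketch of that kind of measurement, assuming a POSIX clock_gettime(); read_counter(), fake_counter, time_rounds() and average_ns() are hypothetical names, and read_counter() is a stand-in for the function under test, not the driver's code.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stdint.h>
#include <time.h>

#define ROUNDS 4096	/* same round count as quoted in the commit message */

/* Hypothetical stand-in for a memory-mapped counter register. */
static volatile uint32_t fake_counter;

/* Hypothetical function under test: read the counter and invert it. */
static uint32_t read_counter(void)
{
	return ~fake_counter;
}

/* Return the time, in nanoseconds, spent on ROUNDS back-to-back calls. */
static int64_t time_rounds(void)
{
	struct timespec start, end;
	volatile uint32_t sink = 0;	/* keeps the calls from being optimized away */
	int i;

	clock_gettime(CLOCK_MONOTONIC, &start);
	for (i = 0; i < ROUNDS; i++)
		sink = read_counter();
	clock_gettime(CLOCK_MONOTONIC, &end);
	(void)sink;

	return (int64_t)(end.tv_sec - start.tv_sec) * 1000000000LL +
	       (int64_t)(end.tv_nsec - start.tv_nsec);
}

/* Average the per-batch time over a number of repetitions. */
static int64_t average_ns(int repeats)
{
	int64_t total = 0;
	int i;

	assert(repeats > 0);
	for (i = 0; i < repeats; i++)
		total += time_rounds();

	return total / repeats;
}
```

Calling average_ns(16), for example, gives the mean time of sixteen 4096-call batches. An in-kernel measurement would more likely use something like ktime_get() around the loop; that is also an assumption, since the thread does not say how the numbers were taken.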
* [PATCH v4 2/2] clocksource/drivers/dw_apb_timer: Use {readl|writel}_relaxed in critical path
From: Jisheng Zhang @ 2015-11-25 16:01 UTC (permalink / raw)
To: daniel.lezcano, tglx, arnd; +Cc: linux-kernel, linux-arm-kernel, Jisheng Zhang
It's safe to use the relaxed versions here. The relaxed I/O accessor
macros are now available on all architectures, so we can use them in
the critical path to get a small system performance improvement. We
measured the time the following functions take on Marvell BG4CT:

4096 rounds of __apbt_read_clocksource() calls:
before the patch: 1263240ns on average
after the patch:  1250080ns on average
improved by 1%

4096 rounds of apbt_eoi() calls:
before the patch: 1290960ns on average
after the patch:  1248240ns on average
improved by 3.3%

4096 rounds of apbt_next_event() calls:
before the patch: 3333660ns on average
after the patch:  1322040ns on average
improved by 60%!

Signed-off-by: Jisheng Zhang <jszhang@marvell.com>
---
drivers/clocksource/dw_apb_timer.c | 24 ++++++++++++++++++------
1 file changed, 18 insertions(+), 6 deletions(-)
diff --git a/drivers/clocksource/dw_apb_timer.c b/drivers/clocksource/dw_apb_timer.c
index 114696b..6334526 100644
--- a/drivers/clocksource/dw_apb_timer.c
+++ b/drivers/clocksource/dw_apb_timer.c
@@ -60,6 +60,17 @@ static inline void apbt_writel(struct dw_apb_timer *timer, u32 val,
 	writel(val, timer->base + offs);
 }

+static inline u32 apbt_readl_relaxed(struct dw_apb_timer *timer, unsigned long offs)
+{
+	return readl_relaxed(timer->base + offs);
+}
+
+static inline void apbt_writel_relaxed(struct dw_apb_timer *timer, u32 val,
+				       unsigned long offs)
+{
+	writel_relaxed(val, timer->base + offs);
+}
+
 static void apbt_disable_int(struct dw_apb_timer *timer)
 {
 	u32 ctrl = apbt_readl(timer, APBTMR_N_CONTROL);
@@ -81,7 +92,7 @@ void dw_apb_clockevent_pause(struct dw_apb_clock_event_device *dw_ced)

 static void apbt_eoi(struct dw_apb_timer *timer)
 {
-	apbt_readl(timer, APBTMR_N_EOI);
+	apbt_readl_relaxed(timer, APBTMR_N_EOI);
 }

 static irqreturn_t dw_apb_clockevent_irq(int irq, void *data)
@@ -200,13 +211,13 @@ static int apbt_next_event(unsigned long delta,
 	struct dw_apb_clock_event_device *dw_ced = ced_to_dw_apb_ced(evt);

 	/* Disable timer */
-	ctrl = apbt_readl(&dw_ced->timer, APBTMR_N_CONTROL);
+	ctrl = apbt_readl_relaxed(&dw_ced->timer, APBTMR_N_CONTROL);
 	ctrl &= ~APBTMR_CONTROL_ENABLE;
-	apbt_writel(&dw_ced->timer, ctrl, APBTMR_N_CONTROL);
+	apbt_writel_relaxed(&dw_ced->timer, ctrl, APBTMR_N_CONTROL);
 	/* write new count */
-	apbt_writel(&dw_ced->timer, delta, APBTMR_N_LOAD_COUNT);
+	apbt_writel_relaxed(&dw_ced->timer, delta, APBTMR_N_LOAD_COUNT);
 	ctrl |= APBTMR_CONTROL_ENABLE;
-	apbt_writel(&dw_ced->timer, ctrl, APBTMR_N_CONTROL);
+	apbt_writel_relaxed(&dw_ced->timer, ctrl, APBTMR_N_CONTROL);

 	return 0;
 }
@@ -342,7 +353,8 @@ static cycle_t __apbt_read_clocksource(struct clocksource *cs)
 	struct dw_apb_clocksource *dw_cs =
 		clocksource_to_dw_apb_clocksource(cs);

-	current_count = apbt_readl(&dw_cs->timer, APBTMR_N_CURRENT_VALUE);
+	current_count = apbt_readl_relaxed(&dw_cs->timer,
+					   APBTMR_N_CURRENT_VALUE);

 	return (cycle_t)~current_count;
 }
--
2.6.2
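
The percentages quoted in the commit message above can be re-derived from the raw totals. The sketch below shows that arithmetic and the matching per-call cost; improvement_pct() and per_call_ns() are hypothetical helper names introduced only for illustration and are not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

#define ROUNDS 4096	/* calls per measured batch, as stated in the commit message */

/* Relative improvement, in percent, between two totals in the same unit. */
static double improvement_pct(uint64_t before_ns, uint64_t after_ns)
{
	assert(before_ns > 0 && after_ns <= before_ns);
	return 100.0 * (double)(before_ns - after_ns) / (double)before_ns;
}

/* Average cost of a single call, given the total for ROUNDS calls. */
static double per_call_ns(uint64_t total_ns)
{
	return (double)total_ns / ROUNDS;
}
```

With the figures from the commit message, improvement_pct(1263240, 1250080) is about 1.0 for __apbt_read_clocksource(), improvement_pct(1290960, 1248240) is about 3.3 for apbt_eoi(), and improvement_pct(3333660, 1322040) is about 60.3 for apbt_next_event(). Per call, apbt_next_event() drops from roughly 814ns to roughly 323ns.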
* Re: [PATCH v4 0/2] clocksource/drivers/dw_apb_timer: improve performance a bit
From: Daniel Lezcano @ 2015-12-15 20:42 UTC (permalink / raw)
To: Jisheng Zhang, tglx, arnd; +Cc: linux-kernel, linux-arm-kernel
On 11/25/2015 05:01 PM, Jisheng Zhang wrote:
> These two patches try to improve the dw_apb_timer clocksource/clockevent
> performance.
>
> These patches depend on the apbt_readl return value fix patch:
>
> http://lists.infradead.org/pipermail/linux-arm-kernel/2015-November/388250.html
>
> since v3:
> - fix commit msg: we measured 4096 rounds of function call. So add this
> information into commit msg to avoid confusion.
>
> since v2:
> - only use relaxed version in critical path
> - inline apbt_readl/apbt_writel
> - add some performance numbers
>
> since v1:
> - correct the wrong sentence in commit msg about writel performance on
> CA9 with outer L2 cache. Thank Arnd for pointing out this in another
> thread.
>
>
> Jisheng Zhang (2):
> clocksource/drivers/dw_apb_timer: Inline apbt_readl and apbt_writel
> clocksource/drivers/dw_apb_timer: Use {readl|writel}_relaxed in
> critical path
>
> drivers/clocksource/dw_apb_timer.c | 30 +++++++++++++++++++++---------
> 1 file changed, 21 insertions(+), 9 deletions(-)
Applied.
Thanks !
-- Daniel