linux-arm-kernel.lists.infradead.org archive mirror
* [PATCH v2] clocksource/drivers/dw_apb_timer: Use {readl|writel}_relaxed
From: Jisheng Zhang @ 2015-11-13 12:31 UTC
  To: linux-arm-kernel

It is safe for this driver to use the relaxed accessors. The relaxed I/O
accessor macros are now available on all architectures, so we can use
them to get a small overall system performance improvement and slightly
lower latency on some architectures.

Signed-off-by: Jisheng Zhang <jszhang@marvell.com>
---
since v1:
 - correct the wrong sentence in the commit message about writel performance
   on CA9 with outer L2 cache. Thanks to Arnd for pointing this out in
   another thread
 drivers/clocksource/dw_apb_timer.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/clocksource/dw_apb_timer.c b/drivers/clocksource/dw_apb_timer.c
index c76c750..04282ee 100644
--- a/drivers/clocksource/dw_apb_timer.c
+++ b/drivers/clocksource/dw_apb_timer.c
@@ -51,13 +51,13 @@ clocksource_to_dw_apb_clocksource(struct clocksource *cs)
 
 static unsigned long apbt_readl(struct dw_apb_timer *timer, unsigned long offs)
 {
-	return readl(timer->base + offs);
+	return readl_relaxed(timer->base + offs);
 }
 
 static void apbt_writel(struct dw_apb_timer *timer, unsigned long val,
 		 unsigned long offs)
 {
-	writel(val, timer->base + offs);
+	writel_relaxed(val, timer->base + offs);
 }
 
 static void apbt_disable_int(struct dw_apb_timer *timer)
-- 
2.6.2


* [PATCH v2] clocksource/drivers/dw_apb_timer: Use {readl|writel}_relaxed
From: Arnd Bergmann @ 2015-11-13 12:42 UTC
  To: linux-arm-kernel

On Friday 13 November 2015 20:31:23 Jisheng Zhang wrote:
> diff --git a/drivers/clocksource/dw_apb_timer.c b/drivers/clocksource/dw_apb_timer.c
> index c76c750..04282ee 100644
> --- a/drivers/clocksource/dw_apb_timer.c
> +++ b/drivers/clocksource/dw_apb_timer.c
> @@ -51,13 +51,13 @@ clocksource_to_dw_apb_clocksource(struct clocksource *cs)
>  
>  static unsigned long apbt_readl(struct dw_apb_timer *timer, unsigned long offs)
>  {
> -       return readl(timer->base + offs);
> +       return readl_relaxed(timer->base + offs);
>  }
>  
>  static void apbt_writel(struct dw_apb_timer *timer, unsigned long val,
>                  unsigned long offs)
>  {
> -       writel(val, timer->base + offs);
> +       writel_relaxed(val, timer->base + offs);
>  }
>  
> 

As with the other patch, I think it would be nicer to only change the
functions that benefit from the change, to make it easier to prove
that the conversion is correct. You could introduce apbt_readl_relaxed()
etc. functions and call them from __apbt_read_clocksource()
and apbt_next_event().

	Arnd
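
For illustration only (not part of the posted patch): a minimal userspace
sketch of the split suggested above, keeping the ordered helpers for general
use and adding relaxed variants for the hot paths only. The accessors here are
mock stand-ins modeled on the ARM definitions, where readl() is
readl_relaxed() followed by a read barrier and writel() is a write barrier
followed by writel_relaxed(). The barrier counter, the uint8_t base pointer,
sketch_read_clocksource() and the register offsets are assumptions made for
the sketch, not code taken from the driver.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed register offsets, for illustration only. */
#define APBTMR_N_LOAD_COUNT	0x00
#define APBTMR_N_CURRENT_VALUE	0x04

/* Demo-only counter so the cost of the ordered accessors is observable. */
static unsigned int barrier_count;

static void mock_barrier(void)
{
	barrier_count++;
	__sync_synchronize();	/* GCC/Clang full-barrier builtin */
}

/* Mock relaxed accessors: a plain volatile access, no barrier. */
static uint32_t readl_relaxed(const volatile void *addr)
{
	assert(((uintptr_t)addr & 3) == 0);	/* 32-bit registers are aligned */
	return *(const volatile uint32_t *)addr;
}

static void writel_relaxed(uint32_t val, volatile void *addr)
{
	assert(((uintptr_t)addr & 3) == 0);
	*(volatile uint32_t *)addr = val;
}

/* Mock ordered accessors: relaxed access plus a barrier. */
static uint32_t readl(const volatile void *addr)
{
	uint32_t v = readl_relaxed(addr);

	mock_barrier();
	return v;
}

static void writel(uint32_t val, volatile void *addr)
{
	mock_barrier();
	writel_relaxed(val, addr);
}

struct dw_apb_timer {
	volatile uint8_t *base;	/* stand-in for the driver's __iomem pointer */
};

/* Ordered helpers: unchanged, used by setup and other slow paths. */
static unsigned long apbt_readl(struct dw_apb_timer *timer, unsigned long offs)
{
	return readl(timer->base + offs);
}

static void apbt_writel(struct dw_apb_timer *timer, unsigned long val,
			unsigned long offs)
{
	writel((uint32_t)val, timer->base + offs);
}

/* Relaxed helpers: called only where the benefit has been shown. */
static unsigned long apbt_readl_relaxed(struct dw_apb_timer *timer,
					unsigned long offs)
{
	return readl_relaxed(timer->base + offs);
}

static void apbt_writel_relaxed(struct dw_apb_timer *timer, unsigned long val,
				unsigned long offs)
{
	writel_relaxed((uint32_t)val, timer->base + offs);
}

/* Hypothetical hot path: invert the value of a down-counting timer. */
static uint32_t sketch_read_clocksource(struct dw_apb_timer *timer)
{
	uint32_t current_count =
		(uint32_t)apbt_readl_relaxed(timer, APBTMR_N_CURRENT_VALUE);

	return ~current_count;
}
```

With this layout every call site states which ordering it relies on, so a
reviewer only has to reason about the few callers of the relaxed variants.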


* [PATCH v2] clocksource/drivers/dw_apb_timer: Use {readl|writel}_relaxed
From: Jisheng Zhang @ 2015-11-13 12:57 UTC
  To: linux-arm-kernel

Dear Arnd,

On Fri, 13 Nov 2015 13:42:09 +0100
Arnd Bergmann <arnd@arndb.de> wrote:

> On Friday 13 November 2015 20:31:23 Jisheng Zhang wrote:
> > diff --git a/drivers/clocksource/dw_apb_timer.c b/drivers/clocksource/dw_apb_timer.c
> > index c76c750..04282ee 100644
> > --- a/drivers/clocksource/dw_apb_timer.c
> > +++ b/drivers/clocksource/dw_apb_timer.c
> > @@ -51,13 +51,13 @@ clocksource_to_dw_apb_clocksource(struct clocksource *cs)
> >  
> >  static unsigned long apbt_readl(struct dw_apb_timer *timer, unsigned long offs)
> >  {
> > -       return readl(timer->base + offs);
> > +       return readl_relaxed(timer->base + offs);
> >  }
> >  
> >  static void apbt_writel(struct dw_apb_timer *timer, unsigned long val,
> >                  unsigned long offs)
> >  {
> > -       writel(val, timer->base + offs);
> > +       writel_relaxed(val, timer->base + offs);
> >  }
> >  
> >   
> 
> As with the other patch, I think it would be nicer to only change the
> functions that benefit from the change, to make it easier to prove

I think showing a performance improvement is much easier for this dw_apb_timer
case, because the clocksource ->read() function also calls apbt_readl().
I will add some performance numbers in v3.

> that the conversion is correct. You could introduce apbt_readl_relaxed()
> etc. functions and call them from __apbt_read_clocksource()
> and apbt_next_event().

I'm not sure whether such changes would make the code a bit too complex. On
the other hand, it's safe to always use the relaxed versions in this driver, so
isn't it better to switch to the relaxed versions whether or not a given code
path benefits from it?

PS: for the global timer related patch, I just hacked the code a bit to make it
work as a clockevent on my platform, and I'm still trying to think of a test
case to measure the improvement. cyclictest? Any ideas are appreciated.

Thanks in advance,
Jisheng


* [PATCH v2] clocksource/drivers/dw_apb_timer: Use {readl|writel}_relaxed
From: Arnd Bergmann @ 2015-11-13 13:51 UTC
  To: linux-arm-kernel

On Friday 13 November 2015 20:57:27 Jisheng Zhang wrote:
> 
> > that the conversion is correct. You could introduce apbt_readl_relaxed()
> > etc. functions and call them from __apbt_read_clocksource()
> > and apbt_next_event().
> 
> I'm not sure whether such changes would make the code a bit too complex. On
> the other hand, it's safe to always use the relaxed versions in this driver, so
> isn't it better to switch to the relaxed versions whether or not a given code
> path benefits from it?

We've had problems in the past when people blindly converted whole drivers,
so I try to discourage that in general, if only to get people to pay more
attention when copying from one driver to another.

> PS: for the global timer related patch, I just hacked the code a bit to make it
> work as a clockevent on my platform, and I'm still trying to think of a test
> case to measure the improvement. cyclictest? Any ideas are appreciated.

Measuring performance of timers is tricky by definition, but you could try
to sample the amount of time you spend setting up timers like this

diff --git a/drivers/clocksource/arm_global_timer.c b/drivers/clocksource/arm_global_timer.c
index a2cb6fae9295..da88347718ae 100644
--- a/drivers/clocksource/arm_global_timer.c
+++ b/drivers/clocksource/arm_global_timer.c
@@ -95,6 +95,8 @@ static u64 gt_counter_read(void)
 static void gt_compare_set(unsigned long delta, int periodic)
 {
 	u64 counter = gt_counter_read();
+	static u64 total_time;
+	static u32 count;
 	unsigned long ctrl;
 
 	counter += delta;
@@ -110,6 +112,12 @@ static void gt_compare_set(unsigned long delta, int periodic)
 
 	ctrl |= GT_CONTROL_COMP_ENABLE | GT_CONTROL_IRQ_ENABLE;
 	writel(ctrl, gt_base + GT_CONTROL);
+
+	total_time += gt_counter_read() - counter;
+	count++;
+
+	if (((count - 1) & 0xfff) == 0xfff)
+		printk(KERN_INFO "gt_compare_set time %lld\n", total_time / (count >> 12));
 }
 
 static int gt_clockevent_shutdown(struct clock_event_device *evt)

	Arnd
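
For illustration only: the accumulate-and-report idea from the hack above,
reduced to a userspace sketch that can be checked without hardware. It is a
variant, not a copy: it reports the mean per call over each batch of 4096
samples and then resets, and it takes an explicit start timestamp. As far as
the diff shows, `counter` has already had `delta` added by the time it is
subtracted, so a separate start value may be what is wanted there. The names
latency_sampler, sampler_record() and SAMPLE_BATCH are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Report once per 4096 samples, matching the 0xfff mask in the hack above. */
#define SAMPLE_BATCH	4096u

struct latency_sampler {
	uint64_t total;		/* elapsed ticks accumulated in this batch */
	uint32_t count;		/* samples recorded in this batch */
};

/*
 * Record one sample given timestamps taken immediately before and after
 * the code under test. Returns 1 and stores the per-call mean in *mean
 * when a batch completes, otherwise returns 0 and leaves *mean untouched.
 */
static int sampler_record(struct latency_sampler *s, uint64_t start,
			  uint64_t end, uint64_t *mean)
{
	assert(end >= start);	/* assumes a monotonic, non-wrapping counter */

	s->total += end - start;
	s->count++;

	if (s->count < SAMPLE_BATCH)
		return 0;

	*mean = s->total / SAMPLE_BATCH;

	/* Start a fresh batch so neither field can overflow over time. */
	s->total = 0;
	s->count = 0;
	return 1;
}
```

In a kernel context the two timestamps would come from reading the counter
before and after the register writes, and a 64-bit division helper would be
needed on 32-bit ARM. Both are left out here to keep the sketch portable.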

