From mboxrd@z Thu Jan 1 00:00:00 1970 From: Krzysztof Kozlowski Subject: Re: [RFC PATCH 1/3] clocksource: exynos_mct: allow mct to read counter from coprocessor Date: Tue, 28 Jul 2015 15:11:28 +0900 Message-ID: <55B71D10.4060505@samsung.com> References: <1438032512.17734.47.camel@gmail.com> Mime-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit Return-path: Received: from mailout4.w1.samsung.com ([210.118.77.14]:17977 "EHLO mailout4.w1.samsung.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750971AbbG1GLf (ORCPT ); Tue, 28 Jul 2015 02:11:35 -0400 Received: from eucpsbgm1.samsung.com (unknown [203.254.199.244]) by mailout4.w1.samsung.com (Oracle Communications Messaging Server 7.0.5.31.0 64bit (built May 5 2014)) with ESMTP id <0NS600CD3P78NWA0@mailout4.w1.samsung.com> for linux-samsung-soc@vger.kernel.org; Tue, 28 Jul 2015 07:11:32 +0100 (BST) In-reply-to: <1438032512.17734.47.camel@gmail.com> Sender: linux-samsung-soc-owner@vger.kernel.org List-Id: linux-samsung-soc@vger.kernel.org To: Alexey Klimov , linux-samsung-soc@vger.kernel.org, daniel.lezcano@linaro.org, dianders@chromium.org, chirantan@chromium.org Cc: t.dakhran@gmail.com, kgene@kernel.org, tglx@linutronix.de, linux-arm-kernel@lists.infradead.org, yury.norov@gmail.com On 28.07.2015 06:28, Alexey Klimov wrote: > It was discovered that 64-bit mmio MCT counter holds > the same value as ARM arch timer 64-bit physical counter. > Even more: arch timer and MCT are same underlying hardware > timer. > > Patch adds code to MCT to allow it to read 64-bit counter > from coprocessor cp15 registers since it's way more > faster than reading the same counter from MCT mmio region. > > That functionality triggers only if special property > use-cp15-phys-counter is present in device tree, > only on 32-bit ARMv7 systems and only if HYP mode is > available. Hi, It would be nice to put here also the measurements from cover letter. This would be the justification for the commit. I guess same optimization could be done for ARMv8 Exynos SoCs? > > Idea of rewriting accessors is taken from arm_arch_timer.c. > > By default MCT will read counter from mmio since it's > guaranteed to work. > > Signed-off-by: Alexey Klimov > --- > drivers/clocksource/exynos_mct.c | 77 ++++++++++++++++++++++++++++++++++------ > 1 file changed, 67 insertions(+), 10 deletions(-) > > diff --git a/drivers/clocksource/exynos_mct.c b/drivers/clocksource/exynos_mct.c > index 9064ff7..039b41c 100644 > --- a/drivers/clocksource/exynos_mct.c > +++ b/drivers/clocksource/exynos_mct.c > @@ -26,6 +26,9 @@ > #include > #include > > +#include /* for cp15 physical arch timer counter */ > +#include /* for checking HYP mode */ > + > #define EXYNOS4_MCTREG(x) (x) > #define EXYNOS4_MCT_G_CNT_L EXYNOS4_MCTREG(0x100) > #define EXYNOS4_MCT_G_CNT_U EXYNOS4_MCTREG(0x104) > @@ -82,6 +85,17 @@ static unsigned long clk_rate; > static unsigned int mct_int_type; > static int mct_irqs[MCT_NR_IRQS]; > > +static u32 notrace __exynos4_read_count_32(void); > +static u64 __exynos4_read_count_64(void); > + > +/* > + * Default to __exynos4_read_count_{32,64} functions that reads counter from > + * MCT mmio region and this method is guaranteed > + * to exist (if MCT was successfully initialized). > + */ > +u32 (*exynos4_read_count_32)(void) = __exynos4_read_count_32; > +u64 (*exynos4_read_count_64)(void) = __exynos4_read_count_64; I think these could be static. > + > struct mct_clock_event_device { > struct clock_event_device evt; > unsigned long base; > @@ -163,16 +177,16 @@ static void exynos4_mct_frc_start(void) > } > > /** > - * exynos4_read_count_64 - Read all 64-bits of the global counter > + * __exynos4_read_count_64 - Read all 64-bits of the global counter > * > - * This will read all 64-bits of the global counter taking care to make sure > - * that the upper and lower half match. Note that reading the MCT can be quite > - * slow (hundreds of nanoseconds) so you should use the 32-bit (lower half > - * only) version when possible. > + * This will read all 64-bits of the global counter from MCT mmio region > + * taking care to make sure that the upper and lower half match. > + * Note that reading the MCT can be quite slow (hundreds of nanoseconds) > + * so you should use the 32-bit (lower half only) version when possible. > * > * Returns the number of cycles in the global counter. > */ > -static u64 exynos4_read_count_64(void) > +static u64 __exynos4_read_count_64(void) > { > unsigned int lo, hi; > u32 hi2 = readl_relaxed(reg_base + EXYNOS4_MCT_G_CNT_U); > @@ -187,18 +201,45 @@ static u64 exynos4_read_count_64(void) > } > > /** > - * exynos4_read_count_32 - Read the lower 32-bits of the global counter > + * __exynos4_read_cp15_count_64 - Read all 64-bits of the global counter > + * from coprocessor regisers. > * > - * This will read just the lower 32-bits of the global counter. This is marked > - * as notrace so it can be used by the scheduler clock. > + * Note that reading here is faster than reading from MCT mmio region. > + * > + * Returns the number of cycles in the global counter. > + */ > +static u64 __exynos4_read_cp15_count_64(void) > +{ > + return arch_counter_get_cntpct(); > +} > + > +/** > + * __exynos4_read_count_32 - Read the lower 32-bits of the global counter > + * > + * This will read just the lower 32-bits of the global counter from > + * MCT mmio region. > + * This is marked as notrace so it can be used by the scheduler clock. > * > * Returns the number of cycles in the global counter (lower 32 bits). > */ > -static u32 notrace exynos4_read_count_32(void) > +static u32 notrace __exynos4_read_count_32(void) > { > return readl_relaxed(reg_base + EXYNOS4_MCT_G_CNT_L); > } > > +/** > + * __exynos4_read_cp15_count_32 - Read the lower 32-bits of the global counter > + * > + * This will read global counter from coprocessor regisers. s/regisers/registers/ > + * This is marked as notrace so it can be used by the scheduler clock. > + * > + * Returns the number of cycles in the global counter (lower 32 bits). > + */ > +static u32 notrace __exynos4_read_cp15_count_32(void) > +{ > + return arch_counter_get_cntpct(); > +} > + > static cycle_t exynos4_frc_read(struct clocksource *cs) > { > return exynos4_read_count_32(); > @@ -599,6 +640,22 @@ static void __init mct_init_dt(struct device_node *np, unsigned int int_type) > for (i = MCT_L0_IRQ; i < nr_irqs; i++) > mct_irqs[i] = irq_of_parse_and_map(np, i); > > + /* > + * If use-cp15-phys-counter property is present in device tree > + * then use CP15 ARM arch timer 64-bit physical counter > + * to speedup reading since it keeps the same value like > + * 64-bit counter in MCT mmio region. > + * If HYP mode is available we can rely on physical > + * timer counter to be accessible from PL1. > + */ > + if (IS_ENABLED(CONFIG_ARM) && is_hyp_mode_available() && > + of_property_read_bool(np, "use-cp15-phys-counter")) { This looks like a property specific to Exynos. Add a "samsung," prefix. Best regards, Krzysztof > + > + /* point MCT functions to read counter from coprocessor */ > + exynos4_read_count_32 = __exynos4_read_cp15_count_32; > + exynos4_read_count_64 = __exynos4_read_cp15_count_64; > + } > + > exynos4_timer_resources(np, of_iomap(np, 0)); > exynos4_clocksource_init(); > exynos4_clockevent_init(); > From mboxrd@z Thu Jan 1 00:00:00 1970 From: k.kozlowski@samsung.com (Krzysztof Kozlowski) Date: Tue, 28 Jul 2015 15:11:28 +0900 Subject: [RFC PATCH 1/3] clocksource: exynos_mct: allow mct to read counter from coprocessor In-Reply-To: <1438032512.17734.47.camel@gmail.com> References: <1438032512.17734.47.camel@gmail.com> Message-ID: <55B71D10.4060505@samsung.com> To: linux-arm-kernel@lists.infradead.org List-Id: linux-arm-kernel.lists.infradead.org On 28.07.2015 06:28, Alexey Klimov wrote: > It was discovered that 64-bit mmio MCT counter holds > the same value as ARM arch timer 64-bit physical counter. > Even more: arch timer and MCT are same underlying hardware > timer. > > Patch adds code to MCT to allow it to read 64-bit counter > from coprocessor cp15 registers since it's way more > faster than reading the same counter from MCT mmio region. > > That functionality triggers only if special property > use-cp15-phys-counter is present in device tree, > only on 32-bit ARMv7 systems and only if HYP mode is > available. Hi, It would be nice to put here also the measurements from cover letter. This would be the justification for the commit. I guess same optimization could be done for ARMv8 Exynos SoCs? > > Idea of rewriting accessors is taken from arm_arch_timer.c. > > By default MCT will read counter from mmio since it's > guaranteed to work. > > Signed-off-by: Alexey Klimov > --- > drivers/clocksource/exynos_mct.c | 77 ++++++++++++++++++++++++++++++++++------ > 1 file changed, 67 insertions(+), 10 deletions(-) > > diff --git a/drivers/clocksource/exynos_mct.c b/drivers/clocksource/exynos_mct.c > index 9064ff7..039b41c 100644 > --- a/drivers/clocksource/exynos_mct.c > +++ b/drivers/clocksource/exynos_mct.c > @@ -26,6 +26,9 @@ > #include > #include > > +#include /* for cp15 physical arch timer counter */ > +#include /* for checking HYP mode */ > + > #define EXYNOS4_MCTREG(x) (x) > #define EXYNOS4_MCT_G_CNT_L EXYNOS4_MCTREG(0x100) > #define EXYNOS4_MCT_G_CNT_U EXYNOS4_MCTREG(0x104) > @@ -82,6 +85,17 @@ static unsigned long clk_rate; > static unsigned int mct_int_type; > static int mct_irqs[MCT_NR_IRQS]; > > +static u32 notrace __exynos4_read_count_32(void); > +static u64 __exynos4_read_count_64(void); > + > +/* > + * Default to __exynos4_read_count_{32,64} functions that reads counter from > + * MCT mmio region and this method is guaranteed > + * to exist (if MCT was successfully initialized). > + */ > +u32 (*exynos4_read_count_32)(void) = __exynos4_read_count_32; > +u64 (*exynos4_read_count_64)(void) = __exynos4_read_count_64; I think these could be static. > + > struct mct_clock_event_device { > struct clock_event_device evt; > unsigned long base; > @@ -163,16 +177,16 @@ static void exynos4_mct_frc_start(void) > } > > /** > - * exynos4_read_count_64 - Read all 64-bits of the global counter > + * __exynos4_read_count_64 - Read all 64-bits of the global counter > * > - * This will read all 64-bits of the global counter taking care to make sure > - * that the upper and lower half match. Note that reading the MCT can be quite > - * slow (hundreds of nanoseconds) so you should use the 32-bit (lower half > - * only) version when possible. > + * This will read all 64-bits of the global counter from MCT mmio region > + * taking care to make sure that the upper and lower half match. > + * Note that reading the MCT can be quite slow (hundreds of nanoseconds) > + * so you should use the 32-bit (lower half only) version when possible. > * > * Returns the number of cycles in the global counter. > */ > -static u64 exynos4_read_count_64(void) > +static u64 __exynos4_read_count_64(void) > { > unsigned int lo, hi; > u32 hi2 = readl_relaxed(reg_base + EXYNOS4_MCT_G_CNT_U); > @@ -187,18 +201,45 @@ static u64 exynos4_read_count_64(void) > } > > /** > - * exynos4_read_count_32 - Read the lower 32-bits of the global counter > + * __exynos4_read_cp15_count_64 - Read all 64-bits of the global counter > + * from coprocessor regisers. > * > - * This will read just the lower 32-bits of the global counter. This is marked > - * as notrace so it can be used by the scheduler clock. > + * Note that reading here is faster than reading from MCT mmio region. > + * > + * Returns the number of cycles in the global counter. > + */ > +static u64 __exynos4_read_cp15_count_64(void) > +{ > + return arch_counter_get_cntpct(); > +} > + > +/** > + * __exynos4_read_count_32 - Read the lower 32-bits of the global counter > + * > + * This will read just the lower 32-bits of the global counter from > + * MCT mmio region. > + * This is marked as notrace so it can be used by the scheduler clock. > * > * Returns the number of cycles in the global counter (lower 32 bits). > */ > -static u32 notrace exynos4_read_count_32(void) > +static u32 notrace __exynos4_read_count_32(void) > { > return readl_relaxed(reg_base + EXYNOS4_MCT_G_CNT_L); > } > > +/** > + * __exynos4_read_cp15_count_32 - Read the lower 32-bits of the global counter > + * > + * This will read global counter from coprocessor regisers. s/regisers/registers/ > + * This is marked as notrace so it can be used by the scheduler clock. > + * > + * Returns the number of cycles in the global counter (lower 32 bits). > + */ > +static u32 notrace __exynos4_read_cp15_count_32(void) > +{ > + return arch_counter_get_cntpct(); > +} > + > static cycle_t exynos4_frc_read(struct clocksource *cs) > { > return exynos4_read_count_32(); > @@ -599,6 +640,22 @@ static void __init mct_init_dt(struct device_node *np, unsigned int int_type) > for (i = MCT_L0_IRQ; i < nr_irqs; i++) > mct_irqs[i] = irq_of_parse_and_map(np, i); > > + /* > + * If use-cp15-phys-counter property is present in device tree > + * then use CP15 ARM arch timer 64-bit physical counter > + * to speedup reading since it keeps the same value like > + * 64-bit counter in MCT mmio region. > + * If HYP mode is available we can rely on physical > + * timer counter to be accessible from PL1. > + */ > + if (IS_ENABLED(CONFIG_ARM) && is_hyp_mode_available() && > + of_property_read_bool(np, "use-cp15-phys-counter")) { This looks like a property specific to Exynos. Add a "samsung," prefix. Best regards, Krzysztof > + > + /* point MCT functions to read counter from coprocessor */ > + exynos4_read_count_32 = __exynos4_read_cp15_count_32; > + exynos4_read_count_64 = __exynos4_read_cp15_count_64; > + } > + > exynos4_timer_resources(np, of_iomap(np, 0)); > exynos4_clocksource_init(); > exynos4_clockevent_init(); >