From: jszhang@marvell.com (Jisheng Zhang)
To: linux-arm-kernel@lists.infradead.org
Subject: CFT: move outer_cache_sync() out of line
Date: Tue, 13 Jan 2015 14:48:51 +0800 [thread overview]
Message-ID: <20150113144851.1944cf0d@xhacker> (raw)
In-Reply-To: <20150112163648.GL12302@n2100.arm.linux.org.uk>
Hi Russell,
On Mon, 12 Jan 2015 08:36:48 -0800
Russell King - ARM Linux <linux@arm.linux.org.uk> wrote:
> The theory is that moving outer_cache_sync() out of line moves the
> conditional test from multiple sites to a single location, which
> in turn means that the branch predictor stands a better chance of
> making the correct decision.
>
> However, I'm a hardware pauper - I have the choice of testing this
> on iMX6 (which seems to have bus limitations, and seems to produce
> a wide variation on measured performance which makes it hard to
> evaluate any differences) or the Versatile Express, which really is
> not suitable for making IO performance measurements.
>
> So, without help from people with "good" hardware, I can't evaluate
> the performance impact of this change. Anyone willing to do some
> performance testing on this and feedback any numbers?
I applied your patch into marvell internal bsp tree, then did a basic emmc
performance test via. dd cmd, the performance is improved by 0.13%
the sdhc host driver is drivers/mmc/host/sdhci-pxav3.c, I'm not sure
it make use of a lot of non-relaxed IO accessors.
>
> Obviously, drivers which make use of a lot of non-relaxed IO accessors
> will be most affected by this - so you really have to know your
> drivers when deciding how to evaluate this (sorry, I can't say "measure
> network bandwidth" or "measure SSD sata performance" is a good test.)
>
> Theoretically, this should help overall system performance, since the
> branch predictor should be able to predict this better, but it's entirely
> possible that trying to benchmark a single workload won't be measurably
> different.
>
> In terms of kernel size figures, this change alone saves almost 17K of
> 10MB of kernel text on my iMX6 kernels - which is bordering on
> insignificant since that's not quite a 0.2% saving.
>
> So... right now I can't justify this change, but I'm hoping some can come
> up with some figures which shows that it benefits their workload without
> causing a performance regression for others.
>
> diff --git a/arch/arm/include/asm/outercache.h
> b/arch/arm/include/asm/outercache.h index 891a56b35bcf..807e4e71c8e7 100644
> --- a/arch/arm/include/asm/outercache.h
> +++ b/arch/arm/include/asm/outercache.h
> @@ -133,11 +133,7 @@ static inline void outer_resume(void) { }
> * Ensure that all outer cache operations are complete and any store
> * buffers are drained.
> */
> -static inline void outer_sync(void)
> -{
> - if (outer_cache.sync)
> - outer_cache.sync();
> -}
> +extern void outer_sync(void);
> #else
> static inline void outer_sync(void)
> { }
> diff --git a/arch/arm/mm/l2c-common.c b/arch/arm/mm/l2c-common.c
> index 10a3cf28c362..b1c24c8c1eb9 100644
> --- a/arch/arm/mm/l2c-common.c
> +++ b/arch/arm/mm/l2c-common.c
> @@ -7,9 +7,17 @@
> * published by the Free Software Foundation.
> */
> #include <linux/bug.h>
> +#include <linux/export.h>
> #include <linux/smp.h>
> #include <asm/outercache.h>
>
> +void outer_sync(void)
> +{
> + if (outer_cache.sync)
> + outer_cache.sync();
> +}
> +EXPORT_SYMBOL(outer_sync);
> +
> void outer_disable(void)
> {
> WARN_ON(!irqs_disabled());
>
>
next prev parent reply other threads:[~2015-01-13 6:48 UTC|newest]
Thread overview: 7+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-01-12 16:36 CFT: move outer_cache_sync() out of line Russell King - ARM Linux
2015-01-13 6:48 ` Jisheng Zhang [this message]
2015-01-13 15:45 ` Catalin Marinas
2015-01-13 16:27 ` Russell King - ARM Linux
2015-01-13 17:09 ` Russell King - ARM Linux
2015-01-13 18:39 ` Catalin Marinas
2015-01-13 16:34 ` Arnd Bergmann
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20150113144851.1944cf0d@xhacker \
--to=jszhang@marvell.com \
--cc=linux-arm-kernel@lists.infradead.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.