From: catalin.marinas@arm.com (Catalin Marinas)
To: linux-arm-kernel@lists.infradead.org
Subject: [PATCH v2] arm64: cache: Skip an unnecessary data cache clean PoU operation
Date: Tue, 21 Feb 2017 15:47:27 +0000
Message-ID: <20170221154726.7zahvbeftxobiula@localhost>
In-Reply-To: <1486588777-1929-1-git-send-email-shankerd@codeaurora.org>

On Wed, Feb 08, 2017 at 03:19:37PM -0600, Shanker Donthineni wrote:
> The cache management functions always do the data cache PoU
> (point of unification) operations even though it is not required
> on some systems. No need to clean data cache till PoU if all the
> cache levels below PoUIS are WT (Write-Through) caches. It causes
> a huge performance degradation when operating on a larger memory
> area, especially THP with 64K page size kernel.
> 
> For each online CPU, check the need of 'dc cvau' instruction and
> update a global variable __dcache_flags. The two functions
> __flush_cache_user_range() and __clean_dcache_area_pou() are
> modified to skip an unnecessary code execution based on flags.
> It won't change the existing behavior if any one of the online
> CPU is capable of WB cache below PoUIS level.
> 
> Signed-off-by: Shanker Donthineni <shankerd@codeaurora.org>
[...]
> +#define CLIDR_LOUIS_SHIFT	(21)
> +#define CLIDR_LOUIS_MASK	(0x7)
> +#define CLIDR_LOUIS(x)		(((x) >> CLIDR_LOUIS_SHIFT) & CLIDR_LOUIS_MASK)

According to the ARMv8 ARM, CLIDR_EL1 "identifies the type of cache, or
caches, that are implemented at each level and can be managed using the
architected cache maintenance instructions that operate by set/way". The
key part is "set/way" here and hence you cannot use CLIDR_EL1 and
CCSIDR_EL1 to infer whether you can skip cache maintenance by VA.
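
For reference, a minimal sketch of how the CLIDR_EL1 fields discussed here
decode, written as plain C on an integer value so it can be tried outside the
kernel. The LoUIS shift and mask match the quoted macros; the LoC and LoUU
shifts and the Ctype<n> layout follow the ARMv8 ARM register description. The
helper names are hypothetical and not taken from the patch. Note that, per the
paragraph above, these fields describe set/way maintenance only, so decoding
them does not say whether maintenance by VA can be skipped.

```c
#include <stdint.h>

/* CLIDR_EL1 field positions; LoUIS matches the macros in the quoted patch. */
#define CLIDR_LOUIS_SHIFT	21
#define CLIDR_LOC_SHIFT		24
#define CLIDR_LOUU_SHIFT	27
#define CLIDR_FIELD_MASK	0x7u

/* Ctype<n> values: 0 = no cache, 1 = I only, 2 = D only, 3 = separate I+D, 4 = unified. */
#define CLIDR_CTYPE_NONE	0u
#define CLIDR_CTYPE_SEPARATE	3u
#define CLIDR_CTYPE_UNIFIED	4u

/* Level of Unification Inner Shareable, bits [23:21]. */
static inline unsigned int clidr_louis(uint64_t clidr)
{
	return (clidr >> CLIDR_LOUIS_SHIFT) & CLIDR_FIELD_MASK;
}

/* Level of Coherence, bits [26:24]. */
static inline unsigned int clidr_loc(uint64_t clidr)
{
	return (clidr >> CLIDR_LOC_SHIFT) & CLIDR_FIELD_MASK;
}

/* Level of Unification Uniprocessor, bits [29:27]. */
static inline unsigned int clidr_louu(uint64_t clidr)
{
	return (clidr >> CLIDR_LOUU_SHIFT) & CLIDR_FIELD_MASK;
}

/* Ctype<n> occupies bits [3(n-1)+2 : 3(n-1)], for level n = 1..7. */
static inline unsigned int clidr_ctype(uint64_t clidr, unsigned int level)
{
	return (clidr >> (3 * (level - 1))) & CLIDR_FIELD_MASK;
}
```

In the kernel the input would come from reading CLIDR_EL1 at EL1; here it is
just a number passed in by the caller.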

> +	/* Go through all the cache level below LoUIS */
> +	for (lvl = 0; lvl < louis; lvl++) {
> +		csidr = cache_get_ccsidr(lvl << 1);
> +		if (csidr & CCSIDR_EL1_WRITE_BACK) {

The type bits have also been deprecated in ARMv8 (we need to update the
kernel or just remove the cache topology detection entirely, leaving it
just to DT).
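
To make the quoted loop concrete, here is a rough model of the check it
performs, using an array of sample CCSIDR values instead of register reads.
Assumptions: the write-back attribute is taken to be bit 30, as in the older
CCSIDR layout (the value of CCSIDR_EL1_WRITE_BACK is not shown in the quoted
hunk), and the function names are hypothetical. The `lvl << 1` selector
reflects CSSELR_EL1, where Level sits in bits [3:1] and bit 0 (InD) is 0 for a
data or unified cache. Given the two objections above, this illustrates what
the patch computes, not a check that is safe to rely on.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed legacy CCSIDR write-back attribute bit (deprecated in ARMv8). */
#define CCSIDR_WB_BIT	(UINT32_C(1) << 30)

/*
 * Build a CSSELR_EL1 selector for the data/unified cache at zero-based
 * level 'lvl': Level goes in bits [3:1], InD (bit 0) stays 0.
 */
static inline uint32_t csselr_data_cache(unsigned int lvl)
{
	return (uint32_t)lvl << 1;
}

/*
 * Model of the quoted loop: ccsidr[lvl] stands in for the value that would
 * be read after selecting level 'lvl'. Returns true if any level below
 * LoUIS reports the write-back bit.
 */
static bool any_wb_below_louis(const uint32_t *ccsidr, unsigned int louis)
{
	unsigned int lvl;

	for (lvl = 0; lvl < louis; lvl++) {
		if (ccsidr[lvl] & CCSIDR_WB_BIT)
			return true;
	}
	return false;
}
```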

-- 
Catalin

Thread overview: 6+ messages
2017-02-08 21:19 [PATCH v2] arm64: cache: Skip an unnecessary data cache clean PoU operation Shanker Donthineni
2017-02-08 21:19 ` Shanker Donthineni
2017-02-21 15:47 ` Catalin Marinas [this message]
2017-02-21 15:47   ` Catalin Marinas
2017-02-21 15:49   ` Will Deacon
2017-02-21 15:49     ` Will Deacon
