linux-arm-kernel.lists.infradead.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] arm: mm: refactor v7 cache cleaning ops to use way/index sequence
@ 2013-11-19 15:29 Lorenzo Pieralisi
  2013-11-19 16:14 ` Catalin Marinas
                   ` (3 more replies)
  0 siblings, 4 replies; 8+ messages in thread
From: Lorenzo Pieralisi @ 2013-11-19 15:29 UTC (permalink / raw)
  To: linux-arm-kernel

Set-associative caches on all v7 implementations map the index bits
to physical addresses LSBs and tag bits to MSBs. On most systems with
sane DRAM controller configurations, this means that the current v7
cache flush routine using set/way operations triggers a DRAM memory
controller precharge/activate for every cache line writeback since the
cache routine cleans lines by first fixing the index and then looping
through ways.

Given the random content of cache tags, swapping the order between
indexes and ways loops do not prevent DRAM pages precharge and
activate cycles but at least, on average, improves the chances that
either multiple lines hit the same page or multiple lines belong to
different DRAM banks, improving throughput significantly.

This patch swaps the inner loops in the v7 cache flushing routine to
carry out the clean operations first on all sets belonging to a given
way (looping through sets) and then decrementing the way.

Benchmarks showed that by swapping the ordering in which sets and ways
are decremented in the v7 cache flushing routine, that uses set/way
operations, time required to flush caches is reduced significantly,
owing to improved writebacks throughput to the DRAM controller.

Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
---
 arch/arm/mm/cache-v7.S | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/arch/arm/mm/cache-v7.S b/arch/arm/mm/cache-v7.S
index b5c467a..778bcf8 100644
--- a/arch/arm/mm/cache-v7.S
+++ b/arch/arm/mm/cache-v7.S
@@ -146,18 +146,18 @@ flush_levels:
 	ldr	r7, =0x7fff
 	ands	r7, r7, r1, lsr #13		@ extract max number of the index size
 loop1:
-	mov	r9, r4				@ create working copy of max way size
+	mov	r9, r7				@ create working copy of max index
 loop2:
- ARM(	orr	r11, r10, r9, lsl r5	)	@ factor way and cache number into r11
- THUMB(	lsl	r6, r9, r5		)
+ ARM(	orr	r11, r10, r4, lsl r5	)	@ factor way and cache number into r11
+ THUMB(	lsl	r6, r4, r5		)
  THUMB(	orr	r11, r10, r6		)	@ factor way and cache number into r11
- ARM(	orr	r11, r11, r7, lsl r2	)	@ factor index number into r11
- THUMB(	lsl	r6, r7, r2		)
+ ARM(	orr	r11, r11, r9, lsl r2	)	@ factor index number into r11
+ THUMB(	lsl	r6, r9, r2		)
  THUMB(	orr	r11, r11, r6		)	@ factor index number into r11
 	mcr	p15, 0, r11, c7, c14, 2		@ clean & invalidate by set/way
-	subs	r9, r9, #1			@ decrement the way
+	subs	r9, r9, #1			@ decrement the index
 	bge	loop2
-	subs	r7, r7, #1			@ decrement the index
+	subs	r4, r4, #1			@ decrement the way
 	bge	loop1
 skip:
 	add	r10, r10, #2			@ increment cache number
-- 
1.8.2.2

^ permalink raw reply related	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2013-12-09 14:24 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2013-11-19 15:29 [PATCH] arm: mm: refactor v7 cache cleaning ops to use way/index sequence Lorenzo Pieralisi
2013-11-19 16:14 ` Catalin Marinas
2013-11-19 16:58 ` Nicolas Pitre
2013-11-19 17:04   ` Santosh Shilimkar
2013-11-19 17:16   ` Catalin Marinas
2013-11-19 18:20     ` Lorenzo Pieralisi
2013-11-19 17:35 ` Dave Martin
2013-12-09 14:24 ` Lorenzo Pieralisi

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).