linux-arm-kernel.lists.infradead.org archive mirror
* [PATCH RFC 00/10] ARM: V7M: Support caches
@ 2016-04-21  8:18 Vladimir Murzin
  2016-04-21  8:18 ` [PATCH RFC 01/10] ARM: factor out CSSELR/CCSIDR operations that use cp15 directly Vladimir Murzin
                   ` (10 more replies)
  0 siblings, 11 replies; 17+ messages in thread
From: Vladimir Murzin @ 2016-04-21  8:18 UTC (permalink / raw)
  To: linux-arm-kernel

Hi,

This patch set allows M-class CPUs to benefit from optional cache support.
It was originally written by Jonny; I've been keeping it locally, mainly
rebasing it over Linux versions.

The main idea behind these patches is to reuse the existing cache handling
code from V7A/R. On V7M, cache operations are provided via a memory-mapped
interface rather than co-processor instructions, so extra macros have been
introduced to factor out the cache handling logic and low-level operations.

Along with the v7M cache support the first user (Cortex-M7) is
introduced.

Patches were tested on the MPS2 platform with Cortex-M3/M4/M7. The latter
showed a significant boot speed-up.

Based on 4.6-rc3.

Thanks!
Vladimir

Jonathan Austin (9):
  ARM: factor out CSSELR/CCSIDR operations that use cp15 directly
  ARM: V7M: Make read_cpuid() generally available on V7M.
  ARM: V7M: Add addresses for mem-mapped V7M cache operations
  ARM: V7M: Add support for reading the CTR with CPUID_CACHETYPE
  ARM: Extract cp15 operations from cache flush code
  ARM: V7M: Implement cache macros for V7M
  ARM: V7M: Wire up caches for V7M processors with cache support.
  ARM: V7M: Indirect proc_info construction for V7M CPUs
  ARM: V7M: Add support for the Cortex-M7 processor

Vladimir Murzin (1):
  ARM: V7M: fix notrace variant of save_and_disable_irqs

 arch/arm/include/asm/assembler.h  |    4 ++
 arch/arm/include/asm/cachetype.h  |   39 +++++++++++
 arch/arm/include/asm/cputype.h    |   51 ++++++++------
 arch/arm/include/asm/glue-cache.h |    4 --
 arch/arm/include/asm/v7m.h        |   22 ++++++
 arch/arm/kernel/head-nommu.S      |   16 ++++-
 arch/arm/kernel/setup.c           |   16 ++---
 arch/arm/mm/Kconfig               |    7 +-
 arch/arm/mm/Makefile              |    4 ++
 arch/arm/mm/cache-v7.S            |   66 ++++++++---------
 arch/arm/mm/proc-macros.S         |   23 ------
 arch/arm/mm/proc-v7.S             |    1 +
 arch/arm/mm/proc-v7m.S            |   93 ++++++++++++++++++++----
 arch/arm/mm/v7-cache-macros.S     |  129 ++++++++++++++++++++++++++++++++++
 arch/arm/mm/v7m-cache-macros.S    |  140 +++++++++++++++++++++++++++++++++++++
 15 files changed, 507 insertions(+), 108 deletions(-)
 create mode 100644 arch/arm/mm/v7-cache-macros.S
 create mode 100644 arch/arm/mm/v7m-cache-macros.S

-- 
1.7.9.5

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH RFC 01/10] ARM: factor out CSSELR/CCSIDR operations that use cp15 directly
  2016-04-21  8:18 [PATCH RFC 00/10] ARM: V7M: Support caches Vladimir Murzin
@ 2016-04-21  8:18 ` Vladimir Murzin
  2016-04-21  8:18 ` [PATCH RFC 02/10] ARM: V7M: Make read_cpuid() generally available on V7M Vladimir Murzin
                   ` (9 subsequent siblings)
  10 siblings, 0 replies; 17+ messages in thread
From: Vladimir Murzin @ 2016-04-21  8:18 UTC (permalink / raw)
  To: linux-arm-kernel

From: Jonathan Austin <jonathan.austin@arm.com>

Currently we use raw cp15 operations to access the cache setup data.

This patch abstracts the CSSELR and CCSIDR accessors out to a header so
that the implementation for them can be switched out as we do with other
cpu/cachetype operations.

Signed-off-by: Jonathan Austin <jonathan.austin@arm.com>
Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
---
 arch/arm/include/asm/cachetype.h |   24 ++++++++++++++++++++++++
 arch/arm/kernel/setup.c          |    7 ++-----
 2 files changed, 26 insertions(+), 5 deletions(-)

diff --git a/arch/arm/include/asm/cachetype.h b/arch/arm/include/asm/cachetype.h
index 7ea7814..8609de8 100644
--- a/arch/arm/include/asm/cachetype.h
+++ b/arch/arm/include/asm/cachetype.h
@@ -56,4 +56,28 @@ static inline unsigned int __attribute__((pure)) cacheid_is(unsigned int mask)
 	       (~__CACHEID_NEVER & __CACHEID_ARCH_MIN & mask & cacheid);
 }
 
+#define CSSELR_ICACHE	1
+#define CSSELR_DCACHE	0
+
+#define CSSELR_L1	(0 << 1)
+#define CSSELR_L2	(1 << 1)
+#define CSSELR_L3	(2 << 1)
+#define CSSELR_L4	(3 << 1)
+#define CSSELR_L5	(4 << 1)
+#define CSSELR_L6	(5 << 1)
+#define CSSELR_L7	(6 << 1)
+
+static inline void set_csselr(unsigned int cache_selector)
+{
+	asm volatile("mcr p15, 2, %0, c0, c0, 0" : : "r" (cache_selector));
+}
+
+static inline unsigned int read_ccsidr(void)
+{
+	unsigned int val;
+
+	asm volatile("mrc p15, 1, %0, c0, c0, 0" : "=r" (val));
+	return val;
+}
+
 #endif
diff --git a/arch/arm/kernel/setup.c b/arch/arm/kernel/setup.c
index a28fce0..163a90d 100644
--- a/arch/arm/kernel/setup.c
+++ b/arch/arm/kernel/setup.c
@@ -291,12 +291,9 @@ static int cpu_has_aliasing_icache(unsigned int arch)
 	/* arch specifies the register format */
 	switch (arch) {
 	case CPU_ARCH_ARMv7:
-		asm("mcr	p15, 2, %0, c0, c0, 0 @ set CSSELR"
-		    : /* No output operands */
-		    : "r" (1));
+		set_csselr(CSSELR_ICACHE | CSSELR_L1);
 		isb();
-		asm("mrc	p15, 1, %0, c0, c0, 0 @ read CCSIDR"
-		    : "=r" (id_reg));
+		id_reg = read_ccsidr();
 		line_size = 4 << ((id_reg & 0x7) + 2);
 		num_sets = ((id_reg >> 13) & 0x7fff) + 1;
 		aliasing_icache = (line_size * num_sets) > PAGE_SIZE;
-- 
1.7.9.5


* [PATCH RFC 02/10] ARM: V7M: Make read_cpuid() generally available on V7M.
  2016-04-21  8:18 [PATCH RFC 00/10] ARM: V7M: Support caches Vladimir Murzin
  2016-04-21  8:18 ` [PATCH RFC 01/10] ARM: factor out CSSELR/CCSIDR operations that use cp15 directly Vladimir Murzin
@ 2016-04-21  8:18 ` Vladimir Murzin
  2016-04-21  8:18 ` [PATCH RFC 03/10] ARM: V7M: Add addresses for mem-mapped V7M cache operations Vladimir Murzin
                   ` (8 subsequent siblings)
  10 siblings, 0 replies; 17+ messages in thread
From: Vladimir Murzin @ 2016-04-21  8:18 UTC (permalink / raw)
  To: linux-arm-kernel

From: Jonathan Austin <jonathan.austin@arm.com>

Previously V7M had a custom definition for read_cpuid_id that didn't use the
underlying read_cpuid() macro and a stub definition for read_cpuid().

This requires a custom specialisation for each of the CPUID_* registers, and
since more than just CPUID_ID may be implemented in the future, this doesn't
make much sense.

This patch creates a generic implementation of read_cpuid for V7M and
removes the custom read_cpuid_id implementation.

Signed-off-by: Jonathan Austin <jonathan.austin@arm.com>
Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
---
 arch/arm/include/asm/cputype.h |   50 ++++++++++++++++++++++------------------
 1 file changed, 28 insertions(+), 22 deletions(-)

diff --git a/arch/arm/include/asm/cputype.h b/arch/arm/include/asm/cputype.h
index b23c6c8..2d46425 100644
--- a/arch/arm/include/asm/cputype.h
+++ b/arch/arm/include/asm/cputype.h
@@ -4,15 +4,15 @@
 #include <linux/stringify.h>
 #include <linux/kernel.h>
 
-#define CPUID_ID	0
-#define CPUID_CACHETYPE	1
-#define CPUID_TCM	2
-#define CPUID_TLBTYPE	3
-#define CPUID_MPUIR	4
-#define CPUID_MPIDR	5
-#define CPUID_REVIDR	6
-
 #ifdef CONFIG_CPU_V7M
+
+#define CPUID_ID	0x0
+#define CPUID_CACHETYPE	-1
+#define CPUID_TCM	-1
+#define CPUID_TLBTYPE	-1
+#define CPUID_MPIDR	-1
+#define CPUID_REVIDR	-1
+
 #define CPUID_EXT_PFR0	0x40
 #define CPUID_EXT_PFR1	0x44
 #define CPUID_EXT_DFR0	0x48
@@ -28,6 +28,14 @@
 #define CPUID_EXT_ISAR4	0x70
 #define CPUID_EXT_ISAR5	0x74
 #else
+#define CPUID_ID	0
+#define CPUID_CACHETYPE	1
+#define CPUID_TCM	2
+#define CPUID_TLBTYPE	3
+#define CPUID_MPUIR	4
+#define CPUID_MPIDR	5
+#define CPUID_REVIDR	6
+
 #define CPUID_EXT_PFR0	"c1, 0"
 #define CPUID_EXT_PFR1	"c1, 1"
 #define CPUID_EXT_DFR0	"c1, 2"
@@ -114,11 +122,16 @@ extern unsigned int processor_id;
 #include <asm/io.h>
 #include <asm/v7m.h>
 
-#define read_cpuid(reg)							\
-	({								\
-		WARN_ON_ONCE(1);					\
-		0;							\
-	})
+static inline unsigned int __attribute_const__ read_cpuid(unsigned offset)
+{
+	switch (offset) {
+	case CPUID_ID:
+		return readl(BASEADDR_V7M_SCB + offset);
+	default:
+		WARN_ON_ONCE(1);
+		return 0;
+	}
+}
 
 static inline unsigned int __attribute_const__ read_cpuid_ext(unsigned offset)
 {
@@ -141,7 +154,7 @@ static inline unsigned int __attribute_const__ read_cpuid_ext(unsigned offset)
 
 #endif /* ifdef CONFIG_CPU_CP15 / else */
 
-#ifdef CONFIG_CPU_CP15
+#if defined(CONFIG_CPU_CP15) || defined(CONFIG_CPU_V7M)
 /*
  * The CPU ID never changes at run time, so we might as well tell the
  * compiler that it's constant.  Use this function to read the CPU ID
@@ -152,14 +165,7 @@ static inline unsigned int __attribute_const__ read_cpuid_id(void)
 	return read_cpuid(CPUID_ID);
 }
 
-#elif defined(CONFIG_CPU_V7M)
-
-static inline unsigned int __attribute_const__ read_cpuid_id(void)
-{
-	return readl(BASEADDR_V7M_SCB + V7M_SCB_CPUID);
-}
-
-#else /* ifdef CONFIG_CPU_CP15 / elif defined(CONFIG_CPU_V7M) */
+#else /* if defined(CONFIG_CPU_CP15) || defined(CONFIG_CPU_V7M) */
 
 static inline unsigned int __attribute_const__ read_cpuid_id(void)
 {
-- 
1.7.9.5


* [PATCH RFC 03/10] ARM: V7M: Add addresses for mem-mapped V7M cache operations
  2016-04-21  8:18 [PATCH RFC 00/10] ARM: V7M: Support caches Vladimir Murzin
  2016-04-21  8:18 ` [PATCH RFC 01/10] ARM: factor out CSSELR/CCSIDR operations that use cp15 directly Vladimir Murzin
  2016-04-21  8:18 ` [PATCH RFC 02/10] ARM: V7M: Make read_cpuid() generally available on V7M Vladimir Murzin
@ 2016-04-21  8:18 ` Vladimir Murzin
  2016-04-21  8:18 ` [PATCH RFC 04/10] ARM: V7M: Add support for reading the CTR with CPUID_CACHETYPE Vladimir Murzin
                   ` (7 subsequent siblings)
  10 siblings, 0 replies; 17+ messages in thread
From: Vladimir Murzin @ 2016-04-21  8:18 UTC (permalink / raw)
  To: linux-arm-kernel

From: Jonathan Austin <jonathan.austin@arm.com>

V7M implements cache operations similarly to V7A/R; however, all operations
are performed via memory-mapped I/O instead of co-processor instructions.

This patch adds the register definitions for the ARMv7-M cache
architecture.

Signed-off-by: Jonathan Austin <jonathan.austin@arm.com>
Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
---
 arch/arm/include/asm/v7m.h |   22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)

diff --git a/arch/arm/include/asm/v7m.h b/arch/arm/include/asm/v7m.h
index 615781c..1fd775c 100644
--- a/arch/arm/include/asm/v7m.h
+++ b/arch/arm/include/asm/v7m.h
@@ -24,6 +24,9 @@
 
 #define V7M_SCB_CCR			0x14
 #define V7M_SCB_CCR_STKALIGN			(1 << 9)
+#define V7M_SCB_CCR_DC				(1 << 16)
+#define V7M_SCB_CCR_IC				(1 << 17)
+#define V7M_SCB_CCR_BP				(1 << 18)
 
 #define V7M_SCB_SHPR2			0x1c
 #define V7M_SCB_SHPR3			0x20
@@ -47,6 +50,25 @@
 #define EXC_RET_STACK_MASK			0x00000004
 #define EXC_RET_THREADMODE_PROCESSSTACK		0xfffffffd
 
+/* Cache related definitions */
+
+#define	V7M_SCB_CLIDR		0x78	/* Cache Level ID register */
+#define	V7M_SCB_CTR		0x7c	/* Cache Type register */
+#define	V7M_SCB_CCSIDR		0x80	/* Cache size ID register */
+#define	V7M_SCB_CSSELR		0x84	/* Cache size selection register */
+
+/* Cache operations */
+#define	V7M_SCB_ICIALLU		0x250	/* I-cache invalidate all to PoU */
+#define	V7M_SCB_ICIMVAU		0x258	/* I-cache invalidate by MVA to PoU */
+#define	V7M_SCB_DCIMVAC		0x25c	/* D-cache invalidate by MVA to PoC */
+#define	V7M_SCB_DCISW		0x260	/* D-cache invalidate by set-way */
+#define	V7M_SCB_DCCMVAU		0x264	/* D-cache clean by MVA to PoU */
+#define	V7M_SCB_DCCMVAC		0x268	/* D-cache clean by MVA to PoC */
+#define	V7M_SCB_DCCSW		0x26c	/* D-cache clean by set-way */
+#define	V7M_SCB_DCCIMVAC	0x270	/* D-cache clean and invalidate by MVA to PoC */
+#define	V7M_SCB_DCCISW		0x274	/* D-cache clean and invalidate by set-way */
+#define	V7M_SCB_BPIALL		0x278	/* Branch predictor invalidate all */
+
 #ifndef __ASSEMBLY__
 
 enum reboot_mode;
-- 
1.7.9.5


* [PATCH RFC 04/10] ARM: V7M: Add support for reading the CTR with CPUID_CACHETYPE
  2016-04-21  8:18 [PATCH RFC 00/10] ARM: V7M: Support caches Vladimir Murzin
                   ` (2 preceding siblings ...)
  2016-04-21  8:18 ` [PATCH RFC 03/10] ARM: V7M: Add addresses for mem-mapped V7M cache operations Vladimir Murzin
@ 2016-04-21  8:18 ` Vladimir Murzin
  2016-04-27  9:13   ` Russell King - ARM Linux
  2016-04-21  8:18 ` [PATCH RFC 05/10] ARM: Extract cp15 operations from cache flush code Vladimir Murzin
                   ` (6 subsequent siblings)
  10 siblings, 1 reply; 17+ messages in thread
From: Vladimir Murzin @ 2016-04-21  8:18 UTC (permalink / raw)
  To: linux-arm-kernel

From: Jonathan Austin <jonathan.austin@arm.com>

With the addition of caches to the V7M Architecture a new Cache Type Register
(CTR) is defined at 0xE000ED7C. This register serves the same purpose as the
V7A/R version, called CPUID_CACHETYPE in the kernel.

This patch adds appropriate definitions to the cpuid macros to allow the CTR to
be read with read_cpuid(reg).

Signed-off-by: Jonathan Austin <jonathan.austin@arm.com>
Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
---
 arch/arm/include/asm/cachetype.h |   15 +++++++++++++++
 arch/arm/include/asm/cputype.h   |    3 ++-
 arch/arm/kernel/setup.c          |    9 +++++----
 3 files changed, 22 insertions(+), 5 deletions(-)

diff --git a/arch/arm/include/asm/cachetype.h b/arch/arm/include/asm/cachetype.h
index 8609de8..d55b7de 100644
--- a/arch/arm/include/asm/cachetype.h
+++ b/arch/arm/include/asm/cachetype.h
@@ -67,6 +67,7 @@ static inline unsigned int __attribute__((pure)) cacheid_is(unsigned int mask)
 #define CSSELR_L6	(5 << 1)
 #define CSSELR_L7	(6 << 1)
 
+#ifndef CONFIG_CPU_V7M
 static inline void set_csselr(unsigned int cache_selector)
 {
 	asm volatile("mcr p15, 2, %0, c0, c0, 0" : : "r" (cache_selector));
@@ -79,5 +80,19 @@ static inline unsigned int read_ccsidr(void)
 	asm volatile("mrc p15, 1, %0, c0, c0, 0" : "=r" (val));
 	return val;
 }
+#else /* CONFIG_CPU_V7M */
+#include <asm/io.h>
+#include <asm/v7m.h>
+
+static inline void set_csselr(unsigned int cache_selector)
+{
+	writel(cache_selector, (void *)(BASEADDR_V7M_SCB + V7M_SCB_CSSELR));
+}
+
+static inline unsigned int read_ccsidr(void)
+{
+	return readl(BASEADDR_V7M_SCB + V7M_SCB_CCSIDR);
+}
+#endif
 
 #endif
diff --git a/arch/arm/include/asm/cputype.h b/arch/arm/include/asm/cputype.h
index 2d46425..ea595db 100644
--- a/arch/arm/include/asm/cputype.h
+++ b/arch/arm/include/asm/cputype.h
@@ -7,7 +7,7 @@
 #ifdef CONFIG_CPU_V7M
 
 #define CPUID_ID	0x0
-#define CPUID_CACHETYPE	-1
+#define CPUID_CACHETYPE	0x7c
 #define CPUID_TCM	-1
 #define CPUID_TLBTYPE	-1
 #define CPUID_MPIDR	-1
@@ -126,6 +126,7 @@ static inline unsigned int __attribute_const__ read_cpuid(unsigned offset)
 {
 	switch (offset) {
 	case CPUID_ID:
+	case CPUID_CACHETYPE:
 		return readl(BASEADDR_V7M_SCB + offset);
 	default:
 		WARN_ON_ONCE(1);
diff --git a/arch/arm/kernel/setup.c b/arch/arm/kernel/setup.c
index 163a90d..596be88 100644
--- a/arch/arm/kernel/setup.c
+++ b/arch/arm/kernel/setup.c
@@ -313,11 +313,12 @@ static void __init cacheid_init(void)
 {
 	unsigned int arch = cpu_architecture();
 
-	if (arch == CPU_ARCH_ARMv7M) {
-		cacheid = 0;
-	} else if (arch >= CPU_ARCH_ARMv6) {
+	if (arch >= CPU_ARCH_ARMv6) {
 		unsigned int cachetype = read_cpuid_cachetype();
-		if ((cachetype & (7 << 29)) == 4 << 29) {
+
+		if ((arch == CPU_ARCH_ARMv7M) && !cachetype) {
+			cacheid = 0;
+		} else if ((cachetype & (7 << 29)) == 4 << 29) {
 			/* ARMv7 register format */
 			arch = CPU_ARCH_ARMv7;
 			cacheid = CACHEID_VIPT_NONALIASING;
-- 
1.7.9.5


* [PATCH RFC 05/10] ARM: Extract cp15 operations from cache flush code
  2016-04-21  8:18 [PATCH RFC 00/10] ARM: V7M: Support caches Vladimir Murzin
                   ` (3 preceding siblings ...)
  2016-04-21  8:18 ` [PATCH RFC 04/10] ARM: V7M: Add support for reading the CTR with CPUID_CACHETYPE Vladimir Murzin
@ 2016-04-21  8:18 ` Vladimir Murzin
  2016-04-27  9:21   ` Russell King - ARM Linux
  2016-04-21  8:18 ` [PATCH RFC 06/10] ARM: V7M: Implement cache macros for V7M Vladimir Murzin
                   ` (5 subsequent siblings)
  10 siblings, 1 reply; 17+ messages in thread
From: Vladimir Murzin @ 2016-04-21  8:18 UTC (permalink / raw)
  To: linux-arm-kernel

From: Jonathan Austin <jonathan.austin@arm.com>

Caches have been added to the V7M architecture. Instead of CP15 operations,
the cache maintenance is done with memory-mapped registers. Other properties
of the cache architecture are the same as V7A/R.

In order to make it possible to use the same cacheflush code on V7A/R and
V7M, this commit separates the cp15 cache maintenance operations out into a
new, V7A/R-specific cache macros file.

This commit does not introduce any V7M-related code, which keeps it simple
to verify that the result of compiling cache-v7.S is identical before and
after this commit.

Signed-off-by: Jonathan Austin <jonathan.austin@arm.com>
Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
---
 arch/arm/mm/cache-v7.S        |   51 ++++++++---------
 arch/arm/mm/proc-macros.S     |   23 --------
 arch/arm/mm/proc-v7.S         |    1 +
 arch/arm/mm/v7-cache-macros.S |  124 +++++++++++++++++++++++++++++++++++++++++
 4 files changed, 147 insertions(+), 52 deletions(-)
 create mode 100644 arch/arm/mm/v7-cache-macros.S

diff --git a/arch/arm/mm/cache-v7.S b/arch/arm/mm/cache-v7.S
index a134d8a..53a802e 100644
--- a/arch/arm/mm/cache-v7.S
+++ b/arch/arm/mm/cache-v7.S
@@ -17,6 +17,7 @@
 #include <asm/unwind.h>
 
 #include "proc-macros.S"
+#include "v7-cache-macros.S"
 
 /*
  * The secondary kernel init calls v7_flush_dcache_all before it enables
@@ -33,9 +34,9 @@
  */
 ENTRY(v7_invalidate_l1)
        mov     r0, #0
-       mcr     p15, 2, r0, c0, c0, 0
-       mrc     p15, 1, r0, c0, c0, 0
 
+       write_csselr r0
+       read_ccsidr r0
        movw    r1, #0x7fff
        and     r2, r1, r0, lsr #13
 
@@ -55,7 +56,7 @@ ENTRY(v7_invalidate_l1)
        mov     r5, r3, lsl r1
        mov     r6, r2, lsl r0
        orr     r5, r5, r6      @ Reg = (Temp<<WayShift)|(NumSets<<SetShift)
-       mcr     p15, 0, r5, c7, c6, 2
+       dcisw   r5
        bgt     2b
        cmp     r2, #0
        bgt     1b
@@ -73,9 +74,7 @@ ENDPROC(v7_invalidate_l1)
  *	r0 - set to 0
  */
 ENTRY(v7_flush_icache_all)
-	mov	r0, #0
-	ALT_SMP(mcr	p15, 0, r0, c7, c1, 0)		@ invalidate I-cache inner shareable
-	ALT_UP(mcr	p15, 0, r0, c7, c5, 0)		@ I+BTB cache invalidate
+	invalidate_icache r0
 	ret	lr
 ENDPROC(v7_flush_icache_all)
 
@@ -89,7 +88,7 @@ ENDPROC(v7_flush_icache_all)
 
 ENTRY(v7_flush_dcache_louis)
 	dmb					@ ensure ordering with previous memory accesses
-	mrc	p15, 1, r0, c0, c0, 1		@ read clidr, r0 = clidr
+	read_clidr r0
 ALT_SMP(mov	r3, r0, lsr #20)		@ move LoUIS into position
 ALT_UP(	mov	r3, r0, lsr #26)		@ move LoUU into position
 	ands	r3, r3, #7 << 1 		@ extract LoU*2 field from clidr
@@ -117,7 +116,7 @@ ENDPROC(v7_flush_dcache_louis)
  */
 ENTRY(v7_flush_dcache_all)
 	dmb					@ ensure ordering with previous memory accesses
-	mrc	p15, 1, r0, c0, c0, 1		@ read clidr
+	read_clidr r0
 	mov	r3, r0, lsr #23			@ move LoC into position
 	ands	r3, r3, #7 << 1			@ extract LoC*2 from clidr
 	beq	finished			@ if loc is 0, then no need to clean
@@ -132,9 +131,9 @@ flush_levels:
 #ifdef CONFIG_PREEMPT
 	save_and_disable_irqs_notrace r9	@ make cssr&csidr read atomic
 #endif
-	mcr	p15, 2, r10, c0, c0, 0		@ select current cache level in cssr
+	write_csselr r10			@ set current cache level
 	isb					@ isb to sych the new cssr&csidr
-	mrc	p15, 1, r1, c0, c0, 0		@ read the new csidr
+	read_ccsidr r1				@ read the new csidr
 #ifdef CONFIG_PREEMPT
 	restore_irqs_notrace r9
 #endif
@@ -154,7 +153,7 @@ loop2:
  ARM(	orr	r11, r11, r9, lsl r2	)	@ factor index number into r11
  THUMB(	lsl	r6, r9, r2		)
  THUMB(	orr	r11, r11, r6		)	@ factor index number into r11
-	mcr	p15, 0, r11, c7, c14, 2		@ clean & invalidate by set/way
+	dccisw r11				@ clean/invalidate by set/way
 	subs	r9, r9, #1			@ decrement the index
 	bge	loop2
 	subs	r4, r4, #1			@ decrement the way
@@ -165,7 +164,7 @@ skip:
 	bgt	flush_levels
 finished:
 	mov	r10, #0				@ swith back to cache level 0
-	mcr	p15, 2, r10, c0, c0, 0		@ select current cache level in cssr
+	write_csselr r10			@ select current cache level in cssr
 	dsb	st
 	isb
 	ret	lr
@@ -186,9 +185,7 @@ ENTRY(v7_flush_kern_cache_all)
  ARM(	stmfd	sp!, {r4-r5, r7, r9-r11, lr}	)
  THUMB(	stmfd	sp!, {r4-r7, r9-r11, lr}	)
 	bl	v7_flush_dcache_all
-	mov	r0, #0
-	ALT_SMP(mcr	p15, 0, r0, c7, c1, 0)	@ invalidate I-cache inner shareable
-	ALT_UP(mcr	p15, 0, r0, c7, c5, 0)	@ I+BTB cache invalidate
+	invalidate_icache r0
  ARM(	ldmfd	sp!, {r4-r5, r7, r9-r11, lr}	)
  THUMB(	ldmfd	sp!, {r4-r7, r9-r11, lr}	)
 	ret	lr
@@ -204,9 +201,7 @@ ENTRY(v7_flush_kern_cache_louis)
  ARM(	stmfd	sp!, {r4-r5, r7, r9-r11, lr}	)
  THUMB(	stmfd	sp!, {r4-r7, r9-r11, lr}	)
 	bl	v7_flush_dcache_louis
-	mov	r0, #0
-	ALT_SMP(mcr	p15, 0, r0, c7, c1, 0)	@ invalidate I-cache inner shareable
-	ALT_UP(mcr	p15, 0, r0, c7, c5, 0)	@ I+BTB cache invalidate
+	invalidate_icache r0
  ARM(	ldmfd	sp!, {r4-r5, r7, r9-r11, lr}	)
  THUMB(	ldmfd	sp!, {r4-r7, r9-r11, lr}	)
 	ret	lr
@@ -278,7 +273,7 @@ ENTRY(v7_coherent_user_range)
 	ALT_UP(W(nop))
 #endif
 1:
- USER(	mcr	p15, 0, r12, c7, c11, 1	)	@ clean D line to the point of unification
+ USER(	dccmvau	r12 )		@ clean D line to the point of unification
 	add	r12, r12, r2
 	cmp	r12, r1
 	blo	1b
@@ -287,13 +282,11 @@ ENTRY(v7_coherent_user_range)
 	sub	r3, r2, #1
 	bic	r12, r0, r3
 2:
- USER(	mcr	p15, 0, r12, c7, c5, 1	)	@ invalidate I line
+ USER(	icimvau r12 )	@ invalidate I line
 	add	r12, r12, r2
 	cmp	r12, r1
 	blo	2b
-	mov	r0, #0
-	ALT_SMP(mcr	p15, 0, r0, c7, c1, 6)	@ invalidate BTB Inner Shareable
-	ALT_UP(mcr	p15, 0, r0, c7, c5, 6)	@ invalidate BTB
+	invalidate_bp r0
 	dsb	ishst
 	isb
 	ret	lr
@@ -331,7 +324,7 @@ ENTRY(v7_flush_kern_dcache_area)
 	ALT_UP(W(nop))
 #endif
 1:
-	mcr	p15, 0, r0, c7, c14, 1		@ clean & invalidate D line / unified line
+	dccimvac r0		@ clean & invalidate D line / unified line
 	add	r0, r0, r2
 	cmp	r0, r1
 	blo	1b
@@ -358,13 +351,13 @@ v7_dma_inv_range:
 	ALT_SMP(W(dsb))
 	ALT_UP(W(nop))
 #endif
-	mcrne	p15, 0, r0, c7, c14, 1		@ clean & invalidate D / U line
+	dccimvac r0 ne
 
 	tst	r1, r3
 	bic	r1, r1, r3
-	mcrne	p15, 0, r1, c7, c14, 1		@ clean & invalidate D / U line
+	dccimvac r1 ne
 1:
-	mcr	p15, 0, r0, c7, c6, 1		@ invalidate D / U line
+	dcimvac r0
 	add	r0, r0, r2
 	cmp	r0, r1
 	blo	1b
@@ -386,7 +379,7 @@ v7_dma_clean_range:
 	ALT_UP(W(nop))
 #endif
 1:
-	mcr	p15, 0, r0, c7, c10, 1		@ clean D / U line
+	dccmvac r0			@ clean D / U line
 	add	r0, r0, r2
 	cmp	r0, r1
 	blo	1b
@@ -408,7 +401,7 @@ ENTRY(v7_dma_flush_range)
 	ALT_UP(W(nop))
 #endif
 1:
-	mcr	p15, 0, r0, c7, c14, 1		@ clean & invalidate D / U line
+	dccimvac r0			 @ clean & invalidate D / U line
 	add	r0, r0, r2
 	cmp	r0, r1
 	blo	1b
diff --git a/arch/arm/mm/proc-macros.S b/arch/arm/mm/proc-macros.S
index c671f34..a82800a 100644
--- a/arch/arm/mm/proc-macros.S
+++ b/arch/arm/mm/proc-macros.S
@@ -66,29 +66,6 @@
 	.endm
 
 /*
- * dcache_line_size - get the minimum D-cache line size from the CTR register
- * on ARMv7.
- */
-	.macro	dcache_line_size, reg, tmp
-	mrc	p15, 0, \tmp, c0, c0, 1		@ read ctr
-	lsr	\tmp, \tmp, #16
-	and	\tmp, \tmp, #0xf		@ cache line size encoding
-	mov	\reg, #4			@ bytes per word
-	mov	\reg, \reg, lsl \tmp		@ actual cache line size
-	.endm
-
-/*
- * icache_line_size - get the minimum I-cache line size from the CTR register
- * on ARMv7.
- */
-	.macro	icache_line_size, reg, tmp
-	mrc	p15, 0, \tmp, c0, c0, 1		@ read ctr
-	and	\tmp, \tmp, #0xf		@ cache line size encoding
-	mov	\reg, #4			@ bytes per word
-	mov	\reg, \reg, lsl \tmp		@ actual cache line size
-	.endm
-
-/*
  * Sanity check the PTE configuration for the code below - which makes
  * certain assumptions about how these bits are laid out.
  */
diff --git a/arch/arm/mm/proc-v7.S b/arch/arm/mm/proc-v7.S
index 6fcaac8..c7bcc0c 100644
--- a/arch/arm/mm/proc-v7.S
+++ b/arch/arm/mm/proc-v7.S
@@ -18,6 +18,7 @@
 #include <asm/pgtable.h>
 
 #include "proc-macros.S"
+#include "v7-cache-macros.S"
 
 #ifdef CONFIG_ARM_LPAE
 #include "proc-v7-3level.S"
diff --git a/arch/arm/mm/v7-cache-macros.S b/arch/arm/mm/v7-cache-macros.S
new file mode 100644
index 0000000..5212383
--- /dev/null
+++ b/arch/arm/mm/v7-cache-macros.S
@@ -0,0 +1,124 @@
+/*
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
+ *
+ * Copyright (C) 2012 ARM Limited
+ *
+ * Author: Jonathan Austin <jonathan.austin@arm.com>
+ */
+
+.macro	read_ctr, rt
+	mrc     p15, 0, \rt, c0, c0, 1
+.endm
+
+.macro	read_ccsidr, rt
+	mrc     p15, 1, \rt, c0, c0, 0
+.endm
+
+.macro read_clidr, rt
+	mrc	p15, 1, \rt, c0, c0, 1
+.endm
+
+.macro	write_csselr, rt
+	mcr     p15, 2, \rt, c0, c0, 0
+.endm
+
+/*
+ * dcisw: invalidate data cache by set/way
+ */
+.macro dcisw, rt
+	mcr     p15, 0, \rt, c7, c6, 2
+.endm
+
+/*
+ * dccisw: clean and invalidate data cache by set/way
+ */
+.macro dccisw, rt
+	mcr	p15, 0, \rt, c7, c14, 2
+.endm
+
+/*
+ * dccimvac: Clean and invalidate data cache line by MVA to PoC.
+ */
+.macro dccimvac, rt, cond = al
+	mcr\cond	p15, 0, \rt, c7, c14, 1
+.endm
+
+/*
+ * dcimvac: Invalidate data cache line by MVA to PoC
+ */
+.macro dcimvac, rt
+	mcr	p15, 0, \rt, c7, c6, 1
+.endm
+
+/*
+ * dccmvau: Clean data cache line by MVA to PoU
+ */
+.macro dccmvau, rt
+	mcr	p15, 0, \rt, c7, c11, 1
+.endm
+
+/*
+ * dccmvac: Clean data cache line by MVA to PoC
+ */
+.macro dccmvac,  rt
+	mcr	p15, 0, \rt, c7, c10, 1
+.endm
+
+/*
+ * icimvau: Invalidate instruction caches by MVA to PoU
+ */
+.macro icimvau, rt
+	mcr	p15, 0, \rt, c7, c5, 1
+.endm
+
+/*
+ * Invalidate the icache, inner shareable if SMP, invalidate BTB for UP.
+ */
+.macro invalidate_icache, rt
+	mov	\rt, #0
+	ALT_SMP(mcr	p15, 0, \rt, c7, c1, 0)		@ icialluis: I-cache invalidate inner shareable
+	ALT_UP(mcr	p15, 0, \rt, c7, c5, 0)		@ iciallu: I+BTB cache invalidate
+.endm
+
+/*
+ * Invalidate the BTB, inner shareable if SMP.
+ */
+.macro invalidate_bp, rt
+	mov	\rt, #0
+	ALT_SMP(mcr	p15, 0, \rt, c7, c1, 6)		@ bpiallis: invalidate BTB inner shareable
+	ALT_UP(mcr	p15, 0, \rt, c7, c5, 6)		@ bpiall: invalidate BTB
+.endm
+
+/*
+ * dcache_line_size - get the minimum D-cache line size from the CTR register
+ * on ARMv7.
+ */
+	.macro	dcache_line_size, reg, tmp
+	read_ctr \tmp
+	lsr	\tmp, \tmp, #16
+	and	\tmp, \tmp, #0xf		@ cache line size encoding
+	mov	\reg, #4			@ bytes per word
+	mov	\reg, \reg, lsl \tmp		@ actual cache line size
+	.endm
+
+/*
+ * icache_line_size - get the minimum I-cache line size from the CTR register
+ * on ARMv7.
+ */
+	.macro	icache_line_size, reg, tmp
+	read_ctr \tmp
+	and	\tmp, \tmp, #0xf		@ cache line size encoding
+	mov	\reg, #4			@ bytes per word
+	mov	\reg, \reg, lsl \tmp		@ actual cache line size
+	.endm
-- 
1.7.9.5


* [PATCH RFC 06/10] ARM: V7M: Implement cache macros for V7M
  2016-04-21  8:18 [PATCH RFC 00/10] ARM: V7M: Support caches Vladimir Murzin
                   ` (4 preceding siblings ...)
  2016-04-21  8:18 ` [PATCH RFC 05/10] ARM: Extract cp15 operations from cache flush code Vladimir Murzin
@ 2016-04-21  8:18 ` Vladimir Murzin
  2016-04-21  8:18 ` [PATCH RFC 07/10] ARM: V7M: fix notrace variant of save_and_disable_irqs Vladimir Murzin
                   ` (4 subsequent siblings)
  10 siblings, 0 replies; 17+ messages in thread
From: Vladimir Murzin @ 2016-04-21  8:18 UTC (permalink / raw)
  To: linux-arm-kernel

From: Jonathan Austin <jonathan.austin@arm.com>

This commit implements the cache operation macros for V7M, paving the way
for caches to be used on V7M in a future commit.

Because the cache operations in V7M are memory-mapped, most operations need
an extra register compared to the V7 version, where the type of operation is
encoded in the instruction rather than in the address that is written to.

Thus, an extra register argument has been added to the cache operation macros,
that is required in V7M but ignored/unused in V7. In almost all cases there
was a spare temporary register, but in places where the register allocation
was tighter the ARM/THUMB macros have been used to avoid clobbering new
registers.

Signed-off-by: Jonathan Austin <jonathan.austin@arm.com>
Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
---
 arch/arm/mm/cache-v7.S         |   41 +++++++-----
 arch/arm/mm/v7-cache-macros.S  |   23 ++++---
 arch/arm/mm/v7m-cache-macros.S |  140 ++++++++++++++++++++++++++++++++++++++++
 3 files changed, 179 insertions(+), 25 deletions(-)
 create mode 100644 arch/arm/mm/v7m-cache-macros.S

diff --git a/arch/arm/mm/cache-v7.S b/arch/arm/mm/cache-v7.S
index 53a802e..a0c89c6 100644
--- a/arch/arm/mm/cache-v7.S
+++ b/arch/arm/mm/cache-v7.S
@@ -17,7 +17,11 @@
 #include <asm/unwind.h>
 
 #include "proc-macros.S"
+#ifdef CONFIG_CPU_V7M
+#include "v7m-cache-macros.S"
+#else
 #include "v7-cache-macros.S"
+#endif
 
 /*
  * The secondary kernel init calls v7_flush_dcache_all before it enables
@@ -35,7 +39,7 @@
 ENTRY(v7_invalidate_l1)
        mov     r0, #0
 
-       write_csselr r0
+       write_csselr r0, r1
        read_ccsidr r0
        movw    r1, #0x7fff
        and     r2, r1, r0, lsr #13
@@ -56,7 +60,7 @@ ENTRY(v7_invalidate_l1)
        mov     r5, r3, lsl r1
        mov     r6, r2, lsl r0
        orr     r5, r5, r6      @ Reg = (Temp<<WayShift)|(NumSets<<SetShift)
-       dcisw   r5
+       dcisw   r5, r6
        bgt     2b
        cmp     r2, #0
        bgt     1b
@@ -131,7 +135,7 @@ flush_levels:
 #ifdef CONFIG_PREEMPT
 	save_and_disable_irqs_notrace r9	@ make cssr&csidr read atomic
 #endif
-	write_csselr r10			@ set current cache level
+	write_csselr r10, r1			@ set current cache level
 	isb					@ isb to sych the new cssr&csidr
 	read_ccsidr r1				@ read the new csidr
 #ifdef CONFIG_PREEMPT
@@ -153,7 +157,8 @@ loop2:
  ARM(	orr	r11, r11, r9, lsl r2	)	@ factor index number into r11
  THUMB(	lsl	r6, r9, r2		)
  THUMB(	orr	r11, r11, r6		)	@ factor index number into r11
-	dccisw r11				@ clean/invalidate by set/way
+ ARM(	dccisw r11			)	@ clean/invalidate by set/way
+ THUMB(	dccisw r11, r6			)	@ clean/invalidate by set/way
 	subs	r9, r9, #1			@ decrement the index
 	bge	loop2
 	subs	r4, r4, #1			@ decrement the way
@@ -164,7 +169,7 @@ skip:
 	bgt	flush_levels
 finished:
 	mov	r10, #0				@ swith back to cache level 0
-	write_csselr r10			@ select current cache level in cssr
+	write_csselr r10, r3			@ select current cache level in cssr
 	dsb	st
 	isb
 	ret	lr
@@ -273,7 +278,7 @@ ENTRY(v7_coherent_user_range)
 	ALT_UP(W(nop))
 #endif
 1:
- USER(	dccmvau	r12 )		@ clean D line to the point of unification
+ USER(	dccmvau	r12, r3 )	@ clean D line to the point of unification
 	add	r12, r12, r2
 	cmp	r12, r1
 	blo	1b
@@ -282,7 +287,7 @@ ENTRY(v7_coherent_user_range)
 	sub	r3, r2, #1
 	bic	r12, r0, r3
 2:
- USER(	icimvau r12 )	@ invalidate I line
+ USER(	icimvau r12, r3 )	@ invalidate I line
 	add	r12, r12, r2
 	cmp	r12, r1
 	blo	2b
@@ -324,7 +329,7 @@ ENTRY(v7_flush_kern_dcache_area)
 	ALT_UP(W(nop))
 #endif
 1:
-	dccimvac r0		@ clean & invalidate D line / unified line
+	dccimvac r0, r3		@ clean & invalidate D line / unified line
 	add	r0, r0, r2
 	cmp	r0, r1
 	blo	1b
@@ -351,16 +356,20 @@ v7_dma_inv_range:
 	ALT_SMP(W(dsb))
 	ALT_UP(W(nop))
 #endif
-	dccimvac r0 ne
+	beq	1f
+ARM(	dccimvac r0		)
+THUMB(	dccimvac r0, r3		)
+THUMB(	sub	r3, r2, #1	)	@ restore r3, corrupted by dccimvac
 
-	tst	r1, r3
+1:	tst	r1, r3
 	bic	r1, r1, r3
-	dccimvac r1 ne
-1:
-	dcimvac r0
+	beq	2f
+	dccimvac r1, r3
+2:
+	dcimvac r0, r3
 	add	r0, r0, r2
 	cmp	r0, r1
-	blo	1b
+	blo	2b
 	dsb	st
 	ret	lr
 ENDPROC(v7_dma_inv_range)
@@ -379,7 +388,7 @@ v7_dma_clean_range:
 	ALT_UP(W(nop))
 #endif
 1:
-	dccmvac r0			@ clean D / U line
+	dccmvac r0, r3			@ clean D / U line
 	add	r0, r0, r2
 	cmp	r0, r1
 	blo	1b
@@ -401,7 +410,7 @@ ENTRY(v7_dma_flush_range)
 	ALT_UP(W(nop))
 #endif
 1:
-	dccimvac r0			 @ clean & invalidate D / U line
+	dccimvac r0, r3			 @ clean & invalidate D / U line
 	add	r0, r0, r2
 	cmp	r0, r1
 	blo	1b
diff --git a/arch/arm/mm/v7-cache-macros.S b/arch/arm/mm/v7-cache-macros.S
index 5212383..60cba98 100644
--- a/arch/arm/mm/v7-cache-macros.S
+++ b/arch/arm/mm/v7-cache-macros.S
@@ -15,6 +15,11 @@
  * Copyright (C) 2012 ARM Limited
  *
  * Author: Jonathan Austin <jonathan.austin@arm.com>
+ *
+ * The 'unused' parameters keep the macro signatures in sync with the
+ * V7M versions, which require a tmp register for certain operations (see
+ * v7m-cache-macros.S). GAS allows optional arguments to be omitted, but
+ * it does not silently ignore extra, undefined ones.
  */
 
 .macro	read_ctr, rt
@@ -29,56 +34,56 @@
 	mrc	p15, 1, \rt, c0, c0, 1
 .endm
 
-.macro	write_csselr, rt
+.macro	write_csselr, rt, unused
 	mcr     p15, 2, \rt, c0, c0, 0
 .endm
 
 /*
  * dcisw: invalidate data cache by set/way
  */
-.macro dcisw, rt
+.macro dcisw, rt, unused
 	mcr     p15, 0, \rt, c7, c6, 2
 .endm
 
 /*
  * dccisw: clean and invalidate data cache by set/way
  */
-.macro dccisw, rt
+.macro dccisw, rt, unused
 	mcr	p15, 0, \rt, c7, c14, 2
 .endm
 
 /*
  * dccimvac: Clean and invalidate data cache line by MVA to PoC.
  */
-.macro dccimvac, rt, cond = al
-	mcr\cond	p15, 0, \rt, c7, c14, 1
+.macro dccimvac, rt, unused
+	mcr	p15, 0, \rt, c7, c14, 1
 .endm
 
 /*
  * dcimvac: Invalidate data cache line by MVA to PoC
  */
-.macro dcimvac, rt
+.macro dcimvac, rt, unused
 	mcr	p15, 0, r0, c7, c6, 1
 .endm
 
 /*
  * dccmvau: Clean data cache line by MVA to PoU
  */
-.macro dccmvau, rt
+.macro dccmvau, rt, unused
 	mcr	p15, 0, \rt, c7, c11, 1
 .endm
 
 /*
  * dccmvac: Clean data cache line by MVA to PoC
  */
-.macro dccmvac,  rt
+.macro dccmvac,  rt, unused
 	mcr	p15, 0, \rt, c7, c10, 1
 .endm
 
 /*
  * icimvau: Invalidate instruction caches by MVA to PoU
  */
-.macro icimvau, rt
+.macro icimvau, rt, unused
 	mcr	p15, 0, \rt, c7, c5, 1
 .endm
 
diff --git a/arch/arm/mm/v7m-cache-macros.S b/arch/arm/mm/v7m-cache-macros.S
new file mode 100644
index 0000000..9a07c15
--- /dev/null
+++ b/arch/arm/mm/v7m-cache-macros.S
@@ -0,0 +1,140 @@
+/*
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
+ *
+ * Copyright (C) 2012 ARM Limited
+ *
+ * Author: Jonathan Austin <jonathan.austin@arm.com>
+ */
+#include "asm/v7m.h"
+#include "asm/assembler.h"
+
+/* Generic V7M read/write macros for memory mapped cache operations */
+.macro v7m_cache_read, rt, reg
+	movw	\rt, #:lower16:BASEADDR_V7M_SCB + \reg
+	movt	\rt, #:upper16:BASEADDR_V7M_SCB + \reg
+	ldr     \rt, [\rt]
+.endm
+
+.macro v7m_cacheop, rt, tmp, op
+	movw	\tmp, #:lower16:BASEADDR_V7M_SCB + \op
+	movt	\tmp, #:upper16:BASEADDR_V7M_SCB + \op
+	str	\rt, [\tmp]
+.endm
+
+/* read/write cache properties */
+.macro	read_ctr, rt
+	v7m_cache_read \rt, V7M_SCB_CTR
+.endm
+
+.macro	read_ccsidr, rt
+	v7m_cache_read \rt, V7M_SCB_CCSIDR
+.endm
+
+.macro read_clidr, rt
+	v7m_cache_read \rt, V7M_SCB_CLIDR
+.endm
+
+.macro	write_csselr, rt, tmp
+	v7m_cacheop \rt, \tmp, V7M_SCB_CSSELR
+.endm
+
+/*
+ * dcisw: Invalidate data cache by set/way
+ */
+.macro dcisw, rt, tmp
+	v7m_cacheop \rt, \tmp, V7M_SCB_DCISW
+.endm
+
+/*
+ * dccisw: Clean and invalidate data cache by set/way
+ */
+.macro dccisw, rt, tmp
+	v7m_cacheop \rt, \tmp, V7M_SCB_DCCISW
+.endm
+
+/*
+ * dccimvac: Clean and invalidate data cache line by MVA to PoC.
+ */
+.macro dccimvac, rt, tmp
+	v7m_cacheop \rt, \tmp, V7M_SCB_DCCIMVAC
+.endm
+
+/*
+ * dcimvac: Invalidate data cache line by MVA to PoC
+ */
+.macro dcimvac, rt, tmp
+	v7m_cacheop \rt, \tmp, V7M_SCB_DCIMVAC
+.endm
+
+/*
+ * dccmvau: Clean data cache line by MVA to PoU
+ */
+.macro dccmvau, rt, tmp
+	v7m_cacheop \rt, \tmp, V7M_SCB_DCCMVAU
+.endm
+
+/*
+ * dccmvac: Clean data cache line by MVA to PoC
+ */
+.macro dccmvac,  rt, tmp
+	v7m_cacheop \rt, \tmp, V7M_SCB_DCCMVAC
+.endm
+
+/*
+ * icimvau: Invalidate instruction caches by MVA to PoU
+ */
+.macro icimvau, rt, tmp
+	v7m_cacheop \rt, \tmp, V7M_SCB_ICIMVAU
+.endm
+
+/*
+ * Invalidate the icache, inner shareable if SMP, invalidate BTB for UP.
+ * rt data is ignored by ICIALLU(IS), so it can be used for the address
+ */
+.macro invalidate_icache, rt
+	v7m_cacheop \rt, \rt, V7M_SCB_ICIALLU
+	mov \rt, #0
+.endm
+
+/*
+ * Invalidate the BTB, inner shareable if SMP.
+ * rt data is ignored by BPIALL, so it can be used for the address
+ */
+.macro invalidate_bp, rt
+	v7m_cacheop \rt, \rt, V7M_SCB_BPIALL
+	mov \rt, #0
+.endm
+
+/*
+ * dcache_line_size - get the minimum D-cache line size from the CTR register
+ * on ARMv7.
+ */
+.macro	dcache_line_size, reg, tmp
+	read_ctr \tmp
+	lsr	\tmp, \tmp, #16
+	and	\tmp, \tmp, #0xf		@ cache line size encoding
+	mov	\reg, #4			@ bytes per word
+	mov	\reg, \reg, lsl \tmp		@ actual cache line size
+.endm
+
+/*
+ * icache_line_size - get the minimum I-cache line size from the CTR register
+ * on ARMv7.
+ */
+.macro	icache_line_size, reg, tmp
+	read_ctr \tmp
+	and	\tmp, \tmp, #0xf		@ cache line size encoding
+	mov	\reg, #4			@ bytes per word
+	mov	\reg, \reg, lsl \tmp		@ actual cache line size
+.endm
-- 
1.7.9.5
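
The set/way operand written through the dcisw/dccisw macros above is composed
the same way on both the cp15 and memory-mapped variants. A minimal C sketch of
that composition, matching what flush_levels computes (the helper name and the
example shift values below are illustrative only, not kernel code):

```c
#include <stdint.h>

/* Compose a DCISW/DCCISW operand as flush_levels does: way number in
 * the top bits, set number above the cache-line offset bits, and the
 * cache level in bits [3:1]. */
static uint32_t setway_operand(uint32_t way, unsigned way_shift,
                               uint32_t set, unsigned set_shift,
                               uint32_t level)
{
	return (way << way_shift) | (set << set_shift) | (level << 1);
}
```

For example, with a 4-way cache (way_shift = clz(4 - 1) = 30) and 32-byte lines
(set_shift = 5), way 3 / set 1 at level field 0 yields 0xc0000020.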

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH RFC 07/10] ARM: V7M: fix notrace variant of save_and_disable_irqs
  2016-04-21  8:18 [PATCH RFC 00/10] ARM: V7M: Support caches Vladimir Murzin
                   ` (5 preceding siblings ...)
  2016-04-21  8:18 ` [PATCH RFC 06/10] ARM: V7M: Implement cache macros for V7M Vladimir Murzin
@ 2016-04-21  8:18 ` Vladimir Murzin
  2016-04-21  8:18 ` [PATCH RFC 08/10] ARM: V7M: Wire up caches for V7M processors with cache support Vladimir Murzin
                   ` (3 subsequent siblings)
  10 siblings, 0 replies; 17+ messages in thread
From: Vladimir Murzin @ 2016-04-21  8:18 UTC (permalink / raw)
  To: linux-arm-kernel

Commit 8e43a905 ("ARM: 7325/1: fix v7 boot with lockdep enabled")
introduced a notrace variant of save_and_disable_irqs to balance the
notrace variant of restore_irqs; however, the V7M case was missed. This
went unnoticed because cache-v7.S is the only place where the notrace
variant is used. Fix it now, since we are going to extend the V7 cache
routines to handle the V7M case too.

Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
---
 arch/arm/include/asm/assembler.h |    4 ++++
 1 file changed, 4 insertions(+)

diff --git a/arch/arm/include/asm/assembler.h b/arch/arm/include/asm/assembler.h
index b2bc8e1..310a5e0 100644
--- a/arch/arm/include/asm/assembler.h
+++ b/arch/arm/include/asm/assembler.h
@@ -159,7 +159,11 @@
 	.endm
 
 	.macro	save_and_disable_irqs_notrace, oldcpsr
+#ifdef CONFIG_CPU_V7M
+	mrs	\oldcpsr, primask
+#else
 	mrs	\oldcpsr, cpsr
+#endif
 	disable_irq_notrace
 	.endm
 
-- 
1.7.9.5


* [PATCH RFC 08/10] ARM: V7M: Wire up caches for V7M processors with cache support.
  2016-04-21  8:18 [PATCH RFC 00/10] ARM: V7M: Support caches Vladimir Murzin
                   ` (6 preceding siblings ...)
  2016-04-21  8:18 ` [PATCH RFC 07/10] ARM: V7M: fix notrace variant of save_and_disable_irqs Vladimir Murzin
@ 2016-04-21  8:18 ` Vladimir Murzin
  2016-04-21  8:18 ` [PATCH RFC 09/10] ARM: V7M: Indirect proc_info construction for V7M CPUs Vladimir Murzin
                   ` (2 subsequent siblings)
  10 siblings, 0 replies; 17+ messages in thread
From: Vladimir Murzin @ 2016-04-21  8:18 UTC (permalink / raw)
  To: linux-arm-kernel

From: Jonathan Austin <jonathan.austin@arm.com>

This patch does the plumbing required to invoke the V7M cache code added
in earlier patches in this series, although there are no users of it yet.

In order to honour the I/D cache disable config options, this patch changes
the mechanism by which the CCR is set on boot to be more like V7A/R.

Signed-off-by: Jonathan Austin <jonathan.austin@arm.com>
Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
---
 arch/arm/include/asm/glue-cache.h |    4 ----
 arch/arm/kernel/head-nommu.S      |   16 +++++++++++++++-
 arch/arm/mm/Kconfig               |    7 ++++---
 arch/arm/mm/Makefile              |    4 ++++
 arch/arm/mm/proc-v7m.S            |    5 ++---
 5 files changed, 25 insertions(+), 11 deletions(-)

diff --git a/arch/arm/include/asm/glue-cache.h b/arch/arm/include/asm/glue-cache.h
index cab07f6..01c3d92 100644
--- a/arch/arm/include/asm/glue-cache.h
+++ b/arch/arm/include/asm/glue-cache.h
@@ -118,11 +118,7 @@
 #endif
 
 #if defined(CONFIG_CPU_V7M)
-# ifdef _CACHE
 #  define MULTI_CACHE 1
-# else
-#  define _CACHE nop
-# endif
 #endif
 
 #if !defined(_CACHE) && !defined(MULTI_CACHE)
diff --git a/arch/arm/kernel/head-nommu.S b/arch/arm/kernel/head-nommu.S
index 9b8c5a1..cb10620 100644
--- a/arch/arm/kernel/head-nommu.S
+++ b/arch/arm/kernel/head-nommu.S
@@ -158,7 +158,21 @@ __after_proc_init:
 	bic	r0, r0, #CR_V
 #endif
 	mcr	p15, 0, r0, c1, c0, 0		@ write control reg
-#endif /* CONFIG_CPU_CP15 */
+#elif defined (CONFIG_CPU_V7M)
+	/* For V7M systems we want to modify the CCR similarly to the SCTLR */
+#ifdef CONFIG_CPU_DCACHE_DISABLE
+	bic	r0, r0, #V7M_SCB_CCR_DC
+#endif
+#ifdef CONFIG_CPU_BPREDICT_DISABLE
+	bic	r0, r0, #V7M_SCB_CCR_BP
+#endif
+#ifdef CONFIG_CPU_ICACHE_DISABLE
+	bic	r0, r0, #V7M_SCB_CCR_IC
+#endif
+	movw	r3, #:lower16:(BASEADDR_V7M_SCB + V7M_SCB_CCR)
+	movt	r3, #:upper16:(BASEADDR_V7M_SCB + V7M_SCB_CCR)
+	str	r0, [r3]
+#endif /* CONFIG_CPU_CP15 elif CONFIG_CPU_V7M */
 	ret	lr
 ENDPROC(__after_proc_init)
 	.ltorg
diff --git a/arch/arm/mm/Kconfig b/arch/arm/mm/Kconfig
index 5534766..db10577 100644
--- a/arch/arm/mm/Kconfig
+++ b/arch/arm/mm/Kconfig
@@ -403,6 +403,7 @@ config CPU_V7M
 	bool
 	select CPU_32v7M
 	select CPU_ABRT_NOMMU
+	select CPU_CACHE_V7
 	select CPU_CACHE_NOP
 	select CPU_PABRT_LEGACY
 	select CPU_THUMBONLY
@@ -747,14 +748,14 @@ config CPU_HIGH_VECTOR
 
 config CPU_ICACHE_DISABLE
 	bool "Disable I-Cache (I-bit)"
-	depends on CPU_CP15 && !(CPU_ARM720T || CPU_ARM740T || CPU_XSCALE || CPU_XSC3)
+	depends on (CPU_CP15 && !(CPU_ARM720T || CPU_ARM740T || CPU_XSCALE || CPU_XSC3)) || CPU_V7M
 	help
 	  Say Y here to disable the processor instruction cache. Unless
 	  you have a reason not to or are unsure, say N.
 
 config CPU_DCACHE_DISABLE
 	bool "Disable D-Cache (C-bit)"
-	depends on CPU_CP15 && !SMP
+	depends on (CPU_CP15 && !SMP) || CPU_V7M
 	help
 	  Say Y here to disable the processor data cache. Unless
 	  you have a reason not to or are unsure, say N.
@@ -789,7 +790,7 @@ config CPU_CACHE_ROUND_ROBIN
 
 config CPU_BPREDICT_DISABLE
 	bool "Disable branch prediction"
-	depends on CPU_ARM1020 || CPU_V6 || CPU_V6K || CPU_MOHAWK || CPU_XSC3 || CPU_V7 || CPU_FA526
+	depends on CPU_ARM1020 || CPU_V6 || CPU_V6K || CPU_MOHAWK || CPU_XSC3 || CPU_V7 || CPU_FA526 || CPU_V7M
 	help
 	  Say Y here to disable branch prediction.  If unsure, say N.
 
diff --git a/arch/arm/mm/Makefile b/arch/arm/mm/Makefile
index 7f76d96..cf4709d 100644
--- a/arch/arm/mm/Makefile
+++ b/arch/arm/mm/Makefile
@@ -45,7 +45,11 @@ obj-$(CONFIG_CPU_CACHE_FA)	+= cache-fa.o
 obj-$(CONFIG_CPU_CACHE_NOP)	+= cache-nop.o
 
 AFLAGS_cache-v6.o	:=-Wa,-march=armv6
+ifneq ($(CONFIG_CPU_V7M),y)
 AFLAGS_cache-v7.o	:=-Wa,-march=armv7-a
+else
+AFLAGS_cache-v7.o	:=-Wa,-march=armv7-m
+endif
 
 obj-$(CONFIG_CPU_COPY_V4WT)	+= copypage-v4wt.o
 obj-$(CONFIG_CPU_COPY_V4WB)	+= copypage-v4wb.o
diff --git a/arch/arm/mm/proc-v7m.S b/arch/arm/mm/proc-v7m.S
index 7229d8d..11f5816 100644
--- a/arch/arm/mm/proc-v7m.S
+++ b/arch/arm/mm/proc-v7m.S
@@ -118,9 +118,8 @@ __v7m_setup:
 
 	@ Configure the System Control Register to ensure 8-byte stack alignment
 	@ Note the STKALIGN bit is either RW or RAO.
-	ldr	r12, [r0, V7M_SCB_CCR]	@ system control register
-	orr	r12, #V7M_SCB_CCR_STKALIGN
-	str	r12, [r0, V7M_SCB_CCR]
+	ldr	r0, [r0, V7M_SCB_CCR]   @ system control register
+	orr	r0, #V7M_SCB_CCR_STKALIGN
 	ret	lr
 ENDPROC(__v7m_setup)
 
-- 
1.7.9.5
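
The head-nommu.S hunk above is a read-modify-write of the CCR. The same logic,
sketched in C — the bit positions (DC = 16, IC = 17, BP = 18, mirroring
V7M_SCB_CCR_*) are taken from the ARMv7-M architecture, and the helper is
illustrative only, not kernel code:

```c
#include <stdint.h>

/* CCR cache/branch-predictor enable bits (assumed ARMv7-M positions,
 * standing in for V7M_SCB_CCR_DC/IC/BP). */
#define CCR_DC (1u << 16)
#define CCR_IC (1u << 17)
#define CCR_BP (1u << 18)

/* Clear the enable bits selected by the CPU_*_DISABLE config options,
 * as __after_proc_init does before storing the result back to the CCR. */
static uint32_t ccr_apply_disables(uint32_t ccr, int dcache_disable,
                                   int bp_disable, int icache_disable)
{
	if (dcache_disable)
		ccr &= ~CCR_DC;
	if (bp_disable)
		ccr &= ~CCR_BP;
	if (icache_disable)
		ccr &= ~CCR_IC;
	return ccr;
}
```

Note that other CCR bits (e.g. STKALIGN, bit 9) pass through untouched.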


* [PATCH RFC 09/10] ARM: V7M: Indirect proc_info construction for V7M CPUs
  2016-04-21  8:18 [PATCH RFC 00/10] ARM: V7M: Support caches Vladimir Murzin
                   ` (7 preceding siblings ...)
  2016-04-21  8:18 ` [PATCH RFC 08/10] ARM: V7M: Wire up caches for V7M processors with cache support Vladimir Murzin
@ 2016-04-21  8:18 ` Vladimir Murzin
  2016-04-21  8:18 ` [PATCH RFC 10/10] ARM: V7M: Add support for the Cortex-M7 processor Vladimir Murzin
  2016-05-26  8:05 ` [PATCH RFC 00/10] ARM: V7M: Support caches Alexandre Torgue
  10 siblings, 0 replies; 17+ messages in thread
From: Vladimir Murzin @ 2016-04-21  8:18 UTC (permalink / raw)
  To: linux-arm-kernel

From: Jonathan Austin <jonathan.austin@arm.com>

This patch copies the method used for V7A/R CPUs to specify differing
processor info for different cores.

It differentiates between Cortex-M3 and Cortex-M4 and leaves a fallback
case for any other V7M processor.

Signed-off-by: Jonathan Austin <jonathan.austin@arm.com>
Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
---
 arch/arm/mm/proc-v7m.S |   46 +++++++++++++++++++++++++++++++++++-----------
 1 file changed, 35 insertions(+), 11 deletions(-)

diff --git a/arch/arm/mm/proc-v7m.S b/arch/arm/mm/proc-v7m.S
index 11f5816..796a983 100644
--- a/arch/arm/mm/proc-v7m.S
+++ b/arch/arm/mm/proc-v7m.S
@@ -132,6 +132,40 @@ ENDPROC(__v7m_setup)
 
 	.section ".proc.info.init", #alloc
 
+.macro __v7m_proc name, initfunc, cache_fns = nop_cache_fns, hwcaps = 0,  proc_fns = v7m_processor_functions
+	.long	0			/* proc_info_list.__cpu_mm_mmu_flags */
+	.long	0			/* proc_info_list.__cpu_io_mmu_flags */
+	initfn	\initfunc, \name
+	.long	cpu_arch_name
+	.long	cpu_elf_name
+	.long	HWCAP_HALF | HWCAP_THUMB | HWCAP_FAST_MULT | \hwcaps
+	.long	cpu_v7m_name
+	.long   \proc_fns
+	.long	0			/* proc_info_list.tlb */
+	.long	0			/* proc_info_list.user */
+	.long	\cache_fns
+.endm
+
+	/*
+	 * Match ARM Cortex-M4 processor.
+	 */
+	.type	__v7m_cm4_proc_info, #object
+__v7m_cm4_proc_info:
+	.long	0x410fc240		/* ARM Cortex-M4 0xC24 */
+	.long	0xff0ffff0		/* Mask off revision, patch release */
+	__v7m_proc __v7m_cm4_proc_info, __v7m_setup, hwcaps = HWCAP_EDSP
+	.size	__v7m_cm4_proc_info, . - __v7m_cm4_proc_info
+
+	/*
+	 * Match ARM Cortex-M3 processor.
+	 */
+	.type	__v7m_cm3_proc_info, #object
+__v7m_cm3_proc_info:
+	.long	0x410fc230		/* ARM Cortex-M3 0xC23 */
+	.long	0xff0ffff0		/* Mask off revision, patch release */
+	__v7m_proc __v7m_cm3_proc_info, __v7m_setup
+	.size	__v7m_cm3_proc_info, . - __v7m_cm3_proc_info
+
 	/*
 	 * Match any ARMv7-M processor core.
 	 */
@@ -139,16 +173,6 @@ ENDPROC(__v7m_setup)
 __v7m_proc_info:
 	.long	0x000f0000		@ Required ID value
 	.long	0x000f0000		@ Mask for ID
-	.long   0			@ proc_info_list.__cpu_mm_mmu_flags
-	.long   0			@ proc_info_list.__cpu_io_mmu_flags
-	initfn	__v7m_setup, __v7m_proc_info	@ proc_info_list.__cpu_flush
-	.long	cpu_arch_name
-	.long	cpu_elf_name
-	.long	HWCAP_HALF|HWCAP_THUMB|HWCAP_FAST_MULT
-	.long	cpu_v7m_name
-	.long	v7m_processor_functions	@ proc_info_list.proc
-	.long	0			@ proc_info_list.tlb
-	.long	0			@ proc_info_list.user
-	.long	nop_cache_fns		@ proc_info_list.cache
+	__v7m_proc __v7m_proc_info, __v7m_setup
 	.size	__v7m_proc_info, . - __v7m_proc_info
 
-- 
1.7.9.5
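
The match performed against these proc_info entries is simply
(MIDR & mask) == value. A small C sketch using the Cortex-M3/M4 values from
the patch (the sample MIDR in the comment is a constructed example, not read
from real hardware):

```c
#include <stdint.h>

/* A proc_info entry matches when the masked MIDR equals the entry's
 * required ID value, as in the tables above. With mask 0xff0ffff0 the
 * revision and patch-release fields are ignored, so e.g. a constructed
 * MIDR of 0x412fc245 still matches the Cortex-M4 value 0x410fc240. */
static int proc_info_matches(uint32_t midr, uint32_t value, uint32_t mask)
{
	return (midr & mask) == value;
}
```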


* [PATCH RFC 10/10] ARM: V7M: Add support for the Cortex-M7 processor
  2016-04-21  8:18 [PATCH RFC 00/10] ARM: V7M: Support caches Vladimir Murzin
                   ` (8 preceding siblings ...)
  2016-04-21  8:18 ` [PATCH RFC 09/10] ARM: V7M: Indirect proc_info construction for V7M CPUs Vladimir Murzin
@ 2016-04-21  8:18 ` Vladimir Murzin
  2016-05-26  8:05 ` [PATCH RFC 00/10] ARM: V7M: Support caches Alexandre Torgue
  10 siblings, 0 replies; 17+ messages in thread
From: Vladimir Murzin @ 2016-04-21  8:18 UTC (permalink / raw)
  To: linux-arm-kernel

From: Jonathan Austin <jonathan.austin@arm.com>

Cortex-M7 is a new member of the V7M processor family that adds, among
other things, caches on top of the features available in Cortex-M4.

This patch adds support for recognising the processor at boot time and
makes use of the recently introduced cache functions.

Signed-off-by: Jonathan Austin <jonathan.austin@arm.com>
Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
---
 arch/arm/mm/proc-v7m.S |   42 ++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 42 insertions(+)

diff --git a/arch/arm/mm/proc-v7m.S b/arch/arm/mm/proc-v7m.S
index 796a983..60387a0 100644
--- a/arch/arm/mm/proc-v7m.S
+++ b/arch/arm/mm/proc-v7m.S
@@ -15,6 +15,7 @@
 #include <asm/memory.h>
 #include <asm/v7m.h>
 #include "proc-macros.S"
+#include "v7m-cache-macros.S"
 
 ENTRY(cpu_v7m_proc_init)
 	ret	lr
@@ -74,6 +75,25 @@ ENTRY(cpu_v7m_do_resume)
 ENDPROC(cpu_v7m_do_resume)
 #endif
 
+ENTRY(cpu_cm7_dcache_clean_area)
+	dcache_line_size r2, r3
+1:	dccmvac r0, r3				@ clean D entry
+	add	r0, r0, r2
+	subs	r1, r1, r2
+	bhi	1b
+	dsb
+	ret	lr
+ENDPROC(cpu_cm7_dcache_clean_area)
+
+ENTRY(cpu_cm7_proc_fin)
+	movw	r2, #:lower16:(BASEADDR_V7M_SCB + V7M_SCB_CCR)
+	movt	r2, #:upper16:(BASEADDR_V7M_SCB + V7M_SCB_CCR)
+	ldr	r0, [r2]
+	bic	r0, r0, #(V7M_SCB_CCR_DC | V7M_SCB_CCR_IC)
+	str	r0, [r2]
+	ret	lr
+ENDPROC(cpu_cm7_proc_fin)
+
 	.section ".text.init", #alloc, #execinstr
 
 /*
@@ -120,10 +140,22 @@ __v7m_setup:
 	@ Note the STKALIGN bit is either RW or RAO.
 	ldr	r0, [r0, V7M_SCB_CCR]   @ system control register
 	orr	r0, #V7M_SCB_CCR_STKALIGN
+	read_ctr r12
+	teq     r12, #0
+	orrne   r0, r0, #(V7M_SCB_CCR_DC | V7M_SCB_CCR_IC | V7M_SCB_CCR_BP)
 	ret	lr
 ENDPROC(__v7m_setup)
 
+/*
+ * Cortex-M7 processor functions
+ */
+	globl_equ	cpu_cm7_proc_init,	cpu_v7m_proc_init
+	globl_equ	cpu_cm7_reset,		cpu_v7m_reset
+	globl_equ	cpu_cm7_do_idle,	cpu_v7m_do_idle
+	globl_equ	cpu_cm7_switch_mm,	cpu_v7m_switch_mm
+
 	define_processor_functions v7m, dabort=nommu_early_abort, pabort=legacy_pabort, nommu=1
+	define_processor_functions cm7, dabort=nommu_early_abort, pabort=legacy_pabort, nommu=1
 
 	.section ".rodata"
 	string cpu_arch_name, "armv7m"
@@ -147,6 +179,16 @@ ENDPROC(__v7m_setup)
 .endm
 
 	/*
+	 * Match ARM Cortex-M7 processor.
+	 */
+	.type	__v7m_cm7_proc_info, #object
+__v7m_cm7_proc_info:
+	.long	0x410fc270		/* ARM Cortex-M7 0xC27 */
+	.long	0xff0ffff0		/* Mask off revision, patch release */
+	__v7m_proc __v7m_cm7_proc_info, __v7m_setup, hwcaps = HWCAP_EDSP, cache_fns = v7_cache_fns, proc_fns = cm7_processor_functions
+	.size	__v7m_cm7_proc_info, . - __v7m_cm7_proc_info
+
+	/*
 	 * Match ARM Cortex-M4 processor.
 	 */
 	.type	__v7m_cm4_proc_info, #object
-- 
1.7.9.5


* [PATCH RFC 04/10] ARM: V7M: Add support for reading the CTR with CPUID_CACHETYPE
  2016-04-21  8:18 ` [PATCH RFC 04/10] ARM: V7M: Add support for reading the CTR with CPUID_CACHETYPE Vladimir Murzin
@ 2016-04-27  9:13   ` Russell King - ARM Linux
  2016-04-27 12:18     ` Vladimir Murzin
  0 siblings, 1 reply; 17+ messages in thread
From: Russell King - ARM Linux @ 2016-04-27  9:13 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, Apr 21, 2016 at 09:18:16AM +0100, Vladimir Murzin wrote:
> @@ -79,5 +80,19 @@ static inline unsigned int read_ccsidr(void)
>  	asm volatile("mrc p15, 1, %0, c0, c0, 0" : "=r" (val));
>  	return val;
>  }
> +#else /* CONFIG_CPU_V7M */
> +#include <asm/io.h>

Please use linux/io.h

> +#include "asm/v7m.h"
> +
> +static inline void set_csselr(unsigned int cache_selector)
> +{
> +	writel(cache_selector, (void *)(BASEADDR_V7M_SCB + V7M_SCB_CTR));

writel() doesn't take a void pointer.  It takes a void __iomem pointer.
BASEADDR_V7M_SCB may need to be defined more appropriately.

-- 
RMK's Patch system: http://www.arm.linux.org.uk/developer/patches/
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.


* [PATCH RFC 05/10] ARM: Extract cp15 operations from cache flush code
  2016-04-21  8:18 ` [PATCH RFC 05/10] ARM: Extract cp15 operations from cache flush code Vladimir Murzin
@ 2016-04-27  9:21   ` Russell King - ARM Linux
  2016-04-27 12:24     ` Vladimir Murzin
  0 siblings, 1 reply; 17+ messages in thread
From: Russell King - ARM Linux @ 2016-04-27  9:21 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, Apr 21, 2016 at 09:18:17AM +0100, Vladimir Murzin wrote:
> @@ -278,7 +273,7 @@ ENTRY(v7_coherent_user_range)
>  	ALT_UP(W(nop))
>  #endif
>  1:
> - USER(	mcr	p15, 0, r12, c7, c11, 1	)	@ clean D line to the point of unification
> + USER(	dccmvau	r12 )		@ clean D line to the point of unification

While this is correct for this patch, I think it's incorrect for the v7m
variant.  dccmvau expands to several instructions, the first is a mov,
and the effect of the above will be to mark the mov as the user-accessing
instruction, not the instruction which cleans the D line.

> @@ -287,13 +282,11 @@ ENTRY(v7_coherent_user_range)
>  	sub	r3, r2, #1
>  	bic	r12, r0, r3
>  2:
> - USER(	mcr	p15, 0, r12, c7, c5, 1	)	@ invalidate I line
> + USER(	icimvau r12 )	@ invalidate I line

Same problem.
> @@ -358,13 +351,13 @@ v7_dma_inv_range:
>  	ALT_SMP(W(dsb))
>  	ALT_UP(W(nop))
>  #endif
> -	mcrne	p15, 0, r0, c7, c14, 1		@ clean & invalidate D / U line
> +	dccimvac r0 ne

I'd prefer the:

	.irp    c,,eq,ne,cs,cc,mi,pl,vs,vc,hi,ls,ge,lt,gt,le,hs,lo
	.macro	dccimvac\c, ...
	.endm
	.endr

approach, so you can use

	dccimvacne r0

here.

>  
>  	tst	r1, r3
>  	bic	r1, r1, r3
> -	mcrne	p15, 0, r1, c7, c14, 1		@ clean & invalidate D / U line
> +	dccimvac r1 ne
>  1:
> -	mcr	p15, 0, r0, c7, c6, 1		@ invalidate D / U line
> +	dcimvac r0
>  	add	r0, r0, r2
>  	cmp	r0, r1
>  	blo	1b
> @@ -386,7 +379,7 @@ v7_dma_clean_range:
>  	ALT_UP(W(nop))
>  #endif
>  1:
> -	mcr	p15, 0, r0, c7, c10, 1		@ clean D / U line
> +	dccmvac r0			@ clean D / U line
>  	add	r0, r0, r2
>  	cmp	r0, r1
>  	blo	1b
> @@ -408,7 +401,7 @@ ENTRY(v7_dma_flush_range)
>  	ALT_UP(W(nop))
>  #endif
>  1:
> -	mcr	p15, 0, r0, c7, c14, 1		@ clean & invalidate D / U line
> +	dccimvac r0			 @ clean & invalidate D / U line
>  	add	r0, r0, r2
>  	cmp	r0, r1
>  	blo	1b
> diff --git a/arch/arm/mm/proc-macros.S b/arch/arm/mm/proc-macros.S
> index c671f34..a82800a 100644
> --- a/arch/arm/mm/proc-macros.S
> +++ b/arch/arm/mm/proc-macros.S
> @@ -66,29 +66,6 @@
>  	.endm
>  
>  /*
> - * dcache_line_size - get the minimum D-cache line size from the CTR register
> - * on ARMv7.
> - */
> -	.macro	dcache_line_size, reg, tmp
> -	mrc	p15, 0, \tmp, c0, c0, 1		@ read ctr
> -	lsr	\tmp, \tmp, #16
> -	and	\tmp, \tmp, #0xf		@ cache line size encoding
> -	mov	\reg, #4			@ bytes per word
> -	mov	\reg, \reg, lsl \tmp		@ actual cache line size
> -	.endm
> -
> -/*
> - * icache_line_size - get the minimum I-cache line size from the CTR register
> - * on ARMv7.
> - */
> -	.macro	icache_line_size, reg, tmp
> -	mrc	p15, 0, \tmp, c0, c0, 1		@ read ctr
> -	and	\tmp, \tmp, #0xf		@ cache line size encoding
> -	mov	\reg, #4			@ bytes per word
> -	mov	\reg, \reg, lsl \tmp		@ actual cache line size
> -	.endm
> -
> -/*
>   * Sanity check the PTE configuration for the code below - which makes
>   * certain assumptions about how these bits are laid out.
>   */
> diff --git a/arch/arm/mm/proc-v7.S b/arch/arm/mm/proc-v7.S
> index 6fcaac8..c7bcc0c 100644
> --- a/arch/arm/mm/proc-v7.S
> +++ b/arch/arm/mm/proc-v7.S
> @@ -18,6 +18,7 @@
>  #include <asm/pgtable.h>
>  
>  #include "proc-macros.S"
> +#include "v7-cache-macros.S"
>  
>  #ifdef CONFIG_ARM_LPAE
>  #include "proc-v7-3level.S"
> diff --git a/arch/arm/mm/v7-cache-macros.S b/arch/arm/mm/v7-cache-macros.S
> new file mode 100644
> index 0000000..5212383
> --- /dev/null
> +++ b/arch/arm/mm/v7-cache-macros.S
> @@ -0,0 +1,124 @@
> +/*
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program; if not, write to the Free Software
> + * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
> + *
> + * Copyright (C) 2012 ARM Limited
> + *
> + * Author: Jonathan Austin <jonathan.austin@arm.com>
> + */
> +
> +.macro	read_ctr, rt
> +	mrc     p15, 0, \rt, c0, c0, 1
> +.endm
> +
> +.macro	read_ccsidr, rt
> +	mrc     p15, 1, \rt, c0, c0, 0
> +.endm
> +
> +.macro read_clidr, rt
> +	mrc	p15, 1, \rt, c0, c0, 1
> +.endm
> +
> +.macro	write_csselr, rt
> +	mcr     p15, 2, \rt, c0, c0, 0
> +.endm
> +
> +/*
> + * dcisw: invalidate data cache by set/way
> + */
> +.macro dcisw, rt
> +	mcr     p15, 0, \rt, c7, c6, 2
> +.endm
> +
> +/*
> + * dccisw: clean and invalidate data cache by set/way
> + */
> +.macro dccisw, rt
> +	mcr	p15, 0, \rt, c7, c14, 2
> +.endm
> +
> +/*
> + * dccimvac: Clean and invalidate data cache line by MVA to PoC.
> + */
> +.macro dccimvac, rt, cond = al
> +	mcr\cond	p15, 0, \rt, c7, c14, 1
> +.endm
> +
> +/*
> + * dcimvac: Invalidate data cache line by MVA to PoC
> + */
> +.macro dcimvac, rt
> +	mcr	p15, 0, r0, c7, c6, 1
> +.endm
> +
> +/*
> + * dccmvau: Clean data cache line by MVA to PoU
> + */
> +.macro dccmvau, rt
> +	mcr	p15, 0, \rt, c7, c11, 1
> +.endm
> +
> +/*
> + * dccmvac: Clean data cache line by MVA to PoC
> + */
> +.macro dccmvac,  rt
> +	mcr	p15, 0, \rt, c7, c10, 1
> +.endm
> +
> +/*
> + * icimvau: Invalidate instruction caches by MVA to PoU
> + */
> +.macro icimvau, rt
> +	mcr	p15, 0, \rt, c7, c5, 1
> +.endm
> +
> +/*
> + * Invalidate the icache, inner shareable if SMP, invalidate BTB for UP.
> + */
> +.macro invalidate_icache, rt
> +	mov	\rt, #0
> +	ALT_SMP(mcr	p15, 0, \rt, c7, c1, 0)		@ icialluis: I-cache invalidate inner shareable
> +	ALT_UP(mcr	p15, 0, \rt, c7, c5, 0)		@ iciallu: I+BTB cache invalidate
> +.endm
> +
> +/*
> + * Invalidate the BTB, inner shareable if SMP.
> + */
> +.macro invalidate_bp, rt
> +	mov	\rt, #0
> +	ALT_SMP(mcr	p15, 0, \rt, c7, c1, 6)		@ bpiallis: invalidate BTB inner shareable
> +	ALT_UP(mcr	p15, 0, \rt, c7, c5, 6)		@ bpiall: invalidate BTB
> +.endm
> +
> +/*
> + * dcache_line_size - get the minimum D-cache line size from the CTR register
> + * on ARMv7.
> + */
> +	.macro	dcache_line_size, reg, tmp
> +	read_ctr \tmp
> +	lsr	\tmp, \tmp, #16
> +	and	\tmp, \tmp, #0xf		@ cache line size encoding
> +	mov	\reg, #4			@ bytes per word
> +	mov	\reg, \reg, lsl \tmp		@ actual cache line size
> +	.endm
> +
> +/*
> + * icache_line_size - get the minimum I-cache line size from the CTR register
> + * on ARMv7.
> + */
> +	.macro	icache_line_size, reg, tmp
> +	read_ctr \tmp
> +	and	\tmp, \tmp, #0xf		@ cache line size encoding
> +	mov	\reg, #4			@ bytes per word
> +	mov	\reg, \reg, lsl \tmp		@ actual cache line size
> +	.endm
> -- 
> 1.7.9.5
> 

-- 
RMK's Patch system: http://www.arm.linux.org.uk/developer/patches/
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.
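
The CTR decoding performed by the dcache_line_size/icache_line_size macros
quoted above can be sketched in C — DminLine is CTR[19:16] and IminLine is
CTR[3:0], each encoding log2 of the line length in words (the helper names are
illustrative, not kernel code):

```c
#include <stdint.h>

/* Minimum D-cache line size in bytes: 4 << CTR.DminLine. */
static unsigned dcache_line_bytes(uint32_t ctr)
{
	return 4u << ((ctr >> 16) & 0xf);
}

/* Minimum I-cache line size in bytes: 4 << CTR.IminLine. */
static unsigned icache_line_bytes(uint32_t ctr)
{
	return 4u << (ctr & 0xf);
}
```

For a constructed CTR with DminLine = 2 and IminLine = 3, this gives 16-byte
D-cache lines and 32-byte I-cache lines.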


* [PATCH RFC 04/10] ARM: V7M: Add support for reading the CTR with CPUID_CACHETYPE
  2016-04-27  9:13   ` Russell King - ARM Linux
@ 2016-04-27 12:18     ` Vladimir Murzin
  0 siblings, 0 replies; 17+ messages in thread
From: Vladimir Murzin @ 2016-04-27 12:18 UTC (permalink / raw)
  To: linux-arm-kernel

On 27/04/16 10:13, Russell King - ARM Linux wrote:
> On Thu, Apr 21, 2016 at 09:18:16AM +0100, Vladimir Murzin wrote:
>> @@ -79,5 +80,19 @@ static inline unsigned int read_ccsidr(void)
>>  	asm volatile("mrc p15, 1, %0, c0, c0, 0" : "=r" (val));
>>  	return val;
>>  }
>> +#else /* CONFIG_CPU_V7M */
>> +#include <asm/io.h>
> 
> Please use linux/io.h
> 
>> +#include "asm/v7m.h"
>> +
>> +static inline void set_csselr(unsigned int cache_selector)
>> +{
>> +	writel(cache_selector, (void *)(BASEADDR_V7M_SCB + V7M_SCB_CTR));
> 
> writel() doesn't take a void pointer.  It takes a void __iomem pointer.
> BASEADDR_V7M_SCB may need to be defined more appropriately.
> 

I'll fix it.

Thanks!
Vladimir


* [PATCH RFC 05/10] ARM: Extract cp15 operations from cache flush code
  2016-04-27  9:21   ` Russell King - ARM Linux
@ 2016-04-27 12:24     ` Vladimir Murzin
  0 siblings, 0 replies; 17+ messages in thread
From: Vladimir Murzin @ 2016-04-27 12:24 UTC (permalink / raw)
  To: linux-arm-kernel

On 27/04/16 10:21, Russell King - ARM Linux wrote:
> On Thu, Apr 21, 2016 at 09:18:17AM +0100, Vladimir Murzin wrote:
>> @@ -278,7 +273,7 @@ ENTRY(v7_coherent_user_range)
>>  	ALT_UP(W(nop))
>>  #endif
>>  1:
>> - USER(	mcr	p15, 0, r12, c7, c11, 1	)	@ clean D line to the point of unification
>> + USER(	dccmvau	r12 )		@ clean D line to the point of unification
> 
> While this is correct for this patch, I think it's incorrect for the v7m
> variant.  dccmvau expands to several instructions, the first is a mov,
> and the effect of the above will be to mark the mov as the user-accessing
> instruction, not the instruction which cleans the D line.
> 

Would an open-coded variant guarded with M_CLASS/AR_CLASS be acceptable here?

>> @@ -287,13 +282,11 @@ ENTRY(v7_coherent_user_range)
>>  	sub	r3, r2, #1
>>  	bic	r12, r0, r3
>>  2:
>> - USER(	mcr	p15, 0, r12, c7, c5, 1	)	@ invalidate I line
>> + USER(	icimvau r12 )	@ invalidate I line
> 
> Same problem.
>> @@ -358,13 +351,13 @@ v7_dma_inv_range:
>>  	ALT_SMP(W(dsb))
>>  	ALT_UP(W(nop))
>>  #endif
>> -	mcrne	p15, 0, r0, c7, c14, 1		@ clean & invalidate D / U line
>> +	dccimvac r0 ne
> 
> I'd prefer the:
> 
> 	.irp    c,,eq,ne,cs,cc,mi,pl,vs,vc,hi,ls,ge,lt,gt,le,hs,lo
> 	.macro	dccimvac\c, ...
> 	.endm
> 	.endr
> 
> approach, so you can use
> 
> 	dccimvacne r0
> 
> here.
> 

I'll change.

Thanks!
Vladimir

>>  
>>  	tst	r1, r3
>>  	bic	r1, r1, r3
>> -	mcrne	p15, 0, r1, c7, c14, 1		@ clean & invalidate D / U line
>> +	dccimvac r1 ne
>>  1:
>> -	mcr	p15, 0, r0, c7, c6, 1		@ invalidate D / U line
>> +	dcimvac r0
>>  	add	r0, r0, r2
>>  	cmp	r0, r1
>>  	blo	1b
>> @@ -386,7 +379,7 @@ v7_dma_clean_range:
>>  	ALT_UP(W(nop))
>>  #endif
>>  1:
>> -	mcr	p15, 0, r0, c7, c10, 1		@ clean D / U line
>> +	dccmvac r0			@ clean D / U line
>>  	add	r0, r0, r2
>>  	cmp	r0, r1
>>  	blo	1b
>> @@ -408,7 +401,7 @@ ENTRY(v7_dma_flush_range)
>>  	ALT_UP(W(nop))
>>  #endif
>>  1:
>> -	mcr	p15, 0, r0, c7, c14, 1		@ clean & invalidate D / U line
>> +	dccimvac r0			 @ clean & invalidate D / U line
>>  	add	r0, r0, r2
>>  	cmp	r0, r1
>>  	blo	1b
>> diff --git a/arch/arm/mm/proc-macros.S b/arch/arm/mm/proc-macros.S
>> index c671f34..a82800a 100644
>> --- a/arch/arm/mm/proc-macros.S
>> +++ b/arch/arm/mm/proc-macros.S
>> @@ -66,29 +66,6 @@
>>  	.endm
>>  
>>  /*
>> - * dcache_line_size - get the minimum D-cache line size from the CTR register
>> - * on ARMv7.
>> - */
>> -	.macro	dcache_line_size, reg, tmp
>> -	mrc	p15, 0, \tmp, c0, c0, 1		@ read ctr
>> -	lsr	\tmp, \tmp, #16
>> -	and	\tmp, \tmp, #0xf		@ cache line size encoding
>> -	mov	\reg, #4			@ bytes per word
>> -	mov	\reg, \reg, lsl \tmp		@ actual cache line size
>> -	.endm
>> -
>> -/*
>> - * icache_line_size - get the minimum I-cache line size from the CTR register
>> - * on ARMv7.
>> - */
>> -	.macro	icache_line_size, reg, tmp
>> -	mrc	p15, 0, \tmp, c0, c0, 1		@ read ctr
>> -	and	\tmp, \tmp, #0xf		@ cache line size encoding
>> -	mov	\reg, #4			@ bytes per word
>> -	mov	\reg, \reg, lsl \tmp		@ actual cache line size
>> -	.endm
>> -
>> -/*
>>   * Sanity check the PTE configuration for the code below - which makes
>>   * certain assumptions about how these bits are laid out.
>>   */
>> diff --git a/arch/arm/mm/proc-v7.S b/arch/arm/mm/proc-v7.S
>> index 6fcaac8..c7bcc0c 100644
>> --- a/arch/arm/mm/proc-v7.S
>> +++ b/arch/arm/mm/proc-v7.S
>> @@ -18,6 +18,7 @@
>>  #include <asm/pgtable.h>
>>  
>>  #include "proc-macros.S"
>> +#include "v7-cache-macros.S"
>>  
>>  #ifdef CONFIG_ARM_LPAE
>>  #include "proc-v7-3level.S"
>> diff --git a/arch/arm/mm/v7-cache-macros.S b/arch/arm/mm/v7-cache-macros.S
>> new file mode 100644
>> index 0000000..5212383
>> --- /dev/null
>> +++ b/arch/arm/mm/v7-cache-macros.S
>> @@ -0,0 +1,124 @@
>> +/*
>> + * This program is free software; you can redistribute it and/or modify
>> + * it under the terms of the GNU General Public License version 2 as
>> + * published by the Free Software Foundation.
>> + *
>> + * This program is distributed in the hope that it will be useful,
>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>> + * GNU General Public License for more details.
>> + *
>> + * You should have received a copy of the GNU General Public License
>> + * along with this program; if not, write to the Free Software
>> + * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
>> + *
>> + * Copyright (C) 2012 ARM Limited
>> + *
>> + * Author: Jonathan Austin <jonathan.austin@arm.com>
>> + */
>> +
>> +.macro	read_ctr, rt
>> +	mrc     p15, 0, \rt, c0, c0, 1
>> +.endm
>> +
>> +.macro	read_ccsidr, rt
>> +	mrc     p15, 1, \rt, c0, c0, 0
>> +.endm
>> +
>> +.macro read_clidr, rt
>> +	mrc	p15, 1, \rt, c0, c0, 1
>> +.endm
>> +
>> +.macro	write_csselr, rt
>> +	mcr     p15, 2, \rt, c0, c0, 0
>> +.endm
>> +
>> +/*
>> + * dcisw: invalidate data cache by set/way
>> + */
>> +.macro dcisw, rt
>> +	mcr     p15, 0, \rt, c7, c6, 2
>> +.endm
>> +
>> +/*
>> + * dccisw: clean and invalidate data cache by set/way
>> + */
>> +.macro dccisw, rt
>> +	mcr	p15, 0, \rt, c7, c14, 2
>> +.endm
>> +
>> +/*
>> + * dccimvac: Clean and invalidate data cache line by MVA to PoC.
>> + */
>> +.macro dccimvac, rt, cond = al
>> +	mcr\cond	p15, 0, \rt, c7, c14, 1
>> +.endm
>> +
>> +/*
>> + * dcimvac: Invalidate data cache line by MVA to PoC
>> + */
>> +.macro dcimvac, rt
>> +	mcr	p15, 0, \rt, c7, c6, 1
>> +.endm
>> +
>> +/*
>> + * dccmvau: Clean data cache line by MVA to PoU
>> + */
>> +.macro dccmvau, rt
>> +	mcr	p15, 0, \rt, c7, c11, 1
>> +.endm
>> +
>> +/*
>> + * dccmvac: Clean data cache line by MVA to PoC
>> + */
>> +.macro dccmvac,  rt
>> +	mcr	p15, 0, \rt, c7, c10, 1
>> +.endm
>> +
>> +/*
>> + * icimvau: Invalidate instruction caches by MVA to PoU
>> + */
>> +.macro icimvau, rt
>> +	mcr	p15, 0, \rt, c7, c5, 1
>> +.endm
>> +
>> +/*
>> + * Invalidate the icache, inner shareable if SMP, invalidate BTB for UP.
>> + */
>> +.macro invalidate_icache, rt
>> +	mov	\rt, #0
>> +	ALT_SMP(mcr	p15, 0, \rt, c7, c1, 0)		@ icialluis: I-cache invalidate inner shareable
>> +	ALT_UP(mcr	p15, 0, \rt, c7, c5, 0)		@ iciallu: I+BTB cache invalidate
>> +.endm
>> +
>> +/*
>> + * Invalidate the BTB, inner shareable if SMP.
>> + */
>> +.macro invalidate_bp, rt
>> +	mov	\rt, #0
>> +	ALT_SMP(mcr	p15, 0, \rt, c7, c1, 6)		@ bpiallis: invalidate BTB inner shareable
>> +	ALT_UP(mcr	p15, 0, \rt, c7, c5, 6)		@ bpiall: invalidate BTB
>> +.endm
>> +
>> +/*
>> + * dcache_line_size - get the minimum D-cache line size from the CTR register
>> + * on ARMv7.
>> + */
>> +	.macro	dcache_line_size, reg, tmp
>> +	read_ctr \tmp
>> +	lsr	\tmp, \tmp, #16
>> +	and	\tmp, \tmp, #0xf		@ cache line size encoding
>> +	mov	\reg, #4			@ bytes per word
>> +	mov	\reg, \reg, lsl \tmp		@ actual cache line size
>> +	.endm
>> +
>> +/*
>> + * icache_line_size - get the minimum I-cache line size from the CTR register
>> + * on ARMv7.
>> + */
>> +	.macro	icache_line_size, reg, tmp
>> +	read_ctr \tmp
>> +	and	\tmp, \tmp, #0xf		@ cache line size encoding
>> +	mov	\reg, #4			@ bytes per word
>> +	mov	\reg, \reg, lsl \tmp		@ actual cache line size
>> +	.endm
>> -- 
>> 1.7.9.5
>>
> 


* [PATCH RFC 00/10] ARM: V7M: Support caches
  2016-04-21  8:18 [PATCH RFC 00/10] ARM: V7M: Support caches Vladimir Murzin
                   ` (9 preceding siblings ...)
  2016-04-21  8:18 ` [PATCH RFC 10/10] ARM: V7M: Add support for the Cortex-M7 processor Vladimir Murzin
@ 2016-05-26  8:05 ` Alexandre Torgue
  2016-06-01 13:03   ` Vladimir Murzin
  10 siblings, 1 reply; 17+ messages in thread
From: Alexandre Torgue @ 2016-05-26  8:05 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Vladimir,

On 04/21/2016 10:18 AM, Vladimir Murzin wrote:
> Hi,
>
> This patch set allows M-class CPUs to benefit from optional cache support.
> It was originally written by Jonny; I've been keeping it locally, mainly
> rebasing it over Linux versions.
>
> The main idea behind the patches is to reuse the existing cache handling
> code from v7A/R. On v7M, cache operations are provided via a memory-mapped
> interface rather than co-processor instructions, so extra macros have been
> introduced to factor out the cache handling logic and low-level operations.
>
> Along with the v7M cache support, the first user (Cortex-M7) is
> introduced.
>
> The patches were tested on the MPS2 platform with Cortex-M3/M4/M7. The
> latter showed a significant boot speed-up.
>


Thanks for the series. Maxime and I have just tested it on STM32F7 hardware.
We found an issue on this M7 chip which is not linked to your patches (see
the patch below, which will be sent shortly).
Once that is fixed, we are able to boot, but we observe instabilities which
disappear when the D-cache is disabled in the kernel config: random crashes,
most of which look like this (hardfault):

(gdb) info reg
r0             0xc018c3fc    -1072118788
r1             0x3e0    992
r2             0x4100000b    1090519051
r3             0x0    0
r4             0x0    0
r5             0xc018c3fc    -1072118788
r6             0xc018c7dc    -1072117796
r7             0xc018c3fc    -1072118788
r8             0xc018bb30    -1072121040
r9             0x3e0    992
r10            0x4100000b    1090519051
r11            0x0    0
r12            0x8303c003    -2096906237
sp             0xc016bed0    0xc016bed0 <init_thread_union+7888>
lr             0xfffffff1    -15
pc             0xc000b6bc    0xc000b6bc <__invalid_entry>
xPSR           0x81000003    -2130706429
(gdb) bt
#0  __invalid_entry () at arch/arm/kernel/entry-v7m.S:24
#1  <signal handler called>
#2  vsnprintf (buf=0xc018c3fc <textbuf> "", size=992, fmt=0x4100000b "", args=...) at lib/vsprintf.c:1999
#3  0xc0095910 in vscnprintf (buf=<optimized out>, size=992, fmt=<optimized out>, args=...) at lib/vsprintf.c:2152
#4  0xc00269dc in vprintk_emit (facility=0, level=-1, dict=0x0, dictlen=<optimized out>, fmt=0x4100000b "", args=...) at kernel/printk/printk.c:1673
#5  0xc000a20a in arch_local_irq_enable () at ./arch/arm/include/asm/irqflags.h:38
#6  arch_cpu_idle () at arch/arm/kernel/process.c:73
#7  0x00000000 in ?? ()

We continue to investigate this issue. If you have any idea, please let 
us know.

Thanks.

alex

Patch which will be sent shortly:


Author: Alexandre TORGUE <alexandre.torgue@gmail.com>
Date:   Wed May 25 14:02:35 2016 +0200

     ARM: V7M: Add dsb before jumping in handler mode

     According to ARM AN321 (section 4.12):

     "If the vector table is in writable memory such as SRAM, either
     relocated by VTOR or a device dependent memory remapping mechanism, then
     architecturally a memory barrier instruction is required after the
     vector table entry is updated, and if the exception is to be activated
     immediately"

     Signed-off-by: Maxime Coquelin <mcoquelin.stm32@gmail.com>
     Signed-off-by: Alexandre TORGUE <alexandre.torgue@gmail.com>

diff --git a/arch/arm/mm/proc-v7m.S b/arch/arm/mm/proc-v7m.S
index 60387a0..807949f 100644
--- a/arch/arm/mm/proc-v7m.S
+++ b/arch/arm/mm/proc-v7m.S
@@ -124,6 +124,7 @@ __v7m_setup:
         badr    r1, 1f
         ldr     r5, [r12, #11 * 4]      @ read the SVC vector entry
        str     r1, [r12, #11 * 4]      @ write the temporary SVC vector entry
+       dsb
         mov     r6, lr                  @ save LR
         ldr     sp, =init_thread_union + THREAD_START_SP
         cpsie   i


* [PATCH RFC 00/10] ARM: V7M: Support caches
  2016-05-26  8:05 ` [PATCH RFC 00/10] ARM: V7M: Support caches Alexandre Torgue
@ 2016-06-01 13:03   ` Vladimir Murzin
  0 siblings, 0 replies; 17+ messages in thread
From: Vladimir Murzin @ 2016-06-01 13:03 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Alex,

On 26/05/16 09:05, Alexandre Torgue wrote:
> Hi Vladimir,
> 
> On 04/21/2016 10:18 AM, Vladimir Murzin wrote:
>> Hi,
>>
>> This patch set allows M-class CPUs to benefit from optional cache support.
>> It was originally written by Jonny; I've been keeping it locally, mainly
>> rebasing it over Linux versions.
>>
>> The main idea behind the patches is to reuse the existing cache handling
>> code from v7A/R. On v7M, cache operations are provided via a memory-mapped
>> interface rather than co-processor instructions, so extra macros have been
>> introduced to factor out the cache handling logic and low-level operations.
>>
>> Along with the v7M cache support, the first user (Cortex-M7) is
>> introduced.
>>
>> The patches were tested on the MPS2 platform with Cortex-M3/M4/M7. The
>> latter showed a significant boot speed-up.
>>
> 
> 
> Thanks for the series. Maxime and I have just tested it on STM32F7
> hardware.

Thanks for giving it a try!

> We found an issue on this M7 chip which is not linked to your patches (see
> the patch below, which will be sent shortly).
> Once that is fixed, we are able to boot, but we observe instabilities which
> disappear when the D-cache is disabled in the kernel config: random crashes,
> most of which look like this (hardfault):
> 
> (gdb) info reg
> r0             0xc018c3fc    -1072118788
> r1             0x3e0    992
> r2             0x4100000b    1090519051
> r3             0x0    0
> r4             0x0    0
> r5             0xc018c3fc    -1072118788
> r6             0xc018c7dc    -1072117796
> r7             0xc018c3fc    -1072118788
> r8             0xc018bb30    -1072121040
> r9             0x3e0    992
> r10            0x4100000b    1090519051
> r11            0x0    0
> r12            0x8303c003    -2096906237
> sp             0xc016bed0    0xc016bed0 <init_thread_union+7888>
> lr             0xfffffff1    -15
> pc             0xc000b6bc    0xc000b6bc <__invalid_entry>
> xPSR           0x81000003    -2130706429
> (gdb) bt
> #0  __invalid_entry () at arch/arm/kernel/entry-v7m.S:24
> #1  <signal handler called>
> #2  vsnprintf (buf=0xc018c3fc <textbuf> "", size=992, fmt=0x4100000b "", args=...) at lib/vsprintf.c:1999
> #3  0xc0095910 in vscnprintf (buf=<optimized out>, size=992, fmt=<optimized out>, args=...) at lib/vsprintf.c:2152
> #4  0xc00269dc in vprintk_emit (facility=0, level=-1, dict=0x0, dictlen=<optimized out>, fmt=0x4100000b "", args=...) at kernel/printk/printk.c:1673
> #5  0xc000a20a in arch_local_irq_enable () at ./arch/arm/include/asm/irqflags.h:38
> #6  arch_cpu_idle () at arch/arm/kernel/process.c:73
> #7  0x00000000 in ?? ()
> 
> We continue to investigate this issue. If you have any idea, please let
> us know.

Nothing off the top of my head at the moment, sorry. I'm getting ready for
the next spin and I'll try to hammer the patches harder. If you can give me
a specific payload to run here, I'll try to reproduce what you've seen.

Cheers
Vladimir

> 
> Thanks.
> 
> alex
> 
> Patch which will be sent shortly:
> 
> 
> Author: Alexandre TORGUE <alexandre.torgue@gmail.com>
> Date:   Wed May 25 14:02:35 2016 +0200
> 
>     ARM: V7M: Add dsb before jumping in handler mode
> 
>     According to ARM AN321 (section 4.12):
> 
>     "If the vector table is in writable memory such as SRAM, either
>     relocated by VTOR or a device dependent memory remapping mechanism, then
>     architecturally a memory barrier instruction is required after the
>     vector table entry is updated, and if the exception is to be activated
>     immediately"
> 
>     Signed-off-by: Maxime Coquelin <mcoquelin.stm32@gmail.com>
>     Signed-off-by: Alexandre TORGUE <alexandre.torgue@gmail.com>
> 
> diff --git a/arch/arm/mm/proc-v7m.S b/arch/arm/mm/proc-v7m.S
> index 60387a0..807949f 100644
> --- a/arch/arm/mm/proc-v7m.S
> +++ b/arch/arm/mm/proc-v7m.S
> @@ -124,6 +124,7 @@ __v7m_setup:
>         badr    r1, 1f
>         ldr     r5, [r12, #11 * 4]      @ read the SVC vector entry
>         str     r1, [r12, #11 * 4]      @ write the temporary SVC vector entry
> +       dsb
>         mov     r6, lr                  @ save LR
>         ldr     sp, =init_thread_union + THREAD_START_SP
>         cpsie   i
> 
> 


end of thread, other threads:[~2016-06-01 13:03 UTC | newest]

Thread overview: 17+ messages
2016-04-21  8:18 [PATCH RFC 00/10] ARM: V7M: Support caches Vladimir Murzin
2016-04-21  8:18 ` [PATCH RFC 01/10] ARM: factor out CSSELR/CCSIDR operations that use cp15 directly Vladimir Murzin
2016-04-21  8:18 ` [PATCH RFC 02/10] ARM: V7M: Make read_cpuid() generally available on V7M Vladimir Murzin
2016-04-21  8:18 ` [PATCH RFC 03/10] ARM: V7M: Add addresses for mem-mapped V7M cache operations Vladimir Murzin
2016-04-21  8:18 ` [PATCH RFC 04/10] ARM: V7M: Add support for reading the CTR with CPUID_CACHETYPE Vladimir Murzin
2016-04-27  9:13   ` Russell King - ARM Linux
2016-04-27 12:18     ` Vladimir Murzin
2016-04-21  8:18 ` [PATCH RFC 05/10] ARM: Extract cp15 operations from cache flush code Vladimir Murzin
2016-04-27  9:21   ` Russell King - ARM Linux
2016-04-27 12:24     ` Vladimir Murzin
2016-04-21  8:18 ` [PATCH RFC 06/10] ARM: V7M: Implement cache macros for V7M Vladimir Murzin
2016-04-21  8:18 ` [PATCH RFC 07/10] ARM: V7M: fix notrace variant of save_and_disable_irqs Vladimir Murzin
2016-04-21  8:18 ` [PATCH RFC 08/10] ARM: V7M: Wire up caches for V7M processors with cache support Vladimir Murzin
2016-04-21  8:18 ` [PATCH RFC 09/10] ARM: V7M: Indirect proc_info construction for V7M CPUs Vladimir Murzin
2016-04-21  8:18 ` [PATCH RFC 10/10] ARM: V7M: Add support for the Cortex-M7 processor Vladimir Murzin
2016-05-26  8:05 ` [PATCH RFC 00/10] ARM: V7M: Support caches Alexandre Torgue
2016-06-01 13:03   ` Vladimir Murzin
