* [RFC PATCH 0/2] powerpc: CR based local atomic operation implementation
@ 2014-11-27 12:18 Madhavan Srinivasan
2014-11-27 12:18 ` [RFC PATCH 1/2]powerpc: foundation code to handle CR5 for local_t Madhavan Srinivasan
` (2 more replies)
0 siblings, 3 replies; 29+ messages in thread
From: Madhavan Srinivasan @ 2014-11-27 12:18 UTC (permalink / raw)
To: mpe; +Cc: Madhavan Srinivasan, rusty, paulus, anton, linuxppc-dev
This patchset creates the infrastructure to handle CR-based
local_* atomic operations. Local atomic operations are fast,
highly reentrant per-CPU counters, used for percpu variable
updates. Local atomic operations only guarantee atomicity of
the variable modification with respect to the CPU which owns
the data, and they need to be executed in a preemption-safe way.
Here is the design of the first patch. Since local_* operations
only need to be atomic with respect to interrupts (IIUC), the
patch uses one of the Condition Register (CR) fields as a flag
variable. On entry to a local_* operation a specific bit in the
CR5 field is set, and on exit it is cleared. The CR bit is
checked in the interrupt return path: if CR5[EQ] is set and we
are returning to the kernel, we rewind the NIP to the start of
the local_* operation.
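Roughly, each local_* operation then ends up shaped like this
(a simplified sketch only, not the exact code from patch 2; the
real version uses the PPC_LL/PPC_STL/PPC_LONG macros, has one
__ex_table entry per instruction in the window, and the helper
name below is only illustrative):

static __inline__ void cr5_local_add(long i, local_t *l)
{
	long t;

	__asm__ __volatile__(
"1:	crset	22\n"		/* set CR5[EQ]: inside a local_* sequence */
"2:	ld	%0,0(%2)\n"	/* load the per-CPU counter */
"	add	%0,%1,%0\n"
"	std	%0,0(%2)\n"	/* the single store commits the update */
"	crclr	22\n"		/* clear CR5[EQ]: sequence complete */
"	.section __ex_table,\"a\"\n"
"	.llong	2b,1b\n"	/* interrupted before the store -> restart at 1: */
"	.previous\n"
	: "=&r" (t)
	: "r" (i), "r" (&(l->a.counter))
	: "memory");
}

The interrupt return path checks CR5[EQ]; if it is set and the
saved NIP is found in the exception table, the NIP is rewritten
to the fixup address (the start of the sequence) before returning
to the kernel.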
The reason for this approach is that local_* operations currently
use the l[w/d]arx/st[w/d]cx. instruction pair, which is heavy on
cycle count and has no local variant. To see whether the new
implementation helps, I used a modified version of Rusty's
benchmark code on local_t; the performance numbers are in the
commit message of the first patch.
The second patch rewrites the local_* functions to use the
CR5-based logic. The changes are mostly in asm/local.h and apply
only to CONFIG_PPC64.
Madhavan Srinivasan (2):
powerpc: foundation code to handle CR5 for local_t
powerpc: rewrite local_* to use CR5 flag
Makefile | 6 +
arch/powerpc/include/asm/exception-64s.h | 21 ++-
arch/powerpc/include/asm/local.h | 306 +++++++++++++++++++++++++++++++
arch/powerpc/kernel/entry_64.S | 106 ++++++++++-
arch/powerpc/kernel/exceptions-64s.S | 2 +-
arch/powerpc/kernel/head_64.S | 8 +
6 files changed, 444 insertions(+), 5 deletions(-)
--
1.9.1
^ permalink raw reply [flat|nested] 29+ messages in thread
* [RFC PATCH 1/2]powerpc: foundation code to handle CR5 for local_t
2014-11-27 12:18 [RFC PATCH 0/2] powerpc: CR based local atomic operation implementation Madhavan Srinivasan
@ 2014-11-27 12:18 ` Madhavan Srinivasan
2014-11-27 16:56 ` Segher Boessenkool
` (3 more replies)
2014-11-27 12:18 ` [RFC PATCH 2/2]powerpc: rewrite local_* to use CR5 flag Madhavan Srinivasan
2014-11-27 14:05 ` [RFC PATCH 0/2] powerpc: CR based local atomic operation implementation David Laight
2 siblings, 4 replies; 29+ messages in thread
From: Madhavan Srinivasan @ 2014-11-27 12:18 UTC (permalink / raw)
To: mpe; +Cc: Madhavan Srinivasan, rusty, paulus, anton, linuxppc-dev
This patch creates the infrastructure to handle CR-based
local_* atomic operations. Local atomic operations are fast,
highly reentrant per-CPU counters, used for percpu variable
updates. Local atomic operations only guarantee atomicity of
the variable modification with respect to the CPU which owns
the data, and they need to be executed in a preemption-safe way.
Here is the design of this patch. Since local_* operations only
need to be atomic with respect to interrupts (IIUC), the patch
uses one of the Condition Register (CR) fields as a flag
variable. On entry to a local_* operation a specific bit in the
CR5 field is set, and on exit it is cleared. The CR bit is
checked in the interrupt return path: if CR5[EQ] is set and we
are returning to the kernel, we rewind the NIP to the start of
the local_* operation.
The reason for this approach is that local_* operations currently
use the l[w/d]arx/st[w/d]cx. instruction pair, which is heavy on
cycle count and has no local variant. To see whether the new
implementation helps, I used a modified version of Rusty's
benchmark code on local_t.
https://lkml.org/lkml/2008/12/16/450
Modifications:
- increased the working set size from 1MB to 8MB,
- removed the cpu_local_inc test.
Tests were run
- on a POWER8 1S Scale out System at 2.0GHz
- on OPAL v3 with a v3.18-rc4 patched kernel as host
Here are the values with the patch.
Time in ns per iteration
              inc   add   read   add_return
atomic_long    67    67     18        69
irqsave/rest   39    39     23        39
trivalue       39    39     29        49
local_t        26    26     24        26
Since CR5 is used as a flag, a CFLAGS option has been added so
that the compiler avoids CR5 when building the kernel, and CR5 is
zeroed at kernel entry.
Tested the patch in
- a pSeries LPAR,
- a host with a patched/unmodified guest kernel
To check whether userspace sees any CR5 corruption, I ran a
simple test (sketched below) which does:
- set a CR5 bit,
- while(1)
- sleep or gettimeofday
- check that the bit is still set
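A minimal sketch of that userspace check (illustrative only, the
exact test program was not posted; 0x200 is the CR5[EQ] mask in
the word returned by mfcr):

#include <stdio.h>
#include <sys/time.h>

int main(void)
{
	unsigned long cr;
	struct timeval tv;

	__asm__ __volatile__("crset 22");	/* set CR5[EQ] */
	while (1) {
		gettimeofday(&tv, NULL);	/* or sleep(1) */
		__asm__ __volatile__("mfcr %0" : "=r" (cr));
		if (!(cr & 0x200)) {		/* bit lost across the syscall? */
			printf("CR5 corrupted\n");
			return 1;
		}
	}
}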
Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
---
- I really appreciate feedback on the patchset.
- Kindly comment if I should try with any other benchmark or
workload to check the numbers.
- Also, kindly recommend any known stress test for CR
Makefile | 6 ++
arch/powerpc/include/asm/exception-64s.h | 21 +++++-
arch/powerpc/kernel/entry_64.S | 106 ++++++++++++++++++++++++++++++-
arch/powerpc/kernel/exceptions-64s.S | 2 +-
arch/powerpc/kernel/head_64.S | 8 +++
5 files changed, 138 insertions(+), 5 deletions(-)
diff --git a/Makefile b/Makefile
index 00d618b..2e271ad 100644
--- a/Makefile
+++ b/Makefile
@@ -706,6 +706,12 @@ endif
KBUILD_CFLAGS += $(call cc-option, -fno-var-tracking-assignments)
+ifdef CONFIG_PPC64
+# We need this flag to force compiler not to use CR5, since
+# local_t type code is based on this.
+KBUILD_CFLAGS += -ffixed-cr5
+endif
+
ifdef CONFIG_DEBUG_INFO
ifdef CONFIG_DEBUG_INFO_SPLIT
KBUILD_CFLAGS += $(call cc-option, -gsplit-dwarf, -g)
diff --git a/arch/powerpc/include/asm/exception-64s.h b/arch/powerpc/include/asm/exception-64s.h
index 77f52b2..c42919a 100644
--- a/arch/powerpc/include/asm/exception-64s.h
+++ b/arch/powerpc/include/asm/exception-64s.h
@@ -306,7 +306,26 @@ do_kvm_##n: \
std r10,0(r1); /* make stack chain pointer */ \
std r0,GPR0(r1); /* save r0 in stackframe */ \
std r10,GPR1(r1); /* save r1 in stackframe */ \
- beq 4f; /* if from kernel mode */ \
+BEGIN_FTR_SECTION; \
+ lis r9,4096; /* Create a mask with HV and PR */ \
+ rldicr r9,r9,32,31; /* bits, AND with the MSR */ \
+ mr r10,r9; /* to check for Hyp state */ \
+ ori r9,r9,16384; \
+ and r9,r12,r9; \
+ cmpd cr3,r10,r9; \
+ beq cr3,66f; /* Jump if we come from Hyp mode*/ \
+ mtcrf 0x04,r10; /* Clear CR5 if coming from usr */ \
+FTR_SECTION_ELSE; \
+ beq 4f; /* if kernel mode branch */ \
+ li r10,0; /* Clear CR5 incase of coming */ \
+ mtcrf 0x04,r10; /* from user. */ \
+ nop; /* This part of code is for */ \
+ nop; /* kernel with MSR[HV]=0, */ \
+ nop; /* MSR[PR]=0, so just chk for */ \
+ nop; /* MSR[PR] */ \
+ nop; \
+ALT_FTR_SECTION_END_IFSET(CPU_FTR_HVMODE); \
+66: beq 4f; /* if from kernel mode */ \
ACCOUNT_CPU_USER_ENTRY(r9, r10); \
SAVE_PPR(area, r9, r10); \
4: EXCEPTION_PROLOG_COMMON_2(area) \
diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S
index 0905c8d..e42bb99 100644
--- a/arch/powerpc/kernel/entry_64.S
+++ b/arch/powerpc/kernel/entry_64.S
@@ -68,7 +68,26 @@ system_call_common:
2: std r2,GPR2(r1)
std r3,GPR3(r1)
mfcr r2
- std r4,GPR4(r1)
+BEGIN_FTR_SECTION
+ lis r10,4096
+ rldicr r10,r10,32,31
+ mr r11,r10
+ ori r10,r10,16384
+ and r10,r12,r10
+ cmpd r11,r10
+ beq 67f
+ mtcrf 0x04,r11
+FTR_SECTION_ELSE
+ beq 67f
+ li r11,0
+ mtcrf 0x04,r11
+ nop
+ nop
+ nop
+ nop
+ nop
+ALT_FTR_SECTION_END_IFSET(CPU_FTR_HVMODE)
+67: std r4,GPR4(r1)
std r5,GPR5(r1)
std r6,GPR6(r1)
std r7,GPR7(r1)
@@ -224,8 +243,26 @@ syscall_exit:
BEGIN_FTR_SECTION
stdcx. r0,0,r1 /* to clear the reservation */
END_FTR_SECTION_IFCLR(CPU_FTR_STCX_CHECKS_ADDRESS)
+BEGIN_FTR_SECTION
+ lis r4,4096
+ rldicr r4,r4,32,31
+ mr r6,r4
+ ori r4,r4,16384
+ and r4,r8,r4
+ cmpd cr3,r6,r4
+ beq cr3,65f
+ mtcr r5
+FTR_SECTION_ELSE
andi. r6,r8,MSR_PR
- ld r4,_LINK(r1)
+ beq 65f
+ mtcr r5
+ nop
+ nop
+ nop
+ nop
+ nop
+ALT_FTR_SECTION_END_IFSET(CPU_FTR_HVMODE)
+65: ld r4,_LINK(r1)
beq- 1f
ACCOUNT_CPU_USER_EXIT(r11, r12)
@@ -234,7 +271,11 @@ END_FTR_SECTION_IFCLR(CPU_FTR_STCX_CHECKS_ADDRESS)
1: ld r2,GPR2(r1)
ld r1,GPR1(r1)
mtlr r4
+#ifdef CONFIG_PPC64
+ mtcrf 0xFB,r5
+#else
mtcr r5
+#endif
mtspr SPRN_SRR0,r7
mtspr SPRN_SRR1,r8
RFI
@@ -804,7 +845,66 @@ ALT_FTR_SECTION_END_IFCLR(CPU_FTR_STCX_CHECKS_ADDRESS)
*/
.globl fast_exception_return
fast_exception_return:
- ld r3,_MSR(r1)
+
+ /*
+ * Now that we are about to exit from interrupt, lets check for
+ * cr5 eq bit. If it is set, then we may be in the middle of
+ * local_t update. In this case, we should rewind the NIP
+ * accordingly.
+ */
+ mfcr r3
+ andi. r4,r3,0x200
+ beq 63f
+
+ /*
+ * Now that the bit is set, lets check for return to User
+ */
+ ld r4,_MSR(r1)
+BEGIN_FTR_SECTION
+ li r3,4096
+ rldicr r3,r3,32,31
+ mr r5,r3
+ ori r3,r3,16384
+ and r3,r4,r3
+ cmpd r5,r3
+ bne 63f
+FTR_SECTION_ELSE
+ andi. r3,r4,MSR_PR
+ bne 63f
+ nop
+ nop
+ nop
+ nop
+ nop
+ALT_FTR_SECTION_END_IFSET(CPU_FTR_HVMODE)
+
+ /*
+ * Looks like we are returning to Kernel, so
+ * lets get the NIP and search the ex_table.
+ * Change the NIP based on the return value
+ */
+lookup_ex_table:
+ ld r3,_NIP(r1)
+ bl search_exception_tables
+ cmpli 0,1,r3,0
+ bne 62f
+
+ /*
+ * This is a panic case. Reason is that, we
+ * have the CR5 bit set, but we are not in
+ * local_* code and we are returning to Kernel.
+ */
+ ld r3,_NIP(r1)
+ mfcr r4
+ EMIT_BUG_ENTRY lookup_ex_table, __FILE__,__LINE__,BUGFLAG_WARNING
+
+ /*
+ * Now save the return fixup address as NIP
+ */
+62: ld r4,8(r3)
+ std r4,_NIP(r1)
+ crclr 22
+63: ld r3,_MSR(r1)
ld r4,_CTR(r1)
ld r0,_LINK(r1)
mtctr r4
diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
index 72e783e..edb75a9 100644
--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -637,7 +637,7 @@ masked_##_H##interrupt: \
rldicl r10,r10,48,1; /* clear MSR_EE */ \
rotldi r10,r10,16; \
mtspr SPRN_##_H##SRR1,r10; \
-2: mtcrf 0x80,r9; \
+2: mtcrf 0x90,r9; \
ld r9,PACA_EXGEN+EX_R9(r13); \
ld r10,PACA_EXGEN+EX_R10(r13); \
ld r11,PACA_EXGEN+EX_R11(r13); \
diff --git a/arch/powerpc/kernel/head_64.S b/arch/powerpc/kernel/head_64.S
index d48125d..02e49b3 100644
--- a/arch/powerpc/kernel/head_64.S
+++ b/arch/powerpc/kernel/head_64.S
@@ -347,6 +347,14 @@ __mmu_off:
*
*/
__start_initialization_multiplatform:
+
+ /*
+ * Before we do anything, lets clear CR5 field,
+ * so that we will have a clean start at entry
+ */
+ li r11,0
+ mtcrf 0x04,r11
+
/* Make sure we are running in 64 bits mode */
bl enable_64b_mode
--
1.9.1
^ permalink raw reply related [flat|nested] 29+ messages in thread
* [RFC PATCH 2/2]powerpc: rewrite local_* to use CR5 flag
2014-11-27 12:18 [RFC PATCH 0/2] powerpc: CR based local atomic operation implementation Madhavan Srinivasan
2014-11-27 12:18 ` [RFC PATCH 1/2]powerpc: foundation code to handle CR5 for local_t Madhavan Srinivasan
@ 2014-11-27 12:18 ` Madhavan Srinivasan
2014-12-01 18:01 ` Gabriel Paubert
2014-11-27 14:05 ` [RFC PATCH 0/2] powerpc: CR based local atomic operation implementation David Laight
2 siblings, 1 reply; 29+ messages in thread
From: Madhavan Srinivasan @ 2014-11-27 12:18 UTC (permalink / raw)
To: mpe; +Cc: Madhavan Srinivasan, rusty, paulus, anton, linuxppc-dev
This patch rewrites the current local_* functions to use the CR5-based scheme.
Base flow for each function is
{
set cr5(eq)
load
..
store
clear cr5(eq)
}
The above instructions are followed by a fixup (__ex_table) section
which points back to the entry of the sequence in case an interrupt
hits in the middle of the flow. If the interrupt happens after the
store, we just continue from the last instruction in that block.
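For example, the __ex_table entries for the local_add() block in this
patch pair each possibly-interrupted instruction with its restart
address (abridged, comments added here for illustration):

"	.section __ex_table,\"a\"\n"
	PPC_LONG "2b,1b\n"	/* interrupted at the load  -> restart at 1: (crset) */
	PPC_LONG "3b,1b\n"	/* interrupted at the add   -> restart at 1: */
	PPC_LONG "5b,1b\n"	/* interrupted at the store -> restart at 1: */
	PPC_LONG "6b,6b\n"	/* at the crclr the store is done -> stay at 6: */
"	.previous\n"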
Currently only asm/local.h has been rewritten; local64 is a TODO.
Also, the entire change applies only to PPC64.
Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
---
arch/powerpc/include/asm/local.h | 306 +++++++++++++++++++++++++++++++++++++++
1 file changed, 306 insertions(+)
diff --git a/arch/powerpc/include/asm/local.h b/arch/powerpc/include/asm/local.h
index b8da913..a26e5d3 100644
--- a/arch/powerpc/include/asm/local.h
+++ b/arch/powerpc/include/asm/local.h
@@ -11,6 +11,310 @@ typedef struct
#define LOCAL_INIT(i) { ATOMIC_LONG_INIT(i) }
+#ifdef CONFIG_PPC64
+
+static __inline__ long local_read(local_t *l)
+{
+ long t;
+
+ __asm__ __volatile__(
+"1: crset 22\n"
+"2:" PPC_LL" %0,0(%1)\n"
+"3: crclr 22\n"
+"4:\n"
+" .section __ex_table,\"a\"\n"
+ PPC_LONG_ALIGN "\n"
+ PPC_LONG "2b,1b\n"
+ PPC_LONG "3b,3b\n"
+" .previous\n"
+ : "=&r" (t)
+ : "r" (&(l->a.counter)));
+
+ return t;
+}
+
+static __inline__ void local_set(local_t *l, long i)
+{
+ long t;
+
+ __asm__ __volatile__(
+"1: crset 22\n"
+"2:" PPC_LL" %0,0(%1)\n"
+"3:" PPC405_ERR77(0,%2)
+"4:" PPC_STL" %0,0(%2)\n"
+"5: crclr 22\n"
+"6:\n"
+" .section __ex_table,\"a\"\n"
+ PPC_LONG_ALIGN "\n"
+ PPC_LONG "2b,1b\n"
+ PPC_LONG "3b,1b\n"
+ PPC_LONG "4b,1b\n"
+ PPC_LONG "5b,5b\n"
+" .previous\n"
+ : "=&r" (t)
+ : "r" (&(i)), "r" (&(l->a.counter)));
+}
+
+static __inline__ void local_add(long i, local_t *l)
+{
+ long t;
+
+ __asm__ __volatile__(
+"1: crset 22\n"
+"2:" PPC_LL" %0,0(%2)\n"
+"3: add %0,%1,%0\n"
+"4:" PPC405_ERR77(0,%2)
+"5:" PPC_STL" %0,0(%2)\n"
+"6: crclr 22\n"
+"7:\n"
+" .section __ex_table,\"a\"\n"
+ PPC_LONG_ALIGN "\n"
+ PPC_LONG "2b,1b\n"
+ PPC_LONG "3b,1b\n"
+ PPC_LONG "4b,1b\n"
+ PPC_LONG "5b,1b\n"
+ PPC_LONG "6b,6b\n"
+" .previous\n"
+ : "=&r" (t)
+ : "r" (i), "r" (&(l->a.counter)));
+}
+
+static __inline__ void local_sub(long i, local_t *l)
+{
+ long t;
+
+ __asm__ __volatile__(
+"1: crset 22\n"
+"2:" PPC_LL" %0,0(%2)\n"
+"3: subf %0,%1,%0\n"
+"4:" PPC405_ERR77(0,%2)
+"5:" PPC_STL" %0,0(%2)\n"
+"6: crclr 22\n"
+"7:\n"
+" .section __ex_table,\"a\"\n"
+ PPC_LONG_ALIGN "\n"
+ PPC_LONG "2b,1b\n"
+ PPC_LONG "3b,1b\n"
+ PPC_LONG "4b,1b\n"
+ PPC_LONG "5b,1b\n"
+ PPC_LONG "6b,6b\n"
+" .previous\n"
+ : "=&r" (t)
+ : "r" (i), "r" (&(l->a.counter)));
+}
+
+static __inline__ long local_add_return(long a, local_t *l)
+{
+ long t;
+
+ __asm__ __volatile__(
+"1: crset 22\n"
+"2:" PPC_LL" %0,0(%2)\n"
+"3: add %0,%1,%0\n"
+"4:" PPC405_ERR77(0,%2)
+"5:" PPC_STL "%0,0(%2)\n"
+"6: crclr 22\n"
+"7:\n"
+" .section __ex_table,\"a\"\n"
+ PPC_LONG_ALIGN "\n"
+ PPC_LONG "2b,1b\n"
+ PPC_LONG "3b,1b\n"
+ PPC_LONG "4b,1b\n"
+ PPC_LONG "5b,1b\n"
+ PPC_LONG "6b,6b\n"
+" .previous\n"
+ : "=&r" (t)
+ : "r" (a), "r" (&(l->a.counter))
+ : "cc", "memory");
+
+ return t;
+}
+
+
+#define local_add_negative(a, l) (local_add_return((a), (l)) < 0)
+
+static __inline__ long local_sub_return(long a, local_t *l)
+{
+ long t;
+
+ __asm__ __volatile__(
+"1: crset 22\n"
+"2:" PPC_LL" %0,0(%2)\n"
+"3: subf %0,%1,%0\n"
+"4:" PPC405_ERR77(0,%2)
+"5:" PPC_STL "%0,0(%2)\n"
+"6: crclr 22\n"
+"7:\n"
+" .section __ex_table,\"a\"\n"
+ PPC_LONG_ALIGN "\n"
+ PPC_LONG "2b,1b\n"
+ PPC_LONG "3b,1b\n"
+ PPC_LONG "4b,1b\n"
+ PPC_LONG "5b,1b\n"
+ PPC_LONG "6b,6b\n"
+" .previous\n"
+ : "=&r" (t)
+ : "r" (a), "r" (&(l->a.counter))
+ : "cc", "memory");
+
+ return t;
+}
+
+static __inline__ long local_inc_return(local_t *l)
+{
+ long t;
+
+ __asm__ __volatile__(
+"1: crset 22\n"
+"2:" PPC_LL" %0,0(%1)\n"
+"3: addic %0,%0,1\n"
+"4:" PPC405_ERR77(0,%1)
+"5:" PPC_STL "%0,0(%1)\n"
+"6: crclr 22\n"
+"7:\n"
+" .section __ex_table,\"a\"\n"
+ PPC_LONG_ALIGN "\n"
+ PPC_LONG "2b,1b\n"
+ PPC_LONG "3b,1b\n"
+ PPC_LONG "4b,1b\n"
+ PPC_LONG "5b,1b\n"
+ PPC_LONG "6b,6b\n"
+" .previous"
+ : "=&r" (t)
+ : "r" (&(l->a.counter))
+ : "cc", "xer", "memory");
+
+ return t;
+}
+
+/*
+ * local_inc_and_test - increment and test
+ * @l: pointer of type local_t
+ *
+ * Atomically increments @l by 1
+ * and returns true if the result is zero, or false for all
+ * other cases.
+ */
+#define local_inc_and_test(l) (local_inc_return(l) == 0)
+
+static __inline__ long local_dec_return(local_t *l)
+{
+ long t;
+
+ __asm__ __volatile__(
+"1: crset 22\n"
+"2:" PPC_LL" %0,0(%1)\n"
+"3: addic %0,%0,-1\n"
+"4:" PPC405_ERR77(0,%1)
+"5:" PPC_STL "%0,0(%1)\n"
+"6: crclr 22\n"
+"7:\n"
+" .section __ex_table,\"a\"\n"
+ PPC_LONG_ALIGN "\n"
+ PPC_LONG "2b,1b\n"
+ PPC_LONG "3b,1b\n"
+ PPC_LONG "4b,1b\n"
+ PPC_LONG "5b,1b\n"
+ PPC_LONG "6b,6b\n"
+" .previous\n"
+ : "=&r" (t)
+ : "r" (&(l->a.counter))
+ : "cc", "xer", "memory");
+
+ return t;
+}
+
+#define local_inc(l) local_inc_return(l)
+#define local_dec(l) local_dec_return(l)
+
+#define local_cmpxchg(l, o, n) \
+ (cmpxchg_local(&((l)->a.counter), (o), (n)))
+#define local_xchg(l, n) (xchg_local(&((l)->a.counter), (n)))
+
+/**
+ * local_add_unless - add unless the number is a given value
+ * @l: pointer of type local_t
+ * @a: the amount to add to v...
+ * @u: ...unless v is equal to u.
+ *
+ * Atomically adds @a to @l, so long as it was not @u.
+ * Returns non-zero if @l was not @u, and zero otherwise.
+ */
+static __inline__ int local_add_unless(local_t *l, long a, long u)
+{
+ long t;
+
+ __asm__ __volatile__ (
+"1: crset 22\n"
+"2:" PPC_LL" %0,0(%1)\n"
+"3: cmpw 0,%0,%3 \n"
+"4: beq- 9f \n"
+"5: add %0,%2,%0 \n"
+"6:" PPC405_ERR77(0,%1)
+"7:" PPC_STL" %0,0(%1) \n"
+"8: subf %0,%2,%0 \n"
+"9: crclr 22\n"
+"10:\n"
+" .section __ex_table,\"a\"\n"
+ PPC_LONG_ALIGN "\n"
+ PPC_LONG "2b,1b\n"
+ PPC_LONG "3b,1b\n"
+ PPC_LONG "4b,1b\n"
+ PPC_LONG "5b,1b\n"
+ PPC_LONG "6b,1b\n"
+ PPC_LONG "7b,1b\n"
+ PPC_LONG "8b,8b\n"
+ PPC_LONG "9b,9b\n"
+" .previous\n"
+ : "=&r" (t)
+ : "r" (&(l->a.counter)), "r" (a), "r" (u)
+ : "cc", "memory");
+
+ return t != u;
+}
+
+#define local_inc_not_zero(l) local_add_unless((l), 1, 0)
+
+#define local_sub_and_test(a, l) (local_sub_return((a), (l)) == 0)
+#define local_dec_and_test(l) (local_dec_return((l)) == 0)
+
+/*
+ * Atomically test *l and decrement if it is greater than 0.
+ * The function returns the old value of *l minus 1.
+ */
+static __inline__ long local_dec_if_positive(local_t *l)
+{
+ long t;
+
+ __asm__ __volatile__(
+"1: crset 22\n"
+"2:" PPC_LL" %0,0(%1)\n"
+"3: cmpwi %0,1\n"
+"4: addi %0,%0,-1\n"
+"5: blt- 8f\n"
+"6:" PPC405_ERR77(0,%1)
+"7:" PPC_STL "%0,0(%1)\n"
+"8: crclr 22\n"
+"9:\n"
+" .section__ex_table,\"a\"\n"
+ PPC_LONG_ALIGN "\n"
+ PPC_LONG "2b,1b\n"
+ PPC_LONG "3b,1b\n"
+ PPC_LONG "4b,1b\n"
+ PPC_LONG "5b,1b\n"
+ PPC_LONG "6b,1b\n"
+ PPC_LONG "7b,1b\n"
+ PPC_LONG "8b,8b\n"
+" .previous\n"
+ : "=&b" (t)
+ : "r" (&(l->a.counter))
+ : "cc", "memory");
+
+ return t;
+}
+
+#else
+
#define local_read(l) atomic_long_read(&(l)->a)
#define local_set(l,i) atomic_long_set(&(l)->a, (i))
@@ -162,6 +466,8 @@ static __inline__ long local_dec_if_positive(local_t *l)
return t;
}
+#endif
+
/* Use these for per-cpu local_t variables: on some archs they are
* much more efficient than these naive implementations. Note they take
* a variable, not an address.
--
1.9.1
^ permalink raw reply related [flat|nested] 29+ messages in thread
* RE: [RFC PATCH 0/2] powerpc: CR based local atomic operation implementation
2014-11-27 12:18 [RFC PATCH 0/2] powerpc: CR based local atomic operation implementation Madhavan Srinivasan
2014-11-27 12:18 ` [RFC PATCH 1/2]powerpc: foundation code to handle CR5 for local_t Madhavan Srinivasan
2014-11-27 12:18 ` [RFC PATCH 2/2]powerpc: rewrite local_* to use CR5 flag Madhavan Srinivasan
@ 2014-11-27 14:05 ` David Laight
2014-11-28 8:27 ` Madhavan Srinivasan
2 siblings, 1 reply; 29+ messages in thread
From: David Laight @ 2014-11-27 14:05 UTC (permalink / raw)
To: 'Madhavan Srinivasan', mpe@ellerman.id.au
Cc: linuxppc-dev@lists.ozlabs.org, rusty@rustcorp.com.au,
paulus@samba.org, anton@samba.org
From: Madhavan Srinivasan
> This patchset create the infrastructure to handle the CR based
> local_* atomic operations. Local atomic operations are fast
> and highly reentrant per CPU counters.  Used for percpu
> variable updates. Local atomic operations only guarantee
> variable modification atomicity wrt the CPU which owns the
> data and these needs to be executed in a preemption safe way.

These are usually called 'restartable atomic sequences (RAS)'.

> Here is the design of the first patch. Since local_* operations
> are only need to be atomic to interrupts (IIUC), patch uses
> one of the Condition Register (CR) fields as a flag variable. When
> entering the local_*, specific bit in the CR5 field is set
> and on exit, bit is cleared. CR bit checking is done in the
> interrupt return path. If CR5[EQ] bit set and if we return
> to kernel, we reset to start of local_* operation.

I don't claim to be able to read ppc assembler.
But I can't see the code that clears CR5[EQ] for the duration
of the ISR.
Without it a nested interrupt will go through unwanted paths.

There are also a lot of 'magic' constants in that assembly code.

I also wonder if it is possible to inspect the interrupted
code to determine the start/end of the RAS block.
(Easiest if you assume that there is a single 'write' instruction
as the last entry in the block.)

Also, how expensive is it to disable all interrupts?

	David
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [RFC PATCH 1/2]powerpc: foundation code to handle CR5 for local_t
2014-11-27 12:18 ` [RFC PATCH 1/2]powerpc: foundation code to handle CR5 for local_t Madhavan Srinivasan
@ 2014-11-27 16:56 ` Segher Boessenkool
2014-11-28 1:58 ` Benjamin Herrenschmidt
2014-11-28 2:57 ` Madhavan Srinivasan
2014-11-28 0:56 ` Benjamin Herrenschmidt
` (2 subsequent siblings)
3 siblings, 2 replies; 29+ messages in thread
From: Segher Boessenkool @ 2014-11-27 16:56 UTC (permalink / raw)
To: Madhavan Srinivasan; +Cc: rusty, paulus, anton, linuxppc-dev
On Thu, Nov 27, 2014 at 05:48:40PM +0530, Madhavan Srinivasan wrote:
> Here is the design of this patch. Since local_* operations
> are only need to be atomic to interrupts (IIUC), patch uses
> one of the Condition Register (CR) fields as a flag variable. When
> entering the local_*, specific bit in the CR5 field is set
> and on exit, bit is cleared. CR bit checking is done in the
> interrupt return path. If CR5[EQ] bit set and if we return
> to kernel, we reset to start of local_* operation.
Have you tested this with (upcoming) GCC 5.0? GCC now uses CR5,
and it likes to use it very much, it might be more convenient to
use e.g. CR1 (which is allocated almost last, only before CR0).
> --- a/arch/powerpc/include/asm/exception-64s.h
> +++ b/arch/powerpc/include/asm/exception-64s.h
> @@ -306,7 +306,26 @@ do_kvm_##n: \
> std r10,0(r1); /* make stack chain pointer */ \
> std r0,GPR0(r1); /* save r0 in stackframe */ \
> std r10,GPR1(r1); /* save r1 in stackframe */ \
> - beq 4f; /* if from kernel mode */ \
> +BEGIN_FTR_SECTION; \
> + lis r9,4096; /* Create a mask with HV and PR */ \
> + rldicr r9,r9,32,31; /* bits, AND with the MSR */ \
> + mr r10,r9; /* to check for Hyp state */ \
> + ori r9,r9,16384; \
> + and r9,r12,r9; \
> + cmpd cr3,r10,r9; \
> + beq cr3,66f; /* Jump if we come from Hyp mode*/ \
> + mtcrf 0x04,r10; /* Clear CR5 if coming from usr */ \
Wow, such nastiness, only to avoid using dot insns (since you need to keep
the current CR0 value for the following beq 4f). And CR0 already holds the
PR bit, so you need only to check the HV bit anyway? Some restructuring
would make this a lot simpler and clearer.
> + /*
> + * Now that we are about to exit from interrupt, lets check for
> + * cr5 eq bit. If it is set, then we may be in the middle of
> + * local_t update. In this case, we should rewind the NIP
> + * accordingly.
> + */
> + mfcr r3
> + andi. r4,r3,0x200
> + beq 63f
This is just bne cr5,63f isn't it?
> index 72e783e..edb75a9 100644
> --- a/arch/powerpc/kernel/exceptions-64s.S
> +++ b/arch/powerpc/kernel/exceptions-64s.S
> @@ -637,7 +637,7 @@ masked_##_H##interrupt: \
> rldicl r10,r10,48,1; /* clear MSR_EE */ \
> rotldi r10,r10,16; \
> mtspr SPRN_##_H##SRR1,r10; \
> -2: mtcrf 0x80,r9; \
> +2: mtcrf 0x90,r9; \
> ld r9,PACA_EXGEN+EX_R9(r13); \
> ld r10,PACA_EXGEN+EX_R10(r13); \
> ld r11,PACA_EXGEN+EX_R11(r13); \
What does this do?
Segher
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [RFC PATCH 1/2]powerpc: foundation code to handle CR5 for local_t
2014-11-27 12:18 ` [RFC PATCH 1/2]powerpc: foundation code to handle CR5 for local_t Madhavan Srinivasan
2014-11-27 16:56 ` Segher Boessenkool
@ 2014-11-28 0:56 ` Benjamin Herrenschmidt
2014-11-28 3:15 ` Madhavan Srinivasan
2014-12-01 21:35 ` Gabriel Paubert
2014-12-02 2:04 ` Scott Wood
3 siblings, 1 reply; 29+ messages in thread
From: Benjamin Herrenschmidt @ 2014-11-28 0:56 UTC (permalink / raw)
To: Madhavan Srinivasan; +Cc: rusty, paulus, anton, linuxppc-dev
On Thu, 2014-11-27 at 17:48 +0530, Madhavan Srinivasan wrote:
> This patch create the infrastructure to handle the CR based
> local_* atomic operations. Local atomic operations are fast
> and highly reentrant per CPU counters. Used for percpu
> variable updates. Local atomic operations only guarantee
> variable modification atomicity wrt the CPU which owns the
> data and these needs to be executed in a preemption safe way.
>
> Here is the design of this patch. Since local_* operations
> are only need to be atomic to interrupts (IIUC), patch uses
> one of the Condition Register (CR) fields as a flag variable. When
> entering the local_*, specific bit in the CR5 field is set
> and on exit, bit is cleared. CR bit checking is done in the
> interrupt return path. If CR5[EQ] bit set and if we return
> to kernel, we reset to start of local_* operation.
>
> Reason for this approach is that, currently l[w/d]arx/st[w/d]cx.
> instruction pair is used for local_* operations, which are heavy
> on cycle count and they dont support a local variant. So to
> see whether the new implementation helps, used a modified
> version of Rusty's benchmark code on local_t.
>
> https://lkml.org/lkml/2008/12/16/450
>
> Modifications:
> - increated the working set size from 1MB to 8MB,
> - removed cpu_local_inc test.
>
> Test ran
> - on POWER8 1S Scale out System 2.0GHz
> - on OPAL v3 with v3.18-rc4 patch kernel as Host
>
> Here are the values with the patch.
>
> Time in ns per iteration
>
> inc add read add_return
> atomic_long 67 67 18 69
> irqsave/rest 39 39 23 39
> trivalue 39 39 29 49
> local_t 26 26 24 26
>
> Since CR5 is used as a flag, have added CFLAGS to avoid CR5
> for the kernel compilation and CR5 is zeroed at the kernel
> entry.
>
> Tested the patch in a
> - pSeries LPAR,
> - Host with patched/unmodified guest kernel
>
> To check whether userspace see any CR5 corruption, ran a simple
> test which does,
> - set CR5 field,
> - while(1)
> - sleep or gettimeofday
> - chk bit set
>
> Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
> ---
> - I really appreciate feedback on the patchset.
> - Kindly comment if I should try with any other benchmark or
> workload to check the numbers.
> - Also, kindly recommand any know stress test for CR
>
> Makefile | 6 ++
> arch/powerpc/include/asm/exception-64s.h | 21 +++++-
> arch/powerpc/kernel/entry_64.S | 106 ++++++++++++++++++++++++++++++-
> arch/powerpc/kernel/exceptions-64s.S | 2 +-
> arch/powerpc/kernel/head_64.S | 8 +++
> 5 files changed, 138 insertions(+), 5 deletions(-)
>
> diff --git a/Makefile b/Makefile
> index 00d618b..2e271ad 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -706,6 +706,12 @@ endif
>
> KBUILD_CFLAGS += $(call cc-option, -fno-var-tracking-assignments)
>
> +ifdef CONFIG_PPC64
> +# We need this flag to force compiler not to use CR5, since
> +# local_t type code is based on this.
> +KBUILD_CFLAGS += -ffixed-cr5
> +endif
> +
> ifdef CONFIG_DEBUG_INFO
> ifdef CONFIG_DEBUG_INFO_SPLIT
> KBUILD_CFLAGS += $(call cc-option, -gsplit-dwarf, -g)
> diff --git a/arch/powerpc/include/asm/exception-64s.h b/arch/powerpc/include/asm/exception-64s.h
> index 77f52b2..c42919a 100644
> --- a/arch/powerpc/include/asm/exception-64s.h
> +++ b/arch/powerpc/include/asm/exception-64s.h
> @@ -306,7 +306,26 @@ do_kvm_##n: \
> std r10,0(r1); /* make stack chain pointer */ \
> std r0,GPR0(r1); /* save r0 in stackframe */ \
> std r10,GPR1(r1); /* save r1 in stackframe */ \
> - beq 4f; /* if from kernel mode */ \
> +BEGIN_FTR_SECTION; \
> + lis r9,4096; /* Create a mask with HV and PR */ \
> + rldicr r9,r9,32,31; /* bits, AND with the MSR */ \
> + mr r10,r9; /* to check for Hyp state */ \
> + ori r9,r9,16384; \
> + and r9,r12,r9; \
> + cmpd cr3,r10,r9; \
> + beq cr3,66f; /* Jump if we come from Hyp mode*/ \
> + mtcrf 0x04,r10; /* Clear CR5 if coming from usr */ \
> +FTR_SECTION_ELSE; \
Can't we just unconditionally clear it, as long as we do that after we've
saved it? In that case, it's just a matter for the fixup code to check
the saved version rather than the actual CR.
> + beq 4f; /* if kernel mode branch */ \
> + li r10,0; /* Clear CR5 incase of coming */ \
> + mtcrf 0x04,r10; /* from user. */ \
> + nop; /* This part of code is for */ \
> + nop; /* kernel with MSR[HV]=0, */ \
> + nop; /* MSR[PR]=0, so just chk for */ \
> + nop; /* MSR[PR] */ \
> + nop; \
> +ALT_FTR_SECTION_END_IFSET(CPU_FTR_HVMODE); \
> +66: beq 4f; /* if from kernel mode */ \
> ACCOUNT_CPU_USER_ENTRY(r9, r10); \
> SAVE_PPR(area, r9, r10); \
> 4: EXCEPTION_PROLOG_COMMON_2(area) \
> diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S
> index 0905c8d..e42bb99 100644
> --- a/arch/powerpc/kernel/entry_64.S
> +++ b/arch/powerpc/kernel/entry_64.S
> @@ -68,7 +68,26 @@ system_call_common:
> 2: std r2,GPR2(r1)
> std r3,GPR3(r1)
> mfcr r2
> - std r4,GPR4(r1)
> +BEGIN_FTR_SECTION
> + lis r10,4096
> + rldicr r10,r10,32,31
> + mr r11,r10
> + ori r10,r10,16384
> + and r10,r12,r10
> + cmpd r11,r10
> + beq 67f
> + mtcrf 0x04,r11
> +FTR_SECTION_ELSE
> + beq 67f
> + li r11,0
> + mtcrf 0x04,r11
> + nop
> + nop
> + nop
> + nop
> + nop
> +ALT_FTR_SECTION_END_IFSET(CPU_FTR_HVMODE)
> +67: std r4,GPR4(r1)
> std r5,GPR5(r1)
> std r6,GPR6(r1)
> std r7,GPR7(r1)
> @@ -224,8 +243,26 @@ syscall_exit:
> BEGIN_FTR_SECTION
> stdcx. r0,0,r1 /* to clear the reservation */
> END_FTR_SECTION_IFCLR(CPU_FTR_STCX_CHECKS_ADDRESS)
> +BEGIN_FTR_SECTION
> + lis r4,4096
> + rldicr r4,r4,32,31
> + mr r6,r4
> + ori r4,r4,16384
> + and r4,r8,r4
> + cmpd cr3,r6,r4
> + beq cr3,65f
> + mtcr r5
> +FTR_SECTION_ELSE
> andi. r6,r8,MSR_PR
> - ld r4,_LINK(r1)
> + beq 65f
> + mtcr r5
> + nop
> + nop
> + nop
> + nop
> + nop
> +ALT_FTR_SECTION_END_IFSET(CPU_FTR_HVMODE)
> +65: ld r4,_LINK(r1)
>
> beq- 1f
> ACCOUNT_CPU_USER_EXIT(r11, r12)
> @@ -234,7 +271,11 @@ END_FTR_SECTION_IFCLR(CPU_FTR_STCX_CHECKS_ADDRESS)
> 1: ld r2,GPR2(r1)
> ld r1,GPR1(r1)
> mtlr r4
> +#ifdef CONFIG_PPC64
> + mtcrf 0xFB,r5
> +#else
> mtcr r5
> +#endif
> mtspr SPRN_SRR0,r7
> mtspr SPRN_SRR1,r8
> RFI
> @@ -804,7 +845,66 @@ ALT_FTR_SECTION_END_IFCLR(CPU_FTR_STCX_CHECKS_ADDRESS)
> */
> .globl fast_exception_return
> fast_exception_return:
> - ld r3,_MSR(r1)
> +
> + /*
> + * Now that we are about to exit from interrupt, lets check for
> + * cr5 eq bit. If it is set, then we may be in the middle of
> + * local_t update. In this case, we should rewind the NIP
> + * accordingly.
> + */
> + mfcr r3
> + andi. r4,r3,0x200
> + beq 63f
> +
> + /*
> + * Now that the bit is set, lets check for return to User
> + */
> + ld r4,_MSR(r1)
> +BEGIN_FTR_SECTION
> + li r3,4096
> + rldicr r3,r3,32,31
> + mr r5,r3
> + ori r3,r3,16384
> + and r3,r4,r3
> + cmpd r5,r3
> + bne 63f
> +FTR_SECTION_ELSE
> + andi. r3,r4,MSR_PR
> + bne 63f
> + nop
> + nop
> + nop
> + nop
> + nop
> +ALT_FTR_SECTION_END_IFSET(CPU_FTR_HVMODE)
> +
> + /*
> + * Looks like we are returning to Kernel, so
> + * lets get the NIP and search the ex_table.
> + * Change the NIP based on the return value
> + */
> +lookup_ex_table:
> + ld r3,_NIP(r1)
> + bl search_exception_tables
> + cmpli 0,1,r3,0
> + bne 62f
> +
> + /*
> + * This is a panic case. Reason is that, we
> + * have the CR5 bit set, but we are not in
> + * local_* code and we are returning to Kernel.
> + */
> + ld r3,_NIP(r1)
> + mfcr r4
> + EMIT_BUG_ENTRY lookup_ex_table, __FILE__,__LINE__,BUGFLAG_WARNING
> +
> + /*
> + * Now save the return fixup address as NIP
> + */
> +62: ld r4,8(r3)
> + std r4,_NIP(r1)
> + crclr 22
> +63: ld r3,_MSR(r1)
> ld r4,_CTR(r1)
> ld r0,_LINK(r1)
> mtctr r4
> diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
> index 72e783e..edb75a9 100644
> --- a/arch/powerpc/kernel/exceptions-64s.S
> +++ b/arch/powerpc/kernel/exceptions-64s.S
> @@ -637,7 +637,7 @@ masked_##_H##interrupt: \
> rldicl r10,r10,48,1; /* clear MSR_EE */ \
> rotldi r10,r10,16; \
> mtspr SPRN_##_H##SRR1,r10; \
> -2: mtcrf 0x80,r9; \
> +2: mtcrf 0x90,r9; \
> ld r9,PACA_EXGEN+EX_R9(r13); \
> ld r10,PACA_EXGEN+EX_R10(r13); \
> ld r11,PACA_EXGEN+EX_R11(r13); \
> diff --git a/arch/powerpc/kernel/head_64.S b/arch/powerpc/kernel/head_64.S
> index d48125d..02e49b3 100644
> --- a/arch/powerpc/kernel/head_64.S
> +++ b/arch/powerpc/kernel/head_64.S
> @@ -347,6 +347,14 @@ __mmu_off:
> *
> */
> __start_initialization_multiplatform:
> +
> + /*
> + * Before we do anything, lets clear CR5 field,
> + * so that we will have a clean start at entry
> + */
> + li r11,0
> + mtcrf 0x04,r11
> +
> /* Make sure we are running in 64 bits mode */
> bl enable_64b_mode
>
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [RFC PATCH 1/2]powerpc: foundation code to handle CR5 for local_t
2014-11-27 16:56 ` Segher Boessenkool
@ 2014-11-28 1:58 ` Benjamin Herrenschmidt
2014-11-28 3:00 ` Madhavan Srinivasan
2014-11-28 15:57 ` Segher Boessenkool
2014-11-28 2:57 ` Madhavan Srinivasan
1 sibling, 2 replies; 29+ messages in thread
From: Benjamin Herrenschmidt @ 2014-11-28 1:58 UTC (permalink / raw)
To: Segher Boessenkool
Cc: linuxppc-dev, rusty, Madhavan Srinivasan, paulus, anton
On Thu, 2014-11-27 at 10:56 -0600, Segher Boessenkool wrote:
> On Thu, Nov 27, 2014 at 05:48:40PM +0530, Madhavan Srinivasan wrote:
> > Here is the design of this patch. Since local_* operations
> > are only need to be atomic to interrupts (IIUC), patch uses
> > one of the Condition Register (CR) fields as a flag variable. When
> > entering the local_*, specific bit in the CR5 field is set
> > and on exit, bit is cleared. CR bit checking is done in the
> > interrupt return path. If CR5[EQ] bit set and if we return
> > to kernel, we reset to start of local_* operation.
>
> Have you tested this with (upcoming) GCC 5.0? GCC now uses CR5,
> and it likes to use it very much, it might be more convenient to
> use e.g. CR1 (which is allocated almost last, only before CR0).
We use CR1 all over the place in your asm code. Any other suggestion ?
What's the damage of -ffixed-cr5 on gcc5 ? won't it just use CR4 or 6
instead ?
> > --- a/arch/powerpc/include/asm/exception-64s.h
> > +++ b/arch/powerpc/include/asm/exception-64s.h
> > @@ -306,7 +306,26 @@ do_kvm_##n: \
> > std r10,0(r1); /* make stack chain pointer */ \
> > std r0,GPR0(r1); /* save r0 in stackframe */ \
> > std r10,GPR1(r1); /* save r1 in stackframe */ \
> > - beq 4f; /* if from kernel mode */ \
> > +BEGIN_FTR_SECTION; \
> > + lis r9,4096; /* Create a mask with HV and PR */ \
> > + rldicr r9,r9,32,31; /* bits, AND with the MSR */ \
> > + mr r10,r9; /* to check for Hyp state */ \
> > + ori r9,r9,16384; \
> > + and r9,r12,r9; \
> > + cmpd cr3,r10,r9; \
> > + beq cr3,66f; /* Jump if we come from Hyp mode*/ \
> > + mtcrf 0x04,r10; /* Clear CR5 if coming from usr */ \
>
> Wow, such nastiness, only to avoid using dot insns (since you need to keep
> the current CR0 value for the following beq 4f). And CR0 already holds the
> PR bit, so you need only to check the HV bit anyway? Some restructuring
> would make this a lot simpler and clearer.
>
> > + /*
> > + * Now that we are about to exit from interrupt, lets check for
> > + * cr5 eq bit. If it is set, then we may be in the middle of
> > + * local_t update. In this case, we should rewind the NIP
> > + * accordingly.
> > + */
> > + mfcr r3
> > + andi. r4,r3,0x200
> > + beq 63f
>
> This is just bne cr5,63f isn't it?
>
> > index 72e783e..edb75a9 100644
> > --- a/arch/powerpc/kernel/exceptions-64s.S
> > +++ b/arch/powerpc/kernel/exceptions-64s.S
> > @@ -637,7 +637,7 @@ masked_##_H##interrupt: \
> > rldicl r10,r10,48,1; /* clear MSR_EE */ \
> > rotldi r10,r10,16; \
> > mtspr SPRN_##_H##SRR1,r10; \
> > -2: mtcrf 0x80,r9; \
> > +2: mtcrf 0x90,r9; \
> > ld r9,PACA_EXGEN+EX_R9(r13); \
> > ld r10,PACA_EXGEN+EX_R10(r13); \
> > ld r11,PACA_EXGEN+EX_R11(r13); \
>
> What does this do?
>
>
> Segher
> _______________________________________________
> Linuxppc-dev mailing list
> Linuxppc-dev@lists.ozlabs.org
> https://lists.ozlabs.org/listinfo/linuxppc-dev
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [RFC PATCH 1/2]powerpc: foundation code to handle CR5 for local_t
2014-11-27 16:56 ` Segher Boessenkool
2014-11-28 1:58 ` Benjamin Herrenschmidt
@ 2014-11-28 2:57 ` Madhavan Srinivasan
2014-11-28 16:00 ` Segher Boessenkool
1 sibling, 1 reply; 29+ messages in thread
From: Madhavan Srinivasan @ 2014-11-28 2:57 UTC (permalink / raw)
To: Segher Boessenkool; +Cc: rusty, paulus, anton, linuxppc-dev
On Thursday 27 November 2014 10:26 PM, Segher Boessenkool wrote:
> On Thu, Nov 27, 2014 at 05:48:40PM +0530, Madhavan Srinivasan wrote:
>> Here is the design of this patch. Since local_* operations
>> are only need to be atomic to interrupts (IIUC), patch uses
>> one of the Condition Register (CR) fields as a flag variable. When
>> entering the local_*, specific bit in the CR5 field is set
>> and on exit, bit is cleared. CR bit checking is done in the
>> interrupt return path. If CR5[EQ] bit set and if we return
>> to kernel, we reset to start of local_* operation.
>
> Have you tested this with (upcoming) GCC 5.0? GCC now uses CR5,
> and it likes to use it very much, it might be more convenient to
> use e.g. CR1 (which is allocated almost last, only before CR0).
>
No, I did not try it with GCC 5.0. But I did force kernel compilation
with -ffixed-cr5, which should make GCC avoid using CR5. I will try
that today.
>> --- a/arch/powerpc/include/asm/exception-64s.h
>> +++ b/arch/powerpc/include/asm/exception-64s.h
>> @@ -306,7 +306,26 @@ do_kvm_##n: \
>> std r10,0(r1); /* make stack chain pointer */ \
>> std r0,GPR0(r1); /* save r0 in stackframe */ \
>> std r10,GPR1(r1); /* save r1 in stackframe */ \
>> - beq 4f; /* if from kernel mode */ \
>> +BEGIN_FTR_SECTION; \
>> + lis r9,4096; /* Create a mask with HV and PR */ \
>> + rldicr r9,r9,32,31; /* bits, AND with the MSR */ \
>> + mr r10,r9; /* to check for Hyp state */ \
>> + ori r9,r9,16384; \
>> + and r9,r12,r9; \
>> + cmpd cr3,r10,r9; \
>> + beq cr3,66f; /* Jump if we come from Hyp mode*/ \
>> + mtcrf 0x04,r10; /* Clear CR5 if coming from usr */ \
>
> Wow, such nastiness, only to avoid using dot insns (since you need to keep
> the current CR0 value for the following beq 4f). And CR0 already holds the
> PR bit, so you need only to check the HV bit anyway? Some restructuring
> would make this a lot simpler and clearer.
Ok I can try that.
>
>> + /*
>> + * Now that we are about to exit from interrupt, lets check for
>> + * cr5 eq bit. If it is set, then we may be in the middle of
>> + * local_t update. In this case, we should rewind the NIP
>> + * accordingly.
>> + */
>> + mfcr r3
>> + andi. r4,r3,0x200
>> + beq 63f
>
> This is just bne cr5,63f isn't it?
>
>> index 72e783e..edb75a9 100644
>> --- a/arch/powerpc/kernel/exceptions-64s.S
>> +++ b/arch/powerpc/kernel/exceptions-64s.S
>> @@ -637,7 +637,7 @@ masked_##_H##interrupt: \
>> rldicl r10,r10,48,1; /* clear MSR_EE */ \
>> rotldi r10,r10,16; \
>> mtspr SPRN_##_H##SRR1,r10; \
>> -2: mtcrf 0x80,r9; \
>> +2: mtcrf 0x90,r9; \
>> ld r9,PACA_EXGEN+EX_R9(r13); \
>> ld r10,PACA_EXGEN+EX_R10(r13); \
>> ld r11,PACA_EXGEN+EX_R11(r13); \
>
> What does this do?
>
Since I use CR3, I restore it here: the mtcrf mask has one bit per CR
field (0x80 is CR0, 0x10 is CR3), so 0x90 restores CR0 and CR3 rather
than CR0 alone.
>
> Segher
>
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [RFC PATCH 1/2]powerpc: foundation code to handle CR5 for local_t
2014-11-28 1:58 ` Benjamin Herrenschmidt
@ 2014-11-28 3:00 ` Madhavan Srinivasan
2014-11-28 15:57 ` Segher Boessenkool
1 sibling, 0 replies; 29+ messages in thread
From: Madhavan Srinivasan @ 2014-11-28 3:00 UTC (permalink / raw)
To: Benjamin Herrenschmidt, Segher Boessenkool
Cc: linuxppc-dev, rusty, paulus, anton
On Friday 28 November 2014 07:28 AM, Benjamin Herrenschmidt wrote:
> On Thu, 2014-11-27 at 10:56 -0600, Segher Boessenkool wrote:
>> On Thu, Nov 27, 2014 at 05:48:40PM +0530, Madhavan Srinivasan wrote:
>>> Here is the design of this patch. Since local_* operations
>>> are only need to be atomic to interrupts (IIUC), patch uses
>>> one of the Condition Register (CR) fields as a flag variable. When
>>> entering the local_*, specific bit in the CR5 field is set
>>> and on exit, bit is cleared. CR bit checking is done in the
>>> interrupt return path. If CR5[EQ] bit set and if we return
>>> to kernel, we reset to start of local_* operation.
>>
>> Have you tested this with (upcoming) GCC 5.0? GCC now uses CR5,
>> and it likes to use it very much, it might be more convenient to
>> use e.g. CR1 (which is allocated almost last, only before CR0).
>
> We use CR1 all over the place in your asm code. Any other suggestion ?
>
Yes. CR1 is used in many places, and so is CR7. CR0 is always used
for dot insns. And I guess vector instructions use CR6.
> What's the damage of -ffixed-cr5 on gcc5 ? won't it just use CR4 or 6
> instead ?
>
Will try this today with GCC 5.0.
>>> --- a/arch/powerpc/include/asm/exception-64s.h
>>> +++ b/arch/powerpc/include/asm/exception-64s.h
>>> @@ -306,7 +306,26 @@ do_kvm_##n: \
>>> std r10,0(r1); /* make stack chain pointer */ \
>>> std r0,GPR0(r1); /* save r0 in stackframe */ \
>>> std r10,GPR1(r1); /* save r1 in stackframe */ \
>>> - beq 4f; /* if from kernel mode */ \
>>> +BEGIN_FTR_SECTION; \
>>> + lis r9,4096; /* Create a mask with HV and PR */ \
>>> + rldicr r9,r9,32,31; /* bits, AND with the MSR */ \
>>> + mr r10,r9; /* to check for Hyp state */ \
>>> + ori r9,r9,16384; \
>>> + and r9,r12,r9; \
>>> + cmpd cr3,r10,r9; \
>>> + beq cr3,66f; /* Jump if we come from Hyp mode*/ \
>>> + mtcrf 0x04,r10; /* Clear CR5 if coming from usr */ \
>>
>> Wow, such nastiness, only to avoid using dot insns (since you need to keep
>> the current CR0 value for the following beq 4f). And CR0 already holds the
>> PR bit, so you need only to check the HV bit anyway? Some restructuring
>> would make this a lot simpler and clearer.
>>
>>> + /*
>>> + * Now that we are about to exit from interrupt, lets check for
>>> + * cr5 eq bit. If it is set, then we may be in the middle of
>>> + * local_t update. In this case, we should rewind the NIP
>>> + * accordingly.
>>> + */
>>> + mfcr r3
>>> + andi. r4,r3,0x200
>>> + beq 63f
>>
>> This is just bne cr5,63f isn't it?
>>
>>> index 72e783e..edb75a9 100644
>>> --- a/arch/powerpc/kernel/exceptions-64s.S
>>> +++ b/arch/powerpc/kernel/exceptions-64s.S
>>> @@ -637,7 +637,7 @@ masked_##_H##interrupt: \
>>> rldicl r10,r10,48,1; /* clear MSR_EE */ \
>>> rotldi r10,r10,16; \
>>> mtspr SPRN_##_H##SRR1,r10; \
>>> -2: mtcrf 0x80,r9; \
>>> +2: mtcrf 0x90,r9; \
>>> ld r9,PACA_EXGEN+EX_R9(r13); \
>>> ld r10,PACA_EXGEN+EX_R10(r13); \
>>> ld r11,PACA_EXGEN+EX_R11(r13); \
>>
>> What does this do?
>>
>>
>> Segher
>> _______________________________________________
>> Linuxppc-dev mailing list
>> Linuxppc-dev@lists.ozlabs.org
>> https://lists.ozlabs.org/listinfo/linuxppc-dev
>
>
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [RFC PATCH 1/2]powerpc: foundation code to handle CR5 for local_t
2014-11-28 0:56 ` Benjamin Herrenschmidt
@ 2014-11-28 3:15 ` Madhavan Srinivasan
2014-11-28 3:21 ` Benjamin Herrenschmidt
0 siblings, 1 reply; 29+ messages in thread
From: Madhavan Srinivasan @ 2014-11-28 3:15 UTC (permalink / raw)
To: Benjamin Herrenschmidt; +Cc: rusty, paulus, anton, linuxppc-dev
On Friday 28 November 2014 06:26 AM, Benjamin Herrenschmidt wrote:
> On Thu, 2014-11-27 at 17:48 +0530, Madhavan Srinivasan wrote:
>> This patch create the infrastructure to handle the CR based
>> local_* atomic operations. Local atomic operations are fast
>> and highly reentrant per CPU counters. Used for percpu
>> variable updates. Local atomic operations only guarantee
>> variable modification atomicity wrt the CPU which owns the
>> data and these needs to be executed in a preemption safe way.
>>
>> Here is the design of this patch. Since local_* operations
>> are only need to be atomic to interrupts (IIUC), patch uses
>> one of the Condition Register (CR) fields as a flag variable. When
>> entering the local_*, specific bit in the CR5 field is set
>> and on exit, bit is cleared. CR bit checking is done in the
>> interrupt return path. If CR5[EQ] bit set and if we return
>> to kernel, we reset to start of local_* operation.
>>
>> Reason for this approach is that, currently l[w/d]arx/st[w/d]cx.
>> instruction pair is used for local_* operations, which are heavy
>> on cycle count and they dont support a local variant. So to
>> see whether the new implementation helps, used a modified
>> version of Rusty's benchmark code on local_t.
>>
>> https://lkml.org/lkml/2008/12/16/450
>>
>> Modifications:
>> - increated the working set size from 1MB to 8MB,
>> - removed cpu_local_inc test.
>>
>> Test ran
>> - on POWER8 1S Scale out System 2.0GHz
>> - on OPAL v3 with v3.18-rc4 patch kernel as Host
>>
>> Here are the values with the patch.
>>
>> Time in ns per iteration
>>
>> inc add read add_return
>> atomic_long 67 67 18 69
>> irqsave/rest 39 39 23 39
>> trivalue 39 39 29 49
>> local_t 26 26 24 26
>>
>> Since CR5 is used as a flag, have added CFLAGS to avoid CR5
>> for the kernel compilation and CR5 is zeroed at the kernel
>> entry.
>>
>> Tested the patch in a
>> - pSeries LPAR,
>> - Host with patched/unmodified guest kernel
>>
>> To check whether userspace see any CR5 corruption, ran a simple
>> test which does,
>> - set CR5 field,
>> - while(1)
>> - sleep or gettimeofday
>> - chk bit set
>>
>> Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
>> ---
>> - I really appreciate feedback on the patchset.
>> - Kindly comment if I should try with any other benchmark or
>> workload to check the numbers.
>> - Also, kindly recommand any know stress test for CR
>>
>> Makefile | 6 ++
>> arch/powerpc/include/asm/exception-64s.h | 21 +++++-
>> arch/powerpc/kernel/entry_64.S | 106 ++++++++++++++++++++++++++++++-
>> arch/powerpc/kernel/exceptions-64s.S | 2 +-
>> arch/powerpc/kernel/head_64.S | 8 +++
>> 5 files changed, 138 insertions(+), 5 deletions(-)
>>
>> diff --git a/Makefile b/Makefile
>> index 00d618b..2e271ad 100644
>> --- a/Makefile
>> +++ b/Makefile
>> @@ -706,6 +706,12 @@ endif
>>
>> KBUILD_CFLAGS += $(call cc-option, -fno-var-tracking-assignments)
>>
>> +ifdef CONFIG_PPC64
>> +# We need this flag to force compiler not to use CR5, since
>> +# local_t type code is based on this.
>> +KBUILD_CFLAGS += -ffixed-cr5
>> +endif
>> +
>> ifdef CONFIG_DEBUG_INFO
>> ifdef CONFIG_DEBUG_INFO_SPLIT
>> KBUILD_CFLAGS += $(call cc-option, -gsplit-dwarf, -g)
>> diff --git a/arch/powerpc/include/asm/exception-64s.h b/arch/powerpc/include/asm/exception-64s.h
>> index 77f52b2..c42919a 100644
>> --- a/arch/powerpc/include/asm/exception-64s.h
>> +++ b/arch/powerpc/include/asm/exception-64s.h
>> @@ -306,7 +306,26 @@ do_kvm_##n: \
>> std r10,0(r1); /* make stack chain pointer */ \
>> std r0,GPR0(r1); /* save r0 in stackframe */ \
>> std r10,GPR1(r1); /* save r1 in stackframe */ \
>> - beq 4f; /* if from kernel mode */ \
>> +BEGIN_FTR_SECTION; \
>> + lis r9,4096; /* Create a mask with HV and PR */ \
>> + rldicr r9,r9,32,31; /* bits, AND with the MSR */ \
>> + mr r10,r9; /* to check for Hyp state */ \
>> + ori r9,r9,16384; \
>> + and r9,r12,r9; \
>> + cmpd cr3,r10,r9; \
>> + beq cr3,66f; /* Jump if we come from Hyp mode*/ \
>> + mtcrf 0x04,r10; /* Clear CR5 if coming from usr */ \
>> +FTR_SECTION_ELSE; \
>
> Can't we just unconditionally clear it, as long as we do that after we've
> saved it? In that case, it's just a matter for the fixup code to check
> the saved version rather than the actual CR.
>
I use the CR bit in the interrupt return path to decide whether to
enter the fixup-section search. If we unconditionally clear it, we
would have to enter the fixup-section search for every kernel-return
NIP, right?
Regards
Maddy
>> + beq 4f; /* if kernel mode branch */ \
>> + li r10,0; /* Clear CR5 incase of coming */ \
>> + mtcrf 0x04,r10; /* from user. */ \
>> + nop; /* This part of code is for */ \
>> + nop; /* kernel with MSR[HV]=0, */ \
>> + nop; /* MSR[PR]=0, so just chk for */ \
>> + nop; /* MSR[PR] */ \
>> + nop; \
>> +ALT_FTR_SECTION_END_IFSET(CPU_FTR_HVMODE); \
>> +66: beq 4f; /* if from kernel mode */ \
>> ACCOUNT_CPU_USER_ENTRY(r9, r10); \
>> SAVE_PPR(area, r9, r10); \
>> 4: EXCEPTION_PROLOG_COMMON_2(area) \
>> diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S
>> index 0905c8d..e42bb99 100644
>> --- a/arch/powerpc/kernel/entry_64.S
>> +++ b/arch/powerpc/kernel/entry_64.S
>> @@ -68,7 +68,26 @@ system_call_common:
>> 2: std r2,GPR2(r1)
>> std r3,GPR3(r1)
>> mfcr r2
>> - std r4,GPR4(r1)
>> +BEGIN_FTR_SECTION
>> + lis r10,4096
>> + rldicr r10,r10,32,31
>> + mr r11,r10
>> + ori r10,r10,16384
>> + and r10,r12,r10
>> + cmpd r11,r10
>> + beq 67f
>> + mtcrf 0x04,r11
>> +FTR_SECTION_ELSE
>> + beq 67f
>> + li r11,0
>> + mtcrf 0x04,r11
>> + nop
>> + nop
>> + nop
>> + nop
>> + nop
>> +ALT_FTR_SECTION_END_IFSET(CPU_FTR_HVMODE)
>> +67: std r4,GPR4(r1)
>> std r5,GPR5(r1)
>> std r6,GPR6(r1)
>> std r7,GPR7(r1)
>> @@ -224,8 +243,26 @@ syscall_exit:
>> BEGIN_FTR_SECTION
>> stdcx. r0,0,r1 /* to clear the reservation */
>> END_FTR_SECTION_IFCLR(CPU_FTR_STCX_CHECKS_ADDRESS)
>> +BEGIN_FTR_SECTION
>> + lis r4,4096
>> + rldicr r4,r4,32,31
>> + mr r6,r4
>> + ori r4,r4,16384
>> + and r4,r8,r4
>> + cmpd cr3,r6,r4
>> + beq cr3,65f
>> + mtcr r5
>> +FTR_SECTION_ELSE
>> andi. r6,r8,MSR_PR
>> - ld r4,_LINK(r1)
>> + beq 65f
>> + mtcr r5
>> + nop
>> + nop
>> + nop
>> + nop
>> + nop
>> +ALT_FTR_SECTION_END_IFSET(CPU_FTR_HVMODE)
>> +65: ld r4,_LINK(r1)
>>
>> beq- 1f
>> ACCOUNT_CPU_USER_EXIT(r11, r12)
>> @@ -234,7 +271,11 @@ END_FTR_SECTION_IFCLR(CPU_FTR_STCX_CHECKS_ADDRESS)
>> 1: ld r2,GPR2(r1)
>> ld r1,GPR1(r1)
>> mtlr r4
>> +#ifdef CONFIG_PPC64
>> + mtcrf 0xFB,r5
>> +#else
>> mtcr r5
>> +#endif
>> mtspr SPRN_SRR0,r7
>> mtspr SPRN_SRR1,r8
>> RFI
>> @@ -804,7 +845,66 @@ ALT_FTR_SECTION_END_IFCLR(CPU_FTR_STCX_CHECKS_ADDRESS)
>> */
>> .globl fast_exception_return
>> fast_exception_return:
>> - ld r3,_MSR(r1)
>> +
>> + /*
>> + * Now that we are about to exit from interrupt, lets check for
>> + * cr5 eq bit. If it is set, then we may be in the middle of
>> + * local_t update. In this case, we should rewind the NIP
>> + * accordingly.
>> + */
>> + mfcr r3
>> + andi. r4,r3,0x200
>> + beq 63f
>> +
>> + /*
>> + * Now that the bit is set, lets check for return to User
>> + */
>> + ld r4,_MSR(r1)
>> +BEGIN_FTR_SECTION
>> + li r3,4096
>> + rldicr r3,r3,32,31
>> + mr r5,r3
>> + ori r3,r3,16384
>> + and r3,r4,r3
>> + cmpd r5,r3
>> + bne 63f
>> +FTR_SECTION_ELSE
>> + andi. r3,r4,MSR_PR
>> + bne 63f
>> + nop
>> + nop
>> + nop
>> + nop
>> + nop
>> +ALT_FTR_SECTION_END_IFSET(CPU_FTR_HVMODE)
>> +
>> + /*
>> + * Looks like we are returning to Kernel, so
>> + * lets get the NIP and search the ex_table.
>> + * Change the NIP based on the return value
>> + */
>> +lookup_ex_table:
>> + ld r3,_NIP(r1)
>> + bl search_exception_tables
>> + cmpli 0,1,r3,0
>> + bne 62f
>> +
>> + /*
>> + * This is a panic case. Reason is that, we
>> + * have the CR5 bit set, but we are not in
>> + * local_* code and we are returning to Kernel.
>> + */
>> + ld r3,_NIP(r1)
>> + mfcr r4
>> + EMIT_BUG_ENTRY lookup_ex_table, __FILE__,__LINE__,BUGFLAG_WARNING
>> +
>> + /*
>> + * Now save the return fixup address as NIP
>> + */
>> +62: ld r4,8(r3)
>> + std r4,_NIP(r1)
>> + crclr 22
>> +63: ld r3,_MSR(r1)
>> ld r4,_CTR(r1)
>> ld r0,_LINK(r1)
>> mtctr r4
>> diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
>> index 72e783e..edb75a9 100644
>> --- a/arch/powerpc/kernel/exceptions-64s.S
>> +++ b/arch/powerpc/kernel/exceptions-64s.S
>> @@ -637,7 +637,7 @@ masked_##_H##interrupt: \
>> rldicl r10,r10,48,1; /* clear MSR_EE */ \
>> rotldi r10,r10,16; \
>> mtspr SPRN_##_H##SRR1,r10; \
>> -2: mtcrf 0x80,r9; \
>> +2: mtcrf 0x90,r9; \
>> ld r9,PACA_EXGEN+EX_R9(r13); \
>> ld r10,PACA_EXGEN+EX_R10(r13); \
>> ld r11,PACA_EXGEN+EX_R11(r13); \
>> diff --git a/arch/powerpc/kernel/head_64.S b/arch/powerpc/kernel/head_64.S
>> index d48125d..02e49b3 100644
>> --- a/arch/powerpc/kernel/head_64.S
>> +++ b/arch/powerpc/kernel/head_64.S
>> @@ -347,6 +347,14 @@ __mmu_off:
>> *
>> */
>> __start_initialization_multiplatform:
>> +
>> + /*
>> + * Before we do anything, lets clear CR5 field,
>> + * so that we will have a clean start at entry
>> + */
>> + li r11,0
>> + mtcrf 0x04,r11
>> +
>> /* Make sure we are running in 64 bits mode */
>> bl enable_64b_mode
>>
>
>
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [RFC PATCH 1/2]powerpc: foundation code to handle CR5 for local_t
2014-11-28 3:15 ` Madhavan Srinivasan
@ 2014-11-28 3:21 ` Benjamin Herrenschmidt
2014-11-28 10:53 ` David Laight
0 siblings, 1 reply; 29+ messages in thread
From: Benjamin Herrenschmidt @ 2014-11-28 3:21 UTC (permalink / raw)
To: Madhavan Srinivasan; +Cc: rusty, paulus, anton, linuxppc-dev
On Fri, 2014-11-28 at 08:45 +0530, Madhavan Srinivasan wrote:
> > Can't we just unconditionally clear it, as long as we do that after we've
> > saved it? In that case, it's just a matter for the fixup code to check
> > the saved version rather than the actual CR.
> >
> I use CR bit setting in the interrupt return path to enter the fixup
> section search. If we unconditionally clear it, we will have to enter
> the fixup section for every kernel return nip right?
As I said above. Can't we look at the saved version ?
IE.
- On interrupt entry:
* Save CR to CCR(r1)
* clear CR5
- On exit
* Check CCR(r1)'s CR5 field
* restore CR
Cheers,
Ben.
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [RFC PATCH 0/2] powerpc: CR based local atomic operation implementation
2014-11-27 14:05 ` [RFC PATCH 0/2] powerpc: CR based local atomic operation implementation David Laight
@ 2014-11-28 8:27 ` Madhavan Srinivasan
2014-11-28 10:09 ` David Laight
0 siblings, 1 reply; 29+ messages in thread
From: Madhavan Srinivasan @ 2014-11-28 8:27 UTC (permalink / raw)
To: David Laight, mpe@ellerman.id.au
Cc: linuxppc-dev@lists.ozlabs.org, rusty@rustcorp.com.au,
paulus@samba.org, anton@samba.org
On Thursday 27 November 2014 07:35 PM, David Laight wrote:
> From: Madhavan Srinivasan
>> This patchset create the infrastructure to handle the CR based
>> local_* atomic operations. Local atomic operations are fast
>> and highly reentrant per CPU counters. Used for percpu
>> variable updates. Local atomic operations only guarantee
>> variable modification atomicity wrt the CPU which owns the
>> data and these needs to be executed in a preemption safe way.
>
> These are usually called 'restartable atomic sequences (RAS)'.
>
>> Here is the design of the first patch. Since local_* operations
>> are only need to be atomic to interrupts (IIUC), patch uses
>> one of the Condition Register (CR) fields as a flag variable. When
>> entering the local_*, specific bit in the CR5 field is set
>> and on exit, bit is cleared. CR bit checking is done in the
>> interrupt return path. If CR5[EQ] bit set and if we return
>> to kernel, we reset to start of local_* operation.
>
> I don't claim to be able to read ppc assembler.
> But I can't see the code that clears CR5[EQ] for the duration
> of the ISR.
I use crclr instruction at the end of the code block to clear the bit.
> Without it a nested interrupt will go through unwanted paths.
>
> There are also a lot of 'magic' constants in that assembly code.
>
All these constants are defined in asm/ppc-opcode.h
> I also wonder if it is possible to inspect the interrupted
> code to determine the start/end of the RAS block.
> (Easiest if you assume that there is a single 'write' instruction
> as the last entry in the block.)
>
So each local_* function also has entries in the __ex_table section.
IIUC, __ex_table contains pairs of addresses: if the return address is
found in the first column of the __ex_table, the corresponding address
in the second column is used to continue from.
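Conceptually, the interrupt-return fixup then does something like this
(a C sketch of what patch 1/2 implements in asm; regs stands for the
saved pt_regs on the stack):

	const struct exception_table_entry *e;

	e = search_exception_tables(regs->nip);
	if (e)
		regs->nip = e->fixup;	/* resume at the start of the local_* block */
	else
		WARN_ON(1);		/* CR5 set but NIP not inside a local_* sequence */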
> Also, how expensive is it to disable all interrupts?
>
In patch 1/2, I posted the numbers for that too.
> David
>
Regards
Maddy
^ permalink raw reply [flat|nested] 29+ messages in thread
* RE: [RFC PATCH 0/2] powerpc: CR based local atomic operation implementation
2014-11-28 8:27 ` Madhavan Srinivasan
@ 2014-11-28 10:09 ` David Laight
2014-12-01 15:35 ` Madhavan Srinivasan
0 siblings, 1 reply; 29+ messages in thread
From: David Laight @ 2014-11-28 10:09 UTC (permalink / raw)
To: 'Madhavan Srinivasan', mpe@ellerman.id.au
Cc: linuxppc-dev@lists.ozlabs.org, rusty@rustcorp.com.au,
paulus@samba.org, anton@samba.org
From: Madhavan Srinivasan
> On Thursday 27 November 2014 07:35 PM, David Laight wrote:
> > From: Madhavan Srinivasan
> >> This patchset create the infrastructure to handle the CR based
> >> local_* atomic operations. Local atomic operations are fast
> >> and highly reentrant per CPU counters. Used for percpu
> >> variable updates. Local atomic operations only guarantee
> >> variable modification atomicity wrt the CPU which owns the
> >> data and these needs to be executed in a preemption safe way.
> >
> > These are usually called 'restartable atomic sequences (RAS)'.
> >
> >> Here is the design of the first patch. Since local_* operations
> >> are only need to be atomic to interrupts (IIUC), patch uses
> >> one of the Condition Register (CR) fields as a flag variable. When
> >> entering the local_*, specific bit in the CR5 field is set
> >> and on exit, bit is cleared. CR bit checking is done in the
> >> interrupt return path. If CR5[EQ] bit set and if we return
> >> to kernel, we reset to start of local_* operation.
> >
> > I don't claim to be able to read ppc assembler.
> > But I can't see the code that clears CR5[EQ] for the duration
> > of the ISR.
> I use crclr instruction at the end of the code block to clear the bit.
>
> > Without it a nested interrupt will go through unwanted paths.

That crclr looks to be in the ISR exit path, you need one in the
isr entry path.

> >
> > There are also a lot of 'magic' constants in that assembly code.
> >
> All these constants are define in asm/ppc-opcode.h

I was thinking of the lines like:
+	ori	r3,r3,16384
This one probably deserves a comment - or something
+"3:"	PPC405_ERR77(0,%2)

> > I also wonder if it is possible to inspect the interrupted
> > code to determine the start/end of the RAS block.
> > (Easiest if you assume that there is a single 'write' instruction
> > as the last entry in the block.)
> >
> So each local_* function also have code in the __ex_table section. IIUC,
> __ex_table contains two address. So if the return address found in the
> first column of the _ex_table, use the corresponding address in the
> second column to continue from.

That really doesn't scale.
I don't know how many 1000 address pairs you table will have (and the
ones in each loadable module), but the search isn't going to be cheap.

If these sequences are restartable then they can only have one write
to memory.

Given your:
> This patch re-write the current local_* functions to CR5 based one.
> Base flow for each function is
>
> {
> 	set cr5(eq)
> 	load
> 	..
> 	store
> 	clear cr5(eq)
> }

On ISR entry:
If an ISR detects cr5(eq) set then look at the returned to instruction.
If it is 'clear cr5(eq)' do nothing.
Otherwise read backwards through the code (for a max of (say) 16 instructions)
searching for the 'set cr5(eq)' and change the return address to be that
of the instruction following the 'set cr5(eq)'.
In all cases clear cr5(eq) for the ISR itself (leave the saved value unchanged).

The you don't need a table of fault locations.

	David
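
An illustrative C model of the backward-scan fixup David describes
above (the marker encodings are passed in as parameters because the
actual crset/crclr opcodes are not spelled out here; the 16-instruction
bound is the one he suggests):

#define RAS_SCAN_MAX	16	/* max instructions to scan backwards */

static unsigned long ras_fixup_nip(unsigned long nip,
				   unsigned int insn_set_cr5eq,
				   unsigned int insn_clr_cr5eq)
{
	unsigned int *pc = (unsigned int *)nip;
	int i;

	/* Returning to 'clear cr5(eq)': the store already completed. */
	if (*pc == insn_clr_cr5eq)
		return nip;

	/* Scan backwards for 'set cr5(eq)' and restart just after it. */
	for (i = 1; i <= RAS_SCAN_MAX; i++) {
		if (pc[-i] == insn_set_cr5eq)
			return (unsigned long)&pc[-i + 1];
	}

	return nip;	/* not inside a local_* block; leave NIP alone */
}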
^ permalink raw reply [flat|nested] 29+ messages in thread
* RE: [RFC PATCH 1/2]powerpc: foundation code to handle CR5 for local_t
2014-11-28 3:21 ` Benjamin Herrenschmidt
@ 2014-11-28 10:53 ` David Laight
2014-11-30 9:01 ` Benjamin Herrenschmidt
0 siblings, 1 reply; 29+ messages in thread
From: David Laight @ 2014-11-28 10:53 UTC (permalink / raw)
To: 'Benjamin Herrenschmidt', Madhavan Srinivasan
Cc: linuxppc-dev@lists.ozlabs.org, rusty@rustcorp.com.au,
paulus@samba.org, anton@samba.org
From: Benjamin Herrenschmidt
> On Fri, 2014-11-28 at 08:45 +0530, Madhavan Srinivasan wrote:
> > > Can't we just unconditionally clear at as long as we do that after we've
> > > saved it ? In that case, it's just a matter for the fixup code to check
> > > the saved version rather than the actual CR..
> > >
> > I use CR bit setting in the interrupt return path to enter the fixup
> > section search. If we unconditionally clear it, we will have to enter
> > the fixup section for every kernel return nip right?
>
> As I said above. Can't we look at the saved version ?
>
> IE.
>
>  - On interrupt entry:
>
> 	* Save CR to CCR(r1)
> 	* clear CR5
>
>  - On exit
>
> 	* Check CCR(r1)'s CR5 field
> 	* restore CR

Actually there is no real reason why the 'fixup' can't be done
during interrupt entry.

	David
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [RFC PATCH 1/2]powerpc: foundation code to handle CR5 for local_t
2014-11-28 1:58 ` Benjamin Herrenschmidt
2014-11-28 3:00 ` Madhavan Srinivasan
@ 2014-11-28 15:57 ` Segher Boessenkool
1 sibling, 0 replies; 29+ messages in thread
From: Segher Boessenkool @ 2014-11-28 15:57 UTC (permalink / raw)
To: Benjamin Herrenschmidt
Cc: linuxppc-dev, rusty, Madhavan Srinivasan, paulus, anton
On Fri, Nov 28, 2014 at 12:58:55PM +1100, Benjamin Herrenschmidt wrote:
> > Have you tested this with (upcoming) GCC 5.0? GCC now uses CR5,
> > and it likes to use it very much, it might be more convenient to
> > use e.g. CR1 (which is allocated almost last, only before CR0).
>
> We use CR1 all over the place in your asm code. Any other suggestion ?
>
> What's the damage of -ffixed-cr5 on gcc5 ? won't it just use CR4 or 6
> instead ?
Oh, it will work fine. Not using CR5 would be more convenient so that
the register allocation for most code would not change when you use or
not use -ffixed-cr5, making code easier to read. But your point about
asm code already using the other CR fields makes CR5 a better choice
actually, because people avoided it (because the compiler did) :-)
Segher
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [RFC PATCH 1/2]powerpc: foundation code to handle CR5 for local_t
2014-11-28 2:57 ` Madhavan Srinivasan
@ 2014-11-28 16:00 ` Segher Boessenkool
0 siblings, 0 replies; 29+ messages in thread
From: Segher Boessenkool @ 2014-11-28 16:00 UTC (permalink / raw)
To: Madhavan Srinivasan; +Cc: rusty, paulus, anton, linuxppc-dev
On Fri, Nov 28, 2014 at 08:27:22AM +0530, Madhavan Srinivasan wrote:
> > Have you tested this with (upcoming) GCC 5.0? GCC now uses CR5,
> > and it likes to use it very much, it might be more convenient to
> > use e.g. CR1 (which is allocated almost last, only before CR0).
> >
> No. I did not try it with GCC5.0 But I did force kernel compilation with
> fixed-cr5 which should make GCC avoid using CR5. But i will try that today.
It should work just fine, but testing is always useful, because, you know,
"should". Thanks.
Segher
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [RFC PATCH 1/2]powerpc: foundation code to handle CR5 for local_t
2014-11-28 10:53 ` David Laight
@ 2014-11-30 9:01 ` Benjamin Herrenschmidt
0 siblings, 0 replies; 29+ messages in thread
From: Benjamin Herrenschmidt @ 2014-11-30 9:01 UTC (permalink / raw)
To: David Laight
Cc: paulus@samba.org, rusty@rustcorp.com.au, Madhavan Srinivasan,
linuxppc-dev@lists.ozlabs.org, anton@samba.org
On Fri, 2014-11-28 at 10:53 +0000, David Laight wrote:
> From: Benjamin Herrenschmidt
> > On Fri, 2014-11-28 at 08:45 +0530, Madhavan Srinivasan wrote:
> > > > Can't we just unconditionally clear at as long as we do that after we've
> > > > saved it ? In that case, it's just a matter for the fixup code to check
> > > > the saved version rather than the actual CR..
> > > >
> > > I use CR bit setting in the interrupt return path to enter the fixup
> > > section search. If we unconditionally clear it, we will have to enter
> > > the fixup section for every kernel return nip right?
> >
> > As I said above. Can't we look at the saved version ?
> >
> > IE.
> >
> > - On interrupt entry:
> >
> > * Save CR to CCR(r1)
> > * clear CR5
> >
> > - On exit
> >
> > * Check CCR(r1)'s CR5 field
> > * restore CR
>
> Actually there is no real reason why the 'fixup' can't be done
> during interrupt entry.
Other than if we crash, we get the wrong PC in the log etc... unlikely
but I tend to prefer this. Also if we ever allow something like a local
atomic on a faulting (userspace) address, we want a precise PC on entry.
Generally, we have a lot more entry path than exit path, it's easier to
keep the entry path simpler.
Cheers,
Ben.
> David
>
> _______________________________________________
> Linuxppc-dev mailing list
> Linuxppc-dev@lists.ozlabs.org
> https://lists.ozlabs.org/listinfo/linuxppc-dev
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [RFC PATCH 0/2] powerpc: CR based local atomic operation implementation
2014-11-28 10:09 ` David Laight
@ 2014-12-01 15:35 ` Madhavan Srinivasan
2014-12-01 15:53 ` David Laight
0 siblings, 1 reply; 29+ messages in thread
From: Madhavan Srinivasan @ 2014-12-01 15:35 UTC (permalink / raw)
To: David Laight, mpe@ellerman.id.au
Cc: linuxppc-dev@lists.ozlabs.org, rusty@rustcorp.com.au,
paulus@samba.org, anton@samba.org
On Friday 28 November 2014 03:39 PM, David Laight wrote:
> From: Madhavan Srinivasan
>> On Thursday 27 November 2014 07:35 PM, David Laight wrote:
>>> From: Madhavan Srinivasan
>>>> This patchset create the infrastructure to handle the CR based
>>>> local_* atomic operations. Local atomic operations are fast
>>>> and highly reentrant per CPU counters. Used for percpu
>>>> variable updates. Local atomic operations only guarantee
>>>> variable modification atomicity wrt the CPU which owns the
>>>> data and these needs to be executed in a preemption safe way.
>>>
>>> These are usually called 'restartable atomic sequences (RAS)'.
>>>
>>>> Here is the design of the first patch. Since local_* operations
>>>> are only need to be atomic to interrupts (IIUC), patch uses
>>>> one of the Condition Register (CR) fields as a flag variable. When
>>>> entering the local_*, specific bit in the CR5 field is set
>>>> and on exit, bit is cleared. CR bit checking is done in the
>>>> interrupt return path. If CR5[EQ] bit set and if we return
>>>> to kernel, we reset to start of local_* operation.
>>>
>>> I don't claim to be able to read ppc assembler.
>>> But I can't see the code that clears CR5[EQ] for the duration
>>> of the ISR.
>> I use crclr instruction at the end of the code block to clear the bit.
>>
>>> Without it a nested interrupt will go through unwanted paths.
>
> That crclr looks to be in the ISR exit path, you need one in the
> isr entry path.
In the entry path, I clear the field using the mtcr instruction.
>
>>>
>>> There are also a lot of 'magic' constants in that assembly code.
>>>
>> All these constants are define in asm/ppc-opcode.h
>
> I was thinking of the lines like:
> + ori r3,r3,16384
Ok sure. Will add a comment.
> This one probably deserves a comment - or something
> +"3:" PPC405_ERR77(0,%2)
>
>>> I also wonder if it is possible to inspect the interrupted
>>> code to determine the start/end of the RAS block.
>>> (Easiest if you assume that there is a single 'write' instruction
>>> as the last entry in the block.)
>>>
>> So each local_* function also have code in the __ex_table section. IIUC,
>> __ex_table contains two address. So if the return address found in the
>> first column of the _ex_table, use the corresponding address in the
>> second column to continue from.
>
> That really doesn't scale.
> I don't know how many 1000 address pairs you table will have (and the
> ones in each loadable module), but the search isn't going to be cheap.
>
> If these sequences are restartable then they can only have one write
> to memory.
>
May be, but i see these issues incase of insts decode path,
1) Decoding instruction may also cause a fault (in case of module) and
handling a fault at this stage toward the exit path of interrupt exit
makes me nervous
2) resulting code with lot of condition and branch (for opcode decode)
will be lot messy and may be an issue incase of maintenance,
but let me think it over again on this.
Regards
Maddy
> Given your:
>> This patch re-write the current local_* functions to CR5 based one.
>> Base flow for each function is
>>
>> {
>> set cr5(eq)
>> load
>> ..
>> store
>> clear cr5(eq)
>> }
>
> On ISR entry:
> If an ISR detects cr5(eq) set then look at the returned to instruction.
> If it is 'clear cr5(eq)' do nothing.
> Otherwise read backwards through the code (for a max of (say) 16 instructions)
> searching for the 'set cr5(eq)' and change the return address to be that
> of the instruction following the 'set cr5(eq)'.
> In all cases clear cr5(eq) for the ISR itself (leave the saved value unchanged).
>
> The you don't need a table of fault locations.
>
> David
>
^ permalink raw reply [flat|nested] 29+ messages in thread
* RE: [RFC PATCH 0/2] powerpc: CR based local atomic operation implementation
2014-12-01 15:35 ` Madhavan Srinivasan
@ 2014-12-01 15:53 ` David Laight
2014-12-18 4:18 ` Rusty Russell
0 siblings, 1 reply; 29+ messages in thread
From: David Laight @ 2014-12-01 15:53 UTC (permalink / raw)
To: 'Madhavan Srinivasan', mpe@ellerman.id.au
Cc: linuxppc-dev@lists.ozlabs.org, rusty@rustcorp.com.au,
paulus@samba.org, anton@samba.org
From: Madhavan Srinivasan [mailto:maddy@linux.vnet.ibm.com]
...
> >>> I also wonder if it is possible to inspect the interrupted
> >>> code to determine the start/end of the RAS block.
> >>> (Easiest if you assume that there is a single 'write' instruction
> >>> as the last entry in the block.)
> >>>
> >> So each local_* function also have code in the __ex_table section. IIUC,
> >> __ex_table contains two address. So if the return address found in the
> >> first column of the _ex_table, use the corresponding address in the
> >> second column to continue from.
> >
> > That really doesn't scale.
> > I don't know how many 1000 address pairs you table will have (and the
> > ones in each loadable module), but the search isn't going to be cheap.
> >
> > If these sequences are restartable then they can only have one write
> > to memory.
> >
>
> May be, but i see these issues incase of insts decode path,
>
> 1) Decoding instruction may also cause a fault (in case of module) and
> handling a fault at this stage toward the exit path of interrupt exit
> makes me nervous

It shouldn't be possible to unload a module that is interrupted by
a hardware interrupt.
An 'invalid' loadable module can cause an oops/panic anyway.

> 2) resulting code with lot of condition and branch (for opcode decode)
> will be lot messy and may be an issue incase of maintenance,

You don't need to decode the instructions.
Just look for the two specific instructions used as markers.
This is only really possible with fixed-size instructions.

It might also be that the 'interrupt entry' path is easier to
modify than the 'interrupt exit' one (fewer code paths) and
you just need to modify the 'pc' in the stack frame.
You are only interested in interrupts from kernel space.

	David
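
A hedged sketch of doing that fixup at interrupt entry, as suggested:
only kernel-mode interrupts are considered, and the saved NIP in the
stack frame is rewound in place (ras_fixup_nip() is the backward-scan
helper sketched earlier in the thread; none of this is in the posted
patch):

static void ras_fixup_on_entry(struct pt_regs *regs,
			       unsigned int insn_set_cr5eq,
			       unsigned int insn_clr_cr5eq)
{
	if (user_mode(regs))	/* only interrupts from kernel space */
		return;

	regs->nip = ras_fixup_nip(regs->nip, insn_set_cr5eq,
				  insn_clr_cr5eq);
}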
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [RFC PATCH 2/2]powerpc: rewrite local_* to use CR5 flag
2014-11-27 12:18 ` [RFC PATCH 2/2]powerpc: rewrite local_* to use CR5 flag Madhavan Srinivasan
@ 2014-12-01 18:01 ` Gabriel Paubert
2014-12-03 15:05 ` Madhavan Srinivasan
0 siblings, 1 reply; 29+ messages in thread
From: Gabriel Paubert @ 2014-12-01 18:01 UTC (permalink / raw)
To: Madhavan Srinivasan; +Cc: rusty, paulus, anton, linuxppc-dev
On Thu, Nov 27, 2014 at 05:48:41PM +0530, Madhavan Srinivasan wrote:
> This patch re-write the current local_* functions to CR5 based one.
> Base flow for each function is
>
> {
> set cr5(eq)
> load
> ..
> store
> clear cr5(eq)
> }
>
> Above set of instructions are followed by a fixup section which points
> to the entry of the function incase of interrupt in the flow. If the
> interrupt happens to be after the store, we just continue to last
> instruction in that block.
>
> Currently only asm/local.h has been rewrite, and local64 is TODO.
> Also the entire change is only for PPC64.
>
> Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
> ---
> arch/powerpc/include/asm/local.h | 306 +++++++++++++++++++++++++++++++++++++++
> 1 file changed, 306 insertions(+)
>
> diff --git a/arch/powerpc/include/asm/local.h b/arch/powerpc/include/asm/local.h
> index b8da913..a26e5d3 100644
> --- a/arch/powerpc/include/asm/local.h
> +++ b/arch/powerpc/include/asm/local.h
> @@ -11,6 +11,310 @@ typedef struct
>
> #define LOCAL_INIT(i) { ATOMIC_LONG_INIT(i) }
>
> +#ifdef CONFIG_PPC64
> +
> +static __inline__ long local_read(local_t *l)
> +{
> + long t;
> +
> + __asm__ __volatile__(
> +"1: crset 22\n"
> +"2:" PPC_LL" %0,0(%1)\n"
> +"3: crclr 22\n"
> +"4:\n"
> +" .section __ex_table,\"a\"\n"
> + PPC_LONG_ALIGN "\n"
> + PPC_LONG "2b,1b\n"
> + PPC_LONG "3b,3b\n"
> +" .previous\n"
> + : "=&r" (t)
> + : "r" (&(l->a.counter)));
> +
> + return t;
> +}
> +
> +static __inline__ void local_set(local_t *l, long i)
> +{
> + long t;
> +
> + __asm__ __volatile__(
> +"1: crset 22\n"
> +"2:" PPC_LL" %0,0(%1)\n"
> +"3:" PPC405_ERR77(0,%2)
> +"4:" PPC_STL" %0,0(%2)\n"
> +"5: crclr 22\n"
> +"6:\n"
> +" .section __ex_table,\"a\"\n"
> + PPC_LONG_ALIGN "\n"
> + PPC_LONG "2b,1b\n"
> + PPC_LONG "3b,1b\n"
> + PPC_LONG "4b,1b\n"
> + PPC_LONG "5b,5b\n"
> +" .previous\n"
> + : "=&r" (t)
> + : "r" (&(i)), "r" (&(l->a.counter)));
> +}
> +
Apart from the other comments on bloat which can very likely
be removed by tracing backwards for a few instructions, removing
the exception table entries which are 2 or 4 (64 bit?) times as
large as the instruction sequence, I don't understand at all why
you need these sequences for the local_read and local_set functions.
After all these are single instructions (why do you perform a read before
the write in set when the result of the read is never used?).
I believe read and set are better mapped to access_once (or assign_once
or whatever it's called after the recent discussion on linux-kernel).
You don't even need a memory barrier if it's for a single thread,
so you could get away with a single volatile access to the variables.
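A sketch of that simplification (ACCESS_ONCE() is the v3.18-era helper
from <linux/compiler.h>; this is the suggested alternative, not what
the posted patch does):

static __inline__ long local_read(local_t *l)
{
	return ACCESS_ONCE(l->a.counter);
}

static __inline__ void local_set(local_t *l, long i)
{
	ACCESS_ONCE(l->a.counter) = i;
}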
For the other ones, I think that what you do is correct, except that
the workaround for PPC405 erratum 77 is not needed since this erratum
only affects the stwcx. instruction and the whole point of the patch
is to avoid the use of an l?arx/st?cx. pair.
Regards,
Gabriel
> +static __inline__ void local_add(long i, local_t *l)
> +{
> + long t;
> +
> + __asm__ __volatile__(
> +"1: crset 22\n"
> +"2:" PPC_LL" %0,0(%2)\n"
> +"3: add %0,%1,%0\n"
> +"4:" PPC405_ERR77(0,%2)
> +"5:" PPC_STL" %0,0(%2)\n"
> +"6: crclr 22\n"
> +"7:\n"
> +" .section __ex_table,\"a\"\n"
> + PPC_LONG_ALIGN "\n"
> + PPC_LONG "2b,1b\n"
> + PPC_LONG "3b,1b\n"
> + PPC_LONG "4b,1b\n"
> + PPC_LONG "5b,1b\n"
> + PPC_LONG "6b,6b\n"
> +" .previous\n"
> + : "=&r" (t)
> + : "r" (i), "r" (&(l->a.counter)));
> +}
> +
> +static __inline__ void local_sub(long i, local_t *l)
> +{
> + long t;
> +
> + __asm__ __volatile__(
> +"1: crset 22\n"
> +"2:" PPC_LL" %0,0(%2)\n"
> +"3: subf %0,%1,%0\n"
> +"4:" PPC405_ERR77(0,%2)
> +"5:" PPC_STL" %0,0(%2)\n"
> +"6: crclr 22\n"
> +"7:\n"
> +" .section __ex_table,\"a\"\n"
> + PPC_LONG_ALIGN "\n"
> + PPC_LONG "2b,1b\n"
> + PPC_LONG "3b,1b\n"
> + PPC_LONG "4b,1b\n"
> + PPC_LONG "5b,1b\n"
> + PPC_LONG "6b,6b\n"
> +" .previous\n"
> + : "=&r" (t)
> + : "r" (i), "r" (&(l->a.counter)));
> +}
> +
> +static __inline__ long local_add_return(long a, local_t *l)
> +{
> + long t;
> +
> + __asm__ __volatile__(
> +"1: crset 22\n"
> +"2:" PPC_LL" %0,0(%2)\n"
> +"3: add %0,%1,%0\n"
> +"4:" PPC405_ERR77(0,%2)
> +"5:" PPC_STL "%0,0(%2)\n"
> +"6: crclr 22\n"
> +"7:\n"
> +" .section __ex_table,\"a\"\n"
> + PPC_LONG_ALIGN "\n"
> + PPC_LONG "2b,1b\n"
> + PPC_LONG "3b,1b\n"
> + PPC_LONG "4b,1b\n"
> + PPC_LONG "5b,1b\n"
> + PPC_LONG "6b,6b\n"
> +" .previous\n"
> + : "=&r" (t)
> + : "r" (a), "r" (&(l->a.counter))
> + : "cc", "memory");
> +
> + return t;
> +}
> +
> +
> +#define local_add_negative(a, l) (local_add_return((a), (l)) < 0)
> +
> +static __inline__ long local_sub_return(long a, local_t *l)
> +{
> + long t;
> +
> + __asm__ __volatile__(
> +"1: crset 22\n"
> +"2:" PPC_LL" %0,0(%2)\n"
> +"3: subf %0,%1,%0\n"
> +"4:" PPC405_ERR77(0,%2)
> +"5:" PPC_STL "%0,0(%2)\n"
> +"6: crclr 22\n"
> +"7:\n"
> +" .section __ex_table,\"a\"\n"
> + PPC_LONG_ALIGN "\n"
> + PPC_LONG "2b,1b\n"
> + PPC_LONG "3b,1b\n"
> + PPC_LONG "4b,1b\n"
> + PPC_LONG "5b,1b\n"
> + PPC_LONG "6b,6b\n"
> +" .previous\n"
> + : "=&r" (t)
> + : "r" (a), "r" (&(l->a.counter))
> + : "cc", "memory");
> +
> + return t;
> +}
> +
> +static __inline__ long local_inc_return(local_t *l)
> +{
> + long t;
> +
> + __asm__ __volatile__(
> +"1: crset 22\n"
> +"2:" PPC_LL" %0,0(%1)\n"
> +"3: addic %0,%0,1\n"
> +"4:" PPC405_ERR77(0,%1)
> +"5:" PPC_STL "%0,0(%1)\n"
> +"6: crclr 22\n"
> +"7:\n"
> +" .section __ex_table,\"a\"\n"
> + PPC_LONG_ALIGN "\n"
> + PPC_LONG "2b,1b\n"
> + PPC_LONG "3b,1b\n"
> + PPC_LONG "4b,1b\n"
> + PPC_LONG "5b,1b\n"
> + PPC_LONG "6b,6b\n"
> +" .previous"
> + : "=&r" (t)
> + : "r" (&(l->a.counter))
> + : "cc", "xer", "memory");
> +
> + return t;
> +}
> +
> +/*
> + * local_inc_and_test - increment and test
> + * @l: pointer of type local_t
> + *
> + * Atomically increments @l by 1
> + * and returns true if the result is zero, or false for all
> + * other cases.
> + */
> +#define local_inc_and_test(l) (local_inc_return(l) == 0)
> +
> +static __inline__ long local_dec_return(local_t *l)
> +{
> + long t;
> +
> + __asm__ __volatile__(
> +"1: crset 22\n"
> +"2:" PPC_LL" %0,0(%1)\n"
> +"3: addic %0,%0,-1\n"
> +"4:" PPC405_ERR77(0,%1)
> +"5:" PPC_STL "%0,0(%1)\n"
> +"6: crclr 22\n"
> +"7:\n"
> +" .section __ex_table,\"a\"\n"
> + PPC_LONG_ALIGN "\n"
> + PPC_LONG "2b,1b\n"
> + PPC_LONG "3b,1b\n"
> + PPC_LONG "4b,1b\n"
> + PPC_LONG "5b,1b\n"
> + PPC_LONG "6b,6b\n"
> +" .previous\n"
> + : "=&r" (t)
> + : "r" (&(l->a.counter))
> + : "cc", "xer", "memory");
> +
> + return t;
> +}
> +
> +#define local_inc(l) local_inc_return(l)
> +#define local_dec(l) local_dec_return(l)
> +
> +#define local_cmpxchg(l, o, n) \
> + (cmpxchg_local(&((l)->a.counter), (o), (n)))
> +#define local_xchg(l, n) (xchg_local(&((l)->a.counter), (n)))
> +
> +/**
> + * local_add_unless - add unless the number is a given value
> + * @l: pointer of type local_t
> + * @a: the amount to add to v...
> + * @u: ...unless v is equal to u.
> + *
> + * Atomically adds @a to @l, so long as it was not @u.
> + * Returns non-zero if @l was not @u, and zero otherwise.
> + */
> +static __inline__ int local_add_unless(local_t *l, long a, long u)
> +{
> + long t;
> +
> + __asm__ __volatile__ (
> +"1: crset 22\n"
> +"2:" PPC_LL" %0,0(%1)\n"
> +"3: cmpw 0,%0,%3 \n"
> +"4: beq- 9f \n"
> +"5: add %0,%2,%0 \n"
> +"6:" PPC405_ERR77(0,%1)
> +"7:" PPC_STL" %0,0(%1) \n"
> +"8: subf %0,%2,%0 \n"
> +"9: crclr 22\n"
> +"10:\n"
> +" .section __ex_table,\"a\"\n"
> + PPC_LONG_ALIGN "\n"
> + PPC_LONG "2b,1b\n"
> + PPC_LONG "3b,1b\n"
> + PPC_LONG "4b,1b\n"
> + PPC_LONG "5b,1b\n"
> + PPC_LONG "6b,1b\n"
> + PPC_LONG "7b,1b\n"
> + PPC_LONG "8b,8b\n"
> + PPC_LONG "9b,9b\n"
> +" .previous\n"
> + : "=&r" (t)
> + : "r" (&(l->a.counter)), "r" (a), "r" (u)
> + : "cc", "memory");
> +
> + return t != u;
> +}
> +
> +#define local_inc_not_zero(l) local_add_unless((l), 1, 0)
> +
> +#define local_sub_and_test(a, l) (local_sub_return((a), (l)) == 0)
> +#define local_dec_and_test(l) (local_dec_return((l)) == 0)
> +
> +/*
> + * Atomically test *l and decrement if it is greater than 0.
> + * The function returns the old value of *l minus 1.
> + */
> +static __inline__ long local_dec_if_positive(local_t *l)
> +{
> + long t;
> +
> + __asm__ __volatile__(
> +"1: crset 22\n"
> +"2:" PPC_LL" %0,0(%1)\n"
> +"3: cmpwi %0,1\n"
> +"4: addi %0,%0,-1\n"
> +"5: blt- 8f\n"
> +"6:" PPC405_ERR77(0,%1)
> +"7:" PPC_STL "%0,0(%1)\n"
> +"8: crclr 22\n"
> +"9:\n"
> +" .section__ex_table,\"a\"\n"
> + PPC_LONG_ALIGN "\n"
> + PPC_LONG "2b,1b\n"
> + PPC_LONG "3b,1b\n"
> + PPC_LONG "4b,1b\n"
> + PPC_LONG "5b,1b\n"
> + PPC_LONG "6b,1b\n"
> + PPC_LONG "7b,1b\n"
> + PPC_LONG "8b,8b\n"
> +" .previous\n"
> + : "=&b" (t)
> + : "r" (&(l->a.counter))
> + : "cc", "memory");
> +
> + return t;
> +}
> +
> +#else
> +
> #define local_read(l) atomic_long_read(&(l)->a)
> #define local_set(l,i) atomic_long_set(&(l)->a, (i))
>
> @@ -162,6 +466,8 @@ static __inline__ long local_dec_if_positive(local_t *l)
> return t;
> }
>
> +#endif
> +
> /* Use these for per-cpu local_t variables: on some archs they are
> * much more efficient than these naive implementations. Note they take
> * a variable, not an address.
> --
> 1.9.1
>
> _______________________________________________
> Linuxppc-dev mailing list
> Linuxppc-dev@lists.ozlabs.org
> https://lists.ozlabs.org/listinfo/linuxppc-dev
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [RFC PATCH 1/2]powerpc: foundation code to handle CR5 for local_t
2014-11-27 12:18 ` [RFC PATCH 1/2]powerpc: foundation code to handle CR5 for local_t Madhavan Srinivasan
2014-11-27 16:56 ` Segher Boessenkool
2014-11-28 0:56 ` Benjamin Herrenschmidt
@ 2014-12-01 21:35 ` Gabriel Paubert
2014-12-03 14:59 ` Madhavan Srinivasan
2014-12-02 2:04 ` Scott Wood
3 siblings, 1 reply; 29+ messages in thread
From: Gabriel Paubert @ 2014-12-01 21:35 UTC (permalink / raw)
To: Madhavan Srinivasan; +Cc: rusty, paulus, anton, linuxppc-dev
On Thu, Nov 27, 2014 at 05:48:40PM +0530, Madhavan Srinivasan wrote:
> This patch create the infrastructure to handle the CR based
> local_* atomic operations. Local atomic operations are fast
> and highly reentrant per CPU counters. Used for percpu
> variable updates. Local atomic operations only guarantee
> variable modification atomicity wrt the CPU which owns the
> data and these needs to be executed in a preemption safe way.
>
> Here is the design of this patch. Since local_* operations
> are only need to be atomic to interrupts (IIUC), patch uses
> one of the Condition Register (CR) fields as a flag variable. When
> entering the local_*, specific bit in the CR5 field is set
> and on exit, bit is cleared. CR bit checking is done in the
> interrupt return path. If CR5[EQ] bit set and if we return
> to kernel, we reset to start of local_* operation.
>
> Reason for this approach is that, currently l[w/d]arx/st[w/d]cx.
> instruction pair is used for local_* operations, which are heavy
> on cycle count and they dont support a local variant. So to
> see whether the new implementation helps, used a modified
> version of Rusty's benchmark code on local_t.
>
> https://lkml.org/lkml/2008/12/16/450
>
> Modifications:
> - increated the working set size from 1MB to 8MB,
> - removed cpu_local_inc test.
>
> Test ran
> - on POWER8 1S Scale out System 2.0GHz
> - on OPAL v3 with v3.18-rc4 patch kernel as Host
>
> Here are the values with the patch.
>
> Time in ns per iteration
>
> inc add read add_return
> atomic_long 67 67 18 69
> irqsave/rest 39 39 23 39
> trivalue 39 39 29 49
> local_t 26 26 24 26
>
> Since CR5 is used as a flag, have added CFLAGS to avoid CR5
> for the kernel compilation and CR5 is zeroed at the kernel
> entry.
>
> Tested the patch in a
> - pSeries LPAR,
> - Host with patched/unmodified guest kernel
>
> To check whether userspace see any CR5 corruption, ran a simple
> test which does,
> - set CR5 field,
> - while(1)
> - sleep or gettimeofday
> - chk bit set
>
> Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
> ---
> - I really appreciate feedback on the patchset.
> - Kindly comment if I should try with any other benchmark or
> workload to check the numbers.
> - Also, kindly recommand any know stress test for CR
>
> Makefile | 6 ++
> arch/powerpc/include/asm/exception-64s.h | 21 +++++-
> arch/powerpc/kernel/entry_64.S | 106 ++++++++++++++++++++++++++++++-
> arch/powerpc/kernel/exceptions-64s.S | 2 +-
> arch/powerpc/kernel/head_64.S | 8 +++
> 5 files changed, 138 insertions(+), 5 deletions(-)
>
> diff --git a/Makefile b/Makefile
> index 00d618b..2e271ad 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -706,6 +706,12 @@ endif
>
> KBUILD_CFLAGS += $(call cc-option, -fno-var-tracking-assignments)
>
> +ifdef CONFIG_PPC64
> +# We need this flag to force compiler not to use CR5, since
> +# local_t type code is based on this.
> +KBUILD_CFLAGS += -ffixed-cr5
> +endif
> +
> ifdef CONFIG_DEBUG_INFO
> ifdef CONFIG_DEBUG_INFO_SPLIT
> KBUILD_CFLAGS += $(call cc-option, -gsplit-dwarf, -g)
> diff --git a/arch/powerpc/include/asm/exception-64s.h b/arch/powerpc/include/asm/exception-64s.h
> index 77f52b2..c42919a 100644
> --- a/arch/powerpc/include/asm/exception-64s.h
> +++ b/arch/powerpc/include/asm/exception-64s.h
> @@ -306,7 +306,26 @@ do_kvm_##n: \
> std r10,0(r1); /* make stack chain pointer */ \
> std r0,GPR0(r1); /* save r0 in stackframe */ \
> std r10,GPR1(r1); /* save r1 in stackframe */ \
> - beq 4f; /* if from kernel mode */ \
> +BEGIN_FTR_SECTION; \
> + lis r9,4096; /* Create a mask with HV and PR */ \
> + rldicr r9,r9,32,31; /* bits, AND with the MSR */ \
> + mr r10,r9; /* to check for Hyp state */ \
> + ori r9,r9,16384; \
> + and r9,r12,r9; \
> + cmpd cr3,r10,r9; \
> + beq cr3,66f; /* Jump if we come from Hyp mode*/ \
> + mtcrf 0x04,r10; /* Clear CR5 if coming from usr */ \
I think you can do better than this, powerpc has a fantastic set
of rotate and mask instructions. If I understand correctly your
code you can replace it with the following:
rldicl r10,r12,4,63 /* Extract HV bit to LSB of r10*/
rlwinm r9,r12,19,0x02 /* Extract PR bit to 2nd to last bit of r9 */
or r9,r9,10
cmplwi cr3,r9,1 /* Check for HV=1 and PR=0 */
beq cr3,66f
mtcrf 0x04,r10 /* Bits going to cr5 bits are 0 in r10 */
and obviously similarly for the other instances of this code sequence.
This said, mtcrf is not necessarily the fastest way of setting cr5 if
you only want to clear the EQ bit. For example comparing the stack pointer
with 0 should have the same effect.
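For illustration, a single compare into cr5 might look like this
(hedged sketch; r1 is the kernel stack pointer and is never zero, so
CR5[EQ] ends up cleared without an mtcrf touching other fields):

static __inline__ void clear_cr5_eq(void)
{
	__asm__ __volatile__("cmpdi cr5,1,0" : : : "cc");
}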
> +FTR_SECTION_ELSE; \
> + beq 4f; /* if kernel mode branch */ \
> + li r10,0; /* Clear CR5 incase of coming */ \
> + mtcrf 0x04,r10; /* from user. */ \
> + nop; /* This part of code is for */ \
> + nop; /* kernel with MSR[HV]=0, */ \
> + nop; /* MSR[PR]=0, so just chk for */ \
> + nop; /* MSR[PR] */ \
> + nop; \
> +ALT_FTR_SECTION_END_IFSET(CPU_FTR_HVMODE); \
> +66: beq 4f; /* if from kernel mode */ \
> ACCOUNT_CPU_USER_ENTRY(r9, r10); \
> SAVE_PPR(area, r9, r10); \
> 4: EXCEPTION_PROLOG_COMMON_2(area) \
> diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S
> index 0905c8d..e42bb99 100644
> --- a/arch/powerpc/kernel/entry_64.S
> +++ b/arch/powerpc/kernel/entry_64.S
> @@ -68,7 +68,26 @@ system_call_common:
> 2: std r2,GPR2(r1)
> std r3,GPR3(r1)
> mfcr r2
> - std r4,GPR4(r1)
> +BEGIN_FTR_SECTION
> + lis r10,4096
> + rldicr r10,r10,32,31
> + mr r11,r10
> + ori r10,r10,16384
> + and r10,r12,r10
> + cmpd r11,r10
> + beq 67f
> + mtcrf 0x04,r11
> +FTR_SECTION_ELSE
> + beq 67f
> + li r11,0
> + mtcrf 0x04,r11
> + nop
> + nop
> + nop
> + nop
> + nop
> +ALT_FTR_SECTION_END_IFSET(CPU_FTR_HVMODE)
> +67: std r4,GPR4(r1)
> std r5,GPR5(r1)
> std r6,GPR6(r1)
> std r7,GPR7(r1)
> @@ -224,8 +243,26 @@ syscall_exit:
> BEGIN_FTR_SECTION
> stdcx. r0,0,r1 /* to clear the reservation */
> END_FTR_SECTION_IFCLR(CPU_FTR_STCX_CHECKS_ADDRESS)
> +BEGIN_FTR_SECTION
> + lis r4,4096
> + rldicr r4,r4,32,31
> + mr r6,r4
> + ori r4,r4,16384
> + and r4,r8,r4
> + cmpd cr3,r6,r4
> + beq cr3,65f
> + mtcr r5
> +FTR_SECTION_ELSE
> andi. r6,r8,MSR_PR
> - ld r4,_LINK(r1)
> + beq 65f
> + mtcr r5
> + nop
> + nop
> + nop
> + nop
> + nop
> +ALT_FTR_SECTION_END_IFSET(CPU_FTR_HVMODE)
> +65: ld r4,_LINK(r1)
>
> beq- 1f
> ACCOUNT_CPU_USER_EXIT(r11, r12)
> @@ -234,7 +271,11 @@ END_FTR_SECTION_IFCLR(CPU_FTR_STCX_CHECKS_ADDRESS)
> 1: ld r2,GPR2(r1)
> ld r1,GPR1(r1)
> mtlr r4
> +#ifdef CONFIG_PPC64
> + mtcrf 0xFB,r5
> +#else
> mtcr r5
> +#endif
> mtspr SPRN_SRR0,r7
> mtspr SPRN_SRR1,r8
> RFI
> @@ -804,7 +845,66 @@ ALT_FTR_SECTION_END_IFCLR(CPU_FTR_STCX_CHECKS_ADDRESS)
> */
> .globl fast_exception_return
> fast_exception_return:
> - ld r3,_MSR(r1)
> +
> + /*
> + * Now that we are about to exit from interrupt, lets check for
> + * cr5 eq bit. If it is set, then we may be in the middle of
> + * local_t update. In this case, we should rewind the NIP
> + * accordingly.
> + */
> + mfcr r3
> + andi. r4,r3,0x200
> + beq 63f
I believe that someone has already mentioned that this is a very
convoluted way of testing a cr bit. This can be optimized to bne cr5,63f
> +
> + /*
> + * Now that the bit is set, lets check for return to User
> + */
> + ld r4,_MSR(r1)
> +BEGIN_FTR_SECTION
> + li r3,4096
> + rldicr r3,r3,32,31
> + mr r5,r3
> + ori r3,r3,16384
> + and r3,r4,r3
> + cmpd r5,r3
> + bne 63f
> +FTR_SECTION_ELSE
> + andi. r3,r4,MSR_PR
> + bne 63f
> + nop
> + nop
> + nop
> + nop
> + nop
> +ALT_FTR_SECTION_END_IFSET(CPU_FTR_HVMODE)
> +
> + /*
> + * Looks like we are returning to Kernel, so
> + * lets get the NIP and search the ex_table.
> + * Change the NIP based on the return value
> + */
> +lookup_ex_table:
> + ld r3,_NIP(r1)
> + bl search_exception_tables
> + cmpli 0,1,r3,0
> + bne 62f
> +
> + /*
> + * This is a panic case. Reason is that, we
> + * have the CR5 bit set, but we are not in
> + * local_* code and we are returning to Kernel.
> + */
> + ld r3,_NIP(r1)
> + mfcr r4
> + EMIT_BUG_ENTRY lookup_ex_table, __FILE__,__LINE__,BUGFLAG_WARNING
> +
> + /*
> + * Now save the return fixup address as NIP
> + */
> +62: ld r4,8(r3)
> + std r4,_NIP(r1)
> + crclr 22
> +63: ld r3,_MSR(r1)
> ld r4,_CTR(r1)
> ld r0,_LINK(r1)
> mtctr r4
> diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
> index 72e783e..edb75a9 100644
> --- a/arch/powerpc/kernel/exceptions-64s.S
> +++ b/arch/powerpc/kernel/exceptions-64s.S
> @@ -637,7 +637,7 @@ masked_##_H##interrupt: \
> rldicl r10,r10,48,1; /* clear MSR_EE */ \
> rotldi r10,r10,16; \
> mtspr SPRN_##_H##SRR1,r10; \
> -2: mtcrf 0x80,r9; \
> +2: mtcrf 0x90,r9; \
> ld r9,PACA_EXGEN+EX_R9(r13); \
> ld r10,PACA_EXGEN+EX_R10(r13); \
> ld r11,PACA_EXGEN+EX_R11(r13); \
> diff --git a/arch/powerpc/kernel/head_64.S b/arch/powerpc/kernel/head_64.S
> index d48125d..02e49b3 100644
> --- a/arch/powerpc/kernel/head_64.S
> +++ b/arch/powerpc/kernel/head_64.S
> @@ -347,6 +347,14 @@ __mmu_off:
> *
> */
> __start_initialization_multiplatform:
> +
> + /*
> + * Before we do anything, lets clear CR5 field,
> + * so that we will have a clean start at entry
> + */
> + li r11,0
> + mtcrf 0x04,r11
> +
> /* Make sure we are running in 64 bits mode */
> bl enable_64b_mode
>
Regards,
Gabriel
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [RFC PATCH 1/2]powerpc: foundation code to handle CR5 for local_t
2014-11-27 12:18 ` [RFC PATCH 1/2]powerpc: foundation code to handle CR5 for local_t Madhavan Srinivasan
` (2 preceding siblings ...)
2014-12-01 21:35 ` Gabriel Paubert
@ 2014-12-02 2:04 ` Scott Wood
2014-12-03 14:49 ` Madhavan Srinivasan
3 siblings, 1 reply; 29+ messages in thread
From: Scott Wood @ 2014-12-02 2:04 UTC (permalink / raw)
To: Madhavan Srinivasan; +Cc: rusty, paulus, anton, linuxppc-dev
On Thu, 2014-11-27 at 17:48 +0530, Madhavan Srinivasan wrote:
> - I really appreciate feedback on the patchset.
> - Kindly comment if I should try with any other benchmark or
> workload to check the numbers.
> - Also, kindly recommand any know stress test for CR
>
> Makefile | 6 ++
> arch/powerpc/include/asm/exception-64s.h | 21 +++++-
> arch/powerpc/kernel/entry_64.S | 106 ++++++++++++++++++++++++++++++-
> arch/powerpc/kernel/exceptions-64s.S | 2 +-
> arch/powerpc/kernel/head_64.S | 8 +++
> 5 files changed, 138 insertions(+), 5 deletions(-)
Patch 2/2 enables this for all PPC64, not just book3s -- so please don't
forget about the book3e exception paths (also MSR[GS] for KVM, but
aren't most if not all the places you're checking for HV mode after KVM
would have taken control? Or am I missing something about how book3s
KVM works?).
Or, if you don't want to do that, change patch 2/2 to be book3s only and
ifdef-protect the changes to common exception code.
> @@ -224,8 +243,26 @@ syscall_exit:
> BEGIN_FTR_SECTION
> stdcx. r0,0,r1 /* to clear the reservation */
> END_FTR_SECTION_IFCLR(CPU_FTR_STCX_CHECKS_ADDRESS)
> +BEGIN_FTR_SECTION
> + lis r4,4096
> + rldicr r4,r4,32,31
> + mr r6,r4
> + ori r4,r4,16384
> + and r4,r8,r4
> + cmpd cr3,r6,r4
> + beq cr3,65f
> + mtcr r5
> +FTR_SECTION_ELSE
> andi. r6,r8,MSR_PR
> - ld r4,_LINK(r1)
> + beq 65f
> + mtcr r5
> + nop
> + nop
> + nop
> + nop
> + nop
> +ALT_FTR_SECTION_END_IFSET(CPU_FTR_HVMODE)
> +65: ld r4,_LINK(r1)
>
> beq- 1f
> ACCOUNT_CPU_USER_EXIT(r11, r12)
> @@ -234,7 +271,11 @@ END_FTR_SECTION_IFCLR(CPU_FTR_STCX_CHECKS_ADDRESS)
> 1: ld r2,GPR2(r1)
> ld r1,GPR1(r1)
> mtlr r4
> +#ifdef CONFIG_PPC64
> + mtcrf 0xFB,r5
> +#else
> mtcr r5
> +#endif
mtcrf with more than one CRn being updated is expensive on Freescale
chips (and this isn't a book3s-only code path). Why do you need to do
it twice? I don't see where either r5 or cr5 are messed with between
the two places...
-Scott
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [RFC PATCH 1/2]powerpc: foundation code to handle CR5 for local_t
2014-12-02 2:04 ` Scott Wood
@ 2014-12-03 14:49 ` Madhavan Srinivasan
0 siblings, 0 replies; 29+ messages in thread
From: Madhavan Srinivasan @ 2014-12-03 14:49 UTC (permalink / raw)
To: Scott Wood; +Cc: rusty, paulus, anton, linuxppc-dev
On Tuesday 02 December 2014 07:34 AM, Scott Wood wrote:
> On Thu, 2014-11-27 at 17:48 +0530, Madhavan Srinivasan wrote:
>> - I really appreciate feedback on the patchset.
>> - Kindly comment if I should try with any other benchmark or
>> workload to check the numbers.
>> - Also, kindly recommand any know stress test for CR
>>
>> Makefile | 6 ++
>> arch/powerpc/include/asm/exception-64s.h | 21 +++++-
>> arch/powerpc/kernel/entry_64.S | 106 ++++++++++++++++++++++++++++++-
>> arch/powerpc/kernel/exceptions-64s.S | 2 +-
>> arch/powerpc/kernel/head_64.S | 8 +++
>> 5 files changed, 138 insertions(+), 5 deletions(-)
>
> Patch 2/2 enables this for all PPC64, not just book3s -- so please don't
> forget about the book3e exception paths (also MSR[GS] for KVM, but
> aren't most if not all the places you're checking for HV mode after KVM
> would have taken control? Or am I missing something about how book3s
> KVM works?).
>
> Or, if you don't want to do that, change patch 2/2 to be book3s only and
> ifdef-protect the changes to common exception code.
>
Still learning the interrupt path and various configs. Was assuming
PPC64 is for server side. My bad. Will change it to book3s and also
to common code.
>> @@ -224,8 +243,26 @@ syscall_exit:
>> BEGIN_FTR_SECTION
>> stdcx. r0,0,r1 /* to clear the reservation */
>> END_FTR_SECTION_IFCLR(CPU_FTR_STCX_CHECKS_ADDRESS)
>> +BEGIN_FTR_SECTION
>> + lis r4,4096
>> + rldicr r4,r4,32,31
>> + mr r6,r4
>> + ori r4,r4,16384
>> + and r4,r8,r4
>> + cmpd cr3,r6,r4
>> + beq cr3,65f
>> + mtcr r5
>> +FTR_SECTION_ELSE
>> andi. r6,r8,MSR_PR
>> - ld r4,_LINK(r1)
>> + beq 65f
>> + mtcr r5
>> + nop
>> + nop
>> + nop
>> + nop
>> + nop
>> +ALT_FTR_SECTION_END_IFSET(CPU_FTR_HVMODE)
>> +65: ld r4,_LINK(r1)
>>
>> beq- 1f
>> ACCOUNT_CPU_USER_EXIT(r11, r12)
>> @@ -234,7 +271,11 @@ END_FTR_SECTION_IFCLR(CPU_FTR_STCX_CHECKS_ADDRESS)
>> 1: ld r2,GPR2(r1)
>> ld r1,GPR1(r1)
>> mtlr r4
>> +#ifdef CONFIG_PPC64
>> + mtcrf 0xFB,r5
>> +#else
>> mtcr r5
>> +#endif
>
> mtcrf with more than one CRn being updated is expensive on Freescale
> chips (and this isn't a book3s-only code path). Why do you need to do
> it twice? I don't see where either r5 or cr5 are messed with between
> the two places...
>
In case of returning to userspace, it will be updated twice, which can
be avoided. Will redo this.
Regards
Maddy
> -Scott
>
>
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [RFC PATCH 1/2]powerpc: foundation code to handle CR5 for local_t
2014-12-01 21:35 ` Gabriel Paubert
@ 2014-12-03 14:59 ` Madhavan Srinivasan
2014-12-03 17:07 ` Gabriel Paubert
0 siblings, 1 reply; 29+ messages in thread
From: Madhavan Srinivasan @ 2014-12-03 14:59 UTC (permalink / raw)
To: Gabriel Paubert; +Cc: rusty, paulus, anton, linuxppc-dev
On Tuesday 02 December 2014 03:05 AM, Gabriel Paubert wrote:
> On Thu, Nov 27, 2014 at 05:48:40PM +0530, Madhavan Srinivasan wrote:
>> This patch create the infrastructure to handle the CR based
>> local_* atomic operations. Local atomic operations are fast
>> and highly reentrant per CPU counters. Used for percpu
>> variable updates. Local atomic operations only guarantee
>> variable modification atomicity wrt the CPU which owns the
>> data and these needs to be executed in a preemption safe way.
>>
>> Here is the design of this patch. Since local_* operations
>> are only need to be atomic to interrupts (IIUC), patch uses
>> one of the Condition Register (CR) fields as a flag variable. When
>> entering the local_*, specific bit in the CR5 field is set
>> and on exit, bit is cleared. CR bit checking is done in the
>> interrupt return path. If CR5[EQ] bit set and if we return
>> to kernel, we reset to start of local_* operation.
>>
>> Reason for this approach is that, currently l[w/d]arx/st[w/d]cx.
>> instruction pair is used for local_* operations, which are heavy
>> on cycle count and they dont support a local variant. So to
>> see whether the new implementation helps, used a modified
>> version of Rusty's benchmark code on local_t.
>>
>> https://lkml.org/lkml/2008/12/16/450
>>
>> Modifications:
>> - increated the working set size from 1MB to 8MB,
>> - removed cpu_local_inc test.
>>
>> Test ran
>> - on POWER8 1S Scale out System 2.0GHz
>> - on OPAL v3 with v3.18-rc4 patch kernel as Host
>>
>> Here are the values with the patch.
>>
>> Time in ns per iteration
>>
>> inc add read add_return
>> atomic_long 67 67 18 69
>> irqsave/rest 39 39 23 39
>> trivalue 39 39 29 49
>> local_t 26 26 24 26
>>
>> Since CR5 is used as a flag, have added CFLAGS to avoid CR5
>> for the kernel compilation and CR5 is zeroed at the kernel
>> entry.
>>
>> Tested the patch in a
>> - pSeries LPAR,
>> - Host with patched/unmodified guest kernel
>>
>> To check whether userspace see any CR5 corruption, ran a simple
>> test which does,
>> - set CR5 field,
>> - while(1)
>> - sleep or gettimeofday
>> - chk bit set
>>
>> Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
>> ---
>> - I really appreciate feedback on the patchset.
>> - Kindly comment if I should try with any other benchmark or
>> workload to check the numbers.
>> - Also, kindly recommand any know stress test for CR
>>
>> Makefile | 6 ++
>> arch/powerpc/include/asm/exception-64s.h | 21 +++++-
>> arch/powerpc/kernel/entry_64.S | 106 ++++++++++++++++++++++++++++++-
>> arch/powerpc/kernel/exceptions-64s.S | 2 +-
>> arch/powerpc/kernel/head_64.S | 8 +++
>> 5 files changed, 138 insertions(+), 5 deletions(-)
>>
>> diff --git a/Makefile b/Makefile
>> index 00d618b..2e271ad 100644
>> --- a/Makefile
>> +++ b/Makefile
>> @@ -706,6 +706,12 @@ endif
>>
>> KBUILD_CFLAGS += $(call cc-option, -fno-var-tracking-assignments)
>>
>> +ifdef CONFIG_PPC64
>> +# We need this flag to force compiler not to use CR5, since
>> +# local_t type code is based on this.
>> +KBUILD_CFLAGS += -ffixed-cr5
>> +endif
>> +
>> ifdef CONFIG_DEBUG_INFO
>> ifdef CONFIG_DEBUG_INFO_SPLIT
>> KBUILD_CFLAGS += $(call cc-option, -gsplit-dwarf, -g)
>> diff --git a/arch/powerpc/include/asm/exception-64s.h b/arch/powerpc/include/asm/exception-64s.h
>> index 77f52b2..c42919a 100644
>> --- a/arch/powerpc/include/asm/exception-64s.h
>> +++ b/arch/powerpc/include/asm/exception-64s.h
>> @@ -306,7 +306,26 @@ do_kvm_##n: \
>> std r10,0(r1); /* make stack chain pointer */ \
>> std r0,GPR0(r1); /* save r0 in stackframe */ \
>> std r10,GPR1(r1); /* save r1 in stackframe */ \
>> - beq 4f; /* if from kernel mode */ \
>> +BEGIN_FTR_SECTION; \
>> + lis r9,4096; /* Create a mask with HV and PR */ \
>> + rldicr r9,r9,32,31; /* bits, AND with the MSR */ \
>> + mr r10,r9; /* to check for Hyp state */ \
>> + ori r9,r9,16384; \
>> + and r9,r12,r9; \
>> + cmpd cr3,r10,r9; \
>> + beq cr3,66f; /* Jump if we come from Hyp mode*/ \
>> + mtcrf 0x04,r10; /* Clear CR5 if coming from usr */ \
>
> I think you can do better than this, powerpc has a fantastic set
> of rotate and mask instructions. If I understand correctly your
> code you can replace it with the following:
>
> rldicl r10,r12,4,63 /* Extract HV bit to LSB of r10*/
> rlwinm r9,r12,19,0x02 /* Extract PR bit to 2nd to last bit of r9 */
> or r9,r9,10
> cmplwi cr3,r9,1 /* Check for HV=1 and PR=0 */
> beq cr3,66f
> mtcrf 0x04,r10 /* Bits going to cr5 bits are 0 in r10 */
>
From the previous comment, at this point CR0 already has MSR[PR], and only
HV needs to be checked. So I guess there is still room for optimization.
But good to learn this sequence.
> and obviously similarly for the other instances of this code sequence.
>
> This said, mtcrf is not necessarily the fastest way of setting cr5 if
> you only want to clear the EQ bit. For example comparing the stack pointer
> with 0 should have the same effect.
>
Yes. Will try this out.
>> +FTR_SECTION_ELSE; \
>> + beq 4f; /* if kernel mode branch */ \
>> + li r10,0; /* Clear CR5 incase of coming */ \
>> + mtcrf 0x04,r10; /* from user. */ \
>> + nop; /* This part of code is for */ \
>> + nop; /* kernel with MSR[HV]=0, */ \
>> + nop; /* MSR[PR]=0, so just chk for */ \
>> + nop; /* MSR[PR] */ \
>> + nop; \
>> +ALT_FTR_SECTION_END_IFSET(CPU_FTR_HVMODE); \
>> +66: beq 4f; /* if from kernel mode */ \
>> ACCOUNT_CPU_USER_ENTRY(r9, r10); \
>> SAVE_PPR(area, r9, r10); \
>> 4: EXCEPTION_PROLOG_COMMON_2(area) \
>> diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S
>> index 0905c8d..e42bb99 100644
>> --- a/arch/powerpc/kernel/entry_64.S
>> +++ b/arch/powerpc/kernel/entry_64.S
>> @@ -68,7 +68,26 @@ system_call_common:
>> 2: std r2,GPR2(r1)
>> std r3,GPR3(r1)
>> mfcr r2
>> - std r4,GPR4(r1)
>> +BEGIN_FTR_SECTION
>> + lis r10,4096
>> + rldicr r10,r10,32,31
>> + mr r11,r10
>> + ori r10,r10,16384
>> + and r10,r12,r10
>> + cmpd r11,r10
>> + beq 67f
>> + mtcrf 0x04,r11
>> +FTR_SECTION_ELSE
>> + beq 67f
>> + li r11,0
>> + mtcrf 0x04,r11
>> + nop
>> + nop
>> + nop
>> + nop
>> + nop
>> +ALT_FTR_SECTION_END_IFSET(CPU_FTR_HVMODE)
>> +67: std r4,GPR4(r1)
>> std r5,GPR5(r1)
>> std r6,GPR6(r1)
>> std r7,GPR7(r1)
>> @@ -224,8 +243,26 @@ syscall_exit:
>> BEGIN_FTR_SECTION
>> stdcx. r0,0,r1 /* to clear the reservation */
>> END_FTR_SECTION_IFCLR(CPU_FTR_STCX_CHECKS_ADDRESS)
>> +BEGIN_FTR_SECTION
>> + lis r4,4096
>> + rldicr r4,r4,32,31
>> + mr r6,r4
>> + ori r4,r4,16384
>> + and r4,r8,r4
>> + cmpd cr3,r6,r4
>> + beq cr3,65f
>> + mtcr r5
>> +FTR_SECTION_ELSE
>> andi. r6,r8,MSR_PR
>> - ld r4,_LINK(r1)
>> + beq 65f
>> + mtcr r5
>> + nop
>> + nop
>> + nop
>> + nop
>> + nop
>> +ALT_FTR_SECTION_END_IFSET(CPU_FTR_HVMODE)
>> +65: ld r4,_LINK(r1)
>>
>> beq- 1f
>> ACCOUNT_CPU_USER_EXIT(r11, r12)
>> @@ -234,7 +271,11 @@ END_FTR_SECTION_IFCLR(CPU_FTR_STCX_CHECKS_ADDRESS)
>> 1: ld r2,GPR2(r1)
>> ld r1,GPR1(r1)
>> mtlr r4
>> +#ifdef CONFIG_PPC64
>> + mtcrf 0xFB,r5
>> +#else
>> mtcr r5
>> +#endif
>> mtspr SPRN_SRR0,r7
>> mtspr SPRN_SRR1,r8
>> RFI
>> @@ -804,7 +845,66 @@ ALT_FTR_SECTION_END_IFCLR(CPU_FTR_STCX_CHECKS_ADDRESS)
>> */
>> .globl fast_exception_return
>> fast_exception_return:
>> - ld r3,_MSR(r1)
>> +
>> + /*
>> + * Now that we are about to exit from interrupt, lets check for
>> + * cr5 eq bit. If it is set, then we may be in the middle of
>> + * local_t update. In this case, we should rewind the NIP
>> + * accordingly.
>> + */
>> + mfcr r3
>> + andi. r4,r3,0x200
>> + beq 63f
>
> I believe that someone has already mentioned that this is a very
> convoluted way of testing a cr bit. This can be optimized to bne cr5,63f
>
This. Will change it.
Regards
Maddy
>> +
>> + /*
>> + * Now that the bit is set, lets check for return to User
>> + */
>> + ld r4,_MSR(r1)
>> +BEGIN_FTR_SECTION
>> + li r3,4096
>> + rldicr r3,r3,32,31
>> + mr r5,r3
>> + ori r3,r3,16384
>> + and r3,r4,r3
>> + cmpd r5,r3
>> + bne 63f
>> +FTR_SECTION_ELSE
>> + andi. r3,r4,MSR_PR
>> + bne 63f
>> + nop
>> + nop
>> + nop
>> + nop
>> + nop
>> +ALT_FTR_SECTION_END_IFSET(CPU_FTR_HVMODE)
>> +
>> + /*
>> + * Looks like we are returning to Kernel, so
>> + * lets get the NIP and search the ex_table.
>> + * Change the NIP based on the return value
>> + */
>> +lookup_ex_table:
>> + ld r3,_NIP(r1)
>> + bl search_exception_tables
>> + cmpli 0,1,r3,0
>> + bne 62f
>> +
>> + /*
>> + * This is a panic case. Reason is that, we
>> + * have the CR5 bit set, but we are not in
>> + * local_* code and we are returning to Kernel.
>> + */
>> + ld r3,_NIP(r1)
>> + mfcr r4
>> + EMIT_BUG_ENTRY lookup_ex_table, __FILE__,__LINE__,BUGFLAG_WARNING
>> +
>> + /*
>> + * Now save the return fixup address as NIP
>> + */
>> +62: ld r4,8(r3)
>> + std r4,_NIP(r1)
>> + crclr 22
>> +63: ld r3,_MSR(r1)
>> ld r4,_CTR(r1)
>> ld r0,_LINK(r1)
>> mtctr r4
>> diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
>> index 72e783e..edb75a9 100644
>> --- a/arch/powerpc/kernel/exceptions-64s.S
>> +++ b/arch/powerpc/kernel/exceptions-64s.S
>> @@ -637,7 +637,7 @@ masked_##_H##interrupt: \
>> rldicl r10,r10,48,1; /* clear MSR_EE */ \
>> rotldi r10,r10,16; \
>> mtspr SPRN_##_H##SRR1,r10; \
>> -2: mtcrf 0x80,r9; \
>> +2: mtcrf 0x90,r9; \
>> ld r9,PACA_EXGEN+EX_R9(r13); \
>> ld r10,PACA_EXGEN+EX_R10(r13); \
>> ld r11,PACA_EXGEN+EX_R11(r13); \
>> diff --git a/arch/powerpc/kernel/head_64.S b/arch/powerpc/kernel/head_64.S
>> index d48125d..02e49b3 100644
>> --- a/arch/powerpc/kernel/head_64.S
>> +++ b/arch/powerpc/kernel/head_64.S
>> @@ -347,6 +347,14 @@ __mmu_off:
>> *
>> */
>> __start_initialization_multiplatform:
>> +
>> + /*
>> + * Before we do anything, lets clear CR5 field,
>> + * so that we will have a clean start at entry
>> + */
>> + li r11,0
>> + mtcrf 0x04,r11
>> +
>> /* Make sure we are running in 64 bits mode */
>> bl enable_64b_mode
>>
>
>
> Regards,
> Gabriel
>
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [RFC PATCH 2/2]powerpc: rewrite local_* to use CR5 flag
2014-12-01 18:01 ` Gabriel Paubert
@ 2014-12-03 15:05 ` Madhavan Srinivasan
0 siblings, 0 replies; 29+ messages in thread
From: Madhavan Srinivasan @ 2014-12-03 15:05 UTC (permalink / raw)
To: Gabriel Paubert; +Cc: rusty, paulus, anton, linuxppc-dev
On Monday 01 December 2014 11:31 PM, Gabriel Paubert wrote:
> On Thu, Nov 27, 2014 at 05:48:41PM +0530, Madhavan Srinivasan wrote:
>> This patch re-write the current local_* functions to CR5 based one.
>> Base flow for each function is
>>
>> {
>> set cr5(eq)
>> load
>> ..
>> store
>> clear cr5(eq)
>> }
>>
>> Above set of instructions are followed by a fixup section which points
>> to the entry of the function incase of interrupt in the flow. If the
>> interrupt happens to be after the store, we just continue to last
>> instruction in that block.
>>
>> Currently only asm/local.h has been rewrite, and local64 is TODO.
>> Also the entire change is only for PPC64.
>>
>> Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
>> ---
>> arch/powerpc/include/asm/local.h | 306 +++++++++++++++++++++++++++++++++++++++
>> 1 file changed, 306 insertions(+)
>>
>> diff --git a/arch/powerpc/include/asm/local.h b/arch/powerpc/include/asm/local.h
>> index b8da913..a26e5d3 100644
>> --- a/arch/powerpc/include/asm/local.h
>> +++ b/arch/powerpc/include/asm/local.h
>> @@ -11,6 +11,310 @@ typedef struct
>>
>> #define LOCAL_INIT(i) { ATOMIC_LONG_INIT(i) }
>>
>> +#ifdef CONFIG_PPC64
>> +
>> +static __inline__ long local_read(local_t *l)
>> +{
>> + long t;
>> +
>> + __asm__ __volatile__(
>> +"1: crset 22\n"
>> +"2:" PPC_LL" %0,0(%1)\n"
>> +"3: crclr 22\n"
>> +"4:\n"
>> +" .section __ex_table,\"a\"\n"
>> + PPC_LONG_ALIGN "\n"
>> + PPC_LONG "2b,1b\n"
>> + PPC_LONG "3b,3b\n"
>> +" .previous\n"
>> + : "=&r" (t)
>> + : "r" (&(l->a.counter)));
>> +
>> + return t;
>> +}
>> +
>> +static __inline__ void local_set(local_t *l, long i)
>> +{
>> + long t;
>> +
>> + __asm__ __volatile__(
>> +"1: crset 22\n"
>> +"2:" PPC_LL" %0,0(%1)\n"
>> +"3:" PPC405_ERR77(0,%2)
>> +"4:" PPC_STL" %0,0(%2)\n"
>> +"5: crclr 22\n"
>> +"6:\n"
>> +" .section __ex_table,\"a\"\n"
>> + PPC_LONG_ALIGN "\n"
>> + PPC_LONG "2b,1b\n"
>> + PPC_LONG "3b,1b\n"
>> + PPC_LONG "4b,1b\n"
>> + PPC_LONG "5b,5b\n"
>> +" .previous\n"
>> + : "=&r" (t)
>> + : "r" (&(i)), "r" (&(l->a.counter)));
>> +}
>> +
>
> Apart from the other comments on bloat which can very likely
> be removed by tracing backwards for a few instructions, removing
> the exception table entries which are 2 or 4 (64 bit?) times as
> large as the instruction sequence, I don't understand at all why
> you need these sequences for the local_read and local_set functions.
>
> After all these are single instructions (why do you perform a read before
> the write in set when the result of the read is never used?).
>
Agreed. But the benchmark I used also checked local_read timing,
so for consistency I re-wrote these functions. But I will drop the
changes for these functions in the next version.
> I believe read and set are better mapped to access_once (or assign_once
> or whatever it's called after the recent discussion on linux-kernel).
> You don't even need a memory barrier if it's for a single thread,
> so you could get away with a single volatile access to the variables.
>
OK.
> For the other ones, I think that what you do is correct, except that
> the workaround for PPC405 erratum 77 is not needed since this erratum
> only affects the stwcx. instruction and the whole point of the patch
> is to avoid the use of an l?arx/st?cx. pair.
Yes. Will remove it.
With regards
Maddy
>
> Regards,
> Gabriel
>
>> +static __inline__ void local_add(long i, local_t *l)
>> +{
>> + long t;
>> +
>> + __asm__ __volatile__(
>> +"1: crset 22\n"
>> +"2:" PPC_LL" %0,0(%2)\n"
>> +"3: add %0,%1,%0\n"
>> +"4:" PPC405_ERR77(0,%2)
>> +"5:" PPC_STL" %0,0(%2)\n"
>> +"6: crclr 22\n"
>> +"7:\n"
>> +" .section __ex_table,\"a\"\n"
>> + PPC_LONG_ALIGN "\n"
>> + PPC_LONG "2b,1b\n"
>> + PPC_LONG "3b,1b\n"
>> + PPC_LONG "4b,1b\n"
>> + PPC_LONG "5b,1b\n"
>> + PPC_LONG "6b,6b\n"
>> +" .previous\n"
>> + : "=&r" (t)
>> + : "r" (i), "r" (&(l->a.counter)));
>> +}
>> +
>> +static __inline__ void local_sub(long i, local_t *l)
>> +{
>> + long t;
>> +
>> + __asm__ __volatile__(
>> +"1: crset 22\n"
>> +"2:" PPC_LL" %0,0(%2)\n"
>> +"3: subf %0,%1,%0\n"
>> +"4:" PPC405_ERR77(0,%2)
>> +"5:" PPC_STL" %0,0(%2)\n"
>> +"6: crclr 22\n"
>> +"7:\n"
>> +" .section __ex_table,\"a\"\n"
>> + PPC_LONG_ALIGN "\n"
>> + PPC_LONG "2b,1b\n"
>> + PPC_LONG "3b,1b\n"
>> + PPC_LONG "4b,1b\n"
>> + PPC_LONG "5b,1b\n"
>> + PPC_LONG "6b,6b\n"
>> +" .previous\n"
>> + : "=&r" (t)
>> + : "r" (i), "r" (&(l->a.counter)));
>> +}
>> +
>> +static __inline__ long local_add_return(long a, local_t *l)
>> +{
>> + long t;
>> +
>> + __asm__ __volatile__(
>> +"1: crset 22\n"
>> +"2:" PPC_LL" %0,0(%2)\n"
>> +"3: add %0,%1,%0\n"
>> +"4:" PPC405_ERR77(0,%2)
>> +"5:" PPC_STL "%0,0(%2)\n"
>> +"6: crclr 22\n"
>> +"7:\n"
>> +" .section __ex_table,\"a\"\n"
>> + PPC_LONG_ALIGN "\n"
>> + PPC_LONG "2b,1b\n"
>> + PPC_LONG "3b,1b\n"
>> + PPC_LONG "4b,1b\n"
>> + PPC_LONG "5b,1b\n"
>> + PPC_LONG "6b,6b\n"
>> +" .previous\n"
>> + : "=&r" (t)
>> + : "r" (a), "r" (&(l->a.counter))
>> + : "cc", "memory");
>> +
>> + return t;
>> +}
>> +
>> +
>> +#define local_add_negative(a, l) (local_add_return((a), (l)) < 0)
>> +
>> +static __inline__ long local_sub_return(long a, local_t *l)
>> +{
>> + long t;
>> +
>> + __asm__ __volatile__(
>> +"1: crset 22\n"
>> +"2:" PPC_LL" %0,0(%2)\n"
>> +"3: subf %0,%1,%0\n"
>> +"4:" PPC405_ERR77(0,%2)
>> +"5:" PPC_STL "%0,0(%2)\n"
>> +"6: crclr 22\n"
>> +"7:\n"
>> +" .section __ex_table,\"a\"\n"
>> + PPC_LONG_ALIGN "\n"
>> + PPC_LONG "2b,1b\n"
>> + PPC_LONG "3b,1b\n"
>> + PPC_LONG "4b,1b\n"
>> + PPC_LONG "5b,1b\n"
>> + PPC_LONG "6b,6b\n"
>> +" .previous\n"
>> + : "=&r" (t)
>> + : "r" (a), "r" (&(l->a.counter))
>> + : "cc", "memory");
>> +
>> + return t;
>> +}
>> +
>> +static __inline__ long local_inc_return(local_t *l)
>> +{
>> + long t;
>> +
>> + __asm__ __volatile__(
>> +"1: crset 22\n"
>> +"2:" PPC_LL" %0,0(%1)\n"
>> +"3: addic %0,%0,1\n"
>> +"4:" PPC405_ERR77(0,%1)
>> +"5:" PPC_STL "%0,0(%1)\n"
>> +"6: crclr 22\n"
>> +"7:\n"
>> +" .section __ex_table,\"a\"\n"
>> + PPC_LONG_ALIGN "\n"
>> + PPC_LONG "2b,1b\n"
>> + PPC_LONG "3b,1b\n"
>> + PPC_LONG "4b,1b\n"
>> + PPC_LONG "5b,1b\n"
>> + PPC_LONG "6b,6b\n"
>> +" .previous"
>> + : "=&r" (t)
>> + : "r" (&(l->a.counter))
>> + : "cc", "xer", "memory");
>> +
>> + return t;
>> +}
>> +
>> +/*
>> + * local_inc_and_test - increment and test
>> + * @l: pointer of type local_t
>> + *
>> + * Atomically increments @l by 1
>> + * and returns true if the result is zero, or false for all
>> + * other cases.
>> + */
>> +#define local_inc_and_test(l) (local_inc_return(l) == 0)
>> +
>> +static __inline__ long local_dec_return(local_t *l)
>> +{
>> + long t;
>> +
>> + __asm__ __volatile__(
>> +"1: crset 22\n"
>> +"2:" PPC_LL" %0,0(%1)\n"
>> +"3: addic %0,%0,-1\n"
>> +"4:" PPC405_ERR77(0,%1)
>> +"5:" PPC_STL "%0,0(%1)\n"
>> +"6: crclr 22\n"
>> +"7:\n"
>> +" .section __ex_table,\"a\"\n"
>> + PPC_LONG_ALIGN "\n"
>> + PPC_LONG "2b,1b\n"
>> + PPC_LONG "3b,1b\n"
>> + PPC_LONG "4b,1b\n"
>> + PPC_LONG "5b,1b\n"
>> + PPC_LONG "6b,6b\n"
>> +" .previous\n"
>> + : "=&r" (t)
>> + : "r" (&(l->a.counter))
>> + : "cc", "xer", "memory");
>> +
>> + return t;
>> +}
>> +
>> +#define local_inc(l) local_inc_return(l)
>> +#define local_dec(l) local_dec_return(l)
>> +
>> +#define local_cmpxchg(l, o, n) \
>> + (cmpxchg_local(&((l)->a.counter), (o), (n)))
>> +#define local_xchg(l, n) (xchg_local(&((l)->a.counter), (n)))
>> +
>> +/**
>> + * local_add_unless - add unless the number is a given value
>> + * @l: pointer of type local_t
>> + * @a: the amount to add to v...
>> + * @u: ...unless v is equal to u.
>> + *
>> + * Atomically adds @a to @l, so long as it was not @u.
>> + * Returns non-zero if @l was not @u, and zero otherwise.
>> + */
>> +static __inline__ int local_add_unless(local_t *l, long a, long u)
>> +{
>> + long t;
>> +
>> + __asm__ __volatile__ (
>> +"1: crset 22\n"
>> +"2:" PPC_LL" %0,0(%1)\n"
>> +"3: cmpw 0,%0,%3 \n"
>> +"4: beq- 9f \n"
>> +"5: add %0,%2,%0 \n"
>> +"6:" PPC405_ERR77(0,%1)
>> +"7:" PPC_STL" %0,0(%1) \n"
>> +"8: subf %0,%2,%0 \n"
>> +"9: crclr 22\n"
>> +"10:\n"
>> +" .section __ex_table,\"a\"\n"
>> + PPC_LONG_ALIGN "\n"
>> + PPC_LONG "2b,1b\n"
>> + PPC_LONG "3b,1b\n"
>> + PPC_LONG "4b,1b\n"
>> + PPC_LONG "5b,1b\n"
>> + PPC_LONG "6b,1b\n"
>> + PPC_LONG "7b,1b\n"
>> + PPC_LONG "8b,8b\n"
>> + PPC_LONG "9b,9b\n"
>> +" .previous\n"
>> + : "=&r" (t)
>> + : "r" (&(l->a.counter)), "r" (a), "r" (u)
>> + : "cc", "memory");
>> +
>> + return t != u;
>> +}
>> +
>> +#define local_inc_not_zero(l) local_add_unless((l), 1, 0)
>> +
>> +#define local_sub_and_test(a, l) (local_sub_return((a), (l)) == 0)
>> +#define local_dec_and_test(l) (local_dec_return((l)) == 0)
>> +
>> +/*
>> + * Atomically test *l and decrement if it is greater than 0.
>> + * The function returns the old value of *l minus 1.
>> + */
>> +static __inline__ long local_dec_if_positive(local_t *l)
>> +{
>> + long t;
>> +
>> + __asm__ __volatile__(
>> +"1: crset 22\n"
>> +"2:" PPC_LL" %0,0(%1)\n"
>> +"3: cmpwi %0,1\n"
>> +"4: addi %0,%0,-1\n"
>> +"5: blt- 8f\n"
>> +"6:" PPC405_ERR77(0,%1)
>> +"7:" PPC_STL "%0,0(%1)\n"
>> +"8: crclr 22\n"
>> +"9:\n"
>> +" .section__ex_table,\"a\"\n"
>> + PPC_LONG_ALIGN "\n"
>> + PPC_LONG "2b,1b\n"
>> + PPC_LONG "3b,1b\n"
>> + PPC_LONG "4b,1b\n"
>> + PPC_LONG "5b,1b\n"
>> + PPC_LONG "6b,1b\n"
>> + PPC_LONG "7b,1b\n"
>> + PPC_LONG "8b,8b\n"
>> +" .previous\n"
>> + : "=&b" (t)
>> + : "r" (&(l->a.counter))
>> + : "cc", "memory");
>> +
>> + return t;
>> +}
>> +
>> +#else
>> +
>> #define local_read(l) atomic_long_read(&(l)->a)
>> #define local_set(l,i) atomic_long_set(&(l)->a, (i))
>>
>> @@ -162,6 +466,8 @@ static __inline__ long local_dec_if_positive(local_t *l)
>> return t;
>> }
>>
>> +#endif
>> +
>> /* Use these for per-cpu local_t variables: on some archs they are
>> * much more efficient than these naive implementations. Note they take
>> * a variable, not an address.
>> --
>> 1.9.1
>>
>> _______________________________________________
>> Linuxppc-dev mailing list
>> Linuxppc-dev@lists.ozlabs.org
>> https://lists.ozlabs.org/listinfo/linuxppc-dev
>
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [RFC PATCH 1/2]powerpc: foundation code to handle CR5 for local_t
2014-12-03 14:59 ` Madhavan Srinivasan
@ 2014-12-03 17:07 ` Gabriel Paubert
0 siblings, 0 replies; 29+ messages in thread
From: Gabriel Paubert @ 2014-12-03 17:07 UTC (permalink / raw)
To: Madhavan Srinivasan; +Cc: rusty, paulus, anton, linuxppc-dev
On Wed, Dec 03, 2014 at 08:29:37PM +0530, Madhavan Srinivasan wrote:
> On Tuesday 02 December 2014 03:05 AM, Gabriel Paubert wrote:
> > On Thu, Nov 27, 2014 at 05:48:40PM +0530, Madhavan Srinivasan wrote:
> >> diff --git a/arch/powerpc/include/asm/exception-64s.h b/arch/powerpc/include/asm/exception-64s.h
> >> index 77f52b2..c42919a 100644
> >> --- a/arch/powerpc/include/asm/exception-64s.h
> >> +++ b/arch/powerpc/include/asm/exception-64s.h
> >> @@ -306,7 +306,26 @@ do_kvm_##n: \
> >> std r10,0(r1); /* make stack chain pointer */ \
> >> std r0,GPR0(r1); /* save r0 in stackframe */ \
> >> std r10,GPR1(r1); /* save r1 in stackframe */ \
> >> - beq 4f; /* if from kernel mode */ \
> >> +BEGIN_FTR_SECTION; \
> >> + lis r9,4096; /* Create a mask with HV and PR */ \
> >> + rldicr r9,r9,32,31; /* bits, AND with the MSR */ \
> >> + mr r10,r9; /* to check for Hyp state */ \
> >> + ori r9,r9,16384; \
> >> + and r9,r12,r9; \
> >> + cmpd cr3,r10,r9; \
> >> + beq cr3,66f; /* Jump if we come from Hyp mode*/ \
> >> + mtcrf 0x04,r10; /* Clear CR5 if coming from usr */ \
> >
> > I think you can do better than this, powerpc has a fantastic set
> > of rotate and mask instructions. If I understand correctly your
> > code you can replace it with the following:
> >
> > rldicl r10,r12,4,63 /* Extract HV bit to LSB of r10*/
> > rlwinm r9,r12,19,0x02 /* Extract PR bit to 2nd to last bit of r9 */
> > or r9,r9,r10
> > cmplwi cr3,r9,1 /* Check for HV=1 and PR=0 */
> > beq cr3,66f
> > mtcrf 0x04,r10 /* Bits going to cr5 bits are 0 in r10 */
> >
>
> From the previous comment, at this point CR0 already has MSR[PR], and only
> HV needs to be checked. So I guess there is still room for optimization.
> But good to learn this sequence.
Actually the sequence I suggested can be shortened further, since the or
is not necessary and you can use an rlwimi instead.
rldicl r9,r12,5,62 /* r9 = 62 zeroes | HV | ? */
rlwimi r9,r12,18,0x01 /* r9 = 62 zeroes | HV | PR */
cmplwi cr3,r9,2 /* Check for HV=1 and PR=0 */
For 32-bit operands, rlwimi is a generic bit-field to bit-field move, but
GCC is (was, maybe GCC5 is better?) quite bad at recognizing opportunities
to use it.
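
(In C terms, and purely as an illustration that assumes the usual 64-bit MSR
bit positions, HV at bit 60 and PR at bit 14 counting from the least
significant bit, the three-instruction sequence computes roughly:)

/* Illustrative C equivalent of the rldicl/rlwimi/cmplwi test above;
 * 'msr' stands for the saved MSR value the exception code keeps in r12. */
static inline int interrupted_in_hv_kernel(unsigned long msr)
{
	unsigned long key = ((msr >> 59) & 2) |	/* MSR[HV] lands in bit 1 */
			    ((msr >> 14) & 1);	/* MSR[PR] lands in bit 0 */

	return key == 2;	/* HV = 1 and PR = 0 */
}
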
I'm not sure that it is better to have two separate branches, one for
testing PR and the other for testing HV. Through how many branches can
the hardware speculate?
Unless I'm mistaken, this is code which is likely not to be in the
cache so I would optimize it first towards minimal code size.
Regards,
Gabriel
^ permalink raw reply [flat|nested] 29+ messages in thread
* RE: [RFC PATCH 0/2] powerpc: CR based local atomic operation implementation
2014-12-01 15:53 ` David Laight
@ 2014-12-18 4:18 ` Rusty Russell
2014-12-18 9:52 ` David Laight
0 siblings, 1 reply; 29+ messages in thread
From: Rusty Russell @ 2014-12-18 4:18 UTC (permalink / raw)
To: David Laight, 'Madhavan Srinivasan', mpe@ellerman.id.au
Cc: linuxppc-dev@lists.ozlabs.org, paulus@samba.org, anton@samba.org
David Laight <David.Laight@ACULAB.COM> writes:
> From: Madhavan Srinivasan [mailto:maddy@linux.vnet.ibm.com]
> ...
>> >>> I also wonder if it is possible to inspect the interrupted
>> >>> code to determine the start/end of the RAS block.
>> >>> (Easiest if you assume that there is a single 'write' instruction
>> >>> as the last entry in the block.)
>> >>>
>> >> So each local_* function also have code in the __ex_table section. IIUC,
>> >> __ex_table contains two address. So if the return address found in the
>> >> first column of the _ex_table, use the corresponding address in the
>> >> second column to continue from.
>> >
>> > That really doesn't scale.
>> > I don't know how many 1000 address pairs you table will have (and the
>> > ones in each loadable module), but the search isn't going to be cheap.
>> >
>> > If these sequences are restartable then they can only have one write
>> > to memory.
>> >
>>
>> Maybe, but I see these issues with the instruction decode path:
>>
>> 1) Decoding the instruction may also cause a fault (in the case of a module),
>> and handling a fault at this stage, on the interrupt exit path,
>> makes me nervous
>
> It shouldn't be possible to unload a module that is interrupted by
> a hardware interrupt.
> An 'invalid' loadable module can cause an oops/panic anyway.
Yes, the module won't fault (vmalloc memory can be lazily mapped, but
we've already copied the module into there, so it won't happen).
>> 2) resulting code with lot of condition and branch (for opcode decode)
>> will be lot messy and may be an issue incase of maintenance,
>
> You don't need to decode the instructions.
> Just look for the two specific instructions used as markers.
> This is only really possible with fixed-size instructions.
>
> It might also be that the 'interrupt entry' path is easier to
> modify than the 'interrupt exit' one (fewer code paths) and
> you just need to modify the 'pc' in the stack frame.
> You are only interested in interrupts from kernel space.
It's an overoptimization for case that statistically never happens.
You won't even be able to measure the difference.
The question of bloat remains, but that's also easily measured. In
practice, I'd guess less than 1k.
Cheers,
Rusty.
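
(For readers following the alternative being debated here, this is a rough
sketch of what a table-based restart lookup on return to kernel code could
look like. The entry layout, section symbols and helper name are purely
illustrative and are not part of the posted patches:)

/* Hypothetical fixup table: one {address inside block, restart address}
 * pair per interruptible instruction in a local_* block. */
struct restart_entry {
	unsigned long insn;	/* address within a local_* block */
	unsigned long fixup;	/* address to resume execution from */
};

extern struct restart_entry __start___restart_table[];
extern struct restart_entry __stop___restart_table[];

static unsigned long fixup_local_block(unsigned long nip)
{
	struct restart_entry *e;

	/* Linear scan for clarity only; a real table would have to be
	 * sorted and binary searched, and each module would need its own,
	 * which is exactly the cost being questioned above. */
	for (e = __start___restart_table; e < __stop___restart_table; e++)
		if (e->insn == nip)
			return e->fixup;	/* rewind the interrupted op */

	return nip;				/* not inside a restartable block */
}
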
^ permalink raw reply [flat|nested] 29+ messages in thread
* RE: [RFC PATCH 0/2] powerpc: CR based local atomic operation implementation
2014-12-18 4:18 ` Rusty Russell
@ 2014-12-18 9:52 ` David Laight
2014-12-18 10:53 ` Rusty Russell
0 siblings, 1 reply; 29+ messages in thread
From: David Laight @ 2014-12-18 9:52 UTC (permalink / raw)
To: 'Rusty Russell', 'Madhavan Srinivasan',
mpe@ellerman.id.au
Cc: paulus@samba.org, linuxppc-dev@lists.ozlabs.org, anton@samba.org
From: Rusty Russell
> David Laight <David.Laight@ACULAB.COM> writes:
> > From: Madhavan Srinivasan [mailto:maddy@linux.vnet.ibm.com]
> > ...
> >> >>> I also wonder if it is possible to inspect the interrupted
> >> >>> code to determine the start/end of the RAS block.
> >> >>> (Easiest if you assume that there is a single 'write' instruction
> >> >>> as the last entry in the block.)
> >> >>>
> >> >> So each local_* function also have code in the __ex_table section. IIUC,
> >> >> __ex_table contains two address. So if the return address found in the
> >> >> first column of the _ex_table, use the corresponding address in the
> >> >> second column to continue from.
> >> >
> >> > That really doesn't scale.
> >> > I don't know how many 1000 address pairs you table will have (and the
> >> > ones in each loadable module), but the search isn't going to be cheap.
> >> >
> >> > If these sequences are restartable then they can only have one write
> >> > to memory.
...
> >> 2) resulting code with lot of condition and branch (for opcode decode)
> >> will be lot messy and may be an issue incase of maintenance,
> >
> > You don't need to decode the instructions.
> > Just look for the two specific instructions used as markers.
> > This is only really possible with fixed-size instructions.
> >
> > It might also be that the 'interrupt entry' path is easier to
> > modify than the 'interrupt exit' one (fewer code paths) and
> > you just need to modify the 'pc' in the stack frame.
> > You are only interested in interrupts from kernel space.
>
> It's an overoptimization for case that statistically never happens.
> You won't even be able to measure the difference.
>
> The question of bloat remains, but that's also easily measured.  In
> practice, I'd guess less than 1k.

IIRC they were 'static inline' so the table of addresses is generated
for every use site.
(copyin/out generates a similarly enormous table of addresses on amd64)

If they were real functions (so only appeared once) it wouldn't be as bad.
Indeed, in that case, by putting all such functions into a separate code
section a simple 'window test' can be done on the return address instead
of reserving one of the CR bits.

You also only need to save the start and end of each block, not the
restart address for every instruction within the block.

	David
^ permalink raw reply [flat|nested] 29+ messages in thread
* RE: [RFC PATCH 0/2] powerpc: CR based local atomic operation implementation
2014-12-18 9:52 ` David Laight
@ 2014-12-18 10:53 ` Rusty Russell
0 siblings, 0 replies; 29+ messages in thread
From: Rusty Russell @ 2014-12-18 10:53 UTC (permalink / raw)
To: David Laight, 'Madhavan Srinivasan', mpe@ellerman.id.au
Cc: paulus@samba.org, linuxppc-dev@lists.ozlabs.org, anton@samba.org
David Laight <David.Laight@ACULAB.COM> writes:
> From: Rusty Russell
>> David Laight <David.Laight@ACULAB.COM> writes:
>> > From: Madhavan Srinivasan [mailto:maddy@linux.vnet.ibm.com]
>> > ...
>> >> >>> I also wonder if it is possible to inspect the interrupted
>> >> >>> code to determine the start/end of the RAS block.
>> >> >>> (Easiest if you assume that there is a single 'write' instruction
>> >> >>> as the last entry in the block.)
>> >> >>>
>> >> >> So each local_* function also have code in the __ex_table section. IIUC,
>> >> >> __ex_table contains two address. So if the return address found in the
>> >> >> first column of the _ex_table, use the corresponding address in the
>> >> >> second column to continue from.
>> >> >
>> >> > That really doesn't scale.
>> >> > I don't know how many 1000 address pairs you table will have (and the
>> >> > ones in each loadable module), but the search isn't going to be cheap.
>> >> >
>> >> > If these sequences are restartable then they can only have one write
>> >> > to memory.
> ...
>> >> 2) resulting code with lot of condition and branch (for opcode decode)
>> >> will be lot messy and may be an issue incase of maintenance,
>> >
>> > You don't need to decode the instructions.
>> > Just look for the two specific instructions used as markers.
>> > This is only really possible with fixed-size instructions.
>> >
>> > It might also be that the 'interrupt entry' path is easier to
>> > modify than the 'interrupt exit' one (fewer code paths) and
>> > you just need to modify the 'pc' in the stack frame.
>> > You are only interested in interrupts from kernel space.
>>
>> It's an overoptimization for case that statistically never happens.
>> You won't even be able to measure the difference.
>>
>> The question of bloat remains, but that's also easily measured. In
>> practice, I'd guess less than 1k.
>
> IIRC they were 'static inline' so the table of addresses is generated
> for every use site.
> (copyin/out generates a similarly enormous table of addresses on amd64)
There are about 20 callers in the entire kernel.
Cheers,
Rusty.
^ permalink raw reply [flat|nested] 29+ messages in thread
end of thread, other threads: [~2014-12-18 23:39 UTC | newest]
Thread overview: 29+ messages
-- links below jump to the message on this page --
2014-11-27 12:18 [RFC PATCH 0/2] powerpc: CR based local atomic operation implementation Madhavan Srinivasan
2014-11-27 12:18 ` [RFC PATCH 1/2]powerpc: foundation code to handle CR5 for local_t Madhavan Srinivasan
2014-11-27 16:56 ` Segher Boessenkool
2014-11-28 1:58 ` Benjamin Herrenschmidt
2014-11-28 3:00 ` Madhavan Srinivasan
2014-11-28 15:57 ` Segher Boessenkool
2014-11-28 2:57 ` Madhavan Srinivasan
2014-11-28 16:00 ` Segher Boessenkool
2014-11-28 0:56 ` Benjamin Herrenschmidt
2014-11-28 3:15 ` Madhavan Srinivasan
2014-11-28 3:21 ` Benjamin Herrenschmidt
2014-11-28 10:53 ` David Laight
2014-11-30 9:01 ` Benjamin Herrenschmidt
2014-12-01 21:35 ` Gabriel Paubert
2014-12-03 14:59 ` Madhavan Srinivasan
2014-12-03 17:07 ` Gabriel Paubert
2014-12-02 2:04 ` Scott Wood
2014-12-03 14:49 ` Madhavan Srinivasan
2014-11-27 12:18 ` [RFC PATCH 2/2]powerpc: rewrite local_* to use CR5 flag Madhavan Srinivasan
2014-12-01 18:01 ` Gabriel Paubert
2014-12-03 15:05 ` Madhavan Srinivasan
2014-11-27 14:05 ` [RFC PATCH 0/2] powerpc: CR based local atomic operation implementation David Laight
2014-11-28 8:27 ` Madhavan Srinivasan
2014-11-28 10:09 ` David Laight
2014-12-01 15:35 ` Madhavan Srinivasan
2014-12-01 15:53 ` David Laight
2014-12-18 4:18 ` Rusty Russell
2014-12-18 9:52 ` David Laight
2014-12-18 10:53 ` Rusty Russell