* [RFC PATCH 0/2] powerpc: CR based local atomic operation implementation
@ 2014-11-27 12:18 Madhavan Srinivasan
2014-11-27 12:18 ` [RFC PATCH 1/2]powerpc: foundation code to handle CR5 for local_t Madhavan Srinivasan
` (2 more replies)
0 siblings, 3 replies; 29+ messages in thread
From: Madhavan Srinivasan @ 2014-11-27 12:18 UTC (permalink / raw)
To: mpe; +Cc: Madhavan Srinivasan, rusty, paulus, anton, linuxppc-dev
This patchset create the infrastructure to handle the CR based
local_* atomic operations. Local atomic operations are fast
and highly reentrant per CPU counters. Used for percpu
variable updates. Local atomic operations only guarantee
variable modification atomicity wrt the CPU which owns the
data and these needs to be executed in a preemption safe way.
Here is the design of the first patch. Since local_* operations
are only need to be atomic to interrupts (IIUC), patch uses
one of the Condition Register (CR) fields as a flag variable. When
entering the local_*, specific bit in the CR5 field is set
and on exit, bit is cleared. CR bit checking is done in the
interrupt return path. If CR5[EQ] bit set and if we return
to kernel, we reset to start of local_* operation.
Reason for this approach is that, currently l[w/d]arx/st[w/d]cx.
instruction pair is used for local_* operations, which are heavy
on cycle count and they dont support a local varient. So to
see whether the new implementation helps, Used a modified
version of Rusty's benchmark code on local_t. Have the
performance numbers in the patch commit message.
Second patch has the rewrite of the local_* functions to use
CR5 based logic. Changes are mostly in asm/local.h and only for
CONFIG_PPC64
Madhavan Srinivasan (2):
powerpc: foundation code to handle CR5 for local_t
powerpc: rewrite local_* to use CR5 flag
Makefile | 6 +
arch/powerpc/include/asm/exception-64s.h | 21 ++-
arch/powerpc/include/asm/local.h | 306 +++++++++++++++++++++++++++++++
arch/powerpc/kernel/entry_64.S | 106 ++++++++++-
arch/powerpc/kernel/exceptions-64s.S | 2 +-
arch/powerpc/kernel/head_64.S | 8 +
6 files changed, 444 insertions(+), 5 deletions(-)
--
1.9.1
^ permalink raw reply [flat|nested] 29+ messages in thread* [RFC PATCH 1/2]powerpc: foundation code to handle CR5 for local_t 2014-11-27 12:18 [RFC PATCH 0/2] powerpc: CR based local atomic operation implementation Madhavan Srinivasan @ 2014-11-27 12:18 ` Madhavan Srinivasan 2014-11-27 16:56 ` Segher Boessenkool ` (3 more replies) 2014-11-27 12:18 ` [RFC PATCH 2/2]powerpc: rewrite local_* to use CR5 flag Madhavan Srinivasan 2014-11-27 14:05 ` [RFC PATCH 0/2] powerpc: CR based local atomic operation implementation David Laight 2 siblings, 4 replies; 29+ messages in thread From: Madhavan Srinivasan @ 2014-11-27 12:18 UTC (permalink / raw) To: mpe; +Cc: Madhavan Srinivasan, rusty, paulus, anton, linuxppc-dev This patch create the infrastructure to handle the CR based local_* atomic operations. Local atomic operations are fast and highly reentrant per CPU counters. Used for percpu variable updates. Local atomic operations only guarantee variable modification atomicity wrt the CPU which owns the data and these needs to be executed in a preemption safe way. Here is the design of this patch. Since local_* operations are only need to be atomic to interrupts (IIUC), patch uses one of the Condition Register (CR) fields as a flag variable. When entering the local_*, specific bit in the CR5 field is set and on exit, bit is cleared. CR bit checking is done in the interrupt return path. If CR5[EQ] bit set and if we return to kernel, we reset to start of local_* operation. Reason for this approach is that, currently l[w/d]arx/st[w/d]cx. instruction pair is used for local_* operations, which are heavy on cycle count and they dont support a local variant. So to see whether the new implementation helps, used a modified version of Rusty's benchmark code on local_t. https://lkml.org/lkml/2008/12/16/450 Modifications: - increated the working set size from 1MB to 8MB, - removed cpu_local_inc test. Test ran - on POWER8 1S Scale out System 2.0GHz - on OPAL v3 with v3.18-rc4 patch kernel as Host Here are the values with the patch. Time in ns per iteration inc add read add_return atomic_long 67 67 18 69 irqsave/rest 39 39 23 39 trivalue 39 39 29 49 local_t 26 26 24 26 Since CR5 is used as a flag, have added CFLAGS to avoid CR5 for the kernel compilation and CR5 is zeroed at the kernel entry. Tested the patch in a - pSeries LPAR, - Host with patched/unmodified guest kernel To check whether userspace see any CR5 corruption, ran a simple test which does, - set CR5 field, - while(1) - sleep or gettimeofday - chk bit set Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> --- - I really appreciate feedback on the patchset. - Kindly comment if I should try with any other benchmark or workload to check the numbers. - Also, kindly recommand any know stress test for CR Makefile | 6 ++ arch/powerpc/include/asm/exception-64s.h | 21 +++++- arch/powerpc/kernel/entry_64.S | 106 ++++++++++++++++++++++++++++++- arch/powerpc/kernel/exceptions-64s.S | 2 +- arch/powerpc/kernel/head_64.S | 8 +++ 5 files changed, 138 insertions(+), 5 deletions(-) diff --git a/Makefile b/Makefile index 00d618b..2e271ad 100644 --- a/Makefile +++ b/Makefile @@ -706,6 +706,12 @@ endif KBUILD_CFLAGS += $(call cc-option, -fno-var-tracking-assignments) +ifdef CONFIG_PPC64 +# We need this flag to force compiler not to use CR5, since +# local_t type code is based on this. +KBUILD_CFLAGS += -ffixed-cr5 +endif + ifdef CONFIG_DEBUG_INFO ifdef CONFIG_DEBUG_INFO_SPLIT KBUILD_CFLAGS += $(call cc-option, -gsplit-dwarf, -g) diff --git a/arch/powerpc/include/asm/exception-64s.h b/arch/powerpc/include/asm/exception-64s.h index 77f52b2..c42919a 100644 --- a/arch/powerpc/include/asm/exception-64s.h +++ b/arch/powerpc/include/asm/exception-64s.h @@ -306,7 +306,26 @@ do_kvm_##n: \ std r10,0(r1); /* make stack chain pointer */ \ std r0,GPR0(r1); /* save r0 in stackframe */ \ std r10,GPR1(r1); /* save r1 in stackframe */ \ - beq 4f; /* if from kernel mode */ \ +BEGIN_FTR_SECTION; \ + lis r9,4096; /* Create a mask with HV and PR */ \ + rldicr r9,r9,32,31; /* bits, AND with the MSR */ \ + mr r10,r9; /* to check for Hyp state */ \ + ori r9,r9,16384; \ + and r9,r12,r9; \ + cmpd cr3,r10,r9; \ + beq cr3,66f; /* Jump if we come from Hyp mode*/ \ + mtcrf 0x04,r10; /* Clear CR5 if coming from usr */ \ +FTR_SECTION_ELSE; \ + beq 4f; /* if kernel mode branch */ \ + li r10,0; /* Clear CR5 incase of coming */ \ + mtcrf 0x04,r10; /* from user. */ \ + nop; /* This part of code is for */ \ + nop; /* kernel with MSR[HV]=0, */ \ + nop; /* MSR[PR]=0, so just chk for */ \ + nop; /* MSR[PR] */ \ + nop; \ +ALT_FTR_SECTION_END_IFSET(CPU_FTR_HVMODE); \ +66: beq 4f; /* if from kernel mode */ \ ACCOUNT_CPU_USER_ENTRY(r9, r10); \ SAVE_PPR(area, r9, r10); \ 4: EXCEPTION_PROLOG_COMMON_2(area) \ diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S index 0905c8d..e42bb99 100644 --- a/arch/powerpc/kernel/entry_64.S +++ b/arch/powerpc/kernel/entry_64.S @@ -68,7 +68,26 @@ system_call_common: 2: std r2,GPR2(r1) std r3,GPR3(r1) mfcr r2 - std r4,GPR4(r1) +BEGIN_FTR_SECTION + lis r10,4096 + rldicr r10,r10,32,31 + mr r11,r10 + ori r10,r10,16384 + and r10,r12,r10 + cmpd r11,r10 + beq 67f + mtcrf 0x04,r11 +FTR_SECTION_ELSE + beq 67f + li r11,0 + mtcrf 0x04,r11 + nop + nop + nop + nop + nop +ALT_FTR_SECTION_END_IFSET(CPU_FTR_HVMODE) +67: std r4,GPR4(r1) std r5,GPR5(r1) std r6,GPR6(r1) std r7,GPR7(r1) @@ -224,8 +243,26 @@ syscall_exit: BEGIN_FTR_SECTION stdcx. r0,0,r1 /* to clear the reservation */ END_FTR_SECTION_IFCLR(CPU_FTR_STCX_CHECKS_ADDRESS) +BEGIN_FTR_SECTION + lis r4,4096 + rldicr r4,r4,32,31 + mr r6,r4 + ori r4,r4,16384 + and r4,r8,r4 + cmpd cr3,r6,r4 + beq cr3,65f + mtcr r5 +FTR_SECTION_ELSE andi. r6,r8,MSR_PR - ld r4,_LINK(r1) + beq 65f + mtcr r5 + nop + nop + nop + nop + nop +ALT_FTR_SECTION_END_IFSET(CPU_FTR_HVMODE) +65: ld r4,_LINK(r1) beq- 1f ACCOUNT_CPU_USER_EXIT(r11, r12) @@ -234,7 +271,11 @@ END_FTR_SECTION_IFCLR(CPU_FTR_STCX_CHECKS_ADDRESS) 1: ld r2,GPR2(r1) ld r1,GPR1(r1) mtlr r4 +#ifdef CONFIG_PPC64 + mtcrf 0xFB,r5 +#else mtcr r5 +#endif mtspr SPRN_SRR0,r7 mtspr SPRN_SRR1,r8 RFI @@ -804,7 +845,66 @@ ALT_FTR_SECTION_END_IFCLR(CPU_FTR_STCX_CHECKS_ADDRESS) */ .globl fast_exception_return fast_exception_return: - ld r3,_MSR(r1) + + /* + * Now that we are about to exit from interrupt, lets check for + * cr5 eq bit. If it is set, then we may be in the middle of + * local_t update. In this case, we should rewind the NIP + * accordingly. + */ + mfcr r3 + andi. r4,r3,0x200 + beq 63f + + /* + * Now that the bit is set, lets check for return to User + */ + ld r4,_MSR(r1) +BEGIN_FTR_SECTION + li r3,4096 + rldicr r3,r3,32,31 + mr r5,r3 + ori r3,r3,16384 + and r3,r4,r3 + cmpd r5,r3 + bne 63f +FTR_SECTION_ELSE + andi. r3,r4,MSR_PR + bne 63f + nop + nop + nop + nop + nop +ALT_FTR_SECTION_END_IFSET(CPU_FTR_HVMODE) + + /* + * Looks like we are returning to Kernel, so + * lets get the NIP and search the ex_table. + * Change the NIP based on the return value + */ +lookup_ex_table: + ld r3,_NIP(r1) + bl search_exception_tables + cmpli 0,1,r3,0 + bne 62f + + /* + * This is a panic case. Reason is that, we + * have the CR5 bit set, but we are not in + * local_* code and we are returning to Kernel. + */ + ld r3,_NIP(r1) + mfcr r4 + EMIT_BUG_ENTRY lookup_ex_table, __FILE__,__LINE__,BUGFLAG_WARNING + + /* + * Now save the return fixup address as NIP + */ +62: ld r4,8(r3) + std r4,_NIP(r1) + crclr 22 +63: ld r3,_MSR(r1) ld r4,_CTR(r1) ld r0,_LINK(r1) mtctr r4 diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S index 72e783e..edb75a9 100644 --- a/arch/powerpc/kernel/exceptions-64s.S +++ b/arch/powerpc/kernel/exceptions-64s.S @@ -637,7 +637,7 @@ masked_##_H##interrupt: \ rldicl r10,r10,48,1; /* clear MSR_EE */ \ rotldi r10,r10,16; \ mtspr SPRN_##_H##SRR1,r10; \ -2: mtcrf 0x80,r9; \ +2: mtcrf 0x90,r9; \ ld r9,PACA_EXGEN+EX_R9(r13); \ ld r10,PACA_EXGEN+EX_R10(r13); \ ld r11,PACA_EXGEN+EX_R11(r13); \ diff --git a/arch/powerpc/kernel/head_64.S b/arch/powerpc/kernel/head_64.S index d48125d..02e49b3 100644 --- a/arch/powerpc/kernel/head_64.S +++ b/arch/powerpc/kernel/head_64.S @@ -347,6 +347,14 @@ __mmu_off: * */ __start_initialization_multiplatform: + + /* + * Before we do anything, lets clear CR5 field, + * so that we will have a clean start at entry + */ + li r11,0 + mtcrf 0x04,r11 + /* Make sure we are running in 64 bits mode */ bl enable_64b_mode -- 1.9.1 ^ permalink raw reply related [flat|nested] 29+ messages in thread
* Re: [RFC PATCH 1/2]powerpc: foundation code to handle CR5 for local_t 2014-11-27 12:18 ` [RFC PATCH 1/2]powerpc: foundation code to handle CR5 for local_t Madhavan Srinivasan @ 2014-11-27 16:56 ` Segher Boessenkool 2014-11-28 1:58 ` Benjamin Herrenschmidt 2014-11-28 2:57 ` Madhavan Srinivasan 2014-11-28 0:56 ` Benjamin Herrenschmidt ` (2 subsequent siblings) 3 siblings, 2 replies; 29+ messages in thread From: Segher Boessenkool @ 2014-11-27 16:56 UTC (permalink / raw) To: Madhavan Srinivasan; +Cc: rusty, paulus, anton, linuxppc-dev On Thu, Nov 27, 2014 at 05:48:40PM +0530, Madhavan Srinivasan wrote: > Here is the design of this patch. Since local_* operations > are only need to be atomic to interrupts (IIUC), patch uses > one of the Condition Register (CR) fields as a flag variable. When > entering the local_*, specific bit in the CR5 field is set > and on exit, bit is cleared. CR bit checking is done in the > interrupt return path. If CR5[EQ] bit set and if we return > to kernel, we reset to start of local_* operation. Have you tested this with (upcoming) GCC 5.0? GCC now uses CR5, and it likes to use it very much, it might be more convenient to use e.g. CR1 (which is allocated almost last, only before CR0). > --- a/arch/powerpc/include/asm/exception-64s.h > +++ b/arch/powerpc/include/asm/exception-64s.h > @@ -306,7 +306,26 @@ do_kvm_##n: \ > std r10,0(r1); /* make stack chain pointer */ \ > std r0,GPR0(r1); /* save r0 in stackframe */ \ > std r10,GPR1(r1); /* save r1 in stackframe */ \ > - beq 4f; /* if from kernel mode */ \ > +BEGIN_FTR_SECTION; \ > + lis r9,4096; /* Create a mask with HV and PR */ \ > + rldicr r9,r9,32,31; /* bits, AND with the MSR */ \ > + mr r10,r9; /* to check for Hyp state */ \ > + ori r9,r9,16384; \ > + and r9,r12,r9; \ > + cmpd cr3,r10,r9; \ > + beq cr3,66f; /* Jump if we come from Hyp mode*/ \ > + mtcrf 0x04,r10; /* Clear CR5 if coming from usr */ \ Wow, such nastiness, only to avoid using dot insns (since you need to keep the current CR0 value for the following beq 4f). And CR0 already holds the PR bit, so you need only to check the HV bit anyway? Some restructuring would make this a lot simpler and clearer. > + /* > + * Now that we are about to exit from interrupt, lets check for > + * cr5 eq bit. If it is set, then we may be in the middle of > + * local_t update. In this case, we should rewind the NIP > + * accordingly. > + */ > + mfcr r3 > + andi. r4,r3,0x200 > + beq 63f This is just bne cr5,63f isn't it? > index 72e783e..edb75a9 100644 > --- a/arch/powerpc/kernel/exceptions-64s.S > +++ b/arch/powerpc/kernel/exceptions-64s.S > @@ -637,7 +637,7 @@ masked_##_H##interrupt: \ > rldicl r10,r10,48,1; /* clear MSR_EE */ \ > rotldi r10,r10,16; \ > mtspr SPRN_##_H##SRR1,r10; \ > -2: mtcrf 0x80,r9; \ > +2: mtcrf 0x90,r9; \ > ld r9,PACA_EXGEN+EX_R9(r13); \ > ld r10,PACA_EXGEN+EX_R10(r13); \ > ld r11,PACA_EXGEN+EX_R11(r13); \ What does this do? Segher ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [RFC PATCH 1/2]powerpc: foundation code to handle CR5 for local_t 2014-11-27 16:56 ` Segher Boessenkool @ 2014-11-28 1:58 ` Benjamin Herrenschmidt 2014-11-28 3:00 ` Madhavan Srinivasan 2014-11-28 15:57 ` Segher Boessenkool 2014-11-28 2:57 ` Madhavan Srinivasan 1 sibling, 2 replies; 29+ messages in thread From: Benjamin Herrenschmidt @ 2014-11-28 1:58 UTC (permalink / raw) To: Segher Boessenkool Cc: linuxppc-dev, rusty, Madhavan Srinivasan, paulus, anton On Thu, 2014-11-27 at 10:56 -0600, Segher Boessenkool wrote: > On Thu, Nov 27, 2014 at 05:48:40PM +0530, Madhavan Srinivasan wrote: > > Here is the design of this patch. Since local_* operations > > are only need to be atomic to interrupts (IIUC), patch uses > > one of the Condition Register (CR) fields as a flag variable. When > > entering the local_*, specific bit in the CR5 field is set > > and on exit, bit is cleared. CR bit checking is done in the > > interrupt return path. If CR5[EQ] bit set and if we return > > to kernel, we reset to start of local_* operation. > > Have you tested this with (upcoming) GCC 5.0? GCC now uses CR5, > and it likes to use it very much, it might be more convenient to > use e.g. CR1 (which is allocated almost last, only before CR0). We use CR1 all over the place in your asm code. Any other suggestion ? What's the damage of -ffixed-cr5 on gcc5 ? won't it just use CR4 or 6 instead ? > > --- a/arch/powerpc/include/asm/exception-64s.h > > +++ b/arch/powerpc/include/asm/exception-64s.h > > @@ -306,7 +306,26 @@ do_kvm_##n: \ > > std r10,0(r1); /* make stack chain pointer */ \ > > std r0,GPR0(r1); /* save r0 in stackframe */ \ > > std r10,GPR1(r1); /* save r1 in stackframe */ \ > > - beq 4f; /* if from kernel mode */ \ > > +BEGIN_FTR_SECTION; \ > > + lis r9,4096; /* Create a mask with HV and PR */ \ > > + rldicr r9,r9,32,31; /* bits, AND with the MSR */ \ > > + mr r10,r9; /* to check for Hyp state */ \ > > + ori r9,r9,16384; \ > > + and r9,r12,r9; \ > > + cmpd cr3,r10,r9; \ > > + beq cr3,66f; /* Jump if we come from Hyp mode*/ \ > > + mtcrf 0x04,r10; /* Clear CR5 if coming from usr */ \ > > Wow, such nastiness, only to avoid using dot insns (since you need to keep > the current CR0 value for the following beq 4f). And CR0 already holds the > PR bit, so you need only to check the HV bit anyway? Some restructuring > would make this a lot simpler and clearer. > > > + /* > > + * Now that we are about to exit from interrupt, lets check for > > + * cr5 eq bit. If it is set, then we may be in the middle of > > + * local_t update. In this case, we should rewind the NIP > > + * accordingly. > > + */ > > + mfcr r3 > > + andi. r4,r3,0x200 > > + beq 63f > > This is just bne cr5,63f isn't it? > > > index 72e783e..edb75a9 100644 > > --- a/arch/powerpc/kernel/exceptions-64s.S > > +++ b/arch/powerpc/kernel/exceptions-64s.S > > @@ -637,7 +637,7 @@ masked_##_H##interrupt: \ > > rldicl r10,r10,48,1; /* clear MSR_EE */ \ > > rotldi r10,r10,16; \ > > mtspr SPRN_##_H##SRR1,r10; \ > > -2: mtcrf 0x80,r9; \ > > +2: mtcrf 0x90,r9; \ > > ld r9,PACA_EXGEN+EX_R9(r13); \ > > ld r10,PACA_EXGEN+EX_R10(r13); \ > > ld r11,PACA_EXGEN+EX_R11(r13); \ > > What does this do? > > > Segher > _______________________________________________ > Linuxppc-dev mailing list > Linuxppc-dev@lists.ozlabs.org > https://lists.ozlabs.org/listinfo/linuxppc-dev ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [RFC PATCH 1/2]powerpc: foundation code to handle CR5 for local_t 2014-11-28 1:58 ` Benjamin Herrenschmidt @ 2014-11-28 3:00 ` Madhavan Srinivasan 2014-11-28 15:57 ` Segher Boessenkool 1 sibling, 0 replies; 29+ messages in thread From: Madhavan Srinivasan @ 2014-11-28 3:00 UTC (permalink / raw) To: Benjamin Herrenschmidt, Segher Boessenkool Cc: linuxppc-dev, rusty, paulus, anton On Friday 28 November 2014 07:28 AM, Benjamin Herrenschmidt wrote: > On Thu, 2014-11-27 at 10:56 -0600, Segher Boessenkool wrote: >> On Thu, Nov 27, 2014 at 05:48:40PM +0530, Madhavan Srinivasan wrote: >>> Here is the design of this patch. Since local_* operations >>> are only need to be atomic to interrupts (IIUC), patch uses >>> one of the Condition Register (CR) fields as a flag variable. When >>> entering the local_*, specific bit in the CR5 field is set >>> and on exit, bit is cleared. CR bit checking is done in the >>> interrupt return path. If CR5[EQ] bit set and if we return >>> to kernel, we reset to start of local_* operation. >> >> Have you tested this with (upcoming) GCC 5.0? GCC now uses CR5, >> and it likes to use it very much, it might be more convenient to >> use e.g. CR1 (which is allocated almost last, only before CR0). > > We use CR1 all over the place in your asm code. Any other suggestion ? > Yes. CR1 is used in many places and so do CR7. And CR0 are alway used for dot instn. And I guess Vector instructions use CR6. > What's the damage of -ffixed-cr5 on gcc5 ? won't it just use CR4 or 6 > instead ? > Will try this today with GCC 5.0. >>> --- a/arch/powerpc/include/asm/exception-64s.h >>> +++ b/arch/powerpc/include/asm/exception-64s.h >>> @@ -306,7 +306,26 @@ do_kvm_##n: \ >>> std r10,0(r1); /* make stack chain pointer */ \ >>> std r0,GPR0(r1); /* save r0 in stackframe */ \ >>> std r10,GPR1(r1); /* save r1 in stackframe */ \ >>> - beq 4f; /* if from kernel mode */ \ >>> +BEGIN_FTR_SECTION; \ >>> + lis r9,4096; /* Create a mask with HV and PR */ \ >>> + rldicr r9,r9,32,31; /* bits, AND with the MSR */ \ >>> + mr r10,r9; /* to check for Hyp state */ \ >>> + ori r9,r9,16384; \ >>> + and r9,r12,r9; \ >>> + cmpd cr3,r10,r9; \ >>> + beq cr3,66f; /* Jump if we come from Hyp mode*/ \ >>> + mtcrf 0x04,r10; /* Clear CR5 if coming from usr */ \ >> >> Wow, such nastiness, only to avoid using dot insns (since you need to keep >> the current CR0 value for the following beq 4f). And CR0 already holds the >> PR bit, so you need only to check the HV bit anyway? Some restructuring >> would make this a lot simpler and clearer. >> >>> + /* >>> + * Now that we are about to exit from interrupt, lets check for >>> + * cr5 eq bit. If it is set, then we may be in the middle of >>> + * local_t update. In this case, we should rewind the NIP >>> + * accordingly. >>> + */ >>> + mfcr r3 >>> + andi. r4,r3,0x200 >>> + beq 63f >> >> This is just bne cr5,63f isn't it? >> >>> index 72e783e..edb75a9 100644 >>> --- a/arch/powerpc/kernel/exceptions-64s.S >>> +++ b/arch/powerpc/kernel/exceptions-64s.S >>> @@ -637,7 +637,7 @@ masked_##_H##interrupt: \ >>> rldicl r10,r10,48,1; /* clear MSR_EE */ \ >>> rotldi r10,r10,16; \ >>> mtspr SPRN_##_H##SRR1,r10; \ >>> -2: mtcrf 0x80,r9; \ >>> +2: mtcrf 0x90,r9; \ >>> ld r9,PACA_EXGEN+EX_R9(r13); \ >>> ld r10,PACA_EXGEN+EX_R10(r13); \ >>> ld r11,PACA_EXGEN+EX_R11(r13); \ >> >> What does this do? >> >> >> Segher >> _______________________________________________ >> Linuxppc-dev mailing list >> Linuxppc-dev@lists.ozlabs.org >> https://lists.ozlabs.org/listinfo/linuxppc-dev > > ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [RFC PATCH 1/2]powerpc: foundation code to handle CR5 for local_t 2014-11-28 1:58 ` Benjamin Herrenschmidt 2014-11-28 3:00 ` Madhavan Srinivasan @ 2014-11-28 15:57 ` Segher Boessenkool 1 sibling, 0 replies; 29+ messages in thread From: Segher Boessenkool @ 2014-11-28 15:57 UTC (permalink / raw) To: Benjamin Herrenschmidt Cc: linuxppc-dev, rusty, Madhavan Srinivasan, paulus, anton On Fri, Nov 28, 2014 at 12:58:55PM +1100, Benjamin Herrenschmidt wrote: > > Have you tested this with (upcoming) GCC 5.0? GCC now uses CR5, > > and it likes to use it very much, it might be more convenient to > > use e.g. CR1 (which is allocated almost last, only before CR0). > > We use CR1 all over the place in your asm code. Any other suggestion ? > > What's the damage of -ffixed-cr5 on gcc5 ? won't it just use CR4 or 6 > instead ? Oh, it will work fine. Not using CR5 would be more convenient so that the register allocation for most code would not change when you use or not use -ffixed-cr5, making code easier to read. But your point about asm code already using the other CR fields makes CR5 a better choice actually, because people avoided it (because the compiler did) :-) Segher ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [RFC PATCH 1/2]powerpc: foundation code to handle CR5 for local_t 2014-11-27 16:56 ` Segher Boessenkool 2014-11-28 1:58 ` Benjamin Herrenschmidt @ 2014-11-28 2:57 ` Madhavan Srinivasan 2014-11-28 16:00 ` Segher Boessenkool 1 sibling, 1 reply; 29+ messages in thread From: Madhavan Srinivasan @ 2014-11-28 2:57 UTC (permalink / raw) To: Segher Boessenkool; +Cc: rusty, paulus, anton, linuxppc-dev On Thursday 27 November 2014 10:26 PM, Segher Boessenkool wrote: > On Thu, Nov 27, 2014 at 05:48:40PM +0530, Madhavan Srinivasan wrote: >> Here is the design of this patch. Since local_* operations >> are only need to be atomic to interrupts (IIUC), patch uses >> one of the Condition Register (CR) fields as a flag variable. When >> entering the local_*, specific bit in the CR5 field is set >> and on exit, bit is cleared. CR bit checking is done in the >> interrupt return path. If CR5[EQ] bit set and if we return >> to kernel, we reset to start of local_* operation. > > Have you tested this with (upcoming) GCC 5.0? GCC now uses CR5, > and it likes to use it very much, it might be more convenient to > use e.g. CR1 (which is allocated almost last, only before CR0). > No. I did not try it with GCC5.0 But I did force kernel compilation with fixed-cr5 which should make GCC avoid using CR5. But i will try that today. >> --- a/arch/powerpc/include/asm/exception-64s.h >> +++ b/arch/powerpc/include/asm/exception-64s.h >> @@ -306,7 +306,26 @@ do_kvm_##n: \ >> std r10,0(r1); /* make stack chain pointer */ \ >> std r0,GPR0(r1); /* save r0 in stackframe */ \ >> std r10,GPR1(r1); /* save r1 in stackframe */ \ >> - beq 4f; /* if from kernel mode */ \ >> +BEGIN_FTR_SECTION; \ >> + lis r9,4096; /* Create a mask with HV and PR */ \ >> + rldicr r9,r9,32,31; /* bits, AND with the MSR */ \ >> + mr r10,r9; /* to check for Hyp state */ \ >> + ori r9,r9,16384; \ >> + and r9,r12,r9; \ >> + cmpd cr3,r10,r9; \ >> + beq cr3,66f; /* Jump if we come from Hyp mode*/ \ >> + mtcrf 0x04,r10; /* Clear CR5 if coming from usr */ \ > > Wow, such nastiness, only to avoid using dot insns (since you need to keep > the current CR0 value for the following beq 4f). And CR0 already holds the > PR bit, so you need only to check the HV bit anyway? Some restructuring > would make this a lot simpler and clearer. Ok I can try that. > >> + /* >> + * Now that we are about to exit from interrupt, lets check for >> + * cr5 eq bit. If it is set, then we may be in the middle of >> + * local_t update. In this case, we should rewind the NIP >> + * accordingly. >> + */ >> + mfcr r3 >> + andi. r4,r3,0x200 >> + beq 63f > > This is just bne cr5,63f isn't it? > >> index 72e783e..edb75a9 100644 >> --- a/arch/powerpc/kernel/exceptions-64s.S >> +++ b/arch/powerpc/kernel/exceptions-64s.S >> @@ -637,7 +637,7 @@ masked_##_H##interrupt: \ >> rldicl r10,r10,48,1; /* clear MSR_EE */ \ >> rotldi r10,r10,16; \ >> mtspr SPRN_##_H##SRR1,r10; \ >> -2: mtcrf 0x80,r9; \ >> +2: mtcrf 0x90,r9; \ >> ld r9,PACA_EXGEN+EX_R9(r13); \ >> ld r10,PACA_EXGEN+EX_R10(r13); \ >> ld r11,PACA_EXGEN+EX_R11(r13); \ > > What does this do? > Since I use the CR3, I restore it here. > > Segher > ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [RFC PATCH 1/2]powerpc: foundation code to handle CR5 for local_t 2014-11-28 2:57 ` Madhavan Srinivasan @ 2014-11-28 16:00 ` Segher Boessenkool 0 siblings, 0 replies; 29+ messages in thread From: Segher Boessenkool @ 2014-11-28 16:00 UTC (permalink / raw) To: Madhavan Srinivasan; +Cc: rusty, paulus, anton, linuxppc-dev On Fri, Nov 28, 2014 at 08:27:22AM +0530, Madhavan Srinivasan wrote: > > Have you tested this with (upcoming) GCC 5.0? GCC now uses CR5, > > and it likes to use it very much, it might be more convenient to > > use e.g. CR1 (which is allocated almost last, only before CR0). > > > No. I did not try it with GCC5.0 But I did force kernel compilation with > fixed-cr5 which should make GCC avoid using CR5. But i will try that today. It should work just fine, but testing is always useful, because, you know, "should". Thanks. Segher ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [RFC PATCH 1/2]powerpc: foundation code to handle CR5 for local_t 2014-11-27 12:18 ` [RFC PATCH 1/2]powerpc: foundation code to handle CR5 for local_t Madhavan Srinivasan 2014-11-27 16:56 ` Segher Boessenkool @ 2014-11-28 0:56 ` Benjamin Herrenschmidt 2014-11-28 3:15 ` Madhavan Srinivasan 2014-12-01 21:35 ` Gabriel Paubert 2014-12-02 2:04 ` Scott Wood 3 siblings, 1 reply; 29+ messages in thread From: Benjamin Herrenschmidt @ 2014-11-28 0:56 UTC (permalink / raw) To: Madhavan Srinivasan; +Cc: rusty, paulus, anton, linuxppc-dev On Thu, 2014-11-27 at 17:48 +0530, Madhavan Srinivasan wrote: > This patch create the infrastructure to handle the CR based > local_* atomic operations. Local atomic operations are fast > and highly reentrant per CPU counters. Used for percpu > variable updates. Local atomic operations only guarantee > variable modification atomicity wrt the CPU which owns the > data and these needs to be executed in a preemption safe way. > > Here is the design of this patch. Since local_* operations > are only need to be atomic to interrupts (IIUC), patch uses > one of the Condition Register (CR) fields as a flag variable. When > entering the local_*, specific bit in the CR5 field is set > and on exit, bit is cleared. CR bit checking is done in the > interrupt return path. If CR5[EQ] bit set and if we return > to kernel, we reset to start of local_* operation. > > Reason for this approach is that, currently l[w/d]arx/st[w/d]cx. > instruction pair is used for local_* operations, which are heavy > on cycle count and they dont support a local variant. So to > see whether the new implementation helps, used a modified > version of Rusty's benchmark code on local_t. > > https://lkml.org/lkml/2008/12/16/450 > > Modifications: > - increated the working set size from 1MB to 8MB, > - removed cpu_local_inc test. > > Test ran > - on POWER8 1S Scale out System 2.0GHz > - on OPAL v3 with v3.18-rc4 patch kernel as Host > > Here are the values with the patch. > > Time in ns per iteration > > inc add read add_return > atomic_long 67 67 18 69 > irqsave/rest 39 39 23 39 > trivalue 39 39 29 49 > local_t 26 26 24 26 > > Since CR5 is used as a flag, have added CFLAGS to avoid CR5 > for the kernel compilation and CR5 is zeroed at the kernel > entry. > > Tested the patch in a > - pSeries LPAR, > - Host with patched/unmodified guest kernel > > To check whether userspace see any CR5 corruption, ran a simple > test which does, > - set CR5 field, > - while(1) > - sleep or gettimeofday > - chk bit set > > Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> > --- > - I really appreciate feedback on the patchset. > - Kindly comment if I should try with any other benchmark or > workload to check the numbers. > - Also, kindly recommand any know stress test for CR > > Makefile | 6 ++ > arch/powerpc/include/asm/exception-64s.h | 21 +++++- > arch/powerpc/kernel/entry_64.S | 106 ++++++++++++++++++++++++++++++- > arch/powerpc/kernel/exceptions-64s.S | 2 +- > arch/powerpc/kernel/head_64.S | 8 +++ > 5 files changed, 138 insertions(+), 5 deletions(-) > > diff --git a/Makefile b/Makefile > index 00d618b..2e271ad 100644 > --- a/Makefile > +++ b/Makefile > @@ -706,6 +706,12 @@ endif > > KBUILD_CFLAGS += $(call cc-option, -fno-var-tracking-assignments) > > +ifdef CONFIG_PPC64 > +# We need this flag to force compiler not to use CR5, since > +# local_t type code is based on this. > +KBUILD_CFLAGS += -ffixed-cr5 > +endif > + > ifdef CONFIG_DEBUG_INFO > ifdef CONFIG_DEBUG_INFO_SPLIT > KBUILD_CFLAGS += $(call cc-option, -gsplit-dwarf, -g) > diff --git a/arch/powerpc/include/asm/exception-64s.h b/arch/powerpc/include/asm/exception-64s.h > index 77f52b2..c42919a 100644 > --- a/arch/powerpc/include/asm/exception-64s.h > +++ b/arch/powerpc/include/asm/exception-64s.h > @@ -306,7 +306,26 @@ do_kvm_##n: \ > std r10,0(r1); /* make stack chain pointer */ \ > std r0,GPR0(r1); /* save r0 in stackframe */ \ > std r10,GPR1(r1); /* save r1 in stackframe */ \ > - beq 4f; /* if from kernel mode */ \ > +BEGIN_FTR_SECTION; \ > + lis r9,4096; /* Create a mask with HV and PR */ \ > + rldicr r9,r9,32,31; /* bits, AND with the MSR */ \ > + mr r10,r9; /* to check for Hyp state */ \ > + ori r9,r9,16384; \ > + and r9,r12,r9; \ > + cmpd cr3,r10,r9; \ > + beq cr3,66f; /* Jump if we come from Hyp mode*/ \ > + mtcrf 0x04,r10; /* Clear CR5 if coming from usr */ \ > +FTR_SECTION_ELSE; \ Can't we just unconditionally clear at as long as we do that after we've saved it ? In that case, it's just a matter for the fixup code to check the saved version rather than the actual CR.. > + beq 4f; /* if kernel mode branch */ \ > + li r10,0; /* Clear CR5 incase of coming */ \ > + mtcrf 0x04,r10; /* from user. */ \ > + nop; /* This part of code is for */ \ > + nop; /* kernel with MSR[HV]=0, */ \ > + nop; /* MSR[PR]=0, so just chk for */ \ > + nop; /* MSR[PR] */ \ > + nop; \ > +ALT_FTR_SECTION_END_IFSET(CPU_FTR_HVMODE); \ > +66: beq 4f; /* if from kernel mode */ \ > ACCOUNT_CPU_USER_ENTRY(r9, r10); \ > SAVE_PPR(area, r9, r10); \ > 4: EXCEPTION_PROLOG_COMMON_2(area) \ > diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S > index 0905c8d..e42bb99 100644 > --- a/arch/powerpc/kernel/entry_64.S > +++ b/arch/powerpc/kernel/entry_64.S > @@ -68,7 +68,26 @@ system_call_common: > 2: std r2,GPR2(r1) > std r3,GPR3(r1) > mfcr r2 > - std r4,GPR4(r1) > +BEGIN_FTR_SECTION > + lis r10,4096 > + rldicr r10,r10,32,31 > + mr r11,r10 > + ori r10,r10,16384 > + and r10,r12,r10 > + cmpd r11,r10 > + beq 67f > + mtcrf 0x04,r11 > +FTR_SECTION_ELSE > + beq 67f > + li r11,0 > + mtcrf 0x04,r11 > + nop > + nop > + nop > + nop > + nop > +ALT_FTR_SECTION_END_IFSET(CPU_FTR_HVMODE) > +67: std r4,GPR4(r1) > std r5,GPR5(r1) > std r6,GPR6(r1) > std r7,GPR7(r1) > @@ -224,8 +243,26 @@ syscall_exit: > BEGIN_FTR_SECTION > stdcx. r0,0,r1 /* to clear the reservation */ > END_FTR_SECTION_IFCLR(CPU_FTR_STCX_CHECKS_ADDRESS) > +BEGIN_FTR_SECTION > + lis r4,4096 > + rldicr r4,r4,32,31 > + mr r6,r4 > + ori r4,r4,16384 > + and r4,r8,r4 > + cmpd cr3,r6,r4 > + beq cr3,65f > + mtcr r5 > +FTR_SECTION_ELSE > andi. r6,r8,MSR_PR > - ld r4,_LINK(r1) > + beq 65f > + mtcr r5 > + nop > + nop > + nop > + nop > + nop > +ALT_FTR_SECTION_END_IFSET(CPU_FTR_HVMODE) > +65: ld r4,_LINK(r1) > > beq- 1f > ACCOUNT_CPU_USER_EXIT(r11, r12) > @@ -234,7 +271,11 @@ END_FTR_SECTION_IFCLR(CPU_FTR_STCX_CHECKS_ADDRESS) > 1: ld r2,GPR2(r1) > ld r1,GPR1(r1) > mtlr r4 > +#ifdef CONFIG_PPC64 > + mtcrf 0xFB,r5 > +#else > mtcr r5 > +#endif > mtspr SPRN_SRR0,r7 > mtspr SPRN_SRR1,r8 > RFI > @@ -804,7 +845,66 @@ ALT_FTR_SECTION_END_IFCLR(CPU_FTR_STCX_CHECKS_ADDRESS) > */ > .globl fast_exception_return > fast_exception_return: > - ld r3,_MSR(r1) > + > + /* > + * Now that we are about to exit from interrupt, lets check for > + * cr5 eq bit. If it is set, then we may be in the middle of > + * local_t update. In this case, we should rewind the NIP > + * accordingly. > + */ > + mfcr r3 > + andi. r4,r3,0x200 > + beq 63f > + > + /* > + * Now that the bit is set, lets check for return to User > + */ > + ld r4,_MSR(r1) > +BEGIN_FTR_SECTION > + li r3,4096 > + rldicr r3,r3,32,31 > + mr r5,r3 > + ori r3,r3,16384 > + and r3,r4,r3 > + cmpd r5,r3 > + bne 63f > +FTR_SECTION_ELSE > + andi. r3,r4,MSR_PR > + bne 63f > + nop > + nop > + nop > + nop > + nop > +ALT_FTR_SECTION_END_IFSET(CPU_FTR_HVMODE) > + > + /* > + * Looks like we are returning to Kernel, so > + * lets get the NIP and search the ex_table. > + * Change the NIP based on the return value > + */ > +lookup_ex_table: > + ld r3,_NIP(r1) > + bl search_exception_tables > + cmpli 0,1,r3,0 > + bne 62f > + > + /* > + * This is a panic case. Reason is that, we > + * have the CR5 bit set, but we are not in > + * local_* code and we are returning to Kernel. > + */ > + ld r3,_NIP(r1) > + mfcr r4 > + EMIT_BUG_ENTRY lookup_ex_table, __FILE__,__LINE__,BUGFLAG_WARNING > + > + /* > + * Now save the return fixup address as NIP > + */ > +62: ld r4,8(r3) > + std r4,_NIP(r1) > + crclr 22 > +63: ld r3,_MSR(r1) > ld r4,_CTR(r1) > ld r0,_LINK(r1) > mtctr r4 > diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S > index 72e783e..edb75a9 100644 > --- a/arch/powerpc/kernel/exceptions-64s.S > +++ b/arch/powerpc/kernel/exceptions-64s.S > @@ -637,7 +637,7 @@ masked_##_H##interrupt: \ > rldicl r10,r10,48,1; /* clear MSR_EE */ \ > rotldi r10,r10,16; \ > mtspr SPRN_##_H##SRR1,r10; \ > -2: mtcrf 0x80,r9; \ > +2: mtcrf 0x90,r9; \ > ld r9,PACA_EXGEN+EX_R9(r13); \ > ld r10,PACA_EXGEN+EX_R10(r13); \ > ld r11,PACA_EXGEN+EX_R11(r13); \ > diff --git a/arch/powerpc/kernel/head_64.S b/arch/powerpc/kernel/head_64.S > index d48125d..02e49b3 100644 > --- a/arch/powerpc/kernel/head_64.S > +++ b/arch/powerpc/kernel/head_64.S > @@ -347,6 +347,14 @@ __mmu_off: > * > */ > __start_initialization_multiplatform: > + > + /* > + * Before we do anything, lets clear CR5 field, > + * so that we will have a clean start at entry > + */ > + li r11,0 > + mtcrf 0x04,r11 > + > /* Make sure we are running in 64 bits mode */ > bl enable_64b_mode > ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [RFC PATCH 1/2]powerpc: foundation code to handle CR5 for local_t 2014-11-28 0:56 ` Benjamin Herrenschmidt @ 2014-11-28 3:15 ` Madhavan Srinivasan 2014-11-28 3:21 ` Benjamin Herrenschmidt 0 siblings, 1 reply; 29+ messages in thread From: Madhavan Srinivasan @ 2014-11-28 3:15 UTC (permalink / raw) To: Benjamin Herrenschmidt; +Cc: rusty, paulus, anton, linuxppc-dev On Friday 28 November 2014 06:26 AM, Benjamin Herrenschmidt wrote: > On Thu, 2014-11-27 at 17:48 +0530, Madhavan Srinivasan wrote: >> This patch create the infrastructure to handle the CR based >> local_* atomic operations. Local atomic operations are fast >> and highly reentrant per CPU counters. Used for percpu >> variable updates. Local atomic operations only guarantee >> variable modification atomicity wrt the CPU which owns the >> data and these needs to be executed in a preemption safe way. >> >> Here is the design of this patch. Since local_* operations >> are only need to be atomic to interrupts (IIUC), patch uses >> one of the Condition Register (CR) fields as a flag variable. When >> entering the local_*, specific bit in the CR5 field is set >> and on exit, bit is cleared. CR bit checking is done in the >> interrupt return path. If CR5[EQ] bit set and if we return >> to kernel, we reset to start of local_* operation. >> >> Reason for this approach is that, currently l[w/d]arx/st[w/d]cx. >> instruction pair is used for local_* operations, which are heavy >> on cycle count and they dont support a local variant. So to >> see whether the new implementation helps, used a modified >> version of Rusty's benchmark code on local_t. >> >> https://lkml.org/lkml/2008/12/16/450 >> >> Modifications: >> - increated the working set size from 1MB to 8MB, >> - removed cpu_local_inc test. >> >> Test ran >> - on POWER8 1S Scale out System 2.0GHz >> - on OPAL v3 with v3.18-rc4 patch kernel as Host >> >> Here are the values with the patch. >> >> Time in ns per iteration >> >> inc add read add_return >> atomic_long 67 67 18 69 >> irqsave/rest 39 39 23 39 >> trivalue 39 39 29 49 >> local_t 26 26 24 26 >> >> Since CR5 is used as a flag, have added CFLAGS to avoid CR5 >> for the kernel compilation and CR5 is zeroed at the kernel >> entry. >> >> Tested the patch in a >> - pSeries LPAR, >> - Host with patched/unmodified guest kernel >> >> To check whether userspace see any CR5 corruption, ran a simple >> test which does, >> - set CR5 field, >> - while(1) >> - sleep or gettimeofday >> - chk bit set >> >> Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> >> --- >> - I really appreciate feedback on the patchset. >> - Kindly comment if I should try with any other benchmark or >> workload to check the numbers. >> - Also, kindly recommand any know stress test for CR >> >> Makefile | 6 ++ >> arch/powerpc/include/asm/exception-64s.h | 21 +++++- >> arch/powerpc/kernel/entry_64.S | 106 ++++++++++++++++++++++++++++++- >> arch/powerpc/kernel/exceptions-64s.S | 2 +- >> arch/powerpc/kernel/head_64.S | 8 +++ >> 5 files changed, 138 insertions(+), 5 deletions(-) >> >> diff --git a/Makefile b/Makefile >> index 00d618b..2e271ad 100644 >> --- a/Makefile >> +++ b/Makefile >> @@ -706,6 +706,12 @@ endif >> >> KBUILD_CFLAGS += $(call cc-option, -fno-var-tracking-assignments) >> >> +ifdef CONFIG_PPC64 >> +# We need this flag to force compiler not to use CR5, since >> +# local_t type code is based on this. >> +KBUILD_CFLAGS += -ffixed-cr5 >> +endif >> + >> ifdef CONFIG_DEBUG_INFO >> ifdef CONFIG_DEBUG_INFO_SPLIT >> KBUILD_CFLAGS += $(call cc-option, -gsplit-dwarf, -g) >> diff --git a/arch/powerpc/include/asm/exception-64s.h b/arch/powerpc/include/asm/exception-64s.h >> index 77f52b2..c42919a 100644 >> --- a/arch/powerpc/include/asm/exception-64s.h >> +++ b/arch/powerpc/include/asm/exception-64s.h >> @@ -306,7 +306,26 @@ do_kvm_##n: \ >> std r10,0(r1); /* make stack chain pointer */ \ >> std r0,GPR0(r1); /* save r0 in stackframe */ \ >> std r10,GPR1(r1); /* save r1 in stackframe */ \ >> - beq 4f; /* if from kernel mode */ \ >> +BEGIN_FTR_SECTION; \ >> + lis r9,4096; /* Create a mask with HV and PR */ \ >> + rldicr r9,r9,32,31; /* bits, AND with the MSR */ \ >> + mr r10,r9; /* to check for Hyp state */ \ >> + ori r9,r9,16384; \ >> + and r9,r12,r9; \ >> + cmpd cr3,r10,r9; \ >> + beq cr3,66f; /* Jump if we come from Hyp mode*/ \ >> + mtcrf 0x04,r10; /* Clear CR5 if coming from usr */ \ >> +FTR_SECTION_ELSE; \ > > Can't we just unconditionally clear at as long as we do that after we've > saved it ? In that case, it's just a matter for the fixup code to check > the saved version rather than the actual CR.. > I use CR bit setting in the interrupt return path to enter the fixup section search. If we unconditionally clear it, we will have to enter the fixup section for every kernel return nip right? Regards Maddy >> + beq 4f; /* if kernel mode branch */ \ >> + li r10,0; /* Clear CR5 incase of coming */ \ >> + mtcrf 0x04,r10; /* from user. */ \ >> + nop; /* This part of code is for */ \ >> + nop; /* kernel with MSR[HV]=0, */ \ >> + nop; /* MSR[PR]=0, so just chk for */ \ >> + nop; /* MSR[PR] */ \ >> + nop; \ >> +ALT_FTR_SECTION_END_IFSET(CPU_FTR_HVMODE); \ >> +66: beq 4f; /* if from kernel mode */ \ >> ACCOUNT_CPU_USER_ENTRY(r9, r10); \ >> SAVE_PPR(area, r9, r10); \ >> 4: EXCEPTION_PROLOG_COMMON_2(area) \ >> diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S >> index 0905c8d..e42bb99 100644 >> --- a/arch/powerpc/kernel/entry_64.S >> +++ b/arch/powerpc/kernel/entry_64.S >> @@ -68,7 +68,26 @@ system_call_common: >> 2: std r2,GPR2(r1) >> std r3,GPR3(r1) >> mfcr r2 >> - std r4,GPR4(r1) >> +BEGIN_FTR_SECTION >> + lis r10,4096 >> + rldicr r10,r10,32,31 >> + mr r11,r10 >> + ori r10,r10,16384 >> + and r10,r12,r10 >> + cmpd r11,r10 >> + beq 67f >> + mtcrf 0x04,r11 >> +FTR_SECTION_ELSE >> + beq 67f >> + li r11,0 >> + mtcrf 0x04,r11 >> + nop >> + nop >> + nop >> + nop >> + nop >> +ALT_FTR_SECTION_END_IFSET(CPU_FTR_HVMODE) >> +67: std r4,GPR4(r1) >> std r5,GPR5(r1) >> std r6,GPR6(r1) >> std r7,GPR7(r1) >> @@ -224,8 +243,26 @@ syscall_exit: >> BEGIN_FTR_SECTION >> stdcx. r0,0,r1 /* to clear the reservation */ >> END_FTR_SECTION_IFCLR(CPU_FTR_STCX_CHECKS_ADDRESS) >> +BEGIN_FTR_SECTION >> + lis r4,4096 >> + rldicr r4,r4,32,31 >> + mr r6,r4 >> + ori r4,r4,16384 >> + and r4,r8,r4 >> + cmpd cr3,r6,r4 >> + beq cr3,65f >> + mtcr r5 >> +FTR_SECTION_ELSE >> andi. r6,r8,MSR_PR >> - ld r4,_LINK(r1) >> + beq 65f >> + mtcr r5 >> + nop >> + nop >> + nop >> + nop >> + nop >> +ALT_FTR_SECTION_END_IFSET(CPU_FTR_HVMODE) >> +65: ld r4,_LINK(r1) >> >> beq- 1f >> ACCOUNT_CPU_USER_EXIT(r11, r12) >> @@ -234,7 +271,11 @@ END_FTR_SECTION_IFCLR(CPU_FTR_STCX_CHECKS_ADDRESS) >> 1: ld r2,GPR2(r1) >> ld r1,GPR1(r1) >> mtlr r4 >> +#ifdef CONFIG_PPC64 >> + mtcrf 0xFB,r5 >> +#else >> mtcr r5 >> +#endif >> mtspr SPRN_SRR0,r7 >> mtspr SPRN_SRR1,r8 >> RFI >> @@ -804,7 +845,66 @@ ALT_FTR_SECTION_END_IFCLR(CPU_FTR_STCX_CHECKS_ADDRESS) >> */ >> .globl fast_exception_return >> fast_exception_return: >> - ld r3,_MSR(r1) >> + >> + /* >> + * Now that we are about to exit from interrupt, lets check for >> + * cr5 eq bit. If it is set, then we may be in the middle of >> + * local_t update. In this case, we should rewind the NIP >> + * accordingly. >> + */ >> + mfcr r3 >> + andi. r4,r3,0x200 >> + beq 63f >> + >> + /* >> + * Now that the bit is set, lets check for return to User >> + */ >> + ld r4,_MSR(r1) >> +BEGIN_FTR_SECTION >> + li r3,4096 >> + rldicr r3,r3,32,31 >> + mr r5,r3 >> + ori r3,r3,16384 >> + and r3,r4,r3 >> + cmpd r5,r3 >> + bne 63f >> +FTR_SECTION_ELSE >> + andi. r3,r4,MSR_PR >> + bne 63f >> + nop >> + nop >> + nop >> + nop >> + nop >> +ALT_FTR_SECTION_END_IFSET(CPU_FTR_HVMODE) >> + >> + /* >> + * Looks like we are returning to Kernel, so >> + * lets get the NIP and search the ex_table. >> + * Change the NIP based on the return value >> + */ >> +lookup_ex_table: >> + ld r3,_NIP(r1) >> + bl search_exception_tables >> + cmpli 0,1,r3,0 >> + bne 62f >> + >> + /* >> + * This is a panic case. Reason is that, we >> + * have the CR5 bit set, but we are not in >> + * local_* code and we are returning to Kernel. >> + */ >> + ld r3,_NIP(r1) >> + mfcr r4 >> + EMIT_BUG_ENTRY lookup_ex_table, __FILE__,__LINE__,BUGFLAG_WARNING >> + >> + /* >> + * Now save the return fixup address as NIP >> + */ >> +62: ld r4,8(r3) >> + std r4,_NIP(r1) >> + crclr 22 >> +63: ld r3,_MSR(r1) >> ld r4,_CTR(r1) >> ld r0,_LINK(r1) >> mtctr r4 >> diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S >> index 72e783e..edb75a9 100644 >> --- a/arch/powerpc/kernel/exceptions-64s.S >> +++ b/arch/powerpc/kernel/exceptions-64s.S >> @@ -637,7 +637,7 @@ masked_##_H##interrupt: \ >> rldicl r10,r10,48,1; /* clear MSR_EE */ \ >> rotldi r10,r10,16; \ >> mtspr SPRN_##_H##SRR1,r10; \ >> -2: mtcrf 0x80,r9; \ >> +2: mtcrf 0x90,r9; \ >> ld r9,PACA_EXGEN+EX_R9(r13); \ >> ld r10,PACA_EXGEN+EX_R10(r13); \ >> ld r11,PACA_EXGEN+EX_R11(r13); \ >> diff --git a/arch/powerpc/kernel/head_64.S b/arch/powerpc/kernel/head_64.S >> index d48125d..02e49b3 100644 >> --- a/arch/powerpc/kernel/head_64.S >> +++ b/arch/powerpc/kernel/head_64.S >> @@ -347,6 +347,14 @@ __mmu_off: >> * >> */ >> __start_initialization_multiplatform: >> + >> + /* >> + * Before we do anything, lets clear CR5 field, >> + * so that we will have a clean start at entry >> + */ >> + li r11,0 >> + mtcrf 0x04,r11 >> + >> /* Make sure we are running in 64 bits mode */ >> bl enable_64b_mode >> > > ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [RFC PATCH 1/2]powerpc: foundation code to handle CR5 for local_t 2014-11-28 3:15 ` Madhavan Srinivasan @ 2014-11-28 3:21 ` Benjamin Herrenschmidt 2014-11-28 10:53 ` David Laight 0 siblings, 1 reply; 29+ messages in thread From: Benjamin Herrenschmidt @ 2014-11-28 3:21 UTC (permalink / raw) To: Madhavan Srinivasan; +Cc: rusty, paulus, anton, linuxppc-dev On Fri, 2014-11-28 at 08:45 +0530, Madhavan Srinivasan wrote: > > Can't we just unconditionally clear at as long as we do that after we've > > saved it ? In that case, it's just a matter for the fixup code to check > > the saved version rather than the actual CR.. > > > I use CR bit setting in the interrupt return path to enter the fixup > section search. If we unconditionally clear it, we will have to enter > the fixup section for every kernel return nip right? As I said above. Can't we look at the saved version ? IE. - On interrupt entry: * Save CR to CCR(r1) * clear CR5 - On exit * Check CCR(r1)'s CR5 field * restore CR Cheers, Ben. ^ permalink raw reply [flat|nested] 29+ messages in thread
* RE: [RFC PATCH 1/2]powerpc: foundation code to handle CR5 for local_t 2014-11-28 3:21 ` Benjamin Herrenschmidt @ 2014-11-28 10:53 ` David Laight 2014-11-30 9:01 ` Benjamin Herrenschmidt 0 siblings, 1 reply; 29+ messages in thread From: David Laight @ 2014-11-28 10:53 UTC (permalink / raw) To: 'Benjamin Herrenschmidt', Madhavan Srinivasan Cc: linuxppc-dev@lists.ozlabs.org, rusty@rustcorp.com.au, paulus@samba.org, anton@samba.org RnJvbTogQmVuamFtaW4gSGVycmVuc2NobWlkdA0KPiBPbiBGcmksIDIwMTQtMTEtMjggYXQgMDg6 NDUgKzA1MzAsIE1hZGhhdmFuIFNyaW5pdmFzYW4gd3JvdGU6DQo+ID4gPiBDYW4ndCB3ZSBqdXN0 IHVuY29uZGl0aW9uYWxseSBjbGVhciBhdCBhcyBsb25nIGFzIHdlIGRvIHRoYXQgYWZ0ZXIgd2Un dmUNCj4gPiA+IHNhdmVkIGl0ID8gSW4gdGhhdCBjYXNlLCBpdCdzIGp1c3QgYSBtYXR0ZXIgZm9y IHRoZSBmaXh1cCBjb2RlIHRvIGNoZWNrDQo+ID4gPiB0aGUgc2F2ZWQgdmVyc2lvbiByYXRoZXIg dGhhbiB0aGUgYWN0dWFsIENSLi4NCj4gPiA+DQo+ID4gSSB1c2UgQ1IgYml0IHNldHRpbmcgaW4g dGhlIGludGVycnVwdCByZXR1cm4gcGF0aCB0byBlbnRlciB0aGUgZml4dXANCj4gPiBzZWN0aW9u IHNlYXJjaC4gSWYgd2UgdW5jb25kaXRpb25hbGx5IGNsZWFyIGl0LCB3ZSB3aWxsIGhhdmUgdG8g ZW50ZXINCj4gPiB0aGUgZml4dXAgc2VjdGlvbiBmb3IgZXZlcnkga2VybmVsIHJldHVybiBuaXAg cmlnaHQ/DQo+IA0KPiBBcyBJIHNhaWQgYWJvdmUuIENhbid0IHdlIGxvb2sgYXQgdGhlIHNhdmVk IHZlcnNpb24gPw0KPiANCj4gSUUuDQo+IA0KPiAgLSBPbiBpbnRlcnJ1cHQgZW50cnk6DQo+IA0K PiAJKiBTYXZlIENSIHRvIENDUihyMSkNCj4gCSogY2xlYXIgQ1I1DQo+IA0KPiAgLSBPbiBleGl0 DQo+IA0KPiAJKiBDaGVjayBDQ1IocjEpJ3MgQ1I1IGZpZWxkDQo+IAkqIHJlc3RvcmUgQ1INCg0K QWN0dWFsbHkgdGhlcmUgaXMgbm8gcmVhbCByZWFzb24gd2h5IHRoZSAnZml4dXAnIGNhbid0IGJl IGRvbmUNCmR1cmluZyBpbnRlcnJ1cHQgZW50cnkuDQoNCglEYXZpZA0KDQo= ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [RFC PATCH 1/2]powerpc: foundation code to handle CR5 for local_t 2014-11-28 10:53 ` David Laight @ 2014-11-30 9:01 ` Benjamin Herrenschmidt 0 siblings, 0 replies; 29+ messages in thread From: Benjamin Herrenschmidt @ 2014-11-30 9:01 UTC (permalink / raw) To: David Laight Cc: paulus@samba.org, rusty@rustcorp.com.au, Madhavan Srinivasan, linuxppc-dev@lists.ozlabs.org, anton@samba.org On Fri, 2014-11-28 at 10:53 +0000, David Laight wrote: > From: Benjamin Herrenschmidt > > On Fri, 2014-11-28 at 08:45 +0530, Madhavan Srinivasan wrote: > > > > Can't we just unconditionally clear at as long as we do that after we've > > > > saved it ? In that case, it's just a matter for the fixup code to check > > > > the saved version rather than the actual CR.. > > > > > > > I use CR bit setting in the interrupt return path to enter the fixup > > > section search. If we unconditionally clear it, we will have to enter > > > the fixup section for every kernel return nip right? > > > > As I said above. Can't we look at the saved version ? > > > > IE. > > > > - On interrupt entry: > > > > * Save CR to CCR(r1) > > * clear CR5 > > > > - On exit > > > > * Check CCR(r1)'s CR5 field > > * restore CR > > Actually there is no real reason why the 'fixup' can't be done > during interrupt entry. Other than if we crash, we get the wrong PC in the log etc... unlikely but I tend to prefer this. Also if we ever allow something like a local atomic on a faulting (uesrspace) address, we want a precise PC on entry. Generally, we have a lot more entry path than exit path, it's easier to keep the entry path simpler. Cheers, Ben. > David > > _______________________________________________ > Linuxppc-dev mailing list > Linuxppc-dev@lists.ozlabs.org > https://lists.ozlabs.org/listinfo/linuxppc-dev ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [RFC PATCH 1/2]powerpc: foundation code to handle CR5 for local_t 2014-11-27 12:18 ` [RFC PATCH 1/2]powerpc: foundation code to handle CR5 for local_t Madhavan Srinivasan 2014-11-27 16:56 ` Segher Boessenkool 2014-11-28 0:56 ` Benjamin Herrenschmidt @ 2014-12-01 21:35 ` Gabriel Paubert 2014-12-03 14:59 ` Madhavan Srinivasan 2014-12-02 2:04 ` Scott Wood 3 siblings, 1 reply; 29+ messages in thread From: Gabriel Paubert @ 2014-12-01 21:35 UTC (permalink / raw) To: Madhavan Srinivasan; +Cc: rusty, paulus, anton, linuxppc-dev On Thu, Nov 27, 2014 at 05:48:40PM +0530, Madhavan Srinivasan wrote: > This patch create the infrastructure to handle the CR based > local_* atomic operations. Local atomic operations are fast > and highly reentrant per CPU counters. Used for percpu > variable updates. Local atomic operations only guarantee > variable modification atomicity wrt the CPU which owns the > data and these needs to be executed in a preemption safe way. > > Here is the design of this patch. Since local_* operations > are only need to be atomic to interrupts (IIUC), patch uses > one of the Condition Register (CR) fields as a flag variable. When > entering the local_*, specific bit in the CR5 field is set > and on exit, bit is cleared. CR bit checking is done in the > interrupt return path. If CR5[EQ] bit set and if we return > to kernel, we reset to start of local_* operation. > > Reason for this approach is that, currently l[w/d]arx/st[w/d]cx. > instruction pair is used for local_* operations, which are heavy > on cycle count and they dont support a local variant. So to > see whether the new implementation helps, used a modified > version of Rusty's benchmark code on local_t. > > https://lkml.org/lkml/2008/12/16/450 > > Modifications: > - increated the working set size from 1MB to 8MB, > - removed cpu_local_inc test. > > Test ran > - on POWER8 1S Scale out System 2.0GHz > - on OPAL v3 with v3.18-rc4 patch kernel as Host > > Here are the values with the patch. > > Time in ns per iteration > > inc add read add_return > atomic_long 67 67 18 69 > irqsave/rest 39 39 23 39 > trivalue 39 39 29 49 > local_t 26 26 24 26 > > Since CR5 is used as a flag, have added CFLAGS to avoid CR5 > for the kernel compilation and CR5 is zeroed at the kernel > entry. > > Tested the patch in a > - pSeries LPAR, > - Host with patched/unmodified guest kernel > > To check whether userspace see any CR5 corruption, ran a simple > test which does, > - set CR5 field, > - while(1) > - sleep or gettimeofday > - chk bit set > > Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> > --- > - I really appreciate feedback on the patchset. > - Kindly comment if I should try with any other benchmark or > workload to check the numbers. > - Also, kindly recommand any know stress test for CR > > Makefile | 6 ++ > arch/powerpc/include/asm/exception-64s.h | 21 +++++- > arch/powerpc/kernel/entry_64.S | 106 ++++++++++++++++++++++++++++++- > arch/powerpc/kernel/exceptions-64s.S | 2 +- > arch/powerpc/kernel/head_64.S | 8 +++ > 5 files changed, 138 insertions(+), 5 deletions(-) > > diff --git a/Makefile b/Makefile > index 00d618b..2e271ad 100644 > --- a/Makefile > +++ b/Makefile > @@ -706,6 +706,12 @@ endif > > KBUILD_CFLAGS += $(call cc-option, -fno-var-tracking-assignments) > > +ifdef CONFIG_PPC64 > +# We need this flag to force compiler not to use CR5, since > +# local_t type code is based on this. > +KBUILD_CFLAGS += -ffixed-cr5 > +endif > + > ifdef CONFIG_DEBUG_INFO > ifdef CONFIG_DEBUG_INFO_SPLIT > KBUILD_CFLAGS += $(call cc-option, -gsplit-dwarf, -g) > diff --git a/arch/powerpc/include/asm/exception-64s.h b/arch/powerpc/include/asm/exception-64s.h > index 77f52b2..c42919a 100644 > --- a/arch/powerpc/include/asm/exception-64s.h > +++ b/arch/powerpc/include/asm/exception-64s.h > @@ -306,7 +306,26 @@ do_kvm_##n: \ > std r10,0(r1); /* make stack chain pointer */ \ > std r0,GPR0(r1); /* save r0 in stackframe */ \ > std r10,GPR1(r1); /* save r1 in stackframe */ \ > - beq 4f; /* if from kernel mode */ \ > +BEGIN_FTR_SECTION; \ > + lis r9,4096; /* Create a mask with HV and PR */ \ > + rldicr r9,r9,32,31; /* bits, AND with the MSR */ \ > + mr r10,r9; /* to check for Hyp state */ \ > + ori r9,r9,16384; \ > + and r9,r12,r9; \ > + cmpd cr3,r10,r9; \ > + beq cr3,66f; /* Jump if we come from Hyp mode*/ \ > + mtcrf 0x04,r10; /* Clear CR5 if coming from usr */ \ I think you can do better than this, powerpc has a fantastic set of rotate and mask instructions. If I understand correctly your code you can replace it with the following: rldicl r10,r12,4,63 /* Extract HV bit to LSB of r10*/ rlwinm r9,r12,19,0x02 /* Extract PR bit to 2nd to last bit of r9 */ or r9,r9,10 cmplwi cr3,r9,1 /* Check for HV=1 and PR=0 */ beq cr3,66f mtcrf 0x04,r10 /* Bits going to cr5 bits are 0 in r10 */ and obviously similarly for the other instances of this code sequence. This said, mtcrf is not necessarily the fastest way of setting cr5 if you only want to clear the EQ bit. For example comparing the stack pointer with 0 should have the same effect. > +FTR_SECTION_ELSE; \ > + beq 4f; /* if kernel mode branch */ \ > + li r10,0; /* Clear CR5 incase of coming */ \ > + mtcrf 0x04,r10; /* from user. */ \ > + nop; /* This part of code is for */ \ > + nop; /* kernel with MSR[HV]=0, */ \ > + nop; /* MSR[PR]=0, so just chk for */ \ > + nop; /* MSR[PR] */ \ > + nop; \ > +ALT_FTR_SECTION_END_IFSET(CPU_FTR_HVMODE); \ > +66: beq 4f; /* if from kernel mode */ \ > ACCOUNT_CPU_USER_ENTRY(r9, r10); \ > SAVE_PPR(area, r9, r10); \ > 4: EXCEPTION_PROLOG_COMMON_2(area) \ > diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S > index 0905c8d..e42bb99 100644 > --- a/arch/powerpc/kernel/entry_64.S > +++ b/arch/powerpc/kernel/entry_64.S > @@ -68,7 +68,26 @@ system_call_common: > 2: std r2,GPR2(r1) > std r3,GPR3(r1) > mfcr r2 > - std r4,GPR4(r1) > +BEGIN_FTR_SECTION > + lis r10,4096 > + rldicr r10,r10,32,31 > + mr r11,r10 > + ori r10,r10,16384 > + and r10,r12,r10 > + cmpd r11,r10 > + beq 67f > + mtcrf 0x04,r11 > +FTR_SECTION_ELSE > + beq 67f > + li r11,0 > + mtcrf 0x04,r11 > + nop > + nop > + nop > + nop > + nop > +ALT_FTR_SECTION_END_IFSET(CPU_FTR_HVMODE) > +67: std r4,GPR4(r1) > std r5,GPR5(r1) > std r6,GPR6(r1) > std r7,GPR7(r1) > @@ -224,8 +243,26 @@ syscall_exit: > BEGIN_FTR_SECTION > stdcx. r0,0,r1 /* to clear the reservation */ > END_FTR_SECTION_IFCLR(CPU_FTR_STCX_CHECKS_ADDRESS) > +BEGIN_FTR_SECTION > + lis r4,4096 > + rldicr r4,r4,32,31 > + mr r6,r4 > + ori r4,r4,16384 > + and r4,r8,r4 > + cmpd cr3,r6,r4 > + beq cr3,65f > + mtcr r5 > +FTR_SECTION_ELSE > andi. r6,r8,MSR_PR > - ld r4,_LINK(r1) > + beq 65f > + mtcr r5 > + nop > + nop > + nop > + nop > + nop > +ALT_FTR_SECTION_END_IFSET(CPU_FTR_HVMODE) > +65: ld r4,_LINK(r1) > > beq- 1f > ACCOUNT_CPU_USER_EXIT(r11, r12) > @@ -234,7 +271,11 @@ END_FTR_SECTION_IFCLR(CPU_FTR_STCX_CHECKS_ADDRESS) > 1: ld r2,GPR2(r1) > ld r1,GPR1(r1) > mtlr r4 > +#ifdef CONFIG_PPC64 > + mtcrf 0xFB,r5 > +#else > mtcr r5 > +#endif > mtspr SPRN_SRR0,r7 > mtspr SPRN_SRR1,r8 > RFI > @@ -804,7 +845,66 @@ ALT_FTR_SECTION_END_IFCLR(CPU_FTR_STCX_CHECKS_ADDRESS) > */ > .globl fast_exception_return > fast_exception_return: > - ld r3,_MSR(r1) > + > + /* > + * Now that we are about to exit from interrupt, lets check for > + * cr5 eq bit. If it is set, then we may be in the middle of > + * local_t update. In this case, we should rewind the NIP > + * accordingly. > + */ > + mfcr r3 > + andi. r4,r3,0x200 > + beq 63f I believe that someonw has already mentioned that this is a very convoluted way of testing a cr bit. This can be optimized to bne cr5,63f > + > + /* > + * Now that the bit is set, lets check for return to User > + */ > + ld r4,_MSR(r1) > +BEGIN_FTR_SECTION > + li r3,4096 > + rldicr r3,r3,32,31 > + mr r5,r3 > + ori r3,r3,16384 > + and r3,r4,r3 > + cmpd r5,r3 > + bne 63f > +FTR_SECTION_ELSE > + andi. r3,r4,MSR_PR > + bne 63f > + nop > + nop > + nop > + nop > + nop > +ALT_FTR_SECTION_END_IFSET(CPU_FTR_HVMODE) > + > + /* > + * Looks like we are returning to Kernel, so > + * lets get the NIP and search the ex_table. > + * Change the NIP based on the return value > + */ > +lookup_ex_table: > + ld r3,_NIP(r1) > + bl search_exception_tables > + cmpli 0,1,r3,0 > + bne 62f > + > + /* > + * This is a panic case. Reason is that, we > + * have the CR5 bit set, but we are not in > + * local_* code and we are returning to Kernel. > + */ > + ld r3,_NIP(r1) > + mfcr r4 > + EMIT_BUG_ENTRY lookup_ex_table, __FILE__,__LINE__,BUGFLAG_WARNING > + > + /* > + * Now save the return fixup address as NIP > + */ > +62: ld r4,8(r3) > + std r4,_NIP(r1) > + crclr 22 > +63: ld r3,_MSR(r1) > ld r4,_CTR(r1) > ld r0,_LINK(r1) > mtctr r4 > diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S > index 72e783e..edb75a9 100644 > --- a/arch/powerpc/kernel/exceptions-64s.S > +++ b/arch/powerpc/kernel/exceptions-64s.S > @@ -637,7 +637,7 @@ masked_##_H##interrupt: \ > rldicl r10,r10,48,1; /* clear MSR_EE */ \ > rotldi r10,r10,16; \ > mtspr SPRN_##_H##SRR1,r10; \ > -2: mtcrf 0x80,r9; \ > +2: mtcrf 0x90,r9; \ > ld r9,PACA_EXGEN+EX_R9(r13); \ > ld r10,PACA_EXGEN+EX_R10(r13); \ > ld r11,PACA_EXGEN+EX_R11(r13); \ > diff --git a/arch/powerpc/kernel/head_64.S b/arch/powerpc/kernel/head_64.S > index d48125d..02e49b3 100644 > --- a/arch/powerpc/kernel/head_64.S > +++ b/arch/powerpc/kernel/head_64.S > @@ -347,6 +347,14 @@ __mmu_off: > * > */ > __start_initialization_multiplatform: > + > + /* > + * Before we do anything, lets clear CR5 field, > + * so that we will have a clean start at entry > + */ > + li r11,0 > + mtcrf 0x04,r11 > + > /* Make sure we are running in 64 bits mode */ > bl enable_64b_mode > Regards, Gabriel ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [RFC PATCH 1/2]powerpc: foundation code to handle CR5 for local_t 2014-12-01 21:35 ` Gabriel Paubert @ 2014-12-03 14:59 ` Madhavan Srinivasan 2014-12-03 17:07 ` Gabriel Paubert 0 siblings, 1 reply; 29+ messages in thread From: Madhavan Srinivasan @ 2014-12-03 14:59 UTC (permalink / raw) To: Gabriel Paubert; +Cc: rusty, paulus, anton, linuxppc-dev On Tuesday 02 December 2014 03:05 AM, Gabriel Paubert wrote: > On Thu, Nov 27, 2014 at 05:48:40PM +0530, Madhavan Srinivasan wrote: >> This patch create the infrastructure to handle the CR based >> local_* atomic operations. Local atomic operations are fast >> and highly reentrant per CPU counters. Used for percpu >> variable updates. Local atomic operations only guarantee >> variable modification atomicity wrt the CPU which owns the >> data and these needs to be executed in a preemption safe way. >> >> Here is the design of this patch. Since local_* operations >> are only need to be atomic to interrupts (IIUC), patch uses >> one of the Condition Register (CR) fields as a flag variable. When >> entering the local_*, specific bit in the CR5 field is set >> and on exit, bit is cleared. CR bit checking is done in the >> interrupt return path. If CR5[EQ] bit set and if we return >> to kernel, we reset to start of local_* operation. >> >> Reason for this approach is that, currently l[w/d]arx/st[w/d]cx. >> instruction pair is used for local_* operations, which are heavy >> on cycle count and they dont support a local variant. So to >> see whether the new implementation helps, used a modified >> version of Rusty's benchmark code on local_t. >> >> https://lkml.org/lkml/2008/12/16/450 >> >> Modifications: >> - increated the working set size from 1MB to 8MB, >> - removed cpu_local_inc test. >> >> Test ran >> - on POWER8 1S Scale out System 2.0GHz >> - on OPAL v3 with v3.18-rc4 patch kernel as Host >> >> Here are the values with the patch. >> >> Time in ns per iteration >> >> inc add read add_return >> atomic_long 67 67 18 69 >> irqsave/rest 39 39 23 39 >> trivalue 39 39 29 49 >> local_t 26 26 24 26 >> >> Since CR5 is used as a flag, have added CFLAGS to avoid CR5 >> for the kernel compilation and CR5 is zeroed at the kernel >> entry. >> >> Tested the patch in a >> - pSeries LPAR, >> - Host with patched/unmodified guest kernel >> >> To check whether userspace see any CR5 corruption, ran a simple >> test which does, >> - set CR5 field, >> - while(1) >> - sleep or gettimeofday >> - chk bit set >> >> Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> >> --- >> - I really appreciate feedback on the patchset. >> - Kindly comment if I should try with any other benchmark or >> workload to check the numbers. >> - Also, kindly recommand any know stress test for CR >> >> Makefile | 6 ++ >> arch/powerpc/include/asm/exception-64s.h | 21 +++++- >> arch/powerpc/kernel/entry_64.S | 106 ++++++++++++++++++++++++++++++- >> arch/powerpc/kernel/exceptions-64s.S | 2 +- >> arch/powerpc/kernel/head_64.S | 8 +++ >> 5 files changed, 138 insertions(+), 5 deletions(-) >> >> diff --git a/Makefile b/Makefile >> index 00d618b..2e271ad 100644 >> --- a/Makefile >> +++ b/Makefile >> @@ -706,6 +706,12 @@ endif >> >> KBUILD_CFLAGS += $(call cc-option, -fno-var-tracking-assignments) >> >> +ifdef CONFIG_PPC64 >> +# We need this flag to force compiler not to use CR5, since >> +# local_t type code is based on this. >> +KBUILD_CFLAGS += -ffixed-cr5 >> +endif >> + >> ifdef CONFIG_DEBUG_INFO >> ifdef CONFIG_DEBUG_INFO_SPLIT >> KBUILD_CFLAGS += $(call cc-option, -gsplit-dwarf, -g) >> diff --git a/arch/powerpc/include/asm/exception-64s.h b/arch/powerpc/include/asm/exception-64s.h >> index 77f52b2..c42919a 100644 >> --- a/arch/powerpc/include/asm/exception-64s.h >> +++ b/arch/powerpc/include/asm/exception-64s.h >> @@ -306,7 +306,26 @@ do_kvm_##n: \ >> std r10,0(r1); /* make stack chain pointer */ \ >> std r0,GPR0(r1); /* save r0 in stackframe */ \ >> std r10,GPR1(r1); /* save r1 in stackframe */ \ >> - beq 4f; /* if from kernel mode */ \ >> +BEGIN_FTR_SECTION; \ >> + lis r9,4096; /* Create a mask with HV and PR */ \ >> + rldicr r9,r9,32,31; /* bits, AND with the MSR */ \ >> + mr r10,r9; /* to check for Hyp state */ \ >> + ori r9,r9,16384; \ >> + and r9,r12,r9; \ >> + cmpd cr3,r10,r9; \ >> + beq cr3,66f; /* Jump if we come from Hyp mode*/ \ >> + mtcrf 0x04,r10; /* Clear CR5 if coming from usr */ \ > > I think you can do better than this, powerpc has a fantastic set > of rotate and mask instructions. If I understand correctly your > code you can replace it with the following: > > rldicl r10,r12,4,63 /* Extract HV bit to LSB of r10*/ > rlwinm r9,r12,19,0x02 /* Extract PR bit to 2nd to last bit of r9 */ > or r9,r9,10 > cmplwi cr3,r9,1 /* Check for HV=1 and PR=0 */ > beq cr3,66f > mtcrf 0x04,r10 /* Bits going to cr5 bits are 0 in r10 */ > >From previous comment, at this point, CR0 already has MSR[PR], and only HV need to be checked. So I guess there still room for optimization. But good to learn this seq. > and obviously similarly for the other instances of this code sequence. > > This said, mtcrf is not necessarily the fastest way of setting cr5 if > you only want to clear the EQ bit. For example comparing the stack pointer > with 0 should have the same effect. > Yes. Will try this out. >> +FTR_SECTION_ELSE; \ >> + beq 4f; /* if kernel mode branch */ \ >> + li r10,0; /* Clear CR5 incase of coming */ \ >> + mtcrf 0x04,r10; /* from user. */ \ >> + nop; /* This part of code is for */ \ >> + nop; /* kernel with MSR[HV]=0, */ \ >> + nop; /* MSR[PR]=0, so just chk for */ \ >> + nop; /* MSR[PR] */ \ >> + nop; \ >> +ALT_FTR_SECTION_END_IFSET(CPU_FTR_HVMODE); \ >> +66: beq 4f; /* if from kernel mode */ \ >> ACCOUNT_CPU_USER_ENTRY(r9, r10); \ >> SAVE_PPR(area, r9, r10); \ >> 4: EXCEPTION_PROLOG_COMMON_2(area) \ >> diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S >> index 0905c8d..e42bb99 100644 >> --- a/arch/powerpc/kernel/entry_64.S >> +++ b/arch/powerpc/kernel/entry_64.S >> @@ -68,7 +68,26 @@ system_call_common: >> 2: std r2,GPR2(r1) >> std r3,GPR3(r1) >> mfcr r2 >> - std r4,GPR4(r1) >> +BEGIN_FTR_SECTION >> + lis r10,4096 >> + rldicr r10,r10,32,31 >> + mr r11,r10 >> + ori r10,r10,16384 >> + and r10,r12,r10 >> + cmpd r11,r10 >> + beq 67f >> + mtcrf 0x04,r11 >> +FTR_SECTION_ELSE >> + beq 67f >> + li r11,0 >> + mtcrf 0x04,r11 >> + nop >> + nop >> + nop >> + nop >> + nop >> +ALT_FTR_SECTION_END_IFSET(CPU_FTR_HVMODE) >> +67: std r4,GPR4(r1) >> std r5,GPR5(r1) >> std r6,GPR6(r1) >> std r7,GPR7(r1) >> @@ -224,8 +243,26 @@ syscall_exit: >> BEGIN_FTR_SECTION >> stdcx. r0,0,r1 /* to clear the reservation */ >> END_FTR_SECTION_IFCLR(CPU_FTR_STCX_CHECKS_ADDRESS) >> +BEGIN_FTR_SECTION >> + lis r4,4096 >> + rldicr r4,r4,32,31 >> + mr r6,r4 >> + ori r4,r4,16384 >> + and r4,r8,r4 >> + cmpd cr3,r6,r4 >> + beq cr3,65f >> + mtcr r5 >> +FTR_SECTION_ELSE >> andi. r6,r8,MSR_PR >> - ld r4,_LINK(r1) >> + beq 65f >> + mtcr r5 >> + nop >> + nop >> + nop >> + nop >> + nop >> +ALT_FTR_SECTION_END_IFSET(CPU_FTR_HVMODE) >> +65: ld r4,_LINK(r1) >> >> beq- 1f >> ACCOUNT_CPU_USER_EXIT(r11, r12) >> @@ -234,7 +271,11 @@ END_FTR_SECTION_IFCLR(CPU_FTR_STCX_CHECKS_ADDRESS) >> 1: ld r2,GPR2(r1) >> ld r1,GPR1(r1) >> mtlr r4 >> +#ifdef CONFIG_PPC64 >> + mtcrf 0xFB,r5 >> +#else >> mtcr r5 >> +#endif >> mtspr SPRN_SRR0,r7 >> mtspr SPRN_SRR1,r8 >> RFI >> @@ -804,7 +845,66 @@ ALT_FTR_SECTION_END_IFCLR(CPU_FTR_STCX_CHECKS_ADDRESS) >> */ >> .globl fast_exception_return >> fast_exception_return: >> - ld r3,_MSR(r1) >> + >> + /* >> + * Now that we are about to exit from interrupt, lets check for >> + * cr5 eq bit. If it is set, then we may be in the middle of >> + * local_t update. In this case, we should rewind the NIP >> + * accordingly. >> + */ >> + mfcr r3 >> + andi. r4,r3,0x200 >> + beq 63f > > I believe that someonw has already mentioned that this is a very > convoluted way of testing a cr bit. This can be optimized to bne cr5,63f > This. Will change it. Regards Maddy >> + >> + /* >> + * Now that the bit is set, lets check for return to User >> + */ >> + ld r4,_MSR(r1) >> +BEGIN_FTR_SECTION >> + li r3,4096 >> + rldicr r3,r3,32,31 >> + mr r5,r3 >> + ori r3,r3,16384 >> + and r3,r4,r3 >> + cmpd r5,r3 >> + bne 63f >> +FTR_SECTION_ELSE >> + andi. r3,r4,MSR_PR >> + bne 63f >> + nop >> + nop >> + nop >> + nop >> + nop >> +ALT_FTR_SECTION_END_IFSET(CPU_FTR_HVMODE) >> + >> + /* >> + * Looks like we are returning to Kernel, so >> + * lets get the NIP and search the ex_table. >> + * Change the NIP based on the return value >> + */ >> +lookup_ex_table: >> + ld r3,_NIP(r1) >> + bl search_exception_tables >> + cmpli 0,1,r3,0 >> + bne 62f >> + >> + /* >> + * This is a panic case. Reason is that, we >> + * have the CR5 bit set, but we are not in >> + * local_* code and we are returning to Kernel. >> + */ >> + ld r3,_NIP(r1) >> + mfcr r4 >> + EMIT_BUG_ENTRY lookup_ex_table, __FILE__,__LINE__,BUGFLAG_WARNING >> + >> + /* >> + * Now save the return fixup address as NIP >> + */ >> +62: ld r4,8(r3) >> + std r4,_NIP(r1) >> + crclr 22 >> +63: ld r3,_MSR(r1) >> ld r4,_CTR(r1) >> ld r0,_LINK(r1) >> mtctr r4 >> diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S >> index 72e783e..edb75a9 100644 >> --- a/arch/powerpc/kernel/exceptions-64s.S >> +++ b/arch/powerpc/kernel/exceptions-64s.S >> @@ -637,7 +637,7 @@ masked_##_H##interrupt: \ >> rldicl r10,r10,48,1; /* clear MSR_EE */ \ >> rotldi r10,r10,16; \ >> mtspr SPRN_##_H##SRR1,r10; \ >> -2: mtcrf 0x80,r9; \ >> +2: mtcrf 0x90,r9; \ >> ld r9,PACA_EXGEN+EX_R9(r13); \ >> ld r10,PACA_EXGEN+EX_R10(r13); \ >> ld r11,PACA_EXGEN+EX_R11(r13); \ >> diff --git a/arch/powerpc/kernel/head_64.S b/arch/powerpc/kernel/head_64.S >> index d48125d..02e49b3 100644 >> --- a/arch/powerpc/kernel/head_64.S >> +++ b/arch/powerpc/kernel/head_64.S >> @@ -347,6 +347,14 @@ __mmu_off: >> * >> */ >> __start_initialization_multiplatform: >> + >> + /* >> + * Before we do anything, lets clear CR5 field, >> + * so that we will have a clean start at entry >> + */ >> + li r11,0 >> + mtcrf 0x04,r11 >> + >> /* Make sure we are running in 64 bits mode */ >> bl enable_64b_mode >> > > > Regards, > Gabriel > ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [RFC PATCH 1/2]powerpc: foundation code to handle CR5 for local_t 2014-12-03 14:59 ` Madhavan Srinivasan @ 2014-12-03 17:07 ` Gabriel Paubert 0 siblings, 0 replies; 29+ messages in thread From: Gabriel Paubert @ 2014-12-03 17:07 UTC (permalink / raw) To: Madhavan Srinivasan; +Cc: rusty, paulus, anton, linuxppc-dev On Wed, Dec 03, 2014 at 08:29:37PM +0530, Madhavan Srinivasan wrote: > On Tuesday 02 December 2014 03:05 AM, Gabriel Paubert wrote: > > On Thu, Nov 27, 2014 at 05:48:40PM +0530, Madhavan Srinivasan wrote: > >> diff --git a/arch/powerpc/include/asm/exception-64s.h b/arch/powerpc/include/asm/exception-64s.h > >> index 77f52b2..c42919a 100644 > >> --- a/arch/powerpc/include/asm/exception-64s.h > >> +++ b/arch/powerpc/include/asm/exception-64s.h > >> @@ -306,7 +306,26 @@ do_kvm_##n: \ > >> std r10,0(r1); /* make stack chain pointer */ \ > >> std r0,GPR0(r1); /* save r0 in stackframe */ \ > >> std r10,GPR1(r1); /* save r1 in stackframe */ \ > >> - beq 4f; /* if from kernel mode */ \ > >> +BEGIN_FTR_SECTION; \ > >> + lis r9,4096; /* Create a mask with HV and PR */ \ > >> + rldicr r9,r9,32,31; /* bits, AND with the MSR */ \ > >> + mr r10,r9; /* to check for Hyp state */ \ > >> + ori r9,r9,16384; \ > >> + and r9,r12,r9; \ > >> + cmpd cr3,r10,r9; \ > >> + beq cr3,66f; /* Jump if we come from Hyp mode*/ \ > >> + mtcrf 0x04,r10; /* Clear CR5 if coming from usr */ \ > > > > I think you can do better than this, powerpc has a fantastic set > > of rotate and mask instructions. If I understand correctly your > > code you can replace it with the following: > > > > rldicl r10,r12,4,63 /* Extract HV bit to LSB of r10*/ > > rlwinm r9,r12,19,0x02 /* Extract PR bit to 2nd to last bit of r9 */ > > or r9,r9,10 > > cmplwi cr3,r9,1 /* Check for HV=1 and PR=0 */ > > beq cr3,66f > > mtcrf 0x04,r10 /* Bits going to cr5 bits are 0 in r10 */ > > > > >From previous comment, at this point, CR0 already has MSR[PR], and only > HV need to be checked. So I guess there still room for optimization. > But good to learn this seq. Actually the sequence I suggested can even be shortened again since the or is not necessary and you can use an rlwimi instead. rldicl r9,r12,5,62 /* r9 = 62 zeroes | HV | ? */ rlwimi r9,r12,18,0x01 /* r9 = 62 zeroes | HV | PR */ cmplwi cr3,r9,2 /* Check for HV=1 and PR=0 */ For 32 bit operands, rlwimi is a generic bit field to bit field move, but GCC is (was, maybe GCC5 is better?) quite bad at recognizing opportunities to use it. I'm not sure that it is better to have two separate branches, one for testing PR and the other for testing HV. Through how many branches can the hardware speculate? Unless I'm mistaken, this is code which is likely not to be in the cache so I would optimize it first towards minimal code size. Regards, Gabriel ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [RFC PATCH 1/2]powerpc: foundation code to handle CR5 for local_t 2014-11-27 12:18 ` [RFC PATCH 1/2]powerpc: foundation code to handle CR5 for local_t Madhavan Srinivasan ` (2 preceding siblings ...) 2014-12-01 21:35 ` Gabriel Paubert @ 2014-12-02 2:04 ` Scott Wood 2014-12-03 14:49 ` Madhavan Srinivasan 3 siblings, 1 reply; 29+ messages in thread From: Scott Wood @ 2014-12-02 2:04 UTC (permalink / raw) To: Madhavan Srinivasan; +Cc: rusty, paulus, anton, linuxppc-dev On Thu, 2014-11-27 at 17:48 +0530, Madhavan Srinivasan wrote: > - I really appreciate feedback on the patchset. > - Kindly comment if I should try with any other benchmark or > workload to check the numbers. > - Also, kindly recommand any know stress test for CR > > Makefile | 6 ++ > arch/powerpc/include/asm/exception-64s.h | 21 +++++- > arch/powerpc/kernel/entry_64.S | 106 ++++++++++++++++++++++++++++++- > arch/powerpc/kernel/exceptions-64s.S | 2 +- > arch/powerpc/kernel/head_64.S | 8 +++ > 5 files changed, 138 insertions(+), 5 deletions(-) Patch 2/2 enables this for all PPC64, not just book3s -- so please don't forget about the book3e exception paths (also MSR[GS] for KVM, but aren't most if not all the places you're checking for HV mode after KVM would have taken control? Or am I missing something about how book3s KVM works?). Or, if you don't want to do that, change patch 2/2 to be book3s only and ifdef-protect the changes to common exception code. > @@ -224,8 +243,26 @@ syscall_exit: > BEGIN_FTR_SECTION > stdcx. r0,0,r1 /* to clear the reservation */ > END_FTR_SECTION_IFCLR(CPU_FTR_STCX_CHECKS_ADDRESS) > +BEGIN_FTR_SECTION > + lis r4,4096 > + rldicr r4,r4,32,31 > + mr r6,r4 > + ori r4,r4,16384 > + and r4,r8,r4 > + cmpd cr3,r6,r4 > + beq cr3,65f > + mtcr r5 > +FTR_SECTION_ELSE > andi. r6,r8,MSR_PR > - ld r4,_LINK(r1) > + beq 65f > + mtcr r5 > + nop > + nop > + nop > + nop > + nop > +ALT_FTR_SECTION_END_IFSET(CPU_FTR_HVMODE) > +65: ld r4,_LINK(r1) > > beq- 1f > ACCOUNT_CPU_USER_EXIT(r11, r12) > @@ -234,7 +271,11 @@ END_FTR_SECTION_IFCLR(CPU_FTR_STCX_CHECKS_ADDRESS) > 1: ld r2,GPR2(r1) > ld r1,GPR1(r1) > mtlr r4 > +#ifdef CONFIG_PPC64 > + mtcrf 0xFB,r5 > +#else > mtcr r5 > +#endif mtcrf with more than one CRn being updated is expensive on Freescale chips (and this isn't a book3s-only code path). Why do you need to do it twice? I don't see where either r5 or cr5 are messed with between the two places... -Scott ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [RFC PATCH 1/2]powerpc: foundation code to handle CR5 for local_t 2014-12-02 2:04 ` Scott Wood @ 2014-12-03 14:49 ` Madhavan Srinivasan 0 siblings, 0 replies; 29+ messages in thread From: Madhavan Srinivasan @ 2014-12-03 14:49 UTC (permalink / raw) To: Scott Wood; +Cc: rusty, paulus, anton, linuxppc-dev On Tuesday 02 December 2014 07:34 AM, Scott Wood wrote: > On Thu, 2014-11-27 at 17:48 +0530, Madhavan Srinivasan wrote: >> - I really appreciate feedback on the patchset. >> - Kindly comment if I should try with any other benchmark or >> workload to check the numbers. >> - Also, kindly recommand any know stress test for CR >> >> Makefile | 6 ++ >> arch/powerpc/include/asm/exception-64s.h | 21 +++++- >> arch/powerpc/kernel/entry_64.S | 106 ++++++++++++++++++++++++++++++- >> arch/powerpc/kernel/exceptions-64s.S | 2 +- >> arch/powerpc/kernel/head_64.S | 8 +++ >> 5 files changed, 138 insertions(+), 5 deletions(-) > > Patch 2/2 enables this for all PPC64, not just book3s -- so please don't > forget about the book3e exception paths (also MSR[GS] for KVM, but > aren't most if not all the places you're checking for HV mode after KVM > would have taken control? Or am I missing something about how book3s > KVM works?). > > Or, if you don't want to do that, change patch 2/2 to be book3s only and > ifdef-protect the changes to common exception code. > Still learning the interrupt path and various configs. Was assuming PPC64 is for server side. My bad. Will change the it to book3s and also to common code. >> @@ -224,8 +243,26 @@ syscall_exit: >> BEGIN_FTR_SECTION >> stdcx. r0,0,r1 /* to clear the reservation */ >> END_FTR_SECTION_IFCLR(CPU_FTR_STCX_CHECKS_ADDRESS) >> +BEGIN_FTR_SECTION >> + lis r4,4096 >> + rldicr r4,r4,32,31 >> + mr r6,r4 >> + ori r4,r4,16384 >> + and r4,r8,r4 >> + cmpd cr3,r6,r4 >> + beq cr3,65f >> + mtcr r5 >> +FTR_SECTION_ELSE >> andi. r6,r8,MSR_PR >> - ld r4,_LINK(r1) >> + beq 65f >> + mtcr r5 >> + nop >> + nop >> + nop >> + nop >> + nop >> +ALT_FTR_SECTION_END_IFSET(CPU_FTR_HVMODE) >> +65: ld r4,_LINK(r1) >> >> beq- 1f >> ACCOUNT_CPU_USER_EXIT(r11, r12) >> @@ -234,7 +271,11 @@ END_FTR_SECTION_IFCLR(CPU_FTR_STCX_CHECKS_ADDRESS) >> 1: ld r2,GPR2(r1) >> ld r1,GPR1(r1) >> mtlr r4 >> +#ifdef CONFIG_PPC64 >> + mtcrf 0xFB,r5 >> +#else >> mtcr r5 >> +#endif > > mtcrf with more than one CRn being updated is expensive on Freescale > chips (and this isn't a book3s-only code path). Why do you need to do > it twice? I don't see where either r5 or cr5 are messed with between > the two places... > Incase of returning to userspace, will be updating it twice. Which can be avoid. Will redo this. Regards Maddy > -Scott > > ^ permalink raw reply [flat|nested] 29+ messages in thread
* [RFC PATCH 2/2]powerpc: rewrite local_* to use CR5 flag 2014-11-27 12:18 [RFC PATCH 0/2] powerpc: CR based local atomic operation implementation Madhavan Srinivasan 2014-11-27 12:18 ` [RFC PATCH 1/2]powerpc: foundation code to handle CR5 for local_t Madhavan Srinivasan @ 2014-11-27 12:18 ` Madhavan Srinivasan 2014-12-01 18:01 ` Gabriel Paubert 2014-11-27 14:05 ` [RFC PATCH 0/2] powerpc: CR based local atomic operation implementation David Laight 2 siblings, 1 reply; 29+ messages in thread From: Madhavan Srinivasan @ 2014-11-27 12:18 UTC (permalink / raw) To: mpe; +Cc: Madhavan Srinivasan, rusty, paulus, anton, linuxppc-dev This patch re-write the current local_* functions to CR5 based one. Base flow for each function is { set cr5(eq) load .. store clear cr5(eq) } Above set of instructions are followed by a fixup section which points to the entry of the function incase of interrupt in the flow. If the interrupt happens to be after the store, we just continue to last instruction in that block. Currently only asm/local.h has been rewrite, and local64 is TODO. Also the entire change is only for PPC64. Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> --- arch/powerpc/include/asm/local.h | 306 +++++++++++++++++++++++++++++++++++++++ 1 file changed, 306 insertions(+) diff --git a/arch/powerpc/include/asm/local.h b/arch/powerpc/include/asm/local.h index b8da913..a26e5d3 100644 --- a/arch/powerpc/include/asm/local.h +++ b/arch/powerpc/include/asm/local.h @@ -11,6 +11,310 @@ typedef struct #define LOCAL_INIT(i) { ATOMIC_LONG_INIT(i) } +#ifdef CONFIG_PPC64 + +static __inline__ long local_read(local_t *l) +{ + long t; + + __asm__ __volatile__( +"1: crset 22\n" +"2:" PPC_LL" %0,0(%1)\n" +"3: crclr 22\n" +"4:\n" +" .section __ex_table,\"a\"\n" + PPC_LONG_ALIGN "\n" + PPC_LONG "2b,1b\n" + PPC_LONG "3b,3b\n" +" .previous\n" + : "=&r" (t) + : "r" (&(l->a.counter))); + + return t; +} + +static __inline__ void local_set(local_t *l, long i) +{ + long t; + + __asm__ __volatile__( +"1: crset 22\n" +"2:" PPC_LL" %0,0(%1)\n" +"3:" PPC405_ERR77(0,%2) +"4:" PPC_STL" %0,0(%2)\n" +"5: crclr 22\n" +"6:\n" +" .section __ex_table,\"a\"\n" + PPC_LONG_ALIGN "\n" + PPC_LONG "2b,1b\n" + PPC_LONG "3b,1b\n" + PPC_LONG "4b,1b\n" + PPC_LONG "5b,5b\n" +" .previous\n" + : "=&r" (t) + : "r" (&(i)), "r" (&(l->a.counter))); +} + +static __inline__ void local_add(long i, local_t *l) +{ + long t; + + __asm__ __volatile__( +"1: crset 22\n" +"2:" PPC_LL" %0,0(%2)\n" +"3: add %0,%1,%0\n" +"4:" PPC405_ERR77(0,%2) +"5:" PPC_STL" %0,0(%2)\n" +"6: crclr 22\n" +"7:\n" +" .section __ex_table,\"a\"\n" + PPC_LONG_ALIGN "\n" + PPC_LONG "2b,1b\n" + PPC_LONG "3b,1b\n" + PPC_LONG "4b,1b\n" + PPC_LONG "5b,1b\n" + PPC_LONG "6b,6b\n" +" .previous\n" + : "=&r" (t) + : "r" (i), "r" (&(l->a.counter))); +} + +static __inline__ void local_sub(long i, local_t *l) +{ + long t; + + __asm__ __volatile__( +"1: crset 22\n" +"2:" PPC_LL" %0,0(%2)\n" +"3: subf %0,%1,%0\n" +"4:" PPC405_ERR77(0,%2) +"5:" PPC_STL" %0,0(%2)\n" +"6: crclr 22\n" +"7:\n" +" .section __ex_table,\"a\"\n" + PPC_LONG_ALIGN "\n" + PPC_LONG "2b,1b\n" + PPC_LONG "3b,1b\n" + PPC_LONG "4b,1b\n" + PPC_LONG "5b,1b\n" + PPC_LONG "6b,6b\n" +" .previous\n" + : "=&r" (t) + : "r" (i), "r" (&(l->a.counter))); +} + +static __inline__ long local_add_return(long a, local_t *l) +{ + long t; + + __asm__ __volatile__( +"1: crset 22\n" +"2:" PPC_LL" %0,0(%2)\n" +"3: add %0,%1,%0\n" +"4:" PPC405_ERR77(0,%2) +"5:" PPC_STL "%0,0(%2)\n" +"6: crclr 22\n" +"7:\n" +" .section __ex_table,\"a\"\n" + PPC_LONG_ALIGN "\n" + PPC_LONG "2b,1b\n" + PPC_LONG "3b,1b\n" + PPC_LONG "4b,1b\n" + PPC_LONG "5b,1b\n" + PPC_LONG "6b,6b\n" +" .previous\n" + : "=&r" (t) + : "r" (a), "r" (&(l->a.counter)) + : "cc", "memory"); + + return t; +} + + +#define local_add_negative(a, l) (local_add_return((a), (l)) < 0) + +static __inline__ long local_sub_return(long a, local_t *l) +{ + long t; + + __asm__ __volatile__( +"1: crset 22\n" +"2:" PPC_LL" %0,0(%2)\n" +"3: subf %0,%1,%0\n" +"4:" PPC405_ERR77(0,%2) +"5:" PPC_STL "%0,0(%2)\n" +"6: crclr 22\n" +"7:\n" +" .section __ex_table,\"a\"\n" + PPC_LONG_ALIGN "\n" + PPC_LONG "2b,1b\n" + PPC_LONG "3b,1b\n" + PPC_LONG "4b,1b\n" + PPC_LONG "5b,1b\n" + PPC_LONG "6b,6b\n" +" .previous\n" + : "=&r" (t) + : "r" (a), "r" (&(l->a.counter)) + : "cc", "memory"); + + return t; +} + +static __inline__ long local_inc_return(local_t *l) +{ + long t; + + __asm__ __volatile__( +"1: crset 22\n" +"2:" PPC_LL" %0,0(%1)\n" +"3: addic %0,%0,1\n" +"4:" PPC405_ERR77(0,%1) +"5:" PPC_STL "%0,0(%1)\n" +"6: crclr 22\n" +"7:\n" +" .section __ex_table,\"a\"\n" + PPC_LONG_ALIGN "\n" + PPC_LONG "2b,1b\n" + PPC_LONG "3b,1b\n" + PPC_LONG "4b,1b\n" + PPC_LONG "5b,1b\n" + PPC_LONG "6b,6b\n" +" .previous" + : "=&r" (t) + : "r" (&(l->a.counter)) + : "cc", "xer", "memory"); + + return t; +} + +/* + * local_inc_and_test - increment and test + * @l: pointer of type local_t + * + * Atomically increments @l by 1 + * and returns true if the result is zero, or false for all + * other cases. + */ +#define local_inc_and_test(l) (local_inc_return(l) == 0) + +static __inline__ long local_dec_return(local_t *l) +{ + long t; + + __asm__ __volatile__( +"1: crset 22\n" +"2:" PPC_LL" %0,0(%1)\n" +"3: addic %0,%0,-1\n" +"4:" PPC405_ERR77(0,%1) +"5:" PPC_STL "%0,0(%1)\n" +"6: crclr 22\n" +"7:\n" +" .section __ex_table,\"a\"\n" + PPC_LONG_ALIGN "\n" + PPC_LONG "2b,1b\n" + PPC_LONG "3b,1b\n" + PPC_LONG "4b,1b\n" + PPC_LONG "5b,1b\n" + PPC_LONG "6b,6b\n" +" .previous\n" + : "=&r" (t) + : "r" (&(l->a.counter)) + : "cc", "xer", "memory"); + + return t; +} + +#define local_inc(l) local_inc_return(l) +#define local_dec(l) local_dec_return(l) + +#define local_cmpxchg(l, o, n) \ + (cmpxchg_local(&((l)->a.counter), (o), (n))) +#define local_xchg(l, n) (xchg_local(&((l)->a.counter), (n))) + +/** + * local_add_unless - add unless the number is a given value + * @l: pointer of type local_t + * @a: the amount to add to v... + * @u: ...unless v is equal to u. + * + * Atomically adds @a to @l, so long as it was not @u. + * Returns non-zero if @l was not @u, and zero otherwise. + */ +static __inline__ int local_add_unless(local_t *l, long a, long u) +{ + long t; + + __asm__ __volatile__ ( +"1: crset 22\n" +"2:" PPC_LL" %0,0(%1)\n" +"3: cmpw 0,%0,%3 \n" +"4: beq- 9f \n" +"5: add %0,%2,%0 \n" +"6:" PPC405_ERR77(0,%1) +"7:" PPC_STL" %0,0(%1) \n" +"8: subf %0,%2,%0 \n" +"9: crclr 22\n" +"10:\n" +" .section __ex_table,\"a\"\n" + PPC_LONG_ALIGN "\n" + PPC_LONG "2b,1b\n" + PPC_LONG "3b,1b\n" + PPC_LONG "4b,1b\n" + PPC_LONG "5b,1b\n" + PPC_LONG "6b,1b\n" + PPC_LONG "7b,1b\n" + PPC_LONG "8b,8b\n" + PPC_LONG "9b,9b\n" +" .previous\n" + : "=&r" (t) + : "r" (&(l->a.counter)), "r" (a), "r" (u) + : "cc", "memory"); + + return t != u; +} + +#define local_inc_not_zero(l) local_add_unless((l), 1, 0) + +#define local_sub_and_test(a, l) (local_sub_return((a), (l)) == 0) +#define local_dec_and_test(l) (local_dec_return((l)) == 0) + +/* + * Atomically test *l and decrement if it is greater than 0. + * The function returns the old value of *l minus 1. + */ +static __inline__ long local_dec_if_positive(local_t *l) +{ + long t; + + __asm__ __volatile__( +"1: crset 22\n" +"2:" PPC_LL" %0,0(%1)\n" +"3: cmpwi %0,1\n" +"4: addi %0,%0,-1\n" +"5: blt- 8f\n" +"6:" PPC405_ERR77(0,%1) +"7:" PPC_STL "%0,0(%1)\n" +"8: crclr 22\n" +"9:\n" +" .section__ex_table,\"a\"\n" + PPC_LONG_ALIGN "\n" + PPC_LONG "2b,1b\n" + PPC_LONG "3b,1b\n" + PPC_LONG "4b,1b\n" + PPC_LONG "5b,1b\n" + PPC_LONG "6b,1b\n" + PPC_LONG "7b,1b\n" + PPC_LONG "8b,8b\n" +" .previous\n" + : "=&b" (t) + : "r" (&(l->a.counter)) + : "cc", "memory"); + + return t; +} + +#else + #define local_read(l) atomic_long_read(&(l)->a) #define local_set(l,i) atomic_long_set(&(l)->a, (i)) @@ -162,6 +466,8 @@ static __inline__ long local_dec_if_positive(local_t *l) return t; } +#endif + /* Use these for per-cpu local_t variables: on some archs they are * much more efficient than these naive implementations. Note they take * a variable, not an address. -- 1.9.1 ^ permalink raw reply related [flat|nested] 29+ messages in thread
* Re: [RFC PATCH 2/2]powerpc: rewrite local_* to use CR5 flag 2014-11-27 12:18 ` [RFC PATCH 2/2]powerpc: rewrite local_* to use CR5 flag Madhavan Srinivasan @ 2014-12-01 18:01 ` Gabriel Paubert 2014-12-03 15:05 ` Madhavan Srinivasan 0 siblings, 1 reply; 29+ messages in thread From: Gabriel Paubert @ 2014-12-01 18:01 UTC (permalink / raw) To: Madhavan Srinivasan; +Cc: rusty, paulus, anton, linuxppc-dev On Thu, Nov 27, 2014 at 05:48:41PM +0530, Madhavan Srinivasan wrote: > This patch re-write the current local_* functions to CR5 based one. > Base flow for each function is > > { > set cr5(eq) > load > .. > store > clear cr5(eq) > } > > Above set of instructions are followed by a fixup section which points > to the entry of the function incase of interrupt in the flow. If the > interrupt happens to be after the store, we just continue to last > instruction in that block. > > Currently only asm/local.h has been rewrite, and local64 is TODO. > Also the entire change is only for PPC64. > > Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> > --- > arch/powerpc/include/asm/local.h | 306 +++++++++++++++++++++++++++++++++++++++ > 1 file changed, 306 insertions(+) > > diff --git a/arch/powerpc/include/asm/local.h b/arch/powerpc/include/asm/local.h > index b8da913..a26e5d3 100644 > --- a/arch/powerpc/include/asm/local.h > +++ b/arch/powerpc/include/asm/local.h > @@ -11,6 +11,310 @@ typedef struct > > #define LOCAL_INIT(i) { ATOMIC_LONG_INIT(i) } > > +#ifdef CONFIG_PPC64 > + > +static __inline__ long local_read(local_t *l) > +{ > + long t; > + > + __asm__ __volatile__( > +"1: crset 22\n" > +"2:" PPC_LL" %0,0(%1)\n" > +"3: crclr 22\n" > +"4:\n" > +" .section __ex_table,\"a\"\n" > + PPC_LONG_ALIGN "\n" > + PPC_LONG "2b,1b\n" > + PPC_LONG "3b,3b\n" > +" .previous\n" > + : "=&r" (t) > + : "r" (&(l->a.counter))); > + > + return t; > +} > + > +static __inline__ void local_set(local_t *l, long i) > +{ > + long t; > + > + __asm__ __volatile__( > +"1: crset 22\n" > +"2:" PPC_LL" %0,0(%1)\n" > +"3:" PPC405_ERR77(0,%2) > +"4:" PPC_STL" %0,0(%2)\n" > +"5: crclr 22\n" > +"6:\n" > +" .section __ex_table,\"a\"\n" > + PPC_LONG_ALIGN "\n" > + PPC_LONG "2b,1b\n" > + PPC_LONG "3b,1b\n" > + PPC_LONG "4b,1b\n" > + PPC_LONG "5b,5b\n" > +" .previous\n" > + : "=&r" (t) > + : "r" (&(i)), "r" (&(l->a.counter))); > +} > + Apart from the other comments on bloat which can very likely be removed by tracing backwards for a few instructions, removing the exception table entries which are 2 or 4 (64 bit?) times as large as the instruction sequence, I don't understand at all why you need these sequences for the local_read and local_set functions. After all these are single instructions (why do you perform a read before the write in set when the result of the read is never used?). I believe read and set are better mapped to access_once (or assign_once or whatever it's called after the recent discussion on linux-kernel). You don't even need a memory barrier if it's for a single thread, so you could get away with a single volatile access to the variables. For the other ones, I think that what you do is correct, except that the workaround for PPC405 erratum 77 is not needed since this erratum only affects the stwcx. instruction and the whole point of the patch is to avoid the use of an l?arx/st?cx. pair. Regards, Gabriel > +static __inline__ void local_add(long i, local_t *l) > +{ > + long t; > + > + __asm__ __volatile__( > +"1: crset 22\n" > +"2:" PPC_LL" %0,0(%2)\n" > +"3: add %0,%1,%0\n" > +"4:" PPC405_ERR77(0,%2) > +"5:" PPC_STL" %0,0(%2)\n" > +"6: crclr 22\n" > +"7:\n" > +" .section __ex_table,\"a\"\n" > + PPC_LONG_ALIGN "\n" > + PPC_LONG "2b,1b\n" > + PPC_LONG "3b,1b\n" > + PPC_LONG "4b,1b\n" > + PPC_LONG "5b,1b\n" > + PPC_LONG "6b,6b\n" > +" .previous\n" > + : "=&r" (t) > + : "r" (i), "r" (&(l->a.counter))); > +} > + > +static __inline__ void local_sub(long i, local_t *l) > +{ > + long t; > + > + __asm__ __volatile__( > +"1: crset 22\n" > +"2:" PPC_LL" %0,0(%2)\n" > +"3: subf %0,%1,%0\n" > +"4:" PPC405_ERR77(0,%2) > +"5:" PPC_STL" %0,0(%2)\n" > +"6: crclr 22\n" > +"7:\n" > +" .section __ex_table,\"a\"\n" > + PPC_LONG_ALIGN "\n" > + PPC_LONG "2b,1b\n" > + PPC_LONG "3b,1b\n" > + PPC_LONG "4b,1b\n" > + PPC_LONG "5b,1b\n" > + PPC_LONG "6b,6b\n" > +" .previous\n" > + : "=&r" (t) > + : "r" (i), "r" (&(l->a.counter))); > +} > + > +static __inline__ long local_add_return(long a, local_t *l) > +{ > + long t; > + > + __asm__ __volatile__( > +"1: crset 22\n" > +"2:" PPC_LL" %0,0(%2)\n" > +"3: add %0,%1,%0\n" > +"4:" PPC405_ERR77(0,%2) > +"5:" PPC_STL "%0,0(%2)\n" > +"6: crclr 22\n" > +"7:\n" > +" .section __ex_table,\"a\"\n" > + PPC_LONG_ALIGN "\n" > + PPC_LONG "2b,1b\n" > + PPC_LONG "3b,1b\n" > + PPC_LONG "4b,1b\n" > + PPC_LONG "5b,1b\n" > + PPC_LONG "6b,6b\n" > +" .previous\n" > + : "=&r" (t) > + : "r" (a), "r" (&(l->a.counter)) > + : "cc", "memory"); > + > + return t; > +} > + > + > +#define local_add_negative(a, l) (local_add_return((a), (l)) < 0) > + > +static __inline__ long local_sub_return(long a, local_t *l) > +{ > + long t; > + > + __asm__ __volatile__( > +"1: crset 22\n" > +"2:" PPC_LL" %0,0(%2)\n" > +"3: subf %0,%1,%0\n" > +"4:" PPC405_ERR77(0,%2) > +"5:" PPC_STL "%0,0(%2)\n" > +"6: crclr 22\n" > +"7:\n" > +" .section __ex_table,\"a\"\n" > + PPC_LONG_ALIGN "\n" > + PPC_LONG "2b,1b\n" > + PPC_LONG "3b,1b\n" > + PPC_LONG "4b,1b\n" > + PPC_LONG "5b,1b\n" > + PPC_LONG "6b,6b\n" > +" .previous\n" > + : "=&r" (t) > + : "r" (a), "r" (&(l->a.counter)) > + : "cc", "memory"); > + > + return t; > +} > + > +static __inline__ long local_inc_return(local_t *l) > +{ > + long t; > + > + __asm__ __volatile__( > +"1: crset 22\n" > +"2:" PPC_LL" %0,0(%1)\n" > +"3: addic %0,%0,1\n" > +"4:" PPC405_ERR77(0,%1) > +"5:" PPC_STL "%0,0(%1)\n" > +"6: crclr 22\n" > +"7:\n" > +" .section __ex_table,\"a\"\n" > + PPC_LONG_ALIGN "\n" > + PPC_LONG "2b,1b\n" > + PPC_LONG "3b,1b\n" > + PPC_LONG "4b,1b\n" > + PPC_LONG "5b,1b\n" > + PPC_LONG "6b,6b\n" > +" .previous" > + : "=&r" (t) > + : "r" (&(l->a.counter)) > + : "cc", "xer", "memory"); > + > + return t; > +} > + > +/* > + * local_inc_and_test - increment and test > + * @l: pointer of type local_t > + * > + * Atomically increments @l by 1 > + * and returns true if the result is zero, or false for all > + * other cases. > + */ > +#define local_inc_and_test(l) (local_inc_return(l) == 0) > + > +static __inline__ long local_dec_return(local_t *l) > +{ > + long t; > + > + __asm__ __volatile__( > +"1: crset 22\n" > +"2:" PPC_LL" %0,0(%1)\n" > +"3: addic %0,%0,-1\n" > +"4:" PPC405_ERR77(0,%1) > +"5:" PPC_STL "%0,0(%1)\n" > +"6: crclr 22\n" > +"7:\n" > +" .section __ex_table,\"a\"\n" > + PPC_LONG_ALIGN "\n" > + PPC_LONG "2b,1b\n" > + PPC_LONG "3b,1b\n" > + PPC_LONG "4b,1b\n" > + PPC_LONG "5b,1b\n" > + PPC_LONG "6b,6b\n" > +" .previous\n" > + : "=&r" (t) > + : "r" (&(l->a.counter)) > + : "cc", "xer", "memory"); > + > + return t; > +} > + > +#define local_inc(l) local_inc_return(l) > +#define local_dec(l) local_dec_return(l) > + > +#define local_cmpxchg(l, o, n) \ > + (cmpxchg_local(&((l)->a.counter), (o), (n))) > +#define local_xchg(l, n) (xchg_local(&((l)->a.counter), (n))) > + > +/** > + * local_add_unless - add unless the number is a given value > + * @l: pointer of type local_t > + * @a: the amount to add to v... > + * @u: ...unless v is equal to u. > + * > + * Atomically adds @a to @l, so long as it was not @u. > + * Returns non-zero if @l was not @u, and zero otherwise. > + */ > +static __inline__ int local_add_unless(local_t *l, long a, long u) > +{ > + long t; > + > + __asm__ __volatile__ ( > +"1: crset 22\n" > +"2:" PPC_LL" %0,0(%1)\n" > +"3: cmpw 0,%0,%3 \n" > +"4: beq- 9f \n" > +"5: add %0,%2,%0 \n" > +"6:" PPC405_ERR77(0,%1) > +"7:" PPC_STL" %0,0(%1) \n" > +"8: subf %0,%2,%0 \n" > +"9: crclr 22\n" > +"10:\n" > +" .section __ex_table,\"a\"\n" > + PPC_LONG_ALIGN "\n" > + PPC_LONG "2b,1b\n" > + PPC_LONG "3b,1b\n" > + PPC_LONG "4b,1b\n" > + PPC_LONG "5b,1b\n" > + PPC_LONG "6b,1b\n" > + PPC_LONG "7b,1b\n" > + PPC_LONG "8b,8b\n" > + PPC_LONG "9b,9b\n" > +" .previous\n" > + : "=&r" (t) > + : "r" (&(l->a.counter)), "r" (a), "r" (u) > + : "cc", "memory"); > + > + return t != u; > +} > + > +#define local_inc_not_zero(l) local_add_unless((l), 1, 0) > + > +#define local_sub_and_test(a, l) (local_sub_return((a), (l)) == 0) > +#define local_dec_and_test(l) (local_dec_return((l)) == 0) > + > +/* > + * Atomically test *l and decrement if it is greater than 0. > + * The function returns the old value of *l minus 1. > + */ > +static __inline__ long local_dec_if_positive(local_t *l) > +{ > + long t; > + > + __asm__ __volatile__( > +"1: crset 22\n" > +"2:" PPC_LL" %0,0(%1)\n" > +"3: cmpwi %0,1\n" > +"4: addi %0,%0,-1\n" > +"5: blt- 8f\n" > +"6:" PPC405_ERR77(0,%1) > +"7:" PPC_STL "%0,0(%1)\n" > +"8: crclr 22\n" > +"9:\n" > +" .section__ex_table,\"a\"\n" > + PPC_LONG_ALIGN "\n" > + PPC_LONG "2b,1b\n" > + PPC_LONG "3b,1b\n" > + PPC_LONG "4b,1b\n" > + PPC_LONG "5b,1b\n" > + PPC_LONG "6b,1b\n" > + PPC_LONG "7b,1b\n" > + PPC_LONG "8b,8b\n" > +" .previous\n" > + : "=&b" (t) > + : "r" (&(l->a.counter)) > + : "cc", "memory"); > + > + return t; > +} > + > +#else > + > #define local_read(l) atomic_long_read(&(l)->a) > #define local_set(l,i) atomic_long_set(&(l)->a, (i)) > > @@ -162,6 +466,8 @@ static __inline__ long local_dec_if_positive(local_t *l) > return t; > } > > +#endif > + > /* Use these for per-cpu local_t variables: on some archs they are > * much more efficient than these naive implementations. Note they take > * a variable, not an address. > -- > 1.9.1 > > _______________________________________________ > Linuxppc-dev mailing list > Linuxppc-dev@lists.ozlabs.org > https://lists.ozlabs.org/listinfo/linuxppc-dev ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [RFC PATCH 2/2]powerpc: rewrite local_* to use CR5 flag 2014-12-01 18:01 ` Gabriel Paubert @ 2014-12-03 15:05 ` Madhavan Srinivasan 0 siblings, 0 replies; 29+ messages in thread From: Madhavan Srinivasan @ 2014-12-03 15:05 UTC (permalink / raw) To: Gabriel Paubert; +Cc: rusty, paulus, anton, linuxppc-dev On Monday 01 December 2014 11:31 PM, Gabriel Paubert wrote: > On Thu, Nov 27, 2014 at 05:48:41PM +0530, Madhavan Srinivasan wrote: >> This patch re-write the current local_* functions to CR5 based one. >> Base flow for each function is >> >> { >> set cr5(eq) >> load >> .. >> store >> clear cr5(eq) >> } >> >> Above set of instructions are followed by a fixup section which points >> to the entry of the function incase of interrupt in the flow. If the >> interrupt happens to be after the store, we just continue to last >> instruction in that block. >> >> Currently only asm/local.h has been rewrite, and local64 is TODO. >> Also the entire change is only for PPC64. >> >> Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> >> --- >> arch/powerpc/include/asm/local.h | 306 +++++++++++++++++++++++++++++++++++++++ >> 1 file changed, 306 insertions(+) >> >> diff --git a/arch/powerpc/include/asm/local.h b/arch/powerpc/include/asm/local.h >> index b8da913..a26e5d3 100644 >> --- a/arch/powerpc/include/asm/local.h >> +++ b/arch/powerpc/include/asm/local.h >> @@ -11,6 +11,310 @@ typedef struct >> >> #define LOCAL_INIT(i) { ATOMIC_LONG_INIT(i) } >> >> +#ifdef CONFIG_PPC64 >> + >> +static __inline__ long local_read(local_t *l) >> +{ >> + long t; >> + >> + __asm__ __volatile__( >> +"1: crset 22\n" >> +"2:" PPC_LL" %0,0(%1)\n" >> +"3: crclr 22\n" >> +"4:\n" >> +" .section __ex_table,\"a\"\n" >> + PPC_LONG_ALIGN "\n" >> + PPC_LONG "2b,1b\n" >> + PPC_LONG "3b,3b\n" >> +" .previous\n" >> + : "=&r" (t) >> + : "r" (&(l->a.counter))); >> + >> + return t; >> +} >> + >> +static __inline__ void local_set(local_t *l, long i) >> +{ >> + long t; >> + >> + __asm__ __volatile__( >> +"1: crset 22\n" >> +"2:" PPC_LL" %0,0(%1)\n" >> +"3:" PPC405_ERR77(0,%2) >> +"4:" PPC_STL" %0,0(%2)\n" >> +"5: crclr 22\n" >> +"6:\n" >> +" .section __ex_table,\"a\"\n" >> + PPC_LONG_ALIGN "\n" >> + PPC_LONG "2b,1b\n" >> + PPC_LONG "3b,1b\n" >> + PPC_LONG "4b,1b\n" >> + PPC_LONG "5b,5b\n" >> +" .previous\n" >> + : "=&r" (t) >> + : "r" (&(i)), "r" (&(l->a.counter))); >> +} >> + > > Apart from the other comments on bloat which can very likely > be removed by tracing backwards for a few instructions, removing > the exception table entries which are 2 or 4 (64 bit?) times as > large as the instruction sequence, I don't understand at all why > you need these sequences for the local_read and local_set functions. > > After all these are single instructions (why do you perform a read before > the write in set when the result of the read is never used?). > Agreed. But the benchmark I used, also did check for local_read timing. So for consistency I re-wrote the these functions. But will drop the changes for these function in the next version. > I believe read and set are better mapped to access_once (or assign_once > or whatever it's called after the recent discussion on linux-kernel). > You don't even need a memory barrier if it's for a single thread, > so you could get away with a single volatile access to the variables. > OK. > For the other ones, I think that what you do is correct, except that > the workaround for PPC405 erratum 77 is not needed since this erratum > only affects the stwcx. instruction and the whole point of the patch > is to avoid the use of an l?arx/st?cx. pair. Yes. Will remove it. With regards Maddy > > Regards, > Gabriel > >> +static __inline__ void local_add(long i, local_t *l) >> +{ >> + long t; >> + >> + __asm__ __volatile__( >> +"1: crset 22\n" >> +"2:" PPC_LL" %0,0(%2)\n" >> +"3: add %0,%1,%0\n" >> +"4:" PPC405_ERR77(0,%2) >> +"5:" PPC_STL" %0,0(%2)\n" >> +"6: crclr 22\n" >> +"7:\n" >> +" .section __ex_table,\"a\"\n" >> + PPC_LONG_ALIGN "\n" >> + PPC_LONG "2b,1b\n" >> + PPC_LONG "3b,1b\n" >> + PPC_LONG "4b,1b\n" >> + PPC_LONG "5b,1b\n" >> + PPC_LONG "6b,6b\n" >> +" .previous\n" >> + : "=&r" (t) >> + : "r" (i), "r" (&(l->a.counter))); >> +} >> + >> +static __inline__ void local_sub(long i, local_t *l) >> +{ >> + long t; >> + >> + __asm__ __volatile__( >> +"1: crset 22\n" >> +"2:" PPC_LL" %0,0(%2)\n" >> +"3: subf %0,%1,%0\n" >> +"4:" PPC405_ERR77(0,%2) >> +"5:" PPC_STL" %0,0(%2)\n" >> +"6: crclr 22\n" >> +"7:\n" >> +" .section __ex_table,\"a\"\n" >> + PPC_LONG_ALIGN "\n" >> + PPC_LONG "2b,1b\n" >> + PPC_LONG "3b,1b\n" >> + PPC_LONG "4b,1b\n" >> + PPC_LONG "5b,1b\n" >> + PPC_LONG "6b,6b\n" >> +" .previous\n" >> + : "=&r" (t) >> + : "r" (i), "r" (&(l->a.counter))); >> +} >> + >> +static __inline__ long local_add_return(long a, local_t *l) >> +{ >> + long t; >> + >> + __asm__ __volatile__( >> +"1: crset 22\n" >> +"2:" PPC_LL" %0,0(%2)\n" >> +"3: add %0,%1,%0\n" >> +"4:" PPC405_ERR77(0,%2) >> +"5:" PPC_STL "%0,0(%2)\n" >> +"6: crclr 22\n" >> +"7:\n" >> +" .section __ex_table,\"a\"\n" >> + PPC_LONG_ALIGN "\n" >> + PPC_LONG "2b,1b\n" >> + PPC_LONG "3b,1b\n" >> + PPC_LONG "4b,1b\n" >> + PPC_LONG "5b,1b\n" >> + PPC_LONG "6b,6b\n" >> +" .previous\n" >> + : "=&r" (t) >> + : "r" (a), "r" (&(l->a.counter)) >> + : "cc", "memory"); >> + >> + return t; >> +} >> + >> + >> +#define local_add_negative(a, l) (local_add_return((a), (l)) < 0) >> + >> +static __inline__ long local_sub_return(long a, local_t *l) >> +{ >> + long t; >> + >> + __asm__ __volatile__( >> +"1: crset 22\n" >> +"2:" PPC_LL" %0,0(%2)\n" >> +"3: subf %0,%1,%0\n" >> +"4:" PPC405_ERR77(0,%2) >> +"5:" PPC_STL "%0,0(%2)\n" >> +"6: crclr 22\n" >> +"7:\n" >> +" .section __ex_table,\"a\"\n" >> + PPC_LONG_ALIGN "\n" >> + PPC_LONG "2b,1b\n" >> + PPC_LONG "3b,1b\n" >> + PPC_LONG "4b,1b\n" >> + PPC_LONG "5b,1b\n" >> + PPC_LONG "6b,6b\n" >> +" .previous\n" >> + : "=&r" (t) >> + : "r" (a), "r" (&(l->a.counter)) >> + : "cc", "memory"); >> + >> + return t; >> +} >> + >> +static __inline__ long local_inc_return(local_t *l) >> +{ >> + long t; >> + >> + __asm__ __volatile__( >> +"1: crset 22\n" >> +"2:" PPC_LL" %0,0(%1)\n" >> +"3: addic %0,%0,1\n" >> +"4:" PPC405_ERR77(0,%1) >> +"5:" PPC_STL "%0,0(%1)\n" >> +"6: crclr 22\n" >> +"7:\n" >> +" .section __ex_table,\"a\"\n" >> + PPC_LONG_ALIGN "\n" >> + PPC_LONG "2b,1b\n" >> + PPC_LONG "3b,1b\n" >> + PPC_LONG "4b,1b\n" >> + PPC_LONG "5b,1b\n" >> + PPC_LONG "6b,6b\n" >> +" .previous" >> + : "=&r" (t) >> + : "r" (&(l->a.counter)) >> + : "cc", "xer", "memory"); >> + >> + return t; >> +} >> + >> +/* >> + * local_inc_and_test - increment and test >> + * @l: pointer of type local_t >> + * >> + * Atomically increments @l by 1 >> + * and returns true if the result is zero, or false for all >> + * other cases. >> + */ >> +#define local_inc_and_test(l) (local_inc_return(l) == 0) >> + >> +static __inline__ long local_dec_return(local_t *l) >> +{ >> + long t; >> + >> + __asm__ __volatile__( >> +"1: crset 22\n" >> +"2:" PPC_LL" %0,0(%1)\n" >> +"3: addic %0,%0,-1\n" >> +"4:" PPC405_ERR77(0,%1) >> +"5:" PPC_STL "%0,0(%1)\n" >> +"6: crclr 22\n" >> +"7:\n" >> +" .section __ex_table,\"a\"\n" >> + PPC_LONG_ALIGN "\n" >> + PPC_LONG "2b,1b\n" >> + PPC_LONG "3b,1b\n" >> + PPC_LONG "4b,1b\n" >> + PPC_LONG "5b,1b\n" >> + PPC_LONG "6b,6b\n" >> +" .previous\n" >> + : "=&r" (t) >> + : "r" (&(l->a.counter)) >> + : "cc", "xer", "memory"); >> + >> + return t; >> +} >> + >> +#define local_inc(l) local_inc_return(l) >> +#define local_dec(l) local_dec_return(l) >> + >> +#define local_cmpxchg(l, o, n) \ >> + (cmpxchg_local(&((l)->a.counter), (o), (n))) >> +#define local_xchg(l, n) (xchg_local(&((l)->a.counter), (n))) >> + >> +/** >> + * local_add_unless - add unless the number is a given value >> + * @l: pointer of type local_t >> + * @a: the amount to add to v... >> + * @u: ...unless v is equal to u. >> + * >> + * Atomically adds @a to @l, so long as it was not @u. >> + * Returns non-zero if @l was not @u, and zero otherwise. >> + */ >> +static __inline__ int local_add_unless(local_t *l, long a, long u) >> +{ >> + long t; >> + >> + __asm__ __volatile__ ( >> +"1: crset 22\n" >> +"2:" PPC_LL" %0,0(%1)\n" >> +"3: cmpw 0,%0,%3 \n" >> +"4: beq- 9f \n" >> +"5: add %0,%2,%0 \n" >> +"6:" PPC405_ERR77(0,%1) >> +"7:" PPC_STL" %0,0(%1) \n" >> +"8: subf %0,%2,%0 \n" >> +"9: crclr 22\n" >> +"10:\n" >> +" .section __ex_table,\"a\"\n" >> + PPC_LONG_ALIGN "\n" >> + PPC_LONG "2b,1b\n" >> + PPC_LONG "3b,1b\n" >> + PPC_LONG "4b,1b\n" >> + PPC_LONG "5b,1b\n" >> + PPC_LONG "6b,1b\n" >> + PPC_LONG "7b,1b\n" >> + PPC_LONG "8b,8b\n" >> + PPC_LONG "9b,9b\n" >> +" .previous\n" >> + : "=&r" (t) >> + : "r" (&(l->a.counter)), "r" (a), "r" (u) >> + : "cc", "memory"); >> + >> + return t != u; >> +} >> + >> +#define local_inc_not_zero(l) local_add_unless((l), 1, 0) >> + >> +#define local_sub_and_test(a, l) (local_sub_return((a), (l)) == 0) >> +#define local_dec_and_test(l) (local_dec_return((l)) == 0) >> + >> +/* >> + * Atomically test *l and decrement if it is greater than 0. >> + * The function returns the old value of *l minus 1. >> + */ >> +static __inline__ long local_dec_if_positive(local_t *l) >> +{ >> + long t; >> + >> + __asm__ __volatile__( >> +"1: crset 22\n" >> +"2:" PPC_LL" %0,0(%1)\n" >> +"3: cmpwi %0,1\n" >> +"4: addi %0,%0,-1\n" >> +"5: blt- 8f\n" >> +"6:" PPC405_ERR77(0,%1) >> +"7:" PPC_STL "%0,0(%1)\n" >> +"8: crclr 22\n" >> +"9:\n" >> +" .section__ex_table,\"a\"\n" >> + PPC_LONG_ALIGN "\n" >> + PPC_LONG "2b,1b\n" >> + PPC_LONG "3b,1b\n" >> + PPC_LONG "4b,1b\n" >> + PPC_LONG "5b,1b\n" >> + PPC_LONG "6b,1b\n" >> + PPC_LONG "7b,1b\n" >> + PPC_LONG "8b,8b\n" >> +" .previous\n" >> + : "=&b" (t) >> + : "r" (&(l->a.counter)) >> + : "cc", "memory"); >> + >> + return t; >> +} >> + >> +#else >> + >> #define local_read(l) atomic_long_read(&(l)->a) >> #define local_set(l,i) atomic_long_set(&(l)->a, (i)) >> >> @@ -162,6 +466,8 @@ static __inline__ long local_dec_if_positive(local_t *l) >> return t; >> } >> >> +#endif >> + >> /* Use these for per-cpu local_t variables: on some archs they are >> * much more efficient than these naive implementations. Note they take >> * a variable, not an address. >> -- >> 1.9.1 >> >> _______________________________________________ >> Linuxppc-dev mailing list >> Linuxppc-dev@lists.ozlabs.org >> https://lists.ozlabs.org/listinfo/linuxppc-dev > ^ permalink raw reply [flat|nested] 29+ messages in thread
* RE: [RFC PATCH 0/2] powerpc: CR based local atomic operation implementation 2014-11-27 12:18 [RFC PATCH 0/2] powerpc: CR based local atomic operation implementation Madhavan Srinivasan 2014-11-27 12:18 ` [RFC PATCH 1/2]powerpc: foundation code to handle CR5 for local_t Madhavan Srinivasan 2014-11-27 12:18 ` [RFC PATCH 2/2]powerpc: rewrite local_* to use CR5 flag Madhavan Srinivasan @ 2014-11-27 14:05 ` David Laight 2014-11-28 8:27 ` Madhavan Srinivasan 2 siblings, 1 reply; 29+ messages in thread From: David Laight @ 2014-11-27 14:05 UTC (permalink / raw) To: 'Madhavan Srinivasan', mpe@ellerman.id.au Cc: linuxppc-dev@lists.ozlabs.org, rusty@rustcorp.com.au, paulus@samba.org, anton@samba.org RnJvbTogTWFkaGF2YW4gU3Jpbml2YXNhbg0KPiBUaGlzIHBhdGNoc2V0IGNyZWF0ZSB0aGUgaW5m cmFzdHJ1Y3R1cmUgdG8gaGFuZGxlIHRoZSBDUiBiYXNlZA0KPiBsb2NhbF8qIGF0b21pYyBvcGVy YXRpb25zLiBMb2NhbCBhdG9taWMgb3BlcmF0aW9ucyBhcmUgZmFzdA0KPiBhbmQgaGlnaGx5IHJl ZW50cmFudCBwZXIgQ1BVIGNvdW50ZXJzLiAgVXNlZCBmb3IgcGVyY3B1DQo+IHZhcmlhYmxlIHVw ZGF0ZXMuIExvY2FsIGF0b21pYyBvcGVyYXRpb25zIG9ubHkgZ3VhcmFudGVlDQo+IHZhcmlhYmxl IG1vZGlmaWNhdGlvbiBhdG9taWNpdHkgd3J0IHRoZSBDUFUgd2hpY2ggb3ducyB0aGUNCj4gZGF0 YSBhbmQgdGhlc2UgbmVlZHMgdG8gYmUgZXhlY3V0ZWQgaW4gYSBwcmVlbXB0aW9uIHNhZmUgd2F5 Lg0KDQpUaGVzZSBhcmUgdXN1YWxseSBjYWxsZWQgJ3Jlc3RhcnRhYmxlIGF0b21pYyBzZXF1ZW5j ZXMgKFJBUyknLg0KDQo+IEhlcmUgaXMgdGhlIGRlc2lnbiBvZiB0aGUgZmlyc3QgcGF0Y2guIFNp bmNlIGxvY2FsXyogb3BlcmF0aW9ucw0KPiBhcmUgb25seSBuZWVkIHRvIGJlIGF0b21pYyB0byBp bnRlcnJ1cHRzIChJSVVDKSwgcGF0Y2ggdXNlcw0KPiBvbmUgb2YgdGhlIENvbmRpdGlvbiBSZWdp c3RlciAoQ1IpIGZpZWxkcyBhcyBhIGZsYWcgdmFyaWFibGUuIFdoZW4NCj4gZW50ZXJpbmcgdGhl IGxvY2FsXyosIHNwZWNpZmljIGJpdCBpbiB0aGUgQ1I1IGZpZWxkIGlzIHNldA0KPiBhbmQgb24g ZXhpdCwgYml0IGlzIGNsZWFyZWQuIENSIGJpdCBjaGVja2luZyBpcyBkb25lIGluIHRoZQ0KPiBp bnRlcnJ1cHQgcmV0dXJuIHBhdGguIElmIENSNVtFUV0gYml0IHNldCBhbmQgaWYgd2UgcmV0dXJu DQo+IHRvIGtlcm5lbCwgd2UgcmVzZXQgdG8gc3RhcnQgb2YgbG9jYWxfKiBvcGVyYXRpb24uDQoN CkkgZG9uJ3QgY2xhaW0gdG8gYmUgYWJsZSB0byByZWFkIHBwYyBhc3NlbWJsZXIuDQpCdXQgSSBj YW4ndCBzZWUgdGhlIGNvZGUgdGhhdCBjbGVhcnMgQ1I1W0VRXSBmb3IgdGhlIGR1cmF0aW9uDQpv ZiB0aGUgSVNSLg0KV2l0aG91dCBpdCBhIG5lc3RlZCBpbnRlcnJ1cHQgd2lsbCBnbyB0aHJvdWdo IHVud2FudGVkIHBhdGhzLg0KDQpUaGVyZSBhcmUgYWxzbyBhIGxvdCBvZiAnbWFnaWMnIGNvbnN0 YW50cyBpbiB0aGF0IGFzc2VtYmx5IGNvZGUuDQoNCkkgYWxzbyB3b25kZXIgaWYgaXQgaXMgcG9z c2libGUgdG8gaW5zcGVjdCB0aGUgaW50ZXJydXB0ZWQNCmNvZGUgdG8gZGV0ZXJtaW5lIHRoZSBz dGFydC9lbmQgb2YgdGhlIFJBUyBibG9jay4NCihFYXNpZXN0IGlmIHlvdSBhc3N1bWUgdGhhdCB0 aGVyZSBpcyBhIHNpbmdsZSAnd3JpdGUnIGluc3RydWN0aW9uDQphcyB0aGUgbGFzdCBlbnRyeSBp biB0aGUgYmxvY2suKQ0KDQpBbHNvLCBob3cgZXhwZW5zaXZlIGlzIGl0IHRvIGRpc2FibGUgYWxs IGludGVycnVwdHM/DQoNCglEYXZpZA0KDQo= ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [RFC PATCH 0/2] powerpc: CR based local atomic operation implementation 2014-11-27 14:05 ` [RFC PATCH 0/2] powerpc: CR based local atomic operation implementation David Laight @ 2014-11-28 8:27 ` Madhavan Srinivasan 2014-11-28 10:09 ` David Laight 0 siblings, 1 reply; 29+ messages in thread From: Madhavan Srinivasan @ 2014-11-28 8:27 UTC (permalink / raw) To: David Laight, mpe@ellerman.id.au Cc: linuxppc-dev@lists.ozlabs.org, rusty@rustcorp.com.au, paulus@samba.org, anton@samba.org On Thursday 27 November 2014 07:35 PM, David Laight wrote: > From: Madhavan Srinivasan >> This patchset create the infrastructure to handle the CR based >> local_* atomic operations. Local atomic operations are fast >> and highly reentrant per CPU counters. Used for percpu >> variable updates. Local atomic operations only guarantee >> variable modification atomicity wrt the CPU which owns the >> data and these needs to be executed in a preemption safe way. > > These are usually called 'restartable atomic sequences (RAS)'. > >> Here is the design of the first patch. Since local_* operations >> are only need to be atomic to interrupts (IIUC), patch uses >> one of the Condition Register (CR) fields as a flag variable. When >> entering the local_*, specific bit in the CR5 field is set >> and on exit, bit is cleared. CR bit checking is done in the >> interrupt return path. If CR5[EQ] bit set and if we return >> to kernel, we reset to start of local_* operation. > > I don't claim to be able to read ppc assembler. > But I can't see the code that clears CR5[EQ] for the duration > of the ISR. I use crclr instruction at the end of the code block to clear the bit. > Without it a nested interrupt will go through unwanted paths. > > There are also a lot of 'magic' constants in that assembly code. > All these constants are define in asm/ppc-opcode.h > I also wonder if it is possible to inspect the interrupted > code to determine the start/end of the RAS block. > (Easiest if you assume that there is a single 'write' instruction > as the last entry in the block.) > So each local_* function also have code in the __ex_table section. IIUC, __ex_table contains two address. So if the return address found in the first column of the _ex_table, use the corresponding address in the second column to continue from. > Also, how expensive is it to disable all interrupts? > In the patch 1/2, posted the numbers for that too. > David > Regards Maddy ^ permalink raw reply [flat|nested] 29+ messages in thread
* RE: [RFC PATCH 0/2] powerpc: CR based local atomic operation implementation 2014-11-28 8:27 ` Madhavan Srinivasan @ 2014-11-28 10:09 ` David Laight 2014-12-01 15:35 ` Madhavan Srinivasan 0 siblings, 1 reply; 29+ messages in thread From: David Laight @ 2014-11-28 10:09 UTC (permalink / raw) To: 'Madhavan Srinivasan', mpe@ellerman.id.au Cc: linuxppc-dev@lists.ozlabs.org, rusty@rustcorp.com.au, paulus@samba.org, anton@samba.org RnJvbTogTWFkaGF2YW4gU3Jpbml2YXNhbg0KPiBPbiBUaHVyc2RheSAyNyBOb3ZlbWJlciAyMDE0 IDA3OjM1IFBNLCBEYXZpZCBMYWlnaHQgd3JvdGU6DQo+ID4gRnJvbTogTWFkaGF2YW4gU3Jpbml2 YXNhbg0KPiA+PiBUaGlzIHBhdGNoc2V0IGNyZWF0ZSB0aGUgaW5mcmFzdHJ1Y3R1cmUgdG8gaGFu ZGxlIHRoZSBDUiBiYXNlZA0KPiA+PiBsb2NhbF8qIGF0b21pYyBvcGVyYXRpb25zLiBMb2NhbCBh dG9taWMgb3BlcmF0aW9ucyBhcmUgZmFzdA0KPiA+PiBhbmQgaGlnaGx5IHJlZW50cmFudCBwZXIg Q1BVIGNvdW50ZXJzLiAgVXNlZCBmb3IgcGVyY3B1DQo+ID4+IHZhcmlhYmxlIHVwZGF0ZXMuIExv Y2FsIGF0b21pYyBvcGVyYXRpb25zIG9ubHkgZ3VhcmFudGVlDQo+ID4+IHZhcmlhYmxlIG1vZGlm aWNhdGlvbiBhdG9taWNpdHkgd3J0IHRoZSBDUFUgd2hpY2ggb3ducyB0aGUNCj4gPj4gZGF0YSBh bmQgdGhlc2UgbmVlZHMgdG8gYmUgZXhlY3V0ZWQgaW4gYSBwcmVlbXB0aW9uIHNhZmUgd2F5Lg0K PiA+DQo+ID4gVGhlc2UgYXJlIHVzdWFsbHkgY2FsbGVkICdyZXN0YXJ0YWJsZSBhdG9taWMgc2Vx dWVuY2VzIChSQVMpJy4NCj4gPg0KPiA+PiBIZXJlIGlzIHRoZSBkZXNpZ24gb2YgdGhlIGZpcnN0 IHBhdGNoLiBTaW5jZSBsb2NhbF8qIG9wZXJhdGlvbnMNCj4gPj4gYXJlIG9ubHkgbmVlZCB0byBi ZSBhdG9taWMgdG8gaW50ZXJydXB0cyAoSUlVQyksIHBhdGNoIHVzZXMNCj4gPj4gb25lIG9mIHRo ZSBDb25kaXRpb24gUmVnaXN0ZXIgKENSKSBmaWVsZHMgYXMgYSBmbGFnIHZhcmlhYmxlLiBXaGVu DQo+ID4+IGVudGVyaW5nIHRoZSBsb2NhbF8qLCBzcGVjaWZpYyBiaXQgaW4gdGhlIENSNSBmaWVs ZCBpcyBzZXQNCj4gPj4gYW5kIG9uIGV4aXQsIGJpdCBpcyBjbGVhcmVkLiBDUiBiaXQgY2hlY2tp bmcgaXMgZG9uZSBpbiB0aGUNCj4gPj4gaW50ZXJydXB0IHJldHVybiBwYXRoLiBJZiBDUjVbRVFd IGJpdCBzZXQgYW5kIGlmIHdlIHJldHVybg0KPiA+PiB0byBrZXJuZWwsIHdlIHJlc2V0IHRvIHN0 YXJ0IG9mIGxvY2FsXyogb3BlcmF0aW9uLg0KPiA+DQo+ID4gSSBkb24ndCBjbGFpbSB0byBiZSBh YmxlIHRvIHJlYWQgcHBjIGFzc2VtYmxlci4NCj4gPiBCdXQgSSBjYW4ndCBzZWUgdGhlIGNvZGUg dGhhdCBjbGVhcnMgQ1I1W0VRXSBmb3IgdGhlIGR1cmF0aW9uDQo+ID4gb2YgdGhlIElTUi4NCj4g SSB1c2UgY3JjbHIgaW5zdHJ1Y3Rpb24gYXQgdGhlIGVuZCBvZiB0aGUgY29kZSBibG9jayB0byBj bGVhciB0aGUgYml0Lg0KPiANCj4gPiBXaXRob3V0IGl0IGEgbmVzdGVkIGludGVycnVwdCB3aWxs IGdvIHRocm91Z2ggdW53YW50ZWQgcGF0aHMuDQoNClRoYXQgY3JjbHIgbG9va3MgdG8gYmUgaW4g dGhlIElTUiBleGl0IHBhdGgsIHlvdSBuZWVkIG9uZSBpbiB0aGUNCmlzciBlbnRyeSBwYXRoLg0K DQo+ID4NCj4gPiBUaGVyZSBhcmUgYWxzbyBhIGxvdCBvZiAnbWFnaWMnIGNvbnN0YW50cyBpbiB0 aGF0IGFzc2VtYmx5IGNvZGUuDQo+ID4NCj4gQWxsIHRoZXNlIGNvbnN0YW50cyBhcmUgZGVmaW5l IGluIGFzbS9wcGMtb3Bjb2RlLmgNCg0KSSB3YXMgdGhpbmtpbmcgb2YgdGhlIGxpbmVzIGxpa2U6 DQorCW9yaQlyMyxyMywxNjM4NA0KVGhpcyBvbmUgcHJvYmFibHkgZGVzZXJ2ZXMgYSBjb21tZW50 IC0gb3Igc29tZXRoaW5nDQorIjM6IglQUEM0MDVfRVJSNzcoMCwlMikNCg0KPiA+IEkgYWxzbyB3 b25kZXIgaWYgaXQgaXMgcG9zc2libGUgdG8gaW5zcGVjdCB0aGUgaW50ZXJydXB0ZWQNCj4gPiBj b2RlIHRvIGRldGVybWluZSB0aGUgc3RhcnQvZW5kIG9mIHRoZSBSQVMgYmxvY2suDQo+ID4gKEVh c2llc3QgaWYgeW91IGFzc3VtZSB0aGF0IHRoZXJlIGlzIGEgc2luZ2xlICd3cml0ZScgaW5zdHJ1 Y3Rpb24NCj4gPiBhcyB0aGUgbGFzdCBlbnRyeSBpbiB0aGUgYmxvY2suKQ0KPiA+DQo+IFNvIGVh Y2ggbG9jYWxfKiBmdW5jdGlvbiBhbHNvIGhhdmUgY29kZSBpbiB0aGUgX19leF90YWJsZSBzZWN0 aW9uLiBJSVVDLA0KPiBfX2V4X3RhYmxlIGNvbnRhaW5zIHR3byBhZGRyZXNzLiBTbyBpZiB0aGUg cmV0dXJuIGFkZHJlc3MgZm91bmQgaW4gdGhlDQo+IGZpcnN0IGNvbHVtbiBvZiB0aGUgX2V4X3Rh YmxlLCB1c2UgdGhlIGNvcnJlc3BvbmRpbmcgYWRkcmVzcyBpbiB0aGUNCj4gc2Vjb25kIGNvbHVt biB0byBjb250aW51ZSBmcm9tLg0KDQpUaGF0IHJlYWxseSBkb2Vzbid0IHNjYWxlLg0KSSBkb24n dCBrbm93IGhvdyBtYW55IDEwMDAgYWRkcmVzcyBwYWlycyB5b3UgdGFibGUgd2lsbCBoYXZlIChh bmQgdGhlDQpvbmVzIGluIGVhY2ggbG9hZGFibGUgbW9kdWxlKSwgYnV0IHRoZSBzZWFyY2ggaXNu J3QgZ29pbmcgdG8gYmUgY2hlYXAuDQoNCklmIHRoZXNlIHNlcXVlbmNlcyBhcmUgcmVzdGFydGFi bGUgdGhlbiB0aGV5IGNhbiBvbmx5IGhhdmUgb25lIHdyaXRlDQp0byBtZW1vcnkuDQoNCkdpdmVu IHlvdXI6DQo+IFRoaXMgcGF0Y2ggcmUtd3JpdGUgdGhlIGN1cnJlbnQgbG9jYWxfKiBmdW5jdGlv bnMgdG8gQ1I1IGJhc2VkIG9uZS4NCj4gQmFzZSBmbG93IGZvciBlYWNoIGZ1bmN0aW9uIGlzIA0K PiANCj4gew0KPiAJc2V0IGNyNShlcSkNCj4gCWxvYWQNCj4gCS4uDQo+IAlzdG9yZQ0KPiAJY2xl YXIgY3I1KGVxKQ0KPiB9DQoNCk9uIElTUiBlbnRyeToNCklmIGFuIElTUiBkZXRlY3RzIGNyNShl cSkgc2V0IHRoZW4gbG9vayBhdCB0aGUgcmV0dXJuZWQgdG8gaW5zdHJ1Y3Rpb24uDQpJZiBpdCBp cyAnY2xlYXIgY3I1KGVxKScgZG8gbm90aGluZy4NCk90aGVyd2lzZSByZWFkIGJhY2t3YXJkcyB0 aHJvdWdoIHRoZSBjb2RlIChmb3IgYSBtYXggb2YgKHNheSkgMTYgaW5zdHJ1Y3Rpb25zKQ0Kc2Vh cmNoaW5nIGZvciB0aGUgJ3NldCBjcjUoZXEpJyBhbmQgY2hhbmdlIHRoZSByZXR1cm4gYWRkcmVz cyB0byBiZSB0aGF0DQpvZiB0aGUgaW5zdHJ1Y3Rpb24gZm9sbG93aW5nIHRoZSAnc2V0IGNyNShl cSknLg0KSW4gYWxsIGNhc2VzIGNsZWFyIGNyNShlcSkgZm9yIHRoZSBJU1IgaXRzZWxmIChsZWF2 ZSB0aGUgc2F2ZWQgdmFsdWUgdW5jaGFuZ2VkKS4NCg0KVGhlIHlvdSBkb24ndCBuZWVkIGEgdGFi bGUgb2YgZmF1bHQgbG9jYXRpb25zLg0KDQoJRGF2aWQNCg0K ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [RFC PATCH 0/2] powerpc: CR based local atomic operation implementation 2014-11-28 10:09 ` David Laight @ 2014-12-01 15:35 ` Madhavan Srinivasan 2014-12-01 15:53 ` David Laight 0 siblings, 1 reply; 29+ messages in thread From: Madhavan Srinivasan @ 2014-12-01 15:35 UTC (permalink / raw) To: David Laight, mpe@ellerman.id.au Cc: linuxppc-dev@lists.ozlabs.org, rusty@rustcorp.com.au, paulus@samba.org, anton@samba.org On Friday 28 November 2014 03:39 PM, David Laight wrote: > From: Madhavan Srinivasan >> On Thursday 27 November 2014 07:35 PM, David Laight wrote: >>> From: Madhavan Srinivasan >>>> This patchset create the infrastructure to handle the CR based >>>> local_* atomic operations. Local atomic operations are fast >>>> and highly reentrant per CPU counters. Used for percpu >>>> variable updates. Local atomic operations only guarantee >>>> variable modification atomicity wrt the CPU which owns the >>>> data and these needs to be executed in a preemption safe way. >>> >>> These are usually called 'restartable atomic sequences (RAS)'. >>> >>>> Here is the design of the first patch. Since local_* operations >>>> are only need to be atomic to interrupts (IIUC), patch uses >>>> one of the Condition Register (CR) fields as a flag variable. When >>>> entering the local_*, specific bit in the CR5 field is set >>>> and on exit, bit is cleared. CR bit checking is done in the >>>> interrupt return path. If CR5[EQ] bit set and if we return >>>> to kernel, we reset to start of local_* operation. >>> >>> I don't claim to be able to read ppc assembler. >>> But I can't see the code that clears CR5[EQ] for the duration >>> of the ISR. >> I use crclr instruction at the end of the code block to clear the bit. >> >>> Without it a nested interrupt will go through unwanted paths. > > That crclr looks to be in the ISR exit path, you need one in the > isr entry path. In case of entry path, i clear the field using mtcr instruction > >>> >>> There are also a lot of 'magic' constants in that assembly code. >>> >> All these constants are define in asm/ppc-opcode.h > > I was thinking of the lines like: > + ori r3,r3,16384 ok sure. Will comment > This one probably deserves a comment - or something > +"3:" PPC405_ERR77(0,%2) > >>> I also wonder if it is possible to inspect the interrupted >>> code to determine the start/end of the RAS block. >>> (Easiest if you assume that there is a single 'write' instruction >>> as the last entry in the block.) >>> >> So each local_* function also have code in the __ex_table section. IIUC, >> __ex_table contains two address. So if the return address found in the >> first column of the _ex_table, use the corresponding address in the >> second column to continue from. > > That really doesn't scale. > I don't know how many 1000 address pairs you table will have (and the > ones in each loadable module), but the search isn't going to be cheap. > > If these sequences are restartable then they can only have one write > to memory. > May be, but i see these issues incase of insts decode path, 1) Decoding instruction may also cause a fault (in case of module) and handling a fault at this stage toward the exit path of interrupt exit makes me nervous 2) resulting code with lot of condition and branch (for opcode decode) will be lot messy and may be an issue incase of maintenance, but let me think it over again on this. Regards Maddy > Given your: >> This patch re-write the current local_* functions to CR5 based one. >> Base flow for each function is >> >> { >> set cr5(eq) >> load >> .. >> store >> clear cr5(eq) >> } > > On ISR entry: > If an ISR detects cr5(eq) set then look at the returned to instruction. > If it is 'clear cr5(eq)' do nothing. > Otherwise read backwards through the code (for a max of (say) 16 instructions) > searching for the 'set cr5(eq)' and change the return address to be that > of the instruction following the 'set cr5(eq)'. > In all cases clear cr5(eq) for the ISR itself (leave the saved value unchanged). > > The you don't need a table of fault locations. > > David > ^ permalink raw reply [flat|nested] 29+ messages in thread
* RE: [RFC PATCH 0/2] powerpc: CR based local atomic operation implementation 2014-12-01 15:35 ` Madhavan Srinivasan @ 2014-12-01 15:53 ` David Laight 2014-12-18 4:18 ` Rusty Russell 0 siblings, 1 reply; 29+ messages in thread From: David Laight @ 2014-12-01 15:53 UTC (permalink / raw) To: 'Madhavan Srinivasan', mpe@ellerman.id.au Cc: linuxppc-dev@lists.ozlabs.org, rusty@rustcorp.com.au, paulus@samba.org, anton@samba.org RnJvbTogTWFkaGF2YW4gU3Jpbml2YXNhbiBbbWFpbHRvOm1hZGR5QGxpbnV4LnZuZXQuaWJtLmNv bV0NCi4uLg0KPiA+Pj4gSSBhbHNvIHdvbmRlciBpZiBpdCBpcyBwb3NzaWJsZSB0byBpbnNwZWN0 IHRoZSBpbnRlcnJ1cHRlZA0KPiA+Pj4gY29kZSB0byBkZXRlcm1pbmUgdGhlIHN0YXJ0L2VuZCBv ZiB0aGUgUkFTIGJsb2NrLg0KPiA+Pj4gKEVhc2llc3QgaWYgeW91IGFzc3VtZSB0aGF0IHRoZXJl IGlzIGEgc2luZ2xlICd3cml0ZScgaW5zdHJ1Y3Rpb24NCj4gPj4+IGFzIHRoZSBsYXN0IGVudHJ5 IGluIHRoZSBibG9jay4pDQo+ID4+Pg0KPiA+PiBTbyBlYWNoIGxvY2FsXyogZnVuY3Rpb24gYWxz byBoYXZlIGNvZGUgaW4gdGhlIF9fZXhfdGFibGUgc2VjdGlvbi4gSUlVQywNCj4gPj4gX19leF90 YWJsZSBjb250YWlucyB0d28gYWRkcmVzcy4gU28gaWYgdGhlIHJldHVybiBhZGRyZXNzIGZvdW5k IGluIHRoZQ0KPiA+PiBmaXJzdCBjb2x1bW4gb2YgdGhlIF9leF90YWJsZSwgdXNlIHRoZSBjb3Jy ZXNwb25kaW5nIGFkZHJlc3MgaW4gdGhlDQo+ID4+IHNlY29uZCBjb2x1bW4gdG8gY29udGludWUg ZnJvbS4NCj4gPg0KPiA+IFRoYXQgcmVhbGx5IGRvZXNuJ3Qgc2NhbGUuDQo+ID4gSSBkb24ndCBr bm93IGhvdyBtYW55IDEwMDAgYWRkcmVzcyBwYWlycyB5b3UgdGFibGUgd2lsbCBoYXZlIChhbmQg dGhlDQo+ID4gb25lcyBpbiBlYWNoIGxvYWRhYmxlIG1vZHVsZSksIGJ1dCB0aGUgc2VhcmNoIGlz bid0IGdvaW5nIHRvIGJlIGNoZWFwLg0KPiA+DQo+ID4gSWYgdGhlc2Ugc2VxdWVuY2VzIGFyZSBy ZXN0YXJ0YWJsZSB0aGVuIHRoZXkgY2FuIG9ubHkgaGF2ZSBvbmUgd3JpdGUNCj4gPiB0byBtZW1v cnkuDQo+ID4NCj4gDQo+IE1heSBiZSwgYnV0IGkgc2VlIHRoZXNlIGlzc3VlcyBpbmNhc2Ugb2Yg aW5zdHMgZGVjb2RlIHBhdGgsDQo+IA0KPiAxKSBEZWNvZGluZyBpbnN0cnVjdGlvbiBtYXkgYWxz byBjYXVzZSBhIGZhdWx0IChpbiBjYXNlIG9mIG1vZHVsZSkgYW5kDQo+IGhhbmRsaW5nIGEgZmF1 bHQgYXQgdGhpcyBzdGFnZSB0b3dhcmQgdGhlIGV4aXQgcGF0aCBvZiBpbnRlcnJ1cHQgZXhpdA0K PiBtYWtlcyBtZSBuZXJ2b3VzDQoNCkl0IHNob3VsZG4ndCBiZSBwb3NzaWJsZSB0byB1bmxvYWQg YSBtb2R1bGUgdGhhdCBpcyBpbnRlcnJ1cHRlZCBieQ0KYSBoYXJkd2FyZSBpbnRlcnJ1cHQuDQpB biAnaW52YWxpZCcgbG9hZGFibGUgbW9kdWxlIGNhbiBjYXVzZSBhbiBvb3BzL3BhbmljIGFueXdh eS4NCg0KPiAyKSByZXN1bHRpbmcgY29kZSB3aXRoIGxvdCBvZiBjb25kaXRpb24gYW5kIGJyYW5j aCAoZm9yIG9wY29kZSBkZWNvZGUpDQo+IHdpbGwgYmUgbG90IG1lc3N5IGFuZCBtYXkgYmUgYW4g aXNzdWUgaW5jYXNlIG9mIG1haW50ZW5hbmNlLA0KDQpZb3UgZG9uJ3QgbmVlZCB0byBkZWNvZGUg dGhlIGluc3RydWN0aW9ucy4NCkp1c3QgbG9vayBmb3IgdGhlIHR3byBzcGVjaWZpYyBpbnN0cnVj dGlvbnMgdXNlZCBhcyBtYXJrZXJzLg0KVGhpcyBpcyBvbmx5IHJlYWxseSBwb3NzaWJsZSB3aXRo IGZpeGVkLXNpemUgaW5zdHJ1Y3Rpb25zLg0KDQpJdCBtaWdodCBhbHNvIGJlIHRoYXQgdGhlICdp bnRlcnJ1cHQgZW50cnknIHBhdGggaXMgZWFzaWVyIHRvDQptb2RpZnkgdGhhbiB0aGUgJ2ludGVy cnVwdCBleGl0JyBvbmUgKGZld2VyIGNvZGUgcGF0aHMpIGFuZA0KeW91IGp1c3QgbmVlZCB0byBt b2RpZnkgdGhlICdwYycgaW4gdGhlIHN0YWNrIGZyYW1lLg0KWW91IGFyZSBvbmx5IGludGVyZXN0 ZWQgaW4gaW50ZXJydXB0cyBmcm9tIGtlcm5lbCBzcGFjZS4NCg0KCURhdmlkDQoNCg0K ^ permalink raw reply [flat|nested] 29+ messages in thread
* RE: [RFC PATCH 0/2] powerpc: CR based local atomic operation implementation 2014-12-01 15:53 ` David Laight @ 2014-12-18 4:18 ` Rusty Russell 2014-12-18 9:52 ` David Laight 0 siblings, 1 reply; 29+ messages in thread From: Rusty Russell @ 2014-12-18 4:18 UTC (permalink / raw) To: David Laight, 'Madhavan Srinivasan', mpe@ellerman.id.au Cc: linuxppc-dev@lists.ozlabs.org, paulus@samba.org, anton@samba.org David Laight <David.Laight@ACULAB.COM> writes: > From: Madhavan Srinivasan [mailto:maddy@linux.vnet.ibm.com] > ... >> >>> I also wonder if it is possible to inspect the interrupted >> >>> code to determine the start/end of the RAS block. >> >>> (Easiest if you assume that there is a single 'write' instruction >> >>> as the last entry in the block.) >> >>> >> >> So each local_* function also have code in the __ex_table section. IIUC, >> >> __ex_table contains two address. So if the return address found in the >> >> first column of the _ex_table, use the corresponding address in the >> >> second column to continue from. >> > >> > That really doesn't scale. >> > I don't know how many 1000 address pairs you table will have (and the >> > ones in each loadable module), but the search isn't going to be cheap. >> > >> > If these sequences are restartable then they can only have one write >> > to memory. >> > >> >> May be, but i see these issues incase of insts decode path, >> >> 1) Decoding instruction may also cause a fault (in case of module) and >> handling a fault at this stage toward the exit path of interrupt exit >> makes me nervous > > It shouldn't be possible to unload a module that is interrupted by > a hardware interrupt. > An 'invalid' loadable module can cause an oops/panic anyway. Yes, the module won't fault (vmalloc memory can be lazily mapped, but we've already copied the module into there, so it won't happen). >> 2) resulting code with lot of condition and branch (for opcode decode) >> will be lot messy and may be an issue incase of maintenance, > > You don't need to decode the instructions. > Just look for the two specific instructions used as markers. > This is only really possible with fixed-size instructions. > > It might also be that the 'interrupt entry' path is easier to > modify than the 'interrupt exit' one (fewer code paths) and > you just need to modify the 'pc' in the stack frame. > You are only interested in interrupts from kernel space. It's an overoptimization for case that statistically never happens. You won't even be able to measure the difference. The question of bloat remains, but that's also easily measured. In practice, I'd guess less than 1k. Cheers, Rusty. ^ permalink raw reply [flat|nested] 29+ messages in thread
* RE: [RFC PATCH 0/2] powerpc: CR based local atomic operation implementation 2014-12-18 4:18 ` Rusty Russell @ 2014-12-18 9:52 ` David Laight 2014-12-18 10:53 ` Rusty Russell 0 siblings, 1 reply; 29+ messages in thread From: David Laight @ 2014-12-18 9:52 UTC (permalink / raw) To: 'Rusty Russell', 'Madhavan Srinivasan', mpe@ellerman.id.au Cc: paulus@samba.org, linuxppc-dev@lists.ozlabs.org, anton@samba.org RnJvbTogUnVzdHkgUnVzc2VsbA0KPiBEYXZpZCBMYWlnaHQgPERhdmlkLkxhaWdodEBBQ1VMQUIu Q09NPiB3cml0ZXM6DQo+ID4gRnJvbTogTWFkaGF2YW4gU3Jpbml2YXNhbiBbbWFpbHRvOm1hZGR5 QGxpbnV4LnZuZXQuaWJtLmNvbV0NCj4gPiAuLi4NCj4gPj4gPj4+IEkgYWxzbyB3b25kZXIgaWYg aXQgaXMgcG9zc2libGUgdG8gaW5zcGVjdCB0aGUgaW50ZXJydXB0ZWQNCj4gPj4gPj4+IGNvZGUg dG8gZGV0ZXJtaW5lIHRoZSBzdGFydC9lbmQgb2YgdGhlIFJBUyBibG9jay4NCj4gPj4gPj4+IChF YXNpZXN0IGlmIHlvdSBhc3N1bWUgdGhhdCB0aGVyZSBpcyBhIHNpbmdsZSAnd3JpdGUnIGluc3Ry dWN0aW9uDQo+ID4+ID4+PiBhcyB0aGUgbGFzdCBlbnRyeSBpbiB0aGUgYmxvY2suKQ0KPiA+PiA+ Pj4NCj4gPj4gPj4gU28gZWFjaCBsb2NhbF8qIGZ1bmN0aW9uIGFsc28gaGF2ZSBjb2RlIGluIHRo ZSBfX2V4X3RhYmxlIHNlY3Rpb24uIElJVUMsDQo+ID4+ID4+IF9fZXhfdGFibGUgY29udGFpbnMg dHdvIGFkZHJlc3MuIFNvIGlmIHRoZSByZXR1cm4gYWRkcmVzcyBmb3VuZCBpbiB0aGUNCj4gPj4g Pj4gZmlyc3QgY29sdW1uIG9mIHRoZSBfZXhfdGFibGUsIHVzZSB0aGUgY29ycmVzcG9uZGluZyBh ZGRyZXNzIGluIHRoZQ0KPiA+PiA+PiBzZWNvbmQgY29sdW1uIHRvIGNvbnRpbnVlIGZyb20uDQo+ ID4+ID4NCj4gPj4gPiBUaGF0IHJlYWxseSBkb2Vzbid0IHNjYWxlLg0KPiA+PiA+IEkgZG9uJ3Qg a25vdyBob3cgbWFueSAxMDAwIGFkZHJlc3MgcGFpcnMgeW91IHRhYmxlIHdpbGwgaGF2ZSAoYW5k IHRoZQ0KPiA+PiA+IG9uZXMgaW4gZWFjaCBsb2FkYWJsZSBtb2R1bGUpLCBidXQgdGhlIHNlYXJj aCBpc24ndCBnb2luZyB0byBiZSBjaGVhcC4NCj4gPj4gPg0KPiA+PiA+IElmIHRoZXNlIHNlcXVl bmNlcyBhcmUgcmVzdGFydGFibGUgdGhlbiB0aGV5IGNhbiBvbmx5IGhhdmUgb25lIHdyaXRlDQo+ ID4+ID4gdG8gbWVtb3J5Lg0KLi4uDQo+ID4+IDIpIHJlc3VsdGluZyBjb2RlIHdpdGggbG90IG9m IGNvbmRpdGlvbiBhbmQgYnJhbmNoIChmb3Igb3Bjb2RlIGRlY29kZSkNCj4gPj4gd2lsbCBiZSBs b3QgbWVzc3kgYW5kIG1heSBiZSBhbiBpc3N1ZSBpbmNhc2Ugb2YgbWFpbnRlbmFuY2UsDQo+ID4N Cj4gPiBZb3UgZG9uJ3QgbmVlZCB0byBkZWNvZGUgdGhlIGluc3RydWN0aW9ucy4NCj4gPiBKdXN0 IGxvb2sgZm9yIHRoZSB0d28gc3BlY2lmaWMgaW5zdHJ1Y3Rpb25zIHVzZWQgYXMgbWFya2Vycy4N Cj4gPiBUaGlzIGlzIG9ubHkgcmVhbGx5IHBvc3NpYmxlIHdpdGggZml4ZWQtc2l6ZSBpbnN0cnVj dGlvbnMuDQo+ID4NCj4gPiBJdCBtaWdodCBhbHNvIGJlIHRoYXQgdGhlICdpbnRlcnJ1cHQgZW50 cnknIHBhdGggaXMgZWFzaWVyIHRvDQo+ID4gbW9kaWZ5IHRoYW4gdGhlICdpbnRlcnJ1cHQgZXhp dCcgb25lIChmZXdlciBjb2RlIHBhdGhzKSBhbmQNCj4gPiB5b3UganVzdCBuZWVkIHRvIG1vZGlm eSB0aGUgJ3BjJyBpbiB0aGUgc3RhY2sgZnJhbWUuDQo+ID4gWW91IGFyZSBvbmx5IGludGVyZXN0 ZWQgaW4gaW50ZXJydXB0cyBmcm9tIGtlcm5lbCBzcGFjZS4NCj4gDQo+IEl0J3MgYW4gb3Zlcm9w dGltaXphdGlvbiBmb3IgY2FzZSB0aGF0IHN0YXRpc3RpY2FsbHkgbmV2ZXIgaGFwcGVucy4NCj4g WW91IHdvbid0IGV2ZW4gYmUgYWJsZSB0byBtZWFzdXJlIHRoZSBkaWZmZXJlbmNlLg0KPiANCj4g VGhlIHF1ZXN0aW9uIG9mIGJsb2F0IHJlbWFpbnMsIGJ1dCB0aGF0J3MgYWxzbyBlYXNpbHkgbWVh c3VyZWQuICBJbg0KPiBwcmFjdGljZSwgSSdkIGd1ZXNzIGxlc3MgdGhhbiAxay4NCg0KSUlSQyB0 aGV5IHdlcmUgJ3N0YXRpYyBpbmxpbmUnIHNvIHRoZSB0YWJsZSBvZiBhZGRyZXNzZXMgaXMgZ2Vu ZXJhdGVkDQpmb3IgZXZlcnkgdXNlIHNpdGUuDQooY29weWluL291dCBnZW5lcmF0ZXMgYSBzaW1p bGFybHkgZW5vcm1vdXMgdGFibGUgb2YgYWRkcmVzc2VzIG9uIGFtZDY0KQ0KDQoNCklmIHRoZXkg d2VyZSByZWFsIGZ1bmN0aW9ucyAoc28gb25seSBhcHBlYXJlZCBvbmNlKSBpdCB3b3VsZG4ndCBi ZSBhcyBiYWQuDQpJbmRlZWQsIGluIHRoYXQgY2FzZSwgYnkgcHV0dGluZyBhbGwgc3VjaCBmdW5j dGlvbnMgaW50byBhIHNlcGFyYXRlIGNvZGUNCnNlY3Rpb24gYSBzaW1wbGUgJ3dpbmRvdyB0ZXN0 JyBjYW4gYmUgZG9uZSBvbiB0aGUgcmV0dXJuIGFkZHJlc3MgaW5zdGVhZA0Kb2YgcmVzZXJ2aW5n IG9uZSBvZiB0aGUgQ1IgYml0cy4NCg0KWW91IGFsc28gb25seSBuZWVkIHRvIHNhdmUgdGhlIHN0 YXJ0IGFuZCBlbmQgb2YgZWFjaCBibG9jaywgbm90IHRoZQ0KcmVzdGFydCBhZGRyZXNzIGZvciBl dmVyeSBpbnN0cnVjdGlvbiB3aXRoaW4gdGhlIGJsb2NrLg0KDQoJRGF2aWQNCg0K ^ permalink raw reply [flat|nested] 29+ messages in thread
* RE: [RFC PATCH 0/2] powerpc: CR based local atomic operation implementation 2014-12-18 9:52 ` David Laight @ 2014-12-18 10:53 ` Rusty Russell 0 siblings, 0 replies; 29+ messages in thread From: Rusty Russell @ 2014-12-18 10:53 UTC (permalink / raw) To: David Laight, 'Madhavan Srinivasan', mpe@ellerman.id.au Cc: paulus@samba.org, linuxppc-dev@lists.ozlabs.org, anton@samba.org David Laight <David.Laight@ACULAB.COM> writes: > From: Rusty Russell >> David Laight <David.Laight@ACULAB.COM> writes: >> > From: Madhavan Srinivasan [mailto:maddy@linux.vnet.ibm.com] >> > ... >> >> >>> I also wonder if it is possible to inspect the interrupted >> >> >>> code to determine the start/end of the RAS block. >> >> >>> (Easiest if you assume that there is a single 'write' instruction >> >> >>> as the last entry in the block.) >> >> >>> >> >> >> So each local_* function also have code in the __ex_table section. IIUC, >> >> >> __ex_table contains two address. So if the return address found in the >> >> >> first column of the _ex_table, use the corresponding address in the >> >> >> second column to continue from. >> >> > >> >> > That really doesn't scale. >> >> > I don't know how many 1000 address pairs you table will have (and the >> >> > ones in each loadable module), but the search isn't going to be cheap. >> >> > >> >> > If these sequences are restartable then they can only have one write >> >> > to memory. > ... >> >> 2) resulting code with lot of condition and branch (for opcode decode) >> >> will be lot messy and may be an issue incase of maintenance, >> > >> > You don't need to decode the instructions. >> > Just look for the two specific instructions used as markers. >> > This is only really possible with fixed-size instructions. >> > >> > It might also be that the 'interrupt entry' path is easier to >> > modify than the 'interrupt exit' one (fewer code paths) and >> > you just need to modify the 'pc' in the stack frame. >> > You are only interested in interrupts from kernel space. >> >> It's an overoptimization for case that statistically never happens. >> You won't even be able to measure the difference. >> >> The question of bloat remains, but that's also easily measured. In >> practice, I'd guess less than 1k. > > IIRC they were 'static inline' so the table of addresses is generated > for every use site. > (copyin/out generates a similarly enormous table of addresses on amd64) There are about 20 callers in the entire kernel. Cheers, Rusty. ^ permalink raw reply [flat|nested] 29+ messages in thread
end of thread, other threads:[~2014-12-18 23:39 UTC | newest] Thread overview: 29+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2014-11-27 12:18 [RFC PATCH 0/2] powerpc: CR based local atomic operation implementation Madhavan Srinivasan 2014-11-27 12:18 ` [RFC PATCH 1/2]powerpc: foundation code to handle CR5 for local_t Madhavan Srinivasan 2014-11-27 16:56 ` Segher Boessenkool 2014-11-28 1:58 ` Benjamin Herrenschmidt 2014-11-28 3:00 ` Madhavan Srinivasan 2014-11-28 15:57 ` Segher Boessenkool 2014-11-28 2:57 ` Madhavan Srinivasan 2014-11-28 16:00 ` Segher Boessenkool 2014-11-28 0:56 ` Benjamin Herrenschmidt 2014-11-28 3:15 ` Madhavan Srinivasan 2014-11-28 3:21 ` Benjamin Herrenschmidt 2014-11-28 10:53 ` David Laight 2014-11-30 9:01 ` Benjamin Herrenschmidt 2014-12-01 21:35 ` Gabriel Paubert 2014-12-03 14:59 ` Madhavan Srinivasan 2014-12-03 17:07 ` Gabriel Paubert 2014-12-02 2:04 ` Scott Wood 2014-12-03 14:49 ` Madhavan Srinivasan 2014-11-27 12:18 ` [RFC PATCH 2/2]powerpc: rewrite local_* to use CR5 flag Madhavan Srinivasan 2014-12-01 18:01 ` Gabriel Paubert 2014-12-03 15:05 ` Madhavan Srinivasan 2014-11-27 14:05 ` [RFC PATCH 0/2] powerpc: CR based local atomic operation implementation David Laight 2014-11-28 8:27 ` Madhavan Srinivasan 2014-11-28 10:09 ` David Laight 2014-12-01 15:35 ` Madhavan Srinivasan 2014-12-01 15:53 ` David Laight 2014-12-18 4:18 ` Rusty Russell 2014-12-18 9:52 ` David Laight 2014-12-18 10:53 ` Rusty Russell
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.