* Re: [RFC PATCH 1/2]powerpc: foundation code to handle CR5 for local_t
2014-11-27 12:18 ` [RFC PATCH 1/2]powerpc: foundation code to handle CR5 for local_t Madhavan Srinivasan
@ 2014-11-27 16:56 ` Segher Boessenkool
2014-11-28 1:58 ` Benjamin Herrenschmidt
2014-11-28 2:57 ` Madhavan Srinivasan
2014-11-28 0:56 ` Benjamin Herrenschmidt
` (2 subsequent siblings)
3 siblings, 2 replies; 29+ messages in thread
From: Segher Boessenkool @ 2014-11-27 16:56 UTC (permalink / raw)
To: Madhavan Srinivasan; +Cc: rusty, paulus, anton, linuxppc-dev
On Thu, Nov 27, 2014 at 05:48:40PM +0530, Madhavan Srinivasan wrote:
> Here is the design of this patch. Since local_* operations
> only need to be atomic with respect to interrupts (IIUC), the
> patch uses one of the Condition Register (CR) fields as a flag
> variable. When entering a local_* operation, a specific bit in
> the CR5 field is set, and on exit the bit is cleared. The CR
> bit check is done in the interrupt return path. If the CR5[EQ]
> bit is set and we return to kernel, we rewind the NIP to the
> start of the local_* operation.
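As a rough illustration of that design (the actual local_* implementation
is patch 2/2 and is not shown here; the instruction sequence and register
choices below are hypothetical):

	1:	crset	4*cr5+eq	/* flag: inside a local_t update   */
		ld	r10,0(r3)	/* plain load, no larx reservation */
		add	r10,r10,r4
		std	r10,0(r3)	/* plain store                     */
		crclr	4*cr5+eq	/* flag off: update is complete    */

An ex_table entry would map any NIP inside that window back to label 1,
so the interrupt return path can restart a half-done update instead of
resuming in the middle of it.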
Have you tested this with (upcoming) GCC 5.0? GCC now uses CR5,
and it likes to use it very much; it might be more convenient to
use e.g. CR1 (which is allocated almost last, only before CR0).
> --- a/arch/powerpc/include/asm/exception-64s.h
> +++ b/arch/powerpc/include/asm/exception-64s.h
> @@ -306,7 +306,26 @@ do_kvm_##n: \
> std r10,0(r1); /* make stack chain pointer */ \
> std r0,GPR0(r1); /* save r0 in stackframe */ \
> std r10,GPR1(r1); /* save r1 in stackframe */ \
> - beq 4f; /* if from kernel mode */ \
> +BEGIN_FTR_SECTION; \
> + lis r9,4096; /* Create a mask with HV and PR */ \
> + rldicr r9,r9,32,31; /* bits, AND with the MSR */ \
> + mr r10,r9; /* to check for Hyp state */ \
> + ori r9,r9,16384; \
> + and r9,r12,r9; \
> + cmpd cr3,r10,r9; \
> + beq cr3,66f; /* Jump if we come from Hyp mode*/ \
> + mtcrf 0x04,r10; /* Clear CR5 if coming from usr */ \
Wow, such nastiness, only to avoid using dot insns (since you need to keep
the current CR0 value for the following beq 4f). And CR0 already holds the
PR bit, so you need only to check the HV bit anyway? Some restructuring
would make this a lot simpler and clearer.
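One possible shape for that restructuring (untested and purely
illustrative): test MSR[HV] in a scratch CR field, so that CR0, which
already holds the MSR[PR] test, survives for the final beq:

	rldicl	r9,r12,4,63	/* r9 = MSR[HV]                      */
	cmpldi	cr3,r9,1	/* cr3.eq = interrupted in HV state  */
	crand	4*cr3+eq,4*cr3+eq,4*cr0+eq  /* cr3.eq = HV && !PR    */
	beq	cr3,66f		/* from hypervisor mode: keep CR5    */
	crclr	4*cr5+eq	/* otherwise clear the local_t flag  */
	66: beq	4f		/* CR0 intact: branch if kernel mode */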
> + /*
> + * Now that we are about to exit from interrupt, lets check for
> + * cr5 eq bit. If it is set, then we may be in the middle of
> + * local_t update. In this case, we should rewind the NIP
> + * accordingly.
> + */
> + mfcr r3
> + andi. r4,r3,0x200
> + beq 63f
This is just bne cr5,63f isn't it?
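In the mfcr image, 0x200 is CR bit 22, i.e. CR5[EQ], so the
three-instruction test

	mfcr	r3		/* read the whole CR ...         */
	andi.	r4,r3,0x200	/* ... isolate bit 22, CR5[EQ]   */
	beq	63f		/* branch if the flag is clear   */

collapses to a single branch that also leaves r3, r4 and CR0 untouched:

	bne	cr5,63f		/* branch if CR5[EQ] is clear */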
> index 72e783e..edb75a9 100644
> --- a/arch/powerpc/kernel/exceptions-64s.S
> +++ b/arch/powerpc/kernel/exceptions-64s.S
> @@ -637,7 +637,7 @@ masked_##_H##interrupt: \
> rldicl r10,r10,48,1; /* clear MSR_EE */ \
> rotldi r10,r10,16; \
> mtspr SPRN_##_H##SRR1,r10; \
> -2: mtcrf 0x80,r9; \
> +2: mtcrf 0x90,r9; \
> ld r9,PACA_EXGEN+EX_R9(r13); \
> ld r10,PACA_EXGEN+EX_R10(r13); \
> ld r11,PACA_EXGEN+EX_R11(r13); \
What does this do?
Segher
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [RFC PATCH 1/2]powerpc: foundation code to handle CR5 for local_t
2014-11-27 16:56 ` Segher Boessenkool
@ 2014-11-28 1:58 ` Benjamin Herrenschmidt
2014-11-28 3:00 ` Madhavan Srinivasan
2014-11-28 15:57 ` Segher Boessenkool
2014-11-28 2:57 ` Madhavan Srinivasan
1 sibling, 2 replies; 29+ messages in thread
From: Benjamin Herrenschmidt @ 2014-11-28 1:58 UTC (permalink / raw)
To: Segher Boessenkool
Cc: linuxppc-dev, rusty, Madhavan Srinivasan, paulus, anton
On Thu, 2014-11-27 at 10:56 -0600, Segher Boessenkool wrote:
> On Thu, Nov 27, 2014 at 05:48:40PM +0530, Madhavan Srinivasan wrote:
> > Here is the design of this patch. Since local_* operations
> > only need to be atomic with respect to interrupts (IIUC), the
> > patch uses one of the Condition Register (CR) fields as a flag
> > variable. When entering a local_* operation, a specific bit in
> > the CR5 field is set, and on exit the bit is cleared. The CR
> > bit check is done in the interrupt return path. If the CR5[EQ]
> > bit is set and we return to kernel, we rewind the NIP to the
> > start of the local_* operation.
>
> Have you tested this with (upcoming) GCC 5.0? GCC now uses CR5,
> and it likes to use it very much; it might be more convenient to
> use e.g. CR1 (which is allocated almost last, only before CR0).
We use CR1 all over the place in our asm code. Any other suggestion?
What's the damage of -ffixed-cr5 on gcc5? Won't it just use CR4 or 6
instead?
> > --- a/arch/powerpc/include/asm/exception-64s.h
> > +++ b/arch/powerpc/include/asm/exception-64s.h
> > @@ -306,7 +306,26 @@ do_kvm_##n: \
> > std r10,0(r1); /* make stack chain pointer */ \
> > std r0,GPR0(r1); /* save r0 in stackframe */ \
> > std r10,GPR1(r1); /* save r1 in stackframe */ \
> > - beq 4f; /* if from kernel mode */ \
> > +BEGIN_FTR_SECTION; \
> > + lis r9,4096; /* Create a mask with HV and PR */ \
> > + rldicr r9,r9,32,31; /* bits, AND with the MSR */ \
> > + mr r10,r9; /* to check for Hyp state */ \
> > + ori r9,r9,16384; \
> > + and r9,r12,r9; \
> > + cmpd cr3,r10,r9; \
> > + beq cr3,66f; /* Jump if we come from Hyp mode*/ \
> > + mtcrf 0x04,r10; /* Clear CR5 if coming from usr */ \
>
> Wow, such nastiness, only to avoid using dot insns (since you need to keep
> the current CR0 value for the following beq 4f). And CR0 already holds the
> PR bit, so you need only to check the HV bit anyway? Some restructuring
> would make this a lot simpler and clearer.
>
> > + /*
> > + * Now that we are about to exit from interrupt, lets check for
> > + * cr5 eq bit. If it is set, then we may be in the middle of
> > + * local_t update. In this case, we should rewind the NIP
> > + * accordingly.
> > + */
> > + mfcr r3
> > + andi. r4,r3,0x200
> > + beq 63f
>
> This is just bne cr5,63f isn't it?
>
> > index 72e783e..edb75a9 100644
> > --- a/arch/powerpc/kernel/exceptions-64s.S
> > +++ b/arch/powerpc/kernel/exceptions-64s.S
> > @@ -637,7 +637,7 @@ masked_##_H##interrupt: \
> > rldicl r10,r10,48,1; /* clear MSR_EE */ \
> > rotldi r10,r10,16; \
> > mtspr SPRN_##_H##SRR1,r10; \
> > -2: mtcrf 0x80,r9; \
> > +2: mtcrf 0x90,r9; \
> > ld r9,PACA_EXGEN+EX_R9(r13); \
> > ld r10,PACA_EXGEN+EX_R10(r13); \
> > ld r11,PACA_EXGEN+EX_R11(r13); \
>
> What does this do?
>
>
> Segher
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [RFC PATCH 1/2]powerpc: foundation code to handle CR5 for local_t
2014-11-28 1:58 ` Benjamin Herrenschmidt
@ 2014-11-28 3:00 ` Madhavan Srinivasan
2014-11-28 15:57 ` Segher Boessenkool
1 sibling, 0 replies; 29+ messages in thread
From: Madhavan Srinivasan @ 2014-11-28 3:00 UTC (permalink / raw)
To: Benjamin Herrenschmidt, Segher Boessenkool
Cc: linuxppc-dev, rusty, paulus, anton
On Friday 28 November 2014 07:28 AM, Benjamin Herrenschmidt wrote:
> On Thu, 2014-11-27 at 10:56 -0600, Segher Boessenkool wrote:
>> On Thu, Nov 27, 2014 at 05:48:40PM +0530, Madhavan Srinivasan wrote:
>>> Here is the design of this patch. Since local_* operations
>>> only need to be atomic with respect to interrupts (IIUC), the
>>> patch uses one of the Condition Register (CR) fields as a flag
>>> variable. When entering a local_* operation, a specific bit in
>>> the CR5 field is set, and on exit the bit is cleared. The CR
>>> bit check is done in the interrupt return path. If the CR5[EQ]
>>> bit is set and we return to kernel, we rewind the NIP to the
>>> start of the local_* operation.
>>
>> Have you tested this with (upcoming) GCC 5.0? GCC now uses CR5,
>> and it likes to use it very much; it might be more convenient to
>> use e.g. CR1 (which is allocated almost last, only before CR0).
>
> We use CR1 all over the place in our asm code. Any other suggestion?
>
Yes. CR1 is used in many places, and so is CR7. CR0 is always used
for dot insns, and I guess vector instructions use CR6.
> What's the damage of -ffixed-cr5 on gcc5? Won't it just use CR4 or 6
> instead?
>
Will try this today with GCC 5.0.
>>> --- a/arch/powerpc/include/asm/exception-64s.h
>>> +++ b/arch/powerpc/include/asm/exception-64s.h
>>> @@ -306,7 +306,26 @@ do_kvm_##n: \
>>> std r10,0(r1); /* make stack chain pointer */ \
>>> std r0,GPR0(r1); /* save r0 in stackframe */ \
>>> std r10,GPR1(r1); /* save r1 in stackframe */ \
>>> - beq 4f; /* if from kernel mode */ \
>>> +BEGIN_FTR_SECTION; \
>>> + lis r9,4096; /* Create a mask with HV and PR */ \
>>> + rldicr r9,r9,32,31; /* bits, AND with the MSR */ \
>>> + mr r10,r9; /* to check for Hyp state */ \
>>> + ori r9,r9,16384; \
>>> + and r9,r12,r9; \
>>> + cmpd cr3,r10,r9; \
>>> + beq cr3,66f; /* Jump if we come from Hyp mode*/ \
>>> + mtcrf 0x04,r10; /* Clear CR5 if coming from usr */ \
>>
>> Wow, such nastiness, only to avoid using dot insns (since you need to keep
>> the current CR0 value for the following beq 4f). And CR0 already holds the
>> PR bit, so you need only to check the HV bit anyway? Some restructuring
>> would make this a lot simpler and clearer.
>>
>>> + /*
>>> + * Now that we are about to exit from interrupt, lets check for
>>> + * cr5 eq bit. If it is set, then we may be in the middle of
>>> + * local_t update. In this case, we should rewind the NIP
>>> + * accordingly.
>>> + */
>>> + mfcr r3
>>> + andi. r4,r3,0x200
>>> + beq 63f
>>
>> This is just bne cr5,63f isn't it?
>>
>>> index 72e783e..edb75a9 100644
>>> --- a/arch/powerpc/kernel/exceptions-64s.S
>>> +++ b/arch/powerpc/kernel/exceptions-64s.S
>>> @@ -637,7 +637,7 @@ masked_##_H##interrupt: \
>>> rldicl r10,r10,48,1; /* clear MSR_EE */ \
>>> rotldi r10,r10,16; \
>>> mtspr SPRN_##_H##SRR1,r10; \
>>> -2: mtcrf 0x80,r9; \
>>> +2: mtcrf 0x90,r9; \
>>> ld r9,PACA_EXGEN+EX_R9(r13); \
>>> ld r10,PACA_EXGEN+EX_R10(r13); \
>>> ld r11,PACA_EXGEN+EX_R11(r13); \
>>
>> What does this do?
>>
>>
>> Segher
>
>
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [RFC PATCH 1/2]powerpc: foundation code to handle CR5 for local_t
2014-11-28 1:58 ` Benjamin Herrenschmidt
2014-11-28 3:00 ` Madhavan Srinivasan
@ 2014-11-28 15:57 ` Segher Boessenkool
1 sibling, 0 replies; 29+ messages in thread
From: Segher Boessenkool @ 2014-11-28 15:57 UTC (permalink / raw)
To: Benjamin Herrenschmidt
Cc: linuxppc-dev, rusty, Madhavan Srinivasan, paulus, anton
On Fri, Nov 28, 2014 at 12:58:55PM +1100, Benjamin Herrenschmidt wrote:
> > Have you tested this with (upcoming) GCC 5.0? GCC now uses CR5,
> > and it likes to use it very much; it might be more convenient to
> > use e.g. CR1 (which is allocated almost last, only before CR0).
>
> We use CR1 all over the place in our asm code. Any other suggestion?
>
> What's the damage of -ffixed-cr5 on gcc5? Won't it just use CR4 or 6
> instead?
Oh, it will work fine. Not using CR5 would be more convenient so that
the register allocation for most code would not change whether or not
you use -ffixed-cr5, making the code easier to read. But your point
about asm code already using the other CR fields actually makes CR5 a
better choice, because people avoided it (because the compiler did) :-)
Segher
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [RFC PATCH 1/2]powerpc: foundation code to handle CR5 for local_t
2014-11-27 16:56 ` Segher Boessenkool
2014-11-28 1:58 ` Benjamin Herrenschmidt
@ 2014-11-28 2:57 ` Madhavan Srinivasan
2014-11-28 16:00 ` Segher Boessenkool
1 sibling, 1 reply; 29+ messages in thread
From: Madhavan Srinivasan @ 2014-11-28 2:57 UTC (permalink / raw)
To: Segher Boessenkool; +Cc: rusty, paulus, anton, linuxppc-dev
On Thursday 27 November 2014 10:26 PM, Segher Boessenkool wrote:
> On Thu, Nov 27, 2014 at 05:48:40PM +0530, Madhavan Srinivasan wrote:
>> Here is the design of this patch. Since local_* operations
>> only need to be atomic with respect to interrupts (IIUC), the
>> patch uses one of the Condition Register (CR) fields as a flag
>> variable. When entering a local_* operation, a specific bit in
>> the CR5 field is set, and on exit the bit is cleared. The CR
>> bit check is done in the interrupt return path. If the CR5[EQ]
>> bit is set and we return to kernel, we rewind the NIP to the
>> start of the local_* operation.
>
> Have you tested this with (upcoming) GCC 5.0? GCC now uses CR5,
> and it likes to use it very much; it might be more convenient to
> use e.g. CR1 (which is allocated almost last, only before CR0).
>
No, I did not try it with GCC 5.0, but I did force kernel compilation
with -ffixed-cr5, which should make GCC avoid using CR5. I will try
that today.
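If toolchains without the flag are a concern, it could also be probed
the way the top-level Makefile already probes other options (a sketch,
not part of the posted patch):

	ifdef CONFIG_PPC64
	# probe the flag; older compilers may not know -ffixed-cr5
	KBUILD_CFLAGS += $(call cc-option, -ffixed-cr5)
	endif

Note, though, that silently dropping the flag would make the whole CR5
scheme unsafe, so failing the build hard is arguably the better choice.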
>> --- a/arch/powerpc/include/asm/exception-64s.h
>> +++ b/arch/powerpc/include/asm/exception-64s.h
>> @@ -306,7 +306,26 @@ do_kvm_##n: \
>> std r10,0(r1); /* make stack chain pointer */ \
>> std r0,GPR0(r1); /* save r0 in stackframe */ \
>> std r10,GPR1(r1); /* save r1 in stackframe */ \
>> - beq 4f; /* if from kernel mode */ \
>> +BEGIN_FTR_SECTION; \
>> + lis r9,4096; /* Create a mask with HV and PR */ \
>> + rldicr r9,r9,32,31; /* bits, AND with the MSR */ \
>> + mr r10,r9; /* to check for Hyp state */ \
>> + ori r9,r9,16384; \
>> + and r9,r12,r9; \
>> + cmpd cr3,r10,r9; \
>> + beq cr3,66f; /* Jump if we come from Hyp mode*/ \
>> + mtcrf 0x04,r10; /* Clear CR5 if coming from usr */ \
>
> Wow, such nastiness, only to avoid using dot insns (since you need to keep
> the current CR0 value for the following beq 4f). And CR0 already holds the
> PR bit, so you need only to check the HV bit anyway? Some restructuring
> would make this a lot simpler and clearer.
Ok I can try that.
>
>> + /*
>> + * Now that we are about to exit from interrupt, lets check for
>> + * cr5 eq bit. If it is set, then we may be in the middle of
>> + * local_t update. In this case, we should rewind the NIP
>> + * accordingly.
>> + */
>> + mfcr r3
>> + andi. r4,r3,0x200
>> + beq 63f
>
> This is just bne cr5,63f isn't it?
>
>> index 72e783e..edb75a9 100644
>> --- a/arch/powerpc/kernel/exceptions-64s.S
>> +++ b/arch/powerpc/kernel/exceptions-64s.S
>> @@ -637,7 +637,7 @@ masked_##_H##interrupt: \
>> rldicl r10,r10,48,1; /* clear MSR_EE */ \
>> rotldi r10,r10,16; \
>> mtspr SPRN_##_H##SRR1,r10; \
>> -2: mtcrf 0x80,r9; \
>> +2: mtcrf 0x90,r9; \
>> ld r9,PACA_EXGEN+EX_R9(r13); \
>> ld r10,PACA_EXGEN+EX_R10(r13); \
>> ld r11,PACA_EXGEN+EX_R11(r13); \
>
> What does this do?
>
Since I use CR3, I restore it here.
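For reference, the FXM operand of mtcrf is an 8-bit field mask, MSB
first (0x80 = CR0, 0x40 = CR1, ..., 0x01 = CR7):

	mtcrf	0x80,r9		/* restores CR0 only (original code) */
	mtcrf	0x90,r9		/* restores CR0 and CR3 (0x10 = CR3) */
	mtcrf	0x04,r10	/* writes only the CR5 field         */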
>
> Segher
>
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [RFC PATCH 1/2]powerpc: foundation code to handle CR5 for local_t
2014-11-28 2:57 ` Madhavan Srinivasan
@ 2014-11-28 16:00 ` Segher Boessenkool
0 siblings, 0 replies; 29+ messages in thread
From: Segher Boessenkool @ 2014-11-28 16:00 UTC (permalink / raw)
To: Madhavan Srinivasan; +Cc: rusty, paulus, anton, linuxppc-dev
On Fri, Nov 28, 2014 at 08:27:22AM +0530, Madhavan Srinivasan wrote:
> > Have you tested this with (upcoming) GCC 5.0? GCC now uses CR5,
> > and it likes to use it very much; it might be more convenient to
> > use e.g. CR1 (which is allocated almost last, only before CR0).
> >
> No, I did not try it with GCC 5.0, but I did force kernel compilation
> with -ffixed-cr5, which should make GCC avoid using CR5. I will try
> that today.
It should work just fine, but testing is always useful, because, you know,
"should". Thanks.
Segher
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [RFC PATCH 1/2]powerpc: foundation code to handle CR5 for local_t
2014-11-27 12:18 ` [RFC PATCH 1/2]powerpc: foundation code to handle CR5 for local_t Madhavan Srinivasan
2014-11-27 16:56 ` Segher Boessenkool
@ 2014-11-28 0:56 ` Benjamin Herrenschmidt
2014-11-28 3:15 ` Madhavan Srinivasan
2014-12-01 21:35 ` Gabriel Paubert
2014-12-02 2:04 ` Scott Wood
3 siblings, 1 reply; 29+ messages in thread
From: Benjamin Herrenschmidt @ 2014-11-28 0:56 UTC (permalink / raw)
To: Madhavan Srinivasan; +Cc: rusty, paulus, anton, linuxppc-dev
On Thu, 2014-11-27 at 17:48 +0530, Madhavan Srinivasan wrote:
> This patch creates the infrastructure to handle the CR-based
> local_* atomic operations. Local atomic operations are fast
> and highly reentrant per-CPU counters, used for percpu
> variable updates. They only guarantee variable modification
> atomicity wrt the CPU which owns the data, and they need to be
> executed in a preemption-safe way.
>
> Here is the design of this patch. Since local_* operations
> only need to be atomic with respect to interrupts (IIUC), the
> patch uses one of the Condition Register (CR) fields as a flag
> variable. When entering a local_* operation, a specific bit in
> the CR5 field is set, and on exit the bit is cleared. The CR
> bit check is done in the interrupt return path. If the CR5[EQ]
> bit is set and we return to kernel, we rewind the NIP to the
> start of the local_* operation.
>
> The reason for this approach is that currently the
> l[w/d]arx/st[w/d]cx. instruction pair is used for local_*
> operations; these are heavy on cycle count and do not support
> a local variant. To see whether the new implementation helps, I
> used a modified version of Rusty's benchmark code on local_t.
>
> https://lkml.org/lkml/2008/12/16/450
>
> Modifications:
> - increased the working set size from 1MB to 8MB,
> - removed cpu_local_inc test.
>
> Test ran
> - on POWER8 1S Scale out System 2.0GHz
> - on OPAL v3 with v3.18-rc4 patch kernel as Host
>
> Here are the values with the patch.
>
> Time in ns per iteration
>
>               inc   add   read   add_return
> atomic_long    67    67     18           69
> irqsave/rest   39    39     23           39
> trivalue       39    39     29           49
> local_t        26    26     24           26
>
> Since CR5 is used as a flag, I have added a CFLAGS option to
> keep the compiler from using CR5 in the kernel compilation, and
> CR5 is zeroed at kernel entry.
>
> Tested the patch in a
> - pSeries LPAR,
> - Host with patched/unmodified guest kernel
>
> To check whether userspace sees any CR5 corruption, I ran a
> simple test which does:
> - set CR5 field,
> - while(1)
> - sleep or gettimeofday
> - chk bit set
>
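A minimal userspace sketch of such a checker (hypothetical, not the
exact test that was run; it assumes the compiler itself never touches
CR5 between the asm statements, which holds for pre-5.0 GCC):

	#include <stdio.h>
	#include <unistd.h>

	int main(void)
	{
		unsigned long cr;

		asm volatile("crset 4*cr5+eq" ::: "cr5");  /* set the flag */
		for (;;) {
			sleep(1);               /* syscalls + interrupts */
			asm volatile("mfcr %0" : "=r" (cr));
			if (!(cr & 0x200)) {    /* CR bit 22 = CR5[EQ]   */
				puts("CR5[EQ] lost across kernel entry");
				return 1;
			}
		}
	}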
> Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
> ---
> - I really appreciate feedback on the patchset.
> - Kindly comment if I should try any other benchmark or
> workload to check the numbers.
> - Also, kindly recommend any known stress test for the CR.
>
> Makefile | 6 ++
> arch/powerpc/include/asm/exception-64s.h | 21 +++++-
> arch/powerpc/kernel/entry_64.S | 106 ++++++++++++++++++++++++++++++-
> arch/powerpc/kernel/exceptions-64s.S | 2 +-
> arch/powerpc/kernel/head_64.S | 8 +++
> 5 files changed, 138 insertions(+), 5 deletions(-)
>
> diff --git a/Makefile b/Makefile
> index 00d618b..2e271ad 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -706,6 +706,12 @@ endif
>
> KBUILD_CFLAGS += $(call cc-option, -fno-var-tracking-assignments)
>
> +ifdef CONFIG_PPC64
> +# We need this flag to force compiler not to use CR5, since
> +# local_t type code is based on this.
> +KBUILD_CFLAGS += -ffixed-cr5
> +endif
> +
> ifdef CONFIG_DEBUG_INFO
> ifdef CONFIG_DEBUG_INFO_SPLIT
> KBUILD_CFLAGS += $(call cc-option, -gsplit-dwarf, -g)
> diff --git a/arch/powerpc/include/asm/exception-64s.h b/arch/powerpc/include/asm/exception-64s.h
> index 77f52b2..c42919a 100644
> --- a/arch/powerpc/include/asm/exception-64s.h
> +++ b/arch/powerpc/include/asm/exception-64s.h
> @@ -306,7 +306,26 @@ do_kvm_##n: \
> std r10,0(r1); /* make stack chain pointer */ \
> std r0,GPR0(r1); /* save r0 in stackframe */ \
> std r10,GPR1(r1); /* save r1 in stackframe */ \
> - beq 4f; /* if from kernel mode */ \
> +BEGIN_FTR_SECTION; \
> + lis r9,4096; /* Create a mask with HV and PR */ \
> + rldicr r9,r9,32,31; /* bits, AND with the MSR */ \
> + mr r10,r9; /* to check for Hyp state */ \
> + ori r9,r9,16384; \
> + and r9,r12,r9; \
> + cmpd cr3,r10,r9; \
> + beq cr3,66f; /* Jump if we come from Hyp mode*/ \
> + mtcrf 0x04,r10; /* Clear CR5 if coming from usr */ \
> +FTR_SECTION_ELSE; \
Can't we just unconditionally clear it, as long as we do that after
we've saved it? In that case, it's just a matter for the fixup code
to check the saved version rather than the actual CR.
> + beq 4f; /* if kernel mode branch */ \
> + li r10,0; /* Clear CR5 incase of coming */ \
> + mtcrf 0x04,r10; /* from user. */ \
> + nop; /* This part of code is for */ \
> + nop; /* kernel with MSR[HV]=0, */ \
> + nop; /* MSR[PR]=0, so just chk for */ \
> + nop; /* MSR[PR] */ \
> + nop; \
> +ALT_FTR_SECTION_END_IFSET(CPU_FTR_HVMODE); \
> +66: beq 4f; /* if from kernel mode */ \
> ACCOUNT_CPU_USER_ENTRY(r9, r10); \
> SAVE_PPR(area, r9, r10); \
> 4: EXCEPTION_PROLOG_COMMON_2(area) \
> diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S
> index 0905c8d..e42bb99 100644
> --- a/arch/powerpc/kernel/entry_64.S
> +++ b/arch/powerpc/kernel/entry_64.S
> @@ -68,7 +68,26 @@ system_call_common:
> 2: std r2,GPR2(r1)
> std r3,GPR3(r1)
> mfcr r2
> - std r4,GPR4(r1)
> +BEGIN_FTR_SECTION
> + lis r10,4096
> + rldicr r10,r10,32,31
> + mr r11,r10
> + ori r10,r10,16384
> + and r10,r12,r10
> + cmpd r11,r10
> + beq 67f
> + mtcrf 0x04,r11
> +FTR_SECTION_ELSE
> + beq 67f
> + li r11,0
> + mtcrf 0x04,r11
> + nop
> + nop
> + nop
> + nop
> + nop
> +ALT_FTR_SECTION_END_IFSET(CPU_FTR_HVMODE)
> +67: std r4,GPR4(r1)
> std r5,GPR5(r1)
> std r6,GPR6(r1)
> std r7,GPR7(r1)
> @@ -224,8 +243,26 @@ syscall_exit:
> BEGIN_FTR_SECTION
> stdcx. r0,0,r1 /* to clear the reservation */
> END_FTR_SECTION_IFCLR(CPU_FTR_STCX_CHECKS_ADDRESS)
> +BEGIN_FTR_SECTION
> + lis r4,4096
> + rldicr r4,r4,32,31
> + mr r6,r4
> + ori r4,r4,16384
> + and r4,r8,r4
> + cmpd cr3,r6,r4
> + beq cr3,65f
> + mtcr r5
> +FTR_SECTION_ELSE
> andi. r6,r8,MSR_PR
> - ld r4,_LINK(r1)
> + beq 65f
> + mtcr r5
> + nop
> + nop
> + nop
> + nop
> + nop
> +ALT_FTR_SECTION_END_IFSET(CPU_FTR_HVMODE)
> +65: ld r4,_LINK(r1)
>
> beq- 1f
> ACCOUNT_CPU_USER_EXIT(r11, r12)
> @@ -234,7 +271,11 @@ END_FTR_SECTION_IFCLR(CPU_FTR_STCX_CHECKS_ADDRESS)
> 1: ld r2,GPR2(r1)
> ld r1,GPR1(r1)
> mtlr r4
> +#ifdef CONFIG_PPC64
> + mtcrf 0xFB,r5
> +#else
> mtcr r5
> +#endif
> mtspr SPRN_SRR0,r7
> mtspr SPRN_SRR1,r8
> RFI
> @@ -804,7 +845,66 @@ ALT_FTR_SECTION_END_IFCLR(CPU_FTR_STCX_CHECKS_ADDRESS)
> */
> .globl fast_exception_return
> fast_exception_return:
> - ld r3,_MSR(r1)
> +
> + /*
> + * Now that we are about to exit from interrupt, lets check for
> + * cr5 eq bit. If it is set, then we may be in the middle of
> + * local_t update. In this case, we should rewind the NIP
> + * accordingly.
> + */
> + mfcr r3
> + andi. r4,r3,0x200
> + beq 63f
> +
> + /*
> + * Now that the bit is set, lets check for return to User
> + */
> + ld r4,_MSR(r1)
> +BEGIN_FTR_SECTION
> + li r3,4096
> + rldicr r3,r3,32,31
> + mr r5,r3
> + ori r3,r3,16384
> + and r3,r4,r3
> + cmpd r5,r3
> + bne 63f
> +FTR_SECTION_ELSE
> + andi. r3,r4,MSR_PR
> + bne 63f
> + nop
> + nop
> + nop
> + nop
> + nop
> +ALT_FTR_SECTION_END_IFSET(CPU_FTR_HVMODE)
> +
> + /*
> + * Looks like we are returning to Kernel, so
> + * lets get the NIP and search the ex_table.
> + * Change the NIP based on the return value
> + */
> +lookup_ex_table:
> + ld r3,_NIP(r1)
> + bl search_exception_tables
> + cmpli 0,1,r3,0
> + bne 62f
> +
> + /*
> + * This is a panic case. Reason is that, we
> + * have the CR5 bit set, but we are not in
> + * local_* code and we are returning to Kernel.
> + */
> + ld r3,_NIP(r1)
> + mfcr r4
> + EMIT_BUG_ENTRY lookup_ex_table, __FILE__,__LINE__,BUGFLAG_WARNING
> +
> + /*
> + * Now save the return fixup address as NIP
> + */
> +62: ld r4,8(r3)
> + std r4,_NIP(r1)
> + crclr 22
> +63: ld r3,_MSR(r1)
> ld r4,_CTR(r1)
> ld r0,_LINK(r1)
> mtctr r4
> diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
> index 72e783e..edb75a9 100644
> --- a/arch/powerpc/kernel/exceptions-64s.S
> +++ b/arch/powerpc/kernel/exceptions-64s.S
> @@ -637,7 +637,7 @@ masked_##_H##interrupt: \
> rldicl r10,r10,48,1; /* clear MSR_EE */ \
> rotldi r10,r10,16; \
> mtspr SPRN_##_H##SRR1,r10; \
> -2: mtcrf 0x80,r9; \
> +2: mtcrf 0x90,r9; \
> ld r9,PACA_EXGEN+EX_R9(r13); \
> ld r10,PACA_EXGEN+EX_R10(r13); \
> ld r11,PACA_EXGEN+EX_R11(r13); \
> diff --git a/arch/powerpc/kernel/head_64.S b/arch/powerpc/kernel/head_64.S
> index d48125d..02e49b3 100644
> --- a/arch/powerpc/kernel/head_64.S
> +++ b/arch/powerpc/kernel/head_64.S
> @@ -347,6 +347,14 @@ __mmu_off:
> *
> */
> __start_initialization_multiplatform:
> +
> + /*
> + * Before we do anything, lets clear CR5 field,
> + * so that we will have a clean start at entry
> + */
> + li r11,0
> + mtcrf 0x04,r11
> +
> /* Make sure we are running in 64 bits mode */
> bl enable_64b_mode
>
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [RFC PATCH 1/2]powerpc: foundation code to handle CR5 for local_t
2014-11-28 0:56 ` Benjamin Herrenschmidt
@ 2014-11-28 3:15 ` Madhavan Srinivasan
2014-11-28 3:21 ` Benjamin Herrenschmidt
0 siblings, 1 reply; 29+ messages in thread
From: Madhavan Srinivasan @ 2014-11-28 3:15 UTC (permalink / raw)
To: Benjamin Herrenschmidt; +Cc: rusty, paulus, anton, linuxppc-dev
On Friday 28 November 2014 06:26 AM, Benjamin Herrenschmidt wrote:
> On Thu, 2014-11-27 at 17:48 +0530, Madhavan Srinivasan wrote:
>> This patch creates the infrastructure to handle the CR-based
>> local_* atomic operations. Local atomic operations are fast
>> and highly reentrant per-CPU counters, used for percpu
>> variable updates. They only guarantee variable modification
>> atomicity wrt the CPU which owns the data, and they need to be
>> executed in a preemption-safe way.
>>
>> Here is the design of this patch. Since local_* operations
>> only need to be atomic with respect to interrupts (IIUC), the
>> patch uses one of the Condition Register (CR) fields as a flag
>> variable. When entering a local_* operation, a specific bit in
>> the CR5 field is set, and on exit the bit is cleared. The CR
>> bit check is done in the interrupt return path. If the CR5[EQ]
>> bit is set and we return to kernel, we rewind the NIP to the
>> start of the local_* operation.
>>
>> The reason for this approach is that currently the
>> l[w/d]arx/st[w/d]cx. instruction pair is used for local_*
>> operations; these are heavy on cycle count and do not support
>> a local variant. To see whether the new implementation helps, I
>> used a modified version of Rusty's benchmark code on local_t.
>>
>> https://lkml.org/lkml/2008/12/16/450
>>
>> Modifications:
>> - increased the working set size from 1MB to 8MB,
>> - removed cpu_local_inc test.
>>
>> Test ran
>> - on POWER8 1S Scale out System 2.0GHz
>> - on OPAL v3 with v3.18-rc4 patch kernel as Host
>>
>> Here are the values with the patch.
>>
>> Time in ns per iteration
>>
>>               inc   add   read   add_return
>> atomic_long    67    67     18           69
>> irqsave/rest   39    39     23           39
>> trivalue       39    39     29           49
>> local_t        26    26     24           26
>>
>> Since CR5 is used as a flag, I have added a CFLAGS option to
>> keep the compiler from using CR5 in the kernel compilation, and
>> CR5 is zeroed at kernel entry.
>>
>> Tested the patch in a
>> - pSeries LPAR,
>> - Host with patched/unmodified guest kernel
>>
>> To check whether userspace sees any CR5 corruption, I ran a
>> simple test which does:
>> - set CR5 field,
>> - while(1)
>> - sleep or gettimeofday
>> - chk bit set
>>
>> Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
>> ---
>> - I really appreciate feedback on the patchset.
>> - Kindly comment if I should try any other benchmark or
>> workload to check the numbers.
>> - Also, kindly recommend any known stress test for the CR.
>>
>> Makefile | 6 ++
>> arch/powerpc/include/asm/exception-64s.h | 21 +++++-
>> arch/powerpc/kernel/entry_64.S | 106 ++++++++++++++++++++++++++++++-
>> arch/powerpc/kernel/exceptions-64s.S | 2 +-
>> arch/powerpc/kernel/head_64.S | 8 +++
>> 5 files changed, 138 insertions(+), 5 deletions(-)
>>
>> diff --git a/Makefile b/Makefile
>> index 00d618b..2e271ad 100644
>> --- a/Makefile
>> +++ b/Makefile
>> @@ -706,6 +706,12 @@ endif
>>
>> KBUILD_CFLAGS += $(call cc-option, -fno-var-tracking-assignments)
>>
>> +ifdef CONFIG_PPC64
>> +# We need this flag to force compiler not to use CR5, since
>> +# local_t type code is based on this.
>> +KBUILD_CFLAGS += -ffixed-cr5
>> +endif
>> +
>> ifdef CONFIG_DEBUG_INFO
>> ifdef CONFIG_DEBUG_INFO_SPLIT
>> KBUILD_CFLAGS += $(call cc-option, -gsplit-dwarf, -g)
>> diff --git a/arch/powerpc/include/asm/exception-64s.h b/arch/powerpc/include/asm/exception-64s.h
>> index 77f52b2..c42919a 100644
>> --- a/arch/powerpc/include/asm/exception-64s.h
>> +++ b/arch/powerpc/include/asm/exception-64s.h
>> @@ -306,7 +306,26 @@ do_kvm_##n: \
>> std r10,0(r1); /* make stack chain pointer */ \
>> std r0,GPR0(r1); /* save r0 in stackframe */ \
>> std r10,GPR1(r1); /* save r1 in stackframe */ \
>> - beq 4f; /* if from kernel mode */ \
>> +BEGIN_FTR_SECTION; \
>> + lis r9,4096; /* Create a mask with HV and PR */ \
>> + rldicr r9,r9,32,31; /* bits, AND with the MSR */ \
>> + mr r10,r9; /* to check for Hyp state */ \
>> + ori r9,r9,16384; \
>> + and r9,r12,r9; \
>> + cmpd cr3,r10,r9; \
>> + beq cr3,66f; /* Jump if we come from Hyp mode*/ \
>> + mtcrf 0x04,r10; /* Clear CR5 if coming from usr */ \
>> +FTR_SECTION_ELSE; \
>
> Can't we just unconditionally clear it, as long as we do that after
> we've saved it? In that case, it's just a matter for the fixup code
> to check the saved version rather than the actual CR.
>
I use CR bit setting in the interrupt return path to enter the fixup
section search. If we unconditionally clear it, we will have to enter
the fixup section for every kernel return NIP, right?
Regards
Maddy
>> + beq 4f; /* if kernel mode branch */ \
>> + li r10,0; /* Clear CR5 incase of coming */ \
>> + mtcrf 0x04,r10; /* from user. */ \
>> + nop; /* This part of code is for */ \
>> + nop; /* kernel with MSR[HV]=0, */ \
>> + nop; /* MSR[PR]=0, so just chk for */ \
>> + nop; /* MSR[PR] */ \
>> + nop; \
>> +ALT_FTR_SECTION_END_IFSET(CPU_FTR_HVMODE); \
>> +66: beq 4f; /* if from kernel mode */ \
>> ACCOUNT_CPU_USER_ENTRY(r9, r10); \
>> SAVE_PPR(area, r9, r10); \
>> 4: EXCEPTION_PROLOG_COMMON_2(area) \
>> diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S
>> index 0905c8d..e42bb99 100644
>> --- a/arch/powerpc/kernel/entry_64.S
>> +++ b/arch/powerpc/kernel/entry_64.S
>> @@ -68,7 +68,26 @@ system_call_common:
>> 2: std r2,GPR2(r1)
>> std r3,GPR3(r1)
>> mfcr r2
>> - std r4,GPR4(r1)
>> +BEGIN_FTR_SECTION
>> + lis r10,4096
>> + rldicr r10,r10,32,31
>> + mr r11,r10
>> + ori r10,r10,16384
>> + and r10,r12,r10
>> + cmpd r11,r10
>> + beq 67f
>> + mtcrf 0x04,r11
>> +FTR_SECTION_ELSE
>> + beq 67f
>> + li r11,0
>> + mtcrf 0x04,r11
>> + nop
>> + nop
>> + nop
>> + nop
>> + nop
>> +ALT_FTR_SECTION_END_IFSET(CPU_FTR_HVMODE)
>> +67: std r4,GPR4(r1)
>> std r5,GPR5(r1)
>> std r6,GPR6(r1)
>> std r7,GPR7(r1)
>> @@ -224,8 +243,26 @@ syscall_exit:
>> BEGIN_FTR_SECTION
>> stdcx. r0,0,r1 /* to clear the reservation */
>> END_FTR_SECTION_IFCLR(CPU_FTR_STCX_CHECKS_ADDRESS)
>> +BEGIN_FTR_SECTION
>> + lis r4,4096
>> + rldicr r4,r4,32,31
>> + mr r6,r4
>> + ori r4,r4,16384
>> + and r4,r8,r4
>> + cmpd cr3,r6,r4
>> + beq cr3,65f
>> + mtcr r5
>> +FTR_SECTION_ELSE
>> andi. r6,r8,MSR_PR
>> - ld r4,_LINK(r1)
>> + beq 65f
>> + mtcr r5
>> + nop
>> + nop
>> + nop
>> + nop
>> + nop
>> +ALT_FTR_SECTION_END_IFSET(CPU_FTR_HVMODE)
>> +65: ld r4,_LINK(r1)
>>
>> beq- 1f
>> ACCOUNT_CPU_USER_EXIT(r11, r12)
>> @@ -234,7 +271,11 @@ END_FTR_SECTION_IFCLR(CPU_FTR_STCX_CHECKS_ADDRESS)
>> 1: ld r2,GPR2(r1)
>> ld r1,GPR1(r1)
>> mtlr r4
>> +#ifdef CONFIG_PPC64
>> + mtcrf 0xFB,r5
>> +#else
>> mtcr r5
>> +#endif
>> mtspr SPRN_SRR0,r7
>> mtspr SPRN_SRR1,r8
>> RFI
>> @@ -804,7 +845,66 @@ ALT_FTR_SECTION_END_IFCLR(CPU_FTR_STCX_CHECKS_ADDRESS)
>> */
>> .globl fast_exception_return
>> fast_exception_return:
>> - ld r3,_MSR(r1)
>> +
>> + /*
>> + * Now that we are about to exit from interrupt, lets check for
>> + * cr5 eq bit. If it is set, then we may be in the middle of
>> + * local_t update. In this case, we should rewind the NIP
>> + * accordingly.
>> + */
>> + mfcr r3
>> + andi. r4,r3,0x200
>> + beq 63f
>> +
>> + /*
>> + * Now that the bit is set, lets check for return to User
>> + */
>> + ld r4,_MSR(r1)
>> +BEGIN_FTR_SECTION
>> + li r3,4096
>> + rldicr r3,r3,32,31
>> + mr r5,r3
>> + ori r3,r3,16384
>> + and r3,r4,r3
>> + cmpd r5,r3
>> + bne 63f
>> +FTR_SECTION_ELSE
>> + andi. r3,r4,MSR_PR
>> + bne 63f
>> + nop
>> + nop
>> + nop
>> + nop
>> + nop
>> +ALT_FTR_SECTION_END_IFSET(CPU_FTR_HVMODE)
>> +
>> + /*
>> + * Looks like we are returning to Kernel, so
>> + * lets get the NIP and search the ex_table.
>> + * Change the NIP based on the return value
>> + */
>> +lookup_ex_table:
>> + ld r3,_NIP(r1)
>> + bl search_exception_tables
>> + cmpli 0,1,r3,0
>> + bne 62f
>> +
>> + /*
>> + * This is a panic case. Reason is that, we
>> + * have the CR5 bit set, but we are not in
>> + * local_* code and we are returning to Kernel.
>> + */
>> + ld r3,_NIP(r1)
>> + mfcr r4
>> + EMIT_BUG_ENTRY lookup_ex_table, __FILE__,__LINE__,BUGFLAG_WARNING
>> +
>> + /*
>> + * Now save the return fixup address as NIP
>> + */
>> +62: ld r4,8(r3)
>> + std r4,_NIP(r1)
>> + crclr 22
>> +63: ld r3,_MSR(r1)
>> ld r4,_CTR(r1)
>> ld r0,_LINK(r1)
>> mtctr r4
>> diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
>> index 72e783e..edb75a9 100644
>> --- a/arch/powerpc/kernel/exceptions-64s.S
>> +++ b/arch/powerpc/kernel/exceptions-64s.S
>> @@ -637,7 +637,7 @@ masked_##_H##interrupt: \
>> rldicl r10,r10,48,1; /* clear MSR_EE */ \
>> rotldi r10,r10,16; \
>> mtspr SPRN_##_H##SRR1,r10; \
>> -2: mtcrf 0x80,r9; \
>> +2: mtcrf 0x90,r9; \
>> ld r9,PACA_EXGEN+EX_R9(r13); \
>> ld r10,PACA_EXGEN+EX_R10(r13); \
>> ld r11,PACA_EXGEN+EX_R11(r13); \
>> diff --git a/arch/powerpc/kernel/head_64.S b/arch/powerpc/kernel/head_64.S
>> index d48125d..02e49b3 100644
>> --- a/arch/powerpc/kernel/head_64.S
>> +++ b/arch/powerpc/kernel/head_64.S
>> @@ -347,6 +347,14 @@ __mmu_off:
>> *
>> */
>> __start_initialization_multiplatform:
>> +
>> + /*
>> + * Before we do anything, lets clear CR5 field,
>> + * so that we will have a clean start at entry
>> + */
>> + li r11,0
>> + mtcrf 0x04,r11
>> +
>> /* Make sure we are running in 64 bits mode */
>> bl enable_64b_mode
>>
>
>
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [RFC PATCH 1/2]powerpc: foundation code to handle CR5 for local_t
2014-11-28 3:15 ` Madhavan Srinivasan
@ 2014-11-28 3:21 ` Benjamin Herrenschmidt
2014-11-28 10:53 ` David Laight
0 siblings, 1 reply; 29+ messages in thread
From: Benjamin Herrenschmidt @ 2014-11-28 3:21 UTC (permalink / raw)
To: Madhavan Srinivasan; +Cc: rusty, paulus, anton, linuxppc-dev
On Fri, 2014-11-28 at 08:45 +0530, Madhavan Srinivasan wrote:
> > Can't we just unconditionally clear it, as long as we do that after
> > we've saved it? In that case, it's just a matter for the fixup code
> > to check the saved version rather than the actual CR.
> >
> I use CR bit setting in the interrupt return path to enter the fixup
> section search. If we unconditionally clear it, we will have to enter
> the fixup section for every kernel return NIP, right?
As I said above, can't we look at the saved version?
IE.
- On interrupt entry:
* Save CR to CCR(r1)
* clear CR5
- On exit
* Check CCR(r1)'s CR5 field
* restore CR
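A minimal sketch of that scheme (illustrative registers, reusing the
patch's 0x200 = CR5[EQ] mask on the saved image):

	/* interrupt entry */
	mfcr	r10
	std	r10,_CCR(r1)	/* saved image keeps the flag          */
	crclr	4*cr5+eq	/* kernel code starts with a clean CR5 */
	...
	/* interrupt exit: test the saved image, not the live CR */
	ld	r10,_CCR(r1)
	andi.	r9,r10,0x200	/* CR5[EQ] as saved at entry           */
	beq	63f		/* flag clear: nothing to fix up       */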
Cheers,
Ben.
^ permalink raw reply [flat|nested] 29+ messages in thread
* RE: [RFC PATCH 1/2]powerpc: foundation code to handle CR5 for local_t
2014-11-28 3:21 ` Benjamin Herrenschmidt
@ 2014-11-28 10:53 ` David Laight
2014-11-30 9:01 ` Benjamin Herrenschmidt
0 siblings, 1 reply; 29+ messages in thread
From: David Laight @ 2014-11-28 10:53 UTC (permalink / raw)
To: 'Benjamin Herrenschmidt', Madhavan Srinivasan
Cc: linuxppc-dev@lists.ozlabs.org, rusty@rustcorp.com.au,
paulus@samba.org, anton@samba.org
From: Benjamin Herrenschmidt
> On Fri, 2014-11-28 at 08:45 +0530, Madhavan Srinivasan wrote:
> > > Can't we just unconditionally clear it, as long as we do that after
> > > we've saved it? In that case, it's just a matter for the fixup code
> > > to check the saved version rather than the actual CR.
> > >
> > I use CR bit setting in the interrupt return path to enter the fixup
> > section search. If we unconditionally clear it, we will have to enter
> > the fixup section for every kernel return NIP, right?
>
> As I said above, can't we look at the saved version?
>
> IE.
>
>  - On interrupt entry:
>
> 	* Save CR to CCR(r1)
> 	* clear CR5
>
>  - On exit
>
> 	* Check CCR(r1)'s CR5 field
> 	* restore CR

Actually there is no real reason why the 'fixup' can't be done
during interrupt entry.

	David
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [RFC PATCH 1/2]powerpc: foundation code to handle CR5 for local_t
2014-11-28 10:53 ` David Laight
@ 2014-11-30 9:01 ` Benjamin Herrenschmidt
0 siblings, 0 replies; 29+ messages in thread
From: Benjamin Herrenschmidt @ 2014-11-30 9:01 UTC (permalink / raw)
To: David Laight
Cc: paulus@samba.org, rusty@rustcorp.com.au, Madhavan Srinivasan,
linuxppc-dev@lists.ozlabs.org, anton@samba.org
On Fri, 2014-11-28 at 10:53 +0000, David Laight wrote:
> From: Benjamin Herrenschmidt
> > On Fri, 2014-11-28 at 08:45 +0530, Madhavan Srinivasan wrote:
> > > > Can't we just unconditionally clear it, as long as we do that
> > > > after we've saved it? In that case, it's just a matter for the
> > > > fixup code to check the saved version rather than the actual CR.
> > > >
> > > I use CR bit setting in the interrupt return path to enter the fixup
> > > section search. If we unconditionally clear it, we will have to enter
> > > the fixup section for every kernel return NIP, right?
> >
> > As I said above, can't we look at the saved version?
> >
> > IE.
> >
> > - On interrupt entry:
> >
> > * Save CR to CCR(r1)
> > * clear CR5
> >
> > - On exit
> >
> > * Check CCR(r1)'s CR5 field
> > * restore CR
>
> Actually there is no real reason why the 'fixup' can't be done
> during interrupt entry.
Other than that if we crash, we get the wrong PC in the log etc. ...
unlikely, but I tend to prefer this. Also, if we ever allow something
like a local atomic on a faulting (userspace) address, we want a
precise PC on entry. Generally we have a lot more entry paths than
exit paths; it's easier to keep the entry path simpler.
Cheers,
Ben.
> David
>
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [RFC PATCH 1/2]powerpc: foundation code to handle CR5 for local_t
2014-11-27 12:18 ` [RFC PATCH 1/2]powerpc: foundation code to handle CR5 for local_t Madhavan Srinivasan
2014-11-27 16:56 ` Segher Boessenkool
2014-11-28 0:56 ` Benjamin Herrenschmidt
@ 2014-12-01 21:35 ` Gabriel Paubert
2014-12-03 14:59 ` Madhavan Srinivasan
2014-12-02 2:04 ` Scott Wood
3 siblings, 1 reply; 29+ messages in thread
From: Gabriel Paubert @ 2014-12-01 21:35 UTC (permalink / raw)
To: Madhavan Srinivasan; +Cc: rusty, paulus, anton, linuxppc-dev
On Thu, Nov 27, 2014 at 05:48:40PM +0530, Madhavan Srinivasan wrote:
> This patch creates the infrastructure to handle the CR-based
> local_* atomic operations. Local atomic operations are fast
> and highly reentrant per-CPU counters, used for percpu
> variable updates. They only guarantee variable modification
> atomicity wrt the CPU which owns the data, and they need to be
> executed in a preemption-safe way.
>
> Here is the design of this patch. Since local_* operations
> only need to be atomic with respect to interrupts (IIUC), the
> patch uses one of the Condition Register (CR) fields as a flag
> variable. When entering a local_* operation, a specific bit in
> the CR5 field is set, and on exit the bit is cleared. The CR
> bit check is done in the interrupt return path. If the CR5[EQ]
> bit is set and we return to kernel, we rewind the NIP to the
> start of the local_* operation.
>
> The reason for this approach is that currently the
> l[w/d]arx/st[w/d]cx. instruction pair is used for local_*
> operations; these are heavy on cycle count and do not support
> a local variant. To see whether the new implementation helps, I
> used a modified version of Rusty's benchmark code on local_t.
>
> https://lkml.org/lkml/2008/12/16/450
>
> Modifications:
> - increased the working set size from 1MB to 8MB,
> - removed cpu_local_inc test.
>
> Test ran
> - on POWER8 1S Scale out System 2.0GHz
> - on OPAL v3 with v3.18-rc4 patch kernel as Host
>
> Here are the values with the patch.
>
> Time in ns per iteration
>
>               inc   add   read   add_return
> atomic_long    67    67     18           69
> irqsave/rest   39    39     23           39
> trivalue       39    39     29           49
> local_t        26    26     24           26
>
> Since CR5 is used as a flag, I have added a CFLAGS option to
> keep the compiler from using CR5 in the kernel compilation, and
> CR5 is zeroed at kernel entry.
>
> Tested the patch in a
> - pSeries LPAR,
> - Host with patched/unmodified guest kernel
>
> To check whether userspace sees any CR5 corruption, I ran a
> simple test which does:
> - set CR5 field,
> - while(1)
> - sleep or gettimeofday
> - chk bit set
>
> Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
> ---
> - I really appreciate feedback on the patchset.
> - Kindly comment if I should try any other benchmark or
> workload to check the numbers.
> - Also, kindly recommend any known stress test for the CR.
>
> Makefile | 6 ++
> arch/powerpc/include/asm/exception-64s.h | 21 +++++-
> arch/powerpc/kernel/entry_64.S | 106 ++++++++++++++++++++++++++++++-
> arch/powerpc/kernel/exceptions-64s.S | 2 +-
> arch/powerpc/kernel/head_64.S | 8 +++
> 5 files changed, 138 insertions(+), 5 deletions(-)
>
> diff --git a/Makefile b/Makefile
> index 00d618b..2e271ad 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -706,6 +706,12 @@ endif
>
> KBUILD_CFLAGS += $(call cc-option, -fno-var-tracking-assignments)
>
> +ifdef CONFIG_PPC64
> +# We need this flag to force compiler not to use CR5, since
> +# local_t type code is based on this.
> +KBUILD_CFLAGS += -ffixed-cr5
> +endif
> +
> ifdef CONFIG_DEBUG_INFO
> ifdef CONFIG_DEBUG_INFO_SPLIT
> KBUILD_CFLAGS += $(call cc-option, -gsplit-dwarf, -g)
> diff --git a/arch/powerpc/include/asm/exception-64s.h b/arch/powerpc/include/asm/exception-64s.h
> index 77f52b2..c42919a 100644
> --- a/arch/powerpc/include/asm/exception-64s.h
> +++ b/arch/powerpc/include/asm/exception-64s.h
> @@ -306,7 +306,26 @@ do_kvm_##n: \
> std r10,0(r1); /* make stack chain pointer */ \
> std r0,GPR0(r1); /* save r0 in stackframe */ \
> std r10,GPR1(r1); /* save r1 in stackframe */ \
> - beq 4f; /* if from kernel mode */ \
> +BEGIN_FTR_SECTION; \
> + lis r9,4096; /* Create a mask with HV and PR */ \
> + rldicr r9,r9,32,31; /* bits, AND with the MSR */ \
> + mr r10,r9; /* to check for Hyp state */ \
> + ori r9,r9,16384; \
> + and r9,r12,r9; \
> + cmpd cr3,r10,r9; \
> + beq cr3,66f; /* Jump if we come from Hyp mode*/ \
> + mtcrf 0x04,r10; /* Clear CR5 if coming from usr */ \
I think you can do better than this; powerpc has a fantastic set
of rotate and mask instructions. If I understand your code
correctly, you can replace it with the following:
rldicl r10,r12,4,63 /* Extract HV bit to LSB of r10*/
rlwinm r9,r12,19,0x02 /* Extract PR bit to 2nd to last bit of r9 */
or r9,r9,r10
cmplwi cr3,r9,1 /* Check for HV=1 and PR=0 */
beq cr3,66f
mtcrf 0x04,r10 /* Bits going to cr5 bits are 0 in r10 */
and obviously similarly for the other instances of this code sequence.
That said, mtcrf is not necessarily the fastest way of setting cr5 if
you only want to clear the EQ bit. For example, comparing the stack
pointer with 0 should have the same effect.
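For instance (a sketch; r1 is never zero inside the kernel, so EQ
always comes out clear):

	cmpdi	cr5,r1,0	/* r1 != 0  =>  CR5[EQ] = 0 */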
> +FTR_SECTION_ELSE; \
> + beq 4f; /* if kernel mode branch */ \
> + li r10,0; /* Clear CR5 incase of coming */ \
> + mtcrf 0x04,r10; /* from user. */ \
> + nop; /* This part of code is for */ \
> + nop; /* kernel with MSR[HV]=0, */ \
> + nop; /* MSR[PR]=0, so just chk for */ \
> + nop; /* MSR[PR] */ \
> + nop; \
> +ALT_FTR_SECTION_END_IFSET(CPU_FTR_HVMODE); \
> +66: beq 4f; /* if from kernel mode */ \
> ACCOUNT_CPU_USER_ENTRY(r9, r10); \
> SAVE_PPR(area, r9, r10); \
> 4: EXCEPTION_PROLOG_COMMON_2(area) \
> diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S
> index 0905c8d..e42bb99 100644
> --- a/arch/powerpc/kernel/entry_64.S
> +++ b/arch/powerpc/kernel/entry_64.S
> @@ -68,7 +68,26 @@ system_call_common:
> 2: std r2,GPR2(r1)
> std r3,GPR3(r1)
> mfcr r2
> - std r4,GPR4(r1)
> +BEGIN_FTR_SECTION
> + lis r10,4096
> + rldicr r10,r10,32,31
> + mr r11,r10
> + ori r10,r10,16384
> + and r10,r12,r10
> + cmpd r11,r10
> + beq 67f
> + mtcrf 0x04,r11
> +FTR_SECTION_ELSE
> + beq 67f
> + li r11,0
> + mtcrf 0x04,r11
> + nop
> + nop
> + nop
> + nop
> + nop
> +ALT_FTR_SECTION_END_IFSET(CPU_FTR_HVMODE)
> +67: std r4,GPR4(r1)
> std r5,GPR5(r1)
> std r6,GPR6(r1)
> std r7,GPR7(r1)
> @@ -224,8 +243,26 @@ syscall_exit:
> BEGIN_FTR_SECTION
> stdcx. r0,0,r1 /* to clear the reservation */
> END_FTR_SECTION_IFCLR(CPU_FTR_STCX_CHECKS_ADDRESS)
> +BEGIN_FTR_SECTION
> + lis r4,4096
> + rldicr r4,r4,32,31
> + mr r6,r4
> + ori r4,r4,16384
> + and r4,r8,r4
> + cmpd cr3,r6,r4
> + beq cr3,65f
> + mtcr r5
> +FTR_SECTION_ELSE
> andi. r6,r8,MSR_PR
> - ld r4,_LINK(r1)
> + beq 65f
> + mtcr r5
> + nop
> + nop
> + nop
> + nop
> + nop
> +ALT_FTR_SECTION_END_IFSET(CPU_FTR_HVMODE)
> +65: ld r4,_LINK(r1)
>
> beq- 1f
> ACCOUNT_CPU_USER_EXIT(r11, r12)
> @@ -234,7 +271,11 @@ END_FTR_SECTION_IFCLR(CPU_FTR_STCX_CHECKS_ADDRESS)
> 1: ld r2,GPR2(r1)
> ld r1,GPR1(r1)
> mtlr r4
> +#ifdef CONFIG_PPC64
> + mtcrf 0xFB,r5
> +#else
> mtcr r5
> +#endif
> mtspr SPRN_SRR0,r7
> mtspr SPRN_SRR1,r8
> RFI
> @@ -804,7 +845,66 @@ ALT_FTR_SECTION_END_IFCLR(CPU_FTR_STCX_CHECKS_ADDRESS)
> */
> .globl fast_exception_return
> fast_exception_return:
> - ld r3,_MSR(r1)
> +
> + /*
> + * Now that we are about to exit from interrupt, lets check for
> + * cr5 eq bit. If it is set, then we may be in the middle of
> + * local_t update. In this case, we should rewind the NIP
> + * accordingly.
> + */
> + mfcr r3
> + andi. r4,r3,0x200
> + beq 63f
I believe that someone has already mentioned that this is a very
convoluted way of testing a CR bit. This can be optimized to bne cr5,63f.
> +
> + /*
> + * Now that the bit is set, lets check for return to User
> + */
> + ld r4,_MSR(r1)
> +BEGIN_FTR_SECTION
> + li r3,4096
> + rldicr r3,r3,32,31
> + mr r5,r3
> + ori r3,r3,16384
> + and r3,r4,r3
> + cmpd r5,r3
> + bne 63f
> +FTR_SECTION_ELSE
> + andi. r3,r4,MSR_PR
> + bne 63f
> + nop
> + nop
> + nop
> + nop
> + nop
> +ALT_FTR_SECTION_END_IFSET(CPU_FTR_HVMODE)
> +
> + /*
> + * Looks like we are returning to Kernel, so
> + * lets get the NIP and search the ex_table.
> + * Change the NIP based on the return value
> + */
> +lookup_ex_table:
> + ld r3,_NIP(r1)
> + bl search_exception_tables
> + cmpli 0,1,r3,0
> + bne 62f
> +
> + /*
> + * This is a panic case. Reason is that, we
> + * have the CR5 bit set, but we are not in
> + * local_* code and we are returning to Kernel.
> + */
> + ld r3,_NIP(r1)
> + mfcr r4
> + EMIT_BUG_ENTRY lookup_ex_table, __FILE__,__LINE__,BUGFLAG_WARNING
> +
> + /*
> + * Now save the return fixup address as NIP
> + */
> +62: ld r4,8(r3)
> + std r4,_NIP(r1)
> + crclr 22
> +63: ld r3,_MSR(r1)
> ld r4,_CTR(r1)
> ld r0,_LINK(r1)
> mtctr r4
> diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
> index 72e783e..edb75a9 100644
> --- a/arch/powerpc/kernel/exceptions-64s.S
> +++ b/arch/powerpc/kernel/exceptions-64s.S
> @@ -637,7 +637,7 @@ masked_##_H##interrupt: \
> rldicl r10,r10,48,1; /* clear MSR_EE */ \
> rotldi r10,r10,16; \
> mtspr SPRN_##_H##SRR1,r10; \
> -2: mtcrf 0x80,r9; \
> +2: mtcrf 0x90,r9; \
> ld r9,PACA_EXGEN+EX_R9(r13); \
> ld r10,PACA_EXGEN+EX_R10(r13); \
> ld r11,PACA_EXGEN+EX_R11(r13); \
> diff --git a/arch/powerpc/kernel/head_64.S b/arch/powerpc/kernel/head_64.S
> index d48125d..02e49b3 100644
> --- a/arch/powerpc/kernel/head_64.S
> +++ b/arch/powerpc/kernel/head_64.S
> @@ -347,6 +347,14 @@ __mmu_off:
> *
> */
> __start_initialization_multiplatform:
> +
> + /*
> + * Before we do anything, lets clear CR5 field,
> + * so that we will have a clean start at entry
> + */
> + li r11,0
> + mtcrf 0x04,r11
> +
> /* Make sure we are running in 64 bits mode */
> bl enable_64b_mode
>
Regards,
Gabriel
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [RFC PATCH 1/2]powerpc: foundation code to handle CR5 for local_t
2014-12-01 21:35 ` Gabriel Paubert
@ 2014-12-03 14:59 ` Madhavan Srinivasan
2014-12-03 17:07 ` Gabriel Paubert
0 siblings, 1 reply; 29+ messages in thread
From: Madhavan Srinivasan @ 2014-12-03 14:59 UTC (permalink / raw)
To: Gabriel Paubert; +Cc: rusty, paulus, anton, linuxppc-dev
On Tuesday 02 December 2014 03:05 AM, Gabriel Paubert wrote:
> On Thu, Nov 27, 2014 at 05:48:40PM +0530, Madhavan Srinivasan wrote:
>> This patch creates the infrastructure to handle the CR-based
>> local_* atomic operations. Local atomic operations are fast
>> and highly reentrant per-CPU counters, used for percpu
>> variable updates. They only guarantee variable modification
>> atomicity wrt the CPU which owns the data, and they need to be
>> executed in a preemption-safe way.
>>
>> Here is the design of this patch. Since local_* operations
>> only need to be atomic with respect to interrupts (IIUC), the
>> patch uses one of the Condition Register (CR) fields as a flag
>> variable. When entering a local_* operation, a specific bit in
>> the CR5 field is set, and on exit the bit is cleared. The CR
>> bit check is done in the interrupt return path. If the CR5[EQ]
>> bit is set and we return to kernel, we rewind the NIP to the
>> start of the local_* operation.
>>
>> The reason for this approach is that currently the
>> l[w/d]arx/st[w/d]cx. instruction pair is used for local_*
>> operations; these are heavy on cycle count and do not support
>> a local variant. To see whether the new implementation helps, I
>> used a modified version of Rusty's benchmark code on local_t.
>>
>> https://lkml.org/lkml/2008/12/16/450
>>
>> Modifications:
>> - increased the working set size from 1MB to 8MB,
>> - removed cpu_local_inc test.
>>
>> Test ran
>> - on POWER8 1S Scale out System 2.0GHz
>> - on OPAL v3 with v3.18-rc4 patch kernel as Host
>>
>> Here are the values with the patch.
>>
>> Time in ns per iteration
>>
>>               inc   add   read   add_return
>> atomic_long    67    67     18           69
>> irqsave/rest   39    39     23           39
>> trivalue       39    39     29           49
>> local_t        26    26     24           26
>>
>> Since CR5 is used as a flag, I have added a CFLAGS option to
>> keep the compiler from using CR5 in the kernel compilation, and
>> CR5 is zeroed at kernel entry.
>>
>> Tested the patch in a
>> - pSeries LPAR,
>> - Host with patched/unmodified guest kernel
>>
>> To check whether userspace sees any CR5 corruption, I ran a
>> simple test which does:
>> - set CR5 field,
>> - while(1)
>> - sleep or gettimeofday
>> - chk bit set
>>
>> Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
>> ---
>> - I really appreciate feedback on the patchset.
>> - Kindly comment if I should try any other benchmark or
>> workload to check the numbers.
>> - Also, kindly recommend any known stress test for the CR.
>>
>> Makefile | 6 ++
>> arch/powerpc/include/asm/exception-64s.h | 21 +++++-
>> arch/powerpc/kernel/entry_64.S | 106 ++++++++++++++++++++++++++++++-
>> arch/powerpc/kernel/exceptions-64s.S | 2 +-
>> arch/powerpc/kernel/head_64.S | 8 +++
>> 5 files changed, 138 insertions(+), 5 deletions(-)
>>
>> diff --git a/Makefile b/Makefile
>> index 00d618b..2e271ad 100644
>> --- a/Makefile
>> +++ b/Makefile
>> @@ -706,6 +706,12 @@ endif
>>
>> KBUILD_CFLAGS += $(call cc-option, -fno-var-tracking-assignments)
>>
>> +ifdef CONFIG_PPC64
>> +# We need this flag to force compiler not to use CR5, since
>> +# local_t type code is based on this.
>> +KBUILD_CFLAGS += -ffixed-cr5
>> +endif
>> +
>> ifdef CONFIG_DEBUG_INFO
>> ifdef CONFIG_DEBUG_INFO_SPLIT
>> KBUILD_CFLAGS += $(call cc-option, -gsplit-dwarf, -g)
>> diff --git a/arch/powerpc/include/asm/exception-64s.h b/arch/powerpc/include/asm/exception-64s.h
>> index 77f52b2..c42919a 100644
>> --- a/arch/powerpc/include/asm/exception-64s.h
>> +++ b/arch/powerpc/include/asm/exception-64s.h
>> @@ -306,7 +306,26 @@ do_kvm_##n: \
>> std r10,0(r1); /* make stack chain pointer */ \
>> std r0,GPR0(r1); /* save r0 in stackframe */ \
>> std r10,GPR1(r1); /* save r1 in stackframe */ \
>> - beq 4f; /* if from kernel mode */ \
>> +BEGIN_FTR_SECTION; \
>> + lis r9,4096; /* Create a mask with HV and PR */ \
>> + rldicr r9,r9,32,31; /* bits, AND with the MSR */ \
>> + mr r10,r9; /* to check for Hyp state */ \
>> + ori r9,r9,16384; \
>> + and r9,r12,r9; \
>> + cmpd cr3,r10,r9; \
>> + beq cr3,66f; /* Jump if we come from Hyp mode*/ \
>> + mtcrf 0x04,r10; /* Clear CR5 if coming from usr */ \
>
> I think you can do better than this; powerpc has a fantastic set
> of rotate and mask instructions. If I understand your code
> correctly, you can replace it with the following:
>
> rldicl r10,r12,4,63 /* Extract HV bit to LSB of r10*/
> rlwinm r9,r12,19,0x02 /* Extract PR bit to 2nd to last bit of r9 */
> or r9,r9,r10
> cmplwi cr3,r9,1 /* Check for HV=1 and PR=0 */
> beq cr3,66f
> mtcrf 0x04,r10 /* Bits going to cr5 bits are 0 in r10 */
>
From the previous comment, at this point CR0 already has MSR[PR], and
only HV needs to be checked. So I guess there is still room for
optimization. But it is good to learn this sequence.
> and obviously similarly for the other instances of this code sequence.
>
> That said, mtcrf is not necessarily the fastest way of setting cr5 if
> you only want to clear the EQ bit. For example, comparing the stack
> pointer with 0 should have the same effect.
>
Yes. Will try this out.
>> +FTR_SECTION_ELSE; \
>> + beq 4f; /* if kernel mode branch */ \
>> + li r10,0; /* Clear CR5 incase of coming */ \
>> + mtcrf 0x04,r10; /* from user. */ \
>> + nop; /* This part of code is for */ \
>> + nop; /* kernel with MSR[HV]=0, */ \
>> + nop; /* MSR[PR]=0, so just chk for */ \
>> + nop; /* MSR[PR] */ \
>> + nop; \
>> +ALT_FTR_SECTION_END_IFSET(CPU_FTR_HVMODE); \
>> +66: beq 4f; /* if from kernel mode */ \
>> ACCOUNT_CPU_USER_ENTRY(r9, r10); \
>> SAVE_PPR(area, r9, r10); \
>> 4: EXCEPTION_PROLOG_COMMON_2(area) \
>> diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S
>> index 0905c8d..e42bb99 100644
>> --- a/arch/powerpc/kernel/entry_64.S
>> +++ b/arch/powerpc/kernel/entry_64.S
>> @@ -68,7 +68,26 @@ system_call_common:
>> 2: std r2,GPR2(r1)
>> std r3,GPR3(r1)
>> mfcr r2
>> - std r4,GPR4(r1)
>> +BEGIN_FTR_SECTION
>> + lis r10,4096
>> + rldicr r10,r10,32,31
>> + mr r11,r10
>> + ori r10,r10,16384
>> + and r10,r12,r10
>> + cmpd r11,r10
>> + beq 67f
>> + mtcrf 0x04,r11
>> +FTR_SECTION_ELSE
>> + beq 67f
>> + li r11,0
>> + mtcrf 0x04,r11
>> + nop
>> + nop
>> + nop
>> + nop
>> + nop
>> +ALT_FTR_SECTION_END_IFSET(CPU_FTR_HVMODE)
>> +67: std r4,GPR4(r1)
>> std r5,GPR5(r1)
>> std r6,GPR6(r1)
>> std r7,GPR7(r1)
>> @@ -224,8 +243,26 @@ syscall_exit:
>> BEGIN_FTR_SECTION
>> stdcx. r0,0,r1 /* to clear the reservation */
>> END_FTR_SECTION_IFCLR(CPU_FTR_STCX_CHECKS_ADDRESS)
>> +BEGIN_FTR_SECTION
>> + lis r4,4096
>> + rldicr r4,r4,32,31
>> + mr r6,r4
>> + ori r4,r4,16384
>> + and r4,r8,r4
>> + cmpd cr3,r6,r4
>> + beq cr3,65f
>> + mtcr r5
>> +FTR_SECTION_ELSE
>> andi. r6,r8,MSR_PR
>> - ld r4,_LINK(r1)
>> + beq 65f
>> + mtcr r5
>> + nop
>> + nop
>> + nop
>> + nop
>> + nop
>> +ALT_FTR_SECTION_END_IFSET(CPU_FTR_HVMODE)
>> +65: ld r4,_LINK(r1)
>>
>> beq- 1f
>> ACCOUNT_CPU_USER_EXIT(r11, r12)
>> @@ -234,7 +271,11 @@ END_FTR_SECTION_IFCLR(CPU_FTR_STCX_CHECKS_ADDRESS)
>> 1: ld r2,GPR2(r1)
>> ld r1,GPR1(r1)
>> mtlr r4
>> +#ifdef CONFIG_PPC64
>> + mtcrf 0xFB,r5
>> +#else
>> mtcr r5
>> +#endif
>> mtspr SPRN_SRR0,r7
>> mtspr SPRN_SRR1,r8
>> RFI
>> @@ -804,7 +845,66 @@ ALT_FTR_SECTION_END_IFCLR(CPU_FTR_STCX_CHECKS_ADDRESS)
>> */
>> .globl fast_exception_return
>> fast_exception_return:
>> - ld r3,_MSR(r1)
>> +
>> + /*
>> + * Now that we are about to exit from interrupt, lets check for
>> + * cr5 eq bit. If it is set, then we may be in the middle of
>> + * local_t update. In this case, we should rewind the NIP
>> + * accordingly.
>> + */
>> + mfcr r3
>> + andi. r4,r3,0x200
>> + beq 63f
>
> I believe that someone has already mentioned that this is a very
> convoluted way of testing a CR bit. This can be optimized to bne cr5,63f.
>
This. Will change it.
Regards
Maddy
>> +
>> + /*
>> + * Now that the bit is set, lets check for return to User
>> + */
>> + ld r4,_MSR(r1)
>> +BEGIN_FTR_SECTION
>> + li r3,4096
>> + rldicr r3,r3,32,31
>> + mr r5,r3
>> + ori r3,r3,16384
>> + and r3,r4,r3
>> + cmpd r5,r3
>> + bne 63f
>> +FTR_SECTION_ELSE
>> + andi. r3,r4,MSR_PR
>> + bne 63f
>> + nop
>> + nop
>> + nop
>> + nop
>> + nop
>> +ALT_FTR_SECTION_END_IFSET(CPU_FTR_HVMODE)
>> +
>> + /*
>> + * Looks like we are returning to Kernel, so
>> + * lets get the NIP and search the ex_table.
>> + * Change the NIP based on the return value
>> + */
>> +lookup_ex_table:
>> + ld r3,_NIP(r1)
>> + bl search_exception_tables
>> + cmpli 0,1,r3,0
>> + bne 62f
>> +
>> + /*
>> + * This is a panic case. Reason is that, we
>> + * have the CR5 bit set, but we are not in
>> + * local_* code and we are returning to Kernel.
>> + */
>> + ld r3,_NIP(r1)
>> + mfcr r4
>> + EMIT_BUG_ENTRY lookup_ex_table, __FILE__,__LINE__,BUGFLAG_WARNING
>> +
>> + /*
>> + * Now save the return fixup address as NIP
>> + */
>> +62: ld r4,8(r3)
>> + std r4,_NIP(r1)
>> + crclr 22
>> +63: ld r3,_MSR(r1)
>> ld r4,_CTR(r1)
>> ld r0,_LINK(r1)
>> mtctr r4
>> diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
>> index 72e783e..edb75a9 100644
>> --- a/arch/powerpc/kernel/exceptions-64s.S
>> +++ b/arch/powerpc/kernel/exceptions-64s.S
>> @@ -637,7 +637,7 @@ masked_##_H##interrupt: \
>> rldicl r10,r10,48,1; /* clear MSR_EE */ \
>> rotldi r10,r10,16; \
>> mtspr SPRN_##_H##SRR1,r10; \
>> -2: mtcrf 0x80,r9; \
>> +2: mtcrf 0x90,r9; \
>> ld r9,PACA_EXGEN+EX_R9(r13); \
>> ld r10,PACA_EXGEN+EX_R10(r13); \
>> ld r11,PACA_EXGEN+EX_R11(r13); \
>> diff --git a/arch/powerpc/kernel/head_64.S b/arch/powerpc/kernel/head_64.S
>> index d48125d..02e49b3 100644
>> --- a/arch/powerpc/kernel/head_64.S
>> +++ b/arch/powerpc/kernel/head_64.S
>> @@ -347,6 +347,14 @@ __mmu_off:
>> *
>> */
>> __start_initialization_multiplatform:
>> +
>> + /*
>> + * Before we do anything, lets clear CR5 field,
>> + * so that we will have a clean start at entry
>> + */
>> + li r11,0
>> + mtcrf 0x04,r11
>> +
>> /* Make sure we are running in 64 bits mode */
>> bl enable_64b_mode
>>
>
>
> Regards,
> Gabriel
>
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [RFC PATCH 1/2]powerpc: foundation code to handle CR5 for local_t
2014-12-03 14:59 ` Madhavan Srinivasan
@ 2014-12-03 17:07 ` Gabriel Paubert
0 siblings, 0 replies; 29+ messages in thread
From: Gabriel Paubert @ 2014-12-03 17:07 UTC (permalink / raw)
To: Madhavan Srinivasan; +Cc: rusty, paulus, anton, linuxppc-dev
On Wed, Dec 03, 2014 at 08:29:37PM +0530, Madhavan Srinivasan wrote:
> On Tuesday 02 December 2014 03:05 AM, Gabriel Paubert wrote:
> > On Thu, Nov 27, 2014 at 05:48:40PM +0530, Madhavan Srinivasan wrote:
> >> diff --git a/arch/powerpc/include/asm/exception-64s.h b/arch/powerpc/include/asm/exception-64s.h
> >> index 77f52b2..c42919a 100644
> >> --- a/arch/powerpc/include/asm/exception-64s.h
> >> +++ b/arch/powerpc/include/asm/exception-64s.h
> >> @@ -306,7 +306,26 @@ do_kvm_##n: \
> >> std r10,0(r1); /* make stack chain pointer */ \
> >> std r0,GPR0(r1); /* save r0 in stackframe */ \
> >> std r10,GPR1(r1); /* save r1 in stackframe */ \
> >> - beq 4f; /* if from kernel mode */ \
> >> +BEGIN_FTR_SECTION; \
> >> + lis r9,4096; /* Create a mask with HV and PR */ \
> >> + rldicr r9,r9,32,31; /* bits, AND with the MSR */ \
> >> + mr r10,r9; /* to check for Hyp state */ \
> >> + ori r9,r9,16384; \
> >> + and r9,r12,r9; \
> >> + cmpd cr3,r10,r9; \
> >> + beq cr3,66f; /* Jump if we come from Hyp mode*/ \
> >> + mtcrf 0x04,r10; /* Clear CR5 if coming from usr */ \
> >
> > I think you can do better than this, powerpc has a fantastic set
> > of rotate and mask instructions. If I understand correctly your
> > code you can replace it with the following:
> >
> > rldicl r10,r12,4,63 /* Extract HV bit to LSB of r10*/
> > rlwinm r9,r12,19,0x02 /* Extract PR bit to 2nd to last bit of r9 */
> > or r9,r9,r10
> > cmplwi cr3,r9,1 /* Check for HV=1 and PR=0 */
> > beq cr3,66f
> > mtcrf 0x04,r10 /* Bits going to cr5 bits are 0 in r10 */
> >
>
> From previous comment, at this point, CR0 already has MSR[PR], and only
> HV needs to be checked. So I guess there is still room for optimization.
> But it is good to learn this sequence.
Actually the sequence I suggested can be shortened even further, since the or
is not necessary if you use an rlwimi instead.
rldicl r9,r12,5,62 /* r9 = 62 zeroes | HV | ? */
rlwimi r9,r12,18,0x01 /* r9 = 62 zeroes | HV | PR */
cmplwi cr3,r9,2 /* Check for HV=1 and PR=0 */
For 32-bit operands, rlwimi is a generic bit-field to bit-field move, but
GCC is (or was; maybe GCC 5 is better?) quite bad at recognizing opportunities
to use it.
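For example (illustrative registers and field positions only), copying the
4-bit field in bits 8:11 of r4 into bits 20:23 of r3, leaving the rest of
r3 untouched, is a single instruction:

	rlwimi	r3,r4,20,0x00000F00	/* rotate left 20, insert under mask */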
I'm not sure that it is better to have two separate branches, one for
testing PR and the other for testing HV. Through how many branches can
the hardware speculate?
Unless I'm mistaken, this code is likely not to be in the cache, so I would
optimize it first for minimal code size.
Regards,
Gabriel
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [RFC PATCH 1/2]powerpc: foundation code to handle CR5 for local_t
2014-11-27 12:18 ` [RFC PATCH 1/2]powerpc: foundation code to handle CR5 for local_t Madhavan Srinivasan
` (2 preceding siblings ...)
2014-12-01 21:35 ` Gabriel Paubert
@ 2014-12-02 2:04 ` Scott Wood
2014-12-03 14:49 ` Madhavan Srinivasan
3 siblings, 1 reply; 29+ messages in thread
From: Scott Wood @ 2014-12-02 2:04 UTC (permalink / raw)
To: Madhavan Srinivasan; +Cc: rusty, paulus, anton, linuxppc-dev
On Thu, 2014-11-27 at 17:48 +0530, Madhavan Srinivasan wrote:
> - I really appreciate feedback on the patchset.
> - Kindly comment if I should try with any other benchmark or
> workload to check the numbers.
> - Also, kindly recommand any know stress test for CR
>
> Makefile | 6 ++
> arch/powerpc/include/asm/exception-64s.h | 21 +++++-
> arch/powerpc/kernel/entry_64.S | 106 ++++++++++++++++++++++++++++++-
> arch/powerpc/kernel/exceptions-64s.S | 2 +-
> arch/powerpc/kernel/head_64.S | 8 +++
> 5 files changed, 138 insertions(+), 5 deletions(-)
Patch 2/2 enables this for all PPC64, not just book3s -- so please don't
forget about the book3e exception paths (also MSR[GS] for KVM; but aren't
most, if not all, of the places where you check for HV mode reached after
KVM would already have taken control? Or am I missing something about how
book3s KVM works?).
Or, if you don't want to do that, change patch 2/2 to be book3s only and
ifdef-protect the changes to common exception code.
> @@ -224,8 +243,26 @@ syscall_exit:
> BEGIN_FTR_SECTION
> stdcx. r0,0,r1 /* to clear the reservation */
> END_FTR_SECTION_IFCLR(CPU_FTR_STCX_CHECKS_ADDRESS)
> +BEGIN_FTR_SECTION
> + lis r4,4096
> + rldicr r4,r4,32,31
> + mr r6,r4
> + ori r4,r4,16384
> + and r4,r8,r4
> + cmpd cr3,r6,r4
> + beq cr3,65f
> + mtcr r5
> +FTR_SECTION_ELSE
> andi. r6,r8,MSR_PR
> - ld r4,_LINK(r1)
> + beq 65f
> + mtcr r5
> + nop
> + nop
> + nop
> + nop
> + nop
> +ALT_FTR_SECTION_END_IFSET(CPU_FTR_HVMODE)
> +65: ld r4,_LINK(r1)
>
> beq- 1f
> ACCOUNT_CPU_USER_EXIT(r11, r12)
> @@ -234,7 +271,11 @@ END_FTR_SECTION_IFCLR(CPU_FTR_STCX_CHECKS_ADDRESS)
> 1: ld r2,GPR2(r1)
> ld r1,GPR1(r1)
> mtlr r4
> +#ifdef CONFIG_PPC64
> + mtcrf 0xFB,r5
> +#else
> mtcr r5
> +#endif
mtcrf with more than one CRn being updated is expensive on Freescale
chips (and this isn't a book3s-only code path). Why do you need to do
it twice? I don't see where either r5 or cr5 are messed with between
the two places...
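To illustrate what I mean (rough sketch; the cost claim is per my comment
above about Freescale cores):

	mtcrf	0x80,r5		/* single field (CR0): the fast form */
	mtcrf	0xFB,r5		/* seven fields: the expensive multi-field form */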
-Scott
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [RFC PATCH 1/2]powerpc: foundation code to handle CR5 for local_t
2014-12-02 2:04 ` Scott Wood
@ 2014-12-03 14:49 ` Madhavan Srinivasan
0 siblings, 0 replies; 29+ messages in thread
From: Madhavan Srinivasan @ 2014-12-03 14:49 UTC (permalink / raw)
To: Scott Wood; +Cc: rusty, paulus, anton, linuxppc-dev
On Tuesday 02 December 2014 07:34 AM, Scott Wood wrote:
> On Thu, 2014-11-27 at 17:48 +0530, Madhavan Srinivasan wrote:
>> - I really appreciate feedback on the patchset.
>> - Kindly comment if I should try with any other benchmark or
>> workload to check the numbers.
>> - Also, kindly recommand any know stress test for CR
>>
>> Makefile | 6 ++
>> arch/powerpc/include/asm/exception-64s.h | 21 +++++-
>> arch/powerpc/kernel/entry_64.S | 106 ++++++++++++++++++++++++++++++-
>> arch/powerpc/kernel/exceptions-64s.S | 2 +-
>> arch/powerpc/kernel/head_64.S | 8 +++
>> 5 files changed, 138 insertions(+), 5 deletions(-)
>
> Patch 2/2 enables this for all PPC64, not just book3s -- so please don't
> forget about the book3e exception paths (also MSR[GS] for KVM, but
> aren't most if not all the places you're checking for HV mode after KVM
> would have taken control? Or am I missing something about how book3s
> KVM works?).
>
> Or, if you don't want to do that, change patch 2/2 to be book3s only and
> ifdef-protect the changes to common exception code.
>
Still learning the interrupt paths and the various configs; I was assuming
PPC64 meant server side only. My bad. Will change it to book3s and also
ifdef-protect the changes to the common exception code.
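Something along these lines (sketch; exact config symbol to be confirmed):

#ifdef CONFIG_PPC_BOOK3S_64
	bne	cr5,63f		/* CR5 local_t check, book3s only */
#endif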
>> @@ -224,8 +243,26 @@ syscall_exit:
>> BEGIN_FTR_SECTION
>> stdcx. r0,0,r1 /* to clear the reservation */
>> END_FTR_SECTION_IFCLR(CPU_FTR_STCX_CHECKS_ADDRESS)
>> +BEGIN_FTR_SECTION
>> + lis r4,4096
>> + rldicr r4,r4,32,31
>> + mr r6,r4
>> + ori r4,r4,16384
>> + and r4,r8,r4
>> + cmpd cr3,r6,r4
>> + beq cr3,65f
>> + mtcr r5
>> +FTR_SECTION_ELSE
>> andi. r6,r8,MSR_PR
>> - ld r4,_LINK(r1)
>> + beq 65f
>> + mtcr r5
>> + nop
>> + nop
>> + nop
>> + nop
>> + nop
>> +ALT_FTR_SECTION_END_IFSET(CPU_FTR_HVMODE)
>> +65: ld r4,_LINK(r1)
>>
>> beq- 1f
>> ACCOUNT_CPU_USER_EXIT(r11, r12)
>> @@ -234,7 +271,11 @@ END_FTR_SECTION_IFCLR(CPU_FTR_STCX_CHECKS_ADDRESS)
>> 1: ld r2,GPR2(r1)
>> ld r1,GPR1(r1)
>> mtlr r4
>> +#ifdef CONFIG_PPC64
>> + mtcrf 0xFB,r5
>> +#else
>> mtcr r5
>> +#endif
>
> mtcrf with more than one CRn being updated is expensive on Freescale
> chips (and this isn't a book3s-only code path). Why do you need to do
> it twice? I don't see where either r5 or cr5 are messed with between
> the two places...
>
In the case of returning to userspace, we will be updating it twice, which
can be avoided. Will redo this.
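Roughly what I have in mind (untested sketch; whether CR5 must still be
excluded with the 0xFB mask on the kernel-return path needs more thought):
drop the early mtcr and restore CR once at the final exit:

1:	ld	r2,GPR2(r1)
	ld	r1,GPR1(r1)
	mtlr	r4
	mtcr	r5		/* single full CR restore instead of two updates */
	mtspr	SPRN_SRR0,r7
	mtspr	SPRN_SRR1,r8
	RFI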
Regards
Maddy
> -Scott
>
>
^ permalink raw reply [flat|nested] 29+ messages in thread