linuxppc-dev.lists.ozlabs.org archive mirror
 help / color / mirror / Atom feed
From: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
To: Gabriel Paubert <paubert@iram.es>
Cc: rusty@rustcorp.com.au, paulus@samba.org, anton@samba.org,
	linuxppc-dev@lists.ozlabs.org
Subject: Re: [RFC PATCH 1/2]powerpc: foundation code to handle CR5 for local_t
Date: Wed, 03 Dec 2014 20:29:37 +0530	[thread overview]
Message-ID: <547F2559.9060404@linux.vnet.ibm.com> (raw)
In-Reply-To: <20141201213527.GA2278@visitor2.iram.es>

On Tuesday 02 December 2014 03:05 AM, Gabriel Paubert wrote:
> On Thu, Nov 27, 2014 at 05:48:40PM +0530, Madhavan Srinivasan wrote:
>> This patch create the infrastructure to handle the CR based 
>> local_* atomic operations. Local atomic operations are fast 
>> and highly reentrant per CPU counters.  Used for percpu 
>> variable updates. Local atomic operations only guarantee 
>> variable modification atomicity wrt the CPU which owns the
>> data and these needs to be executed in a preemption safe way. 
>>
>> Here is the design of this patch. Since local_* operations 
>> are only need to be atomic to interrupts (IIUC), patch uses 
>> one of the Condition Register (CR) fields as a flag variable. When 
>> entering the local_*, specific bit in the CR5 field is set
>> and on exit, bit is cleared. CR bit checking is done in the
>> interrupt return path. If CR5[EQ] bit set and if we return 
>> to kernel, we reset to start of local_* operation.
>>
>> Reason for this approach is that, currently l[w/d]arx/st[w/d]cx.
>> instruction pair is used for local_* operations, which are heavy 
>> on cycle count and they dont support a local variant. So to 
>> see whether the new implementation helps, used a modified 
>> version of Rusty's benchmark code on local_t.   
>>
>> https://lkml.org/lkml/2008/12/16/450
>>
>> Modifications: 
>>  - increated the working set size from 1MB to 8MB,
>>  - removed cpu_local_inc test.
>>
>> Test ran 
>>     - on POWER8 1S Scale out System 2.0GHz
>>     - on OPAL v3 with v3.18-rc4 patch kernel as Host
>>
>> Here are the values with the patch.
>>
>> Time in ns per iteration
>>
>> 		inc	add	read	add_return
>> atomic_long	67	67	18	69
>> irqsave/rest	39	39	23	39
>> trivalue	39	39	29	49
>> local_t		26	26	24	26
>>
>> Since CR5 is used as a flag, have added CFLAGS to avoid CR5 
>> for the kernel compilation and CR5 is zeroed at the kernel
>> entry.  
>>
>> Tested the patch in a 
>>  - pSeries LPAR, 
>>  - Host with patched/unmodified guest kernel 
>>
>> To check whether userspace see any CR5 corruption, ran a simple
>> test which does,
>>  - set CR5 field,
>>  - while(1)
>>    - sleep or gettimeofday
>>    - chk bit set
>>
>> Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
>> ---
>> - I really appreciate feedback on the patchset.
>> - Kindly comment if I should try with any other benchmark or
>>     workload to check the numbers.
>> - Also, kindly recommand any know stress test for CR
>>
>>  Makefile                                 |   6 ++
>>  arch/powerpc/include/asm/exception-64s.h |  21 +++++-
>>  arch/powerpc/kernel/entry_64.S           | 106 ++++++++++++++++++++++++++++++-
>>  arch/powerpc/kernel/exceptions-64s.S     |   2 +-
>>  arch/powerpc/kernel/head_64.S            |   8 +++
>>  5 files changed, 138 insertions(+), 5 deletions(-)
>>
>> diff --git a/Makefile b/Makefile
>> index 00d618b..2e271ad 100644
>> --- a/Makefile
>> +++ b/Makefile
>> @@ -706,6 +706,12 @@ endif
>>  
>>  KBUILD_CFLAGS   += $(call cc-option, -fno-var-tracking-assignments)
>>  
>> +ifdef	CONFIG_PPC64
>> +# We need this flag to force compiler not to use CR5, since
>> +# local_t type code is based on this.
>> +KBUILD_CFLAGS   += -ffixed-cr5
>> +endif
>> +
>>  ifdef CONFIG_DEBUG_INFO
>>  ifdef CONFIG_DEBUG_INFO_SPLIT
>>  KBUILD_CFLAGS   += $(call cc-option, -gsplit-dwarf, -g)
>> diff --git a/arch/powerpc/include/asm/exception-64s.h b/arch/powerpc/include/asm/exception-64s.h
>> index 77f52b2..c42919a 100644
>> --- a/arch/powerpc/include/asm/exception-64s.h
>> +++ b/arch/powerpc/include/asm/exception-64s.h
>> @@ -306,7 +306,26 @@ do_kvm_##n:								\
>>  	std	r10,0(r1);		/* make stack chain pointer	*/ \
>>  	std	r0,GPR0(r1);		/* save r0 in stackframe	*/ \
>>  	std	r10,GPR1(r1);		/* save r1 in stackframe	*/ \
>> -	beq	4f;			/* if from kernel mode		*/ \
>> +BEGIN_FTR_SECTION;							   \
>> +	lis	r9,4096;		/* Create a mask with HV and PR */ \
>> +	rldicr	r9,r9,32,31;		/* bits, AND with the MSR       */ \
>> +	mr	r10,r9;			/* to check for Hyp state       */ \
>> +	ori	r9,r9,16384;						   \
>> +	and	r9,r12,r9;						   \
>> +	cmpd	cr3,r10,r9;							   \
>> +	beq	cr3,66f;		/* Jump if we come from Hyp mode*/ \
>> +	mtcrf	0x04,r10;		/* Clear CR5 if coming from usr */ \
> 
> I think you can do better than this, powerpc has a fantastic set
> of rotate and mask instructions. If I understand correctly your
> code you can replace it with the following:
> 
> 	rldicl	r10,r12,4,63       /* Extract HV bit to LSB of r10*/
> 	rlwinm  r9,r12,19,0x02     /* Extract PR bit to 2nd to last bit of r9 */
> 	or	r9,r9,10
> 	cmplwi  cr3,r9,1           /* Check for HV=1 and PR=0 */		
> 	beq     cr3,66f
> 	mtcrf   0x04,r10           /* Bits going to cr5 bits are 0 in r10 */
> 

>From previous comment, at this point, CR0 already has MSR[PR], and only
HV need to be checked. So I guess there still room for optimization.
But good to learn this seq.

> and obviously similarly for the other instances of this code sequence.
> 
> This said, mtcrf is not necessarily the fastest way of setting cr5 if
> you only want to clear the EQ bit. For example comparing the stack pointer
> with 0 should have the same effect.
> 

Yes. Will try this out.

>> +FTR_SECTION_ELSE;							   \
>> +	beq	4f;			/* if kernel mode branch        */ \
>> +	li	r10,0;			/* Clear CR5 incase of coming   */ \
>> +	mtcrf	0x04,r10;		/* from user.                   */ \
>> +	nop;				/* This part of code is for     */ \
>> +	nop;				/* kernel with MSR[HV]=0,       */ \
>> +	nop;				/* MSR[PR]=0, so just chk for   */ \
>> +	nop;				/* MSR[PR]                      */ \
>> +	nop;								   \
>> +ALT_FTR_SECTION_END_IFSET(CPU_FTR_HVMODE);				   \
>> +66:	beq	4f;			/* if from kernel mode		*/ \
>>  	ACCOUNT_CPU_USER_ENTRY(r9, r10);				   \
>>  	SAVE_PPR(area, r9, r10);					   \
>>  4:	EXCEPTION_PROLOG_COMMON_2(area)					   \
>> diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S
>> index 0905c8d..e42bb99 100644
>> --- a/arch/powerpc/kernel/entry_64.S
>> +++ b/arch/powerpc/kernel/entry_64.S
>> @@ -68,7 +68,26 @@ system_call_common:
>>  2:	std	r2,GPR2(r1)
>>  	std	r3,GPR3(r1)
>>  	mfcr	r2
>> -	std	r4,GPR4(r1)
>> +BEGIN_FTR_SECTION
>> +	lis	r10,4096
>> +	rldicr	r10,r10,32,31
>> +	mr	r11,r10
>> +	ori	r10,r10,16384
>> +	and	r10,r12,r10
>> +	cmpd	r11,r10
>> +	beq	67f
>> +	mtcrf	0x04,r11
>> +FTR_SECTION_ELSE
>> +	beq	67f
>> +	li	r11,0
>> +	mtcrf	0x04,r11
>> +	nop
>> +	nop
>> +	nop
>> +	nop
>> +	nop
>> +ALT_FTR_SECTION_END_IFSET(CPU_FTR_HVMODE)
>> +67:	std	r4,GPR4(r1)
>>  	std	r5,GPR5(r1)
>>  	std	r6,GPR6(r1)
>>  	std	r7,GPR7(r1)
>> @@ -224,8 +243,26 @@ syscall_exit:
>>  BEGIN_FTR_SECTION
>>  	stdcx.	r0,0,r1			/* to clear the reservation */
>>  END_FTR_SECTION_IFCLR(CPU_FTR_STCX_CHECKS_ADDRESS)
>> +BEGIN_FTR_SECTION
>> +	lis	r4,4096
>> +	rldicr	r4,r4,32,31
>> +	mr	r6,r4
>> +	ori	r4,r4,16384
>> +	and	r4,r8,r4
>> +	cmpd	cr3,r6,r4
>> +	beq	cr3,65f
>> +	mtcr	r5
>> +FTR_SECTION_ELSE
>>  	andi.	r6,r8,MSR_PR
>> -	ld	r4,_LINK(r1)
>> +	beq	65f
>> +	mtcr	r5
>> +	nop
>> +	nop
>> +	nop
>> +	nop
>> +	nop
>> +ALT_FTR_SECTION_END_IFSET(CPU_FTR_HVMODE)
>> +65:	ld	r4,_LINK(r1)
>>  
>>  	beq-	1f
>>  	ACCOUNT_CPU_USER_EXIT(r11, r12)
>> @@ -234,7 +271,11 @@ END_FTR_SECTION_IFCLR(CPU_FTR_STCX_CHECKS_ADDRESS)
>>  1:	ld	r2,GPR2(r1)
>>  	ld	r1,GPR1(r1)
>>  	mtlr	r4
>> +#ifdef	CONFIG_PPC64
>> +	mtcrf	0xFB,r5
>> +#else
>>  	mtcr	r5
>> +#endif
>>  	mtspr	SPRN_SRR0,r7
>>  	mtspr	SPRN_SRR1,r8
>>  	RFI
>> @@ -804,7 +845,66 @@ ALT_FTR_SECTION_END_IFCLR(CPU_FTR_STCX_CHECKS_ADDRESS)
>>  	 */
>>  	.globl	fast_exception_return
>>  fast_exception_return:
>> -	ld	r3,_MSR(r1)
>> +
>> +	/*
>> +	 * Now that we are about to exit from interrupt, lets check for
>> +	 * cr5 eq bit. If it is set, then we may be in the middle of
>> +	 * local_t update. In this case, we should rewind the NIP
>> +	 * accordingly.
>> +	 */
>> +	mfcr	r3
>> +	andi.	r4,r3,0x200
>> +	beq	63f
> 
> I believe that someonw has already mentioned that this is a very
> convoluted way of testing a cr bit. This can be optimized to bne cr5,63f
> 

This. Will change it.

Regards
Maddy

>> +
>> +	/*
>> +	 * Now that the bit is set, lets check for return to User
>> +	 */
>> +	ld	r4,_MSR(r1)
>> +BEGIN_FTR_SECTION
>> +	li	r3,4096
>> +	rldicr	r3,r3,32,31
>> +	mr	r5,r3
>> +	ori	r3,r3,16384
>> +	and	r3,r4,r3
>> +	cmpd	r5,r3
>> +	bne	63f
>> +FTR_SECTION_ELSE
>> +	andi.	r3,r4,MSR_PR
>> +	bne	63f
>> +	nop
>> +	nop
>> +	nop
>> +	nop
>> +	nop
>> +ALT_FTR_SECTION_END_IFSET(CPU_FTR_HVMODE)
>> +
>> +	/*
>> +	 * Looks like we are returning to Kernel, so
>> +	 * lets get the NIP and search the ex_table.
>> +	 * Change the NIP based on the return value
>> +	 */
>> +lookup_ex_table:
>> +	ld	r3,_NIP(r1)
>> +	bl	search_exception_tables	
>> +	cmpli	0,1,r3,0
>> +	bne	62f
>> +
>> +	/*
>> +	 * This is a panic case. Reason is that, we
>> +	 * have the CR5 bit set, but we are not in
>> +	 * local_* code and we are returning to Kernel.
>> +	 */
>> +	ld	r3,_NIP(r1)
>> +	mfcr	r4
>> +	EMIT_BUG_ENTRY lookup_ex_table, __FILE__,__LINE__,BUGFLAG_WARNING
>> +
>> +	/*
>> +	 * Now save the return fixup address as NIP
>> +	 */
>> +62:	ld	r4,8(r3)
>> +	std	r4,_NIP(r1)
>> +	crclr	22
>> +63:	ld	r3,_MSR(r1)
>>  	ld	r4,_CTR(r1)
>>  	ld	r0,_LINK(r1)
>>  	mtctr	r4
>> diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
>> index 72e783e..edb75a9 100644
>> --- a/arch/powerpc/kernel/exceptions-64s.S
>> +++ b/arch/powerpc/kernel/exceptions-64s.S
>> @@ -637,7 +637,7 @@ masked_##_H##interrupt:					\
>>  	rldicl	r10,r10,48,1; /* clear MSR_EE */	\
>>  	rotldi	r10,r10,16;				\
>>  	mtspr	SPRN_##_H##SRR1,r10;			\
>> -2:	mtcrf	0x80,r9;				\
>> +2:	mtcrf	0x90,r9;				\
>>  	ld	r9,PACA_EXGEN+EX_R9(r13);		\
>>  	ld	r10,PACA_EXGEN+EX_R10(r13);		\
>>  	ld	r11,PACA_EXGEN+EX_R11(r13);		\
>> diff --git a/arch/powerpc/kernel/head_64.S b/arch/powerpc/kernel/head_64.S
>> index d48125d..02e49b3 100644
>> --- a/arch/powerpc/kernel/head_64.S
>> +++ b/arch/powerpc/kernel/head_64.S
>> @@ -347,6 +347,14 @@ __mmu_off:
>>   *
>>   */
>>  __start_initialization_multiplatform:
>> +
>> +	/*
>> +	 * Before we do anything, lets clear CR5 field,
>> +	 * so that we will have a clean start at entry
>> +	 */
>> +	li	r11,0
>> +	mtcrf	0x04,r11
>> +
>>  	/* Make sure we are running in 64 bits mode */
>>  	bl	enable_64b_mode
>>  
> 
> 
> 	Regards,
> 	Gabriel
> 

  reply	other threads:[~2014-12-03 14:59 UTC|newest]

Thread overview: 29+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-11-27 12:18 [RFC PATCH 0/2] powerpc: CR based local atomic operation implementation Madhavan Srinivasan
2014-11-27 12:18 ` [RFC PATCH 1/2]powerpc: foundation code to handle CR5 for local_t Madhavan Srinivasan
2014-11-27 16:56   ` Segher Boessenkool
2014-11-28  1:58     ` Benjamin Herrenschmidt
2014-11-28  3:00       ` Madhavan Srinivasan
2014-11-28 15:57       ` Segher Boessenkool
2014-11-28  2:57     ` Madhavan Srinivasan
2014-11-28 16:00       ` Segher Boessenkool
2014-11-28  0:56   ` Benjamin Herrenschmidt
2014-11-28  3:15     ` Madhavan Srinivasan
2014-11-28  3:21       ` Benjamin Herrenschmidt
2014-11-28 10:53         ` David Laight
2014-11-30  9:01           ` Benjamin Herrenschmidt
2014-12-01 21:35   ` Gabriel Paubert
2014-12-03 14:59     ` Madhavan Srinivasan [this message]
2014-12-03 17:07       ` Gabriel Paubert
2014-12-02  2:04   ` Scott Wood
2014-12-03 14:49     ` Madhavan Srinivasan
2014-11-27 12:18 ` [RFC PATCH 2/2]powerpc: rewrite local_* to use CR5 flag Madhavan Srinivasan
2014-12-01 18:01   ` Gabriel Paubert
2014-12-03 15:05     ` Madhavan Srinivasan
2014-11-27 14:05 ` [RFC PATCH 0/2] powerpc: CR based local atomic operation implementation David Laight
2014-11-28  8:27   ` Madhavan Srinivasan
2014-11-28 10:09     ` David Laight
2014-12-01 15:35       ` Madhavan Srinivasan
2014-12-01 15:53         ` David Laight
2014-12-18  4:18           ` Rusty Russell
2014-12-18  9:52             ` David Laight
2014-12-18 10:53               ` Rusty Russell

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=547F2559.9060404@linux.vnet.ibm.com \
    --to=maddy@linux.vnet.ibm.com \
    --cc=anton@samba.org \
    --cc=linuxppc-dev@lists.ozlabs.org \
    --cc=paubert@iram.es \
    --cc=paulus@samba.org \
    --cc=rusty@rustcorp.com.au \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).