All of lore.kernel.org
 help / color / mirror / Atom feed
From: Michael Ellerman <mpe@ellerman.id.au>
To: Christophe Leroy <christophe.leroy@c-s.fr>,
	Benjamin Herrenschmidt <benh@kernel.crashing.org>,
	Paul Mackerras <paulus@samba.org>
Cc: linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2 1/8] powerpc/32: Add VDSO version of getcpu
Date: Wed, 18 Sep 2019 15:51:54 +1000	[thread overview]
Message-ID: <87h85aw3r9.fsf@mpe.ellerman.id.au> (raw)
In-Reply-To: <27d699092118ee8d21741c08a6ff7e4c65effdf2.1566491310.git.christophe.leroy@c-s.fr>

Hi Christophe,

Sorry I'm late replying to this.

Christophe Leroy <christophe.leroy@c-s.fr> writes:
> Commit 18ad51dd342a ("powerpc: Add VDSO version of getcpu") added
> getcpu() for PPC64 only, by making use of a user readable general
> purpose SPR.
>
> PPC32 doesn't have any such SPR, a full system call can still be
> avoided by implementing a fast system call which reads the CPU id
> in the task struct and returns immediately without going back in
> virtual mode.
>
> Before the patch, vdsotest reported:
> getcpu: syscall: 1572 nsec/call
> getcpu:    libc: 1787 nsec/call
> getcpu:    vdso: not tested
>
> Now, vdsotest reports:
> getcpu: syscall: 1582 nsec/call
> getcpu:    libc: 667 nsec/call
> getcpu:    vdso: 368 nsec/call
>
> For non SMP, just return CPU id 0 from the VDSO directly.
>
> PPC32 doesn't support CONFIG_NUMA so NUMA node is always 0.
>
> Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
>
> ---
> v2: fixed build error in getcpu.S
> ---
>  arch/powerpc/include/asm/vdso.h         |  2 ++
>  arch/powerpc/kernel/head_32.h           | 13 +++++++++++++
>  arch/powerpc/kernel/head_booke.h        | 11 +++++++++++
>  arch/powerpc/kernel/vdso32/Makefile     |  4 +---
>  arch/powerpc/kernel/vdso32/getcpu.S     |  7 +++++++
>  arch/powerpc/kernel/vdso32/vdso32.lds.S |  2 --
>  6 files changed, 34 insertions(+), 5 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/vdso.h b/arch/powerpc/include/asm/vdso.h
> index b5e1f8f8a05c..adb54782df5f 100644
> --- a/arch/powerpc/include/asm/vdso.h
> +++ b/arch/powerpc/include/asm/vdso.h
> @@ -16,6 +16,8 @@
>  /* Define if 64 bits VDSO has procedure descriptors */
>  #undef VDS64_HAS_DESCRIPTORS
>  
> +#define NR_MAGIC_FAST_VDSO_SYSCALL	0x789a

We are still in the middle of the years long process of removing the
"magic" syscall on 64-bit:

  https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/powerpc/kernel/exceptions-64s.S?commit=4d856f72c10ecb060868ed10ff1b1453943fc6c8#n1578
 

Can we not add another one on 32-bit?

Is it really such a fast path that it's worth putting a wart in the
syscall entry like that?

Is there some other method? On s390 they have a per-cpu VDSO page, that
would be a nice option. How we do that would be specific to a particular
MMU, and maybe not even possible with some MMUs. So maybe that's not
feasible.

If you do want to add a fastpath syscall then please just add it as a
regular syscall number, that way it's at least a bit less of a wart.
It's still not visible via tracing/ptrace etc. which is a pain but at
least the number is not "magical" too.

cheers


>  /* Offsets relative to thread->vdso_base */
> diff --git a/arch/powerpc/kernel/head_32.h b/arch/powerpc/kernel/head_32.h
> index 4a692553651f..a2e38b59785a 100644
> --- a/arch/powerpc/kernel/head_32.h
> +++ b/arch/powerpc/kernel/head_32.h
> @@ -3,6 +3,8 @@
>  #define __HEAD_32_H__
>  
>  #include <asm/ptrace.h>	/* for STACK_FRAME_REGS_MARKER */
> +#include <asm/vdso.h>
> +#include <asm/asm-offsets.h>
>  
>  /*
>   * MSR_KERNEL is > 0x8000 on 4xx/Book-E since it include MSR_CE.
> @@ -74,7 +76,13 @@
>  .endm
>  
>  .macro SYSCALL_ENTRY trapno
> +#ifdef CONFIG_SMP
> +	cmplwi	cr0, r0, NR_MAGIC_FAST_VDSO_SYSCALL
> +#endif
>  	mfspr	r12,SPRN_SPRG_THREAD
> +#ifdef CONFIG_SMP
> +	beq-	1f
> +#endif
>  	mfcr	r10
>  	lwz	r11,TASK_STACK-THREAD(r12)
>  	mflr	r9
> @@ -152,6 +160,11 @@
>  	mtspr	SPRN_SRR0,r11
>  	SYNC
>  	RFI				/* jump to handler, enable MMU */
> +#ifdef CONFIG_SMP
> +1:
> +	lwz	r5, TASK_CPU - THREAD(r12)
> +	RFI
> +#endif
>  .endm
>  
>  /*
> diff --git a/arch/powerpc/kernel/head_booke.h b/arch/powerpc/kernel/head_booke.h
> index 2ae635df9026..c534e87cac84 100644
> --- a/arch/powerpc/kernel/head_booke.h
> +++ b/arch/powerpc/kernel/head_booke.h
> @@ -3,6 +3,8 @@
>  #define __HEAD_BOOKE_H__
>  
>  #include <asm/ptrace.h>	/* for STACK_FRAME_REGS_MARKER */
> +#include <asm/vdso.h>
> +#include <asm/asm-offsets.h>
>  #include <asm/kvm_asm.h>
>  #include <asm/kvm_booke_hv_asm.h>
>  
> @@ -104,6 +106,10 @@ FTR_SECTION_ELSE
>  #ifdef CONFIG_KVM_BOOKE_HV
>  ALT_FTR_SECTION_END_IFSET(CPU_FTR_EMB_HV)
>  #endif
> +#ifdef CONFIG_SMP
> +	cmplwi	cr0, r0, NR_MAGIC_FAST_VDSO_SYSCALL
> +	beq-	1f
> +#endif
>  	BOOKE_CLEAR_BTB(r11)
>  	lwz	r11, TASK_STACK - THREAD(r10)
>  	rlwinm	r12,r12,0,4,2	/* Clear SO bit in CR */
> @@ -176,6 +182,11 @@ ALT_FTR_SECTION_END_IFSET(CPU_FTR_EMB_HV)
>  	mtspr	SPRN_SRR0,r11
>  	SYNC
>  	RFI				/* jump to handler, enable MMU */
> +#ifdef CONFIG_SMP
> +1:
> +	lwz	r5, TASK_CPU - THREAD(r10)
> +	RFI
> +#endif
>  .endm
>  
>  /* To handle the additional exception priority levels on 40x and Book-E
> diff --git a/arch/powerpc/kernel/vdso32/Makefile b/arch/powerpc/kernel/vdso32/Makefile
> index 06f54d947057..e147bbdc12cd 100644
> --- a/arch/powerpc/kernel/vdso32/Makefile
> +++ b/arch/powerpc/kernel/vdso32/Makefile
> @@ -2,9 +2,7 @@
>  
>  # List of files in the vdso, has to be asm only for now
>  
> -obj-vdso32-$(CONFIG_PPC64) = getcpu.o
> -obj-vdso32 = sigtramp.o gettimeofday.o datapage.o cacheflush.o note.o \
> -		$(obj-vdso32-y)
> +obj-vdso32 = sigtramp.o gettimeofday.o datapage.o cacheflush.o note.o getcpu.o
>  
>  # Build rules
>  
> diff --git a/arch/powerpc/kernel/vdso32/getcpu.S b/arch/powerpc/kernel/vdso32/getcpu.S
> index 63e914539e1a..bde226ad904d 100644
> --- a/arch/powerpc/kernel/vdso32/getcpu.S
> +++ b/arch/powerpc/kernel/vdso32/getcpu.S
> @@ -17,7 +17,14 @@
>   */
>  V_FUNCTION_BEGIN(__kernel_getcpu)
>    .cfi_startproc
> +#if defined(CONFIG_PPC64)
>  	mfspr	r5,SPRN_SPRG_VDSO_READ
> +#elif defined(CONFIG_SMP)
> +	li	r0, NR_MAGIC_FAST_VDSO_SYSCALL
> +	sc	/* returns cpuid in r5, clobbers cr0 and r10-r13 */
> +#else
> +	li	r5, 0
> +#endif
>  	cmpwi	cr0,r3,0
>  	cmpwi	cr1,r4,0
>  	clrlwi  r6,r5,16
> diff --git a/arch/powerpc/kernel/vdso32/vdso32.lds.S b/arch/powerpc/kernel/vdso32/vdso32.lds.S
> index 099a6db14e67..663880671e20 100644
> --- a/arch/powerpc/kernel/vdso32/vdso32.lds.S
> +++ b/arch/powerpc/kernel/vdso32/vdso32.lds.S
> @@ -152,9 +152,7 @@ VERSION
>  		__kernel_sync_dicache_p5;
>  		__kernel_sigtramp32;
>  		__kernel_sigtramp_rt32;
> -#ifdef CONFIG_PPC64
>  		__kernel_getcpu;
> -#endif
>  		__kernel_time;
>  
>  	local: *;
> -- 
> 2.13.3

WARNING: multiple messages have this Message-ID (diff)
From: Michael Ellerman <mpe@ellerman.id.au>
To: Christophe Leroy <christophe.leroy@c-s.fr>,
	Benjamin Herrenschmidt <benh@kernel.crashing.org>,
	Paul Mackerras <paulus@samba.org>
Cc: linux-kernel@vger.kernel.org, linuxppc-dev@lists.ozlabs.org
Subject: Re: [PATCH v2 1/8] powerpc/32: Add VDSO version of getcpu
Date: Wed, 18 Sep 2019 15:51:54 +1000	[thread overview]
Message-ID: <87h85aw3r9.fsf@mpe.ellerman.id.au> (raw)
In-Reply-To: <27d699092118ee8d21741c08a6ff7e4c65effdf2.1566491310.git.christophe.leroy@c-s.fr>

Hi Christophe,

Sorry I'm late replying to this.

Christophe Leroy <christophe.leroy@c-s.fr> writes:
> Commit 18ad51dd342a ("powerpc: Add VDSO version of getcpu") added
> getcpu() for PPC64 only, by making use of a user readable general
> purpose SPR.
>
> PPC32 doesn't have any such SPR, a full system call can still be
> avoided by implementing a fast system call which reads the CPU id
> in the task struct and returns immediately without going back in
> virtual mode.
>
> Before the patch, vdsotest reported:
> getcpu: syscall: 1572 nsec/call
> getcpu:    libc: 1787 nsec/call
> getcpu:    vdso: not tested
>
> Now, vdsotest reports:
> getcpu: syscall: 1582 nsec/call
> getcpu:    libc: 667 nsec/call
> getcpu:    vdso: 368 nsec/call
>
> For non SMP, just return CPU id 0 from the VDSO directly.
>
> PPC32 doesn't support CONFIG_NUMA so NUMA node is always 0.
>
> Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
>
> ---
> v2: fixed build error in getcpu.S
> ---
>  arch/powerpc/include/asm/vdso.h         |  2 ++
>  arch/powerpc/kernel/head_32.h           | 13 +++++++++++++
>  arch/powerpc/kernel/head_booke.h        | 11 +++++++++++
>  arch/powerpc/kernel/vdso32/Makefile     |  4 +---
>  arch/powerpc/kernel/vdso32/getcpu.S     |  7 +++++++
>  arch/powerpc/kernel/vdso32/vdso32.lds.S |  2 --
>  6 files changed, 34 insertions(+), 5 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/vdso.h b/arch/powerpc/include/asm/vdso.h
> index b5e1f8f8a05c..adb54782df5f 100644
> --- a/arch/powerpc/include/asm/vdso.h
> +++ b/arch/powerpc/include/asm/vdso.h
> @@ -16,6 +16,8 @@
>  /* Define if 64 bits VDSO has procedure descriptors */
>  #undef VDS64_HAS_DESCRIPTORS
>  
> +#define NR_MAGIC_FAST_VDSO_SYSCALL	0x789a

We are still in the middle of the years long process of removing the
"magic" syscall on 64-bit:

  https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/powerpc/kernel/exceptions-64s.S?commit=4d856f72c10ecb060868ed10ff1b1453943fc6c8#n1578
 

Can we not add another one on 32-bit?

Is it really such a fast path that it's worth putting a wart in the
syscall entry like that?

Is there some other method? On s390 they have a per-cpu VDSO page, that
would be a nice option. How we do that would be specific to a particular
MMU, and maybe not even possible with some MMUs. So maybe that's not
feasible.

If you do want to add a fastpath syscall then please just add it as a
regular syscall number, that way it's at least a bit less of a wart.
It's still not visible via tracing/ptrace etc. which is a pain but at
least the number is not "magical" too.

cheers


>  /* Offsets relative to thread->vdso_base */
> diff --git a/arch/powerpc/kernel/head_32.h b/arch/powerpc/kernel/head_32.h
> index 4a692553651f..a2e38b59785a 100644
> --- a/arch/powerpc/kernel/head_32.h
> +++ b/arch/powerpc/kernel/head_32.h
> @@ -3,6 +3,8 @@
>  #define __HEAD_32_H__
>  
>  #include <asm/ptrace.h>	/* for STACK_FRAME_REGS_MARKER */
> +#include <asm/vdso.h>
> +#include <asm/asm-offsets.h>
>  
>  /*
>   * MSR_KERNEL is > 0x8000 on 4xx/Book-E since it include MSR_CE.
> @@ -74,7 +76,13 @@
>  .endm
>  
>  .macro SYSCALL_ENTRY trapno
> +#ifdef CONFIG_SMP
> +	cmplwi	cr0, r0, NR_MAGIC_FAST_VDSO_SYSCALL
> +#endif
>  	mfspr	r12,SPRN_SPRG_THREAD
> +#ifdef CONFIG_SMP
> +	beq-	1f
> +#endif
>  	mfcr	r10
>  	lwz	r11,TASK_STACK-THREAD(r12)
>  	mflr	r9
> @@ -152,6 +160,11 @@
>  	mtspr	SPRN_SRR0,r11
>  	SYNC
>  	RFI				/* jump to handler, enable MMU */
> +#ifdef CONFIG_SMP
> +1:
> +	lwz	r5, TASK_CPU - THREAD(r12)
> +	RFI
> +#endif
>  .endm
>  
>  /*
> diff --git a/arch/powerpc/kernel/head_booke.h b/arch/powerpc/kernel/head_booke.h
> index 2ae635df9026..c534e87cac84 100644
> --- a/arch/powerpc/kernel/head_booke.h
> +++ b/arch/powerpc/kernel/head_booke.h
> @@ -3,6 +3,8 @@
>  #define __HEAD_BOOKE_H__
>  
>  #include <asm/ptrace.h>	/* for STACK_FRAME_REGS_MARKER */
> +#include <asm/vdso.h>
> +#include <asm/asm-offsets.h>
>  #include <asm/kvm_asm.h>
>  #include <asm/kvm_booke_hv_asm.h>
>  
> @@ -104,6 +106,10 @@ FTR_SECTION_ELSE
>  #ifdef CONFIG_KVM_BOOKE_HV
>  ALT_FTR_SECTION_END_IFSET(CPU_FTR_EMB_HV)
>  #endif
> +#ifdef CONFIG_SMP
> +	cmplwi	cr0, r0, NR_MAGIC_FAST_VDSO_SYSCALL
> +	beq-	1f
> +#endif
>  	BOOKE_CLEAR_BTB(r11)
>  	lwz	r11, TASK_STACK - THREAD(r10)
>  	rlwinm	r12,r12,0,4,2	/* Clear SO bit in CR */
> @@ -176,6 +182,11 @@ ALT_FTR_SECTION_END_IFSET(CPU_FTR_EMB_HV)
>  	mtspr	SPRN_SRR0,r11
>  	SYNC
>  	RFI				/* jump to handler, enable MMU */
> +#ifdef CONFIG_SMP
> +1:
> +	lwz	r5, TASK_CPU - THREAD(r10)
> +	RFI
> +#endif
>  .endm
>  
>  /* To handle the additional exception priority levels on 40x and Book-E
> diff --git a/arch/powerpc/kernel/vdso32/Makefile b/arch/powerpc/kernel/vdso32/Makefile
> index 06f54d947057..e147bbdc12cd 100644
> --- a/arch/powerpc/kernel/vdso32/Makefile
> +++ b/arch/powerpc/kernel/vdso32/Makefile
> @@ -2,9 +2,7 @@
>  
>  # List of files in the vdso, has to be asm only for now
>  
> -obj-vdso32-$(CONFIG_PPC64) = getcpu.o
> -obj-vdso32 = sigtramp.o gettimeofday.o datapage.o cacheflush.o note.o \
> -		$(obj-vdso32-y)
> +obj-vdso32 = sigtramp.o gettimeofday.o datapage.o cacheflush.o note.o getcpu.o
>  
>  # Build rules
>  
> diff --git a/arch/powerpc/kernel/vdso32/getcpu.S b/arch/powerpc/kernel/vdso32/getcpu.S
> index 63e914539e1a..bde226ad904d 100644
> --- a/arch/powerpc/kernel/vdso32/getcpu.S
> +++ b/arch/powerpc/kernel/vdso32/getcpu.S
> @@ -17,7 +17,14 @@
>   */
>  V_FUNCTION_BEGIN(__kernel_getcpu)
>    .cfi_startproc
> +#if defined(CONFIG_PPC64)
>  	mfspr	r5,SPRN_SPRG_VDSO_READ
> +#elif defined(CONFIG_SMP)
> +	li	r0, NR_MAGIC_FAST_VDSO_SYSCALL
> +	sc	/* returns cpuid in r5, clobbers cr0 and r10-r13 */
> +#else
> +	li	r5, 0
> +#endif
>  	cmpwi	cr0,r3,0
>  	cmpwi	cr1,r4,0
>  	clrlwi  r6,r5,16
> diff --git a/arch/powerpc/kernel/vdso32/vdso32.lds.S b/arch/powerpc/kernel/vdso32/vdso32.lds.S
> index 099a6db14e67..663880671e20 100644
> --- a/arch/powerpc/kernel/vdso32/vdso32.lds.S
> +++ b/arch/powerpc/kernel/vdso32/vdso32.lds.S
> @@ -152,9 +152,7 @@ VERSION
>  		__kernel_sync_dicache_p5;
>  		__kernel_sigtramp32;
>  		__kernel_sigtramp_rt32;
> -#ifdef CONFIG_PPC64
>  		__kernel_getcpu;
> -#endif
>  		__kernel_time;
>  
>  	local: *;
> -- 
> 2.13.3

  reply	other threads:[~2019-09-18  5:55 UTC|newest]

Thread overview: 29+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-08-22 16:34 [PATCH v2 0/8] powerpc/vdso32 enhancement and optimisation Christophe Leroy
2019-08-22 16:34 ` Christophe Leroy
2019-08-22 16:34 ` [PATCH v2 1/8] powerpc/32: Add VDSO version of getcpu Christophe Leroy
2019-08-22 16:34   ` Christophe Leroy
2019-09-18  5:51   ` Michael Ellerman [this message]
2019-09-18  5:51     ` Michael Ellerman
2019-10-29 16:10     ` Christophe Leroy
2019-10-29 16:10       ` Christophe Leroy
2019-08-22 16:34 ` [PATCH v2 2/8] powerpc/vdso32: Add support for CLOCK_{REALTIME/MONOTONIC}_COARSE Christophe Leroy
2019-08-22 16:34   ` Christophe Leroy
2019-08-22 16:34 ` [PATCH v2 3/8] powerpc: Fix vDSO clock_getres() Christophe Leroy
2019-08-22 16:34   ` Christophe Leroy
2019-08-22 17:17   ` Sasha Levin
2019-08-22 16:34 ` [PATCH v2 4/8] powerpc/vdso32: inline __get_datapage() Christophe Leroy
2019-08-22 16:34   ` Christophe Leroy
2019-08-26  5:44   ` Santosh Sivaraj
2019-09-13 10:59     ` Christophe Leroy
2019-09-13 13:31       ` Santosh Sivaraj
2019-09-13 13:50         ` Christophe Leroy
2019-09-13 14:03           ` Santosh Sivaraj
2019-10-29 16:12     ` Christophe Leroy
2019-08-22 16:34 ` [PATCH v2 5/8] powerpc/vdso32: Don't read cache line size from the datapage on PPC32 Christophe Leroy
2019-08-22 16:34   ` Christophe Leroy
2019-08-22 16:34 ` [PATCH v2 6/8] powerpc/vdso32: use LOAD_REG_IMMEDIATE() Christophe Leroy
2019-08-22 16:34   ` Christophe Leroy
2019-08-22 16:34 ` [PATCH v2 7/8] powerpc/vdso32: implement clock_getres entirely Christophe Leroy
2019-08-22 16:34   ` Christophe Leroy
2019-08-22 16:34 ` [PATCH v2 8/8] powerpc/vdso32: miscellaneous optimisations Christophe Leroy
2019-08-22 16:34   ` Christophe Leroy

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87h85aw3r9.fsf@mpe.ellerman.id.au \
    --to=mpe@ellerman.id.au \
    --cc=benh@kernel.crashing.org \
    --cc=christophe.leroy@c-s.fr \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linuxppc-dev@lists.ozlabs.org \
    --cc=paulus@samba.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.