From: Jeremy Fitzhardinge <jeremy@goop.org>
To: Andi Kleen <ak@suse.de>
Cc: patches@x86-64.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] [8/30] x86_64: Add vDSO for x86-64 with gettimeofday/clock_gettime/getcpu
Date: Mon, 30 Apr 2007 22:57:57 -0700
Message-ID: <4636D6E5.8090006@goop.org>
In-Reply-To: <20070501035805.9FA9513CAF@wotan.suse.de>

Andi Kleen wrote:
> This implements a new vDSO for x86-64.  The concept is similar
> to the existing vDSOs on i386 and PPC.  x86-64 has had static
> vsyscalls before, but these are not flexible enough anymore.
>
> A vDSO is an ELF shared library supplied by the kernel that is mapped into
> user address space.  The vDSO mapping is randomized for each process
> for security reasons.
>
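
For readers following along, a minimal userspace sketch of what "an ELF
shared library mapped into user address space" looks like from the
process side. It assumes a current Linux system whose C library provides
getauxval() and whose kernel exports the vDSO address through the
AT_SYSINFO_EHDR auxiliary-vector entry; neither is shown in the quoted
part of this patch, and the helper names vdso_base() and has_elf_magic()
are made up for illustration.

```c
#include <elf.h>        /* ELFMAG, SELFMAG */
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/auxv.h>   /* getauxval(), AT_SYSINFO_EHDR */

/*
 * Return the address at which the kernel mapped the vDSO into this
 * process, or NULL if the auxiliary vector carries no such entry
 * (for example when the vDSO is disabled).
 * Assumption: the address is published as AT_SYSINFO_EHDR.
 */
static const void *vdso_base(void)
{
	return (const void *)(uintptr_t)getauxval(AT_SYSINFO_EHDR);
}

/* Check that the mapping starts with the four ELF magic bytes. */
static int has_elf_magic(const void *base)
{
	return base != NULL && memcmp(base, ELFMAG, SELFMAG) == 0;
}
```

Printing vdso_base() from two separate runs of the same program should
normally give different addresses when the mapping is randomized as
described above.
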
> Doing this was needed for clock_gettime, because clock_gettime
> always needs a syscall fallback and having one at a fixed
> address would have made buffer overflow exploits too easy to write.
>
> The vdso can be disabled with vdso=0
>
> It currently includes a new gettimeofday implementation and optimized
> clock_gettime(). The gettimeofday implementation is slightly faster
> than the one in the old vsyscall.  clock_gettime is significantly faster 
> than the syscall for CLOCK_MONOTONIC and CLOCK_REALTIME.
>
> The new calls are generally faster than the old vsyscall. 
>
> TBD: add new benchmarks
>
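
Since the benchmarks are still TBD, here is a rough sketch of how the
"faster than the syscall" claim could be checked from userspace. It is
not the author's benchmark. It assumes a C library whose clock_gettime()
is routed through the vDSO (the patch notes glibc support was not yet
written at the time), and compares it with a forced kernel entry via
syscall(2). The function name time_clock_calls() is hypothetical.

```c
#define _GNU_SOURCE
#include <stdint.h>
#include <sys/syscall.h>  /* SYS_clock_gettime */
#include <time.h>
#include <unistd.h>       /* syscall() */

/*
 * Time `iters` reads of CLOCK_MONOTONIC.
 *   use_syscall == 0: call the C library's clock_gettime()
 *   use_syscall != 0: force a real kernel entry via syscall(2)
 * Returns 0 and stores the elapsed nanoseconds, or -1 on error.
 */
static int time_clock_calls(int use_syscall, long iters, uint64_t *elapsed_ns)
{
	struct timespec start, end, ts;
	long i;

	if (clock_gettime(CLOCK_MONOTONIC, &start) != 0)
		return -1;
	for (i = 0; i < iters; i++) {
		long rc = use_syscall
			? syscall(SYS_clock_gettime, CLOCK_MONOTONIC, &ts)
			: clock_gettime(CLOCK_MONOTONIC, &ts);
		if (rc != 0)
			return -1;
	}
	if (clock_gettime(CLOCK_MONOTONIC, &end) != 0)
		return -1;

	/* Seconds difference scaled to ns, then the nanosecond parts. */
	*elapsed_ns = (uint64_t)(end.tv_sec - start.tv_sec) * 1000000000ULL
		    + (uint64_t)end.tv_nsec - (uint64_t)start.tv_nsec;
	return 0;
}
```

Calling it once with use_syscall set to 0 and once with 1, over the same
iteration count, gives two elapsed times to compare. The numbers depend
heavily on the clocksource and CPU, so treat any single run as
indicative only.
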
> Advantages over the old x86-64 vsyscalls:
> - Extensible
> - Randomized
> - Cleaner
> - Easier to virtualize (the old static address range previously caused
> overhead e.g. for Xen because it has to create special page tables for it)
>
> Weak points: 
> - glibc support still to be written
>
> The VM interface is partly based on Ingo Molnar's i386 version.
>
> Signed-off-by: Andi Kleen <ak@suse.de>
>
> ---
>  Documentation/kernel-parameters.txt |    2 
>  arch/x86_64/Makefile                |    3 
>  arch/x86_64/ia32/ia32_binfmt.c      |    1 
>  arch/x86_64/kernel/time.c           |    1 
>  arch/x86_64/kernel/vmlinux.lds.S    |   12 +++
>  arch/x86_64/kernel/vsyscall.c       |   22 +----
>  arch/x86_64/mm/init.c               |   17 ++++
>  arch/x86_64/vdso/Makefile           |   49 ++++++++++++
>  arch/x86_64/vdso/vclock_gettime.c   |  120 +++++++++++++++++++++++++++++++
>  arch/x86_64/vdso/vdso-note.S        |   25 ++++++
>  arch/x86_64/vdso/vdso-start.S       |    2 
>  arch/x86_64/vdso/vdso.S             |    2 
>  arch/x86_64/vdso/vdso.lds.S         |   77 ++++++++++++++++++++
>  arch/x86_64/vdso/vextern.h          |   16 ++++
>  arch/x86_64/vdso/vgetcpu.c          |   50 +++++++++++++
>  arch/x86_64/vdso/vma.c              |  137 ++++++++++++++++++++++++++++++++++++
>  arch/x86_64/vdso/voffset.h          |    1 
>  arch/x86_64/vdso/vvar.c             |   12 +++
>  include/asm-x86_64/auxvec.h         |    2 
>  include/asm-x86_64/elf.h            |   13 +++
>  include/asm-x86_64/mmu.h            |    1 
>  include/asm-x86_64/pgtable.h        |    8 +-
>  include/asm-x86_64/vgtod.h          |   29 +++++++
>  include/asm-x86_64/vsyscall.h       |    3 
>  24 files changed, 583 insertions(+), 22 deletions(-)
>
> Index: linux/arch/x86_64/ia32/ia32_binfmt.c
> ===================================================================
> --- linux.orig/arch/x86_64/ia32/ia32_binfmt.c
> +++ linux/arch/x86_64/ia32/ia32_binfmt.c
> @@ -38,6 +38,7 @@
>  
>  int sysctl_vsyscall32 = 1;
>  
> +#undef ARCH_DLINFO
>  #define ARCH_DLINFO do {  \
>  	if (sysctl_vsyscall32) { \
>  	NEW_AUX_ENT(AT_SYSINFO, (u32)(u64)VSYSCALL32_VSYSCALL); \
> Index: linux/arch/x86_64/kernel/vmlinux.lds.S
> ===================================================================
> --- linux.orig/arch/x86_64/kernel/vmlinux.lds.S
> +++ linux/arch/x86_64/kernel/vmlinux.lds.S
> @@ -94,6 +94,9 @@ SECTIONS
>    .vsyscall_gtod_data : AT(VLOAD(.vsyscall_gtod_data))
>  		{ *(.vsyscall_gtod_data) }
>    vsyscall_gtod_data = VVIRT(.vsyscall_gtod_data);
> +  .vsyscall_clock : AT(VLOAD(.vsyscall_clock))
> +		{ *(.vsyscall_clock) }
> +  vsyscall_clock = VVIRT(.vsyscall_clock);
>  
>  
>    .vsyscall_1 ADDR(.vsyscall_0) + 1024: AT(VLOAD(.vsyscall_1))
> @@ -153,6 +156,8 @@ SECTIONS
>  
>    . = ALIGN(4096);		/* Init code and data */
>    __init_begin = .;
> +
> +
>    .init.text : AT(ADDR(.init.text) - LOAD_OFFSET) {
>  	_sinittext = .;
>  	*(.init.text)
> @@ -190,6 +195,12 @@ SECTIONS
>    .exit.text : AT(ADDR(.exit.text) - LOAD_OFFSET) { *(.exit.text) }
>    .exit.data : AT(ADDR(.exit.data) - LOAD_OFFSET) { *(.exit.data) }
>  
> +/* vdso blob that is mapped into user space */
> +  vdso_start = . ;
> +  .vdso  : AT(ADDR(.vdso) - LOAD_OFFSET) { *(.vdso) }
> +  . = ALIGN(4096);
> +  vdso_end = .;
> +
>  #ifdef CONFIG_BLK_DEV_INITRD
>    . = ALIGN(4096);
>    __initramfs_start = .;
> @@ -202,6 +213,7 @@ SECTIONS
>    .data.percpu  : AT(ADDR(.data.percpu) - LOAD_OFFSET) { *(.data.percpu) }
>    __per_cpu_end = .;
>    . = ALIGN(4096);
> +
>    __init_end = .;
>  
>    . = ALIGN(4096);
> Index: linux/arch/x86_64/mm/init.c
> ===================================================================
> --- linux.orig/arch/x86_64/mm/init.c
> +++ linux/arch/x86_64/mm/init.c
> @@ -159,6 +159,14 @@ static __init void set_pte_phys(unsigned
>  	__flush_tlb_one(vaddr);
>  }
>  
> +void __init
> +set_kernel_map(void *vaddr,unsigned long len,unsigned long phys,pgprot_t prot)
> +{
> +	void *end = vaddr + ALIGN(len, PAGE_SIZE);
> +	for (; vaddr < end; vaddr += PAGE_SIZE, phys += PAGE_SIZE)
> +		set_pte_phys((unsigned long)vaddr, phys, prot);
> +}
> +
>  /* NOTE: this is meant to be run only at boot */
>  void __init 
>  __set_fixmap (enum fixed_addresses idx, unsigned long phys, pgprot_t prot)
> @@ -756,3 +764,12 @@ int in_gate_area_no_task(unsigned long a
>  {
>  	return (addr >= VSYSCALL_START) && (addr < VSYSCALL_END);
>  }
> +
> +const char *arch_vma_name(struct vm_area_struct *vma)
> +{
> +	if (vma->vm_mm && vma->vm_start == (long)vma->vm_mm->context.vdso)
> +		return "[vdso]";
> +	if (vma == &gate_vma)
> +		return "[vsyscall]";
> +	return NULL;
> +}
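
The names returned by arch_vma_name() are what a process sees as the
label on the corresponding lines of /proc/self/maps. A small sketch of
looking up the "[vdso]" entry from userspace, assuming procfs is mounted
at /proc and the usual "start-end perms ..." line layout; the helper
name find_vdso_mapping() is made up for illustration.

```c
#include <stdio.h>
#include <string.h>

/*
 * Scan /proc/self/maps for the line labelled "[vdso]".
 * Returns 1 and fills start/end if found, 0 if no such mapping
 * exists, and -1 if the maps file could not be opened.
 */
static int find_vdso_mapping(unsigned long *start, unsigned long *end)
{
	char line[512];
	int found = 0;
	FILE *f = fopen("/proc/self/maps", "r");

	if (f == NULL)
		return -1;
	while (fgets(line, sizeof(line), f) != NULL) {
		/* Each line begins with "start-end" in hexadecimal. */
		if (strstr(line, "[vdso]") != NULL &&
		    sscanf(line, "%lx-%lx", start, end) == 2) {
			found = 1;
			break;
		}
	}
	fclose(f);
	return found;
}
```

A return value of 0 is expected when the vDSO is turned off, so callers
should not treat it as an error.
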
> Index: linux/arch/x86_64/vdso/vdso-note.S
> ===================================================================
> --- /dev/null
> +++ linux/arch/x86_64/vdso/vdso-note.S
> @@ -0,0 +1,25 @@
> +/*
> + * This supplies .note.* sections to go into the PT_NOTE inside the vDSO text.
> + * Here we can supply some information useful to userland.
> + */
> +
> +#include <linux/uts.h>
> +#include <linux/version.h>
> +
> +#define ASM_ELF_NOTE_BEGIN(name, flags, vendor, type)			      \
>   

Use linux/elfnote.h?

    J
