public inbox for linux-kernel@vger.kernel.org
* [PATCH v23 00/13] x86: Add x86 32 bit VDSO time function support
@ 2014-03-17 22:22 Stefani Seibold
  2014-03-17 22:22 ` [PATCH v23 01/13] x86, vdso: Make vsyscall_gtod_data handling x86 generic Stefani Seibold
                   ` (12 more replies)
  0 siblings, 13 replies; 32+ messages in thread
From: Stefani Seibold @ 2014-03-17 22:22 UTC (permalink / raw)
  To: gregkh, linux-kernel, x86, tglx, mingo, hpa, ak, aarcange,
	john.stultz, luto, xemul, gorcunov, andriy.shevchenko
  Cc: Martin.Runge, Andreas.Brief, Stefani Seibold

This patch series adds the functions vdso_gettimeofday(), vdso_clock_gettime()
and vdso_time() to the x86 32 bit VDSO.

The reason for doing this is to get a fast, reliable time stamp. Many
developers use the TSC to get a fast time stamp, without knowing the pitfalls.
The VDSO time functions are a fast and reliable way, because the kernel knows
the best time source and the P- and C-states of the CPU.

The helper library for using the VDSO functions can be downloaded at
http://seibold.net/vdso.c
The library is very small, only 228 lines of code. Compile it with
gcc -Wall -O3 -fpic vdso.c -lrt -shared -o libvdso.so
and use it with LD_PRELOAD=<path>/libvdso.so

There is also a patch at http://seibold.net/glibc.patch for glibc 2.19, which
also works for glibc 2.18. This patch must be integrated into glibc.

Some Linux 32 bit kernel benchmark results (all measurements are in
nanoseconds):

Intel(R) Celeron(TM) CPU 400MHz

Average time kernel call:
 gettimeofday(): 1039
 clock_gettime(): 1578
 time(): 526
Average time VDSO call:
 gettimeofday(): 378
 clock_gettime(): 303
 time(): 60

Celeron(R) Dual-Core CPU T3100 1.90GHz

Average time kernel call:
 gettimeofday(): 209
 clock_gettime(): 406
 time(): 135
Average time VDSO call:
 gettimeofday(): 51
 clock_gettime(): 43
 time(): 10

So you can see a performance increase by a factor of between 4 and 13,
depending on the CPU and the function.

The IA32 emulation uses the whole 4 GB address space, so there is no
fixed address available.

The VDSO for a 32 bit application has now three pages:

^ Higher Address
|
+----------------------------------------+
+ VDSO page (includes code) ro+x         +
+----------------------------------------+
+ VVAR page (export kernel variables) ro +
+----------------------------------------+
+ HPET page (mapped registers) ro        +
+----------------------------------------+
|
v Lower Address

The VDSO page for a 32 bit application still resides at 0xffffe000; the VVAR
and HPET pages are mapped below it.

In non-compat mode the VMA of the VDSO now spans three pages for a 32 bit
kernel, which decreases the available logical address space by two pages.

The patch is against tip x86/vdso (1f2cbcf648962cdcf511d234cb39745baa9f5d07)

Changelog:
25.11.2012 - first release and proof of concept for linux 3.4
11.12.2012 - Port to linux 3.7 and code cleanup
12.12.2012 - fixes suggested by Andy Lutomirski
           - fixes suggested by John Stultz
           - use call VDSO32_vsyscall instead of int 80
           - code cleanup
17.12.2012 - support for IA32_EMULATION, this includes
             - code cleanup
             - include cleanup to fix compile warnings and errors
             - move out seqcount from seqlock, enable use in VDSO
             - map FIXMAP and HPET into the 32 bit address space
18.12.2012 - split into separate patches
30.01.2014 - revamp the code
             - code clean up
             - VDSO layout changed
             - no fixed addresses
             - port to 3.14
01.02.2014 - code cleanup
02.02.2014 - code cleanup
             - split into more patches
             - use HPET_COUNTER instead of hard coded value
             - fix changelog to the right year ;-)
02.02.2014 - reverse the mapping, which makes the new VDSO 32 bit support
             fully compatible.
03.02.2014 - code cleanup
             - fix comment
             - fix ABI break in vdso32.lds.S
04.02.2014 - revamp IA32 emulation support
             - introduce VVAR macro
             - rearranged vsyscall_gtod_data structure for IA32 emulation support
             - code cleanup
05.02.2014 - revamp IA32 emulation support
             - replace seqcount_t by an unsigned, to make the vsyscall_gtod_data
               structure independent of kernel config and functions.
08.02.2014 - revamp IA32 emulation support
             - replace all internal structures by fixed size elements
10.02.2014 - code cleanup
             - add comments
             - revamp inline assembly
12.02.2014 - add conditional fixmap of vvar and hpet pages for 32 bit kernel
14.02.2014 - fix CONFIG_PARAVIRT_CLOCK, which is not supported in 32 bit VDSO
15.02.2014 - fix tsc
             code cleanup
             tested make ARCH=i386 allyesconfig and make allyesconfig
16.02.2014 - code cleanup
             - fix all C=1 warnings, also some not introduced by this patch
             - hack to fix C=1 32 bit VDSO spinlock for a 64 bit kernel
             - fix VDSO Makefile for newer gcc
             tested for gcc 4.3.4 and 4.8.1
             tested ARCH=i386 allyesconfig, defconfig and allmodconfig
             tested X86_64 allyesconfig, defconfig and allmodconfig
17.02.2014 - In case of a 32 bit VDSO for a 64 bit kernel fake a 32 bit kernel
             configuration.
19.02.2014 - Add missing #undef and #define to fake proper 32 bit kernel config
             Add a missing #ifdef CONFIG_HPET_TIMER
             tested again ARCH=i386 allyesconfig, defconfig and allmodconfig
             tested again ARCH=X86_64 allyesconfig, defconfig and allmodconfig
02.03.2014 - Add fixes suggested by Andy Lutomirski
             - Patch alternatives in the 32 bit VDSO
	     - Use the default ABI for the 32 bit VDSO
	     - Inline the CLOCK MONOTONIC VDSO code
	     - Zero pad the VVAR page
	     - fix "patch alternatives" compile for 32 bit kernel
03.03.2014 - Add glibc.patch http://seibold.net/glibc.patch
             Add reviewed-by tags
17.03.2014 - Rebase on tip x86/vdso 1f2cbcf648962cdcf511d234cb39745baa9f5d07
             remove fixmap and compat mode dependencies

* [Patch v22 03/12] x86: revamp vclock_gettime.c
@ 2014-03-03 21:12 Stefani Seibold
  2014-03-05 22:30 ` [tip:x86/vdso] x86, vdso: Revamp vclock_gettime.c tip-bot for Stefani Seibold
  0 siblings, 1 reply; 32+ messages in thread
From: Stefani Seibold @ 2014-03-03 21:12 UTC (permalink / raw)
  To: gregkh, linux-kernel, x86, tglx, mingo, hpa, ak, aarcange,
	john.stultz, luto, xemul, gorcunov, andriy.shevchenko
  Cc: Martin.Runge, Andreas.Brief, Stefani Seibold

This intermediate patch revamps vclock_gettime.c by moving some functions
around. It is only for splitting purposes, to make the whole 32 bit vdso timer
patch easier to review.

Reviewed-by: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: Stefani Seibold <stefani@seibold.net>
---
 arch/x86/vdso/vclock_gettime.c | 85 +++++++++++++++++++++---------------------
 1 file changed, 42 insertions(+), 43 deletions(-)

diff --git a/arch/x86/vdso/vclock_gettime.c b/arch/x86/vdso/vclock_gettime.c
index eb5d7a5..bbc8065 100644
--- a/arch/x86/vdso/vclock_gettime.c
+++ b/arch/x86/vdso/vclock_gettime.c
@@ -26,41 +26,26 @@
 
 #define gtod (&VVAR(vsyscall_gtod_data))
 
-notrace static cycle_t vread_tsc(void)
+static notrace cycle_t vread_hpet(void)
 {
-	cycle_t ret;
-	u64 last;
-
-	/*
-	 * Empirically, a fence (of type that depends on the CPU)
-	 * before rdtsc is enough to ensure that rdtsc is ordered
-	 * with respect to loads.  The various CPU manuals are unclear
-	 * as to whether rdtsc can be reordered with later loads,
-	 * but no one has ever seen it happen.
-	 */
-	rdtsc_barrier();
-	ret = (cycle_t)vget_cycles();
-
-	last = VVAR(vsyscall_gtod_data).clock.cycle_last;
-
-	if (likely(ret >= last))
-		return ret;
+	return readl((const void __iomem *)fix_to_virt(VSYSCALL_HPET) + HPET_COUNTER);
+}
 
-	/*
-	 * GCC likes to generate cmov here, but this branch is extremely
-	 * predictable (it's just a funciton of time and the likely is
-	 * very likely) and there's a data dependence, so force GCC
-	 * to generate a branch instead.  I don't barrier() because
-	 * we don't actually need a barrier, and if this function
-	 * ever gets inlined it will generate worse code.
-	 */
-	asm volatile ("");
-	return last;
+notrace static long vdso_fallback_gettime(long clock, struct timespec *ts)
+{
+	long ret;
+	asm("syscall" : "=a" (ret) :
+	    "0" (__NR_clock_gettime), "D" (clock), "S" (ts) : "memory");
+	return ret;
 }
 
-static notrace cycle_t vread_hpet(void)
+notrace static long vdso_fallback_gtod(struct timeval *tv, struct timezone *tz)
 {
-	return readl((const void __iomem *)fix_to_virt(VSYSCALL_HPET) + HPET_COUNTER);
+	long ret;
+
+	asm("syscall" : "=a" (ret) :
+	    "0" (__NR_gettimeofday), "D" (tv), "S" (tz) : "memory");
+	return ret;
 }
 
 #ifdef CONFIG_PARAVIRT_CLOCK
@@ -133,23 +118,37 @@ static notrace cycle_t vread_pvclock(int *mode)
 }
 #endif
 
-notrace static long vdso_fallback_gettime(long clock, struct timespec *ts)
+notrace static cycle_t vread_tsc(void)
 {
-	long ret;
-	asm("syscall" : "=a" (ret) :
-	    "0" (__NR_clock_gettime),"D" (clock), "S" (ts) : "memory");
-	return ret;
-}
+	cycle_t ret;
+	u64 last;
 
-notrace static long vdso_fallback_gtod(struct timeval *tv, struct timezone *tz)
-{
-	long ret;
+	/*
+	 * Empirically, a fence (of type that depends on the CPU)
+	 * before rdtsc is enough to ensure that rdtsc is ordered
+	 * with respect to loads.  The various CPU manuals are unclear
+	 * as to whether rdtsc can be reordered with later loads,
+	 * but no one has ever seen it happen.
+	 */
+	rdtsc_barrier();
+	ret = (cycle_t)vget_cycles();
 
-	asm("syscall" : "=a" (ret) :
-	    "0" (__NR_gettimeofday), "D" (tv), "S" (tz) : "memory");
-	return ret;
-}
+	last = VVAR(vsyscall_gtod_data).clock.cycle_last;
 
+	if (likely(ret >= last))
+		return ret;
+
+	/*
+	 * GCC likes to generate cmov here, but this branch is extremely
+	 * predictable (it's just a funciton of time and the likely is
+	 * very likely) and there's a data dependence, so force GCC
+	 * to generate a branch instead.  I don't barrier() because
+	 * we don't actually need a barrier, and if this function
+	 * ever gets inlined it will generate worse code.
+	 */
+	asm volatile ("");
+	return last;
+}
 
 notrace static inline u64 vgetsns(int *mode)
 {
-- 
1.9.0


* [PATCH v18 03/10] revamp vclock_gettime.c
@ 2014-02-16 21:52 Stefani Seibold
  2014-02-17  0:52 ` [tip:x86/vdso] x86, vdso: Revamp vclock_gettime.c tip-bot for Stefani Seibold
  0 siblings, 1 reply; 32+ messages in thread
From: Stefani Seibold @ 2014-02-16 21:52 UTC (permalink / raw)
  To: gregkh, linux-kernel, x86, tglx, mingo, hpa, ak, aarcange,
	john.stultz, luto, xemul, gorcunov, andriy.shevchenko
  Cc: Martin.Runge, Andreas.Brief, Stefani Seibold

This intermediate patch revamps vclock_gettime.c by moving some functions
around. It is only for splitting purposes, to make the whole 32 bit vdso timer
patch easier to review.

Signed-off-by: Stefani Seibold <stefani@seibold.net>
---
 arch/x86/vdso/vclock_gettime.c | 85 +++++++++++++++++++++---------------------
 1 file changed, 42 insertions(+), 43 deletions(-)

diff --git a/arch/x86/vdso/vclock_gettime.c b/arch/x86/vdso/vclock_gettime.c
index eb5d7a5..bbc8065 100644
--- a/arch/x86/vdso/vclock_gettime.c
+++ b/arch/x86/vdso/vclock_gettime.c
@@ -26,41 +26,26 @@
 
 #define gtod (&VVAR(vsyscall_gtod_data))
 
-notrace static cycle_t vread_tsc(void)
+static notrace cycle_t vread_hpet(void)
 {
-	cycle_t ret;
-	u64 last;
-
-	/*
-	 * Empirically, a fence (of type that depends on the CPU)
-	 * before rdtsc is enough to ensure that rdtsc is ordered
-	 * with respect to loads.  The various CPU manuals are unclear
-	 * as to whether rdtsc can be reordered with later loads,
-	 * but no one has ever seen it happen.
-	 */
-	rdtsc_barrier();
-	ret = (cycle_t)vget_cycles();
-
-	last = VVAR(vsyscall_gtod_data).clock.cycle_last;
-
-	if (likely(ret >= last))
-		return ret;
+	return readl((const void __iomem *)fix_to_virt(VSYSCALL_HPET) + HPET_COUNTER);
+}
 
-	/*
-	 * GCC likes to generate cmov here, but this branch is extremely
-	 * predictable (it's just a funciton of time and the likely is
-	 * very likely) and there's a data dependence, so force GCC
-	 * to generate a branch instead.  I don't barrier() because
-	 * we don't actually need a barrier, and if this function
-	 * ever gets inlined it will generate worse code.
-	 */
-	asm volatile ("");
-	return last;
+notrace static long vdso_fallback_gettime(long clock, struct timespec *ts)
+{
+	long ret;
+	asm("syscall" : "=a" (ret) :
+	    "0" (__NR_clock_gettime), "D" (clock), "S" (ts) : "memory");
+	return ret;
 }
 
-static notrace cycle_t vread_hpet(void)
+notrace static long vdso_fallback_gtod(struct timeval *tv, struct timezone *tz)
 {
-	return readl((const void __iomem *)fix_to_virt(VSYSCALL_HPET) + HPET_COUNTER);
+	long ret;
+
+	asm("syscall" : "=a" (ret) :
+	    "0" (__NR_gettimeofday), "D" (tv), "S" (tz) : "memory");
+	return ret;
 }
 
 #ifdef CONFIG_PARAVIRT_CLOCK
@@ -133,23 +118,37 @@ static notrace cycle_t vread_pvclock(int *mode)
 }
 #endif
 
-notrace static long vdso_fallback_gettime(long clock, struct timespec *ts)
+notrace static cycle_t vread_tsc(void)
 {
-	long ret;
-	asm("syscall" : "=a" (ret) :
-	    "0" (__NR_clock_gettime),"D" (clock), "S" (ts) : "memory");
-	return ret;
-}
+	cycle_t ret;
+	u64 last;
 
-notrace static long vdso_fallback_gtod(struct timeval *tv, struct timezone *tz)
-{
-	long ret;
+	/*
+	 * Empirically, a fence (of type that depends on the CPU)
+	 * before rdtsc is enough to ensure that rdtsc is ordered
+	 * with respect to loads.  The various CPU manuals are unclear
+	 * as to whether rdtsc can be reordered with later loads,
+	 * but no one has ever seen it happen.
+	 */
+	rdtsc_barrier();
+	ret = (cycle_t)vget_cycles();
 
-	asm("syscall" : "=a" (ret) :
-	    "0" (__NR_gettimeofday), "D" (tv), "S" (tz) : "memory");
-	return ret;
-}
+	last = VVAR(vsyscall_gtod_data).clock.cycle_last;
 
+	if (likely(ret >= last))
+		return ret;
+
+	/*
+	 * GCC likes to generate cmov here, but this branch is extremely
+	 * predictable (it's just a funciton of time and the likely is
+	 * very likely) and there's a data dependence, so force GCC
+	 * to generate a branch instead.  I don't barrier() because
+	 * we don't actually need a barrier, and if this function
+	 * ever gets inlined it will generate worse code.
+	 */
+	asm volatile ("");
+	return last;
+}
 
 notrace static inline u64 vgetsns(int *mode)
 {
-- 
1.8.5.5



end of thread, other threads:[~2014-03-27 22:37 UTC | newest]

Thread overview: 32+ messages
-- links below jump to the message on this page --
2014-03-17 22:22 [PATCH v23 00/13] x86: Add x86 32 bit VDSO time function support Stefani Seibold
2014-03-17 22:22 ` [PATCH v23 01/13] x86, vdso: Make vsyscall_gtod_data handling x86 generic Stefani Seibold
2014-03-18 21:27   ` [tip:x86/vdso] " tip-bot for Stefani Seibold
2014-03-17 22:22 ` [PATCH v23 02/13] mm: Add new func _install_special_mapping() to mmap.c Stefani Seibold
2014-03-18 21:28   ` [tip:x86/vdso] " tip-bot for Stefani Seibold
2014-03-17 22:22 ` [PATCH v23 03/13] x86, vdso: Revamp vclock_gettime.c Stefani Seibold
2014-03-18 21:28   ` [tip:x86/vdso] " tip-bot for Stefani Seibold
2014-03-17 22:22 ` [PATCH v23 04/13] x86, vdso: __vdso_clock_gettime() cleanup Stefani Seibold
2014-03-18 21:28   ` [tip:x86/vdso] " tip-bot for Stefani Seibold
2014-03-17 22:22 ` [PATCH v23 05/13] x86, vdso: Replace VVAR(vsyscall_gtod_data) by gtod macro Stefani Seibold
2014-03-18 21:28   ` [tip:x86/vdso] " tip-bot for Stefani Seibold
2014-03-17 22:22 ` [PATCH v23 06/13] x86, vdso: Cleanup __vdso_gettimeofday() Stefani Seibold
2014-03-18 21:28   ` [tip:x86/vdso] " tip-bot for Stefani Seibold
2014-03-17 22:22 ` [PATCH v23 07/13] x86, vdso: Introduce VVAR marco for vdso32 Stefani Seibold
2014-03-18 21:29   ` [tip:x86/vdso] " tip-bot for Stefani Seibold
2014-03-17 22:22 ` [PATCH v23 08/13] x86, vdso: Patch alternatives in the 32-bit VDSO Stefani Seibold
2014-03-18 21:29   ` [tip:x86/vdso] " tip-bot for Andy Lutomirski
2014-03-17 22:22 ` [PATCH v23 09/13] x86, vdso: Add 32 bit VDSO time support for 32 bit kernel Stefani Seibold
2014-03-18 21:29   ` [tip:x86/vdso] " tip-bot for Stefani Seibold
2014-03-17 22:22 ` [PATCH v23 10/13] x86, vdso: Add 32 bit VDSO time support for 64 " Stefani Seibold
2014-03-18 21:29   ` [tip:x86/vdso] " tip-bot for Stefani Seibold
2014-03-27 20:44   ` [PATCH v23 10/13] " John Stultz
2014-03-27 21:12     ` Andy Lutomirski
2014-03-27 22:35     ` H. Peter Anvin
2014-03-17 22:22 ` [PATCH v23 11/13] x86, vdso: Zero-pad the VVAR page Stefani Seibold
2014-03-18 21:29   ` [tip:x86/vdso] " tip-bot for Andy Lutomirski
2014-03-17 22:22 ` [PATCH v23 12/13] x86, vdso32: Disable stack protector, adjust optimizations Stefani Seibold
2014-03-18 21:29   ` [tip:x86/vdso] " tip-bot for H. Peter Anvin
2014-03-17 22:22 ` [PATCH v23 13/13] x86, vdso32: handle 32 bit vDSO larger one page Stefani Seibold
2014-03-18 21:30   ` [tip:x86/vdso] " tip-bot for Stefani Seibold
  -- strict thread matches above, loose matches on Subject: below --
2014-03-03 21:12 [Patch v22 03/12] x86: revamp vclock_gettime.c Stefani Seibold
2014-03-05 22:30 ` [tip:x86/vdso] x86, vdso: Revamp vclock_gettime.c tip-bot for Stefani Seibold
2014-02-16 21:52 [PATCH v18 03/10] revamp vclock_gettime.c Stefani Seibold
2014-02-17  0:52 ` [tip:x86/vdso] x86, vdso: Revamp vclock_gettime.c tip-bot for Stefani Seibold
