* [PATCH 0/4] clock_gettime_ns and x86-64 optimizations
@ 2011-12-25 16:50 Andy Lutomirski
2011-12-25 16:51 ` [PATCH 1/4] Add clock_gettime_ns syscall Andy Lutomirski
` (4 more replies)
0 siblings, 5 replies; 11+ messages in thread
From: Andy Lutomirski @ 2011-12-25 16:50 UTC (permalink / raw)
To: linux-kernel, Kumar Sundararajan, john stultz, Arun Sharma
Cc: Peter Zijlstra, Ingo Molnar, Thomas Gleixner, Richard Cochran,
Andy Lutomirski
On x86-64, clock_gettime is so fast that the overhead converting to and
from nanoseconds is non-negligible. clock_gettime_ns is a different and
faster interface.
Patch 1 adds the syscall and wires it up on x86-64. Patch 2 implements
the corresponding vdso entry on x86-64. Patch 3 optimizes the vdso
call, and patch 4 is a trivial change that speeds up the vdso
clock_gettime and clock_gettime_ns implementations.
The vdso timings are (on an 800MHz Sandy Bridge mobile):
Basic implementation:
realtime 77.4ns
monotonic 79.2ns
realtime_coarse 18.1ns
monotonic_coarse 22.0ns
realtime_ns 84.9ns
monotonic_ns 85.1ns
realtime_coarse_ns 19.49ns
monotonic_coarse_ns 27.32ns
Optimized implementation:
realtime 78.5ns
monotonic 77.4ns [a little faster -- maybe significant]
realtime_coarse 18.4ns
monotonic_coarse 19.4ns
realtime_ns 77.85ns [a nice improvement]
monotonic_ns 77.75ns [ditto]
realtime_coarse_ns 18.2ns
monotonic_coarse_ns 18.2ns [a lot faster]
Inlined (patch 4): [everything is improved]
realtime 73.4ns
monotonic 72.1ns
realtime_coarse 13.2ns
monotonic_coarse 15.8ns
realtime_ns 73.15ns
monotonic_ns 72.1ns
realtime_coarse_ns 14.1ns
monotonic_coarse_ns 15.6ns
This being the middle of the holidays, I reserve the right to have made
mistakes.
For the git-inclined, this series is at
https://git.kernel.org/?p=linux/kernel/git/luto/linux.git;a=shortlog;h=refs/heads/timing/clock_gettime_ns/patch_v1
Andy Lutomirski (4):
Add clock_gettime_ns syscall
x86-64: Add __vdso_clock_gettime_ns vsyscall
x86-64: Optimize vdso clock_gettime
x86-64: Inline vdso clock_gettime helpers
arch/x86/include/asm/unistd_64.h | 2 +
arch/x86/include/asm/vgtod.h | 21 ++++--
arch/x86/kernel/vsyscall_64.c | 25 +++++++-
arch/x86/vdso/vclock_gettime.c | 136 ++++++++++++++++++++++++++------------
arch/x86/vdso/vdso.lds.S | 7 ++
include/linux/syscalls.h | 3 +
include/linux/time.h | 5 ++
kernel/posix-timers.c | 30 ++++++++
8 files changed, 179 insertions(+), 50 deletions(-)
--
1.7.7.4
^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH 1/4] Add clock_gettime_ns syscall
2011-12-25 16:50 [PATCH 0/4] clock_gettime_ns and x86-64 optimizations Andy Lutomirski
@ 2011-12-25 16:51 ` Andy Lutomirski
2011-12-27 7:25 ` Richard Cochran
2011-12-28 19:02 ` Arun Sharma
2011-12-25 16:51 ` [PATCH 2/4] x86-64: Add __vdso_clock_gettime_ns vsyscall Andy Lutomirski
` (3 subsequent siblings)
4 siblings, 2 replies; 11+ messages in thread
From: Andy Lutomirski @ 2011-12-25 16:51 UTC (permalink / raw)
To: linux-kernel, Kumar Sundararajan, john stultz, Arun Sharma
Cc: Peter Zijlstra, Ingo Molnar, Thomas Gleixner, Richard Cochran,
Andy Lutomirski
On some architectures, clock_gettime is fast enough that converting
between nanoseconds and struct timespec takes a significant amount
of time. Introduce a new syscall that does the same thing but
returns the answer in nanoseconds. 2^64 nanoseconds since the epoch
won't wrap around until the year 2554, and by then we can use
128-bit types.
clock_gettime_ns returns an unsigned nanosecond count. It will wrap
when the time from whatever clock is being read exceeds about 584
years. For CLOCK_MONOTONIC, CLOCK_BOOTTIME, etc, this is unlikely
to be a problem. For CLOCK_REALTIME, either user code can check for
wraparound or can switch to 128-bit integers in a little over 500
years.
This interface intentionally does not support sub-nanosecond
precision. For one thing, light only travels about a foot per
nanosecond, so nanoseconds are really pretty good for networking
purposes. For another, 2^64 picoseconds (say) is a short enough
interval to be inconvenient. If anyone needs sub-nanosecond
precision for anything other than profiling, they're welcome to
figure out an appropriate interface. For very precise profiling,
"what time is it" is the wrong question, anyway -- modern CPUs can
reorder things across time scales much longer than a nanosecond.
Signed-off-by: Andy Lutomirski <luto@amacapital.net>
---
arch/x86/include/asm/unistd_64.h | 2 ++
include/linux/syscalls.h | 3 +++
include/linux/time.h | 5 +++++
kernel/posix-timers.c | 30 ++++++++++++++++++++++++++++++
4 files changed, 40 insertions(+), 0 deletions(-)
diff --git a/arch/x86/include/asm/unistd_64.h b/arch/x86/include/asm/unistd_64.h
index 2010405..3a48069 100644
--- a/arch/x86/include/asm/unistd_64.h
+++ b/arch/x86/include/asm/unistd_64.h
@@ -683,6 +683,8 @@ __SYSCALL(__NR_sendmmsg, sys_sendmmsg)
__SYSCALL(__NR_setns, sys_setns)
#define __NR_getcpu 309
__SYSCALL(__NR_getcpu, sys_getcpu)
+#define __NR_clock_gettime_ns 310
+__SYSCALL(__NR_clock_gettime_ns, sys_clock_gettime_ns)
#ifndef __NO_STUBS
#define __ARCH_WANT_OLD_READDIR
diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
index 1ff0ec2..89cb897 100644
--- a/include/linux/syscalls.h
+++ b/include/linux/syscalls.h
@@ -49,6 +49,7 @@ struct statfs;
struct statfs64;
struct __sysctl_args;
struct sysinfo;
+struct timens;
struct timespec;
struct timeval;
struct timex;
@@ -316,6 +317,8 @@ asmlinkage long sys_clock_settime(clockid_t which_clock,
const struct timespec __user *tp);
asmlinkage long sys_clock_gettime(clockid_t which_clock,
struct timespec __user *tp);
+asmlinkage long sys_clock_gettime_ns(clockid_t which_clock,
+ struct timens __user *tp);
asmlinkage long sys_clock_adjtime(clockid_t which_clock,
struct timex __user *tx);
asmlinkage long sys_clock_getres(clockid_t which_clock,
diff --git a/include/linux/time.h b/include/linux/time.h
index b306178..d4488b1 100644
--- a/include/linux/time.h
+++ b/include/linux/time.h
@@ -27,6 +27,11 @@ struct timezone {
int tz_dsttime; /* type of dst correction */
};
+struct timens {
+ u64 ns; /* nanoseconds since the relevant epoch */
+ u64 padding; /* for future expansion (UTC offset? sub-ns?) */
+};
+
#ifdef __KERNEL__
extern struct timezone sys_tz;
diff --git a/kernel/posix-timers.c b/kernel/posix-timers.c
index 4556182..43bc842 100644
--- a/kernel/posix-timers.c
+++ b/kernel/posix-timers.c
@@ -980,6 +980,36 @@ SYSCALL_DEFINE2(clock_gettime, const clockid_t, which_clock,
return error;
}
+SYSCALL_DEFINE2(clock_gettime_ns, const clockid_t, which_clock,
+ struct timens __user *, tp)
+{
+ /*
+ * This implementation isn't as fast as it could be, but the syscall
+ * entry will take much longer than the unnecessary division and
+ * multiplication. Arch-specific implementations can be made faster.
+ */
+
+ struct k_clock *kc = clockid_to_kclock(which_clock);
+ struct timespec kernel_timespec;
+ struct timens timens;
+ int error;
+
+ if (!kc)
+ return -EINVAL;
+
+ error = kc->clock_get(which_clock, &kernel_timespec);
+
+ if (!error) {
+ timens.ns = kernel_timespec.tv_sec * NSEC_PER_SEC
+ + kernel_timespec.tv_nsec;
+ timens.padding = 0;
+
+ error = copy_to_user(tp, &timens, sizeof(timens));
+ }
+
+ return error;
+}
+
SYSCALL_DEFINE2(clock_adjtime, const clockid_t, which_clock,
struct timex __user *, utx)
{
--
1.7.7.4
* [PATCH 2/4] x86-64: Add __vdso_clock_gettime_ns vsyscall
2011-12-25 16:50 [PATCH 0/4] clock_gettime_ns and x86-64 optimizations Andy Lutomirski
2011-12-25 16:51 ` [PATCH 1/4] Add clock_gettime_ns syscall Andy Lutomirski
@ 2011-12-25 16:51 ` Andy Lutomirski
2011-12-25 16:51 ` [PATCH 3/4] x86-64: Optimize vdso clock_gettime Andy Lutomirski
` (2 subsequent siblings)
4 siblings, 0 replies; 11+ messages in thread
From: Andy Lutomirski @ 2011-12-25 16:51 UTC (permalink / raw)
To: linux-kernel, Kumar Sundararajan, john stultz, Arun Sharma
Cc: Peter Zijlstra, Ingo Molnar, Thomas Gleixner, Richard Cochran,
Andy Lutomirski
This is just for the ABI. The next patch optimizes the implementation.
Signed-off-by: Andy Lutomirski <luto@amacapital.net>
---
arch/x86/vdso/vclock_gettime.c | 70 ++++++++++++++++++++++++++++++---------
arch/x86/vdso/vdso.lds.S | 7 ++++
2 files changed, 61 insertions(+), 16 deletions(-)
diff --git a/arch/x86/vdso/vclock_gettime.c b/arch/x86/vdso/vclock_gettime.c
index 6bc0e72..c8a2b46 100644
--- a/arch/x86/vdso/vclock_gettime.c
+++ b/arch/x86/vdso/vclock_gettime.c
@@ -82,7 +82,7 @@ notrace static inline long vgetns(void)
return (v * gtod->clock.mult) >> gtod->clock.shift;
}
-notrace static noinline int do_realtime(struct timespec *ts)
+notrace static noinline void do_realtime(struct timespec *ts)
{
unsigned long seq, ns;
do {
@@ -92,10 +92,9 @@ notrace static noinline int do_realtime(struct timespec *ts)
ns = vgetns();
} while (unlikely(read_seqretry(&gtod->lock, seq)));
timespec_add_ns(ts, ns);
- return 0;
}
-notrace static noinline int do_monotonic(struct timespec *ts)
+notrace static noinline void do_monotonic(struct timespec *ts)
{
unsigned long seq, ns, secs;
do {
@@ -115,11 +114,9 @@ notrace static noinline int do_monotonic(struct timespec *ts)
}
ts->tv_sec = secs;
ts->tv_nsec = ns;
-
- return 0;
}
-notrace static noinline int do_realtime_coarse(struct timespec *ts)
+notrace static noinline void do_realtime_coarse(struct timespec *ts)
{
unsigned long seq;
do {
@@ -127,10 +124,9 @@ notrace static noinline int do_realtime_coarse(struct timespec *ts)
ts->tv_sec = gtod->wall_time_coarse.tv_sec;
ts->tv_nsec = gtod->wall_time_coarse.tv_nsec;
} while (unlikely(read_seqretry(&gtod->lock, seq)));
- return 0;
}
-notrace static noinline int do_monotonic_coarse(struct timespec *ts)
+notrace static noinline void do_monotonic_coarse(struct timespec *ts)
{
unsigned long seq, ns, secs;
do {
@@ -150,25 +146,29 @@ notrace static noinline int do_monotonic_coarse(struct timespec *ts)
}
ts->tv_sec = secs;
ts->tv_nsec = ns;
-
- return 0;
}
notrace int __vdso_clock_gettime(clockid_t clock, struct timespec *ts)
{
switch (clock) {
case CLOCK_REALTIME:
- if (likely(gtod->clock.vclock_mode != VCLOCK_NONE))
- return do_realtime(ts);
+ if (likely(gtod->clock.vclock_mode != VCLOCK_NONE)) {
+ do_realtime(ts);
+ return 0;
+ }
break;
case CLOCK_MONOTONIC:
- if (likely(gtod->clock.vclock_mode != VCLOCK_NONE))
- return do_monotonic(ts);
+ if (likely(gtod->clock.vclock_mode != VCLOCK_NONE)) {
+ do_monotonic(ts);
+ return 0;
+ }
break;
case CLOCK_REALTIME_COARSE:
- return do_realtime_coarse(ts);
+ do_realtime_coarse(ts);
+ return 0;
case CLOCK_MONOTONIC_COARSE:
- return do_monotonic_coarse(ts);
+ do_monotonic_coarse(ts);
+ return 0;
}
return vdso_fallback_gettime(clock, ts);
@@ -176,6 +176,44 @@ notrace int __vdso_clock_gettime(clockid_t clock, struct timespec *ts)
int clock_gettime(clockid_t, struct timespec *)
__attribute__((weak, alias("__vdso_clock_gettime")));
+notrace int __vdso_clock_gettime_ns(clockid_t clock, struct timens *t)
+{
+ /* This implementation is slow. It will be improved later. */
+
+ struct timespec ts;
+ int error;
+
+ switch (clock) {
+ case CLOCK_REALTIME:
+ if (likely(gtod->clock.vclock_mode != VCLOCK_NONE)) {
+ do_realtime(&ts);
+ goto done;
+ }
+ break;
+ case CLOCK_MONOTONIC:
+ if (likely(gtod->clock.vclock_mode != VCLOCK_NONE)) {
+ do_monotonic(&ts);
+ goto done;
+ }
+ break;
+ case CLOCK_REALTIME_COARSE:
+ do_realtime_coarse(&ts);
+ goto done;
+ case CLOCK_MONOTONIC_COARSE:
+ do_monotonic_coarse(&ts);
+ goto done;
+ }
+
+ error = vdso_fallback_gettime(clock, &ts);
+ if (error)
+ return error;
+
+done:
+ t->ns = ts.tv_sec * NSEC_PER_SEC + ts.tv_nsec;
+ t->padding = 0;
+ return 0;
+}
+
notrace int __vdso_gettimeofday(struct timeval *tv, struct timezone *tz)
{
long ret;
diff --git a/arch/x86/vdso/vdso.lds.S b/arch/x86/vdso/vdso.lds.S
index b96b267..238f500 100644
--- a/arch/x86/vdso/vdso.lds.S
+++ b/arch/x86/vdso/vdso.lds.S
@@ -17,6 +17,10 @@
VERSION {
LINUX_2.6 {
global:
+ /*
+ * These are the original vsyscalls. They have weak symbols
+ * without the __vdso_ prefix for ABI compatibility.
+ */
clock_gettime;
__vdso_clock_gettime;
gettimeofday;
@@ -25,6 +29,9 @@ VERSION {
__vdso_getcpu;
time;
__vdso_time;
+
+ /* New vsyscalls are just plain functions. */
+ __vdso_clock_gettime_ns;
local: *;
};
}
--
1.7.7.4
* [PATCH 3/4] x86-64: Optimize vdso clock_gettime
2011-12-25 16:50 [PATCH 0/4] clock_gettime_ns and x86-64 optimizations Andy Lutomirski
2011-12-25 16:51 ` [PATCH 1/4] Add clock_gettime_ns syscall Andy Lutomirski
2011-12-25 16:51 ` [PATCH 2/4] x86-64: Add __vdso_clock_gettime_ns vsyscall Andy Lutomirski
@ 2011-12-25 16:51 ` Andy Lutomirski
2011-12-25 16:51 ` [PATCH 4/4] x86-64: Inline vdso clock_gettime helpers Andy Lutomirski
2011-12-27 7:46 ` [PATCH 0/4] clock_gettime_ns and x86-64 optimizations Richard Cochran
4 siblings, 0 replies; 11+ messages in thread
From: Andy Lutomirski @ 2011-12-25 16:51 UTC (permalink / raw)
To: linux-kernel, Kumar Sundararajan, john stultz, Arun Sharma
Cc: Peter Zijlstra, Ingo Molnar, Thomas Gleixner, Richard Cochran,
Andy Lutomirski
This is a small improvement.
Signed-off-by: Andy Lutomirski <luto@amacapital.net>
---
arch/x86/include/asm/vgtod.h | 21 +++++++---
arch/x86/kernel/vsyscall_64.c | 25 +++++++++++-
arch/x86/vdso/vclock_gettime.c | 89 ++++++++++++++++++++++-----------------
3 files changed, 89 insertions(+), 46 deletions(-)
diff --git a/arch/x86/include/asm/vgtod.h b/arch/x86/include/asm/vgtod.h
index 815285b..03b4999 100644
--- a/arch/x86/include/asm/vgtod.h
+++ b/arch/x86/include/asm/vgtod.h
@@ -7,11 +7,6 @@
struct vsyscall_gtod_data {
seqlock_t lock;
- /* open coded 'struct timespec' */
- time_t wall_time_sec;
- u32 wall_time_nsec;
-
- struct timezone sys_tz;
struct { /* extract of a clocksource struct */
int vclock_mode;
cycle_t cycle_last;
@@ -19,8 +14,22 @@ struct vsyscall_gtod_data {
u32 mult;
u32 shift;
} clock;
- struct timespec wall_to_monotonic;
+
+ /* open coded 'struct timespec' */
+ time_t wall_time_sec;
+ u32 wall_time_nsec;
+ u32 monotonic_time_nsec;
+ time_t monotonic_time_sec;
+
+ /* Flat counts for clock_ns_get */
+ u64 wall_time_flat_ns;
+ u64 monotonic_time_flat_ns;
+ u64 wall_time_coarse_flat_ns;
+ u64 monotonic_time_coarse_flat_ns;
+
+ struct timezone sys_tz;
struct timespec wall_time_coarse;
+ struct timespec monotonic_time_coarse;
};
extern struct vsyscall_gtod_data vsyscall_gtod_data;
diff --git a/arch/x86/kernel/vsyscall_64.c b/arch/x86/kernel/vsyscall_64.c
index b56c65de..9c2e148 100644
--- a/arch/x86/kernel/vsyscall_64.c
+++ b/arch/x86/kernel/vsyscall_64.c
@@ -91,6 +91,7 @@ void update_vsyscall(struct timespec *wall_time, struct timespec *wtm,
struct clocksource *clock, u32 mult)
{
unsigned long flags;
+ struct timespec monotonic;
write_seqlock_irqsave(&vsyscall_gtod_data.lock, flags);
@@ -100,10 +101,32 @@ void update_vsyscall(struct timespec *wall_time, struct timespec *wtm,
vsyscall_gtod_data.clock.mask = clock->mask;
vsyscall_gtod_data.clock.mult = mult;
vsyscall_gtod_data.clock.shift = clock->shift;
+
vsyscall_gtod_data.wall_time_sec = wall_time->tv_sec;
vsyscall_gtod_data.wall_time_nsec = wall_time->tv_nsec;
- vsyscall_gtod_data.wall_to_monotonic = *wtm;
+
+ monotonic = timespec_add(*wall_time, *wtm);
+ vsyscall_gtod_data.monotonic_time_sec = monotonic.tv_sec;
+ vsyscall_gtod_data.monotonic_time_nsec = monotonic.tv_nsec;
+
vsyscall_gtod_data.wall_time_coarse = __current_kernel_time();
+ vsyscall_gtod_data.monotonic_time_coarse =
+ timespec_add(vsyscall_gtod_data.wall_time_coarse, *wtm);
+
+ /* generate flat data for clock_ns_get */
+ vsyscall_gtod_data.wall_time_flat_ns =
+ vsyscall_gtod_data.wall_time_sec * NSEC_PER_SEC +
+ vsyscall_gtod_data.wall_time_nsec;
+ vsyscall_gtod_data.monotonic_time_flat_ns =
+ vsyscall_gtod_data.monotonic_time_sec * NSEC_PER_SEC +
+ vsyscall_gtod_data.monotonic_time_nsec;
+
+ vsyscall_gtod_data.wall_time_coarse_flat_ns =
+ vsyscall_gtod_data.wall_time_coarse.tv_sec * NSEC_PER_SEC +
+ vsyscall_gtod_data.wall_time_coarse.tv_nsec;
+ vsyscall_gtod_data.monotonic_time_coarse_flat_ns =
+ vsyscall_gtod_data.monotonic_time_coarse.tv_sec * NSEC_PER_SEC +
+ vsyscall_gtod_data.monotonic_time_coarse.tv_nsec;
write_sequnlock_irqrestore(&vsyscall_gtod_data.lock, flags);
}
diff --git a/arch/x86/vdso/vclock_gettime.c b/arch/x86/vdso/vclock_gettime.c
index c8a2b46..fe960ac 100644
--- a/arch/x86/vdso/vclock_gettime.c
+++ b/arch/x86/vdso/vclock_gettime.c
@@ -94,26 +94,36 @@ notrace static noinline void do_realtime(struct timespec *ts)
timespec_add_ns(ts, ns);
}
+notrace static noinline u64 do_realtime_ns(void)
+{
+ unsigned long seq, ns;
+ do {
+ seq = read_seqbegin(&gtod->lock);
+ ns = gtod->wall_time_flat_ns + vgetns();
+ } while (unlikely(read_seqretry(&gtod->lock, seq)));
+ return ns;
+}
+
notrace static noinline void do_monotonic(struct timespec *ts)
{
- unsigned long seq, ns, secs;
+ unsigned long seq, ns;
do {
seq = read_seqbegin(&gtod->lock);
- secs = gtod->wall_time_sec;
- ns = gtod->wall_time_nsec + vgetns();
- secs += gtod->wall_to_monotonic.tv_sec;
- ns += gtod->wall_to_monotonic.tv_nsec;
+ ts->tv_sec = gtod->monotonic_time_sec;
+ ts->tv_nsec = gtod->monotonic_time_nsec;
+ ns = vgetns();
} while (unlikely(read_seqretry(&gtod->lock, seq)));
+ timespec_add_ns(ts, ns);
+}
- /* wall_time_nsec, vgetns(), and wall_to_monotonic.tv_nsec
- * are all guaranteed to be nonnegative.
- */
- while (ns >= NSEC_PER_SEC) {
- ns -= NSEC_PER_SEC;
- ++secs;
- }
- ts->tv_sec = secs;
- ts->tv_nsec = ns;
+notrace static noinline u64 do_monotonic_ns(void)
+{
+ unsigned long seq, ns;
+ do {
+ seq = read_seqbegin(&gtod->lock);
+ ns = gtod->monotonic_time_flat_ns + vgetns();
+ } while (unlikely(read_seqretry(&gtod->lock, seq)));
+ return ns;
}
notrace static noinline void do_realtime_coarse(struct timespec *ts)
@@ -126,26 +136,26 @@ notrace static noinline void do_realtime_coarse(struct timespec *ts)
} while (unlikely(read_seqretry(&gtod->lock, seq)));
}
+notrace static noinline u64 do_realtime_coarse_ns(void)
+{
+ /* This is atomic on x86-64. */
+ return ACCESS_ONCE(gtod->wall_time_coarse_flat_ns);
+}
+
notrace static noinline void do_monotonic_coarse(struct timespec *ts)
{
- unsigned long seq, ns, secs;
+ unsigned long seq;
do {
seq = read_seqbegin(&gtod->lock);
- secs = gtod->wall_time_coarse.tv_sec;
- ns = gtod->wall_time_coarse.tv_nsec;
- secs += gtod->wall_to_monotonic.tv_sec;
- ns += gtod->wall_to_monotonic.tv_nsec;
+ ts->tv_sec = gtod->monotonic_time_coarse.tv_sec;
+ ts->tv_nsec = gtod->monotonic_time_coarse.tv_nsec;
} while (unlikely(read_seqretry(&gtod->lock, seq)));
+}
- /* wall_time_nsec and wall_to_monotonic.tv_nsec are
- * guaranteed to be between 0 and NSEC_PER_SEC.
- */
- if (ns >= NSEC_PER_SEC) {
- ns -= NSEC_PER_SEC;
- ++secs;
- }
- ts->tv_sec = secs;
- ts->tv_nsec = ns;
+notrace static noinline u64 do_monotonic_coarse_ns(void)
+{
+ /* This is atomic on x86-64. */
+ return ACCESS_ONCE(gtod->monotonic_time_coarse_flat_ns);
}
notrace int __vdso_clock_gettime(clockid_t clock, struct timespec *ts)
@@ -178,37 +188,38 @@ int clock_gettime(clockid_t, struct timespec *)
notrace int __vdso_clock_gettime_ns(clockid_t clock, struct timens *t)
{
- /* This implementation is slow. It will be improved later. */
-
struct timespec ts;
int error;
switch (clock) {
case CLOCK_REALTIME:
if (likely(gtod->clock.vclock_mode != VCLOCK_NONE)) {
- do_realtime(&ts);
- goto done;
+ t->ns = do_realtime_ns();
+ t->padding = 0;
+ return 0;
}
break;
case CLOCK_MONOTONIC:
if (likely(gtod->clock.vclock_mode != VCLOCK_NONE)) {
- do_monotonic(&ts);
- goto done;
+ t->ns = do_monotonic_ns();
+ t->padding = 0;
+ return 0;
}
break;
case CLOCK_REALTIME_COARSE:
- do_realtime_coarse(&ts);
- goto done;
+ t->ns = do_realtime_coarse_ns();
+ t->padding = 0;
+ return 0;
case CLOCK_MONOTONIC_COARSE:
- do_monotonic_coarse(&ts);
- goto done;
+ t->ns = do_monotonic_coarse_ns();
+ t->padding = 0;
+ return 0;
}
error = vdso_fallback_gettime(clock, &ts);
if (error)
return error;
-done:
t->ns = ts.tv_sec * NSEC_PER_SEC + ts.tv_nsec;
t->padding = 0;
return 0;
--
1.7.7.4
* [PATCH 4/4] x86-64: Inline vdso clock_gettime helpers
2011-12-25 16:50 [PATCH 0/4] clock_gettime_ns and x86-64 optimizations Andy Lutomirski
` (2 preceding siblings ...)
2011-12-25 16:51 ` [PATCH 3/4] x86-64: Optimize vdso clock_gettime Andy Lutomirski
@ 2011-12-25 16:51 ` Andy Lutomirski
2011-12-27 7:46 ` [PATCH 0/4] clock_gettime_ns and x86-64 optimizations Richard Cochran
4 siblings, 0 replies; 11+ messages in thread
From: Andy Lutomirski @ 2011-12-25 16:51 UTC (permalink / raw)
To: linux-kernel, Kumar Sundararajan, john stultz, Arun Sharma
Cc: Peter Zijlstra, Ingo Molnar, Thomas Gleixner, Richard Cochran,
Andy Lutomirski
This is about a 6% speedup on Sandy Bridge.
Signed-off-by: Andy Lutomirski <luto@amacapital.net>
---
arch/x86/vdso/vclock_gettime.c | 17 +++++++++--------
1 files changed, 9 insertions(+), 8 deletions(-)
diff --git a/arch/x86/vdso/vclock_gettime.c b/arch/x86/vdso/vclock_gettime.c
index fe960ac..40eead5 100644
--- a/arch/x86/vdso/vclock_gettime.c
+++ b/arch/x86/vdso/vclock_gettime.c
@@ -82,7 +82,8 @@ notrace static inline long vgetns(void)
return (v * gtod->clock.mult) >> gtod->clock.shift;
}
-notrace static noinline void do_realtime(struct timespec *ts)
+/* Code size doesn't matter (vdso is 4k anyway) and this is faster. */
+notrace static void __always_inline do_realtime(struct timespec *ts)
{
unsigned long seq, ns;
do {
@@ -94,7 +95,7 @@ notrace static noinline void do_realtime(struct timespec *ts)
timespec_add_ns(ts, ns);
}
-notrace static noinline u64 do_realtime_ns(void)
+notrace static u64 do_realtime_ns(void)
{
unsigned long seq, ns;
do {
@@ -104,7 +105,7 @@ notrace static noinline u64 do_realtime_ns(void)
return ns;
}
-notrace static noinline void do_monotonic(struct timespec *ts)
+notrace static void do_monotonic(struct timespec *ts)
{
unsigned long seq, ns;
do {
@@ -116,7 +117,7 @@ notrace static noinline void do_monotonic(struct timespec *ts)
timespec_add_ns(ts, ns);
}
-notrace static noinline u64 do_monotonic_ns(void)
+notrace static u64 do_monotonic_ns(void)
{
unsigned long seq, ns;
do {
@@ -126,7 +127,7 @@ notrace static noinline u64 do_monotonic_ns(void)
return ns;
}
-notrace static noinline void do_realtime_coarse(struct timespec *ts)
+notrace static void do_realtime_coarse(struct timespec *ts)
{
unsigned long seq;
do {
@@ -136,13 +137,13 @@ notrace static noinline void do_realtime_coarse(struct timespec *ts)
} while (unlikely(read_seqretry(&gtod->lock, seq)));
}
-notrace static noinline u64 do_realtime_coarse_ns(void)
+notrace static u64 do_realtime_coarse_ns(void)
{
/* This is atomic on x86-64. */
return ACCESS_ONCE(gtod->wall_time_coarse_flat_ns);
}
-notrace static noinline void do_monotonic_coarse(struct timespec *ts)
+notrace static void do_monotonic_coarse(struct timespec *ts)
{
unsigned long seq;
do {
@@ -152,7 +153,7 @@ notrace static noinline void do_monotonic_coarse(struct timespec *ts)
} while (unlikely(read_seqretry(&gtod->lock, seq)));
}
-notrace static noinline u64 do_monotonic_coarse_ns(void)
+notrace static u64 do_monotonic_coarse_ns(void)
{
/* This is atomic on x86-64. */
return ACCESS_ONCE(gtod->monotonic_time_coarse_flat_ns);
--
1.7.7.4
* Re: [PATCH 1/4] Add clock_gettime_ns syscall
2011-12-25 16:51 ` [PATCH 1/4] Add clock_gettime_ns syscall Andy Lutomirski
@ 2011-12-27 7:25 ` Richard Cochran
2011-12-28 19:02 ` Arun Sharma
1 sibling, 0 replies; 11+ messages in thread
From: Richard Cochran @ 2011-12-27 7:25 UTC (permalink / raw)
To: Andy Lutomirski
Cc: linux-kernel, Kumar Sundararajan, john stultz, Arun Sharma,
Peter Zijlstra, Ingo Molnar, Thomas Gleixner
On Sun, Dec 25, 2011 at 08:51:00AM -0800, Andy Lutomirski wrote:
>
> This interface intentionally does not support sub-nanosecond
> precision. For one thing, light only travels about a foot per
> nanosecond, so nanoseconds are really pretty good for networking
> purposes. For another, 2^64 picoseconds (say) is a short enough
> interval to be inconvenient. If anyone needs sub-nanosecond
> precision for anything other than profiling, they're welcome to
> figure out an appropriate interface. For very precise profiling,
> "what time is it" is the wrong question, anyway -- modern CPUs can
> reorder things across time scales much longer than a nanosecond.
This paragraph probably should be updated, since you did add a padding
field for future sub-nanosecond reporting.
Also, on re-reading this, it seems silly to me to introduce a function
called "clock_gettime_ns" and then to say that this function cannot
answer the question, what time is it?
If the interface is a profiling only interface, then the name should
reflect this fact.
Thanks,
Richard
* Re: [PATCH 0/4] clock_gettime_ns and x86-64 optimizations
2011-12-25 16:50 [PATCH 0/4] clock_gettime_ns and x86-64 optimizations Andy Lutomirski
` (3 preceding siblings ...)
2011-12-25 16:51 ` [PATCH 4/4] x86-64: Inline vdso clock_gettime helpers Andy Lutomirski
@ 2011-12-27 7:46 ` Richard Cochran
4 siblings, 0 replies; 11+ messages in thread
From: Richard Cochran @ 2011-12-27 7:46 UTC (permalink / raw)
To: Andy Lutomirski
Cc: linux-kernel, Kumar Sundararajan, john stultz, Arun Sharma,
Peter Zijlstra, Ingo Molnar, Thomas Gleixner
On Sun, Dec 25, 2011 at 08:50:59AM -0800, Andy Lutomirski wrote:
> On x86-64, clock_gettime is so fast that the overhead converting to and
> from nanoseconds is non-negligible. clock_gettime_ns is a different and
> faster interface.
But your data contradict this statement. See below.
> Patch 1 adds the syscall and wires it up on x86-64. Patch 2 implements
> the corresponding vdso entry on x86-64. Patch 3 optimizes the vdso
> call, and patch 4 is a trivial change that speeds up the vdso
> clock_gettime and clock_gettime_ns implementations.
>
> The vdso timings are (on an 800MHz Sandy Bridge mobile):
It would be more informative to describe how you made the measurements
in more detail, for example, pseudo code, number of trials, min, max,
mean, std. dev.
> Basic implementation:
>
> realtime 77.4ns
> monotonic 79.2ns
> realtime_coarse 18.1ns
> monotonic_coarse 22.0ns
In order to better understand what you are reporting, I arranged your
numbers into a table:
1. Basic implementation
2. Optimized implementation
3. Inlined (patch 4)
|---------------------+-------+-------+-------|
| | 1. | 2. | 3. |
|---------------------+-------+-------+-------|
| realtime | 77.40 | 78.50 | 73.40 |
| realtime_ns | 84.90 | 77.85 | 73.15 |
|---------------------+-------+-------+-------|
| monotonic | 79.20 | 77.40 | 72.10 |
| monotonic_ns | 85.10 | 77.75 | 72.10 |
|---------------------+-------+-------+-------|
| realtime_coarse | 18.10 | 18.40 | 13.20 |
| realtime_coarse_ns | 19.49 | 18.20 | 14.10 |
|---------------------+-------+-------+-------|
| monotonic_coarse | 22.00 | 19.40 | 15.80 |
| monotonic_coarse_ns | 27.32 | 18.20 | 15.60 |
|---------------------+-------+-------+-------|
Looking down column 3, it appears that the _ns calls are no faster
than their plain counterparts.
So, while the inline patch does improve performance, the new _ns
functions do not really seem worth the trouble.
Thanks,
Richard
* Re: [PATCH 1/4] Add clock_gettime_ns syscall
2011-12-25 16:51 ` [PATCH 1/4] Add clock_gettime_ns syscall Andy Lutomirski
2011-12-27 7:25 ` Richard Cochran
@ 2011-12-28 19:02 ` Arun Sharma
[not found] ` <CALCETrVz1ADNxeLzPmeWXPU5ApfKURH2vnged2A2Vng8-hUxcw@mail.gmail.com>
1 sibling, 1 reply; 11+ messages in thread
From: Arun Sharma @ 2011-12-28 19:02 UTC (permalink / raw)
To: Andy Lutomirski
Cc: linux-kernel, Kumar Sundararajan, john stultz, Peter Zijlstra,
Ingo Molnar, Thomas Gleixner, Richard Cochran
On 12/25/11 8:51 AM, Andy Lutomirski wrote:
> +struct timens {
> + u64 ns; /* nanoseconds since the relevant epoch */
> + u64 padding; /* for future expansion (UTC offset? sub-ns?) */
> +};
>..
> +SYSCALL_DEFINE2(clock_gettime_ns, const clockid_t, which_clock,
> + struct timens __user *, tp)
> +{
How about returning a (signed) long as the time in ns? This way, we save
a store and a load and the value can be passed in registers.
This shouldn't preclude future expansion via extra args.
-Arun
* Re: [PATCH 1/4] Add clock_gettime_ns syscall
[not found] ` <CALCETrVz1ADNxeLzPmeWXPU5ApfKURH2vnged2A2Vng8-hUxcw@mail.gmail.com>
@ 2011-12-28 22:45 ` Arun Sharma
2011-12-28 23:42 ` Andy Lutomirski
0 siblings, 1 reply; 11+ messages in thread
From: Arun Sharma @ 2011-12-28 22:45 UTC (permalink / raw)
To: Andy Lutomirski
Cc: Arun Sharma, Thomas Gleixner, Ingo Molnar, Peter Zijlstra,
Kumar Sundararajan, linux-kernel, john stultz, Richard Cochran
On Wed, Dec 28, 2011 at 12:13:37PM -0800, Andy Lutomirski wrote:
> > How about returning a (signed) long as the time in ns? This way, we save
> > a store and a load and the value can be passed in registers.
> >
> > This shouldn't preclude future expansion via extra args.
>
> With an unconditional store to a pointer? If a null pointer is allowed,
> the branch will probably kill any performance gain.
No - please see the patch below.
>
> The downside is that this is probably a non-starter for a syscall on 32-bit
> architectures. I'll see what the i386 ABI says. I wonder if returning a
> struct will use registers for future expansion.
>
I was thinking of doing something similar to lseek() on 32 bit archs
(i.e. by using a type similar to off_t that maps to the right thing for
both 32 and 64 bit).
I used the code below to benchmark the performance of clock_gettime()
vs clock_gettime_ns() when the client is interested in a nanosec based
interface.
gettimespec: 3.19 secs
getns: 2.54 secs (21% faster)
-Arun
PS: I didn't have to delete struct timens. I meant to drop timens.ns
(since it's the return value now).
>From 2a9bb81b56c2034f444c3caa6cf8fbfd47f1d888 Mon Sep 17 00:00:00 2001
From: Arun Sharma <asharma@fb.com>
Date: Wed, 28 Dec 2011 11:10:37 -0800
Subject: [PATCH] Make clock_gettime_ns return nanosecs
This should speed things up a bit, while leaving room for
future expansion (eg: return additional info in a struct
passed in as an extra arg).
Signed-off-by: Arun Sharma <asharma@fb.com>
---
arch/x86/vdso/vclock_gettime.c | 22 ++++++----------------
include/linux/syscalls.h | 3 +--
include/linux/time.h | 5 -----
kernel/posix-timers.c | 11 ++++-------
4 files changed, 11 insertions(+), 30 deletions(-)
diff --git a/arch/x86/vdso/vclock_gettime.c b/arch/x86/vdso/vclock_gettime.c
index f9c08b2..41f613c 100644
--- a/arch/x86/vdso/vclock_gettime.c
+++ b/arch/x86/vdso/vclock_gettime.c
@@ -219,7 +219,7 @@ notrace int __vdso_clock_gettime(clockid_t clock, struct timespec *ts)
int clock_gettime(clockid_t, struct timespec *)
__attribute__((weak, alias("__vdso_clock_gettime")));
-notrace int __vdso_clock_gettime_ns(clockid_t clock, struct timens *t)
+notrace long __vdso_clock_gettime_ns(clockid_t clock)
{
struct timespec ts;
int error;
@@ -227,35 +227,25 @@ notrace int __vdso_clock_gettime_ns(clockid_t clock, struct timens *t)
switch (clock) {
case CLOCK_REALTIME:
if (likely(gtod->clock.vclock_mode != VCLOCK_NONE)) {
- t->ns = do_realtime_ns();
- t->padding = 0;
- return 0;
+ return do_realtime_ns();
}
break;
case CLOCK_MONOTONIC:
if (likely(gtod->clock.vclock_mode != VCLOCK_NONE)) {
- t->ns = do_monotonic_ns();
- t->padding = 0;
- return 0;
+ return do_monotonic_ns();
}
break;
case CLOCK_REALTIME_COARSE:
- t->ns = do_realtime_coarse_ns();
- t->padding = 0;
- return 0;
+ return do_realtime_coarse_ns();
case CLOCK_MONOTONIC_COARSE:
- t->ns = do_monotonic_coarse_ns();
- t->padding = 0;
- return 0;
+ return do_monotonic_coarse_ns();
}
error = vdso_fallback_gettime(clock, &ts);
if (error)
return error;
- t->ns = ts.tv_sec * NSEC_PER_SEC + ts.tv_nsec;
- t->padding = 0;
- return 0;
+ return ts.tv_sec * NSEC_PER_SEC + ts.tv_nsec;
}
notrace int __vdso_gettimeofday(struct timeval *tv, struct timezone *tz)
diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
index b351ab6..63759aa 100644
--- a/include/linux/syscalls.h
+++ b/include/linux/syscalls.h
@@ -317,8 +317,7 @@ asmlinkage long sys_clock_settime(clockid_t which_clock,
const struct timespec __user *tp);
asmlinkage long sys_clock_gettime(clockid_t which_clock,
struct timespec __user *tp);
-asmlinkage long sys_clock_gettime_ns(clockid_t which_clock,
- struct timens __user *tp);
+asmlinkage long sys_clock_gettime_ns(clockid_t which_clock);
asmlinkage long sys_clock_adjtime(clockid_t which_clock,
struct timex __user *tx);
asmlinkage long sys_clock_getres(clockid_t which_clock,
diff --git a/include/linux/time.h b/include/linux/time.h
index d4488b1..b306178 100644
--- a/include/linux/time.h
+++ b/include/linux/time.h
@@ -27,11 +27,6 @@ struct timezone {
int tz_dsttime; /* type of dst correction */
};
-struct timens {
- u64 ns; /* nanoseconds since the relevant epoch */
- u64 padding; /* for future expansion (UTC offset? sub-ns?) */
-};
-
#ifdef __KERNEL__
extern struct timezone sys_tz;
diff --git a/kernel/posix-timers.c b/kernel/posix-timers.c
index 1b6ad2d..b87e3dc 100644
--- a/kernel/posix-timers.c
+++ b/kernel/posix-timers.c
@@ -980,8 +980,7 @@ SYSCALL_DEFINE2(clock_gettime, const clockid_t, which_clock,
return error;
}
-SYSCALL_DEFINE2(clock_gettime_ns, const clockid_t, which_clock,
- struct timens __user *, tp)
+SYSCALL_DEFINE1(clock_gettime_ns, const clockid_t, which_clock)
{
/*
* This implementation isn't as fast as it could be, but the syscall
@@ -991,8 +990,8 @@ SYSCALL_DEFINE2(clock_gettime_ns, const clockid_t, which_clock,
struct k_clock *kc = clockid_to_kclock(which_clock);
struct timespec kernel_timespec;
- struct timens timens;
int error;
+ long ns;
if (!kc)
return -EINVAL;
@@ -1000,11 +999,9 @@ SYSCALL_DEFINE2(clock_gettime_ns, const clockid_t, which_clock,
error = kc->clock_get(which_clock, &kernel_timespec);
if (!error) {
- timens.ns = kernel_timespec.tv_sec * NSEC_PER_SEC
+ ns = kernel_timespec.tv_sec * NSEC_PER_SEC
+ kernel_timespec.tv_nsec;
- timens.padding = 0;
-
- error = copy_to_user(tp, &timens, sizeof(timens));
+ return ns;
}
return error;
--
1.7.4
==== getns.c ====
#include <stdio.h>
#include <time.h>
#include <unistd.h>
#include <dlfcn.h>
#include <sys/types.h>
#include <sys/syscall.h>

volatile int sum;

int
main(int argc, char *argv[])
{
	unsigned long v;
	void *vdso = dlopen("linux-vdso.so.1", RTLD_LAZY | RTLD_LOCAL | RTLD_NOLOAD);
	unsigned long (*vdso_func)();
	int i;

	if (!vdso) {
		printf("Warning: failed to find vDSO\n");
		return 1;
	}
	vdso_func = dlsym(vdso, "__vdso_clock_gettime_ns");
	if (!vdso_func) {
		printf("Warning: failed to find __vdso_clock_gettime_ns in vDSO\n");
		return 1;
	}
	v = (*vdso_func)(CLOCK_REALTIME);
	printf("%lx %ld\n", v, v);
	for (i = 0; i < 100000000; i++)
		sum += (*vdso_func)(CLOCK_REALTIME);
	v = (*vdso_func)(CLOCK_REALTIME);
	printf("%lx %ld\n", v, v);
	return 0;
}
==== gettimespec.c ====
#include <stdio.h>
#include <time.h>
#include <unistd.h>
#include <dlfcn.h>
#include <sys/types.h>
#include <sys/syscall.h>

#define NSECS_PER_SEC 1000000000

volatile int sum;

int
main(int argc, char *argv[])
{
	unsigned long v;
	void *vdso = dlopen("linux-vdso.so.1", RTLD_LAZY | RTLD_LOCAL | RTLD_NOLOAD);
	unsigned long (*vdso_func)();
	int i;
	struct timespec ts;

	if (!vdso) {
		printf("Warning: failed to find vDSO\n");
		return 1;
	}
	vdso_func = dlsym(vdso, "__vdso_clock_gettime");
	if (!vdso_func) {
		printf("Warning: failed to find __vdso_clock_gettime in vDSO\n");
		return 1;
	}
	(*vdso_func)(CLOCK_REALTIME, &ts);
	v = ts.tv_sec * NSECS_PER_SEC + ts.tv_nsec;
	printf("%lx %ld\n", v, v);
	for (i = 0; i < 100000000; i++) {
		(*vdso_func)(CLOCK_REALTIME, &ts);
		v = ts.tv_sec * NSECS_PER_SEC + ts.tv_nsec;
		sum += v;
	}
	(*vdso_func)(CLOCK_REALTIME, &ts);
	v = ts.tv_sec * NSECS_PER_SEC + ts.tv_nsec;
	printf("%lx %ld\n", v, v);
	return 0;
}
* Re: [PATCH 1/4] Add clock_gettime_ns syscall
2011-12-28 22:45 ` Arun Sharma
@ 2011-12-28 23:42 ` Andy Lutomirski
2011-12-29 0:19 ` Arun Sharma
0 siblings, 1 reply; 11+ messages in thread
From: Andy Lutomirski @ 2011-12-28 23:42 UTC (permalink / raw)
To: Arun Sharma
Cc: Thomas Gleixner, Ingo Molnar, Peter Zijlstra, Kumar Sundararajan,
linux-kernel, john stultz, Richard Cochran
On Wed, Dec 28, 2011 at 2:45 PM, Arun Sharma <asharma@fb.com> wrote:
> On Wed, Dec 28, 2011 at 12:13:37PM -0800, Andy Lutomirski wrote:
>> > How about returning a (signed) long as the time in ns? This way, we save
>> > a store and a load and the value can be passed in registers.
>> >
>> > This shouldn't preclude future expansion via extra args.
>>
>> With an unconditional store to a pointer? If a null pointer is allowed,
>> the branch will probably kill any performance gain.
>
> No - please see the patch below.
>
>>
>> The downside is that this is probably a non-starter for a syscall on 32-bit
>> architectures. I'll see what the i386 ABI says. I wonder if returning a
>> struct will use registers for future expansion.
>>
>
> I was thinking of doing something similar to lseek() on 32 bit archs
> (i.e. by using a type similar to off_t that maps to the right thing for
> both 32 and 64 bit).
Huh? That nanosecond count really needs to be 64 bits (or more).
2^32 nanoseconds is a rather short time.
>
> I used the code below to benchmark the performance of clock_gettime()
> vs clock_gettime_ns() when the client is interested in a nanosec based
> interface.
>
> gettimespec: 3.19 secs
> getns: 2.54 secs (21% faster)
>
> -Arun
>
> PS: I didn't have to delete struct timens. I meant to drop timens.ns
> (since its the return value now).
How are you keeping the ability to add extra fields?
FWIW, returning a long long on i386 generates nice code (two registers
used), but returning a struct bigger than 64 bits uses a pointer
passed in via the stack (i.e. worse than passing a pointer as a
parameter). Returning a struct with two 64-bit ints on x86-64 uses
registers, which is rather unfortunate.
Something like:
u64 __vdso_clock_gettime_ns(clock_t clockid, struct timens_extra *extra)
{
extra->padding = 0;
return <the time>;
}
might be okay.
* Re: [PATCH 1/4] Add clock_gettime_ns syscall
2011-12-28 23:42 ` Andy Lutomirski
@ 2011-12-29 0:19 ` Arun Sharma
0 siblings, 0 replies; 11+ messages in thread
From: Arun Sharma @ 2011-12-29 0:19 UTC (permalink / raw)
To: Andy Lutomirski
Cc: Thomas Gleixner, Ingo Molnar, Peter Zijlstra, Kumar Sundararajan,
linux-kernel, john stultz, Richard Cochran
On 12/28/11 3:42 PM, Andy Lutomirski wrote:
> Something like:
>
> u64 __vdso_clock_gettime_ns(clock_t clockid, struct timens_extra *extra)
> {
> extra->padding = 0;
> return<the time>;
> }
>
>
> might be okay.
I think you want s64 as the return type (for both 32 and 64 bit). This
is what I was trying to suggest as well.
-Arun (Clicks the virtually non-existent like button)
Thread overview: 11+ messages
2011-12-25 16:50 [PATCH 0/4] clock_gettime_ns and x86-64 optimizations Andy Lutomirski
2011-12-25 16:51 ` [PATCH 1/4] Add clock_gettime_ns syscall Andy Lutomirski
2011-12-27 7:25 ` Richard Cochran
2011-12-28 19:02 ` Arun Sharma
[not found] ` <CALCETrVz1ADNxeLzPmeWXPU5ApfKURH2vnged2A2Vng8-hUxcw@mail.gmail.com>
2011-12-28 22:45 ` Arun Sharma
2011-12-28 23:42 ` Andy Lutomirski
2011-12-29 0:19 ` Arun Sharma
2011-12-25 16:51 ` [PATCH 2/4] x86-64: Add __vdso_clock_gettime_ns vsyscall Andy Lutomirski
2011-12-25 16:51 ` [PATCH 3/4] x86-64: Optimize vdso clock_gettime Andy Lutomirski
2011-12-25 16:51 ` [PATCH 4/4] x86-64: Inline vdso clock_gettime helpers Andy Lutomirski
2011-12-27 7:46 ` [PATCH 0/4] clock_gettime_ns and x86-64 optimizations Richard Cochran