* [PATCH 0/2] time: Fix races at clocksource switch time
@ 2012-03-14 23:58 John Stultz
2012-03-14 23:58 ` [PATCH 1/2] time: x86: Fix race switching from vsyscall to non-vsyscall clock John Stultz
2012-03-14 23:58 ` [PATCH 2/2] time: Fix change_clocksource locking John Stultz
0 siblings, 2 replies; 9+ messages in thread
From: John Stultz @ 2012-03-14 23:58 UTC (permalink / raw)
To: linux-kernel; +Cc: John Stultz, Andy Lutomirski, Thomas Gleixner
In testing some recent timekeeping changes, I found some problem
when changing clocksources.
These two patches close races at clocksource switch time. One I sent out
earlier, and have reworked to integrate Thomas' comments.
These two patches, along with Thomas' recent cleanup patches ontop can
be found here:
git://git.linaro.org/people/jstultz/linux.git fortglx/3.4/time
CC: Andy Lutomirski <luto@amacapital.net>
CC: Thomas Gleixner <tglx@linutronix.de>
John Stultz (2):
time: x86: Fix race switching from vsyscall to non-vsyscall clock
time: Fix change_clocksource locking
arch/x86/vdso/vclock_gettime.c | 63 ++++++++++++++++++++++++----------------
kernel/time/timekeeping.c | 7 ++++
2 files changed, 45 insertions(+), 25 deletions(-)
--
1.7.3.2.146.gca209
^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH 1/2] time: x86: Fix race switching from vsyscall to non-vsyscall clock
2012-03-14 23:58 [PATCH 0/2] time: Fix races at clocksource switch time John Stultz
@ 2012-03-14 23:58 ` John Stultz
2012-03-15 0:34 ` Thomas Gleixner
2012-03-14 23:58 ` [PATCH 2/2] time: Fix change_clocksource locking John Stultz
1 sibling, 1 reply; 9+ messages in thread
From: John Stultz @ 2012-03-14 23:58 UTC (permalink / raw)
To: linux-kernel; +Cc: John Stultz, Andy Lutomirski, Thomas Gleixner
When switching from a vsyscall capable to a non-vsyscall capable
clocksource, there was a small race, where the last vsyscall
gettimeofday before the switch might return a invalid time value
using the new non-vsyscall enabled clocksource values after the
switch is complete.
This is due to the vsyscall code checking the vclock_mode once
outside of the seqcount protected section. After it reads the
vclock mode, it doesn't re-check that the sampled clock data
that is obtained in the seqcount critical section still matches.
The fix is to sample vclock_mode inside the protected section,
and as long as it isn't VCLOCK_NONE, return the calculated
value. If it has changed and is now VCLOCK_NONE, fall back
to the syscall gettime calculation.
v2:
* Cleanup checks as suggested by tglx
* Also fix same issue present in gettimeofday path
CC: Andy Lutomirski <luto@amacapital.net>
CC: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: John Stultz <john.stultz@linaro.org>
---
arch/x86/vdso/vclock_gettime.c | 63 ++++++++++++++++++++++++----------------
1 files changed, 38 insertions(+), 25 deletions(-)
diff --git a/arch/x86/vdso/vclock_gettime.c b/arch/x86/vdso/vclock_gettime.c
index 6bc0e72..6c93209 100644
--- a/arch/x86/vdso/vclock_gettime.c
+++ b/arch/x86/vdso/vclock_gettime.c
@@ -70,6 +70,15 @@ notrace static long vdso_fallback_gettime(long clock, struct timespec *ts)
return ret;
}
+notrace static long vdso_fallback_gtod(struct timeval *tv, struct timezone *tz)
+{
+ long ret;
+ asm("syscall" : "=a" (ret) :
+ "0" (__NR_gettimeofday), "D" (tv), "S" (tz) : "memory");
+ return ret;
+}
+
+
notrace static inline long vgetns(void)
{
long v;
@@ -85,21 +94,26 @@ notrace static inline long vgetns(void)
notrace static noinline int do_realtime(struct timespec *ts)
{
unsigned long seq, ns;
+ int mode;
do {
seq = read_seqbegin(>od->lock);
+ mode = gtod->clock.vclock_mode;
ts->tv_sec = gtod->wall_time_sec;
ts->tv_nsec = gtod->wall_time_nsec;
ns = vgetns();
} while (unlikely(read_seqretry(>od->lock, seq)));
+
timespec_add_ns(ts, ns);
- return 0;
+ return mode;
}
notrace static noinline int do_monotonic(struct timespec *ts)
{
unsigned long seq, ns, secs;
+ int mode;
do {
seq = read_seqbegin(>od->lock);
+ mode = gtod->clock.vclock_mode;
secs = gtod->wall_time_sec;
ns = gtod->wall_time_nsec + vgetns();
secs += gtod->wall_to_monotonic.tv_sec;
@@ -116,7 +130,7 @@ notrace static noinline int do_monotonic(struct timespec *ts)
ts->tv_sec = secs;
ts->tv_nsec = ns;
- return 0;
+ return mode;
}
notrace static noinline int do_realtime_coarse(struct timespec *ts)
@@ -156,14 +170,13 @@ notrace static noinline int do_monotonic_coarse(struct timespec *ts)
notrace int __vdso_clock_gettime(clockid_t clock, struct timespec *ts)
{
+ int ret = VCLOCK_NONE;
switch (clock) {
case CLOCK_REALTIME:
- if (likely(gtod->clock.vclock_mode != VCLOCK_NONE))
- return do_realtime(ts);
+ ret = do_realtime(ts);
break;
case CLOCK_MONOTONIC:
- if (likely(gtod->clock.vclock_mode != VCLOCK_NONE))
- return do_monotonic(ts);
+ ret = do_monotonic(ts);
break;
case CLOCK_REALTIME_COARSE:
return do_realtime_coarse(ts);
@@ -171,32 +184,32 @@ notrace int __vdso_clock_gettime(clockid_t clock, struct timespec *ts)
return do_monotonic_coarse(ts);
}
- return vdso_fallback_gettime(clock, ts);
+ if (ret == VCLOCK_NONE)
+ return vdso_fallback_gettime(clock, ts);
+ return 0;
}
int clock_gettime(clockid_t, struct timespec *)
__attribute__((weak, alias("__vdso_clock_gettime")));
notrace int __vdso_gettimeofday(struct timeval *tv, struct timezone *tz)
{
- long ret;
- if (likely(gtod->clock.vclock_mode != VCLOCK_NONE)) {
- if (likely(tv != NULL)) {
- BUILD_BUG_ON(offsetof(struct timeval, tv_usec) !=
- offsetof(struct timespec, tv_nsec) ||
- sizeof(*tv) != sizeof(struct timespec));
- do_realtime((struct timespec *)tv);
- tv->tv_usec /= 1000;
- }
- if (unlikely(tz != NULL)) {
- /* Avoid memcpy. Some old compilers fail to inline it */
- tz->tz_minuteswest = gtod->sys_tz.tz_minuteswest;
- tz->tz_dsttime = gtod->sys_tz.tz_dsttime;
- }
- return 0;
+ long ret = VCLOCK_NONE;
+ if (likely(tv != NULL)) {
+ BUILD_BUG_ON(offsetof(struct timeval, tv_usec) !=
+ offsetof(struct timespec, tv_nsec) ||
+ sizeof(*tv) != sizeof(struct timespec));
+ ret = do_realtime((struct timespec *)tv);
+ tv->tv_usec /= 1000;
}
- asm("syscall" : "=a" (ret) :
- "0" (__NR_gettimeofday), "D" (tv), "S" (tz) : "memory");
- return ret;
+ if (unlikely(tz != NULL)) {
+ /* Avoid memcpy. Some old compilers fail to inline it */
+ tz->tz_minuteswest = gtod->sys_tz.tz_minuteswest;
+ tz->tz_dsttime = gtod->sys_tz.tz_dsttime;
+ }
+
+ if (ret == VCLOCK_NONE)
+ return vdso_fallback_gtod(tv, tz);
+ return 0;
}
int gettimeofday(struct timeval *, struct timezone *)
__attribute__((weak, alias("__vdso_gettimeofday")));
--
1.7.3.2.146.gca209
^ permalink raw reply related [flat|nested] 9+ messages in thread
* [PATCH 2/2] time: Fix change_clocksource locking
2012-03-14 23:58 [PATCH 0/2] time: Fix races at clocksource switch time John Stultz
2012-03-14 23:58 ` [PATCH 1/2] time: x86: Fix race switching from vsyscall to non-vsyscall clock John Stultz
@ 2012-03-14 23:58 ` John Stultz
1 sibling, 0 replies; 9+ messages in thread
From: John Stultz @ 2012-03-14 23:58 UTC (permalink / raw)
To: linux-kernel; +Cc: John Stultz, Andy Lutomirski, Thomas Gleixner
change_clocksource() fails to grab locks or call timekeeping_update(),
which leaves a race window for time inconsistencies.
This adds proper locking and a call to timekeeping_update() to fix this.
CC: Andy Lutomirski <luto@amacapital.net>
CC: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: John Stultz <john.stultz@linaro.org>
---
kernel/time/timekeeping.c | 7 +++++++
1 files changed, 7 insertions(+), 0 deletions(-)
diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index 403c2a0..b53da5e 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -448,9 +448,12 @@ EXPORT_SYMBOL(timekeeping_inject_offset);
static int change_clocksource(void *data)
{
struct clocksource *new, *old;
+ unsigned long flags;
new = (struct clocksource *) data;
+ write_seqlock_irqsave(&timekeeper.lock, flags);
+
timekeeping_forward_now();
if (!new->enable || new->enable(new) == 0) {
old = timekeeper.clock;
@@ -458,6 +461,10 @@ static int change_clocksource(void *data)
if (old->disable)
old->disable(old);
}
+ timekeeping_update(true);
+
+ write_sequnlock_irqrestore(&timekeeper.lock, flags);
+
return 0;
}
--
1.7.3.2.146.gca209
^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH 1/2] time: x86: Fix race switching from vsyscall to non-vsyscall clock
2012-03-14 23:58 ` [PATCH 1/2] time: x86: Fix race switching from vsyscall to non-vsyscall clock John Stultz
@ 2012-03-15 0:34 ` Thomas Gleixner
2012-03-15 0:42 ` John Stultz
0 siblings, 1 reply; 9+ messages in thread
From: Thomas Gleixner @ 2012-03-15 0:34 UTC (permalink / raw)
To: John Stultz; +Cc: linux-kernel, Andy Lutomirski
On Wed, 14 Mar 2012, John Stultz wrote:
> notrace static noinline int do_realtime(struct timespec *ts)
> {
> unsigned long seq, ns;
> + int mode;
Please keep a newline between declarations and code.
> do {
> seq = read_seqbegin(>od->lock);
> + mode = gtod->clock.vclock_mode;
> ts->tv_sec = gtod->wall_time_sec;
> ts->tv_nsec = gtod->wall_time_nsec;
> ns = vgetns();
> } while (unlikely(read_seqretry(>od->lock, seq)));
> +
> timespec_add_ns(ts, ns);
> - return 0;
> + return mode;
> }
>
> notrace static noinline int do_monotonic(struct timespec *ts)
> {
> unsigned long seq, ns, secs;
> + int mode;
etc. etc.
> do {
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH 1/2] time: x86: Fix race switching from vsyscall to non-vsyscall clock
2012-03-15 0:34 ` Thomas Gleixner
@ 2012-03-15 0:42 ` John Stultz
2012-03-15 1:43 ` Andy Lutomirski
0 siblings, 1 reply; 9+ messages in thread
From: John Stultz @ 2012-03-15 0:42 UTC (permalink / raw)
To: Thomas Gleixner; +Cc: linux-kernel, Andy Lutomirski
On 03/14/2012 05:34 PM, Thomas Gleixner wrote:
> On Wed, 14 Mar 2012, John Stultz wrote:
>> notrace static noinline int do_realtime(struct timespec *ts)
>> {
>> unsigned long seq, ns;
>> + int mode;
> Please keep a newline between declarations and code.
Fixed below. Thanks!
(Let me know if you see whitespace damage, I switched mail clients today
and am learning the quirks here.)
-john
When switching from a vsyscall capable to a non-vsyscall capable
clocksource, there was a small race, where the last vsyscall
gettimeofday before the switch might return a invalid time value
using the new non-vsyscall enabled clocksource values after the
switch is complete.
This is due to the vsyscall code checking the vclock_mode once
outside of the seqcount protected section. After it reads the
vclock mode, it doesn't re-check that the sampled clock data
that is obtained in the seqcount critical section still matches.
The fix is to sample vclock_mode inside the protected section,
and as long as it isn't VCLOCK_NONE, return the calculated
value. If it has changed and is now VCLOCK_NONE, fall back
to the syscall gettime calculation.
v2:
* Cleanup checks as suggested by tglx
* Also fix same issue present in gettimeofday path
CC: Andy Lutomirski<luto@amacapital.net>
CC: Thomas Gleixner<tglx@linutronix.de>
Signed-off-by: John Stultz<john.stultz@linaro.org>
---
arch/x86/vdso/vclock_gettime.c | 68 +++++++++++++++++++++++++--------------
1 files changed, 43 insertions(+), 25 deletions(-)
diff --git a/arch/x86/vdso/vclock_gettime.c b/arch/x86/vdso/vclock_gettime.c
index 6bc0e72..e5ba922 100644
--- a/arch/x86/vdso/vclock_gettime.c
+++ b/arch/x86/vdso/vclock_gettime.c
@@ -70,6 +70,16 @@ notrace static long vdso_fallback_gettime(long clock, struct timespec *ts)
return ret;
}
+notrace static long vdso_fallback_gtod(struct timeval *tv, struct timezone *tz)
+{
+ long ret;
+
+ asm("syscall" : "=a" (ret) :
+ "0" (__NR_gettimeofday), "D" (tv), "S" (tz) : "memory");
+ return ret;
+}
+
+
notrace static inline long vgetns(void)
{
long v;
@@ -85,21 +95,28 @@ notrace static inline long vgetns(void)
notrace static noinline int do_realtime(struct timespec *ts)
{
unsigned long seq, ns;
+ int mode;
+
do {
seq = read_seqbegin(>od->lock);
+ mode = gtod->clock.vclock_mode;
ts->tv_sec = gtod->wall_time_sec;
ts->tv_nsec = gtod->wall_time_nsec;
ns = vgetns();
} while (unlikely(read_seqretry(>od->lock, seq)));
+
timespec_add_ns(ts, ns);
- return 0;
+ return mode;
}
notrace static noinline int do_monotonic(struct timespec *ts)
{
unsigned long seq, ns, secs;
+ int mode;
+
do {
seq = read_seqbegin(>od->lock);
+ mode = gtod->clock.vclock_mode;
secs = gtod->wall_time_sec;
ns = gtod->wall_time_nsec + vgetns();
secs += gtod->wall_to_monotonic.tv_sec;
@@ -116,7 +133,7 @@ notrace static noinline int do_monotonic(struct timespec *ts)
ts->tv_sec = secs;
ts->tv_nsec = ns;
- return 0;
+ return mode;
}
notrace static noinline int do_realtime_coarse(struct timespec *ts)
@@ -156,14 +173,14 @@ notrace static noinline int do_monotonic_coarse(struct timespec *ts)
notrace int __vdso_clock_gettime(clockid_t clock, struct timespec *ts)
{
+ int ret = VCLOCK_NONE;
+
switch (clock) {
case CLOCK_REALTIME:
- if (likely(gtod->clock.vclock_mode != VCLOCK_NONE))
- return do_realtime(ts);
+ ret = do_realtime(ts);
break;
case CLOCK_MONOTONIC:
- if (likely(gtod->clock.vclock_mode != VCLOCK_NONE))
- return do_monotonic(ts);
+ ret = do_monotonic(ts);
break;
case CLOCK_REALTIME_COARSE:
return do_realtime_coarse(ts);
@@ -171,32 +188,33 @@ notrace int __vdso_clock_gettime(clockid_t clock, struct timespec *ts)
return do_monotonic_coarse(ts);
}
- return vdso_fallback_gettime(clock, ts);
+ if (ret == VCLOCK_NONE)
+ return vdso_fallback_gettime(clock, ts);
+ return 0;
}
int clock_gettime(clockid_t, struct timespec *)
__attribute__((weak, alias("__vdso_clock_gettime")));
notrace int __vdso_gettimeofday(struct timeval *tv, struct timezone *tz)
{
- long ret;
- if (likely(gtod->clock.vclock_mode != VCLOCK_NONE)) {
- if (likely(tv != NULL)) {
- BUILD_BUG_ON(offsetof(struct timeval, tv_usec) !=
- offsetof(struct timespec, tv_nsec) ||
- sizeof(*tv) != sizeof(struct timespec));
- do_realtime((struct timespec *)tv);
- tv->tv_usec /= 1000;
- }
- if (unlikely(tz != NULL)) {
- /* Avoid memcpy. Some old compilers fail to inline it */
- tz->tz_minuteswest = gtod->sys_tz.tz_minuteswest;
- tz->tz_dsttime = gtod->sys_tz.tz_dsttime;
- }
- return 0;
+ long ret = VCLOCK_NONE;
+
+ if (likely(tv != NULL)) {
+ BUILD_BUG_ON(offsetof(struct timeval, tv_usec) !=
+ offsetof(struct timespec, tv_nsec) ||
+ sizeof(*tv) != sizeof(struct timespec));
+ ret = do_realtime((struct timespec *)tv);
+ tv->tv_usec /= 1000;
}
- asm("syscall" : "=a" (ret) :
- "0" (__NR_gettimeofday), "D" (tv), "S" (tz) : "memory");
- return ret;
+ if (unlikely(tz != NULL)) {
+ /* Avoid memcpy. Some old compilers fail to inline it */
+ tz->tz_minuteswest = gtod->sys_tz.tz_minuteswest;
+ tz->tz_dsttime = gtod->sys_tz.tz_dsttime;
+ }
+
+ if (ret == VCLOCK_NONE)
+ return vdso_fallback_gtod(tv, tz);
+ return 0;
}
int gettimeofday(struct timeval *, struct timezone *)
__attribute__((weak, alias("__vdso_gettimeofday")));
--
1.7.3.2.146.gca209
^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH 1/2] time: x86: Fix race switching from vsyscall to non-vsyscall clock
2012-03-15 0:42 ` John Stultz
@ 2012-03-15 1:43 ` Andy Lutomirski
2012-03-15 1:46 ` Andy Lutomirski
2012-03-15 20:18 ` John Stultz
0 siblings, 2 replies; 9+ messages in thread
From: Andy Lutomirski @ 2012-03-15 1:43 UTC (permalink / raw)
To: John Stultz; +Cc: Thomas Gleixner, linux-kernel
On Wed, Mar 14, 2012 at 5:42 PM, John Stultz <john.stultz@linaro.org> wrote:
> On 03/14/2012 05:34 PM, Thomas Gleixner wrote:
>>
>> On Wed, 14 Mar 2012, John Stultz wrote:
>>>
>>> notrace static noinline int do_realtime(struct timespec *ts)
>>> {
>>> unsigned long seq, ns;
>>> + int mode;
>>
>> Please keep a newline between declarations and code.
>
>
> Fixed below. Thanks!
> (Let me know if you see whitespace damage, I switched mail clients today and
> am learning the quirks here.)
> -john
>
>
>
> When switching from a vsyscall capable to a non-vsyscall capable
> clocksource, there was a small race, where the last vsyscall
> gettimeofday before the switch might return a invalid time value
> using the new non-vsyscall enabled clocksource values after the
> switch is complete.
>
> This is due to the vsyscall code checking the vclock_mode once
> outside of the seqcount protected section. After it reads the
> vclock mode, it doesn't re-check that the sampled clock data
> that is obtained in the seqcount critical section still matches.
>
> The fix is to sample vclock_mode inside the protected section,
> and as long as it isn't VCLOCK_NONE, return the calculated
> value. If it has changed and is now VCLOCK_NONE, fall back
> to the syscall gettime calculation.
>
> v2:
> * Cleanup checks as suggested by tglx
> * Also fix same issue present in gettimeofday path
>
> CC: Andy Lutomirski<luto@amacapital.net>
> CC: Thomas Gleixner<tglx@linutronix.de>
> Signed-off-by: John Stultz<john.stultz@linaro.org>
> ---
> arch/x86/vdso/vclock_gettime.c | 68
> +++++++++++++++++++++++++--------------
> 1 files changed, 43 insertions(+), 25 deletions(-)
>
Looks reasonable to me. I like this approach better than the earlier
way -- it's likely to cause less slowdown in the VCLOCK_TSC case.
That being said, I think you might have a bug:
notrace static inline long vgetns(void)
{
long v;
cycles_t cycles;
if (gtod->clock.vclock_mode == VCLOCK_TSC)
cycles = vread_tsc();
else
cycles = vread_hpet();
v = (cycles - gtod->clock.cycle_last) & gtod->clock.mask;
return (v * gtod->clock.mult) >> gtod->clock.shift;
}
In the VCLOCK_NONE, you'll access the hpet mapping. But in
hpet_enable, hpet_set_mapping isn't called and this will crash, I
think.
--Andy
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH 1/2] time: x86: Fix race switching from vsyscall to non-vsyscall clock
2012-03-15 1:43 ` Andy Lutomirski
@ 2012-03-15 1:46 ` Andy Lutomirski
2012-03-15 20:18 ` John Stultz
1 sibling, 0 replies; 9+ messages in thread
From: Andy Lutomirski @ 2012-03-15 1:46 UTC (permalink / raw)
To: John Stultz; +Cc: Thomas Gleixner, linux-kernel
On Wed, Mar 14, 2012 at 6:43 PM, Andy Lutomirski <luto@amacapital.net> wrote:
> On Wed, Mar 14, 2012 at 5:42 PM, John Stultz <john.stultz@linaro.org> wrote:
>> On 03/14/2012 05:34 PM, Thomas Gleixner wrote:
>>>
>>> On Wed, 14 Mar 2012, John Stultz wrote:
>>>>
>>>> notrace static noinline int do_realtime(struct timespec *ts)
>>>> {
>>>> unsigned long seq, ns;
>>>> + int mode;
>>>
>>> Please keep a newline between declarations and code.
>>
>>
>> Fixed below. Thanks!
>> (Let me know if you see whitespace damage, I switched mail clients today and
>> am learning the quirks here.)
>> -john
>>
>>
>>
>> When switching from a vsyscall capable to a non-vsyscall capable
>> clocksource, there was a small race, where the last vsyscall
>> gettimeofday before the switch might return a invalid time value
>> using the new non-vsyscall enabled clocksource values after the
>> switch is complete.
>>
>> This is due to the vsyscall code checking the vclock_mode once
>> outside of the seqcount protected section. After it reads the
>> vclock mode, it doesn't re-check that the sampled clock data
>> that is obtained in the seqcount critical section still matches.
>>
>> The fix is to sample vclock_mode inside the protected section,
>> and as long as it isn't VCLOCK_NONE, return the calculated
>> value. If it has changed and is now VCLOCK_NONE, fall back
>> to the syscall gettime calculation.
>>
>> v2:
>> * Cleanup checks as suggested by tglx
>> * Also fix same issue present in gettimeofday path
>>
>> CC: Andy Lutomirski<luto@amacapital.net>
>> CC: Thomas Gleixner<tglx@linutronix.de>
>> Signed-off-by: John Stultz<john.stultz@linaro.org>
>> ---
>> arch/x86/vdso/vclock_gettime.c | 68
>> +++++++++++++++++++++++++--------------
>> 1 files changed, 43 insertions(+), 25 deletions(-)
>>
>
> Looks reasonable to me. I like this approach better than the earlier
> way -- it's likely to cause less slowdown in the VCLOCK_TSC case.
>
> That being said, I think you might have a bug:
>
> notrace static inline long vgetns(void)
> {
> long v;
> cycles_t cycles;
> if (gtod->clock.vclock_mode == VCLOCK_TSC)
> cycles = vread_tsc();
> else
> cycles = vread_hpet();
> v = (cycles - gtod->clock.cycle_last) & gtod->clock.mask;
> return (v * gtod->clock.mult) >> gtod->clock.shift;
> }
>
> In the VCLOCK_NONE, you'll access the hpet mapping. But in
> hpet_enable, hpet_set_mapping isn't called and this will crash, I
> think.
>
One way to fix it would be to unconditionally map the page. Then
there's no performance loss for the tsc case.
--Andy
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH 1/2] time: x86: Fix race switching from vsyscall to non-vsyscall clock
2012-03-15 1:43 ` Andy Lutomirski
2012-03-15 1:46 ` Andy Lutomirski
@ 2012-03-15 20:18 ` John Stultz
2012-03-15 21:01 ` Andy Lutomirski
1 sibling, 1 reply; 9+ messages in thread
From: John Stultz @ 2012-03-15 20:18 UTC (permalink / raw)
To: Andy Lutomirski; +Cc: Thomas Gleixner, linux-kernel
On 03/14/2012 06:43 PM, Andy Lutomirski wrote:
> On Wed, Mar 14, 2012 at 5:42 PM, John Stultz<john.stultz@linaro.org> wrote:
>> On 03/14/2012 05:34 PM, Thomas Gleixner wrote:
>>> On Wed, 14 Mar 2012, John Stultz wrote:
>>>> notrace static noinline int do_realtime(struct timespec *ts)
>>>> {
>>>> unsigned long seq, ns;
>>>> + int mode;
>>> Please keep a newline between declarations and code.
>>
>> Fixed below. Thanks!
>> (Let me know if you see whitespace damage, I switched mail clients today and
>> am learning the quirks here.)
>> -john
>>
>>
>>
>> When switching from a vsyscall capable to a non-vsyscall capable
>> clocksource, there was a small race, where the last vsyscall
>> gettimeofday before the switch might return a invalid time value
>> using the new non-vsyscall enabled clocksource values after the
>> switch is complete.
>>
>> This is due to the vsyscall code checking the vclock_mode once
>> outside of the seqcount protected section. After it reads the
>> vclock mode, it doesn't re-check that the sampled clock data
>> that is obtained in the seqcount critical section still matches.
>>
>> The fix is to sample vclock_mode inside the protected section,
>> and as long as it isn't VCLOCK_NONE, return the calculated
>> value. If it has changed and is now VCLOCK_NONE, fall back
>> to the syscall gettime calculation.
>>
>> v2:
>> * Cleanup checks as suggested by tglx
>> * Also fix same issue present in gettimeofday path
>>
>> CC: Andy Lutomirski<luto@amacapital.net>
>> CC: Thomas Gleixner<tglx@linutronix.de>
>> Signed-off-by: John Stultz<john.stultz@linaro.org>
>> ---
>> arch/x86/vdso/vclock_gettime.c | 68
>> +++++++++++++++++++++++++--------------
>> 1 files changed, 43 insertions(+), 25 deletions(-)
>>
> Looks reasonable to me. I like this approach better than the earlier
> way -- it's likely to cause less slowdown in the VCLOCK_TSC case.
>
> That being said, I think you might have a bug:
>
> notrace static inline long vgetns(void)
> {
> long v;
> cycles_t cycles;
> if (gtod->clock.vclock_mode == VCLOCK_TSC)
> cycles = vread_tsc();
> else
> cycles = vread_hpet();
> v = (cycles - gtod->clock.cycle_last)& gtod->clock.mask;
> return (v * gtod->clock.mult)>> gtod->clock.shift;
> }
>
> In the VCLOCK_NONE, you'll access the hpet mapping. But in
> hpet_enable, hpet_set_mapping isn't called and this will crash, I
> think.
Thanks for catching this!
My solution is to add:
else if (gtod->clock.vclock_mode == VCLOCK_HPET)
cycles = vread_hpet();
else
return 0;
Let me know if this works for you.
thanks
-john
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH 1/2] time: x86: Fix race switching from vsyscall to non-vsyscall clock
2012-03-15 20:18 ` John Stultz
@ 2012-03-15 21:01 ` Andy Lutomirski
0 siblings, 0 replies; 9+ messages in thread
From: Andy Lutomirski @ 2012-03-15 21:01 UTC (permalink / raw)
To: John Stultz; +Cc: Thomas Gleixner, linux-kernel
On Thu, Mar 15, 2012 at 1:18 PM, John Stultz <john.stultz@linaro.org> wrote:
> On 03/14/2012 06:43 PM, Andy Lutomirski wrote:
>>
>> On Wed, Mar 14, 2012 at 5:42 PM, John Stultz<john.stultz@linaro.org>
>> wrote:
>>>
>>> On 03/14/2012 05:34 PM, Thomas Gleixner wrote:
>>>>
>>>> On Wed, 14 Mar 2012, John Stultz wrote:
>>>>>
>>>>> notrace static noinline int do_realtime(struct timespec *ts)
>>>>> {
>>>>> unsigned long seq, ns;
>>>>> + int mode;
>>>>
>>>> Please keep a newline between declarations and code.
>>>
>>>
>>> Fixed below. Thanks!
>>> (Let me know if you see whitespace damage, I switched mail clients today
>>> and
>>> am learning the quirks here.)
>>> -john
>>>
>>>
>>>
>>> When switching from a vsyscall capable to a non-vsyscall capable
>>> clocksource, there was a small race, where the last vsyscall
>>> gettimeofday before the switch might return a invalid time value
>>> using the new non-vsyscall enabled clocksource values after the
>>> switch is complete.
>>>
>>> This is due to the vsyscall code checking the vclock_mode once
>>> outside of the seqcount protected section. After it reads the
>>> vclock mode, it doesn't re-check that the sampled clock data
>>> that is obtained in the seqcount critical section still matches.
>>>
>>> The fix is to sample vclock_mode inside the protected section,
>>> and as long as it isn't VCLOCK_NONE, return the calculated
>>> value. If it has changed and is now VCLOCK_NONE, fall back
>>> to the syscall gettime calculation.
>>>
>>> v2:
>>> * Cleanup checks as suggested by tglx
>>> * Also fix same issue present in gettimeofday path
>>>
>>> CC: Andy Lutomirski<luto@amacapital.net>
>>> CC: Thomas Gleixner<tglx@linutronix.de>
>>> Signed-off-by: John Stultz<john.stultz@linaro.org>
>>> ---
>>> arch/x86/vdso/vclock_gettime.c | 68
>>> +++++++++++++++++++++++++--------------
>>> 1 files changed, 43 insertions(+), 25 deletions(-)
>>>
>> Looks reasonable to me. I like this approach better than the earlier
>> way -- it's likely to cause less slowdown in the VCLOCK_TSC case.
>>
>> That being said, I think you might have a bug:
>>
>> notrace static inline long vgetns(void)
>> {
>> long v;
>> cycles_t cycles;
>> if (gtod->clock.vclock_mode == VCLOCK_TSC)
>> cycles = vread_tsc();
>> else
>> cycles = vread_hpet();
>> v = (cycles - gtod->clock.cycle_last)& gtod->clock.mask;
>>
>> return (v * gtod->clock.mult)>> gtod->clock.shift;
>> }
>>
>> In the VCLOCK_NONE, you'll access the hpet mapping. But in
>> hpet_enable, hpet_set_mapping isn't called and this will crash, I
>> think.
>
> Thanks for catching this!
>
> My solution is to add:
>
> else if (gtod->clock.vclock_mode == VCLOCK_HPET)
> cycles = vread_hpet();
> else
> return 0;
>
> Let me know if this works for you.
I think that's much better than my poorly-thought-out suggestion.
Accessing the hpet mapping is really slow, so avoiding it if the hpet
is disabled is a good idea. And the extra branches shouldn't penalize
the tsc case (which is the only fast case) unless the compiler does
something silly.
--Andy
>
> thanks
> -john
>
>
--
Andy Lutomirski
AMA Capital Management, LLC
Office: (310) 553-5322
Mobile: (650) 906-0647
^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2012-03-15 21:02 UTC | newest]
Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2012-03-14 23:58 [PATCH 0/2] time: Fix races at clocksource switch time John Stultz
2012-03-14 23:58 ` [PATCH 1/2] time: x86: Fix race switching from vsyscall to non-vsyscall clock John Stultz
2012-03-15 0:34 ` Thomas Gleixner
2012-03-15 0:42 ` John Stultz
2012-03-15 1:43 ` Andy Lutomirski
2012-03-15 1:46 ` Andy Lutomirski
2012-03-15 20:18 ` John Stultz
2012-03-15 21:01 ` Andy Lutomirski
2012-03-14 23:58 ` [PATCH 2/2] time: Fix change_clocksource locking John Stultz
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.