* [PATCH 0/2] time: Fix races at clocksource switch time @ 2012-03-14 23:58 John Stultz 2012-03-14 23:58 ` [PATCH 1/2] time: x86: Fix race switching from vsyscall to non-vsyscall clock John Stultz 2012-03-14 23:58 ` [PATCH 2/2] time: Fix change_clocksource locking John Stultz 0 siblings, 2 replies; 9+ messages in thread From: John Stultz @ 2012-03-14 23:58 UTC (permalink / raw) To: linux-kernel; +Cc: John Stultz, Andy Lutomirski, Thomas Gleixner In testing some recent timekeeping changes, I found some problem when changing clocksources. These two patches close races at clocksource switch time. One I sent out earlier, and have reworked to integrate Thomas' comments. These two patches, along with Thomas' recent cleanup patches ontop can be found here: git://git.linaro.org/people/jstultz/linux.git fortglx/3.4/time CC: Andy Lutomirski <luto@amacapital.net> CC: Thomas Gleixner <tglx@linutronix.de> John Stultz (2): time: x86: Fix race switching from vsyscall to non-vsyscall clock time: Fix change_clocksource locking arch/x86/vdso/vclock_gettime.c | 63 ++++++++++++++++++++++++---------------- kernel/time/timekeeping.c | 7 ++++ 2 files changed, 45 insertions(+), 25 deletions(-) -- 1.7.3.2.146.gca209 ^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH 1/2] time: x86: Fix race switching from vsyscall to non-vsyscall clock 2012-03-14 23:58 [PATCH 0/2] time: Fix races at clocksource switch time John Stultz @ 2012-03-14 23:58 ` John Stultz 2012-03-15 0:34 ` Thomas Gleixner 2012-03-14 23:58 ` [PATCH 2/2] time: Fix change_clocksource locking John Stultz 1 sibling, 1 reply; 9+ messages in thread From: John Stultz @ 2012-03-14 23:58 UTC (permalink / raw) To: linux-kernel; +Cc: John Stultz, Andy Lutomirski, Thomas Gleixner When switching from a vsyscall capable to a non-vsyscall capable clocksource, there was a small race, where the last vsyscall gettimeofday before the switch might return a invalid time value using the new non-vsyscall enabled clocksource values after the switch is complete. This is due to the vsyscall code checking the vclock_mode once outside of the seqcount protected section. After it reads the vclock mode, it doesn't re-check that the sampled clock data that is obtained in the seqcount critical section still matches. The fix is to sample vclock_mode inside the protected section, and as long as it isn't VCLOCK_NONE, return the calculated value. If it has changed and is now VCLOCK_NONE, fall back to the syscall gettime calculation. v2: * Cleanup checks as suggested by tglx * Also fix same issue present in gettimeofday path CC: Andy Lutomirski <luto@amacapital.net> CC: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: John Stultz <john.stultz@linaro.org> --- arch/x86/vdso/vclock_gettime.c | 63 ++++++++++++++++++++++++---------------- 1 files changed, 38 insertions(+), 25 deletions(-) diff --git a/arch/x86/vdso/vclock_gettime.c b/arch/x86/vdso/vclock_gettime.c index 6bc0e72..6c93209 100644 --- a/arch/x86/vdso/vclock_gettime.c +++ b/arch/x86/vdso/vclock_gettime.c @@ -70,6 +70,15 @@ notrace static long vdso_fallback_gettime(long clock, struct timespec *ts) return ret; } +notrace static long vdso_fallback_gtod(struct timeval *tv, struct timezone *tz) +{ + long ret; + asm("syscall" : "=a" (ret) : + "0" (__NR_gettimeofday), "D" (tv), "S" (tz) : "memory"); + return ret; +} + + notrace static inline long vgetns(void) { long v; @@ -85,21 +94,26 @@ notrace static inline long vgetns(void) notrace static noinline int do_realtime(struct timespec *ts) { unsigned long seq, ns; + int mode; do { seq = read_seqbegin(>od->lock); + mode = gtod->clock.vclock_mode; ts->tv_sec = gtod->wall_time_sec; ts->tv_nsec = gtod->wall_time_nsec; ns = vgetns(); } while (unlikely(read_seqretry(>od->lock, seq))); + timespec_add_ns(ts, ns); - return 0; + return mode; } notrace static noinline int do_monotonic(struct timespec *ts) { unsigned long seq, ns, secs; + int mode; do { seq = read_seqbegin(>od->lock); + mode = gtod->clock.vclock_mode; secs = gtod->wall_time_sec; ns = gtod->wall_time_nsec + vgetns(); secs += gtod->wall_to_monotonic.tv_sec; @@ -116,7 +130,7 @@ notrace static noinline int do_monotonic(struct timespec *ts) ts->tv_sec = secs; ts->tv_nsec = ns; - return 0; + return mode; } notrace static noinline int do_realtime_coarse(struct timespec *ts) @@ -156,14 +170,13 @@ notrace static noinline int do_monotonic_coarse(struct timespec *ts) notrace int __vdso_clock_gettime(clockid_t clock, struct timespec *ts) { + int ret = VCLOCK_NONE; switch (clock) { case CLOCK_REALTIME: - if (likely(gtod->clock.vclock_mode != VCLOCK_NONE)) - return do_realtime(ts); + ret = do_realtime(ts); break; case CLOCK_MONOTONIC: - if (likely(gtod->clock.vclock_mode != VCLOCK_NONE)) - return do_monotonic(ts); + ret = do_monotonic(ts); break; case CLOCK_REALTIME_COARSE: return do_realtime_coarse(ts); @@ -171,32 +184,32 @@ notrace int __vdso_clock_gettime(clockid_t clock, struct timespec *ts) return do_monotonic_coarse(ts); } - return vdso_fallback_gettime(clock, ts); + if (ret == VCLOCK_NONE) + return vdso_fallback_gettime(clock, ts); + return 0; } int clock_gettime(clockid_t, struct timespec *) __attribute__((weak, alias("__vdso_clock_gettime"))); notrace int __vdso_gettimeofday(struct timeval *tv, struct timezone *tz) { - long ret; - if (likely(gtod->clock.vclock_mode != VCLOCK_NONE)) { - if (likely(tv != NULL)) { - BUILD_BUG_ON(offsetof(struct timeval, tv_usec) != - offsetof(struct timespec, tv_nsec) || - sizeof(*tv) != sizeof(struct timespec)); - do_realtime((struct timespec *)tv); - tv->tv_usec /= 1000; - } - if (unlikely(tz != NULL)) { - /* Avoid memcpy. Some old compilers fail to inline it */ - tz->tz_minuteswest = gtod->sys_tz.tz_minuteswest; - tz->tz_dsttime = gtod->sys_tz.tz_dsttime; - } - return 0; + long ret = VCLOCK_NONE; + if (likely(tv != NULL)) { + BUILD_BUG_ON(offsetof(struct timeval, tv_usec) != + offsetof(struct timespec, tv_nsec) || + sizeof(*tv) != sizeof(struct timespec)); + ret = do_realtime((struct timespec *)tv); + tv->tv_usec /= 1000; } - asm("syscall" : "=a" (ret) : - "0" (__NR_gettimeofday), "D" (tv), "S" (tz) : "memory"); - return ret; + if (unlikely(tz != NULL)) { + /* Avoid memcpy. Some old compilers fail to inline it */ + tz->tz_minuteswest = gtod->sys_tz.tz_minuteswest; + tz->tz_dsttime = gtod->sys_tz.tz_dsttime; + } + + if (ret == VCLOCK_NONE) + return vdso_fallback_gtod(tv, tz); + return 0; } int gettimeofday(struct timeval *, struct timezone *) __attribute__((weak, alias("__vdso_gettimeofday"))); -- 1.7.3.2.146.gca209 ^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH 1/2] time: x86: Fix race switching from vsyscall to non-vsyscall clock 2012-03-14 23:58 ` [PATCH 1/2] time: x86: Fix race switching from vsyscall to non-vsyscall clock John Stultz @ 2012-03-15 0:34 ` Thomas Gleixner 2012-03-15 0:42 ` John Stultz 0 siblings, 1 reply; 9+ messages in thread From: Thomas Gleixner @ 2012-03-15 0:34 UTC (permalink / raw) To: John Stultz; +Cc: linux-kernel, Andy Lutomirski On Wed, 14 Mar 2012, John Stultz wrote: > notrace static noinline int do_realtime(struct timespec *ts) > { > unsigned long seq, ns; > + int mode; Please keep a newline between declarations and code. > do { > seq = read_seqbegin(>od->lock); > + mode = gtod->clock.vclock_mode; > ts->tv_sec = gtod->wall_time_sec; > ts->tv_nsec = gtod->wall_time_nsec; > ns = vgetns(); > } while (unlikely(read_seqretry(>od->lock, seq))); > + > timespec_add_ns(ts, ns); > - return 0; > + return mode; > } > > notrace static noinline int do_monotonic(struct timespec *ts) > { > unsigned long seq, ns, secs; > + int mode; etc. etc. > do { ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH 1/2] time: x86: Fix race switching from vsyscall to non-vsyscall clock 2012-03-15 0:34 ` Thomas Gleixner @ 2012-03-15 0:42 ` John Stultz 2012-03-15 1:43 ` Andy Lutomirski 0 siblings, 1 reply; 9+ messages in thread From: John Stultz @ 2012-03-15 0:42 UTC (permalink / raw) To: Thomas Gleixner; +Cc: linux-kernel, Andy Lutomirski On 03/14/2012 05:34 PM, Thomas Gleixner wrote: > On Wed, 14 Mar 2012, John Stultz wrote: >> notrace static noinline int do_realtime(struct timespec *ts) >> { >> unsigned long seq, ns; >> + int mode; > Please keep a newline between declarations and code. Fixed below. Thanks! (Let me know if you see whitespace damage, I switched mail clients today and am learning the quirks here.) -john When switching from a vsyscall capable to a non-vsyscall capable clocksource, there was a small race, where the last vsyscall gettimeofday before the switch might return a invalid time value using the new non-vsyscall enabled clocksource values after the switch is complete. This is due to the vsyscall code checking the vclock_mode once outside of the seqcount protected section. After it reads the vclock mode, it doesn't re-check that the sampled clock data that is obtained in the seqcount critical section still matches. The fix is to sample vclock_mode inside the protected section, and as long as it isn't VCLOCK_NONE, return the calculated value. If it has changed and is now VCLOCK_NONE, fall back to the syscall gettime calculation. v2: * Cleanup checks as suggested by tglx * Also fix same issue present in gettimeofday path CC: Andy Lutomirski<luto@amacapital.net> CC: Thomas Gleixner<tglx@linutronix.de> Signed-off-by: John Stultz<john.stultz@linaro.org> --- arch/x86/vdso/vclock_gettime.c | 68 +++++++++++++++++++++++++-------------- 1 files changed, 43 insertions(+), 25 deletions(-) diff --git a/arch/x86/vdso/vclock_gettime.c b/arch/x86/vdso/vclock_gettime.c index 6bc0e72..e5ba922 100644 --- a/arch/x86/vdso/vclock_gettime.c +++ b/arch/x86/vdso/vclock_gettime.c @@ -70,6 +70,16 @@ notrace static long vdso_fallback_gettime(long clock, struct timespec *ts) return ret; } +notrace static long vdso_fallback_gtod(struct timeval *tv, struct timezone *tz) +{ + long ret; + + asm("syscall" : "=a" (ret) : + "0" (__NR_gettimeofday), "D" (tv), "S" (tz) : "memory"); + return ret; +} + + notrace static inline long vgetns(void) { long v; @@ -85,21 +95,28 @@ notrace static inline long vgetns(void) notrace static noinline int do_realtime(struct timespec *ts) { unsigned long seq, ns; + int mode; + do { seq = read_seqbegin(>od->lock); + mode = gtod->clock.vclock_mode; ts->tv_sec = gtod->wall_time_sec; ts->tv_nsec = gtod->wall_time_nsec; ns = vgetns(); } while (unlikely(read_seqretry(>od->lock, seq))); + timespec_add_ns(ts, ns); - return 0; + return mode; } notrace static noinline int do_monotonic(struct timespec *ts) { unsigned long seq, ns, secs; + int mode; + do { seq = read_seqbegin(>od->lock); + mode = gtod->clock.vclock_mode; secs = gtod->wall_time_sec; ns = gtod->wall_time_nsec + vgetns(); secs += gtod->wall_to_monotonic.tv_sec; @@ -116,7 +133,7 @@ notrace static noinline int do_monotonic(struct timespec *ts) ts->tv_sec = secs; ts->tv_nsec = ns; - return 0; + return mode; } notrace static noinline int do_realtime_coarse(struct timespec *ts) @@ -156,14 +173,14 @@ notrace static noinline int do_monotonic_coarse(struct timespec *ts) notrace int __vdso_clock_gettime(clockid_t clock, struct timespec *ts) { + int ret = VCLOCK_NONE; + switch (clock) { case CLOCK_REALTIME: - if (likely(gtod->clock.vclock_mode != VCLOCK_NONE)) - return do_realtime(ts); + ret = do_realtime(ts); break; case CLOCK_MONOTONIC: - if (likely(gtod->clock.vclock_mode != VCLOCK_NONE)) - return do_monotonic(ts); + ret = do_monotonic(ts); break; case CLOCK_REALTIME_COARSE: return do_realtime_coarse(ts); @@ -171,32 +188,33 @@ notrace int __vdso_clock_gettime(clockid_t clock, struct timespec *ts) return do_monotonic_coarse(ts); } - return vdso_fallback_gettime(clock, ts); + if (ret == VCLOCK_NONE) + return vdso_fallback_gettime(clock, ts); + return 0; } int clock_gettime(clockid_t, struct timespec *) __attribute__((weak, alias("__vdso_clock_gettime"))); notrace int __vdso_gettimeofday(struct timeval *tv, struct timezone *tz) { - long ret; - if (likely(gtod->clock.vclock_mode != VCLOCK_NONE)) { - if (likely(tv != NULL)) { - BUILD_BUG_ON(offsetof(struct timeval, tv_usec) != - offsetof(struct timespec, tv_nsec) || - sizeof(*tv) != sizeof(struct timespec)); - do_realtime((struct timespec *)tv); - tv->tv_usec /= 1000; - } - if (unlikely(tz != NULL)) { - /* Avoid memcpy. Some old compilers fail to inline it */ - tz->tz_minuteswest = gtod->sys_tz.tz_minuteswest; - tz->tz_dsttime = gtod->sys_tz.tz_dsttime; - } - return 0; + long ret = VCLOCK_NONE; + + if (likely(tv != NULL)) { + BUILD_BUG_ON(offsetof(struct timeval, tv_usec) != + offsetof(struct timespec, tv_nsec) || + sizeof(*tv) != sizeof(struct timespec)); + ret = do_realtime((struct timespec *)tv); + tv->tv_usec /= 1000; } - asm("syscall" : "=a" (ret) : - "0" (__NR_gettimeofday), "D" (tv), "S" (tz) : "memory"); - return ret; + if (unlikely(tz != NULL)) { + /* Avoid memcpy. Some old compilers fail to inline it */ + tz->tz_minuteswest = gtod->sys_tz.tz_minuteswest; + tz->tz_dsttime = gtod->sys_tz.tz_dsttime; + } + + if (ret == VCLOCK_NONE) + return vdso_fallback_gtod(tv, tz); + return 0; } int gettimeofday(struct timeval *, struct timezone *) __attribute__((weak, alias("__vdso_gettimeofday"))); -- 1.7.3.2.146.gca209 ^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH 1/2] time: x86: Fix race switching from vsyscall to non-vsyscall clock 2012-03-15 0:42 ` John Stultz @ 2012-03-15 1:43 ` Andy Lutomirski 2012-03-15 1:46 ` Andy Lutomirski 2012-03-15 20:18 ` John Stultz 0 siblings, 2 replies; 9+ messages in thread From: Andy Lutomirski @ 2012-03-15 1:43 UTC (permalink / raw) To: John Stultz; +Cc: Thomas Gleixner, linux-kernel On Wed, Mar 14, 2012 at 5:42 PM, John Stultz <john.stultz@linaro.org> wrote: > On 03/14/2012 05:34 PM, Thomas Gleixner wrote: >> >> On Wed, 14 Mar 2012, John Stultz wrote: >>> >>> notrace static noinline int do_realtime(struct timespec *ts) >>> { >>> unsigned long seq, ns; >>> + int mode; >> >> Please keep a newline between declarations and code. > > > Fixed below. Thanks! > (Let me know if you see whitespace damage, I switched mail clients today and > am learning the quirks here.) > -john > > > > When switching from a vsyscall capable to a non-vsyscall capable > clocksource, there was a small race, where the last vsyscall > gettimeofday before the switch might return a invalid time value > using the new non-vsyscall enabled clocksource values after the > switch is complete. > > This is due to the vsyscall code checking the vclock_mode once > outside of the seqcount protected section. After it reads the > vclock mode, it doesn't re-check that the sampled clock data > that is obtained in the seqcount critical section still matches. > > The fix is to sample vclock_mode inside the protected section, > and as long as it isn't VCLOCK_NONE, return the calculated > value. If it has changed and is now VCLOCK_NONE, fall back > to the syscall gettime calculation. > > v2: > * Cleanup checks as suggested by tglx > * Also fix same issue present in gettimeofday path > > CC: Andy Lutomirski<luto@amacapital.net> > CC: Thomas Gleixner<tglx@linutronix.de> > Signed-off-by: John Stultz<john.stultz@linaro.org> > --- > arch/x86/vdso/vclock_gettime.c | 68 > +++++++++++++++++++++++++-------------- > 1 files changed, 43 insertions(+), 25 deletions(-) > Looks reasonable to me. I like this approach better than the earlier way -- it's likely to cause less slowdown in the VCLOCK_TSC case. That being said, I think you might have a bug: notrace static inline long vgetns(void) { long v; cycles_t cycles; if (gtod->clock.vclock_mode == VCLOCK_TSC) cycles = vread_tsc(); else cycles = vread_hpet(); v = (cycles - gtod->clock.cycle_last) & gtod->clock.mask; return (v * gtod->clock.mult) >> gtod->clock.shift; } In the VCLOCK_NONE, you'll access the hpet mapping. But in hpet_enable, hpet_set_mapping isn't called and this will crash, I think. --Andy ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH 1/2] time: x86: Fix race switching from vsyscall to non-vsyscall clock 2012-03-15 1:43 ` Andy Lutomirski @ 2012-03-15 1:46 ` Andy Lutomirski 2012-03-15 20:18 ` John Stultz 1 sibling, 0 replies; 9+ messages in thread From: Andy Lutomirski @ 2012-03-15 1:46 UTC (permalink / raw) To: John Stultz; +Cc: Thomas Gleixner, linux-kernel On Wed, Mar 14, 2012 at 6:43 PM, Andy Lutomirski <luto@amacapital.net> wrote: > On Wed, Mar 14, 2012 at 5:42 PM, John Stultz <john.stultz@linaro.org> wrote: >> On 03/14/2012 05:34 PM, Thomas Gleixner wrote: >>> >>> On Wed, 14 Mar 2012, John Stultz wrote: >>>> >>>> notrace static noinline int do_realtime(struct timespec *ts) >>>> { >>>> unsigned long seq, ns; >>>> + int mode; >>> >>> Please keep a newline between declarations and code. >> >> >> Fixed below. Thanks! >> (Let me know if you see whitespace damage, I switched mail clients today and >> am learning the quirks here.) >> -john >> >> >> >> When switching from a vsyscall capable to a non-vsyscall capable >> clocksource, there was a small race, where the last vsyscall >> gettimeofday before the switch might return a invalid time value >> using the new non-vsyscall enabled clocksource values after the >> switch is complete. >> >> This is due to the vsyscall code checking the vclock_mode once >> outside of the seqcount protected section. After it reads the >> vclock mode, it doesn't re-check that the sampled clock data >> that is obtained in the seqcount critical section still matches. >> >> The fix is to sample vclock_mode inside the protected section, >> and as long as it isn't VCLOCK_NONE, return the calculated >> value. If it has changed and is now VCLOCK_NONE, fall back >> to the syscall gettime calculation. >> >> v2: >> * Cleanup checks as suggested by tglx >> * Also fix same issue present in gettimeofday path >> >> CC: Andy Lutomirski<luto@amacapital.net> >> CC: Thomas Gleixner<tglx@linutronix.de> >> Signed-off-by: John Stultz<john.stultz@linaro.org> >> --- >> arch/x86/vdso/vclock_gettime.c | 68 >> +++++++++++++++++++++++++-------------- >> 1 files changed, 43 insertions(+), 25 deletions(-) >> > > Looks reasonable to me. I like this approach better than the earlier > way -- it's likely to cause less slowdown in the VCLOCK_TSC case. > > That being said, I think you might have a bug: > > notrace static inline long vgetns(void) > { > long v; > cycles_t cycles; > if (gtod->clock.vclock_mode == VCLOCK_TSC) > cycles = vread_tsc(); > else > cycles = vread_hpet(); > v = (cycles - gtod->clock.cycle_last) & gtod->clock.mask; > return (v * gtod->clock.mult) >> gtod->clock.shift; > } > > In the VCLOCK_NONE, you'll access the hpet mapping. But in > hpet_enable, hpet_set_mapping isn't called and this will crash, I > think. > One way to fix it would be to unconditionally map the page. Then there's no performance loss for the tsc case. --Andy ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH 1/2] time: x86: Fix race switching from vsyscall to non-vsyscall clock 2012-03-15 1:43 ` Andy Lutomirski 2012-03-15 1:46 ` Andy Lutomirski @ 2012-03-15 20:18 ` John Stultz 2012-03-15 21:01 ` Andy Lutomirski 1 sibling, 1 reply; 9+ messages in thread From: John Stultz @ 2012-03-15 20:18 UTC (permalink / raw) To: Andy Lutomirski; +Cc: Thomas Gleixner, linux-kernel On 03/14/2012 06:43 PM, Andy Lutomirski wrote: > On Wed, Mar 14, 2012 at 5:42 PM, John Stultz<john.stultz@linaro.org> wrote: >> On 03/14/2012 05:34 PM, Thomas Gleixner wrote: >>> On Wed, 14 Mar 2012, John Stultz wrote: >>>> notrace static noinline int do_realtime(struct timespec *ts) >>>> { >>>> unsigned long seq, ns; >>>> + int mode; >>> Please keep a newline between declarations and code. >> >> Fixed below. Thanks! >> (Let me know if you see whitespace damage, I switched mail clients today and >> am learning the quirks here.) >> -john >> >> >> >> When switching from a vsyscall capable to a non-vsyscall capable >> clocksource, there was a small race, where the last vsyscall >> gettimeofday before the switch might return a invalid time value >> using the new non-vsyscall enabled clocksource values after the >> switch is complete. >> >> This is due to the vsyscall code checking the vclock_mode once >> outside of the seqcount protected section. After it reads the >> vclock mode, it doesn't re-check that the sampled clock data >> that is obtained in the seqcount critical section still matches. >> >> The fix is to sample vclock_mode inside the protected section, >> and as long as it isn't VCLOCK_NONE, return the calculated >> value. If it has changed and is now VCLOCK_NONE, fall back >> to the syscall gettime calculation. >> >> v2: >> * Cleanup checks as suggested by tglx >> * Also fix same issue present in gettimeofday path >> >> CC: Andy Lutomirski<luto@amacapital.net> >> CC: Thomas Gleixner<tglx@linutronix.de> >> Signed-off-by: John Stultz<john.stultz@linaro.org> >> --- >> arch/x86/vdso/vclock_gettime.c | 68 >> +++++++++++++++++++++++++-------------- >> 1 files changed, 43 insertions(+), 25 deletions(-) >> > Looks reasonable to me. I like this approach better than the earlier > way -- it's likely to cause less slowdown in the VCLOCK_TSC case. > > That being said, I think you might have a bug: > > notrace static inline long vgetns(void) > { > long v; > cycles_t cycles; > if (gtod->clock.vclock_mode == VCLOCK_TSC) > cycles = vread_tsc(); > else > cycles = vread_hpet(); > v = (cycles - gtod->clock.cycle_last)& gtod->clock.mask; > return (v * gtod->clock.mult)>> gtod->clock.shift; > } > > In the VCLOCK_NONE, you'll access the hpet mapping. But in > hpet_enable, hpet_set_mapping isn't called and this will crash, I > think. Thanks for catching this! My solution is to add: else if (gtod->clock.vclock_mode == VCLOCK_HPET) cycles = vread_hpet(); else return 0; Let me know if this works for you. thanks -john ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH 1/2] time: x86: Fix race switching from vsyscall to non-vsyscall clock 2012-03-15 20:18 ` John Stultz @ 2012-03-15 21:01 ` Andy Lutomirski 0 siblings, 0 replies; 9+ messages in thread From: Andy Lutomirski @ 2012-03-15 21:01 UTC (permalink / raw) To: John Stultz; +Cc: Thomas Gleixner, linux-kernel On Thu, Mar 15, 2012 at 1:18 PM, John Stultz <john.stultz@linaro.org> wrote: > On 03/14/2012 06:43 PM, Andy Lutomirski wrote: >> >> On Wed, Mar 14, 2012 at 5:42 PM, John Stultz<john.stultz@linaro.org> >> wrote: >>> >>> On 03/14/2012 05:34 PM, Thomas Gleixner wrote: >>>> >>>> On Wed, 14 Mar 2012, John Stultz wrote: >>>>> >>>>> notrace static noinline int do_realtime(struct timespec *ts) >>>>> { >>>>> unsigned long seq, ns; >>>>> + int mode; >>>> >>>> Please keep a newline between declarations and code. >>> >>> >>> Fixed below. Thanks! >>> (Let me know if you see whitespace damage, I switched mail clients today >>> and >>> am learning the quirks here.) >>> -john >>> >>> >>> >>> When switching from a vsyscall capable to a non-vsyscall capable >>> clocksource, there was a small race, where the last vsyscall >>> gettimeofday before the switch might return a invalid time value >>> using the new non-vsyscall enabled clocksource values after the >>> switch is complete. >>> >>> This is due to the vsyscall code checking the vclock_mode once >>> outside of the seqcount protected section. After it reads the >>> vclock mode, it doesn't re-check that the sampled clock data >>> that is obtained in the seqcount critical section still matches. >>> >>> The fix is to sample vclock_mode inside the protected section, >>> and as long as it isn't VCLOCK_NONE, return the calculated >>> value. If it has changed and is now VCLOCK_NONE, fall back >>> to the syscall gettime calculation. >>> >>> v2: >>> * Cleanup checks as suggested by tglx >>> * Also fix same issue present in gettimeofday path >>> >>> CC: Andy Lutomirski<luto@amacapital.net> >>> CC: Thomas Gleixner<tglx@linutronix.de> >>> Signed-off-by: John Stultz<john.stultz@linaro.org> >>> --- >>> arch/x86/vdso/vclock_gettime.c | 68 >>> +++++++++++++++++++++++++-------------- >>> 1 files changed, 43 insertions(+), 25 deletions(-) >>> >> Looks reasonable to me. I like this approach better than the earlier >> way -- it's likely to cause less slowdown in the VCLOCK_TSC case. >> >> That being said, I think you might have a bug: >> >> notrace static inline long vgetns(void) >> { >> long v; >> cycles_t cycles; >> if (gtod->clock.vclock_mode == VCLOCK_TSC) >> cycles = vread_tsc(); >> else >> cycles = vread_hpet(); >> v = (cycles - gtod->clock.cycle_last)& gtod->clock.mask; >> >> return (v * gtod->clock.mult)>> gtod->clock.shift; >> } >> >> In the VCLOCK_NONE, you'll access the hpet mapping. But in >> hpet_enable, hpet_set_mapping isn't called and this will crash, I >> think. > > Thanks for catching this! > > My solution is to add: > > else if (gtod->clock.vclock_mode == VCLOCK_HPET) > cycles = vread_hpet(); > else > return 0; > > Let me know if this works for you. I think that's much better than my poorly-thought-out suggestion. Accessing the hpet mapping is really slow, so avoiding it if the hpet is disabled is a good idea. And the extra branches shouldn't penalize the tsc case (which is the only fast case) unless the compiler does something silly. --Andy > > thanks > -john > > -- Andy Lutomirski AMA Capital Management, LLC Office: (310) 553-5322 Mobile: (650) 906-0647 ^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH 2/2] time: Fix change_clocksource locking 2012-03-14 23:58 [PATCH 0/2] time: Fix races at clocksource switch time John Stultz 2012-03-14 23:58 ` [PATCH 1/2] time: x86: Fix race switching from vsyscall to non-vsyscall clock John Stultz @ 2012-03-14 23:58 ` John Stultz 1 sibling, 0 replies; 9+ messages in thread From: John Stultz @ 2012-03-14 23:58 UTC (permalink / raw) To: linux-kernel; +Cc: John Stultz, Andy Lutomirski, Thomas Gleixner change_clocksource() fails to grab locks or call timekeeping_update(), which leaves a race window for time inconsistencies. This adds proper locking and a call to timekeeping_update() to fix this. CC: Andy Lutomirski <luto@amacapital.net> CC: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: John Stultz <john.stultz@linaro.org> --- kernel/time/timekeeping.c | 7 +++++++ 1 files changed, 7 insertions(+), 0 deletions(-) diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c index 403c2a0..b53da5e 100644 --- a/kernel/time/timekeeping.c +++ b/kernel/time/timekeeping.c @@ -448,9 +448,12 @@ EXPORT_SYMBOL(timekeeping_inject_offset); static int change_clocksource(void *data) { struct clocksource *new, *old; + unsigned long flags; new = (struct clocksource *) data; + write_seqlock_irqsave(&timekeeper.lock, flags); + timekeeping_forward_now(); if (!new->enable || new->enable(new) == 0) { old = timekeeper.clock; @@ -458,6 +461,10 @@ static int change_clocksource(void *data) if (old->disable) old->disable(old); } + timekeeping_update(true); + + write_sequnlock_irqrestore(&timekeeper.lock, flags); + return 0; } -- 1.7.3.2.146.gca209 ^ permalink raw reply related [flat|nested] 9+ messages in thread
end of thread, other threads:[~2012-03-15 21:02 UTC | newest] Thread overview: 9+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2012-03-14 23:58 [PATCH 0/2] time: Fix races at clocksource switch time John Stultz 2012-03-14 23:58 ` [PATCH 1/2] time: x86: Fix race switching from vsyscall to non-vsyscall clock John Stultz 2012-03-15 0:34 ` Thomas Gleixner 2012-03-15 0:42 ` John Stultz 2012-03-15 1:43 ` Andy Lutomirski 2012-03-15 1:46 ` Andy Lutomirski 2012-03-15 20:18 ` John Stultz 2012-03-15 21:01 ` Andy Lutomirski 2012-03-14 23:58 ` [PATCH 2/2] time: Fix change_clocksource locking John Stultz
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox