* [PATCH 0/3] Fix for leapsecond caused hrtimer/futex issue
From: John Stultz @ 2012-07-05 19:12 UTC
To: Linux Kernel; +Cc: John Stultz, Prarit Bhargava, stable, Thomas Gleixner

Thomas:
	Prarit's and my testing over the last few days has gone fine,
and it's been quiet otherwise, so I wanted to go ahead and submit this
for inclusion.

As widely reported on the internet, many Linux systems experienced
futex related load spikes after the leapsecond was inserted (usually
connected to MySQL, Firefox, Thunderbird, Java, etc).

An apparent workaround for this issue is running:
	$ date -s "`date`"

Credit: http://www.sheeri.com/content/mysql-and-leap-second-high-cpu-and-fix

This issue stemmed from the timekeeping subsystem not notifying the
hrtimer subsystem that the leapsecond occurred, causing CLOCK_REALTIME
hrtimers to be fired one second early, and sub-second CLOCK_REALTIME
hrtimer timeouts to fire immediately (causing the load spikes).

To address this issue I'm proposing we do three things:

1) Fix the clock_was_set() call to remove the limitation that kept
   us from calling it from update_wall_time().

2) Call clock_was_set() when we add/remove a leapsecond.

3) Change hrtimer_interrupt to update the hrtimer base offset values.

This third item provides additional robustness should the
clock_was_set() notification (done via a timer if we're in_atomic)
be delayed significantly.

NOTE: Some reports have been of a hard hang right at or before the
leapsecond. I've not been able to reproduce or diagnose this, so this
fix likely does not address the reported hard hangs (unless they end
up being connected to the futex/hrtimer issue). Please email lkml and
me if you experienced this.
Big thanks to Prarit for shaking out a few issues in the earlier
version of this patch set, as well as the extra effort testing
over the holiday!

Also, I've already got backports generated for -stable that I'm
testing, and I'll be submitting them once I have upstream commit ids
for these patches.

thanks
-john

CC: Prarit Bhargava <prarit@redhat.com>
CC: stable@vger.kernel.org
CC: Thomas Gleixner <tglx@linutronix.de>

John Stultz (3):
  hrtimer: Fix clock_was_set so it is safe to call from irq context
  time: Fix leapsecond triggered hrtimer/futex load spike issue
  hrtimer: Update hrtimer base offsets each hrtimer_interrupt

 include/linux/hrtimer.h   |    3 +++
 kernel/hrtimer.c          |   31 +++++++++++++++++++++++++++----
 kernel/time/timekeeping.c |   38 ++++++++++++++++++++++++++++++++++++++
 3 files changed, 68 insertions(+), 4 deletions(-)

-- 
1.7.9.5
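To make the failure mode described in the cover letter concrete, here
is a minimal userspace sketch, not kernel code. It assumes a simplified
model in which a CLOCK_REALTIME timer base converts monotonic time to
realtime by adding a cached offset. The names struct clock_base_model,
base_now_ns(), timer_expired() and fires_immediately() are made up for
illustration, as are the numbers. With a stale offset after an inserted
leap second, the base's notion of realtime runs one second ahead, so a
sub-second timeout already looks expired.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000LL

/* Hypothetical model of one hrtimer clock base: just the cached offset. */
struct clock_base_model {
	int64_t offset_ns;	/* cached (realtime - monotonic); may be stale */
};

/* The base's view of CLOCK_REALTIME "now", derived from monotonic time. */
static int64_t base_now_ns(const struct clock_base_model *base,
			   int64_t mono_now_ns)
{
	return mono_now_ns + base->offset_ns;
}

/* A timer counts as expired once the base's "now" reaches its expiry. */
static bool timer_expired(const struct clock_base_model *base,
			  int64_t mono_now_ns, int64_t expires_real_ns)
{
	return base_now_ns(base, mono_now_ns) >= expires_real_ns;
}

/*
 * Arm a timeout relative to the true realtime clock right after a leap
 * second was inserted (the true offset shrinks by one second), and
 * report whether the base would expire it immediately.
 */
static bool fires_immediately(int64_t timeout_ns, bool base_was_notified)
{
	const int64_t mono_now = 100 * NSEC_PER_SEC;	/* arbitrary */
	const int64_t old_offset = 1000 * NSEC_PER_SEC;	/* arbitrary */
	const int64_t new_offset = old_offset - NSEC_PER_SEC;
	struct clock_base_model base;

	assert(timeout_ns > 0);
	base.offset_ns = base_was_notified ? new_offset : old_offset;

	/* Userspace computes the expiry from the true (post-leap) realtime. */
	return timer_expired(&base, mono_now,
			     mono_now + new_offset + timeout_ns);
}
```

In this model a half-second timeout fires immediately while the base
still holds the old offset, and behaves normally once the offset has
been refreshed, which is the difference the clock_was_set()
notification is meant to make.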
* [PATCH 1/3] hrtimer: Fix clock_was_set so it is safe to call from irq context
From: John Stultz @ 2012-07-05 19:12 UTC
To: Linux Kernel; +Cc: John Stultz, Prarit Bhargava, stable, Thomas Gleixner

NOTE: This is a prerequisite patch that's required to
address the widely observed leap-second related futex/hrtimer
issues.

Currently clock_was_set() is unsafe to be called from irq
context, as it calls on_each_cpu(). This causes problems when
we need to adjust the time from update_wall_time().

To fix this, if clock_was_set is called when irqs are
disabled, we schedule a timer to fire immediately after
we're out of interrupt context to then notify the hrtimer
subsystem.

CC: Prarit Bhargava <prarit@redhat.com>
CC: stable@vger.kernel.org
CC: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Prarit Bhargava <prarit@redhat.com>
Reported-by: Jan Engelhardt <jengelh@inai.de>
Signed-off-by: John Stultz <johnstul@us.ibm.com>
---
 kernel/hrtimer.c |   17 ++++++++++++++++-
 1 file changed, 16 insertions(+), 1 deletion(-)

diff --git a/kernel/hrtimer.c b/kernel/hrtimer.c
index ae34bf5..d730678 100644
--- a/kernel/hrtimer.c
+++ b/kernel/hrtimer.c
@@ -746,7 +746,7 @@ static inline void retrigger_next_event(void *arg) { }
  * resolution timer interrupts. On UP we just disable interrupts and
  * call the high resolution interrupt code.
  */
-void clock_was_set(void)
+static void do_clock_was_set(unsigned long data)
 {
 #ifdef CONFIG_HIGH_RES_TIMERS
 	/* Retrigger the CPU local events everywhere */
@@ -755,6 +755,21 @@ void clock_was_set(void)
 	timerfd_clock_was_set();
 }
 
+static DEFINE_TIMER(clock_was_set_timer, do_clock_was_set , 0, 0);
+
+void clock_was_set(void)
+{
+	/*
+	 * We can't call on_each_cpu() from irq context,
+	 * so if irqs are disabled , schedule the clock_was_set
+	 * via a timer_list timer for right after.
+	 */
+	if (irqs_disabled())
+		mod_timer(&clock_was_set_timer, jiffies);
+	else
+		do_clock_was_set(0);
+}
+
 /*
  * During resume we might have to reprogram the high resolution timer
  * interrupt (on the local CPU):
-- 
1.7.9.5
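The deferral pattern in this patch can be sketched in userspace as
follows. This is only a model under stated assumptions: irqs_off stands
in for irqs_disabled(), deferred_pending for the pending timer_list
timer, and run_deferred() for that timer firing later. None of these
names exist in the kernel.

```c
#include <assert.h>
#include <stdbool.h>

static bool irqs_off;		/* stand-in for irqs_disabled() */
static bool deferred_pending;	/* stand-in for a pending timer_list timer */
static int notifications;	/* how many times the real work ran */

/* Models do_clock_was_set(): must never run with interrupts disabled. */
static void do_notify(void)
{
	assert(!irqs_off);
	notifications++;
}

/* Models clock_was_set(): defer when the direct call would be unsafe. */
static void notify_clock_was_set(void)
{
	if (irqs_off)
		deferred_pending = true;	/* like mod_timer(&timer, jiffies) */
	else
		do_notify();
}

/* Models the timer firing once we are out of the atomic section. */
static void run_deferred(void)
{
	if (!deferred_pending)
		return;
	deferred_pending = false;
	do_notify();
}
```

Two requests made while irqs_off is set collapse into a single deferred
run in this model, similar to how re-arming an already pending timer
with mod_timer() still results in one callback.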
* [tip:timers/urgent] hrtimer: Fix clock_was_set so it is safe to call from irq context
From: tip-bot for John Stultz @ 2012-07-09 9:43 UTC
To: linux-tip-commits
Cc: linux-kernel, hpa, mingo, johnstul, jengelh, tglx, prarit

Commit-ID:  adf374cf61394dd41c027c74a81f9bef90f7a640
Gitweb:     http://git.kernel.org/tip/adf374cf61394dd41c027c74a81f9bef90f7a640
Author:     John Stultz <johnstul@us.ibm.com>
AuthorDate: Thu, 5 Jul 2012 15:12:16 -0400
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Mon, 9 Jul 2012 11:35:38 +0200

hrtimer: Fix clock_was_set so it is safe to call from irq context

NOTE: This is a prerequisite patch that's required to
address the widely observed leap-second related futex/hrtimer
issues.

Currently clock_was_set() is unsafe to be called from irq
context, as it calls on_each_cpu(). This causes problems when
we need to adjust the time from update_wall_time().

To fix this, if clock_was_set is called when irqs are
disabled, we schedule a timer to fire immediately after
we're out of interrupt context to then notify the hrtimer
subsystem.

Reported-by: Jan Engelhardt <jengelh@inai.de>
Signed-off-by: John Stultz <johnstul@us.ibm.com>
Acked-by: Prarit Bhargava <prarit@redhat.com>
CC: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/1341515538-5100-2-git-send-email-johnstul@us.ibm.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 kernel/hrtimer.c |   17 ++++++++++++++++-
 1 files changed, 16 insertions(+), 1 deletions(-)

diff --git a/kernel/hrtimer.c b/kernel/hrtimer.c
index ae34bf5..d730678 100644
--- a/kernel/hrtimer.c
+++ b/kernel/hrtimer.c
@@ -746,7 +746,7 @@ static inline void retrigger_next_event(void *arg) { }
  * resolution timer interrupts. On UP we just disable interrupts and
  * call the high resolution interrupt code.
  */
-void clock_was_set(void)
+static void do_clock_was_set(unsigned long data)
 {
 #ifdef CONFIG_HIGH_RES_TIMERS
 	/* Retrigger the CPU local events everywhere */
@@ -755,6 +755,21 @@ void clock_was_set(void)
 	timerfd_clock_was_set();
 }
 
+static DEFINE_TIMER(clock_was_set_timer, do_clock_was_set , 0, 0);
+
+void clock_was_set(void)
+{
+	/*
+	 * We can't call on_each_cpu() from irq context,
+	 * so if irqs are disabled , schedule the clock_was_set
+	 * via a timer_list timer for right after.
+	 */
+	if (irqs_disabled())
+		mod_timer(&clock_was_set_timer, jiffies);
+	else
+		do_clock_was_set(0);
+}
+
 /*
  * During resume we might have to reprogram the high resolution timer
  * interrupt (on the local CPU):
* [PATCH 2/3] time: Fix leapsecond triggered hrtimer/futex load spike issue
From: John Stultz @ 2012-07-05 19:12 UTC
To: Linux Kernel; +Cc: John Stultz, Prarit Bhargava, stable, Thomas Gleixner

As widely reported on the internet, some Linux systems after
the leapsecond was inserted are experiencing futex related load
spikes (usually connected to MySQL, Firefox, Thunderbird, Java, etc).

An apparent workaround for this issue is running:
	$ date -s "`date`"

Credit: http://www.sheeri.com/content/mysql-and-leap-second-high-cpu-and-fix

I believe this issue is due to the leapsecond being added without
calling clock_was_set() to notify the hrtimer subsystem of the
change.

The workaround works because it forces a clock_was_set()
call from settimeofday().

This fix adds the required clock_was_set() calls to where
we adjust for leapseconds.

NOTE: This fix *depends* on the previous fix, which allows
clock_was_set to be called from atomic context. Do not try
to apply just this patch.
CC: Prarit Bhargava <prarit@redhat.com>
CC: stable@vger.kernel.org
CC: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Prarit Bhargava <prarit@redhat.com>
Reported-by: Jan Engelhardt <jengelh@inai.de>
Signed-off-by: John Stultz <johnstul@us.ibm.com>
---
 kernel/time/timekeeping.c |    4 ++++
 1 file changed, 4 insertions(+)

diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index 6f46a00..cc2991d 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -963,6 +963,8 @@ static cycle_t logarithmic_accumulation(cycle_t offset, int shift)
 		leap = second_overflow(timekeeper.xtime.tv_sec);
 		timekeeper.xtime.tv_sec += leap;
 		timekeeper.wall_to_monotonic.tv_sec -= leap;
+		if (leap)
+			clock_was_set();
 	}
 
 	/* Accumulate raw time */
@@ -1079,6 +1081,8 @@ static void update_wall_time(void)
 		leap = second_overflow(timekeeper.xtime.tv_sec);
 		timekeeper.xtime.tv_sec += leap;
 		timekeeper.wall_to_monotonic.tv_sec -= leap;
+		if (leap)
+			clock_was_set();
 	}
 
 	timekeeping_update(false);
-- 
1.7.9.5
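The leap adjustment that this patch hooks into can be modeled with two
integers. The sketch below is a userspace illustration, not the kernel
code: struct tk_model, monotonic_sec() and apply_leap() are hypothetical
names, and it assumes that second_overflow() reports an inserted leap
second as leap = -1. It shows why the monotonic clock is unaffected
while the realtime clock steps, which is exactly the kind of change
clock_was_set() exists to announce.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical two-field model of the timekeeper (seconds only). */
struct tk_model {
	int64_t xtime_sec;		/* CLOCK_REALTIME seconds */
	int64_t wall_to_monotonic_sec;	/* monotonic = xtime + this */
};

static int64_t monotonic_sec(const struct tk_model *tk)
{
	return tk->xtime_sec + tk->wall_to_monotonic_sec;
}

/*
 * Apply a leap adjustment the same way the patched hunks do and report
 * whether listeners need to be told (the modeled "if (leap)" condition).
 * Assumption: leap is -1 for an inserted second, +1 for a deleted one,
 * and 0 when nothing happens.
 */
static bool apply_leap(struct tk_model *tk, int leap)
{
	assert(leap >= -1 && leap <= 1);
	tk->xtime_sec += leap;
	tk->wall_to_monotonic_sec -= leap;
	return leap != 0;
}
```

Because the two fields move in opposite directions, monotonic_sec() is
the same before and after the call, while the realtime-to-monotonic
offset changes by one second.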
* [tip:timers/urgent] time: Fix leapsecond triggered hrtimer/futex load spike issue
From: tip-bot for John Stultz @ 2012-07-09 9:43 UTC
To: linux-tip-commits
Cc: linux-kernel, hpa, mingo, johnstul, jengelh, tglx, prarit

Commit-ID:  bb88e92477def647976cd3d6964af98beceba900
Gitweb:     http://git.kernel.org/tip/bb88e92477def647976cd3d6964af98beceba900
Author:     John Stultz <johnstul@us.ibm.com>
AuthorDate: Thu, 5 Jul 2012 15:12:17 -0400
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Mon, 9 Jul 2012 11:35:38 +0200

time: Fix leapsecond triggered hrtimer/futex load spike issue

As widely reported on the internet, some Linux systems after
the leapsecond was inserted are experiencing futex related load
spikes (usually connected to MySQL, Firefox, Thunderbird, Java, etc).

An apparent workaround for this issue is running:
	$ date -s "`date`"

Credit: http://www.sheeri.com/content/mysql-and-leap-second-high-cpu-and-fix

I believe this issue is due to the leapsecond being added without
calling clock_was_set() to notify the hrtimer subsystem of the
change.

The workaround works because it forces a clock_was_set()
call from settimeofday().

This fix adds the required clock_was_set() calls to where
we adjust for leapseconds.

NOTE: This fix *depends* on the previous fix, which allows
clock_was_set to be called from atomic context. Do not try
to apply just this patch.
Reported-by: Jan Engelhardt <jengelh@inai.de>
Signed-off-by: John Stultz <johnstul@us.ibm.com>
Acked-by: Prarit Bhargava <prarit@redhat.com>
CC: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/1341515538-5100-3-git-send-email-johnstul@us.ibm.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 kernel/time/timekeeping.c |    4 ++++
 1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index 6f46a00..cc2991d 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -963,6 +963,8 @@ static cycle_t logarithmic_accumulation(cycle_t offset, int shift)
 		leap = second_overflow(timekeeper.xtime.tv_sec);
 		timekeeper.xtime.tv_sec += leap;
 		timekeeper.wall_to_monotonic.tv_sec -= leap;
+		if (leap)
+			clock_was_set();
 	}
 
 	/* Accumulate raw time */
@@ -1079,6 +1081,8 @@ static void update_wall_time(void)
 		leap = second_overflow(timekeeper.xtime.tv_sec);
 		timekeeper.xtime.tv_sec += leap;
 		timekeeper.wall_to_monotonic.tv_sec -= leap;
+		if (leap)
+			clock_was_set();
 	}
 
 	timekeeping_update(false);
* [PATCH 3/3] hrtimer: Update hrtimer base offsets each hrtimer_interrupt
From: John Stultz @ 2012-07-05 19:12 UTC
To: Linux Kernel; +Cc: John Stultz, Prarit Bhargava, stable, Thomas Gleixner

This patch introduces a new function which captures the
CLOCK_MONOTONIC time, along with the CLOCK_REALTIME and
CLOCK_BOOTTIME offsets at the same moment. This new function
is then used in place of ktime_get() when hrtimer_interrupt()
is expiring timers.

This ensures that any changes to realtime or boottime offsets
are noticed and stored into the per-cpu hrtimer base structures,
prior to doing any hrtimer expiration. This should ensure that
timers are not expired early if the offsets change under us.

This is useful in the case where clock_was_set() is called from
atomic context and has to schedule the hrtimer base offset update
via a timer, as it provides extra robustness in the face of any
possible timer delay.
CC: Prarit Bhargava <prarit@redhat.com>
CC: stable@vger.kernel.org
CC: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Prarit Bhargava <prarit@redhat.com>
Signed-off-by: John Stultz <johnstul@us.ibm.com>
---
 include/linux/hrtimer.h   |    3 +++
 kernel/hrtimer.c          |   14 +++++++++++---
 kernel/time/timekeeping.c |   34 ++++++++++++++++++++++++++++++++++
 3 files changed, 48 insertions(+), 3 deletions(-)

diff --git a/include/linux/hrtimer.h b/include/linux/hrtimer.h
index fd0dc30..f6b2a74 100644
--- a/include/linux/hrtimer.h
+++ b/include/linux/hrtimer.h
@@ -320,6 +320,9 @@ extern ktime_t ktime_get(void);
 extern ktime_t ktime_get_real(void);
 extern ktime_t ktime_get_boottime(void);
 extern ktime_t ktime_get_monotonic_offset(void);
+extern void ktime_get_and_real_and_sleep_offset(ktime_t *monotonic,
+						ktime_t *real_offset,
+						ktime_t *sleep_offset);
 
 DECLARE_PER_CPU(struct tick_device, tick_cpu_device);
 
diff --git a/kernel/hrtimer.c b/kernel/hrtimer.c
index d730678..56600c4 100644
--- a/kernel/hrtimer.c
+++ b/kernel/hrtimer.c
@@ -1258,18 +1258,26 @@ static void __run_hrtimer(struct hrtimer *timer, ktime_t *now)
 void hrtimer_interrupt(struct clock_event_device *dev)
 {
 	struct hrtimer_cpu_base *cpu_base = &__get_cpu_var(hrtimer_bases);
-	ktime_t expires_next, now, entry_time, delta;
+	ktime_t expires_next, now, entry_time, delta, real_offset, sleep_offset;
 	int i, retries = 0;
 
 	BUG_ON(!cpu_base->hres_active);
 	cpu_base->nr_events++;
 	dev->next_event.tv64 = KTIME_MAX;
 
-	entry_time = now = ktime_get();
+
+	ktime_get_and_real_and_sleep_offset(&now, &real_offset, &sleep_offset);
+
+	entry_time = now;
 retry:
 	expires_next.tv64 = KTIME_MAX;
 
 	raw_spin_lock(&cpu_base->lock);
+
+	/* Update base offsets, to avoid early wakeups */
+	cpu_base->clock_base[HRTIMER_BASE_REALTIME].offset = real_offset;
+	cpu_base->clock_base[HRTIMER_BASE_BOOTTIME].offset = sleep_offset;
+
 	/*
 	 * We set expires_next to KTIME_MAX here with cpu_base->lock
 	 * held to prevent that a timer is enqueued in our queue via
@@ -1346,7 +1354,7 @@ retry:
 	 * interrupt routine. We give it 3 attempts to avoid
 	 * overreacting on some spurious event.
 	 */
-	now = ktime_get();
+	ktime_get_and_real_and_sleep_offset(&now, &real_offset, &sleep_offset);
 	cpu_base->nr_retries++;
 	if (++retries < 3)
 		goto retry;
diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index cc2991d..b3404cf 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -1251,6 +1251,40 @@ void get_xtime_and_monotonic_and_sleep_offset(struct timespec *xtim,
 }
 
 /**
+ * ktime_get_and_real_and_sleep_offset() - hrtimer helper, gets monotonic ktime,
+ *	realtime offset, and sleep offsets.
+ */
+void ktime_get_and_real_and_sleep_offset(ktime_t *monotonic,
+						ktime_t *real_offset,
+						ktime_t *sleep_offset)
+{
+	unsigned long seq;
+	struct timespec wtom, sleep;
+	u64 secs, nsecs;
+
+	do {
+		seq = read_seqbegin(&timekeeper.lock);
+
+		secs = timekeeper.xtime.tv_sec +
+				timekeeper.wall_to_monotonic.tv_sec;
+		nsecs = timekeeper.xtime.tv_nsec +
+				timekeeper.wall_to_monotonic.tv_nsec;
+		nsecs += timekeeping_get_ns();
+		/* If arch requires, add in gettimeoffset() */
+		nsecs += arch_gettimeoffset();
+
+		wtom = timekeeper.wall_to_monotonic;
+		sleep = timekeeper.total_sleep_time;
+	} while (read_seqretry(&timekeeper.lock, seq));
+
+	*monotonic = ktime_add_ns(ktime_set(secs, 0), nsecs);
+	set_normalized_timespec(&wtom, -wtom.tv_sec, -wtom.tv_nsec);
+	*real_offset = timespec_to_ktime(wtom);
+	*sleep_offset = timespec_to_ktime(sleep);
+}
+
+
+/**
  * ktime_get_monotonic_offset() - get wall_to_monotonic in ktime_t format
  */
 ktime_t ktime_get_monotonic_offset(void)
-- 
1.7.9.5
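The new helper reads the monotonic time and both offsets inside a
single read_seqbegin()/read_seqretry() loop so that all three values
belong to the same timekeeper update. The retry idea can be sketched in
userspace with C11 atomics. This is an illustration only: it does not
use the kernel seqlock API, it relies on the default sequentially
consistent atomic operations for simplicity, it assumes a single
writer, and struct tk_seq_model, struct tk_snapshot, tk_model_write()
and tk_model_read() are hypothetical names.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical model of timekeeper state guarded by a sequence counter. */
struct tk_seq_model {
	atomic_uint seq;		/* odd while a write is in progress */
	_Atomic int64_t mono_ns;
	_Atomic int64_t real_offset_ns;
	_Atomic int64_t sleep_offset_ns;
};

struct tk_snapshot {
	int64_t mono_ns, real_offset_ns, sleep_offset_ns;
};

/* Single-writer update: bump seq to odd, store, bump back to even. */
static void tk_model_write(struct tk_seq_model *tk, struct tk_snapshot v)
{
	assert((atomic_load(&tk->seq) & 1u) == 0);
	atomic_fetch_add(&tk->seq, 1);
	atomic_store(&tk->mono_ns, v.mono_ns);
	atomic_store(&tk->real_offset_ns, v.real_offset_ns);
	atomic_store(&tk->sleep_offset_ns, v.sleep_offset_ns);
	atomic_fetch_add(&tk->seq, 1);
}

/* Reader: retry until all three values come from the same write epoch. */
static struct tk_snapshot tk_model_read(struct tk_seq_model *tk)
{
	struct tk_snapshot s;
	unsigned int begin;

	do {
		begin = atomic_load(&tk->seq);
		s.mono_ns = atomic_load(&tk->mono_ns);
		s.real_offset_ns = atomic_load(&tk->real_offset_ns);
		s.sleep_offset_ns = atomic_load(&tk->sleep_offset_ns);
	} while ((begin & 1u) || begin != atomic_load(&tk->seq));

	return s;
}
```

Reading the three fields in separate loops would allow a writer to slip
in between them, so the snapshot could pair a new monotonic time with
an old offset. Keeping them in one loop is what gives the "at the same
moment" property the commit message describes.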
* [tip:timers/urgent] hrtimer: Update hrtimer base offsets each hrtimer_interrupt
From: tip-bot for John Stultz @ 2012-07-09 9:44 UTC
To: linux-tip-commits; +Cc: linux-kernel, hpa, mingo, johnstul, tglx, prarit

Commit-ID:  d7e2e7fef1f0f4e1e614353a6c5eef4e18d98d2d
Gitweb:     http://git.kernel.org/tip/d7e2e7fef1f0f4e1e614353a6c5eef4e18d98d2d
Author:     John Stultz <johnstul@us.ibm.com>
AuthorDate: Thu, 5 Jul 2012 15:12:18 -0400
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Mon, 9 Jul 2012 11:35:38 +0200

hrtimer: Update hrtimer base offsets each hrtimer_interrupt

This patch introduces a new function which captures the
CLOCK_MONOTONIC time, along with the CLOCK_REALTIME and
CLOCK_BOOTTIME offsets at the same moment. This new function
is then used in place of ktime_get() when hrtimer_interrupt()
is expiring timers.

This ensures that any changes to realtime or boottime offsets
are noticed and stored into the per-cpu hrtimer base structures,
prior to doing any hrtimer expiration. This should ensure that
timers are not expired early if the offsets change under us.

This is useful in the case where clock_was_set() is called from
atomic context and has to schedule the hrtimer base offset update
via a timer, as it provides extra robustness in the face of any
possible timer delay.
Signed-off-by: John Stultz <johnstul@us.ibm.com>
Acked-by: Prarit Bhargava <prarit@redhat.com>
CC: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/1341515538-5100-4-git-send-email-johnstul@us.ibm.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 include/linux/hrtimer.h   |    3 +++
 kernel/hrtimer.c          |   14 +++++++++++---
 kernel/time/timekeeping.c |   34 ++++++++++++++++++++++++++++++++++
 3 files changed, 48 insertions(+), 3 deletions(-)

diff --git a/include/linux/hrtimer.h b/include/linux/hrtimer.h
index fd0dc30..f6b2a74 100644
--- a/include/linux/hrtimer.h
+++ b/include/linux/hrtimer.h
@@ -320,6 +320,9 @@ extern ktime_t ktime_get(void);
 extern ktime_t ktime_get_real(void);
 extern ktime_t ktime_get_boottime(void);
 extern ktime_t ktime_get_monotonic_offset(void);
+extern void ktime_get_and_real_and_sleep_offset(ktime_t *monotonic,
+						ktime_t *real_offset,
+						ktime_t *sleep_offset);
 
 DECLARE_PER_CPU(struct tick_device, tick_cpu_device);
 
diff --git a/kernel/hrtimer.c b/kernel/hrtimer.c
index d730678..56600c4 100644
--- a/kernel/hrtimer.c
+++ b/kernel/hrtimer.c
@@ -1258,18 +1258,26 @@ static void __run_hrtimer(struct hrtimer *timer, ktime_t *now)
 void hrtimer_interrupt(struct clock_event_device *dev)
 {
 	struct hrtimer_cpu_base *cpu_base = &__get_cpu_var(hrtimer_bases);
-	ktime_t expires_next, now, entry_time, delta;
+	ktime_t expires_next, now, entry_time, delta, real_offset, sleep_offset;
 	int i, retries = 0;
 
 	BUG_ON(!cpu_base->hres_active);
 	cpu_base->nr_events++;
 	dev->next_event.tv64 = KTIME_MAX;
 
-	entry_time = now = ktime_get();
+
+	ktime_get_and_real_and_sleep_offset(&now, &real_offset, &sleep_offset);
+
+	entry_time = now;
 retry:
 	expires_next.tv64 = KTIME_MAX;
 
 	raw_spin_lock(&cpu_base->lock);
+
+	/* Update base offsets, to avoid early wakeups */
+	cpu_base->clock_base[HRTIMER_BASE_REALTIME].offset = real_offset;
+	cpu_base->clock_base[HRTIMER_BASE_BOOTTIME].offset = sleep_offset;
+
 	/*
 	 * We set expires_next to KTIME_MAX here with cpu_base->lock
 	 * held to prevent that a timer is enqueued in our queue via
@@ -1346,7 +1354,7 @@ retry:
 	 * interrupt routine. We give it 3 attempts to avoid
 	 * overreacting on some spurious event.
 	 */
-	now = ktime_get();
+	ktime_get_and_real_and_sleep_offset(&now, &real_offset, &sleep_offset);
 	cpu_base->nr_retries++;
 	if (++retries < 3)
 		goto retry;
diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index cc2991d..b3404cf 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -1251,6 +1251,40 @@ void get_xtime_and_monotonic_and_sleep_offset(struct timespec *xtim,
 }
 
 /**
+ * ktime_get_and_real_and_sleep_offset() - hrtimer helper, gets monotonic ktime,
+ *	realtime offset, and sleep offsets.
+ */
+void ktime_get_and_real_and_sleep_offset(ktime_t *monotonic,
+						ktime_t *real_offset,
+						ktime_t *sleep_offset)
+{
+	unsigned long seq;
+	struct timespec wtom, sleep;
+	u64 secs, nsecs;
+
+	do {
+		seq = read_seqbegin(&timekeeper.lock);
+
+		secs = timekeeper.xtime.tv_sec +
+				timekeeper.wall_to_monotonic.tv_sec;
+		nsecs = timekeeper.xtime.tv_nsec +
+				timekeeper.wall_to_monotonic.tv_nsec;
+		nsecs += timekeeping_get_ns();
+		/* If arch requires, add in gettimeoffset() */
+		nsecs += arch_gettimeoffset();
+
+		wtom = timekeeper.wall_to_monotonic;
+		sleep = timekeeper.total_sleep_time;
+	} while (read_seqretry(&timekeeper.lock, seq));
+
+	*monotonic = ktime_add_ns(ktime_set(secs, 0), nsecs);
+	set_normalized_timespec(&wtom, -wtom.tv_sec, -wtom.tv_nsec);
+	*real_offset = timespec_to_ktime(wtom);
+	*sleep_offset = timespec_to_ktime(sleep);
+}
+
+
+/**
  * ktime_get_monotonic_offset() - get wall_to_monotonic in ktime_t format
  */
 ktime_t ktime_get_monotonic_offset(void)