* [PATCH 0/7] 3.4-stable: Fix for leapsecond caused hrtimer/futex issue
@ 2012-07-17 6:39 John Stultz
2012-07-17 6:39 ` [PATCH 1/7] 3.4.x: hrtimer: Provide clock_was_set_delayed() John Stultz
` (7 more replies)
0 siblings, 8 replies; 16+ messages in thread
From: John Stultz @ 2012-07-17 6:39 UTC
To: stable; +Cc: John Stultz, Prarit Bhargava, Thomas Gleixner, Linux Kernel
Here is a backport of the leapsecond fixes to 3.4-stable. These are very
straightforward, and backported to 3.4.x with no collisions or changes.
This patchset resolves the early hrtimer/futex expiration issue
widely seen after the June 30th leapsecond.
I've booted and tested this patchset on two boxes and run through a number
of leapsecond related stress tests. However, additional testing and review
would be appreciated.
The original commits backported in this set are:
f55a6faa384304c89cfef162768e88374d3312cb hrtimer: Provide clock_was_set_delayed()
4873fa070ae84a4115f0b3c9dfabc224f1bc7c51 timekeeping: Fix leapsecond triggered load spike issue
5b9fe759a678e05be4937ddf03d50e950207c1c0 timekeeping: Maintain ktime_t based offsets for hrtimers
196951e91262fccda81147d2bcf7fdab08668b40 hrtimers: Move lock held region in hrtimer_interrupt()
f6c06abfb3972ad4914cef57d8348fcb2932bc3b timekeeping: Provide hrtimer update function
5baefd6d84163443215f4a99f6a20f054ef11236 hrtimer: Update hrtimer base offsets each hrtimer_interrupt
3e997130bd2e8c6f5aaa49d6e3161d4d29b43ab0 timekeeping: Add missing update call in timekeeping_resume()
I've already done backports to all the stable kernels back to 2.6.32,
and will be sending them out shortly.
Please let me know if you have any comments or feedback.
thanks
-john
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Linux Kernel <linux-kernel@vger.kernel.org>
John Stultz (3):
3.4.x: hrtimer: Provide clock_was_set_delayed()
3.4.x: timekeeping: Fix leapsecond triggered load spike issue
3.4.x: hrtimer: Update hrtimer base offsets each hrtimer_interrupt
Thomas Gleixner (4):
3.4.x: timekeeping: Maintain ktime_t based offsets for hrtimers
3.4.x: hrtimers: Move lock held region in hrtimer_interrupt()
3.4.x: timekeeping: Provide hrtimer update function
3.4.x: timekeeping: Add missing update call in timekeeping_resume()
include/linux/hrtimer.h | 10 ++++++-
kernel/hrtimer.c | 53 +++++++++++++++++++++++++------------
kernel/time/timekeeping.c | 64 +++++++++++++++++++++++++++++++++++++++++++--
3 files changed, 108 insertions(+), 19 deletions(-)
--
1.7.9.5
* [PATCH 1/7] 3.4.x: hrtimer: Provide clock_was_set_delayed()
2012-07-17 6:39 [PATCH 0/7] 3.4-stable: Fix for leapsecond caused hrtimer/futex issue John Stultz
@ 2012-07-17 6:39 ` John Stultz
2012-07-17 21:59 ` Patch "hrtimer: Provide clock_was_set_delayed()" has been added to the 3.4-stable tree gregkh
2012-07-17 6:39 ` [PATCH 2/7] 3.4.x: timekeeping: Fix leapsecond triggered load spike issue John Stultz
` (6 subsequent siblings)
7 siblings, 1 reply; 16+ messages in thread
From: John Stultz @ 2012-07-17 6:39 UTC
To: stable; +Cc: John Stultz, Thomas Gleixner, Prarit Bhargava, Linux Kernel
This is a backport of f55a6faa384304c89cfef162768e88374d3312cb
clock_was_set() cannot be called from hard interrupt context because
it calls on_each_cpu().
For fixing the widely reported leap seconds issue it is necessary to
call it from hard interrupt context, i.e. the timer tick code, which
does the timekeeping updates.
Provide a new function which records the request in the hrtimer cpu base
structure of the cpu on which it is called and raises the hrtimer
softirq. We then execute the clock_was_set() notification from
softirq context in run_hrtimer_softirq(). The hrtimer softirq is
rarely used, so polling the flag there is not a performance issue.
[ tglx: Made it depend on CONFIG_HIGH_RES_TIMERS. We really should get
rid of all this ifdeffery ASAP ]
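For readers who want the deferral pattern in isolation, here is a rough
userspace sketch. It is not kernel code: struct cpu_base_model and the
model_* helpers are made-up names for illustration, and the "softirq" is
just a boolean. It only shows that the hard-irq side records a request
cheaply and the softirq side consumes it once.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the fields of struct hrtimer_cpu_base used here. */
struct cpu_base_model {
	unsigned int clock_was_set;	/* request recorded from "hard irq" context */
	bool softirq_pending;		/* models a raised HRTIMER_SOFTIRQ */
	unsigned int notifications;	/* counts simulated clock_was_set() calls */
};

/* Models clock_was_set_delayed(): only sets a flag, no cross-CPU work. */
static void model_clock_was_set_delayed(struct cpu_base_model *base)
{
	assert(base != NULL);
	base->clock_was_set = 1;
	base->softirq_pending = true;
}

/*
 * Models run_hrtimer_softirq(): consume the flag and perform the
 * expensive notification. Returns true if a notification was issued.
 */
static bool model_run_hrtimer_softirq(struct cpu_base_model *base)
{
	assert(base != NULL);
	base->softirq_pending = false;
	if (!base->clock_was_set)
		return false;
	base->clock_was_set = 0;
	base->notifications++;	/* the real code calls clock_was_set() here */
	return true;
}
```

In this model, several requests recorded before the softirq runs collapse
into a single notification, which is harmless because the notification
re-reads the current state anyway.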
Signed-off-by: John Stultz <johnstul@us.ibm.com>
Reported-by: Jan Engelhardt <jengelh@inai.de>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Prarit Bhargava <prarit@redhat.com>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/1341960205-56738-2-git-send-email-johnstul@us.ibm.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Linux Kernel <linux-kernel@vger.kernel.org>
Signed-off-by: John Stultz <johnstul@us.ibm.com>
---
include/linux/hrtimer.h | 9 ++++++++-
kernel/hrtimer.c | 20 ++++++++++++++++++++
2 files changed, 28 insertions(+), 1 deletion(-)
diff --git a/include/linux/hrtimer.h b/include/linux/hrtimer.h
index fd0dc30..c9ec940 100644
--- a/include/linux/hrtimer.h
+++ b/include/linux/hrtimer.h
@@ -165,6 +165,7 @@ enum hrtimer_base_type {
* @lock: lock protecting the base and associated clock bases
* and timers
* @active_bases: Bitfield to mark bases with active timers
+ * @clock_was_set: Indicates that clock was set from irq context.
* @expires_next: absolute time of the next event which was scheduled
* via clock_set_next_event()
* @hres_active: State of high resolution mode
@@ -177,7 +178,8 @@ enum hrtimer_base_type {
*/
struct hrtimer_cpu_base {
raw_spinlock_t lock;
- unsigned long active_bases;
+ unsigned int active_bases;
+ unsigned int clock_was_set;
#ifdef CONFIG_HIGH_RES_TIMERS
ktime_t expires_next;
int hres_active;
@@ -286,6 +288,8 @@ extern void hrtimer_peek_ahead_timers(void);
# define MONOTONIC_RES_NSEC HIGH_RES_NSEC
# define KTIME_MONOTONIC_RES KTIME_HIGH_RES
+extern void clock_was_set_delayed(void);
+
#else
# define MONOTONIC_RES_NSEC LOW_RES_NSEC
@@ -306,6 +310,9 @@ static inline int hrtimer_is_hres_active(struct hrtimer *timer)
{
return 0;
}
+
+static inline void clock_was_set_delayed(void) { }
+
#endif
extern void clock_was_set(void);
diff --git a/kernel/hrtimer.c b/kernel/hrtimer.c
index ae34bf5..3c24fb2 100644
--- a/kernel/hrtimer.c
+++ b/kernel/hrtimer.c
@@ -717,6 +717,19 @@ static int hrtimer_switch_to_hres(void)
return 1;
}
+/*
+ * Called from timekeeping code to reprogramm the hrtimer interrupt
+ * device. If called from the timer interrupt context we defer it to
+ * softirq context.
+ */
+void clock_was_set_delayed(void)
+{
+ struct hrtimer_cpu_base *cpu_base = &__get_cpu_var(hrtimer_bases);
+
+ cpu_base->clock_was_set = 1;
+ __raise_softirq_irqoff(HRTIMER_SOFTIRQ);
+}
+
#else
static inline int hrtimer_hres_active(void) { return 0; }
@@ -1395,6 +1408,13 @@ void hrtimer_peek_ahead_timers(void)
static void run_hrtimer_softirq(struct softirq_action *h)
{
+ struct hrtimer_cpu_base *cpu_base = &__get_cpu_var(hrtimer_bases);
+
+ if (cpu_base->clock_was_set) {
+ cpu_base->clock_was_set = 0;
+ clock_was_set();
+ }
+
hrtimer_peek_ahead_timers();
}
--
1.7.9.5
* [PATCH 2/7] 3.4.x: timekeeping: Fix leapsecond triggered load spike issue
2012-07-17 6:39 [PATCH 0/7] 3.4-stable: Fix for leapsecond caused hrtimer/futex issue John Stultz
2012-07-17 6:39 ` [PATCH 1/7] 3.4.x: hrtimer: Provide clock_was_set_delayed() John Stultz
@ 2012-07-17 6:39 ` John Stultz
2012-07-17 21:59 ` Patch "timekeeping: Fix leapsecond triggered load spike issue" has been added to the 3.4-stable tree gregkh
2012-07-17 6:39 ` [PATCH 3/7] 3.4.x: timekeeping: Maintain ktime_t based offsets for hrtimers John Stultz
` (5 subsequent siblings)
7 siblings, 1 reply; 16+ messages in thread
From: John Stultz @ 2012-07-17 6:39 UTC
To: stable; +Cc: John Stultz, Thomas Gleixner, Prarit Bhargava, Linux Kernel
This is a backport of 4873fa070ae84a4115f0b3c9dfabc224f1bc7c51
The timekeeping code misses an update of the hrtimer subsystem after a
leap second happened. Because of that, timers based on CLOCK_REALTIME
expire a second early or late, depending on whether a leap second was
inserted or deleted, until an operation is initiated which causes that
update. Unless the update happens by some other means, this discrepancy
between the timekeeping and the hrtimer data stays forever and timers
are expired either early or late.
The reported immediate workaround - $ date -s "`date`" - is causing a
call to clock_was_set() which updates the hrtimer data structures.
See: http://www.sheeri.com/content/mysql-and-leap-second-high-cpu-and-fix
Add the missing clock_was_set() call to update_wall_time() in case of
a leap second event. The actual update is deferred to softirq context
as the necessary smp function call cannot be invoked from hard
interrupt context.
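To make the early expiry concrete, here is a small userspace model of
the stale offset. It is a sketch under simplifying assumptions, not the
kernel implementation: all times are plain nanosecond integers,
CLOCK_REALTIME is taken to be CLOCK_MONOTONIC plus an offset, and struct
clock_model and the model_* functions are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_NSEC_PER_SEC INT64_C(1000000000)

/* Hypothetical model: realtime = monotonic + offset, everything in ns. */
struct clock_model {
	int64_t timekeeper_offs_real;	/* authoritative offset */
	int64_t hrtimer_offs_real;	/* cached copy used for expiry checks */
};

/* An inserted leap second steps CLOCK_REALTIME back by one second. */
static void model_leap_second_insert(struct clock_model *c)
{
	assert(c != NULL);
	c->timekeeper_offs_real -= MODEL_NSEC_PER_SEC;
}

/* Models the effect of clock_was_set(): refresh the cached offset. */
static void model_clock_was_set(struct clock_model *c)
{
	assert(c != NULL);
	c->hrtimer_offs_real = c->timekeeper_offs_real;
}

/* Expiry check as the hrtimer side would see it, using its cached offset. */
static bool model_timer_expired(const struct clock_model *c,
				int64_t mono_now, int64_t realtime_expiry)
{
	return mono_now + c->hrtimer_offs_real >= realtime_expiry;
}
```

With the cached offset left stale after model_leap_second_insert(), a
CLOCK_REALTIME timer whose expiry is still one second in the future is
reported as expired. Calling model_clock_was_set() makes the check agree
with the timekeeper again, which is what the date -s workaround achieved
as a side effect.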
Signed-off-by: John Stultz <johnstul@us.ibm.com>
Reported-by: Jan Engelhardt <jengelh@inai.de>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Prarit Bhargava <prarit@redhat.com>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/1341960205-56738-3-git-send-email-johnstul@us.ibm.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Linux Kernel <linux-kernel@vger.kernel.org>
Signed-off-by: John Stultz <johnstul@us.ibm.com>
---
kernel/time/timekeeping.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index d42574df..9588f0c 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -965,6 +965,8 @@ static cycle_t logarithmic_accumulation(cycle_t offset, int shift)
leap = second_overflow(timekeeper.xtime.tv_sec);
timekeeper.xtime.tv_sec += leap;
timekeeper.wall_to_monotonic.tv_sec -= leap;
+ if (leap)
+ clock_was_set_delayed();
}
/* Accumulate raw time */
@@ -1081,6 +1083,8 @@ static void update_wall_time(void)
leap = second_overflow(timekeeper.xtime.tv_sec);
timekeeper.xtime.tv_sec += leap;
timekeeper.wall_to_monotonic.tv_sec -= leap;
+ if (leap)
+ clock_was_set_delayed();
}
timekeeping_update(false);
--
1.7.9.5
* [PATCH 3/7] 3.4.x: timekeeping: Maintain ktime_t based offsets for hrtimers
2012-07-17 6:39 [PATCH 0/7] 3.4-stable: Fix for leapsecond caused hrtimer/futex issue John Stultz
2012-07-17 6:39 ` [PATCH 1/7] 3.4.x: hrtimer: Provide clock_was_set_delayed() John Stultz
2012-07-17 6:39 ` [PATCH 2/7] 3.4.x: timekeeping: Fix leapsecond triggered load spike issue John Stultz
@ 2012-07-17 6:39 ` John Stultz
2012-07-17 21:59 ` Patch "timekeeping: Maintain ktime_t based offsets for hrtimers" has been added to the 3.4-stable tree gregkh
2012-07-17 6:39 ` [PATCH 4/7] 3.4.x: hrtimers: Move lock held region in hrtimer_interrupt() John Stultz
` (4 subsequent siblings)
7 siblings, 1 reply; 16+ messages in thread
From: John Stultz @ 2012-07-17 6:39 UTC
To: stable; +Cc: Thomas Gleixner, John Stultz, Prarit Bhargava, Linux Kernel
From: Thomas Gleixner <tglx@linutronix.de>
This is a backport of 5b9fe759a678e05be4937ddf03d50e950207c1c0
We need to update the hrtimer clock offsets from the hrtimer interrupt
context. To avoid conversions from timespec to ktime_t maintain a
ktime_t based representation of those offsets in the timekeeper. This
puts the conversion overhead into the code which updates the
underlying offsets and provides fast accessible values in the hrtimer
interrupt.
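The conversion being moved out of the interrupt path can be sketched in
userspace as follows. This assumes the scalar (64-bit nanosecond) form
of ktime_t; struct model_timespec, model_normalize() and
model_offs_real_ns() are illustrative stand-ins, not the kernel's
set_normalized_timespec() or timespec_to_ktime().

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_NSEC_PER_SEC INT64_C(1000000000)

/* Hypothetical stand-in for struct timespec. */
struct model_timespec {
	int64_t tv_sec;
	long tv_nsec;
};

/* Bring nsec into [0, NSEC_PER_SEC), carrying into sec as needed. */
static struct model_timespec model_normalize(int64_t sec, int64_t nsec)
{
	struct model_timespec ts;

	while (nsec >= MODEL_NSEC_PER_SEC) {
		nsec -= MODEL_NSEC_PER_SEC;
		sec++;
	}
	while (nsec < 0) {
		nsec += MODEL_NSEC_PER_SEC;
		sec--;
	}
	ts.tv_sec = sec;
	ts.tv_nsec = (long)nsec;
	assert(ts.tv_nsec >= 0 && ts.tv_nsec < MODEL_NSEC_PER_SEC);
	return ts;
}

/*
 * Models update_rt_offset(): the monotonic -> realtime offset is the
 * negated wall_to_monotonic value, stored here as scalar nanoseconds.
 */
static int64_t model_offs_real_ns(struct model_timespec wall_to_monotonic)
{
	struct model_timespec tmp =
		model_normalize(-wall_to_monotonic.tv_sec,
				-(int64_t)wall_to_monotonic.tv_nsec);

	return tmp.tv_sec * MODEL_NSEC_PER_SEC + tmp.tv_nsec;
}
```

The normalization loops run only when the offset changes, so the
interrupt path is left with a plain 64-bit load.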
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: John Stultz <johnstul@us.ibm.com>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Prarit Bhargava <prarit@redhat.com>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/1341960205-56738-4-git-send-email-johnstul@us.ibm.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Linux Kernel <linux-kernel@vger.kernel.org>
Signed-off-by: John Stultz <johnstul@us.ibm.com>
---
kernel/time/timekeeping.c | 25 +++++++++++++++++++++++--
1 file changed, 23 insertions(+), 2 deletions(-)
diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index 9588f0c..615ec8d 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -70,6 +70,12 @@ struct timekeeper {
/* The raw monotonic time for the CLOCK_MONOTONIC_RAW posix clock. */
struct timespec raw_time;
+ /* Offset clock monotonic -> clock realtime */
+ ktime_t offs_real;
+
+ /* Offset clock monotonic -> clock boottime */
+ ktime_t offs_boot;
+
/* Seqlock for all timekeeper values */
seqlock_t lock;
};
@@ -172,6 +178,14 @@ static inline s64 timekeeping_get_ns_raw(void)
return clocksource_cyc2ns(cycle_delta, clock->mult, clock->shift);
}
+static void update_rt_offset(void)
+{
+ struct timespec tmp, *wtm = &timekeeper.wall_to_monotonic;
+
+ set_normalized_timespec(&tmp, -wtm->tv_sec, -wtm->tv_nsec);
+ timekeeper.offs_real = timespec_to_ktime(tmp);
+}
+
/* must hold write on timekeeper.lock */
static void timekeeping_update(bool clearntp)
{
@@ -179,6 +193,7 @@ static void timekeeping_update(bool clearntp)
timekeeper.ntp_error = 0;
ntp_clear();
}
+ update_rt_offset();
update_vsyscall(&timekeeper.xtime, &timekeeper.wall_to_monotonic,
timekeeper.clock, timekeeper.mult);
}
@@ -606,6 +621,7 @@ void __init timekeeping_init(void)
}
set_normalized_timespec(&timekeeper.wall_to_monotonic,
-boot.tv_sec, -boot.tv_nsec);
+ update_rt_offset();
timekeeper.total_sleep_time.tv_sec = 0;
timekeeper.total_sleep_time.tv_nsec = 0;
write_sequnlock_irqrestore(&timekeeper.lock, flags);
@@ -614,6 +630,12 @@ void __init timekeeping_init(void)
/* time in seconds when suspend began */
static struct timespec timekeeping_suspend_time;
+static void update_sleep_time(struct timespec t)
+{
+ timekeeper.total_sleep_time = t;
+ timekeeper.offs_boot = timespec_to_ktime(t);
+}
+
/**
* __timekeeping_inject_sleeptime - Internal function to add sleep interval
* @delta: pointer to a timespec delta value
@@ -632,8 +654,7 @@ static void __timekeeping_inject_sleeptime(struct timespec *delta)
timekeeper.xtime = timespec_add(timekeeper.xtime, *delta);
timekeeper.wall_to_monotonic =
timespec_sub(timekeeper.wall_to_monotonic, *delta);
- timekeeper.total_sleep_time = timespec_add(
- timekeeper.total_sleep_time, *delta);
+ update_sleep_time(timespec_add(timekeeper.total_sleep_time, *delta));
}
--
1.7.9.5
* [PATCH 4/7] 3.4.x: hrtimers: Move lock held region in hrtimer_interrupt()
2012-07-17 6:39 [PATCH 0/7] 3.4-stable: Fix for leapsecond caused hrtimer/futex issue John Stultz
` (2 preceding siblings ...)
2012-07-17 6:39 ` [PATCH 3/7] 3.4.x: timekeeping: Maintain ktime_t based offsets for hrtimers John Stultz
@ 2012-07-17 6:39 ` John Stultz
2012-07-17 21:59 ` Patch "hrtimers: Move lock held region in hrtimer_interrupt()" has been added to the 3.4-stable tree gregkh
2012-07-17 6:39 ` [PATCH 5/7] 3.4.x: timekeeping: Provide hrtimer update function John Stultz
` (3 subsequent siblings)
7 siblings, 1 reply; 16+ messages in thread
From: John Stultz @ 2012-07-17 6:39 UTC
To: stable; +Cc: Thomas Gleixner, John Stultz, Prarit Bhargava, Linux Kernel
From: Thomas Gleixner <tglx@linutronix.de>
This is a backport of 196951e91262fccda81147d2bcf7fdab08668b40
We need to update the base offsets from this code and we need to do
that under base->lock. Move the lock held region around the
ktime_get() calls. The ktime_get() calls are going to be replaced with
a function which gets the time and the offsets atomically.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Prarit Bhargava <prarit@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: John Stultz <johnstul@us.ibm.com>
Link: http://lkml.kernel.org/r/1341960205-56738-6-git-send-email-johnstul@us.ibm.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Linux Kernel <linux-kernel@vger.kernel.org>
Signed-off-by: John Stultz <johnstul@us.ibm.com>
---
kernel/hrtimer.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/kernel/hrtimer.c b/kernel/hrtimer.c
index 3c24fb2..8f320af 100644
--- a/kernel/hrtimer.c
+++ b/kernel/hrtimer.c
@@ -1263,11 +1263,10 @@ void hrtimer_interrupt(struct clock_event_device *dev)
cpu_base->nr_events++;
dev->next_event.tv64 = KTIME_MAX;
+ raw_spin_lock(&cpu_base->lock);
entry_time = now = ktime_get();
retry:
expires_next.tv64 = KTIME_MAX;
-
- raw_spin_lock(&cpu_base->lock);
/*
* We set expires_next to KTIME_MAX here with cpu_base->lock
* held to prevent that a timer is enqueued in our queue via
@@ -1344,6 +1343,7 @@ retry:
* interrupt routine. We give it 3 attempts to avoid
* overreacting on some spurious event.
*/
+ raw_spin_lock(&cpu_base->lock);
now = ktime_get();
cpu_base->nr_retries++;
if (++retries < 3)
@@ -1356,6 +1356,7 @@ retry:
*/
cpu_base->nr_hangs++;
cpu_base->hang_detected = 1;
+ raw_spin_unlock(&cpu_base->lock);
delta = ktime_sub(now, entry_time);
if (delta.tv64 > cpu_base->max_hang_time.tv64)
cpu_base->max_hang_time = delta;
--
1.7.9.5
* [PATCH 5/7] 3.4.x: timekeeping: Provide hrtimer update function
2012-07-17 6:39 [PATCH 0/7] 3.4-stable: Fix for leapsecond caused hrtimer/futex issue John Stultz
` (3 preceding siblings ...)
2012-07-17 6:39 ` [PATCH 4/7] 3.4.x: hrtimers: Move lock held region in hrtimer_interrupt() John Stultz
@ 2012-07-17 6:39 ` John Stultz
2012-07-17 21:59 ` Patch "timekeeping: Provide hrtimer update function" has been added to the 3.4-stable tree gregkh
2012-07-17 6:39 ` [PATCH 6/7] 3.4.x: hrtimer: Update hrtimer base offsets each hrtimer_interrupt John Stultz
` (2 subsequent siblings)
7 siblings, 1 reply; 16+ messages in thread
From: John Stultz @ 2012-07-17 6:39 UTC
To: stable; +Cc: Thomas Gleixner, John Stultz, Prarit Bhargava, Linux Kernel
From: Thomas Gleixner <tglx@linutronix.de>
This is a backport of f6c06abfb3972ad4914cef57d8348fcb2932bc3b
To finally fix the infamous leap second issue and other race windows
caused by functions which change the offsets between the various time
bases (CLOCK_MONOTONIC, CLOCK_REALTIME and CLOCK_BOOTTIME) we need a
function which atomically gets the current monotonic time and updates
the offsets of CLOCK_REALTIME and CLOCK_BOOTTIME with minimalistic
overhead. The previous patch which provides ktime_t offsets allows us
to make this function almost as cheap as ktime_get() which is going to
be replaced in hrtimer_interrupt().
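The "get time and offsets atomically" idea is a sequence-counter read
loop. The sketch below is a single-threaded userspace model only: it has
no memory barriers or atomics, so it is not a usable seqlock, and struct
tk_model and the model_tk_* functions are hypothetical names. It shows
the reader's accept/retry rule and how monotonic time falls out of the
realtime value and the offset.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the timekeeper fields read by the helper. */
struct tk_model {
	unsigned int seq;	/* odd while an update is in progress */
	int64_t xtime_ns;	/* current CLOCK_REALTIME in ns */
	int64_t offs_real_ns;	/* monotonic -> realtime offset */
	int64_t offs_boot_ns;	/* monotonic -> boottime offset */
};

/* Writer side: bump the counter around the update. */
static void model_tk_write(struct tk_model *tk, int64_t xtime_ns,
			   int64_t offs_real_ns, int64_t offs_boot_ns)
{
	tk->seq++;		/* odd: update in progress */
	tk->xtime_ns = xtime_ns;
	tk->offs_real_ns = offs_real_ns;
	tk->offs_boot_ns = offs_boot_ns;
	tk->seq++;		/* even: update complete */
	assert((tk->seq & 1u) == 0);
}

/*
 * One reader attempt. Returns false if the snapshot may be torn; the
 * caller is then expected to retry, overwriting the output offsets.
 */
static bool model_tk_try_read(const struct tk_model *tk, int64_t *now_mono_ns,
			      int64_t *offs_real_ns, int64_t *offs_boot_ns)
{
	unsigned int seq = tk->seq;
	int64_t xtime_ns;

	if (seq & 1u)
		return false;	/* writer active */

	xtime_ns = tk->xtime_ns;
	*offs_real_ns = tk->offs_real_ns;
	*offs_boot_ns = tk->offs_boot_ns;

	if (tk->seq != seq)
		return false;	/* an update raced with the reads */

	/* monotonic = realtime - offs_real, taken from the same snapshot */
	*now_mono_ns = xtime_ns - *offs_real_ns;
	return true;
}
```

Because the time and both offsets come from one accepted snapshot, the
caller can never combine a new time with old offsets, which is the
property hrtimer_interrupt() needs.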
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Prarit Bhargava <prarit@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: John Stultz <johnstul@us.ibm.com>
Link: http://lkml.kernel.org/r/1341960205-56738-7-git-send-email-johnstul@us.ibm.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Linux Kernel <linux-kernel@vger.kernel.org>
Signed-off-by: John Stultz <johnstul@us.ibm.com>
---
include/linux/hrtimer.h | 1 +
kernel/time/timekeeping.c | 34 ++++++++++++++++++++++++++++++++++
2 files changed, 35 insertions(+)
diff --git a/include/linux/hrtimer.h b/include/linux/hrtimer.h
index c9ec940..cc07d27 100644
--- a/include/linux/hrtimer.h
+++ b/include/linux/hrtimer.h
@@ -327,6 +327,7 @@ extern ktime_t ktime_get(void);
extern ktime_t ktime_get_real(void);
extern ktime_t ktime_get_boottime(void);
extern ktime_t ktime_get_monotonic_offset(void);
+extern ktime_t ktime_get_update_offsets(ktime_t *offs_real, ktime_t *offs_boot);
DECLARE_PER_CPU(struct tick_device, tick_cpu_device);
diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index 615ec8d..62e12c3 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -1273,6 +1273,40 @@ void get_xtime_and_monotonic_and_sleep_offset(struct timespec *xtim,
} while (read_seqretry(&timekeeper.lock, seq));
}
+#ifdef CONFIG_HIGH_RES_TIMERS
+/**
+ * ktime_get_update_offsets - hrtimer helper
+ * @offs_real: pointer to storage for monotonic -> realtime offset
+ * @offs_boot: pointer to storage for monotonic -> boottime offset
+ *
+ * Returns current monotonic time and updates the offsets
+ * Called from hrtimer_interupt() or retrigger_next_event()
+ */
+ktime_t ktime_get_update_offsets(ktime_t *offs_real, ktime_t *offs_boot)
+{
+ ktime_t now;
+ unsigned int seq;
+ u64 secs, nsecs;
+
+ do {
+ seq = read_seqbegin(&timekeeper.lock);
+
+ secs = timekeeper.xtime.tv_sec;
+ nsecs = timekeeper.xtime.tv_nsec;
+ nsecs += timekeeping_get_ns();
+ /* If arch requires, add in gettimeoffset() */
+ nsecs += arch_gettimeoffset();
+
+ *offs_real = timekeeper.offs_real;
+ *offs_boot = timekeeper.offs_boot;
+ } while (read_seqretry(&timekeeper.lock, seq));
+
+ now = ktime_add_ns(ktime_set(secs, 0), nsecs);
+ now = ktime_sub(now, *offs_real);
+ return now;
+}
+#endif
+
/**
* ktime_get_monotonic_offset() - get wall_to_monotonic in ktime_t format
*/
--
1.7.9.5
* [PATCH 6/7] 3.4.x: hrtimer: Update hrtimer base offsets each hrtimer_interrupt
2012-07-17 6:39 [PATCH 0/7] 3.4-stable: Fix for leapsecond caused hrtimer/futex issue John Stultz
` (4 preceding siblings ...)
2012-07-17 6:39 ` [PATCH 5/7] 3.4.x: timekeeping: Provide hrtimer update function John Stultz
@ 2012-07-17 6:39 ` John Stultz
2012-07-17 21:59 ` Patch "hrtimer: Update hrtimer base offsets each hrtimer_interrupt" has been added to the 3.4-stable tree gregkh
2012-07-17 6:39 ` [PATCH 7/7] 3.4.x: timekeeping: Add missing update call in timekeeping_resume() John Stultz
2012-07-17 21:29 ` [PATCH 0/7] 3.4-stable: Fix for leapsecond caused hrtimer/futex issue Greg KH
7 siblings, 1 reply; 16+ messages in thread
From: John Stultz @ 2012-07-17 6:39 UTC
To: stable; +Cc: John Stultz, Thomas Gleixner, Prarit Bhargava, Linux Kernel
This is a backport of 5baefd6d84163443215f4a99f6a20f054ef11236
The update of the hrtimer base offsets on all cpus cannot be made
atomically from the timekeeper.lock held and interrupt disabled region
as smp function calls are not allowed there.
clock_was_set(), which enforces the update on all cpus, is called
either from preemptible process context in case of do_settimeofday()
or from the softirq context when the offset modification happened in
the timer interrupt itself due to a leap second.
In both cases there is a race window for an hrtimer interrupt between
dropping timekeeper lock, enabling interrupts and clock_was_set()
issuing the updates. Any interrupt which arrives in that window will
see the new time but operate on stale offsets.
So we need to make sure that an hrtimer interrupt always sees a
consistent state of time and offsets.
ktime_get_update_offsets() allows us to get the current monotonic time
and update the per cpu hrtimer base offsets from hrtimer_interrupt()
to capture a consistent state of monotonic time and the offsets. The
function replaces the existing ktime_get() calls in hrtimer_interrupt().
The overhead of the new function vs. ktime_get() is minimal as it just
adds two store operations.
This ensures that any changes to realtime or boottime offsets are
noticed and stored into the per-cpu hrtimer base structures, prior to
any hrtimer expiration and guarantees that timers are not expired early.
Signed-off-by: John Stultz <johnstul@us.ibm.com>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Prarit Bhargava <prarit@redhat.com>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/1341960205-56738-8-git-send-email-johnstul@us.ibm.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Linux Kernel <linux-kernel@vger.kernel.org>
Signed-off-by: John Stultz <johnstul@us.ibm.com>
---
kernel/hrtimer.c | 28 ++++++++++++++--------------
1 file changed, 14 insertions(+), 14 deletions(-)
diff --git a/kernel/hrtimer.c b/kernel/hrtimer.c
index 8f320af..6db7a5e 100644
--- a/kernel/hrtimer.c
+++ b/kernel/hrtimer.c
@@ -657,6 +657,14 @@ static inline int hrtimer_enqueue_reprogram(struct hrtimer *timer,
return 0;
}
+static inline ktime_t hrtimer_update_base(struct hrtimer_cpu_base *base)
+{
+ ktime_t *offs_real = &base->clock_base[HRTIMER_BASE_REALTIME].offset;
+ ktime_t *offs_boot = &base->clock_base[HRTIMER_BASE_BOOTTIME].offset;
+
+ return ktime_get_update_offsets(offs_real, offs_boot);
+}
+
/*
* Retrigger next event is called after clock was set
*
@@ -665,22 +673,12 @@ static inline int hrtimer_enqueue_reprogram(struct hrtimer *timer,
static void retrigger_next_event(void *arg)
{
struct hrtimer_cpu_base *base = &__get_cpu_var(hrtimer_bases);
- struct timespec realtime_offset, xtim, wtm, sleep;
if (!hrtimer_hres_active())
return;
- /* Optimized out for !HIGH_RES */
- get_xtime_and_monotonic_and_sleep_offset(&xtim, &wtm, &sleep);
- set_normalized_timespec(&realtime_offset, -wtm.tv_sec, -wtm.tv_nsec);
-
- /* Adjust CLOCK_REALTIME offset */
raw_spin_lock(&base->lock);
- base->clock_base[HRTIMER_BASE_REALTIME].offset =
- timespec_to_ktime(realtime_offset);
- base->clock_base[HRTIMER_BASE_BOOTTIME].offset =
- timespec_to_ktime(sleep);
-
+ hrtimer_update_base(base);
hrtimer_force_reprogram(base, 0);
raw_spin_unlock(&base->lock);
}
@@ -710,7 +708,6 @@ static int hrtimer_switch_to_hres(void)
base->clock_base[i].resolution = KTIME_HIGH_RES;
tick_setup_sched_timer();
-
/* "Retrigger" the interrupt to get things going */
retrigger_next_event(NULL);
local_irq_restore(flags);
@@ -1264,7 +1261,7 @@ void hrtimer_interrupt(struct clock_event_device *dev)
dev->next_event.tv64 = KTIME_MAX;
raw_spin_lock(&cpu_base->lock);
- entry_time = now = ktime_get();
+ entry_time = now = hrtimer_update_base(cpu_base);
retry:
expires_next.tv64 = KTIME_MAX;
/*
@@ -1342,9 +1339,12 @@ retry:
* We need to prevent that we loop forever in the hrtimer
* interrupt routine. We give it 3 attempts to avoid
* overreacting on some spurious event.
+ *
+ * Acquire base lock for updating the offsets and retrieving
+ * the current time.
*/
raw_spin_lock(&cpu_base->lock);
- now = ktime_get();
+ now = hrtimer_update_base(cpu_base);
cpu_base->nr_retries++;
if (++retries < 3)
goto retry;
--
1.7.9.5
* [PATCH 7/7] 3.4.x: timekeeping: Add missing update call in timekeeping_resume()
2012-07-17 6:39 [PATCH 0/7] 3.4-stable: Fix for leapsecond caused hrtimer/futex issue John Stultz
` (5 preceding siblings ...)
2012-07-17 6:39 ` [PATCH 6/7] 3.4.x: hrtimer: Update hrtimer base offsets each hrtimer_interrupt John Stultz
@ 2012-07-17 6:39 ` John Stultz
2012-07-17 21:59 ` Patch "timekeeping: Add missing update call in timekeeping_resume()" has been added to the 3.4-stable tree gregkh
2012-07-17 21:29 ` [PATCH 0/7] 3.4-stable: Fix for leapsecond caused hrtimer/futex issue Greg KH
7 siblings, 1 reply; 16+ messages in thread
From: John Stultz @ 2012-07-17 6:39 UTC
To: stable
Cc: Thomas Gleixner, LKML, Linux PM list, John Stultz, Ingo Molnar,
Peter Zijlstra, Prarit Bhargava, Linus Torvalds
From: Thomas Gleixner <tglx@linutronix.de>
This is a backport of 3e997130bd2e8c6f5aaa49d6e3161d4d29b43ab0
The leap second rework unearthed another issue of inconsistent data.
On timekeeping_resume() the timekeeper data is updated, but nothing
calls timekeeping_update(), so now the update code in the timer
interrupt sees stale values.
This has been the case before those changes, but then the timer
interrupt was using stale data as well so this went unnoticed for quite
some time.
Add the missing update call, so all the data is consistent everywhere.
Reported-by: Andreas Schwab <schwab@linux-m68k.org>
Reported-and-tested-by: "Rafael J. Wysocki" <rjw@sisk.pl>
Reported-and-tested-by: Martin Steigerwald <Martin@lichtvoll.de>
Cc: LKML <linux-kernel@vger.kernel.org>
Cc: Linux PM list <linux-pm@vger.kernel.org>
Cc: John Stultz <johnstul@us.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: John Stultz <johnstul@us.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Linux Kernel <linux-kernel@vger.kernel.org>
Signed-off-by: John Stultz <johnstul@us.ibm.com>
---
kernel/time/timekeeping.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index 62e12c3..7c50de8 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -719,6 +719,7 @@ static void timekeeping_resume(void)
timekeeper.clock->cycle_last = timekeeper.clock->read(timekeeper.clock);
timekeeper.ntp_error = 0;
timekeeping_suspended = 0;
+ timekeeping_update(false);
write_sequnlock_irqrestore(&timekeeper.lock, flags);
touch_softlockup_watchdog();
--
1.7.9.5
* Re: [PATCH 0/7] 3.4-stable: Fix for leapsecond caused hrtimer/futex issue
2012-07-17 6:39 [PATCH 0/7] 3.4-stable: Fix for leapsecond caused hrtimer/futex issue John Stultz
` (6 preceding siblings ...)
2012-07-17 6:39 ` [PATCH 7/7] 3.4.x: timekeeping: Add missing update call in timekeeping_resume() John Stultz
@ 2012-07-17 21:29 ` Greg KH
7 siblings, 0 replies; 16+ messages in thread
From: Greg KH @ 2012-07-17 21:29 UTC
To: John Stultz; +Cc: stable, Prarit Bhargava, Thomas Gleixner, Linux Kernel
On Tue, Jul 17, 2012 at 02:39:49AM -0400, John Stultz wrote:
> Here is a backport of the leapsecond fixes to 3.4-stable. These are very
> straightforward, and backported to 3.4.x with no collisions or changes.
>
> This patchset resolves the early hrtimer/futex expiration issue
> widely seen after the June 30th leapsecond.
>
> I've booted and tested this patchset on two boxes and run through a number
> of leapsecond related stress tests. However, additional testing and review
> would be appreciated.
All now applied, thanks for doing this, that made my life so much
easier.
greg k-h
* Patch "hrtimer: Provide clock_was_set_delayed()" has been added to the 3.4-stable tree
2012-07-17 6:39 ` [PATCH 1/7] 3.4.x: hrtimer: Provide clock_was_set_delayed() John Stultz
@ 2012-07-17 21:59 ` gregkh
0 siblings, 0 replies; 16+ messages in thread
From: gregkh @ 2012-07-17 21:59 UTC
To: johnstul, a.p.zijlstra, gregkh, jengelh, linux-kernel, mingo,
prarit, tglx
Cc: stable, stable-commits
This is a note to let you know that I've just added the patch titled
hrtimer: Provide clock_was_set_delayed()
to the 3.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary
The filename of the patch is:
hrtimer-provide-clock_was_set_delayed.patch
and it can be found in the queue-3.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable@vger.kernel.org> know about it.
From johnstul@us.ibm.com Tue Jul 17 14:22:23 2012
From: John Stultz <johnstul@us.ibm.com>
Date: Tue, 17 Jul 2012 02:39:50 -0400
Subject: hrtimer: Provide clock_was_set_delayed()
To: stable@vger.kernel.org
Cc: John Stultz <johnstul@us.ibm.com>, Thomas Gleixner <tglx@linutronix.de>, Prarit Bhargava <prarit@redhat.com>, Linux Kernel <linux-kernel@vger.kernel.org>
Message-ID: <1342507196-54327-2-git-send-email-johnstul@us.ibm.com>
From: John Stultz <johnstul@us.ibm.com>
This is a backport of f55a6faa384304c89cfef162768e88374d3312cb
clock_was_set() cannot be called from hard interrupt context because
it calls on_each_cpu().
For fixing the widely reported leap seconds issue it is necessary to
call it from hard interrupt context, i.e. the timer tick code, which
does the timekeeping updates.
Provide a new function which records the request in the hrtimer cpu
base structure of the cpu on which it is called and raises the hrtimer
softirq. We then execute the clock_was_set() notification from
softirq context in run_hrtimer_softirq(). The hrtimer softirq is
rarely used, so polling the flag there is not a performance issue.
[ tglx: Made it depend on CONFIG_HIGH_RES_TIMERS. We really should get
rid of all this ifdeffery ASAP ]
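The deferral described above can be sketched in plain userspace C. This is only an illustration of the flag-plus-softirq pattern, not the kernel code: the struct and every function name ending in _sketch are hypothetical stand-ins, and the "softirq" is modelled as a simple pending flag.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the per-CPU hrtimer base; not the kernel struct. */
struct cpu_base_sketch {
	unsigned int clock_was_set;	/* set in "hard irq", consumed in "softirq" */
	bool softirq_pending;		/* models a raised HRTIMER_SOFTIRQ */
	unsigned int notifications;	/* how often the expensive notify ran */
};

/* Models clock_was_set(): the expensive part that must not run in hard irq. */
void notify_all_cpus_sketch(struct cpu_base_sketch *base)
{
	base->notifications++;
}

/* Models clock_was_set_delayed(): only records the request and raises the softirq. */
void clock_was_set_delayed_sketch(struct cpu_base_sketch *base)
{
	base->clock_was_set = 1;
	base->softirq_pending = true;
}

/* Models run_hrtimer_softirq(): polls the flag and performs the deferred work once. */
void run_softirq_sketch(struct cpu_base_sketch *base)
{
	if (!base->softirq_pending)
		return;
	base->softirq_pending = false;

	if (base->clock_was_set) {
		base->clock_was_set = 0;
		notify_all_cpus_sketch(base);
	}
}
```

The point of the split is that the caller in interrupt context only does two cheap stores, while the notification runs later and at most once per request.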
Signed-off-by: John Stultz <johnstul@us.ibm.com>
Reported-by: Jan Engelhardt <jengelh@inai.de>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Prarit Bhargava <prarit@redhat.com>
Link: http://lkml.kernel.org/r/1341960205-56738-2-git-send-email-johnstul@us.ibm.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: John Stultz <johnstul@us.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
include/linux/hrtimer.h | 9 ++++++++-
kernel/hrtimer.c | 20 ++++++++++++++++++++
2 files changed, 28 insertions(+), 1 deletion(-)
--- a/include/linux/hrtimer.h
+++ b/include/linux/hrtimer.h
@@ -165,6 +165,7 @@ enum hrtimer_base_type {
* @lock: lock protecting the base and associated clock bases
* and timers
* @active_bases: Bitfield to mark bases with active timers
+ * @clock_was_set: Indicates that clock was set from irq context.
* @expires_next: absolute time of the next event which was scheduled
* via clock_set_next_event()
* @hres_active: State of high resolution mode
@@ -177,7 +178,8 @@ enum hrtimer_base_type {
*/
struct hrtimer_cpu_base {
raw_spinlock_t lock;
- unsigned long active_bases;
+ unsigned int active_bases;
+ unsigned int clock_was_set;
#ifdef CONFIG_HIGH_RES_TIMERS
ktime_t expires_next;
int hres_active;
@@ -286,6 +288,8 @@ extern void hrtimer_peek_ahead_timers(vo
# define MONOTONIC_RES_NSEC HIGH_RES_NSEC
# define KTIME_MONOTONIC_RES KTIME_HIGH_RES
+extern void clock_was_set_delayed(void);
+
#else
# define MONOTONIC_RES_NSEC LOW_RES_NSEC
@@ -306,6 +310,9 @@ static inline int hrtimer_is_hres_active
{
return 0;
}
+
+static inline void clock_was_set_delayed(void) { }
+
#endif
extern void clock_was_set(void);
--- a/kernel/hrtimer.c
+++ b/kernel/hrtimer.c
@@ -717,6 +717,19 @@ static int hrtimer_switch_to_hres(void)
return 1;
}
+/*
+ * Called from timekeeping code to reprogram the hrtimer interrupt
+ * device. If called from the timer interrupt context we defer it to
+ * softirq context.
+ */
+void clock_was_set_delayed(void)
+{
+ struct hrtimer_cpu_base *cpu_base = &__get_cpu_var(hrtimer_bases);
+
+ cpu_base->clock_was_set = 1;
+ __raise_softirq_irqoff(HRTIMER_SOFTIRQ);
+}
+
#else
static inline int hrtimer_hres_active(void) { return 0; }
@@ -1395,6 +1408,13 @@ void hrtimer_peek_ahead_timers(void)
static void run_hrtimer_softirq(struct softirq_action *h)
{
+ struct hrtimer_cpu_base *cpu_base = &__get_cpu_var(hrtimer_bases);
+
+ if (cpu_base->clock_was_set) {
+ cpu_base->clock_was_set = 0;
+ clock_was_set();
+ }
+
hrtimer_peek_ahead_timers();
}
Patches currently in stable-queue which might be from johnstul@us.ibm.com are
queue-3.4/timekeeping-fix-leapsecond-triggered-load-spike-issue.patch
queue-3.4/hrtimer-update-hrtimer-base-offsets-each-hrtimer_interrupt.patch
queue-3.4/timekeeping-add-missing-update-call-in-timekeeping_resume.patch
queue-3.4/hrtimers-move-lock-held-region-in-hrtimer_interrupt.patch
queue-3.4/hrtimer-provide-clock_was_set_delayed.patch
queue-3.4/timekeeping-provide-hrtimer-update-function.patch
queue-3.4/timekeeping-maintain-ktime_t-based-offsets-for-hrtimers.patch
* Patch "hrtimers: Move lock held region in hrtimer_interrupt()" has been added to the 3.4-stable tree
2012-07-17 6:39 ` [PATCH 4/7] 3.4.x: hrtimers: Move lock held region in hrtimer_interrupt() John Stultz
@ 2012-07-17 21:59 ` gregkh
0 siblings, 0 replies; 16+ messages in thread
From: gregkh @ 2012-07-17 21:59 UTC (permalink / raw)
To: johnstul, a.p.zijlstra, gregkh, linux-kernel, mingo, prarit, tglx
Cc: stable, stable-commits
This is a note to let you know that I've just added the patch titled
hrtimers: Move lock held region in hrtimer_interrupt()
to the 3.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary
The filename of the patch is:
hrtimers-move-lock-held-region-in-hrtimer_interrupt.patch
and it can be found in the queue-3.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable@vger.kernel.org> know about it.
>From johnstul@us.ibm.com Tue Jul 17 14:27:06 2012
From: John Stultz <johnstul@us.ibm.com>
Date: Tue, 17 Jul 2012 02:39:53 -0400
Subject: hrtimers: Move lock held region in hrtimer_interrupt()
To: stable@vger.kernel.org
Cc: Thomas Gleixner <tglx@linutronix.de>, John Stultz <johnstul@us.ibm.com>, Prarit Bhargava <prarit@redhat.com>, Linux Kernel <linux-kernel@vger.kernel.org>
Message-ID: <1342507196-54327-5-git-send-email-johnstul@us.ibm.com>
From: Thomas Gleixner <tglx@linutronix.de>
This is a backport of 196951e91262fccda81147d2bcf7fdab08668b40
We need to update the base offsets from this code and we need to do
that under base->lock. Move the lock held region around the
ktime_get() calls. The ktime_get() calls are going to be replaced with
a function which gets the time and the offsets atomically.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Prarit Bhargava <prarit@redhat.com>
Signed-off-by: John Stultz <johnstul@us.ibm.com>
Link: http://lkml.kernel.org/r/1341960205-56738-6-git-send-email-johnstul@us.ibm.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: John Stultz <johnstul@us.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
kernel/hrtimer.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
--- a/kernel/hrtimer.c
+++ b/kernel/hrtimer.c
@@ -1263,11 +1263,10 @@ void hrtimer_interrupt(struct clock_even
cpu_base->nr_events++;
dev->next_event.tv64 = KTIME_MAX;
+ raw_spin_lock(&cpu_base->lock);
entry_time = now = ktime_get();
retry:
expires_next.tv64 = KTIME_MAX;
-
- raw_spin_lock(&cpu_base->lock);
/*
* We set expires_next to KTIME_MAX here with cpu_base->lock
* held to prevent that a timer is enqueued in our queue via
@@ -1344,6 +1343,7 @@ retry:
* interrupt routine. We give it 3 attempts to avoid
* overreacting on some spurious event.
*/
+ raw_spin_lock(&cpu_base->lock);
now = ktime_get();
cpu_base->nr_retries++;
if (++retries < 3)
@@ -1356,6 +1356,7 @@ retry:
*/
cpu_base->nr_hangs++;
cpu_base->hang_detected = 1;
+ raw_spin_unlock(&cpu_base->lock);
delta = ktime_sub(now, entry_time);
if (delta.tv64 > cpu_base->max_hang_time.tv64)
cpu_base->max_hang_time = delta;
* Patch "hrtimer: Update hrtimer base offsets each hrtimer_interrupt" has been added to the 3.4-stable tree
2012-07-17 6:39 ` [PATCH 6/7] 3.4.x: hrtimer: Update hrtimer base offsets each hrtimer_interrupt John Stultz
@ 2012-07-17 21:59 ` gregkh
0 siblings, 0 replies; 16+ messages in thread
From: gregkh @ 2012-07-17 21:59 UTC (permalink / raw)
To: johnstul, a.p.zijlstra, gregkh, linux-kernel, mingo, prarit, tglx
Cc: stable, stable-commits
This is a note to let you know that I've just added the patch titled
hrtimer: Update hrtimer base offsets each hrtimer_interrupt
to the 3.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary
The filename of the patch is:
hrtimer-update-hrtimer-base-offsets-each-hrtimer_interrupt.patch
and it can be found in the queue-3.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable@vger.kernel.org> know about it.
>From johnstul@us.ibm.com Tue Jul 17 14:28:09 2012
From: John Stultz <johnstul@us.ibm.com>
Date: Tue, 17 Jul 2012 02:39:55 -0400
Subject: hrtimer: Update hrtimer base offsets each hrtimer_interrupt
To: stable@vger.kernel.org
Cc: John Stultz <johnstul@us.ibm.com>, Thomas Gleixner <tglx@linutronix.de>, Prarit Bhargava <prarit@redhat.com>, Linux Kernel <linux-kernel@vger.kernel.org>
Message-ID: <1342507196-54327-7-git-send-email-johnstul@us.ibm.com>
From: John Stultz <johnstul@us.ibm.com>
This is a backport of 5baefd6d84163443215f4a99f6a20f054ef11236
The update of the hrtimer base offsets on all cpus cannot be made
atomically from the timekeeper.lock held and interrupt disabled region
as smp function calls are not allowed there.
clock_was_set(), which enforces the update on all cpus, is called
either from preemptible process context in case of do_settimeofday()
or from the softirq context when the offset modification happened in
the timer interrupt itself due to a leap second.
In both cases there is a race window for an hrtimer interrupt between
dropping timekeeper lock, enabling interrupts and clock_was_set()
issuing the updates. Any interrupt which arrives in that window will
see the new time but operate on stale offsets.
So we need to make sure that an hrtimer interrupt always sees a
consistent state of time and offsets.
ktime_get_update_offsets() allows us to get the current monotonic time
and update the per cpu hrtimer base offsets from hrtimer_interrupt()
to capture a consistent state of monotonic time and the offsets. The
function replaces the existing ktime_get() calls in hrtimer_interrupt().
The overhead of the new function vs. ktime_get() is minimal as it just
adds two store operations.
This ensures that any changes to realtime or boottime offsets are
noticed and stored into the per-cpu hrtimer base structures, prior to
any hrtimer expiration and guarantees that timers are not expired early.
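To make the "new time, stale offsets" window concrete, here is a minimal userspace sketch of the expiry check. It assumes a CLOCK_REALTIME timer is considered expired once monotonic time plus the base offset reaches its absolute expiry; the function name and the numbers in the usage note are made up for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical expiry check: a CLOCK_REALTIME timer fires once
 * monotonic time plus the monotonic -> realtime offset reaches the
 * timer's absolute realtime expiry. All values are nanoseconds.
 */
bool realtime_timer_expired_sketch(int64_t mono_ns, int64_t offs_real_ns,
				   int64_t expires_real_ns)
{
	return mono_ns + offs_real_ns >= expires_real_ns;
}
```

For example, with monotonic time at 100s and a timer set for realtime 1100s, an offset of 1000s left over from before an inserted leap second makes the timer look expired, while the updated offset of 999s does not. That is the early expiry the patch is meant to rule out.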
Signed-off-by: John Stultz <johnstul@us.ibm.com>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Prarit Bhargava <prarit@redhat.com>
Link: http://lkml.kernel.org/r/1341960205-56738-8-git-send-email-johnstul@us.ibm.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: John Stultz <johnstul@us.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
kernel/hrtimer.c | 28 ++++++++++++++--------------
1 file changed, 14 insertions(+), 14 deletions(-)
--- a/kernel/hrtimer.c
+++ b/kernel/hrtimer.c
@@ -657,6 +657,14 @@ static inline int hrtimer_enqueue_reprog
return 0;
}
+static inline ktime_t hrtimer_update_base(struct hrtimer_cpu_base *base)
+{
+ ktime_t *offs_real = &base->clock_base[HRTIMER_BASE_REALTIME].offset;
+ ktime_t *offs_boot = &base->clock_base[HRTIMER_BASE_BOOTTIME].offset;
+
+ return ktime_get_update_offsets(offs_real, offs_boot);
+}
+
/*
* Retrigger next event is called after clock was set
*
@@ -665,22 +673,12 @@ static inline int hrtimer_enqueue_reprog
static void retrigger_next_event(void *arg)
{
struct hrtimer_cpu_base *base = &__get_cpu_var(hrtimer_bases);
- struct timespec realtime_offset, xtim, wtm, sleep;
if (!hrtimer_hres_active())
return;
- /* Optimized out for !HIGH_RES */
- get_xtime_and_monotonic_and_sleep_offset(&xtim, &wtm, &sleep);
- set_normalized_timespec(&realtime_offset, -wtm.tv_sec, -wtm.tv_nsec);
-
- /* Adjust CLOCK_REALTIME offset */
raw_spin_lock(&base->lock);
- base->clock_base[HRTIMER_BASE_REALTIME].offset =
- timespec_to_ktime(realtime_offset);
- base->clock_base[HRTIMER_BASE_BOOTTIME].offset =
- timespec_to_ktime(sleep);
-
+ hrtimer_update_base(base);
hrtimer_force_reprogram(base, 0);
raw_spin_unlock(&base->lock);
}
@@ -710,7 +708,6 @@ static int hrtimer_switch_to_hres(void)
base->clock_base[i].resolution = KTIME_HIGH_RES;
tick_setup_sched_timer();
-
/* "Retrigger" the interrupt to get things going */
retrigger_next_event(NULL);
local_irq_restore(flags);
@@ -1264,7 +1261,7 @@ void hrtimer_interrupt(struct clock_even
dev->next_event.tv64 = KTIME_MAX;
raw_spin_lock(&cpu_base->lock);
- entry_time = now = ktime_get();
+ entry_time = now = hrtimer_update_base(cpu_base);
retry:
expires_next.tv64 = KTIME_MAX;
/*
@@ -1342,9 +1339,12 @@ retry:
* We need to prevent that we loop forever in the hrtimer
* interrupt routine. We give it 3 attempts to avoid
* overreacting on some spurious event.
+ *
+ * Acquire base lock for updating the offsets and retrieving
+ * the current time.
*/
raw_spin_lock(&cpu_base->lock);
- now = ktime_get();
+ now = hrtimer_update_base(cpu_base);
cpu_base->nr_retries++;
if (++retries < 3)
goto retry;
* Patch "timekeeping: Add missing update call in timekeeping_resume()" has been added to the 3.4-stable tree
2012-07-17 6:39 ` [PATCH 7/7] 3.4.x: timekeeping: Add missing update call in timekeeping_resume() John Stultz
@ 2012-07-17 21:59 ` gregkh
0 siblings, 0 replies; 16+ messages in thread
From: gregkh @ 2012-07-17 21:59 UTC (permalink / raw)
To: johnstul, a.p.zijlstra, gregkh, linux-kernel, linux-pm, Martin,
mingo, prarit, rjw, schwab, tglx, torvalds
Cc: stable, stable-commits
This is a note to let you know that I've just added the patch titled
timekeeping: Add missing update call in timekeeping_resume()
to the 3.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary
The filename of the patch is:
timekeeping-add-missing-update-call-in-timekeeping_resume.patch
and it can be found in the queue-3.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable@vger.kernel.org> know about it.
>From johnstul@us.ibm.com Tue Jul 17 14:27:38 2012
From: John Stultz <johnstul@us.ibm.com>
Date: Tue, 17 Jul 2012 02:39:56 -0400
Subject: timekeeping: Add missing update call in timekeeping_resume()
To: stable@vger.kernel.org
Cc: Thomas Gleixner <tglx@linutronix.de>, LKML <linux-kernel@vger.kernel.org>, Linux PM list <linux-pm@vger.kernel.org>, John Stultz <johnstul@us.ibm.com>, Ingo Molnar <mingo@kernel.org>, Peter Zijlstra <a.p.zijlstra@chello.nl>, Prarit Bhargava <prarit@redhat.com>, Linus Torvalds <torvalds@linux-foundation.org>
Message-ID: <1342507196-54327-8-git-send-email-johnstul@us.ibm.com>
From: Thomas Gleixner <tglx@linutronix.de>
This is a backport of 3e997130bd2e8c6f5aaa49d6e3161d4d29b43ab0
The leap second rework unearthed another issue of inconsistent data.
On timekeeping_resume() the timekeeper data is updated, but nothing
calls timekeeping_update(), so now the update code in the timer
interrupt sees stale values.
This was already the case before those changes, but back then the
timer interrupt was using stale data as well, so it went unnoticed for
quite some time.
Add the missing update call, so all the data is consistent everywhere.
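A rough userspace sketch of why the missing call matters, assuming the cached realtime offset is simply the negated wall_to_monotonic value and is only refreshed by the update function. The struct, the function names and the call_update switch are hypothetical and exist only to show both behaviours side by side.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified timekeeper: primary value plus a derived cache. */
struct tk_resume_sketch {
	int64_t wall_to_monotonic_ns;	/* primary data */
	int64_t offs_real_ns;		/* derived: -wall_to_monotonic_ns */
};

/* Stands in for timekeeping_update(): refreshes the derived value. */
void timekeeping_update_sketch(struct tk_resume_sketch *tk)
{
	tk->offs_real_ns = -tk->wall_to_monotonic_ns;
}

/*
 * Stands in for the resume path: the time spent asleep is folded into
 * the primary data. Without the update call the cache stays stale.
 */
void resume_sketch(struct tk_resume_sketch *tk, int64_t slept_ns,
		   bool call_update)
{
	tk->wall_to_monotonic_ns -= slept_ns;
	if (call_update)
		timekeeping_update_sketch(tk);
}
```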
Reported-by: Andreas Schwab <schwab@linux-m68k.org>
Reported-and-tested-by: "Rafael J. Wysocki" <rjw@sisk.pl>
Reported-and-tested-by: Martin Steigerwald <Martin@lichtvoll.de>
Cc: John Stultz <johnstul@us.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Prarit Bhargava <prarit@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: John Stultz <johnstul@us.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: John Stultz <johnstul@us.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
kernel/time/timekeeping.c | 1 +
1 file changed, 1 insertion(+)
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -719,6 +719,7 @@ static void timekeeping_resume(void)
timekeeper.clock->cycle_last = timekeeper.clock->read(timekeeper.clock);
timekeeper.ntp_error = 0;
timekeeping_suspended = 0;
+ timekeeping_update(false);
write_sequnlock_irqrestore(&timekeeper.lock, flags);
touch_softlockup_watchdog();
* Patch "timekeeping: Fix leapsecond triggered load spike issue" has been added to the 3.4-stable tree
2012-07-17 6:39 ` [PATCH 2/7] 3.4.x: timekeeping: Fix leapsecond triggered load spike issue John Stultz
@ 2012-07-17 21:59 ` gregkh
0 siblings, 0 replies; 16+ messages in thread
From: gregkh @ 2012-07-17 21:59 UTC (permalink / raw)
To: johnstul, a.p.zijlstra, gregkh, jengelh, linux-kernel, mingo,
prarit, tglx
Cc: stable, stable-commits
This is a note to let you know that I've just added the patch titled
timekeeping: Fix leapsecond triggered load spike issue
to the 3.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary
The filename of the patch is:
timekeeping-fix-leapsecond-triggered-load-spike-issue.patch
and it can be found in the queue-3.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable@vger.kernel.org> know about it.
>From johnstul@us.ibm.com Tue Jul 17 14:22:56 2012
From: John Stultz <johnstul@us.ibm.com>
Date: Tue, 17 Jul 2012 02:39:51 -0400
Subject: timekeeping: Fix leapsecond triggered load spike issue
To: stable@vger.kernel.org
Cc: John Stultz <johnstul@us.ibm.com>, Thomas Gleixner <tglx@linutronix.de>, Prarit Bhargava <prarit@redhat.com>, Linux Kernel <linux-kernel@vger.kernel.org>
Message-ID: <1342507196-54327-3-git-send-email-johnstul@us.ibm.com>
From: John Stultz <johnstul@us.ibm.com>
This is a backport of 4873fa070ae84a4115f0b3c9dfabc224f1bc7c51
The timekeeping code misses an update of the hrtimer subsystem after a
leap second has happened. As a result, timers based on CLOCK_REALTIME
expire a second early or late, depending on whether a leap second was
inserted or deleted, until some operation triggers that update. Unless
the update happens by some other means, this discrepancy between the
timekeeping and the hrtimer data persists and timers keep expiring
either early or late.
The reported immediate workaround - $ date -s "`date`" - causes a
call to clock_was_set(), which updates the hrtimer data structures.
See: http://www.sheeri.com/content/mysql-and-leap-second-high-cpu-and-fix
Add the missing clock_was_set() call to update_wall_time() in case of
a leap second event. The actual update is deferred to softirq context
as the necessary smp function call cannot be invoked from hard
interrupt context.
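The bookkeeping at the leap second can be sketched as follows. This is a simplified userspace model working in whole seconds; the struct and function names are hypothetical, the pending flag merely stands in for the deferred clock_was_set() request, and leap is assumed to be the signed step applied to wall time (-1 for an inserted second in this sketch).

```c
#include <stdint.h>

/* Hypothetical, seconds-only model of the relevant timekeeper fields. */
struct tk_leap_sketch {
	int64_t xtime_sec;		/* wall clock seconds */
	int64_t wall_to_monotonic_sec;	/* added to wall time to get monotonic */
	unsigned int clock_was_set_pending;
};

int64_t monotonic_sec_sketch(const struct tk_leap_sketch *tk)
{
	return tk->xtime_sec + tk->wall_to_monotonic_sec;
}

/*
 * Apply a leap step: wall time moves, monotonic time does not, and the
 * hrtimer side has to be told that its realtime offset is now stale.
 */
void apply_leap_sketch(struct tk_leap_sketch *tk, int leap)
{
	tk->xtime_sec += leap;
	tk->wall_to_monotonic_sec -= leap;
	if (leap)
		tk->clock_was_set_pending = 1;
}
```

Because the two adjustments cancel out, monotonic time is unaffected; only consumers that cache the realtime offset need the notification.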
Signed-off-by: John Stultz <johnstul@us.ibm.com>
Reported-by: Jan Engelhardt <jengelh@inai.de>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Prarit Bhargava <prarit@redhat.com>
Link: http://lkml.kernel.org/r/1341960205-56738-3-git-send-email-johnstul@us.ibm.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: John Stultz <johnstul@us.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
kernel/time/timekeeping.c | 4 ++++
1 file changed, 4 insertions(+)
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -965,6 +965,8 @@ static cycle_t logarithmic_accumulation(
leap = second_overflow(timekeeper.xtime.tv_sec);
timekeeper.xtime.tv_sec += leap;
timekeeper.wall_to_monotonic.tv_sec -= leap;
+ if (leap)
+ clock_was_set_delayed();
}
/* Accumulate raw time */
@@ -1081,6 +1083,8 @@ static void update_wall_time(void)
leap = second_overflow(timekeeper.xtime.tv_sec);
timekeeper.xtime.tv_sec += leap;
timekeeper.wall_to_monotonic.tv_sec -= leap;
+ if (leap)
+ clock_was_set_delayed();
}
timekeeping_update(false);
* Patch "timekeeping: Maintain ktime_t based offsets for hrtimers" has been added to the 3.4-stable tree
2012-07-17 6:39 ` [PATCH 3/7] 3.4.x: timekeeping: Maintain ktime_t based offsets for hrtimers John Stultz
@ 2012-07-17 21:59 ` gregkh
0 siblings, 0 replies; 16+ messages in thread
From: gregkh @ 2012-07-17 21:59 UTC (permalink / raw)
To: johnstul, a.p.zijlstra, gregkh, linux-kernel, mingo, prarit, tglx
Cc: stable, stable-commits
This is a note to let you know that I've just added the patch titled
timekeeping: Maintain ktime_t based offsets for hrtimers
to the 3.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary
The filename of the patch is:
timekeeping-maintain-ktime_t-based-offsets-for-hrtimers.patch
and it can be found in the queue-3.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable@vger.kernel.org> know about it.
>From johnstul@us.ibm.com Tue Jul 17 14:23:18 2012
From: John Stultz <johnstul@us.ibm.com>
Date: Tue, 17 Jul 2012 02:39:52 -0400
Subject: timekeeping: Maintain ktime_t based offsets for hrtimers
To: stable@vger.kernel.org
Cc: Thomas Gleixner <tglx@linutronix.de>, John Stultz <johnstul@us.ibm.com>, Prarit Bhargava <prarit@redhat.com>, Linux Kernel <linux-kernel@vger.kernel.org>
Message-ID: <1342507196-54327-4-git-send-email-johnstul@us.ibm.com>
From: Thomas Gleixner <tglx@linutronix.de>
This is a backport of 5b9fe759a678e05be4937ddf03d50e950207c1c0
We need to update the hrtimer clock offsets from the hrtimer interrupt
context. To avoid conversions from timespec to ktime_t maintain a
ktime_t based representation of those offsets in the timekeeper. This
puts the conversion overhead into the code which updates the
underlying offsets and provides fast accessible values in the hrtimer
interrupt.
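As an illustration of the conversion being moved out of the interrupt path, here is a userspace sketch that negates a wall_to_monotonic style timespec, normalizes it, and stores the result as a single nanosecond value. The types and names are hypothetical, and a plain int64_t nanosecond count is assumed in place of ktime_t.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC_SKETCH 1000000000LL

/* Hypothetical stand-in for struct timespec. */
struct timespec_sketch {
	int64_t tv_sec;
	int64_t tv_nsec;
};

/* Bring tv_nsec into [0, NSEC_PER_SEC_SKETCH), adjusting tv_sec to match. */
struct timespec_sketch normalize_sketch(int64_t sec, int64_t nsec)
{
	struct timespec_sketch ts;

	while (nsec >= NSEC_PER_SEC_SKETCH) {
		nsec -= NSEC_PER_SEC_SKETCH;
		sec++;
	}
	while (nsec < 0) {
		nsec += NSEC_PER_SEC_SKETCH;
		sec--;
	}
	ts.tv_sec = sec;
	ts.tv_nsec = nsec;
	return ts;
}

/* Monotonic -> realtime offset in nanoseconds: the negated wall_to_monotonic. */
int64_t offs_real_from_wtm_sketch(struct timespec_sketch wtm)
{
	struct timespec_sketch tmp = normalize_sketch(-wtm.tv_sec, -wtm.tv_nsec);

	assert(tmp.tv_nsec >= 0 && tmp.tv_nsec < NSEC_PER_SEC_SKETCH);
	return tmp.tv_sec * NSEC_PER_SEC_SKETCH + tmp.tv_nsec;
}
```

Doing this once per offset change means the hot path only has to copy a precomputed 64-bit value.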
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: John Stultz <johnstul@us.ibm.com>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Prarit Bhargava <prarit@redhat.com>
Link: http://lkml.kernel.org/r/1341960205-56738-4-git-send-email-johnstul@us.ibm.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: John Stultz <johnstul@us.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
kernel/time/timekeeping.c | 25 +++++++++++++++++++++++--
1 file changed, 23 insertions(+), 2 deletions(-)
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -70,6 +70,12 @@ struct timekeeper {
/* The raw monotonic time for the CLOCK_MONOTONIC_RAW posix clock. */
struct timespec raw_time;
+ /* Offset clock monotonic -> clock realtime */
+ ktime_t offs_real;
+
+ /* Offset clock monotonic -> clock boottime */
+ ktime_t offs_boot;
+
/* Seqlock for all timekeeper values */
seqlock_t lock;
};
@@ -172,6 +178,14 @@ static inline s64 timekeeping_get_ns_raw
return clocksource_cyc2ns(cycle_delta, clock->mult, clock->shift);
}
+static void update_rt_offset(void)
+{
+ struct timespec tmp, *wtm = &timekeeper.wall_to_monotonic;
+
+ set_normalized_timespec(&tmp, -wtm->tv_sec, -wtm->tv_nsec);
+ timekeeper.offs_real = timespec_to_ktime(tmp);
+}
+
/* must hold write on timekeeper.lock */
static void timekeeping_update(bool clearntp)
{
@@ -179,6 +193,7 @@ static void timekeeping_update(bool clea
timekeeper.ntp_error = 0;
ntp_clear();
}
+ update_rt_offset();
update_vsyscall(&timekeeper.xtime, &timekeeper.wall_to_monotonic,
timekeeper.clock, timekeeper.mult);
}
@@ -606,6 +621,7 @@ void __init timekeeping_init(void)
}
set_normalized_timespec(&timekeeper.wall_to_monotonic,
-boot.tv_sec, -boot.tv_nsec);
+ update_rt_offset();
timekeeper.total_sleep_time.tv_sec = 0;
timekeeper.total_sleep_time.tv_nsec = 0;
write_sequnlock_irqrestore(&timekeeper.lock, flags);
@@ -614,6 +630,12 @@ void __init timekeeping_init(void)
/* time in seconds when suspend began */
static struct timespec timekeeping_suspend_time;
+static void update_sleep_time(struct timespec t)
+{
+ timekeeper.total_sleep_time = t;
+ timekeeper.offs_boot = timespec_to_ktime(t);
+}
+
/**
* __timekeeping_inject_sleeptime - Internal function to add sleep interval
* @delta: pointer to a timespec delta value
@@ -632,8 +654,7 @@ static void __timekeeping_inject_sleepti
timekeeper.xtime = timespec_add(timekeeper.xtime, *delta);
timekeeper.wall_to_monotonic =
timespec_sub(timekeeper.wall_to_monotonic, *delta);
- timekeeper.total_sleep_time = timespec_add(
- timekeeper.total_sleep_time, *delta);
+ update_sleep_time(timespec_add(timekeeper.total_sleep_time, *delta));
}
* Patch "timekeeping: Provide hrtimer update function" has been added to the 3.4-stable tree
2012-07-17 6:39 ` [PATCH 5/7] 3.4.x: timekeeping: Provide hrtimer update function John Stultz
@ 2012-07-17 21:59 ` gregkh
0 siblings, 0 replies; 16+ messages in thread
From: gregkh @ 2012-07-17 21:59 UTC (permalink / raw)
To: johnstul, a.p.zijlstra, gregkh, linux-kernel, mingo, prarit, tglx
Cc: stable, stable-commits
This is a note to let you know that I've just added the patch titled
timekeeping: Provide hrtimer update function
to the 3.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary
The filename of the patch is:
timekeeping-provide-hrtimer-update-function.patch
and it can be found in the queue-3.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable@vger.kernel.org> know about it.
>From johnstul@us.ibm.com Tue Jul 17 14:27:23 2012
From: John Stultz <johnstul@us.ibm.com>
Date: Tue, 17 Jul 2012 02:39:54 -0400
Subject: timekeeping: Provide hrtimer update function
To: stable@vger.kernel.org
Cc: Thomas Gleixner <tglx@linutronix.de>, John Stultz <johnstul@us.ibm.com>, Prarit Bhargava <prarit@redhat.com>, Linux Kernel <linux-kernel@vger.kernel.org>
Message-ID: <1342507196-54327-6-git-send-email-johnstul@us.ibm.com>
From: Thomas Gleixner <tglx@linutronix.de>
This is a backport of f6c06abfb3972ad4914cef57d8348fcb2932bc3b
To finally fix the infamous leap second issue and other race windows
caused by functions which change the offsets between the various time
bases (CLOCK_MONOTONIC, CLOCK_REALTIME and CLOCK_BOOTTIME) we need a
function which atomically gets the current monotonic time and updates
the offsets of CLOCK_REALTIME and CLOCK_BOOTTIME with minimalistic
overhead. The previous patch which provides ktime_t offsets allows us
to make this function almost as cheap as ktime_get() which is going to
be replaced in hrtimer_interrupt().
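The idea of reading the time and both offsets as one consistent snapshot can be sketched in userspace with a sequence counter and C11 atomics. This is not the kernel seqlock API; all names are hypothetical, a single writer is assumed, and realtime is kept as one nanosecond value for brevity.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical timekeeper guarded by a sequence counter (single writer). */
struct tk_seq_sketch {
	atomic_uint seq;		/* even: stable, odd: write in progress */
	_Atomic int64_t real_ns;	/* current realtime in nanoseconds */
	_Atomic int64_t offs_real_ns;	/* monotonic -> realtime offset */
	_Atomic int64_t offs_boot_ns;	/* monotonic -> boottime offset */
};

void tk_seq_init_sketch(struct tk_seq_sketch *tk)
{
	atomic_init(&tk->seq, 0);
	atomic_init(&tk->real_ns, 0);
	atomic_init(&tk->offs_real_ns, 0);
	atomic_init(&tk->offs_boot_ns, 0);
}

/* Writer: make the counter odd, update all fields, make it even again. */
void tk_seq_write_sketch(struct tk_seq_sketch *tk, int64_t real_ns,
			 int64_t offs_real_ns, int64_t offs_boot_ns)
{
	atomic_fetch_add(&tk->seq, 1);
	atomic_store(&tk->real_ns, real_ns);
	atomic_store(&tk->offs_real_ns, offs_real_ns);
	atomic_store(&tk->offs_boot_ns, offs_boot_ns);
	atomic_fetch_add(&tk->seq, 1);
}

/*
 * Reader: retry until time and offsets were read without a concurrent
 * write, then return monotonic time derived from that same snapshot.
 */
int64_t get_update_offsets_sketch(struct tk_seq_sketch *tk,
				  int64_t *offs_real_ns, int64_t *offs_boot_ns)
{
	unsigned int start;
	int64_t real_ns;

	do {
		start = atomic_load(&tk->seq);
		real_ns = atomic_load(&tk->real_ns);
		*offs_real_ns = atomic_load(&tk->offs_real_ns);
		*offs_boot_ns = atomic_load(&tk->offs_boot_ns);
	} while ((start & 1u) || atomic_load(&tk->seq) != start);

	return real_ns - *offs_real_ns;
}
```

The extra cost over reading the time alone is just copying the two offsets inside the retry loop, which matches the "almost as cheap" argument made above.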
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Prarit Bhargava <prarit@redhat.com>
Signed-off-by: John Stultz <johnstul@us.ibm.com>
Link: http://lkml.kernel.org/r/1341960205-56738-7-git-send-email-johnstul@us.ibm.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: John Stultz <johnstul@us.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
include/linux/hrtimer.h | 1 +
kernel/time/timekeeping.c | 34 ++++++++++++++++++++++++++++++++++
2 files changed, 35 insertions(+)
--- a/include/linux/hrtimer.h
+++ b/include/linux/hrtimer.h
@@ -327,6 +327,7 @@ extern ktime_t ktime_get(void);
extern ktime_t ktime_get_real(void);
extern ktime_t ktime_get_boottime(void);
extern ktime_t ktime_get_monotonic_offset(void);
+extern ktime_t ktime_get_update_offsets(ktime_t *offs_real, ktime_t *offs_boot);
DECLARE_PER_CPU(struct tick_device, tick_cpu_device);
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -1273,6 +1273,40 @@ void get_xtime_and_monotonic_and_sleep_o
} while (read_seqretry(&timekeeper.lock, seq));
}
+#ifdef CONFIG_HIGH_RES_TIMERS
+/**
+ * ktime_get_update_offsets - hrtimer helper
+ * @offs_real: pointer to storage for monotonic -> realtime offset
+ * @offs_boot: pointer to storage for monotonic -> boottime offset
+ *
+ * Returns current monotonic time and updates the offsets
+ * Called from hrtimer_interrupt() or retrigger_next_event()
+ */
+ktime_t ktime_get_update_offsets(ktime_t *offs_real, ktime_t *offs_boot)
+{
+ ktime_t now;
+ unsigned int seq;
+ u64 secs, nsecs;
+
+ do {
+ seq = read_seqbegin(&timekeeper.lock);
+
+ secs = timekeeper.xtime.tv_sec;
+ nsecs = timekeeper.xtime.tv_nsec;
+ nsecs += timekeeping_get_ns();
+ /* If arch requires, add in gettimeoffset() */
+ nsecs += arch_gettimeoffset();
+
+ *offs_real = timekeeper.offs_real;
+ *offs_boot = timekeeper.offs_boot;
+ } while (read_seqretry(&timekeeper.lock, seq));
+
+ now = ktime_add_ns(ktime_set(secs, 0), nsecs);
+ now = ktime_sub(now, *offs_real);
+ return now;
+}
+#endif
+
/**
* ktime_get_monotonic_offset() - get wall_to_monotonic in ktime_t format
*/
Thread overview: 16+ messages
2012-07-17 6:39 [PATCH 0/7] 3.4-stable: Fix for leapsecond caused hrtimer/futex issue John Stultz
2012-07-17 6:39 ` [PATCH 1/7] 3.4.x: hrtimer: Provide clock_was_set_delayed() John Stultz
2012-07-17 21:59 ` Patch "hrtimer: Provide clock_was_set_delayed()" has been added to the 3.4-stable tree gregkh
2012-07-17 6:39 ` [PATCH 2/7] 3.4.x: timekeeping: Fix leapsecond triggered load spike issue John Stultz
2012-07-17 21:59 ` Patch "timekeeping: Fix leapsecond triggered load spike issue" has been added to the 3.4-stable tree gregkh
2012-07-17 6:39 ` [PATCH 3/7] 3.4.x: timekeeping: Maintain ktime_t based offsets for hrtimers John Stultz
2012-07-17 21:59 ` Patch "timekeeping: Maintain ktime_t based offsets for hrtimers" has been added to the 3.4-stable tree gregkh
2012-07-17 6:39 ` [PATCH 4/7] 3.4.x: hrtimers: Move lock held region in hrtimer_interrupt() John Stultz
2012-07-17 21:59 ` Patch "hrtimers: Move lock held region in hrtimer_interrupt()" has been added to the 3.4-stable tree gregkh
2012-07-17 6:39 ` [PATCH 5/7] 3.4.x: timekeeping: Provide hrtimer update function John Stultz
2012-07-17 21:59 ` Patch "timekeeping: Provide hrtimer update function" has been added to the 3.4-stable tree gregkh
2012-07-17 6:39 ` [PATCH 6/7] 3.4.x: hrtimer: Update hrtimer base offsets each hrtimer_interrupt John Stultz
2012-07-17 21:59 ` Patch "hrtimer: Update hrtimer base offsets each hrtimer_interrupt" has been added to the 3.4-stable tree gregkh
2012-07-17 6:39 ` [PATCH 7/7] 3.4.x: timekeeping: Add missing update call in timekeeping_resume() John Stultz
2012-07-17 21:59 ` Patch "timekeeping: Add missing update call in timekeeping_resume()" has been added to the 3.4-stable tree gregkh
2012-07-17 21:29 ` [PATCH 0/7] 3.4-stable: Fix for leapsecond caused hrtimer/futex issue Greg KH