* [PATCH v2 01/25] timekeeping: Read NTP tick length only once
2024-10-09 8:28 [PATCH v2 00/25] timekeeping: Rework to prepare support of indenpendent PTP clocks Anna-Maria Behnsen
@ 2024-10-09 8:28 ` Anna-Maria Behnsen
2024-10-09 17:18 ` John Stultz
2024-10-09 8:28 ` [PATCH v2 02/25] timekeeping: Don't stop time readers across hard_pps() update Anna-Maria Behnsen
` (23 subsequent siblings)
24 siblings, 1 reply; 53+ messages in thread
From: Anna-Maria Behnsen @ 2024-10-09 8:28 UTC (permalink / raw)
To: John Stultz, Frederic Weisbecker, Thomas Gleixner
Cc: linux-kernel, netdev, Miroslav Lichvar, Richard Cochran,
Christopher S Hall, Anna-Maria Behnsen
From: Thomas Gleixner <tglx@linutronix.de>
From: Thomas Gleixner <tglx@linutronix.de>
No point in reading it a second time when the comparison fails.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
---
kernel/time/timekeeping.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index 7e6f409bf311..6acff4cb7b1c 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -2130,16 +2130,17 @@ static __always_inline void timekeeping_apply_adjustment(struct timekeeper *tk,
*/
static void timekeeping_adjust(struct timekeeper *tk, s64 offset)
{
+ u64 ntp_tl = ntp_tick_length();
u32 mult;
/*
* Determine the multiplier from the current NTP tick length.
* Avoid expensive division when the tick length doesn't change.
*/
- if (likely(tk->ntp_tick == ntp_tick_length())) {
+ if (likely(tk->ntp_tick == ntp_tl)) {
mult = tk->tkr_mono.mult - tk->ntp_err_mult;
} else {
- tk->ntp_tick = ntp_tick_length();
+ tk->ntp_tick = ntp_tl;
mult = div64_u64((tk->ntp_tick >> tk->ntp_error_shift) -
tk->xtime_remainder, tk->cycle_interval);
}
--
2.39.5
^ permalink raw reply related [flat|nested] 53+ messages in thread* [PATCH v2 02/25] timekeeping: Don't stop time readers across hard_pps() update
2024-10-09 8:28 [PATCH v2 00/25] timekeeping: Rework to prepare support of indenpendent PTP clocks Anna-Maria Behnsen
2024-10-09 8:28 ` [PATCH v2 01/25] timekeeping: Read NTP tick length only once Anna-Maria Behnsen
@ 2024-10-09 8:28 ` Anna-Maria Behnsen
2024-10-09 8:28 ` [PATCH v2 03/25] timekeeping: Avoid duplicate leap state update Anna-Maria Behnsen
` (22 subsequent siblings)
24 siblings, 0 replies; 53+ messages in thread
From: Anna-Maria Behnsen @ 2024-10-09 8:28 UTC (permalink / raw)
To: John Stultz, Frederic Weisbecker, Thomas Gleixner
Cc: linux-kernel, netdev, Miroslav Lichvar, Richard Cochran,
Christopher S Hall, Anna-Maria Behnsen
From: Thomas Gleixner <tglx@linutronix.de>
From: Thomas Gleixner <tglx@linutronix.de>
hard_pps() update does not modify anything which might be required by time
readers so forcing readers out of the way during the update is a pointless
exercise.
The interaction with adjtimex() and timekeeper updates which call into the
NTP code is properly serialized by timekeeper_lock.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
---
kernel/time/timekeeping.c | 4 ----
1 file changed, 4 deletions(-)
diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index 6acff4cb7b1c..4e3afe22d28c 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -2627,11 +2627,7 @@ void hardpps(const struct timespec64 *phase_ts, const struct timespec64 *raw_ts)
unsigned long flags;
raw_spin_lock_irqsave(&timekeeper_lock, flags);
- write_seqcount_begin(&tk_core.seq);
-
__hardpps(phase_ts, raw_ts);
-
- write_seqcount_end(&tk_core.seq);
raw_spin_unlock_irqrestore(&timekeeper_lock, flags);
}
EXPORT_SYMBOL(hardpps);
--
2.39.5
^ permalink raw reply related [flat|nested] 53+ messages in thread* [PATCH v2 03/25] timekeeping: Avoid duplicate leap state update
2024-10-09 8:28 [PATCH v2 00/25] timekeeping: Rework to prepare support of indenpendent PTP clocks Anna-Maria Behnsen
2024-10-09 8:28 ` [PATCH v2 01/25] timekeeping: Read NTP tick length only once Anna-Maria Behnsen
2024-10-09 8:28 ` [PATCH v2 02/25] timekeeping: Don't stop time readers across hard_pps() update Anna-Maria Behnsen
@ 2024-10-09 8:28 ` Anna-Maria Behnsen
2024-10-09 17:31 ` John Stultz
2024-10-09 8:28 ` [PATCH v2 04/25] timekeeping: Abort clocksource change in case of failure Anna-Maria Behnsen
` (21 subsequent siblings)
24 siblings, 1 reply; 53+ messages in thread
From: Anna-Maria Behnsen @ 2024-10-09 8:28 UTC (permalink / raw)
To: John Stultz, Frederic Weisbecker, Thomas Gleixner
Cc: linux-kernel, netdev, Miroslav Lichvar, Richard Cochran,
Christopher S Hall, Anna-Maria Behnsen
From: Anna-Maria Behnsen <anna-maria@linutronix.de>
do_adjtimex() invokes tk_update_leap_state() unconditionally even when a
previous invocation of timekeeping_update() already did that update.
Put it into the else path which is invoked when timekeeping_update() is not
called.
Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
---
kernel/time/timekeeping.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index 4e3afe22d28c..c4dd460b6f2b 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -2596,8 +2596,9 @@ int do_adjtimex(struct __kernel_timex *txc)
__timekeeping_set_tai_offset(tk, tai);
timekeeping_update(tk, TK_MIRROR | TK_CLOCK_WAS_SET);
clock_set = true;
+ } else {
+ tk_update_leap_state(tk);
}
- tk_update_leap_state(tk);
write_seqcount_end(&tk_core.seq);
raw_spin_unlock_irqrestore(&timekeeper_lock, flags);
--
2.39.5
^ permalink raw reply related [flat|nested] 53+ messages in thread* Re: [PATCH v2 03/25] timekeeping: Avoid duplicate leap state update
2024-10-09 8:28 ` [PATCH v2 03/25] timekeeping: Avoid duplicate leap state update Anna-Maria Behnsen
@ 2024-10-09 17:31 ` John Stultz
0 siblings, 0 replies; 53+ messages in thread
From: John Stultz @ 2024-10-09 17:31 UTC (permalink / raw)
To: Anna-Maria Behnsen
Cc: Frederic Weisbecker, Thomas Gleixner, linux-kernel, netdev,
Miroslav Lichvar, Richard Cochran, Christopher S Hall
On Wed, Oct 9, 2024 at 1:29 AM Anna-Maria Behnsen
<anna-maria@linutronix.de> wrote:
>
> From: Anna-Maria Behnsen <anna-maria@linutronix.de>
>
> do_adjtimex() invokes tk_update_leap_state() unconditionally even when a
> previous invocation of timekeeping_update() already did that update.
>
> Put it into the else path which is invoked when timekeeping_update() is not
> called.
>
> Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
Could probably use a comment in the code to that effect, but otherwise:
Acked-by: John Stultz <jstultz@google.com>
^ permalink raw reply [flat|nested] 53+ messages in thread
* [PATCH v2 04/25] timekeeping: Abort clocksource change in case of failure
2024-10-09 8:28 [PATCH v2 00/25] timekeeping: Rework to prepare support of indenpendent PTP clocks Anna-Maria Behnsen
` (2 preceding siblings ...)
2024-10-09 8:28 ` [PATCH v2 03/25] timekeeping: Avoid duplicate leap state update Anna-Maria Behnsen
@ 2024-10-09 8:28 ` Anna-Maria Behnsen
2024-10-09 19:58 ` John Stultz
2024-10-09 8:28 ` [PATCH v2 05/25] timekeeping: Simplify code in timekeeping_advance() Anna-Maria Behnsen
` (20 subsequent siblings)
24 siblings, 1 reply; 53+ messages in thread
From: Anna-Maria Behnsen @ 2024-10-09 8:28 UTC (permalink / raw)
To: John Stultz, Frederic Weisbecker, Thomas Gleixner
Cc: linux-kernel, netdev, Miroslav Lichvar, Richard Cochran,
Christopher S Hall, Anna-Maria Behnsen
From: Thomas Gleixner <tglx@linutronix.de>
From: Thomas Gleixner <tglx@linutronix.de>
There is no point to go through a full timekeeping update when acquiring a
module reference or enabling the new clocksource fails.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
---
kernel/time/timekeeping.c | 31 +++++++++++++------------------
1 file changed, 13 insertions(+), 18 deletions(-)
diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index c4dd460b6f2b..0ae35a71f8cc 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -1577,33 +1577,29 @@ static void __timekeeping_set_tai_offset(struct timekeeper *tk, s32 tai_offset)
static int change_clocksource(void *data)
{
struct timekeeper *tk = &tk_core.timekeeper;
- struct clocksource *new, *old = NULL;
+ struct clocksource *new = data, *old = NULL;
unsigned long flags;
- bool change = false;
-
- new = (struct clocksource *) data;
/*
- * If the cs is in module, get a module reference. Succeeds
- * for built-in code (owner == NULL) as well.
+ * If the clocksource is in a module, get a module reference.
+ * Succeeds for built-in code (owner == NULL) as well. Abort if the
+ * reference can't be acquired.
*/
- if (try_module_get(new->owner)) {
- if (!new->enable || new->enable(new) == 0)
- change = true;
- else
- module_put(new->owner);
+ if (!try_module_get(new->owner))
+ return 0;
+
+ /* Abort if the device can't be enabled */
+ if (new->enable && new->enable(new) != 0) {
+ module_put(new->owner);
+ return 0;
}
raw_spin_lock_irqsave(&timekeeper_lock, flags);
write_seqcount_begin(&tk_core.seq);
timekeeping_forward_now(tk);
-
- if (change) {
- old = tk->tkr_mono.clock;
- tk_setup_internals(tk, new);
- }
-
+ old = tk->tkr_mono.clock;
+ tk_setup_internals(tk, new);
timekeeping_update(tk, TK_CLEAR_NTP | TK_MIRROR | TK_CLOCK_WAS_SET);
write_seqcount_end(&tk_core.seq);
@@ -1612,7 +1608,6 @@ static int change_clocksource(void *data)
if (old) {
if (old->disable)
old->disable(old);
-
module_put(old->owner);
}
--
2.39.5
^ permalink raw reply related [flat|nested] 53+ messages in thread* Re: [PATCH v2 04/25] timekeeping: Abort clocksource change in case of failure
2024-10-09 8:28 ` [PATCH v2 04/25] timekeeping: Abort clocksource change in case of failure Anna-Maria Behnsen
@ 2024-10-09 19:58 ` John Stultz
0 siblings, 0 replies; 53+ messages in thread
From: John Stultz @ 2024-10-09 19:58 UTC (permalink / raw)
To: Anna-Maria Behnsen
Cc: Frederic Weisbecker, Thomas Gleixner, linux-kernel, netdev,
Miroslav Lichvar, Richard Cochran, Christopher S Hall
On Wed, Oct 9, 2024 at 1:29 AM Anna-Maria Behnsen
<anna-maria@linutronix.de> wrote:
>
> From: Thomas Gleixner <tglx@linutronix.de>
>
> From: Thomas Gleixner <tglx@linutronix.de>
>
> There is no point to go through a full timekeeping update when acquiring a
> module reference or enabling the new clocksource fails.
>
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
Acked-by: John Stultz <jstultz@google.com>
^ permalink raw reply [flat|nested] 53+ messages in thread
* [PATCH v2 05/25] timekeeping: Simplify code in timekeeping_advance()
2024-10-09 8:28 [PATCH v2 00/25] timekeeping: Rework to prepare support of indenpendent PTP clocks Anna-Maria Behnsen
` (3 preceding siblings ...)
2024-10-09 8:28 ` [PATCH v2 04/25] timekeeping: Abort clocksource change in case of failure Anna-Maria Behnsen
@ 2024-10-09 8:28 ` Anna-Maria Behnsen
2024-10-09 21:00 ` John Stultz
2024-10-09 8:28 ` [PATCH v2 06/25] timekeeping: Reorder struct timekeeper Anna-Maria Behnsen
` (19 subsequent siblings)
24 siblings, 1 reply; 53+ messages in thread
From: Anna-Maria Behnsen @ 2024-10-09 8:28 UTC (permalink / raw)
To: John Stultz, Frederic Weisbecker, Thomas Gleixner
Cc: linux-kernel, netdev, Miroslav Lichvar, Richard Cochran,
Christopher S Hall, Anna-Maria Behnsen
From: Thomas Gleixner <tglx@linutronix.de>
From: Thomas Gleixner <tglx@linutronix.de>
timekeeping_advance() takes the timekeeper_lock and releases it before
returning. When an early return is required, goto statements are used to
make sure the lock is realeased properly. When the code was written the
locking guard() was not yet available.
Use the guard() to simplify the code and while at it cleanup ordering of
function variables. No functional change.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
---
kernel/time/timekeeping.c | 16 ++++++----------
1 file changed, 6 insertions(+), 10 deletions(-)
diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index 0ae35a71f8cc..e747e46a1a2d 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -2276,23 +2276,22 @@ static bool timekeeping_advance(enum timekeeping_adv_mode mode)
{
struct timekeeper *real_tk = &tk_core.timekeeper;
struct timekeeper *tk = &shadow_timekeeper;
- u64 offset;
- int shift = 0, maxshift;
unsigned int clock_set = 0;
- unsigned long flags;
+ int shift = 0, maxshift;
+ u64 offset;
- raw_spin_lock_irqsave(&timekeeper_lock, flags);
+ guard(raw_spinlock_irqsave)(&timekeeper_lock);
/* Make sure we're fully resumed: */
if (unlikely(timekeeping_suspended))
- goto out;
+ return false;
offset = clocksource_delta(tk_clock_read(&tk->tkr_mono),
tk->tkr_mono.cycle_last, tk->tkr_mono.mask);
/* Check if there's really nothing to do */
if (offset < real_tk->cycle_interval && mode == TK_ADV_TICK)
- goto out;
+ return false;
/* Do some additional sanity checking */
timekeeping_check_update(tk, offset);
@@ -2311,8 +2310,7 @@ static bool timekeeping_advance(enum timekeeping_adv_mode mode)
maxshift = (64 - (ilog2(ntp_tick_length())+1)) - 1;
shift = min(shift, maxshift);
while (offset >= tk->cycle_interval) {
- offset = logarithmic_accumulation(tk, offset, shift,
- &clock_set);
+ offset = logarithmic_accumulation(tk, offset, shift, &clock_set);
if (offset < tk->cycle_interval<<shift)
shift--;
}
@@ -2341,8 +2339,6 @@ static bool timekeeping_advance(enum timekeeping_adv_mode mode)
memcpy(real_tk, tk, sizeof(*tk));
/* The memcpy must come last. Do not put anything here! */
write_seqcount_end(&tk_core.seq);
-out:
- raw_spin_unlock_irqrestore(&timekeeper_lock, flags);
return !!clock_set;
}
--
2.39.5
^ permalink raw reply related [flat|nested] 53+ messages in thread* Re: [PATCH v2 05/25] timekeeping: Simplify code in timekeeping_advance()
2024-10-09 8:28 ` [PATCH v2 05/25] timekeeping: Simplify code in timekeeping_advance() Anna-Maria Behnsen
@ 2024-10-09 21:00 ` John Stultz
0 siblings, 0 replies; 53+ messages in thread
From: John Stultz @ 2024-10-09 21:00 UTC (permalink / raw)
To: Anna-Maria Behnsen
Cc: Frederic Weisbecker, Thomas Gleixner, linux-kernel, netdev,
Miroslav Lichvar, Richard Cochran, Christopher S Hall
On Wed, Oct 9, 2024 at 1:29 AM Anna-Maria Behnsen
<anna-maria@linutronix.de> wrote:
>
> From: Thomas Gleixner <tglx@linutronix.de>
>
> From: Thomas Gleixner <tglx@linutronix.de>
>
> timekeeping_advance() takes the timekeeper_lock and releases it before
> returning. When an early return is required, goto statements are used to
> make sure the lock is realeased properly. When the code was written the
> locking guard() was not yet available.
>
> Use the guard() to simplify the code and while at it cleanup ordering of
> function variables. No functional change.
>
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
Acked-by: John Stultz <jstultz@google.com>
^ permalink raw reply [flat|nested] 53+ messages in thread
* [PATCH v2 06/25] timekeeping: Reorder struct timekeeper
2024-10-09 8:28 [PATCH v2 00/25] timekeeping: Rework to prepare support of indenpendent PTP clocks Anna-Maria Behnsen
` (4 preceding siblings ...)
2024-10-09 8:28 ` [PATCH v2 05/25] timekeeping: Simplify code in timekeeping_advance() Anna-Maria Behnsen
@ 2024-10-09 8:28 ` Anna-Maria Behnsen
2024-10-11 3:22 ` John Stultz
2024-10-15 10:08 ` [PATCH v2a] " Anna-Maria Behnsen
2024-10-09 8:29 ` [PATCH v2 07/25] timekeeping: Move shadow_timekeeper into tk_core Anna-Maria Behnsen
` (18 subsequent siblings)
24 siblings, 2 replies; 53+ messages in thread
From: Anna-Maria Behnsen @ 2024-10-09 8:28 UTC (permalink / raw)
To: John Stultz, Frederic Weisbecker, Thomas Gleixner
Cc: linux-kernel, netdev, Miroslav Lichvar, Richard Cochran,
Christopher S Hall, Anna-Maria Behnsen
From: Thomas Gleixner <tglx@linutronix.de>
From: Thomas Gleixner <tglx@linutronix.de>
struct timekeeper is ordered suboptimal vs. cachelines. The layout,
including the preceding seqcount (see struct tk_core in timekeeper.c) is:
cacheline 0: seqcount, tkr_mono
cacheline 1: tkr_raw, xtime_sec
cacheline 2: ktime_sec ... tai_offset, internal variables
cacheline 3: next_leap_ktime, raw_sec, internal variables
cacheline 4: internal variables
So any access to via ktime_get*() except for access to CLOCK_MONOTONIC_RAW
will use either cachelines 0 + 1 or cachelines 0 + 2. Access to
CLOCK_MONOTONIC_RAW uses cachelines 0 + 1 + 3.
Reorder the members so that the result is more efficient:
cacheline 0: seqcount, tkr_mono
cacheline 1: xtime_sec, ktime_sec ... tai_offset
cacheline 2: tkr_raw, raw_sec
cacheline 3: internal variables
cacheline 4: internal variables
That means ktime_get*() will access cacheline 0 + 1 and CLOCK_MONOTONIC_RAW
access will use cachelines 0 + 2.
Update kernel-doc and fix formatting issues while at it. Also fix a typo
in struct tk_read_base kernel-doc.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
---
include/linux/timekeeper_internal.h | 102 +++++++++++++++++++++---------------
1 file changed, 61 insertions(+), 41 deletions(-)
diff --git a/include/linux/timekeeper_internal.h b/include/linux/timekeeper_internal.h
index 902c20ef495a..430e40549136 100644
--- a/include/linux/timekeeper_internal.h
+++ b/include/linux/timekeeper_internal.h
@@ -26,7 +26,7 @@
* occupies a single 64byte cache line.
*
* The struct is separate from struct timekeeper as it is also used
- * for a fast NMI safe accessors.
+ * for the fast NMI safe accessors.
*
* @base_real is for the fast NMI safe accessor to allow reading clock
* realtime from any context.
@@ -44,33 +44,41 @@ struct tk_read_base {
/**
* struct timekeeper - Structure holding internal timekeeping values.
- * @tkr_mono: The readout base structure for CLOCK_MONOTONIC
- * @tkr_raw: The readout base structure for CLOCK_MONOTONIC_RAW
- * @xtime_sec: Current CLOCK_REALTIME time in seconds
- * @ktime_sec: Current CLOCK_MONOTONIC time in seconds
- * @wall_to_monotonic: CLOCK_REALTIME to CLOCK_MONOTONIC offset
- * @offs_real: Offset clock monotonic -> clock realtime
- * @offs_boot: Offset clock monotonic -> clock boottime
- * @offs_tai: Offset clock monotonic -> clock tai
- * @tai_offset: The current UTC to TAI offset in seconds
- * @clock_was_set_seq: The sequence number of clock was set events
- * @cs_was_changed_seq: The sequence number of clocksource change events
- * @next_leap_ktime: CLOCK_MONOTONIC time value of a pending leap-second
- * @raw_sec: CLOCK_MONOTONIC_RAW time in seconds
- * @monotonic_to_boot: CLOCK_MONOTONIC to CLOCK_BOOTTIME offset
- * @cycle_interval: Number of clock cycles in one NTP interval
- * @xtime_interval: Number of clock shifted nano seconds in one NTP
- * interval.
- * @xtime_remainder: Shifted nano seconds left over when rounding
- * @cycle_interval
- * @raw_interval: Shifted raw nano seconds accumulated per NTP interval.
- * @ntp_error: Difference between accumulated time and NTP time in ntp
- * shifted nano seconds.
- * @ntp_error_shift: Shift conversion between clock shifted nano seconds and
- * ntp shifted nano seconds.
- * @last_warning: Warning ratelimiter (DEBUG_TIMEKEEPING)
- * @underflow_seen: Underflow warning flag (DEBUG_TIMEKEEPING)
- * @overflow_seen: Overflow warning flag (DEBUG_TIMEKEEPING)
+ * @tkr_mono: The readout base structure for CLOCK_MONOTONIC
+ * @xtime_sec: Current CLOCK_REALTIME time in seconds
+ * @ktime_sec: Current CLOCK_MONOTONIC time in seconds
+ * @wall_to_monotonic: CLOCK_REALTIME to CLOCK_MONOTONIC offset
+ * @offs_real: Offset clock monotonic -> clock realtime
+ * @offs_boot: Offset clock monotonic -> clock boottime
+ * @offs_tai: Offset clock monotonic -> clock tai
+ * @tai_offset: The current UTC to TAI offset in seconds
+ * @tkr_raw: The readout base structure for CLOCK_MONOTONIC_RAW
+ * @raw_sec: CLOCK_MONOTONIC_RAW time in seconds
+ * @clock_was_set_seq: The sequence number of clock was set events
+ * @cs_was_changed_seq: The sequence number of clocksource change events
+ * @monotonic_to_boot: CLOCK_MONOTONIC to CLOCK_BOOTTIME offset
+ * @cycle_interval: Number of clock cycles in one NTP interval
+ * @xtime_interval: Number of clock shifted nano seconds in one NTP
+ * interval.
+ * @xtime_remainder: Shifted nano seconds left over when rounding
+ * @cycle_interval
+ * @raw_interval: Shifted raw nano seconds accumulated per NTP interval.
+ * @next_leap_ktime: CLOCK_MONOTONIC time value of a pending leap-second
+ * @ntp_tick: The ntp_tick_length() value currently being
+ * used. This cached copy ensures we consistently
+ * apply the tick length for an entire tick, as
+ * ntp_tick_length may change mid-tick, and we don't
+ * want to apply that new value to the tick in
+ * progress.
+ * @ntp_error: Difference between accumulated time and NTP time in ntp
+ * shifted nano seconds.
+ * @ntp_error_shift: Shift conversion between clock shifted nano seconds and
+ * ntp shifted nano seconds.
+ * @ntp_err_mult: Multiplication factor for scaled math conversion
+ * @skip_second_overflow: Flag used to avoid updating NTP twice with same second
+ * @last_warning: Warning ratelimiter (DEBUG_TIMEKEEPING)
+ * @underflow_seen: Underflow warning flag (DEBUG_TIMEKEEPING)
+ * @overflow_seen: Overflow warning flag (DEBUG_TIMEKEEPING)
*
* Note: For timespec(64) based interfaces wall_to_monotonic is what
* we need to add to xtime (or xtime corrected for sub jiffy times)
@@ -88,10 +96,25 @@ struct tk_read_base {
*
* @monotonic_to_boottime is a timespec64 representation of @offs_boot to
* accelerate the VDSO update for CLOCK_BOOTTIME.
+ *
+ * The cacheline ordering of the structure is optimized for in kernel usage
+ * of the ktime_get() and ktime_get_ts64() family of time accessors. Struct
+ * timekeeper is prepended in the core timekeeeping code with a sequence
+ * count, which results in the following cacheline layout:
+ *
+ * 0: seqcount, tkr_mono
+ * 1: xtime_sec ... tai_offset
+ * 2: tkr_raw, raw_sec
+ * 3,4: Internal variables
+ *
+ * Cacheline 0,1 contain the data which is used for accessing
+ * CLOCK_MONOTONIC/REALTIME/BOOTTIME/TAI, while cacheline 2 contains the
+ * data for accessing CLOCK_MONOTONIC_RAW. Cacheline 3,4 are internal
+ * variables which are only accessed during timekeeper updates once per
+ * tick.
*/
struct timekeeper {
struct tk_read_base tkr_mono;
- struct tk_read_base tkr_raw;
u64 xtime_sec;
unsigned long ktime_sec;
struct timespec64 wall_to_monotonic;
@@ -99,31 +122,28 @@ struct timekeeper {
ktime_t offs_boot;
ktime_t offs_tai;
s32 tai_offset;
+
+ struct tk_read_base tkr_raw;
+ u64 raw_sec;
+
+ /* The following members are for timekeeping internal use */
unsigned int clock_was_set_seq;
u8 cs_was_changed_seq;
- ktime_t next_leap_ktime;
- u64 raw_sec;
+
struct timespec64 monotonic_to_boot;
- /* The following members are for timekeeping internal use */
u64 cycle_interval;
u64 xtime_interval;
s64 xtime_remainder;
u64 raw_interval;
- /* The ntp_tick_length() value currently being used.
- * This cached copy ensures we consistently apply the tick
- * length for an entire tick, as ntp_tick_length may change
- * mid-tick, and we don't want to apply that new value to
- * the tick in progress.
- */
+
+ ktime_t next_leap_ktime;
u64 ntp_tick;
- /* Difference between accumulated time and NTP time in ntp
- * shifted nano seconds. */
s64 ntp_error;
u32 ntp_error_shift;
u32 ntp_err_mult;
- /* Flag used to avoid updating NTP twice with same second */
u32 skip_second_overflow;
+
#ifdef CONFIG_DEBUG_TIMEKEEPING
long last_warning;
/*
--
2.39.5
^ permalink raw reply related [flat|nested] 53+ messages in thread* Re: [PATCH v2 06/25] timekeeping: Reorder struct timekeeper
2024-10-09 8:28 ` [PATCH v2 06/25] timekeeping: Reorder struct timekeeper Anna-Maria Behnsen
@ 2024-10-11 3:22 ` John Stultz
2024-10-14 9:30 ` Anna-Maria Behnsen
2024-10-15 10:08 ` [PATCH v2a] " Anna-Maria Behnsen
1 sibling, 1 reply; 53+ messages in thread
From: John Stultz @ 2024-10-11 3:22 UTC (permalink / raw)
To: Anna-Maria Behnsen
Cc: Frederic Weisbecker, Thomas Gleixner, linux-kernel, netdev,
Miroslav Lichvar, Richard Cochran, Christopher S Hall
On Wed, Oct 9, 2024 at 1:29 AM Anna-Maria Behnsen
<anna-maria@linutronix.de> wrote:
>
> From: Thomas Gleixner <tglx@linutronix.de>
>
> From: Thomas Gleixner <tglx@linutronix.de>
>
> struct timekeeper is ordered suboptimal vs. cachelines. The layout,
> including the preceding seqcount (see struct tk_core in timekeeper.c) is:
>
> cacheline 0: seqcount, tkr_mono
> cacheline 1: tkr_raw, xtime_sec
> cacheline 2: ktime_sec ... tai_offset, internal variables
> cacheline 3: next_leap_ktime, raw_sec, internal variables
> cacheline 4: internal variables
>
> So any access to via ktime_get*() except for access to CLOCK_MONOTONIC_RAW
> will use either cachelines 0 + 1 or cachelines 0 + 2. Access to
> CLOCK_MONOTONIC_RAW uses cachelines 0 + 1 + 3.
>
> Reorder the members so that the result is more efficient:
>
> cacheline 0: seqcount, tkr_mono
> cacheline 1: xtime_sec, ktime_sec ... tai_offset
> cacheline 2: tkr_raw, raw_sec
> cacheline 3: internal variables
> cacheline 4: internal variables
>
> That means ktime_get*() will access cacheline 0 + 1 and CLOCK_MONOTONIC_RAW
> access will use cachelines 0 + 2.
>
> Update kernel-doc and fix formatting issues while at it. Also fix a typo
> in struct tk_read_base kernel-doc.
>
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
Acked-by: John Stultz <jstultz@google.com>
> ---
> include/linux/timekeeper_internal.h | 102 +++++++++++++++++++++---------------
> 1 file changed, 61 insertions(+), 41 deletions(-)
>
> diff --git a/include/linux/timekeeper_internal.h b/include/linux/timekeeper_internal.h
> index 902c20ef495a..430e40549136 100644
> --- a/include/linux/timekeeper_internal.h
> +++ b/include/linux/timekeeper_internal.h
> @@ -26,7 +26,7 @@
> * occupies a single 64byte cache line.
> *
> * The struct is separate from struct timekeeper as it is also used
> - * for a fast NMI safe accessors.
> + * for the fast NMI safe accessors.
> *
> * @base_real is for the fast NMI safe accessor to allow reading clock
> * realtime from any context.
> @@ -44,33 +44,41 @@ struct tk_read_base {
>
> /**
> * struct timekeeper - Structure holding internal timekeeping values.
> - * @tkr_mono: The readout base structure for CLOCK_MONOTONIC
> - * @tkr_raw: The readout base structure for CLOCK_MONOTONIC_RAW
> - * @xtime_sec: Current CLOCK_REALTIME time in seconds
> - * @ktime_sec: Current CLOCK_MONOTONIC time in seconds
> - * @wall_to_monotonic: CLOCK_REALTIME to CLOCK_MONOTONIC offset
> - * @offs_real: Offset clock monotonic -> clock realtime
> - * @offs_boot: Offset clock monotonic -> clock boottime
> - * @offs_tai: Offset clock monotonic -> clock tai
> - * @tai_offset: The current UTC to TAI offset in seconds
> - * @clock_was_set_seq: The sequence number of clock was set events
> - * @cs_was_changed_seq: The sequence number of clocksource change events
> - * @next_leap_ktime: CLOCK_MONOTONIC time value of a pending leap-second
> - * @raw_sec: CLOCK_MONOTONIC_RAW time in seconds
> - * @monotonic_to_boot: CLOCK_MONOTONIC to CLOCK_BOOTTIME offset
> - * @cycle_interval: Number of clock cycles in one NTP interval
> - * @xtime_interval: Number of clock shifted nano seconds in one NTP
> - * interval.
> - * @xtime_remainder: Shifted nano seconds left over when rounding
> - * @cycle_interval
> - * @raw_interval: Shifted raw nano seconds accumulated per NTP interval.
> - * @ntp_error: Difference between accumulated time and NTP time in ntp
> - * shifted nano seconds.
> - * @ntp_error_shift: Shift conversion between clock shifted nano seconds and
> - * ntp shifted nano seconds.
> - * @last_warning: Warning ratelimiter (DEBUG_TIMEKEEPING)
> - * @underflow_seen: Underflow warning flag (DEBUG_TIMEKEEPING)
> - * @overflow_seen: Overflow warning flag (DEBUG_TIMEKEEPING)
> + * @tkr_mono: The readout base structure for CLOCK_MONOTONIC
> + * @xtime_sec: Current CLOCK_REALTIME time in seconds
> + * @ktime_sec: Current CLOCK_MONOTONIC time in seconds
> + * @wall_to_monotonic: CLOCK_REALTIME to CLOCK_MONOTONIC offset
> + * @offs_real: Offset clock monotonic -> clock realtime
> + * @offs_boot: Offset clock monotonic -> clock boottime
> + * @offs_tai: Offset clock monotonic -> clock tai
> + * @tai_offset: The current UTC to TAI offset in seconds
> + * @tkr_raw: The readout base structure for CLOCK_MONOTONIC_RAW
> + * @raw_sec: CLOCK_MONOTONIC_RAW time in seconds
> + * @clock_was_set_seq: The sequence number of clock was set events
> + * @cs_was_changed_seq: The sequence number of clocksource change events
> + * @monotonic_to_boot: CLOCK_MONOTONIC to CLOCK_BOOTTIME offset
> + * @cycle_interval: Number of clock cycles in one NTP interval
> + * @xtime_interval: Number of clock shifted nano seconds in one NTP
> + * interval.
> + * @xtime_remainder: Shifted nano seconds left over when rounding
> + * @cycle_interval
> + * @raw_interval: Shifted raw nano seconds accumulated per NTP interval.
> + * @next_leap_ktime: CLOCK_MONOTONIC time value of a pending leap-second
> + * @ntp_tick: The ntp_tick_length() value currently being
> + * used. This cached copy ensures we consistently
> + * apply the tick length for an entire tick, as
> + * ntp_tick_length may change mid-tick, and we don't
> + * want to apply that new value to the tick in
> + * progress.
> + * @ntp_error: Difference between accumulated time and NTP time in ntp
> + * shifted nano seconds.
> + * @ntp_error_shift: Shift conversion between clock shifted nano seconds and
> + * ntp shifted nano seconds.
> + * @ntp_err_mult: Multiplication factor for scaled math conversion
> + * @skip_second_overflow: Flag used to avoid updating NTP twice with same second
> + * @last_warning: Warning ratelimiter (DEBUG_TIMEKEEPING)
> + * @underflow_seen: Underflow warning flag (DEBUG_TIMEKEEPING)
> + * @overflow_seen: Overflow warning flag (DEBUG_TIMEKEEPING)
> *
> * Note: For timespec(64) based interfaces wall_to_monotonic is what
> * we need to add to xtime (or xtime corrected for sub jiffy times)
> @@ -88,10 +96,25 @@ struct tk_read_base {
> *
> * @monotonic_to_boottime is a timespec64 representation of @offs_boot to
> * accelerate the VDSO update for CLOCK_BOOTTIME.
> + *
> + * The cacheline ordering of the structure is optimized for in kernel usage
> + * of the ktime_get() and ktime_get_ts64() family of time accessors. Struct
> + * timekeeper is prepended in the core timekeeeping code with a sequence
> + * count, which results in the following cacheline layout:
> + *
> + * 0: seqcount, tkr_mono
> + * 1: xtime_sec ... tai_offset
> + * 2: tkr_raw, raw_sec
> + * 3,4: Internal variables
> + *
> + * Cacheline 0,1 contain the data which is used for accessing
> + * CLOCK_MONOTONIC/REALTIME/BOOTTIME/TAI, while cacheline 2 contains the
> + * data for accessing CLOCK_MONOTONIC_RAW. Cacheline 3,4 are internal
> + * variables which are only accessed during timekeeper updates once per
> + * tick.
Would it make sense to add divider comments or something in the struct
to make this more visible? I fret in the context of a patch, a + line
adding a new structure element that breaks the ordered alignment might
not be obvious.
thanks
-john
^ permalink raw reply [flat|nested] 53+ messages in thread* Re: [PATCH v2 06/25] timekeeping: Reorder struct timekeeper
2024-10-11 3:22 ` John Stultz
@ 2024-10-14 9:30 ` Anna-Maria Behnsen
0 siblings, 0 replies; 53+ messages in thread
From: Anna-Maria Behnsen @ 2024-10-14 9:30 UTC (permalink / raw)
To: John Stultz
Cc: Frederic Weisbecker, Thomas Gleixner, linux-kernel, netdev,
Miroslav Lichvar, Richard Cochran, Christopher S Hall
John Stultz <jstultz@google.com> writes:
> On Wed, Oct 9, 2024 at 1:29 AM Anna-Maria Behnsen
> <anna-maria@linutronix.de> wrote:
>>
>> From: Thomas Gleixner <tglx@linutronix.de>
>>
>> From: Thomas Gleixner <tglx@linutronix.de>
>>
>> struct timekeeper is ordered suboptimal vs. cachelines. The layout,
>> including the preceding seqcount (see struct tk_core in timekeeper.c) is:
>>
>> cacheline 0: seqcount, tkr_mono
>> cacheline 1: tkr_raw, xtime_sec
>> cacheline 2: ktime_sec ... tai_offset, internal variables
>> cacheline 3: next_leap_ktime, raw_sec, internal variables
>> cacheline 4: internal variables
>>
>> So any access to via ktime_get*() except for access to CLOCK_MONOTONIC_RAW
>> will use either cachelines 0 + 1 or cachelines 0 + 2. Access to
>> CLOCK_MONOTONIC_RAW uses cachelines 0 + 1 + 3.
>>
>> Reorder the members so that the result is more efficient:
>>
>> cacheline 0: seqcount, tkr_mono
>> cacheline 1: xtime_sec, ktime_sec ... tai_offset
>> cacheline 2: tkr_raw, raw_sec
>> cacheline 3: internal variables
>> cacheline 4: internal variables
>>
>> That means ktime_get*() will access cacheline 0 + 1 and CLOCK_MONOTONIC_RAW
>> access will use cachelines 0 + 2.
>>
>> Update kernel-doc and fix formatting issues while at it. Also fix a typo
>> in struct tk_read_base kernel-doc.
>>
>> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
>> Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
>
> Acked-by: John Stultz <jstultz@google.com>
>
>> ---
>> include/linux/timekeeper_internal.h | 102 +++++++++++++++++++++---------------
>> 1 file changed, 61 insertions(+), 41 deletions(-)
>>
>> diff --git a/include/linux/timekeeper_internal.h b/include/linux/timekeeper_internal.h
>> index 902c20ef495a..430e40549136 100644
>> --- a/include/linux/timekeeper_internal.h
>> +++ b/include/linux/timekeeper_internal.h
>> @@ -26,7 +26,7 @@
>> * occupies a single 64byte cache line.
>> *
>> * The struct is separate from struct timekeeper as it is also used
>> - * for a fast NMI safe accessors.
>> + * for the fast NMI safe accessors.
>> *
>> * @base_real is for the fast NMI safe accessor to allow reading clock
>> * realtime from any context.
>> @@ -44,33 +44,41 @@ struct tk_read_base {
>>
>> /**
>> * struct timekeeper - Structure holding internal timekeeping values.
>> - * @tkr_mono: The readout base structure for CLOCK_MONOTONIC
>> - * @tkr_raw: The readout base structure for CLOCK_MONOTONIC_RAW
>> - * @xtime_sec: Current CLOCK_REALTIME time in seconds
>> - * @ktime_sec: Current CLOCK_MONOTONIC time in seconds
>> - * @wall_to_monotonic: CLOCK_REALTIME to CLOCK_MONOTONIC offset
>> - * @offs_real: Offset clock monotonic -> clock realtime
>> - * @offs_boot: Offset clock monotonic -> clock boottime
>> - * @offs_tai: Offset clock monotonic -> clock tai
>> - * @tai_offset: The current UTC to TAI offset in seconds
>> - * @clock_was_set_seq: The sequence number of clock was set events
>> - * @cs_was_changed_seq: The sequence number of clocksource change events
>> - * @next_leap_ktime: CLOCK_MONOTONIC time value of a pending leap-second
>> - * @raw_sec: CLOCK_MONOTONIC_RAW time in seconds
>> - * @monotonic_to_boot: CLOCK_MONOTONIC to CLOCK_BOOTTIME offset
>> - * @cycle_interval: Number of clock cycles in one NTP interval
>> - * @xtime_interval: Number of clock shifted nano seconds in one NTP
>> - * interval.
>> - * @xtime_remainder: Shifted nano seconds left over when rounding
>> - * @cycle_interval
>> - * @raw_interval: Shifted raw nano seconds accumulated per NTP interval.
>> - * @ntp_error: Difference between accumulated time and NTP time in ntp
>> - * shifted nano seconds.
>> - * @ntp_error_shift: Shift conversion between clock shifted nano seconds and
>> - * ntp shifted nano seconds.
>> - * @last_warning: Warning ratelimiter (DEBUG_TIMEKEEPING)
>> - * @underflow_seen: Underflow warning flag (DEBUG_TIMEKEEPING)
>> - * @overflow_seen: Overflow warning flag (DEBUG_TIMEKEEPING)
>> + * @tkr_mono: The readout base structure for CLOCK_MONOTONIC
>> + * @xtime_sec: Current CLOCK_REALTIME time in seconds
>> + * @ktime_sec: Current CLOCK_MONOTONIC time in seconds
>> + * @wall_to_monotonic: CLOCK_REALTIME to CLOCK_MONOTONIC offset
>> + * @offs_real: Offset clock monotonic -> clock realtime
>> + * @offs_boot: Offset clock monotonic -> clock boottime
>> + * @offs_tai: Offset clock monotonic -> clock tai
>> + * @tai_offset: The current UTC to TAI offset in seconds
>> + * @tkr_raw: The readout base structure for CLOCK_MONOTONIC_RAW
>> + * @raw_sec: CLOCK_MONOTONIC_RAW time in seconds
>> + * @clock_was_set_seq: The sequence number of clock was set events
>> + * @cs_was_changed_seq: The sequence number of clocksource change events
>> + * @monotonic_to_boot: CLOCK_MONOTONIC to CLOCK_BOOTTIME offset
>> + * @cycle_interval: Number of clock cycles in one NTP interval
>> + * @xtime_interval: Number of clock shifted nano seconds in one NTP
>> + * interval.
>> + * @xtime_remainder: Shifted nano seconds left over when rounding
>> + * @cycle_interval
>> + * @raw_interval: Shifted raw nano seconds accumulated per NTP interval.
>> + * @next_leap_ktime: CLOCK_MONOTONIC time value of a pending leap-second
>> + * @ntp_tick: The ntp_tick_length() value currently being
>> + * used. This cached copy ensures we consistently
>> + * apply the tick length for an entire tick, as
>> + * ntp_tick_length may change mid-tick, and we don't
>> + * want to apply that new value to the tick in
>> + * progress.
>> + * @ntp_error: Difference between accumulated time and NTP time in ntp
>> + * shifted nano seconds.
>> + * @ntp_error_shift: Shift conversion between clock shifted nano seconds and
>> + * ntp shifted nano seconds.
>> + * @ntp_err_mult: Multiplication factor for scaled math conversion
>> + * @skip_second_overflow: Flag used to avoid updating NTP twice with same second
>> + * @last_warning: Warning ratelimiter (DEBUG_TIMEKEEPING)
>> + * @underflow_seen: Underflow warning flag (DEBUG_TIMEKEEPING)
>> + * @overflow_seen: Overflow warning flag (DEBUG_TIMEKEEPING)
>> *
>> * Note: For timespec(64) based interfaces wall_to_monotonic is what
>> * we need to add to xtime (or xtime corrected for sub jiffy times)
>> @@ -88,10 +96,25 @@ struct tk_read_base {
>> *
>> * @monotonic_to_boottime is a timespec64 representation of @offs_boot to
>> * accelerate the VDSO update for CLOCK_BOOTTIME.
>> + *
>> + * The cacheline ordering of the structure is optimized for in kernel usage
>> + * of the ktime_get() and ktime_get_ts64() family of time accessors. Struct
>> + * timekeeper is prepended in the core timekeeeping code with a sequence
>> + * count, which results in the following cacheline layout:
>> + *
>> + * 0: seqcount, tkr_mono
>> + * 1: xtime_sec ... tai_offset
>> + * 2: tkr_raw, raw_sec
>> + * 3,4: Internal variables
>> + *
>> + * Cacheline 0,1 contain the data which is used for accessing
>> + * CLOCK_MONOTONIC/REALTIME/BOOTTIME/TAI, while cacheline 2 contains the
>> + * data for accessing CLOCK_MONOTONIC_RAW. Cacheline 3,4 are internal
>> + * variables which are only accessed during timekeeper updates once per
>> + * tick.
>
> Would it make sense to add divider comments or something in the struct
> to make this more visible? I fret in the context of a patch, a + line
> adding a new structure element that breaks the ordered alignment might
> not be obvious.
This is an argument! I'll add simple comments with /* Cachline X: */
Thanks,
Anna-Maria
^ permalink raw reply [flat|nested] 53+ messages in thread
* [PATCH v2a] timekeeping: Reorder struct timekeeper
2024-10-09 8:28 ` [PATCH v2 06/25] timekeeping: Reorder struct timekeeper Anna-Maria Behnsen
2024-10-11 3:22 ` John Stultz
@ 2024-10-15 10:08 ` Anna-Maria Behnsen
1 sibling, 0 replies; 53+ messages in thread
From: Anna-Maria Behnsen @ 2024-10-15 10:08 UTC (permalink / raw)
To: John Stultz, Frederic Weisbecker, Thomas Gleixner
Cc: linux-kernel, netdev, Miroslav Lichvar, Richard Cochran,
Christopher S Hall, Anna-Maria Behnsen
From: Thomas Gleixner <tglx@linutronix.de>
struct timekeeper is ordered suboptimal vs. cachelines. The layout,
including the preceding seqcount (see struct tk_core in timekeeper.c) is:
cacheline 0: seqcount, tkr_mono
cacheline 1: tkr_raw, xtime_sec
cacheline 2: ktime_sec ... tai_offset, internal variables
cacheline 3: next_leap_ktime, raw_sec, internal variables
cacheline 4: internal variables
So any access to via ktime_get*() except for access to CLOCK_MONOTONIC_RAW
will use either cachelines 0 + 1 or cachelines 0 + 2. Access to
CLOCK_MONOTONIC_RAW uses cachelines 0 + 1 + 3.
Reorder the members so that the result is more efficient:
cacheline 0: seqcount, tkr_mono
cacheline 1: xtime_sec, ktime_sec ... tai_offset
cacheline 2: tkr_raw, raw_sec
cacheline 3: internal variables
cacheline 4: internal variables
That means ktime_get*() will access cacheline 0 + 1 and CLOCK_MONOTONIC_RAW
access will use cachelines 0 + 2.
Update kernel-doc and fix formatting issues while at it. Also fix a typo
in struct tk_read_base kernel-doc.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
Acked-by: John Stultz <jstultz@google.com>
---
include/linux/timekeeper_internal.h | 106 +++++++++++++++++-----------
1 file changed, 65 insertions(+), 41 deletions(-)
diff --git a/include/linux/timekeeper_internal.h b/include/linux/timekeeper_internal.h
index 902c20ef495a..a3b6380a7777 100644
--- a/include/linux/timekeeper_internal.h
+++ b/include/linux/timekeeper_internal.h
@@ -26,7 +26,7 @@
* occupies a single 64byte cache line.
*
* The struct is separate from struct timekeeper as it is also used
- * for a fast NMI safe accessors.
+ * for the fast NMI safe accessors.
*
* @base_real is for the fast NMI safe accessor to allow reading clock
* realtime from any context.
@@ -44,33 +44,41 @@ struct tk_read_base {
/**
* struct timekeeper - Structure holding internal timekeeping values.
- * @tkr_mono: The readout base structure for CLOCK_MONOTONIC
- * @tkr_raw: The readout base structure for CLOCK_MONOTONIC_RAW
- * @xtime_sec: Current CLOCK_REALTIME time in seconds
- * @ktime_sec: Current CLOCK_MONOTONIC time in seconds
- * @wall_to_monotonic: CLOCK_REALTIME to CLOCK_MONOTONIC offset
- * @offs_real: Offset clock monotonic -> clock realtime
- * @offs_boot: Offset clock monotonic -> clock boottime
- * @offs_tai: Offset clock monotonic -> clock tai
- * @tai_offset: The current UTC to TAI offset in seconds
- * @clock_was_set_seq: The sequence number of clock was set events
- * @cs_was_changed_seq: The sequence number of clocksource change events
- * @next_leap_ktime: CLOCK_MONOTONIC time value of a pending leap-second
- * @raw_sec: CLOCK_MONOTONIC_RAW time in seconds
- * @monotonic_to_boot: CLOCK_MONOTONIC to CLOCK_BOOTTIME offset
- * @cycle_interval: Number of clock cycles in one NTP interval
- * @xtime_interval: Number of clock shifted nano seconds in one NTP
- * interval.
- * @xtime_remainder: Shifted nano seconds left over when rounding
- * @cycle_interval
- * @raw_interval: Shifted raw nano seconds accumulated per NTP interval.
- * @ntp_error: Difference between accumulated time and NTP time in ntp
- * shifted nano seconds.
- * @ntp_error_shift: Shift conversion between clock shifted nano seconds and
- * ntp shifted nano seconds.
- * @last_warning: Warning ratelimiter (DEBUG_TIMEKEEPING)
- * @underflow_seen: Underflow warning flag (DEBUG_TIMEKEEPING)
- * @overflow_seen: Overflow warning flag (DEBUG_TIMEKEEPING)
+ * @tkr_mono: The readout base structure for CLOCK_MONOTONIC
+ * @xtime_sec: Current CLOCK_REALTIME time in seconds
+ * @ktime_sec: Current CLOCK_MONOTONIC time in seconds
+ * @wall_to_monotonic: CLOCK_REALTIME to CLOCK_MONOTONIC offset
+ * @offs_real: Offset clock monotonic -> clock realtime
+ * @offs_boot: Offset clock monotonic -> clock boottime
+ * @offs_tai: Offset clock monotonic -> clock tai
+ * @tai_offset: The current UTC to TAI offset in seconds
+ * @tkr_raw: The readout base structure for CLOCK_MONOTONIC_RAW
+ * @raw_sec: CLOCK_MONOTONIC_RAW time in seconds
+ * @clock_was_set_seq: The sequence number of clock was set events
+ * @cs_was_changed_seq: The sequence number of clocksource change events
+ * @monotonic_to_boot: CLOCK_MONOTONIC to CLOCK_BOOTTIME offset
+ * @cycle_interval: Number of clock cycles in one NTP interval
+ * @xtime_interval: Number of clock shifted nano seconds in one NTP
+ * interval.
+ * @xtime_remainder: Shifted nano seconds left over when rounding
+ * @cycle_interval
+ * @raw_interval: Shifted raw nano seconds accumulated per NTP interval.
+ * @next_leap_ktime: CLOCK_MONOTONIC time value of a pending leap-second
+ * @ntp_tick: The ntp_tick_length() value currently being
+ * used. This cached copy ensures we consistently
+ * apply the tick length for an entire tick, as
+ * ntp_tick_length may change mid-tick, and we don't
+ * want to apply that new value to the tick in
+ * progress.
+ * @ntp_error: Difference between accumulated time and NTP time in ntp
+ * shifted nano seconds.
+ * @ntp_error_shift: Shift conversion between clock shifted nano seconds and
+ * ntp shifted nano seconds.
+ * @ntp_err_mult: Multiplication factor for scaled math conversion
+ * @skip_second_overflow: Flag used to avoid updating NTP twice with same second
+ * @last_warning: Warning ratelimiter (DEBUG_TIMEKEEPING)
+ * @underflow_seen: Underflow warning flag (DEBUG_TIMEKEEPING)
+ * @overflow_seen: Overflow warning flag (DEBUG_TIMEKEEPING)
*
* Note: For timespec(64) based interfaces wall_to_monotonic is what
* we need to add to xtime (or xtime corrected for sub jiffy times)
@@ -88,10 +96,28 @@ struct tk_read_base {
*
* @monotonic_to_boottime is a timespec64 representation of @offs_boot to
* accelerate the VDSO update for CLOCK_BOOTTIME.
+ *
+ * The cacheline ordering of the structure is optimized for in kernel usage of
+ * the ktime_get() and ktime_get_ts64() family of time accessors. Struct
+ * timekeeper is prepended in the core timekeeping code with a sequence count,
+ * which results in the following cacheline layout:
+ *
+ * 0: seqcount, tkr_mono
+ * 1: xtime_sec ... tai_offset
+ * 2: tkr_raw, raw_sec
+ * 3,4: Internal variables
+ *
+ * Cacheline 0,1 contain the data which is used for accessing
+ * CLOCK_MONOTONIC/REALTIME/BOOTTIME/TAI, while cacheline 2 contains the
+ * data for accessing CLOCK_MONOTONIC_RAW. Cacheline 3,4 are internal
+ * variables which are only accessed during timekeeper updates once per
+ * tick.
*/
struct timekeeper {
+ /* Cacheline 0 (together with prepended seqcount of timekeeper core): */
struct tk_read_base tkr_mono;
- struct tk_read_base tkr_raw;
+
+ /* Cacheline 1: */
u64 xtime_sec;
unsigned long ktime_sec;
struct timespec64 wall_to_monotonic;
@@ -99,31 +125,29 @@ struct timekeeper {
ktime_t offs_boot;
ktime_t offs_tai;
s32 tai_offset;
+
+ /* Cacheline 2: */
+ struct tk_read_base tkr_raw;
+ u64 raw_sec;
+
+ /* Cachline 3 and 4 (timekeeping internal variables): */
unsigned int clock_was_set_seq;
u8 cs_was_changed_seq;
- ktime_t next_leap_ktime;
- u64 raw_sec;
+
struct timespec64 monotonic_to_boot;
- /* The following members are for timekeeping internal use */
u64 cycle_interval;
u64 xtime_interval;
s64 xtime_remainder;
u64 raw_interval;
- /* The ntp_tick_length() value currently being used.
- * This cached copy ensures we consistently apply the tick
- * length for an entire tick, as ntp_tick_length may change
- * mid-tick, and we don't want to apply that new value to
- * the tick in progress.
- */
+
+ ktime_t next_leap_ktime;
u64 ntp_tick;
- /* Difference between accumulated time and NTP time in ntp
- * shifted nano seconds. */
s64 ntp_error;
u32 ntp_error_shift;
u32 ntp_err_mult;
- /* Flag used to avoid updating NTP twice with same second */
u32 skip_second_overflow;
+
#ifdef CONFIG_DEBUG_TIMEKEEPING
long last_warning;
/*
--
2.39.5
^ permalink raw reply related [flat|nested] 53+ messages in thread
* [PATCH v2 07/25] timekeeping: Move shadow_timekeeper into tk_core
2024-10-09 8:28 [PATCH v2 00/25] timekeeping: Rework to prepare support of indenpendent PTP clocks Anna-Maria Behnsen
` (5 preceding siblings ...)
2024-10-09 8:28 ` [PATCH v2 06/25] timekeeping: Reorder struct timekeeper Anna-Maria Behnsen
@ 2024-10-09 8:29 ` Anna-Maria Behnsen
2024-10-24 21:11 ` John Stultz
2024-10-09 8:29 ` [PATCH v2 08/25] timekeeping: Encapsulate locking/unlocking of timekeeper_lock Anna-Maria Behnsen
` (17 subsequent siblings)
24 siblings, 1 reply; 53+ messages in thread
From: Anna-Maria Behnsen @ 2024-10-09 8:29 UTC (permalink / raw)
To: John Stultz, Frederic Weisbecker, Thomas Gleixner
Cc: linux-kernel, netdev, Miroslav Lichvar, Richard Cochran,
Christopher S Hall, Anna-Maria Behnsen
From: Thomas Gleixner <tglx@linutronix.de>
From: Thomas Gleixner <tglx@linutronix.de>
tk_core requires shadow_timekeeper to allow timekeeping_advance() updating
without holding the timekeeper sequence count write locked. This allows the
readers to make progress up to the actual update where the shadow
timekeeper is copied over to the real timekeeper.
As long as there is only a single timekeeper, having them separate is
fine. But when the timekeeper infrastructure will be reused for per ptp
clock timekeepers, shadow_timekeeper needs to be part of tk_core.
No functional change.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
---
kernel/time/timekeeping.c | 7 +++----
1 file changed, 3 insertions(+), 4 deletions(-)
diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index e747e46a1a2d..267b28cf2ab0 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -50,11 +50,11 @@ DEFINE_RAW_SPINLOCK(timekeeper_lock);
static struct {
seqcount_raw_spinlock_t seq;
struct timekeeper timekeeper;
+ struct timekeeper shadow_timekeeper;
} tk_core ____cacheline_aligned = {
.seq = SEQCNT_RAW_SPINLOCK_ZERO(tk_core.seq, &timekeeper_lock),
};
-static struct timekeeper shadow_timekeeper;
/* flag for if timekeeping is suspended */
int __read_mostly timekeeping_suspended;
@@ -776,8 +776,7 @@ static void timekeeping_update(struct timekeeper *tk, unsigned int action)
* timekeeper structure on the next update with stale data
*/
if (action & TK_MIRROR)
- memcpy(&shadow_timekeeper, &tk_core.timekeeper,
- sizeof(tk_core.timekeeper));
+ memcpy(&tk_core.shadow_timekeeper, &tk_core.timekeeper, sizeof(tk_core.timekeeper));
}
/**
@@ -2274,8 +2273,8 @@ static u64 logarithmic_accumulation(struct timekeeper *tk, u64 offset,
*/
static bool timekeeping_advance(enum timekeeping_adv_mode mode)
{
+ struct timekeeper *tk = &tk_core.shadow_timekeeper;
struct timekeeper *real_tk = &tk_core.timekeeper;
- struct timekeeper *tk = &shadow_timekeeper;
unsigned int clock_set = 0;
int shift = 0, maxshift;
u64 offset;
--
2.39.5
^ permalink raw reply related [flat|nested] 53+ messages in thread* Re: [PATCH v2 07/25] timekeeping: Move shadow_timekeeper into tk_core
2024-10-09 8:29 ` [PATCH v2 07/25] timekeeping: Move shadow_timekeeper into tk_core Anna-Maria Behnsen
@ 2024-10-24 21:11 ` John Stultz
0 siblings, 0 replies; 53+ messages in thread
From: John Stultz @ 2024-10-24 21:11 UTC (permalink / raw)
To: Anna-Maria Behnsen
Cc: Frederic Weisbecker, Thomas Gleixner, linux-kernel, netdev,
Miroslav Lichvar, Richard Cochran, Christopher S Hall
On Wed, Oct 9, 2024 at 1:29 AM Anna-Maria Behnsen
<anna-maria@linutronix.de> wrote:
>
> From: Thomas Gleixner <tglx@linutronix.de>
>
> From: Thomas Gleixner <tglx@linutronix.de>
>
> tk_core requires shadow_timekeeper to allow timekeeping_advance() updating
> without holding the timekeeper sequence count write locked. This allows the
> readers to make progress up to the actual update where the shadow
> timekeeper is copied over to the real timekeeper.
>
> As long as there is only a single timekeeper, having them separate is
> fine. But when the timekeeper infrastructure will be reused for per ptp
> clock timekeepers, shadow_timekeeper needs to be part of tk_core.
>
> No functional change.
>
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
Acked-by: John Stultz <jstultz@google.com>
^ permalink raw reply [flat|nested] 53+ messages in thread
* [PATCH v2 08/25] timekeeping: Encapsulate locking/unlocking of timekeeper_lock
2024-10-09 8:28 [PATCH v2 00/25] timekeeping: Rework to prepare support of indenpendent PTP clocks Anna-Maria Behnsen
` (6 preceding siblings ...)
2024-10-09 8:29 ` [PATCH v2 07/25] timekeeping: Move shadow_timekeeper into tk_core Anna-Maria Behnsen
@ 2024-10-09 8:29 ` Anna-Maria Behnsen
2024-10-24 21:19 ` John Stultz
2024-10-09 8:29 ` [PATCH v2 09/25] timekeeping: Move timekeeper_lock into tk_core Anna-Maria Behnsen
` (16 subsequent siblings)
24 siblings, 1 reply; 53+ messages in thread
From: Anna-Maria Behnsen @ 2024-10-09 8:29 UTC (permalink / raw)
To: John Stultz, Frederic Weisbecker, Thomas Gleixner
Cc: linux-kernel, netdev, Miroslav Lichvar, Richard Cochran,
Christopher S Hall, Anna-Maria Behnsen
From: Thomas Gleixner <tglx@linutronix.de>
From: Thomas Gleixner <tglx@linutronix.de>
timekeeper_lock protects updates of timekeeper (tk_core). It is also used
by vdso_update_begin/end() and not only internally by the timekeeper code.
As long as there is only a single timekeeper, this works fine. But when
the timekeeper infrastructure will be reused for per ptp clock timekeepers,
timekeeper_lock needs to be part of tk_core..
Therefore encapuslate locking/unlocking of timekeeper_lock and make the
lock static.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
---
kernel/time/timekeeping.c | 15 ++++++++++++++-
kernel/time/timekeeping_internal.h | 3 ++-
kernel/time/vsyscall.c | 5 ++---
3 files changed, 18 insertions(+), 5 deletions(-)
diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index 267b28cf2ab0..c69004936b29 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -41,7 +41,7 @@ enum timekeeping_adv_mode {
TK_ADV_FREQ
};
-DEFINE_RAW_SPINLOCK(timekeeper_lock);
+static DEFINE_RAW_SPINLOCK(timekeeper_lock);
/*
* The most important data for readout fits into a single 64 byte
@@ -114,6 +114,19 @@ static struct tk_fast tk_fast_raw ____cacheline_aligned = {
.base[1] = FAST_TK_INIT,
};
+unsigned long timekeeper_lock_irqsave(void)
+{
+ unsigned long flags;
+
+ raw_spin_lock_irqsave(&timekeeper_lock, flags);
+ return flags;
+}
+
+void timekeeper_unlock_irqrestore(unsigned long flags)
+{
+ raw_spin_unlock_irqrestore(&timekeeper_lock, flags);
+}
+
static inline void tk_normalize_xtime(struct timekeeper *tk)
{
while (tk->tkr_mono.xtime_nsec >= ((u64)NSEC_PER_SEC << tk->tkr_mono.shift)) {
diff --git a/kernel/time/timekeeping_internal.h b/kernel/time/timekeeping_internal.h
index 4ca2787d1642..3babb2bac579 100644
--- a/kernel/time/timekeeping_internal.h
+++ b/kernel/time/timekeeping_internal.h
@@ -34,6 +34,7 @@ static inline u64 clocksource_delta(u64 now, u64 last, u64 mask)
#endif
/* Semi public for serialization of non timekeeper VDSO updates. */
-extern raw_spinlock_t timekeeper_lock;
+unsigned long timekeeper_lock_irqsave(void);
+void timekeeper_unlock_irqrestore(unsigned long flags);
#endif /* _TIMEKEEPING_INTERNAL_H */
diff --git a/kernel/time/vsyscall.c b/kernel/time/vsyscall.c
index 9193d6133e5d..98488b20b594 100644
--- a/kernel/time/vsyscall.c
+++ b/kernel/time/vsyscall.c
@@ -151,9 +151,8 @@ void update_vsyscall_tz(void)
unsigned long vdso_update_begin(void)
{
struct vdso_data *vdata = __arch_get_k_vdso_data();
- unsigned long flags;
+ unsigned long flags = timekeeper_lock_irqsave();
- raw_spin_lock_irqsave(&timekeeper_lock, flags);
vdso_write_begin(vdata);
return flags;
}
@@ -172,5 +171,5 @@ void vdso_update_end(unsigned long flags)
vdso_write_end(vdata);
__arch_sync_vdso_data(vdata);
- raw_spin_unlock_irqrestore(&timekeeper_lock, flags);
+ timekeeper_unlock_irqrestore(flags);
}
--
2.39.5
^ permalink raw reply related [flat|nested] 53+ messages in thread* Re: [PATCH v2 08/25] timekeeping: Encapsulate locking/unlocking of timekeeper_lock
2024-10-09 8:29 ` [PATCH v2 08/25] timekeeping: Encapsulate locking/unlocking of timekeeper_lock Anna-Maria Behnsen
@ 2024-10-24 21:19 ` John Stultz
0 siblings, 0 replies; 53+ messages in thread
From: John Stultz @ 2024-10-24 21:19 UTC (permalink / raw)
To: Anna-Maria Behnsen
Cc: Frederic Weisbecker, Thomas Gleixner, linux-kernel, netdev,
Miroslav Lichvar, Richard Cochran, Christopher S Hall
On Wed, Oct 9, 2024 at 1:29 AM Anna-Maria Behnsen
<anna-maria@linutronix.de> wrote:
>
> From: Thomas Gleixner <tglx@linutronix.de>
>
> From: Thomas Gleixner <tglx@linutronix.de>
>
> timekeeper_lock protects updates of timekeeper (tk_core). It is also used
> by vdso_update_begin/end() and not only internally by the timekeeper code.
>
> As long as there is only a single timekeeper, this works fine. But when
> the timekeeper infrastructure will be reused for per ptp clock timekeepers,
> timekeeper_lock needs to be part of tk_core..
>
> Therefore encapuslate locking/unlocking of timekeeper_lock and make the
> lock static.
>
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
Acked-by: John Stultz <jstultz@google.com>
^ permalink raw reply [flat|nested] 53+ messages in thread
* [PATCH v2 09/25] timekeeping: Move timekeeper_lock into tk_core
2024-10-09 8:28 [PATCH v2 00/25] timekeeping: Rework to prepare support of indenpendent PTP clocks Anna-Maria Behnsen
` (7 preceding siblings ...)
2024-10-09 8:29 ` [PATCH v2 08/25] timekeeping: Encapsulate locking/unlocking of timekeeper_lock Anna-Maria Behnsen
@ 2024-10-09 8:29 ` Anna-Maria Behnsen
2024-10-24 21:20 ` John Stultz
2024-10-09 8:29 ` [PATCH v2 10/25] timekeeping: Define a struct type for tk_core to make it reusable Anna-Maria Behnsen
` (15 subsequent siblings)
24 siblings, 1 reply; 53+ messages in thread
From: Anna-Maria Behnsen @ 2024-10-09 8:29 UTC (permalink / raw)
To: John Stultz, Frederic Weisbecker, Thomas Gleixner
Cc: linux-kernel, netdev, Miroslav Lichvar, Richard Cochran,
Christopher S Hall, Anna-Maria Behnsen
From: Anna-Maria Behnsen <anna-maria@linutronix.de>
timekeeper_lock protects updates to struct tk_core but is not part of
struct tk_core. As long as there is only a single timekeeper, this is not a
problem. But when the timekeeper infrastructure will be reused for per ptp
clock timekeepers, timekeeper_lock needs to be part of tk_core.
Move the lock into tk_core, move initialisation of the lock and sequence
counter into timekeeping_init() and update all users of timekeeper_lock.
As this is touching all lock sites, convert them to use:
guard(raw_spinlock_irqsave)(&tk_core.lock);
instead of lock/unlock functions whenever possible.
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
---
kernel/time/timekeeping.c | 72 +++++++++++++++++++----------------------------
1 file changed, 29 insertions(+), 43 deletions(-)
diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index c69004936b29..8557d32e5e3d 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -41,8 +41,6 @@ enum timekeeping_adv_mode {
TK_ADV_FREQ
};
-static DEFINE_RAW_SPINLOCK(timekeeper_lock);
-
/*
* The most important data for readout fits into a single 64 byte
* cache line.
@@ -51,10 +49,8 @@ static struct {
seqcount_raw_spinlock_t seq;
struct timekeeper timekeeper;
struct timekeeper shadow_timekeeper;
-} tk_core ____cacheline_aligned = {
- .seq = SEQCNT_RAW_SPINLOCK_ZERO(tk_core.seq, &timekeeper_lock),
-};
-
+ raw_spinlock_t lock;
+} tk_core ____cacheline_aligned;
/* flag for if timekeeping is suspended */
int __read_mostly timekeeping_suspended;
@@ -118,13 +114,13 @@ unsigned long timekeeper_lock_irqsave(void)
{
unsigned long flags;
- raw_spin_lock_irqsave(&timekeeper_lock, flags);
+ raw_spin_lock_irqsave(&tk_core.lock, flags);
return flags;
}
void timekeeper_unlock_irqrestore(unsigned long flags)
{
- raw_spin_unlock_irqrestore(&timekeeper_lock, flags);
+ raw_spin_unlock_irqrestore(&tk_core.lock, flags);
}
static inline void tk_normalize_xtime(struct timekeeper *tk)
@@ -197,7 +193,7 @@ static inline void tk_update_sleep_time(struct timekeeper *tk, ktime_t delta)
* the tkr's clocksource may change between the read reference, and the
* clock reference passed to the read function. This can cause crashes if
* the wrong clocksource is passed to the wrong read function.
- * This isn't necessary to use when holding the timekeeper_lock or doing
+ * This isn't necessary to use when holding the tk_core.lock or doing
* a read of the fast-timekeeper tkrs (which is protected by its own locking
* and update logic).
*/
@@ -689,13 +685,11 @@ static void update_pvclock_gtod(struct timekeeper *tk, bool was_set)
int pvclock_gtod_register_notifier(struct notifier_block *nb)
{
struct timekeeper *tk = &tk_core.timekeeper;
- unsigned long flags;
int ret;
- raw_spin_lock_irqsave(&timekeeper_lock, flags);
+ guard(raw_spinlock_irqsave)(&tk_core.lock);
ret = raw_notifier_chain_register(&pvclock_gtod_chain, nb);
update_pvclock_gtod(tk, true);
- raw_spin_unlock_irqrestore(&timekeeper_lock, flags);
return ret;
}
@@ -708,14 +702,8 @@ EXPORT_SYMBOL_GPL(pvclock_gtod_register_notifier);
*/
int pvclock_gtod_unregister_notifier(struct notifier_block *nb)
{
- unsigned long flags;
- int ret;
-
- raw_spin_lock_irqsave(&timekeeper_lock, flags);
- ret = raw_notifier_chain_unregister(&pvclock_gtod_chain, nb);
- raw_spin_unlock_irqrestore(&timekeeper_lock, flags);
-
- return ret;
+ guard(raw_spinlock_irqsave)(&tk_core.lock);
+ return raw_notifier_chain_unregister(&pvclock_gtod_chain, nb);
}
EXPORT_SYMBOL_GPL(pvclock_gtod_unregister_notifier);
@@ -763,7 +751,7 @@ static inline void tk_update_ktime_data(struct timekeeper *tk)
tk->tkr_raw.base = ns_to_ktime(tk->raw_sec * NSEC_PER_SEC);
}
-/* must hold timekeeper_lock */
+/* must hold tk_core.lock */
static void timekeeping_update(struct timekeeper *tk, unsigned int action)
{
if (action & TK_CLEAR_NTP) {
@@ -1460,7 +1448,7 @@ int do_settimeofday64(const struct timespec64 *ts)
if (!timespec64_valid_settod(ts))
return -EINVAL;
- raw_spin_lock_irqsave(&timekeeper_lock, flags);
+ raw_spin_lock_irqsave(&tk_core.lock, flags);
write_seqcount_begin(&tk_core.seq);
timekeeping_forward_now(tk);
@@ -1480,7 +1468,7 @@ int do_settimeofday64(const struct timespec64 *ts)
timekeeping_update(tk, TK_CLEAR_NTP | TK_MIRROR | TK_CLOCK_WAS_SET);
write_seqcount_end(&tk_core.seq);
- raw_spin_unlock_irqrestore(&timekeeper_lock, flags);
+ raw_spin_unlock_irqrestore(&tk_core.lock, flags);
/* Signal hrtimers about time change */
clock_was_set(CLOCK_SET_WALL);
@@ -1510,7 +1498,7 @@ static int timekeeping_inject_offset(const struct timespec64 *ts)
if (ts->tv_nsec < 0 || ts->tv_nsec >= NSEC_PER_SEC)
return -EINVAL;
- raw_spin_lock_irqsave(&timekeeper_lock, flags);
+ raw_spin_lock_irqsave(&tk_core.lock, flags);
write_seqcount_begin(&tk_core.seq);
timekeeping_forward_now(tk);
@@ -1530,7 +1518,7 @@ static int timekeeping_inject_offset(const struct timespec64 *ts)
timekeeping_update(tk, TK_CLEAR_NTP | TK_MIRROR | TK_CLOCK_WAS_SET);
write_seqcount_end(&tk_core.seq);
- raw_spin_unlock_irqrestore(&timekeeper_lock, flags);
+ raw_spin_unlock_irqrestore(&tk_core.lock, flags);
/* Signal hrtimers about time change */
clock_was_set(CLOCK_SET_WALL);
@@ -1606,7 +1594,7 @@ static int change_clocksource(void *data)
return 0;
}
- raw_spin_lock_irqsave(&timekeeper_lock, flags);
+ raw_spin_lock_irqsave(&tk_core.lock, flags);
write_seqcount_begin(&tk_core.seq);
timekeeping_forward_now(tk);
@@ -1615,7 +1603,7 @@ static int change_clocksource(void *data)
timekeeping_update(tk, TK_CLEAR_NTP | TK_MIRROR | TK_CLOCK_WAS_SET);
write_seqcount_end(&tk_core.seq);
- raw_spin_unlock_irqrestore(&timekeeper_lock, flags);
+ raw_spin_unlock_irqrestore(&tk_core.lock, flags);
if (old) {
if (old->disable)
@@ -1770,7 +1758,9 @@ void __init timekeeping_init(void)
struct timespec64 wall_time, boot_offset, wall_to_mono;
struct timekeeper *tk = &tk_core.timekeeper;
struct clocksource *clock;
- unsigned long flags;
+
+ raw_spin_lock_init(&tk_core.lock);
+ seqcount_raw_spinlock_init(&tk_core.seq, &tkd->lock);
read_persistent_wall_and_boot_offset(&wall_time, &boot_offset);
if (timespec64_valid_settod(&wall_time) &&
@@ -1790,7 +1780,7 @@ void __init timekeeping_init(void)
*/
wall_to_mono = timespec64_sub(boot_offset, wall_time);
- raw_spin_lock_irqsave(&timekeeper_lock, flags);
+ guard(raw_spinlock_irqsave)(&tk_core.lock);
write_seqcount_begin(&tk_core.seq);
ntp_init();
@@ -1807,7 +1797,6 @@ void __init timekeeping_init(void)
timekeeping_update(tk, TK_MIRROR | TK_CLOCK_WAS_SET);
write_seqcount_end(&tk_core.seq);
- raw_spin_unlock_irqrestore(&timekeeper_lock, flags);
}
/* time in seconds when suspend began for persistent clock */
@@ -1888,7 +1877,7 @@ void timekeeping_inject_sleeptime64(const struct timespec64 *delta)
struct timekeeper *tk = &tk_core.timekeeper;
unsigned long flags;
- raw_spin_lock_irqsave(&timekeeper_lock, flags);
+ raw_spin_lock_irqsave(&tk_core.lock, flags);
write_seqcount_begin(&tk_core.seq);
suspend_timing_needed = false;
@@ -1900,7 +1889,7 @@ void timekeeping_inject_sleeptime64(const struct timespec64 *delta)
timekeeping_update(tk, TK_CLEAR_NTP | TK_MIRROR | TK_CLOCK_WAS_SET);
write_seqcount_end(&tk_core.seq);
- raw_spin_unlock_irqrestore(&timekeeper_lock, flags);
+ raw_spin_unlock_irqrestore(&tk_core.lock, flags);
/* Signal hrtimers about time change */
clock_was_set(CLOCK_SET_WALL | CLOCK_SET_BOOT);
@@ -1924,7 +1913,7 @@ void timekeeping_resume(void)
clockevents_resume();
clocksource_resume();
- raw_spin_lock_irqsave(&timekeeper_lock, flags);
+ raw_spin_lock_irqsave(&tk_core.lock, flags);
write_seqcount_begin(&tk_core.seq);
/*
@@ -1962,7 +1951,7 @@ void timekeeping_resume(void)
timekeeping_suspended = 0;
timekeeping_update(tk, TK_MIRROR | TK_CLOCK_WAS_SET);
write_seqcount_end(&tk_core.seq);
- raw_spin_unlock_irqrestore(&timekeeper_lock, flags);
+ raw_spin_unlock_irqrestore(&tk_core.lock, flags);
touch_softlockup_watchdog();
@@ -1993,7 +1982,7 @@ int timekeeping_suspend(void)
suspend_timing_needed = true;
- raw_spin_lock_irqsave(&timekeeper_lock, flags);
+ raw_spin_lock_irqsave(&tk_core.lock, flags);
write_seqcount_begin(&tk_core.seq);
timekeeping_forward_now(tk);
timekeeping_suspended = 1;
@@ -2032,7 +2021,7 @@ int timekeeping_suspend(void)
timekeeping_update(tk, TK_MIRROR);
halt_fast_timekeeper(tk);
write_seqcount_end(&tk_core.seq);
- raw_spin_unlock_irqrestore(&timekeeper_lock, flags);
+ raw_spin_unlock_irqrestore(&tk_core.lock, flags);
tick_suspend();
clocksource_suspend();
@@ -2292,7 +2281,7 @@ static bool timekeeping_advance(enum timekeeping_adv_mode mode)
int shift = 0, maxshift;
u64 offset;
- guard(raw_spinlock_irqsave)(&timekeeper_lock);
+ guard(raw_spinlock_irqsave)(&tk_core.lock);
/* Make sure we're fully resumed: */
if (unlikely(timekeeping_suspended))
@@ -2589,7 +2578,7 @@ int do_adjtimex(struct __kernel_timex *txc)
ktime_get_real_ts64(&ts);
add_device_randomness(&ts, sizeof(ts));
- raw_spin_lock_irqsave(&timekeeper_lock, flags);
+ raw_spin_lock_irqsave(&tk_core.lock, flags);
write_seqcount_begin(&tk_core.seq);
orig_tai = tai = tk->tai_offset;
@@ -2604,7 +2593,7 @@ int do_adjtimex(struct __kernel_timex *txc)
}
write_seqcount_end(&tk_core.seq);
- raw_spin_unlock_irqrestore(&timekeeper_lock, flags);
+ raw_spin_unlock_irqrestore(&tk_core.lock, flags);
audit_ntp_log(&ad);
@@ -2628,11 +2617,8 @@ int do_adjtimex(struct __kernel_timex *txc)
*/
void hardpps(const struct timespec64 *phase_ts, const struct timespec64 *raw_ts)
{
- unsigned long flags;
-
- raw_spin_lock_irqsave(&timekeeper_lock, flags);
+ guard(raw_spinlock_irqsave)(&tk_core.lock);
__hardpps(phase_ts, raw_ts);
- raw_spin_unlock_irqrestore(&timekeeper_lock, flags);
}
EXPORT_SYMBOL(hardpps);
#endif /* CONFIG_NTP_PPS */
--
2.39.5
^ permalink raw reply related [flat|nested] 53+ messages in thread* Re: [PATCH v2 09/25] timekeeping: Move timekeeper_lock into tk_core
2024-10-09 8:29 ` [PATCH v2 09/25] timekeeping: Move timekeeper_lock into tk_core Anna-Maria Behnsen
@ 2024-10-24 21:20 ` John Stultz
0 siblings, 0 replies; 53+ messages in thread
From: John Stultz @ 2024-10-24 21:20 UTC (permalink / raw)
To: Anna-Maria Behnsen
Cc: Frederic Weisbecker, Thomas Gleixner, linux-kernel, netdev,
Miroslav Lichvar, Richard Cochran, Christopher S Hall
On Wed, Oct 9, 2024 at 1:29 AM Anna-Maria Behnsen
<anna-maria@linutronix.de> wrote:
>
> From: Anna-Maria Behnsen <anna-maria@linutronix.de>
>
> timekeeper_lock protects updates to struct tk_core but is not part of
> struct tk_core. As long as there is only a single timekeeper, this is not a
> problem. But when the timekeeper infrastructure will be reused for per ptp
> clock timekeepers, timekeeper_lock needs to be part of tk_core.
>
> Move the lock into tk_core, move initialisation of the lock and sequence
> counter into timekeeping_init() and update all users of timekeeper_lock.
>
> As this is touching all lock sites, convert them to use:
>
> guard(raw_spinlock_irqsave)(&tk_core.lock);
>
> instead of lock/unlock functions whenever possible.
>
> Suggested-by: Thomas Gleixner <tglx@linutronix.de>
> Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
Acked-by: John Stultz <jstultz@google.com>
^ permalink raw reply [flat|nested] 53+ messages in thread
* [PATCH v2 10/25] timekeeping: Define a struct type for tk_core to make it reusable
2024-10-09 8:28 [PATCH v2 00/25] timekeeping: Rework to prepare support of indenpendent PTP clocks Anna-Maria Behnsen
` (8 preceding siblings ...)
2024-10-09 8:29 ` [PATCH v2 09/25] timekeeping: Move timekeeper_lock into tk_core Anna-Maria Behnsen
@ 2024-10-09 8:29 ` Anna-Maria Behnsen
2024-10-24 21:21 ` John Stultz
2024-10-09 8:29 ` [PATCH v2 11/25] timekeeping: Introduce tkd_basic_setup() to make lock and seqcount init reusable Anna-Maria Behnsen
` (14 subsequent siblings)
24 siblings, 1 reply; 53+ messages in thread
From: Anna-Maria Behnsen @ 2024-10-09 8:29 UTC (permalink / raw)
To: John Stultz, Frederic Weisbecker, Thomas Gleixner
Cc: linux-kernel, netdev, Miroslav Lichvar, Richard Cochran,
Christopher S Hall, Anna-Maria Behnsen
From: Anna-Maria Behnsen <anna-maria@linutronix.de>
The struct tk_core uses is not reusable. As long as there is only a single
timekeeper, this is not a problem. But when the timekeeper infrastructure
will be reused for per ptp clock timekeepers, an explicit struct type is
required.
Define struct tk_data as explicit struct type for tk_core.
Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
---
kernel/time/timekeeping.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index 8557d32e5e3d..40c60bb88416 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -45,12 +45,14 @@ enum timekeeping_adv_mode {
* The most important data for readout fits into a single 64 byte
* cache line.
*/
-static struct {
+struct tk_data {
seqcount_raw_spinlock_t seq;
struct timekeeper timekeeper;
struct timekeeper shadow_timekeeper;
raw_spinlock_t lock;
-} tk_core ____cacheline_aligned;
+} ____cacheline_aligned;
+
+static struct tk_data tk_core;
/* flag for if timekeeping is suspended */
int __read_mostly timekeeping_suspended;
--
2.39.5
^ permalink raw reply related [flat|nested] 53+ messages in thread* Re: [PATCH v2 10/25] timekeeping: Define a struct type for tk_core to make it reusable
2024-10-09 8:29 ` [PATCH v2 10/25] timekeeping: Define a struct type for tk_core to make it reusable Anna-Maria Behnsen
@ 2024-10-24 21:21 ` John Stultz
0 siblings, 0 replies; 53+ messages in thread
From: John Stultz @ 2024-10-24 21:21 UTC (permalink / raw)
To: Anna-Maria Behnsen
Cc: Frederic Weisbecker, Thomas Gleixner, linux-kernel, netdev,
Miroslav Lichvar, Richard Cochran, Christopher S Hall
On Wed, Oct 9, 2024 at 1:29 AM Anna-Maria Behnsen
<anna-maria@linutronix.de> wrote:
>
> From: Anna-Maria Behnsen <anna-maria@linutronix.de>
>
> The struct tk_core uses is not reusable. As long as there is only a single
> timekeeper, this is not a problem. But when the timekeeper infrastructure
> will be reused for per ptp clock timekeepers, an explicit struct type is
> required.
>
> Define struct tk_data as explicit struct type for tk_core.
>
> Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
Acked-by: John Stultz <jstultz@google.com>
^ permalink raw reply [flat|nested] 53+ messages in thread
* [PATCH v2 11/25] timekeeping: Introduce tkd_basic_setup() to make lock and seqcount init reusable
2024-10-09 8:28 [PATCH v2 00/25] timekeeping: Rework to prepare support of indenpendent PTP clocks Anna-Maria Behnsen
` (9 preceding siblings ...)
2024-10-09 8:29 ` [PATCH v2 10/25] timekeeping: Define a struct type for tk_core to make it reusable Anna-Maria Behnsen
@ 2024-10-09 8:29 ` Anna-Maria Behnsen
2024-10-24 21:25 ` John Stultz
2024-10-09 8:29 ` [PATCH v2 12/25] timekeeping: Add struct tk_data as argument to timekeeping_update() Anna-Maria Behnsen
` (13 subsequent siblings)
24 siblings, 1 reply; 53+ messages in thread
From: Anna-Maria Behnsen @ 2024-10-09 8:29 UTC (permalink / raw)
To: John Stultz, Frederic Weisbecker, Thomas Gleixner
Cc: linux-kernel, netdev, Miroslav Lichvar, Richard Cochran,
Christopher S Hall, Anna-Maria Behnsen
Initialization of lock and seqcount needs to be done for every instance of
timekeeper struct. To be able to easily reuse it, create a separate
function for it.
Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
---
v2: new in v2 - splitted from another patch
---
kernel/time/timekeeping.c | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)
diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index 40c60bb88416..a98f823be6db 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -1734,6 +1734,12 @@ read_persistent_wall_and_boot_offset(struct timespec64 *wall_time,
*boot_offset = ns_to_timespec64(local_clock());
}
+static __init void tkd_basic_setup(struct tk_data *tkd)
+{
+ raw_spin_lock_init(&tkd->lock);
+ seqcount_raw_spinlock_init(&tkd->seq, &tkd->lock);
+}
+
/*
* Flag reflecting whether timekeeping_resume() has injected sleeptime.
*
@@ -1761,8 +1767,7 @@ void __init timekeeping_init(void)
struct timekeeper *tk = &tk_core.timekeeper;
struct clocksource *clock;
- raw_spin_lock_init(&tk_core.lock);
- seqcount_raw_spinlock_init(&tk_core.seq, &tkd->lock);
+ tkd_basic_setup(&tk_core);
read_persistent_wall_and_boot_offset(&wall_time, &boot_offset);
if (timespec64_valid_settod(&wall_time) &&
--
2.39.5
^ permalink raw reply related [flat|nested] 53+ messages in thread* Re: [PATCH v2 11/25] timekeeping: Introduce tkd_basic_setup() to make lock and seqcount init reusable
2024-10-09 8:29 ` [PATCH v2 11/25] timekeeping: Introduce tkd_basic_setup() to make lock and seqcount init reusable Anna-Maria Behnsen
@ 2024-10-24 21:25 ` John Stultz
0 siblings, 0 replies; 53+ messages in thread
From: John Stultz @ 2024-10-24 21:25 UTC (permalink / raw)
To: Anna-Maria Behnsen
Cc: Frederic Weisbecker, Thomas Gleixner, linux-kernel, netdev,
Miroslav Lichvar, Richard Cochran, Christopher S Hall
On Wed, Oct 9, 2024 at 1:29 AM Anna-Maria Behnsen
<anna-maria@linutronix.de> wrote:
>
> Initialization of lock and seqcount needs to be done for every instance of
> timekeeper struct. To be able to easily reuse it, create a separate
> function for it.
>
> Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
Acked-by: John Stultz <jstultz@google.com>
^ permalink raw reply [flat|nested] 53+ messages in thread
* [PATCH v2 12/25] timekeeping: Add struct tk_data as argument to timekeeping_update()
2024-10-09 8:28 [PATCH v2 00/25] timekeeping: Rework to prepare support of indenpendent PTP clocks Anna-Maria Behnsen
` (10 preceding siblings ...)
2024-10-09 8:29 ` [PATCH v2 11/25] timekeeping: Introduce tkd_basic_setup() to make lock and seqcount init reusable Anna-Maria Behnsen
@ 2024-10-09 8:29 ` Anna-Maria Behnsen
2024-10-24 21:29 ` John Stultz
2024-10-09 8:29 ` [PATCH v2 13/25] timekeeping: Split out timekeeper update of timekeeping_advanced() Anna-Maria Behnsen
` (12 subsequent siblings)
24 siblings, 1 reply; 53+ messages in thread
From: Anna-Maria Behnsen @ 2024-10-09 8:29 UTC (permalink / raw)
To: John Stultz, Frederic Weisbecker, Thomas Gleixner
Cc: linux-kernel, netdev, Miroslav Lichvar, Richard Cochran,
Christopher S Hall, Anna-Maria Behnsen
From: Anna-Maria Behnsen <anna-maria@linutronix.de>
Updates of the timekeeper are done in two ways:
1. Updating timekeeper and afterwards memcpy()'ing the result into
shadow_timekeeper using timekeeping_update(). Used everywhere for
updates except in timekeeping_advance(); the sequence counter protected
region starts before the first change to the timekeeper is done.
2. Updating shadow_timekeeper and then memcpy()'ing the result into
timekeeper. Used only by in timekeeping_advance(); The seqence counter
protected region is only around timekeeping_update() and the memcpy for
copy from shadow to timekeeper.
The second option is fast path optimized. The sequence counter protected
region is as short as possible.
As this behaviour is mainly documented by commit messages, but not in code,
it makes the not easy timekeeping code more complicated to read.
There is no reason why updates to the timekeeper can't use the optimized
version everywhere. With this, the code will be cleaner, as code is reused
instead of duplicated.
To be able to access tk_data which contains all required information, add a
pointer to tk_data as an argument to timekeeping_update(). With that
convert the comment about holding the lock into a lockdep assert.
No functional change.
Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
---
kernel/time/timekeeping.c | 25 +++++++++++++------------
1 file changed, 13 insertions(+), 12 deletions(-)
diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index a98f823be6db..878f9606946d 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -528,7 +528,7 @@ EXPORT_SYMBOL_GPL(ktime_get_raw_fast_ns);
* timekeeping_inject_sleeptime64()
* __timekeeping_inject_sleeptime(tk, delta);
* timestamp();
- * timekeeping_update(tk, TK_CLEAR_NTP...);
+ * timekeeping_update(tkd, tk, TK_CLEAR_NTP...);
*
* (2) On 32-bit systems, the 64-bit boot offset (tk->offs_boot) may be
* partially updated. Since the tk->offs_boot update is a rare event, this
@@ -753,9 +753,10 @@ static inline void tk_update_ktime_data(struct timekeeper *tk)
tk->tkr_raw.base = ns_to_ktime(tk->raw_sec * NSEC_PER_SEC);
}
-/* must hold tk_core.lock */
-static void timekeeping_update(struct timekeeper *tk, unsigned int action)
+static void timekeeping_update(struct tk_data *tkd, struct timekeeper *tk, unsigned int action)
{
+ lockdep_assert_held(&tkd->lock);
+
if (action & TK_CLEAR_NTP) {
tk->ntp_error = 0;
ntp_clear();
@@ -1467,7 +1468,7 @@ int do_settimeofday64(const struct timespec64 *ts)
tk_set_xtime(tk, ts);
out:
- timekeeping_update(tk, TK_CLEAR_NTP | TK_MIRROR | TK_CLOCK_WAS_SET);
+ timekeeping_update(&tk_core, tk, TK_CLEAR_NTP | TK_MIRROR | TK_CLOCK_WAS_SET);
write_seqcount_end(&tk_core.seq);
raw_spin_unlock_irqrestore(&tk_core.lock, flags);
@@ -1517,7 +1518,7 @@ static int timekeeping_inject_offset(const struct timespec64 *ts)
tk_set_wall_to_mono(tk, timespec64_sub(tk->wall_to_monotonic, *ts));
error: /* even if we error out, we forwarded the time, so call update */
- timekeeping_update(tk, TK_CLEAR_NTP | TK_MIRROR | TK_CLOCK_WAS_SET);
+ timekeeping_update(&tk_core, tk, TK_CLEAR_NTP | TK_MIRROR | TK_CLOCK_WAS_SET);
write_seqcount_end(&tk_core.seq);
raw_spin_unlock_irqrestore(&tk_core.lock, flags);
@@ -1602,7 +1603,7 @@ static int change_clocksource(void *data)
timekeeping_forward_now(tk);
old = tk->tkr_mono.clock;
tk_setup_internals(tk, new);
- timekeeping_update(tk, TK_CLEAR_NTP | TK_MIRROR | TK_CLOCK_WAS_SET);
+ timekeeping_update(&tk_core, tk, TK_CLEAR_NTP | TK_MIRROR | TK_CLOCK_WAS_SET);
write_seqcount_end(&tk_core.seq);
raw_spin_unlock_irqrestore(&tk_core.lock, flags);
@@ -1801,7 +1802,7 @@ void __init timekeeping_init(void)
tk_set_wall_to_mono(tk, wall_to_mono);
- timekeeping_update(tk, TK_MIRROR | TK_CLOCK_WAS_SET);
+ timekeeping_update(&tk_core, tk, TK_MIRROR | TK_CLOCK_WAS_SET);
write_seqcount_end(&tk_core.seq);
}
@@ -1893,7 +1894,7 @@ void timekeeping_inject_sleeptime64(const struct timespec64 *delta)
__timekeeping_inject_sleeptime(tk, delta);
- timekeeping_update(tk, TK_CLEAR_NTP | TK_MIRROR | TK_CLOCK_WAS_SET);
+ timekeeping_update(&tk_core, tk, TK_CLEAR_NTP | TK_MIRROR | TK_CLOCK_WAS_SET);
write_seqcount_end(&tk_core.seq);
raw_spin_unlock_irqrestore(&tk_core.lock, flags);
@@ -1956,7 +1957,7 @@ void timekeeping_resume(void)
tk->ntp_error = 0;
timekeeping_suspended = 0;
- timekeeping_update(tk, TK_MIRROR | TK_CLOCK_WAS_SET);
+ timekeeping_update(&tk_core, tk, TK_MIRROR | TK_CLOCK_WAS_SET);
write_seqcount_end(&tk_core.seq);
raw_spin_unlock_irqrestore(&tk_core.lock, flags);
@@ -2025,7 +2026,7 @@ int timekeeping_suspend(void)
}
}
- timekeeping_update(tk, TK_MIRROR);
+ timekeeping_update(&tk_core, tk, TK_MIRROR);
halt_fast_timekeeper(tk);
write_seqcount_end(&tk_core.seq);
raw_spin_unlock_irqrestore(&tk_core.lock, flags);
@@ -2343,7 +2344,7 @@ static bool timekeeping_advance(enum timekeeping_adv_mode mode)
* memcpy under the tk_core.seq against one before we start
* updating.
*/
- timekeeping_update(tk, clock_set);
+ timekeeping_update(&tk_core, tk, clock_set);
memcpy(real_tk, tk, sizeof(*tk));
/* The memcpy must come last. Do not put anything here! */
write_seqcount_end(&tk_core.seq);
@@ -2593,7 +2594,7 @@ int do_adjtimex(struct __kernel_timex *txc)
if (tai != orig_tai) {
__timekeeping_set_tai_offset(tk, tai);
- timekeeping_update(tk, TK_MIRROR | TK_CLOCK_WAS_SET);
+ timekeeping_update(&tk_core, tk, TK_MIRROR | TK_CLOCK_WAS_SET);
clock_set = true;
} else {
tk_update_leap_state(tk);
--
2.39.5
^ permalink raw reply related [flat|nested] 53+ messages in thread* Re: [PATCH v2 12/25] timekeeping: Add struct tk_data as argument to timekeeping_update()
2024-10-09 8:29 ` [PATCH v2 12/25] timekeeping: Add struct tk_data as argument to timekeeping_update() Anna-Maria Behnsen
@ 2024-10-24 21:29 ` John Stultz
0 siblings, 0 replies; 53+ messages in thread
From: John Stultz @ 2024-10-24 21:29 UTC (permalink / raw)
To: Anna-Maria Behnsen
Cc: Frederic Weisbecker, Thomas Gleixner, linux-kernel, netdev,
Miroslav Lichvar, Richard Cochran, Christopher S Hall
On Wed, Oct 9, 2024 at 1:29 AM Anna-Maria Behnsen
<anna-maria@linutronix.de> wrote:
>
> From: Anna-Maria Behnsen <anna-maria@linutronix.de>
>
> Updates of the timekeeper are done in two ways:
>
> 1. Updating timekeeper and afterwards memcpy()'ing the result into
> shadow_timekeeper using timekeeping_update(). Used everywhere for
> updates except in timekeeping_advance(); the sequence counter protected
> region starts before the first change to the timekeeper is done.
>
> 2. Updating shadow_timekeeper and then memcpy()'ing the result into
> timekeeper. Used only by in timekeeping_advance(); The seqence counter
> protected region is only around timekeeping_update() and the memcpy for
> copy from shadow to timekeeper.
>
> The second option is fast path optimized. The sequence counter protected
> region is as short as possible.
>
> As this behaviour is mainly documented by commit messages, but not in code,
> it makes the not easy timekeeping code more complicated to read.
>
> There is no reason why updates to the timekeeper can't use the optimized
> version everywhere. With this, the code will be cleaner, as code is reused
> instead of duplicated.
>
> To be able to access tk_data which contains all required information, add a
> pointer to tk_data as an argument to timekeeping_update(). With that
> convert the comment about holding the lock into a lockdep assert.
>
> No functional change.
>
> Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
Acked-by: John Stultz <jstultz@google.com>
^ permalink raw reply [flat|nested] 53+ messages in thread
* [PATCH v2 13/25] timekeeping: Split out timekeeper update of timekeeping_advanced()
2024-10-09 8:28 [PATCH v2 00/25] timekeeping: Rework to prepare support of indenpendent PTP clocks Anna-Maria Behnsen
` (11 preceding siblings ...)
2024-10-09 8:29 ` [PATCH v2 12/25] timekeeping: Add struct tk_data as argument to timekeeping_update() Anna-Maria Behnsen
@ 2024-10-09 8:29 ` Anna-Maria Behnsen
2024-10-24 21:43 ` John Stultz
2024-10-24 21:53 ` John Stultz
2024-10-09 8:29 ` [PATCH v2 14/25] timekeeping: Introduce combined timekeeping action flag Anna-Maria Behnsen
` (11 subsequent siblings)
24 siblings, 2 replies; 53+ messages in thread
From: Anna-Maria Behnsen @ 2024-10-09 8:29 UTC (permalink / raw)
To: John Stultz, Frederic Weisbecker, Thomas Gleixner
Cc: linux-kernel, netdev, Miroslav Lichvar, Richard Cochran,
Christopher S Hall, Anna-Maria Behnsen
From: Anna-Maria Behnsen <anna-maria@linutronix.de>
timekeeping_advance() is the only optimized function which uses
shadow_timekeeper for updating the real timekeeper to keep the sequence
counter protected region as small as possible.
To be able to transform timekeeper updates in other functions to use the
same logic, split out functionality into a separate function
timekeeper_update_staged().
While at it, document the reason why the sequence counter must be write
held over the call to timekeeping_update() and the copying to the real
timekeeper and why using a pointer based update is suboptimal.
No functional change.
Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
---
kernel/time/timekeeping.c | 43 +++++++++++++++++++++++++++----------------
1 file changed, 27 insertions(+), 16 deletions(-)
diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index 878f9606946d..fcb2b8b232d2 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -780,7 +780,32 @@ static void timekeeping_update(struct tk_data *tkd, struct timekeeper *tk, unsig
* timekeeper structure on the next update with stale data
*/
if (action & TK_MIRROR)
- memcpy(&tk_core.shadow_timekeeper, &tk_core.timekeeper, sizeof(tk_core.timekeeper));
+ memcpy(&tkd->shadow_timekeeper, tk, sizeof(*tk));
+}
+
+static void timekeeping_update_staged(struct tk_data *tkd, unsigned int action)
+{
+ /*
+ * Block out readers before invoking timekeeping_update() because
+ * that updates VDSO and other time related infrastructure. Not
+ * blocking the readers might let a reader see time going backwards
+ * when reading from the VDSO after the VDSO update and then
+ * reading in the kernel from the timekeeper before that got updated.
+ */
+ write_seqcount_begin(&tkd->seq);
+
+ timekeeping_update(tkd, &tkd->shadow_timekeeper, action);
+
+ /*
+ * Update the real timekeeper.
+ *
+ * We could avoid this memcpy() by switching pointers, but that has
+ * the downside that the reader side does not longer benefit from
+ * the cacheline optimized data layout of the timekeeper and requires
+ * another indirection.
+ */
+ memcpy(&tkd->timekeeper, &tkd->shadow_timekeeper, sizeof(tkd->shadow_timekeeper));
+ write_seqcount_end(&tkd->seq);
}
/**
@@ -2333,21 +2358,7 @@ static bool timekeeping_advance(enum timekeeping_adv_mode mode)
*/
clock_set |= accumulate_nsecs_to_secs(tk);
- write_seqcount_begin(&tk_core.seq);
- /*
- * Update the real timekeeper.
- *
- * We could avoid this memcpy by switching pointers, but that
- * requires changes to all other timekeeper usage sites as
- * well, i.e. move the timekeeper pointer getter into the
- * spinlocked/seqcount protected sections. And we trade this
- * memcpy under the tk_core.seq against one before we start
- * updating.
- */
- timekeeping_update(&tk_core, tk, clock_set);
- memcpy(real_tk, tk, sizeof(*tk));
- /* The memcpy must come last. Do not put anything here! */
- write_seqcount_end(&tk_core.seq);
+ timekeeping_update_staged(&tk_core, clock_set);
return !!clock_set;
}
--
2.39.5
^ permalink raw reply related [flat|nested] 53+ messages in thread* Re: [PATCH v2 13/25] timekeeping: Split out timekeeper update of timekeeping_advanced()
2024-10-09 8:29 ` [PATCH v2 13/25] timekeeping: Split out timekeeper update of timekeeping_advanced() Anna-Maria Behnsen
@ 2024-10-24 21:43 ` John Stultz
2024-10-24 21:53 ` John Stultz
1 sibling, 0 replies; 53+ messages in thread
From: John Stultz @ 2024-10-24 21:43 UTC (permalink / raw)
To: Anna-Maria Behnsen
Cc: Frederic Weisbecker, Thomas Gleixner, linux-kernel, netdev,
Miroslav Lichvar, Richard Cochran, Christopher S Hall
On Wed, Oct 9, 2024 at 1:29 AM Anna-Maria Behnsen
<anna-maria@linutronix.de> wrote:
>
> From: Anna-Maria Behnsen <anna-maria@linutronix.de>
>
> timekeeping_advance() is the only optimized function which uses
> shadow_timekeeper for updating the real timekeeper to keep the sequence
> counter protected region as small as possible.
>
> To be able to transform timekeeper updates in other functions to use the
> same logic, split out functionality into a separate function
> timekeeper_update_staged().
>
> While at it, document the reason why the sequence counter must be write
> held over the call to timekeeping_update() and the copying to the real
> timekeeper and why using a pointer based update is suboptimal.
>
> No functional change.
>
> Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
Acked-by: John Stultz <jstultz@google.com>
^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH v2 13/25] timekeeping: Split out timekeeper update of timekeeping_advanced()
2024-10-09 8:29 ` [PATCH v2 13/25] timekeeping: Split out timekeeper update of timekeeping_advanced() Anna-Maria Behnsen
2024-10-24 21:43 ` John Stultz
@ 2024-10-24 21:53 ` John Stultz
1 sibling, 0 replies; 53+ messages in thread
From: John Stultz @ 2024-10-24 21:53 UTC (permalink / raw)
To: Anna-Maria Behnsen
Cc: Frederic Weisbecker, Thomas Gleixner, linux-kernel, netdev,
Miroslav Lichvar, Richard Cochran, Christopher S Hall
On Wed, Oct 9, 2024 at 1:29 AM Anna-Maria Behnsen
<anna-maria@linutronix.de> wrote:
>
> From: Anna-Maria Behnsen <anna-maria@linutronix.de>
>
> timekeeping_advance() is the only optimized function which uses
> shadow_timekeeper for updating the real timekeeper to keep the sequence
> counter protected region as small as possible.
>
> To be able to transform timekeeper updates in other functions to use the
> same logic, split out functionality into a separate function
> timekeeper_update_staged().
>
> While at it, document the reason why the sequence counter must be write
> held over the call to timekeeping_update() and the copying to the real
> timekeeper and why using a pointer based update is suboptimal.
>
> No functional change.
>
> Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
> ---
> kernel/time/timekeeping.c | 43 +++++++++++++++++++++++++++----------------
> 1 file changed, 27 insertions(+), 16 deletions(-)
>
> diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
> index 878f9606946d..fcb2b8b232d2 100644
> --- a/kernel/time/timekeeping.c
> +++ b/kernel/time/timekeeping.c
> @@ -780,7 +780,32 @@ static void timekeeping_update(struct tk_data *tkd, struct timekeeper *tk, unsig
> * timekeeper structure on the next update with stale data
> */
> if (action & TK_MIRROR)
> - memcpy(&tk_core.shadow_timekeeper, &tk_core.timekeeper, sizeof(tk_core.timekeeper));
> + memcpy(&tkd->shadow_timekeeper, tk, sizeof(*tk));
> +}
> +
> +static void timekeeping_update_staged(struct tk_data *tkd, unsigned int action)
Minor nit I realized as I saw how this was used later on:
timekeeping_update_staged() isn't super clear right off. Maybe
timekeeping_update_from_shadow() might make it more clear?
thanks
-john
^ permalink raw reply [flat|nested] 53+ messages in thread
* [PATCH v2 14/25] timekeeping: Introduce combined timekeeping action flag
2024-10-09 8:28 [PATCH v2 00/25] timekeeping: Rework to prepare support of indenpendent PTP clocks Anna-Maria Behnsen
` (12 preceding siblings ...)
2024-10-09 8:29 ` [PATCH v2 13/25] timekeeping: Split out timekeeper update of timekeeping_advanced() Anna-Maria Behnsen
@ 2024-10-09 8:29 ` Anna-Maria Behnsen
2024-10-24 22:12 ` John Stultz
2024-10-09 8:29 ` [PATCH v2 15/25] timekeeping: Provide timekeeping_restore_shadow() Anna-Maria Behnsen
` (10 subsequent siblings)
24 siblings, 1 reply; 53+ messages in thread
From: Anna-Maria Behnsen @ 2024-10-09 8:29 UTC (permalink / raw)
To: John Stultz, Frederic Weisbecker, Thomas Gleixner
Cc: linux-kernel, netdev, Miroslav Lichvar, Richard Cochran,
Christopher S Hall, Anna-Maria Behnsen
Instead of explicitly listing all the separate timekeeping actions flags,
introduce a new one which covers all actions except TK_MIRROR action.
No functional change.
Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
---
kernel/time/timekeeping.c | 10 ++++++----
1 file changed, 6 insertions(+), 4 deletions(-)
diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index fcb2b8b232d2..5a747afe64b4 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -33,6 +33,8 @@
#define TK_MIRROR (1 << 1)
#define TK_CLOCK_WAS_SET (1 << 2)
+#define TK_UPDATE_ALL (TK_CLEAR_NTP | TK_CLOCK_WAS_SET)
+
enum timekeeping_adv_mode {
/* Update timekeeper when a tick has passed */
TK_ADV_TICK,
@@ -1493,7 +1495,7 @@ int do_settimeofday64(const struct timespec64 *ts)
tk_set_xtime(tk, ts);
out:
- timekeeping_update(&tk_core, tk, TK_CLEAR_NTP | TK_MIRROR | TK_CLOCK_WAS_SET);
+ timekeeping_update(&tk_core, tk, TK_UPDATE_ALL | TK_MIRROR);
write_seqcount_end(&tk_core.seq);
raw_spin_unlock_irqrestore(&tk_core.lock, flags);
@@ -1543,7 +1545,7 @@ static int timekeeping_inject_offset(const struct timespec64 *ts)
tk_set_wall_to_mono(tk, timespec64_sub(tk->wall_to_monotonic, *ts));
error: /* even if we error out, we forwarded the time, so call update */
- timekeeping_update(&tk_core, tk, TK_CLEAR_NTP | TK_MIRROR | TK_CLOCK_WAS_SET);
+ timekeeping_update(&tk_core, tk, TK_UPDATE_ALL | TK_MIRROR);
write_seqcount_end(&tk_core.seq);
raw_spin_unlock_irqrestore(&tk_core.lock, flags);
@@ -1628,7 +1630,7 @@ static int change_clocksource(void *data)
timekeeping_forward_now(tk);
old = tk->tkr_mono.clock;
tk_setup_internals(tk, new);
- timekeeping_update(&tk_core, tk, TK_CLEAR_NTP | TK_MIRROR | TK_CLOCK_WAS_SET);
+ timekeeping_update(&tk_core, tk, TK_UPDATE_ALL | TK_MIRROR);
write_seqcount_end(&tk_core.seq);
raw_spin_unlock_irqrestore(&tk_core.lock, flags);
@@ -1919,7 +1921,7 @@ void timekeeping_inject_sleeptime64(const struct timespec64 *delta)
__timekeeping_inject_sleeptime(tk, delta);
- timekeeping_update(&tk_core, tk, TK_CLEAR_NTP | TK_MIRROR | TK_CLOCK_WAS_SET);
+ timekeeping_update(&tk_core, tk, TK_UPDATE_ALL | TK_MIRROR);
write_seqcount_end(&tk_core.seq);
raw_spin_unlock_irqrestore(&tk_core.lock, flags);
--
2.39.5
^ permalink raw reply related [flat|nested] 53+ messages in thread* Re: [PATCH v2 14/25] timekeeping: Introduce combined timekeeping action flag
2024-10-09 8:29 ` [PATCH v2 14/25] timekeeping: Introduce combined timekeeping action flag Anna-Maria Behnsen
@ 2024-10-24 22:12 ` John Stultz
0 siblings, 0 replies; 53+ messages in thread
From: John Stultz @ 2024-10-24 22:12 UTC (permalink / raw)
To: Anna-Maria Behnsen
Cc: Frederic Weisbecker, Thomas Gleixner, linux-kernel, netdev,
Miroslav Lichvar, Richard Cochran, Christopher S Hall
On Wed, Oct 9, 2024 at 1:29 AM Anna-Maria Behnsen
<anna-maria@linutronix.de> wrote:
>
> Instead of explicitly listing all the separate timekeeping actions flags,
> introduce a new one which covers all actions except TK_MIRROR action.
>
> No functional change.
>
> Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
> ---
> kernel/time/timekeeping.c | 10 ++++++----
> 1 file changed, 6 insertions(+), 4 deletions(-)
>
> diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
> index fcb2b8b232d2..5a747afe64b4 100644
> --- a/kernel/time/timekeeping.c
> +++ b/kernel/time/timekeeping.c
> @@ -33,6 +33,8 @@
> #define TK_MIRROR (1 << 1)
> #define TK_CLOCK_WAS_SET (1 << 2)
>
> +#define TK_UPDATE_ALL (TK_CLEAR_NTP | TK_CLOCK_WAS_SET)
> +
Hrm. I feel a little wary around having a flag mask called _ALL when
it doesn't actually include all the other flags.
I also recognize the "TK_CLEAR_NTP | TK_CLOCK_WAS_SET" arguments can
feel repetitive, but I find having them explicitly listed makes the
code more readable to me.
Combining these common ones together just means there is a 4th option
one has to keep in their head to translate.
Further, as I look through the logic TK_MIRROR could probably be
improved by adding a direction (it's easy to mix up what is being
mirrored to what). Maybe TK_MIRROR_TO_SHADOW?
But these are mostly just strategies to help my scatterbrained state,
so this isn't a hard objection if you disagree.
thanks
-john
^ permalink raw reply [flat|nested] 53+ messages in thread
* [PATCH v2 15/25] timekeeping: Provide timekeeping_restore_shadow()
2024-10-09 8:28 [PATCH v2 00/25] timekeeping: Rework to prepare support of indenpendent PTP clocks Anna-Maria Behnsen
` (13 preceding siblings ...)
2024-10-09 8:29 ` [PATCH v2 14/25] timekeeping: Introduce combined timekeeping action flag Anna-Maria Behnsen
@ 2024-10-09 8:29 ` Anna-Maria Behnsen
2024-10-24 21:45 ` John Stultz
2024-10-09 8:29 ` [PATCH v2 16/25] timekeeping: Rework do_settimeofday64() to use shadow_timekeeper Anna-Maria Behnsen
` (9 subsequent siblings)
24 siblings, 1 reply; 53+ messages in thread
From: Anna-Maria Behnsen @ 2024-10-09 8:29 UTC (permalink / raw)
To: John Stultz, Frederic Weisbecker, Thomas Gleixner
Cc: linux-kernel, netdev, Miroslav Lichvar, Richard Cochran,
Christopher S Hall, Anna-Maria Behnsen
From: Thomas Gleixner <tglx@linutronix.de>
From: Thomas Gleixner <tglx@linutronix.de>
Functions which operate on the real timekeeper, e.g. do_settimeofday(),
have error conditions. If they are hit a full timekeeping update is still
required because the already committed operations modified the timekeeper.
When switching these functions to operate on the shadow timekeeper then the
full update can be avoided in the error case, but the modified shadow
timekeeper has to be restored.
Provide a helper function for that.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
---
kernel/time/timekeeping.c | 11 ++++++++++-
1 file changed, 10 insertions(+), 1 deletion(-)
diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index 5a747afe64b4..41d88f645868 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -755,6 +755,15 @@ static inline void tk_update_ktime_data(struct timekeeper *tk)
tk->tkr_raw.base = ns_to_ktime(tk->raw_sec * NSEC_PER_SEC);
}
+/*
+ * Restore the shadow timekeeper from the real timekeeper.
+ */
+static void timekeeping_restore_shadow(struct tk_data *tkd)
+{
+ lockdep_assert_held(&tkd->lock);
+ memcpy(&tkd->shadow_timekeeper, &tkd->timekeeper, sizeof(tkd->timekeeper));
+}
+
static void timekeeping_update(struct tk_data *tkd, struct timekeeper *tk, unsigned int action)
{
lockdep_assert_held(&tkd->lock);
@@ -782,7 +791,7 @@ static void timekeeping_update(struct tk_data *tkd, struct timekeeper *tk, unsig
* timekeeper structure on the next update with stale data
*/
if (action & TK_MIRROR)
- memcpy(&tkd->shadow_timekeeper, tk, sizeof(*tk));
+ timekeeping_restore_shadow(tkd);
}
static void timekeeping_update_staged(struct tk_data *tkd, unsigned int action)
--
2.39.5
^ permalink raw reply related [flat|nested] 53+ messages in thread* Re: [PATCH v2 15/25] timekeeping: Provide timekeeping_restore_shadow()
2024-10-09 8:29 ` [PATCH v2 15/25] timekeeping: Provide timekeeping_restore_shadow() Anna-Maria Behnsen
@ 2024-10-24 21:45 ` John Stultz
0 siblings, 0 replies; 53+ messages in thread
From: John Stultz @ 2024-10-24 21:45 UTC (permalink / raw)
To: Anna-Maria Behnsen
Cc: Frederic Weisbecker, Thomas Gleixner, linux-kernel, netdev,
Miroslav Lichvar, Richard Cochran, Christopher S Hall
On Wed, Oct 9, 2024 at 1:29 AM Anna-Maria Behnsen
<anna-maria@linutronix.de> wrote:
>
> From: Thomas Gleixner <tglx@linutronix.de>
>
> From: Thomas Gleixner <tglx@linutronix.de>
>
> Functions which operate on the real timekeeper, e.g. do_settimeofday(),
> have error conditions. If they are hit a full timekeeping update is still
> required because the already committed operations modified the timekeeper.
>
> When switching these functions to operate on the shadow timekeeper then the
> full update can be avoided in the error case, but the modified shadow
> timekeeper has to be restored.
>
> Provide a helper function for that.
>
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
Acked-by: John Stultz <jstultz@google.com>
^ permalink raw reply [flat|nested] 53+ messages in thread
* [PATCH v2 16/25] timekeeping: Rework do_settimeofday64() to use shadow_timekeeper
2024-10-09 8:28 [PATCH v2 00/25] timekeeping: Rework to prepare support of indenpendent PTP clocks Anna-Maria Behnsen
` (14 preceding siblings ...)
2024-10-09 8:29 ` [PATCH v2 15/25] timekeeping: Provide timekeeping_restore_shadow() Anna-Maria Behnsen
@ 2024-10-09 8:29 ` Anna-Maria Behnsen
2024-10-24 21:54 ` John Stultz
2024-10-09 8:29 ` [PATCH v2 17/25] timekeeping: Rework timekeeping_inject_offset() " Anna-Maria Behnsen
` (8 subsequent siblings)
24 siblings, 1 reply; 53+ messages in thread
From: Anna-Maria Behnsen @ 2024-10-09 8:29 UTC (permalink / raw)
To: John Stultz, Frederic Weisbecker, Thomas Gleixner
Cc: linux-kernel, netdev, Miroslav Lichvar, Richard Cochran,
Christopher S Hall, Anna-Maria Behnsen
From: Anna-Maria Behnsen <anna-maria@linutronix.de>
Updates of the timekeeper can be done by operating on the shadow timekeeper
and afterwards copying the result into the real timekeeper. This has the
advantage, that the sequence count write protected region is kept as small
as possible.
Convert do_settimeofday64() to use this scheme.
That allows to use a scoped_guard() for locking the timekeeper lock as the
usage of the shadow timekeeper allows a rollback in the error case instead
of the full timekeeper update of the original code.
Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
---
kernel/time/timekeeping.c | 42 ++++++++++++++++--------------------------
1 file changed, 16 insertions(+), 26 deletions(-)
diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index 41d88f645868..cc01ad53d96d 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -1479,45 +1479,35 @@ EXPORT_SYMBOL_GPL(timekeeping_clocksource_has_base);
*/
int do_settimeofday64(const struct timespec64 *ts)
{
- struct timekeeper *tk = &tk_core.timekeeper;
struct timespec64 ts_delta, xt;
- unsigned long flags;
- int ret = 0;
if (!timespec64_valid_settod(ts))
return -EINVAL;
- raw_spin_lock_irqsave(&tk_core.lock, flags);
- write_seqcount_begin(&tk_core.seq);
-
- timekeeping_forward_now(tk);
-
- xt = tk_xtime(tk);
- ts_delta = timespec64_sub(*ts, xt);
+ scoped_guard (raw_spinlock_irqsave, &tk_core.lock) {
+ struct timekeeper *tk = &tk_core.shadow_timekeeper;
- if (timespec64_compare(&tk->wall_to_monotonic, &ts_delta) > 0) {
- ret = -EINVAL;
- goto out;
- }
+ timekeeping_forward_now(tk);
- tk_set_wall_to_mono(tk, timespec64_sub(tk->wall_to_monotonic, ts_delta));
+ xt = tk_xtime(tk);
+ ts_delta = timespec64_sub(*ts, xt);
- tk_set_xtime(tk, ts);
-out:
- timekeeping_update(&tk_core, tk, TK_UPDATE_ALL | TK_MIRROR);
+ if (timespec64_compare(&tk->wall_to_monotonic, &ts_delta) > 0) {
+ timekeeping_restore_shadow(&tk_core);
+ return -EINVAL;
+ }
- write_seqcount_end(&tk_core.seq);
- raw_spin_unlock_irqrestore(&tk_core.lock, flags);
+ tk_set_wall_to_mono(tk, timespec64_sub(tk->wall_to_monotonic, ts_delta));
+ tk_set_xtime(tk, ts);
+ timekeeping_update_staged(&tk_core, TK_UPDATE_ALL);
+ }
/* Signal hrtimers about time change */
clock_was_set(CLOCK_SET_WALL);
- if (!ret) {
- audit_tk_injoffset(ts_delta);
- add_device_randomness(ts, sizeof(*ts));
- }
-
- return ret;
+ audit_tk_injoffset(ts_delta);
+ add_device_randomness(ts, sizeof(*ts));
+ return 0;
}
EXPORT_SYMBOL(do_settimeofday64);
--
2.39.5
^ permalink raw reply related [flat|nested] 53+ messages in thread* Re: [PATCH v2 16/25] timekeeping: Rework do_settimeofday64() to use shadow_timekeeper
2024-10-09 8:29 ` [PATCH v2 16/25] timekeeping: Rework do_settimeofday64() to use shadow_timekeeper Anna-Maria Behnsen
@ 2024-10-24 21:54 ` John Stultz
0 siblings, 0 replies; 53+ messages in thread
From: John Stultz @ 2024-10-24 21:54 UTC (permalink / raw)
To: Anna-Maria Behnsen
Cc: Frederic Weisbecker, Thomas Gleixner, linux-kernel, netdev,
Miroslav Lichvar, Richard Cochran, Christopher S Hall
On Wed, Oct 9, 2024 at 1:29 AM Anna-Maria Behnsen
<anna-maria@linutronix.de> wrote:
>
> From: Anna-Maria Behnsen <anna-maria@linutronix.de>
>
> Updates of the timekeeper can be done by operating on the shadow timekeeper
> and afterwards copying the result into the real timekeeper. This has the
> advantage, that the sequence count write protected region is kept as small
> as possible.
>
> Convert do_settimeofday64() to use this scheme.
>
> That allows to use a scoped_guard() for locking the timekeeper lock as the
> usage of the shadow timekeeper allows a rollback in the error case instead
> of the full timekeeper update of the original code.
>
> Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
> ---
> kernel/time/timekeeping.c | 42 ++++++++++++++++--------------------------
> 1 file changed, 16 insertions(+), 26 deletions(-)
>
> diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
> index 41d88f645868..cc01ad53d96d 100644
> --- a/kernel/time/timekeeping.c
> +++ b/kernel/time/timekeeping.c
> @@ -1479,45 +1479,35 @@ EXPORT_SYMBOL_GPL(timekeeping_clocksource_has_base);
> */
> int do_settimeofday64(const struct timespec64 *ts)
> {
> - struct timekeeper *tk = &tk_core.timekeeper;
> struct timespec64 ts_delta, xt;
> - unsigned long flags;
> - int ret = 0;
>
> if (!timespec64_valid_settod(ts))
> return -EINVAL;
>
> - raw_spin_lock_irqsave(&tk_core.lock, flags);
> - write_seqcount_begin(&tk_core.seq);
> -
> - timekeeping_forward_now(tk);
> -
> - xt = tk_xtime(tk);
> - ts_delta = timespec64_sub(*ts, xt);
> + scoped_guard (raw_spinlock_irqsave, &tk_core.lock) {
> + struct timekeeper *tk = &tk_core.shadow_timekeeper;
nit: maybe shadow_tk instead, so it is additionally clear in the
following logic which is being modified.
Otherwise,
Acked-by: John Stultz <jstultz@google.com>
^ permalink raw reply [flat|nested] 53+ messages in thread
* [PATCH v2 17/25] timekeeping: Rework timekeeping_inject_offset() to use shadow_timekeeper
2024-10-09 8:28 [PATCH v2 00/25] timekeeping: Rework to prepare support of indenpendent PTP clocks Anna-Maria Behnsen
` (15 preceding siblings ...)
2024-10-09 8:29 ` [PATCH v2 16/25] timekeeping: Rework do_settimeofday64() to use shadow_timekeeper Anna-Maria Behnsen
@ 2024-10-09 8:29 ` Anna-Maria Behnsen
2024-10-24 21:57 ` John Stultz
2024-10-09 8:29 ` [PATCH v2 18/25] timekeeping: Rework change_clocksource() " Anna-Maria Behnsen
` (7 subsequent siblings)
24 siblings, 1 reply; 53+ messages in thread
From: Anna-Maria Behnsen @ 2024-10-09 8:29 UTC (permalink / raw)
To: John Stultz, Frederic Weisbecker, Thomas Gleixner
Cc: linux-kernel, netdev, Miroslav Lichvar, Richard Cochran,
Christopher S Hall, Anna-Maria Behnsen
From: Anna-Maria Behnsen <anna-maria@linutronix.de>
Updates of the timekeeper can be done by operating on the shadow timekeeper
and afterwards copying the result into the real timekeeper. This has the
advantage, that the sequence count write protected region is kept as small
as possible.
Convert timekeeping_inject_offset() to use this scheme.
That allows to use a scoped_guard() for locking the timekeeper lock as the
usage of the shadow timekeeper allows a rollback in the error case instead
of the full timekeeper update of the original code.
Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
---
kernel/time/timekeeping.c | 41 ++++++++++++++++-------------------------
1 file changed, 16 insertions(+), 25 deletions(-)
diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index cc01ad53d96d..051041e92e54 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -1519,40 +1519,31 @@ EXPORT_SYMBOL(do_settimeofday64);
*/
static int timekeeping_inject_offset(const struct timespec64 *ts)
{
- struct timekeeper *tk = &tk_core.timekeeper;
- unsigned long flags;
- struct timespec64 tmp;
- int ret = 0;
-
if (ts->tv_nsec < 0 || ts->tv_nsec >= NSEC_PER_SEC)
return -EINVAL;
- raw_spin_lock_irqsave(&tk_core.lock, flags);
- write_seqcount_begin(&tk_core.seq);
-
- timekeeping_forward_now(tk);
-
- /* Make sure the proposed value is valid */
- tmp = timespec64_add(tk_xtime(tk), *ts);
- if (timespec64_compare(&tk->wall_to_monotonic, ts) > 0 ||
- !timespec64_valid_settod(&tmp)) {
- ret = -EINVAL;
- goto error;
- }
+ scoped_guard (raw_spinlock_irqsave, &tk_core.lock) {
+ struct timekeeper *tk = &tk_core.shadow_timekeeper;
+ struct timespec64 tmp;
- tk_xtime_add(tk, ts);
- tk_set_wall_to_mono(tk, timespec64_sub(tk->wall_to_monotonic, *ts));
+ timekeeping_forward_now(tk);
-error: /* even if we error out, we forwarded the time, so call update */
- timekeeping_update(&tk_core, tk, TK_UPDATE_ALL | TK_MIRROR);
+ /* Make sure the proposed value is valid */
+ tmp = timespec64_add(tk_xtime(tk), *ts);
+ if (timespec64_compare(&tk->wall_to_monotonic, ts) > 0 ||
+ !timespec64_valid_settod(&tmp)) {
+ timekeeping_restore_shadow(&tk_core);
+ return -EINVAL;
+ }
- write_seqcount_end(&tk_core.seq);
- raw_spin_unlock_irqrestore(&tk_core.lock, flags);
+ tk_xtime_add(tk, ts);
+ tk_set_wall_to_mono(tk, timespec64_sub(tk->wall_to_monotonic, *ts));
+ timekeeping_update_staged(&tk_core, TK_UPDATE_ALL);
+ }
/* Signal hrtimers about time change */
clock_was_set(CLOCK_SET_WALL);
-
- return ret;
+ return 0;
}
/*
--
2.39.5
^ permalink raw reply related [flat|nested] 53+ messages in thread* Re: [PATCH v2 17/25] timekeeping: Rework timekeeping_inject_offset() to use shadow_timekeeper
2024-10-09 8:29 ` [PATCH v2 17/25] timekeeping: Rework timekeeping_inject_offset() " Anna-Maria Behnsen
@ 2024-10-24 21:57 ` John Stultz
0 siblings, 0 replies; 53+ messages in thread
From: John Stultz @ 2024-10-24 21:57 UTC (permalink / raw)
To: Anna-Maria Behnsen
Cc: Frederic Weisbecker, Thomas Gleixner, linux-kernel, netdev,
Miroslav Lichvar, Richard Cochran, Christopher S Hall
On Wed, Oct 9, 2024 at 1:29 AM Anna-Maria Behnsen
<anna-maria@linutronix.de> wrote:
>
> From: Anna-Maria Behnsen <anna-maria@linutronix.de>
>
> Updates of the timekeeper can be done by operating on the shadow timekeeper
> and afterwards copying the result into the real timekeeper. This has the
> advantage, that the sequence count write protected region is kept as small
> as possible.
>
> Convert timekeeping_inject_offset() to use this scheme.
>
> That allows to use a scoped_guard() for locking the timekeeper lock as the
> usage of the shadow timekeeper allows a rollback in the error case instead
> of the full timekeeper update of the original code.
>
> Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
> ---
> kernel/time/timekeeping.c | 41 ++++++++++++++++-------------------------
> 1 file changed, 16 insertions(+), 25 deletions(-)
>
> diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
> index cc01ad53d96d..051041e92e54 100644
> --- a/kernel/time/timekeeping.c
> +++ b/kernel/time/timekeeping.c
> @@ -1519,40 +1519,31 @@ EXPORT_SYMBOL(do_settimeofday64);
> */
> static int timekeeping_inject_offset(const struct timespec64 *ts)
> {
> - struct timekeeper *tk = &tk_core.timekeeper;
> - unsigned long flags;
> - struct timespec64 tmp;
> - int ret = 0;
> -
> if (ts->tv_nsec < 0 || ts->tv_nsec >= NSEC_PER_SEC)
> return -EINVAL;
>
> - raw_spin_lock_irqsave(&tk_core.lock, flags);
> - write_seqcount_begin(&tk_core.seq);
> -
> - timekeeping_forward_now(tk);
> -
> - /* Make sure the proposed value is valid */
> - tmp = timespec64_add(tk_xtime(tk), *ts);
> - if (timespec64_compare(&tk->wall_to_monotonic, ts) > 0 ||
> - !timespec64_valid_settod(&tmp)) {
> - ret = -EINVAL;
> - goto error;
> - }
> + scoped_guard (raw_spinlock_irqsave, &tk_core.lock) {
> + struct timekeeper *tk = &tk_core.shadow_timekeeper;
Nit: Same suggestion of naming this tk_shadow for added clarity.
Otherwise,
Acked-by: John Stultz <jstultz@google.com>
^ permalink raw reply [flat|nested] 53+ messages in thread
* [PATCH v2 18/25] timekeeping: Rework change_clocksource() to use shadow_timekeeper
2024-10-09 8:28 [PATCH v2 00/25] timekeeping: Rework to prepare support of indenpendent PTP clocks Anna-Maria Behnsen
` (16 preceding siblings ...)
2024-10-09 8:29 ` [PATCH v2 17/25] timekeeping: Rework timekeeping_inject_offset() " Anna-Maria Behnsen
@ 2024-10-09 8:29 ` Anna-Maria Behnsen
2024-10-24 21:58 ` John Stultz
2024-10-09 8:29 ` [PATCH v2 19/25] timekeeping: Rework timekeeping_init() " Anna-Maria Behnsen
` (6 subsequent siblings)
24 siblings, 1 reply; 53+ messages in thread
From: Anna-Maria Behnsen @ 2024-10-09 8:29 UTC (permalink / raw)
To: John Stultz, Frederic Weisbecker, Thomas Gleixner
Cc: linux-kernel, netdev, Miroslav Lichvar, Richard Cochran,
Christopher S Hall, Anna-Maria Behnsen
From: Anna-Maria Behnsen <anna-maria@linutronix.de>
Updates of the timekeeper can be done by operating on the shadow timekeeper
and afterwards copying the result into the real timekeeper. This has the
advantage, that the sequence count write protected region is kept as small
as possible.
Convert change_clocksource() to use this scheme.
Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
---
kernel/time/timekeeping.c | 18 +++++++-----------
1 file changed, 7 insertions(+), 11 deletions(-)
diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index 051041e92e54..6d21e9bf5c35 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -1596,9 +1596,7 @@ static void __timekeeping_set_tai_offset(struct timekeeper *tk, s32 tai_offset)
*/
static int change_clocksource(void *data)
{
- struct timekeeper *tk = &tk_core.timekeeper;
struct clocksource *new = data, *old = NULL;
- unsigned long flags;
/*
* If the clocksource is in a module, get a module reference.
@@ -1614,16 +1612,14 @@ static int change_clocksource(void *data)
return 0;
}
- raw_spin_lock_irqsave(&tk_core.lock, flags);
- write_seqcount_begin(&tk_core.seq);
-
- timekeeping_forward_now(tk);
- old = tk->tkr_mono.clock;
- tk_setup_internals(tk, new);
- timekeeping_update(&tk_core, tk, TK_UPDATE_ALL | TK_MIRROR);
+ scoped_guard (raw_spinlock_irqsave, &tk_core.lock) {
+ struct timekeeper *tk = &tk_core.shadow_timekeeper;
- write_seqcount_end(&tk_core.seq);
- raw_spin_unlock_irqrestore(&tk_core.lock, flags);
+ timekeeping_forward_now(tk);
+ old = tk->tkr_mono.clock;
+ tk_setup_internals(tk, new);
+ timekeeping_update_staged(&tk_core, TK_UPDATE_ALL);
+ }
if (old) {
if (old->disable)
--
2.39.5
^ permalink raw reply related [flat|nested] 53+ messages in thread* Re: [PATCH v2 18/25] timekeeping: Rework change_clocksource() to use shadow_timekeeper
2024-10-09 8:29 ` [PATCH v2 18/25] timekeeping: Rework change_clocksource() " Anna-Maria Behnsen
@ 2024-10-24 21:58 ` John Stultz
0 siblings, 0 replies; 53+ messages in thread
From: John Stultz @ 2024-10-24 21:58 UTC (permalink / raw)
To: Anna-Maria Behnsen
Cc: Frederic Weisbecker, Thomas Gleixner, linux-kernel, netdev,
Miroslav Lichvar, Richard Cochran, Christopher S Hall
On Wed, Oct 9, 2024 at 1:29 AM Anna-Maria Behnsen
<anna-maria@linutronix.de> wrote:
>
> From: Anna-Maria Behnsen <anna-maria@linutronix.de>
>
> Updates of the timekeeper can be done by operating on the shadow timekeeper
> and afterwards copying the result into the real timekeeper. This has the
> advantage, that the sequence count write protected region is kept as small
> as possible.
>
> Convert change_clocksource() to use this scheme.
>
> Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
> ---
> kernel/time/timekeeping.c | 18 +++++++-----------
> 1 file changed, 7 insertions(+), 11 deletions(-)
>
> diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
> index 051041e92e54..6d21e9bf5c35 100644
> --- a/kernel/time/timekeeping.c
> +++ b/kernel/time/timekeeping.c
> @@ -1596,9 +1596,7 @@ static void __timekeeping_set_tai_offset(struct timekeeper *tk, s32 tai_offset)
> */
> static int change_clocksource(void *data)
> {
> - struct timekeeper *tk = &tk_core.timekeeper;
> struct clocksource *new = data, *old = NULL;
> - unsigned long flags;
>
> /*
> * If the clocksource is in a module, get a module reference.
> @@ -1614,16 +1612,14 @@ static int change_clocksource(void *data)
> return 0;
> }
>
> - raw_spin_lock_irqsave(&tk_core.lock, flags);
> - write_seqcount_begin(&tk_core.seq);
> -
> - timekeeping_forward_now(tk);
> - old = tk->tkr_mono.clock;
> - tk_setup_internals(tk, new);
> - timekeeping_update(&tk_core, tk, TK_UPDATE_ALL | TK_MIRROR);
> + scoped_guard (raw_spinlock_irqsave, &tk_core.lock) {
> + struct timekeeper *tk = &tk_core.shadow_timekeeper;
Same nit: tk_shadow maybe?
Acked-by: John Stultz <jstultz@google.com>
^ permalink raw reply [flat|nested] 53+ messages in thread
* [PATCH v2 19/25] timekeeping: Rework timekeeping_init() to use shadow_timekeeper
2024-10-09 8:28 [PATCH v2 00/25] timekeeping: Rework to prepare support of indenpendent PTP clocks Anna-Maria Behnsen
` (17 preceding siblings ...)
2024-10-09 8:29 ` [PATCH v2 18/25] timekeeping: Rework change_clocksource() " Anna-Maria Behnsen
@ 2024-10-09 8:29 ` Anna-Maria Behnsen
2024-10-24 22:16 ` John Stultz
2024-10-09 8:29 ` [PATCH v2 20/25] timekeeping: Rework timekeeping_inject_sleeptime64() " Anna-Maria Behnsen
` (5 subsequent siblings)
24 siblings, 1 reply; 53+ messages in thread
From: Anna-Maria Behnsen @ 2024-10-09 8:29 UTC (permalink / raw)
To: John Stultz, Frederic Weisbecker, Thomas Gleixner
Cc: linux-kernel, netdev, Miroslav Lichvar, Richard Cochran,
Christopher S Hall, Anna-Maria Behnsen
From: Anna-Maria Behnsen <anna-maria@linutronix.de>
For timekeeping_init() the sequence count write held time is not relevant
and it could keep working on the real timekeeper, but there is no reason to
make it different from other timekeeper updates.
Convert it to operate on the shadow timekeeper.
Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
---
kernel/time/timekeeping.c | 8 +++-----
1 file changed, 3 insertions(+), 5 deletions(-)
diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index 6d21e9bf5c35..bb225534fee1 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -1778,7 +1778,7 @@ static bool persistent_clock_exists;
void __init timekeeping_init(void)
{
struct timespec64 wall_time, boot_offset, wall_to_mono;
- struct timekeeper *tk = &tk_core.timekeeper;
+ struct timekeeper *tk = &tk_core.shadow_timekeeper;
struct clocksource *clock;
tkd_basic_setup(&tk_core);
@@ -1802,7 +1802,7 @@ void __init timekeeping_init(void)
wall_to_mono = timespec64_sub(boot_offset, wall_time);
guard(raw_spinlock_irqsave)(&tk_core.lock);
- write_seqcount_begin(&tk_core.seq);
+
ntp_init();
clock = clocksource_default_clock();
@@ -1815,9 +1815,7 @@ void __init timekeeping_init(void)
tk_set_wall_to_mono(tk, wall_to_mono);
- timekeeping_update(&tk_core, tk, TK_MIRROR | TK_CLOCK_WAS_SET);
-
- write_seqcount_end(&tk_core.seq);
+ timekeeping_update_staged(&tk_core, TK_CLOCK_WAS_SET);
}
/* time in seconds when suspend began for persistent clock */
--
2.39.5
^ permalink raw reply related [flat|nested] 53+ messages in thread* Re: [PATCH v2 19/25] timekeeping: Rework timekeeping_init() to use shadow_timekeeper
2024-10-09 8:29 ` [PATCH v2 19/25] timekeeping: Rework timekeeping_init() " Anna-Maria Behnsen
@ 2024-10-24 22:16 ` John Stultz
0 siblings, 0 replies; 53+ messages in thread
From: John Stultz @ 2024-10-24 22:16 UTC (permalink / raw)
To: Anna-Maria Behnsen
Cc: Frederic Weisbecker, Thomas Gleixner, linux-kernel, netdev,
Miroslav Lichvar, Richard Cochran, Christopher S Hall
On Wed, Oct 9, 2024 at 1:29 AM Anna-Maria Behnsen
<anna-maria@linutronix.de> wrote:
>
> From: Anna-Maria Behnsen <anna-maria@linutronix.de>
>
> For timekeeping_init() the sequence count write held time is not relevant
> and it could keep working on the real timekeeper, but there is no reason to
> make it different from other timekeeper updates.
>
> Convert it to operate on the shadow timekeeper.
>
> Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
> ---
> kernel/time/timekeeping.c | 8 +++-----
> 1 file changed, 3 insertions(+), 5 deletions(-)
>
> diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
> index 6d21e9bf5c35..bb225534fee1 100644
> --- a/kernel/time/timekeeping.c
> +++ b/kernel/time/timekeeping.c
> @@ -1778,7 +1778,7 @@ static bool persistent_clock_exists;
> void __init timekeeping_init(void)
> {
> struct timespec64 wall_time, boot_offset, wall_to_mono;
> - struct timekeeper *tk = &tk_core.timekeeper;
> + struct timekeeper *tk = &tk_core.shadow_timekeeper;
Same nit as earlier, otherwise
Acked-by: John Stultz <jstultz@google.com>
^ permalink raw reply [flat|nested] 53+ messages in thread
* [PATCH v2 20/25] timekeeping: Rework timekeeping_inject_sleeptime64() to use shadow_timekeeper
2024-10-09 8:28 [PATCH v2 00/25] timekeeping: Rework to prepare support of indenpendent PTP clocks Anna-Maria Behnsen
` (18 preceding siblings ...)
2024-10-09 8:29 ` [PATCH v2 19/25] timekeeping: Rework timekeeping_init() " Anna-Maria Behnsen
@ 2024-10-09 8:29 ` Anna-Maria Behnsen
2024-10-24 22:16 ` John Stultz
2024-10-09 8:29 ` [PATCH v2 21/25] timekeeping: Rework timekeeping_resume() " Anna-Maria Behnsen
` (4 subsequent siblings)
24 siblings, 1 reply; 53+ messages in thread
From: Anna-Maria Behnsen @ 2024-10-09 8:29 UTC (permalink / raw)
To: John Stultz, Frederic Weisbecker, Thomas Gleixner
Cc: linux-kernel, netdev, Miroslav Lichvar, Richard Cochran,
Christopher S Hall, Anna-Maria Behnsen
From: Anna-Maria Behnsen <anna-maria@linutronix.de>
Updates of the timekeeper can be done by operating on the shadow timekeeper
and afterwards copying the result into the real timekeeper. This has the
advantage, that the sequence count write protected region is kept as small
as possible.
Convert timekeeping_inject_sleeptime64() to use this scheme.
Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
---
kernel/time/timekeeping.c | 22 +++++++---------------
1 file changed, 7 insertions(+), 15 deletions(-)
diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index bb225534fee1..c1a2726a0d41 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -1893,22 +1893,14 @@ bool timekeeping_rtc_skipsuspend(void)
*/
void timekeeping_inject_sleeptime64(const struct timespec64 *delta)
{
- struct timekeeper *tk = &tk_core.timekeeper;
- unsigned long flags;
-
- raw_spin_lock_irqsave(&tk_core.lock, flags);
- write_seqcount_begin(&tk_core.seq);
-
- suspend_timing_needed = false;
-
- timekeeping_forward_now(tk);
-
- __timekeeping_inject_sleeptime(tk, delta);
-
- timekeeping_update(&tk_core, tk, TK_UPDATE_ALL | TK_MIRROR);
+ scoped_guard(raw_spinlock_irqsave, &tk_core.lock) {
+ struct timekeeper *tk = &tk_core.shadow_timekeeper;
- write_seqcount_end(&tk_core.seq);
- raw_spin_unlock_irqrestore(&tk_core.lock, flags);
+ suspend_timing_needed = false;
+ timekeeping_forward_now(tk);
+ __timekeeping_inject_sleeptime(tk, delta);
+ timekeeping_update_staged(&tk_core, TK_UPDATE_ALL);
+ }
/* Signal hrtimers about time change */
clock_was_set(CLOCK_SET_WALL | CLOCK_SET_BOOT);
--
2.39.5
^ permalink raw reply related [flat|nested] 53+ messages in thread* Re: [PATCH v2 20/25] timekeeping: Rework timekeeping_inject_sleeptime64() to use shadow_timekeeper
2024-10-09 8:29 ` [PATCH v2 20/25] timekeeping: Rework timekeeping_inject_sleeptime64() " Anna-Maria Behnsen
@ 2024-10-24 22:16 ` John Stultz
0 siblings, 0 replies; 53+ messages in thread
From: John Stultz @ 2024-10-24 22:16 UTC (permalink / raw)
To: Anna-Maria Behnsen
Cc: Frederic Weisbecker, Thomas Gleixner, linux-kernel, netdev,
Miroslav Lichvar, Richard Cochran, Christopher S Hall
On Wed, Oct 9, 2024 at 1:29 AM Anna-Maria Behnsen
<anna-maria@linutronix.de> wrote:
>
> From: Anna-Maria Behnsen <anna-maria@linutronix.de>
>
> Updates of the timekeeper can be done by operating on the shadow timekeeper
> and afterwards copying the result into the real timekeeper. This has the
> advantage, that the sequence count write protected region is kept as small
> as possible.
>
> Convert timekeeping_inject_sleeptime64() to use this scheme.
>
> Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
> ---
> kernel/time/timekeeping.c | 22 +++++++---------------
> 1 file changed, 7 insertions(+), 15 deletions(-)
>
> diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
> index bb225534fee1..c1a2726a0d41 100644
> --- a/kernel/time/timekeeping.c
> +++ b/kernel/time/timekeeping.c
> @@ -1893,22 +1893,14 @@ bool timekeeping_rtc_skipsuspend(void)
> */
> void timekeeping_inject_sleeptime64(const struct timespec64 *delta)
> {
> - struct timekeeper *tk = &tk_core.timekeeper;
> - unsigned long flags;
> -
> - raw_spin_lock_irqsave(&tk_core.lock, flags);
> - write_seqcount_begin(&tk_core.seq);
> -
> - suspend_timing_needed = false;
> -
> - timekeeping_forward_now(tk);
> -
> - __timekeeping_inject_sleeptime(tk, delta);
> -
> - timekeeping_update(&tk_core, tk, TK_UPDATE_ALL | TK_MIRROR);
> + scoped_guard(raw_spinlock_irqsave, &tk_core.lock) {
> + struct timekeeper *tk = &tk_core.shadow_timekeeper;
Same nit as the rest, otherwise
Acked-by: John Stultz <jstultz@google.com>
^ permalink raw reply [flat|nested] 53+ messages in thread
* [PATCH v2 21/25] timekeeping: Rework timekeeping_resume() to use shadow_timekeeper
2024-10-09 8:28 [PATCH v2 00/25] timekeeping: Rework to prepare support of indenpendent PTP clocks Anna-Maria Behnsen
` (19 preceding siblings ...)
2024-10-09 8:29 ` [PATCH v2 20/25] timekeeping: Rework timekeeping_inject_sleeptime64() " Anna-Maria Behnsen
@ 2024-10-09 8:29 ` Anna-Maria Behnsen
2024-10-24 22:18 ` John Stultz
2024-10-09 8:29 ` [PATCH v2 22/25] timekeeping: Rework timekeeping_suspend() " Anna-Maria Behnsen
` (3 subsequent siblings)
24 siblings, 1 reply; 53+ messages in thread
From: Anna-Maria Behnsen @ 2024-10-09 8:29 UTC (permalink / raw)
To: John Stultz, Frederic Weisbecker, Thomas Gleixner
Cc: linux-kernel, netdev, Miroslav Lichvar, Richard Cochran,
Christopher S Hall, Anna-Maria Behnsen
From: Anna-Maria Behnsen <anna-maria@linutronix.de>
Updates of the timekeeper can be done by operating on the shadow timekeeper
and afterwards copying the result into the real timekeeper. This has the
advantage, that the sequence count write protected region is kept as small
as possible.
While the sequence count held time is not relevant for the resume path as
there is no concurrency, there is no reason to have this function
different than all the other update sites.
Convert timekeeping_inject_offset() to use this scheme and cleanup the
variable declaration while at it.
Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
---
kernel/time/timekeeping.c | 10 ++++------
1 file changed, 4 insertions(+), 6 deletions(-)
diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index c1a2726a0d41..63b7a1379ae8 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -1912,12 +1912,12 @@ void timekeeping_inject_sleeptime64(const struct timespec64 *delta)
*/
void timekeeping_resume(void)
{
- struct timekeeper *tk = &tk_core.timekeeper;
+ struct timekeeper *tk = &tk_core.shadow_timekeeper;
struct clocksource *clock = tk->tkr_mono.clock;
- unsigned long flags;
struct timespec64 ts_new, ts_delta;
- u64 cycle_now, nsec;
bool inject_sleeptime = false;
+ u64 cycle_now, nsec;
+ unsigned long flags;
read_persistent_clock64(&ts_new);
@@ -1925,7 +1925,6 @@ void timekeeping_resume(void)
clocksource_resume();
raw_spin_lock_irqsave(&tk_core.lock, flags);
- write_seqcount_begin(&tk_core.seq);
/*
* After system resumes, we need to calculate the suspended time and
@@ -1960,8 +1959,7 @@ void timekeeping_resume(void)
tk->ntp_error = 0;
timekeeping_suspended = 0;
- timekeeping_update(&tk_core, tk, TK_MIRROR | TK_CLOCK_WAS_SET);
- write_seqcount_end(&tk_core.seq);
+ timekeeping_update_staged(&tk_core, TK_CLOCK_WAS_SET);
raw_spin_unlock_irqrestore(&tk_core.lock, flags);
touch_softlockup_watchdog();
--
2.39.5
^ permalink raw reply related [flat|nested] 53+ messages in thread* Re: [PATCH v2 21/25] timekeeping: Rework timekeeping_resume() to use shadow_timekeeper
2024-10-09 8:29 ` [PATCH v2 21/25] timekeeping: Rework timekeeping_resume() " Anna-Maria Behnsen
@ 2024-10-24 22:18 ` John Stultz
0 siblings, 0 replies; 53+ messages in thread
From: John Stultz @ 2024-10-24 22:18 UTC (permalink / raw)
To: Anna-Maria Behnsen
Cc: Frederic Weisbecker, Thomas Gleixner, linux-kernel, netdev,
Miroslav Lichvar, Richard Cochran, Christopher S Hall
On Wed, Oct 9, 2024 at 1:29 AM Anna-Maria Behnsen
<anna-maria@linutronix.de> wrote:
>
> From: Anna-Maria Behnsen <anna-maria@linutronix.de>
>
> Updates of the timekeeper can be done by operating on the shadow timekeeper
> and afterwards copying the result into the real timekeeper. This has the
> advantage, that the sequence count write protected region is kept as small
> as possible.
>
> While the sequence count held time is not relevant for the resume path as
> there is no concurrency, there is no reason to have this function
> different than all the other update sites.
>
> Convert timekeeping_inject_offset() to use this scheme and cleanup the
> variable declaration while at it.
>
> Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
Other than the tk_shadow naming nit from earlier,
Acked-by: John Stultz <jstultz@google.com>
^ permalink raw reply [flat|nested] 53+ messages in thread
* [PATCH v2 22/25] timekeeping: Rework timekeeping_suspend() to use shadow_timekeeper
2024-10-09 8:28 [PATCH v2 00/25] timekeeping: Rework to prepare support of indenpendent PTP clocks Anna-Maria Behnsen
` (20 preceding siblings ...)
2024-10-09 8:29 ` [PATCH v2 21/25] timekeeping: Rework timekeeping_resume() " Anna-Maria Behnsen
@ 2024-10-09 8:29 ` Anna-Maria Behnsen
2024-10-24 22:21 ` John Stultz
2024-10-09 8:29 ` [PATCH v2 23/25] timekeeping: Rework do_adjtimex() " Anna-Maria Behnsen
` (2 subsequent siblings)
24 siblings, 1 reply; 53+ messages in thread
From: Anna-Maria Behnsen @ 2024-10-09 8:29 UTC (permalink / raw)
To: John Stultz, Frederic Weisbecker, Thomas Gleixner
Cc: linux-kernel, netdev, Miroslav Lichvar, Richard Cochran,
Christopher S Hall, Anna-Maria Behnsen
From: Anna-Maria Behnsen <anna-maria@linutronix.de>
Updates of the timekeeper can be done by operating on the shadow timekeeper
and afterwards copying the result into the real timekeeper. This has the
advantage, that the sequence count write protected region is kept as small
as possible.
While the sequence count held time is not relevant for the resume path as
there is no concurrency, there is no reason to have this function
different than all the other update sites.
Convert timekeeping_inject_offset() to use this scheme and cleanup the
variable declarations while at it.
As halt_fast_timekeeper() does not need protection sequence counter, it is
no problem to move it with this change outside of the sequence counter
protected area. But it still needs to be executed while holding the lock.
Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
---
kernel/time/timekeeping.c | 12 +++++-------
1 file changed, 5 insertions(+), 7 deletions(-)
diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index 63b7a1379ae8..e15e843ba2b8 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -1972,11 +1972,11 @@ void timekeeping_resume(void)
int timekeeping_suspend(void)
{
- struct timekeeper *tk = &tk_core.timekeeper;
- unsigned long flags;
- struct timespec64 delta, delta_delta;
- static struct timespec64 old_delta;
+ struct timekeeper *tk = &tk_core.shadow_timekeeper;
+ struct timespec64 delta, delta_delta;
+ static struct timespec64 old_delta;
struct clocksource *curr_clock;
+ unsigned long flags;
u64 cycle_now;
read_persistent_clock64(&timekeeping_suspend_time);
@@ -1992,7 +1992,6 @@ int timekeeping_suspend(void)
suspend_timing_needed = true;
raw_spin_lock_irqsave(&tk_core.lock, flags);
- write_seqcount_begin(&tk_core.seq);
timekeeping_forward_now(tk);
timekeeping_suspended = 1;
@@ -2027,9 +2026,8 @@ int timekeeping_suspend(void)
}
}
- timekeeping_update(&tk_core, tk, TK_MIRROR);
+ timekeeping_update_staged(&tk_core, 0);
halt_fast_timekeeper(tk);
- write_seqcount_end(&tk_core.seq);
raw_spin_unlock_irqrestore(&tk_core.lock, flags);
tick_suspend();
--
2.39.5
^ permalink raw reply related [flat|nested] 53+ messages in thread* Re: [PATCH v2 22/25] timekeeping: Rework timekeeping_suspend() to use shadow_timekeeper
2024-10-09 8:29 ` [PATCH v2 22/25] timekeeping: Rework timekeeping_suspend() " Anna-Maria Behnsen
@ 2024-10-24 22:21 ` John Stultz
0 siblings, 0 replies; 53+ messages in thread
From: John Stultz @ 2024-10-24 22:21 UTC (permalink / raw)
To: Anna-Maria Behnsen
Cc: Frederic Weisbecker, Thomas Gleixner, linux-kernel, netdev,
Miroslav Lichvar, Richard Cochran, Christopher S Hall
On Wed, Oct 9, 2024 at 1:29 AM Anna-Maria Behnsen
<anna-maria@linutronix.de> wrote:
>
> From: Anna-Maria Behnsen <anna-maria@linutronix.de>
>
> Updates of the timekeeper can be done by operating on the shadow timekeeper
> and afterwards copying the result into the real timekeeper. This has the
> advantage, that the sequence count write protected region is kept as small
> as possible.
>
> While the sequence count held time is not relevant for the resume path as
> there is no concurrency, there is no reason to have this function
> different than all the other update sites.
>
> Convert timekeeping_inject_offset() to use this scheme and cleanup the
> variable declarations while at it.
>
> As halt_fast_timekeeper() does not need protection sequence counter, it is
> no problem to move it with this change outside of the sequence counter
> protected area. But it still needs to be executed while holding the lock.
>
> Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
tk_shadow naming nit, but otherwise:
Acked-by: John Stultz <jstultz@google.com>
^ permalink raw reply [flat|nested] 53+ messages in thread
* [PATCH v2 23/25] timekeeping: Rework do_adjtimex() to use shadow_timekeeper
2024-10-09 8:28 [PATCH v2 00/25] timekeeping: Rework to prepare support of indenpendent PTP clocks Anna-Maria Behnsen
` (21 preceding siblings ...)
2024-10-09 8:29 ` [PATCH v2 22/25] timekeeping: Rework timekeeping_suspend() " Anna-Maria Behnsen
@ 2024-10-09 8:29 ` Anna-Maria Behnsen
2024-10-24 22:26 ` John Stultz
2024-10-09 8:29 ` [PATCH v2 24/25] timekeeping: Remove TK_MIRROR timekeeping_update() action Anna-Maria Behnsen
2024-10-09 8:29 ` [PATCH v2 25/25] timekeeping: Merge timekeeping_update_staged() and timekeeping_update() Anna-Maria Behnsen
24 siblings, 1 reply; 53+ messages in thread
From: Anna-Maria Behnsen @ 2024-10-09 8:29 UTC (permalink / raw)
To: John Stultz, Frederic Weisbecker, Thomas Gleixner
Cc: linux-kernel, netdev, Miroslav Lichvar, Richard Cochran,
Christopher S Hall, Anna-Maria Behnsen
From: Anna-Maria Behnsen <anna-maria@linutronix.de>
Updates of the timekeeper can be done by operating on the shadow timekeeper
and afterwards copying the result into the real timekeeper. This has the
advantage, that the sequence count write protected region is kept as small
as possible.
Convert do_adjtimex() to use this scheme and take the opportunity to use a
scoped_guard() for locking.
That requires to have a separate function for updating the leap state so
that the update is protected by the sequence count. This also brings the
timekeeper and the shadow timekeeper in sync for this state, which was not
the case so far. That's not a correctness problem as the state is only used
at the read sides which use the real timekeeper, but it's inconsistent
nevertheless.
Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
---
kernel/time/timekeeping.c | 40 ++++++++++++++++++++++++----------------
1 file changed, 24 insertions(+), 16 deletions(-)
diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index e15e843ba2b8..b2683a589470 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -722,6 +722,18 @@ static inline void tk_update_leap_state(struct timekeeper *tk)
tk->next_leap_ktime = ktime_sub(tk->next_leap_ktime, tk->offs_real);
}
+/*
+ * Leap state update for both shadow and the real timekeeper
+ * Separate to spare a full memcpy() of the timekeeper.
+ */
+static void tk_update_leap_state_all(struct tk_data *tkd)
+{
+ write_seqcount_begin(&tkd->seq);
+ tk_update_leap_state(&tkd->shadow_timekeeper);
+ tkd->timekeeper.next_leap_ktime = tkd->shadow_timekeeper.next_leap_ktime;
+ write_seqcount_end(&tkd->seq);
+}
+
/*
* Update the ktime_t based scalar nsec members of the timekeeper
*/
@@ -2537,13 +2549,11 @@ EXPORT_SYMBOL_GPL(random_get_entropy_fallback);
*/
int do_adjtimex(struct __kernel_timex *txc)
{
- struct timekeeper *tk = &tk_core.timekeeper;
+ struct timekeeper *tk = &tk_core.shadow_timekeeper;
struct audit_ntp_data ad;
bool offset_set = false;
bool clock_set = false;
struct timespec64 ts;
- unsigned long flags;
- s32 orig_tai, tai;
int ret;
/* Validate the data before disabling interrupts */
@@ -2571,23 +2581,21 @@ int do_adjtimex(struct __kernel_timex *txc)
ktime_get_real_ts64(&ts);
add_device_randomness(&ts, sizeof(ts));
- raw_spin_lock_irqsave(&tk_core.lock, flags);
- write_seqcount_begin(&tk_core.seq);
+ scoped_guard (raw_spinlock_irqsave, &tk_core.lock) {
+ s32 orig_tai, tai;
- orig_tai = tai = tk->tai_offset;
- ret = __do_adjtimex(txc, &ts, &tai, &ad);
+ orig_tai = tai = tk->tai_offset;
+ ret = __do_adjtimex(txc, &ts, &tai, &ad);
- if (tai != orig_tai) {
- __timekeeping_set_tai_offset(tk, tai);
- timekeeping_update(&tk_core, tk, TK_MIRROR | TK_CLOCK_WAS_SET);
- clock_set = true;
- } else {
- tk_update_leap_state(tk);
+ if (tai != orig_tai) {
+ __timekeeping_set_tai_offset(tk, tai);
+ timekeeping_update_staged(&tk_core, TK_CLOCK_WAS_SET);
+ clock_set = true;
+ } else {
+ tk_update_leap_state_all(&tk_core);
+ }
}
- write_seqcount_end(&tk_core.seq);
- raw_spin_unlock_irqrestore(&tk_core.lock, flags);
-
audit_ntp_log(&ad);
/* Update the multiplier immediately if frequency was set directly */
--
2.39.5
^ permalink raw reply related [flat|nested] 53+ messages in thread* Re: [PATCH v2 23/25] timekeeping: Rework do_adjtimex() to use shadow_timekeeper
2024-10-09 8:29 ` [PATCH v2 23/25] timekeeping: Rework do_adjtimex() " Anna-Maria Behnsen
@ 2024-10-24 22:26 ` John Stultz
0 siblings, 0 replies; 53+ messages in thread
From: John Stultz @ 2024-10-24 22:26 UTC (permalink / raw)
To: Anna-Maria Behnsen
Cc: Frederic Weisbecker, Thomas Gleixner, linux-kernel, netdev,
Miroslav Lichvar, Richard Cochran, Christopher S Hall
On Wed, Oct 9, 2024 at 1:29 AM Anna-Maria Behnsen
<anna-maria@linutronix.de> wrote:
>
> From: Anna-Maria Behnsen <anna-maria@linutronix.de>
>
> Updates of the timekeeper can be done by operating on the shadow timekeeper
> and afterwards copying the result into the real timekeeper. This has the
> advantage, that the sequence count write protected region is kept as small
> as possible.
>
> Convert do_adjtimex() to use this scheme and take the opportunity to use a
> scoped_guard() for locking.
>
> That requires to have a separate function for updating the leap state so
> that the update is protected by the sequence count. This also brings the
> timekeeper and the shadow timekeeper in sync for this state, which was not
> the case so far. That's not a correctness problem as the state is only used
> at the read sides which use the real timekeeper, but it's inconsistent
> nevertheless.
>
> Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
Again, the tk_shadow naming nit, but otherwise:
Acked-by: John Stultz <jstultz@google.com>
^ permalink raw reply [flat|nested] 53+ messages in thread
* [PATCH v2 24/25] timekeeping: Remove TK_MIRROR timekeeping_update() action
2024-10-09 8:28 [PATCH v2 00/25] timekeeping: Rework to prepare support of indenpendent PTP clocks Anna-Maria Behnsen
` (22 preceding siblings ...)
2024-10-09 8:29 ` [PATCH v2 23/25] timekeeping: Rework do_adjtimex() " Anna-Maria Behnsen
@ 2024-10-09 8:29 ` Anna-Maria Behnsen
2024-10-24 22:29 ` John Stultz
2024-10-09 8:29 ` [PATCH v2 25/25] timekeeping: Merge timekeeping_update_staged() and timekeeping_update() Anna-Maria Behnsen
24 siblings, 1 reply; 53+ messages in thread
From: Anna-Maria Behnsen @ 2024-10-09 8:29 UTC (permalink / raw)
To: John Stultz, Frederic Weisbecker, Thomas Gleixner
Cc: linux-kernel, netdev, Miroslav Lichvar, Richard Cochran,
Christopher S Hall, Anna-Maria Behnsen
From: Anna-Maria Behnsen <anna-maria@linutronix.de>
All call sites of using TK_MIRROR flag in timekeeping_update() are
gone. The TK_MIRROR dependent code path is therefore dead code.
Remove it along with the TK_MIRROR define.
Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
---
kernel/time/timekeeping.c | 10 +---------
1 file changed, 1 insertion(+), 9 deletions(-)
diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index b2683a589470..67d7be2e02fb 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -30,8 +30,7 @@
#include "timekeeping_internal.h"
#define TK_CLEAR_NTP (1 << 0)
-#define TK_MIRROR (1 << 1)
-#define TK_CLOCK_WAS_SET (1 << 2)
+#define TK_CLOCK_WAS_SET (1 << 1)
#define TK_UPDATE_ALL (TK_CLEAR_NTP | TK_CLOCK_WAS_SET)
@@ -797,13 +796,6 @@ static void timekeeping_update(struct tk_data *tkd, struct timekeeper *tk, unsig
if (action & TK_CLOCK_WAS_SET)
tk->clock_was_set_seq++;
- /*
- * The mirroring of the data to the shadow-timekeeper needs
- * to happen last here to ensure we don't over-write the
- * timekeeper structure on the next update with stale data
- */
- if (action & TK_MIRROR)
- timekeeping_restore_shadow(tkd);
}
static void timekeeping_update_staged(struct tk_data *tkd, unsigned int action)
--
2.39.5
^ permalink raw reply related [flat|nested] 53+ messages in thread* Re: [PATCH v2 24/25] timekeeping: Remove TK_MIRROR timekeeping_update() action
2024-10-09 8:29 ` [PATCH v2 24/25] timekeeping: Remove TK_MIRROR timekeeping_update() action Anna-Maria Behnsen
@ 2024-10-24 22:29 ` John Stultz
0 siblings, 0 replies; 53+ messages in thread
From: John Stultz @ 2024-10-24 22:29 UTC (permalink / raw)
To: Anna-Maria Behnsen
Cc: Frederic Weisbecker, Thomas Gleixner, linux-kernel, netdev,
Miroslav Lichvar, Richard Cochran, Christopher S Hall
On Wed, Oct 9, 2024 at 1:29 AM Anna-Maria Behnsen
<anna-maria@linutronix.de> wrote:
>
> From: Anna-Maria Behnsen <anna-maria@linutronix.de>
>
> All call sites of using TK_MIRROR flag in timekeeping_update() are
> gone. The TK_MIRROR dependent code path is therefore dead code.
>
> Remove it along with the TK_MIRROR define.
>
> Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
Ah, apologies, this makes the earlier change in 14/25 much more
obviously good. Sorry for not seeing the big picture until I got here.
I withdraw any objection there
For both, now:
Acked-by: John Stultz <jstultz@google.com>
^ permalink raw reply [flat|nested] 53+ messages in thread
* [PATCH v2 25/25] timekeeping: Merge timekeeping_update_staged() and timekeeping_update()
2024-10-09 8:28 [PATCH v2 00/25] timekeeping: Rework to prepare support of indenpendent PTP clocks Anna-Maria Behnsen
` (23 preceding siblings ...)
2024-10-09 8:29 ` [PATCH v2 24/25] timekeeping: Remove TK_MIRROR timekeeping_update() action Anna-Maria Behnsen
@ 2024-10-09 8:29 ` Anna-Maria Behnsen
2024-10-24 22:41 ` John Stultz
24 siblings, 1 reply; 53+ messages in thread
From: Anna-Maria Behnsen @ 2024-10-09 8:29 UTC (permalink / raw)
To: John Stultz, Frederic Weisbecker, Thomas Gleixner
Cc: linux-kernel, netdev, Miroslav Lichvar, Richard Cochran,
Christopher S Hall, Anna-Maria Behnsen
From: Anna-Maria Behnsen <anna-maria@linutronix.de>
timekeeping_update_staged() is the only call site of timekeeping_update().
Merge those functions. No functional change.
Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
---
kernel/time/timekeeping.c | 31 ++++++++++++++-----------------
1 file changed, 14 insertions(+), 17 deletions(-)
diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index 67d7be2e02fb..d07eb1946ff1 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -529,7 +529,7 @@ EXPORT_SYMBOL_GPL(ktime_get_raw_fast_ns);
* timekeeping_inject_sleeptime64()
* __timekeeping_inject_sleeptime(tk, delta);
* timestamp();
- * timekeeping_update(tkd, tk, TK_CLEAR_NTP...);
+ * timekeeping_update_staged(tkd, TK_CLEAR_NTP...);
*
* (2) On 32-bit systems, the 64-bit boot offset (tk->offs_boot) may be
* partially updated. Since the tk->offs_boot update is a rare event, this
@@ -775,10 +775,21 @@ static void timekeeping_restore_shadow(struct tk_data *tkd)
memcpy(&tkd->shadow_timekeeper, &tkd->timekeeper, sizeof(tkd->timekeeper));
}
-static void timekeeping_update(struct tk_data *tkd, struct timekeeper *tk, unsigned int action)
+static void timekeeping_update_staged(struct tk_data *tkd, unsigned int action)
{
+ struct timekeeper *tk = &tk_core.shadow_timekeeper;
+
lockdep_assert_held(&tkd->lock);
+ /*
+ * Block out readers before running the updates below because that
+ * updates VDSO and other time related infrastructure. Not blocking
+ * the readers might let a reader see time going backwards when
+ * reading from the VDSO after the VDSO update and then reading in
+ * the kernel from the timekeeper before that got updated.
+ */
+ write_seqcount_begin(&tkd->seq);
+
if (action & TK_CLEAR_NTP) {
tk->ntp_error = 0;
ntp_clear();
@@ -796,20 +807,6 @@ static void timekeeping_update(struct tk_data *tkd, struct timekeeper *tk, unsig
if (action & TK_CLOCK_WAS_SET)
tk->clock_was_set_seq++;
-}
-
-static void timekeeping_update_staged(struct tk_data *tkd, unsigned int action)
-{
- /*
- * Block out readers before invoking timekeeping_update() because
- * that updates VDSO and other time related infrastructure. Not
- * blocking the readers might let a reader see time going backwards
- * when reading from the VDSO after the VDSO update and then
- * reading in the kernel from the timekeeper before that got updated.
- */
- write_seqcount_begin(&tkd->seq);
-
- timekeeping_update(tkd, &tkd->shadow_timekeeper, action);
/*
* Update the real timekeeper.
@@ -819,7 +816,7 @@ static void timekeeping_update_staged(struct tk_data *tkd, unsigned int action)
* the cacheline optimized data layout of the timekeeper and requires
* another indirection.
*/
- memcpy(&tkd->timekeeper, &tkd->shadow_timekeeper, sizeof(tkd->shadow_timekeeper));
+ memcpy(&tkd->timekeeper, tk, sizeof(*tk));
write_seqcount_end(&tkd->seq);
}
--
2.39.5
^ permalink raw reply related [flat|nested] 53+ messages in thread* Re: [PATCH v2 25/25] timekeeping: Merge timekeeping_update_staged() and timekeeping_update()
2024-10-09 8:29 ` [PATCH v2 25/25] timekeeping: Merge timekeeping_update_staged() and timekeeping_update() Anna-Maria Behnsen
@ 2024-10-24 22:41 ` John Stultz
0 siblings, 0 replies; 53+ messages in thread
From: John Stultz @ 2024-10-24 22:41 UTC (permalink / raw)
To: Anna-Maria Behnsen
Cc: Frederic Weisbecker, Thomas Gleixner, linux-kernel, netdev,
Miroslav Lichvar, Richard Cochran, Christopher S Hall
On Wed, Oct 9, 2024 at 1:29 AM Anna-Maria Behnsen
<anna-maria@linutronix.de> wrote:
>
> From: Anna-Maria Behnsen <anna-maria@linutronix.de>
>
> timekeeping_update_staged() is the only call site of timekeeping_update().
>
> Merge those functions. No functional change.
>
> Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
> ---
> kernel/time/timekeeping.c | 31 ++++++++++++++-----------------
> 1 file changed, 14 insertions(+), 17 deletions(-)
>
> diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
> index 67d7be2e02fb..d07eb1946ff1 100644
> --- a/kernel/time/timekeeping.c
> +++ b/kernel/time/timekeeping.c
> @@ -529,7 +529,7 @@ EXPORT_SYMBOL_GPL(ktime_get_raw_fast_ns);
> * timekeeping_inject_sleeptime64()
> * __timekeeping_inject_sleeptime(tk, delta);
> * timestamp();
> - * timekeeping_update(tkd, tk, TK_CLEAR_NTP...);
> + * timekeeping_update_staged(tkd, TK_CLEAR_NTP...);
> *
> * (2) On 32-bit systems, the 64-bit boot offset (tk->offs_boot) may be
> * partially updated. Since the tk->offs_boot update is a rare event, this
> @@ -775,10 +775,21 @@ static void timekeeping_restore_shadow(struct tk_data *tkd)
> memcpy(&tkd->shadow_timekeeper, &tkd->timekeeper, sizeof(tkd->timekeeper));
> }
>
> -static void timekeeping_update(struct tk_data *tkd, struct timekeeper *tk, unsigned int action)
> +static void timekeeping_update_staged(struct tk_data *tkd, unsigned int action)
I still think timekeeping_update_from_shadow would be a better name.
> {
> + struct timekeeper *tk = &tk_core.shadow_timekeeper;
> +
Same naming nit, as its easy to mix up what tk is modifying.
Other than those,
Acked-by: John Stultz <jstultz@google.com>
Thanks so much for these really nice cleanups! They are a great
improvement to the code!
And again, my apologies for jumping off to other things midway through
reviewing these and not getting back to them until now.
thanks
-john
^ permalink raw reply [flat|nested] 53+ messages in thread