* [PATCH 00/24] timekeeping: Rework to prepare support of indenpendent PTP clocks
@ 2024-09-11 13:29 Anna-Maria Behnsen
2024-09-11 13:29 ` [PATCH 01/24] timekeeping: Read NTP tick length only once Anna-Maria Behnsen
` (23 more replies)
0 siblings, 24 replies; 27+ messages in thread
From: Anna-Maria Behnsen @ 2024-09-11 13:29 UTC (permalink / raw)
To: John Stultz, Frederic Weisbecker, Thomas Gleixner
Cc: linux-kernel, netdev, Miroslav Lichvar, Richard Cochran,
Christopher S Hall, Anna-Maria Behnsen
The generic clock and timekeeping infrastructure supports only the already
defined clocks and as they are not independent there is no need of
generalization of data structures. But PTP clocks can be independent from
CLOCK_TAI.
PTP clocks already have clock_gettime() support via the file descriptor
based posix clocks. These interfaces access the PTP hardware and are
therefore slow and cannot be used from within the kernel, e.g. TSN
networking.
This problem can be solved by emulating clock_gettime() via the system
clock source e.g. TSC on x86. Such emulation requires:
1. timekeeping mechanism similar to the existing system timekeeping
2. clock steering equivalent to NTP/adjtimex()
In the already existing system timekeeping implementation the lock and
shadow timekeeper are separate from the timekeeper and sequence
counter. Move this information into a new struct type "tk_data" to be able
to recycle it for the above explained approach.
NTP/adjtimex() related information is all stored in static variables. Move
all of them into the new struct type ntp_data to make it reusable. (See
https://lore.kernel.org/r/20240911-devel-anna-maria-b4-timers-ptp-ntp-v1-0-2d52f4e13476@linutronix.de)
Even without the future support for independent PTP clocks, the
generalization of timekeeping and NTP/adjtimex() improves the structure and
readability of the already existing code.
Once this is implemented clock_gettime() support for these clocks via vdso
can be implement as well but this is an orthogonal task.
This queue covers only the generalization of timekeeping:
- Patch 1-5: Basic cleanups
- Patch 6-10: Generalization of tk_data
- Patch 11-24: Use always shadow timekeeper for updates
The queue is available here:
git://git.kernel.org/pub/scm/linux/kernel/git/anna-maria/linux-devel.git timers/ptp/timekeeping
Cc: linux-kernel@vger.kernel.org
Cc: netdev@vger.kernel.org
Cc: Miroslav Lichvar <mlichvar@redhat.com>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Christopher S Hall <christopher.s.hall@intel.com>
To: John Stultz <jstultz@google.com>
To: Frederic Weisbecker <frederic@kernel.org>
To: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
Thanks,
Anna-Maria
---
Anna-Maria Behnsen (16):
timekeeping: Avoid duplicate leap state update
timekeeping: Move timekeeper_lock into tk_core
timekeeping: Define a struct type for tk_core to make it reusable
timekeeping: Add struct tk_data as argument to timekeeping_update()
timekeeping: Split out timekeeper update of timekeeping_advanced()
timekeeping: Introduce combined timekeeping action flag
timekeeping: Rework do_settimeofday64() to use shadow_timekeeper
timekeeping: Rework timekeeping_inject_offset() to use shadow_timekeeper
timekeeping: Rework change_clocksource() to use shadow_timekeeper
timekeeping: Rework timekeeping_init() to use shadow_timekeeper
timekeeping: Rework timekeeping_inject_sleeptime64() to use shadow_timekeeper
timekeeping: Rework timekeeping_resume() to use shadow_timekeeper
timekeeping: Rework timekeeping_suspend() to use shadow_timekeeper
timekeeping: Rework do_adjtimex() to use shadow_timekeeper
timekeeping: Remove TK_MIRROR timekeeping_update() action
timekeeping: Merge timekeeping_update_staged() and timekeeping_update()
Thomas Gleixner (8):
timekeeping: Read NTP tick length only once
timekeeping: Don't stop time readers across hard_pps() update
timekeeping: Abort clocksource change in case of failure
timekeeping: Simplify code in timekeeping_advance()
timekeeping: Reorder struct timekeeper
timekeeping: Move shadow_timekeeper into tk_core
timekeeping: Encapsulate locking/unlocking of timekeeper_lock
timekeeping: Provide timekeeping_restore_shadow()
include/linux/timekeeper_internal.h | 102 ++++++----
kernel/time/timekeeping.c | 369 +++++++++++++++++-------------------
kernel/time/timekeeping_internal.h | 3 +-
kernel/time/vsyscall.c | 5 +-
4 files changed, 238 insertions(+), 241 deletions(-)
^ permalink raw reply [flat|nested] 27+ messages in thread
* [PATCH 01/24] timekeeping: Read NTP tick length only once
2024-09-11 13:29 [PATCH 00/24] timekeeping: Rework to prepare support of indenpendent PTP clocks Anna-Maria Behnsen
@ 2024-09-11 13:29 ` Anna-Maria Behnsen
2024-09-11 13:29 ` [PATCH 02/24] timekeeping: Don't stop time readers across hard_pps() update Anna-Maria Behnsen
` (22 subsequent siblings)
23 siblings, 0 replies; 27+ messages in thread
From: Anna-Maria Behnsen @ 2024-09-11 13:29 UTC (permalink / raw)
To: John Stultz, Frederic Weisbecker, Thomas Gleixner
Cc: linux-kernel, netdev, Miroslav Lichvar, Richard Cochran,
Christopher S Hall, Anna-Maria Behnsen
From: Thomas Gleixner <tglx@linutronix.de>
No point in reading it a second time when the comparison fails.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
---
kernel/time/timekeeping.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index 5391e4167d60..b877173f88bb 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -2130,16 +2130,17 @@ static __always_inline void timekeeping_apply_adjustment(struct timekeeper *tk,
*/
static void timekeeping_adjust(struct timekeeper *tk, s64 offset)
{
+ u64 ntp_tl = ntp_tick_length();
u32 mult;
/*
* Determine the multiplier from the current NTP tick length.
* Avoid expensive division when the tick length doesn't change.
*/
- if (likely(tk->ntp_tick == ntp_tick_length())) {
+ if (likely(tk->ntp_tick == ntp_tl)) {
mult = tk->tkr_mono.mult - tk->ntp_err_mult;
} else {
- tk->ntp_tick = ntp_tick_length();
+ tk->ntp_tick = ntp_tl;
mult = div64_u64((tk->ntp_tick >> tk->ntp_error_shift) -
tk->xtime_remainder, tk->cycle_interval);
}
--
2.39.2
^ permalink raw reply related [flat|nested] 27+ messages in thread
* [PATCH 02/24] timekeeping: Don't stop time readers across hard_pps() update
2024-09-11 13:29 [PATCH 00/24] timekeeping: Rework to prepare support of indenpendent PTP clocks Anna-Maria Behnsen
2024-09-11 13:29 ` [PATCH 01/24] timekeeping: Read NTP tick length only once Anna-Maria Behnsen
@ 2024-09-11 13:29 ` Anna-Maria Behnsen
2024-09-11 13:29 ` [PATCH 03/24] timekeeping: Avoid duplicate leap state update Anna-Maria Behnsen
` (21 subsequent siblings)
23 siblings, 0 replies; 27+ messages in thread
From: Anna-Maria Behnsen @ 2024-09-11 13:29 UTC (permalink / raw)
To: John Stultz, Frederic Weisbecker, Thomas Gleixner
Cc: linux-kernel, netdev, Miroslav Lichvar, Richard Cochran,
Christopher S Hall, Anna-Maria Behnsen
From: Thomas Gleixner <tglx@linutronix.de>
hard_pps() update does not modify anything which might be required by time
readers so forcing readers out of the way during the update is a pointless
exercise.
The interaction with adjtimex() and timekeeper updates which call into the
NTP code is properly serialized by timekeeper_lock.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
---
kernel/time/timekeeping.c | 4 ----
1 file changed, 4 deletions(-)
diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index b877173f88bb..f09752bae05d 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -2625,11 +2625,7 @@ void hardpps(const struct timespec64 *phase_ts, const struct timespec64 *raw_ts)
unsigned long flags;
raw_spin_lock_irqsave(&timekeeper_lock, flags);
- write_seqcount_begin(&tk_core.seq);
-
__hardpps(phase_ts, raw_ts);
-
- write_seqcount_end(&tk_core.seq);
raw_spin_unlock_irqrestore(&timekeeper_lock, flags);
}
EXPORT_SYMBOL(hardpps);
--
2.39.2
^ permalink raw reply related [flat|nested] 27+ messages in thread
* [PATCH 03/24] timekeeping: Avoid duplicate leap state update
2024-09-11 13:29 [PATCH 00/24] timekeeping: Rework to prepare support of indenpendent PTP clocks Anna-Maria Behnsen
2024-09-11 13:29 ` [PATCH 01/24] timekeeping: Read NTP tick length only once Anna-Maria Behnsen
2024-09-11 13:29 ` [PATCH 02/24] timekeeping: Don't stop time readers across hard_pps() update Anna-Maria Behnsen
@ 2024-09-11 13:29 ` Anna-Maria Behnsen
2024-09-11 13:29 ` [PATCH 04/24] timekeeping: Abort clocksource change in case of failure Anna-Maria Behnsen
` (20 subsequent siblings)
23 siblings, 0 replies; 27+ messages in thread
From: Anna-Maria Behnsen @ 2024-09-11 13:29 UTC (permalink / raw)
To: John Stultz, Frederic Weisbecker, Thomas Gleixner
Cc: linux-kernel, netdev, Miroslav Lichvar, Richard Cochran,
Christopher S Hall, Anna-Maria Behnsen
do_adjtimex() invokes tk_update_leap_state() unconditionally even when a
previous invocation of timekeeping_update() already did that update.
Put it into the else path which is invoked when timekeeping_update() is not
called.
Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
---
kernel/time/timekeeping.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index f09752bae05d..bdc4a6fe040d 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -2594,8 +2594,9 @@ int do_adjtimex(struct __kernel_timex *txc)
__timekeeping_set_tai_offset(tk, tai);
timekeeping_update(tk, TK_MIRROR | TK_CLOCK_WAS_SET);
clock_set = true;
+ } else {
+ tk_update_leap_state(tk);
}
- tk_update_leap_state(tk);
write_seqcount_end(&tk_core.seq);
raw_spin_unlock_irqrestore(&timekeeper_lock, flags);
--
2.39.2
^ permalink raw reply related [flat|nested] 27+ messages in thread
* [PATCH 04/24] timekeeping: Abort clocksource change in case of failure
2024-09-11 13:29 [PATCH 00/24] timekeeping: Rework to prepare support of indenpendent PTP clocks Anna-Maria Behnsen
` (2 preceding siblings ...)
2024-09-11 13:29 ` [PATCH 03/24] timekeeping: Avoid duplicate leap state update Anna-Maria Behnsen
@ 2024-09-11 13:29 ` Anna-Maria Behnsen
2024-09-11 13:29 ` [PATCH 05/24] timekeeping: Simplify code in timekeeping_advance() Anna-Maria Behnsen
` (19 subsequent siblings)
23 siblings, 0 replies; 27+ messages in thread
From: Anna-Maria Behnsen @ 2024-09-11 13:29 UTC (permalink / raw)
To: John Stultz, Frederic Weisbecker, Thomas Gleixner
Cc: linux-kernel, netdev, Miroslav Lichvar, Richard Cochran,
Christopher S Hall, Anna-Maria Behnsen
From: Thomas Gleixner <tglx@linutronix.de>
There is no point to go through a full timekeeping update when acquiring a
module reference or enabling the new clocksource fails.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
---
kernel/time/timekeeping.c | 31 +++++++++++++------------------
1 file changed, 13 insertions(+), 18 deletions(-)
diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index bdc4a6fe040d..7b80dd70062c 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -1577,33 +1577,29 @@ static void __timekeeping_set_tai_offset(struct timekeeper *tk, s32 tai_offset)
static int change_clocksource(void *data)
{
struct timekeeper *tk = &tk_core.timekeeper;
- struct clocksource *new, *old = NULL;
+ struct clocksource *new = data, *old = NULL;
unsigned long flags;
- bool change = false;
-
- new = (struct clocksource *) data;
/*
- * If the cs is in module, get a module reference. Succeeds
- * for built-in code (owner == NULL) as well.
+ * If the clocksource is in a module, get a module reference.
+ * Succeeds for built-in code (owner == NULL) as well. Abort if the
+ * reference can't be acquired.
*/
- if (try_module_get(new->owner)) {
- if (!new->enable || new->enable(new) == 0)
- change = true;
- else
- module_put(new->owner);
+ if (!try_module_get(new->owner))
+ return 0;
+
+ /* Abort if the device can't be enabled */
+ if (new->enable && new->enable(new) != 0) {
+ module_put(new->owner);
+ return 0;
}
raw_spin_lock_irqsave(&timekeeper_lock, flags);
write_seqcount_begin(&tk_core.seq);
timekeeping_forward_now(tk);
-
- if (change) {
- old = tk->tkr_mono.clock;
- tk_setup_internals(tk, new);
- }
-
+ old = tk->tkr_mono.clock;
+ tk_setup_internals(tk, new);
timekeeping_update(tk, TK_CLEAR_NTP | TK_MIRROR | TK_CLOCK_WAS_SET);
write_seqcount_end(&tk_core.seq);
@@ -1612,7 +1608,6 @@ static int change_clocksource(void *data)
if (old) {
if (old->disable)
old->disable(old);
-
module_put(old->owner);
}
--
2.39.2
^ permalink raw reply related [flat|nested] 27+ messages in thread
* [PATCH 05/24] timekeeping: Simplify code in timekeeping_advance()
2024-09-11 13:29 [PATCH 00/24] timekeeping: Rework to prepare support of indenpendent PTP clocks Anna-Maria Behnsen
` (3 preceding siblings ...)
2024-09-11 13:29 ` [PATCH 04/24] timekeeping: Abort clocksource change in case of failure Anna-Maria Behnsen
@ 2024-09-11 13:29 ` Anna-Maria Behnsen
2024-09-11 13:29 ` [PATCH 06/24] timekeeping: Reorder struct timekeeper Anna-Maria Behnsen
` (18 subsequent siblings)
23 siblings, 0 replies; 27+ messages in thread
From: Anna-Maria Behnsen @ 2024-09-11 13:29 UTC (permalink / raw)
To: John Stultz, Frederic Weisbecker, Thomas Gleixner
Cc: linux-kernel, netdev, Miroslav Lichvar, Richard Cochran,
Christopher S Hall, Anna-Maria Behnsen
From: Thomas Gleixner <tglx@linutronix.de>
timekeeping_advance() takes the timekeeper_lock and releases it before
returning. When an early return is required, goto statements are used to
make sure the lock is realeased properly. When the code was written the
locking guard() was not yet available.
Use the guard() to simplify the code and while at it cleanup ordering of
function variables. No functional change.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
---
kernel/time/timekeeping.c | 16 ++++++----------
1 file changed, 6 insertions(+), 10 deletions(-)
diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index 7b80dd70062c..e528e283523b 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -2276,23 +2276,22 @@ static bool timekeeping_advance(enum timekeeping_adv_mode mode)
{
struct timekeeper *real_tk = &tk_core.timekeeper;
struct timekeeper *tk = &shadow_timekeeper;
- u64 offset;
- int shift = 0, maxshift;
unsigned int clock_set = 0;
- unsigned long flags;
+ int shift = 0, maxshift;
+ u64 offset;
- raw_spin_lock_irqsave(&timekeeper_lock, flags);
+ guard(raw_spinlock_irqsave)(&timekeeper_lock);
/* Make sure we're fully resumed: */
if (unlikely(timekeeping_suspended))
- goto out;
+ return false;
offset = clocksource_delta(tk_clock_read(&tk->tkr_mono),
tk->tkr_mono.cycle_last, tk->tkr_mono.mask);
/* Check if there's really nothing to do */
if (offset < real_tk->cycle_interval && mode == TK_ADV_TICK)
- goto out;
+ return false;
/* Do some additional sanity checking */
timekeeping_check_update(tk, offset);
@@ -2311,8 +2310,7 @@ static bool timekeeping_advance(enum timekeeping_adv_mode mode)
maxshift = (64 - (ilog2(ntp_tick_length())+1)) - 1;
shift = min(shift, maxshift);
while (offset >= tk->cycle_interval) {
- offset = logarithmic_accumulation(tk, offset, shift,
- &clock_set);
+ offset = logarithmic_accumulation(tk, offset, shift, &clock_set);
if (offset < tk->cycle_interval<<shift)
shift--;
}
@@ -2341,8 +2339,6 @@ static bool timekeeping_advance(enum timekeeping_adv_mode mode)
memcpy(real_tk, tk, sizeof(*tk));
/* The memcpy must come last. Do not put anything here! */
write_seqcount_end(&tk_core.seq);
-out:
- raw_spin_unlock_irqrestore(&timekeeper_lock, flags);
return !!clock_set;
}
--
2.39.2
^ permalink raw reply related [flat|nested] 27+ messages in thread
* [PATCH 06/24] timekeeping: Reorder struct timekeeper
2024-09-11 13:29 [PATCH 00/24] timekeeping: Rework to prepare support of indenpendent PTP clocks Anna-Maria Behnsen
` (4 preceding siblings ...)
2024-09-11 13:29 ` [PATCH 05/24] timekeeping: Simplify code in timekeeping_advance() Anna-Maria Behnsen
@ 2024-09-11 13:29 ` Anna-Maria Behnsen
2024-09-11 13:29 ` [PATCH 07/24] timekeeping: Move shadow_timekeeper into tk_core Anna-Maria Behnsen
` (17 subsequent siblings)
23 siblings, 0 replies; 27+ messages in thread
From: Anna-Maria Behnsen @ 2024-09-11 13:29 UTC (permalink / raw)
To: John Stultz, Frederic Weisbecker, Thomas Gleixner
Cc: linux-kernel, netdev, Miroslav Lichvar, Richard Cochran,
Christopher S Hall, Anna-Maria Behnsen
From: Thomas Gleixner <tglx@linutronix.de>
struct timekeeper is ordered suboptimal vs. cachelines. The layout,
including the preceding seqcount (see struct tk_core in timekeeper.c) is:
cacheline 0: seqcount, tkr_mono
cacheline 1: tkr_raw, xtime_sec
cacheline 2: ktime_sec ... tai_offset, internal variables
cacheline 3: next_leap_ktime, raw_sec, internal variables
cacheline 4: internal variables
So any access to via ktime_get*() except for access to CLOCK_MONOTONIC_RAW
will use either cachelines 0 + 1 or cachelines 0 + 2. Access to
CLOCK_MONOTONIC_RAW uses cachelines 0 + 1 + 3.
Reorder the members so that the result is more efficient:
cacheline 0: seqcount, tkr_mono
cacheline 1: xtime_sec, ktime_sec ... tai_offset
cacheline 2: tkr_raw, raw_sec
cacheline 3: internal variables
cacheline 4: internal variables
That means ktime_get*() will access cacheline 0 + 1 and CLOCK_MONOTONIC_RAW
access will use cachelines 0 + 2.
Update kernel-doc and fix formatting issues while at it. Also fix a typo
in struct tk_read_base kernel-doc.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
---
include/linux/timekeeper_internal.h | 102 +++++++++++++++++++++---------------
1 file changed, 61 insertions(+), 41 deletions(-)
diff --git a/include/linux/timekeeper_internal.h b/include/linux/timekeeper_internal.h
index 84ff2844df2a..5805c5f61df4 100644
--- a/include/linux/timekeeper_internal.h
+++ b/include/linux/timekeeper_internal.h
@@ -26,7 +26,7 @@
* occupies a single 64byte cache line.
*
* The struct is separate from struct timekeeper as it is also used
- * for a fast NMI safe accessors.
+ * for the fast NMI safe accessors.
*
* @base_real is for the fast NMI safe accessor to allow reading clock
* realtime from any context.
@@ -44,33 +44,41 @@ struct tk_read_base {
/**
* struct timekeeper - Structure holding internal timekeeping values.
- * @tkr_mono: The readout base structure for CLOCK_MONOTONIC
- * @tkr_raw: The readout base structure for CLOCK_MONOTONIC_RAW
- * @xtime_sec: Current CLOCK_REALTIME time in seconds
- * @ktime_sec: Current CLOCK_MONOTONIC time in seconds
- * @wall_to_monotonic: CLOCK_REALTIME to CLOCK_MONOTONIC offset
- * @offs_real: Offset clock monotonic -> clock realtime
- * @offs_boot: Offset clock monotonic -> clock boottime
- * @offs_tai: Offset clock monotonic -> clock tai
- * @tai_offset: The current UTC to TAI offset in seconds
- * @clock_was_set_seq: The sequence number of clock was set events
- * @cs_was_changed_seq: The sequence number of clocksource change events
- * @next_leap_ktime: CLOCK_MONOTONIC time value of a pending leap-second
- * @raw_sec: CLOCK_MONOTONIC_RAW time in seconds
- * @monotonic_to_boot: CLOCK_MONOTONIC to CLOCK_BOOTTIME offset
- * @cycle_interval: Number of clock cycles in one NTP interval
- * @xtime_interval: Number of clock shifted nano seconds in one NTP
- * interval.
- * @xtime_remainder: Shifted nano seconds left over when rounding
- * @cycle_interval
- * @raw_interval: Shifted raw nano seconds accumulated per NTP interval.
- * @ntp_error: Difference between accumulated time and NTP time in ntp
- * shifted nano seconds.
- * @ntp_error_shift: Shift conversion between clock shifted nano seconds and
- * ntp shifted nano seconds.
- * @last_warning: Warning ratelimiter (DEBUG_TIMEKEEPING)
- * @underflow_seen: Underflow warning flag (DEBUG_TIMEKEEPING)
- * @overflow_seen: Overflow warning flag (DEBUG_TIMEKEEPING)
+ * @tkr_mono: The readout base structure for CLOCK_MONOTONIC
+ * @xtime_sec: Current CLOCK_REALTIME time in seconds
+ * @ktime_sec: Current CLOCK_MONOTONIC time in seconds
+ * @wall_to_monotonic: CLOCK_REALTIME to CLOCK_MONOTONIC offset
+ * @offs_real: Offset clock monotonic -> clock realtime
+ * @offs_boot: Offset clock monotonic -> clock boottime
+ * @offs_tai: Offset clock monotonic -> clock tai
+ * @tai_offset: The current UTC to TAI offset in seconds
+ * @tkr_raw: The readout base structure for CLOCK_MONOTONIC_RAW
+ * @raw_sec: CLOCK_MONOTONIC_RAW time in seconds
+ * @clock_was_set_seq: The sequence number of clock was set events
+ * @cs_was_changed_seq: The sequence number of clocksource change events
+ * @monotonic_to_boot: CLOCK_MONOTONIC to CLOCK_BOOTTIME offset
+ * @cycle_interval: Number of clock cycles in one NTP interval
+ * @xtime_interval: Number of clock shifted nano seconds in one NTP
+ * interval.
+ * @xtime_remainder: Shifted nano seconds left over when rounding
+ * @cycle_interval
+ * @raw_interval: Shifted raw nano seconds accumulated per NTP interval.
+ * @next_leap_ktime: CLOCK_MONOTONIC time value of a pending leap-second
+ * @ntp_tick: The ntp_tick_length() value currently being
+ * used. This cached copy ensures we consistently
+ * apply the tick length for an entire tick, as
+ * ntp_tick_length may change mid-tick, and we don't
+ * want to apply that new value to the tick in
+ * progress.
+ * @ntp_error: Difference between accumulated time and NTP time in ntp
+ * shifted nano seconds.
+ * @ntp_error_shift: Shift conversion between clock shifted nano seconds and
+ * ntp shifted nano seconds.
+ * @ntp_err_mult: Multiplication factor for scaled math conversion
+ * @skip_second_overflow: Flag used to avoid updating NTP twice with same second
+ * @last_warning: Warning ratelimiter (DEBUG_TIMEKEEPING)
+ * @underflow_seen: Underflow warning flag (DEBUG_TIMEKEEPING)
+ * @overflow_seen: Overflow warning flag (DEBUG_TIMEKEEPING)
*
* Note: For timespec(64) based interfaces wall_to_monotonic is what
* we need to add to xtime (or xtime corrected for sub jiffie times)
@@ -88,10 +96,25 @@ struct tk_read_base {
*
* @monotonic_to_boottime is a timespec64 representation of @offs_boot to
* accelerate the VDSO update for CLOCK_BOOTTIME.
+ *
+ * The cacheline ordering of the structure is optimized for in kernel usage
+ * of the ktime_get() and ktime_get_ts64() family of time accessors. Struct
+ * timekeeper is prepended in the core timekeeeping code with a sequence
+ * count, which results in the following cacheline layout:
+ *
+ * 0: seqcount, tkr_mono
+ * 1: xtime_sec ... tai_offset
+ * 2: tkr_raw, raw_sec
+ * 3,4: Internal variables
+ *
+ * Cacheline 0,1 contain the data which is used for accessing
+ * CLOCK_MONOTONIC/REALTIME/BOOTTIME/TAI, while cacheline 2 contains the
+ * data for accessing CLOCK_MONOTONIC_RAW. Cacheline 3,4 are internal
+ * variables which are only accessed during timekeeper updates once per
+ * tick.
*/
struct timekeeper {
struct tk_read_base tkr_mono;
- struct tk_read_base tkr_raw;
u64 xtime_sec;
unsigned long ktime_sec;
struct timespec64 wall_to_monotonic;
@@ -99,31 +122,28 @@ struct timekeeper {
ktime_t offs_boot;
ktime_t offs_tai;
s32 tai_offset;
+
+ struct tk_read_base tkr_raw;
+ u64 raw_sec;
+
+ /* The following members are for timekeeping internal use */
unsigned int clock_was_set_seq;
u8 cs_was_changed_seq;
- ktime_t next_leap_ktime;
- u64 raw_sec;
+
struct timespec64 monotonic_to_boot;
- /* The following members are for timekeeping internal use */
u64 cycle_interval;
u64 xtime_interval;
s64 xtime_remainder;
u64 raw_interval;
- /* The ntp_tick_length() value currently being used.
- * This cached copy ensures we consistently apply the tick
- * length for an entire tick, as ntp_tick_length may change
- * mid-tick, and we don't want to apply that new value to
- * the tick in progress.
- */
+
+ ktime_t next_leap_ktime;
u64 ntp_tick;
- /* Difference between accumulated time and NTP time in ntp
- * shifted nano seconds. */
s64 ntp_error;
u32 ntp_error_shift;
u32 ntp_err_mult;
- /* Flag used to avoid updating NTP twice with same second */
u32 skip_second_overflow;
+
#ifdef CONFIG_DEBUG_TIMEKEEPING
long last_warning;
/*
--
2.39.2
^ permalink raw reply related [flat|nested] 27+ messages in thread
* [PATCH 07/24] timekeeping: Move shadow_timekeeper into tk_core
2024-09-11 13:29 [PATCH 00/24] timekeeping: Rework to prepare support of indenpendent PTP clocks Anna-Maria Behnsen
` (5 preceding siblings ...)
2024-09-11 13:29 ` [PATCH 06/24] timekeeping: Reorder struct timekeeper Anna-Maria Behnsen
@ 2024-09-11 13:29 ` Anna-Maria Behnsen
2024-09-11 13:29 ` [PATCH 08/24] timekeeping: Encapsulate locking/unlocking of timekeeper_lock Anna-Maria Behnsen
` (16 subsequent siblings)
23 siblings, 0 replies; 27+ messages in thread
From: Anna-Maria Behnsen @ 2024-09-11 13:29 UTC (permalink / raw)
To: John Stultz, Frederic Weisbecker, Thomas Gleixner
Cc: linux-kernel, netdev, Miroslav Lichvar, Richard Cochran,
Christopher S Hall, Anna-Maria Behnsen
From: Thomas Gleixner <tglx@linutronix.de>
tk_core requires shadow_timekeeper to allow timekeeping_advance() updating
without holding the timekeeper sequence count write locked. This allows the
readers to make progress up to the actual update where the shadow
timekeeper is copied over to the real timekeeper.
As long as there is only a single timekeeper, having them separate is
fine. But when the timekeeper infrastructure will be reused for per ptp
clock timekeepers, shadow_timekeeper needs to be part of tk_core.
No functional change.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
---
kernel/time/timekeeping.c | 7 +++----
1 file changed, 3 insertions(+), 4 deletions(-)
diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index e528e283523b..e35e00b0cdd4 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -50,11 +50,11 @@ DEFINE_RAW_SPINLOCK(timekeeper_lock);
static struct {
seqcount_raw_spinlock_t seq;
struct timekeeper timekeeper;
+ struct timekeeper shadow_timekeeper;
} tk_core ____cacheline_aligned = {
.seq = SEQCNT_RAW_SPINLOCK_ZERO(tk_core.seq, &timekeeper_lock),
};
-static struct timekeeper shadow_timekeeper;
/* flag for if timekeeping is suspended */
int __read_mostly timekeeping_suspended;
@@ -776,8 +776,7 @@ static void timekeeping_update(struct timekeeper *tk, unsigned int action)
* timekeeper structure on the next update with stale data
*/
if (action & TK_MIRROR)
- memcpy(&shadow_timekeeper, &tk_core.timekeeper,
- sizeof(tk_core.timekeeper));
+ memcpy(&tk_core.shadow_timekeeper, &tk_core.timekeeper, sizeof(tk_core.timekeeper));
}
/**
@@ -2274,8 +2273,8 @@ static u64 logarithmic_accumulation(struct timekeeper *tk, u64 offset,
*/
static bool timekeeping_advance(enum timekeeping_adv_mode mode)
{
+ struct timekeeper *tk = &tk_core.shadow_timekeeper;
struct timekeeper *real_tk = &tk_core.timekeeper;
- struct timekeeper *tk = &shadow_timekeeper;
unsigned int clock_set = 0;
int shift = 0, maxshift;
u64 offset;
--
2.39.2
^ permalink raw reply related [flat|nested] 27+ messages in thread
* [PATCH 08/24] timekeeping: Encapsulate locking/unlocking of timekeeper_lock
2024-09-11 13:29 [PATCH 00/24] timekeeping: Rework to prepare support of indenpendent PTP clocks Anna-Maria Behnsen
` (6 preceding siblings ...)
2024-09-11 13:29 ` [PATCH 07/24] timekeeping: Move shadow_timekeeper into tk_core Anna-Maria Behnsen
@ 2024-09-11 13:29 ` Anna-Maria Behnsen
2024-09-11 13:29 ` [PATCH 09/24] timekeeping: Move timekeeper_lock into tk_core Anna-Maria Behnsen
` (15 subsequent siblings)
23 siblings, 0 replies; 27+ messages in thread
From: Anna-Maria Behnsen @ 2024-09-11 13:29 UTC (permalink / raw)
To: John Stultz, Frederic Weisbecker, Thomas Gleixner
Cc: linux-kernel, netdev, Miroslav Lichvar, Richard Cochran,
Christopher S Hall, Anna-Maria Behnsen
From: Thomas Gleixner <tglx@linutronix.de>
timekeeper_lock protects updates of timekeeper (tk_core). It is also used
by vdso_update_begin/end() and not only internally by the timekeeper code.
As long as there is only a single timekeeper, this works fine. But when
the timekeeper infrastructure will be reused for per ptp clock timekeepers,
timekeeper_lock needs to be part of tk_core..
Therefore encapuslate locking/unlocking of timekeeper_lock and make the
lock static.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
---
kernel/time/timekeeping.c | 15 ++++++++++++++-
kernel/time/timekeeping_internal.h | 3 ++-
kernel/time/vsyscall.c | 5 ++---
3 files changed, 18 insertions(+), 5 deletions(-)
diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index e35e00b0cdd4..e0b6d088a33b 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -41,7 +41,7 @@ enum timekeeping_adv_mode {
TK_ADV_FREQ
};
-DEFINE_RAW_SPINLOCK(timekeeper_lock);
+static DEFINE_RAW_SPINLOCK(timekeeper_lock);
/*
* The most important data for readout fits into a single 64 byte
@@ -114,6 +114,19 @@ static struct tk_fast tk_fast_raw ____cacheline_aligned = {
.base[1] = FAST_TK_INIT,
};
+unsigned long timekeeper_lock_irqsave(void)
+{
+ unsigned long flags;
+
+ raw_spin_lock_irqsave(&timekeeper_lock, flags);
+ return flags;
+}
+
+void timekeeper_unlock_irqrestore(unsigned long flags)
+{
+ raw_spin_unlock_irqrestore(&timekeeper_lock, flags);
+}
+
static inline void tk_normalize_xtime(struct timekeeper *tk)
{
while (tk->tkr_mono.xtime_nsec >= ((u64)NSEC_PER_SEC << tk->tkr_mono.shift)) {
diff --git a/kernel/time/timekeeping_internal.h b/kernel/time/timekeeping_internal.h
index 4ca2787d1642..3babb2bac579 100644
--- a/kernel/time/timekeeping_internal.h
+++ b/kernel/time/timekeeping_internal.h
@@ -34,6 +34,7 @@ static inline u64 clocksource_delta(u64 now, u64 last, u64 mask)
#endif
/* Semi public for serialization of non timekeeper VDSO updates. */
-extern raw_spinlock_t timekeeper_lock;
+unsigned long timekeeper_lock_irqsave(void);
+void timekeeper_unlock_irqrestore(unsigned long flags);
#endif /* _TIMEKEEPING_INTERNAL_H */
diff --git a/kernel/time/vsyscall.c b/kernel/time/vsyscall.c
index 9193d6133e5d..98488b20b594 100644
--- a/kernel/time/vsyscall.c
+++ b/kernel/time/vsyscall.c
@@ -151,9 +151,8 @@ void update_vsyscall_tz(void)
unsigned long vdso_update_begin(void)
{
struct vdso_data *vdata = __arch_get_k_vdso_data();
- unsigned long flags;
+ unsigned long flags = timekeeper_lock_irqsave();
- raw_spin_lock_irqsave(&timekeeper_lock, flags);
vdso_write_begin(vdata);
return flags;
}
@@ -172,5 +171,5 @@ void vdso_update_end(unsigned long flags)
vdso_write_end(vdata);
__arch_sync_vdso_data(vdata);
- raw_spin_unlock_irqrestore(&timekeeper_lock, flags);
+ timekeeper_unlock_irqrestore(flags);
}
--
2.39.2
^ permalink raw reply related [flat|nested] 27+ messages in thread
* [PATCH 09/24] timekeeping: Move timekeeper_lock into tk_core
2024-09-11 13:29 [PATCH 00/24] timekeeping: Rework to prepare support of indenpendent PTP clocks Anna-Maria Behnsen
` (7 preceding siblings ...)
2024-09-11 13:29 ` [PATCH 08/24] timekeeping: Encapsulate locking/unlocking of timekeeper_lock Anna-Maria Behnsen
@ 2024-09-11 13:29 ` Anna-Maria Behnsen
2024-09-11 13:29 ` [PATCH 10/24] timekeeping: Define a struct type for tk_core to make it reusable Anna-Maria Behnsen
` (14 subsequent siblings)
23 siblings, 0 replies; 27+ messages in thread
From: Anna-Maria Behnsen @ 2024-09-11 13:29 UTC (permalink / raw)
To: John Stultz, Frederic Weisbecker, Thomas Gleixner
Cc: linux-kernel, netdev, Miroslav Lichvar, Richard Cochran,
Christopher S Hall, Anna-Maria Behnsen
timekeeper_lock protects updates to struct tk_core but is not part of
struct tk_core. As long as there is only a single timekeeper, this is not a
problem. But when the timekeeper infrastructure will be reused for per ptp
clock timekeepers, timekeeper_lock needs to be part of tk_core.
Move the lock into tk_core, move initialisation of the lock and sequence
counter into timekeeping_init() and update all users of timekeeper_lock.
As this is touching all lock sites, convert them to use:
guard(raw_spinlock_irqsave)(&tk_core.lock);
instead of lock/unlock functions whenever possible.
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
---
kernel/time/timekeeping.c | 77 +++++++++++++++++++++--------------------------
1 file changed, 34 insertions(+), 43 deletions(-)
diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index e0b6d088a33b..f8ffab5bb790 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -41,8 +41,6 @@ enum timekeeping_adv_mode {
TK_ADV_FREQ
};
-static DEFINE_RAW_SPINLOCK(timekeeper_lock);
-
/*
* The most important data for readout fits into a single 64 byte
* cache line.
@@ -51,10 +49,8 @@ static struct {
seqcount_raw_spinlock_t seq;
struct timekeeper timekeeper;
struct timekeeper shadow_timekeeper;
-} tk_core ____cacheline_aligned = {
- .seq = SEQCNT_RAW_SPINLOCK_ZERO(tk_core.seq, &timekeeper_lock),
-};
-
+ raw_spinlock_t lock;
+} tk_core ____cacheline_aligned;
/* flag for if timekeeping is suspended */
int __read_mostly timekeeping_suspended;
@@ -118,13 +114,13 @@ unsigned long timekeeper_lock_irqsave(void)
{
unsigned long flags;
- raw_spin_lock_irqsave(&timekeeper_lock, flags);
+ raw_spin_lock_irqsave(&tk_core.lock, flags);
return flags;
}
void timekeeper_unlock_irqrestore(unsigned long flags)
{
- raw_spin_unlock_irqrestore(&timekeeper_lock, flags);
+ raw_spin_unlock_irqrestore(&tk_core.lock, flags);
}
static inline void tk_normalize_xtime(struct timekeeper *tk)
@@ -197,7 +193,7 @@ static inline void tk_update_sleep_time(struct timekeeper *tk, ktime_t delta)
* the tkr's clocksource may change between the read reference, and the
* clock reference passed to the read function. This can cause crashes if
* the wrong clocksource is passed to the wrong read function.
- * This isn't necessary to use when holding the timekeeper_lock or doing
+ * This isn't necessary to use when holding the tk_core.lock or doing
* a read of the fast-timekeeper tkrs (which is protected by its own locking
* and update logic).
*/
@@ -689,13 +685,11 @@ static void update_pvclock_gtod(struct timekeeper *tk, bool was_set)
int pvclock_gtod_register_notifier(struct notifier_block *nb)
{
struct timekeeper *tk = &tk_core.timekeeper;
- unsigned long flags;
int ret;
- raw_spin_lock_irqsave(&timekeeper_lock, flags);
+ guard(raw_spinlock_irqsave)(&tk_core.lock);
ret = raw_notifier_chain_register(&pvclock_gtod_chain, nb);
update_pvclock_gtod(tk, true);
- raw_spin_unlock_irqrestore(&timekeeper_lock, flags);
return ret;
}
@@ -708,14 +702,8 @@ EXPORT_SYMBOL_GPL(pvclock_gtod_register_notifier);
*/
int pvclock_gtod_unregister_notifier(struct notifier_block *nb)
{
- unsigned long flags;
- int ret;
-
- raw_spin_lock_irqsave(&timekeeper_lock, flags);
- ret = raw_notifier_chain_unregister(&pvclock_gtod_chain, nb);
- raw_spin_unlock_irqrestore(&timekeeper_lock, flags);
-
- return ret;
+ guard(raw_spinlock_irqsave)(&tk_core.lock);
+ return raw_notifier_chain_unregister(&pvclock_gtod_chain, nb);
}
EXPORT_SYMBOL_GPL(pvclock_gtod_unregister_notifier);
@@ -763,7 +751,7 @@ static inline void tk_update_ktime_data(struct timekeeper *tk)
tk->tkr_raw.base = ns_to_ktime(tk->raw_sec * NSEC_PER_SEC);
}
-/* must hold timekeeper_lock */
+/* must hold tk_core.lock */
static void timekeeping_update(struct timekeeper *tk, unsigned int action)
{
if (action & TK_CLEAR_NTP) {
@@ -1460,7 +1448,7 @@ int do_settimeofday64(const struct timespec64 *ts)
if (!timespec64_valid_settod(ts))
return -EINVAL;
- raw_spin_lock_irqsave(&timekeeper_lock, flags);
+ raw_spin_lock_irqsave(&tk_core.lock, flags);
write_seqcount_begin(&tk_core.seq);
timekeeping_forward_now(tk);
@@ -1480,7 +1468,7 @@ int do_settimeofday64(const struct timespec64 *ts)
timekeeping_update(tk, TK_CLEAR_NTP | TK_MIRROR | TK_CLOCK_WAS_SET);
write_seqcount_end(&tk_core.seq);
- raw_spin_unlock_irqrestore(&timekeeper_lock, flags);
+ raw_spin_unlock_irqrestore(&tk_core.lock, flags);
/* Signal hrtimers about time change */
clock_was_set(CLOCK_SET_WALL);
@@ -1510,7 +1498,7 @@ static int timekeeping_inject_offset(const struct timespec64 *ts)
if (ts->tv_nsec < 0 || ts->tv_nsec >= NSEC_PER_SEC)
return -EINVAL;
- raw_spin_lock_irqsave(&timekeeper_lock, flags);
+ raw_spin_lock_irqsave(&tk_core.lock, flags);
write_seqcount_begin(&tk_core.seq);
timekeeping_forward_now(tk);
@@ -1530,7 +1518,7 @@ static int timekeeping_inject_offset(const struct timespec64 *ts)
timekeeping_update(tk, TK_CLEAR_NTP | TK_MIRROR | TK_CLOCK_WAS_SET);
write_seqcount_end(&tk_core.seq);
- raw_spin_unlock_irqrestore(&timekeeper_lock, flags);
+ raw_spin_unlock_irqrestore(&tk_core.lock, flags);
/* Signal hrtimers about time change */
clock_was_set(CLOCK_SET_WALL);
@@ -1606,7 +1594,7 @@ static int change_clocksource(void *data)
return 0;
}
- raw_spin_lock_irqsave(&timekeeper_lock, flags);
+ raw_spin_lock_irqsave(&tk_core.lock, flags);
write_seqcount_begin(&tk_core.seq);
timekeeping_forward_now(tk);
@@ -1615,7 +1603,7 @@ static int change_clocksource(void *data)
timekeeping_update(tk, TK_CLEAR_NTP | TK_MIRROR | TK_CLOCK_WAS_SET);
write_seqcount_end(&tk_core.seq);
- raw_spin_unlock_irqrestore(&timekeeper_lock, flags);
+ raw_spin_unlock_irqrestore(&tk_core.lock, flags);
if (old) {
if (old->disable)
@@ -1744,6 +1732,12 @@ read_persistent_wall_and_boot_offset(struct timespec64 *wall_time,
*boot_offset = ns_to_timespec64(local_clock());
}
+static __init void tkd_basic_setup(struct tk_data *tkd)
+{
+ raw_spin_lock_init(&tkd->lock);
+ seqcount_raw_spinlock_init(&tkd->seq, &tkd->lock);
+}
+
/*
* Flag reflecting whether timekeeping_resume() has injected sleeptime.
*
@@ -1770,7 +1764,8 @@ void __init timekeeping_init(void)
struct timespec64 wall_time, boot_offset, wall_to_mono;
struct timekeeper *tk = &tk_core.timekeeper;
struct clocksource *clock;
- unsigned long flags;
+
+ tkd_basic_setup(&tk_core);
read_persistent_wall_and_boot_offset(&wall_time, &boot_offset);
if (timespec64_valid_settod(&wall_time) &&
@@ -1790,7 +1785,7 @@ void __init timekeeping_init(void)
*/
wall_to_mono = timespec64_sub(boot_offset, wall_time);
- raw_spin_lock_irqsave(&timekeeper_lock, flags);
+ guard(raw_spinlock_irqsave)(&tk_core.lock);
write_seqcount_begin(&tk_core.seq);
ntp_init();
@@ -1807,7 +1802,6 @@ void __init timekeeping_init(void)
timekeeping_update(tk, TK_MIRROR | TK_CLOCK_WAS_SET);
write_seqcount_end(&tk_core.seq);
- raw_spin_unlock_irqrestore(&timekeeper_lock, flags);
}
/* time in seconds when suspend began for persistent clock */
@@ -1888,7 +1882,7 @@ void timekeeping_inject_sleeptime64(const struct timespec64 *delta)
struct timekeeper *tk = &tk_core.timekeeper;
unsigned long flags;
- raw_spin_lock_irqsave(&timekeeper_lock, flags);
+ raw_spin_lock_irqsave(&tk_core.lock, flags);
write_seqcount_begin(&tk_core.seq);
suspend_timing_needed = false;
@@ -1900,7 +1894,7 @@ void timekeeping_inject_sleeptime64(const struct timespec64 *delta)
timekeeping_update(tk, TK_CLEAR_NTP | TK_MIRROR | TK_CLOCK_WAS_SET);
write_seqcount_end(&tk_core.seq);
- raw_spin_unlock_irqrestore(&timekeeper_lock, flags);
+ raw_spin_unlock_irqrestore(&tk_core.lock, flags);
/* Signal hrtimers about time change */
clock_was_set(CLOCK_SET_WALL | CLOCK_SET_BOOT);
@@ -1924,7 +1918,7 @@ void timekeeping_resume(void)
clockevents_resume();
clocksource_resume();
- raw_spin_lock_irqsave(&timekeeper_lock, flags);
+ raw_spin_lock_irqsave(&tk_core.lock, flags);
write_seqcount_begin(&tk_core.seq);
/*
@@ -1962,7 +1956,7 @@ void timekeeping_resume(void)
timekeeping_suspended = 0;
timekeeping_update(tk, TK_MIRROR | TK_CLOCK_WAS_SET);
write_seqcount_end(&tk_core.seq);
- raw_spin_unlock_irqrestore(&timekeeper_lock, flags);
+ raw_spin_unlock_irqrestore(&tk_core.lock, flags);
touch_softlockup_watchdog();
@@ -1993,7 +1987,7 @@ int timekeeping_suspend(void)
suspend_timing_needed = true;
- raw_spin_lock_irqsave(&timekeeper_lock, flags);
+ raw_spin_lock_irqsave(&tk_core.lock, flags);
write_seqcount_begin(&tk_core.seq);
timekeeping_forward_now(tk);
timekeeping_suspended = 1;
@@ -2032,7 +2026,7 @@ int timekeeping_suspend(void)
timekeeping_update(tk, TK_MIRROR);
halt_fast_timekeeper(tk);
write_seqcount_end(&tk_core.seq);
- raw_spin_unlock_irqrestore(&timekeeper_lock, flags);
+ raw_spin_unlock_irqrestore(&tk_core.lock, flags);
tick_suspend();
clocksource_suspend();
@@ -2292,7 +2286,7 @@ static bool timekeeping_advance(enum timekeeping_adv_mode mode)
int shift = 0, maxshift;
u64 offset;
- guard(raw_spinlock_irqsave)(&timekeeper_lock);
+ guard(raw_spinlock_irqsave)(&tk_core.lock);
/* Make sure we're fully resumed: */
if (unlikely(timekeeping_suspended))
@@ -2587,7 +2581,7 @@ int do_adjtimex(struct __kernel_timex *txc)
ktime_get_real_ts64(&ts);
add_device_randomness(&ts, sizeof(ts));
- raw_spin_lock_irqsave(&timekeeper_lock, flags);
+ raw_spin_lock_irqsave(&tk_core.lock, flags);
write_seqcount_begin(&tk_core.seq);
orig_tai = tai = tk->tai_offset;
@@ -2602,7 +2596,7 @@ int do_adjtimex(struct __kernel_timex *txc)
}
write_seqcount_end(&tk_core.seq);
- raw_spin_unlock_irqrestore(&timekeeper_lock, flags);
+ raw_spin_unlock_irqrestore(&tk_core.lock, flags);
audit_ntp_log(&ad);
@@ -2626,11 +2620,8 @@ int do_adjtimex(struct __kernel_timex *txc)
*/
void hardpps(const struct timespec64 *phase_ts, const struct timespec64 *raw_ts)
{
- unsigned long flags;
-
- raw_spin_lock_irqsave(&timekeeper_lock, flags);
+ guard(raw_spinlock_irqsave)(&tk_core.lock);
__hardpps(phase_ts, raw_ts);
- raw_spin_unlock_irqrestore(&timekeeper_lock, flags);
}
EXPORT_SYMBOL(hardpps);
#endif /* CONFIG_NTP_PPS */
--
2.39.2
^ permalink raw reply related [flat|nested] 27+ messages in thread
* [PATCH 10/24] timekeeping: Define a struct type for tk_core to make it reusable
2024-09-11 13:29 [PATCH 00/24] timekeeping: Rework to prepare support of indenpendent PTP clocks Anna-Maria Behnsen
` (8 preceding siblings ...)
2024-09-11 13:29 ` [PATCH 09/24] timekeeping: Move timekeeper_lock into tk_core Anna-Maria Behnsen
@ 2024-09-11 13:29 ` Anna-Maria Behnsen
2024-09-12 7:38 ` Simon Horman
2024-09-11 13:29 ` [PATCH 11/24] timekeeping: Add struct tk_data as argument to timekeeping_update() Anna-Maria Behnsen
` (13 subsequent siblings)
23 siblings, 1 reply; 27+ messages in thread
From: Anna-Maria Behnsen @ 2024-09-11 13:29 UTC (permalink / raw)
To: John Stultz, Frederic Weisbecker, Thomas Gleixner
Cc: linux-kernel, netdev, Miroslav Lichvar, Richard Cochran,
Christopher S Hall, Anna-Maria Behnsen
The struct tk_core uses is not reusable. As long as there is only a single
timekeeper, this is not a problem. But when the timekeeper infrastructure
will be reused for per ptp clock timekeepers, an explicit struct type is
required.
Define struct tk_data as explicit struct type for tk_core.
Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
---
kernel/time/timekeeping.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index f8ffab5bb790..be939ce3bcfc 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -45,12 +45,14 @@ enum timekeeping_adv_mode {
* The most important data for readout fits into a single 64 byte
* cache line.
*/
-static struct {
+struct tk_data {
seqcount_raw_spinlock_t seq;
struct timekeeper timekeeper;
struct timekeeper shadow_timekeeper;
raw_spinlock_t lock;
-} tk_core ____cacheline_aligned;
+} ____cacheline_aligned;
+
+static struct tk_data tk_core;
/* flag for if timekeeping is suspended */
int __read_mostly timekeeping_suspended;
--
2.39.2
^ permalink raw reply related [flat|nested] 27+ messages in thread
* [PATCH 11/24] timekeeping: Add struct tk_data as argument to timekeeping_update()
2024-09-11 13:29 [PATCH 00/24] timekeeping: Rework to prepare support of indenpendent PTP clocks Anna-Maria Behnsen
` (9 preceding siblings ...)
2024-09-11 13:29 ` [PATCH 10/24] timekeeping: Define a struct type for tk_core to make it reusable Anna-Maria Behnsen
@ 2024-09-11 13:29 ` Anna-Maria Behnsen
2024-09-11 13:29 ` [PATCH 12/24] timekeeping: Split out timekeeper update of timekeeping_advanced() Anna-Maria Behnsen
` (12 subsequent siblings)
23 siblings, 0 replies; 27+ messages in thread
From: Anna-Maria Behnsen @ 2024-09-11 13:29 UTC (permalink / raw)
To: John Stultz, Frederic Weisbecker, Thomas Gleixner
Cc: linux-kernel, netdev, Miroslav Lichvar, Richard Cochran,
Christopher S Hall, Anna-Maria Behnsen
Updates of the timekeeper are done in two ways:
1. Updating timekeeper and afterwards memcpy()'ing the result into
shadow_timekeeper using timekeeping_update(). Used everywhere for
updates except in timekeeping_advance(); the sequence counter protected
region starts before the first change to the timekeeper is done.
2. Updating shadow_timekeeper and then memcpy()'ing the result into
timekeeper. Used only by in timekeeping_advance(); The seqence counter
protected region is only around timekeeping_update() and the memcpy for
copy from shadow to timekeeper.
The second option is fast path optimized. The sequence counter protected
region is as short as possible.
As this behaviour is mainly documented by commit messages, but not in code,
it makes the not easy timekeeping code more complicated to read.
There is no reason why updates to the timekeeper can't use the optimized
version everywhere. With this, the code will be cleaner, as code is reused
instead of duplicated.
To be able to access tk_data which contains all required information, add a
pointer to tk_data as an argument to timekeeping_update(). With that
convert the comment about holding the lock into a lockdep assert.
No functional change.
Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
---
kernel/time/timekeeping.c | 25 +++++++++++++------------
1 file changed, 13 insertions(+), 12 deletions(-)
diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index be939ce3bcfc..d92b1f41ba1b 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -528,7 +528,7 @@ EXPORT_SYMBOL_GPL(ktime_get_raw_fast_ns);
* timekeeping_inject_sleeptime64()
* __timekeeping_inject_sleeptime(tk, delta);
* timestamp();
- * timekeeping_update(tk, TK_CLEAR_NTP...);
+ * timekeeping_update(tkd, tk, TK_CLEAR_NTP...);
*
* (2) On 32-bit systems, the 64-bit boot offset (tk->offs_boot) may be
* partially updated. Since the tk->offs_boot update is a rare event, this
@@ -753,9 +753,10 @@ static inline void tk_update_ktime_data(struct timekeeper *tk)
tk->tkr_raw.base = ns_to_ktime(tk->raw_sec * NSEC_PER_SEC);
}
-/* must hold tk_core.lock */
-static void timekeeping_update(struct timekeeper *tk, unsigned int action)
+static void timekeeping_update(struct tk_data *tkd, struct timekeeper *tk, unsigned int action)
{
+ lockdep_assert_held(&tkd->lock);
+
if (action & TK_CLEAR_NTP) {
tk->ntp_error = 0;
ntp_clear();
@@ -1467,7 +1468,7 @@ int do_settimeofday64(const struct timespec64 *ts)
tk_set_xtime(tk, ts);
out:
- timekeeping_update(tk, TK_CLEAR_NTP | TK_MIRROR | TK_CLOCK_WAS_SET);
+ timekeeping_update(&tk_core, tk, TK_CLEAR_NTP | TK_MIRROR | TK_CLOCK_WAS_SET);
write_seqcount_end(&tk_core.seq);
raw_spin_unlock_irqrestore(&tk_core.lock, flags);
@@ -1517,7 +1518,7 @@ static int timekeeping_inject_offset(const struct timespec64 *ts)
tk_set_wall_to_mono(tk, timespec64_sub(tk->wall_to_monotonic, *ts));
error: /* even if we error out, we forwarded the time, so call update */
- timekeeping_update(tk, TK_CLEAR_NTP | TK_MIRROR | TK_CLOCK_WAS_SET);
+ timekeeping_update(&tk_core, tk, TK_CLEAR_NTP | TK_MIRROR | TK_CLOCK_WAS_SET);
write_seqcount_end(&tk_core.seq);
raw_spin_unlock_irqrestore(&tk_core.lock, flags);
@@ -1602,7 +1603,7 @@ static int change_clocksource(void *data)
timekeeping_forward_now(tk);
old = tk->tkr_mono.clock;
tk_setup_internals(tk, new);
- timekeeping_update(tk, TK_CLEAR_NTP | TK_MIRROR | TK_CLOCK_WAS_SET);
+ timekeeping_update(&tk_core, tk, TK_CLEAR_NTP | TK_MIRROR | TK_CLOCK_WAS_SET);
write_seqcount_end(&tk_core.seq);
raw_spin_unlock_irqrestore(&tk_core.lock, flags);
@@ -1801,7 +1802,7 @@ void __init timekeeping_init(void)
tk_set_wall_to_mono(tk, wall_to_mono);
- timekeeping_update(tk, TK_MIRROR | TK_CLOCK_WAS_SET);
+ timekeeping_update(&tk_core, tk, TK_MIRROR | TK_CLOCK_WAS_SET);
write_seqcount_end(&tk_core.seq);
}
@@ -1893,7 +1894,7 @@ void timekeeping_inject_sleeptime64(const struct timespec64 *delta)
__timekeeping_inject_sleeptime(tk, delta);
- timekeeping_update(tk, TK_CLEAR_NTP | TK_MIRROR | TK_CLOCK_WAS_SET);
+ timekeeping_update(&tk_core, tk, TK_CLEAR_NTP | TK_MIRROR | TK_CLOCK_WAS_SET);
write_seqcount_end(&tk_core.seq);
raw_spin_unlock_irqrestore(&tk_core.lock, flags);
@@ -1956,7 +1957,7 @@ void timekeeping_resume(void)
tk->ntp_error = 0;
timekeeping_suspended = 0;
- timekeeping_update(tk, TK_MIRROR | TK_CLOCK_WAS_SET);
+ timekeeping_update(&tk_core, tk, TK_MIRROR | TK_CLOCK_WAS_SET);
write_seqcount_end(&tk_core.seq);
raw_spin_unlock_irqrestore(&tk_core.lock, flags);
@@ -2025,7 +2026,7 @@ int timekeeping_suspend(void)
}
}
- timekeeping_update(tk, TK_MIRROR);
+ timekeeping_update(&tk_core, tk, TK_MIRROR);
halt_fast_timekeeper(tk);
write_seqcount_end(&tk_core.seq);
raw_spin_unlock_irqrestore(&tk_core.lock, flags);
@@ -2343,7 +2344,7 @@ static bool timekeeping_advance(enum timekeeping_adv_mode mode)
* memcpy under the tk_core.seq against one before we start
* updating.
*/
- timekeeping_update(tk, clock_set);
+ timekeeping_update(&tk_core, tk, clock_set);
memcpy(real_tk, tk, sizeof(*tk));
/* The memcpy must come last. Do not put anything here! */
write_seqcount_end(&tk_core.seq);
@@ -2591,7 +2592,7 @@ int do_adjtimex(struct __kernel_timex *txc)
if (tai != orig_tai) {
__timekeeping_set_tai_offset(tk, tai);
- timekeeping_update(tk, TK_MIRROR | TK_CLOCK_WAS_SET);
+ timekeeping_update(&tk_core, tk, TK_MIRROR | TK_CLOCK_WAS_SET);
clock_set = true;
} else {
tk_update_leap_state(tk);
--
2.39.2
^ permalink raw reply related [flat|nested] 27+ messages in thread
* [PATCH 12/24] timekeeping: Split out timekeeper update of timekeeping_advanced()
2024-09-11 13:29 [PATCH 00/24] timekeeping: Rework to prepare support of indenpendent PTP clocks Anna-Maria Behnsen
` (10 preceding siblings ...)
2024-09-11 13:29 ` [PATCH 11/24] timekeeping: Add struct tk_data as argument to timekeeping_update() Anna-Maria Behnsen
@ 2024-09-11 13:29 ` Anna-Maria Behnsen
2024-09-11 13:29 ` [PATCH 13/24] timekeeping: Introduce combined timekeeping action flag Anna-Maria Behnsen
` (11 subsequent siblings)
23 siblings, 0 replies; 27+ messages in thread
From: Anna-Maria Behnsen @ 2024-09-11 13:29 UTC (permalink / raw)
To: John Stultz, Frederic Weisbecker, Thomas Gleixner
Cc: linux-kernel, netdev, Miroslav Lichvar, Richard Cochran,
Christopher S Hall, Anna-Maria Behnsen
timekeeping_advance() is the only optimized function which uses
shadow_timekeeper for updating the real timekeeper to keep the sequence
counter protected region as small as possible.
To be able to transform timekeeper updates in other functions to use the
same logic, split out functionality into a separate function
timekeeper_update_staged().
While at it, document the reason why the sequence counter must be write
held over the call to timekeeping_update() and the copying to the real
timekeeper and why using a pointer based update is suboptimal.
No functional change.
Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
---
kernel/time/timekeeping.c | 43 +++++++++++++++++++++++++++----------------
1 file changed, 27 insertions(+), 16 deletions(-)
diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index d92b1f41ba1b..bf8814508cdf 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -780,7 +780,32 @@ static void timekeeping_update(struct tk_data *tkd, struct timekeeper *tk, unsig
* timekeeper structure on the next update with stale data
*/
if (action & TK_MIRROR)
- memcpy(&tk_core.shadow_timekeeper, &tk_core.timekeeper, sizeof(tk_core.timekeeper));
+ memcpy(&tkd->shadow_timekeeper, tk, sizeof(*tk));
+}
+
+static void timekeeping_update_staged(struct tk_data *tkd, unsigned int action)
+{
+ /*
+ * Block out readers before invoking timekeeping_update() because
+ * that updates VDSO and other time related infrastructure. Not
+ * blocking the readers might let a reader see time going backwards
+ * when reading from the VDSO after the VDSO update and then
+ * reading in the kernel from the timekeeper before that got updated.
+ */
+ write_seqcount_begin(&tkd->seq);
+
+ timekeeping_update(tkd, &tkd->shadow_timekeeper, action);
+
+ /*
+ * Update the real timekeeper.
+ *
+ * We could avoid this memcpy() by switching pointers, but that has
+ * the downside that the reader side does not longer benefit from
+ * the cacheline optimized data layout of the timekeeper and requires
+ * another indirection.
+ */
+ memcpy(&tkd->timekeeper, &tkd->shadow_timekeeper, sizeof(tkd->shadow_timekeeper));
+ write_seqcount_end(&tkd->seq);
}
/**
@@ -2333,21 +2358,7 @@ static bool timekeeping_advance(enum timekeeping_adv_mode mode)
*/
clock_set |= accumulate_nsecs_to_secs(tk);
- write_seqcount_begin(&tk_core.seq);
- /*
- * Update the real timekeeper.
- *
- * We could avoid this memcpy by switching pointers, but that
- * requires changes to all other timekeeper usage sites as
- * well, i.e. move the timekeeper pointer getter into the
- * spinlocked/seqcount protected sections. And we trade this
- * memcpy under the tk_core.seq against one before we start
- * updating.
- */
- timekeeping_update(&tk_core, tk, clock_set);
- memcpy(real_tk, tk, sizeof(*tk));
- /* The memcpy must come last. Do not put anything here! */
- write_seqcount_end(&tk_core.seq);
+ timekeeping_update_staged(&tk_core, clock_set);
return !!clock_set;
}
--
2.39.2
^ permalink raw reply related [flat|nested] 27+ messages in thread
* [PATCH 13/24] timekeeping: Introduce combined timekeeping action flag
2024-09-11 13:29 [PATCH 00/24] timekeeping: Rework to prepare support of indenpendent PTP clocks Anna-Maria Behnsen
` (11 preceding siblings ...)
2024-09-11 13:29 ` [PATCH 12/24] timekeeping: Split out timekeeper update of timekeeping_advanced() Anna-Maria Behnsen
@ 2024-09-11 13:29 ` Anna-Maria Behnsen
2024-09-11 13:29 ` [PATCH 14/24] timekeeping: Provide timekeeping_restore_shadow() Anna-Maria Behnsen
` (10 subsequent siblings)
23 siblings, 0 replies; 27+ messages in thread
From: Anna-Maria Behnsen @ 2024-09-11 13:29 UTC (permalink / raw)
To: John Stultz, Frederic Weisbecker, Thomas Gleixner
Cc: linux-kernel, netdev, Miroslav Lichvar, Richard Cochran,
Christopher S Hall, Anna-Maria Behnsen
Instead of explicitly listing all the separate timekeeping actions flags,
introduce a new one which covers all actions except TK_MIRROR action.
No functional change.
Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
---
kernel/time/timekeeping.c | 10 ++++++----
1 file changed, 6 insertions(+), 4 deletions(-)
diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index bf8814508cdf..a69850429fb4 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -33,6 +33,8 @@
#define TK_MIRROR (1 << 1)
#define TK_CLOCK_WAS_SET (1 << 2)
+#define TK_UPDATE_ALL (TK_CLEAR_NTP | TK_CLOCK_WAS_SET)
+
enum timekeeping_adv_mode {
/* Update timekeeper when a tick has passed */
TK_ADV_TICK,
@@ -1493,7 +1495,7 @@ int do_settimeofday64(const struct timespec64 *ts)
tk_set_xtime(tk, ts);
out:
- timekeeping_update(&tk_core, tk, TK_CLEAR_NTP | TK_MIRROR | TK_CLOCK_WAS_SET);
+ timekeeping_update(&tk_core, tk, TK_UPDATE_ALL | TK_MIRROR);
write_seqcount_end(&tk_core.seq);
raw_spin_unlock_irqrestore(&tk_core.lock, flags);
@@ -1543,7 +1545,7 @@ static int timekeeping_inject_offset(const struct timespec64 *ts)
tk_set_wall_to_mono(tk, timespec64_sub(tk->wall_to_monotonic, *ts));
error: /* even if we error out, we forwarded the time, so call update */
- timekeeping_update(&tk_core, tk, TK_CLEAR_NTP | TK_MIRROR | TK_CLOCK_WAS_SET);
+ timekeeping_update(&tk_core, tk, TK_UPDATE_ALL | TK_MIRROR);
write_seqcount_end(&tk_core.seq);
raw_spin_unlock_irqrestore(&tk_core.lock, flags);
@@ -1628,7 +1630,7 @@ static int change_clocksource(void *data)
timekeeping_forward_now(tk);
old = tk->tkr_mono.clock;
tk_setup_internals(tk, new);
- timekeeping_update(&tk_core, tk, TK_CLEAR_NTP | TK_MIRROR | TK_CLOCK_WAS_SET);
+ timekeeping_update(&tk_core, tk, TK_UPDATE_ALL | TK_MIRROR);
write_seqcount_end(&tk_core.seq);
raw_spin_unlock_irqrestore(&tk_core.lock, flags);
@@ -1919,7 +1921,7 @@ void timekeeping_inject_sleeptime64(const struct timespec64 *delta)
__timekeeping_inject_sleeptime(tk, delta);
- timekeeping_update(&tk_core, tk, TK_CLEAR_NTP | TK_MIRROR | TK_CLOCK_WAS_SET);
+ timekeeping_update(&tk_core, tk, TK_UPDATE_ALL | TK_MIRROR);
write_seqcount_end(&tk_core.seq);
raw_spin_unlock_irqrestore(&tk_core.lock, flags);
--
2.39.2
^ permalink raw reply related [flat|nested] 27+ messages in thread
* [PATCH 14/24] timekeeping: Provide timekeeping_restore_shadow()
2024-09-11 13:29 [PATCH 00/24] timekeeping: Rework to prepare support of indenpendent PTP clocks Anna-Maria Behnsen
` (12 preceding siblings ...)
2024-09-11 13:29 ` [PATCH 13/24] timekeeping: Introduce combined timekeeping action flag Anna-Maria Behnsen
@ 2024-09-11 13:29 ` Anna-Maria Behnsen
2024-09-11 13:29 ` [PATCH 15/24] timekeeping: Rework do_settimeofday64() to use shadow_timekeeper Anna-Maria Behnsen
` (9 subsequent siblings)
23 siblings, 0 replies; 27+ messages in thread
From: Anna-Maria Behnsen @ 2024-09-11 13:29 UTC (permalink / raw)
To: John Stultz, Frederic Weisbecker, Thomas Gleixner
Cc: linux-kernel, netdev, Miroslav Lichvar, Richard Cochran,
Christopher S Hall, Anna-Maria Behnsen
From: Thomas Gleixner <tglx@linutronix.de>
Functions which operate on the real timekeeper, e.g. do_settimeofday(),
have error conditions. If they are hit a full timekeeping update is still
required because the already committed operations modified the timekeeper.
When switching these functions to operate on the shadow timekeeper then the
full update can be avoided in the error case, but the modified shadow
timekeeper has to be restored.
Provide a helper function for that.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
---
kernel/time/timekeeping.c | 11 ++++++++++-
1 file changed, 10 insertions(+), 1 deletion(-)
diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index a69850429fb4..345117ff665d 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -755,6 +755,15 @@ static inline void tk_update_ktime_data(struct timekeeper *tk)
tk->tkr_raw.base = ns_to_ktime(tk->raw_sec * NSEC_PER_SEC);
}
+/*
+ * Restore the shadow timekeeper from the real timekeeper.
+ */
+static void timekeeping_restore_shadow(struct tk_data *tkd)
+{
+ lockdep_assert_held(&tkd->lock);
+ memcpy(&tkd->shadow_timekeeper, &tkd->timekeeper, sizeof(tkd->timekeeper));
+}
+
static void timekeeping_update(struct tk_data *tkd, struct timekeeper *tk, unsigned int action)
{
lockdep_assert_held(&tkd->lock);
@@ -782,7 +791,7 @@ static void timekeeping_update(struct tk_data *tkd, struct timekeeper *tk, unsig
* timekeeper structure on the next update with stale data
*/
if (action & TK_MIRROR)
- memcpy(&tkd->shadow_timekeeper, tk, sizeof(*tk));
+ timekeeping_restore_shadow(tkd);
}
static void timekeeping_update_staged(struct tk_data *tkd, unsigned int action)
--
2.39.2
^ permalink raw reply related [flat|nested] 27+ messages in thread
* [PATCH 15/24] timekeeping: Rework do_settimeofday64() to use shadow_timekeeper
2024-09-11 13:29 [PATCH 00/24] timekeeping: Rework to prepare support of indenpendent PTP clocks Anna-Maria Behnsen
` (13 preceding siblings ...)
2024-09-11 13:29 ` [PATCH 14/24] timekeeping: Provide timekeeping_restore_shadow() Anna-Maria Behnsen
@ 2024-09-11 13:29 ` Anna-Maria Behnsen
2024-09-11 13:30 ` [PATCH 16/24] timekeeping: Rework timekeeping_inject_offset() " Anna-Maria Behnsen
` (8 subsequent siblings)
23 siblings, 0 replies; 27+ messages in thread
From: Anna-Maria Behnsen @ 2024-09-11 13:29 UTC (permalink / raw)
To: John Stultz, Frederic Weisbecker, Thomas Gleixner
Cc: linux-kernel, netdev, Miroslav Lichvar, Richard Cochran,
Christopher S Hall, Anna-Maria Behnsen
Updates of the timekeeper can be done by operating on the shadow timekeeper
and afterwards copying the result into the real timekeeper. This has the
advantage, that the sequence count write protected region is kept as small
as possible.
Convert do_settimeofday64() to use this scheme.
That allows to use a scoped_guard() for locking the timekeeper lock as the
usage of the shadow timekeeper allows a rollback in the error case instead
of the full timekeeper update of the original code.
Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
---
kernel/time/timekeeping.c | 42 ++++++++++++++++--------------------------
1 file changed, 16 insertions(+), 26 deletions(-)
diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index 345117ff665d..efcfb0da351e 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -1479,45 +1479,35 @@ EXPORT_SYMBOL_GPL(timekeeping_clocksource_has_base);
*/
int do_settimeofday64(const struct timespec64 *ts)
{
- struct timekeeper *tk = &tk_core.timekeeper;
struct timespec64 ts_delta, xt;
- unsigned long flags;
- int ret = 0;
if (!timespec64_valid_settod(ts))
return -EINVAL;
- raw_spin_lock_irqsave(&tk_core.lock, flags);
- write_seqcount_begin(&tk_core.seq);
-
- timekeeping_forward_now(tk);
-
- xt = tk_xtime(tk);
- ts_delta = timespec64_sub(*ts, xt);
+ scoped_guard (raw_spinlock_irqsave, &tk_core.lock) {
+ struct timekeeper *tk = &tk_core.shadow_timekeeper;
- if (timespec64_compare(&tk->wall_to_monotonic, &ts_delta) > 0) {
- ret = -EINVAL;
- goto out;
- }
+ timekeeping_forward_now(tk);
- tk_set_wall_to_mono(tk, timespec64_sub(tk->wall_to_monotonic, ts_delta));
+ xt = tk_xtime(tk);
+ ts_delta = timespec64_sub(*ts, xt);
- tk_set_xtime(tk, ts);
-out:
- timekeeping_update(&tk_core, tk, TK_UPDATE_ALL | TK_MIRROR);
+ if (timespec64_compare(&tk->wall_to_monotonic, &ts_delta) > 0) {
+ timekeeping_restore_shadow(&tk_core);
+ return -EINVAL;
+ }
- write_seqcount_end(&tk_core.seq);
- raw_spin_unlock_irqrestore(&tk_core.lock, flags);
+ tk_set_wall_to_mono(tk, timespec64_sub(tk->wall_to_monotonic, ts_delta));
+ tk_set_xtime(tk, ts);
+ timekeeping_update_staged(&tk_core, TK_UPDATE_ALL);
+ }
/* Signal hrtimers about time change */
clock_was_set(CLOCK_SET_WALL);
- if (!ret) {
- audit_tk_injoffset(ts_delta);
- add_device_randomness(ts, sizeof(*ts));
- }
-
- return ret;
+ audit_tk_injoffset(ts_delta);
+ add_device_randomness(ts, sizeof(*ts));
+ return 0;
}
EXPORT_SYMBOL(do_settimeofday64);
--
2.39.2
^ permalink raw reply related [flat|nested] 27+ messages in thread
* [PATCH 16/24] timekeeping: Rework timekeeping_inject_offset() to use shadow_timekeeper
2024-09-11 13:29 [PATCH 00/24] timekeeping: Rework to prepare support of indenpendent PTP clocks Anna-Maria Behnsen
` (14 preceding siblings ...)
2024-09-11 13:29 ` [PATCH 15/24] timekeeping: Rework do_settimeofday64() to use shadow_timekeeper Anna-Maria Behnsen
@ 2024-09-11 13:30 ` Anna-Maria Behnsen
2024-09-11 13:30 ` [PATCH 17/24] timekeeping: Rework change_clocksource() " Anna-Maria Behnsen
` (7 subsequent siblings)
23 siblings, 0 replies; 27+ messages in thread
From: Anna-Maria Behnsen @ 2024-09-11 13:30 UTC (permalink / raw)
To: John Stultz, Frederic Weisbecker, Thomas Gleixner
Cc: linux-kernel, netdev, Miroslav Lichvar, Richard Cochran,
Christopher S Hall, Anna-Maria Behnsen
Updates of the timekeeper can be done by operating on the shadow timekeeper
and afterwards copying the result into the real timekeeper. This has the
advantage, that the sequence count write protected region is kept as small
as possible.
Convert timekeeping_inject_offset() to use this scheme.
That allows to use a scoped_guard() for locking the timekeeper lock as the
usage of the shadow timekeeper allows a rollback in the error case instead
of the full timekeeper update of the original code.
Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
---
kernel/time/timekeeping.c | 41 ++++++++++++++++-------------------------
1 file changed, 16 insertions(+), 25 deletions(-)
diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index efcfb0da351e..8b8d77463f3e 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -1519,40 +1519,31 @@ EXPORT_SYMBOL(do_settimeofday64);
*/
static int timekeeping_inject_offset(const struct timespec64 *ts)
{
- struct timekeeper *tk = &tk_core.timekeeper;
- unsigned long flags;
- struct timespec64 tmp;
- int ret = 0;
-
if (ts->tv_nsec < 0 || ts->tv_nsec >= NSEC_PER_SEC)
return -EINVAL;
- raw_spin_lock_irqsave(&tk_core.lock, flags);
- write_seqcount_begin(&tk_core.seq);
-
- timekeeping_forward_now(tk);
-
- /* Make sure the proposed value is valid */
- tmp = timespec64_add(tk_xtime(tk), *ts);
- if (timespec64_compare(&tk->wall_to_monotonic, ts) > 0 ||
- !timespec64_valid_settod(&tmp)) {
- ret = -EINVAL;
- goto error;
- }
+ scoped_guard (raw_spinlock_irqsave, &tk_core.lock) {
+ struct timekeeper *tk = &tk_core.shadow_timekeeper;
+ struct timespec64 tmp;
- tk_xtime_add(tk, ts);
- tk_set_wall_to_mono(tk, timespec64_sub(tk->wall_to_monotonic, *ts));
+ timekeeping_forward_now(tk);
-error: /* even if we error out, we forwarded the time, so call update */
- timekeeping_update(&tk_core, tk, TK_UPDATE_ALL | TK_MIRROR);
+ /* Make sure the proposed value is valid */
+ tmp = timespec64_add(tk_xtime(tk), *ts);
+ if (timespec64_compare(&tk->wall_to_monotonic, ts) > 0 ||
+ !timespec64_valid_settod(&tmp)) {
+ timekeeping_restore_shadow(&tk_core);
+ return -EINVAL;
+ }
- write_seqcount_end(&tk_core.seq);
- raw_spin_unlock_irqrestore(&tk_core.lock, flags);
+ tk_xtime_add(tk, ts);
+ tk_set_wall_to_mono(tk, timespec64_sub(tk->wall_to_monotonic, *ts));
+ timekeeping_update_staged(&tk_core, TK_UPDATE_ALL);
+ }
/* Signal hrtimers about time change */
clock_was_set(CLOCK_SET_WALL);
-
- return ret;
+ return 0;
}
/*
--
2.39.2
^ permalink raw reply related [flat|nested] 27+ messages in thread
* [PATCH 17/24] timekeeping: Rework change_clocksource() to use shadow_timekeeper
2024-09-11 13:29 [PATCH 00/24] timekeeping: Rework to prepare support of indenpendent PTP clocks Anna-Maria Behnsen
` (15 preceding siblings ...)
2024-09-11 13:30 ` [PATCH 16/24] timekeeping: Rework timekeeping_inject_offset() " Anna-Maria Behnsen
@ 2024-09-11 13:30 ` Anna-Maria Behnsen
2024-09-11 13:30 ` [PATCH 18/24] timekeeping: Rework timekeeping_init() " Anna-Maria Behnsen
` (6 subsequent siblings)
23 siblings, 0 replies; 27+ messages in thread
From: Anna-Maria Behnsen @ 2024-09-11 13:30 UTC (permalink / raw)
To: John Stultz, Frederic Weisbecker, Thomas Gleixner
Cc: linux-kernel, netdev, Miroslav Lichvar, Richard Cochran,
Christopher S Hall, Anna-Maria Behnsen
Updates of the timekeeper can be done by operating on the shadow timekeeper
and afterwards copying the result into the real timekeeper. This has the
advantage, that the sequence count write protected region is kept as small
as possible.
Convert change_clocksource() to use this scheme.
Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
---
kernel/time/timekeeping.c | 18 +++++++-----------
1 file changed, 7 insertions(+), 11 deletions(-)
diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index 8b8d77463f3e..9b4fd620f895 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -1596,9 +1596,7 @@ static void __timekeeping_set_tai_offset(struct timekeeper *tk, s32 tai_offset)
*/
static int change_clocksource(void *data)
{
- struct timekeeper *tk = &tk_core.timekeeper;
struct clocksource *new = data, *old = NULL;
- unsigned long flags;
/*
* If the clocksource is in a module, get a module reference.
@@ -1614,16 +1612,14 @@ static int change_clocksource(void *data)
return 0;
}
- raw_spin_lock_irqsave(&tk_core.lock, flags);
- write_seqcount_begin(&tk_core.seq);
-
- timekeeping_forward_now(tk);
- old = tk->tkr_mono.clock;
- tk_setup_internals(tk, new);
- timekeeping_update(&tk_core, tk, TK_UPDATE_ALL | TK_MIRROR);
+ scoped_guard (raw_spinlock_irqsave, &tk_core.lock) {
+ struct timekeeper *tk = &tk_core.shadow_timekeeper;
- write_seqcount_end(&tk_core.seq);
- raw_spin_unlock_irqrestore(&tk_core.lock, flags);
+ timekeeping_forward_now(tk);
+ old = tk->tkr_mono.clock;
+ tk_setup_internals(tk, new);
+ timekeeping_update_staged(&tk_core, TK_UPDATE_ALL);
+ }
if (old) {
if (old->disable)
--
2.39.2
^ permalink raw reply related [flat|nested] 27+ messages in thread
* [PATCH 18/24] timekeeping: Rework timekeeping_init() to use shadow_timekeeper
2024-09-11 13:29 [PATCH 00/24] timekeeping: Rework to prepare support of indenpendent PTP clocks Anna-Maria Behnsen
` (16 preceding siblings ...)
2024-09-11 13:30 ` [PATCH 17/24] timekeeping: Rework change_clocksource() " Anna-Maria Behnsen
@ 2024-09-11 13:30 ` Anna-Maria Behnsen
2024-09-11 13:30 ` [PATCH 19/24] timekeeping: Rework timekeeping_inject_sleeptime64() " Anna-Maria Behnsen
` (5 subsequent siblings)
23 siblings, 0 replies; 27+ messages in thread
From: Anna-Maria Behnsen @ 2024-09-11 13:30 UTC (permalink / raw)
To: John Stultz, Frederic Weisbecker, Thomas Gleixner
Cc: linux-kernel, netdev, Miroslav Lichvar, Richard Cochran,
Christopher S Hall, Anna-Maria Behnsen
For timekeeping_init() the sequence count write held time is not relevant
and it could keep working on the real timekeeper, but there is no reason to
make it different from other timekeeper updates.
Convert it to operate on the shadow timekeeper.
Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
---
kernel/time/timekeeping.c | 8 +++-----
1 file changed, 3 insertions(+), 5 deletions(-)
diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index 9b4fd620f895..223429b1ca06 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -1778,7 +1778,7 @@ static bool persistent_clock_exists;
void __init timekeeping_init(void)
{
struct timespec64 wall_time, boot_offset, wall_to_mono;
- struct timekeeper *tk = &tk_core.timekeeper;
+ struct timekeeper *tk = &tk_core.shadow_timekeeper;
struct clocksource *clock;
tkd_basic_setup(&tk_core);
@@ -1802,7 +1802,7 @@ void __init timekeeping_init(void)
wall_to_mono = timespec64_sub(boot_offset, wall_time);
guard(raw_spinlock_irqsave)(&tk_core.lock);
- write_seqcount_begin(&tk_core.seq);
+
ntp_init();
clock = clocksource_default_clock();
@@ -1815,9 +1815,7 @@ void __init timekeeping_init(void)
tk_set_wall_to_mono(tk, wall_to_mono);
- timekeeping_update(&tk_core, tk, TK_MIRROR | TK_CLOCK_WAS_SET);
-
- write_seqcount_end(&tk_core.seq);
+ timekeeping_update_staged(&tk_core, TK_CLOCK_WAS_SET);
}
/* time in seconds when suspend began for persistent clock */
--
2.39.2
^ permalink raw reply related [flat|nested] 27+ messages in thread
* [PATCH 19/24] timekeeping: Rework timekeeping_inject_sleeptime64() to use shadow_timekeeper
2024-09-11 13:29 [PATCH 00/24] timekeeping: Rework to prepare support of indenpendent PTP clocks Anna-Maria Behnsen
` (17 preceding siblings ...)
2024-09-11 13:30 ` [PATCH 18/24] timekeeping: Rework timekeeping_init() " Anna-Maria Behnsen
@ 2024-09-11 13:30 ` Anna-Maria Behnsen
2024-09-11 13:30 ` [PATCH 20/24] timekeeping: Rework timekeeping_resume() " Anna-Maria Behnsen
` (4 subsequent siblings)
23 siblings, 0 replies; 27+ messages in thread
From: Anna-Maria Behnsen @ 2024-09-11 13:30 UTC (permalink / raw)
To: John Stultz, Frederic Weisbecker, Thomas Gleixner
Cc: linux-kernel, netdev, Miroslav Lichvar, Richard Cochran,
Christopher S Hall, Anna-Maria Behnsen
Updates of the timekeeper can be done by operating on the shadow timekeeper
and afterwards copying the result into the real timekeeper. This has the
advantage, that the sequence count write protected region is kept as small
as possible.
Convert timekeeping_inject_sleeptime64() to use this scheme.
Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
---
kernel/time/timekeeping.c | 22 +++++++---------------
1 file changed, 7 insertions(+), 15 deletions(-)
diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index 223429b1ca06..244bb58dcda9 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -1893,22 +1893,14 @@ bool timekeeping_rtc_skipsuspend(void)
*/
void timekeeping_inject_sleeptime64(const struct timespec64 *delta)
{
- struct timekeeper *tk = &tk_core.timekeeper;
- unsigned long flags;
-
- raw_spin_lock_irqsave(&tk_core.lock, flags);
- write_seqcount_begin(&tk_core.seq);
-
- suspend_timing_needed = false;
-
- timekeeping_forward_now(tk);
-
- __timekeeping_inject_sleeptime(tk, delta);
-
- timekeeping_update(&tk_core, tk, TK_UPDATE_ALL | TK_MIRROR);
+ scoped_guard(raw_spinlock_irqsave, &tk_core.lock) {
+ struct timekeeper *tk = &tk_core.shadow_timekeeper;
- write_seqcount_end(&tk_core.seq);
- raw_spin_unlock_irqrestore(&tk_core.lock, flags);
+ suspend_timing_needed = false;
+ timekeeping_forward_now(tk);
+ __timekeeping_inject_sleeptime(tk, delta);
+ timekeeping_update_staged(&tk_core, TK_UPDATE_ALL);
+ }
/* Signal hrtimers about time change */
clock_was_set(CLOCK_SET_WALL | CLOCK_SET_BOOT);
--
2.39.2
^ permalink raw reply related [flat|nested] 27+ messages in thread
* [PATCH 20/24] timekeeping: Rework timekeeping_resume() to use shadow_timekeeper
2024-09-11 13:29 [PATCH 00/24] timekeeping: Rework to prepare support of indenpendent PTP clocks Anna-Maria Behnsen
` (18 preceding siblings ...)
2024-09-11 13:30 ` [PATCH 19/24] timekeeping: Rework timekeeping_inject_sleeptime64() " Anna-Maria Behnsen
@ 2024-09-11 13:30 ` Anna-Maria Behnsen
2024-09-11 13:30 ` [PATCH 21/24] timekeeping: Rework timekeeping_suspend() " Anna-Maria Behnsen
` (3 subsequent siblings)
23 siblings, 0 replies; 27+ messages in thread
From: Anna-Maria Behnsen @ 2024-09-11 13:30 UTC (permalink / raw)
To: John Stultz, Frederic Weisbecker, Thomas Gleixner
Cc: linux-kernel, netdev, Miroslav Lichvar, Richard Cochran,
Christopher S Hall, Anna-Maria Behnsen
Updates of the timekeeper can be done by operating on the shadow timekeeper
and afterwards copying the result into the real timekeeper. This has the
advantage, that the sequence count write protected region is kept as small
as possible.
While the sequence count held time is not relevant for the resume path as
there is no concurrency, there is no reason to have this function
different than all the other update sites.
Convert timekeeping_inject_offset() to use this scheme and cleanup the
variable declaration while at it.
Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
---
kernel/time/timekeeping.c | 10 ++++------
1 file changed, 4 insertions(+), 6 deletions(-)
diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index 244bb58dcda9..d4a0d37f88f5 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -1912,12 +1912,12 @@ void timekeeping_inject_sleeptime64(const struct timespec64 *delta)
*/
void timekeeping_resume(void)
{
- struct timekeeper *tk = &tk_core.timekeeper;
+ struct timekeeper *tk = &tk_core.shadow_timekeeper;
struct clocksource *clock = tk->tkr_mono.clock;
- unsigned long flags;
struct timespec64 ts_new, ts_delta;
- u64 cycle_now, nsec;
bool inject_sleeptime = false;
+ u64 cycle_now, nsec;
+ unsigned long flags;
read_persistent_clock64(&ts_new);
@@ -1925,7 +1925,6 @@ void timekeeping_resume(void)
clocksource_resume();
raw_spin_lock_irqsave(&tk_core.lock, flags);
- write_seqcount_begin(&tk_core.seq);
/*
* After system resumes, we need to calculate the suspended time and
@@ -1960,8 +1959,7 @@ void timekeeping_resume(void)
tk->ntp_error = 0;
timekeeping_suspended = 0;
- timekeeping_update(&tk_core, tk, TK_MIRROR | TK_CLOCK_WAS_SET);
- write_seqcount_end(&tk_core.seq);
+ timekeeping_update_staged(&tk_core, TK_CLOCK_WAS_SET);
raw_spin_unlock_irqrestore(&tk_core.lock, flags);
touch_softlockup_watchdog();
--
2.39.2
^ permalink raw reply related [flat|nested] 27+ messages in thread
* [PATCH 21/24] timekeeping: Rework timekeeping_suspend() to use shadow_timekeeper
2024-09-11 13:29 [PATCH 00/24] timekeeping: Rework to prepare support of indenpendent PTP clocks Anna-Maria Behnsen
` (19 preceding siblings ...)
2024-09-11 13:30 ` [PATCH 20/24] timekeeping: Rework timekeeping_resume() " Anna-Maria Behnsen
@ 2024-09-11 13:30 ` Anna-Maria Behnsen
2024-09-11 13:30 ` [PATCH 22/24] timekeeping: Rework do_adjtimex() " Anna-Maria Behnsen
` (2 subsequent siblings)
23 siblings, 0 replies; 27+ messages in thread
From: Anna-Maria Behnsen @ 2024-09-11 13:30 UTC (permalink / raw)
To: John Stultz, Frederic Weisbecker, Thomas Gleixner
Cc: linux-kernel, netdev, Miroslav Lichvar, Richard Cochran,
Christopher S Hall, Anna-Maria Behnsen
Updates of the timekeeper can be done by operating on the shadow timekeeper
and afterwards copying the result into the real timekeeper. This has the
advantage, that the sequence count write protected region is kept as small
as possible.
While the sequence count held time is not relevant for the resume path as
there is no concurrency, there is no reason to have this function
different than all the other update sites.
Convert timekeeping_inject_offset() to use this scheme and cleanup the
variable declarations while at it.
As halt_fast_timekeeper() does not need protection sequence counter, it is
no problem to move it with this change outside of the sequence counter
protected area. But it still needs to be executed while holding the lock.
Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
---
kernel/time/timekeeping.c | 12 +++++-------
1 file changed, 5 insertions(+), 7 deletions(-)
diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index d4a0d37f88f5..e3eb11ccdf5b 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -1972,11 +1972,11 @@ void timekeeping_resume(void)
int timekeeping_suspend(void)
{
- struct timekeeper *tk = &tk_core.timekeeper;
- unsigned long flags;
- struct timespec64 delta, delta_delta;
- static struct timespec64 old_delta;
+ struct timekeeper *tk = &tk_core.shadow_timekeeper;
+ struct timespec64 delta, delta_delta;
+ static struct timespec64 old_delta;
struct clocksource *curr_clock;
+ unsigned long flags;
u64 cycle_now;
read_persistent_clock64(&timekeeping_suspend_time);
@@ -1992,7 +1992,6 @@ int timekeeping_suspend(void)
suspend_timing_needed = true;
raw_spin_lock_irqsave(&tk_core.lock, flags);
- write_seqcount_begin(&tk_core.seq);
timekeeping_forward_now(tk);
timekeeping_suspended = 1;
@@ -2027,9 +2026,8 @@ int timekeeping_suspend(void)
}
}
- timekeeping_update(&tk_core, tk, TK_MIRROR);
+ timekeeping_update_staged(&tk_core, 0);
halt_fast_timekeeper(tk);
- write_seqcount_end(&tk_core.seq);
raw_spin_unlock_irqrestore(&tk_core.lock, flags);
tick_suspend();
--
2.39.2
^ permalink raw reply related [flat|nested] 27+ messages in thread
* [PATCH 22/24] timekeeping: Rework do_adjtimex() to use shadow_timekeeper
2024-09-11 13:29 [PATCH 00/24] timekeeping: Rework to prepare support of indenpendent PTP clocks Anna-Maria Behnsen
` (20 preceding siblings ...)
2024-09-11 13:30 ` [PATCH 21/24] timekeeping: Rework timekeeping_suspend() " Anna-Maria Behnsen
@ 2024-09-11 13:30 ` Anna-Maria Behnsen
2024-09-11 13:30 ` [PATCH 23/24] timekeeping: Remove TK_MIRROR timekeeping_update() action Anna-Maria Behnsen
2024-09-11 13:30 ` [PATCH 24/24] timekeeping: Merge timekeeping_update_staged() and timekeeping_update() Anna-Maria Behnsen
23 siblings, 0 replies; 27+ messages in thread
From: Anna-Maria Behnsen @ 2024-09-11 13:30 UTC (permalink / raw)
To: John Stultz, Frederic Weisbecker, Thomas Gleixner
Cc: linux-kernel, netdev, Miroslav Lichvar, Richard Cochran,
Christopher S Hall, Anna-Maria Behnsen
Updates of the timekeeper can be done by operating on the shadow timekeeper
and afterwards copying the result into the real timekeeper. This has the
advantage, that the sequence count write protected region is kept as small
as possible.
Convert do_adjtimex() to use this scheme and take the opportunity to use a
scoped_guard() for locking.
That requires to have a separate function for updating the leap state so
that the update is protected by the sequence count. This also brings the
timekeeper and the shadow timekeeper in sync for this state, which was not
the case so far. That's not a correctness problem as the state is only used
at the read sides which use the real timekeeper, but it's inconsistent
nevertheless.
Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
---
kernel/time/timekeeping.c | 40 ++++++++++++++++++++++++----------------
1 file changed, 24 insertions(+), 16 deletions(-)
diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index e3eb11ccdf5b..cbca0351ceca 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -722,6 +722,18 @@ static inline void tk_update_leap_state(struct timekeeper *tk)
tk->next_leap_ktime = ktime_sub(tk->next_leap_ktime, tk->offs_real);
}
+/*
+ * Leap state update for both shadow and the real timekeeper
+ * Separate to spare a full memcpy() of the timekeeper.
+ */
+static void tk_update_leap_state_all(struct tk_data *tkd)
+{
+ write_seqcount_begin(&tkd->seq);
+ tk_update_leap_state(&tkd->shadow_timekeeper);
+ tkd->timekeeper.next_leap_ktime = tkd->shadow_timekeeper.next_leap_ktime;
+ write_seqcount_end(&tkd->seq);
+}
+
/*
* Update the ktime_t based scalar nsec members of the timekeeper
*/
@@ -2537,12 +2549,10 @@ EXPORT_SYMBOL_GPL(random_get_entropy_fallback);
*/
int do_adjtimex(struct __kernel_timex *txc)
{
- struct timekeeper *tk = &tk_core.timekeeper;
+ struct timekeeper *tk = &tk_core.shadow_timekeeper;
struct audit_ntp_data ad;
bool clock_set = false;
struct timespec64 ts;
- unsigned long flags;
- s32 orig_tai, tai;
int ret;
/* Validate the data before disabling interrupts */
@@ -2569,23 +2579,21 @@ int do_adjtimex(struct __kernel_timex *txc)
ktime_get_real_ts64(&ts);
add_device_randomness(&ts, sizeof(ts));
- raw_spin_lock_irqsave(&tk_core.lock, flags);
- write_seqcount_begin(&tk_core.seq);
+ scoped_guard (raw_spinlock_irqsave, &tk_core.lock) {
+ s32 orig_tai, tai;
- orig_tai = tai = tk->tai_offset;
- ret = __do_adjtimex(txc, &ts, &tai, &ad);
+ orig_tai = tai = tk->tai_offset;
+ ret = __do_adjtimex(txc, &ts, &tai, &ad);
- if (tai != orig_tai) {
- __timekeeping_set_tai_offset(tk, tai);
- timekeeping_update(&tk_core, tk, TK_MIRROR | TK_CLOCK_WAS_SET);
- clock_set = true;
- } else {
- tk_update_leap_state(tk);
+ if (tai != orig_tai) {
+ __timekeeping_set_tai_offset(tk, tai);
+ timekeeping_update_staged(&tk_core, TK_CLOCK_WAS_SET);
+ clock_set = true;
+ } else {
+ tk_update_leap_state_all(&tk_core);
+ }
}
- write_seqcount_end(&tk_core.seq);
- raw_spin_unlock_irqrestore(&tk_core.lock, flags);
-
audit_ntp_log(&ad);
/* Update the multiplier immediately if frequency was set directly */
--
2.39.2
^ permalink raw reply related [flat|nested] 27+ messages in thread
* [PATCH 23/24] timekeeping: Remove TK_MIRROR timekeeping_update() action
2024-09-11 13:29 [PATCH 00/24] timekeeping: Rework to prepare support of indenpendent PTP clocks Anna-Maria Behnsen
` (21 preceding siblings ...)
2024-09-11 13:30 ` [PATCH 22/24] timekeeping: Rework do_adjtimex() " Anna-Maria Behnsen
@ 2024-09-11 13:30 ` Anna-Maria Behnsen
2024-09-11 13:30 ` [PATCH 24/24] timekeeping: Merge timekeeping_update_staged() and timekeeping_update() Anna-Maria Behnsen
23 siblings, 0 replies; 27+ messages in thread
From: Anna-Maria Behnsen @ 2024-09-11 13:30 UTC (permalink / raw)
To: John Stultz, Frederic Weisbecker, Thomas Gleixner
Cc: linux-kernel, netdev, Miroslav Lichvar, Richard Cochran,
Christopher S Hall, Anna-Maria Behnsen
All call sites of using TK_MIRROR flag in timekeeping_update() are
gone. The TK_MIRROR dependent code path is therefore dead code.
Remove it along with the TK_MIRROR define.
Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
---
kernel/time/timekeeping.c | 10 +---------
1 file changed, 1 insertion(+), 9 deletions(-)
diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index cbca0351ceca..6aa77cec23f8 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -30,8 +30,7 @@
#include "timekeeping_internal.h"
#define TK_CLEAR_NTP (1 << 0)
-#define TK_MIRROR (1 << 1)
-#define TK_CLOCK_WAS_SET (1 << 2)
+#define TK_CLOCK_WAS_SET (1 << 1)
#define TK_UPDATE_ALL (TK_CLEAR_NTP | TK_CLOCK_WAS_SET)
@@ -797,13 +796,6 @@ static void timekeeping_update(struct tk_data *tkd, struct timekeeper *tk, unsig
if (action & TK_CLOCK_WAS_SET)
tk->clock_was_set_seq++;
- /*
- * The mirroring of the data to the shadow-timekeeper needs
- * to happen last here to ensure we don't over-write the
- * timekeeper structure on the next update with stale data
- */
- if (action & TK_MIRROR)
- timekeeping_restore_shadow(tkd);
}
static void timekeeping_update_staged(struct tk_data *tkd, unsigned int action)
--
2.39.2
^ permalink raw reply related [flat|nested] 27+ messages in thread
* [PATCH 24/24] timekeeping: Merge timekeeping_update_staged() and timekeeping_update()
2024-09-11 13:29 [PATCH 00/24] timekeeping: Rework to prepare support of indenpendent PTP clocks Anna-Maria Behnsen
` (22 preceding siblings ...)
2024-09-11 13:30 ` [PATCH 23/24] timekeeping: Remove TK_MIRROR timekeeping_update() action Anna-Maria Behnsen
@ 2024-09-11 13:30 ` Anna-Maria Behnsen
23 siblings, 0 replies; 27+ messages in thread
From: Anna-Maria Behnsen @ 2024-09-11 13:30 UTC (permalink / raw)
To: John Stultz, Frederic Weisbecker, Thomas Gleixner
Cc: linux-kernel, netdev, Miroslav Lichvar, Richard Cochran,
Christopher S Hall, Anna-Maria Behnsen
timekeeping_update_staged() is the only call site of timekeeping_update().
Merge those functions. No functional change.
Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
---
kernel/time/timekeeping.c | 31 ++++++++++++++-----------------
1 file changed, 14 insertions(+), 17 deletions(-)
diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index 6aa77cec23f8..8e607413efad 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -529,7 +529,7 @@ EXPORT_SYMBOL_GPL(ktime_get_raw_fast_ns);
* timekeeping_inject_sleeptime64()
* __timekeeping_inject_sleeptime(tk, delta);
* timestamp();
- * timekeeping_update(tkd, tk, TK_CLEAR_NTP...);
+ * timekeeping_update_staged(tkd, TK_CLEAR_NTP...);
*
* (2) On 32-bit systems, the 64-bit boot offset (tk->offs_boot) may be
* partially updated. Since the tk->offs_boot update is a rare event, this
@@ -775,10 +775,21 @@ static void timekeeping_restore_shadow(struct tk_data *tkd)
memcpy(&tkd->shadow_timekeeper, &tkd->timekeeper, sizeof(tkd->timekeeper));
}
-static void timekeeping_update(struct tk_data *tkd, struct timekeeper *tk, unsigned int action)
+static void timekeeping_update_staged(struct tk_data *tkd, unsigned int action)
{
+ struct timekeeper *tk = &tk_core.shadow_timekeeper;
+
lockdep_assert_held(&tkd->lock);
+ /*
+ * Block out readers before running the updates below because that
+ * updates VDSO and other time related infrastructure. Not blocking
+ * the readers might let a reader see time going backwards when
+ * reading from the VDSO after the VDSO update and then reading in
+ * the kernel from the timekeeper before that got updated.
+ */
+ write_seqcount_begin(&tkd->seq);
+
if (action & TK_CLEAR_NTP) {
tk->ntp_error = 0;
ntp_clear();
@@ -796,20 +807,6 @@ static void timekeeping_update(struct tk_data *tkd, struct timekeeper *tk, unsig
if (action & TK_CLOCK_WAS_SET)
tk->clock_was_set_seq++;
-}
-
-static void timekeeping_update_staged(struct tk_data *tkd, unsigned int action)
-{
- /*
- * Block out readers before invoking timekeeping_update() because
- * that updates VDSO and other time related infrastructure. Not
- * blocking the readers might let a reader see time going backwards
- * when reading from the VDSO after the VDSO update and then
- * reading in the kernel from the timekeeper before that got updated.
- */
- write_seqcount_begin(&tkd->seq);
-
- timekeeping_update(tkd, &tkd->shadow_timekeeper, action);
/*
* Update the real timekeeper.
@@ -819,7 +816,7 @@ static void timekeeping_update_staged(struct tk_data *tkd, unsigned int action)
* the cacheline optimized data layout of the timekeeper and requires
* another indirection.
*/
- memcpy(&tkd->timekeeper, &tkd->shadow_timekeeper, sizeof(tkd->shadow_timekeeper));
+ memcpy(&tkd->timekeeper, tk, sizeof(*tk));
write_seqcount_end(&tkd->seq);
}
--
2.39.2
^ permalink raw reply related [flat|nested] 27+ messages in thread
* Re: [PATCH 10/24] timekeeping: Define a struct type for tk_core to make it reusable
2024-09-11 13:29 ` [PATCH 10/24] timekeeping: Define a struct type for tk_core to make it reusable Anna-Maria Behnsen
@ 2024-09-12 7:38 ` Simon Horman
2024-09-13 9:16 ` Anna-Maria Behnsen
0 siblings, 1 reply; 27+ messages in thread
From: Simon Horman @ 2024-09-12 7:38 UTC (permalink / raw)
To: Anna-Maria Behnsen
Cc: John Stultz, Frederic Weisbecker, Thomas Gleixner, linux-kernel,
netdev, Miroslav Lichvar, Richard Cochran, Christopher S Hall
On Wed, Sep 11, 2024 at 03:29:54PM +0200, Anna-Maria Behnsen wrote:
> The struct tk_core uses is not reusable. As long as there is only a single
> timekeeper, this is not a problem. But when the timekeeper infrastructure
> will be reused for per ptp clock timekeepers, an explicit struct type is
> required.
>
> Define struct tk_data as explicit struct type for tk_core.
>
> Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
...
Hi Anna-Maria,
I wonder if the order of this and the previous patch should
be reversed, or the two patches should be squashed together.
I am seeing a build failure with only patches 01-09/24 of this series
applied, which seem to be resolved by applying this patch.
.../timekeeping.c:1735:43: warning: declaration of 'struct tk_data' will not be visible outside of this function [-Wvisibility]
1735 | static __init void tkd_basic_setup(struct tk_data *tkd)
...
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH 10/24] timekeeping: Define a struct type for tk_core to make it reusable
2024-09-12 7:38 ` Simon Horman
@ 2024-09-13 9:16 ` Anna-Maria Behnsen
0 siblings, 0 replies; 27+ messages in thread
From: Anna-Maria Behnsen @ 2024-09-13 9:16 UTC (permalink / raw)
To: Simon Horman
Cc: John Stultz, Frederic Weisbecker, Thomas Gleixner, linux-kernel,
netdev, Miroslav Lichvar, Richard Cochran, Christopher S Hall
Simon Horman <horms@kernel.org> writes:
> On Wed, Sep 11, 2024 at 03:29:54PM +0200, Anna-Maria Behnsen wrote:
>> The struct tk_core uses is not reusable. As long as there is only a single
>> timekeeper, this is not a problem. But when the timekeeper infrastructure
>> will be reused for per ptp clock timekeepers, an explicit struct type is
>> required.
>>
>> Define struct tk_data as explicit struct type for tk_core.
>>
>> Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
>
> ...
>
> Hi Anna-Maria,
>
> I wonder if the order of this and the previous patch should
> be reversed, or the two patches should be squashed together.
>
> I am seeing a build failure with only patches 01-09/24 of this series
> applied, which seem to be resolved by applying this patch.
>
> .../timekeeping.c:1735:43: warning: declaration of 'struct tk_data' will not be visible outside of this function [-Wvisibility]
> 1735 | static __init void tkd_basic_setup(struct tk_data *tkd)
> ...
Oh, I'm sorry. I mixed something up. Thanks for letting me know, I'll
have a look at it and fix it!
Anna-Maria
^ permalink raw reply [flat|nested] 27+ messages in thread
end of thread, other threads:[~2024-09-13 9:16 UTC | newest]
Thread overview: 27+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-09-11 13:29 [PATCH 00/24] timekeeping: Rework to prepare support of indenpendent PTP clocks Anna-Maria Behnsen
2024-09-11 13:29 ` [PATCH 01/24] timekeeping: Read NTP tick length only once Anna-Maria Behnsen
2024-09-11 13:29 ` [PATCH 02/24] timekeeping: Don't stop time readers across hard_pps() update Anna-Maria Behnsen
2024-09-11 13:29 ` [PATCH 03/24] timekeeping: Avoid duplicate leap state update Anna-Maria Behnsen
2024-09-11 13:29 ` [PATCH 04/24] timekeeping: Abort clocksource change in case of failure Anna-Maria Behnsen
2024-09-11 13:29 ` [PATCH 05/24] timekeeping: Simplify code in timekeeping_advance() Anna-Maria Behnsen
2024-09-11 13:29 ` [PATCH 06/24] timekeeping: Reorder struct timekeeper Anna-Maria Behnsen
2024-09-11 13:29 ` [PATCH 07/24] timekeeping: Move shadow_timekeeper into tk_core Anna-Maria Behnsen
2024-09-11 13:29 ` [PATCH 08/24] timekeeping: Encapsulate locking/unlocking of timekeeper_lock Anna-Maria Behnsen
2024-09-11 13:29 ` [PATCH 09/24] timekeeping: Move timekeeper_lock into tk_core Anna-Maria Behnsen
2024-09-11 13:29 ` [PATCH 10/24] timekeeping: Define a struct type for tk_core to make it reusable Anna-Maria Behnsen
2024-09-12 7:38 ` Simon Horman
2024-09-13 9:16 ` Anna-Maria Behnsen
2024-09-11 13:29 ` [PATCH 11/24] timekeeping: Add struct tk_data as argument to timekeeping_update() Anna-Maria Behnsen
2024-09-11 13:29 ` [PATCH 12/24] timekeeping: Split out timekeeper update of timekeeping_advanced() Anna-Maria Behnsen
2024-09-11 13:29 ` [PATCH 13/24] timekeeping: Introduce combined timekeeping action flag Anna-Maria Behnsen
2024-09-11 13:29 ` [PATCH 14/24] timekeeping: Provide timekeeping_restore_shadow() Anna-Maria Behnsen
2024-09-11 13:29 ` [PATCH 15/24] timekeeping: Rework do_settimeofday64() to use shadow_timekeeper Anna-Maria Behnsen
2024-09-11 13:30 ` [PATCH 16/24] timekeeping: Rework timekeeping_inject_offset() " Anna-Maria Behnsen
2024-09-11 13:30 ` [PATCH 17/24] timekeeping: Rework change_clocksource() " Anna-Maria Behnsen
2024-09-11 13:30 ` [PATCH 18/24] timekeeping: Rework timekeeping_init() " Anna-Maria Behnsen
2024-09-11 13:30 ` [PATCH 19/24] timekeeping: Rework timekeeping_inject_sleeptime64() " Anna-Maria Behnsen
2024-09-11 13:30 ` [PATCH 20/24] timekeeping: Rework timekeeping_resume() " Anna-Maria Behnsen
2024-09-11 13:30 ` [PATCH 21/24] timekeeping: Rework timekeeping_suspend() " Anna-Maria Behnsen
2024-09-11 13:30 ` [PATCH 22/24] timekeeping: Rework do_adjtimex() " Anna-Maria Behnsen
2024-09-11 13:30 ` [PATCH 23/24] timekeeping: Remove TK_MIRROR timekeeping_update() action Anna-Maria Behnsen
2024-09-11 13:30 ` [PATCH 24/24] timekeeping: Merge timekeeping_update_staged() and timekeeping_update() Anna-Maria Behnsen
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).