* Re: [PATCH v2 06/25] timekeeping: Reorder struct timekeeper
[not found] ` <20241009-devel-anna-maria-b4-timers-ptp-timekeeping-v2-6-554456a44a15@linutronix.de>
@ 2024-10-11 3:22 ` John Stultz
2024-10-14 9:30 ` Anna-Maria Behnsen
2024-10-15 10:08 ` [PATCH v2a] " Anna-Maria Behnsen
1 sibling, 1 reply; 23+ messages in thread
From: John Stultz @ 2024-10-11 3:22 UTC (permalink / raw)
To: Anna-Maria Behnsen
Cc: Frederic Weisbecker, Thomas Gleixner, linux-kernel, netdev,
Miroslav Lichvar, Richard Cochran, Christopher S Hall
On Wed, Oct 9, 2024 at 1:29 AM Anna-Maria Behnsen
<anna-maria@linutronix.de> wrote:
>
> From: Thomas Gleixner <tglx@linutronix.de>
>
> From: Thomas Gleixner <tglx@linutronix.de>
>
> struct timekeeper is ordered suboptimal vs. cachelines. The layout,
> including the preceding seqcount (see struct tk_core in timekeeper.c) is:
>
> cacheline 0: seqcount, tkr_mono
> cacheline 1: tkr_raw, xtime_sec
> cacheline 2: ktime_sec ... tai_offset, internal variables
> cacheline 3: next_leap_ktime, raw_sec, internal variables
> cacheline 4: internal variables
>
> So any access to via ktime_get*() except for access to CLOCK_MONOTONIC_RAW
> will use either cachelines 0 + 1 or cachelines 0 + 2. Access to
> CLOCK_MONOTONIC_RAW uses cachelines 0 + 1 + 3.
>
> Reorder the members so that the result is more efficient:
>
> cacheline 0: seqcount, tkr_mono
> cacheline 1: xtime_sec, ktime_sec ... tai_offset
> cacheline 2: tkr_raw, raw_sec
> cacheline 3: internal variables
> cacheline 4: internal variables
>
> That means ktime_get*() will access cacheline 0 + 1 and CLOCK_MONOTONIC_RAW
> access will use cachelines 0 + 2.
>
> Update kernel-doc and fix formatting issues while at it. Also fix a typo
> in struct tk_read_base kernel-doc.
>
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
Acked-by: John Stultz <jstultz@google.com>
> ---
> include/linux/timekeeper_internal.h | 102 +++++++++++++++++++++---------------
> 1 file changed, 61 insertions(+), 41 deletions(-)
>
> diff --git a/include/linux/timekeeper_internal.h b/include/linux/timekeeper_internal.h
> index 902c20ef495a..430e40549136 100644
> --- a/include/linux/timekeeper_internal.h
> +++ b/include/linux/timekeeper_internal.h
> @@ -26,7 +26,7 @@
> * occupies a single 64byte cache line.
> *
> * The struct is separate from struct timekeeper as it is also used
> - * for a fast NMI safe accessors.
> + * for the fast NMI safe accessors.
> *
> * @base_real is for the fast NMI safe accessor to allow reading clock
> * realtime from any context.
> @@ -44,33 +44,41 @@ struct tk_read_base {
>
> /**
> * struct timekeeper - Structure holding internal timekeeping values.
> - * @tkr_mono: The readout base structure for CLOCK_MONOTONIC
> - * @tkr_raw: The readout base structure for CLOCK_MONOTONIC_RAW
> - * @xtime_sec: Current CLOCK_REALTIME time in seconds
> - * @ktime_sec: Current CLOCK_MONOTONIC time in seconds
> - * @wall_to_monotonic: CLOCK_REALTIME to CLOCK_MONOTONIC offset
> - * @offs_real: Offset clock monotonic -> clock realtime
> - * @offs_boot: Offset clock monotonic -> clock boottime
> - * @offs_tai: Offset clock monotonic -> clock tai
> - * @tai_offset: The current UTC to TAI offset in seconds
> - * @clock_was_set_seq: The sequence number of clock was set events
> - * @cs_was_changed_seq: The sequence number of clocksource change events
> - * @next_leap_ktime: CLOCK_MONOTONIC time value of a pending leap-second
> - * @raw_sec: CLOCK_MONOTONIC_RAW time in seconds
> - * @monotonic_to_boot: CLOCK_MONOTONIC to CLOCK_BOOTTIME offset
> - * @cycle_interval: Number of clock cycles in one NTP interval
> - * @xtime_interval: Number of clock shifted nano seconds in one NTP
> - * interval.
> - * @xtime_remainder: Shifted nano seconds left over when rounding
> - * @cycle_interval
> - * @raw_interval: Shifted raw nano seconds accumulated per NTP interval.
> - * @ntp_error: Difference between accumulated time and NTP time in ntp
> - * shifted nano seconds.
> - * @ntp_error_shift: Shift conversion between clock shifted nano seconds and
> - * ntp shifted nano seconds.
> - * @last_warning: Warning ratelimiter (DEBUG_TIMEKEEPING)
> - * @underflow_seen: Underflow warning flag (DEBUG_TIMEKEEPING)
> - * @overflow_seen: Overflow warning flag (DEBUG_TIMEKEEPING)
> + * @tkr_mono: The readout base structure for CLOCK_MONOTONIC
> + * @xtime_sec: Current CLOCK_REALTIME time in seconds
> + * @ktime_sec: Current CLOCK_MONOTONIC time in seconds
> + * @wall_to_monotonic: CLOCK_REALTIME to CLOCK_MONOTONIC offset
> + * @offs_real: Offset clock monotonic -> clock realtime
> + * @offs_boot: Offset clock monotonic -> clock boottime
> + * @offs_tai: Offset clock monotonic -> clock tai
> + * @tai_offset: The current UTC to TAI offset in seconds
> + * @tkr_raw: The readout base structure for CLOCK_MONOTONIC_RAW
> + * @raw_sec: CLOCK_MONOTONIC_RAW time in seconds
> + * @clock_was_set_seq: The sequence number of clock was set events
> + * @cs_was_changed_seq: The sequence number of clocksource change events
> + * @monotonic_to_boot: CLOCK_MONOTONIC to CLOCK_BOOTTIME offset
> + * @cycle_interval: Number of clock cycles in one NTP interval
> + * @xtime_interval: Number of clock shifted nano seconds in one NTP
> + * interval.
> + * @xtime_remainder: Shifted nano seconds left over when rounding
> + * @cycle_interval
> + * @raw_interval: Shifted raw nano seconds accumulated per NTP interval.
> + * @next_leap_ktime: CLOCK_MONOTONIC time value of a pending leap-second
> + * @ntp_tick: The ntp_tick_length() value currently being
> + * used. This cached copy ensures we consistently
> + * apply the tick length for an entire tick, as
> + * ntp_tick_length may change mid-tick, and we don't
> + * want to apply that new value to the tick in
> + * progress.
> + * @ntp_error: Difference between accumulated time and NTP time in ntp
> + * shifted nano seconds.
> + * @ntp_error_shift: Shift conversion between clock shifted nano seconds and
> + * ntp shifted nano seconds.
> + * @ntp_err_mult: Multiplication factor for scaled math conversion
> + * @skip_second_overflow: Flag used to avoid updating NTP twice with same second
> + * @last_warning: Warning ratelimiter (DEBUG_TIMEKEEPING)
> + * @underflow_seen: Underflow warning flag (DEBUG_TIMEKEEPING)
> + * @overflow_seen: Overflow warning flag (DEBUG_TIMEKEEPING)
> *
> * Note: For timespec(64) based interfaces wall_to_monotonic is what
> * we need to add to xtime (or xtime corrected for sub jiffy times)
> @@ -88,10 +96,25 @@ struct tk_read_base {
> *
> * @monotonic_to_boottime is a timespec64 representation of @offs_boot to
> * accelerate the VDSO update for CLOCK_BOOTTIME.
> + *
> + * The cacheline ordering of the structure is optimized for in kernel usage
> + * of the ktime_get() and ktime_get_ts64() family of time accessors. Struct
> + * timekeeper is prepended in the core timekeeeping code with a sequence
> + * count, which results in the following cacheline layout:
> + *
> + * 0: seqcount, tkr_mono
> + * 1: xtime_sec ... tai_offset
> + * 2: tkr_raw, raw_sec
> + * 3,4: Internal variables
> + *
> + * Cacheline 0,1 contain the data which is used for accessing
> + * CLOCK_MONOTONIC/REALTIME/BOOTTIME/TAI, while cacheline 2 contains the
> + * data for accessing CLOCK_MONOTONIC_RAW. Cacheline 3,4 are internal
> + * variables which are only accessed during timekeeper updates once per
> + * tick.
Would it make sense to add divider comments or something in the struct
to make this more visible? I fret in the context of a patch, a + line
adding a new structure element that breaks the ordered alignment might
not be obvious.
thanks
-john
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH v2 06/25] timekeeping: Reorder struct timekeeper
2024-10-11 3:22 ` [PATCH v2 06/25] timekeeping: Reorder struct timekeeper John Stultz
@ 2024-10-14 9:30 ` Anna-Maria Behnsen
0 siblings, 0 replies; 23+ messages in thread
From: Anna-Maria Behnsen @ 2024-10-14 9:30 UTC (permalink / raw)
To: John Stultz
Cc: Frederic Weisbecker, Thomas Gleixner, linux-kernel, netdev,
Miroslav Lichvar, Richard Cochran, Christopher S Hall
John Stultz <jstultz@google.com> writes:
> On Wed, Oct 9, 2024 at 1:29 AM Anna-Maria Behnsen
> <anna-maria@linutronix.de> wrote:
>>
>> From: Thomas Gleixner <tglx@linutronix.de>
>>
>> From: Thomas Gleixner <tglx@linutronix.de>
>>
>> struct timekeeper is ordered suboptimal vs. cachelines. The layout,
>> including the preceding seqcount (see struct tk_core in timekeeper.c) is:
>>
>> cacheline 0: seqcount, tkr_mono
>> cacheline 1: tkr_raw, xtime_sec
>> cacheline 2: ktime_sec ... tai_offset, internal variables
>> cacheline 3: next_leap_ktime, raw_sec, internal variables
>> cacheline 4: internal variables
>>
>> So any access to via ktime_get*() except for access to CLOCK_MONOTONIC_RAW
>> will use either cachelines 0 + 1 or cachelines 0 + 2. Access to
>> CLOCK_MONOTONIC_RAW uses cachelines 0 + 1 + 3.
>>
>> Reorder the members so that the result is more efficient:
>>
>> cacheline 0: seqcount, tkr_mono
>> cacheline 1: xtime_sec, ktime_sec ... tai_offset
>> cacheline 2: tkr_raw, raw_sec
>> cacheline 3: internal variables
>> cacheline 4: internal variables
>>
>> That means ktime_get*() will access cacheline 0 + 1 and CLOCK_MONOTONIC_RAW
>> access will use cachelines 0 + 2.
>>
>> Update kernel-doc and fix formatting issues while at it. Also fix a typo
>> in struct tk_read_base kernel-doc.
>>
>> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
>> Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
>
> Acked-by: John Stultz <jstultz@google.com>
>
>> ---
>> include/linux/timekeeper_internal.h | 102 +++++++++++++++++++++---------------
>> 1 file changed, 61 insertions(+), 41 deletions(-)
>>
>> diff --git a/include/linux/timekeeper_internal.h b/include/linux/timekeeper_internal.h
>> index 902c20ef495a..430e40549136 100644
>> --- a/include/linux/timekeeper_internal.h
>> +++ b/include/linux/timekeeper_internal.h
>> @@ -26,7 +26,7 @@
>> * occupies a single 64byte cache line.
>> *
>> * The struct is separate from struct timekeeper as it is also used
>> - * for a fast NMI safe accessors.
>> + * for the fast NMI safe accessors.
>> *
>> * @base_real is for the fast NMI safe accessor to allow reading clock
>> * realtime from any context.
>> @@ -44,33 +44,41 @@ struct tk_read_base {
>>
>> /**
>> * struct timekeeper - Structure holding internal timekeeping values.
>> - * @tkr_mono: The readout base structure for CLOCK_MONOTONIC
>> - * @tkr_raw: The readout base structure for CLOCK_MONOTONIC_RAW
>> - * @xtime_sec: Current CLOCK_REALTIME time in seconds
>> - * @ktime_sec: Current CLOCK_MONOTONIC time in seconds
>> - * @wall_to_monotonic: CLOCK_REALTIME to CLOCK_MONOTONIC offset
>> - * @offs_real: Offset clock monotonic -> clock realtime
>> - * @offs_boot: Offset clock monotonic -> clock boottime
>> - * @offs_tai: Offset clock monotonic -> clock tai
>> - * @tai_offset: The current UTC to TAI offset in seconds
>> - * @clock_was_set_seq: The sequence number of clock was set events
>> - * @cs_was_changed_seq: The sequence number of clocksource change events
>> - * @next_leap_ktime: CLOCK_MONOTONIC time value of a pending leap-second
>> - * @raw_sec: CLOCK_MONOTONIC_RAW time in seconds
>> - * @monotonic_to_boot: CLOCK_MONOTONIC to CLOCK_BOOTTIME offset
>> - * @cycle_interval: Number of clock cycles in one NTP interval
>> - * @xtime_interval: Number of clock shifted nano seconds in one NTP
>> - * interval.
>> - * @xtime_remainder: Shifted nano seconds left over when rounding
>> - * @cycle_interval
>> - * @raw_interval: Shifted raw nano seconds accumulated per NTP interval.
>> - * @ntp_error: Difference between accumulated time and NTP time in ntp
>> - * shifted nano seconds.
>> - * @ntp_error_shift: Shift conversion between clock shifted nano seconds and
>> - * ntp shifted nano seconds.
>> - * @last_warning: Warning ratelimiter (DEBUG_TIMEKEEPING)
>> - * @underflow_seen: Underflow warning flag (DEBUG_TIMEKEEPING)
>> - * @overflow_seen: Overflow warning flag (DEBUG_TIMEKEEPING)
>> + * @tkr_mono: The readout base structure for CLOCK_MONOTONIC
>> + * @xtime_sec: Current CLOCK_REALTIME time in seconds
>> + * @ktime_sec: Current CLOCK_MONOTONIC time in seconds
>> + * @wall_to_monotonic: CLOCK_REALTIME to CLOCK_MONOTONIC offset
>> + * @offs_real: Offset clock monotonic -> clock realtime
>> + * @offs_boot: Offset clock monotonic -> clock boottime
>> + * @offs_tai: Offset clock monotonic -> clock tai
>> + * @tai_offset: The current UTC to TAI offset in seconds
>> + * @tkr_raw: The readout base structure for CLOCK_MONOTONIC_RAW
>> + * @raw_sec: CLOCK_MONOTONIC_RAW time in seconds
>> + * @clock_was_set_seq: The sequence number of clock was set events
>> + * @cs_was_changed_seq: The sequence number of clocksource change events
>> + * @monotonic_to_boot: CLOCK_MONOTONIC to CLOCK_BOOTTIME offset
>> + * @cycle_interval: Number of clock cycles in one NTP interval
>> + * @xtime_interval: Number of clock shifted nano seconds in one NTP
>> + * interval.
>> + * @xtime_remainder: Shifted nano seconds left over when rounding
>> + * @cycle_interval
>> + * @raw_interval: Shifted raw nano seconds accumulated per NTP interval.
>> + * @next_leap_ktime: CLOCK_MONOTONIC time value of a pending leap-second
>> + * @ntp_tick: The ntp_tick_length() value currently being
>> + * used. This cached copy ensures we consistently
>> + * apply the tick length for an entire tick, as
>> + * ntp_tick_length may change mid-tick, and we don't
>> + * want to apply that new value to the tick in
>> + * progress.
>> + * @ntp_error: Difference between accumulated time and NTP time in ntp
>> + * shifted nano seconds.
>> + * @ntp_error_shift: Shift conversion between clock shifted nano seconds and
>> + * ntp shifted nano seconds.
>> + * @ntp_err_mult: Multiplication factor for scaled math conversion
>> + * @skip_second_overflow: Flag used to avoid updating NTP twice with same second
>> + * @last_warning: Warning ratelimiter (DEBUG_TIMEKEEPING)
>> + * @underflow_seen: Underflow warning flag (DEBUG_TIMEKEEPING)
>> + * @overflow_seen: Overflow warning flag (DEBUG_TIMEKEEPING)
>> *
>> * Note: For timespec(64) based interfaces wall_to_monotonic is what
>> * we need to add to xtime (or xtime corrected for sub jiffy times)
>> @@ -88,10 +96,25 @@ struct tk_read_base {
>> *
>> * @monotonic_to_boottime is a timespec64 representation of @offs_boot to
>> * accelerate the VDSO update for CLOCK_BOOTTIME.
>> + *
>> + * The cacheline ordering of the structure is optimized for in kernel usage
>> + * of the ktime_get() and ktime_get_ts64() family of time accessors. Struct
>> + * timekeeper is prepended in the core timekeeeping code with a sequence
>> + * count, which results in the following cacheline layout:
>> + *
>> + * 0: seqcount, tkr_mono
>> + * 1: xtime_sec ... tai_offset
>> + * 2: tkr_raw, raw_sec
>> + * 3,4: Internal variables
>> + *
>> + * Cacheline 0,1 contain the data which is used for accessing
>> + * CLOCK_MONOTONIC/REALTIME/BOOTTIME/TAI, while cacheline 2 contains the
>> + * data for accessing CLOCK_MONOTONIC_RAW. Cacheline 3,4 are internal
>> + * variables which are only accessed during timekeeper updates once per
>> + * tick.
>
> Would it make sense to add divider comments or something in the struct
> to make this more visible? I fret in the context of a patch, a + line
> adding a new structure element that breaks the ordered alignment might
> not be obvious.
This is an argument! I'll add simple comments with /* Cachline X: */
Thanks,
Anna-Maria
^ permalink raw reply [flat|nested] 23+ messages in thread
* [PATCH v2a] timekeeping: Reorder struct timekeeper
[not found] ` <20241009-devel-anna-maria-b4-timers-ptp-timekeeping-v2-6-554456a44a15@linutronix.de>
2024-10-11 3:22 ` [PATCH v2 06/25] timekeeping: Reorder struct timekeeper John Stultz
@ 2024-10-15 10:08 ` Anna-Maria Behnsen
1 sibling, 0 replies; 23+ messages in thread
From: Anna-Maria Behnsen @ 2024-10-15 10:08 UTC (permalink / raw)
To: John Stultz, Frederic Weisbecker, Thomas Gleixner
Cc: linux-kernel, netdev, Miroslav Lichvar, Richard Cochran,
Christopher S Hall, Anna-Maria Behnsen
From: Thomas Gleixner <tglx@linutronix.de>
struct timekeeper is ordered suboptimal vs. cachelines. The layout,
including the preceding seqcount (see struct tk_core in timekeeper.c) is:
cacheline 0: seqcount, tkr_mono
cacheline 1: tkr_raw, xtime_sec
cacheline 2: ktime_sec ... tai_offset, internal variables
cacheline 3: next_leap_ktime, raw_sec, internal variables
cacheline 4: internal variables
So any access to via ktime_get*() except for access to CLOCK_MONOTONIC_RAW
will use either cachelines 0 + 1 or cachelines 0 + 2. Access to
CLOCK_MONOTONIC_RAW uses cachelines 0 + 1 + 3.
Reorder the members so that the result is more efficient:
cacheline 0: seqcount, tkr_mono
cacheline 1: xtime_sec, ktime_sec ... tai_offset
cacheline 2: tkr_raw, raw_sec
cacheline 3: internal variables
cacheline 4: internal variables
That means ktime_get*() will access cacheline 0 + 1 and CLOCK_MONOTONIC_RAW
access will use cachelines 0 + 2.
Update kernel-doc and fix formatting issues while at it. Also fix a typo
in struct tk_read_base kernel-doc.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
Acked-by: John Stultz <jstultz@google.com>
---
include/linux/timekeeper_internal.h | 106 +++++++++++++++++-----------
1 file changed, 65 insertions(+), 41 deletions(-)
diff --git a/include/linux/timekeeper_internal.h b/include/linux/timekeeper_internal.h
index 902c20ef495a..a3b6380a7777 100644
--- a/include/linux/timekeeper_internal.h
+++ b/include/linux/timekeeper_internal.h
@@ -26,7 +26,7 @@
* occupies a single 64byte cache line.
*
* The struct is separate from struct timekeeper as it is also used
- * for a fast NMI safe accessors.
+ * for the fast NMI safe accessors.
*
* @base_real is for the fast NMI safe accessor to allow reading clock
* realtime from any context.
@@ -44,33 +44,41 @@ struct tk_read_base {
/**
* struct timekeeper - Structure holding internal timekeeping values.
- * @tkr_mono: The readout base structure for CLOCK_MONOTONIC
- * @tkr_raw: The readout base structure for CLOCK_MONOTONIC_RAW
- * @xtime_sec: Current CLOCK_REALTIME time in seconds
- * @ktime_sec: Current CLOCK_MONOTONIC time in seconds
- * @wall_to_monotonic: CLOCK_REALTIME to CLOCK_MONOTONIC offset
- * @offs_real: Offset clock monotonic -> clock realtime
- * @offs_boot: Offset clock monotonic -> clock boottime
- * @offs_tai: Offset clock monotonic -> clock tai
- * @tai_offset: The current UTC to TAI offset in seconds
- * @clock_was_set_seq: The sequence number of clock was set events
- * @cs_was_changed_seq: The sequence number of clocksource change events
- * @next_leap_ktime: CLOCK_MONOTONIC time value of a pending leap-second
- * @raw_sec: CLOCK_MONOTONIC_RAW time in seconds
- * @monotonic_to_boot: CLOCK_MONOTONIC to CLOCK_BOOTTIME offset
- * @cycle_interval: Number of clock cycles in one NTP interval
- * @xtime_interval: Number of clock shifted nano seconds in one NTP
- * interval.
- * @xtime_remainder: Shifted nano seconds left over when rounding
- * @cycle_interval
- * @raw_interval: Shifted raw nano seconds accumulated per NTP interval.
- * @ntp_error: Difference between accumulated time and NTP time in ntp
- * shifted nano seconds.
- * @ntp_error_shift: Shift conversion between clock shifted nano seconds and
- * ntp shifted nano seconds.
- * @last_warning: Warning ratelimiter (DEBUG_TIMEKEEPING)
- * @underflow_seen: Underflow warning flag (DEBUG_TIMEKEEPING)
- * @overflow_seen: Overflow warning flag (DEBUG_TIMEKEEPING)
+ * @tkr_mono: The readout base structure for CLOCK_MONOTONIC
+ * @xtime_sec: Current CLOCK_REALTIME time in seconds
+ * @ktime_sec: Current CLOCK_MONOTONIC time in seconds
+ * @wall_to_monotonic: CLOCK_REALTIME to CLOCK_MONOTONIC offset
+ * @offs_real: Offset clock monotonic -> clock realtime
+ * @offs_boot: Offset clock monotonic -> clock boottime
+ * @offs_tai: Offset clock monotonic -> clock tai
+ * @tai_offset: The current UTC to TAI offset in seconds
+ * @tkr_raw: The readout base structure for CLOCK_MONOTONIC_RAW
+ * @raw_sec: CLOCK_MONOTONIC_RAW time in seconds
+ * @clock_was_set_seq: The sequence number of clock was set events
+ * @cs_was_changed_seq: The sequence number of clocksource change events
+ * @monotonic_to_boot: CLOCK_MONOTONIC to CLOCK_BOOTTIME offset
+ * @cycle_interval: Number of clock cycles in one NTP interval
+ * @xtime_interval: Number of clock shifted nano seconds in one NTP
+ * interval.
+ * @xtime_remainder: Shifted nano seconds left over when rounding
+ * @cycle_interval
+ * @raw_interval: Shifted raw nano seconds accumulated per NTP interval.
+ * @next_leap_ktime: CLOCK_MONOTONIC time value of a pending leap-second
+ * @ntp_tick: The ntp_tick_length() value currently being
+ * used. This cached copy ensures we consistently
+ * apply the tick length for an entire tick, as
+ * ntp_tick_length may change mid-tick, and we don't
+ * want to apply that new value to the tick in
+ * progress.
+ * @ntp_error: Difference between accumulated time and NTP time in ntp
+ * shifted nano seconds.
+ * @ntp_error_shift: Shift conversion between clock shifted nano seconds and
+ * ntp shifted nano seconds.
+ * @ntp_err_mult: Multiplication factor for scaled math conversion
+ * @skip_second_overflow: Flag used to avoid updating NTP twice with same second
+ * @last_warning: Warning ratelimiter (DEBUG_TIMEKEEPING)
+ * @underflow_seen: Underflow warning flag (DEBUG_TIMEKEEPING)
+ * @overflow_seen: Overflow warning flag (DEBUG_TIMEKEEPING)
*
* Note: For timespec(64) based interfaces wall_to_monotonic is what
* we need to add to xtime (or xtime corrected for sub jiffy times)
@@ -88,10 +96,28 @@ struct tk_read_base {
*
* @monotonic_to_boottime is a timespec64 representation of @offs_boot to
* accelerate the VDSO update for CLOCK_BOOTTIME.
+ *
+ * The cacheline ordering of the structure is optimized for in kernel usage of
+ * the ktime_get() and ktime_get_ts64() family of time accessors. Struct
+ * timekeeper is prepended in the core timekeeping code with a sequence count,
+ * which results in the following cacheline layout:
+ *
+ * 0: seqcount, tkr_mono
+ * 1: xtime_sec ... tai_offset
+ * 2: tkr_raw, raw_sec
+ * 3,4: Internal variables
+ *
+ * Cacheline 0,1 contain the data which is used for accessing
+ * CLOCK_MONOTONIC/REALTIME/BOOTTIME/TAI, while cacheline 2 contains the
+ * data for accessing CLOCK_MONOTONIC_RAW. Cacheline 3,4 are internal
+ * variables which are only accessed during timekeeper updates once per
+ * tick.
*/
struct timekeeper {
+ /* Cacheline 0 (together with prepended seqcount of timekeeper core): */
struct tk_read_base tkr_mono;
- struct tk_read_base tkr_raw;
+
+ /* Cacheline 1: */
u64 xtime_sec;
unsigned long ktime_sec;
struct timespec64 wall_to_monotonic;
@@ -99,31 +125,29 @@ struct timekeeper {
ktime_t offs_boot;
ktime_t offs_tai;
s32 tai_offset;
+
+ /* Cacheline 2: */
+ struct tk_read_base tkr_raw;
+ u64 raw_sec;
+
+ /* Cachline 3 and 4 (timekeeping internal variables): */
unsigned int clock_was_set_seq;
u8 cs_was_changed_seq;
- ktime_t next_leap_ktime;
- u64 raw_sec;
+
struct timespec64 monotonic_to_boot;
- /* The following members are for timekeeping internal use */
u64 cycle_interval;
u64 xtime_interval;
s64 xtime_remainder;
u64 raw_interval;
- /* The ntp_tick_length() value currently being used.
- * This cached copy ensures we consistently apply the tick
- * length for an entire tick, as ntp_tick_length may change
- * mid-tick, and we don't want to apply that new value to
- * the tick in progress.
- */
+
+ ktime_t next_leap_ktime;
u64 ntp_tick;
- /* Difference between accumulated time and NTP time in ntp
- * shifted nano seconds. */
s64 ntp_error;
u32 ntp_error_shift;
u32 ntp_err_mult;
- /* Flag used to avoid updating NTP twice with same second */
u32 skip_second_overflow;
+
#ifdef CONFIG_DEBUG_TIMEKEEPING
long last_warning;
/*
--
2.39.5
^ permalink raw reply related [flat|nested] 23+ messages in thread
* Re: [PATCH v2 07/25] timekeeping: Move shadow_timekeeper into tk_core
[not found] ` <20241009-devel-anna-maria-b4-timers-ptp-timekeeping-v2-7-554456a44a15@linutronix.de>
@ 2024-10-24 21:11 ` John Stultz
0 siblings, 0 replies; 23+ messages in thread
From: John Stultz @ 2024-10-24 21:11 UTC (permalink / raw)
To: Anna-Maria Behnsen
Cc: Frederic Weisbecker, Thomas Gleixner, linux-kernel, netdev,
Miroslav Lichvar, Richard Cochran, Christopher S Hall
On Wed, Oct 9, 2024 at 1:29 AM Anna-Maria Behnsen
<anna-maria@linutronix.de> wrote:
>
> From: Thomas Gleixner <tglx@linutronix.de>
>
> From: Thomas Gleixner <tglx@linutronix.de>
>
> tk_core requires shadow_timekeeper to allow timekeeping_advance() updating
> without holding the timekeeper sequence count write locked. This allows the
> readers to make progress up to the actual update where the shadow
> timekeeper is copied over to the real timekeeper.
>
> As long as there is only a single timekeeper, having them separate is
> fine. But when the timekeeper infrastructure will be reused for per ptp
> clock timekeepers, shadow_timekeeper needs to be part of tk_core.
>
> No functional change.
>
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
Acked-by: John Stultz <jstultz@google.com>
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH v2 08/25] timekeeping: Encapsulate locking/unlocking of timekeeper_lock
[not found] ` <20241009-devel-anna-maria-b4-timers-ptp-timekeeping-v2-8-554456a44a15@linutronix.de>
@ 2024-10-24 21:19 ` John Stultz
0 siblings, 0 replies; 23+ messages in thread
From: John Stultz @ 2024-10-24 21:19 UTC (permalink / raw)
To: Anna-Maria Behnsen
Cc: Frederic Weisbecker, Thomas Gleixner, linux-kernel, netdev,
Miroslav Lichvar, Richard Cochran, Christopher S Hall
On Wed, Oct 9, 2024 at 1:29 AM Anna-Maria Behnsen
<anna-maria@linutronix.de> wrote:
>
> From: Thomas Gleixner <tglx@linutronix.de>
>
> From: Thomas Gleixner <tglx@linutronix.de>
>
> timekeeper_lock protects updates of timekeeper (tk_core). It is also used
> by vdso_update_begin/end() and not only internally by the timekeeper code.
>
> As long as there is only a single timekeeper, this works fine. But when
> the timekeeper infrastructure will be reused for per ptp clock timekeepers,
> timekeeper_lock needs to be part of tk_core..
>
> Therefore encapuslate locking/unlocking of timekeeper_lock and make the
> lock static.
>
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
Acked-by: John Stultz <jstultz@google.com>
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH v2 09/25] timekeeping: Move timekeeper_lock into tk_core
[not found] ` <20241009-devel-anna-maria-b4-timers-ptp-timekeeping-v2-9-554456a44a15@linutronix.de>
@ 2024-10-24 21:20 ` John Stultz
0 siblings, 0 replies; 23+ messages in thread
From: John Stultz @ 2024-10-24 21:20 UTC (permalink / raw)
To: Anna-Maria Behnsen
Cc: Frederic Weisbecker, Thomas Gleixner, linux-kernel, netdev,
Miroslav Lichvar, Richard Cochran, Christopher S Hall
On Wed, Oct 9, 2024 at 1:29 AM Anna-Maria Behnsen
<anna-maria@linutronix.de> wrote:
>
> From: Anna-Maria Behnsen <anna-maria@linutronix.de>
>
> timekeeper_lock protects updates to struct tk_core but is not part of
> struct tk_core. As long as there is only a single timekeeper, this is not a
> problem. But when the timekeeper infrastructure will be reused for per ptp
> clock timekeepers, timekeeper_lock needs to be part of tk_core.
>
> Move the lock into tk_core, move initialisation of the lock and sequence
> counter into timekeeping_init() and update all users of timekeeper_lock.
>
> As this is touching all lock sites, convert them to use:
>
> guard(raw_spinlock_irqsave)(&tk_core.lock);
>
> instead of lock/unlock functions whenever possible.
>
> Suggested-by: Thomas Gleixner <tglx@linutronix.de>
> Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
Acked-by: John Stultz <jstultz@google.com>
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH v2 10/25] timekeeping: Define a struct type for tk_core to make it reusable
[not found] ` <20241009-devel-anna-maria-b4-timers-ptp-timekeeping-v2-10-554456a44a15@linutronix.de>
@ 2024-10-24 21:21 ` John Stultz
0 siblings, 0 replies; 23+ messages in thread
From: John Stultz @ 2024-10-24 21:21 UTC (permalink / raw)
To: Anna-Maria Behnsen
Cc: Frederic Weisbecker, Thomas Gleixner, linux-kernel, netdev,
Miroslav Lichvar, Richard Cochran, Christopher S Hall
On Wed, Oct 9, 2024 at 1:29 AM Anna-Maria Behnsen
<anna-maria@linutronix.de> wrote:
>
> From: Anna-Maria Behnsen <anna-maria@linutronix.de>
>
> The struct tk_core uses is not reusable. As long as there is only a single
> timekeeper, this is not a problem. But when the timekeeper infrastructure
> will be reused for per ptp clock timekeepers, an explicit struct type is
> required.
>
> Define struct tk_data as explicit struct type for tk_core.
>
> Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
Acked-by: John Stultz <jstultz@google.com>
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH v2 11/25] timekeeping: Introduce tkd_basic_setup() to make lock and seqcount init reusable
[not found] ` <20241009-devel-anna-maria-b4-timers-ptp-timekeeping-v2-11-554456a44a15@linutronix.de>
@ 2024-10-24 21:25 ` John Stultz
0 siblings, 0 replies; 23+ messages in thread
From: John Stultz @ 2024-10-24 21:25 UTC (permalink / raw)
To: Anna-Maria Behnsen
Cc: Frederic Weisbecker, Thomas Gleixner, linux-kernel, netdev,
Miroslav Lichvar, Richard Cochran, Christopher S Hall
On Wed, Oct 9, 2024 at 1:29 AM Anna-Maria Behnsen
<anna-maria@linutronix.de> wrote:
>
> Initialization of lock and seqcount needs to be done for every instance of
> timekeeper struct. To be able to easily reuse it, create a separate
> function for it.
>
> Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
Acked-by: John Stultz <jstultz@google.com>
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH v2 12/25] timekeeping: Add struct tk_data as argument to timekeeping_update()
[not found] ` <20241009-devel-anna-maria-b4-timers-ptp-timekeeping-v2-12-554456a44a15@linutronix.de>
@ 2024-10-24 21:29 ` John Stultz
0 siblings, 0 replies; 23+ messages in thread
From: John Stultz @ 2024-10-24 21:29 UTC (permalink / raw)
To: Anna-Maria Behnsen
Cc: Frederic Weisbecker, Thomas Gleixner, linux-kernel, netdev,
Miroslav Lichvar, Richard Cochran, Christopher S Hall
On Wed, Oct 9, 2024 at 1:29 AM Anna-Maria Behnsen
<anna-maria@linutronix.de> wrote:
>
> From: Anna-Maria Behnsen <anna-maria@linutronix.de>
>
> Updates of the timekeeper are done in two ways:
>
> 1. Updating timekeeper and afterwards memcpy()'ing the result into
> shadow_timekeeper using timekeeping_update(). Used everywhere for
> updates except in timekeeping_advance(); the sequence counter protected
> region starts before the first change to the timekeeper is done.
>
> 2. Updating shadow_timekeeper and then memcpy()'ing the result into
> timekeeper. Used only by in timekeeping_advance(); The seqence counter
> protected region is only around timekeeping_update() and the memcpy for
> copy from shadow to timekeeper.
>
> The second option is fast path optimized. The sequence counter protected
> region is as short as possible.
>
> As this behaviour is mainly documented by commit messages, but not in code,
> it makes the not easy timekeeping code more complicated to read.
>
> There is no reason why updates to the timekeeper can't use the optimized
> version everywhere. With this, the code will be cleaner, as code is reused
> instead of duplicated.
>
> To be able to access tk_data which contains all required information, add a
> pointer to tk_data as an argument to timekeeping_update(). With that
> convert the comment about holding the lock into a lockdep assert.
>
> No functional change.
>
> Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
Acked-by: John Stultz <jstultz@google.com>
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH v2 13/25] timekeeping: Split out timekeeper update of timekeeping_advanced()
[not found] ` <20241009-devel-anna-maria-b4-timers-ptp-timekeeping-v2-13-554456a44a15@linutronix.de>
@ 2024-10-24 21:43 ` John Stultz
2024-10-24 21:53 ` John Stultz
1 sibling, 0 replies; 23+ messages in thread
From: John Stultz @ 2024-10-24 21:43 UTC (permalink / raw)
To: Anna-Maria Behnsen
Cc: Frederic Weisbecker, Thomas Gleixner, linux-kernel, netdev,
Miroslav Lichvar, Richard Cochran, Christopher S Hall
On Wed, Oct 9, 2024 at 1:29 AM Anna-Maria Behnsen
<anna-maria@linutronix.de> wrote:
>
> From: Anna-Maria Behnsen <anna-maria@linutronix.de>
>
> timekeeping_advance() is the only optimized function which uses
> shadow_timekeeper for updating the real timekeeper to keep the sequence
> counter protected region as small as possible.
>
> To be able to transform timekeeper updates in other functions to use the
> same logic, split out functionality into a separate function
> timekeeper_update_staged().
>
> While at it, document the reason why the sequence counter must be write
> held over the call to timekeeping_update() and the copying to the real
> timekeeper and why using a pointer based update is suboptimal.
>
> No functional change.
>
> Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
Acked-by: John Stultz <jstultz@google.com>
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH v2 15/25] timekeeping: Provide timekeeping_restore_shadow()
[not found] ` <20241009-devel-anna-maria-b4-timers-ptp-timekeeping-v2-15-554456a44a15@linutronix.de>
@ 2024-10-24 21:45 ` John Stultz
0 siblings, 0 replies; 23+ messages in thread
From: John Stultz @ 2024-10-24 21:45 UTC (permalink / raw)
To: Anna-Maria Behnsen
Cc: Frederic Weisbecker, Thomas Gleixner, linux-kernel, netdev,
Miroslav Lichvar, Richard Cochran, Christopher S Hall
On Wed, Oct 9, 2024 at 1:29 AM Anna-Maria Behnsen
<anna-maria@linutronix.de> wrote:
>
> From: Thomas Gleixner <tglx@linutronix.de>
>
> From: Thomas Gleixner <tglx@linutronix.de>
>
> Functions which operate on the real timekeeper, e.g. do_settimeofday(),
> have error conditions. If they are hit a full timekeeping update is still
> required because the already committed operations modified the timekeeper.
>
> When switching these functions to operate on the shadow timekeeper then the
> full update can be avoided in the error case, but the modified shadow
> timekeeper has to be restored.
>
> Provide a helper function for that.
>
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
Acked-by: John Stultz <jstultz@google.com>
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH v2 13/25] timekeeping: Split out timekeeper update of timekeeping_advanced()
[not found] ` <20241009-devel-anna-maria-b4-timers-ptp-timekeeping-v2-13-554456a44a15@linutronix.de>
2024-10-24 21:43 ` [PATCH v2 13/25] timekeeping: Split out timekeeper update of timekeeping_advanced() John Stultz
@ 2024-10-24 21:53 ` John Stultz
1 sibling, 0 replies; 23+ messages in thread
From: John Stultz @ 2024-10-24 21:53 UTC (permalink / raw)
To: Anna-Maria Behnsen
Cc: Frederic Weisbecker, Thomas Gleixner, linux-kernel, netdev,
Miroslav Lichvar, Richard Cochran, Christopher S Hall
On Wed, Oct 9, 2024 at 1:29 AM Anna-Maria Behnsen
<anna-maria@linutronix.de> wrote:
>
> From: Anna-Maria Behnsen <anna-maria@linutronix.de>
>
> timekeeping_advance() is the only optimized function which uses
> shadow_timekeeper for updating the real timekeeper to keep the sequence
> counter protected region as small as possible.
>
> To be able to transform timekeeper updates in other functions to use the
> same logic, split out functionality into a separate function
> timekeeper_update_staged().
>
> While at it, document the reason why the sequence counter must be write
> held over the call to timekeeping_update() and the copying to the real
> timekeeper and why using a pointer based update is suboptimal.
>
> No functional change.
>
> Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
> ---
> kernel/time/timekeeping.c | 43 +++++++++++++++++++++++++++----------------
> 1 file changed, 27 insertions(+), 16 deletions(-)
>
> diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
> index 878f9606946d..fcb2b8b232d2 100644
> --- a/kernel/time/timekeeping.c
> +++ b/kernel/time/timekeeping.c
> @@ -780,7 +780,32 @@ static void timekeeping_update(struct tk_data *tkd, struct timekeeper *tk, unsig
> * timekeeper structure on the next update with stale data
> */
> if (action & TK_MIRROR)
> - memcpy(&tk_core.shadow_timekeeper, &tk_core.timekeeper, sizeof(tk_core.timekeeper));
> + memcpy(&tkd->shadow_timekeeper, tk, sizeof(*tk));
> +}
> +
> +static void timekeeping_update_staged(struct tk_data *tkd, unsigned int action)
Minor nit I realized as I saw how this was used later on:
timekeeping_update_staged() isn't super clear right off. Maybe
timekeeping_update_from_shadow() might make it more clear?
thanks
-john
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH v2 16/25] timekeeping: Rework do_settimeofday64() to use shadow_timekeeper
[not found] ` <20241009-devel-anna-maria-b4-timers-ptp-timekeeping-v2-16-554456a44a15@linutronix.de>
@ 2024-10-24 21:54 ` John Stultz
0 siblings, 0 replies; 23+ messages in thread
From: John Stultz @ 2024-10-24 21:54 UTC (permalink / raw)
To: Anna-Maria Behnsen
Cc: Frederic Weisbecker, Thomas Gleixner, linux-kernel, netdev,
Miroslav Lichvar, Richard Cochran, Christopher S Hall
On Wed, Oct 9, 2024 at 1:29 AM Anna-Maria Behnsen
<anna-maria@linutronix.de> wrote:
>
> From: Anna-Maria Behnsen <anna-maria@linutronix.de>
>
> Updates of the timekeeper can be done by operating on the shadow timekeeper
> and afterwards copying the result into the real timekeeper. This has the
> advantage, that the sequence count write protected region is kept as small
> as possible.
>
> Convert do_settimeofday64() to use this scheme.
>
> That allows to use a scoped_guard() for locking the timekeeper lock as the
> usage of the shadow timekeeper allows a rollback in the error case instead
> of the full timekeeper update of the original code.
>
> Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
> ---
> kernel/time/timekeeping.c | 42 ++++++++++++++++--------------------------
> 1 file changed, 16 insertions(+), 26 deletions(-)
>
> diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
> index 41d88f645868..cc01ad53d96d 100644
> --- a/kernel/time/timekeeping.c
> +++ b/kernel/time/timekeeping.c
> @@ -1479,45 +1479,35 @@ EXPORT_SYMBOL_GPL(timekeeping_clocksource_has_base);
> */
> int do_settimeofday64(const struct timespec64 *ts)
> {
> - struct timekeeper *tk = &tk_core.timekeeper;
> struct timespec64 ts_delta, xt;
> - unsigned long flags;
> - int ret = 0;
>
> if (!timespec64_valid_settod(ts))
> return -EINVAL;
>
> - raw_spin_lock_irqsave(&tk_core.lock, flags);
> - write_seqcount_begin(&tk_core.seq);
> -
> - timekeeping_forward_now(tk);
> -
> - xt = tk_xtime(tk);
> - ts_delta = timespec64_sub(*ts, xt);
> + scoped_guard (raw_spinlock_irqsave, &tk_core.lock) {
> + struct timekeeper *tk = &tk_core.shadow_timekeeper;
nit: maybe shadow_tk instead, so it is additionally clear in the
following logic which is being modified.
Otherwise,
Acked-by: John Stultz <jstultz@google.com>
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH v2 17/25] timekeeping: Rework timekeeping_inject_offset() to use shadow_timekeeper
[not found] ` <20241009-devel-anna-maria-b4-timers-ptp-timekeeping-v2-17-554456a44a15@linutronix.de>
@ 2024-10-24 21:57 ` John Stultz
0 siblings, 0 replies; 23+ messages in thread
From: John Stultz @ 2024-10-24 21:57 UTC (permalink / raw)
To: Anna-Maria Behnsen
Cc: Frederic Weisbecker, Thomas Gleixner, linux-kernel, netdev,
Miroslav Lichvar, Richard Cochran, Christopher S Hall
On Wed, Oct 9, 2024 at 1:29 AM Anna-Maria Behnsen
<anna-maria@linutronix.de> wrote:
>
> From: Anna-Maria Behnsen <anna-maria@linutronix.de>
>
> Updates of the timekeeper can be done by operating on the shadow timekeeper
> and afterwards copying the result into the real timekeeper. This has the
> advantage, that the sequence count write protected region is kept as small
> as possible.
>
> Convert timekeeping_inject_offset() to use this scheme.
>
> That allows to use a scoped_guard() for locking the timekeeper lock as the
> usage of the shadow timekeeper allows a rollback in the error case instead
> of the full timekeeper update of the original code.
>
> Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
> ---
> kernel/time/timekeeping.c | 41 ++++++++++++++++-------------------------
> 1 file changed, 16 insertions(+), 25 deletions(-)
>
> diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
> index cc01ad53d96d..051041e92e54 100644
> --- a/kernel/time/timekeeping.c
> +++ b/kernel/time/timekeeping.c
> @@ -1519,40 +1519,31 @@ EXPORT_SYMBOL(do_settimeofday64);
> */
> static int timekeeping_inject_offset(const struct timespec64 *ts)
> {
> - struct timekeeper *tk = &tk_core.timekeeper;
> - unsigned long flags;
> - struct timespec64 tmp;
> - int ret = 0;
> -
> if (ts->tv_nsec < 0 || ts->tv_nsec >= NSEC_PER_SEC)
> return -EINVAL;
>
> - raw_spin_lock_irqsave(&tk_core.lock, flags);
> - write_seqcount_begin(&tk_core.seq);
> -
> - timekeeping_forward_now(tk);
> -
> - /* Make sure the proposed value is valid */
> - tmp = timespec64_add(tk_xtime(tk), *ts);
> - if (timespec64_compare(&tk->wall_to_monotonic, ts) > 0 ||
> - !timespec64_valid_settod(&tmp)) {
> - ret = -EINVAL;
> - goto error;
> - }
> + scoped_guard (raw_spinlock_irqsave, &tk_core.lock) {
> + struct timekeeper *tk = &tk_core.shadow_timekeeper;
Nit: Same suggestion of naming this tk_shadow for added clarity.
Otherwise,
Acked-by: John Stultz <jstultz@google.com>
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH v2 18/25] timekeeping: Rework change_clocksource() to use shadow_timekeeper
[not found] ` <20241009-devel-anna-maria-b4-timers-ptp-timekeeping-v2-18-554456a44a15@linutronix.de>
@ 2024-10-24 21:58 ` John Stultz
0 siblings, 0 replies; 23+ messages in thread
From: John Stultz @ 2024-10-24 21:58 UTC (permalink / raw)
To: Anna-Maria Behnsen
Cc: Frederic Weisbecker, Thomas Gleixner, linux-kernel, netdev,
Miroslav Lichvar, Richard Cochran, Christopher S Hall
On Wed, Oct 9, 2024 at 1:29 AM Anna-Maria Behnsen
<anna-maria@linutronix.de> wrote:
>
> From: Anna-Maria Behnsen <anna-maria@linutronix.de>
>
> Updates of the timekeeper can be done by operating on the shadow timekeeper
> and afterwards copying the result into the real timekeeper. This has the
> advantage, that the sequence count write protected region is kept as small
> as possible.
>
> Convert change_clocksource() to use this scheme.
>
> Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
> ---
> kernel/time/timekeeping.c | 18 +++++++-----------
> 1 file changed, 7 insertions(+), 11 deletions(-)
>
> diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
> index 051041e92e54..6d21e9bf5c35 100644
> --- a/kernel/time/timekeeping.c
> +++ b/kernel/time/timekeeping.c
> @@ -1596,9 +1596,7 @@ static void __timekeeping_set_tai_offset(struct timekeeper *tk, s32 tai_offset)
> */
> static int change_clocksource(void *data)
> {
> - struct timekeeper *tk = &tk_core.timekeeper;
> struct clocksource *new = data, *old = NULL;
> - unsigned long flags;
>
> /*
> * If the clocksource is in a module, get a module reference.
> @@ -1614,16 +1612,14 @@ static int change_clocksource(void *data)
> return 0;
> }
>
> - raw_spin_lock_irqsave(&tk_core.lock, flags);
> - write_seqcount_begin(&tk_core.seq);
> -
> - timekeeping_forward_now(tk);
> - old = tk->tkr_mono.clock;
> - tk_setup_internals(tk, new);
> - timekeeping_update(&tk_core, tk, TK_UPDATE_ALL | TK_MIRROR);
> + scoped_guard (raw_spinlock_irqsave, &tk_core.lock) {
> + struct timekeeper *tk = &tk_core.shadow_timekeeper;
Same nit: tk_shadow maybe?
Acked-by: John Stultz <jstultz@google.com>
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH v2 14/25] timekeeping: Introduce combined timekeeping action flag
[not found] ` <20241009-devel-anna-maria-b4-timers-ptp-timekeeping-v2-14-554456a44a15@linutronix.de>
@ 2024-10-24 22:12 ` John Stultz
0 siblings, 0 replies; 23+ messages in thread
From: John Stultz @ 2024-10-24 22:12 UTC (permalink / raw)
To: Anna-Maria Behnsen
Cc: Frederic Weisbecker, Thomas Gleixner, linux-kernel, netdev,
Miroslav Lichvar, Richard Cochran, Christopher S Hall
On Wed, Oct 9, 2024 at 1:29 AM Anna-Maria Behnsen
<anna-maria@linutronix.de> wrote:
>
> Instead of explicitly listing all the separate timekeeping actions flags,
> introduce a new one which covers all actions except TK_MIRROR action.
>
> No functional change.
>
> Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
> ---
> kernel/time/timekeeping.c | 10 ++++++----
> 1 file changed, 6 insertions(+), 4 deletions(-)
>
> diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
> index fcb2b8b232d2..5a747afe64b4 100644
> --- a/kernel/time/timekeeping.c
> +++ b/kernel/time/timekeeping.c
> @@ -33,6 +33,8 @@
> #define TK_MIRROR (1 << 1)
> #define TK_CLOCK_WAS_SET (1 << 2)
>
> +#define TK_UPDATE_ALL (TK_CLEAR_NTP | TK_CLOCK_WAS_SET)
> +
Hrm. I feel a little wary around having a flag mask called _ALL when
it doesn't actually include all the other flags.
I also recognize the "TK_CLEAR_NTP | TK_CLOCK_WAS_SET" arguments can
feel repetitive, but I find having them explicitly listed makes the
code more readable to me.
Combining these common ones together just means there is a 4th option
one has to keep in their head to translate.
Further, as I look through the logic TK_MIRROR could probably be
improved by adding a direction (it's easy to mix up what is being
mirrored to what). Maybe TK_MIRROR_TO_SHADOW?
But these are mostly just strategies to help my scatterbrained state,
so this isn't a hard objection if you disagree.
thanks
-john
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH v2 19/25] timekeeping: Rework timekeeping_init() to use shadow_timekeeper
[not found] ` <20241009-devel-anna-maria-b4-timers-ptp-timekeeping-v2-19-554456a44a15@linutronix.de>
@ 2024-10-24 22:16 ` John Stultz
0 siblings, 0 replies; 23+ messages in thread
From: John Stultz @ 2024-10-24 22:16 UTC (permalink / raw)
To: Anna-Maria Behnsen
Cc: Frederic Weisbecker, Thomas Gleixner, linux-kernel, netdev,
Miroslav Lichvar, Richard Cochran, Christopher S Hall
On Wed, Oct 9, 2024 at 1:29 AM Anna-Maria Behnsen
<anna-maria@linutronix.de> wrote:
>
> From: Anna-Maria Behnsen <anna-maria@linutronix.de>
>
> For timekeeping_init() the sequence count write held time is not relevant
> and it could keep working on the real timekeeper, but there is no reason to
> make it different from other timekeeper updates.
>
> Convert it to operate on the shadow timekeeper.
>
> Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
> ---
> kernel/time/timekeeping.c | 8 +++-----
> 1 file changed, 3 insertions(+), 5 deletions(-)
>
> diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
> index 6d21e9bf5c35..bb225534fee1 100644
> --- a/kernel/time/timekeeping.c
> +++ b/kernel/time/timekeeping.c
> @@ -1778,7 +1778,7 @@ static bool persistent_clock_exists;
> void __init timekeeping_init(void)
> {
> struct timespec64 wall_time, boot_offset, wall_to_mono;
> - struct timekeeper *tk = &tk_core.timekeeper;
> + struct timekeeper *tk = &tk_core.shadow_timekeeper;
Same nit as earlier, otherwise
Acked-by: John Stultz <jstultz@google.com>
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH v2 20/25] timekeeping: Rework timekeeping_inject_sleeptime64() to use shadow_timekeeper
[not found] ` <20241009-devel-anna-maria-b4-timers-ptp-timekeeping-v2-20-554456a44a15@linutronix.de>
@ 2024-10-24 22:16 ` John Stultz
0 siblings, 0 replies; 23+ messages in thread
From: John Stultz @ 2024-10-24 22:16 UTC (permalink / raw)
To: Anna-Maria Behnsen
Cc: Frederic Weisbecker, Thomas Gleixner, linux-kernel, netdev,
Miroslav Lichvar, Richard Cochran, Christopher S Hall
On Wed, Oct 9, 2024 at 1:29 AM Anna-Maria Behnsen
<anna-maria@linutronix.de> wrote:
>
> From: Anna-Maria Behnsen <anna-maria@linutronix.de>
>
> Updates of the timekeeper can be done by operating on the shadow timekeeper
> and afterwards copying the result into the real timekeeper. This has the
> advantage, that the sequence count write protected region is kept as small
> as possible.
>
> Convert timekeeping_inject_sleeptime64() to use this scheme.
>
> Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
> ---
> kernel/time/timekeeping.c | 22 +++++++---------------
> 1 file changed, 7 insertions(+), 15 deletions(-)
>
> diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
> index bb225534fee1..c1a2726a0d41 100644
> --- a/kernel/time/timekeeping.c
> +++ b/kernel/time/timekeeping.c
> @@ -1893,22 +1893,14 @@ bool timekeeping_rtc_skipsuspend(void)
> */
> void timekeeping_inject_sleeptime64(const struct timespec64 *delta)
> {
> - struct timekeeper *tk = &tk_core.timekeeper;
> - unsigned long flags;
> -
> - raw_spin_lock_irqsave(&tk_core.lock, flags);
> - write_seqcount_begin(&tk_core.seq);
> -
> - suspend_timing_needed = false;
> -
> - timekeeping_forward_now(tk);
> -
> - __timekeeping_inject_sleeptime(tk, delta);
> -
> - timekeeping_update(&tk_core, tk, TK_UPDATE_ALL | TK_MIRROR);
> + scoped_guard(raw_spinlock_irqsave, &tk_core.lock) {
> + struct timekeeper *tk = &tk_core.shadow_timekeeper;
Same nit as the rest, otherwise
Acked-by: John Stultz <jstultz@google.com>
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH v2 21/25] timekeeping: Rework timekeeping_resume() to use shadow_timekeeper
[not found] ` <20241009-devel-anna-maria-b4-timers-ptp-timekeeping-v2-21-554456a44a15@linutronix.de>
@ 2024-10-24 22:18 ` John Stultz
0 siblings, 0 replies; 23+ messages in thread
From: John Stultz @ 2024-10-24 22:18 UTC (permalink / raw)
To: Anna-Maria Behnsen
Cc: Frederic Weisbecker, Thomas Gleixner, linux-kernel, netdev,
Miroslav Lichvar, Richard Cochran, Christopher S Hall
On Wed, Oct 9, 2024 at 1:29 AM Anna-Maria Behnsen
<anna-maria@linutronix.de> wrote:
>
> From: Anna-Maria Behnsen <anna-maria@linutronix.de>
>
> Updates of the timekeeper can be done by operating on the shadow timekeeper
> and afterwards copying the result into the real timekeeper. This has the
> advantage, that the sequence count write protected region is kept as small
> as possible.
>
> While the sequence count held time is not relevant for the resume path as
> there is no concurrency, there is no reason to have this function
> different than all the other update sites.
>
> Convert timekeeping_inject_offset() to use this scheme and cleanup the
> variable declaration while at it.
>
> Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
Other than the tk_shadow naming nit from earlier,
Acked-by: John Stultz <jstultz@google.com>
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH v2 22/25] timekeeping: Rework timekeeping_suspend() to use shadow_timekeeper
[not found] ` <20241009-devel-anna-maria-b4-timers-ptp-timekeeping-v2-22-554456a44a15@linutronix.de>
@ 2024-10-24 22:21 ` John Stultz
0 siblings, 0 replies; 23+ messages in thread
From: John Stultz @ 2024-10-24 22:21 UTC (permalink / raw)
To: Anna-Maria Behnsen
Cc: Frederic Weisbecker, Thomas Gleixner, linux-kernel, netdev,
Miroslav Lichvar, Richard Cochran, Christopher S Hall
On Wed, Oct 9, 2024 at 1:29 AM Anna-Maria Behnsen
<anna-maria@linutronix.de> wrote:
>
> From: Anna-Maria Behnsen <anna-maria@linutronix.de>
>
> Updates of the timekeeper can be done by operating on the shadow timekeeper
> and afterwards copying the result into the real timekeeper. This has the
> advantage, that the sequence count write protected region is kept as small
> as possible.
>
> While the sequence count held time is not relevant for the resume path as
> there is no concurrency, there is no reason to have this function
> different than all the other update sites.
>
> Convert timekeeping_inject_offset() to use this scheme and cleanup the
> variable declarations while at it.
>
> As halt_fast_timekeeper() does not need protection sequence counter, it is
> no problem to move it with this change outside of the sequence counter
> protected area. But it still needs to be executed while holding the lock.
>
> Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
tk_shadow naming nit, but otherwise:
Acked-by: John Stultz <jstultz@google.com>
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH v2 23/25] timekeeping: Rework do_adjtimex() to use shadow_timekeeper
[not found] ` <20241009-devel-anna-maria-b4-timers-ptp-timekeeping-v2-23-554456a44a15@linutronix.de>
@ 2024-10-24 22:26 ` John Stultz
0 siblings, 0 replies; 23+ messages in thread
From: John Stultz @ 2024-10-24 22:26 UTC (permalink / raw)
To: Anna-Maria Behnsen
Cc: Frederic Weisbecker, Thomas Gleixner, linux-kernel, netdev,
Miroslav Lichvar, Richard Cochran, Christopher S Hall
On Wed, Oct 9, 2024 at 1:29 AM Anna-Maria Behnsen
<anna-maria@linutronix.de> wrote:
>
> From: Anna-Maria Behnsen <anna-maria@linutronix.de>
>
> Updates of the timekeeper can be done by operating on the shadow timekeeper
> and afterwards copying the result into the real timekeeper. This has the
> advantage, that the sequence count write protected region is kept as small
> as possible.
>
> Convert do_adjtimex() to use this scheme and take the opportunity to use a
> scoped_guard() for locking.
>
> That requires to have a separate function for updating the leap state so
> that the update is protected by the sequence count. This also brings the
> timekeeper and the shadow timekeeper in sync for this state, which was not
> the case so far. That's not a correctness problem as the state is only used
> at the read sides which use the real timekeeper, but it's inconsistent
> nevertheless.
>
> Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
Again, the tk_shadow naming nit, but otherwise:
Acked-by: John Stultz <jstultz@google.com>
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH v2 24/25] timekeeping: Remove TK_MIRROR timekeeping_update() action
[not found] ` <20241009-devel-anna-maria-b4-timers-ptp-timekeeping-v2-24-554456a44a15@linutronix.de>
@ 2024-10-24 22:29 ` John Stultz
0 siblings, 0 replies; 23+ messages in thread
From: John Stultz @ 2024-10-24 22:29 UTC (permalink / raw)
To: Anna-Maria Behnsen
Cc: Frederic Weisbecker, Thomas Gleixner, linux-kernel, netdev,
Miroslav Lichvar, Richard Cochran, Christopher S Hall
On Wed, Oct 9, 2024 at 1:29 AM Anna-Maria Behnsen
<anna-maria@linutronix.de> wrote:
>
> From: Anna-Maria Behnsen <anna-maria@linutronix.de>
>
> All call sites of using TK_MIRROR flag in timekeeping_update() are
> gone. The TK_MIRROR dependent code path is therefore dead code.
>
> Remove it along with the TK_MIRROR define.
>
> Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
Ah, apologies, this makes the earlier change in 14/25 much more
obviously good. Sorry for not seeing the big picture until I got here.
I withdraw any objection there
For both, now:
Acked-by: John Stultz <jstultz@google.com>
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH v2 25/25] timekeeping: Merge timekeeping_update_staged() and timekeeping_update()
[not found] ` <20241009-devel-anna-maria-b4-timers-ptp-timekeeping-v2-25-554456a44a15@linutronix.de>
@ 2024-10-24 22:41 ` John Stultz
0 siblings, 0 replies; 23+ messages in thread
From: John Stultz @ 2024-10-24 22:41 UTC (permalink / raw)
To: Anna-Maria Behnsen
Cc: Frederic Weisbecker, Thomas Gleixner, linux-kernel, netdev,
Miroslav Lichvar, Richard Cochran, Christopher S Hall
On Wed, Oct 9, 2024 at 1:29 AM Anna-Maria Behnsen
<anna-maria@linutronix.de> wrote:
>
> From: Anna-Maria Behnsen <anna-maria@linutronix.de>
>
> timekeeping_update_staged() is the only call site of timekeeping_update().
>
> Merge those functions. No functional change.
>
> Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
> ---
> kernel/time/timekeeping.c | 31 ++++++++++++++-----------------
> 1 file changed, 14 insertions(+), 17 deletions(-)
>
> diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
> index 67d7be2e02fb..d07eb1946ff1 100644
> --- a/kernel/time/timekeeping.c
> +++ b/kernel/time/timekeeping.c
> @@ -529,7 +529,7 @@ EXPORT_SYMBOL_GPL(ktime_get_raw_fast_ns);
> * timekeeping_inject_sleeptime64()
> * __timekeeping_inject_sleeptime(tk, delta);
> * timestamp();
> - * timekeeping_update(tkd, tk, TK_CLEAR_NTP...);
> + * timekeeping_update_staged(tkd, TK_CLEAR_NTP...);
> *
> * (2) On 32-bit systems, the 64-bit boot offset (tk->offs_boot) may be
> * partially updated. Since the tk->offs_boot update is a rare event, this
> @@ -775,10 +775,21 @@ static void timekeeping_restore_shadow(struct tk_data *tkd)
> memcpy(&tkd->shadow_timekeeper, &tkd->timekeeper, sizeof(tkd->timekeeper));
> }
>
> -static void timekeeping_update(struct tk_data *tkd, struct timekeeper *tk, unsigned int action)
> +static void timekeeping_update_staged(struct tk_data *tkd, unsigned int action)
I still think timekeeping_update_from_shadow would be a better name.
> {
> + struct timekeeper *tk = &tk_core.shadow_timekeeper;
> +
Same naming nit, as its easy to mix up what tk is modifying.
Other than those,
Acked-by: John Stultz <jstultz@google.com>
Thanks so much for these really nice cleanups! They are a great
improvement to the code!
And again, my apologies for jumping off to other things midway through
reviewing these and not getting back to them until now.
thanks
-john
^ permalink raw reply [flat|nested] 23+ messages in thread
end of thread, other threads:[~2024-10-24 22:41 UTC | newest]
Thread overview: 23+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <20241009-devel-anna-maria-b4-timers-ptp-timekeeping-v2-0-554456a44a15@linutronix.de>
[not found] ` <20241009-devel-anna-maria-b4-timers-ptp-timekeeping-v2-6-554456a44a15@linutronix.de>
2024-10-11 3:22 ` [PATCH v2 06/25] timekeeping: Reorder struct timekeeper John Stultz
2024-10-14 9:30 ` Anna-Maria Behnsen
2024-10-15 10:08 ` [PATCH v2a] " Anna-Maria Behnsen
[not found] ` <20241009-devel-anna-maria-b4-timers-ptp-timekeeping-v2-7-554456a44a15@linutronix.de>
2024-10-24 21:11 ` [PATCH v2 07/25] timekeeping: Move shadow_timekeeper into tk_core John Stultz
[not found] ` <20241009-devel-anna-maria-b4-timers-ptp-timekeeping-v2-8-554456a44a15@linutronix.de>
2024-10-24 21:19 ` [PATCH v2 08/25] timekeeping: Encapsulate locking/unlocking of timekeeper_lock John Stultz
[not found] ` <20241009-devel-anna-maria-b4-timers-ptp-timekeeping-v2-9-554456a44a15@linutronix.de>
2024-10-24 21:20 ` [PATCH v2 09/25] timekeeping: Move timekeeper_lock into tk_core John Stultz
[not found] ` <20241009-devel-anna-maria-b4-timers-ptp-timekeeping-v2-10-554456a44a15@linutronix.de>
2024-10-24 21:21 ` [PATCH v2 10/25] timekeeping: Define a struct type for tk_core to make it reusable John Stultz
[not found] ` <20241009-devel-anna-maria-b4-timers-ptp-timekeeping-v2-11-554456a44a15@linutronix.de>
2024-10-24 21:25 ` [PATCH v2 11/25] timekeeping: Introduce tkd_basic_setup() to make lock and seqcount init reusable John Stultz
[not found] ` <20241009-devel-anna-maria-b4-timers-ptp-timekeeping-v2-12-554456a44a15@linutronix.de>
2024-10-24 21:29 ` [PATCH v2 12/25] timekeeping: Add struct tk_data as argument to timekeeping_update() John Stultz
[not found] ` <20241009-devel-anna-maria-b4-timers-ptp-timekeeping-v2-15-554456a44a15@linutronix.de>
2024-10-24 21:45 ` [PATCH v2 15/25] timekeeping: Provide timekeeping_restore_shadow() John Stultz
[not found] ` <20241009-devel-anna-maria-b4-timers-ptp-timekeeping-v2-13-554456a44a15@linutronix.de>
2024-10-24 21:43 ` [PATCH v2 13/25] timekeeping: Split out timekeeper update of timekeeping_advanced() John Stultz
2024-10-24 21:53 ` John Stultz
[not found] ` <20241009-devel-anna-maria-b4-timers-ptp-timekeeping-v2-16-554456a44a15@linutronix.de>
2024-10-24 21:54 ` [PATCH v2 16/25] timekeeping: Rework do_settimeofday64() to use shadow_timekeeper John Stultz
[not found] ` <20241009-devel-anna-maria-b4-timers-ptp-timekeeping-v2-17-554456a44a15@linutronix.de>
2024-10-24 21:57 ` [PATCH v2 17/25] timekeeping: Rework timekeeping_inject_offset() " John Stultz
[not found] ` <20241009-devel-anna-maria-b4-timers-ptp-timekeeping-v2-18-554456a44a15@linutronix.de>
2024-10-24 21:58 ` [PATCH v2 18/25] timekeeping: Rework change_clocksource() " John Stultz
[not found] ` <20241009-devel-anna-maria-b4-timers-ptp-timekeeping-v2-14-554456a44a15@linutronix.de>
2024-10-24 22:12 ` [PATCH v2 14/25] timekeeping: Introduce combined timekeeping action flag John Stultz
[not found] ` <20241009-devel-anna-maria-b4-timers-ptp-timekeeping-v2-19-554456a44a15@linutronix.de>
2024-10-24 22:16 ` [PATCH v2 19/25] timekeeping: Rework timekeeping_init() to use shadow_timekeeper John Stultz
[not found] ` <20241009-devel-anna-maria-b4-timers-ptp-timekeeping-v2-20-554456a44a15@linutronix.de>
2024-10-24 22:16 ` [PATCH v2 20/25] timekeeping: Rework timekeeping_inject_sleeptime64() " John Stultz
[not found] ` <20241009-devel-anna-maria-b4-timers-ptp-timekeeping-v2-21-554456a44a15@linutronix.de>
2024-10-24 22:18 ` [PATCH v2 21/25] timekeeping: Rework timekeeping_resume() " John Stultz
[not found] ` <20241009-devel-anna-maria-b4-timers-ptp-timekeeping-v2-22-554456a44a15@linutronix.de>
2024-10-24 22:21 ` [PATCH v2 22/25] timekeeping: Rework timekeeping_suspend() " John Stultz
[not found] ` <20241009-devel-anna-maria-b4-timers-ptp-timekeeping-v2-23-554456a44a15@linutronix.de>
2024-10-24 22:26 ` [PATCH v2 23/25] timekeeping: Rework do_adjtimex() " John Stultz
[not found] ` <20241009-devel-anna-maria-b4-timers-ptp-timekeeping-v2-24-554456a44a15@linutronix.de>
2024-10-24 22:29 ` [PATCH v2 24/25] timekeeping: Remove TK_MIRROR timekeeping_update() action John Stultz
[not found] ` <20241009-devel-anna-maria-b4-timers-ptp-timekeeping-v2-25-554456a44a15@linutronix.de>
2024-10-24 22:41 ` [PATCH v2 25/25] timekeeping: Merge timekeeping_update_staged() and timekeeping_update() John Stultz
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox