From: Thomas Gleixner <tglx@linutronix.de>
To: kernel test robot <oliver.sang@intel.com>,
Anna-Maria Behnsen <anna-maria@linutronix.de>
Cc: oe-lkp@lists.linux.dev, lkp@intel.com,
linux-kernel@vger.kernel.org, x86@kernel.org,
John Stultz <jstultz@google.com>,
oliver.sang@intel.com, Marco Elver <elver@google.com>,
Dmitry Vyukov <dvyukov@google.com>,
Frederic Weisbecker <frederic@kernel.org>,
Peter Zijlstra <peterz@infradead.org>,
Stephen Boyd <sboyd@kernel.org>
Subject: Re: [tip:timers/core] [timekeeping] 5aa6c43eca: BUG:KCSAN:data-race_in_timekeeping_debug_get_ns/timekeeping_update_from_shadow
Date: Wed, 30 Oct 2024 09:50:27 +0100 [thread overview]
Message-ID: <877c9qynxo.ffs@tglx> (raw)
In-Reply-To: <202410301316.e51421de-lkp@intel.com>
On Wed, Oct 30 2024 at 13:47, kernel test robot wrote:
> this is another report about BUG:KCSAN, the change does not introduce new KCSAN
> issue, but causes stats changes as below.
>
> [ 70.265411][ C1] BUG: KCSAN: data-race in timekeeping_debug_get_ns / timekeeping_update_from_shadow
> [ 70.265430][ C1]
> [ 70.265433][ C1] write to 0xffffffff8483fef8 of 296 bytes by interrupt on cpu 0:
> [ 70.265440][ C1] timekeeping_update_from_shadow+0x8e/0x140
> [ 70.265452][ C1] timekeeping_advance (kernel/time/timekeeping.c:2394)
> [ 70.265462][ C1] update_wall_time (kernel/time/timekeeping.c:2403)
timekeeping_update_from_shadow() holds the sequence count write.
> [ 70.265642][ C1] timekeeping_debug_get_ns (kernel/time/timekeeping.c:415 kernel/time/timekeeping.c:399 kernel/time/timekeeping.c:307)
> [ 70.265653][ C1] ktime_get (kernel/time/timekeeping.c:431 (discriminator 4) kernel/time/timekeeping.c:897 (discriminator 4))
> [ 70.265660][ C1] tick_nohz_lowres_handler (kernel/time/tick-sched.c:220 kernel/time/tick-sched.c:290 kernel/time/tick-sched.c:1486)
ktime_get()
do {
seq = read_seqcount_begin(&tk_core.seq);
timekeeping_debug_get_ns();
} while (read_seqcount_retry(&tk_core.seq, seq));
So this should be safe against concurreny. I assume the issue here is
that timekeeping_debug_get_ns() has a nested
do {
seq = read_seqcount_begin(&tk_core.seq);
....
} while (read_seqcount_retry(&tk_core.seq, seq));
inside. Which is still correct, but confuses KCSAN. Marco?
But that aside, since 135225a363ae timekeeping_cycles_to_ns() is fully
overflow protected and unconditionally handles negative motion (before
it was x86 only), the value of timekeeping_debug_get_ns() becomes
questionable.
I'm leaning towards removing it completely.
John?
Thanks,
tglx
---
include/linux/timekeeper_internal.h | 16 ------
kernel/time/timekeeping.c | 87 ++----------------------------------
2 files changed, 5 insertions(+), 98 deletions(-)
--- a/include/linux/timekeeper_internal.h
+++ b/include/linux/timekeeper_internal.h
@@ -76,9 +76,6 @@ struct tk_read_base {
* ntp shifted nano seconds.
* @ntp_err_mult: Multiplication factor for scaled math conversion
* @skip_second_overflow: Flag used to avoid updating NTP twice with same second
- * @last_warning: Warning ratelimiter (DEBUG_TIMEKEEPING)
- * @underflow_seen: Underflow warning flag (DEBUG_TIMEKEEPING)
- * @overflow_seen: Overflow warning flag (DEBUG_TIMEKEEPING)
*
* Note: For timespec(64) based interfaces wall_to_monotonic is what
* we need to add to xtime (or xtime corrected for sub jiffy times)
@@ -147,19 +144,6 @@ struct timekeeper {
u32 ntp_error_shift;
u32 ntp_err_mult;
u32 skip_second_overflow;
-
-#ifdef CONFIG_DEBUG_TIMEKEEPING
- long last_warning;
- /*
- * These simple flag variables are managed
- * without locks, which is racy, but they are
- * ok since we don't really care about being
- * super precise about how many events were
- * seen, just that a problem was observed.
- */
- int underflow_seen;
- int overflow_seen;
-#endif
};
#ifdef CONFIG_GENERIC_TIME_VSYSCALL
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -227,8 +227,6 @@ static inline u64 tk_clock_read(const st
}
#ifdef CONFIG_DEBUG_TIMEKEEPING
-#define WARNING_FREQ (HZ*300) /* 5 minute rate-limiting */
-
static void timekeeping_check_update(struct timekeeper *tk, u64 offset)
{
@@ -239,82 +237,15 @@ static void timekeeping_check_update(str
printk_deferred("WARNING: timekeeping: Cycle offset (%lld) is larger than allowed by the '%s' clock's max_cycles value (%lld): time overflow danger\n",
offset, name, max_cycles);
printk_deferred(" timekeeping: Your kernel is sick, but tries to cope by capping time updates\n");
- } else {
- if (offset > (max_cycles >> 1)) {
+ } else if (offset > (max_cycles >> 1)) {
printk_deferred("INFO: timekeeping: Cycle offset (%lld) is larger than the '%s' clock's 50%% safety margin (%lld)\n",
offset, name, max_cycles >> 1);
printk_deferred(" timekeeping: Your kernel is still fine, but is feeling a bit nervous\n");
- }
- }
-
- if (tk->underflow_seen) {
- if (jiffies - tk->last_warning > WARNING_FREQ) {
- printk_deferred("WARNING: Underflow in clocksource '%s' observed, time update ignored.\n", name);
- printk_deferred(" Please report this, consider using a different clocksource, if possible.\n");
- printk_deferred(" Your kernel is probably still fine.\n");
- tk->last_warning = jiffies;
- }
- tk->underflow_seen = 0;
- }
-
- if (tk->overflow_seen) {
- if (jiffies - tk->last_warning > WARNING_FREQ) {
- printk_deferred("WARNING: Overflow in clocksource '%s' observed, time update capped.\n", name);
- printk_deferred(" Please report this, consider using a different clocksource, if possible.\n");
- printk_deferred(" Your kernel is probably still fine.\n");
- tk->last_warning = jiffies;
- }
- tk->overflow_seen = 0;
}
}
-static inline u64 timekeeping_cycles_to_ns(const struct tk_read_base *tkr, u64 cycles);
-
-static inline u64 timekeeping_debug_get_ns(const struct tk_read_base *tkr)
-{
- struct timekeeper *tk = &tk_core.timekeeper;
- u64 now, last, mask, max, delta;
- unsigned int seq;
-
- /*
- * Since we're called holding a seqcount, the data may shift
- * under us while we're doing the calculation. This can cause
- * false positives, since we'd note a problem but throw the
- * results away. So nest another seqcount here to atomically
- * grab the points we are checking with.
- */
- do {
- seq = read_seqcount_begin(&tk_core.seq);
- now = tk_clock_read(tkr);
- last = tkr->cycle_last;
- mask = tkr->mask;
- max = tkr->clock->max_cycles;
- } while (read_seqcount_retry(&tk_core.seq, seq));
-
- delta = clocksource_delta(now, last, mask);
-
- /*
- * Try to catch underflows by checking if we are seeing small
- * mask-relative negative values.
- */
- if (unlikely((~delta & mask) < (mask >> 3)))
- tk->underflow_seen = 1;
-
- /* Check for multiplication overflows */
- if (unlikely(delta > max))
- tk->overflow_seen = 1;
-
- /* timekeeping_cycles_to_ns() handles both under and overflow */
- return timekeeping_cycles_to_ns(tkr, now);
-}
#else
-static inline void timekeeping_check_update(struct timekeeper *tk, u64 offset)
-{
-}
-static inline u64 timekeeping_debug_get_ns(const struct tk_read_base *tkr)
-{
- BUG();
-}
+static inline void timekeeping_check_update(struct timekeeper *tk, u64 offset) { }
#endif
/**
@@ -421,19 +352,11 @@ static inline u64 timekeeping_cycles_to_
return ((delta * tkr->mult) + tkr->xtime_nsec) >> tkr->shift;
}
-static __always_inline u64 __timekeeping_get_ns(const struct tk_read_base *tkr)
+static __always_inline u64 timekeeping_get_ns(const struct tk_read_base *tkr)
{
return timekeeping_cycles_to_ns(tkr, tk_clock_read(tkr));
}
-static inline u64 timekeeping_get_ns(const struct tk_read_base *tkr)
-{
- if (IS_ENABLED(CONFIG_DEBUG_TIMEKEEPING))
- return timekeeping_debug_get_ns(tkr);
-
- return __timekeeping_get_ns(tkr);
-}
-
/**
* update_fast_timekeeper - Update the fast and NMI safe monotonic timekeeper.
* @tkr: Timekeeping readout base from which we take the update
@@ -477,7 +400,7 @@ static __always_inline u64 __ktime_get_f
seq = raw_read_seqcount_latch(&tkf->seq);
tkr = tkf->base + (seq & 0x01);
now = ktime_to_ns(tkr->base);
- now += __timekeeping_get_ns(tkr);
+ now += timekeeping_get_ns(tkr);
} while (raw_read_seqcount_latch_retry(&tkf->seq, seq));
return now;
@@ -593,7 +516,7 @@ static __always_inline u64 __ktime_get_r
tkr = tkf->base + (seq & 0x01);
basem = ktime_to_ns(tkr->base);
baser = ktime_to_ns(tkr->base_real);
- delta = __timekeeping_get_ns(tkr);
+ delta = timekeeping_get_ns(tkr);
} while (raw_read_seqcount_latch_retry(&tkf->seq, seq));
if (mono)
next prev parent reply other threads:[~2024-10-30 8:50 UTC|newest]
Thread overview: 6+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-10-30 5:47 [tip:timers/core] [timekeeping] 5aa6c43eca: BUG:KCSAN:data-race_in_timekeeping_debug_get_ns/timekeeping_update_from_shadow kernel test robot
2024-10-30 8:50 ` Thomas Gleixner [this message]
2024-10-30 9:46 ` Marco Elver
2024-10-30 22:16 ` John Stultz
2024-10-30 23:35 ` Thomas Gleixner
2024-10-31 10:53 ` Thomas Gleixner
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=877c9qynxo.ffs@tglx \
--to=tglx@linutronix.de \
--cc=anna-maria@linutronix.de \
--cc=dvyukov@google.com \
--cc=elver@google.com \
--cc=frederic@kernel.org \
--cc=jstultz@google.com \
--cc=linux-kernel@vger.kernel.org \
--cc=lkp@intel.com \
--cc=oe-lkp@lists.linux.dev \
--cc=oliver.sang@intel.com \
--cc=peterz@infradead.org \
--cc=sboyd@kernel.org \
--cc=x86@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.