From: Jeff Layton <jlayton@kernel.org>
To: John Stultz <jstultz@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>,
Stephen Boyd <sboyd@kernel.org>,
Alexander Viro <viro@zeniv.linux.org.uk>,
Christian Brauner <brauner@kernel.org>, Jan Kara <jack@suse.cz>,
Steven Rostedt <rostedt@goodmis.org>,
Masami Hiramatsu <mhiramat@kernel.org>,
Mathieu Desnoyers <mathieu.desnoyers@efficios.com>,
Jonathan Corbet <corbet@lwn.net>,
Chandan Babu R <chandan.babu@oracle.com>,
"Darrick J. Wong" <djwong@kernel.org>,
Theodore Ts'o <tytso@mit.edu>,
Andreas Dilger <adilger.kernel@dilger.ca>,
Chris Mason <clm@fb.com>, Josef Bacik <josef@toxicpanda.com>,
David Sterba <dsterba@suse.com>, Hugh Dickins <hughd@google.com>,
Andrew Morton <akpm@linux-foundation.org>,
Chuck Lever <chuck.lever@oracle.com>,
Vadim Fedorenko <vadim.fedorenko@linux.dev>,
Randy Dunlap <rdunlap@infradead.org>,
linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
linux-trace-kernel@vger.kernel.org, linux-doc@vger.kernel.org,
linux-xfs@vger.kernel.org, linux-ext4@vger.kernel.org,
linux-btrfs@vger.kernel.org, linux-nfs@vger.kernel.org,
linux-mm@kvack.org
Subject: Re: [PATCH v8 01/11] timekeeping: move multigrain timestamp floor handling into timekeeper
Date: Sat, 14 Sep 2024 19:14:06 -0400 [thread overview]
Message-ID: <a273a8e6bb6785c5666bd2b8b83e3a437ebdeae4.camel@kernel.org> (raw)
In-Reply-To: <CANDhNCpaySH5HmkEb9BS738Fo+Kk=6s0_zNwB=uYtOQ63uc6xw@mail.gmail.com>
On Sat, 2024-09-14 at 13:10 -0700, John Stultz wrote:
> On Sat, Sep 14, 2024 at 10:07 AM Jeff Layton <jlayton@kernel.org> wrote:
> >
> > For multigrain timestamps, we must keep track of the latest timestamp
> > that has ever been handed out, and never hand out a coarse time below
> > that value.
> >
> > Add a static singleton atomic64_t into timekeeper.c that we can use to
> > keep track of the latest fine-grained time ever handed out. This is
> > tracked as a monotonic ktime_t value to ensure that it isn't affected by
> > clock jumps.
> >
> > Add two new public interfaces:
> >
> > - ktime_get_coarse_real_ts64_mg() fills a timespec64 with the later of the
> > coarse-grained clock and the floor time
> >
> > - ktime_get_real_ts64_mg() gets the fine-grained clock value, and tries
> > to swap it into the floor. A timespec64 is filled with the result.
> >
> > Since the floor is global, we take great pains to avoid updating it
> > unless it's absolutely necessary. If we do the cmpxchg and find that the
> > value has been updated since we fetched it, then we discard the
> > fine-grained time that was fetched in favor of the recent update.
> >
> > To maximize the window of this occurring when multiple tasks are racing
> > to update the floor, ktime_get_coarse_real_ts64_mg returns a cookie
> > value that represents the state of the floor tracking word, and
> > ktime_get_real_ts64_mg accepts a cookie value that it uses as the "old"
> > value when calling cmpxchg().
>
> This last bit seems out of date.
>
Thanks. Dropped the last paragraph.
> > ---
> > include/linux/timekeeping.h | 4 +++
> > kernel/time/timekeeping.c | 82 +++++++++++++++++++++++++++++++++++++++++++++
> > 2 files changed, 86 insertions(+)
> >
> > diff --git a/include/linux/timekeeping.h b/include/linux/timekeeping.h
> > index fc12a9ba2c88..7aa85246c183 100644
> > --- a/include/linux/timekeeping.h
> > +++ b/include/linux/timekeeping.h
> > @@ -45,6 +45,10 @@ extern void ktime_get_real_ts64(struct timespec64 *tv);
> > extern void ktime_get_coarse_ts64(struct timespec64 *ts);
> > extern void ktime_get_coarse_real_ts64(struct timespec64 *ts);
> >
> > +/* Multigrain timestamp interfaces */
> > +extern void ktime_get_coarse_real_ts64_mg(struct timespec64 *ts);
> > +extern void ktime_get_real_ts64_mg(struct timespec64 *ts);
> > +
> > void getboottime64(struct timespec64 *ts);
> >
> > /*
> > diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
> > index 5391e4167d60..16937242b904 100644
> > --- a/kernel/time/timekeeping.c
> > +++ b/kernel/time/timekeeping.c
> > @@ -114,6 +114,13 @@ static struct tk_fast tk_fast_raw ____cacheline_aligned = {
> > .base[1] = FAST_TK_INIT,
> > };
> >
> > +/*
> > + * This represents the latest fine-grained time that we have handed out as a
> > + * timestamp on the system. Tracked as a monotonic ktime_t, and converted to the
> > + * realtime clock on an as-needed basis.
> > + */
> > +static __cacheline_aligned_in_smp atomic64_t mg_floor;
> > +
> > static inline void tk_normalize_xtime(struct timekeeper *tk)
> > {
> > while (tk->tkr_mono.xtime_nsec >= ((u64)NSEC_PER_SEC << tk->tkr_mono.shift)) {
> > @@ -2394,6 +2401,81 @@ void ktime_get_coarse_real_ts64(struct timespec64 *ts)
> > }
> > EXPORT_SYMBOL(ktime_get_coarse_real_ts64);
> >
> > +/**
> > + * ktime_get_coarse_real_ts64_mg - get later of coarse grained time or floor
> > + * @ts: timespec64 to be filled
> > + *
> > + * Adjust floor to realtime and compare it to the coarse time. Fill
> > + * @ts with the latest one. Note that this is a filesystem-specific
> > + * interface and should be avoided outside of that context.
> > + */
> > +void ktime_get_coarse_real_ts64_mg(struct timespec64 *ts)
> > +{
> > + struct timekeeper *tk = &tk_core.timekeeper;
> > + u64 floor = atomic64_read(&mg_floor);
> > + ktime_t f_real, offset, coarse;
> > + unsigned int seq;
> > +
> > + WARN_ON(timekeeping_suspended);
> > +
> > + do {
> > + seq = read_seqcount_begin(&tk_core.seq);
> > + *ts = tk_xtime(tk);
> > + offset = *offsets[TK_OFFS_REAL];
> > + } while (read_seqcount_retry(&tk_core.seq, seq));
> > +
> > + coarse = timespec64_to_ktime(*ts);
> > + f_real = ktime_add(floor, offset);
> > + if (ktime_after(f_real, coarse))
> > + *ts = ktime_to_timespec64(f_real);
> > +}
> > +EXPORT_SYMBOL_GPL(ktime_get_coarse_real_ts64_mg);
> > +
> > +/**
> > + * ktime_get_real_ts64_mg - attempt to update floor value and return result
> > + * @ts: pointer to the timespec to be set
> > + *
> > + * Get a current monotonic fine-grained time value and attempt to swap
> > + * it into the floor. @ts will be filled with the resulting floor value,
> > + * regardless of the outcome of the swap. Note that this is a filesystem
> > + * specific interface and should be avoided outside of that context.
> > + */
> > +void ktime_get_real_ts64_mg(struct timespec64 *ts, u64 cookie)
>
> Still passing a cookie. It doesn't match the header definition, so I'm
> surprised this builds.
>
Yeah, I didn't see a warning when I built it. Luckily the extra
parameter is ignored anyway, so it no harm. I've fixed that up in my
tree.
> > +{
> > + struct timekeeper *tk = &tk_core.timekeeper;
> > + ktime_t old = atomic64_read(&mg_floor);
> > + ktime_t offset, mono;
> > + unsigned int seq;
> > + u64 nsecs;
> > +
> > + WARN_ON(timekeeping_suspended);
> > +
> > + do {
> > + seq = read_seqcount_begin(&tk_core.seq);
> > +
> > + ts->tv_sec = tk->xtime_sec;
> > + mono = tk->tkr_mono.base;
> > + nsecs = timekeeping_get_ns(&tk->tkr_mono);
> > + offset = *offsets[TK_OFFS_REAL];
> > + } while (read_seqcount_retry(&tk_core.seq, seq));
> > +
> > + mono = ktime_add_ns(mono, nsecs);
> > +
> > + if (atomic64_try_cmpxchg(&mg_floor, &old, mono)) {
> > + ts->tv_nsec = 0;
> > + timespec64_add_ns(ts, nsecs);
> > + } else {
> > + /*
> > + * Something has changed mg_floor since "old" was
> > + * fetched. "old" has now been updated with the
> > + * current value of mg_floor, so use that to return
> > + * the current coarse floor value.
> > + */
> > + *ts = ktime_to_timespec64(ktime_add(old, offset));
> > + }
> > +}
> > +EXPORT_SYMBOL_GPL(ktime_get_real_ts64_mg);
>
> Other than those issues, I'm ok with it. Thanks again for working
> through my concerns!
>
> Since I'm traveling for LPC soon, to save the next cycle, once the
> fixes above are sorted:
> Acked-by: John Stultz <jstultz@google.com>
>
Thanks for the review!
--
Jeff Layton <jlayton@kernel.org>
next prev parent reply other threads:[~2024-09-14 23:14 UTC|newest]
Thread overview: 36+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-09-14 17:07 [PATCH v8 00/11] fs: multigrain timestamp redux Jeff Layton
2024-09-14 17:07 ` [PATCH v8 01/11] timekeeping: move multigrain timestamp floor handling into timekeeper Jeff Layton
2024-09-14 20:10 ` John Stultz
2024-09-14 23:14 ` Jeff Layton [this message]
2024-09-16 10:12 ` Thomas Gleixner
2024-09-16 10:32 ` Thomas Gleixner
2024-09-16 10:57 ` Jeff Layton
2024-09-30 19:16 ` Thomas Gleixner
2024-09-30 19:37 ` Jeff Layton
2024-09-30 20:19 ` Thomas Gleixner
2024-09-30 20:53 ` Jeff Layton
2024-09-30 21:35 ` Thomas Gleixner
2024-10-01 9:45 ` Jeff Layton
2024-10-01 12:45 ` Thomas Gleixner
2024-10-02 12:41 ` Jeff Layton
2024-09-19 16:50 ` Jeff Layton
2024-09-30 19:43 ` Thomas Gleixner
2024-09-30 20:12 ` Jeff Layton
2024-09-30 19:13 ` Thomas Gleixner
2024-09-30 19:27 ` Jeff Layton
2024-09-30 20:15 ` Thomas Gleixner
2024-09-14 17:07 ` [PATCH v8 02/11] fs: add infrastructure for multigrain timestamps Jeff Layton
2024-09-14 17:07 ` [PATCH v8 03/11] fs: have setattr_copy handle multigrain timestamps appropriately Jeff Layton
2024-09-14 17:07 ` [PATCH v8 04/11] fs: handle delegated timestamps in setattr_copy_mgtime Jeff Layton
2024-09-14 17:07 ` [PATCH v8 05/11] fs: tracepoints around multigrain timestamp events Jeff Layton
2024-09-15 8:21 ` Steven Rostedt
2024-09-14 17:07 ` [PATCH v8 06/11] fs: add percpu counters for significant " Jeff Layton
2024-09-16 10:20 ` Thomas Gleixner
2024-09-14 17:07 ` [PATCH v8 07/11] Documentation: add a new file documenting multigrain timestamps Jeff Layton
2024-09-16 1:01 ` Bagas Sanjaya
2024-09-19 16:53 ` Jeff Layton
2024-09-14 17:07 ` [PATCH v8 08/11] xfs: switch to " Jeff Layton
2024-09-14 17:07 ` [PATCH v8 09/11] ext4: " Jeff Layton
2024-09-14 17:07 ` [PATCH v8 10/11] btrfs: convert " Jeff Layton
2024-09-14 17:07 ` [PATCH v8 11/11] tmpfs: add support for " Jeff Layton
2024-09-26 16:59 ` [PATCH v8 00/11] fs: multigrain timestamp redux Randy Dunlap
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=a273a8e6bb6785c5666bd2b8b83e3a437ebdeae4.camel@kernel.org \
--to=jlayton@kernel.org \
--cc=adilger.kernel@dilger.ca \
--cc=akpm@linux-foundation.org \
--cc=brauner@kernel.org \
--cc=chandan.babu@oracle.com \
--cc=chuck.lever@oracle.com \
--cc=clm@fb.com \
--cc=corbet@lwn.net \
--cc=djwong@kernel.org \
--cc=dsterba@suse.com \
--cc=hughd@google.com \
--cc=jack@suse.cz \
--cc=josef@toxicpanda.com \
--cc=jstultz@google.com \
--cc=linux-btrfs@vger.kernel.org \
--cc=linux-doc@vger.kernel.org \
--cc=linux-ext4@vger.kernel.org \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=linux-nfs@vger.kernel.org \
--cc=linux-trace-kernel@vger.kernel.org \
--cc=linux-xfs@vger.kernel.org \
--cc=mathieu.desnoyers@efficios.com \
--cc=mhiramat@kernel.org \
--cc=rdunlap@infradead.org \
--cc=rostedt@goodmis.org \
--cc=sboyd@kernel.org \
--cc=tglx@linutronix.de \
--cc=tytso@mit.edu \
--cc=vadim.fedorenko@linux.dev \
--cc=viro@zeniv.linux.org.uk \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).