From: Thomas Gleixner <tglx@kernel.org>
To: David Woodhouse <dwmw2@infradead.org>,
Miroslav Lichvar <mlichvar@redhat.com>
Cc: "Richard Cochran" <richardcochran@gmail.com>,
"Wen Gu" <guwen@linux.alibaba.com>,
"Andrew Lunn" <andrew+netdev@lunn.ch>,
"David S. Miller" <davem@davemloft.net>,
"Eric Dumazet" <edumazet@google.com>,
"Jakub Kicinski" <kuba@kernel.org>,
"Paolo Abeni" <pabeni@redhat.com>,
"John Stultz" <jstultz@google.com>,
"Stephen Boyd" <sboyd@kernel.org>,
"Anna-Maria Behnsen" <anna-maria@linutronix.de>,
"Frederic Weisbecker" <frederic@kernel.org>,
"Shuah Khan" <shuah@kernel.org>,
"Peter Zijlstra" <peterz@infradead.org>,
"Thomas Weißschuh" <thomas.weissschuh@linutronix.de>,
"Arnd Bergmann" <arnd@arndb.de>,
"Julien Ridoux" <ridouxj@amazon.com>,
"Ryan Luu" <rluu@amazon.com>,
linux-kernel@vger.kernel.org,
"Marcelo Tosatti" <mtosatti@redhat.com>
Subject: Re: [RFC PATCH v2 0/8] timekeeping: Fix draft tracking precision and add feed-forward discipline via vmclock
Date: Sun, 24 May 2026 14:36:35 +0200 [thread overview]
Message-ID: <87se7ht25o.ffs@tglx> (raw)
In-Reply-To: <b5faa559e4282a3698021068e9d5f502540e3624.camel@infradead.org>
On Fri, May 22 2026 at 17:23, David Woodhouse wrote:
> On Fri, 2026-05-22 at 17:28 +0200, Thomas Gleixner wrote:
>> This raises an interesting question. Must any of the existing PTM using
>> drivers mplement that new extended getcrosstimestampattr() callback, in
>> order to expose the cycles/csid in attr or can you fallback to the
>> existing callback and have the rest of the fields 0?
>>
>> Same question arises if you change the pre/post timestamp helpers to
>> utilize ktime_get_snapshot_id(). All existing drivers which use them
>> will then automatically retrieve cs_cycles/cs_id.
>
> Taking those in reverse order... yes, this means that with a new
> variant of PTP_SYS_OFFSET_EXTENDED, userspace can see actual counter
> values even for the system parts of those ABA timestamps, even for non-
> PTM clocks, and discipline the TSC/archcounter against the external
> clock.
Correct.
> Currently I have userspace which literally does rdtsc() either side of
> calling the ioctl :)
Why am I not surprised? :)
> And PTM devices can be used with PTP_SYS_OFFSET_PRECISE, which goes
> through get_device_system_crosststamp() as described, and all just
> works? It's just that we now allow userspace to *see* the counter value
> that the driver was already generating.
A new variant of PRECISE
> So to your questions: although there's new userspace ioctl support, the
> *drivers* don't need any modification for that (as long as they use the
> standard prets/postts helpers).
Yes.
> The remaining question is the device timestamp part (the 'B' in the ABA
> sandwich) for PTP_SYS_OFFSET_EXTENDED with PTM-capable drivers. Should
> that get a counterval?
PTM-capable driver support cross timestamps, which will with a new
version of PTP_SYS_OFFSET_PRECISE expose the system counterval. No ABA
for that as it's hardware latched AB.
> I don't have a strong opinion. On one hand we'd have to find a way to
> convert it from PTM for devices where it actually *is* PTM, and that's
> what PTP_SYS_OFFSET_PRECISE is *for*.
Correct.
> But on the other hand, can't the conversion be a whole lot simpler than
> get_device_system_crosststamp() because it's not actually dealing with
> any timekeepers; it's basically only invoking convert_base_to_cs()?
But what for? If you have PTM, use PRECISE. There is _zero_ value of
having pre/post timestamps when the hardware already does the correlated
precise sampling, no?
> And the ioctl should support it *all* but just have a clear way of
> indicating that any of the optional fields including the attrs are
> *not* populated (or use 0/max values maybe?).
Yes.
> So no, I don't think any driver *has* to add any attr support in order
> to expose counter values to userspace. The only reason I asked Arthur
> to mix those things up was for the *userspace* API, to avoid adding yet
> another ioctl over and over again. And now I feel bad for doing so :)
I think you can create _one_ data structure variant, which fits both
EXTENDED_ATTR and PRECISE_ATTR:
struct attrs {
u32 valid;
u32 error_bound;
....
u32 reserved[N];
};
@valid tells user space, which of the attributes has been filled in by
the driver. That avoids bounds based validity checks, which are a pain
as you might end up with different bounds for every attribute. Having a
valid flags field avoids that completely.
struct devtime {
ptp_clock_time device_time;
struct attrs attrs;
};
struct systime {
u64 sys_systime;
u64 sys_rawtime;
u64 sys_counter;
u32 sys_counter_id;
u32 reserved;
};
Exposing @sys_counter_id requires to expose CSID_* in the user space ABI
reliably, as otherwise a kernel internal CSID enum change would blow up
the user space guess work. Your ptp_counter_id approach is error prone.
struct timestamp {
union {
struct systime systime;
struct systime pre_systime;
};
struct devtime devtime;
struct systime post_systime;
};
struct request {
u32 valid;
clockid_t clock_id;
unsigned int num_samples;
u32 reserved[N];
};
I rather have @valid here too. The 'zero the reserved' members approach
is a pain as new kernels have to map 0 to default behavior instead of
being free to make 0 mean what they intend. @valid allows you to use
other sizes than u32 for future fields. All you have to take care of is
to keep the existing fields at the same place as before.
struct ioctl_data {
struct request request;
struct timestamp timestamps[];
};
So for both PTP_SYS_OFFSET_EXTENDED_ATTRS and
PTP_SYS_OFFSET_PRECISE_ATTRS user space allocates enough space to
accomodate data::request::num_samples.
For PTP_SYS_OFFSET_PRECISE_ATTRS num_samples has to be 1 and
data::timestamps[0].post_systime is zeroed by the kernel because it has
no meaning.
So now in the kernel you do:
ptp_sys_offset_extended_attrs(struct ptp_clock *ptp, void __user *argptr)
{
struct ioctl_data __user *data = argptr;
struct request;
if (copy_from_user(&request, &data->request, sizeof(request)))
return -EFAULT;
if (!extattr_request_valid(request))
return -EINVAL;
for (unsigned int i; i < request.num_samples; i++) {
struct ptp_system_timestamp sts = { .clock_id = request.clock_id, };
struct timestamp uts = { };
struct timespec64 devts;
if (ptp->info->gettimex64_attr)
ret = ptp->info->gettimex64_attr(ptp->info, &dev_ts, &sts, &uts.attr);
else if (ptp->info->gettimex64)
ret = ptp->info->gettimex64(ptp->info, &dev_ts, &sts);
else
return -ENOTSUPP;
if (ret)
return ret;
uts.pre_systime = mangle(sts.pre_systime);
uts.devtime.device_time = mangle(dev_ts);
uts.post_systime = mangle(sts.post_systime);
if (!copy_to_user(&data->timestamps[i], uts, sizeof(uts)))
return -EFAULT;
}
return 0;
}
ptp_sys_offset_precise_attrs(struct ptp_clock *ptp, void __user *argptr)
{
struct ioctl_data __user *data = argptr;
struct request;
if (copy_from_user(&request, &data->request, sizeof(request)))
return -EFAULT;
if (!preciseattr_request_valid(request))
return -EINVAL;
struct system_device_crosststamp xtstamp = { .clock_id = request.clock_id, };
struct timestamp uts = { };
if (ptp->info->getcrosststamp_attr)
ret = ptp->info->getcrosststamp_attr(ptp->info, &xtstamp, &uts.attr);
else if (ptp->info->getcrosststamp)
ret = ptp->info->getcrosststamp(ptp->info, &xtstamp);
else
return -ENOTSUPP;
if (ret)
return ret;
uts.systime = mangle(xtstamp.systime);
uts.devtime.device_time = mangle(xtstamp.device);
if (!copy_to_user(&data->timestamps[0], uts, sizeof(uts)))
return -EFAULT;
return 0;
}
Or something like this, which immediately enables the functionality for
all drivers which implement the getcrosststamp() or the gettimex64()
callbacks with a unified user space data structure.
The attributes.valid bits are all zero and and once drivers implement
the _attr callback variants, those attributes supported by the driver
will magically appear with the corresponding valid bits set.
Hmm?
Thanks,
tglx
next prev parent reply other threads:[~2026-05-24 12:36 UTC|newest]
Thread overview: 50+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-05-17 21:25 [RFC PATCH v2 0/8] timekeeping: Fix draft tracking precision and add feed-forward discipline via vmclock David Woodhouse
2026-05-17 21:25 ` [RFC PATCH v2 1/8] timekeeping: Remove xtime_remainder from ntp_error accumulation David Woodhouse
2026-05-17 21:25 ` [RFC PATCH v2 2/8] timekeeping: Account for clawback adjustment in ntp_error David Woodhouse
2026-05-19 1:59 ` John Stultz
2026-05-19 10:04 ` David Woodhouse
2026-05-19 19:28 ` John Stultz
2026-05-20 10:47 ` Miroslav Lichvar
2026-05-20 12:37 ` David Woodhouse
2026-05-17 21:25 ` [RFC PATCH v2 3/8] timekeeping: Clamp time_offset delta to prevent infinite tail David Woodhouse
2026-05-19 13:25 ` Miroslav Lichvar
2026-05-19 13:31 ` David Woodhouse
2026-05-19 14:17 ` Miroslav Lichvar
2026-05-19 15:06 ` David Woodhouse
2026-05-17 21:25 ` [RFC PATCH v2 4/8] timekeeping: Add absolute reference for feed-forward clock discipline David Woodhouse
2026-05-19 2:09 ` John Stultz
2026-05-19 11:07 ` David Woodhouse
2026-05-17 21:25 ` [RFC PATCH v2 5/8] ptp_vmclock: Feed reference to timekeeping for feed-forward discipline David Woodhouse
2026-05-17 21:25 ` [RFC PATCH v2 6/8] timekeeping: Guard against divide-by-zero in timekeeping_adjust David Woodhouse
2026-05-17 21:25 ` [RFC PATCH v2 7/8] timekeeping: Drive time_offset skew via per-tick ntp_error transfer David Woodhouse
2026-05-17 21:25 ` [RFC PATCH v2 8/8] WIP: kernel/time: Add /dev/vmclock_host miscdev David Woodhouse
2026-05-19 13:16 ` [RFC PATCH v2 0/8] timekeeping: Fix draft tracking precision and add feed-forward discipline via vmclock Miroslav Lichvar
2026-05-19 15:50 ` David Woodhouse
2026-05-20 10:39 ` Miroslav Lichvar
2026-05-20 12:21 ` David Woodhouse
2026-05-21 6:35 ` Miroslav Lichvar
2026-05-21 9:54 ` David Woodhouse
2026-05-25 8:08 ` Miroslav Lichvar
2026-05-25 9:14 ` David Woodhouse
2026-05-26 7:10 ` Miroslav Lichvar
2026-05-26 10:00 ` David Woodhouse
2026-05-27 7:46 ` Miroslav Lichvar
2026-05-27 12:28 ` David Woodhouse
2026-05-21 18:30 ` Thomas Gleixner
2026-05-21 21:06 ` David Woodhouse
2026-05-22 8:02 ` Thomas Gleixner
2026-05-22 10:01 ` David Woodhouse
2026-05-22 15:28 ` Thomas Gleixner
2026-05-22 16:23 ` David Woodhouse
2026-05-24 12:36 ` Thomas Gleixner [this message]
2026-05-24 13:13 ` David Woodhouse
2026-05-24 15:05 ` Thomas Gleixner
2026-05-25 8:06 ` Arthur Kiyanovski
2026-05-25 8:41 ` David Woodhouse
2026-05-26 14:12 ` Thomas Gleixner
2026-05-22 16:50 ` David Woodhouse
2026-05-24 15:15 ` Thomas Gleixner
2026-05-24 15:37 ` Thomas Gleixner
2026-05-24 15:48 ` Thomas Gleixner
2026-05-24 16:36 ` Thomas Gleixner
2026-05-24 16:42 ` David Woodhouse
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=87se7ht25o.ffs@tglx \
--to=tglx@kernel.org \
--cc=andrew+netdev@lunn.ch \
--cc=anna-maria@linutronix.de \
--cc=arnd@arndb.de \
--cc=davem@davemloft.net \
--cc=dwmw2@infradead.org \
--cc=edumazet@google.com \
--cc=frederic@kernel.org \
--cc=guwen@linux.alibaba.com \
--cc=jstultz@google.com \
--cc=kuba@kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=mlichvar@redhat.com \
--cc=mtosatti@redhat.com \
--cc=pabeni@redhat.com \
--cc=peterz@infradead.org \
--cc=richardcochran@gmail.com \
--cc=ridouxj@amazon.com \
--cc=rluu@amazon.com \
--cc=sboyd@kernel.org \
--cc=shuah@kernel.org \
--cc=thomas.weissschuh@linutronix.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.