* [PATCH] timekeeping: Always initialize use_nsecs when querying time from phc drivers
@ 2025-07-08 16:46 Markus Blöchl
2025-07-08 19:09 ` John Stultz
0 siblings, 1 reply; 5+ messages in thread
From: Markus Blöchl @ 2025-07-08 16:46 UTC (permalink / raw)
To: Thomas Gleixner, Christopher S. Hall, Lakshmi Sowjanya D
Cc: John Stultz, Stephen Boyd, linux-kernel
Most drivers only populate the fields cycles and cs_id in their
get_time_fn() callback for get_device_system_crosststamp() unless
they explicitly provide nanosecond values.
When this new use_nsecs field was added and used most drivers did not
care.
Clock sources other than CSID_GENERIC could then get converted in
convert_base_to_cs() based on an uninitialized use_nsecs which usually
results in -EINVAL during the following range check.
Fixes: 6b2e29977518 ("timekeeping: Provide infrastructure for converting to/from a base clock")
Cc: stable@vger.kernel.org
Signed-off-by: Markus Blöchl <markus.bloechl@ipetronik.com>
---
Notes:
We observed this in the e1000e driver but at least stmmac and
ptp_kvm also seem affected by this.
ice was recently fixed by a5a441ae283d ("ice/ptp: fix crosstimestamp reporting").
kernel/time/timekeeping.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index a009c91f7b05..be0da807329f 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -1269,6 +1269,8 @@ int get_device_system_crosststamp(int (*get_time_fn)
do {
seq = read_seqcount_begin(&tk_core.seq);
+ system_counterval.use_nsecs = false;
+
/*
* Try to synchronously capture device time and a system
* counter value calling back into the device driver
base-commit: d7b8f8e20813f0179d8ef519541a3527e7661d3a
--
2.50.0
Impressum/Imprint: https://www.ipetronik.com/impressum
^ permalink raw reply related [flat|nested] 5+ messages in thread
* Re: [PATCH] timekeeping: Always initialize use_nsecs when querying time from phc drivers
2025-07-08 16:46 [PATCH] timekeeping: Always initialize use_nsecs when querying time from phc drivers Markus Blöchl
@ 2025-07-08 19:09 ` John Stultz
2025-07-09 8:32 ` Markus Blöchl
0 siblings, 1 reply; 5+ messages in thread
From: John Stultz @ 2025-07-08 19:09 UTC (permalink / raw)
To: Markus Blöchl
Cc: Thomas Gleixner, Christopher S. Hall, Lakshmi Sowjanya D,
Stephen Boyd, linux-kernel
On Tue, Jul 8, 2025 at 9:46 AM Markus Blöchl
<markus.bloechl@ipetronik.com> wrote:
>
> Most drivers only populate the fields cycles and cs_id in their
> get_time_fn() callback for get_device_system_crosststamp() unless
> they explicitly provide nanosecond values.
> When this new use_nsecs field was added and used most drivers did not
> care.
> Clock sources other than CSID_GENERIC could then get converted in
> convert_base_to_cs() based on an uninitialized use_nsecs which usually
> results in -EINVAL during the following range check.
>
> Fixes: 6b2e29977518 ("timekeeping: Provide infrastructure for converting to/from a base clock")
> Cc: stable@vger.kernel.org
> Signed-off-by: Markus Blöchl <markus.bloechl@ipetronik.com>
> ---
>
> Notes:
> We observed this in the e1000e driver but at least stmmac and
> ptp_kvm also seem affected by this.
> ice was recently fixed by a5a441ae283d ("ice/ptp: fix crosstimestamp reporting").
>
>
> kernel/time/timekeeping.c | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
> index a009c91f7b05..be0da807329f 100644
> --- a/kernel/time/timekeeping.c
> +++ b/kernel/time/timekeeping.c
> @@ -1269,6 +1269,8 @@ int get_device_system_crosststamp(int (*get_time_fn)
>
> do {
> seq = read_seqcount_begin(&tk_core.seq);
> + system_counterval.use_nsecs = false;
> +
So if the argument is the local system_counterval structure isn't
being fully initialized by the get_time_fn() functions it is passed
to, it seems like it would be better to do so at the top of
get_device_system_crosststamp(), and not inside the seqloop.
But having the responsibility to initialize/fill in the structure
being split across the core and the implementation logic (leaving some
of the fields as optional) feels prone to mistakes, so it makes me
wonder if those drivers implementing the get_time_fn() really ought to
fully fill out the structure, and thus the fix would be better done in
those drivers.
thanks
-john
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH] timekeeping: Always initialize use_nsecs when querying time from phc drivers
2025-07-08 19:09 ` John Stultz
@ 2025-07-09 8:32 ` Markus Blöchl
2025-07-18 20:25 ` Thomas Gleixner
0 siblings, 1 reply; 5+ messages in thread
From: Markus Blöchl @ 2025-07-09 8:32 UTC (permalink / raw)
To: John Stultz
Cc: Thomas Gleixner, Christopher S. Hall, Lakshmi Sowjanya D,
Stephen Boyd, linux-kernel
On Tue, Jul 08, 2025 at 12:09:40PM -0700, John Stultz wrote:
> On Tue, Jul 8, 2025 at 9:46 AM Markus Blöchl
> <markus.bloechl@ipetronik.com> wrote:
> >
> > Most drivers only populate the fields cycles and cs_id in their
> > get_time_fn() callback for get_device_system_crosststamp() unless
> > they explicitly provide nanosecond values.
> > When this new use_nsecs field was added and used most drivers did not
> > care.
> > Clock sources other than CSID_GENERIC could then get converted in
> > convert_base_to_cs() based on an uninitialized use_nsecs which usually
> > results in -EINVAL during the following range check.
> >
> > Fixes: 6b2e29977518 ("timekeeping: Provide infrastructure for converting to/from a base clock")
> > Cc: stable@vger.kernel.org
> > Signed-off-by: Markus Blöchl <markus.bloechl@ipetronik.com>
> > ---
> >
> > Notes:
> > We observed this in the e1000e driver but at least stmmac and
> > ptp_kvm also seem affected by this.
> > ice was recently fixed by a5a441ae283d ("ice/ptp: fix crosstimestamp reporting").
> >
> >
> > kernel/time/timekeeping.c | 2 ++
> > 1 file changed, 2 insertions(+)
> >
> > diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
> > index a009c91f7b05..be0da807329f 100644
> > --- a/kernel/time/timekeeping.c
> > +++ b/kernel/time/timekeeping.c
> > @@ -1269,6 +1269,8 @@ int get_device_system_crosststamp(int (*get_time_fn)
> >
> > do {
> > seq = read_seqcount_begin(&tk_core.seq);
> > + system_counterval.use_nsecs = false;
> > +
>
> So if the argument is the local system_counterval structure isn't
> being fully initialized by the get_time_fn() functions it is passed
> to, it seems like it would be better to do so at the top of
> get_device_system_crosststamp(), and not inside the seqloop.
Probably, I was just afraid of the case where get_time_fn() would take
like *very* different paths during different iterations.
But that seems really unlikely, indeed.
>
> But having the responsibility to initialize/fill in the structure
> being split across the core and the implementation logic (leaving some
> of the fields as optional) feels prone to mistakes, so it makes me
> wonder if those drivers implementing the get_time_fn() really ought to
> fully fill out the structure, and thus the fix would be better done in
> those drivers.
Yes, they should.
I also intend to patch the current drivers to do so but the initial
addition of the use_nsecs field could never have been safe without
some default initialization.
I know that we shouldn't worry too much about out-of-tree drivers,
but the fact that get_device_system_crosststamp() is exported,
the compiler is still perfectly happy *but* it breaks silently and
occasionally bugs me.
So this fix should cover all known and unknown drivers and is easier
to backport into stable.
Meanwhile I am preparing some follow-up patches for net to make the
known drivers fully fill out the structure.
Btw: Do you happen to have any patchwork queue you want those
timekeeping patches to land in?
Thanks for your input
Markus
>
> thanks
> -john
Impressum/Imprint: https://www.ipetronik.com/impressum
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH] timekeeping: Always initialize use_nsecs when querying time from phc drivers
2025-07-09 8:32 ` Markus Blöchl
@ 2025-07-18 20:25 ` Thomas Gleixner
2025-07-20 13:51 ` Markus Blöchl
0 siblings, 1 reply; 5+ messages in thread
From: Thomas Gleixner @ 2025-07-18 20:25 UTC (permalink / raw)
To: Markus Blöchl, John Stultz
Cc: Christopher S. Hall, Lakshmi Sowjanya D, Stephen Boyd,
linux-kernel
On Wed, Jul 09 2025 at 10:32, Markus Blöchl wrote:
> On Tue, Jul 08, 2025 at 12:09:40PM -0700, John Stultz wrote:
>> On Tue, Jul 8, 2025 at 9:46 AM Markus Blöchl
>> <markus.bloechl@ipetronik.com> wrote:
>> >
>> > diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
>> > index a009c91f7b05..be0da807329f 100644
>> > --- a/kernel/time/timekeeping.c
>> > +++ b/kernel/time/timekeeping.c
>> > @@ -1269,6 +1269,8 @@ int get_device_system_crosststamp(int (*get_time_fn)
>> >
>> > do {
>> > seq = read_seqcount_begin(&tk_core.seq);
>> > + system_counterval.use_nsecs = false;
>> > +
>>
>> So if the argument is the local system_counterval structure isn't
>> being fully initialized by the get_time_fn() functions it is passed
>> to, it seems like it would be better to do so at the top of
>> get_device_system_crosststamp(), and not inside the seqloop.
>
> Probably, I was just afraid of the case where get_time_fn() would take
> like *very* different paths during different iterations.
> But that seems really unlikely, indeed.
It's impossible. xtstamp->device and the related get_time_fn() are
immutable during the call.
>> But having the responsibility to initialize/fill in the structure
>> being split across the core and the implementation logic (leaving some
>> of the fields as optional) feels prone to mistakes, so it makes me
>> wonder if those drivers implementing the get_time_fn() really ought to
>> fully fill out the structure, and thus the fix would be better done in
>> those drivers.
>
> Yes, they should.
No, they should not.
The data structure is instantiated in get_device_system_crosststamp()
and then handed in un-initialized to get_time_fn(), which is wrong to
begin with. Why?
That means if the structure is ever expanded, then you'd have to fix up
all of the get_time_fn() implementations.
Seriously?
The obviously correct and future proof thing to do is:
- struct system_counterval_t system_counterval;
+ struct system_counterval_t system_counterval = { };
Which fixes the problem you discovered once and forever, no?
Thanks,
tglx
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH] timekeeping: Always initialize use_nsecs when querying time from phc drivers
2025-07-18 20:25 ` Thomas Gleixner
@ 2025-07-20 13:51 ` Markus Blöchl
0 siblings, 0 replies; 5+ messages in thread
From: Markus Blöchl @ 2025-07-20 13:51 UTC (permalink / raw)
To: Thomas Gleixner
Cc: Markus Blöchl, John Stultz, Christopher S. Hall,
Lakshmi Sowjanya D, Stephen Boyd, linux-kernel
Hi Thomas,
On Fri, Jul 18, 2025 at 10:25:12PM +0200, Thomas Gleixner wrote:
> On Wed, Jul 09 2025 at 10:32, Markus Blöchl wrote:
> > On Tue, Jul 08, 2025 at 12:09:40PM -0700, John Stultz wrote:
> >> On Tue, Jul 8, 2025 at 9:46 AM Markus Blöchl
> >> <markus.bloechl@ipetronik.com> wrote:
> >> >
> >> > diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
> >> > index a009c91f7b05..be0da807329f 100644
> >> > --- a/kernel/time/timekeeping.c
> >> > +++ b/kernel/time/timekeeping.c
> >> > @@ -1269,6 +1269,8 @@ int get_device_system_crosststamp(int (*get_time_fn)
> >> >
> >> > do {
> >> > seq = read_seqcount_begin(&tk_core.seq);
> >> > + system_counterval.use_nsecs = false;
> >> > +
> >>
> >> So if the argument is the local system_counterval structure isn't
> >> being fully initialized by the get_time_fn() functions it is passed
> >> to, it seems like it would be better to do so at the top of
> >> get_device_system_crosststamp(), and not inside the seqloop.
> >
> > Probably, I was just afraid of the case where get_time_fn() would take
> > like *very* different paths during different iterations.
> > But that seems really unlikely, indeed.
>
> It's impossible. xtstamp->device and the related get_time_fn() are
> immutable during the call.
I don't see it being entirely impossible, just nonsensical.
get_time_fn() tends to read volatile external data and is thus far from
being a pure function.
Some implementations (e.g. ptp_vmclock_get_time_fn()) can take some
weird turns, which I did not want to follow all the way.
I have no clue what could happen e.g. during a VM migration.
That's why resetting the struct before every invocation seemed to be
the safer option to me.
If you are certain that something like that can never happen, then I'm
totally happy.
>
> >> But having the responsibility to initialize/fill in the structure
> >> being split across the core and the implementation logic (leaving some
> >> of the fields as optional) feels prone to mistakes, so it makes me
> >> wonder if those drivers implementing the get_time_fn() really ought to
> >> fully fill out the structure, and thus the fix would be better done in
> >> those drivers.
> >
> > Yes, they should.
>
> No, they should not.
>
> The data structure is instantiated in get_device_system_crosststamp()
> and then handed in un-initialized to get_time_fn(), which is wrong to
> begin with. Why?
>
> That means if the structure is ever expanded, then you'd have to fix up
> all of the get_time_fn() implementations.
As long as all get_time_fn() implementations adhere to the convention of
always assigning an entire, fully-initialized struct and are always
compiled against the same headers as the kernel, I don't see that
necessity. But yes, those are obviously a lot of "if"s...
>
> Seriously?
>
> The obviously correct and future proof thing to do is:
>
> - struct system_counterval_t system_counterval;
> + struct system_counterval_t system_counterval = { };
>
> Which fixes the problem you discovered once and forever, no?
Works for me, too.
Reroll follows.
>
> Thanks,
>
> tglx
Thanks a lot,
Markus
--
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2025-07-20 13:51 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-07-08 16:46 [PATCH] timekeeping: Always initialize use_nsecs when querying time from phc drivers Markus Blöchl
2025-07-08 19:09 ` John Stultz
2025-07-09 8:32 ` Markus Blöchl
2025-07-18 20:25 ` Thomas Gleixner
2025-07-20 13:51 ` Markus Blöchl
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).