From: Thomas Gleixner <tglx@kernel.org>
To: David Woodhouse <dwmw2@infradead.org>,
Harshitha Ramamurthy <hramamurthy@google.com>,
netdev@vger.kernel.org, Arthur Kiyanovski <akiyano@amazon.com>
Cc: joshwash@google.com, andrew+netdev@lunn.ch, davem@davemloft.net,
edumazet@google.com, kuba@kernel.org, pabeni@redhat.com,
richardcochran@gmail.com, jstultz@google.com, sboyd@kernel.org,
willemb@google.com, nktgrg@google.com, jfraker@google.com,
ziweixiao@google.com, maolson@google.com, jordanrhee@google.com,
thostet@google.com, alok.a.tiwari@oracle.com,
pkaligineedi@google.com, horms@kernel.org,
jacob.e.keller@intel.com, yyd@google.com, jefrogers@google.com,
linux-kernel@vger.kernel.org,
"Naman Gulati" <namangulati@google.com>,
"Thomas Weißschuh" <thomas.weissschuh@linutronix.de>
Subject: Re: [PATCH net-next v8 3/3] gve: implement PTP gettimex64
Date: Fri, 22 May 2026 16:41:40 +0200 [thread overview]
Message-ID: <871pf3v74r.ffs@tglx> (raw)
In-Reply-To: <0d13721307646d7ca2bf1aaf22926f0137b9de30.camel@infradead.org>
On Fri, May 22 2026 at 09:56, David Woodhouse wrote:
> On Fri, 2026-05-22 at 09:49 +0200, Thomas Gleixner wrote:
>> > Yes. Isn't that what I asked Arthur to do?
>>
>> You maybe asked him, but I can't see any evidence of it anywhere. All I
>> can see is a magic struct attr, which conveys CS related data that is
>> supposed to be produced magically at some undefined place.
>
> https://lore.kernel.org/all/8b8a748a2ac6e8e3d970f5e74a2774133e7e7d8c.camel@infradead.org/
Right, after I pointed it out :)
>> > > > Is that not how it works? If not.... WTF *is* it for?
>> > >
>> > > Yes, that's the way it works since get_device_system_crosststamp() got
>> > > introduced, but that does not give a random driver access to the actual
>> > > converted (TSC) counter value. The driver can only observe the PTM value
>> > > which is ART on x86 and whatever on other architectures. So it does not
>> > > know anything about it.
>> >
>> > Sure, the driver can only return what it *knows*.
>> >
>> > We're certainly missing some infrastructure here... maybe the pci_bus
>> > should have a CSID which corresponds to its PTM domain, which on x86
>> > would be CSID_X86_ART?
>>
>> Yes, that's correct. But that does not solve the underlying problem:
>>
>> ...
>>
>> Q: How is a driver supposed to fill in attr->cs_cycles coherently?
>>
>> A: Not at all.
>>
>
> Let me rephrase that to check my full understanding of it in context:
>
> Q: There are drivers today which operate on the TSC or arch counter,
> and they're just fine. But how is a future driver which uses PCIe
> PTM supposed to full in attr->cs_cycles coherently?
>
> A: Yeah, our PCI and platform code lack the proper arch-agnostic
> support for interpreting PTM time values; a driver wanting to do
> that is going to need to do some infrastructure work first to
> enable a proper conversion. But that's literally what PTM/ART is
> *for* so that work has to happen at some point anyway.
That work is there:
get_device_system_cross_timestamp()
The only lacking feature is that there is no generic PCI method to tell
the driver which PMT clocksource ID it should use.
So the drivers do it manually:
system->cycles = PTM_READ();
system->cs_id = CSID_ART;
instead of doing:
system->cs_id = pci_dev_ptm_csid(pdev);
everything else is done by the core code for several years already.
The core code provides:
- PCH timestamp
- sys_realtime derived from system->cycles
- sys_monoraw derived from system->cycles
So now with your new proposed attr extenstions, you want attr to return
system->cycles to user space and you don't want the raw PTM value (ART)
but you want the corresponding TSC value which is:
TSC = M * ART + offset;
But the driver has no access to this information. That's why I'm ranting
about an interface which lacks the infrastructure to utilize it
properly.
I've seen this in the past 20 years over and over. The lack of
infrastructure makes driver people do abysmal hacks because they are
tasked to utilize the new interface. The result is a horrible mess,
which others have to clean up later when they want to do core changes.
>> The consequence of this is that the next driver writer will just go and
>> do:
>>
>> attr->cs_id = IS_ENABLED(X86) ? CSID_X86_TSC : CSID_ARM_COUNTER;
>> attr->cs_cycles = get_cycles();
>> or
>> attr->cs_cycles = ptm_cycles * pulledout_of_thin_air_factor +
>> pulledout_of_thin_air_offset;
>
> Driver authors are crap, sure, but that would just be egregiously wrong
> and hopefully it would be caught in review.
You wish. Shit get's merged faster than you can mop it up.
I agree that driver authors are interesting creatures, but you can't
whack them over the head when there is no generic infrastructure they
can use. They never will go and extend existing or provide new
infrastructure. That's on the people who add new functionality, which
will be consumed by driver writers.
> It has to be a literally *synchronous* capture or it's pointless.
You know that and I know that, but ...
>> The same applies to the pre/post timestamp mechanism for
>> gettimexattrs64().
>
> There's literally a ptp helper to generate those, which can use
> ktime_get_snapshot().
With some massaging, yes. Now that someone provided an extensible
snapshot function. :)
> Of *course* I'm trying to solve things for the general case.
>
> But that doesn't extend to implementing random parts of PCI and
> platform infrastructure to allow for hypothetical future drivers which
> want to use PTM to implement the optional fields we're adding today.
Optional fields are going to get filled when people see value in
them. You need no new drivers for that. Existing drivers hop on the new
bus as soon as it hits the road.
>> > Never get_cycles(). Let's perhaps make it *not* need unholy hacks to
>> > interpret a PTM counter value. Otherwise what's the point in ART and
>> > PTM at all?
>>
>> Perhaps? There is no perhaps and aside of that I gave you the solution
>> for the problem already in the mail you just replied to, no?
>
> Hm, did you? I'm sorry, I must have missed it. Even rereading it this
> morning, I see a lot of 'is bonkers' and 'unholy hacks' and '_cannot_
> fill them in coherently' but I can't see a concrete constructive
> suggestion. You meant <87bje8v0xl.ffs@tglx>, yes?
The demonstration patches to extend ktime_get_snapshot() and the half
assed support in get_device_system_crosststamp(). You actually asked me
to provide a SOB for the ktime_get_snapshot_id() part. No?
> Maybe it's lost in the noise. Let's try to simplify.
>
> • Userspace would like to be able to use PTP clocks to discipline the
> actual counter which is directly accessible in userspace and by KVM
> guests. Do do this, we only really need a system_counterval_t in
> the pre/post timestamps, and the helpers can add that.
>
> • *Some* drivers can actually provide a system_counterval_t in the
> device reading itself, either because they're naturally based on
> the CPU counter or because they have PTM. Drivers should report
> what they *do* know, in the correct time domain, and not make
> crap up.
Which needs support in get_device_system_crosststamp() because
that's what drivers have to use to utilize PTM.
> • Any driver which actually wants to do this based on PTM in the
> future will need to do the infrastructure work to make that work
> cleanly, and avoid having to make crap up.
That's wishful thinking. It did not happen in the last two decades
so why would this be different today?
> And if I'm understanding correctly:
>
> • The GVE driver made crap up? Although they do so orthogonally to the
> patch set we're *actually* talking about in this thread, because
> they did that entirely on their own *without* reference to Arthur's
> userspace API changes?
Correct.
Let's move this to the other thread which actually talks about your
vmclock stuff.
Thanks,
tglx
next prev parent reply other threads:[~2026-05-22 14:41 UTC|newest]
Thread overview: 19+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-05-14 22:58 [PATCH net-next v8 0/3] gve: add support for PTP gettimex64 Harshitha Ramamurthy
2026-05-14 22:58 ` [PATCH net-next v8 1/3] gve: skip error logging for retryable AdminQ commands Harshitha Ramamurthy
2026-05-14 22:58 ` [PATCH net-next v8 2/3] gve: make nic clock reads thread safe Harshitha Ramamurthy
2026-05-14 22:58 ` [PATCH net-next v8 3/3] gve: implement PTP gettimex64 Harshitha Ramamurthy
2026-05-20 21:08 ` Thomas Gleixner
2026-05-20 22:16 ` Jakub Kicinski
2026-05-21 4:35 ` Jordan Rhee
2026-05-21 10:08 ` David Laight
2026-05-21 11:43 ` David Woodhouse
2026-05-21 18:06 ` Thomas Gleixner
2026-05-21 19:59 ` David Woodhouse
2026-05-21 22:43 ` Thomas Gleixner
2026-05-22 0:12 ` David Woodhouse
2026-05-22 7:49 ` Thomas Gleixner
2026-05-22 8:56 ` David Woodhouse
2026-05-22 14:41 ` Thomas Gleixner [this message]
2026-05-22 10:34 ` David Woodhouse
2026-05-22 14:46 ` Thomas Gleixner
2026-05-20 1:30 ` [PATCH net-next v8 0/3] gve: add support for " patchwork-bot+netdevbpf
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=871pf3v74r.ffs@tglx \
--to=tglx@kernel.org \
--cc=akiyano@amazon.com \
--cc=alok.a.tiwari@oracle.com \
--cc=andrew+netdev@lunn.ch \
--cc=davem@davemloft.net \
--cc=dwmw2@infradead.org \
--cc=edumazet@google.com \
--cc=horms@kernel.org \
--cc=hramamurthy@google.com \
--cc=jacob.e.keller@intel.com \
--cc=jefrogers@google.com \
--cc=jfraker@google.com \
--cc=jordanrhee@google.com \
--cc=joshwash@google.com \
--cc=jstultz@google.com \
--cc=kuba@kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=maolson@google.com \
--cc=namangulati@google.com \
--cc=netdev@vger.kernel.org \
--cc=nktgrg@google.com \
--cc=pabeni@redhat.com \
--cc=pkaligineedi@google.com \
--cc=richardcochran@gmail.com \
--cc=sboyd@kernel.org \
--cc=thomas.weissschuh@linutronix.de \
--cc=thostet@google.com \
--cc=willemb@google.com \
--cc=yyd@google.com \
--cc=ziweixiao@google.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox