From: "Michael S. Tsirkin" <mst@redhat.com>
To: David Woodhouse <dwmw2@infradead.org>
Cc: Richard Cochran <richardcochran@gmail.com>,
Peter Hilber <peter.hilber@opensynergy.com>,
linux-kernel@vger.kernel.org, virtualization@lists.linux.dev,
linux-arm-kernel@lists.infradead.org, linux-rtc@vger.kernel.org,
"Ridoux, Julien" <ridouxj@amazon.com>,
virtio-dev@lists.linux.dev, "Luu, Ryan" <rluu@amazon.com>,
"Chashper, David" <chashper@amazon.com>,
"Mohamed Abuelfotoh, Hazem" <abuehaze@amazon.com>,
"Christopher S . Hall" <christopher.s.hall@intel.com>,
Jason Wang <jasowang@redhat.com>,
John Stultz <jstultz@google.com>,
netdev@vger.kernel.org, Stephen Boyd <sboyd@kernel.org>,
Thomas Gleixner <tglx@linutronix.de>,
Xuan Zhuo <xuanzhuo@linux.alibaba.com>,
Marc Zyngier <maz@kernel.org>,
Mark Rutland <mark.rutland@arm.com>,
Daniel Lezcano <daniel.lezcano@linaro.org>,
Alessandro Zummo <a.zummo@towertech.it>,
Alexandre Belloni <alexandre.belloni@bootlin.com>,
qemu-devel <qemu-devel@nongnu.org>,
Simon Horman <horms@kernel.org>
Subject: Re: [PATCH] ptp: Add vDSO-style vmclock support
Date: Thu, 25 Jul 2024 12:38:23 -0400
Message-ID: <20240725122603-mutt-send-email-mst@kernel.org>
In-Reply-To: <2a27205bfc61e19355d360f428a98e2338ff68c3.camel@infradead.org>
On Thu, Jul 25, 2024 at 04:18:43PM +0100, David Woodhouse wrote:
> On Thu, 2024-07-25 at 10:11 -0400, Michael S. Tsirkin wrote:
> > On Thu, Jul 25, 2024 at 02:50:50PM +0100, David Woodhouse wrote:
> > > Even if the virtio-rtc specification were official today, and I was
> > > able to expose it via PCI, I probably wouldn't do it that way. There's
> > > just far more in virtio-rtc than we need; the simple shared memory
> > > region is perfectly sufficient for most needs, and especially ours.
> >
> > I can't stop amazon from shipping whatever in its hypervisor,
> > I'd just like to understand this better, if there is a use-case
> > not addressed here then we can change virtio to address it.
> >
> > The rtc driver patch posted is 900 lines, yours is 700 lines, does not
> > look like a big difference. As for using a memory region, this is
> > valid, but maybe rtc should be changed to do exactly that?
>
> I'm certainly aiming for virtio-rtc to include that as an *option*,
> because I don't think it makes sense for an RTC specification
> aimed at virtual machines *not* to deal with the live migration
> problem.
>
> AFAICT the only ways to deal with the LM problem are either to make a
> hypercall/virtio transaction for *every* clock read which needs to be
> accurate, or expose a memory region for the guest to do it "vDSO-
> style".

virtio can support the second option: we already have
VIRTIO_PCI_CAP_SHARED_MEMORY_CFG, and I'd just use that.
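
To make the "vDSO-style" read concrete, here is a minimal sketch of a sequence-counter protocol over a shared page. It assumes a single writer (the hypervisor) that makes the counter odd while updating, and a reader that retries until it sees a stable, even value. The struct layout, the integer ns_per_tick rate, and every demo_* name are hypothetical and chosen for illustration; this is not the vmclock or virtio-rtc ABI.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical page layout for illustration; NOT the real vmclock ABI. */
struct demo_clock_page {
	_Atomic uint32_t seq;          /* odd while an update is in progress */
	_Atomic uint64_t counter_base; /* counter value at the reference point */
	_Atomic uint64_t time_base_ns; /* time in ns at the reference point */
	_Atomic uint64_t ns_per_tick;  /* simplified integer rate (assumption) */
};

/*
 * Writer side (the hypervisor in this sketch). Assumes a single writer and
 * an even seq on entry: make seq odd, update the fields, make seq even.
 */
static void demo_clock_publish(struct demo_clock_page *p, uint64_t counter_base,
			       uint64_t time_base_ns, uint64_t ns_per_tick)
{
	uint32_t s = atomic_load_explicit(&p->seq, memory_order_relaxed);

	atomic_store_explicit(&p->seq, s + 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&p->counter_base, counter_base, memory_order_relaxed);
	atomic_store_explicit(&p->time_base_ns, time_base_ns, memory_order_relaxed);
	atomic_store_explicit(&p->ns_per_tick, ns_per_tick, memory_order_relaxed);
	atomic_store_explicit(&p->seq, s + 2, memory_order_release);
}

/*
 * Reader side (guest kernel or userspace). counter_now is whatever counter
 * the guest reads locally; no hypercall or system call is involved.
 * Returns 0 on success, -1 if no stable snapshot was seen after 1000 tries.
 */
static int demo_clock_read(struct demo_clock_page *p, uint64_t counter_now,
			   uint64_t *out_ns)
{
	for (int tries = 0; tries < 1000; tries++) {
		uint32_t s1 = atomic_load_explicit(&p->seq, memory_order_acquire);
		uint64_t base = atomic_load_explicit(&p->counter_base, memory_order_relaxed);
		uint64_t t0 = atomic_load_explicit(&p->time_base_ns, memory_order_relaxed);
		uint64_t rate = atomic_load_explicit(&p->ns_per_tick, memory_order_relaxed);

		atomic_thread_fence(memory_order_acquire);
		uint32_t s2 = atomic_load_explicit(&p->seq, memory_order_relaxed);

		/* Accept the snapshot only if no update overlapped the reads. */
		if (!(s1 & 1) && s1 == s2) {
			*out_ns = t0 + (counter_now - base) * rate;
			return 0;
		}
	}
	return -1;
}
```

In this sketch, a host that has to step the clock (after live migration, for example) would call demo_clock_publish() with new parameters, and any reader that raced with the update simply retries.
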
> And similarly, unless we want guest userspace to have to make a
> *system* call every time, that memory region needs to be mappable all
> the way to userspace.

This part is classic for PCI: mapping a PCI BAR has been well
studied.
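
As a rough illustration of the userspace side, the sketch below maps a file read-only and shared with mmap(2). On Linux, PCI BARs are exposed as mmap-able sysfs files of the form /sys/bus/pci/devices/<BDF>/resourceN; the helper name demo_map_readonly, the idea of mapping from offset 0, and the length are assumptions made for the example. A real driver would more likely export the region through its own character device and take the offset from the device's capability.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Map the first len bytes of path read-only and shared.
 * Hypothetical helper: returns NULL on any failure.
 */
static const void *demo_map_readonly(const char *path, size_t len)
{
	int fd = open(path, O_RDONLY);
	if (fd < 0)
		return NULL;

	void *p = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);

	/* The mapping remains valid after the descriptor is closed. */
	close(fd);

	return p == MAP_FAILED ? NULL : p;
}
```

The caller would pass the sysfs resource path (or a device node) and release the mapping with munmap() when done.
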
> The use case isn't necessarily for all users of gettimeofday(), of
> course; this is for those applications which *need* precision time.
> Like distributed databases which rely on timestamps for coherency, and
> users who get fined millions of dollars when LM messes up their clocks
> and they put wrong timestamps on financial transactions.

I would, however, worry that with all this pass-through,
applications have to be coded to each hypervisor, or even to each
version of the hypervisor.

I don't really know the use-case well enough - is sending
an interrupt to Linux and having Linux create a device-independent
structure not workable?

> > E.g. we can easily add a capability describing such a region.
> > or put it in device config space.
>
> I think it has to be memory, not config space. But yes.

I mean virtio config space, which is just a region in a BAR.
But yes, maybe VIRTIO_PCI_CAP_SHARED_MEMORY_CFG is cleaner.

> The intent is that my driver would be usable with the shared memory
> region from a virtio-rtc device too. It'd need a tiny amount of
> refactoring of the discovery code in vmclock_probe(), which I haven't
> done yet as it would be premature optimisation.
>
> > I mean yes, we can build a new transport for each specific need but in
> > the end we'll get a ton of interfaces with unclear compatibility
> > requirements. If effort is instead spent improving common interfaces,
> > we get consistency and everyone benefits. That's why I'm trying to
> > understand the need here.
>
> It's simplicity. Because this isn't even a "transport". It's just a
> simple breadcrumb given to the guest to tell it where the information
> is.
>
> In the fullness of time assuming this becomes part of virtio-rtc too,
> the fact that it can *also* be discovered by ACPI is just a tiny
> detail. And it allows hypervisors to implement it a *whole* lot more
> simply.
>
> The addition of an ACPI method to enable the timekeeping does make it a
> tiny bit more than a 'breadcrumb', I concede, but that's still
> basically trivial to implement. A whole lot simpler than a full virtio
> device.
virtio has been developed with the painful experience that we keep
making mistakes, or coming up with new needed features,
and that maintaining forward and backward compatibility
becomes a whole lot harder than it seems in the beginning.
--
MST