From: Oliver Upton <oliver.upton@linux.dev>
To: Marc Zyngier <maz@kernel.org>
Cc: kvmarm@lists.linux.dev, kvm@vger.kernel.org,
linux-arm-kernel@lists.infradead.org,
James Morse <james.morse@arm.com>,
Suzuki K Poulose <suzuki.poulose@arm.com>,
Zenghui Yu <yuzenghui@huawei.com>,
Ricardo Koller <ricarkol@google.com>,
Simon Veith <sveith@amazon.de>,
dwmw2@infradead.org
Subject: Re: [PATCH 08/16] KVM: arm64: timers: Allow userspace to set the counter offsets
Date: Fri, 17 Feb 2023 22:11:36 +0000
Message-ID: <Y+/7mO1sxH4jThmu@linux.dev>
In-Reply-To: <86k00gy4so.wl-maz@kernel.org>
On Fri, Feb 17, 2023 at 10:17:27AM +0000, Marc Zyngier wrote:
> Hi Oliver,
>
> On Thu, 16 Feb 2023 22:09:47 +0000,
> Oliver Upton <oliver.upton@linux.dev> wrote:
> >
> > Hi Marc,
> >
> > On Thu, Feb 16, 2023 at 02:21:15PM +0000, Marc Zyngier wrote:
> > > And this is the moment you have all been waiting for: setting the
> > > counter offsets from userspace.
> > >
> > > We expose a brand new capability that reports the ability to set
> > > the offsets for both the virtual and physical sides, independently.
> > >
> > > In keeping with the architecture, the offsets are expressed as
> > > a delta that is subtracted from the physical counter value.
> > >
> > > Once this new API is used, there is no going back, and the counters
> > > cannot be written to to set the offsets implicitly (the writes
> > > are instead ignored).
> >
> > Is there any particular reason to use an explicit ioctl as opposed to
> > the KVM_{GET,SET}_DEVICE_ATTR ioctls? Dunno where you stand on it, but I
> > quite like that interface for simple state management. We also avoid
> > eating up more UAPI bits in the global namespace.
>
> The problem with that is that it requires yet another KVM device for
> this, and I'm lazy. It also makes it a bit harder for the VMM to buy
> into this (need to track another FD, for example).

You can also accept the device ioctls on the actual VM FD, quite like
we do for the vCPU right now. And hey, I've got a patch that gets you
most of the way there!

https://lore.kernel.org/kvmarm/20230211013759.3556016-3-oliver.upton@linux.dev/

> > Is there any reason why we can't just order this ioctl before vCPU
> > creation altogether, or is there a need to do this at runtime? We're
> > about to tolerate multiple writers to the offset value, and I think the
> > only thing we need to guarantee is that the below flag is set before
> > vCPU ioctls have a chance to run.
>
> Again, we don't know for sure whether the final offset is available
> before vcpu creation time. My idea for QEMU would be to perform the
> offset adjustment as late as possible, right before executing the VM,
> after having restored the vcpus with whatever value they had.

So how does userspace work out an offset based on available information?
The part that hasn't clicked for me yet is where userspace gets the
current value of the true physical counter to calculate an offset.

We could make it ABI that the guest's physical counter matches that of
the host by default. Of course, that has been the case since the
beginning of time but it is now directly user-visible.

The only part I don't like about that is that we aren't fully creating
an abstraction around host and guest system time. So here's my current
mental model of how we represent the generic timer to userspace:

                        +-----------------------+
                        |                       |
                        |  Host System Counter  |
                        |          (1)          |
                        +-----------------------+
                                    |
                        +-----------+-----------+
                        |                       |
+-----------------+  +-----+                 +-----+  +--------------------+
| (2) CNTPOFF_EL2 |--| sub |                 | sub |--| (3) CNTVOFF_EL2    |
+-----------------+  +-----+                 +-----+  +--------------------+
                        |                       |
                        |                       |
               +-----------------+      +----------------+
               | (5) CNTPCT_EL0  |      | (4) CNTVCT_EL0 |
               +-----------------+      +----------------+

AFAICT, this UAPI exposes abstractions for (2) and (3) to userspace, but
userspace cannot directly get at (1).
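
To make the arithmetic in that model concrete, here is a minimal sketch in C. The struct and function names are invented for illustration and do not exist in KVM; the only thing taken from the diagram is that each guest counter is the host counter minus its offset. It also shows why (1) matters: picking an offset that yields a given guest counter value needs the current host counter as an input.

```c
#include <stdint.h>

/* Illustrative model only; none of these names exist in KVM. */
struct cnt_model {
    uint64_t host_counter;  /* (1) */
    uint64_t cntpoff;       /* (2) */
    uint64_t cntvoff;       /* (3) */
};

/* (4): virtual counter as seen by the guest. */
static uint64_t model_cntvct(const struct cnt_model *m)
{
    return m->host_counter - m->cntvoff;
}

/* (5): physical counter as seen by the guest. */
static uint64_t model_cntpct(const struct cnt_model *m)
{
    return m->host_counter - m->cntpoff;
}

/*
 * Offset that makes a guest counter read 'target' at the moment the
 * host counter reads 'host_now'. Unsigned arithmetic wraps modulo
 * 2^64, so this also works when target > host_now.
 */
static uint64_t model_offset_for(uint64_t host_now, uint64_t target)
{
    return host_now - target;
}
```

Without a way to read (1), userspace has nothing to pass as host_now in the last helper.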

Chewing on this a bit more, I don't think userspace has any business
messing with virtual and physical time independently, especially when
nested virtualization comes into play.

I think the illusion to userspace needs to be built around the notion of
a system counter:

                        +-----------------------+
                        |                       |
                        |  Host System Counter  |
                        |          (1)          |
                        +-----------------------+
                                    |
                                    |
                                 +-----+   +-------------------+
                                 | sub |---| (6) system_offset |
                                 +-----+   +-------------------+
                                    |
                                    |
                        +-----------------------+
                        |                       |
                        | Guest System Counter  |
                        |          (7)          |
                        +-----------------------+
                                    |
                        +-----------+-----------+
                        |                       |
+-----------------+  +-----+                 +-----+  +--------------------+
| (2) CNTPOFF_EL2 |--| sub |                 | sub |--| (3) CNTVOFF_EL2    |
+-----------------+  +-----+                 +-----+  +--------------------+
                        |                       |
                        |                       |
               +-----------------+      +----------------+
               | (5) CNTPCT_EL0  |      | (4) CNTVCT_EL0 |
               +-----------------+      +----------------+

And from a UAPI perspective, we would either expose (1) and (6) to let
userspace calculate an offset or simply allow (7) to be directly
read/written.
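
As a rough sketch of why the two UAPI shapes are interchangeable, assume a hypothetical per-VM structure that stores only (6). All names below are made up for illustration. Writing (7) directly is then the same as having userspace read (1) and compute (6) itself; the only difference is who samples the host counter.

```c
#include <stdint.h>

/* Hypothetical per-VM state: only the system offset (6) is stored. */
struct vm_time {
    uint64_t system_offset;
};

/* Shape A: userspace has read (1) and writes (6) itself. */
static void vm_set_system_offset(struct vm_time *t, uint64_t offset)
{
    t->system_offset = offset;
}

/* Shape B: userspace writes (7); the host counter is sampled for it. */
static void vm_set_guest_counter(struct vm_time *t, uint64_t host_now,
                                 uint64_t value)
{
    t->system_offset = host_now - value;
}

/* Reading (7) back: host counter minus the system offset. */
static uint64_t vm_get_guest_counter(const struct vm_time *t,
                                     uint64_t host_now)
{
    return host_now - t->system_offset;
}
```

In shape B the sample of (1) and the write happen in one place, so userspace would not need to account for the time that passes between its own read of (1) and the write of (6).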

That frees up the meaning of the counter offsets as being purely a
virtual EL2 thing. These registers would reset to 0, and non-NV guests
could never change their value.

Under the hood KVM would program the true offset registers as:

  CNT{P,V}OFF_EL2 = 'virtual CNT{P,V}OFF_EL2' + system_offset

With this we would effectively configure CNTPCT = CNTVCT = 0 at the
point of VM creation. Only crappy thing is it requires full physical
counter/timer emulation for non-ECV systems, but the guest shouldn't be
using the physical counter in the first place.
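
The composition above can be sketched as follows. This is only a model with invented names, not KVM code: the value programmed into each hardware offset register is the guest hypervisor's virtual offset plus the system offset, and picking system_offset equal to the host counter at VM creation makes both guest counters start at 0.

```c
#include <stdint.h>

/* Illustrative only; field and function names are not KVM's. */
struct nv_offsets {
    uint64_t system_offset; /* (6), owned by the host/VMM */
    uint64_t virt_cntpoff;  /* virtual CNTPOFF_EL2, resets to 0 */
    uint64_t virt_cntvoff;  /* virtual CNTVOFF_EL2, resets to 0 */
};

/* Value that would be programmed into the real CNTPOFF_EL2. */
static uint64_t hw_cntpoff(const struct nv_offsets *o)
{
    return o->virt_cntpoff + o->system_offset;
}

/* Value that would be programmed into the real CNTVOFF_EL2. */
static uint64_t hw_cntvoff(const struct nv_offsets *o)
{
    return o->virt_cntvoff + o->system_offset;
}

/* Guest-visible counters: host counter minus the combined offset. */
static uint64_t nv_cntpct(const struct nv_offsets *o, uint64_t host_now)
{
    return host_now - hw_cntpoff(o);
}

static uint64_t nv_cntvct(const struct nv_offsets *o, uint64_t host_now)
{
    return host_now - hw_cntvoff(o);
}
```

Subtracting the combined offset once gives the same result as subtracting system_offset first to get (7) and then the virtual offset, which is what the two-level diagram describes.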

Yes, this sucks for guests running on hosts w/ NV but not ECV. If anyone
can tell me how an L0 hypervisor is supposed to do NV without ECV, I'm
all ears.

Does any of what I've written make even remote sense, or have I gone
entirely off the rails with my ASCII art? :)

--
Thanks,
Oliver