From: Marc Zyngier <maz@kernel.org>
To: Oliver Upton <oupton@google.com>
Cc: kvm@vger.kernel.org, Sean Christopherson <seanjc@google.com>,
Peter Shier <pshier@google.com>,
Raghavendra Rao Anata <rananta@google.com>,
David Matlack <dmatlack@google.com>,
Paolo Bonzini <pbonzini@redhat.com>,
kvmarm@lists.cs.columbia.edu, Jim Mattson <jmattson@google.com>
Subject: Re: [PATCH 02/10] KVM: arm64: Implement initial support for KVM_CAP_SYSTEM_COUNTER_STATE
Date: Wed, 09 Jun 2021 11:23:34 +0100 [thread overview]
Message-ID: <877dj3z68p.wl-maz@kernel.org> (raw)
In-Reply-To: <20210608214742.1897483-3-oupton@google.com>
Hi Oliver,
Please Cc the KVM/arm64 reviewers (now added). Also, please consider
subscribing to the kvmarm mailing list so that I don't have to
manually approve your posts ;-).
On Tue, 08 Jun 2021 22:47:34 +0100,
Oliver Upton <oupton@google.com> wrote:
>
> ARMv8 provides for a virtual counter-timer offset that is added to guest
> views of the virtual counter-timer (CNTVOFF_EL2). To date, KVM has not
> provided userspace with any perception of this, and instead affords a
> value-based scheme of migrating the virtual counter-timer by directly
> reading/writing the guest's CNTVCT_EL0. This is problematic because
> counters continue to elapse while the register is being written, meaning
> it is possible for drift to sneak in to the guest's time scale. This is
> exacerbated by the fact that KVM will calculate an appropriate
> CNTVOFF_EL2 every time the register is written, which will be broadcast
> to all virtual CPUs. The only possible way to avoid causing guest time
> to drift is to restore counter-timers by offset.
Well, the current method has one huge advantage: time can never go
backward from the guest PoV if you restore what you have saved. Yes,
time can elapse, but you don't even need to migrate to observe that.
>
> Implement initial support for KVM_{GET,SET}_SYSTEM_COUNTER_STATE ioctls
> to migrate the value of CNTVOFF_EL2. These ioctls yield precise control
> of the virtual counter-timers to userspace, allowing it to define its
> own heuristics for managing vCPU offsets.
I'm not really in favour of inventing a completely new API, for
multiple reasons:
- CNTVOFF is an EL2 concept. I'd rather not expose it as such as it
becomes really confusing with NV (which does expose its own CNTVOFF
via the ONE_REG interface)
- You seem to allow each vcpu to get its own offset. I don't think
that's right. The architecture defines that all PEs have the same
view of the counters, and an EL1 guest should be given that
illusion.
- by having a parallel save/restore interface, you make it harder to
reason about what happens with concurrent calls to both interfaces
- the userspace API is already horribly bloated, and I'm not overly
keen on adding more if we can avoid it.
I'd rather you extend the current ONE_REG interface and make it modal,
either allowing the restore of an absolute value or an offset for
CNTVCT_EL0. This would also keep a consistent behaviour when restoring
vcpus. The same logic would apply to the physical offset.
As for how to make it modal, we have plenty of bits left in the
ONE_REG encoding. Pick one, and make that a "relative" attribute. This
will result in some minor surgery in the get/set code paths, but at
least no entirely new mechanism.
One question though: how do you plan to reliably compute the offset?
As far as I can see, it is subject to the same issues you described
above (while the guest is being restored, time flies), and you have
the added risk of exposing a counter going backward from a guest
perspective.
Thanks,
M.
--
Without deviation from the norm, progress is not possible.
_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
WARNING: multiple messages have this Message-ID (diff)
From: Marc Zyngier <maz@kernel.org>
To: Oliver Upton <oupton@google.com>
Cc: kvm@vger.kernel.org, kvmarm@lists.cs.columbia.edu,
Paolo Bonzini <pbonzini@redhat.com>,
Sean Christopherson <seanjc@google.com>,
Peter Shier <pshier@google.com>,
Jim Mattson <jmattson@google.com>,
David Matlack <dmatlack@google.com>,
Ricardo Koller <ricarkol@google.com>,
Jing Zhang <jingzhangos@google.com>,
Raghavendra Rao Anata <rananta@google.com>,
Alexandru Elisei <Alexandru.Elisei@arm.com>,
James Morse <james.morse@arm.com>,
Suzuki K Poulose <suzuki.poulose@arm.com>
Subject: Re: [PATCH 02/10] KVM: arm64: Implement initial support for KVM_CAP_SYSTEM_COUNTER_STATE
Date: Wed, 09 Jun 2021 11:23:34 +0100 [thread overview]
Message-ID: <877dj3z68p.wl-maz@kernel.org> (raw)
In-Reply-To: <20210608214742.1897483-3-oupton@google.com>
Hi Oliver,
Please Cc the KVM/arm64 reviewers (now added). Also, please consider
subscribing to the kvmarm mailing list so that I don't have to
manually approve your posts ;-).
On Tue, 08 Jun 2021 22:47:34 +0100,
Oliver Upton <oupton@google.com> wrote:
>
> ARMv8 provides for a virtual counter-timer offset that is added to guest
> views of the virtual counter-timer (CNTVOFF_EL2). To date, KVM has not
> provided userspace with any perception of this, and instead affords a
> value-based scheme of migrating the virtual counter-timer by directly
> reading/writing the guest's CNTVCT_EL0. This is problematic because
> counters continue to elapse while the register is being written, meaning
> it is possible for drift to sneak in to the guest's time scale. This is
> exacerbated by the fact that KVM will calculate an appropriate
> CNTVOFF_EL2 every time the register is written, which will be broadcast
> to all virtual CPUs. The only possible way to avoid causing guest time
> to drift is to restore counter-timers by offset.
Well, the current method has one huge advantage: time can never go
backward from the guest PoV if you restore what you have saved. Yes,
time can elapse, but you don't even need to migrate to observe that.
>
> Implement initial support for KVM_{GET,SET}_SYSTEM_COUNTER_STATE ioctls
> to migrate the value of CNTVOFF_EL2. These ioctls yield precise control
> of the virtual counter-timers to userspace, allowing it to define its
> own heuristics for managing vCPU offsets.
I'm not really in favour of inventing a completely new API, for
multiple reasons:
- CNTVOFF is an EL2 concept. I'd rather not expose it as such as it
becomes really confusing with NV (which does expose its own CNTVOFF
via the ONE_REG interface)
- You seem to allow each vcpu to get its own offset. I don't think
that's right. The architecture defines that all PEs have the same
view of the counters, and an EL1 guest should be given that
illusion.
- by having a parallel save/restore interface, you make it harder to
reason about what happens with concurrent calls to both interfaces
- the userspace API is already horribly bloated, and I'm not overly
keen on adding more if we can avoid it.
I'd rather you extend the current ONE_REG interface and make it modal,
either allowing the restore of an absolute value or an offset for
CNTVCT_EL0. This would also keep a consistent behaviour when restoring
vcpus. The same logic would apply to the physical offset.
As for how to make it modal, we have plenty of bits left in the
ONE_REG encoding. Pick one, and make that a "relative" attribute. This
will result in some minor surgery in the get/set code paths, but at
least no entirely new mechanism.
One question though: how do you plan to reliably compute the offset?
As far as I can see, it is subject to the same issues you described
above (while the guest is being restored, time flies), and you have
the added risk of exposing a counter going backward from a guest
perspective.
Thanks,
M.
--
Without deviation from the norm, progress is not possible.
next prev parent reply other threads:[~2021-06-09 10:23 UTC|newest]
Thread overview: 46+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-06-08 21:47 [PATCH 00/10] KVM: Add idempotent controls for migrating system counter state Oliver Upton
2021-06-08 21:47 ` Oliver Upton
2021-06-08 21:47 ` [PATCH 01/10] KVM: Introduce KVM_{GET, SET}_SYSTEM_COUNTER_STATE ioctls Oliver Upton
2021-06-08 21:47 ` [PATCH 01/10] KVM: Introduce KVM_{GET,SET}_SYSTEM_COUNTER_STATE ioctls Oliver Upton
2021-06-08 21:47 ` [PATCH 02/10] KVM: arm64: Implement initial support for KVM_CAP_SYSTEM_COUNTER_STATE Oliver Upton
2021-06-08 21:47 ` Oliver Upton
2021-06-08 21:55 ` Oliver Upton
2021-06-08 21:55 ` Oliver Upton
2021-06-09 10:23 ` Marc Zyngier [this message]
2021-06-09 10:23 ` Marc Zyngier
2021-06-09 14:51 ` Oliver Upton
2021-06-09 14:51 ` Oliver Upton
2021-06-10 6:54 ` Paolo Bonzini
2021-06-10 6:54 ` Paolo Bonzini
2021-06-10 6:26 ` Paolo Bonzini
2021-06-10 6:26 ` Paolo Bonzini
2021-06-08 21:47 ` [PATCH 03/10] selftests: KVM: Introduce system_counter_state_test Oliver Upton
2021-06-08 21:47 ` Oliver Upton
2021-06-08 21:47 ` [PATCH 04/10] KVM: arm64: Add userspace control of the guest's physical counter Oliver Upton
2021-06-08 21:47 ` Oliver Upton
2021-06-08 21:58 ` Oliver Upton
2021-06-08 21:58 ` Oliver Upton
2021-06-08 21:47 ` [PATCH 05/10] selftests: KVM: Add test cases for physical counter offsetting Oliver Upton
2021-06-08 21:47 ` Oliver Upton
2021-06-08 21:47 ` [PATCH 06/10] selftests: KVM: Add counter emulation benchmark Oliver Upton
2021-06-08 21:47 ` Oliver Upton
2021-06-08 21:47 ` [PATCH 07/10] KVM: x86: Refactor tsc synchronization code Oliver Upton
2021-06-08 21:47 ` Oliver Upton
2021-06-08 21:47 ` [PATCH 08/10] KVM: x86: Implement KVM_CAP_SYSTEM_COUNTER_STATE Oliver Upton
2021-06-08 21:47 ` Oliver Upton
2021-06-08 21:47 ` [PATCH 09/10] selftests: KVM: Add support for x86 to system_counter_state_test Oliver Upton
2021-06-08 21:47 ` Oliver Upton
2021-06-08 21:47 ` [PATCH 10/10] Documentation: KVM: Document KVM_{GET, SET}_SYSTEM_COUNTER_STATE ioctls Oliver Upton
2021-06-08 21:47 ` [PATCH 10/10] Documentation: KVM: Document KVM_{GET,SET}_SYSTEM_COUNTER_STATE ioctls Oliver Upton
2021-06-09 13:05 ` [PATCH 00/10] KVM: Add idempotent controls for migrating system counter state Paolo Bonzini
2021-06-09 13:05 ` Paolo Bonzini
2021-06-09 15:11 ` Oliver Upton
2021-06-09 15:11 ` Oliver Upton
2021-06-09 17:05 ` Paolo Bonzini
2021-06-09 17:05 ` Paolo Bonzini
2021-06-09 22:04 ` Oliver Upton
2021-06-09 22:04 ` Oliver Upton
2021-06-10 6:22 ` Paolo Bonzini
2021-06-10 6:22 ` Paolo Bonzini
2021-06-10 6:53 ` Christian Borntraeger
2021-06-10 6:53 ` Christian Borntraeger
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=877dj3z68p.wl-maz@kernel.org \
--to=maz@kernel.org \
--cc=dmatlack@google.com \
--cc=jmattson@google.com \
--cc=kvm@vger.kernel.org \
--cc=kvmarm@lists.cs.columbia.edu \
--cc=oupton@google.com \
--cc=pbonzini@redhat.com \
--cc=pshier@google.com \
--cc=rananta@google.com \
--cc=seanjc@google.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.