* [RFC PATCH v1 0/1] KVM: s390: pv: fix clock comparator late after suspend/resume
@ 2022-08-25 11:50 Nico Boehr
2022-08-25 11:50 ` [RFC PATCH v1 1/1] KVM: s390: pv: don't allow userspace to set the clock under PV Nico Boehr
0 siblings, 1 reply; 3+ messages in thread
From: Nico Boehr @ 2022-08-25 11:50 UTC
To: kvm; +Cc: frankja, imbrenda, borntraeger

After a PV guest in QEMU has been paused and resumed, clock comparator
interrupts are delivered to the guest much too late.

This is caused by QEMU's tod-kvm device restoring the guest's TOD clock
upon guest resume. This is not possible with PV, since the guest's TOD
clock is controlled by the ultravisor.

Although this is not allowed under PV, KVM accepted the respective call
from userspace (VM attribute KVM_S390_VM_TOD) and updated its internal
data structures on this call. This can make the ultravisor's and KVM's
view of the guest TOD clock inconsistent. This in turn can lead to the
late delivery of clock comparator interrupts when KVM calculates when to
wake the guest.

This patch fixes the kernel portion of the problem by disallowing the VM
attribute call for the guest TOD clock, so userspace cannot mess up KVM's
view of the guest TOD. The fix causes an ugly warning in QEMU though;
hence, another fix is due for QEMU so that it does not even attempt to
set the guest TOD on resume.
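
To illustrate how an inconsistent epoch can delay clock comparator
delivery, here is a minimal, self-contained sketch. It is not the KVM
code: the function names are hypothetical, and it only assumes that the
hypervisor derives the guest TOD as host TOD plus an epoch offset and
sleeps until the clock comparator value is reached. One microsecond
corresponds to 4096 TOD units.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Convert TOD clock units to nanoseconds. One microsecond is 4096 TOD
 * units, so ns = units * 1000 / 4096 = units * 125 / 512. The split
 * keeps the multiplication from overflowing for large inputs.
 */
static uint64_t tod_units_to_ns(uint64_t units)
{
	return (units >> 9) * 125 + (((units & 0x1ff) * 125) >> 9);
}

/*
 * Hypothetical helper: how long the hypervisor would sleep before
 * waking a guest that waits for its clock comparator (ckc).
 * Assumption: guest TOD = host TOD + epoch, wrapping modulo 2^64.
 */
static uint64_t ckc_sleep_ns(uint64_t host_tod, uint64_t epoch, uint64_t ckc)
{
	uint64_t guest_now = host_tod + epoch;

	if (guest_now >= ckc)
		return 0;	/* comparator already reached */
	return tod_units_to_ns(ckc - guest_now);
}

/* Usage: a 1 ms timer becomes a 10.001 s sleep with a 10 s stale epoch. */
static void demo_stale_epoch(void)
{
	const uint64_t second = 1000000ULL * 4096;
	const uint64_t ms = 1000ULL * 4096;
	uint64_t host_tod = 100 * second;
	uint64_t real_epoch = 0;			/* what the guest really runs with */
	uint64_t stale_epoch = (uint64_t)0 - 10 * second;	/* what userspace "restored" */
	uint64_t ckc = host_tod + real_epoch + ms;	/* guest asks for 1 ms */

	assert(ckc_sleep_ns(host_tod, real_epoch, ckc) == 1000000ULL);
	assert(ckc_sleep_ns(host_tod, stale_epoch, ckc) == 10001000000ULL);
}
```

In this model the guest programmed its comparator against the real
epoch, while the sleep time is computed from the stale one, so the
wake-up is late by exactly the difference between the two epochs.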

Nico Boehr (1):
  KVM: s390: pv: don't allow userspace to set the clock under PV

 arch/s390/kvm/kvm-s390.c | 9 +++++++++
 1 file changed, 9 insertions(+)

--
2.36.1
* [RFC PATCH v1 1/1] KVM: s390: pv: don't allow userspace to set the clock under PV
2022-08-25 11:50 [RFC PATCH v1 0/1] KVM: s390: pv: fix clock comparator late after suspend/resume Nico Boehr
@ 2022-08-25 11:50 ` Nico Boehr
2022-08-26 11:35 ` Nico Boehr
0 siblings, 1 reply; 3+ messages in thread
From: Nico Boehr @ 2022-08-25 11:50 UTC
To: kvm; +Cc: frankja, imbrenda, borntraeger

When running under PV, the guest's TOD clock is under control of the
ultravisor and the hypervisor isn't allowed to change it. Hence, reject
attempts by userspace to change the guest's TOD clock with -EOPNOTSUPP.

When userspace changes the guest's TOD clock, KVM updates its
kvm.arch.epoch field and, in addition, the epoch field in all state
descriptions of all VCPUs.

Under PV, however, the ultravisor will ignore the epoch field in the
state description and simply overwrite it on the next SIE exit with the
actual guest epoch. This leads to KVM having an incorrect view of the
guest's TOD clock: it has updated its internal kvm.arch.epoch field, but
the ultravisor ignores the field in the state description.

Whenever a guest is now waiting for a clock comparator, KVM will
incorrectly calculate the time when the guest should wake up, possibly
causing the guest to sleep for much longer than expected.

Fixes: 0f3035047140 ("KVM: s390: protvirt: Do only reset registers that are accessible")
Signed-off-by: Nico Boehr <nrb@linux.ibm.com>
---
 arch/s390/kvm/kvm-s390.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
index edfd4bbd0cba..b6404cedda78 100644
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -1259,6 +1259,12 @@ static int kvm_s390_set_tod(struct kvm *kvm, struct kvm_device_attr *attr)
 	if (attr->flags)
 		return -EINVAL;
 
+	mutex_lock(&kvm->lock);
+	if (kvm_s390_pv_is_protected(kvm)) {
+		ret = -EOPNOTSUPP;
+		goto out_unlock;
+	}
+
 	switch (attr->attr) {
 	case KVM_S390_VM_TOD_EXT:
 		ret = kvm_s390_set_tod_ext(kvm, attr);
@@ -1273,6 +1279,9 @@ static int kvm_s390_set_tod(struct kvm *kvm, struct kvm_device_attr *attr)
 		ret = -ENXIO;
 		break;
 	}
+
+out_unlock:
+	mutex_unlock(&kvm->lock);
 	return ret;
 }
 
--
2.36.1
* Re: [RFC PATCH v1 1/1] KVM: s390: pv: don't allow userspace to set the clock under PV
2022-08-25 11:50 ` [RFC PATCH v1 1/1] KVM: s390: pv: don't allow userspace to set the clock under PV Nico Boehr
@ 2022-08-26 11:35 ` Nico Boehr
0 siblings, 0 replies; 3+ messages in thread
From: Nico Boehr @ 2022-08-26 11:35 UTC
To: kvm; +Cc: frankja, imbrenda, borntraeger
Quoting Nico Boehr (2022-08-25 13:50:15)
> When running under PV, the guest's TOD clock is under control of the
> ultravisor and the hypervisor isn't allowed to change it. Hence, don't
> allow userspace to change the guest's TOD clock by returning
> -EOPNOTSUPP.
>
> When userspace changes the guest's TOD clock, KVM updates its
> kvm.arch.epoch field and, in addition, the epoch field in all state
> descriptions of all VCPUs.
>
> But, under PV, the ultravisor will ignore the epoch field in the state
> description and simply overwrite it on next SIE exit with the actual
> guest epoch. This leads to KVM having an incorrect view of the guest's
> TOD clock: it has updated its internal kvm.arch.epoch field, but the
> ultravisor ignores the field in the state description.
>
> Whenever a guest is now waiting for a clock comparator, KVM will
> incorrectly calculate the time when the guest should wake up, possibly
> causing the guest to sleep for much longer than expected.
>
> Fixes: 0f3035047140 ("KVM: s390: protvirt: Do only reset registers that are accessible")
> Signed-off-by: Nico Boehr <nrb@linux.ibm.com>

This patch seems to break migration (QEMU gets stuck). This is possibly
a locking issue; I will investigate.
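
For readers wondering what a locking issue of this shape could look
like, the following toy model shows one possibility: an outer handler
takes a non-recursive lock and then calls a helper that takes the same
lock again. Whether any helper called from kvm_s390_set_tod() actually
acquires kvm->lock is an assumption here and is not established in this
thread; all names below are hypothetical, and the toy lock reports
-EDEADLK at the point where a real non-recursive mutex would block
forever.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Toy stand-in for a non-recursive mutex (single-threaded model). */
struct toy_mutex {
	bool held;
};

/* Returns 0 on success, -EDEADLK where a real mutex would hang. */
static int toy_lock(struct toy_mutex *m)
{
	if (m->held)
		return -EDEADLK;
	m->held = true;
	return 0;
}

static void toy_unlock(struct toy_mutex *m)
{
	m->held = false;
}

/* Hypothetical inner helper that takes the VM lock on its own. */
static int inner_set_clock(struct toy_mutex *vm_lock)
{
	int ret = toy_lock(vm_lock);

	if (ret)
		return ret;
	/* ... the epoch update would happen here ... */
	toy_unlock(vm_lock);
	return 0;
}

/* Outer handler with the same lock/check/unlock shape as the patch. */
static int outer_set_tod(struct toy_mutex *vm_lock, bool is_protected)
{
	int ret = toy_lock(vm_lock);

	if (ret)
		return ret;
	if (is_protected) {
		ret = -EOPNOTSUPP;
		goto out_unlock;
	}
	ret = inner_set_clock(vm_lock);	/* lock is still held here */

out_unlock:
	toy_unlock(vm_lock);
	return ret;
}

/* Usage: the protected path is fine, the unprotected path self-deadlocks. */
static void demo_nested_lock(void)
{
	struct toy_mutex vm_lock = { false };

	assert(outer_set_tod(&vm_lock, true) == -EOPNOTSUPP);
	assert(outer_set_tod(&vm_lock, false) == -EDEADLK);
	assert(inner_set_clock(&vm_lock) == 0);	/* fine when called alone */
}
```

If this is indeed the cause, the usual remedies are to take the lock in
only one of the two layers or to split the helper into a locked wrapper
and an unlocked inner function.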