* [PATCH 0/6] kvm-s390: collection of kvm-s390 fixes
@ 2009-05-05 14:39 ehrhardt
2009-05-05 14:39 ` [PATCH 1/6] kvm-s390: Fix memory slot versus run ehrhardt
` (5 more replies)
0 siblings, 6 replies; 26+ messages in thread
From: ehrhardt @ 2009-05-05 14:39 UTC (permalink / raw)
To: Avi Kivity, kvm; +Cc: ehrhardt, Christian Borntraeger, Carsten Otte
From: Christian Ehrhardt <ehrhardt@de.ibm.com>
This is a collection of fixes for kvm-s390 that originate from several tests
made over the last few months. They have been tested for a while now and
should be ready to be merged.
All six patches were created by either Carsten Otte or Christian Borntraeger.
I'm just the one who stumbled across the filled patch queue and cleaned them up
for submission. The patches themselves carry proper tags to credit their creators etc.
*Not sending patches for a few weeks makes one somewhat forgetful - I beg the
pardon of everyone on cc who received this twice now that the kvm list has been added.
Patches included:
[PATCH 1/6] kvm-s390: Fix memory slot versus run
[PATCH 2/6] kvm-s390: use hrtimer for clock wakeup from idle
[PATCH 3/6] kvm-s390: optimize float int lock: spin_lock_bh --> spin_lock
[PATCH 4/6] kvm-s390: Unlink vcpu on destroy
[PATCH 5/6] kvm-s390: Sanity check on validity intercept
[PATCH 6/6] kvm-s390: Verify memory in kvm run
Overall-Diffstat:
arch/s390/include/asm/kvm_host.h | 5 ++-
arch/s390/kvm/intercept.c | 28 ++++++++++++-------
arch/s390/kvm/interrupt.c | 55 ++++++++++++++++++++++++---------------
arch/s390/kvm/kvm-s390.c | 50 ++++++++++++++++++++++++++++-------
arch/s390/kvm/kvm-s390.h | 4 ++
arch/s390/kvm/priv.c | 4 +-
arch/s390/kvm/sigp.c | 16 +++++------
7 files changed, 110 insertions(+), 52 deletions(-)
^ permalink raw reply [flat|nested] 26+ messages in thread
* [PATCH 1/6] kvm-s390: Fix memory slot versus run
2009-05-05 14:39 [PATCH 0/6] kvm-s390: collection of kvm-s390 fixes ehrhardt
@ 2009-05-05 14:39 ` ehrhardt
2009-05-06 12:01 ` Avi Kivity
2009-05-05 14:39 ` [PATCH 2/6] kvm-s390: use hrtimer for clock wakeup from idle ehrhardt
` (4 subsequent siblings)
5 siblings, 1 reply; 26+ messages in thread
From: ehrhardt @ 2009-05-05 14:39 UTC (permalink / raw)
To: Avi Kivity, kvm; +Cc: ehrhardt, Christian Borntraeger, Carsten Otte
From: Carsten Otte <cotte@de.ibm.com>
This patch fixes an incorrectness in the kvm backend for s390.
In case virtual cpus are being created before the corresponding
memory slot is being registered, we need to update the sie
control blocks for the virtual cpus. In order to do that, we
use the vcpu->mutex to lock out kvm_run and friends. This way
we can ensure a consistent update of the memory for the entire
smp configuration.
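The pattern the patch uses can be sketched in plain userspace C; the struct and field names below are simplified stand-ins for the kernel structures, and the "locked" flag only models holding vcpu->mutex:

```c
#include <stddef.h>

#define MAX_VCPUS 4

struct vcpu {
	int locked;          /* models vcpu->mutex being held */
	unsigned long gmsor; /* guest memory origin, as in the sie block */
	unsigned long gmslm; /* guest memory limit */
};

struct vm {
	struct vcpu *vcpus[MAX_VCPUS];
	unsigned long guest_origin;
	unsigned long guest_memsize;
};

void set_memory_region(struct vm *vm, unsigned long origin, unsigned long size)
{
	int i;

	/* lock all vcpus so none can run with a half-updated configuration */
	for (i = 0; i < MAX_VCPUS; i++)
		if (vm->vcpus[i])
			vm->vcpus[i]->locked = 1;

	vm->guest_origin = origin;
	vm->guest_memsize = size;

	/* update each sie control block, then release the lock */
	for (i = 0; i < MAX_VCPUS; i++) {
		if (vm->vcpus[i]) {
			vm->vcpus[i]->gmsor = origin;
			/* the real code also adds VIRTIODESCSPACE here */
			vm->vcpus[i]->gmslm = origin + size - 1;
			vm->vcpus[i]->locked = 0;
		}
	}
}
```

Taking every vcpu mutex before touching the layout is what guarantees that no vcpu re-enters its run loop between the global update and the per-vcpu sie block update.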
Reported-by: Mijo Safradin <mijo@linux.vnet.ibm.com>
Signed-off-by: Carsten Otte <cotte@de.ibm.com>
---
arch/s390/kvm/kvm-s390.c | 24 ++++++++++++++++++++----
1 file changed, 20 insertions(+), 4 deletions(-)
Index: kvm/arch/s390/kvm/kvm-s390.c
===================================================================
--- kvm.orig/arch/s390/kvm/kvm-s390.c
+++ kvm/arch/s390/kvm/kvm-s390.c
@@ -657,6 +657,8 @@ int kvm_arch_set_memory_region(struct kv
struct kvm_memory_slot old,
int user_alloc)
{
+ int i;
+
/* A few sanity checks. We can have exactly one memory slot which has
to start at guest virtual zero and which has to be located at a
page boundary in userland and which has to end at a page boundary.
@@ -676,13 +678,27 @@ int kvm_arch_set_memory_region(struct kv
if (mem->memory_size & (PAGE_SIZE - 1))
return -EINVAL;
+ /* lock all vcpus */
+ for (i = 0; i < KVM_MAX_VCPUS; ++i) {
+ if (kvm->vcpus[i])
+ mutex_lock(&kvm->vcpus[i]->mutex);
+ }
+
kvm->arch.guest_origin = mem->userspace_addr;
kvm->arch.guest_memsize = mem->memory_size;
- /* FIXME: we do want to interrupt running CPUs and update their memory
- configuration now to avoid race conditions. But hey, changing the
- memory layout while virtual CPUs are running is usually bad
- programming practice. */
+ /* update sie control blocks, and unlock all vcpus */
+ for (i = 0; i < KVM_MAX_VCPUS; ++i) {
+ if (kvm->vcpus[i]) {
+ kvm->vcpus[i]->arch.sie_block->gmsor =
+ kvm->arch.guest_origin;
+ kvm->vcpus[i]->arch.sie_block->gmslm =
+ kvm->arch.guest_memsize +
+ kvm->arch.guest_origin +
+ VIRTIODESCSPACE - 1ul;
+ mutex_unlock(&kvm->vcpus[i]->mutex);
+ }
+ }
return 0;
}
* [PATCH 2/6] kvm-s390: use hrtimer for clock wakeup from idle
2009-05-05 14:39 [PATCH 0/6] kvm-s390: collection of kvm-s390 fixes ehrhardt
2009-05-05 14:39 ` [PATCH 1/6] kvm-s390: Fix memory slot versus run ehrhardt
@ 2009-05-05 14:39 ` ehrhardt
2009-05-06 12:10 ` Avi Kivity
2009-05-05 14:39 ` [PATCH 3/6] kvm-s390: optimize float int lock: spin_lock_bh --> spin_lock ehrhardt
` (3 subsequent siblings)
5 siblings, 1 reply; 26+ messages in thread
From: ehrhardt @ 2009-05-05 14:39 UTC (permalink / raw)
To: Avi Kivity, kvm; +Cc: ehrhardt, Christian Borntraeger, Carsten Otte
From: Christian Borntraeger <borntraeger@de.ibm.com>
This patch reworks the s390 clock comparator wakeup to hrtimer. The clock
comparator is a per-cpu value that is compared against the TOD clock. If
ckc <= TOD an external interrupt 1004 is triggered. Since the clock comparator
and the TOD clock have a much higher resolution than jiffies we should use
hrtimers to trigger the wakeup. This speeds up guest nanosleep for small
values.
Since hrtimer callbacks run in hard-irq context, I added a tasklet to do
the actual work with enabled interrupts.
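The unit conversion behind the patch can be illustrated in isolation. On s390, bit 51 of the TOD clock ticks once per microsecond, so one TOD unit is 1/4096 µs; the helpers below are illustrative userspace code, not the kernel implementation:

```c
#include <stdint.h>

/* One microsecond is 4096 TOD clock units. Dividing before multiplying,
 * as the patch does, avoids 64-bit overflow for large deltas at the cost
 * of sub-microsecond precision. */
static uint64_t tod_to_ns(uint64_t tod_delta)
{
	return tod_delta / 4096 * 1000;
}

/* The old jiffies-based timer could only resolve whole ticks (HZ per
 * second), so a short guest nanosleep waited far longer than requested. */
static uint64_t jiffies_granularity_ns(unsigned int hz)
{
	return 1000000000ull / hz;
}
```

With HZ=100 the old granularity was 10 ms, while the hrtimer path can honor wakeups in the microsecond range.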
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Carsten Otte <cotte@de.ibm.com>
---
include/asm/kvm_host.h | 5 ++++-
kvm/interrupt.c | 35 +++++++++++++++++++++++++----------
kvm/kvm-s390.c | 7 +++++--
kvm/kvm-s390.h | 4 +++-
4 files changed, 37 insertions(+), 14 deletions(-)
Index: kvm/arch/s390/include/asm/kvm_host.h
===================================================================
--- kvm.orig/arch/s390/include/asm/kvm_host.h 2009-05-05 15:58:45.000000000 +0200
+++ kvm/arch/s390/include/asm/kvm_host.h 2009-05-05 16:16:49.000000000 +0200
@@ -13,6 +13,8 @@
#ifndef ASM_KVM_HOST_H
#define ASM_KVM_HOST_H
+#include <linux/hrtimer.h>
+#include <linux/interrupt.h>
#include <linux/kvm_host.h>
#include <asm/debug.h>
#include <asm/cpuid.h>
@@ -210,7 +212,8 @@
s390_fp_regs guest_fpregs;
unsigned int guest_acrs[NUM_ACRS];
struct kvm_s390_local_interrupt local_int;
- struct timer_list ckc_timer;
+ struct hrtimer ckc_timer;
+ struct tasklet_struct tasklet;
union {
cpuid_t cpu_id;
u64 stidp_data;
Index: kvm/arch/s390/kvm/interrupt.c
===================================================================
--- kvm.orig/arch/s390/kvm/interrupt.c 2009-05-05 15:58:45.000000000 +0200
+++ kvm/arch/s390/kvm/interrupt.c 2009-05-05 16:18:02.000000000 +0200
@@ -12,6 +12,8 @@
#include <asm/lowcore.h>
#include <asm/uaccess.h>
+#include <linux/hrtimer.h>
+#include <linux/interrupt.h>
#include <linux/kvm_host.h>
#include <linux/signal.h>
#include "kvm-s390.h"
@@ -361,12 +363,12 @@
return 0;
}
- sltime = (vcpu->arch.sie_block->ckc - now) / (0xf4240000ul / HZ) + 1;
+ sltime = (vcpu->arch.sie_block->ckc - now)/4096*1000;
- vcpu->arch.ckc_timer.expires = jiffies + sltime;
-
- add_timer(&vcpu->arch.ckc_timer);
- VCPU_EVENT(vcpu, 5, "enabled wait timer:%llx jiffies", sltime);
+ hrtimer_start(&vcpu->arch.ckc_timer, ktime_set(0, sltime),
+ HRTIMER_MODE_REL);
+ VCPU_EVENT(vcpu, 5, "enabled wait via clock comparator: %llx ns",
+ sltime);
no_timer:
spin_lock_bh(&vcpu->arch.local_int.float_int->lock);
spin_lock_bh(&vcpu->arch.local_int.lock);
@@ -389,21 +391,34 @@
remove_wait_queue(&vcpu->wq, &wait);
spin_unlock_bh(&vcpu->arch.local_int.lock);
spin_unlock_bh(&vcpu->arch.local_int.float_int->lock);
- del_timer(&vcpu->arch.ckc_timer);
+ hrtimer_try_to_cancel(&vcpu->arch.ckc_timer);
return 0;
}
-void kvm_s390_idle_wakeup(unsigned long data)
+void kvm_s390_tasklet(unsigned long parm)
{
- struct kvm_vcpu *vcpu = (struct kvm_vcpu *)data;
+ struct kvm_vcpu *vcpu = (struct kvm_vcpu *) parm;
- spin_lock_bh(&vcpu->arch.local_int.lock);
+ spin_lock(&vcpu->arch.local_int.lock);
vcpu->arch.local_int.timer_due = 1;
if (waitqueue_active(&vcpu->arch.local_int.wq))
wake_up_interruptible(&vcpu->arch.local_int.wq);
- spin_unlock_bh(&vcpu->arch.local_int.lock);
+ spin_unlock(&vcpu->arch.local_int.lock);
}
+/*
+ * low level hrtimer wake routine. Because this runs in hardirq context
+ * we schedule a tasklet to do the real work.
+ */
+enum hrtimer_restart kvm_s390_idle_wakeup(struct hrtimer *timer)
+{
+ struct kvm_vcpu *vcpu;
+
+ vcpu = container_of(timer, struct kvm_vcpu, arch.ckc_timer);
+ tasklet_schedule(&vcpu->arch.tasklet);
+
+ return HRTIMER_NORESTART;
+}
void kvm_s390_deliver_pending_interrupts(struct kvm_vcpu *vcpu)
{
Index: kvm/arch/s390/kvm/kvm-s390.c
===================================================================
--- kvm.orig/arch/s390/kvm/kvm-s390.c 2009-05-05 16:16:48.000000000 +0200
+++ kvm/arch/s390/kvm/kvm-s390.c 2009-05-05 16:16:49.000000000 +0200
@@ -15,6 +15,7 @@
#include <linux/compiler.h>
#include <linux/err.h>
#include <linux/fs.h>
+#include <linux/hrtimer.h>
#include <linux/init.h>
#include <linux/kvm.h>
#include <linux/kvm_host.h>
@@ -286,8 +287,10 @@
vcpu->arch.sie_block->gmsor = vcpu->kvm->arch.guest_origin;
vcpu->arch.sie_block->ecb = 2;
vcpu->arch.sie_block->eca = 0xC1002001U;
- setup_timer(&vcpu->arch.ckc_timer, kvm_s390_idle_wakeup,
- (unsigned long) vcpu);
+ hrtimer_init(&vcpu->arch.ckc_timer, CLOCK_REALTIME, HRTIMER_MODE_ABS);
+ tasklet_init(&vcpu->arch.tasklet, kvm_s390_tasklet,
+ (unsigned long) vcpu);
+ vcpu->arch.ckc_timer.function = kvm_s390_idle_wakeup;
get_cpu_id(&vcpu->arch.cpu_id);
vcpu->arch.cpu_id.version = 0xff;
return 0;
Index: kvm/arch/s390/kvm/kvm-s390.h
===================================================================
--- kvm.orig/arch/s390/kvm/kvm-s390.h 2009-05-05 15:58:45.000000000 +0200
+++ kvm/arch/s390/kvm/kvm-s390.h 2009-05-05 16:16:49.000000000 +0200
@@ -14,6 +14,7 @@
#ifndef ARCH_S390_KVM_S390_H
#define ARCH_S390_KVM_S390_H
+#include <linux/hrtimer.h>
#include <linux/kvm.h>
#include <linux/kvm_host.h>
@@ -41,7 +42,8 @@
}
int kvm_s390_handle_wait(struct kvm_vcpu *vcpu);
-void kvm_s390_idle_wakeup(unsigned long data);
+enum hrtimer_restart kvm_s390_idle_wakeup(struct hrtimer *timer);
+void kvm_s390_tasklet(unsigned long parm);
void kvm_s390_deliver_pending_interrupts(struct kvm_vcpu *vcpu);
int kvm_s390_inject_vm(struct kvm *kvm,
struct kvm_s390_interrupt *s390int);
* [PATCH 3/6] kvm-s390: optimize float int lock: spin_lock_bh --> spin_lock
2009-05-05 14:39 [PATCH 0/6] kvm-s390: collection of kvm-s390 fixes ehrhardt
2009-05-05 14:39 ` [PATCH 1/6] kvm-s390: Fix memory slot versus run ehrhardt
2009-05-05 14:39 ` [PATCH 2/6] kvm-s390: use hrtimer for clock wakeup from idle ehrhardt
@ 2009-05-05 14:39 ` ehrhardt
2009-05-05 14:39 ` [PATCH 4/6] kvm-s390: Unlink vcpu on destroy ehrhardt
` (2 subsequent siblings)
5 siblings, 0 replies; 26+ messages in thread
From: ehrhardt @ 2009-05-05 14:39 UTC (permalink / raw)
To: Avi Kivity, kvm; +Cc: ehrhardt, Christian Borntraeger, Carsten Otte
From: Christian Borntraeger <borntraeger@de.ibm.com>
The floating interrupt lock is only taken in process context. We can
replace all spin_lock_bh with standard spin_lock calls.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
---
arch/s390/kvm/interrupt.c | 20 ++++++++++----------
arch/s390/kvm/kvm-s390.c | 4 ++--
arch/s390/kvm/priv.c | 4 ++--
arch/s390/kvm/sigp.c | 16 ++++++++--------
4 files changed, 22 insertions(+), 22 deletions(-)
Index: kvm/arch/s390/kvm/interrupt.c
===================================================================
--- kvm.orig/arch/s390/kvm/interrupt.c
+++ kvm/arch/s390/kvm/interrupt.c
@@ -301,13 +301,13 @@ int kvm_cpu_has_interrupt(struct kvm_vcp
}
if ((!rc) && atomic_read(&fi->active)) {
- spin_lock_bh(&fi->lock);
+ spin_lock(&fi->lock);
list_for_each_entry(inti, &fi->list, list)
if (__interrupt_is_deliverable(vcpu, inti)) {
rc = 1;
break;
}
- spin_unlock_bh(&fi->lock);
+ spin_unlock(&fi->lock);
}
if ((!rc) && (vcpu->arch.sie_block->ckc <
@@ -368,7 +368,7 @@ int kvm_s390_handle_wait(struct kvm_vcpu
hrtimer_start(&vcpu->arch.ckc_timer, ktime_set (0, sltime) , HRTIMER_MODE_REL);
VCPU_EVENT(vcpu, 5, "enabled wait via clock comparator: %llx ns", sltime);
no_timer:
- spin_lock_bh(&vcpu->arch.local_int.float_int->lock);
+ spin_lock(&vcpu->arch.local_int.float_int->lock);
spin_lock_bh(&vcpu->arch.local_int.lock);
add_wait_queue(&vcpu->arch.local_int.wq, &wait);
while (list_empty(&vcpu->arch.local_int.list) &&
@@ -377,18 +377,18 @@ no_timer:
!signal_pending(current)) {
set_current_state(TASK_INTERRUPTIBLE);
spin_unlock_bh(&vcpu->arch.local_int.lock);
- spin_unlock_bh(&vcpu->arch.local_int.float_int->lock);
+ spin_unlock(&vcpu->arch.local_int.float_int->lock);
vcpu_put(vcpu);
schedule();
vcpu_load(vcpu);
- spin_lock_bh(&vcpu->arch.local_int.float_int->lock);
+ spin_lock(&vcpu->arch.local_int.float_int->lock);
spin_lock_bh(&vcpu->arch.local_int.lock);
}
__unset_cpu_idle(vcpu);
__set_current_state(TASK_RUNNING);
remove_wait_queue(&vcpu->wq, &wait);
spin_unlock_bh(&vcpu->arch.local_int.lock);
- spin_unlock_bh(&vcpu->arch.local_int.float_int->lock);
+ spin_unlock(&vcpu->arch.local_int.float_int->lock);
hrtimer_try_to_cancel(&vcpu->arch.ckc_timer);
return 0;
}
@@ -455,7 +455,7 @@ void kvm_s390_deliver_pending_interrupts
if (atomic_read(&fi->active)) {
do {
deliver = 0;
- spin_lock_bh(&fi->lock);
+ spin_lock(&fi->lock);
list_for_each_entry_safe(inti, n, &fi->list, list) {
if (__interrupt_is_deliverable(vcpu, inti)) {
list_del(&inti->list);
@@ -466,7 +466,7 @@ void kvm_s390_deliver_pending_interrupts
}
if (list_empty(&fi->list))
atomic_set(&fi->active, 0);
- spin_unlock_bh(&fi->lock);
+ spin_unlock(&fi->lock);
if (deliver) {
__do_deliver_interrupt(vcpu, inti);
kfree(inti);
@@ -531,7 +531,7 @@ int kvm_s390_inject_vm(struct kvm *kvm,
mutex_lock(&kvm->lock);
fi = &kvm->arch.float_int;
- spin_lock_bh(&fi->lock);
+ spin_lock(&fi->lock);
list_add_tail(&inti->list, &fi->list);
atomic_set(&fi->active, 1);
sigcpu = find_first_bit(fi->idle_mask, KVM_MAX_VCPUS);
@@ -548,7 +548,7 @@ int kvm_s390_inject_vm(struct kvm *kvm,
if (waitqueue_active(&li->wq))
wake_up_interruptible(&li->wq);
spin_unlock_bh(&li->lock);
- spin_unlock_bh(&fi->lock);
+ spin_unlock(&fi->lock);
mutex_unlock(&kvm->lock);
return 0;
}
Index: kvm/arch/s390/kvm/kvm-s390.c
===================================================================
--- kvm.orig/arch/s390/kvm/kvm-s390.c
+++ kvm/arch/s390/kvm/kvm-s390.c
@@ -323,11 +323,11 @@ struct kvm_vcpu *kvm_arch_vcpu_create(st
spin_lock_init(&vcpu->arch.local_int.lock);
INIT_LIST_HEAD(&vcpu->arch.local_int.list);
vcpu->arch.local_int.float_int = &kvm->arch.float_int;
- spin_lock_bh(&kvm->arch.float_int.lock);
+ spin_lock(&kvm->arch.float_int.lock);
kvm->arch.float_int.local_int[id] = &vcpu->arch.local_int;
init_waitqueue_head(&vcpu->arch.local_int.wq);
vcpu->arch.local_int.cpuflags = &vcpu->arch.sie_block->cpuflags;
- spin_unlock_bh(&kvm->arch.float_int.lock);
+ spin_unlock(&kvm->arch.float_int.lock);
rc = kvm_vcpu_init(vcpu, kvm, id);
if (rc)
Index: kvm/arch/s390/kvm/priv.c
===================================================================
--- kvm.orig/arch/s390/kvm/priv.c
+++ kvm/arch/s390/kvm/priv.c
@@ -204,11 +204,11 @@ static void handle_stsi_3_2_2(struct kvm
int cpus = 0;
int n;
- spin_lock_bh(&fi->lock);
+ spin_lock(&fi->lock);
for (n = 0; n < KVM_MAX_VCPUS; n++)
if (fi->local_int[n])
cpus++;
- spin_unlock_bh(&fi->lock);
+ spin_unlock(&fi->lock);
/* deal with other level 3 hypervisors */
if (stsi(mem, 3, 2, 2) == -ENOSYS)
Index: kvm/arch/s390/kvm/sigp.c
===================================================================
--- kvm.orig/arch/s390/kvm/sigp.c
+++ kvm/arch/s390/kvm/sigp.c
@@ -52,7 +52,7 @@ static int __sigp_sense(struct kvm_vcpu
if (cpu_addr >= KVM_MAX_VCPUS)
return 3; /* not operational */
- spin_lock_bh(&fi->lock);
+ spin_lock(&fi->lock);
if (fi->local_int[cpu_addr] == NULL)
rc = 3; /* not operational */
else if (atomic_read(fi->local_int[cpu_addr]->cpuflags)
@@ -64,7 +64,7 @@ static int __sigp_sense(struct kvm_vcpu
*reg |= SIGP_STAT_STOPPED;
rc = 1; /* status stored */
}
- spin_unlock_bh(&fi->lock);
+ spin_unlock(&fi->lock);
VCPU_EVENT(vcpu, 4, "sensed status of cpu %x rc %x", cpu_addr, rc);
return rc;
@@ -86,7 +86,7 @@ static int __sigp_emergency(struct kvm_v
inti->type = KVM_S390_INT_EMERGENCY;
- spin_lock_bh(&fi->lock);
+ spin_lock(&fi->lock);
li = fi->local_int[cpu_addr];
if (li == NULL) {
rc = 3; /* not operational */
@@ -102,7 +102,7 @@ static int __sigp_emergency(struct kvm_v
spin_unlock_bh(&li->lock);
rc = 0; /* order accepted */
unlock:
- spin_unlock_bh(&fi->lock);
+ spin_unlock(&fi->lock);
VCPU_EVENT(vcpu, 4, "sent sigp emerg to cpu %x", cpu_addr);
return rc;
}
@@ -123,7 +123,7 @@ static int __sigp_stop(struct kvm_vcpu *
inti->type = KVM_S390_SIGP_STOP;
- spin_lock_bh(&fi->lock);
+ spin_lock(&fi->lock);
li = fi->local_int[cpu_addr];
if (li == NULL) {
rc = 3; /* not operational */
@@ -142,7 +142,7 @@ static int __sigp_stop(struct kvm_vcpu *
spin_unlock_bh(&li->lock);
rc = 0; /* order accepted */
unlock:
- spin_unlock_bh(&fi->lock);
+ spin_unlock(&fi->lock);
VCPU_EVENT(vcpu, 4, "sent sigp stop to cpu %x", cpu_addr);
return rc;
}
@@ -188,7 +188,7 @@ static int __sigp_set_prefix(struct kvm_
if (!inti)
return 2; /* busy */
- spin_lock_bh(&fi->lock);
+ spin_lock(&fi->lock);
li = fi->local_int[cpu_addr];
if ((cpu_addr >= KVM_MAX_VCPUS) || (li == NULL)) {
@@ -220,7 +220,7 @@ static int __sigp_set_prefix(struct kvm_
out_li:
spin_unlock_bh(&li->lock);
out_fi:
- spin_unlock_bh(&fi->lock);
+ spin_unlock(&fi->lock);
return rc;
}
* [PATCH 4/6] kvm-s390: Unlink vcpu on destroy
2009-05-05 14:39 [PATCH 0/6] kvm-s390: collection of kvm-s390 fixes ehrhardt
` (2 preceding siblings ...)
2009-05-05 14:39 ` [PATCH 3/6] kvm-s390: optimize float int lock: spin_lock_bh --> spin_lock ehrhardt
@ 2009-05-05 14:39 ` ehrhardt
2009-05-06 12:11 ` Avi Kivity
2009-05-05 14:39 ` [PATCH 5/6] kvm-s390: Sanity check on validity intercept ehrhardt
2009-05-05 14:39 ` [PATCH 6/6] kvm-s390: Verify memory in kvm run ehrhardt
5 siblings, 1 reply; 26+ messages in thread
From: ehrhardt @ 2009-05-05 14:39 UTC (permalink / raw)
To: Avi Kivity, kvm; +Cc: ehrhardt, Christian Borntraeger, Carsten Otte
From: Carsten Otte <cotte@de.ibm.com>
This patch makes sure we do unlink a vcpu's sie control block
from the system control area in kvm_arch_vcpu_destroy. This
prevents illegal accesses to the sie control block from other
virtual cpus after free.
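The unlink logic amounts to a compare-then-clear on the vcpu's system control area entry; a userspace sketch with simplified stand-in types:

```c
#include <stdint.h>

#define MAX_CPUS 4

/* simplified stand-ins for the kernel's sca structures */
struct sca_entry { uint64_t sda; };
struct sca { struct sca_entry cpu[MAX_CPUS]; };

/* Clear the SCA entry only if it still points at this vcpu's sie control
 * block, so a slot that was re-assigned to another vcpu is left alone. */
static void unlink_vcpu(struct sca *sca, int vcpu_id, uint64_t sie_block_addr)
{
	if (sca->cpu[vcpu_id].sda == sie_block_addr)
		sca->cpu[vcpu_id].sda = 0;
}
```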
Reported-by: Mijo Safradin <mijo@linux.vnet.ibm.com>
Signed-off-by: Carsten Otte <cotte@de.ibm.com>
---
arch/s390/kvm/kvm-s390.c | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)
Index: kvm/arch/s390/kvm/kvm-s390.c
===================================================================
--- kvm.orig/arch/s390/kvm/kvm-s390.c
+++ kvm/arch/s390/kvm/kvm-s390.c
@@ -195,6 +195,9 @@ out_nokvm:
void kvm_arch_vcpu_destroy(struct kvm_vcpu *vcpu)
{
VCPU_EVENT(vcpu, 3, "%s", "free cpu");
+ if (vcpu->kvm->arch.sca->cpu[vcpu->vcpu_id].sda ==
+ (__u64) vcpu->arch.sie_block)
+ vcpu->kvm->arch.sca->cpu[vcpu->vcpu_id].sda = 0;
free_page((unsigned long)(vcpu->arch.sie_block));
kvm_vcpu_uninit(vcpu);
kfree(vcpu);
@@ -307,8 +310,10 @@ struct kvm_vcpu *kvm_arch_vcpu_create(st
vcpu->arch.sie_block->icpua = id;
BUG_ON(!kvm->arch.sca);
- BUG_ON(kvm->arch.sca->cpu[id].sda);
- kvm->arch.sca->cpu[id].sda = (__u64) vcpu->arch.sie_block;
+ if (!kvm->arch.sca->cpu[id].sda)
+ kvm->arch.sca->cpu[id].sda = (__u64) vcpu->arch.sie_block;
+ else
+ BUG_ON(!kvm->vcpus[id]); /* vcpu does already exist */
vcpu->arch.sie_block->scaoh = (__u32)(((__u64)kvm->arch.sca) >> 32);
vcpu->arch.sie_block->scaol = (__u32)(__u64)kvm->arch.sca;
* [PATCH 5/6] kvm-s390: Sanity check on validity intercept
2009-05-05 14:39 [PATCH 0/6] kvm-s390: collection of kvm-s390 fixes ehrhardt
` (3 preceding siblings ...)
2009-05-05 14:39 ` [PATCH 4/6] kvm-s390: Unlink vcpu on destroy ehrhardt
@ 2009-05-05 14:39 ` ehrhardt
2009-05-05 14:39 ` [PATCH 6/6] kvm-s390: Verify memory in kvm run ehrhardt
5 siblings, 0 replies; 26+ messages in thread
From: ehrhardt @ 2009-05-05 14:39 UTC (permalink / raw)
To: Avi Kivity, kvm; +Cc: ehrhardt, Christian Borntraeger, Carsten Otte
From: Carsten Otte <cotte@de.ibm.com>
This patch adds a sanity check on the content of the guest
prefix register before faulting in the cpu lowcore that it
refers to. Without this fix, a guest with an incorrect prefix
register value might end up in an endless loop in which SIE
complains about a missing lowcore.
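The bound being enforced is that both pages of the prefix area must lie inside guest memory; a standalone sketch of that check, with names simplified from the kernel fields:

```c
#define PAGE_SIZE 4096ul

/* The prefix area is two pages, so a prefix value is only acceptable if
 * prefix + 2*PAGE_SIZE still fits inside guest memory. */
static int prefix_is_valid(unsigned long prefix, unsigned long guest_memsize)
{
	/* the memory-slot code elsewhere guarantees a page-aligned, nonzero
	 * size; guard anyway so the subtraction cannot wrap */
	if (guest_memsize < 2 * PAGE_SIZE)
		return 0;
	return prefix <= guest_memsize - 2 * PAGE_SIZE;
}
```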
Reported-by: Mijo Safradin <mijo@linux.vnet.ibm.com>
Signed-off-by: Carsten Otte <cotte@de.ibm.com>
---
arch/s390/kvm/intercept.c | 28 ++++++++++++++++++----------
1 file changed, 18 insertions(+), 10 deletions(-)
Index: kvm/arch/s390/kvm/intercept.c
===================================================================
--- kvm.orig/arch/s390/kvm/intercept.c
+++ kvm/arch/s390/kvm/intercept.c
@@ -154,17 +154,25 @@ static int handle_stop(struct kvm_vcpu *
static int handle_validity(struct kvm_vcpu *vcpu)
{
int viwhy = vcpu->arch.sie_block->ipb >> 16;
+ int rc;
+
vcpu->stat.exit_validity++;
- if (viwhy == 0x37) {
- fault_in_pages_writeable((char __user *)
- vcpu->kvm->arch.guest_origin +
- vcpu->arch.sie_block->prefix,
- PAGE_SIZE);
- return 0;
- }
- VCPU_EVENT(vcpu, 2, "unhandled validity intercept code %d",
- viwhy);
- return -ENOTSUPP;
+ if ((viwhy == 0x37) && (vcpu->arch.sie_block->prefix
+ <= vcpu->kvm->arch.guest_memsize - 2*PAGE_SIZE)){
+ rc = fault_in_pages_writeable((char __user *)
+ vcpu->kvm->arch.guest_origin +
+ vcpu->arch.sie_block->prefix,
+ 2*PAGE_SIZE);
+ if (rc)
+ /* user will receive sigsegv, exit to user */
+ rc = -ENOTSUPP;
+ } else
+ rc = -ENOTSUPP;
+
+ if (rc)
+ VCPU_EVENT(vcpu, 2, "unhandled validity intercept code %d",
+ viwhy);
+ return rc;
}
static int handle_instruction(struct kvm_vcpu *vcpu)
* [PATCH 6/6] kvm-s390: Verify memory in kvm run
2009-05-05 14:39 [PATCH 0/6] kvm-s390: collection of kvm-s390 fixes ehrhardt
` (4 preceding siblings ...)
2009-05-05 14:39 ` [PATCH 5/6] kvm-s390: Sanity check on validity intercept ehrhardt
@ 2009-05-05 14:39 ` ehrhardt
5 siblings, 0 replies; 26+ messages in thread
From: ehrhardt @ 2009-05-05 14:39 UTC (permalink / raw)
To: Avi Kivity, kvm; +Cc: ehrhardt, Christian Borntraeger, Carsten Otte
From: Carsten Otte <cotte@de.ibm.com>
This check verifies that the guest we're trying to run in KVM_RUN
has some memory assigned to it. Without this check, such a guest
enters an endless exception loop.
Reported-by: Mijo Safradin <mijo@linux.vnet.ibm.com>
Signed-off-by: Carsten Otte <cotte@de.ibm.com>
---
arch/s390/kvm/kvm-s390.c | 6 ++++++
1 file changed, 6 insertions(+)
Index: kvm/arch/s390/kvm/kvm-s390.c
===================================================================
--- kvm.orig/arch/s390/kvm/kvm-s390.c
+++ kvm/arch/s390/kvm/kvm-s390.c
@@ -478,6 +478,12 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_v
vcpu_load(vcpu);
+ /* verify, that memory has been registered */
+ if (!vcpu->kvm->arch.guest_memsize) {
+ vcpu_put(vcpu);
+ return -EINVAL;
+ }
+
if (vcpu->sigset_active)
sigprocmask(SIG_SETMASK, &vcpu->sigset, &sigsaved);
* Re: [PATCH 1/6] kvm-s390: Fix memory slot versus run
2009-05-05 14:39 ` [PATCH 1/6] kvm-s390: Fix memory slot versus run ehrhardt
@ 2009-05-06 12:01 ` Avi Kivity
2009-05-11 13:00 ` Christian Ehrhardt
0 siblings, 1 reply; 26+ messages in thread
From: Avi Kivity @ 2009-05-06 12:01 UTC (permalink / raw)
To: ehrhardt; +Cc: kvm, Christian Borntraeger, Carsten Otte
ehrhardt@linux.vnet.ibm.com wrote:
> From: Carsten Otte <cotte@de.ibm.com>
>
> This patch fixes an incorrectness in the kvm backend for s390.
> In case virtual cpus are being created before the corresponding
> memory slot is being registered, we need to update the sie
> control blocks for the virtual cpus. In order to do that, we
> use the vcpu->mutex to lock out kvm_run and friends. This way
> we can ensure a consistent update of the memory for the entire
> smp configuration.
> @@ -657,6 +657,8 @@ int kvm_arch_set_memory_region(struct kv
> struct kvm_memory_slot old,
> int user_alloc)
> {
> + int i;
> +
> /* A few sanity checks. We can have exactly one memory slot which has
> to start at guest virtual zero and which has to be located at a
> page boundary in userland and which has to end at a page boundary.
> @@ -676,13 +678,27 @@ int kvm_arch_set_memory_region(struct kv
> if (mem->memory_size & (PAGE_SIZE - 1))
> return -EINVAL;
>
> + /* lock all vcpus */
> + for (i = 0; i < KVM_MAX_VCPUS; ++i) {
> + if (kvm->vcpus[i])
> + mutex_lock(&kvm->vcpus[i]->mutex);
> + }
> +
>
Can't that livelock? Nothing requires a vcpu to ever exit, and if the
cpu on which it's running has no other load and no interrupts, it
could remain in guest mode indefinitely, and then the ioctl will hang,
waiting for something to happen.
On x86, we use slots_lock to protect memory slots. When we change the
global memory configuration, we set a bit in vcpu->requests, and send an
IPI to all cpus that are currently in guest mode for our guest. This
forces the cpu back to host mode. On the next entry, vcpu_run notices
vcpu->requests has the bit set and reloads the mmu configuration. Of
course, all this may be overkill for s390.
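The vcpu->requests pattern Avi describes can be modeled with C11 atomics; this is a userspace toy, and the request number and helper names are illustrative, not the kernel API:

```c
#include <stdatomic.h>

#define REQ_RELOAD_MMU 0 /* hypothetical request bit */

struct vcpu_req {
	atomic_ulong requests;
};

/* Setting the bit is atomic; the real code would then IPI the cpu running
 * this vcpu to force an exit from guest mode. */
static void make_request(struct vcpu_req *v, int req)
{
	atomic_fetch_or(&v->requests, 1ul << req);
}

/* On the next guest entry, the run loop checks (and clears) the bit.
 * Returns nonzero if the request was pending. */
static int check_request(struct vcpu_req *v, int req)
{
	unsigned long mask = 1ul << req;

	return (atomic_fetch_and(&v->requests, ~mask) & mask) != 0;
}
```

The test-and-clear in check_request is what makes the mechanism safe against the same request being posted concurrently from several cpus.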
--
Do not meddle in the internals of kernels, for they are subtle and quick to panic.
* Re: [PATCH 2/6] kvm-s390: use hrtimer for clock wakeup from idle
2009-05-05 14:39 ` [PATCH 2/6] kvm-s390: use hrtimer for clock wakeup from idle ehrhardt
@ 2009-05-06 12:10 ` Avi Kivity
2009-05-06 12:36 ` Christian Borntraeger
0 siblings, 1 reply; 26+ messages in thread
From: Avi Kivity @ 2009-05-06 12:10 UTC (permalink / raw)
To: ehrhardt; +Cc: kvm, Christian Borntraeger, Carsten Otte
ehrhardt@linux.vnet.ibm.com wrote:
> From: Christian Borntraeger <borntraeger@de.ibm.com>
>
> This patch reworks the s390 clock comparator wakeup to hrtimer. The clock
> comparator is a per-cpu value that is compared against the TOD clock. If
> ckc <= TOD an external interrupt 1004 is triggered. Since the clock comparator
> and the TOD clock have a much higher resolution than jiffies we should use
> hrtimers to trigger the wakeup. This speeds up guest nanosleep for small
> values.
>
> Since hrtimer callbacks run in hard-irq context, I added a tasklet to do
> the actual work with enabled interrupts.
>
>
> -void kvm_s390_idle_wakeup(unsigned long data)
> +void kvm_s390_tasklet(unsigned long parm)
> {
> - struct kvm_vcpu *vcpu = (struct kvm_vcpu *)data;
> + struct kvm_vcpu *vcpu = (struct kvm_vcpu *) parm;
>
> - spin_lock_bh(&vcpu->arch.local_int.lock);
> + spin_lock(&vcpu->arch.local_int.lock);
> vcpu->arch.local_int.timer_due = 1;
> if (waitqueue_active(&vcpu->arch.local_int.wq))
> wake_up_interruptible(&vcpu->arch.local_int.wq);
> - spin_unlock_bh(&vcpu->arch.local_int.lock);
> + spin_unlock(&vcpu->arch.local_int.lock);
> }
>
Why can't this be done from the timer context (after adjusting the locks)?
--
Do not meddle in the internals of kernels, for they are subtle and quick to panic.
* Re: [PATCH 4/6] kvm-s390: Unlink vcpu on destroy
2009-05-05 14:39 ` [PATCH 4/6] kvm-s390: Unlink vcpu on destroy ehrhardt
@ 2009-05-06 12:11 ` Avi Kivity
2009-05-11 13:00 ` Christian Ehrhardt
0 siblings, 1 reply; 26+ messages in thread
From: Avi Kivity @ 2009-05-06 12:11 UTC (permalink / raw)
To: ehrhardt; +Cc: kvm, Christian Borntraeger, Carsten Otte
ehrhardt@linux.vnet.ibm.com wrote:
> From: Carsten Otte <cotte@de.ibm.com>
>
> This patch makes sure we do unlink a vcpu's sie control block
> from the system control area in kvm_arch_vcpu_destroy. This
> prevents illegal accesses to the sie control block from other
> virtual cpus after free.
>
> Reported-by: Mijo Safradin <mijo@linux.vnet.ibm.com>
> Signed-off-by: Carsten Otte <cotte@de.ibm.com>
> ---
> arch/s390/kvm/kvm-s390.c | 9 +++++++--
> 1 file changed, 7 insertions(+), 2 deletions(-)
>
> Index: kvm/arch/s390/kvm/kvm-s390.c
> ===================================================================
> --- kvm.orig/arch/s390/kvm/kvm-s390.c
> +++ kvm/arch/s390/kvm/kvm-s390.c
> @@ -195,6 +195,9 @@ out_nokvm:
> void kvm_arch_vcpu_destroy(struct kvm_vcpu *vcpu)
> {
> VCPU_EVENT(vcpu, 3, "%s", "free cpu");
> + if (vcpu->kvm->arch.sca->cpu[vcpu->vcpu_id].sda ==
> + (__u64) vcpu->arch.sie_block)
> + vcpu->kvm->arch.sca->cpu[vcpu->vcpu_id].sda = 0;
> free_page((unsigned long)(vcpu->arch.sie_block));
>
>
If this is accessed by hardware on a different cpu, don't you need a
memory barrier here?
--
Do not meddle in the internals of kernels, for they are subtle and quick to panic.
* Re: [PATCH 2/6] kvm-s390: use hrtimer for clock wakeup from idle
2009-05-06 12:10 ` Avi Kivity
@ 2009-05-06 12:36 ` Christian Borntraeger
2009-05-07 10:19 ` Avi Kivity
0 siblings, 1 reply; 26+ messages in thread
From: Christian Borntraeger @ 2009-05-06 12:36 UTC (permalink / raw)
To: Avi Kivity; +Cc: ehrhardt, kvm, Carsten Otte
Am Wednesday 06 May 2009 14:10:07 schrieb Avi Kivity:
> ehrhardt@linux.vnet.ibm.com wrote:
> > From: Christian Borntraeger <borntraeger@de.ibm.com>
> >
> > This patch reworks the s390 clock comparator wakeup to hrtimer. The clock
> > comparator is a per-cpu value that is compared against the TOD clock. If
> > ckc <= TOD an external interrupt 1004 is triggered. Since the clock comparator
> > and the TOD clock have a much higher resolution than jiffies we should use
> > hrtimers to trigger the wakeup. This speeds up guest nanosleep for small
> > values.
> >
> > Since hrtimer callbacks run in hard-irq context, I added a tasklet to do
> > the actual work with enabled interrupts.
> >
> >
> > -void kvm_s390_idle_wakeup(unsigned long data)
> > +void kvm_s390_tasklet(unsigned long parm)
> > {
> > - struct kvm_vcpu *vcpu = (struct kvm_vcpu *)data;
> > + struct kvm_vcpu *vcpu = (struct kvm_vcpu *) parm;
> >
> > - spin_lock_bh(&vcpu->arch.local_int.lock);
> > + spin_lock(&vcpu->arch.local_int.lock);
> > vcpu->arch.local_int.timer_due = 1;
> > if (waitqueue_active(&vcpu->arch.local_int.wq))
> > wake_up_interruptible(&vcpu->arch.local_int.wq);
> > - spin_unlock_bh(&vcpu->arch.local_int.lock);
> > + spin_unlock(&vcpu->arch.local_int.lock);
> > }
> >
>
> Why can't this be done from the timer context (after adjusting the locks)?
It can be done; in fact I did that in my first version. The thing is, we would need to change all local_int.locks to spinlock_irqs, since standard timers run in softirq context while hrtimers run in hardirq context. Enabling and disabling irqs is a relatively expensive operation on s390 (but locks via compare and swap are quite cheap). Since we take this specific lock in lots of places, it sits on some hot paths. The idle wakeup, on the other hand, is not that critical.
Christian
* Re: [PATCH 2/6] kvm-s390: use hrtimer for clock wakeup from idle
2009-05-06 12:36 ` Christian Borntraeger
@ 2009-05-07 10:19 ` Avi Kivity
2009-05-07 10:34 ` Christian Borntraeger
2009-05-20 15:48 ` Hollis Blanchard
0 siblings, 2 replies; 26+ messages in thread
From: Avi Kivity @ 2009-05-07 10:19 UTC (permalink / raw)
To: Christian Borntraeger; +Cc: ehrhardt, kvm, Carsten Otte
Christian Borntraeger wrote:
>> Why can't this be done from the timer context (after adjusting the locks)?
>>
>
> It can be done; in fact I did that in my first version. The thing is, we would need to change all local_int.locks to spinlock_irqs, since standard timers run in softirq context while hrtimers run in hardirq context. Enabling and disabling irqs is a relatively expensive operation on s390 (but locks via compare and swap are quite cheap). Since we take this specific lock in lots of places, it sits on some hot paths. The idle wakeup, on the other hand, is not that critical.
>
Makes sense.
Back when irqsave/irqrestore were expensive on x86 (P4 days, I think it
was ~100 cycles) there was talk of using a software irq disable flag
instead of using the hardware support. So
- local_irq_disable() sets a flag
- interrupt handlers check the flag, if set they schedule the interrupt
handler and return immediately
- local_irq_restore() clears the flag and fires and scheduled handlers
Now these operations are pretty cheap on x86, but maybe that can apply
to s390.
--
error compiling committee.c: too many arguments to function
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH 2/6] kvm-s390: use hrtimer for clock wakeup from idle
2009-05-07 10:19 ` Avi Kivity
@ 2009-05-07 10:34 ` Christian Borntraeger
2009-05-20 15:48 ` Hollis Blanchard
1 sibling, 0 replies; 26+ messages in thread
From: Christian Borntraeger @ 2009-05-07 10:34 UTC (permalink / raw)
To: Avi Kivity; +Cc: ehrhardt, kvm, Carsten Otte
Am Thursday 07 May 2009 12:19:32 schrieb Avi Kivity:
> Makes sense.
>
> Back when irqsave/irqrestore were expensive on x86 (P4 days, I think it
> was ~100 cycles) there was talk of using a software irq disable flag
> instead of using the hardware support. So
>
> - local_irq_disable() sets a flag
>
> - interrupt handlers check the flag, if set they schedule the interrupt
> handler and return immediately
>
> - local_irq_restore() clears the flag and fires and scheduled handlers
>
> Now these operations are pretty cheap on x86, but maybe that can apply
> to s390.
Interesting idea. Nevertheless, I don't think it would improve our situation. The affected instructions (stosm and stnsm) are more expensive than compare and swap, but nowhere near the ~100 cycles of the P4.
Christian
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH 4/6] kvm-s390: Unlink vcpu on destroy
2009-05-06 12:11 ` Avi Kivity
@ 2009-05-11 13:00 ` Christian Ehrhardt
0 siblings, 0 replies; 26+ messages in thread
From: Christian Ehrhardt @ 2009-05-11 13:00 UTC (permalink / raw)
To: Avi Kivity; +Cc: kvm, Christian Borntraeger, Carsten Otte
Avi Kivity wrote:
> ehrhardt@linux.vnet.ibm.com wrote:
>> From: Carsten Otte <cotte@de.ibm.com>
>>
>> This patch makes sure we do unlink a vcpu's sie control block
>> from the system control area in kvm_arch_vcpu_destroy. This
>> prevents illegal accesses to the sie control block from other
>> virtual cpus after free.
>>
>> Reported-by: Mijo Safradin <mijo@linux.vnet.ibm.com>
>> Signed-off-by: Carsten Otte <cotte@de.ibm.com>
>> ---
>> arch/s390/kvm/kvm-s390.c | 9 +++++++--
>> 1 file changed, 7 insertions(+), 2 deletions(-)
>>
>> Index: kvm/arch/s390/kvm/kvm-s390.c
>> ===================================================================
>> --- kvm.orig/arch/s390/kvm/kvm-s390.c
>> +++ kvm/arch/s390/kvm/kvm-s390.c
>> @@ -195,6 +195,9 @@ out_nokvm:
>> void kvm_arch_vcpu_destroy(struct kvm_vcpu *vcpu)
>> {
>> VCPU_EVENT(vcpu, 3, "%s", "free cpu");
>> + if (vcpu->kvm->arch.sca->cpu[vcpu->vcpu_id].sda ==
>> + (__u64) vcpu->arch.sie_block)
>> + vcpu->kvm->arch.sca->cpu[vcpu->vcpu_id].sda = 0;
>> free_page((unsigned long)(vcpu->arch.sie_block));
>>
>>
>
> If this is accessed by hardware on a different cpu, don't you need a
> memory barrier here?
>
>
Right, will be in v2
--
Grüsse / regards,
Christian Ehrhardt
IBM Linux Technology Center, Open Virtualization
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH 1/6] kvm-s390: Fix memory slot versus run
2009-05-06 12:01 ` Avi Kivity
@ 2009-05-11 13:00 ` Christian Ehrhardt
2009-05-11 13:15 ` Avi Kivity
0 siblings, 1 reply; 26+ messages in thread
From: Christian Ehrhardt @ 2009-05-11 13:00 UTC (permalink / raw)
To: Avi Kivity; +Cc: kvm, Christian Borntraeger, Carsten Otte
Avi Kivity wrote:
> ehrhardt@linux.vnet.ibm.com wrote:
>> From: Carsten Otte <cotte@de.ibm.com>
>>
>> This patch fixes an incorrectness in the kvm backend for s390.
>> In case virtual cpus are being created before the corresponding
>> memory slot is being registered, we need to update the sie
>> control blocks for the virtual cpus. In order to do that, we
>> use the vcpu->mutex to lock out kvm_run and friends. This way
>> we can ensure a consistent update of the memory for the entire
>> smp configuration.
>> @@ -657,6 +657,8 @@ int kvm_arch_set_memory_region(struct kv
>> struct kvm_memory_slot old,
>> int user_alloc)
>> {
>> + int i;
>> +
>> /* A few sanity checks. We can have exactly one memory slot
>> which has
>> to start at guest virtual zero and which has to be located at a
>> page boundary in userland and which has to end at a page
>> boundary.
>> @@ -676,13 +678,27 @@ int kvm_arch_set_memory_region(struct kv
>> if (mem->memory_size & (PAGE_SIZE - 1))
>> return -EINVAL;
>>
>> + /* lock all vcpus */
>> + for (i = 0; i < KVM_MAX_VCPUS; ++i) {
>> + if (kvm->vcpus[i])
>> + mutex_lock(&kvm->vcpus[i]->mutex);
>> + }
>> +
>>
>
> Can't that livelock? Nothing requires a vcpu to ever exit, and if the
> cpu on which it's running on has no other load and no interrupts, it
> could remain in guest mode indefinitely, and then the ioctl will hang,
> waiting for something to happen.
>
Yes it could wait indefinitely - good spot.
> On x86, we use slots_lock to protect memory slots. When we change the
> global memory configuration, we set a bit in vcpu->requests, and send
> an IPI to all cpus that are currently in guest mode for our guest.
> This forces the cpu back to host mode. On the next entry, vcpu_run
> notices vcpu->requests has the bit set and reloads the mmu
> configuration. Of course, all this may be overkill for s390.
>
I thought about implementing it with slots_lock, vcpu->request, etc but
it really looks like overkill for s390.
At least today we can assume that we only have one memslot. Therefore a
set_memslot with already created vcpus will still not interfere with
running vcpus (they can't run without a memslot, and since we have only
one they won't run).
Anyway, the code is prepared to "meet" running vcpus, because this might
be different in the future. To prevent the livelock issue I changed the
code to use mutex_trylock, and in case I can't get the lock I explicitly
let the vcpu exit from the guest.
--
Grüsse / regards,
Christian Ehrhardt
IBM Linux Technology Center, Open Virtualization
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH 1/6] kvm-s390: Fix memory slot versus run
2009-05-11 13:00 ` Christian Ehrhardt
@ 2009-05-11 13:15 ` Avi Kivity
2009-05-11 13:46 ` Christian Ehrhardt
0 siblings, 1 reply; 26+ messages in thread
From: Avi Kivity @ 2009-05-11 13:15 UTC (permalink / raw)
To: Christian Ehrhardt; +Cc: kvm, Christian Borntraeger, Carsten Otte
Christian Ehrhardt wrote:
>
>> On x86, we use slots_lock to protect memory slots. When we change
>> the global memory configuration, we set a bit in vcpu->requests, and
>> send an IPI to all cpus that are currently in guest mode for our
>> guest. This forces the cpu back to host mode. On the next entry,
>> vcpu_run notices vcpu->requests has the bit set and reloads the mmu
>> configuration. Of course, all this may be overkill for s390.
>>
> I thought about implementing it with slots_lock, vcpu->request, etc
> but it really looks like overkill for s390.
We could make (some of) it common code, so it won't look so bad.
There's value in having all kvm ports do things similarly; though of
course we shouldn't force the solution when it isn't really needed.
vcpu->requests is useful whenever we modify global VM state that needs
to be seen by all vcpus in host mode; see kvm_reload_remote_mmus().
> At least today we can assume that we only have one memslot. Therefore
> a set_memslot with already created vcpus will still not interfere
> with running vcpus (they can't run without a memslot, and since we
> have only one they won't run).
> Anyway, the code is prepared to "meet" running vcpus, because this
> might be different in the future. To prevent the livelock issue I
> changed the code to use mutex_trylock, and in case I can't get the
> lock I explicitly let the vcpu exit from the guest.
Why not do it unconditionally?
--
error compiling committee.c: too many arguments to function
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH 1/6] kvm-s390: Fix memory slot versus run
2009-05-11 13:15 ` Avi Kivity
@ 2009-05-11 13:46 ` Christian Ehrhardt
2009-05-11 14:02 ` Avi Kivity
0 siblings, 1 reply; 26+ messages in thread
From: Christian Ehrhardt @ 2009-05-11 13:46 UTC (permalink / raw)
To: Avi Kivity; +Cc: kvm, Christian Borntraeger, Carsten Otte
Avi Kivity wrote:
> Christian Ehrhardt wrote:
>>
>>> On x86, we use slots_lock to protect memory slots. When we change
>>> the global memory configuration, we set a bit in vcpu->requests, and
>>> send an IPI to all cpus that are currently in guest mode for our
>>> guest. This forces the cpu back to host mode. On the next entry,
>>> vcpu_run notices vcpu->requests has the bit set and reloads the mmu
>>> configuration. Of course, all this may be overkill for s390.
>>>
>> I thought about implementing it with slots_lock, vcpu->request, etc
>> but it really looks like overkill for s390.
>
> We could make (some of) it common code, so it won't look so bad.
> There's value in having all kvm ports do things similarly; though of
> course we shouldn't force the solution when it isn't really needed.
>
> vcpu->requests is useful whenever we modify global VM state that needs
> to be seen by all vcpus in host mode; see kvm_reload_remote_mmus().
Yeah, I read that code after your first hint in this thread, and I agree
that merging some of this into common code might be good.
But in my opinion not now, for this bugfix patch (the intention is just
to prevent a user being able to crash the host via vcpu create, set mem
and vcpu run, in that order).
It might be a good point to further streamline this once we use the same
userspace code, but I think it doesn't make sense yet.
>
>> At least today we can assume that we only have one memslot.
>> Therefore a set_memslot with already created vcpus will still not
>> interfere with running vcpus (they can't run without a memslot, and
>> since we have only one they won't run).
>> Anyway, the code is prepared to "meet" running vcpus, because this
>> might be different in the future. To prevent the livelock issue I
>> changed the code to use mutex_trylock, and in case I can't get the
>> lock I explicitly let the vcpu exit from the guest.
>
> Why not do it unconditionally?
>
Hmm, I might have written that misleadingly - eventually it's a loop
until it gets the lock:
while !trylock
    kick vcpu out of guest
    schedule
There is no reason to kick a vcpu out of the guest when I got its lock
cleanly, as far as I see.
Especially as I expect the vcpus not to be running in the common case,
as I explained above (can't run without a memslot + we only have one =>
no vcpu will run).
--
Grüsse / regards,
Christian Ehrhardt
IBM Linux Technology Center, Open Virtualization
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH 1/6] kvm-s390: Fix memory slot versus run
2009-05-11 13:46 ` Christian Ehrhardt
@ 2009-05-11 14:02 ` Avi Kivity
2009-05-11 14:42 ` Christian Ehrhardt
0 siblings, 1 reply; 26+ messages in thread
From: Avi Kivity @ 2009-05-11 14:02 UTC (permalink / raw)
To: Christian Ehrhardt; +Cc: kvm, Christian Borntraeger, Carsten Otte
Christian Ehrhardt wrote:
>>> I thought about implementing it with slots_lock, vcpu->request, etc
>>> but it really looks like overkill for s390.
>>
>> We could make (some of) it common code, so it won't look so bad.
>> There's value in having all kvm ports do things similarly; though of
>> course we shouldn't force the solution when it isn't really needed.
>>
>> vcpu->requests is useful whenever we modify global VM state that
>> needs to be seen by all vcpus in host mode; see
>> kvm_reload_remote_mmus().
> Yeah, I read that code after your first hint in this thread, and I
> agree that merging some of this into common code might be good.
> But in my opinion not now, for this bugfix patch (the intention is
> just to prevent a user being able to crash the host via vcpu create,
> set mem and vcpu run, in that order).
> It might be a good point to further streamline this once we use the
> same userspace code, but I think it doesn't make sense yet.
Sure, don't mix bugfixes with infrastructure changes, when possible.
>>> At least today we can assume that we only have one memslot.
>>> Therefore a set_memslot with already created vcpus will still not
>>> interfere with running vcpus (they can't run without a memslot, and
>>> since we have only one they won't run).
>>> Anyway, the code is prepared to "meet" running vcpus, because this
>>> might be different in the future. To prevent the livelock issue I
>>> changed the code to use mutex_trylock, and in case I can't get the
>>> lock I explicitly let the vcpu exit from the guest.
>>
>> Why not do it unconditionally?
>>
> Hmm, I might have written that misleadingly - eventually it's a loop
> until it gets the lock:
> while !trylock
>     kick vcpu out of guest
>     schedule
>
> There is no reason to kick a vcpu out of the guest when I got its
> lock cleanly, as far as I see.
> Especially as I expect the vcpus not to be running in the common
> case, as I explained above (can't run without a memslot + we only
> have one => no vcpu will run).
Still livelockable, unless you stop the vcpu from entering the guest
immediately.
That's why vcpu->requests is so powerful. Not only you kick the vcpu
out of guest mode, you force it to synchronize when it tries to enter again.
--
error compiling committee.c: too many arguments to function
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH 1/6] kvm-s390: Fix memory slot versus run
2009-05-11 14:02 ` Avi Kivity
@ 2009-05-11 14:42 ` Christian Ehrhardt
2009-05-11 15:01 ` Avi Kivity
0 siblings, 1 reply; 26+ messages in thread
From: Christian Ehrhardt @ 2009-05-11 14:42 UTC (permalink / raw)
To: Avi Kivity; +Cc: kvm, Christian Borntraeger, Carsten Otte
Avi Kivity wrote:
> Christian Ehrhardt wrote:
>>>> I thought about implementing it with slots_lock, vcpu->request, etc
>>>> but it really looks like overkill for s390.
>>>
>>> We could make (some of) it common code, so it won't look so bad.
>>> There's value in having all kvm ports do things similarly; though of
>>> course we shouldn't force the solution when it isn't really needed.
>>>
>>> vcpu->requests is useful whenever we modify global VM state that
>>> needs to be seen by all vcpus in host mode; see
>>> kvm_reload_remote_mmus().
>> Yeah, I read that code after your first hint in this thread, and I
>> agree that merging some of this into common code might be good.
>> But in my opinion not now, for this bugfix patch (the intention is
>> just to prevent a user being able to crash the host via vcpu
>> create, set mem and vcpu run, in that order).
>> It might be a good point to further streamline this once we use the
>> same userspace code, but I think it doesn't make sense yet.
>
> Sure, don't mix bugfixes with infrastructure changes, when possible.
>
>>>> At least today we can assume that we only have one memslot.
>>>> Therefore a set_memslot with already created vcpus will still not
>>>> interfere with running vcpus (they can't run without a memslot,
>>>> and since we have only one they won't run).
>>>> Anyway, the code is prepared to "meet" running vcpus, because this
>>>> might be different in the future. To prevent the livelock issue I
>>>> changed the code to use mutex_trylock, and in case I can't get the
>>>> lock I explicitly let the vcpu exit from the guest.
>>>
>>> Why not do it unconditionally?
>>>
>> Hmm, I might have written that misleadingly - eventually it's a loop
>> until it gets the lock:
>> while !trylock
>>     kick vcpu out of guest
>>     schedule
>>
>> There is no reason to kick a vcpu out of the guest when I got its
>> lock cleanly, as far as I see.
>> Especially as I expect the vcpus not to be running in the common
>> case, as I explained above (can't run without a memslot + we only
>> have one => no vcpu will run).
>
> Still livelockable, unless you stop the vcpu from entering the guest
> immediately.
>
> That's why vcpu->requests is so powerful. Not only you kick the vcpu
> out of guest mode, you force it to synchronize when it tries to enter
> again.
>
The bad thing about vcpu->requests in that case is that I don't want
their async behaviour here; I want the memory slot updated in all vcpus
when the ioctl returns.
Looking at a vcpu->request based solution I don't find the
synchronization I need. The changes to
vcpu->arch.guest_origin/guest_memsize and the changes to
vcpu->arch.sie_block->gmsor/gmslm need to happen without the vcpu
running.
Therefore I want the vcpu lock _before_ I update both structs, otherwise
it could be racy (at least on s390).
On the other hand while it is very++ unlikely to happen you are still
right that it could theoretically livelock there.
I might use vcpu->requests to not enter vcpu_run again after such a
"kick" out of guest state.
It would be checked on vcpu_run enter and could then drop the lock, call
schedule, relock and check the flag again until it is cleared.
I'm not yet happy with this solution as I expect it to end up in
something like a reference count which then would not fit into the
existing vcpu->request flags :-/
As I mentioned above the changes to vcpu->arch and vcpu->arch->sie_block
have to be exclusive with the vcpu not running.
If I could find some "transport" for the information I have at
set_memory_slot time (origin/size) until the next vcpu_run entry, I
could do both changes there synchronously.
In that case I could really use your suggested solution with
vcpu->request, kick out unconditionally and set values on next (re-)entry.
Hmmm .. Maybe I can find all I need on reentry in vcpu->kvm->memslots[].
If I can change it that way it will definitely require some testing.
... to be continued :-)
--
Grüsse / regards,
Christian Ehrhardt
IBM Linux Technology Center, Open Virtualization
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH 1/6] kvm-s390: Fix memory slot versus run
2009-05-11 14:42 ` Christian Ehrhardt
@ 2009-05-11 15:01 ` Avi Kivity
2009-05-12 9:15 ` Christian Ehrhardt
0 siblings, 1 reply; 26+ messages in thread
From: Avi Kivity @ 2009-05-11 15:01 UTC (permalink / raw)
To: Christian Ehrhardt; +Cc: kvm, Christian Borntraeger, Carsten Otte
Christian Ehrhardt wrote:
>
> The bad thing on vcpu->request in that case is that I don't want the
> async behaviour of vcpu->requests in that case, I want the memory slot
> updated in all vcpu's when the ioctl is returning.
You mean, the hardware can access the vcpu control block even when the
vcpu is not running?
> Looking at vcpu->request based solution I don't find the
> synchronization I need. The changes to
> vcpu->arch.guest_origin/guest_memsize and the changes to
> vcpu->arch.sie_block->gmsor/gmslm need to happen without the vcpu
> running.
> Therefore I want the vcpu lock _before_ I update both structs,
> otherwise it could be racy (at least on s390).
>
> On the other hand while it is very++ unlikely to happen you are still
> right that it could theoretically livelock there.
> I might use vcpu->requests to not enter vcpu_run again after such a
> "kick" out of guest state.
> It would be checked on vcpu_run enter and could then drop the lock,
> call schedule, relock and check the flag again until it is cleared.
> I'm not yet happy with this solution as I expect it to end up in
> something like a reference count which then would not fit into the
> existing vcpu->request flags :-/
>
> As I mentioned above the changes to vcpu->arch and
> vcpu->arch->sie_block have to be exclusive with the vcpu not running.
> If I could find some "transport" for the information I have at
> set_memory_slot time (origin/size) until the next vcpu_run entry, I
> could do both changes there synchronously.
The information can be found in kvm->memslots.
> In that case I could really use your suggested solution with
> vcpu->request, kick out unconditionally and set values on next
> (re-)entry.
>
> Hmmm .. Maybe I can find all I need on reentry in vcpu->kvm->memslots[].
Err...
> If I can change it that way it will definitely require some testing.
> ... to be continued :-)
I definitely recommend it -- would bring s390 more in line with the
other ports (I know it's a backward step for you :)
Note our plan is to change slots_lock to RCU, so it's even better if you
use memslots.
--
error compiling committee.c: too many arguments to function
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH 1/6] kvm-s390: Fix memory slot versus run
2009-05-11 15:01 ` Avi Kivity
@ 2009-05-12 9:15 ` Christian Ehrhardt
2009-05-12 11:35 ` Avi Kivity
0 siblings, 1 reply; 26+ messages in thread
From: Christian Ehrhardt @ 2009-05-12 9:15 UTC (permalink / raw)
To: Avi Kivity; +Cc: kvm, Christian Borntraeger, Carsten Otte
Avi Kivity wrote:
> Christian Ehrhardt wrote:
>>
>> The bad thing on vcpu->request in that case is that I don't want the
>> async behaviour of vcpu->requests in that case, I want the memory
>> slot updated in all vcpu's when the ioctl is returning.
>
> You mean, the hardware can access the vcpu control block even when the
> vcpu is not running?
No, hardware only uses it with a running vcpu, but I realised my own
fault while changing the code to vcpu->request style.
For s390 I need to update the kvm->arch and *all*
vcpu->arch->sie_block... data synchronously.
That makes the "per vcpu resync on next entry" approach not feasible.
On the other hand I realized at the same moment that the livelock should
be no issue for us, because as I mentioned:
a) only one memslot
b) a vcpu can't run without a memslot
So I don't even need to kick out vcpus; they just should not be running.
Until we ever support multiple slots, or updates of the existing single
slot, this should be OK, and so should the bugfix patch.
To avoid a theoretical deadlock in case other code changes (badly), it
should be fair to acquire the lock with mutex_trylock and return
-EINVAL if we did not get all locks.
[...]
>> If I can change it that way it will definitely require some testing.
>> ... to be continued :-)
>
> I definitely recommend it -- would bring s390 more in line with the
> other ports (I know it's a backward step for you :)
>
> Note our plan is to change slots_lock to RCU, so it's even better if
> you use memslots.
As long as we have the special conditions mentioned above I think its ok
to implement it the way I do it now.
I agree that if we ever support multiple memslots we should strive for a
common solution.
p.s. the second patch in the series ensures that a vcpu really never
runs without a memslot being set as this was another bug we had.
--
Grüsse / regards,
Christian Ehrhardt
IBM Linux Technology Center, Open Virtualization
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH 1/6] kvm-s390: Fix memory slot versus run
2009-05-12 9:15 ` Christian Ehrhardt
@ 2009-05-12 11:35 ` Avi Kivity
2009-05-12 13:33 ` Christian Ehrhardt
0 siblings, 1 reply; 26+ messages in thread
From: Avi Kivity @ 2009-05-12 11:35 UTC (permalink / raw)
To: Christian Ehrhardt; +Cc: kvm, Christian Borntraeger, Carsten Otte
Christian Ehrhardt wrote:
> Avi Kivity wrote:
>> Christian Ehrhardt wrote:
>>>
>>> The bad thing on vcpu->request in that case is that I don't want the
>>> async behaviour of vcpu->requests in that case, I want the memory
>>> slot updated in all vcpu's when the ioctl is returning.
>>
>> You mean, the hardware can access the vcpu control block even when
>> the vcpu is not running?
> No, hardware only uses it with a running vcpu, but I realised my own
> fault while changing the code to vcpu->request style.
> For s390 I need to update the KVM->arch and *all*
> vcpu->arch->sie_block... data synchronously.
Out of interest, can you explain why?
> That makes the "per vcpu resync on next entry" approach not feasible.
>
> On the other hand I realized at the same moment that the livelock
> should be no issue for us, because as I mentioned:
> a) only one memslot
> b) a vcpu can't run without a memslot
> So I don't even need to kick out vcpus; they just should not be running.
> Until we ever support multiple slots, or updates of the existing
> single slot, this should be OK, and so should the bugfix patch.
> To avoid a theoretical deadlock in case other code changes (badly),
> it should be fair to acquire the lock with mutex_trylock and return
> -EINVAL if we did not get all locks.
OK.
--
error compiling committee.c: too many arguments to function
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH 1/6] kvm-s390: Fix memory slot versus run
2009-05-12 11:35 ` Avi Kivity
@ 2009-05-12 13:33 ` Christian Ehrhardt
2009-05-17 22:31 ` Avi Kivity
0 siblings, 1 reply; 26+ messages in thread
From: Christian Ehrhardt @ 2009-05-12 13:33 UTC (permalink / raw)
To: Avi Kivity; +Cc: kvm, Christian Borntraeger, Carsten Otte
Avi Kivity wrote:
> Christian Ehrhardt wrote:
>> Avi Kivity wrote:
>>> Christian Ehrhardt wrote:
>>>>
>>>> The bad thing on vcpu->request in that case is that I don't want
>>>> the async behaviour of vcpu->requests in that case, I want the
>>>> memory slot updated in all vcpu's when the ioctl is returning.
>>>
>>> You mean, the hardware can access the vcpu control block even when
>>> the vcpu is not running?
>> No, hardware only uses it with a running vcpu, but I realised my own
>> fault while changing the code to vcpu->request style.
>> For s390 I need to update the KVM->arch and *all*
>> vcpu->arch->sie_block... data synchronously.
>
> Out of interest, can you explain why?
Sure I'll try to give an example.
a) The whole guest has "one" memory slot representing all its memory.
Therefore some important values like guest_origin and guest_memsize (one
slot so it's just addr+size) are kept at VM level in kvm->arch.
b) We fortunately have cool hardware support for "nearly everything"(tm)
:-) In this case for example we set in vcpu->arch.sie_block the values
for origin and size translated into a "limit" to get memory management
virtualization support.
c) we have other code e.g. all our copy_from/to_guest stuff that uses
the kvm->arch values
If we allowed e.g. updates of a memslot (or, as the patch intends,
hardened the set_memory_region code against inconsiderate code changes
in other sections), it might happen that we set the kvm->arch
information but not the vcpu->arch->sie_block data until the next
reentry. Now, concurrently, the running vcpu could cause some kind of
fault that involves a copy_from/to_guest. That way we could end up with
potentially invalid handling of that fault (fault handling and the
running guest would use different userspace addresses until they are
synced on the next vcpu reentry) - it's theoretical, I know, but it
might cause some issues that would be hard to find.
On the other hand, for the long term I wanted to note that all our
copy_from/to_guest functions are per vcpu, so when we some day implement
updateable memslots, multiple memslots or even just fill "free time"(tm)
and streamline our code we could redesign that origin/size storage. This
could be done multiple ways, either just store it per vcpu or with a
lock for the kvm->arch level variables - both ways and maybe more could
then use the vcpu->request based approach, but unfortunately it's
neither part of that patch nor of the current effort to do that.
The really good thing is, because of our discussion about that I now
have a really detailed idea how I can improve that code aside from this
bugfix patch (lets hope not too far in the future).
>> That makes the "per vcpu resync on next entry" approach not feasible.
>>
>> On the other hand I realized at the same moment that the livelock
>> should be no issue for us, because as I mentioned:
>> a) only one memslot
>> b) a vcpu can't run without a memslot
>> So I don't even need to kick out vcpus; they just should not be
>> running.
>> Until we ever support multiple slots, or updates of the existing
>> single slot, this should be OK, and so should the bugfix patch.
>> To avoid a theoretical deadlock in case other code changes (badly)
>> it should be fair to acquire the lock with mutex_trylock and
>> return -EINVAL if we did not get all locks.
>
> OK.
>
>
--
Grüsse / regards,
Christian Ehrhardt
IBM Linux Technology Center, Open Virtualization
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH 1/6] kvm-s390: Fix memory slot versus run
2009-05-12 13:33 ` Christian Ehrhardt
@ 2009-05-17 22:31 ` Avi Kivity
2009-05-20 12:05 ` Christian Ehrhardt
0 siblings, 1 reply; 26+ messages in thread
From: Avi Kivity @ 2009-05-17 22:31 UTC (permalink / raw)
To: Christian Ehrhardt; +Cc: kvm, Christian Borntraeger, Carsten Otte
Christian Ehrhardt wrote:
>>>>> The bad thing on vcpu->request in that case is that I don't want
>>>>> the async behaviour of vcpu->requests in that case, I want the
>>>>> memory slot updated in all vcpu's when the ioctl is returning.
>>>>
>>>> You mean, the hardware can access the vcpu control block even when
>>>> the vcpu is not running?
>>> No, hardware only uses it with a running vcpu, but I realised my own
>>> fault while changing the code to vcpu->request style.
>>> For s390 I need to update the KVM->arch and *all*
>>> vcpu->arch->sie_block... data synchronously.
>>
>> Out of interest, can you explain why?
> Sure I'll try to give an example.
>
> a) The whole guest has "one" memory slot representing all its memory.
> Therefore some important values like guest_origin and guest_memsize
> (one slot so it's just addr+size) are kept at VM level in kvm->arch.
It should really be kept in kvm->memslots[0]->{userspace_addr, npages}.
This is common to all architectures.
> b) We fortunately have cool hardware support for "nearly
> everything"(tm) :-) In this case for example we set in
> vcpu->arch.sie_block the values for origin and size translated into a
> "limit" to get memory management virtualization support.
x86 has something analogous; shadow or nested page tables are also
per-vcpu and accessed by the hardware while the guest is running.
> c) we have other code e.g. all our copy_from/to_guest stuff that uses
> the kvm->arch values
You want to drop these and use kvm_read_guest() / kvm_write_guest().
> If we would allow e.g. updates of a memslot (or as the patch supposes
> to harden the set_memory_region code against inconsiderate code
> changes in other sections) it might happen that we set the kvm->arch
> information but the vcpu->arch->sie_block stuff not until next
> reentry. Now concurrently the running vcpu could cause some kind of
> fault that involves a copy_from/to_guest. That way we could end up
> with potentially invalid handling of that fault (fault handling and
> running guest would use different userspace addresses until it is
> synced on next vcpu reentry) - it's theoretical I know, but it might
> cause some issues that would be hard to find.
I agree it should be protected. Here's how we do it in arch-independent
code:
- code that looks up memory slots takes slots_lock for read (future: rcu)
- code that changes memory slots takes slots_lock for write, and
requests an mmu reload (includes an IPI to force the vcpu out of guest mode)
Now, once we begin changing a slot no one can touch memory or reenter
the guest until we are done.
> On the other hand for the long term I wanted to note that all our
> copy_from/to_guest functions are per vcpu, so when we some day
> implement updateable memslots, multiple memslots or even just fill
> "free time"(tm) and streamline our code we could redesign that
> origin/size storage. This could be done multiple ways, either just
> store it per vcpu or with a lock for the kvm->arch level variables -
> both ways and maybe more could then use the vcpu->request based
> approach, but unfortunately it's neither part of that patch nor of the
> current effort to do that.
I think we should keep that in generic code. All of that applies to x86
(and ia64 and ppc), if I understand you correctly, and if I understand
the other archs correctly (don't place a large bet).
--
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.
* Re: [PATCH 1/6] kvm-s390: Fix memory slot versus run
2009-05-17 22:31 ` Avi Kivity
@ 2009-05-20 12:05 ` Christian Ehrhardt
0 siblings, 0 replies; 26+ messages in thread
From: Christian Ehrhardt @ 2009-05-20 12:05 UTC (permalink / raw)
To: Avi Kivity; +Cc: kvm, Christian Borntraeger, Carsten Otte
Avi Kivity wrote:
> Christian Ehrhardt wrote:
>>>>>> The bad thing on vcpu->request in that case is that I don't want
>>>>>> the async behaviour of vcpu->requests in that case, I want the
>>>>>> memory slot updated in all vcpu's when the ioctl is returning.
>>>>>
>>>>> You mean, the hardware can access the vcpu control block even when
>>>>> the vcpu is not running?
>>>> No, hardware only uses it with a running vcpu, but I realised my
>>>> own fault while changing the code to vcpu->request style.
>>>> For s390 I need to update the KVM->arch and *all*
>>>> vcpu->arch->sie_block... data synchronously.
>>>
>>> Out of interest, can you explain why?
>> Sure I'll try to give an example.
>>
>> a) The whole guest has "one" memory slot representing all its
>> memory. Therefore some important values like guest_origin and
>> guest_memsize (one slot so it's just addr+size) are kept at VM level
>> in kvm->arch.
>
> It should really be kept in kvm->memslots[0]->{userspace_addr,
> npages}. This is common to all architectures.
As I said I wanted to do that, but due to the need to relocate my work
environment to a new laptop I was a bit stalled the last few days.
A patch series implementing it in a streamlined (storing in memslots
only, slots_lock, vcpu->request, ...) way will soon appear on the list.
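As a sketch of what "storing in memslots only" amounts to: with a single slot, guest origin and size fall straight out of memslots[0]'s userspace_addr and npages, so no separate kvm->arch copies are needed. The struct below is a simplified stand-in for the kernel's struct kvm_memory_slot, which really does carry userspace_addr and npages fields.

```c
#define PAGE_SHIFT 12   /* 4K pages, as on s390 */

/* Simplified stand-in for struct kvm_memory_slot. */
struct kvm_memory_slot {
    unsigned long userspace_addr;   /* host address backing the slot */
    unsigned long npages;           /* slot length in pages */
};

/* guest_origin/guest_memsize derived from the slot, not cached in arch
 * state -- one source of truth. */
static unsigned long guest_origin(const struct kvm_memory_slot *s)
{
    return s->userspace_addr;
}

static unsigned long guest_memsize(const struct kvm_memory_slot *s)
{
    return s->npages << PAGE_SHIFT;
}
```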
> [...]
>> c) we have other code e.g. all our copy_from/to_guest stuff that uses
>> the kvm->arch values
>
> You want to drop these and use kvm_read_guest() / kvm_write_guest().
I put it on my "low-prio-but-very-useful todo list" to take a look at
that too as one of the next opportunities to streamline our code.
[...]
--
Grüsse / regards, Christian Ehrhardt
IBM Linux Technology Center, Open Virtualization
* Re: [PATCH 2/6] kvm-s390: use hrtimer for clock wakeup from idle
2009-05-07 10:19 ` Avi Kivity
2009-05-07 10:34 ` Christian Borntraeger
@ 2009-05-20 15:48 ` Hollis Blanchard
1 sibling, 0 replies; 26+ messages in thread
From: Hollis Blanchard @ 2009-05-20 15:48 UTC (permalink / raw)
To: Avi Kivity; +Cc: Christian Borntraeger, ehrhardt, kvm, Carsten Otte
On Thu, 2009-05-07 at 13:19 +0300, Avi Kivity wrote:
> Christian Borntraeger wrote:
> >> Why can't this be done from the timer context (after adjusting the locks)?
> >>
> >
> > It can be done, in fact I did that in my first version. The thing is, we would need to change all local_int.locks to spinlock_irqs, since standard timers are softirqs while hrtimers are hardirqs. Enabling and disabling irqs is a relatively expensive operation on s390 (but locks via compare and swap are quite cheap). Since we take this specific lock in lots of places, it sits on some hot paths. The idle wakeup on the other hand is not that critical.
> >
>
> Makes sense.
>
> Back when irqsave/irqrestore were expensive on x86 (P4 days, I think it
> was ~100 cycles) there was talk of using a software irq disable flag
> instead of using the hardware support. So
>
> - local_irq_disable() sets a flag
>
> - interrupt handlers check the flag, if set they schedule the interrupt
> handler and return immediately
>
> - local_irq_restore() clears the flag and fires any scheduled handlers
>
> Now these operations are pretty cheap on x86, but maybe that can apply
> to s390.
I don't know how the cycle counts compare, but FWIW normal ppc64 Linux
does almost exactly this (and calls it "lazy irq disabling"). The only
difference is a step 2.5 that really disables interrupts, because some
are level-triggered.
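A minimal single-threaded toy model of that lazy scheme (all names here are illustrative, not the kernel's):

```c
/* Software irq-disable flag plus a pending mask, as a toy model. */
static int irqs_soft_disabled;
static unsigned int pending_mask;
static unsigned int handled_mask;

static void run_handler(unsigned int irq) { handled_mask |= 1u << irq; }

/* Step 1: local_irq_disable() just sets a flag -- no hardware access. */
static void soft_irq_disable(void) { irqs_soft_disabled = 1; }

/* Step 2: interrupt entry checks the flag; if set, record the irq and
 * return immediately.  (Step 2.5 on ppc64: really disable hw irqs here,
 * because level-triggered sources would otherwise keep firing.) */
static void irq_entry(unsigned int irq)
{
    if (irqs_soft_disabled) {
        pending_mask |= 1u << irq;   /* replay later */
        return;
    }
    run_handler(irq);
}

/* Step 3: local_irq_restore() clears the flag and replays anything
 * that arrived while "disabled". */
static void soft_irq_enable(void)
{
    irqs_soft_disabled = 0;
    while (pending_mask) {
        unsigned int irq = __builtin_ctz(pending_mask);
        pending_mask &= pending_mask - 1;
        run_handler(irq);
    }
}
```

The win is that the common disable/enable pair touches only a flag in memory; the expensive hardware masking happens only when an interrupt actually arrives inside a "disabled" section.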
--
Hollis Blanchard
IBM Linux Technology Center
end of thread, other threads:[~2009-05-20 15:48 UTC | newest]
Thread overview: 26+ messages
-- links below jump to the message on this page --
2009-05-05 14:39 [PATCH 0/6] kvm-s390: collection of kvm-s390 fixes ehrhardt
2009-05-05 14:39 ` [PATCH 1/6] kvm-s390: Fix memory slot versus run ehrhardt
2009-05-06 12:01 ` Avi Kivity
2009-05-11 13:00 ` Christian Ehrhardt
2009-05-11 13:15 ` Avi Kivity
2009-05-11 13:46 ` Christian Ehrhardt
2009-05-11 14:02 ` Avi Kivity
2009-05-11 14:42 ` Christian Ehrhardt
2009-05-11 15:01 ` Avi Kivity
2009-05-12 9:15 ` Christian Ehrhardt
2009-05-12 11:35 ` Avi Kivity
2009-05-12 13:33 ` Christian Ehrhardt
2009-05-17 22:31 ` Avi Kivity
2009-05-20 12:05 ` Christian Ehrhardt
2009-05-05 14:39 ` [PATCH 2/6] kvm-s390: use hrtimer for clock wakeup from idle ehrhardt
2009-05-06 12:10 ` Avi Kivity
2009-05-06 12:36 ` Christian Borntraeger
2009-05-07 10:19 ` Avi Kivity
2009-05-07 10:34 ` Christian Borntraeger
2009-05-20 15:48 ` Hollis Blanchard
2009-05-05 14:39 ` [PATCH 3/6] kvm-s390: optimize float int lock: spin_lock_bh --> spin_lock ehrhardt
2009-05-05 14:39 ` [PATCH 4/6] kvm-s390: Unlink vcpu on destroy ehrhardt
2009-05-06 12:11 ` Avi Kivity
2009-05-11 13:00 ` Christian Ehrhardt
2009-05-05 14:39 ` [PATCH 5/6] kvm-s390: Sanity check on validity intercept ehrhardt
2009-05-05 14:39 ` [PATCH 6/6] kvm-s390: Verify memory in kvm run ehrhardt