kvm.vger.kernel.org archive mirror
* [PATCH 0/3] kvm-s390: revised version of kvm-s390 guest memory handling
@ 2009-05-20 13:34 ehrhardt
  2009-05-20 13:34 ` [PATCH 1/3] kvm-s390: infrastructure to kick vcpus out of guest state ehrhardt
                   ` (2 more replies)
  0 siblings, 3 replies; 18+ messages in thread
From: ehrhardt @ 2009-05-20 13:34 UTC (permalink / raw)
  To: kvm, avi; +Cc: ehrhardt, borntraeger, cotte, heiko.carstens, schwidefsky

From: Christian Ehrhardt <ehrhardt@de.ibm.com>

This patch series results from our discussions about handling memslots and vcpu
mmu reloads. It streamlines kvm-s390 a bit by using slots_lock, vcpu-request
(KVM_REQ_MMU_RELOAD) and a kick mechanism to ensure vcpus come out of guest
context to catch the update.

I tested the reworked code for a while with multiple SMP guests and some extra
code that periodically injects kicks and/or mmu reload requests, but I'm happy
to receive any additional review feedback.

Patches included:
Subject: [PATCH 1/3] kvm-s390: infrastructure to kick vcpus out of guest state
Subject: [PATCH 2/3] kvm-s390: fix signal handling
Subject: [PATCH 3/3] kvm-s390: streamline memslot handling

Overall-Diffstat:
 arch/s390/include/asm/kvm_host.h |    9 ++---
 arch/s390/kvm/gaccess.h          |   23 +++++++-------
 arch/s390/kvm/intercept.c        |   18 +++++++----
 arch/s390/kvm/kvm-s390.c         |   57 ++++++++++++++-----------------------
 arch/s390/kvm/kvm-s390.h         |   32 ++++++++++++++++++++
 arch/s390/kvm/sigp.c             |   60 +++++++++++++++++++++++----------------
 6 files changed, 117 insertions(+), 82 deletions(-)

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH 1/3] kvm-s390: infrastructure to kick vcpus out of guest state
  2009-05-20 13:34 [PATCH 0/3] kvm-s390: revised version of kvm-s390 guest memory handling ehrhardt
@ 2009-05-20 13:34 ` ehrhardt
  2009-05-20 13:34 ` [PATCH 2/3] kvm-s390: fix signal handling ehrhardt
  2009-05-20 13:34 ` [PATCH 3/3] kvm-s390: streamline memslot handling ehrhardt
  2 siblings, 0 replies; 18+ messages in thread
From: ehrhardt @ 2009-05-20 13:34 UTC (permalink / raw)
  To: kvm, avi; +Cc: ehrhardt, borntraeger, cotte, heiko.carstens, schwidefsky

From: Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com>

To ensure vcpus come out of guest context in certain cases, this patch adds an
s390-specific way to kick them out of guest context. Currently it kicks them
out to rerun the vcpu_run path in the s390 code, but the mechanism itself is
expandable; with a new flag we could also add e.g. kicks to userspace.

Signed-off-by: Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com>
---

[diffstat]
 include/asm/kvm_host.h |    5 ++--
 kvm/intercept.c        |   12 ++++++++---
 kvm/kvm-s390.c         |    5 ++++
 kvm/kvm-s390.h         |    3 ++
 kvm/sigp.c             |   53 +++++++++++++++++++++++++++++--------------------
 5 files changed, 52 insertions(+), 26 deletions(-)

[diff]
Index: kvm/arch/s390/kvm/intercept.c
===================================================================
--- kvm.orig/arch/s390/kvm/intercept.c
+++ kvm/arch/s390/kvm/intercept.c
@@ -128,7 +128,7 @@ static int handle_noop(struct kvm_vcpu *
 
 static int handle_stop(struct kvm_vcpu *vcpu)
 {
-	int rc;
+	int rc = 0;
 
 	vcpu->stat.exit_stop_request++;
 	atomic_clear_mask(CPUSTAT_RUNNING, &vcpu->arch.sie_block->cpuflags);
@@ -141,12 +141,18 @@ static int handle_stop(struct kvm_vcpu *
 			rc = -ENOTSUPP;
 	}
 
+	if (vcpu->arch.local_int.action_bits & ACTION_RELOADVCPU_ON_STOP) {
+		vcpu->arch.local_int.action_bits &= ~ACTION_RELOADVCPU_ON_STOP;
+		rc = SIE_INTERCEPT_RERUNVCPU;
+		vcpu->run->exit_reason = KVM_EXIT_INTR;
+	}
+
 	if (vcpu->arch.local_int.action_bits & ACTION_STOP_ON_STOP) {
 		vcpu->arch.local_int.action_bits &= ~ACTION_STOP_ON_STOP;
 		VCPU_EVENT(vcpu, 3, "%s", "cpu stopped");
 		rc = -ENOTSUPP;
-	} else
-		rc = 0;
+	}
+
 	spin_unlock_bh(&vcpu->arch.local_int.lock);
 	return rc;
 }
Index: kvm/arch/s390/kvm/kvm-s390.c
===================================================================
--- kvm.orig/arch/s390/kvm/kvm-s390.c
+++ kvm/arch/s390/kvm/kvm-s390.c
@@ -487,6 +487,7 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_v
 
 	vcpu_load(vcpu);
 
+rerun_vcpu:
 	/* verify, that memory has been registered */
 	if (!vcpu->kvm->arch.guest_memsize) {
 		vcpu_put(vcpu);
@@ -506,6 +507,7 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_v
 		vcpu->arch.sie_block->gpsw.addr = kvm_run->s390_sieic.addr;
 		break;
 	case KVM_EXIT_UNKNOWN:
+	case KVM_EXIT_INTR:
 	case KVM_EXIT_S390_RESET:
 		break;
 	default:
@@ -519,6 +521,9 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_v
 		rc = kvm_handle_sie_intercept(vcpu);
 	} while (!signal_pending(current) && !rc);
 
+	if (rc == SIE_INTERCEPT_RERUNVCPU)
+		goto rerun_vcpu;
+
 	if (signal_pending(current) && !rc)
 		rc = -EINTR;
 
Index: kvm/arch/s390/kvm/kvm-s390.h
===================================================================
--- kvm.orig/arch/s390/kvm/kvm-s390.h
+++ kvm/arch/s390/kvm/kvm-s390.h
@@ -20,6 +20,8 @@
 
 typedef int (*intercept_handler_t)(struct kvm_vcpu *vcpu);
 
+/* negativ values are error codes, positive values for internal conditions */
+#define SIE_INTERCEPT_RERUNVCPU		(1<<0)
 int kvm_handle_sie_intercept(struct kvm_vcpu *vcpu);
 
 #define VM_EVENT(d_kvm, d_loglevel, d_string, d_args...)\
@@ -50,6 +52,7 @@ int kvm_s390_inject_vm(struct kvm *kvm,
 int kvm_s390_inject_vcpu(struct kvm_vcpu *vcpu,
 		struct kvm_s390_interrupt *s390int);
 int kvm_s390_inject_program_int(struct kvm_vcpu *vcpu, u16 code);
+int kvm_s390_inject_sigp_stop(struct kvm_vcpu *vcpu, int action);
 
 /* implemented in priv.c */
 int kvm_s390_handle_b2(struct kvm_vcpu *vcpu);
Index: kvm/arch/s390/include/asm/kvm_host.h
===================================================================
--- kvm.orig/arch/s390/include/asm/kvm_host.h
+++ kvm/arch/s390/include/asm/kvm_host.h
@@ -180,8 +180,9 @@ struct kvm_s390_interrupt_info {
 };
 
 /* for local_interrupt.action_flags */
-#define ACTION_STORE_ON_STOP 1
-#define ACTION_STOP_ON_STOP  2
+#define ACTION_STORE_ON_STOP		(1<<0)
+#define ACTION_STOP_ON_STOP		(1<<1)
+#define ACTION_RELOADVCPU_ON_STOP	(1<<2)
 
 struct kvm_s390_local_interrupt {
 	spinlock_t lock;
Index: kvm/arch/s390/kvm/sigp.c
===================================================================
--- kvm.orig/arch/s390/kvm/sigp.c
+++ kvm/arch/s390/kvm/sigp.c
@@ -1,7 +1,7 @@
 /*
  * sigp.c - handlinge interprocessor communication
  *
- * Copyright IBM Corp. 2008
+ * Copyright IBM Corp. 2008,2009
  *
  * This program is free software; you can redistribute it and/or modify
  * it under the terms of the GNU General Public License (version 2 only)
@@ -9,6 +9,7 @@
  *
  *    Author(s): Carsten Otte <cotte@de.ibm.com>
  *               Christian Borntraeger <borntraeger@de.ibm.com>
+ *               Christian Ehrhardt <ehrhardt@de.ibm.com>
  */
 
 #include <linux/kvm.h>
@@ -107,46 +108,57 @@ unlock:
 	return rc;
 }
 
-static int __sigp_stop(struct kvm_vcpu *vcpu, u16 cpu_addr, int store)
+static int __inject_sigp_stop(struct kvm_s390_local_interrupt *li, int action)
 {
-	struct kvm_s390_float_interrupt *fi = &vcpu->kvm->arch.float_int;
-	struct kvm_s390_local_interrupt *li;
 	struct kvm_s390_interrupt_info *inti;
-	int rc;
-
-	if (cpu_addr >= KVM_MAX_VCPUS)
-		return 3; /* not operational */
 
 	inti = kzalloc(sizeof(*inti), GFP_KERNEL);
 	if (!inti)
 		return -ENOMEM;
-
 	inti->type = KVM_S390_SIGP_STOP;
 
-	spin_lock(&fi->lock);
-	li = fi->local_int[cpu_addr];
-	if (li == NULL) {
-		rc = 3; /* not operational */
-		kfree(inti);
-		goto unlock;
-	}
 	spin_lock_bh(&li->lock);
 	list_add_tail(&inti->list, &li->list);
 	atomic_set(&li->active, 1);
 	atomic_set_mask(CPUSTAT_STOP_INT, li->cpuflags);
-	if (store)
-		li->action_bits |= ACTION_STORE_ON_STOP;
-	li->action_bits |= ACTION_STOP_ON_STOP;
+	li->action_bits |= action;
 	if (waitqueue_active(&li->wq))
 		wake_up_interruptible(&li->wq);
 	spin_unlock_bh(&li->lock);
-	rc = 0; /* order accepted */
+
+	return 0; /* order accepted */
+}
+
+static int __sigp_stop(struct kvm_vcpu *vcpu, u16 cpu_addr, int action)
+{
+	struct kvm_s390_float_interrupt *fi = &vcpu->kvm->arch.float_int;
+	struct kvm_s390_local_interrupt *li;
+	int rc;
+
+	if (cpu_addr >= KVM_MAX_VCPUS)
+		return 3; /* not operational */
+
+	spin_lock(&fi->lock);
+	li = fi->local_int[cpu_addr];
+	if (li == NULL) {
+		rc = 3; /* not operational */
+		goto unlock;
+	}
+
+	rc = __inject_sigp_stop(li, action);
+
 unlock:
 	spin_unlock(&fi->lock);
 	VCPU_EVENT(vcpu, 4, "sent sigp stop to cpu %x", cpu_addr);
 	return rc;
 }
 
+int kvm_s390_inject_sigp_stop(struct kvm_vcpu *vcpu, int action)
+{
+	struct kvm_s390_local_interrupt *li = &vcpu->arch.local_int;
+	return __inject_sigp_stop(li, action);
+}
+
 static int __sigp_set_arch(struct kvm_vcpu *vcpu, u32 parameter)
 {
 	int rc;
@@ -261,11 +273,11 @@ int kvm_s390_handle_sigp(struct kvm_vcpu
 		break;
 	case SIGP_STOP:
 		vcpu->stat.instruction_sigp_stop++;
-		rc = __sigp_stop(vcpu, cpu_addr, 0);
+		rc = __sigp_stop(vcpu, cpu_addr, ACTION_STOP_ON_STOP);
 		break;
 	case SIGP_STOP_STORE_STATUS:
 		vcpu->stat.instruction_sigp_stop++;
-		rc = __sigp_stop(vcpu, cpu_addr, 1);
+		rc = __sigp_stop(vcpu, cpu_addr, ACTION_STORE_ON_STOP);
 		break;
 	case SIGP_SET_ARCH:
 		vcpu->stat.instruction_sigp_arch++;


* [PATCH 2/3] kvm-s390: fix signal handling
  2009-05-20 13:34 [PATCH 0/3] kvm-s390: revised version of kvm-s390 guest memory handling ehrhardt
  2009-05-20 13:34 ` [PATCH 1/3] kvm-s390: infrastructure to kick vcpus out of guest state ehrhardt
@ 2009-05-20 13:34 ` ehrhardt
  2009-05-20 13:34 ` [PATCH 3/3] kvm-s390: streamline memslot handling ehrhardt
  2 siblings, 0 replies; 18+ messages in thread
From: ehrhardt @ 2009-05-20 13:34 UTC (permalink / raw)
  To: kvm, avi; +Cc: ehrhardt, borntraeger, cotte, heiko.carstens, schwidefsky

From: Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com>

If a signal is pending we exit without updating kvm_run; userspace currently
just does nothing and jumps to kvm_run again.
Since we did not set an exit_reason, we might end up with a stale one
(whatever the last exit set), so it was possible to e.g. jump to the psw
position the last real interruption set.
Setting the KVM_EXIT_INTR exit reason ensures that no old psw data is
swapped in on reentry.

Signed-off-by: Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com>
---

[diffstat]
 kvm-s390.c |    4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

[diff]
Index: kvm/arch/s390/kvm/kvm-s390.c
===================================================================
--- kvm.orig/arch/s390/kvm/kvm-s390.c
+++ kvm/arch/s390/kvm/kvm-s390.c
@@ -524,8 +524,10 @@ rerun_vcpu:
 	if (rc == SIE_INTERCEPT_RERUNVCPU)
 		goto rerun_vcpu;
 
-	if (signal_pending(current) && !rc)
+	if (signal_pending(current) && !rc) {
+		kvm_run->exit_reason = KVM_EXIT_INTR;
 		rc = -EINTR;
+	}
 
 	if (rc == -ENOTSUPP) {
 		/* intercept cannot be handled in-kernel, prepare kvm-run */


* [PATCH 3/3] kvm-s390: streamline memslot handling
  2009-05-20 13:34 [PATCH 0/3] kvm-s390: revised version of kvm-s390 guest memory handling ehrhardt
  2009-05-20 13:34 ` [PATCH 1/3] kvm-s390: infrastructure to kick vcpus out of guest state ehrhardt
  2009-05-20 13:34 ` [PATCH 2/3] kvm-s390: fix signal handling ehrhardt
@ 2009-05-20 13:34 ` ehrhardt
  2009-05-24 14:39   ` Avi Kivity
  2 siblings, 1 reply; 18+ messages in thread
From: ehrhardt @ 2009-05-20 13:34 UTC (permalink / raw)
  To: kvm, avi; +Cc: ehrhardt, borntraeger, cotte, heiko.carstens, schwidefsky

From: Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com>

This patch relocates the variables kvm-s390 uses to track guest memory
address/size. As discussed, dropping the variables at struct kvm_arch level
allows us to use the common vcpu->requests based mechanism to reload guest
memory if it changes, e.g. via set_memory_region.
The kick mechanism introduced in this series is used to ensure running vcpus
leave guest state to catch the update.


Signed-off-by: Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com>
---

[diffstat]
 include/asm/kvm_host.h |    4 +---
 kvm/gaccess.h          |   23 ++++++++++++-----------
 kvm/intercept.c        |    6 +++---
 kvm/kvm-s390.c         |   48 ++++++++++++++----------------------------------
 kvm/kvm-s390.h         |   29 ++++++++++++++++++++++++++++-
 kvm/sigp.c             |    4 ++--
 6 files changed, 60 insertions(+), 54 deletions(-)

[diff]
Index: kvm/arch/s390/kvm/gaccess.h
===================================================================
--- kvm.orig/arch/s390/kvm/gaccess.h
+++ kvm/arch/s390/kvm/gaccess.h
@@ -1,7 +1,7 @@
 /*
  * gaccess.h -  access guest memory
  *
- * Copyright IBM Corp. 2008
+ * Copyright IBM Corp. 2008,2009
  *
  * This program is free software; you can redistribute it and/or modify
  * it under the terms of the GNU General Public License (version 2 only)
@@ -16,13 +16,14 @@
 #include <linux/compiler.h>
 #include <linux/kvm_host.h>
 #include <asm/uaccess.h>
+#include "kvm-s390.h"
 
 static inline void __user *__guestaddr_to_user(struct kvm_vcpu *vcpu,
 					       unsigned long guestaddr)
 {
 	unsigned long prefix  = vcpu->arch.sie_block->prefix;
-	unsigned long origin  = vcpu->kvm->arch.guest_origin;
-	unsigned long memsize = vcpu->kvm->arch.guest_memsize;
+	unsigned long origin  = vcpu->arch.sie_block->gmsor;
+	unsigned long memsize = kvm_s390_vcpu_get_memsize(vcpu);
 
 	if (guestaddr < 2 * PAGE_SIZE)
 		guestaddr += prefix;
@@ -158,8 +159,8 @@ static inline int copy_to_guest(struct k
 				const void *from, unsigned long n)
 {
 	unsigned long prefix  = vcpu->arch.sie_block->prefix;
-	unsigned long origin  = vcpu->kvm->arch.guest_origin;
-	unsigned long memsize = vcpu->kvm->arch.guest_memsize;
+	unsigned long origin  = vcpu->arch.sie_block->gmsor;
+	unsigned long memsize = kvm_s390_vcpu_get_memsize(vcpu);
 
 	if ((guestdest < 2 * PAGE_SIZE) && (guestdest + n > 2 * PAGE_SIZE))
 		goto slowpath;
@@ -209,8 +210,8 @@ static inline int copy_from_guest(struct
 				  unsigned long guestsrc, unsigned long n)
 {
 	unsigned long prefix  = vcpu->arch.sie_block->prefix;
-	unsigned long origin  = vcpu->kvm->arch.guest_origin;
-	unsigned long memsize = vcpu->kvm->arch.guest_memsize;
+	unsigned long origin  = vcpu->arch.sie_block->gmsor;
+	unsigned long memsize = kvm_s390_vcpu_get_memsize(vcpu);
 
 	if ((guestsrc < 2 * PAGE_SIZE) && (guestsrc + n > 2 * PAGE_SIZE))
 		goto slowpath;
@@ -244,8 +245,8 @@ static inline int copy_to_guest_absolute
 					 unsigned long guestdest,
 					 const void *from, unsigned long n)
 {
-	unsigned long origin  = vcpu->kvm->arch.guest_origin;
-	unsigned long memsize = vcpu->kvm->arch.guest_memsize;
+	unsigned long origin  = vcpu->arch.sie_block->gmsor;
+	unsigned long memsize = kvm_s390_vcpu_get_memsize(vcpu);
 
 	if (guestdest + n > memsize)
 		return -EFAULT;
@@ -262,8 +263,8 @@ static inline int copy_from_guest_absolu
 					   unsigned long guestsrc,
 					   unsigned long n)
 {
-	unsigned long origin  = vcpu->kvm->arch.guest_origin;
-	unsigned long memsize = vcpu->kvm->arch.guest_memsize;
+	unsigned long origin  = vcpu->arch.sie_block->gmsor;
+	unsigned long memsize = kvm_s390_vcpu_get_memsize(vcpu);
 
 	if (guestsrc + n > memsize)
 		return -EFAULT;
Index: kvm/arch/s390/kvm/intercept.c
===================================================================
--- kvm.orig/arch/s390/kvm/intercept.c
+++ kvm/arch/s390/kvm/intercept.c
@@ -1,7 +1,7 @@
 /*
  * intercept.c - in-kernel handling for sie intercepts
  *
- * Copyright IBM Corp. 2008
+ * Copyright IBM Corp. 2008,2009
  *
  * This program is free software; you can redistribute it and/or modify
  * it under the terms of the GNU General Public License (version 2 only)
@@ -164,9 +164,9 @@ static int handle_validity(struct kvm_vc
 
 	vcpu->stat.exit_validity++;
 	if ((viwhy == 0x37) && (vcpu->arch.sie_block->prefix
-		<= vcpu->kvm->arch.guest_memsize - 2*PAGE_SIZE)){
+		<= kvm_s390_vcpu_get_memsize(vcpu) - 2*PAGE_SIZE)) {
 		rc = fault_in_pages_writeable((char __user *)
-			 vcpu->kvm->arch.guest_origin +
+			 vcpu->arch.sie_block->gmsor +
 			 vcpu->arch.sie_block->prefix,
 			 2*PAGE_SIZE);
 		if (rc)
Index: kvm/arch/s390/kvm/kvm-s390.c
===================================================================
--- kvm.orig/arch/s390/kvm/kvm-s390.c
+++ kvm/arch/s390/kvm/kvm-s390.c
@@ -1,7 +1,7 @@
 /*
  * s390host.c --  hosting zSeries kernel virtual machines
  *
- * Copyright IBM Corp. 2008
+ * Copyright IBM Corp. 2008,2009
  *
  * This program is free software; you can redistribute it and/or modify
  * it under the terms of the GNU General Public License (version 2 only)
@@ -10,6 +10,7 @@
  *    Author(s): Carsten Otte <cotte@de.ibm.com>
  *               Christian Borntraeger <borntraeger@de.ibm.com>
  *               Heiko Carstens <heiko.carstens@de.ibm.com>
+ *               Christian Ehrhardt <ehrhardt@de.ibm.com>
  */
 
 #include <linux/compiler.h>
@@ -276,16 +277,10 @@ static void kvm_s390_vcpu_initial_reset(
 	vcpu->arch.sie_block->gbea = 1;
 }
 
-/* The current code can have up to 256 pages for virtio */
-#define VIRTIODESCSPACE (256ul * 4096ul)
-
 int kvm_arch_vcpu_setup(struct kvm_vcpu *vcpu)
 {
 	atomic_set(&vcpu->arch.sie_block->cpuflags, CPUSTAT_ZARCH);
-	vcpu->arch.sie_block->gmslm = vcpu->kvm->arch.guest_memsize +
-				      vcpu->kvm->arch.guest_origin +
-				      VIRTIODESCSPACE - 1ul;
-	vcpu->arch.sie_block->gmsor = vcpu->kvm->arch.guest_origin;
+	set_bit(KVM_REQ_MMU_RELOAD, &vcpu->requests);
 	vcpu->arch.sie_block->ecb   = 2;
 	vcpu->arch.sie_block->eca   = 0xC1002001U;
 	hrtimer_init(&vcpu->arch.ckc_timer, CLOCK_REALTIME, HRTIMER_MODE_ABS);
@@ -488,9 +483,14 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_v
 	vcpu_load(vcpu);
 
 rerun_vcpu:
+	if (vcpu->requests)
+		if (test_and_clear_bit(KVM_REQ_MMU_RELOAD, &vcpu->requests))
+			kvm_s390_vcpu_set_mem(vcpu);
+
 	/* verify, that memory has been registered */
-	if (!vcpu->kvm->arch.guest_memsize) {
+	if (!vcpu->arch.sie_block->gmslm) {
 		vcpu_put(vcpu);
+		VCPU_EVENT(vcpu, 3, "%s", "no memory registered to run vcpu");
 		return -EINVAL;
 	}
 
@@ -688,7 +688,7 @@ int kvm_arch_set_memory_region(struct kv
 	   vmas. It is okay to mmap() and munmap() stuff in this slot after
 	   doing this call at any time */
 
-	if (mem->slot || kvm->arch.guest_memsize)
+	if (mem->slot)
 		return -EINVAL;
 
 	if (mem->guest_phys_addr)
@@ -703,36 +703,16 @@ int kvm_arch_set_memory_region(struct kv
 	if (!user_alloc)
 		return -EINVAL;
 
-	/* lock all vcpus */
-	for (i = 0; i < KVM_MAX_VCPUS; ++i) {
-		if (!kvm->vcpus[i])
-			continue;
-		if (!mutex_trylock(&kvm->vcpus[i]->mutex))
-			goto fail_out;
-	}
-
-	kvm->arch.guest_origin = mem->userspace_addr;
-	kvm->arch.guest_memsize = mem->memory_size;
-
-	/* update sie control blocks, and unlock all vcpus */
+	/* request update of sie control block for all available vcpus */
 	for (i = 0; i < KVM_MAX_VCPUS; ++i) {
 		if (kvm->vcpus[i]) {
-			kvm->vcpus[i]->arch.sie_block->gmsor =
-				kvm->arch.guest_origin;
-			kvm->vcpus[i]->arch.sie_block->gmslm =
-				kvm->arch.guest_memsize +
-				kvm->arch.guest_origin +
-				VIRTIODESCSPACE - 1ul;
-			mutex_unlock(&kvm->vcpus[i]->mutex);
+			set_bit(KVM_REQ_MMU_RELOAD, &kvm->vcpus[i]->requests);
+			kvm_s390_inject_sigp_stop(kvm->vcpus[i],
+						  ACTION_RELOADVCPU_ON_STOP);
 		}
 	}
 
 	return 0;
-
-fail_out:
-	for (; i >= 0; i--)
-		mutex_unlock(&kvm->vcpus[i]->mutex);
-	return -EINVAL;
 }
 
 void kvm_arch_flush_shadow(struct kvm *kvm)
Index: kvm/arch/s390/kvm/kvm-s390.h
===================================================================
--- kvm.orig/arch/s390/kvm/kvm-s390.h
+++ kvm/arch/s390/kvm/kvm-s390.h
@@ -1,7 +1,7 @@
 /*
  * kvm_s390.h -  definition for kvm on s390
  *
- * Copyright IBM Corp. 2008
+ * Copyright IBM Corp. 2008,2009
  *
  * This program is free software; you can redistribute it and/or modify
  * it under the terms of the GNU General Public License (version 2 only)
@@ -9,6 +9,7 @@
  *
  *    Author(s): Carsten Otte <cotte@de.ibm.com>
  *               Christian Borntraeger <borntraeger@de.ibm.com>
+ *               Christian Ehrhardt <ehrhardt@de.ibm.com>
  */
 
 #ifndef ARCH_S390_KVM_S390_H
@@ -18,6 +19,9 @@
 #include <linux/kvm.h>
 #include <linux/kvm_host.h>
 
+/* The current code can have up to 256 pages for virtio */
+#define VIRTIODESCSPACE (256ul * 4096ul)
+
 typedef int (*intercept_handler_t)(struct kvm_vcpu *vcpu);
 
 /* negativ values are error codes, positive values for internal conditions */
@@ -54,6 +58,29 @@ int kvm_s390_inject_vcpu(struct kvm_vcpu
 int kvm_s390_inject_program_int(struct kvm_vcpu *vcpu, u16 code);
 int kvm_s390_inject_sigp_stop(struct kvm_vcpu *vcpu, int action);
 
+static inline int kvm_s390_vcpu_get_memsize(struct kvm_vcpu *vcpu)
+{
+	return vcpu->arch.sie_block->gmslm
+		- vcpu->arch.sie_block->gmsor
+		- VIRTIODESCSPACE + 1ul;
+}
+
+static inline void kvm_s390_vcpu_set_mem(struct kvm_vcpu *vcpu)
+{
+	struct kvm_memory_slot *mem;
+
+	down_read(&vcpu->kvm->slots_lock);
+	mem = &vcpu->kvm->memslots[0];
+
+	vcpu->arch.sie_block->gmsor = mem->userspace_addr;
+	vcpu->arch.sie_block->gmslm =
+		mem->userspace_addr +
+		(mem->npages << PAGE_SHIFT) +
+		VIRTIODESCSPACE - 1ul;
+
+	up_read(&vcpu->kvm->slots_lock);
+}
+
 /* implemented in priv.c */
 int kvm_s390_handle_b2(struct kvm_vcpu *vcpu);
 
Index: kvm/arch/s390/include/asm/kvm_host.h
===================================================================
--- kvm.orig/arch/s390/include/asm/kvm_host.h
+++ kvm/arch/s390/include/asm/kvm_host.h
@@ -1,7 +1,7 @@
 /*
  * asm-s390/kvm_host.h - definition for kernel virtual machines on s390
  *
- * Copyright IBM Corp. 2008
+ * Copyright IBM Corp. 2008,2009
  *
  * This program is free software; you can redistribute it and/or modify
  * it under the terms of the GNU General Public License (version 2 only)
@@ -226,8 +226,6 @@ struct kvm_vm_stat {
 };
 
 struct kvm_arch{
-	unsigned long guest_origin;
-	unsigned long guest_memsize;
 	struct sca_block *sca;
 	debug_info_t *dbf;
 	struct kvm_s390_float_interrupt float_int;
Index: kvm/arch/s390/kvm/sigp.c
===================================================================
--- kvm.orig/arch/s390/kvm/sigp.c
+++ kvm/arch/s390/kvm/sigp.c
@@ -189,9 +189,9 @@ static int __sigp_set_prefix(struct kvm_
 	/* make sure that the new value is valid memory */
 	address = address & 0x7fffe000u;
 	if ((copy_from_guest(vcpu, &tmp,
-		(u64) (address + vcpu->kvm->arch.guest_origin) , 1)) ||
+		(u64) (address + vcpu->arch.sie_block->gmsor) , 1)) ||
 	   (copy_from_guest(vcpu, &tmp, (u64) (address +
-			vcpu->kvm->arch.guest_origin + PAGE_SIZE), 1))) {
+			vcpu->arch.sie_block->gmsor + PAGE_SIZE), 1))) {
 		*reg |= SIGP_STAT_INVALID_PARAMETER;
 		return 1; /* invalid parameter */
 	}


* Re: [PATCH 3/3] kvm-s390: streamline memslot handling
  2009-05-20 13:34 ` [PATCH 3/3] kvm-s390: streamline memslot handling ehrhardt
@ 2009-05-24 14:39   ` Avi Kivity
  2009-05-25  8:33     ` Christian Ehrhardt
  0 siblings, 1 reply; 18+ messages in thread
From: Avi Kivity @ 2009-05-24 14:39 UTC (permalink / raw)
  To: ehrhardt
  Cc: kvm, borntraeger, cotte, heiko.carstens, schwidefsky,
	Marcelo Tosatti

ehrhardt@linux.vnet.ibm.com wrote:
> From: Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com>
>
> This patch relocates the variables kvm-s390 uses to track guest mem addr/size.
> As discussed dropping the variables at struct kvm_arch level allows to use the
> common vcpu->request based mechanism to reload guest memory if e.g. changes
> via set_memory_region.
> The kick mechanism introduced in this series is used to ensure running vcpus
> leave guest state to catch the update.
>
>
>   
>
>  rerun_vcpu:
> +	if (vcpu->requests)
> +		if (test_and_clear_bit(KVM_REQ_MMU_RELOAD, &vcpu->requests))
> +			kvm_s390_vcpu_set_mem(vcpu);
> +
>  	/* verify, that memory has been registered */
> -	if (!vcpu->kvm->arch.guest_memsize) {
> +	if (!vcpu->arch.sie_block->gmslm) {
>  		vcpu_put(vcpu);
> +		VCPU_EVENT(vcpu, 3, "%s", "no memory registered to run vcpu");
>  		return -EINVAL;
>  	}


x86 uses a double check: first we check vcpu->requests outside atomic 
context, then we enter the critical section and check again for signals 
and vcpu->requests.

This allows us (a) to do naughty things in vcpu->requests handlers, (b) 
keep the critical section short.

Does this apply here?

> -	/* update sie control blocks, and unlock all vcpus */
> +	/* request update of sie control block for all available vcpus */
>  	for (i = 0; i < KVM_MAX_VCPUS; ++i) {
>  		if (kvm->vcpus[i]) {
> -			kvm->vcpus[i]->arch.sie_block->gmsor =
> -				kvm->arch.guest_origin;
> -			kvm->vcpus[i]->arch.sie_block->gmslm =
> -				kvm->arch.guest_memsize +
> -				kvm->arch.guest_origin +
> -				VIRTIODESCSPACE - 1ul;
> -			mutex_unlock(&kvm->vcpus[i]->mutex);
> +			set_bit(KVM_REQ_MMU_RELOAD, &kvm->vcpus[i]->requests);
> +			kvm_s390_inject_sigp_stop(kvm->vcpus[i],
> +						  ACTION_RELOADVCPU_ON_STOP);
>  		}
>  	}
>   

There already exists a loop which does this, see 
make_all_cpus_request().  It uses an IPI (Marcelo, can't it use the 
reschedule interrupt?).  It has a couple of optimizations -- if the 
request is already set, it skips the IPI, and it avoids the IPI for 
vcpus out of guest mode.  Maybe it could fit s390 too.


-- 
error compiling committee.c: too many arguments to function



* Re: [PATCH 3/3] kvm-s390: streamline memslot handling
  2009-05-24 14:39   ` Avi Kivity
@ 2009-05-25  8:33     ` Christian Ehrhardt
  2009-05-25 11:40       ` Christian Ehrhardt
  2009-05-26  7:57       ` Avi Kivity
  0 siblings, 2 replies; 18+ messages in thread
From: Christian Ehrhardt @ 2009-05-25  8:33 UTC (permalink / raw)
  To: Avi Kivity
  Cc: kvm, borntraeger, cotte, heiko.carstens, schwidefsky,
	Marcelo Tosatti

Avi Kivity wrote:
> ehrhardt@linux.vnet.ibm.com wrote:
>> From: Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com>
>>
>> This patch relocates the variables kvm-s390 uses to track guest mem 
>> addr/size.
>> As discussed dropping the variables at struct kvm_arch level allows 
>> to use the
>> common vcpu->request based mechanism to reload guest memory if e.g. 
>> changes
>> via set_memory_region.
>> The kick mechanism introduced in this series is used to ensure 
>> running vcpus
>> leave guest state to catch the update.
>>
>>
>>  
>>  rerun_vcpu:
>> +    if (vcpu->requests)
>> +        if (test_and_clear_bit(KVM_REQ_MMU_RELOAD, &vcpu->requests))
>> +            kvm_s390_vcpu_set_mem(vcpu);
>> +
>>      /* verify, that memory has been registered */
>> -    if (!vcpu->kvm->arch.guest_memsize) {
>> +    if (!vcpu->arch.sie_block->gmslm) {
>>          vcpu_put(vcpu);
>> +        VCPU_EVENT(vcpu, 3, "%s", "no memory registered to run vcpu");
>>          return -EINVAL;
>>      }
>
>
> x86 uses a double check: first we check vcpu->requests outside atomic 
> context, then we enter the critical section and check again for 
> signals and vcpu->requests.
>
> This allows us (a) to do naughty things in vcpu->requests handlers, 
> (b) keep the critical section short.
>
> Does this apply here?

The patch already keeps the critical inner loop clear of extra code.
The check for vcpu->requests I added is only reached by either a 
heavyweight (userspace) exit/reentry or the explicit kickout of a vcpu 
to this label. Therefore it fulfills both (a) and (b) as you mentioned 
above. Additionally, the s390 reload is very rare as well as fast, so it 
would not even be an issue.

>> -    /* update sie control blocks, and unlock all vcpus */
>> +    /* request update of sie control block for all available vcpus */
>>      for (i = 0; i < KVM_MAX_VCPUS; ++i) {
>>          if (kvm->vcpus[i]) {
>> -            kvm->vcpus[i]->arch.sie_block->gmsor =
>> -                kvm->arch.guest_origin;
>> -            kvm->vcpus[i]->arch.sie_block->gmslm =
>> -                kvm->arch.guest_memsize +
>> -                kvm->arch.guest_origin +
>> -                VIRTIODESCSPACE - 1ul;
>> -            mutex_unlock(&kvm->vcpus[i]->mutex);
>> +            set_bit(KVM_REQ_MMU_RELOAD, &kvm->vcpus[i]->requests);
>> +            kvm_s390_inject_sigp_stop(kvm->vcpus[i],
>> +                          ACTION_RELOADVCPU_ON_STOP);
>>          }
>>      }
>>   
>
> There already exists a loop which does this, see 
> make_all_cpus_request().  It uses an IPI (Marcelo, can't it use the 
> reschedule interrupt?).  It has a couple of optimizations -- if the 
> request is already set, it skips the IPI, and it avoids the IPI for 
> vcpus out of guest mode.  Maybe it could fit s390 too.
I assume that the IPI on x86 is an implicit consequence of the 
smp_call_function_many function, but I think this doesn't work that way 
for us. The kick implied by that call would be received, but would not 
reach the code that checks vcpu->requests. I could add that behaviour, 
but it could make our normal interrupt handling much slower. Therefore I 
don't want to call that function; on the other hand I like the "skip if 
the request is already set" functionality and will think about adding 
that to my loop.

-- 

Grüsse / regards, Christian Ehrhardt
IBM Linux Technology Center, Open Virtualization 



* Re: [PATCH 3/3] kvm-s390: streamline memslot handling
  2009-05-25  8:33     ` Christian Ehrhardt
@ 2009-05-25 11:40       ` Christian Ehrhardt
  2009-05-26  7:57       ` Avi Kivity
  1 sibling, 0 replies; 18+ messages in thread
From: Christian Ehrhardt @ 2009-05-25 11:40 UTC (permalink / raw)
  To: Avi Kivity
  Cc: kvm, borntraeger, cotte, heiko.carstens, schwidefsky,
	Marcelo Tosatti

Christian Ehrhardt wrote:
> Avi Kivity wrote:
>> ehrhardt@linux.vnet.ibm.com wrote:
>>> From: Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com>
>>>
> [...]
>>> -    /* update sie control blocks, and unlock all vcpus */
>>> +    /* request update of sie control block for all available vcpus */
>>>      for (i = 0; i < KVM_MAX_VCPUS; ++i) {
>>>          if (kvm->vcpus[i]) {
>>> -            kvm->vcpus[i]->arch.sie_block->gmsor =
>>> -                kvm->arch.guest_origin;
>>> -            kvm->vcpus[i]->arch.sie_block->gmslm =
>>> -                kvm->arch.guest_memsize +
>>> -                kvm->arch.guest_origin +
>>> -                VIRTIODESCSPACE - 1ul;
>>> -            mutex_unlock(&kvm->vcpus[i]->mutex);
>>> +            set_bit(KVM_REQ_MMU_RELOAD, &kvm->vcpus[i]->requests);
>>> +            kvm_s390_inject_sigp_stop(kvm->vcpus[i],
>>> +                          ACTION_RELOADVCPU_ON_STOP);
>>>          }
>>>      }
>>>   
>>
>> There already exists a loop which does this, see 
>> make_all_cpus_request().  It uses an IPI (Marcelo, can't it use the 
>> reschedule interrupt?).  It has a couple of optimizations -- if the 
>> request is already set, it skips the IPI, and it avoids the IPI for 
>> vcpus out of guest mode.  Maybe it could fit s390 too.
> I assume that the IPI on x86 is an implicit consequence of the 
> smp_call_function_many function, but I think this doesn't work that 
> way for us. The kick implied by that call would be received, but would 
> not reach the code that checks vcpu->requests. I could add that 
> behaviour, but it could make our normal interrupt handling much 
> slower. Therefore I don't want to call that function; on the other 
> hand I like the "skip if the request is already set" functionality 
> and will think about adding that to my loop.
>

For now I added, to the s390-specific loop, the optimization to skip kicking 
vcpus out of guest state when the request bit is already set (sent as v2 in 
a few minutes).

We might one day consider standardizing some generic kickout levels, e.g. 
kick to "inner loop", "arch vcpu run", "generic vcpu run", "userspace", 
... whatever levels fit *all* our use cases, and then let those kicks be 
implemented in a kvm_arch_* backend, as they might behave very differently 
on different architectures. In case an architecture cannot reach the 
specified kickout level, it has to kick to the next available upper level, 
which eventually reaches the desired step on the way to re-running the 
vcpu.
Altogether this should lead to a much more reliable and transparent 
interface that should finally be used all across the generic code.


-- 

Grüsse / regards, Christian Ehrhardt
IBM Linux Technology Center, Open Virtualization 


^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH 1/3] kvm-s390: infrastructure to kick vcpus out of guest state
  2009-05-25 11:40 [PATCH 0/3] kvm-s390: revised version of kvm-s390 guest memory handling - v2 ehrhardt
@ 2009-05-25 11:40 ` ehrhardt
  2009-05-25 20:22   ` Marcelo Tosatti
  0 siblings, 1 reply; 18+ messages in thread
From: ehrhardt @ 2009-05-25 11:40 UTC (permalink / raw)
  To: kvm, avi; +Cc: ehrhardt, borntraeger, cotte, heiko.carstens, schwidefsky

From: Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com>

To ensure vcpus come out of guest context in certain cases, this patch adds an
s390-specific way to kick them out of guest context. Currently it kicks them
out to rerun the vcpu_run path in the s390 code, but the mechanism itself is
expandable, and with a new flag we could also add, e.g., kicks to userspace.

Signed-off-by: Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com>
---

[diffstat]
 include/asm/kvm_host.h |    5 ++--
 kvm/intercept.c        |   12 ++++++++---
 kvm/kvm-s390.c         |    5 ++++
 kvm/kvm-s390.h         |    3 ++
 kvm/sigp.c             |   53 +++++++++++++++++++++++++++++--------------------
 5 files changed, 52 insertions(+), 26 deletions(-)

[diff]
Index: kvm/arch/s390/kvm/intercept.c
===================================================================
--- kvm.orig/arch/s390/kvm/intercept.c
+++ kvm/arch/s390/kvm/intercept.c
@@ -128,7 +128,7 @@ static int handle_noop(struct kvm_vcpu *
 
 static int handle_stop(struct kvm_vcpu *vcpu)
 {
-	int rc;
+	int rc = 0;
 
 	vcpu->stat.exit_stop_request++;
 	atomic_clear_mask(CPUSTAT_RUNNING, &vcpu->arch.sie_block->cpuflags);
@@ -141,12 +141,18 @@ static int handle_stop(struct kvm_vcpu *
 			rc = -ENOTSUPP;
 	}
 
+	if (vcpu->arch.local_int.action_bits & ACTION_RELOADVCPU_ON_STOP) {
+		vcpu->arch.local_int.action_bits &= ~ACTION_RELOADVCPU_ON_STOP;
+		rc = SIE_INTERCEPT_RERUNVCPU;
+		vcpu->run->exit_reason = KVM_EXIT_INTR;
+	}
+
 	if (vcpu->arch.local_int.action_bits & ACTION_STOP_ON_STOP) {
 		vcpu->arch.local_int.action_bits &= ~ACTION_STOP_ON_STOP;
 		VCPU_EVENT(vcpu, 3, "%s", "cpu stopped");
 		rc = -ENOTSUPP;
-	} else
-		rc = 0;
+	}
+
 	spin_unlock_bh(&vcpu->arch.local_int.lock);
 	return rc;
 }
Index: kvm/arch/s390/kvm/kvm-s390.c
===================================================================
--- kvm.orig/arch/s390/kvm/kvm-s390.c
+++ kvm/arch/s390/kvm/kvm-s390.c
@@ -487,6 +487,7 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_v
 
 	vcpu_load(vcpu);
 
+rerun_vcpu:
 	/* verify, that memory has been registered */
 	if (!vcpu->kvm->arch.guest_memsize) {
 		vcpu_put(vcpu);
@@ -506,6 +507,7 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_v
 		vcpu->arch.sie_block->gpsw.addr = kvm_run->s390_sieic.addr;
 		break;
 	case KVM_EXIT_UNKNOWN:
+	case KVM_EXIT_INTR:
 	case KVM_EXIT_S390_RESET:
 		break;
 	default:
@@ -519,6 +521,9 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_v
 		rc = kvm_handle_sie_intercept(vcpu);
 	} while (!signal_pending(current) && !rc);
 
+	if (rc == SIE_INTERCEPT_RERUNVCPU)
+		goto rerun_vcpu;
+
 	if (signal_pending(current) && !rc)
 		rc = -EINTR;
 
Index: kvm/arch/s390/kvm/kvm-s390.h
===================================================================
--- kvm.orig/arch/s390/kvm/kvm-s390.h
+++ kvm/arch/s390/kvm/kvm-s390.h
@@ -20,6 +20,8 @@
 
 typedef int (*intercept_handler_t)(struct kvm_vcpu *vcpu);
 
+/* negativ values are error codes, positive values for internal conditions */
+#define SIE_INTERCEPT_RERUNVCPU		(1<<0)
 int kvm_handle_sie_intercept(struct kvm_vcpu *vcpu);
 
 #define VM_EVENT(d_kvm, d_loglevel, d_string, d_args...)\
@@ -50,6 +52,7 @@ int kvm_s390_inject_vm(struct kvm *kvm,
 int kvm_s390_inject_vcpu(struct kvm_vcpu *vcpu,
 		struct kvm_s390_interrupt *s390int);
 int kvm_s390_inject_program_int(struct kvm_vcpu *vcpu, u16 code);
+int kvm_s390_inject_sigp_stop(struct kvm_vcpu *vcpu, int action);
 
 /* implemented in priv.c */
 int kvm_s390_handle_b2(struct kvm_vcpu *vcpu);
Index: kvm/arch/s390/include/asm/kvm_host.h
===================================================================
--- kvm.orig/arch/s390/include/asm/kvm_host.h
+++ kvm/arch/s390/include/asm/kvm_host.h
@@ -180,8 +180,9 @@ struct kvm_s390_interrupt_info {
 };
 
 /* for local_interrupt.action_flags */
-#define ACTION_STORE_ON_STOP 1
-#define ACTION_STOP_ON_STOP  2
+#define ACTION_STORE_ON_STOP		(1<<0)
+#define ACTION_STOP_ON_STOP		(1<<1)
+#define ACTION_RELOADVCPU_ON_STOP	(1<<2)
 
 struct kvm_s390_local_interrupt {
 	spinlock_t lock;
Index: kvm/arch/s390/kvm/sigp.c
===================================================================
--- kvm.orig/arch/s390/kvm/sigp.c
+++ kvm/arch/s390/kvm/sigp.c
@@ -1,7 +1,7 @@
 /*
  * sigp.c - handlinge interprocessor communication
  *
- * Copyright IBM Corp. 2008
+ * Copyright IBM Corp. 2008,2009
  *
  * This program is free software; you can redistribute it and/or modify
  * it under the terms of the GNU General Public License (version 2 only)
@@ -9,6 +9,7 @@
  *
  *    Author(s): Carsten Otte <cotte@de.ibm.com>
  *               Christian Borntraeger <borntraeger@de.ibm.com>
+ *               Christian Ehrhardt <ehrhardt@de.ibm.com>
  */
 
 #include <linux/kvm.h>
@@ -107,46 +108,57 @@ unlock:
 	return rc;
 }
 
-static int __sigp_stop(struct kvm_vcpu *vcpu, u16 cpu_addr, int store)
+static int __inject_sigp_stop(struct kvm_s390_local_interrupt *li, int action)
 {
-	struct kvm_s390_float_interrupt *fi = &vcpu->kvm->arch.float_int;
-	struct kvm_s390_local_interrupt *li;
 	struct kvm_s390_interrupt_info *inti;
-	int rc;
-
-	if (cpu_addr >= KVM_MAX_VCPUS)
-		return 3; /* not operational */
 
 	inti = kzalloc(sizeof(*inti), GFP_KERNEL);
 	if (!inti)
 		return -ENOMEM;
-
 	inti->type = KVM_S390_SIGP_STOP;
 
-	spin_lock(&fi->lock);
-	li = fi->local_int[cpu_addr];
-	if (li == NULL) {
-		rc = 3; /* not operational */
-		kfree(inti);
-		goto unlock;
-	}
 	spin_lock_bh(&li->lock);
 	list_add_tail(&inti->list, &li->list);
 	atomic_set(&li->active, 1);
 	atomic_set_mask(CPUSTAT_STOP_INT, li->cpuflags);
-	if (store)
-		li->action_bits |= ACTION_STORE_ON_STOP;
-	li->action_bits |= ACTION_STOP_ON_STOP;
+	li->action_bits |= action;
 	if (waitqueue_active(&li->wq))
 		wake_up_interruptible(&li->wq);
 	spin_unlock_bh(&li->lock);
-	rc = 0; /* order accepted */
+
+	return 0; /* order accepted */
+}
+
+static int __sigp_stop(struct kvm_vcpu *vcpu, u16 cpu_addr, int action)
+{
+	struct kvm_s390_float_interrupt *fi = &vcpu->kvm->arch.float_int;
+	struct kvm_s390_local_interrupt *li;
+	int rc;
+
+	if (cpu_addr >= KVM_MAX_VCPUS)
+		return 3; /* not operational */
+
+	spin_lock(&fi->lock);
+	li = fi->local_int[cpu_addr];
+	if (li == NULL) {
+		rc = 3; /* not operational */
+		goto unlock;
+	}
+
+	rc = __inject_sigp_stop(li, action);
+
 unlock:
 	spin_unlock(&fi->lock);
 	VCPU_EVENT(vcpu, 4, "sent sigp stop to cpu %x", cpu_addr);
 	return rc;
 }
 
+int kvm_s390_inject_sigp_stop(struct kvm_vcpu *vcpu, int action)
+{
+	struct kvm_s390_local_interrupt *li = &vcpu->arch.local_int;
+	return __inject_sigp_stop(li, action);
+}
+
 static int __sigp_set_arch(struct kvm_vcpu *vcpu, u32 parameter)
 {
 	int rc;
@@ -261,11 +273,11 @@ int kvm_s390_handle_sigp(struct kvm_vcpu
 		break;
 	case SIGP_STOP:
 		vcpu->stat.instruction_sigp_stop++;
-		rc = __sigp_stop(vcpu, cpu_addr, 0);
+		rc = __sigp_stop(vcpu, cpu_addr, ACTION_STOP_ON_STOP);
 		break;
 	case SIGP_STOP_STORE_STATUS:
 		vcpu->stat.instruction_sigp_stop++;
-		rc = __sigp_stop(vcpu, cpu_addr, 1);
+		rc = __sigp_stop(vcpu, cpu_addr, ACTION_STORE_ON_STOP);
 		break;
 	case SIGP_SET_ARCH:
 		vcpu->stat.instruction_sigp_arch++;

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 1/3] kvm-s390: infrastructure to kick vcpus out of guest state
  2009-05-25 11:40 ` [PATCH 1/3] kvm-s390: infrastructure to kick vcpus out of guest state ehrhardt
@ 2009-05-25 20:22   ` Marcelo Tosatti
  2009-05-26  8:02     ` Christian Ehrhardt
  0 siblings, 1 reply; 18+ messages in thread
From: Marcelo Tosatti @ 2009-05-25 20:22 UTC (permalink / raw)
  To: ehrhardt; +Cc: kvm, avi, borntraeger, cotte, heiko.carstens, schwidefsky

On Mon, May 25, 2009 at 01:40:49PM +0200, ehrhardt@linux.vnet.ibm.com wrote:
> From: Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com>
> 
> To ensure vcpus come out of guest context in certain cases, this patch adds an
> s390-specific way to kick them out of guest context. Currently it kicks them
> out to rerun the vcpu_run path in the s390 code, but the mechanism itself is
> expandable, and with a new flag we could also add, e.g., kicks to userspace.
> 
> Signed-off-by: Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com>

"For now I added the optimization to skip kicking vcpus out of guest
that had the request bit already set to the s390 specific loop (sent as
v2 in a few minutes).

We might one day consider standardizing some generic kickout levels e.g.
kick to "inner loop", "arch vcpu run", "generic vcpu run", "userspace",
... whatever levels fit *all* our use cases. And then let that kicks be
implemented in an kvm_arch_* backend as it might be very different how
they behave on different architectures."

That would be ideal, yes. Two things make_all_requests handles: 

1) It disables preemption with get_cpu(), so it can reliably check for
cpu id. Somehow you don't need that for s390 when kicking multiple
vcpus?

2) It uses smp_call_function_many(wait=1), which guarantees that by the
time make_all_requests returns no vcpus will be using stale data (the
remote vcpus will have executed ack_flush).

If smp_call_function_many is hidden behind kvm_arch_kick_vcpus, can you
make use of make_all_requests for S390 (without the smp_call_function 
performance impact you mentioned) ?

For x86 we can further optimize make_all_requests by checking REQ_KICK,
and kvm_arch_kick_vcpus would be a good place for that.

And the kickout levels idea you mentioned can come later, as an
optimization?

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 3/3] kvm-s390: streamline memslot handling
  2009-05-25  8:33     ` Christian Ehrhardt
  2009-05-25 11:40       ` Christian Ehrhardt
@ 2009-05-26  7:57       ` Avi Kivity
  2009-05-26  8:31         ` Christian Bornträger
  1 sibling, 1 reply; 18+ messages in thread
From: Avi Kivity @ 2009-05-26  7:57 UTC (permalink / raw)
  To: Christian Ehrhardt
  Cc: kvm, borntraeger, cotte, heiko.carstens, schwidefsky,
	Marcelo Tosatti

Christian Ehrhardt wrote: 
>>
>> There already exists a loop which does this, see 
>> make_all_cpus_request().  It uses an IPI (Marcelo, can't it use the 
>> reschedule interrupt?).  It has a couple of optimizations -- if the 
>> request is already set, it skips the IPI, and it avoids the IPI for 
>> vcpus out of guest mode.  Maybe it could fit s390 too.
> I assume that the IPI on x86 is an implicit consequence of the 
> smp_call_function_many function, 

Yes.  It's only used to exit the guest, the IPI itself does nothing.

> but I think this doesn't work that way for us. The kick implied by 
> that call would be received, but would not reach the code that checks 
> vcpu->requests. 

vcpu->requests is not checked by the IPI.  Instead, it is checked just 
before entering guest mode, with interrupts disabled.

If the request is made before the check, no IPI is made, and the check 
finds the bit set.

If the request is made after the check, an IPI is made, and the guest 
exits immediately after entry.

> I could add that behaviour, but that could make our normal interrupt 
> handling much slower. Therefore I don't want to call that function, 
> but on the other hand I like the "skip if the request is already set" 
> functionality and am thinking about adding that to my loop.

I don't understand why it would affect your interrupt handling.  We need 
someone that talks both x86 and s390 to break the language barrier...

I'll apply the patches, but please do look further into increasing 
commonality.

-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 1/3] kvm-s390: infrastructure to kick vcpus out of guest state
  2009-05-25 20:22   ` Marcelo Tosatti
@ 2009-05-26  8:02     ` Christian Ehrhardt
  2009-05-28  3:44       ` Marcelo Tosatti
  0 siblings, 1 reply; 18+ messages in thread
From: Christian Ehrhardt @ 2009-05-26  8:02 UTC (permalink / raw)
  To: Marcelo Tosatti; +Cc: kvm, avi, borntraeger, cotte, heiko.carstens, schwidefsky

Marcelo Tosatti wrote:
> On Mon, May 25, 2009 at 01:40:49PM +0200, ehrhardt@linux.vnet.ibm.com wrote:
>   
>> From: Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com>
>>
>> To ensure vcpu's come out of guest context in certain cases this patch adds a
>> s390 specific way to kick them out of guest context. Currently it kicks them
>> out to rerun the vcpu_run path in the s390 code, but the mechanism itself is
>> expandable and with a new flag we could also add e.g. kicks to userspace etc.
>>
>> Signed-off-by: Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com>
>>     
>
> "For now I added the optimization to skip kicking vcpus out of guest
> that had the request bit already set to the s390 specific loop (sent as
> v2 in a few minutes).
>
> We might one day consider standardizing some generic kickout levels e.g.
> kick to "inner loop", "arch vcpu run", "generic vcpu run", "userspace",
> ... whatever levels fit *all* our use cases. And then let that kicks be
> implemented in an kvm_arch_* backend as it might be very different how
> they behave on different architectures."
>
> That would be ideal, yes. Two things make_all_requests handles: 
>
> 1) It disables preemption with get_cpu(), so it can reliably check for
> cpu id. Somehow you don't need that for s390 when kicking multiple
> vcpus?
>   
I don't even need the cpuid as make_all_requests does; I just set a 
special bit in the vcpu arch part and the vcpu will "come out to me (host)".
Fortunately the kick is rare and fast, so I can just set it 
unconditionally (it's even OK to set it if the vcpu is not in guest 
state). That saves us from needing the vcpu lock or detailed checks, which 
would end up where we started (no guarantee that vcpus come out of 
guest context while trying to acquire all vcpu locks).

> 2) It uses smp_call_function_many(wait=1), which guarantees that by the
> time make_all_requests returns no vcpus will be using stale data (the
> remote vcpus will have executed ack_flush).
>   
Yes, this is really a part my s390 implementation doesn't fulfill yet. 
Currently, on return, vcpus might still use the old memslot information.
As mentioned before, letting all interrupts come "too far" out of the hot 
loop would be a performance issue, so I think I will need some 
request-and-confirm mechanism. I'm not sure yet, but maybe it could be as 
easy as this pseudo code example:

# in make_all_requests
# remember we hold slots_lock for write here, and the reentry path that
# updates the vcpu-specific data acquires slots_lock for read
loop vcpus
  set_bit in vcpu requests
  kick vcpu                                   # arch function
endloop

loop vcpus
  wait until the request bit has disappeared  # as the reentry path uses
                                              # test_and_clear, it will disappear
endloop

That would be an implicit synchronization and should work; as I wrote 
before, setting memslots while the guest is running is rare, if existent 
at all, for s390. On x86, smp_call_function_many could then work without 
the wait flag being set.
But I assume that this synchronization approach is slower, as it 
serializes all vcpus on reentry (they wait for the slots_lock to get 
dropped), so I wanted to ask how often setting memslots at runtime will 
occur on x86? Would this approach be acceptable?

If it is too adventurous for now, I can implement it that way in the s390 
code and we can split off the long-term discussion (synchronization + generic 
kickout levels + whatever else comes up).
> If smp_call_function_many is hidden behind kvm_arch_kick_vcpus, can you
> make use of make_all_requests for S390 (without the smp_call_function 
> performance impact you mentioned) ?
>   
In combination with the request-and-confirm mechanism described above, it 
should work if smp_call_function and all the cpuid gathering that 
belongs to it is hidden behind kvm_arch_kick_vcpus.
> For x86 we can further optimize make_all_requests by checking REQ_KICK,
> and kvm_arch_kick_vcpus would be a good place for that.
>
> And the kickout levels idea you mentioned can come later, as an
> optimization?
Yes, I agree that splitting that out as a later optimization is a good idea.

> --
> To unsubscribe from this list: send the line "unsubscribe kvm" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>   


-- 

Grüsse / regards, Christian Ehrhardt
IBM Linux Technology Center, Open Virtualization 


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 3/3] kvm-s390: streamline memslot handling
  2009-05-26  7:57       ` Avi Kivity
@ 2009-05-26  8:31         ` Christian Bornträger
  2009-05-26  9:27           ` Avi Kivity
  0 siblings, 1 reply; 18+ messages in thread
From: Christian Bornträger @ 2009-05-26  8:31 UTC (permalink / raw)
  To: Avi Kivity
  Cc: Christian Ehrhardt, kvm, cotte, heiko.carstens, schwidefsky,
	Marcelo Tosatti

Am Dienstag 26 Mai 2009 09:57:58 schrieb Avi Kivity:
> > I could add that behaviour, but that could make our normal interrupt
> > handling much slower. Therefore I don't want to call that function,
> > but on the other hand I like the "skip if the request is already set"
> > functionality and think about adding that in my loop.
>
> I don't understand why it would affect your interrupt handling.  We need

As far as I understand x86, every host interrupt causes a guest exit.

On s390 the SIE instruction is interruptible. On a host interrupt (like an 
IPI) the host interrupt handler runs and finally jumps back into the SIE 
instruction. The hardware will continue with guest execution. This has the 
advantage that we don't have to load/save guest and host registers on host 
interrupts (the low-level interrupt handler saves the registers of the 
interrupted context).

In our low-level interrupt handler we do check for signal_pending, 
machine_check_pending and need_resched to leave the SIE instruction. For 
anything else, the host sees a CPU-bound guest always inside the SIE 
instruction.

Christian



^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 3/3] kvm-s390: streamline memslot handling
  2009-05-26  8:31         ` Christian Bornträger
@ 2009-05-26  9:27           ` Avi Kivity
  2009-05-26 10:31             ` Christian Ehrhardt
  0 siblings, 1 reply; 18+ messages in thread
From: Avi Kivity @ 2009-05-26  9:27 UTC (permalink / raw)
  To: Christian Bornträger
  Cc: Christian Ehrhardt, kvm, cotte, heiko.carstens, schwidefsky,
	Marcelo Tosatti

Christian Bornträger wrote:
> Am Dienstag 26 Mai 2009 09:57:58 schrieb Avi Kivity:
>   
>>> I could add that behaviour, but that could make our normal interrupt
>>> handling much slower. Therefore I don't want to call that function,
>>> but on the other hand I like the "skip if the request is already set"
>>> functionality and think about adding that in my loop.
>>>       
>> I don't understand why it would affect your interrupt handling.  We need
>>     
>
> As far as I understand x86, every host interrupt causes a guest exit.
>   

Yes.

> On s390 the SIE instruction is interruptible. On a host interrupt (like an 
> IPI) the host interrupt handler runs and finally jumps back into the SIE 
> instruction. The hardware will continue with guest execution. This has the 
> advantage that we don't have to load/save guest and host registers on host 
> interrupts (the low-level interrupt handler saves the registers of the 
> interrupted context).
>   

Neat stuff.  Wish I had something like that.

> In our low-level interrupt handler we do check for signal_pending, 
> machine_check_pending and need_resched to leave the SIE instruction. For 
> anything else, the host sees a CPU-bound guest always inside the SIE instruction. 
>   

Okay, now I understand (and agree with) your multi-level kick thing.  
Maybe we could do it like so:

The interrupt handler (on s390 only) checks vcpu->requests and handles the 
ones it can.  If bits are still set, it exits to the arch loop, which 
handles the bits it can.  If bits are still set, it exits to the generic 
code loop, which can finally exit to userspace.

Does this fit with s390 hardware?

-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 3/3] kvm-s390: streamline memslot handling
  2009-05-26  9:27           ` Avi Kivity
@ 2009-05-26 10:31             ` Christian Ehrhardt
  0 siblings, 0 replies; 18+ messages in thread
From: Christian Ehrhardt @ 2009-05-26 10:31 UTC (permalink / raw)
  To: Avi Kivity
  Cc: Christian Bornträger, kvm, cotte, heiko.carstens,
	schwidefsky, Marcelo Tosatti

Avi Kivity wrote:
> Christian Bornträger wrote:
>> Am Dienstag 26 Mai 2009 09:57:58 schrieb Avi Kivity:
>>   
[...]
>> In our low-level interrupt handler we do check for signal_pending, 
>> machine_check_pending and need_resched to leave the sie instruction. 
>> For anything else a the host sees a cpu bound guest always in the SIE 
>> instruction.   
>
> Okay, now I understand (and agree with) your multi-level kick thing.  
> Maybe we could do it like so:
>
> The interrupt handler (on s390 only) checks vcpu->requests and handles 
> the ones it can.  If bits are still set, it exits to the arch loop, 
> which handles the bits it can.  If bits are still set, it exits to the 
> generic code loop, which can finally exit to userspace.
>
> Does this fit with s390 hardware?
>
I like this idea: instead of explicitly kicking to an (upper) level, 
use the lowest kick and exit further out whenever a level is not able to 
handle the request.
I think it should work (no guarantee) and I'll try to come up with 
something in the next few days - either an updated patch series or 
additional discussion input :-).

-- 

Grüsse / regards, Christian Ehrhardt
IBM Linux Technology Center, Open Virtualization 


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 1/3] kvm-s390: infrastructure to kick vcpus out of guest state
  2009-05-26  8:02     ` Christian Ehrhardt
@ 2009-05-28  3:44       ` Marcelo Tosatti
  2009-05-28  7:59         ` Christian Ehrhardt
  0 siblings, 1 reply; 18+ messages in thread
From: Marcelo Tosatti @ 2009-05-28  3:44 UTC (permalink / raw)
  To: Christian Ehrhardt
  Cc: kvm, avi, borntraeger, cotte, heiko.carstens, schwidefsky


On Tue, May 26, 2009 at 10:02:59AM +0200, Christian Ehrhardt wrote:
> Marcelo Tosatti wrote:
>> On Mon, May 25, 2009 at 01:40:49PM +0200, ehrhardt@linux.vnet.ibm.com wrote:
>>   
>>> From: Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com>
>>>
>>> To ensure vcpu's come out of guest context in certain cases this patch adds a
>>> s390 specific way to kick them out of guest context. Currently it kicks them
>>> out to rerun the vcpu_run path in the s390 code, but the mechanism itself is
>>> expandable and with a new flag we could also add e.g. kicks to userspace etc.
>>>
>>> Signed-off-by: Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com>
>>>     
>>
>> "For now I added the optimization to skip kicking vcpus out of guest
>> that had the request bit already set to the s390 specific loop (sent as
>> v2 in a few minutes).
>>
>> We might one day consider standardizing some generic kickout levels e.g.
>> kick to "inner loop", "arch vcpu run", "generic vcpu run", "userspace",
>> ... whatever levels fit *all* our use cases. And then let that kicks be
>> implemented in an kvm_arch_* backend as it might be very different how
>> they behave on different architectures."
>>
>> That would be ideal, yes. Two things make_all_requests handles: 
>>
>> 1) It disables preemption with get_cpu(), so it can reliably check for
>> cpu id. Somehow you don't need that for s390 when kicking multiple
>> vcpus?
>>   
> I don't even need the cpuid as make_all_requests does; I just set a  
> special bit in the vcpu arch part and the vcpu will "come out to me 
> (host)".
> Fortunately the kick is rare and fast, so I can just set it  
> unconditionally (it's even OK to set it if the vcpu is not in guest  
> state). That saves us from needing the vcpu lock or detailed checks,  
> which would end up where we started (no guarantee that vcpus come out  
> of guest context while trying to acquire all vcpu locks).

Let me see if I get this right: you kick the vcpus out of guest mode by
using a special bit in the vcpu arch part. OK.

What I don't understand is this: 
"would end up where we started (no guarantee that vcpus come out of
guest context while trying to acquire all vcpu locks)"

So you _need_ a mechanism to kick all vcpus out of guest mode?

>> 2) It uses smp_call_function_many(wait=1), which guarantees that by the
>> time make_all_requests returns no vcpus will be using stale data (the
>> remote vcpus will have executed ack_flush).
>>   
> Yes, this is really a part my s390 implementation doesn't fulfill yet.  
> Currently, on return, vcpus might still use the old memslot information.
> As mentioned before, letting all interrupts come "too far" out of the  
> hot loop would be a performance issue, so I think I will need some  
> request-and-confirm mechanism. I'm not sure yet, but maybe it could be  
> as easy as this pseudo code example:
>
> # in make_all_requests
> # remember we hold slots_lock for write here, and the reentry path that
> # updates the vcpu-specific data acquires slots_lock for read
> loop vcpus
>  set_bit in vcpu requests
>  kick vcpu  # arch function
> endloop
>
> loop vcpus
>  wait until the request bit has disappeared  # as the reentry path uses
>  # test_and_clear, it will disappear
> endloop
>
> That would be an implicit synchronization and should work; as I wrote  
> before, setting memslots while the guest is running is rare, if existent  
> at all, for s390. On x86, smp_call_function_many could then work without  
> the wait flag being set.

I see, yes. 

> But I assume that this synchronization approach is slower, as it  
> serializes all vcpus on reentry (they wait for the slots_lock to get  
> dropped), so I wanted to ask how often setting memslots at runtime  
> will occur on x86? Would this approach be acceptable?

For x86 we need slots_lock for two things:

1) to protect the memslot structures from changing (very rare), ie:
kvm_set_memory.

2) to protect updates to the dirty bitmap (operations on behalf of
guest) which take slots_lock for read versus updates to that dirty
bitmap (an ioctl that retrieves what pages have been dirtied in the
memslots, and clears the dirtyness info).

All you need for S390 is 1), AFAICS.

For 1), we can drop the slots_lock usage, but instead create an
explicit synchronization point, where all vcpus are forced to (say
kvm_vcpu_block) "paused" state. qemu-kvm has such notion.

Same language?

> If it is too adventurous for now, I can implement it that way in the  
> s390 code and we can split off the long-term discussion (synchronization  
> + generic kickout levels + whatever else comes up).
>> If smp_call_function_many is hidden behind kvm_arch_kick_vcpus, can you
>> make use of make_all_requests for S390 (without the smp_call_function  
>> performance impact you mentioned) ?
>>   
> In combination with the request-and-confirm mechanism described above,  
> it should work if smp_call_function and all the cpuid gathering that  
> belongs to it is hidden behind kvm_arch_kick_vcpus.
>> For x86 we can further optimize make_all_requests by checking REQ_KICK,
>> and kvm_arch_kick_vcpus would be a good place for that.
>>
>> And the kickout levels idea you mentioned can come later, as an
>> optimization?
> Yes, I agree that splitting that out as a later optimization is a good idea.
>
>
>
> -- 
>
> Grüsse / regards, Christian Ehrhardt
> IBM Linux Technology Center, Open Virtualization 

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 1/3] kvm-s390: infrastructure to kick vcpus out of guest state
  2009-05-28  3:44       ` Marcelo Tosatti
@ 2009-05-28  7:59         ` Christian Ehrhardt
  2009-05-28  8:42           ` Avi Kivity
  0 siblings, 1 reply; 18+ messages in thread
From: Christian Ehrhardt @ 2009-05-28  7:59 UTC (permalink / raw)
  To: Marcelo Tosatti; +Cc: kvm, avi, borntraeger, cotte, heiko.carstens, schwidefsky

Marcelo Tosatti wrote:
> On Tue, May 26, 2009 at 10:02:59AM +0200, Christian Ehrhardt wrote:
>   
>> Marcelo Tosatti wrote:
>>     
>>> On Mon, May 25, 2009 at 01:40:49PM +0200, ehrhardt@linux.vnet.ibm.com wrote:
>>>   
>>>       
>>>> From: Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com>
>>>>
>>>> To ensure vcpu's come out of guest context in certain cases this patch adds a
>>>> s390 specific way to kick them out of guest context. Currently it kicks them
>>>> out to rerun the vcpu_run path in the s390 code, but the mechanism itself is
>>>> expandable and with a new flag we could also add e.g. kicks to userspace etc.
>>>>
>>>> Signed-off-by: Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com>
>>>>     
>>>>         
>>> "For now I added the optimization to skip kicking vcpus out of guest
>>> that had the request bit already set to the s390 specific loop (sent as
>>> v2 in a few minutes).
>>>
>>> We might one day consider standardizing some generic kickout levels e.g.
>>> kick to "inner loop", "arch vcpu run", "generic vcpu run", "userspace",
>>> ... whatever levels fit *all* our use cases. And then let that kicks be
>>> implemented in an kvm_arch_* backend as it might be very different how
>>> they behave on different architectures."
>>>
>>> That would be ideal, yes. Two things make_all_requests handles: 
>>>
>>> 1) It disables preemption with get_cpu(), so it can reliably check for
>>> cpu id. Somehow you don't need that for s390 when kicking multiple
>>> vcpus?
>>>   
>>>       
>> I don't even need the cpuid as make_all_requests does, I just insert a  
>> special bit in the vcpu arch part and the vcpu will "come out to me 
>> (host)".
>> Fortunateley the kick is rare and fast so I can just insert it  
>> unconditionally (it's even ok to insert it if the vcpu is not in guest  
>> state). That prevents us from needing vcpu lock or detailed checks which  
>> would end up where we started (no guarantee that vcpu's come out of  
>> guest context while trying to aquire all vcpu locks)
>>     
>
> Let me see if I get this right: you kick the vcpus out of guest mode by
> using a special bit in the vcpu arch part. OK.
>
> What I don't understand is this: 
> "would end up where we started (no guarantee that vcpu's come out of
> guest context while trying to aquire all vcpu locks)"
>   
initially the mechanism looped over vcpu's and just aquired the vcpu 
lock and then updated the vcpu.arch infor directly.
Avi mentioned that we have no guarantee if/when the vcpu will come out 
of guest context to free a lock currently held and suggested the 
mechanism x86 uses via setting vcpu->request and kicking the vcpu. Thats 
the eason behind "end up where we (the discussion) started", if we would 
need the vcpu lock again we would be at the beginnign of the discussion.
> So you _need_ a mechanism to kick all vcpus out of guest mode?
>   
I have a mechanism to kick a vcpu, and I use it. Due to the fact that 
smp_call_* doesn't work as a kick for us, the kick is an arch specific function.
I hope that clarified this part :-)
>>> 2) It uses smp_call_function_many(wait=1), which guarantees that by the
>>> time make_all_requests returns no vcpus will be using stale data (the
>>> remote vcpus will have executed ack_flush).
>>>   
>>>       
>> yes, this is really a part that my s390 implementation doesn't fulfill yet.
>> Currently on return vcpus might still use the old memslot information.
>> As mentioned before, letting all interrupts come "too far" out of the hot
>> loop would be a performance issue, therefore I think I will need some
>> request&confirm mechanism. I'm not sure yet, but maybe it could be as
>> easy as this pseudo code example:
>>
>> # in make_all_requests
>> # remember we hold slots_lock for write here, and the reentry path
>> # that updates the vcpu specific data acquires slots_lock for read
>> loop vcpus
>>   set_bit in vcpu->requests
>>   kick vcpu            # arch function
>> endloop
>>
>> loop vcpus
>>   wait until the request bit has disappeared
>>   # as the reentry path uses test_and_clear, it will disappear
>> endloop
>>
>> That would be an implicit synchronization and should work; as I wrote
>> before, setting memslots while the guest is running is rare if ever
>> existent for s390. On x86 smp_call_function_many could then work without
>> the wait flag being set.
>>     
>
> I see, yes. 
>
>   
>> But I assume that this synchronization approach is slower, as it
>> serializes all vcpus on reentry (they wait for the slots_lock to get
>> dropped). Therefore I wanted to ask: how often does setting memslots at
>> runtime occur on x86? Would this approach be acceptable?
>>     
>
> For x86 we need slots_lock for two things:
>
> 1) to protect the memslot structures from changing (very rare), ie:
> kvm_set_memory.
>
> 2) to protect updates to the dirty bitmap (operations on behalf of
> guest) which take slots_lock for read versus updates to that dirty
> bitmap (an ioctl that retrieves what pages have been dirtied in the
> memslots, and clears the dirtyness info).
>
> All you need for S390 is 1), AFAICS.
>   
correct
> For 1), we can drop the slots_lock usage, but instead create an
> explicit synchronization point, where all vcpus are forced to (say
> kvm_vcpu_block) "paused" state. qemu-kvm has such notion.
>
> Same language?
>   
Yes, I think I got your point :-)
But I think by keeping slots_lock we already have our synchronization 
point and don't need an explicit one that adds extra code and maybe locks.
As I mentioned above, it should already synchronize implicitly. When I 
looked at it once more yesterday I realized that kvm_set_memory is not 
performance critical anyway (i.e. it does not have to be the fastest ioctl 
on earth), so we could be one step smarter: instead of serializing all 
vcpus among each other we could just let set_memory do it one by one.

In case I lost you again due to my obviously confusing mainframe 
language this week, you might want to look at my next patch submission, 
where I implement that in the s390 arch code as an example. I'll put you 
on cc, and in that new code we might find an implicit language 
synchronization for us :-)

[...]

-- 

Grüsse / regards, Christian Ehrhardt
IBM Linux Technology Center, Open Virtualization 


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 1/3] kvm-s390: infrastructure to kick vcpus out of guest state
  2009-05-28  7:59         ` Christian Ehrhardt
@ 2009-05-28  8:42           ` Avi Kivity
  2009-05-28 13:11             ` Christian Ehrhardt
  0 siblings, 1 reply; 18+ messages in thread
From: Avi Kivity @ 2009-05-28  8:42 UTC (permalink / raw)
  To: Christian Ehrhardt
  Cc: Marcelo Tosatti, kvm, borntraeger, cotte, heiko.carstens,
	schwidefsky

Christian Ehrhardt wrote:
>> So you _need_ a mechanism to kick all vcpus out of guest mode?
>>   
> I have a mechanism to kick a vcpu, and I use it. Due to the fact that 
> smp_call_* doesn't work as a kick for us, the kick is an arch specific 
> function.
> I hope that clarified this part :-)
>

You could still use make_all_vcpus_request(), just change 
smp_call_function_many() to your own kicker.

-- 
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.



* Re: [PATCH 1/3] kvm-s390: infrastructure to kick vcpus out of guest state
  2009-05-28  8:42           ` Avi Kivity
@ 2009-05-28 13:11             ` Christian Ehrhardt
  0 siblings, 0 replies; 18+ messages in thread
From: Christian Ehrhardt @ 2009-05-28 13:11 UTC (permalink / raw)
  To: Avi Kivity
  Cc: Marcelo Tosatti, kvm, borntraeger, cotte, heiko.carstens,
	schwidefsky

Avi Kivity wrote:
> Christian Ehrhardt wrote:
>>> So you _need_ a mechanism to kick all vcpus out of guest mode?
>>>   
>> I have a mechanism to kick a vcpu, and I use it. Due to the fact that 
>> smp_call_* doesn't work as a kick for us, the kick is an arch specific 
>> function.
>> I hope that clarified this part :-)
>>
>
> You could still use make_all_vcpus_request(), just change 
> smp_call_function_many() to your own kicker.
>
Yes, and I like this idea for further unification, but I don't want it 
mixed too much into the patches in discussion atm.
On one hand I have some problems giving my arch specific kick a 
behaviour like "return when the guest WAS kicked", and on the other hand 
I would e.g. also need to streamline the check in make_all_vcpus_request 
for which cpu is running, because vcpu->cpu stays -1 all the time on s390 
(it is never used).

Therefore I would unify things step by step and this way let single 
tasks come off my task pile here :-)

-- 

Grüsse / regards, Christian Ehrhardt
IBM Linux Technology Center, Open Virtualization 



end of thread, other threads:[~2009-05-28 13:11 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2009-05-20 13:34 [PATCH 0/3] kvm-s390: revised version of kvm-s390 guest memory handling ehrhardt
2009-05-20 13:34 ` [PATCH 1/3] kvm-s390: infrastructure to kick vcpus out of guest state ehrhardt
2009-05-20 13:34 ` [PATCH 2/3] kvm-s390: fix signal handling ehrhardt
2009-05-20 13:34 ` [PATCH 3/3] kvm-s390: streamline memslot handling ehrhardt
2009-05-24 14:39   ` Avi Kivity
2009-05-25  8:33     ` Christian Ehrhardt
2009-05-25 11:40       ` Christian Ehrhardt
2009-05-26  7:57       ` Avi Kivity
2009-05-26  8:31         ` Christian Bornträger
2009-05-26  9:27           ` Avi Kivity
2009-05-26 10:31             ` Christian Ehrhardt
  -- strict thread matches above, loose matches on Subject: below --
2009-05-25 11:40 [PATCH 0/3] kvm-s390: revised version of kvm-s390 guest memory handling - v2 ehrhardt
2009-05-25 11:40 ` [PATCH 1/3] kvm-s390: infrastructure to kick vcpus out of guest state ehrhardt
2009-05-25 20:22   ` Marcelo Tosatti
2009-05-26  8:02     ` Christian Ehrhardt
2009-05-28  3:44       ` Marcelo Tosatti
2009-05-28  7:59         ` Christian Ehrhardt
2009-05-28  8:42           ` Avi Kivity
2009-05-28 13:11             ` Christian Ehrhardt
