kvm.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v3 0/5] KVM: optimize userspace exits with a new ioctl
@ 2015-08-14 10:08 Radim Krčmář
  2015-08-14 10:08 ` [PATCH v3 1/5] KVM: add kvm_has_request wrapper Radim Krčmář
                   ` (6 more replies)
  0 siblings, 7 replies; 11+ messages in thread
From: Radim Krčmář @ 2015-08-14 10:08 UTC (permalink / raw)
  To: linux-kernel; +Cc: kvm, Paolo Bonzini, Christian Borntraeger

v3:
 * acked by Christian [1/5]
 * use ioctl argument directly (unsigned long as flags) [4/5]
 * precisely #ifdef arch-specific ioctls [5/5]
v2:
 * move request_exits debug counter patch right after introduction of
   KVM_REQ_EXIT [3/5]
 * use vcpu ioctl instead of vm one [4/5]
 * shrink kvm_user_exit from 64 to 32 bytes [4/5]
 * new [5/5]

QEMU uses SIGUSR1 to force a userspace exit and also to queue an early
exit before calling VCPU_RUN -- the signal is blocked in user space and
temporarily unblocked in VCPU_RUN.
The temporal unblocking by sigprocmask() in kvm_arch_vcpu_ioctl_run()
takes a shared siglock, which leads to cacheline bouncing in NUMA
systems.

This series allows the same with a new request bit and VM IOCTL that
marks and kicks target VCPU, hence no need to unblock.

inl_from_{pmtimer,qemu} vmexit benchmark from kvm-unit-tests shows ~5%
speedup for 1-4 VCPUs (300-2000 saved cycles) without noticeably
regressing kernel VM exits.
(Paolo did a quick run of older version of this series on a NUMA system
 and the speedup was around 35% when utilizing more nodes.)

Radim Krčmář (5):
  KVM: add kvm_has_request wrapper
  KVM: add KVM_REQ_EXIT request for userspace exit
  KVM: x86: add request_exits debug counter
  KVM: add KVM_USER_EXIT vcpu ioctl for userspace exit
  KVM: refactor asynchronous vcpu ioctl dispatch

 Documentation/virtual/kvm/api.txt | 25 +++++++++++++++++++++++++
 arch/x86/include/asm/kvm_host.h   |  1 +
 arch/x86/kvm/vmx.c                |  4 ++--
 arch/x86/kvm/x86.c                | 23 +++++++++++++++++++++++
 include/linux/kvm_host.h          | 15 +++++++++++++--
 include/uapi/linux/kvm.h          |  4 ++++
 virt/kvm/kvm_main.c               | 15 ++++++++++-----
 7 files changed, 78 insertions(+), 9 deletions(-)

-- 
2.5.0

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH v3 1/5] KVM: add kvm_has_request wrapper
  2015-08-14 10:08 [PATCH v3 0/5] KVM: optimize userspace exits with a new ioctl Radim Krčmář
@ 2015-08-14 10:08 ` Radim Krčmář
  2015-08-14 10:08 ` [PATCH v3 2/5] KVM: add KVM_REQ_EXIT request for userspace exit Radim Krčmář
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 11+ messages in thread
From: Radim Krčmář @ 2015-08-14 10:08 UTC (permalink / raw)
  To: linux-kernel; +Cc: kvm, Paolo Bonzini, Christian Borntraeger

We want to have requests abstracted from bit operations.

Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
---
 v3: acked by Christian

 arch/x86/kvm/vmx.c       | 2 +-
 include/linux/kvm_host.h | 7 ++++++-
 2 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 4cf25b90dbe0..40c6180a0ecb 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -5809,7 +5809,7 @@ static int handle_invalid_guest_state(struct kvm_vcpu *vcpu)
 		if (intr_window_requested && vmx_interrupt_allowed(vcpu))
 			return handle_interrupt_window(&vmx->vcpu);
 
-		if (test_bit(KVM_REQ_EVENT, &vcpu->requests))
+		if (kvm_has_request(KVM_REQ_EVENT, vcpu))
 			return 1;
 
 		err = emulate_instruction(vcpu, EMULTYPE_NO_REEXECUTE);
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 27ccdf91a465..52e388367a26 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -1089,9 +1089,14 @@ static inline void kvm_make_request(int req, struct kvm_vcpu *vcpu)
 	set_bit(req, &vcpu->requests);
 }
 
+static inline bool kvm_has_request(int req, struct kvm_vcpu *vcpu)
+{
+	return test_bit(req, &vcpu->requests);
+}
+
 static inline bool kvm_check_request(int req, struct kvm_vcpu *vcpu)
 {
-	if (test_bit(req, &vcpu->requests)) {
+	if (kvm_has_request(req, vcpu)) {
 		clear_bit(req, &vcpu->requests);
 		return true;
 	} else {
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH v3 2/5] KVM: add KVM_REQ_EXIT request for userspace exit
  2015-08-14 10:08 [PATCH v3 0/5] KVM: optimize userspace exits with a new ioctl Radim Krčmář
  2015-08-14 10:08 ` [PATCH v3 1/5] KVM: add kvm_has_request wrapper Radim Krčmář
@ 2015-08-14 10:08 ` Radim Krčmář
  2015-08-20  3:55   ` Wanpeng Li
  2015-08-14 10:08 ` [PATCH v3 3/5] KVM: x86: add request_exits debug counter Radim Krčmář
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 11+ messages in thread
From: Radim Krčmář @ 2015-08-14 10:08 UTC (permalink / raw)
  To: linux-kernel; +Cc: kvm, Paolo Bonzini, Christian Borntraeger

When userspace wants KVM to exit to userspace, it sends a signal.
This has a disadvantage of requiring a change to the signal mask because
the signal needs to be blocked in userspace to stay pending when sending
to self.

Using a request flag allows us to shave 200-300 cycles from every
userspace exit and the speedup grows with NUMA because unblocking
touches shared spinlock.

The disadvantage is that it adds an overhead of one bit check for all
kernel exits.  A quick tracing shows that the ratio of userspace exits
after boot is about 1/5 and in subsequent run of nmap and kernel compile
has about 1/60, so the check should not regress global performance.

All signal_pending() calls are userspace exit requests, so we add a
check for KVM_REQ_EXIT there.  There is one omitted call in kvm_vcpu_run
because KVM_REQ_EXIT is implied in earlier check for requests.

Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
---
 arch/x86/kvm/vmx.c       | 2 +-
 arch/x86/kvm/x86.c       | 6 ++++++
 include/linux/kvm_host.h | 8 +++++++-
 include/uapi/linux/kvm.h | 1 +
 virt/kvm/kvm_main.c      | 2 +-
 5 files changed, 16 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 40c6180a0ecb..2b789a869ef5 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -5833,7 +5833,7 @@ static int handle_invalid_guest_state(struct kvm_vcpu *vcpu)
 			goto out;
 		}
 
-		if (signal_pending(current))
+		if (kvm_need_exit(vcpu))
 			goto out;
 		if (need_resched())
 			schedule();
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index e5850076bf7b..c3df7733af09 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -6548,6 +6548,11 @@ static int vcpu_run(struct kvm_vcpu *vcpu)
 			++vcpu->stat.signal_exits;
 			break;
 		}
+		if (unlikely(kvm_has_request(KVM_REQ_EXIT, vcpu))) {
+			r = 0;
+			vcpu->run->exit_reason = KVM_EXIT_REQUEST;
+			break;
+		}
 		if (need_resched()) {
 			srcu_read_unlock(&kvm->srcu, vcpu->srcu_idx);
 			cond_resched();
@@ -6684,6 +6689,7 @@ out:
 	post_kvm_run_save(vcpu);
 	if (vcpu->sigset_active)
 		sigprocmask(SIG_SETMASK, &sigsaved, NULL);
+	clear_bit(KVM_REQ_EXIT, &vcpu->requests);
 
 	return r;
 }
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 52e388367a26..dcc57171e3ec 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -121,7 +121,7 @@ static inline bool is_error_page(struct page *page)
 #define KVM_REQ_UNHALT             6
 #define KVM_REQ_MMU_SYNC           7
 #define KVM_REQ_CLOCK_UPDATE       8
-#define KVM_REQ_KICK               9
+#define KVM_REQ_EXIT               9
 #define KVM_REQ_DEACTIVATE_FPU    10
 #define KVM_REQ_EVENT             11
 #define KVM_REQ_APF_HALT          12
@@ -1104,6 +1104,12 @@ static inline bool kvm_check_request(int req, struct kvm_vcpu *vcpu)
 	}
 }
 
+static inline bool kvm_need_exit(struct kvm_vcpu *vcpu)
+{
+	return signal_pending(current) ||
+	       kvm_has_request(KVM_REQ_EXIT, vcpu);
+}
+
 extern bool kvm_rebooting;
 
 struct kvm_device {
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 26daafbba9ec..d996a7cdb4d2 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -184,6 +184,7 @@ struct kvm_s390_skeys {
 #define KVM_EXIT_SYSTEM_EVENT     24
 #define KVM_EXIT_S390_STSI        25
 #define KVM_EXIT_IOAPIC_EOI       26
+#define KVM_EXIT_REQUEST          27
 
 /* For KVM_EXIT_INTERNAL_ERROR */
 /* Emulate instruction failed. */
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index d8db2f8fce9c..347899966178 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -1914,7 +1914,7 @@ static int kvm_vcpu_check_block(struct kvm_vcpu *vcpu)
 	}
 	if (kvm_cpu_has_pending_timer(vcpu))
 		return -EINTR;
-	if (signal_pending(current))
+	if (kvm_need_exit(vcpu))
 		return -EINTR;
 
 	return 0;
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH v3 3/5] KVM: x86: add request_exits debug counter
  2015-08-14 10:08 [PATCH v3 0/5] KVM: optimize userspace exits with a new ioctl Radim Krčmář
  2015-08-14 10:08 ` [PATCH v3 1/5] KVM: add kvm_has_request wrapper Radim Krčmář
  2015-08-14 10:08 ` [PATCH v3 2/5] KVM: add KVM_REQ_EXIT request for userspace exit Radim Krčmář
@ 2015-08-14 10:08 ` Radim Krčmář
  2015-08-14 10:08 ` [PATCH v3 4/5] KVM: add KVM_USER_EXIT vcpu ioctl for userspace exit Radim Krčmář
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 11+ messages in thread
From: Radim Krčmář @ 2015-08-14 10:08 UTC (permalink / raw)
  To: linux-kernel; +Cc: kvm, Paolo Bonzini, Christian Borntraeger

We are still interested in the amount of exits userspace requested and
signal_exits doesn't cover that anymore.

Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
---
 v2: move request_exits debug counter patch right after introduction of
     KVM_REQ_EXIT

 arch/x86/include/asm/kvm_host.h | 1 +
 arch/x86/kvm/x86.c              | 2 ++
 2 files changed, 3 insertions(+)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 09acaa64ef8e..95c05a3d02d4 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -729,6 +729,7 @@ struct kvm_vcpu_stat {
 	u32 hypercalls;
 	u32 irq_injections;
 	u32 nmi_injections;
+	u32 request_exits;
 };
 
 struct x86_instruction_info;
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index c3df7733af09..37db1b32684a 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -145,6 +145,7 @@ struct kvm_stats_debugfs_item debugfs_entries[] = {
 	{ "io_exits", VCPU_STAT(io_exits) },
 	{ "mmio_exits", VCPU_STAT(mmio_exits) },
 	{ "signal_exits", VCPU_STAT(signal_exits) },
+	{ "request_exits", VCPU_STAT(request_exits) },
 	{ "irq_window", VCPU_STAT(irq_window_exits) },
 	{ "nmi_window", VCPU_STAT(nmi_window_exits) },
 	{ "halt_exits", VCPU_STAT(halt_exits) },
@@ -6551,6 +6552,7 @@ static int vcpu_run(struct kvm_vcpu *vcpu)
 		if (unlikely(kvm_has_request(KVM_REQ_EXIT, vcpu))) {
 			r = 0;
 			vcpu->run->exit_reason = KVM_EXIT_REQUEST;
+			++vcpu->stat.request_exits;
 			break;
 		}
 		if (need_resched()) {
-- 
2.5.0


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH v3 4/5] KVM: add KVM_USER_EXIT vcpu ioctl for userspace exit
  2015-08-14 10:08 [PATCH v3 0/5] KVM: optimize userspace exits with a new ioctl Radim Krčmář
                   ` (2 preceding siblings ...)
  2015-08-14 10:08 ` [PATCH v3 3/5] KVM: x86: add request_exits debug counter Radim Krčmář
@ 2015-08-14 10:08 ` Radim Krčmář
  2015-08-14 10:08 ` [PATCH v3 5/5] KVM: refactor asynchronous vcpu ioctl dispatch Radim Krčmář
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 11+ messages in thread
From: Radim Krčmář @ 2015-08-14 10:08 UTC (permalink / raw)
  To: linux-kernel; +Cc: kvm, Paolo Bonzini, Christian Borntraeger

The guest can use KVM_USER_EXIT instead of a signal-based exiting to
userspace.  Availability depends on KVM_CAP_USER_EXIT.
Only x86 is implemented so far.

Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
---
 v3:
  * use ioctl argument directly (unsigned long as flags) [Paolo]
 v2:
  * use vcpu ioctl instead of vm one [Paolo]
  * shrink kvm_user_exit from 64 to 32 bytes

 Documentation/virtual/kvm/api.txt | 25 +++++++++++++++++++++++++
 arch/x86/kvm/x86.c                | 15 +++++++++++++++
 include/uapi/linux/kvm.h          |  3 +++
 virt/kvm/kvm_main.c               |  5 +++--
 4 files changed, 46 insertions(+), 2 deletions(-)

diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
index 3c714d43a717..df087ff3c5b6 100644
--- a/Documentation/virtual/kvm/api.txt
+++ b/Documentation/virtual/kvm/api.txt
@@ -3020,6 +3020,31 @@ Returns: 0 on success, -1 on error
 
 Queues an SMI on the thread's vcpu.
 
+
+4.97 KVM_USER_EXIT
+
+Capability: KVM_CAP_USER_EXIT
+Architectures: x86
+Type: vcpu ioctl
+Parameters: unsigned long flags (in)
+Returns: 0 on success,
+         -EINVAL if flags is not 0
+
+The ioctl is asynchronous to VCPU execution and can be issued from all threads.
+
+Make vcpu_id exit to userspace as soon as possible.  If the VCPU is not running
+in kernel at the time, it will exit early on the next call to KVM_RUN.
+If the VCPU was going to exit because of other reasons when KVM_USER_EXIT was
+issued, it will keep the original exit reason without exiting early on next
+KVM_RUN.
+If VCPU exited because of KVM_USER_EXIT, the exit reason is KVM_EXIT_REQUEST.
+
+This ioctl has very similar effect (same sans some races on userspace exit) as
+sending a signal (that is blocked in userspace and set in KVM_SET_SIGNAL_MASK)
+to the VCPU thread.
+
+
+
 5. The kvm_run structure
 ------------------------
 
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 37db1b32684a..d985806b17b1 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -2467,6 +2467,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
 	case KVM_CAP_ASSIGN_DEV_IRQ:
 	case KVM_CAP_PCI_2_3:
 #endif
+	case KVM_CAP_USER_EXIT:
 		r = 1;
 		break;
 	case KVM_CAP_X86_SMM:
@@ -3078,6 +3079,17 @@ static int kvm_set_guest_paused(struct kvm_vcpu *vcpu)
 	return 0;
 }
 
+static int kvm_vcpu_ioctl_user_exit(struct kvm_vcpu *vcpu, unsigned long flags)
+{
+	if (flags != 0)
+		return -EINVAL;
+
+	kvm_make_request(KVM_REQ_EXIT, vcpu);
+	kvm_vcpu_kick(vcpu);
+
+	return 0;
+}
+
 long kvm_arch_vcpu_ioctl(struct file *filp,
 			 unsigned int ioctl, unsigned long arg)
 {
@@ -3342,6 +3354,9 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
 		r = kvm_set_guest_paused(vcpu);
 		goto out;
 	}
+	case KVM_USER_EXIT:
+		r = kvm_vcpu_ioctl_user_exit(vcpu, arg);
+		break;
 	default:
 		r = -EINVAL;
 	}
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index d996a7cdb4d2..58b3a07adc81 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -826,6 +826,7 @@ struct kvm_ppc_smmu_info {
 #define KVM_CAP_X86_SMM 117
 #define KVM_CAP_MULTI_ADDRESS_SPACE 118
 #define KVM_CAP_SPLIT_IRQCHIP 119
+#define KVM_CAP_USER_EXIT 120
 
 #ifdef KVM_CAP_IRQ_ROUTING
 
@@ -1213,6 +1214,8 @@ struct kvm_s390_ucas_mapping {
 #define KVM_S390_GET_IRQ_STATE	  _IOW(KVMIO, 0xb6, struct kvm_s390_irq_state)
 /* Available with KVM_CAP_X86_SMM */
 #define KVM_SMI                   _IO(KVMIO,   0xb7)
+/* Available with KVM_CAP_USER_EXIT */
+#define KVM_USER_EXIT             _IO(KVMIO,   0xb8)
 
 #define KVM_DEV_ASSIGN_ENABLE_IOMMU	(1 << 0)
 #define KVM_DEV_ASSIGN_PCI_2_3		(1 << 1)
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 347899966178..dfa2d5f27713 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -2251,15 +2251,16 @@ static long kvm_vcpu_ioctl(struct file *filp,
 	if (unlikely(_IOC_TYPE(ioctl) != KVMIO))
 		return -EINVAL;
 
-#if defined(CONFIG_S390) || defined(CONFIG_PPC) || defined(CONFIG_MIPS)
 	/*
 	 * Special cases: vcpu ioctls that are asynchronous to vcpu execution,
 	 * so vcpu_load() would break it.
 	 */
+#if defined(CONFIG_S390) || defined(CONFIG_PPC) || defined(CONFIG_MIPS)
 	if (ioctl == KVM_S390_INTERRUPT || ioctl == KVM_S390_IRQ || ioctl == KVM_INTERRUPT)
 		return kvm_arch_vcpu_ioctl(filp, ioctl, arg);
 #endif
-
+	if (ioctl == KVM_USER_EXIT)
+		return kvm_arch_vcpu_ioctl(filp, ioctl, arg);
 
 	r = vcpu_load(vcpu);
 	if (r)
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH v3 5/5] KVM: refactor asynchronous vcpu ioctl dispatch
  2015-08-14 10:08 [PATCH v3 0/5] KVM: optimize userspace exits with a new ioctl Radim Krčmář
                   ` (3 preceding siblings ...)
  2015-08-14 10:08 ` [PATCH v3 4/5] KVM: add KVM_USER_EXIT vcpu ioctl for userspace exit Radim Krčmář
@ 2015-08-14 10:08 ` Radim Krčmář
  2015-08-14 23:37 ` [PATCH v3 0/5] KVM: optimize userspace exits with a new ioctl Paolo Bonzini
  2015-09-02 10:31 ` Christian Borntraeger
  6 siblings, 0 replies; 11+ messages in thread
From: Radim Krčmář @ 2015-08-14 10:08 UTC (permalink / raw)
  To: linux-kernel; +Cc: kvm, Paolo Bonzini, Christian Borntraeger

I find the switch easier to read and modify.

Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
---
 v3: precisely #ifdef arch-specific ioctls [Christian]
 v2: new

 virt/kvm/kvm_main.c | 12 ++++++++----
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index dfa2d5f27713..c059c01161fe 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -2255,12 +2255,16 @@ static long kvm_vcpu_ioctl(struct file *filp,
 	 * Special cases: vcpu ioctls that are asynchronous to vcpu execution,
 	 * so vcpu_load() would break it.
 	 */
-#if defined(CONFIG_S390) || defined(CONFIG_PPC) || defined(CONFIG_MIPS)
-	if (ioctl == KVM_S390_INTERRUPT || ioctl == KVM_S390_IRQ || ioctl == KVM_INTERRUPT)
-		return kvm_arch_vcpu_ioctl(filp, ioctl, arg);
+	switch (ioctl) {
+#if defined(CONFIG_S390)
+	case KVM_S390_INTERRUPT:
+	case KVM_S390_IRQ:
+#elif defined(CONFIG_PPC) || defined(CONFIG_MIPS)
+	case KVM_INTERRUPT:
 #endif
-	if (ioctl == KVM_USER_EXIT)
+	case KVM_USER_EXIT:
 		return kvm_arch_vcpu_ioctl(filp, ioctl, arg);
+	}
 
 	r = vcpu_load(vcpu);
 	if (r)
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH v3 0/5] KVM: optimize userspace exits with a new ioctl
  2015-08-14 10:08 [PATCH v3 0/5] KVM: optimize userspace exits with a new ioctl Radim Krčmář
                   ` (4 preceding siblings ...)
  2015-08-14 10:08 ` [PATCH v3 5/5] KVM: refactor asynchronous vcpu ioctl dispatch Radim Krčmář
@ 2015-08-14 23:37 ` Paolo Bonzini
  2015-09-02 10:31 ` Christian Borntraeger
  6 siblings, 0 replies; 11+ messages in thread
From: Paolo Bonzini @ 2015-08-14 23:37 UTC (permalink / raw)
  To: Radim Krčmář, linux-kernel; +Cc: kvm, Christian Borntraeger



On 14/08/2015 12:08, Radim Krčmář wrote:
> v3:
>  * acked by Christian [1/5]
>  * use ioctl argument directly (unsigned long as flags) [4/5]
>  * precisely #ifdef arch-specific ioctls [5/5]
> v2:
>  * move request_exits debug counter patch right after introduction of
>    KVM_REQ_EXIT [3/5]
>  * use vcpu ioctl instead of vm one [4/5]
>  * shrink kvm_user_exit from 64 to 32 bytes [4/5]
>  * new [5/5]
> 
> QEMU uses SIGUSR1 to force a userspace exit and also to queue an early
> exit before calling VCPU_RUN -- the signal is blocked in user space and
> temporarily unblocked in VCPU_RUN.
> The temporal unblocking by sigprocmask() in kvm_arch_vcpu_ioctl_run()
> takes a shared siglock, which leads to cacheline bouncing in NUMA
> systems.
> 
> This series allows the same with a new request bit and VM IOCTL that
> marks and kicks target VCPU, hence no need to unblock.
> 
> inl_from_{pmtimer,qemu} vmexit benchmark from kvm-unit-tests shows ~5%
> speedup for 1-4 VCPUs (300-2000 saved cycles) without noticeably
> regressing kernel VM exits.
> (Paolo did a quick run of older version of this series on a NUMA system
>  and the speedup was around 35% when utilizing more nodes.)
> 
> Radim Krčmář (5):
>   KVM: add kvm_has_request wrapper
>   KVM: add KVM_REQ_EXIT request for userspace exit
>   KVM: x86: add request_exits debug counter
>   KVM: add KVM_USER_EXIT vcpu ioctl for userspace exit
>   KVM: refactor asynchronous vcpu ioctl dispatch
> 
>  Documentation/virtual/kvm/api.txt | 25 +++++++++++++++++++++++++
>  arch/x86/include/asm/kvm_host.h   |  1 +
>  arch/x86/kvm/vmx.c                |  4 ++--
>  arch/x86/kvm/x86.c                | 23 +++++++++++++++++++++++
>  include/linux/kvm_host.h          | 15 +++++++++++++--
>  include/uapi/linux/kvm.h          |  4 ++++
>  virt/kvm/kvm_main.c               | 15 ++++++++++-----
>  7 files changed, 78 insertions(+), 9 deletions(-)
> 

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>

... however, we still need to decide what to do about machine-check
exceptions before enabling the capability, otherwise we'd need a new
KVM_CAP_USER_EXIT_MCE capability in the future.  So I'm holding up the
patches for now.

Paolo

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v3 2/5] KVM: add KVM_REQ_EXIT request for userspace exit
  2015-08-14 10:08 ` [PATCH v3 2/5] KVM: add KVM_REQ_EXIT request for userspace exit Radim Krčmář
@ 2015-08-20  3:55   ` Wanpeng Li
  2015-08-22  7:04     ` Paolo Bonzini
  0 siblings, 1 reply; 11+ messages in thread
From: Wanpeng Li @ 2015-08-20  3:55 UTC (permalink / raw)
  To: Radim Krčmář, linux-kernel
  Cc: kvm, Paolo Bonzini, Christian Borntraeger

On 8/14/15 6:08 PM, Radim Krčmář wrote:
> When userspace wants KVM to exit to userspace, it sends a signal.
> This has a disadvantage of requiring a change to the signal mask because
> the signal needs to be blocked in userspace to stay pending when sending
> to self.
>
> Using a request flag allows us to shave 200-300 cycles from every
> userspace exit and the speedup grows with NUMA because unblocking
> touches shared spinlock.
>
> The disadvantage is that it adds an overhead of one bit check for all
> kernel exits.  A quick tracing shows that the ratio of userspace exits
> after boot is about 1/5 and in subsequent run of nmap and kernel compile
> has about 1/60, so the check should not regress global performance.
>
> All signal_pending() calls are userspace exit requests, so we add a
> check for KVM_REQ_EXIT there.  There is one omitted call in kvm_vcpu_run
> because KVM_REQ_EXIT is implied in earlier check for requests.

Actually I see more SIGUSR1 signals are intercepted by signal_pending() 
in vcpu_enter_guest() and vcpu_run() w/ win7 guest and kernel_irqchip=off.

Regards,
Wanpeng Li

>
> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
> ---
>   arch/x86/kvm/vmx.c       | 2 +-
>   arch/x86/kvm/x86.c       | 6 ++++++
>   include/linux/kvm_host.h | 8 +++++++-
>   include/uapi/linux/kvm.h | 1 +
>   virt/kvm/kvm_main.c      | 2 +-
>   5 files changed, 16 insertions(+), 3 deletions(-)
>
> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> index 40c6180a0ecb..2b789a869ef5 100644
> --- a/arch/x86/kvm/vmx.c
> +++ b/arch/x86/kvm/vmx.c
> @@ -5833,7 +5833,7 @@ static int handle_invalid_guest_state(struct kvm_vcpu *vcpu)
>   			goto out;
>   		}
>   
> -		if (signal_pending(current))
> +		if (kvm_need_exit(vcpu))
>   			goto out;
>   		if (need_resched())
>   			schedule();
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index e5850076bf7b..c3df7733af09 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -6548,6 +6548,11 @@ static int vcpu_run(struct kvm_vcpu *vcpu)
>   			++vcpu->stat.signal_exits;
>   			break;
>   		}
> +		if (unlikely(kvm_has_request(KVM_REQ_EXIT, vcpu))) {
> +			r = 0;
> +			vcpu->run->exit_reason = KVM_EXIT_REQUEST;
> +			break;
> +		}
>   		if (need_resched()) {
>   			srcu_read_unlock(&kvm->srcu, vcpu->srcu_idx);
>   			cond_resched();
> @@ -6684,6 +6689,7 @@ out:
>   	post_kvm_run_save(vcpu);
>   	if (vcpu->sigset_active)
>   		sigprocmask(SIG_SETMASK, &sigsaved, NULL);
> +	clear_bit(KVM_REQ_EXIT, &vcpu->requests);
>   
>   	return r;
>   }
> diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> index 52e388367a26..dcc57171e3ec 100644
> --- a/include/linux/kvm_host.h
> +++ b/include/linux/kvm_host.h
> @@ -121,7 +121,7 @@ static inline bool is_error_page(struct page *page)
>   #define KVM_REQ_UNHALT             6
>   #define KVM_REQ_MMU_SYNC           7
>   #define KVM_REQ_CLOCK_UPDATE       8
> -#define KVM_REQ_KICK               9
> +#define KVM_REQ_EXIT               9
>   #define KVM_REQ_DEACTIVATE_FPU    10
>   #define KVM_REQ_EVENT             11
>   #define KVM_REQ_APF_HALT          12
> @@ -1104,6 +1104,12 @@ static inline bool kvm_check_request(int req, struct kvm_vcpu *vcpu)
>   	}
>   }
>   
> +static inline bool kvm_need_exit(struct kvm_vcpu *vcpu)
> +{
> +	return signal_pending(current) ||
> +	       kvm_has_request(KVM_REQ_EXIT, vcpu);
> +}
> +
>   extern bool kvm_rebooting;
>   
>   struct kvm_device {
> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
> index 26daafbba9ec..d996a7cdb4d2 100644
> --- a/include/uapi/linux/kvm.h
> +++ b/include/uapi/linux/kvm.h
> @@ -184,6 +184,7 @@ struct kvm_s390_skeys {
>   #define KVM_EXIT_SYSTEM_EVENT     24
>   #define KVM_EXIT_S390_STSI        25
>   #define KVM_EXIT_IOAPIC_EOI       26
> +#define KVM_EXIT_REQUEST          27
>   
>   /* For KVM_EXIT_INTERNAL_ERROR */
>   /* Emulate instruction failed. */
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index d8db2f8fce9c..347899966178 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -1914,7 +1914,7 @@ static int kvm_vcpu_check_block(struct kvm_vcpu *vcpu)
>   	}
>   	if (kvm_cpu_has_pending_timer(vcpu))
>   		return -EINTR;
> -	if (signal_pending(current))
> +	if (kvm_need_exit(vcpu))
>   		return -EINTR;
>   
>   	return 0;


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v3 2/5] KVM: add KVM_REQ_EXIT request for userspace exit
  2015-08-20  3:55   ` Wanpeng Li
@ 2015-08-22  7:04     ` Paolo Bonzini
  0 siblings, 0 replies; 11+ messages in thread
From: Paolo Bonzini @ 2015-08-22  7:04 UTC (permalink / raw)
  To: Wanpeng Li, Radim Krčmář, linux-kernel
  Cc: kvm, Christian Borntraeger



On 19/08/2015 20:55, Wanpeng Li wrote:
>> The disadvantage is that it adds an overhead of one bit check for all
>> kernel exits.  A quick tracing shows that the ratio of userspace exits
>> after boot is about 1/5 and in subsequent run of nmap and kernel compile
>> has about 1/60, so the check should not regress global performance.
>>
>> All signal_pending() calls are userspace exit requests, so we add a
>> check for KVM_REQ_EXIT there.  There is one omitted call in kvm_vcpu_run
>> because KVM_REQ_EXIT is implied in earlier check for requests.
> 
> Actually I see more SIGUSR1 signals are intercepted by signal_pending()
> in vcpu_enter_guest() and vcpu_run() w/ win7 guest and kernel_irqchip=off.

You need more patches on the QEMU side.  I tested a version that is
mostly okay but not ready for upstream inclusion.

Paolo

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v3 0/5] KVM: optimize userspace exits with a new ioctl
  2015-08-14 10:08 [PATCH v3 0/5] KVM: optimize userspace exits with a new ioctl Radim Krčmář
                   ` (5 preceding siblings ...)
  2015-08-14 23:37 ` [PATCH v3 0/5] KVM: optimize userspace exits with a new ioctl Paolo Bonzini
@ 2015-09-02 10:31 ` Christian Borntraeger
  2015-09-07 12:33   ` Paolo Bonzini
  6 siblings, 1 reply; 11+ messages in thread
From: Christian Borntraeger @ 2015-09-02 10:31 UTC (permalink / raw)
  To: Radim Krčmář, linux-kernel; +Cc: kvm, Paolo Bonzini

Am 14.08.2015 um 12:08 schrieb Radim Krčmář:
> v3:
>  * acked by Christian [1/5]
>  * use ioctl argument directly (unsigned long as flags) [4/5]
>  * precisely #ifdef arch-specific ioctls [5/5]
> v2:
>  * move request_exits debug counter patch right after introduction of
>    KVM_REQ_EXIT [3/5]
>  * use vcpu ioctl instead of vm one [4/5]
>  * shrink kvm_user_exit from 64 to 32 bytes [4/5]
>  * new [5/5]
> 
> QEMU uses SIGUSR1 to force a userspace exit and also to queue an early
> exit before calling VCPU_RUN -- the signal is blocked in user space and
> temporarily unblocked in VCPU_RUN.
> The temporal unblocking by sigprocmask() in kvm_arch_vcpu_ioctl_run()
> takes a shared siglock, which leads to cacheline bouncing in NUMA
> systems.
> 
> This series allows the same with a new request bit and VM IOCTL that
> marks and kicks target VCPU, hence no need to unblock.
> 
> inl_from_{pmtimer,qemu} vmexit benchmark from kvm-unit-tests shows ~5%
> speedup for 1-4 VCPUs (300-2000 saved cycles) without noticeably
> regressing kernel VM exits.
> (Paolo did a quick run of older version of this series on a NUMA system
>  and the speedup was around 35% when utilizing more nodes.)
> 
> Radim Krčmář (5):
>   KVM: add kvm_has_request wrapper
>   KVM: add KVM_REQ_EXIT request for userspace exit
>   KVM: x86: add request_exits debug counter
>   KVM: add KVM_USER_EXIT vcpu ioctl for userspace exit
>   KVM: refactor asynchronous vcpu ioctl dispatch
> 
>  Documentation/virtual/kvm/api.txt | 25 +++++++++++++++++++++++++
>  arch/x86/include/asm/kvm_host.h   |  1 +
>  arch/x86/kvm/vmx.c                |  4 ++--
>  arch/x86/kvm/x86.c                | 23 +++++++++++++++++++++++
>  include/linux/kvm_host.h          | 15 +++++++++++++--
>  include/uapi/linux/kvm.h          |  4 ++++
>  virt/kvm/kvm_main.c               | 15 ++++++++++-----
>  7 files changed, 78 insertions(+), 9 deletions(-)
> 

As far as I can see this should also work for s390 (when we implement 
REQ_EXIT handling)

To double check my understanding: these improvements come with a changed
userspace that does not use KVM_SET_SIGNAL_MASK and therefore this
conditional is false
	if (vcpu->sigset_active)
		sigprocmask(SIG_SETMASK, &vcpu->sigset, &sigsaved);
Correct?

That also means we improve exits that go to userspace but lightweight exits
that stay inside the kernel are not affected. 

Correct?

Christian


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v3 0/5] KVM: optimize userspace exits with a new ioctl
  2015-09-02 10:31 ` Christian Borntraeger
@ 2015-09-07 12:33   ` Paolo Bonzini
  0 siblings, 0 replies; 11+ messages in thread
From: Paolo Bonzini @ 2015-09-07 12:33 UTC (permalink / raw)
  To: Christian Borntraeger, Radim Krčmář, linux-kernel; +Cc: kvm



On 02/09/2015 12:31, Christian Borntraeger wrote:
> As far as I can see this should also work for s390 (when we implement 
> REQ_EXIT handling)
> 
> To double check my understanding: these improvements come with a changed
> userspace that does not use KVM_SET_SIGNAL_MASK and therefore this
> conditional is false
> 	if (vcpu->sigset_active)
> 		sigprocmask(SIG_SETMASK, &vcpu->sigset, &sigsaved);
> Correct?
> 
> That also means we improve exits that go to userspace but lightweight exits
> that stay inside the kernel are not affected. 

Yes, both observations are correct.

Paolo

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2015-09-07 12:33 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2015-08-14 10:08 [PATCH v3 0/5] KVM: optimize userspace exits with a new ioctl Radim Krčmář
2015-08-14 10:08 ` [PATCH v3 1/5] KVM: add kvm_has_request wrapper Radim Krčmář
2015-08-14 10:08 ` [PATCH v3 2/5] KVM: add KVM_REQ_EXIT request for userspace exit Radim Krčmář
2015-08-20  3:55   ` Wanpeng Li
2015-08-22  7:04     ` Paolo Bonzini
2015-08-14 10:08 ` [PATCH v3 3/5] KVM: x86: add request_exits debug counter Radim Krčmář
2015-08-14 10:08 ` [PATCH v3 4/5] KVM: add KVM_USER_EXIT vcpu ioctl for userspace exit Radim Krčmář
2015-08-14 10:08 ` [PATCH v3 5/5] KVM: refactor asynchronous vcpu ioctl dispatch Radim Krčmář
2015-08-14 23:37 ` [PATCH v3 0/5] KVM: optimize userspace exits with a new ioctl Paolo Bonzini
2015-09-02 10:31 ` Christian Borntraeger
2015-09-07 12:33   ` Paolo Bonzini

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).