* [PATCH 0/2] Adaptive halt-polling toggle
@ 2015-09-01 21:41 David Matlack
2015-09-01 21:41 ` [PATCH 1/2] KVM: make halt_poll_ns per-VCPU David Matlack
2015-09-01 21:41 ` [PATCH 2/2] kvm: adaptive halt-polling toggle David Matlack
0 siblings, 2 replies; 3+ messages in thread
From: David Matlack @ 2015-09-01 21:41 UTC (permalink / raw)
To: kvm; +Cc: linux-kernel, pbonzini, wanpeng.li, peter, David Matlack
This patchset adds a dynamic on/off switch for halt-polling. On its own
it gets good performance for both idle and Message Passing workloads.
                         no-poll   always-poll   adaptive-toggle
---------------------------------------------------------------------
Idle (nohz) VCPU %c0       0.12       0.32             0.15
Idle (250HZ) VCPU %c0      1.22       6.35             1.27
TCP_RR latency            39 us      25 us            25 us

(3.16 Linux guest, halt_poll_ns=200000)
"Idle (X) VCPU %c0" is the percent of time the physical cpu spent in
c0 over 60 seconds (each VCPU is pinned to a PCPU). (nohz) means the
guest was tickless. (250HZ) means the guest was ticking at 250HZ.
The big win is with ticking operating systems. Running the Linux guest
with nohz=off (and HZ=250), the adaptive toggle saves 5% of a physical
CPU and gets close to no-polling overhead levels. The savings should be
even higher for higher-frequency ticks.
Since polling now has low idle overhead, halt_poll_ns defaults to
200000 instead of 0. We can increase halt_poll_ns a bit more once we
have dynamic halt-polling window adjustments (Wanpeng's patch). We
should, however, keep halt_poll_ns below 1 ms, since that is the tick
period used by Windows.
David Matlack (1):
kvm: adaptive halt-polling toggle
Wanpeng Li (1):
KVM: make halt_poll_ns per-VCPU
include/linux/kvm_host.h | 1 +
include/trace/events/kvm.h | 23 ++++++----
virt/kvm/kvm_main.c | 111 ++++++++++++++++++++++++++++++++++-----------
3 files changed, 99 insertions(+), 36 deletions(-)
--
2.5.0.457.gab17608
^ permalink raw reply [flat|nested] 3+ messages in thread
* [PATCH 1/2] KVM: make halt_poll_ns per-VCPU
2015-09-01 21:41 [PATCH 0/2] Adaptive halt-polling toggle David Matlack
@ 2015-09-01 21:41 ` David Matlack
2015-09-01 21:41 ` [PATCH 2/2] kvm: adaptive halt-polling toggle David Matlack
1 sibling, 0 replies; 3+ messages in thread
From: David Matlack @ 2015-09-01 21:41 UTC (permalink / raw)
To: kvm; +Cc: linux-kernel, pbonzini, wanpeng.li, peter
From: Wanpeng Li <wanpeng.li@hotmail.com>
Change halt_poll_ns into a per-VCPU variable, seeded from the module
parameter, to allow greater flexibility.
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
---
include/linux/kvm_host.h | 1 +
virt/kvm/kvm_main.c | 5 +++--
2 files changed, 4 insertions(+), 2 deletions(-)
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 05e99b8..382cbef 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -241,6 +241,7 @@ struct kvm_vcpu {
int sigset_active;
sigset_t sigset;
struct kvm_vcpu_stat stat;
+ unsigned int halt_poll_ns;
#ifdef CONFIG_HAS_IOMEM
int mmio_needed;
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 8b8a444..977ffb1 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -217,6 +217,7 @@ int kvm_vcpu_init(struct kvm_vcpu *vcpu, struct kvm *kvm, unsigned id)
vcpu->kvm = kvm;
vcpu->vcpu_id = id;
vcpu->pid = NULL;
+ vcpu->halt_poll_ns = 0;
init_waitqueue_head(&vcpu->wq);
kvm_async_pf_vcpu_init(vcpu);
@@ -1930,8 +1931,8 @@ void kvm_vcpu_block(struct kvm_vcpu *vcpu)
bool waited = false;
start = cur = ktime_get();
- if (halt_poll_ns) {
- ktime_t stop = ktime_add_ns(ktime_get(), halt_poll_ns);
+ if (vcpu->halt_poll_ns) {
+ ktime_t stop = ktime_add_ns(ktime_get(), vcpu->halt_poll_ns);
do {
/*
--
2.5.0.457.gab17608
* [PATCH 2/2] kvm: adaptive halt-polling toggle
2015-09-01 21:41 [PATCH 0/2] Adaptive halt-polling toggle David Matlack
2015-09-01 21:41 ` [PATCH 1/2] KVM: make halt_poll_ns per-VCPU David Matlack
@ 2015-09-01 21:41 ` David Matlack
1 sibling, 0 replies; 3+ messages in thread
From: David Matlack @ 2015-09-01 21:41 UTC (permalink / raw)
To: kvm; +Cc: linux-kernel, pbonzini, wanpeng.li, peter, David Matlack
This patch removes almost all of the overhead of polling for idle VCPUs
by disabling polling for long halts. The length of the previous halt
is used as a predictor for the current halt:
if (length of previous halt < halt_poll_ns): poll for halt_poll_ns
else: don't poll
This tends to work well in practice. For VMs running Message Passing
workloads, all halts are short and so the VCPU should always poll. When
a VCPU is idle, all halts are long and so the VCPU should never poll.
Experimental results on an IvyBridge host show adaptive toggling gets
close to the best of both worlds.
                         no-poll   always-poll   adaptive-toggle
---------------------------------------------------------------------
Idle (nohz) VCPU %c0       0.12       0.32             0.15
Idle (250HZ) VCPU %c0      1.22       6.35             1.27
TCP_RR latency            39 us      25 us            25 us

(3.16 Linux guest, halt_poll_ns=200000)
The big win is with ticking operating systems. Running the Linux guest
with nohz=off (and HZ=250), the adaptive toggle saves 5% of a physical
CPU and gets close to no-polling overhead levels. The savings should be
even higher for higher-frequency ticks.
Signed-off-by: David Matlack <dmatlack@google.com>
---
include/trace/events/kvm.h | 23 ++++++----
virt/kvm/kvm_main.c | 110 ++++++++++++++++++++++++++++++++++-----------
2 files changed, 97 insertions(+), 36 deletions(-)
diff --git a/include/trace/events/kvm.h b/include/trace/events/kvm.h
index a44062d..34e0b11 100644
--- a/include/trace/events/kvm.h
+++ b/include/trace/events/kvm.h
@@ -38,22 +38,27 @@ TRACE_EVENT(kvm_userspace_exit,
);
TRACE_EVENT(kvm_vcpu_wakeup,
- TP_PROTO(__u64 ns, bool waited),
- TP_ARGS(ns, waited),
+ TP_PROTO(bool poll, bool success, __u64 poll_ns, __u64 wait_ns),
+ TP_ARGS(poll, success, poll_ns, wait_ns),
TP_STRUCT__entry(
- __field( __u64, ns )
- __field( bool, waited )
+ __field( bool, poll )
+ __field( bool, success )
+ __field( __u64, poll_ns )
+ __field( __u64, wait_ns )
),
TP_fast_assign(
- __entry->ns = ns;
- __entry->waited = waited;
+ __entry->poll = poll;
+ __entry->success = success;
+ __entry->poll_ns = poll_ns;
+ __entry->wait_ns = wait_ns;
),
- TP_printk("%s time %lld ns",
- __entry->waited ? "wait" : "poll",
- __entry->ns)
+ TP_printk("%s %s, poll ns %lld, wait ns %lld",
+ __entry->poll ? "poll" : "wait",
+ __entry->success ? "success" : "fail",
+ __entry->poll_ns, __entry->wait_ns)
);
#if defined(CONFIG_HAVE_KVM_IRQFD)
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 977ffb1..3a66694 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -66,7 +66,8 @@
MODULE_AUTHOR("Qumranet");
MODULE_LICENSE("GPL");
-static unsigned int halt_poll_ns;
+/* The maximum amount of time a vcpu will poll for interrupts while halted. */
+static unsigned int halt_poll_ns = 200000;
module_param(halt_poll_ns, uint, S_IRUGO | S_IWUSR);
/*
@@ -1907,6 +1908,7 @@ void kvm_vcpu_mark_page_dirty(struct kvm_vcpu *vcpu, gfn_t gfn)
}
EXPORT_SYMBOL_GPL(kvm_vcpu_mark_page_dirty);
+/* This sets KVM_REQ_UNHALT if an interrupt arrives. */
static int kvm_vcpu_check_block(struct kvm_vcpu *vcpu)
{
if (kvm_arch_vcpu_runnable(vcpu)) {
@@ -1921,47 +1923,101 @@ static int kvm_vcpu_check_block(struct kvm_vcpu *vcpu)
return 0;
}
-/*
- * The vCPU has executed a HLT instruction with in-kernel mode enabled.
- */
-void kvm_vcpu_block(struct kvm_vcpu *vcpu)
+static void
+update_vcpu_block_predictor(struct kvm_vcpu *vcpu, u64 poll_ns, u64 wait_ns)
{
- ktime_t start, cur;
- DEFINE_WAIT(wait);
- bool waited = false;
-
- start = cur = ktime_get();
- if (vcpu->halt_poll_ns) {
- ktime_t stop = ktime_add_ns(ktime_get(), vcpu->halt_poll_ns);
-
- do {
- /*
- * This sets KVM_REQ_UNHALT if an interrupt
- * arrives.
- */
- if (kvm_vcpu_check_block(vcpu) < 0) {
- ++vcpu->stat.halt_successful_poll;
- goto out;
- }
- cur = ktime_get();
- } while (single_task_running() && ktime_before(cur, stop));
+ u64 block_ns = poll_ns + wait_ns;
+
+ if (block_ns <= vcpu->halt_poll_ns)
+ return;
+
+ if (block_ns < halt_poll_ns)
+ /* we had a short block and our poll time is too small */
+ vcpu->halt_poll_ns = halt_poll_ns;
+ else
+ /* we had a long block. disable polling. */
+ vcpu->halt_poll_ns = 0;
+}
+
+static bool kvm_vcpu_try_poll(struct kvm_vcpu *vcpu, u64 *poll_ns)
+{
+ bool done = false;
+ ktime_t deadline;
+ ktime_t start;
+
+ start = ktime_get();
+ deadline = ktime_add_ns(start, vcpu->halt_poll_ns);
+
+ while (single_task_running() && ktime_before(ktime_get(), deadline)) {
+ if (kvm_vcpu_check_block(vcpu) < 0) {
+ ++vcpu->stat.halt_successful_poll;
+ done = true;
+ break;
+ }
}
+ *poll_ns = ktime_to_ns(ktime_sub(ktime_get(), start));
+ return done;
+}
+
+static void kvm_vcpu_wait(struct kvm_vcpu *vcpu, u64 *wait_ns)
+{
+ DEFINE_WAIT(wait);
+ ktime_t start;
+
+ start = ktime_get();
+
for (;;) {
prepare_to_wait(&vcpu->wq, &wait, TASK_INTERRUPTIBLE);
if (kvm_vcpu_check_block(vcpu) < 0)
break;
- waited = true;
schedule();
}
finish_wait(&vcpu->wq, &wait);
- cur = ktime_get();
+
+ *wait_ns = ktime_to_ns(ktime_sub(ktime_get(), start));
+}
+
+void __kvm_vcpu_block(struct kvm_vcpu *vcpu)
+{
+ bool prediction_success = false;
+ u64 poll_ns = 0;
+ u64 wait_ns = 0;
+
+ if (vcpu->halt_poll_ns && kvm_vcpu_try_poll(vcpu, &poll_ns)) {
+ prediction_success = true;
+ goto out;
+ }
+
+ kvm_vcpu_wait(vcpu, &wait_ns);
+
+ if (!vcpu->halt_poll_ns && wait_ns > halt_poll_ns)
+ prediction_success = true;
out:
- trace_kvm_vcpu_wakeup(ktime_to_ns(cur) - ktime_to_ns(start), waited);
+ trace_kvm_vcpu_wakeup(vcpu->halt_poll_ns, prediction_success,
+ poll_ns, wait_ns);
+
+ update_vcpu_block_predictor(vcpu, poll_ns, wait_ns);
+}
+
+/*
+ * The vCPU has executed a HLT instruction with in-kernel mode enabled.
+ */
+void kvm_vcpu_block(struct kvm_vcpu *vcpu)
+{
+ /*
+ * kvm_vcpu_block can be called more than once between vcpu resumes.
+ * All calls except the first will always return immediately. We don't
+ * want those calls to affect poll/wait prediction, so we return here.
+ */
+ if (kvm_vcpu_check_block(vcpu) < 0)
+ return;
+
+ __kvm_vcpu_block(vcpu);
}
EXPORT_SYMBOL_GPL(kvm_vcpu_block);
--
2.5.0.457.gab17608