linux-arch.vger.kernel.org archive mirror
From: Waiman Long <Waiman.Long@hp.com>
To: Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@redhat.com>, "H. Peter Anvin" <hpa@zytor.com>,
	Arnd Bergmann <arnd@arndb.de>,
	Peter Zijlstra <peterz@infradead.org>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>,
	Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>,
	kvm@vger.kernel.org, virtualization@lists.linux-foundation.org,
	Andi Kleen <andi@firstfloor.org>,
	Michel Lespinasse <walken@google.com>,
	Alok Kataria <akataria@vmware.com>,
	linux-arch@vger.kernel.org, Gleb Natapov <gleb@redhat.com>,
	x86@kernel.org, xen-devel@lists.xenproject.org,
	"Paul E. McKenney" <paulmck@linux.vnet.ibm.com>,
	Scott J Norton <scott.norton@hp.com>,
	Rusty Russell <rusty@rustcorp.com.au>,
	Steven Rostedt <rostedt@goodmis.org>,
	Chris Wright <chrisw@sous-sol.org>,
	Oleg Nesterov <oleg@redhat.com>,
	Boris Ostrovsky <boris.ostrovsky@oracle.com>,
	Aswin Chandramouleeswaran <aswin@hp.com>,
	Chegu Vinod <chegu_vinod@hp.com>,
	Waiman Long <Waiman.Long@hp.com>,
	linux-kernel@vger.kernel.org,
	David Vrabel <david.vrabel@citrix.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Linu
Subject: [PATCH v7 10/11] pvqspinlock, x86: Enable qspinlock PV support for KVM
Date: Wed, 19 Mar 2014 16:14:08 -0400	[thread overview]
Message-ID: <1395260049-30839-11-git-send-email-Waiman.Long@hp.com> (raw)
In-Reply-To: <1395260049-30839-1-git-send-email-Waiman.Long@hp.com>

This patch adds the necessary KVM specific code to allow KVM to support
the sleeping and CPU kicking operations needed by the queue spinlock PV
code.

Two KVM guests of 20 CPU cores (2 nodes) were created for performance
testing.  With only one KVM guest powered on (no overcommit), the
disk workload of the AIM7 benchmark was run on both ext4 and xfs RAM
disks at 3000 users on a 3.14-rc6 based kernel. The JPM (jobs/minute)
data of the test run were:

  kernel                        XFS FS  %change ext4 FS %change
  ------                        ------  ------- ------- -------
  PV ticketlock (baseline)      2409639    -    1289398    -
  qspinlock                     2425876  +0.7%  1291248  +0.1%
  PV qspinlock                  2425876  +0.7%  1299639  +0.8%
  unfair qspinlock              2435724  +1.1%  1441153 +11.8%
  unfair + PV qspinlock         2479339  +2.9%  1459854 +13.2%

The XFS test had moderate spinlock contention of 1.6% whereas the
ext4 test had heavy spinlock contention of 15.4% as reported by perf.

A more interesting performance comparison is when there is
overcommit. Both PV guests were powered on (all 20 CPUs equally
shared between the two guests - 200% overcommit) with the "idle=poll"
kernel option set to simulate busy guests. The results of running the
same AIM7 workloads are shown in the tables below.

  XFS Test:
  kernel		 JPM	Real Time   Sys Time	Usr Time
  -----			 ---	---------   --------	--------
  PV ticketlock		597015	  29.99      424.44	 25.85
  qspinlock		117493	 153.20	    1006.06	 59.60
  PV qspinlock		616438	  29.20	     397.57	 23.31
  unfair qspinlock	642398	  28.02	     396.42	 25.42
  unfair + PV qspinlock	633803	  28.40	     400.45	 26.16

  ext4 Test:
  kernel		 JPM	Real Time   Sys Time	Usr Time
  -----			 ---	---------   --------	--------
  PV ticketlock		120984	 148.78     2378.98	 29.27
  qspinlock		 54995	 327.30	    5023.58	 54.73
  PV qspinlock		124215	 144.91	    2282.22	 28.89
  unfair qspinlock	467411	  38.51	     481.80	 25.80
  unfair + PV qspinlock	471080	  38.21	     482.40	 25.09

The kernel build test (make -j 20) results are as follows:

  kernel		Real Time   Sys Time	Usr Time
  -----			---------   --------	--------
  PV ticketlock		20m6.158s   41m7.167s	283m3.790s
  qspinlock		26m41.294s  74m55.585s	346m31.981s
  PV qspinlock		20m17.429s  41m19.434s	281m21.238s
  unfair qspinlock	19m58.315s  40m18.011s	279m27.177s
  unfair + PV qspinlock	20m0.030s   40m35.011s	278m50.522s

With no overcommit, the PV code doesn't really do anything. The unfair
lock does provide some performance advantage depending on the workload.

In an overcommitted PV guest, however, there can be a big performance
benefit from the PV and unfair locks. In terms of spinlock contention,
the ordering of the 3 workloads is:

    kernel build < AIM7 disk xfs < AIM7 disk ext4

With light spinlock contention, the PV ticketlock can be a bit
faster than the PV qspinlock. With moderate to high spinlock
contention, the PV qspinlock performs better. The unfair lock has the
best performance in all cases, especially with heavy spinlock
contention.

Signed-off-by: Waiman Long <Waiman.Long@hp.com>
---
 arch/x86/kernel/kvm.c |   82 +++++++++++++++++++++++++++++++++++++++++++++++++
 kernel/Kconfig.locks  |    2 +-
 2 files changed, 83 insertions(+), 1 deletions(-)

diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
index f318e78..c28bc1b 100644
--- a/arch/x86/kernel/kvm.c
+++ b/arch/x86/kernel/kvm.c
@@ -568,6 +568,7 @@ static void kvm_kick_cpu(int cpu)
 	kvm_hypercall2(KVM_HC_KICK_CPU, flags, apicid);
 }
 
+#ifndef CONFIG_QUEUE_SPINLOCK
 enum kvm_contention_stat {
 	TAKEN_SLOW,
 	TAKEN_SLOW_PICKUP,
@@ -795,6 +796,82 @@ static void kvm_unlock_kick(struct arch_spinlock *lock, __ticket_t ticket)
 		}
 	}
 }
+#else /* !CONFIG_QUEUE_SPINLOCK */
+
+#ifdef CONFIG_KVM_DEBUG_FS
+static struct dentry *d_spin_debug;
+static struct dentry *d_kvm_debug;
+static u32 kick_stats;		/* CPU kick count    */
+static u32 hibernate_stats;	/* Hibernation count */
+
+static int __init kvm_spinlock_debugfs(void)
+{
+	d_kvm_debug = debugfs_create_dir("kvm-guest", NULL);
+	if (!d_kvm_debug) {
+		printk(KERN_WARNING
+		       "Could not create 'kvm' debugfs directory\n");
+		return -ENOMEM;
+	}
+	d_spin_debug = debugfs_create_dir("spinlocks", d_kvm_debug);
+
+	debugfs_create_u32("kick_stats", 0644, d_spin_debug, &kick_stats);
+	debugfs_create_u32("hibernate_stats",
+			   0644, d_spin_debug, &hibernate_stats);
+	return 0;
+}
+
+static inline void inc_kick_stats(void)
+{
+	add_smp(&kick_stats, 1);
+}
+
+static inline void inc_hib_stats(void)
+{
+	add_smp(&hibernate_stats, 1);
+}
+
+fs_initcall(kvm_spinlock_debugfs);
+
+#else /* CONFIG_KVM_DEBUG_FS */
+static inline void inc_kick_stats(void)
+{
+}
+
+static inline void inc_hib_stats(void)
+{
+
+}
+#endif /* CONFIG_KVM_DEBUG_FS */
+
+static void kvm_kick_cpu_type(int cpu)
+{
+	kvm_kick_cpu(cpu);
+	inc_kick_stats();
+}
+
+/*
+ * Halt the current CPU & release it back to the host
+ */
+static void kvm_hibernate(void)
+{
+	unsigned long flags;
+
+	if (in_nmi())
+		return;
+
+	inc_hib_stats();
+	/*
+	 * Make sure an interrupt handler can't upset things in a
+	 * partially setup state.
+	 */
+	local_irq_save(flags);
+	if (arch_irqs_disabled_flags(flags))
+		halt();
+	else
+		safe_halt();
+	local_irq_restore(flags);
+}
+#endif /* !CONFIG_QUEUE_SPINLOCK */
 
 /*
  * Setup pv_lock_ops to exploit KVM_FEATURE_PV_UNHALT if present.
@@ -807,8 +884,13 @@ void __init kvm_spinlock_init(void)
 	if (!kvm_para_has_feature(KVM_FEATURE_PV_UNHALT))
 		return;
 
+#ifdef CONFIG_QUEUE_SPINLOCK
+	pv_lock_ops.kick_cpu = kvm_kick_cpu_type;
+	pv_lock_ops.hibernate = kvm_hibernate;
+#else
 	pv_lock_ops.lock_spinning = PV_CALLEE_SAVE(kvm_lock_spinning);
 	pv_lock_ops.unlock_kick = kvm_unlock_kick;
+#endif
 }
 
 static __init int kvm_spinlock_init_jump(void)
diff --git a/kernel/Kconfig.locks b/kernel/Kconfig.locks
index f185584..a70fdeb 100644
--- a/kernel/Kconfig.locks
+++ b/kernel/Kconfig.locks
@@ -229,4 +229,4 @@ config ARCH_USE_QUEUE_SPINLOCK
 
 config QUEUE_SPINLOCK
 	def_bool y if ARCH_USE_QUEUE_SPINLOCK
-	depends on SMP && !PARAVIRT_SPINLOCKS
+	depends on SMP && (!PARAVIRT_SPINLOCKS || !XEN)
-- 
1.7.1
