linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v6 0/6] locking/qspinlock: Enhance pvqspinlock performance
@ 2015-09-11 18:37 Waiman Long
  2015-09-11 18:37 ` [PATCH v6 1/6] locking/qspinlock: relaxes cmpxchg & xchg ops in native code Waiman Long
                   ` (5 more replies)
  0 siblings, 6 replies; 32+ messages in thread
From: Waiman Long @ 2015-09-11 18:37 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Thomas Gleixner, H. Peter Anvin
  Cc: x86, linux-kernel, Scott J Norton, Douglas Hatch, Davidlohr Bueso,
	Waiman Long

v5->v6:
 - Added a new patch 1 to relax the cmpxchg and xchg operations in
   the native code path to reduce performance overhead on non-x86
   architectures.
 - Updated the unconditional PV kick patch as suggested by PeterZ.
 - Added a new patch to allow one lock stealing attempt at slowpath
   entry point to reduce performance penalty due to lock waiter
   preemption.
 - Removed the pending bit and kick-ahead patches as they didn't show
   any noticeable performance improvement on top of the lock stealing
   patch.
 - Simplified the adaptive spinning patch as the lock stealing patch
   allows more aggressive pv_wait() without much performance penalty
   in non-overcommitted VMs.

v4->v5:
 - Rebased the patch to the latest tip tree.
 - Corrected the comments and commit log for patch 1.
 - Removed the v4 patch 5 as PV kick deferment is no longer needed with
   the new tip tree.
 - Simplified the adaptive spinning patch (patch 6) & improve its
   performance a bit further.
 - Re-ran the benchmark test with the new patch.

v3->v4:
 - Patch 1: add comment about possible racing condition in PV unlock.
 - Patch 2: simplified the pv_pending_lock() function as suggested by
   Davidlohr.
 - Move PV unlock optimization patch forward to patch 4 & rerun
   performance test.

v2->v3:
 - Moved deferred kicking enablement patch forward & move back
   the kick-ahead patch to make the effect of kick-ahead more visible.
 - Reworked patch 6 to make it more readable.
 - Reverted back to use state as a tri-state variable instead of
   adding an additional bistate variable.
 - Added performance data for different values of PV_KICK_AHEAD_MAX.
 - Add a new patch to optimize PV unlock code path performance.

v1->v2:
 - Take out the queued unfair lock patches
 - Add a patch to simplify the PV unlock code
 - Move pending bit and statistics collection patches to the front
 - Keep vCPU kicking in pv_kick_node(), but defer it to unlock time
   when appropriate.
 - Change the wait-early patch to use adaptive spinning to better
   balance the difference effect on normal and over-committed guests.
 - Add patch-to-patch performance changes in the patch commit logs.

This patchset tries to improve the performance of both regular and
over-commmitted VM guests. The adaptive spinning patch was inspired
by the "Do Virtual Machines Really Scale?" blog from Sanidhya Kashyap.

Patch 1 relaxes the memory order restriction of atomic operations by
using less restrictive _acquire and _release variants of cmpxchg()
and xchg(). This will reduce performance overhead when ported to other
non-x86 architectures.

Patch 2 simplifies the unlock code by removing the unnecessary
state check.

Patch 2 adds pending bit support to pvqspinlock improving performance
at light load.

Patch 3 optimizes the PV unlock code path performance for x86-64
architecture.

Patch 4 allows the collection of various count data that are useful
to see what is happening in the system. They do add a bit of overhead
when enabled slowing performance a tiny bit.

Patch 5 allows one lock stealing attempt at slowpath entry. This causes
a pretty big performance improvement for over-committed VM guests.

Patch 6 enables adaptive spinning in the queue nodes. This patch
leads to further performance improvement in over-committed guest,
though it is not as big as the previous patch.

Waiman Long (6):
  locking/qspinlock: relaxes cmpxchg & xchg ops in native code
  locking/pvqspinlock: Unconditional PV kick with _Q_SLOW_VAL
  locking/pvqspinlock, x86: Optimize PV unlock code path
  locking/pvqspinlock: Collect slowpath lock statistics
  locking/pvqspinlock: Allow 1 lock stealing attempt
  locking/pvqspinlock: Queue node adaptive spinning

 arch/x86/Kconfig                          |    7 +
 arch/x86/include/asm/qspinlock.h          |    2 +-
 arch/x86/include/asm/qspinlock_paravirt.h |   59 +++++
 include/asm-generic/qspinlock.h           |    6 +-
 kernel/locking/qspinlock.c                |   45 +++--
 kernel/locking/qspinlock_paravirt.h       |  378 +++++++++++++++++++++++++----
 6 files changed, 431 insertions(+), 66 deletions(-)


^ permalink raw reply	[flat|nested] 32+ messages in thread

end of thread, other threads:[~2015-09-18  8:51 UTC | newest]

Thread overview: 32+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2015-09-11 18:37 [PATCH v6 0/6] locking/qspinlock: Enhance pvqspinlock performance Waiman Long
2015-09-11 18:37 ` [PATCH v6 1/6] locking/qspinlock: relaxes cmpxchg & xchg ops in native code Waiman Long
2015-09-11 22:27   ` Davidlohr Bueso
2015-09-14 12:06     ` Peter Zijlstra
2015-09-14 18:40       ` Waiman Long
2015-09-14 15:16     ` Waiman Long
2015-09-11 18:37 ` [PATCH v6 2/6] locking/pvqspinlock: Unconditional PV kick with _Q_SLOW_VAL Waiman Long
2015-09-18  8:50   ` [tip:locking/core] locking/pvqspinlock: Kick the PV CPU unconditionally when _Q_SLOW_VAL tip-bot for Waiman Long
2015-09-11 18:37 ` [PATCH v6 3/6] locking/pvqspinlock, x86: Optimize PV unlock code path Waiman Long
2015-09-11 18:37 ` [PATCH v6 4/6] locking/pvqspinlock: Collect slowpath lock statistics Waiman Long
2015-09-11 23:13   ` Davidlohr Bueso
2015-09-14 15:25     ` Waiman Long
2015-09-14 21:41       ` Davidlohr Bueso
2015-09-15  3:47         ` Waiman Long
2015-09-11 18:37 ` [PATCH v6 5/6] locking/pvqspinlock: Allow 1 lock stealing attempt Waiman Long
2015-09-14 13:57   ` Peter Zijlstra
2015-09-14 19:02     ` Waiman Long
2015-09-14 14:00   ` Peter Zijlstra
2015-09-14 19:15     ` Waiman Long
2015-09-14 19:38       ` Waiman Long
2015-09-15  8:24       ` Peter Zijlstra
2015-09-15 15:29         ` Waiman Long
2015-09-16 15:01           ` Peter Zijlstra
2015-09-17 15:08             ` Waiman Long
2015-09-14 14:04   ` Peter Zijlstra
2015-09-14 19:19     ` Waiman Long
2015-09-11 18:37 ` [PATCH v6 6/6] locking/pvqspinlock: Queue node adaptive spinning Waiman Long
2015-09-14 14:10   ` Peter Zijlstra
2015-09-14 19:37     ` Waiman Long
2015-09-15  8:38       ` Peter Zijlstra
2015-09-15 15:32         ` Waiman Long
2015-09-16 15:03           ` Peter Zijlstra

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).