* [PATCH v7 0/5] locking/qspinlock: Enhance pvqspinlock performance
@ 2015-09-22 20:50 Waiman Long
  2015-09-22 20:50 ` [PATCH v7 1/5] locking/qspinlock: relaxes cmpxchg & xchg ops in native code Waiman Long
                   ` (4 more replies)
  0 siblings, 5 replies; 22+ messages in thread
From: Waiman Long @ 2015-09-22 20:50 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Thomas Gleixner, H. Peter Anvin
  Cc: x86, linux-kernel, Scott J Norton, Douglas Hatch, Davidlohr Bueso,
	Waiman Long

v6->v7:
 - Removed arch/x86/include/asm/qspinlock.h from patch 1.
 - Removed the unconditional PV kick patch as it has been merged
   into tip.
 - Changed the pvstat_inc() API to add a new condition parameter.
 - Added comments and rearranged code in patch 4 to clarify where
   lock stealing happens.
 - In patch 5, removed the check for pv_wait count when deciding when
   to wait early.
 - Updated copyrights and email address.

v5->v6:
 - Added a new patch 1 to relax the cmpxchg and xchg operations in
   the native code path to reduce performance overhead on non-x86
   architectures.
 - Updated the unconditional PV kick patch as suggested by PeterZ.
 - Added a new patch to allow one lock stealing attempt at slowpath
   entry point to reduce performance penalty due to lock waiter
   preemption.
 - Removed the pending bit and kick-ahead patches as they didn't show
   any noticeable performance improvement on top of the lock stealing
   patch.
 - Simplified the adaptive spinning patch as the lock stealing patch
   allows more aggressive pv_wait() without much performance penalty
   in non-overcommitted VMs.

v4->v5:
 - Rebased the patch to the latest tip tree.
 - Corrected the comments and commit log for patch 1.
 - Removed the v4 patch 5 as PV kick deferment is no longer needed with
   the new tip tree.
 - Simplified the adaptive spinning patch (patch 6) & improved its
   performance a bit further.
 - Re-ran the benchmark test with the new patch.

v3->v4:
 - Patch 1: add comment about a possible race condition in PV unlock.
 - Patch 2: simplified the pv_pending_lock() function as suggested by
   Davidlohr.
 - Move PV unlock optimization patch forward to patch 4 & rerun
   performance test.

v2->v3:
 - Moved deferred kicking enablement patch forward & moved back
   the kick-ahead patch to make the effect of kick-ahead more visible.
 - Reworked patch 6 to make it more readable.
 - Reverted to using state as a tri-state variable instead of
   adding an additional bistate variable.
 - Added performance data for different values of PV_KICK_AHEAD_MAX.
 - Added a new patch to optimize PV unlock code path performance.

v1->v2:
 - Take out the queued unfair lock patches
 - Add a patch to simplify the PV unlock code
 - Move pending bit and statistics collection patches to the front
 - Keep vCPU kicking in pv_kick_node(), but defer it to unlock time
   when appropriate.
 - Change the wait-early patch to use adaptive spinning to better
   balance the different effects on normal and over-committed guests.
 - Add patch-to-patch performance changes in the patch commit logs.

This patchset tries to improve the performance of both regular and
over-committed VM guests. The adaptive spinning patch was inspired
by the "Do Virtual Machines Really Scale?" blog from Sanidhya Kashyap.

Patch 1 relaxes the memory order restriction of atomic operations by
using the less restrictive _acquire and _release variants of cmpxchg()
and xchg(). This reduces performance overhead when the code is ported
to non-x86 architectures.
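
To illustrate the idea behind patch 1, here is a minimal user-space
sketch using C11 atomics rather than the kernel's cmpxchg()/xchg()
helpers. The toy_lock type and the toy_* functions are hypothetical
names made up for this example and do not appear in the patch. The
point is only that taking a lock needs acquire ordering and releasing
it needs release ordering, so a fully ordered operation is not
required on either side.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical toy lock, not the kernel's struct qspinlock. */
struct toy_lock {
	atomic_uint val;	/* 0 = unlocked, 1 = locked */
};

/*
 * Try to take the lock. Acquire ordering on success is enough to keep
 * the critical section from moving above the lock operation; a failed
 * attempt needs no ordering at all.
 */
static bool toy_trylock(struct toy_lock *lock)
{
	unsigned int expected = 0;

	return atomic_compare_exchange_strong_explicit(&lock->val,
			&expected, 1,
			memory_order_acquire, memory_order_relaxed);
}

/* Release ordering keeps the critical section above the unlock. */
static void toy_unlock(struct toy_lock *lock)
{
	atomic_store_explicit(&lock->val, 0, memory_order_release);
}
```

On x86 the relaxed variants typically compile to the same
instructions as the fully ordered ones, which is consistent with the
benefit being described for non-x86 architectures.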

Patch 2 optimizes the PV unlock code path performance for x86-64
architecture.

Patch 3 allows the collection of various count data that are useful
to see what is happening in the system. Collecting them adds a bit of
overhead when enabled, slowing performance slightly.
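
A counter helper with a condition parameter, as mentioned in the
v6->v7 notes for pvstat_inc(), could look roughly like the sketch
below. This is not the code from patch 3: the enum, the counter array,
the TOY_STATS_ENABLED switch and the toy_stat_inc() name are all
assumptions for illustration, and the counters here are plain
non-atomic variables.

```c
#include <stdbool.h>

/* Hypothetical event list; the real patch defines its own set. */
enum toy_stat {
	TOY_STAT_KICK,
	TOY_STAT_WAIT_EARLY,
	TOY_STAT_NUM,
};

static unsigned long toy_stats[TOY_STAT_NUM];

/* Stand-in for a build-time config option (an assumption). */
#ifndef TOY_STATS_ENABLED
#define TOY_STATS_ENABLED 1
#endif

/*
 * Increment a counter only when cond is true. Passing the condition
 * in lets call sites avoid wrapping every increment in an if block,
 * and the whole call compiles away when statistics are disabled.
 */
static inline void toy_stat_inc(enum toy_stat stat, bool cond)
{
	if (TOY_STATS_ENABLED && cond)
		toy_stats[stat]++;
}
```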

Patch 4 allows one lock stealing attempt at slowpath entry. This causes
a pretty big performance improvement for over-committed VM guests.
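
The lock stealing idea in patch 4 can be sketched in user space as
follows. This is a simplified illustration with C11 atomics, not the
patch itself: the toy_qlock layout, the TOY_* bit names and
toy_steal_once() are hypothetical, and the real lock word has more
fields than the two bits used here.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define TOY_LOCKED	1u	/* lock is held */
#define TOY_QUEUED	2u	/* hypothetical: waiters are queued */

struct toy_qlock {
	atomic_uint val;
};

/*
 * One stealing attempt at slowpath entry. If the lock is free at this
 * instant, grab it even though other waiters may already be queued.
 * Returns true if the lock was stolen; on false the caller would fall
 * through and queue up as usual.
 */
static bool toy_steal_once(struct toy_qlock *lock)
{
	unsigned int old = atomic_load_explicit(&lock->val,
						memory_order_relaxed);

	if (old & TOY_LOCKED)
		return false;

	return atomic_compare_exchange_strong_explicit(&lock->val, &old,
			old | TOY_LOCKED,
			memory_order_acquire, memory_order_relaxed);
}
```

Limiting the caller to a single attempt bounds the unfairness: a
running vCPU can take a lock whose queue head is preempted, but a
failed attempt still joins the queue in order.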

Patch 5 enables adaptive spinning in the queue nodes. This patch
leads to further performance improvement in over-committed guests,
though it is not as big as the previous patch.
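
A rough sketch of what adaptive spinning in a queue node might look
like is given below. It assumes that a queued waiter periodically
checks whether the vCPU of the previous node is still running and
stops spinning early if it is not. The toy_node type, the TOY_*
constants and their values, and toy_spin_or_wait() are hypothetical
and are not taken from patch 5.

```c
#include <stdatomic.h>
#include <stdbool.h>

enum { TOY_VCPU_RUNNING, TOY_VCPU_HALTED };

struct toy_node {
	atomic_int locked;	/* set when the lock is handed to us */
	atomic_int state;	/* TOY_VCPU_RUNNING or TOY_VCPU_HALTED */
};

#define TOY_SPIN_THRESHOLD	1024	/* assumed spin budget */
#define TOY_PREV_CHECK_MASK	0xff	/* look at prev every 256 spins */

/*
 * Spin on node->locked. Returns false if the lock was handed over
 * while spinning, or true if the caller should go to sleep (the
 * pv_wait() step in the PV code). *spins reports how many iterations
 * were used.
 */
static bool toy_spin_or_wait(struct toy_node *node,
			     struct toy_node *prev, int *spins)
{
	for (int loop = 1; loop <= TOY_SPIN_THRESHOLD; loop++) {
		*spins = loop;
		if (atomic_load_explicit(&node->locked,
					 memory_order_acquire))
			return false;

		/* Wait early: spinning behind a halted vCPU is wasted. */
		if ((loop & TOY_PREV_CHECK_MASK) == 0 &&
		    atomic_load_explicit(&prev->state,
					 memory_order_relaxed)
				!= TOY_VCPU_RUNNING)
			return true;
	}
	return true;	/* spin budget used up */
}
```

Checking the previous node only once every few hundred iterations
keeps the extra cache line traffic low when the guest is not
over-committed.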

Waiman Long (5):
  locking/qspinlock: relaxes cmpxchg & xchg ops in native code
  locking/pvqspinlock, x86: Optimize PV unlock code path
  locking/pvqspinlock: Collect slowpath lock statistics
  locking/pvqspinlock: Allow 1 lock stealing attempt
  locking/pvqspinlock: Queue node adaptive spinning

 arch/x86/Kconfig                          |    9 +
 arch/x86/include/asm/qspinlock_paravirt.h |   59 +++++
 include/asm-generic/qspinlock.h           |    9 +-
 kernel/locking/qspinlock.c                |   48 +++--
 kernel/locking/qspinlock_paravirt.h       |  376 +++++++++++++++++++++++++----
 5 files changed, 437 insertions(+), 64 deletions(-)


Thread overview: 22+ messages
2015-09-22 20:50 [PATCH v7 0/5] locking/qspinlock: Enhance pvqspinlock performance Waiman Long
2015-09-22 20:50 ` [PATCH v7 1/5] locking/qspinlock: relaxes cmpxchg & xchg ops in native code Waiman Long
2015-10-13 18:02   ` Peter Zijlstra
2015-10-13 20:38     ` Waiman Long
2015-10-13 20:47       ` Peter Zijlstra
2015-10-14  9:39     ` Will Deacon
2015-09-22 20:50 ` [PATCH v7 2/5] locking/pvqspinlock, x86: Optimize PV unlock code path Waiman Long
2015-09-22 20:50 ` [PATCH v7 3/5] locking/pvqspinlock: Collect slowpath lock statistics Waiman Long
2015-10-13 20:05   ` Peter Zijlstra
2015-10-13 21:06     ` Waiman Long
2015-09-22 20:50 ` [PATCH v7 4/5] locking/pvqspinlock: Allow 1 lock stealing attempt Waiman Long
2015-10-13 18:23   ` Peter Zijlstra
2015-10-13 20:41     ` Waiman Long
2015-10-13 20:44       ` Peter Zijlstra
2015-10-15 20:44         ` Waiman Long
2015-10-13 19:39   ` Peter Zijlstra
2015-10-13 20:45     ` Waiman Long
2015-10-13 19:56   ` Peter Zijlstra
2015-10-13 20:50     ` Waiman Long
2015-10-14  9:28       ` Peter Zijlstra
2015-10-15 21:01         ` Waiman Long
2015-09-22 20:50 ` [PATCH v7 5/5] locking/pvqspinlock: Queue node adaptive spinning Waiman Long
