From: Malcolm Crossley <malcolm.crossley@citrix.com>
To: malcolm.crossley@citrix.com, JBeulich@suse.com,
ian.campbell@citrix.com, andrew.cooper3@citrix.com,
Marcos.Matsunaga@oracle.com, keir@xen.org,
konrad.wilk@oracle.com, george.dunlap@eu.citrix.com
Cc: xen-devel@lists.xenproject.org, dario.faggioli@citrix.com,
stefano.stabellini@citrix.com
Subject: [PATCHv5 0/3] Implement per-cpu reader-writer locks
Date: Fri, 18 Dec 2015 16:08:37 +0000
Message-ID: <1450454920-11036-1-git-send-email-malcolm.crossley@citrix.com>
This patch series adds per-cpu reader-writer locks as a generic lock
implementation and then converts the grant table and p2m rwlocks to
use the percpu rwlocks, in order to improve multi-socket host performance.
CPU profiling has revealed the rwlocks themselves suffer from severe cache
line bouncing due to the cmpxchg operation used even when taking a read lock.
Multiqueue paravirtualised I/O results in heavy contention on the grant table
and p2m read locks of a specific domain, so I/O throughput is bottlenecked
by the overhead of the cache line bouncing itself.
Per-cpu read locks avoid lock cache line bouncing by using a per-cpu data
area to record that a CPU has taken the read lock. Correctness is enforced for
the write lock by using a per-lock barrier which forces the per-cpu read lock
to revert to using a standard read lock. The write lock then polls all
the per-cpu data areas until active readers for the lock have exited.
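The scheme described above can be sketched in portable C11 as follows. This
is only an illustration and not the code in this series: the names NR_CPUS,
percpu_slot, slow and barrier are hypothetical, the CPU number is passed
explicitly instead of coming from real per-cpu storage, the standard rwlock
is replaced by a single atomic counter, and a CPU holding two or more percpu
rwlocks at once is not handled.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_CPUS 4   /* hypothetical CPU count for the sketch */

typedef struct {
    atomic_int  slow;     /* stand-in for a standard rwlock: >0 readers, -1 writer */
    atomic_bool barrier;  /* per-lock barrier: set while a writer is active */
} percpu_rwlock_t;

/* Stand-in for the per-cpu data area: the lock each CPU holds via the fast path. */
static _Atomic(percpu_rwlock_t *) percpu_slot[NR_CPUS];

/* Standard read lock: every reader does a compare-and-swap on shared state. */
static void slow_read_lock(percpu_rwlock_t *l)
{
    int v = atomic_load(&l->slow);
    for ( ; ; ) {
        if (v < 0) {                       /* a writer holds the lock: keep waiting */
            v = atomic_load(&l->slow);
            continue;
        }
        if (atomic_compare_exchange_weak(&l->slow, &v, v + 1))
            return;                        /* on failure v was reloaded, so retry */
    }
}

/* Returns true if the per-cpu fast path was used, false for the standard path. */
bool percpu_read_lock(unsigned cpu, percpu_rwlock_t *l)
{
    assert(cpu < NR_CPUS);
    /* Record that this CPU is reading; only this CPU's slot is written. */
    atomic_store(&percpu_slot[cpu], l);
    if (!atomic_load(&l->barrier))
        return true;
    /* A writer is active: retract the record and revert to the standard lock. */
    atomic_store(&percpu_slot[cpu], NULL);
    slow_read_lock(l);
    return false;
}

void percpu_read_unlock(unsigned cpu, percpu_rwlock_t *l, bool fast)
{
    if (fast)
        atomic_store(&percpu_slot[cpu], NULL);
    else
        atomic_fetch_sub(&l->slow, 1);
}

void percpu_write_lock(percpu_rwlock_t *l)
{
    int expected = 0;
    /* Exclude other writers and any readers already on the standard path. */
    while (!atomic_compare_exchange_weak(&l->slow, &expected, -1))
        expected = 0;
    /* Raise the barrier so new readers revert to the standard read lock. */
    atomic_store(&l->barrier, true);
    /* Poll every per-cpu slot until fast-path readers of this lock have exited. */
    for (unsigned cpu = 0; cpu < NR_CPUS; cpu++)
        while (atomic_load(&percpu_slot[cpu]) == l)
            ;
}

void percpu_write_unlock(percpu_rwlock_t *l)
{
    atomic_store(&l->barrier, false);
    atomic_store(&l->slow, 0);
}
```

The sketch relies on the default sequentially consistent ordering of C11
atomics: a reader writes its slot and then checks the barrier, while the
writer raises the barrier and then checks the slots, so at least one side
always observes the other. In the uncontended case a reader touches only its
own slot, and reads the barrier flag without modifying it.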
Removing the cache line bouncing on a multi-socket Haswell-EP system
dramatically improves performance, with 16 vCPU network IO performance going
from 15 Gb/s to 64 Gb/s! The host under test was fully utilising all 40
logical CPUs at 64 Gb/s, so a host with more logical CPUs may see an even
better IO improvement.
Note: Benchmarking of these performance improvements should be done with
the non-debug version of the hypervisor, otherwise the map_domain_page spinlock
is the main bottleneck.
Changes in V5:
- Move percpu_owner ASSERTs into an inline function
- Rename grant table rwlock wrappers
Changes in V4:
- Fix the ASSERTS for the percpu_owner check
Changes in V3:
- Add percpu rwlock owner for debug Xen builds
- Validate percpu rwlock owner at runtime for debug Xen builds
- Fix hard tab issues
- Use percpu rwlock wrappers for grant table rwlock users
- Add comments explaining why rw_is_locked ASSERTS have been removed in grant table code
Changes in V2:
- Add Cover letter
- Convert p2m rwlock to percpu rwlock
- Improve percpu rwlock to safely handle simultaneously holding 2 or more
locks
- Move percpu rwlock barrier from global to per lock
- Move write lock cpumask variable to a percpu variable
- Add macros to help initialise and use percpu rwlocks
- Updated IO benchmark results to cover revised locking implementation
Malcolm Crossley (3):
rwlock: Add per-cpu reader-writer lock infrastructure
grant_table: convert grant table rwlock to percpu rwlock
p2m: convert p2m rwlock to percpu rwlock
xen/arch/arm/mm.c | 4 +-
xen/arch/x86/mm.c | 4 +-
xen/arch/x86/mm/mm-locks.h | 12 ++--
xen/arch/x86/mm/p2m.c | 1 +
xen/common/grant_table.c | 126 +++++++++++++++++++++++-------------------
xen/common/spinlock.c | 46 +++++++++++++++
xen/include/asm-arm/percpu.h | 5 ++
xen/include/asm-x86/mm.h | 2 +-
xen/include/asm-x86/percpu.h | 6 ++
xen/include/xen/grant_table.h | 24 +++++++-
xen/include/xen/percpu.h | 4 ++
xen/include/xen/spinlock.h | 115 ++++++++++++++++++++++++++++++++++++++
12 files changed, 282 insertions(+), 67 deletions(-)
--
1.7.12.4