xen-devel.lists.xenproject.org archive mirror
* [PATCH 0 of 3] x86/mem_sharing: Improve performance of rmap, fix cascading bugs
@ 2012-04-24 19:48 Andres Lagar-Cavilla
  2012-04-24 19:48 ` [PATCH 1 of 3] x86/mem_sharing: Don't destroy a page's shared state before depleting its <gfn, domid> tuple list Andres Lagar-Cavilla
                   ` (3 more replies)
  0 siblings, 4 replies; 5+ messages in thread
From: Andres Lagar-Cavilla @ 2012-04-24 19:48 UTC (permalink / raw)
  To: xen-devel; +Cc: andres, tim

This is a repost of patches 2 and 3 of the series initially posted on Apr 12th.

The first patch has been split into two functionally isolated patches as per
Tim Deegan's request.

The original posting (suitably edited) follows
--------------------------------
The sharing subsystem does not scale elegantly with high degrees of page
sharing. The culprit is a reverse map that each shared frame maintains,
resolving to all domain pages pointing to the shared frame. Because the rmap is
implemented as a linked list with O(n) search, CoW unsharing can result in
prolonged search times.

The place where this becomes most obvious is during domain destruction, during
which all shared p2m entries need to be unshared. Destroying a domain with a
lot of sharing could result in minutes of hypervisor freeze-up: 7 minutes for a
2 GiB domain! As a result, errors cascade throughout the system, including soft
lockups, watchdogs firing, IO drivers timing out, etc.

The proposed solution is to mutate the rmap from a linked list to a hash table
when the number of domain pages referencing the shared frame exceeds a
threshold. This maintains minimal space use for the common case of relatively
low sharing, and switches to an O(1) data structure for heavily shared pages,
with an space overhead of one page. The threshold chosen is 256, as a single
page can fit 256 spill lists for 256 buckets in a hash table.

With these patches in place, destruction of a 2 GiB domain with a shared frame
holding over a hundred thousand references drops from 7 minutes to 2 seconds.

Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.org>

 xen/arch/x86/mm/mem_sharing.c     |    8 +-
 xen/arch/x86/mm/mem_sharing.c     |  138 ++++++++++++++++++++++--------
 xen/arch/x86/mm/mem_sharing.c     |  170 +++++++++++++++++++++++++++++++++++--
 xen/include/asm-x86/mem_sharing.h |   13 ++-
 4 files changed, 274 insertions(+), 55 deletions(-)



Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
2012-04-24 19:48 [PATCH 0 of 3] x86/mem_sharing: Improve performance of rmap, fix cascading bugs Andres Lagar-Cavilla
2012-04-24 19:48 ` [PATCH 1 of 3] x86/mem_sharing: Don't destroy a page's shared state before depleting its <gfn, domid> tuple list Andres Lagar-Cavilla
2012-04-24 19:48 ` [PATCH 2 of 3] x86/mem_sharing: modularize reverse map for shared frames Andres Lagar-Cavilla
2012-04-24 19:48 ` [PATCH 3 of 3] x86/mem_sharing: For shared pages with many references, use a hash table instead of a list Andres Lagar-Cavilla
2012-04-26  9:12 ` [PATCH 0 of 3] x86/mem_sharing: Improve performance of rmap, fix cascading bugs Tim Deegan
