From: Wei Liu <wei.liu2@citrix.com>
To: xen-devel@lists.xen.org
Cc: wei.liu2@citrix.com, Ian Campbell <ian.campbell@citrix.com>,
Stefano Stabellini <stefano.stabellini@eu.citrix.com>,
Andrew Cooper <andrew.cooper3@citrix.com>,
Dario Faggioli <dario.faggioli@citrix.com>,
David Vrabel <david.vrabel@citrix.com>,
Boris Ostrovsky <boris.ostrovsky@oracle.com>
Subject: Linux Xen Balloon Driver Improvement (Draft 2)
Date: Mon, 27 Oct 2014 12:33:28 +0000
Message-ID: <20141027123328.GA9067@zion.uk.xensource.com>
Hi all,
This is draft 2 of the design.
A PDF version can be found at
http://xenbits.xen.org/people/liuw/xen-balloon-driver-improvement.pdf
Changes in this version:
1. Style, grammar and typo fixes.
2. Make this document Linux-centric.
3. Add a new section for NUMA-aware ballooning.
% Linux Xen Balloon Driver Improvement
% Wei Liu <<wei.liu2@citrix.com>>
% Draft 2
----------------------------------------------------------
Version  Date        Changes
-------  ----------  ---------------------------------
2        24/10/2014  Style fixes, more clarifications.

1        22/10/2014  Initial version.
----------------------------------------------------------
## Introduction
This document describes a design to improve the Xen balloon driver in
Linux.
## Motives
1. Balloon pages fragment the guest physical address space.
2. The balloon compaction infrastructure can migrate ballooned pages from
the start of a Linux memory zone to the end of the zone, hence creating
contiguous guest physical address space.
3. Having contiguous guest physical address space enables some options
to improve performance.
## Goal of improvement
The balloon driver should make use of as many huge pages as possible,
defragmenting the guest address space. Contiguous guest address space
permits huge page ballooning, which helps prevent host address space
fragmentation.
This should be achieved without requiring any particular
hypervisor-side feature.
## Design and implementation
When the balloon driver is asked to increase or decrease the
reservation, it will always start with a huge page. However, due to
resource availability in both the hypervisor and the guest, it's not
always possible to get hold of a huge page. In that case the driver
will fall back to using normal size pages. The balloon driver will
later try to coalesce small size pages into huge pages. As time goes
by, both Xen and the guest should use more and more huge pages.
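The fallback described above can be sketched as a small Python model.
This is only an illustration under stated assumptions: the function
name `change_reservation` and the `try_huge` callable are hypothetical
stand-ins for whatever allocation attempt the driver makes, and the
512 ratio assumes 4 KB normal pages and 2 MB huge pages.

```python
PAGES_PER_HUGE = (2 * 1024 * 1024) // (4 * 1024)  # 512 normal pages per huge page


def change_reservation(nr_pages, try_huge):
    """Split a request for nr_pages normal-size pages into huge and normal pages.

    try_huge is a hypothetical callable that returns True when a huge
    page could be obtained and False otherwise.
    Returns a (huge_pages, normal_pages) tuple.
    """
    huge = normal = 0
    remaining = nr_pages
    while remaining >= PAGES_PER_HUGE:
        if try_huge():
            huge += 1                 # preferred path: one huge page
        else:
            normal += PAGES_PER_HUGE  # fallback: same amount in normal pages
        remaining -= PAGES_PER_HUGE
    normal += remaining               # the tail is smaller than a huge page
    return huge, normal
```

Normal pages produced by the fallback path are the ones a later
coalescing pass would try to turn back into huge pages.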
To achieve this goal, several changes will be made:
1. Make use of balloon page compaction.
2. Maintain multiple queues for pages of different sizes and purposes.
3. Periodically exchange normal size pages with huge pages.
### Make use of balloon page compaction
Balloon page migration moves balloon pages from the start of a zone to
the end of the zone, making the guest physical address space
contiguous. This gives the balloon driver a chance to allocate huge
pages in order to coalesce small pages.
Currently, the Xen balloon driver gets its pages directly from the page
allocator. To enable balloon page migration, those pages now need to
be allocated from the core balloon driver. Pages allocated from the
core balloon driver are subject to balloon page compaction.
The use of Linux balloon page compaction doesn't require introducing
new interfaces between the Xen balloon driver and the rest of the
system. Most changes are internal to the Xen balloon driver.
The Xen balloon driver will also need to provide a callback to migrate
a balloon page. In essence, the callback function receives an "old
page", which is an already ballooned out page, and a "new page", which
is a page to be ballooned out; it then inflates the "old page" and
deflates the "new page".
The core of the migration callback is the XENMEM\_exchange hypercall.
This makes sure that inflation of the old page and deflation of the new
page are done atomically, so even if a domain is beyond its memory
target and the target is being enforced, it can still compact memory.
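The effect of the migration callback can be illustrated with a toy
Python model. Everything here is an assumption made for illustration:
`p2m` is a plain dict from guest frame number to backing machine frame
(`None` meaning "ballooned out"), `migrate_balloon_page` is a
hypothetical name, and the single swap merely stands in for
XENMEM\_exchange. In particular, handing the same machine frame to the
old page is a simplification of the model.

```python
def migrate_balloon_page(p2m, old_pfn, new_pfn):
    """Move a balloon slot from old_pfn to new_pfn in a single step.

    p2m maps guest frame numbers to a machine frame, or None when the
    guest frame is ballooned out. Returns the number of populated guest
    frames after the move.
    """
    if p2m[old_pfn] is not None:
        raise ValueError("old page is not ballooned out")
    if p2m[new_pfn] is None:
        raise ValueError("new page has no backing memory to give up")
    # One swap models the atomic exchange: the old page gains backing
    # memory at the same moment the new page loses it.
    p2m[old_pfn], p2m[new_pfn] = p2m[new_pfn], p2m[old_pfn]
    return sum(1 for mfn in p2m.values() if mfn is not None)
```

Because the populated count is the same before and after the call, the
model never passes through a state where the guest holds more memory
than it did, which is the property that lets compaction proceed while a
memory target is enforced.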
### Maintain multiple queues for pages of different sizes and purposes
We maintain multiple queues for pages of different sizes inside the Xen
balloon driver, so that the Xen balloon worker thread can coalesce
smaller size pages into one larger size page. Queues for
special-purpose pages, such as balloon pages used to map foreign pages,
are also maintained. These special-purpose pages are not subject to
migration or page coalescence.
For instance, the balloon driver can maintain three queues:
1. queue for 2 MB pages
2. queue for 4 KB pages (delegated to core balloon driver)
3. queue for pages used to map pages from other domains
More queues can be added when necessary, but for now one queue for
normal pages and one queue for huge pages should be enough.
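As a rough sketch of the three-queue arrangement, the following Python
model keeps one queue per page kind. The class name `BalloonQueues`,
its methods and the string keys are hypothetical and chosen only for
this example; the rule that only the normal page queue feeds coalescing
is an assumption drawn from the text above, which excludes
special-purpose pages.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class BalloonQueues:
    huge: deque = field(default_factory=deque)     # 2 MB pages
    normal: deque = field(default_factory=deque)   # 4 KB pages
    foreign: deque = field(default_factory=deque)  # pages mapping other domains' pages

    def add(self, kind, pfn):
        """Append pfn to the queue named by kind."""
        queues = {"huge": self.huge, "normal": self.normal, "foreign": self.foreign}
        queues[kind].append(pfn)

    def coalescible(self):
        """Pages that a coalescing pass may consume (normal pages only)."""
        return list(self.normal)
```

Keeping the special-purpose pages on their own queue means a
coalescing or migration pass never has to filter them out one by one.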
### Periodically exchange normal size pages with huge pages
The worker thread wakes up periodically to check if there are enough
pages in the normal size page queue to coalesce into a huge page. If
so, it will try to exchange them for a single huge page using the
XENMEM\_exchange hypercall.
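One pass of the worker can be sketched as follows. This is a minimal
Python model, not the driver's code: `coalesce_once` and the `exchange`
callable are hypothetical, with `exchange` standing in for the
XENMEM\_exchange hypercall and returning `None` on failure. With 4 KB
and 2 MB pages, one huge page corresponds to 2 MB / 4 KB = 512 normal
pages.

```python
from collections import deque

PAGES_PER_HUGE = (2 * 1024 * 1024) // (4 * 1024)  # 512


def coalesce_once(normal_q, huge_q, exchange):
    """Try to turn 512 queued normal pages into one queued huge page.

    exchange is a hypothetical callable taking the batch of normal pages
    and returning a huge page handle, or None when the exchange fails.
    Returns True when a huge page was queued.
    """
    if len(normal_q) < PAGES_PER_HUGE:
        return False  # not enough normal pages yet
    batch = [normal_q.popleft() for _ in range(PAGES_PER_HUGE)]
    huge_page = exchange(batch)
    if huge_page is None:
        # Requeue on failure, preserving the original order.
        normal_q.extendleft(reversed(batch))
        return False
    huge_q.append(huge_page)
    return True
```

A periodic worker would simply call this function each time it wakes
up and go back to sleep when it returns `False`.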
## Relationship with NUMA-aware ballooning
Another orthogonal improvement to the Linux balloon driver is
NUMA-aware ballooning.
The use of balloon page compaction will not interfere with NUMA-aware
ballooning because balloon compaction, which is part of Linux's memory
subsystem, is already NUMA-aware.
All the changes proposed in this design can be made NUMA-aware
provided virtual NUMA topology information is in place.
## Flowcharts
These flowcharts assume the normal page size is 4 KB and the huge page
size is 2 MB. They show how the two queues are maintained. Please note
that "requeue on failure" is not drawn on the flowcharts to make the
flowcharts easier to reason about.


