xen-devel.lists.xenproject.org archive mirror
From: Tamas Lengyel <tamas.lengyel@zentific.com>
To: Andres Lagar-Cavilla <andres.lagarcavilla@gmail.com>
Cc: patrick.wilbur@gmail.com, Steven Maresca <steve@zentific.com>,
	Tim Deegan <tim@xen.org>,
	Andres Lagar-Cavilla <andres@lagarcavilla.org>,
	xen-devel@lists.xen.org
Subject: Re: [PATCH v2] tools/tests/mem-sharing/memshrtool share-all test
Date: Sat, 23 Mar 2013 19:07:45 +0100
Message-ID: <CAErYnsgsDo-LXWxpr_ew1+FOK9wDSBAF-6z-Z2GdX634RW7aeg@mail.gmail.com>
In-Reply-To: <E2CC169B-447E-4DD7-A5D1-B35390FC0475@gmail.com>

Hi Andres,
thanks for taking a look at this patch!

> In terms of higher level:
> - Are these really clone VMs? In order to nominate gfns, they must be allocated … so, what was allocated in the target VM before this? How would you share two 16GB domains if you have 2GB free before allocating the target domain (setting aside how do you deal with CoW overflow, which is a separate issue). You may consider revisiting the add to physmap sharing memop.
> - Can you document when should one call this? Or at least your envisioned scenario. Ties in with the question before.

While add_to_physmap would be ideal for quickly cloning VMs, I haven't
found anything useful (documentation/code sample) on doing it that
way. The only way I found to clone a VM right now is using XL
save/restore and then deduplicating the pages using nominate/share. I
do this in the following order:

1. Retrieve the origin VM's configuration.
2. Parse and modify the config by changing the VM's name, disk and
network interface. The disk assigned to the clone is a CoW disk (qcow2
or LVM), the network bridge is a new bridge so as to avoid MAC/IP
collision with the origin VM.
3. Create a FIFO pipe on the filesystem (mkfifo /tmp/cloning)
4. Use XL to clone the VM's execution state and memory without
deduplication: xl pause <origin> && xl save -c <origin> /tmp/cloning |
xl restore -p <modified config> /tmp/cloning
5. Use the routine in this patch to deduplicate the memory
6. Unpause clone.

This is quite wasteful, as you and Patrick pointed out in
http://lists.xen.org/archives/html/xen-devel/2012-02/msg00259.html.
Unfortunately, I haven't found a straightforward way to duplicate
only the execution state of a VM without duplicating its entire
memory, which would allow me to use add_to_physmap. If XL had an option
in its xl save routine to do a partial save, that would be great. I
did scan through the XL code to determine how one would do that, but
I'm not even close to understanding the internals of XL.

> - I think it's high time we batch sharing calls. I have not been able to do this, but it should be relatively simple to submit a hypervisor patch to achieve this. For what you are trying to do, it will give you a very nice boost in performance.

That sounds like something that would be very useful.

>>> +#define PAGE_SIZE_KB (XC_PAGE_SIZE/1024)
> A matter of style, but in my view this is unneeded, see below.
>>> +        pages=info.max_memkb/PAGE_SIZE_KB;
> >> 2, cleaner code, more in line with the code base.

Sure.
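
A minimal sketch of the two forms of the conversion, assuming 4 KiB
pages and assuming the review comment refers to replacing the division
with a right shift by 2. The SKETCH_* macros and both function names
are hypothetical and are not part of the patch or of libxc.

```c
#include <stdint.h>

/* Assumption: 4 KiB pages (page shift of 12). */
#define SKETCH_PAGE_SHIFT 12
#define SKETCH_PAGE_SIZE  (1UL << SKETCH_PAGE_SHIFT)

/* Division form, as in the patch: one page holds PAGE_SIZE/1024 kB. */
static uint64_t kb_to_pages_div(uint64_t memkb)
{
    return memkb / (SKETCH_PAGE_SIZE / 1024);
}

/* Shift form: one page holds 2^(12-10) = 4 kB, so shift right by 2. */
static uint64_t kb_to_pages_shift(uint64_t memkb)
{
    return memkb >> (SKETCH_PAGE_SHIFT - 10);
}
```

Both forms round down, so they agree for every input as long as the
page size is a power of two of at least 1 kB.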

>>> +        source_pages=source_info.max_memkb/PAGE_SIZE_KB;
> In most scenarios you would need to pause this, particularly as VMs may self-modify their physmap (balloon, mmio, etc)

See my intended usage above (origin and clone should both be paused
during this operation).

>>> +
>>> +        if(pages != source_pages) {
>>> +            printf("Page count in source and destination domain doesn't match "
> to stderr.

OK.

>>> +        for(share_page=0;share_page<=pages;++share_page) {
> The memory layout of an hvm is sparse. While tot pages will get you a lot of sharing, it will not get you all. For example, for a VM with nominal 4GB of RAM, the max gfn is around 4.25GB.

This is something that lacks documentation (or did I just fail to
find it?) so thanks for shedding some light on it! =) I did spend
days trying to figure out the best way of getting the list of valid
gfns of a domain, without success. This approach did seem to work OK,
although the number of pages shared this way was never 100% as parts
of the memory fail at the nominate call, hence my continue; in the
code for those cases.
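
The skip-on-failure loop described above can be sketched without a
hypervisor as follows. This is only an illustration: nominate_fn,
share_fn, share_all and the two stub_* functions are hypothetical
stand-ins, not the libxc signatures and not the code in the patch. The
sketch iterates with gfn < pages so that exactly `pages` gfns are
visited, whereas the quoted loop uses <=.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical callbacks standing in for the nominate and share calls.
 * Each returns 0 on success and nonzero on failure. */
typedef int (*nominate_fn)(int domid, unsigned long gfn, uint64_t *handle);
typedef int (*share_fn)(int src, unsigned long gfn, uint64_t src_handle,
                        int dst, uint64_t dst_handle);

/* Walk gfns 0..pages-1 and skip any gfn that cannot be nominated in
 * either domain. Returns the number of gfns actually shared. */
static unsigned long share_all(int src, int dst, unsigned long pages,
                               nominate_fn nominate, share_fn share)
{
    unsigned long gfn, shared = 0;

    for (gfn = 0; gfn < pages; ++gfn) {
        uint64_t src_handle, dst_handle;

        if (nominate(src, gfn, &src_handle) != 0)
            continue;   /* hole or unshareable page in the origin */
        if (nominate(dst, gfn, &dst_handle) != 0)
            continue;   /* hole or unshareable page in the clone */
        if (share(src, gfn, src_handle, dst, dst_handle) == 0)
            ++shared;
    }
    return shared;
}

/* Stub for trying the loop offline: every fourth gfn acts as a hole. */
static int stub_nominate(int domid, unsigned long gfn, uint64_t *handle)
{
    (void)domid;
    if (gfn % 4 == 3)
        return -1;
    *handle = gfn + 1;
    return 0;
}

/* Stub share: succeeds when both handles refer to the same gfn. */
static int stub_share(int src, unsigned long gfn, uint64_t src_handle,
                      int dst, uint64_t dst_handle)
{
    (void)src; (void)gfn; (void)dst;
    return src_handle == dst_handle ? 0 : -1;
}
```

Returning the count of shared gfns makes it easy to report how far
short of 100% a given run fell.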

> Even for small VMs, you have gfns in the 3.75-4GB range. You should check equality of max gfn, which might be a very difficult thing to achieve depending on the stage of a VM's lifetime at which you call this.

Can you elaborate on this? (Is this documented anywhere? How would you
determine the max gfn of a domain?)
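
To make the quoted figures concrete, here is a small arithmetic sketch.
It takes the numbers from the review at face value (4 GB nominal RAM,
max gfn around 4.25 GB) and assumes 4 KiB pages. The helper names are
hypothetical, and the result only bounds the gfn range that a loop over
0..pages-1 never visits; it does not say how many of those gfns are
actually populated.

```c
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12            /* assumption: 4 KiB pages */
#define MIB (1024ULL * 1024ULL)

/* Number of page frames covering the addresses below `bytes`. */
static uint64_t bytes_to_gfns(uint64_t bytes)
{
    return bytes >> SKETCH_PAGE_SHIFT;
}

/* Size of the gfn range that a 0..ram_pages-1 loop never reaches when
 * the physmap extends up to max_addr_bytes. */
static uint64_t unvisited_gfns(uint64_t ram_bytes, uint64_t max_addr_bytes)
{
    return bytes_to_gfns(max_addr_bytes) - bytes_to_gfns(ram_bytes);
}
```

With 4096 MiB of RAM and a physmap reaching 4352 MiB, the loop bound is
1048576 while the gfn space extends to 1114112, leaving a range of 65536
gfns (256 MiB) outside the loop.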

> And you should have a policy for dealing with physmap holes (for example, is there any point in sharing the VGA mmio? yes/no, your call, argue for it, document it, etc)

I guess this depends on the intended usage of the clone. For my
purposes the closer the clone is to the origin the better. Of course,
there are situations where this is simply not possible (for example
cloning a VM with PCI passthrough devices).

Tamas

Thread overview: 7+ messages
2013-03-18 13:34 [PATCH v2] tools/tests/mem-sharing/memshrtool share-all test Tamas Lengyel
2013-03-21 12:17 ` Tim Deegan
2013-03-22 19:25   ` Andres Lagar-Cavilla
2013-03-23 18:07     ` Tamas Lengyel [this message]
2013-04-22 12:07     ` Ian Campbell
2013-04-22 12:11       ` George Dunlap
2013-04-22 14:46         ` Andres Lagar-Cavilla