From: Jeremy Fitzhardinge <jeremy@goop.org>
To: Rafal Wojtczuk <rafal@invisiblethingslab.com>
Cc: "xen-devel@lists.xensource.com" <xen-devel@lists.xensource.com>,
Keir Fraser <keir.fraser@eu.citrix.com>
Subject: Re: Re: Improving domU restore time
Date: Tue, 01 Jun 2010 10:00:09 -0700
Message-ID: <4C053C99.6010906@goop.org>
In-Reply-To: <20100531094243.GB3374@emperor2.itldev.org>
On 05/31/2010 02:42 AM, Rafal Wojtczuk wrote:
> Hello,
>
>> I would be grateful for the comments on possible methods to improve domain
>> restore performance. Focusing on the PV case, if it matters.
>>
> Continuing the topic; thank you to everyone that responded so far.
>
> Focusing on xen-3.4.3 case for now, dom0/domU still 2.6.32.x pvops x86_64.
> Let me just reiterate that for our purposes, the domain save time (and
> possible related post-processing) is not critical; it
> is only the restore time that matters. I did some experiments; they involve:
> 1) before saving a domain, have domU allocate all free memory in a userland
> process, then fill it with some MAGIC_PATTERN. Save domU, then process the
> savefile, removing all pfns (and their page content) that refer to a page
> containing MAGIC_PATTERN.
> This reduces the savefile size.
>
Why not just balloon the domain down?
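
For reference, the savefile post-processing described in 1) can be sketched roughly as below. This is only a simplified model under stated assumptions: the thread does not give the value of MAGIC_PATTERN, so the 8-byte marker here is hypothetical, and the parallel pfn/page lists stand in for one batch of the savefile rather than the real on-disk format. The helper names are made up for illustration.

```python
PAGE_SIZE = 4096

# Hypothetical marker: the thread does not say what MAGIC_PATTERN actually is.
MAGIC_PATTERN = b"\xde\xad\xbe\xef\xca\xfe\xba\xbe"
MAGIC_PAGE = MAGIC_PATTERN * (PAGE_SIZE // len(MAGIC_PATTERN))


def is_magic_page(page: bytes) -> bool:
    """Return True if the page consists entirely of the repeated marker."""
    return page == MAGIC_PAGE


def drop_magic_pages(pfns, pages):
    """Keep only the (pfn, page) pairs whose content is not the marker fill.

    `pfns` and `pages` are parallel sequences standing in for one batch of
    the savefile; the real format carries more structure than this.
    """
    kept = [(pfn, page) for pfn, page in zip(pfns, pages)
            if not is_magic_page(page)]
    return [pfn for pfn, _ in kept], [page for _, page in kept]
```

Comparing whole pages against a precomputed marker page keeps the check to a single bytes comparison per page; a page that is only partly filled with the marker is kept.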
> 2) instead of executing "xm restore savefile", just poke the xmlrpc request
> to Xend unix socket via socat
>
I would seek alternatives to the xend/xm toolset. I've been doing my
bit to make libxenlight/xl useful, though it still needs a lot of work
to get it to anything remotely production-ready...
> 3) change the /etc/xen/scripts/block so that in the "add file:" case, it calls
> only 3 processes (xenstore-read, losetup, xenstore-write); assuming the
> sharing check can be done elsewhere, this should provide a realistic lower
> bound for the execution time
>
> For a domain with 400MB RAM and 4 vbds, with the savefile in the fs cache,
> this cuts down the restore real time from 2700 ms to 1153 ms. Some questions:
> a) is the 1) method safe ? Normally, xc_domain_restore() allocates mfns via
> xc_domain_memory_populate_physmap() and then calls
> xc_add_mmu_update(MMU_MACHPHYS_UPDATE) on
> the pfn/mfn pairs. If we remove some pfns from the savefile, this will not
> happen. Instead, the mfn for the removed pfn (referring to memory whose
> content we don't care about) will be allocated in uncanonicalize_pagetable(),
> because there will be a pte entry for this page. But uncanonicalize_pagetable()
> does not call xc_add_mmu_update(). Still, the domain seems to be restored
> properly (naturally the buffer filled previously with MAGIC_PATTERN now
> contains junk, but this is the whole purpose of it).
> Again, is xc_add_mmu_update(MMU_MACHPHYS_UPDATE) really needed in the above
> scenario ? It basically does
> set_gpfn_from_mfn(mfn, gpfn)
> but this should already be taken care of by
> xc_domain_memory_populate_physmap() ?
>
> b) There still seems to be some discrepancy between the real time (1153ms) and
> the CPU time (970ms); considering this is a machine with 2 cores (and at
> least the hotplug scripts execute in parallel), it is notable. What can cause
> the involved processes to sleep (we read the savefile from fs cache, so there
> should be no disk reads at all). Is the single threaded nature of xenstored
> the possible cause for the delays ?
>
Have you tried oxenstored? It works well for me, and seems to be a lot
faster.
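
To put the numbers in b) in perspective, here is a small arithmetic sketch. It assumes the 970 ms figure is the summed CPU time of the measured processes and that both cores were available for the whole run; the function name is hypothetical. Since the running intervals of those processes cannot cover more wall time than their total CPU time, real minus CPU is a lower bound on the wall time during which none of them was on a CPU.

```python
def timing_summary(real_ms: float, cpu_ms: float, cores: int):
    """Return (minimum idle wall time in ms, average CPU utilization).

    The idle bound assumes cpu_ms is the summed CPU time of all measured
    processes; utilization is averaged over all cores for the whole run.
    """
    min_idle_ms = max(0.0, real_ms - cpu_ms)
    utilization = cpu_ms / (real_ms * cores)
    return min_idle_ms, utilization


# Figures quoted in the thread: 1153 ms real, 970 ms CPU, 2 cores.
min_idle_ms, utilization = timing_summary(1153, 970, 2)
print(f"at least {min_idle_ms:.0f} ms with no measured process running")
print(f"average utilization across cores: {utilization:.1%}")
```

With the quoted figures, at least 183 ms of wall time has none of the measured processes running, which is consistent with them waiting on something outside the measurement.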
> Generally xenstored seems to be quite busy during the restore. Do you think
> some of the queries (from Xend?) are redundant ? Is there anything else
> that can be removed from the relevant Xend code with no harm ? This question
> may sound too blunt; but given the fact that "xm restore savefile" wastes 220
> ms of CPU time doing apparently nothing useful, I would assume there is some
> overhead in Xend too.
> The systemtap trace is in the attachment; it does not contain a line about the
> xenstored CPU ticks (259ms, really a lot?), as xenstored does not terminate
> any thread.
>
> c)
>
>>> Also, it looks really excessive that basically copying 400MB of memory takes
>>> over 1.3s cpu time. Is IOCTL_PRIVCMD_MMAPBATCH the culprit (its
>>>
>> I would expect IOCTL_PRIVCMD_MMAPBATCH to be the most significant part of
>> that loop.
>>
> Let's imagine there is a hypercall do_direct_memcpy_from_dom0_to_mfn(int
> mfn_count, mfn* mfn_array, char * pages_content).
> Would it make xc_restore faster if, instead of using the xc_map_foreign_batch()
> interface, it called the above hypercall ? On x86_64 all the physical
> memory is already mapped in the hypervisor (is this correct?), so this could
> be quicker, as no page table setup would be necessary ?
>
The main cost of pagetable manipulations is the tlb flush; if you can
batch all your setups together to amortize the cost of the tlb flush, it
should be pretty quick. But if batching is not being used properly,
then it could get very expensive. My own observation of "strace xl
restore" is that it seems to do a *lot* of ioctls on privcmd, but I
haven't looked more closely to see what those calls are, and whether
they're being done in an optimal way.
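
The amortization argument can be illustrated with a toy cost model. Everything numeric here is an assumption: the per-page and per-flush costs are made-up placeholders, not measurements, and the model ignores all other overhead such as the ioctl itself. Only the 400MB domain size comes from the thread.

```python
import math

PAGE_SIZE = 4096


def flush_count(total_pages: int, batch_size: int) -> int:
    """Number of TLB flushes if one flush is paid per batch of mappings."""
    return math.ceil(total_pages / batch_size)


def mapping_cost_us(total_pages: int, batch_size: int,
                    per_page_us: float, flush_us: float) -> float:
    """Toy model: fixed cost per page plus one flush per batch."""
    return (total_pages * per_page_us
            + flush_count(total_pages, batch_size) * flush_us)


# 400MB of guest RAM in 4 KiB pages.
pages_400mb = 400 * 1024 * 1024 // PAGE_SIZE

# Placeholder costs (hypothetical): 0.1 us per page, 5 us per flush.
for batch in (1, 32, 1024):
    cost = mapping_cost_us(pages_400mb, batch, per_page_us=0.1, flush_us=5.0)
    print(f"batch={batch:5d} flushes={flush_count(pages_400mb, batch):6d} "
          f"cost={cost / 1000:.1f} ms")
```

In this model the per-page term is constant, so the total is dominated by the flush term until the batch size is large enough to make it negligible.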
J