From: Anthony Liguori <aliguori@us.ibm.com>
To: Peter Feiner <peter@gridcentric.ca>
Cc: Andres Lagar-Cavilla <andreslc@gridcentric.ca>, qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] [PATCH 1/1] exec: make -mem-path filenames deterministic
Date: Tue, 08 Jan 2013 13:04:34 -0600 [thread overview]
Message-ID: <876237ebst.fsf@codemonkey.ws> (raw)
In-Reply-To: <CADiFPY+nTPOvrXSAO1F12J7toq-QXwB6gi-1CQ2DR2LtCzwweg@mail.gmail.com>
Peter Feiner <peter@gridcentric.ca> writes:
>> This is not reasonable IMHO.
>>
>> I was okay with sticking a name on a ramblock, but encoding a guest PA
>> offset turns this into a supported ABI which I'm not willing to do.
>>
>> A one line change is one thing, but not a complex new option that
>> introduces an ABI only for a proprietary product that's jumping through hoops to keep
>> from contributing useful logic to QEMU.
>
> Hi Anthony,
>
> Thanks for getting back to me.
>
> Sticking a name on the ramblock file would suite our product just
> fine. Indeed, this is what we had agreed upon at the KVM forum.
> However, I submitted a more complex patch in an attempt to expose a
> more general & easy to use feature; I was trying to make a more useful
> contribution than the simple patch :-)
>
> Perhaps I can assuage your ABI concern and argue the utility of this
> patch vs the one-line version. However, if you aren't satisfied,
> please let me know and I'll resubmit the one-line version.
Yes, please submit the oneliner.
> On ABI: This patch doesn't add a new ABI. QEMU already has this ABI
> due to Xen live migration.
>
> When a Xen domain is booted, a new domain is created with an empty
> physmap. Then QEMU is launched. QEMU creates its ramblocks and, via
> memory callbacks (xen_add_to_physmap), populates Xen's physmap using
> ramblock sizes & offsets.
>
> On incoming migration, the Xen toolstack creates a new domain,
> populates its physmap, and copies RAM from the outgoing migration.
> When QEMU is launched, it populates its Xen memory model (i.e.,
> XenIOState) by reading the domain's existing physmap from xenstore.
> When QEMU creates ramblocks, the callbacks in xen-all.c _ignore_ the
> new ramblocks because their offsets are already in the physmap. If the
> new ramblocks had different sizes & offsets than those from the
> outgoing QEMU process, then QEMU's memory model would be inconsistent
> with Xen's (i.e., the physmap maintained by the hypervisor and the
> XenIOState maintained in userspace). In particular, QEMU would expect
> memory at a particular physmap offset that wouldn't have been
> populated by the Xen toolstack during live migration.
This is an internal detail between Xen and QEMU. That doesn't mean it's
a general public API.
I'm fairly certain that Xen does not support arbitrary versions of QEMU
to be used as qemu-dm.
Regards,
Anthony Liguori
>
> On utility: Just adding ramblock names to backing file paths makes
> post-copy migration & cloning possible, but involves some painful VFS
> contortions, which I give a detailed example of below. On the other
> hand, these new -mem-path parameters make post-copy migration &
> cloning simple by leveraging an existing QMP command, existing
> filesystems, and kernel behavior. Put another way, the useful logic
> for memory sharing and post-copy live migration already exists in the
> kernel and a myriad of filesystems. A fairly small patch (albeit not
> one line) enables that logic in QEMU.
>
> Peter
>
> Detailed example:
>
> Suppose you have a patched QEMU that adds ramblock names to their
> backing files and you want to implement memory sharing via cloning.
> When clones come up, each of their ramblocks' backing files need to
> contain the same data as the corresponding backing file from the
> parent (obviously you want those new backing files to somehow share
> pages and COW). The basic idea is to save the parent's ramblock files
> and arrange for the clones to open them.
>
> You can see the parent's ramblock files easily enough by looking at
> the unlinked ramblock files (e.g., /proc/pid/fd/10 is a symlink to
> /tmp/qemu_back_mem.pc.ram.WHFZYw (deleted), /proc/pid/fd/11 is a
> symlink to /tmp/qemu_back_mem.vga.vram.WT1yQW (deleted), etc.).
> Unfortunately, since they're all mapped MAP_PRIVATE, these symlinks,
> when opened, will give all zeros. So you can either implement your own
> filesystem that gives you a backdoor to the MAP_PRIVATE pages (fast
> but complicated), or you can use qemu's monitor to dump guest RAM
> (slow but works).
>
> When a clone runs and creates a new backing file using mkstemp, you
> need to arrange for that backing file to somehow contain the same data
> as the corresponding file from the parent. There is an obvious
> heuristic for determining this correspondence: parse the ramblock name
> from the child's file and use the matching file from the parent.
> Correctness aside (e.g., multiple ramblocks can have the same name,
> e.g., e1000.rom, but this is moot because the _important_ ramblocks,
> i.e., pc.ram and vga.ram, are unique in the emulated system we care
> about), implementing this heuristic is a pain. To see the file being
> created, you need to implement a custom file system. Moreover, to
> share memory with another file that's been opened MAP_PRIVATE, you
> have to implement your own VMA operations. Oye!
next prev parent reply other threads:[~2013-01-08 19:05 UTC|newest]
Thread overview: 11+ messages / expand[flat|nested] mbox.gz Atom feed top
2012-11-20 12:10 [Qemu-devel] [PATCH 0/1] Exogenous memory management via -mem-path Peter Feiner
2012-11-20 12:10 ` [Qemu-devel] [PATCH 1/1] exec: make -mem-path filenames deterministic Peter Feiner
2012-11-29 16:29 ` Peter Feiner
2013-01-07 19:55 ` Anthony Liguori
2013-01-08 15:59 ` Peter Feiner
2013-01-08 19:04 ` Anthony Liguori [this message]
2013-01-08 19:59 ` Peter Maydell
2013-03-01 17:21 ` [Qemu-devel] [PATCH v2] " peter
2013-03-01 18:47 ` Andreas Färber
2013-03-01 19:20 ` Peter Feiner
2013-01-02 19:34 ` [Qemu-devel] [PATCH 0/1] Exogenous memory management via -mem-path Peter Feiner
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=876237ebst.fsf@codemonkey.ws \
--to=aliguori@us.ibm.com \
--cc=andreslc@gridcentric.ca \
--cc=peter@gridcentric.ca \
--cc=qemu-devel@nongnu.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).