From: Anthony Liguori <aliguori@us.ibm.com>
To: Peter Feiner <peter@gridcentric.ca>
Cc: Andres Lagar-Cavilla <andreslc@gridcentric.ca>, qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] [PATCH 1/1] exec: make -mem-path filenames deterministic
Date: Tue, 08 Jan 2013 13:04:34 -0600
Message-ID: <876237ebst.fsf@codemonkey.ws>
In-Reply-To: <CADiFPY+nTPOvrXSAO1F12J7toq-QXwB6gi-1CQ2DR2LtCzwweg@mail.gmail.com>

Peter Feiner <peter@gridcentric.ca> writes:

>> This is not reasonable IMHO.
>>
>> I was okay with sticking a name on a ramblock, but encoding a guest PA
>> offset turns this into a supported ABI which I'm not willing to do.
>>
>> A one-line change is one thing, but not a complex new option that
>> introduces an ABI only for a proprietary product that's jumping through
>> hoops to keep from contributing useful logic to QEMU.
>
> Hi Anthony,
>
> Thanks for getting back to me.
>
> Sticking a name on the ramblock file would suit our product just
> fine. Indeed, this is what we had agreed upon at the KVM forum.
> However, I submitted a more complex patch in an attempt to expose a
> more general & easy to use feature; I was trying to make a more useful
> contribution than the simple patch :-)
>
> Perhaps I can assuage your ABI concern and argue the utility of this
> patch vs the one-line version. However, if you aren't satisfied,
> please let me know and I'll resubmit the one-line version.

Yes, please submit the one-liner.

> On ABI: This patch doesn't add a new ABI. QEMU already has this ABI
> due to Xen live migration.
>
> When a Xen domain is booted, a new domain is created with an empty
> physmap. Then QEMU is launched. QEMU creates its ramblocks and, via
> memory callbacks (xen_add_to_physmap), populates Xen's physmap using
> ramblock sizes & offsets.
>
> On incoming migration, the Xen toolstack creates a new domain,
> populates its physmap, and copies RAM from the outgoing migration.
> When QEMU is launched, it populates its Xen memory model (i.e.,
> XenIOState) by reading the domain's existing physmap from xenstore.
> When QEMU creates ramblocks, the callbacks in xen-all.c _ignore_ the
> new ramblocks because their offsets are already in the physmap. If the
> new ramblocks had different sizes & offsets than those from the
> outgoing QEMU process, then QEMU's memory model would be inconsistent
> with Xen's (i.e., the physmap maintained by the hypervisor and the
> XenIOState maintained in userspace). In particular, QEMU would expect
> memory at a particular physmap offset that wouldn't have been
> populated by the Xen toolstack during live migration.

This is an internal detail between Xen and QEMU.  That doesn't mean it's
a general public API.

I'm fairly certain that Xen does not support using arbitrary versions of
QEMU as qemu-dm.

Regards,

Anthony Liguori

>
> On utility: Just adding ramblock names to backing file paths makes
> post-copy migration & cloning possible, but involves some painful VFS
> contortions, which I give a detailed example of below. On the other
> hand, these new -mem-path parameters make post-copy migration &
> cloning simple by leveraging an existing QMP command, existing
> filesystems, and kernel behavior. Put another way, the useful logic
> for memory sharing and post-copy live migration already exists in the
> kernel and a myriad of filesystems.  A fairly small patch (albeit not
> one line) enables that logic in QEMU.
>
> Peter
>
> Detailed example:
>
> Suppose you have a patched QEMU that adds ramblock names to their
> backing files and you want to implement memory sharing via cloning.
> When clones come up, each of their ramblocks' backing files needs to
> contain the same data as the corresponding backing file from the
> parent (obviously you want those new backing files to somehow share
> pages and COW). The basic idea is to save the parent's ramblock files
> and arrange for the clones to open them.
>
> You can see the parent's ramblock files easily enough by looking at
> the unlinked ramblock files (e.g., /proc/pid/fd/10 is a symlink to
> /tmp/qemu_back_mem.pc.ram.WHFZYw (deleted), /proc/pid/fd/11 is a
> symlink to /tmp/qemu_back_mem.vga.vram.WT1yQW (deleted), etc.).
> Unfortunately, since they're all mapped MAP_PRIVATE, these symlinks,
> when opened, will give all zeros. So you can either implement your own
> filesystem that gives you a backdoor to the MAP_PRIVATE pages (fast
> but complicated), or you can use qemu's monitor to dump guest RAM
> (slow but works).
>
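The MAP_PRIVATE behavior described above can be reproduced outside QEMU. The following is a minimal sketch, not QEMU code: it assumes a Unix system, and the function name and the "qemu_back_mem.demo." prefix are made up for illustration. It creates an unlinked temporary file, maps it MAP_PRIVATE, writes through the mapping, and then reads the file through its descriptor.

```python
import mmap
import os
import tempfile


def private_mapping_vs_file(size=mmap.PAGESIZE):
    """Write through a MAP_PRIVATE mapping, then read the backing file.

    Returns (bytes seen through the mapping, bytes read from the file).
    Hypothetical helper for illustration; Unix only.
    """
    fd, path = tempfile.mkstemp(prefix="qemu_back_mem.demo.")
    try:
        os.unlink(path)          # file stays alive through the open fd
        os.ftruncate(fd, size)   # sparse file, reads back as zeros
        m = mmap.mmap(fd, size, flags=mmap.MAP_PRIVATE,
                      prot=mmap.PROT_READ | mmap.PROT_WRITE)
        try:
            m[:5] = b"guest"                 # copy-on-write, private page
            in_mapping = bytes(m[:5])
            on_file = os.pread(fd, 5, 0)     # what another opener would see
        finally:
            m.close()
    finally:
        os.close(fd)
    return in_mapping, on_file


if __name__ == "__main__":
    print(private_mapping_vs_file())
```

The write lands in a private copy-on-write page, so the file itself still reads as zeros. This is why opening the /proc/pid/fd symlinks does not yield guest RAM.
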
> When a clone runs and creates a new backing file using mkstemp, you
> need to arrange for that backing file to somehow contain the same data
> as the corresponding file from the parent. There is an obvious
> heuristic for determining this correspondence: parse the ramblock name
> from the child's file and use the matching file from the parent.
> Correctness aside (e.g., multiple ramblocks can have the same name,
> e.g., e1000.rom, but this is moot because the _important_ ramblocks,
> i.e., pc.ram and vga.vram, are unique in the emulated system we care
> about), implementing this heuristic is a pain. To see the file being
> created, you need to implement a custom file system. Moreover, to
> share memory with another file that's been opened MAP_PRIVATE, you
> have to implement your own VMA operations. Oy!
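
The name-matching heuristic from the quoted example can be sketched as follows. This is only an illustration under stated assumptions: backing files are taken to be named "qemu_back_mem.<ramblock name>.XXXXXX" with a six-character mkstemp suffix, as in the example paths above, and the function names are hypothetical. Refusing ambiguous matches (for example, several e1000.rom blocks) is a choice made here, not something the patch specifies.

```python
import os

PREFIX = "qemu_back_mem."        # assumed prefix, taken from the example paths
SUFFIX_LEN = 6                   # mkstemp fills in six random characters
DELETED = " (deleted)"           # marker shown in /proc/pid/fd symlink targets


def ramblock_name(path):
    """Return the ramblock name embedded in a backing file path, or None."""
    base = os.path.basename(path)
    if base.endswith(DELETED):
        base = base[:-len(DELETED)]
    if not base.startswith(PREFIX):
        return None
    rest = base[len(PREFIX):]
    # The ramblock name may itself contain dots (pc.ram), so split on the last.
    name, dot, suffix = rest.rpartition(".")
    if not dot or not name or len(suffix) != SUFFIX_LEN:
        return None
    return name


def match_parent(child_path, parent_paths):
    """Return the parent's file for the child's ramblock, or None if the
    name is missing or shared by more than one parent file."""
    by_name = {}
    for p in parent_paths:
        n = ramblock_name(p)
        if n is not None:
            by_name.setdefault(n, []).append(p)
    candidates = by_name.get(ramblock_name(child_path), [])
    return candidates[0] if len(candidates) == 1 else None
```

This covers only the bookkeeping. Observing the child's file at creation time and sharing pages with the parent's MAP_PRIVATE file still need the filesystem and VMA work described in the quoted text.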
