qemu-devel.nongnu.org archive mirror
From: Valerio Aimale <valerio@aimale.com>
To: Markus Armbruster <armbru@redhat.com>
Cc: lcapitulino@redhat.com, Eduardo Habkost <ehabkost@redhat.com>,
	qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] QEMU patch to allow VM introspection via libvmi
Date: Tue, 27 Oct 2015 09:18:29 -0600
Message-ID: <562F95C5.2020006@aimale.com>
In-Reply-To: <87wpu84arz.fsf@blackfin.pond.sub.org>

On 10/27/15 9:00 AM, Markus Armbruster wrote:
> Valerio Aimale <valerio@aimale.com> writes:
>
>> On 10/26/15 11:52 AM, Eduardo Habkost wrote:
>>>
>>> I was trying to advocate the use of a shared mmap'ed region. The sharing
>>> would be two-ways (RW for both) between the QEMU virtualizer and the libvmi
>>> process. I envision that there could be a QEMU command line argument, such
>>> as "--mmap-guest-memory <filename>" Understand that Eric feels strongly the
>>> libvmi client should own the file name - I have not forgotten that. When
>>> that command line argument is given, as part of the guest initialization,
>>> QEMU creates a file of size equal to the size of the guest memory containing
>>> all zeros, mmaps that file to the guest memory with  PROT_READ|PROT_WRITE
>>> and MAP_FILE|MAP_SHARED, then starts the guest.
>>> This is basically what memory-backend-file (and the legacy -mem-path
>>> option) already does today, but it unlinks the file just after opening
>>> it. We can change it to accept a full filename and/or an option to make
>>> it not unlink the file after opening it.
>>>
>>> I don't remember if memory-backend-file is usable without -numa, but we
>>> could make it possible somehow.
>> Eduardo, I did try this approach. It takes 2 line changes in exec.c:
>> comment the unlink out, and making sure MAP_SHARED is used when
>> -mem-path and -mem-prealloc are given. It works beautifully, and
>> libvmi accesses are fast. However, the VM is slowed down to a crawl,
>> obviously, because each RAM access by the VM triggers a page fault on
>> the mmapped file. I don't think having a crawling VM is desirable, so
>> this approach goes out the door.
> Uh, I don't understand why "each RAM access by the VM triggers a page
> fault".  Can you show us the patch you used?
Sorry, my explanation was too brief. Every time the guest changes a byte in
physical RAM, I think that triggers a page write to the mmapped file. My
understanding is that, with MAP_SHARED, each write to RAM triggers a
file write, hence the slowness. These are the simple changes I made to
test it, as a proof of concept.

In exec.c of qemu-2.4.0.1, change

---
     fd = mkstemp(filename);
     if (fd < 0) {
         error_setg_errno(errp, errno,
                          "unable to create backing store for hugepages");
         g_free(filename);
         goto error;
     }
     unlink(filename);
     g_free(filename);

     memory = (memory+hpagesize-1) & ~(hpagesize-1);

     /*
      * ftruncate is not supported by hugetlbfs in older
      * hosts, so don't bother bailing out on errors.
      * If anything goes wrong with it under other filesystems,
      * mmap will fail.
      */
     if (ftruncate(fd, memory)) {
         perror("ftruncate");
     }

     area = mmap(0, memory, PROT_READ | PROT_WRITE,
                 (block->flags & RAM_SHARED ? MAP_SHARED : MAP_PRIVATE),
                 fd, 0);
---

to

---
     fd = mkstemp(filename);
     if (fd < 0) {
         error_setg_errno(errp, errno,
                          "unable to create backing store for hugepages");
         g_free(filename);
         goto error;
     }
     /* unlink(filename); */ /* Valerio's change to persist guest RAM mmaped file */
     g_free(filename);

     memory = (memory+hpagesize-1) & ~(hpagesize-1);

     /*
      * ftruncate is not supported by hugetlbfs in older
      * hosts, so don't bother bailing out on errors.
      * If anything goes wrong with it under other filesystems,
      * mmap will fail.
      */
     if (ftruncate(fd, memory)) {
         perror("ftruncate");
     }

     area = mmap(0, memory, PROT_READ | PROT_WRITE,
                 MAP_FILE | MAP_SHARED, /* Valerio's change to persist guest RAM mmaped file */
                 fd, 0);
---

Then recompile QEMU.
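
As an aside, the effect of the second change can be checked outside QEMU. The sketch below is not part of the patch: store_reaches_file() is a hypothetical helper, and the /tmp scratch path is an assumption. It maps a one-page temporary file with the given flag, stores a byte through the mapping, and reads the file back to see whether the store is visible there.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of QEMU): map a one-page temporary file
 * with the given flag (MAP_SHARED or MAP_PRIVATE), store one byte through
 * the mapping, then read the file back with pread().
 * Returns 1 if the store is visible in the file, 0 if not, -1 on error.
 */
static int store_reaches_file(int map_flag)
{
    char path[] = "/tmp/map_demo.XXXXXX";   /* assumed scratch location */
    long page = sysconf(_SC_PAGESIZE);
    unsigned char byte = 0;
    int result = -1;

    int fd = mkstemp(path);
    if (fd < 0) {
        return -1;
    }
    unlink(path);                           /* the demo file need not persist */

    if (ftruncate(fd, page) == 0) {
        unsigned char *area = mmap(NULL, (size_t)page,
                                   PROT_READ | PROT_WRITE,
                                   map_flag, fd, 0);
        if (area != MAP_FAILED) {
            area[0] = 0xA5;                 /* stands in for a guest RAM write */
            msync(area, (size_t)page, MS_SYNC);
            if (pread(fd, &byte, 1, 0) == 1) {
                result = (byte == 0xA5);
            }
            munmap(area, (size_t)page);
        }
    }
    close(fd);
    return result;
}
```

On Linux, store_reaches_file(MAP_SHARED) is expected to return 1 and store_reaches_file(MAP_PRIVATE) to return 0, because a private mapping is copy-on-write. That is why an external reader of the file only sees guest RAM when the mapping is MAP_SHARED.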

Launch a VM as

/usr/local/bin/qemu-system-x86_64 -name Windows10 -S -machine pc-i440fx-2.4,accel=kvm,usb=off [...] -mem-prealloc -mem-path /tmp/maps

# I know -mem-path is deprecated, but I used it to speed up the proof
of concept.

With the above command, I have the following file:

$ ls -l /tmp/maps/
-rw------- 1 libvirt-qemu kvm 2147483648 Oct 27 08:31 qemu_back_mem.pc.ram.fP4sKH

which is an mmap of the Windows VM's physical RAM:

$ hexdump -C /tmp/maps/qemu_back_mem.val.pc.ram.fP4sKH

00000000  53 ff 00 f0 53 ff 00 f0  c3 e2 00 f0 53 ff 00 f0  |S...S.......S...|
[...]
00000760  24 02 c3 49 6e 76 61 6c  69 64 20 70 61 72 74 69  |$..Invalid parti|
00000770  74 69 6f 6e 20 74 61 62  6c 65 00 45 72 72 6f 72  |tion table.Error|
00000780  20 6c 6f 61 64 69 6e 67  20 6f 70 65 72 61 74 69  | loading operati|
00000790  6e 67 20 73 79 73 74 65  6d 00 4d 69 73 73 69 6e  |ng system.Missin|
000007a0  67 20 6f 70 65 72 61 74  69 6e 67 20 73 79 73 74  |g operating syst|
000007b0  65 6d 00 00 00 63 7b 9a  73 d8 99 ce 00 00 80 20  |em...c{.s...... |
[...]
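
The same bytes can be read programmatically, which is roughly what an introspection client would do with the persisted file. This is only a sketch under stated assumptions: read_guest_phys() is a hypothetical name, not libvmi's API, and it assumes the file offset equals the guest physical address, which may not hold for every guest memory layout.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical reader (not libvmi's API): copy `len` bytes starting at
 * file offset `paddr` of the persisted backing file into `buf`.
 * Assumption: file offset == guest physical address.
 * Returns 0 on success, -1 on error or if the range is outside the file.
 */
static int read_guest_phys(const char *path, off_t paddr,
                           void *buf, size_t len)
{
    struct stat st;
    int rc = -1;

    int fd = open(path, O_RDONLY);
    if (fd < 0) {
        return -1;
    }
    if (fstat(fd, &st) == 0 && st.st_size > 0 &&
        paddr >= 0 && paddr <= st.st_size &&
        len <= (size_t)(st.st_size - paddr)) {
        /* Read-only shared mapping: stores made by the guest show up here. */
        unsigned char *ram = mmap(NULL, (size_t)st.st_size, PROT_READ,
                                  MAP_SHARED, fd, 0);
        if (ram != MAP_FAILED) {
            memcpy(buf, ram + paddr, len);
            munmap(ram, (size_t)st.st_size);
            rc = 0;
        }
    }
    close(fd);
    return rc;
}
```

For example, calling read_guest_phys() on the file under /tmp/maps with paddr 0x763 and len 23 should copy out the "Invalid partition table" string seen in the dump above, as long as the guest has not overwritten it in the meantime.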

I did not try mmap'ing to a file on a RAM disk. Without physical disk
I/O, the VM might run faster.
