From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:47464) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1Zr61F-0008Ep-DP for qemu-devel@nongnu.org; Tue, 27 Oct 2015 11:18:38 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1Zr61B-0005Rk-EQ for qemu-devel@nongnu.org; Tue, 27 Oct 2015 11:18:37 -0400
Received: from smtp.aimale.com ([166.78.138.199]:50597) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1Zr61B-0005Rc-81 for qemu-devel@nongnu.org; Tue, 27 Oct 2015 11:18:33 -0400
References: <56250035.40805@aimale.com> <87twpkqyow.fsf@blackfin.pond.sub.org> <20151022191203.GC3736@thinpad.lan.raisama.net> <56293F99.1060109@aimale.com> <20151022214719.GD3736@thinpad.lan.raisama.net> <56295A60.1040901@aimale.com> <20151023185504.GI3736@thinpad.lan.raisama.net> <562A85C9.6050309@aimale.com> <87ziz60zeg.fsf@blackfin.pond.sub.org> <562E64C0.1080509@aimale.com> <20151026175217.GC4180@thinpad.lan.raisama.net> <562F8796.4070503@aimale.com> <87wpu84arz.fsf@blackfin.pond.sub.org>
From: Valerio Aimale
Message-ID: <562F95C5.2020006@aimale.com>
Date: Tue, 27 Oct 2015 09:18:29 -0600
MIME-Version: 1.0
In-Reply-To: <87wpu84arz.fsf@blackfin.pond.sub.org>
Content-Type: text/plain; charset=windows-1252; format=flowed
Content-Transfer-Encoding: 7bit
Subject: Re: [Qemu-devel] QEMU patch to allow VM introspection via libvmi
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
To: Markus Armbruster
Cc: lcapitulino@redhat.com, Eduardo Habkost , qemu-devel@nongnu.org

On 10/27/15 9:00 AM, Markus Armbruster wrote:
> Valerio Aimale writes:
>
>> On 10/26/15 11:52 AM, Eduardo Habkost wrote:
>>>
>>> I was trying to advocate the use of a shared mmap'ed region. The sharing
>>> would be two-ways (RW for both) between the QEMU virtualizer and the libvmi
>>> process.
>>> I envision that there could be a QEMU command line argument, such
>>> as "--mmap-guest-memory " Understand that Eric feels strongly the
>>> libvmi client should own the file name - I have not forgotten that. When
>>> that command line argument is given, as part of the guest initialization,
>>> QEMU creates a file of size equal to the size of the guest memory containing
>>> all zeros, mmaps that file to the guest memory with PROT_READ|PROT_WRITE
>>> and MAP_FILE|MAP_SHARED, then starts the guest.
>>> This is basically what memory-backend-file (and the legacy -mem-path
>>> option) already does today, but it unlinks the file just after opening
>>> it. We can change it to accept a full filename and/or an option to make
>>> it not unlink the file after opening it.
>>>
>>> I don't remember if memory-backend-file is usable without -numa, but we
>>> could make it possible somehow.
>> Eduardo, I did try this approach. It takes 2 line changes in exec.c:
>> comment the unlink out, and making sure MAP_SHARED is used when
>> -mem-path and -mem-prealloc are given. It works beautifully, and
>> libvmi accesses are fast. However, the VM is slowed down to a crawl,
>> obviously, because each RAM access by the VM triggers a page fault on
>> the mmapped file. I don't think having a crawling VM is desirable, so
>> this approach goes out the door.
> Uh, I don't understand why "each RAM access by the VM triggers a page
> fault". Can you show us the patch you used?

Sorry, that explanation was too brief. Every time the guest flips a byte in
physical RAM, I think that triggers a page write to the mmapped file. My
understanding is that, with MAP_SHARED, each write to RAM triggers a file
write, hence the slowness.

These are the simple changes I made to test it, as a proof of concept.
In exec.c of qemu-2.4.0.1, change

---
    fd = mkstemp(filename);
    if (fd < 0) {
        error_setg_errno(errp, errno,
                         "unable to create backing store for hugepages");
        g_free(filename);
        goto error;
    }
    unlink(filename);
    g_free(filename);

    memory = (memory+hpagesize-1) & ~(hpagesize-1);

    /*
     * ftruncate is not supported by hugetlbfs in older
     * hosts, so don't bother bailing out on errors.
     * If anything goes wrong with it under other filesystems,
     * mmap will fail.
     */
    if (ftruncate(fd, memory)) {
        perror("ftruncate");
    }

    area = mmap(0, memory, PROT_READ | PROT_WRITE,
                (block->flags & RAM_SHARED ? MAP_SHARED : MAP_PRIVATE),
                fd, 0);
---

to

---
    fd = mkstemp(filename);
    if (fd < 0) {
        error_setg_errno(errp, errno,
                         "unable to create backing store for hugepages");
        g_free(filename);
        goto error;
    }
    /* unlink(filename); */ /* Valerio's change to persist guest RAM mmaped file */
    g_free(filename);

    memory = (memory+hpagesize-1) & ~(hpagesize-1);

    /*
     * ftruncate is not supported by hugetlbfs in older
     * hosts, so don't bother bailing out on errors.
     * If anything goes wrong with it under other filesystems,
     * mmap will fail.
     */
    if (ftruncate(fd, memory)) {
        perror("ftruncate");
    }

    area = mmap(0, memory, PROT_READ | PROT_WRITE,
                MAP_FILE | MAP_SHARED, /* Valerio's change to persist guest RAM mmaped file */
                fd, 0);
---

then recompile qemu.

Launch a VM as

/usr/local/bin/qemu-system-x86_64 -name Windows10 -S -machine pc-i440fx-2.4,accel=kvm,usb=off [...] -mem-prealloc -mem-path /tmp/maps

# I know -mem-path is deprecated, but I used it to speed up the proof of concept.

With the above command, I have the following file

$ ls -l /tmp/maps/
-rw------- 1 libvirt-qemu kvm 2147483648 Oct 27 08:31 qemu_back_mem.pc.ram.fP4sKH

which is a mmap of the Win VM physical RAM

$ hexdump -C /tmp/maps/qemu_back_mem.val.pc.ram.fP4sKH
00000000  53 ff 00 f0 53 ff 00 f0  c3 e2 00 f0 53 ff 00 f0  |S...S.......S...|
[...]
00000760  24 02 c3 49 6e 76 61 6c  69 64 20 70 61 72 74 69  |$..Invalid parti|
00000770  74 69 6f 6e 20 74 61 62  6c 65 00 45 72 72 6f 72  |tion table.Error|
00000780  20 6c 6f 61 64 69 6e 67  20 6f 70 65 72 61 74 69  | loading operati|
00000790  6e 67 20 73 79 73 74 65  6d 00 4d 69 73 73 69 6e  |ng system.Missin|
000007a0  67 20 6f 70 65 72 61 74  69 6e 67 20 73 79 73 74  |g operating syst|
000007b0  65 6d 00 00 00 63 7b 9a  73 d8 99 ce 00 00 80 20  |em...c{.s...... |
[...]

I did not try mmap'ing to a file on a RAM disk. Without physical disk
I/O, the VM might run faster.
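
For reference, here is a minimal sketch of the sharing mechanism discussed in this message: one side maps a file PROT_READ|PROT_WRITE with MAP_SHARED, as the patched exec.c does, and a second side maps the same file and reads it as guest RAM. The helper names create_shared_ram, map_guest_ram and read_guest_phys are hypothetical and belong to neither QEMU nor libvmi. The sketch also assumes that an offset into the file equals the guest-physical address, which may not hold when guest RAM is split around address-space holes. MAP_FILE is left out because on Linux it is a compatibility flag with no effect.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical stand-in for the QEMU side: create a zero-filled file of
 * `size` bytes and map it read-write and shared, so that stores through
 * the returned pointer land in the file's pages.
 */
static uint8_t *create_shared_ram(const char *path, size_t size)
{
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0) {
        return NULL;
    }
    if (ftruncate(fd, (off_t)size) < 0) {
        close(fd);
        return NULL;
    }
    void *p = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd); /* the mapping stays valid after the descriptor is closed */
    return p == MAP_FAILED ? NULL : p;
}

/*
 * Hypothetical stand-in for the introspection client: map an existing
 * backing file read-only and shared. The file size is taken as the
 * guest RAM size and returned through size_out.
 */
static const uint8_t *map_guest_ram(const char *path, size_t *size_out)
{
    struct stat st;
    int fd = open(path, O_RDONLY);
    if (fd < 0) {
        return NULL;
    }
    if (fstat(fd, &st) < 0 || st.st_size <= 0) {
        close(fd);
        return NULL;
    }
    void *p = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_SHARED, fd, 0);
    close(fd);
    if (p == MAP_FAILED) {
        return NULL;
    }
    *size_out = (size_t)st.st_size;
    return p;
}

/*
 * Copy `len` bytes starting at offset `paddr` into `buf`.
 * Assumption: file offset == guest-physical address.
 * Returns 0 on success, -1 if the range falls outside the mapping.
 */
static int read_guest_phys(const uint8_t *ram, size_t ram_size,
                           uint64_t paddr, void *buf, size_t len)
{
    assert(ram != NULL);
    if (paddr > ram_size || len > ram_size - paddr) {
        return -1;
    }
    memcpy(buf, ram + paddr, len);
    return 0;
}
```

A client would call map_guest_ram() on the file left behind in the -mem-path directory and then read_guest_phys() at the offsets it cares about. Because both mappings are MAP_SHARED on the same file, a store through the writable mapping is visible through the read-only one without any explicit write() or read() call.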