From: Valerio Aimale <valerio@aimale.com>
To: Markus Armbruster <armbru@redhat.com>
Cc: qemu-devel@nongnu.org, ehabkost@redhat.com, lcapitulino@redhat.com
Subject: Re: [Qemu-devel] QEMU patch to allow VM introspection via libvmi
Date: Mon, 19 Oct 2015 08:37:41 -0600
Message-ID: <56250035.40805@aimale.com>
In-Reply-To: <87io63xpke.fsf@blackfin.pond.sub.org>
On 10/19/15 1:52 AM, Markus Armbruster wrote:
> Valerio Aimale <valerio@aimale.com> writes:
>
>> On 10/16/15 2:15 AM, Markus Armbruster wrote:
>>> valerio@aimale.com writes:
>>>
>>>> All-
>>>>
>>>> I've produced a patch for the current QEMU HEAD, for libvmi to
>>>> introspect QEMU/KVM VMs.
>>>>
>>>> Libvmi has patches for the old qemu-kvm fork, inside its source tree:
>>>> https://github.com/libvmi/libvmi/tree/master/tools/qemu-kvm-patch
>>>>
>>>> This patch adds an HMP and a QMP command, "pmemaccess". When the
>>>> command is invoked with a string argument (a filename), it will open
>>>> a UNIX socket and spawn a listening thread.
>>>>
>>>> The client writes binary commands to the socket, in the form of a C
>>>> structure:
>>>>
>>>> struct request {
>>>>     uint8_t type;     // 0 quit, 1 read, 2 write, ... rest reserved
>>>>     uint64_t address; // address to read from OR write to
>>>>     uint64_t length;  // number of bytes to read OR write
>>>> };
>>>>
>>>> The client receives as a response either (length+1) bytes, if it is a
>>>> read operation, or 1 byte if it is a write operation.
>>>>
>>>> The last byte of a read operation response indicates success (1
>>>> success, 0 failure). The single byte returned for a write operation
>>>> indicates the same (1 success, 0 failure).
>>> So, if you ask to read 1 MiB, and it fails, you get back 1 MiB of
>>> garbage followed by the "it failed" byte?
>> Markus, that appears to be the case. However, I did not write the
>> communication protocol between libvmi and qemu. I'm assuming that the
>> person who wrote the protocol did not want to bother with
>> overcomplicating things.
>>
>> https://github.com/libvmi/libvmi/blob/master/libvmi/driver/kvm/kvm.c
>>
>> I'm thinking he assumed reads would be small in size and the price of
>> reading garbage was less than the price of writing a more complicated
>> protocol. I can see his point; confronted with the same problem, I
>> might have done the same.
> All right, the interface is designed for *small* memory blocks then.
>
> Makes me wonder why he needs a separate binary protocol on a separate
> socket. Small blocks could be done just fine in QMP.
The problem is speed. If one is analyzing the memory space of a running
process (physical and paged), libvmi will make a large number of small
and mid-sized reads. With xp or pmemsave, the overhead is quite
significant: xp has overhead due to encoding, and pmemsave has overhead
due to file open/write (server) and file open/read/close/unlink (client).

Others have gone through this problem before me. It appears that pmemsave
and xp are significantly slower than reading memory over a socket via
pmemaccess.

The following data is not mine, but it shows the time, in milliseconds,
required to resolve the content of a paged memory address via socket
(pmemaccess), pmemsave and xp:

http://cl.ly/image/322a3s0h1V05

Again, I did not produce those data points; they come from an old libvmi
thread.

I think it is conceivable that there could be a QMP command that
returns the content of an arbitrarily sized memory region as a base64 or
base85 JSON string. It would still have both time overhead (due to
encoding/decoding) and space overhead (about 33% for base64, 25% for
base85), plus JSON encoding/decoding overhead. It might still be the case
that the socket would outperform such a command as well, speed-wise. I
don't think it would be any faster than xp.
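
To make the space overhead concrete, here is a small sketch of the encoded sizes. It assumes standard padded base64 (4 characters per 3-byte group) and an Ascii85-style base85 without delimiters or the 'z' shortcut (5 characters per 4-byte group, a trailing partial group of k bytes taking k+1 characters). The function names are made up for illustration, and JSON quoting and escaping are not counted.

```c
#include <stdint.h>

/* Characters needed to encode n raw bytes as padded base64:
 * every group of up to 3 bytes becomes 4 characters. */
uint64_t base64_len(uint64_t n)
{
    return 4 * ((n + 2) / 3);
}

/* Characters needed to encode n raw bytes as Ascii85-style base85:
 * every full 4-byte group becomes 5 characters, and a trailing
 * partial group of k bytes becomes k + 1 characters. */
uint64_t base85_len(uint64_t n)
{
    uint64_t rem = n % 4;

    return 5 * (n / 4) + (rem ? rem + 1 : 0);
}
```

For a single 4096-byte page this gives 5464 characters in base64 and 5120 in base85, before any JSON framing is added.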

There's also a similar patch, floating around the internet, that uses
shared memory, instead of sockets, for inter-process communication
between libvmi and QEMU. I've never used that.
>
>>>> The socket API was written by the libvmi author and it works with the
>>>> current libvmi version. The libvmi client-side implementation is at:
>>>>
>>>> https://github.com/libvmi/libvmi/blob/master/libvmi/driver/kvm/kvm.c
>>>>
>>>> As many use kvm VM's for introspection, malware and security analysis,
>>>> it might be worth thinking about making the pmemaccess a permanent
>>>> hmp/qmp command, as opposed to having to produce a patch at each QEMU
>>>> point release.
>>> Related existing commands: memsave, pmemsave, dump-guest-memory.
>>>
>>> Can you explain why these won't do for your use case?
>> For people who do security analysis there are two use cases, static
>> and dynamic analysis. With memsave, pmemsave and dump-guest-memory one
>> can do static analysis, i.e. snapshotting a VM and seeing what was
>> happening at that point in time.
>> Dynamic analysis requires being able to 'introspect' a VM while it's
>> running.
>>
>> If you take a snapshot of two people exchanging a glass of water, and
>> you happen to take it at the very moment both persons have their hands
>> on the glass, it's hard to tell who passed the glass to whom. If you
>> have a movie of the same scene, it's obvious who's the giver and who's
>> the receiver. Same use case.
> I understand the need for introspecting a running guest. What exactly
> makes the existing commands unsuitable for that?
Speed. See discussion above.
>
>> More to the point, there's a host of C and python frameworks to
>> dynamically analyze VMs: volatility, rekal, "drakvuf", etc. They all
>> build on top of libvmi. I did not want to reinvent the wheel.
> Fair enough.
>
> Front page http://libvmi.com/ claims "Works with Xen, KVM, Qemu, and Raw
> memory files." What exactly is missing for KVM?
When they say they support kvm, what they really mean is that they
support the (retired, I understand) qemu-kvm fork via a patch that is
provided in the libvmi source tree. I think the most recent qemu-kvm
supported is 1.6.0:

https://github.com/libvmi/libvmi/tree/master/tools/qemu-kvm-patch

I wanted to bring support to the head revision of QEMU, to bring libvmi
level with more modern QEMU.

Maybe the solution is simply to put this patch in the libvmi source
tree, which I've already asked to do via pull request, leaving QEMU alone.
However, the patch has to be updated at every QEMU point release. I
wanted to avoid that, if at all possible.
>
>> Mind you, 99.9% of people that do dynamic VM analysis use xen. They
>> contend that xen has better introspection support. In my case, I did
>> not want to bother with dedicating a full server to be a xen domain
>> 0. I just wanted to do a quick test by standing up a QEMU/kvm VM, in
>> an otherwise purposed server.
> I'm not at all against better introspection support in QEMU. I'm just
> trying to understand the problem you're trying to solve with your
> patches.
What all users of libvmi would love to have is very high-speed access
to VM physical memory as part of the QEMU source tree, rather than
supported via a patch. It can be implemented as the QEMU owners see fit,
as long as it is blazing fast and easily accessed via a client library or
inter-process communication.

My gut feeling is that it has to bypass the QMP protocol, encoding, file
access and JSON to be fast, but it is just a gut feeling, worth nothing.
>
>>>> Also, the QAPI definition of the pmemsave command should be changed
>>>> to be usable with 64-bit VMs
>>>>
>>>> in qapi-schema.json
>>>>
>>>> from
>>>>
>>>> ---
>>>> { 'command': 'pmemsave',
>>>> 'data': {'val': 'int', 'size': 'int', 'filename': 'str'} }
>>>> ---
>>>>
>>>> to
>>>>
>>>> ---
>>>> { 'command': 'pmemsave',
>>>> 'data': {'val': 'int64', 'size': 'int64', 'filename': 'str'} }
>>>> ---
>>> In the QAPI schema, 'int' is actually an alias for 'int64'. Yes, that's
>>> confusing.
>> I think it's confusing for the HMP parser too. If you have a VM with
>> 8Gb of RAM and want to snapshot the whole physical memory, via HMP
>> over telnet this is what happens:
>>
>> $ telnet localhost 1234
>> Trying 127.0.0.1...
>> Connected to localhost.
>> Escape character is '^]'.
>> QEMU 2.4.0.1 monitor - type 'help' for more information
>> (qemu) help pmemsave
>> pmemsave addr size file -- save to disk physical memory dump starting
>> at 'addr' of size 'size'
>> (qemu) pmemsave 0 8589934591 "/tmp/memorydump"
>> 'pmemsave' has failed: integer is for 32-bit values
>> Try "help pmemsave" for more information
>> (qemu) quit
> Your change to pmemsave's definition in qapi-schema.json is effectively a
> no-op.
>
> Your example shows *HMP* command pmemsave. The definition of an HMP
> command is *independent* of the QMP command. The implementation *uses*
> the QMP command.
>
> QMP pmemsave is defined in qapi-schema.json as
>
> { 'command': 'pmemsave',
> 'data': {'val': 'int', 'size': 'int', 'filename': 'str'} }
>
> Its implementation is in cpus.c:
>
> void qmp_pmemsave(int64_t addr, int64_t size, const char *filename,
>                   Error **errp)
>
> Note the int64_t size.
>
> HMP pmemsave is defined in hmp-commands.hx as
>
> {
>     .name       = "pmemsave",
>     .args_type  = "val:l,size:i,filename:s",
>     .params     = "addr size file",
>     .help       = "save to disk physical memory dump starting at 'addr' of size 'size'",
>     .mhandler.cmd = hmp_pmemsave,
> },
>
> Its implementation is in hmp.c:
>
> void hmp_pmemsave(Monitor *mon, const QDict *qdict)
> {
>     uint32_t size = qdict_get_int(qdict, "size");
>     const char *filename = qdict_get_str(qdict, "filename");
>     uint64_t addr = qdict_get_int(qdict, "val");
>     Error *err = NULL;
>
>     qmp_pmemsave(addr, size, filename, &err);
>     hmp_handle_error(mon, &err);
> }
>
> Note uint32_t size.
>
> Arguably, the QMP size argument should use 'size' (an alias for
> 'uint64'), and the HMP args_type should use 'size:o'.
Understood. Indeed, I've re-implemented 'pmemaccess' the same way
pmemsave is implemented. There is a single function and two entry
points, one for HMP and one for QMP. I think pmemaccess mimics
pmemsave closely.

However, if one wants to simply dump a memory region via HMP, for human
ease of use, debugging or testing purposes, one cannot dump memory
regions that reside higher than 2^32-1.
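
The uint32_t size in the hmp_pmemsave() quoted above would also be a plausible explanation for the 4294967295-byte dump in my earlier message (quoted below), although I have not verified it against the source. A minimal sketch of just the narrowing step, with a made-up function name:

```c
#include <stdint.h>

/* Mirrors the assignment `uint32_t size = qdict_get_int(qdict, "size");`
 * from the quoted hmp_pmemsave(): the 64-bit value returned by the
 * dictionary lookup is narrowed to 32 bits, keeping only the low
 * 32 bits. The function name is hypothetical. */
uint32_t hmp_size_as_stored(int64_t requested)
{
    return (uint32_t)requested;
}
```

With the requested size 8589934591 (2^33 - 1), the stored value is 4294967295 (2^32 - 1), which matches the size of /tmp/memorydump shown below.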
>> With the changes I suggested, the command succeeds
>>
>> $ telnet localhost 1234
>> Trying 127.0.0.1...
>> Connected to localhost.
>> Escape character is '^]'.
>> QEMU 2.4.0.1 monitor - type 'help' for more information
>> (qemu) help pmemsave
>> pmemsave addr size file -- save to disk physical memory dump starting
>> at 'addr' of size 'size'
>> (qemu) pmemsave 0 8589934591 "/tmp/memorydump"
>> (qemu) quit
>>
>> However I just noticed that the dump is just about 4GB in size, so
>> there might be more changes needed to snapshot all physical memory of
>> a 64-bit VM. I did not investigate any further.
>>
>> ls -l /tmp/memorydump
>> -rw-rw-r-- 1 libvirt-qemu kvm 4294967295 Oct 16 08:04 /tmp/memorydump
>>
>>>> hmp-commands.hx and qmp-commands.hx should be edited accordingly. I
>>>> did not make the above pmemsave changes part of my patch.
>>>>
>>>> Let me know if you have any questions,
>>>>
>>>> Valerio