* Re: [Qemu-devel] QEMU patch to allow VM introspection via libvmi
2015-10-21 10:54 ` Markus Armbruster
@ 2015-10-21 15:50 ` Valerio Aimale
2015-10-22 11:50 ` Markus Armbruster
2015-10-22 18:43 ` Valerio Aimale
2015-10-22 19:12 ` Eduardo Habkost
2 siblings, 1 reply; 43+ messages in thread
From: Valerio Aimale @ 2015-10-21 15:50 UTC (permalink / raw)
To: Markus Armbruster; +Cc: lcapitulino, qemu-devel, ehabkost
On 10/21/15 4:54 AM, Markus Armbruster wrote:
> Valerio Aimale <valerio@aimale.com> writes:
>
>> On 10/19/15 1:52 AM, Markus Armbruster wrote:
>>> Valerio Aimale <valerio@aimale.com> writes:
>>>
>>>> On 10/16/15 2:15 AM, Markus Armbruster wrote:
>>>>> valerio@aimale.com writes:
>>>>>
>>>>>> All-
>>>>>>
>>>>>> I've produced a patch for the current QEMU HEAD, for libvmi to
>>>>>> introspect QEMU/KVM VMs.
>>>>>>
>>>>>> Libvmi has patches for the old qemu-kvm fork, inside its source tree:
>>>>>> https://github.com/libvmi/libvmi/tree/master/tools/qemu-kvm-patch
>>>>>>
>>>>>> This patch adds an HMP and a QMP command, "pmemaccess". When the
>>>>>> command is invoked with a string argument (a filename), it will open
>>>>>> a UNIX socket and spawn a listening thread.
>>>>>>
>>>>>> The client writes binary commands to the socket, in the form of a C
>>>>>> structure:
>>>>>>
>>>>>> struct request {
>>>>>> uint8_t type; // 0 quit, 1 read, 2 write, ... rest reserved
>>>>>> uint64_t address; // address to read from OR write to
>>>>>> uint64_t length; // number of bytes to read OR write
>>>>>> };
>>>>>>
>>>>>> The client receives as a response either (length+1) bytes, if it is a
>>>>>> read operation, or 1 byte if it is a write operation.
>>>>>>
>>>>>> The last byte of a read operation response indicates success (1
>>>>>> success, 0 failure). The single byte returned for a write operation
>>>>>> indicates the same (1 success, 0 failure).
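(As an aside, for anyone who wants to see the protocol in action: a minimal
client, stripped of error handling, might look like the sketch below. The
socket path is just a placeholder, partial reads are not handled, and the
struct layout is assumed to match the server's; this is not the actual libvmi
client code, which lives in the kvm.c file linked below.)

/* Illustrative pmemaccess client sketch; assumptions noted above. */
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <sys/un.h>

struct request {
    uint8_t type;      /* 0 quit, 1 read, 2 write, ... rest reserved */
    uint64_t address;  /* address to read from OR write to */
    uint64_t length;   /* number of bytes to read OR write */
};

static int read_phys(int fd, uint64_t addr, void *buf, uint64_t len)
{
    struct request req = { .type = 1, .address = addr, .length = len };
    uint8_t *reply = malloc(len + 1);               /* data + status byte */

    if (!reply ||
        write(fd, &req, sizeof(req)) != sizeof(req) ||
        read(fd, reply, len + 1) != (ssize_t)(len + 1)) {
        free(reply);
        return -1;
    }
    int ok = reply[len];                            /* 1 success, 0 failure */
    if (ok) {
        memcpy(buf, reply, len);
    }
    free(reply);
    return ok ? 0 : -1;
}

int main(void)
{
    struct sockaddr_un sa = { .sun_family = AF_UNIX };
    strncpy(sa.sun_path, "/tmp/pmemaccess.sock", sizeof(sa.sun_path) - 1);

    int fd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (fd < 0 || connect(fd, (struct sockaddr *)&sa, sizeof(sa)) < 0) {
        return 1;
    }
    uint8_t page[4096];
    if (read_phys(fd, 0x1000, page, sizeof(page)) == 0) {
        printf("first byte at 0x1000: 0x%02x\n", page[0]);
    }
    close(fd);
    return 0;
}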
>>>>> So, if you ask to read 1 MiB, and it fails, you get back 1 MiB of
>>>>> garbage followed by the "it failed" byte?
>>>> Markus, that appears to be the case. However, I did not write the
>>>> communication protocol between libvmi and qemu. I'm assuming that the
>>>> person who wrote the protocol did not want to bother with
>>>> overcomplicating things.
>>>>
>>>> https://github.com/libvmi/libvmi/blob/master/libvmi/driver/kvm/kvm.c
>>>>
>>>> I'm thinking he assumed reads would be small in size and the price of
>>>> reading garbage was less than the price of writing a more complicated
>>>> protocol. I can see his point; confronted with the same problem, I
>>>> might have done the same.
>>> All right, the interface is designed for *small* memory blocks then.
>>>
>>> Makes me wonder why he needs a separate binary protocol on a separate
>>> socket. Small blocks could be done just fine in QMP.
>> The problem is speed. If one's analyzing the memory space of a running
>> process (physical and paged), libvmi will make a large number of small
>> and mid-sized reads. If one uses xp, or pmemsave, the overhead is
>> quite significant. xp has overhead due to encoding, and pmemsave has
>> overhead due to file open/write (server), file open/read/close/unlink
>> (client).
>>
>> Others have gone through the problem before me. It appears that
>> pmemsave and xp are significantly slower than reading memory using a
>> socket via pmemaccess.
> That they're slower isn't surprising, but I'd expect the cost of
> encoding a small block to be insignificant compared to the cost of the
> network roundtrips.
>
> As block size increases, the space overhead of encoding will eventually
> bite. But for that usage, the binary protocol appears ill-suited,
> unless the client can pretty reliably avoid read failure. I haven't
> examined its failure modes, yet.
>
>> The following data is not mine, but it shows the time, in
>> milliseconds, required to resolve the content of a paged memory
>> address via socket (pmemaccess), pmemsave, and xp
>>
>> http://cl.ly/image/322a3s0h1V05
>>
>> Again, I did not produce those data points, they come from an old
>> libvmi thread.
> 90ms is a very long time. What exactly was measured?
That is a fair question to ask. Unfortunately, I extracted that data
plot from an old thread on a libvmi mailing list. I do not have the
data and code that produced it. Sifting through the thread, I can see
the code was never published. I will take it upon myself to produce
code that compares timing - in a fair fashion - of libvmi doing an
atomic operation and a larger-scale operation (like listing running
processes) via gdb, pmemaccess/socket, pmemsave, xp, and, hopefully, a
version of xp that returns byte streams of memory regions base64- or
base85-encoded in JSON strings. I'll publish results and code.
However, given workload and life happening, it will be some time before
I complete that task.
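As a rough sketch of what that xp variant could look like on the QEMU side
(hypothetical command name and schema, using the existing
cpu_physical_memory_read() helper and GLib's g_base64_encode(); untested and
with no bounds checking):

/* qapi-schema.json (hypothetical):
 *   { 'command': 'pmemread',
 *     'data': {'val': 'int', 'size': 'int'},
 *     'returns': 'str' }
 */
char *qmp_pmemread(int64_t addr, int64_t size, Error **errp)
{
    uint8_t *buf = g_malloc(size);
    char *b64;

    /* read guest physical memory, then base64-encode it for the JSON reply */
    cpu_physical_memory_read(addr, buf, size);
    b64 = g_base64_encode(buf, size);
    g_free(buf);
    return b64;   /* the returned string becomes the command's result */
}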
>
>> I think it might be conceivable that there could be a QMP command that
>> returns the content of an arbitrarily sized memory region as a base64
>> or a base85 JSON string. It would still have both time overhead (due to
>> encoding/decoding) and space overhead (base64 adds 33% and base85 would
>> add 7%), plus JSON encoding/decoding overhead. It might still be the
>> case that a socket would outperform such a command as well,
>> speed-wise. I don't think it would be any faster than xp.
> A special-purpose binary protocol over a dedicated socket will always do
> less than a QMP solution (ignoring foolishness like transmitting crap on
> read error the client is then expected to throw away). The question is
> whether the difference in work translates to a worthwhile difference in
> performance.
>
> The larger question is actually whether we have an existing interface
> that can serve libvmi's needs. We've discussed monitor commands
> like xp, pmemsave, pmemread. There's another existing interface: the
> GDB stub. Have you considered it?
>
>> There's also a similar patch, floating around the internet, that uses
>> shared memory, instead of sockets, as inter-process communication
>> between libvmi and QEMU. I've never used that.
> By the time you built a working IPC mechanism on top of shared memory,
> you're often no better off than with AF_LOCAL sockets.
>
> Crazy idea: can we allocate guest memory in a way that supports sharing
> it with another process? Eduardo, can -mem-path do such wild things?
>
>>>>>> The socket API was written by the libvmi author and it works with the
>>>>>> current libvmi version. The libvmi client-side implementation is at:
>>>>>>
>>>>>> https://github.com/libvmi/libvmi/blob/master/libvmi/driver/kvm/kvm.c
>>>>>>
>>>>>> As many use KVM VMs for introspection, malware and security analysis,
>>>>>> it might be worth thinking about making the pmemaccess a permanent
>>>>>> hmp/qmp command, as opposed to having to produce a patch at each QEMU
>>>>>> point release.
>>>>> Related existing commands: memsave, pmemsave, dump-guest-memory.
>>>>>
>>>>> Can you explain why these won't do for your use case?
>>>> For people who do security analysis there are two use cases, static
>>>> and dynamic analysis. With memsave, pmemsave and dump-guest-memory one
>>>> can do static analysis, i.e. snapshotting a VM and seeing what was
>>>> happening at that point in time.
>>>> Dynamic analysis requires being able to 'introspect' a VM while it's running.
>>>>
>>>> If you take a snapshot of two people exchanging a glass of water, and
>>>> you happen to take it at the very moment both persons have their hands
>>>> on the glass, it's hard to tell who passed the glass to whom. If you
>>>> have a movie of the same scene, it's obvious who's the giver and who's
>>>> the receiver. Same use case.
>>> I understand the need for introspecting a running guest. What exactly
>>> makes the existing commands unsuitable for that?
>> Speed. See discussion above.
>>>> More to the point, there's a host of C and python frameworks to
>>>> dynamically analyze VMs: volatility, rekal, "drakvuf", etc. They all
>>>> build on top of libvmi. I did not want to reinvent the wheel.
>>> Fair enough.
>>>
>>> Front page http://libvmi.com/ claims "Works with Xen, KVM, Qemu, and Raw
>>> memory files." What exactly is missing for KVM?
>> When they say they support kvm, what they really mean is that they
>> support the (retired, I understand) qemu-kvm fork via a patch that is
>> provided in the libvmi source tree. I think the most recent qemu-kvm
>> supported is 1.6.0.
>>
>> https://github.com/libvmi/libvmi/tree/master/tools/qemu-kvm-patch
>>
>> I wanted to bring support to the head revision of QEMU, to bring
>> libvmi level with more modern QEMU.
>>
>> Maybe the solution is simply to put this patch in the libvmi source
>> tree, which I've already asked to do via pull request, leaving QEMU
>> alone.
>> However, the patch has to be updated at every QEMU point release. I
>> wanted to avoid that, if at all possible.
>>
>>>> Mind you, 99.9% of people that do dynamic VM analysis use xen. They
>>>> contend that xen has better introspection support. In my case, I did
>>>> not want to bother with dedicating a full server to be a xen domain
>>>> 0. I just wanted to do a quick test by standing up a QEMU/KVM VM on
>>>> an otherwise-purposed server.
>>> I'm not at all against better introspection support in QEMU. I'm just
>>> trying to understand the problem you're trying to solve with your
>>> patches.
>> What all users of libvmi would love to have is super high speed access
>> to VM physical memory as part of the QEMU source tree, and not
>> supported via a patch. Implemented as the QEMU owners see fit, as long
>> as it is blazing fast and easily accessed via a client library or
>> inter-process communication.
> The use case makes sense to me, we just need to figure out how we want
> to serve it in QEMU.
>
>> My gut feeling is that it has to bypass QMP protocol/encoding/file
>> access/json to be fast, but, it is just a gut feeling - worth nothing.
> My gut feeling is that QMP should do fine in overhead compared to other
> solutions involving socket I/O as long as the data sizes are *small*.
> Latency might be an issue, though: QMP commands are processed from the
> main loop. A dedicated server thread can be more responsive, but
> letting it write to shared resources could be "interesting".
See above, I'll produce some code and results so we can all think about
the issue.
>
>>>>>> Also, the pmemsave command's QAPI should be changed to be usable with
>>>>>> 64-bit VMs
>>>>>>
>>>>>> in qapi-schema.json
>>>>>>
>>>>>> from
>>>>>>
>>>>>> ---
>>>>>> { 'command': 'pmemsave',
>>>>>> 'data': {'val': 'int', 'size': 'int', 'filename': 'str'} }
>>>>>> ---
>>>>>>
>>>>>> to
>>>>>>
>>>>>> ---
>>>>>> { 'command': 'pmemsave',
>>>>>> 'data': {'val': 'int64', 'size': 'int64', 'filename': 'str'} }
>>>>>> ---
>>>>> In the QAPI schema, 'int' is actually an alias for 'int64'. Yes, that's
>>>>> confusing.
>>>> I think it's confusing for the HMP parser too. If you have a VM with
>>>> 8 GB of RAM and want to snapshot the whole physical memory, via HMP
>>>> over telnet this is what happens:
>>>>
>>>> $ telnet localhost 1234
>>>> Trying 127.0.0.1...
>>>> Connected to localhost.
>>>> Escape character is '^]'.
>>>> QEMU 2.4.0.1 monitor - type 'help' for more information
>>>> (qemu) help pmemsave
>>>> pmemsave addr size file -- save to disk physical memory dump starting
>>>> at 'addr' of size 'size'
>>>> (qemu) pmemsave 0 8589934591 "/tmp/memorydump"
>>>> 'pmemsave' has failed: integer is for 32-bit values
>>>> Try "help pmemsave" for more information
>>>> (qemu) quit
>>> Your change to pmemsave's definition in qapi-schema.json is effectively a
>>> no-op.
>>>
>>> Your example shows *HMP* command pmemsave. The definition of an HMP
>>> command is *independent* of the QMP command. The implementation *uses*
>>> the QMP command.
>>>
>>> QMP pmemsave is defined in qapi-schema.json as
>>>
>>> { 'command': 'pmemsave',
>>> 'data': {'val': 'int', 'size': 'int', 'filename': 'str'} }
>>>
>>> Its implementation is in cpus.c:
>>>
>>> void qmp_pmemsave(int64_t addr, int64_t size, const char *filename,
>>> Error **errp)
>>>
>>> Note the int64_t size.
>>>
>>> HMP pmemsave is defined in hmp-commands.hx as
>>>
>>> {
>>> .name = "pmemsave",
>>> .args_type = "val:l,size:i,filename:s",
>>> .params = "addr size file",
>>> .help = "save to disk physical memory dump starting at 'addr' of size 'size'",
>>> .mhandler.cmd = hmp_pmemsave,
>>> },
>>>
>>> Its implementation is in hmp.c:
>>>
>>> void hmp_pmemsave(Monitor *mon, const QDict *qdict)
>>> {
>>> uint32_t size = qdict_get_int(qdict, "size");
>>> const char *filename = qdict_get_str(qdict, "filename");
>>> uint64_t addr = qdict_get_int(qdict, "val");
>>> Error *err = NULL;
>>>
>>> qmp_pmemsave(addr, size, filename, &err);
>>> hmp_handle_error(mon, &err);
>>> }
>>>
>>> Note uint32_t size.
>>>
>>> Arguably, the QMP size argument should use 'size' (an alias for
>>> 'uint64'), and the HMP args_type should use 'size:o'.
>> I understand all that. Indeed, I've re-implemented 'pmemaccess' the same
>> way pmemsave is implemented. There is a single function, and two
>> entry points, one for HMP and one for QMP. I think pmemaccess
>> mimics pmemsave closely.
>>
>> However, if one wants to simply dump a memory region via HMP, for
>> human ease of use/debug/testing purposes, one cannot dump memory
>> regions that reside higher than 2^32-1.
> Can you give an example?
Yes. I was trying to dump the full extent of physical memory of a VM
that has 8GB memory space (ballooned). I simply did this:
$ telnet localhost 1234
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
QEMU 2.4.0.1 monitor - type 'help' for more information
(qemu) pmemsave 0 8589934591 "/tmp/memsaved"
'pmemsave' has failed: integer is for 32-bit values
Maybe I misunderstood how pmemsave works. Maybe I should have used
dump-guest-memory
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [Qemu-devel] QEMU patch to allow VM introspection via libvmi
2015-10-21 15:50 ` Valerio Aimale
@ 2015-10-22 11:50 ` Markus Armbruster
2015-10-22 18:11 ` Valerio Aimale
0 siblings, 1 reply; 43+ messages in thread
From: Markus Armbruster @ 2015-10-22 11:50 UTC (permalink / raw)
To: Valerio Aimale; +Cc: qemu-devel, ehabkost, lcapitulino
Valerio Aimale <valerio@aimale.com> writes:
> On 10/21/15 4:54 AM, Markus Armbruster wrote:
>> Valerio Aimale <valerio@aimale.com> writes:
>>
>>> On 10/19/15 1:52 AM, Markus Armbruster wrote:
>>>> Valerio Aimale <valerio@aimale.com> writes:
>>>>
>>>>> On 10/16/15 2:15 AM, Markus Armbruster wrote:
>>>>>> valerio@aimale.com writes:
>>>>>>
>>>>>>> All-
>>>>>>>
>>>>>>> I've produced a patch for the current QEMU HEAD, for libvmi to
>>>>>>> introspect QEMU/KVM VMs.
>>>>>>>
>>>>>>> Libvmi has patches for the old qemu-kvm fork, inside its source tree:
>>>>>>> https://github.com/libvmi/libvmi/tree/master/tools/qemu-kvm-patch
>>>>>>>
>>>>>>> This patch adds an HMP and a QMP command, "pmemaccess". When the
>>>>>>> command is invoked with a string argument (a filename), it will open
>>>>>>> a UNIX socket and spawn a listening thread.
>>>>>>>
>>>>>>> The client writes binary commands to the socket, in the form of a C
>>>>>>> structure:
>>>>>>>
>>>>>>> struct request {
>>>>>>> uint8_t type; // 0 quit, 1 read, 2 write, ... rest reserved
>>>>>>> uint64_t address; // address to read from OR write to
>>>>>>> uint64_t length; // number of bytes to read OR write
>>>>>>> };
>>>>>>>
>>>>>>> The client receives as a response either (length+1) bytes, if it is a
>>>>>>> read operation, or 1 byte if it is a write operation.
>>>>>>>
>>>>>>> The last byte of a read operation response indicates success (1
>>>>>>> success, 0 failure). The single byte returned for a write operation
>>>>>>> indicates the same (1 success, 0 failure).
>>>>>> So, if you ask to read 1 MiB, and it fails, you get back 1 MiB of
>>>>>> garbage followed by the "it failed" byte?
>>>>> Markus, that appears to be the case. However, I did not write the
>>>>> communication protocol between libvmi and qemu. I'm assuming that the
>>>>> person who wrote the protocol did not want to bother with
>>>>> overcomplicating things.
>>>>>
>>>>> https://github.com/libvmi/libvmi/blob/master/libvmi/driver/kvm/kvm.c
>>>>>
>>>>> I'm thinking he assumed reads would be small in size and the price of
>>>>> reading garbage was less than the price of writing a more complicated
>>>>> protocol. I can see his point, confronted with the same problem, I
>>>>> might have done the same.
>>>> All right, the interface is designed for *small* memory blocks then.
>>>>
>>>> Makes me wonder why he needs a separate binary protocol on a separate
>>>> socket. Small blocks could be done just fine in QMP.
>>> The problem is speed. If one's analyzing the memory space of a running
>>> process (physical and paged), libvmi will make a large number of small
>>> and mid-sized reads. If one uses xp, or pmemsave, the overhead is
>>> quite significant. xp has overhead due to encoding, and pmemsave has
>>> overhead due to file open/write (server), file open/read/close/unlink
>>> (client).
>>>
>>> Others have gone through the problem before me. It appears that
>>> pmemsave and xp are significantly slower than reading memory using a
>>> socket via pmemaccess.
>> That they're slower isn't surprising, but I'd expect the cost of
>> encoding a small block to be insignificant compared to the cost of the
>> network roundtrips.
>>
>> As block size increases, the space overhead of encoding will eventually
>> bite. But for that usage, the binary protocol appears ill-suited,
>> unless the client can pretty reliably avoid read failure. I haven't
>> examined its failure modes, yet.
>>
>>> The following data is not mine, but it shows the time, in
>>> milliseconds, required to resolve the content of a paged memory
>>> address via socket (pmemaccess) , pmemsave and xp
>>>
>>> http://cl.ly/image/322a3s0h1V05
>>>
>>> Again, I did not produce those data points, they come from an old
>>> libvmi thread.
>> 90ms is a very long time. What exactly was measured?
> That is a fair question to ask. Unfortunately, I extracted that data
> plot from an old thread in some libvmi mailing list. I do not have the
> data and code that produced it. Sifting through the thread, I can see
> the code
> was never published. I will take it upon myself to produce code that
> compares timing - in a fair fashion - of libvmi doing an atomic
> operation and a larger-scale operation (like listing running
> processes) via gdb, pmemaccess/socket, pmemsave, xp, and hopefully, a
> version of xp that returns byte streams of memory regions base64 or
> base85 encoded in json strings. I'll publish results and code.
>
> However, given workload and life happening, it will be some time
> before I complete that task.
No problem. I'd like to have your use case addressed, but there's no
need for haste.
[...]
>>>>>>> Also, the pmemsave commands QAPI should be changed to be usable with
>>>>>>> 64bit VM's
>>>>>>>
>>>>>>> in qapi-schema.json
>>>>>>>
>>>>>>> from
>>>>>>>
>>>>>>> ---
>>>>>>> { 'command': 'pmemsave',
>>>>>>> 'data': {'val': 'int', 'size': 'int', 'filename': 'str'} }
>>>>>>> ---
>>>>>>>
>>>>>>> to
>>>>>>>
>>>>>>> ---
>>>>>>> { 'command': 'pmemsave',
>>>>>>> 'data': {'val': 'int64', 'size': 'int64', 'filename': 'str'} }
>>>>>>> ---
>>>>>> In the QAPI schema, 'int' is actually an alias for 'int64'. Yes, that's
>>>>>> confusing.
>>>>> I think it's confusing for the HMP parser too. If you have a VM with
>>>>> 8Gb of RAM and want to snapshot the whole physical memory, via HMP
>>>>> over telnet this is what happens:
>>>>>
>>>>> $ telnet localhost 1234
>>>>> Trying 127.0.0.1...
>>>>> Connected to localhost.
>>>>> Escape character is '^]'.
>>>>> QEMU 2.4.0.1 monitor - type 'help' for more information
>>>>> (qemu) help pmemsave
>>>>> pmemsave addr size file -- save to disk physical memory dump starting
>>>>> at 'addr' of size 'size'
>>>>> (qemu) pmemsave 0 8589934591 "/tmp/memorydump"
>>>>> 'pmemsave' has failed: integer is for 32-bit values
>>>>> Try "help pmemsave" for more information
>>>>> (qemu) quit
>>>> Your change to pmemsave's definition in qapi-schema.json is effectively a
>>>> no-op.
>>>>
>>>> Your example shows *HMP* command pmemsave. The definition of an HMP
>>>> command is *independent* of the QMP command. The implementation *uses*
>>>> the QMP command.
>>>>
>>>> QMP pmemsave is defined in qapi-schema.json as
>>>>
>>>> { 'command': 'pmemsave',
>>>> 'data': {'val': 'int', 'size': 'int', 'filename': 'str'} }
>>>>
>>>> Its implementation is in cpus.c:
>>>>
>>>> void qmp_pmemsave(int64_t addr, int64_t size, const char *filename,
>>>> Error **errp)
>>>>
>>>> Note the int64_t size.
>>>>
>>>> HMP pmemsave is defined in hmp-commands.hx as
>>>>
>>>> {
>>>> .name = "pmemsave",
>>>> .args_type = "val:l,size:i,filename:s",
>>>> .params = "addr size file",
>>>> .help = "save to disk physical memory dump starting at 'addr' of size 'size'",
>>>> .mhandler.cmd = hmp_pmemsave,
>>>> },
>>>>
>>>> Its implementation is in hmp.c:
>>>>
>>>> void hmp_pmemsave(Monitor *mon, const QDict *qdict)
>>>> {
>>>> uint32_t size = qdict_get_int(qdict, "size");
>>>> const char *filename = qdict_get_str(qdict, "filename");
>>>> uint64_t addr = qdict_get_int(qdict, "val");
>>>> Error *err = NULL;
>>>>
>>>> qmp_pmemsave(addr, size, filename, &err);
>>>> hmp_handle_error(mon, &err);
>>>> }
>>>>
>>>> Note uint32_t size.
>>>>
>>>> Arguably, the QMP size argument should use 'size' (an alias for
>>>> 'uint64'), and the HMP args_type should use 'size:o'.
>>> I understand all that. Indeed, I've re-implemented 'pmemaccess' the same
>>> way pmemsave is implemented. There is a single function, and two
>>> entry points, one for HMP and one for QMP. I think pmemaccess
>>> mimics pmemsave closely.
>>>
>>> However, if one wants to simply dump a memory region via HMP, for
>>> human ease of use/debug/testing purposes, one cannot dump memory
>>> regions that reside higher than 2^32-1.
>> Can you give an example?
> Yes. I was trying to dump the full extent of physical memory of a VM
> that has 8GB memory space (ballooned). I simply did this:
>
> $ telnet localhost 1234
> Trying 127.0.0.1...
> Connected to localhost.
> Escape character is '^]'.
> QEMU 2.4.0.1 monitor - type 'help' for more information
> (qemu) pmemsave 0 8589934591 "/tmp/memsaved"
> 'pmemsave' has failed: integer is for 32-bit values
>
> Maybe I misunderstood how pmemsave works. Maybe I should have used
> dump-guest-memory
This is an unnecessary limitation caused by 'size:i' instead of
'size:o'. Fixable.
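A sketch of what that fix might look like, mirroring the existing definitions
quoted above (untested; the HMP handler would also want a 64-bit local for the
size so the value is not truncated):

    /* hmp-commands.hx: accept a 64-bit size, with optional K/M/G/T suffix */
    {
        .name       = "pmemsave",
        .args_type  = "val:l,size:o,filename:s",
        .params     = "addr size file",
        .help       = "save to disk physical memory dump starting at 'addr' of size 'size'",
        .mhandler.cmd = hmp_pmemsave,
    },

    /* hmp.c: 64-bit local instead of uint32_t */
    void hmp_pmemsave(Monitor *mon, const QDict *qdict)
    {
        uint64_t size = qdict_get_int(qdict, "size");
        const char *filename = qdict_get_str(qdict, "filename");
        uint64_t addr = qdict_get_int(qdict, "val");
        Error *err = NULL;

        qmp_pmemsave(addr, size, filename, &err);
        hmp_handle_error(mon, &err);
    }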
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [Qemu-devel] QEMU patch to allow VM introspection via libvmi
2015-10-22 11:50 ` Markus Armbruster
@ 2015-10-22 18:11 ` Valerio Aimale
2015-10-23 6:31 ` Markus Armbruster
0 siblings, 1 reply; 43+ messages in thread
From: Valerio Aimale @ 2015-10-22 18:11 UTC (permalink / raw)
To: Markus Armbruster; +Cc: qemu-devel, ehabkost, lcapitulino
On 10/22/15 5:50 AM, Markus Armbruster wrote:
> Valerio Aimale <valerio@aimale.com> writes:
>
>> On 10/21/15 4:54 AM, Markus Armbruster wrote:
>>> Valerio Aimale <valerio@aimale.com> writes:
>>>
>>>> On 10/19/15 1:52 AM, Markus Armbruster wrote:
>>>>> Valerio Aimale <valerio@aimale.com> writes:
>>>>>
>>>>>> On 10/16/15 2:15 AM, Markus Armbruster wrote:
>>>>>>> valerio@aimale.com writes:
>>>>>>>
>>>>>>>> All-
>>>>>>>>
>>>>>>>> I've produced a patch for the current QEMU HEAD, for libvmi to
>>>>>>>> introspect QEMU/KVM VMs.
>>>>>>>>
>>>>>>>> Libvmi has patches for the old qemu-kvm fork, inside its source tree:
>>>>>>>> https://github.com/libvmi/libvmi/tree/master/tools/qemu-kvm-patch
>>>>>>>>
>>>>>>>> This patch adds an HMP and a QMP command, "pmemaccess". When the
>>>>>>>> command is invoked with a string argument (a filename), it will open
>>>>>>>> a UNIX socket and spawn a listening thread.
>>>>>>>>
>>>>>>>> The client writes binary commands to the socket, in the form of a C
>>>>>>>> structure:
>>>>>>>>
>>>>>>>> struct request {
>>>>>>>> uint8_t type; // 0 quit, 1 read, 2 write, ... rest reserved
>>>>>>>> uint64_t address; // address to read from OR write to
>>>>>>>> uint64_t length; // number of bytes to read OR write
>>>>>>>> };
>>>>>>>>
>>>>>>>> The client receives as a response either (length+1) bytes, if it is a
>>>>>>>> read operation, or 1 byte if it is a write operation.
>>>>>>>>
>>>>>>>> The last byte of a read operation response indicates success (1
>>>>>>>> success, 0 failure). The single byte returned for a write operation
>>>>>>>> indicates the same (1 success, 0 failure).
>>>>>>> So, if you ask to read 1 MiB, and it fails, you get back 1 MiB of
>>>>>>> garbage followed by the "it failed" byte?
>>>>>> Markus, that appears to be the case. However, I did not write the
>>>>>> communication protocol between libvmi and qemu. I'm assuming that the
>>>>>> person who wrote the protocol did not want to bother with
>>>>>> overcomplicating things.
>>>>>>
>>>>>> https://github.com/libvmi/libvmi/blob/master/libvmi/driver/kvm/kvm.c
>>>>>>
>>>>>> I'm thinking he assumed reads would be small in size and the price of
>>>>>> reading garbage was less than the price of writing a more complicated
>>>>>> protocol. I can see his point, confronted with the same problem, I
>>>>>> might have done the same.
>>>>> All right, the interface is designed for *small* memory blocks then.
>>>>>
>>>>> Makes me wonder why he needs a separate binary protocol on a separate
>>>>> socket. Small blocks could be done just fine in QMP.
>>>> The problem is speed. If one's analyzing the memory space of a running
>>>> process (physical and paged), libvmi will make a large number of small
>>>> and mid-sized reads. If one uses xp, or pmemsave, the overhead is
>>>> quite significant. xp has overhead due to encoding, and pmemsave has
>>>> overhead due to file open/write (server), file open/read/close/unlink
>>>> (client).
>>>>
>>>> Others have gone through the problem before me. It appears that
>>>> pmemsave and xp are significantly slower than reading memory using a
>>>> socket via pmemaccess.
>>> That they're slower isn't surprising, but I'd expect the cost of
>>> encoding a small block to be insignificant compared to the cost of the
>>> network roundtrips.
>>>
>>> As block size increases, the space overhead of encoding will eventually
>>> bite. But for that usage, the binary protocol appears ill-suited,
>>> unless the client can pretty reliably avoid read failure. I haven't
>>> examined its failure modes, yet.
>>>
>>>> The following data is not mine, but it shows the time, in
>>>> milliseconds, required to resolve the content of a paged memory
>>>> address via socket (pmemaccess) , pmemsave and xp
>>>>
>>>> http://cl.ly/image/322a3s0h1V05
>>>>
>>>> Again, I did not produce those data points, they come from an old
>>>> libvmi thread.
>>> 90ms is a very long time. What exactly was measured?
>> That is a fair question to ask. Unfortunately, I extracted that data
>> plot from an old thread in some libvmi mailing list. I do not have the
>> data and code that produced it. Sifting through the thread, I can see
>> the code
>> was never published. I will take it upon myself to produce code that
>> compares timing - in a fair fashion - of libvmi doing an atomic
>> operation and a larger-scale operation (like listing running
>> processes) via gdb, pmemaccess/socket, pmemsave, xp, and hopefully, a
>> version of xp that returns byte streams of memory regions base64 or
>> base85 encoded in json strings. I'll publish results and code.
>>
>> However, given workload and life happening, it will be some time
>> before I complete that task.
> No problem. I'd like to have your use case addressed, but there's no
> need for haste.
Thanks, Markus. Appreciate your help.
>
> [...]
>>>>>>>> Also, the pmemsave commands QAPI should be changed to be usable with
>>>>>>>> 64bit VM's
>>>>>>>>
>>>>>>>> in qapi-schema.json
>>>>>>>>
>>>>>>>> from
>>>>>>>>
>>>>>>>> ---
>>>>>>>> { 'command': 'pmemsave',
>>>>>>>> 'data': {'val': 'int', 'size': 'int', 'filename': 'str'} }
>>>>>>>> ---
>>>>>>>>
>>>>>>>> to
>>>>>>>>
>>>>>>>> ---
>>>>>>>> { 'command': 'pmemsave',
>>>>>>>> 'data': {'val': 'int64', 'size': 'int64', 'filename': 'str'} }
>>>>>>>> ---
>>>>>>> In the QAPI schema, 'int' is actually an alias for 'int64'. Yes, that's
>>>>>>> confusing.
>>>>>> I think it's confusing for the HMP parser too. If you have a VM with
>>>>>> 8Gb of RAM and want to snapshot the whole physical memory, via HMP
>>>>>> over telnet this is what happens:
>>>>>>
>>>>>> $ telnet localhost 1234
>>>>>> Trying 127.0.0.1...
>>>>>> Connected to localhost.
>>>>>> Escape character is '^]'.
>>>>>> QEMU 2.4.0.1 monitor - type 'help' for more information
>>>>>> (qemu) help pmemsave
>>>>>> pmemsave addr size file -- save to disk physical memory dump starting
>>>>>> at 'addr' of size 'size'
>>>>>> (qemu) pmemsave 0 8589934591 "/tmp/memorydump"
>>>>>> 'pmemsave' has failed: integer is for 32-bit values
>>>>>> Try "help pmemsave" for more information
>>>>>> (qemu) quit
>>>>> Your change to pmemsave's definition in qapi-schema.json is effectively a
>>>>> no-op.
>>>>>
>>>>> Your example shows *HMP* command pmemsave. The definition of an HMP
>>>>> command is *independent* of the QMP command. The implementation *uses*
>>>>> the QMP command.
>>>>>
>>>>> QMP pmemsave is defined in qapi-schema.json as
>>>>>
>>>>> { 'command': 'pmemsave',
>>>>> 'data': {'val': 'int', 'size': 'int', 'filename': 'str'} }
>>>>>
>>>>> Its implementation is in cpus.c:
>>>>>
>>>>> void qmp_pmemsave(int64_t addr, int64_t size, const char *filename,
>>>>> Error **errp)
>>>>>
>>>>> Note the int64_t size.
>>>>>
>>>>> HMP pmemsave is defined in hmp-commands.hx as
>>>>>
>>>>> {
>>>>> .name = "pmemsave",
>>>>> .args_type = "val:l,size:i,filename:s",
>>>>> .params = "addr size file",
>>>>> .help = "save to disk physical memory dump starting at 'addr' of size 'size'",
>>>>> .mhandler.cmd = hmp_pmemsave,
>>>>> },
>>>>>
>>>>> Its implementation is in hmp.c:
>>>>>
>>>>> void hmp_pmemsave(Monitor *mon, const QDict *qdict)
>>>>> {
>>>>> uint32_t size = qdict_get_int(qdict, "size");
>>>>> const char *filename = qdict_get_str(qdict, "filename");
>>>>> uint64_t addr = qdict_get_int(qdict, "val");
>>>>> Error *err = NULL;
>>>>>
>>>>> qmp_pmemsave(addr, size, filename, &err);
>>>>> hmp_handle_error(mon, &err);
>>>>> }
>>>>>
>>>>> Note uint32_t size.
>>>>>
>>>>> Arguably, the QMP size argument should use 'size' (an alias for
>>>>> 'uint64'), and the HMP args_type should use 'size:o'.
>>>> I understand all that. Indeed, I've re-implemented 'pmemaccess' the same
>>>> way pmemsave is implemented. There is a single function, and two
>>>> entry points, one for HMP and one for QMP. I think pmemaccess
>>>> mimics pmemsave closely.
>>>>
>>>> However, if one wants to simply dump a memory region via HMP, for
>>>> human ease of use/debug/testing purposes, one cannot dump memory
>>>> regions that reside higher than 2^32-1.
>>> Can you give an example?
>> Yes. I was trying to dump the full extent of physical memory of a VM
>> that has 8GB memory space (ballooned). I simply did this:
>>
>> $ telnet localhost 1234
>> Trying 127.0.0.1...
>> Connected to localhost.
>> Escape character is '^]'.
>> QEMU 2.4.0.1 monitor - type 'help' for more information
>> (qemu) pmemsave 0 8589934591 "/tmp/memsaved"
>> 'pmemsave' has failed: integer is for 32-bit values
>>
>> Maybe I misunderstood how pmemsave works. Maybe I should have used
>> dump-guest-memory
> This is an unnecessary limitation caused by 'size:i' instead of
> 'size:o'. Fixable.
I think I tried changing size:i to size:l, but I was still receiving
the error.
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [Qemu-devel] QEMU patch to allow VM introspection via libvmi
2015-10-22 18:11 ` Valerio Aimale
@ 2015-10-23 6:31 ` Markus Armbruster
0 siblings, 0 replies; 43+ messages in thread
From: Markus Armbruster @ 2015-10-23 6:31 UTC (permalink / raw)
To: Valerio Aimale; +Cc: lcapitulino, qemu-devel, ehabkost
Valerio Aimale <valerio@aimale.com> writes:
> On 10/22/15 5:50 AM, Markus Armbruster wrote:
>> Valerio Aimale <valerio@aimale.com> writes:
>>
>>> On 10/21/15 4:54 AM, Markus Armbruster wrote:
[...]
>>>> Can you give an example?
>>> Yes. I was trying to dump the full extent of physical memory of a VM
>>> that has 8GB memory space (ballooned). I simply did this:
>>>
>>> $ telnet localhost 1234
>>> Trying 127.0.0.1...
>>> Connected to localhost.
>>> Escape character is '^]'.
>>> QEMU 2.4.0.1 monitor - type 'help' for more information
>>> (qemu) pmemsave 0 8589934591 "/tmp/memsaved"
>>> 'pmemsave' has failed: integer is for 32-bit values
>>>
>>> Maybe I misunderstood how pmemsave works. Maybe I should have used
>>> dump-guest-memory
>> This is an unnecessary limitation caused by 'size:i' instead of
>> 'size:o'. Fixable.
> I think I tried changing size:i to size:l, but, I was still receiving
> the error.
I should have a look. HMP is relatively low priority for me; feel free
to remind me in a week or two.
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [Qemu-devel] QEMU patch to allow VM introspection via libvmi
2015-10-21 10:54 ` Markus Armbruster
2015-10-21 15:50 ` Valerio Aimale
@ 2015-10-22 18:43 ` Valerio Aimale
2015-10-22 18:54 ` Eric Blake
2015-10-22 19:12 ` Eduardo Habkost
2 siblings, 1 reply; 43+ messages in thread
From: Valerio Aimale @ 2015-10-22 18:43 UTC (permalink / raw)
To: Markus Armbruster; +Cc: lcapitulino, qemu-devel, ehabkost
On 10/21/15 4:54 AM, Markus Armbruster wrote:
> Valerio Aimale <valerio@aimale.com> writes:
>
>> On 10/19/15 1:52 AM, Markus Armbruster wrote:
>>> Valerio Aimale <valerio@aimale.com> writes:
>>>
>>>> On 10/16/15 2:15 AM, Markus Armbruster wrote:
>>>>> valerio@aimale.com writes:
>>>>>
>>>>>> All-
>>>>>>
>>>>>> I've produced a patch for the current QEMU HEAD, for libvmi to
>>>>>> introspect QEMU/KVM VMs.
>>>>>>
>>>>>> Libvmi has patches for the old qemu-kvm fork, inside its source tree:
>>>>>> https://github.com/libvmi/libvmi/tree/master/tools/qemu-kvm-patch
>>>>>>
>>>>>> This patch adds an HMP and a QMP command, "pmemaccess". When the
>>>>>> command is invoked with a string argument (a filename), it will open
>>>>>> a UNIX socket and spawn a listening thread.
>>>>>>
>>>>>> The client writes binary commands to the socket, in the form of a C
>>>>>> structure:
>>>>>>
>>>>>> struct request {
>>>>>> uint8_t type; // 0 quit, 1 read, 2 write, ... rest reserved
>>>>>> uint64_t address; // address to read from OR write to
>>>>>> uint64_t length; // number of bytes to read OR write
>>>>>> };
>>>>>>
>>>>>> The client receives as a response either (length+1) bytes, if it is a
>>>>>> read operation, or 1 byte if it is a write operation.
>>>>>>
>>>>>> The last byte of a read operation response indicates success (1
>>>>>> success, 0 failure). The single byte returned for a write operation
>>>>>> indicates the same (1 success, 0 failure).
>>>>> So, if you ask to read 1 MiB, and it fails, you get back 1 MiB of
>>>>> garbage followed by the "it failed" byte?
>>>> Markus, that appears to be the case. However, I did not write the
>>>> communication protocol between libvmi and qemu. I'm assuming that the
>>>> person who wrote the protocol did not want to bother with
>>>> overcomplicating things.
>>>>
>>>> https://github.com/libvmi/libvmi/blob/master/libvmi/driver/kvm/kvm.c
>>>>
>>>> I'm thinking he assumed reads would be small in size and the price of
>>>> reading garbage was less than the price of writing a more complicated
>>>> protocol. I can see his point, confronted with the same problem, I
>>>> might have done the same.
>>> All right, the interface is designed for *small* memory blocks then.
>>>
>>> Makes me wonder why he needs a separate binary protocol on a separate
>>> socket. Small blocks could be done just fine in QMP.
>> The problem is speed. If one's analyzing the memory space of a running
>> process (physical and paged), libvmi will make a large number of small
>> and mid-sized reads. If one uses xp, or pmemsave, the overhead is
>> quite significant. xp has overhead due to encoding, and pmemsave has
>> overhead due to file open/write (server), file open/read/close/unlink
>> (client).
>>
>> Others have gone through the problem before me. It appears that
>> pmemsave and xp are significantly slower than reading memory using a
>> socket via pmemaccess.
> That they're slower isn't surprising, but I'd expect the cost of
> encoding a small block to be insignificant compared to the cost of the
> network roundtrips.
>
> As block size increases, the space overhead of encoding will eventually
> bite. But for that usage, the binary protocol appears ill-suited,
> unless the client can pretty reliably avoid read failure. I haven't
> examined its failure modes, yet.
>
>> The following data is not mine, but it shows the time, in
>> milliseconds, required to resolve the content of a paged memory
>> address via socket (pmemaccess) , pmemsave and xp
>>
>> http://cl.ly/image/322a3s0h1V05
>>
>> Again, I did not produce those data points, they come from an old
>> libvmi thread.
> 90ms is a very long time. What exactly was measured?
>
>> I think it might be conceivable that there could be a QMP command that
>> returns the content of an arbitrarily sized memory region as a base64
>> or a base85 JSON string. It would still have both time overhead (due to
>> encoding/decoding) and space overhead (base64 adds 33% and base85 would
>> add 7%), plus JSON encoding/decoding overhead. It might still be the
>> case that a socket would outperform such a command as well,
>> speed-wise. I don't think it would be any faster than xp.
> A special-purpose binary protocol over a dedicated socket will always do
> less than a QMP solution (ignoring foolishness like transmitting crap on
> read error the client is then expected to throw away). The question is
> whether the difference in work translates to a worthwhile difference in
> performance.
>
> The larger question is actually whether we have an existing interface
> that can serve libvmi's needs. We've discussed monitor commands
> like xp, pmemsave, pmemread. There's another existing interface: the
> GDB stub. Have you considered it?
>
>> There's also a similar patch, floating around the internet, that uses
>> shared memory, instead of sockets, as inter-process communication
>> between libvmi and QEMU. I've never used that.
> By the time you built a working IPC mechanism on top of shared memory,
> you're often no better off than with AF_LOCAL sockets.
>
> Crazy idea: can we allocate guest memory in a way that supports sharing
> it with another process? Eduardo, can -mem-path do such wild things?
Markus, your suggestion led to a lightbulb going off in my head.
What if there was a qmp command, say 'pmemmap', that when invoked,
performs the following:
qmp_pmemmap( [...]) {
    char *template = "/tmp/QEM_mmap_XXXXXXX";
    int mmap_fd;
    uint8_t *local_memspace = malloc( (size_t) 8589934592 /* assuming VM with 8GB RAM */);

    cpu_physical_memory_rw( (hwaddr) 0, local_memspace, (hwaddr) 8589934592
        /* assuming VM with 8GB RAM */, 0 /* no write for now will discuss write later */);

    mmap_fd = mkstemp("/tmp/QEUM_mmap_XXXXXXX");
    mmap((void *) local_memspace, (size_t) 8589934592, PROT_READ | PROT_WRITE,
        MAP_SHARED | MAP_ANON, mmap_fd, (off_t) 0);

    /* etc */
}
pmemmap would return the following JSON:
{
    'success' : 'true',
    'map_filename' : '/tmp/QEM_mmap_1234567'
}
The qmp client/caller would then allocate a region of memory equal to
the size of the file and mmap() the file '/tmp/QEM_mmap_1234567' into
that region. It would then have (read, maybe write?) access to the full
extent of the guest memory without making any other qmp call. I think it
would be fast, and with low memory usage, as mmap() is pretty efficient.
Of course, there would be a 'pmemunmap' qmp command that would perform
the cleanup:
/* etc. */
munmap()
cpu_physical_memory_unmap();
/* etc. */
Would that work? Is mapping the full extent of the guest RAM too much to
ask of cpu_physical_memory_rw()?
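On the client side, the idea would be roughly the following (again just a
sketch: the size is hard-coded, error handling is minimal, and the filename
would come from the hypothetical pmemmap reply):

/* Map the file returned by 'pmemmap' and read guest physical memory. */
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
    const char *map_filename = "/tmp/QEM_mmap_1234567"; /* from the QMP reply */
    size_t ram_size = 8ULL * 1024 * 1024 * 1024;        /* 8GB guest */

    int fd = open(map_filename, O_RDONLY);
    uint8_t *guest_ram = mmap(NULL, ram_size, PROT_READ, MAP_SHARED, fd, 0);
    if (fd < 0 || guest_ram == MAP_FAILED) {
        return 1;
    }

    /* Read an arbitrary guest physical address without any further QMP call. */
    printf("byte at GPA 0x1000: 0x%02x\n", guest_ram[0x1000]);

    munmap(guest_ram, ram_size);
    close(fd);
    return 0;
}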
>
>>>>>> The socket API was written by the libvmi author and it works with the
>>>>>> current libvmi version. The libvmi client-side implementation is at:
>>>>>>
>>>>>> https://github.com/libvmi/libvmi/blob/master/libvmi/driver/kvm/kvm.c
>>>>>>
>>>>>> As many use kvm VM's for introspection, malware and security analysis,
>>>>>> it might be worth thinking about making the pmemaccess a permanent
>>>>>> hmp/qmp command, as opposed to having to produce a patch at each QEMU
>>>>>> point release.
>>>>> Related existing commands: memsave, pmemsave, dump-guest-memory.
>>>>>
>>>>> Can you explain why these won't do for your use case?
>>>> For people who do security analysis there are two use cases, static
>>>> and dynamic analysis. With memsave, pmemsave and dump-guest-memory one
>>>> can do static analysis, i.e. snapshotting a VM and seeing what was
>>>> happening at that point in time.
>>>> Dynamic analysis requires being able to 'introspect' a VM while it's running.
>>>>
>>>> If you take a snapshot of two people exchanging a glass of water, and
>>>> you happen to take it at the very moment both persons have their hands
>>>> on the glass, it's hard to tell who passed the glass to whom. If you
>>>> have a movie of the same scene, it's obvious who's the giver and who's
>>>> the receiver. Same use case.
>>> I understand the need for introspecting a running guest. What exactly
>>> makes the existing commands unsuitable for that?
>> Speed. See discussion above.
>>>> More to the point, there's a host of C and python frameworks to
>>>> dynamically analyze VMs: volatility, rekal, "drakvuf", etc. They all
>>>> build on top of libvmi. I did not want to reinvent the wheel.
>>> Fair enough.
>>>
>>> Front page http://libvmi.com/ claims "Works with Xen, KVM, Qemu, and Raw
>>> memory files." What exactly is missing for KVM?
>> When they say they support kvm, what they really mean is that they
>> support the (retired, I understand) qemu-kvm fork via a patch that is
>> provided in the libvmi source tree. I think the most recent qemu-kvm
>> supported is 1.6.0.
>>
>> https://github.com/libvmi/libvmi/tree/master/tools/qemu-kvm-patch
>>
>> I wanted to bring support to the head revision of QEMU, to bring
>> libvmi level with more modern QEMU.
>>
>> Maybe the solution is simply to put this patch in the libvmi source
>> tree, which I've already asked to do via pull request, leaving QEMU
>> alone.
>> However, the patch has to be updated at every QEMU point release. I
>> wanted to avoid that, if at all possible.
>>
>>>> Mind you, 99.9% of people that do dynamic VM analysis use xen. They
>>>> contend that xen has better introspection support. In my case, I did
>>>> not want to bother with dedicating a full server to be a xen domain
>>>> 0. I just wanted to do a quick test by standing up a QEMU/kvm VM, in
>>>> an otherwise purposed server.
>>> I'm not at all against better introspection support in QEMU. I'm just
>>> trying to understand the problem you're trying to solve with your
>>> patches.
>> What all users of libvmi would love to have is super high speed access
>> to VM physical memory as part of the QEMU source tree, and not
>> supported via a patch. Implemented as the QEMU owners see fit, as long
>> as it is blazing fast and easily accessed via a client library or
>> inter-process communication.
> The use case makes sense to me, we just need to figure out how we want
> to serve it in QEMU.
>
>> My gut feeling is that it has to bypass QMP protocol/encoding/file
>> access/json to be fast, but, it is just a gut feeling - worth nothing.
> My gut feeling is that QMP should do fine in overhead compared to other
> solutions involving socket I/O as long as the data sizes are *small*.
> Latency might be an issue, though: QMP commands are processed from the
> main loop. A dedicated server thread can be more responsive, but
> letting it write to shared resources could be "interesting".
>
>>>>>> Also, the pmemsave commands QAPI should be changed to be usable with
>>>>>> 64bit VM's
>>>>>>
>>>>>> in qapi-schema.json
>>>>>>
>>>>>> from
>>>>>>
>>>>>> ---
>>>>>> { 'command': 'pmemsave',
>>>>>> 'data': {'val': 'int', 'size': 'int', 'filename': 'str'} }
>>>>>> ---
>>>>>>
>>>>>> to
>>>>>>
>>>>>> ---
>>>>>> { 'command': 'pmemsave',
>>>>>> 'data': {'val': 'int64', 'size': 'int64', 'filename': 'str'} }
>>>>>> ---
>>>>> In the QAPI schema, 'int' is actually an alias for 'int64'. Yes, that's
>>>>> confusing.
>>>> I think it's confusing for the HMP parser too. If you have a VM with
>>>> 8Gb of RAM and want to snapshot the whole physical memory, via HMP
>>>> over telnet this is what happens:
>>>>
>>>> $ telnet localhost 1234
>>>> Trying 127.0.0.1...
>>>> Connected to localhost.
>>>> Escape character is '^]'.
>>>> QEMU 2.4.0.1 monitor - type 'help' for more information
>>>> (qemu) help pmemsave
>>>> pmemsave addr size file -- save to disk physical memory dump starting
>>>> at 'addr' of size 'size'
>>>> (qemu) pmemsave 0 8589934591 "/tmp/memorydump"
>>>> 'pmemsave' has failed: integer is for 32-bit values
>>>> Try "help pmemsave" for more information
>>>> (qemu) quit
>>> Your change to pmemsave's definition in qapi-schema.json is effectively a
>>> no-op.
>>>
>>> Your example shows *HMP* command pmemsave. The definition of an HMP
>>> command is *independent* of the QMP command. The implementation *uses*
>>> the QMP command.
>>>
>>> QMP pmemsave is defined in qapi-schema.json as
>>>
>>> { 'command': 'pmemsave',
>>> 'data': {'val': 'int', 'size': 'int', 'filename': 'str'} }
>>>
>>> Its implementation is in cpus.c:
>>>
>>> void qmp_pmemsave(int64_t addr, int64_t size, const char *filename,
>>> Error **errp)
>>>
>>> Note the int64_t size.
>>>
>>> HMP pmemsave is defined in hmp-commands.hx as
>>>
>>> {
>>> .name = "pmemsave",
>>> .args_type = "val:l,size:i,filename:s",
>>> .params = "addr size file",
>>> .help = "save to disk physical memory dump starting at 'addr' of size 'size'",
>>> .mhandler.cmd = hmp_pmemsave,
>>> },
>>>
>>> Its implementation is in hmp.c:
>>>
>>> void hmp_pmemsave(Monitor *mon, const QDict *qdict)
>>> {
>>> uint32_t size = qdict_get_int(qdict, "size");
>>> const char *filename = qdict_get_str(qdict, "filename");
>>> uint64_t addr = qdict_get_int(qdict, "val");
>>> Error *err = NULL;
>>>
>>> qmp_pmemsave(addr, size, filename, &err);
>>> hmp_handle_error(mon, &err);
>>> }
>>>
>>> Note uint32_t size.
>>>
>>> Arguably, the QMP size argument should use 'size' (an alias for
>>> 'uint64'), and the HMP args_type should use 'size:o'.
>> I understand all that. Indeed, I've re-implemented 'pmemaccess' the same
>> way pmemsave is implemented. There is a single function, and two
>> entry points, one for HMP and one for QMP. I think pmemaccess
>> mimics pmemsave closely.
>>
>> However, if one wants to simply dump a memory region via HMP, for
>> human ease of use/debug/testing purposes, one cannot dump memory
>> regions that reside higher than 2^32-1.
> Can you give an example?
>
> [...]
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [Qemu-devel] QEMU patch to allow VM introspection via libvmi
2015-10-22 18:43 ` Valerio Aimale
@ 2015-10-22 18:54 ` Eric Blake
0 siblings, 0 replies; 43+ messages in thread
From: Eric Blake @ 2015-10-22 18:54 UTC (permalink / raw)
To: Valerio Aimale, Markus Armbruster; +Cc: qemu-devel, ehabkost, lcapitulino
On 10/22/2015 12:43 PM, Valerio Aimale wrote:
>
> What if there was a qmp command, say 'pmemmap', that when invoked,
> performs the following:
>
> qmp_pmemmap( [...]) {
>
> char *template = "/tmp/QEM_mmap_XXXXXXX";
Why not let the caller pass in the file name, rather than opening it
ourselves? But the idea of coordinating a file that both caller and qemu
mmap to the same guest memory view might indeed have merit.
[by the way, it's okay to trim messages to the relevant portions, rather
than making readers scroll through lots of quoted content to find the meat]
--
Eric Blake eblake redhat com +1-919-301-3266
Libvirt virtualization library http://libvirt.org
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [Qemu-devel] QEMU patch to allow VM introspection via libvmi
2015-10-21 10:54 ` Markus Armbruster
2015-10-21 15:50 ` Valerio Aimale
2015-10-22 18:43 ` Valerio Aimale
@ 2015-10-22 19:12 ` Eduardo Habkost
2015-10-22 19:57 ` Valerio Aimale
2015-10-23 6:35 ` Markus Armbruster
2 siblings, 2 replies; 43+ messages in thread
From: Eduardo Habkost @ 2015-10-22 19:12 UTC (permalink / raw)
To: Markus Armbruster; +Cc: qemu-devel, Valerio Aimale, lcapitulino
On Wed, Oct 21, 2015 at 12:54:23PM +0200, Markus Armbruster wrote:
> Valerio Aimale <valerio@aimale.com> writes:
[...]
> > There's also a similar patch, floating around the internet, that uses
> > shared memory, instead of sockets, as inter-process communication
> > between libvmi and QEMU. I've never used that.
>
> By the time you built a working IPC mechanism on top of shared memory,
> you're often no better off than with AF_LOCAL sockets.
>
> Crazy idea: can we allocate guest memory in a way that supports sharing
> it with another process? Eduardo, can -mem-path do such wild things?
It can't today, but only because it creates a temporary file inside
mem-path and unlinks it immediately after opening a file descriptor. We
could make memory-backend-file also accept a full filename as an
argument, or add a mechanism to let QEMU send the open file descriptor
to a QMP client.
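(For what it's worth, sending an open file descriptor to a client over a
UNIX-domain socket is the usual SCM_RIGHTS dance; a minimal sketch of the
sending side, not tied to any existing QEMU code, might look like this:)

#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>

/* Send an already-open fd (e.g. the guest RAM backing file) to the peer
 * connected on UNIX-domain socket 'sock'. */
static int send_fd(int sock, int fd)
{
    char dummy = 'F';                       /* must send at least one byte */
    struct iovec iov = { .iov_base = &dummy, .iov_len = 1 };
    char cbuf[CMSG_SPACE(sizeof(int))];
    struct msghdr msg = {
        .msg_iov = &iov, .msg_iovlen = 1,
        .msg_control = cbuf, .msg_controllen = sizeof(cbuf),
    };

    struct cmsghdr *cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

    return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}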
--
Eduardo
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [Qemu-devel] QEMU patch to allow VM introspection via libvmi
2015-10-22 19:12 ` Eduardo Habkost
@ 2015-10-22 19:57 ` Valerio Aimale
2015-10-22 20:03 ` Eric Blake
2015-10-22 21:47 ` Eduardo Habkost
2015-10-23 6:35 ` Markus Armbruster
1 sibling, 2 replies; 43+ messages in thread
From: Valerio Aimale @ 2015-10-22 19:57 UTC (permalink / raw)
To: Eduardo Habkost, Markus Armbruster; +Cc: qemu-devel, lcapitulino
On 10/22/15 1:12 PM, Eduardo Habkost wrote:
> On Wed, Oct 21, 2015 at 12:54:23PM +0200, Markus Armbruster wrote:
>> Valerio Aimale <valerio@aimale.com> writes:
> [...]
>>> There's also a similar patch, floating around the internet, that uses
>>> shared memory, instead of sockets, as inter-process communication
>>> between libvmi and QEMU. I've never used that.
>> By the time you built a working IPC mechanism on top of shared memory,
>> you're often no better off than with AF_LOCAL sockets.
>>
>> Crazy idea: can we allocate guest memory in a way that supports sharing
>> it with another process? Eduardo, can -mem-path do such wild things?
> It can't today, but just because it creates a temporary file inside
> mem-path and unlinks it immediately after opening a file descriptor. We
> could make memory-backend-file also accept a full filename as argument,
> or add a mechanism to let QEMU send the open file descriptor to a QMP
> client.
>
Eduardo, would my "artisanal" idea of creating an mmap'ed image of the
guest memory footprint work, augmented by Eric's suggestion of having
the qmp client pass the filename?
qmp_pmemmap( [...]) {
    char *template = "/tmp/QEM_mmap_XXXXXXX";
    int mmap_fd;
    uint8_t *local_memspace = malloc( (size_t) 8589934592 /* assuming VM with 8GB RAM */);

    cpu_physical_memory_rw( (hwaddr) 0, local_memspace, (hwaddr) 8589934592
        /* assuming VM with 8GB RAM */, 0 /* no write for now will discuss write later */);

    mmap_fd = mkstemp("/tmp/QEUM_mmap_XXXXXXX");
    mmap((void *) local_memspace, (size_t) 8589934592, PROT_READ | PROT_WRITE,
        MAP_SHARED | MAP_ANON, mmap_fd, (off_t) 0);

    /* etc */
}
pmemmap would return the following JSON:
{
    'success' : 'true',
    'map_filename' : '/tmp/QEM_mmap_1234567'
}
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [Qemu-devel] QEMU patch to allow VM introspection via libvmi
2015-10-22 19:57 ` Valerio Aimale
@ 2015-10-22 20:03 ` Eric Blake
2015-10-22 20:45 ` Valerio Aimale
2015-10-22 21:47 ` Eduardo Habkost
1 sibling, 1 reply; 43+ messages in thread
From: Eric Blake @ 2015-10-22 20:03 UTC (permalink / raw)
To: Valerio Aimale, Eduardo Habkost, Markus Armbruster
Cc: qemu-devel, lcapitulino
On 10/22/2015 01:57 PM, Valerio Aimale wrote:
>
> pmemmap would return the following json
>
> {
> 'success' : 'true',
> 'map_filename' : '/tmp/QEM_mmap_1234567'
> }
In general, it is better if the client controls the filename, and not
qemu. This is because things like libvirt like to run qemu in a
highly-constrained environment, where the caller can pass in a file
descriptor that qemu cannot itself open(). So returning a filename is
pointless if the filename was already provided by the caller.
--
Eric Blake eblake redhat com +1-919-301-3266
Libvirt virtualization library http://libvirt.org
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [Qemu-devel] QEMU patch to allow VM introspection via libvmi
2015-10-22 20:03 ` Eric Blake
@ 2015-10-22 20:45 ` Valerio Aimale
0 siblings, 0 replies; 43+ messages in thread
From: Valerio Aimale @ 2015-10-22 20:45 UTC (permalink / raw)
To: Eric Blake, Eduardo Habkost, Markus Armbruster; +Cc: qemu-devel, lcapitulino
On 10/22/15 2:03 PM, Eric Blake wrote:
> On 10/22/2015 01:57 PM, Valerio Aimale wrote:
>
>> pmemmap would return the following json
>>
>> {
>> 'success' : 'true',
>> 'map_filename' : '/tmp/QEM_mmap_1234567'
>> }
> In general, it is better if the client controls the filename, and not
> qemu. This is because things like libvirt like to run qemu in a
> highly-constrained environment, where the caller can pass in a file
> descriptor that qemu cannot itself open(). So returning a filename is
> pointless if the filename was already provided by the caller.
>
Eric, I absolutely and positively agree with you. I was just
brainstorming. Consider my pseudo-C code as the mailing-list analog of
somebody scribbling on a whiteboard and trying to explain an idea.
I agree with you that the file name should come from the qmp client.
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [Qemu-devel] QEMU patch to allow VM introspection via libvmi
2015-10-22 19:57 ` Valerio Aimale
2015-10-22 20:03 ` Eric Blake
@ 2015-10-22 21:47 ` Eduardo Habkost
2015-10-22 21:51 ` Valerio Aimale
1 sibling, 1 reply; 43+ messages in thread
From: Eduardo Habkost @ 2015-10-22 21:47 UTC (permalink / raw)
To: Valerio Aimale; +Cc: lcapitulino, Markus Armbruster, qemu-devel
On Thu, Oct 22, 2015 at 01:57:13PM -0600, Valerio Aimale wrote:
> On 10/22/15 1:12 PM, Eduardo Habkost wrote:
> >On Wed, Oct 21, 2015 at 12:54:23PM +0200, Markus Armbruster wrote:
> >>Valerio Aimale <valerio@aimale.com> writes:
> >[...]
> >>>There's also a similar patch, floating around the internet, that uses
> >>>shared memory, instead of sockets, as inter-process communication
> >>>between libvmi and QEMU. I've never used that.
> >>By the time you built a working IPC mechanism on top of shared memory,
> >>you're often no better off than with AF_LOCAL sockets.
> >>
> >>Crazy idea: can we allocate guest memory in a way that supports sharing
> >>it with another process? Eduardo, can -mem-path do such wild things?
> >It can't today, but just because it creates a temporary file inside
> >mem-path and unlinks it immediately after opening a file descriptor. We
> >could make memory-backend-file also accept a full filename as argument,
> >or add a mechanism to let QEMU send the open file descriptor to a QMP
> >client.
> >
> Eduardo, would my "artisanal" idea of creating an mmap'ed image of the guest
> memory footprint work, augmented by Eric's suggestion of having the qmp
> client pass the filename?
The code below doesn't make sense to me.
>
> qmp_pmemmap( [...]) {
>
> char *template = "/tmp/QEM_mmap_XXXXXXX";
> int mmap_fd;
> uint8_t *local_memspace = malloc( (size_t) 8589934592 /* assuming VM
> with 8GB RAM */);
>
> cpu_physical_memory_rw( (hwaddr) 0, local_memspace , (hwaddr)
> 8589934592 /* assuming VM with 8GB RAM */, 0 /* no write for now will
> discuss write later */);
Are you suggesting copying all the VM RAM contents to a whole new area?
Why?
>
> mmap_fd = mkstemp("/tmp/QEUM_mmap_XXXXXXX");
>
> mmap((void *) local_memspace, (size_t) 8589934592, PROT_READ |
> PROT_WRITE, MAP_SHARED | MAP_ANON, mmap_fd, (off_t) 0);
I don't understand what you are trying to do here.
>
> /* etc */
>
> }
>
> pmemmap would return the following json
>
> {
> 'success' : 'true',
> 'map_filename' : '/tmp/QEM_mmap_1234567'
> }
>
>
--
Eduardo
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [Qemu-devel] QEMU patch to allow VM introspection via libvmi
2015-10-22 21:47 ` Eduardo Habkost
@ 2015-10-22 21:51 ` Valerio Aimale
2015-10-23 8:25 ` Daniel P. Berrange
2015-10-23 18:55 ` Eduardo Habkost
0 siblings, 2 replies; 43+ messages in thread
From: Valerio Aimale @ 2015-10-22 21:51 UTC (permalink / raw)
To: Eduardo Habkost; +Cc: lcapitulino, Markus Armbruster, qemu-devel
On 10/22/15 3:47 PM, Eduardo Habkost wrote:
> On Thu, Oct 22, 2015 at 01:57:13PM -0600, Valerio Aimale wrote:
>> On 10/22/15 1:12 PM, Eduardo Habkost wrote:
>>> On Wed, Oct 21, 2015 at 12:54:23PM +0200, Markus Armbruster wrote:
>>>> Valerio Aimale <valerio@aimale.com> writes:
>>> [...]
>>>>> There's also a similar patch, floating around the internet, the uses
>>>>> shared memory, instead of sockets, as inter-process communication
>>>>> between libvmi and QEMU. I've never used that.
>>>> By the time you built a working IPC mechanism on top of shared memory,
>>>> you're often no better off than with AF_LOCAL sockets.
>>>>
>>>> Crazy idea: can we allocate guest memory in a way that support sharing
>>>> it with another process? Eduardo, can -mem-path do such wild things?
>>> It can't today, but just because it creates a temporary file inside
>>> mem-path and unlinks it immediately after opening a file descriptor. We
>>> could make memory-backend-file also accept a full filename as argument,
>>> or add a mechanism to let QEMU send the open file descriptor to a QMP
>>> client.
>>>
>> Eduardo, would my "artisanal" idea of creating an mmap'ed image of the guest
>> memory footprint work, augmented by Eric's suggestion of having the qmp
>> client pass the filename?
> The code below doesn't make sense to me.
Ok. What I am trying to do is to create an mmap'ed memory area of the
guest physical memory that can be shared between QEMU and an external
process, such that the external process can read arbitrary locations of
the qemu guest physical memory.
In short, I'm using mmap MAP_SHARED to share the guest memory area with
a process that is external to QEMU.
Does it make better sense now?
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [Qemu-devel] QEMU patch to allow VM introspection via libvmi
2015-10-22 21:51 ` Valerio Aimale
@ 2015-10-23 8:25 ` Daniel P. Berrange
2015-10-23 19:00 ` Eduardo Habkost
2015-10-23 18:55 ` Eduardo Habkost
1 sibling, 1 reply; 43+ messages in thread
From: Daniel P. Berrange @ 2015-10-23 8:25 UTC (permalink / raw)
To: Valerio Aimale
Cc: qemu-devel, Markus Armbruster, Eduardo Habkost, lcapitulino
On Thu, Oct 22, 2015 at 03:51:28PM -0600, Valerio Aimale wrote:
> On 10/22/15 3:47 PM, Eduardo Habkost wrote:
> >On Thu, Oct 22, 2015 at 01:57:13PM -0600, Valerio Aimale wrote:
> >>On 10/22/15 1:12 PM, Eduardo Habkost wrote:
> >>>On Wed, Oct 21, 2015 at 12:54:23PM +0200, Markus Armbruster wrote:
> >>>>Valerio Aimale <valerio@aimale.com> writes:
> >>>[...]
> >>>>>There's also a similar patch, floating around the internet, the uses
> >>>>>shared memory, instead of sockets, as inter-process communication
> >>>>>between libvmi and QEMU. I've never used that.
> >>>>By the time you built a working IPC mechanism on top of shared memory,
> >>>>you're often no better off than with AF_LOCAL sockets.
> >>>>
> >>>>Crazy idea: can we allocate guest memory in a way that support sharing
> >>>>it with another process? Eduardo, can -mem-path do such wild things?
> >>>It can't today, but just because it creates a temporary file inside
> >>>mem-path and unlinks it immediately after opening a file descriptor. We
> >>>could make memory-backend-file also accept a full filename as argument,
> >>>or add a mechanism to let QEMU send the open file descriptor to a QMP
> >>>client.
> >>>
> >>Eduardo, would my "artisanal" idea of creating an mmap'ed image of the guest
> >>memory footprint work, augmented by Eric's suggestion of having the qmp
> >>client pass the filename?
> >The code below doesn't make sense to me.
>
> Ok. What I am trying to do is to create a mmapped() memory area of the guest
> physical memory that can be shared between QEMU and an external process,
> such that the external process can read arbitrary locations of the qemu
> guest physical memory.
> In short, I'm using mmap MAP_SHARED to share the guest memory area with a
> process that is external to QEMU
I wonder if it is possible for you to get access to the guest memory via
procfs instead, by simply accessing /proc/$PID/mem, avoiding the need for
any special help from QEMU. /proc/$PID/maps can be used to find the
offset of the guest memory region too, by looking for the map region
that has the size that matches guest RAM size.
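For concreteness, a rough sketch of that idea (purely illustrative, not
from any patch in this thread; the "pick the largest mapping" heuristic
and the pid command-line argument are my assumptions, and reading
/proc/$PID/mem needs ptrace-level access permission to the QEMU process):
---
#include <fcntl.h>
#include <inttypes.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(int argc, char **argv)
{
    char path[64], line[512];
    uint64_t best_start = 0, best_len = 0;
    unsigned char buf[16];
    FILE *maps;
    int mem, pid;

    if (argc < 2) {
        fprintf(stderr, "usage: %s <qemu-pid>\n", argv[0]);
        return 1;
    }
    pid = atoi(argv[1]);

    /* Scan /proc/$PID/maps and remember the largest mapping; assume
     * (fragile heuristic) that it is the guest RAM block. */
    snprintf(path, sizeof(path), "/proc/%d/maps", pid);
    maps = fopen(path, "r");
    if (!maps) { perror("maps"); return 1; }
    while (fgets(line, sizeof(line), maps)) {
        uint64_t start, end;
        if (sscanf(line, "%" SCNx64 "-%" SCNx64, &start, &end) == 2 &&
            end - start > best_len) {
            best_start = start;
            best_len = end - start;
        }
    }
    fclose(maps);

    /* pread() what would be guest-physical offset 0 out of the QEMU
     * process's memory. */
    snprintf(path, sizeof(path), "/proc/%d/mem", pid);
    mem = open(path, O_RDONLY);
    if (mem < 0) { perror("mem"); return 1; }
    if (pread(mem, buf, sizeof(buf), (off_t) best_start) == (ssize_t) sizeof(buf)) {
        for (size_t i = 0; i < sizeof(buf); i++)
            printf("%02x ", buf[i]);
        printf("\n");
    }
    close(mem);
    return 0;
}
---
A real implementation would also have to check that the region is
anonymous and writable, and cope with guests whose RAM is split across
several blocks.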
Regards,
Daniel
--
|: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org -o- http://virt-manager.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [Qemu-devel] QEMU patch to allow VM introspection via libvmi
2015-10-23 8:25 ` Daniel P. Berrange
@ 2015-10-23 19:00 ` Eduardo Habkost
0 siblings, 0 replies; 43+ messages in thread
From: Eduardo Habkost @ 2015-10-23 19:00 UTC (permalink / raw)
To: Daniel P. Berrange
Cc: qemu-devel, Markus Armbruster, Valerio Aimale, lcapitulino
On Fri, Oct 23, 2015 at 09:25:03AM +0100, Daniel P. Berrange wrote:
> On Thu, Oct 22, 2015 at 03:51:28PM -0600, Valerio Aimale wrote:
> > On 10/22/15 3:47 PM, Eduardo Habkost wrote:
> > >On Thu, Oct 22, 2015 at 01:57:13PM -0600, Valerio Aimale wrote:
> > >>On 10/22/15 1:12 PM, Eduardo Habkost wrote:
> > >>>On Wed, Oct 21, 2015 at 12:54:23PM +0200, Markus Armbruster wrote:
> > >>>>Valerio Aimale <valerio@aimale.com> writes:
> > >>>[...]
> > >>>>>There's also a similar patch, floating around the internet, the uses
> > >>>>>shared memory, instead of sockets, as inter-process communication
> > >>>>>between libvmi and QEMU. I've never used that.
> > >>>>By the time you built a working IPC mechanism on top of shared memory,
> > >>>>you're often no better off than with AF_LOCAL sockets.
> > >>>>
> > >>>>Crazy idea: can we allocate guest memory in a way that support sharing
> > >>>>it with another process? Eduardo, can -mem-path do such wild things?
> > >>>It can't today, but just because it creates a temporary file inside
> > >>>mem-path and unlinks it immediately after opening a file descriptor. We
> > >>>could make memory-backend-file also accept a full filename as argument,
> > >>>or add a mechanism to let QEMU send the open file descriptor to a QMP
> > >>>client.
> > >>>
> > >>Eduardo, would my "artisanal" idea of creating an mmap'ed image of the guest
> > >>memory footprint work, augmented by Eric's suggestion of having the qmp
> > >>client pass the filename?
> > >The code below doesn't make sense to me.
> >
> > Ok. What I am trying to do is to create a mmapped() memory area of the guest
> > physical memory that can be shared between QEMU and an external process,
> > such that the external process can read arbitrary locations of the qemu
> > guest physical memory.
> > In short, I'm using mmap MAP_SHARED to share the guest memory area with a
> > process that is external to QEMU
>
> I wonder if it is possible for you get access to the guest memory via
> sysfs instead, by simply accessing /proc/$PID/mem avoiding the need for
> any special help from QEMU. /proc/$PID/maps can be used to find the
> offset of the guest memory region too, but looking for the map region
> that has the size that matches guest RAM size.
If libvmi needs to work with existing VMs (without reconfiguring and
restarting them), this may be the only way we could let it access guest
memory directly.
But I wouldn't trust the method of simply looking for the map region
that has the right size. This would be making many assumptions about how
exactly QEMU allocates guest RAM internally. Also, it would require
libvmi to ptrace-attach to QEMU first.
--
Eduardo
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [Qemu-devel] QEMU patch to allow VM introspection via libvmi
2015-10-22 21:51 ` Valerio Aimale
2015-10-23 8:25 ` Daniel P. Berrange
@ 2015-10-23 18:55 ` Eduardo Habkost
2015-10-23 19:08 ` Valerio Aimale
1 sibling, 1 reply; 43+ messages in thread
From: Eduardo Habkost @ 2015-10-23 18:55 UTC (permalink / raw)
To: Valerio Aimale; +Cc: lcapitulino, Markus Armbruster, qemu-devel
On Thu, Oct 22, 2015 at 03:51:28PM -0600, Valerio Aimale wrote:
> On 10/22/15 3:47 PM, Eduardo Habkost wrote:
> >On Thu, Oct 22, 2015 at 01:57:13PM -0600, Valerio Aimale wrote:
> >>On 10/22/15 1:12 PM, Eduardo Habkost wrote:
> >>>On Wed, Oct 21, 2015 at 12:54:23PM +0200, Markus Armbruster wrote:
> >>>>Valerio Aimale <valerio@aimale.com> writes:
> >>>[...]
> >>>>>There's also a similar patch, floating around the internet, the uses
> >>>>>shared memory, instead of sockets, as inter-process communication
> >>>>>between libvmi and QEMU. I've never used that.
> >>>>By the time you built a working IPC mechanism on top of shared memory,
> >>>>you're often no better off than with AF_LOCAL sockets.
> >>>>
> >>>>Crazy idea: can we allocate guest memory in a way that support sharing
> >>>>it with another process? Eduardo, can -mem-path do such wild things?
> >>>It can't today, but just because it creates a temporary file inside
> >>>mem-path and unlinks it immediately after opening a file descriptor. We
> >>>could make memory-backend-file also accept a full filename as argument,
> >>>or add a mechanism to let QEMU send the open file descriptor to a QMP
> >>>client.
> >>>
> >>Eduardo, would my "artisanal" idea of creating an mmap'ed image of the guest
> >>memory footprint work, augmented by Eric's suggestion of having the qmp
> >>client pass the filename?
> >The code below doesn't make sense to me.
>
> Ok. What I am trying to do is to create a mmapped() memory area of the guest
> physical memory that can be shared between QEMU and an external process,
> such that the external process can read arbitrary locations of the qemu
> guest physical memory.
> In short, I'm using mmap MAP_SHARED to share the guest memory area with a
> process that is external to QEMU
>
> does it make better sense now?
I think you are confused about what mmap() does. It will create a new
mapping into the process address space, containing the data from an
existing file, not the other way around.
--
Eduardo
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [Qemu-devel] QEMU patch to allow VM introspection via libvmi
2015-10-23 18:55 ` Eduardo Habkost
@ 2015-10-23 19:08 ` Valerio Aimale
2015-10-26 9:09 ` Markus Armbruster
0 siblings, 1 reply; 43+ messages in thread
From: Valerio Aimale @ 2015-10-23 19:08 UTC (permalink / raw)
To: Eduardo Habkost; +Cc: lcapitulino, Markus Armbruster, qemu-devel
On 10/23/15 12:55 PM, Eduardo Habkost wrote:
> On Thu, Oct 22, 2015 at 03:51:28PM -0600, Valerio Aimale wrote:
>> On 10/22/15 3:47 PM, Eduardo Habkost wrote:
>>> On Thu, Oct 22, 2015 at 01:57:13PM -0600, Valerio Aimale wrote:
>>>> On 10/22/15 1:12 PM, Eduardo Habkost wrote:
>>>>> On Wed, Oct 21, 2015 at 12:54:23PM +0200, Markus Armbruster wrote:
>>>>>> Valerio Aimale <valerio@aimale.com> writes:
>>>>> [...]
>>>>>>> There's also a similar patch, floating around the internet, the uses
>>>>>>> shared memory, instead of sockets, as inter-process communication
>>>>>>> between libvmi and QEMU. I've never used that.
>>>>>> By the time you built a working IPC mechanism on top of shared memory,
>>>>>> you're often no better off than with AF_LOCAL sockets.
>>>>>>
>>>>>> Crazy idea: can we allocate guest memory in a way that support sharing
>>>>>> it with another process? Eduardo, can -mem-path do such wild things?
>>>>> It can't today, but just because it creates a temporary file inside
>>>>> mem-path and unlinks it immediately after opening a file descriptor. We
>>>>> could make memory-backend-file also accept a full filename as argument,
>>>>> or add a mechanism to let QEMU send the open file descriptor to a QMP
>>>>> client.
>>>>>
>>>> Eduardo, would my "artisanal" idea of creating an mmap'ed image of the guest
>>>> memory footprint work, augmented by Eric's suggestion of having the qmp
>>>> client pass the filename?
>>> The code below doesn't make sense to me.
>> Ok. What I am trying to do is to create a mmapped() memory area of the guest
>> physical memory that can be shared between QEMU and an external process,
>> such that the external process can read arbitrary locations of the qemu
>> guest physical memory.
>> In short, I'm using mmap MAP_SHARED to share the guest memory area with a
>> process that is external to QEMU
>>
>> does it make better sense now?
> I think you are confused about what mmap() does. It will create a new
> mapping into the process address space, containing the data from an
> existing file, not the other way around.
>
Eduardo, I think it would be a common rule of politeness not to pass any
judgement on a person that you don't know, but for some texts in a
mailing list. I think I understand how mmap() works, and very well.
Participating in this discussion has been a struggle for me. For the
good of the libvmi users, I have been trying to ignore the judgements,
the comments and so on. But, alas, I throw my hands up in the air, and I
surrender.
I think libvmi can live, as it has for the past years, by patching the
QEMU source tree on an as-needed basis, and keeping the patch in the libvmi
source tree, without disturbing the QEMU community any further.
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [Qemu-devel] QEMU patch to allow VM introspection via libvmi
2015-10-23 19:08 ` Valerio Aimale
@ 2015-10-26 9:09 ` Markus Armbruster
2015-10-26 17:37 ` Valerio Aimale
0 siblings, 1 reply; 43+ messages in thread
From: Markus Armbruster @ 2015-10-26 9:09 UTC (permalink / raw)
To: Valerio Aimale; +Cc: qemu-devel, Eduardo Habkost, lcapitulino
Valerio Aimale <valerio@aimale.com> writes:
> On 10/23/15 12:55 PM, Eduardo Habkost wrote:
>> On Thu, Oct 22, 2015 at 03:51:28PM -0600, Valerio Aimale wrote:
>>> On 10/22/15 3:47 PM, Eduardo Habkost wrote:
>>>> On Thu, Oct 22, 2015 at 01:57:13PM -0600, Valerio Aimale wrote:
>>>>> On 10/22/15 1:12 PM, Eduardo Habkost wrote:
>>>>>> On Wed, Oct 21, 2015 at 12:54:23PM +0200, Markus Armbruster wrote:
>>>>>>> Valerio Aimale <valerio@aimale.com> writes:
>>>>>> [...]
>>>>>>>> There's also a similar patch, floating around the internet, the uses
>>>>>>>> shared memory, instead of sockets, as inter-process communication
>>>>>>>> between libvmi and QEMU. I've never used that.
>>>>>>> By the time you built a working IPC mechanism on top of shared memory,
>>>>>>> you're often no better off than with AF_LOCAL sockets.
>>>>>>>
>>>>>>> Crazy idea: can we allocate guest memory in a way that support sharing
>>>>>>> it with another process? Eduardo, can -mem-path do such wild things?
>>>>>> It can't today, but just because it creates a temporary file inside
>>>>>> mem-path and unlinks it immediately after opening a file descriptor. We
>>>>>> could make memory-backend-file also accept a full filename as argument,
>>>>>> or add a mechanism to let QEMU send the open file descriptor to a QMP
>>>>>> client.
>>>>>>
>>>>> Eduardo, would my "artisanal" idea of creating an mmap'ed image
>>>>> of the guest
>>>>> memory footprint work, augmented by Eric's suggestion of having the qmp
>>>>> client pass the filename?
>>>> The code below doesn't make sense to me.
>>> Ok. What I am trying to do is to create a mmapped() memory area of the guest
>>> physical memory that can be shared between QEMU and an external process,
>>> such that the external process can read arbitrary locations of the qemu
>>> guest physical memory.
>>> In short, I'm using mmap MAP_SHARED to share the guest memory area with a
>>> process that is external to QEMU
>>>
>>> does it make better sense now?
>> I think you are confused about what mmap() does. It will create a new
>> mapping into the process address space, containing the data from an
>> existing file, not the other way around.
>>
> Eduardo, I think it would be a common rule of politeness not to pass
> any judgement on a person that you don't know, but for some texts in a
> mailing list. I think I understand how mmap() works, and very well.
>
> Participating is this discussion has been a struggle for me. For the
> good of the libvmi users, I have been trying to ignore the judgements,
> the comments and so on. But, alas, I throw my hands up in the air, and
> I surrender.
I'm sorry we exceeded your tolerance for frustration. This mailing list
can be tough. We try to be welcoming (believe it or not), but we too
often fail (okay, that part is easily believable).
To be honest, I had difficulties understanding your explanation, and
ended up guessing. I figure Eduardo did the same, and guessed
incorrectly. There but for the grace of God go I.
> I think libvmi can live, as it has for the past years, by patching the
> QEMU source tree on as needed basis, and keeping the patch in the
> libvmi source tree, without disturbing any further the QEMU community.
I'm sure libvmi can continue to require a patched QEMU, but I'm equally
sure getting its needs satisfied out of the box would be better for all.
To get that done, we need to understand the problem, and map the
solution space.
So let me try to summarize the thread, and what I've learned from it so
far. Valerio, if you could correct misunderstandings, I'd be much
obliged.
LibVMI is a C library with Python bindings that makes it easy to monitor
the low-level details of a running virtual machine by viewing its
memory, trapping on hardware events, and accessing the vCPU registers.
This is called virtual machine introspection.
[Direct quote from http://libvmi.com/]
For that purpose, LibVMI needs (among other things) sufficiently fast
means to read and write guest memory.
Its existing solution is a non-upstream patch that adds a new, simple
protocol for reading and writing physical guest memory over TCP, and
monitor commands to start this service.
This thread is in reply to Valerio's attempt to upstream this patch.
Good move.
The usual questions for feature requests apply:
1. Is this a use case we want to serve?
Unreserved yes. Supporting virtual machine introspection with LibVMI
makes sense.
2. Can it be served by existing guest introspection interfaces?
Not quite clear, yet.
LibVMI interacts with QEMU/KVM virtual machines via libvirt. We want
to be able to start and stop introspecting running virtual machines
managed by libvirt. Rules out solutions that require QEMU to be
started with special command line options. We really want QMP
commands. HMP commands would work, but HMP is not a stable
interface. It's fine for prototyping, of course.
Interfaces discussed include the following monitor commands:
* x, xp
There's overhead for encoding / decoding. LibVMI developers
apparently have found it too slow. They measured 90ms, which is very
slow indeed. Turns out LibVMI goes through virsh every time! I
suspect the overhead of starting virsh dwarfs everything else several
times over. Therefore, we don't have an idea of this method's real
overhead yet.
Transferring large blocks through the monitor connection can be
problematic. The monitor is control plane, bulk data should go
through a data plane. Not sure whether this is an issue for
LibVMI's usage.
x and xp are only in HMP.
They cover only reading.
* memsave, pmemsave
Similar to x, xp, except here we use the file system as data plane.
Additionally, we trade encoding / decoding overhead for temporary
file handling overhead.
Cover only reading.
* dump-guest-memory
More powerful (and more complex) than memsave, pmemsave, but geared
towards a different use case: taking a crash dump of a running VM
for static analysis.
Covers only reading.
Dumps in ELF or compressed kdump format.
Supports dump to file descriptor, which could perhaps be used to
avoid temporary files.
* gdbserver
This starts a server for the GDB Remote Serial Protocol on a TCP
port. This is for debugging the *guest*, not QEMU. Can do much
more than read and write memory (a minimal memory-read exchange over
this protocol is sketched at the end of this message).
README.rst in the LibVMI source tree suggests
- LibVMI can already use this interface, but it's apparently slower
than using its non-upstream patch. How much?
- It requires the user to configure his VMs to be started with QEMU
command line option -s, which means you can't introspect a
running VM you didn't prepare for it from the start. Easy enough
to improve: use the monitor command!
gdbserver is only in HMP, probably because it's viewed as a
debugging tool for human use. LibVMI would be a programmatic user,
so adding gdbserver to QMP could be justified.
3. Assuming existing interfaces won't do, what could a new one look like?
* Special purpose TCP protocol, QMP command to start/stop the service
This is the existing non-upstream solution.
We can quibble over the protocol, e.g. its weird handling of read
errors, but that's a detail.
* Share guest memory with LibVMI somehow
Still in the "crazy idea" stage. Fish the memory out of
/proc/$PID/mem? Create a file QEMU and LibVMI can mmap?
Writing has the same synchronization problems as with multi-threaded
TCG. These are being addressed, but I have no idea how the
solutions will translate.
I'm afraid this got a bit long. Thanks for reading this far.
I hope we can work together and find a solution that satisfies LibVMI.
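For concreteness, here is a minimal sketch of the kind of exchange
LibVMI has to perform when it goes through gdbserver (my illustration,
not LibVMI code; it assumes a stub listening on localhost:1234, QEMU's
-s default, and it glosses over acknowledgement and retransmission):
---
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Wrap a payload in a GDB Remote Serial Protocol packet: $<payload>#<sum>. */
static void send_packet(int fd, const char *payload)
{
    unsigned char sum = 0;
    char pkt[128];
    const char *c;

    for (c = payload; *c; c++)
        sum += (unsigned char) *c;
    snprintf(pkt, sizeof(pkt), "$%s#%02x", payload, sum);
    write(fd, pkt, strlen(pkt));
}

int main(void)
{
    struct sockaddr_in addr;
    char cmd[64], reply[4096];
    ssize_t n;
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_port = htons(1234);              /* QEMU's -s default port */
    inet_pton(AF_INET, "127.0.0.1", &addr.sin_addr);
    if (connect(fd, (struct sockaddr *) &addr, sizeof(addr)) < 0) {
        perror("connect");
        return 1;
    }

    /* "m<addr>,<len>": ask the stub for 64 bytes at address 0x1000. */
    snprintf(cmd, sizeof(cmd), "m%x,%x", 0x1000, 64);
    send_packet(fd, cmd);

    /* Expect "+$<hex bytes>#<checksum>", or "$Exx#.." on error. */
    n = read(fd, reply, sizeof(reply) - 1);
    if (n > 0) {
        reply[n] = '\0';
        printf("stub replied: %s\n", reply);
    }
    close(fd);
    return 0;
}
---
A real client would decode the hex reply and handle '+'/'-' acks and
packet reassembly; the sketch only shows how much framing sits between
LibVMI and every single read.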
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [Qemu-devel] QEMU patch to allow VM introspection via libvmi
2015-10-26 9:09 ` Markus Armbruster
@ 2015-10-26 17:37 ` Valerio Aimale
2015-10-26 17:52 ` Eduardo Habkost
0 siblings, 1 reply; 43+ messages in thread
From: Valerio Aimale @ 2015-10-26 17:37 UTC (permalink / raw)
To: Markus Armbruster; +Cc: qemu-devel, Eduardo Habkost, lcapitulino
On 10/26/15 3:09 AM, Markus Armbruster wrote:
> [...]
>
>> Eduardo, I think it would be a common rule of politeness not to pass
>> any judgement on a person that you don't know, but for some texts in a
>> mailing list. I think I understand how mmap() works, and very well.
>>
>> Participating is this discussion has been a struggle for me. For the
>> good of the libvmi users, I have been trying to ignore the judgements,
>> the comments and so on. But, alas, I throw my hands up in the air, and
>> I surrender.
> I'm sorry we exceeded your tolerance for frustration. This mailing list
> can be tough. We try to be welcoming (believe it or not), but we too
> often fail (okay, that part is easily believable).
>
> To be honest, I had difficulties understanding your explanation, and
> ended up guessing. I figure Eduardo did the same, and guessed
> incorrectly. There but for the grace of God go I.
Well, I did scribble my C sample excerpt too fast. Participating in
mailing lists is not part of my job description - I was short on time, I
admit to that. However, there is a big difference between saying "I do not
understand your explanation, please try again" and saying "you're
confused about mmap()".
I was trying to advocate the use of a shared mmap'ed region. The sharing
would be two-way (RW for both) between the QEMU virtualizer and the
libvmi process. I envision that there could be a QEMU command line
argument, such as "--mmap-guest-memory <filename>". I understand that Eric
feels strongly that the libvmi client should own the file name - I have not
forgotten that. When that command line argument is given, as part of the
guest initialization, QEMU creates a file of size equal to the size of
the guest memory containing all zeros, mmaps that file to the guest
memory with PROT_READ|PROT_WRITE and MAP_FILE|MAP_SHARED, then starts
the guest. And, if at all possible, it makes the filename queryable via qmp
and/or hmp, so that the filename of the mmap would not need to be
maintained in two different places, leading to maintenance nightmares.
Shared mmap'ed regions can be used for inter-process communication;
here's a quick and dirty example:
p1.c
---
#include <sys/mman.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/file.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <errno.h>
#include <signal.h>

void handle_signal(int signal);

/* sorry, for ease of development I need these to be global */
int fh;
char *p;

void handle_signal(int signal) {
    const char *signal_name;
    sigset_t pending;

    switch (signal) {
    case SIGHUP:
        signal_name = "SIGHUP";
        fprintf(stdout, "Process 1 -- Map now contains: %s\n", p);
        munmap(p, sysconf(_SC_PAGE_SIZE));
        close(fh);
        exit(0);
        break;
    default:
        fprintf(stderr, "Caught wrong signal: %d\n", signal);
        return;
    }
}

void main(int argc, char **argv) {
    struct sigaction sa;

    sa.sa_handler = &handle_signal;
    sa.sa_flags = SA_RESTART;
    sigfillset(&sa.sa_mask);
    if (sigaction(SIGHUP, &sa, NULL) == -1) {
        perror("Error: cannot handle SIGHUP");
        exit(1);
    }
    /* open() returns -1 on failure, so test for >= 0 */
    if ((fh = open("shared.map", O_RDWR | O_CREAT, S_IRWXU)) >= 0) {
        p = mmap(NULL, sysconf(_SC_PAGE_SIZE),
                 PROT_READ|PROT_WRITE, MAP_FILE|MAP_SHARED, fh, (off_t) 0);
        if (p == MAP_FAILED) {
            printf("poop, didn't map: %s\n", strerror(errno));
            close(fh);
            exit(1);
        }
        p[0] = 0xcc;
        fprintf(stdout, "Process 1 -- Writing to map: All your bases are belong to us.\n");
        sprintf((char *) p, "All your bases are belong to us.");
        while (1);
    }
}
---
p2.c
---
#include <sys/mman.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/file.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <errno.h>
#include <signal.h>

void main(int argc, char **argv) {
    int fh;
    void *p;
    int pid;

    pid = atoi(argv[1]);   /* PID of process 1, passed on the command line */
    sleep(1);              /* give process 1 time to create and map the file */
    /* open() returns -1 on failure, so test for >= 0 */
    if ((fh = open("shared.map", O_RDWR, S_IRWXU)) >= 0) {
        p = mmap(NULL, sysconf(_SC_PAGE_SIZE),
                 PROT_READ|PROT_WRITE, MAP_SHARED|MAP_FILE,
                 fh, (off_t) 0);
        printf("Process 2 -- Map now contains: %s\n", (char *) p);
        printf("Process 2 -- Writing to map: All your bases *NOW* belong to us.\n");
        fflush(stdout);
        sprintf((char *) p, "All your bases *NOW* belong to us.");
        kill(pid, SIGHUP); /* tell process 1 to re-read the map */
        sleep(3);
        munmap(p, sysconf(_SC_PAGE_SIZE));
        close(fh);
    }
}
---
if I run both, in bash, as:
rm -f shared.map ; \
gcc -o p1 p1.c ; \
gcc -o p2 p2.c ; \
for i in $(seq 1 $(getconf PAGESIZE)); do echo -e -n "\0" >> shared.map ; done ; \
./p1 & ./p2 $!
I get the following output:
$ rm -f shared.map ; \
> gcc -o p1 p1.c ; \
> gcc -o p2 p2.c ; \
> for i in $(seq 1 $(getconf PAGESIZE)); do echo -e -n "\0" >> shared.map ; done ; \
> ./p1 & ./p2 $!
[1] 8223
Process 1 -- Writing to map: All your bases are belong to us.
Process 2 -- Map now contains: All your bases are belong to us.
Process 2 -- Writing to map: All your bases *NOW* belong to us.
Process 1 -- Map now contains: All your bases *NOW* belong to us.
[1]+ Done ./p1
To me, the example above shows the use of an mmap'ed region shared
between two processes.
C code above was written too fast again! Some headers are redundant, I
do not check for all error conditions, I don't do all the cleanup - it's
kind of quick and dirty. I hope you can forgive me.
(it works on mac and linux unmodified, though quick and dirty)
I am fascinated by Daniel's suggestion of accessing the guest memory via
the QEMU process's existing map, found through /proc/$PID/maps. It is my
understanding that Daniel's suggestion would require ptrace()'ing the
QEMU process, and potentially stopping the QEMU process to access the
guest mmap'ed memory.
>
>
> This thread is in reply to Valerio's attempt to upstream this patch.
> Good move.
>
> The usual questions for feature requests apply:
>
> 1. Is this a use case we want to serve?
>
> Unreserved yes. Supporting virtual machine introspection with LibVMI
> makes sense.
>
> [...]
Markus, that's a great description of needs and potential approaches. It
requires a bit of thinking. I reserve the right to provide comments
later on.
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [Qemu-devel] QEMU patch to allow VM introspection via libvmi
2015-10-26 17:37 ` Valerio Aimale
@ 2015-10-26 17:52 ` Eduardo Habkost
2015-10-27 14:17 ` Valerio Aimale
0 siblings, 1 reply; 43+ messages in thread
From: Eduardo Habkost @ 2015-10-26 17:52 UTC (permalink / raw)
To: Valerio Aimale; +Cc: qemu-devel, Markus Armbruster, lcapitulino
On Mon, Oct 26, 2015 at 11:37:04AM -0600, Valerio Aimale wrote:
> On 10/26/15 3:09 AM, Markus Armbruster wrote:
> >[...]
> >
> >>Eduardo, I think it would be a common rule of politeness not to pass
> >>any judgement on a person that you don't know, but for some texts in a
> >>mailing list. I think I understand how mmap() works, and very well.
> >>
> >>Participating is this discussion has been a struggle for me. For the
> >>good of the libvmi users, I have been trying to ignore the judgements,
> >>the comments and so on. But, alas, I throw my hands up in the air, and
> >>I surrender.
> >I'm sorry we exceeded your tolerance for frustration. This mailing list
> >can be tough. We try to be welcoming (believe it or not), but we too
> >often fail (okay, that part is easily believable).
> >
> >To be honest, I had difficulties understanding your explanation, and
> >ended up guessing. I figure Eduardo did the same, and guessed
> >incorrectly. There but for the grace of God go I.
> Well, I did scribble my C sample excerpt too fast. Participating in mailing
> list is not part of my job description - I was short on time, I admit to
> that. However, there is a big difference in saying "I do no understand your
> explanation, please try again" and saying "you're confused about mmap()"
>
> I was trying to advocate the use of a shared mmap'ed region. The sharing
> would be two-ways (RW for both) between the QEMU virtualizer and the libvmi
> process. I envision that there could be a QEMU command line argument, such
> as "--mmap-guest-memory <filename>" Understand that Eric feels strongly the
> libvmi client should own the file name - I have not forgotten that. When
> that command line argument is given, as part of the guest initialization,
> QEMU creates a file of size equal to the size of the guest memory containing
> all zeros, mmaps that file to the guest memory with PROT_READ|PROT_WRITE
> and MAP_FILE|MAP_SHARED, then starts the guest.
This is basically what memory-backend-file (and the legacy -mem-path
option) already does today, but it unlinks the file just after opening
it. We can change it to accept a full filename and/or an option to make
it not unlink the file after opening it.
I don't remember if memory-backend-file is usable without -numa, but we
could make it possible somehow.
> And, if at all possible,
> makes the filename querable via qmp and/or hmp, so that the filename of the
> mmap would not need to be maintained in two different places, leading to
> maintenance nightmares.
You can probably already query the memory-backend-file QOM properties
using QMP. Then the filename could be exposed as a QOM property.
--
Eduardo
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [Qemu-devel] QEMU patch to allow VM introspection via libvmi
2015-10-26 17:52 ` Eduardo Habkost
@ 2015-10-27 14:17 ` Valerio Aimale
2015-10-27 15:00 ` Markus Armbruster
0 siblings, 1 reply; 43+ messages in thread
From: Valerio Aimale @ 2015-10-27 14:17 UTC (permalink / raw)
To: Eduardo Habkost; +Cc: qemu-devel, Markus Armbruster, lcapitulino
On 10/26/15 11:52 AM, Eduardo Habkost wrote:
>
>
> I was trying to advocate the use of a shared mmap'ed region. The sharing
> would be two-ways (RW for both) between the QEMU virtualizer and the libvmi
> process. I envision that there could be a QEMU command line argument, such
> as "--mmap-guest-memory <filename>" Understand that Eric feels strongly the
> libvmi client should own the file name - I have not forgotten that. When
> that command line argument is given, as part of the guest initialization,
> QEMU creates a file of size equal to the size of the guest memory containing
> all zeros, mmaps that file to the guest memory with PROT_READ|PROT_WRITE
> and MAP_FILE|MAP_SHARED, then starts the guest.
> This is basically what memory-backend-file (and the legacy -mem-path
> option) already does today, but it unlinks the file just after opening
> it. We can change it to accept a full filename and/or an option to make
> it not unlink the file after opening it.
>
> I don't remember if memory-backend-file is usable without -numa, but we
> could make it possible somehow.
Eduardo, I did try this approach. It takes a 2-line change in exec.c:
commenting the unlink out, and making sure MAP_SHARED is used when
-mem-path and -mem-prealloc are given. It works beautifully, and libvmi
accesses are fast. However, the VM is slowed down to a crawl, obviously,
because each RAM access by the VM triggers a page fault on the mmapped
file. I don't think having a crawling VM is desirable, so this approach
goes out the door.
I think we're back at estimating the speed of other approaches as
discussed previously:
- via UNIX socket as per existing patch
- via xp, parsing the human-readable xp output
- via an xp-like command that returns memory content baseXX-encoded into
a json string
- via shared memory as per existing code and patch
Any other?
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [Qemu-devel] QEMU patch to allow VM introspection via libvmi
2015-10-27 14:17 ` Valerio Aimale
@ 2015-10-27 15:00 ` Markus Armbruster
2015-10-27 15:18 ` Valerio Aimale
0 siblings, 1 reply; 43+ messages in thread
From: Markus Armbruster @ 2015-10-27 15:00 UTC (permalink / raw)
To: Valerio Aimale; +Cc: lcapitulino, Eduardo Habkost, qemu-devel
Valerio Aimale <valerio@aimale.com> writes:
> On 10/26/15 11:52 AM, Eduardo Habkost wrote:
>>
>>
>> I was trying to advocate the use of a shared mmap'ed region. The sharing
>> would be two-ways (RW for both) between the QEMU virtualizer and the libvmi
>> process. I envision that there could be a QEMU command line argument, such
>> as "--mmap-guest-memory <filename>" Understand that Eric feels strongly the
>> libvmi client should own the file name - I have not forgotten that. When
>> that command line argument is given, as part of the guest initialization,
>> QEMU creates a file of size equal to the size of the guest memory containing
>> all zeros, mmaps that file to the guest memory with PROT_READ|PROT_WRITE
>> and MAP_FILE|MAP_SHARED, then starts the guest.
>> This is basically what memory-backend-file (and the legacy -mem-path
>> option) already does today, but it unlinks the file just after opening
>> it. We can change it to accept a full filename and/or an option to make
>> it not unlink the file after opening it.
>>
>> I don't remember if memory-backend-file is usable without -numa, but we
>> could make it possible somehow.
> Eduardo, I did try this approach. It takes 2 line changes in exec.c:
> comment the unlink out, and making sure MAP_SHARED is used when
> -mem-path and -mem-prealloc are given. It works beautifully, and
> libvmi accesses are fast. However, the VM is slowed down to a crawl,
> obviously, because each RAM access by the VM triggers a page fault on
> the mmapped file. I don't think having a crawling VM is desirable, so
> this approach goes out the door.
Uh, I don't understand why "each RAM access by the VM triggers a page
fault". Can you show us the patch you used?
> I think we're back at estimating the speed of other approaches as
> discussed previously:
>
> - via UNIX socket as per existing patch
> - via xp parsing the human readable xp output
> - via an xp-like command the returns memory content baseXX-encoded
> into a json string
> - via shared memory as per existing code and patch
>
> Any other?
Yes, the existing alternative LibVMI method via gdbserver should be
included in the comparison.
Naturally, any approach that does the actual work via QMP will be dog
slow as long as LibVMI launches virsh for each QMP command. Fixable, as
Eric pointed out: use the libvirt API and link against libvirt.so
instead. No idea how much work that'll be.
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [Qemu-devel] QEMU patch to allow VM introspection via libvmi
2015-10-27 15:00 ` Markus Armbruster
@ 2015-10-27 15:18 ` Valerio Aimale
2015-10-27 15:31 ` Valerio Aimale
2015-10-27 16:11 ` Markus Armbruster
0 siblings, 2 replies; 43+ messages in thread
From: Valerio Aimale @ 2015-10-27 15:18 UTC (permalink / raw)
To: Markus Armbruster; +Cc: lcapitulino, Eduardo Habkost, qemu-devel
On 10/27/15 9:00 AM, Markus Armbruster wrote:
> Valerio Aimale <valerio@aimale.com> writes:
>
>> On 10/26/15 11:52 AM, Eduardo Habkost wrote:
>>>
>>> I was trying to advocate the use of a shared mmap'ed region. The sharing
>>> would be two-ways (RW for both) between the QEMU virtualizer and the libvmi
>>> process. I envision that there could be a QEMU command line argument, such
>>> as "--mmap-guest-memory <filename>" Understand that Eric feels strongly the
>>> libvmi client should own the file name - I have not forgotten that. When
>>> that command line argument is given, as part of the guest initialization,
>>> QEMU creates a file of size equal to the size of the guest memory containing
>>> all zeros, mmaps that file to the guest memory with PROT_READ|PROT_WRITE
>>> and MAP_FILE|MAP_SHARED, then starts the guest.
>>> This is basically what memory-backend-file (and the legacy -mem-path
>>> option) already does today, but it unlinks the file just after opening
>>> it. We can change it to accept a full filename and/or an option to make
>>> it not unlink the file after opening it.
>>>
>>> I don't remember if memory-backend-file is usable without -numa, but we
>>> could make it possible somehow.
>> Eduardo, I did try this approach. It takes 2 line changes in exec.c:
>> comment the unlink out, and making sure MAP_SHARED is used when
>> -mem-path and -mem-prealloc are given. It works beautifully, and
>> libvmi accesses are fast. However, the VM is slowed down to a crawl,
>> obviously, because each RAM access by the VM triggers a page fault on
>> the mmapped file. I don't think having a crawling VM is desirable, so
>> this approach goes out the door.
> Uh, I don't understand why "each RAM access by the VM triggers a page
> fault". Can you show us the patch you used?
Sorry, too brief of an explanation. Every time the guest flips a byte in
physical RAM, I think that triggers a page write to the mmaped file. My
understanding is that, with MAP_SHARED, each write to RAM triggers a
file write, hence the slowness. These are the simple changes I made, to
test it - as a proof of concept.
in exec.c of the qemu-2.4.0.1 change
---
    fd = mkstemp(filename);
    if (fd < 0) {
        error_setg_errno(errp, errno,
                         "unable to create backing store for hugepages");
        g_free(filename);
        goto error;
    }
    unlink(filename);
    g_free(filename);
    memory = (memory+hpagesize-1) & ~(hpagesize-1);
    /*
     * ftruncate is not supported by hugetlbfs in older
     * hosts, so don't bother bailing out on errors.
     * If anything goes wrong with it under other filesystems,
     * mmap will fail.
     */
    if (ftruncate(fd, memory)) {
        perror("ftruncate");
    }
    area = mmap(0, memory, PROT_READ | PROT_WRITE,
                (block->flags & RAM_SHARED ? MAP_SHARED : MAP_PRIVATE),
                fd, 0);
---
to
---
    fd = mkstemp(filename);
    if (fd < 0) {
        error_setg_errno(errp, errno,
                         "unable to create backing store for hugepages");
        g_free(filename);
        goto error;
    }
    /* unlink(filename); */  /* Valerio's change to persist guest RAM mmaped file */
    g_free(filename);
    memory = (memory+hpagesize-1) & ~(hpagesize-1);
    /*
     * ftruncate is not supported by hugetlbfs in older
     * hosts, so don't bother bailing out on errors.
     * If anything goes wrong with it under other filesystems,
     * mmap will fail.
     */
    if (ftruncate(fd, memory)) {
        perror("ftruncate");
    }
    area = mmap(0, memory, PROT_READ | PROT_WRITE,
                MAP_FILE | MAP_SHARED, /* Valerio's change to persist guest RAM mmaped file */
                fd, 0);
---
then, recompile qemu.
Launch a VM as
/usr/local/bin/qemu-system-x86_64 -name Windows10 -S -machine
pc-i440fx-2.4,accel=kvm,usb=off [...] -mem-prealloc -mem-path /tmp/maps
# I know -mem-path is deprecated, but I used it to speed up the proof
of concept.
With the above command, I have the following file

$ ls -l /tmp/maps/
-rw------- 1 libvirt-qemu kvm 2147483648 Oct 27 08:31 qemu_back_mem.pc.ram.fP4sKH

which is an mmap of the Win VM physical RAM

$ hexdump -C /tmp/maps/qemu_back_mem.val.pc.ram.fP4sKH
00000000  53 ff 00 f0 53 ff 00 f0  c3 e2 00 f0 53 ff 00 f0  |S...S.......S...|
[...]
00000760  24 02 c3 49 6e 76 61 6c  69 64 20 70 61 72 74 69  |$..Invalid parti|
00000770  74 69 6f 6e 20 74 61 62  6c 65 00 45 72 72 6f 72  |tion table.Error|
00000780  20 6c 6f 61 64 69 6e 67  20 6f 70 65 72 61 74 69  | loading operati|
00000790  6e 67 20 73 79 73 74 65  6d 00 4d 69 73 73 69 6e  |ng system.Missin|
000007a0  67 20 6f 70 65 72 61 74  69 6e 67 20 73 79 73 74  |g operating syst|
000007b0  65 6d 00 00 00 63 7b 9a  73 d8 99 ce 00 00 80 20  |em...c{.s...... |
[...]
I did not try mmap'ing to a file on a RAM disk. Without physical disk
I/O, the VM might run faster.
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [Qemu-devel] QEMU patch to allow VM introspection via libvmi
2015-10-27 15:18 ` Valerio Aimale
@ 2015-10-27 15:31 ` Valerio Aimale
2015-10-27 16:11 ` Markus Armbruster
1 sibling, 0 replies; 43+ messages in thread
From: Valerio Aimale @ 2015-10-27 15:31 UTC (permalink / raw)
To: Markus Armbruster; +Cc: lcapitulino, Eduardo Habkost, qemu-devel
On 10/27/15 9:18 AM, Valerio Aimale wrote:
>
>
> I did not try to mmap'ing to a file on a RAMdisk. Without physical
> disk I/O, the VM might run faster.
>
>
I did try with the file on a ramdisk
$ sudo mount -o size=3G -t tmpfs none /ramdisk
$ /usr/local/bin/qemu-system-x86_64 -name Windows10 -S -machine
pc-i440fx-2.4,accel=kvm,usb=off [...] -mem-prealloc -mem-path /ramdisk
With that, the speed of the VM is acceptable. It will not take your
breath away, but it is reasonable.
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [Qemu-devel] QEMU patch to allow VM introspection via libvmi
2015-10-27 15:18 ` Valerio Aimale
2015-10-27 15:31 ` Valerio Aimale
@ 2015-10-27 16:11 ` Markus Armbruster
2015-10-27 16:27 ` Valerio Aimale
1 sibling, 1 reply; 43+ messages in thread
From: Markus Armbruster @ 2015-10-27 16:11 UTC (permalink / raw)
To: Valerio Aimale; +Cc: qemu-devel, Eduardo Habkost, lcapitulino
Valerio Aimale <valerio@aimale.com> writes:
> On 10/27/15 9:00 AM, Markus Armbruster wrote:
>> Valerio Aimale <valerio@aimale.com> writes:
>>
>>> On 10/26/15 11:52 AM, Eduardo Habkost wrote:
>>>>
>>>> I was trying to advocate the use of a shared mmap'ed region. The sharing
>>>> would be two-ways (RW for both) between the QEMU virtualizer and the libvmi
>>>> process. I envision that there could be a QEMU command line argument, such
>>>> as "--mmap-guest-memory <filename>" Understand that Eric feels strongly the
>>>> libvmi client should own the file name - I have not forgotten that. When
>>>> that command line argument is given, as part of the guest initialization,
>>>> QEMU creates a file of size equal to the size of the guest memory containing
>>>> all zeros, mmaps that file to the guest memory with PROT_READ|PROT_WRITE
>>>> and MAP_FILE|MAP_SHARED, then starts the guest.
>>>> This is basically what memory-backend-file (and the legacy -mem-path
>>>> option) already does today, but it unlinks the file just after opening
>>>> it. We can change it to accept a full filename and/or an option to make
>>>> it not unlink the file after opening it.
>>>>
>>>> I don't remember if memory-backend-file is usable without -numa, but we
>>>> could make it possible somehow.
>>> Eduardo, I did try this approach. It takes 2 line changes in exec.c:
>>> comment the unlink out, and making sure MAP_SHARED is used when
>>> -mem-path and -mem-prealloc are given. It works beautifully, and
>>> libvmi accesses are fast. However, the VM is slowed down to a crawl,
>>> obviously, because each RAM access by the VM triggers a page fault on
>>> the mmapped file. I don't think having a crawling VM is desirable, so
>>> this approach goes out the door.
>> Uh, I don't understand why "each RAM access by the VM triggers a page
>> fault". Can you show us the patch you used?
> Sorry, too brief of an explanation. Every time the guest flips a byte in
> physical RAM, I think that triggers a page write to the mmaped file. My
> understanding is that, with MAP_SHARED, each write to RAM triggers a
> file write, hence the slowness. These are the simple changes I made, to
> test it - as a proof of concept.
Ah, that actually makes sense. Thanks!
[...]
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [Qemu-devel] QEMU patch to allow VM introspection via libvmi
2015-10-27 16:11 ` Markus Armbruster
@ 2015-10-27 16:27 ` Valerio Aimale
0 siblings, 0 replies; 43+ messages in thread
From: Valerio Aimale @ 2015-10-27 16:27 UTC (permalink / raw)
To: Markus Armbruster; +Cc: qemu-devel, Eduardo Habkost, lcapitulino
On 10/27/15 10:11 AM, Markus Armbruster wrote:
> [...]
>>>> Eduardo, I did try this approach. It takes 2 line changes in exec.c:
>>>> comment the unlink out, and making sure MAP_SHARED is used when
>>>> -mem-path and -mem-prealloc are given. It works beautifully, and
>>>> libvmi accesses are fast. However, the VM is slowed down to a crawl,
>>>> obviously, because each RAM access by the VM triggers a page fault on
>>>> the mmapped file. I don't think having a crawling VM is desirable, so
>>>> this approach goes out the door.
>>> Uh, I don't understand why "each RAM access by the VM triggers a page
>>> fault". Can you show us the patch you used?
>> Sorry, too brief of an explanation. Every time the guest flips a byte in
>> physical RAM, I think that triggers a page write to the mmaped file. My
>> understanding is that, with MAP_SHARED, each write to RAM triggers a
>> file write, hence the slowness. These are the simple changes I made, to
>> test it - as a proof of concept.
> Ah, that actually makes sense. Thanks!
>
> [...]
However, when the guest RAM mmap'ed file resides on a RAMdisk on the
host, the guest OS responsiveness is more than acceptable. Perhaps this
is a viable approach. It might require a minimum of changes to the QEMU
source and maybe one extra command line argument.
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [Qemu-devel] QEMU patch to allow VM introspection via libvmi
2015-10-22 19:12 ` Eduardo Habkost
2015-10-22 19:57 ` Valerio Aimale
@ 2015-10-23 6:35 ` Markus Armbruster
2015-10-23 8:18 ` Daniel P. Berrange
` (2 more replies)
1 sibling, 3 replies; 43+ messages in thread
From: Markus Armbruster @ 2015-10-23 6:35 UTC (permalink / raw)
To: Eduardo Habkost; +Cc: qemu-devel, Valerio Aimale, lcapitulino
Eduardo Habkost <ehabkost@redhat.com> writes:
> On Wed, Oct 21, 2015 at 12:54:23PM +0200, Markus Armbruster wrote:
>> Valerio Aimale <valerio@aimale.com> writes:
> [...]
>> > There's also a similar patch, floating around the internet, the uses
>> > shared memory, instead of sockets, as inter-process communication
>> > between libvmi and QEMU. I've never used that.
>>
>> By the time you built a working IPC mechanism on top of shared memory,
>> you're often no better off than with AF_LOCAL sockets.
>>
>> Crazy idea: can we allocate guest memory in a way that support sharing
>> it with another process? Eduardo, can -mem-path do such wild things?
>
> It can't today, but just because it creates a temporary file inside
> mem-path and unlinks it immediately after opening a file descriptor. We
> could make memory-backend-file also accept a full filename as argument,
> or add a mechanism to let QEMU send the open file descriptor to a QMP
> client.
Valerio, would a command line option to share guest memory suffice, or
does it have to be a monitor command? If the latter, why?
Eduardo, I'm not sure writing to guest memory behind TCG's back will
work. Do you know?
Will it work with KVM?
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [Qemu-devel] QEMU patch to allow VM introspection via libvmi
2015-10-23 6:35 ` Markus Armbruster
@ 2015-10-23 8:18 ` Daniel P. Berrange
2015-10-23 14:48 ` Valerio Aimale
2015-10-23 14:44 ` Valerio Aimale
2015-10-23 19:24 ` Eduardo Habkost
2 siblings, 1 reply; 43+ messages in thread
From: Daniel P. Berrange @ 2015-10-23 8:18 UTC (permalink / raw)
To: Markus Armbruster
Cc: lcapitulino, Eduardo Habkost, Valerio Aimale, qemu-devel
On Fri, Oct 23, 2015 at 08:35:15AM +0200, Markus Armbruster wrote:
> Eduardo Habkost <ehabkost@redhat.com> writes:
>
> > On Wed, Oct 21, 2015 at 12:54:23PM +0200, Markus Armbruster wrote:
> >> Valerio Aimale <valerio@aimale.com> writes:
> > [...]
> >> > There's also a similar patch, floating around the internet, the uses
> >> > shared memory, instead of sockets, as inter-process communication
> >> > between libvmi and QEMU. I've never used that.
> >>
> >> By the time you built a working IPC mechanism on top of shared memory,
> >> you're often no better off than with AF_LOCAL sockets.
> >>
> >> Crazy idea: can we allocate guest memory in a way that support sharing
> >> it with another process? Eduardo, can -mem-path do such wild things?
> >
> > It can't today, but just because it creates a temporary file inside
> > mem-path and unlinks it immediately after opening a file descriptor. We
> > could make memory-backend-file also accept a full filename as argument,
> > or add a mechanism to let QEMU send the open file descriptor to a QMP
> > client.
>
> Valerio, would an command line option to share guest memory suffice, or
> does it have to be a monitor command? If the latter, why?
IIUC, libvmi wants to be able to connect to arbitrary pre-existing
running KVM instances on the host. As such I think it cannot assume
anything about the way they have been started, so requiring they
be booted with a special command line arg looks impractical for this
scenario.
Regards,
Daniel
--
|: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org -o- http://virt-manager.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [Qemu-devel] QEMU patch to allow VM introspection via libvmi
2015-10-23 8:18 ` Daniel P. Berrange
@ 2015-10-23 14:48 ` Valerio Aimale
0 siblings, 0 replies; 43+ messages in thread
From: Valerio Aimale @ 2015-10-23 14:48 UTC (permalink / raw)
To: Daniel P. Berrange, Markus Armbruster
Cc: lcapitulino, Eduardo Habkost, qemu-devel
On 10/23/15 2:18 AM, Daniel P. Berrange wrote:
> [...]
> It can't today, but just because it creates a temporary file inside
> mem-path and unlinks it immediately after opening a file descriptor. We
> could make memory-backend-file also accept a full filename as argument,
> or add a mechanism to let QEMU send the open file descriptor to a QMP
> client.
>> Valerio, would an command line option to share guest memory suffice, or
>> does it have to be a monitor command? If the latter, why?
> IIUC, libvmi wants to be able to connect to arbitrary pre-existing
> running KVM instances on the host. As such I think it cannot assume
> anything about the way they have been started, so requiring they
> be booted with a special command line arg looks impractical for this
> scenario.
>
> Regards,
> Daniel
Daniel, you are correct. libvmi knows of QEMU/KVM VMs only via
libvirt/virsh. The scenario would work if libvmi was able to query, via
a qmp command, the path of the shared memory map.
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [Qemu-devel] QEMU patch to allow VM introspection via libvmi
2015-10-23 6:35 ` Markus Armbruster
2015-10-23 8:18 ` Daniel P. Berrange
@ 2015-10-23 14:44 ` Valerio Aimale
2015-10-23 14:56 ` Eric Blake
2015-10-23 19:24 ` Eduardo Habkost
2 siblings, 1 reply; 43+ messages in thread
From: Valerio Aimale @ 2015-10-23 14:44 UTC (permalink / raw)
To: Markus Armbruster, Eduardo Habkost; +Cc: qemu-devel, lcapitulino
On 10/23/15 12:35 AM, Markus Armbruster wrote:
> Eduardo Habkost <ehabkost@redhat.com> writes:
>
>> On Wed, Oct 21, 2015 at 12:54:23PM +0200, Markus Armbruster wrote:
>>> Valerio Aimale <valerio@aimale.com> writes:
>> [...]
>>>> There's also a similar patch, floating around the internet, the uses
>>>> shared memory, instead of sockets, as inter-process communication
>>>> between libvmi and QEMU. I've never used that.
>>> By the time you built a working IPC mechanism on top of shared memory,
>>> you're often no better off than with AF_LOCAL sockets.
>>>
>>> Crazy idea: can we allocate guest memory in a way that support sharing
>>> it with another process? Eduardo, can -mem-path do such wild things?
>> It can't today, but just because it creates a temporary file inside
>> mem-path and unlinks it immediately after opening a file descriptor. We
>> could make memory-backend-file also accept a full filename as argument,
>> or add a mechanism to let QEMU send the open file descriptor to a QMP
>> client.
> Valerio, would an command line option to share guest memory suffice, or
> does it have to be a monitor command? If the latter, why?
As Daniel points out, later in the thread, libvmi knows about QEMU VMs
only via libvirt/virsh.
Thus, a command line option to share guest memory would work only if
there was an info qmp command, say info-guestmaps, that would return
something like
{
'enabled' : 'true',
'filename' : '/path/to/the/guest/memory/map'
}
so that libvmi can find the map.
Libvmi's dependence on virsh is so strict that libvmi does not even know
if the QEMU VM has an open qmp unix socket or inet socket to send
commands through. Thus, libvmi sends qmp commands (to query registers,
as an example) via
virsh qemu-monitor-command Windows10B '{"execute":
"human-monitor-command", "arguments": {"command-line": "info registers"}}'
It would be nice if there was a qmp command to query if there are
qmp/hmp sockets open, say info-sockets, that would return something like
{
'unix' : {
'enabled': 'true',
'address': '/path/to/socket/'
},
'inet' : {
'enabled': 'true',
'address': '1.2.3.4:5678'
}
}
so that libvmi can find the rendezvous unix/inet address and send
commands through it. As of now, each qmp command requires a popen()
that forks virsh, which compounds the slowness.
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [Qemu-devel] QEMU patch to allow VM introspection via libvmi
2015-10-23 14:44 ` Valerio Aimale
@ 2015-10-23 14:56 ` Eric Blake
2015-10-23 15:03 ` Valerio Aimale
0 siblings, 1 reply; 43+ messages in thread
From: Eric Blake @ 2015-10-23 14:56 UTC (permalink / raw)
To: Valerio Aimale, Markus Armbruster, Eduardo Habkost
Cc: qemu-devel, lcapitulino
On 10/23/2015 08:44 AM, Valerio Aimale wrote:
>
> Libvmi dependence on virsh is so strict, that libvmi does not even know
> if the QEMU VM has an open qmp unix socket or inet socket, to send
> commands through. Thus, libvmi sends qmp commands (to query registers,
> as an example) via
>
> virsh qemu-monitor-command Windows10B '{"execute":
> "human-monitor-command", "arguments": {"command-line": "info registers"}}'
This is an unsupported back door of libvirt; you should really also
consider enhancing libvirt to add a formal API to expose this
information so that you don't have to resort to the monitor back door.
But that's a topic for the libvirt list.
>
> so that libvmi can find the rendezvous unix/inet address and sends
> commands through that. As of now, each qmp commands requires a popen()
> that forks virsh, which compounds to the slowness.
No, don't blame virsh for your slowness. Write your own C program that
links against libvirt.so, and which holds and reuses a persistent
connection, using the same libvirt APIs as would be used by virsh. All
the overhead of spawning a shell to spawn virsh to open a fresh
connection for each command will go away. Any solution that uses
popen() to virsh is _screaming_ to be rewritten to use libvirt.so natively.
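For illustration, a minimal sketch of that approach (my code, not
LibVMI's; the domain name "Windows10B" is just the one used in the
examples above, and it reuses the same unsupported qemu-monitor-command
passthrough that LibVMI relies on today; build with -lvirt -lvirt-qemu):
---
#include <stdio.h>
#include <stdlib.h>
#include <libvirt/libvirt.h>
#include <libvirt/libvirt-qemu.h>

int main(void)
{
    virConnectPtr conn;
    virDomainPtr dom;
    char *reply = NULL;

    /* Open one connection and keep it; no virsh fork per command. */
    conn = virConnectOpen("qemu:///system");
    if (!conn) {
        fprintf(stderr, "failed to connect to libvirtd\n");
        return 1;
    }

    dom = virDomainLookupByName(conn, "Windows10B");
    if (dom &&
        virDomainQemuMonitorCommand(dom, "info registers", &reply,
                                    VIR_DOMAIN_QEMU_MONITOR_COMMAND_HMP) == 0) {
        printf("%s\n", reply);
        free(reply);
    }

    if (dom)
        virDomainFree(dom);
    virConnectClose(conn);
    return 0;
}
---
Holding the connection open for the whole introspection session removes
the per-command fork/exec and handshake overhead that spawning virsh
incurs.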
--
Eric Blake eblake redhat com +1-919-301-3266
Libvirt virtualization library http://libvirt.org
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [Qemu-devel] QEMU patch to allow VM introspection via libvmi
2015-10-23 14:56 ` Eric Blake
@ 2015-10-23 15:03 ` Valerio Aimale
0 siblings, 0 replies; 43+ messages in thread
From: Valerio Aimale @ 2015-10-23 15:03 UTC (permalink / raw)
To: Eric Blake, Markus Armbruster, Eduardo Habkost; +Cc: qemu-devel, lcapitulino
On 10/23/15 8:56 AM, Eric Blake wrote:
> On 10/23/2015 08:44 AM, Valerio Aimale wrote:
>
>> Libvmi dependence on virsh is so strict, that libvmi does not even know
>> if the QEMU VM has an open qmp unix socket or inet socket, to send
>> commands through. Thus, libvmi sends qmp commands (to query registers,
>> as an example) via
>>
>> virsh qemu-monitor-command Windows10B '{"execute":
>> "human-monitor-command", "arguments": {"command-line": "info registers"}}'
> This is an unsupported back door of libvirt; you should really also
> consider enhancing libvirt to add a formal API to expose this
> information so that you don't have to resort to the monitor back door.
> But that's a topic for the libvirt list.
>
>> so that libvmi can find the rendezvous unix/inet address and send
>> commands through it. As of now, each qmp command requires a popen()
>> that forks virsh, which compounds the slowness.
> No, don't blame virsh for your slowness. Write your own C program that
> links against libvirt.so, and which holds and reuses a persistent
> connection, using the same libvirt APIs as would be used by virsh. All
> the overhead of spawning a shell to spawn virsh to open a fresh
> connection for each command will go away. Any solution that uses
> popen() to virsh is _screaming_ to be rewritten to use libvirt.so natively.
>
Eric, good points. Libvmi was written quite some time ago; that libvirt
API might not even have existed then. Once we agree on an access method to
the guest memory, I will rewrite those legacy parts. I'm not the author of
libvmi; I'm just trying to make it better when accessing QEMU/KVM VMs.
* Re: [Qemu-devel] QEMU patch to allow VM introspection via libvmi
2015-10-23 6:35 ` Markus Armbruster
2015-10-23 8:18 ` Daniel P. Berrange
2015-10-23 14:44 ` Valerio Aimale
@ 2015-10-23 19:24 ` Eduardo Habkost
2015-10-23 20:02 ` Richard Henderson
2015-11-02 12:55 ` Paolo Bonzini
2 siblings, 2 replies; 43+ messages in thread
From: Eduardo Habkost @ 2015-10-23 19:24 UTC (permalink / raw)
To: Markus Armbruster
Cc: Paolo Bonzini, Richard Henderson, qemu-devel, Valerio Aimale,
lcapitulino
On Fri, Oct 23, 2015 at 08:35:15AM +0200, Markus Armbruster wrote:
> Eduardo Habkost <ehabkost@redhat.com> writes:
>
> > On Wed, Oct 21, 2015 at 12:54:23PM +0200, Markus Armbruster wrote:
> >> Valerio Aimale <valerio@aimale.com> writes:
> > [...]
> >> > There's also a similar patch, floating around the internet, that uses
> >> > shared memory, instead of sockets, as inter-process communication
> >> > between libvmi and QEMU. I've never used that.
> >>
> >> By the time you built a working IPC mechanism on top of shared memory,
> >> you're often no better off than with AF_LOCAL sockets.
> >>
> >> Crazy idea: can we allocate guest memory in a way that supports sharing
> >> it with another process? Eduardo, can -mem-path do such wild things?
> >
> > It can't today, but just because it creates a temporary file inside
> > mem-path and unlinks it immediately after opening a file descriptor. We
> > could make memory-backend-file also accept a full filename as argument,
> > or add a mechanism to let QEMU send the open file descriptor to a QMP
> > client.
>
> Valerio, would a command line option to share guest memory suffice, or
> does it have to be a monitor command? If the latter, why?
>
> Eduardo, I'm not sure writing to guest memory behind TCG's back will
> work. Do you know?
I don't know. I guess it may possibly surprise TCG depending on how some
operations are implemented, but it sounds unlikely. CCing Richard.
>
> Will it work with KVM?
I always assumed it would work, but CCing Paolo to see if there are any
pitfalls.
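To make the file-backed idea a bit more concrete: if guest RAM ended up
backed by a shareable file (say, something along the lines of
-object memory-backend-file,id=mem0,size=4G,mem-path=/dev/shm/guest-ram,share=on,
where letting mem-path name a plain file is exactly the change discussed
above), an external tool could simply mmap it read-only and read
guest-physical memory with no socket round trip at all. Untested sketch;
the path and the offset-to-GPA mapping are assumptions:

#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

int main(void)
{
    const char *path = "/dev/shm/guest-ram";   /* hypothetical backing file */
    struct stat st;
    uint8_t *ram;
    int fd = open(path, O_RDONLY);

    if (fd < 0 || fstat(fd, &st) < 0)
        return 1;
    ram = mmap(NULL, st.st_size, PROT_READ, MAP_SHARED, fd, 0);
    if (ram == MAP_FAILED)
        return 1;
    /* Assuming file offset roughly equals guest-physical address for plain RAM. */
    printf("byte at GPA 0x1000: 0x%02x\n", ram[0x1000]);
    munmap(ram, st.st_size);
    close(fd);
    return 0;
}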
--
Eduardo
* Re: [Qemu-devel] QEMU patch to allow VM introspection via libvmi
2015-10-23 19:24 ` Eduardo Habkost
@ 2015-10-23 20:02 ` Richard Henderson
2015-11-02 12:55 ` Paolo Bonzini
1 sibling, 0 replies; 43+ messages in thread
From: Richard Henderson @ 2015-10-23 20:02 UTC (permalink / raw)
To: Eduardo Habkost, Markus Armbruster
Cc: Paolo Bonzini, qemu-devel, Valerio Aimale, lcapitulino
On 10/23/2015 09:24 AM, Eduardo Habkost wrote:
> On Fri, Oct 23, 2015 at 08:35:15AM +0200, Markus Armbruster wrote:
>> Eduardo Habkost <ehabkost@redhat.com> writes:
>>
>>> On Wed, Oct 21, 2015 at 12:54:23PM +0200, Markus Armbruster wrote:
>>>> Valerio Aimale <valerio@aimale.com> writes:
>>> [...]
>>>>> There's also a similar patch, floating around the internet, that uses
>>>>> shared memory, instead of sockets, as inter-process communication
>>>>> between libvmi and QEMU. I've never used that.
>>>>
>>>> By the time you built a working IPC mechanism on top of shared memory,
>>>> you're often no better off than with AF_LOCAL sockets.
>>>>
>>>> Crazy idea: can we allocate guest memory in a way that supports sharing
>>>> it with another process? Eduardo, can -mem-path do such wild things?
>>>
>>> It can't today, but just because it creates a temporary file inside
>>> mem-path and unlinks it immediately after opening a file descriptor. We
>>> could make memory-backend-file also accept a full filename as argument,
>>> or add a mechanism to let QEMU send the open file descriptor to a QMP
>>> client.
>>
>> Valerio, would a command line option to share guest memory suffice, or
>> does it have to be a monitor command? If the latter, why?
>>
>> Eduardo, I'm not sure writing to guest memory behind TCG's back will
>> work. Do you know?
>
> I don't know. I guess it may possibly surprise TCG depending on how some
> operations are implemented, but it sounds unlikely. CCing Richard.
Writing to guest memory will work, in that the guest will see the changes. The
harder part is synchronization. Here you'll face all the same problems that
are currently being addressed in the multi-threaded tcg patch sets.
r~
* Re: [Qemu-devel] QEMU patch to allow VM introspection via libvmi
2015-10-23 19:24 ` Eduardo Habkost
2015-10-23 20:02 ` Richard Henderson
@ 2015-11-02 12:55 ` Paolo Bonzini
1 sibling, 0 replies; 43+ messages in thread
From: Paolo Bonzini @ 2015-11-02 12:55 UTC (permalink / raw)
To: Eduardo Habkost, Markus Armbruster
Cc: Richard Henderson, qemu-devel, Valerio Aimale, lcapitulino
On 23/10/2015 21:24, Eduardo Habkost wrote:
>> >
>> > Valerio, would a command line option to share guest memory suffice, or
>> > does it have to be a monitor command? If the latter, why?
>> >
>> > Eduardo, I'm not sure writing to guest memory behind TCG's back will
>> > work. Do you know?
> I don't know. I guess it may possibly surprise TCG depending on how some
> operations are implemented, but it sounds unlikely. CCing Richard.
>
>> >
>> > Will it work with KVM?
> I always assumed it would work, but CCing Paolo to see if there are any
> pitfalls.
Yes, both should work.
Paolo