New capabilities for plugins

* New capabilities for plugins
@ 2025-08-04 10:14 Florian Hofhammer
  2025-08-04 16:05 ` Alex Bennée
  2025-08-04 17:01 ` Pierrick Bouvier
  0 siblings, 2 replies; 13+ messages in thread
From: Florian Hofhammer @ 2025-08-04 10:14 UTC (permalink / raw)
  To: qemu-devel

Hello,

I'm currently working a lot with QEMU plugins for dynamic analysis of 
userspace binaries (i.e., running under qemu-user). While working on 
that, I found that the QEMU plugin API luckily has been getting more and 
more capabilities with recent versions but that I'm still lacking some 
functionality for my use cases.
More specifically, the "vcpu_syscall_cb" and "vcpu_syscall_ret" 
callbacks already allow me to instrument syscall translation entry and 
exit points. While the register read/write APIs also allow me to modify 
register contents in my syscall callback implementations, there is 
currently no good way to emulate a syscall myself in the plugin or 
explicitly set the syscall return value (as it will be overwritten with 
the original syscall's return value again, even if I set the 
corresponding guest register).

I was wondering whether the QEMU community would be open to extending 
the plugin API so that a plugin can fully emulate a syscall without the 
original syscall being executed by QEMU. I had multiple approaches for 
that in mind, with some working patches locally that I'd be happy to 
share and build such a feature on:

1. Change the API of the existing callbacks so that the syscall entry 
point callback returns "bool" instead of "void" and if any of the 
registered callbacks returns true, execution of the actual syscall is 
skipped.
2. Introduce a new API function that sets a flag for a specific syscall 
to be skipped:
2a. A function that's called once in the manner of "always skip the 
syscall with this specific syscall number" or
2b. a function that's called every time in the syscall entry point 
callback in the manner of "skip this specific instance of the syscall".
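
A minimal sketch of how option 1 could behave, written as a standalone toy model rather than against the real plugin API. Every name here (syscall_filter_cb, register_syscall_filter, run_syscall_filters, emulate_syscall, fake_getpid) is hypothetical, and the -38 result is only a placeholder for "the real syscall ran":

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical callback type for option 1: returning true asks the
 * emulator to skip the real syscall. Not part of the current plugin API. */
typedef bool (*syscall_filter_cb)(int64_t num, int64_t *ret);

#define MAX_CBS 8
static syscall_filter_cb cbs[MAX_CBS];
static size_t n_cbs;

/* Register a callback; returns false when the table is full. */
bool register_syscall_filter(syscall_filter_cb cb)
{
    if (n_cbs == MAX_CBS) {
        return false;
    }
    cbs[n_cbs++] = cb;
    return true;
}

/* Run every callback (none is short-circuited) and report whether any
 * of them asked for the syscall to be skipped. */
bool run_syscall_filters(int64_t num, int64_t *ret)
{
    bool skip = false;
    for (size_t i = 0; i < n_cbs; i++) {
        if (cbs[i](num, ret)) {
            skip = true;
        }
    }
    return skip;
}

/* Stand-in for the emulator's syscall path. */
int64_t emulate_syscall(int64_t num)
{
    int64_t ret = 0;
    if (run_syscall_filters(num, &ret)) {
        return ret;   /* a plugin supplied the result */
    }
    return -38;       /* placeholder for the real syscall (-ENOSYS) */
}

/* Example plugin callback: fake getpid (number 39 on x86_64 Linux). */
bool fake_getpid(int64_t num, int64_t *ret)
{
    if (num == 39) {
        *ret = 1234;
        return true;
    }
    return false;
}
```

Option 2b would differ only in where the skip decision is recorded: a per-call flag set from inside the callback instead of its return value.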

I'd be happy to get your opinion on those proposals and to 
develop/submit the corresponding patches!

Thanks in advance and best regards,
Florian


* Re: New capabilities for plugins
  2025-08-04 10:14 New capabilities for plugins Florian Hofhammer
@ 2025-08-04 16:05 ` Alex Bennée
  2025-08-05 13:22   ` Florian Hofhammer
  2025-08-05 16:12   ` Daniel P. Berrangé
  2025-08-04 17:01 ` Pierrick Bouvier
  1 sibling, 2 replies; 13+ messages in thread
From: Alex Bennée @ 2025-08-04 16:05 UTC (permalink / raw)
  To: Florian Hofhammer
  Cc: qemu-devel, Richard Henderson, Laurent Vivier, Warner Losh

Florian Hofhammer <florian.hofhammer@epfl.ch> writes:

> Hello,

(Added the *-user MAINTAINERS to the CC)

> I'm currently working a lot with QEMU plugins for dynamic analysis of
> userspace binaries (i.e., running under qemu-user). While working on
> that, I found that the QEMU plugin API luckily has been getting more
> and more capabilities with recent versions but that I'm still lacking
> some functionality for my use cases.

We are slowly expanding the capabilities, although we don't want to go too
fast lest we introduce APIs we need to fix later. That said, we carry the
warning that the plugin API "reserves the right to change or break the
API should it need to do so" so hopefully plugin authors know what to
expect.

> More specifically, the "vcpu_syscall_cb" and "vcpu_syscall_ret"
> callbacks already allow me to instrument syscall translation entry and
> exit points. While the register read/write APIs also allow me to
> modify register contents in my syscall callback implementations, there
> is currently no good way to emulate a syscall myself in the plugin or
> explicitly set the syscall return value (as it will be overwritten
> with the original syscall's return value again, even if I set the
> corresponding guest register).
>
> I was wondering whether the QEMU community would be open to extending
> the plugin API so that a plugin can fully emulate a syscall without
> the original syscall being executed by QEMU.

I will defer to the *-user maintainers here. One thing we are keen to
avoid is plugins being used as a mechanism to work around the GPL
requirements of QEMU itself. It would be useful if you could outline the
use case for a plugin doing the emulation itself?

I'm fairly sure there are some syscalls where the plugin wouldn't be able to
do an emulation that makes sense, for example fork/vfork/execve because
QEMU itself has to do a bunch of housekeeping to keep track of stuff.

Also while the write memory helpers can take the place of QEMU's own
logic they come with a bunch of caveats about memory consistency. There
is stuff we do during lock_user() which those mechanisms don't.

> I had multiple approaches
> for that in mind, with some working patches locally that I'd be happy
> to share and build such a feature on:
>
> 1. Change the API of the existing callbacks so that the syscall entry
> point callback returns "bool" instead of "void" and if any of the
> registered callbacks returns true, execution of the actual syscall is
> skipped.
> 2. Introduce a new API function that sets a flag for a specific
> syscall to be skipped:
> 2a. A function that's called once in the manner of "always skip the
> syscall with this specific syscall number" or
> 2b. a function that's called every time in the syscall entry point
> callback in the manner of "skip this specific instance of the
> syscall".
>
> I'd be happy to get your opinion on those proposals and to
> develop/submit the corresponding patches!

Another option would be to have a set_pc function that would restart the
execution at new PC. Then the vcpu_syscall_cb callback could set the PC
to post the syscall with whatever state it wants to set up.

>
> Thanks in advance and best regards,
> Florian

-- 
Alex Bennée
Virtualisation Tech Lead @ Linaro


* Re: New capabilities for plugins
  2025-08-04 16:05 ` Alex Bennée
@ 2025-08-05 13:22   ` Florian Hofhammer
  2025-08-05 14:16     ` Alex Bennée
  2025-08-05 16:12   ` Daniel P. Berrangé
  1 sibling, 1 reply; 13+ messages in thread
From: Florian Hofhammer @ 2025-08-05 13:22 UTC (permalink / raw)
  To: Alex Bennée, pierrick.bouvier
  Cc: qemu-devel, richard.henderson, laurent, imp

Hi Alex, hi Pierrick,

I'm taking the liberty of replying to both of you at the same time, I hope 
you don't mind :)

On 04/08/2025 18:05, Alex Bennée wrote:
 >> I was wondering whether the QEMU community would be open to extending
 >> the plugin API so that a plugin can fully emulate a syscall without
 >> the original syscall being executed by QEMU.
 >
 > I will defer to the *-user maintainers here. One thing we are keen to
 > avoid is plugins being used as a mechanism to work around the GPL
 > requirements of QEMU itself. It would be useful if you could outline
 > the use case for a plugin doing the emulation itself?

On 04/08/2025 19:01, Pierrick Bouvier wrote:
 > Before talking about the how and what, it could be useful to explain
 > why it's needed to replace syscalls.

I'm using QEMU as a tool for security research in the context of my 
studies. I'm analyzing black-box binaries, i.e., I don't have source 
code access and ideally don't want to spend much time statically reverse 
engineering the binaries (there might be complex logic, code 
obfuscation, etc. at play). Emulating syscalls myself instead of passing 
them through to the host OS has two advantages:

1. I can sandbox the (untrusted) binaries.
2. I can make the binaries execute more of their code by simulating an 
environment different to the one they're actually running in (e.g., by 
returning values from syscalls that are required to pass certain 
conditional checks in the code).

Both could in theory also be achieved by using utilities like seccomp 
filters or eBPF programs. However, I'd like to have arbitrarily complex 
logic to determine the outcome of a syscall, which quickly reaches its 
limits with the aforementioned approaches, in addition to the overhead 
of switching into and out of the kernel (negligible for a single 
execution but quickly adds up if we're talking about automated 
analyses). Further, I'd like my code to be able to profit from future 
improvements to QEMU and therefore implement it as a plugin (which is 
likely more portable than constantly forward-porting patches from a 
custom QEMU fork, as the majority of security research is doing it 
currently).
As a contrived example, I might want to inspect the arguments to a read 
syscall and return data accordingly. Say, I know from a previous open 
syscall that a file descriptor refers to /dev/random, I might want to 
return exactly the (arguably non-random) bytes required to pass a 
certain condition when a read syscall on that file descriptor is issued.
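
A minimal sketch of the bookkeeping this example needs, as a standalone toy model. The hook names (on_open_ret, on_read), the fixed-size fd table, and the chosen byte pattern are all assumptions for illustration; a real plugin would have to read the path and write the buffer through the guest memory helpers:

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_FDS 64
static bool fd_is_random[MAX_FDS];

/* Hypothetical hook, called once an open() has returned: remember
 * whether the new file descriptor refers to /dev/random. */
void on_open_ret(const char *path, int fd)
{
    if (fd >= 0 && fd < MAX_FDS) {
        fd_is_random[fd] = (strcmp(path, "/dev/random") == 0);
    }
}

/* Hypothetical hook, called on read() entry. For a tracked descriptor,
 * fill the buffer with the bytes the analysis wants the guest to see,
 * store the byte count in *ret, and return true ("skip the real read").
 * For any other descriptor, return false and leave everything alone. */
bool on_read(int fd, unsigned char *buf, size_t len, long *ret)
{
    static const unsigned char wanted[] = { 0xde, 0xad, 0xbe, 0xef };

    if (fd < 0 || fd >= MAX_FDS || !fd_is_random[fd]) {
        return false;
    }
    for (size_t i = 0; i < len; i++) {
        buf[i] = wanted[i % sizeof(wanted)];
    }
    *ret = (long)len;
    return true;
}
```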

 > Another option would be to have a set_pc function that would restart
 > the execution at new PC. Then the vcpu_syscall_cb callback could set
 > the PC to post the syscall with whatever state it wants to set up.

Such a set_pc functionality is already covered with the register write 
API, as long as I have a handle to the PC register, right? Please do 
correct me if I'm misunderstanding something here!

Thanks for your input,
Florian


* Re: New capabilities for plugins
  2025-08-05 13:22   ` Florian Hofhammer
@ 2025-08-05 14:16     ` Alex Bennée
  2025-08-05 14:30       ` Florian Hofhammer
  0 siblings, 1 reply; 13+ messages in thread
From: Alex Bennée @ 2025-08-05 14:16 UTC (permalink / raw)
  To: Florian Hofhammer
  Cc: pierrick.bouvier, qemu-devel, richard.henderson, laurent, imp

Florian Hofhammer <florian.hofhammer@epfl.ch> writes:

> Hi Alex, hi Pierrick,
>
> I'm taking the freedom to reply to both of you at the same time, I
> hope you don't mind :)
>
> On 04/08/2025 18:05, Alex Bennée wrote:
>>> I was wondering whether the QEMU community would be open to extending
>>> the plugin API so that a plugin can fully emulate a syscall without
>>> the original syscall being executed by QEMU.
>>
<snip>
>> Another option would be to have a set_pc function that would restart
>> the execution at new PC. Then the vcpu_syscall_cb callback could set
>> the PC to post the syscall with whatever state it wants to set up.
>
> Such a set_pc functionality is already covered with the register write
> API, as long as I have a handle to the PC register, right? Please do
> correct me if I'm misunderstanding something here!

Ahh we should make that clear. It requires special handling as the PC
isn't automatically updated every instruction. For analysis this isn't a
problem as the TB itself knows the vaddr of each instruction so can save
it if it wants.

Currently if you write to the PC it won't change flow - and it will
likely be reset as we exit the syscall.

c.f. https://gitlab.com/qemu-project/qemu/-/issues/2208

>
> Thanks for your input,
> Florian

-- 
Alex Bennée
Virtualisation Tech Lead @ Linaro



* Re: New capabilities for plugins
  2025-08-05 14:16     ` Alex Bennée
@ 2025-08-05 14:30       ` Florian Hofhammer
  2025-08-05 15:30         ` Alex Bennée
  0 siblings, 1 reply; 13+ messages in thread
From: Florian Hofhammer @ 2025-08-05 14:30 UTC (permalink / raw)
  To: Alex Bennée
  Cc: pierrick.bouvier, qemu-devel, richard.henderson, laurent, imp

On 05/08/2025 16:16, Alex Bennée wrote:
>>> Another option would be to have a set_pc function that would restart
>>> the execution at new PC. Then the vcpu_syscall_cb callback could set
>>> the PC to post the syscall with whatever state it wants to set up.
>>
>> Such a set_pc functionality is already covered with the register write
>> API, as long as I have a handle to the PC register, right? Please do
>> correct me if I'm misunderstanding something here!
> 
> Ahh we should make that clear. It requires special handling as the PC
> isn't automatically updated every instruction. For analysis this isn't a
> problem as the TB itself knows the vaddr of each instruction so can save
> it if it wants.
> 
> Currently if you write to the PC it won't change flow - and it will
> likely be reset as we exit the syscall.
> 
> c.f. https://gitlab.com/qemu-project/qemu/-/issues/2208

Thanks for the clarification, I hadn't fully thought through the
implications of updating the PC for the JIT-compiled code.
Do I understand correctly that this would likely require hooking into
the TCG in a way so that the target address of this set_pc function gets
retranslated? While I've dug into the QEMU code quite a bit already, I'm
not sure I'm familiar enough with the TCG internals to be able to tell
whether such a set_pc function could determine the address of an
(arbitrary) already translated block. I.e., if the target PC is not just
the next instruction after the syscall, can QEMU determine whether the
target address has already been translated and if yes, where the
generated code actually is located?

Thanks,
Florian


* Re: New capabilities for plugins
  2025-08-05 14:30       ` Florian Hofhammer
@ 2025-08-05 15:30         ` Alex Bennée
  2025-08-21 16:02           ` Florian Hofhammer
  0 siblings, 1 reply; 13+ messages in thread
From: Alex Bennée @ 2025-08-05 15:30 UTC (permalink / raw)
  To: Florian Hofhammer
  Cc: pierrick.bouvier, qemu-devel, richard.henderson, laurent, imp

Florian Hofhammer <florian.hofhammer@epfl.ch> writes:

> On 05/08/2025 16:16, Alex Bennée wrote:
>>>> Another option would be to have a set_pc function that would restart
>>>> the execution at new PC. Then the vcpu_syscall_cb callback could set
>>>> the PC to post the syscall with whatever state it wants to set up.
>>>
>>> Such a set_pc functionality is already covered with the register write
>>> API, as long as I have a handle to the PC register, right? Please do
>>> correct me if I'm misunderstanding something here!
>> Ahh we should make that clear. It requires special handling as the
>> PC
>> isn't automatically updated every instruction. For analysis this isn't a
>> problem as the TB itself knows the vaddr of each instruction so can save
>> it if it wants.
>> Currently if you write to the PC it won't change flow - and it will
>> likely be reset as we exit the syscall.
>> c.f. https://gitlab.com/qemu-project/qemu/-/issues/2208
>
> Thanks for the clarification, I haven't fully thought the implications
> of updating the PC on the jited code through.
> Do I understand correctly that this would likely require hooking into
> the TCG in a way so that the target address of this set_pc function gets
> retranslated?

I think to read the PC we would just need to make sure we properly
resolve it - internally QEMU does this for faults with:

    tb = tcg_tb_lookup(retaddr);
    cpu_restore_state_from_tb(cpu, tb, retaddr);

where retaddr is the address of the translated code. We just need to
special case PC handling in the read path.

> While I've dug into the QEMU code quite a bit already, I'm
> not sure I'm familiar enough with the TCG internals to be able to tell
> whether such a set_pc function could determine the address of an
> (arbitrary) already translated block. I.e., if the target PC is not just
> the next instruction after the syscall, can QEMU determine whether the
> target address has already been translated and if yes, where the
> generated code actually is located?

No need - we just need to exit the loop via cpu_loop_exit_restore() and
the code will do the right thing. However we probably don't want to
trigger that via register write as we would surprise the plugin -
especially if there are other hooks still to run. So we would want an
explicit helper to do it.
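
A toy model of that idea, not QEMU code: the helper only records the request, all remaining hooks still run, and only then does the loop exit. Here longjmp() stands in for cpu_loop_exit_restore(), and every Toy*/toy_* name is hypothetical:

```c
#include <setjmp.h>
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    uint64_t pc;
    bool redirect_pending;   /* set by the explicit helper */
    uint64_t redirect_pc;
    jmp_buf loop_env;        /* where the toy CPU loop resumes */
} ToyCPU;

typedef void (*toy_hook)(ToyCPU *cpu);

int toy_hooks_run;           /* counts hook invocations for the demo */

/* Hypothetical explicit helper: record the request, do not jump yet. */
void toy_plugin_set_pc(ToyCPU *cpu, uint64_t new_pc)
{
    cpu->redirect_pending = true;
    cpu->redirect_pc = new_pc;
}

/* Run every hook first, then leave the loop if a redirect was asked for. */
void toy_run_hooks(ToyCPU *cpu, toy_hook *hooks, int n)
{
    for (int i = 0; i < n; i++) {
        hooks[i](cpu);
    }
    if (cpu->redirect_pending) {
        cpu->redirect_pending = false;
        cpu->pc = cpu->redirect_pc;
        longjmp(cpu->loop_env, 1);   /* stand-in for the loop exit */
    }
}

/* Handle one syscall instruction and return the PC execution resumes at. */
uint64_t toy_step_syscall(ToyCPU *cpu, toy_hook *hooks, int n)
{
    if (setjmp(cpu->loop_env) == 0) {
        toy_run_hooks(cpu, hooks, n);
        cpu->pc += 4;   /* no redirect: continue with the next instruction */
    }
    return cpu->pc;
}

/* Two example hooks: one requests a redirect, one only counts. */
void hook_redirect(ToyCPU *cpu)
{
    toy_hooks_run++;
    toy_plugin_set_pc(cpu, 0x4000);
}

void hook_count(ToyCPU *cpu)
{
    (void)cpu;
    toy_hooks_run++;
}
```

Deferring the jump until the hook list is exhausted is what keeps a later hook from being silently skipped, which is the surprise the paragraph above warns about.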

>
> Thanks,
> Florian

-- 
Alex Bennée
Virtualisation Tech Lead @ Linaro



* Re: New capabilities for plugins
  2025-08-05 15:30         ` Alex Bennée
@ 2025-08-21 16:02           ` Florian Hofhammer
  2025-08-22  8:44             ` Alex Bennée
  0 siblings, 1 reply; 13+ messages in thread
From: Florian Hofhammer @ 2025-08-21 16:02 UTC (permalink / raw)
  To: Alex Bennée
  Cc: pierrick.bouvier, qemu-devel, richard.henderson, laurent, imp

Hi Alex,

Sorry for the late reply, I've been out of office and did not check my
mail.

On 05/08/2025 17:30, Alex Bennée wrote:
> I think to read the PC we would just need to make sure we properly
> resolve it - internally QEMU does this for faults with:
> 
>      tb = tcg_tb_lookup(retaddr);
>      cpu_restore_state_from_tb(cpu, tb, retaddr);
> 
> where retaddr is the address of the translated code. We just need to
> special case PC handling in the read path.
> *snip* 
> No need - we just need to exit the loop via cpu_loop_exit_restore() and
> the code will do the right thing. However we probably don't want to
> trigger that via register write as we would surprise the plugin -
> especially if there are other hooks still to run. So we would want an
> explicit helper to do it.

Is this something the QEMU maintainers would be interested in? If yes,
I'm happy to dig into the codebase and submit some patches for review.
But this of course depends on whether such a feature is even desirable
in QEMU (cf. the parallel discussion thread).

Best regards,
Florian



* Re: New capabilities for plugins
  2025-08-21 16:02           ` Florian Hofhammer
@ 2025-08-22  8:44             ` Alex Bennée
  2025-08-26 14:37               ` Florian Hofhammer
  0 siblings, 1 reply; 13+ messages in thread
From: Alex Bennée @ 2025-08-22  8:44 UTC (permalink / raw)
  To: Florian Hofhammer
  Cc: pierrick.bouvier, qemu-devel, richard.henderson, laurent, imp

Florian Hofhammer <florian.hofhammer@epfl.ch> writes:

> Hi Alex,
>
> Sorry for the late reply, I've been out of office and did not check my
> mail.
>
> On 05/08/2025 17:30, Alex Bennée wrote:
>> I think to read the PC we would just need to make sure we properly
>> resolve it - internally QEMU does this for faults with:
>>      tb = tcg_tb_lookup(retaddr);
>>      cpu_restore_state_from_tb(cpu, tb, retaddr);
>> where retaddr is the address of the translated code. We just need to
>> special case PC handling in the read path.
>> *snip* No need - we just need to exit the loop via
>> cpu_loop_exit_restore() and
>> the code will do the right thing. However we probably don't want to
>> trigger that via register write as we would surprise the plugin -
>> especially if there are other hooks still to run. So we would want an
>> explicit helper to do it.
>
> Is this something the QEMU maintainers would be interested in? If yes,
> I'm happy to dig into the codebase and submit some patches for review.
> But this of course depends on whether such a feature is even desirable
> in QEMU (cf. the parallel discussion thread).

I think writing the patches would be a useful exercise anyway. The way
the plugin code is structured should mean you can keep the changes
fairly localised which would reduce the burden of maintaining an
out-of-tree patch if it isn't accepted. This wasn't really possible
pre-plugins as instrumentation was often deep in the frontends which is
actively maintained code with constant changes making re-basing a
nightmare.

>
> Best regards,
> Florian

-- 
Alex Bennée
Virtualisation Tech Lead @ Linaro



* Re: New capabilities for plugins
  2025-08-22  8:44             ` Alex Bennée
@ 2025-08-26 14:37               ` Florian Hofhammer
  0 siblings, 0 replies; 13+ messages in thread
From: Florian Hofhammer @ 2025-08-26 14:37 UTC (permalink / raw)
  To: Alex Bennée
  Cc: pierrick.bouvier, qemu-devel, richard.henderson, laurent, imp

Hi Alex,

> I think writing the patches would be a useful exercise anyway. The way
> the plugin code is structured should mean you can keep the changes
> fairly localised which would reduce the burden of maintaining an
> out-of-tree patch if it isn't accepted. This wasn't really possible
> pre-plugins as instrumentation was often deep in the frontends which is
> actively maintained code with constant changes making re-basing a
> nightmare.

I'll try to work on this and ask for feedback once I have some decent
patches ready. Thanks for the explanations and support so far!

Best regards,
Florian



* Re: New capabilities for plugins
  2025-08-04 16:05 ` Alex Bennée
  2025-08-05 13:22   ` Florian Hofhammer
@ 2025-08-05 16:12   ` Daniel P. Berrangé
  2025-08-21 15:58     ` Florian Hofhammer
  1 sibling, 1 reply; 13+ messages in thread
From: Daniel P. Berrangé @ 2025-08-05 16:12 UTC (permalink / raw)
  To: Alex Bennée
  Cc: Florian Hofhammer, qemu-devel, Richard Henderson, Laurent Vivier,
	Warner Losh

On Mon, Aug 04, 2025 at 05:05:06PM +0100, Alex Bennée wrote:
> Florian Hofhammer <florian.hofhammer@epfl.ch> writes:
> 
> > Hello,
> 
> (Added the *-user MAINTAINERS to the CC)
> > More specifically, the "vcpu_syscall_cb" and "vcpu_syscall_ret"
> > callbacks already allow me to instrument syscall translation entry and
> > exit points. While the register read/write APIs also allow me to
> > modify register contents in my syscall callback implementations, there
> > is currently no good way to emulate a syscall myself in the plugin or
> > explicitly set the syscall return value (as it will be overwritten
> > with the original syscall's return value again, even if I set the
> > corresponding guest register).
> >
> > I was wondering whether the QEMU community would be open to extending
> > the plugin API so that a plugin can fully emulate a syscall without
> > the original syscall being executed by QEMU.
> 
> I will defer to the *-user maintainers here. One thing we are keen to
> avoid is plugins being used as a mechanism to work around the GPL
> requirements of QEMU itself. It would be useful if you could outline the
> use case for a plugin doing the emulation itself?

Yeah, this sounds like it is potentially going a step too far in enabling
fully out of tree extension of core QEMU functionality.

If something conceptually is in scope of the core QEMU codebase, then
IMHO, our plugin system should aim to avoid enabling external
implementations as far as is practical.  That was easy when plugins
were limited to observability, but the more we enable in terms of
state modification the wider Pandora's box is opened.

Where to draw the line is a hard problem.

Excluding some undesirable features, may well force us to exclude some
potentially desirable plugin use cases at the same time, which may make
it impossible to keep everyone satisfied.

With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|




* Re: New capabilities for plugins
  2025-08-05 16:12   ` Daniel P. Berrangé
@ 2025-08-21 15:58     ` Florian Hofhammer
  2025-08-22  8:41       ` Alex Bennée
  0 siblings, 1 reply; 13+ messages in thread
From: Florian Hofhammer @ 2025-08-21 15:58 UTC (permalink / raw)
  To: Daniel P. Berrangé
  Cc: qemu-devel, Richard Henderson, Laurent Vivier, Warner Losh,
	Alex Bennée

Hi Daniel,

My apologies for the late reply, I've been out of office the past two
weeks.

> Yeah, this sounds like it is potentially going a step too far in enabling
> fully out of tree extension of core QEMU functionality.
> 
> If something conceptually is in scope of the core QEMU codebase, then
> IMHO, our plugin system should aim to avoid enabling external
> implementations as far as is practical.  That was easy when plugins
> were limited to observability, but the more we enable in terms of
> state modification the wider pandora's box is opened.
> 
> Where to draw the line is a hard problem.

As I'm new to the QEMU mailing list (not to tinkering with QEMU /
implementing things in or on top of QEMU, though), I'm not fully
familiar with the requirement of preventing out of tree extensions
(or at least certain functionality in such plugins).

I personally (of course with a somewhat biased view :)) see features
that allow modifying internal state (such as syscall behavior) as
beneficial or even required for certain dynamic binary analysis use
cases. Currently, the situation in academic research is that researchers
(i.e., typically PhD students) for dynamic analysis use cases fork QEMU
and implement their use case directly in the core logic. Such patches
can of course never be upstreamed, as they're very use-case-specific and
typically don't generalize to the full QEMU functionality (or might even
break / deteriorate other functionality that's not relevant to this
specific use case).
As soon as the corresponding project is considered "done" / "finished",
the code basically just rots and never gets rebased onto newer versions
of QEMU anymore, basically freezing it in time with all current bugs and
without potential future improvements to QEMU's core.

That's why I personally actually really love QEMU's plugin interface: it
allows me to introspect the system state without adding custom hooks
everywhere in the code that might not be portable to newer QEMU
versions! Consequently, my code is reusable either by myself later on or
other researchers with any version of QEMU that supports at least the
specified plugin API version.
However, I often wish to modify QEMU's state as well, such as in the
provided example with modifying syscall return values or skipping
over a syscall altogether.

Do I understand correctly that handling syscalls is considered in scope
of the core QEMU codebase and therefore shouldn't be possible to do via
a plugin? If I understand correctly, as it stands, there's no way to
modify syscall behavior on an emulator-level with QEMU instead of at the
kernel-level via seccomp, eBPF, kernel modules, ...

Best regards,
Florian


* Re: New capabilities for plugins
  2025-08-21 15:58     ` Florian Hofhammer
@ 2025-08-22  8:41       ` Alex Bennée
  0 siblings, 0 replies; 13+ messages in thread
From: Alex Bennée @ 2025-08-22  8:41 UTC (permalink / raw)
  To: Florian Hofhammer
  Cc: Daniel P. Berrangé, qemu-devel, Richard Henderson,
	Laurent Vivier, Warner Losh

Florian Hofhammer <florian.hofhammer@epfl.ch> writes:

> Hi Daniel,
>
> My apologies for the late reply, I've been out of office the past two
> weeks.
>
>> Yeah, this sounds like it is potentially going a step too far in enabling
>> fully out of tree extension of core QEMU functionality.
>> If something conceptually is in scope of the core QEMU codebase,
>> then
>> IMHO, our plugin system should aim to avoid enabling external
>> implementations as far as is practical.  That was easy when plugins
>> were limited to observability, but the more we enable in terms of
>> state modification the wider pandora's box is opened.
>> Where to draw the line is a hard problem.
>
> As I'm new to the QEMU mailing list (not to tinkering with QEMU /
> implementing things in or on top of QEMU, though), I'm not fully
> familiar with the requirement of preventing out of tree extensions
> (or at least certain functionality in such plugins).

I don't think we can prevent out-of-tree extensions. We do however say
the plugin API is not considered a stable interface - we want to be able
to evolve the interface without worrying about supporting code we can't
see.

There is basic versioning support in the plugin interface to catch
mismatches but there is certainly no backward compatibility - unlike for
example QEMU's approach to machine model versioning.
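
As a rough illustration of what such a check amounts to, here is a standalone sketch. The ToyApiRange type and toy_plugin_version_ok() are hypothetical names, and the assumption is that the host accepts a plugin only when its declared API version lies between the oldest and the newest version the host supports:

```c
#include <stdbool.h>

/* Hypothetical description of the API versions a host build accepts. */
typedef struct {
    int min;   /* oldest plugin API version still accepted */
    int cur;   /* newest plugin API version this host implements */
} ToyApiRange;

/* A plugin built against version plugin_version loads only if that
 * version falls inside the host's supported range. */
bool toy_plugin_version_ok(ToyApiRange host, int plugin_version)
{
    return plugin_version >= host.min && plugin_version <= host.cur;
}
```

A check of this kind catches mismatches at load time, but it says nothing about behaviour staying the same between versions.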

> I personally (of course with a somewhat biased view :)) see features
> that allow modifying internal state (such as syscall behavior) as
> beneficial or even required for certain dynamic binary analysis use
> cases. Currently, the situation in academic research is that researchers
> (i.e., typically PhD students) for dynamic analysis use cases fork QEMU
> and implement their use case directly in the core logic. Such patches
> can of course never be upstreamed, as they're very use-case-specific and
> typically don't generalize to the full QEMU functionality (or might even
> break / deteriorate other functionality that's not relevant to this
> specific use case).
> As soon as the corresponding project is considered "done" / "finished",
> the code basically just rots and never gets rebased onto newer versions
> of QEMU anymore, basically freezing it in time with all current bugs and
> without potential future improvements to QEMU's core.

There is a long history of academic forks of QEMU which have been done
to support various papers. Usually these changes never get upstreamed
and as you say get left to bitrot on old branches. This is unfortunate
because potential improvements to upstream QEMU never get made.

I should point out there are some cases of academic work being
upstreamed - the MTTCG changes led to a number of papers being published
and I like to think the overall quality of the academic work was
improved by having early review of the patches on the list as it was
being developed.

> That's why I personally actually really love QEMU's plugin interface: it
> allows me to introspect the system state without adding custom hooks
> everywhere in the code that might not be portable to newer QEMU
> versions! Consequently, my code is reusable either by myself later on or
> other researchers with any version of QEMU that supports at least the
> specified plugin API version.

Absolutely - the aim of the plugin interface was to remove the need for
custom hooks which inevitably bitrot when the instruction frontends get
re-factored. The unicorn engine fork for example will never be as up to
date for emulating AArch64 as the upstream because it is an actively
maintained architecture forever adding new CPU features.

> However, I often wish to modify QEMU's state as well, such as in the
> provided example with modifying syscall return values or skipping
> over a syscall altogether.
>
> Do I understand correctly that handling syscalls is considered in scope
> of the core QEMU codebase and therefore shouldn't be possible to do via
> a plugin?

I'm not sure. As I understand it, the community's general worry is
splitting development unnecessarily and leaving QEMU as an "open core"
for which proprietary extensions become the only way to emulate certain
hardware or syscalls.

The security use case certainly seems like a novel one that I'm
personally minded to support. It doesn't sound like much else is needed
other than the ability to reset the PC while executing.

> If I understand correctly, as it stands, there's no way to
> modify syscall behavior on an emulator-level with QEMU instead of at the
> kernel-level via seccomp, eBPF, kernel modules, ...
>
> Best regards,
> Florian

-- 
Alex Bennée
Virtualisation Tech Lead @ Linaro


* Re: New capabilities for plugins
  2025-08-04 10:14 New capabilities for plugins Florian Hofhammer
  2025-08-04 16:05 ` Alex Bennée
@ 2025-08-04 17:01 ` Pierrick Bouvier
  1 sibling, 0 replies; 13+ messages in thread
From: Pierrick Bouvier @ 2025-08-04 17:01 UTC (permalink / raw)
  To: Florian Hofhammer, qemu-devel

Hello Florian,

Thanks for your interest in QEMU plugins.
As you noticed, we recently expanded the API to be able to modify the state of 
the executed program/machine. Before that, plugins were limited to observing 
what is happening, without allowing any modification. One of the reasons 
is that modification might break QEMU in subtle ways, but we came to realize that 
the benefits are worth the risk.

On 8/4/25 3:14 AM, Florian Hofhammer wrote:
> Hello,
> 
> I'm currently working a lot with QEMU plugins for dynamic analysis of
> userspace binaries (i.e., running under qemu-user). While working on
> that, I found that the QEMU plugin API luckily has been getting more and
> more capabilities with recent versions but that I'm still lacking some
> functionality for my use cases.
> More specifically, the "vcpu_syscall_cb" and "vcpu_syscall_ret"
> callbacks already allow me to instrument syscall translation entry and
> exit points. While the register read/write APIs also allow me to modify
> register contents in my syscall callback implementations, there is
> currently no good way to emulate a syscall myself in the plugin or
> explicitly set the syscall return value (as it will be overwritten with
> the original syscall's return value again, even if I set the
> corresponding guest register).
>

The vcpu_syscall_ret callback is called just at the end of do_syscall 
(before returning value), so this is why the value does not get overwritten.

If you replace the returned value at the right point, which should be 
the next instruction after svc (instruction callbacks are called 
*before* instruction is executed), this should overwrite the return as 
expected.
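
A toy model of that ordering, not QEMU code. All names are hypothetical; the point is only that a value written from the syscall-return hook is clobbered by the real result, while a value written from the callback of the following instruction sticks:

```c
#include <stdbool.h>
#include <stdint.h>

enum { TOY_SVC, TOY_NOP };

typedef struct {
    int64_t ret_reg;        /* guest register holding the syscall result */
    bool patch_next;        /* armed by the syscall-return hook */
    int64_t patched_value;
} ToyGuest;

/* Syscall-return hook: too early to write the register, so only arm
 * the patch for the next instruction callback. */
void toy_on_syscall_ret(ToyGuest *g, int64_t value)
{
    g->patch_next = true;
    g->patched_value = value;
}

/* Instruction callback: runs *before* each instruction executes. */
void toy_on_insn(ToyGuest *g)
{
    if (g->patch_next) {
        g->ret_reg = g->patched_value;
        g->patch_next = false;
    }
}

/* Execute a tiny program made of TOY_SVC / TOY_NOP instructions. */
void toy_run(ToyGuest *g, const int *prog, int n, int64_t fake)
{
    for (int i = 0; i < n; i++) {
        toy_on_insn(g);
        if (prog[i] == TOY_SVC) {
            toy_on_syscall_ret(g, fake);  /* hook at end of syscall handling */
            g->ret_reg = -38;             /* emulator then stores the real result */
        }
    }
}
```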

> I was wondering whether the QEMU community would be open to extending
> the plugin API so that a plugin can fully emulate a syscall without the
> original syscall being executed by QEMU. I had multiple approaches for
> that in mind, with some working patches locally that I'd be happy to
> share and build such a feature on:
> 
> 1. Change the API of the existing callbacks so that the syscall entry
> point callback returns "bool" instead of "void" and if any of the
> registered callbacks returns true, execution of the actual syscall is
> skipped.
> 2. Introduce a new API function that sets a flag for a specific syscall
> to be skipped:
> 2a. A function that's called once in the manner of "always skip the
> syscall with this specific syscall number" or
> 2b. a function that's called every time in the syscall entry point
> callback in the manner of "skip this specific instance of the syscall".
> 
> I'd be happy to get your opinion on those proposals and to
> develop/submit the corresponding patches!
>

Before talking about the how and what, it could be useful to explain why 
it's needed to replace syscalls.

It's not clear to me how a program can do anything useful if we replace 
real syscalls with fake stubs or skip them. Could you give us a bit more 
detail about your work and goals? This will help us go in the right 
direction for what you need.

> Thanks in advance and best regards,
> Florian
> 
> 

Thanks,
Pierrick

