From: Florian Weimer <fweimer@redhat.com>
To: Peter Maydell <peter.maydell@linaro.org>
Cc: "Daniel P. Berrangé" <berrange@redhat.com>,
"Andreas Schwab" <schwab@suse.de>, "Helge Deller" <deller@gmx.de>,
qemu-devel@nongnu.org
Subject: Re: Generic way to detect qemu linux-user emulation
Date: Wed, 25 Mar 2026 18:08:12 +0100
Message-ID: <lhumrzvn8g3.fsf@oldenburg.str.redhat.com>
In-Reply-To: <CAFEAcA_ZBz3yvUYo5WhqmKRqCm+Jy1R01pshtU0NPRzzbP4hYQ@mail.gmail.com> (Peter Maydell's message of "Tue, 18 Mar 2025 15:04:51 +0000")
* Peter Maydell:
> On Tue, 18 Mar 2025 at 13:55, Daniel P. Berrangé <berrange@redhat.com> wrote:
>>
>> On Tue, Mar 18, 2025 at 01:06:17PM +0000, Peter Maydell wrote:
>> > The difficulty with vfork() (and, more generally, with various of
>> > the clone() syscall flag combinations) is that because we use the
>> > host libc we are restricted to the thread/process creation options
>> > that that libc permits: which is only fork() and pthread_create().
>> > vfork() wants "create a new process like fork with its own file
>> > descriptors, signal handlers, etc, but share all the memory space with
>> > the parent", and the host libc just doesn't provide us with the tools
>> > to do that. (We can't call the host vfork() because we wouldn't be
>> > abiding by the rules it imposes, like "don't return from the function
>> > that called vfork".)
>> >
>> > If we were implemented as a usermode emulator that sat on the raw
>> > kernel syscalls, we could directly call the clone syscall and
>> > use that to provide at least a wider range of the possible clone
>> > flag options; but our dependency on libc means we have to avoid
>> > doing things that would confuse it.
>>
>> I guess I'm not seeing how libc is blocking us in this respect ?
>> The clone() syscall wrapper is exposed by glibc at least, and it
>> is possible to call it, albeit with some caveats that we might
>> miss any logic glibc has around its fork() wrapper. The spec
>> requires that any child must immediately call execve after vfork
>> so I'm wondering just what risk of confusion we would have in
>> practice ?
>
> I think my notes about clone are a red herring for vfork
> specifically. For vfork in the child, the vfork spec requires
> a very minimal amount of stuff to happen in the child, but QEMU's
> own TCG data structures and calls and processes mean that we
> will be doing a lot more than the guest does. For instance,
> we need to return from the function that called vfork, so we
> can continue to execute the guest code. And the guest code will
> likely call into the translator to generate more code, which will
> (a) mess up the TCG data structures for the parent and (b)
> probably result in our calling into libc functions that aren't
> OK to call.
Yes, the problem with vfork is qemu-user's own state and data
structures. It may be okay to do this if the process is single-threaded,
but it won't really work if it is multi-threaded.

I think you would need to use userfaultfd to mimic vfork behavior for
emulated code only, and that seems to be quite a big project.
Maybe it would work to create the new PID off a new thread (created with
pthread_create) via vfork, and proxy emulated system calls through that
process, while still running the emulator in the original process. The
trampoline could be a simple syscall function wrapper that communicates
through shared memory and a process-shared condition variable or
barrier. Shouldn't this give the right semantics? The emulated code
would see the new process identity because system calls like getpid are
executed remotely in that process. It might be easier to implement than
userfaultfd support.
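
To make the trampoline idea a bit more concrete, here is a rough sketch
of the shared-memory request slot, with a process-shared mutex and
condition variable. It is only an illustration, not QEMU code: all names
(proxy_slot, proxy_start, proxy_syscall, proxy_stop) are made up, and it
uses fork() instead of vfork() so that it stays within what libc allows
in the child. Because of that, the proxy does not share the emulator's
address space, so only system calls with scalar arguments work here; a
real implementation would have to solve that part.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* All names below are hypothetical; none of this is existing QEMU code. */
enum slot_state { SLOT_IDLE, SLOT_REQUEST, SLOT_REPLY, SLOT_QUIT };

struct proxy_slot {
    pthread_mutex_t lock;   /* process-shared */
    pthread_cond_t cond;    /* process-shared */
    enum slot_state state;
    long nr;                /* host syscall number */
    long args[6];
    long ret;               /* result, or -errno on failure */
    pid_t pid;              /* PID of the proxy process, set by the parent */
};

/* Runs in the proxy process: execute requests until told to quit. */
static void proxy_loop(struct proxy_slot *s)
{
    pthread_mutex_lock(&s->lock);
    for (;;) {
        while (s->state != SLOT_REQUEST && s->state != SLOT_QUIT)
            pthread_cond_wait(&s->cond, &s->lock);
        if (s->state == SLOT_QUIT)
            break;
        long r = syscall(s->nr, s->args[0], s->args[1], s->args[2],
                         s->args[3], s->args[4], s->args[5]);
        s->ret = (r == -1) ? -errno : r;
        s->state = SLOT_REPLY;
        pthread_cond_broadcast(&s->cond);
    }
    pthread_mutex_unlock(&s->lock);
    _exit(0);
}

/* Create the shared slot and the proxy process (fork, not vfork). */
static struct proxy_slot *proxy_start(void)
{
    struct proxy_slot *s = mmap(NULL, sizeof(*s), PROT_READ | PROT_WRITE,
                                MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (s == MAP_FAILED)
        return NULL;

    pthread_mutexattr_t ma;
    pthread_condattr_t ca;
    pthread_mutexattr_init(&ma);
    pthread_mutexattr_setpshared(&ma, PTHREAD_PROCESS_SHARED);
    pthread_mutex_init(&s->lock, &ma);
    pthread_mutexattr_destroy(&ma);
    pthread_condattr_init(&ca);
    pthread_condattr_setpshared(&ca, PTHREAD_PROCESS_SHARED);
    pthread_cond_init(&s->cond, &ca);
    pthread_condattr_destroy(&ca);
    s->state = SLOT_IDLE;

    /* Keep the fork result local: the slot is shared with the child. */
    pid_t pid = fork();
    if (pid < 0) {
        munmap(s, sizeof(*s));
        return NULL;
    }
    if (pid == 0)
        proxy_loop(s);      /* never returns */
    s->pid = pid;
    return s;
}

/* Trampoline: run one system call remotely in the proxy process. */
static long proxy_syscall(struct proxy_slot *s, long nr, const long args[6])
{
    pthread_mutex_lock(&s->lock);
    assert(s->state == SLOT_IDLE);  /* sketch: one request at a time */
    s->nr = nr;
    memcpy(s->args, args, sizeof(s->args));
    s->state = SLOT_REQUEST;
    pthread_cond_broadcast(&s->cond);
    while (s->state != SLOT_REPLY)
        pthread_cond_wait(&s->cond, &s->lock);
    long ret = s->ret;
    s->state = SLOT_IDLE;
    pthread_mutex_unlock(&s->lock);
    return ret;
}

/* Ask the proxy to exit, reap it, and release the shared slot. */
static void proxy_stop(struct proxy_slot *s)
{
    pthread_mutex_lock(&s->lock);
    s->state = SLOT_QUIT;
    pthread_cond_broadcast(&s->cond);
    pthread_mutex_unlock(&s->lock);
    waitpid(s->pid, NULL, 0);
    munmap(s, sizeof(*s));
}
```

With this, proxy_syscall(s, SYS_getpid, args) returns the PID of the
proxy process rather than that of the caller, which is the effect the
emulated code should observe. Concurrent requests from several emulator
threads would need one slot per thread or a queue; the sketch
deliberately ignores that.
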
Thanks,
Florian