Re: Approaches for same-on-same linux-user execve?


From: "Daniel P. Berrangé" <berrange@redhat.com>
To: "Alex Bennée" <alex.bennee@linaro.org>
Cc: Laurent Vivier <laurent@vivier.eu>,
	Richard Henderson <richard.henderson@linaro.org>,
	assad.hashmi@linaro.org, qemu-devel@nongnu.org,
	James Bottomley <James.Bottomley@hansenpartnership.com>,
	qemu-arm@nongnu.org, "Eric W. Biederman" <ebiederm@xmission.com>,
	Arnd Bergmann <arnd.bergmann@linaro.org>
Subject: Re: Approaches for same-on-same linux-user execve?
Date: Fri, 8 Oct 2021 12:01:07 +0100
Message-ID: <YWAk88MtPDufjzmK@redhat.com>
In-Reply-To: <877deoevj8.fsf@linaro.org>

On Thu, Oct 07, 2021 at 03:32:19PM +0100, Alex Bennée wrote:
> Hi,
> 
> I came across a use-case this week for ARM although this may be also
> applicable to architectures where QEMU's emulation is ahead of the
> hardware currently widely available - for example if you want to
> exercise SVE code on AArch64. When the linux-user architecture is not
> the same as the host architecture then binfmt_misc works perfectly fine.
> 
> However in the case you are running same-on-same you can't use
> binfmt_misc to redirect execution to using QEMU because any attempt to
> trap native binaries will cause your userspace to hang as binfmt_misc
> will be invoked to run the QEMU binary needed to run your application
> and a deadlock ensues.
> 
> There are some hacks you can apply at a local level like tweaking the
> elf header of the binaries you want to run under emulation and adjusting
> the binfmt_mask appropriately. This works but is messy and a faff to
> set-up.
> 
> An ideal setup would be for the kernel to catch a SIGILL from a
> failing user space program and then to re-launch the process using QEMU
> with the old process's maps and execution state so it could continue.
> However I suspect there are enough moving parts to make this very
> fragile (e.g. what happens to the results of library feature probing
> code). So two approaches I can think of are:
> 
> Trap execve in QEMU linux-user
> ------------------------------
> 
> We could add a flag to QEMU so at the point of execve it manually
> invokes the new process with QEMU, passing on the flag to persist this
> behaviour.
> 
> 
> Add path mask to binfmt_misc
> ----------------------------
> 
> The other option would be to extend binfmt_misc to have a path mask so
> it only applies its alternative execution scheme to binaries in a
> particular section of the file-system (or maybe some sort of pattern?).
> 
> Are there any other approaches you could take? Which do you think has
> the most merit?

Could a new Linux personality flag be useful in combination with a
new flag in binfmt_misc?

e.g. a flag "E" for binfmt_misc which indicates the rule must only be
applied if the process is execve()d with the PER_USE_BINFMT personality
set.
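
To make the idea concrete, here is a rough sketch of what such a rule
could look like when written to the binfmt_misc register file. The
":name:type:offset:magic:mask:interpreter:flags" layout is the existing
binfmt_misc format; the "E" flag is only the proposal above and is not
implemented anywhere, and the rule name and interpreter path are made
up for illustration.

```python
# Sketch only: "E" is the flag proposed in this mail, not a real
# binfmt_misc flag. Rule name and interpreter path are hypothetical.

def binfmt_rule(name, magic, mask, interpreter, flags):
    """Build a ':name:type:offset:magic:mask:interpreter:flags' line."""
    def esc(data):
        # Escape every byte as \xNN so NULs and ':' never appear raw.
        return "".join("\\x%02x" % b for b in data)
    return ":%s:M::%s:%s:%s:%s" % (name, esc(magic), esc(mask),
                                   interpreter, flags)

# First 20 bytes of a little-endian 64-bit ELF header:
# e_ident (16 bytes), e_type = 2 (ET_EXEC), e_machine = 0xb7 (EM_AARCH64).
AARCH64_MAGIC = b"\x7fELF\x02\x01\x01" + b"\x00" * 9 + b"\x02\x00\xb7\x00"
# Ignore the OS ABI byte, and the low bit of e_type so that both
# ET_EXEC (2) and ET_DYN (3) binaries match.
AARCH64_MASK = b"\xff" * 7 + b"\x00" + b"\xff" * 8 + b"\xfe\xff\xff\xff"

RULE = binfmt_rule("qemu-aarch64-native", AARCH64_MAGIC, AARCH64_MASK,
                   "/usr/bin/qemu-aarch64", "E")
```

On an AArch64 host this rule matches native binaries, which is exactly
the case that deadlocks today; the hypothetical "E" flag is what would
keep it dormant for ordinary processes.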

That would let you add a native match rule to binfmt_misc without
it affecting your system initially. To then run native binaries via
qemu-user you just need to set the personality() flag, and then only
that sub-process tree gets redirected.
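
A launcher for that could be as small as the sketch below. It assumes
the kernel grew a PER_USE_BINFMT bit; no such flag exists today, and
the value used here is purely a placeholder I picked for illustration.
personality(2) and the 0xffffffff "query only" argument are real; the
helper names are hypothetical.

```python
import ctypes
import os

# Hypothetical: PER_USE_BINFMT does not exist in any kernel. The value
# is a placeholder, not an allocated personality bit.
PER_USE_BINFMT = 0x10000000
PER_QUERY = 0xFFFFFFFF  # personality(0xffffffff) returns the current value


def persona_with_flag(current, flag=PER_USE_BINFMT):
    """Return the current personality with the opt-in bit added."""
    return current | flag


def run_with_flag(argv):
    """Set the flag on this process, then exec argv in place.

    The personality is inherited across fork() and kept across
    execve(), so the whole sub-process tree would carry the flag.
    """
    libc = ctypes.CDLL(None, use_errno=True)
    libc.personality.argtypes = [ctypes.c_ulong]
    libc.personality.restype = ctypes.c_int

    current = libc.personality(PER_QUERY)
    if current == -1:
        raise OSError(ctypes.get_errno(), "personality query failed")
    if libc.personality(persona_with_flag(current)) == -1:
        raise OSError(ctypes.get_errno(), "personality set failed")
    os.execvp(argv[0], argv)
```

Calling run_with_flag(["make", "check"]) would then be the only step
needed to push one build tree through qemu-user, while everything else
on the system keeps running natively.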

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|

