From: "Daniel P. Berrange" <berrange@redhat.com>
To: Ross Lagerwall <ross.lagerwall@citrix.com>
Cc: qemu-devel@nongnu.org, Markus Armbruster <armbru@redhat.com>
Subject: Re: [Qemu-devel] [PATCH v1] os-posix: Add -unshare option
Date: Thu, 19 Oct 2017 17:24:20 +0100
Message-ID: <20171019162420.GA8408@redhat.com>
In-Reply-To: <20171019160419.11611-1-ross.lagerwall@citrix.com>

On Thu, Oct 19, 2017 at 05:04:19PM +0100, Ross Lagerwall wrote:
> Add an option to allow calling unshare() just before starting guest
> execution. The option allows unsharing one or more of the mount
> namespace, the network namespace, and the IPC namespace. This is useful
> to restrict the ability of QEMU to cause damage to the system should it
> be compromised.
> 
> An example of using this would be to have QEMU open a QMP socket at
> startup and unshare the network namespace. The instance of QEMU could
> still be controlled by the QMP socket since that belongs in the original
> namespace, but if QEMU were compromised it wouldn't be able to open any
> new connections, even to other processes on the same machine.

Unless I'm misunderstanding you, what's described here is already possible
by just using the 'unshare' command to spawn QEMU:

  # unshare --ipc --mount --net qemu-system-x86_64 -qmp unix:/tmp/foo,server -vnc :1
  qemu-system-x86_64: -qmp unix:/tmp/foo,server: QEMU waiting for connection on: disconnected:unix:/tmp/foo,server

And in another shell I can still access the QMP socket from the original host
namespace:

  # ./scripts/qmp/qmp-shell /tmp/foo
  Welcome to the QMP low-level shell!
  Connected to QEMU 2.9.1

  (QEMU) query-kvm
  {"return": {"enabled": false, "present": true}}

FWIW, even if that were not possible, you could still do it by wrapping the
qmp-shell in an 'nsenter' call, e.g.:

  nsenter --target $QEMUPID --net ./scripts/qmp/qmp-shell /tmp/foo

> Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>
> ---
>  os-posix.c      | 34 ++++++++++++++++++++++++++++++++++
>  qemu-options.hx | 14 ++++++++++++++
>  2 files changed, 48 insertions(+)
> 
> diff --git a/os-posix.c b/os-posix.c
> index b9c2343..cfc5c38 100644
> --- a/os-posix.c
> +++ b/os-posix.c
> @@ -45,6 +45,7 @@ static struct passwd *user_pwd;
>  static const char *chroot_dir;
>  static int daemonize;
>  static int daemon_pipe;
> +static int unshare_flags;
>  
>  void os_setup_early_signal_handling(void)
>  {
> @@ -160,6 +161,28 @@ void os_parse_cmd_args(int index, const char *optarg)
>          fips_set_state(true);
>          break;
>  #endif
> +#ifdef CONFIG_SETNS
> +    case QEMU_OPTION_unshare:
> +        {
> +            char *flag;
> +            char *opts = g_strdup(optarg);
> +
> +            while ((flag = qemu_strsep(&opts, ",")) != NULL) {
> +                if (!strcmp(flag, "mount")) {
> +                    unshare_flags |= CLONE_NEWNS;
> +                } else if (!strcmp(flag, "net")) {
> +                    unshare_flags |= CLONE_NEWNET;
> +                } else if (!strcmp(flag, "ipc")) {
> +                    unshare_flags |= CLONE_NEWIPC;
> +                } else {
> +                    fprintf(stderr, "Unknown unshare option: %s\n", flag);
> +                    exit(1);
> +                }
> +            }
> +            g_free(opts);
> +        }
> +        break;
> +#endif
>      }
>  }
>  
> @@ -201,6 +224,16 @@ static void change_root(void)
>  
>  }
>  
> +static void unshare_namespaces(void)
> +{
> +    if (unshare_flags) {
> +        if (unshare(unshare_flags) < 0) {
> +            perror("could not unshare");
> +            exit(1);
> +        }
> +    }
> +}
> +
>  void os_daemonize(void)
>  {
>      if (daemonize) {
> @@ -266,6 +299,7 @@ void os_setup_post(void)
>      }
>  
>      change_root();
> +    unshare_namespaces();
>      change_process_uid();

This has some really bad implications.  All the command line options that are
given are processed *before* os_setup_post() is called. IOW, -chardev, -vnc,
-migrate, -net, etc. will all be configured in the context of the host namespace.

If you then use the QMP monitor to run chardev_add, device_add, migrate,
hostnet_add, etc., this will all take place in the new namespace.

So the exact same args given as ARGV now have completely different semantics
when given via QMP.

I think this is really very undesirable.

If you wrap QEMU execution in 'unshare' as I illustrate above, then the
semantics of ARGV & QMP remain consistent.
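
To make the difference concrete, here is a minimal sketch in C. It is not
QEMU code, and the helper names are made up for illustration. It identifies
the network namespace by the inode of /proc/self/ns/net, once at the point
where ARGV would have been processed and again after an in-process
unshare(CLONE_NEWNET), which is where later QMP commands would run. It
assumes Linux with /proc mounted and enough privilege (CAP_SYS_ADMIN) to
unshare; without that it just reports an error.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sched.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical helper: identify the current network namespace by the
 * inode number of /proc/self/ns/net. Returns 0 on success, -1 on error. */
static int current_netns(ino_t *ino)
{
    struct stat st;

    if (stat("/proc/self/ns/net", &st) < 0) {
        return -1;
    }
    *ino = st.st_ino;
    return 0;
}

/* Hypothetical helper: returns 1 if the network namespace seen by this
 * process differs before and after unshare(CLONE_NEWNET), 0 if it is
 * unchanged, and -1 (with errno set) if any step fails, e.g. EPERM when
 * the caller lacks CAP_SYS_ADMIN. */
int netns_changes_after_unshare(void)
{
    ino_t before, after;

    /* "ARGV time": resources configured here use this namespace. */
    if (current_netns(&before) < 0) {
        return -1;
    }

    /* What the proposed -unshare option would do in os_setup_post(). */
    if (unshare(CLONE_NEWNET) < 0) {
        return -1;
    }

    /* "QMP time": anything created from now on uses this namespace. */
    if (current_netns(&after) < 0) {
        return -1;
    }

    return before != after ? 1 : 0;
}
```

A return value of 1 means that anything created after the call (sockets,
outgoing connections) lives in a different namespace from whatever was set
up earlier in the same process. Note that the call changes the namespace of
the calling process itself.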

FWIW, as a further point that might be of interest, libvirt will now spawn
a new private mount namespace for QEMU by default. We do this so that we can
give QEMU a private /dev filesystem with only the devices it is permitted to
use present as device nodes.  The ability to do such setup tasks between
namespace creation and QEMU launching is broadly useful. For example, if
using a private network namespace, you might want to create a veth pair and
put one end in the namespace, so that QEMU's network services have some
level of outside network connectivity, e.g. to enable QEMU to connect to a
remote QEMU for live migration.
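
As a rough illustration of doing the namespace work outside QEMU, the
following C sketch forks, unshares in the child, leaves a slot for setup
tasks, and then execs the target command. The function name run_unshared
and the exit codes 126/127 are my own choices for the example, not anything
taken from QEMU or libvirt; in practice unshare(1) or libvirt would do this
for you.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sched.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical launcher: run argv[] in a child process that has first
 * unshared the namespaces named in flags (CLONE_NEWNS, CLONE_NEWNET,
 * CLONE_NEWIPC, ...). Returns the child's exit status, or -1 if the
 * child could not be created, waited for, or did not exit normally. */
int run_unshared(int flags, char *const argv[])
{
    pid_t pid = fork();
    int status;

    if (pid < 0) {
        return -1;
    }

    if (pid == 0) {
        /* Child: create the namespaces before the target program starts,
         * so everything it does sees one consistent view. */
        if (flags != 0 && unshare(flags) < 0) {
            perror("unshare");
            _exit(126);
        }

        /* Setup tasks would go here, e.g. populating a private /dev. */

        execvp(argv[0], argv);
        perror("execvp");
        _exit(127);
    }

    /* Parent: wait for the child, retrying if interrupted by a signal. */
    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR) {
            return -1;
        }
    }

    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Passing CLONE_NEWNS | CLONE_NEWNET | CLONE_NEWIPC as flags corresponds
roughly to the 'unshare --ipc --mount --net' invocation shown earlier, and
needs the same privileges.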

So overall, I absolutely encourage the use of namespaces to confine QEMU,
but I tend to think namespace creation/setup is better done outside QEMU
before launching it.

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|

