From: Juan Quintela <quintela@redhat.com>
To: Peter Xu <peterx@redhat.com>
Cc: qemu-devel@nongnu.org,
"Philippe Mathieu-Daudé" <philmd@linaro.org>,
"Michal Prívozník" <mprivozn@redhat.com>,
"Leonardo Bras Soares Passos" <lsoaresp@redhat.com>,
"Dr . David Alan Gilbert" <dgilbert@redhat.com>,
"Daniel P . Berrangé" <berrange@redhat.com>
Subject: Re: [PATCH v3 2/2] util/userfaultfd: Support /dev/userfaultfd
Date: Wed, 08 Feb 2023 20:06:35 +0100
Message-ID: <87wn4s9dtw.fsf@secure.mitica>
In-Reply-To: <20230207205711.1187216-3-peterx@redhat.com> (Peter Xu's message of "Tue, 7 Feb 2023 15:57:11 -0500")
Peter Xu <peterx@redhat.com> wrote:
> Teach QEMU to use /dev/userfaultfd when it exists, and fall back to the
> system call if the device is missing or QEMU lacks permission to open it.
>
> Firstly, as long as the app has permission to access /dev/userfaultfd, it
> always has the ability to trap kernel faults, which is what QEMU mostly wants.
> Meanwhile, in some contexts (e.g. containers) the userfaultfd syscall can be
> forbidden, so the device can be the main way to use postcopy in a restricted
> environment with a strict seccomp setup.
>
> Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
> diff --git a/util/userfaultfd.c b/util/userfaultfd.c
> index 4953b3137d..fdff4867e8 100644
> --- a/util/userfaultfd.c
> +++ b/util/userfaultfd.c
> @@ -18,10 +18,42 @@
> #include <poll.h>
> #include <sys/syscall.h>
> #include <sys/ioctl.h>
> +#include <fcntl.h>
> +
> +typedef enum {
> + UFFD_UNINITIALIZED = 0,
> + UFFD_USE_DEV_PATH,
> + UFFD_USE_SYSCALL,
> +} uffd_open_mode;
Between an enum and a magic -2 value, the enum is better O:-)
> int uffd_open(int flags)
> {
> #if defined(__NR_userfaultfd)
> + static uffd_open_mode open_mode;
> + static int uffd_dev;
> +
> + /* Detect how to generate uffd desc when run the 1st time */
> + if (open_mode == UFFD_UNINITIALIZED) {
> + /*
> + * Make /dev/userfaultfd the default approach because it has better
> + * permission controls, meanwhile allows kernel faults without any
> + * privilege requirement (e.g. CAP_SYS_PTRACE).
> + */
> + uffd_dev = open("/dev/userfaultfd", O_RDWR | O_CLOEXEC);
> + if (uffd_dev >= 0) {
> + open_mode = UFFD_USE_DEV_PATH;
> + } else {
> + /* Fallback to the system call */
> + open_mode = UFFD_USE_SYSCALL;
> + }
> + trace_uffd_detect_open_mode(open_mode);
> + }
> +
> + if (open_mode == UFFD_USE_DEV_PATH) {
> + assert(uffd_dev >= 0);
> + return ioctl(uffd_dev, USERFAULTFD_IOC_NEW, flags);
> + }
> +
> return syscall(__NR_userfaultfd, flags);
> #else
> return -EINVAL;
The rest is ok, thanks.
queued.
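
For anyone following the thread, the detect-then-fall-back flow in the hunk
above can be sketched as a standalone snippet. This is only an illustration,
not the code that was queued: the helper names uffd_detect_mode() and
uffd_open_with_mode() are hypothetical, the device path is passed in as a
parameter so the fallback branch can be exercised without the real device,
and the fallback #define for USERFAULTFD_IOC_NEW assumes the value from the
v6.1 linux/userfaultfd.h header.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/ioctl.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Assumption: headers older than v6.1 lack this; value as in linux/userfaultfd.h. */
#ifndef USERFAULTFD_IOC_NEW
#define USERFAULTFD_IOC_NEW _IO(0xAA, 0x00)
#endif

typedef enum {
    UFFD_UNINITIALIZED = 0,
    UFFD_USE_DEV_PATH,
    UFFD_USE_SYSCALL,
} uffd_open_mode;

/*
 * Hypothetical helper: probe dev_path once and report which mode to use.
 * On success *dev_fd holds the open device fd; on failure it is -1.
 */
uffd_open_mode uffd_detect_mode(const char *dev_path, int *dev_fd)
{
    *dev_fd = open(dev_path, O_RDWR | O_CLOEXEC);
    return *dev_fd >= 0 ? UFFD_USE_DEV_PATH : UFFD_USE_SYSCALL;
}

/*
 * Hypothetical helper: create a userfaultfd descriptor with the chosen mode.
 * Returns the new fd, or -1 with errno set.
 */
int uffd_open_with_mode(uffd_open_mode mode, int dev_fd, int flags)
{
    if (mode == UFFD_USE_DEV_PATH) {
        assert(dev_fd >= 0);
        return ioctl(dev_fd, USERFAULTFD_IOC_NEW, flags);
    }
#if defined(__NR_userfaultfd)
    return syscall(__NR_userfaultfd, flags);
#else
    errno = ENOSYS;
    return -1;
#endif
}
```

A caller would run uffd_detect_mode("/dev/userfaultfd", &fd) once, cache the
result, and then pass it to uffd_open_with_mode() for each new descriptor,
which mirrors the static open_mode/uffd_dev pair in the patch.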