* [Qemu-devel] [PATCH v2 1/9] configure: probe for memfd
2015-07-28 17:36 [Qemu-devel] [PATCH v2 0/9] vhost-user: add migration log support (for 2.5) Marc-André Lureau
@ 2015-07-28 17:36 ` Marc-André Lureau
2015-07-28 17:36 ` [Qemu-devel] [PATCH v2 2/9] posix: add linux-only memfd fallback Marc-André Lureau
` (7 subsequent siblings)
8 siblings, 0 replies; 18+ messages in thread
From: Marc-André Lureau @ 2015-07-28 17:36 UTC (permalink / raw)
To: qemu-devel
Cc: haifeng.lin, mst, thibaut.collet, jasowang, pbonzini,
Marc-André Lureau
Check if memfd_create() is part of system libc.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
---
configure | 19 +++++++++++++++++++
1 file changed, 19 insertions(+)
diff --git a/configure b/configure
index 704b34c..c754700 100755
--- a/configure
+++ b/configure
@@ -3406,6 +3406,22 @@ if compile_prog "" "" ; then
eventfd=yes
fi
+# check if memfd is supported
+memfd=no
+cat > $TMPC << EOF
+#include <sys/memfd.h>
+
+int main(void)
+{
+ return memfd_create("foo", MFD_ALLOW_SEALING);
+}
+EOF
+if compile_prog "" "" ; then
+ memfd=yes
+fi
+
+
+
# check for fallocate
fallocate=no
cat > $TMPC << EOF
@@ -4786,6 +4802,9 @@ fi
if test "$eventfd" = "yes" ; then
echo "CONFIG_EVENTFD=y" >> $config_host_mak
fi
+if test "$memfd" = "yes" ; then
+ echo "CONFIG_MEMFD=y" >> $config_host_mak
+fi
if test "$fallocate" = "yes" ; then
echo "CONFIG_FALLOCATE=y" >> $config_host_mak
fi
--
2.4.3
^ permalink raw reply related [flat|nested] 18+ messages in thread
* [Qemu-devel] [PATCH v2 2/9] posix: add linux-only memfd fallback
2015-07-28 17:36 [Qemu-devel] [PATCH v2 0/9] vhost-user: add migration log support (for 2.5) Marc-André Lureau
2015-07-28 17:36 ` [Qemu-devel] [PATCH v2 1/9] configure: probe for memfd Marc-André Lureau
@ 2015-07-28 17:36 ` Marc-André Lureau
2015-08-03 11:31 ` Marc-André Lureau
2015-07-28 17:36 ` [Qemu-devel] [PATCH v2 3/9] osdep: add memfd helpers Marc-André Lureau
` (6 subsequent siblings)
8 siblings, 1 reply; 18+ messages in thread
From: Marc-André Lureau @ 2015-07-28 17:36 UTC (permalink / raw)
To: qemu-devel
Cc: haifeng.lin, mst, thibaut.collet, jasowang, pbonzini,
Marc-André Lureau
Implement memfd_create() fallback if not available in system libc.
memfd_create() is still not included in glibc today, atlhough it's been
available since Linux 3.17 in Oct 2014.
memfd has numerous advantages over traditional shm/mmap for ipc memory
sharing with fd handler, which we are going to make use of for
vhost-user logging memory in following patches.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
---
include/qemu/memfd.h | 20 +++++++++++++++++++
util/Makefile.objs | 2 +-
util/memfd.c | 56 ++++++++++++++++++++++++++++++++++++++++++++++++++++
3 files changed, 77 insertions(+), 1 deletion(-)
create mode 100644 include/qemu/memfd.h
create mode 100644 util/memfd.c
diff --git a/include/qemu/memfd.h b/include/qemu/memfd.h
new file mode 100644
index 0000000..8b1fe6a
--- /dev/null
+++ b/include/qemu/memfd.h
@@ -0,0 +1,20 @@
+#ifndef QEMU_MEMFD_H
+#define QEMU_MEMFD_H
+
+#include "config-host.h"
+
+#ifndef F_LINUX_SPECIFIC_BASE
+#define F_LINUX_SPECIFIC_BASE 1024
+#endif
+
+#ifndef F_ADD_SEALS
+#define F_ADD_SEALS (F_LINUX_SPECIFIC_BASE + 9)
+#define F_GET_SEALS (F_LINUX_SPECIFIC_BASE + 10)
+
+#define F_SEAL_SEAL 0x0001 /* prevent further seals from being set */
+#define F_SEAL_SHRINK 0x0002 /* prevent file from shrinking */
+#define F_SEAL_GROW 0x0004 /* prevent file from growing */
+#define F_SEAL_WRITE 0x0008 /* prevent writes */
+#endif
+
+#endif /* QEMU_MEMFD_H */
diff --git a/util/Makefile.objs b/util/Makefile.objs
index 114d657..84c5485 100644
--- a/util/Makefile.objs
+++ b/util/Makefile.objs
@@ -1,6 +1,6 @@
util-obj-y = osdep.o cutils.o unicode.o qemu-timer-common.o
util-obj-$(CONFIG_WIN32) += oslib-win32.o qemu-thread-win32.o event_notifier-win32.o
-util-obj-$(CONFIG_POSIX) += oslib-posix.o qemu-thread-posix.o event_notifier-posix.o qemu-openpty.o
+util-obj-$(CONFIG_POSIX) += oslib-posix.o qemu-thread-posix.o event_notifier-posix.o qemu-openpty.o memfd.o
util-obj-y += envlist.o path.o module.o
util-obj-$(call lnot,$(CONFIG_INT128)) += host-utils.o
util-obj-y += bitmap.o bitops.o hbitmap.o
diff --git a/util/memfd.c b/util/memfd.c
new file mode 100644
index 0000000..a98d57e
--- /dev/null
+++ b/util/memfd.c
@@ -0,0 +1,56 @@
+/*
+ * memfd.c
+ *
+ * Copyright (c) 2015 Red Hat, Inc.
+ *
+ * QEMU library functions on POSIX which are shared between QEMU and
+ * the QEMU tools.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ */
+
+#include "config-host.h"
+
+#include "qemu/memfd.h"
+
+#ifdef CONFIG_MEMFD
+#include <sys/memfd.h>
+#elif defined CONFIG_LINUX
+#include <unistd.h>
+#include <sys/syscall.h>
+#include <asm/unistd.h>
+
+#ifndef MFD_CLOEXEC
+#define MFD_CLOEXEC 0x0001U
+#endif
+
+#ifndef MFD_ALLOW_SEALING
+#define MFD_ALLOW_SEALING 0x0002U
+#endif
+
+static inline int memfd_create(const char *name, unsigned int flags)
+{
+ return syscall(__NR_memfd_create, name, flags);
+}
+#else /* !LINUX */
+static inline int memfd_create(const char *name, unsigned int flags)
+{
+ return -1;
+}
+#endif
--
2.4.3
^ permalink raw reply related [flat|nested] 18+ messages in thread
* [Qemu-devel] [PATCH v2 3/9] osdep: add memfd helpers
2015-07-28 17:36 [Qemu-devel] [PATCH v2 0/9] vhost-user: add migration log support (for 2.5) Marc-André Lureau
2015-07-28 17:36 ` [Qemu-devel] [PATCH v2 1/9] configure: probe for memfd Marc-André Lureau
2015-07-28 17:36 ` [Qemu-devel] [PATCH v2 2/9] posix: add linux-only memfd fallback Marc-André Lureau
@ 2015-07-28 17:36 ` Marc-André Lureau
2015-08-03 11:32 ` Marc-André Lureau
2015-07-28 17:36 ` [Qemu-devel] [PATCH v2 4/9] vhost: alloc shareable log Marc-André Lureau
` (5 subsequent siblings)
8 siblings, 1 reply; 18+ messages in thread
From: Marc-André Lureau @ 2015-07-28 17:36 UTC (permalink / raw)
To: qemu-devel
Cc: haifeng.lin, mst, thibaut.collet, jasowang, pbonzini,
Marc-André Lureau
Add qemu_memfd_alloc/free() helpers.
The function helps to allocate and seal a memfd, and implements an
open/unlink/mmap fallback for system that do not support memfd.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
---
include/qemu/memfd.h | 4 +++
util/memfd.c | 74 ++++++++++++++++++++++++++++++++++++++++++++++++++--
2 files changed, 76 insertions(+), 2 deletions(-)
diff --git a/include/qemu/memfd.h b/include/qemu/memfd.h
index 8b1fe6a..950fb88 100644
--- a/include/qemu/memfd.h
+++ b/include/qemu/memfd.h
@@ -17,4 +17,8 @@
#define F_SEAL_WRITE 0x0008 /* prevent writes */
#endif
+void *qemu_memfd_alloc(const char *name, size_t size, unsigned int seals,
+ int *fd);
+void qemu_memfd_free(void *ptr, size_t size, int fd);
+
#endif /* QEMU_MEMFD_H */
diff --git a/util/memfd.c b/util/memfd.c
index a98d57e..8b2b785 100644
--- a/util/memfd.c
+++ b/util/memfd.c
@@ -27,6 +27,14 @@
#include "config-host.h"
+#include <glib.h>
+#include <glib/gprintf.h>
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <fcntl.h>
+#include <sys/mman.h>
+
#include "qemu/memfd.h"
#ifdef CONFIG_MEMFD
@@ -44,13 +52,75 @@
#define MFD_ALLOW_SEALING 0x0002U
#endif
-static inline int memfd_create(const char *name, unsigned int flags)
+static int memfd_create(const char *name, unsigned int flags)
{
return syscall(__NR_memfd_create, name, flags);
}
#else /* !LINUX */
-static inline int memfd_create(const char *name, unsigned int flags)
+static int memfd_create(const char *name, unsigned int flags)
{
return -1;
}
#endif
+
+void *qemu_memfd_alloc(const char *name, size_t size, unsigned int seals,
+ int *fd)
+{
+ void *ptr;
+ int mfd;
+
+ mfd = memfd_create(name, MFD_ALLOW_SEALING|MFD_CLOEXEC);
+ if (mfd != -1) {
+ if (ftruncate(mfd, size) == -1) {
+ perror("ftruncate");
+ close(mfd);
+ return NULL;
+ }
+
+ if (fcntl(mfd, F_ADD_SEALS, seals) == -1) {
+ perror("fcntl");
+ close(mfd);
+ return NULL;
+ }
+ } else {
+ const char *tmpdir = getenv("TMPDIR");
+ gchar *fname;
+
+ tmpdir = tmpdir ? tmpdir : "/tmp";
+
+ fname = g_strdup_printf("%s/memfd-XXXXXX", tmpdir);
+ mfd = mkstemp(fname);
+ unlink(fname);
+ g_free(fname);
+
+ if (mfd == -1) {
+ perror("mkstemp");
+ return NULL;
+ }
+
+ if (ftruncate(mfd, size) == -1) {
+ perror("ftruncate");
+ close(mfd);
+ return NULL;
+ }
+ }
+
+ ptr = mmap(0, size, PROT_READ|PROT_WRITE, MAP_SHARED, mfd, 0);
+ if (ptr == MAP_FAILED) {
+ perror("mmap");
+ close(mfd);
+ return NULL;
+ }
+
+ *fd = mfd;
+ return ptr;
+}
+
+void qemu_memfd_free(void *ptr, size_t size, int fd)
+{
+ if (ptr) {
+ munmap(ptr, size);
+ }
+
+ close(fd);
+}
--
2.4.3
^ permalink raw reply related [flat|nested] 18+ messages in thread
* [Qemu-devel] [PATCH v2 4/9] vhost: alloc shareable log
2015-07-28 17:36 [Qemu-devel] [PATCH v2 0/9] vhost-user: add migration log support (for 2.5) Marc-André Lureau
` (2 preceding siblings ...)
2015-07-28 17:36 ` [Qemu-devel] [PATCH v2 3/9] osdep: add memfd helpers Marc-André Lureau
@ 2015-07-28 17:36 ` Marc-André Lureau
2015-07-29 16:53 ` Dr. David Alan Gilbert
2015-07-28 17:36 ` [Qemu-devel] [PATCH v2 5/9] vhost: document log resizing Marc-André Lureau
` (4 subsequent siblings)
8 siblings, 1 reply; 18+ messages in thread
From: Marc-André Lureau @ 2015-07-28 17:36 UTC (permalink / raw)
To: qemu-devel
Cc: haifeng.lin, mst, thibaut.collet, jasowang, pbonzini,
Marc-André Lureau
If the backend is of type VHOST_BACKEND_TYPE_USER, allocate
shareable memory.
Note: vhost_log_get() can use a global "vhost_log" that can be shared by
several vhost devices. We may want instead a common shareable log and a
common non-shareable one.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
---
hw/virtio/vhost.c | 38 +++++++++++++++++++++++++++++++-------
include/hw/virtio/vhost.h | 3 ++-
2 files changed, 33 insertions(+), 8 deletions(-)
diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index 2712c6f..862e786 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -18,6 +18,7 @@
#include "qemu/atomic.h"
#include "qemu/range.h"
#include "qemu/error-report.h"
+#include "qemu/memfd.h"
#include <linux/vhost.h>
#include "exec/address-spaces.h"
#include "hw/virtio/virtio-bus.h"
@@ -286,20 +287,34 @@ static uint64_t vhost_get_log_size(struct vhost_dev *dev)
}
return log_size;
}
-static struct vhost_log *vhost_log_alloc(uint64_t size)
+
+static struct vhost_log *vhost_log_alloc(uint64_t size, bool share)
{
- struct vhost_log *log = g_malloc0(sizeof *log + size * sizeof(*(log->log)));
+ struct vhost_log *log;
+ uint64_t logsize = size * sizeof(*(log->log));
+ int fd = -1;
+
+ log = g_new0(struct vhost_log, 1);
+ if (share) {
+ log->log = qemu_memfd_alloc("vhost-log", logsize,
+ F_SEAL_GROW|F_SEAL_SHRINK|F_SEAL_SEAL, &fd);
+ memset(log->log, 0, logsize);
+ } else {
+ log->log = g_malloc0(logsize);
+ }
log->size = size;
log->refcnt = 1;
+ log->fd = fd;
return log;
}
-static struct vhost_log *vhost_log_get(uint64_t size)
+static struct vhost_log *vhost_log_get(uint64_t size, bool share)
{
- if (!vhost_log || vhost_log->size != size) {
- vhost_log = vhost_log_alloc(size);
+ if (!vhost_log || vhost_log->size != size ||
+ (share && vhost_log->fd == -1)) {
+ vhost_log = vhost_log_alloc(size, share);
} else {
++vhost_log->refcnt;
}
@@ -324,13 +339,21 @@ static void vhost_log_put(struct vhost_dev *dev, bool sync)
if (vhost_log == log) {
vhost_log = NULL;
}
+
+ if (log->fd == -1) {
+ g_free(log->log);
+ } else {
+ qemu_memfd_free(log->log, log->size * sizeof(*(log->log)),
+ log->fd);
+ }
g_free(log);
}
}
static inline void vhost_dev_log_resize(struct vhost_dev* dev, uint64_t size)
{
- struct vhost_log *log = vhost_log_get(size);
+ bool share = dev->vhost_ops->backend_type == VHOST_BACKEND_TYPE_USER;
+ struct vhost_log *log = vhost_log_get(size, share);
uint64_t log_base = (uintptr_t)log->log;
int r;
@@ -1136,9 +1159,10 @@ int vhost_dev_start(struct vhost_dev *hdev, VirtIODevice *vdev)
if (hdev->log_enabled) {
uint64_t log_base;
+ bool share = hdev->vhost_ops->backend_type == VHOST_BACKEND_TYPE_USER;
hdev->log_size = vhost_get_log_size(hdev);
- hdev->log = vhost_log_get(hdev->log_size);
+ hdev->log = vhost_log_get(hdev->log_size, share);
log_base = (uintptr_t)hdev->log->log;
r = hdev->vhost_ops->vhost_call(hdev, VHOST_SET_LOG_BASE,
hdev->log_size ? &log_base : NULL);
diff --git a/include/hw/virtio/vhost.h b/include/hw/virtio/vhost.h
index 6467c73..ab1dcac 100644
--- a/include/hw/virtio/vhost.h
+++ b/include/hw/virtio/vhost.h
@@ -31,7 +31,8 @@ typedef unsigned long vhost_log_chunk_t;
struct vhost_log {
unsigned long long size;
int refcnt;
- vhost_log_chunk_t log[0];
+ int fd;
+ vhost_log_chunk_t *log;
};
struct vhost_memory;
--
2.4.3
^ permalink raw reply related [flat|nested] 18+ messages in thread
* Re: [Qemu-devel] [PATCH v2 4/9] vhost: alloc shareable log
2015-07-28 17:36 ` [Qemu-devel] [PATCH v2 4/9] vhost: alloc shareable log Marc-André Lureau
@ 2015-07-29 16:53 ` Dr. David Alan Gilbert
2015-07-29 17:04 ` Marc-André Lureau
0 siblings, 1 reply; 18+ messages in thread
From: Dr. David Alan Gilbert @ 2015-07-29 16:53 UTC (permalink / raw)
To: Marc-André Lureau
Cc: haifeng.lin, mst, thibaut.collet, jasowang, qemu-devel, pbonzini
* Marc-André Lureau (marcandre.lureau@redhat.com) wrote:
> If the backend is of type VHOST_BACKEND_TYPE_USER, allocate
> shareable memory.
>
> Note: vhost_log_get() can use a global "vhost_log" that can be shared by
> several vhost devices. We may want instead a common shareable log and a
> common non-shareable one.
>
> Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
> ---
> hw/virtio/vhost.c | 38 +++++++++++++++++++++++++++++++-------
> include/hw/virtio/vhost.h | 3 ++-
> 2 files changed, 33 insertions(+), 8 deletions(-)
>
> diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
> index 2712c6f..862e786 100644
> --- a/hw/virtio/vhost.c
> +++ b/hw/virtio/vhost.c
> @@ -18,6 +18,7 @@
> #include "qemu/atomic.h"
> #include "qemu/range.h"
> #include "qemu/error-report.h"
> +#include "qemu/memfd.h"
> #include <linux/vhost.h>
> #include "exec/address-spaces.h"
> #include "hw/virtio/virtio-bus.h"
> @@ -286,20 +287,34 @@ static uint64_t vhost_get_log_size(struct vhost_dev *dev)
> }
> return log_size;
> }
> -static struct vhost_log *vhost_log_alloc(uint64_t size)
> +
> +static struct vhost_log *vhost_log_alloc(uint64_t size, bool share)
> {
> - struct vhost_log *log = g_malloc0(sizeof *log + size * sizeof(*(log->log)));
> + struct vhost_log *log;
> + uint64_t logsize = size * sizeof(*(log->log));
> + int fd = -1;
> +
> + log = g_new0(struct vhost_log, 1);
> + if (share) {
> + log->log = qemu_memfd_alloc("vhost-log", logsize,
> + F_SEAL_GROW|F_SEAL_SHRINK|F_SEAL_SEAL, &fd);
> + memset(log->log, 0, logsize);
qemu_memfd_alloc can return NULL can't it - so that needs checking?
> + } else {
> + log->log = g_malloc0(logsize);
I know the old code also used g_malloc0, but if the log isn't 'small'
then g_try_malloc0 is possibly safer and properly return errors
if it can't be allocated.
Dave
> + }
>
> log->size = size;
> log->refcnt = 1;
> + log->fd = fd;
>
> return log;
> }
>
> -static struct vhost_log *vhost_log_get(uint64_t size)
> +static struct vhost_log *vhost_log_get(uint64_t size, bool share)
> {
> - if (!vhost_log || vhost_log->size != size) {
> - vhost_log = vhost_log_alloc(size);
> + if (!vhost_log || vhost_log->size != size ||
> + (share && vhost_log->fd == -1)) {
> + vhost_log = vhost_log_alloc(size, share);
> } else {
> ++vhost_log->refcnt;
> }
> @@ -324,13 +339,21 @@ static void vhost_log_put(struct vhost_dev *dev, bool sync)
> if (vhost_log == log) {
> vhost_log = NULL;
> }
> +
> + if (log->fd == -1) {
> + g_free(log->log);
> + } else {
> + qemu_memfd_free(log->log, log->size * sizeof(*(log->log)),
> + log->fd);
> + }
> g_free(log);
> }
> }
>
> static inline void vhost_dev_log_resize(struct vhost_dev* dev, uint64_t size)
> {
> - struct vhost_log *log = vhost_log_get(size);
> + bool share = dev->vhost_ops->backend_type == VHOST_BACKEND_TYPE_USER;
> + struct vhost_log *log = vhost_log_get(size, share);
> uint64_t log_base = (uintptr_t)log->log;
> int r;
>
> @@ -1136,9 +1159,10 @@ int vhost_dev_start(struct vhost_dev *hdev, VirtIODevice *vdev)
>
> if (hdev->log_enabled) {
> uint64_t log_base;
> + bool share = hdev->vhost_ops->backend_type == VHOST_BACKEND_TYPE_USER;
>
> hdev->log_size = vhost_get_log_size(hdev);
> - hdev->log = vhost_log_get(hdev->log_size);
> + hdev->log = vhost_log_get(hdev->log_size, share);
> log_base = (uintptr_t)hdev->log->log;
> r = hdev->vhost_ops->vhost_call(hdev, VHOST_SET_LOG_BASE,
> hdev->log_size ? &log_base : NULL);
> diff --git a/include/hw/virtio/vhost.h b/include/hw/virtio/vhost.h
> index 6467c73..ab1dcac 100644
> --- a/include/hw/virtio/vhost.h
> +++ b/include/hw/virtio/vhost.h
> @@ -31,7 +31,8 @@ typedef unsigned long vhost_log_chunk_t;
> struct vhost_log {
> unsigned long long size;
> int refcnt;
> - vhost_log_chunk_t log[0];
> + int fd;
> + vhost_log_chunk_t *log;
> };
>
> struct vhost_memory;
> --
> 2.4.3
>
>
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Qemu-devel] [PATCH v2 4/9] vhost: alloc shareable log
2015-07-29 16:53 ` Dr. David Alan Gilbert
@ 2015-07-29 17:04 ` Marc-André Lureau
2015-07-29 17:08 ` Dr. David Alan Gilbert
2015-07-29 17:22 ` Michael S. Tsirkin
0 siblings, 2 replies; 18+ messages in thread
From: Marc-André Lureau @ 2015-07-29 17:04 UTC (permalink / raw)
To: Dr. David Alan Gilbert
Cc: haifeng lin, mst, thibaut collet, jasowang, qemu-devel, pbonzini,
Marc-André Lureau
Hi
----- Original Message -----
> * Marc-André Lureau (marcandre.lureau@redhat.com) wrote:
> > If the backend is of type VHOST_BACKEND_TYPE_USER, allocate
> > shareable memory.
> >
> > Note: vhost_log_get() can use a global "vhost_log" that can be shared by
> > several vhost devices. We may want instead a common shareable log and a
> > common non-shareable one.
> >
> > Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
> > ---
> > hw/virtio/vhost.c | 38 +++++++++++++++++++++++++++++++-------
> > include/hw/virtio/vhost.h | 3 ++-
> > 2 files changed, 33 insertions(+), 8 deletions(-)
> >
> > diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
> > index 2712c6f..862e786 100644
> > --- a/hw/virtio/vhost.c
> > +++ b/hw/virtio/vhost.c
> > @@ -18,6 +18,7 @@
> > #include "qemu/atomic.h"
> > #include "qemu/range.h"
> > #include "qemu/error-report.h"
> > +#include "qemu/memfd.h"
> > #include <linux/vhost.h>
> > #include "exec/address-spaces.h"
> > #include "hw/virtio/virtio-bus.h"
> > @@ -286,20 +287,34 @@ static uint64_t vhost_get_log_size(struct vhost_dev
> > *dev)
> > }
> > return log_size;
> > }
> > -static struct vhost_log *vhost_log_alloc(uint64_t size)
> > +
> > +static struct vhost_log *vhost_log_alloc(uint64_t size, bool share)
> > {
> > - struct vhost_log *log = g_malloc0(sizeof *log + size *
> > sizeof(*(log->log)));
> > + struct vhost_log *log;
> > + uint64_t logsize = size * sizeof(*(log->log));
> > + int fd = -1;
> > +
> > + log = g_new0(struct vhost_log, 1);
> > + if (share) {
> > + log->log = qemu_memfd_alloc("vhost-log", logsize,
> > + F_SEAL_GROW|F_SEAL_SHRINK|F_SEAL_SEAL,
> > &fd);
> > + memset(log->log, 0, logsize);
>
> qemu_memfd_alloc can return NULL can't it - so that needs checking?
>
> > + } else {
> > + log->log = g_malloc0(logsize);
>
> I know the old code also used g_malloc0, but if the log isn't 'small'
> then g_try_malloc0 is possibly safer and properly return errors
> if it can't be allocated.
Yeah, I agree it's better to check for the return value here (as you pointed out, I followed the existing pattern).
Maybe we are just screwed if it happens, live migration shouldn't succeed if it can't be done properly imho.
What's your take on this Michael?
cheers
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Qemu-devel] [PATCH v2 4/9] vhost: alloc shareable log
2015-07-29 17:04 ` Marc-André Lureau
@ 2015-07-29 17:08 ` Dr. David Alan Gilbert
2015-07-29 17:22 ` Michael S. Tsirkin
1 sibling, 0 replies; 18+ messages in thread
From: Dr. David Alan Gilbert @ 2015-07-29 17:08 UTC (permalink / raw)
To: Marc-André Lureau
Cc: haifeng lin, mst, thibaut collet, jasowang, qemu-devel, pbonzini,
Marc-André Lureau
* Marc-André Lureau (mlureau@redhat.com) wrote:
> Hi
>
> ----- Original Message -----
> > * Marc-André Lureau (marcandre.lureau@redhat.com) wrote:
> > > If the backend is of type VHOST_BACKEND_TYPE_USER, allocate
> > > shareable memory.
> > >
> > > Note: vhost_log_get() can use a global "vhost_log" that can be shared by
> > > several vhost devices. We may want instead a common shareable log and a
> > > common non-shareable one.
> > >
> > > Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
> > > ---
> > > hw/virtio/vhost.c | 38 +++++++++++++++++++++++++++++++-------
> > > include/hw/virtio/vhost.h | 3 ++-
> > > 2 files changed, 33 insertions(+), 8 deletions(-)
> > >
> > > diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
> > > index 2712c6f..862e786 100644
> > > --- a/hw/virtio/vhost.c
> > > +++ b/hw/virtio/vhost.c
> > > @@ -18,6 +18,7 @@
> > > #include "qemu/atomic.h"
> > > #include "qemu/range.h"
> > > #include "qemu/error-report.h"
> > > +#include "qemu/memfd.h"
> > > #include <linux/vhost.h>
> > > #include "exec/address-spaces.h"
> > > #include "hw/virtio/virtio-bus.h"
> > > @@ -286,20 +287,34 @@ static uint64_t vhost_get_log_size(struct vhost_dev
> > > *dev)
> > > }
> > > return log_size;
> > > }
> > > -static struct vhost_log *vhost_log_alloc(uint64_t size)
> > > +
> > > +static struct vhost_log *vhost_log_alloc(uint64_t size, bool share)
> > > {
> > > - struct vhost_log *log = g_malloc0(sizeof *log + size *
> > > sizeof(*(log->log)));
> > > + struct vhost_log *log;
> > > + uint64_t logsize = size * sizeof(*(log->log));
> > > + int fd = -1;
> > > +
> > > + log = g_new0(struct vhost_log, 1);
> > > + if (share) {
> > > + log->log = qemu_memfd_alloc("vhost-log", logsize,
> > > + F_SEAL_GROW|F_SEAL_SHRINK|F_SEAL_SEAL,
> > > &fd);
> > > + memset(log->log, 0, logsize);
> >
> > qemu_memfd_alloc can return NULL can't it - so that needs checking?
> >
> > > + } else {
> > > + log->log = g_malloc0(logsize);
> >
> > I know the old code also used g_malloc0, but if the log isn't 'small'
> > then g_try_malloc0 is possibly safer and properly return errors
> > if it can't be allocated.
>
> Yeah, I agree it's better to check for the return value here (as you pointed out, I followed the existing pattern).
>
> Maybe we are just screwed if it happens, live migration shouldn't succeed if it can't be done properly imho.
Probably; we try to be careful particularly on the source of memory allocations
during migration, since the VM is running just fine, and it's unfortunate
to kill it at that point.
Dave
> What's your take on this Michael?
>
> cheers
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Qemu-devel] [PATCH v2 4/9] vhost: alloc shareable log
2015-07-29 17:04 ` Marc-André Lureau
2015-07-29 17:08 ` Dr. David Alan Gilbert
@ 2015-07-29 17:22 ` Michael S. Tsirkin
1 sibling, 0 replies; 18+ messages in thread
From: Michael S. Tsirkin @ 2015-07-29 17:22 UTC (permalink / raw)
To: Marc-André Lureau
Cc: haifeng lin, thibaut collet, jasowang, Dr. David Alan Gilbert,
qemu-devel, pbonzini, Marc-André Lureau
On Wed, Jul 29, 2015 at 01:04:51PM -0400, Marc-André Lureau wrote:
> Hi
>
> ----- Original Message -----
> > * Marc-André Lureau (marcandre.lureau@redhat.com) wrote:
> > > If the backend is of type VHOST_BACKEND_TYPE_USER, allocate
> > > shareable memory.
> > >
> > > Note: vhost_log_get() can use a global "vhost_log" that can be shared by
> > > several vhost devices. We may want instead a common shareable log and a
> > > common non-shareable one.
> > >
> > > Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
> > > ---
> > > hw/virtio/vhost.c | 38 +++++++++++++++++++++++++++++++-------
> > > include/hw/virtio/vhost.h | 3 ++-
> > > 2 files changed, 33 insertions(+), 8 deletions(-)
> > >
> > > diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
> > > index 2712c6f..862e786 100644
> > > --- a/hw/virtio/vhost.c
> > > +++ b/hw/virtio/vhost.c
> > > @@ -18,6 +18,7 @@
> > > #include "qemu/atomic.h"
> > > #include "qemu/range.h"
> > > #include "qemu/error-report.h"
> > > +#include "qemu/memfd.h"
> > > #include <linux/vhost.h>
> > > #include "exec/address-spaces.h"
> > > #include "hw/virtio/virtio-bus.h"
> > > @@ -286,20 +287,34 @@ static uint64_t vhost_get_log_size(struct vhost_dev
> > > *dev)
> > > }
> > > return log_size;
> > > }
> > > -static struct vhost_log *vhost_log_alloc(uint64_t size)
> > > +
> > > +static struct vhost_log *vhost_log_alloc(uint64_t size, bool share)
> > > {
> > > - struct vhost_log *log = g_malloc0(sizeof *log + size *
> > > sizeof(*(log->log)));
> > > + struct vhost_log *log;
> > > + uint64_t logsize = size * sizeof(*(log->log));
> > > + int fd = -1;
> > > +
> > > + log = g_new0(struct vhost_log, 1);
> > > + if (share) {
> > > + log->log = qemu_memfd_alloc("vhost-log", logsize,
> > > + F_SEAL_GROW|F_SEAL_SHRINK|F_SEAL_SEAL,
> > > &fd);
> > > + memset(log->log, 0, logsize);
> >
> > qemu_memfd_alloc can return NULL can't it - so that needs checking?
> >
> > > + } else {
> > > + log->log = g_malloc0(logsize);
> >
> > I know the old code also used g_malloc0, but if the log isn't 'small'
> > then g_try_malloc0 is possibly safer and properly return errors
> > if it can't be allocated.
>
> Yeah, I agree it's better to check for the return value here (as you pointed out, I followed the existing pattern).
>
> Maybe we are just screwed if it happens, live migration shouldn't succeed if it can't be done properly imho.
>
> What's your take on this Michael?
>
> cheers
I guess we could fail migration in that case ...
Since current code uses g_malloc0 I feel this can be addressed by
a separate patch later.
--
MST
^ permalink raw reply [flat|nested] 18+ messages in thread
* [Qemu-devel] [PATCH v2 5/9] vhost: document log resizing
2015-07-28 17:36 [Qemu-devel] [PATCH v2 0/9] vhost-user: add migration log support (for 2.5) Marc-André Lureau
` (3 preceding siblings ...)
2015-07-28 17:36 ` [Qemu-devel] [PATCH v2 4/9] vhost: alloc shareable log Marc-André Lureau
@ 2015-07-28 17:36 ` Marc-André Lureau
2015-07-28 17:36 ` [Qemu-devel] [PATCH v2 6/9] vhost: use variable arguments for vhost_call() Marc-André Lureau
` (3 subsequent siblings)
8 siblings, 0 replies; 18+ messages in thread
From: Marc-André Lureau @ 2015-07-28 17:36 UTC (permalink / raw)
To: qemu-devel
Cc: haifeng.lin, mst, thibaut.collet, jasowang, pbonzini,
Marc-André Lureau
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
---
hw/virtio/vhost.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index 862e786..057d548 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -357,6 +357,8 @@ static inline void vhost_dev_log_resize(struct vhost_dev* dev, uint64_t size)
uint64_t log_base = (uintptr_t)log->log;
int r;
+ /* inform backend of log switching, this must be done before
+ releasing the current log, to ensure no logging is lost */
r = dev->vhost_ops->vhost_call(dev, VHOST_SET_LOG_BASE, &log_base);
assert(r >= 0);
vhost_log_put(dev, true);
--
2.4.3
^ permalink raw reply related [flat|nested] 18+ messages in thread
* [Qemu-devel] [PATCH v2 6/9] vhost: use variable arguments for vhost_call()
2015-07-28 17:36 [Qemu-devel] [PATCH v2 0/9] vhost-user: add migration log support (for 2.5) Marc-André Lureau
` (4 preceding siblings ...)
2015-07-28 17:36 ` [Qemu-devel] [PATCH v2 5/9] vhost: document log resizing Marc-André Lureau
@ 2015-07-28 17:36 ` Marc-André Lureau
2015-08-01 2:47 ` l
2015-07-28 17:36 ` [Qemu-devel] [PATCH v2 7/9] vhost-user: start and end the va_list Marc-André Lureau
` (2 subsequent siblings)
8 siblings, 1 reply; 18+ messages in thread
From: Marc-André Lureau @ 2015-07-28 17:36 UTC (permalink / raw)
To: qemu-devel
Cc: haifeng.lin, mst, thibaut.collet, jasowang, pbonzini,
Marc-André Lureau
It is useful to pass extra arguments to the funtions, for
various backend needs.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
---
hw/virtio/vhost-backend.c | 4 ++--
hw/virtio/vhost-user.c | 4 ++--
include/hw/virtio/vhost-backend.h | 6 ++++--
3 files changed, 8 insertions(+), 6 deletions(-)
diff --git a/hw/virtio/vhost-backend.c b/hw/virtio/vhost-backend.c
index 4d68a27..7255089 100644
--- a/hw/virtio/vhost-backend.c
+++ b/hw/virtio/vhost-backend.c
@@ -14,8 +14,8 @@
#include <sys/ioctl.h>
-static int vhost_kernel_call(struct vhost_dev *dev, unsigned long int request,
- void *arg)
+static int vhost_kernel_call(struct vhost_dev *dev,
+ unsigned long int request, void *arg, ...)
{
int fd = (uintptr_t) dev->opaque;
diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
index 4993b63..8b6d7e7 100644
--- a/hw/virtio/vhost-user.c
+++ b/hw/virtio/vhost-user.c
@@ -190,8 +190,8 @@ static int vhost_user_write(struct vhost_dev *dev, VhostUserMsg *msg,
0 : -1;
}
-static int vhost_user_call(struct vhost_dev *dev, unsigned long int request,
- void *arg)
+static int vhost_user_call(struct vhost_dev *dev,
+ unsigned long int request, void *arg, ...)
{
VhostUserMsg msg;
VhostUserRequest msg_request;
diff --git a/include/hw/virtio/vhost-backend.h b/include/hw/virtio/vhost-backend.h
index e472f29..36fa0f7 100644
--- a/include/hw/virtio/vhost-backend.h
+++ b/include/hw/virtio/vhost-backend.h
@@ -11,6 +11,8 @@
#ifndef VHOST_BACKEND_H_
#define VHOST_BACKEND_H_
+#include <stdarg.h>
+
typedef enum VhostBackendType {
VHOST_BACKEND_TYPE_NONE = 0,
VHOST_BACKEND_TYPE_KERNEL = 1,
@@ -20,8 +22,8 @@ typedef enum VhostBackendType {
struct vhost_dev;
-typedef int (*vhost_call)(struct vhost_dev *dev, unsigned long int request,
- void *arg);
+typedef int (*vhost_call)(struct vhost_dev *dev,
+ unsigned long int request, void *arg, ...);
typedef int (*vhost_backend_init)(struct vhost_dev *dev, void *opaque);
typedef int (*vhost_backend_cleanup)(struct vhost_dev *dev);
--
2.4.3
^ permalink raw reply related [flat|nested] 18+ messages in thread
* Re: [Qemu-devel] [PATCH v2 6/9] vhost: use variable arguments for vhost_call()
2015-07-28 17:36 ` [Qemu-devel] [PATCH v2 6/9] vhost: use variable arguments for vhost_call() Marc-André Lureau
@ 2015-08-01 2:47 ` l
2015-08-03 11:03 ` Marc-André Lureau
0 siblings, 1 reply; 18+ messages in thread
From: l @ 2015-08-01 2:47 UTC (permalink / raw)
To: Marc-André Lureau, qemu-devel
Cc: haifeng.lin, mst, thibaut.collet, jasowang, pbonzini
Marc-André Lureau <marcandre.lureau@redhat.com> writes:
Hi André,
> It is useful to pass extra arguments to the funtions, for
> various backend needs.
>
> Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
> ---
> hw/virtio/vhost-backend.c | 4 ++--
> hw/virtio/vhost-user.c | 4 ++--
> include/hw/virtio/vhost-backend.h | 6 ++++--
> 3 files changed, 8 insertions(+), 6 deletions(-)
>
> diff --git a/hw/virtio/vhost-backend.c b/hw/virtio/vhost-backend.c
> index 4d68a27..7255089 100644
> --- a/hw/virtio/vhost-backend.c
> +++ b/hw/virtio/vhost-backend.c
> @@ -14,8 +14,8 @@
>
> #include <sys/ioctl.h>
>
> -static int vhost_kernel_call(struct vhost_dev *dev, unsigned long int request,
> - void *arg)
> +static int vhost_kernel_call(struct vhost_dev *dev,
> + unsigned long int request, void *arg, ...)
I Couldn't see in your set where you change vhost_kernel_call() implementation or
make any use of this change.
Regards...
--
Leandro Dorileo
> {
> int fd = (uintptr_t) dev->opaque;
>
> diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
> index 4993b63..8b6d7e7 100644
> --- a/hw/virtio/vhost-user.c
> +++ b/hw/virtio/vhost-user.c
> @@ -190,8 +190,8 @@ static int vhost_user_write(struct vhost_dev *dev, VhostUserMsg *msg,
> 0 : -1;
> }
>
> -static int vhost_user_call(struct vhost_dev *dev, unsigned long int request,
> - void *arg)
> +static int vhost_user_call(struct vhost_dev *dev,
> + unsigned long int request, void *arg, ...)
> {
> VhostUserMsg msg;
> VhostUserRequest msg_request;
> diff --git a/include/hw/virtio/vhost-backend.h b/include/hw/virtio/vhost-backend.h
> index e472f29..36fa0f7 100644
> --- a/include/hw/virtio/vhost-backend.h
> +++ b/include/hw/virtio/vhost-backend.h
> @@ -11,6 +11,8 @@
> #ifndef VHOST_BACKEND_H_
> #define VHOST_BACKEND_H_
>
> +#include <stdarg.h>
> +
> typedef enum VhostBackendType {
> VHOST_BACKEND_TYPE_NONE = 0,
> VHOST_BACKEND_TYPE_KERNEL = 1,
> @@ -20,8 +22,8 @@ typedef enum VhostBackendType {
>
> struct vhost_dev;
>
> -typedef int (*vhost_call)(struct vhost_dev *dev, unsigned long int request,
> - void *arg);
> +typedef int (*vhost_call)(struct vhost_dev *dev,
> + unsigned long int request, void *arg, ...);
> typedef int (*vhost_backend_init)(struct vhost_dev *dev, void *opaque);
> typedef int (*vhost_backend_cleanup)(struct vhost_dev *dev);
>
> --
> 2.4.3
>
>
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Qemu-devel] [PATCH v2 6/9] vhost: use variable arguments for vhost_call()
2015-08-01 2:47 ` l
@ 2015-08-03 11:03 ` Marc-André Lureau
0 siblings, 0 replies; 18+ messages in thread
From: Marc-André Lureau @ 2015-08-03 11:03 UTC (permalink / raw)
To: l
Cc: Linhaifeng, Michael S. Tsirkin, Thibaut Collet, Jason Wang, QEMU,
Paolo Bonzini
Hi
On Sat, Aug 1, 2015 at 4:47 AM, <l@dorileo.org> wrote:
>> diff --git a/hw/virtio/vhost-backend.c b/hw/virtio/vhost-backend.c
>> index 4d68a27..7255089 100644
>> --- a/hw/virtio/vhost-backend.c
>> +++ b/hw/virtio/vhost-backend.c
>> @@ -14,8 +14,8 @@
>>
>> #include <sys/ioctl.h>
>>
>> -static int vhost_kernel_call(struct vhost_dev *dev, unsigned long int request,
>> - void *arg)
>> +static int vhost_kernel_call(struct vhost_dev *dev,
>> + unsigned long int request, void *arg, ...)
>
>
>
> I Couldn't see in your set where you change vhost_kernel_call() implementation or
> make any use of this change.
>
That's expected, the change is necessary for vhost-user only atm. They both use
the same callback, so vhost-kernel must be changed too and ignore the
extra args.
cheers
^ permalink raw reply [flat|nested] 18+ messages in thread
* [Qemu-devel] [PATCH v2 7/9] vhost-user: start and end the va_list
2015-07-28 17:36 [Qemu-devel] [PATCH v2 0/9] vhost-user: add migration log support (for 2.5) Marc-André Lureau
` (5 preceding siblings ...)
2015-07-28 17:36 ` [Qemu-devel] [PATCH v2 6/9] vhost: use variable arguments for vhost_call() Marc-André Lureau
@ 2015-07-28 17:36 ` Marc-André Lureau
2015-07-28 17:36 ` [Qemu-devel] [PATCH v2 8/9] vhost-user: send log shm fd along with log_base Marc-André Lureau
2015-07-28 17:36 ` [Qemu-devel] [PATCH v2 9/9] vhost-user: document migration log Marc-André Lureau
8 siblings, 0 replies; 18+ messages in thread
From: Marc-André Lureau @ 2015-07-28 17:36 UTC (permalink / raw)
To: qemu-devel
Cc: haifeng.lin, mst, thibaut.collet, jasowang, pbonzini,
Marc-André Lureau
Use a common start and exit branch to open and cleanup the va_list.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
---
hw/virtio/vhost-user.c | 32 ++++++++++++++++++++------------
1 file changed, 20 insertions(+), 12 deletions(-)
diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
index 8b6d7e7..21ecbcd 100644
--- a/hw/virtio/vhost-user.c
+++ b/hw/virtio/vhost-user.c
@@ -198,11 +198,13 @@ static int vhost_user_call(struct vhost_dev *dev,
struct vhost_vring_file *file = 0;
int need_reply = 0;
int fds[VHOST_MEMORY_MAX_NREGIONS];
- int i, fd;
+ int i, fd, ret = 0;
size_t fd_num = 0;
+ va_list ap;
assert(dev->vhost_ops->backend_type == VHOST_BACKEND_TYPE_USER);
+ va_start(ap, arg);
msg_request = vhost_user_request_translate(request);
msg.request = msg_request;
msg.flags = VHOST_USER_VERSION;
@@ -247,7 +249,8 @@ static int vhost_user_call(struct vhost_dev *dev,
if (!fd_num) {
error_report("Failed initializing vhost-user memory map, "
"consider using -object memory-backend-file share=on");
- return -1;
+ ret = -1;
+ goto end;
}
msg.size = sizeof(m.memory.nregions);
@@ -291,48 +294,53 @@ static int vhost_user_call(struct vhost_dev *dev,
break;
default:
error_report("vhost-user trying to send unhandled ioctl");
- return -1;
- break;
+ ret = -1;
+ goto end;
}
if (vhost_user_write(dev, &msg, fds, fd_num) < 0) {
- return 0;
+ goto end;
}
if (need_reply) {
if (vhost_user_read(dev, &msg) < 0) {
- return 0;
+ goto end;
}
if (msg_request != msg.request) {
error_report("Received unexpected msg type."
" Expected %d received %d", msg_request, msg.request);
- return -1;
+ ret = -1;
+ goto end;
}
switch (msg_request) {
case VHOST_USER_GET_FEATURES:
if (msg.size != sizeof(m.u64)) {
error_report("Received bad msg size.");
- return -1;
+ ret = -1;
+ goto end;
}
*((__u64 *) arg) = msg.u64;
break;
case VHOST_USER_GET_VRING_BASE:
if (msg.size != sizeof(m.state)) {
error_report("Received bad msg size.");
- return -1;
+ ret = -1;
+ goto end;
}
memcpy(arg, &msg.state, sizeof(struct vhost_vring_state));
break;
default:
error_report("Received unexpected msg type.");
- return -1;
- break;
+ ret = -1;
+ goto end;
}
}
- return 0;
+end:
+ va_end(ap);
+ return ret;
}
static int vhost_user_init(struct vhost_dev *dev, void *opaque)
--
2.4.3
^ permalink raw reply related [flat|nested] 18+ messages in thread
* [Qemu-devel] [PATCH v2 8/9] vhost-user: send log shm fd along with log_base
2015-07-28 17:36 [Qemu-devel] [PATCH v2 0/9] vhost-user: add migration log support (for 2.5) Marc-André Lureau
` (6 preceding siblings ...)
2015-07-28 17:36 ` [Qemu-devel] [PATCH v2 7/9] vhost-user: start and end the va_list Marc-André Lureau
@ 2015-07-28 17:36 ` Marc-André Lureau
2015-07-28 17:36 ` [Qemu-devel] [PATCH v2 9/9] vhost-user: document migration log Marc-André Lureau
8 siblings, 0 replies; 18+ messages in thread
From: Marc-André Lureau @ 2015-07-28 17:36 UTC (permalink / raw)
To: qemu-devel
Cc: haifeng.lin, mst, thibaut.collet, jasowang, pbonzini,
Marc-André Lureau
Send the shm for the dirty pages logging if the backend support
VHOST_USER_PROTOCOL_F_LOG_SHMFD.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
---
hw/virtio/vhost-user.c | 16 ++++++++++++++--
hw/virtio/vhost.c | 5 +++--
2 files changed, 17 insertions(+), 4 deletions(-)
diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
index 21ecbcd..b2f46a9 100644
--- a/hw/virtio/vhost-user.c
+++ b/hw/virtio/vhost-user.c
@@ -26,7 +26,9 @@
#define VHOST_MEMORY_MAX_NREGIONS 8
#define VHOST_USER_F_PROTOCOL_FEATURES 30
-#define VHOST_USER_PROTOCOL_FEATURE_MASK 0x0ULL
+
+#define VHOST_USER_PROTOCOL_FEATURE_MASK 0x1ULL
+#define VHOST_USER_PROTOCOL_F_LOG_SHMFD 0
typedef enum VhostUserRequest {
VHOST_USER_NONE = 0,
@@ -215,8 +217,18 @@ static int vhost_user_call(struct vhost_dev *dev,
need_reply = 1;
break;
+ case VHOST_USER_SET_LOG_BASE: {
+ struct vhost_log *log = va_arg(ap, struct vhost_log *);
+
+ if (__virtio_has_feature(dev->protocol_features,
+ VHOST_USER_PROTOCOL_F_LOG_SHMFD) &&
+ log->fd != -1) {
+ fds[fd_num++] = log->fd;
+ }
+ }
+ /* fall through */
+
case VHOST_USER_SET_FEATURES:
- case VHOST_USER_SET_LOG_BASE:
msg.u64 = *((__u64 *) arg);
msg.size = sizeof(m.u64);
break;
diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index 057d548..ed8b1a5 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -359,7 +359,7 @@ static inline void vhost_dev_log_resize(struct vhost_dev* dev, uint64_t size)
/* inform backend of log switching, this must be done before
releasing the current log, to ensure no logging is lost */
- r = dev->vhost_ops->vhost_call(dev, VHOST_SET_LOG_BASE, &log_base);
+ r = dev->vhost_ops->vhost_call(dev, VHOST_SET_LOG_BASE, &log_base, log);
assert(r >= 0);
vhost_log_put(dev, true);
dev->log = log;
@@ -1167,7 +1167,8 @@ int vhost_dev_start(struct vhost_dev *hdev, VirtIODevice *vdev)
hdev->log = vhost_log_get(hdev->log_size, share);
log_base = (uintptr_t)hdev->log->log;
r = hdev->vhost_ops->vhost_call(hdev, VHOST_SET_LOG_BASE,
- hdev->log_size ? &log_base : NULL);
+ hdev->log_size ? &log_base : NULL,
+ hdev->log);
if (r < 0) {
r = -errno;
goto fail_log;
--
2.4.3
^ permalink raw reply related [flat|nested] 18+ messages in thread
* [Qemu-devel] [PATCH v2 9/9] vhost-user: document migration log
2015-07-28 17:36 [Qemu-devel] [PATCH v2 0/9] vhost-user: add migration log support (for 2.5) Marc-André Lureau
` (7 preceding siblings ...)
2015-07-28 17:36 ` [Qemu-devel] [PATCH v2 8/9] vhost-user: send log shm fd along with log_base Marc-André Lureau
@ 2015-07-28 17:36 ` Marc-André Lureau
8 siblings, 0 replies; 18+ messages in thread
From: Marc-André Lureau @ 2015-07-28 17:36 UTC (permalink / raw)
To: qemu-devel
Cc: haifeng.lin, mst, thibaut.collet, jasowang, pbonzini,
Marc-André Lureau
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
---
docs/specs/vhost-user.txt | 42 ++++++++++++++++++++++++++++++++++++++++++
1 file changed, 42 insertions(+)
diff --git a/docs/specs/vhost-user.txt b/docs/specs/vhost-user.txt
index 0062baa..2cf9ab4 100644
--- a/docs/specs/vhost-user.txt
+++ b/docs/specs/vhost-user.txt
@@ -120,6 +120,7 @@ There are several messages that the master sends with file descriptors passed
in the ancillary data:
* VHOST_SET_MEM_TABLE
+ * VHOST_SET_LOG_BASE (if VHOST_USER_PROTOCOL_F_LOG_SHMFD)
* VHOST_SET_LOG_FD
* VHOST_SET_VRING_KICK
* VHOST_SET_VRING_CALL
@@ -135,6 +136,11 @@ As older slaves don't support negotiating protocol features,
a feature bit was dedicated for this purpose:
#define VHOST_USER_F_PROTOCOL_FEATURES 30
+Protocol features
+-----------------
+
+#define VHOST_USER_PROTOCOL_F_LOG_SHMFD 0
+
Message types
-------------
@@ -301,3 +307,39 @@ Message types
Bits (0-7) of the payload contain the vring index. Bit 8 is the
invalid FD flag. This flag is set when there is no file descriptor
in the ancillary data.
+
+Migration
+---------
+
+During live migration, the master may need to track the modifications
+the slave makes to the memory mapped regions. The client should mark
+the dirty pages in a log. Once it complies to this logging, it may
+declare VHOST_F_LOG_ALL has a vhost feature.
+
+All the modifications to memory pointed by vring "descriptor" should
+be marked. Modifications to "used" vring should be marked if
+VHOST_VRING_F_LOG is part of ring's features.
+
+Dirty pages are of size:
+#define VHOST_LOG_PAGE 0x1000
+
+The log memory fd is provided in the ancillary data of
+VHOST_USER_SET_LOG_BASE message when the slave has
+VHOST_USER_PROTOCOL_F_LOG_SHMFD protocol feature.
+
+The size of the log may be computed by using all the known guest
+addresses. The log covers from address 0 to the maximum of guest
+regions. In pseudo-code, to mark page at "addr" as dirty:
+
+page = addr / VHOST_LOG_PAGE
+log[page / 8] |= 1 << page % 8
+
+Use atomic operations, as the log may be concurrently manipulated.
+
+VHOST_USER_SET_LOG_FD is an optional message with an eventfd in
+ancillary data, it may be used to inform the master that the log has
+been modified.
+
+Once the source has finished migration, VHOST_USER_RESET_OWNER message
+will be sent by the source. No further update must be done before the
+destination takes over with new regions & rings.
--
2.4.3
^ permalink raw reply related [flat|nested] 18+ messages in thread