* [Qemu-devel] [PATCH RFC 0/6] vhost-user: add migration log support
@ 2015-07-23 1:36 Marc-André Lureau
2015-07-23 1:36 ` [Qemu-devel] [PATCH RFC 1/6] configure: probe for memfd Marc-André Lureau
` (5 more replies)
0 siblings, 6 replies; 18+ messages in thread
From: Marc-André Lureau @ 2015-07-23 1:36 UTC
To: qemu-devel
Cc: thibaut.collet, pbonzini, haifeng.lin, Marc-André Lureau,
mst
Hi,
The following series implements a shareable log for vhost-user to support
memory tracking during live migration. On qemu-side, the solution is
fairly straightforward since vhost already supports the dirty log; only
vhost-user couldn't access the log memory until now.
The series is based on top of "protocol feature negotiation" series
proposed earlier by Michael S. Tsirkin.
The last patch provides some documentation on what the backend is
supposed to do to handle logging properly. I tested this solution
against a modified "vapp": https://github.com/elmarco/vapp branch
"log"
The development branch I used is:
https://github.com/elmarco/qemu branch "vhost-user"
Comments welcome!
Marc-André Lureau (6):
configure: probe for memfd
posix: add linux-only memfd fallback
osdep: add memfd helpers
vhost: alloc shareable log
vhost-user: send log shm fd along with log_base
vhost-user: document migration log
configure | 19 ++++++++++++++
docs/specs/vhost-user.txt | 40 +++++++++++++++++++++++++++++
hw/virtio/vhost-user.c | 13 ++++++++--
hw/virtio/vhost.c | 42 ++++++++++++++++++++++++-------
include/hw/virtio/vhost.h | 3 ++-
include/qemu/osdep.h | 64 +++++++++++++++++++++++++++++++++++++++++++++++
util/oslib-posix.c | 62 +++++++++++++++++++++++++++++++++++++++++++++
7 files changed, 231 insertions(+), 12 deletions(-)
--
2.4.3
* [Qemu-devel] [PATCH RFC 1/6] configure: probe for memfd
2015-07-23 1:36 [Qemu-devel] [PATCH RFC 0/6] vhost-user: add migration log support Marc-André Lureau
@ 2015-07-23 1:36 ` Marc-André Lureau
2015-07-23 1:36 ` [Qemu-devel] [PATCH RFC 2/6] posix: add linux-only memfd fallback Marc-André Lureau
` (4 subsequent siblings)
5 siblings, 0 replies; 18+ messages in thread
From: Marc-André Lureau @ 2015-07-23 1:36 UTC
To: qemu-devel
Cc: thibaut.collet, pbonzini, haifeng.lin, Marc-André Lureau,
mst
Check if memfd_create() is part of system libc.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
---
configure | 19 +++++++++++++++++++
1 file changed, 19 insertions(+)
diff --git a/configure b/configure
index cc0338d..9a401d4 100755
--- a/configure
+++ b/configure
@@ -3390,6 +3390,22 @@ if compile_prog "" "" ; then
eventfd=yes
fi
+# check if memfd is supported
+memfd=no
+cat > $TMPC << EOF
+#include <sys/memfd.h>
+
+int main(void)
+{
+ return memfd_create("foo", MFD_ALLOW_SEALING);
+}
+EOF
+if compile_prog "" "" ; then
+ memfd=yes
+fi
+
+
+
# check for fallocate
fallocate=no
cat > $TMPC << EOF
@@ -4770,6 +4786,9 @@ fi
if test "$eventfd" = "yes" ; then
echo "CONFIG_EVENTFD=y" >> $config_host_mak
fi
+if test "$memfd" = "yes" ; then
+ echo "CONFIG_MEMFD=y" >> $config_host_mak
+fi
if test "$fallocate" = "yes" ; then
echo "CONFIG_FALLOCATE=y" >> $config_host_mak
fi
--
2.4.3
* [Qemu-devel] [PATCH RFC 2/6] posix: add linux-only memfd fallback
2015-07-23 1:36 [Qemu-devel] [PATCH RFC 0/6] vhost-user: add migration log support Marc-André Lureau
2015-07-23 1:36 ` [Qemu-devel] [PATCH RFC 1/6] configure: probe for memfd Marc-André Lureau
@ 2015-07-23 1:36 ` Marc-André Lureau
2015-07-23 15:25 ` Michael S. Tsirkin
2015-07-23 1:36 ` [Qemu-devel] [PATCH RFC 3/6] osdep: add memfd helpers Marc-André Lureau
` (3 subsequent siblings)
5 siblings, 1 reply; 18+ messages in thread
From: Marc-André Lureau @ 2015-07-23 1:36 UTC
To: qemu-devel
Cc: thibaut.collet, pbonzini, haifeng.lin, Marc-André Lureau,
mst
Implement memfd_create() fallback if not available in system libc.
memfd_create() is still not included in glibc today, although it's been
available since Linux 3.17 in Oct 2014.
memfd has numerous advantages over traditional shm/mmap for IPC memory
sharing through a file descriptor, which we are going to make use of for
the vhost-user log memory in the following patches.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
---
include/qemu/osdep.h | 59 ++++++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 59 insertions(+)
diff --git a/include/qemu/osdep.h b/include/qemu/osdep.h
index 3247364..adc138b 100644
--- a/include/qemu/osdep.h
+++ b/include/qemu/osdep.h
@@ -6,6 +6,7 @@
#include <stddef.h>
#include <stdbool.h>
#include <stdint.h>
+#include <unistd.h>
#include <sys/types.h>
#ifdef __OpenBSD__
#include <sys/signal.h>
@@ -20,6 +21,64 @@
#include <sys/time.h>
+#ifdef CONFIG_LINUX
+
+#ifndef F_LINUX_SPECIFIC_BASE
+#define F_LINUX_SPECIFIC_BASE 1024
+#endif
+
+#ifndef F_ADD_SEALS
+#define F_ADD_SEALS (F_LINUX_SPECIFIC_BASE + 9)
+#define F_GET_SEALS (F_LINUX_SPECIFIC_BASE + 10)
+
+#define F_SEAL_SEAL 0x0001 /* prevent further seals from being set */
+#define F_SEAL_SHRINK 0x0002 /* prevent file from shrinking */
+#define F_SEAL_GROW 0x0004 /* prevent file from growing */
+#define F_SEAL_WRITE 0x0008 /* prevent writes */
+#endif
+
+#ifndef MFD_ALLOW_SEALING
+#define MFD_ALLOW_SEALING 0x0002U
+#endif
+
+#ifndef MFD_CLOEXEC
+#define MFD_CLOEXEC 0x0001U
+#endif
+
+#ifndef __NR_memfd_create
+# if defined __x86_64__
+# define __NR_memfd_create 319
+# elif defined __arm__
+# define __NR_memfd_create 385
+# elif defined __aarch64__
+# define __NR_memfd_create 279
+# elif defined _MIPS_SIM
+# if _MIPS_SIM == _MIPS_SIM_ABI32
+# define __NR_memfd_create 4354
+# endif
+# if _MIPS_SIM == _MIPS_SIM_NABI32
+# define __NR_memfd_create 6318
+# endif
+# if _MIPS_SIM == _MIPS_SIM_ABI64
+# define __NR_memfd_create 5314
+# endif
+# elif defined __i386__
+# define __NR_memfd_create 356
+# else
+# warning "__NR_memfd_create unknown for your architecture"
+# define __NR_memfd_create 0xffffffff
+# endif
+#endif
+
+#ifndef CONFIG_MEMFD
+static inline int memfd_create(const char *name, unsigned int flags)
+{
+ return syscall(__NR_memfd_create, name, flags);
+}
+#endif
+
+#endif /* LINUX */
+
#if defined(CONFIG_SOLARIS) && CONFIG_SOLARIS_VERSION < 10
/* [u]int_fast*_t not in <sys/int_types.h> */
typedef unsigned char uint_fast8_t;
--
2.4.3
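For readers who want to try the fallback in isolation, here is a minimal sketch of the same idea outside QEMU. It is an illustration under assumptions, not the code from this patch: the helper name sketch_sealed_memfd() is hypothetical, the syscall number comes from SYS_memfd_create in <sys/syscall.h> instead of the per-architecture table, and the seal constants are only defined locally when libc does not provide them.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/syscall.h>
#include <sys/types.h>

/* Fallback values, same numbers as in the patch, used only if libc
 * does not already define them. */
#ifndef F_ADD_SEALS
#define F_ADD_SEALS   1033   /* F_LINUX_SPECIFIC_BASE + 9 */
#define F_GET_SEALS   1034   /* F_LINUX_SPECIFIC_BASE + 10 */
#define F_SEAL_SEAL   0x0001
#define F_SEAL_SHRINK 0x0002
#define F_SEAL_GROW   0x0004
#endif
#ifndef MFD_CLOEXEC
#define MFD_CLOEXEC 0x0001U
#endif
#ifndef MFD_ALLOW_SEALING
#define MFD_ALLOW_SEALING 0x0002U
#endif

/* Hypothetical helper: create a memfd of exactly `size` bytes whose size
 * can no longer change. Returns the fd, or -1 on any failure (including
 * kernels or build environments without memfd_create). */
static int sketch_sealed_memfd(const char *name, off_t size)
{
    int fd = -1;

    assert(name != NULL);
#ifdef SYS_memfd_create
    fd = (int)syscall(SYS_memfd_create, name,
                      MFD_CLOEXEC | MFD_ALLOW_SEALING);
#endif
    if (fd == -1) {
        return -1;
    }
    /* Size the file first, then forbid any further resizing or sealing. */
    if (ftruncate(fd, size) == -1 ||
        fcntl(fd, F_ADD_SEALS,
              F_SEAL_GROW | F_SEAL_SHRINK | F_SEAL_SEAL) == -1) {
        close(fd);
        return -1;
    }
    return fd;
}
```

Once the seals are in place, a peer that receives the fd can rely on the file size staying fixed, which is the property the later patches depend on for the shared log.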
* [Qemu-devel] [PATCH RFC 3/6] osdep: add memfd helpers
2015-07-23 1:36 [Qemu-devel] [PATCH RFC 0/6] vhost-user: add migration log support Marc-André Lureau
2015-07-23 1:36 ` [Qemu-devel] [PATCH RFC 1/6] configure: probe for memfd Marc-André Lureau
2015-07-23 1:36 ` [Qemu-devel] [PATCH RFC 2/6] posix: add linux-only memfd fallback Marc-André Lureau
@ 2015-07-23 1:36 ` Marc-André Lureau
2015-07-23 1:36 ` [Qemu-devel] [PATCH RFC 4/6] vhost: alloc shareable log Marc-André Lureau
` (2 subsequent siblings)
5 siblings, 0 replies; 18+ messages in thread
From: Marc-André Lureau @ 2015-07-23 1:36 UTC
To: qemu-devel
Cc: thibaut.collet, pbonzini, haifeng.lin, Marc-André Lureau,
mst
Add qemu_memfd_alloc/free() helpers.
qemu_memfd_alloc() allocates and seals a memfd, and implements an
open/unlink/mmap fallback for systems that do not support memfd.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
---
include/qemu/osdep.h | 5 +++++
util/oslib-posix.c | 62 ++++++++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 67 insertions(+)
diff --git a/include/qemu/osdep.h b/include/qemu/osdep.h
index adc138b..c49145f 100644
--- a/include/qemu/osdep.h
+++ b/include/qemu/osdep.h
@@ -167,6 +167,11 @@ void *qemu_anon_ram_alloc(size_t size, uint64_t *align);
void qemu_vfree(void *ptr);
void qemu_anon_ram_free(void *ptr, size_t size);
+void *qemu_memfd_alloc(const char *name, size_t size, unsigned int seals,
+ int *fd);
+void qemu_memfd_free(void *ptr, size_t size, int fd);
+
+
#define QEMU_MADV_INVALID -1
#if defined(CONFIG_MADVISE)
diff --git a/util/oslib-posix.c b/util/oslib-posix.c
index 3ae4987..6e5a143 100644
--- a/util/oslib-posix.c
+++ b/util/oslib-posix.c
@@ -482,3 +482,65 @@ int qemu_read_password(char *buf, int buf_size)
printf("\n");
return ret;
}
+
+void *qemu_memfd_alloc(const char *name, size_t size, unsigned int seals,
+ int *fd)
+{
+ void *ptr;
+ int mfd;
+
+ mfd = memfd_create(name, MFD_ALLOW_SEALING|MFD_CLOEXEC);
+ if (mfd != -1) {
+ if (ftruncate(mfd, size) == -1) {
+ perror("ftruncate");
+ close(mfd);
+ return NULL;
+ }
+
+ if (fcntl(mfd, F_ADD_SEALS, seals) == -1) {
+ perror("fcntl");
+ close(mfd);
+ return NULL;
+ }
+ } else {
+ const char *tmpdir = getenv("TMPDIR");
+ gchar *fname;
+
+ tmpdir = tmpdir ? tmpdir : "/tmp";
+
+ fname = g_strdup_printf("%s/memfd-XXXXXX", tmpdir);
+ mfd = mkstemp(fname);
+ unlink(fname);
+ g_free(fname);
+
+ if (mfd == -1) {
+ perror("mkstemp");
+ return NULL;
+ }
+
+ if (ftruncate(mfd, size) == -1) {
+ perror("ftruncate");
+ close(mfd);
+ return NULL;
+ }
+ }
+
+ ptr = mmap(0, size, PROT_READ|PROT_WRITE, MAP_SHARED, mfd, 0);
+ if (ptr == MAP_FAILED) {
+ perror("mmap");
+ close(mfd);
+ return NULL;
+ }
+
+ *fd = mfd;
+ return ptr;
+}
+
+void qemu_memfd_free(void *ptr, size_t size, int fd)
+{
+ if (ptr) {
+ munmap(ptr, size);
+ }
+
+ close(fd);
+}
--
2.4.3
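The mkstemp/unlink/mmap fallback path can be exercised on its own. The sketch below is an assumption-laden illustration, not the helper from this patch: sketch_tmp_shm_fd() and sketch_map_shared() are hypothetical names, and unlike the hunk above it checks the mkstemp() result before calling unlink(). It shows why MAP_SHARED matters here: two mappings of the same fd observe the same bytes, which is what lets a second process that receives the fd see the log.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/mman.h>
#include <sys/types.h>

/* Hypothetical helper: create an anonymous (already unlinked) temporary
 * file of `size` bytes under $TMPDIR, or /tmp if unset. Returns fd or -1. */
static int sketch_tmp_shm_fd(size_t size)
{
    const char *tmpdir = getenv("TMPDIR");
    char path[4096];
    int n, fd;

    if (!tmpdir) {
        tmpdir = "/tmp";
    }
    n = snprintf(path, sizeof(path), "%s/sketch-XXXXXX", tmpdir);
    if (n < 0 || (size_t)n >= sizeof(path)) {
        return -1;
    }
    fd = mkstemp(path);
    if (fd == -1) {
        return -1;              /* nothing was created, nothing to unlink */
    }
    unlink(path);               /* the fd keeps the file alive */
    if (ftruncate(fd, (off_t)size) == -1) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Hypothetical helper: map the whole file read/write and shared.
 * Returns NULL instead of MAP_FAILED to simplify callers. */
static void *sketch_map_shared(int fd, size_t size)
{
    void *p;

    assert(size > 0);
    p = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    return p == MAP_FAILED ? NULL : p;
}
```

Mapping the fd twice in one process and writing through one mapping is a cheap way to check the sharing behaviour before involving a second process.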
* [Qemu-devel] [PATCH RFC 4/6] vhost: alloc shareable log
2015-07-23 1:36 [Qemu-devel] [PATCH RFC 0/6] vhost-user: add migration log support Marc-André Lureau
` (2 preceding siblings ...)
2015-07-23 1:36 ` [Qemu-devel] [PATCH RFC 3/6] osdep: add memfd helpers Marc-André Lureau
@ 2015-07-23 1:36 ` Marc-André Lureau
2015-07-28 5:28 ` Jason Wang
2015-07-23 1:36 ` [Qemu-devel] [PATCH RFC 5/6] vhost-user: send log shm fd along with log_base Marc-André Lureau
2015-07-23 1:36 ` [Qemu-devel] [PATCH RFC 6/6] vhost-user: document migration log Marc-André Lureau
5 siblings, 1 reply; 18+ messages in thread
From: Marc-André Lureau @ 2015-07-23 1:36 UTC
To: qemu-devel
Cc: thibaut.collet, pbonzini, haifeng.lin, Marc-André Lureau,
mst
If the backend is of type VHOST_BACKEND_TYPE_USER, allocate
shareable memory.
Note: vhost_log_get() can use a global "vhost_log" that can be shared by
several vhost devices. We may want instead a common shareable log and a
common non-shareable one.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
---
hw/virtio/vhost.c | 42 +++++++++++++++++++++++++++++++++---------
include/hw/virtio/vhost.h | 3 ++-
2 files changed, 35 insertions(+), 10 deletions(-)
diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index 2712c6f..12dd644 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -286,20 +286,34 @@ static uint64_t vhost_get_log_size(struct vhost_dev *dev)
}
return log_size;
}
-static struct vhost_log *vhost_log_alloc(uint64_t size)
+
+static struct vhost_log *vhost_log_alloc(uint64_t size, bool share)
{
- struct vhost_log *log = g_malloc0(sizeof *log + size * sizeof(*(log->log)));
+ struct vhost_log *log;
+ uint64_t logsize = size * sizeof(*(log->log));
+ int fd = -1;
+
+ log = g_new0(struct vhost_log, 1);
+ if (share) {
+ log->log = qemu_memfd_alloc("vhost-log", logsize,
+ F_SEAL_GROW|F_SEAL_SHRINK|F_SEAL_SEAL, &fd);
+ memset(log->log, 0, logsize);
+ } else {
+ log->log = g_malloc0(logsize);
+ }
log->size = size;
log->refcnt = 1;
+ log->fd = fd;
return log;
}
-static struct vhost_log *vhost_log_get(uint64_t size)
+static struct vhost_log *vhost_log_get(uint64_t size, bool share)
{
- if (!vhost_log || vhost_log->size != size) {
- vhost_log = vhost_log_alloc(size);
+ if (!vhost_log || vhost_log->size != size ||
+ (share && vhost_log->fd == -1)) {
+ vhost_log = vhost_log_alloc(size, share);
} else {
++vhost_log->refcnt;
}
@@ -324,21 +338,30 @@ static void vhost_log_put(struct vhost_dev *dev, bool sync)
if (vhost_log == log) {
vhost_log = NULL;
}
+
+ if (log->fd == -1) {
+ g_free(log->log);
+ } else {
+ qemu_memfd_free(log->log, log->size * sizeof(*(log->log)),
+ log->fd);
+ }
g_free(log);
}
}
static inline void vhost_dev_log_resize(struct vhost_dev* dev, uint64_t size)
{
- struct vhost_log *log = vhost_log_get(size);
+ bool share = dev->vhost_ops->backend_type == VHOST_BACKEND_TYPE_USER;
+ struct vhost_log *log = vhost_log_get(size, share);
uint64_t log_base = (uintptr_t)log->log;
int r;
- r = dev->vhost_ops->vhost_call(dev, VHOST_SET_LOG_BASE, &log_base);
- assert(r >= 0);
vhost_log_put(dev, true);
dev->log = log;
dev->log_size = size;
+
+ r = dev->vhost_ops->vhost_call(dev, VHOST_SET_LOG_BASE, &log_base);
+ assert(r >= 0);
}
static int vhost_verify_ring_mappings(struct vhost_dev *dev,
@@ -1136,9 +1159,10 @@ int vhost_dev_start(struct vhost_dev *hdev, VirtIODevice *vdev)
if (hdev->log_enabled) {
uint64_t log_base;
+ bool share = hdev->vhost_ops->backend_type == VHOST_BACKEND_TYPE_USER;
hdev->log_size = vhost_get_log_size(hdev);
- hdev->log = vhost_log_get(hdev->log_size);
+ hdev->log = vhost_log_get(hdev->log_size, share);
log_base = (uintptr_t)hdev->log->log;
r = hdev->vhost_ops->vhost_call(hdev, VHOST_SET_LOG_BASE,
hdev->log_size ? &log_base : NULL);
diff --git a/include/hw/virtio/vhost.h b/include/hw/virtio/vhost.h
index 6467c73..ab1dcac 100644
--- a/include/hw/virtio/vhost.h
+++ b/include/hw/virtio/vhost.h
@@ -31,7 +31,8 @@ typedef unsigned long vhost_log_chunk_t;
struct vhost_log {
unsigned long long size;
int refcnt;
- vhost_log_chunk_t log[0];
+ int fd;
+ vhost_log_chunk_t *log;
};
struct vhost_memory;
--
2.4.3
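The reuse rule that vhost_log_get() applies to the global log after this patch can be modelled in a few lines. The sketch below is a simplified stand-in, not the QEMU code: all sketch_* names are hypothetical, plain calloc() replaces both allocation paths, and SKETCH_FAKE_FD is a placeholder for a real memfd. It makes the note in the commit message concrete: a shareable request replaces a private global log, while a later private request happily reuses the shareable one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define SKETCH_FAKE_FD 1000     /* placeholder: stands in for a real memfd */

struct sketch_log {
    unsigned long long size;    /* number of chunks */
    int refcnt;
    int fd;                     /* -1 when the log is private memory */
    unsigned long *log;
};

static struct sketch_log *sketch_global_log;

static struct sketch_log *sketch_log_alloc(unsigned long long size, bool share)
{
    struct sketch_log *log = calloc(1, sizeof(*log));

    assert(log != NULL);
    log->log = calloc(size ? (size_t)size : 1, sizeof(*log->log));
    assert(log->log != NULL);
    log->size = size;
    log->refcnt = 1;
    log->fd = share ? SKETCH_FAKE_FD : -1;
    return log;
}

/* Same condition as the patched vhost_log_get(): allocate a new global log
 * if there is none, if the size differs, or if a shareable log is wanted
 * but the current one is private. Otherwise take another reference. */
static struct sketch_log *sketch_log_get(unsigned long long size, bool share)
{
    if (!sketch_global_log || sketch_global_log->size != size ||
        (share && sketch_global_log->fd == -1)) {
        sketch_global_log = sketch_log_alloc(size, share);
    } else {
        ++sketch_global_log->refcnt;
    }
    return sketch_global_log;
}

/* Drop one reference; free the log when the last user goes away. */
static void sketch_log_put(struct sketch_log *log)
{
    if (--log->refcnt > 0) {
        return;
    }
    if (sketch_global_log == log) {
        sketch_global_log = NULL;
    }
    free(log->log);
    free(log);
}
```

With two separate globals (one shareable, one private), the asymmetric reuse seen here would disappear, which is the alternative the commit message hints at.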
* [Qemu-devel] [PATCH RFC 5/6] vhost-user: send log shm fd along with log_base
2015-07-23 1:36 [Qemu-devel] [PATCH RFC 0/6] vhost-user: add migration log support Marc-André Lureau
` (3 preceding siblings ...)
2015-07-23 1:36 ` [Qemu-devel] [PATCH RFC 4/6] vhost: alloc shareable log Marc-André Lureau
@ 2015-07-23 1:36 ` Marc-André Lureau
2015-07-23 1:36 ` [Qemu-devel] [PATCH RFC 6/6] vhost-user: document migration log Marc-André Lureau
5 siblings, 0 replies; 18+ messages in thread
From: Marc-André Lureau @ 2015-07-23 1:36 UTC
To: qemu-devel
Cc: thibaut.collet, pbonzini, haifeng.lin, Marc-André Lureau,
mst
Send the shm fd for dirty page logging if the backend supports
VHOST_USER_PROTOCOL_F_LOG_SHMFD.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
---
hw/virtio/vhost-user.c | 13 +++++++++++--
1 file changed, 11 insertions(+), 2 deletions(-)
diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
index 4993b63..fe75618 100644
--- a/hw/virtio/vhost-user.c
+++ b/hw/virtio/vhost-user.c
@@ -26,7 +26,9 @@
#define VHOST_MEMORY_MAX_NREGIONS 8
#define VHOST_USER_F_PROTOCOL_FEATURES 30
-#define VHOST_USER_PROTOCOL_FEATURE_MASK 0x0ULL
+
+#define VHOST_USER_PROTOCOL_FEATURE_MASK 0x1ULL
+#define VHOST_USER_PROTOCOL_F_LOG_SHMFD 0
typedef enum VhostUserRequest {
VHOST_USER_NONE = 0,
@@ -213,8 +215,15 @@ static int vhost_user_call(struct vhost_dev *dev, unsigned long int request,
need_reply = 1;
break;
- case VHOST_USER_SET_FEATURES:
case VHOST_USER_SET_LOG_BASE:
+ if (__virtio_has_feature(dev->protocol_features,
+ VHOST_USER_PROTOCOL_F_LOG_SHMFD) &&
+ dev->log->fd != -1) {
+ fds[fd_num++] = dev->log->fd;
+ }
+ /* fall through */
+
+ case VHOST_USER_SET_FEATURES:
msg.u64 = *((__u64 *) arg);
msg.size = sizeof(m.u64);
break;
--
2.4.3
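The fds[] array filled in above ends up in the ancillary data of a Unix socket message. As a rough illustration of that mechanism only, here is a sketch that sends a bare u64 payload plus one optional fd with SCM_RIGHTS and receives it on the other side. This is not the vhost-user wire format (the real message has a header that this sketch omits), and both function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Hypothetical helper: send `payload`, attaching `fd` as SCM_RIGHTS
 * ancillary data unless fd is -1. Returns 0 on success, -1 on failure. */
static int sketch_send_u64_with_fd(int sock, uint64_t payload, int fd)
{
    struct iovec iov = { .iov_base = &payload, .iov_len = sizeof(payload) };
    union {
        char buf[CMSG_SPACE(sizeof(int))];
        struct cmsghdr align;           /* forces correct alignment */
    } ctrl;
    struct msghdr msg;

    memset(&msg, 0, sizeof(msg));
    memset(&ctrl, 0, sizeof(ctrl));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    if (fd != -1) {
        struct cmsghdr *c;

        msg.msg_control = ctrl.buf;
        msg.msg_controllen = sizeof(ctrl.buf);
        c = CMSG_FIRSTHDR(&msg);
        c->cmsg_level = SOL_SOCKET;
        c->cmsg_type = SCM_RIGHTS;
        c->cmsg_len = CMSG_LEN(sizeof(int));
        memcpy(CMSG_DATA(c), &fd, sizeof(int));
    }
    return sendmsg(sock, &msg, 0) == (ssize_t)sizeof(payload) ? 0 : -1;
}

/* Hypothetical helper: receive the payload; *fd is set to the passed
 * descriptor, or -1 if the message carried none. */
static int sketch_recv_u64_with_fd(int sock, uint64_t *payload, int *fd)
{
    struct iovec iov = { .iov_base = payload, .iov_len = sizeof(*payload) };
    union {
        char buf[CMSG_SPACE(sizeof(int))];
        struct cmsghdr align;
    } ctrl;
    struct msghdr msg;
    struct cmsghdr *c;

    assert(payload != NULL && fd != NULL);
    memset(&msg, 0, sizeof(msg));
    memset(&ctrl, 0, sizeof(ctrl));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctrl.buf;
    msg.msg_controllen = sizeof(ctrl.buf);

    *fd = -1;
    if (recvmsg(sock, &msg, 0) != (ssize_t)sizeof(*payload)) {
        return -1;
    }
    for (c = CMSG_FIRSTHDR(&msg); c != NULL; c = CMSG_NXTHDR(&msg, c)) {
        if (c->cmsg_level == SOL_SOCKET && c->cmsg_type == SCM_RIGHTS) {
            memcpy(fd, CMSG_DATA(c), sizeof(int));
        }
    }
    return 0;
}
```

The receiver gets a new descriptor number that refers to the same open file, so a backend can mmap() it to reach the same log pages the sender allocated.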
* [Qemu-devel] [PATCH RFC 6/6] vhost-user: document migration log
2015-07-23 1:36 [Qemu-devel] [PATCH RFC 0/6] vhost-user: add migration log support Marc-André Lureau
` (4 preceding siblings ...)
2015-07-23 1:36 ` [Qemu-devel] [PATCH RFC 5/6] vhost-user: send log shm fd along with log_base Marc-André Lureau
@ 2015-07-23 1:36 ` Marc-André Lureau
2015-07-23 15:30 ` Michael S. Tsirkin
5 siblings, 1 reply; 18+ messages in thread
From: Marc-André Lureau @ 2015-07-23 1:36 UTC
To: qemu-devel
Cc: thibaut.collet, pbonzini, haifeng.lin, Marc-André Lureau,
mst
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
---
docs/specs/vhost-user.txt | 40 ++++++++++++++++++++++++++++++++++++++++
1 file changed, 40 insertions(+)
diff --git a/docs/specs/vhost-user.txt b/docs/specs/vhost-user.txt
index 0062baa..c2d2e2a 100644
--- a/docs/specs/vhost-user.txt
+++ b/docs/specs/vhost-user.txt
@@ -120,6 +120,7 @@ There are several messages that the master sends with file descriptors passed
in the ancillary data:
* VHOST_SET_MEM_TABLE
+ * VHOST_SET_LOG_BASE (if VHOST_USER_PROTOCOL_F_LOG_SHMFD)
* VHOST_SET_LOG_FD
* VHOST_SET_VRING_KICK
* VHOST_SET_VRING_CALL
@@ -135,6 +136,11 @@ As older slaves don't support negotiating protocol features,
a feature bit was dedicated for this purpose:
#define VHOST_USER_F_PROTOCOL_FEATURES 30
+Protocol features
+-----------------
+
+#define VHOST_USER_PROTOCOL_F_LOG_SHMFD 0
+
Message types
-------------
@@ -301,3 +307,37 @@ Message types
Bits (0-7) of the payload contain the vring index. Bit 8 is the
invalid FD flag. This flag is set when there is no file descriptor
in the ancillary data.
+
+Migration
+---------
+
+During live migration, the master may need to track the modifications
+the slave makes to the memory mapped regions. The slave should mark
+the dirty pages in a log. Once it complies with this logging, it may
+declare VHOST_F_LOG_ALL as a vhost feature.
+
+All the modifications to memory pointed to by vring "descriptor" should
+be marked. Modifications to "used" vring should be marked if
+VHOST_VRING_F_LOG is part of ring's features.
+
+Dirty pages are of size:
+#define VHOST_LOG_PAGE 0x1000
+
+The log memory fd is provided in the ancillary data of
+VHOST_USER_SET_LOG_BASE message when the slave has
+VHOST_USER_PROTOCOL_F_LOG_SHMFD protocol feature.
+
+The size of the log may be computed by using all the known guest
+addresses. The log covers from address 0 to the maximum of guest
+regions. In pseudo-code, to mark page at "addr" as dirty:
+
+page = addr / VHOST_LOG_PAGE
+log[page / 8] |= 1 << page % 8
+
+VHOST_USER_SET_LOG_FD is an optional message with an eventfd in
+ancillary data; it may be used to inform the master that the log has
+been modified.
+
+Once the source has finished migration, VHOST_USER_RESET_OWNER message
+will be sent by the source. No further update must be done before the
+destination takes over with new regions & rings.
--
2.4.3
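The pseudo-code in the Migration section can be turned into a small backend-side sketch. It is only an illustration under assumptions: the sketch_* names are hypothetical, the log is treated as a byte array exactly as in the pseudo-code, and the read-modify-write uses the GCC/Clang __atomic_fetch_or builtin so that concurrent updates to the same byte are not lost.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define VHOST_LOG_PAGE 0x1000

/* Bytes needed for a bitmap covering guest addresses 0..max_addr,
 * one bit per VHOST_LOG_PAGE-sized page. */
static size_t sketch_log_bytes(uint64_t max_addr)
{
    uint64_t pages = max_addr / VHOST_LOG_PAGE + 1;

    return (size_t)((pages + 7) / 8);
}

/* Mark the page containing `addr` as dirty. The OR is atomic, so two
 * threads touching different bits of the same byte cannot clobber
 * each other. */
static void sketch_mark_dirty(uint8_t *log, size_t log_bytes, uint64_t addr)
{
    uint64_t page = addr / VHOST_LOG_PAGE;

    assert(page / 8 < log_bytes);
    __atomic_fetch_or(&log[page / 8], (uint8_t)(1u << (page % 8)),
                      __ATOMIC_SEQ_CST);
}

/* Mark every page overlapped by the byte range [addr, addr + len). */
static void sketch_mark_range(uint8_t *log, size_t log_bytes,
                              uint64_t addr, uint64_t len)
{
    uint64_t page, last;

    if (len == 0) {
        return;
    }
    last = (addr + len - 1) / VHOST_LOG_PAGE;
    for (page = addr / VHOST_LOG_PAGE; page <= last; page++) {
        sketch_mark_dirty(log, log_bytes, page * VHOST_LOG_PAGE);
    }
}
```

For example, a 2-byte write starting at 0x1fff straddles pages 1 and 2, so sketch_mark_range() sets bits 1 and 2 of the first log byte.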
* Re: [Qemu-devel] [PATCH RFC 2/6] posix: add linux-only memfd fallback
2015-07-23 1:36 ` [Qemu-devel] [PATCH RFC 2/6] posix: add linux-only memfd fallback Marc-André Lureau
@ 2015-07-23 15:25 ` Michael S. Tsirkin
2015-07-28 8:11 ` Paolo Bonzini
0 siblings, 1 reply; 18+ messages in thread
From: Michael S. Tsirkin @ 2015-07-23 15:25 UTC
To: Marc-André Lureau; +Cc: thibaut.collet, pbonzini, qemu-devel, haifeng.lin
On Thu, Jul 23, 2015 at 03:36:39AM +0200, Marc-André Lureau wrote:
> Implement memfd_create() fallback if not available in system libc.
> memfd_create() is still not included in glibc today, atlhough it's been
> available since Linux 3.17 in Oct 2014.
>
> memfd has numerous advantages over traditional shm/mmap for ipc memory
> sharing with fd handler, which we are going to make use of for
> vhost-user logging memory in following patches.
>
> Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
> ---
> include/qemu/osdep.h | 59 ++++++++++++++++++++++++++++++++++++++++++++++++++++
> 1 file changed, 59 insertions(+)
>
> diff --git a/include/qemu/osdep.h b/include/qemu/osdep.h
> index 3247364..adc138b 100644
> --- a/include/qemu/osdep.h
> +++ b/include/qemu/osdep.h
> @@ -6,6 +6,7 @@
> #include <stddef.h>
> #include <stdbool.h>
> #include <stdint.h>
> +#include <unistd.h>
> #include <sys/types.h>
> #ifdef __OpenBSD__
> #include <sys/signal.h>
> @@ -20,6 +21,64 @@
>
> #include <sys/time.h>
>
> +#ifdef CONFIG_LINUX
> +
> +#ifndef F_LINUX_SPECIFIC_BASE
> +#define F_LINUX_SPECIFIC_BASE 1024
> +#endif
> +
> +#ifndef F_ADD_SEALS
> +#define F_ADD_SEALS (F_LINUX_SPECIFIC_BASE + 9)
> +#define F_GET_SEALS (F_LINUX_SPECIFIC_BASE + 10)
> +
> +#define F_SEAL_SEAL 0x0001 /* prevent further seals from being set */
> +#define F_SEAL_SHRINK 0x0002 /* prevent file from shrinking */
> +#define F_SEAL_GROW 0x0004 /* prevent file from growing */
> +#define F_SEAL_WRITE 0x0008 /* prevent writes */
> +#endif
These are from include/uapi/linux/fcntl.h,
they should be imported into linux-headers I think.
> +
> +#ifndef MFD_ALLOW_SEALING
> +#define MFD_ALLOW_SEALING 0x0002U
> +#endif
> +
> +#ifndef MFD_CLOEXEC
> +#define MFD_CLOEXEC 0x0001U
> +#endif
> +
> +#ifndef __NR_memfd_create
> +# if defined __x86_64__
> +# define __NR_memfd_create 319
> +# elif defined __arm__
> +# define __NR_memfd_create 385
> +# elif defined __aarch64__
> +# define __NR_memfd_create 279
> +# elif defined _MIPS_SIM
> +# if _MIPS_SIM == _MIPS_SIM_ABI32
> +# define __NR_memfd_create 4354
> +# endif
> +# if _MIPS_SIM == _MIPS_SIM_NABI32
> +# define __NR_memfd_create 6318
> +# endif
> +# if _MIPS_SIM == _MIPS_SIM_ABI64
> +# define __NR_memfd_create 5314
> +# endif
What's defining all these macros?
> +# elif defined __i386__
> +# define __NR_memfd_create 356
> +# else
> +# warning "__NR_memfd_create unknown for your architecture"
> +# define __NR_memfd_create 0xffffffff
> +# endif
> +#endif
> +
> +#ifndef CONFIG_MEMFD
> +static inline int memfd_create(const char *name, unsigned int flags)
> +{
> + return syscall(__NR_memfd_create, name, flags);
> +}
> +#endif
How about making these non-inline?
I think we need stubs for non-posix systems, right?
> +
> +#endif /* LINUX */
> +
> #if defined(CONFIG_SOLARIS) && CONFIG_SOLARIS_VERSION < 10
> /* [u]int_fast*_t not in <sys/int_types.h> */
> typedef unsigned char uint_fast8_t;
> --
> 2.4.3
* Re: [Qemu-devel] [PATCH RFC 6/6] vhost-user: document migration log
2015-07-23 1:36 ` [Qemu-devel] [PATCH RFC 6/6] vhost-user: document migration log Marc-André Lureau
@ 2015-07-23 15:30 ` Michael S. Tsirkin
2015-07-23 15:36 ` Marc-André Lureau
0 siblings, 1 reply; 18+ messages in thread
From: Michael S. Tsirkin @ 2015-07-23 15:30 UTC
To: Marc-André Lureau; +Cc: thibaut.collet, pbonzini, qemu-devel, haifeng.lin
On Thu, Jul 23, 2015 at 03:36:43AM +0200, Marc-André Lureau wrote:
> Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
for some reason I didn't get 5/6.
> ---
> docs/specs/vhost-user.txt | 40 ++++++++++++++++++++++++++++++++++++++++
> 1 file changed, 40 insertions(+)
>
> diff --git a/docs/specs/vhost-user.txt b/docs/specs/vhost-user.txt
> index 0062baa..c2d2e2a 100644
> --- a/docs/specs/vhost-user.txt
> +++ b/docs/specs/vhost-user.txt
> @@ -120,6 +120,7 @@ There are several messages that the master sends with file descriptors passed
> in the ancillary data:
>
> * VHOST_SET_MEM_TABLE
> + * VHOST_SET_LOG_BASE (if VHOST_USER_PROTOCOL_F_LOG_SHMFD)
> * VHOST_SET_LOG_FD
> * VHOST_SET_VRING_KICK
> * VHOST_SET_VRING_CALL
> @@ -135,6 +136,11 @@ As older slaves don't support negotiating protocol features,
> a feature bit was dedicated for this purpose:
> #define VHOST_USER_F_PROTOCOL_FEATURES 30
>
> +Protocol features
> +-----------------
> +
> +#define VHOST_USER_PROTOCOL_F_LOG_SHMFD 0
> +
> Message types
> -------------
>
> @@ -301,3 +307,37 @@ Message types
> Bits (0-7) of the payload contain the vring index. Bit 8 is the
> invalid FD flag. This flag is set when there is no file descriptor
> in the ancillary data.
> +
> +Migration
> +---------
> +
> +During live migration, the master may need to track the modifications
> +the slave makes to the memory mapped regions. The client should mark
> +the dirty pages in a log. Once it complies to this logging, it may
> +declare VHOST_F_LOG_ALL has a vhost feature.
> +
> +All the modifications to memory pointed by vring "descriptor" should
> +be marked. Modifications to "used" vring should be marked if
> +VHOST_VRING_F_LOG is part of ring's features.
It's device's features I think.
> +
> +Dirty pages are of size:
> +#define VHOST_LOG_PAGE 0x1000
> +
> +The log memory fd is provided in the ancillary data of
> +VHOST_USER_SET_LOG_BASE message when the slave has
> +VHOST_USER_PROTOCOL_F_LOG_SHMFD protocol feature.
> +
> +The size of the log may be computed by using all the known guest
> +addresses. The log covers from address 0 to the maximum of guest
> +regions. In pseudo-code, to mark page at "addr" as dirty:
> +
> +page = addr / VHOST_LOG_PAGE
> +log[page / 8] |= 1 << page % 8
Pls note it must be done atomically.
> +
> +VHOST_USER_SET_LOG_FD is an optional message with an eventfd in
> +ancillary data, it may be used to inform the master that the log has
> +been modified.
> +
> +Once the source has finished migration, VHOST_USER_RESET_OWNER message
> +will be sent by the source. No further update must be done before the
> +destination takes over with new regions & rings.
> --
> 2.4.3
* Re: [Qemu-devel] [PATCH RFC 6/6] vhost-user: document migration log
2015-07-23 15:30 ` Michael S. Tsirkin
@ 2015-07-23 15:36 ` Marc-André Lureau
0 siblings, 0 replies; 18+ messages in thread
From: Marc-André Lureau @ 2015-07-23 15:36 UTC
To: Michael S. Tsirkin
Cc: haifeng lin, Marc-André Lureau, pbonzini, qemu-devel,
thibaut collet
Hi
----- Original Message -----
> On Thu, Jul 23, 2015 at 03:36:43AM +0200, Marc-André Lureau wrote:
> > Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
>
> for some reason I didn't get 5/6.
>
strange: http://lists.nongnu.org/archive/html/qemu-devel/2015-07/msg04640.html
> > ---
> > docs/specs/vhost-user.txt | 40 ++++++++++++++++++++++++++++++++++++++++
> > 1 file changed, 40 insertions(+)
> >
> > diff --git a/docs/specs/vhost-user.txt b/docs/specs/vhost-user.txt
> > index 0062baa..c2d2e2a 100644
> > --- a/docs/specs/vhost-user.txt
> > +++ b/docs/specs/vhost-user.txt
> > @@ -120,6 +120,7 @@ There are several messages that the master sends with
> > file descriptors passed
> > in the ancillary data:
> >
> > * VHOST_SET_MEM_TABLE
> > + * VHOST_SET_LOG_BASE (if VHOST_USER_PROTOCOL_F_LOG_SHMFD)
> > * VHOST_SET_LOG_FD
> > * VHOST_SET_VRING_KICK
> > * VHOST_SET_VRING_CALL
> > @@ -135,6 +136,11 @@ As older slaves don't support negotiating protocol
> > features,
> > a feature bit was dedicated for this purpose:
> > #define VHOST_USER_F_PROTOCOL_FEATURES 30
> >
> > +Protocol features
> > +-----------------
> > +
> > +#define VHOST_USER_PROTOCOL_F_LOG_SHMFD 0
> > +
> > Message types
> > -------------
> >
> > @@ -301,3 +307,37 @@ Message types
> > Bits (0-7) of the payload contain the vring index. Bit 8 is the
> > invalid FD flag. This flag is set when there is no file descriptor
> > in the ancillary data.
> > +
> > +Migration
> > +---------
> > +
> > +During live migration, the master may need to track the modifications
> > +the slave makes to the memory mapped regions. The client should mark
> > +the dirty pages in a log. Once it complies to this logging, it may
> > +declare VHOST_F_LOG_ALL has a vhost feature.
> > +
> > +All the modifications to memory pointed by vring "descriptor" should
> > +be marked. Modifications to "used" vring should be marked if
> > +VHOST_VRING_F_LOG is part of ring's features.
>
> It's device's features I think.
Hmm, it's part of both, not sure why: see vhost_virtqueue_set_addr() and vhost_dev_set_features()
Not sure it's correct in device features; it doesn't seem to be checked in kernel vhost.c either.
There are also some dead definitions like VHOST_MEMORY_F_LOG there.
>
> > +
> > +Dirty pages are of size:
> > +#define VHOST_LOG_PAGE 0x1000
> > +
> > +The log memory fd is provided in the ancillary data of
> > +VHOST_USER_SET_LOG_BASE message when the slave has
> > +VHOST_USER_PROTOCOL_F_LOG_SHMFD protocol feature.
> > +
> > +The size of the log may be computed by using all the known guest
> > +addresses. The log covers from address 0 to the maximum of guest
> > +regions. In pseudo-code, to mark page at "addr" as dirty:
> > +
> > +page = addr / VHOST_LOG_PAGE
> > +log[page / 8] |= 1 << page % 8
>
> Pls note it must be done atomically.
ok
>
>
> > +
> > +VHOST_USER_SET_LOG_FD is an optional message with an eventfd in
> > +ancillary data, it may be used to inform the master that the log has
> > +been modified.
> > +
> > +Once the source has finished migration, VHOST_USER_RESET_OWNER message
> > +will be sent by the source. No further update must be done before the
> > +destination takes over with new regions & rings.
> > --
> > 2.4.3
>
* Re: [Qemu-devel] [PATCH RFC 4/6] vhost: alloc shareable log
2015-07-23 1:36 ` [Qemu-devel] [PATCH RFC 4/6] vhost: alloc shareable log Marc-André Lureau
@ 2015-07-28 5:28 ` Jason Wang
2015-07-28 10:10 ` Michael S. Tsirkin
0 siblings, 1 reply; 18+ messages in thread
From: Jason Wang @ 2015-07-28 5:28 UTC
To: Marc-André Lureau, qemu-devel
Cc: thibaut.collet, mst, haifeng.lin, pbonzini
On 07/23/2015 09:36 AM, Marc-André Lureau wrote:
> If the backend is of type VHOST_BACKEND_TYPE_USER, allocate
> shareable memory.
>
> Note: vhost_log_get() can use a global "vhost_log" that can be shared by
> several vhost devices. We may want instead a common shareable log and a
> common non-shareable one.
>
> Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
> ---
> hw/virtio/vhost.c | 42 +++++++++++++++++++++++++++++++++---------
> include/hw/virtio/vhost.h | 3 ++-
> 2 files changed, 35 insertions(+), 10 deletions(-)
>
> diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
> index 2712c6f..12dd644 100644
> --- a/hw/virtio/vhost.c
> +++ b/hw/virtio/vhost.c
> @@ -286,20 +286,34 @@ static uint64_t vhost_get_log_size(struct vhost_dev *dev)
> }
> return log_size;
> }
> -static struct vhost_log *vhost_log_alloc(uint64_t size)
> +
> +static struct vhost_log *vhost_log_alloc(uint64_t size, bool share)
> {
> - struct vhost_log *log = g_malloc0(sizeof *log + size * sizeof(*(log->log)));
> + struct vhost_log *log;
> + uint64_t logsize = size * sizeof(*(log->log));
> + int fd = -1;
> +
> + log = g_new0(struct vhost_log, 1);
> + if (share) {
> + log->log = qemu_memfd_alloc("vhost-log", logsize,
> + F_SEAL_GROW|F_SEAL_SHRINK|F_SEAL_SEAL, &fd);
> + memset(log->log, 0, logsize);
> + } else {
> + log->log = g_malloc0(logsize);
> + }
>
> log->size = size;
> log->refcnt = 1;
> + log->fd = fd;
>
> return log;
> }
>
> -static struct vhost_log *vhost_log_get(uint64_t size)
> +static struct vhost_log *vhost_log_get(uint64_t size, bool share)
> {
> - if (!vhost_log || vhost_log->size != size) {
> - vhost_log = vhost_log_alloc(size);
> + if (!vhost_log || vhost_log->size != size ||
> + (share && vhost_log->fd == -1)) {
> + vhost_log = vhost_log_alloc(size, share);
> } else {
> ++vhost_log->refcnt;
> }
> @@ -324,21 +338,30 @@ static void vhost_log_put(struct vhost_dev *dev, bool sync)
> if (vhost_log == log) {
> vhost_log = NULL;
> }
> +
> + if (log->fd == -1) {
> + g_free(log->log);
> + } else {
> + qemu_memfd_free(log->log, log->size * sizeof(*(log->log)),
> + log->fd);
> + }
> g_free(log);
> }
> }
>
> static inline void vhost_dev_log_resize(struct vhost_dev* dev, uint64_t size)
> {
> - struct vhost_log *log = vhost_log_get(size);
> + bool share = dev->vhost_ops->backend_type == VHOST_BACKEND_TYPE_USER;
> + struct vhost_log *log = vhost_log_get(size, share);
> uint64_t log_base = (uintptr_t)log->log;
> int r;
>
> - r = dev->vhost_ops->vhost_call(dev, VHOST_SET_LOG_BASE, &log_base);
> - assert(r >= 0);
> vhost_log_put(dev, true);
> dev->log = log;
> dev->log_size = size;
> +
> + r = dev->vhost_ops->vhost_call(dev, VHOST_SET_LOG_BASE, &log_base);
> + assert(r >= 0);
> }
Why is this change needed?
>
> static int vhost_verify_ring_mappings(struct vhost_dev *dev,
> @@ -1136,9 +1159,10 @@ int vhost_dev_start(struct vhost_dev *hdev, VirtIODevice *vdev)
>
> if (hdev->log_enabled) {
> uint64_t log_base;
> + bool share = hdev->vhost_ops->backend_type == VHOST_BACKEND_TYPE_USER;
>
> hdev->log_size = vhost_get_log_size(hdev);
> - hdev->log = vhost_log_get(hdev->log_size);
> + hdev->log = vhost_log_get(hdev->log_size, share);
> log_base = (uintptr_t)hdev->log->log;
> r = hdev->vhost_ops->vhost_call(hdev, VHOST_SET_LOG_BASE,
> hdev->log_size ? &log_base : NULL);
> diff --git a/include/hw/virtio/vhost.h b/include/hw/virtio/vhost.h
> index 6467c73..ab1dcac 100644
> --- a/include/hw/virtio/vhost.h
> +++ b/include/hw/virtio/vhost.h
> @@ -31,7 +31,8 @@ typedef unsigned long vhost_log_chunk_t;
> struct vhost_log {
> unsigned long long size;
> int refcnt;
> - vhost_log_chunk_t log[0];
> + int fd;
> + vhost_log_chunk_t *log;
> };
>
> struct vhost_memory;
* Re: [Qemu-devel] [PATCH RFC 2/6] posix: add linux-only memfd fallback
2015-07-23 15:25 ` Michael S. Tsirkin
@ 2015-07-28 8:11 ` Paolo Bonzini
2015-07-28 10:58 ` Marc-André Lureau
0 siblings, 1 reply; 18+ messages in thread
From: Paolo Bonzini @ 2015-07-28 8:11 UTC (permalink / raw)
To: Michael S. Tsirkin, Marc-André Lureau
Cc: thibaut.collet, qemu-devel, haifeng.lin
On 23/07/2015 17:25, Michael S. Tsirkin wrote:
> > +#ifdef CONFIG_LINUX
> > +
> > +#ifndef F_LINUX_SPECIFIC_BASE
> > +#define F_LINUX_SPECIFIC_BASE 1024
> > +#endif
> > +
> > +#ifndef F_ADD_SEALS
> > +#define F_ADD_SEALS (F_LINUX_SPECIFIC_BASE + 9)
> > +#define F_GET_SEALS (F_LINUX_SPECIFIC_BASE + 10)
> > +
> > +#define F_SEAL_SEAL 0x0001 /* prevent further seals from being set */
> > +#define F_SEAL_SHRINK 0x0002 /* prevent file from shrinking */
> > +#define F_SEAL_GROW 0x0004 /* prevent file from growing */
> > +#define F_SEAL_WRITE 0x0008 /* prevent writes */
> > +#endif
>
> These are from include/uapi/linux/fcntl.h,
> they should be imported into linux-headers I think.
linux-headers is usually used for virt-related features that we want in
QEMU a few weeks before they are distributed upstream.
Here, I think just including linux/fcntl.h is enough.
>> +#ifndef __NR_memfd_create
>> +# if defined __x86_64__
>> +# define __NR_memfd_create 319
>> +# elif defined __arm__
>> +# define __NR_memfd_create 385
>> +# elif defined __aarch64__
>> +# define __NR_memfd_create 279
>> +# elif defined _MIPS_SIM
>> +# if _MIPS_SIM == _MIPS_SIM_ABI32
>> +# define __NR_memfd_create 4354
>> +# endif
>> +# if _MIPS_SIM == _MIPS_SIM_NABI32
>> +# define __NR_memfd_create 6318
>> +# endif
>> +# if _MIPS_SIM == _MIPS_SIM_ABI64
>> +# define __NR_memfd_create 5314
>> +# endif
>
> What's defining all these macros?
They're in asm/unistd.h.
I think that, instead of making qemu/osdep.h the new qemu-common.h, the
wrappers added by patch 3 should be declared in a new header
qemu/memfd.h. The implementation in util/memfd.c can include both
linux/fcntl.h and asm/unistd.h.
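As a rough sketch of what such a util/memfd.c could contain (Linux-only; memfd_create_compat() and memfd_alloc_sealed() are hypothetical names, not the helpers from patch 3, and the fallback defines only kick in when the libc headers lack them):

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Fallbacks for older headers; values as quoted earlier in this thread. */
#ifndef F_ADD_SEALS
#define F_ADD_SEALS   (1024 + 9)
#define F_SEAL_SEAL   0x0001
#define F_SEAL_SHRINK 0x0002
#define F_SEAL_GROW   0x0004
#endif
#ifndef MFD_ALLOW_SEALING
#define MFD_ALLOW_SEALING 0x0002U
#endif

/* Hypothetical wrapper: call the raw syscall so that the name cannot
 * clash with a memfd_create() declared by a newer glibc. */
static int memfd_create_compat(const char *name, unsigned int flags)
{
#ifdef __NR_memfd_create
    return (int)syscall(__NR_memfd_create, name, flags);
#else
    (void)name;
    (void)flags;
    errno = ENOSYS;
    return -1;
#endif
}

/* Hypothetical helper: create a memfd of 'size' bytes, apply 'seals',
 * and map it shared. Returns NULL on failure; *fd is set only on success. */
static void *memfd_alloc_sealed(const char *name, size_t size,
                                unsigned int seals, int *fd)
{
    void *ptr;
    int mfd;

    assert(fd != NULL);
    mfd = memfd_create_compat(name, MFD_ALLOW_SEALING);
    if (mfd < 0) {
        return NULL;
    }
    if (ftruncate(mfd, (off_t)size) < 0 ||
        fcntl(mfd, F_ADD_SEALS, seals) < 0) {
        close(mfd);
        return NULL;
    }
    ptr = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, mfd, 0);
    if (ptr == MAP_FAILED) {
        close(mfd);
        return NULL;
    }
    *fd = mfd;
    return ptr;
}
```

A caller would pass something like F_SEAL_GROW | F_SEAL_SHRINK | F_SEAL_SEAL, as patch 4 does, so that the peer receiving the fd cannot resize the region.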
Paolo
* Re: [Qemu-devel] [PATCH RFC 4/6] vhost: alloc shareable log
2015-07-28 5:28 ` Jason Wang
@ 2015-07-28 10:10 ` Michael S. Tsirkin
2015-07-28 14:42 ` Marc-André Lureau
0 siblings, 1 reply; 18+ messages in thread
From: Michael S. Tsirkin @ 2015-07-28 10:10 UTC (permalink / raw)
To: Jason Wang
Cc: haifeng.lin, Marc-André Lureau, pbonzini, qemu-devel,
thibaut.collet
On Tue, Jul 28, 2015 at 01:28:05PM +0800, Jason Wang wrote:
>
>
> On 07/23/2015 09:36 AM, Marc-André Lureau wrote:
> > If the backend is of type VHOST_BACKEND_TYPE_USER, allocate
> > shareable memory.
> >
> > Note: vhost_log_get() can use a global "vhost_log" that can be shared by
> > several vhost devices. We may want instead a common shareable log and a
> > common non-shareable one.
> >
> > Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
> > ---
> > hw/virtio/vhost.c | 42 +++++++++++++++++++++++++++++++++---------
> > include/hw/virtio/vhost.h | 3 ++-
> > 2 files changed, 35 insertions(+), 10 deletions(-)
> >
> > diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
> > index 2712c6f..12dd644 100644
> > --- a/hw/virtio/vhost.c
> > +++ b/hw/virtio/vhost.c
> > @@ -286,20 +286,34 @@ static uint64_t vhost_get_log_size(struct vhost_dev *dev)
> > }
> > return log_size;
> > }
> > -static struct vhost_log *vhost_log_alloc(uint64_t size)
> > +
> > +static struct vhost_log *vhost_log_alloc(uint64_t size, bool share)
> > {
> > - struct vhost_log *log = g_malloc0(sizeof *log + size * sizeof(*(log->log)));
> > + struct vhost_log *log;
> > + uint64_t logsize = size * sizeof(*(log->log));
> > + int fd = -1;
> > +
> > + log = g_new0(struct vhost_log, 1);
> > + if (share) {
> > + log->log = qemu_memfd_alloc("vhost-log", logsize,
> > + F_SEAL_GROW|F_SEAL_SHRINK|F_SEAL_SEAL, &fd);
> > + memset(log->log, 0, logsize);
> > + } else {
> > + log->log = g_malloc0(logsize);
> > + }
> >
> > log->size = size;
> > log->refcnt = 1;
> > + log->fd = fd;
> >
> > return log;
> > }
> >
> > -static struct vhost_log *vhost_log_get(uint64_t size)
> > +static struct vhost_log *vhost_log_get(uint64_t size, bool share)
> > {
> > - if (!vhost_log || vhost_log->size != size) {
> > - vhost_log = vhost_log_alloc(size);
> > + if (!vhost_log || vhost_log->size != size ||
> > + (share && vhost_log->fd == -1)) {
> > + vhost_log = vhost_log_alloc(size, share);
> > } else {
> > ++vhost_log->refcnt;
> > }
> > @@ -324,21 +338,30 @@ static void vhost_log_put(struct vhost_dev *dev, bool sync)
> > if (vhost_log == log) {
> > vhost_log = NULL;
> > }
> > +
> > + if (log->fd == -1) {
> > + g_free(log->log);
> > + } else {
> > + qemu_memfd_free(log->log, log->size * sizeof(*(log->log)),
> > + log->fd);
> > + }
> > g_free(log);
> > }
> > }
> >
> > static inline void vhost_dev_log_resize(struct vhost_dev* dev, uint64_t size)
> > {
> > - struct vhost_log *log = vhost_log_get(size);
> > + bool share = dev->vhost_ops->backend_type == VHOST_BACKEND_TYPE_USER;
> > + struct vhost_log *log = vhost_log_get(size, share);
> > uint64_t log_base = (uintptr_t)log->log;
> > int r;
> >
> > - r = dev->vhost_ops->vhost_call(dev, VHOST_SET_LOG_BASE, &log_base);
> > - assert(r >= 0);
> > vhost_log_put(dev, true);
> > dev->log = log;
> > dev->log_size = size;
> > +
> > + r = dev->vhost_ops->vhost_call(dev, VHOST_SET_LOG_BASE, &log_base);
> > + assert(r >= 0);
> > }
>
> Why is this change needed?
I know why it's needed :) But this needs to be stated in the commit log.
Also, it only makes sense if remote supports getting the logfd.
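To illustrate the second point, a minimal sketch of gating the shared allocation on a negotiated protocol feature. The bit number and all toy_* names are hypothetical; nothing in this series defines them, and falling back to a private log is only one possible policy:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical feature bit, for illustration only. */
#define TOY_PROTOCOL_F_LOG_SHMFD 1

/* Test a single bit in a negotiated feature mask. */
static bool toy_has_feature(uint64_t features, unsigned int bit)
{
    assert(bit < 64);
    return ((features >> bit) & 1) != 0;
}

/* Share the log only for a vhost-user backend that has negotiated
 * the ability to receive the log fd; otherwise keep it private. */
static bool toy_log_should_share(bool backend_is_user,
                                 uint64_t protocol_features)
{
    return backend_is_user &&
           toy_has_feature(protocol_features, TOY_PROTOCOL_F_LOG_SHMFD);
}
```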
> >
> > static int vhost_verify_ring_mappings(struct vhost_dev *dev,
> > @@ -1136,9 +1159,10 @@ int vhost_dev_start(struct vhost_dev *hdev, VirtIODevice *vdev)
> >
> > if (hdev->log_enabled) {
> > uint64_t log_base;
> > + bool share = hdev->vhost_ops->backend_type == VHOST_BACKEND_TYPE_USER;
> >
> > hdev->log_size = vhost_get_log_size(hdev);
> > - hdev->log = vhost_log_get(hdev->log_size);
> > + hdev->log = vhost_log_get(hdev->log_size, share);
> > log_base = (uintptr_t)hdev->log->log;
> > r = hdev->vhost_ops->vhost_call(hdev, VHOST_SET_LOG_BASE,
> > hdev->log_size ? &log_base : NULL);
> > diff --git a/include/hw/virtio/vhost.h b/include/hw/virtio/vhost.h
> > index 6467c73..ab1dcac 100644
> > --- a/include/hw/virtio/vhost.h
> > +++ b/include/hw/virtio/vhost.h
> > @@ -31,7 +31,8 @@ typedef unsigned long vhost_log_chunk_t;
> > struct vhost_log {
> > unsigned long long size;
> > int refcnt;
> > - vhost_log_chunk_t log[0];
> > + int fd;
> > + vhost_log_chunk_t *log;
> > };
> >
> > struct vhost_memory;
* Re: [Qemu-devel] [PATCH RFC 2/6] posix: add linux-only memfd fallback
2015-07-28 8:11 ` Paolo Bonzini
@ 2015-07-28 10:58 ` Marc-André Lureau
2015-07-28 11:50 ` Paolo Bonzini
0 siblings, 1 reply; 18+ messages in thread
From: Marc-André Lureau @ 2015-07-28 10:58 UTC (permalink / raw)
To: Paolo Bonzini; +Cc: Linhaifeng, Thibaut Collet, QEMU, Michael S. Tsirkin
Hi
On Tue, Jul 28, 2015 at 10:11 AM, Paolo Bonzini <pbonzini@redhat.com> wrote:
>>
>> What's defining all these macros?
>
> They're in asm/unistd.h.
>
> I think that, instead of making qemu/osdep.h the new qemu-common.h, the
> wrappers added by patch 3 should be declared in a new header
> qemu/memfd.h. The implementation in util/memfd.c can include both
> linux/fcntl.h and asm/unistd.h.
>
OK. Shouldn't it keep the inline function, though? That avoids a
future clash when upgrading glibc.
--
Marc-André Lureau
* Re: [Qemu-devel] [PATCH RFC 2/6] posix: add linux-only memfd fallback
2015-07-28 10:58 ` Marc-André Lureau
@ 2015-07-28 11:50 ` Paolo Bonzini
2015-07-28 14:25 ` Marc-André Lureau
0 siblings, 1 reply; 18+ messages in thread
From: Paolo Bonzini @ 2015-07-28 11:50 UTC (permalink / raw)
To: Marc-André Lureau
Cc: Thibaut Collet, Michael S. Tsirkin, Linhaifeng, QEMU
On 28/07/2015 12:58, Marc-André Lureau wrote:
> Hi
>
> On Tue, Jul 28, 2015 at 10:11 AM, Paolo Bonzini <pbonzini@redhat.com> wrote:
>>>
>>> What's defining all these macros?
>>
>> They're in asm/unistd.h.
>>
>> I think that, instead of making qemu/osdep.h the new qemu-common.h, the
>> wrappers added by patch 3 should be declared in a new header
>> qemu/memfd.h. The implementation in util/memfd.c can include both
>> linux/fcntl.h and asm/unistd.h.
>>
>
> OK. Shouldn't it keep the inline function, though? That avoids a
> future clash when upgrading glibc.
Can the inline function stay in util/memfd.c?
Paolo
* Re: [Qemu-devel] [PATCH RFC 2/6] posix: add linux-only memfd fallback
2015-07-28 11:50 ` Paolo Bonzini
@ 2015-07-28 14:25 ` Marc-André Lureau
2015-07-28 16:37 ` Paolo Bonzini
0 siblings, 1 reply; 18+ messages in thread
From: Marc-André Lureau @ 2015-07-28 14:25 UTC (permalink / raw)
To: Paolo Bonzini; +Cc: Thibaut Collet, Michael S. Tsirkin, Linhaifeng, QEMU
Hi
On Tue, Jul 28, 2015 at 1:50 PM, Paolo Bonzini <pbonzini@redhat.com> wrote:
> Can the inline function stay in util/memfd.c?
I see little benefit in that: only the qemu_memfd_alloc helpers would
then be exported. The inline is probably unnecessary anyway once the
function moves into memfd.c.
--
Marc-André Lureau
* Re: [Qemu-devel] [PATCH RFC 4/6] vhost: alloc shareable log
2015-07-28 10:10 ` Michael S. Tsirkin
@ 2015-07-28 14:42 ` Marc-André Lureau
0 siblings, 0 replies; 18+ messages in thread
From: Marc-André Lureau @ 2015-07-28 14:42 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: Paolo Bonzini, Jason Wang, Linhaifeng, Thibaut Collet, QEMU
Hi
On Tue, Jul 28, 2015 at 12:10 PM, Michael S. Tsirkin <mst@redhat.com> wrote:
> I know why it's needed :) But this needs to be stated in the commit log.
> Also, it only makes sense if remote supports getting the logfd.
Thanks for pointing out this change. Actually, I think the current log
overlap when resizing is on purpose: there shouldn't be any window
without a log. I'll rework the patch to keep the original ordering, so
that the log switch stays gapless. I'll also add a comment there, as
this is easy to overlook.
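For reference, the ordering I mean can be sketched with a toy model. All toy_* names are hypothetical stand-ins (toy_set_log_base() plays the role of VHOST_SET_LOG_BASE), and the real vhost_log_put() also syncs the old log before releasing it, which is left out here:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

struct toy_log {
    uint64_t size;
    int refcnt;
    unsigned long *log;
};

struct toy_dev {
    struct toy_log *log;               /* log owned by the device */
    const struct toy_log *backend_log; /* log the backend writes to */
    int live_at_switch;                /* logs alive when the base was set */
};

static int live_logs; /* number of currently allocated logs */

static struct toy_log *toy_log_alloc(uint64_t size)
{
    struct toy_log *log = calloc(1, sizeof(*log));

    if (!log) {
        return NULL;
    }
    log->log = calloc(size ? size : 1, sizeof(*log->log));
    if (!log->log) {
        free(log);
        return NULL;
    }
    log->size = size;
    log->refcnt = 1;
    live_logs++;
    return log;
}

static void toy_log_put(struct toy_log *log)
{
    assert(log->refcnt > 0);
    if (--log->refcnt == 0) {
        free(log->log);
        free(log);
        live_logs--;
    }
}

/* Stand-in for the VHOST_SET_LOG_BASE call. */
static int toy_set_log_base(struct toy_dev *dev, const struct toy_log *log)
{
    dev->backend_log = log;
    dev->live_at_switch = live_logs;
    return 0;
}

static int toy_log_resize(struct toy_dev *dev, uint64_t size)
{
    struct toy_log *old_log = dev->log;
    struct toy_log *new_log = toy_log_alloc(size);

    if (!new_log) {
        return -1;
    }
    /* 1. Point the backend at the new log while the old one is still valid. */
    if (toy_set_log_base(dev, new_log) < 0) {
        toy_log_put(new_log);
        return -1;
    }
    /* 2. Only now drop the old log, so there is never a window without one. */
    if (old_log) {
        toy_log_put(old_log);
    }
    dev->log = new_log;
    return 0;
}
```

Swapping steps 1 and 2 would leave the backend pointing at a released log until the new base is set, which is the gap to avoid.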
--
Marc-André Lureau
* Re: [Qemu-devel] [PATCH RFC 2/6] posix: add linux-only memfd fallback
2015-07-28 14:25 ` Marc-André Lureau
@ 2015-07-28 16:37 ` Paolo Bonzini
0 siblings, 0 replies; 18+ messages in thread
From: Paolo Bonzini @ 2015-07-28 16:37 UTC (permalink / raw)
To: Marc-André Lureau
Cc: Thibaut Collet, Michael S. Tsirkin, Linhaifeng, QEMU
On 28/07/2015 16:25, Marc-André Lureau wrote:
> > Can the inline function stay in util/memfd.c?
> I see little benefit in that: only the qemu_memfd_alloc helpers would
> then be exported. The inline is probably unnecessary anyway once the
> function moves into memfd.c.
That's just a matter of taste, I agree.
Paolo
Thread overview: 18+ messages
2015-07-23 1:36 [Qemu-devel] [PATCH RFC 0/6] vhost-user: add migration log support Marc-André Lureau
2015-07-23 1:36 ` [Qemu-devel] [PATCH RFC 1/6] configure: probe for memfd Marc-André Lureau
2015-07-23 1:36 ` [Qemu-devel] [PATCH RFC 2/6] posix: add linux-only memfd fallback Marc-André Lureau
2015-07-23 15:25 ` Michael S. Tsirkin
2015-07-28 8:11 ` Paolo Bonzini
2015-07-28 10:58 ` Marc-André Lureau
2015-07-28 11:50 ` Paolo Bonzini
2015-07-28 14:25 ` Marc-André Lureau
2015-07-28 16:37 ` Paolo Bonzini
2015-07-23 1:36 ` [Qemu-devel] [PATCH RFC 3/6] osdep: add memfd helpers Marc-André Lureau
2015-07-23 1:36 ` [Qemu-devel] [PATCH RFC 4/6] vhost: alloc shareable log Marc-André Lureau
2015-07-28 5:28 ` Jason Wang
2015-07-28 10:10 ` Michael S. Tsirkin
2015-07-28 14:42 ` Marc-André Lureau
2015-07-23 1:36 ` [Qemu-devel] [PATCH RFC 5/6] vhost-user: send log shm fd along with log_base Marc-André Lureau
2015-07-23 1:36 ` [Qemu-devel] [PATCH RFC 6/6] vhost-user: document migration log Marc-André Lureau
2015-07-23 15:30 ` Michael S. Tsirkin
2015-07-23 15:36 ` Marc-André Lureau