qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: "Michael S. Tsirkin" <mst@redhat.com>
To: qemu-devel@nongnu.org
Cc: "Peter Maydell" <peter.maydell@linaro.org>,
	"Dr. David Alan Gilbert" <dgilbert@redhat.com>,
	"Marc-André Lureau" <marcandre.lureau@redhat.com>
Subject: [Qemu-devel] [PULL v2 36/50] vhost+postcopy: Send address back to qemu
Date: Tue, 20 Mar 2018 05:17:53 +0200	[thread overview]
Message-ID: <1521515720-612046-37-git-send-email-mst@redhat.com> (raw)
In-Reply-To: <1521515720-612046-1-git-send-email-mst@redhat.com>

From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>

We need a better way, but at the moment we need the address of the
mappings sent back to qemu so it can interpret the messages on the
userfaultfd it reads.

This is done as a 3 stage set:
   QEMU -> client
      set_mem_table

   mmap stuff, get addresses

   client -> qemu
       here are the addresses

   qemu -> client
       OK - now you can use them

That ensures that qemu has registered the new addresses in it's
userfault code before the client starts accessing them.

Note: We don't ask for the default 'ack' reply since we've got our own.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---
 docs/interop/vhost-user.txt           |  9 +++++
 contrib/libvhost-user/libvhost-user.c | 24 ++++++++++++-
 hw/virtio/vhost-user.c                | 67 +++++++++++++++++++++++++++++++++--
 hw/virtio/trace-events                |  1 +
 4 files changed, 98 insertions(+), 3 deletions(-)

diff --git a/docs/interop/vhost-user.txt b/docs/interop/vhost-user.txt
index 0d24203..e295ef1 100644
--- a/docs/interop/vhost-user.txt
+++ b/docs/interop/vhost-user.txt
@@ -455,12 +455,21 @@ Master message types
       Id: 5
       Equivalent ioctl: VHOST_SET_MEM_TABLE
       Master payload: memory regions description
+      Slave payload: (postcopy only) memory regions description
 
       Sets the memory map regions on the slave so it can translate the vring
       addresses. In the ancillary data there is an array of file descriptors
       for each memory mapped region. The size and ordering of the fds matches
       the number and ordering of memory regions.
 
+      When VHOST_USER_POSTCOPY_LISTEN has been received, SET_MEM_TABLE replies with
+      the bases of the memory mapped regions to the master.  The slave must
+      have mmap'd the regions but not yet accessed them and should not yet generate
+      a userfault event. Note NEED_REPLY_MASK is not set in this case.
+      QEMU will then reply back to the list of mappings with an empty
+      VHOST_USER_SET_MEM_TABLE as an acknowledgment; only upon reception of this
+      message may the guest start accessing the memory and generating faults.
+
  * VHOST_USER_SET_LOG_BASE
 
       Id: 6
diff --git a/contrib/libvhost-user/libvhost-user.c b/contrib/libvhost-user/libvhost-user.c
index 7c8cd587..6314549 100644
--- a/contrib/libvhost-user/libvhost-user.c
+++ b/contrib/libvhost-user/libvhost-user.c
@@ -491,10 +491,32 @@ vu_set_mem_table_exec_postcopy(VuDev *dev, VhostUserMsg *vmsg)
                    dev_region->mmap_addr);
         }
 
+        /* Return the address to QEMU so that it can translate the ufd
+         * fault addresses back.
+         */
+        msg_region->userspace_addr = (uintptr_t)(mmap_addr +
+                                                 dev_region->mmap_offset);
         close(vmsg->fds[i]);
     }
 
-    /* TODO: Get address back to QEMU */
+    /* Send the message back to qemu with the addresses filled in */
+    vmsg->fd_num = 0;
+    if (!vu_message_write(dev, dev->sock, vmsg)) {
+        vu_panic(dev, "failed to respond to set-mem-table for postcopy");
+        return false;
+    }
+
+    /* Wait for QEMU to confirm that it's registered the handler for the
+     * faults.
+     */
+    if (!vu_message_read(dev, dev->sock, vmsg) ||
+        vmsg->size != sizeof(vmsg->payload.u64) ||
+        vmsg->payload.u64 != 0) {
+        vu_panic(dev, "failed to receive valid ack for postcopy set-mem-table");
+        return false;
+    }
+
+    /* OK, now we can go and register the memory and generate faults */
     for (i = 0; i < dev->nregions; i++) {
         VuDevRegion *dev_region = &dev->regions[i];
 #ifdef UFFDIO_REGISTER
diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
index b6757eb..1603d70 100644
--- a/hw/virtio/vhost-user.c
+++ b/hw/virtio/vhost-user.c
@@ -174,6 +174,7 @@ struct vhost_user {
     int slave_fd;
     NotifierWithReturn postcopy_notifier;
     struct PostCopyFD  postcopy_fd;
+    uint64_t           postcopy_client_bases[VHOST_MEMORY_MAX_NREGIONS];
     /* True once we've entered postcopy_listen */
     bool               postcopy_listen;
 };
@@ -343,12 +344,15 @@ static int vhost_user_set_log_base(struct vhost_dev *dev, uint64_t base,
 static int vhost_user_set_mem_table_postcopy(struct vhost_dev *dev,
                                              struct vhost_memory *mem)
 {
+    struct vhost_user *u = dev->opaque;
     int fds[VHOST_MEMORY_MAX_NREGIONS];
     int i, fd;
     size_t fd_num = 0;
     bool reply_supported = virtio_has_feature(dev->protocol_features,
                                               VHOST_USER_PROTOCOL_F_REPLY_ACK);
-    /* TODO: Add actual postcopy differences */
+    VhostUserMsg msg_reply;
+    int region_i, msg_i;
+
     VhostUserMsg msg = {
         .hdr.request = VHOST_USER_SET_MEM_TABLE,
         .hdr.flags = VHOST_USER_VERSION,
@@ -395,6 +399,64 @@ static int vhost_user_set_mem_table_postcopy(struct vhost_dev *dev,
         return -1;
     }
 
+    if (vhost_user_read(dev, &msg_reply) < 0) {
+        return -1;
+    }
+
+    if (msg_reply.hdr.request != VHOST_USER_SET_MEM_TABLE) {
+        error_report("%s: Received unexpected msg type."
+                     "Expected %d received %d", __func__,
+                     VHOST_USER_SET_MEM_TABLE, msg_reply.hdr.request);
+        return -1;
+    }
+    /* We're using the same structure, just reusing one of the
+     * fields, so it should be the same size.
+     */
+    if (msg_reply.hdr.size != msg.hdr.size) {
+        error_report("%s: Unexpected size for postcopy reply "
+                     "%d vs %d", __func__, msg_reply.hdr.size, msg.hdr.size);
+        return -1;
+    }
+
+    memset(u->postcopy_client_bases, 0,
+           sizeof(uint64_t) * VHOST_MEMORY_MAX_NREGIONS);
+
+    /* They're in the same order as the regions that were sent
+     * but some of the regions were skipped (above) if they
+     * didn't have fd's
+    */
+    for (msg_i = 0, region_i = 0;
+         region_i < dev->mem->nregions;
+        region_i++) {
+        if (msg_i < fd_num &&
+            msg_reply.payload.memory.regions[msg_i].guest_phys_addr ==
+            dev->mem->regions[region_i].guest_phys_addr) {
+            u->postcopy_client_bases[region_i] =
+                msg_reply.payload.memory.regions[msg_i].userspace_addr;
+            trace_vhost_user_set_mem_table_postcopy(
+                msg_reply.payload.memory.regions[msg_i].userspace_addr,
+                msg.payload.memory.regions[msg_i].userspace_addr,
+                msg_i, region_i);
+            msg_i++;
+        }
+    }
+    if (msg_i != fd_num) {
+        error_report("%s: postcopy reply not fully consumed "
+                     "%d vs %zd",
+                     __func__, msg_i, fd_num);
+        return -1;
+    }
+    /* Now we've registered this with the postcopy code, we ack to the client,
+     * because now we're in the position to be able to deal with any faults
+     * it generates.
+     */
+    /* TODO: Use this for failure cases as well with a bad value */
+    msg.hdr.size = sizeof(msg.payload.u64);
+    msg.payload.u64 = 0; /* OK */
+    if (vhost_user_write(dev, &msg, NULL, 0) < 0) {
+        return -1;
+    }
+
     if (reply_supported) {
         return process_message_reply(dev, &msg);
     }
@@ -411,7 +473,8 @@ static int vhost_user_set_mem_table(struct vhost_dev *dev,
     size_t fd_num = 0;
     bool do_postcopy = u->postcopy_listen && u->postcopy_fd.handler;
     bool reply_supported = virtio_has_feature(dev->protocol_features,
-                                              VHOST_USER_PROTOCOL_F_REPLY_ACK);
+                                          VHOST_USER_PROTOCOL_F_REPLY_ACK) &&
+                                          !do_postcopy;
 
     if (do_postcopy) {
         /* Postcopy has enough differences that it's best done in it's own
diff --git a/hw/virtio/trace-events b/hw/virtio/trace-events
index 06ec03d..05d18ad 100644
--- a/hw/virtio/trace-events
+++ b/hw/virtio/trace-events
@@ -8,6 +8,7 @@ vhost_section(const char *name, int r) "%s:%d"
 
 # hw/virtio/vhost-user.c
 vhost_user_postcopy_listen(void) ""
+vhost_user_set_mem_table_postcopy(uint64_t client_addr, uint64_t qhva, int reply_i, int region_i) "client:0x%"PRIx64" for hva: 0x%"PRIx64" reply %d region %d"
 
 # hw/virtio/virtio.c
 virtqueue_alloc_element(void *elem, size_t sz, unsigned in_num, unsigned out_num) "elem %p size %zd in_num %u out_num %u"
-- 
MST

  parent reply	other threads:[~2018-03-20  3:17 UTC|newest]

Thread overview: 66+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-03-20  3:16 [Qemu-devel] [PULL v2 00/50] virtio, vhost, pci, pc: features, cleanups Michael S. Tsirkin
2018-03-20  3:16 ` [Qemu-devel] [PULL v2 01/50] scripts/update-linux-headers: add ethtool.h and update to 4.16.0-rc4 Michael S. Tsirkin
2018-03-20  3:17 ` [Qemu-devel] [PULL v2 03/50] virtio-net: add linkspeed and duplex settings to virtio-net Michael S. Tsirkin
2018-03-20  3:17 ` [Qemu-devel] [PULL v2 02/50] virtio-net: use 64-bit values for feature flags Michael S. Tsirkin
2018-03-20  3:17 ` [Qemu-devel] [PULL v2 04/50] acpi: remove unused acpi-dsdt.aml Michael S. Tsirkin
2018-03-20  3:17 ` [Qemu-devel] [PULL v2 05/50] pc: replace pm object initialization with one-liner in acpi_get_pm_info() Michael S. Tsirkin
2018-03-20  3:17 ` [Qemu-devel] [PULL v2 07/50] acpi: add build_append_gas() helper for Generic Address Structure Michael S. Tsirkin
2018-03-20  3:17 ` [Qemu-devel] [PULL v2 06/50] acpi: reuse AcpiGenericAddress instead of Acpi20GenericAddress Michael S. Tsirkin
2018-03-20  3:17 ` [Qemu-devel] [PULL v2 08/50] acpi: move ACPI_PORT_SMI_CMD define to header it belongs to Michael S. Tsirkin
2018-03-20  3:17 ` [Qemu-devel] [PULL v2 09/50] pc: acpi: isolate FADT specific data into AcpiFadtData structure Michael S. Tsirkin
2018-03-20  3:17 ` [Qemu-devel] [PULL v2 10/50] pc: acpi: use build_append_foo() API to construct FADT Michael S. Tsirkin
2018-03-20  3:17 ` [Qemu-devel] [PULL v2 11/50] acpi: move build_fadt() from i386 specific to generic ACPI source Michael S. Tsirkin
2018-03-20  3:17 ` [Qemu-devel] [PULL v2 12/50] virt_arm: acpi: reuse common build_fadt() Michael S. Tsirkin
2018-03-20  3:17 ` [Qemu-devel] [PULL v2 13/50] tests: acpi: don't read all fields in test_acpi_fadt_table() Michael S. Tsirkin
2018-03-20  3:17 ` [Qemu-devel] [PULL v2 15/50] hw/pci: remove obsolete PCIDevice->init() Michael S. Tsirkin
2018-03-20  3:17 ` [Qemu-devel] [PULL v2 14/50] standard-headers: update virtio_net.h Michael S. Tsirkin
2018-03-20  3:17 ` [Qemu-devel] [PULL v2 16/50] pc-dimm: make qmp_pc_dimm_device_list() sort devices by address Michael S. Tsirkin
2018-03-20  3:17 ` [Qemu-devel] [PULL v2 17/50] qmp: distinguish PC-DIMM and NVDIMM in MemoryDeviceInfoList Michael S. Tsirkin
2018-03-20  3:17 ` [Qemu-devel] [PULL v2 18/50] hw/acpi-build: build SRAT memory affinity structures for DIMM devices Michael S. Tsirkin
2018-03-20  3:17 ` [Qemu-devel] [PULL v2 19/50] tests/bios-tables-test: add test cases for DIMM proximity Michael S. Tsirkin
2018-03-20  3:17 ` [Qemu-devel] [PULL v2 20/50] test/acpi-test-data: add ACPI tables for dimmpxm test Michael S. Tsirkin
2018-03-20  3:17 ` [Qemu-devel] [PULL v2 21/50] Makefile: add target to print generated files Michael S. Tsirkin
2018-04-13  7:21   ` Markus Armbruster
2018-04-13 10:04     ` Marc-André Lureau
2018-04-13 12:51       ` Michael S. Tsirkin
2018-05-04  5:44         ` Markus Armbruster
2018-03-20  3:17 ` [Qemu-devel] [PULL v2 22/50] migrate: Update ram_block_discard_range for shared Michael S. Tsirkin
2018-03-20  3:17 ` [Qemu-devel] [PULL v2 23/50] qemu_ram_block_host_offset Michael S. Tsirkin
2018-03-20  3:17 ` [Qemu-devel] [PULL v2 24/50] postcopy: use UFFDIO_ZEROPAGE only when available Michael S. Tsirkin
2018-03-20  3:17 ` [Qemu-devel] [PULL v2 25/50] postcopy: Add notifier chain Michael S. Tsirkin
2018-03-20  3:17 ` [Qemu-devel] [PULL v2 26/50] postcopy: Add vhost-user flag for postcopy and check it Michael S. Tsirkin
2018-03-20  3:17 ` [Qemu-devel] [PULL v2 27/50] vhost-user: Add 'VHOST_USER_POSTCOPY_ADVISE' message Michael S. Tsirkin
2018-03-20  3:17 ` [Qemu-devel] [PULL v2 28/50] libvhost-user: Support sending fds back to qemu Michael S. Tsirkin
2018-03-20  3:17 ` [Qemu-devel] [PULL v2 29/50] libvhost-user: Open userfaultfd Michael S. Tsirkin
2018-03-20  3:17 ` [Qemu-devel] [PULL v2 30/50] postcopy: Allow registering of fd handler Michael S. Tsirkin
2018-03-20  3:17 ` [Qemu-devel] [PULL v2 32/50] vhost+postcopy: Transmit 'listen' to slave Michael S. Tsirkin
2018-03-20  3:17 ` [Qemu-devel] [PULL v2 31/50] vhost+postcopy: Register shared ufd with postcopy Michael S. Tsirkin
2018-04-27 16:12   ` Peter Maydell
2018-05-02 10:58     ` Dr. David Alan Gilbert
2018-03-20  3:17 ` [Qemu-devel] [PULL v2 33/50] postcopy+vhost-user: Split set_mem_table for postcopy Michael S. Tsirkin
2018-03-20  3:17 ` [Qemu-devel] [PULL v2 34/50] migration/ram: ramblock_recv_bitmap_test_byte_offset Michael S. Tsirkin
2018-03-20  3:17 ` [Qemu-devel] [PULL v2 35/50] libvhost-user+postcopy: Register new regions with the ufd Michael S. Tsirkin
2018-03-20  3:17 ` [Qemu-devel] [PULL v2 37/50] vhost+postcopy: Stash RAMBlock and offset Michael S. Tsirkin
2018-03-20  3:17 ` Michael S. Tsirkin [this message]
2018-03-20  3:17 ` [Qemu-devel] [PULL v2 39/50] vhost+postcopy: Resolve client address Michael S. Tsirkin
2018-03-20  3:17 ` [Qemu-devel] [PULL v2 38/50] vhost+postcopy: Helper to send requests to source for shared pages Michael S. Tsirkin
2018-03-20  3:17 ` [Qemu-devel] [PULL v2 41/50] postcopy: postcopy_notify_shared_wake Michael S. Tsirkin
2018-03-20  3:17 ` [Qemu-devel] [PULL v2 40/50] postcopy: helper for waking shared Michael S. Tsirkin
2018-03-20  3:17 ` [Qemu-devel] [PULL v2 44/50] libvhost-user: mprotect & madvises for postcopy Michael S. Tsirkin
2018-03-20  3:17 ` [Qemu-devel] [PULL v2 43/50] vhost+postcopy: Call wakeups Michael S. Tsirkin
2018-03-20  3:17 ` [Qemu-devel] [PULL v2 42/50] vhost+postcopy: Add vhost waker Michael S. Tsirkin
2018-03-20  3:17 ` [Qemu-devel] [PULL v2 46/50] vhost+postcopy: Wire up POSTCOPY_END notify Michael S. Tsirkin
2018-03-20  3:17 ` [Qemu-devel] [PULL v2 45/50] vhost-user: Add VHOST_USER_POSTCOPY_END message Michael S. Tsirkin
2018-03-20  3:18 ` [Qemu-devel] [PULL v2 48/50] postcopy: Allow shared memory Michael S. Tsirkin
2018-03-20  3:18 ` [Qemu-devel] [PULL v2 47/50] vhost: Huge page align and merge Michael S. Tsirkin
2018-03-20  3:18 ` [Qemu-devel] [PULL v2 49/50] libvhost-user: Claim support for postcopy Michael S. Tsirkin
2018-03-20  3:18 ` [Qemu-devel] [PULL v2 50/50] postcopy shared docs Michael S. Tsirkin
2018-03-20 14:18 ` [Qemu-devel] [PULL v2 00/50] virtio, vhost, pci, pc: features, cleanups Peter Maydell
2018-03-20 14:37   ` Michael S. Tsirkin
2018-03-20 15:05   ` Michael S. Tsirkin
2018-03-20 15:41     ` Peter Maydell
2018-03-20 15:51       ` Michael S. Tsirkin
2018-03-20 15:54         ` Peter Maydell
2018-03-20 16:02           ` Michael S. Tsirkin
2018-03-20 17:18             ` Peter Maydell
2018-03-20 17:22               ` Michael S. Tsirkin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1521515720-612046-37-git-send-email-mst@redhat.com \
    --to=mst@redhat.com \
    --cc=dgilbert@redhat.com \
    --cc=marcandre.lureau@redhat.com \
    --cc=peter.maydell@linaro.org \
    --cc=qemu-devel@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).