From: "Michael S. Tsirkin" <mst@redhat.com>
To: Prerna Saxena <saxenap.ltc@gmail.com>
Cc: qemu-devel@nongnu.org, marcandre.lureau@gmail.com,
felipe@nutanix.com, anilkumar.boggarapu@nutanix.com,
Prerna Saxena <prerna.saxena@nutanix.com>
Subject: Re: [Qemu-devel] [PATCH v4 2/2] vhost-user: Attempt to fix a race with set_mem_table.
Date: Wed, 27 Jul 2016 16:30:00 +0300
Message-ID: <20160727162921-mutt-send-email-mst@kernel.org>
In-Reply-To: <1469613157-8942-3-git-send-email-saxenap.ltc@gmail.com>
On Wed, Jul 27, 2016 at 02:52:37AM -0700, Prerna Saxena wrote:
> From: Prerna Saxena <prerna.saxena@nutanix.com>
>
> The set_mem_table command currently does not seek a reply. Hence, there is
> no easy way for a remote application to notify QEMU when it has finished
> setting up memory, or if there were errors doing so.
>
> As an example:
> (1) QEMU sends a SET_MEM_TABLE to the backend (e.g., a vhost-user net
> application). SET_MEM_TABLE does not require a reply according to the spec.
> (2) QEMU commits the memory to the guest.
> (3) The guest issues an I/O operation over a new memory region which was configured in (1).
> (4) The application has not yet remapped the memory, but it sees the I/O request.
> (5) The application cannot satisfy the request because it does not know about those GPAs.
>
> While a guaranteed fix would require a protocol extension (committed separately),
> a best-effort workaround for existing applications is to send a GET_FEATURES
> message before completing the vhost_user_set_mem_table() call.
> Since GET_FEATURES requires a reply, an application that processes vhost-user
> messages synchronously would probably have completed the SET_MEM_TABLE before replying.
>
> Signed-off-by: Prerna Saxena <prerna.saxena@nutanix.com>
Could you please reorder the patchset so this is 1/2?
Patch 1/2 (REPLY_ACK) is still under review, but I'd like to make sure
we have some kind of fix in place for 2.7.
> ---
> hw/virtio/vhost-user.c | 123 ++++++++++++++++++++++++++-----------------------
> 1 file changed, 65 insertions(+), 58 deletions(-)
>
> diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
> index 0cdb918..f96607e 100644
> --- a/hw/virtio/vhost-user.c
> +++ b/hw/virtio/vhost-user.c
> @@ -254,64 +254,6 @@ static int vhost_user_set_log_base(struct vhost_dev *dev, uint64_t base,
> return 0;
> }
>
> -static int vhost_user_set_mem_table(struct vhost_dev *dev,
> - struct vhost_memory *mem)
> -{
> - int fds[VHOST_MEMORY_MAX_NREGIONS];
> - int i, fd;
> - size_t fd_num = 0;
> - bool reply_supported = virtio_has_feature(dev->protocol_features,
> - VHOST_USER_PROTOCOL_F_REPLY_ACK);
> -
> - VhostUserMsg msg = {
> - .request = VHOST_USER_SET_MEM_TABLE,
> - .flags = VHOST_USER_VERSION,
> - };
> -
> - if (reply_supported) {
> - msg.flags |= VHOST_USER_NEED_RESPONSE_MASK;
> - }
> -
> - for (i = 0; i < dev->mem->nregions; ++i) {
> - struct vhost_memory_region *reg = dev->mem->regions + i;
> - ram_addr_t offset;
> - MemoryRegion *mr;
> -
> - assert((uintptr_t)reg->userspace_addr == reg->userspace_addr);
> - mr = memory_region_from_host((void *)(uintptr_t)reg->userspace_addr,
> - &offset);
> - fd = memory_region_get_fd(mr);
> - if (fd > 0) {
> - msg.payload.memory.regions[fd_num].userspace_addr = reg->userspace_addr;
> - msg.payload.memory.regions[fd_num].memory_size = reg->memory_size;
> - msg.payload.memory.regions[fd_num].guest_phys_addr = reg->guest_phys_addr;
> - msg.payload.memory.regions[fd_num].mmap_offset = offset;
> - assert(fd_num < VHOST_MEMORY_MAX_NREGIONS);
> - fds[fd_num++] = fd;
> - }
> - }
> -
> - msg.payload.memory.nregions = fd_num;
> -
> - if (!fd_num) {
> - error_report("Failed initializing vhost-user memory map, "
> - "consider using -object memory-backend-file share=on");
> - return -1;
> - }
> -
> - msg.size = sizeof(msg.payload.memory.nregions);
> - msg.size += sizeof(msg.payload.memory.padding);
> - msg.size += fd_num * sizeof(VhostUserMemoryRegion);
> -
> - vhost_user_write(dev, &msg, fds, fd_num);
> -
> - if (reply_supported) {
> - return process_message_reply(dev, msg.request);
> - }
> -
> - return 0;
> -}
> -
> static int vhost_user_set_vring_addr(struct vhost_dev *dev,
> struct vhost_vring_addr *addr)
> {
> @@ -514,6 +456,71 @@ static int vhost_user_get_features(struct vhost_dev *dev, uint64_t *features)
> return vhost_user_get_u64(dev, VHOST_USER_GET_FEATURES, features);
> }
>
> +static int vhost_user_set_mem_table(struct vhost_dev *dev,
> + struct vhost_memory *mem)
> +{
> + int fds[VHOST_MEMORY_MAX_NREGIONS];
> + int i, fd;
> + size_t fd_num = 0;
> + uint64_t features;
> + bool reply_supported = virtio_has_feature(dev->protocol_features,
> + VHOST_USER_PROTOCOL_F_REPLY_ACK);
> +
> + VhostUserMsg msg = {
> + .request = VHOST_USER_SET_MEM_TABLE,
> + .flags = VHOST_USER_VERSION,
> + };
> +
> + if (reply_supported) {
> + msg.flags |= VHOST_USER_NEED_RESPONSE_MASK;
> + }
> +
> + for (i = 0; i < dev->mem->nregions; ++i) {
> + struct vhost_memory_region *reg = dev->mem->regions + i;
> + ram_addr_t offset;
> + MemoryRegion *mr;
> +
> + assert((uintptr_t)reg->userspace_addr == reg->userspace_addr);
> + mr = memory_region_from_host((void *)(uintptr_t)reg->userspace_addr,
> + &offset);
> + fd = memory_region_get_fd(mr);
> + if (fd > 0) {
> + msg.payload.memory.regions[fd_num].userspace_addr = reg->userspace_addr;
> + msg.payload.memory.regions[fd_num].memory_size = reg->memory_size;
> + msg.payload.memory.regions[fd_num].guest_phys_addr = reg->guest_phys_addr;
> + msg.payload.memory.regions[fd_num].mmap_offset = offset;
> + assert(fd_num < VHOST_MEMORY_MAX_NREGIONS);
> + fds[fd_num++] = fd;
> + }
> + }
> +
> + msg.payload.memory.nregions = fd_num;
> +
> + if (!fd_num) {
> + error_report("Failed initializing vhost-user memory map, "
> + "consider using -object memory-backend-file share=on");
> + return -1;
> + }
> +
> + msg.size = sizeof(msg.payload.memory.nregions);
> + msg.size += sizeof(msg.payload.memory.padding);
> + msg.size += fd_num * sizeof(VhostUserMemoryRegion);
> +
> + vhost_user_write(dev, &msg, fds, fd_num);
> +
> + if (reply_supported) {
> + return process_message_reply(dev, msg.request);
> + } else {
> + /* Note: It is (yet) unknown when the client application has finished
> + * remapping the GPA.
> + * Attempt to prevent a race by sending a command that requires a reply.
> + */
> + vhost_user_get_features(dev, &features);
> + }
> +
> + return 0;
> +}
> +
> static int vhost_user_set_owner(struct vhost_dev *dev)
> {
> VhostUserMsg msg = {
> --
> 1.8.1.2
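
For illustration, the ordering argument behind this workaround can be sketched
as follows (this is not code from the patch; the simplified message layout and
the helpers read_msg()/remap_regions()/send_reply() are hypothetical). A backend
that handles vhost-user requests one at a time, in arrival order, cannot send
the GET_FEATURES reply that QEMU blocks on before it has finished processing
the SET_MEM_TABLE that precedes it, so the blocking wait doubles as a
completion barrier:

    /* Illustrative single-threaded backend loop; framing details omitted. */
    #include <stdint.h>

    #define VHOST_USER_GET_FEATURES   1   /* request codes per the vhost-user spec */
    #define VHOST_USER_SET_MEM_TABLE  5

    typedef struct {
        uint32_t request;
        uint32_t flags;
        uint32_t size;
        union {
            uint64_t u64;
            /* memory-table payload omitted for brevity */
        } payload;
    } SketchMsg;

    /* Hypothetical helpers, assumed to be provided by the backend. */
    extern int  read_msg(int fd, SketchMsg *msg);
    extern void remap_regions(const SketchMsg *msg);   /* mmap() the new regions */
    extern void send_reply(int fd, const SketchMsg *msg);

    static void backend_loop(int fd, uint64_t backend_features)
    {
        SketchMsg msg;

        /* Requests are handled strictly in the order they arrive. */
        while (read_msg(fd, &msg) == 0) {
            switch (msg.request) {
            case VHOST_USER_SET_MEM_TABLE:
                /* The spec does not require a reply for this request. */
                remap_regions(&msg);
                break;
            case VHOST_USER_GET_FEATURES:
                /* By the time this reply goes out, the SET_MEM_TABLE above
                 * has already been processed, so QEMU's blocking wait for
                 * it acts as a completion barrier for the memory update. */
                msg.payload.u64 = backend_features;
                msg.size = sizeof(msg.payload.u64);
                send_reply(fd, &msg);
                break;
            default:
                break;
            }
        }
    }

A backend that queues requests and processes them asynchronously gives no such
guarantee, which is why this remains a best-effort workaround and the REPLY_ACK
protocol feature in patch 1/2 is the proper fix.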