From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
To: Igor Mammedov <imammedo@redhat.com>
Cc: qemu-devel@nongnu.org, maxime.coquelin@redhat.com,
	a.perevalov@samsung.com, mst@redhat.com,
	marcandre.lureau@redhat.com, lvivier@redhat.com,
	aarcange@redhat.com, felipe@nutanix.com, peterx@redhat.com,
	quintela@redhat.com
Subject: Re: [Qemu-devel] [RFC v2 30/32] vhost: Merge neighbouring hugepage regions where appropriate
Date: Mon, 25 Sep 2017 12:19:55 +0100
Message-ID: <20170925111954.GA3178@work-vm>
In-Reply-To: <20170914111832.7edc9268@nial.brq.redhat.com>

* Igor Mammedov (imammedo@redhat.com) wrote:
> On Thu, 24 Aug 2017 20:27:28 +0100
> "Dr. David Alan Gilbert (git)" <dgilbert@redhat.com> wrote:
> 
> > From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> > 
> > Where two regions are created with a gap such that when aligned
> > to hugepage boundaries, the two regions overlap, merge them.
> why only hugepage boundaries, it should be applicable any alignment

Actually this patch isn't huge-page specific - it just aligns to the
page size of the backing RAMBlock; but do we ever hit a case where a
region is smaller than a normal page and thus is changed by this?
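
To make the "just aligns to the pagesize" point concrete, here is a
minimal standalone sketch of the same mask arithmetic the patch uses
(addr & (pagesize - 1)). The toy_* struct and function names are made up
for illustration and are not the QEMU types, and the addresses in the
note below are only illustrative.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a vhost memory region; not the QEMU struct. */
struct toy_region {
    uint64_t guest_phys_addr;
    uint64_t memory_size;
    uint64_t userspace_addr;
};

/* Number of bytes by which addr sits above the previous page boundary.
 * Assumes pagesize is a non-zero power of two. */
static uint64_t toy_misalignment(uint64_t addr, uint64_t pagesize)
{
    assert(pagesize != 0 && (pagesize & (pagesize - 1)) == 0);
    return addr & (pagesize - 1);
}

/* Move the start of the region down to a page boundary, shifting the
 * userspace address by the same amount and growing the size to match. */
static void toy_stretch_start(struct toy_region *r, uint64_t pagesize)
{
    uint64_t a = toy_misalignment(r->guest_phys_addr, pagesize);

    r->guest_phys_addr -= a;
    r->userspace_addr -= a;
    r->memory_size += a;
}
```

With a 2MB page size, a region starting at guest physical 0xc0000 is
0xc0000 bytes past the boundary, so stretching moves its start to 0 and
grows it by the same amount; with 4k pages the same start is already
aligned and nothing changes.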

> I'd say the patch isn't what I've had in mind when we discussed issue,

Ah

> it builds on already existing merging code and complicates
> code even more.

Yes, it is a little complex.

> Have you looked into possibility to rebuild memory map from scratch
> every time vhost_region_add/vhost_region_del is called or even at
> vhost_commit() time to reduce rebuild from a set of memory sections
> that vhost tracks?
> That should simplify algorithm a lot as memory sections are coming
> from flat view and never overlap compared to current merged memory
> map in vhost_dev::mem, so it won't have to deal with first splitting
> and then merging back every time flatview changes.

I hadn't; I was concentrating on changing the existing code rather than
reworking it - especially since I don't/didn't know much about the
notifiers.

Are you suggesting that basically vhost_region_add/del do nothing
(except maybe set a flag) and the real work gets done in vhost_commit()?
(I also found I had to call the merge from vhost_dev_start as well as
vhost_commit - I guess from the first use?)
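
A minimal sketch of that "set a flag, do the work at commit" shape, in
case it helps pin down what is being asked. Everything here is a toy:
toy_dev and the toy_* functions are hypothetical and only mimic the
roles of the add/del/commit callbacks, they are not the vhost code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical device state; not struct vhost_dev. */
struct toy_dev {
    bool mem_dirty;     /* set whenever a region is added or removed */
    unsigned rebuilds;  /* counts how often the map was rebuilt */
};

/* add/del only record that something changed */
static void toy_region_add(struct toy_dev *d)
{
    d->mem_dirty = true;
}

static void toy_region_del(struct toy_dev *d)
{
    d->mem_dirty = true;
}

/* commit does the real work, once, however many add/del calls came before */
static void toy_commit(struct toy_dev *d)
{
    if (!d->mem_dirty) {
        return;
    }
    d->rebuilds++;      /* placeholder for rebuilding the memory map */
    d->mem_dirty = false;
}
```

Several add/del calls followed by one commit then cost a single rebuild,
and a commit with nothing changed costs none.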

If I just did everything in vhost_commit, where do I start - is that
using something like address_space_to_flatview(address_space_memory) to
get the main FlatView and somehow walk that?
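
For what it's worth, the rebuild itself looks simple once the input is a
sorted, non-overlapping list. A minimal sketch, assuming the sections
have somehow already been collected and sorted by guest physical address
(how to get them, from the listener callbacks or by walking a FlatView,
is exactly the open question above). toy_section and toy_rebuild_map are
hypothetical names, not QEMU APIs:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flattened section; not MemoryRegionSection. */
struct toy_section {
    uint64_t gpa;    /* guest physical start */
    uint64_t size;   /* length in bytes */
    uint64_t uaddr;  /* userspace (host virtual) start */
};

/* Rebuild out[] from scratch from in[], which must be sorted by gpa and
 * non-overlapping. Neighbours are merged only when they are contiguous
 * in both guest physical and userspace address. Returns the number of
 * regions written to out[]. */
static size_t toy_rebuild_map(const struct toy_section *in, size_t n,
                              struct toy_section *out, size_t out_cap)
{
    size_t m = 0;

    for (size_t i = 0; i < n; i++) {
        if (m > 0) {
            struct toy_section *prev = &out[m - 1];

            /* input contract: sorted and non-overlapping */
            assert(prev->gpa + prev->size <= in[i].gpa);

            if (prev->gpa + prev->size == in[i].gpa &&
                prev->uaddr + prev->size == in[i].uaddr) {
                prev->size += in[i].size;   /* extend the previous region */
                continue;
            }
        }
        assert(m < out_cap);
        out[m++] = in[i];                   /* start a new region */
    }
    return m;
}
```

Since the output is built in a single pass there is no splitting step at
all; this sketch does not attempt the page-size stretching from the patch.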

Dave

> > I also add quite a few trace events to see what's going on.
> > 
> > Note: This doesn't handle all the cases, but does handle the common
> > case on a PC due to the 640k hole.
> > 
> > Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> > Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
> > ---
> >  hw/virtio/trace-events | 11 +++++++
> >  hw/virtio/vhost.c      | 79 +++++++++++++++++++++++++++++++++++++++++++++++++-
> >  2 files changed, 89 insertions(+), 1 deletion(-)
> > 
> > diff --git a/hw/virtio/trace-events b/hw/virtio/trace-events
> > index 5b599617a1..f98efb39fd 100644
> > --- a/hw/virtio/trace-events
> > +++ b/hw/virtio/trace-events
> > @@ -1,5 +1,16 @@
> >  # See docs/devel/tracing.txt for syntax documentation.
> >  
> > +# hw/virtio/vhost.c
> > +vhost_dev_assign_memory_merged(int from, int to, uint64_t size, uint64_t start_addr, uint64_t uaddr) "f/t=%d/%d 0x%"PRIx64" @ P: 0x%"PRIx64" U: 0x%"PRIx64
> > +vhost_dev_assign_memory_not_merged(uint64_t size, uint64_t start_addr, uint64_t uaddr) "0x%"PRIx64" @ P: 0x%"PRIx64" U: 0x%"PRIx64
> > +vhost_dev_assign_memory_entry(uint64_t size, uint64_t start_addr, uint64_t uaddr) "0x%"PRIx64" @ P: 0x%"PRIx64" U: 0x%"PRIx64
> > +vhost_dev_assign_memory_exit(uint32_t nregions) "%"PRId32
> > +vhost_huge_page_stretch_and_merge_entry(uint32_t nregions) "%"PRId32
> > +vhost_huge_page_stretch_and_merge_can(void) ""
> > +vhost_huge_page_stretch_and_merge_size_align(int d, uint64_t gpa, uint64_t align) "%d: gpa: 0x%"PRIx64" align: 0x%"PRIx64
> > +vhost_huge_page_stretch_and_merge_start_align(int d, uint64_t gpa, uint64_t align) "%d: gpa: 0x%"PRIx64" align: 0x%"PRIx64
> > +vhost_section(const char *name, int r) "%s:%d"
> > +
> >  # hw/virtio/vhost-user.c
> >  vhost_user_postcopy_end_entry(void) ""
> >  vhost_user_postcopy_end_exit(void) ""
> > diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
> > index 6eddb099b0..fb506e747f 100644
> > --- a/hw/virtio/vhost.c
> > +++ b/hw/virtio/vhost.c
> > @@ -27,6 +27,7 @@
> >  #include "hw/virtio/virtio-access.h"
> >  #include "migration/blocker.h"
> >  #include "sysemu/dma.h"
> > +#include "trace.h"
> >  
> >  /* enabled until disconnected backend stabilizes */
> >  #define _VHOST_DEBUG 1
> > @@ -250,6 +251,8 @@ static void vhost_dev_assign_memory(struct vhost_dev *dev,
> >  {
> >      int from, to;
> >      struct vhost_memory_region *merged = NULL;
> > +    trace_vhost_dev_assign_memory_entry(size, start_addr, uaddr);
> > +
> >      for (from = 0, to = 0; from < dev->mem->nregions; ++from, ++to) {
> >          struct vhost_memory_region *reg = dev->mem->regions + to;
> >          uint64_t prlast, urlast;
> > @@ -293,11 +296,13 @@ static void vhost_dev_assign_memory(struct vhost_dev *dev,
> >          uaddr = merged->userspace_addr = u;
> >          start_addr = merged->guest_phys_addr = s;
> >          size = merged->memory_size = e - s + 1;
> > +        trace_vhost_dev_assign_memory_merged(from, to, size, start_addr, uaddr);
> >          assert(merged->memory_size);
> >      }
> >  
> >      if (!merged) {
> >          struct vhost_memory_region *reg = dev->mem->regions + to;
> > +        trace_vhost_dev_assign_memory_not_merged(size, start_addr, uaddr);
> >          memset(reg, 0, sizeof *reg);
> >          reg->memory_size = size;
> >          assert(reg->memory_size);
> > @@ -307,6 +312,7 @@ static void vhost_dev_assign_memory(struct vhost_dev *dev,
> >      }
> >      assert(to <= dev->mem->nregions + 1);
> >      dev->mem->nregions = to;
> > +    trace_vhost_dev_assign_memory_exit(to);
> >  }
> >  
> >  static uint64_t vhost_get_log_size(struct vhost_dev *dev)
> > @@ -610,8 +616,12 @@ static void vhost_set_memory(MemoryListener *listener,
> >  
> >  static bool vhost_section(MemoryRegionSection *section)
> >  {
> > -    return memory_region_is_ram(section->mr) &&
> > +    bool result;
> > +    result = memory_region_is_ram(section->mr) &&
> >          !memory_region_is_rom(section->mr);
> > +
> > +    trace_vhost_section(section->mr->name, result);
> > +    return result;
> >  }
> >  
> >  static void vhost_begin(MemoryListener *listener)
> > @@ -622,6 +632,68 @@ static void vhost_begin(MemoryListener *listener)
> >      dev->mem_changed_start_addr = -1;
> >  }
> >  
> > +/* Look for regions that are hugepage backed but not aligned
> > + * and fix them up to be aligned.
> > + * TODO: For now this is just enough to deal with the 640k hole
> > + */
> > +static bool vhost_huge_page_stretch_and_merge(struct vhost_dev *dev)
> > +{
> > +    int i, j;
> > +    bool result = true;
> > +    trace_vhost_huge_page_stretch_and_merge_entry(dev->mem->nregions);
> > +
> > +    for (i = 0; i < dev->mem->nregions; i++) {
> > +        struct vhost_memory_region *reg = dev->mem->regions + i;
> > +        ram_addr_t offset;
> > +        RAMBlock *rb = qemu_ram_block_from_host((void *)reg->userspace_addr,
> > +                                                false, &offset);
> > +        size_t pagesize = qemu_ram_pagesize(rb);
> > +        uint64_t alignage;
> > +        alignage = reg->guest_phys_addr & (pagesize - 1);
> > +        if (alignage) {
> > +
> > +            trace_vhost_huge_page_stretch_and_merge_start_align(i,
> > +                                                (uint64_t)reg->guest_phys_addr,
> > +                                                alignage);
> > +            for (j = 0; j < dev->mem->nregions; j++) {
> > +                struct vhost_memory_region *oreg = dev->mem->regions + j;
> > +                if (j == i) {
> > +                    continue;
> > +                }
> > +
> > +                if (oreg->guest_phys_addr ==
> > +                        (reg->guest_phys_addr - alignage) &&
> > +                    oreg->userspace_addr ==
> > +                         (reg->userspace_addr - alignage)) {
> > +                    struct vhost_memory_region treg = *reg;
> > +                    trace_vhost_huge_page_stretch_and_merge_can();
> > +                    vhost_dev_unassign_memory(dev, oreg->guest_phys_addr,
> > +                                              oreg->memory_size);
> > +                    vhost_dev_unassign_memory(dev, treg.guest_phys_addr,
> > +                                              treg.memory_size);
> > +                    vhost_dev_assign_memory(dev,
> > +                                            treg.guest_phys_addr - alignage,
> > +                                            treg.memory_size + alignage,
> > +                                            treg.userspace_addr - alignage);
> > +                    return vhost_huge_page_stretch_and_merge(dev);
> > +                }
> > +            }
> > +        }
> > +        alignage = reg->memory_size & (pagesize - 1);
> > +        if (alignage) {
> > +            trace_vhost_huge_page_stretch_and_merge_size_align(i,
> > +                                               (uint64_t)reg->guest_phys_addr,
> > +                                               alignage);
> > +            /* We ignore this if we find something else to merge,
> > +             * so we only return false if we're left with this
> > +             */
> > +            result = false;
> > +        }
> > +    }
> > +
> > +    return result;
> > +}
> > +
> >  static void vhost_commit(MemoryListener *listener)
> >  {
> >      struct vhost_dev *dev = container_of(listener, struct vhost_dev,
> > @@ -641,6 +713,7 @@ static void vhost_commit(MemoryListener *listener)
> >          return;
> >      }
> >  
> > +    vhost_huge_page_stretch_and_merge(dev);
> >      if (dev->started) {
> >          start_addr = dev->mem_changed_start_addr;
> >          size = dev->mem_changed_end_addr - dev->mem_changed_start_addr + 1;
> > @@ -1512,6 +1585,10 @@ int vhost_dev_start(struct vhost_dev *hdev, VirtIODevice *vdev)
> >          goto fail_features;
> >      }
> >  
> > +    if (!vhost_huge_page_stretch_and_merge(hdev)) {
> > +        VHOST_OPS_DEBUG("vhost_huge_page_stretch_and_merge failed");
> > +        goto fail_mem;
> > +    }
> >      if (vhost_dev_has_iommu(hdev)) {
> >          memory_listener_register(&hdev->iommu_listener, vdev->dma_as);
> >      }
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
