From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
To: Igor Mammedov <imammedo@redhat.com>
Cc: qemu-devel@nongnu.org, mst@redhat.com, groug@kaod.org,
stefanha@gmail.com
Subject: Re: [Qemu-devel] [RFC] vhost: check if ring mapping is still valid when building memmap
Date: Mon, 11 Dec 2017 11:03:00 +0000 [thread overview]
Message-ID: <20171211110259.GB2424@work-vm> (raw)
In-Reply-To: <20171211103755.2c95f86a@redhat.com>
* Igor Mammedov (imammedo@redhat.com) wrote:
> On Fri, 8 Dec 2017 17:51:56 +0000
> "Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:
>
> > * Igor Mammedov (imammedo@redhat.com) wrote:
> > > On Thu, 7 Dec 2017 18:17:51 +0000
> > > "Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:
> > >
> > > > * Igor Mammedov (imammedo@redhat.com) wrote:
> > > > > vhost_verify_ring_mappings() were used to verify that
> > > > > rings are still accessible and related memory hasn't
> > > > > been moved after flatview is updated.
> > > > >
> > > > > It were doing checks by mapping ring's GPA+len and
> > > > > checking that HVA hasn't changed with new memory map.
> > > > > To avoid maybe expensive mapping call, we were
> > > > > identifying address range that changed and were doing
> > > > > mapping only if ring were in changed range.
> > > > >
> > > > > However it's no neccessary to perform ringi's GPA
> > > > > mapping as we already have it's current HVA and all
> > > > > we need is to verify that ring's GPA translates to
> > > > > the same HVA in updated flatview.
> > > > >
> > > > > That could be done during time when we are building
> > > > > vhost memmap where vhost_update_mem_cb() already maps
> > > > > every RAM MemoryRegionSection to get section HVA. This
> > > > > fact can be used to check that ring belongs to the same
> > > > > MemoryRegion in the same place, like this:
> > > > >
> > > > > hva_ring_offset = GPA(ring) - GPA(MemoryRegionSection)
> > > > > ring_hva == HVA(MemoryRegionSection) + hva_ring_offset
> > > > >
> > > > > Doing this would allow us to drop ~100LOC of code which
> > > > > figures out changed memory range and avoid calling not
> > > > > needed reverse vhost_memory_map(is_write=true) which is
> > > > > overkill for the task at hand.
> > > >
> > > > There are a few things about this I don't understand; however if
> > > > it's right I agree that it means we can kill off my comparison
> > > > code.
> > > >
> > > >
> > > > First, can I check what constraints 'verify_ring' needs to check;
> > > > if I'm understanding the original code it's checking that :
> > > > a) If a queue falls within a region, that the whole queue can
> > > > be mapped
> > > see vhost_verify_ring_part_mapping():
> > >
> > > p = vhost_memory_map(dev, part_addr, &l, 1);
> > > if (!p || l != part_size)
> > > error_out
> > >
> > > 1st: (!p) requested part_addr must be mapped
> > > i.e. be a part of MemorySection in flatview
> > > AND
> > > 2nd: (l != part_size) mapped range size must match what we asked for
> > > i.e. mapped as continuous range so that
> > > [GPA, GPA + part_size) could directly correspond to [HVA, HVA + part_size)
> > > and that's is possible only withing 1 MemorySection && 1 MeoryRegion
> > > if I read address_space_map() correctly
> > > flatview_translate() -> GPA's MemorySection
> > > flatview_extend_translation() -> 1:1 GPA->HVA range size
> > >
> > > So answer in case of RAM memory region that we are interested in, would be:
> > > queue falls within a MemorySection and whole queue fits in to it
> >
> > Yes, OK.
> >
> > > > b) And the start of the queue corresponds to where we thought
> > > > it used to be (in GPA?)
> > > that cached at vq->desc queue HVA hasn't changed after flatview change
> > > if (p != part)
> > > error_out
> >
> > OK, so it's the HVA that's not changed based on taking the part_addr
> > which is GPA and checking the map?
> Yes, I think so.
>
> > > > so that doesn't have any constraint on the ordering of the regions
> > > > or whether the region is the same size or anything.
> > > > Also I don't think it would spot if there was a qeueue that had no
> > > > region associated with it at all.
> > > >
> > > > Now, comments on your code below:
> > > >
> > > >
> > > > > Signed-off-by: Igor Mammedov <imammedo@redhat.com>
> > > > > ---
> > > > > PS:
> > > > > should be applied on top of David's series
> > > > >
> > > > > ---
> > > > > include/hw/virtio/vhost.h | 2 -
> > > > > hw/virtio/vhost.c | 195 ++++++++++++++--------------------------------
> > > > > 2 files changed, 57 insertions(+), 140 deletions(-)
> > > > >
> > > > > diff --git a/include/hw/virtio/vhost.h b/include/hw/virtio/vhost.h
> > > > > index 467dc77..fc4af5c 100644
> > > > > --- a/include/hw/virtio/vhost.h
> > > > > +++ b/include/hw/virtio/vhost.h
> > > > > @@ -68,8 +68,6 @@ struct vhost_dev {
> > > > > uint64_t log_size;
> > > > > Error *migration_blocker;
> > > > > bool memory_changed;
> > > > > - hwaddr mem_changed_start_addr;
> > > > > - hwaddr mem_changed_end_addr;
> > > > > const VhostOps *vhost_ops;
> > > > > void *opaque;
> > > > > struct vhost_log *log;
> > > > > diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
> > > > > index 5b9a7d7..026bac3 100644
> > > > > --- a/hw/virtio/vhost.c
> > > > > +++ b/hw/virtio/vhost.c
> > > > > @@ -296,35 +296,43 @@ static void vhost_memory_unmap(struct vhost_dev *dev, void *buffer,
> > > > > }
> > > > > }
> > > > >
> > > > > -static int vhost_verify_ring_part_mapping(struct vhost_dev *dev,
> > > > > - void *part,
> > > > > - uint64_t part_addr,
> > > > > - uint64_t part_size,
> > > > > - uint64_t start_addr,
> > > > > - uint64_t size)
> > > > > +static int vhost_verify_ring_part_mapping(void *ring_hva,
> > > > > + uint64_t ring_gpa,
> > > > > + uint64_t ring_size,
> > > > > + void *reg_hva,
> > > > > + uint64_t reg_gpa,
> > > > > + uint64_t reg_size)
> > > > > {
> > > > > - hwaddr l;
> > > > > - void *p;
> > > > > - int r = 0;
> > > > > + uint64_t hva_ring_offset;
> > > > > + uint64_t ring_last = range_get_last(ring_gpa, ring_size);
> > > > > + uint64_t reg_last = range_get_last(reg_gpa, reg_size);
> > > > >
> > > > > - if (!ranges_overlap(start_addr, size, part_addr, part_size)) {
> > > > > + /* start check from the end so that the rest of checks
> > > > > + * are done when whole ring is in merged sections range
> > > > > + */
> > > > > + if (ring_last < reg_last || ring_gpa > reg_last) {
> > > > > return 0;
> > > > > }
> > > >
> > > > so does that mean if our region never grows to be as big as the ring
> > > > we wont spot the problem?
> > > I think there is mistake here it should be:
> > > ring_last < reg_gpa || ring_gpa > reg_last
> > >
> > > so if ring is out side of continuous region, we do not care => no change
> >
> > OK, I don't see how that corresponds to your 'start check from the end'
> > comment - I thought it was doing something smarter to deal with this
> > being called from the _cb routine potentially before another part of the
> > range had been joined onto it.
> > In that case, we can just use ranges_overlap like the original
> > routine did.
> I suppose range check will do as well, the reason for check from the end
> was optimization to make less checks than ranges_overlap(),
> considering we are comparing against vhost_memory_region.
Have a look at the RFC v2 I've posted; I've reworked it a bit so
we call this code aftwards rather than from the _cb.
> But it seems like a bit an overdoing, since by the commit time
> flatview would contain 1:1 mapping of MemorySection:vhost_memory_region
> (modulo sections we are not interested in). It should be unlikely to
> get 2 MemoryRegions allocated one after another and merge in vhost_memory_region()
> into one vhost_memory_region. Maybe we would be better off with just
> copying MemorySections into vhost_memory_region in vhost_update_mem_cb()
> and drop merging altogether.
Except I need the merging for the hugepage stuff that's coming in the
postcopy world.
> I'd even consider 'non deterministic' merging of 2 MemoryRegions harmful
> to migration, since vhost might see different memory map on target vs source.
I think that's come up before and it was decided it's not a problem
as long as the regions are still mapped; the only consistency required
is between the qemu and the vhost (either the kernel or the vhost-user);
it shouldn't be a guest visibile issue if I understand it correctly.
Dave
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
next prev parent reply other threads:[~2017-12-11 11:03 UTC|newest]
Thread overview: 34+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-11-29 18:50 [Qemu-devel] [RFC 0/7] Rework vhost memory region updates Dr. David Alan Gilbert (git)
2017-11-29 18:50 ` [Qemu-devel] [RFC 1/7] memory: address_space_iterate Dr. David Alan Gilbert (git)
2017-11-29 18:50 ` [Qemu-devel] [RFC 2/7] vhost: Move log_dirty check Dr. David Alan Gilbert (git)
2017-11-29 18:50 ` [Qemu-devel] [RFC 3/7] vhost: New memory update functions Dr. David Alan Gilbert (git)
2017-11-30 15:48 ` Igor Mammedov
2017-12-05 18:25 ` Dr. David Alan Gilbert
2017-11-29 18:50 ` [Qemu-devel] [RFC 4/7] vhost: update_mem_cb implementation Dr. David Alan Gilbert (git)
2017-11-30 11:27 ` Igor Mammedov
2017-12-06 20:09 ` Dr. David Alan Gilbert
2017-11-29 18:50 ` [Qemu-devel] [RFC 5/7] vhost: Compare new and old memory lists Dr. David Alan Gilbert (git)
2017-11-29 18:50 ` [Qemu-devel] [RFC 6/7] vhost: Copy updated region data into device state Dr. David Alan Gilbert (git)
2017-11-29 18:50 ` [Qemu-devel] [RFC 7/7] vhost: Remove vhost_set_memory and children Dr. David Alan Gilbert (git)
2017-11-30 11:22 ` [Qemu-devel] [RFC 0/7] Rework vhost memory region updates Igor Mammedov
2017-11-30 12:08 ` Dr. David Alan Gilbert
2017-11-30 12:40 ` Igor Mammedov
2017-11-30 12:47 ` Dr. David Alan Gilbert
2017-11-30 12:58 ` Igor Mammedov
2017-11-30 13:06 ` Dr. David Alan Gilbert
2017-11-30 15:08 ` Igor Mammedov
2017-11-30 15:18 ` Dr. David Alan Gilbert
2017-11-30 15:32 ` Igor Mammedov
2017-11-30 15:41 ` Dr. David Alan Gilbert
2017-11-30 16:51 ` Greg Kurz
2017-12-01 10:02 ` Stefan Hajnoczi
2017-12-01 10:19 ` Dr. David Alan Gilbert
2017-12-01 14:22 ` [Qemu-devel] [RFC] vhost: check if ring mapping is still valid when building memmap Igor Mammedov
2017-12-07 18:17 ` Dr. David Alan Gilbert
2017-12-08 14:42 ` Igor Mammedov
2017-12-08 17:51 ` Dr. David Alan Gilbert
2017-12-11 9:37 ` Igor Mammedov
2017-12-11 11:03 ` Dr. David Alan Gilbert [this message]
2017-12-11 13:45 ` Igor Mammedov
2017-12-11 15:43 ` Dr. David Alan Gilbert
2017-12-08 20:45 ` Dr. David Alan Gilbert
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20171211110259.GB2424@work-vm \
--to=dgilbert@redhat.com \
--cc=groug@kaod.org \
--cc=imammedo@redhat.com \
--cc=mst@redhat.com \
--cc=qemu-devel@nongnu.org \
--cc=stefanha@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).