qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
To: "Alex Bennée" <alex.bennee@linaro.org>
Cc: qemu-devel@nongnu.org, Stefan Hajnoczi <stefanha@redhat.com>,
	"Michael S . Tsirkin" <mst@redhat.com>
Subject: Re: [RFC PATCH] hw/virtio/vhost: re-factor vhost-section and allow DIRTY_MEMORY_CODE
Date: Thu, 4 Jun 2020 15:29:42 +0100	[thread overview]
Message-ID: <20200604142942.GH2851@work-vm> (raw)
In-Reply-To: <875zc753pf.fsf@linaro.org>

* Alex Bennée (alex.bennee@linaro.org) wrote:
> 
> Dr. David Alan Gilbert <dgilbert@redhat.com> writes:
> 
> > * Alex Bennée (alex.bennee@linaro.org) wrote:
> >> The purpose of vhost_section is to identify RAM regions that need to
> >> be made available to a vhost client. However when running under TCG
> >> all RAM sections have DIRTY_MEMORY_CODE set which leads to problems
> >> down the line. The original comment implies VGA regions are a problem
> >> but doesn't explain why vhost has a problem with it.
> >> 
> >> Re-factor the code so:
> >> 
> >>   - steps are clearer to follow
> >>   - reason for rejection is recorded in the trace point
> >>   - we allow DIRTY_MEMORY_CODE when TCG is enabled
> >
> > The problem with VGA is that a VGA page can become mapped and unmapped
> > under the control of the guest; somewhere in a low address.  This tends
> > to break hugepage mappings.
> > For vhost-user, and in particular vhost-user-postcopy this means it
> > fails the mapping on the vhost-user client.
> >
> > However the other problem is that with vhost-user, the vhost-user client
> > is changing memory; and won't mark the pages as dirty - except for
> > migration (I'm not clear if vhost kernel does this).
> 
> For virtio this shouldn't be a problem because whatever the vhost-user
> client writes to should never be read by the guest until it gets kicked
> by the client to signal the virtqueue is done.
> 
> I guess migration is a fairly moot point given I haven't seen anything
> outside of a test declare VHOST_F_LOG_ALL support.
> 
> > So TCG won't notice a page that's been changed by the driver; now in
> > most cases it's rare for a device to be writing directly into a page
> > you're going to execute out of, but it's not unknown.
> 
> Not unknown outside of bugs?

I can think of sane reasons, I just don't know if they normally happen.
For example, it would make sense to me for a boot loader or simple OS to
tell a virtio-blk device to write directly into an area of RAM and then
for the OS to jump into it as soon as it gets notified that the device
completes.
I'm assuming incoming data under Linux gets copied around before use
so that incoming data never gets immediately executed.

> So stage 2 of this exercise is limiting the amount of exposed RAM to the
> client to just the virtqueues themselves (which is all vhost-user-rpmb
> should need).
> 
> > So, as it is, any area that's expecting to get non-migration dirty
> > notifications is going to be disappointed by a vhost-user backend.
> 
> It's not outside the realms of possibility that we could implement
> feedback to the softmmu/migration information from a vhost-user client
> but for now I think it's safe to assume we are eliding over the issue.

Dave

> 
> >
> > Dave
> >
> >> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
> >> Cc: Michael S. Tsirkin <mst@redhat.com>
> >> Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
> >> Cc: Stefan Hajnoczi <stefanha@redhat.com>
> >> ---
> >>  hw/virtio/vhost.c | 46 ++++++++++++++++++++++++++++++++--------------
> >>  1 file changed, 32 insertions(+), 14 deletions(-)
> >> 
> >> diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
> >> index aff98a0ede5..f81fc87e74c 100644
> >> --- a/hw/virtio/vhost.c
> >> +++ b/hw/virtio/vhost.c
> >> @@ -27,6 +27,7 @@
> >>  #include "migration/blocker.h"
> >>  #include "migration/qemu-file-types.h"
> >>  #include "sysemu/dma.h"
> >> +#include "sysemu/tcg.h"
> >>  #include "trace.h"
> >>  
> >>  /* enabled until disconnected backend stabilizes */
> >> @@ -403,26 +404,43 @@ static int vhost_verify_ring_mappings(struct vhost_dev *dev,
> >>      return r;
> >>  }
> >>  
> >> +/*
> >> + * vhost_section: identify sections needed for vhost access
> >> + *
> >> + * We only care about RAM sections here (where virtqueue can live). If
> >> + * we find one we still allow the backend to potentially filter it out
> >> + * of our list.
> >> + */
> >>  static bool vhost_section(struct vhost_dev *dev, MemoryRegionSection *section)
> >>  {
> >> -    bool result;
> >> -    bool log_dirty = memory_region_get_dirty_log_mask(section->mr) &
> >> -                     ~(1 << DIRTY_MEMORY_MIGRATION);
> >> -    result = memory_region_is_ram(section->mr) &&
> >> -        !memory_region_is_rom(section->mr);
> >> -
> >> -    /* Vhost doesn't handle any block which is doing dirty-tracking other
> >> -     * than migration; this typically fires on VGA areas.
> >> -     */
> >> -    result &= !log_dirty;
> >> +    enum { OK = 0, NOT_RAM, DIRTY, FILTERED } result = NOT_RAM;
> >> +
> >> +    if (memory_region_is_ram(section->mr) && !memory_region_is_rom(section->mr)) {
> >> +        uint8_t dirty_mask = memory_region_get_dirty_log_mask(section->mr);
> >> +        uint8_t handled_dirty;
> >>  
> >> -    if (result && dev->vhost_ops->vhost_backend_mem_section_filter) {
> >> -        result &=
> >> -            dev->vhost_ops->vhost_backend_mem_section_filter(dev, section);
> >> +        /*
> >> +         * Vhost doesn't handle any block which is doing dirty-tracking other
> >> +         * than migration; this typically fires on VGA areas. However
> >> +         * for TCG we also do dirty code page tracking which shouldn't
> >> +         * get in the way.
> >> +         */
> >> +        handled_dirty = (1 << DIRTY_MEMORY_MIGRATION);
> >> +        if (tcg_enabled()) {
> >> +            handled_dirty |= (1 << DIRTY_MEMORY_CODE);
> >> +        }
> >> +        if (dirty_mask & ~handled_dirty) {
> >> +            result = DIRTY;
> >> +        } else if (dev->vhost_ops->vhost_backend_mem_section_filter &&
> >> +            !dev->vhost_ops->vhost_backend_mem_section_filter(dev, section)) {
> >> +            result = FILTERED;
> >> +        } else {
> >> +            result = OK;
> >> +        }
> >>      }
> >>  
> >>      trace_vhost_section(section->mr->name, result);
> >> -    return result;
> >> +    return result == OK;
> >>  }
> >>  
> >>  static void vhost_begin(MemoryListener *listener)
> >> -- 
> >> 2.20.1
> >> 
> 
> 
> -- 
> Alex Bennée
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK



      reply	other threads:[~2020-06-04 14:31 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-06-04 11:13 [RFC PATCH] hw/virtio/vhost: re-factor vhost-section and allow DIRTY_MEMORY_CODE Alex Bennée
2020-06-04 11:24 ` Michael S. Tsirkin
2020-06-04 11:49   ` Alex Bennée
2020-06-04 11:55     ` Michael S. Tsirkin
2020-06-04 12:39       ` Alex Bennée
2020-06-04 13:07         ` Dr. David Alan Gilbert
2020-06-04 12:58     ` Philippe Mathieu-Daudé
2020-06-04 13:50       ` Alex Bennée
2020-06-04 13:26 ` Dr. David Alan Gilbert
2020-06-04 14:02   ` Alex Bennée
2020-06-04 14:29     ` Dr. David Alan Gilbert [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200604142942.GH2851@work-vm \
    --to=dgilbert@redhat.com \
    --cc=alex.bennee@linaro.org \
    --cc=mst@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=stefanha@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).