Re: [RFC PATCH] hw/virtio/vhost: re-factor vhost-section and allow DIRTY_MEMORY_CODE

From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
To: "Alex Bennée" <alex.bennee@linaro.org>
Cc: qemu-devel@nongnu.org, Stefan Hajnoczi <stefanha@redhat.com>,
	"Michael S. Tsirkin" <mst@redhat.com>
Subject: Re: [RFC PATCH] hw/virtio/vhost: re-factor vhost-section and allow DIRTY_MEMORY_CODE
Date: Thu, 4 Jun 2020 14:07:29 +0100
Message-ID: <20200604130729.GF2851@work-vm>
In-Reply-To: <87d06f57jd.fsf@linaro.org>

* Alex Bennée (alex.bennee@linaro.org) wrote:
> 
> Michael S. Tsirkin <mst@redhat.com> writes:
> 
> > On Thu, Jun 04, 2020 at 12:49:17PM +0100, Alex Bennée wrote:
> >> 
> >> Michael S. Tsirkin <mst@redhat.com> writes:
> >> 
> >> > On Thu, Jun 04, 2020 at 12:13:23PM +0100, Alex Bennée wrote:
> >> >> The purpose of vhost_section is to identify RAM regions that need to
> >> >> be made available to a vhost client. However when running under TCG
> >> >> all RAM sections have DIRTY_MEMORY_CODE set which leads to problems
> >> >> down the line. The original comment implies VGA regions are a problem
> >> >> but doesn't explain why vhost has a problem with it.
> >> >> 
> >> >> Re-factor the code so:
> >> >> 
> >> >>   - steps are clearer to follow
> >> >>   - reason for rejection is recorded in the trace point
> >> >>   - we allow DIRTY_MEMORY_CODE when TCG is enabled
> >> >> 
> >> >> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
> >> >> Cc: Michael S. Tsirkin <mst@redhat.com>
> >> >> Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
> >> >> Cc: Stefan Hajnoczi <stefanha@redhat.com>
> >> >> ---
> >> >>  hw/virtio/vhost.c | 46 ++++++++++++++++++++++++++++++++--------------
> >> >>  1 file changed, 32 insertions(+), 14 deletions(-)
> >> >> 
> >> >> diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
> >> >> index aff98a0ede5..f81fc87e74c 100644
> >> >> --- a/hw/virtio/vhost.c
> >> >> +++ b/hw/virtio/vhost.c
> >> >> @@ -27,6 +27,7 @@
> >> >>  #include "migration/blocker.h"
> >> >>  #include "migration/qemu-file-types.h"
> >> >>  #include "sysemu/dma.h"
> >> >> +#include "sysemu/tcg.h"
> >> >>  #include "trace.h"
> >> >>  
> >> >>  /* enabled until disconnected backend stabilizes */
> >> >> @@ -403,26 +404,43 @@ static int vhost_verify_ring_mappings(struct vhost_dev *dev,
> >> >>      return r;
> >> >>  }
> >> >>  
> >> >> +/*
> >> >> + * vhost_section: identify sections needed for vhost access
> >> >> + *
> >> >> + * We only care about RAM sections here (where virtqueue can live). If
> >> >> + * we find one we still allow the backend to potentially filter it out
> >> >> + * of our list.
> >> >> + */
> >> >>  static bool vhost_section(struct vhost_dev *dev, MemoryRegionSection *section)
> >> >>  {
> >> >> -    bool result;
> >> >> -    bool log_dirty = memory_region_get_dirty_log_mask(section->mr) &
> >> >> -                     ~(1 << DIRTY_MEMORY_MIGRATION);
> >> >> -    result = memory_region_is_ram(section->mr) &&
> >> >> -        !memory_region_is_rom(section->mr);
> >> >> -
> >> >> -    /* Vhost doesn't handle any block which is doing dirty-tracking other
> >> >> -     * than migration; this typically fires on VGA areas.
> >> >> -     */
> >> >> -    result &= !log_dirty;
> >> >> +    enum { OK = 0, NOT_RAM, DIRTY, FILTERED } result = NOT_RAM;
> >> >
> >> > I'm not sure what this enum buys us compared to bool.
> >> 
> >> The only real point of the enum is to give a little more detailed
> >> information to the trace point to expose why a section wasn't included.
> >> In a previous iteration I just had the tracepoint at the bottom before a
> >> return true where all other legs had returned false. We could switch to
> >> just having the tracepoint hit for explicit inclusions?
> >
> > I didn't notice.  Yes, ok more tracepoints IMHO.
> 
> I can simplify to two:
> 
>   trace_vhost_section(mr->name)
>   trace_vhost_reject_section(mr->name, int reason)
> 
> Not sure if it's worth defining an enum outside just for the purposes of
> the trace though. Do we have the concept of per-trace event enum codes?

If you want a 'reason' for the trace, then why not just make it a string?
  const char *result
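
A minimal sketch of that idea, assuming mock stand-ins for the QEMU types
(struct mock_region, the MOCK_DIRTY_* bit indices and both function names
are hypothetical, not the real vhost.c code). The reject reason is a static
string that a trace point could print directly, and NULL means the section
is accepted:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical bit indices, standing in for the DIRTY_MEMORY_* constants. */
enum { MOCK_DIRTY_VGA = 0, MOCK_DIRTY_CODE = 1, MOCK_DIRTY_MIGRATION = 2 };

/* Hypothetical stand-in for the parts of a memory region the check reads. */
struct mock_region {
    const char *name;
    bool is_ram;
    bool is_rom;
    uint8_t dirty_log_mask;
};

/* Returns NULL if the region is acceptable, else a static reason string. */
static const char *mock_reject_reason(const struct mock_region *mr)
{
    const uint8_t handled = (1u << MOCK_DIRTY_MIGRATION) |
                            (1u << MOCK_DIRTY_CODE);

    assert(mr != NULL);

    if (!mr->is_ram || mr->is_rom) {
        return "not RAM";
    }
    if (mr->dirty_log_mask & ~handled) {
        return "unhandled dirty tracking";
    }
    return NULL;
}

/* Stand-in for the two trace points: one for accept, one for reject. */
static bool mock_section(const struct mock_region *mr)
{
    const char *reason = mock_reject_reason(mr);

    if (reason) {
        printf("reject_section %s: %s\n", mr->name, reason);
        return false;
    }
    printf("section %s\n", mr->name);
    return true;
}
```

With a string there is no numeric code to decode from the trace output, at
the cost of the reason not being easy to filter on numerically.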

Dave

> >> > Also why force OK to 0?
> >> 
> >> Personal preference where 0 indicates success and !0 indicates failure
> >> of various kinds. Again we can drop if we don't want the information in
> >> the tracepoint.
> >
> > So in that case we need to set all values so people can decode them
> > from the trace. But I think it's best to just have more trace points
> > or drop it from the trace.
> >
> >> > And I prefer an explicit "else result = NOT_RAM" below
> >> > instead of initializing it here.
> >> 
> >> Ok.
> >> 
> >> >
> >> >> +
> >> >> +    if (memory_region_is_ram(section->mr) && !memory_region_is_rom(section->mr)) {
> >> >> +        uint8_t dirty_mask = memory_region_get_dirty_log_mask(section->mr);
> >> >> +        uint8_t handled_dirty;
> >> >>  
> >> >> -    if (result && dev->vhost_ops->vhost_backend_mem_section_filter) {
> >> >> -        result &=
> >> >> -            dev->vhost_ops->vhost_backend_mem_section_filter(dev, section);
> >> >> +        /*
> >> >> +         * Vhost doesn't handle any block which is doing dirty-tracking other
> >> >> +         * than migration; this typically fires on VGA areas. However
> >> >> +         * for TCG we also do dirty code page tracking which shouldn't
> >> >> +         * get in the way.
> >> >> +         */
> >> >> +        handled_dirty = (1 << DIRTY_MEMORY_MIGRATION);
> >> >> +        if (tcg_enabled()) {
> >> >> +            handled_dirty |= (1 << DIRTY_MEMORY_CODE);
> >> >> +        }
> >> >
> >> > So DIRTY_MEMORY_CODE is only set by TCG right? Thus I'm guessing
> >> > we can just allow this unconditionally.
> >> 
> >> Which actually makes the test:
> >> 
> >>   if (dirty_mask & DIRTY_MEMORY_VGA) {
> >>      .. fail ..
> >>   }
> >> 
> >> which is more in line with the comment although wouldn't fail if we
> >> added additional DIRTY_MEMORY flags. This leads to the question what
> >> exactly is it about DIRTY tracking that vhost doesn't like.
> >
> > vhost does not know how to track writes to specific regions. It can either
> > track all writes to memory (which slows it down quite a bit)
> > or no writes.
> 
> So can vhost interfere with dirty tracking itself in the kernel by
> trapping the writes? I guess there is no way this can happen with
> vhost-user?
> 
> (I wonder what would happen if a vhost-user daemon did an mprotect() on
> RAM from its shared view?)
> 
> > It never actually *needs* to write to VGA,
> > so we do a hack and just skip these and then if that's the
> > only thing we need to track then we don't need to enable
> > its dirty tracking.
> >
> > I don't really know what is DIRTY_MEMORY_CODE and when it's set.
> 
> We use it in softmmu so that any pages that have code in them always force
> the slow-path into cputlb for writes to those pages. This allows us to
> detect self-modifying code. The kernel would never get involved but I
> don't think vhost and TCG are compatible anyway. I'm only really
> interested in vhost-user and its interaction with TCG.
> 
> I'll spin a v2 now.
> 
> -- 
> Alex Bennée
> 
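
The two forms of the dirty-mask test discussed in the quoted text can be
compared in a small sketch. It assumes the DIRTY_MEMORY_* constants are bit
indices, as the `1 << DIRTY_MEMORY_MIGRATION` in the patch suggests; the
MOCK_DIRTY_* names, MOCK_DIRTY_FUTURE and the helper functions are
hypothetical and only illustrate the difference:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit indices; MOCK_DIRTY_FUTURE models a flag added later. */
enum {
    MOCK_DIRTY_VGA = 0,
    MOCK_DIRTY_CODE = 1,
    MOCK_DIRTY_MIGRATION = 2,
    MOCK_DIRTY_FUTURE = 3
};

/* Turn a bit index into a mask bit; the mask is only 8 bits wide. */
static uint8_t dirty_bit(unsigned idx)
{
    assert(idx < 8);
    return (uint8_t)(1u << idx);
}

/* Allow-list: reject any tracking outside the clients vhost tolerates. */
static bool dirty_ok_allowlist(uint8_t dirty_mask)
{
    const uint8_t handled = dirty_bit(MOCK_DIRTY_MIGRATION) |
                            dirty_bit(MOCK_DIRTY_CODE);

    return (dirty_mask & ~handled) == 0;
}

/* Deny-list: reject only regions with VGA dirty tracking. */
static bool dirty_ok_denylist(uint8_t dirty_mask)
{
    return (dirty_mask & dirty_bit(MOCK_DIRTY_VGA)) == 0;
}
```

Both forms accept a region with only code tracking and reject one with VGA
tracking. They differ for a flag neither knows about: the allow-list rejects
it, while the deny-list lets it through, which is the "wouldn't fail if we
added additional DIRTY_MEMORY flags" point above.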
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

Thread overview: 11+ messages
2020-06-04 11:13 [RFC PATCH] hw/virtio/vhost: re-factor vhost-section and allow DIRTY_MEMORY_CODE Alex Bennée
2020-06-04 11:24 ` Michael S. Tsirkin
2020-06-04 11:49   ` Alex Bennée
2020-06-04 11:55     ` Michael S. Tsirkin
2020-06-04 12:39       ` Alex Bennée
2020-06-04 13:07         ` Dr. David Alan Gilbert [this message]
2020-06-04 12:58     ` Philippe Mathieu-Daudé
2020-06-04 13:50       ` Alex Bennée
2020-06-04 13:26 ` Dr. David Alan Gilbert
2020-06-04 14:02   ` Alex Bennée
2020-06-04 14:29     ` Dr. David Alan Gilbert
