qemu-devel.nongnu.org archive mirror
From: "Alex Bennée" <alex.bennee@linaro.org>
To: "Michael S. Tsirkin" <mst@redhat.com>
Cc: qemu-devel@nongnu.org, Stefan Hajnoczi <stefanha@redhat.com>,
	"Dr . David Alan Gilbert" <dgilbert@redhat.com>
Subject: Re: [RFC PATCH] hw/virtio/vhost: re-factor vhost-section and allow DIRTY_MEMORY_CODE
Date: Thu, 04 Jun 2020 13:39:50 +0100
Message-ID: <87d06f57jd.fsf@linaro.org>
In-Reply-To: <20200604075020-mutt-send-email-mst@kernel.org>


Michael S. Tsirkin <mst@redhat.com> writes:

> On Thu, Jun 04, 2020 at 12:49:17PM +0100, Alex Bennée wrote:
>> 
>> Michael S. Tsirkin <mst@redhat.com> writes:
>> 
>> > On Thu, Jun 04, 2020 at 12:13:23PM +0100, Alex Bennée wrote:
>> >> The purpose of vhost_section is to identify RAM regions that need to
>> >> be made available to a vhost client. However when running under TCG
>> >> all RAM sections have DIRTY_MEMORY_CODE set which leads to problems
>> >> down the line. The original comment implies VGA regions are a problem
>> >> but doesn't explain why vhost has a problem with it.
>> >> 
>> >> Re-factor the code so:
>> >> 
>> >>   - steps are clearer to follow
>> >>   - reason for rejection is recorded in the trace point
>> >>   - we allow DIRTY_MEMORY_CODE when TCG is enabled
>> >> 
>> >> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
>> >> Cc: Michael S. Tsirkin <mst@redhat.com>
>> >> Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
>> >> Cc: Stefan Hajnoczi <stefanha@redhat.com>
>> >> ---
>> >>  hw/virtio/vhost.c | 46 ++++++++++++++++++++++++++++++++--------------
>> >>  1 file changed, 32 insertions(+), 14 deletions(-)
>> >> 
>> >> diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
>> >> index aff98a0ede5..f81fc87e74c 100644
>> >> --- a/hw/virtio/vhost.c
>> >> +++ b/hw/virtio/vhost.c
>> >> @@ -27,6 +27,7 @@
>> >>  #include "migration/blocker.h"
>> >>  #include "migration/qemu-file-types.h"
>> >>  #include "sysemu/dma.h"
>> >> +#include "sysemu/tcg.h"
>> >>  #include "trace.h"
>> >>  
>> >>  /* enabled until disconnected backend stabilizes */
>> >> @@ -403,26 +404,43 @@ static int vhost_verify_ring_mappings(struct vhost_dev *dev,
>> >>      return r;
>> >>  }
>> >>  
>> >> +/*
>> >> + * vhost_section: identify sections needed for vhost access
>> >> + *
>> >> + * We only care about RAM sections here (where virtqueue can live). If
>> >> + * we find one we still allow the backend to potentially filter it out
>> >> + * of our list.
>> >> + */
>> >>  static bool vhost_section(struct vhost_dev *dev, MemoryRegionSection *section)
>> >>  {
>> >> -    bool result;
>> >> -    bool log_dirty = memory_region_get_dirty_log_mask(section->mr) &
>> >> -                     ~(1 << DIRTY_MEMORY_MIGRATION);
>> >> -    result = memory_region_is_ram(section->mr) &&
>> >> -        !memory_region_is_rom(section->mr);
>> >> -
>> >> -    /* Vhost doesn't handle any block which is doing dirty-tracking other
>> >> -     * than migration; this typically fires on VGA areas.
>> >> -     */
>> >> -    result &= !log_dirty;
>> >> +    enum { OK = 0, NOT_RAM, DIRTY, FILTERED } result = NOT_RAM;
>> >
>> > I'm not sure what does this enum buy us as compared to bool.
>> 
>> The only real point of the enum is to give a little more detailed
>> information to the trace point to expose why a section wasn't included.
>> In a previous iteration I just had the tracepoint at the bottom before a
>> return true where all other legs had returned false. We could switch to
>> just having the tracepoint hit for explicit inclusions?
>
> I didn't notice.  Yes, ok more tracepoints IMHO.

I can simplify to two:

  trace_vhost_section(mr->name)
  trace_vhost_reject_section(mr->name, int reason)

Not sure if it's worth defining an enum outside just for the purposes of
the trace though. Do we have the concept of per-trace event enum codes?
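
For illustration, the two-tracepoint shape could be sketched roughly as
below. This is a standalone model, not the QEMU code: the trace functions
are printf stubs with the names proposed above, and struct model_section,
the MODEL_REJECT_* codes and model_last_reject_reason are hypothetical
names invented for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical reason codes. Values are explicit so they can be decoded
 * from a trace; none of these names exist in QEMU. */
enum model_reject_reason {
    MODEL_REJECT_NOT_RAM = 1,
    MODEL_REJECT_DIRTY = 2,
    MODEL_REJECT_FILTERED = 3,
};

/* Stand-in for the facts vhost_section() reads from a MemoryRegionSection. */
struct model_section {
    const char *name;
    bool is_ram;
    bool is_rom;
    bool unhandled_dirty;  /* dirty logging that vhost cannot ignore */
    bool backend_accepts;  /* result of the optional backend filter */
};

/* Last reason passed to the reject stub; 0 means nothing was rejected. */
static int model_last_reject_reason;

/* printf stubs standing in for the two proposed trace events. */
static void trace_vhost_section(const char *name)
{
    printf("vhost_section %s\n", name);
}

static void trace_vhost_reject_section(const char *name, int reason)
{
    model_last_reject_reason = reason;
    printf("vhost_reject_section %s reason=%d\n", name, reason);
}

/* Each rejecting leg traces its reason; only the accepting leg traces
 * the inclusion. */
static bool model_vhost_section(const struct model_section *s)
{
    assert(s != NULL);

    if (!s->is_ram || s->is_rom) {
        trace_vhost_reject_section(s->name, MODEL_REJECT_NOT_RAM);
        return false;
    }
    if (s->unhandled_dirty) {
        trace_vhost_reject_section(s->name, MODEL_REJECT_DIRTY);
        return false;
    }
    if (!s->backend_accepts) {
        trace_vhost_reject_section(s->name, MODEL_REJECT_FILTERED);
        return false;
    }
    trace_vhost_section(s->name);
    return true;
}
```

With this shape the function itself goes back to returning a plain bool,
and the reason code only exists for the benefit of the reject tracepoint.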

>> > Also why force OK to 0?
>> 
>> Personal preference where 0 indicates success and !0 indicates failure
>> of various kinds. Again we can drop if we don't want the information in
>> the tracepoint.
>
> So in that case we need to set all values so people can decode them
> from the trace. But I think it's best to just have more trace points
> or drop it from the trace.
>
>> > And I prefer an explicit "else result = NOT_RAM" below
>> > instead of initializing it here.
>> 
>> Ok.
>> 
>> >
>> >> +
>> >> +    if (memory_region_is_ram(section->mr) && !memory_region_is_rom(section->mr)) {
>> >> +        uint8_t dirty_mask = memory_region_get_dirty_log_mask(section->mr);
>> >> +        uint8_t handled_dirty;
>> >>  
>> >> -    if (result && dev->vhost_ops->vhost_backend_mem_section_filter) {
>> >> -        result &=
>> >> -            dev->vhost_ops->vhost_backend_mem_section_filter(dev, section);
>> >> +        /*
>> >> +         * Vhost doesn't handle any block which is doing dirty-tracking other
>> >> +         * than migration; this typically fires on VGA areas. However
>> >> +         * for TCG we also do dirty code page tracking which shouldn't
>> >> +         * get in the way.
>> >> +         */
>> >> +        handled_dirty = (1 << DIRTY_MEMORY_MIGRATION);
>> >> +        if (tcg_enabled()) {
>> >> +            handled_dirty |= (1 << DIRTY_MEMORY_CODE);
>> >> +        }
>> >
>> > So DIRTY_MEMORY_CODE is only set by TCG right? Thus I'm guessing
>> > we can just allow this unconditionally.
>> 
>> Which actually makes the test:
>> 
>>   if (dirty_mask & DIRTY_MEMORY_VGA) {
>>      .. fail ..
>>   }
>> 
>> which is more in line with the comment although wouldn't fail if we
>> added additional DIRTY_MEMORY flags. This leads to the question what
>> exactly is it about DIRTY tracking that vhost doesn't like.
>
> vhost does not know how to track writes to specific regions. It can either
> track all writes to memory (which slows it down quite a bit)
> or no writes.

So can vhost interfere with dirty tracking itself in the kernel by
trapping the writes? I guess there is no way this can happen with
vhost-user?

(I wonder what would happen if a vhost-user daemon did an mprotect() on
RAM from its shared view?)

> It never actually *needs* to write to VGA,
> so we do a hack and just skip these and then if that's the
> only thing we need to track then we don't need to enable
> its dirty tracking.
>
> I don't really know what is DIRTY_MEMORY_CODE and when it's set.

We use it in softmmu so that any pages that have code in them always
force the slow path into cputlb for writes to those pages. This allows
us to detect self-modifying code. The kernel would never get involved,
but I don't think vhost and TCG are compatible anyway. I'm only really
interested in vhost-user and its interaction with TCG.
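
As a rough standalone model of the mask test under discussion: the
DIRTY_MEMORY_* names appear to be bit indices (the patch shifts with
1 << DIRTY_MEMORY_MIGRATION), so a section is rejected when any bit
outside the "handled" set is present. The MODEL_* constants and their
numeric values, and the helper name, are assumptions made for this
sketch rather than QEMU definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit indices for the model; the numeric values are assumptions. */
enum {
    MODEL_DIRTY_VGA = 0,
    MODEL_DIRTY_CODE = 1,
    MODEL_DIRTY_MIGRATION = 2,
    MODEL_DIRTY_NUM = 3,
};

/*
 * Returns true when the section has dirty logging enabled for a client
 * that the model treats as a reason to skip the section. Migration is
 * always tolerated; code tracking is tolerated only when tcg is true,
 * as in the RFC patch.
 */
static bool model_has_unhandled_dirty(uint8_t dirty_mask, bool tcg)
{
    uint8_t handled = (uint8_t)(1u << MODEL_DIRTY_MIGRATION);

    /* Only bits for known dirty-memory clients are expected. */
    assert(dirty_mask < (1u << MODEL_DIRTY_NUM));

    if (tcg) {
        handled |= (uint8_t)(1u << MODEL_DIRTY_CODE);
    }
    return (dirty_mask & ~handled) != 0;
}
```

Allowing the code bit unconditionally, as suggested in the review, would
amount to dropping the tcg parameter and always including
MODEL_DIRTY_CODE in the handled set.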

I'll spin a v2 now.

-- 
Alex Bennée

