From: Andrew Cooper <andrew.cooper3@citrix.com>
To: Roger Pau Monne <roger.pau@citrix.com>, xen-devel@lists.xenproject.org
Cc: Tim Deegan <tim@xen.org>, Jan Beulich <jbeulich@suse.com>
Subject: Re: [PATCH RFC v2 3/3] xen: rework paging_log_dirty_op to work with hvm guests
Date: Thu, 2 Apr 2015 20:46:33 +0100
Message-ID: <551D9C99.4070107@citrix.com>
In-Reply-To: <1427970395-16203-4-git-send-email-roger.pau@citrix.com>

On 02/04/15 11:26, Roger Pau Monne wrote:
> When the caller of paging_log_dirty_op is a hvm guest Xen would choke when
> trying to copy the dirty bitmap to the guest because the paging lock is
> already held.

Are you sure? Presumably you get an mm lock ordering violation, because
paging_log_dirty_op() should take the target domain's paging lock rather
than your own (taking your own is prohibited by the current check at the
top of paging_domctl()).

Unfortunately, dropping the paging lock here is unsafe, as it would
allow corruption of the logdirty bitmap from non-domain sources such
as HVMOP_modified_memory.

I will need to find some time with a large pot of coffee and a
whiteboard, but I suspect it might actually be safe to alter the current
mm_lock() enforcement to maintain independent levels for a source and
destination domain.
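
As a rough illustration of that idea, here is a minimal, self-contained
model of an ordering check. It is a sketch only, not Xen code: the level
values, the source/target roles and every name in it (mm_order,
mm_order_acquire, copy_under_target_paging_lock, ...) are hypothetical
and chosen purely for this example. It assumes that locks must be taken
in non-decreasing level order and that the paging lock ranks above the
p2m lock.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical levels: a lower level must be taken before a higher one. */
enum mm_level { MM_NONE = 0, MM_P2M = 1, MM_PAGING = 2 };

/* Hypothetical roles: the calling (source) domain and the target domain. */
enum mm_role { MM_SOURCE = 0, MM_TARGET = 1, MM_ROLES = 2 };

struct mm_order {
    bool split;          /* false: one shared level; true: one level per role */
    int held[MM_ROLES];  /* highest level currently held */
};

/* Pick the level slot that a lock taken in role r is checked against. */
static int *mm_order_slot(struct mm_order *o, enum mm_role r)
{
    return &o->held[o->split ? r : 0];
}

/*
 * Record that a lock of the given level is being taken.  Returns the
 * previous level (to hand back on release), or -1 on an ordering violation.
 */
static int mm_order_acquire(struct mm_order *o, enum mm_role r, int level)
{
    int *cur = mm_order_slot(o, r);
    int prev = *cur;

    if ( prev > level )
        return -1;
    *cur = level;
    return prev;
}

/* Restore the level saved by a successful mm_order_acquire(). */
static void mm_order_release(struct mm_order *o, enum mm_role r, int prev)
{
    assert(prev >= 0);
    *mm_order_slot(o, r) = prev;
}

/*
 * Model of the problem case: hold the target's paging lock, then take the
 * caller's own p2m lock in order to copy the bitmap out to the caller.
 */
static bool copy_under_target_paging_lock(bool split)
{
    struct mm_order o = { .split = split };
    int prev_paging, prev_p2m;
    bool ok;

    prev_paging = mm_order_acquire(&o, MM_TARGET, MM_PAGING);
    if ( prev_paging < 0 )
        return false;

    prev_p2m = mm_order_acquire(&o, MM_SOURCE, MM_P2M);
    ok = (prev_p2m >= 0);
    if ( ok )
        mm_order_release(&o, MM_SOURCE, prev_p2m);

    mm_order_release(&o, MM_TARGET, prev_paging);
    return ok;
}
```

With a single shared level, copy_under_target_paging_lock(false) reports
a violation, because the p2m level is requested while the higher paging
level is already held. With split levels, the same sequence passes. The
model says nothing about whether splitting the levels is actually safe
with respect to deadlocks; that is the part which still needs thought.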

Up until now, the toolstack domain has always been PV (with very little
in the way of locking), and I don't believe our current locking model is
suitable for an HVM domain performing toolstack operations on another,
where both the source and destination need locking.

~Andrew

>
> Fix this by independently mapping each page of the guest bitmap as needed
> without the paging lock held.
>
> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
> Cc: Tim Deegan <tim@xen.org>
> Cc: Jan Beulich <jbeulich@suse.com>
> Cc: Andrew Cooper <andrew.cooper3@citrix.com>
> ---
>  xen/arch/x86/mm/paging.c | 99 +++++++++++++++++++++++++++++++++++++++++++-----
>  1 file changed, 89 insertions(+), 10 deletions(-)
>
> diff --git a/xen/arch/x86/mm/paging.c b/xen/arch/x86/mm/paging.c
> index b54d76a..4dcf942 100644
> --- a/xen/arch/x86/mm/paging.c
> +++ b/xen/arch/x86/mm/paging.c
> @@ -397,6 +397,53 @@ int paging_mfn_is_dirty(struct domain *d, mfn_t gmfn)
>      return rv;
>  }
>  
> +static inline void *map_dirty_bitmap(XEN_GUEST_HANDLE_64(uint8) dirty_bitmap,
> +                                     unsigned long pages,
> +                                     struct page_info **page,
> +                                     unsigned long *mapped_page)
> +{
> +    p2m_type_t p2mt;
> +    uint32_t pfec;
> +    unsigned long gfn;
> +
> +    gfn = paging_gva_to_gfn(current,
> +                            (paddr_t)(((char *)dirty_bitmap.p) + (pages >> 3)),
> +                            &pfec);
> +    if ( gfn == INVALID_GFN )
> +        return NULL;
> +
> +    *page = get_page_from_gfn(current->domain, gfn, &p2mt, P2M_UNSHARE);
> +
> +    if ( p2m_is_paging(p2mt) )
> +    {
> +        put_page(*page);
> +        p2m_mem_paging_populate(current->domain, gfn);
> +        return NULL;
> +    }
> +    if ( p2m_is_shared(p2mt) )
> +    {
> +        put_page(*page);
> +        return NULL;
> +    }
> +    if ( p2m_is_grant(p2mt) )
> +    {
> +        put_page(*page);
> +        return NULL;
> +    }
> +
> +    *mapped_page = pages;
> +    return __map_domain_page(*page);
> +}
> +
> +static inline void unmap_dirty_bitmap(void *addr, struct page_info *page)
> +{
> +    if ( addr != NULL )
> +    {
> +        unmap_domain_page(addr);
> +        put_page(page);
> +    }
> +}
> +
>  
>  /* Read a domain's log-dirty bitmap and stats.  If the operation is a CLEAN,
>   * clear the bitmap and stats as well. */
> @@ -409,9 +456,23 @@ static int paging_log_dirty_op(struct domain *d,
>      mfn_t *l4 = NULL, *l3 = NULL, *l2 = NULL;
>      unsigned long *l1 = NULL;
>      int i4, i3, i2;
> +    uint8_t *dirty_bitmap = NULL;
> +    struct page_info *page;
> +    unsigned long index_mapped = 0;
>  
>      if ( !resuming )
>          domain_pause(d);
> +
> +    dirty_bitmap = map_dirty_bitmap(sc->dirty_bitmap,
> +                                    resuming ?
> +                                        d->arch.paging.preempt.log_dirty.done :
> +                                        0,
> +                                    &page, &index_mapped);
> +    if ( dirty_bitmap == NULL )
> +    {
> +        domain_unpause(d);
> +        return -EFAULT;
> +    }
>      paging_lock(d);
>  
>      if ( !d->arch.paging.preempt.dom )
> @@ -448,21 +509,23 @@ static int paging_log_dirty_op(struct domain *d,
>          goto out;
>      }
>  
> -    l4 = paging_map_log_dirty_bitmap(d);
>      i4 = d->arch.paging.preempt.log_dirty.i4;
>      i3 = d->arch.paging.preempt.log_dirty.i3;
> +    i2 = 0;
>      pages = d->arch.paging.preempt.log_dirty.done;
>  
> + again:
> +    l4 = paging_map_log_dirty_bitmap(d);
> +
>      for ( ; (pages < sc->pages) && (i4 < LOGDIRTY_NODE_ENTRIES); i4++, i3 = 0 )
>      {
>          l3 = (l4 && mfn_valid(l4[i4])) ? map_domain_page(mfn_x(l4[i4])) : NULL;
> -        for ( ; (pages < sc->pages) && (i3 < LOGDIRTY_NODE_ENTRIES); i3++ )
> +        for ( ; (pages < sc->pages) && (i3 < LOGDIRTY_NODE_ENTRIES);
> +             i3++, i2 = 0 )
>          {
>              l2 = ((l3 && mfn_valid(l3[i3])) ?
>                    map_domain_page(mfn_x(l3[i3])) : NULL);
> -            for ( i2 = 0;
> -                  (pages < sc->pages) && (i2 < LOGDIRTY_NODE_ENTRIES);
> -                  i2++ )
> +            for ( ; (pages < sc->pages) && (i2 < LOGDIRTY_NODE_ENTRIES); i2++ )
>              {
>                  unsigned int bytes = PAGE_SIZE;
>                  l1 = ((l2 && mfn_valid(l2[i2])) ?
> @@ -471,11 +534,25 @@ static int paging_log_dirty_op(struct domain *d,
>                      bytes = (unsigned int)((sc->pages - pages + 7) >> 3);
>                  if ( likely(peek) )
>                  {
> -                    if ( (l1 ? copy_to_guest_offset(sc->dirty_bitmap,
> -                                                    pages >> 3, (uint8_t *)l1,
> -                                                    bytes)
> -                             : clear_guest_offset(sc->dirty_bitmap,
> -                                                  pages >> 3, bytes)) != 0 )
> +                    if ( (pages >> 3) >= (index_mapped >> 3) + 4096 ) {
> +                        /* We need to map next page */
> +                        paging_unlock(d);
> +                        unmap_dirty_bitmap(dirty_bitmap, page);
> +                        dirty_bitmap = map_dirty_bitmap(sc->dirty_bitmap, pages,
> +                                                        &page, &index_mapped);
> +                        paging_lock(d);
> +                        if ( dirty_bitmap == NULL )
> +                        {
> +                            rv = -EFAULT;
> +                            goto out;
> +                        }
> +                        goto again;
> +                    }
> +                    BUG_ON(((pages >> 3) % PAGE_SIZE) + bytes > PAGE_SIZE);
> +                    if ( (l1 ? memcpy(dirty_bitmap + ((pages >> 3) % PAGE_SIZE),
> +                                      (uint8_t *)l1, bytes)
> +                             : memset(dirty_bitmap + ((pages >> 3) % PAGE_SIZE),
> +                                      0, bytes)) == NULL )
>                      {
>                          rv = -EFAULT;
>                          goto out;
> @@ -549,12 +626,14 @@ static int paging_log_dirty_op(struct domain *d,
>           * paging modes (shadow or hap).  Safe because the domain is paused. */
>          d->arch.paging.log_dirty.clean_dirty_bitmap(d);
>      }
> +    unmap_dirty_bitmap(dirty_bitmap, page);
>      domain_unpause(d);
>      return rv;
>  
>   out:
>      d->arch.paging.preempt.dom = NULL;
>      paging_unlock(d);
> +    unmap_dirty_bitmap(dirty_bitmap, page);
>      domain_unpause(d);
>  
>      if ( l1 )

