All of lore.kernel.org
 help / color / mirror / Atom feed
From: Andrew Cooper <andrew.cooper3@citrix.com>
To: Wei Huang <w1.huang@samsung.com>, xen-devel@lists.xen.org
Cc: yjhyun.yoo@samsung.com, julien.grall@linaro.org,
	ian.campbell@citrix.com, jaeyong.yoo@samsung.com,
	stefano.stabellini@eu.citrix.com
Subject: Re: [RFC v2 4/6] xen/arm: Implement VLPT for guest p2m mapping in live migration
Date: Wed, 16 Apr 2014 00:40:45 +0100	[thread overview]
Message-ID: <534DC37D.1040509@citrix.com> (raw)
In-Reply-To: <1397595918-30419-5-git-send-email-w1.huang@samsung.com>

On 15/04/2014 22:05, Wei Huang wrote:
> From: Jaeyong Yoo <jaeyong.yoo@samsung.com>
>
> Thsi patch implements VLPT (virtual-linear page table) for fast accessing
> of 3rd PTE of guest P2M. For more info about VLPT, please see
> http://www.technovelty.org/linux/virtual-linear-page-table.html.
>
> When creating a mapping for VLPT, just copy the 1st level PTE of guest p2m
> to xen's 2nd level PTE. Then the mapping becomes the following:
>       xen's 1st PTE -->
>       xen's 2nd PTE (which is the same as 1st PTE of guest p2m) -->
>       guest p2m's 2nd PTE -->
>       guest p2m's 3rd PTE (the memory contents where the vlpt points)
>
> This function is used in dirty-page tracing. When domU write-fault is
> trapped by xen, xen can immediately locate the 3rd PTE of guest p2m.
>
> The following link shows the performance comparison for handling a
> dirty-page between vlpt and typical page table walking.
> http://lists.xen.org/archives/html/xen-devel/2013-08/msg01503.html
>
> Signed-off-by: Jaeyong Yoo <jaeyong.yoo@samsung.com>
> ---
>  xen/arch/arm/domain.c        |    5 ++
>  xen/arch/arm/mm.c            |  116 ++++++++++++++++++++++++++++++++++++++++++
>  xen/include/asm-arm/config.h |    7 +++
>  xen/include/asm-arm/domain.h |    7 +++
>  xen/include/asm-arm/mm.h     |   17 +++++++
>  5 files changed, 152 insertions(+)
>
> diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
> index b125857..3f04a77 100644
> --- a/xen/arch/arm/domain.c
> +++ b/xen/arch/arm/domain.c
> @@ -502,6 +502,11 @@ int arch_domain_create(struct domain *d, unsigned int domcr_flags)
>      /* Default the virtual ID to match the physical */
>      d->arch.vpidr = boot_cpu_data.midr.bits;
>  
> +    d->arch.dirty.second_lvl_start = 0;
> +    d->arch.dirty.second_lvl_end = 0;
> +    d->arch.dirty.second_lvl[0] = NULL;
> +    d->arch.dirty.second_lvl[1] = NULL;
> +

alloc_domain_struct() clears the domain page before handing it back.
Initialising things like this to 0 is pointless.

>      clear_page(d->shared_info);
>      share_xen_page_with_guest(
>          virt_to_page(d->shared_info), d, XENSHARE_writable);
> diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
> index 473ad04..a315752 100644
> --- a/xen/arch/arm/mm.c
> +++ b/xen/arch/arm/mm.c
> @@ -750,6 +750,122 @@ void *__init arch_vmap_virt_end(void)
>      return (void *)VMAP_VIRT_END;
>  }
>  
> +/* Flush the vlpt area */
> +void flush_vlpt(struct domain *d)
> +{
> +    int flush_size;
> +
> +    flush_size = (d->arch.dirty.second_lvl_end -
> +                  d->arch.dirty.second_lvl_start) << SECOND_SHIFT;
> +
> +    /* flushing the 3rd level mapping */
> +    flush_xen_data_tlb_range_va(d->arch.dirty.second_lvl_start << SECOND_SHIFT,

This is taking a volatile int, shifting left by 21 and then using it as
a virtual address.  It is going to do the wrong thing more often than
the right thing.

second_lvl_{start,end} should be unsigned long, and I cant see a
justification for why they should be volatile.

> +                                flush_size);
> +}
> +
> +/* Restore the xen page table for vlpt mapping for domain */
> +void restore_vlpt(struct domain *d)
> +{
> +    int i;
> +
> +    dsb(sy);
> +
> +    for ( i = d->arch.dirty.second_lvl_start; i < d->arch.dirty.second_lvl_end;
> +          ++i )
> +    {
> +        int k = i % LPAE_ENTRIES;
> +        int l = i / LPAE_ENTRIES;

These are all array indices - they should all be unsigned, as should the
loop variable.

> +
> +        if ( xen_second[i].bits != d->arch.dirty.second_lvl[l][k].bits )
> +        {
> +            write_pte(&xen_second[i], d->arch.dirty.second_lvl[l][k]);
> +            flush_xen_data_tlb_range_va(i << SECOND_SHIFT, 1 << SECOND_SHIFT);

shifted int as virtual address.  Also, SECOND_SIZE exists.

> +        }
> +    }
> +    
> +    dsb(sy);
> +    isb();
> +}
> +
> +/* Set up the xen page table for vlpt mapping for domain */
> +int prepare_vlpt(struct domain *d)
> +{
> +    int xen_second_linear_base;
> +    int gp2m_start_index, gp2m_end_index;
> +    struct p2m_domain *p2m = &d->arch.p2m;
> +    struct page_info *second_lvl_page;
> +    paddr_t gma_start = 0;
> +    paddr_t gma_end = 0;
> +    lpae_t *first[2];
> +    int i;
> +    uint64_t required, avail = VIRT_LIN_P2M_END - VIRT_LIN_P2M_START;
> +
> +    domain_get_gpfn_range(d, &gma_start, &gma_end);
> +    required = (gma_end - gma_start) >> LPAE_SHIFT;
> +
> +    if ( required > avail )
> +    {
> +        dprintk(XENLOG_ERR, "Available VLPT is small for domU guest"
> +                "(avail: %llx, required: %llx)\n", (unsigned long long)avail,
> +                (unsigned long long)required);
> +        return -ENOMEM;
> +    }
> +
> +    xen_second_linear_base = second_linear_offset(VIRT_LIN_P2M_START);

unsigned int held in signed integer value.

> +
> +    gp2m_start_index = gma_start >> FIRST_SHIFT;
> +    gp2m_end_index = (gma_end >> FIRST_SHIFT) + 1;
> +
> +    if ( xen_second_linear_base + gp2m_end_index >= LPAE_ENTRIES * 2 )
> +    {
> +        dprintk(XENLOG_ERR, "xen second page is small for VLPT for domU");
> +        return -ENOMEM;
> +    }
> +
> +    second_lvl_page = alloc_domheap_pages(NULL, 1, 0);
> +    if ( second_lvl_page == NULL )
> +        return -ENOMEM;
> +
> +    /* First level p2m is 2 consecutive pages */
> +    d->arch.dirty.second_lvl[0] = map_domain_page_global(
> +        page_to_mfn(second_lvl_page) );
> +    d->arch.dirty.second_lvl[1] = map_domain_page_global(
> +        page_to_mfn(second_lvl_page+1) );
> +
> +    first[0] = __map_domain_page(p2m->first_level);
> +    first[1] = __map_domain_page(p2m->first_level+1);

Xen style - spaces immediately surrounding binary operators.

~Andrew

> +
> +    for ( i = gp2m_start_index; i < gp2m_end_index; ++i )
> +    {
> +        int k = i % LPAE_ENTRIES;
> +        int l = i / LPAE_ENTRIES;
> +        int k2 = (xen_second_linear_base + i) % LPAE_ENTRIES;
> +        int l2 = (xen_second_linear_base + i) / LPAE_ENTRIES;
> +
> +        write_pte(&xen_second[xen_second_linear_base+i], first[l][k]);
> +
> +        /* we copy the mapping into domain's structure as a reference
> +         * in case of the context switch (used in restore_vlpt) */
> +        d->arch.dirty.second_lvl[l2][k2] = first[l][k];
> +    }
> +    unmap_domain_page(first[0]);
> +    unmap_domain_page(first[1]);
> +
> +    /* storing the start and end index */
> +    d->arch.dirty.second_lvl_start = xen_second_linear_base + gp2m_start_index;
> +    d->arch.dirty.second_lvl_end = xen_second_linear_base + gp2m_end_index;
> +
> +    flush_vlpt(d);
> +
> +    return 0;
> +}
> +
> +void cleanup_vlpt(struct domain *d)
> +{
> +    /* First level p2m is 2 consecutive pages */
> +    unmap_domain_page_global(d->arch.dirty.second_lvl[0]);
> +    unmap_domain_page_global(d->arch.dirty.second_lvl[1]);
> +}
>  /*
>   * This function should only be used to remap device address ranges
>   * TODO: add a check to verify this assumption
> diff --git a/xen/include/asm-arm/config.h b/xen/include/asm-arm/config.h
> index ef291ff..47d1bce 100644
> --- a/xen/include/asm-arm/config.h
> +++ b/xen/include/asm-arm/config.h
> @@ -87,6 +87,7 @@
>   *   0  -   8M   <COMMON>
>   *
>   *  32M - 128M   Frametable: 24 bytes per page for 16GB of RAM
> + * 128M - 256M   Virtual-linear mapping to P2M table
>   * 256M -   1G   VMAP: ioremap and early_ioremap use this virtual address
>   *                    space
>   *
> @@ -124,7 +125,9 @@
>  #define CONFIG_SEPARATE_XENHEAP 1
>  
>  #define FRAMETABLE_VIRT_START  _AT(vaddr_t,0x02000000)
> +#define VIRT_LIN_P2M_START     _AT(vaddr_t,0x08000000)
>  #define VMAP_VIRT_START  _AT(vaddr_t,0x10000000)
> +#define VIRT_LIN_P2M_END       VMAP_VIRT_START
>  #define XENHEAP_VIRT_START     _AT(vaddr_t,0x40000000)
>  #define XENHEAP_VIRT_END       _AT(vaddr_t,0x7fffffff)
>  #define DOMHEAP_VIRT_START     _AT(vaddr_t,0x80000000)
> @@ -157,6 +160,10 @@
>  
>  #define HYPERVISOR_VIRT_END    DIRECTMAP_VIRT_END
>  
> +/* VIRT_LIN_P2M_START and VIRT_LIN_P2M_END for vlpt */
> +#define VIRT_LIN_P2M_START     _AT(vaddr_t, 0x08000000)
> +#define VIRT_LIN_P2M_END       VMAP_VIRT_START
> +
>  #endif
>  
>  /* Fixmap slots */
> diff --git a/xen/include/asm-arm/domain.h b/xen/include/asm-arm/domain.h
> index 28c359a..5321bd6 100644
> --- a/xen/include/asm-arm/domain.h
> +++ b/xen/include/asm-arm/domain.h
> @@ -161,6 +161,13 @@ struct arch_domain
>          spinlock_t                  lock;
>      } vuart;
>  
> +    /* dirty-page tracing */
> +    struct {
> +        volatile int second_lvl_start;   /* for context switch */
> +        volatile int second_lvl_end;
> +        lpae_t *second_lvl[2];           /* copy of guest p2m's first */
> +    } dirty;
> +
>      unsigned int evtchn_irq;
>  }  __cacheline_aligned;
>  
> diff --git a/xen/include/asm-arm/mm.h b/xen/include/asm-arm/mm.h
> index 8347524..5fd684f 100644
> --- a/xen/include/asm-arm/mm.h
> +++ b/xen/include/asm-arm/mm.h
> @@ -4,6 +4,7 @@
>  #include <xen/config.h>
>  #include <xen/kernel.h>
>  #include <asm/page.h>
> +#include <asm/config.h>
>  #include <public/xen.h>
>  
>  /* Align Xen to a 2 MiB boundary. */
> @@ -342,6 +343,22 @@ static inline void put_page_and_type(struct page_info *page)
>      put_page(page);
>  }
>  
> +int prepare_vlpt(struct domain *d);
> +void cleanup_vlpt(struct domain *d);
> +void restore_vlpt(struct domain *d);
> +
> +/* calculate the xen's virtual address for accessing the leaf PTE of
> + * a given address (GPA) */
> +static inline lpae_t * get_vlpt_3lvl_pte(paddr_t addr)
> +{
> +    lpae_t *table = (lpae_t *)VIRT_LIN_P2M_START;
> +
> +    /* Since we slotted the guest's first p2m page table to xen's
> +     * second page table, one shift is enough for calculating the
> +     * index of guest p2m table entry */
> +    return &table[addr >> PAGE_SHIFT];
> +}
> +
>  #endif /*  __ARCH_ARM_MM__ */
>  /*
>   * Local variables:

  parent reply	other threads:[~2014-04-15 23:40 UTC|newest]

Thread overview: 47+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-04-15 21:05 [RFC v2 0/6] xen/arm: Support guest VM save/restore/migration Wei Huang
2014-04-15 21:05 ` [RFC v2 1/6] xen/arm: Save and restore support with hvm context hypercalls Wei Huang
2014-04-15 23:37   ` Andrew Cooper
2014-04-16 21:50     ` Wei Huang
2014-04-17 12:55       ` Andrew Cooper
2014-04-16  9:48   ` Julien Grall
2014-04-16 10:30     ` Jan Beulich
2014-04-16 15:54       ` Wei Huang
2014-04-17 15:06   ` Julien Grall
2014-04-17 16:55     ` Wei Huang
2014-05-12  9:16     ` Ian Campbell
2014-05-12 12:04       ` Julien Grall
     [not found]         ` <53723ACC.8040402@samsung.com>
2014-05-13 15:42           ` Julien Grall
2014-05-13 16:18             ` Wei Huang
2014-05-13 16:37               ` Julien Grall
2014-05-13 16:44                 ` Wei Huang
2014-05-13 17:33                   ` Julien Grall
2014-04-15 21:05 ` [RFC v2 2/6] xen/arm: implement support for XENMEM_maximum_gpfn hypercall Wei Huang
2014-04-15 22:46   ` Julien Grall
2014-04-16 15:33     ` Wei Huang
2014-04-15 21:05 ` [RFC v2 3/6] xen/arm: support guest do_suspend function Wei Huang
2014-04-15 23:38   ` Andrew Cooper
2014-04-15 23:39   ` Andrew Cooper
2014-04-16  9:10   ` Julien Grall
2014-04-15 21:05 ` [RFC v2 4/6] xen/arm: Implement VLPT for guest p2m mapping in live migration Wei Huang
2014-04-15 22:29   ` Julien Grall
2014-04-15 23:40   ` Andrew Cooper [this message]
2014-04-22 17:54   ` Julien Grall
2014-04-15 21:05 ` [RFC v2 5/6] xen/arm: Implement hypercall for dirty page tracing Wei Huang
2014-04-15 23:38   ` Andrew Cooper
2014-04-15 21:05 ` [RFC v2 6/6] xen/arm: Implement toolstack for xl restore/save and migrate Wei Huang
2014-04-15 23:40   ` Andrew Cooper
2014-04-15 21:05 ` [RFC v2 0/6] xen/arm: Support guest VM save/restore/migration Wei Huang
2014-04-15 22:23   ` Julien Grall
2014-04-15 21:05 ` [RFC v2 1/6] xen/arm: Save and restore support with hvm context hypercalls Wei Huang
2014-04-15 21:05 ` [RFC v2 2/6] xen/arm: implement support for XENMEM_maximum_gpfn hypercall Wei Huang
2014-04-15 21:05 ` [RFC v2 3/6] xen/arm: support guest do_suspend function Wei Huang
2014-04-15 21:05 ` [RFC v2 4/6] xen/arm: Implement VLPT for guest p2m mapping in live migration Wei Huang
2014-04-15 21:05 ` [RFC v2 5/6] xen/arm: Implement hypercall for dirty page tracing Wei Huang
2014-04-15 23:35   ` Julien Grall
2014-04-23 11:59   ` Julien Grall
2014-04-15 21:05 ` [RFC v2 6/6] xen/arm: Implement toolstack for xl restore/save and migrate Wei Huang
2014-04-16 16:29 ` [RFC v2 0/6] xen/arm: Support guest VM save/restore/migration Julien Grall
2014-04-16 16:41   ` Wei Huang
2014-04-16 16:50     ` Julien Grall
2014-04-23 11:49 ` Ian Campbell
2014-04-23 18:41   ` Wei Huang

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=534DC37D.1040509@citrix.com \
    --to=andrew.cooper3@citrix.com \
    --cc=ian.campbell@citrix.com \
    --cc=jaeyong.yoo@samsung.com \
    --cc=julien.grall@linaro.org \
    --cc=stefano.stabellini@eu.citrix.com \
    --cc=w1.huang@samsung.com \
    --cc=xen-devel@lists.xen.org \
    --cc=yjhyun.yoo@samsung.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.