xen-devel.lists.xenproject.org archive mirror
 help / color / mirror / Atom feed
From: Andrew Cooper <andrew.cooper3@citrix.com>
To: Wei Huang <w1.huang@samsung.com>, xen-devel@lists.xen.org
Cc: yjhyun.yoo@samsung.com, julien.grall@linaro.org,
	ian.campbell@citrix.com, jaeyong.yoo@samsung.com,
	stefano.stabellini@eu.citrix.com
Subject: Re: [RFC v2 4/6] xen/arm: Implement VLPT for guest p2m mapping in live migration
Date: Wed, 16 Apr 2014 00:40:45 +0100	[thread overview]
Message-ID: <534DC37D.1040509@citrix.com> (raw)
In-Reply-To: <1397595918-30419-5-git-send-email-w1.huang@samsung.com>

On 15/04/2014 22:05, Wei Huang wrote:
> From: Jaeyong Yoo <jaeyong.yoo@samsung.com>
>
> Thsi patch implements VLPT (virtual-linear page table) for fast accessing
> of 3rd PTE of guest P2M. For more info about VLPT, please see
> http://www.technovelty.org/linux/virtual-linear-page-table.html.
>
> When creating a mapping for VLPT, just copy the 1st level PTE of guest p2m
> to xen's 2nd level PTE. Then the mapping becomes the following:
>       xen's 1st PTE -->
>       xen's 2nd PTE (which is the same as 1st PTE of guest p2m) -->
>       guest p2m's 2nd PTE -->
>       guest p2m's 3rd PTE (the memory contents where the vlpt points)
>
> This function is used in dirty-page tracing. When domU write-fault is
> trapped by xen, xen can immediately locate the 3rd PTE of guest p2m.
>
> The following link shows the performance comparison for handling a
> dirty-page between vlpt and typical page table walking.
> http://lists.xen.org/archives/html/xen-devel/2013-08/msg01503.html
>
> Signed-off-by: Jaeyong Yoo <jaeyong.yoo@samsung.com>
> ---
>  xen/arch/arm/domain.c        |    5 ++
>  xen/arch/arm/mm.c            |  116 ++++++++++++++++++++++++++++++++++++++++++
>  xen/include/asm-arm/config.h |    7 +++
>  xen/include/asm-arm/domain.h |    7 +++
>  xen/include/asm-arm/mm.h     |   17 +++++++
>  5 files changed, 152 insertions(+)
>
> diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
> index b125857..3f04a77 100644
> --- a/xen/arch/arm/domain.c
> +++ b/xen/arch/arm/domain.c
> @@ -502,6 +502,11 @@ int arch_domain_create(struct domain *d, unsigned int domcr_flags)
>      /* Default the virtual ID to match the physical */
>      d->arch.vpidr = boot_cpu_data.midr.bits;
>  
> +    d->arch.dirty.second_lvl_start = 0;
> +    d->arch.dirty.second_lvl_end = 0;
> +    d->arch.dirty.second_lvl[0] = NULL;
> +    d->arch.dirty.second_lvl[1] = NULL;
> +

alloc_domain_struct() clears the domain page before handing it back.
Initialising things like this to 0 is pointless.

>      clear_page(d->shared_info);
>      share_xen_page_with_guest(
>          virt_to_page(d->shared_info), d, XENSHARE_writable);
> diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
> index 473ad04..a315752 100644
> --- a/xen/arch/arm/mm.c
> +++ b/xen/arch/arm/mm.c
> @@ -750,6 +750,122 @@ void *__init arch_vmap_virt_end(void)
>      return (void *)VMAP_VIRT_END;
>  }
>  
> +/* Flush the vlpt area */
> +void flush_vlpt(struct domain *d)
> +{
> +    int flush_size;
> +
> +    flush_size = (d->arch.dirty.second_lvl_end -
> +                  d->arch.dirty.second_lvl_start) << SECOND_SHIFT;
> +
> +    /* flushing the 3rd level mapping */
> +    flush_xen_data_tlb_range_va(d->arch.dirty.second_lvl_start << SECOND_SHIFT,

This is taking a volatile int, shifting left by 21 and then using it as
a virtual address.  It is going to do the wrong thing more often than
the right thing.

second_lvl_{start,end} should be unsigned long, and I cant see a
justification for why they should be volatile.

> +                                flush_size);
> +}
> +
> +/* Restore the xen page table for vlpt mapping for domain */
> +void restore_vlpt(struct domain *d)
> +{
> +    int i;
> +
> +    dsb(sy);
> +
> +    for ( i = d->arch.dirty.second_lvl_start; i < d->arch.dirty.second_lvl_end;
> +          ++i )
> +    {
> +        int k = i % LPAE_ENTRIES;
> +        int l = i / LPAE_ENTRIES;

These are all array indices - they should all be unsigned, as should the
loop variable.

> +
> +        if ( xen_second[i].bits != d->arch.dirty.second_lvl[l][k].bits )
> +        {
> +            write_pte(&xen_second[i], d->arch.dirty.second_lvl[l][k]);
> +            flush_xen_data_tlb_range_va(i << SECOND_SHIFT, 1 << SECOND_SHIFT);

shifted int as virtual address.  Also, SECOND_SIZE exists.

> +        }
> +    }
> +    
> +    dsb(sy);
> +    isb();
> +}
> +
> +/* Set up the xen page table for vlpt mapping for domain */
> +int prepare_vlpt(struct domain *d)
> +{
> +    int xen_second_linear_base;
> +    int gp2m_start_index, gp2m_end_index;
> +    struct p2m_domain *p2m = &d->arch.p2m;
> +    struct page_info *second_lvl_page;
> +    paddr_t gma_start = 0;
> +    paddr_t gma_end = 0;
> +    lpae_t *first[2];
> +    int i;
> +    uint64_t required, avail = VIRT_LIN_P2M_END - VIRT_LIN_P2M_START;
> +
> +    domain_get_gpfn_range(d, &gma_start, &gma_end);
> +    required = (gma_end - gma_start) >> LPAE_SHIFT;
> +
> +    if ( required > avail )
> +    {
> +        dprintk(XENLOG_ERR, "Available VLPT is small for domU guest"
> +                "(avail: %llx, required: %llx)\n", (unsigned long long)avail,
> +                (unsigned long long)required);
> +        return -ENOMEM;
> +    }
> +
> +    xen_second_linear_base = second_linear_offset(VIRT_LIN_P2M_START);

unsigned int held in signed integer value.

> +
> +    gp2m_start_index = gma_start >> FIRST_SHIFT;
> +    gp2m_end_index = (gma_end >> FIRST_SHIFT) + 1;
> +
> +    if ( xen_second_linear_base + gp2m_end_index >= LPAE_ENTRIES * 2 )
> +    {
> +        dprintk(XENLOG_ERR, "xen second page is small for VLPT for domU");
> +        return -ENOMEM;
> +    }
> +
> +    second_lvl_page = alloc_domheap_pages(NULL, 1, 0);
> +    if ( second_lvl_page == NULL )
> +        return -ENOMEM;
> +
> +    /* First level p2m is 2 consecutive pages */
> +    d->arch.dirty.second_lvl[0] = map_domain_page_global(
> +        page_to_mfn(second_lvl_page) );
> +    d->arch.dirty.second_lvl[1] = map_domain_page_global(
> +        page_to_mfn(second_lvl_page+1) );
> +
> +    first[0] = __map_domain_page(p2m->first_level);
> +    first[1] = __map_domain_page(p2m->first_level+1);

Xen style - spaces immediately surrounding binary operators.

~Andrew

> +
> +    for ( i = gp2m_start_index; i < gp2m_end_index; ++i )
> +    {
> +        int k = i % LPAE_ENTRIES;
> +        int l = i / LPAE_ENTRIES;
> +        int k2 = (xen_second_linear_base + i) % LPAE_ENTRIES;
> +        int l2 = (xen_second_linear_base + i) / LPAE_ENTRIES;
> +
> +        write_pte(&xen_second[xen_second_linear_base+i], first[l][k]);
> +
> +        /* we copy the mapping into domain's structure as a reference
> +         * in case of the context switch (used in restore_vlpt) */
> +        d->arch.dirty.second_lvl[l2][k2] = first[l][k];
> +    }
> +    unmap_domain_page(first[0]);
> +    unmap_domain_page(first[1]);
> +
> +    /* storing the start and end index */
> +    d->arch.dirty.second_lvl_start = xen_second_linear_base + gp2m_start_index;
> +    d->arch.dirty.second_lvl_end = xen_second_linear_base + gp2m_end_index;
> +
> +    flush_vlpt(d);
> +
> +    return 0;
> +}
> +
> +void cleanup_vlpt(struct domain *d)
> +{
> +    /* First level p2m is 2 consecutive pages */
> +    unmap_domain_page_global(d->arch.dirty.second_lvl[0]);
> +    unmap_domain_page_global(d->arch.dirty.second_lvl[1]);
> +}
>  /*
>   * This function should only be used to remap device address ranges
>   * TODO: add a check to verify this assumption
> diff --git a/xen/include/asm-arm/config.h b/xen/include/asm-arm/config.h
> index ef291ff..47d1bce 100644
> --- a/xen/include/asm-arm/config.h
> +++ b/xen/include/asm-arm/config.h
> @@ -87,6 +87,7 @@
>   *   0  -   8M   <COMMON>
>   *
>   *  32M - 128M   Frametable: 24 bytes per page for 16GB of RAM
> + * 128M - 256M   Virtual-linear mapping to P2M table
>   * 256M -   1G   VMAP: ioremap and early_ioremap use this virtual address
>   *                    space
>   *
> @@ -124,7 +125,9 @@
>  #define CONFIG_SEPARATE_XENHEAP 1
>  
>  #define FRAMETABLE_VIRT_START  _AT(vaddr_t,0x02000000)
> +#define VIRT_LIN_P2M_START     _AT(vaddr_t,0x08000000)
>  #define VMAP_VIRT_START  _AT(vaddr_t,0x10000000)
> +#define VIRT_LIN_P2M_END       VMAP_VIRT_START
>  #define XENHEAP_VIRT_START     _AT(vaddr_t,0x40000000)
>  #define XENHEAP_VIRT_END       _AT(vaddr_t,0x7fffffff)
>  #define DOMHEAP_VIRT_START     _AT(vaddr_t,0x80000000)
> @@ -157,6 +160,10 @@
>  
>  #define HYPERVISOR_VIRT_END    DIRECTMAP_VIRT_END
>  
> +/* VIRT_LIN_P2M_START and VIRT_LIN_P2M_END for vlpt */
> +#define VIRT_LIN_P2M_START     _AT(vaddr_t, 0x08000000)
> +#define VIRT_LIN_P2M_END       VMAP_VIRT_START
> +
>  #endif
>  
>  /* Fixmap slots */
> diff --git a/xen/include/asm-arm/domain.h b/xen/include/asm-arm/domain.h
> index 28c359a..5321bd6 100644
> --- a/xen/include/asm-arm/domain.h
> +++ b/xen/include/asm-arm/domain.h
> @@ -161,6 +161,13 @@ struct arch_domain
>          spinlock_t                  lock;
>      } vuart;
>  
> +    /* dirty-page tracing */
> +    struct {
> +        volatile int second_lvl_start;   /* for context switch */
> +        volatile int second_lvl_end;
> +        lpae_t *second_lvl[2];           /* copy of guest p2m's first */
> +    } dirty;
> +
>      unsigned int evtchn_irq;
>  }  __cacheline_aligned;
>  
> diff --git a/xen/include/asm-arm/mm.h b/xen/include/asm-arm/mm.h
> index 8347524..5fd684f 100644
> --- a/xen/include/asm-arm/mm.h
> +++ b/xen/include/asm-arm/mm.h
> @@ -4,6 +4,7 @@
>  #include <xen/config.h>
>  #include <xen/kernel.h>
>  #include <asm/page.h>
> +#include <asm/config.h>
>  #include <public/xen.h>
>  
>  /* Align Xen to a 2 MiB boundary. */
> @@ -342,6 +343,22 @@ static inline void put_page_and_type(struct page_info *page)
>      put_page(page);
>  }
>  
> +int prepare_vlpt(struct domain *d);
> +void cleanup_vlpt(struct domain *d);
> +void restore_vlpt(struct domain *d);
> +
> +/* calculate the xen's virtual address for accessing the leaf PTE of
> + * a given address (GPA) */
> +static inline lpae_t * get_vlpt_3lvl_pte(paddr_t addr)
> +{
> +    lpae_t *table = (lpae_t *)VIRT_LIN_P2M_START;
> +
> +    /* Since we slotted the guest's first p2m page table to xen's
> +     * second page table, one shift is enough for calculating the
> +     * index of guest p2m table entry */
> +    return &table[addr >> PAGE_SHIFT];
> +}
> +
>  #endif /*  __ARCH_ARM_MM__ */
>  /*
>   * Local variables:

  parent reply	other threads:[~2014-04-15 23:40 UTC|newest]

Thread overview: 47+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-04-15 21:05 [RFC v2 0/6] xen/arm: Support guest VM save/restore/migration Wei Huang
2014-04-15 21:05 ` [RFC v2 1/6] xen/arm: Save and restore support with hvm context hypercalls Wei Huang
2014-04-15 23:37   ` Andrew Cooper
2014-04-16 21:50     ` Wei Huang
2014-04-17 12:55       ` Andrew Cooper
2014-04-16  9:48   ` Julien Grall
2014-04-16 10:30     ` Jan Beulich
2014-04-16 15:54       ` Wei Huang
2014-04-17 15:06   ` Julien Grall
2014-04-17 16:55     ` Wei Huang
2014-05-12  9:16     ` Ian Campbell
2014-05-12 12:04       ` Julien Grall
     [not found]         ` <53723ACC.8040402@samsung.com>
2014-05-13 15:42           ` Julien Grall
2014-05-13 16:18             ` Wei Huang
2014-05-13 16:37               ` Julien Grall
2014-05-13 16:44                 ` Wei Huang
2014-05-13 17:33                   ` Julien Grall
2014-04-15 21:05 ` [RFC v2 2/6] xen/arm: implement support for XENMEM_maximum_gpfn hypercall Wei Huang
2014-04-15 22:46   ` Julien Grall
2014-04-16 15:33     ` Wei Huang
2014-04-15 21:05 ` [RFC v2 3/6] xen/arm: support guest do_suspend function Wei Huang
2014-04-15 23:38   ` Andrew Cooper
2014-04-15 23:39   ` Andrew Cooper
2014-04-16  9:10   ` Julien Grall
2014-04-15 21:05 ` [RFC v2 4/6] xen/arm: Implement VLPT for guest p2m mapping in live migration Wei Huang
2014-04-15 22:29   ` Julien Grall
2014-04-15 23:40   ` Andrew Cooper [this message]
2014-04-22 17:54   ` Julien Grall
2014-04-15 21:05 ` [RFC v2 5/6] xen/arm: Implement hypercall for dirty page tracing Wei Huang
2014-04-15 23:38   ` Andrew Cooper
2014-04-15 21:05 ` [RFC v2 6/6] xen/arm: Implement toolstack for xl restore/save and migrate Wei Huang
2014-04-15 23:40   ` Andrew Cooper
2014-04-15 21:05 ` [RFC v2 0/6] xen/arm: Support guest VM save/restore/migration Wei Huang
2014-04-15 22:23   ` Julien Grall
2014-04-15 21:05 ` [RFC v2 1/6] xen/arm: Save and restore support with hvm context hypercalls Wei Huang
2014-04-15 21:05 ` [RFC v2 2/6] xen/arm: implement support for XENMEM_maximum_gpfn hypercall Wei Huang
2014-04-15 21:05 ` [RFC v2 3/6] xen/arm: support guest do_suspend function Wei Huang
2014-04-15 21:05 ` [RFC v2 4/6] xen/arm: Implement VLPT for guest p2m mapping in live migration Wei Huang
2014-04-15 21:05 ` [RFC v2 5/6] xen/arm: Implement hypercall for dirty page tracing Wei Huang
2014-04-15 23:35   ` Julien Grall
2014-04-23 11:59   ` Julien Grall
2014-04-15 21:05 ` [RFC v2 6/6] xen/arm: Implement toolstack for xl restore/save and migrate Wei Huang
2014-04-16 16:29 ` [RFC v2 0/6] xen/arm: Support guest VM save/restore/migration Julien Grall
2014-04-16 16:41   ` Wei Huang
2014-04-16 16:50     ` Julien Grall
2014-04-23 11:49 ` Ian Campbell
2014-04-23 18:41   ` Wei Huang

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=534DC37D.1040509@citrix.com \
    --to=andrew.cooper3@citrix.com \
    --cc=ian.campbell@citrix.com \
    --cc=jaeyong.yoo@samsung.com \
    --cc=julien.grall@linaro.org \
    --cc=stefano.stabellini@eu.citrix.com \
    --cc=w1.huang@samsung.com \
    --cc=xen-devel@lists.xen.org \
    --cc=yjhyun.yoo@samsung.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).