All of lore.kernel.org
 help / color / mirror / Atom feed
From: David Vrabel <david.vrabel@citrix.com>
To: Juergen Gross <jgross@suse.com>, <linux-kernel@vger.kernel.org>,
	<xen-devel@lists.xensource.com>, <konrad.wilk@oracle.com>,
	<boris.ostrovsky@oracle.com>, <david.vrabel@citrix.com>,
	<jbeulich@suse.com>
Subject: Re: [Xen-devel] [PATCH V2 3/3] xen: eliminate scalability issues from initial mapping setup
Date: Wed, 17 Sep 2014 15:07:05 +0100	[thread overview]
Message-ID: <54199589.9040703@citrix.com> (raw)
In-Reply-To: <1410927157-15069-4-git-send-email-jgross@suse.com>

On 17/09/14 05:12, Juergen Gross wrote:
> Direct Xen to place the initial P->M table outside of the initial
> mapping, as otherwise the 1G (implementation) / 2G (theoretical)
> restriction on the size of the initial mapping limits the amount
> of memory a domain can be handed initially.
> 
> As the initial P->M table is copied rather early during boot to
> domain private memory and it's initial virtual mapping is dropped,
> the easiest way to avoid virtual address conflicts with other
> addresses in the kernel is to use a user address area for the
> virtual address of the initial P->M table. This allows us to just
> throw away the page tables of the initial mapping after the copy
> without having to care about address invalidation.

This needs an additional paragraph like:

  "This does not increase the amount of memory the guest can use.  This
is still limited to 512 GiB by the 3-level p2m."

> --- a/arch/x86/xen/mmu.c
> +++ b/arch/x86/xen/mmu.c
> @@ -1198,6 +1198,76 @@ static void __init xen_cleanhighmap(unsigned long vaddr,
[...]
> +/*
> + * Since it is well isolated we can (and since it is perhaps large we should)
> + * also free the page tables mapping the initial P->M table.
> + */
> +static void __init xen_cleanmfnmap(unsigned long vaddr)
> +{
> +	unsigned long va = vaddr & PMD_MASK;
> +	unsigned long pa;
> +	pgd_t *pgd = pgd_offset_k(va);
> +	pud_t *pud_page = pud_offset(pgd, 0);
> +	pud_t *pud;
> +	pmd_t *pmd;
> +	pte_t *pte;
> +	unsigned int i;
> +
> +	set_pgd(pgd, __pgd(0));
> +	do {
> +		pud = pud_page + pud_index(va);
> +		if (pud_none(*pud)) {
> +			va += PUD_SIZE;
> +		} else if (pud_large(*pud)) {
> +			pa = pud_val(*pud) & PHYSICAL_PAGE_MASK;
> +			xen_free_ro_pages(pa, PUD_SIZE);
> +			va += PUD_SIZE;

Are you missing a ClearPagePinned(..) here?

> +		} else {
> +			pmd = pmd_offset(pud, va);
> +			if (pmd_large(*pmd)) {
> +				pa = pmd_val(*pmd) & PHYSICAL_PAGE_MASK;
> +				xen_free_ro_pages(pa, PMD_SIZE);
> +			} else if (!pmd_none(*pmd)) {
> +				pte = pte_offset_kernel(pmd, va);
> +				for (i = 0; i < PTRS_PER_PTE; ++i) {
> +					if (pte_none(pte[i]))
> +						break;
> +					pa = pte_pfn(pte[i]) << PAGE_SHIFT;
> +					xen_free_ro_pages(pa, PAGE_SIZE);
> +				}

> +				pa = __pa(pte) & PHYSICAL_PAGE_MASK;
> +				ClearPagePinned(virt_to_page(__va(pa)));
> +				xen_free_ro_pages(pa, PAGE_SIZE);

Put this into a helper function?  It's used here...

> +			}
> +			va += PMD_SIZE;
> +			if (pmd_index(va))
> +				continue;
> +			pa = __pa(pmd) & PHYSICAL_PAGE_MASK;
> +			ClearPagePinned(virt_to_page(__va(pa)));
> +			xen_free_ro_pages(pa, PAGE_SIZE);

...and here...

> +		}
> +
> +	} while (pud_index(va) || pmd_index(va));
> +	pa = __pa(pud_page) & PHYSICAL_PAGE_MASK;
> +	ClearPagePinned(virt_to_page(__va(pa)));
> +	xen_free_ro_pages(pa, PAGE_SIZE);

... and here.

> @@ -1529,6 +1604,22 @@ static pte_t __init mask_rw_pte(pte_t *ptep, pte_t pte)
>  #else /* CONFIG_X86_64 */
>  static pte_t __init mask_rw_pte(pte_t *ptep, pte_t pte)
>  {
> +	unsigned long pfn;
> +
> +	if (xen_feature(XENFEAT_writable_page_tables) ||
> +	    xen_feature(XENFEAT_auto_translated_physmap) ||
> +	    xen_start_info->mfn_list >= __START_KERNEL_map)
> +		return pte;
> +
> +	/*
> +	 * Pages belonging to the initial p2m list mapped outside the default
> +	 * address range must be mapped read-only.

Why?  I didn't think was anything special about these MFNs.

David

WARNING: multiple messages have this Message-ID (diff)
From: David Vrabel <david.vrabel@citrix.com>
To: Juergen Gross <jgross@suse.com>,
	linux-kernel@vger.kernel.org, xen-devel@lists.xensource.com,
	konrad.wilk@oracle.com, boris.ostrovsky@oracle.com,
	david.vrabel@citrix.com, jbeulich@suse.com
Subject: Re: [Xen-devel] [PATCH V2 3/3] xen: eliminate scalability issues from initial mapping setup
Date: Wed, 17 Sep 2014 15:07:05 +0100	[thread overview]
Message-ID: <54199589.9040703@citrix.com> (raw)
In-Reply-To: <1410927157-15069-4-git-send-email-jgross@suse.com>

On 17/09/14 05:12, Juergen Gross wrote:
> Direct Xen to place the initial P->M table outside of the initial
> mapping, as otherwise the 1G (implementation) / 2G (theoretical)
> restriction on the size of the initial mapping limits the amount
> of memory a domain can be handed initially.
> 
> As the initial P->M table is copied rather early during boot to
> domain private memory and it's initial virtual mapping is dropped,
> the easiest way to avoid virtual address conflicts with other
> addresses in the kernel is to use a user address area for the
> virtual address of the initial P->M table. This allows us to just
> throw away the page tables of the initial mapping after the copy
> without having to care about address invalidation.

This needs an additional paragraph like:

  "This does not increase the amount of memory the guest can use.  This
is still limited to 512 GiB by the 3-level p2m."

> --- a/arch/x86/xen/mmu.c
> +++ b/arch/x86/xen/mmu.c
> @@ -1198,6 +1198,76 @@ static void __init xen_cleanhighmap(unsigned long vaddr,
[...]
> +/*
> + * Since it is well isolated we can (and since it is perhaps large we should)
> + * also free the page tables mapping the initial P->M table.
> + */
> +static void __init xen_cleanmfnmap(unsigned long vaddr)
> +{
> +	unsigned long va = vaddr & PMD_MASK;
> +	unsigned long pa;
> +	pgd_t *pgd = pgd_offset_k(va);
> +	pud_t *pud_page = pud_offset(pgd, 0);
> +	pud_t *pud;
> +	pmd_t *pmd;
> +	pte_t *pte;
> +	unsigned int i;
> +
> +	set_pgd(pgd, __pgd(0));
> +	do {
> +		pud = pud_page + pud_index(va);
> +		if (pud_none(*pud)) {
> +			va += PUD_SIZE;
> +		} else if (pud_large(*pud)) {
> +			pa = pud_val(*pud) & PHYSICAL_PAGE_MASK;
> +			xen_free_ro_pages(pa, PUD_SIZE);
> +			va += PUD_SIZE;

Are you missing a ClearPagePinned(..) here?

> +		} else {
> +			pmd = pmd_offset(pud, va);
> +			if (pmd_large(*pmd)) {
> +				pa = pmd_val(*pmd) & PHYSICAL_PAGE_MASK;
> +				xen_free_ro_pages(pa, PMD_SIZE);
> +			} else if (!pmd_none(*pmd)) {
> +				pte = pte_offset_kernel(pmd, va);
> +				for (i = 0; i < PTRS_PER_PTE; ++i) {
> +					if (pte_none(pte[i]))
> +						break;
> +					pa = pte_pfn(pte[i]) << PAGE_SHIFT;
> +					xen_free_ro_pages(pa, PAGE_SIZE);
> +				}

> +				pa = __pa(pte) & PHYSICAL_PAGE_MASK;
> +				ClearPagePinned(virt_to_page(__va(pa)));
> +				xen_free_ro_pages(pa, PAGE_SIZE);

Put this into a helper function?  It's used here...

> +			}
> +			va += PMD_SIZE;
> +			if (pmd_index(va))
> +				continue;
> +			pa = __pa(pmd) & PHYSICAL_PAGE_MASK;
> +			ClearPagePinned(virt_to_page(__va(pa)));
> +			xen_free_ro_pages(pa, PAGE_SIZE);

...and here...

> +		}
> +
> +	} while (pud_index(va) || pmd_index(va));
> +	pa = __pa(pud_page) & PHYSICAL_PAGE_MASK;
> +	ClearPagePinned(virt_to_page(__va(pa)));
> +	xen_free_ro_pages(pa, PAGE_SIZE);

... and here.

> @@ -1529,6 +1604,22 @@ static pte_t __init mask_rw_pte(pte_t *ptep, pte_t pte)
>  #else /* CONFIG_X86_64 */
>  static pte_t __init mask_rw_pte(pte_t *ptep, pte_t pte)
>  {
> +	unsigned long pfn;
> +
> +	if (xen_feature(XENFEAT_writable_page_tables) ||
> +	    xen_feature(XENFEAT_auto_translated_physmap) ||
> +	    xen_start_info->mfn_list >= __START_KERNEL_map)
> +		return pte;
> +
> +	/*
> +	 * Pages belonging to the initial p2m list mapped outside the default
> +	 * address range must be mapped read-only.

Why?  I didn't think was anything special about these MFNs.

David

  reply	other threads:[~2014-09-17 14:08 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-09-17  4:12 [PATCH V2 0/3] xen: remove some memory limits from pv-domains Juergen Gross
2014-09-17  4:12 ` [PATCH V2 1/3] xen: sync some headers with xen tree Juergen Gross
2014-09-17  4:12 ` [PATCH V2 2/3] xen: eliminate scalability issues from initrd handling Juergen Gross
2014-09-17 13:45   ` [Xen-devel] " David Vrabel
2014-09-17 13:45     ` David Vrabel
2014-09-17 14:01     ` Juergen Gross
2014-09-17  4:12 ` [PATCH V2 3/3] xen: eliminate scalability issues from initial mapping setup Juergen Gross
2014-09-17 14:07   ` David Vrabel [this message]
2014-09-17 14:07     ` [Xen-devel] " David Vrabel
2014-09-17 14:17     ` Jan Beulich
2014-09-17 14:17     ` Jan Beulich
2014-09-17 14:20     ` [Xen-devel] " Juergen Gross
2014-09-17 14:42       ` David Vrabel
2014-09-17 14:42         ` David Vrabel
2014-09-17 14:47         ` Juergen Gross
2014-09-17 14:43 ` [PATCH V2 0/3] xen: remove some memory limits from pv-domains David Vrabel
2014-09-17 14:43   ` David Vrabel
2014-09-17 14:48   ` Juergen Gross

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=54199589.9040703@citrix.com \
    --to=david.vrabel@citrix.com \
    --cc=boris.ostrovsky@oracle.com \
    --cc=jbeulich@suse.com \
    --cc=jgross@suse.com \
    --cc=konrad.wilk@oracle.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=xen-devel@lists.xensource.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.