All of lore.kernel.org
 help / color / mirror / Atom feed
From: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
To: Andrea Arcangeli <aarcange@redhat.com>
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	Hillf Danton <dhillf@gmail.com>, Dan Smith <danms@us.ibm.com>,
	Peter Zijlstra <a.p.zijlstra@chello.nl>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Thomas Gleixner <tglx@linutronix.de>, Ingo Molnar <mingo@elte.hu>,
	Paul Turner <pjt@google.com>,
	Suresh Siddha <suresh.b.siddha@intel.com>,
	Mike Galbraith <efault@gmx.de>,
	"Paul E. McKenney" <paulmck@linux.vnet.ibm.com>,
	Lai Jiangshan <laijs@cn.fujitsu.com>,
	Bharata B Rao <bharata.rao@gmail.com>,
	Lee Schermerhorn <Lee.Schermerhorn@hp.com>,
	Rik van Riel <riel@redhat.com>,
	Johannes Weiner <hannes@cmpxchg.org>
Subject: Re: [PATCH 02/39] xen: document Xen is using an unused bit for the pagetables
Date: Fri, 30 Mar 2012 17:40:05 -0400	[thread overview]
Message-ID: <20120330214005.GC23599@phenom.dumpdata.com> (raw)
In-Reply-To: <1332783986-24195-3-git-send-email-aarcange@redhat.com>

On Mon, Mar 26, 2012 at 07:45:49PM +0200, Andrea Arcangeli wrote:
> Xen has taken over the last reserved bit available for the pagetables
> which is set through ioremap, this documents it and makes the code
> more readable.

About a year ago we redid the P2M code to ditch the major use case for this.
But there were two left over cases that I hadn't found a good solution
for that would allow us to completly eliminate the use of this bit:

1). When setting a PTE of a PFN which overlaps an E820 hole or any of the
    non-E820-RAM entries, we lookup in the P2M and find out that
    the PFN is a 1:1 and return a pte.pte | pfn << PAGE_SIZE.

    But we also stick the _PAGE_IOMAP on it so that when the call to
    xen_pte_val is done we don't end up doing the lookup in the P2M tree
    once more and just set the pte as is.

    So this is the dance between xen_pte_val and xen_make_pte.

2). When the userspace tries to mmap a guest memory for save/migrate
    or to setup something in the guest, it would use the xen_remap_domain_mfn_range
    to setup PTE's with the guest's PFN (gpfn). The _PAGE_IOMAP
    is used again to tell xen_pte_val to not bother looking it up in the
    P2M tree and use it as is.

So.. any thoughts on how to eliminate the usage of this?
> 
> Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
> ---
>  arch/x86/include/asm/pgtable_types.h |   11 +++++++++--
>  1 files changed, 9 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/x86/include/asm/pgtable_types.h b/arch/x86/include/asm/pgtable_types.h
> index 013286a..b74cac9 100644
> --- a/arch/x86/include/asm/pgtable_types.h
> +++ b/arch/x86/include/asm/pgtable_types.h
> @@ -17,7 +17,7 @@
>  #define _PAGE_BIT_PAT		7	/* on 4KB pages */
>  #define _PAGE_BIT_GLOBAL	8	/* Global TLB entry PPro+ */
>  #define _PAGE_BIT_UNUSED1	9	/* available for programmer */
> -#define _PAGE_BIT_IOMAP		10	/* flag used to indicate IO mapping */
> +#define _PAGE_BIT_UNUSED2	10
>  #define _PAGE_BIT_HIDDEN	11	/* hidden by kmemcheck */
>  #define _PAGE_BIT_PAT_LARGE	12	/* On 2MB or 1GB pages */
>  #define _PAGE_BIT_SPECIAL	_PAGE_BIT_UNUSED1
> @@ -41,7 +41,7 @@
>  #define _PAGE_PSE	(_AT(pteval_t, 1) << _PAGE_BIT_PSE)
>  #define _PAGE_GLOBAL	(_AT(pteval_t, 1) << _PAGE_BIT_GLOBAL)
>  #define _PAGE_UNUSED1	(_AT(pteval_t, 1) << _PAGE_BIT_UNUSED1)
> -#define _PAGE_IOMAP	(_AT(pteval_t, 1) << _PAGE_BIT_IOMAP)
> +#define _PAGE_UNUSED2	(_AT(pteval_t, 1) << _PAGE_BIT_UNUSED2)
>  #define _PAGE_PAT	(_AT(pteval_t, 1) << _PAGE_BIT_PAT)
>  #define _PAGE_PAT_LARGE (_AT(pteval_t, 1) << _PAGE_BIT_PAT_LARGE)
>  #define _PAGE_SPECIAL	(_AT(pteval_t, 1) << _PAGE_BIT_SPECIAL)
> @@ -49,6 +49,13 @@
>  #define _PAGE_SPLITTING	(_AT(pteval_t, 1) << _PAGE_BIT_SPLITTING)
>  #define __HAVE_ARCH_PTE_SPECIAL
>  
> +/* flag used to indicate IO mapping */
> +#ifdef CONFIG_XEN
> +#define _PAGE_IOMAP	(_AT(pteval_t, 1) << _PAGE_BIT_UNUSED2)
> +#else
> +#define _PAGE_IOMAP	(_AT(pteval_t, 0))
> +#endif
> +
>  #ifdef CONFIG_KMEMCHECK
>  #define _PAGE_HIDDEN	(_AT(pteval_t, 1) << _PAGE_BIT_HIDDEN)
>  #else
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

WARNING: multiple messages have this Message-ID (diff)
From: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
To: Andrea Arcangeli <aarcange@redhat.com>
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	Hillf Danton <dhillf@gmail.com>, Dan Smith <danms@us.ibm.com>,
	Peter Zijlstra <a.p.zijlstra@chello.nl>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Thomas Gleixner <tglx@linutronix.de>, Ingo Molnar <mingo@elte.hu>,
	Paul Turner <pjt@google.com>,
	Suresh Siddha <suresh.b.siddha@intel.com>,
	Mike Galbraith <efault@gmx.de>,
	"Paul E. McKenney" <paulmck@linux.vnet.ibm.com>,
	Lai Jiangshan <laijs@cn.fujitsu.com>,
	Bharata B Rao <bharata.rao@gmail.com>,
	Lee Schermerhorn <Lee.Schermerhorn@hp.com>,
	Rik van Riel <riel@redhat.com>,
	Johannes Weiner <hannes@cmpxchg.org>
Subject: Re: [PATCH 02/39] xen: document Xen is using an unused bit for the pagetables
Date: Fri, 30 Mar 2012 17:40:05 -0400	[thread overview]
Message-ID: <20120330214005.GC23599@phenom.dumpdata.com> (raw)
In-Reply-To: <1332783986-24195-3-git-send-email-aarcange@redhat.com>

On Mon, Mar 26, 2012 at 07:45:49PM +0200, Andrea Arcangeli wrote:
> Xen has taken over the last reserved bit available for the pagetables
> which is set through ioremap, this documents it and makes the code
> more readable.

About a year ago we redid the P2M code to ditch the major use case for this.
But there were two left over cases that I hadn't found a good solution
for that would allow us to completly eliminate the use of this bit:

1). When setting a PTE of a PFN which overlaps an E820 hole or any of the
    non-E820-RAM entries, we lookup in the P2M and find out that
    the PFN is a 1:1 and return a pte.pte | pfn << PAGE_SIZE.

    But we also stick the _PAGE_IOMAP on it so that when the call to
    xen_pte_val is done we don't end up doing the lookup in the P2M tree
    once more and just set the pte as is.

    So this is the dance between xen_pte_val and xen_make_pte.

2). When the userspace tries to mmap a guest memory for save/migrate
    or to setup something in the guest, it would use the xen_remap_domain_mfn_range
    to setup PTE's with the guest's PFN (gpfn). The _PAGE_IOMAP
    is used again to tell xen_pte_val to not bother looking it up in the
    P2M tree and use it as is.

So.. any thoughts on how to eliminate the usage of this?
> 
> Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
> ---
>  arch/x86/include/asm/pgtable_types.h |   11 +++++++++--
>  1 files changed, 9 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/x86/include/asm/pgtable_types.h b/arch/x86/include/asm/pgtable_types.h
> index 013286a..b74cac9 100644
> --- a/arch/x86/include/asm/pgtable_types.h
> +++ b/arch/x86/include/asm/pgtable_types.h
> @@ -17,7 +17,7 @@
>  #define _PAGE_BIT_PAT		7	/* on 4KB pages */
>  #define _PAGE_BIT_GLOBAL	8	/* Global TLB entry PPro+ */
>  #define _PAGE_BIT_UNUSED1	9	/* available for programmer */
> -#define _PAGE_BIT_IOMAP		10	/* flag used to indicate IO mapping */
> +#define _PAGE_BIT_UNUSED2	10
>  #define _PAGE_BIT_HIDDEN	11	/* hidden by kmemcheck */
>  #define _PAGE_BIT_PAT_LARGE	12	/* On 2MB or 1GB pages */
>  #define _PAGE_BIT_SPECIAL	_PAGE_BIT_UNUSED1
> @@ -41,7 +41,7 @@
>  #define _PAGE_PSE	(_AT(pteval_t, 1) << _PAGE_BIT_PSE)
>  #define _PAGE_GLOBAL	(_AT(pteval_t, 1) << _PAGE_BIT_GLOBAL)
>  #define _PAGE_UNUSED1	(_AT(pteval_t, 1) << _PAGE_BIT_UNUSED1)
> -#define _PAGE_IOMAP	(_AT(pteval_t, 1) << _PAGE_BIT_IOMAP)
> +#define _PAGE_UNUSED2	(_AT(pteval_t, 1) << _PAGE_BIT_UNUSED2)
>  #define _PAGE_PAT	(_AT(pteval_t, 1) << _PAGE_BIT_PAT)
>  #define _PAGE_PAT_LARGE (_AT(pteval_t, 1) << _PAGE_BIT_PAT_LARGE)
>  #define _PAGE_SPECIAL	(_AT(pteval_t, 1) << _PAGE_BIT_SPECIAL)
> @@ -49,6 +49,13 @@
>  #define _PAGE_SPLITTING	(_AT(pteval_t, 1) << _PAGE_BIT_SPLITTING)
>  #define __HAVE_ARCH_PTE_SPECIAL
>  
> +/* flag used to indicate IO mapping */
> +#ifdef CONFIG_XEN
> +#define _PAGE_IOMAP	(_AT(pteval_t, 1) << _PAGE_BIT_UNUSED2)
> +#else
> +#define _PAGE_IOMAP	(_AT(pteval_t, 0))
> +#endif
> +
>  #ifdef CONFIG_KMEMCHECK
>  #define _PAGE_HIDDEN	(_AT(pteval_t, 1) << _PAGE_BIT_HIDDEN)
>  #else
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/

  reply	other threads:[~2012-03-30 21:45 UTC|newest]

Thread overview: 125+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-03-26 17:45 [PATCH 00/39] [RFC] AutoNUMA alpha10 Andrea Arcangeli
2012-03-26 17:45 ` Andrea Arcangeli
2012-03-26 17:45 ` [PATCH 01/39] autonuma: make set_pmd_at always available Andrea Arcangeli
2012-03-26 17:45   ` Andrea Arcangeli
2012-03-26 17:45 ` [PATCH 02/39] xen: document Xen is using an unused bit for the pagetables Andrea Arcangeli
2012-03-26 17:45   ` Andrea Arcangeli
2012-03-30 21:40   ` Konrad Rzeszutek Wilk [this message]
2012-03-30 21:40     ` Konrad Rzeszutek Wilk
2012-03-26 17:45 ` [PATCH 03/39] autonuma: define _PAGE_NUMA_PTE and _PAGE_NUMA_PMD Andrea Arcangeli
2012-03-26 17:45   ` Andrea Arcangeli
2012-03-26 17:45 ` [PATCH 04/39] autonuma: x86 pte_numa() and pmd_numa() Andrea Arcangeli
2012-03-26 17:45   ` Andrea Arcangeli
2012-03-26 17:45 ` [PATCH 05/39] autonuma: generic " Andrea Arcangeli
2012-03-26 17:45   ` Andrea Arcangeli
2012-03-26 17:45 ` [PATCH 06/39] autonuma: teach gup_fast about pte_numa Andrea Arcangeli
2012-03-26 17:45   ` Andrea Arcangeli
2012-03-26 17:45 ` [PATCH 07/39] autonuma: introduce kthread_bind_node() Andrea Arcangeli
2012-03-26 17:45   ` Andrea Arcangeli
2012-03-26 18:32   ` Peter Zijlstra
2012-03-26 18:32     ` Peter Zijlstra
2012-03-27 15:22     ` Andrea Arcangeli
2012-03-27 15:22       ` Andrea Arcangeli
2012-03-27 15:45       ` Peter Zijlstra
2012-03-27 15:45         ` Peter Zijlstra
2012-03-27 16:04         ` Andrea Arcangeli
2012-03-27 16:04           ` Andrea Arcangeli
2012-03-27 16:19           ` Peter Zijlstra
2012-03-27 16:19             ` Peter Zijlstra
2012-03-26 17:45 ` [PATCH 08/39] autonuma: mm_autonuma and sched_autonuma data structures Andrea Arcangeli
2012-03-26 17:45   ` Andrea Arcangeli
2012-03-26 17:45 ` [PATCH 09/39] autonuma: define the autonuma flags Andrea Arcangeli
2012-03-26 17:45   ` Andrea Arcangeli
2012-03-26 17:45 ` [PATCH 10/39] autonuma: core autonuma.h header Andrea Arcangeli
2012-03-26 17:45   ` Andrea Arcangeli
2012-03-26 17:45 ` [PATCH 11/39] autonuma: CPU follow memory algorithm Andrea Arcangeli
2012-03-26 17:45   ` Andrea Arcangeli
2012-03-26 18:25   ` Peter Zijlstra
2012-03-26 18:25     ` Peter Zijlstra
2012-03-26 19:28     ` Rik van Riel
2012-03-26 19:28       ` Rik van Riel
2012-03-26 19:44       ` Andrea Arcangeli
2012-03-26 19:44         ` Andrea Arcangeli
2012-03-26 19:58         ` Linus Torvalds
2012-03-26 20:39           ` Andrea Arcangeli
2012-03-26 20:39             ` Andrea Arcangeli
2012-03-27  8:39             ` Peter Zijlstra
2012-03-27  8:39               ` Peter Zijlstra
2012-03-27 14:37               ` Andrea Arcangeli
2012-03-27 14:37                 ` Andrea Arcangeli
2012-03-27 16:15               ` Andrea Arcangeli
2012-03-27 16:15                 ` Andrea Arcangeli
2012-03-28 11:26                 ` Peter Zijlstra
2012-03-28 11:26                   ` Peter Zijlstra
2012-03-28 18:39                   ` Andrea Arcangeli
2012-03-28 18:39                     ` Andrea Arcangeli
2012-03-27 17:09               ` Ingo Molnar
2012-03-27 17:09                 ` Ingo Molnar
2012-03-26 17:45 ` [PATCH 12/39] autonuma: add page structure fields Andrea Arcangeli
2012-03-26 17:45   ` Andrea Arcangeli
2012-03-26 17:46 ` [PATCH 13/39] autonuma: knuma_migrated per NUMA node queues Andrea Arcangeli
2012-03-26 17:46   ` Andrea Arcangeli
2012-03-26 17:46 ` [PATCH 14/39] autonuma: init knuma_migrated queues Andrea Arcangeli
2012-03-26 17:46   ` Andrea Arcangeli
2012-03-26 17:46 ` [PATCH 15/39] autonuma: autonuma_enter/exit Andrea Arcangeli
2012-03-26 17:46   ` Andrea Arcangeli
2012-03-26 17:46 ` [PATCH 16/39] autonuma: call autonuma_setup_new_exec() Andrea Arcangeli
2012-03-26 17:46   ` Andrea Arcangeli
2012-03-26 17:46 ` [PATCH 17/39] autonuma: alloc/free/init sched_autonuma Andrea Arcangeli
2012-03-26 17:46   ` Andrea Arcangeli
2012-03-26 17:46 ` [PATCH 18/39] autonuma: alloc/free/init mm_autonuma Andrea Arcangeli
2012-03-26 17:46   ` Andrea Arcangeli
2012-03-26 17:46 ` [PATCH 19/39] mm: add unlikely to the mm allocation failure check Andrea Arcangeli
2012-03-26 17:46   ` Andrea Arcangeli
2012-03-26 17:46 ` [PATCH 20/39] autonuma: avoid CFS select_task_rq_fair to return -1 Andrea Arcangeli
2012-03-26 17:46   ` Andrea Arcangeli
2012-03-26 19:36   ` Peter Zijlstra
2012-03-26 19:36     ` Peter Zijlstra
2012-03-26 20:53     ` Andrea Arcangeli
2012-03-26 20:53       ` Andrea Arcangeli
2012-03-26 17:46 ` [PATCH 21/39] autonuma: fix selecting task runqueue Andrea Arcangeli
2012-03-26 17:46   ` Andrea Arcangeli
2012-03-26 17:46 ` [PATCH 22/39] autonuma: select_task_rq_fair cleanup new_cpu < 0 fix Andrea Arcangeli
2012-03-26 17:46   ` Andrea Arcangeli
2012-03-26 17:46 ` [PATCH 23/39] autonuma: teach CFS about autonuma affinity Andrea Arcangeli
2012-03-26 17:46   ` Andrea Arcangeli
2012-03-26 17:46 ` [PATCH 24/39] autonuma: fix finding idlest cpu Andrea Arcangeli
2012-03-26 17:46   ` Andrea Arcangeli
2012-03-26 17:46 ` [PATCH 25/39] autonuma: fix selecting idle sibling Andrea Arcangeli
2012-03-26 17:46   ` Andrea Arcangeli
2012-03-26 17:46 ` [PATCH 26/39] autonuma: select_idle_sibling cleanup target assignment Andrea Arcangeli
2012-03-26 17:46   ` Andrea Arcangeli
2012-03-26 17:46 ` [PATCH 27/39] autonuma: core Andrea Arcangeli
2012-03-26 17:46   ` Andrea Arcangeli
2012-03-26 17:46 ` [PATCH 28/39] autonuma: follow_page check for pte_numa/pmd_numa Andrea Arcangeli
2012-03-26 17:46   ` Andrea Arcangeli
2012-03-26 17:46 ` [PATCH 29/39] autonuma: default mempolicy follow AutoNUMA Andrea Arcangeli
2012-03-26 17:46   ` Andrea Arcangeli
2012-03-26 17:46 ` [PATCH 30/39] autonuma: call autonuma_split_huge_page() Andrea Arcangeli
2012-03-26 17:46   ` Andrea Arcangeli
2012-03-26 17:46 ` [PATCH 31/39] autonuma: make khugepaged pte_numa aware Andrea Arcangeli
2012-03-26 17:46   ` Andrea Arcangeli
2012-03-26 17:46 ` [PATCH 32/39] autonuma: retain page last_nid information in khugepaged Andrea Arcangeli
2012-03-26 17:46   ` Andrea Arcangeli
2012-03-26 17:46 ` [PATCH 33/39] autonuma: numa hinting page faults entry points Andrea Arcangeli
2012-03-26 17:46   ` Andrea Arcangeli
2012-03-26 17:46 ` [PATCH 34/39] autonuma: reset autonuma page data when pages are freed Andrea Arcangeli
2012-03-26 17:46   ` Andrea Arcangeli
2012-03-26 17:46 ` [PATCH 35/39] autonuma: initialize page structure fields Andrea Arcangeli
2012-03-26 17:46   ` Andrea Arcangeli
2012-03-26 17:46 ` [PATCH 36/39] autonuma: link mm/autonuma.o and kernel/sched/numa.o Andrea Arcangeli
2012-03-26 17:46   ` Andrea Arcangeli
2012-03-26 17:46 ` [PATCH 37/39] autonuma: add CONFIG_AUTONUMA and CONFIG_AUTONUMA_DEFAULT_ENABLED Andrea Arcangeli
2012-03-26 17:46   ` Andrea Arcangeli
2012-03-26 17:46 ` [PATCH 38/39] autonuma: boost khugepaged scanning rate Andrea Arcangeli
2012-03-26 17:46   ` Andrea Arcangeli
2012-03-26 17:46 ` [PATCH 39/39] autonuma: NUMA scheduler SMT awareness Andrea Arcangeli
2012-03-26 17:46   ` Andrea Arcangeli
2012-03-26 18:57   ` Peter Zijlstra
2012-03-26 18:57     ` Peter Zijlstra
2012-03-27  0:00     ` Andrea Arcangeli
2012-03-27  0:00       ` Andrea Arcangeli
2012-03-28 13:51       ` Andrea Arcangeli
2012-03-28 13:51         ` Andrea Arcangeli
2012-04-03 20:35 ` [PATCH 00/39] [RFC] AutoNUMA alpha10 Srivatsa Vaddagiri
2012-04-03 20:35   ` Srivatsa Vaddagiri

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20120330214005.GC23599@phenom.dumpdata.com \
    --to=konrad.wilk@oracle.com \
    --cc=Lee.Schermerhorn@hp.com \
    --cc=a.p.zijlstra@chello.nl \
    --cc=aarcange@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=bharata.rao@gmail.com \
    --cc=danms@us.ibm.com \
    --cc=dhillf@gmail.com \
    --cc=efault@gmx.de \
    --cc=hannes@cmpxchg.org \
    --cc=laijs@cn.fujitsu.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mingo@elte.hu \
    --cc=paulmck@linux.vnet.ibm.com \
    --cc=pjt@google.com \
    --cc=riel@redhat.com \
    --cc=suresh.b.siddha@intel.com \
    --cc=tglx@linutronix.de \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.