From: Andrea Arcangeli <aarcange@redhat.com>
To: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>,
linux-kernel@vger.kernel.org, linux-mm@kvack.org,
Hillf Danton <dhillf@gmail.com>, Dan Smith <danms@us.ibm.com>,
Linus Torvalds <torvalds@linux-foundation.org>,
Andrew Morton <akpm@linux-foundation.org>,
Thomas Gleixner <tglx@linutronix.de>, Ingo Molnar <mingo@elte.hu>,
Paul Turner <pjt@google.com>,
Suresh Siddha <suresh.b.siddha@intel.com>,
Mike Galbraith <efault@gmx.de>,
"Paul E. McKenney" <paulmck@linux.vnet.ibm.com>,
Lai Jiangshan <laijs@cn.fujitsu.com>,
Bharata B Rao <bharata.rao@gmail.com>,
Lee Schermerhorn <Lee.Schermerhorn@hp.com>,
Rik van Riel <riel@redhat.com>,
Johannes Weiner <hannes@cmpxchg.org>,
Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>,
Christoph Lameter <cl@linux.com>, Alex Shi <alex.shi@intel.com>,
Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com>,
Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>,
Don Morris <don.morris@hp.com>, Tony Breeds <tbreeds@au1.ibm.com>,
Kumar Gala <galak@kernel.crashing.org>
Subject: Re: [PATCH 33/36] autonuma: powerpc port
Date: Thu, 23 Aug 2012 17:23:31 +0200 [thread overview]
Message-ID: <20120823152331.GF3570@redhat.com> (raw)
In-Reply-To: <1345698660.13399.23.camel@pasglop>
Hi Benjamin,
On Thu, Aug 23, 2012 at 03:11:00PM +1000, Benjamin Herrenschmidt wrote:
> Basically PROT_NONE turns into _PAGE_PRESENT without _PAGE_USER for us.
Maybe the simplest is to implement pte_numa as !_PAGE_USER too. No
need to clear the _PAGE_PRESENT bit and to alter pte_present() if
clearing _PAGE_USER already achieves it.
It should be trivial to add the vma parameter to pte_numa(pte, vma) so
you can implement pte_numa by checking the vma->vm_page_prot in the
inline pte_numa function, to be able to tell if it's a real prot none
(in which case pte_numa return false) or if it's the NUMA hinting page
fault. In the latter case pte_numa will return true.
> However, the embedded ppc situation is more interesting... and it looks
> like it is indeed broken, meaning that a user can coerce the kernel into
> accessing PROT_NONE on its behalf with copy_from_user & co (though read
> only really).
>
> Looks like the SW TLB handlers used on embedded should also check
> whether the address is a user or kernel address, and enforce _PAGE_USER
> in the former case. They might have done in the past, it's possible that
> it's code we lost, but as it is, it's broken.
>
> The case of HW loaded TLB embedded will need a different definition of
> PAGE_NONE as well I suspect. Kumar, can you have a look ?
Even if we can't track copy-user accesses with the NUMA
hinting page faults, AUTONUMA should still work fairly well. The
flakey PROTNONE on embedded, is more a problem in itself than it would
be for pte_numa on embedded.
OTOH AutoNUMA working on embedded isn't important so it may be just
better to disable it until !_PAGE_USER is reliable.
> I wasn't especially thinking of ppc32... there's also hash64-4k or
> embedded 64... Also pgtable.h is common, so all those added uses of
> _PAGE_NUMA_PTE to static inline functions are going to break the build
> unless _PAGE_NUMA_PTE is #defined to 0 when not used (we do that for a
> bunch of bits in pte-common.h already).
It'd be actually worse if it would build ;). But I guess using
_PAGE_USER to implement pte_numa will solve the problem for 4k page
size too.
We can discuss this during kernel summit ;).
Thanks a lot!
Andrea
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
WARNING: multiple messages have this Message-ID (diff)
From: Andrea Arcangeli <aarcange@redhat.com>
To: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>,
linux-kernel@vger.kernel.org, linux-mm@kvack.org,
Hillf Danton <dhillf@gmail.com>, Dan Smith <danms@us.ibm.com>,
Linus Torvalds <torvalds@linux-foundation.org>,
Andrew Morton <akpm@linux-foundation.org>,
Thomas Gleixner <tglx@linutronix.de>, Ingo Molnar <mingo@elte.hu>,
Paul Turner <pjt@google.com>,
Suresh Siddha <suresh.b.siddha@intel.com>,
Mike Galbraith <efault@gmx.de>,
"Paul E. McKenney" <paulmck@linux.vnet.ibm.com>,
Lai Jiangshan <laijs@cn.fujitsu.com>,
Bharata B Rao <bharata.rao@gmail.com>,
Lee Schermerhorn <Lee.Schermerhorn@hp.com>,
Rik van Riel <riel@redhat.com>,
Johannes Weiner <hannes@cmpxchg.org>,
Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>,
Christoph Lameter <cl@linux.com>, Alex Shi <alex.shi@intel.com>,
Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com>,
Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>,
Don Morris <don.morris@hp.com>, Tony Breeds <tbreeds@au1.ibm.com>,
Kumar Gala <galak@kernel.crashing.org>
Subject: Re: [PATCH 33/36] autonuma: powerpc port
Date: Thu, 23 Aug 2012 17:23:31 +0200 [thread overview]
Message-ID: <20120823152331.GF3570@redhat.com> (raw)
In-Reply-To: <1345698660.13399.23.camel@pasglop>
Hi Benjamin,
On Thu, Aug 23, 2012 at 03:11:00PM +1000, Benjamin Herrenschmidt wrote:
> Basically PROT_NONE turns into _PAGE_PRESENT without _PAGE_USER for us.
Maybe the simplest is to implement pte_numa as !_PAGE_USER too. No
need to clear the _PAGE_PRESENT bit and to alter pte_present() if
clearing _PAGE_USER already achieves it.
It should be trivial to add the vma parameter to pte_numa(pte, vma) so
you can implement pte_numa by checking the vma->vm_page_prot in the
inline pte_numa function, to be able to tell if it's a real prot none
(in which case pte_numa return false) or if it's the NUMA hinting page
fault. In the latter case pte_numa will return true.
> However, the embedded ppc situation is more interesting... and it looks
> like it is indeed broken, meaning that a user can coerce the kernel into
> accessing PROT_NONE on its behalf with copy_from_user & co (though read
> only really).
>
> Looks like the SW TLB handlers used on embedded should also check
> whether the address is a user or kernel address, and enforce _PAGE_USER
> in the former case. They might have done in the past, it's possible that
> it's code we lost, but as it is, it's broken.
>
> The case of HW loaded TLB embedded will need a different definition of
> PAGE_NONE as well I suspect. Kumar, can you have a look ?
Even if we can't track copy-user accesses with the NUMA
hinting page faults, AUTONUMA should still work fairly well. The
flakey PROTNONE on embedded, is more a problem in itself than it would
be for pte_numa on embedded.
OTOH AutoNUMA working on embedded isn't important so it may be just
better to disable it until !_PAGE_USER is reliable.
> I wasn't especially thinking of ppc32... there's also hash64-4k or
> embedded 64... Also pgtable.h is common, so all those added uses of
> _PAGE_NUMA_PTE to static inline functions are going to break the build
> unless _PAGE_NUMA_PTE is #defined to 0 when not used (we do that for a
> bunch of bits in pte-common.h already).
It'd be actually worse if it would build ;). But I guess using
_PAGE_USER to implement pte_numa will solve the problem for 4k page
size too.
We can discuss this during kernel summit ;).
Thanks a lot!
Andrea
next prev parent reply other threads:[~2012-08-23 15:24 UTC|newest]
Thread overview: 108+ messages / expand[flat|nested] mbox.gz Atom feed top
2012-08-22 14:58 [PATCH 00/36] AutoNUMA24 Andrea Arcangeli
2012-08-22 14:58 ` Andrea Arcangeli
2012-08-22 14:58 ` [PATCH 01/36] autonuma: make set_pmd_at always available Andrea Arcangeli
2012-08-22 14:58 ` Andrea Arcangeli
2012-08-22 14:58 ` [PATCH 02/36] autonuma: export is_vma_temporary_stack() even if CONFIG_TRANSPARENT_HUGEPAGE=n Andrea Arcangeli
2012-08-22 14:58 ` Andrea Arcangeli
2012-08-22 14:58 ` [PATCH 03/36] autonuma: define _PAGE_NUMA_PTE and _PAGE_NUMA_PMD Andrea Arcangeli
2012-08-22 14:58 ` Andrea Arcangeli
2012-08-22 14:58 ` [PATCH 04/36] autonuma: pte_numa() and pmd_numa() Andrea Arcangeli
2012-08-22 14:58 ` Andrea Arcangeli
2012-08-22 14:58 ` [PATCH 05/36] autonuma: teach gup_fast about pmd_numa Andrea Arcangeli
2012-08-22 14:58 ` Andrea Arcangeli
2012-08-22 14:58 ` [PATCH 06/36] autonuma: introduce kthread_bind_node() Andrea Arcangeli
2012-08-22 14:58 ` Andrea Arcangeli
2012-08-22 14:58 ` [PATCH 07/36] autonuma: mm_autonuma and task_autonuma data structures Andrea Arcangeli
2012-08-22 14:58 ` Andrea Arcangeli
2012-08-22 14:58 ` [PATCH 08/36] autonuma: define the autonuma flags Andrea Arcangeli
2012-08-22 14:58 ` Andrea Arcangeli
2012-08-22 14:58 ` [PATCH 09/36] autonuma: core autonuma.h header Andrea Arcangeli
2012-08-22 14:58 ` Andrea Arcangeli
2012-08-22 14:58 ` [PATCH 10/36] autonuma: CPU follows memory algorithm Andrea Arcangeli
2012-08-22 14:58 ` Andrea Arcangeli
2012-08-22 14:58 ` [PATCH 11/36] autonuma: add page structure fields Andrea Arcangeli
2012-08-22 14:58 ` Andrea Arcangeli
2012-08-22 14:58 ` [PATCH 12/36] autonuma: knuma_migrated per NUMA node queues Andrea Arcangeli
2012-08-22 14:58 ` Andrea Arcangeli
2012-08-22 14:58 ` [PATCH 13/36] autonuma: autonuma_enter/exit Andrea Arcangeli
2012-08-22 14:58 ` Andrea Arcangeli
2012-08-22 14:58 ` [PATCH 14/36] autonuma: call autonuma_setup_new_exec() Andrea Arcangeli
2012-08-22 14:58 ` Andrea Arcangeli
2012-08-22 14:58 ` [PATCH 15/36] autonuma: alloc/free/init task_autonuma Andrea Arcangeli
2012-08-22 14:58 ` Andrea Arcangeli
2012-08-22 14:59 ` [PATCH 16/36] autonuma: alloc/free/init mm_autonuma Andrea Arcangeli
2012-08-22 14:59 ` Andrea Arcangeli
2012-08-22 14:59 ` [PATCH 17/36] autonuma: prevent select_task_rq_fair to return -1 Andrea Arcangeli
2012-08-22 14:59 ` Andrea Arcangeli
2012-08-22 14:59 ` [PATCH 18/36] autonuma: teach CFS about autonuma affinity Andrea Arcangeli
2012-08-22 14:59 ` Andrea Arcangeli
2012-08-22 14:59 ` [PATCH 19/36] autonuma: memory follows CPU algorithm and task/mm_autonuma stats collection Andrea Arcangeli
2012-08-22 14:59 ` Andrea Arcangeli
2012-08-22 20:19 ` Andi Kleen
2012-08-22 20:19 ` Andi Kleen
2012-08-22 21:22 ` Hugh Dickins
2012-08-22 21:22 ` Hugh Dickins
2012-08-22 21:24 ` Andrea Arcangeli
2012-08-22 21:24 ` Andrea Arcangeli
2012-08-22 22:37 ` Andi Kleen
2012-08-22 22:37 ` Andi Kleen
2012-08-22 22:46 ` Andrea Arcangeli
2012-08-22 22:46 ` Andrea Arcangeli
2012-08-22 14:59 ` [PATCH 20/36] autonuma: default mempolicy follow AutoNUMA Andrea Arcangeli
2012-08-22 14:59 ` Andrea Arcangeli
2012-08-22 14:59 ` [PATCH 21/36] autonuma: call autonuma_split_huge_page() Andrea Arcangeli
2012-08-22 14:59 ` Andrea Arcangeli
2012-08-22 14:59 ` [PATCH 22/36] autonuma: make khugepaged pte_numa aware Andrea Arcangeli
2012-08-22 14:59 ` Andrea Arcangeli
2012-08-22 14:59 ` [PATCH 23/36] autonuma: retain page last_nid information in khugepaged Andrea Arcangeli
2012-08-22 14:59 ` Andrea Arcangeli
2012-08-22 14:59 ` [PATCH 24/36] autonuma: numa hinting page faults entry points Andrea Arcangeli
2012-08-22 14:59 ` Andrea Arcangeli
2012-08-22 14:59 ` [PATCH 25/36] autonuma: reset autonuma page data when pages are freed Andrea Arcangeli
2012-08-22 14:59 ` Andrea Arcangeli
2012-08-22 14:59 ` [PATCH 26/36] autonuma: link mm/autonuma.o and kernel/sched/numa.o Andrea Arcangeli
2012-08-22 14:59 ` Andrea Arcangeli
2012-08-22 14:59 ` [PATCH 27/36] autonuma: add CONFIG_AUTONUMA and CONFIG_AUTONUMA_DEFAULT_ENABLED Andrea Arcangeli
2012-08-22 14:59 ` Andrea Arcangeli
2012-08-22 14:59 ` [PATCH 28/36] autonuma: page_autonuma Andrea Arcangeli
2012-08-22 14:59 ` Andrea Arcangeli
2012-08-22 14:59 ` [PATCH 29/36] autonuma: autonuma_migrate_head[0] dynamic size Andrea Arcangeli
2012-08-22 14:59 ` Andrea Arcangeli
2012-08-22 14:59 ` [PATCH 30/36] autonuma: bugcheck page_autonuma fields on newly allocated pages Andrea Arcangeli
2012-08-22 14:59 ` Andrea Arcangeli
2012-08-22 14:59 ` [PATCH 31/36] autonuma: shrink the per-page page_autonuma struct size Andrea Arcangeli
2012-08-22 14:59 ` Andrea Arcangeli
2012-08-22 14:59 ` [PATCH 32/36] autonuma: boost khugepaged scanning rate Andrea Arcangeli
2012-08-22 14:59 ` Andrea Arcangeli
2012-08-22 14:59 ` [PATCH 33/36] autonuma: powerpc port Andrea Arcangeli
2012-08-22 14:59 ` Andrea Arcangeli
2012-08-22 22:01 ` Benjamin Herrenschmidt
2012-08-22 22:01 ` Benjamin Herrenschmidt
2012-08-22 22:35 ` Andrea Arcangeli
2012-08-22 22:35 ` Andrea Arcangeli
2012-08-23 5:11 ` Benjamin Herrenschmidt
2012-08-23 5:11 ` Benjamin Herrenschmidt
2012-08-23 15:23 ` Andrea Arcangeli [this message]
2012-08-23 15:23 ` Andrea Arcangeli
2012-08-23 22:13 ` Benjamin Herrenschmidt
2012-08-23 22:13 ` Benjamin Herrenschmidt
2012-08-22 22:56 ` Benjamin Herrenschmidt
2012-08-22 22:56 ` Benjamin Herrenschmidt
2012-08-22 23:06 ` Andrea Arcangeli
2012-08-22 23:06 ` Andrea Arcangeli
2012-08-23 4:15 ` Vaidyanathan Srinivasan
2012-08-23 4:15 ` Vaidyanathan Srinivasan
2012-08-22 14:59 ` [PATCH 34/36] autonuma: make the AUTONUMA_SCAN_PMD_FLAG conditional to CONFIG_HAVE_ARCH_AUTONUMA_SCAN_PMD Andrea Arcangeli
2012-08-22 14:59 ` Andrea Arcangeli
2012-08-22 14:59 ` [PATCH 35/36] autonuma: add knuma_migrated/allow_first_fault in sysfs Andrea Arcangeli
2012-08-22 14:59 ` Andrea Arcangeli
2012-08-22 14:59 ` [PATCH 36/36] autonuma: add mm_autonuma working set estimation Andrea Arcangeli
2012-08-22 14:59 ` Andrea Arcangeli
2012-08-22 19:26 ` [PATCH 00/36] AutoNUMA24 Rik van Riel
2012-08-22 19:26 ` Rik van Riel
2012-08-22 21:40 ` Ingo Molnar
2012-08-22 21:40 ` Ingo Molnar
2012-08-22 22:19 ` Andrea Arcangeli
2012-08-22 22:19 ` Andrea Arcangeli
2012-08-23 8:42 ` Ingo Molnar
2012-08-23 8:42 ` Ingo Molnar
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20120823152331.GF3570@redhat.com \
--to=aarcange@redhat.com \
--cc=Lee.Schermerhorn@hp.com \
--cc=akpm@linux-foundation.org \
--cc=alex.shi@intel.com \
--cc=benh@kernel.crashing.org \
--cc=bharata.rao@gmail.com \
--cc=cl@linux.com \
--cc=danms@us.ibm.com \
--cc=dhillf@gmail.com \
--cc=don.morris@hp.com \
--cc=efault@gmx.de \
--cc=galak@kernel.crashing.org \
--cc=hannes@cmpxchg.org \
--cc=konrad.wilk@oracle.com \
--cc=laijs@cn.fujitsu.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=mauricfo@linux.vnet.ibm.com \
--cc=mingo@elte.hu \
--cc=paulmck@linux.vnet.ibm.com \
--cc=pjt@google.com \
--cc=riel@redhat.com \
--cc=suresh.b.siddha@intel.com \
--cc=svaidy@linux.vnet.ibm.com \
--cc=tbreeds@au1.ibm.com \
--cc=tglx@linutronix.de \
--cc=torvalds@linux-foundation.org \
--cc=vatsa@linux.vnet.ibm.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.