* [PATCH 0/6] ARM: mm: HugeTLB + THP support.
@ 2013-02-08 15:01 Steve Capper
2013-02-08 15:01 ` [PATCH 1/6] ARM: mm: correct pte_same behaviour for LPAE Steve Capper
` (6 more replies)
0 siblings, 7 replies; 8+ messages in thread
From: Steve Capper @ 2013-02-08 15:01 UTC (permalink / raw)
To: linux-arch, linux-arm-kernel
Cc: c.dall, akpm, mhocko, kirill, aarcange, cmetcalf, hoffman,
notasas, bill4carson, will.deacon, catalin.marinas, maen, shadi,
tawfik, Steve Capper
Hello,
The following patches bring both HugeTLB support and Transparent
HugePage (THP) support to ARM.
These are not intended for 3.9.
Both short descriptors (non-LPAE) and long descriptors (LPAE) are
supported.
The non-LPAE HugeTLB code is based on patches by Bill Carson [1],
but instead of allocating extra memory to store "Linux PTEs", it
re-purposes the domain bits of section descriptors and constructs
huge Linux PTEs on demand.
As PMDs are walked directly by the kernel THP functions (there are
no huge_pmd_offset style functions), any "linux PMD"/"hardware PMD"
distinction would require some re-working of the ARM PMD/PTE code.
Use of the domain bits allows for a more straightforward THP
implementation.
Some general HugeTLB code relating to huge page migration on memory
failure (CONFIG_MEMORY_FAILURE) de-references huge pte_t *s directly
rather than use the huge_ptep_get and set_huge_pte_at functions.
Thus this config option is incompatible with non-LPAE hugepages. At
the moment I can only see x86 using CONFIG_MEMORY_FAILURE though.
Non-LPAE code was tested on an Arndale board (Exynos 5250) and a
Tegra 2 TrimSlice.
The LPAE code manipulates the hardware page tables directly as the
long descriptors are wide enough to contain all the Linux PTE
information.
The LPAE code has been tested on an Arndale board (Exynos 5250).
This patch set is based on 3.8-rc6.
Major changes since the RFC:
* huge pmd sharing removed from the 3-level code as this was
found to be very rarely, if ever?, used. This allowed for some
code simplification.
* hardware pmd bits for 2-levels of paging are now taken from
mmu.c. Also the mapping code now uses pte/pmd bit helper
functions rather than the custom pre-processor logic.
Cheers,
--
Steve
[1] - http://lists.infradead.org/pipermail/linux-arm-kernel/2012-February/084359.html
Catalin Marinas (2):
ARM: mm: HugeTLB support for LPAE systems.
ARM: mm: Transparent huge page support for LPAE systems.
Steve Capper (4):
ARM: mm: correct pte_same behaviour for LPAE.
ARM: mm: Add support for flushing HugeTLB pages.
ARM: mm: HugeTLB support for non-LPAE systems.
ARM: mm: Transparent huge page support for non-LPAE systems.
arch/arm/Kconfig | 8 ++
arch/arm/include/asm/hugetlb-2level.h | 121 +++++++++++++++++++++
arch/arm/include/asm/hugetlb-3level.h | 61 +++++++++++
arch/arm/include/asm/hugetlb.h | 87 +++++++++++++++
arch/arm/include/asm/pgtable-2level.h | 151 +++++++++++++++++++++++++++
arch/arm/include/asm/pgtable-3level-hwdef.h | 2 +
arch/arm/include/asm/pgtable-3level.h | 83 +++++++++++++++
arch/arm/include/asm/pgtable.h | 13 ++-
arch/arm/include/asm/tlb.h | 16 ++-
arch/arm/include/asm/tlbflush.h | 2 +
arch/arm/kernel/head.S | 10 +-
arch/arm/mm/Makefile | 1 +
arch/arm/mm/dma-mapping.c | 2 +-
arch/arm/mm/fault.c | 11 --
arch/arm/mm/flush.c | 26 +++--
arch/arm/mm/fsr-2level.c | 4 +-
arch/arm/mm/fsr-3level.c | 4 +-
arch/arm/mm/hugetlbpage.c | 100 ++++++++++++++++++
arch/arm/mm/mmu.c | 27 +++++
19 files changed, 697 insertions(+), 32 deletions(-)
create mode 100644 arch/arm/include/asm/hugetlb-2level.h
create mode 100644 arch/arm/include/asm/hugetlb-3level.h
create mode 100644 arch/arm/include/asm/hugetlb.h
create mode 100644 arch/arm/mm/hugetlbpage.c
--
1.7.9.5
^ permalink raw reply [flat|nested] 8+ messages in thread
* [PATCH 1/6] ARM: mm: correct pte_same behaviour for LPAE.
2013-02-08 15:01 [PATCH 0/6] ARM: mm: HugeTLB + THP support Steve Capper
@ 2013-02-08 15:01 ` Steve Capper
2013-02-08 15:01 ` [PATCH 2/6] ARM: mm: Add support for flushing HugeTLB pages Steve Capper
` (5 subsequent siblings)
6 siblings, 0 replies; 8+ messages in thread
From: Steve Capper @ 2013-02-08 15:01 UTC (permalink / raw)
To: linux-arch, linux-arm-kernel
Cc: c.dall, akpm, mhocko, kirill, aarcange, cmetcalf, hoffman,
notasas, bill4carson, will.deacon, catalin.marinas, maen, shadi,
tawfik, Steve Capper
For 3 levels of paging the PTE_EXT_NG bit will be set for user
address ptes that are written to a page table but not for ptes
created with mk_pte.
This can cause some comparison tests made by pte_same to fail
spuriously and lead to other problems.
To correct this behaviour, we mask off PTE_EXT_NG for any pte that
is present before running the comparison.
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Steve Capper <steve.capper@arm.com>
---
arch/arm/include/asm/pgtable-3level.h | 17 +++++++++++++++++
1 file changed, 17 insertions(+)
diff --git a/arch/arm/include/asm/pgtable-3level.h b/arch/arm/include/asm/pgtable-3level.h
index a3f3792..2b73b0f 100644
--- a/arch/arm/include/asm/pgtable-3level.h
+++ b/arch/arm/include/asm/pgtable-3level.h
@@ -148,6 +148,23 @@ static inline pmd_t *pmd_offset(pud_t *pud, unsigned long addr)
clean_pmd_entry(pmdp); \
} while (0)
+/*
+ * For 3 levels of paging the PTE_EXT_NG bit will be set for user address ptes
+ * that are written to a page table but not for ptes created with mk_pte.
+ *
+ * In hugetlb_no_page, a new huge pte (new_pte) is generated and passed to
+ * hugetlb_cow, where it is compared with an entry in a page table.
+ * This comparison test fails erroneously leading ultimately to a memory leak.
+ *
+ * To correct this behaviour, we mask off PTE_EXT_NG for any pte that is
+ * present before running the comparison.
+ */
+#define __HAVE_ARCH_PTE_SAME
+#define pte_same(pte_a,pte_b) ((pte_present(pte_a) ? pte_val(pte_a) & ~PTE_EXT_NG \
+ : pte_val(pte_a)) \
+ == (pte_present(pte_b) ? pte_val(pte_b) & ~PTE_EXT_NG \
+ : pte_val(pte_b)))
+
#define set_pte_ext(ptep,pte,ext) cpu_set_pte_ext(ptep,__pte(pte_val(pte)|(ext)))
#endif /* __ASSEMBLY__ */
--
1.7.9.5
^ permalink raw reply related [flat|nested] 8+ messages in thread
* [PATCH 2/6] ARM: mm: Add support for flushing HugeTLB pages.
2013-02-08 15:01 [PATCH 0/6] ARM: mm: HugeTLB + THP support Steve Capper
2013-02-08 15:01 ` [PATCH 1/6] ARM: mm: correct pte_same behaviour for LPAE Steve Capper
@ 2013-02-08 15:01 ` Steve Capper
2013-02-08 15:01 ` [PATCH 3/6] ARM: mm: HugeTLB support for LPAE systems Steve Capper
` (4 subsequent siblings)
6 siblings, 0 replies; 8+ messages in thread
From: Steve Capper @ 2013-02-08 15:01 UTC (permalink / raw)
To: linux-arch, linux-arm-kernel
Cc: c.dall, akpm, mhocko, kirill, aarcange, cmetcalf, hoffman,
notasas, bill4carson, will.deacon, catalin.marinas, maen, shadi,
tawfik, Steve Capper
On ARM we use the __flush_dcache_page function to flush the dcache
of pages when needed; usually when the PG_dcache_clean bit is unset
and we are setting a PTE.
A HugeTLB page is represented as a compound page consisting of an
array of pages. Thus to flush the dcache of a HugeTLB page, one must
flush more than a single page.
This patch modifies __flush_dcache_page such that all constituent
pages of a HugeTLB page are flushed.
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Steve Capper <steve.capper@arm.com>
---
arch/arm/mm/flush.c | 26 ++++++++++++++++----------
1 file changed, 16 insertions(+), 10 deletions(-)
diff --git a/arch/arm/mm/flush.c b/arch/arm/mm/flush.c
index 1c8f7f5..7f32f96 100644
--- a/arch/arm/mm/flush.c
+++ b/arch/arm/mm/flush.c
@@ -17,6 +17,7 @@
#include <asm/highmem.h>
#include <asm/smp_plat.h>
#include <asm/tlbflush.h>
+#include <linux/hugetlb.h>
#include "mm.h"
@@ -168,17 +169,22 @@ void __flush_dcache_page(struct address_space *mapping, struct page *page)
* coherent with the kernels mapping.
*/
if (!PageHighMem(page)) {
- __cpuc_flush_dcache_area(page_address(page), PAGE_SIZE);
+ size_t page_size = PAGE_SIZE << compound_order(page);
+ __cpuc_flush_dcache_area(page_address(page), page_size);
} else {
- void *addr = kmap_high_get(page);
- if (addr) {
- __cpuc_flush_dcache_area(addr, PAGE_SIZE);
- kunmap_high(page);
- } else if (cache_is_vipt()) {
- /* unmapped pages might still be cached */
- addr = kmap_atomic(page);
- __cpuc_flush_dcache_area(addr, PAGE_SIZE);
- kunmap_atomic(addr);
+ unsigned long i;
+ for(i = 0; i < (1 << compound_order(page)); i++) {
+ struct page *cpage = page + i;
+ void *addr = kmap_high_get(cpage);
+ if (addr) {
+ __cpuc_flush_dcache_area(addr, PAGE_SIZE);
+ kunmap_high(cpage);
+ } else if (cache_is_vipt()) {
+ /* unmapped pages might still be cached */
+ addr = kmap_atomic(cpage);
+ __cpuc_flush_dcache_area(addr, PAGE_SIZE);
+ kunmap_atomic(addr);
+ }
}
}
--
1.7.9.5
^ permalink raw reply related [flat|nested] 8+ messages in thread
* [PATCH 3/6] ARM: mm: HugeTLB support for LPAE systems.
2013-02-08 15:01 [PATCH 0/6] ARM: mm: HugeTLB + THP support Steve Capper
2013-02-08 15:01 ` [PATCH 1/6] ARM: mm: correct pte_same behaviour for LPAE Steve Capper
2013-02-08 15:01 ` [PATCH 2/6] ARM: mm: Add support for flushing HugeTLB pages Steve Capper
@ 2013-02-08 15:01 ` Steve Capper
2013-02-08 15:01 ` [PATCH 4/6] ARM: mm: HugeTLB support for non-LPAE systems Steve Capper
` (3 subsequent siblings)
6 siblings, 0 replies; 8+ messages in thread
From: Steve Capper @ 2013-02-08 15:01 UTC (permalink / raw)
To: linux-arch, linux-arm-kernel
Cc: c.dall, akpm, mhocko, kirill, aarcange, cmetcalf, hoffman,
notasas, bill4carson, will.deacon, catalin.marinas, maen, shadi,
tawfik, Steve Capper
From: Catalin Marinas <catalin.marinas@arm.com>
This patch adds support for hugetlbfs based on the x86 implementation.
It allows mapping of 2MB sections (see Documentation/vm/hugetlbpage.txt
for usage). The 64K pages configuration is not supported (section size
is 512MB in this case).
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
[steve.capper@arm.com: symbolic constants replace numbers in places.
Split up into multiple files, to simplify future non-LPAE support,
removed huge_pmd_share code, as this is very rarely executed].
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Steve Capper <steve.capper@arm.com>
---
arch/arm/Kconfig | 4 ++
arch/arm/include/asm/hugetlb-3level.h | 61 ++++++++++++++++++++
arch/arm/include/asm/hugetlb.h | 83 +++++++++++++++++++++++++++
arch/arm/include/asm/pgtable-3level.h | 13 +++++
arch/arm/mm/Makefile | 1 +
arch/arm/mm/dma-mapping.c | 2 +-
arch/arm/mm/fsr-3level.c | 2 +-
arch/arm/mm/hugetlbpage.c | 100 +++++++++++++++++++++++++++++++++
8 files changed, 264 insertions(+), 2 deletions(-)
create mode 100644 arch/arm/include/asm/hugetlb-3level.h
create mode 100644 arch/arm/include/asm/hugetlb.h
create mode 100644 arch/arm/mm/hugetlbpage.c
diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index 67874b8..9a80544 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -1778,6 +1778,10 @@ config HW_PERF_EVENTS
Enable hardware performance counter support for perf events. If
disabled, perf events will use software events only.
+config SYS_SUPPORTS_HUGETLBFS
+ def_bool y
+ depends on ARM_LPAE
+
source "mm/Kconfig"
config FORCE_MAX_ZONEORDER
diff --git a/arch/arm/include/asm/hugetlb-3level.h b/arch/arm/include/asm/hugetlb-3level.h
new file mode 100644
index 0000000..4868064
--- /dev/null
+++ b/arch/arm/include/asm/hugetlb-3level.h
@@ -0,0 +1,61 @@
+/*
+ * arch/arm/include/asm/hugetlb-3level.h
+ *
+ * Copyright (C) 2012 ARM Ltd.
+ *
+ * Based on arch/x86/include/asm/hugetlb.h.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
+ */
+
+#ifndef _ASM_ARM_HUGETLB_3LEVEL_H
+#define _ASM_ARM_HUGETLB_3LEVEL_H
+
+static inline pte_t huge_ptep_get(pte_t *ptep)
+{
+ return *ptep;
+}
+
+static inline void set_huge_pte_at(struct mm_struct *mm, unsigned long addr,
+ pte_t *ptep, pte_t pte)
+{
+ set_pte_at(mm, addr, ptep, pte);
+}
+
+static inline void huge_ptep_clear_flush(struct vm_area_struct *vma,
+ unsigned long addr, pte_t *ptep)
+{
+ ptep_clear_flush(vma, addr, ptep);
+}
+
+static inline void huge_ptep_set_wrprotect(struct mm_struct *mm,
+ unsigned long addr, pte_t *ptep)
+{
+ ptep_set_wrprotect(mm, addr, ptep);
+}
+
+static inline pte_t huge_ptep_get_and_clear(struct mm_struct *mm,
+ unsigned long addr, pte_t *ptep)
+{
+ return ptep_get_and_clear(mm, addr, ptep);
+}
+
+static inline int huge_ptep_set_access_flags(struct vm_area_struct *vma,
+ unsigned long addr, pte_t *ptep,
+ pte_t pte, int dirty)
+{
+ return ptep_set_access_flags(vma, addr, ptep, pte, dirty);
+}
+
+#endif /* _ASM_ARM_HUGETLB_3LEVEL_H */
diff --git a/arch/arm/include/asm/hugetlb.h b/arch/arm/include/asm/hugetlb.h
new file mode 100644
index 0000000..7af9cf6
--- /dev/null
+++ b/arch/arm/include/asm/hugetlb.h
@@ -0,0 +1,83 @@
+/*
+ * arch/arm/include/asm/hugetlb.h
+ *
+ * Copyright (C) 2012 ARM Ltd.
+ *
+ * Based on arch/x86/include/asm/hugetlb.h
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
+ */
+
+#ifndef _ASM_ARM_HUGETLB_H
+#define _ASM_ARM_HUGETLB_H
+
+#include <asm/page.h>
+
+#include <asm/hugetlb-3level.h>
+
+static inline void hugetlb_free_pgd_range(struct mmu_gather *tlb,
+ unsigned long addr, unsigned long end,
+ unsigned long floor,
+ unsigned long ceiling)
+{
+ free_pgd_range(tlb, addr, end, floor, ceiling);
+}
+
+
+static inline int is_hugepage_only_range(struct mm_struct *mm,
+ unsigned long addr, unsigned long len)
+{
+ return 0;
+}
+
+static inline int prepare_hugepage_range(struct file *file,
+ unsigned long addr, unsigned long len)
+{
+ struct hstate *h = hstate_file(file);
+ if (len & ~huge_page_mask(h))
+ return -EINVAL;
+ if (addr & ~huge_page_mask(h))
+ return -EINVAL;
+ return 0;
+}
+
+static inline void hugetlb_prefault_arch_hook(struct mm_struct *mm)
+{
+}
+
+static inline int huge_pte_none(pte_t pte)
+{
+ return pte_none(pte);
+}
+
+static inline pte_t huge_pte_wrprotect(pte_t pte)
+{
+ return pte_wrprotect(pte);
+}
+
+static inline int arch_prepare_hugepage(struct page *page)
+{
+ return 0;
+}
+
+static inline void arch_release_hugepage(struct page *page)
+{
+}
+
+static inline void arch_clear_hugepage_flags(struct page *page)
+{
+ clear_bit(PG_dcache_clean, &page->flags);
+}
+
+#endif /* _ASM_ARM_HUGETLB_H */
diff --git a/arch/arm/include/asm/pgtable-3level.h b/arch/arm/include/asm/pgtable-3level.h
index 2b73b0f..d89be57 100644
--- a/arch/arm/include/asm/pgtable-3level.h
+++ b/arch/arm/include/asm/pgtable-3level.h
@@ -62,6 +62,14 @@
#define USER_PTRS_PER_PGD (PAGE_OFFSET / PGDIR_SIZE)
/*
+ * Hugetlb definitions.
+ */
+#define HPAGE_SHIFT PMD_SHIFT
+#define HPAGE_SIZE (_AC(1, UL) << HPAGE_SHIFT)
+#define HPAGE_MASK (~(HPAGE_SIZE - 1))
+#define HUGETLB_PAGE_ORDER (HPAGE_SHIFT - PAGE_SHIFT)
+
+/*
* "Linux" PTE definitions for LPAE.
*
* These bits overlap with the hardware bits but the naming is preserved for
@@ -167,6 +175,11 @@ static inline pmd_t *pmd_offset(pud_t *pud, unsigned long addr)
#define set_pte_ext(ptep,pte,ext) cpu_set_pte_ext(ptep,__pte(pte_val(pte)|(ext)))
+#define pte_huge(pte) ((pte_val(pte) & PMD_TYPE_MASK) == PMD_TYPE_SECT)
+
+#define pte_mkhuge(pte) (__pte((pte_val(pte) & ~PMD_TYPE_MASK) | PMD_TYPE_SECT))
+
+
#endif /* __ASSEMBLY__ */
#endif /* _ASM_PGTABLE_3LEVEL_H */
diff --git a/arch/arm/mm/Makefile b/arch/arm/mm/Makefile
index 8a9c4cb..a33b139 100644
--- a/arch/arm/mm/Makefile
+++ b/arch/arm/mm/Makefile
@@ -16,6 +16,7 @@ obj-$(CONFIG_MODULES) += proc-syms.o
obj-$(CONFIG_ALIGNMENT_TRAP) += alignment.o
obj-$(CONFIG_HIGHMEM) += highmem.o
+obj-$(CONFIG_HUGETLB_PAGE) += hugetlbpage.o
obj-$(CONFIG_CPU_ABRT_NOMMU) += abort-nommu.o
obj-$(CONFIG_CPU_ABRT_EV4) += abort-ev4.o
diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
index 076c26d..74a0ff5 100644
--- a/arch/arm/mm/dma-mapping.c
+++ b/arch/arm/mm/dma-mapping.c
@@ -239,7 +239,7 @@ static void __dma_free_buffer(struct page *page, size_t size)
#ifdef CONFIG_MMU
#ifdef CONFIG_HUGETLB_PAGE
-#error ARM Coherent DMA allocator does not (yet) support huge TLB
+#warning ARM Coherent DMA allocator does not (yet) support huge TLB
#endif
static void *__alloc_from_contiguous(struct device *dev, size_t size,
diff --git a/arch/arm/mm/fsr-3level.c b/arch/arm/mm/fsr-3level.c
index 05a4e94..e115fc7 100644
--- a/arch/arm/mm/fsr-3level.c
+++ b/arch/arm/mm/fsr-3level.c
@@ -13,7 +13,7 @@ static struct fsr_info fsr_info[] = {
{ do_page_fault, SIGSEGV, SEGV_ACCERR, "level 3 access flag fault" },
{ do_bad, SIGBUS, 0, "reserved permission fault" },
{ do_bad, SIGSEGV, SEGV_ACCERR, "level 1 permission fault" },
- { do_sect_fault, SIGSEGV, SEGV_ACCERR, "level 2 permission fault" },
+ { do_page_fault, SIGSEGV, SEGV_ACCERR, "level 2 permission fault" },
{ do_page_fault, SIGSEGV, SEGV_ACCERR, "level 3 permission fault" },
{ do_bad, SIGBUS, 0, "synchronous external abort" },
{ do_bad, SIGBUS, 0, "asynchronous external abort" },
diff --git a/arch/arm/mm/hugetlbpage.c b/arch/arm/mm/hugetlbpage.c
new file mode 100644
index 0000000..93b78be
--- /dev/null
+++ b/arch/arm/mm/hugetlbpage.c
@@ -0,0 +1,100 @@
+/*
+ * arch/arm/mm/hugetlbpage.c
+ *
+ * Copyright (C) 2012 ARM Ltd.
+ *
+ * Based on arch/x86/include/asm/hugetlb.h and Bill Carson's patches
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
+ */
+
+#include <linux/init.h>
+#include <linux/fs.h>
+#include <linux/mm.h>
+#include <linux/hugetlb.h>
+#include <linux/pagemap.h>
+#include <linux/err.h>
+#include <linux/sysctl.h>
+#include <asm/mman.h>
+#include <asm/tlb.h>
+#include <asm/tlbflush.h>
+#include <asm/pgalloc.h>
+
+/*
+ * On ARM, huge pages are backed by pmd's rather than pte's, so we do a lot
+ * of type casting from pmd_t * to pte_t *.
+ */
+
+pte_t *huge_pte_offset(struct mm_struct *mm, unsigned long addr)
+{
+ pgd_t *pgd;
+ pud_t *pud;
+ pmd_t *pmd = NULL;
+
+ pgd = pgd_offset(mm, addr);
+ if (pgd_present(*pgd)) {
+ pud = pud_offset(pgd, addr);
+ if (pud_present(*pud))
+ pmd = pmd_offset(pud, addr);
+ }
+
+ return (pte_t *)pmd;
+}
+
+struct page *follow_huge_addr(struct mm_struct *mm, unsigned long address,
+ int write)
+{
+ return ERR_PTR(-EINVAL);
+}
+
+int pmd_huge(pmd_t pmd)
+{
+ return (pmd_val(pmd) & PMD_TYPE_MASK) == PMD_TYPE_SECT;
+}
+
+int pud_huge(pud_t pud)
+{
+ return 0;
+}
+
+int huge_pmd_unshare(struct mm_struct *mm, unsigned long *addr, pte_t *ptep)
+{
+ return 0;
+}
+
+pte_t *huge_pte_alloc(struct mm_struct *mm,
+ unsigned long addr, unsigned long sz)
+{
+ pgd_t *pgd;
+ pud_t *pud;
+ pte_t *pte = NULL;
+
+ pgd = pgd_offset(mm, addr);
+ pud = pud_alloc(mm, pgd, addr);
+ if (pud)
+ pte = (pte_t *)pmd_alloc(mm, pud, addr);
+
+ return pte;
+}
+
+struct page *follow_huge_pmd(struct mm_struct *mm, unsigned long address,
+ pmd_t *pmd, int write)
+{
+ struct page *page;
+ unsigned long pfn;
+
+ pfn = ((pmd_val(*pmd) & HPAGE_MASK) >> PAGE_SHIFT);
+ page = pfn_to_page(pfn);
+ return page;
+}
--
1.7.9.5
^ permalink raw reply related [flat|nested] 8+ messages in thread
* [PATCH 4/6] ARM: mm: HugeTLB support for non-LPAE systems.
2013-02-08 15:01 [PATCH 0/6] ARM: mm: HugeTLB + THP support Steve Capper
` (2 preceding siblings ...)
2013-02-08 15:01 ` [PATCH 3/6] ARM: mm: HugeTLB support for LPAE systems Steve Capper
@ 2013-02-08 15:01 ` Steve Capper
2013-02-08 15:01 ` [PATCH 5/6] ARM: mm: Transparent huge page support for LPAE systems Steve Capper
` (2 subsequent siblings)
6 siblings, 0 replies; 8+ messages in thread
From: Steve Capper @ 2013-02-08 15:01 UTC (permalink / raw)
To: linux-arch, linux-arm-kernel
Cc: c.dall, akpm, mhocko, kirill, aarcange, cmetcalf, hoffman,
notasas, bill4carson, will.deacon, catalin.marinas, maen, shadi,
tawfik, Steve Capper
Based on Bill Carson's HugeTLB patch, with the big difference being
in the way PTEs are passed back to the memory manager. Rather than
store a "Linux Huge PTE" separately; we make one up on the fly in
huge_ptep_get. Also rather than consider 16M supersections, we focus
solely on 2x1M sections.
To construct a huge PTE on the fly we need additional information
(such as the accessed flag and dirty bit) which we choose to store
in the domain bits of the short section descriptor. In order to use
these domain bits for storage, we need to make ourselves a client
for all 16 domains and this is done in head.S.
Storing extra information in the domain bits also makes it a lot
easier to implement Transparent Huge Pages, and some of the code in
pgtable-2level.h is arranged to facilitate THP support in a later
patch.
Non-LPAE HugeTLB pages are incompatible with the huge page migration
code (enabled when CONFIG_MEMORY_FAILURE is selected) as that code
dereferences PTEs directly, rather than calling huge_ptep_get and
set_huge_pte_at.
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Steve Capper <steve.capper@arm.com>
---
arch/arm/Kconfig | 2 +-
arch/arm/include/asm/hugetlb-2level.h | 121 +++++++++++++++++++++++++++++++++
arch/arm/include/asm/hugetlb.h | 4 ++
arch/arm/include/asm/pgtable-2level.h | 102 +++++++++++++++++++++++++++
arch/arm/include/asm/pgtable.h | 2 +
arch/arm/include/asm/tlb.h | 10 ++-
arch/arm/kernel/head.S | 10 ++-
arch/arm/mm/fault.c | 11 ---
arch/arm/mm/fsr-2level.c | 4 +-
arch/arm/mm/mmu.c | 27 ++++++++
10 files changed, 276 insertions(+), 17 deletions(-)
create mode 100644 arch/arm/include/asm/hugetlb-2level.h
diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index 9a80544..077f5b9 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -1780,7 +1780,7 @@ config HW_PERF_EVENTS
config SYS_SUPPORTS_HUGETLBFS
def_bool y
- depends on ARM_LPAE
+ depends on ARM_LPAE || (!CPU_USE_DOMAINS && !MEMORY_FAILURE)
source "mm/Kconfig"
diff --git a/arch/arm/include/asm/hugetlb-2level.h b/arch/arm/include/asm/hugetlb-2level.h
new file mode 100644
index 0000000..eb0e8b6
--- /dev/null
+++ b/arch/arm/include/asm/hugetlb-2level.h
@@ -0,0 +1,121 @@
+/*
+ * arch/arm/include/asm/hugetlb-2level.h
+ *
+ * Copyright (C) 2012 ARM Ltd.
+ *
+ * Based on arch/x86/include/asm/hugetlb.h and Bill Carson's patches
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
+ */
+
+#ifndef _ASM_ARM_HUGETLB_2LEVEL_H
+#define _ASM_ARM_HUGETLB_2LEVEL_H
+
+
+static inline pte_t huge_ptep_get(pte_t *ptep)
+{
+ pmd_t pmd = *((pmd_t *)ptep);
+ pte_t retval;
+
+ if (!pmd_val(pmd))
+ return __pte(0);
+
+ retval = __pte((pteval_t) (pmd_val(pmd) & HPAGE_MASK)
+ | arm_hugepteprotval);
+
+ if (pmd_exec(pmd))
+ retval = pte_mkexec(retval);
+ else
+ retval = pte_mknexec(retval);
+
+ if (pmd_young(pmd))
+ retval = pte_mkyoung(retval);
+ else
+ retval = pte_mkold(retval);
+
+ if (pmd_dirty(pmd))
+ retval = pte_mkdirty(retval);
+ else
+ retval = pte_mkclean(retval);
+
+ if (pmd_write(pmd))
+ retval = pte_mkwrite(retval);
+ else
+ retval = pte_wrprotect(retval);
+
+ return retval;
+}
+
+static inline void set_huge_pte_at(struct mm_struct *mm, unsigned long addr,
+ pte_t *ptep, pte_t pte)
+{
+ pmdval_t pmdval = (pmdval_t) pte_val(pte);
+ pmd_t *pmdp = (pmd_t *) ptep;
+
+ /* take the target address bits from the pte only */
+ pmdval &= HPAGE_MASK;
+
+ /*
+ * now use pmd_modify to translate the permission bits from the pte
+ * and set the memory type information.
+ */
+ pmdval = pmd_val(pmd_modify(__pmd(pmdval), __pgprot(pte_val(pte))));
+
+ __sync_icache_dcache(pte);
+
+ set_pmd_at(mm, addr, pmdp, __pmd(pmdval));
+}
+
+static inline pte_t pte_mkhuge(pte_t pte) { return pte; }
+
+static inline void huge_ptep_clear_flush(struct vm_area_struct *vma,
+ unsigned long addr, pte_t *ptep)
+{
+ pmd_t *pmdp = (pmd_t *)ptep;
+ pmd_clear(pmdp);
+ flush_tlb_range(vma, addr, addr + HPAGE_SIZE);
+}
+
+static inline void huge_ptep_set_wrprotect(struct mm_struct *mm,
+ unsigned long addr, pte_t *ptep)
+{
+ pmd_t *pmdp = (pmd_t *) ptep;
+ set_pmd_at(mm, addr, pmdp, pmd_wrprotect(*pmdp));
+}
+
+
+static inline pte_t huge_ptep_get_and_clear(struct mm_struct *mm,
+ unsigned long addr, pte_t *ptep)
+{
+ pmd_t *pmdp = (pmd_t *)ptep;
+ pte_t pte = huge_ptep_get(ptep);
+ pmd_clear(pmdp);
+
+ return pte;
+}
+
+static inline int huge_ptep_set_access_flags(struct vm_area_struct *vma,
+ unsigned long addr, pte_t *ptep,
+ pte_t pte, int dirty)
+{
+ int changed = !pte_same(huge_ptep_get(ptep), pte);
+ if (changed) {
+ set_huge_pte_at(vma->vm_mm, addr, ptep, pte);
+ flush_tlb_range(vma, addr, addr + HPAGE_SIZE);
+ }
+
+ return changed;
+}
+
+#endif /* _ASM_ARM_HUGETLB_2LEVEL_H */
diff --git a/arch/arm/include/asm/hugetlb.h b/arch/arm/include/asm/hugetlb.h
index 7af9cf6..1e92975 100644
--- a/arch/arm/include/asm/hugetlb.h
+++ b/arch/arm/include/asm/hugetlb.h
@@ -24,7 +24,11 @@
#include <asm/page.h>
+#ifdef CONFIG_ARM_LPAE
#include <asm/hugetlb-3level.h>
+#else
+#include <asm/hugetlb-2level.h>
+#endif
static inline void hugetlb_free_pgd_range(struct mmu_gather *tlb,
unsigned long addr, unsigned long end,
diff --git a/arch/arm/include/asm/pgtable-2level.h b/arch/arm/include/asm/pgtable-2level.h
index f97ee02..d859613 100644
--- a/arch/arm/include/asm/pgtable-2level.h
+++ b/arch/arm/include/asm/pgtable-2level.h
@@ -181,6 +181,108 @@ static inline pmd_t *pmd_offset(pud_t *pud, unsigned long addr)
#define set_pte_ext(ptep,pte,ext) cpu_set_pte_ext(ptep,pte,ext)
+
+#ifdef CONFIG_SYS_SUPPORTS_HUGETLBFS
+
+/*
+ * now follows some of the definitions to allow huge page support, we can't put
+ * these in the hugetlb source files as they are also required for transparent
+ * hugepage support.
+ */
+
+#define HPAGE_SHIFT PMD_SHIFT
+#define HPAGE_SIZE (_AC(1, UL) << HPAGE_SHIFT)
+#define HPAGE_MASK (~(HPAGE_SIZE - 1))
+#define HUGETLB_PAGE_ORDER (HPAGE_SHIFT - PAGE_SHIFT)
+
+#define HUGE_LINUX_PTE_COUNT (PAGE_OFFSET >> HPAGE_SHIFT)
+#define HUGE_LINUX_PTE_SIZE (HUGE_LINUX_PTE_COUNT * sizeof(pte_t *))
+#define HUGE_LINUX_PTE_INDEX(addr) (addr >> HPAGE_SHIFT)
+
+/*
+ * We re-purpose the following domain bits in the section descriptor
+ */
+#define PMD_DOMAIN_MASK (_AT(pmdval_t, 0xF) << 5)
+#define PMD_DSECT_DIRTY (_AT(pmdval_t, 1) << 5)
+#define PMD_DSECT_AF (_AT(pmdval_t, 1) << 6)
+
+#define PMD_BIT_FUNC(fn,op) \
+static inline pmd_t pmd_##fn(pmd_t pmd) { pmd_val(pmd) op; return pmd; }
+
+static inline void set_pmd_at(struct mm_struct *mm, unsigned long addr,
+ pmd_t *pmdp, pmd_t pmd)
+{
+ /*
+ * we can sometimes be passed a pmd pointing to a level 2 descriptor
+ * from collapse_huge_page.
+ */
+ if ((pmd_val(pmd) & PMD_TYPE_MASK) == PMD_TYPE_TABLE) {
+ pmdp[0] = __pmd(pmd_val(pmd));
+ pmdp[1] = __pmd(pmd_val(pmd) + 256 * sizeof(pte_t));
+ } else {
+ pmdp[0] = __pmd(pmd_val(pmd)); /* first 1M section */
+ pmdp[1] = __pmd(pmd_val(pmd) + SECTION_SIZE); /* second 1M section */
+ }
+
+ flush_pmd_entry(pmdp);
+}
+
+extern pmdval_t arm_hugepmdprotval;
+extern pteval_t arm_hugepteprotval;
+
+#define pmd_mkhuge(pmd) (__pmd((pmd_val(pmd) & ~PMD_TYPE_MASK) | PMD_TYPE_SECT))
+
+PMD_BIT_FUNC(mkold, &= ~PMD_DSECT_AF);
+PMD_BIT_FUNC(mkdirty, |= PMD_DSECT_DIRTY);
+PMD_BIT_FUNC(mkclean, &= ~PMD_DSECT_DIRTY);
+PMD_BIT_FUNC(mkyoung, |= PMD_DSECT_AF);
+PMD_BIT_FUNC(mkwrite, |= PMD_SECT_AP_WRITE);
+PMD_BIT_FUNC(wrprotect, &= ~PMD_SECT_AP_WRITE);
+PMD_BIT_FUNC(mknotpresent, &= ~PMD_TYPE_MASK);
+PMD_BIT_FUNC(mkexec, &= ~PMD_SECT_XN);
+PMD_BIT_FUNC(mknexec, |= PMD_SECT_XN);
+
+#define pmd_young(pmd) (pmd_val(pmd) & PMD_DSECT_AF)
+#define pmd_write(pmd) (pmd_val(pmd) & PMD_SECT_AP_WRITE)
+#define pmd_exec(pmd) (!(pmd_val(pmd) & PMD_SECT_XN))
+#define pmd_dirty(pmd) (pmd_val(pmd) & PMD_DSECT_DIRTY)
+
+#define __HAVE_ARCH_PMD_WRITE
+
+#define pmd_modify(pmd, prot) \
+({ \
+ pmd_t pmdret = __pmd((pmd_val(pmd) & (PMD_MASK | PMD_DOMAIN_MASK)) \
+ | arm_hugepmdprotval); \
+ pgprot_t inprot = prot; \
+ pte_t newprot = __pte(pgprot_val(inprot)); \
+ \
+ if (pte_dirty(newprot)) \
+ pmdret = pmd_mkdirty(pmdret); \
+ else \
+ pmdret = pmd_mkclean(pmdret); \
+ \
+ if (pte_exec(newprot)) \
+ pmdret = pmd_mkexec(pmdret); \
+ else \
+ pmdret = pmd_mknexec(pmdret); \
+ \
+ if (pte_write(newprot)) \
+ pmdret = pmd_mkwrite(pmdret); \
+ else \
+ pmdret = pmd_wrprotect(pmdret); \
+ \
+ if (pte_young(newprot)) \
+ pmdret = pmd_mkyoung(pmdret); \
+ else \
+ pmdret = pmd_mkold(pmdret); \
+ \
+ pmdret; \
+})
+
+#else
+#define HPAGE_SIZE 0
+#endif /* CONFIG_SYS_SUPPORTS_HUGETLBFS */
+
#endif /* __ASSEMBLY__ */
#endif /* _ASM_PGTABLE_2LEVEL_H */
diff --git a/arch/arm/include/asm/pgtable.h b/arch/arm/include/asm/pgtable.h
index 9c82f98..4e7efc6 100644
--- a/arch/arm/include/asm/pgtable.h
+++ b/arch/arm/include/asm/pgtable.h
@@ -235,6 +235,8 @@ PTE_BIT_FUNC(mkclean, &= ~L_PTE_DIRTY);
PTE_BIT_FUNC(mkdirty, |= L_PTE_DIRTY);
PTE_BIT_FUNC(mkold, &= ~L_PTE_YOUNG);
PTE_BIT_FUNC(mkyoung, |= L_PTE_YOUNG);
+PTE_BIT_FUNC(mkexec, &= ~L_PTE_XN);
+PTE_BIT_FUNC(mknexec, |= L_PTE_XN);
static inline pte_t pte_mkspecial(pte_t pte) { return pte; }
diff --git a/arch/arm/include/asm/tlb.h b/arch/arm/include/asm/tlb.h
index 99a1951..685e9e87 100644
--- a/arch/arm/include/asm/tlb.h
+++ b/arch/arm/include/asm/tlb.h
@@ -92,10 +92,16 @@ static inline void tlb_flush(struct mmu_gather *tlb)
static inline void tlb_add_flush(struct mmu_gather *tlb, unsigned long addr)
{
if (!tlb->fullmm) {
+ unsigned long size = PAGE_SIZE;
+
if (addr < tlb->range_start)
tlb->range_start = addr;
- if (addr + PAGE_SIZE > tlb->range_end)
- tlb->range_end = addr + PAGE_SIZE;
+
+ if (tlb->vma && is_vm_hugetlb_page(tlb->vma))
+ size = HPAGE_SIZE;
+
+ if (addr + size > tlb->range_end)
+ tlb->range_end = addr + size;
}
}
diff --git a/arch/arm/kernel/head.S b/arch/arm/kernel/head.S
index 486a15a..1378221 100644
--- a/arch/arm/kernel/head.S
+++ b/arch/arm/kernel/head.S
@@ -413,13 +413,21 @@ __enable_mmu:
mov r5, #0
mcrr p15, 0, r4, r5, c2 @ load TTBR0
#else
+#ifndef CONFIG_SYS_SUPPORTS_HUGETLBFS
mov r5, #(domain_val(DOMAIN_USER, DOMAIN_MANAGER) | \
domain_val(DOMAIN_KERNEL, DOMAIN_MANAGER) | \
domain_val(DOMAIN_TABLE, DOMAIN_MANAGER) | \
domain_val(DOMAIN_IO, DOMAIN_CLIENT))
+#else
+ @ set ourselves as the client in all domains
+ @ this allows us to then use the 4 domain bits in the
+ @ section descriptors in our transparent huge pages
+ ldr r5, =0x55555555
+#endif /* CONFIG_SYS_SUPPORTS_HUGETLBFS */
+
mcr p15, 0, r5, c3, c0, 0 @ load domain access register
mcr p15, 0, r4, c2, c0, 0 @ load page table pointer
-#endif
+#endif /* CONFIG_ARM_LPAE */
b __turn_mmu_on
ENDPROC(__enable_mmu)
diff --git a/arch/arm/mm/fault.c b/arch/arm/mm/fault.c
index 5dbf13f..95d53f9 100644
--- a/arch/arm/mm/fault.c
+++ b/arch/arm/mm/fault.c
@@ -488,17 +488,6 @@ do_translation_fault(unsigned long addr, unsigned int fsr,
#endif /* CONFIG_MMU */
/*
- * Some section permission faults need to be handled gracefully.
- * They can happen due to a __{get,put}_user during an oops.
- */
-static int
-do_sect_fault(unsigned long addr, unsigned int fsr, struct pt_regs *regs)
-{
- do_bad_area(addr, fsr, regs);
- return 0;
-}
-
-/*
* This abort handler always returns "fault".
*/
static int
diff --git a/arch/arm/mm/fsr-2level.c b/arch/arm/mm/fsr-2level.c
index 18ca74c..c1a2afc 100644
--- a/arch/arm/mm/fsr-2level.c
+++ b/arch/arm/mm/fsr-2level.c
@@ -16,7 +16,7 @@ static struct fsr_info fsr_info[] = {
{ do_bad, SIGBUS, 0, "external abort on non-linefetch" },
{ do_bad, SIGSEGV, SEGV_ACCERR, "page domain fault" },
{ do_bad, SIGBUS, 0, "external abort on translation" },
- { do_sect_fault, SIGSEGV, SEGV_ACCERR, "section permission fault" },
+ { do_page_fault, SIGSEGV, SEGV_ACCERR, "section permission fault" },
{ do_bad, SIGBUS, 0, "external abort on translation" },
{ do_page_fault, SIGSEGV, SEGV_ACCERR, "page permission fault" },
/*
@@ -56,7 +56,7 @@ static struct fsr_info ifsr_info[] = {
{ do_bad, SIGBUS, 0, "unknown 10" },
{ do_bad, SIGSEGV, SEGV_ACCERR, "page domain fault" },
{ do_bad, SIGBUS, 0, "external abort on translation" },
- { do_sect_fault, SIGSEGV, SEGV_ACCERR, "section permission fault" },
+ { do_page_fault, SIGSEGV, SEGV_ACCERR, "section permission fault" },
{ do_bad, SIGBUS, 0, "external abort on translation" },
{ do_page_fault, SIGSEGV, SEGV_ACCERR, "page permission fault" },
{ do_bad, SIGBUS, 0, "unknown 16" },
diff --git a/arch/arm/mm/mmu.c b/arch/arm/mm/mmu.c
index ce328c7..74f1917 100644
--- a/arch/arm/mm/mmu.c
+++ b/arch/arm/mm/mmu.c
@@ -303,6 +303,21 @@ const struct mem_type *get_mem_type(unsigned int type)
EXPORT_SYMBOL(get_mem_type);
/*
+ * If the system supports huge pages and we are running with short descriptors,
+ * then compute the pmd and linux pte prot values for a huge page.
+ *
+ * These values are used by both the HugeTLB and THP code.
+ */
+#if defined(CONFIG_SYS_SUPPORTS_HUGETLBFS) && !defined(CONFIG_ARM_LPAE)
+pmdval_t arm_hugepmdprotval;
+EXPORT_SYMBOL(arm_hugepmdprotval);
+
+pteval_t arm_hugepteprotval;
+EXPORT_SYMBOL(arm_hugepteprotval);
+#endif
+
+
+/*
* Adjust the PMD section entries according to the CPU in use.
*/
static void __init build_mem_type_table(void)
@@ -526,6 +541,18 @@ static void __init build_mem_type_table(void)
if (t->prot_sect)
t->prot_sect |= PMD_DOMAIN(t->domain);
}
+
+#if defined(CONFIG_SYS_SUPPORTS_HUGETLBFS) && !defined(CONFIG_ARM_LPAE)
+ /*
+ * we assume all huge pages are user pages and that hardware access
+ * flag updates are disabled (i.e. SCTLR.AFE == 0b).
+ */
+ arm_hugepteprotval = mem_types[MT_MEMORY].prot_pte | L_PTE_USER | L_PTE_VALID;
+
+ arm_hugepmdprotval = mem_types[MT_MEMORY].prot_sect | PMD_SECT_AP_READ
+ | PMD_SECT_nG;
+#endif
+
}
#ifdef CONFIG_ARM_DMA_MEM_BUFFERABLE
--
1.7.9.5
^ permalink raw reply related [flat|nested] 8+ messages in thread
* [PATCH 5/6] ARM: mm: Transparent huge page support for LPAE systems.
2013-02-08 15:01 [PATCH 0/6] ARM: mm: HugeTLB + THP support Steve Capper
` (3 preceding siblings ...)
2013-02-08 15:01 ` [PATCH 4/6] ARM: mm: HugeTLB support for non-LPAE systems Steve Capper
@ 2013-02-08 15:01 ` Steve Capper
2013-02-08 15:01 ` [PATCH 6/6] ARM: mm: Transparent huge page support for non-LPAE systems Steve Capper
2013-02-10 20:45 ` [PATCH 0/6] ARM: mm: HugeTLB + THP support Grazvydas Ignotas
6 siblings, 0 replies; 8+ messages in thread
From: Steve Capper @ 2013-02-08 15:01 UTC (permalink / raw)
To: linux-arch, linux-arm-kernel
Cc: c.dall, akpm, mhocko, kirill, aarcange, cmetcalf, hoffman,
notasas, bill4carson, will.deacon, catalin.marinas, maen, shadi,
tawfik, Steve Capper
From: Catalin Marinas <catalin.marinas@arm.com>
The patch adds support for THP (transparent huge pages) to LPAE
systems. When this feature is enabled, the kernel tries to map
anonymous pages as 2MB sections where possible.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
[steve.capper@arm.com: symbolic constants used, value of PMD_SECT_SPLITTING
adjusted, tlbflush.h included in pgtable.h]
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Steve Capper <steve.capper@arm.com>
---
arch/arm/Kconfig | 4 +++
arch/arm/include/asm/pgtable-2level.h | 2 ++
arch/arm/include/asm/pgtable-3level-hwdef.h | 2 ++
arch/arm/include/asm/pgtable-3level.h | 51 +++++++++++++++++++++++++++
arch/arm/include/asm/pgtable.h | 4 ++-
arch/arm/include/asm/tlb.h | 6 ++++
arch/arm/include/asm/tlbflush.h | 2 ++
arch/arm/mm/fsr-3level.c | 2 +-
8 files changed, 71 insertions(+), 2 deletions(-)
diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index 077f5b9..7bee9b9 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -1782,6 +1782,10 @@ config SYS_SUPPORTS_HUGETLBFS
def_bool y
depends on ARM_LPAE || (!CPU_USE_DOMAINS && !MEMORY_FAILURE)
+config HAVE_ARCH_TRANSPARENT_HUGEPAGE
+ def_bool y
+ depends on ARM_LPAE
+
source "mm/Kconfig"
config FORCE_MAX_ZONEORDER
diff --git a/arch/arm/include/asm/pgtable-2level.h b/arch/arm/include/asm/pgtable-2level.h
index d859613..7423ecf 100644
--- a/arch/arm/include/asm/pgtable-2level.h
+++ b/arch/arm/include/asm/pgtable-2level.h
@@ -179,6 +179,8 @@ static inline pmd_t *pmd_offset(pud_t *pud, unsigned long addr)
/* we don't need complex calculations here as the pmd is folded into the pgd */
#define pmd_addr_end(addr,end) (end)
+#define pmd_present(pmd) ((pmd_val(pmd) & PMD_TYPE_MASK) != PMD_TYPE_FAULT)
+
#define set_pte_ext(ptep,pte,ext) cpu_set_pte_ext(ptep,pte,ext)
diff --git a/arch/arm/include/asm/pgtable-3level-hwdef.h b/arch/arm/include/asm/pgtable-3level-hwdef.h
index d795282..53c7f67 100644
--- a/arch/arm/include/asm/pgtable-3level-hwdef.h
+++ b/arch/arm/include/asm/pgtable-3level-hwdef.h
@@ -38,6 +38,8 @@
*/
#define PMD_SECT_BUFFERABLE (_AT(pmdval_t, 1) << 2)
#define PMD_SECT_CACHEABLE (_AT(pmdval_t, 1) << 3)
+#define PMD_SECT_USER (_AT(pmdval_t, 1) << 6) /* AP[1] */
+#define PMD_SECT_RDONLY (_AT(pmdval_t, 1) << 7) /* AP[2] */
#define PMD_SECT_S (_AT(pmdval_t, 3) << 8)
#define PMD_SECT_AF (_AT(pmdval_t, 1) << 10)
#define PMD_SECT_nG (_AT(pmdval_t, 1) << 11)
diff --git a/arch/arm/include/asm/pgtable-3level.h b/arch/arm/include/asm/pgtable-3level.h
index d89be57..1b0351d 100644
--- a/arch/arm/include/asm/pgtable-3level.h
+++ b/arch/arm/include/asm/pgtable-3level.h
@@ -87,6 +87,9 @@
#define L_PTE_SPECIAL (_AT(pteval_t, 1) << 56) /* unused */
#define L_PTE_NONE (_AT(pteval_t, 1) << 57) /* PROT_NONE */
+#define PMD_SECT_DIRTY (_AT(pmdval_t, 1) << 55)
+#define PMD_SECT_SPLITTING (_AT(pmdval_t, 1) << 57)
+
/*
* To be used in assembly code with the upper page attributes.
*/
@@ -180,6 +183,54 @@ static inline pmd_t *pmd_offset(pud_t *pud, unsigned long addr)
#define pte_mkhuge(pte) (__pte((pte_val(pte) & ~PMD_TYPE_MASK) | PMD_TYPE_SECT))
+#define pmd_present(pmd) ((pmd_val(pmd) & PMD_TYPE_MASK) != PMD_TYPE_FAULT)
+#define pmd_young(pmd) (pmd_val(pmd) & PMD_SECT_AF)
+
+#define __HAVE_ARCH_PMD_WRITE
+#define pmd_write(pmd) (!(pmd_val(pmd) & PMD_SECT_RDONLY))
+
+#ifdef CONFIG_TRANSPARENT_HUGEPAGE
+#define pmd_trans_huge(pmd) ((pmd_val(pmd) & PMD_TYPE_MASK) == PMD_TYPE_SECT)
+#define pmd_trans_splitting(pmd) (pmd_val(pmd) & PMD_SECT_SPLITTING)
+#endif
+
+#define PMD_BIT_FUNC(fn,op) \
+static inline pmd_t pmd_##fn(pmd_t pmd) { pmd_val(pmd) op; return pmd; }
+
+PMD_BIT_FUNC(wrprotect, |= PMD_SECT_RDONLY);
+PMD_BIT_FUNC(mkold, &= ~PMD_SECT_AF);
+PMD_BIT_FUNC(mksplitting, |= PMD_SECT_SPLITTING);
+PMD_BIT_FUNC(mkwrite, &= ~PMD_SECT_RDONLY);
+PMD_BIT_FUNC(mkdirty, |= PMD_SECT_DIRTY);
+PMD_BIT_FUNC(mkyoung, |= PMD_SECT_AF);
+PMD_BIT_FUNC(mknotpresent, &= ~PMD_TYPE_MASK);
+
+#define pmd_mkhuge(pmd) (__pmd((pmd_val(pmd) & ~PMD_TYPE_MASK) | PMD_TYPE_SECT))
+
+#define pmd_pfn(pmd) (((pmd_val(pmd) & PMD_MASK) & PHYS_MASK) >> PAGE_SHIFT)
+#define pfn_pmd(pfn,prot) (__pmd(((phys_addr_t)(pfn) << PAGE_SHIFT) | pgprot_val(prot)))
+#define mk_pmd(page,prot) pfn_pmd(page_to_pfn(page),prot)
+
+static inline pmd_t pmd_modify(pmd_t pmd, pgprot_t newprot)
+{
+ const pmdval_t mask = PMD_SECT_USER | PMD_SECT_XN | PMD_SECT_RDONLY;
+ pmd_val(pmd) = (pmd_val(pmd) & ~mask) | (pgprot_val(newprot) & mask);
+ return pmd;
+}
+
+static inline void set_pmd_at(struct mm_struct *mm, unsigned long addr,
+ pmd_t *pmdp, pmd_t pmd)
+{
+ BUG_ON(addr >= TASK_SIZE);
+ *pmdp = __pmd(pmd_val(pmd) | PMD_SECT_nG);
+ flush_pmd_entry(pmdp);
+}
+
+static inline int has_transparent_hugepage(void)
+{
+ return 1;
+}
+
#endif /* __ASSEMBLY__ */
#endif /* _ASM_PGTABLE_3LEVEL_H */
diff --git a/arch/arm/include/asm/pgtable.h b/arch/arm/include/asm/pgtable.h
index 4e7efc6..bb27451 100644
--- a/arch/arm/include/asm/pgtable.h
+++ b/arch/arm/include/asm/pgtable.h
@@ -24,6 +24,9 @@
#include <asm/memory.h>
#include <asm/pgtable-hwdef.h>
+
+#include <asm/tlbflush.h>
+
#ifdef CONFIG_ARM_LPAE
#include <asm/pgtable-3level.h>
#else
@@ -163,7 +166,6 @@ extern pgd_t swapper_pg_dir[PTRS_PER_PGD];
#define pgd_offset_k(addr) pgd_offset(&init_mm, addr)
#define pmd_none(pmd) (!pmd_val(pmd))
-#define pmd_present(pmd) (pmd_val(pmd))
static inline pte_t *pmd_page_vaddr(pmd_t pmd)
{
diff --git a/arch/arm/include/asm/tlb.h b/arch/arm/include/asm/tlb.h
index 685e9e87..0fc2d9d 100644
--- a/arch/arm/include/asm/tlb.h
+++ b/arch/arm/include/asm/tlb.h
@@ -229,6 +229,12 @@ static inline void __pmd_free_tlb(struct mmu_gather *tlb, pmd_t *pmdp,
#endif
}
+static inline void
+tlb_remove_pmd_tlb_entry(struct mmu_gather *tlb, pmd_t *pmdp, unsigned long addr)
+{
+ tlb_add_flush(tlb, addr);
+}
+
#define pte_free_tlb(tlb, ptep, addr) __pte_free_tlb(tlb, ptep, addr)
#define pmd_free_tlb(tlb, pmdp, addr) __pmd_free_tlb(tlb, pmdp, addr)
#define pud_free_tlb(tlb, pudp, addr) pud_free((tlb)->mm, pudp)
diff --git a/arch/arm/include/asm/tlbflush.h b/arch/arm/include/asm/tlbflush.h
index 6e924d3..907cede 100644
--- a/arch/arm/include/asm/tlbflush.h
+++ b/arch/arm/include/asm/tlbflush.h
@@ -505,6 +505,8 @@ static inline void update_mmu_cache(struct vm_area_struct *vma,
}
#endif
+#define update_mmu_cache_pmd(vma, address, pmd) do { } while (0)
+
#endif
#endif /* CONFIG_MMU */
diff --git a/arch/arm/mm/fsr-3level.c b/arch/arm/mm/fsr-3level.c
index e115fc7..ab4409a 100644
--- a/arch/arm/mm/fsr-3level.c
+++ b/arch/arm/mm/fsr-3level.c
@@ -9,7 +9,7 @@ static struct fsr_info fsr_info[] = {
{ do_page_fault, SIGSEGV, SEGV_MAPERR, "level 3 translation fault" },
{ do_bad, SIGBUS, 0, "reserved access flag fault" },
{ do_bad, SIGSEGV, SEGV_ACCERR, "level 1 access flag fault" },
- { do_bad, SIGSEGV, SEGV_ACCERR, "level 2 access flag fault" },
+ { do_page_fault, SIGSEGV, SEGV_ACCERR, "level 2 access flag fault" },
{ do_page_fault, SIGSEGV, SEGV_ACCERR, "level 3 access flag fault" },
{ do_bad, SIGBUS, 0, "reserved permission fault" },
{ do_bad, SIGSEGV, SEGV_ACCERR, "level 1 permission fault" },
--
1.7.9.5
^ permalink raw reply related [flat|nested] 8+ messages in thread
* [PATCH 6/6] ARM: mm: Transparent huge page support for non-LPAE systems.
2013-02-08 15:01 [PATCH 0/6] ARM: mm: HugeTLB + THP support Steve Capper
` (4 preceding siblings ...)
2013-02-08 15:01 ` [PATCH 5/6] ARM: mm: Transparent huge page support for LPAE systems Steve Capper
@ 2013-02-08 15:01 ` Steve Capper
2013-02-10 20:45 ` [PATCH 0/6] ARM: mm: HugeTLB + THP support Grazvydas Ignotas
6 siblings, 0 replies; 8+ messages in thread
From: Steve Capper @ 2013-02-08 15:01 UTC (permalink / raw)
To: linux-arch, linux-arm-kernel
Cc: c.dall, akpm, mhocko, kirill, aarcange, cmetcalf, hoffman,
notasas, bill4carson, will.deacon, catalin.marinas, maen, shadi,
tawfik, Steve Capper
Much of the required code for THP has been implemented in the
earlier non-LPAE HugeTLB patch.
One more domain bit is used (to store whether or not the THP is
splitting).
Some THP helper functions are defined; and we have to re-define
pmd_page such that it distinguishes between page tables and
sections.
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Steve Capper <steve.capper@arm.com>
---
arch/arm/Kconfig | 2 +-
arch/arm/include/asm/pgtable-2level.h | 47 +++++++++++++++++++++++++++++++++
arch/arm/include/asm/pgtable-3level.h | 2 ++
arch/arm/include/asm/pgtable.h | 7 +++--
4 files changed, 55 insertions(+), 3 deletions(-)
diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index 7bee9b9..32e6aba 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -1784,7 +1784,7 @@ config SYS_SUPPORTS_HUGETLBFS
config HAVE_ARCH_TRANSPARENT_HUGEPAGE
def_bool y
- depends on ARM_LPAE
+ depends on SYS_SUPPORTS_HUGETLBFS
source "mm/Kconfig"
diff --git a/arch/arm/include/asm/pgtable-2level.h b/arch/arm/include/asm/pgtable-2level.h
index 7423ecf..e3b5ec2 100644
--- a/arch/arm/include/asm/pgtable-2level.h
+++ b/arch/arm/include/asm/pgtable-2level.h
@@ -207,6 +207,7 @@ static inline pmd_t *pmd_offset(pud_t *pud, unsigned long addr)
#define PMD_DOMAIN_MASK (_AT(pmdval_t, 0xF) << 5)
#define PMD_DSECT_DIRTY (_AT(pmdval_t, 1) << 5)
#define PMD_DSECT_AF (_AT(pmdval_t, 1) << 6)
+#define PMD_DSECT_SPLITTING (_AT(pmdval_t, 1) << 7)
#define PMD_BIT_FUNC(fn,op) \
static inline pmd_t pmd_##fn(pmd_t pmd) { pmd_val(pmd) op; return pmd; }
@@ -285,6 +286,52 @@ PMD_BIT_FUNC(mknexec, |= PMD_SECT_XN);
#define HPAGE_SIZE 0
#endif /* CONFIG_SYS_SUPPORTS_HUGETLBFS */
+#ifdef CONFIG_TRANSPARENT_HUGEPAGE
+#define pmd_mkhuge(pmd) (__pmd((pmd_val(pmd) & ~PMD_TYPE_MASK) | PMD_TYPE_SECT))
+
+PMD_BIT_FUNC(mksplitting, |= PMD_DSECT_SPLITTING);
+#define pmd_trans_splitting(pmd) (pmd_val(pmd) & PMD_DSECT_SPLITTING)
+#define pmd_trans_huge(pmd) ((pmd_val(pmd) & PMD_TYPE_MASK) == PMD_TYPE_SECT)
+
+static inline unsigned long pmd_pfn(pmd_t pmd)
+{
+ /*
+ * for a section, we need to mask off more of the pmd
+ * before looking up the pfn
+ */
+ if ((pmd_val(pmd) & PMD_TYPE_MASK) == PMD_TYPE_SECT)
+ return __phys_to_pfn(pmd_val(pmd) & HPAGE_MASK);
+ else
+ return __phys_to_pfn(pmd_val(pmd) & PHYS_MASK);
+}
+
+#define pfn_pmd(pfn,prot) pmd_modify(__pmd(__pfn_to_phys(pfn)),prot);
+#define mk_pmd(page,prot) pfn_pmd(page_to_pfn(page),prot);
+
+static inline int has_transparent_hugepage(void)
+{
+ return 1;
+}
+
+#define _PMD_HUGE(pmd) ((pmd_val(pmd) & PMD_TYPE_MASK) == PMD_TYPE_SECT)
+#define _PMD_HPAGE(pmd) (phys_to_page(pmd_val(pmd) & HPAGE_MASK))
+#else
+#define _PMD_HUGE(pmd) (0)
+#define _PMD_HPAGE(pmd) (0)
+#endif /* CONFIG_TRANSPARENT_HUGEPAGE */
+
+static inline struct page *pmd_page(pmd_t pmd)
+{
+ /*
+ * for a section, we need to mask off more of the pmd
+ * before looking up the page as it is a section descriptor.
+ */
+ if (_PMD_HUGE(pmd))
+ return _PMD_HPAGE(pmd);
+
+ return phys_to_page(pmd_val(pmd) & PHYS_MASK);
+}
+
#endif /* __ASSEMBLY__ */
#endif /* _ASM_PGTABLE_2LEVEL_H */
diff --git a/arch/arm/include/asm/pgtable-3level.h b/arch/arm/include/asm/pgtable-3level.h
index 1b0351d..cc16398 100644
--- a/arch/arm/include/asm/pgtable-3level.h
+++ b/arch/arm/include/asm/pgtable-3level.h
@@ -211,6 +211,8 @@ PMD_BIT_FUNC(mknotpresent, &= ~PMD_TYPE_MASK);
#define pfn_pmd(pfn,prot) (__pmd(((phys_addr_t)(pfn) << PAGE_SHIFT) | pgprot_val(prot)))
#define mk_pmd(page,prot) pfn_pmd(page_to_pfn(page),prot)
+#define pmd_page(pmd) pfn_to_page(__phys_to_pfn(pmd_val(pmd) & PHYS_MASK))
+
static inline pmd_t pmd_modify(pmd_t pmd, pgprot_t newprot)
{
const pmdval_t mask = PMD_SECT_USER | PMD_SECT_XN | PMD_SECT_RDONLY;
diff --git a/arch/arm/include/asm/pgtable.h b/arch/arm/include/asm/pgtable.h
index bb27451..850270c5 100644
--- a/arch/arm/include/asm/pgtable.h
+++ b/arch/arm/include/asm/pgtable.h
@@ -169,11 +169,14 @@ extern pgd_t swapper_pg_dir[PTRS_PER_PGD];
static inline pte_t *pmd_page_vaddr(pmd_t pmd)
{
+#ifdef SYS_SUPPORTS_HUGETLBFS
+ if ((pmd_val(pmd) & PMD_TYPE_MASK) == PMD_TYPE_SECT)
+ return __va(pmd_val(pmd) & HPAGE_MASK);
+#endif
+
return __va(pmd_val(pmd) & PHYS_MASK & (s32)PAGE_MASK);
}
-#define pmd_page(pmd) pfn_to_page(__phys_to_pfn(pmd_val(pmd) & PHYS_MASK))
-
#ifndef CONFIG_HIGHPTE
#define __pte_map(pmd) pmd_page_vaddr(*(pmd))
#define __pte_unmap(pte) do { } while (0)
--
1.7.9.5
^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH 0/6] ARM: mm: HugeTLB + THP support.
2013-02-08 15:01 [PATCH 0/6] ARM: mm: HugeTLB + THP support Steve Capper
` (5 preceding siblings ...)
2013-02-08 15:01 ` [PATCH 6/6] ARM: mm: Transparent huge page support for non-LPAE systems Steve Capper
@ 2013-02-10 20:45 ` Grazvydas Ignotas
6 siblings, 0 replies; 8+ messages in thread
From: Grazvydas Ignotas @ 2013-02-10 20:45 UTC (permalink / raw)
To: Steve Capper
Cc: linux-arch, linux-arm-kernel, c.dall, akpm, mhocko, kirill,
aarcange, cmetcalf, hoffman, bill4carson, will.deacon,
catalin.marinas, maen, shadi, tawfik
On Fri, Feb 8, 2013 at 5:01 PM, Steve Capper <steve.capper@arm.com> wrote:
> Hello,
> The following patches bring both HugeTLB support and Transparent
> HugePage (THP) support to ARM.
>
> These are not intended for 3.9.
I've tested this on 3.2 with this series and dependencies (LPAE and
some others) backported because I currently only have full board
support on 3.2. This was done on OMAP3/CortexA8 pandora board running
PSX emulator that renders graphics to 16bpp 1024x512 framebuffer
(sometimes 2048x1024 in "enhanced" mode). Here are some average frames
per second counter readings without huge pages and with them:
(not showing games as they are irrelevant here)
no hugetlb / hugetlb
game1: ~52 / 57
game2: ~67 / 77
game3: 21 / 26
game4: 22 / 25
As it can be seen here huge pages are very useful for this kind of
load, so I'm really looking forward for this to get merged.
FWIW:
Tested-by: Grazvydas Ignotas <notasas@gmail.com>
--
Gražvydas
^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2013-02-10 20:46 UTC | newest]
Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2013-02-08 15:01 [PATCH 0/6] ARM: mm: HugeTLB + THP support Steve Capper
2013-02-08 15:01 ` [PATCH 1/6] ARM: mm: correct pte_same behaviour for LPAE Steve Capper
2013-02-08 15:01 ` [PATCH 2/6] ARM: mm: Add support for flushing HugeTLB pages Steve Capper
2013-02-08 15:01 ` [PATCH 3/6] ARM: mm: HugeTLB support for LPAE systems Steve Capper
2013-02-08 15:01 ` [PATCH 4/6] ARM: mm: HugeTLB support for non-LPAE systems Steve Capper
2013-02-08 15:01 ` [PATCH 5/6] ARM: mm: Transparent huge page support for LPAE systems Steve Capper
2013-02-08 15:01 ` [PATCH 6/6] ARM: mm: Transparent huge page support for non-LPAE systems Steve Capper
2013-02-10 20:45 ` [PATCH 0/6] ARM: mm: HugeTLB + THP support Grazvydas Ignotas
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).