* [PATCH v10 1/7] filemap: check compound_head(page)->mapping in filemap_fault()
2019-08-01 18:42 [PATCH v10 0/7] Enable THP for text section of non-shmem files Song Liu
@ 2019-08-01 18:42 ` Song Liu
2019-08-01 18:42 ` [PATCH v10 2/7] filemap: check compound_head(page)->mapping in pagecache_get_page() Song Liu
` (5 subsequent siblings)
6 siblings, 0 replies; 9+ messages in thread
From: Song Liu @ 2019-08-01 18:42 UTC (permalink / raw)
To: linux-mm, linux-fsdevel, linux-kernel
Cc: matthew.wilcox, kirill.shutemov, kernel-team, william.kucharski,
akpm, hdanton, Song Liu
Currently, filemap_fault() avoids a race condition with truncate by
checking page->mapping == mapping. This does not work for compound
pages. This patch lets it check compound_head(page)->mapping instead.
Acked-by: Rik van Riel <riel@surriel.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Song Liu <songliubraving@fb.com>
---
mm/filemap.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/mm/filemap.c b/mm/filemap.c
index 7161fb937e78..d0bd9e585c2f 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -2537,7 +2537,7 @@ vm_fault_t filemap_fault(struct vm_fault *vmf)
goto out_retry;
/* Did it get truncated? */
- if (unlikely(page->mapping != mapping)) {
+ if (unlikely(compound_head(page)->mapping != mapping)) {
unlock_page(page);
put_page(page);
goto retry_find;
--
2.17.1
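
For readers following along, here is a minimal userspace sketch of why
the check has to go through the head page. It is a toy model, not kernel
code: struct toy_page, struct toy_mapping and the toy_* helpers are
hypothetical names, and the model simply assumes that a tail page's own
->mapping never equals the file's mapping.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only; these are NOT the kernel's structures. */
struct toy_mapping {
	int id;
};

struct toy_page {
	struct toy_mapping *mapping;	/* assumed valid on the head page only */
	struct toy_page *head;		/* NULL for a head or order-0 page */
};

/* Hypothetical stand-in for compound_head(). */
static struct toy_page *toy_compound_head(struct toy_page *page)
{
	assert(page != NULL);
	return page->head ? page->head : page;
}

/* Old check: looks at the page's own ->mapping. */
static bool toy_truncated_old(struct toy_page *page,
			      struct toy_mapping *mapping)
{
	return page->mapping != mapping;
}

/* New check: looks at the head page's ->mapping. */
static bool toy_truncated_new(struct toy_page *page,
			      struct toy_mapping *mapping)
{
	return toy_compound_head(page)->mapping != mapping;
}
```

In this model the old check reports every tail page as truncated, while
the new check reports truncation only once the head page has lost its
mapping.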
* [PATCH v10 2/7] filemap: check compound_head(page)->mapping in pagecache_get_page()
2019-08-01 18:42 [PATCH v10 0/7] Enable THP for text section of non-shmem files Song Liu
2019-08-01 18:42 ` [PATCH v10 1/7] filemap: check compound_head(page)->mapping in filemap_fault() Song Liu
@ 2019-08-01 18:42 ` Song Liu
2019-08-12 20:33 ` Johannes Weiner
2019-08-01 18:42 ` [PATCH v10 3/7] filemap: update offset check in filemap_fault() Song Liu
` (4 subsequent siblings)
6 siblings, 1 reply; 9+ messages in thread
From: Song Liu @ 2019-08-01 18:42 UTC (permalink / raw)
To: linux-mm, linux-fsdevel, linux-kernel
Cc: matthew.wilcox, kirill.shutemov, kernel-team, william.kucharski,
akpm, hdanton, Song Liu
Similar to the previous patch, pagecache_get_page() avoids a race
condition with truncate by checking page->mapping == mapping. This does
not work for compound pages. This patch lets it check
compound_head(page)->mapping instead.
Suggested-by: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Song Liu <songliubraving@fb.com>
---
mm/filemap.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/mm/filemap.c b/mm/filemap.c
index d0bd9e585c2f..aaee1ef96f6d 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -1644,7 +1644,7 @@ struct page *pagecache_get_page(struct address_space *mapping, pgoff_t offset,
}
/* Has the page been truncated? */
- if (unlikely(page->mapping != mapping)) {
+ if (unlikely(compound_head(page)->mapping != mapping)) {
unlock_page(page);
put_page(page);
goto repeat;
--
2.17.1
* Re: [PATCH v10 2/7] filemap: check compound_head(page)->mapping in pagecache_get_page()
2019-08-01 18:42 ` [PATCH v10 2/7] filemap: check compound_head(page)->mapping in pagecache_get_page() Song Liu
@ 2019-08-12 20:33 ` Johannes Weiner
0 siblings, 0 replies; 9+ messages in thread
From: Johannes Weiner @ 2019-08-12 20:33 UTC (permalink / raw)
To: Song Liu
Cc: linux-mm, linux-fsdevel, linux-kernel, matthew.wilcox,
kirill.shutemov, kernel-team, william.kucharski, akpm, hdanton
On Thu, Aug 01, 2019 at 11:42:39AM -0700, Song Liu wrote:
> Similar to the previous patch, pagecache_get_page() avoids a race
> condition with truncate by checking page->mapping == mapping. This does
> not work for compound pages. This patch lets it check
> compound_head(page)->mapping instead.
>
> Suggested-by: Johannes Weiner <hannes@cmpxchg.org>
> Signed-off-by: Song Liu <songliubraving@fb.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
* [PATCH v10 3/7] filemap: update offset check in filemap_fault()
2019-08-01 18:42 [PATCH v10 0/7] Enable THP for text section of non-shmem files Song Liu
2019-08-01 18:42 ` [PATCH v10 1/7] filemap: check compound_head(page)->mapping in filemap_fault() Song Liu
2019-08-01 18:42 ` [PATCH v10 2/7] filemap: check compound_head(page)->mapping in pagecache_get_page() Song Liu
@ 2019-08-01 18:42 ` Song Liu
2019-08-01 18:42 ` [PATCH v10 4/7] mm,thp: stats for file backed THP Song Liu
` (3 subsequent siblings)
6 siblings, 0 replies; 9+ messages in thread
From: Song Liu @ 2019-08-01 18:42 UTC (permalink / raw)
To: linux-mm, linux-fsdevel, linux-kernel
Cc: matthew.wilcox, kirill.shutemov, kernel-team, william.kucharski,
akpm, hdanton, Song Liu
With THP, the current offset check:
  VM_BUG_ON_PAGE(page->index != offset, page);
is no longer accurate. Update it to:
  VM_BUG_ON_PAGE(page_to_pgoff(page) != offset, page);
Acked-by: Rik van Riel <riel@surriel.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Song Liu <songliubraving@fb.com>
---
mm/filemap.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/mm/filemap.c b/mm/filemap.c
index aaee1ef96f6d..97c7b7b92c20 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -2542,7 +2542,7 @@ vm_fault_t filemap_fault(struct vm_fault *vmf)
put_page(page);
goto retry_find;
}
- VM_BUG_ON_PAGE(page->index != offset, page);
+ VM_BUG_ON_PAGE(page_to_pgoff(page) != offset, page);
/*
* We have a locked page in the page cache, now we need to check
--
2.17.1
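
The reason page->index stops being reliable can be sketched in
userspace. The model below assumes that only the head page of a compound
page carries a valid index and that a subpage's file offset is the head
index plus the subpage's position; struct toy_page and
toy_page_to_pgoff() are hypothetical names for illustration, not the
kernel's page_to_pgoff().

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only; subpages of one compound page sit in one array. */
struct toy_page {
	unsigned long index;	/* file offset in pages, head page only */
	struct toy_page *head;	/* NULL for a head or order-0 page */
};

/* Hypothetical analogue of page_to_pgoff() for this model. */
static unsigned long toy_page_to_pgoff(const struct toy_page *page)
{
	assert(page != NULL);
	if (!page->head)
		return page->index;
	/* Tail page: derive the offset from the head page. */
	return page->head->index + (unsigned long)(page - page->head);
}
```

With a head page at file offset 512, the subpage three slots further on
maps to offset 515 in this model even though its own index field was
never set.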
* [PATCH v10 4/7] mm,thp: stats for file backed THP
2019-08-01 18:42 [PATCH v10 0/7] Enable THP for text section of non-shmem files Song Liu
` (2 preceding siblings ...)
2019-08-01 18:42 ` [PATCH v10 3/7] filemap: update offset check in filemap_fault() Song Liu
@ 2019-08-01 18:42 ` Song Liu
2019-08-01 18:42 ` [PATCH v10 5/7] khugepaged: rename collapse_shmem() and khugepaged_scan_shmem() Song Liu
` (2 subsequent siblings)
6 siblings, 0 replies; 9+ messages in thread
From: Song Liu @ 2019-08-01 18:42 UTC (permalink / raw)
To: linux-mm, linux-fsdevel, linux-kernel
Cc: matthew.wilcox, kirill.shutemov, kernel-team, william.kucharski,
akpm, hdanton, Song Liu
In preparation for non-shmem THP, this patch adds a few stats and exposes
them in /proc/meminfo, /sys/bus/node/devices/<node>/meminfo, and
/proc/<pid>/task/<tid>/smaps.
This patch is mostly a rewrite of Kirill A. Shutemov's earlier version:
https://lkml.kernel.org/r/20170126115819.58875-5-kirill.shutemov@linux.intel.com/
Acked-by: Rik van Riel <riel@surriel.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Song Liu <songliubraving@fb.com>
---
drivers/base/node.c | 6 ++++++
fs/proc/meminfo.c | 4 ++++
fs/proc/task_mmu.c | 4 +++-
include/linux/mmzone.h | 2 ++
mm/vmstat.c | 2 ++
5 files changed, 17 insertions(+), 1 deletion(-)
diff --git a/drivers/base/node.c b/drivers/base/node.c
index 75b7e6f6535b..4f2714ee819b 100644
--- a/drivers/base/node.c
+++ b/drivers/base/node.c
@@ -427,6 +427,8 @@ static ssize_t node_read_meminfo(struct device *dev,
"Node %d AnonHugePages: %8lu kB\n"
"Node %d ShmemHugePages: %8lu kB\n"
"Node %d ShmemPmdMapped: %8lu kB\n"
+ "Node %d FileHugePages: %8lu kB\n"
+ "Node %d FilePmdMapped: %8lu kB\n"
#endif
,
nid, K(node_page_state(pgdat, NR_FILE_DIRTY)),
@@ -452,6 +454,10 @@ static ssize_t node_read_meminfo(struct device *dev,
nid, K(node_page_state(pgdat, NR_SHMEM_THPS) *
HPAGE_PMD_NR),
nid, K(node_page_state(pgdat, NR_SHMEM_PMDMAPPED) *
+ HPAGE_PMD_NR),
+ nid, K(node_page_state(pgdat, NR_FILE_THPS) *
+ HPAGE_PMD_NR),
+ nid, K(node_page_state(pgdat, NR_FILE_PMDMAPPED) *
HPAGE_PMD_NR)
#endif
);
diff --git a/fs/proc/meminfo.c b/fs/proc/meminfo.c
index 465ea0153b2a..82673470dde7 100644
--- a/fs/proc/meminfo.c
+++ b/fs/proc/meminfo.c
@@ -136,6 +136,10 @@ static int meminfo_proc_show(struct seq_file *m, void *v)
global_node_page_state(NR_SHMEM_THPS) * HPAGE_PMD_NR);
show_val_kb(m, "ShmemPmdMapped: ",
global_node_page_state(NR_SHMEM_PMDMAPPED) * HPAGE_PMD_NR);
+ show_val_kb(m, "FileHugePages: ",
+ global_node_page_state(NR_FILE_THPS) * HPAGE_PMD_NR);
+ show_val_kb(m, "FilePmdMapped: ",
+ global_node_page_state(NR_FILE_PMDMAPPED) * HPAGE_PMD_NR);
#endif
#ifdef CONFIG_CMA
diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index 731642e0f5a0..1ea7d730774c 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -417,6 +417,7 @@ struct mem_size_stats {
unsigned long lazyfree;
unsigned long anonymous_thp;
unsigned long shmem_thp;
+ unsigned long file_thp;
unsigned long swap;
unsigned long shared_hugetlb;
unsigned long private_hugetlb;
@@ -586,7 +587,7 @@ static void smaps_pmd_entry(pmd_t *pmd, unsigned long addr,
else if (is_zone_device_page(page))
/* pass */;
else
- VM_BUG_ON_PAGE(1, page);
+ mss->file_thp += HPAGE_PMD_SIZE;
smaps_account(mss, page, true, pmd_young(*pmd), pmd_dirty(*pmd), locked);
}
#else
@@ -803,6 +804,7 @@ static void __show_smap(struct seq_file *m, const struct mem_size_stats *mss,
SEQ_PUT_DEC(" kB\nLazyFree: ", mss->lazyfree);
SEQ_PUT_DEC(" kB\nAnonHugePages: ", mss->anonymous_thp);
SEQ_PUT_DEC(" kB\nShmemPmdMapped: ", mss->shmem_thp);
+ SEQ_PUT_DEC(" kB\nFilePmdMapped: ", mss->file_thp);
SEQ_PUT_DEC(" kB\nShared_Hugetlb: ", mss->shared_hugetlb);
seq_put_decimal_ull_width(m, " kB\nPrivate_Hugetlb: ",
mss->private_hugetlb >> 10, 7);
diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index d77d717c620c..aa0dd8ca36c8 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -234,6 +234,8 @@ enum node_stat_item {
NR_SHMEM, /* shmem pages (included tmpfs/GEM pages) */
NR_SHMEM_THPS,
NR_SHMEM_PMDMAPPED,
+ NR_FILE_THPS,
+ NR_FILE_PMDMAPPED,
NR_ANON_THPS,
NR_UNSTABLE_NFS, /* NFS unstable pages */
NR_VMSCAN_WRITE,
diff --git a/mm/vmstat.c b/mm/vmstat.c
index fd7e16ca6996..6afc892a148a 100644
--- a/mm/vmstat.c
+++ b/mm/vmstat.c
@@ -1158,6 +1158,8 @@ const char * const vmstat_text[] = {
"nr_shmem",
"nr_shmem_hugepages",
"nr_shmem_pmdmapped",
+ "nr_file_hugepages",
+ "nr_file_pmdmapped",
"nr_anon_transparent_hugepages",
"nr_unstable",
"nr_vmscan_write",
--
2.17.1
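
As a rough illustration of the arithmetic in the meminfo hunks: the new
node counters count huge pages, so the value is multiplied by
HPAGE_PMD_NR to get base pages and then converted to kB. The sketch
below assumes 4 KiB base pages and 2 MiB PMD-sized pages (the usual
x86-64 geometry); the TOY_* constants and toy_* helpers are hypothetical
names, not kernel symbols.

```c
#include <assert.h>
#include <limits.h>

/* Assumed geometry: 4 KiB base pages, 2 MiB PMD-sized huge pages. */
#define TOY_PAGE_SHIFT		12
#define TOY_HPAGE_PMD_ORDER	9
#define TOY_HPAGE_PMD_NR	(1UL << TOY_HPAGE_PMD_ORDER)

/* Base pages to kB, in the spirit of the K() macro used above. */
static unsigned long toy_pages_to_kb(unsigned long nr_pages)
{
	return nr_pages << (TOY_PAGE_SHIFT - 10);
}

/* Huge page count to kB, as reported in a FileHugePages-style field. */
static unsigned long toy_thps_to_kb(unsigned long nr_thps)
{
	/* Guard against overflow in the multiplication and shift. */
	assert(nr_thps <= (ULONG_MAX >> (TOY_PAGE_SHIFT - 10)) /
			  TOY_HPAGE_PMD_NR);
	return toy_pages_to_kb(nr_thps * TOY_HPAGE_PMD_NR);
}
```

Under these assumptions a counter value of 3 shows up as 6144 kB.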
* [PATCH v10 5/7] khugepaged: rename collapse_shmem() and khugepaged_scan_shmem()
2019-08-01 18:42 [PATCH v10 0/7] Enable THP for text section of non-shmem files Song Liu
` (3 preceding siblings ...)
2019-08-01 18:42 ` [PATCH v10 4/7] mm,thp: stats for file backed THP Song Liu
@ 2019-08-01 18:42 ` Song Liu
[not found] ` <20190801184244.3169074-7-songliubraving@fb.com>
[not found] ` <20190801184244.3169074-8-songliubraving@fb.com>
6 siblings, 0 replies; 9+ messages in thread
From: Song Liu @ 2019-08-01 18:42 UTC (permalink / raw)
To: linux-mm, linux-fsdevel, linux-kernel
Cc: matthew.wilcox, kirill.shutemov, kernel-team, william.kucharski,
akpm, hdanton, Song Liu
The next patch adds khugepaged support for non-shmem files. This patch
renames these two functions to reflect the new functionality:
  collapse_shmem() => collapse_file()
  khugepaged_scan_shmem() => khugepaged_scan_file()
Acked-by: Rik van Riel <riel@surriel.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Song Liu <songliubraving@fb.com>
---
mm/khugepaged.c | 23 +++++++++++------------
1 file changed, 11 insertions(+), 12 deletions(-)
diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index b9949014346b..9d3cc2061960 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -1426,7 +1426,7 @@ static void retract_page_tables(struct address_space *mapping, pgoff_t pgoff)
}
/**
- * collapse_shmem - collapse small tmpfs/shmem pages into huge one.
+ * collapse_file - collapse small tmpfs/shmem pages into huge one.
*
* Basic scheme is simple, details are more complex:
* - allocate and lock a new huge page;
@@ -1443,10 +1443,11 @@ static void retract_page_tables(struct address_space *mapping, pgoff_t pgoff)
* + restore gaps in the page cache;
* + unlock and free huge page;
*/
-static void collapse_shmem(struct mm_struct *mm,
- struct address_space *mapping, pgoff_t start,
+static void collapse_file(struct mm_struct *mm,
+ struct file *file, pgoff_t start,
struct page **hpage, int node)
{
+ struct address_space *mapping = file->f_mapping;
gfp_t gfp;
struct page *new_page;
struct mem_cgroup *memcg;
@@ -1702,11 +1703,11 @@ static void collapse_shmem(struct mm_struct *mm,
/* TODO: tracepoints */
}
-static void khugepaged_scan_shmem(struct mm_struct *mm,
- struct address_space *mapping,
- pgoff_t start, struct page **hpage)
+static void khugepaged_scan_file(struct mm_struct *mm,
+ struct file *file, pgoff_t start, struct page **hpage)
{
struct page *page = NULL;
+ struct address_space *mapping = file->f_mapping;
XA_STATE(xas, &mapping->i_pages, start);
int present, swap;
int node = NUMA_NO_NODE;
@@ -1770,16 +1771,15 @@ static void khugepaged_scan_shmem(struct mm_struct *mm,
result = SCAN_EXCEED_NONE_PTE;
} else {
node = khugepaged_find_target_node();
- collapse_shmem(mm, mapping, start, hpage, node);
+ collapse_file(mm, file, start, hpage, node);
}
}
/* TODO: tracepoints */
}
#else
-static void khugepaged_scan_shmem(struct mm_struct *mm,
- struct address_space *mapping,
- pgoff_t start, struct page **hpage)
+static void khugepaged_scan_file(struct mm_struct *mm,
+ struct file *file, pgoff_t start, struct page **hpage)
{
BUILD_BUG();
}
@@ -1862,8 +1862,7 @@ static unsigned int khugepaged_scan_mm_slot(unsigned int pages,
file = get_file(vma->vm_file);
up_read(&mm->mmap_sem);
ret = 1;
- khugepaged_scan_shmem(mm, file->f_mapping,
- pgoff, hpage);
+ khugepaged_scan_file(mm, file, pgoff, hpage);
fput(file);
} else {
ret = khugepaged_scan_pmd(mm, vma,
--
2.17.1
[parent not found: <20190801184244.3169074-7-songliubraving@fb.com>]
* Re: [PATCH v10 6/7] mm,thp: add read-only THP support for (non-shmem) FS
[not found] ` <20190801184244.3169074-7-songliubraving@fb.com>
@ 2019-08-12 20:36 ` Johannes Weiner
0 siblings, 0 replies; 9+ messages in thread
From: Johannes Weiner @ 2019-08-12 20:36 UTC (permalink / raw)
To: Song Liu
Cc: linux-mm, linux-fsdevel, linux-kernel, matthew.wilcox,
kirill.shutemov, kernel-team, william.kucharski, akpm, hdanton
On Thu, Aug 01, 2019 at 11:42:43AM -0700, Song Liu wrote:
> This patch is (hopefully) the first step to enable THP for non-shmem
> filesystems.
>
> This patch enables an application to put part of its text section in THP
> via madvise, for example:
>
> madvise((void *)0x600000, 0x200000, MADV_HUGEPAGE);
>
> We tried to reuse the logic for THP on tmpfs.
>
> Currently, write is not supported for non-shmem THP. khugepaged will only
> process vma with VM_DENYWRITE. sys_mmap() ignores VM_DENYWRITE requests
> (see ksys_mmap_pgoff). The only way to create vma with VM_DENYWRITE is
> execve(). This requirement limits non-shmem THP to text sections.
>
> The next patch will handle writes, which would only happen when all
> the vmas with VM_DENYWRITE are unmapped.
>
> An EXPERIMENTAL config, READ_ONLY_THP_FOR_FS, is added to gate this
> feature.
>
> Cc: Johannes Weiner <hannes@cmpxchg.org>
> Acked-by: Rik van Riel <riel@surriel.com>
> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> Signed-off-by: Song Liu <songliubraving@fb.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
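
For anyone who wants to try the madvise() call quoted above from
userspace, a minimal sketch follows. It assumes 2 MiB PMD-sized huge
pages and that only huge-page-aligned extents of the range are
candidates for collapse; toy_hpage_subrange() and toy_advise_hugepage()
are hypothetical helper names, and whether khugepaged actually collapses
the range depends on the kernel configuration.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

/* Assumed huge page size: 2 MiB. */
#define TOY_HPAGE_SIZE	((uintptr_t)2 << 20)

/*
 * Shrink [start, start + len) to its largest huge-page-aligned
 * subrange. Stores the aligned start and returns the subrange length,
 * or 0 if the range contains no whole aligned huge page.
 */
static size_t toy_hpage_subrange(uintptr_t start, size_t len,
				 uintptr_t *aligned_start)
{
	uintptr_t first = (start + TOY_HPAGE_SIZE - 1) & ~(TOY_HPAGE_SIZE - 1);
	uintptr_t last = (start + len) & ~(TOY_HPAGE_SIZE - 1);

	assert(aligned_start != NULL);
	*aligned_start = first;
	return last > first ? (size_t)(last - first) : 0;
}

/* Advise the aligned part of the range; returns madvise()'s result. */
static int toy_advise_hugepage(uintptr_t start, size_t len)
{
	uintptr_t aligned;
	size_t alen = toy_hpage_subrange(start, len, &aligned);

	if (alen == 0)
		return -1;
	return madvise((void *)aligned, alen, MADV_HUGEPAGE);
}
```

For the range in the quoted example (0x600000, length 0x200000) the
helper leaves the range unchanged, since it is already 2 MiB aligned.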
[parent not found: <20190801184244.3169074-8-songliubraving@fb.com>]
* Re: [PATCH v10 7/7] mm,thp: avoid writes to file with THP in pagecache
[not found] ` <20190801184244.3169074-8-songliubraving@fb.com>
@ 2019-08-12 20:38 ` Johannes Weiner
0 siblings, 0 replies; 9+ messages in thread
From: Johannes Weiner @ 2019-08-12 20:38 UTC (permalink / raw)
To: Song Liu
Cc: linux-mm, linux-fsdevel, linux-kernel, matthew.wilcox,
kirill.shutemov, kernel-team, william.kucharski, akpm, hdanton
On Thu, Aug 01, 2019 at 11:42:44AM -0700, Song Liu wrote:
> In the previous patch, an application could put part of its text section in
> THP via madvise(). These THPs will be protected from writes while the
> application is still running (TXTBSY). However, after the application
> exits, the file is available for writes.
>
> This patch avoids writes to file THP by dropping page cache for the file
> when the file is open for write. A new counter nr_thps is added to struct
> address_space. In do_dentry_open(), if the file is open for write and
> nr_thps is non-zero, we drop page cache for the whole file.
>
> Cc: Johannes Weiner <hannes@cmpxchg.org>
> Reported-by: kbuild test robot <lkp@intel.com>
> Acked-by: Rik van Riel <riel@surriel.com>
> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> Signed-off-by: Song Liu <songliubraving@fb.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
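
The policy described in the quoted text can be modeled in a few lines of
userspace C. This is only a sketch of the described behavior, not the
code from the patch: struct toy_address_space, toy_drop_pagecache() and
toy_open() are hypothetical names, and the model assumes that dropping
the page cache also resets the THP counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model of a file's page cache state. */
struct toy_address_space {
	unsigned long nr_cached_pages;	/* pages currently cached */
	unsigned long nr_thps;		/* huge pages among them */
};

/* Drop everything cached for the file (assumed to clear nr_thps). */
static void toy_drop_pagecache(struct toy_address_space *mapping)
{
	mapping->nr_cached_pages = 0;
	mapping->nr_thps = 0;
}

/* Model of the open path: writers never see a cached THP. */
static void toy_open(struct toy_address_space *mapping, bool for_write)
{
	assert(mapping != NULL);
	if (for_write && mapping->nr_thps > 0)
		toy_drop_pagecache(mapping);
}
```

Opening for read leaves the cache alone in this model, and opening for
write only drops it when at least one THP is present.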