* [PATCHv5 04/28] mm, thp: adjust conditions when we can reuse the page on WP fault
@ 2015-04-23 21:03 ` Kirill A. Shutemov
0 siblings, 0 replies; 17+ messages in thread
From: Kirill A. Shutemov @ 2015-04-23 21:03 UTC (permalink / raw)
To: Andrew Morton, Andrea Arcangeli, Hugh Dickins
Cc: Dave Hansen, Mel Gorman, Rik van Riel, Vlastimil Babka,
Christoph Lameter, Naoya Horiguchi, Steve Capper,
Aneesh Kumar K.V, Johannes Weiner, Michal Hocko, Jerome Marchand,
Sasha Levin, linux-kernel, linux-mm, Kirill A. Shutemov
With the new refcounting we will be able to map the same compound page
with PTEs and PMDs. This requires adjusting the conditions under which
we can reuse the page on a write-protection fault.

For a PTE fault we can't reuse the page if it's part of a huge page.

For a PMD fault we can only reuse the page if nobody else maps the huge
page or any part of it. We could check page_mapcount() on each sub-page,
but that's expensive.

The cheaper way is to check that page_count() equals 1: every mapping
takes a page reference, so this guarantees that the PMD is the only
mapping.

This approach can give a false negative if somebody has pinned the page,
but that doesn't affect correctness.
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Tested-by: Sasha Levin <sasha.levin@oracle.com>
---
include/linux/swap.h | 3 ++-
mm/huge_memory.c | 12 +++++++++++-
mm/swapfile.c | 3 +++
3 files changed, 16 insertions(+), 2 deletions(-)
diff --git a/include/linux/swap.h b/include/linux/swap.h
index 0428e4c84e1d..17cdd6b9456b 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -524,7 +524,8 @@ static inline int page_swapcount(struct page *page)
return 0;
}
-#define reuse_swap_page(page) (page_mapcount(page) == 1)
+#define reuse_swap_page(page) \
+ (!PageTransCompound(page) && page_mapcount(page) == 1)
static inline int try_to_free_swap(struct page *page)
{
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 534f353e12bf..fd8af5b9917f 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1103,7 +1103,17 @@ int do_huge_pmd_wp_page(struct mm_struct *mm, struct vm_area_struct *vma,
page = pmd_page(orig_pmd);
VM_BUG_ON_PAGE(!PageCompound(page) || !PageHead(page), page);
- if (page_mapcount(page) == 1) {
+ /*
+ * We can only reuse the page if nobody else maps the huge page or any
+ * part of it. We could check page_mapcount() on each sub-page, but
+ * that's expensive.
+ * The cheaper way is to check that page_count() equals 1: every
+ * mapping takes a page reference, so this guarantees that the PMD
+ * is the only mapping.
+ * This can give a false negative if somebody has pinned the page, but
+ * that's fine.
+ */
+ if (page_mapcount(page) == 1 && page_count(page) == 1) {
pmd_t entry;
entry = pmd_mkyoung(orig_pmd);
entry = maybe_pmd_mkwrite(pmd_mkdirty(entry), vma);
diff --git a/mm/swapfile.c b/mm/swapfile.c
index 6dd365d1c488..3cd5f188b996 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -887,6 +887,9 @@ int reuse_swap_page(struct page *page)
VM_BUG_ON_PAGE(!PageLocked(page), page);
if (unlikely(PageKsm(page)))
return 0;
+ /* The page is part of THP and cannot be reused */
+ if (PageTransCompound(page))
+ return 0;
count = page_mapcount(page);
if (count <= 1 && PageSwapCache(page)) {
count += page_swapcount(page);
--
2.1.4
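
For readers following the reasoning in the changelog, here is a minimal userspace sketch of the two reuse checks. It is a toy model, not kernel code: struct model_thp, the model_* helpers and the value 512 (4K sub-pages in one 2M huge page, as on x86-64) are assumptions made for illustration. The only property it relies on is the one the changelog states, that every mapping takes a page reference.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy userspace model, not kernel code. All model_* names are hypothetical. */
#define MODEL_HPAGE_NR 512	/* assumed: 4K sub-pages in one 2M huge page */

struct model_thp {
	int refcount;				/* stands in for page_count() */
	int pmd_mapcount;			/* PMD mappings of the whole page */
	int pte_mapcount[MODEL_HPAGE_NR];	/* PTE mappings of each sub-page */
};

/* Every mapping takes a reference, as the changelog states. */
void model_map_pmd(struct model_thp *t)
{
	t->pmd_mapcount++;
	t->refcount++;
}

void model_map_pte(struct model_thp *t, int idx)
{
	assert(idx >= 0 && idx < MODEL_HPAGE_NR);
	t->pte_mapcount[idx]++;
	t->refcount++;
}

/* A pin takes a reference but adds no mapping. */
void model_pin(struct model_thp *t)
{
	t->refcount++;
}

/* Expensive check: walk every sub-page looking for another mapping. */
bool model_sole_mapping_slow(const struct model_thp *t)
{
	int i;

	if (t->pmd_mapcount != 1)
		return false;
	for (i = 0; i < MODEL_HPAGE_NR; i++)
		if (t->pte_mapcount[i] != 0)
			return false;
	return true;
}

/* Cheap check modelled on the PMD fault path: one mapping, one reference. */
bool model_can_reuse_pmd(const struct model_thp *t)
{
	return t->pmd_mapcount == 1 && t->refcount == 1;
}

/* PTE fault path: never reuse a page that is part of a huge page. */
bool model_can_reuse_pte(bool part_of_thp, int mapcount)
{
	return !part_of_thp && mapcount == 1;
}
```

With a single PMD mapping and no pins, the slow walk and the cheap check agree. After model_pin() the slow walk still reports a sole mapping while the cheap check refuses reuse; that is the false negative the changelog describes as harmless to correctness.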
^ permalink raw reply related [flat|nested] 17+ messages in thread
* Re: [PATCHv5 04/28] mm, thp: adjust conditions when we can reuse the page on WP fault
2015-04-23 21:03 ` Kirill A. Shutemov
(?)
@ 2015-04-29 15:54 ` Jerome Marchand
-1 siblings, 0 replies; 17+ messages in thread
From: Jerome Marchand @ 2015-04-29 15:54 UTC (permalink / raw)
To: Kirill A. Shutemov, Andrew Morton, Andrea Arcangeli, Hugh Dickins
Cc: Dave Hansen, Mel Gorman, Rik van Riel, Vlastimil Babka,
Christoph Lameter, Naoya Horiguchi, Steve Capper,
Aneesh Kumar K.V, Johannes Weiner, Michal Hocko, Sasha Levin,
linux-kernel, linux-mm
On 04/23/2015 11:03 PM, Kirill A. Shutemov wrote:
> With the new refcounting we will be able to map the same compound page
> with PTEs and PMDs. This requires adjusting the conditions under which
> we can reuse the page on a write-protection fault.
>
> For a PTE fault we can't reuse the page if it's part of a huge page.
>
> For a PMD fault we can only reuse the page if nobody else maps the huge
> page or any part of it. We could check page_mapcount() on each sub-page,
> but that's expensive.
>
> The cheaper way is to check that page_count() equals 1: every mapping
> takes a page reference, so this guarantees that the PMD is the only
> mapping.
>
> This approach can give a false negative if somebody has pinned the page,
> but that doesn't affect correctness.
>
> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> Tested-by: Sasha Levin <sasha.levin@oracle.com>
Acked-by: Jerome Marchand <jmarchan@redhat.com>
> ---
> include/linux/swap.h | 3 ++-
> mm/huge_memory.c | 12 +++++++++++-
> mm/swapfile.c | 3 +++
> 3 files changed, 16 insertions(+), 2 deletions(-)
>
> diff --git a/include/linux/swap.h b/include/linux/swap.h
> index 0428e4c84e1d..17cdd6b9456b 100644
> --- a/include/linux/swap.h
> +++ b/include/linux/swap.h
> @@ -524,7 +524,8 @@ static inline int page_swapcount(struct page *page)
> return 0;
> }
>
> -#define reuse_swap_page(page) (page_mapcount(page) == 1)
> +#define reuse_swap_page(page) \
> + (!PageTransCompound(page) && page_mapcount(page) == 1)
>
> static inline int try_to_free_swap(struct page *page)
> {
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index 534f353e12bf..fd8af5b9917f 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -1103,7 +1103,17 @@ int do_huge_pmd_wp_page(struct mm_struct *mm, struct vm_area_struct *vma,
>
> page = pmd_page(orig_pmd);
> VM_BUG_ON_PAGE(!PageCompound(page) || !PageHead(page), page);
> - if (page_mapcount(page) == 1) {
> + /*
> + * We can only reuse the page if nobody else maps the huge page or any
> + * part of it. We could check page_mapcount() on each sub-page, but
> + * that's expensive.
> + * The cheaper way is to check that page_count() equals 1: every
> + * mapping takes a page reference, so this guarantees that the PMD
> + * is the only mapping.
> + * This can give a false negative if somebody has pinned the page, but
> + * that's fine.
> + */
> + if (page_mapcount(page) == 1 && page_count(page) == 1) {
> pmd_t entry;
> entry = pmd_mkyoung(orig_pmd);
> entry = maybe_pmd_mkwrite(pmd_mkdirty(entry), vma);
> diff --git a/mm/swapfile.c b/mm/swapfile.c
> index 6dd365d1c488..3cd5f188b996 100644
> --- a/mm/swapfile.c
> +++ b/mm/swapfile.c
> @@ -887,6 +887,9 @@ int reuse_swap_page(struct page *page)
> VM_BUG_ON_PAGE(!PageLocked(page), page);
> if (unlikely(PageKsm(page)))
> return 0;
> + /* The page is part of THP and cannot be reused */
> + if (PageTransCompound(page))
> + return 0;
> count = page_mapcount(page);
> if (count <= 1 && PageSwapCache(page)) {
> count += page_swapcount(page);
>
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCHv5 04/28] mm, thp: adjust conditions when we can reuse the page on WP fault
2015-04-23 21:03 ` Kirill A. Shutemov
@ 2015-05-15 9:15 ` Vlastimil Babka
-1 siblings, 0 replies; 17+ messages in thread
From: Vlastimil Babka @ 2015-05-15 9:15 UTC (permalink / raw)
To: Kirill A. Shutemov, Andrew Morton, Andrea Arcangeli, Hugh Dickins
Cc: Dave Hansen, Mel Gorman, Rik van Riel, Christoph Lameter,
Naoya Horiguchi, Steve Capper, Aneesh Kumar K.V, Johannes Weiner,
Michal Hocko, Jerome Marchand, Sasha Levin, linux-kernel,
linux-mm
On 04/23/2015 11:03 PM, Kirill A. Shutemov wrote:
> With the new refcounting we will be able to map the same compound page
> with PTEs and PMDs. This requires adjusting the conditions under which
> we can reuse the page on a write-protection fault.
>
> For a PTE fault we can't reuse the page if it's part of a huge page.
>
> For a PMD fault we can only reuse the page if nobody else maps the huge
> page or any part of it. We could check page_mapcount() on each sub-page,
> but that's expensive.
>
> The cheaper way is to check that page_count() equals 1: every mapping
> takes a page reference, so this guarantees that the PMD is the only
> mapping.
>
> This approach can give a false negative if somebody has pinned the page,
> but that doesn't affect correctness.
>
> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> Tested-by: Sasha Levin <sasha.levin@oracle.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
So couldn't the same trick be used in Patch 1 to avoid counting
individual order-0 pages?
> ---
> include/linux/swap.h | 3 ++-
> mm/huge_memory.c | 12 +++++++++++-
> mm/swapfile.c | 3 +++
> 3 files changed, 16 insertions(+), 2 deletions(-)
>
> diff --git a/include/linux/swap.h b/include/linux/swap.h
> index 0428e4c84e1d..17cdd6b9456b 100644
> --- a/include/linux/swap.h
> +++ b/include/linux/swap.h
> @@ -524,7 +524,8 @@ static inline int page_swapcount(struct page *page)
> return 0;
> }
>
> -#define reuse_swap_page(page) (page_mapcount(page) == 1)
> +#define reuse_swap_page(page) \
> + (!PageTransCompound(page) && page_mapcount(page) == 1)
>
> static inline int try_to_free_swap(struct page *page)
> {
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index 534f353e12bf..fd8af5b9917f 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -1103,7 +1103,17 @@ int do_huge_pmd_wp_page(struct mm_struct *mm, struct vm_area_struct *vma,
>
> page = pmd_page(orig_pmd);
> VM_BUG_ON_PAGE(!PageCompound(page) || !PageHead(page), page);
> - if (page_mapcount(page) == 1) {
> + /*
> + * We can only reuse the page if nobody else maps the huge page or any
> + * part of it. We could check page_mapcount() on each sub-page, but
> + * that's expensive.
> + * The cheaper way is to check that page_count() equals 1: every
> + * mapping takes a page reference, so this guarantees that the PMD
> + * is the only mapping.
> + * This can give a false negative if somebody has pinned the page, but
> + * that's fine.
> + */
> + if (page_mapcount(page) == 1 && page_count(page) == 1) {
> pmd_t entry;
> entry = pmd_mkyoung(orig_pmd);
> entry = maybe_pmd_mkwrite(pmd_mkdirty(entry), vma);
> diff --git a/mm/swapfile.c b/mm/swapfile.c
> index 6dd365d1c488..3cd5f188b996 100644
> --- a/mm/swapfile.c
> +++ b/mm/swapfile.c
> @@ -887,6 +887,9 @@ int reuse_swap_page(struct page *page)
> VM_BUG_ON_PAGE(!PageLocked(page), page);
> if (unlikely(PageKsm(page)))
> return 0;
> + /* The page is part of THP and cannot be reused */
> + if (PageTransCompound(page))
> + return 0;
> count = page_mapcount(page);
> if (count <= 1 && PageSwapCache(page)) {
> count += page_swapcount(page);
>
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCHv5 04/28] mm, thp: adjust conditions when we can reuse the page on WP fault
2015-05-15 9:15 ` Vlastimil Babka
@ 2015-05-15 11:21 ` Kirill A. Shutemov
-1 siblings, 0 replies; 17+ messages in thread
From: Kirill A. Shutemov @ 2015-05-15 11:21 UTC (permalink / raw)
To: Vlastimil Babka
Cc: Kirill A. Shutemov, Andrew Morton, Andrea Arcangeli, Hugh Dickins,
Dave Hansen, Mel Gorman, Rik van Riel, Christoph Lameter,
Naoya Horiguchi, Steve Capper, Aneesh Kumar K.V, Johannes Weiner,
Michal Hocko, Jerome Marchand, Sasha Levin, linux-kernel,
linux-mm
On Fri, May 15, 2015 at 11:15:00AM +0200, Vlastimil Babka wrote:
> On 04/23/2015 11:03 PM, Kirill A. Shutemov wrote:
> >With the new refcounting we will be able to map the same compound page
> >with PTEs and PMDs. This requires adjusting the conditions under which
> >we can reuse the page on a write-protection fault.
> >
> >For a PTE fault we can't reuse the page if it's part of a huge page.
> >
> >For a PMD fault we can only reuse the page if nobody else maps the huge
> >page or any part of it. We could check page_mapcount() on each sub-page,
> >but that's expensive.
> >
> >The cheaper way is to check that page_count() equals 1: every mapping
> >takes a page reference, so this guarantees that the PMD is the only
> >mapping.
> >
> >This approach can give a false negative if somebody has pinned the page,
> >but that doesn't affect correctness.
> >
> >Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> >Tested-by: Sasha Levin <sasha.levin@oracle.com>
>
> Acked-by: Vlastimil Babka <vbabka@suse.cz>
>
> So couldn't the same trick be used in Patch 1 to avoid counting individual
> order-0 pages?
Hm. You're right, we could. But is smaps that performance sensitive to
bother?
--
Kirill A. Shutemov
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCHv5 04/28] mm, thp: adjust conditions when we can reuse the page on WP fault
2015-05-15 11:21 ` Kirill A. Shutemov
@ 2015-05-15 11:35 ` Vlastimil Babka
-1 siblings, 0 replies; 17+ messages in thread
From: Vlastimil Babka @ 2015-05-15 11:35 UTC (permalink / raw)
To: Kirill A. Shutemov
Cc: Kirill A. Shutemov, Andrew Morton, Andrea Arcangeli, Hugh Dickins,
Dave Hansen, Mel Gorman, Rik van Riel, Christoph Lameter,
Naoya Horiguchi, Steve Capper, Aneesh Kumar K.V, Johannes Weiner,
Michal Hocko, Jerome Marchand, Sasha Levin, linux-kernel,
linux-mm
On 05/15/2015 01:21 PM, Kirill A. Shutemov wrote:
> On Fri, May 15, 2015 at 11:15:00AM +0200, Vlastimil Babka wrote:
>> On 04/23/2015 11:03 PM, Kirill A. Shutemov wrote:
>>> With the new refcounting we will be able to map the same compound page
>>> with PTEs and PMDs. This requires adjusting the conditions under which
>>> we can reuse the page on a write-protection fault.
>>>
>>> For a PTE fault we can't reuse the page if it's part of a huge page.
>>>
>>> For a PMD fault we can only reuse the page if nobody else maps the huge
>>> page or any part of it. We could check page_mapcount() on each sub-page,
>>> but that's expensive.
>>>
>>> The cheaper way is to check that page_count() equals 1: every mapping
>>> takes a page reference, so this guarantees that the PMD is the only
>>> mapping.
>>>
>>> This approach can give a false negative if somebody has pinned the page,
>>> but that doesn't affect correctness.
>>>
>>> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
>>> Tested-by: Sasha Levin <sasha.levin@oracle.com>
>>
>> Acked-by: Vlastimil Babka <vbabka@suse.cz>
>>
>> So couldn't the same trick be used in Patch 1 to avoid counting individual
>> order-0 pages?
>
> Hm. You're right, we could. But is smaps that performance sensitive to
> bother?
Well, I was nudged to optimize it when doing the shmem swap accounting
changes there :) User may not care about the latency of obtaining the
smaps file contents, but since it has mmap_sem locked for that, the
process might care...
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCHv5 04/28] mm, thp: adjust conditions when we can reuse the page on WP fault
2015-05-15 11:35 ` Vlastimil Babka
@ 2015-05-15 13:29 ` Kirill A. Shutemov
-1 siblings, 0 replies; 17+ messages in thread
From: Kirill A. Shutemov @ 2015-05-15 13:29 UTC (permalink / raw)
To: Vlastimil Babka
Cc: Kirill A. Shutemov, Andrew Morton, Andrea Arcangeli, Hugh Dickins,
Dave Hansen, Mel Gorman, Rik van Riel, Christoph Lameter,
Naoya Horiguchi, Steve Capper, Aneesh Kumar K.V, Johannes Weiner,
Michal Hocko, Jerome Marchand, Sasha Levin, linux-kernel,
linux-mm
On Fri, May 15, 2015 at 01:35:49PM +0200, Vlastimil Babka wrote:
> On 05/15/2015 01:21 PM, Kirill A. Shutemov wrote:
> >On Fri, May 15, 2015 at 11:15:00AM +0200, Vlastimil Babka wrote:
> >>On 04/23/2015 11:03 PM, Kirill A. Shutemov wrote:
> >>>With the new refcounting we will be able to map the same compound page
> >>>with PTEs and PMDs. This requires adjusting the conditions under which
> >>>we can reuse the page on a write-protection fault.
> >>>
> >>>For a PTE fault we can't reuse the page if it's part of a huge page.
> >>>
> >>>For a PMD fault we can only reuse the page if nobody else maps the huge
> >>>page or any part of it. We could check page_mapcount() on each sub-page,
> >>>but that's expensive.
> >>>
> >>>The cheaper way is to check that page_count() equals 1: every mapping
> >>>takes a page reference, so this guarantees that the PMD is the only
> >>>mapping.
> >>>
> >>>This approach can give a false negative if somebody has pinned the page,
> >>>but that doesn't affect correctness.
> >>>
> >>>Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> >>>Tested-by: Sasha Levin <sasha.levin@oracle.com>
> >>
> >>Acked-by: Vlastimil Babka <vbabka@suse.cz>
> >>
> >>So couldn't the same trick be used in Patch 1 to avoid counting individual
> >>order-0 pages?
> >
> >Hm. You're right, we could. But is smaps that performance sensitive to
> >bother?
>
> Well, I was nudged to optimize it when doing the shmem swap accounting
> changes there :) User may not care about the latency of obtaining the smaps
> file contents, but since it has mmap_sem locked for that, the process might
> care...
Something like this?
diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index e04399e53965..5bc3d2b1176e 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -462,6 +462,19 @@ static void smaps_account(struct mem_size_stats *mss, struct page *page,
if (young || PageReferenced(page))
mss->referenced += size;
+ /*
+ * page_count(page) == 1 guarantees the page is mapped exactly once.
+ * If any subpage of the compound page were mapped with a PTE, it would
+ * elevate page_count().
+ */
+ if (page_count(page) == 1) {
+ if (dirty || PageDirty(page))
+ mss->private_dirty += size;
+ else
+ mss->private_clean += size;
+ return;
+ }
+
for (i = 0; i < nr; i++, page++) {
int mapcount = page_mapcount(page);
--
Kirill A. Shutemov
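
The fast path in the diff above can be tried out in a small userspace model. This is only a sketch under stated assumptions: struct model_mem_stats, model_smaps_account() and MODEL_PAGE_SIZE are hypothetical names, PSS accounting is left out, and the slow-path rule that a sub-page mapped two or more times counts as shared is an assumption about the loop this hunk sits in front of. The return value is the number of per-sub-page mapcounts inspected, which makes the saving visible.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_PAGE_SIZE 4096UL	/* assumed base page size */

/* Hypothetical stand-in for the few mem_size_stats fields used here. */
struct model_mem_stats {
	unsigned long private_dirty;
	unsigned long private_clean;
	unsigned long shared_dirty;
	unsigned long shared_clean;
};

/*
 * Account nr sub-pages. Returns how many per-sub-page mapcounts were
 * inspected: 0 on the fast path, nr on the slow path.
 */
int model_smaps_account(struct model_mem_stats *mss, int refcount,
			const int *mapcounts, int nr, bool dirty)
{
	unsigned long size = (unsigned long)nr * MODEL_PAGE_SIZE;
	int i;

	assert(nr > 0);

	/* Fast path: a single reference means a single mapping. */
	if (refcount == 1) {
		if (dirty)
			mss->private_dirty += size;
		else
			mss->private_clean += size;
		return 0;
	}

	/* Slow path: classify each sub-page by its own mapcount. */
	for (i = 0; i < nr; i++) {
		if (mapcounts[i] >= 2) {
			if (dirty)
				mss->shared_dirty += MODEL_PAGE_SIZE;
			else
				mss->shared_clean += MODEL_PAGE_SIZE;
		} else {
			if (dirty)
				mss->private_dirty += MODEL_PAGE_SIZE;
			else
				mss->private_clean += MODEL_PAGE_SIZE;
		}
	}
	return nr;
}
```

For a huge page with a reference count of 1 the model touches no per-sub-page counters at all, which is the point raised in the thread about time spent under mmap_sem.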
^ permalink raw reply related [flat|nested] 17+ messages in thread
* Re: [PATCHv5 04/28] mm, thp: adjust conditions when we can reuse the page on WP fault
2015-05-15 13:29 ` Kirill A. Shutemov
@ 2015-05-19 13:00 ` Vlastimil Babka
-1 siblings, 0 replies; 17+ messages in thread
From: Vlastimil Babka @ 2015-05-19 13:00 UTC (permalink / raw)
To: Kirill A. Shutemov
Cc: Kirill A. Shutemov, Andrew Morton, Andrea Arcangeli, Hugh Dickins,
Dave Hansen, Mel Gorman, Rik van Riel, Christoph Lameter,
Naoya Horiguchi, Steve Capper, Aneesh Kumar K.V, Johannes Weiner,
Michal Hocko, Jerome Marchand, Sasha Levin, linux-kernel,
linux-mm
On 05/15/2015 03:29 PM, Kirill A. Shutemov wrote:
> On Fri, May 15, 2015 at 01:35:49PM +0200, Vlastimil Babka wrote:
>> On 05/15/2015 01:21 PM, Kirill A. Shutemov wrote:
>>> On Fri, May 15, 2015 at 11:15:00AM +0200, Vlastimil Babka wrote:
>>>> On 04/23/2015 11:03 PM, Kirill A. Shutemov wrote:
>>>>> With the new refcounting we will be able to map the same compound page
>>>>> with PTEs and PMDs. This requires adjusting the conditions under which
>>>>> we can reuse the page on a write-protection fault.
>>>>>
>>>>> For a PTE fault we can't reuse the page if it's part of a huge page.
>>>>>
>>>>> For a PMD fault we can only reuse the page if nobody else maps the huge
>>>>> page or any part of it. We could check page_mapcount() on each sub-page,
>>>>> but that's expensive.
>>>>>
>>>>> The cheaper way is to check that page_count() equals 1: every mapping
>>>>> takes a page reference, so this guarantees that the PMD is the only
>>>>> mapping.
>>>>>
>>>>> This approach can give a false negative if somebody has pinned the page,
>>>>> but that doesn't affect correctness.
>>>>>
>>>>> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
>>>>> Tested-by: Sasha Levin <sasha.levin@oracle.com>
>>>>
>>>> Acked-by: Vlastimil Babka <vbabka@suse.cz>
>>>>
>>>> So couldn't the same trick be used in Patch 1 to avoid counting individual
>>>> order-0 pages?
>>>
>>> Hm. You're right, we could. But is smaps that performance sensitive to
>>> bother?
>>
>> Well, I was nudged to optimize it when doing the shmem swap accounting
>> changes there :) User may not care about the latency of obtaining the smaps
>> file contents, but since it has mmap_sem locked for that, the process might
>> care...
>
> Something like this?
Yeah, that should work.
>
> diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
> index e04399e53965..5bc3d2b1176e 100644
> --- a/fs/proc/task_mmu.c
> +++ b/fs/proc/task_mmu.c
> @@ -462,6 +462,19 @@ static void smaps_account(struct mem_size_stats *mss, struct page *page,
> if (young || PageReferenced(page))
> mss->referenced += size;
>
> + /*
> + * page_count(page) == 1 guarantees the page is mapped exactly once.
> + * If any subpage of the compound page were mapped with a PTE, it would
> + * elevate page_count().
> + */
> + if (page_count(page) == 1) {
> + if (dirty || PageDirty(page))
> + mss->private_dirty += size;
> + else
> + mss->private_clean += size;
> + return;
> + }
> +
> for (i = 0; i < nr; i++, page++) {
> int mapcount = page_mapcount(page);
>
>
^ permalink raw reply [flat|nested] 17+ messages in thread