* [PATCH 1/3] mm: add comment for __mod_zone_page_stat
From: Jianyu Zhan @ 2014-05-10 7:15 UTC
To: akpm, mgorman, cody, liuj97, zhangyanfei, srivatsa.bhat, dave,
iamjoonsoo.kim, n-horiguchi, kirill.shutemov, schwidefsky,
nasa4836, gorcunov, riel, cl, toshi.kani, paul.gortmaker
Cc: linux-mm, linux-kernel
__mod_zone_page_state() is not irq-safe, so it should be used carefully.
It is not appropriately documented at present. This patch adds a comment
for it, and also documents some of its call sites.
Suggested-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jianyu Zhan <nasa4836@gmail.com>
---
mm/page_alloc.c | 2 ++
mm/rmap.c | 6 ++++++
mm/vmstat.c | 16 +++++++++++++++-
3 files changed, 23 insertions(+), 1 deletion(-)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 5dba293..9d6f474 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -659,6 +659,8 @@ static inline int free_pages_check(struct page *page)
*
* And clear the zone's pages_scanned counter, to hold off the "all pages are
* pinned" detection logic.
+ *
+ * Note: this function should be used with irq disabled.
*/
static void free_pcppages_bulk(struct zone *zone, int count,
struct per_cpu_pages *pcp)
diff --git a/mm/rmap.c b/mm/rmap.c
index 9c3e773..6078a30 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -979,6 +979,8 @@ void page_add_anon_rmap(struct page *page,
/*
* Special version of the above for do_swap_page, which often runs
* into pages that are exclusively owned by the current process.
+ * So we can use the irq-unsafe version __{inc|mod}_zone_page_state
+ * here without others racing to change it in between.
* Everybody else should continue to use page_add_anon_rmap above.
*/
void do_page_add_anon_rmap(struct page *page,
@@ -1077,6 +1079,10 @@ void page_remove_rmap(struct page *page)
/*
* Hugepages are not counted in NR_ANON_PAGES nor NR_FILE_MAPPED
* and not charged by memcg for now.
+ *
+ * And we are the last user of this page, so it is safe to use
+ * the irq-unsafe version __{mod|dec}_zone_page_state here, since
+ * we have no racers.
*/
if (unlikely(PageHuge(page)))
goto out;
diff --git a/mm/vmstat.c b/mm/vmstat.c
index 302dd07..778f154 100644
--- a/mm/vmstat.c
+++ b/mm/vmstat.c
@@ -207,7 +207,21 @@ void set_pgdat_percpu_threshold(pg_data_t *pgdat,
}
/*
- * For use when we know that interrupts are disabled.
+ * Optimized modification function.
+ *
+ * The code basically does the modification in two steps:
+ *
+ * 1. read the current counter, based on the processor number
+ * 2. modify the counter and write it back.
+ *
+ * So this function should be used with the guarantee that
+ *
+ * 1. interrupts are disabled, or
+ * 2. interrupts are enabled, but no other site would race to
+ * modify this counter in between.
+ *
+ * Otherwise, the irq-safe version mod_zone_page_state() should
+ * be used instead.
*/
void __mod_zone_page_state(struct zone *zone, enum zone_stat_item item,
int delta)
--
2.0.0-rc1
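As an aside, the lost-update race that the new comment guards against is easy to see in a minimal sketch (illustrative userspace C, not kernel code; "counter" stands in for a per-cpu vm_stat_diff slot):

	long counter;

	void unsafe_add(long delta)
	{
		long x = counter;	/* step 1: read the current value */
		/* an interrupt firing here may itself update 'counter' */
		counter = x + delta;	/* step 2: write back -- any update
					 * made in between is silently lost */
	}

With interrupts disabled, or with a guarantee that no interrupt context touches this particular counter, the two steps cannot be torn apart by an update that matters.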
* [PATCH 2/3] mm: use a light-weight __mod_zone_page_state in mlocked_vma_newpage()
From: Jianyu Zhan @ 2014-05-10 7:17 UTC
To: akpm, riel, mgorman, nasa4836, fabf, cldu, sasha.levin, aarcange,
zhangyanfei, oleg, n-horiguchi, iamjoonsoo.kim, kirill.shutemov,
liwanp, gorcunov
Cc: linux-mm, linux-kernel
mlocked_vma_newpage() is called only in the fault path, by
page_add_new_anon_rmap(), which is called on a *new* page.
Such a page is initially only visible via the pagetables, and the
pte is locked while calling page_add_new_anon_rmap(), so we need not
use the irq-safe mod_zone_page_state() here; the light-weight version
__mod_zone_page_state() is OK.

And as suggested by Andrew, to reduce the risk that new call sites
use mlocked_vma_newpage() incorrectly without realizing they are
introducing races, this patch also moves it from internal.h to right
before its only call site, page_add_new_anon_rmap() in rmap.c, with
detailed documentation added.
Suggested-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jianyu Zhan <nasa4836@gmail.com>
---
mm/internal.h | 22 ++--------------------
mm/rmap.c | 24 ++++++++++++++++++++++++
2 files changed, 26 insertions(+), 20 deletions(-)
diff --git a/mm/internal.h b/mm/internal.h
index 07b6736..20abafb 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -183,26 +183,8 @@ static inline void munlock_vma_pages_all(struct vm_area_struct *vma)
munlock_vma_pages_range(vma, vma->vm_start, vma->vm_end);
}
-/*
- * Called only in fault path, to determine if a new page is being
- * mapped into a LOCKED vma. If it is, mark page as mlocked.
- */
-static inline int mlocked_vma_newpage(struct vm_area_struct *vma,
- struct page *page)
-{
- VM_BUG_ON_PAGE(PageLRU(page), page);
-
- if (likely((vma->vm_flags & (VM_LOCKED | VM_SPECIAL)) != VM_LOCKED))
- return 0;
-
- if (!TestSetPageMlocked(page)) {
- mod_zone_page_state(page_zone(page), NR_MLOCK,
- hpage_nr_pages(page));
- count_vm_event(UNEVICTABLE_PGMLOCKED);
- }
- return 1;
-}
-
+extern int mlocked_vma_newpage(struct vm_area_struct *vma,
+ struct page *page);
/*
* must be called with vma's mmap_sem held for read or write, and page locked.
*/
diff --git a/mm/rmap.c b/mm/rmap.c
index 6078a30..a9d02ef 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1005,6 +1005,30 @@ void do_page_add_anon_rmap(struct page *page,
__page_check_anon_rmap(page, vma, address);
}
+/*
+ * Called only in the fault path, to determine if a new page is
+ * being mapped into a LOCKED vma, and if so to mark it as mlocked.
+ * Its sole caller is page_add_new_anon_rmap(), which is called on a
+ * *new* page: such a page is initially only visible via the
+ * pagetables, and the pte is locked while calling
+ * page_add_new_anon_rmap(), so using the light-weight version
+ * __mod_zone_page_state() would be OK.
+ */
+int mlocked_vma_newpage(struct vm_area_struct *vma,
+ struct page *page)
+{
+ VM_BUG_ON_PAGE(PageLRU(page), page);
+ if (likely((vma->vm_flags & (VM_LOCKED | VM_SPECIAL)) != VM_LOCKED))
+ return 0;
+
+ if (!TestSetPageMlocked(page)) {
+ __mod_zone_page_state(page_zone(page), NR_MLOCK,
+ hpage_nr_pages(page));
+ count_vm_event(UNEVICTABLE_PGMLOCKED);
+ }
+ return 1;
+}
+
/**
* page_add_new_anon_rmap - add pte mapping to a new anonymous page
* @page: the page to add the mapping to
--
2.0.0-rc1
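For context, a condensed sketch of the fault-path caller that the commit message describes, paraphrased from mm/memory.c of that era (error handling, pte_none checks and other details elided; exact code varies by kernel version):

	static int do_anonymous_page(struct mm_struct *mm, struct vm_area_struct *vma,
			unsigned long address, pte_t *page_table, pmd_t *pmd,
			unsigned int flags)
	{
		spinlock_t *ptl;
		struct page *page;
		pte_t entry;

		page = alloc_zeroed_user_highpage_movable(vma, address);
		entry = mk_pte(page, vma->vm_page_prot);

		page_table = pte_offset_map_lock(mm, pmd, address, &ptl);
		/* pte lock held: the new page is not yet visible to anyone else */
		page_add_new_anon_rmap(page, vma, address);
		set_pte_at(mm, address, page_table, entry);

		pte_unmap_unlock(page_table, ptl);
		return 0;
	}

The pte lock is taken before page_add_new_anon_rmap() and dropped only after the pte is set; that window is what the commit message relies on.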
* Re: [PATCH 2/3] mm: use a light-weight __mod_zone_page_state in mlocked_vma_newpage()
From: Hugh Dickins @ 2014-05-10 20:16 UTC
To: Jianyu Zhan
Cc: akpm, riel, mgorman, fabf, cldu, sasha.levin, aarcange,
zhangyanfei, oleg, n-horiguchi, iamjoonsoo.kim, kirill.shutemov,
liwanp, gorcunov, linux-mm, linux-kernel
On Sat, 10 May 2014, Jianyu Zhan wrote:
> mlocked_vma_newpage() is called only in the fault path, by
> page_add_new_anon_rmap(), which is called on a *new* page.
> Such a page is initially only visible via the pagetables, and the
> pte is locked while calling page_add_new_anon_rmap(), so we need not
> use the irq-safe mod_zone_page_state() here; the light-weight version
> __mod_zone_page_state() is OK.
>
> And as suggested by Andrew, to reduce the risk that new call sites
> use mlocked_vma_newpage() incorrectly without realizing they are
> introducing races, this patch also moves it from internal.h to right
> before its only call site, page_add_new_anon_rmap() in rmap.c, with
> detailed documentation added.
>
> Suggested-by: Andrew Morton <akpm@linux-foundation.org>
> Signed-off-by: Jianyu Zhan <nasa4836@gmail.com>
I completely agree with Andrew's suggestion that you move the code
from mm/internal.h to its sole callsite in mm/rmap.c; but I much
prefer his "probably better ... open-coding its logic into
page_add_new_anon_rmap()".
That saves you from having to dream up a satisfactory alternative name,
and a lengthy comment, and lets everybody see just what's going on.
The function-in-internal.h thing dates from an interim in which,
running short of page flags, we were not confident that we wanted
to dedicate one to PageMlocked: not all configs had it and internal.h
was somewhere to hide the #ifdefs. Well, PageMlocked is there only
when CONFIG_MMU, but mm/rmap.c is only built for CONFIG_MMU anyway.
> ---
> mm/internal.h | 22 ++--------------------
> mm/rmap.c | 24 ++++++++++++++++++++++++
> 2 files changed, 26 insertions(+), 20 deletions(-)
>
> diff --git a/mm/internal.h b/mm/internal.h
> index 07b6736..20abafb 100644
> --- a/mm/internal.h
> +++ b/mm/internal.h
> @@ -183,26 +183,8 @@ static inline void munlock_vma_pages_all(struct vm_area_struct *vma)
> munlock_vma_pages_range(vma, vma->vm_start, vma->vm_end);
> }
>
> -/*
> - * Called only in fault path, to determine if a new page is being
> - * mapped into a LOCKED vma. If it is, mark page as mlocked.
> - */
> -static inline int mlocked_vma_newpage(struct vm_area_struct *vma,
> - struct page *page)
> -{
> - VM_BUG_ON_PAGE(PageLRU(page), page);
> -
> - if (likely((vma->vm_flags & (VM_LOCKED | VM_SPECIAL)) != VM_LOCKED))
> - return 0;
> -
> - if (!TestSetPageMlocked(page)) {
> - mod_zone_page_state(page_zone(page), NR_MLOCK,
So there we see mod_zone_page_state...
> - hpage_nr_pages(page));
> - count_vm_event(UNEVICTABLE_PGMLOCKED);
> - }
> - return 1;
> -}
> -
> +extern int mlocked_vma_newpage(struct vm_area_struct *vma,
> + struct page *page);
Why are you adding an extern declaration for it, when it's only
used from the one source file to which you are moving it?
Why are you not removing the !CONFIG_MMU declaration?
> /*
> * must be called with vma's mmap_sem held for read or write, and page locked.
> */
> diff --git a/mm/rmap.c b/mm/rmap.c
> index 6078a30..a9d02ef 100644
> --- a/mm/rmap.c
> +++ b/mm/rmap.c
> @@ -1005,6 +1005,30 @@ void do_page_add_anon_rmap(struct page *page,
> __page_check_anon_rmap(page, vma, address);
> }
>
> +/*
> + * Called only in the fault path, to determine if a new page is
> + * being mapped into a LOCKED vma, and if so to mark it as mlocked.
> + * Its sole caller is page_add_new_anon_rmap(), which is called on a
> + * *new* page: such a page is initially only visible via the
> + * pagetables, and the pte is locked while calling
> + * page_add_new_anon_rmap(), so using the light-weight version
> + * __mod_zone_page_state() would be OK.
No. See my remarks on 1/3, that's not it at all: it's that we do
not update the NR_MLOCK count from interrupt context (I hope: check).
> + */
> +int mlocked_vma_newpage(struct vm_area_struct *vma,
I would say static, except you should just be bringing the code
into its callsite.
> + struct page *page)
> +{
> + VM_BUG_ON_PAGE(PageLRU(page), page);
> + if (likely((vma->vm_flags & (VM_LOCKED | VM_SPECIAL)) != VM_LOCKED))
> + return 0;
> +
> + if (!TestSetPageMlocked(page)) {
> + __mod_zone_page_state(page_zone(page), NR_MLOCK,
And here appears __mod_zone_page_state: you have a patch for moving
a function from one place to another, and buried within that movement
you hide a "significant" change. No, don't do that.
Hugh
> + hpage_nr_pages(page));
> + count_vm_event(UNEVICTABLE_PGMLOCKED);
> + }
> + return 1;
> +}
> +
> /**
> * page_add_new_anon_rmap - add pte mapping to a new anonymous page
> * @page: the page to add the mapping to
> --
> 2.0.0-rc1
* [PATCH 3/3] mm: rename mlocked_vma_newpage to newpage_in_mlocked_vma
From: Jianyu Zhan @ 2014-05-10 7:18 UTC
To: akpm, riel, aarcange, nasa4836, fabf, zhangyanfei, sasha.levin,
mgorman, oleg, n-horiguchi, iamjoonsoo.kim, kirill.shutemov,
liwanp, gorcunov
Cc: linux-mm, linux-kernel
mlocked_vma_newpage is used to determine if a new page is mapped into
an *mlocked* vma. It is poorly named, so rename it to newpage_in_mlocked_vma.
Signed-off-by: Jianyu Zhan <nasa4836@gmail.com>
---
mm/internal.h | 4 ++--
mm/rmap.c | 4 ++--
2 files changed, 4 insertions(+), 4 deletions(-)
diff --git a/mm/internal.h b/mm/internal.h
index 20abafb..35efd79 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -183,7 +183,7 @@ static inline void munlock_vma_pages_all(struct vm_area_struct *vma)
munlock_vma_pages_range(vma, vma->vm_start, vma->vm_end);
}
-extern int mlocked_vma_newpage(struct vm_area_struct *vma,
+extern int newpage_in_mlocked_vma(struct vm_area_struct *vma,
struct page *page);
/*
* must be called with vma's mmap_sem held for read or write, and page locked.
@@ -227,7 +227,7 @@ extern unsigned long vma_address(struct page *page,
struct vm_area_struct *vma);
#endif
#else /* !CONFIG_MMU */
-static inline int mlocked_vma_newpage(struct vm_area_struct *v, struct page *p)
+static inline int newpage_in_mlocked_vma(struct vm_area_struct *v, struct page *p)
{
return 0;
}
diff --git a/mm/rmap.c b/mm/rmap.c
index a9d02ef..9ff6915 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1014,7 +1014,7 @@ void do_page_add_anon_rmap(struct page *page,
* page_add_new_anon_rmap(), so using the light-weight version
* __mod_zone_page_state() would be OK.
*/
-int mlocked_vma_newpage(struct vm_area_struct *vma,
+int newpage_in_mlocked_vma(struct vm_area_struct *vma,
struct page *page)
{
VM_BUG_ON_PAGE(PageLRU(page), page);
@@ -1050,7 +1050,7 @@ void page_add_new_anon_rmap(struct page *page,
__mod_zone_page_state(page_zone(page), NR_ANON_PAGES,
hpage_nr_pages(page));
__page_set_anon_rmap(page, vma, address, 1);
- if (!mlocked_vma_newpage(vma, page)) {
+ if (!newpage_in_mlocked_vma(vma, page)) {
SetPageActive(page);
lru_cache_add(page);
} else
--
2.0.0-rc1
* Re: [PATCH 3/3] mm: rename mlocked_vma_newpage to newpage_in_mlocked_vma
From: Hugh Dickins @ 2014-05-10 20:21 UTC
To: Jianyu Zhan
Cc: akpm, riel, aarcange, fabf, zhangyanfei, sasha.levin, mgorman,
oleg, n-horiguchi, iamjoonsoo.kim, kirill.shutemov, liwanp,
gorcunov, linux-mm, linux-kernel
On Sat, 10 May 2014, Jianyu Zhan wrote:
> mlocked_vma_newpage is used to determine if a new page is mapped into
> an *mlocked* vma. It is poorly named, so rename it to newpage_in_mlocked_vma.
>
> Signed-off-by: Jianyu Zhan <nasa4836@gmail.com>
newpage_in_mlocked_vma() is not as bad a name as mlocked_vma_newpage(),
but it's not especially good either (sounds like it answers a question,
not like it actually does something). Please just put it into its
sole caller and we don't need an obscuring name at all.
> ---
> mm/internal.h | 4 ++--
> mm/rmap.c | 4 ++--
> 2 files changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/mm/internal.h b/mm/internal.h
> index 20abafb..35efd79 100644
> --- a/mm/internal.h
> +++ b/mm/internal.h
> @@ -183,7 +183,7 @@ static inline void munlock_vma_pages_all(struct vm_area_struct *vma)
> munlock_vma_pages_range(vma, vma->vm_start, vma->vm_end);
> }
>
> -extern int mlocked_vma_newpage(struct vm_area_struct *vma,
> +extern int newpage_in_mlocked_vma(struct vm_area_struct *vma,
> struct page *page);
This should have vanished in the previous patch.
> /*
> * must be called with vma's mmap_sem held for read or write, and page locked.
> @@ -227,7 +227,7 @@ extern unsigned long vma_address(struct page *page,
> struct vm_area_struct *vma);
> #endif
> #else /* !CONFIG_MMU */
> -static inline int mlocked_vma_newpage(struct vm_area_struct *v, struct page *p)
> +static inline int newpage_in_mlocked_vma(struct vm_area_struct *v, struct page *p)
Ditto.
Hugh
> {
> return 0;
> }
> diff --git a/mm/rmap.c b/mm/rmap.c
> index a9d02ef..9ff6915 100644
> --- a/mm/rmap.c
> +++ b/mm/rmap.c
> @@ -1014,7 +1014,7 @@ void do_page_add_anon_rmap(struct page *page,
> * page_add_new_anon_rmap(), so using the light-weight version
> * __mod_zone_page_state() would be OK.
> */
> -int mlocked_vma_newpage(struct vm_area_struct *vma,
> +int newpage_in_mlocked_vma(struct vm_area_struct *vma,
> struct page *page)
> {
> VM_BUG_ON_PAGE(PageLRU(page), page);
> @@ -1050,7 +1050,7 @@ void page_add_new_anon_rmap(struct page *page,
> __mod_zone_page_state(page_zone(page), NR_ANON_PAGES,
> hpage_nr_pages(page));
> __page_set_anon_rmap(page, vma, address, 1);
> - if (!mlocked_vma_newpage(vma, page)) {
> + if (!newpage_in_mlocked_vma(vma, page)) {
> SetPageActive(page);
> lru_cache_add(page);
> } else
> --
> 2.0.0-rc1
* Re: [PATCH 1/3] mm: add comment for __mod_zone_page_stat
From: Hugh Dickins @ 2014-05-10 19:51 UTC
To: Jianyu Zhan
Cc: akpm, mgorman, cody, liuj97, zhangyanfei, srivatsa.bhat, dave,
iamjoonsoo.kim, n-horiguchi, kirill.shutemov, schwidefsky,
gorcunov, riel, cl, toshi.kani, paul.gortmaker, linux-mm,
linux-kernel
On Sat, 10 May 2014, Jianyu Zhan wrote:
> __mod_zone_page_state() is not irq-safe, so it should be used carefully.
> It is not appropriately documented at present. This patch adds a comment
> for it, and also documents some of its call sites.
>
> Suggested-by: Andrew Morton <akpm@linux-foundation.org>
> Signed-off-by: Jianyu Zhan <nasa4836@gmail.com>
Your original __mod_zone_page_state happened to be correct;
but you have no understanding of why it was correct, so its
comment was very wrong, even after you changed the irq wording.
This series just propagates your misunderstanding further,
while providing an object lesson in how not to present a series.
Sorry, you have quickly developed an unenviable reputation for
patches which waste developers' time: please consider your
patches much more carefully before posting them.
> ---
> mm/page_alloc.c | 2 ++
> mm/rmap.c | 6 ++++++
> mm/vmstat.c | 16 +++++++++++++++-
> 3 files changed, 23 insertions(+), 1 deletion(-)
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 5dba293..9d6f474 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -659,6 +659,8 @@ static inline int free_pages_check(struct page *page)
> *
> * And clear the zone's pages_scanned counter, to hold off the "all pages are
> * pinned" detection logic.
> + *
> + * Note: this function should be used with irq disabled.
Correct, but I don't see that that needed saying. This is a static
function which is being used as intended: just because it matched your
search for "__mod_zone_page_state" is not a reason for you to add that
comment; irq disabled is only one of its prerequisites.
> */
> static void free_pcppages_bulk(struct zone *zone, int count,
> struct per_cpu_pages *pcp)
> diff --git a/mm/rmap.c b/mm/rmap.c
> index 9c3e773..6078a30 100644
> --- a/mm/rmap.c
> +++ b/mm/rmap.c
> @@ -979,6 +979,8 @@ void page_add_anon_rmap(struct page *page,
> /*
> * Special version of the above for do_swap_page, which often runs
> * into pages that are exclusively owned by the current process.
> + * So we can use the irq-unsafe version __{inc|mod}_zone_page_state
> + * here without others racing to change it in between.
And yet you can immediately see them being used without any test
for "exclusive" below: why is that? Think about it.
> * Everybody else should continue to use page_add_anon_rmap above.
> */
> void do_page_add_anon_rmap(struct page *page,
> @@ -1077,6 +1079,10 @@ void page_remove_rmap(struct page *page)
> /*
> * Hugepages are not counted in NR_ANON_PAGES nor NR_FILE_MAPPED
> * and not charged by memcg for now.
> + *
> + * And we are the last user of this page, so it is safe to use
> + * the irq-unsafe version __{mod|dec}_zone_page_state here, since
> + * we have no racers.
Again, the code is correct to be using the irq-unsafe version, but your
comment is doubly wrong.
We are not necessarily the last user of this page, merely the one that
just now brought the mapcount down to 0.
But think: what bearing would being the last user of this page have on
the safety of using __mod_zone_page_state to adjust per-zone counters?
None at all. A page does not move from one zone to another (though its
contents might be migrated from one page to another when safe to do so).
Once upon a time, from 2.6.16 to 2.6.32, there was indeed a relevant
and helpful comment in __page_set_anon_rmap():
/*
* nr_mapped state can be updated without turning off
* interrupts because it is not modified via interrupt.
*/
__inc_page_state(nr_mapped);
The comment survived the replacement of nr_mapped, but eventually
it got cleaned away completely.
It is safe to use the irq-unsafe __mod_zone_page_state on counters
which are never modified via interrupt.
> */
> if (unlikely(PageHuge(page)))
> goto out;
> diff --git a/mm/vmstat.c b/mm/vmstat.c
> index 302dd07..778f154 100644
> --- a/mm/vmstat.c
> +++ b/mm/vmstat.c
> @@ -207,7 +207,21 @@ void set_pgdat_percpu_threshold(pg_data_t *pgdat,
> }
>
> /*
> - * For use when we know that interrupts are disabled.
> + * Optimized modification function.
> + *
> + * The code basically does the modification in two steps:
> + *
> + * 1. read the current counter, based on the processor number
> + * 2. modify the counter and write it back.
> + *
> + * So this function should be used with the guarantee that
> + *
> + * 1. interrupts are disabled, or
> + * 2. interrupts are enabled, but no other site would race to
> + * modify this counter in between.
> + *
> + * Otherwise, the irq-safe version mod_zone_page_state() should
> + * be used instead.
You are right that the comment is not good enough, but I don't trust
your version either. Since percpu variables are involved, it's important
that preemption be disabled too (see comment above __inc_zone_state).
I'd prefer to let Christoph write the definitive version,
but my first stab at it would be:
/*
* For use when we know that interrupts are disabled,
* or when we know that preemption is disabled and that
* particular counter cannot be updated from interrupt context.
*/
Hugh
> */
> void __mod_zone_page_state(struct zone *zone, enum zone_stat_item item,
> int delta)
> --
> 2.0.0-rc1
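For reference, the irq-safe variant Hugh contrasts with simply brackets the unsafe one with interrupt disabling; a sketch paraphrased from mm/vmstat.c of that era (details may differ by version):

	void mod_zone_page_state(struct zone *zone, enum zone_stat_item item,
				 int delta)
	{
		unsigned long flags;

		local_irq_save(flags);	/* fend off racing updates from irq context */
		__mod_zone_page_state(zone, item, delta);
		local_irq_restore(flags);
	}

This is why __mod_zone_page_state() is safe exactly when interrupts are already off, or when the particular counter is never touched from interrupt context and preemption is disabled so the per-cpu data stays ours.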
* Re: [PATCH 2/3] mm: use a light-weight __mod_zone_page_state in mlocked_vma_newpage()
From: Jianyu Zhan @ 2014-05-11 12:33 UTC
To: akpm, hughd, riel, aarcange, nasa4836, oleg, fabf, zhangyanfei,
mgorman, sasha.levin, cldu, n-horiguchi, iamjoonsoo.kim,
kirill.shutemov, xemul, gorcunov
Cc: linux-mm, linux-kernel
>I completely agree with Andrew's suggestion that you move the code
>from mm/internal.h to its sole callsite in mm/rmap.c; but I much
>prefer his "probably better ... open-coding its logic into
>page_add_new_anon_rmap()".
>
>That saves you from having to dream up a satisfactory alternative name,
>and a lengthy comment, and lets everybody see just what's going on.
Hi, thanks also for the detailed comments!

Yes, I agree. But I just saw that mlocked_vma_newpage() is used as a
test statement in page_add_new_anon_rmap(), like:

	if (!mlocked_vma_newpage())
		...
	else
		...

which reads quite clearly, so I thought it appropriate to keep it as
a function. But that's OK; I've folded it into page_add_new_anon_rmap()
in the new patch, see below.
>The function-in-internal.h thing dates from an interim in which,
>running short of page flags, we were not confident that we wanted
>to dedicate one to PageMlocked: not all configs had it and internal.h
>was somewhere to hide the #ifdefs. Well, PageMlocked is there only
>when CONFIG_MMU, but mm/rmap.c is only built for CONFIG_MMU anyway.
Thanks for telling me this, which could not be learned from the code.
-----<8-----
mm: fold mlocked_vma_newpage() into its only call site
In a previous commit ("mm: use the light version __mod_zone_page_state
in mlocked_vma_newpage()") an irq-unsafe __mod_zone_page_state is used.
And as suggested by Andrew, to reduce the risk that new call sites
use mlocked_vma_newpage() incorrectly without realizing they are
introducing races, this patch folds mlocked_vma_newpage() into its
only call site, page_add_new_anon_rmap, to make it open-coded.
Suggested-by: Andrew Morton <akpm@linux-foundation.org>
Suggested-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Jianyu Zhan <nasa4836@gmail.com>
---
mm/internal.h | 31 -------------------------------
mm/rmap.c | 22 +++++++++++++++++++---
2 files changed, 19 insertions(+), 34 deletions(-)
diff --git a/mm/internal.h b/mm/internal.h
index 7140c9b..29f3dc8 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -184,33 +184,6 @@ static inline void munlock_vma_pages_all(struct vm_area_struct *vma)
}
/*
- * Called only in fault path, to determine if a new page is being
- * mapped into a LOCKED vma. If it is, mark page as mlocked.
- */
-static inline int mlocked_vma_newpage(struct vm_area_struct *vma,
- struct page *page)
-{
- VM_BUG_ON_PAGE(PageLRU(page), page);
-
- if (likely((vma->vm_flags & (VM_LOCKED | VM_SPECIAL)) != VM_LOCKED))
- return 0;
-
- if (!TestSetPageMlocked(page)) {
- /*
- * We use the irq-unsafe __mod_zone_page_state because
- * 1. this counter is not modified in interrupt context, and
- * 2. the pte lock is held, and this is a new page, which is initially
- * only visible via the pagetables. So this would exclude
- * racy processes from preempting us and modifying it.
- */
- __mod_zone_page_state(page_zone(page), NR_MLOCK,
- hpage_nr_pages(page));
- count_vm_event(UNEVICTABLE_PGMLOCKED);
- }
- return 1;
-}
-
-/*
* must be called with vma's mmap_sem held for read or write, and page locked.
*/
extern void mlock_vma_page(struct page *page);
@@ -252,10 +225,6 @@ extern unsigned long vma_address(struct page *page,
struct vm_area_struct *vma);
#endif
#else /* !CONFIG_MMU */
-static inline int mlocked_vma_newpage(struct vm_area_struct *v, struct page *p)
-{
- return 0;
-}
static inline void clear_page_mlock(struct page *page) { }
static inline void mlock_vma_page(struct page *page) { }
static inline void mlock_migrate_page(struct page *new, struct page *old) { }
diff --git a/mm/rmap.c b/mm/rmap.c
index 0700253..b7bf67b 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1030,11 +1030,27 @@ void page_add_new_anon_rmap(struct page *page,
__mod_zone_page_state(page_zone(page), NR_ANON_PAGES,
hpage_nr_pages(page));
__page_set_anon_rmap(page, vma, address, 1);
- if (!mlocked_vma_newpage(vma, page)) {
+
+ VM_BUG_ON_PAGE(PageLRU(page), page);
+ if (likely((vma->vm_flags & (VM_LOCKED | VM_SPECIAL)) != VM_LOCKED)) {
SetPageActive(page);
lru_cache_add(page);
- } else
- add_page_to_unevictable_list(page);
+ return;
+ }
+
+ if (!TestSetPageMlocked(page)) {
+ /*
+ * We use the irq-unsafe __mod_zone_page_state because
+ * 1. this counter is not modified in interrupt context, and
+ * 2. the pte lock is held, and this is a new page, which is initially
+ * only visible via the pagetables. So this would exclude
+ * racy processes from preempting us and modifying it.
+ */
+ __mod_zone_page_state(page_zone(page), NR_MLOCK,
+ hpage_nr_pages(page));
+ count_vm_event(UNEVICTABLE_PGMLOCKED);
+ }
+ add_page_to_unevictable_list(page);
}
/**
--
2.0.0-rc1
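One detail worth noting in the folded code: TestSetPageMlocked() is an atomic test-and-set of the PG_mlocked page flag, so even if two paths raced here, only the one that actually flips the bit performs the NR_MLOCK accounting. As generated by the flag macros in include/linux/page-flags.h, it is roughly:

	static inline int TestSetPageMlocked(struct page *page)
	{
		/* atomically set PG_mlocked; return its previous value */
		return test_and_set_bit(PG_mlocked, &page->flags);
	}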
* Re: [PATCH 2/3] mm: use a light-weight __mod_zone_page_state in mlocked_vma_newpage()
From: Jianyu Zhan @ 2014-05-12 17:58 UTC
To: akpm, mgorman, nasa4836, sasha.levin, zhangyanfei, oleg, fabf,
cldu, iamjoonsoo.kim, n-horiguchi, kirill.shutemov, schwidefsky,
gorcunov
Cc: linux-mm, linux-kernel
Andrew, since the previous patch
[PATCH 1/3] mm: add comment for __mod_zone_page_stat
has been updated, please update this one accordingly.
-----<8-----
* Re: [PATCH 2/3] mm: use a light-weight __mod_zone_page_state in mlocked_vma_newpage()
From: Hugh Dickins @ 2014-05-14 4:12 UTC
To: Jianyu Zhan
Cc: akpm, mgorman, sasha.levin, zhangyanfei, oleg, fabf, cldu,
iamjoonsoo.kim, n-horiguchi, kirill.shutemov, schwidefsky,
gorcunov, linux-mm, linux-kernel
On Tue, 13 May 2014, Jianyu Zhan wrote:
> Andrew, since the previous patch
>
> [PATCH 1/3] mm: add comment for __mod_zone_page_stat
>
> has been updated, please update this one accordingly.
>
> -----<8-----
> From 9701fbdb3f9e7730b89780a5bf22effd1580cf35 Mon Sep 17 00:00:00 2001
> From: Jianyu Zhan <nasa4836@gmail.com>
> Date: Tue, 13 May 2014 01:48:01 +0800
> Subject: [PATCH] mm: fold mlocked_vma_newpage() into its only call site
>
> In a previous commit ("mm: use the light version __mod_zone_page_state
> in mlocked_vma_newpage()") an irq-unsafe __mod_zone_page_state is used.
> And as suggested by Andrew, to reduce the risk that new call sites
> use mlocked_vma_newpage() incorrectly without realizing they are
> introducing races, this patch folds mlocked_vma_newpage() into its
> only call site, page_add_new_anon_rmap, to make it open-coded so
> people can see what is going on.
>
> Suggested-by: Andrew Morton <akpm@linux-foundation.org>
> Suggested-by: Hugh Dickins <hughd@google.com>
> Signed-off-by: Jianyu Zhan <nasa4836@gmail.com>
Acked-by: Hugh Dickins <hughd@google.com>
Yes, much better, thanks: you made and commented the __ change in the
previous patch, and now you just do the move. I'd have probably moved
the VM_BUG_ON_PAGE(PageLRU,) up to the top of page_add_new_anon_rmap(),
where we already document some expectations on entry (or else removed it
completely, given that lru_cache_add() does the same); but that's a nit,
no need to make further change now.
> ---
> mm/internal.h | 29 -----------------------------
> mm/rmap.c | 20 +++++++++++++++++---
> 2 files changed, 17 insertions(+), 32 deletions(-)
>
> diff --git a/mm/internal.h b/mm/internal.h
> index d6a4868..29f3dc8 100644
> --- a/mm/internal.h
> +++ b/mm/internal.h
> @@ -184,31 +184,6 @@ static inline void munlock_vma_pages_all(struct vm_area_struct *vma)
> }
>
> /*
> - * Called only in fault path, to determine if a new page is being
> - * mapped into a LOCKED vma. If it is, mark page as mlocked.
> - */
> -static inline int mlocked_vma_newpage(struct vm_area_struct *vma,
> - struct page *page)
> -{
> - VM_BUG_ON_PAGE(PageLRU(page), page);
> -
> - if (likely((vma->vm_flags & (VM_LOCKED | VM_SPECIAL)) != VM_LOCKED))
> - return 0;
> -
> - if (!TestSetPageMlocked(page)) {
> - /*
> - * We use the irq-unsafe __mod_zone_page_state because
> - * this counter is not modified from interrupt context, and the
> - * pte lock is held (a spinlock), which implies preemption is disabled.
> - */
> - __mod_zone_page_state(page_zone(page), NR_MLOCK,
> - hpage_nr_pages(page));
> - count_vm_event(UNEVICTABLE_PGMLOCKED);
> - }
> - return 1;
> -}
> -
> -/*
> * must be called with vma's mmap_sem held for read or write, and page locked.
> */
> extern void mlock_vma_page(struct page *page);
> @@ -250,10 +225,6 @@ extern unsigned long vma_address(struct page *page,
> struct vm_area_struct *vma);
> #endif
> #else /* !CONFIG_MMU */
> -static inline int mlocked_vma_newpage(struct vm_area_struct *v, struct page *p)
> -{
> - return 0;
> -}
> static inline void clear_page_mlock(struct page *page) { }
> static inline void mlock_vma_page(struct page *page) { }
> static inline void mlock_migrate_page(struct page *new, struct page *old) { }
> diff --git a/mm/rmap.c b/mm/rmap.c
> index fa73194..386b78f 100644
> --- a/mm/rmap.c
> +++ b/mm/rmap.c
> @@ -1029,11 +1029,25 @@ void page_add_new_anon_rmap(struct page *page,
> __mod_zone_page_state(page_zone(page), NR_ANON_PAGES,
> hpage_nr_pages(page));
> __page_set_anon_rmap(page, vma, address, 1);
> - if (!mlocked_vma_newpage(vma, page)) {
> +
> + VM_BUG_ON_PAGE(PageLRU(page), page);
> + if (likely((vma->vm_flags & (VM_LOCKED | VM_SPECIAL)) != VM_LOCKED)) {
> SetPageActive(page);
> lru_cache_add(page);
> - } else
> - add_page_to_unevictable_list(page);
> + return;
> + }
> +
> + if (!TestSetPageMlocked(page)) {
> + /*
> + * We use the irq-unsafe __mod_zone_page_state because
> + * this counter is not modified from interrupt context, and the
> + * pte lock is held (a spinlock), which implies preemption is disabled.
> + */
> + __mod_zone_page_state(page_zone(page), NR_MLOCK,
> + hpage_nr_pages(page));
> + count_vm_event(UNEVICTABLE_PGMLOCKED);
> + }
> + add_page_to_unevictable_list(page);
> }
>
> /**
> --
> 2.0.0-rc1
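The comment's claim that holding the pte lock implies preemption is disabled follows from spin_lock() itself: on non-RT kernels of that era the core of the lock path looked roughly like this (paraphrased from include/linux/spinlock_api_smp.h):

	static inline void __raw_spin_lock(raw_spinlock_t *lock)
	{
		preempt_disable();	/* no preemption while the lock is held */
		spin_acquire(&lock->dep_map, 0, 0, _RET_IP_);
		LOCK_CONTENDED(lock, do_raw_spin_trylock, do_raw_spin_lock);
	}

So while the pte lock is held, nothing else runs on this CPU except interrupts, and NR_MLOCK is not modified from interrupt context: exactly the pair of conditions __mod_zone_page_state() needs.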