* [RFC] shared page table for hugetlbpage memory causing leak.
@ 2008-01-16 17:25 Larry Woodman
2008-01-16 18:54 ` Adam Litke
0 siblings, 1 reply; 6+ messages in thread
From: Larry Woodman @ 2008-01-16 17:25 UTC
To: linux-mm
[-- Attachment #1: Type: text/plain, Size: 1284 bytes --]
I think the shared page table code for hugetlb memory on x86 and x86_64
is causing a leak. When a process that uses hugepages through this code
exits, the system leaks some of the hugepages.
-------------------------------------------------------
Part of /proc/meminfo just before database startup:
HugePages_Total: 5500
HugePages_Free: 5500
HugePages_Rsvd: 0
Hugepagesize: 2048 kB
Just before shutdown:
HugePages_Total: 5500
HugePages_Free: 4475
HugePages_Rsvd: 0
Hugepagesize: 2048 kB
After shutdown:
HugePages_Total: 5500
HugePages_Free: 4988
HugePages_Rsvd: 0
Hugepagesize: 2048 kB
----------------------------------------------------------
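To put a number on the figures above: 5500 - 4988 = 512 hugepages never return to the free pool, which at 2048 kB each is 1048576 kB (1 GiB). The following is a minimal sketch of that arithmetic, assuming the /proc/meminfo excerpts are available as text. The helper names meminfo_field(), leaked_hugepages() and leaked_kb() are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: return the number following `key` (for example
 * "HugePages_Free:") in /proc/meminfo-style text, or -1 if the key is
 * missing or not followed by a number. */
static long meminfo_field(const char *text, const char *key)
{
    const char *p;
    long value;

    assert(text && key);
    p = strstr(text, key);
    if (!p)
        return -1;
    if (sscanf(p + strlen(key), "%ld", &value) != 1)
        return -1;
    return value;
}

/* Hugepages that were free before startup but are still missing after shutdown. */
static long leaked_hugepages(const char *before, const char *after)
{
    return meminfo_field(before, "HugePages_Free:") -
           meminfo_field(after, "HugePages_Free:");
}

/* The same leak expressed in kB, using the reported Hugepagesize. */
static long leaked_kb(const char *before, const char *after)
{
    return leaked_hugepages(before, after) *
           meminfo_field(after, "Hugepagesize:");
}
```

Feeding it the "before startup" and "after shutdown" excerpts reproduces the 512-page figure. The sketch does no error handling beyond the -1 return, so it is only meant for well-formed input like the excerpts quoted here.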
I think the problem occurs during a fork, in copy_hugetlb_page_range().
It locates the dst_pte using huge_pte_alloc(). Since huge_pte_alloc()
calls huge_pmd_share(), it will share the pmd page if it can, yet the main
loop in copy_hugetlb_page_range() still does a get_page() on every hugepage.
This is a violation of the shared hugepmd pagetable protocol and creates
additional references to the hugepages.
I think we can skip the entire replication of the ptes when the hugepage
pagetables are shared. This patch skips copying the ptes and the get_page()
calls if the hugetlbpage pagetable is shared.
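The reference-count imbalance described above can be illustrated with a small user-space model. This is a sketch under stated assumptions, not kernel code: the types and the model_* functions are hypothetical, and the exit path assumes that tearing down a mapping whose pmd page is still shared only drops the reference on the page-table page and does not put the individual hugepages.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NPTES 4 /* entries in the modelled pmd page; illustrative only */

struct hugepage { int refcount; };

struct pmd_page {
    int sharers;                 /* mms currently pointing at this pmd page */
    struct hugepage *pte[NPTES]; /* NULL means "no page mapped" */
};

/* Model of fork when the child ends up sharing the parent's pmd page.
 * With skip_when_shared == false this mimics the unpatched loop, which
 * takes a reference on every hugepage even though no new ptes exist. */
static void model_fork(struct pmd_page *pmd, bool skip_when_shared)
{
    pmd->sharers++; /* child reuses the same page-table page */
    for (int i = 0; i < NPTES; i++) {
        if (skip_when_shared)
            continue; /* src and dst slot are the same slot */
        if (pmd->pte[i])
            pmd->pte[i]->refcount++; /* the extra get_page() */
    }
}

/* Model of exit: a sharer only unshares; the last user puts each page. */
static void model_exit(struct pmd_page *pmd)
{
    if (pmd->sharers > 1) {
        pmd->sharers--;
        return;
    }
    pmd->sharers = 0;
    for (int i = 0; i < NPTES; i++) {
        if (pmd->pte[i]) {
            pmd->pte[i]->refcount--;
            pmd->pte[i] = NULL;
        }
    }
}

/* Parent maps NPTES pages, forks once, then both processes exit.
 * Returns how many pages still hold a reference afterwards. */
static int model_leaked_pages(bool skip_when_shared)
{
    struct hugepage pages[NPTES];
    struct pmd_page pmd = { .sharers = 1 };
    int leaked = 0;

    for (int i = 0; i < NPTES; i++) {
        pages[i].refcount = 1;
        pmd.pte[i] = &pages[i];
    }
    model_fork(&pmd, skip_when_shared);
    model_exit(&pmd); /* child */
    model_exit(&pmd); /* parent, last user */
    for (int i = 0; i < NPTES; i++)
        if (pages[i].refcount > 0)
            leaked++;
    assert(pmd.sharers == 0);
    return leaked;
}
```

In this model, model_leaked_pages(false) leaves every page with one stray reference, while model_leaked_pages(true) leaves none, which matches the intent of skipping the copy when the page table is shared.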
[-- Attachment #2: linux-shared.patch --]
[-- Type: text/plain, Size: 1178 bytes --]
--- linux-2.6.23/mm/hugetlb.c.orig 2008-01-16 12:05:41.496448000 -0500
+++ linux-2.6.23/mm/hugetlb.c 2008-01-16 12:09:57.184746000 -0500
@@ -377,18 +377,22 @@ int copy_hugetlb_page_range(struct mm_st
dst_pte = huge_pte_alloc(dst, addr);
if (!dst_pte)
goto nomem;
- spin_lock(&dst->page_table_lock);
- spin_lock(&src->page_table_lock);
- if (!pte_none(*src_pte)) {
- if (cow)
- ptep_set_wrprotect(src, addr, src_pte);
- entry = *src_pte;
- ptepage = pte_page(entry);
- get_page(ptepage);
- set_huge_pte_at(dst, addr, dst_pte, entry);
+
+ /* if hugetlbpage pagetables are shared, don't take additional references */
+ if (!(is_vm_hugetlb_page(vma) && dst_pte == src_pte)) {
+ spin_lock(&dst->page_table_lock);
+ spin_lock(&src->page_table_lock);
+ if (!pte_none(*src_pte)) {
+ if (cow)
+ ptep_set_wrprotect(src, addr, src_pte);
+ entry = *src_pte;
+ ptepage = pte_page(entry);
+ get_page(ptepage);
+ set_huge_pte_at(dst, addr, dst_pte, entry);
+ }
+ spin_unlock(&src->page_table_lock);
+ spin_unlock(&dst->page_table_lock);
}
- spin_unlock(&src->page_table_lock);
- spin_unlock(&dst->page_table_lock);
}
return 0;
* Re: [RFC] shared page table for hugetlbpage memory causing leak.
2008-01-16 17:25 [RFC] shared page table for hugetlbpage memory causing leak Larry Woodman
@ 2008-01-16 18:54 ` Adam Litke
2008-01-16 18:55 ` Larry Woodman
2008-01-17 10:19 ` Balbir Singh
0 siblings, 2 replies; 6+ messages in thread
From: Adam Litke @ 2008-01-16 18:54 UTC
To: Larry Woodman; +Cc: linux-mm
Since we know we are dealing with a hugetlb VMA, how about the
following, simpler, _untested_ patch:
Signed-off-by: Adam Litke <agl@us.ibm.com>
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 6f97821..75b0e4f 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -644,6 +644,11 @@ int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src,
dst_pte = huge_pte_alloc(dst, addr);
if (!dst_pte)
goto nomem;
+
+ /* If page table is shared do not copy or take references */
+ if (src_pte == dst_pte)
+ continue;
+
spin_lock(&dst->page_table_lock);
spin_lock(&src->page_table_lock);
if (!pte_none(*src_pte)) {
--
Adam Litke - (agl at us.ibm.com)
IBM Linux Technology Center
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
* Re: [RFC] shared page table for hugetlbpage memory causing leak.
2008-01-16 18:54 ` Adam Litke
@ 2008-01-16 18:55 ` Larry Woodman
2008-01-17 10:19 ` Balbir Singh
1 sibling, 0 replies; 6+ messages in thread
From: Larry Woodman @ 2008-01-16 18:55 UTC
To: Adam Litke; +Cc: linux-mm
Adam Litke wrote:
>Since we know we are dealing with a hugetlb VMA, how about the
>following, simpler, _untested_ patch:
>
>Signed-off-by: Adam Litke <agl@us.ibm.com>
>
>diff --git a/mm/hugetlb.c b/mm/hugetlb.c
>index 6f97821..75b0e4f 100644
>--- a/mm/hugetlb.c
>+++ b/mm/hugetlb.c
>@@ -644,6 +644,11 @@ int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src,
> dst_pte = huge_pte_alloc(dst, addr);
> if (!dst_pte)
> goto nomem;
>+
>+ /* If page table is shared do not copy or take references */
>+ if (src_pte == dst_pte)
>+ continue;
>+
> spin_lock(&dst->page_table_lock);
> spin_lock(&src->page_table_lock);
> if (!pte_none(*src_pte)) {
>
>
>
>
Agreed.
* Re: [RFC] shared page table for hugetlbpage memory causing leak.
2008-01-16 18:54 ` Adam Litke
2008-01-16 18:55 ` Larry Woodman
@ 2008-01-17 10:19 ` Balbir Singh
2008-01-17 11:53 ` Larry Woodman
1 sibling, 1 reply; 6+ messages in thread
From: Balbir Singh @ 2008-01-17 10:19 UTC
To: Adam Litke; +Cc: Larry Woodman, linux-mm
* Adam Litke <agl@us.ibm.com> [2008-01-16 12:54:28]:
> Since we know we are dealing with a hugetlb VMA, how about the
> following, simpler, _untested_ patch:
>
> Signed-off-by: Adam Litke <agl@us.ibm.com>
>
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index 6f97821..75b0e4f 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -644,6 +644,11 @@ int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src,
> dst_pte = huge_pte_alloc(dst, addr);
> if (!dst_pte)
> goto nomem;
> +
> + /* If page table is shared do not copy or take references */
> + if (src_pte == dst_pte)
> + continue;
> +
Shouldn't you be checking the PTE contents rather than the pointers?
Shouldn't the check be
if (unlikely(pte_same(*src_pte, *dst_pte)))
continue;
> spin_lock(&dst->page_table_lock);
> spin_lock(&src->page_table_lock);
> if (!pte_none(*src_pte)) {
>
--
Warm Regards,
Balbir Singh
Linux Technology Center
IBM, ISTL
* Re: [RFC] shared page table for hugetlbpage memory causing leak.
2008-01-17 10:19 ` Balbir Singh
@ 2008-01-17 11:53 ` Larry Woodman
2008-01-17 12:12 ` Balbir Singh
0 siblings, 1 reply; 6+ messages in thread
From: Larry Woodman @ 2008-01-17 11:53 UTC
To: balbir; +Cc: Adam Litke, linux-mm
On Thu, 2008-01-17 at 15:49 +0530, Balbir Singh wrote:
> * Adam Litke <agl@us.ibm.com> [2008-01-16 12:54:28]:
>
> > Since we know we are dealing with a hugetlb VMA, how about the
> > following, simpler, _untested_ patch:
> >
> > Signed-off-by: Adam Litke <agl@us.ibm.com>
> >
> > diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> > index 6f97821..75b0e4f 100644
> > --- a/mm/hugetlb.c
> > +++ b/mm/hugetlb.c
> > @@ -644,6 +644,11 @@ int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src,
> > dst_pte = huge_pte_alloc(dst, addr);
> > if (!dst_pte)
> > goto nomem;
> > +
> > + /* If page table is shared do not copy or take references */
> > + if (src_pte == dst_pte)
> > + continue;
> > +
>
> Shouldn't you be checking the PTE contents rather than the pointers?
No, this is checking for shared page tables, not shared pages.
> Shouldn't the check be
>
> if (unlikely(pte_same(*src_pte, *dst_pte)))
> continue;
>
>
> > spin_lock(&dst->page_table_lock);
> > spin_lock(&src->page_table_lock);
> > if (!pte_none(*src_pte)) {
> >
>
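The distinction between the two checks can be shown with a small stand-alone sketch. It assumes a simplified stand-in for pte_t; model_pte_t, model_pte_same() and model_table_shared() are hypothetical names for this illustration only. Comparing pointers asks whether src and dst refer to the same slot in the same page-table page, while comparing contents asks whether two slots hold the same entry, which can be true for two separate tables.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-in for a page table entry; not the kernel's pte_t. */
typedef struct { unsigned long val; } model_pte_t;

/* Content comparison, in the spirit of pte_same(): do both entries
 * hold the same value? */
static bool model_pte_same(model_pte_t a, model_pte_t b)
{
    return a.val == b.val;
}

/* Pointer comparison, as in the proposed patch: do both pointers name
 * the very same slot, meaning the page table itself is shared? */
static bool model_table_shared(const model_pte_t *src, const model_pte_t *dst)
{
    return src == dst;
}

/* Two distinct tables with identical entries: the contents match, but
 * the tables are not shared. A slot compared with itself is shared. */
static void model_check(void)
{
    model_pte_t parent_table[1] = { { 0x200000UL } };
    model_pte_t child_table[1] = { { 0x200000UL } };

    assert(model_pte_same(parent_table[0], child_table[0]));
    assert(!model_table_shared(&parent_table[0], &child_table[0]));
    assert(model_table_shared(&parent_table[0], &parent_table[0]));
}
```

Only the pointer comparison answers the question the patch cares about, namely whether huge_pmd_share() handed back the parent's own page-table page.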
* Re: [RFC] shared page table for hugetlbpage memory causing leak.
2008-01-17 11:53 ` Larry Woodman
@ 2008-01-17 12:12 ` Balbir Singh
0 siblings, 0 replies; 6+ messages in thread
From: Balbir Singh @ 2008-01-17 12:12 UTC
To: Larry Woodman; +Cc: Adam Litke, linux-mm
* Larry Woodman <lwoodman@redhat.com> [2008-01-17 06:53:38]:
> On Thu, 2008-01-17 at 15:49 +0530, Balbir Singh wrote:
> > * Adam Litke <agl@us.ibm.com> [2008-01-16 12:54:28]:
> >
> > > Since we know we are dealing with a hugetlb VMA, how about the
> > > following, simpler, _untested_ patch:
> > >
> > > Signed-off-by: Adam Litke <agl@us.ibm.com>
> > >
> > > diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> > > index 6f97821..75b0e4f 100644
> > > --- a/mm/hugetlb.c
> > > +++ b/mm/hugetlb.c
> > > @@ -644,6 +644,11 @@ int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src,
> > > dst_pte = huge_pte_alloc(dst, addr);
> > > if (!dst_pte)
> > > goto nomem;
> > > +
> > > + /* If page table is shared do not copy or take references */
> > > + if (src_pte == dst_pte)
> > > + continue;
> > > +
> >
> > Shouldn't you be checking the PTE contents rather than the pointers?
> No, this is checking for shared page tables, not shared pages.
Aah.. I see.
Thanks for clarifying!
--
Balbir Singh
Linux Technology Center
IBM, ISTL