* [PATCH] mm: hugetlb: undo change to page mapcount in fault handler
@ 2011-12-22 13:36 Hillf Danton
2011-12-22 16:36 ` Michal Hocko
0 siblings, 1 reply; 9+ messages in thread
From: Hillf Danton @ 2011-12-22 13:36 UTC (permalink / raw)
To: linux-mm; +Cc: LKML, Andrew Morton, Michal Hocko, KAMEZAWA Hiroyuki
Page mapcount is changed only when it is folded into page table entry.
Cc: Michal Hocko <mhocko@suse.cz>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Hillf Danton <dhillf@gmail.com>
---
--- a/mm/hugetlb.c Tue Dec 20 21:26:30 2011
+++ b/mm/hugetlb.c Thu Dec 22 21:29:42 2011
@@ -2509,6 +2509,7 @@ static int hugetlb_no_page(struct mm_str
{
struct hstate *h = hstate_vma(vma);
int ret = VM_FAULT_SIGBUS;
+ int anon_rmap = 0;
pgoff_t idx;
unsigned long size;
struct page *page;
@@ -2563,14 +2564,13 @@ retry:
spin_lock(&inode->i_lock);
inode->i_blocks += blocks_per_huge_page(h);
spin_unlock(&inode->i_lock);
- page_dup_rmap(page);
} else {
lock_page(page);
if (unlikely(anon_vma_prepare(vma))) {
ret = VM_FAULT_OOM;
goto backout_unlocked;
}
- hugepage_add_new_anon_rmap(page, vma, address);
+ anon_rmap = 1;
}
} else {
/*
@@ -2583,7 +2583,6 @@ retry:
VM_FAULT_SET_HINDEX(h - hstates);
goto backout_unlocked;
}
- page_dup_rmap(page);
}
/*
@@ -2607,6 +2606,10 @@ retry:
if (!huge_pte_none(huge_ptep_get(ptep)))
goto backout;
+ if (anon_rmap)
+ hugepage_add_new_anon_rmap(page, vma, address);
+ else
+ page_dup_rmap(page);
new_pte = make_huge_pte(vma, page, ((vma->vm_flags & VM_WRITE)
&& (vma->vm_flags & VM_SHARED)));
set_huge_pte_at(mm, address, ptep, new_pte);
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] mm: hugetlb: undo change to page mapcount in fault handler
2011-12-22 13:36 [PATCH] mm: hugetlb: undo change to page mapcount in fault handler Hillf Danton
@ 2011-12-22 16:36 ` Michal Hocko
2011-12-23 13:00 ` Hillf Danton
0 siblings, 1 reply; 9+ messages in thread
From: Michal Hocko @ 2011-12-22 16:36 UTC (permalink / raw)
To: Hillf Danton; +Cc: linux-mm, LKML, Andrew Morton, KAMEZAWA Hiroyuki
On Thu 22-12-11 21:36:34, Hillf Danton wrote:
> Page mapcount is changed only when it is folded into page table entry.
The changelog is rather cryptic. What about something like:
Page mapcount should be updated only if we are sure that the page ends
up in the page table otherwise we would leak if we couldn't COW due to
reservations or if idx is out of bounds.
The patch itself looks correct.
>
> Cc: Michal Hocko <mhocko@suse.cz>
> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Signed-off-by: Hillf Danton <dhillf@gmail.com>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Thanks
> ---
>
> --- a/mm/hugetlb.c Tue Dec 20 21:26:30 2011
> +++ b/mm/hugetlb.c Thu Dec 22 21:29:42 2011
> @@ -2509,6 +2509,7 @@ static int hugetlb_no_page(struct mm_str
> {
> struct hstate *h = hstate_vma(vma);
> int ret = VM_FAULT_SIGBUS;
> + int anon_rmap = 0;
> pgoff_t idx;
> unsigned long size;
> struct page *page;
> @@ -2563,14 +2564,13 @@ retry:
> spin_lock(&inode->i_lock);
> inode->i_blocks += blocks_per_huge_page(h);
> spin_unlock(&inode->i_lock);
> - page_dup_rmap(page);
> } else {
> lock_page(page);
> if (unlikely(anon_vma_prepare(vma))) {
> ret = VM_FAULT_OOM;
> goto backout_unlocked;
> }
> - hugepage_add_new_anon_rmap(page, vma, address);
> + anon_rmap = 1;
> }
> } else {
> /*
> @@ -2583,7 +2583,6 @@ retry:
> VM_FAULT_SET_HINDEX(h - hstates);
> goto backout_unlocked;
> }
> - page_dup_rmap(page);
> }
>
> /*
> @@ -2607,6 +2606,10 @@ retry:
> if (!huge_pte_none(huge_ptep_get(ptep)))
> goto backout;
>
> + if (anon_rmap)
> + hugepage_add_new_anon_rmap(page, vma, address);
> + else
> + page_dup_rmap(page);
> new_pte = make_huge_pte(vma, page, ((vma->vm_flags & VM_WRITE)
> && (vma->vm_flags & VM_SHARED)));
> set_huge_pte_at(mm, address, ptep, new_pte);
--
Michal Hocko
SUSE Labs
SUSE LINUX s.r.o.
Lihovarska 1060/12
190 00 Praha 9
Czech Republic
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] mm: hugetlb: undo change to page mapcount in fault handler
2011-12-22 16:36 ` Michal Hocko
@ 2011-12-23 13:00 ` Hillf Danton
2011-12-26 7:00 ` KAMEZAWA Hiroyuki
2012-01-04 23:16 ` Andrew Morton
0 siblings, 2 replies; 9+ messages in thread
From: Hillf Danton @ 2011-12-23 13:00 UTC (permalink / raw)
To: Michal Hocko; +Cc: linux-mm, LKML, Andrew Morton, KAMEZAWA Hiroyuki
On Fri, Dec 23, 2011 at 12:36 AM, Michal Hocko <mhocko@suse.cz> wrote:
>
> The changelog is rather cryptic. What about something like:
>
It is included in the following version, thanks.
===CUT HERE===
From: Hillf Danton <dhillf@gmail.com>
Subject: [PATCH] mm: hugetlb: undo change to page mapcount in fault handler
Page mapcount should be updated only if we are sure that the page ends
up in the page table otherwise we would leak if we couldn't COW due to
reservations or if idx is out of bounds.
Signed-off-by: Hillf Danton <dhillf@gmail.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
---
--- a/mm/hugetlb.c Tue Dec 20 21:26:30 2011
+++ b/mm/hugetlb.c Thu Dec 22 21:29:42 2011
@@ -2509,6 +2509,7 @@ static int hugetlb_no_page(struct mm_str
{
struct hstate *h = hstate_vma(vma);
int ret = VM_FAULT_SIGBUS;
+ int anon_rmap = 0;
pgoff_t idx;
unsigned long size;
struct page *page;
@@ -2563,14 +2564,13 @@ retry:
spin_lock(&inode->i_lock);
inode->i_blocks += blocks_per_huge_page(h);
spin_unlock(&inode->i_lock);
- page_dup_rmap(page);
} else {
lock_page(page);
if (unlikely(anon_vma_prepare(vma))) {
ret = VM_FAULT_OOM;
goto backout_unlocked;
}
- hugepage_add_new_anon_rmap(page, vma, address);
+ anon_rmap = 1;
}
} else {
/*
@@ -2583,7 +2583,6 @@ retry:
VM_FAULT_SET_HINDEX(h - hstates);
goto backout_unlocked;
}
- page_dup_rmap(page);
}
/*
@@ -2607,6 +2606,10 @@ retry:
if (!huge_pte_none(huge_ptep_get(ptep)))
goto backout;
+ if (anon_rmap)
+ hugepage_add_new_anon_rmap(page, vma, address);
+ else
+ page_dup_rmap(page);
new_pte = make_huge_pte(vma, page, ((vma->vm_flags & VM_WRITE)
&& (vma->vm_flags & VM_SHARED)));
set_huge_pte_at(mm, address, ptep, new_pte);
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] mm: hugetlb: undo change to page mapcount in fault handler
2011-12-23 13:00 ` Hillf Danton
@ 2011-12-26 7:00 ` KAMEZAWA Hiroyuki
2012-01-04 23:16 ` Andrew Morton
1 sibling, 0 replies; 9+ messages in thread
From: KAMEZAWA Hiroyuki @ 2011-12-26 7:00 UTC (permalink / raw)
To: Hillf Danton; +Cc: Michal Hocko, linux-mm, LKML, Andrew Morton
On Fri, 23 Dec 2011 21:00:41 +0800
Hillf Danton <dhillf@gmail.com> wrote:
> On Fri, Dec 23, 2011 at 12:36 AM, Michal Hocko <mhocko@suse.cz> wrote:
> >
> > The changelog is rather cryptic. What about something like:
> >
>
> It is included in the following version, thanks.
>
> ===CUT HERE===
> From: Hillf Danton <dhillf@gmail.com>
> Subject: [PATCH] mm: hugetlb: undo change to page mapcount in fault handler
>
> Page mapcount should be updated only if we are sure that the page ends
> up in the page table otherwise we would leak if we couldn't COW due to
> reservations or if idx is out of bounds.
>
> Signed-off-by: Hillf Danton <dhillf@gmail.com>
> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Reviewed-by: Michal Hocko <mhocko@suse.cz>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] mm: hugetlb: undo change to page mapcount in fault handler
2011-12-23 13:00 ` Hillf Danton
2011-12-26 7:00 ` KAMEZAWA Hiroyuki
@ 2012-01-04 23:16 ` Andrew Morton
2012-01-10 20:45 ` Andrew Morton
2012-01-11 12:06 ` Hillf Danton
1 sibling, 2 replies; 9+ messages in thread
From: Andrew Morton @ 2012-01-04 23:16 UTC (permalink / raw)
To: Hillf Danton; +Cc: Michal Hocko, linux-mm, LKML, KAMEZAWA Hiroyuki
On Fri, 23 Dec 2011 21:00:41 +0800
Hillf Danton <dhillf@gmail.com> wrote:
> Page mapcount should be updated only if we are sure that the page ends
> up in the page table otherwise we would leak if we couldn't COW due to
> reservations or if idx is out of bounds.
It would be much nicer if we could run vma_needs_reservation() before
even looking up or allocating the page.
And afaict the interface is set up to do that: you run
vma_needs_reservation() before allocating the page and then
vma_commit_reservation() afterwards.
But hugetlb_no_page() and hugetlb_fault() appear to have forgotten to
run vma_commit_reservation() altogether. Why isn't this as busted as
it appears to be?
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] mm: hugetlb: undo change to page mapcount in fault handler
2012-01-04 23:16 ` Andrew Morton
@ 2012-01-10 20:45 ` Andrew Morton
2012-01-11 12:06 ` Hillf Danton
1 sibling, 0 replies; 9+ messages in thread
From: Andrew Morton @ 2012-01-10 20:45 UTC (permalink / raw)
To: Hillf Danton, Michal Hocko, linux-mm, LKML, KAMEZAWA Hiroyuki
On Wed, 4 Jan 2012 15:16:32 -0800
Andrew Morton <akpm@linux-foundation.org> wrote:
> On Fri, 23 Dec 2011 21:00:41 +0800
> Hillf Danton <dhillf@gmail.com> wrote:
>
> > Page mapcount should be updated only if we are sure that the page ends
> > up in the page table otherwise we would leak if we couldn't COW due to
> > reservations or if idx is out of bounds.
>
> It would be much nicer if we could run vma_needs_reservation() before
> even looking up or allocating the page.
>
> And afaict the interface is set up to do that: you run
> vma_needs_reservation() before allocating the page and then
> vma_commit_reservation() afterwards.
>
> But hugetlb_no_page() and hugetlb_fault() appear to have forgotten to
> run vma_commit_reservation() altogether. Why isn't this as busted as
> it appears to be?
ping?
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] mm: hugetlb: undo change to page mapcount in fault handler
2012-01-04 23:16 ` Andrew Morton
2012-01-10 20:45 ` Andrew Morton
@ 2012-01-11 12:06 ` Hillf Danton
2012-01-13 23:39 ` Andrew Morton
1 sibling, 1 reply; 9+ messages in thread
From: Hillf Danton @ 2012-01-11 12:06 UTC (permalink / raw)
To: Andrew Morton; +Cc: Michal Hocko, linux-mm, LKML, KAMEZAWA Hiroyuki
On Thu, Jan 5, 2012 at 7:16 AM, Andrew Morton <akpm@linux-foundation.org> wrote:
> On Fri, 23 Dec 2011 21:00:41 +0800
> Hillf Danton <dhillf@gmail.com> wrote:
>
>> Page mapcount should be updated only if we are sure that the page ends
>> up in the page table otherwise we would leak if we couldn't COW due to
>> reservations or if idx is out of bounds.
>
> It would be much nicer if we could run vma_needs_reservation() before
> even looking up or allocating the page.
>
> And afaict the interface is set up to do that: you run
> vma_needs_reservation() before allocating the page and then
> vma_commit_reservation() afterwards.
>
> But hugetlb_no_page() and hugetlb_fault() appear to have forgotten to
> run vma_commit_reservation() altogether. Why isn't this as busted as
> it appears to be?
Hi Andrew
IIUC the two operations, vma_{needs, commit}_reservation, are folded in
alloc_huge_page(), need to break the pair?
Thanks
Hillf
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] mm: hugetlb: undo change to page mapcount in fault handler
2012-01-11 12:06 ` Hillf Danton
@ 2012-01-13 23:39 ` Andrew Morton
2012-01-14 5:27 ` Hillf Danton
0 siblings, 1 reply; 9+ messages in thread
From: Andrew Morton @ 2012-01-13 23:39 UTC (permalink / raw)
To: Hillf Danton; +Cc: Michal Hocko, linux-mm, LKML, KAMEZAWA Hiroyuki
On Wed, 11 Jan 2012 20:06:30 +0800
Hillf Danton <dhillf@gmail.com> wrote:
> On Thu, Jan 5, 2012 at 7:16 AM, Andrew Morton <akpm@linux-foundation.org> wrote:
> > On Fri, 23 Dec 2011 21:00:41 +0800
> > Hillf Danton <dhillf@gmail.com> wrote:
> >
> >> Page mapcount should be updated only if we are sure that the page ends
> >> up in the page table otherwise we would leak if we couldn't COW due to
> >> reservations or if idx is out of bounds.
> >
> > It would be much nicer if we could run vma_needs_reservation() before
> > even looking up or allocating the page.
> >
> > And afaict the interface is set up to do that: you run
> > vma_needs_reservation() before allocating the page and then
> > vma_commit_reservation() afterwards.
> >
> > But hugetlb_no_page() and hugetlb_fault() appear to have forgotten to
> > run vma_commit_reservation() altogether. __Why isn't this as busted as
> > it appears to be?
>
> Hi Andrew
>
> IIUC the two operations, vma_{needs, commit}_reservation, are folded in
> alloc_huge_page(), need to break the pair?
Looking at it again, it appears that the vma_needs_reservation() calls
are used to predict whether a subsequent COW attempt is going to fail.
If that's correct then things aren't as bad as I first thought.
However I suspect the code in hugetlb_no_page() is a bit racy: the
vma_needs_reservation() call should happen after we've taken
page_table_lock. As things stand, another thread could sneak in there
and steal the reservation which this thread thought was safe.
What do you think?
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] mm: hugetlb: undo change to page mapcount in fault handler
2012-01-13 23:39 ` Andrew Morton
@ 2012-01-14 5:27 ` Hillf Danton
0 siblings, 0 replies; 9+ messages in thread
From: Hillf Danton @ 2012-01-14 5:27 UTC (permalink / raw)
To: Andrew Morton; +Cc: Michal Hocko, linux-mm, LKML, KAMEZAWA Hiroyuki
On Sat, Jan 14, 2012 at 7:39 AM, Andrew Morton
<akpm@linux-foundation.org> wrote:
> On Wed, 11 Jan 2012 20:06:30 +0800
> Hillf Danton <dhillf@gmail.com> wrote:
>
>> On Thu, Jan 5, 2012 at 7:16 AM, Andrew Morton <akpm@linux-foundation.org> wrote:
>> > On Fri, 23 Dec 2011 21:00:41 +0800
>> > Hillf Danton <dhillf@gmail.com> wrote:
>> >
>> >> Page mapcount should be updated only if we are sure that the page ends
>> >> up in the page table otherwise we would leak if we couldn't COW due to
>> >> reservations or if idx is out of bounds.
>> >
>> > It would be much nicer if we could run vma_needs_reservation() before
>> > even looking up or allocating the page.
>> >
>> > And afaict the interface is set up to do that: you run
>> > vma_needs_reservation() before allocating the page and then
>> > vma_commit_reservation() afterwards.
>> >
>> > But hugetlb_no_page() and hugetlb_fault() appear to have forgotten to
>> > run vma_commit_reservation() altogether. __Why isn't this as busted as
>> > it appears to be?
>>
>> Hi Andrew
>>
>> IIUC the two operations, vma_{needs, commit}_reservation, are folded in
>> alloc_huge_page(), need to break the pair?
>
> Looking at it again, it appears that the vma_needs_reservation() calls
> are used to predict whether a subsequent COW attempt is going to fail.
>
> If that's correct then things aren't as bad as I first thought.
> However I suspect the code in hugetlb_no_page() is a bit racy: the
> vma_needs_reservation() call should happen after we've taken
> page_table_lock. As things stand, another thread could sneak in there
> and steal the reservation which this thread thought was safe.
>
> What do you think?
>
Hi Andrew
The case of no page, in the fault path, is handled after acquiring
hugetlb_instantiation_mutex, and on ohter hand, kmalloc is called
if new region required, so no race to check reservation needed but
after spinning page_table_lock.
Thanks
Hillf
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2012-01-14 5:27 UTC | newest]
Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2011-12-22 13:36 [PATCH] mm: hugetlb: undo change to page mapcount in fault handler Hillf Danton
2011-12-22 16:36 ` Michal Hocko
2011-12-23 13:00 ` Hillf Danton
2011-12-26 7:00 ` KAMEZAWA Hiroyuki
2012-01-04 23:16 ` Andrew Morton
2012-01-10 20:45 ` Andrew Morton
2012-01-11 12:06 ` Hillf Danton
2012-01-13 23:39 ` Andrew Morton
2012-01-14 5:27 ` Hillf Danton
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).