From: Adam Litke <agl@us.ibm.com>
To: Mel Gorman <mel@csn.ul.ie>
Cc: akpm@linux-foundation.org, dean@arctic.org,
linux-kernel@vger.kernel.org, wli@holomorphy.com,
dwg@au1.ibm.com, apw@shadowen.org, linux-mm@kvack.org,
andi@firstfloor.org, kenchen@google.com, abh@cray.com
Subject: Re: [PATCH 3/3] Guarantee that COW faults for a process that called mmap(MAP_PRIVATE) on hugetlbfs will succeed
Date: Wed, 28 May 2008 13:16:07 -0500
Message-ID: <1211998567.12036.65.camel@localhost.localdomain>
In-Reply-To: <20080527185128.16194.87380.sendpatchset@skynet.skynet.ie>

On Tue, 2008-05-27 at 19:51 +0100, Mel Gorman wrote:
> After patch 2 in this series, a process that successfully calls mmap()
> for a MAP_PRIVATE mapping will be guaranteed to successfully fault until a
> process calls fork(). At that point, the next write fault from the parent
> could fail due to COW if the child still has a reference.
>
> We only reserve pages for the parent but a copy must be made to avoid leaking
> data from the parent to the child after fork(). Reserves could be taken for
> both parent and child at fork time to guarantee faults but if the mapping
> is large it is highly likely we will not have sufficient pages for the
> reservation, and it is common to fork only to exec() immediately after. A
> failure here would be very undesirable.
>
> Note that the current behaviour of mainline with MAP_PRIVATE pages is
> pretty bad. The following situation is allowed to occur today.
>
> 1. Process calls mmap(MAP_PRIVATE)
> 2. Process calls mlock() to fault all pages and makes sure it succeeds
> 3. Process calls fork()
> 4. Process writes to MAP_PRIVATE mapping while child still exists
> 5. If the COW fails at this point, the process gets SIGKILLed even though it
> had taken care to ensure the pages existed
>
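For anyone who wants to poke at the sequence above from userspace, here is
a minimal sketch, assuming a file path that would sit on a hugetlbfs mount
when reproducing for real. The function name cow_after_fork(), the pipe
used for ordering, and the 'A'/'B' fill bytes are all made up for
illustration and are not part of the patch. It walks steps 1-4 and reports
whether the parent's post-fork write went through while the child kept the
old data.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper, not from the patch.
 * path: assumed to be a file on a hugetlbfs mount for a real reproduction;
 *       a regular file works for a dry run.
 * len:  mapping length (a multiple of the huge page size on hugetlbfs).
 * Returns 0 if the parent's write after fork() succeeded and the child
 * still saw the pre-fork data, 1 if not, -1 on a setup error.
 */
int cow_after_fork(const char *path, size_t len)
{
    int sync_pipe[2];
    int fd = open(path, O_CREAT | O_RDWR, 0600);
    if (fd < 0)
        return -1;
    /* A regular file must be sized before it is touched through mmap(). */
    if (ftruncate(fd, (off_t)len) != 0 || pipe(sync_pipe) != 0) {
        close(fd);
        return -1;
    }

    /* Step 1: private mapping. */
    char *p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
    close(fd);
    if (p == MAP_FAILED) {
        close(sync_pipe[0]);
        close(sync_pipe[1]);
        return -1;
    }

    /*
     * Step 2: fault everything in. The careful process in the scenario
     * would bail out if mlock() failed; this sketch carries on because
     * the memset() below touches every page anyway.
     */
    (void)mlock(p, len);
    memset(p, 'A', len);

    /* Step 3: fork; the child keeps a reference to the same pages. */
    pid_t pid = fork();
    if (pid < 0) {
        close(sync_pipe[0]);
        close(sync_pipe[1]);
        munmap(p, len);
        return -1;
    }
    if (pid == 0) {
        char go;
        close(sync_pipe[1]);
        /* Wait until the parent has done its post-fork write. */
        if (read(sync_pipe[0], &go, 1) != 1)
            _exit(2);
        _exit(p[0] == 'A' ? 0 : 1);
    }

    /* Step 4: parent writes while the child still exists (COW fault). */
    close(sync_pipe[0]);
    p[0] = 'B';
    int parent_ok = (p[0] == 'B');
    if (write(sync_pipe[1], "x", 1) != 1)
        parent_ok = 0;
    close(sync_pipe[1]);

    int status = 0;
    if (waitpid(pid, &status, 0) != pid)
        parent_ok = 0;
    munmap(p, len);
    return (parent_ok && WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : 1;
}
```

On a regular file this only demonstrates the COW ordering. The failure in
step 5 needs a hugetlbfs file and a pool too small to supply the extra page
at the parent's write.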
> This patch improves the situation by guaranteeing the reliability of the
> process that successfully calls mmap(). When the parent performs COW, it
> will try to satisfy the allocation without using reserves. If that fails, the
> parent will steal the page, leaving any children without a page. Faults from
> the child after that point will result in failure. If the child COW happens
> first, an attempt will be made to allocate the page without reserves and
> the child will get SIGKILLed on failure.
>
> To summarise the new behaviour:
>
> 1. If the original mapper performs COW on a private mapping with multiple
> references, it will attempt to allocate a hugepage from the pool or
> the buddy allocator without using the existing reserves. On failure, VMAs
> mapping the same area are traversed and the page being COW'd is unmapped
> where found. It will then steal the original page as the last mapper in
> the normal way.
>
> 2. The VMAs the pages were unmapped from are flagged to note that pages
> with data no longer exist. Future no-page faults on those VMAs will
> terminate the process as otherwise it would appear that data was corrupted.
> A warning is printed to the console that this situation occurred.
>
> 3. If the child performs COW first, it will attempt to satisfy the COW
> from the pool if there are enough pages or via the buddy allocator if
> overcommit is allowed and the buddy allocator can satisfy the request. If
> it fails, the child will be killed.
>
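The summary above boils down to a small decision table. As a sketch only
(a userspace model, not the kernel code in the patch; the enum, the
function names and the two boolean inputs are hypothetical), it could be
written as:

```c
#include <stdbool.h>

/* Hypothetical outcome labels for a COW fault on a private hugetlbfs page. */
enum cow_result {
    COW_COPIED,         /* a new huge page was allocated and the data copied */
    COW_STOLE_ORIGINAL, /* other mappers were unmapped, original page reused */
    COW_KILLED          /* the faulting process gets SIGKILL */
};

/*
 * original_mapper: the faulting process is the one that called mmap().
 * alloc_ok:        a huge page was available from the pool or the buddy
 *                  allocator without touching the existing reserves.
 */
enum cow_result cow_fault(bool original_mapper, bool alloc_ok)
{
    if (alloc_ok)
        return COW_COPIED;
    /* Allocation failed: only the original mapper is protected. */
    return original_mapper ? COW_STOLE_ORIGINAL : COW_KILLED;
}

/*
 * A later no-page fault on a VMA that was flagged when its page was
 * unmapped terminates the process rather than handing it a fresh page.
 */
bool later_fault_is_fatal(bool vma_flagged_unmapped)
{
    return vma_flagged_unmapped;
}
```

The model makes the asymmetry explicit: an allocation failure costs the
child its page or its life, never the process that made the mapping.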
> If the pool is large enough, existing applications will not notice that the
> reserves were a factor. Existing applications depending on no reserves
> being set are unlikely to exist, as for much of the history of hugetlbfs,
> pages were prefaulted at mmap(), allocating the pages at that point or failing
> the mmap().
>
> Signed-off-by: Mel Gorman <mel@csn.ul.ie>
Acked-by: Adam Litke <agl@us.ibm.com>
--
Adam Litke - (agl at us.ibm.com)
IBM Linux Technology Center