linux-mm.kvack.org archive mirror
* [RFC] Hugetlb fallback to normal pages
@ 2006-04-26 19:46 Adam Litke
  2006-04-27 23:31 ` Chen, Kenneth W
  0 siblings, 1 reply; 4+ messages in thread
From: Adam Litke @ 2006-04-26 19:46 UTC (permalink / raw)
  To: linux-mm

Thanks to the latest hugetlb accounting patches, we now have reliable
shared mappings.  Private mappings are much more difficult because there
is no way to know up-front how many huge pages will be required (we may
have forking combined with unknown copy-on-write activity).  So private
mappings currently get full overcommit semantics, and when a fault cannot
be handled, the application gets SIGBUS.

The problem: Random SIGBUS crashes for applications using large pages
are not acceptable.  We need a way to handle the fault without giving up
and killing the process.

So I've been mulling it over and as I see it, we either 1) Swap out huge
pages, or 2) Demote huge pages.  In either case we need to be willing to
accept the performance penalty to gain stability.  At this point, I
think swapping is too intrusive and way too slow so I am considering
demotion options.  To simplify things at first, I am only considering
i386 (and demoting only private mappings of course).

Here's my idea:  When we fail to instantiate a new page at fault time,
split the affected vma such that we have a new vma to cover the 1 huge
page we are demoting.  Allocate HPAGE_SIZE/PAGE_SIZE normal pages.  Use
the page table to locate any populated hugetlb pages.  Copy the data
into the normal pages and install them in the page table.  Do any other
fixup required to make the new VMA anonymous.  Return.
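
For illustration only, the copy step described above can be sketched in
userspace C.  This is not kernel code: it assumes the non-PAE i386 sizes
(4 KB base pages, 4 MB huge pages), uses malloc() as a stand-in for page
allocation, and the names demote_huge_page(), release_region() and
struct demoted_region are hypothetical.  The vma split and the page
table installation are not modeled.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Assumed sizes for non-PAE i386; not taken from the thread. */
#define SKETCH_PAGE_SIZE   (4UL * 1024)
#define SKETCH_HPAGE_SIZE  (4UL * 1024 * 1024)
#define SKETCH_NR_SMALL    (SKETCH_HPAGE_SIZE / SKETCH_PAGE_SIZE)

/* Hypothetical stand-in for the PTEs that would cover one demoted huge page. */
struct demoted_region {
    void *small[SKETCH_NR_SMALL];
};

/*
 * Allocate HPAGE_SIZE/PAGE_SIZE small pages and fill them from hpage.
 * If hpage is NULL (the huge page was never populated), zero-fill instead.
 * Returns 0 on success, or -1 if an allocation fails, in which case every
 * page allocated so far is freed again.
 */
int demote_huge_page(const unsigned char *hpage, struct demoted_region *out)
{
    size_t i;

    for (i = 0; i < SKETCH_NR_SMALL; i++) {
        out->small[i] = malloc(SKETCH_PAGE_SIZE);
        if (!out->small[i]) {
            while (i--)
                free(out->small[i]);
            return -1;
        }
        if (hpage)
            memcpy(out->small[i], hpage + i * SKETCH_PAGE_SIZE,
                   SKETCH_PAGE_SIZE);
        else
            memset(out->small[i], 0, SKETCH_PAGE_SIZE);
    }
    return 0;
}

/* Free every small page of a successfully demoted region. */
void release_region(struct demoted_region *r)
{
    size_t i;

    for (i = 0; i < SKETCH_NR_SMALL; i++)
        free(r->small[i]);
}
```

With these assumed sizes, a single demotion needs 1024 small-page
allocations plus a 4 MB copy, which is the per-fault cost being traded
for avoiding SIGBUS.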

Any general opinions on the idea (flame retardant suit is equipped)?  As
far as I can tell, we don't split vmas during fault anywhere else.  Are
there inherent problems with doing so?  What about the conversion
process to an anonymous VMA?  Since we are dealing with private mappings
only, divorcing the vma from the hugetlbfs file should be okay afaics.

I know code speaks louder than words, but talk is cheap and that's why
I'm starting with it :)  Thanks for your comments.

-- 
Adam Litke - (agl at us.ibm.com)
IBM Linux Technology Center

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>


* RE: [RFC] Hugetlb fallback to normal pages
  2006-04-26 19:46 [RFC] Hugetlb fallback to normal pages Adam Litke
@ 2006-04-27 23:31 ` Chen, Kenneth W
  2006-05-01 15:46   ` Dave Hansen
  0 siblings, 1 reply; 4+ messages in thread
From: Chen, Kenneth W @ 2006-04-27 23:31 UTC (permalink / raw)
  To: 'Adam Litke', linux-mm

Adam Litke wrote on Wednesday, April 26, 2006 12:46 PM
> The problem: Random SIGBUS crashes for applications using large pages
> are not acceptable.  We need a way to handle the fault without giving up
> and killing the process.
> 
> So I've been mulling it over and as I see it, we either 1) Swap out huge
> pages, or 2) Demote huge pages.  In either case we need to be willing to
> accept the performance penalty to gain stability.  At this point, I
> think swapping is too intrusive and way too slow so I am considering
> demotion options.  To simplify things at first, I am only considering
> i386 (and demoting only private mappings of course).

Maybe hugetlb needs page reclaim logic?


> Here's my idea:  When we fail to instantiate a new page at fault time,
> split the affected vma such that we have a new vma to cover the 1 huge
> page we are demoting.  Allocate HPAGE_SIZE/PAGE_SIZE normal pages.  Use
> the page table to locate any populated hugetlb pages.  Copy the data
> into the normal pages and install them in the page table.  Do any other
> fixup required to make the new VMA anonymous.  Return.
> 
> Any general opinions on the idea (flame retardant suit is equipped)?  As
> far as I can tell, we don't split vmas during fault anywhere else.  Are
> there inherent problems with doing so?  What about the conversion
> process to an anonymous VMA?  Since we are dealing with private mappings
> only, divorcing the vma from the hugetlbfs file should be okay afaics.

Some architectures don't support mixed page sizes within a range of virtual
addresses, so automatic fallback to smaller pages won't work there :-(

- Ken


* RE: [RFC] Hugetlb fallback to normal pages
  2006-04-27 23:31 ` Chen, Kenneth W
@ 2006-05-01 15:46   ` Dave Hansen
  2006-05-01 16:23     ` Christoph Lameter
  0 siblings, 1 reply; 4+ messages in thread
From: Dave Hansen @ 2006-05-01 15:46 UTC (permalink / raw)
  To: Chen, Kenneth W; +Cc: 'Adam Litke', linux-mm

On Thu, 2006-04-27 at 16:31 -0700, Chen, Kenneth W wrote:
> Some architectures don't support mixed page sizes within a range of virtual
> addresses, so automatic fallback to smaller pages won't work there :-(

Well, they're mixed to _some_ degree :)

On ppc64, for instance, we have to select the page size in 256MB granules.
It is still feasible; the cost is just much higher than it would be on
x86.
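
To give a rough sense of that cost, here is a small sketch of the
granule arithmetic.  It assumes 16MB huge pages and 4KB base pages on
ppc64 (neither size is stated in this thread), and the helper names are
made up for illustration.

```c
#include <stdint.h>

#define SKETCH_SEGMENT_SIZE     (256ULL << 20)  /* 256MB granule, from the text */
#define SKETCH_PPC64_HPAGE_SIZE (16ULL << 20)   /* assumption: 16MB huge pages */
#define SKETCH_PPC64_PAGE_SIZE  (4ULL << 10)    /* assumption: 4KB base pages */

/* Start address of the 256MB granule that contains addr. */
uint64_t segment_base(uint64_t addr)
{
    return addr & ~(SKETCH_SEGMENT_SIZE - 1);
}

/* Number of huge pages that share one granule. */
uint64_t huge_pages_per_segment(void)
{
    return SKETCH_SEGMENT_SIZE / SKETCH_PPC64_HPAGE_SIZE;
}

/* Number of base pages needed to back a whole granule. */
uint64_t base_pages_per_segment(void)
{
    return SKETCH_SEGMENT_SIZE / SKETCH_PPC64_PAGE_SIZE;
}
```

Under these assumptions, falling back for one huge page would presumably
mean converting the whole 256MB granule that contains it: up to 16 huge
pages, or 65536 base pages.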

What are the restrictions on ia64?

-- Dave


* RE: [RFC] Hugetlb fallback to normal pages
  2006-05-01 15:46   ` Dave Hansen
@ 2006-05-01 16:23     ` Christoph Lameter
  0 siblings, 0 replies; 4+ messages in thread
From: Christoph Lameter @ 2006-05-01 16:23 UTC (permalink / raw)
  To: Dave Hansen; +Cc: Chen, Kenneth W, 'Adam Litke', linux-mm

On Mon, 1 May 2006, Dave Hansen wrote:

> What are the restrictions on ia64?

IA64 has a special virtual address range for huge page tables. We would 
first have to introduce various page sizes in the virtual address range 
for huge pages (I think Ken is working on something like that) and then 
the fallback could only be to a "huge" page of order 0. But then the page
would still not behave like a normal page.


