From: Andrea Arcangeli <aarcange@redhat.com>
To: Christoph Lameter <cl@linux-foundation.org>
Cc: linux-mm@kvack.org, Marcelo Tosatti <mtosatti@redhat.com>,
Adam Litke <agl@us.ibm.com>, Avi Kivity <avi@redhat.com>,
Izik Eidus <ieidus@redhat.com>,
Hugh Dickins <hugh.dickins@tiscali.co.uk>,
Nick Piggin <npiggin@suse.de>, Rik van Riel <riel@redhat.com>,
Mel Gorman <mel@csn.ul.ie>, Dave Hansen <dave@linux.vnet.ibm.com>,
Benjamin Herrenschmidt <benh@kernel.crashing.org>,
Ingo Molnar <mingo@elte.hu>, Mike Travis <travis@sgi.com>,
KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
Chris Wright <chrisw@sous-sol.org>,
Andrew Morton <akpm@linux-foundation.org>,
bpicco@redhat.com,
KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>,
Balbir Singh <balbir@linux.vnet.ibm.com>,
Arnd Bergmann <arnd@arndb.de>
Subject: Re: [PATCH 28 of 32] pmd_trans_huge migrate bugcheck
Date: Wed, 3 Feb 2010 16:49:00 +0100
Message-ID: <20100203154900.GA29308@random.random>
In-Reply-To: <alpine.DEB.2.00.1002011542170.2384@router.home>
On Mon, Feb 01, 2010 at 03:46:47PM -0600, Christoph Lameter wrote:
> On Sun, 31 Jan 2010, Andrea Arcangeli wrote:
>
> > diff --git a/mm/migrate.c b/mm/migrate.c
> > --- a/mm/migrate.c
> > +++ b/mm/migrate.c
> > @@ -819,6 +820,10 @@ static int do_move_page_to_node_array(st
> >  		if (PageReserved(page) || PageKsm(page))
> >  			goto put_and_set;
> >
> > +		if (unlikely(PageTransCompound(page)))
> > +			if (unlikely(split_huge_page(page)))
> > +				goto put_and_set;
> > +
> >  		pp->page = page;
> >  		err = page_to_nid(page);
>
> How does this work? do_move_page_to_node_array takes an array of page
> pointers in pp (struct page_to_node). Let's say one is a compound page.
Yes, all that matters is that the input to that function is not an
array of "struct page" pointers.
>
> Now we split this into 512 4k pages? and pp only points to the first of
> them?
Only the "addr" field of each page_to_node entry is set before
split_huge_page runs, see pp[j].addr = ... That is the input of the syscall.
> The rest of the move_pages() logic will only see one 4k page and move it.
Before follow_page is called, nobody could ever see any "struct
page". And after we call it, we immediately call split_huge_page if it
returned a tail/head page. (collapse_huge_page can't collapse it
again under us because of the pin taken by follow_page(FOLL_GET).)
split_huge_page runs before isolate_lru_page is called, so the lru
mangling isn't involved in the split (besides, it would work anyway).
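As a rough illustration of the ordering described above, here is a
userspace toy model in C. It is not kernel code: every toy_* name, the
struct layouts and the single-chunk "mm" are hypothetical stand-ins, and
the split is reduced to clearing a flag. It only shows that "addr" is the
sole input, that the page pointer is stored after the lookup, and that the
split happens right after the lookup returns a page that is still part of
a huge page.

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_PAGE_SIZE 4096UL
#define TOY_HPAGE_NR  512	/* 2M / 4k */

/* Hypothetical stand-in for struct page; not the kernel layout. */
struct toy_page {
	bool trans_compound;	/* still part of an unsplit huge page */
	int pins;		/* pins taken by toy_follow_page() */
};

/* Same shape as the part of struct page_to_node discussed here. */
struct toy_page_to_node {
	unsigned long addr;	/* input: comes from the syscall arguments */
	struct toy_page *page;	/* output: stored only after the lookup */
};

/* One 2M naturally aligned chunk, initially backed by one huge page. */
struct toy_mm {
	unsigned long base;
	struct toy_page pages[TOY_HPAGE_NR];
	int splits;		/* how many times the toy split ran */
};

static void toy_mm_init(struct toy_mm *mm, unsigned long base)
{
	mm->base = base;
	mm->splits = 0;
	for (size_t i = 0; i < TOY_HPAGE_NR; i++) {
		mm->pages[i].trans_compound = true;
		mm->pages[i].pins = 0;
	}
}

/* Toy follow_page(FOLL_GET): return the 4k page for addr and pin it. */
static struct toy_page *toy_follow_page(struct toy_mm *mm, unsigned long addr)
{
	size_t idx;

	if (addr < mm->base)
		return NULL;
	idx = (addr - mm->base) / TOY_PAGE_SIZE;
	if (idx >= TOY_HPAGE_NR)
		return NULL;
	mm->pages[idx].pins++;
	return &mm->pages[idx];
}

/* Toy split: every subpage of the chunk becomes a regular page. */
static int toy_split_huge_page(struct toy_mm *mm)
{
	for (size_t i = 0; i < TOY_HPAGE_NR; i++)
		mm->pages[i].trans_compound = false;
	mm->splits++;
	return 0;
}

/* Lookup step for one entry: follow, split if needed, then store. */
static int toy_lookup_one(struct toy_mm *mm, struct toy_page_to_node *pp)
{
	struct toy_page *page = toy_follow_page(mm, pp->addr);

	if (!page)
		return -1;
	if (page->trans_compound && toy_split_huge_page(mm)) {
		page->pins--;	/* drop the pin on failure */
		return -1;
	}
	pp->page = page;
	return 0;
}
```

Calling toy_lookup_one() for two addresses in the same chunk runs the toy
split once, on the first call; the second call already sees a regular page.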
> The remaining 511 pages are left dangling? With an increased refcount?
The remaining 511 pages will be picked up by the next follow_page calls
if userland asks for them. Userland has no way to know whether ram is
backed by a hugepage or by regular pages, so it has to submit one
address per page: the syscall api has to stay backwards compatible or
everything breaks. None of the other 511 pages has an increased refcount
from the first follow_page (or more precisely, nothing related to the
follow_page on the 1st page: their page_count simply goes to match the
head page mapcount plus any additional pin on tail pages previously
taken by gup).
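The refcount statement above can be written out as plain arithmetic. The
sketch below is an illustration only, not the kernel's split code: the
function name and the array-based interface are hypothetical, and it just
restates "head page mapcount plus any pin previously taken on that tail
page" for the 511 tail pages.

```c
#include <stddef.h>

#define TOY_HPAGE_NR 512	/* 2M / 4k */

/*
 * Illustration only: compute the count each tail page ends up with after
 * the split, as head mapcount plus the gup pins held on that tail page.
 * Index 0 (the head page) is deliberately left untouched.
 */
static void toy_tail_counts_after_split(int head_mapcount,
					const int tail_gup_pins[TOY_HPAGE_NR],
					int count_out[TOY_HPAGE_NR])
{
	for (size_t i = 1; i < TOY_HPAGE_NR; i++)
		count_out[i] = head_mapcount + tail_gup_pins[i];
}
```

With a head mapcount of 1 and a single earlier gup pin on tail page 7,
tail page 7 ends up with a count of 2 and every other tail page with 1.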
We call split_huge_page only after the first follow_page for an address
backed by a hugepage; every other follow_page on that 2m naturally
aligned virtual chunk will return regular pages, as if no hugepage had
ever existed.
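From the userland side, "one address per page" means building 512
addresses for each 2m chunk. The helper below is a minimal sketch of that
arithmetic, assuming 4k base pages and 2m hugepages as in this thread; the
function name and constants are made up for illustration, and a real
program would query the page size instead of hardcoding it. The resulting
array has the shape expected by the "pages" argument of move_pages(2).

```c
#include <stddef.h>
#include <stdint.h>

#define SMALL_PAGE_SIZE 4096UL			/* assumed base page size */
#define HUGE_PAGE_SIZE  (2UL * 1024 * 1024)	/* assumed hugepage size */

/*
 * Fill pages[] with one address per 4k page of the 2m naturally aligned
 * chunk that contains addr. Returns the number of entries written, which
 * is at most max.
 */
size_t fill_chunk_addrs(uintptr_t addr, void **pages, size_t max)
{
	uintptr_t base = addr & ~(uintptr_t)(HUGE_PAGE_SIZE - 1);
	size_t n = HUGE_PAGE_SIZE / SMALL_PAGE_SIZE;	/* 512 */

	if (n > max)
		n = max;
	for (size_t i = 0; i < n; i++)
		pages[i] = (void *)(base + i * SMALL_PAGE_SIZE);
	return n;
}
```

The caller does not need to know whether the chunk is currently backed by
a hugepage: the same 512 entries are submitted either way.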