From: Anshuman Khandual <khandual@linux.vnet.ibm.com>
To: Hugh Dickins <hughd@google.com>,
"Kirill A. Shutemov" <kirill@shutemov.name>
Cc: Linux PPC dev <linuxppc-dev@ozlabs.org>,
"Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>,
Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>,
kirill.shutemov@linux.intel.com
Subject: Re: Question on follow_page_mask
Date: Wed, 24 Feb 2016 17:15:07 +0530 [thread overview]
Message-ID: <56CD97C3.6080407@linux.vnet.ibm.com> (raw)
In-Reply-To: <alpine.LSU.2.11.1602231230010.3043@eggly.anvils>
On 02/24/2016 02:37 AM, Hugh Dickins via Linuxppc-dev wrote:
> On Tue, 23 Feb 2016, Kirill A. Shutemov wrote:
>> On Tue, Feb 23, 2016 at 06:45:05PM +0530, Anshuman Khandual wrote:
>>> Not able to understand the first code block of follow_page_mask
>>> function. follow_huge_addr function is expected to find the page
>>> struct for the given address if it turns out to be a HugeTLB page
>>> but then when it finds the page we bug on if it had been called
>>> with FOLL_GET flag.
>>>
>>> page = follow_huge_addr(mm, address, flags & FOLL_WRITE);
>>> if (!IS_ERR(page)) {
>>> BUG_ON(flags & FOLL_GET);
>>> return page;
>>> }
>>>
>>> do_move_page_to_node_array calls follow_page with FOLL_GET which
>>> in turn calls follow_page_mask with FOLL_GET. On POWER, the
>>> function follow_huge_addr is defined and does not return -EINVAL
>>> like the generic one. It returns the page struct if its a HugeTLB
>>> page. Just curious to know what is the purpose behind the BUG_ON.
>>
>> I would guess requesting pin on non-reclaimable page is considered
>> useless, meaning suspicius behavior. BUG_ON() is overkill, I think.
>> WARN_ON_ONCE() would make it.
Hugh, thanks for such a detailed response.
>
> No, it's there to guard against abuse, until the correct functionality
> is implemented: which has not so far been required, I think.
>
> The problem is that a get_page() here is too late: it needs to be done
> inside each arch's implementation of follow_huge_addr(), while holding
> whatever is the appropriate lock, dropped by the time it returns here.
Got it.
>
> If you look through where FOLL_GET is usually implemented, such as in
> follow_page_pte(), but pud and pmd cases too, I hope you'll still find
> that they are careful to get the reference on the page while it's safe
> in the pagetable.
yeah, true.
>
> But follow_huge_addr() would need some work to offer the same guarantees:
> it's good for those "peep at a page without actually getting a reference"
> cases, but not good enough for preventing a page for being put to some
> other use completely, before we've secured it with our reference.
Yeah, I understand that now.
>
> Unless something's changed: the last time I recall the issue coming up,
> was when Naoya Horiguchi was working on hugetlbfs page migration: see
> linux-kernel/linux-mm mail thread "BUG at mm/memory.c:1489!" from
> 28 May 2014; and the resolution there was not to support the
> follow_huge_addr() case (which IIRC is peculiar to powerpc alone?).
Yeah correct, looked into that discussion. Also tried out the discussed
small patch where we do a get_page(page) if the returned page is a head
of a HugeTLB compound page and the flag contains FOLL_GET.
That made the do_move_page_to_node_array function work before hitting a
race condition with the test case in the commit message of e66f17ff7177
("mm/hugetlb: take page table lock in follow_huge_pmd()").
I believe the test exposes the locking problem which is not being taken
care at the follow_huge_addr function level right now on powerpc and
forces it to call follow_huge_pmd (which does a BUG_ON() on powerpc)
after failing to detect a huge page at the PMD level when it checked
the first time around.
next prev parent reply other threads:[~2016-02-24 11:46 UTC|newest]
Thread overview: 5+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-02-23 13:15 Question on follow_page_mask Anshuman Khandual
2016-02-23 14:03 ` Kirill A. Shutemov
2016-02-23 21:07 ` Hugh Dickins
2016-02-24 11:45 ` Anshuman Khandual [this message]
2016-02-24 11:22 ` Anshuman Khandual
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=56CD97C3.6080407@linux.vnet.ibm.com \
--to=khandual@linux.vnet.ibm.com \
--cc=aneesh.kumar@linux.vnet.ibm.com \
--cc=hughd@google.com \
--cc=kirill.shutemov@linux.intel.com \
--cc=kirill@shutemov.name \
--cc=linuxppc-dev@ozlabs.org \
--cc=n-horiguchi@ah.jp.nec.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).