From: Andrew Morton <akpm@linux-foundation.org>
To: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: sasha.levin@oracle.com, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org, riel@redhat.com
Subject: Re: [PATCH v2] mm, hugetlbfs: fix rmapping for anonymous hugepages with page_pgoff()
Date: Fri, 28 Feb 2014 15:14:27 -0800
Message-ID: <20140228151427.dd232b07960dcf876112e191@linux-foundation.org>
In-Reply-To: <5310ea8b.c425e00a.2cd9.ffffe097SMTPIN_ADDED_BROKEN@mx.google.com>

On Fri, 28 Feb 2014 14:59:02 -0500 Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> wrote:

> page->index stores the pagecache index when the page is mapped into a file
> mapping region, and the index is in units of the pagecache page size, so it
> depends on the page size. Some users of reverse mapping assume that
> page->index is in PAGE_CACHE_SHIFT units, so they don't work for anonymous
> hugepages.
> 
> For example, consider that we have a 3-hugepage vma and try to mbind the 2nd
> hugepage to migrate it to another node. The vma is then split and
> migrate_page() is called for the 2nd hugepage (belonging to the middle vma).
> During migration, rmap_walk_anon() tries to find the vma to which the target
> hugepage belongs, but it miscalculates pgoff, so
> anon_vma_interval_tree_foreach() picks up an invalid vma, which fires
> VM_BUG_ON.
> 
> This patch introduces a new API, usable for both normal pages and
> hugepages, to get the offset in PAGE_SIZE units from page->index. Users should
> clearly distinguish page_index for the pagecache index from page_pgoff for the
> page offset.
> 
> ..
>
> --- a/include/linux/pagemap.h
> +++ b/include/linux/pagemap.h
> @@ -307,6 +307,22 @@ static inline loff_t page_file_offset(struct page *page)
>  	return ((loff_t)page_file_index(page)) << PAGE_CACHE_SHIFT;
>  }
>  
> +static inline unsigned int page_size_order(struct page *page)
> +{
> +	return unlikely(PageHuge(page)) ?
> +		huge_page_size_order(page) :
> +		(PAGE_CACHE_SHIFT - PAGE_SHIFT);
> +}

Could use some nice documentation, please.  Why it exists, what it
does.  Particularly: what sort of pages it can and can't operate on,
and why.

The presence of PAGE_CACHE_SHIFT is unfortunate - it at least implies
that the page is a pagecache page.  I dunno, maybe just use "0"?
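
For illustration, here is a userspace sketch of the unit conversion being
discussed. It is not the patch itself. It assumes 4 KiB base pages and 2 MiB
hugepages, uses 0 as the order for normal pages as suggested above, and all
model_* names plus vma_covers() are hypothetical stand-ins for the kernel
structures:

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

#define MODEL_PAGE_SHIFT  12	/* assumption: 4 KiB base pages */
#define MODEL_HPAGE_SHIFT 21	/* assumption: 2 MiB hugepages */

/* Hypothetical stand-in for struct page: only the fields we need. */
struct model_page {
	unsigned long index;	/* in units of this page's own size */
	bool huge;		/* stands in for PageHuge(page) */
};

/* Hypothetical stand-in for a vma: range in PAGE_SIZE units. */
struct model_vma {
	unsigned long vm_pgoff;
	unsigned long nr_pages;
};

/*
 * Order of the page relative to the base page size.
 * Normal pages return 0, so no pagecache constant is involved.
 */
static unsigned int model_page_size_order(const struct model_page *page)
{
	return page->huge ? MODEL_HPAGE_SHIFT - MODEL_PAGE_SHIFT : 0;
}

/* Convert page->index into an offset in PAGE_SIZE units. */
static unsigned long model_page_pgoff(const struct model_page *page)
{
	unsigned int order = model_page_size_order(page);

	assert(order < sizeof(unsigned long) * CHAR_BIT);
	return page->index << order;
}

/* True if pgoff (in PAGE_SIZE units) falls inside the vma's range. */
static bool vma_covers(const struct model_vma *vma, unsigned long pgoff)
{
	return pgoff >= vma->vm_pgoff &&
	       pgoff < vma->vm_pgoff + vma->nr_pages;
}
```

In the 3-hugepage example from the changelog, the 2nd hugepage has index 1 and
the middle vma starts at pgoff 512 in this model. Looking it up with the raw
index misses the vma, while the converted offset lands inside it.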

