* [PATCH] mm: get page_cache_get_speculative() work on tail pages
@ 2015-04-01 22:52 Kirill A. Shutemov
2015-04-01 23:21 ` Hugh Dickins
0 siblings, 1 reply; 5+ messages in thread
From: Kirill A. Shutemov @ 2015-04-01 22:52 UTC (permalink / raw)
To: Andrew Morton, Hugh Dickins
Cc: linux-mm, Kirill A. Shutemov, Steve Capper, Andrea Arcangeli,
Paul E. McKenney
Generic RCU fast GUP relies on page_cache_get_speculative() to obtain a pin
on a pte-mapped page. As Aneesh pointed out during review of my compound
page refcounting rework, page_cache_get_speculative() would fail on a
pte-mapped tail page, since tail pages always have page->_count == 0.

That means we would never be able to obtain a pin on a pte-mapped tail
page via generic RCU fast GUP.

But the problem is not exclusive to my patchset. In the current kernel some
drivers (sound, for instance) already map compound pages with PTEs.

Let's teach page_cache_get_speculative() about tail pages. We can acquire a
pin by speculatively taking a pin on the head page and then rechecking that
the compound page didn't disappear under us. Retry if it did.

We don't care about THP tail page refcounting -- THP *tail* pages
shouldn't be found in the places where page_cache_get_speculative() is
used -- the pagecache radix tree or page tables.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reported-by: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
Cc: Steve Capper <steve.capper@linaro.org>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
include/linux/pagemap.h | 31 ++++++++++++++++++++++++++-----
1 file changed, 26 insertions(+), 5 deletions(-)
diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index 7c3790764795..573a2510da36 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -142,8 +142,10 @@ void release_pages(struct page **pages, int nr, bool cold);
  */
 static inline int page_cache_get_speculative(struct page *page)
 {
+	struct page *head_page;
 	VM_BUG_ON(in_interrupt());
-
+retry:
+	head_page = compound_head_fast(page);
 #ifdef CONFIG_TINY_RCU
 # ifdef CONFIG_PREEMPT_COUNT
 	VM_BUG_ON(!in_atomic());
@@ -157,11 +159,11 @@ static inline int page_cache_get_speculative(struct page *page)
 	 * disabling preempt, and hence no need for the "speculative get" that
 	 * SMP requires.
 	 */
-	VM_BUG_ON_PAGE(page_count(page) == 0, page);
-	atomic_inc(&page->_count);
+	VM_BUG_ON_PAGE(page_count(head_page) == 0, head_page);
+	atomic_inc(&head_page->_count);
 
 #else
-	if (unlikely(!get_page_unless_zero(page))) {
+	if (unlikely(!get_page_unless_zero(head_page))) {
 		/*
 		 * Either the page has been freed, or will be freed.
 		 * In either case, retry here and the caller should
@@ -170,7 +172,26 @@ static inline int page_cache_get_speculative(struct page *page)
 		return 0;
 	}
 #endif
-	VM_BUG_ON_PAGE(PageTail(page), page);
+	/* compound_head_fast() seen PageTail(page) == true */
+	if (unlikely(head_page != page)) {
+		/*
+		 * compound_head_fast() could fetch dangling page->first_page
+		 * pointer to an old compound page, so recheck that it's still
+		 * a tail page before returning.
+		 */
+		smp_mb__after_atomic();
+		if (unlikely(!PageTail(page))) {
+			put_page(head_page);
+			goto retry;
+		}
+		/*
+		 * Tail page refcounting is only required for THP pages.
+		 * If page_cache_get_speculative() got called on tail-THP pages
+		 * something went horribly wrong. We don't have THP in pagecache
+		 * and we don't map tail-THP to page tables.
+		 */
+		VM_BUG_ON_PAGE(compound_tail_refcounted(head_page), head_page);
+	}
 
 	return 1;
 }
--
2.1.4
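
[Editorial sketch, not part of the patch.] The "take a speculative pin on
the head page, recheck, retry" sequence in the patch above can be modelled
in user space with C11 atomics. This is only an illustration under stated
assumptions: struct model_page and every model_*() function are
hypothetical names, the head pointer stands in for PageTail() plus
page->first_page, and the sequentially consistent atomic loads stand in
for smp_mb__after_atomic(). It is not kernel code.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space stand-in for struct page; not the kernel layout. */
struct model_page {
	atomic_int count;			/* models page->_count */
	_Atomic(struct model_page *) head;	/* non-NULL while this is a tail page */
};

/* Models get_page_unless_zero(): take a reference only if count is non-zero. */
static bool model_get_unless_zero(struct model_page *p)
{
	int old = atomic_load(&p->count);

	while (old != 0) {
		/* On failure, old is refreshed with the current value. */
		if (atomic_compare_exchange_weak(&p->count, &old, old + 1))
			return true;
	}
	return false;
}

/* Models put_page() without the freeing side: just drop one reference. */
static void model_put(struct model_page *p)
{
	int old = atomic_fetch_sub(&p->count, 1);

	assert(old > 0);
	(void)old;
}

/* Returns 1 with a reference held on the head page, 0 if it was already free. */
static int model_get_speculative(struct model_page *page)
{
	struct model_page *head;

retry:
	/* Models compound_head_fast(): a tail page resolves to its head. */
	head = atomic_load(&page->head);
	if (!head)
		head = page;

	if (!model_get_unless_zero(head))
		return 0;

	/*
	 * We went through a head pointer that may have been stale.
	 * Recheck that the page is still a tail page; if not, drop
	 * the reference and start over.
	 */
	if (head != page && atomic_load(&page->head) == NULL) {
		model_put(head);
		goto retry;
	}
	return 1;
}
```

In this model a tail page keeps count == 0, so model_get_unless_zero()
on the tail itself always fails, which is the failure the commit message
describes. model_get_speculative() on the same tail succeeds as long as
the head still holds at least one reference, and the reference it takes
lands on the head, not on the tail.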
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
* Re: [PATCH] mm: get page_cache_get_speculative() work on tail pages
2015-04-01 22:52 [PATCH] mm: get page_cache_get_speculative() work on tail pages Kirill A. Shutemov
@ 2015-04-01 23:21 ` Hugh Dickins
2015-04-01 23:56 ` Kirill A. Shutemov
0 siblings, 1 reply; 5+ messages in thread
From: Hugh Dickins @ 2015-04-01 23:21 UTC (permalink / raw)
To: Kirill A. Shutemov
Cc: Andrew Morton, Hugh Dickins, linux-mm, Steve Capper,
Andrea Arcangeli, Paul E. McKenney
On Thu, 2 Apr 2015, Kirill A. Shutemov wrote:
> Generic RCU fast GUP relies on page_cache_get_speculative() to obtain a pin
> on a pte-mapped page. As Aneesh pointed out during review of my compound
> page refcounting rework, page_cache_get_speculative() would fail on a
> pte-mapped tail page, since tail pages always have page->_count == 0.
>
> That means we would never be able to obtain a pin on a pte-mapped tail
> page via generic RCU fast GUP.
>
> But the problem is not exclusive to my patchset. In the current kernel some
> drivers (sound, for instance) already map compound pages with PTEs.
Hah, you were sending this as I was replying to the original thread.
Do we care if fast gup fails on some hardware driver's compound pages?
I don't think we do, and it would be better not to complicate the
low-level page_cache_get_speculative for them.
Hugh
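
[Editorial sketch, not part of the reply.] One way to read the point
above: a failure in fast GUP is not fatal, because the caller can pin the
remaining pages through the slow path. The toy user-space model below
illustrates that fallback. All names (struct sim_page, sim_gup_*) are
hypothetical, and the "is_tail" flag merely stands in for a pte-mapped
tail page whose speculative get fails; it is an assumption-laden
illustration, not the kernel's implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a page: the fast path cannot pin tail pages. */
struct sim_page {
	int is_tail;	/* 1 if this models a pte-mapped tail page */
	int pins;	/* number of pins taken on this page */
};

/* Lockless fast path: stops at the first page it cannot pin. */
static int sim_gup_fast_only(struct sim_page **pages, int nr)
{
	int i;

	for (i = 0; i < nr; i++) {
		if (pages[i]->is_tail)
			break;	/* models the speculative get failing */
		pages[i]->pins++;
	}
	return i;		/* number of pages actually pinned */
}

/* Slow path: in this model it can always pin, whatever the page type. */
static int sim_gup_slow(struct sim_page **pages, int nr)
{
	int i;

	for (i = 0; i < nr; i++)
		pages[i]->pins++;
	return nr;
}

/* Try the fast path first, then finish the rest through the slow path. */
static int sim_gup_fast(struct sim_page **pages, int nr)
{
	int done;

	assert(nr >= 0);
	done = sim_gup_fast_only(pages, nr);
	if (done < nr)
		done += sim_gup_slow(pages + done, nr - done);
	return done;
}
```

With three pages where the middle one is a tail page, sim_gup_fast_only()
pins only the first, while sim_gup_fast() still pins all three; the cost
of the tail page is the slower path, not a failed request.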
>
> Let's teach page_cache_get_speculative() about tail pages. We can acquire a
> pin by speculatively taking a pin on the head page and then rechecking that
> the compound page didn't disappear under us. Retry if it did.
>
> We don't care about THP tail page refcounting -- THP *tail* pages
> shouldn't be found in the places where page_cache_get_speculative() is
> used -- the pagecache radix tree or page tables.
>
> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> Reported-by: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
> Cc: Steve Capper <steve.capper@linaro.org>
> Cc: Andrea Arcangeli <aarcange@redhat.com>
> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> ---
> include/linux/pagemap.h | 31 ++++++++++++++++++++++++++-----
> 1 file changed, 26 insertions(+), 5 deletions(-)
>
> diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
> index 7c3790764795..573a2510da36 100644
> --- a/include/linux/pagemap.h
> +++ b/include/linux/pagemap.h
> @@ -142,8 +142,10 @@ void release_pages(struct page **pages, int nr, bool cold);
> */
> static inline int page_cache_get_speculative(struct page *page)
> {
> + struct page *head_page;
> VM_BUG_ON(in_interrupt());
> -
> +retry:
> + head_page = compound_head_fast(page);
> #ifdef CONFIG_TINY_RCU
> # ifdef CONFIG_PREEMPT_COUNT
> VM_BUG_ON(!in_atomic());
> @@ -157,11 +159,11 @@ static inline int page_cache_get_speculative(struct page *page)
> * disabling preempt, and hence no need for the "speculative get" that
> * SMP requires.
> */
> - VM_BUG_ON_PAGE(page_count(page) == 0, page);
> - atomic_inc(&page->_count);
> + VM_BUG_ON_PAGE(page_count(head_page) == 0, head_page);
> + atomic_inc(&head_page->_count);
>
> #else
> - if (unlikely(!get_page_unless_zero(page))) {
> + if (unlikely(!get_page_unless_zero(head_page))) {
> /*
> * Either the page has been freed, or will be freed.
> * In either case, retry here and the caller should
> @@ -170,7 +172,26 @@ static inline int page_cache_get_speculative(struct page *page)
> return 0;
> }
> #endif
> - VM_BUG_ON_PAGE(PageTail(page), page);
> + /* compound_head_fast() seen PageTail(page) == true */
> + if (unlikely(head_page != page)) {
> + /*
> + * compound_head_fast() could fetch dangling page->first_page
> + * pointer to an old compound page, so recheck that it's still
> + * a tail page before returning.
> + */
> + smp_mb__after_atomic();
> + if (unlikely(!PageTail(page))) {
> + put_page(head_page);
> + goto retry;
> + }
> + /*
> + * Tail page refcounting is only required for THP pages.
> + * If page_cache_get_speculative() got called on tail-THP pages
> + * something went horribly wrong. We don't have THP in pagecache
> + * and we don't map tail-THP to page tables.
> + */
> + VM_BUG_ON_PAGE(compound_tail_refcounted(head_page), head_page);
> + }
>
> return 1;
> }
> --
> 2.1.4
* Re: [PATCH] mm: get page_cache_get_speculative() work on tail pages
2015-04-01 23:21 ` Hugh Dickins
@ 2015-04-01 23:56 ` Kirill A. Shutemov
2015-04-02 0:08 ` Hugh Dickins
0 siblings, 1 reply; 5+ messages in thread
From: Kirill A. Shutemov @ 2015-04-01 23:56 UTC (permalink / raw)
To: Hugh Dickins
Cc: Kirill A. Shutemov, Andrew Morton, linux-mm, Steve Capper,
Andrea Arcangeli, Paul E. McKenney
On Wed, Apr 01, 2015 at 04:21:30PM -0700, Hugh Dickins wrote:
> On Thu, 2 Apr 2015, Kirill A. Shutemov wrote:
>
> > Generic RCU fast GUP relies on page_cache_get_speculative() to obtain a pin
> > on a pte-mapped page. As Aneesh pointed out during review of my compound
> > page refcounting rework, page_cache_get_speculative() would fail on a
> > pte-mapped tail page, since tail pages always have page->_count == 0.
> >
> > That means we would never be able to obtain a pin on a pte-mapped tail
> > page via generic RCU fast GUP.
> >
> > But the problem is not exclusive to my patchset. In the current kernel some
> > drivers (sound, for instance) already map compound pages with PTEs.
>
> Hah, you were sending this as I was replying to the original thread.
>
> Do we care if fast gup fails on some hardware driver's compound pages?
> I don't think we do, and it would be better not to complicate the
> low-level page_cache_get_speculative for them.
Fair enough :-/
I'll check tomorrow whether it looks more reasonable at the gup_pte_range()
level, rather than in page_cache_get_speculative().
--
Kirill A. Shutemov
* Re: [PATCH] mm: get page_cache_get_speculative() work on tail pages
2015-04-01 23:56 ` Kirill A. Shutemov
@ 2015-04-02 0:08 ` Hugh Dickins
2015-04-02 12:09 ` Kirill A. Shutemov
0 siblings, 1 reply; 5+ messages in thread
From: Hugh Dickins @ 2015-04-02 0:08 UTC (permalink / raw)
To: Kirill A. Shutemov
Cc: Hugh Dickins, Kirill A. Shutemov, Andrew Morton, linux-mm,
Steve Capper, Andrea Arcangeli, Paul E. McKenney
On Thu, 2 Apr 2015, Kirill A. Shutemov wrote:
> On Wed, Apr 01, 2015 at 04:21:30PM -0700, Hugh Dickins wrote:
> > On Thu, 2 Apr 2015, Kirill A. Shutemov wrote:
> >
> > > Generic RCU fast GUP relies on page_cache_get_speculative() to obtain a pin
> > > on a pte-mapped page. As Aneesh pointed out during review of my compound
> > > page refcounting rework, page_cache_get_speculative() would fail on a
> > > pte-mapped tail page, since tail pages always have page->_count == 0.
> > >
> > > That means we would never be able to obtain a pin on a pte-mapped tail
> > > page via generic RCU fast GUP.
> > >
> > > But the problem is not exclusive to my patchset. In the current kernel some
> > > drivers (sound, for instance) already map compound pages with PTEs.
> >
> > Hah, you were sending this as I was replying to the original thread.
> >
> > Do we care if fast gup fails on some hardware driver's compound pages?
> > I don't think we do, and it would be better not to complicate the
> > low-level page_cache_get_speculative for them.
>
> Fair enough :-/
>
> I'll check tomorrow whether it looks more reasonable at the gup_pte_range()
> level, rather than in page_cache_get_speculative().
But we don't need it on the (fast) gup_pte_range() level either, do we?
Or do you have THP changes in mmotm which are now demanding this?
Hugh
* Re: [PATCH] mm: get page_cache_get_speculative() work on tail pages
2015-04-02 0:08 ` Hugh Dickins
@ 2015-04-02 12:09 ` Kirill A. Shutemov
0 siblings, 0 replies; 5+ messages in thread
From: Kirill A. Shutemov @ 2015-04-02 12:09 UTC (permalink / raw)
To: Hugh Dickins
Cc: Kirill A. Shutemov, Andrew Morton, linux-mm, Steve Capper,
Andrea Arcangeli, Paul E. McKenney
On Wed, Apr 01, 2015 at 05:08:53PM -0700, Hugh Dickins wrote:
> On Thu, 2 Apr 2015, Kirill A. Shutemov wrote:
> > On Wed, Apr 01, 2015 at 04:21:30PM -0700, Hugh Dickins wrote:
> > > On Thu, 2 Apr 2015, Kirill A. Shutemov wrote:
> > >
> > > > Generic RCU fast GUP relies on page_cache_get_speculative() to obtain a pin
> > > > on a pte-mapped page. As Aneesh pointed out during review of my compound
> > > > page refcounting rework, page_cache_get_speculative() would fail on a
> > > > pte-mapped tail page, since tail pages always have page->_count == 0.
> > > >
> > > > That means we would never be able to obtain a pin on a pte-mapped tail
> > > > page via generic RCU fast GUP.
> > > >
> > > > But the problem is not exclusive to my patchset. In the current kernel some
> > > > drivers (sound, for instance) already map compound pages with PTEs.
> > >
> > > Hah, you were sending this as I was replying to the original thread.
> > >
> > > Do we care if fast gup fails on some hardware driver's compound pages?
> > > I don't think we do, and it would be better not to complicate the
> > > low-level page_cache_get_speculative for them.
> >
> > Fair enough :-/
> >
> > I'll check tomorrow whether it looks more reasonable at the gup_pte_range()
> > level, rather than in page_cache_get_speculative().
>
> But we don't need it on the (fast) gup_pte_range() level either, do we?
> Or do you have THP changes in mmotm which are now demanding this?
No. I'll keep it local to my patchset.
--
Kirill A. Shutemov
end of thread, other threads:[~2015-04-02 12:09 UTC | newest]
Thread overview: 5+ messages
2015-04-01 22:52 [PATCH] mm: get page_cache_get_speculative() work on tail pages Kirill A. Shutemov
2015-04-01 23:21 ` Hugh Dickins
2015-04-01 23:56 ` Kirill A. Shutemov
2015-04-02 0:08 ` Hugh Dickins
2015-04-02 12:09 ` Kirill A. Shutemov