From: Minchan Kim <minchan@kernel.org>
To: John Hubbard <jhubbard@nvidia.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
linux-mm <linux-mm@kvack.org>,
LKML <linux-kernel@vger.kernel.org>,
"Paul E . McKenney" <paulmck@kernel.org>,
John Dias <joaodias@google.com>,
David Hildenbrand <david@redhat.com>
Subject: Re: [PATCH v5] mm: fix is_pinnable_page against on cma page
Date: Thu, 12 May 2022 13:54:50 -0700
Message-ID: <Yn10GkInyZNtqASa@google.com>
In-Reply-To: <5d9eb30e-6e0e-81a3-2b2c-47adc4e85470@nvidia.com>

On Thu, May 12, 2022 at 01:51:47PM -0700, John Hubbard wrote:
> On 5/12/22 13:41, Minchan Kim wrote:
> > Pages in a CMA area can have MIGRATE_ISOLATE as well as MIGRATE_CMA,
> > so the current is_pinnable_page() can miss CMA pages whose pageblock
> > is MIGRATE_ISOLATE. As a result, the pin_user_pages APIs pin CMA pages
> > long-term, and CMA allocation keeps failing until the pin is released.
> >
> >      CPU 0                              CPU 1 - Task B
> >
> > cma_alloc
> > alloc_contig_range
> >                                         pin_user_pages_fast(FOLL_LONGTERM)
> > change pageblock as MIGRATE_ISOLATE
> >                                         internal_get_user_pages_fast
> >                                         lockless_pages_from_mm
> >                                         gup_pte_range
> >                                         try_grab_folio
> >                                         is_pinnable_page
> >                                           return true;
> >                                         So, pinned the page successfully.
> > page migration failure with pinned page
> >                                         ..
> >                                         .. After 30 sec
> >                                         unpin_user_page(page)
> >
> > CMA allocation succeeded after 30 sec.
> >
> > The CMA allocation path protects against the migration type change
> > race using zone->lock, but all the GUP path needs to know is whether
> > the page is in a CMA area, not its exact migration type.
> > Thus, we don't need zone->lock; it is enough to check whether the
> > migration type is either MIGRATE_ISOLATE or MIGRATE_CMA.
> >
> > Adding the MIGRATE_ISOLATE check to is_pinnable_page() can cause
> > pinning to be rejected for pages on MIGRATE_ISOLATE pageblocks even
> > when they are in neither a CMA area nor the movable zone, if the page
> > is temporarily unmovable. However, migration failure caused by an
> > unexpected temporary refcount hold is a general issue, not one
> > specific to MIGRATE_ISOLATE, and MIGRATE_ISOLATE is itself a transient
> > state, like other temporarily elevated refcount problems.
> >
> > Cc: "Paul E . McKenney" <paulmck@kernel.org>
> > Cc: John Hubbard <jhubbard@nvidia.com>
> > Cc: David Hildenbrand <david@redhat.com>
> > Signed-off-by: Minchan Kim <minchan@kernel.org>
> > ---
> > * from v4 - https://lore.kernel.org/all/20220510211743.95831-1-minchan@kernel.org/
> > * clarification why we need READ_ONCE - Paul
> > * Adding a comment about READ_ONCE - John
> >
> > * from v3 - https://lore.kernel.org/all/20220509153430.4125710-1-minchan@kernel.org/
> > * Fix typo and adding more description - akpm
> >
> > * from v2 - https://lore.kernel.org/all/20220505064429.2818496-1-minchan@kernel.org/
> > * Use __READ_ONCE instead of volatile - akpm
> >
> > * from v1 - https://lore.kernel.org/all/20220502173558.2510641-1-minchan@kernel.org/
> > * fix build warning - lkp
> > * fix refetching issue of migration type
> > * add side effect on !ZONE_MOVABLE and !MIGRATE_CMA in description - david
> >
> > include/linux/mm.h | 16 ++++++++++++++--
> > 1 file changed, 14 insertions(+), 2 deletions(-)
> >
> > diff --git a/include/linux/mm.h b/include/linux/mm.h
> > index 6acca5cecbc5..2d7a5d87decd 100644
> > --- a/include/linux/mm.h
> > +++ b/include/linux/mm.h
> > @@ -1625,8 +1625,20 @@ static inline bool page_needs_cow_for_dma(struct vm_area_struct *vma,
> > #ifdef CONFIG_MIGRATION
> > static inline bool is_pinnable_page(struct page *page)
> > {
> > - return !(is_zone_movable_page(page) || is_migrate_cma_page(page)) ||
> > - is_zero_pfn(page_to_pfn(page));
> > +#ifdef CONFIG_CMA
> > + /*
> > + * Defend against future compiler LTO features, or code refactoring
> > + * that inlines the above function, by forcing a single read. Because,
> > + * this routine races with set_pageblock_migratetype(), and we want to
> > + * avoid reading zero, when actually one or the other flags was set.
> > + */
>
> The most interesting line got dropped in this version. :)
>
> This is missing:
>
> int __mt = get_pageblock_migratetype(page);
>
> Assuming that that is restored, please feel free to add:
>
> Reviewed-by: John Hubbard <jhubbard@nvidia.com>

I caught that just after clicking the send button with my fat finger :(

Thanks, John!

Andrew, could you pick this up?

From 90ad049d48f5c36075f17ac996dfe3c33127aeb6 Mon Sep 17 00:00:00 2001
From: Minchan Kim <minchan@kernel.org>
Date: Mon, 2 May 2022 10:03:48 -0700
Subject: [PATCH v5] mm: fix is_pinnable_page against on cma page

Pages in a CMA area can have MIGRATE_ISOLATE as well as MIGRATE_CMA,
so the current is_pinnable_page() can miss CMA pages whose pageblock
is MIGRATE_ISOLATE. As a result, the pin_user_pages APIs pin CMA pages
long-term, and CMA allocation keeps failing until the pin is released.

     CPU 0                              CPU 1 - Task B

cma_alloc
alloc_contig_range
                                        pin_user_pages_fast(FOLL_LONGTERM)
change pageblock as MIGRATE_ISOLATE
                                        internal_get_user_pages_fast
                                        lockless_pages_from_mm
                                        gup_pte_range
                                        try_grab_folio
                                        is_pinnable_page
                                          return true;
                                        So, pinned the page successfully.
page migration failure with pinned page
                                        ..
                                        .. After 30 sec
                                        unpin_user_page(page)

CMA allocation succeeded after 30 sec.

The CMA allocation path protects against the migration type change
race using zone->lock, but all the GUP path needs to know is whether
the page is in a CMA area, not its exact migration type.
Thus, we don't need zone->lock; it is enough to check whether the
migration type is either MIGRATE_ISOLATE or MIGRATE_CMA.

Adding the MIGRATE_ISOLATE check to is_pinnable_page() can cause
pinning to be rejected for pages on MIGRATE_ISOLATE pageblocks even
when they are in neither a CMA area nor the movable zone, if the page
is temporarily unmovable. However, migration failure caused by an
unexpected temporary refcount hold is a general issue, not one
specific to MIGRATE_ISOLATE, and MIGRATE_ISOLATE is itself a transient
state, like other temporarily elevated refcount problems.
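
For illustration only, not part of the patch: a minimal userspace sketch of the check described above. The enum, its values, and the function names are hypothetical stand-ins for the kernel's migratetypes and helpers; the only assumption carried over is that migratetypes are distinct enum values, so each one is compared for equality.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical stand-ins for the kernel's migratetypes. The point is only
 * that they are distinct enum values, not bit flags; the numbers
 * themselves are illustrative.
 */
enum model_migratetype {
	MODEL_UNMOVABLE,
	MODEL_MOVABLE,
	MODEL_CMA,
	MODEL_ISOLATE,
};

/* Model of the check: reject long-term pins on CMA or isolated pageblocks. */
static bool model_is_pinnable(enum model_migratetype mt, bool in_movable_zone)
{
	if (mt == MODEL_CMA || mt == MODEL_ISOLATE)
		return false;
	return !in_movable_zone;
}

/* Walks through the cases discussed in the description. */
static void model_pinnable_demo(void)
{
	/* A CMA pageblock is rejected both before and during isolation. */
	assert(!model_is_pinnable(MODEL_CMA, false));
	assert(!model_is_pinnable(MODEL_ISOLATE, false));

	/*
	 * Side effect: MODEL_ISOLATE alone cannot tell whether the block was
	 * CMA, so an isolated non-CMA block is rejected as well.
	 */

	/* Ordinary memory outside the movable zone stays pinnable. */
	assert(model_is_pinnable(MODEL_UNMOVABLE, false));

	/* Movable-zone pages are rejected, as before. */
	assert(!model_is_pinnable(MODEL_MOVABLE, true));
}
```

Calling model_pinnable_demo() from a test main() exercises all four cases; none of this is kernel code.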

Cc: "Paul E . McKenney" <paulmck@kernel.org>
Cc: David Hildenbrand <david@redhat.com>
Reviewed-by: John Hubbard <jhubbard@nvidia.com>
Signed-off-by: Minchan Kim <minchan@kernel.org>
---
* from v4 - https://lore.kernel.org/all/20220510211743.95831-1-minchan@kernel.org/
  * clarification why we need READ_ONCE - Paul
  * Adding a comment about READ_ONCE - John

* from v3 - https://lore.kernel.org/all/20220509153430.4125710-1-minchan@kernel.org/
  * Fix typo and adding more description - akpm

* from v2 - https://lore.kernel.org/all/20220505064429.2818496-1-minchan@kernel.org/
  * Use __READ_ONCE instead of volatile - akpm

* from v1 - https://lore.kernel.org/all/20220502173558.2510641-1-minchan@kernel.org/
  * fix build warning - lkp
  * fix refetching issue of migration type
  * add side effect on !ZONE_MOVABLE and !MIGRATE_CMA in description - david

 include/linux/mm.h | 17 +++++++++++++++--
 1 file changed, 15 insertions(+), 2 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 6acca5cecbc5..b23c6f1b90b5 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1625,8 +1625,21 @@ static inline bool page_needs_cow_for_dma(struct vm_area_struct *vma,
 #ifdef CONFIG_MIGRATION
 static inline bool is_pinnable_page(struct page *page)
 {
-	return !(is_zone_movable_page(page) || is_migrate_cma_page(page)) ||
-		is_zero_pfn(page_to_pfn(page));
+#ifdef CONFIG_CMA
+	/*
+	 * Defend against future compiler LTO features, or code refactoring
+	 * that inlines the above function, by forcing a single read, because
+	 * this routine races with set_pageblock_migratetype() and we want to
+	 * avoid reading zero when one or the other flag was actually set.
+	 */
+	int __mt = get_pageblock_migratetype(page);
+	int mt = __READ_ONCE(__mt);
+
+	if (mt == MIGRATE_CMA || mt == MIGRATE_ISOLATE)
+		return false;
+#endif
+
+	return !is_zone_movable_page(page) || is_zero_pfn(page_to_pfn(page));
 }
 #else
 static inline bool is_pinnable_page(struct page *page)
--
2.36.0.550.gb090851708-goog
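
For illustration only, not part of the patch: a small userspace sketch of the refetch hazard that the single read is meant to avoid. Everything here is hypothetical (the model_* names, the enum values, and the way a concurrent writer is simulated by handing back a different value on each load). It assumes one possible interleaving in which the pageblock goes from MIGRATE_ISOLATE back to MIGRATE_CMA between two loads.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the migratetypes; values are illustrative. */
enum model_migratetype { MODEL_MOVABLE, MODEL_CMA, MODEL_ISOLATE };

/* A pageblock whose type may differ on each successive load. */
struct model_pageblock {
	const enum model_migratetype *values; /* what each load observes */
	size_t nr_values;
	size_t next;
};

/* Simulated load: returns the next scripted value, then stays on the last. */
static enum model_migratetype model_load(struct model_pageblock *pb)
{
	enum model_migratetype mt = pb->values[pb->next];

	if (pb->next + 1 < pb->nr_values)
		pb->next++;
	return mt;
}

/* Loads the type once per comparison, so the two tests can disagree. */
static bool model_check_refetch(struct model_pageblock *pb)
{
	return model_load(pb) == MODEL_CMA || model_load(pb) == MODEL_ISOLATE;
}

/* Loads the type once and compares the snapshot twice. */
static bool model_check_snapshot(struct model_pageblock *pb)
{
	enum model_migratetype mt = model_load(pb);

	return mt == MODEL_CMA || mt == MODEL_ISOLATE;
}

static void model_refetch_demo(void)
{
	/* The writer flips ISOLATE back to CMA between the two loads. */
	static const enum model_migratetype flip[] = { MODEL_ISOLATE, MODEL_CMA };
	struct model_pageblock a = { flip, 2, 0 };
	struct model_pageblock b = { flip, 2, 0 };

	/* Refetching sees ISOLATE != CMA, then CMA != ISOLATE: a miss. */
	assert(!model_check_refetch(&a));
	/* The snapshot sees ISOLATE once and reports the CMA-area page. */
	assert(model_check_snapshot(&b));
}
```

The snapshot variant is the shape the patch aims for: whichever value the single load observes, both comparisons are made against that same value.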