From: Alistair Popple <apopple@nvidia.com>
To: "Wang, Haiyue" <haiyue.wang@intel.com>
Cc: "linux-mm@kvack.org" <linux-mm@kvack.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"akpm@linux-foundation.org" <akpm@linux-foundation.org>,
"david@redhat.com" <david@redhat.com>,
"linmiaohe@huawei.com" <linmiaohe@huawei.com>,
"Huang, Ying" <ying.huang@intel.com>,
"songmuchun@bytedance.com" <songmuchun@bytedance.com>,
"naoya.horiguchi@linux.dev" <naoya.horiguchi@linux.dev>,
"alex.sierra@amd.com" <alex.sierra@amd.com>,
Felix Kuehling <Felix.Kuehling@amd.com>
Subject: Re: [PATCH v5 2/2] mm: fix the handling Non-LRU pages returned by follow_page
Date: Tue, 16 Aug 2022 12:45:23 +1000
Message-ID: <87mtc4vr64.fsf@nvdebian.thelocal>
In-Reply-To: <BYAPR11MB34954F8869F88E8BD34035B0F76B9@BYAPR11MB3495.namprd11.prod.outlook.com>
"Wang, Haiyue" <haiyue.wang@intel.com> writes:
>> -----Original Message-----
>> From: Alistair Popple <apopple@nvidia.com>
>> Sent: Tuesday, August 16, 2022 08:01
>> To: Wang, Haiyue <haiyue.wang@intel.com>
>> Cc: linux-mm@kvack.org; linux-kernel@vger.kernel.org; akpm@linux-foundation.org; david@redhat.com;
>> linmiaohe@huawei.com; Huang, Ying <ying.huang@intel.com>; songmuchun@bytedance.com;
>> naoya.horiguchi@linux.dev; alex.sierra@amd.com; Felix Kuehling <Felix.Kuehling@amd.com>
>> Subject: Re: [PATCH v5 2/2] mm: fix the handling Non-LRU pages returned by follow_page
>>
>>
>> Haiyue Wang <haiyue.wang@intel.com> writes:
>>
>> > The handling Non-LRU pages returned by follow_page() jumps directly, it
>> > doesn't call put_page() to handle the reference count, since 'FOLL_GET'
>> > flag for follow_page() has get_page() called. Fix the zone device page
>> > check by handling the page reference count correctly before returning.
>> >
>> > And as David reviewed, "device pages are never PageKsm pages". Drop this
>> > zone device page check for break_ksm().
>> >
>> > Fixes: 3218f8712d6b ("mm: handling Non-LRU pages returned by vm_normal_pages")
>> > Signed-off-by: Haiyue Wang <haiyue.wang@intel.com>
>> > ---
>> > mm/huge_memory.c | 4 ++--
>> > mm/ksm.c | 12 +++++++++---
>> > mm/migrate.c | 10 +++++++---
>> > 3 files changed, 18 insertions(+), 8 deletions(-)
>> >
>> > diff --git a/mm/huge_memory.c b/mm/huge_memory.c
>> > index 8a7c1b344abe..b2ba17c3dcd7 100644
>> > --- a/mm/huge_memory.c
>> > +++ b/mm/huge_memory.c
>> > @@ -2963,10 +2963,10 @@ static int split_huge_pages_pid(int pid, unsigned long vaddr_start,
>> > /* FOLL_DUMP to ignore special (like zero) pages */
>> > page = follow_page(vma, addr, FOLL_GET | FOLL_DUMP);
>> >
>> > - if (IS_ERR_OR_NULL(page) || is_zone_device_page(page))
>> > + if (IS_ERR_OR_NULL(page))
>> > continue;
>> >
>> > - if (!is_transparent_hugepage(page))
>> > + if (is_zone_device_page(page) || !is_transparent_hugepage(page))
>> > goto next;
>> >
>> > total++;
>> > diff --git a/mm/ksm.c b/mm/ksm.c
>> > index 42ab153335a2..e26f57fc1f0e 100644
>> > --- a/mm/ksm.c
>> > +++ b/mm/ksm.c
>> > @@ -475,7 +475,7 @@ static int break_ksm(struct vm_area_struct *vma, unsigned long addr)
>> > cond_resched();
>> > page = follow_page(vma, addr,
>> > FOLL_GET | FOLL_MIGRATION | FOLL_REMOTE);
>> > - if (IS_ERR_OR_NULL(page) || is_zone_device_page(page))
>> > + if (IS_ERR_OR_NULL(page))
>> > break;
>> > if (PageKsm(page))
>> > ret = handle_mm_fault(vma, addr,
>> > @@ -560,12 +560,15 @@ static struct page *get_mergeable_page(struct rmap_item *rmap_item)
>> > goto out;
>> >
>> > page = follow_page(vma, addr, FOLL_GET);
>> > - if (IS_ERR_OR_NULL(page) || is_zone_device_page(page))
>> > + if (IS_ERR_OR_NULL(page))
>> > goto out;
>> > + if (is_zone_device_page(page))
>>
>> Same as for break_ksm() I think we should be able to drop the
>> is_zone_device_page() check here because scan_get_next_rmap_item()
>> already filters out zone device pages.
>>
>
> The 'page' for scan_get_next_rmap_item() is from 'vma' which is NOT MERGEABLE:
> for (; vma; vma = vma->vm_next) {
> if (!(vma->vm_flags & VM_MERGEABLE))
> continue;
>
> The 'page' for get_mergeable_page() is from 'vma' which is MERGEABLE by 'find_mergeable_vma()'

Oh, ok. I'm actually not too familiar with KSM, but I think I follow. If
you think we need to keep the check, by all means do so.

> So they may be different, and unstable_tree_search_insert() shows the logic:
>
> 'page' vs 'tree_page':
>
> tree_page = get_mergeable_page(tree_rmap_item);
> if (!tree_page)
> return NULL;
>
> /*
> * Don't substitute a ksm page for a forked page.
> */
> if (page == tree_page) {
> put_page(tree_page);
> return NULL;
> }
>
> ret = memcmp_pages(page, tree_page);
>
>
>> > + goto out_putpage;
>> > if (PageAnon(page)) {
>> > flush_anon_page(vma, page, addr);
>> > flush_dcache_page(page);
>> > } else {
>> > +out_putpage:
>> > put_page(page);
>> > out:
>> > page = NULL;
>> > @@ -2308,11 +2311,13 @@ static struct rmap_item *scan_get_next_rmap_item(struct page **page)
>> > if (ksm_test_exit(mm))
>> > break;
>> > *page = follow_page(vma, ksm_scan.address, FOLL_GET);
>> > - if (IS_ERR_OR_NULL(*page) || is_zone_device_page(*page)) {
>> > + if (IS_ERR_OR_NULL(*page)) {
>> > ksm_scan.address += PAGE_SIZE;
>> > cond_resched();
>> > continue;
>> > }
>> > + if (is_zone_device_page(*page))
>> > + goto next_page;
>> > if (PageAnon(*page)) {
>> > flush_anon_page(vma, *page, ksm_scan.address);
>> > flush_dcache_page(*page);
>> > @@ -2327,6 +2332,7 @@ static struct rmap_item *scan_get_next_rmap_item(struct page **page)
>> > mmap_read_unlock(mm);
>> > return rmap_item;
>> > }
>> > +next_page:
>> > put_page(*page);
>> > ksm_scan.address += PAGE_SIZE;
>> > cond_resched();
>> > diff --git a/mm/migrate.c b/mm/migrate.c
>> > index 581dfaad9257..fee12cd2f294 100644
>> > --- a/mm/migrate.c
>> > +++ b/mm/migrate.c
>> > @@ -1672,9 +1672,12 @@ static int add_page_for_migration(struct mm_struct *mm, unsigned long addr,
>> > goto out;
>> >
>> > err = -ENOENT;
>> > - if (!page || is_zone_device_page(page))
>> > + if (!page)
>> > goto out;
>> >
>> > + if (is_zone_device_page(page))
>> > + goto out_putpage;
>> > +
>> > err = 0;
>> > if (page_to_nid(page) == node)
>> > goto out_putpage;
>> > @@ -1868,8 +1871,9 @@ static void do_pages_stat_array(struct mm_struct *mm, unsigned long nr_pages,
>> > if (IS_ERR(page))
>> > goto set_status;
>> >
>> > - if (page && !is_zone_device_page(page)) {
>> > - err = page_to_nid(page);
>> > + if (page) {
>> > + err = !is_zone_device_page(page) ? page_to_nid(page)
>> > + : -ENOENT;
>>
>> Can we remove the multiple layers of conditionals here? Something like
>> this is cleaner and easier to understand IMHO:
>
> OK, I will try it in new patch.
Thanks.
>>
>> - if (page && !is_zone_device_page(page)) {
>> - err = page_to_nid(page);
>> - if (foll_flags & FOLL_GET)
>> - put_page(page);
>> - } else {
>> + if (!page) {
>> err = -ENOENT;
>> + goto set_status;
>> }
>> +
>> + if (is_zone_device_page(page))
>> + err = -ENOENT;
>> + else
>> + err = page_to_nid(page);
>> +
>> + if (foll_flags & FOLL_GET)
>> + put_page(page);
>>
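
For readers following along, the flattened control flow suggested above can
be modelled in plain userspace C. This is only a sketch under stated
assumptions: struct fake_page, FAKE_FOLL_GET and page_status() are
hypothetical stand-ins invented for illustration, not kernel definitions, and
the reference count is a plain integer rather than the real page refcount.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct page; NOT the kernel definition. */
struct fake_page {
	int nid;          /* NUMA node id, models page_to_nid() */
	bool zone_device; /* models is_zone_device_page() */
	int refcount;     /* models the page reference count */
};

/* Hypothetical flag standing in for FOLL_GET. */
#define FAKE_FOLL_GET 0x1u

/*
 * Models the suggested shape: handle the NULL case first, pick the status,
 * then drop the reference exactly once on every non-NULL path.
 */
int page_status(struct fake_page *page, unsigned int foll_flags)
{
	int err;

	if (!page)
		return -ENOENT; /* nothing was looked up, nothing to put */

	if (page->zone_device)
		err = -ENOENT;  /* device pages are reported as absent */
	else
		err = page->nid;

	if (foll_flags & FAKE_FOLL_GET)
		page->refcount--; /* models put_page() */

	return err;
}
```

The point of the single put at the end is that the zone device case and the
normal case share one release path, so neither can leak the reference taken
by the lookup.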
>> Thanks.
>>
>> - Alistair
>>
>> > if (foll_flags & FOLL_GET)
>> > put_page(page);
>> > } else {
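
The rule the patch description relies on (a lookup with FOLL_GET takes a
reference, so every early exit after a successful lookup must drop it) can be
sketched in userspace C as below. Everything here is an assumption made for
illustration: fake_follow_page(), fake_put_page() and
fake_get_mergeable_page() are hypothetical models that only mirror the goto
layout of the patched get_mergeable_page(); they are not the kernel functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct page; NOT the kernel definition. */
struct fake_page {
	bool zone_device; /* models is_zone_device_page() */
	bool anon;        /* models PageAnon() */
	int refcount;     /* models the page reference count */
};

/* Models follow_page(..., FOLL_GET): a successful lookup takes a reference. */
static struct fake_page *fake_follow_page(struct fake_page *candidate)
{
	if (candidate)
		candidate->refcount++;
	return candidate;
}

/* Models put_page(): drops one reference. */
static void fake_put_page(struct fake_page *page)
{
	page->refcount--;
}

/*
 * Mirrors the patched control flow: a failed lookup skips the put, while
 * the zone device case jumps to the label that releases the reference.
 */
struct fake_page *fake_get_mergeable_page(struct fake_page *candidate)
{
	struct fake_page *page = fake_follow_page(candidate);

	if (!page)
		goto out;          /* no reference was taken */
	if (page->zone_device)
		goto out_putpage;  /* reference taken, must be dropped */
	if (page->anon)
		return page;       /* caller now owns the reference */
out_putpage:
	fake_put_page(page);
out:
	return NULL;
}
```

In this model, combining the zone device test with the failed-lookup test
and jumping straight to out would leave refcount at 1 for a device page,
which is the leak the patch description refers to.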