All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Li Xinhai" <lixinhai.lxh@gmail.com>
To: "HORIGUCHI NAOYA(堀口 直也)" <naoya.horiguchi@nec.com>,
	"yang.shi" <yang.shi@linux.alibaba.com>,
	"Mike Kravetz" <mike.kravetz@oracle.com>
Cc: "linux-mm@kvack.org" <linux-mm@kvack.org>,
	 akpm <akpm@linux-foundation.org>,  mhocko <mhocko@suse.com>,
	 n-horiguchi <n-horiguchi@ah.jp.nec.com>
Subject: Re: [PATCH 2/2] mm/mempolicy: Skip walking HUGETLB vma if MPOL_MF_STRICT is specified alone
Date: Thu, 16 Jan 2020 23:32:33 +0800	[thread overview]
Message-ID: <20200116233231408902133@gmail.com> (raw)
In-Reply-To: 20200116080729.GA19016@hori.linux.bs1.fc.nec.co.jp

On 2020-01-16 at 16:07 HORIGUCHI NAOYA(堀口 直也) wrote:
>Hi everyone,
>
>Thank you all for finding and digging the issue.
>
>> Summary
>> =======
>> It 'looks' like the statement "MPOL_MF_STRICT is ignored on huge page
>> mappings." is left over from the original mbind implementation.  When
>> the huge page migration support was added, I can not be sure if ignoring
>> MPOL_MF_STRICT for huge pages during the verify/isolation phase was
>> intentional.  It seems like it was as the return value from
>> isolate_huge_page() is ignored.
>
>This summary is totally correct.  I've simply missed considering MPOL_MF_STRICT
>flag when implementing hugetlb migration.  As you pointed out, the discrepacy
>between the manpage and the code is also due to the lack of updates on the
>"MPOL_MF_STRICT is ignored on huge page mappings." statement.
>
>On Wed, Jan 15, 2020 at 01:59:14PM -0800, Yang Shi wrote:
>>
>> On 1/15/20 1:45 PM, Mike Kravetz wrote:
>> > On 1/15/20 1:30 PM, Yang Shi wrote:
>> > > On 1/15/20 1:07 PM, Mike Kravetz wrote:
>> > > > What should we do?
>> > > > ==================
>> > > > 1) Nothing more than optimizations by Li Xinhai.  Behavior that could be
>> > > >      seen as conflicting with man page has existed since v3.12 and I am
>> > > >      not aware of any complaints.
>> > > > 2) In addition to optimizations by Li Xinhai, modify code to truly ignore
>> > > >      MPOL_MF_STRICT for huge page mappings.  This would be fairly easy to do
>> > > >      after a failure of migrate_pages().  We could simply traverse the list
>> > > >      of pages that were not migrated looking for any non-hugetlb page.
>> > > I don't think we can do this easily since migrate_pages() would put the migration failed hugetlb pages back to hugepage_activelist so there should not any hugetlb page reside on the pagelist regardless of failure if I read the code correctly.
>
>Although this behavior seems to me not prevent from finding non-hugetlb
>pages in migration list, this is a difference in migration behavior between
>normal pages and hugepages that might be better to be optimized.
>Maybe hugepages failed to migrate should remain in migration list after
>migrate_pages() returns and the should be put back via putback_movable_pages().
>
>> > >
>> > You are correct.  I made an assumption without actually looking at the code. :(
>> >
>> > > Other than that traversing page list to look for a certain type page doesn't sound scalable to me.
>> > >
>> > > > 3) Remove the statement "MPOL_MF_STRICT is ignored on huge page mappings."
>> > > >      and modify code accordingly.
>> > > >
>> > > > My suggestion would be for 1 or 2.  Thoughts?
>> > > By rethinking the history (thanks again for digging into it), it sounds #3 should be more reasonable. It sounds like the behavior was changed since hugetlb migration was added but the man page was not updated to reflect the change.
>> > >
>> > Let's hope Naoya comments.  My only concern with #3 is that we will be changing
>> > behavior.  I do not think many people (if any) depend on existing behavior.
>> > However, you can never be sure.
>>
>> Yes, this would change the bahavior, but I don't see why we have to treat
>> hugetlb specially nowadays with migration supported.
>
>(Option #1 is good for short term solution, but eventually) I agree with option #3.
>We have no reason to handle hugetlb differently about MPOL_MF_STRICT flag. 

Thanks. Same thoughts for option #3, but it seems better not change current
behavior. 
Add more about current behavior of code:
- In unmap&move phase, there is no different behavior of handling hugepage
and non-hugepage, that is when STRICT is set, report -EIO if any page
can't move, when STRICT is not set, don't report when failed to move;
- In isolation phase, STRICT is effective for non-hugepge, that means set
STRICT alone will cause -EIO if found misplaced pages, and set STRICT with
MOVE* will cause -EIO if failed to isolate pages; for hugepage, STRICT is
ignored, it don't detect misplaced pages nor report -EIO if isolation failed.

This patch don't change any part of current behavior, only avoids walking
page table, where currently do nothing if STRICT is set alone.

>
>Thanks,
>Naoya Horiguchi

  reply	other threads:[~2020-01-16 15:32 UTC|newest]

Thread overview: 30+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-01-14  9:16 [PATCH 1/2] mm/mempolicy: Checking hstate for hugetlbfs page in vma_migratable Li Xinhai
2020-01-14  9:16 ` [PATCH 2/2] mm/mempolicy: Skip walking HUGETLB vma if MPOL_MF_STRICT is specified alone Li Xinhai
2020-01-14 14:09   ` Li Xinhai
2020-01-14 18:27     ` Yang Shi
2020-01-15  1:07     ` Mike Kravetz
2020-01-15  1:24       ` Yang Shi
2020-01-15  4:28         ` Mike Kravetz
2020-01-15  5:23           ` Yang Shi
2020-01-15  7:36             ` Li Xinhai
2020-01-15 17:16               ` Yang Shi
2020-01-15 21:07       ` Mike Kravetz
2020-01-15 21:30         ` Yang Shi
2020-01-15 21:45           ` Mike Kravetz
2020-01-15 21:59             ` Yang Shi
2020-01-16  8:07               ` HORIGUCHI NAOYA(堀口 直也)
2020-01-16 15:32                 ` Li Xinhai [this message]
2020-01-16  7:59         ` Michal Hocko
2020-01-16 19:22           ` Mike Kravetz
2020-01-17  2:32             ` Yang Shi
2020-01-17  2:38             ` Li Xinhai
2020-01-17  7:57               ` Michal Hocko
2020-01-17 12:05                 ` Li Xinhai
2020-01-17 15:20                   ` Michal Hocko
2020-01-17 15:46                     ` Li Xinhai
2020-01-20 12:45                       ` Michal Hocko
2020-01-21 14:15                         ` Li Xinhai
2020-01-21 14:53                           ` Michal Hocko
2020-01-22 13:55                             ` Li Xinhai
2020-01-14 19:12 ` [PATCH 1/2] mm/mempolicy: Checking hstate for hugetlbfs page in vma_migratable Mike Kravetz
2020-01-15  1:25 ` Andrew Morton

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200116233231408902133@gmail.com \
    --to=lixinhai.lxh@gmail.com \
    --cc=akpm@linux-foundation.org \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@suse.com \
    --cc=mike.kravetz@oracle.com \
    --cc=n-horiguchi@ah.jp.nec.com \
    --cc=naoya.horiguchi@nec.com \
    --cc=yang.shi@linux.alibaba.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.