From: "Huang, Ying" <ying.huang@intel.com>
To: Yang Shi <shy828301@gmail.com>
Cc: Dave Hansen <dave.hansen@intel.com>,
Dave Hansen <dave.hansen@linux.intel.com>,
Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
Yang Shi <yang.shi@linux.alibaba.com>,
"David Rientjes" <rientjes@google.com>,
Dan Williams <dan.j.williams@intel.com>,
Linux-MM <linux-mm@kvack.org>
Subject: Re: [RFC][PATCH 5/9] mm/migrate: demote pages during reclaim
Date: Fri, 21 Aug 2020 08:57:50 +0800
Message-ID: <87v9hcvmr5.fsf@yhuang-dev.intel.com>
In-Reply-To: <CAHbLzkrjxm38VV+ibQxoQkC4nW7F13aJcL5RypUchX30rqUstA@mail.gmail.com> (Yang Shi's message of "Thu, 20 Aug 2020 09:26:57 -0700")
Yang Shi <shy828301@gmail.com> writes:
> On Thu, Aug 20, 2020 at 8:22 AM Dave Hansen <dave.hansen@intel.com> wrote:
>>
>> On 8/20/20 1:06 AM, Huang, Ying wrote:
>> >> + /* Migrate pages selected for demotion */
>> >> + nr_reclaimed += demote_page_list(&ret_pages, &demote_pages, pgdat, sc);
>> >> +
>> >> pgactivate = stat->nr_activate[0] + stat->nr_activate[1];
>> >>
>> >> mem_cgroup_uncharge_list(&free_pages);
>> >> _
>> > Generally, it's good to batch the page migration. But one side effect
>> > is that, if the pages are failed to be migrated, they will be placed
>> > back to the LRU list instead of falling back to be reclaimed really.
>> > This may cause some issue in some situation. For example, if there's no
>> > enough space in the PMEM (slow) node, so the page migration fails, OOM
>> > may be triggered, because the direct reclaiming on the DRAM (fast) node
>> > may make no progress, while it can reclaim some pages really before.
>>
>> Yes, agreed.
>
> Kind of. But I think that should be transient and very rare. The
> kswapd on pmem nodes will be waken up to drop pages when we try to
> allocate migration target pages. It should be very rare that there is
> not reclaimable page on pmem nodes.
>
>>
>> There are a couple of ways we could fix this. Instead of splicing
>> 'demote_pages' back into 'ret_pages', we could try to get them back on
>> 'page_list' and goto the beginning on shrink_page_list(). This will
>> probably yield the best behavior, but might be a bit ugly.
>>
>> We could also add a field to 'struct scan_control' and just stop trying
>> to migrate after it has failed one or more times. The trick will be
>> picking a threshold that doesn't mess with either the normal reclaim
>> rate or the migration rate.
>
> In my patchset I implemented a fallback mechanism via adding a new
> PGDAT_CONTENDED node flag. Please check this out:
> https://patchwork.kernel.org/patch/10993839/.
>
> Basically the PGDAT_CONTENDED flag will be set once migrate_pages()
> return -ENOMEM which indicates the target pmem node is under memory
> pressure, then it would fallback to regular reclaim path. The flag
> would be cleared by clear_pgdat_congested() once the pmem node memory
> pressure is gone.
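There may be a race between setting and clearing the flag. For example:

- Reclaim on the DRAM node tries to migrate some pages from the DRAM
  node to the PMEM node.
- There are not enough free pages on the PMEM node, so kswapd is woken
  up.
- kswapd on the PMEM node reclaims some pages and tries to clear
  PGDAT_CONTENDED on the DRAM node.
- PGDAT_CONTENDED is then set on the DRAM node.

The flag can therefore end up set even though the PMEM node has free
pages again.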
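
To make the ordering concrete, here is a minimal user-space sketch that
replays the interleaving above deterministically. It is not kernel code:
struct node_state, its fields, and all function names are hypothetical
stand-ins, and the "contended" field only models PGDAT_CONTENDED.

```c
#include <stdbool.h>

/* Hypothetical model of the two nodes; not the kernel's pg_data_t. */
struct node_state {
	bool contended;  /* models PGDAT_CONTENDED on the DRAM node */
	long pmem_free;  /* free pages on the PMEM node */
};

/* Steps 1-2: the migration attempt finds no free page on PMEM. */
static bool migration_fails(const struct node_state *n)
{
	return n->pmem_free == 0;
}

/* Step 3: kswapd on PMEM reclaims nr pages, then clears the flag. */
static void kswapd_reclaim_and_clear(struct node_state *n, long nr)
{
	n->pmem_free += nr;
	n->contended = false;
}

/* Step 4: the reclaimer sets the flag based on its earlier failure. */
static void set_contended(struct node_state *n)
{
	n->contended = true;
}

/* Returns 1 if the flag is left set while PMEM has free pages. */
int replay_race(void)
{
	struct node_state n = { .contended = false, .pmem_free = 0 };
	bool failed = migration_fails(&n);   /* observation is taken here */

	kswapd_reclaim_and_clear(&n, 32);    /* clear runs before the set */
	if (failed)
		set_contended(&n);           /* set acts on stale state */

	return n.contended && n.pmem_free > 0;
}
```

In this model replay_race() reports a stale flag because the clear runs
between the failed migration and the set. In the kernel the two steps
would run on different CPUs, so the same ordering is possible without
any extra synchronization.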
This may be resolvable. But I still prefer to fall back directly to
real page reclaim for the pages that failed to migrate. That looks
more robust.
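
A rough user-space sketch of that preference, assuming a per-page
decision: a page that cannot be demoted is reclaimed in the same pass
instead of going back to the LRU. Every name here (sim_page,
sim_migrate, demote_or_reclaim) is hypothetical and only illustrates the
control flow; it is not the patch's demote_page_list().

```c
#include <stddef.h>

enum page_fate { PAGE_DEMOTED, PAGE_RECLAIMED, PAGE_KEPT };

/* Hypothetical page model; "reclaimable" means it could be freed now. */
struct sim_page {
	int reclaimable;
	enum page_fate fate;
};

/* Hypothetical migration: succeeds while the target node has room. */
static int sim_migrate(struct sim_page *p, long *target_free)
{
	if (*target_free <= 0)
		return -1;               /* models a failed migration */
	(*target_free)--;
	p->fate = PAGE_DEMOTED;
	return 0;
}

/*
 * Try to demote each page; on failure fall back to reclaiming it.
 * Returns the number of pages freed from the fast node either way.
 */
size_t demote_or_reclaim(struct sim_page *pages, size_t n, long target_free)
{
	size_t progress = 0;

	for (size_t i = 0; i < n; i++) {
		if (sim_migrate(&pages[i], &target_free) == 0) {
			progress++;
			continue;
		}
		if (pages[i].reclaimable) {
			pages[i].fate = PAGE_RECLAIMED;  /* fallback path */
			progress++;
		} else {
			pages[i].fate = PAGE_KEPT;       /* back to the LRU */
		}
	}
	return progress;
}
```

With this shape, reclaim on the DRAM node still makes progress when the
PMEM node is full, and no cross-node flag has to be kept in sync.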
Best Regards,
Huang, Ying
> We already use node flags to indicate the state of node in reclaim
> code, i.e. PGDAT_WRITEBACK, PGDAT_DIRTY, etc. So, adding a new flag
> sounds more straightforward to me IMHO.
>
>>
>> This is on my list to fix up next.
>>