linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Oscar Salvador <osalvador@suse.de>
To: Dave Hansen <dave.hansen@linux.intel.com>
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	yang.shi@linux.alibaba.com, rientjes@google.com,
	ying.huang@intel.com, dan.j.williams@intel.com
Subject: Re: [RFC][PATCH 5/9] mm/migrate: demote pages during reclaim
Date: Tue, 27 Oct 2020 16:29:02 +0100	[thread overview]
Message-ID: <20201027152858.GA11135@linux> (raw)
In-Reply-To: <20201007161745.26B1D789@viggo.jf.intel.com>

On Wed, Oct 07, 2020 at 09:17:45AM -0700, Dave Hansen wrote:
> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
> Cc: Yang Shi <yang.shi@linux.alibaba.com>
> Cc: David Rientjes <rientjes@google.com>
> Cc: Huang Ying <ying.huang@intel.com>
> Cc: Dan Williams <dan.j.williams@intel.com>

I am still going through all the details, but just my thoughts on things
that caught my eye:

> --- a/include/linux/migrate.h~demote-with-migrate_pages	2020-10-07 09:15:31.028642442 -0700
> +++ b/include/linux/migrate.h	2020-10-07 09:15:31.034642442 -0700
> @@ -27,6 +27,7 @@ enum migrate_reason {
>  	MR_MEMPOLICY_MBIND,
>  	MR_NUMA_MISPLACED,
>  	MR_CONTIG_RANGE,
> +	MR_DEMOTION,
>  	MR_TYPES

I think you also need to add it under include/trace/events/migrate.h, so
mm_migrate_pages event can know about it.

> +bool migrate_demote_page_ok(struct page *page, struct scan_control *sc)

Make it static?
Also, scan_control seems to be unused here.

> +{
> +	int next_nid = next_demotion_node(page_to_nid(page));
> +
> +	VM_BUG_ON_PAGE(!PageLocked(page), page);

Right after the call to migrate_demote_page_ok, we call unlock_page
which already has this check in place.
I know that this is only to be on the safe side and we do not loss anything,
but just my thoughts.

> +static struct page *alloc_demote_page(struct page *page, unsigned long node)
> +{
> +	/*
> +	 * Try to fail quickly if memory on the target node is not
> +	 * available.  Leaving out __GFP_IO and __GFP_FS helps with
> +	 * this.  If the desintation node is full, we want kswapd to
> +	 * run there so that its pages will get reclaimed and future
> +	 * migration attempts may succeed.
> +	 */
> +	gfp_t flags = (__GFP_HIGHMEM | __GFP_MOVABLE | __GFP_NORETRY |
> +		       __GFP_NOMEMALLOC | __GFP_NOWARN | __GFP_THISNODE |
> +		       __GFP_KSWAPD_RECLAIM);

I think it would be nicer to have this as a real GFP_ thingy defined.
e.g: GFP_DEMOTION

> +	/* HugeTLB pages should not be on the LRU */
> +	WARN_ON_ONCE(PageHuge(page));

I am not sure about this one.
This could only happen if the page, which now it is in another list, ends up in
the buddy system. That is quite unlikely bth.
And nevertheless, this is only a warning, which means that if this scenario gets
to happen, we will be allocating a single page to satisfy a higher-order page, and
I am not sure about the situation we will end up with.

> +
> +	if (PageTransHuge(page)) {
> +		struct page *thp;
> +
> +		flags |= __GFP_COMP;
> +
> +		thp = alloc_pages_node(node, flags, HPAGE_PMD_ORDER);
> +		if (!thp)
> +			return NULL;
> +		prep_transhuge_page(thp);
> +		return thp;
> +	}
> +
> +	return __alloc_pages_node(node, flags, 0);

Would make sense to transform this in some sort of new_demotion_page,
which actually calls alloc_migration_target with the right stuff in place?
And then pass a struct migration_target_control so alloc_migration_target
does the right thing.
alloc_migration_target also takes care of calling prep_transhuge_page
when needed.
e.g:

static struct page *new_demotion_node(struct page *page, unsigned long private)
{
        struct migration_target_control mtc = {
                .nid = private,
                .gfp_mask = GFP_DEMOTION,
        };

        if (PageTransHuge(page))
                mtc.gfp_mask |= __GFP_COMP;

        return alloc_migration_target(page, (unsigned long)&mtc);
}

The only thing I see is that alloc_migration_target seems to "override"
the gfp_mask and does ORs GFP_TRANSHUGE for THP pages, which includes
__GFP_DIRECT_RECLAIM (not appreciated in this case).
But maybe this can be worked around by checking if gfp_mask == GFP_DEMOTION,
and if so, just keep the mask as it is.

> +
> +	if (list_empty(demote_pages))
> +		return 0;
> +
> +	/* Demotion ignores all cpuset and mempolicy settings */
> +	err = migrate_pages(demote_pages, alloc_demote_page, NULL,
> +			    target_nid, MIGRATE_ASYNC, MR_DEMOTION,
> +			    &nr_succeeded);

As I said, instead of alloc_demote_page, use a new_demote_page and make
alloc_migration_target handle the allocations and prep thp pages.


-- 
Oscar Salvador
SUSE L3


  reply	other threads:[~2020-10-27 15:29 UTC|newest]

Thread overview: 26+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-10-07 16:17 [RFC][PATCH 0/9] [v4][RESEND] Migrate Pages in lieu of discard Dave Hansen
2020-10-07 16:17 ` [RFC][PATCH 1/9] mm/numa: node demotion data structure and lookup Dave Hansen
2020-10-07 16:17 ` [RFC][PATCH 2/9] mm/numa: automatically generate node migration order Dave Hansen
2020-10-07 16:17 ` [RFC][PATCH 3/9] mm/migrate: update migration order during on hotplug events Dave Hansen
2020-10-07 18:08   ` osalvador
2020-10-07 16:17 ` [RFC][PATCH 4/9] mm/migrate: make migrate_pages() return nr_succeeded Dave Hansen
2020-10-07 16:17 ` [RFC][PATCH 5/9] mm/migrate: demote pages during reclaim Dave Hansen
2020-10-27 15:29   ` Oscar Salvador [this message]
2020-10-27 16:53     ` Yang Shi
2020-10-07 16:17 ` [RFC][PATCH 6/9] mm/vmscan: add page demotion counter Dave Hansen
2020-10-19  7:37   ` Huang, Ying
2020-10-27 16:41     ` Yang Shi
2020-10-28  1:25       ` Huang, Ying
2020-10-07 16:17 ` [RFC][PATCH 7/9] mm/vmscan: Consider anonymous pages without swap Dave Hansen
2020-10-29  8:14   ` Oscar Salvador
2020-10-29 14:33     ` Dave Hansen
2020-10-29 15:57       ` Yang Shi
2020-10-29 19:08         ` osalvador
2020-10-29 19:30           ` Yang Shi
2020-10-07 16:17 ` [RFC][PATCH 8/9] mm/vmscan: never demote for memcg reclaim Dave Hansen
2020-10-07 16:17 ` [RFC][PATCH 9/9] mm/migrate: new zone_reclaim_mode to enable reclaim migration Dave Hansen
2020-10-12 21:30 ` [RFC][PATCH 0/9] [v4][RESEND] Migrate Pages in lieu of discard Yang Shi
     [not found] <20200818184122.29C415DF@viggo.jf.intel.com>
     [not found] ` <20200818184131.C972AFCC@viggo.jf.intel.com>
     [not found]   ` <87lfi9wxk9.fsf@yhuang-dev.intel.com>
2020-08-20 15:21     ` [RFC][PATCH 5/9] mm/migrate: demote pages during reclaim Dave Hansen
2020-08-20 16:26       ` Yang Shi
2020-08-21  0:57         ` Huang, Ying
2020-08-21 16:17           ` Yang Shi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20201027152858.GA11135@linux \
    --to=osalvador@suse.de \
    --cc=dan.j.williams@intel.com \
    --cc=dave.hansen@linux.intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=rientjes@google.com \
    --cc=yang.shi@linux.alibaba.com \
    --cc=ying.huang@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).