From: Dave Hansen <dave.hansen@intel.com>
To: Michal Hocko <mhocko@kernel.org>, Yang Shi <yang.shi@linux.alibaba.com>
Cc: mgorman@techsingularity.net, riel@surriel.com,
	hannes@cmpxchg.org, akpm@linux-foundation.org,
	keith.busch@intel.com, dan.j.williams@intel.com,
	fengguang.wu@intel.com, fan.du@intel.com, ying.huang@intel.com,
	ziy@nvidia.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [v2 RFC PATCH 0/9] Another Approach to Use PMEM as NUMA Node
Date: Tue, 16 Apr 2019 07:30:20 -0700
Message-ID: <b9b40585-cb59-3d42-bcf8-e59bff77c663@intel.com>
In-Reply-To: <20190416074714.GD11561@dhcp22.suse.cz>

On 4/16/19 12:47 AM, Michal Hocko wrote:
> You definitely have to follow policy. You cannot demote to a node which
> is outside of the cpuset/mempolicy because you are breaking contract
> expected by the userspace. That implies doing a rmap walk.

What *is* the contract with userspace, anyway? :)

Obviously, the preferred policy doesn't have any strict contract.

The strict binding has a bit more of a contract, but it doesn't prevent
swapping.  Strict binding also doesn't keep another app from moving the
memory.

We have a reasonable argument that demotion is better than swapping.
So, we could say that even if a VMA has a strict NUMA policy, demoting
the pages mapped there still beats swapping them or tossing the page
cache.  It's doing them a favor to demote them.

Or, maybe we just need a swap hybrid where demotion moves the page but
keeps it unmapped and in the swap cache.  That way an access gets a
fault and we can promote the page back to where it should be.  That
would be faster than I/O-based swap for sure.
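
As a rough illustration of that hybrid, here is a toy userspace model
(not kernel code).  The names Page, demote() and access() are invented
for this sketch, and the node numbers are arbitrary; it only shows the
intended state changes: demotion moves the page and leaves it unmapped
in the swap cache, and the next access faults and promotes it back to
the node its policy asked for.

```python
from dataclasses import dataclass

# Hypothetical node IDs, chosen only for this sketch.
DRAM_NODE = 0
PMEM_NODE = 1


@dataclass
class Page:
    home_node: int               # node the memory policy wants the page on
    node: int                    # node the page currently resides on
    mapped: bool = True          # is the page mapped in the page tables?
    in_swap_cache: bool = False  # is the page tracked by the swap cache?


def demote(page: Page, target_node: int) -> None:
    """Move the page to target_node, but keep it unmapped and in the swap cache."""
    page.node = target_node
    page.mapped = False
    page.in_swap_cache = True


def access(page: Page) -> str:
    """Touch the page; an unmapped page faults and is promoted back home."""
    if page.mapped:
        return "hit"
    # Fault path: no I/O is needed, the data is still in memory on another node.
    page.node = page.home_node
    page.mapped = True
    page.in_swap_cache = False
    return "fault"


page = Page(home_node=DRAM_NODE, node=DRAM_NODE)
demote(page, PMEM_NODE)
first = access(page)   # faults and promotes the page back to DRAM
second = access(page)  # page is mapped again, so this is a plain hit
```

In this model the policy is only violated between the demotion and the
next access, which is the window a swapped-out page would have spent on
disk anyway.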

Anyway, I agree that the kernel probably shouldn't be moving pages
around willy-nilly with no consideration for memory policies, but users
might give us some wiggle room too.