linux-kernel.vger.kernel.org archive mirror
From: Dave Hansen <dave.hansen@intel.com>
To: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>,
	Anshuman Khandual <khandual@linux.vnet.ibm.com>,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org
Cc: mhocko@suse.com, js1304@gmail.com, vbabka@suse.cz,
	mgorman@suse.de, minchan@kernel.org, akpm@linux-foundation.org,
	bsingharora@gmail.com
Subject: Re: [RFC 5/8] mm: Add new flag VM_CDM for coherent device memory
Date: Tue, 25 Oct 2016 13:01:29 -0700
Message-ID: <580FBA19.9050504@intel.com>
In-Reply-To: <87pomojkvu.fsf@linux.vnet.ibm.com>

On 10/25/2016 12:20 PM, Aneesh Kumar K.V wrote:
> Dave Hansen <dave.hansen@intel.com> writes:
>> On 10/23/2016 09:31 PM, Anshuman Khandual wrote:
>>> VMAs containing coherent device memory should be marked with VM_CDM. These
>>> VMAs need to be identified in various core kernel paths and this new flag
>>> will help in this regard.
>>
>> ... and it's sticky?  So if a VMA *ever* has one of these funky pages in
>> it, it's stuck being VM_CDM forever?  Never to be merged with other
>> VMAs?  Never to see the light of autonuma ever again?
>>
>> What if a 100TB VMA has one page of fancy pants device memory, and the
>> rest normal vanilla memory?  Do we really want to consider the whole
>> thing fancy?
> 
> This definitely needs fine tuning. I guess we should look at this as
> possibly stating that, coherent device would like to not participate in
> auto numa balancing
...

Right, in this one, particular case you don't want NUMA balancing.  But,
if you have to take an _explicit_ action to even get access to this
coherent memory (setting a NUMA policy), what keeps that explicit action
from also explicitly disabling NUMA migration?

I really don't think we should tie together the isolation aspect with
anything else, including NUMA balancing.

For instance, on x86, we have the ability for devices to grok the CPU's
page tables, including doing faults.  There's very little to stop us
from doing things like autonuma.

> One possible option is to use a software pte bit (may be steal
> _PAGE_DEVMAP) and prevent a numa pte setup from change_prot_numa().
> ie, if the pfn backing the pte is from coherent device we don't allow
> that to be converted to a prot none pte for numa faults ?

Why would you need to tag individual pages, especially if the VMA has a
policy set on it that disallows migration?

But, even if you did need to identify individual pages from the PTE, you
can easily do:

	page_to_nid(pfn_to_page(pte_pfn(pte)))

and then tell if the node is a fancy-pants device node.
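
As a rough userspace illustration of that lookup chain (not kernel code):
every mock_* name below is a hypothetical stand-in, the tiny memory map
is made up, and the "is this a device node" test is an assumed bitmask
rather than whatever node state the series ends up defining.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace mock-up only: none of these definitions are the kernel's. */
#define MOCK_NR_PFNS   8
#define MOCK_MAX_NODES 4

typedef struct { unsigned long pfn; } mock_pte_t;
struct mock_page { int nid; };

/* Hypothetical memory map: pfns 0-3 on node 0, 4-5 on node 1, 6-7 on node 2. */
static struct mock_page mock_mem_map[MOCK_NR_PFNS] = {
	{0}, {0}, {0}, {0}, {1}, {1}, {2}, {2}
};

/* Assumption: bit n set means node n is a coherent device memory node. */
static const unsigned long mock_cdm_nodes = 1UL << 2;

static unsigned long mock_pte_pfn(mock_pte_t pte)
{
	return pte.pfn;
}

static struct mock_page *mock_pfn_to_page(unsigned long pfn)
{
	assert(pfn < MOCK_NR_PFNS);
	return &mock_mem_map[pfn];
}

static int mock_page_to_nid(const struct mock_page *page)
{
	return page->nid;
}

static bool mock_node_is_cdm(int nid)
{
	assert(nid >= 0 && nid < MOCK_MAX_NODES);
	return (mock_cdm_nodes >> nid) & 1UL;
}

/* PTE -> pfn -> page -> node id, then ask whether that node is a device node. */
bool mock_pte_maps_cdm(mock_pte_t pte)
{
	int nid = mock_page_to_nid(mock_pfn_to_page(mock_pte_pfn(pte)));

	return mock_node_is_cdm(nid);
}
```

The point of the sketch is only that the decision can be derived from
the node the backing page lives on, so no extra per-PTE or per-VMA tag
is needed to answer it.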
