linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Jerome Glisse <jglisse@redhat.com>
To: Dave Hansen <dave.hansen@intel.com>
Cc: Anshuman Khandual <khandual@linux.vnet.ibm.com>,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	mhocko@suse.com, vbabka@suse.cz, mgorman@suse.de,
	minchan@kernel.org,
	aneesh kumar <aneesh.kumar@linux.vnet.ibm.com>,
	bsingharora@gmail.com, srikar@linux.vnet.ibm.com,
	haren@linux.vnet.ibm.com,
	dan j williams <dan.j.williams@intel.com>
Subject: Re: [RFC V2 12/12] mm: Tag VMA with VM_CDM flag explicitly during mbind(MPOL_BIND)
Date: Wed, 8 Feb 2017 10:04:39 -0500 (EST)	[thread overview]
Message-ID: <1808730857.1637296.1486566279643.JavaMail.zimbra@redhat.com> (raw)
In-Reply-To: <bfb7f080-6f0a-743f-654b-54f41443e44a@intel.com>

> On 01/30/2017 08:36 PM, Anshuman Khandual wrote:
> > On 01/30/2017 11:24 PM, Dave Hansen wrote:
> >> On 01/29/2017 07:35 PM, Anshuman Khandual wrote:
> >>> +		if ((new_pol->mode == MPOL_BIND)
> >>> +			&& nodemask_has_cdm(new_pol->v.nodes))
> >>> +			set_vm_cdm(vma);
> >> So, if you did:
> >>
> >> 	mbind(addr, PAGE_SIZE, MPOL_BIND, all_nodes, ...);
> >> 	mbind(addr, PAGE_SIZE, MPOL_BIND, one_non_cdm_node, ...);
> >>
> >> You end up with a VMA that can never have KSM done on it, etc...  Even
> >> though there's no good reason for it.  I guess /proc/$pid/smaps might be
> >> able to help us figure out what was going on here, but that still seems
> >> like an awful lot of damage.
> > 
> > Agreed, this VMA should not remain tagged after the second call. It does
> > not make sense. For this kind of scenarios we can re-evaluate the VMA
> > tag every time the nodemask change is attempted. But if we are looking for
> > some runtime re-evaluation then we need to steal some cycles are during
> > general VMA processing opportunity points like merging and split to do
> > the necessary re-evaluation. Should do we do these kind two kinds of
> > re-evaluation to be more optimal ?
> 
> I'm still unconvinced that you *need* detection like this.  Scanning big
> VMAs is going to be really painful.
> 
> I thought I asked before but I can't find it in this thread.  But, we
> have explicit interfaces for disabling KSM and khugepaged.  Why do we
> need implicit ones like this in addition to those?
> 

I said it in other part of the thread i think the vma flag is a no go. Because
it try to set something that is orthogonal to vma. That you want some vma to
use device memory on new allocation is a valid policy for a vma to have. But to
have a flag that say various kernel subsystem hey my memory is special skip me
is wrong.

The fact that you want to exclude device memory from KSM or autonuma is valid but
it should be done at struct page level ie KSM or autonuma should check the type
of page before doing anything. For CDM pages they would skip. It could be the flags
idea that was discussed.

The overhead of doing it at page level is far lower than trying to manage a vma
flags with all the issue related to vma merging, splitting and lifetime of such
flags. Moreover this flags is an all or nothing, it does not consider the case
where you have as much regular page as CDM page in a vma. It would block regular
page from under going the usual KSM/autonuma ...

I do strongly believe that this vma flag is a bad idea.

Cheers,
Jérôme

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  parent reply	other threads:[~2017-02-08 15:04 UTC|newest]

Thread overview: 58+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-01-30  3:35 [RFC V2 00/12] Define coherent device memory node Anshuman Khandual
2017-01-30  3:35 ` [RFC V2 01/12] mm: Define coherent device memory (CDM) node Anshuman Khandual
2017-01-30  3:35 ` [RFC V2 02/12] mm: Isolate HugeTLB allocations away from CDM nodes Anshuman Khandual
2017-01-30 17:19   ` Dave Hansen
2017-01-31  1:03     ` Anshuman Khandual
2017-01-31  1:37       ` Dave Hansen
2017-02-01 13:59         ` Anshuman Khandual
2017-02-01 19:01           ` Dave Hansen
2017-01-30  3:35 ` [RFC V2 03/12] mm: Change generic FALLBACK zonelist creation process Anshuman Khandual
2017-01-30 17:34   ` Dave Hansen
2017-01-31  1:36     ` Anshuman Khandual
2017-01-31  1:57       ` Dave Hansen
2017-01-31  7:25         ` John Hubbard
2017-01-31 18:04           ` Dave Hansen
2017-01-31 19:14             ` David Nellans
2017-02-01  6:56             ` Anshuman Khandual
2017-02-01  6:46           ` Anshuman Khandual
2017-02-01  6:40         ` Anshuman Khandual
2017-01-30  3:35 ` [RFC V2 04/12] mm: Change mbind(MPOL_BIND) implementation for CDM nodes Anshuman Khandual
2017-01-30  3:35 ` [RFC V2 05/12] cpuset: Add cpuset_inc() inside cpuset_init() Anshuman Khandual
2017-01-30 17:36   ` Dave Hansen
2017-01-30 20:30   ` Mel Gorman
2017-01-31 14:22     ` [RFC] cpuset: Enable changing of top_cpuset's mems_allowed nodemask Anshuman Khandual
2017-01-31 16:00       ` Mel Gorman
2017-02-01  7:31         ` Anshuman Khandual
2017-02-01  8:53           ` Michal Hocko
2017-02-01  9:18           ` Mel Gorman
2017-01-31 14:36     ` [RFC V2 05/12] cpuset: Add cpuset_inc() inside cpuset_init() Vlastimil Babka
2017-01-31 15:30       ` Anshuman Khandual
2017-01-30  3:35 ` [RFC V2 06/12] mm: Exclude CDM nodes from task->mems_allowed and root cpuset Anshuman Khandual
2017-01-30  3:35 ` [RFC V2 07/12] mm: Ignore cpuset enforcement when allocation flag has __GFP_THISNODE Anshuman Khandual
2017-01-30  3:35 ` [RFC V2 08/12] mm: Add new VMA flag VM_CDM Anshuman Khandual
2017-01-30 18:52   ` Jerome Glisse
2017-01-31  4:22     ` Anshuman Khandual
2017-01-31  6:05       ` Jerome Glisse
2017-01-30  3:35 ` [RFC V2 09/12] mm: Exclude CDM marked VMAs from auto NUMA Anshuman Khandual
2017-01-30  3:35 ` [RFC V2 10/12] mm: Ignore madvise(MADV_MERGEABLE) request for VM_CDM marked VMAs Anshuman Khandual
2017-01-30  3:35 ` [RFC V2 11/12] mm: Tag VMA with VM_CDM flag during page fault Anshuman Khandual
2017-01-30 17:51   ` Dave Hansen
2017-01-31  5:10     ` Anshuman Khandual
2017-01-31 17:54       ` Dave Hansen
2017-01-30  3:35 ` [RFC V2 12/12] mm: Tag VMA with VM_CDM flag explicitly during mbind(MPOL_BIND) Anshuman Khandual
2017-01-30 17:54   ` Dave Hansen
2017-01-31  4:36     ` Anshuman Khandual
2017-02-07 18:07       ` Dave Hansen
2017-02-08 14:13         ` Anshuman Khandual
2017-02-08 15:04         ` Jerome Glisse [this message]
2017-01-30  3:35 ` [DEBUG 13/21] powerpc/mm: Identify coherent device memory nodes during platform init Anshuman Khandual
2017-01-30  3:35 ` [DEBUG 14/21] powerpc/mm: Create numa nodes for hotplug memory Anshuman Khandual
2017-01-30  3:35 ` [DEBUG 15/21] powerpc/mm: Enable CONFIG_MOVABLE_NODE for PPC64 platform Anshuman Khandual
2017-01-30  3:35 ` [DEBUG 16/21] mm: Enable CONFIG_MOVABLE_NODE on powerpc Anshuman Khandual
2017-01-30  3:35 ` [DEBUG 17/21] mm: Export definition of 'zone_names' array through mmzone.h Anshuman Khandual
2017-01-30  3:35 ` [DEBUG 18/21] mm: Add debugfs interface to dump each node's zonelist information Anshuman Khandual
2017-01-30  3:36 ` [DEBUG 19/21] mm: Add migrate_virtual_range migration interface Anshuman Khandual
2017-01-30  3:36 ` [DEBUG 20/21] drivers: Add two drivers for coherent device memory tests Anshuman Khandual
2017-01-30  3:36 ` [DEBUG 21/21] selftests/powerpc: Add a script to perform random VMA migrations Anshuman Khandual
2017-01-31  5:48 ` [RFC V2 00/12] Define coherent device memory node Anshuman Khandual
2017-01-31  6:15   ` Jerome Glisse

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1808730857.1637296.1486566279643.JavaMail.zimbra@redhat.com \
    --to=jglisse@redhat.com \
    --cc=aneesh.kumar@linux.vnet.ibm.com \
    --cc=bsingharora@gmail.com \
    --cc=dan.j.williams@intel.com \
    --cc=dave.hansen@intel.com \
    --cc=haren@linux.vnet.ibm.com \
    --cc=khandual@linux.vnet.ibm.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mgorman@suse.de \
    --cc=mhocko@suse.com \
    --cc=minchan@kernel.org \
    --cc=srikar@linux.vnet.ibm.com \
    --cc=vbabka@suse.cz \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).