From: Dave Hansen <dave.hansen@intel.com>
To: Anshuman Khandual <khandual@linux.vnet.ibm.com>,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org
Cc: mhocko@suse.com, vbabka@suse.cz, mgorman@suse.de,
	minchan@kernel.org, aneesh.kumar@linux.vnet.ibm.com,
	bsingharora@gmail.com, srikar@linux.vnet.ibm.com,
	haren@linux.vnet.ibm.com, jglisse@redhat.com,
	Li Zefan <lizefan@huawei.com>
Subject: Re: [RFC 4/4] mm: Ignore cpuset enforcement when allocation flag has __GFP_THISNODE
Date: Wed, 30 Nov 2016 11:43:39 -0800
Message-ID: <facddba2-ab56-0fea-c608-0bae65e32dbd@intel.com>
In-Reply-To: <583EB52D.3080307@linux.vnet.ibm.com>

On 11/30/2016 03:17 AM, Anshuman Khandual wrote:
> Right, but what is the rationale behind this? This is what is in the in-code
> documentation for this function __cpuset_node_allowed().
> 
>  *	GFP_KERNEL   - any node in enclosing hardwalled cpuset ok
>  
> If the allocation has requested GFP_KERNEL, should it not look across the
> entire system for memory? Does the cpuset still have to be enforced?

Documentation/cgroup-v1/cpusets.txt explains it quite a bit.
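
To make the quoted comment concrete, here is a minimal userspace sketch of
the hardwall rule as that document describes it. It is a model under stated
assumptions, not the kernel code: struct cpuset_model, MODEL_HARDWALL,
nearest_hardwall() and node_allowed_model() are all hypothetical names, and
the real __cpuset_node_allowed() handles further cases (interrupt context,
OOM victims, exiting tasks) that are left out here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_MAX_NODES 32   /* assumption: small fixed node count */
#define MODEL_HARDWALL  0x1u /* stands in for __GFP_HARDWALL */

/* Hypothetical model of a cpuset; not the kernel's struct cpuset. */
struct cpuset_model {
	unsigned long mems_allowed;        /* bit n set => node n allowed */
	bool hardwall;                     /* models mem_exclusive/mem_hardwall */
	const struct cpuset_model *parent; /* NULL for the root cpuset */
};

/* Walk up until a hardwalled cpuset (or the root) is reached. */
static const struct cpuset_model *
nearest_hardwall(const struct cpuset_model *cs)
{
	while (!cs->hardwall && cs->parent != NULL)
		cs = cs->parent;
	return cs;
}

/* Simplified node check for a task that lives in cpuset 'cs'. */
static bool node_allowed_model(const struct cpuset_model *cs, int node,
			       unsigned int flags)
{
	assert(node >= 0 && node < MODEL_MAX_NODES);
	unsigned long bit = 1UL << node;

	if (cs->mems_allowed & bit)
		return true;   /* node is in the task's own cpuset */
	if (flags & MODEL_HARDWALL)
		return false;  /* user-style allocation: no widening */
	/* kernel-style allocation: widen only to the enclosing hardwall */
	return (nearest_hardwall(cs)->mems_allowed & bit) != 0;
}

/* Example hierarchy: root (nodes 0-3) > mid (0-1, hardwalled) > leaf (0). */
static const struct cpuset_model root_cs = { 0xfUL, false, NULL };
static const struct cpuset_model mid_cs  = { 0x3UL, true,  &root_cs };
static const struct cpuset_model leaf_cs = { 0x1UL, false, &mid_cs };
```

With this hierarchy, a task in leaf_cs asking for node 1 is refused when
MODEL_HARDWALL is set and allowed when it is clear, while node 2 is refused
either way. In other words, a GFP_KERNEL-style request is widened only as
far as the enclosing hardwalled cpuset, not to the entire system.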

>> What exactly are the kernel-internal places that need to allocate from
>> the coherent device node?  When would this be done out of the context of
>> an application *asking* for memory in the new node?
> 
> The primary user right now is a driver who wants to move around mapped
> pages of an application from system RAM to CDM nodes and back. If the
> application has requested it through an ioctl(), during migration
> the destination pages will be allocated on the CDM *in* the task context.

Side note: uhh, so you're doing migrate_pages() through some kind of new
ioctl()?  Why?

I think you're actually pointing out a hole in how cpusets currently
work, especially with regard to the workqueue.  I'm not quite sure if
this is by design for migrate_pages() (a task doing migrate_pages() can
allocate pages for a task from a cpuset even though that task isn't able
to allocate from it itself).

> The driver could also have scheduled migration chunks in the work queue
> which can execute later on. IIUC that execution and the corresponding
> allocation into the CDM node will be *out* of context of the task.

Yeah, the current->mems_allowed in __cpuset_node_allowed() does seem
rather wrong for something happening in another task's context.
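
A rough userspace sketch of the problem, assuming a much-simplified check
that only tests a node bit in the running task's mask. Everything here is
hypothetical: struct task_model, current_model, node_allowed_current() and
node_allowed_for() are illustration names, and node_allowed_for() is not an
existing kernel interface, just one way the check could take the owning task
explicitly.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_MAX_NODES 32 /* assumption: small fixed node count */

/* Hypothetical stand-in for the bits of task_struct that matter here. */
struct task_model {
	const char *comm;           /* task name, for readability only */
	unsigned long mems_allowed; /* bit n set => node n allowed */
};

/* Models the implicit 'current' pointer: whoever is running right now. */
static const struct task_model *current_model;

/* Check in the style of current->mems_allowed: implicit context. */
static bool node_allowed_current(int node)
{
	assert(node >= 0 && node < MODEL_MAX_NODES);
	return ((current_model->mems_allowed >> node) & 1UL) != 0;
}

/* Hypothetical alternative: check the task the work is done for. */
static bool node_allowed_for(const struct task_model *owner, int node)
{
	assert(node >= 0 && node < MODEL_MAX_NODES);
	return ((owner->mems_allowed >> node) & 1UL) != 0;
}

/* The application is confined to nodes 0-1; the worker may use 0-3. */
static const struct task_model app_task     = { "app",     0x3UL };
static const struct task_model kworker_task = { "kworker", 0xfUL };
```

If current_model points at app_task (the ioctl() case), a request for node 2
is refused. If the same migration is deferred to a workqueue, current_model
points at kworker_task and node 2 is allowed, although the pages belong to
the application. node_allowed_for(&app_task, 2) gives the same answer in
both contexts.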

