From: Wen Congyang <wency@cn.fujitsu.com>
To: David Rientjes <rientjes@google.com>
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	linux-doc@vger.kernel.org, Rob Landley <rob@landley.net>,
	Andrew Morton <akpm@linux-foundation.org>,
	Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>,
	Lai Jiangshan <laijs@cn.fujitsu.com>,
	Jiang Liu <jiang.liu@huawei.com>,
	KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>,
	Minchan Kim <minchan.kim@gmail.com>, Mel Gorman <mgorman@suse.de>,
	Yinghai Lu <yinghai@kernel.org>,
	"rusty@rustcorp.com.au" <rusty@rustcorp.com.au>
Subject: Re: [PART3 Patch 00/14] introduce N_MEMORY
Date: Fri, 02 Nov 2012 15:41:55 +0800
Message-ID: <50937943.2040302@cn.fujitsu.com>
In-Reply-To: <alpine.DEB.2.00.1211011431130.19373@chino.kir.corp.google.com>

At 11/02/2012 05:36 AM, David Rientjes Wrote:
> On Thu, 1 Nov 2012, Wen Congyang wrote:
> 
>>> This doesn't describe why we need the new node state, unfortunately.  It 
>>
>> 1. Sometimes, we use the node which contains the memory that can be used by
>>    kernel.
>> 2. Sometimes, we use the node which contains the memory.
>>
>> In case1, we use N_HIGH_MEMORY, and we use N_MEMORY in case2.
>>
> 
> Yeah, that's clear, but the question is still _why_ we want two different 
> nodemasks.  I know that this part of the patchset simply introduces the 
> new nodemask because the name "N_MEMORY" is more clear than 
> "N_HIGH_MEMORY", but there's no real incentive for making that change by 
> introducing a new nodemask where a simple rename would suffice.
> 
> I can only assume that you want to later use one of them for a different 
> purpose: those that do not include nodes that consist of only 
> ZONE_MOVABLE.  But that change for MPOL_BIND is nacked since it 
> significantly changes the semantics of set_mempolicy() and you can't break 
> userspace (see my response to that from yesterday).  Until that problem is 
> addressed, then there's no reason for the additional nodemask so nack on 
> this series as well.
> 

I still think that we need two nodemasks: one that stores the nodes which have
memory that the kernel can use, and one that stores the nodes which have any memory.

For example:

==========================
static void *__meminit alloc_page_cgroup(size_t size, int nid)
{
	gfp_t flags = GFP_KERNEL | __GFP_ZERO | __GFP_NOWARN;
	void *addr = NULL;

	addr = alloc_pages_exact_nid(nid, size, flags);
	if (addr) {
		kmemleak_alloc(addr, size, 1, flags);
		return addr;
	}

	if (node_state(nid, N_HIGH_MEMORY))
		addr = vzalloc_node(size, nid);
	else
		addr = vzalloc(size);

	return addr;
}
==========================
If the node has only ZONE_MOVABLE memory, the kernel cannot allocate its own
data from that node, so we should use vzalloc() instead of vzalloc_node().
So we need a mask that stores the nodes which have memory that the kernel
can use.
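
As a rough illustration of this first case, here is a minimal user-space sketch
(not kernel code). It assumes each node state can be modelled as a plain bitmask,
and every model_* name is hypothetical, made up for this example only. It mirrors
the fallback decision in alloc_page_cgroup(): a node-local fallback only makes
sense when the node is in the "kernel can use this memory" mask.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model; these are NOT the kernel definitions. */
enum model_node_state {
	MODEL_N_HIGH_MEMORY,	/* node has memory the kernel can use */
	MODEL_N_MEMORY,		/* node has any memory, ZONE_MOVABLE included */
	MODEL_NR_STATES
};

#define MODEL_MAX_NODES 32

/* One bitmask per state; bit nid is set when node nid is in that state. */
static unsigned int model_node_states[MODEL_NR_STATES];

static void model_node_set_state(int nid, enum model_node_state state)
{
	assert(nid >= 0 && nid < MODEL_MAX_NODES);
	model_node_states[state] |= 1u << nid;
}

static bool model_node_state(int nid, enum model_node_state state)
{
	assert(nid >= 0 && nid < MODEL_MAX_NODES);
	return (model_node_states[state] >> nid) & 1u;
}

/* Register a node; movable_only means all of its memory is ZONE_MOVABLE. */
static void model_add_node(int nid, bool movable_only)
{
	model_node_set_state(nid, MODEL_N_MEMORY);
	if (!movable_only)
		model_node_set_state(nid, MODEL_N_HIGH_MEMORY);
}

/*
 * Models the fallback branch of alloc_page_cgroup(): returns true when the
 * node-local variant (vzalloc_node) is appropriate, false when the caller
 * should fall back to an allocation from any node (vzalloc).
 */
static bool model_use_node_local_fallback(int nid)
{
	return model_node_state(nid, MODEL_N_HIGH_MEMORY);
}
```

With a normal node 0 and a movable-only node 1 registered through
model_add_node(), node 1 is still present in the "has memory" mask, but
model_use_node_local_fallback(1) returns false.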

==========================
static int mpol_set_nodemask(struct mempolicy *pol,
		     const nodemask_t *nodes, struct nodemask_scratch *nsc)
{
	int ret;

	/* if mode is MPOL_DEFAULT, pol is NULL. This is right. */
	if (pol == NULL)
		return 0;
	/* Check N_HIGH_MEMORY */
	nodes_and(nsc->mask1,
		  cpuset_current_mems_allowed, node_states[N_HIGH_MEMORY]);
...
		if (pol->flags & MPOL_F_RELATIVE_NODES)
			mpol_relative_nodemask(&nsc->mask2, nodes,&nsc->mask1);
		else
			nodes_and(nsc->mask2, *nodes, nsc->mask1);
...
}
==========================
If the user specifies 2 nodes, one that has only ZONE_MOVABLE memory and one that
doesn't, nsc->mask2 should contain both nodes. So we should have a mask that stores
the nodes which have memory.
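
A small user-space sketch of this second case (again not kernel code; the
pol_model_* names and the 32-bit mask type are assumptions for illustration).
It models the non-relative branch, nodes_and(nsc->mask2, *nodes, nsc->mask1),
with mask1 computed from either of the two candidate state masks:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for nodemask_t: one bit per node, up to 32 nodes. */
typedef uint32_t pol_model_nodemask_t;

static pol_model_nodemask_t pol_model_node_bit(int nid)
{
	assert(nid >= 0 && nid < 32);
	return (pol_model_nodemask_t)1 << nid;
}

/*
 * Models the two nodes_and() steps in mpol_set_nodemask():
 *   mask1 = mems_allowed & state_mask
 *   mask2 = user_nodes   & mask1
 * and returns mask2, the nodes the policy ends up with.
 */
static pol_model_nodemask_t
pol_model_effective_nodes(pol_model_nodemask_t user_nodes,
			  pol_model_nodemask_t mems_allowed,
			  pol_model_nodemask_t state_mask)
{
	pol_model_nodemask_t mask1 = mems_allowed & state_mask;

	return user_nodes & mask1;
}
```

If node 0 has normal memory and node 1 has only ZONE_MOVABLE memory, passing a
"has memory" mask with both bits set keeps both user-specified nodes, while
passing a "kernel can use" mask with only bit 0 set silently drops node 1.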

There may be something wrong in the change for MPOL_BIND, but this patchset is still needed.

Thanks
Wen Congyang
