From: Andrew Morton <akpm@linux-foundation.org>
To: Wen Congyang <wency@cn.fujitsu.com>
Cc: David Rientjes <rientjes@google.com>,
linux-kernel@vger.kernel.org, linux-mm@kvack.org,
linux-doc@vger.kernel.org, Rob Landley <rob@landley.net>,
Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>,
Lai Jiangshan <laijs@cn.fujitsu.com>,
Jiang Liu <jiang.liu@huawei.com>,
KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>,
Minchan Kim <minchan.kim@gmail.com>, Mel Gorman <mgorman@suse.de>,
Yinghai Lu <yinghai@kernel.org>,
"rusty@rustcorp.com.au" <rusty@rustcorp.com.au>
Subject: Re: [PART3 Patch 00/14] introduce N_MEMORY
Date: Wed, 14 Nov 2012 11:52:27 -0800
Message-ID: <20121114115227.8763c3cd.akpm@linux-foundation.org>
In-Reply-To: <50937943.2040302@cn.fujitsu.com>
On Fri, 02 Nov 2012 15:41:55 +0800
Wen Congyang <wency@cn.fujitsu.com> wrote:
> At 11/02/2012 05:36 AM, David Rientjes Wrote:
> > On Thu, 1 Nov 2012, Wen Congyang wrote:
> >
> >>> This doesn't describe why we need the new node state, unfortunately. It
> >>
> >> 1. Sometimes, we use the nodes which contain memory that can be used by
> >> the kernel.
> >> 2. Sometimes, we use the nodes which contain any memory.
> >>
> >> In case 1, we use N_HIGH_MEMORY, and we use N_MEMORY in case 2.
> >>
> >
> > Yeah, that's clear, but the question is still _why_ we want two different
> > nodemasks. I know that this part of the patchset simply introduces the
> > new nodemask because the name "N_MEMORY" is more clear than
> > "N_HIGH_MEMORY", but there's no real incentive for making that change by
> > introducing a new nodemask where a simple rename would suffice.
> >
> > I can only assume that you want to later use one of them for a different
> > purpose: those that do not include nodes that consist of only
> > ZONE_MOVABLE. But that change for MPOL_BIND is nacked since it
> > significantly changes the semantics of set_mempolicy() and you can't break
> > userspace (see my response to that from yesterday). Until that problem is
> > addressed, then there's no reason for the additional nodemask so nack on
> > this series as well.
I cannot locate "my response to that from yesterday". Specificity, please!
>
> I still think that we need two nodemasks: one that stores the nodes which have
> memory that the kernel can use, and one that stores the nodes which have any memory.
>
> For example:
>
> ==========================
> static void *__meminit alloc_page_cgroup(size_t size, int nid)
> {
> 	gfp_t flags = GFP_KERNEL | __GFP_ZERO | __GFP_NOWARN;
> 	void *addr = NULL;
>
> 	addr = alloc_pages_exact_nid(nid, size, flags);
> 	if (addr) {
> 		kmemleak_alloc(addr, size, 1, flags);
> 		return addr;
> 	}
>
> 	if (node_state(nid, N_HIGH_MEMORY))
> 		addr = vzalloc_node(size, nid);
> 	else
> 		addr = vzalloc(size);
>
> 	return addr;
> }
> ==========================
> If the node only has ZONE_MOVABLE memory, we should use vzalloc().
> So we should have a mask that stores the node which has memory that
> the kernel can use.
>
> ==========================
> static int mpol_set_nodemask(struct mempolicy *pol,
> 		const nodemask_t *nodes, struct nodemask_scratch *nsc)
> {
> 	int ret;
>
> 	/* if mode is MPOL_DEFAULT, pol is NULL. This is right. */
> 	if (pol == NULL)
> 		return 0;
> 	/* Check N_HIGH_MEMORY */
> 	nodes_and(nsc->mask1,
> 		  cpuset_current_mems_allowed, node_states[N_HIGH_MEMORY]);
> 	...
> 	if (pol->flags & MPOL_F_RELATIVE_NODES)
> 		mpol_relative_nodemask(&nsc->mask2, nodes, &nsc->mask1);
> 	else
> 		nodes_and(nsc->mask2, *nodes, nsc->mask1);
> 	...
> }
> ==========================
> If the user specifies two nodes, one of which has ZONE_MOVABLE memory and the other
> of which doesn't, nsc->mask2 should contain both nodes. So we should have a mask that
> stores the nodes which have memory.
>
> There may be something wrong in the change for MPOL_BIND, but this patchset is needed.
Well, let's discuss the userspace-visible non-back-compatible mpol
change. What is it, why did it happen, what is its impact, is it
acceptable?
I grabbed "PART1" and "PART2", but that's as far as I got with the six
memory hotplug patch series.
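
Illustrative aside (not part of the original message): the two-nodemask
argument quoted above can be sketched as a small userspace model. This is
only a sketch under stated assumptions. It reuses the names N_HIGH_MEMORY,
N_MEMORY, node_states and node_state for readability, but add_node(),
policy_mask(), vmalloc_target(), MAX_NODES and NO_NODE are hypothetical
helpers invented here, the nodemask is a plain unsigned long, and none of
it is the kernel's implementation or the patchset's code.

```c
#include <assert.h>

/* Toy userspace model of the two node states discussed in the thread. */
enum node_state_kind { N_HIGH_MEMORY, N_MEMORY, NR_NODE_STATE_KINDS };

#define MAX_NODES 8      /* hypothetical limit for this model */
#define NO_NODE   (-1)   /* hypothetical "no node preference" value */

typedef unsigned long nodemask_t;   /* one bit per node, model only */

static nodemask_t node_states[NR_NODE_STATE_KINDS];

/* Return 1 if node nid is set in the given state mask, else 0. */
static int node_state(int nid, enum node_state_kind kind)
{
	assert(nid >= 0 && nid < MAX_NODES);
	return (int)((node_states[kind] >> nid) & 1UL);
}

/*
 * Hypothetical helper: register a node.
 * kernel_mem  != 0: the node has memory the kernel itself can allocate from.
 * movable_mem != 0: the node has ZONE_MOVABLE memory.
 */
static void add_node(int nid, int kernel_mem, int movable_mem)
{
	assert(nid >= 0 && nid < MAX_NODES);
	if (kernel_mem || movable_mem)
		node_states[N_MEMORY] |= 1UL << nid;       /* case 2: any memory */
	if (kernel_mem)
		node_states[N_HIGH_MEMORY] |= 1UL << nid;  /* case 1: kernel-usable */
}

/*
 * Hypothetical helper modelling the mempolicy example: the effective
 * mask keeps every requested, allowed node that has any memory, so a
 * ZONE_MOVABLE-only node is not dropped.
 */
static nodemask_t policy_mask(nodemask_t requested, nodemask_t allowed)
{
	return requested & allowed & node_states[N_MEMORY];
}

/*
 * Hypothetical helper modelling the alloc_page_cgroup() fallback: prefer
 * the node itself only if the kernel can allocate from it, otherwise
 * express no node preference.
 */
static int vmalloc_target(int nid)
{
	return node_state(nid, N_HIGH_MEMORY) ? nid : NO_NODE;
}
```

With node 0 registered as kernel-usable and node 1 as ZONE_MOVABLE-only,
policy_mask() keeps both nodes while vmalloc_target(1) reports no node
preference, which is the distinction the quoted examples rely on.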