linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Tang Chen <tangchen@cn.fujitsu.com>
To: Tejun Heo <tj@kernel.org>
Cc: tglx@linutronix.de, mingo@elte.hu, hpa@zytor.com,
	akpm@linux-foundation.org, trenn@suse.de, yinghai@kernel.org,
	jiang.liu@huawei.com, wency@cn.fujitsu.com, laijs@cn.fujitsu.com,
	isimatu.yasuaki@jp.fujitsu.com, izumi.taku@jp.fujitsu.com,
	mgorman@suse.de, minchan@kernel.org, mina86@mina86.com,
	gong.chen@linux.intel.com, vasilis.liaskovitis@profitbricks.com,
	lwoodman@redhat.com, riel@redhat.com, jweiner@redhat.com,
	prarit@redhat.com, zhangyanfei@cn.fujitsu.com,
	yanghy@cn.fujitsu.com, x86@kernel.org, linux-doc@vger.kernel.org,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	linux-acpi@vger.kernel.org
Subject: Re: [PATCH 11/21] x86: get pg_data_t's memory from other node
Date: Wed, 24 Jul 2013 11:52:53 +0800	[thread overview]
Message-ID: <51EF4F95.1050308@cn.fujitsu.com> (raw)
In-Reply-To: <20130723200924.GP21100@mtj.dyndns.org>

On 07/24/2013 04:09 AM, Tejun Heo wrote:
> On Fri, Jul 19, 2013 at 03:59:24PM +0800, Tang Chen wrote:
>> From: Yasuaki Ishimatsu<isimatu.yasuaki@jp.fujitsu.com>
>>
>> If system can create movable node which all memory of the
>> node is allocated as ZONE_MOVABLE, setup_node_data() cannot
>> allocate memory for the node's pg_data_t.
>> So, use memblock_alloc_try_nid() instead of memblock_alloc_nid()
>> to retry when the first allocation fails. Otherwise, the system
>> could failed to boot.
......
>> -	nd_pa = memblock_alloc_nid(nd_size, SMP_CACHE_BYTES, nid);
>> +	nd_pa = memblock_alloc_try_nid(nd_size, SMP_CACHE_BYTES, nid);
>>   	if (!nd_pa) {
>> -		pr_err("Cannot find %zu bytes in node %d\n",
>> -		       nd_size, nid);
>> +		pr_err("Cannot find %zu bytes in any node\n", nd_size);
>
> Hmm... we want the node data to be colocated on the same node and I
> don't think being hotpluggable necessarily requires the node data to
> be allocated on a different node.  Does node data of a hotpluggable
> node need to stay around after hotunplug?
>
> I don't think it's a huge issue but it'd be great if we can clarify
> where the restriction is coming from.
>

You are right, the node data could be on hotpluggable node. And Yinghai
also said pagetable and vmemmap could be on hotpluggable node.

But for now, doing so will break memory hot-remove path. I should have
mentioned so in the log, which I didn't do.

A node could have several memory devices. And the device who holds node
data should be hot-removed in the last place. But in NUAM level, we don't
know which memory_block (/sys/devices/system/node/nodeX/memoryXXX) belongs
to which memory device. We only have node. So we can only do node hotplug.

Also as Yinghai's previous patch-set did, he put pagetable on local node.
And we met the same problem. when hot-removing memory, we have to ensure
the memory device containing pagetable being hot-removed in the last place.

But in virtualization, developers are now developing memory hotplug in qemu,
which support a single memory device hotplug. So a whole node hotplug will
not satisfy virtualization users.

At last, we concluded that we'd better do memory hotplug and local node
things (local node node data, pagetable, vmemmap, ...) in two steps.
Please refer to https://lkml.org/lkml/2013/6/19/73

The node data should be on local, I agree with that. I'm not saying I
won't do it. Just for now, it will be complicated to fix memory hot-remove
path. So I think pushing this patch for now, and do the local node things
in the next step.

Thanks.



--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  reply	other threads:[~2013-07-24  3:50 UTC|newest]

Thread overview: 76+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-07-19  7:59 [PATCH 00/21] Arrange hotpluggable memory as ZONE_MOVABLE Tang Chen
2013-07-19  7:59 ` [PATCH 01/21] acpi: Print Hot-Pluggable Field in SRAT Tang Chen
2013-07-23 18:48   ` Tejun Heo
2013-07-23 19:15     ` Joe Perches
2013-07-23 19:20       ` Tejun Heo
2013-07-23 19:26         ` Joe Perches
2013-07-24  1:46     ` Tang Chen
2013-07-19  7:59 ` [PATCH 02/21] memblock, numa: Introduce flag into memblock Tang Chen
2013-07-23 19:09   ` Tejun Heo
2013-07-24  2:53     ` Tang Chen
2013-07-24 15:54       ` Tejun Heo
2013-07-25  6:42         ` Tang Chen
2013-07-19  7:59 ` [PATCH 03/21] x86, acpi, numa, mem-hotplug: Introduce MEMBLK_HOTPLUGGABLE to reserve hotpluggable memory Tang Chen
2013-07-23 19:19   ` Tejun Heo
2013-07-24  2:55     ` Tang Chen
2013-07-19  7:59 ` [PATCH 04/21] acpi: Remove "continue" in macro INVALID_TABLE() Tang Chen
2013-07-23 19:15   ` Tejun Heo
2013-07-19  7:59 ` [PATCH 05/21] acpi: Introduce acpi_invalid_table() to check if a table is invalid Tang Chen
2013-07-19  7:59 ` [PATCH 06/21] x86, acpi: Split acpi_boot_table_init() into two parts Tang Chen
2013-07-19  7:59 ` [PATCH 07/21] x86, acpi: Initialize ACPI root table list earlier Tang Chen
2013-07-19  7:59 ` [PATCH 08/21] x86, acpi: Also initialize signature and length when parsing root table Tang Chen
2013-07-23 19:45   ` Tejun Heo
2013-07-25  6:50     ` Tang Chen
2013-07-19  7:59 ` [PATCH 09/21] x86: Make get_ramdisk_{image|size}() global Tang Chen
2013-07-23 19:56   ` Tejun Heo
2013-07-24  3:12     ` Tang Chen
2013-07-19  7:59 ` [PATCH 10/21] earlycpio.c: Fix the confusing comment of find_cpio_data() Tang Chen
2013-07-23 20:02   ` Tejun Heo
2013-07-24  3:20     ` Tang Chen
2013-07-19  7:59 ` [PATCH 11/21] x86: get pg_data_t's memory from other node Tang Chen
2013-07-23 20:09   ` Tejun Heo
2013-07-24  3:52     ` Tang Chen [this message]
2013-07-24 16:03       ` Tejun Heo
2013-07-19  7:59 ` [PATCH 12/21] x86, acpi: Try to find if SRAT is overrided earlier Tang Chen
2013-07-23 20:27   ` Tejun Heo
2013-07-24  6:57     ` Tang Chen
2013-07-19  7:59 ` [PATCH 13/21] x86, acpi: Try to find SRAT in firmware earlier Tang Chen
2013-07-23 20:49   ` Tejun Heo
2013-07-24 10:12     ` Tang Chen
2013-07-24 15:55       ` Tejun Heo
2013-07-23 23:26   ` Cody P Schafer
2013-07-24 10:16     ` Tang Chen
2013-07-19  7:59 ` [PATCH 14/21] x86, acpi, numa: Reserve hotpluggable memory at early time Tang Chen
2013-07-23 20:55   ` Tejun Heo
2013-07-23 21:32     ` Tejun Heo
2013-07-25  2:13       ` Tang Chen
2013-07-25 15:17         ` Tejun Heo
2013-07-26  3:45           ` Tang Chen
2013-07-26 10:26             ` Tejun Heo
2013-07-26 10:27               ` Tejun Heo
2013-07-29  2:12               ` Tang Chen
2013-07-29 17:10                 ` Tejun Heo
2013-07-19  7:59 ` [PATCH 15/21] x86, acpi, numa: Don't reserve memory on nodes the kernel resides in Tang Chen
2013-07-23 20:59   ` Tejun Heo
2013-07-25  2:34     ` Tang Chen
2013-07-19  7:59 ` [PATCH 16/21] x86, memblock, mem-hotplug: Free hotpluggable memory reserved by memblock Tang Chen
2013-07-23 21:00   ` Tejun Heo
2013-07-25  2:35     ` Tang Chen
2013-07-19  7:59 ` [PATCH 17/21] page_alloc, mem-hotplug: Improve movablecore to {en|dis}able using SRAT Tang Chen
2013-07-23 21:04   ` Tejun Heo
2013-07-23 21:11     ` Tejun Heo
2013-07-25  3:50       ` Tang Chen
2013-07-25 15:09         ` Tejun Heo
2013-07-26  3:58           ` Tang Chen
2013-07-19  7:59 ` [PATCH 18/21] x86, numa: Synchronize nid info in memblock.reserve with numa_meminfo Tang Chen
2013-07-23 21:25   ` Tejun Heo
2013-07-25  4:09     ` Tang Chen
2013-07-25 15:05       ` Tejun Heo
2013-07-26  4:00         ` Tang Chen
2013-07-19  7:59 ` [PATCH 19/21] x86, numa: Save nid when reserve memory into memblock.reserved[] Tang Chen
2013-07-19  7:59 ` [PATCH 20/21] x86, numa, acpi, memory-hotplug: Make movablecore=acpi have higher priority Tang Chen
2013-07-23 21:21   ` Tejun Heo
2013-07-19  7:59 ` [PATCH 21/21] doc, page_alloc, acpi, mem-hotplug: Add doc for movablecore=acpi boot option Tang Chen
2013-07-23 21:21   ` Tejun Heo
2013-07-25  3:53     ` Tang Chen
2013-07-22  2:48 ` [PATCH 00/21] Arrange hotpluggable memory as ZONE_MOVABLE Tang Chen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=51EF4F95.1050308@cn.fujitsu.com \
    --to=tangchen@cn.fujitsu.com \
    --cc=akpm@linux-foundation.org \
    --cc=gong.chen@linux.intel.com \
    --cc=hpa@zytor.com \
    --cc=isimatu.yasuaki@jp.fujitsu.com \
    --cc=izumi.taku@jp.fujitsu.com \
    --cc=jiang.liu@huawei.com \
    --cc=jweiner@redhat.com \
    --cc=laijs@cn.fujitsu.com \
    --cc=linux-acpi@vger.kernel.org \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=lwoodman@redhat.com \
    --cc=mgorman@suse.de \
    --cc=mina86@mina86.com \
    --cc=minchan@kernel.org \
    --cc=mingo@elte.hu \
    --cc=prarit@redhat.com \
    --cc=riel@redhat.com \
    --cc=tglx@linutronix.de \
    --cc=tj@kernel.org \
    --cc=trenn@suse.de \
    --cc=vasilis.liaskovitis@profitbricks.com \
    --cc=wency@cn.fujitsu.com \
    --cc=x86@kernel.org \
    --cc=yanghy@cn.fujitsu.com \
    --cc=yinghai@kernel.org \
    --cc=zhangyanfei@cn.fujitsu.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).