From: Alexey Kardashevskiy <aik@ozlabs.ru>
To: Nishanth Aravamudan <nacc@linux.vnet.ibm.com>
Cc: qemu-ppc@nongnu.org, qemu-devel@nongnu.org, Alexander Graf <agraf@suse.de>
Subject: Re: [Qemu-devel] [PATCH 3/7] spapr: Refactor spapr_populate_memory()
Date: Tue, 24 Jun 2014 16:14:11 +1000
Message-ID: <53A91733.9060405@ozlabs.ru>
In-Reply-To: <20140624030856.GA29537@linux.vnet.ibm.com>

On 06/24/2014 01:08 PM, Nishanth Aravamudan wrote:
> On 21.06.2014 [13:06:53 +1000], Alexey Kardashevskiy wrote:
>> On 06/21/2014 08:55 AM, Nishanth Aravamudan wrote:
>>> On 16.06.2014 [17:53:49 +1000], Alexey Kardashevskiy wrote:
>>>> Current QEMU does not support memoryless NUMA nodes.
>>>> This prepares SPAPR for that.
>>>>
>>>> This moves 2 calls of spapr_populate_memory_node() into
>>>> the existing loop which handles nodes other than
>>>> the first one.
>>>>
>>>> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
>>>> ---
>>>> hw/ppc/spapr.c | 31 +++++++++++--------------------
>>>> 1 file changed, 11 insertions(+), 20 deletions(-)
>>>>
>>>> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
>>>> index cb3a10a..666b676 100644
>>>> --- a/hw/ppc/spapr.c
>>>> +++ b/hw/ppc/spapr.c
>>>> @@ -689,28 +689,13 @@ static void spapr_populate_memory_node(void *fdt, int nodeid, hwaddr start,
>>>>
>>>>  static int spapr_populate_memory(sPAPREnvironment *spapr, void *fdt)
>>>>  {
>>>> -    hwaddr node0_size, mem_start, node_size;
>>>> +    hwaddr mem_start, node_size;
>>>>      int i;
>>>>
>>>> -    /* memory node(s) */
>>>> -    if (nb_numa_nodes > 1 && node_mem[0] < ram_size) {
>>>> -        node0_size = node_mem[0];
>>>> -    } else {
>>>> -        node0_size = ram_size;
>>>> -    }
>>>> -
>>>> -    /* RMA */
>>>> -    spapr_populate_memory_node(fdt, 0, 0, spapr->rma_size);
>>>> -
>>>> -    /* RAM: Node 0 */
>>>> -    if (node0_size > spapr->rma_size) {
>>>> -        spapr_populate_memory_node(fdt, 0, spapr->rma_size,
>>>> -                                   node0_size - spapr->rma_size);
>>>> -    }
>>>> -
>>>> -    /* RAM: Node 1 and beyond */
>>>> -    mem_start = node0_size;
>>>> -    for (i = 1; i < nb_numa_nodes; i++) {
>>>> +    for (i = 0, mem_start = 0; i < nb_numa_nodes; ++i) {
>>>> +        if (!node_mem[i]) {
>>>> +            continue;
>>>> +        }
>>>
>>> Doesn't this skip memoryless nodes? What actually puts the memoryless
>>> node in the device-tree?
>>
>> It does skip.
>>
>>> And if you were to put them in, wouldn't spapr_populate_memory_node()
>>> fail because we'd be creating two nodes with memory@XXX where XXX is the
>>> same (starting address) for both?
>>
>> I cannot do this now - there is no way to tell from the command line where
>> I want NUMA node memory start from so I'll end up with multiple nodes with
>> the same name and QEMU won't start. When NUMA fixes reach upstream, I'll
>> try to work out something on top of that.
>
> Ah I got something here. With the patches I just sent to enable sparse
> NUMA nodes, plus your series rebased on top, here's what I see in a
> Linux LPAR:
>
> qemu-system-ppc64 -machine pseries,accel=kvm,usb=off -m 4096 -realtime mlock=off -numa node,nodeid=3,mem=4096,cpus=2-3 -numa node,nodeid=2,mem=0,cpus=0-1 -smp 4
>
> info numa
> 2 nodes
> node 2 cpus: 0 1
> node 2 size: 0 MB
> node 3 cpus: 2 3
> node 3 size: 4096 MB
>
> numactl --hardware
> available: 3 nodes (0,2-3)
> node 0 cpus:
> node 0 size: 0 MB
> node 0 free: 0 MB
> node 2 cpus: 0 1
> node 2 size: 0 MB
> node 2 free: 0 MB
> node 3 cpus: 2 3
> node 3 size: 4073 MB
> node 3 free: 3830 MB
> node distances:
> node 0 2 3
> 0: 10 40 40
> 2: 40 10 40
> 3: 40 40 10
>
> The trick, it seems, is if you have a memoryless node, it needs to
> have CPUs assigned to it.

Yep. The device tree does not have NUMA nodes, it only has CPUs and
memory@xxx (memory banks?) and the guest kernel has to parse
ibm,associativity and reconstruct the NUMA topology. If some node is not
mentioned in any ibm,associativity, it does not exist.

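To illustrate the point about ibm,associativity, here is a minimal sketch
in C of how a guest could derive the set of existing NUMA nodes purely from
the entries it finds in the device tree. The struct dt_entry type,
discover_nodes() and MAX_NODES are hypothetical names made up for this
example. A real guest parses the ibm,associativity cells themselves, which
this sketch reduces to a single node id per entry, so treat it as an
illustration of the idea rather than what the kernel does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_NODES 8 /* arbitrary limit chosen for the example */

/*
 * Hypothetical, simplified view of one device-tree entry (a CPU or a
 * memory@xxx node). Only the node id that would come from its
 * ibm,associativity property is modelled.
 */
struct dt_entry {
    bool is_cpu;      /* true for a CPU, false for memory@xxx */
    uint32_t node_id; /* NUMA node id taken from ibm,associativity */
};

/*
 * Mark every node id mentioned by at least one entry and return the
 * number of distinct nodes found. A node that no entry mentions stays
 * absent, whatever the command line declared.
 */
static int discover_nodes(const struct dt_entry *entries, size_t n,
                          bool present[MAX_NODES])
{
    int count = 0;

    assert(entries != NULL || n == 0);

    for (size_t i = 0; i < MAX_NODES; i++) {
        present[i] = false;
    }
    for (size_t i = 0; i < n; i++) {
        uint32_t id = entries[i].node_id;

        /* Ids beyond the example limit are ignored. */
        if (id < MAX_NODES && !present[id]) {
            present[id] = true;
            count++;
        }
    }
    return count;
}
```

With the command line quoted above (CPUs 0-1 in node 2, CPUs 2-3 and all
memory in node 3), the entries mention only nodes 2 and 3, so those are the
only nodes this routine reports. A memoryless node with no CPUs would not
be mentioned by any entry and would not be found.
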
> The CPU's "ibm,associativity" property will
> make Linux set up the proper NUMA topology.
>
> Thoughts? Should there be a check that every "present" NUMA node has at
> least either CPUs or memory?

Maybe. I'll wait for the NUMA work to land upstream, then apply your
patch(es) and mine and see what I get :)

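For reference, the check suggested above could look roughly like the
following C sketch. struct node_cfg and find_empty_node() are hypothetical
names invented for this example and are not existing QEMU code. The sketch
assumes the per-node memory size and CPU count have already been collected
from the -numa options.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-node summary, not a QEMU structure. */
struct node_cfg {
    bool declared;      /* node id was given with -numa node,nodeid=N */
    uint64_t mem_bytes; /* memory assigned to the node, may be 0 */
    unsigned int ncpus; /* number of CPUs assigned to the node */
};

/*
 * Return the index of the first declared node that has neither CPUs
 * nor memory, or -1 if every declared node has at least one of them.
 */
static int find_empty_node(const struct node_cfg *nodes, int n)
{
    assert(nodes != NULL || n == 0);

    for (int i = 0; i < n; i++) {
        if (nodes[i].declared &&
            nodes[i].mem_bytes == 0 && nodes[i].ncpus == 0) {
            return i;
        }
    }
    return -1;
}
```

For the configuration quoted above (node 2 with two CPUs and no memory,
node 3 with two CPUs and 4096 MB) this returns -1, since each declared node
has CPUs. Whether a node failing the check should be rejected or merely
warned about is a separate decision that the sketch does not make.
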
> It seems like if neither are present,
> we can just hotplug them later?

Hotplug what? NUMA nodes?

> Does qemu support topology for PCI devices?

Nope.

--
Alexey