From: Nishanth Aravamudan <nacc@linux.vnet.ibm.com>
To: Alexey Kardashevskiy <aik@ozlabs.ru>
Cc: qemu-ppc@nongnu.org, qemu-devel@nongnu.org,
Alexander Graf <agraf@suse.de>
Subject: Re: [Qemu-devel] [PATCH 3/7] spapr: Refactor spapr_populate_memory()
Date: Mon, 23 Jun 2014 20:08:56 -0700
Message-ID: <20140624030856.GA29537@linux.vnet.ibm.com>
In-Reply-To: <53A4F6CD.3040600@ozlabs.ru>
On 21.06.2014 [13:06:53 +1000], Alexey Kardashevskiy wrote:
> On 06/21/2014 08:55 AM, Nishanth Aravamudan wrote:
> > On 16.06.2014 [17:53:49 +1000], Alexey Kardashevskiy wrote:
> >> Current QEMU does not support memoryless NUMA nodes.
> >> This prepares SPAPR for that.
> >>
> >> This moves 2 calls of spapr_populate_memory_node() into
> >> the existing loop which handles nodes other than
> >> the first one.
> >>
> >> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
> >> ---
> >> hw/ppc/spapr.c | 31 +++++++++++--------------------
> >> 1 file changed, 11 insertions(+), 20 deletions(-)
> >>
> >> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> >> index cb3a10a..666b676 100644
> >> --- a/hw/ppc/spapr.c
> >> +++ b/hw/ppc/spapr.c
> >> @@ -689,28 +689,13 @@ static void spapr_populate_memory_node(void *fdt, int nodeid, hwaddr start,
> >>
> >> static int spapr_populate_memory(sPAPREnvironment *spapr, void *fdt)
> >> {
> >> - hwaddr node0_size, mem_start, node_size;
> >> + hwaddr mem_start, node_size;
> >> int i;
> >>
> >> - /* memory node(s) */
> >> - if (nb_numa_nodes > 1 && node_mem[0] < ram_size) {
> >> - node0_size = node_mem[0];
> >> - } else {
> >> - node0_size = ram_size;
> >> - }
> >> -
> >> - /* RMA */
> >> - spapr_populate_memory_node(fdt, 0, 0, spapr->rma_size);
> >> -
> >> - /* RAM: Node 0 */
> >> - if (node0_size > spapr->rma_size) {
> >> - spapr_populate_memory_node(fdt, 0, spapr->rma_size,
> >> - node0_size - spapr->rma_size);
> >> - }
> >> -
> >> - /* RAM: Node 1 and beyond */
> >> - mem_start = node0_size;
> >> - for (i = 1; i < nb_numa_nodes; i++) {
> >> + for (i = 0, mem_start = 0; i < nb_numa_nodes; ++i) {
> >> + if (!node_mem[i]) {
> >> + continue;
> >> + }
> >
> > Doesn't this skip memoryless nodes? What actually puts the memoryless
> > node in the device-tree?
>
> It does skip.
>
> > And if you were to put them in, wouldn't spapr_populate_memory_node()
> > fail because we'd be creating two nodes with memory@XXX where XXX is the
> > same (starting address) for both?
>
> I cannot do this now - there is no way to tell from the command line where
> I want NUMA node memory start from so I'll end up with multiple nodes with
> the same name and QEMU won't start. When NUMA fixes reach upstream, I'll
> try to work out something on top of that.
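To make the name clash concrete, here is a minimal sketch in plain C, not QEMU code. It assumes the device-tree node name is built only from the start address ("memory@<start>"), as the discussion above implies; memory_node_name() and has_name_collision() are hypothetical helpers invented for illustration.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: build a "memory@<start>" name from the start address. */
static void memory_node_name(char *buf, size_t len, uint64_t start)
{
    snprintf(buf, len, "memory@%" PRIx64, start);
}

/*
 * Walk the per-node sizes with a running start address and WITHOUT skipping
 * empty nodes. Returns 1 if two nodes would get the same name, 0 otherwise.
 * Start addresses never decrease, so comparing neighbours is sufficient.
 */
static int has_name_collision(const uint64_t *node_mem, size_t nb_nodes)
{
    char prev[32] = "";
    char cur[32];
    uint64_t mem_start = 0;

    for (size_t i = 0; i < nb_nodes; i++) {
        memory_node_name(cur, sizeof(cur), mem_start);
        if (i > 0 && strcmp(prev, cur) == 0) {
            return 1;
        }
        memcpy(prev, cur, sizeof(prev));
        mem_start += node_mem[i];
    }
    return 0;
}
```

With sizes {0, 4096} both nodes start at address 0 and collide; with {2048, 2048} they do not. A trailing empty node ({4096, 0}) does not collide either, since nothing follows it at the same address.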
Ah, I got something working here. With the patches I just sent to enable
sparse NUMA nodes, plus your series rebased on top, here's what I see in a
Linux LPAR:
qemu-system-ppc64 -machine pseries,accel=kvm,usb=off -m 4096 -realtime mlock=off -numa node,nodeid=3,mem=4096,cpus=2-3 -numa node,nodeid=2,mem=0,cpus=0-1 -smp 4
info numa
2 nodes
node 2 cpus: 0 1
node 2 size: 0 MB
node 3 cpus: 2 3
node 3 size: 4096 MB
numactl --hardware
available: 3 nodes (0,2-3)
node 0 cpus:
node 0 size: 0 MB
node 0 free: 0 MB
node 2 cpus: 0 1
node 2 size: 0 MB
node 2 free: 0 MB
node 3 cpus: 2 3
node 3 size: 4073 MB
node 3 free: 3830 MB
node distances:
node 0 2 3
0: 10 40 40
2: 40 10 40
3: 40 40 10
The trick, it seems, is if you have a memoryless node, it needs to
have CPUs assigned to it. The CPU's "ibm,associativity" property will
make Linux set up the proper NUMA topology.
Thoughts? Should there be a check that every "present" NUMA node has at
least either CPUs or memory? It seems like if neither is present, we can
just hotplug them later. Does qemu support topology for PCI devices?
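A rough sketch of what such a check could look like, in standalone C rather than against the actual QEMU data structures. NodeSummary and find_empty_present_node() are hypothetical names made up for illustration; where the per-node CPU and memory counts would really come from is left open.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-node summary; this is not a QEMU structure. */
typedef struct {
    bool present;        /* node was named on the command line */
    uint64_t mem_bytes;  /* memory assigned to the node */
    unsigned nr_cpus;    /* CPUs assigned to the node */
} NodeSummary;

/*
 * Return the index of the first present node that has neither CPUs nor
 * memory, or -1 if every present node has at least one of the two.
 */
static int find_empty_present_node(const NodeSummary *nodes, size_t nb_nodes)
{
    assert(nodes != NULL || nb_nodes == 0);

    for (size_t i = 0; i < nb_nodes; i++) {
        if (nodes[i].present &&
            nodes[i].mem_bytes == 0 && nodes[i].nr_cpus == 0) {
            return (int)i;
        }
    }
    return -1;
}
```

For the configuration above (node 2 with CPUs 0-1 and no memory, node 3 with CPUs 2-3 and 4096 MB) this would return -1, i.e. nothing to complain about. Whether a non-negative result should be an error or only a warning is exactly the open question.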
Thanks,
Nish