From: Greg Kurz <groug@kaod.org>
To: Serhii Popovych <spopovyc@redhat.com>
Cc: Laurent Vivier <lvivier@redhat.com>,
qemu-ppc@nongnu.org, qemu-devel@nongnu.org,
david@gibson.dropbear.id.au
Subject: Re: [Qemu-devel] [Qemu-ppc] [PATCH for 3.1] spapr: Fix ibm,max-associativity-domains property number of nodes
Date: Wed, 21 Nov 2018 14:58:18 +0100 [thread overview]
Message-ID: <20181121145818.5dbe5fbb@bahia.lan> (raw)
In-Reply-To: <576a1203-eb43-8c87-6865-51ffc2c6451b@redhat.com>
On Tue, 20 Nov 2018 20:58:45 +0200
Serhii Popovych <spopovyc@redhat.com> wrote:
> Greg Kurz wrote:
> > On Mon, 19 Nov 2018 14:48:34 +0100
> > Laurent Vivier <lvivier@redhat.com> wrote:
> >
> >> On 19/11/2018 14:27, Greg Kurz wrote:
> >>> On Mon, 19 Nov 2018 08:09:38 -0500
> >>> Serhii Popovych <spopovyc@redhat.com> wrote:
> >>>
> >>>> Laurent Vivier reported an off-by-one: the maximum number of NUMA
> >>>> nodes provided by qemu-kvm is one less than required by the
> >>>> description of the "ibm,max-associativity-domains" property in
> >>>> LoPAPR.
> >>>>
> >>>> It appears that I misread the LoPAPR description of this property,
> >>>> assuming it gives the last valid domain (NUMA node here) rather
> >>>> than the maximum number of domains.
> >>>>
> >>>> ### Before hot-add
> >>>>
> >>>> (qemu) info numa
> >>>> 3 nodes
> >>>> node 0 cpus: 0
> >>>> node 0 size: 0 MB
> >>>> node 0 plugged: 0 MB
> >>>> node 1 cpus:
> >>>> node 1 size: 1024 MB
> >>>> node 1 plugged: 0 MB
> >>>> node 2 cpus:
> >>>> node 2 size: 0 MB
> >>>> node 2 plugged: 0 MB
> >>>>
> >>>> $ numactl -H
> >>>> available: 2 nodes (0-1)
> >>>> node 0 cpus: 0
> >>>> node 0 size: 0 MB
> >>>> node 0 free: 0 MB
> >>>> node 1 cpus:
> >>>> node 1 size: 999 MB
> >>>> node 1 free: 658 MB
> >>>> node distances:
> >>>> node 0 1
> >>>> 0: 10 40
> >>>> 1: 40 10
> >>>>
> >>>> ### Hot-add
> >>>>
> >>>> (qemu) object_add memory-backend-ram,id=mem0,size=1G
> >>>> (qemu) device_add pc-dimm,id=dimm1,memdev=mem0,node=2
> >>>> (qemu) [ 87.704898] pseries-hotplug-mem: Attempting to hot-add 4 ...
> >>>> <there is no "Initmem setup node 2 [mem 0xHEX-0xHEX]">
> >>>> [ 87.705128] lpar: Attempting to resize HPT to shift 21
> >>>> ... <HPT resize messages>
> >>>>
> >>>> ### After hot-add
> >>>>
> >>>> (qemu) info numa
> >>>> 3 nodes
> >>>> node 0 cpus: 0
> >>>> node 0 size: 0 MB
> >>>> node 0 plugged: 0 MB
> >>>> node 1 cpus:
> >>>> node 1 size: 1024 MB
> >>>> node 1 plugged: 0 MB
> >>>> node 2 cpus:
> >>>> node 2 size: 1024 MB
> >>>> node 2 plugged: 1024 MB
> >>>>
> >>>> $ numactl -H
> >>>> available: 2 nodes (0-1)
> >>>> ^^^^^^^^^^^^^^^^^^^^^^^^
> >>>> Still only two nodes (and memory hot-added to node 0 below)
> >>>> node 0 cpus: 0
> >>>> node 0 size: 1024 MB
> >>>> node 0 free: 1021 MB
> >>>> node 1 cpus:
> >>>> node 1 size: 999 MB
> >>>> node 1 free: 658 MB
> >>>> node distances:
> >>>> node 0 1
> >>>> 0: 10 40
> >>>> 1: 40 10
> >>>>
> >>>> With the fix applied, numactl(8) reports 3 nodes available and
> >>>> memory plugged into node 2, as expected.
> >>>>
> >>>> Fixes: da9f80fbad21 ("spapr: Add ibm,max-associativity-domains property")
> >>>> Reported-by: Laurent Vivier <lvivier@redhat.com>
> >>>> Signed-off-by: Serhii Popovych <spopovyc@redhat.com>
> >>>> ---
> >>>> hw/ppc/spapr.c | 2 +-
> >>>> 1 file changed, 1 insertion(+), 1 deletion(-)
> >>>>
> >>>> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> >>>> index 7afd1a1..843ae6c 100644
> >>>> --- a/hw/ppc/spapr.c
> >>>> +++ b/hw/ppc/spapr.c
> >>>> @@ -1033,7 +1033,7 @@ static void spapr_dt_rtas(sPAPRMachineState *spapr, void *fdt)
> >>>> cpu_to_be32(0),
> >>>> cpu_to_be32(0),
> >>>> cpu_to_be32(0),
> >>>> - cpu_to_be32(nb_numa_nodes ? nb_numa_nodes - 1 : 0),
> >>>> + cpu_to_be32(nb_numa_nodes ? nb_numa_nodes : 0),
> >>>
> >>> Maybe simply cpu_to_be32(nb_numa_nodes) ?
> >>
> >> Or "cpu_to_be32(nb_numa_nodes ? nb_numa_nodes : 1)" ?
>
> Linux handles zero correctly, but nb_numa_nodes ?: 1 looks better.
>
> I tested with both cpu_to_be32(nb_numa_nodes) and
> cpu_to_be32(nb_numa_nodes ? nb_numa_nodes : 1); it works correctly
> with Linux in both cases:
>
> (guest)# numactl -H
> available: 1 nodes (0)
> node 0 cpus: 0
> node 0 size: 487 MB
> node 0 free: 148 MB
> node distances:
> node 0
> 0: 10
>
> (qemu) info numa
> 0 nodes
>
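For the record, with the ?: 1 variant the property would end up being
built like this (just a sketch: I'm reconstructing the maxdomains array
and the fdt_setprop() call around the quoted hunk from memory, so the
surrounding names may not match the tree exactly):

    uint32_t maxdomains[] = {
        cpu_to_be32(4),     /* number of associativity levels */
        cpu_to_be32(0),
        cpu_to_be32(0),
        cpu_to_be32(0),
        /* always advertise at least one domain, i.e. "node 0" */
        cpu_to_be32(nb_numa_nodes ? nb_numa_nodes : 1),
    };

    _FDT(fdt_setprop(fdt, rtas, "ibm,max-associativity-domains",
                     maxdomains, sizeof(maxdomains)));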
> >>
> >> In spapr_populate_drconf_memory() we have this logic.
> >>
> >
> > Hmm... maybe you're right, it seems that the code assumes
> > non-NUMA configs have at least one node. A similar assumption
> > is also present in pc_dimm_realize():
> >
> > if (((nb_numa_nodes > 0) && (dimm->node >= nb_numa_nodes)) ||
> > (!nb_numa_nodes && dimm->node))
> According to this, nb_numa_nodes can be zero
>
> > error_setg(errp, "'DIMM property " PC_DIMM_NODE_PROP " has value %"
> > PRIu32 "' which exceeds the number of numa nodes: %d",
> > dimm->node, nb_numa_nodes ? nb_numa_nodes : 1);
> and this just handles that case to show a proper error message.
>
Indeed, but it doesn't really explain why we're doing this...
> > return;
> > }
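Spelled out, that check is equivalent to (my paraphrase, not the actual
code):

    /* without an explicit NUMA config, only "node 0" is a valid target */
    uint32_t max_nodes = nb_numa_nodes ? nb_numa_nodes : 1;

    if (dimm->node >= max_nodes) {
        /* error, reported against max_nodes as above */
    }

i.e. the error path, too, behaves as if a non-NUMA machine had exactly
one node.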
>
> >
> > This is a bit confusing...
... fortunately, these commits shed some light:
commit 7db8a127e373e468d1f61e46e01e50d1aa33e827
Author: Alexey Kardashevskiy <aik@ozlabs.ru>
Date: Thu Jul 3 13:10:04 2014 +1000
spapr: Refactor spapr_populate_memory() to allow memoryless nodes
Current QEMU does not support memoryless NUMA nodes, however
actual hardware may have them so it makes sense to have a way
to emulate them in QEMU. This prepares SPAPR for that.
This moves 2 calls of spapr_populate_memory_node() into
the existing loop over numa nodes so first several nodes may
have no memory and this still will work.
If there is no numa configuration, the code assumes there is just
a single node at 0 and it has all the guest memory.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
commit 6663864e950d40c467ae4ab81c4dac64d7a8d9e6
Author: Bharata B Rao <bharata@linux.vnet.ibm.com>
Date: Mon Aug 3 11:05:40 2015 +0530
spapr: Populate ibm,associativity-lookup-arrays correctly for non-NUMA
When NUMA isn't configured explicitly, assume node 0 is present for
the purpose of creating ibm,associativity-lookup-arrays property
under ibm,dynamic-reconfiguration-memory DT node. This ensures that
the associativity index property is correctly updated in ibm,dynamic-memory
for the LMB that is hotplugged.
Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
So I guess ?: 1 is consistent with the longstanding assumption in spapr
that the machine always has a "node 0", even for non-NUMA setups.
Maybe this logic should be consolidated in some helper for better
clarity.
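Something like this, for instance (untested sketch, and the name is
made up):

    /* Single place for the "non-NUMA means one node" assumption. */
    static inline int spapr_numa_node_count(void)
    {
        return nb_numa_nodes ? nb_numa_nodes : 1;
    }

spapr_dt_rtas(), spapr_populate_drconf_memory() and the error message
in pc_dimm_realize() could then all use it instead of open-coding the
fallback.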
> >
> >> Thanks,
> >> Laurent
> >
> >
>
>
Thread overview: 8+ messages
2018-11-19 13:09 [Qemu-devel] [PATCH for 3.1] spapr: Fix ibm,max-associativity-domains property number of nodes Serhii Popovych
2018-11-19 13:27 ` [Qemu-devel] [Qemu-ppc] " Greg Kurz
2018-11-19 13:30 ` Laurent Vivier
2018-11-19 16:18 ` Serhii Popovych
2018-11-19 13:48 ` Laurent Vivier
2018-11-19 16:59 ` Greg Kurz
2018-11-20 18:58 ` Serhii Popovych
2018-11-21 13:58 ` Greg Kurz [this message]