From: Eduardo Habkost <ehabkost@redhat.com>
To: Dou Liyang <douly.fnst@cn.fujitsu.com>
Cc: qemu-devel@nongnu.org, imammedo@redhat.com, pbonzini@redhat.com,
mst@redhat.com, rth@twiddle.net
Subject: Re: [Qemu-devel] [PATCH v4 1/2] hw/acpi-build: Fix SRAT memory building when there is no memory in node0
Date: Wed, 30 Aug 2017 17:55:54 -0300 [thread overview]
Message-ID: <20170830205554.GA17184@localhost.localdomain> (raw)
In-Reply-To: <1503372250-5092-2-git-send-email-douly.fnst@cn.fujitsu.com>
On Tue, Aug 22, 2017 at 11:24:09AM +0800, Dou Liyang wrote:
> Currently, Using the fisrt node without memory on the machine makes
> QEMU unhappy. With this example command line:
> ... \
> -m 1024M,slots=4,maxmem=32G \
> -numa node,nodeid=0 \
> -numa node,mem=1024M,nodeid=1 \
> -numa node,nodeid=2 \
> -numa node,nodeid=3 \
> Guest reports "No NUMA configuration found" and the NUMA topology is
> wrong.
>
> This is because when QEMU builds ACPI SRAT, it regards node0 as the
> default node to deal with the memory hole(640K-1M). this means the
> node0 must have some memory(>1M), but, actually it can have no
> memory.
>
> Fix this problem by replace the node0 with the first node which has
> memory on it. Add a new function for each node. Also do some cleanup.
>
> Signed-off-by: Dou Liyang <douly.fnst@cn.fujitsu.com>
> ---
> hw/i386/acpi-build.c | 78 +++++++++++++++++++++++++++++++++-------------------
> 1 file changed, 50 insertions(+), 28 deletions(-)
>
> diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
> index 98dd424..f93d712 100644
> --- a/hw/i386/acpi-build.c
> +++ b/hw/i386/acpi-build.c
> @@ -2318,15 +2318,43 @@ build_tpm2(GArray *table_data, BIOSLinker *linker)
> (void *)tpm2_ptr, "TPM2", sizeof(*tpm2_ptr), 4, NULL, NULL);
> }
>
> +static uint64_t
> +build_srat_node_entry(GArray *table_data, PCMachineState *pcms,
> + int i, uint64_t mem_base, uint64_t mem_len)
> +{
> + AcpiSratMemoryAffinity *numamem;
> + uint64_t next_base;
> +
> + next_base = mem_base + mem_len;
> +
> + /* Cut out the ACPI_PCI hole */
> + if (mem_base <= pcms->below_4g_mem_size &&
> + next_base > pcms->below_4g_mem_size) {
> + mem_len -= next_base - pcms->below_4g_mem_size;
> + if (mem_len > 0) {
> + numamem = acpi_data_push(table_data, sizeof *numamem);
> + build_srat_memory(numamem, mem_base, mem_len, i,
> + MEM_AFFINITY_ENABLED);
> + }
> + mem_base = 1ULL << 32;
> + mem_len = next_base - pcms->below_4g_mem_size;
> + next_base += (1ULL << 32) - pcms->below_4g_mem_size;
> + }
> + numamem = acpi_data_push(table_data, sizeof *numamem);
> + build_srat_memory(numamem, mem_base, mem_len, i,
> + MEM_AFFINITY_ENABLED);
> + return next_base;
> +}
> +
> static void
> build_srat(GArray *table_data, BIOSLinker *linker, MachineState *machine)
> {
> AcpiSystemResourceAffinityTable *srat;
> AcpiSratMemoryAffinity *numamem;
>
> - int i;
> + int i, node;
> int srat_start, numa_start, slots;
> - uint64_t mem_len, mem_base, next_base;
> + uint64_t mem_len, mem_base;
> MachineClass *mc = MACHINE_GET_CLASS(machine);
> const CPUArchIdList *apic_ids = mc->possible_cpu_arch_ids(machine);
> PCMachineState *pcms = PC_MACHINE(machine);
> @@ -2370,36 +2398,30 @@ build_srat(GArray *table_data, BIOSLinker *linker, MachineState *machine)
> /* the memory map is a bit tricky, it contains at least one hole
> * from 640k-1M and possibly another one from 3.5G-4G.
> */
> - next_base = 0;
> +
> numa_start = table_data->len;
>
> - numamem = acpi_data_push(table_data, sizeof *numamem);
> - build_srat_memory(numamem, 0, 640 * 1024, 0, MEM_AFFINITY_ENABLED);
> - next_base = 1024 * 1024;
> - for (i = 1; i < pcms->numa_nodes + 1; ++i) {
> - mem_base = next_base;
> - mem_len = pcms->node_mem[i - 1];
> - if (i == 1) {
> - mem_len -= 1024 * 1024;
> + /* get the first node which has memory and map the hole from 640K-1M */
> + for (node = 0; node < pcms->numa_nodes; node++) {
> + if (pcms->node_mem[node] != 0) {
> + break;
> }
> - next_base = mem_base + mem_len;
> -
> - /* Cut out the ACPI_PCI hole */
> - if (mem_base <= pcms->below_4g_mem_size &&
> - next_base > pcms->below_4g_mem_size) {
> - mem_len -= next_base - pcms->below_4g_mem_size;
> - if (mem_len > 0) {
> - numamem = acpi_data_push(table_data, sizeof *numamem);
> - build_srat_memory(numamem, mem_base, mem_len, i - 1,
> - MEM_AFFINITY_ENABLED);
> - }
> - mem_base = 1ULL << 32;
> - mem_len = next_base - pcms->below_4g_mem_size;
> - next_base += (1ULL << 32) - pcms->below_4g_mem_size;
> + }
> + numamem = acpi_data_push(table_data, sizeof *numamem);
> + build_srat_memory(numamem, 0, 640 * 1024, node, MEM_AFFINITY_ENABLED);
> +
> + /* map the rest of memory from 1M */
> + mem_base = 1024 * 1024;
> + mem_len = pcms->node_mem[node] - mem_base;
> + mem_base = build_srat_node_entry(table_data, pcms, node,
> + mem_base, mem_len);
> +
> + for (i = 0; i < pcms->numa_nodes; i++) {
> + if (i == node) {
> + continue;
> }
Now the nodes will be out of order, if node 0 has no RAM. Why?
Why not implement this in the same way the PCI 4GB hole is
already implemented. e.g.:
#define HOLE_640K_START (640 * 1024)
#define HOLE_640K_END (1024 * 1024)
for (node = 0; node < pcms->numa_nodes; node++) {
mem_base = next_base;
mem_len = pcms->node_mem[node];
next_base = mem_base + mem_len;
/* Cut out the 640K hole */
if (mem_base <= HOLE_640K_START &&
next_base > HOLE_640K_START) {
mem_len -= next_base - HOLE_640K;
if (mem_len > 0) {
numamem = acpi_data_push(table_data, sizeof *numamem);
build_srat_memory(numamem, mem_base, mem_len, node,
MEM_AFFINITY_ENABLED);
}
mem_base = HOLE_640K_END;
/* ... */
}
/* Cut out the ACPI_PCI hole */
if (mem_base <= pcms->below_4g_mem_size &&
next_base > pcms->below_4g_mem_size) {
mem_len -= next_base - pcms->below_4g_mem_size;
/* ... *]
}
numamem = acpi_data_push(table_data, sizeof *numamem);
build_srat_memory(numamem, mem_base, mem_len, node,
MEM_AFFINITY_ENABLED);
/* ... */
}
> - numamem = acpi_data_push(table_data, sizeof *numamem);
> - build_srat_memory(numamem, mem_base, mem_len, i - 1,
> - MEM_AFFINITY_ENABLED);
> + mem_base = build_srat_node_entry(table_data, pcms, i,
> + mem_base, pcms->node_mem[i]);
> }
> slots = (table_data->len - numa_start) / sizeof *numamem;
> for (; slots < pcms->numa_nodes + 2; slots++) {
> --
> 2.5.5
>
>
>
>
--
Eduardo
next prev parent reply other threads:[~2017-08-30 20:56 UTC|newest]
Thread overview: 19+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-08-22 3:24 [Qemu-devel] [PATCH v4 0/2] hw/acpi-build: Fix ACPI SRAT Memory Affinity building Dou Liyang
2017-08-22 3:24 ` [Qemu-devel] [PATCH v4 1/2] hw/acpi-build: Fix SRAT memory building when there is no memory in node0 Dou Liyang
2017-08-30 20:55 ` Eduardo Habkost [this message]
2017-08-31 10:38 ` Dou Liyang
2017-08-31 12:09 ` Dou Liyang
2017-08-22 3:24 ` [Qemu-devel] [PATCH v4 2/2] ACPI/unit-test: Add a new testcase for RAM allocation in numa node Dou Liyang
2017-08-23 8:40 ` Igor Mammedov
2017-08-23 12:12 ` Dou Liyang
2017-08-23 12:45 ` Igor Mammedov
2017-08-23 13:35 ` Dou Liyang
2017-08-23 17:25 ` Eduardo Habkost
2017-08-24 1:30 ` Dou Liyang
2017-08-23 17:47 ` Igor Mammedov
2017-08-24 1:42 ` Dou Liyang
2017-08-24 12:33 ` Igor Mammedov
2017-08-24 13:20 ` Dou Liyang
2017-08-24 13:50 ` [Qemu-devel] [PATCH v5 " Dou Liyang
2017-08-23 2:48 ` [Qemu-devel] [PATCH v4 0/2] hw/acpi-build: Fix ACPI SRAT Memory Affinity building Michael S. Tsirkin
2017-08-23 3:47 ` Dou Liyang
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20170830205554.GA17184@localhost.localdomain \
--to=ehabkost@redhat.com \
--cc=douly.fnst@cn.fujitsu.com \
--cc=imammedo@redhat.com \
--cc=mst@redhat.com \
--cc=pbonzini@redhat.com \
--cc=qemu-devel@nongnu.org \
--cc=rth@twiddle.net \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).