From: Gavin Shan <gshan@redhat.com>
To: "wangyanan (Y)" <wangyanan55@huawei.com>,
Igor Mammedov <imammedo@redhat.com>
Cc: peter.maydell@linaro.org, drjones@redhat.com,
richard.henderson@linaro.org, qemu-devel@nongnu.org,
zhenyzha@redhat.com, qemu-arm@nongnu.org, shan.gavin@gmail.com
Subject: Re: [PATCH v2 1/3] hw/arm/virt: Fix CPU's default NUMA node ID
Date: Wed, 23 Mar 2022 11:29:19 +0800
Message-ID: <d4a8d585-2ce8-410e-ae69-f126bc013c4f@redhat.com>
In-Reply-To: <e6efb1ca-08bb-fce5-de58-b8e2079880ca@huawei.com>
Hi Yanan,
On 3/21/22 10:28 AM, wangyanan (Y) wrote:
> On 2022/3/18 21:27, Igor Mammedov wrote:
>> On Fri, 18 Mar 2022 21:00:35 +0800
>> "wangyanan (Y)" <wangyanan55@huawei.com> wrote:
>>
>>> On 2022/3/18 17:56, Igor Mammedov wrote:
>>>> On Fri, 18 Mar 2022 14:23:34 +0800
>>>> "wangyanan (Y)" <wangyanan55@huawei.com> wrote:
>>>>> Hi Gavin,
>>>>>
>>>>> On 2022/3/3 11:11, Gavin Shan wrote:
>>>>>> The default CPU-to-NUMA association is given by mc->get_default_cpu_node_id()
>>>>>> when it isn't provided explicitly. However, the CPU topology isn't fully
>>>>>> considered in the default association, which causes "arch topology borken"
>>>>>> warnings when booting a Linux guest.
>>>>>>
>>>>>> For example, the following warning messages are observed when the Linux guest
>>>>>> is booted with the following command line.
>>>>>>
>>>>>> /home/gavin/sandbox/qemu.main/build/qemu-system-aarch64 \
>>>>>> -accel kvm -machine virt,gic-version=host \
>>>>>> -cpu host \
>>>>>> -smp 6,sockets=2,cores=3,threads=1 \
>>>>>> -m 1024M,slots=16,maxmem=64G \
>>>>>> -object memory-backend-ram,id=mem0,size=128M \
>>>>>> -object memory-backend-ram,id=mem1,size=128M \
>>>>>> -object memory-backend-ram,id=mem2,size=128M \
>>>>>> -object memory-backend-ram,id=mem3,size=128M \
>>>>>> -object memory-backend-ram,id=mem4,size=128M \
>>>>>> -object memory-backend-ram,id=mem5,size=384M \
>>>>>> -numa node,nodeid=0,memdev=mem0 \
>>>>>> -numa node,nodeid=1,memdev=mem1 \
>>>>>> -numa node,nodeid=2,memdev=mem2 \
>>>>>> -numa node,nodeid=3,memdev=mem3 \
>>>>>> -numa node,nodeid=4,memdev=mem4 \
>>>>>> -numa node,nodeid=5,memdev=mem5
>>>>>> :
>>>>>> alternatives: patching kernel code
>>>>>> BUG: arch topology borken
>>>>>> the CLS domain not a subset of the MC domain
>>>>>> <the above error log repeats>
>>>>>> BUG: arch topology borken
>>>>>> the DIE domain not a subset of the NODE domain
>>>>>>
>>>>>> With the current implementation of mc->get_default_cpu_node_id(), CPU#0 to
>>>>>> CPU#5 are associated with NODE#0 to NODE#5 respectively. That's incorrect
>>>>>> because CPU#0/1/2 should be associated with the same NUMA node, as they sit
>>>>>> in the same socket.
>>>>>>
>>>>>> This fixes the issue by populating the CPU topology in virt_possible_cpu_arch_ids()
>>>>>> and considering the socket index when the default CPU-to-NUMA association is
>>>>>> given in virt_get_default_cpu_node_id(). With this applied, no more CPU topology
>>>>>> broken warnings are seen from the Linux guest. The 6 CPUs are associated with
>>>>>> NODE#0/1, but there are no CPUs associated with NODE#2/3/4/5.
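To make the difference between the two defaults concrete, here is a minimal
standalone sketch. It is not the QEMU code: struct smp_config and the helper
names are hypothetical stand-ins for ms->smp and the machine callbacks, and
only the two formulas are taken from the diff below.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the ms->smp fields used by the patch. */
struct smp_config {
    unsigned sockets, dies, clusters, cores, threads;
};

/* Old default: round-robin over the NUMA nodes by CPU index. */
static int64_t node_id_by_index(unsigned idx, unsigned num_nodes)
{
    assert(num_nodes > 0);
    return idx % num_nodes;
}

/* Socket a CPU index belongs to, same division as in the patch. */
static unsigned socket_of(const struct smp_config *smp, unsigned idx)
{
    unsigned per_socket = smp->dies * smp->clusters *
                          smp->cores * smp->threads;

    assert(per_socket > 0);
    return idx / per_socket;
}

/* Patched default: all CPUs of one socket land in the same node. */
static int64_t node_id_by_socket(const struct smp_config *smp,
                                 unsigned idx, unsigned num_nodes)
{
    assert(num_nodes > 0);
    return socket_of(smp, idx) % num_nodes;
}
```

With -smp 6,sockets=2,cores=3,threads=1 and six NUMA nodes, the first helper
spreads the CPUs over nodes 0 to 5, while the second one puts CPU#0/1/2 in
node 0 and CPU#3/4/5 in node 1, leaving nodes 2 to 5 without CPUs.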
>>>>> It may be better to split this patch into two. One extends
>>>>> virt_possible_cpu_arch_ids, and the other fixes the numa node ID issue.
Agreed, I will do that in v3. Sorry that I forgot to mention it in my last reply.
Thanks,
Gavin
>>>>>> Signed-off-by: Gavin Shan <gshan@redhat.com>
>>>>>> ---
>>>>>> hw/arm/virt.c | 17 ++++++++++++++++-
>>>>>> 1 file changed, 16 insertions(+), 1 deletion(-)
>>>>>>
>>>>>> diff --git a/hw/arm/virt.c b/hw/arm/virt.c
>>>>>> index 46bf7ceddf..dee02b60fc 100644
>>>>>> --- a/hw/arm/virt.c
>>>>>> +++ b/hw/arm/virt.c
>>>>>> @@ -2488,7 +2488,9 @@ virt_cpu_index_to_props(MachineState *ms, unsigned cpu_index)
>>>>>> static int64_t virt_get_default_cpu_node_id(const MachineState *ms, int idx)
>>>>>> {
>>>>>> - return idx % ms->numa_state->num_nodes;
>>>>>> + int64_t socket_id = ms->possible_cpus->cpus[idx].props.socket_id;
>>>>>> +
>>>>>> + return socket_id % ms->numa_state->num_nodes;
>>>>>> }
>>>>>> static const CPUArchIdList *virt_possible_cpu_arch_ids(MachineState *ms)
>>>>>> @@ -2496,6 +2498,7 @@ static const CPUArchIdList *virt_possible_cpu_arch_ids(MachineState *ms)
>>>>>> int n;
>>>>>> unsigned int max_cpus = ms->smp.max_cpus;
>>>>>> VirtMachineState *vms = VIRT_MACHINE(ms);
>>>>>> + MachineClass *mc = MACHINE_GET_CLASS(vms);
>>>>>> if (ms->possible_cpus) {
>>>>>> assert(ms->possible_cpus->len == max_cpus);
>>>>>> @@ -2509,6 +2512,18 @@ static const CPUArchIdList *virt_possible_cpu_arch_ids(MachineState *ms)
>>>>>> ms->possible_cpus->cpus[n].type = ms->cpu_type;
>>>>>> ms->possible_cpus->cpus[n].arch_id =
>>>>>> virt_cpu_mp_affinity(vms, n);
>>>>>> +
>>>>>> + ms->possible_cpus->cpus[n].props.has_socket_id = true;
>>>>>> + ms->possible_cpus->cpus[n].props.socket_id =
>>>>>> + n / (ms->smp.dies * ms->smp.clusters *
>>>>>> + ms->smp.cores * ms->smp.threads);
>>>>>> + if (mc->smp_props.dies_supported) {
>>>>>> + ms->possible_cpus->cpus[n].props.has_die_id = true;
>>>>>> + ms->possible_cpus->cpus[n].props.die_id =
>>>>>> + n / (ms->smp.clusters * ms->smp.cores * ms->smp.threads);
>>>>>> + }
>>>>> I still don't think we need to consider dies if it's certainly not
>>>>> supported yet, IOW, we will never come into the if-branch.
>>>>> We are populating arm-specific topo info instead of the generic,
>>>>> we can probably uniformly update this part together with other
>>>>> necessary places when we decide to support dies for arm virt
>>>>> machine in the future. :)
>>>> it seems we do support dies and they are supposed to be numa boundary too,
>>>> so perhaps we should account for it when generating node-id.
>>> Sorry, I actually meant that we currently don't support dies for arm, so
>>> that we will always have "mc->smp_props.dies_supported == False" here,
>>> which makes the code a bit unnecessary. dies are only supported for x86
>>> for now. :)
>>>
>> then perhaps add an assert() here, so that we would notice and fix this
>> place when dies_supported becomes true.
> A simple assert() works here, I think.
>
> Thanks,
> Yanan
>>> Thanks,
>>> Yanan
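The assert() idea could look roughly like the sketch below. This is only an
illustration under assumptions: struct smp_props and socket_id_no_dies() are
hypothetical names, and the real change would sit inside
virt_possible_cpu_arch_ids() rather than in a separate helper.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for mc->smp_props. */
struct smp_props {
    bool dies_supported;
};

/*
 * Socket ID computed without a die level. The assert trips as soon as
 * die support gets enabled, which flags that the formula needs updating.
 */
static unsigned socket_id_no_dies(const struct smp_props *props, unsigned n,
                                  unsigned clusters, unsigned cores,
                                  unsigned threads)
{
    assert(!props->dies_supported);
    assert(clusters > 0 && cores > 0 && threads > 0);
    return n / (clusters * cores * threads);
}
```

The point of the assert is documentation that fails loudly: nothing changes
today, but the code cannot silently stay wrong once dies are supported.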
>>>>>> + ms->possible_cpus->cpus[n].props.has_core_id = true;
>>>>>> + ms->possible_cpus->cpus[n].props.core_id = n / ms->smp.threads;
>>>>>> ms->possible_cpus->cpus[n].props.has_thread_id = true;
>>>>>> ms->possible_cpus->cpus[n].props.thread_id = n;
>>>>>> }
>>>>> Maybe we should use the same algorithm in x86_topo_ids_from_idx
>>>>> to populate the IDs, so that scope of socket-id will be [0, total_sockets),
>>>>> scope of thread-id is [0, threads_per_core), and so on. Then with a
>>>>> group of socket/cluster/core/thread-id, we determine a CPU.
>>>>>
>>>>> Suggestion: For the long term, is it necessary now to add similar topo
>>>>> info infrastructure for ARM, such as X86CPUTopoInfo, X86CPUTopoIDs,
>>>>> x86_topo_ids_from_idx?
>>>>>
>>>>> Thanks,
>>>>> Yanan
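For reference, per-level IDs in the ranges described above can be derived
from the linear index with plain division and modulo. The sketch below is
written in the spirit of that suggestion only; struct topo_ids and
topo_ids_from_idx() are hypothetical names, not existing QEMU interfaces,
and the die level is left out.

```c
#include <assert.h>

/* Hypothetical container for the per-level IDs of one CPU. */
struct topo_ids {
    unsigned socket_id;
    unsigned cluster_id;
    unsigned core_id;
    unsigned thread_id;
};

/*
 * Split a linear CPU index into IDs that are local to their parent:
 * thread-id in [0, threads), core-id in [0, cores), cluster-id in
 * [0, clusters), and socket-id in [0, total_sockets).
 */
static struct topo_ids topo_ids_from_idx(unsigned idx, unsigned clusters,
                                         unsigned cores, unsigned threads)
{
    struct topo_ids ids;

    assert(clusters > 0 && cores > 0 && threads > 0);
    ids.thread_id = idx % threads;
    ids.core_id = (idx / threads) % cores;
    ids.cluster_id = (idx / (threads * cores)) % clusters;
    ids.socket_id = idx / (threads * cores * clusters);
    return ids;
}
```

Compared with the v2 patch, where core_id and thread_id keep growing with
the CPU index, each ID here wraps within its parent, so a CPU is identified
by the whole socket/cluster/core/thread tuple.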
>>>> .
>> .
>