From: Igor Mammedov <imammedo@redhat.com>
To: Babu Moger <babu.moger@amd.com>
Cc: ehabkost@redhat.com, mst@redhat.com, armbru@redhat.com,
qemu-devel@nongnu.org, pbonzini@redhat.com, rth@twiddle.net
Subject: Re: [PATCH v3 00/18] APIC ID fixes for AMD EPYC CPU models
Date: Tue, 4 Feb 2020 09:02:30 +0100
Message-ID: <20200204090230.28f31a87@redhat.com>
In-Reply-To: <b493a4f4-48de-79a7-00d5-119fbe789879@amd.com>
On Mon, 3 Feb 2020 13:31:29 -0600
Babu Moger <babu.moger@amd.com> wrote:
> On 2/3/20 8:59 AM, Igor Mammedov wrote:
> > On Tue, 03 Dec 2019 18:36:54 -0600
> > Babu Moger <babu.moger@amd.com> wrote:
> >
> >> This series fixes APIC ID encoding problems on AMD EPYC CPUs.
> >> https://bugzilla.redhat.com/show_bug.cgi?id=1728166
> >>
> >> Currently, the APIC ID is decoded based on the sequence
> >> sockets->dies->cores->threads. This works for most standard AMD and other
> >> vendors' configurations, but this decoding sequence does not follow that of
> >> AMD's APIC ID enumeration strictly. In some cases this can cause CPU topology
> >> inconsistency. When booting a guest VM, the kernel tries to validate the
> >> topology, and finds it inconsistent with the enumeration of EPYC cpu models.
> >>
> >> To fix the problem we need to build the topology as per the Processor
> >> Programming Reference (PPR) for AMD Family 17h Model 01h, Revision B1
> >> Processors. It is available at https://www.amd.com/system/files/TechDocs/55570-B1_PUB.zip
> >>
> >> Here is the text from the PPR.
> >> Operating systems are expected to use Core::X86::Cpuid::SizeId[ApicIdSize], the
> >> number of least significant bits in the Initial APIC ID that indicate core ID
> >> within a processor, in constructing per-core CPUID masks.
> >> Core::X86::Cpuid::SizeId[ApicIdSize] determines the maximum number of cores
> >> (MNC) that the processor could theoretically support, not the actual number of
> >> cores that are actually implemented or enabled on the processor, as indicated
> >> by Core::X86::Cpuid::SizeId[NC].
> >> Each Core::X86::Apic::ApicId[ApicId] register is preset as follows:
> >> • ApicId[6] = Socket ID.
> >> • ApicId[5:4] = Node ID.
> >> • ApicId[3] = Logical CCX L3 complex ID
> >> • ApicId[2:0]= (SMT) ? {LogicalCoreID[1:0],ThreadId} : {1'b0,LogicalCoreID[1:0]}
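
For illustration, the bit layout quoted above can be written out as a small C
helper. This is only a sketch: the struct and function names are hypothetical
(they are not QEMU's), and the field widths are taken directly from the PPR
excerpt (1 socket bit, 2 node bits, 1 CCX bit, 3 low bits for core/thread).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical container for the fields named in the PPR excerpt. */
struct epyc_topo_ids {
    unsigned socket_id;  /* ApicId[6] */
    unsigned node_id;    /* ApicId[5:4] */
    unsigned ccx_id;     /* ApicId[3] */
    unsigned core_id;    /* LogicalCoreID[1:0] */
    unsigned thread_id;  /* ThreadId, only encoded when SMT is on */
};

/* Pack the IDs into an APIC ID following the quoted layout. */
static uint32_t epyc_apic_id(const struct epyc_topo_ids *ids, bool smt)
{
    uint32_t low;

    /* Each field must fit in the width the layout gives it. */
    assert(ids->socket_id < 2 && ids->node_id < 4 && ids->ccx_id < 2);
    assert(ids->core_id < 4 && ids->thread_id < 2);

    if (smt) {
        /* ApicId[2:0] = {LogicalCoreID[1:0], ThreadId} */
        low = (ids->core_id << 1) | ids->thread_id;
    } else {
        /* ApicId[2:0] = {1'b0, LogicalCoreID[1:0]} */
        low = ids->core_id;
    }

    return (ids->socket_id << 6) | (ids->node_id << 4) |
           (ids->ccx_id << 3) | low;
}
```

With SMT on, socket 1, node 2, CCX 1, core 3, thread 1 packs to 0x6f. The node
field sits between the socket and CCX bits, which is the part the plain
sockets->dies->cores->threads sequence does not express.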
> >
> >
> > After checking out all the patches and some pondering, the approach used
> > here looks too intrusive to me for the task at hand, especially where it
> > comes to generic code.
> >
> > (Skip to ==== for the suggested simplification if you don't want to read
> > the reasoning behind it first.)
> >
> > Let's look for a way to simplify it a little bit.
> >
> > So the problem we are trying to solve:
> > 1: calculate APIC IDs based on CPU type (to be more specific: for EPYC-based CPUs)
> > 2: it depends on knowing the total number of NUMA nodes.
> >
> > Externally, the workflow looks like the following:
> > 1. The user provides -smp x,sockets,cores,...,maxcpus,
> > which is used by the possible_cpu_arch_ids() singleton to build the list of
> > possible CPUs (available to the user via the 'hotpluggable-cpus' command).
> >
> > The hook could be called very early, when possible_cpus data might
> > not be complete. It builds a list of possible CPUs which the user could
> > modify later.
> >
> > 2.1 The user uses "-numa cpu,node-id=x,..." or the legacy "-numa node,node_id=x,cpus="
> > options to assign CPUs to nodes, which one way or another calls
> > machine_set_cpu_numa_node(). The latter updates the 'possible_cpus' list
> > with node information. It happens early, when the total number of nodes
> > is not available.
> >
> > 2.2 The user does not provide explicit node mappings for CPUs.
> > QEMU steps in and assigns possible CPUs to nodes in machine_numa_finish_cpu_init()
> > (using the same machine_set_cpu_numa_node()) right before calling the
> > board-specific machine init(). At that time the total number of nodes is known.
> >
> > In cases 1 and 2.1, 'arch_id' in the 'possible_cpus' list doesn't have to be defined before
> > the board's init() is run.
> >
> > In case 2.2 it calls get_default_cpu_node_id() -> x86_get_default_cpu_node_id(),
> > which uses arch_id to calculate the NUMA node.
> > But then the question is: does it have to use the APIC ID, or could it infer the
> > 'pkg_id' it's after from ms->possible_cpus->cpus[i].props data?
>
> Not sure if I got the question right. In this case because the numa
> information is not provided all the cpus are assigned to only one node.
> The apic id is used here to get the correct pkg_id.
apicid was composed from the socket/core/thread[/die] tuple, which is what cpus[i].props holds.
The question is whether we can compose just pkg_id from the same data, without
converting it to apicid and then "reverse engineering" it back to the
original data.
Or, to ask more directly: is socket-id the same as pkg_id?
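
To make the question concrete, here is a minimal sketch comparing the two
routes. Everything in it is a simplified assumption rather than QEMU code: the
cpu_props struct stands in for the tuple in cpus[i].props, the packing is a
generic socket|core|thread layout with each field padded to a power of two, and
the node is picked as pkg_id modulo the node count.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the tuple kept in cpus[i].props. */
struct cpu_props {
    unsigned socket_id;
    unsigned core_id;
    unsigned thread_id;
};

/* Number of bits needed to represent 'count' distinct values (count >= 1). */
static unsigned bits_for(unsigned count)
{
    unsigned bits = 0;

    assert(count >= 1);
    while ((1u << bits) < count) {
        bits++;
    }
    return bits;
}

/* Generic packing: socket | core | thread. */
static uint32_t apicid_from_props(const struct cpu_props *p,
                                  unsigned cores, unsigned threads)
{
    unsigned t_bits = bits_for(threads);
    unsigned c_bits = bits_for(cores);

    return (p->socket_id << (c_bits + t_bits)) |
           (p->core_id << t_bits) | p->thread_id;
}

/* Round trip: build the APIC ID, then decode the package ID back out. */
static unsigned node_via_apicid(const struct cpu_props *p, unsigned cores,
                                unsigned threads, unsigned nb_nodes)
{
    uint32_t apicid = apicid_from_props(p, cores, threads);
    unsigned pkg_id = apicid >> (bits_for(cores) + bits_for(threads));

    return pkg_id % nb_nodes;
}

/* Direct route: take the socket ID straight from the properties. */
static unsigned node_from_props(const struct cpu_props *p, unsigned nb_nodes)
{
    return p->socket_id % nb_nodes;
}
```

Under this generic packing both routes return the same node, because the
decoded pkg_id is just the socket_id that went in. Whether that still holds
for the EPYC encoding is exactly what is being asked here.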
>
> >
> > With that out of the way, the APIC ID will be used only during the board's init(),
> > so the board could update possible_cpus with valid APIC IDs at the start of
> > x86_cpus_init().
> >
> > ====
> > In a nutshell, it would be much easier to do the following:
> >
> > 1. make x86_get_default_cpu_node_id() independent of the APIC ID or,
> > if that is impossible, recompute APIC IDs there if the CPU
> > type is EPYC based (since the number of nodes is already known)
> > 2. recompute APIC IDs in x86_cpus_init() if the CPU type is EPYC based
> >
> > This way one doesn't need to touch generic numa code or introduce an
> > x86-specific init_apicid_fn() hook into generic code, and the
> > x86/EPYC nuances stay contained within x86 code only.
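
A rough sketch of what step 2 could look like, under stated assumptions: the
cpu_slot struct, its fields, and both function names are hypothetical and only
mimic an entry of possible_cpus; the per-socket node ID is assumed to be
already known at board init time; and SMT is assumed to be enabled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one possible_cpus entry. */
struct cpu_slot {
    unsigned socket_id, node_id, ccx_id, core_id, thread_id;
    uint32_t arch_id;  /* APIC ID; generic value computed earlier */
};

/* EPYC layout from the PPR excerpt, SMT case only. */
static uint32_t epyc_arch_id(const struct cpu_slot *s)
{
    assert(s->socket_id < 2 && s->node_id < 4 && s->ccx_id < 2);
    assert(s->core_id < 4 && s->thread_id < 2);

    return (s->socket_id << 6) | (s->node_id << 4) | (s->ccx_id << 3) |
           (s->core_id << 1) | s->thread_id;
}

/*
 * Called once at the start of board init, when the NUMA configuration is
 * final. Non-EPYC CPU types keep the generic IDs they already have.
 */
static void recompute_arch_ids(struct cpu_slot *slots, size_t n,
                               bool cpu_is_epyc)
{
    size_t i;

    if (!cpu_is_epyc) {
        return;
    }
    for (i = 0; i < n; i++) {
        slots[i].arch_id = epyc_arch_id(&slots[i]);
    }
}
```

The point of the shape is that the EPYC special case is a single x86-local
pass over the list, with no callback registered in generic machine or numa
code.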
>
> I was kind of already working in a similar direction in v4.
> 1. We have already split the numa initialization in patch #12 (Split the
> numa initialization). This way we know exactly how many numa nodes
> there are beforehand.
I suggest dropping that patch. It's the one that touches generic numa
code and adds more legacy-based extensions like cpu_indexes,
which I'd like to get rid of to begin with, so that only -numa cpu is left.
I think it's not necessary to touch numa code at all for apicid generation
purposes, as I tried to explain above. We should be able to keep
this an x86-only business.
> 2. Planning to remove init_apicid_fn
> 3. Insert the handlers inside X86CPUDefinition.
what handlers do you mean?
> 4. The EPYC model will have its own apic id handlers. Everything else will be
> initialized with the default handlers (the current default handler).
> 5. The function pc_possible_cpu_arch_ids will load the model definition
> and initialize the PCMachineState data structure with the model specific
> handlers.
I'm not sure what you mean here.
> Does that sound similar to what you are thinking? Thoughts?
If you have something to share and can push it to github,
I can look at whether it has design issues, to spare you a round trip on the list.
(It won't be a proper review, but at least I can help pinpoint the most problematic parts.)
>
> >
> >> v3:
> >> 1. Consolidated the topology information in structure X86CPUTopoInfo.
> >> 2. Changed the ccx_id to llc_id as commented by upstream.
> >> 3. Generalized the apic id decoding. It is mostly similar to the current apic id
> >> except that it adds a new field llc_id when numa is configured. Removes all the
> >> hardcoded values.
> >> 4. Removed the earlier parse_numa split and moved the numa node initialization
> >> inside numa_complete_configuration. This is a bit cleaner, as commented by
> >> Eduardo.
> >> 5. Added a new function init_apicid_fn inside the machine_class structure. This
> >> will be used to update the apic id handler specific to the cpu model.
> >> 6. Updated the cpuid unit tests.
> >> 7. TODO: Need to figure out how to dynamically update the handlers using cpu models.
> >> I might need some guidance on that.
> >>
> >> v2:
> >> https://lore.kernel.org/qemu-devel/156779689013.21957.1631551572950676212.stgit@localhost.localdomain/
> >> 1. Introduced the new property epyc to enable new epyc mode.
> >> 2. Separated the epyc mode and non epyc mode function.
> >> 3. Introduced function pointers in PCMachineState to handle the
> >> differences.
> >> 4. Mildly tested different combinations to make sure things are working as expected.
> >> 5. TODO : Setting the epyc feature bit needs to be worked out. This feature is
> >> supported only on AMD EPYC models. I may need some guidance on that.
> >>
> >> v1:
> >> https://lore.kernel.org/qemu-devel/20190731232032.51786-1-babu.moger@amd.com/
> >>
> >> ---
> >>
> >> Babu Moger (18):
> >> hw/i386: Rename X86CPUTopoInfo structure to X86CPUTopoIDs
> >> hw/i386: Introduce X86CPUTopoInfo to contain topology info
> >> hw/i386: Consolidate topology functions
> >> hw/i386: Introduce initialize_topo_info to initialize X86CPUTopoInfo
> >> machine: Add SMP Sockets in CpuTopology
> >> hw/core: Add core complex id in X86CPU topology
> >> machine: Add a new function init_apicid_fn in MachineClass
> >> hw/i386: Update structures for nodes_per_pkg
> >> i386: Add CPUX86Family type in CPUX86State
> >> hw/386: Add EPYC mode topology decoding functions
> >> i386: Cleanup and use the EPYC mode topology functions
> >> numa: Split the numa initialization
> >> hw/i386: Introduce apicid_from_cpu_idx in PCMachineState
> >> hw/i386: Introduce topo_ids_from_apicid handler PCMachineState
> >> hw/i386: Introduce apic_id_from_topo_ids handler in PCMachineState
> >> hw/i386: Introduce EPYC mode function handlers
> >> i386: Fix pkg_id offset for epyc mode
> >> tests: Update the Unit tests
> >>
> >>
> >> hw/core/machine-hmp-cmds.c | 3 +
> >> hw/core/machine.c | 14 +++
> >> hw/core/numa.c | 62 +++++++++----
> >> hw/i386/pc.c | 132 +++++++++++++++++++---------
> >> include/hw/boards.h | 3 +
> >> include/hw/i386/pc.h | 9 ++
> >> include/hw/i386/topology.h | 209 +++++++++++++++++++++++++++++++-------------
> >> include/sysemu/numa.h | 5 +
> >> qapi/machine.json | 7 +
> >> target/i386/cpu.c | 196 ++++++++++++-----------------------------
> >> target/i386/cpu.h | 9 ++
> >> tests/test-x86-cpuid.c | 115 ++++++++++++++----------
> >> vl.c | 4 +
> >> 13 files changed, 455 insertions(+), 313 deletions(-)
> >>
> >> --
> >>
> >
>