From: Igor Mammedov <imammedo@redhat.com>
To: Andrew Jones <drjones@redhat.com>
Cc: peter.maydell@linaro.org, Gavin Shan <gshan@redhat.com>,
ehabkost@redhat.com, robh@kernel.org, qemu-devel@nongnu.org,
qemu-arm@nongnu.org, shan.gavin@gmail.com
Subject: Re: [PATCH v3 1/2] numa: Require distance map when empty node exists
Date: Wed, 13 Oct 2021 13:53:46 +0200
Message-ID: <20211013135346.3a8f6c9a@redhat.com>
In-Reply-To: <20211013113544.4xrfagduw4nlbvou@gator.home>

On Wed, 13 Oct 2021 13:35:44 +0200
Andrew Jones <drjones@redhat.com> wrote:
> On Wed, Oct 13, 2021 at 01:30:11PM +0200, Igor Mammedov wrote:
> > On Wed, 13 Oct 2021 12:58:04 +0800
> > Gavin Shan <gshan@redhat.com> wrote:
> >
> > > The following option is used to specify the distance map. It's
> > > possible the option isn't provided by user. In this case, the
> > > distance map isn't populated and exposed to platform. On the
> > > other hand, the empty NUMA node, where no memory resides, is
> > > allowed on platforms like ARM64 virt. For these empty NUMA
> > > nodes, their corresponding device-tree nodes aren't populated,
> > > but their NUMA IDs should be included in the "/distance-map"
> > > device-tree node, so that kernel can probe them properly if
> > > device-tree is used.
> > >
> > > -numa dist,src=<numa_id>,dst=<numa_id>,val=<distance>
> > >
> > > This adds extra check after the machine is initialized, to
> > > ask for the distance map from user when empty nodes exist in
> > > device-tree.
> > >
> > > Signed-off-by: Gavin Shan <gshan@redhat.com>
> > > ---
> > > hw/core/machine.c | 4 ++++
> > > hw/core/numa.c | 24 ++++++++++++++++++++++++
> > > include/sysemu/numa.h | 1 +
> > > 3 files changed, 29 insertions(+)
> > >
> > > diff --git a/hw/core/machine.c b/hw/core/machine.c
> > > index b8d95eec32..c0765ad973 100644
> > > --- a/hw/core/machine.c
> > > +++ b/hw/core/machine.c
> > > @@ -1355,6 +1355,10 @@ void machine_run_board_init(MachineState *machine)
> > >      accel_init_interfaces(ACCEL_GET_CLASS(machine->accelerator));
> > >      machine_class->init(machine);
> > >      phase_advance(PHASE_MACHINE_INITIALIZED);
> > > +
> > > +    if (machine->numa_state) {
> > > +        numa_complete_validation(machine);
> > > +    }
> > >  }
> > >
> > > static NotifierList machine_init_done_notifiers =
> > > diff --git a/hw/core/numa.c b/hw/core/numa.c
> > > index 510d096a88..7404b7dd38 100644
> > > --- a/hw/core/numa.c
> > > +++ b/hw/core/numa.c
> > > @@ -727,6 +727,30 @@ void numa_complete_configuration(MachineState *ms)
> > > }
> > > }
> > >
> > > +/*
> > > + * When device-tree is used by the machine, the empty node IDs should
> > > + * be included in the distance map. So we need provide pairs of distances
> > > + * in this case.
> > > + */
> > > +void numa_complete_validation(MachineState *ms)
> > > +{
> > > +    NodeInfo *numa_info = ms->numa_state->nodes;
> > > +    int nb_numa_nodes = ms->numa_state->num_nodes;
> > > +    int i;
> > > +
> > > +    if (!ms->fdt || ms->numa_state->have_numa_distance) {
> >
> > also skip check/limitation when VM is launched with ACPI enabled?
>
> Even with ACPI enabled we generate a DT and would like that DT to be as
> complete as possible. Of course we should generate a SLIT table with

The guest will work just fine without a distance map, since SRAT
describes all NUMA nodes. You are forcing the VM to have a SLIT just
for the sake of 'completeness', even though it's practically unused.

I'm still unsure about pushing all of this into generic NUMA code,
as only ARM will use it for now. It's better to keep it ARM-specific;
when a RISC-V machine starts using it, it can be moved to generic code.

> the distance information the user provides on the command line in order
> to satisfy the check, and we will, since we already have that code in
> place.
>
> Thanks,
> drew
>
> >
> > > > +        return;
> > > > +    }
> > > > +
> > > > +    for (i = 0; i < nb_numa_nodes; i++) {
> > > > +        if (numa_info[i].present && !numa_info[i].node_mem) {
> > > > +            error_report("Empty node %d found, please provide "
> > > > +                         "distance map.", i);
> > > > +            exit(EXIT_FAILURE);
> > > > +        }
> > > > +    }
> > > > +}
> > > +
> > > void parse_numa_opts(MachineState *ms)
> > > {
> > > qemu_opts_foreach(qemu_find_opts("numa"), parse_numa, ms, &error_fatal);
> > > diff --git a/include/sysemu/numa.h b/include/sysemu/numa.h
> > > index 4173ef2afa..80f25ab830 100644
> > > --- a/include/sysemu/numa.h
> > > +++ b/include/sysemu/numa.h
> > > @@ -104,6 +104,7 @@ void parse_numa_hmat_lb(NumaState *numa_state, NumaHmatLBOptions *node,
> > > void parse_numa_hmat_cache(MachineState *ms, NumaHmatCacheOptions *node,
> > > Error **errp);
> > > void numa_complete_configuration(MachineState *ms);
> > > +void numa_complete_validation(MachineState *ms);
> > > void query_numa_node_mem(NumaNodeMem node_mem[], MachineState *ms);
> > > extern QemuOptsList qemu_numa_opts;
> > > void numa_cpu_pre_plug(const struct CPUArchId *slot, DeviceState *dev,
> >
>
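
For illustration only, here is a minimal standalone sketch of the check
discussed above, with the ACPI skip I asked about added. It is not QEMU
code: SketchNodeInfo, SketchNumaConfig and
sketch_first_node_needing_distance_map() are hypothetical names, and the
has_fdt/acpi_enabled flags are assumptions standing in for whatever the
board code would actually consult. In an ARM-specific variant this logic
would sit in the board code rather than in hw/core/numa.c.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the per-node NUMA info. */
typedef struct {
    bool present;       /* node was configured on the command line */
    uint64_t node_mem;  /* bytes of memory assigned to the node */
} SketchNodeInfo;

/* Hypothetical view of the state the check needs (not a QEMU type). */
typedef struct {
    bool has_fdt;             /* board generates a device tree */
    bool acpi_enabled;        /* assumption: guest is described via ACPI */
    bool have_numa_distance;  /* user supplied "-numa dist" entries */
    int num_nodes;
    const SketchNodeInfo *nodes;
} SketchNumaConfig;

/*
 * Return the ID of the first present node without memory that would
 * require a user-provided distance map, or -1 if no such node exists
 * or the check does not apply.
 */
static int sketch_first_node_needing_distance_map(const SketchNumaConfig *cfg)
{
    int i;

    /* No device tree, ACPI in use, or distances already given: skip. */
    if (!cfg->has_fdt || cfg->acpi_enabled || cfg->have_numa_distance) {
        return -1;
    }

    for (i = 0; i < cfg->num_nodes; i++) {
        if (cfg->nodes[i].present && cfg->nodes[i].node_mem == 0) {
            return i;
        }
    }

    return -1;
}
```

A caller would report an error and exit only when the return value is
non-negative, which keeps the policy (fail, warn, or fill in defaults)
separate from the detection itself.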