From: Igor Mammedov <imammedo@redhat.com>
To: Tao Xu <tao3.xu@intel.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>,
"Liu, Jingqi" <jingqi.liu@intel.com>,
"Du, Fan" <fan.du@intel.com>,
Qemu Developers <qemu-devel@nongnu.org>,
"daniel@linux.ibm.com" <daniel@linux.ibm.com>,
Jonathan Cameron <jonathan.cameron@huawei.com>,
Dan Williams <dan.j.williams@intel.com>
Subject: Re: [Qemu-devel] [PATCH v9 05/11] numa: Extend CLI to provide initiator information for numa nodes
Date: Tue, 27 Aug 2019 15:12:06 +0200
Message-ID: <20190827151206.7b2ddce5@redhat.com>
In-Reply-To: <b0a958d6-2e2c-ab4b-36b7-b1fc13e0c2e3@intel.com>
On Tue, 20 Aug 2019 16:34:44 +0800
Tao Xu <tao3.xu@intel.com> wrote:
> On 8/16/2019 10:57 PM, Igor Mammedov wrote:
> > On Wed, 14 Aug 2019 19:31:27 -0700
> > Dan Williams <dan.j.williams@intel.com> wrote:
> >
> >> On Wed, Aug 14, 2019 at 6:57 PM Tao Xu <tao3.xu@intel.com> wrote:
> >>>
> >>> On 8/15/2019 5:29 AM, Dan Williams wrote:
> >>>> On Tue, Aug 13, 2019 at 10:14 PM Tao Xu <tao3.xu@intel.com> wrote:
> >>>>>
> >>>>> On 8/14/2019 10:39 AM, Dan Williams wrote:
> >>>>>> On Tue, Aug 13, 2019 at 8:00 AM Igor Mammedov <imammedo@redhat.com> wrote:
> >>>>>>>
> >>>>>>> On Fri, 9 Aug 2019 14:57:25 +0800
> >>>>>>> Tao <tao3.xu@intel.com> wrote:
> >>>>>>>
> >>>>>>>> From: Tao Xu <tao3.xu@intel.com>
> >>>>>>>>
> >>> [...]
> >>>>>>>> + for (i = 0; i < machine->numa_state->num_nodes; i++) {
> >>>>>>>> + if (numa_info[i].initiator_valid &&
> >>>>>>>> + !numa_info[numa_info[i].initiator].has_cpu) {
> >>>>>>> ^^^^^^^^^^^^^^^^^^^^^^ possible out-of-bounds read, see below
> >>>>>>>
> >>>>>>>> + error_report("The initiator-id %"PRIu16 " of NUMA node %d"
> >>>>>>>> + " does not exist.", numa_info[i].initiator, i);
> >>>>>>>> + error_printf("\n");
> >>>>>>>> +
> >>>>>>>> + exit(1);
> >>>>>>>> + }
> >>>>>>> This only takes care of nodes that have CPUs, or memory-only nodes that have
> >>>>>>> an initiator explicitly provided on the CLI. It leaves open the possibility of
> >>>>>>> memory-only nodes without an initiator mixed with nodes that have one.
> >>>>>>> Is a mixed configuration valid?
> >>>>>>> Should we forbid it?
> >>>>>>
> >>>>>> The spec talks about the "Proximity Domain for the Attached Initiator"
> >>>>>> field only being valid if the memory controller for the memory can be
> >>>>>> identified by an initiator id in the SRAT. So I expect the only way to
> >>>>>> define a memory proximity domain without this local initiator is to
> >>>>>> allow specifying a node-id that does not have an entry in the SRAT.
> >>>>>>
> >>>>> Hi Dan,
> >>>>>
> >>>>> So there may be a situation where the Attached Initiator field is not
> >>>>> valid? If so, I would allow the user to specify an invalid initiator.
> >>>>
> >>>> Yes it's something the OS needs to consider because the platform may
> >>>> not be able to meet the constraint that a single initiator is
> >>>> associated with the memory controller for a given memory target. In
> >>>> retrospect it would have been nice if the spec reserved 0xffffffff for
> >>>> this purpose, but it seems "not in SRAT" is the only way to identify
> >>>> memory that is not attached to any single initiator.
> >>>>
> >>> But as far as I know, QEMU can't emulate a NUMA node "not in SRAT". I am
> >>> wondering whether it is enough to only set the initiator to an invalid value?
> >>
> >> You don't need to emulate a NUMA node not in SRAT. Just put a number
> >> in this HMAT entry larger than the largest proximity domain number
> >> found in the SRAT.
> >>>
> >>
> >
> > So the behavior is not really defined in the spec
> > (at least I wasn't able to convince myself that the above behavior is in the spec).
> >
> > In this case I'd go with a strict check for now and not allow an invalid initiator
> > (we can easily relax the check later and let it point to nonsense, but not the other way around).
> >
>
> So let me summarize the solution to avoid any misunderstanding; if
> something is wrong, please tell me:
>
> 1)
> -machine hmat=yes \
> -object memory-backend-ram,size=1G,id=m0 \
> -object memory-backend-ram,size=1G,id=m1 \
> -object memory-backend-ram,size=1G,id=m2 \
> -numa node,nodeid=0,memdev=m0 \
> -numa node,nodeid=1,memdev=m1,initiator=0 \
> -numa node,nodeid=2,memdev=m2,initiator=0 \
> -numa cpu,node-id=0,socket-id=0 \
> -numa cpu,node-id=0,socket-id=1
>
> then QEMU can use HMAT.
>
> 2)
> if the initiators are specified as in this case:
>
> -numa node,nodeid=0,memdev=m0 \
> -numa node,nodeid=1,memdev=m1,initiator=0 \
> -numa node,nodeid=2,memdev=m2
>
> then QEMU refuses to boot and shows an error message.
>
> 3)
> if the initiators are specified as in this case:
>
> -numa node,nodeid=0,memdev=m0 \
> -numa node,nodeid=1,memdev=m1,initiator=0 \
> -numa node,nodeid=2,memdev=m2,initiator=1
>
> then QEMU can boot, and the initiator of nodeid=2 is invalid.
In this last case I'd error out instead of booting with invalid config.
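
To make the strict check concrete, here is a minimal standalone sketch, not
the code from the patch. NodeInfo and check_initiators() are hypothetical
names, and the sketch assumes the check only runs when HMAT is enabled and
that every memory-only node must then name an initiator. It also bounds-checks
the initiator before indexing, which covers the out-of-bounds read noted
earlier in the thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical, simplified stand-in for the per-node NUMA information. */
typedef struct {
    bool has_cpu;          /* at least one CPU is assigned to this node */
    bool initiator_valid;  /* initiator= was given on the command line */
    uint16_t initiator;    /* node id passed via initiator= */
} NodeInfo;

/*
 * Strict check (assumed to run only with HMAT enabled).
 * Returns 0 if every node is acceptable, -1 on the first bad node.
 */
int check_initiators(const NodeInfo *nodes, int num_nodes)
{
    assert(nodes != NULL);

    for (int i = 0; i < num_nodes; i++) {
        if (!nodes[i].initiator_valid) {
            /* Nodes with CPUs need no explicit initiator. */
            if (nodes[i].has_cpu) {
                continue;
            }
            /* Memory-only node without initiator: the "mixed" case 2). */
            fprintf(stderr, "NUMA node %d has no CPUs and no initiator\n", i);
            return -1;
        }

        /* Bounds-check before using the initiator as an array index. */
        if (nodes[i].initiator >= num_nodes) {
            fprintf(stderr, "initiator %u of NUMA node %d is out of range\n",
                    (unsigned)nodes[i].initiator, i);
            return -1;
        }

        /* The initiator must be a node that actually has CPUs: case 3). */
        if (!nodes[nodes[i].initiator].has_cpu) {
            fprintf(stderr, "initiator %u of NUMA node %d has no CPUs\n",
                    (unsigned)nodes[i].initiator, i);
            return -1;
        }
    }
    return 0;
}
```

With this sketch, configuration 1) passes, while 2) and 3) are both rejected
before the guest starts.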