From: Jonathan Cameron <Jonathan.Cameron@Huawei.com>
To: David Hildenbrand <david@redhat.com>
Cc: peter.maydell@linaro.org, Andrew Jones <drjones@redhat.com>,
	Gavin Shan <gshan@redhat.com>,
	ehabkost@redhat.com, alison.schofield@intel.com,
	richard.henderson@linaro.org, qemu-devel@nongnu.org,
	qemu-arm@nongnu.org, shan.gavin@gmail.com,
	Igor Mammedov <imammedo@redhat.com>,
	Dan Williams <dan.j.williams@intel.com>
Subject: Re: [PATCH v2] hw/arm/virt: Expose empty NUMA nodes through ACPI
Date: Wed, 17 Nov 2021 14:30:15 +0000
Message-ID: <20211117143015.00002e0a@Huawei.com>
In-Reply-To: <188faab7-1e57-2bc1-846f-9457433c2f9d@redhat.com>

On Tue, 16 Nov 2021 12:11:29 +0100
David Hildenbrand <david@redhat.com> wrote:

> >>
> >> Examples include exposing HBM or PMEM to the VM. Just like on real HW,
> >> this memory is exposed via cpu-less, special nodes. In contrast to real
> >> HW, the memory is hotplugged later (I don't think HW supports hotplug
> >> like that yet, but it might just be a matter of time).  
> > 
> > I suppose some of that maybe covered by GENERIC_AFFINITY entries in SRAT
> > some by MEMORY entries. Or nodes created dynamically like with normal
> > hotplug memory.
> >   

The naming of the define is unhelpful.  GENERIC_AFFINITY here corresponds
to Generic Initiator Affinity, so it is no good for memory.  It is meant to
represent accelerators, network cards etc. so that you can get the NUMA
characteristics of their accesses to memory in other nodes.

My understanding of 'traditional' memory hotplug is that typically the
PA into which memory is hotplugged is known at boot time whether or not
the memory is physically present.  As such, you present that in SRAT and rely
on the EFI memory map / other information sources to know the memory isn't
there.  When it is hotplugged later the address is looked up in SRAT to identify
the NUMA node.
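
A rough sketch of that lookup, in Python for brevity.  The names
(MemAffinity, node_for_address) and the example addresses are made up for
illustration; they are not kernel or ACPI identifiers, and the real SRAT
Memory Affinity structure carries more fields than shown here.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class MemAffinity:
    """Hypothetical stand-in for one SRAT memory affinity entry."""
    base: int               # start of the physical address range
    length: int             # size of the range in bytes
    proximity_domain: int   # NUMA node the range belongs to
    hotpluggable: bool = False


def node_for_address(srat: List[MemAffinity], pa: int) -> Optional[int]:
    """Return the node whose SRAT range contains pa, or None if no range does."""
    for entry in srat:
        if entry.base <= pa < entry.base + entry.length:
            return entry.proximity_domain
    return None


# Node 0 has memory at boot; node 1 only has a reserved, hotpluggable range.
srat = [
    MemAffinity(0x4000_0000, 0x1_0000_0000, 0),
    MemAffinity(0x2_0000_0000, 0x1_0000_0000, 1, hotpluggable=True),
]

# Memory hotplugged later at this address is attributed to node 1.
hotplug_node = node_for_address(srat, 0x2_4000_0000)
```

An address outside every range returns None, which is exactly the case
the fixed-at-boot model cannot handle.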

That model is less useful for more flexible entities like virtio-mem or
indeed physical hardware such as CXL type 3 memory devices which typically
need their own nodes.

For the CXL type 3 option, the current proposal is to use the CXL table entries
representing Physical Address space regions to work out how many NUMA nodes
are needed and just create extra ones at boot.
https://lore.kernel.org/linux-cxl/163553711933.2509508.2203471175679990.stgit@dwillia2-desk3.amr.corp.intel.com

It's a heuristic, as we might need more nodes to represent things well kernel
side, but it's better than nothing and less effort than true dynamic node creation.
If you chase through the earlier versions of Alison's patch you will find some
discussion of that.
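
To make the heuristic concrete, here is a minimal sketch in Python.  It
assumes one extra node per physical address window that SRAT does not
already cover; that reading, the function name and the tuple layouts are
my assumptions for illustration, not what the linked patch literally does.

```python
from typing import Dict, List, Tuple

Window = Tuple[int, int]          # (base, length) of a CXL address window
SratRange = Tuple[int, int, int]  # (base, length, node) taken from SRAT


def assign_window_nodes(srat_ranges: List[SratRange],
                        windows: List[Window],
                        first_free_node: int) -> Dict[Window, int]:
    """Map each window to a node, creating new node ids where SRAT is silent."""
    assigned: Dict[Window, int] = {}
    next_node = first_free_node
    for base, length in windows:
        # Reuse the SRAT node if one range fully contains the window.
        covered = next((node for b, l, node in srat_ranges
                        if b <= base and base + length <= b + l), None)
        if covered is not None:
            assigned[(base, length)] = covered
        else:
            # Otherwise hand out a fresh node id at boot.
            assigned[(base, length)] = next_node
            next_node += 1
    return assigned


srat_ranges = [(0x0, 0x1_0000_0000, 0), (0x10_0000_0000, 0x4_0000_0000, 1)]
windows = [
    (0x10_0000_0000, 0x4_0000_0000),  # already described by SRAT as node 1
    (0x20_0000_0000, 0x4_0000_0000),  # not in SRAT
    (0x30_0000_0000, 0x4_0000_0000),  # not in SRAT
]
window_nodes = assign_window_nodes(srat_ranges, windows, first_free_node=2)
```

With these inputs the two uncovered windows end up on nodes 2 and 3, so
the node count is fixed by the number of windows rather than by how many
devices eventually sit behind them.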

I wonder if virtio-mem should just grow a CDAT instance via a DOE?

That would make all this stuff discoverable via PCI config space rather than ACPI.
CDAT is at:
https://uefi.org/sites/default/files/resources/Coherent%20Device%20Attribute%20Table_1.01.pdf
but the table access protocol over PCI DOE is currently in the CXL 2.0 spec
(nothing stops others using it though AFAIK).

However, then we'd actually need either dynamic node creation in the OS, or
some sort of reserved pool of extra nodes.  Long term it may be the most
flexible option.
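
For the reserved pool variant, the bookkeeping could look something like
the sketch below.  NodePool and the device names are hypothetical; this
only illustrates handing out pre-created spare nodes on hotplug and says
nothing about how an OS would actually implement it.

```python
from typing import Dict, Iterable, List


class NodePool:
    """Hypothetical pool of NUMA node ids reserved at boot for later use."""

    def __init__(self, spare_nodes: Iterable[int]) -> None:
        self._free: List[int] = sorted(spare_nodes)
        self._owner: Dict[str, int] = {}

    def claim(self, device: str) -> int:
        """Give a device its own node; repeated claims return the same node."""
        if device in self._owner:
            return self._owner[device]
        if not self._free:
            raise RuntimeError("no spare NUMA nodes left in the pool")
        node = self._free.pop(0)
        self._owner[device] = node
        return node

    def release(self, device: str) -> None:
        """Return a device's node to the pool, e.g. on hot-unplug."""
        node = self._owner.pop(device)
        self._free.append(node)
        self._free.sort()


pool = NodePool(spare_nodes=[4, 5])
first = pool.claim("virtio-mem0")
second = pool.claim("cxl-mem0")
pool.release("virtio-mem0")
reused = pool.claim("virtio-mem1")
```

The obvious limitation is that the pool size has to be guessed at boot;
once it is exhausted, claim() fails, which is where true dynamic node
creation would be needed instead.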

Jonathan

> 
> I'm certainly no SRAT expert, but seems like under VMWare something
> similar can happen:
> 
> https://lkml.kernel.org/r/BAE95F0C-FAA7-40C6-A0D6-5049B1207A27@vmware.com
> 
> "VM was powered on with 4 vCPUs (4 NUMA nodes) and 4GB memory.
> ACPI SRAT reports 128 possible CPUs and 128 possible NUMA nodes."
> 
> Note that that discussion is about hotplugging CPUs to memory-less,
> hotplugged nodes.
> 
> But there seems to be some way to expose possible NUMA nodes. Maybe
> that's via GENERIC_AFFINITY.
> 


