From: Igor Mammedov <imammedo@redhat.com>
To: Gavin Shan <gshan@redhat.com>
Cc: peter.maydell@linaro.org, Andrew Jones <drjones@redhat.com>,
	ehabkost@redhat.com, David Hildenbrand <david@redhat.com>,
	richard.henderson@linaro.org, qemu-devel@nongnu.org,
	qemu-arm@nongnu.org, shan.gavin@gmail.com
Subject: Re: [PATCH v2] hw/arm/virt: Expose empty NUMA nodes through ACPI
Date: Wed, 10 Nov 2021 11:33:04 +0100
Message-ID: <20211110113304.2d713d4a@redhat.com>
In-Reply-To: <b8ed4687-e30a-d70f-0816-bd8ba490ceb7@redhat.com>

On Fri, 5 Nov 2021 23:47:37 +1100
Gavin Shan <gshan@redhat.com> wrote:

> Hi Drew and Igor,
> 
> On 11/2/21 6:39 PM, Andrew Jones wrote:
> > On Tue, Nov 02, 2021 at 10:44:08AM +1100, Gavin Shan wrote:  
> >>
> >> Yeah, I agree. I don't have strong sense to expose these empty nodes
> >> for now. Please ignore the patch.
> >>  
> > 
> > So were describing empty numa nodes on the command line ever a reasonable
> > thing to do? What happens on x86 machine types when describing empty numa
> > nodes? I'm starting to think that the solution all along was just to
> > error out when a numa node has memory size = 0...

Memoryless nodes are fine as long as another type of device
(apic/gic/...) describes the node.
But the spec has no way to describe completely empty nodes,
and I dislike adding out-of-spec entries just to fake an empty node.


> Sorry for the delay as I spent a few days looking into linux virtio-mem
> driver. I'm afraid we still need this patch for ARM64. I don't think x86

Does it behave the same way if using pc-dimm hotplug instead of virtio-mem?

CCing David,
as it might be a virtio-mem issue.

PS:
Maybe for virtio-mem-pci we need to add a GENERIC_AFFINITY entry to the SRAT
and describe it as a PCI device (we don't do that yet, if I'm not mistaken).
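
To make the GENERIC_AFFINITY idea concrete, here is a rough sketch of what
one such SRAT entry could look like on the wire. It is not QEMU code. The
field layout is my reading of the Generic Initiator Affinity Structure in
ACPI 6.3 (type 5, 32 bytes, PCI device handle), and the helper name, the
PCI address (segment 0, bus 0, device 3, function 0) and the proximity
domain 2 are made up for illustration.

```python
import struct

# Assumed constants, taken from my reading of ACPI 6.3 (not from QEMU).
SRAT_TYPE_GENERIC_INITIATOR = 5  # structure type
DEVICE_HANDLE_PCI = 1            # device handle type: PCI
FLAG_ENABLED = 1 << 0            # flags bit 0


def generic_initiator_pci(pxm, segment, bus, dev, fn):
    """Pack one 32-byte Generic Initiator Affinity Structure (hypothetical helper)."""
    devfn = (dev << 3) | fn      # device in bits 7:3, function in bits 2:0
    return struct.pack(
        "<BBBBIHBB12xI4x",
        SRAT_TYPE_GENERIC_INITIATOR,  # type
        32,                           # length
        0,                            # reserved
        DEVICE_HANDLE_PCI,            # device handle type
        pxm,                          # proximity domain
        segment,                      # PCI segment
        bus,                          # PCI bus number
        devfn,                        # PCI device/function
        FLAG_ENABLED,                 # flags; trailing 4 bytes are reserved
    )


# Made-up example: a virtio-mem-pci device at 0000:00:03.0 assigned to node 2.
entry = generic_initiator_pci(pxm=2, segment=0, bus=0, dev=3, fn=0)
```

Whether the guest kernel would then treat that proximity domain as a
possible node for hot-added memory is exactly the open question here.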

> has this issue even though I didn't experiment on X86. For example, I
> have the following command lines. The hot added memory is put into node#0
> instead of node#2, which is wrong.
> 
> There are several bitmaps tracking the node states in Linux kernel. One of
> them is @possible_map, which tracks the nodes that are available but don't
> have to be online. @possible_map is populated from the following ACPI tables.
> 
>    ACPI_SRAT_TYPE_MEMORY_AFFINITY
>    ACPI_SRAT_TYPE_GENERIC_AFFINITY
>    ACPI_SIG_SLIT                          # if it exists when optional distance map
>                                           # is provided on QEMU side.
> 
> Note: Drew might ask why we have node#2 in "/sys/devices/system/node" again.
> hw/arm/virt-acpi-build.c::build_srat() creates additional node in ACPI SRAT
> table and the node's PXM is 3 ((ms->numa_state->num_nodes - 1)) in this case,
> but linux kernel assigns node#2 to it.
> 
>    /home/gavin/sandbox/qemu.main/build/qemu-system-aarch64 \
>    -accel kvm -machine virt,gic-version=host               \
>    -cpu host -smp 4,sockets=2,cores=2,threads=1            \
>    -m 1024M,slots=16,maxmem=64G                            \
>    -object memory-backend-ram,id=mem0,size=512M            \
>    -object memory-backend-ram,id=mem1,size=512M            \
>    -numa node,nodeid=0,cpus=0-1,memdev=mem0                \
>    -numa node,nodeid=1,cpus=2-3,memdev=mem1                \
>    -numa node,nodeid=2 -numa node,nodeid=3                 \
>    -object memory-backend-ram,id=vmem0,size=512M           \
>    -device virtio-mem-pci,id=vm0,memdev=vmem0,node=2,requested-size=0 \
>    -object memory-backend-ram,id=vmem1,size=512M           \
>    -device virtio-mem-pci,id=vm1,memdev=vmem1,node=3,requested-size=0
>       :
>    # ls  /sys/devices/system/node | grep node
>    node0
>    node1
>    node2
>    # cat /proc/meminfo | grep MemTotal\:
>    MemTotal:        1003104 kB
>    # cat /sys/devices/system/node/node0/meminfo | grep MemTotal\:
>    Node 0 MemTotal: 524288 kB
> 
>    (qemu) qom-set vm0 requested-size 512M
>    # cat /proc/meminfo | grep MemTotal\:
>    MemTotal:        1527392 kB
>    # cat /sys/devices/system/node/node0/meminfo | grep MemTotal\:
>    Node 0 MemTotal: 1013652 kB
> 
> Try above test after the patch is applied. The hot added memory is
> put into node#2 correctly as the user expected.
> 
>    # ls  /sys/devices/system/node | grep node
>    node0
>    node1
>    node2
>    node3
>    # cat /proc/meminfo | grep MemTotal\:
>    MemTotal:        1003100 kB
>    # cat /sys/devices/system/node/node2/meminfo | grep MemTotal\:
>    Node 2 MemTotal: 0 kB
> 
>    (qemu) qom-set vm0 requested-size 512M
>    # cat /proc/meminfo | grep MemTotal\:
>    MemTotal:        1527388 kB
>    # cat /sys/devices/system/node/node2/meminfo | grep MemTotal\:
>    Node 2 MemTotal: 524288 kB
> 
> Thanks,
> Gavin
> 
> 
>    
> 
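
For anyone repeating the comparison above with pc-dimm, the per-node checks
can be scripted. This is a minimal sketch that only assumes the sysfs
meminfo format shown in the quoted output ("Node N MemTotal: X kB"). The
function names are mine, not from any existing tool.

```python
import glob
import re

# Matches lines such as "Node 2 MemTotal:       524288 kB".
_MEMTOTAL = re.compile(r"Node\s+(\d+)\s+MemTotal:\s+(\d+)\s+kB")


def parse_memtotal(text):
    """Return (node, kB) from a node meminfo dump, or None if not found."""
    m = _MEMTOTAL.search(text)
    return (int(m.group(1)), int(m.group(2))) if m else None


def node_memtotals(pattern="/sys/devices/system/node/node*/meminfo"):
    """Collect MemTotal in kB for every node the guest kernel exposes."""
    totals = {}
    for path in glob.glob(pattern):
        with open(path) as f:
            parsed = parse_memtotal(f.read())
        if parsed:
            node, kb = parsed
            totals[node] = kb
    return totals


if __name__ == "__main__":
    for node, kb in sorted(node_memtotals().items()):
        print(f"node{node}: {kb} kB")
```

Running it in the guest before and after the qom-set shows which node
actually received the hot-added memory.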

