Date: Wed, 17 Nov 2021 14:30:15 +0000
From: Jonathan Cameron
To: David Hildenbrand
Cc: peter.maydell@linaro.org, Andrew Jones, Gavin Shan,
 ehabkost@redhat.com, alison.schofield@intel.com,
 richard.henderson@linaro.org, qemu-devel@nongnu.org, qemu-arm@nongnu.org,
 shan.gavin@gmail.com, Igor Mammedov, Dan Williams
Subject: Re: [PATCH v2] hw/arm/virt: Expose empty NUMA nodes through ACPI
Message-ID: <20211117143015.00002e0a@Huawei.com>
In-Reply-To: <188faab7-1e57-2bc1-846f-9457433c2f9d@redhat.com>
References: <20211027052958.280741-1-gshan@redhat.com>
 <20211027174028.1f16fcfb@redhat.com>
 <20211101094431.71e1a50a@redhat.com>
 <47dc3a95-ed77-6c0e-d024-27cb22c338eb@redhat.com>
 <20211102073948.am3p3hcqqd3cfvru@gator.home>
 <20211110113304.2d713d4a@redhat.com>
 <5180ecee-62e2-cd6f-d595-c7c29eff6039@redhat.com>
 <20211112142751.4807ab50@redhat.com>
 <188faab7-1e57-2bc1-846f-9457433c2f9d@redhat.com>
Organization: Huawei Technologies Research and Development (UK) Ltd.

On Tue, 16 Nov 2021 12:11:29 +0100
David Hildenbrand wrote:

> >>
> >> Examples include exposing HBM
> >> or PMEM to the VM. Just like on real HW,
> >> this memory is exposed via cpu-less, special nodes. In contrast to real
> >> HW, the memory is hotplugged later (I don't think HW supports hotplug
> >> like that yet, but it might just be a matter of time).
> >
> > I suppose some of that maybe covered by GENERIC_AFFINITY entries in SRAT
> > some by MEMORY entries. Or nodes created dynamically like with normal
> > hotplug memory.
> >

The naming of the define is unhelpful. GENERIC_AFFINITY here corresponds
to Generic Initiator Affinity, so it is no good for memory. It is meant
for representing accelerators, network cards etc. so that you can get the
NUMA characteristics for them accessing memory in other nodes.

My understanding of 'traditional' memory hotplug is that typically the PA
into which memory is hotplugged is known at boot time, whether or not the
memory is physically present. As such, you present that range in SRAT and
rely on the EFI memory map / other information sources to know the memory
isn't there. When it is hotplugged later, the address is looked up in SRAT
to identify the NUMA node.

That model is less useful for more flexible entities like virtio-mem, or
indeed physical hardware such as CXL type 3 memory devices, which
typically need their own nodes.

For the CXL type 3 option, the current proposal is to use the CXL table
entries representing Physical Address space regions to work out how many
NUMA nodes are needed and just create extra ones at boot.

https://lore.kernel.org/linux-cxl/163553711933.2509508.2203471175679990.stgit@dwillia2-desk3.amr.corp.intel.com

It's a heuristic, as we might need more nodes to represent things well
kernel side, but it's better than nothing and less effort than true
dynamic node creation. If you chase through the earlier versions of
Alison's patch you will find some discussion of that.

I wonder if virtio-mem should just grow a CDAT instance via a DOE?
That would make all this stuff discoverable via PCI config space rather
than ACPI. CDAT is at:

https://uefi.org/sites/default/files/resources/Coherent%20Device%20Attribute%20Table_1.01.pdf

but the table access protocol over PCI DOE is currently in the CXL 2.0
spec (nothing stops others using it though, AFAIK).

However, then we'd actually need either dynamic node creation in the OS,
or some sort of reserved pool of extra nodes. Long term it may be the
most flexible option.

Jonathan

>
> I'm certainly no SRAT expert, but seems like under VMWare something
> similar can happen:
>
> https://lkml.kernel.org/r/BAE95F0C-FAA7-40C6-A0D6-5049B1207A27@vmware.com
>
> "VM was powered on with 4 vCPUs (4 NUMA nodes) and 4GB memory.
> ACPI SRAT reports 128 possible CPUs and 128 possible NUMA nodes."
>
> Note that that discussion is about hotplugging CPUs to memory-less,
> hotplugged nodes.
>
> But there seems to be some way to expose possible NUMA nodes. Maybe
> that's via GENERIC_AFFINITY.
>
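
As a rough illustration of the 'traditional' hotplug model described in
the reply above: firmware lists the possible physical address ranges per
node in SRAT at boot, and a later hot-add finds its NUMA node by looking
the address up in those ranges. This is only a sketch, not kernel or QEMU
code. The class, the function and the example ranges are all hypothetical,
and the entry is a heavily simplified stand-in for an SRAT Memory Affinity
structure.

```python
from dataclasses import dataclass
from typing import Optional, Sequence

GIB = 1 << 30


@dataclass(frozen=True)
class SratMemoryAffinity:
    """Hypothetical, simplified view of one SRAT Memory Affinity entry."""
    base: int                    # start physical address of the range
    length: int                  # size of the range in bytes
    proximity_domain: int        # NUMA node the range belongs to
    hot_pluggable: bool = False  # range may be empty at boot


def node_for_address(srat: Sequence[SratMemoryAffinity],
                     pa: int) -> Optional[int]:
    """Return the node whose SRAT range contains pa, or None if none does."""
    for entry in srat:
        if entry.base <= pa < entry.base + entry.length:
            return entry.proximity_domain
    return None


# Example layout (assumed): node 0 is populated at boot, node 1 is only a
# reserved hot-pluggable window with no memory behind it yet.
srat = [
    SratMemoryAffinity(base=0, length=4 * GIB, proximity_domain=0),
    SratMemoryAffinity(base=4 * GIB, length=4 * GIB, proximity_domain=1,
                       hot_pluggable=True),
]

boot_node = node_for_address(srat, 1 * GIB)      # inside node 0's range
hotplug_node = node_for_address(srat, 5 * GIB)   # inside the reserved window
unknown_node = node_for_address(srat, 64 * GIB)  # not described by SRAT
```

The last lookup is the case that matters for virtio-mem or CXL type 3
memory: an address that no SRAT range covers resolves to no node, which is
why either extra nodes created at boot or dynamic node creation is needed.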