Date: Mon, 25 Sep 2023 14:54:40 +0100
To: Ankit Agrawal
CC: Jason Gunthorpe, "alex.williamson@redhat.com", "clg@redhat.com",
 "shannon.zhaosl@gmail.com", "peter.maydell@linaro.org", "ani@anisinha.ca",
 Aniket Agashe, Neo Jia, Kirti Wankhede, "Tarun Gupta (SW-GPU)", Vikram Sethi,
 "Andy Currid", "qemu-arm@nongnu.org", "qemu-devel@nongnu.org"
Subject: Re: [PATCH v1 3/4] hw/arm/virt-acpi-build: patch guest SRAT for NUMA nodes
Message-ID: <20230925145440.00005072@Huawei.com>
In-Reply-To:
References: <20230915024559.6565-1-ankita@nvidia.com>
 <20230915024559.6565-4-ankita@nvidia.com>
 <20230915153740.00006185@Huawei.com>
Organization: Huawei Technologies Research and Development (UK) Ltd.
Reply-to: Jonathan Cameron
From: Jonathan Cameron via

On Fri, 22 Sep 2023 05:49:46 +0000
Ankit Agrawal wrote:

> Hi Jonathan

Hi Ankit,

>
> > > +        if (pcidev->pdev.has_coherent_memory) {
> > > +            uint64_t start_node = object_property_get_uint(obj,
> > > +                                  "dev_mem_pxm_start", &error_abort);
> > > +            uint64_t node_count = object_property_get_uint(obj,
> > > +                                  "dev_mem_pxm_count", &error_abort);
> > > +            uint64_t node_index;
> > > +
> > > +            /*
> > > +             * Add the node_count PXM domains starting from start_node as
> > > +             * hot pluggable. The VM kernel parse the PXM domains and
> > > +             * creates NUMA nodes.
> > > +             */
> > > +            for (node_index = 0; node_index < node_count; node_index++)
> > > +                build_srat_memory(table_data, 0, 0, start_node + node_index,
> > > +                                  MEM_AFFINITY_ENABLED |
> > > +                                  MEM_AFFINITY_HOTPLUGGABLE);
> >
> > 0 size SRAT entries for memory? That's not valid.
>
> Can you explain in what sense are these invalid? The Linux kernel accepts
> such setting and I had tested it.

The ACPI specification doesn't define any means of 'updating' the memory range,
so whilst I guess they are not specifically disallowed, without a spec
definition of what it means this is walking into a minefield. In particular the
description of the hot pluggable bit worries me:
"The system hardware supports hot-add and hot-remove of this memory region."
So I think your definition is calling out that you can hot plug memory into
a region of zero size. To me that's nonsensical, so a paranoid OS writer might
just spit out a firmware error message and refuse to boot. There is no
guarantee other operating systems won't blow up if they see one of these.

To be able to do this safely I think you probably need an ACPI spec update
to say what such a zero length, zero base region means.

Possibly the ASWG folk would say this is fine and I'm reading too much into
the spec, but I'd definitely suggest asking them via the appropriate path,
or throwing in a code first proposal for a comment on this special case and
seeing what response you get - my guess is it will be 'fix Linux' :(

>
> > Seems like you've run into the same issue CXL has with dynamic addition of
> > nodes to the kernel and all you want to do here is make sure it thinks there are
> > enough nodes so initializes various structures large enough.
> >
> Yes, exactly.
>
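
To make the "paranoid OS writer" concern above concrete, here is a minimal sketch of the kind of strict check a guest could apply to each SRAT memory affinity entry. It is an illustration only: the struct srat_mem_entry layout and the srat_mem_entry_accepted() policy are hypothetical and not taken from Linux, QEMU, or any other OS. Only the two flag bit positions (Enabled is bit 0, Hot Pluggable is bit 1) follow the SRAT Memory Affinity Structure definition.

```c
#include <stdbool.h>
#include <stdint.h>

/* Flag bits of the SRAT Memory Affinity Structure. */
#define SRAT_MEM_ENABLED       (1u << 0)
#define SRAT_MEM_HOT_PLUGGABLE (1u << 1)

/* Hypothetical, simplified view of one memory affinity entry. */
struct srat_mem_entry {
    uint32_t proximity_domain;
    uint64_t base;
    uint64_t length;
    uint32_t flags;
};

/*
 * Hypothetical strict policy: an enabled entry must describe a
 * non-empty range that does not wrap around the address space.
 * The spec does not say what a zero-length region means, so a
 * cautious implementation may treat it as a firmware bug.
 */
static bool srat_mem_entry_accepted(const struct srat_mem_entry *e)
{
    if (!(e->flags & SRAT_MEM_ENABLED)) {
        return true;    /* disabled entries are ignored, nothing to check */
    }
    if (e->length == 0) {
        return false;   /* zero-size region: meaning is undefined */
    }
    if (e->base + e->length < e->base) {
        return false;   /* base + length overflows 64 bits */
    }
    return true;
}
```

Under this assumed policy, the entries produced by build_srat_memory(table_data, 0, 0, ...) in the quoted patch would be rejected, even though Linux currently accepts them.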