From: Vishal Verma
Sender: "Qemu-devel"
Subject: [PATCH v2 2/3] hw/acpi-build: account for NVDIMM numa nodes in SRAT
Date: Wed, 27 May 2020 23:48:06 -0600
Message-Id: <20200528054807.21278-3-vishal.l.verma@intel.com>
In-Reply-To: <20200528054807.21278-1-vishal.l.verma@intel.com>
References: <20200528054807.21278-1-vishal.l.verma@intel.com>
Cc: Laurent Vivier, Thomas Huth, Xiao Guangrong, "Michael S. Tsirkin",
    jingqi.liu@intel.com, Dave Hansen, Eduardo Habkost, Vishal Verma,
    Paolo Bonzini, Igor Mammedov, Dan Williams, Richard Henderson

NVDIMMs can belong to their own proximity domains, as described by the NFIT.
In such cases, the SRAT needs Memory Affinity structures for these
NVDIMMs; otherwise Linux doesn't populate node data structures properly
during NUMA initialization. See the following for an example failure
case.

https://lore.kernel.org/linux-nvdimm/20200416225438.15208-1-vishal.l.verma@intel.com/

Fix this by adding device address range and node information from
NVDIMMs to the SRAT in build_srat().

The relevant command line options to exercise this are below. Nodes 0-1
contain CPUs and regular memory, and nodes 2-3 are the NVDIMM address
space.

  -numa node,nodeid=0,mem=2048M,
  -numa node,nodeid=1,mem=2048M,
  -numa node,nodeid=2,mem=0,
  -object memory-backend-file,id=nvmem0,share,mem-path=nvdimm-0,size=16384M,align=128M
  -device nvdimm,memdev=nvmem0,id=nv0,label-size=2M,node=2
  -numa node,nodeid=3,mem=0,
  -object memory-backend-file,id=nvmem1,share,mem-path=nvdimm-1,size=16384M,align=128M
  -device nvdimm,memdev=nvmem1,id=nv1,label-size=2M,node=3

Cc: Jingqi Liu
Cc: Michael S. Tsirkin
Reviewed-by: Jingqi Liu
Signed-off-by: Vishal Verma
---
 hw/acpi/nvdimm.c        | 26 ++++++++++++++++++++++++++
 hw/i386/acpi-build.c    | 10 ++++++++++
 include/hw/mem/nvdimm.h |  1 +
 3 files changed, 37 insertions(+)

diff --git a/hw/acpi/nvdimm.c b/hw/acpi/nvdimm.c
index 9316d12b70..d322c6a7a7 100644
--- a/hw/acpi/nvdimm.c
+++ b/hw/acpi/nvdimm.c
@@ -28,6 +28,7 @@
 
 #include "qemu/osdep.h"
 #include "qemu/uuid.h"
+#include "qapi/error.h"
 #include "hw/acpi/acpi.h"
 #include "hw/acpi/aml-build.h"
 #include "hw/acpi/bios-linker-loader.h"
@@ -1334,6 +1335,31 @@ static void nvdimm_build_ssdt(GArray *table_offsets, GArray *table_data,
     free_aml_allocator();
 }
 
+void *nvdimm_build_srat(GArray *table_data)
+{
+    AcpiSratMemoryAffinity *numamem = NULL;
+    GSList *device_list, *list = nvdimm_get_device_list();
+
+    for (device_list = list; device_list; device_list = device_list->next) {
+        DeviceState *dev = device_list->data;
+        uint64_t addr, size;
+        int node;
+
+        node = object_property_get_int(OBJECT(dev), PC_DIMM_NODE_PROP,
+                                       &error_abort);
+        addr = object_property_get_uint(OBJECT(dev), PC_DIMM_ADDR_PROP,
+                                        &error_abort);
+        size = object_property_get_uint(OBJECT(dev), PC_DIMM_SIZE_PROP,
+                                        &error_abort);
+
+        numamem = acpi_data_push(table_data, sizeof *numamem);
+        build_srat_memory(numamem, addr, size, node,
+                          MEM_AFFINITY_ENABLED | MEM_AFFINITY_NON_VOLATILE);
+    }
+    g_slist_free(list);
+    return numamem;
+}
+
 void nvdimm_build_acpi(GArray *table_offsets, GArray *table_data,
                        BIOSLinker *linker, NVDIMMState *state,
                        uint32_t ram_slots)
diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index 2e15f6848e..1461d8a718 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -2428,6 +2428,16 @@ build_srat(GArray *table_data, BIOSLinker *linker, MachineState *machine)
                               MEM_AFFINITY_ENABLED);
         }
     }
+
+    if (machine->nvdimms_state->is_enabled) {
+        void *ret;
+
+        ret = nvdimm_build_srat(table_data);
+        if (ret != NULL) {
+            numamem = ret;
+        }
+    }
+
     slots = (table_data->len - numa_start) / sizeof *numamem;
     for (; slots < pcms->numa_nodes + 2; slots++) {
         numamem = acpi_data_push(table_data, sizeof *numamem);
diff --git a/include/hw/mem/nvdimm.h b/include/hw/mem/nvdimm.h
index a3c08955e8..fbe56509b8 100644
--- a/include/hw/mem/nvdimm.h
+++ b/include/hw/mem/nvdimm.h
@@ -155,6 +155,7 @@ typedef struct NVDIMMState NVDIMMState;
 void nvdimm_init_acpi_state(NVDIMMState *state, MemoryRegion *io,
                             struct AcpiGenericAddress dsm_io,
                             FWCfgState *fw_cfg, Object *owner);
+void *nvdimm_build_srat(GArray *table_data);
 void nvdimm_build_acpi(GArray *table_offsets, GArray *table_data,
                        BIOSLinker *linker, NVDIMMState *state,
                        uint32_t ram_slots);
-- 
2.26.2
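
For readers unfamiliar with the table layout, here is a minimal standalone sketch of what each NVDIMM entry added to the SRAT looks like on the wire. It is not QEMU code: struct nvdimm_range, put_le(), srat_mem_affinity_encode() and srat_append_nvdimms() are hypothetical names invented for this illustration, and the sketch writes into a plain byte buffer instead of using acpi_data_push() and build_srat_memory(). The field offsets and flag bits follow the ACPI Memory Affinity structure (type 1, 40 bytes, little-endian).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* ACPI SRAT Memory Affinity structure: type 1, always 40 bytes long. */
enum { SRAT_MEM_AFFINITY_TYPE = 1, SRAT_MEM_AFFINITY_LEN = 40 };

/* Flag bits of the Memory Affinity structure. */
enum {
    SRAT_MEM_ENABLED       = 1u << 0,
    SRAT_MEM_HOT_PLUGGABLE = 1u << 1,
    SRAT_MEM_NON_VOLATILE  = 1u << 2,
};

/* Hypothetical description of one NVDIMM: address range plus NUMA node. */
struct nvdimm_range {
    uint64_t addr;   /* guest-physical base address */
    uint64_t size;   /* length of the range in bytes */
    uint32_t node;   /* proximity domain (NUMA node id) */
};

/* Store the low nbytes of value at dst in little-endian byte order. */
void put_le(uint8_t *dst, uint64_t value, size_t nbytes)
{
    for (size_t i = 0; i < nbytes; i++) {
        dst[i] = (uint8_t)(value >> (8 * i));
    }
}

/* Encode one Memory Affinity structure; reserved fields stay zero. */
void srat_mem_affinity_encode(uint8_t out[SRAT_MEM_AFFINITY_LEN],
                              const struct nvdimm_range *r, uint32_t flags)
{
    memset(out, 0, SRAT_MEM_AFFINITY_LEN);
    out[0] = SRAT_MEM_AFFINITY_TYPE;
    out[1] = SRAT_MEM_AFFINITY_LEN;
    put_le(out + 2, r->node, 4);    /* proximity domain */
    put_le(out + 8, r->addr, 8);    /* base address, low then high dword */
    put_le(out + 16, r->size, 8);   /* length, low then high dword */
    put_le(out + 28, flags, 4);     /* flags */
}

/*
 * Append one enabled, non-volatile entry per NVDIMM.
 * Returns the number of bytes written, or 0 if buf is too small.
 */
size_t srat_append_nvdimms(uint8_t *buf, size_t buf_len,
                           const struct nvdimm_range *dimms, size_t n)
{
    if (n > buf_len / SRAT_MEM_AFFINITY_LEN) {
        return 0;
    }
    for (size_t i = 0; i < n; i++) {
        srat_mem_affinity_encode(buf + i * SRAT_MEM_AFFINITY_LEN, &dimms[i],
                                 SRAT_MEM_ENABLED | SRAT_MEM_NON_VOLATILE);
    }
    return n * SRAT_MEM_AFFINITY_LEN;
}
```

With the two-NVDIMM command line from the commit message, a caller would pass two ranges with node set to 2 and 3 and get 80 bytes back. The guest-physical addresses depend on where the devices end up being mapped, so any values used with this sketch are placeholders.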