Date: Wed, 7 Jul 2021 19:16:15 +0900
From: AKASHI Takahiro
To: François Ozog
Cc: Heinrich Schuchardt, Ilias Apalodimas, Tuomas Tynkkynen, U-Boot Mailing List
Subject: Re: QEMU NUMA and U-Boot
Message-ID: <20210707101615.GA49079@laputa>
References: <20210707014435.GA24369@laputa> <611184A1-9948-4377-94A2-89ACBBEB649B@gmx.de> <9D4C1597-9A51-40A7-9937-EA3E0163991E@gmx.de>

On Wed, Jul 07, 2021 at 11:37:19AM +0200, François Ozog wrote:
> On Wed, 7 Jul 2021 at 09:40, François Ozog wrote:
>
> > On Wed, 7 Jul 2021 at 05:59, Heinrich Schuchardt wrote:
> >
> > > On 7 July 2021 05:18:20 CEST, Heinrich Schuchardt <xypron.glpk@gmx.de> wrote:
> > > >On 7 July 2021 03:44:35 CEST, AKASHI Takahiro wrote:
> > > >>François,
> > > >>
> > > >>On Tue, Jul 06, 2021 at 08:10:08PM +0200, Heinrich Schuchardt wrote:
> > > >>> On 7/6/21 6:13 PM, François Ozog wrote:
> > > >>> > Hi Heinrich,
> > > >>> >
> > > >>> > U-Boot 2021-07rc5 does not take the memory description into
> > > >>> > account to adapt the memory map (kernel_addr_r...) when using
> > > >>> > a QEMU 5.2 NUMA configuration:
> > > >>> >
> > > >>> > -smp 4 \
> > > >>> > -m 8G,slots=2,maxmem=16G \
> > > >>> > -object memory-backend-ram,size=4G,id=m0 \
> > > >>> > -object memory-backend-ram,size=4G,id=m1 \
> > > >>> > -numa node,cpus=0-1,nodeid=0,memdev=m0 \
> > > >>> > -numa node,cpus=2-3,nodeid=1,memdev=m1
> > > >>> >
> > > >>> > kernel_addr_r is still 0x40400000 and thus you can't use it to
> > > >>> > bootefi.
> > > >>> >
> > > >>> > fdt addr 0x13ede6de0; fdt print
> > > >>> >
> > > >>> > Displays fdt while I think it should not.
> > > >>> >
> > > >>> > If I load the kernel at dram.start, the load works but the boot
> > > >>> > does not:
> > > >>> >
> > > >>> > U-Boot 2021.07 (Jul 06 2021 - 13:26:43 +0000)
> > > >>> >
> > > >>> > DRAM:  4 GiB
> > > >>> > Flash: 64 MiB
> > > >>> > Loading Environment from Flash... OK
> > > >>> > In:    pl011@9000000
> > > >>> > Out:   pl011@9000000
> > > >>> > Err:   pl011@9000000
> > > >>> > Net:   eth0: virtio-net#32
> > > >>> > Hit any key to stop autoboot:  0
> > > >>> > =>
> > > >>> > => bdinfo
> > > >>> > boot_params = 0x0000000000000000
> > > >>> > DRAM bank   = 0x0000000000000000
> > > >>> > -> start    = 0x0000000140000000
> > > >>> > -> size     = 0x0000000100000000
> > > >>> > flashstart  = 0x0000000000000000
> > > >>> > flashsize   = 0x0000000004000000
> > > >>> > flashoffset = 0x00000000000bc990
> > > >>> > baudrate    = 115200 bps
> > > >>> > relocaddr   = 0x000000013ff27000
> > > >>> > reloc off   = 0x000000013ff27000
> > > >>> > Build       = 64-bit
> > > >>> > current eth = virtio-net#32
> > > >>> > ethaddr     = 52:52:52:52:52:52
> > > >>> > IP addr     =
> > > >>> > fdt_blob    = 0x000000013ede6de0
> > > >>> > new_fdt     = 0x000000013ede6de0
> > > >>> > fdt_size    = 0x0000000000100000
> > > >>> > lmb_dump_all:
> > > >>> >     memory.cnt           = 0x1
> > > >>> >     memory.reg[0x0].base = 0x140000000
> > > >>> >                    .size = 0x100000000
> > > >>> >
> > > >>> >     reserved.cnt         = 0x0
> > > >>> > arch_number = 0x0000000000000000
> > > >>> > TLB addr    = 0x000000013fff0000
> > > >>> > irq_sp      = 0x000000013ede6dd0
> > > >>> > sp start    = 0x000000013ede6dd0
> > > >>> > Early malloc usage: 3a8 / 2000
> > > >>> > => load virtio 0:1 0x140000000 /oskit.efi
> > > >>> > 853424 bytes read in 1 ms (813.9 MiB/s)
> > > >>> > => bootefi 0x140000000 0x13ede6dd0
> > > >>> > ERROR: Failed to register WaitForKey event
> > > >>> > Setting OsIndications failed
> > > >>> > Error: Cannot initialize UEFI sub-system, r = 9
> > > >>> >
> > > >>> > I think there is a need to calculate the memory map based on
> > > >>> > information (DT or blob_list) from the previous firmware (TFA;
> > > >>> > QEMU can be considered as previous firmware).
> > > >>> >
> > > >>> > What do you think?
> > > >>> >
> > > >>> > Cheers
> > > >>> >
> > > >>> > FF
> > > >>>
> > > >>> The kernel load address is hard coded here:
> > > >>> include/configs/qemu-arm.h:41: "kernel_addr_r=0x40400000\0" \
> > > >>>
> > > >>> bdinfo shows:
> > > >>> DRAM start = 0x140000000
> > > >>> DRAM size = 0x100000000
> > > >>>
> > > >>> fdt addr $fdt_addr
> > > >>> fdt print
> > > >>>
> > > >>> shows two memory areas. One at 40000000, one at 140000000.
> > > >>
> > > >>(This shows that U-Boot receives a correct memory map via dtb.)
> > > >>
> > > >>This is a NUMA machine, isn't it? Why should we care which
> > > >>memory region is used here? Please note that this is a virtual
> > > >>machine; there is no practical difference between the two regions.
> > > >>
> > > >>The root problem is that U-Boot did not recognize there were two
> > > >>memory regions.
> > > >>We can fix this issue in either of two ways:
> > > >>
> > > >>1)
> > > >>diff --git a/configs/qemu_arm64_defconfig b/configs/qemu_arm64_defconfig
> > > >>index f6e586627a8e..b70ffae8bf6e 100644
> > > >>--- a/configs/qemu_arm64_defconfig
> > > >>+++ b/configs/qemu_arm64_defconfig
> > > >>@@ -1,7 +1,7 @@
> > > >> CONFIG_ARM=y
> > > >> CONFIG_POSITION_INDEPENDENT=y
> > > >> CONFIG_ARCH_QEMU=y
> > > >>-CONFIG_NR_DRAM_BANKS=1
> > > >>+CONFIG_NR_DRAM_BANKS=2
> > > >> CONFIG_ENV_SIZE=0x40000
> > > >> CONFIG_ENV_SECT_SIZE=0x40000
> > > >> CONFIG_AHCI=y
> > > >>
> > > >>2)
> > > >>diff --git a/lib/fdtdec.c b/lib/fdtdec.c
> > > >>index 4b097fb588ed..4067ea2dead6 100644
> > > >>--- a/lib/fdtdec.c
> > > >>+++ b/lib/fdtdec.c
> > > >>@@ -1111,7 +1111,7 @@ int fdtdec_setup_memory_banksize(void)
> > > >> 		return -EINVAL;
> > > >> 	}
> > > >>
> > > >>-	for (bank = 0; bank < CONFIG_NR_DRAM_BANKS; bank++) {
> > > >>+	for (bank = 0; ; bank++) {
> > > >> 		ret = ofnode_read_resource(mem, reg++, &res);
> > > >> 		if (ret < 0) {
> > > >> 			reg = 0;
> > > >>
> > > >> (fdtdec_setup_memory_banksize() is called in dram_init_banksize().)
> > > >>
> > > >>(2) seems much better, but I don't know why we had to use
> > > >>CONFIG_NR_DRAM_BANKS here.
> > > >>
> >
> > 2) alone does not work as other places in the code refer to
> > CONFIG_NR_DRAM_BANKS.
> > Setting ...BANKS to 32 makes my code work and
> > bdinfo now seems correct:
> >
> > => bdinfo
> > boot_params = 0x0000000000000000
> > DRAM bank   = 0x0000000000000000
> > -> start    = 0x0000000140000000
> > -> size     = 0x0000000100000000
> > DRAM bank   = 0x0000000000000001
> > -> start    = 0x0000000040000000
> > -> size     = 0x0000000100000000
> > flashstart  = 0x0000000000000000
> > flashsize   = 0x0000000004000000
> > flashoffset = 0x00000000000bcb88
> > baudrate    = 115200 bps
> > relocaddr   = 0x000000013ff27000
> > reloc off   = 0x000000013ff27000
> > Build       = 64-bit
> > current eth = virtio-net#32
> > ethaddr     = 52:52:52:52:52:52
> > IP addr     =
> > fdt_blob    = 0x000000013ede6cf0
> > new_fdt     = 0x000000013ede6cf0
> > fdt_size    = 0x0000000000100000
> > lmb_dump_all:
> >     memory.cnt             = 0x1
> >     memory.reg[0x0].base   = 0x40000000
> >                    .size   = 0x200000000
> >     reserved.cnt           = 0x1
> >     reserved.reg[0x0].base = 0x13ede58f0
> >                      .size = 0x121a710
> > arch_number = 0x0000000000000000
> > TLB addr    = 0x000000013fff0000
> > irq_sp      = 0x000000013ede6ce0
> > sp start    = 0x000000013ede6ce0
> > Early malloc usage: 3a8 / 2000
> >
> > May I suggest you propose a combined patch, Akashi-san? If we assume
> > NUMA systems are to be tested with up to 8 nodes, to mimic real existing
> > enterprise hardware, and up to 4 memory slots (say for memory hot
> > plugging tests), what about a default value of 32? Alternatively, we
> > could set this value to a much higher one if the costs are negligible.
> >
>
> Well, let's not rush as there are other twists:
>
> the 4G bank in node 1 is marked BootServicesData in the UEFI GetMemoryMap
> output, which I assume should not be the case. EDK2 reports it as
> ConventionalMemory.
>
> The root cause seems to be gd->ramtop not being set up properly.
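
A side note on the lmb_dump_all output quoted above: memory.cnt = 0x1 with
base 0x40000000 and size 0x200000000 is what you would expect if the two
4 GiB banks are treated as one region, since the bank at 0x40000000 ends
exactly where the bank at 0x140000000 starts
(0x40000000 + 0x100000000 = 0x140000000). The sketch below is a simplified
model of such a merge in plain C. It is not U-Boot's lmb code, and
struct region and add_region() are names made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical region descriptor, for illustration only. */
struct region {
	uint64_t base;
	uint64_t size;
};

/*
 * Add [base, base + size) to the list and return the new region count.
 * A region that directly touches an existing one is merged into it.
 * Simplifications: overlapping regions are not handled, and a region
 * that bridges two existing entries is merged with the first one only.
 * If the list is full, the region is silently dropped.
 */
size_t add_region(struct region *list, size_t cnt, size_t max,
		  uint64_t base, uint64_t size)
{
	assert(list != NULL);

	for (size_t i = 0; i < cnt; i++) {
		if (base + size == list[i].base) {
			/* New region sits directly below an existing one. */
			list[i].base = base;
			list[i].size += size;
			return cnt;
		}
		if (list[i].base + list[i].size == base) {
			/* New region sits directly above an existing one. */
			list[i].size += size;
			return cnt;
		}
	}

	if (cnt < max) {
		list[cnt].base = base;
		list[cnt].size = size;
		cnt++;
	}
	return cnt;
}
```

Adding the two banks from bdinfo in the order shown there (0x140000000
first, then 0x40000000) leaves a single entry with base 0x40000000 and
size 0x200000000, so the NUMA node boundary is no longer visible at
this level.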
>
> Further analysis shows that the DT passed to the booted EFI payload does
> not seem to be correct:
>
> DT fragment passed to U-Boot:
>
> memory@140000000 {
>     numa-node-id = <0x00000001>;
>     reg = <0x00000001 0x40000000 0x00000001 0x00000000>;
>     device_type = "memory";
> };
> memory@40000000 {
>     numa-node-id = <0x00000000>;
>     reg = <0x00000000 0x40000000 0x00000001 0x00000000>;
>     device_type = "memory";
> };
>
> DT passed to payload (as per my debug code):
>
> memory@140000000: memory
>     numa-node-id 1
>     reg (len= 32)
>         140000000 100000000
>         40000000 100000000
> memory@40000000: memory
>     numa-node-id 0
>     reg (len= 16)
>         40000000 100000000
>
> I am investigating this further...

You should check the logic of fdt_fixup_memory_banks(), which is called
this way:
    efi_dt_fixup()
        image_setup_libfdt()
            arch_fixup_fdt()
                fdt_fixup_memory_banks()

What it does is put *all* the memory regions unconditionally, as a single
"reg" array, into the *first-detected* "memory" node, which is
"memory@140000000" in this case.
It means that this function doesn't respect the NUMA configuration.

-Takahiro Akashi

> > > >>In this case, other occurrences of CONFIG_NR_DRAM_BANKS in this file
> > > >>should be replaced with a variable for it.
> > > >>
> > > >>> Your use case is well beyond the typical U-Boot usage. So I guess it
> > > >>> will be up to Linaro to provide the necessary patches:
> > > >>>
> > > >>> * determine the active CPU
> > > >>> * determine the RAM assigned to the active CPU according
> > > >>>   to the numa-node-id in the device-tree
> > > >>> * make sure that U-Boot only uses the memory of the active CPU
> > > >>>   internally
> > > >>> * make sure that the UEFI memory map contains a compliant
> > > >>>   description
> > > >>> * possibly, dynamically set up the environment variables
> > > >>>
> > > >>> +CC Tuomas Tynkkynen (maintainer for qemu_arm64_defconfig)
> > > >>
> > > >>For (1), we'd better have a different config, or increase
> > > >>the value of CONFIG_NR_DRAM_BANKS to a bigger number?
> > > >
> > > >Is the system configured such that each CPU can access the other
> > > >CPUs' RAM when entering U-Boot?
> > > >
> > > >Best regards
> > > >
> > > >Heinrich
> > > >
> > >
> > > At least the comments for this patch sound as if, on a physical system,
> > > cross NUMA node memory access is only available after full SMP
> > > initialization:
> > >
> > > https://patchwork.kernel.org/project/linux-acpi/patch/20180625130552.5636-1-lorenzo.pieralisi@arm.com/
> > >
> > > QEMU may be less restrictive.
> > >
> > > QEMU allows the node distance to be 255, indicating that cross node
> > > access is infeasible.
> > >
> > > Best regards
> > >
> > > Heinrich
> > >
> > > >>
> > > >>-Takahiro Akashi
> > > >>
> > > >>
> > > >>> Best regards
> > > >>>
> > > >>> Heinrich
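
To illustrate the fdt_fixup_memory_banks() behaviour described above, here
is a minimal sketch of how every bank ends up in one "reg" value. It assumes
two address cells and two size cells per entry, which matches the 64-bit
values in the DT fragment quoted earlier. It is plain C, not the U-Boot
function itself, and struct bank, put_be64() and pack_reg() are hypothetical
names used only for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bank descriptor, for illustration only. */
struct bank {
	uint64_t start;
	uint64_t size;
};

/* Store a 64-bit value big-endian, the byte order used in device trees. */
static void put_be64(uint8_t *p, uint64_t v)
{
	for (int i = 7; i >= 0; i--) {
		p[i] = (uint8_t)(v & 0xff);
		v >>= 8;
	}
}

/*
 * Pack all banks into one "reg" value: 8 bytes of address followed by
 * 8 bytes of size per bank (assumes #address-cells = 2, #size-cells = 2).
 * Returns the length of the value in bytes. Banks that do not fit into
 * the buffer are left out.
 */
size_t pack_reg(uint8_t *buf, size_t buflen,
		const struct bank *banks, size_t nbanks)
{
	size_t len = 0;

	assert(buf != NULL && banks != NULL);

	for (size_t i = 0; i < nbanks; i++) {
		if (len + 16 > buflen)
			break;
		put_be64(buf + len, banks[i].start);
		put_be64(buf + len + 8, banks[i].size);
		len += 16;
	}
	return len;
}
```

With the two banks reported by bdinfo, pack_reg() returns 32, which is the
"reg (len= 32)" seen in memory@140000000 in the debug dump. A NUMA-aware
fixup would presumably write one 16-byte entry into each "memory" node
instead.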