Date: Fri, 2 Feb 2024 09:29:08 -0600
From: Rob Herring
To: Oreoluwa Babatunde
Cc: catalin.marinas@arm.com, will@kernel.org, frowand.list@gmail.com,
	vgupta@kernel.org, arnd@arndb.de, olof@lixom.net, soc@kernel.org,
	guoren@kernel.org, monstr@monstr.eu, palmer@dabbelt.com,
	aou@eecs.berkeley.edu, dinguyen@kernel.org, chenhuacai@kernel.org,
	tsbogend@alpha.franken.de, jonas@southpole.se,
	stefan.kristiansson@saunalahti.fi, shorne@gmail.com,
	mpe@ellerman.id.au, ysato@users.sourceforge.jp, dalias@libc.org,
	glaubitz@physik.fu-berlin.de, richard@nod.at,
	anton.ivanov@cambridgegreys.com, johannes@sipsolutions.net,
	chris@zankel.net, jcmvbkbc@gmail.com,
	linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org,
	devicetree@vger.kernel.org, linux-arm-msm@vger.kernel.org,
	kernel@quicinc.com
Subject: Re: [PATCH 00/46] Dynamic allocation of reserved_mem array.
Message-ID: <20240202152908.GA4045321-robh@kernel.org>
References: <20240126235425.12233-1-quic_obabatun@quicinc.com>
 <20240131000710.GA2581425-robh@kernel.org>
 <51dc64bb-3101-4b4a-a54f-c0df6c0b264c@quicinc.com>
 <20240201194653.GA1328565-robh@kernel.org>

On Thu, Feb 01, 2024 at 01:10:18PM -0800, Oreoluwa Babatunde wrote:
>
> On 2/1/2024 11:46 AM, Rob Herring wrote:
> > On Thu, Feb 01, 2024 at 09:08:06AM -0800, Oreoluwa Babatunde wrote:
> >> On 1/30/2024 4:07 PM, Rob Herring wrote:
> >>> On Fri, Jan 26, 2024 at 03:53:39PM -0800, Oreoluwa Babatunde wrote:
> >>>> The reserved_mem array is used to store data for the different
> >>>> reserved memory regions defined in the DT of a device. The array
> >>>> stores information such as the region name, node, start address,
> >>>> and size of each reserved memory region.
> >>>>
> >>>> The array is currently statically allocated with a size of
> >>>> MAX_RESERVED_REGIONS (64). This means that any system that
> >>>> specifies more reserved memory regions than MAX_RESERVED_REGIONS
> >>>> (64) will not have enough space to store the information for all
> >>>> of them.
> >>>>
> >>>> Therefore, this series extends the use of the static reserved_mem
> >>>> array and introduces an array that is dynamically allocated with
> >>>> memblock_alloc(), sized by the number of reserved memory regions
> >>>> specified in the DT.
> >>>>
> >>>> Some architectures, such as arm64, require the page tables to be
> >>>> set up before memblock-allocated memory is writable. Therefore, on
> >>>> these architectures the dynamic allocation of the reserved_mem
> >>>> array needs to be done after the page tables have been set up. In
> >>>> most cases that will be after paging_init().
> >>>>
> >>>> Reserved memory regions can be divided into 2 groups:
> >>>> i) Statically-placed reserved memory regions,
> >>>> i.e. regions defined in the DT using the "reg" property.
> >>>> ii) Dynamically-placed reserved memory regions,
> >>>> i.e. regions specified in the DT using the "alloc-ranges"
> >>>> and "size" properties.
> >>>>
> >>>> It is possible to call memblock_reserve() and memblock_mark_nomap()
> >>>> on the statically-placed reserved memory regions and defer saving
> >>>> them to the reserved_mem array until memory has been allocated for
> >>>> the array using memblock, which will be after the page tables have
> >>>> been set up. For the dynamically-placed reserved memory regions it
> >>>> is not possible to wait, because their starting addresses are only
> >>>> allocated at run time and hence need to be stored somewhere as
> >>>> soon as they are allocated.
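(To make the two groups concrete, here is a minimal sketch of an early
flat-DT scan that classifies a /reserved-memory child node by the
properties present. The property names are the real bindings;
early_scan_rmem_node() and rmem_dynamic_count are invented for the
example and are not names from the series.)

#include <linux/init.h>
#include <linux/memblock.h>
#include <linux/of_fdt.h>

static int rmem_dynamic_count __initdata;

static int __init early_scan_rmem_node(unsigned long node,
				       const char *uname)
{
	const __be32 *prop;
	int len;

	prop = of_get_flat_dt_prop(node, "reg", &len);
	if (prop) {
		/* Statically placed: base and size are fixed in the DT. */
		u64 base = dt_mem_next_cell(dt_root_addr_cells, &prop);
		u64 size = dt_mem_next_cell(dt_root_size_cells, &prop);

		memblock_reserve(base, size);
		if (of_get_flat_dt_prop(node, "no-map", NULL))
			memblock_mark_nomap(base, size);
		/* Recording this region in reserved_mem[] can wait. */
		return 0;
	}

	if (of_get_flat_dt_prop(node, "size", NULL)) {
		/*
		 * Dynamically placed: the base does not exist until
		 * memblock picks one inside "alloc-ranges", so the
		 * result has to be recorded as soon as it is allocated.
		 */
		rmem_dynamic_count++;
	}
	return 0;
}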
> >>>> Waiting until after the page tables have been set up to allocate
> >>>> memory for the dynamically-placed regions is also not an option,
> >>>> because the allocations would come from memory that has already
> >>>> been added to the page tables, which is not good for memory that
> >>>> is supposed to be reserved and/or marked as nomap.
> >>>>
> >>>> Therefore, this series splits the processing of the reserved
> >>>> memory regions into two stages, of which the first is carried out
> >>>> by early_init_fdt_scan_reserved_mem() and the second by
> >>>> fdt_init_reserved_mem().
> >>>>
> >>>> early_init_fdt_scan_reserved_mem(), which is called before the
> >>>> page tables are set up, is used to:
> >>>> 1. Call memblock_reserve() and memblock_mark_nomap() on all the
> >>>>    statically-placed reserved memory regions as needed.
> >>>> 2. Allocate memory from memblock for the dynamically-placed
> >>>>    reserved memory regions and store them in the static
> >>>>    reserved_mem array. memblock_reserve() and
> >>>>    memblock_mark_nomap() are also called as needed on all the
> >>>>    memory allocated for the dynamically-placed regions.
> >>>> 3. Count the total number of reserved memory regions found in
> >>>>    the DT.
> >>>>
> >>>> fdt_init_reserved_mem(), which should be called after the page
> >>>> tables have been set up, is used to carry out the following:
> >>>> 1. Allocate memory for the reserved_mem array based on the number
> >>>>    of reserved memory regions counted as mentioned above.
> >>>> 2. Copy all the information for the dynamically-placed reserved
> >>>>    memory regions from the static array into the newly allocated
> >>>>    reserved_mem array.
> >>>> 3. Add the information for the statically-placed reserved memory
> >>>>    regions into the reserved_mem array.
> >>>> 4. Run the region-specific init functions for each of the
> >>>>    reserved memory regions saved in the reserved_mem array.
> >>> I don't see the need for fdt_init_reserved_mem() to be explicitly
> >>> called by arch code. I said this already, but that can be done at
> >>> the same time as unflattening the DT. The same conditions are
> >>> needed for both: we need to be able to allocate memory from
> >>> memblock.
> >>>
> >>> To put it another way, if fdt_init_reserved_mem() can be called
> >>> "early", then unflattening could be moved earlier as well. Though
> >>> I don't think we should optimize that. I'd rather see all arches
> >>> call the DT functions at the same stages.
> >> Hi Rob,
> >>
> >> The reason we moved fdt_init_reserved_mem() back into the
> >> arch-specific code is that we realized there was no obvious way to
> >> call early_init_fdt_scan_reserved_mem() and fdt_init_reserved_mem()
> >> in the correct order for all archs if we placed
> >> fdt_init_reserved_mem() inside the unflatten_device_tree() function.
> >>
> >> early_init_fdt_scan_reserved_mem() needs to be called before
> >> fdt_init_reserved_mem(). But on some archs, unflatten_device_tree()
> >> is called before early_init_fdt_scan_reserved_mem(), which means
> >> that if we put fdt_init_reserved_mem() inside the
> >> unflatten_device_tree() function, it will be called before
> >> early_init_fdt_scan_reserved_mem().
> >>
> >> This is connected to your other comments on Patch 7 & Patch 14.
> >>
> >> I agree, unflatten_device_tree() should NOT be getting called
> >> before we reserve memory for the reserved memory regions, because
> >> that could cause memory to be allocated from regions that should
> >> be reserved.
> >>
> >> Hence, resolving this issue should allow us to call
> >> fdt_init_reserved_mem() from the unflatten_device_tree() function
> >> without changing the order we are trying to have.
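(For reference, the second stage quoted above amounts to roughly the
following. memblock_alloc() and struct reserved_mem are real kernel
interfaces; the early_rmem array, the counters, and the function name
are placeholders for this sketch, not the series' actual code.)

#include <linux/kernel.h>
#include <linux/memblock.h>
#include <linux/of_reserved_mem.h>
#include <linux/string.h>

/* Filled in and counted by the stage-one scan (placeholder names). */
extern struct reserved_mem early_rmem[];
extern int rmem_dynamic_count, rmem_total_count;

static struct reserved_mem *reserved_mem;

static void __init alloc_reserved_mem_array(void)
{
	/* 1. Size the array from the stage-one count. */
	reserved_mem = memblock_alloc(rmem_total_count *
				      sizeof(*reserved_mem),
				      sizeof(void *));
	if (!reserved_mem)
		panic("failed to allocate reserved_mem array\n");

	/* 2. Dynamically-placed entries were saved early; copy them. */
	memcpy(reserved_mem, early_rmem,
	       rmem_dynamic_count * sizeof(*reserved_mem));

	/*
	 * 3. Statically-placed entries are then appended by walking the
	 *    DT again, and 4. each region's init callback is run.
	 */
}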
> > There's one issue I've found, which is that unflatten_device_tree()
> > isn't called for the ACPI case on arm64. Turns out we need
> > /reserved-memory handled in that case too. However, I think we're
> > going to change to calling unflatten_device_tree() unconditionally
> > for another reason[1].
> >
> > [1] https://lore.kernel.org/all/efe6a7886c3491cc9c225a903efa2b1e.sboyd@kernel.org/
> >
> >> I will work on implementing this and send another revision.
> > I think we should go with a simpler route that just copies an
> > initial array in initdata to a properly sized, allocated array,
> > like the patch below. Of course it will need some arch fixes and a
> > follow-on patch to increase the initial array size.
> >
> > 8<--------------------------------------------------------------------
> > From: Rob Herring
> > Date: Wed, 31 Jan 2024 16:26:23 -0600
> > Subject: [PATCH] of: reserved-mem: Re-allocate reserved_mem array to
> >  actual size
> >
> > In preparation to increase the static reserved_mem array size yet
> > again, copy the initial array to an allocated array sized based on
> > the actual size needed. Now increasing the size of the static
> > reserved_mem array only eats up initdata space. For platforms with
> > a reasonable number of reserved regions, we have a net gain in free
> > memory.
> >
> > In order to do memblock allocations, fdt_init_reserved_mem() is
> > moved a bit later, to unflatten_device_tree(). On some arches this
> > is effectively a nop.

[...]

> Hi Rob,
>
> One thing that could come up with this is that memory for the
> dynamically-placed reserved memory regions won't be allocated until
> we call fdt_init_reserved_mem() (i.e. reserved memory regions defined
> using the "alloc-ranges" and "size" properties).
>
> Since fdt_init_reserved_mem() is now being called from
> unflatten_device_tree(), the page tables would have been set up on
> most architectures, which means we will be allocating from memory
> that has already been mapped.
>
> Could this be an issue for memory that is supposed to be reserved?

I suppose if the alloc-ranges region is not much bigger than the size
and the kernel already made some allocation that landed in the region,
then the allocation could fail. Not much we can do other than alloc
the reserved regions as soon as possible. Are there cases where that's
not happening?

I suppose the kernel could try and avoid all alloc-ranges until
they've been allocated, but that would have to be best effort. I've
seen optimizations where it's desired to spread buffers across DRAM
banks, so you could have N alloc-ranges for N banks that cover all of
memory.

There's also the issue that if you have more fixed regions than
memblock can handle (128) before it can reallocate its arrays, then
the page tables themselves could be allocated in reserved regions.

> Especially for the regions that are specified as no-map?

'no-map' is a hint, not a guarantee. Arm32 ignores it for regions
within the kernel's linear map (at least it used to). I don't think
anything changes here with it.
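(To make the failure mode above concrete: a dynamically-placed region
must come from inside its alloc-ranges window, and a late allocation
does something roughly like the following. The window, sizes, and
function name are invented for the example; memblock_phys_alloc_range()
is the real memblock interface for range-constrained allocations.)

#include <linux/errno.h>
#include <linux/init.h>
#include <linux/memblock.h>
#include <linux/printk.h>
#include <linux/sizes.h>

/*
 * E.g. a node with size = <SZ_64M> and
 * alloc-ranges = <0x80000000 0x8000000> (a 128M window).
 */
static int __init place_example_region(void)
{
	phys_addr_t base;

	base = memblock_phys_alloc_range(SZ_64M, SZ_2M,
					 0x80000000, 0x88000000);
	if (!base) {
		/*
		 * Earlier kernel allocations (page tables, memblock's
		 * own arrays) may already sit inside the window,
		 * leaving no 64M fit -- hence "alloc the reserved
		 * regions as soon as possible".
		 */
		pr_err("no space left inside alloc-ranges\n");
		return -ENOMEM;
	}
	return 0;
}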
Rob