From mboxrd@z Thu Jan  1 00:00:00 1970
From: David Hildenbrand
Date: Thu, 06 May 2021 19:38:37 +0000
Subject: Re: [RFC PATCH 0/7] Memory hotplug/hotremove at subsection size
Message-Id:
List-Id:
References: <20210506152623.178731-1-zi.yan@sent.com>
 <9D7FD316-988E-4B11-AC1C-64FF790BA79E@nvidia.com>
 <3a51f564-f3d1-c21f-93b5-1b91639523ec@redhat.com>
 <16962E62-7D1E-4E06-B832-EC91F54CC359@nvidia.com>
 <3A6D54CF-76F4-4401-A434-84BEB813A65A@nvidia.com>
 <0e850dcb-c69a-188b-7ab9-09e6644af3ab@redhat.com>
 <20210506193026.GE388843@casper.infradead.org>
In-Reply-To: <20210506193026.GE388843@casper.infradead.org>
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
To: Matthew Wilcox
Cc: Zi Yan, Oscar Salvador, Michael Ellerman, Benjamin Herrenschmidt,
 Thomas Gleixner, x86@kernel.org, Andy Lutomirski,
 "Rafael J . Wysocki", Andrew Morton, Mike Rapoport,
 Anshuman Khandual, Michal Hocko, Dan Williams, Wei Yang,
 linux-ia64@vger.kernel.org, linux-kernel@vger.kernel.org,
 linuxppc-dev@lists.ozlabs.org, linux-mm@kvack.org

On 06.05.21 21:30, Matthew Wilcox wrote:
> On Thu, May 06, 2021 at 09:10:52PM +0200, David Hildenbrand wrote:
>> I have to admit that I am not really a friend of that. I still think our
>> target goal should be to have gigantic THP *in addition to* ordinary THP.
>> Use gigantic THP where enabled and possible, and just use ordinary THP
>> everywhere else. Having one pageblock granularity is a real limitation IMHO
>> and requires us to hack the system to support it to some degree.
>
> You're thinking too small with only two THP sizes ;-) I'm aiming to

Well, I raised in my other mail that we will have multiple different use
cases, including multiple different THP, e.g., on aarch64 ;)

> support arbitrary power-of-two memory allocations.
> I think there's a
> fruitful discussion to be had about how that works for anonymous memory --
> with page cache, we have readahead to tell us when our predictions of use
> are actually fulfilled. It doesn't tell us what percentage of the pages

Right, and I think we have to think about a better approach than just
increasing the pageblock_order.

> allocated were actually used, but it's a hint. It's a big lift to go from
> 2MB all the way to 1GB ... if you can look back to see that the previous
> 1GB was basically fully populated, then maybe jump up from allocating
> 2MB folios to allocating a 1GB folio, but wow, that's a big step.
>
> This goal really does mean that we want to allocate from the page
> allocator, and so we do want to grow MAX_ORDER. I suppose we could
> do something ugly like
>
> 	if (order <= MAX_ORDER)
> 		alloc_page()
> 	else
> 		alloc_really_big_page()
>
> but that feels like unnecessary hardship to place on the user.

I had something similar for the short term in mind, relying on
alloc_contig_pages() (and maybe ZONE_MOVABLE to make allocations more
likely to succeed). Devil's in the details (page migration, ...).

-- 
Thanks,

David / dhildenb