From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <027b6ac9-836d-4f89-a819-e24d487f9c8e@kernel.org>
Date: Thu, 13 Nov 2025 19:44:31 +0100
Subject: Re: [PATCH v1] mm: fix MAX_FOLIO_ORDER on powerpc configs with hugetlb
To: "David Hildenbrand (Red Hat)", Lorenzo Stoakes
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, linuxppc-dev,
 Sourabh Jain, Andrew Morton, "Ritesh Harjani (IBM)", Madhavan Srinivasan,
 Donet Tom, Michael Ellerman, Nicholas Piggin, "Liam R. Howlett",
 Vlastimil Babka, Mike Rapoport, Suren Baghdasaryan, Michal Hocko
References: <20251112145632.508687-1-david@kernel.org>
 <3fa6d496-b9de-4b66-a7db-247eebec92ca@kernel.org>
From: Christophe Leroy
In-Reply-To: <3fa6d496-b9de-4b66-a7db-247eebec92ca@kernel.org>

On 13/11/2025 at 16:21, David Hildenbrand (Red Hat) wrote:
> On 13.11.25 14:01, Lorenzo Stoakes wrote:
>
> [...]
>
>>> @@ -137,6 +137,7 @@ config PPC
>>>       select ARCH_HAS_DMA_OPS            if PPC64
>>>       select ARCH_HAS_FORTIFY_SOURCE
>>>       select ARCH_HAS_GCOV_PROFILE_ALL
>>> +    select ARCH_HAS_GIGANTIC_PAGE        if ARCH_SUPPORTS_HUGETLBFS
>>
>> Given we know the architecture can support it (presumably all powerpc
>> arches, or at least all that can support hugetlbfs), this seems
>> reasonable.
>
> powerpc allows for quite a few different configs, so I assume there are
> some configs that don't allow ARCH_SUPPORTS_HUGETLBFS.

Yes indeed. For instance, the powerpc 603 and 604 have no huge pages.

>
> [...]
>
>>>   /*
>>>    * There is no real limit on the folio size. We limit them to the
>>> maximum we
>>> - * currently expect (e.g., hugetlb, dax).
>>> + * currently expect: with hugetlb, we expect no folios larger than
>>> 16 GiB.
>>
>> Maybe worth saying 'see CONFIG_HAVE_GIGANTIC_FOLIOS definition' or
>> something?
>
> To me that's implied by the initial ifdef. But no strong opinion
> about spelling that out.
>
>>
>>> + */
>>> +#define MAX_FOLIO_ORDER        get_order(SZ_16G)
>>
>> Hmm, is the base page size somehow runtime-adjustable on powerpc? Why
>> isn't PUD_ORDER good enough here?
>
> We tried P4D_ORDER but even that doesn't work. I think we effectively
> end up with cont-PMD/cont-PUD mappings (or even cont-P4D, I am not 100%
> sure because the folding code complicates that).
>
> See powerpc's variant of huge_pte_alloc(), where we have stuff like
>
> p4d = p4d_offset(pgd_offset(mm, addr), addr);
> if (!mm_pud_folded(mm) && sz >= P4D_SIZE)
>     return (pte_t *)p4d;
>
> As soon as we go to things like P4D_ORDER, we're suddenly in the range
> of 512 GiB on x86 etc., so that's also not what we want as an easy fix
> (and it didn't work).
>

On 32-bit there are only the PGDIR and the page table, so
PGDIR_SHIFT = P4D_SHIFT = PUD_SHIFT = PMD_SHIFT.

For instance, on powerpc 8xx, PGDIR_SIZE is 4M while the largest
hugepage is 8M. So even PGDIR_ORDER isn't enough.

Christophe