From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from smtp.kernel.org (aws-us-west-2-korg-mail-alma10-1.taild15c8.ts.net [100.103.45.18])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by smtp.subspace.kernel.org (Postfix) with ESMTPS id E46B530F55F
	for <linux-kernel@vger.kernel.org>; Mon, 15 Jun 2026 13:35:42 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=100.103.45.18
ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116;
	t=1781530543; cv=none; b=UxVIULtXqM3IdKihg7XI9Ay/9BiE8hPMNadC2rQVJ4UMkTzTQuVIgFGmmeFyv6vB+NsIGkZmeBpyT8SehTjhmik/G8m4XbdfGhVWZbrXiz5g7YYmbyW52pgj4B6bQBPpJYVKJXQosGoDBOfRvywiXRie0g9AMKrrj3B/zXoBBj4=
ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org;
	s=arc-20240116; t=1781530543; c=relaxed/simple;
	bh=GSmlKIozUbCHtNrgfvRbPcoIe1T1PmE+j6Tcm/FKrh0=;
	h=From:To:Cc:Subject:In-Reply-To:References:Date:Message-ID:
	 MIME-Version:Content-Type; b=hl9Bvnlna6Gb7OM8Ug2KGPVL+isObufU8j6LViMCLwt5Cr0gZ3x5INHlzlF97QvAa4Wi+ZhbYNI5FVO/ACvt1xuc4zdIshEAmpKtDEyS4Ny8irIsDel0F37b7aECtFhMOY2dd0terr2Ol7WJfsOV8F6OMbu9psj/WzWq9BRt52E=
ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=BlW1Jx+R; arc=none smtp.client-ip=100.103.45.18
Authentication-Results: smtp.subspace.kernel.org;
	dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="BlW1Jx+R"
Received: by smtp.kernel.org (Postfix) with ESMTPSA id AC8A51F000E9;
	Mon, 15 Jun 2026 13:35:40 +0000 (UTC)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=kernel.org;
	s=k20260515; t=1781530542;
	bh=uUFCxSXn77xBdFdSa9gDApYhBo9jSYuSz5ASxlKKY7Q=;
	h=From:To:Cc:Subject:In-Reply-To:References:Date;
	b=BlW1Jx+RJJPWKOuK2BiPu12UqEMM/0i9SqlIuUO8sktQdXHgZfF5/AVYlbU9km0AC
	 CVXiobcnUzhC7Mqe/gAkNn8f7BFLyztRpIbj7yvyMgp/zy4nxulzodpOsJu8QqWxqU
	 WAInfd+vR+vgM5GyHybLyVd9XvunckQ+3NlxcfzVhYnLuDheQ8+SY4ApjlSAP+1h7M
	 ZuucmK2QuzfQ9TeQnl4iR+657mqzMtBmuflMzNXGfEImVPdA4V7LBFQMdb604EaEvq
	 KdbFlKAnkYuJ5ZtzK74U7MERy+c84vIOOJt/12WkHsKJq5NCndA/hsKA5xJJp+iH0j
	 VSOz1LY8W/F5w==
From: Pratyush Yadav <pratyush@kernel.org>
To: Mike Rapoport <rppt@kernel.org>
Cc: Pratyush Yadav <pratyush@kernel.org>,  Pasha Tatashin
 <pasha.tatashin@soleen.com>,  Alexander Graf <graf@amazon.com>,  Muchun
 Song <muchun.song@linux.dev>,  Oscar Salvador <osalvador@suse.de>,  David
 Hildenbrand <david@kernel.org>,  Andrew Morton
 <akpm@linux-foundation.org>,  Jason Miu <jasonmiu@google.com>,  Jork
 Loeser <jloeser@linux.microsoft.com>,  kexec@lists.infradead.org,
  linux-mm@kvack.org,  linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2 16/18] memblock: make HugeTLB bootmem allocation work
 with KHO
In-Reply-To: <178143855120.2123877.5431342391381982046.b4-review@b4> (Mike
	Rapoport's message of "Sun, 14 Jun 2026 15:02:31 +0300")
References: <20260605183501.3884950-1-pratyush@kernel.org>
	<20260605183501.3884950-17-pratyush@kernel.org>
	<178143855120.2123877.5431342391381982046.b4-review@b4>
Date: Mon, 15 Jun 2026 15:35:39 +0200
Message-ID: <2vxzpl1soris.fsf@kernel.org>
User-Agent: Gnus/5.13 (Gnus v5.13)
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id: <linux-kernel.vger.kernel.org>
List-Subscribe: <mailto:linux-kernel+subscribe@vger.kernel.org>
List-Unsubscribe: <mailto:linux-kernel+unsubscribe@vger.kernel.org>
MIME-Version: 1.0
Content-Type: text/plain

On Sun, Jun 14 2026, Mike Rapoport wrote:

> On Fri, 05 Jun 2026 20:34:49 +0200, Pratyush Yadav <pratyush@kernel.org> wrote:
>> Gigantic huge page allocation is somewhat broken currently when KHO is
>> used.
>> 
>> Firstly, they break KHO scratch size accounting. RSRV_KERN is used to
>> track how much memory is reserved for use by the kernel. Since
>> alloc_bootmem() calls the memblock_alloc*() APIs, the hugepages
>
> hugetlb::alloc_bootmem()

ACK.

>
>> [...]
>> First, it does not use mirrored memory for hugetlb. Mirrored memory is a
>> limited resource that is best saved for kernel data structures, not user
>> memory.
>> 
>> Second, if the memory found overlaps with KHO scratch areas, it discards
>> the memory and retries.
>
> This sentence is somewhat hard to parse.

Okay, let me retry:

    Second, if the free memory area found by memblock_find_in_range_node()
    is a part of a KHO scratch area, the free area is not used. Allocation
    is retried starting after the free area to ensure no hugepages come from
    KHO scratch.

Any better?
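
To make the intended behaviour concrete, here is a rough userspace
model of the bottom-up case. It is only a sketch, not the patch code:
struct range, overlaps_scratch(), find_free() and
alloc_avoiding_scratch() are made-up names, and find_free() is a toy
stand-in for memblock_find_in_range_node() that treats every
size-aligned address as free.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy type; not a memblock structure. */
struct range {
	uint64_t base;
	uint64_t size;
};

/* True if [base, base + size) intersects any of the n scratch ranges. */
static bool overlaps_scratch(const struct range *scratch, size_t n,
			     uint64_t base, uint64_t size)
{
	for (size_t i = 0; i < n; i++) {
		if (base < scratch[i].base + scratch[i].size &&
		    scratch[i].base < base + size)
			return true;
	}
	return false;
}

/*
 * Toy bottom-up search: return the first size-aligned address in
 * [start, end) that fits, or 0 on failure. Everything counts as free.
 */
static uint64_t find_free(uint64_t start, uint64_t end, uint64_t size)
{
	uint64_t addr;

	assert(size != 0);
	addr = (start + size - 1) / size * size;
	if (addr > end || end - addr < size)
		return 0;
	return addr;
}

/* Skip any candidate that touches scratch and search again after it. */
static uint64_t alloc_avoiding_scratch(const struct range *scratch, size_t n,
				       uint64_t start, uint64_t end,
				       uint64_t size)
{
	for (;;) {
		uint64_t addr = find_free(start, end, size);

		if (!addr)
			return 0;	/* nothing left in [start, end) */
		if (!overlaps_scratch(scratch, n, addr, size))
			return addr;
		start = addr + size;	/* bottom-up: retry past the area */
	}
}
```

With one scratch area covering 1G to 2G, a 1G request starting at 1G
skips the scratch area and lands at 2G. If the search range ends at 2G
the request fails instead of falling back to scratch.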

>
>>
>>
>> diff --git a/mm/memblock.c b/mm/memblock.c
>> index 6349c48154f4..131e54dd5d8d 100644
>> --- a/mm/memblock.c
>> +++ b/mm/memblock.c
>> @@ -1756,6 +1761,69 @@ void * __init memblock_alloc_try_nid_raw(
>> [ ... skip 51 lines ... ]
>> +		if (memblock_bottom_up())
>> +			start = addr + size;
>> +		else
>> +			start = addr - size;
>> +
>> +		goto retry;
>
> Hmm, two goto retry don't seem nice :/
> Although I can't see how to improve it really.

Dunno, looked easy enough to understand to me.

>
> Maybe add a helper for doing the node fallback?

There is a small downside: the caller would then have no way to know
that the fallback was already tried, so if a retry is done because of
scratch overlap, the fallback has to be done again.

I don't think it should be too bad, so if you still prefer this then I
can do it.
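
To illustrate the downside, here is a toy userspace model. It is not
the patch code and every name in it is made up: ANY_NODE stands in for
NUMA_NO_NODE, toy_find() pretends the requested node has no free
memory, and addresses below scratch_end count as scratch. The
assumption is that the helper always tries the requested node first,
so each scratch retry repeats the failed node search.

```c
#include <assert.h>
#include <stdint.h>

#define ANY_NODE (-1)	/* hypothetical stand-in for NUMA_NO_NODE */

static int node_searches;	/* number of node-restricted searches */

/*
 * Toy search: a specific node never has free memory; an any-node
 * search returns start if it lies in [start, end), 0 on failure.
 */
static uint64_t toy_find(uint64_t start, uint64_t end, int nid)
{
	assert(start != 0);	/* 0 is reserved as the failure value */
	if (nid != ANY_NODE) {
		node_searches++;
		return 0;
	}
	return start < end ? start : 0;
}

/* Inline fallback: nid is updated, so the node is tried only once. */
static uint64_t alloc_inline(uint64_t start, uint64_t end, int nid,
			     uint64_t scratch_end)
{
	for (;;) {
		uint64_t addr = toy_find(start, end, nid);

		if (!addr && nid != ANY_NODE) {
			nid = ANY_NODE;
			continue;
		}
		if (!addr || addr >= scratch_end)
			return addr;
		start = addr + 1;	/* in scratch: skip it and retry */
	}
}

/* Hypothetical helper: hides the fallback from the caller. */
static uint64_t find_with_fallback(uint64_t start, uint64_t end, int nid)
{
	uint64_t addr = toy_find(start, end, nid);

	if (!addr && nid != ANY_NODE)
		addr = toy_find(start, end, ANY_NODE);
	return addr;
}

/* Helper variant: each scratch retry repeats the node search. */
static uint64_t alloc_with_helper(uint64_t start, uint64_t end, int nid,
				  uint64_t scratch_end)
{
	for (;;) {
		uint64_t addr = find_with_fallback(start, end, nid);

		if (!addr || addr >= scratch_end)
			return addr;
		start = addr + 1;	/* in scratch: skip it and retry */
	}
}
```

Both variants return the same address. The only difference in this
model is how often the node-restricted search runs: once for the
inline version, and once per attempt for the helper version.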

-- 
Regards,
Pratyush Yadav