Date: Tue, 6 Jan 2026 15:48:20 +0100
From: Michal Hocko
To: Gregory Price
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, kernel-team@meta.com,
    akpm@linux-foundation.org, vbabka@suse.cz, surenb@google.com,
    jackmanb@google.com, hannes@cmpxchg.org, ziy@nvidia.com,
    richard.weiyang@gmail.com, David Hildenbrand
Subject: Re: [PATCH v7] page_alloc: allow migration of smaller hugepages during contig_alloc
References: <20251221124656.2362540-1-gourry@gourry.net>
In-Reply-To: <20251221124656.2362540-1-gourry@gourry.net>
X-Mailing-List: linux-kernel@vger.kernel.org

On Sun 21-12-25 07:46:56, Gregory Price wrote:
> We presently skip regions with hugepages entirely when trying to do
> contiguous page allocation. This will cause otherwise-movable
> 2MB HugeTLB pages to be considered unmovable, and makes 1GB gigantic
> page allocation less reliable on systems utilizing both.
>
> Commit 4d73ba5fa710 ("mm: page_alloc: skip regions with hugetlbfs pages
> when allocating 1G pages") skipped all HugePage containing regions
> because it can cause significant delays in 1G allocation (as HugeTLB
> migrations may fail for a number of reasons).
>
> Instead, if hugepage migration is enabled, consider regions with
> hugepages smaller than the target contiguous allocation request
> as valid targets for allocation.
>
> We optimize for the existing behavior by searching for non-hugetlb
> regions in a first pass, then retrying the search to include hugetlb
> only on failure. This allows the existing fast-path to remain the
> default case with a slow-path fallback to increase reliability.
>
> We only fallback to the slow path if a hugetlb region was detected,
> and we do a full re-scan because the zones/blocks may have changed
> during the first pass (and it's not worth further complexity).
>
> isolate_migrate_pages_block() has similar hugetlb filter logic, and
> the hugetlb code does a migratable check in folio_isolate_hugetlb()
> during isolation. The code servicing the allocation and migration
> already supports this exact use case.
>
> To test, allocate a bunch of 2MB HugeTLB pages (in this case 48GB)
> and then attempt to allocate some 1G HugeTLB pages (in this case 4GB)
> (Scale to your machine's memory capacity).
>
>   echo 24576 > .../hugepages-2048kB/nr_hugepages
>   echo 4 > .../hugepages-1048576kB/nr_hugepages
>
> Prior to this patch, the 1GB page reservation can fail if no contiguous
> 1GB pages remain. After this patch, the kernel will try to move 2MB
> pages and successfully allocate the 1GB pages (assuming overall
> sufficient memory is available). Also tested this while a program had
> the 2MB reservations mapped, and the 1GB reservation still succeeds.
>
> folio_alloc_gigantic() is the primary user of alloc_contig_pages(),
> other users are debug or init-time allocations and largely unaffected.
> - ppc/memtrace is a debugfs interface
> - x86/tdx memory allocation occurs once on module-init
> - kfence/core happens once on module (late) init
> - THP uses it in debug_vm_pgtable_alloc_huge_page at __init time
>
> Suggested-by: David Hildenbrand
> Link: https://lore.kernel.org/linux-mm/6fe3562d-49b2-4975-aa86-e139c535ad00@redhat.com/
> Signed-off-by: Gregory Price
> Reviewed-by: Zi Yan
> Reviewed-by: Wei Yang

Sorry to be quite late with this one.
Making this a two-stage process is a reasonable compromise. Have you
considered using hugepage_movable_supported?

Anyway
Acked-by: Michal Hocko

Thanks!
-- 
Michal Hocko
SUSE Labs
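
For archive readers, the two-pass search described in the quoted changelog
can be illustrated with a small userspace model. This is only a sketch under
stated assumptions, not the kernel code: struct block, enum block_state,
scan() and alloc_contig_two_pass() are hypothetical names invented here,
memory is modelled as an array of equal-sized blocks, and alignment, zones
and the "hugepage migration is enabled" check are all left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one fixed-size memory block; not a kernel type. */
enum block_state { BLOCK_FREE, BLOCK_UNMOVABLE, BLOCK_HUGETLB };

struct block {
	enum block_state state;
	size_t hugepage_blocks;	/* size of the covering hugepage, in blocks */
};

/*
 * Look for `need` consecutive usable blocks; return the start index or -1.
 * With skip_hugetlb set, a hugetlb block breaks the current run and is
 * recorded in *skipped_hugetlb. Otherwise a hugetlb block is usable only
 * if its hugepage is smaller than the request (assumed migratable here).
 */
static long scan(const struct block *b, size_t n, size_t need,
		 bool skip_hugetlb, bool *skipped_hugetlb)
{
	size_t run = 0;

	assert(need > 0);
	for (size_t i = 0; i < n; i++) {
		bool usable = false;

		if (b[i].state == BLOCK_FREE) {
			usable = true;
		} else if (b[i].state == BLOCK_HUGETLB) {
			if (skip_hugetlb)
				*skipped_hugetlb = true;
			else
				usable = b[i].hugepage_blocks < need;
		}

		run = usable ? run + 1 : 0;
		if (run == need)
			return (long)(i + 1 - need);
	}
	return -1;
}

/* Fast path first; do a full rescan with hugetlb only if one was skipped. */
static long alloc_contig_two_pass(const struct block *b, size_t n, size_t need)
{
	bool skipped_hugetlb = false;
	long start = scan(b, n, need, true, &skipped_hugetlb);

	if (start < 0 && skipped_hugetlb)
		start = scan(b, n, need, false, &skipped_hugetlb);
	return start;
}
```

The second pass restarts from the beginning instead of resuming, mirroring
the changelog's point that the first pass may be stale by the time it fails.
The fallback is also skipped entirely when no hugetlb block was seen, so a
failure caused only by unmovable memory costs a single scan.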