From: Nico Pache
To: linux-mm@kvack.org, linux-doc@vger.kernel.org,
 linux-kernel@vger.kernel.org, linux-trace-kernel@vger.kernel.org
Cc: david@redhat.com, ziy@nvidia.com, baolin.wang@linux.alibaba.com,
 lorenzo.stoakes@oracle.com, Liam.Howlett@oracle.com, ryan.roberts@arm.com,
 dev.jain@arm.com, corbet@lwn.net, rostedt@goodmis.org, mhiramat@kernel.org,
 mathieu.desnoyers@efficios.com, akpm@linux-foundation.org, baohua@kernel.org,
 willy@infradead.org, peterx@redhat.com, wangkefeng.wang@huawei.com,
 usamaarif642@gmail.com, sunnanyong@huawei.com, vishal.moola@gmail.com,
 thomas.hellstrom@linux.intel.com, yang@os.amperecomputing.com,
 kirill.shutemov@linux.intel.com, aarcange@redhat.com, raquini@redhat.com,
 anshuman.khandual@arm.com, catalin.marinas@arm.com, tiwai@suse.de,
 will@kernel.org, dave.hansen@linux.intel.com, jack@suse.cz, cl@gentwo.org,
 jglisse@google.com, surenb@google.com, zokeefe@google.com,
 hannes@cmpxchg.org, rientjes@google.com, mhocko@suse.com,
 rdunlap@infradead.org
Subject: [PATCH v6 06/12] khugepaged: introduce khugepaged_scan_bitmap for mTHP support
Date: Wed, 14 May 2025 21:03:06 -0600
Message-ID: <20250515030312.125567-7-npache@redhat.com>
In-Reply-To: <20250515030312.125567-6-npache@redhat.com>
References: <20250515030312.125567-1-npache@redhat.com>
 <20250515030312.125567-2-npache@redhat.com>
 <20250515030312.125567-3-npache@redhat.com>
 <20250515030312.125567-4-npache@redhat.com>
 <20250515030312.125567-5-npache@redhat.com>
 <20250515030312.125567-6-npache@redhat.com>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit

khugepaged scans anonymous PMD ranges for potential collapse to a hugepage.
To add mTHP support, we use this scan to instead record chunks of utilized
sections of the PMD.
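
To illustrate what "recording chunks of utilized sections" could look like,
here is a minimal userspace sketch. It is not the kernel code: the rule for
setting a bit belongs to the following patch and is not shown in this one, so
the rule used here (a chunk counts as utilized when at least one of its PTEs
is populated) is an assumption. The names struct chunk_bitmap, chunk_set,
chunk_test, record_chunks and the pte_present array are hypothetical, and the
geometry (512 PTEs per PMD, 4 PTEs per chunk) assumes 4 KiB pages with a
2 MiB PMD and a minimum order of 2.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Assumed geometry: 4 KiB base pages, 2 MiB PMD. */
#define PMD_ORDER       9                        /* 512 PTEs per PMD */
#define PMD_NR          (1 << PMD_ORDER)
#define MIN_MTHP_ORDER  2                        /* minimum order named in the patch */
#define MIN_MTHP_NR     (1 << MIN_MTHP_ORDER)    /* assumed: 4 PTEs per chunk */
#define NR_CHUNKS       (PMD_NR / MIN_MTHP_NR)   /* 128 chunk bits per PMD */

/* Hypothetical stand-in for the kernel bitmap: one bit per chunk. */
struct chunk_bitmap {
    uint64_t words[NR_CHUNKS / 64];
};

static void chunk_set(struct chunk_bitmap *bm, unsigned int chunk)
{
    assert(chunk < NR_CHUNKS);
    bm->words[chunk / 64] |= (uint64_t)1 << (chunk % 64);
}

static bool chunk_test(const struct chunk_bitmap *bm, unsigned int chunk)
{
    assert(chunk < NR_CHUNKS);
    return (bm->words[chunk / 64] >> (chunk % 64)) & 1;
}

/*
 * Walk all PTE slots of one PMD and mark each chunk that holds at least
 * one populated PTE (assumed rule, see the text above).
 * Returns the number of chunks marked.
 */
static unsigned int record_chunks(const bool pte_present[PMD_NR],
                                  struct chunk_bitmap *bm)
{
    unsigned int marked = 0;

    memset(bm, 0, sizeof(*bm));
    for (unsigned int chunk = 0; chunk < NR_CHUNKS; chunk++) {
        for (unsigned int i = 0; i < MIN_MTHP_NR; i++) {
            if (pte_present[chunk * MIN_MTHP_NR + i]) {
                chunk_set(bm, chunk);
                marked++;
                break;  /* one populated PTE is enough for this chunk */
            }
        }
    }
    return marked;
}
```

With this layout, PTE index i belongs to chunk i / 4, so chunk 0 covers PTEs
0-3 and chunk 127 covers PTEs 508-511.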
khugepaged_scan_bitmap uses a stack struct to recursively scan a bitmap
that represents chunks of utilized regions. We can then determine which
mTHP size fits best. In the following patch, we set this bitmap while
scanning the anon PMD.

A minimum collapse order of 2 is used as this is the lowest order supported
by anon memory.

max_ptes_none is used as a scale to determine how "full" an order must be
before being considered for collapse.

When attempting to collapse an order that is set to "always", always
collapse to that order in a greedy manner, without considering the number
of bits set.

Signed-off-by: Nico Pache
---
 include/linux/khugepaged.h |  4 ++
 mm/khugepaged.c            | 94 ++++++++++++++++++++++++++++++++++----
 2 files changed, 89 insertions(+), 9 deletions(-)

diff --git a/include/linux/khugepaged.h b/include/linux/khugepaged.h
index b8d69cfbb58b..b6e5ba1fae58 100644
--- a/include/linux/khugepaged.h
+++ b/include/linux/khugepaged.h
@@ -1,6 +1,10 @@
 /* SPDX-License-Identifier: GPL-2.0 */
 #ifndef _LINUX_KHUGEPAGED_H
 #define _LINUX_KHUGEPAGED_H
+#define KHUGEPAGED_MIN_MTHP_ORDER	2
+#define KHUGEPAGED_MIN_MTHP_NR	(1<mthp_bitmap_stack[++top] = (struct scan_bit_state)
+		{ HPAGE_PMD_ORDER - KHUGEPAGED_MIN_MTHP_ORDER, 0 };
+
+	while (top >= 0) {
+		state = cc->mthp_bitmap_stack[top--];
+		order = state.order + KHUGEPAGED_MIN_MTHP_ORDER;
+		offset = state.offset;
+		num_chunks = 1 << (state.order);
+		// Skip mTHP orders that are not enabled
+		if (!test_bit(order, &enabled_orders))
+			goto next;
+
+		// copy the relevant section to a new bitmap
+		bitmap_shift_right(cc->mthp_bitmap_temp, cc->mthp_bitmap, offset,
+				  MTHP_BITMAP_SIZE);
+
+		bits_set = bitmap_weight(cc->mthp_bitmap_temp, num_chunks);
+		threshold_bits = (HPAGE_PMD_NR - khugepaged_max_ptes_none - 1)
+				>> (HPAGE_PMD_ORDER - state.order);
+
+		//Check if the region is "almost full" based on the threshold
+		if (bits_set > threshold_bits || is_pmd_only
+			|| test_bit(order, &huge_anon_orders_always)) {
+			ret = collapse_huge_page(mm, address, referenced, unmapped, cc,
+					mmap_locked, order, offset * KHUGEPAGED_MIN_MTHP_NR);
+			if (ret == SCAN_SUCCEED) {
+				collapsed += (1 << order);
+				continue;
+			}
+		}
+
+next:
+		if (state.order > 0) {
+			next_order = state.order - 1;
+			mid_offset = offset + (num_chunks / 2);
+			cc->mthp_bitmap_stack[++top] = (struct scan_bit_state)
+				{ next_order, mid_offset };
+			cc->mthp_bitmap_stack[++top] = (struct scan_bit_state)
+				{ next_order, offset };
+		}
+	}
+	return collapsed;
+}
+
 static int khugepaged_scan_pmd(struct mm_struct *mm,
 				   struct vm_area_struct *vma,
 				   unsigned long address, bool *mmap_locked,
@@ -1447,9 +1525,7 @@ static int khugepaged_scan_pmd(struct mm_struct *mm,
 	pte_unmap_unlock(pte, ptl);
 	if (result == SCAN_SUCCEED) {
 		result = collapse_huge_page(mm, address, referenced,
-					    unmapped, cc);
-		/* collapse_huge_page will return with the mmap_lock released */
-		*mmap_locked = false;
+					    unmapped, cc, mmap_locked, HPAGE_PMD_ORDER, 0);
 	}
 out:
 	trace_mm_khugepaged_scan_pmd(mm, folio, writable, referenced,
-- 
2.49.0
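
For readers who want to experiment with the traversal outside the kernel, the
loop in the patch can be approximated by the userspace sketch below. It is a
simplified model under stated assumptions, not the kernel implementation:
HPAGE_PMD_ORDER is assumed to be 9, the chunk bitmap is a plain bool array,
every attempted collapse is assumed to succeed, and the is_pmd_only and
"always" shortcuts from the patch are left out. The names scan_state,
chunk_weight and scan_bitmap_sketch are hypothetical. The threshold expression
mirrors the one in the patch as posted.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed geometry: 4 KiB base pages, 2 MiB PMD. */
#define PMD_ORDER       9
#define PMD_NR          (1 << PMD_ORDER)
#define MIN_MTHP_ORDER  2
#define NR_CHUNKS       (1 << (PMD_ORDER - MIN_MTHP_ORDER))   /* 128 */

/* order is relative to MIN_MTHP_ORDER; offset is counted in chunks. */
struct scan_state {
    int order;
    int offset;
};

/* Count utilized chunks in [offset, offset + num_chunks). */
static int chunk_weight(const bool *chunks, int offset, int num_chunks)
{
    int set = 0;

    for (int i = 0; i < num_chunks; i++)
        set += chunks[offset + i] ? 1 : 0;
    return set;
}

/*
 * Walk the chunk bitmap from the PMD-sized region downwards, halving a
 * region whenever it is not full enough (or its order is disabled).
 * Returns the number of base pages covered by simulated collapses and
 * stores the number of collapses in *nr_collapses.
 */
static int scan_bitmap_sketch(const bool chunks[NR_CHUNKS],
                              unsigned long enabled_orders,
                              int max_ptes_none, int *nr_collapses)
{
    struct scan_state stack[NR_CHUNKS];   /* far deeper than needed */
    int top = -1, collapsed = 0;

    assert(max_ptes_none >= 0 && max_ptes_none < PMD_NR);
    *nr_collapses = 0;
    stack[++top] = (struct scan_state){ PMD_ORDER - MIN_MTHP_ORDER, 0 };

    while (top >= 0) {
        struct scan_state st = stack[top--];
        int order = st.order + MIN_MTHP_ORDER;
        int num_chunks = 1 << st.order;

        if (enabled_orders & (1UL << order)) {
            int bits_set = chunk_weight(chunks, st.offset, num_chunks);
            /* Scale the PMD-level threshold down to this region's chunk count. */
            int threshold = (PMD_NR - max_ptes_none - 1)
                            >> (PMD_ORDER - st.order);

            if (bits_set > threshold) {
                collapsed += 1 << order;   /* assume the collapse succeeds */
                (*nr_collapses)++;
                continue;                  /* do not descend into this region */
            }
        }

        if (st.order > 0) {
            int half = num_chunks / 2;

            /* Push the upper half first so the lower half is visited first. */
            stack[++top] = (struct scan_state){ st.order - 1, st.offset + half };
            stack[++top] = (struct scan_state){ st.order - 1, st.offset };
        }
    }
    return collapsed;
}
```

In this model, with max_ptes_none = 0 and only the lower 64 chunks set, the
PMD-sized region fails the check (64 is not greater than 127) and the lower
half is then collapsed at order 8 (64 is greater than 63). With
max_ptes_none = 511, the threshold is 0 at every level, so a single set chunk
is enough to collapse the whole PMD when order 9 is enabled.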