From: Dev Jain <dev.jain@arm.com>
To: Nico Pache <npache@redhat.com>,
linux-kernel@vger.kernel.org, linux-mm@kvack.org
Cc: ryan.roberts@arm.com, anshuman.khandual@arm.com,
catalin.marinas@arm.com, cl@gentwo.org, vbabka@suse.cz,
mhocko@suse.com, apopple@nvidia.com, dave.hansen@linux.intel.com,
will@kernel.org, baohua@kernel.org, jack@suse.cz,
srivatsa@csail.mit.edu, haowenchao22@gmail.com, hughd@google.com,
aneesh.kumar@kernel.org, yang@os.amperecomputing.com,
peterx@redhat.com, ioworker0@gmail.com,
wangkefeng.wang@huawei.com, ziy@nvidia.com, jglisse@google.com,
surenb@google.com, vishal.moola@gmail.com, zokeefe@google.com,
zhengqi.arch@bytedance.com, jhubbard@nvidia.com,
21cnbao@gmail.com, willy@infradead.org,
kirill.shutemov@linux.intel.com, david@redhat.com,
aarcange@redhat.com, raquini@redhat.com, sunnanyong@huawei.com,
usamaarif642@gmail.com, audra@redhat.com,
akpm@linux-foundation.org
Subject: Re: [RFC 08/11] khugepaged: introduce khugepaged_scan_bitmap for mTHP support
Date: Fri, 10 Jan 2025 20:24:25 +0530 [thread overview]
Message-ID: <27ae4d80-38cd-4d6b-a49c-dad3f0ffbde3@arm.com> (raw)
In-Reply-To: <20250108233128.14484-9-npache@redhat.com>
On 09/01/25 5:01 am, Nico Pache wrote:
> khugepaged scans PMD ranges for potential collapse to a hugepage. To add
> mTHP support we use this scan to instead record chunks of fully utilized
> sections of the PMD.
>
> create a bitmap to represent a PMD in order MTHP_MIN_ORDER chunks.
> by default we will set this to order 3. The reasoning is that for 4K 512
> PMD size this results in a 64 bit bitmap which has some optimizations.
> For other arches like ARM64 64K, we can set a larger order if needed.
>
> khugepaged_scan_bitmap uses a stack struct to recursively scan a bitmap
> that represents chunks of fully utilized regions. We can then determine
> what mTHP size fits best and in the following patch, we set this bitmap
> while scanning the PMD.
>
> max_ptes_none is used as a scale to determine how "full" an order must
> be before being considered for collapse.
>
> Signed-off-by: Nico Pache <npache@redhat.com>
> ---
> include/linux/khugepaged.h | 4 +-
> mm/khugepaged.c | 129 +++++++++++++++++++++++++++++++++++--
> 2 files changed, 126 insertions(+), 7 deletions(-)
>
[--snip--]
>
> +// Recursive function to consume the bitmap
> +static int khugepaged_scan_bitmap(struct mm_struct *mm, unsigned long address,
> + int referenced, int unmapped, struct collapse_control *cc,
> + bool *mmap_locked, unsigned long enabled_orders)
> +{
> + u8 order, offset;
> + int num_chunks;
> + int bits_set, max_percent, threshold_bits;
> + int next_order, mid_offset;
> + int top = -1;
> + int collapsed = 0;
> + int ret;
> + struct scan_bit_state state;
> +
> + cc->mthp_bitmap_stack[++top] = (struct scan_bit_state)
> + { HPAGE_PMD_ORDER - MIN_MTHP_ORDER, 0 };
> +
> + while (top >= 0) {
> + state = cc->mthp_bitmap_stack[top--];
> + order = state.order;
> + offset = state.offset;
> + num_chunks = 1 << order;
> + // Skip mTHP orders that are not enabled
> + if (!(enabled_orders >> (order + MIN_MTHP_ORDER)) & 1)
> + goto next;
> +
> + // copy the relavant section to a new bitmap
> + bitmap_shift_right(cc->mthp_bitmap_temp, cc->mthp_bitmap, offset,
> + MTHP_BITMAP_SIZE);
> +
> + bits_set = bitmap_weight(cc->mthp_bitmap_temp, num_chunks);
> +
> + // Check if the region is "almost full" based on the threshold
> + max_percent = ((HPAGE_PMD_NR - khugepaged_max_ptes_none - 1) * 100)
> + / (HPAGE_PMD_NR - 1);
> + threshold_bits = (max_percent * num_chunks) / 100;
> +
> + if (bits_set >= threshold_bits) {
> + ret = collapse_huge_page(mm, address, referenced, unmapped, cc,
> + mmap_locked, order + MIN_MTHP_ORDER, offset * MIN_MTHP_NR);
> + if (ret == SCAN_SUCCEED)
> + collapsed += (1 << (order + MIN_MTHP_ORDER));
> + continue;
> + }
We are going to the lower order when it is not in the allowed mask of
orders, or when we are below the threshold. What to do when these
conditions do not happen, and the reason for collapse failure is
collapse_huge_page()? For example, if you start with a PMD order scan,
and collapse_huge_page() fails, then you hit "continue", and then exit
the loop because there is nothing else in the stack, so we exit without
trying mTHPs.
> +
> +next:
> + if (order > 0) {
> + next_order = order - 1;
> + mid_offset = offset + (num_chunks / 2);
> + cc->mthp_bitmap_stack[++top] = (struct scan_bit_state)
> + { next_order, mid_offset };
> + cc->mthp_bitmap_stack[++top] = (struct scan_bit_state)
> + { next_order, offset };
> + }
> + }
> + return collapsed;
> +}
> +
next prev parent reply other threads:[~2025-01-10 14:54 UTC|newest]
Thread overview: 53+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-01-08 23:31 [RFC 00/11] khugepaged: mTHP support Nico Pache
2025-01-08 23:31 ` [RFC 01/11] introduce khugepaged_collapse_single_pmd to collapse a single pmd Nico Pache
2025-01-10 6:25 ` Dev Jain
2025-01-08 23:31 ` [RFC 02/11] khugepaged: refactor madvise_collapse and khugepaged_scan_mm_slot Nico Pache
2025-01-08 23:31 ` [RFC 03/11] khugepaged: Don't allocate khugepaged mm_slot early Nico Pache
2025-01-10 6:11 ` Dev Jain
2025-01-10 19:37 ` Nico Pache
2025-01-08 23:31 ` [RFC 04/11] khugepaged: rename hpage_collapse_* to khugepaged_* Nico Pache
2025-01-08 23:31 ` [RFC 05/11] khugepaged: generalize hugepage_vma_revalidate for mTHP support Nico Pache
2025-01-08 23:31 ` [RFC 06/11] khugepaged: generalize alloc_charge_folio " Nico Pache
2025-01-10 6:23 ` Dev Jain
2025-01-10 19:41 ` Nico Pache
2025-01-08 23:31 ` [RFC 07/11] khugepaged: generalize __collapse_huge_page_* " Nico Pache
2025-01-10 6:38 ` Dev Jain
2025-01-08 23:31 ` [RFC 08/11] khugepaged: introduce khugepaged_scan_bitmap " Nico Pache
2025-01-10 9:05 ` Dev Jain
2025-01-10 21:48 ` Nico Pache
2025-01-12 11:23 ` Dev Jain
2025-01-13 22:25 ` Nico Pache
2025-01-10 14:54 ` Dev Jain [this message]
2025-01-10 21:48 ` Nico Pache
2025-01-12 15:13 ` Dev Jain
2025-01-12 16:41 ` Dev Jain
2025-01-08 23:31 ` [RFC 09/11] khugepaged: add " Nico Pache
2025-01-10 9:20 ` Dev Jain
2025-01-10 13:36 ` Dev Jain
2025-01-08 23:31 ` [RFC 10/11] khugepaged: remove max_ptes_none restriction on the pmd scan Nico Pache
2025-01-08 23:31 ` [RFC 11/11] khugepaged: skip collapsing mTHP to smaller orders Nico Pache
2025-01-09 6:22 ` [RFC 00/11] khugepaged: mTHP support Dev Jain
2025-01-10 2:27 ` Nico Pache
2025-01-10 4:56 ` Dev Jain
2025-01-10 22:01 ` Nico Pache
2025-01-12 14:11 ` Dev Jain
2025-01-13 23:00 ` Nico Pache
2025-01-09 6:27 ` Dev Jain
2025-01-10 1:28 ` Nico Pache
2025-01-16 9:47 ` Ryan Roberts
2025-01-16 20:53 ` Nico Pache
2025-01-20 5:17 ` Dev Jain
2025-01-23 20:24 ` Nico Pache
2025-01-24 7:13 ` Dev Jain
2025-01-24 7:38 ` Dev Jain
2025-01-20 12:49 ` Ryan Roberts
2025-01-23 20:42 ` Nico Pache
2025-01-20 12:54 ` David Hildenbrand
2025-01-20 13:37 ` Ryan Roberts
2025-01-20 13:56 ` David Hildenbrand
2025-01-20 16:27 ` Ryan Roberts
2025-01-20 18:39 ` David Hildenbrand
2025-01-21 9:48 ` Ryan Roberts
2025-01-21 10:19 ` David Hildenbrand
2025-01-27 9:31 ` Dev Jain
2025-01-22 5:18 ` Dev Jain
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=27ae4d80-38cd-4d6b-a49c-dad3f0ffbde3@arm.com \
--to=dev.jain@arm.com \
--cc=21cnbao@gmail.com \
--cc=aarcange@redhat.com \
--cc=akpm@linux-foundation.org \
--cc=aneesh.kumar@kernel.org \
--cc=anshuman.khandual@arm.com \
--cc=apopple@nvidia.com \
--cc=audra@redhat.com \
--cc=baohua@kernel.org \
--cc=catalin.marinas@arm.com \
--cc=cl@gentwo.org \
--cc=dave.hansen@linux.intel.com \
--cc=david@redhat.com \
--cc=haowenchao22@gmail.com \
--cc=hughd@google.com \
--cc=ioworker0@gmail.com \
--cc=jack@suse.cz \
--cc=jglisse@google.com \
--cc=jhubbard@nvidia.com \
--cc=kirill.shutemov@linux.intel.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=mhocko@suse.com \
--cc=npache@redhat.com \
--cc=peterx@redhat.com \
--cc=raquini@redhat.com \
--cc=ryan.roberts@arm.com \
--cc=srivatsa@csail.mit.edu \
--cc=sunnanyong@huawei.com \
--cc=surenb@google.com \
--cc=usamaarif642@gmail.com \
--cc=vbabka@suse.cz \
--cc=vishal.moola@gmail.com \
--cc=wangkefeng.wang@huawei.com \
--cc=will@kernel.org \
--cc=willy@infradead.org \
--cc=yang@os.amperecomputing.com \
--cc=zhengqi.arch@bytedance.com \
--cc=ziy@nvidia.com \
--cc=zokeefe@google.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox