From: Lance Yang <lance.yang@linux.dev>
To: Yafang Shao <laoar.shao@gmail.com>
Cc: akpm@linux-foundation.org, david@redhat.com, ziy@nvidia.com,
baolin.wang@linux.alibaba.com, lorenzo.stoakes@oracle.com,
Liam.Howlett@oracle.com, npache@redhat.com, ryan.roberts@arm.com,
dev.jain@arm.com, hannes@cmpxchg.org, usamaarif642@gmail.com,
gutierrez.asier@huawei-partners.com, willy@infradead.org,
ast@kernel.org, daniel@iogearbox.net, andrii@kernel.org,
ameryhung@gmail.com, rientjes@google.com, corbet@lwn.net,
21cnbao@gmail.com, shakeel.butt@linux.dev, bpf@vger.kernel.org,
linux-mm@kvack.org, linux-doc@vger.kernel.org
Subject: Re: [PATCH v7 mm-new 02/10] mm: thp: add support for BPF based THP order selection
Date: Thu, 11 Sep 2025 11:04:58 +0800
Message-ID: <f1b2fe3d-e8d4-47a7-b369-ee826fec0e7c@linux.dev>
In-Reply-To: <CALOAHbAfzDNdx5LTUhH+eMgVfdG35gAM5subeByP97x53=CWLw@mail.gmail.com>
On 2025/9/11 10:48, Yafang Shao wrote:
> On Wed, Sep 10, 2025 at 9:57 PM Lance Yang <lance.yang@linux.dev> wrote:
[...]
>>>>> +/**
>>>>> + * @thp_order_fn_t: Get the suggested THP orders from a BPF program for allocation
>>>>> + * @vma: vm_area_struct associated with the THP allocation
>>>>> + * @vma_type: The VMA type, such as BPF_THP_VM_HUGEPAGE if VM_HUGEPAGE is set
>>>>> + * BPF_THP_VM_NOHUGEPAGE if VM_NOHUGEPAGE is set, or BPF_THP_VM_NONE if
>>>>> + * neither is set.
>>>>> + * @tva_type: TVA type for current @vma
>>>>> + * @orders: Bitmask of requested THP orders for this allocation
>>>>> + * - PMD-mapped allocation if PMD_ORDER is set
>>>>> + * - mTHP allocation otherwise
>>>>> + *
>>>>> + * Return: The suggested THP order from the BPF program for allocation. It will
>>>>> + * not exceed the highest requested order in @orders. Return -1 to
>>>>> + * indicate that the original requested @orders should remain unchanged.
>>>>
>>>> A minor documentation nit: the comment says "Return -1 to indicate that the
>>>> original requested @orders should remain unchanged". It might be slightly
>>>> clearer to say "Return a negative value to fall back to the original
>>>> behavior". This would cover all error codes as well ;)
>
> will change it.
Please feel free to change it ;)
>
>>>>
[...]
>>>>
>>>> Also, for future extensions, it might be a good idea to add a reserved
>>>> flags argument to the thp_order_fn_t signature.
>>>>
>>>> For example thp_order_fn_t(..., unsigned long flags).
>>>>
>>>> This would give us a forward-compatible way to add new semantics later
>>>> without breaking the ABI and needing a v2. We could just require it to be
>>>> 0 for now.
>
> That makes sense. However, as Lorenzo mentioned previously, we should
> keep the interface as minimal as possible.
Got it.
[...]
>> Forgot to add:
>>
>> Noticed that if the hook returns 0, bpf_hook_thp_get_orders() falls
>> back to 'orders', preventing us from dynamically disabling mTHP
>> allocations.
>
> Could you please clarify what you mean by that?
>
> + thp_order = bpf_hook_thp_get_order(vma, vma_type, tva_type, orders);
> + if (thp_order < 0)
> + goto out;
>
> In my implementation, it only falls back to @orders if the return
> value is negative. If the return value is 0, it uses BIT(0):
My bad, I completely misread the code last night ...
I see now that returning 0 forces a base page (order-0) allocation.
>
> + if (thp_order <= highest_order(orders))
> + thp_orders = BIT(thp_order);
Yes, this is exactly the behavior we need. It will allow us to dynamically
disable mTHP for low-priority containers when we need to, which is perfect
for our use case!
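
For reference, the return-value handling discussed above can be modelled in
plain userspace C. This is only a sketch, not the kernel code: the helper
name apply_hook_result() is made up for illustration, highest_order() is a
simple stand-in for the kernel helper of the same name, and the handling of
a suggestion larger than the highest requested order (ignore it and keep
@orders) is an assumption that the quoted hunks do not show.

```c
#define BIT(n) (1UL << (n))

/* Stand-in for the kernel's highest_order(): index of the highest set bit,
 * or -1 for an empty bitmask. */
static int highest_order(unsigned long orders)
{
	int order = -1;

	while (orders) {
		orders >>= 1;
		order++;
	}
	return order;
}

/*
 * Hypothetical model of the caller-side logic.
 * hook_ret: value returned by the BPF hook.
 * orders:   bitmask of THP orders originally requested.
 */
static unsigned long apply_hook_result(int hook_ret, unsigned long orders)
{
	if (hook_ret < 0)
		return orders;		/* negative: keep the original request */
	if (hook_ret <= highest_order(orders))
		return BIT(hook_ret);	/* 0 yields BIT(0), i.e. a base page */
	return orders;			/* ASSUMPTION: oversized suggestion is ignored */
}
```

With orders = BIT(9) | BIT(4), a return of 0 gives BIT(0), a return of 4
gives BIT(4), and any negative value leaves the bitmask untouched.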
>
>>
>> Honoring a return of 0 is critical for our use case, which is to
>> dynamically disable mTHP for low-priority containers when memory gets
>> low in mixed workloads.
>>
>> And then re-enable it for them when memory is back above the low
>> watermark.
>
> Thank you for detailing your use case; that context is very helpful.
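
To make the use case concrete, here is a minimal sketch of the decision such
a policy would make, written as ordinary C rather than as a real BPF
program. Everything in it is hypothetical: struct container_state, its
fields, and thp_policy() are illustration-only names, and how a real policy
would obtain the priority and watermark information is not covered here.

```c
#include <stdbool.h>

/* Hypothetical per-container view; not a kernel structure. */
struct container_state {
	bool low_priority;		/* container is marked low priority */
	unsigned long free_pages;	/* current free memory, in pages */
	unsigned long low_watermark;	/* low watermark, in pages */
};

/*
 * Hypothetical policy decision:
 *   0  -> force order-0 (base pages), i.e. disable mTHP for now
 *   -1 -> leave the originally requested orders unchanged
 */
static int thp_policy(const struct container_state *c)
{
	if (c->low_priority && c->free_pages < c->low_watermark)
		return 0;
	return -1;
}
```

Once free memory is back at or above the low watermark, the same function
returns -1 again, which re-enables the originally requested orders.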
Cheers,
Lance