From: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
To: Usama Arif <usamaarif642@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
david@redhat.com, linux-mm@kvack.org,
linux-fsdevel@vger.kernel.org, corbet@lwn.net, rppt@kernel.org,
surenb@google.com, mhocko@suse.com, hannes@cmpxchg.org,
baohua@kernel.org, shakeel.butt@linux.dev, riel@surriel.com,
ziy@nvidia.com, laoar.shao@gmail.com, dev.jain@arm.com,
baolin.wang@linux.alibaba.com, npache@redhat.com,
Liam.Howlett@oracle.com, ryan.roberts@arm.com, vbabka@suse.cz,
jannh@google.com, Arnd Bergmann <arnd@arndb.de>,
sj@kernel.org, linux-kernel@vger.kernel.org,
linux-doc@vger.kernel.org, kernel-team@meta.com
Subject: Re: [PATCH v4 6/7] selftests: prctl: introduce tests for disabling THPs completely
Date: Wed, 13 Aug 2025 15:54:38 +0100
Message-ID: <4ec21eaf-b728-43c6-9f11-841a367b9794@lucifer.local>
In-Reply-To: <20250813135642.1986480-7-usamaarif642@gmail.com>
On Wed, Aug 13, 2025 at 02:55:41PM +0100, Usama Arif wrote:
> The test will set the global system THP setting to never, madvise
> or always depending on the fixture variant and the 2M setting to
> inherit before it starts (and reset to original at teardown).
> The fixture setup will also test if PR_SET_THP_DISABLE prctl call can
> be made to disable all THPs and skip if it fails.
>
> This tests if the process can:
> - successfully get the policy to disable THPs completely.
> - never get a hugepage when the THPs are completely disabled
> with the prctl, including with MADV_HUGE and MADV_COLLAPSE.
> - successfully reset the policy of the process.
> - after reset, only get hugepages with:
> - MADV_COLLAPSE when policy is set to never.
> - MADV_HUGE and MADV_COLLAPSE when policy is set to madvise.
> - always when policy is set to "always".
> - repeat the above tests in a forked process to make sure
> the policy is carried across forks.
>
> Signed-off-by: Usama Arif <usamaarif642@gmail.com>
> Acked-by: David Hildenbrand <david@redhat.com>
Some nits below but this looks sensible, so:
Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
> ---
> tools/testing/selftests/mm/.gitignore | 1 +
> tools/testing/selftests/mm/Makefile | 1 +
> .../testing/selftests/mm/prctl_thp_disable.c | 168 ++++++++++++++++++
> tools/testing/selftests/mm/thp_settings.c | 9 +-
> tools/testing/selftests/mm/thp_settings.h | 1 +
> 5 files changed, 179 insertions(+), 1 deletion(-)
> create mode 100644 tools/testing/selftests/mm/prctl_thp_disable.c
>
> diff --git a/tools/testing/selftests/mm/.gitignore b/tools/testing/selftests/mm/.gitignore
> index e7b23a8a05fe2..eb023ea857b31 100644
> --- a/tools/testing/selftests/mm/.gitignore
> +++ b/tools/testing/selftests/mm/.gitignore
> @@ -58,3 +58,4 @@ pkey_sighandler_tests_32
> pkey_sighandler_tests_64
> guard-regions
> merge
> +prctl_thp_disable
> diff --git a/tools/testing/selftests/mm/Makefile b/tools/testing/selftests/mm/Makefile
> index d75f1effcb791..bd5d17beafa64 100644
> --- a/tools/testing/selftests/mm/Makefile
> +++ b/tools/testing/selftests/mm/Makefile
> @@ -87,6 +87,7 @@ TEST_GEN_FILES += on-fault-limit
> TEST_GEN_FILES += pagemap_ioctl
> TEST_GEN_FILES += pfnmap
> TEST_GEN_FILES += process_madv
> +TEST_GEN_FILES += prctl_thp_disable
> TEST_GEN_FILES += thuge-gen
> TEST_GEN_FILES += transhuge-stress
> TEST_GEN_FILES += uffd-stress
> diff --git a/tools/testing/selftests/mm/prctl_thp_disable.c b/tools/testing/selftests/mm/prctl_thp_disable.c
> new file mode 100644
> index 0000000000000..8845e9f414560
> --- /dev/null
> +++ b/tools/testing/selftests/mm/prctl_thp_disable.c
> @@ -0,0 +1,168 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Basic tests for PR_GET/SET_THP_DISABLE prctl calls
> + *
> + * Author(s): Usama Arif <usamaarif642@gmail.com>
> + */
> +#include <stdio.h>
> +#include <stdlib.h>
> +#include <string.h>
> +#include <unistd.h>
> +#include <sys/mman.h>
> +#include <sys/prctl.h>
> +#include <sys/wait.h>
> +
> +#include "../kselftest_harness.h"
> +#include "thp_settings.h"
> +#include "vm_util.h"
> +
> +enum thp_collapse_type {
> + THP_COLLAPSE_NONE,
> + THP_COLLAPSE_MADV_HUGEPAGE, /* MADV_HUGEPAGE before access */
> + THP_COLLAPSE_MADV_COLLAPSE, /* MADV_COLLAPSE after access */
> +};
> +
> +/*
> + * Function to mmap a buffer, fault it in, madvise it appropriately (before
> + * page fault for MADV_HUGE, and after for MADV_COLLAPSE), and check if the
> + * mmap region is huge.
> + * Returns:
> + * 0 if test doesn't give hugepage
> + * 1 if test gives a hugepage
> + * -errno if mmap fails
> + */
> +static int test_mmap_thp(enum thp_collapse_type madvise_buf, size_t pmdsize)
> +{
> + char *mem, *mmap_mem;
> + size_t mmap_size;
> + int ret;
> +
> + /* For alignment purposes, we need twice the THP size. */
> + mmap_size = 2 * pmdsize;
> + mmap_mem = (char *)mmap(NULL, mmap_size, PROT_READ | PROT_WRITE,
> + MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
> + if (mmap_mem == MAP_FAILED)
> + return -errno;
> +
> + /* We need a THP-aligned memory area. */
> + mem = (char *)(((uintptr_t)mmap_mem + pmdsize) & ~(pmdsize - 1));
> +
> + if (madvise_buf == THP_COLLAPSE_MADV_HUGEPAGE)
> + madvise(mem, pmdsize, MADV_HUGEPAGE);
> +
> + /* Ensure memory is allocated */
> + memset(mem, 1, pmdsize);
> +
> + if (madvise_buf == THP_COLLAPSE_MADV_COLLAPSE)
> + madvise(mem, pmdsize, MADV_COLLAPSE);
> +
> + /* HACK: make sure we have a separate VMA that we can check reliably. */
> + mprotect(mem, pmdsize, PROT_READ);
I mean you won't be _absolutely_ sure of this, as you might merge with an
adjacent read-only VMA.
The best way is always to map a PROT_NONE mapping first, then perform a
MAP_FIXED mapping into it.
Given 2 * PMD should guarantee at least 1 aligned PMD you can use, you could
do:

	char *reserve, *mem, *mmap_mem;

	...
	(set mmap_size)

	/* Reserve space so we don't get any unexpected merges around us. */
	reserve = mmap(NULL, 2 * pagesize + mmap_size, PROT_NONE,
		       MAP_PRIVATE | MAP_ANON, -1, 0);
	if (reserve == MAP_FAILED)
		return -errno;

	mmap_mem = mmap(&reserve[pagesize], mmap_size, PROT_READ | PROT_WRITE,
			MAP_PRIVATE | MAP_ANON | MAP_FIXED, -1, 0);
	...

You could then do your 'hack' (which is not really a hack, just fine I think).
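
To spell the reserve-then-MAP_FIXED idea out in full, here is a minimal standalone sketch. It is only an illustration, not a drop-in for the patch: the names map_guarded_pmd() and struct guarded_map are made up here, and the page size is taken from sysconf() rather than from the selftest helpers. The PROT_NONE guard page on each side means nothing mapped later can sit directly next to the test mapping and merge with it.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical bookkeeping struct, not part of the patch. */
struct guarded_map {
	char *reserve;		/* start of the PROT_NONE reservation */
	size_t reserve_size;	/* guard page + 2 * pmdsize + guard page */
	char *mem;		/* PMD-aligned, pmdsize-long usable area */
};

/*
 * Hypothetical helper: reserve address space with PROT_NONE, then place the
 * real read/write mapping inside it with MAP_FIXED, one page in from each end.
 * pmdsize is assumed to be a power of two.
 * Returns 0 on success, -errno on failure.
 */
static int map_guarded_pmd(size_t pmdsize, struct guarded_map *out)
{
	size_t pagesize = (size_t)sysconf(_SC_PAGESIZE);
	/* For alignment purposes, we need twice the THP size. */
	size_t mmap_size = 2 * pmdsize;
	size_t reserve_size = 2 * pagesize + mmap_size;
	char *reserve, *mmap_mem;

	/* Reserve space so we don't get any unexpected merges around us. */
	reserve = mmap(NULL, reserve_size, PROT_NONE,
		       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (reserve == MAP_FAILED)
		return -errno;

	/* Replace the middle of the reservation, keeping both guard pages. */
	mmap_mem = mmap(&reserve[pagesize], mmap_size, PROT_READ | PROT_WRITE,
			MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);
	if (mmap_mem == MAP_FAILED) {
		int err = errno;

		munmap(reserve, reserve_size);
		return -err;
	}

	/* Same rounding as the patch: next PMD boundary above mmap_mem. */
	out->mem = (char *)(((uintptr_t)mmap_mem + pmdsize) &
			    ~((uintptr_t)pmdsize - 1));
	/* The aligned area must still fit inside the read/write mapping. */
	assert(out->mem + pmdsize <= mmap_mem + mmap_size);

	out->reserve = reserve;
	out->reserve_size = reserve_size;
	return 0;
}
```

With something like this, the madvise()/memset()/mprotect() sequence would operate on the returned mem as before, and cleanup becomes a single munmap(reserve, reserve_size), which covers the guard pages and the inner mapping in one go.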
> +
> + ret = check_huge_anon(mem, 1, pmdsize);
> + munmap(mmap_mem, mmap_size);
> + return ret;
> +}
> +
> +static void prctl_thp_disable_completely_test(struct __test_metadata *const _metadata,
> + size_t pmdsize,
> + enum thp_enabled thp_policy)
> +{
> + ASSERT_EQ(prctl(PR_GET_THP_DISABLE, NULL, NULL, NULL, NULL), 1);
> +
> + /* tests after prctl overrides global policy */
> + ASSERT_EQ(test_mmap_thp(THP_COLLAPSE_NONE, pmdsize), 0);
> +
> + ASSERT_EQ(test_mmap_thp(THP_COLLAPSE_MADV_HUGEPAGE, pmdsize), 0);
> +
> + ASSERT_EQ(test_mmap_thp(THP_COLLAPSE_MADV_COLLAPSE, pmdsize), 0);
> +
> + /* Reset to global policy */
> + ASSERT_EQ(prctl(PR_SET_THP_DISABLE, 0, NULL, NULL, NULL), 0);
> +
> + /* tests after prctl is cleared, and only global policy is effective */
> + ASSERT_EQ(test_mmap_thp(THP_COLLAPSE_NONE, pmdsize),
> + thp_policy == THP_ALWAYS ? 1 : 0);
> +
> + ASSERT_EQ(test_mmap_thp(THP_COLLAPSE_MADV_HUGEPAGE, pmdsize),
> + thp_policy == THP_NEVER ? 0 : 1);
> +
> + ASSERT_EQ(test_mmap_thp(THP_COLLAPSE_MADV_COLLAPSE, pmdsize), 1);
> +}
> +
> +FIXTURE(prctl_thp_disable_completely)
> +{
> + struct thp_settings settings;
> + size_t pmdsize;
> +};
> +
> +FIXTURE_VARIANT(prctl_thp_disable_completely)
> +{
> + enum thp_enabled thp_policy;
> +};
> +
> +FIXTURE_VARIANT_ADD(prctl_thp_disable_completely, never)
> +{
> + .thp_policy = THP_NEVER,
> +};
> +
> +FIXTURE_VARIANT_ADD(prctl_thp_disable_completely, madvise)
> +{
> + .thp_policy = THP_MADVISE,
> +};
> +
> +FIXTURE_VARIANT_ADD(prctl_thp_disable_completely, always)
> +{
> + .thp_policy = THP_ALWAYS,
> +};
> +
Nice!
> +FIXTURE_SETUP(prctl_thp_disable_completely)
> +{
> + if (!thp_available())
> + SKIP(return, "Transparent Hugepages not available\n");
> +
> + self->pmdsize = read_pmd_pagesize();
> + if (!self->pmdsize)
> + SKIP(return, "Unable to read PMD size\n");
> +
> + if (prctl(PR_SET_THP_DISABLE, 1, NULL, NULL, NULL))
> + SKIP(return, "Unable to disable THPs completely for the process\n");
Hm, shouldn't this be a test failure?
> +
> + thp_save_settings();
> + thp_read_settings(&self->settings);
> + self->settings.thp_enabled = variant->thp_policy;
Ugh this variable name is horrid, not your fault. I see you've renamed it at
least in the variant field.
That's not one for this series though, one for a follow up.
> + self->settings.hugepages[sz2ord(self->pmdsize, getpagesize())].enabled = THP_INHERIT;
> + thp_write_settings(&self->settings);
> +}
> +
> +FIXTURE_TEARDOWN(prctl_thp_disable_completely)
> +{
> + thp_restore_settings();
> +}
> +
> +TEST_F(prctl_thp_disable_completely, nofork)
> +{
> + prctl_thp_disable_completely_test(_metadata, self->pmdsize, variant->thp_policy);
> +}
> +
> +TEST_F(prctl_thp_disable_completely, fork)
> +{
> + int ret = 0;
> + pid_t pid;
> +
> + /* Make sure prctl changes are carried across fork */
> + pid = fork();
> + ASSERT_GE(pid, 0);
> +
> + if (!pid)
> + prctl_thp_disable_completely_test(_metadata, self->pmdsize, variant->thp_policy);
> +
> + wait(&ret);
> + if (WIFEXITED(ret))
> + ret = WEXITSTATUS(ret);
> + else
> + ret = -EINVAL;
> + ASSERT_EQ(ret, 0);
> +}
> +
> +TEST_HARNESS_MAIN
> diff --git a/tools/testing/selftests/mm/thp_settings.c b/tools/testing/selftests/mm/thp_settings.c
> index bad60ac52874a..574bd0f8ae480 100644
> --- a/tools/testing/selftests/mm/thp_settings.c
> +++ b/tools/testing/selftests/mm/thp_settings.c
> @@ -382,10 +382,17 @@ unsigned long thp_shmem_supported_orders(void)
> return __thp_supported_orders(true);
> }
>
> -bool thp_is_enabled(void)
> +bool thp_available(void)
> {
> if (access(THP_SYSFS, F_OK) != 0)
> return false;
> + return true;
> +}
> +
> +bool thp_is_enabled(void)
> +{
> + if (!thp_available())
> + return false;
>
> int mode = thp_read_string("enabled", thp_enabled_strings);
>
> diff --git a/tools/testing/selftests/mm/thp_settings.h b/tools/testing/selftests/mm/thp_settings.h
> index 6c07f70beee97..76eeb712e5f10 100644
> --- a/tools/testing/selftests/mm/thp_settings.h
> +++ b/tools/testing/selftests/mm/thp_settings.h
> @@ -84,6 +84,7 @@ void thp_set_read_ahead_path(char *path);
> unsigned long thp_supported_orders(void);
> unsigned long thp_shmem_supported_orders(void);
>
> +bool thp_available(void);
> bool thp_is_enabled(void);
>
> #endif /* __THP_SETTINGS_H__ */
> --
> 2.47.3
>