* [PATCH v3 01/13] selftests/mm: restore default nr_hugepages value during cleanup in charge_reserved_hugetlb.sh
2026-03-27 7:15 [PATCH v3 00/13] selftests/mm: fix failures and robustness improvements Sayali Patil
@ 2026-03-27 7:15 ` Sayali Patil
2026-04-01 14:52 ` Sayali Patil
2026-03-27 7:15 ` [PATCH v3 02/13] selftests/mm: fix hugetlb pathname construction " Sayali Patil
` (12 subsequent siblings)
13 siblings, 1 reply; 45+ messages in thread
From: Sayali Patil @ 2026-03-27 7:15 UTC (permalink / raw)
To: Andrew Morton, Shuah Khan, linux-mm, linux-kernel,
linux-kselftest, Ritesh Harjani
Cc: David Hildenbrand, Zi Yan, Michal Hocko, Oscar Salvador,
Lorenzo Stoakes, Dev Jain, Liam.Howlett, linuxppc-dev,
Sayali Patil, Venkat Rao Bagalkote
During cleanup, the value of /proc/sys/vm/nr_hugepages is currently being
set to 0. At the end of the test, if all tests pass, the original
nr_hugepages value is restored. However, if any test fails, it remains
set to 0.
With this patch, we ensure that the original nr_hugepages value is
restored during cleanup, regardless of whether the test passes or fails.
Fixes: 7d695b1c3695b ("selftests/mm: save and restore nr_hugepages value")
Reviewed-by: Zi Yan <ziy@nvidia.com>
Reviewed-by: David Hildenbrand (Arm) <david@kernel.org>
Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
Signed-off-by: Sayali Patil <sayalip@linux.ibm.com>
---
tools/testing/selftests/mm/charge_reserved_hugetlb.sh | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/selftests/mm/charge_reserved_hugetlb.sh b/tools/testing/selftests/mm/charge_reserved_hugetlb.sh
index 447769657634..c9fe68b6fcf9 100755
--- a/tools/testing/selftests/mm/charge_reserved_hugetlb.sh
+++ b/tools/testing/selftests/mm/charge_reserved_hugetlb.sh
@@ -65,7 +65,7 @@ function cleanup() {
if [[ -e $cgroup_path/hugetlb_cgroup_test2 ]]; then
rmdir $cgroup_path/hugetlb_cgroup_test2
fi
- echo 0 >/proc/sys/vm/nr_hugepages
+ echo "$nr_hugepgs" > /proc/sys/vm/nr_hugepages
echo CLEANUP DONE
}
--
2.52.0
^ permalink raw reply related [flat|nested] 45+ messages in thread
* Re: [PATCH v3 01/13] selftests/mm: restore default nr_hugepages value during cleanup in charge_reserved_hugetlb.sh
2026-03-27 7:15 ` [PATCH v3 01/13] selftests/mm: restore default nr_hugepages value during cleanup in charge_reserved_hugetlb.sh Sayali Patil
@ 2026-04-01 14:52 ` Sayali Patil
2026-04-01 16:05 ` Sayali Patil
0 siblings, 1 reply; 45+ messages in thread
From: Sayali Patil @ 2026-04-01 14:52 UTC (permalink / raw)
To: Andrew Morton, Shuah Khan, linux-mm, linux-kernel,
linux-kselftest, Ritesh Harjani
Cc: David Hildenbrand, Zi Yan, Michal Hocko, Oscar Salvador,
Lorenzo Stoakes, Dev Jain, Liam.Howlett, linuxppc-dev,
Venkat Rao Bagalkote
[-- Attachment #1: Type: text/plain, Size: 2907 bytes --]
On 27/03/26 12:45, Sayali Patil wrote:
> During cleanup, the value of /proc/sys/vm/nr_hugepages is currently being
> set to 0. At the end of the test, if all tests pass, the original
> nr_hugepages value is restored. However, if any test fails, it remains
> set to 0.
> With this patch, we ensure that the original nr_hugepages value is
> restored during cleanup, regardless of whether the test passes or fails.
>
> Fixes: 7d695b1c3695b ("selftests/mm: save and restore nr_hugepages value")
> Reviewed-by: Zi Yan<ziy@nvidia.com>
> Reviewed-by: David Hildenbrand (Arm)<david@kernel.org>
> Tested-by: Venkat Rao Bagalkote<venkat88@linux.ibm.com>
> Signed-off-by: Sayali Patil<sayalip@linux.ibm.com>
> ---
> tools/testing/selftests/mm/charge_reserved_hugetlb.sh | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/tools/testing/selftests/mm/charge_reserved_hugetlb.sh b/tools/testing/selftests/mm/charge_reserved_hugetlb.sh
> index 447769657634..c9fe68b6fcf9 100755
> --- a/tools/testing/selftests/mm/charge_reserved_hugetlb.sh
> +++ b/tools/testing/selftests/mm/charge_reserved_hugetlb.sh
> @@ -65,7 +65,7 @@ function cleanup() {
> if [[ -e $cgroup_path/hugetlb_cgroup_test2 ]]; then
> rmdir $cgroup_path/hugetlb_cgroup_test2
> fi
> - echo 0 >/proc/sys/vm/nr_hugepages
> + echo "$nr_hugepgs" > /proc/sys/vm/nr_hugepages
> echo CLEANUP DONE
> }
>
AI review question:
Does this cause excessive memory allocation churn during test execution?
Since cleanup() is invoked before and after every test iteration, restoring
the original nr_hugepages here (which could be thousands of pages) and then
immediately setting it back to 10 in the next test case forces the kernel
to repeatedly allocate and free many hugepages for every single test case.
Also, does this reliably restore nr_hugepages on test failures?
The test script runs with "set -e" active. If a test fails while the
background write_to_hugetlbfs process is still running, commands earlier
in cleanup() like "rmdir /mnt/huge" can fail with EBUSY because the
directory is still a mounted filesystem.
Due to "set -e", this failure causes the script to immediately exit,
completely bypassing this restore command at the end of cleanup().
Would it be better to restore the original value once at the very end
of the script using an EXIT trap instead?
Yes, it is better to use an EXIT trap here to avoid unnecessary allocation
churn and to ensure the original value is reliably restored on all exit
paths.
A similar change can also be applied to hugetlb_reparenting_test.sh. The
test modifies nr_hugepages during execution and restores it from
cleanup(), while also reconfiguring it again in setup(), which is
invoked multiple times across the test flow. This can lead to repeated
allocation and freeing of hugepages.
I will prepare a patch for both tests and include it in v4.
Thanks,
Sayali
[-- Attachment #2: Type: text/html, Size: 18291 bytes --]
^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: [PATCH v3 01/13] selftests/mm: restore default nr_hugepages value during cleanup in charge_reserved_hugetlb.sh
2026-04-01 14:52 ` Sayali Patil
@ 2026-04-01 16:05 ` Sayali Patil
0 siblings, 0 replies; 45+ messages in thread
From: Sayali Patil @ 2026-04-01 16:05 UTC (permalink / raw)
To: Andrew Morton, Shuah Khan, linux-mm, linux-kernel,
linux-kselftest, Ritesh Harjani
Cc: David Hildenbrand, Zi Yan, Michal Hocko, Oscar Salvador,
Lorenzo Stoakes, Dev Jain, Liam.Howlett, linuxppc-dev,
Venkat Rao Bagalkote
[-- Attachment #1: Type: text/plain, Size: 3322 bytes --]
On 01/04/26 20:22, Sayali Patil wrote:
> On 27/03/26 12:45, Sayali Patil wrote:
>> During cleanup, the value of /proc/sys/vm/nr_hugepages is currently being
>> set to 0. At the end of the test, if all tests pass, the original
>> nr_hugepages value is restored. However, if any test fails, it remains
>> set to 0.
>> With this patch, we ensure that the original nr_hugepages value is
>> restored during cleanup, regardless of whether the test passes or fails.
>>
>> Fixes: 7d695b1c3695b ("selftests/mm: save and restore nr_hugepages value")
>> Reviewed-by: Zi Yan<ziy@nvidia.com>
>> Reviewed-by: David Hildenbrand (Arm)<david@kernel.org>
>> Tested-by: Venkat Rao Bagalkote<venkat88@linux.ibm.com>
>> Signed-off-by: Sayali Patil<sayalip@linux.ibm.com>
>> ---
>> tools/testing/selftests/mm/charge_reserved_hugetlb.sh | 2 +-
>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/tools/testing/selftests/mm/charge_reserved_hugetlb.sh b/tools/testing/selftests/mm/charge_reserved_hugetlb.sh
>> index 447769657634..c9fe68b6fcf9 100755
>> --- a/tools/testing/selftests/mm/charge_reserved_hugetlb.sh
>> +++ b/tools/testing/selftests/mm/charge_reserved_hugetlb.sh
>> @@ -65,7 +65,7 @@ function cleanup() {
>> if [[ -e $cgroup_path/hugetlb_cgroup_test2 ]]; then
>> rmdir $cgroup_path/hugetlb_cgroup_test2
>> fi
>> - echo 0 >/proc/sys/vm/nr_hugepages
>> + echo "$nr_hugepgs" > /proc/sys/vm/nr_hugepages
>> echo CLEANUP DONE
>> }
>>
Previous reply was not formatted properly so sending it again.
AI review question
> Does this cause excessive memory allocation churn during test execution?
> Since cleanup() is invoked before and after every test iteration, restoring
> the original nr_hugepages here (which could be thousands of pages) and then
> immediately setting it back to 10 in the next test case forces the kernel
> to repeatedly allocate and free many hugepages for every single test case.
> Also, does this reliably restore nr_hugepages on test failures?
> The test script runs with "set -e" active. If a test fails while the
> background write_to_hugetlbfs process is still running, commands earlier
> in cleanup() like "rmdir /mnt/huge" can fail with EBUSY because the
> directory is still a mounted filesystem.
> Due to "set -e", this failure causes the script to immediately exit,
> completely bypassing this restore command at the end of cleanup().
> Would it be better to restore the original value once at the very end
> of the script using an EXIT trap instead?
>
Yes, it is better to use an EXIT trap here to avoid unnecessary allocation
churn and to ensure the original value is reliably restored on all exit
paths.
A similar change can also be applied to hugetlb_reparenting_test.sh. The
test modifies nr_hugepages during execution and restores it from
cleanup(), while also reconfiguring it again in setup(), which is
invoked multiple times across the test flow. This can lead to repeated
allocation and freeing of hugepages.
I will prepare a patch for both tests and include it in v4.
Thanks,
Sayali
[-- Attachment #2: Type: text/html, Size: 6719 bytes --]
^ permalink raw reply [flat|nested] 45+ messages in thread
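The EXIT-trap approach discussed in the reply above could look roughly
like the following. This is a minimal sketch under stated assumptions,
not the v4 patch: save_and_restore_on_exit and nr_hugepages_file are
hypothetical names, and the path is made overridable only so the sketch
can be tried on a scratch file without root.

```bash
#!/bin/bash
# Sketch only: restore nr_hugepages once, from an EXIT trap, instead of
# from cleanup(). All names here are illustrative assumptions.
nr_hugepages_file="${NR_HUGEPAGES_FILE:-/proc/sys/vm/nr_hugepages}"

save_and_restore_on_exit() {
    # Remember the value that was configured before the test started.
    nr_hugepgs=$(cat "$nr_hugepages_file")
    # Single quotes defer expansion until the trap fires. The trap runs
    # on every exit path of the shell, including an abort from "set -e".
    trap 'echo "$nr_hugepgs" > "$nr_hugepages_file"' EXIT
}

# Intended use near the top of a test script:
#   save_and_restore_on_exit
#   echo 10 > "$nr_hugepages_file"   # test-specific value
```

With this shape, cleanup() would no longer need to touch nr_hugepages,
so the value is written back once per run rather than once per test
case.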
* [PATCH v3 02/13] selftests/mm: fix hugetlb pathname construction in charge_reserved_hugetlb.sh
2026-03-27 7:15 [PATCH v3 00/13] selftests/mm: fix failures and robustness improvements Sayali Patil
2026-03-27 7:15 ` [PATCH v3 01/13] selftests/mm: restore default nr_hugepages value during cleanup in charge_reserved_hugetlb.sh Sayali Patil
@ 2026-03-27 7:15 ` Sayali Patil
2026-04-01 14:06 ` David Hildenbrand (Arm)
2026-03-27 7:15 ` [PATCH v3 03/13] selftests/mm: fix hugetlb pathname construction in hugetlb_reparenting_test.sh Sayali Patil
` (11 subsequent siblings)
13 siblings, 1 reply; 45+ messages in thread
From: Sayali Patil @ 2026-03-27 7:15 UTC (permalink / raw)
To: Andrew Morton, Shuah Khan, linux-mm, linux-kernel,
linux-kselftest, Ritesh Harjani
Cc: David Hildenbrand, Zi Yan, Michal Hocko, Oscar Salvador,
Lorenzo Stoakes, Dev Jain, Liam.Howlett, linuxppc-dev,
Sayali Patil, Venkat Rao Bagalkote
The charge_reserved_hugetlb.sh script assumes hugetlb cgroup memory
interface file names use the "<size>MB" format
(e.g. hugetlb.1024MB.current).
This assumption breaks on systems with larger huge pages such as 1GB,
where the kernel exposes normalized units:
hugetlb.1GB.current
hugetlb.1GB.max
hugetlb.1GB.rsvd.max
...
As a result, the script attempts to access files like
hugetlb.1024MB.current, which do not exist when the kernel reports the
size in GB.
Normalize the huge page size and construct the pathname using the
appropriate unit (MB or GB), matching the hugetlb controller naming.
Fixes: 209376ed2a84 ("selftests/vm: make charge_reserved_hugetlb.sh work with existing cgroup setting")
Fixes: 29750f71a9b4 ("hugetlb_cgroup: add hugetlb_cgroup reservation tests")
Reviewed-by: Zi Yan <ziy@nvidia.com>
Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
Signed-off-by: Sayali Patil <sayalip@linux.ibm.com>
---
.../selftests/mm/charge_reserved_hugetlb.sh | 42 +++++++++++++------
1 file changed, 29 insertions(+), 13 deletions(-)
diff --git a/tools/testing/selftests/mm/charge_reserved_hugetlb.sh b/tools/testing/selftests/mm/charge_reserved_hugetlb.sh
index c9fe68b6fcf9..6bec53e16e05 100755
--- a/tools/testing/selftests/mm/charge_reserved_hugetlb.sh
+++ b/tools/testing/selftests/mm/charge_reserved_hugetlb.sh
@@ -89,6 +89,15 @@ function get_machine_hugepage_size() {
}
MB=$(get_machine_hugepage_size)
+if (( MB >= 1024 )); then
+ # For 1GB hugepages
+ UNIT="GB"
+ MB_DISPLAY=$((MB / 1024))
+else
+ # For 2MB hugepages
+ UNIT="MB"
+ MB_DISPLAY=$MB
+fi
function setup_cgroup() {
local name="$1"
@@ -98,11 +107,12 @@ function setup_cgroup() {
mkdir $cgroup_path/$name
echo writing cgroup limit: "$cgroup_limit"
- echo "$cgroup_limit" >$cgroup_path/$name/hugetlb.${MB}MB.$fault_limit_file
+ echo "$cgroup_limit" > \
+ $cgroup_path/$name/hugetlb.${MB_DISPLAY}${UNIT}.$fault_limit_file
echo writing reservation limit: "$reservation_limit"
echo "$reservation_limit" > \
- $cgroup_path/$name/hugetlb.${MB}MB.$reservation_limit_file
+ $cgroup_path/$name/hugetlb.${MB_DISPLAY}${UNIT}.$reservation_limit_file
if [ -e "$cgroup_path/$name/cpuset.cpus" ]; then
echo 0 >$cgroup_path/$name/cpuset.cpus
@@ -137,7 +147,7 @@ function wait_for_file_value() {
function wait_for_hugetlb_memory_to_get_depleted() {
local cgroup="$1"
- local path="$cgroup_path/$cgroup/hugetlb.${MB}MB.$reservation_usage_file"
+ local path="$cgroup_path/$cgroup/hugetlb.${MB_DISPLAY}${UNIT}.$reservation_usage_file"
wait_for_file_value "$path" "0"
}
@@ -145,7 +155,7 @@ function wait_for_hugetlb_memory_to_get_depleted() {
function wait_for_hugetlb_memory_to_get_reserved() {
local cgroup="$1"
local size="$2"
- local path="$cgroup_path/$cgroup/hugetlb.${MB}MB.$reservation_usage_file"
+ local path="$cgroup_path/$cgroup/hugetlb.${MB_DISPLAY}${UNIT}.$reservation_usage_file"
wait_for_file_value "$path" "$size"
}
@@ -153,7 +163,7 @@ function wait_for_hugetlb_memory_to_get_reserved() {
function wait_for_hugetlb_memory_to_get_written() {
local cgroup="$1"
local size="$2"
- local path="$cgroup_path/$cgroup/hugetlb.${MB}MB.$fault_usage_file"
+ local path="$cgroup_path/$cgroup/hugetlb.${MB_DISPLAY}${UNIT}.$fault_usage_file"
wait_for_file_value "$path" "$size"
}
@@ -175,8 +185,8 @@ function write_hugetlbfs_and_get_usage() {
hugetlb_difference=0
reserved_difference=0
- local hugetlb_usage=$cgroup_path/$cgroup/hugetlb.${MB}MB.$fault_usage_file
- local reserved_usage=$cgroup_path/$cgroup/hugetlb.${MB}MB.$reservation_usage_file
+ local hugetlb_usage=$cgroup_path/$cgroup/hugetlb.${MB_DISPLAY}${UNIT}.$fault_usage_file
+ local reserved_usage=$cgroup_path/$cgroup/hugetlb.${MB_DISPLAY}${UNIT}.$reservation_usage_file
local hugetlb_before=$(cat $hugetlb_usage)
local reserved_before=$(cat $reserved_usage)
@@ -307,8 +317,10 @@ function run_test() {
cleanup_hugetlb_memory "hugetlb_cgroup_test"
- local final_hugetlb=$(cat $cgroup_path/hugetlb_cgroup_test/hugetlb.${MB}MB.$fault_usage_file)
- local final_reservation=$(cat $cgroup_path/hugetlb_cgroup_test/hugetlb.${MB}MB.$reservation_usage_file)
+ local final_hugetlb=$(cat \
+ $cgroup_path/hugetlb_cgroup_test/hugetlb.${MB_DISPLAY}${UNIT}.$fault_usage_file)
+ local final_reservation=$(cat \
+ $cgroup_path/hugetlb_cgroup_test/hugetlb.${MB_DISPLAY}${UNIT}.$reservation_usage_file)
echo $hugetlb_difference
echo $reserved_difference
@@ -364,10 +376,14 @@ function run_multiple_cgroup_test() {
reservation_failed1=$reservation_failed
oom_killed1=$oom_killed
- local cgroup1_hugetlb_usage=$cgroup_path/hugetlb_cgroup_test1/hugetlb.${MB}MB.$fault_usage_file
- local cgroup1_reservation_usage=$cgroup_path/hugetlb_cgroup_test1/hugetlb.${MB}MB.$reservation_usage_file
- local cgroup2_hugetlb_usage=$cgroup_path/hugetlb_cgroup_test2/hugetlb.${MB}MB.$fault_usage_file
- local cgroup2_reservation_usage=$cgroup_path/hugetlb_cgroup_test2/hugetlb.${MB}MB.$reservation_usage_file
+ local cgroup1_hugetlb_usage=\
+ $cgroup_path/hugetlb_cgroup_test1/hugetlb.${MB_DISPLAY}${UNIT}.$fault_usage_file
+ local cgroup1_reservation_usage=\
+ $cgroup_path/hugetlb_cgroup_test1/hugetlb.${MB_DISPLAY}${UNIT}.$reservation_usage_file
+ local cgroup2_hugetlb_usage=\
+ $cgroup_path/hugetlb_cgroup_test2/hugetlb.${MB_DISPLAY}${UNIT}.$fault_usage_file
+ local cgroup2_reservation_usage=\
+ $cgroup_path/hugetlb_cgroup_test2/hugetlb.${MB_DISPLAY}${UNIT}.$reservation_usage_file
local usage_before_second_write=$(cat $cgroup1_hugetlb_usage)
local reservation_usage_before_second_write=$(cat $cgroup1_reservation_usage)
--
2.52.0
^ permalink raw reply related [flat|nested] 45+ messages in thread
* Re: [PATCH v3 02/13] selftests/mm: fix hugetlb pathname construction in charge_reserved_hugetlb.sh
2026-03-27 7:15 ` [PATCH v3 02/13] selftests/mm: fix hugetlb pathname construction " Sayali Patil
@ 2026-04-01 14:06 ` David Hildenbrand (Arm)
0 siblings, 0 replies; 45+ messages in thread
From: David Hildenbrand (Arm) @ 2026-04-01 14:06 UTC (permalink / raw)
To: Sayali Patil, Andrew Morton, Shuah Khan, linux-mm, linux-kernel,
linux-kselftest, Ritesh Harjani
Cc: Zi Yan, Michal Hocko, Oscar Salvador, Lorenzo Stoakes, Dev Jain,
Liam.Howlett, linuxppc-dev, Venkat Rao Bagalkote
On 3/27/26 08:15, Sayali Patil wrote:
> The charge_reserved_hugetlb.sh script assumes hugetlb cgroup memory
> interface file names use the "<size>MB" format
> (e.g. hugetlb.1024MB.current).
> This assumption breaks on systems with larger huge pages such as 1GB,
> where the kernel exposes normalized units:
> hugetlb.1GB.current
> hugetlb.1GB.max
> hugetlb.1GB.rsvd.max
> ...
>
> As a result, the script attempts to access files like
> hugetlb.1024MB.current, which do not exist when the kernel reports the
> size in GB.
>
> Normalize the huge page size and construct the pathname using the
> appropriate unit (MB or GB), matching the hugetlb controller naming.
>
> Fixes: 209376ed2a84 ("selftests/vm: make charge_reserved_hugetlb.sh work with existing cgroup setting")
> Fixes: 29750f71a9b4 ("hugetlb_cgroup: add hugetlb_cgroup reservation tests")
> Reviewed-by: Zi Yan <ziy@nvidia.com>
> Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
> Signed-off-by: Sayali Patil <sayalip@linux.ibm.com>
> ---
Acked-by: David Hildenbrand (Arm) <david@kernel.org>
--
Cheers,
David
^ permalink raw reply [flat|nested] 45+ messages in thread
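The MB/GB selection in the patch above can be condensed into a single
helper, which may make the rule easier to check in isolation. This is
only an illustration: hugetlb_size_token is a hypothetical name that
does not appear in the patch, and it applies the same ">= 1024 MB means
GB" rule the patch uses.

```bash
#!/bin/bash
# Hypothetical helper: map a huge page size given in MB to the
# "<size><unit>" token used in hugetlb cgroup file names, following the
# MB/GB rule from the patch.
hugetlb_size_token() {
    local mb="$1"
    if (( mb >= 1024 )); then
        # 1024 MB and above is reported in GB, e.g. 1024 -> 1GB
        echo "$((mb / 1024))GB"
    else
        # Smaller sizes stay in MB, e.g. 2 -> 2MB
        echo "${mb}MB"
    fi
}

# Example: build a path such as .../hugetlb.1GB.current
#   path="$cgroup_path/$name/hugetlb.$(hugetlb_size_token "$MB").current"
```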
* [PATCH v3 03/13] selftests/mm: fix hugetlb pathname construction in hugetlb_reparenting_test.sh
2026-03-27 7:15 [PATCH v3 00/13] selftests/mm: fix failures and robustness improvements Sayali Patil
2026-03-27 7:15 ` [PATCH v3 01/13] selftests/mm: restore default nr_hugepages value during cleanup in charge_reserved_hugetlb.sh Sayali Patil
2026-03-27 7:15 ` [PATCH v3 02/13] selftests/mm: fix hugetlb pathname construction " Sayali Patil
@ 2026-03-27 7:15 ` Sayali Patil
2026-04-01 14:06 ` David Hildenbrand (Arm)
2026-03-27 7:15 ` [PATCH v3 04/13] selftest/mm: fix cgroup task placement and drop memory.current checks " Sayali Patil
` (10 subsequent siblings)
13 siblings, 1 reply; 45+ messages in thread
From: Sayali Patil @ 2026-03-27 7:15 UTC (permalink / raw)
To: Andrew Morton, Shuah Khan, linux-mm, linux-kernel,
linux-kselftest, Ritesh Harjani
Cc: David Hildenbrand, Zi Yan, Michal Hocko, Oscar Salvador,
Lorenzo Stoakes, Dev Jain, Liam.Howlett, linuxppc-dev,
Sayali Patil, Venkat Rao Bagalkote
The hugetlb_reparenting_test.sh script constructs hugetlb cgroup
memory interface file names based on the configured huge page size. The
script formats the size only in MB units, which causes mismatches on
systems using larger huge pages where the kernel exposes normalized
units (e.g. "1GB" instead of "1024MB").
As a result, the test fails to locate the corresponding cgroup files
when 1GB huge pages are configured.
Update the script to detect the huge page size and select the
appropriate unit (MB or GB) so that the constructed paths match the
kernel's hugetlb controller naming.
Also print an explicit "FAIL" message when a test failure occurs to
improve result visibility.
Fixes: e487a5d513cb ("selftest/mm: make hugetlb_reparenting_test tolerant to async reparenting")
Reviewed-by: Zi Yan <ziy@nvidia.com>
Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
Signed-off-by: Sayali Patil <sayalip@linux.ibm.com>
---
.../selftests/mm/hugetlb_reparenting_test.sh | 14 ++++++++++++--
1 file changed, 12 insertions(+), 2 deletions(-)
diff --git a/tools/testing/selftests/mm/hugetlb_reparenting_test.sh b/tools/testing/selftests/mm/hugetlb_reparenting_test.sh
index 0dd31892ff67..073a71fa36b4 100755
--- a/tools/testing/selftests/mm/hugetlb_reparenting_test.sh
+++ b/tools/testing/selftests/mm/hugetlb_reparenting_test.sh
@@ -46,6 +46,13 @@ function get_machine_hugepage_size() {
}
MB=$(get_machine_hugepage_size)
+if (( MB >= 1024 )); then
+ UNIT="GB"
+ MB_DISPLAY=$((MB / 1024))
+else
+ UNIT="MB"
+ MB_DISPLAY=$MB
+fi
function cleanup() {
echo cleanup
@@ -87,6 +94,7 @@ function assert_with_retry() {
if [[ $elapsed -ge $timeout ]]; then
echo "actual = $((${actual%% *} / 1024 / 1024)) MB"
echo "expected = $((${expected%% *} / 1024 / 1024)) MB"
+ echo FAIL
cleanup
exit 1
fi
@@ -107,11 +115,13 @@ function assert_state() {
fi
assert_with_retry "$CGROUP_ROOT/a/memory.$usage_file" "$expected_a"
- assert_with_retry "$CGROUP_ROOT/a/hugetlb.${MB}MB.$usage_file" "$expected_a_hugetlb"
+ assert_with_retry \
+ "$CGROUP_ROOT/a/hugetlb.${MB_DISPLAY}${UNIT}.$usage_file" "$expected_a_hugetlb"
if [[ -n "$expected_b" && -n "$expected_b_hugetlb" ]]; then
assert_with_retry "$CGROUP_ROOT/a/b/memory.$usage_file" "$expected_b"
- assert_with_retry "$CGROUP_ROOT/a/b/hugetlb.${MB}MB.$usage_file" "$expected_b_hugetlb"
+ assert_with_retry \
+ "$CGROUP_ROOT/a/b/hugetlb.${MB_DISPLAY}${UNIT}.$usage_file" "$expected_b_hugetlb"
fi
}
--
2.52.0
^ permalink raw reply related [flat|nested] 45+ messages in thread
* Re: [PATCH v3 03/13] selftests/mm: fix hugetlb pathname construction in hugetlb_reparenting_test.sh
2026-03-27 7:15 ` [PATCH v3 03/13] selftests/mm: fix hugetlb pathname construction in hugetlb_reparenting_test.sh Sayali Patil
@ 2026-04-01 14:06 ` David Hildenbrand (Arm)
0 siblings, 0 replies; 45+ messages in thread
From: David Hildenbrand (Arm) @ 2026-04-01 14:06 UTC (permalink / raw)
To: Sayali Patil, Andrew Morton, Shuah Khan, linux-mm, linux-kernel,
linux-kselftest, Ritesh Harjani
Cc: Zi Yan, Michal Hocko, Oscar Salvador, Lorenzo Stoakes, Dev Jain,
Liam.Howlett, linuxppc-dev, Venkat Rao Bagalkote
On 3/27/26 08:15, Sayali Patil wrote:
> The hugetlb_reparenting_test.sh script constructs hugetlb cgroup
> memory interface file names based on the configured huge page size. The
> script formats the size only in MB units, which causes mismatches on
> systems using larger huge pages where the kernel exposes normalized
> units (e.g. "1GB" instead of "1024MB").
>
> As a result, the test fails to locate the corresponding cgroup files
> when 1GB huge pages are configured.
>
> Update the script to detect the huge page size and select the
> appropriate unit (MB or GB) so that the constructed paths match the
> kernel's hugetlb controller naming.
>
> Also print an explicit "FAIL" message when a test failure occurs to
> improve result visibility.
>
> Fixes: e487a5d513cb ("selftest/mm: make hugetlb_reparenting_test tolerant to async reparenting")
> Reviewed-by: Zi Yan <ziy@nvidia.com>
> Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
> Signed-off-by: Sayali Patil <sayalip@linux.ibm.com>
> ---
Acked-by: David Hildenbrand (Arm) <david@kernel.org>
--
Cheers,
David
^ permalink raw reply [flat|nested] 45+ messages in thread
* [PATCH v3 04/13] selftest/mm: fix cgroup task placement and drop memory.current checks in hugetlb_reparenting_test.sh
2026-03-27 7:15 [PATCH v3 00/13] selftests/mm: fix failures and robustness improvements Sayali Patil
` (2 preceding siblings ...)
2026-03-27 7:15 ` [PATCH v3 03/13] selftests/mm: fix hugetlb pathname construction in hugetlb_reparenting_test.sh Sayali Patil
@ 2026-03-27 7:15 ` Sayali Patil
2026-04-01 14:08 ` David Hildenbrand (Arm)
2026-03-27 7:15 ` [PATCH v3 05/13] selftests/mm: size tmpfs according to PMD page size in split_huge_page_test Sayali Patil
` (9 subsequent siblings)
13 siblings, 1 reply; 45+ messages in thread
From: Sayali Patil @ 2026-03-27 7:15 UTC (permalink / raw)
To: Andrew Morton, Shuah Khan, linux-mm, linux-kernel,
linux-kselftest, Ritesh Harjani
Cc: David Hildenbrand, Zi Yan, Michal Hocko, Oscar Salvador,
Lorenzo Stoakes, Dev Jain, Liam.Howlett, linuxppc-dev,
Sayali Patil
Launch write_to_hugetlbfs as a separate process and move only its PID
into the target cgroup before waiting for completion. This avoids moving
the test shell itself, prevents unintended charging to the shell, and
ensures hugetlb and memcg accounting is attributed only to the intended
workload.
Add a short delay before the hugetlb allocation to avoid a race where
memory may be charged before the task migration takes effect, which
can lead to incorrect accounting and intermittent test failures.
The test currently validates both hugetlb usage and memory.current.
However, memory.current includes internal memcg allocations and
per-CPU batched accounting (MEMCG_CHARGE_BATCH), which are not
synchronized and can vary across systems, leading to
non-deterministic results.
Since hugetlb memory is accounted via hugetlb.<size>.current,
memory.current is not a reliable indicator here. Drop memory.current
checks and rely only on hugetlb controller statistics for stable
and accurate validation.
Fixes: 29750f71a9b4 ("hugetlb_cgroup: add hugetlb_cgroup reservation tests")
Signed-off-by: Sayali Patil <sayalip@linux.ibm.com>
---
.../selftests/mm/hugetlb_reparenting_test.sh | 42 ++++++++-----------
.../testing/selftests/mm/write_to_hugetlbfs.c | 5 ++-
2 files changed, 22 insertions(+), 25 deletions(-)
diff --git a/tools/testing/selftests/mm/hugetlb_reparenting_test.sh b/tools/testing/selftests/mm/hugetlb_reparenting_test.sh
index 073a71fa36b4..1e87ac67d43e 100755
--- a/tools/testing/selftests/mm/hugetlb_reparenting_test.sh
+++ b/tools/testing/selftests/mm/hugetlb_reparenting_test.sh
@@ -104,22 +104,17 @@ function assert_with_retry() {
}
function assert_state() {
- local expected_a="$1"
- local expected_a_hugetlb="$2"
- local expected_b=""
+ local expected_a_hugetlb="$1"
local expected_b_hugetlb=""
- if [ ! -z ${3:-} ] && [ ! -z ${4:-} ]; then
- expected_b="$3"
- expected_b_hugetlb="$4"
+ if [ ! -z ${2:-} ]; then
+ expected_b_hugetlb="$2"
fi
- assert_with_retry "$CGROUP_ROOT/a/memory.$usage_file" "$expected_a"
assert_with_retry \
"$CGROUP_ROOT/a/hugetlb.${MB_DISPLAY}${UNIT}.$usage_file" "$expected_a_hugetlb"
- if [[ -n "$expected_b" && -n "$expected_b_hugetlb" ]]; then
- assert_with_retry "$CGROUP_ROOT/a/b/memory.$usage_file" "$expected_b"
+ if [[ -n "$expected_b_hugetlb" ]]; then
assert_with_retry \
"$CGROUP_ROOT/a/b/hugetlb.${MB_DISPLAY}${UNIT}.$usage_file" "$expected_b_hugetlb"
fi
@@ -153,18 +148,17 @@ write_hugetlbfs() {
local size="$3"
if [[ $cgroup2 ]]; then
- echo $$ >$CGROUP_ROOT/$cgroup/cgroup.procs
+ cg_file="$CGROUP_ROOT/$cgroup/cgroup.procs"
else
echo 0 >$CGROUP_ROOT/$cgroup/cpuset.mems
echo 0 >$CGROUP_ROOT/$cgroup/cpuset.cpus
- echo $$ >"$CGROUP_ROOT/$cgroup/tasks"
- fi
- ./write_to_hugetlbfs -p "$path" -s "$size" -m 0 -o
- if [[ $cgroup2 ]]; then
- echo $$ >$CGROUP_ROOT/cgroup.procs
- else
- echo $$ >"$CGROUP_ROOT/tasks"
+ cg_file="$CGROUP_ROOT/$cgroup/tasks"
fi
+
+ # Spawn write_to_hugetlbfs in a separate task to ensure correct cgroup accounting
+ ./write_to_hugetlbfs -p "$path" -s "$size" -m 0 -o -d & pid=$!
+ echo "$pid" > "$cg_file"
+ wait "$pid"
echo
}
@@ -202,21 +196,21 @@ if [[ ! $cgroup2 ]]; then
write_hugetlbfs a "$MNT"/test $size
echo Assert memory charged correctly for parent use.
- assert_state 0 $size 0 0
+ assert_state $size 0
write_hugetlbfs a/b "$MNT"/test2 $size
echo Assert memory charged correctly for child use.
- assert_state 0 $(($size * 2)) 0 $size
+ assert_state $(($size * 2)) $size
rmdir "$CGROUP_ROOT"/a/b
echo Assert memory reparent correctly.
- assert_state 0 $(($size * 2))
+ assert_state $(($size * 2))
rm -rf "$MNT"/*
umount "$MNT"
echo Assert memory uncharged correctly.
- assert_state 0 0
+ assert_state 0
cleanup
fi
@@ -230,16 +224,16 @@ echo write
write_hugetlbfs a/b "$MNT"/test2 $size
echo Assert memory charged correctly for child only use.
-assert_state 0 $(($size)) 0 $size
+assert_state $(($size)) $size
rmdir "$CGROUP_ROOT"/a/b
echo Assert memory reparent correctly.
-assert_state 0 $size
+assert_state $size
rm -rf "$MNT"/*
umount "$MNT"
echo Assert memory uncharged correctly.
-assert_state 0 0
+assert_state 0
cleanup
diff --git a/tools/testing/selftests/mm/write_to_hugetlbfs.c b/tools/testing/selftests/mm/write_to_hugetlbfs.c
index ecb5f7619960..6b01b0485bd0 100644
--- a/tools/testing/selftests/mm/write_to_hugetlbfs.c
+++ b/tools/testing/selftests/mm/write_to_hugetlbfs.c
@@ -83,7 +83,7 @@ int main(int argc, char **argv)
setvbuf(stdout, NULL, _IONBF, 0);
self = argv[0];
- while ((c = getopt(argc, argv, "s:p:m:owlrn")) != -1) {
+ while ((c = getopt(argc, argv, "s:p:m:owlrnd")) != -1) {
switch (c) {
case 's':
if (sscanf(optarg, "%zu", &size) != 1) {
@@ -118,6 +118,9 @@ int main(int argc, char **argv)
case 'n':
reserve = 0;
break;
+ case 'd':
+ sleep(1);
+ break;
default:
errno = EINVAL;
perror("Invalid arg");
--
2.52.0
^ permalink raw reply related [flat|nested] 45+ messages in thread
* Re: [PATCH v3 04/13] selftest/mm: fix cgroup task placement and drop memory.current checks in hugetlb_reparenting_test.sh
2026-03-27 7:15 ` [PATCH v3 04/13] selftest/mm: fix cgroup task placement and drop memory.current checks " Sayali Patil
@ 2026-04-01 14:08 ` David Hildenbrand (Arm)
2026-04-03 19:59 ` Sayali Patil
0 siblings, 1 reply; 45+ messages in thread
From: David Hildenbrand (Arm) @ 2026-04-01 14:08 UTC (permalink / raw)
To: Sayali Patil, Andrew Morton, Shuah Khan, linux-mm, linux-kernel,
linux-kselftest, Ritesh Harjani
Cc: Zi Yan, Michal Hocko, Oscar Salvador, Lorenzo Stoakes, Dev Jain,
Liam.Howlett, linuxppc-dev
On 3/27/26 08:15, Sayali Patil wrote:
> Launch write_to_hugetlbfs as a separate process and move only its PID
> into the target cgroup before waiting for completion. This avoids moving
> the test shell itself, prevents unintended charging to the shell, and
> ensures hugetlb and memcg accounting is attributed only to the intended
> workload.
>
> Add a short delay before the hugetlb allocation to avoid a race where
> memory may be charged before the task migration takes effect, which
> can lead to incorrect accounting and intermittent test failures.
Isn't there still a chance for a race, for example, when running in a VM?
--
Cheers,
David
^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: [PATCH v3 04/13] selftest/mm: fix cgroup task placement and drop memory.current checks in hugetlb_reparenting_test.sh
2026-04-01 14:08 ` David Hildenbrand (Arm)
@ 2026-04-03 19:59 ` Sayali Patil
0 siblings, 0 replies; 45+ messages in thread
From: Sayali Patil @ 2026-04-03 19:59 UTC (permalink / raw)
To: David Hildenbrand (Arm), Andrew Morton, Shuah Khan, linux-mm,
linux-kernel, linux-kselftest, Ritesh Harjani
Cc: Zi Yan, Michal Hocko, Oscar Salvador, Lorenzo Stoakes, Dev Jain,
Liam.Howlett, linuxppc-dev
On 01/04/26 19:38, David Hildenbrand (Arm) wrote:
> On 3/27/26 08:15, Sayali Patil wrote:
>> Launch write_to_hugetlbfs as a separate process and move only its PID
>> into the target cgroup before waiting for completion. This avoids moving
>> the test shell itself, prevents unintended charging to the shell, and
>> ensures hugetlb and memcg accounting is attributed only to the intended
>> workload.
>>
>> Add a short delay before the hugetlb allocation to avoid a race where
>> memory may be charged before the task migration takes effect, which
>> can lead to incorrect accounting and intermittent test failures.
>
> Isn't there still a chance for a race, for example, when running in a VM?
>
Yes, there is still a small race window in the current approach.
I am looking into making this more reliable with a deterministic
synchronization mechanism to avoid such timing dependencies.
I will send a v4 with this improvement.
Thanks,
Sayali
^ permalink raw reply [flat|nested] 45+ messages in thread
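One possible way to remove the timing dependency raised in this
sub-thread is to let the child join the cgroup itself before it execs
the workload, so no sleep is needed. The sketch below is an assumption
about how that could be done, not the mechanism planned for v4;
run_in_cgroup is a hypothetical helper name.

```bash
#!/bin/bash
# Hypothetical helper: run a command inside a cgroup without relying on
# a delay. The subshell writes its own PID into the cgroup file first
# and only then exec()s the workload. exec keeps the PID, so the
# workload is already in the target cgroup when it starts running.
run_in_cgroup() {
    local cg_file="$1"   # e.g. $CGROUP_ROOT/$cgroup/cgroup.procs
    shift
    (
        # $BASHPID is the PID of this subshell, not of the parent shell.
        echo "$BASHPID" > "$cg_file" || exit 1
        exec "$@"
    ) &
    # Propagate the workload's exit status to the caller.
    wait "$!"
}

# Example (path, size and options taken from the test script):
#   run_in_cgroup "$cg_file" ./write_to_hugetlbfs -p "$path" -s "$size" -m 0 -o
```

The test shell itself never moves, which matches the intent of the
patch, and the ordering no longer depends on how quickly the child is
scheduled.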
* [PATCH v3 05/13] selftests/mm: size tmpfs according to PMD page size in split_huge_page_test
2026-03-27 7:15 [PATCH v3 00/13] selftests/mm: fix failures and robustness improvements Sayali Patil
` (3 preceding siblings ...)
2026-03-27 7:15 ` [PATCH v3 04/13] selftest/mm: fix cgroup task placement and drop memory.current checks " Sayali Patil
@ 2026-03-27 7:15 ` Sayali Patil
2026-04-01 16:20 ` Sayali Patil
2026-03-27 7:16 ` [PATCH v3 06/13] selftest/mm: adjust hugepage-mremap test size for large huge pages Sayali Patil
` (8 subsequent siblings)
13 siblings, 1 reply; 45+ messages in thread
From: Sayali Patil @ 2026-03-27 7:15 UTC (permalink / raw)
To: Andrew Morton, Shuah Khan, linux-mm, linux-kernel,
linux-kselftest, Ritesh Harjani
Cc: David Hildenbrand, Zi Yan, Michal Hocko, Oscar Salvador,
Lorenzo Stoakes, Dev Jain, Liam.Howlett, linuxppc-dev,
Sayali Patil, Venkat Rao Bagalkote
The split_file_backed_thp() test mounts a tmpfs with a fixed size of
"4m". This works on systems with smaller PMD page sizes,
but fails on configurations where the PMD huge page size is
larger (e.g. 16MB).
On such systems, the fixed 4MB tmpfs is insufficient to allocate even
a single PMD-sized THP, causing the test to fail.
Fix this by sizing the tmpfs dynamically based on the runtime
pmd_pagesize, allocating space for two PMD-sized pages.
Before patch:
running ./split_huge_page_test /tmp/xfs_dir_YTrI5E
--------------------------------------------------
TAP version 13
1..55
ok 1 Split zero filled huge pages successful
ok 2 Split huge pages to order 0 successful
ok 3 Split huge pages to order 2 successful
ok 4 Split huge pages to order 3 successful
ok 5 Split huge pages to order 4 successful
ok 6 Split huge pages to order 5 successful
ok 7 Split huge pages to order 6 successful
ok 8 Split huge pages to order 7 successful
ok 9 Split PTE-mapped huge pages successful
Please enable pr_debug in split_huge_pages_in_file() for more info.
Failed to write data to testing file: Success (0)
Bail out! Error occurred
Planned tests != run tests (55 != 9)
Totals: pass:9 fail:0 xfail:0 xpass:0 skip:0 error:0
[FAIL]
After patch:
--------------------------------------------------
running ./split_huge_page_test /tmp/xfs_dir_bMvj6o
--------------------------------------------------
TAP version 13
1..55
ok 1 Split zero filled huge pages successful
ok 2 Split huge pages to order 0 successful
ok 3 Split huge pages to order 2 successful
ok 4 Split huge pages to order 3 successful
ok 5 Split huge pages to order 4 successful
ok 6 Split huge pages to order 5 successful
ok 7 Split huge pages to order 6 successful
ok 8 Split huge pages to order 7 successful
ok 9 Split PTE-mapped huge pages successful
Please enable pr_debug in split_huge_pages_in_file() for more info.
Please check dmesg for more information
ok 10 File-backed THP split to order 0 test done
Please enable pr_debug in split_huge_pages_in_file() for more info.
Please check dmesg for more information
ok 11 File-backed THP split to order 1 test done
Please enable pr_debug in split_huge_pages_in_file() for more info.
Please check dmesg for more information
ok 12 File-backed THP split to order 2 test done
...
ok 55 Split PMD-mapped pagecache folio to order 7 at
in-folio offset 128 passed
Totals: pass:55 fail:0 xfail:0 xpass:0 skip:0 error:0
[PASS]
ok 1 split_huge_page_test /tmp/xfs_dir_bMvj6o
Fixes: fbe37501b252 ("mm: huge_memory: debugfs for file-backed THP split")
Reviewed-by: Zi Yan <ziy@nvidia.com>
Reviewed-by: David Hildenbrand (Arm) <david@kernel.org>
Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
Signed-off-by: Sayali Patil <sayalip@linux.ibm.com>
---
tools/testing/selftests/mm/split_huge_page_test.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/mm/split_huge_page_test.c b/tools/testing/selftests/mm/split_huge_page_test.c
index e0167111bdd1..57e8a1c9647a 100644
--- a/tools/testing/selftests/mm/split_huge_page_test.c
+++ b/tools/testing/selftests/mm/split_huge_page_test.c
@@ -484,6 +484,8 @@ static void split_file_backed_thp(int order)
char tmpfs_template[] = "/tmp/thp_split_XXXXXX";
const char *tmpfs_loc = mkdtemp(tmpfs_template);
char testfile[INPUT_MAX];
+ unsigned long size = 2 * pmd_pagesize;
+ char opts[64];
ssize_t num_written, num_read;
char *file_buf1, *file_buf2;
uint64_t pgoff_start = 0, pgoff_end = 1024;
@@ -503,7 +505,8 @@ static void split_file_backed_thp(int order)
file_buf1[i] = (char)i;
memset(file_buf2, 0, pmd_pagesize);
- status = mount("tmpfs", tmpfs_loc, "tmpfs", 0, "huge=always,size=4m");
+ snprintf(opts, sizeof(opts), "huge=always,size=%lu", size);
+ status = mount("tmpfs", tmpfs_loc, "tmpfs", 0, opts);
if (status)
ksft_exit_fail_msg("Unable to create a tmpfs for testing\n");
--
2.52.0
^ permalink raw reply related [flat|nested] 45+ messages in thread
* Re: [PATCH v3 05/13] selftests/mm: size tmpfs according to PMD page size in split_huge_page_test
2026-03-27 7:15 ` [PATCH v3 05/13] selftests/mm: size tmpfs according to PMD page size in split_huge_page_test Sayali Patil
@ 2026-04-01 16:20 ` Sayali Patil
0 siblings, 0 replies; 45+ messages in thread
From: Sayali Patil @ 2026-04-01 16:20 UTC (permalink / raw)
To: Andrew Morton, Shuah Khan, linux-mm, linux-kernel,
linux-kselftest, Ritesh Harjani
Cc: David Hildenbrand, Zi Yan, Michal Hocko, Oscar Salvador,
Lorenzo Stoakes, Dev Jain, Liam.Howlett, linuxppc-dev,
Venkat Rao Bagalkote
On 27/03/26 12:45, Sayali Patil wrote:
> The split_file_backed_thp() test mounts a tmpfs with a fixed size of
> "4m". This works on systems with smaller PMD page sizes,
> but fails on configurations where the PMD huge page size is
> larger (e.g. 16MB).
>
> On such systems, the fixed 4MB tmpfs is insufficient to allocate even
> a single PMD-sized THP, causing the test to fail.
>
> Fix this by sizing the tmpfs dynamically based on the runtime
> pmd_pagesize, allocating space for two PMD-sized pages.
>
> Before patch:
> running ./split_huge_page_test /tmp/xfs_dir_YTrI5E
> --------------------------------------------------
> TAP version 13
> 1..55
> ok 1 Split zero filled huge pages successful
> ok 2 Split huge pages to order 0 successful
> ok 3 Split huge pages to order 2 successful
> ok 4 Split huge pages to order 3 successful
> ok 5 Split huge pages to order 4 successful
> ok 6 Split huge pages to order 5 successful
> ok 7 Split huge pages to order 6 successful
> ok 8 Split huge pages to order 7 successful
> ok 9 Split PTE-mapped huge pages successful
> Please enable pr_debug in split_huge_pages_in_file() for more info.
> Failed to write data to testing file: Success (0)
> Bail out! Error occurred
> Planned tests != run tests (55 != 9)
> Totals: pass:9 fail:0 xfail:0 xpass:0 skip:0 error:0
> [FAIL]
>
> After patch:
> --------------------------------------------------
> running ./split_huge_page_test /tmp/xfs_dir_bMvj6o
> --------------------------------------------------
> TAP version 13
> 1..55
> ok 1 Split zero filled huge pages successful
> ok 2 Split huge pages to order 0 successful
> ok 3 Split huge pages to order 2 successful
> ok 4 Split huge pages to order 3 successful
> ok 5 Split huge pages to order 4 successful
> ok 6 Split huge pages to order 5 successful
> ok 7 Split huge pages to order 6 successful
> ok 8 Split huge pages to order 7 successful
> ok 9 Split PTE-mapped huge pages successful
> Please enable pr_debug in split_huge_pages_in_file() for more info.
> Please check dmesg for more information
> ok 10 File-backed THP split to order 0 test done
> Please enable pr_debug in split_huge_pages_in_file() for more info.
> Please check dmesg for more information
> ok 11 File-backed THP split to order 1 test done
> Please enable pr_debug in split_huge_pages_in_file() for more info.
> Please check dmesg for more information
> ok 12 File-backed THP split to order 2 test done
> ...
> ok 55 Split PMD-mapped pagecache folio to order 7 at
> in-folio offset 128 passed
> Totals: pass:55 fail:0 xfail:0 xpass:0 skip:0 error:0
> [PASS]
> ok 1 split_huge_page_test /tmp/xfs_dir_bMvj6o
>
> Fixes: fbe37501b252 ("mm: huge_memory: debugfs for file-backed THP split")
> Reviewed-by: Zi Yan <ziy@nvidia.com>
> Reviewed-by: David Hildenbrand (Arm) <david@kernel.org>
> Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
> Signed-off-by: Sayali Patil <sayalip@linux.ibm.com>
> ---
> tools/testing/selftests/mm/split_huge_page_test.c | 5 ++++-
> 1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/tools/testing/selftests/mm/split_huge_page_test.c b/tools/testing/selftests/mm/split_huge_page_test.c
> index e0167111bdd1..57e8a1c9647a 100644
> --- a/tools/testing/selftests/mm/split_huge_page_test.c
> +++ b/tools/testing/selftests/mm/split_huge_page_test.c
> @@ -484,6 +484,8 @@ static void split_file_backed_thp(int order)
> char tmpfs_template[] = "/tmp/thp_split_XXXXXX";
> const char *tmpfs_loc = mkdtemp(tmpfs_template);
> char testfile[INPUT_MAX];
> + unsigned long size = 2 * pmd_pagesize;
> + char opts[64];
> ssize_t num_written, num_read;
> char *file_buf1, *file_buf2;
> uint64_t pgoff_start = 0, pgoff_end = 1024;
> @@ -503,7 +505,8 @@ static void split_file_backed_thp(int order)
> file_buf1[i] = (char)i;
> memset(file_buf2, 0, pmd_pagesize);
>
> - status = mount("tmpfs", tmpfs_loc, "tmpfs", 0, "huge=always,size=4m");
> + snprintf(opts, sizeof(opts), "huge=always,size=%lu", size);
> + status = mount("tmpfs", tmpfs_loc, "tmpfs", 0, opts);
>
> if (status)
> ksft_exit_fail_msg("Unable to create a tmpfs for testing\n");
AI review question:
> Are file_buf1 and file_buf2 being leaked here?
> While reviewing the patch's changes to support larger PMD sizes, I noticed
> that earlier in this function file_buf1 and file_buf2 are dynamically
> allocated using pmd_pagesize:
> file_buf1 = (char *)malloc(pmd_pagesize);
> file_buf2 = (char *)malloc(pmd_pagesize);
> Looking at the end of the function, the success path returns without freeing
> either buffer:
> ksft_print_msg("Please check dmesg for more information\n");
> ksft_test_result_pass("File-backed THP split to order %d test done\n", order);
> return;
> Since this function is called in a loop for each split order, could this
> cause a significant memory leak? On architectures with large PMD sizes
> (like arm64 with 64KB base pages, where the PMD size is 512MB), leaking
> these buffers repeatedly could exhaust system memory and trigger the OOM
> killer during the test run.
>
Yes, both file_buf1 and file_buf2 are leaked on the success path and
some failure paths.
Since this function is invoked in a loop for each split order, the leak
accumulates over time.
On systems with large PMD sizes, this can potentially trigger OOM during
the test run.
This was likely not noticeable earlier with smaller PMD sizes, but
becomes significant with larger configurations.
This appears to be a pre-existing issue and not introduced by my patch.
I will prepare a separate fix to free both buffers on all exit paths to
prevent this memory leak.
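
A minimal sketch of the single-exit cleanup pattern such a fix could use, assuming a simplified stand-in for the real function. run_with_two_buffers() and its simulate_failure parameter are hypothetical and only illustrate that both buffers are released on the success path and on failure paths alike; this is not the actual follow-up patch.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical stand-in for split_file_backed_thp(): allocate two
 * buffers of buf_size bytes and release both on every exit path.
 * Returns 0 on success, -1 on failure.
 */
int run_with_two_buffers(size_t buf_size, int simulate_failure)
{
	char *buf1 = NULL, *buf2 = NULL;
	int ret = -1;

	assert(buf_size > 0);

	buf1 = malloc(buf_size);
	buf2 = malloc(buf_size);
	if (!buf1 || !buf2)
		goto out;

	memset(buf1, 0xab, buf_size);
	memset(buf2, 0, buf_size);

	/* Placeholder for the mount/write/split/read steps of the real test. */
	if (simulate_failure)
		goto out;

	ret = 0;
out:
	/* free(NULL) is a no-op, so this is safe after a partial allocation. */
	free(buf2);
	free(buf1);
	return ret;
}
```

Routing every return through one label keeps the per-order loop from accumulating two pmd_pagesize buffers per iteration.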
Thanks,
Sayali
^ permalink raw reply [flat|nested] 45+ messages in thread
* [PATCH v3 06/13] selftest/mm: adjust hugepage-mremap test size for large huge pages
2026-03-27 7:15 [PATCH v3 00/13] selftests/mm: fix failures and robustness improvements Sayali Patil
` (4 preceding siblings ...)
2026-03-27 7:15 ` [PATCH v3 05/13] selftests/mm: size tmpfs according to PMD page size in split_huge_page_test Sayali Patil
@ 2026-03-27 7:16 ` Sayali Patil
2026-04-01 14:10 ` David Hildenbrand (Arm)
2026-03-27 7:16 ` [PATCH v3 07/13] selftest/mm: register existing mapping with userfaultfd in hugepage-mremap Sayali Patil
` (7 subsequent siblings)
13 siblings, 1 reply; 45+ messages in thread
From: Sayali Patil @ 2026-03-27 7:16 UTC (permalink / raw)
To: Andrew Morton, Shuah Khan, linux-mm, linux-kernel,
linux-kselftest, Ritesh Harjani
Cc: David Hildenbrand, Zi Yan, Michal Hocko, Oscar Salvador,
Lorenzo Stoakes, Dev Jain, Liam.Howlett, linuxppc-dev,
Sayali Patil, Venkat Rao Bagalkote
The hugepage-mremap selftest uses a default size of 10MB, which is
sufficient for small huge page sizes. However, when the huge page size
is large (e.g. 1GB), 10MB is smaller than a single huge page.
As a result, the test does not trigger PMD sharing and the
corresponding unshare path in mremap(), causing the
test to fail (mremap succeeds where a failure is expected).
Update run_vmtests.sh to use twice the huge page size when the huge page
size exceeds 10MB, while retaining the 10MB default for smaller huge
pages. This ensures the test exercises the intended PMD sharing and
unsharing paths for larger huge page sizes.
Before patch:
running ./hugepage-mremap
------------------------------
TAP version 13
1..1
Map haddr: Returned address is 0x7eaa40000000
Map daddr: Returned address is 0x7daa40000000
Map vaddr: Returned address is 0x7faa40000000
Address returned by mmap() = 0x7fffaa600000
Mremap: Returned address is 0x7faa40000000
First hex is 0
First hex is 3020100
Bail out! mremap: Expected failure, but call succeeded
Planned tests != run tests (1 != 0)
Totals: pass:0 fail:0 xfail:0 xpass:0 skip:0 error:0
[FAIL]
not ok 1 hugepage-mremap # exit=1
Before patch:
running ./hugepage-mremap
------------------------------
TAP version 13
1..1
Map haddr: Returned address is 0x7eaa40000000
Map daddr: Returned address is 0x7daa40000000
Map vaddr: Returned address is 0x7faa40000000
Address returned by mmap() = 0x7fffaa600000
Mremap: Returned address is 0x7faa40000000
First hex is 0
First hex is 3020100
Bail out! mremap: Expected failure, but call succeeded
Planned tests != run tests (1 != 0)
Totals: pass:0 fail:0 xfail:0 xpass:0 skip:0 error:0
[FAIL]
not ok 1 hugepage-mremap # exit=1
After patch:
running ./hugepage-mremap 2048
------------------------------
TAP version 13
1..1
Map haddr: Returned address is 0x7eaa40000000
Map daddr: Returned address is 0x7daa40000000
Map vaddr: Returned address is 0x7faa40000000
Address returned by mmap() = 0x7fff13000000
Mremap: Returned address is 0x7faa40000000
First hex is 0
First hex is 3020100
ok 1 Read same data
Totals: pass:1 fail:0 xfail:0 xpass:0 skip:0 error:0
[PASS]
ok 1 hugepage-mremap 2048
Fixes: f77a286de48c ("mm, hugepages: make memory size variable in hugepage-mremap selftest")
Acked-by: Zi Yan <ziy@nvidia.com>
Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
Signed-off-by: Sayali Patil <sayalip@linux.ibm.com>
---
tools/testing/selftests/mm/run_vmtests.sh | 13 ++++++++++++-
1 file changed, 12 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/mm/run_vmtests.sh b/tools/testing/selftests/mm/run_vmtests.sh
index afdcfd0d7cef..eecec0b6eb13 100755
--- a/tools/testing/selftests/mm/run_vmtests.sh
+++ b/tools/testing/selftests/mm/run_vmtests.sh
@@ -293,7 +293,18 @@ echo "$shmmax" > /proc/sys/kernel/shmmax
echo "$shmall" > /proc/sys/kernel/shmall
CATEGORY="hugetlb" run_test ./map_hugetlb
-CATEGORY="hugetlb" run_test ./hugepage-mremap
+
+# If the huge page size is larger than 10MB, increase the test memory size
+# to twice the huge page size (in MB) to ensure the test exercises PMD sharing
+# and the unshare path in hugepage-mremap. Otherwise, run the test with
+# the default 10MB memory size.
+if [ "$hpgsize_KB" -gt 10240 ]; then
+ len_mb=$(( (2 * hpgsize_KB) / 1024 ))
+ CATEGORY="hugetlb" run_test ./hugepage-mremap "${len_mb}"
+else
+ CATEGORY="hugetlb" run_test ./hugepage-mremap
+fi
+
CATEGORY="hugetlb" run_test ./hugepage-vmemmap
CATEGORY="hugetlb" run_test ./hugetlb-madvise
CATEGORY="hugetlb" run_test ./hugetlb_dio
--
2.52.0
^ permalink raw reply related [flat|nested] 45+ messages in thread
* Re: [PATCH v3 06/13] selftest/mm: adjust hugepage-mremap test size for large huge pages
2026-03-27 7:16 ` [PATCH v3 06/13] selftest/mm: adjust hugepage-mremap test size for large huge pages Sayali Patil
@ 2026-04-01 14:10 ` David Hildenbrand (Arm)
2026-04-01 20:45 ` Sayali Patil
0 siblings, 1 reply; 45+ messages in thread
From: David Hildenbrand (Arm) @ 2026-04-01 14:10 UTC (permalink / raw)
To: Sayali Patil, Andrew Morton, Shuah Khan, linux-mm, linux-kernel,
linux-kselftest, Ritesh Harjani
Cc: Zi Yan, Michal Hocko, Oscar Salvador, Lorenzo Stoakes, Dev Jain,
Liam.Howlett, linuxppc-dev, Venkat Rao Bagalkote
On 3/27/26 08:16, Sayali Patil wrote:
> The hugepage-mremap selftest uses a default size of 10MB, which is
> sufficient for small huge page sizes. However, when the huge page size
> is large (e.g. 1GB), 10MB is smaller than a single huge page.
> As a result, the test does not trigger PMD sharing and the
> corresponding unshare path in mremap(), causing the
> test to fail (mremap succeeds where a failure is expected).
>
> Update run_vmtests.sh to use twice the huge page size when the huge page
> size exceeds 10MB, while retaining the 10MB default for smaller huge
> pages. This ensures the test exercises the intended PMD sharing and
> unsharing paths for larger huge page sizes.
>
> Before patch:
> running ./hugepage-mremap
> ------------------------------
> TAP version 13
> 1..1
> Map haddr: Returned address is 0x7eaa40000000
> Map daddr: Returned address is 0x7daa40000000
> Map vaddr: Returned address is 0x7faa40000000
> Address returned by mmap() = 0x7fffaa600000
> Mremap: Returned address is 0x7faa40000000
> First hex is 0
> First hex is 3020100
> Bail out! mremap: Expected failure, but call succeeded
> Planned tests != run tests (1 != 0)
> Totals: pass:0 fail:0 xfail:0 xpass:0 skip:0 error:0
> [FAIL]
> not ok 1 hugepage-mremap # exit=1
>
> Before patch:
> running ./hugepage-mremap
> ------------------------------
> TAP version 13
> 1..1
> Map haddr: Returned address is 0x7eaa40000000
> Map daddr: Returned address is 0x7daa40000000
> Map vaddr: Returned address is 0x7faa40000000
> Address returned by mmap() = 0x7fffaa600000
> Mremap: Returned address is 0x7faa40000000
> First hex is 0
> First hex is 3020100
> Bail out! mremap: Expected failure, but call succeeded
> Planned tests != run tests (1 != 0)
> Totals: pass:0 fail:0 xfail:0 xpass:0 skip:0 error:0
> [FAIL]
> not ok 1 hugepage-mremap # exit=1
>
Why are there two "Before patch" in here?
> After patch:
> running ./hugepage-mremap 2048
> ------------------------------
> TAP version 13
> 1..1
> Map haddr: Returned address is 0x7eaa40000000
> Map daddr: Returned address is 0x7daa40000000
> Map vaddr: Returned address is 0x7faa40000000
> Address returned by mmap() = 0x7fff13000000
> Mremap: Returned address is 0x7faa40000000
> First hex is 0
> First hex is 3020100
> ok 1 Read same data
> Totals: pass:1 fail:0 xfail:0 xpass:0 skip:0 error:0
> [PASS]
> ok 1 hugepage-mremap 2048
>
> Fixes: f77a286de48c ("mm, hugepages: make memory size variable in hugepage-mremap selftest")
> Acked-by: Zi Yan <ziy@nvidia.com>
> Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
> Signed-off-by: Sayali Patil <sayalip@linux.ibm.com>
> ---
> tools/testing/selftests/mm/run_vmtests.sh | 13 ++++++++++++-
> 1 file changed, 12 insertions(+), 1 deletion(-)
>
> diff --git a/tools/testing/selftests/mm/run_vmtests.sh b/tools/testing/selftests/mm/run_vmtests.sh
> index afdcfd0d7cef..eecec0b6eb13 100755
> --- a/tools/testing/selftests/mm/run_vmtests.sh
> +++ b/tools/testing/selftests/mm/run_vmtests.sh
> @@ -293,7 +293,18 @@ echo "$shmmax" > /proc/sys/kernel/shmmax
> echo "$shmall" > /proc/sys/kernel/shmall
>
> CATEGORY="hugetlb" run_test ./map_hugetlb
> -CATEGORY="hugetlb" run_test ./hugepage-mremap
> +
> +# If the huge page size is larger than 10MB, increase the test memory size
> +# to twice the huge page size (in MB) to ensure the test exercises PMD sharing
> +# and the unshare path in hugepage-mremap. Otherwise, run the test with
> +# the default 10MB memory size.
PMD sharing requires, on x86, a 1 GiB area with 2 MiB hugetlb folios.
How does doubling sort that out?
Also, why the magic value 10mb?
--
Cheers,
David
^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: [PATCH v3 06/13] selftest/mm: adjust hugepage-mremap test size for large huge pages
2026-04-01 14:10 ` David Hildenbrand (Arm)
@ 2026-04-01 20:45 ` Sayali Patil
0 siblings, 0 replies; 45+ messages in thread
From: Sayali Patil @ 2026-04-01 20:45 UTC (permalink / raw)
To: David Hildenbrand (Arm), Andrew Morton, Shuah Khan, linux-mm,
linux-kernel, linux-kselftest, Ritesh Harjani
Cc: Zi Yan, Michal Hocko, Oscar Salvador, Lorenzo Stoakes, Dev Jain,
Liam.Howlett, linuxppc-dev, Venkat Rao Bagalkote
On 01/04/26 19:40, David Hildenbrand (Arm) wrote:
> On 3/27/26 08:16, Sayali Patil wrote:
>> The hugepage-mremap selftest uses a default size of 10MB, which is
>> sufficient for small huge page sizes. However, when the huge page size
>> is large (e.g. 1GB), 10MB is smaller than a single huge page.
>> As a result, the test does not trigger PMD sharing and the
>> corresponding unshare path in mremap(), causing the
>> test to fail (mremap succeeds where a failure is expected).
>>
>> Update run_vmtests.sh to use twice the huge page size when the huge page
>> size exceeds 10MB, while retaining the 10MB default for smaller huge
>> pages. This ensures the test exercises the intended PMD sharing and
>> unsharing paths for larger huge page sizes.
>>
>> Before patch:
>> running ./hugepage-mremap
>> ------------------------------
>> TAP version 13
>> 1..1
>> Map haddr: Returned address is 0x7eaa40000000
>> Map daddr: Returned address is 0x7daa40000000
>> Map vaddr: Returned address is 0x7faa40000000
>> Address returned by mmap() = 0x7fffaa600000
>> Mremap: Returned address is 0x7faa40000000
>> First hex is 0
>> First hex is 3020100
>> Bail out! mremap: Expected failure, but call succeeded
>> Planned tests != run tests (1 != 0)
>> Totals: pass:0 fail:0 xfail:0 xpass:0 skip:0 error:0
>> [FAIL]
>> not ok 1 hugepage-mremap # exit=1
>>
>> Before patch:
>> running ./hugepage-mremap
>> ------------------------------
>> TAP version 13
>> 1..1
>> Map haddr: Returned address is 0x7eaa40000000
>> Map daddr: Returned address is 0x7daa40000000
>> Map vaddr: Returned address is 0x7faa40000000
>> Address returned by mmap() = 0x7fffaa600000
>> Mremap: Returned address is 0x7faa40000000
>> First hex is 0
>> First hex is 3020100
>> Bail out! mremap: Expected failure, but call succeeded
>> Planned tests != run tests (1 != 0)
>> Totals: pass:0 fail:0 xfail:0 xpass:0 skip:0 error:0
>> [FAIL]
>> not ok 1 hugepage-mremap # exit=1
>>
>
> Why are there two "Before patch" in here?
Thanks for pointing that out. I will fix it in the next version.
>
>> After patch:
>> running ./hugepage-mremap 2048
>> ------------------------------
>> TAP version 13
>> 1..1
>> Map haddr: Returned address is 0x7eaa40000000
>> Map daddr: Returned address is 0x7daa40000000
>> Map vaddr: Returned address is 0x7faa40000000
>> Address returned by mmap() = 0x7fff13000000
>> Mremap: Returned address is 0x7faa40000000
>> First hex is 0
>> First hex is 3020100
>> ok 1 Read same data
>> Totals: pass:1 fail:0 xfail:0 xpass:0 skip:0 error:0
>> [PASS]
>> ok 1 hugepage-mremap 2048
>>
>> Fixes: f77a286de48c ("mm, hugepages: make memory size variable in hugepage-mremap selftest")
>> Acked-by: Zi Yan <ziy@nvidia.com>
>> Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
>> Signed-off-by: Sayali Patil <sayalip@linux.ibm.com>
>> ---
>> tools/testing/selftests/mm/run_vmtests.sh | 13 ++++++++++++-
>> 1 file changed, 12 insertions(+), 1 deletion(-)
>>
>> diff --git a/tools/testing/selftests/mm/run_vmtests.sh b/tools/testing/selftests/mm/run_vmtests.sh
>> index afdcfd0d7cef..eecec0b6eb13 100755
>> --- a/tools/testing/selftests/mm/run_vmtests.sh
>> +++ b/tools/testing/selftests/mm/run_vmtests.sh
>> @@ -293,7 +293,18 @@ echo "$shmmax" > /proc/sys/kernel/shmmax
>> echo "$shmall" > /proc/sys/kernel/shmall
>>
>> CATEGORY="hugetlb" run_test ./map_hugetlb
>> -CATEGORY="hugetlb" run_test ./hugepage-mremap
>> +
>> +# If the huge page size is larger than 10MB, increase the test memory size
>> +# to twice the huge page size (in MB) to ensure the test exercises PMD sharing
>> +# and the unshare path in hugepage-mremap. Otherwise, run the test with
>> +# the default 10MB memory size.
>
> PMD sharing requires, on x86, a 1 GiB area with 2 MiB hugetlb folios.
>
> How does doubling sort that out?
>
> Also, why the magic value 10mb?
>
>
Hi David,
Yes, 1GB huge pages are mapped at the PUD level and are not involved in
PMD sharing, as huge_pte_alloc() skips sharing for sizes other than
PMD_SIZE.
The issue here is due to an unaligned memory size on a 1GB mapping.
This leads munmap() to fail at an unaligned address, causing the
subsequent expected-to-fail mremap() to unexpectedly succeed.
The default memory size for this test is 10MB.
Aligning the size to a multiple of 1GB avoids this failure, but it is
not related to PMD sharing. I will update the description in v4 to
reflect this more accurately.
I will also update the test code directly to align the memory size to
the huge page size, rather than modifying run_vmtests.sh.
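
As a rough illustration of that alignment, and not the actual v4 code, the requested length can be rounded up to the next multiple of the huge page size. The helper name align_to_hugepage() is hypothetical, and the sketch assumes the rounded result does not overflow size_t.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Round len up to the next multiple of hpage_size.
 * Hypothetical helper; assumes len + hpage_size does not overflow.
 */
size_t align_to_hugepage(size_t len, size_t hpage_size)
{
	size_t rem;

	assert(hpage_size != 0);

	rem = len % hpage_size;
	if (rem == 0)
		return len;	/* already aligned */

	return len + (hpage_size - rem);
}
```

With the 10MB default, a 1GB huge page size yields a 1GB length, while a 2MB huge page size leaves the 10MB length unchanged because it is already a multiple of 2MB.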
Thanks,
Sayali
^ permalink raw reply [flat|nested] 45+ messages in thread
* [PATCH v3 07/13] selftest/mm: register existing mapping with userfaultfd in hugepage-mremap
2026-03-27 7:15 [PATCH v3 00/13] selftests/mm: fix failures and robustness improvements Sayali Patil
` (5 preceding siblings ...)
2026-03-27 7:16 ` [PATCH v3 06/13] selftest/mm: adjust hugepage-mremap test size for large huge pages Sayali Patil
@ 2026-03-27 7:16 ` Sayali Patil
2026-04-01 14:18 ` David Hildenbrand (Arm)
2026-03-27 7:16 ` [PATCH v3 08/13] selftests/mm: ensure destination is hugetlb-backed " Sayali Patil
` (6 subsequent siblings)
13 siblings, 1 reply; 45+ messages in thread
From: Sayali Patil @ 2026-03-27 7:16 UTC (permalink / raw)
To: Andrew Morton, Shuah Khan, linux-mm, linux-kernel,
linux-kselftest, Ritesh Harjani
Cc: David Hildenbrand, Zi Yan, Michal Hocko, Oscar Salvador,
Lorenzo Stoakes, Dev Jain, Liam.Howlett, linuxppc-dev,
Sayali Patil, Venkat Rao Bagalkote
Previously, register_region_with_uffd() created a new anonymous
mapping and overwrote the address supplied by the caller before
registering the range with userfaultfd.
As a result, userfaultfd was applied to an unrelated anonymous mapping
instead of the hugetlb region used by the test.
Remove the extra mmap() and register the caller-provided address range
directly using UFFDIO_REGISTER_MODE_MISSING, so that faults are
generated for the hugetlb mapping used by the test.
This ensures userfaultfd operates on the actual hugetlb test region and
validates the expected fault handling.
Before patch:
running ./hugepage-mremap
-------------------------
TAP version 13
1..1
Map haddr: Returned address is 0x7eaa40000000
Map daddr: Returned address is 0x7daa40000000
Map vaddr: Returned address is 0x7faa40000000
Address returned by mmap() = 0x7fff9d000000
Mremap: Returned address is 0x7faa40000000
First hex is 0
First hex is 3020100
ok 1 Read same data
Totals: pass:1 fail:0 xfail:0 xpass:0 skip:0 error:0
[PASS]
ok 1 hugepage-mremap
After patch:
running ./hugepage-mremap
-------------------------
TAP version 13
1..1
Map haddr: Returned address is 0x7eaa40000000
Map daddr: Returned address is 0x7daa40000000
Map vaddr: Returned address is 0x7faa40000000
Registered memory at address 0x7eaa40000000 with userfaultfd
Mremap: Returned address is 0x7faa40000000
First hex is 0
First hex is 3020100
ok 1 Read same data
Totals: pass:1 fail:0 xfail:0 xpass:0 skip:0 error:0
[PASS]
ok 1 hugepage-mremap
Fixes: 12b613206474 ("mm, hugepages: add hugetlb vma mremap() test")
Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
Signed-off-by: Sayali Patil <sayalip@linux.ibm.com>
---
tools/testing/selftests/mm/hugepage-mremap.c | 21 +++++---------------
1 file changed, 5 insertions(+), 16 deletions(-)
diff --git a/tools/testing/selftests/mm/hugepage-mremap.c b/tools/testing/selftests/mm/hugepage-mremap.c
index b8f7d92e5a35..e611249080d6 100644
--- a/tools/testing/selftests/mm/hugepage-mremap.c
+++ b/tools/testing/selftests/mm/hugepage-mremap.c
@@ -85,25 +85,14 @@ static void register_region_with_uffd(char *addr, size_t len)
if (ioctl(uffd, UFFDIO_API, &uffdio_api) == -1)
ksft_exit_fail_msg("ioctl-UFFDIO_API: %s\n", strerror(errno));
- /* Create a private anonymous mapping. The memory will be
- * demand-zero paged--that is, not yet allocated. When we
- * actually touch the memory, it will be allocated via
- * the userfaultfd.
- */
-
- addr = mmap(NULL, len, PROT_READ | PROT_WRITE,
- MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
- if (addr == MAP_FAILED)
- ksft_exit_fail_msg("mmap: %s\n", strerror(errno));
-
- ksft_print_msg("Address returned by mmap() = %p\n", addr);
-
- /* Register the memory range of the mapping we just created for
- * handling by the userfaultfd object. In mode, we request to track
- * missing pages (i.e., pages that have not yet been faulted in).
+ /* Register the passed memory range for handling by the userfaultfd object.
+ * In mode, we request to track missing pages
+ * (i.e., pages that have not yet been faulted in).
*/
if (uffd_register(uffd, addr, len, true, false, false))
ksft_exit_fail_msg("ioctl-UFFDIO_REGISTER: %s\n", strerror(errno));
+
+ ksft_print_msg("Registered memory at address %p with userfaultfd\n", addr);
}
int main(int argc, char *argv[])
--
2.52.0
^ permalink raw reply related [flat|nested] 45+ messages in thread
* Re: [PATCH v3 07/13] selftest/mm: register existing mapping with userfaultfd in hugepage-mremap
2026-03-27 7:16 ` [PATCH v3 07/13] selftest/mm: register existing mapping with userfaultfd in hugepage-mremap Sayali Patil
@ 2026-04-01 14:18 ` David Hildenbrand (Arm)
2026-04-01 14:43 ` Sayali Patil
0 siblings, 1 reply; 45+ messages in thread
From: David Hildenbrand (Arm) @ 2026-04-01 14:18 UTC (permalink / raw)
To: Sayali Patil, Andrew Morton, Shuah Khan, linux-mm, linux-kernel,
linux-kselftest, Ritesh Harjani
Cc: Zi Yan, Michal Hocko, Oscar Salvador, Lorenzo Stoakes, Dev Jain,
Liam.Howlett, linuxppc-dev, Venkat Rao Bagalkote
On 3/27/26 08:16, Sayali Patil wrote:
> Previously, register_region_with_uffd() created a new anonymous
> mapping and overwrote the address supplied by the caller before
> registering the range with userfaultfd.
>
> As a result, userfaultfd was applied to an unrelated anonymous mapping
> instead of the hugetlb region used by the test.
>
> Remove the extra mmap() and register the caller-provided address range
> directly using UFFDIO_REGISTER_MODE_MISSING, so that faults are
> generated for the hugetlb mapping used by the test.
>
> This ensures userfaultfd operates on the actual hugetlb test region and
> validates the expected fault handling.
>
> Before patch:
> running ./hugepage-mremap
> -------------------------
> TAP version 13
> 1..1
> Map haddr: Returned address is 0x7eaa40000000
> Map daddr: Returned address is 0x7daa40000000
> Map vaddr: Returned address is 0x7faa40000000
> Address returned by mmap() = 0x7fff9d000000
> Mremap: Returned address is 0x7faa40000000
> First hex is 0
> First hex is 3020100
> ok 1 Read same data
> Totals: pass:1 fail:0 xfail:0 xpass:0 skip:0 error:0
> [PASS]
> ok 1 hugepage-mremap
>
> After patch:
> running ./hugepage-mremap
> -------------------------
> TAP version 13
> 1..1
> Map haddr: Returned address is 0x7eaa40000000
> Map daddr: Returned address is 0x7daa40000000
> Map vaddr: Returned address is 0x7faa40000000
> Registered memory at address 0x7eaa40000000 with userfaultfd
> Mremap: Returned address is 0x7faa40000000
> First hex is 0
> First hex is 3020100
> ok 1 Read same data
> Totals: pass:1 fail:0 xfail:0 xpass:0 skip:0 error:0
> [PASS]
> ok 1 hugepage-mremap
Okay, so we tested mremap() of something that is not even hugetlb.
>
> Fixes: 12b613206474 ("mm, hugepages: add hugetlb vma mremap() test")
> Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
> Signed-off-by: Sayali Patil <sayalip@linux.ibm.com>
> ---
> tools/testing/selftests/mm/hugepage-mremap.c | 21 +++++---------------
> 1 file changed, 5 insertions(+), 16 deletions(-)
>
> diff --git a/tools/testing/selftests/mm/hugepage-mremap.c b/tools/testing/selftests/mm/hugepage-mremap.c
> index b8f7d92e5a35..e611249080d6 100644
> --- a/tools/testing/selftests/mm/hugepage-mremap.c
> +++ b/tools/testing/selftests/mm/hugepage-mremap.c
> @@ -85,25 +85,14 @@ static void register_region_with_uffd(char *addr, size_t len)
> if (ioctl(uffd, UFFDIO_API, &uffdio_api) == -1)
> ksft_exit_fail_msg("ioctl-UFFDIO_API: %s\n", strerror(errno));
>
> - /* Create a private anonymous mapping. The memory will be
> - * demand-zero paged--that is, not yet allocated. When we
> - * actually touch the memory, it will be allocated via
> - * the userfaultfd.
> - */
> -
> - addr = mmap(NULL, len, PROT_READ | PROT_WRITE,
> - MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
> - if (addr == MAP_FAILED)
> - ksft_exit_fail_msg("mmap: %s\n", strerror(errno));
> -
> - ksft_print_msg("Address returned by mmap() = %p\n", addr);
> -
> - /* Register the memory range of the mapping we just created for
> - * handling by the userfaultfd object. In mode, we request to track
> - * missing pages (i.e., pages that have not yet been faulted in).
> + /* Register the passed memory range for handling by the userfaultfd object.
/*
* ...
While at it.
> + * In mode, we request to track missing pages
> + * (i.e., pages that have not yet been faulted in).
> */
> if (uffd_register(uffd, addr, len, true, false, false))
> ksft_exit_fail_msg("ioctl-UFFDIO_REGISTER: %s\n", strerror(errno));
> +
> + ksft_print_msg("Registered memory at address %p with userfaultfd\n", addr);
> }
>
> int main(int argc, char *argv[])
Yes, that code is extremely weird. I wonder if this was some
copy-and-paste from other uffd test code.
Acked-by: David Hildenbrand (Arm) <david@kernel.org>
--
Cheers,
David
^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: [PATCH v3 07/13] selftest/mm: register existing mapping with userfaultfd in hugepage-mremap
2026-04-01 14:18 ` David Hildenbrand (Arm)
@ 2026-04-01 14:43 ` Sayali Patil
2026-04-02 7:31 ` David Hildenbrand (Arm)
0 siblings, 1 reply; 45+ messages in thread
From: Sayali Patil @ 2026-04-01 14:43 UTC (permalink / raw)
To: David Hildenbrand (Arm), Andrew Morton, Shuah Khan, linux-mm,
linux-kernel, linux-kselftest, Ritesh Harjani
Cc: Zi Yan, Michal Hocko, Oscar Salvador, Lorenzo Stoakes, Dev Jain,
Liam.Howlett, linuxppc-dev, Venkat Rao Bagalkote
On 01/04/26 19:48, David Hildenbrand (Arm) wrote:
> On 3/27/26 08:16, Sayali Patil wrote:
>> Previously, register_region_with_uffd() created a new anonymous
>> mapping and overwrote the address supplied by the caller before
>> registering the range with userfaultfd.
>>
>> As a result, userfaultfd was applied to an unrelated anonymous mapping
>> instead of the hugetlb region used by the test.
>>
>> Remove the extra mmap() and register the caller-provided address range
>> directly using UFFDIO_REGISTER_MODE_MISSING, so that faults are
>> generated for the hugetlb mapping used by the test.
>>
>> This ensures userfaultfd operates on the actual hugetlb test region and
>> validates the expected fault handling.
>>
>> Before patch:
>> running ./hugepage-mremap
>> -------------------------
>> TAP version 13
>> 1..1
>> Map haddr: Returned address is 0x7eaa40000000
>> Map daddr: Returned address is 0x7daa40000000
>> Map vaddr: Returned address is 0x7faa40000000
>> Address returned by mmap() = 0x7fff9d000000
>> Mremap: Returned address is 0x7faa40000000
>> First hex is 0
>> First hex is 3020100
>> ok 1 Read same data
>> Totals: pass:1 fail:0 xfail:0 xpass:0 skip:0 error:0
>> [PASS]
>> ok 1 hugepage-mremap
>>
>> After patch:
>> running ./hugepage-mremap
>> -------------------------
>> TAP version 13
>> 1..1
>> Map haddr: Returned address is 0x7eaa40000000
>> Map daddr: Returned address is 0x7daa40000000
>> Map vaddr: Returned address is 0x7faa40000000
>> Registered memory at address 0x7eaa40000000 with userfaultfd
>> Mremap: Returned address is 0x7faa40000000
>> First hex is 0
>> First hex is 3020100
>> ok 1 Read same data
>> Totals: pass:1 fail:0 xfail:0 xpass:0 skip:0 error:0
>> [PASS]
>> ok 1 hugepage-mremap
> Okay, so we tested mremap() of something that is not even hugetlb.
>
>> Fixes: 12b613206474 ("mm, hugepages: add hugetlb vma mremap() test")
>> Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
>> Signed-off-by: Sayali Patil <sayalip@linux.ibm.com>
>> ---
>> tools/testing/selftests/mm/hugepage-mremap.c | 21 +++++---------------
>> 1 file changed, 5 insertions(+), 16 deletions(-)
>>
>> diff --git a/tools/testing/selftests/mm/hugepage-mremap.c b/tools/testing/selftests/mm/hugepage-mremap.c
>> index b8f7d92e5a35..e611249080d6 100644
>> --- a/tools/testing/selftests/mm/hugepage-mremap.c
>> +++ b/tools/testing/selftests/mm/hugepage-mremap.c
>> @@ -85,25 +85,14 @@ static void register_region_with_uffd(char *addr, size_t len)
>> if (ioctl(uffd, UFFDIO_API, &uffdio_api) == -1)
>> ksft_exit_fail_msg("ioctl-UFFDIO_API: %s\n", strerror(errno));
>>
>> - /* Create a private anonymous mapping. The memory will be
>> - * demand-zero paged--that is, not yet allocated. When we
>> - * actually touch the memory, it will be allocated via
>> - * the userfaultfd.
>> - */
>> -
>> - addr = mmap(NULL, len, PROT_READ | PROT_WRITE,
>> - MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
>> - if (addr == MAP_FAILED)
>> - ksft_exit_fail_msg("mmap: %s\n", strerror(errno));
>> -
>> - ksft_print_msg("Address returned by mmap() = %p\n", addr);
>> -
>> - /* Register the memory range of the mapping we just created for
>> - * handling by the userfaultfd object. In mode, we request to track
>> - * missing pages (i.e., pages that have not yet been faulted in).
>> + /* Register the passed memory range for handling by the userfaultfd object.
>
> /*
> * ...
>
> While at it.
>
>> + * In mode, we request to track missing pages
>> + * (i.e., pages that have not yet been faulted in).
>> */
>> if (uffd_register(uffd, addr, len, true, false, false))
>> ksft_exit_fail_msg("ioctl-UFFDIO_REGISTER: %s\n", strerror(errno));
>> +
>> + ksft_print_msg("Registered memory at address %p with userfaultfd\n", addr);
>> }
>>
>> int main(int argc, char *argv[])
> Yes, that code is extremely weird. I wonder if this was some
> copy-and-paste from other uffd test code.
>
> Acked-by: David Hildenbrand (Arm) <david@kernel.org>
>
>
Hi David,
Yes, the test operates on hugetlb mappings created with
MAP_HUGETLB | MAP_POPULATE and sets up userfaultfd. Because the pages are
already populated, registering the range with UFFDIO_REGISTER_MODE_MISSING
does not result in any userfaults.
Originally, the helper function created a separate anonymous mapping and
registered it with userfaultfd instead of the address supplied by the
caller. However, the test operates on hugetlb mappings, and the registered
anonymous mapping is never used in the mremap() path being exercised.
Would it be better to remove userfaultfd registration entirely from this
test, since that path is not actually being tested?
Thanks,
Sayali
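For reference, registering a caller-provided range directly (rather than a
freshly created mapping) can be sketched with the raw userfaultfd ioctls.
This is a minimal standalone sketch, not the selftest's code: the names
register_existing_range() and demo_register_anon() are hypothetical, and the
selftest itself goes through its uffd_register() helper. The demo uses a
plain anonymous mapping so that it does not depend on hugetlb pages being
reserved.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <linux/userfaultfd.h>
#include <stddef.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Hypothetical helper: register [addr, addr + len) for missing-page
 * tracking without creating any new mapping.
 * Returns the userfaultfd on success, or -errno on failure.
 */
int register_existing_range(void *addr, size_t len)
{
	struct uffdio_api api = { .api = UFFD_API };
	struct uffdio_register reg = {
		.range = { .start = (unsigned long)addr, .len = len },
		.mode = UFFDIO_REGISTER_MODE_MISSING,
	};
	int uffd, err;

	assert(addr != NULL && len > 0);

	uffd = syscall(__NR_userfaultfd, O_CLOEXEC | O_NONBLOCK);
	if (uffd < 0)
		return -errno;

	/* The API handshake must happen before any other uffd ioctl. */
	if (ioctl(uffd, UFFDIO_API, &api) == -1 ||
	    ioctl(uffd, UFFDIO_REGISTER, &reg) == -1) {
		err = errno;
		close(uffd);
		return -err;
	}

	/*
	 * Pages that are already present (for example, prefaulted with
	 * MAP_POPULATE) will not raise missing-page events.
	 */
	return uffd;
}

/* Usage sketch: register a small anonymous mapping, then clean up. */
int demo_register_anon(void)
{
	size_t len = 4 * (size_t)sysconf(_SC_PAGESIZE);
	void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
		       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	int uffd;

	if (p == MAP_FAILED)
		return -errno;

	uffd = register_existing_range(p, len);
	if (uffd >= 0)
		close(uffd);
	munmap(p, len);
	return uffd < 0 ? uffd : 0;
}
```

Whether the userfaultfd() call itself succeeds depends on the environment;
unprivileged callers may get EPERM, depending on how the system is configured.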
^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: [PATCH v3 07/13] selftest/mm: register existing mapping with userfaultfd in hugepage-mremap
2026-04-01 14:43 ` Sayali Patil
@ 2026-04-02 7:31 ` David Hildenbrand (Arm)
2026-04-03 17:41 ` Sayali Patil
0 siblings, 1 reply; 45+ messages in thread
From: David Hildenbrand (Arm) @ 2026-04-02 7:31 UTC (permalink / raw)
To: Sayali Patil, Andrew Morton, Shuah Khan, linux-mm, linux-kernel,
linux-kselftest, Ritesh Harjani
Cc: Zi Yan, Michal Hocko, Oscar Salvador, Lorenzo Stoakes, Dev Jain,
Liam.Howlett, linuxppc-dev, Venkat Rao Bagalkote
On 4/1/26 16:43, Sayali Patil wrote:
>
> On 01/04/26 19:48, David Hildenbrand (Arm) wrote:
>> On 3/27/26 08:16, Sayali Patil wrote:
>>> Previously, register_region_with_uffd() created a new anonymous
>>> mapping and overwrote the address supplied by the caller before
>>> registering the range with userfaultfd.
>>>
>>> As a result, userfaultfd was applied to an unrelated anonymous mapping
>>> instead of the hugetlb region used by the test.
>>>
>>> Remove the extra mmap() and register the caller-provided address range
>>> directly using UFFDIO_REGISTER_MODE_MISSING, so that faults are
>>> generated for the hugetlb mapping used by the test.
>>>
>>> This ensures userfaultfd operates on the actual hugetlb test region and
>>> validates the expected fault handling.
>>>
>>> Before patch:
>>> running ./hugepage-mremap
>>> -------------------------
>>> TAP version 13
>>> 1..1
>>> Map haddr: Returned address is 0x7eaa40000000
>>> Map daddr: Returned address is 0x7daa40000000
>>> Map vaddr: Returned address is 0x7faa40000000
>>> Address returned by mmap() = 0x7fff9d000000
>>> Mremap: Returned address is 0x7faa40000000
>>> First hex is 0
>>> First hex is 3020100
>>> ok 1 Read same data
>>> Totals: pass:1 fail:0 xfail:0 xpass:0 skip:0 error:0
>>> [PASS]
>>> ok 1 hugepage-mremap
>>>
>>> After patch:
>>> running ./hugepage-mremap
>>> -------------------------
>>> TAP version 13
>>> 1..1
>>> Map haddr: Returned address is 0x7eaa40000000
>>> Map daddr: Returned address is 0x7daa40000000
>>> Map vaddr: Returned address is 0x7faa40000000
>>> Registered memory at address 0x7eaa40000000 with userfaultfd
>>> Mremap: Returned address is 0x7faa40000000
>>> First hex is 0
>>> First hex is 3020100
>>> ok 1 Read same data
>>> Totals: pass:1 fail:0 xfail:0 xpass:0 skip:0 error:0
>>> [PASS]
>>> ok 1 hugepage-mremap
>> Okay, so we tested mremap() of something that is not even hugetlb.
>>
>>> Fixes: 12b613206474 ("mm, hugepages: add hugetlb vma mremap() test")
>>> Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
>>> Signed-off-by: Sayali Patil <sayalip@linux.ibm.com>
>>> ---
>>> tools/testing/selftests/mm/hugepage-mremap.c | 21 +++++---------------
>>> 1 file changed, 5 insertions(+), 16 deletions(-)
>>>
>>> diff --git a/tools/testing/selftests/mm/hugepage-mremap.c b/tools/testing/selftests/mm/hugepage-mremap.c
>>> index b8f7d92e5a35..e611249080d6 100644
>>> --- a/tools/testing/selftests/mm/hugepage-mremap.c
>>> +++ b/tools/testing/selftests/mm/hugepage-mremap.c
>>> @@ -85,25 +85,14 @@ static void register_region_with_uffd(char *addr, size_t len)
>>> if (ioctl(uffd, UFFDIO_API, &uffdio_api) == -1)
>>> ksft_exit_fail_msg("ioctl-UFFDIO_API: %s\n", strerror(errno));
>>>
>>> - /* Create a private anonymous mapping. The memory will be
>>> - * demand-zero paged--that is, not yet allocated. When we
>>> - * actually touch the memory, it will be allocated via
>>> - * the userfaultfd.
>>> - */
>>> -
>>> - addr = mmap(NULL, len, PROT_READ | PROT_WRITE,
>>> - MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
>>> - if (addr == MAP_FAILED)
>>> - ksft_exit_fail_msg("mmap: %s\n", strerror(errno));
>>> -
>>> - ksft_print_msg("Address returned by mmap() = %p\n", addr);
>>> -
>>> - /* Register the memory range of the mapping we just created for
>>> - * handling by the userfaultfd object. In mode, we request to track
>>> - * missing pages (i.e., pages that have not yet been faulted in).
>>> + /* Register the passed memory range for handling by the userfaultfd object.
>> /*
>> * ...
>>
>> While at it.
>>
>>> + * In mode, we request to track missing pages
>>> + * (i.e., pages that have not yet been faulted in).
>>> */
>>> if (uffd_register(uffd, addr, len, true, false, false))
>>> ksft_exit_fail_msg("ioctl-UFFDIO_REGISTER: %s\n", strerror(errno));
>>> +
>>> + ksft_print_msg("Registered memory at address %p with userfaultfd\n", addr);
>>> }
>>>
>>> int main(int argc, char *argv[])
>> Yes, that code is extremely weird. I wonder if this was some
>> copy-and-paste from other uffd test code.
>>
>> Acked-by: David Hildenbrand (Arm) <david@kernel.org>
>>
>>
> Hi David,
>
> Yes, the test operates on hugetlb mappings created with
> MAP_HUGETLB | MAP_POPULATE and sets up userfaultfd. Because the pages are
> already populated, registering the range with UFFDIO_REGISTER_MODE_MISSING
> does not result in any userfaults.
>
> Originally, the helper function created a separate anonymous mapping and
> registered it with userfaultfd instead of the address supplied by the
> caller. However, the test operates on hugetlb mappings, and the registered
> anonymous mapping is never used in the mremap() path being exercised.
>
> Would it be better to remove userfaultfd registration entirely from this
> test, since that path is not actually being tested?
If it's tested with your change now (which I think is what
happens), this is fine.
It was just very weird before, because it tested something fairly unrelated.
--
Cheers,
David
^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: [PATCH v3 07/13] selftest/mm: register existing mapping with userfaultfd in hugepage-mremap
2026-04-02 7:31 ` David Hildenbrand (Arm)
@ 2026-04-03 17:41 ` Sayali Patil
0 siblings, 0 replies; 45+ messages in thread
From: Sayali Patil @ 2026-04-03 17:41 UTC (permalink / raw)
To: David Hildenbrand (Arm), Andrew Morton, Shuah Khan, linux-mm,
linux-kernel, linux-kselftest, Ritesh Harjani
Cc: Zi Yan, Michal Hocko, Oscar Salvador, Lorenzo Stoakes, Dev Jain,
Liam.Howlett, linuxppc-dev, Venkat Rao Bagalkote
On 02/04/26 13:01, David Hildenbrand (Arm) wrote:
> On 4/1/26 16:43, Sayali Patil wrote:
>>
>> On 01/04/26 19:48, David Hildenbrand (Arm) wrote:
>>> On 3/27/26 08:16, Sayali Patil wrote:
>>>> Previously, register_region_with_uffd() created a new anonymous
>>>> mapping and overwrote the address supplied by the caller before
>>>> registering the range with userfaultfd.
>>>>
>>>> As a result, userfaultfd was applied to an unrelated anonymous mapping
>>>> instead of the hugetlb region used by the test.
>>>>
>>>> Remove the extra mmap() and register the caller-provided address range
>>>> directly using UFFDIO_REGISTER_MODE_MISSING, so that faults are
>>>> generated for the hugetlb mapping used by the test.
>>>>
>>>> This ensures userfaultfd operates on the actual hugetlb test region and
>>>> validates the expected fault handling.
>>>>
>>>> Before patch:
>>>> running ./hugepage-mremap
>>>> -------------------------
>>>> TAP version 13
>>>> 1..1
>>>> Map haddr: Returned address is 0x7eaa40000000
>>>> Map daddr: Returned address is 0x7daa40000000
>>>> Map vaddr: Returned address is 0x7faa40000000
>>>> Address returned by mmap() = 0x7fff9d000000
>>>> Mremap: Returned address is 0x7faa40000000
>>>> First hex is 0
>>>> First hex is 3020100
>>>> ok 1 Read same data
>>>> Totals: pass:1 fail:0 xfail:0 xpass:0 skip:0 error:0
>>>> [PASS]
>>>> ok 1 hugepage-mremap
>>>>
>>>> After patch:
>>>> running ./hugepage-mremap
>>>> -------------------------
>>>> TAP version 13
>>>> 1..1
>>>> Map haddr: Returned address is 0x7eaa40000000
>>>> Map daddr: Returned address is 0x7daa40000000
>>>> Map vaddr: Returned address is 0x7faa40000000
>>>> Registered memory at address 0x7eaa40000000 with userfaultfd
>>>> Mremap: Returned address is 0x7faa40000000
>>>> First hex is 0
>>>> First hex is 3020100
>>>> ok 1 Read same data
>>>> Totals: pass:1 fail:0 xfail:0 xpass:0 skip:0 error:0
>>>> [PASS]
>>>> ok 1 hugepage-mremap
>>> Okay, so we tested mremap() of something that is not even hugetlb.
>>>
>>>> Fixes: 12b613206474 ("mm, hugepages: add hugetlb vma mremap() test")
>>>> Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
>>>> Signed-off-by: Sayali Patil <sayalip@linux.ibm.com>
>>>> ---
>>>> tools/testing/selftests/mm/hugepage-mremap.c | 21 +++++---------------
>>>> 1 file changed, 5 insertions(+), 16 deletions(-)
>>>>
>>>> diff --git a/tools/testing/selftests/mm/hugepage-mremap.c b/tools/testing/selftests/mm/hugepage-mremap.c
>>>> index b8f7d92e5a35..e611249080d6 100644
>>>> --- a/tools/testing/selftests/mm/hugepage-mremap.c
>>>> +++ b/tools/testing/selftests/mm/hugepage-mremap.c
>>>> @@ -85,25 +85,14 @@ static void register_region_with_uffd(char *addr, size_t len)
>>>> if (ioctl(uffd, UFFDIO_API, &uffdio_api) == -1)
>>>> ksft_exit_fail_msg("ioctl-UFFDIO_API: %s\n", strerror(errno));
>>>>
>>>> - /* Create a private anonymous mapping. The memory will be
>>>> - * demand-zero paged--that is, not yet allocated. When we
>>>> - * actually touch the memory, it will be allocated via
>>>> - * the userfaultfd.
>>>> - */
>>>> -
>>>> - addr = mmap(NULL, len, PROT_READ | PROT_WRITE,
>>>> - MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
>>>> - if (addr == MAP_FAILED)
>>>> - ksft_exit_fail_msg("mmap: %s\n", strerror(errno));
>>>> -
>>>> - ksft_print_msg("Address returned by mmap() = %p\n", addr);
>>>> -
>>>> - /* Register the memory range of the mapping we just created for
>>>> - * handling by the userfaultfd object. In mode, we request to track
>>>> - * missing pages (i.e., pages that have not yet been faulted in).
>>>> + /* Register the passed memory range for handling by the userfaultfd object.
>>> /*
>>> * ...
>>>
>>> While at it.
>>>
>>>> + * In mode, we request to track missing pages
>>>> + * (i.e., pages that have not yet been faulted in).
>>>> */
>>>> if (uffd_register(uffd, addr, len, true, false, false))
>>>> ksft_exit_fail_msg("ioctl-UFFDIO_REGISTER: %s\n", strerror(errno));
>>>> +
>>>> + ksft_print_msg("Registered memory at address %p with userfaultfd\n", addr);
>>>> }
>>>>
>>>> int main(int argc, char *argv[])
>>> Yes, that code is extremely weird. I wonder if this was some
>>> copy-and-paste from other uffd test code.
>>>
>>> Acked-by: David Hildenbrand (Arm) <david@kernel.org>
>>>
>>>
>> Hi David,
>>
>> Yes, the test operates on hugetlb mappings created with
>> MAP_HUGETLB | MAP_POPULATE and sets up userfaultfd. Because the pages are
>> already populated, registering the range with UFFDIO_REGISTER_MODE_MISSING
>> does not result in any userfaults.
>>
>> Originally, the helper function created a separate anonymous mapping and
>> registered it with userfaultfd instead of the address supplied by the
>> caller. However, the test operates on hugetlb mappings, and the registered
>> anonymous mapping is never used in the mremap() path being exercised.
>>
>> Would it be better to remove userfaultfd registration entirely from this
>> test, since that path is not actually being tested?
>
> If it's tested with your change now (which I think is what
> happens), this is fine.
>
> It was just very weird before, because it tested something fairly unrelated.
>
Thanks for the review. Yes, tested with this change and it behaves as
expected now.
^ permalink raw reply [flat|nested] 45+ messages in thread
* [PATCH v3 08/13] selftests/mm: ensure destination is hugetlb-backed in hugepage-mremap
2026-03-27 7:15 [PATCH v3 00/13] selftests/mm: fix failures and robustness improvements Sayali Patil
` (6 preceding siblings ...)
2026-03-27 7:16 ` [PATCH v3 07/13] selftest/mm: register existing mapping with userfaultfd in hugepage-mremap Sayali Patil
@ 2026-03-27 7:16 ` Sayali Patil
2026-04-01 14:21 ` David Hildenbrand (Arm)
2026-03-27 7:16 ` [PATCH v3 09/13] selftests/mm: skip uffd-wp-mremap if UFFD write-protect is unsupported Sayali Patil
` (5 subsequent siblings)
13 siblings, 1 reply; 45+ messages in thread
From: Sayali Patil @ 2026-03-27 7:16 UTC (permalink / raw)
To: Andrew Morton, Shuah Khan, linux-mm, linux-kernel,
linux-kselftest, Ritesh Harjani
Cc: David Hildenbrand, Zi Yan, Michal Hocko, Oscar Salvador,
Lorenzo Stoakes, Dev Jain, Liam.Howlett, linuxppc-dev,
Sayali Patil, Venkat Rao Bagalkote
The hugepage-mremap selftest reserves the destination address using an
anonymous base-page mapping before calling mremap() with MREMAP_FIXED,
while the source region is hugetlb-backed.
Remapping a hugetlb mapping into a base-page VMA may fail with:
mremap: Device or resource busy
This is observed on powerpc hash MMU systems where slice constraints
and page size incompatibilities prevent the remap.
Ensure the destination region is created using MAP_HUGETLB so that both
source and destination VMAs are hugetlb-backed and compatible. Also add
MAP_POPULATE to the destination mapping to prefault hugepages,
matching the behaviour used for the other hugetlb mappings in the test and
ensuring deterministic behaviour.
Update the FLAGS macro to include MAP_HUGETLB | MAP_SHARED |
MAP_POPULATE so that both mappings are hugetlb-backed and compatible.
Also use the macro for the mmap() calls to avoid repeating
the flag combination.
This ensures the test reliably exercises hugetlb mremap instead of
failing due to VMA type mismatch.
Fixes: 12b613206474 ("mm, hugepages: add hugetlb vma mremap() test")
Acked-by: Zi Yan <ziy@nvidia.com>
Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
Signed-off-by: Sayali Patil <sayalip@linux.ibm.com>
---
tools/testing/selftests/mm/hugepage-mremap.c | 11 ++++-------
1 file changed, 4 insertions(+), 7 deletions(-)
diff --git a/tools/testing/selftests/mm/hugepage-mremap.c b/tools/testing/selftests/mm/hugepage-mremap.c
index e611249080d6..48c24a4ba9a7 100644
--- a/tools/testing/selftests/mm/hugepage-mremap.c
+++ b/tools/testing/selftests/mm/hugepage-mremap.c
@@ -31,7 +31,7 @@
 #define MB_TO_BYTES(x) (x * 1024 * 1024)
 
 #define PROTECTION (PROT_READ | PROT_WRITE | PROT_EXEC)
-#define FLAGS (MAP_SHARED | MAP_ANONYMOUS)
+#define FLAGS (MAP_HUGETLB | MAP_SHARED | MAP_POPULATE)
 
 static void check_bytes(char *addr)
 {
@@ -121,23 +121,20 @@ int main(int argc, char *argv[])
 
 	/* mmap to a PUD aligned address to hopefully trigger pmd sharing. */
 	unsigned long suggested_addr = 0x7eaa40000000;
-	void *haddr = mmap((void *)suggested_addr, length, PROTECTION,
-			   MAP_HUGETLB | MAP_SHARED | MAP_POPULATE, fd, 0);
+	void *haddr = mmap((void *)suggested_addr, length, PROTECTION, FLAGS, fd, 0);
 	ksft_print_msg("Map haddr: Returned address is %p\n", haddr);
 	if (haddr == MAP_FAILED)
 		ksft_exit_fail_msg("mmap1: %s\n", strerror(errno));
 
 	/* mmap again to a dummy address to hopefully trigger pmd sharing. */
 	suggested_addr = 0x7daa40000000;
-	void *daddr = mmap((void *)suggested_addr, length, PROTECTION,
-			   MAP_HUGETLB | MAP_SHARED | MAP_POPULATE, fd, 0);
+	void *daddr = mmap((void *)suggested_addr, length, PROTECTION, FLAGS, fd, 0);
 	ksft_print_msg("Map daddr: Returned address is %p\n", daddr);
 	if (daddr == MAP_FAILED)
 		ksft_exit_fail_msg("mmap3: %s\n", strerror(errno));
 
 	suggested_addr = 0x7faa40000000;
-	void *vaddr =
-		mmap((void *)suggested_addr, length, PROTECTION, FLAGS, -1, 0);
+	void *vaddr = mmap((void *)suggested_addr, length, PROTECTION, FLAGS, fd, 0);
 	ksft_print_msg("Map vaddr: Returned address is %p\n", vaddr);
 	if (vaddr == MAP_FAILED)
 		ksft_exit_fail_msg("mmap2: %s\n", strerror(errno));
--
2.52.0
^ permalink raw reply related [flat|nested] 45+ messages in thread
* Re: [PATCH v3 08/13] selftests/mm: ensure destination is hugetlb-backed in hugepage-mremap
2026-03-27 7:16 ` [PATCH v3 08/13] selftests/mm: ensure destination is hugetlb-backed " Sayali Patil
@ 2026-04-01 14:21 ` David Hildenbrand (Arm)
2026-04-01 14:40 ` Lorenzo Stoakes (Oracle)
0 siblings, 1 reply; 45+ messages in thread
From: David Hildenbrand (Arm) @ 2026-04-01 14:21 UTC (permalink / raw)
To: Sayali Patil, Andrew Morton, Shuah Khan, linux-mm, linux-kernel,
linux-kselftest, Ritesh Harjani
Cc: Zi Yan, Michal Hocko, Oscar Salvador, Lorenzo Stoakes, Dev Jain,
Liam.Howlett, linuxppc-dev, Venkat Rao Bagalkote
On 3/27/26 08:16, Sayali Patil wrote:
> The hugepage-mremap selftest reserves the destination address using an
> anonymous base-page mapping before calling mremap() with MREMAP_FIXED,
> while the source region is hugetlb-backed.
>
> Remapping a hugetlb mapping into a base-page VMA may fail with:
>
> mremap: Device or resource busy
>
> This is observed on powerpc hash MMU systems where slice constraints
> and page size incompatibilities prevent the remap.
>
That is weird. An mremap(MREMAP_FIXED) is really just an munmap() + move.
Are we sure this is not some actual problem in the hugetlb implementation?
> Ensure the destination region is created using MAP_HUGETLB so that both
> source and destination VMAs are hugetlb-backed and compatible. Also add
> MAP_POPULATE to the destination mapping to prefault hugepages,
> matching the behaviour used for the other hugetlb mappings in the test and
> ensuring deterministic behaviour.
But then the test suddenly requires more hugetlb pages, no? I don't see
a good reason for the MAP_POPULATE, really. It will be discarded either way.
>
> Update the FLAGS macro to include MAP_HUGETLB | MAP_SHARED |
> MAP_POPULATE so that both mappings are hugetlb-backed and compatible.
> Also use the macro for the mmap() calls to avoid repeating
> the flag combination.
>
> This ensures the test reliably exercises hugetlb mremap instead of
> failing due to VMA type mismatch.
>
> Fixes: 12b613206474 ("mm, hugepages: add hugetlb vma mremap() test")
> Acked-by: Zi Yan <ziy@nvidia.com>
> Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
> Signed-off-by: Sayali Patil <sayalip@linux.ibm.com>
> ---
> tools/testing/selftests/mm/hugepage-mremap.c | 11 ++++-------
> 1 file changed, 4 insertions(+), 7 deletions(-)
>
> diff --git a/tools/testing/selftests/mm/hugepage-mremap.c b/tools/testing/selftests/mm/hugepage-mremap.c
> index e611249080d6..48c24a4ba9a7 100644
> --- a/tools/testing/selftests/mm/hugepage-mremap.c
> +++ b/tools/testing/selftests/mm/hugepage-mremap.c
> @@ -31,7 +31,7 @@
> #define MB_TO_BYTES(x) (x * 1024 * 1024)
>
> #define PROTECTION (PROT_READ | PROT_WRITE | PROT_EXEC)
> -#define FLAGS (MAP_SHARED | MAP_ANONYMOUS)
> +#define FLAGS (MAP_HUGETLB | MAP_SHARED | MAP_POPULATE)
>
> static void check_bytes(char *addr)
> {
> @@ -121,23 +121,20 @@ int main(int argc, char *argv[])
>
> /* mmap to a PUD aligned address to hopefully trigger pmd sharing. */
> unsigned long suggested_addr = 0x7eaa40000000;
> - void *haddr = mmap((void *)suggested_addr, length, PROTECTION,
> - MAP_HUGETLB | MAP_SHARED | MAP_POPULATE, fd, 0);
> + void *haddr = mmap((void *)suggested_addr, length, PROTECTION, FLAGS, fd, 0);
> ksft_print_msg("Map haddr: Returned address is %p\n", haddr);
> if (haddr == MAP_FAILED)
> ksft_exit_fail_msg("mmap1: %s\n", strerror(errno));
>
> /* mmap again to a dummy address to hopefully trigger pmd sharing. */
> suggested_addr = 0x7daa40000000;
> - void *daddr = mmap((void *)suggested_addr, length, PROTECTION,
> - MAP_HUGETLB | MAP_SHARED | MAP_POPULATE, fd, 0);
> + void *daddr = mmap((void *)suggested_addr, length, PROTECTION, FLAGS, fd, 0);
> ksft_print_msg("Map daddr: Returned address is %p\n", daddr);
> if (daddr == MAP_FAILED)
> ksft_exit_fail_msg("mmap3: %s\n", strerror(errno));
>
> suggested_addr = 0x7faa40000000;
> - void *vaddr =
> - mmap((void *)suggested_addr, length, PROTECTION, FLAGS, -1, 0);
> + void *vaddr = mmap((void *)suggested_addr, length, PROTECTION, FLAGS, fd, 0);
> ksft_print_msg("Map vaddr: Returned address is %p\n", vaddr);
> if (vaddr == MAP_FAILED)
> ksft_exit_fail_msg("mmap2: %s\n", strerror(errno));
--
Cheers,
David
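The "munmap() + move" behaviour of mremap(MREMAP_FIXED) mentioned above can
be observed with base pages on both sides, where no page-size constraint
applies. The following is a minimal sketch under that assumption; the
function name remap_over_existing() is hypothetical and it does not use
hugetlb at all, so it illustrates only the generic semantics, not the
powerpc failure.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Move a populated anonymous mapping onto an address that is already
 * occupied by another anonymous mapping.
 * Returns 0 if the source data arrived at the destination, -1 otherwise.
 */
int remap_over_existing(void)
{
	long page = sysconf(_SC_PAGESIZE);
	size_t len;
	char *src, *dst, *res;
	int ok;

	assert(page > 0);
	len = 4 * (size_t)page;

	src = mmap(NULL, len, PROT_READ | PROT_WRITE,
		   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (src == MAP_FAILED)
		return -1;

	/* Reserve the destination with an ordinary base-page mapping. */
	dst = mmap(NULL, len, PROT_READ | PROT_WRITE,
		   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (dst == MAP_FAILED) {
		munmap(src, len);
		return -1;
	}

	memset(src, 0xab, len);
	memset(dst, 0xcd, len);

	/* MREMAP_FIXED requires MREMAP_MAYMOVE; the old dst mapping is dropped. */
	res = mremap(src, len, len, MREMAP_MAYMOVE | MREMAP_FIXED, dst);
	if (res == MAP_FAILED) {
		munmap(src, len);
		munmap(dst, len);
		return -1;
	}

	/* The destination now holds the source pages, not the 0xcd filler. */
	ok = (res == dst) &&
	     ((unsigned char)res[0] == 0xab) &&
	     ((unsigned char)res[len - 1] == 0xab);

	munmap(res, len);
	return ok ? 0 : -1;
}
```

Since the previous contents of the destination are discarded by the move,
prefaulting the destination only matters for how the mapping is set up, not
for the data seen after the remap.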
^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: [PATCH v3 08/13] selftests/mm: ensure destination is hugetlb-backed in hugepage-mremap
2026-04-01 14:21 ` David Hildenbrand (Arm)
@ 2026-04-01 14:40 ` Lorenzo Stoakes (Oracle)
2026-04-01 20:39 ` Sayali Patil
0 siblings, 1 reply; 45+ messages in thread
From: Lorenzo Stoakes (Oracle) @ 2026-04-01 14:40 UTC (permalink / raw)
To: David Hildenbrand (Arm)
Cc: Sayali Patil, Andrew Morton, Shuah Khan, linux-mm, linux-kernel,
linux-kselftest, Ritesh Harjani, Zi Yan, Michal Hocko,
Oscar Salvador, Lorenzo Stoakes, Dev Jain, Liam.Howlett,
linuxppc-dev, Venkat Rao Bagalkote
On Wed, Apr 01, 2026 at 04:21:55PM +0200, David Hildenbrand (Arm) wrote:
> On 3/27/26 08:16, Sayali Patil wrote:
> > The hugepage-mremap selftest reserves the destination address using an
> > anonymous base-page mapping before calling mremap() with MREMAP_FIXED,
> > while the source region is hugetlb-backed.
> >
> > Remapping a hugetlb mapping into a base-page VMA may fail with:
> >
> > mremap: Device or resource busy
> >
> > This is observed on powerpc hash MMU systems where slice constraints
> > and page size incompatibilities prevent the remap.
OK so digging in:
mremap -> ... -> vrm_set_new_addr() -> get_unmapped_area() -> ... (in ppc arch
code) -> slice_get_unmapped_area():
unsigned long slice_get_unmapped_area(unsigned long addr, unsigned long len,
				      unsigned long flags, unsigned int psize,
				      int topdown)
{
	...
	/* bunch of checks */
	/* If we have MAP_FIXED and failed the above steps, then error out */
	if (fixed)
		return -EBUSY;
	...
}
Is presumably where we hit the issue.
> >
>
> That is weird. An mremap(MREMAP_FIXED) is really just an munmap() + move.
Yeah the weird bit I guess is that we _still_ invoke get_unmapped_area() but
with MAP_FIXED set to indicate that we want the specific address, so it's
subject to the above checks.
>
> Are we sure this is not some actual problem in the hugetlb implementation?
It seems the 'slices' check sees if the _target address_ has an equivalent page
size, presumably hugetlb-mandated, and fails if they're not equivalent, so this
change is just accounting for that.
>
> > Ensure the destination region is created using MAP_HUGETLB so that both
> > source and destination VMAs are hugetlb-backed and compatible. Also add
> > MAP_POPULATE to the destination mapping to prefault hugepages,
> > matching the behaviour used for the other hugetlb mappings in the test and
> > ensuring deterministic behaviour.
>
> But then the test suddenly requires more hugetlb pages, no? I don't see
> a good reason for the MAP_POPULATE, really. It will be discarded either way.
Yeah I'm not sure about the MAP_POPULATE being all that important here.
>
> >
> > Update the FLAGS macro to include MAP_HUGETLB | MAP_SHARED |
> > MAP_POPULATE so that both mappings are hugetlb-backed and compatible.
> > Also use the macro for the mmap() calls to avoid repeating
> > the flag combination.
> >
> > This ensures the test reliably exercises hugetlb mremap instead of
> > failing due to VMA type mismatch.
> >
> > Fixes: 12b613206474 ("mm, hugepages: add hugetlb vma mremap() test")
> > Acked-by: Zi Yan <ziy@nvidia.com>
> > Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
> > Signed-off-by: Sayali Patil <sayalip@linux.ibm.com>
> > ---
> > tools/testing/selftests/mm/hugepage-mremap.c | 11 ++++-------
> > 1 file changed, 4 insertions(+), 7 deletions(-)
> >
> > diff --git a/tools/testing/selftests/mm/hugepage-mremap.c b/tools/testing/selftests/mm/hugepage-mremap.c
> > index e611249080d6..48c24a4ba9a7 100644
> > --- a/tools/testing/selftests/mm/hugepage-mremap.c
> > +++ b/tools/testing/selftests/mm/hugepage-mremap.c
> > @@ -31,7 +31,7 @@
> > #define MB_TO_BYTES(x) (x * 1024 * 1024)
> >
> > #define PROTECTION (PROT_READ | PROT_WRITE | PROT_EXEC)
> > -#define FLAGS (MAP_SHARED | MAP_ANONYMOUS)
> > +#define FLAGS (MAP_HUGETLB | MAP_SHARED | MAP_POPULATE)
> >
> > static void check_bytes(char *addr)
> > {
> > @@ -121,23 +121,20 @@ int main(int argc, char *argv[])
> >
> > /* mmap to a PUD aligned address to hopefully trigger pmd sharing. */
> > unsigned long suggested_addr = 0x7eaa40000000;
> > - void *haddr = mmap((void *)suggested_addr, length, PROTECTION,
> > - MAP_HUGETLB | MAP_SHARED | MAP_POPULATE, fd, 0);
> > + void *haddr = mmap((void *)suggested_addr, length, PROTECTION, FLAGS, fd, 0);
> > ksft_print_msg("Map haddr: Returned address is %p\n", haddr);
> > if (haddr == MAP_FAILED)
> > ksft_exit_fail_msg("mmap1: %s\n", strerror(errno));
> >
> > /* mmap again to a dummy address to hopefully trigger pmd sharing. */
> > suggested_addr = 0x7daa40000000;
> > - void *daddr = mmap((void *)suggested_addr, length, PROTECTION,
> > - MAP_HUGETLB | MAP_SHARED | MAP_POPULATE, fd, 0);
> > + void *daddr = mmap((void *)suggested_addr, length, PROTECTION, FLAGS, fd, 0);
> > ksft_print_msg("Map daddr: Returned address is %p\n", daddr);
> > if (daddr == MAP_FAILED)
> > ksft_exit_fail_msg("mmap3: %s\n", strerror(errno));
> >
> > suggested_addr = 0x7faa40000000;
> > - void *vaddr =
> > - mmap((void *)suggested_addr, length, PROTECTION, FLAGS, -1, 0);
> > + void *vaddr = mmap((void *)suggested_addr, length, PROTECTION, FLAGS, fd, 0);
> > ksft_print_msg("Map vaddr: Returned address is %p\n", vaddr);
> > if (vaddr == MAP_FAILED)
> > ksft_exit_fail_msg("mmap2: %s\n", strerror(errno));
>
>
> --
> Cheers,
>
> David
Cheers, Lorenzo
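The page-size check described above can be pictured with a toy model. To be
clear, this is not the powerpc slice code: the enum, the array of per-slice
page sizes and the function toy_fixed_check() are all hypothetical, and the
model assumes only what is stated in this thread, namely that a MAP_FIXED
request fails with -EBUSY when the target range does not already have an
equivalent page size.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical page-size classes, for illustration only. */
enum toy_psize { TOY_PSIZE_BASE, TOY_PSIZE_HUGE };

/*
 * Toy version of the MAP_FIXED decision: succeed only if every slice in
 * [first, first + count) already uses the wanted page size.
 */
int toy_fixed_check(const enum toy_psize *slice_psize, size_t nr_slices,
		    size_t first, size_t count, enum toy_psize want)
{
	size_t i;

	assert(first + count <= nr_slices);

	for (i = first; i < first + count; i++) {
		if (slice_psize[i] != want)
			return -EBUSY;	/* fixed address, incompatible slice */
	}
	return 0;
}
```

In this picture, reserving the destination with a base-page mapping leaves
the target slices at the base page size, so a hugetlb move to that fixed
address is refused, while a MAP_HUGETLB reservation makes the sizes match.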
^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: [PATCH v3 08/13] selftests/mm: ensure destination is hugetlb-backed in hugepage-mremap
2026-04-01 14:40 ` Lorenzo Stoakes (Oracle)
@ 2026-04-01 20:39 ` Sayali Patil
2026-04-02 7:33 ` David Hildenbrand (Arm)
0 siblings, 1 reply; 45+ messages in thread
From: Sayali Patil @ 2026-04-01 20:39 UTC (permalink / raw)
To: Lorenzo Stoakes (Oracle), David Hildenbrand (Arm)
Cc: Andrew Morton, Shuah Khan, linux-mm, linux-kernel,
linux-kselftest, Ritesh Harjani, Zi Yan, Michal Hocko,
Oscar Salvador, Lorenzo Stoakes, Dev Jain, Liam.Howlett,
linuxppc-dev, Venkat Rao Bagalkote
On 01/04/26 20:10, Lorenzo Stoakes (Oracle) wrote:
> On Wed, Apr 01, 2026 at 04:21:55PM +0200, David Hildenbrand (Arm) wrote:
>> On 3/27/26 08:16, Sayali Patil wrote:
>>> The hugepage-mremap selftest reserves the destination address using an
>>> anonymous base-page mapping before calling mremap() with MREMAP_FIXED,
>>> while the source region is hugetlb-backed.
>>>
>>> Remapping a hugetlb mapping into a base-page VMA may fail with:
>>>
>>> mremap: Device or resource busy
>>>
>>> This is observed on powerpc hash MMU systems where slice constraints
>>> and page size incompatibilities prevent the remap.
>
> OK so digging in:
>
> mremap -> ... -> vrm_set_new_addr() -> get_unmapped_area() -> ... (in ppc arch
> code) -> slice_get_unmapped_area():
>
> unsigned long slice_get_unmapped_area(unsigned long addr, unsigned long len,
> unsigned long flags, unsigned int psize,
> int topdown)
> {
> ...
> /* bunch of checks */
>
> /* If we have MAP_FIXED and failed the above steps, then error out */
> if (fixed)
> return -EBUSY;
>
> ...
> }
>
> Is presumably where we hit the issue.
>
>>>
>>
>> That is weird. An mremap(MREMAP_FIXED) is really just an munmap() + move.
>
> Yeah the weird bit I guess is that we _still_ invoke get_unmapped_area() but
> with MAP_FIXED set to indicate that we want the specific address, so it's
> subject to the above checks.
>
>>
>> Are we sure this is not some actual problem in the hugetlb implementation?
>
> It seems the 'slices' check sees if the _target address_ has an equivalent page
> size, presumably hugetlb-mandated, and fails if they're not equivalent, so this
> change is just accounting for that.
>
Yes, this change accounts for that by ensuring the destination is
created with MAP_HUGETLB so it has the same page size as the source.
>
>>
>>> Ensure the destination region is created using MAP_HUGETLB so that both
>>> source and destination VMAs are hugetlb-backed and compatible. Also add
>>> MAP_POPULATE to the destination mapping to prefault hugepages,
>>> matching the behaviour used for other hugetlb mappings in the test and
>>> ensuring deterministic behaviour.
>>
>> But then the test suddenly requires more hugetlb pages, no? I don't see
>> a good reason for the MAP_POPULATE, really. It will be discarded either way.
>
> Yeah I'm not sure about the MAP_POPULATE being all that important here.
>
As far as I understand, without MAP_POPULATE, memory accesses would
trigger userfaults, and since the test is single-threaded and has no
background handler for the uffd, it would deadlock. MAP_POPULATE ensures
the test runs correctly by prefaulting all pages, but please let me know
if I’m mistaken.
>>
>>>
>>> Update the FLAGS macro to include MAP_HUGETLB | MAP_SHARED |
>>> MAP_POPULATE so that both mappings are hugetlb-backed and compatible.
>>> Also use the macro for the mmap() calls to avoid repeating
>>> the flag combination.
>>>
>>> This ensures the test reliably exercises hugetlb mremap instead of
>>> failing due to VMA type mismatch.
>>>
>>> Fixes: 12b613206474 ("mm, hugepages: add hugetlb vma mremap() test")
>>> Acked-by: Zi Yan <ziy@nvidia.com>
>>> Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
>>> Signed-off-by: Sayali Patil <sayalip@linux.ibm.com>
>>> ---
>>> tools/testing/selftests/mm/hugepage-mremap.c | 11 ++++-------
>>> 1 file changed, 4 insertions(+), 7 deletions(-)
>>>
>>> diff --git a/tools/testing/selftests/mm/hugepage-mremap.c b/tools/testing/selftests/mm/hugepage-mremap.c
>>> index e611249080d6..48c24a4ba9a7 100644
>>> --- a/tools/testing/selftests/mm/hugepage-mremap.c
>>> +++ b/tools/testing/selftests/mm/hugepage-mremap.c
>>> @@ -31,7 +31,7 @@
>>> #define MB_TO_BYTES(x) (x * 1024 * 1024)
>>>
>>> #define PROTECTION (PROT_READ | PROT_WRITE | PROT_EXEC)
>>> -#define FLAGS (MAP_SHARED | MAP_ANONYMOUS)
>>> +#define FLAGS (MAP_HUGETLB | MAP_SHARED | MAP_POPULATE)
>>>
>>> static void check_bytes(char *addr)
>>> {
>>> @@ -121,23 +121,20 @@ int main(int argc, char *argv[])
>>>
>>> /* mmap to a PUD aligned address to hopefully trigger pmd sharing. */
>>> unsigned long suggested_addr = 0x7eaa40000000;
>>> - void *haddr = mmap((void *)suggested_addr, length, PROTECTION,
>>> - MAP_HUGETLB | MAP_SHARED | MAP_POPULATE, fd, 0);
>>> + void *haddr = mmap((void *)suggested_addr, length, PROTECTION, FLAGS, fd, 0);
>>> ksft_print_msg("Map haddr: Returned address is %p\n", haddr);
>>> if (haddr == MAP_FAILED)
>>> ksft_exit_fail_msg("mmap1: %s\n", strerror(errno));
>>>
>>> /* mmap again to a dummy address to hopefully trigger pmd sharing. */
>>> suggested_addr = 0x7daa40000000;
>>> - void *daddr = mmap((void *)suggested_addr, length, PROTECTION,
>>> - MAP_HUGETLB | MAP_SHARED | MAP_POPULATE, fd, 0);
>>> + void *daddr = mmap((void *)suggested_addr, length, PROTECTION, FLAGS, fd, 0);
>>> ksft_print_msg("Map daddr: Returned address is %p\n", daddr);
>>> if (daddr == MAP_FAILED)
>>> ksft_exit_fail_msg("mmap3: %s\n", strerror(errno));
>>>
>>> suggested_addr = 0x7faa40000000;
>>> - void *vaddr =
>>> - mmap((void *)suggested_addr, length, PROTECTION, FLAGS, -1, 0);
>>> + void *vaddr = mmap((void *)suggested_addr, length, PROTECTION, FLAGS, fd, 0);
>>> ksft_print_msg("Map vaddr: Returned address is %p\n", vaddr);
>>> if (vaddr == MAP_FAILED)
>>> ksft_exit_fail_msg("mmap2: %s\n", strerror(errno));
>>
>>
>> --
>> Cheers,
>>
>> David
>
> Cheers, Lorenzo
* Re: [PATCH v3 08/13] selftests/mm: ensure destination is hugetlb-backed in hugepage-mremap
2026-04-01 20:39 ` Sayali Patil
@ 2026-04-02 7:33 ` David Hildenbrand (Arm)
2026-04-02 9:05 ` Lorenzo Stoakes (Oracle)
0 siblings, 1 reply; 45+ messages in thread
From: David Hildenbrand (Arm) @ 2026-04-02 7:33 UTC (permalink / raw)
To: Sayali Patil, Lorenzo Stoakes (Oracle)
Cc: Andrew Morton, Shuah Khan, linux-mm, linux-kernel,
linux-kselftest, Ritesh Harjani, Zi Yan, Michal Hocko,
Oscar Salvador, Lorenzo Stoakes, Dev Jain, Liam.Howlett,
linuxppc-dev, Venkat Rao Bagalkote
On 4/1/26 22:39, Sayali Patil wrote:
>
>
> On 01/04/26 20:10, Lorenzo Stoakes (Oracle) wrote:
>> On Wed, Apr 01, 2026 at 04:21:55PM +0200, David Hildenbrand (Arm) wrote:
>>
>> OK so digging in:
>>
>> mremap -> ... -> vrm_set_new_addr() -> get_unmapped_area() -> ... (in
>> ppc arch
>> code) -> slice_get_unmapped_area():
>>
>> unsigned long slice_get_unmapped_area(unsigned long addr, unsigned
>> long len,
>> unsigned long flags, unsigned int psize,
>> int topdown)
>> {
>> ...
>> /* bunch of checks */
>>
>> /* If we have MAP_FIXED and failed the above steps, then error out */
>> if (fixed)
>> return -EBUSY;
>>
>> ...
>> }
>>
>> Is presumably where we hit the issue.
>>
>>>
>>> That is weird. An mremap(MREMAP_FIXED) is really just an munmap() +
>>> move.
>>
>> Yeah the weird bit I guess is that we _still_ invoke
>> get_unmapped_area() but
>> with MAP_FIXED set to indicate that we want the specific address, so it's
>> subject to the above checks.
>>
>>>
>>> Are we sure this is not some actual problem in the hugetlb
>>> implementation?
>>
>> It seems the 'slices' check sees if the _target address_ has an
>> equivalent page
>> size, presumably hugetlb-mandated, and fails if they're not
>> equivalent, so this
>> change is just accounting for that.
>>
> Yes, this change accounts for that by ensuring the destination is
> created with MAP_HUGETLB so it has the same page size as the source.
Okay, weird, so it's the right thing to do to cover all odd arch behavior.
>>
>>>
>>>
>>> But then the test suddenly requires more hugetlb pages, no? I don't see
>>> a good reason for the MAP_POPULATE, really. It will be discarded
>>> either way.
>>
>> Yeah I'm not sure about the MAP_POPULATE being all that important here.
>>
> As far as I understand, without MAP_POPULATE, memory accesses would
> trigger userfaults, and since the test is single-threaded and has no
> background handler for the uffd, it would deadlock. MAP_POPULATE ensures
> the test runs correctly by prefaulting all pages, but please let me know
> if I’m mistaken.
So you are saying the test would deadlock if you are not adding
MAP_POPULATE? If so, please double check if that is actually the case.
And if it's actually the case, please carefully document that in the
patch description, and probably as a comment above the MAP_POPULATE usage.
--
Cheers,
David
* Re: [PATCH v3 08/13] selftests/mm: ensure destination is hugetlb-backed in hugepage-mremap
2026-04-02 7:33 ` David Hildenbrand (Arm)
@ 2026-04-02 9:05 ` Lorenzo Stoakes (Oracle)
2026-04-03 17:41 ` Sayali Patil
0 siblings, 1 reply; 45+ messages in thread
From: Lorenzo Stoakes (Oracle) @ 2026-04-02 9:05 UTC (permalink / raw)
To: David Hildenbrand (Arm)
Cc: Sayali Patil, Andrew Morton, Shuah Khan, linux-mm, linux-kernel,
linux-kselftest, Ritesh Harjani, Zi Yan, Michal Hocko,
Oscar Salvador, Lorenzo Stoakes, Dev Jain, Liam.Howlett,
linuxppc-dev, Venkat Rao Bagalkote
On Thu, Apr 02, 2026 at 09:33:29AM +0200, David Hildenbrand (Arm) wrote:
> On 4/1/26 22:39, Sayali Patil wrote:
> >
> >
> > On 01/04/26 20:10, Lorenzo Stoakes (Oracle) wrote:
> >> On Wed, Apr 01, 2026 at 04:21:55PM +0200, David Hildenbrand (Arm) wrote:
> >>
> >> OK so digging in:
> >>
> >> mremap -> ... -> vrm_set_new_addr() -> get_unmapped_area() -> ... (in
> >> ppc arch
> >> code) -> slice_get_unmapped_area():
> >>
> >> unsigned long slice_get_unmapped_area(unsigned long addr, unsigned
> >> long len,
> >> unsigned long flags, unsigned int psize,
> >> int topdown)
> >> {
> >> ...
> >> /* bunch of checks */
> >>
> >> /* If we have MAP_FIXED and failed the above steps, then error out */
> >> if (fixed)
> >> return -EBUSY;
> >>
> >> ...
> >> }
> >>
> >> Is presumably where we hit the issue.
> >>
> >>>
> >>> That is weird. An mremap(MREMAP_FIXED) is really just an munmap() +
> >>> move.
> >>
> >> Yeah the weird bit I guess is that we _still_ invoke
> >> get_unmapped_area() but
> >> with MAP_FIXED set to indicate that we want the specific address, so it's
> >> subject to the above checks.
> >>
> >>>
> >>> Are we sure this is not some actual problem in the hugetlb
> >>> implementation?
> >>
> >> It seems the 'slices' check sees if the _target address_ has an
> >> equivalent page
> >> size, presumably hugetlb-mandated, and fails if they're not
> >> equivalent, so this
> >> change is just accounting for that.
> >>
> > Yes, this change accounts for that by ensuring the destination is
> > created with MAP_HUGETLB so it has the same page size as the source.
>
> Okay, weird, so it's the right thing to do to cover all odd arch behavior.
>
> >>
> >>>
> >>>
> >>> But then the test suddenly requires more hugetlb pages, no? I don't see
> >>> a good reason for the MAP_POPULATE, really. It will be discarded
> >>> either way.
> >>
> >> Yeah I'm not sure about the MAP_POPULATE being all that important here.
> >>
> > As far as I understand, without MAP_POPULATE, memory accesses would
> > trigger userfaults, and since the test is single-threaded and has no
> > background handler for the uffd, it would deadlock. MAP_POPULATE ensures
> > the test runs correctly by prefaulting all pages, but please let me know
> > if I’m mistaken.
>
> So you are saying the test would deadlock if you are not adding
> MAP_POPULATE? If so, please double check if that is actually the case.
>
> And if it's actually the case, please carefully document that in the
> patch description, and probably as a comment above the MAP_POPULATE usage.
Do keep in mind MAP_POPULATE is not _guaranteed_ to work :)
For guaranteed populate you need madvise(..., MADV_POPULATE_[READ/WRITE]) or to
directly fault in.
>
> --
> Cheers,
>
> David
Cheers, Lorenzo
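
The distinction Lorenzo draws can be sketched as follows. This is a minimal illustration, not code from the patch: populate_range() is a hypothetical helper, and the fallback value 23 for MADV_POPULATE_WRITE is taken from the generic Linux uapi headers for toolchains that do not define it. It asks the kernel to prefault a range with madvise() and, if the kernel rejects the advice with EINVAL (assumed here to mean an older kernel), falls back to touching each page directly.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

#ifndef MADV_POPULATE_WRITE
#define MADV_POPULATE_WRITE 23	/* asm-generic uapi value, Linux 5.14+ */
#endif

/*
 * Prefault [addr, addr + len) for writing.
 * Returns 0 on success, -1 with errno set on failure.
 * populate_range() is a hypothetical helper, not part of the selftests.
 */
static int populate_range(void *addr, size_t len, size_t page_size)
{
	volatile char *p = addr;
	size_t off;

	/* Unlike MAP_POPULATE, this reports an error if population fails. */
	if (madvise(addr, len, MADV_POPULATE_WRITE) == 0)
		return 0;

	/* Anything other than "advice not recognised" is a real failure. */
	if (errno != EINVAL)
		return -1;

	/* Assumed older kernel: fault in each page by writing back its own byte. */
	for (off = 0; off < len; off += page_size)
		p[off] = p[off];

	return 0;
}
```

For a hugetlb mapping the page_size argument would be the huge page size, so that each huge page is touched exactly once.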
* Re: [PATCH v3 08/13] selftests/mm: ensure destination is hugetlb-backed in hugepage-mremap
2026-04-02 9:05 ` Lorenzo Stoakes (Oracle)
@ 2026-04-03 17:41 ` Sayali Patil
0 siblings, 0 replies; 45+ messages in thread
From: Sayali Patil @ 2026-04-03 17:41 UTC (permalink / raw)
To: Lorenzo Stoakes (Oracle), David Hildenbrand (Arm)
Cc: Andrew Morton, Shuah Khan, linux-mm, linux-kernel,
linux-kselftest, Ritesh Harjani, Zi Yan, Michal Hocko,
Oscar Salvador, Lorenzo Stoakes, Dev Jain, Liam.Howlett,
linuxppc-dev, Venkat Rao Bagalkote
On 02/04/26 14:35, Lorenzo Stoakes (Oracle) wrote:
> On Thu, Apr 02, 2026 at 09:33:29AM +0200, David Hildenbrand (Arm) wrote:
>> On 4/1/26 22:39, Sayali Patil wrote:
>>>
>>>
>>> On 01/04/26 20:10, Lorenzo Stoakes (Oracle) wrote:
>>>> On Wed, Apr 01, 2026 at 04:21:55PM +0200, David Hildenbrand (Arm) wrote:
>>>>
>>>> OK so digging in:
>>>>
>>>> mremap -> ... -> vrm_set_new_addr() -> get_unmapped_area() -> ... (in
>>>> ppc arch
>>>> code) -> slice_get_unmapped_area():
>>>>
>>>> unsigned long slice_get_unmapped_area(unsigned long addr, unsigned
>>>> long len,
>>>> unsigned long flags, unsigned int psize,
>>>> int topdown)
>>>> {
>>>> ...
>>>> /* bunch of checks */
>>>>
>>>> /* If we have MAP_FIXED and failed the above steps, then error out */
>>>> if (fixed)
>>>> return -EBUSY;
>>>>
>>>> ...
>>>> }
>>>>
>>>> Is presumably where we hit the issue.
>>>>
>>>>>
>>>>> That is weird. An mremap(MREMAP_FIXED) is really just an munmap() +
>>>>> move.
>>>>
>>>> Yeah the weird bit I guess is that we _still_ invoke
>>>> get_unmapped_area() but
>>>> with MAP_FIXED set to indicate that we want the specific address, so it's
>>>> subject to the above checks.
>>>>
>>>>>
>>>>> Are we sure this is not some actual problem in the hugetlb
>>>>> implementation?
>>>>
>>>> It seems the 'slices' check sees if the _target address_ has an
>>>> equivalent page
>>>> size, presumably hugetlb-mandated, and fails if they're not
>>>> equivalent, so this
>>>> change is just accounting for that.
>>>>
>>> Yes, this change accounts for that by ensuring the destination is
>>> created with MAP_HUGETLB so it has the same page size as the source.
>>
>> Okay, weird, so it's the right thing to do to cover all odd arch behavior.
>>
>>>>
>>>>>
>>>>>
>>>>> But then the test suddenly requires more hugetlb pages, no? I don't see
>>>>> a good reason for the MAP_POPULATE, really. It will be discarded
>>>>> either way.
>>>>
>>>> Yeah I'm not sure about the MAP_POPULATE being all that important here.
>>>>
>>> As far as I understand, without MAP_POPULATE, memory accesses would
>>> trigger userfaults, and since the test is single-threaded and has no
>>> background handler for the uffd, it would deadlock. MAP_POPULATE ensures
>>> the test runs correctly by prefaulting all pages, but please let me know
>>> if I’m mistaken.
>>
>> So you are saying the test would deadlock if you are not adding
>> MAP_POPULATE? If so, please double check if that is actually the case.
>>
>> And if it's actually the case, please carefully document that in the
>> patch description, and probably as a comment above the MAP_POPULATE usage.
>
> Do keep in mind MAP_POPULATE is not _guaranteed_ to work :)
>
> For guaranteed populate you need madvise(..., MADV_POPULATE_[READ/WRITE]) or to
> directly fault in.
>
>>
>> --
>> Cheers,
>>
>> David
>
> Cheers, Lorenzo
>
Thanks David and Lorenzo for the input.
I tested without MAP_POPULATE and the test works fine without it.
I will remove it in the next version.
Thanks,
Sayali
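
The outcome of this sub-thread (a hugetlb-backed destination, without MAP_POPULATE) might look roughly like the sketch below. It is an illustration under stated assumptions, not the v4 patch: move_onto_hugetlb_placeholder() is a hypothetical helper, and it reserves the destination with an anonymous MAP_HUGETLB | MAP_NORESERVE mapping, whereas the test itself maps its hugetlb fd. MAP_NORESERVE is used here only so that the placeholder does not take an extra hugetlb reservation.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>

/*
 * Move the hugetlb mapping at src (len bytes, a multiple of the huge page
 * size) onto a freshly reserved hugetlb-backed destination.
 * Returns the new address, or MAP_FAILED with errno set.
 * Hypothetical helper for illustration only.
 */
static void *move_onto_hugetlb_placeholder(void *src, size_t len)
{
	int saved_errno;
	void *moved;
	void *dst;

	/* Placeholder with the same page size as the source; no MAP_POPULATE. */
	dst = mmap(NULL, len, PROT_READ | PROT_WRITE,
		   MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB | MAP_NORESERVE,
		   -1, 0);
	if (dst == MAP_FAILED)
		return MAP_FAILED;

	/* MREMAP_FIXED replaces whatever is mapped at dst with the source. */
	moved = mremap(src, len, len, MREMAP_MAYMOVE | MREMAP_FIXED, dst);
	if (moved == MAP_FAILED) {
		saved_errno = errno;
		munmap(dst, len);	/* drop the placeholder if it still exists */
		errno = saved_errno;
	}
	return moved;
}
```

On success the placeholder is gone and the returned address equals the reserved destination, so the caller only needs to munmap() the returned range.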
* [PATCH v3 09/13] selftests/mm: skip uffd-wp-mremap if UFFD write-protect is unsupported
2026-03-27 7:15 [PATCH v3 00/13] selftests/mm: fix failures and robustness improvements Sayali Patil
` (7 preceding siblings ...)
2026-03-27 7:16 ` [PATCH v3 08/13] selftests/mm: ensure destination is hugetlb-backed " Sayali Patil
@ 2026-03-27 7:16 ` Sayali Patil
2026-04-02 6:59 ` Sayali Patil
2026-03-27 7:16 ` [PATCH v3 10/13] selftests/mm: skip uffd-stress test when nr_pages_per_cpu is zero Sayali Patil
` (4 subsequent siblings)
13 siblings, 1 reply; 45+ messages in thread
From: Sayali Patil @ 2026-03-27 7:16 UTC (permalink / raw)
To: Andrew Morton, Shuah Khan, linux-mm, linux-kernel,
linux-kselftest, Ritesh Harjani
Cc: David Hildenbrand, Zi Yan, Michal Hocko, Oscar Salvador,
Lorenzo Stoakes, Dev Jain, Liam.Howlett, linuxppc-dev,
Sayali Patil
The uffd-wp-mremap test requires the UFFD_FEATURE_PAGEFAULT_FLAG_WP
capability. On systems where userfaultfd write-protect is
not supported, uffd_register() fails and the test reports failures.
Check for the required feature at startup and skip the test when the
UFFD_FEATURE_PAGEFAULT_FLAG_WP capability is not present,
preventing false failures on unsupported configurations.
Before patch:
running ./uffd-wp-mremap
------------------------
[INFO] detected THP size: 256 KiB
[INFO] detected THP size: 512 KiB
[INFO] detected THP size: 1024 KiB
[INFO] detected THP size: 2048 KiB
[INFO] detected hugetlb page size: 2048 KiB
[INFO] detected hugetlb page size: 1048576 KiB
1..24
[RUN] test_one_folio(size=65536, private=false, swapout=false,
hugetlb=false)
not ok 1 uffd_register() failed
[RUN] test_one_folio(size=65536, private=true, swapout=false,
hugetlb=false)
not ok 2 uffd_register() failed
[RUN] test_one_folio(size=65536, private=false, swapout=true,
hugetlb=false)
not ok 3 uffd_register() failed
[RUN] test_one_folio(size=65536, private=true, swapout=true,
hugetlb=false)
not ok 4 uffd_register() failed
[RUN] test_one_folio(size=262144, private=false, swapout=false,
hugetlb=false)
not ok 5 uffd_register() failed
[RUN] test_one_folio(size=524288, private=false, swapout=false,
hugetlb=false)
not ok 6 uffd_register() failed
.
.
.
Bail out! 24 out of 24 tests failed
Totals: pass:0 fail:24 xfail:0 xpass:0 skip:0 error:0
[FAIL]
not ok 1 uffd-wp-mremap # exit=1
After patch:
running ./uffd-wp-mremap
------------------------
1..0 # SKIP uffd-wp feature not supported
[SKIP]
ok 1 uffd-wp-mremap # SKIP
Acked-by: Zi Yan <ziy@nvidia.com>
Acked-by: David Hildenbrand (Arm) <david@kernel.org>
Signed-off-by: Sayali Patil <sayalip@linux.ibm.com>
---
tools/testing/selftests/mm/uffd-wp-mremap.c | 13 +++++++++++++
1 file changed, 13 insertions(+)
diff --git a/tools/testing/selftests/mm/uffd-wp-mremap.c b/tools/testing/selftests/mm/uffd-wp-mremap.c
index 17186d4a4147..6edbd09f0ca6 100644
--- a/tools/testing/selftests/mm/uffd-wp-mremap.c
+++ b/tools/testing/selftests/mm/uffd-wp-mremap.c
@@ -19,6 +19,17 @@ static size_t thpsizes[20];
static int nr_hugetlbsizes;
static size_t hugetlbsizes[10];
+static void check_uffd_wp_feature_supported(void)
+{
+ uint64_t features;
+
+ if (uffd_get_features(&features) && errno == ENOENT)
+ ksft_exit_skip("failed to get available features (%d)\n", errno);
+
+ if (!(features & UFFD_FEATURE_PAGEFAULT_FLAG_WP))
+ ksft_exit_skip("uffd-wp feature not supported\n");
+}
+
static int detect_thp_sizes(size_t sizes[], int max)
{
int count = 0;
@@ -336,6 +347,8 @@ int main(int argc, char **argv)
struct thp_settings settings;
int i, j, plan = 0;
+ check_uffd_wp_feature_supported();
+
pagesize = getpagesize();
nr_thpsizes = detect_thp_sizes(thpsizes, ARRAY_SIZE(thpsizes));
nr_hugetlbsizes = detect_hugetlb_page_sizes(hugetlbsizes,
--
2.52.0
* Re: [PATCH v3 09/13] selftests/mm: skip uffd-wp-mremap if UFFD write-protect is unsupported
2026-03-27 7:16 ` [PATCH v3 09/13] selftests/mm: skip uffd-wp-mremap if UFFD write-protect is unsupported Sayali Patil
@ 2026-04-02 6:59 ` Sayali Patil
0 siblings, 0 replies; 45+ messages in thread
From: Sayali Patil @ 2026-04-02 6:59 UTC (permalink / raw)
To: Andrew Morton, Shuah Khan, linux-mm, linux-kernel,
linux-kselftest, Ritesh Harjani
Cc: David Hildenbrand, Zi Yan, Michal Hocko, Oscar Salvador,
Lorenzo Stoakes, Dev Jain, Liam.Howlett, linuxppc-dev
On 27/03/26 12:46, Sayali Patil wrote:
> The uffd-wp-mremap test requires the UFFD_FEATURE_PAGEFAULT_FLAG_WP
> capability. On systems where userfaultfd write-protect is
> not supported, uffd_register() fails and the test reports failures.
>
> Check for the required feature at startup and skip the test when the
> UFFD_FEATURE_PAGEFAULT_FLAG_WP capability is not present,
> preventing false failures on unsupported configurations.
>
> Before patch:
> running ./uffd-wp-mremap
> ------------------------
> [INFO] detected THP size: 256 KiB
> [INFO] detected THP size: 512 KiB
> [INFO] detected THP size: 1024 KiB
> [INFO] detected THP size: 2048 KiB
> [INFO] detected hugetlb page size: 2048 KiB
> [INFO] detected hugetlb page size: 1048576 KiB
> 1..24
> [RUN] test_one_folio(size=65536, private=false, swapout=false,
> hugetlb=false)
> not ok 1 uffd_register() failed
> [RUN] test_one_folio(size=65536, private=true, swapout=false,
> hugetlb=false)
> not ok 2 uffd_register() failed
> [RUN] test_one_folio(size=65536, private=false, swapout=true,
> hugetlb=false)
> not ok 3 uffd_register() failed
> [RUN] test_one_folio(size=65536, private=true, swapout=true,
> hugetlb=false)
> not ok 4 uffd_register() failed
> [RUN] test_one_folio(size=262144, private=false, swapout=false,
> hugetlb=false)
> not ok 5 uffd_register() failed
> [RUN] test_one_folio(size=524288, private=false, swapout=false,
> hugetlb=false)
> not ok 6 uffd_register() failed
> .
> .
> .
> Bail out! 24 out of 24 tests failed
> Totals: pass:0 fail:24 xfail:0 xpass:0 skip:0 error:0
> [FAIL]
> not ok 1 uffd-wp-mremap # exit=1
>
> After patch:
> running ./uffd-wp-mremap
> ------------------------
> 1..0 # SKIP uffd-wp feature not supported
> [SKIP]
> ok 1 uffd-wp-mremap # SKIP
>
> Acked-by: Zi Yan <ziy@nvidia.com>
> Acked-by: David Hildenbrand (Arm) <david@kernel.org>
> Signed-off-by: Sayali Patil <sayalip@linux.ibm.com>
> ---
> tools/testing/selftests/mm/uffd-wp-mremap.c | 13 +++++++++++++
> 1 file changed, 13 insertions(+)
>
> diff --git a/tools/testing/selftests/mm/uffd-wp-mremap.c b/tools/testing/selftests/mm/uffd-wp-mremap.c
> index 17186d4a4147..6edbd09f0ca6 100644
> --- a/tools/testing/selftests/mm/uffd-wp-mremap.c
> +++ b/tools/testing/selftests/mm/uffd-wp-mremap.c
> @@ -19,6 +19,17 @@ static size_t thpsizes[20];
> static int nr_hugetlbsizes;
> static size_t hugetlbsizes[10];
>
> +static void check_uffd_wp_feature_supported(void)
> +{
> + uint64_t features;
> +
> + if (uffd_get_features(&features) && errno == ENOENT)
> + ksft_exit_skip("failed to get available features (%d)\n", errno);
> +
> + if (!(features & UFFD_FEATURE_PAGEFAULT_FLAG_WP))
> + ksft_exit_skip("uffd-wp feature not supported\n");
> +}
> +
> static int detect_thp_sizes(size_t sizes[], int max)
> {
> int count = 0;
> @@ -336,6 +347,8 @@ int main(int argc, char **argv)
> struct thp_settings settings;
> int i, j, plan = 0;
>
> + check_uffd_wp_feature_supported();
> +
> pagesize = getpagesize();
> nr_thpsizes = detect_thp_sizes(thpsizes, ARRAY_SIZE(thpsizes));
> nr_hugetlbsizes = detect_hugetlb_page_sizes(hugetlbsizes,
AI review question:
> commit 22914640356d1eaf4e0a8a8b13ad71911d12db50
> Author: Sayali Patil <sayalip@linux.ibm.com>
> selftests/mm: skip uffd-wp-mremap if UFFD write-protect is unsupported
> This commit adds a check at startup to verify if the userfaultfd
> write-protect capability is present. It skips the test instead of reporting
> false failures on unsupported configurations.
> > diff --git a/tools/testing/selftests/mm/uffd-wp-mremap.c b/tools/testing/selftests/mm/uffd-wp-mremap.c
> > index 17186d4a4147..6edbd09f0ca6 100644
> > --- a/tools/testing/selftests/mm/uffd-wp-mremap.c
> > +++ b/tools/testing/selftests/mm/uffd-wp-mremap.c
> > @@ -19,6 +19,17 @@ static size_t thpsizes[20];
> > static int nr_hugetlbsizes;
> > static size_t hugetlbsizes[10];
>
> > +static void check_uffd_wp_feature_supported(void)
> > +{
> > + uint64_t features;
> > +
> > + if (uffd_get_features(&features) && errno == ENOENT)
> > + ksft_exit_skip("failed to get available features (%d)\n", errno);
> If uffd_get_features() fails but errno is something other than ENOENT (for
> example, EPERM when unprivileged userfaultfd is disabled), this check will
> evaluate to false and execution will continue.
> > +
> > + if (!(features & UFFD_FEATURE_PAGEFAULT_FLAG_WP))
> Could this evaluate uninitialized stack memory?
> If uffd_get_features() fails and is bypassed by the previous check, the
> features variable will remain uninitialized. This might cause the test to
> falsely pass or fail depending on whatever data happens to be on the stack.
> Should we initialize features to zero or exit on all uffd_get_features()
> failures instead?
> > + ksft_exit_skip("uffd-wp feature not supported\n");
> > +}
Yes. If uffd_get_features() fails with an error other than ENOENT, the
features variable remains uninitialized, which could lead to
undefined behavior in the subsequent check.
I will fix this in the next patch version.
Thanks,
Sayali
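
One way to address both review points (zero-initialise the result and treat every failure the same way) is sketched below. It is a standalone illustration that uses the raw userfaultfd syscall and the UFFDIO_API handshake rather than the selftest's uffd_get_features() helper; query_uffd_features() and uffd_wp_supported() are hypothetical names, and this is not necessarily the shape of the eventual fix.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <linux/userfaultfd.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Query the userfaultfd feature bits.
 * Returns 0 and fills *features on success; returns -1 on ANY failure,
 * leaving *features set to 0 so callers never read an indeterminate value.
 */
static int query_uffd_features(uint64_t *features)
{
	struct uffdio_api api = { .api = UFFD_API, .features = 0 };
	int saved_errno;
	int fd;

	*features = 0;

	fd = (int)syscall(SYS_userfaultfd, O_CLOEXEC | O_NONBLOCK);
	if (fd < 0)
		return -1;	/* e.g. ENOSYS or EPERM */

	if (ioctl(fd, UFFDIO_API, &api) < 0) {
		saved_errno = errno;
		close(fd);
		errno = saved_errno;
		return -1;
	}

	close(fd);
	*features = api.features;
	return 0;
}

/* Returns 1 if uffd write-protect is available, 0 otherwise. */
static int uffd_wp_supported(void)
{
	uint64_t features;

	if (query_uffd_features(&features))
		return 0;

	return (features & UFFD_FEATURE_PAGEFAULT_FLAG_WP) != 0;
}
```

A test could then skip whenever uffd_wp_supported() returns 0, regardless of which errno caused the query to fail.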
* [PATCH v3 10/13] selftests/mm: skip uffd-stress test when nr_pages_per_cpu is zero
2026-03-27 7:15 [PATCH v3 00/13] selftests/mm: fix failures and robustness improvements Sayali Patil
` (8 preceding siblings ...)
2026-03-27 7:16 ` [PATCH v3 09/13] selftests/mm: skip uffd-wp-mremap if UFFD write-protect is unsupported Sayali Patil
@ 2026-03-27 7:16 ` Sayali Patil
2026-04-01 14:23 ` David Hildenbrand (Arm)
2026-03-27 7:16 ` [PATCH v3 11/13] selftests/mm: fix double increment in linked list cleanup in compaction_test Sayali Patil
` (3 subsequent siblings)
13 siblings, 1 reply; 45+ messages in thread
From: Sayali Patil @ 2026-03-27 7:16 UTC (permalink / raw)
To: Andrew Morton, Shuah Khan, linux-mm, linux-kernel,
linux-kselftest, Ritesh Harjani
Cc: David Hildenbrand, Zi Yan, Michal Hocko, Oscar Salvador,
Lorenzo Stoakes, Dev Jain, Liam.Howlett, linuxppc-dev,
Sayali Patil, Venkat Rao Bagalkote
uffd-stress currently fails when the computed nr_pages_per_cpu
evaluates to zero:
nr_pages_per_cpu = bytes / page_size / nr_parallel
This can occur on systems with large hugepage sizes (e.g. 1GB) and a
high number of CPUs, where the total allocated memory is sufficient
overall but not enough to provide at least one page per cpu.
In such cases, the failure is due to insufficient test resources
rather than incorrect kernel behaviour. Update the test
to treat this condition as a test skip instead of reporting an error.
Fixes: db0f1c138f18 ("selftests/mm: print some details when uffd-stress gets bad params")
Acked-by: Zi Yan <ziy@nvidia.com>
Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
Signed-off-by: Sayali Patil <sayalip@linux.ibm.com>
---
tools/testing/selftests/mm/uffd-stress.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/tools/testing/selftests/mm/uffd-stress.c b/tools/testing/selftests/mm/uffd-stress.c
index 700fbaa18d44..b8f22ea859a6 100644
--- a/tools/testing/selftests/mm/uffd-stress.c
+++ b/tools/testing/selftests/mm/uffd-stress.c
@@ -491,9 +491,9 @@ int main(int argc, char **argv)
gopts->nr_pages_per_cpu = bytes / gopts->page_size / gopts->nr_parallel;
if (!gopts->nr_pages_per_cpu) {
- _err("pages_per_cpu = 0, cannot test (%lu / %lu / %lu)",
- bytes, gopts->page_size, gopts->nr_parallel);
- usage();
+ ksft_print_msg("pages_per_cpu = 0, cannot test (%lu / %lu / %lu)\n",
+ bytes, gopts->page_size, gopts->nr_parallel);
+ return KSFT_SKIP;
}
bounces = atoi(argv[3]);
--
2.52.0
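
To make the zero case concrete, the integer division from the commit message can be reproduced in isolation. The figures below (8 GiB of test memory, 16 parallel workers) are illustrative assumptions, not values taken from a real run, and pages_per_cpu() is a hypothetical stand-in for the expression in uffd-stress.c.

```c
#include <stddef.h>

/*
 * Same arithmetic as in the commit message:
 *   nr_pages_per_cpu = bytes / page_size / nr_parallel
 * Integer division truncates, so the result is 0 whenever the number of
 * available pages is smaller than nr_parallel.
 */
static unsigned long long pages_per_cpu(unsigned long long bytes,
					unsigned long long page_size,
					unsigned long long nr_parallel)
{
	return bytes / page_size / nr_parallel;
}

/* Assumed example figures. */
#define EXAMPLE_BYTES	(8ULL << 30)	/* 8 GiB of test memory */
#define HUGE_1G		(1ULL << 30)	/* 1 GiB huge pages */
#define HUGE_2M		(2ULL << 20)	/* 2 MiB huge pages */
#define EXAMPLE_CPUS	16ULL
```

With these figures, 1 GiB pages give 8 / 16 = 0 pages per CPU (the skip case), while 2 MiB pages give 4096 / 16 = 256.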
* Re: [PATCH v3 10/13] selftests/mm: skip uffd-stress test when nr_pages_per_cpu is zero
2026-03-27 7:16 ` [PATCH v3 10/13] selftests/mm: skip uffd-stress test when nr_pages_per_cpu is zero Sayali Patil
@ 2026-04-01 14:23 ` David Hildenbrand (Arm)
0 siblings, 0 replies; 45+ messages in thread
From: David Hildenbrand (Arm) @ 2026-04-01 14:23 UTC (permalink / raw)
To: Sayali Patil, Andrew Morton, Shuah Khan, linux-mm, linux-kernel,
linux-kselftest, Ritesh Harjani
Cc: Zi Yan, Michal Hocko, Oscar Salvador, Lorenzo Stoakes, Dev Jain,
Liam.Howlett, linuxppc-dev, Venkat Rao Bagalkote
On 3/27/26 08:16, Sayali Patil wrote:
> uffd-stress currently fails when the computed nr_pages_per_cpu
> evaluates to zero:
>
> nr_pages_per_cpu = bytes / page_size / nr_parallel
>
> This can occur on systems with large hugepage sizes (e.g. 1GB) and a
> high number of CPUs, where the total allocated memory is sufficient
> overall but not enough to provide at least one page per cpu.
>
> In such cases, the failure is due to insufficient test resources
> rather than incorrect kernel behaviour. Update the test
> to treat this condition as a test skip instead of reporting an error.
>
> Fixes: db0f1c138f18 ("selftests/mm: print some details when uffd-stress gets bad params")
> Acked-by: Zi Yan <ziy@nvidia.com>
> Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
> Signed-off-by: Sayali Patil <sayalip@linux.ibm.com>
Acked-by: David Hildenbrand (Arm) <david@kernel.org>
--
Cheers,
David
* [PATCH v3 11/13] selftests/mm: fix double increment in linked list cleanup in compaction_test
2026-03-27 7:15 [PATCH v3 00/13] selftests/mm: fix failures and robustness improvements Sayali Patil
` (9 preceding siblings ...)
2026-03-27 7:16 ` [PATCH v3 10/13] selftests/mm: skip uffd-stress test when nr_pages_per_cpu is zero Sayali Patil
@ 2026-03-27 7:16 ` Sayali Patil
2026-04-01 14:32 ` Sayali Patil
2026-03-27 7:16 ` [PATCH v3 12/13] selftests/mm: move hwpoison setup into run_test() and silence modprobe output for memory-failure category Sayali Patil
` (2 subsequent siblings)
13 siblings, 1 reply; 45+ messages in thread
From: Sayali Patil @ 2026-03-27 7:16 UTC (permalink / raw)
To: Andrew Morton, Shuah Khan, linux-mm, linux-kernel,
linux-kselftest, Ritesh Harjani
Cc: David Hildenbrand, Zi Yan, Michal Hocko, Oscar Salvador,
Lorenzo Stoakes, Dev Jain, Liam.Howlett, linuxppc-dev,
Sayali Patil, Venkat Rao Bagalkote
The cleanup loop of allocated memory currently uses:
for (entry = list; entry != NULL; entry = entry->next) {
munmap(entry->map, MAP_SIZE);
if (!entry->next)
break;
entry = entry->next;
}
The inner entry = entry->next causes the loop to skip every
other node, resulting in only half of the mapped regions being
unmapped.
Remove the redundant increment to ensure every entry is visited
and unmapped during cleanup.
Fixes: bd67d5c15cc1 ("Test compaction of mlocked memory")
Reviewed-by: Zi Yan <ziy@nvidia.com>
Acked-by: David Hildenbrand (Arm) <david@kernel.org>
Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
Signed-off-by: Sayali Patil <sayalip@linux.ibm.com>
---
tools/testing/selftests/mm/compaction_test.c | 3 ---
1 file changed, 3 deletions(-)
diff --git a/tools/testing/selftests/mm/compaction_test.c b/tools/testing/selftests/mm/compaction_test.c
index 30209c40b697..f73930706bd0 100644
--- a/tools/testing/selftests/mm/compaction_test.c
+++ b/tools/testing/selftests/mm/compaction_test.c
@@ -263,9 +263,6 @@ int main(int argc, char **argv)
for (entry = list; entry != NULL; entry = entry->next) {
munmap(entry->map, MAP_SIZE);
- if (!entry->next)
- break;
- entry = entry->next;
}
if (check_compaction(mem_free, hugepage_size,
--
2.52.0
* Re: [PATCH v3 11/13] selftests/mm: fix double increment in linked list cleanup in compaction_test
2026-03-27 7:16 ` [PATCH v3 11/13] selftests/mm: fix double increment in linked list cleanup in compaction_test Sayali Patil
@ 2026-04-01 14:32 ` Sayali Patil
2026-04-01 14:39 ` David Hildenbrand (Arm)
0 siblings, 1 reply; 45+ messages in thread
From: Sayali Patil @ 2026-04-01 14:32 UTC (permalink / raw)
To: Andrew Morton, Shuah Khan, linux-mm, linux-kernel,
linux-kselftest, Ritesh Harjani
Cc: David Hildenbrand, Zi Yan, Michal Hocko, Oscar Salvador,
Lorenzo Stoakes, Dev Jain, Liam.Howlett, linuxppc-dev,
Venkat Rao Bagalkote
On 27/03/26 12:46, Sayali Patil wrote:
> The cleanup loop of allocated memory currently uses:
>
> for (entry = list; entry != NULL; entry = entry->next) {
> munmap(entry->map, MAP_SIZE);
> if (!entry->next)
> break;
> entry = entry->next;
> }
>
> The inner entry = entry->next causes the loop to skip every
> other node, resulting in only half of the mapped regions being
> unmapped.
>
> Remove the redundant increment to ensure every entry is visited
> and unmapped during cleanup.
>
> Fixes: bd67d5c15cc1 ("Test compaction of mlocked memory")
> Reviewed-by: Zi Yan <ziy@nvidia.com>
> Acked-by: David Hildenbrand (Arm) <david@kernel.org>
> Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
> Signed-off-by: Sayali Patil <sayalip@linux.ibm.com>
> ---
> tools/testing/selftests/mm/compaction_test.c | 3 ---
> 1 file changed, 3 deletions(-)
>
> diff --git a/tools/testing/selftests/mm/compaction_test.c b/tools/testing/selftests/mm/compaction_test.c
> index 30209c40b697..f73930706bd0 100644
> --- a/tools/testing/selftests/mm/compaction_test.c
> +++ b/tools/testing/selftests/mm/compaction_test.c
> @@ -263,9 +263,6 @@ int main(int argc, char **argv)
>
> for (entry = list; entry != NULL; entry = entry->next) {
> munmap(entry->map, MAP_SIZE);
> - if (!entry->next)
> - break;
> - entry = entry->next;
> }
>
> if (check_compaction(mem_free, hugepage_size,
Sorry, this change is not valid.
The goal of this test is to verify the kernel’s ability to compact
unevictable (MAP_LOCKED) pages. The loop is intentionally written to
unmap every other chunk, thereby creating fragmentation with locked pages
before check_compaction() is invoked.
With the proposed change (removing the double increment), the loop ends up
unmapping all allocated locked pages instead of leaving a fragmented
pattern. This results in memory being effectively unfragmented.
I will send v4 without this patch.
Thanks,
Sayali
^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: [PATCH v3 11/13] selftests/mm: fix double increment in linked list cleanup in compaction_test
2026-04-01 14:32 ` Sayali Patil
@ 2026-04-01 14:39 ` David Hildenbrand (Arm)
2026-04-01 17:33 ` Sayali Patil
0 siblings, 1 reply; 45+ messages in thread
From: David Hildenbrand (Arm) @ 2026-04-01 14:39 UTC (permalink / raw)
To: Sayali Patil, Andrew Morton, Shuah Khan, linux-mm, linux-kernel,
linux-kselftest, Ritesh Harjani
Cc: Zi Yan, Michal Hocko, Oscar Salvador, Lorenzo Stoakes, Dev Jain,
Liam.Howlett, linuxppc-dev, Venkat Rao Bagalkote
On 4/1/26 16:32, Sayali Patil wrote:
>
> On 27/03/26 12:46, Sayali Patil wrote:
>> The cleanup loop of allocated memory currently uses:
>>
>> for (entry = list; entry != NULL; entry = entry->next) {
>> munmap(entry->map, MAP_SIZE);
>> if (!entry->next)
>> break;
>> entry = entry->next;
>> }
>>
>> The inner entry = entry->next causes the loop to skip every
>> other node, resulting in only half of the mapped regions being
>> unmapped.
>>
>> Remove the redundant increment to ensure every entry is visited
>> and unmapped during cleanup.
>>
>> Fixes: bd67d5c15cc1 ("Test compaction of mlocked memory")
>> Reviewed-by: Zi Yan <ziy@nvidia.com>
>> Acked-by: David Hildenbrand (Arm) <david@kernel.org>
>> Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
>> Signed-off-by: Sayali Patil <sayalip@linux.ibm.com>
>> ---
>> tools/testing/selftests/mm/compaction_test.c | 3 ---
>> 1 file changed, 3 deletions(-)
>>
>> diff --git a/tools/testing/selftests/mm/compaction_test.c b/tools/testing/selftests/mm/compaction_test.c
>> index 30209c40b697..f73930706bd0 100644
>> --- a/tools/testing/selftests/mm/compaction_test.c
>> +++ b/tools/testing/selftests/mm/compaction_test.c
>> @@ -263,9 +263,6 @@ int main(int argc, char **argv)
>> for (entry = list; entry != NULL; entry = entry->next) {
>> munmap(entry->map, MAP_SIZE);
>> - if (!entry->next)
>> - break;
>> - entry = entry->next;
>> }
>> if (check_compaction(mem_free, hugepage_size,
>
> Sorry, this change is not valid.
>
> The goal of this test is to verify the kernel’s ability to compact
> unevictable (MAP_LOCKED) pages. The loop is intentionally written to
> unmap every other chunk, thereby creating fragmentation with locked pages
> before check_compaction() is invoked.
>
> With the proposed change (removing the double increment), the loop ends up
> unmapping all allocated locked pages instead of leaving a fragmented
> pattern. This results in memory being effectively unfragmented.
Ahhh, we should really make that clearer in a comment. I missed it myself :(
--
Cheers,
David
^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: [PATCH v3 11/13] selftests/mm: fix double increment in linked list cleanup in compaction_test
2026-04-01 14:39 ` David Hildenbrand (Arm)
@ 2026-04-01 17:33 ` Sayali Patil
0 siblings, 0 replies; 45+ messages in thread
From: Sayali Patil @ 2026-04-01 17:33 UTC (permalink / raw)
To: David Hildenbrand (Arm), Andrew Morton, Shuah Khan, linux-mm,
linux-kernel, linux-kselftest, Ritesh Harjani
Cc: Zi Yan, Michal Hocko, Oscar Salvador, Lorenzo Stoakes, Dev Jain,
Liam.Howlett, linuxppc-dev, Venkat Rao Bagalkote
On 01/04/26 20:09, David Hildenbrand (Arm) wrote:
> On 4/1/26 16:32, Sayali Patil wrote:
>>
>> On 27/03/26 12:46, Sayali Patil wrote:
>>> The cleanup loop of allocated memory currently uses:
>>>
>>> for (entry = list; entry != NULL; entry = entry->next) {
>>> munmap(entry->map, MAP_SIZE);
>>> if (!entry->next)
>>> break;
>>> entry = entry->next;
>>> }
>>>
>>> The inner entry = entry->next causes the loop to skip every
>>> other node, resulting in only half of the mapped regions being
>>> unmapped.
>>>
>>> Remove the redundant increment to ensure every entry is visited
>>> and unmapped during cleanup.
>>>
>>> Fixes: bd67d5c15cc1 ("Test compaction of mlocked memory")
>>> Reviewed-by: Zi Yan <ziy@nvidia.com>
>>> Acked-by: David Hildenbrand (Arm) <david@kernel.org>
>>> Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
>>> Signed-off-by: Sayali Patil <sayalip@linux.ibm.com>
>>> ---
>>> tools/testing/selftests/mm/compaction_test.c | 3 ---
>>> 1 file changed, 3 deletions(-)
>>>
>>> diff --git a/tools/testing/selftests/mm/compaction_test.c b/tools/testing/selftests/mm/compaction_test.c
>>> index 30209c40b697..f73930706bd0 100644
>>> --- a/tools/testing/selftests/mm/compaction_test.c
>>> +++ b/tools/testing/selftests/mm/compaction_test.c
>>> @@ -263,9 +263,6 @@ int main(int argc, char **argv)
>>> for (entry = list; entry != NULL; entry = entry->next) {
>>> munmap(entry->map, MAP_SIZE);
>>> - if (!entry->next)
>>> - break;
>>> - entry = entry->next;
>>> }
>>> if (check_compaction(mem_free, hugepage_size,
>>
>> Sorry, this change is not valid.
>>
>> The goal of this test is to verify the kernel’s ability to compact
>> unevictable (MAP_LOCKED) pages. The loop is intentionally written to
>> unmap every other chunk, thereby creating fragmentation with locked pages
>> before check_compaction() is invoked.
>>
>> With the proposed change (removing the double increment), the loop ends up
>> unmapping all allocated locked pages instead of leaving a fragmented
>> pattern. This results in memory being effectively unfragmented.
>
> Ahhh, we should really make that clearer in a comment. I missed it myself :(
>
Yes, let me add a comment to clarify this and send it in v4.
^ permalink raw reply [flat|nested] 45+ messages in thread
* [PATCH v3 12/13] selftests/mm: move hwpoison setup into run_test() and silence modprobe output for memory-failure category
2026-03-27 7:15 [PATCH v3 00/13] selftests/mm: fix failures and robustness improvements Sayali Patil
` (10 preceding siblings ...)
2026-03-27 7:16 ` [PATCH v3 11/13] selftests/mm: fix double increment in linked list cleanup in compaction_test Sayali Patil
@ 2026-03-27 7:16 ` Sayali Patil
2026-04-02 7:15 ` Sayali Patil
2026-03-27 7:16 ` [PATCH v3 13/13] selftests/cgroup: extend test_hugetlb_memcg.c to support all huge page sizes Sayali Patil
2026-03-27 18:11 ` [PATCH v3 00/13] selftests/mm: fix failures and robustness improvements Andrew Morton
13 siblings, 1 reply; 45+ messages in thread
From: Sayali Patil @ 2026-03-27 7:16 UTC (permalink / raw)
To: Andrew Morton, Shuah Khan, linux-mm, linux-kernel,
linux-kselftest, Ritesh Harjani
Cc: David Hildenbrand, Zi Yan, Michal Hocko, Oscar Salvador,
Lorenzo Stoakes, Dev Jain, Liam.Howlett, linuxppc-dev,
Sayali Patil, Miaohe Lin, Venkat Rao Bagalkote
run_vmtests.sh contains special handling to ensure the hwpoison_inject
module is available for the memory-failure tests. This logic was
implemented outside of run_test(), making the setup category-specific
but managed globally.
Move the hwpoison_inject handling into run_test() and restrict it
to the memory-failure category so that:
1. the module is checked and loaded only when memory-failure tests run,
2. the test is skipped if the module or the debugfs interface
(/sys/kernel/debug/hwpoison/) is not available,
3. the module is unloaded after the test if it was loaded by the script.
This localizes category-specific setup and makes the test flow
consistent with other per-category preparations.
While updating this logic, fix the module availability check.
The script previously used:
modprobe -R hwpoison_inject
The -R option prints the resolved module name to stdout, causing every
run to print:
hwpoison_inject
in the test output, even when no action is required, introducing
unnecessary noise.
Replace this with:
modprobe -n hwpoison_inject
which verifies that the module is loadable without producing output,
keeping the selftest logs clean and consistent.
Fixes: ff4ef2fbd101 ("selftests/mm: add memory failure anonymous page test")
Acked-by: Zi Yan <ziy@nvidia.com>
Acked-by: Miaohe Lin <linmiaohe@huawei.com>
Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
Signed-off-by: Sayali Patil <sayalip@linux.ibm.com>
---
tools/testing/selftests/mm/run_vmtests.sh | 46 ++++++++++++++---------
1 file changed, 28 insertions(+), 18 deletions(-)
diff --git a/tools/testing/selftests/mm/run_vmtests.sh b/tools/testing/selftests/mm/run_vmtests.sh
index eecec0b6eb13..606558cc3b09 100755
--- a/tools/testing/selftests/mm/run_vmtests.sh
+++ b/tools/testing/selftests/mm/run_vmtests.sh
@@ -250,6 +250,27 @@ run_test() {
fi
fi
+ # Ensure hwpoison_inject is available for memory-failure tests
+ if [ "${CATEGORY}" = "memory-failure" ]; then
+ # Try to load hwpoison_inject if not present.
+ HWPOISON_DIR=/sys/kernel/debug/hwpoison/
+ if [ ! -d "$HWPOISON_DIR" ]; then
+ if ! modprobe -n hwpoison_inject > /dev/null 2>&1; then
+ echo "Module hwpoison_inject not found, skipping..." \
+ | tap_prefix
+ skip=1
+ else
+ modprobe hwpoison_inject > /dev/null 2>&1
+ LOADED_MOD=1
+ fi
+ fi
+
+ if [ ! -d "$HWPOISON_DIR" ]; then
+ echo "hwpoison debugfs interface not present" | tap_prefix
+ skip=1
+ fi
+ fi
+
local test=$(pretty_name "$*")
local title="running $*"
local sep=$(echo -n "$title" | tr "[:graph:][:space:]" -)
@@ -261,6 +282,12 @@ run_test() {
else
local ret=$ksft_skip
fi
+
+ # Unload hwpoison_inject if we loaded it
+ if [ -n "${LOADED_MOD}" ]; then
+ modprobe -r hwpoison_inject > /dev/null 2>&1
+ fi
+
count_total=$(( count_total + 1 ))
if [ $ret -eq 0 ]; then
count_pass=$(( count_pass + 1 ))
@@ -540,24 +567,7 @@ CATEGORY="page_frag" run_test ./test_page_frag.sh nonaligned
CATEGORY="rmap" run_test ./rmap
-# Try to load hwpoison_inject if not present.
-HWPOISON_DIR=/sys/kernel/debug/hwpoison/
-if [ ! -d "$HWPOISON_DIR" ]; then
- if ! modprobe -q -R hwpoison_inject; then
- echo "Module hwpoison_inject not found, skipping..."
- else
- modprobe hwpoison_inject > /dev/null 2>&1
- LOADED_MOD=1
- fi
-fi
-
-if [ -d "$HWPOISON_DIR" ]; then
- CATEGORY="memory-failure" run_test ./memory-failure
-fi
-
-if [ -n "${LOADED_MOD}" ]; then
- modprobe -r hwpoison_inject > /dev/null 2>&1
-fi
+CATEGORY="memory-failure" run_test ./memory-failure
if [ "${HAVE_HUGEPAGES}" = 1 ]; then
echo "$orig_nr_hugepgs" > /proc/sys/vm/nr_hugepages
--
2.52.0
^ permalink raw reply related [flat|nested] 45+ messages in thread
* Re: [PATCH v3 12/13] selftests/mm: move hwpoison setup into run_test() and silence modprobe output for memory-failure category
2026-03-27 7:16 ` [PATCH v3 12/13] selftests/mm: move hwpoison setup into run_test() and silence modprobe output for memory-failure category Sayali Patil
@ 2026-04-02 7:15 ` Sayali Patil
0 siblings, 0 replies; 45+ messages in thread
From: Sayali Patil @ 2026-04-02 7:15 UTC (permalink / raw)
To: Andrew Morton, Shuah Khan, linux-mm, linux-kernel,
linux-kselftest, Ritesh Harjani
Cc: David Hildenbrand, Zi Yan, Michal Hocko, Oscar Salvador,
Lorenzo Stoakes, Dev Jain, Liam.Howlett, linuxppc-dev, Miaohe Lin,
Venkat Rao Bagalkote
On 27/03/26 12:46, Sayali Patil wrote:
> run_vmtests.sh contains special handling to ensure the hwpoison_inject
> module is available for the memory-failure tests. This logic was
> implemented outside of run_test(), making the setup category-specific
> but managed globally.
>
> Move the hwpoison_inject handling into run_test() and restrict it
> to the memory-failure category so that:
> 1. the module is checked and loaded only when memory-failure tests run,
> 2. the test is skipped if the module or the debugfs interface
> (/sys/kernel/debug/hwpoison/) is not available.
> 3. the module is unloaded after the test if it was loaded by the script.
>
> This localizes category-specific setup and makes the test flow
> consistent with other per-category preparations.
>
> While updating this logic, fix the module availability check.
> The script previously used:
>
> modprobe -R hwpoison_inject
>
> The -R option prints the resolved module name to stdout, causing every
> run to print:
>
> hwpoison_inject
>
> in the test output, even when no action is required, introducing
> unnecessary noise.
>
> Replace this with:
>
> modprobe -n hwpoison_inject
>
> which verifies that the module is loadable without producing output,
> keeping the selftest logs clean and consistent.
>
> Fixes: ff4ef2fbd101 ("selftests/mm: add memory failure anonymous page test")
> Acked-by: Zi Yan <ziy@nvidia.com>
> Acked-by: Miaohe Lin <linmiaohe@huawei.com>
> Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
> Signed-off-by: Sayali Patil <sayalip@linux.ibm.com>
> ---
> tools/testing/selftests/mm/run_vmtests.sh | 46 ++++++++++++++---------
> 1 file changed, 28 insertions(+), 18 deletions(-)
>
> diff --git a/tools/testing/selftests/mm/run_vmtests.sh b/tools/testing/selftests/mm/run_vmtests.sh
> index eecec0b6eb13..606558cc3b09 100755
> --- a/tools/testing/selftests/mm/run_vmtests.sh
> +++ b/tools/testing/selftests/mm/run_vmtests.sh
> @@ -250,6 +250,27 @@ run_test() {
> fi
> fi
>
> + # Ensure hwpoison_inject is available for memory-failure tests
> + if [ "${CATEGORY}" = "memory-failure" ]; then
> + # Try to load hwpoison_inject if not present.
> + HWPOISON_DIR=/sys/kernel/debug/hwpoison/
> + if [ ! -d "$HWPOISON_DIR" ]; then
> + if ! modprobe -n hwpoison_inject > /dev/null 2>&1; then
> + echo "Module hwpoison_inject not found, skipping..." \
> + | tap_prefix
> + skip=1
> + else
> + modprobe hwpoison_inject > /dev/null 2>&1
> + LOADED_MOD=1
> + fi
> + fi
> +
> + if [ ! -d "$HWPOISON_DIR" ]; then
> + echo "hwpoison debugfs interface not present" | tap_prefix
> + skip=1
> + fi
> + fi
> +
> local test=$(pretty_name "$*")
> local title="running $*"
> local sep=$(echo -n "$title" | tr "[:graph:][:space:]" -)
> @@ -261,6 +282,12 @@ run_test() {
> else
> local ret=$ksft_skip
> fi
> +
> + # Unload hwpoison_inject if we loaded it
> + if [ -n "${LOADED_MOD}" ]; then
> + modprobe -r hwpoison_inject > /dev/null 2>&1
> + fi
> +
> count_total=$(( count_total + 1 ))
> if [ $ret -eq 0 ]; then
> count_pass=$(( count_pass + 1 ))
> @@ -540,24 +567,7 @@ CATEGORY="page_frag" run_test ./test_page_frag.sh nonaligned
>
> CATEGORY="rmap" run_test ./rmap
>
> -# Try to load hwpoison_inject if not present.
> -HWPOISON_DIR=/sys/kernel/debug/hwpoison/
> -if [ ! -d "$HWPOISON_DIR" ]; then
> - if ! modprobe -q -R hwpoison_inject; then
> - echo "Module hwpoison_inject not found, skipping..."
> - else
> - modprobe hwpoison_inject > /dev/null 2>&1
> - LOADED_MOD=1
> - fi
> -fi
> -
> -if [ -d "$HWPOISON_DIR" ]; then
> - CATEGORY="memory-failure" run_test ./memory-failure
> -fi
> -
> -if [ -n "${LOADED_MOD}" ]; then
> - modprobe -r hwpoison_inject > /dev/null 2>&1
> -fi
> +CATEGORY="memory-failure" run_test ./memory-failure
>
> if [ "${HAVE_HUGEPAGES}" = 1 ]; then
> echo "$orig_nr_hugepgs" > /proc/sys/vm/nr_hugepages
AI review question:
> > diff --git a/tools/testing/selftests/mm/run_vmtests.sh b/tools/testing/selftests/mm/run_vmtests.sh
> > --- a/tools/testing/selftests/mm/run_vmtests.sh
> > +++ b/tools/testing/selftests/mm/run_vmtests.sh
> > @@ -250,6 +250,27 @@ run_test() {
> > fi
> > fi
> >
> > + # Ensure hwpoison_inject is available for memory-failure tests
> > + if [ "${CATEGORY}" = "memory-failure" ]; then
> > + # Try to load hwpoison_inject if not present.
> > + HWPOISON_DIR=/sys/kernel/debug/hwpoison/
> > + if [ ! -d "$HWPOISON_DIR" ]; then
> > + if ! modprobe -n hwpoison_inject > /dev/null 2>&1; then
> > + echo "Module hwpoison_inject not found, skipping..." \
> > + | tap_prefix
> > + skip=1
> Does moving this skip logic inside run_test() mask previous test failures?
> The run_vmtests.sh script sets exitcode=$ksft_skip when a test skips.
> Previously, if the module was missing, the script bypassed calling run_test()
> for memory-failure entirely, which preserved any previous failure exit code.
> With this patch, run_test() executes, skips internally, and sets the
> exitcode to 4. Since memory-failure is the final test, could this change the
> test suite's final exit code from 1 to 4 on systems without the module?
This was pre-existing behavior; however, the concern is valid. Failures
should take precedence over skips. Let me address this in v4.
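
For illustration only, and not taken from the posted series: a minimal
sketch of "failures take precedence over skips". The helper name
record_result is hypothetical, and the exit code values are assumed to be
the usual kselftest ones (0 pass, 1 fail, 4 skip). How v4 actually
implements this may differ.

```shell
#!/bin/bash
# Assumed kselftest exit codes; record_result is a hypothetical helper,
# not a function that exists in run_vmtests.sh.
ksft_pass=0
ksft_fail=1
ksft_skip=4

exitcode=$ksft_pass

# A failure always wins. A skip is recorded only if nothing has failed
# (or been recorded) so far, so it can never overwrite an earlier failure.
record_result() {
    local ret=$1

    if [ "$ret" -eq "$ksft_pass" ]; then
        return 0
    elif [ "$ret" -eq "$ksft_skip" ]; then
        if [ "$exitcode" -eq "$ksft_pass" ]; then
            exitcode=$ksft_skip
        fi
    else
        exitcode=$ksft_fail
    fi
}

record_result 1   # an earlier test fails
record_result 4   # the final test (e.g. memory-failure) is skipped
echo "final exit code: $exitcode"   # prints "final exit code: 1"
```

With this ordering, a skipped memory-failure test at the end of the run
leaves an earlier failure visible in the suite's exit status.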
> > + else
> > + modprobe hwpoison_inject > /dev/null 2>&1
> > + LOADED_MOD=1
> Should LOADED_MOD be declared as a local variable here?
> Since it lacks a local declaration, it persists in the script's global
> environment. If any tests are appended after memory-failure in the future,
> or if LOADED_MOD is already set in the user's environment, this could cause
> modprobe -r hwpoison_inject to execute redundantly after those tests.
Yes, good point. Let me address this in v4 by making it a local variable.
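
For illustration only: a minimal sketch of why a local declaration helps
here. run_one is a hypothetical, stripped-down stand-in for run_test(),
and the module load is only simulated.

```shell
#!/bin/bash
# run_one is a hypothetical stand-in for run_test(); no module is
# actually loaded or unloaded.
run_one() {
    # Declared local and reset on every call, so neither a previous call
    # nor the caller's environment can leak a stale value into this one.
    local loaded_mod=""

    if [ "$1" = "memory-failure" ]; then
        loaded_mod=1    # pretend the module was loaded for this test
    fi

    if [ -n "$loaded_mod" ]; then
        echo "$1: would unload module"
    else
        echo "$1: nothing to unload"
    fi
}

loaded_mod=1            # simulate a value inherited from the environment
run_one memory-failure  # prints "memory-failure: would unload module"
run_one some-later-test # prints "some-later-test: nothing to unload"
```

Without the local declaration, the second call would see the value left
behind by the first one (or by the environment) and attempt the unload
again.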
> > + fi
> > + fi
> > +
> > + if [ ! -d "$HWPOISON_DIR" ]; then
> > + echo "hwpoison debugfs interface not present" | tap_prefix
> > + skip=1
> > + fi
> Will this print a redundant skip message when the module is not found?
> If modprobe -n fails earlier, the directory will still be absent, causing
> this second block to evaluate to true. This prints "hwpoison debugfs
> interface not present" right after "Module hwpoison_inject not found,
> skipping...", which might contradict the goal of keeping the selftest logs
> clean.
Agreed, this can lead to redundant output. I’ll update it in v4 to avoid
the duplicate skip message.
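
For illustration only: one possible way to emit a single skip reason.
check_hwpoison is a hypothetical helper whose two arguments stand in for
the debugfs directory and for the command that probes for the module; the
real change in v4 may be structured differently.

```shell
#!/bin/bash
# check_hwpoison is a hypothetical helper. $1 stands in for the debugfs
# directory, $2 for the command that checks module availability
# (modprobe -n hwpoison_inject in the patch).
check_hwpoison() {
    local dir=$1 probe=$2
    local skip=0

    if [ ! -d "$dir" ]; then
        if ! $probe > /dev/null 2>&1; then
            echo "Module hwpoison_inject not found, skipping..."
            skip=1
        fi
    fi

    # Report the missing directory only if no skip reason was given yet.
    if [ "$skip" -eq 0 ] && [ ! -d "$dir" ]; then
        echo "hwpoison debugfs interface not present"
        skip=1
    fi

    return "$skip"
}

# The probe fails and the directory is absent: only the first message appears.
check_hwpoison /nonexistent/hwpoison false || echo "test would be skipped"
```

Guarding the second check on skip keeps the "module not found" and
"debugfs interface not present" cases mutually exclusive in the log.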
^ permalink raw reply [flat|nested] 45+ messages in thread
* [PATCH v3 13/13] selftests/cgroup: extend test_hugetlb_memcg.c to support all huge page sizes
2026-03-27 7:15 [PATCH v3 00/13] selftests/mm: fix failures and robustness improvements Sayali Patil
` (11 preceding siblings ...)
2026-03-27 7:16 ` [PATCH v3 12/13] selftests/mm: move hwpoison setup into run_test() and silence modprobe output for memory-failure category Sayali Patil
@ 2026-03-27 7:16 ` Sayali Patil
2026-04-03 17:16 ` Sayali Patil
2026-03-27 18:11 ` [PATCH v3 00/13] selftests/mm: fix failures and robustness improvements Andrew Morton
13 siblings, 1 reply; 45+ messages in thread
From: Sayali Patil @ 2026-03-27 7:16 UTC (permalink / raw)
To: Andrew Morton, Shuah Khan, linux-mm, linux-kernel,
linux-kselftest, Ritesh Harjani
Cc: David Hildenbrand, Zi Yan, Michal Hocko, Oscar Salvador,
Lorenzo Stoakes, Dev Jain, Liam.Howlett, linuxppc-dev,
Sayali Patil, Venkat Rao Bagalkote
The hugetlb memcg selftest was previously skipped when the configured
huge page size was not 2MB, preventing the test from running on systems
using other default huge page sizes.
Detect the system's configured huge page size at runtime and use it for
the allocation instead of assuming a fixed 2MB size. This allows the
test to run on configurations using non-2MB huge pages and avoids
unnecessary skips.
Fixes: c0dddb7aa5f8 ("selftests: add a selftest to verify hugetlb usage in memcg")
Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
Signed-off-by: Sayali Patil <sayalip@linux.ibm.com>
---
.../selftests/cgroup/test_hugetlb_memcg.c | 66 ++++++++++++++-----
1 file changed, 48 insertions(+), 18 deletions(-)
diff --git a/tools/testing/selftests/cgroup/test_hugetlb_memcg.c b/tools/testing/selftests/cgroup/test_hugetlb_memcg.c
index f451aa449be6..a449dbec16a8 100644
--- a/tools/testing/selftests/cgroup/test_hugetlb_memcg.c
+++ b/tools/testing/selftests/cgroup/test_hugetlb_memcg.c
@@ -12,10 +12,15 @@
#define ADDR ((void *)(0x0UL))
#define FLAGS (MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB)
-/* mapping 8 MBs == 4 hugepages */
-#define LENGTH (8UL*1024*1024)
#define PROTECTION (PROT_READ | PROT_WRITE)
+/*
+ * This value matches the kernel's MEMCG_CHARGE_BATCH definition:
+ * see include/linux/memcontrol.h. If the kernel value changes, this
+ * test constant must be updated accordingly to stay consistent.
+ */
+#define MEMCG_CHARGE_BATCH 64U
+
/* borrowed from mm/hmm-tests.c */
static long get_hugepage_size(void)
{
@@ -84,11 +89,11 @@ static unsigned int check_first(char *addr)
return *(unsigned int *)addr;
}
-static void write_data(char *addr)
+static void write_data(char *addr, size_t length)
{
unsigned long i;
- for (i = 0; i < LENGTH; i++)
+ for (i = 0; i < length; i++)
*(addr + i) = (char)i;
}
@@ -96,26 +101,31 @@ static int hugetlb_test_program(const char *cgroup, void *arg)
{
char *test_group = (char *)arg;
void *addr;
+ long hpage_size = get_hugepage_size() * 1024;
long old_current, expected_current, current;
int ret = EXIT_FAILURE;
+ size_t length = 4 * hpage_size;
+ int pagesize, nr_pages;
+
+ pagesize = getpagesize();
old_current = cg_read_long(test_group, "memory.current");
set_nr_hugepages(20);
current = cg_read_long(test_group, "memory.current");
- if (current - old_current >= MB(2)) {
+ if (current - old_current >= hpage_size) {
ksft_print_msg(
"setting nr_hugepages should not increase hugepage usage.\n");
ksft_print_msg("before: %ld, after: %ld\n", old_current, current);
return EXIT_FAILURE;
}
- addr = mmap(ADDR, LENGTH, PROTECTION, FLAGS, 0, 0);
+ addr = mmap(ADDR, length, PROTECTION, FLAGS, 0, 0);
if (addr == MAP_FAILED) {
ksft_print_msg("fail to mmap.\n");
return EXIT_FAILURE;
}
current = cg_read_long(test_group, "memory.current");
- if (current - old_current >= MB(2)) {
+ if (current - old_current >= hpage_size) {
ksft_print_msg("mmap should not increase hugepage usage.\n");
ksft_print_msg("before: %ld, after: %ld\n", old_current, current);
goto out_failed_munmap;
@@ -124,10 +134,24 @@ static int hugetlb_test_program(const char *cgroup, void *arg)
/* read the first page */
check_first(addr);
- expected_current = old_current + MB(2);
+ nr_pages = hpage_size / pagesize;
+ expected_current = old_current + hpage_size;
current = cg_read_long(test_group, "memory.current");
- if (!values_close(expected_current, current, 5)) {
- ksft_print_msg("memory usage should increase by around 2MB.\n");
+ if (nr_pages < MEMCG_CHARGE_BATCH && current == old_current) {
+ /*
+ * Memory cgroup charging uses per-CPU stocks and batched updates to the
+ * memcg usage counters. For hugetlb allocations, the number of pages
+ * that memcg charges is expressed in base pages (nr_pages), not
+ * in hugepage units. When the charge for an allocation is smaller than
+ * the internal batching threshold (nr_pages < MEMCG_CHARGE_BATCH),
+ * it may be fully satisfied from the CPU’s local stock. In such
+ * cases memory.current does not necessarily
+ * increase.
+ * Therefore, Treat a zero delta as valid behaviour here.
+ */
+ ksft_print_msg("no visible memcg charge, allocation consumed from local stock.\n");
+ } else if (!values_close(expected_current, current, 5)) {
+ ksft_print_msg("memory usage should increase by ~1 huge page.\n");
ksft_print_msg(
"expected memory: %ld, actual memory: %ld\n",
expected_current, current);
@@ -135,11 +159,11 @@ static int hugetlb_test_program(const char *cgroup, void *arg)
}
/* write to the whole range */
- write_data(addr);
+ write_data(addr, length);
current = cg_read_long(test_group, "memory.current");
- expected_current = old_current + MB(8);
+ expected_current = old_current + length;
if (!values_close(expected_current, current, 5)) {
- ksft_print_msg("memory usage should increase by around 8MB.\n");
+ ksft_print_msg("memory usage should increase by around 4 huge pages.\n");
ksft_print_msg(
"expected memory: %ld, actual memory: %ld\n",
expected_current, current);
@@ -147,7 +171,7 @@ static int hugetlb_test_program(const char *cgroup, void *arg)
}
/* unmap the whole range */
- munmap(addr, LENGTH);
+ munmap(addr, length);
current = cg_read_long(test_group, "memory.current");
expected_current = old_current;
if (!values_close(expected_current, current, 5)) {
@@ -162,13 +186,15 @@ static int hugetlb_test_program(const char *cgroup, void *arg)
return ret;
out_failed_munmap:
- munmap(addr, LENGTH);
+ munmap(addr, length);
return ret;
}
static int test_hugetlb_memcg(char *root)
{
int ret = KSFT_FAIL;
+ int num_pages = 20;
+ long hpage_size = get_hugepage_size();
char *test_group;
test_group = cg_name(root, "hugetlb_memcg_test");
@@ -177,7 +203,7 @@ static int test_hugetlb_memcg(char *root)
goto out;
}
- if (cg_write(test_group, "memory.max", "100M")) {
+ if (cg_write_numeric(test_group, "memory.max", num_pages * hpage_size * 1024)) {
ksft_print_msg("fail to set cgroup memory limit.\n");
goto out;
}
@@ -200,6 +226,7 @@ int main(int argc, char **argv)
{
char root[PATH_MAX];
int ret = EXIT_SUCCESS, has_memory_hugetlb_acc;
+ long val;
has_memory_hugetlb_acc = proc_mount_contains("memory_hugetlb_accounting");
if (has_memory_hugetlb_acc < 0)
@@ -208,12 +235,15 @@ int main(int argc, char **argv)
ksft_exit_skip("memory hugetlb accounting is disabled\n");
/* Unit is kB! */
- if (get_hugepage_size() != 2048) {
- ksft_print_msg("test_hugetlb_memcg requires 2MB hugepages\n");
+ val = get_hugepage_size();
+ if (val < 0) {
+ ksft_print_msg("Failed to read hugepage size\n");
ksft_test_result_skip("test_hugetlb_memcg\n");
return ret;
}
+ ksft_print_msg("Hugepage size: %ld kB\n", val);
+
if (cg_find_unified_root(root, sizeof(root), NULL))
ksft_exit_skip("cgroup v2 isn't mounted\n");
--
2.52.0
^ permalink raw reply related [flat|nested] 45+ messages in thread
* Re: [PATCH v3 13/13] selftests/cgroup: extend test_hugetlb_memcg.c to support all huge page sizes
2026-03-27 7:16 ` [PATCH v3 13/13] selftests/cgroup: extend test_hugetlb_memcg.c to support all huge page sizes Sayali Patil
@ 2026-04-03 17:16 ` Sayali Patil
0 siblings, 0 replies; 45+ messages in thread
From: Sayali Patil @ 2026-04-03 17:16 UTC (permalink / raw)
To: Andrew Morton, Shuah Khan, linux-mm, linux-kernel,
linux-kselftest, Ritesh Harjani
Cc: David Hildenbrand, Zi Yan, Michal Hocko, Oscar Salvador,
Lorenzo Stoakes, Dev Jain, Liam.Howlett, linuxppc-dev,
Venkat Rao Bagalkote
On 27/03/26 12:46, Sayali Patil wrote:
> The hugetlb memcg selftest was previously skipped when the configured
> huge page size was not 2MB, preventing the test from running on systems
> using other default huge page sizes.
>
> Detect the system's configured huge page size at runtime and use it for
> the allocation instead of assuming a fixed 2MB size. This allows the
> test to run on configurations using non-2MB huge pages and avoids
> unnecessary skips.
>
> Fixes: c0dddb7aa5f8 ("selftests: add a selftest to verify hugetlb usage in memcg")
> Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
> Signed-off-by: Sayali Patil <sayalip@linux.ibm.com>
> ---
> .../selftests/cgroup/test_hugetlb_memcg.c | 66 ++++++++++++++-----
> 1 file changed, 48 insertions(+), 18 deletions(-)
>
> diff --git a/tools/testing/selftests/cgroup/test_hugetlb_memcg.c b/tools/testing/selftests/cgroup/test_hugetlb_memcg.c
> index f451aa449be6..a449dbec16a8 100644
> --- a/tools/testing/selftests/cgroup/test_hugetlb_memcg.c
> +++ b/tools/testing/selftests/cgroup/test_hugetlb_memcg.c
> @@ -12,10 +12,15 @@
>
> #define ADDR ((void *)(0x0UL))
> #define FLAGS (MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB)
> -/* mapping 8 MBs == 4 hugepages */
> -#define LENGTH (8UL*1024*1024)
> #define PROTECTION (PROT_READ | PROT_WRITE)
>
> +/*
> + * This value matches the kernel's MEMCG_CHARGE_BATCH definition:
> + * see include/linux/memcontrol.h. If the kernel value changes, this
> + * test constant must be updated accordingly to stay consistent.
> + */
> +#define MEMCG_CHARGE_BATCH 64U
> +
> /* borrowed from mm/hmm-tests.c */
> static long get_hugepage_size(void)
> {
> @@ -84,11 +89,11 @@ static unsigned int check_first(char *addr)
> return *(unsigned int *)addr;
> }
>
> -static void write_data(char *addr)
> +static void write_data(char *addr, size_t length)
> {
> unsigned long i;
>
> - for (i = 0; i < LENGTH; i++)
> + for (i = 0; i < length; i++)
> *(addr + i) = (char)i;
> }
>
> @@ -96,26 +101,31 @@ static int hugetlb_test_program(const char *cgroup, void *arg)
> {
> char *test_group = (char *)arg;
> void *addr;
> + long hpage_size = get_hugepage_size() * 1024;
> long old_current, expected_current, current;
> int ret = EXIT_FAILURE;
> + size_t length = 4 * hpage_size;
> + int pagesize, nr_pages;
> +
> + pagesize = getpagesize();
>
> old_current = cg_read_long(test_group, "memory.current");
> set_nr_hugepages(20);
> current = cg_read_long(test_group, "memory.current");
> - if (current - old_current >= MB(2)) {
> + if (current - old_current >= hpage_size) {
> ksft_print_msg(
> "setting nr_hugepages should not increase hugepage usage.\n");
> ksft_print_msg("before: %ld, after: %ld\n", old_current, current);
> return EXIT_FAILURE;
> }
>
> - addr = mmap(ADDR, LENGTH, PROTECTION, FLAGS, 0, 0);
> + addr = mmap(ADDR, length, PROTECTION, FLAGS, 0, 0);
> if (addr == MAP_FAILED) {
> ksft_print_msg("fail to mmap.\n");
> return EXIT_FAILURE;
> }
> current = cg_read_long(test_group, "memory.current");
> - if (current - old_current >= MB(2)) {
> + if (current - old_current >= hpage_size) {
> ksft_print_msg("mmap should not increase hugepage usage.\n");
> ksft_print_msg("before: %ld, after: %ld\n", old_current, current);
> goto out_failed_munmap;
> @@ -124,10 +134,24 @@ static int hugetlb_test_program(const char *cgroup, void *arg)
>
> /* read the first page */
> check_first(addr);
> - expected_current = old_current + MB(2);
> + nr_pages = hpage_size / pagesize;
> + expected_current = old_current + hpage_size;
> current = cg_read_long(test_group, "memory.current");
> - if (!values_close(expected_current, current, 5)) {
> - ksft_print_msg("memory usage should increase by around 2MB.\n");
> + if (nr_pages < MEMCG_CHARGE_BATCH && current == old_current) {
> + /*
> + * Memory cgroup charging uses per-CPU stocks and batched updates to the
> + * memcg usage counters. For hugetlb allocations, the number of pages
> + * that memcg charges is expressed in base pages (nr_pages), not
> + * in hugepage units. When the charge for an allocation is smaller than
> + * the internal batching threshold (nr_pages < MEMCG_CHARGE_BATCH),
> + * it may be fully satisfied from the CPU’s local stock. In such
> + * cases memory.current does not necessarily
> + * increase.
> + * Therefore, Treat a zero delta as valid behaviour here.
> + */
> + ksft_print_msg("no visible memcg charge, allocation consumed from local stock.\n");
> + } else if (!values_close(expected_current, current, 5)) {
> + ksft_print_msg("memory usage should increase by ~1 huge page.\n");
> ksft_print_msg(
> "expected memory: %ld, actual memory: %ld\n",
> expected_current, current);
> @@ -135,11 +159,11 @@ static int hugetlb_test_program(const char *cgroup, void *arg)
> }
>
> /* write to the whole range */
> - write_data(addr);
> + write_data(addr, length);
> current = cg_read_long(test_group, "memory.current");
> - expected_current = old_current + MB(8);
> + expected_current = old_current + length;
> if (!values_close(expected_current, current, 5)) {
> - ksft_print_msg("memory usage should increase by around 8MB.\n");
> + ksft_print_msg("memory usage should increase by around 4 huge pages.\n");
> ksft_print_msg(
> "expected memory: %ld, actual memory: %ld\n",
> expected_current, current);
> @@ -147,7 +171,7 @@ static int hugetlb_test_program(const char *cgroup, void *arg)
> }
>
> /* unmap the whole range */
> - munmap(addr, LENGTH);
> + munmap(addr, length);
> current = cg_read_long(test_group, "memory.current");
> expected_current = old_current;
> if (!values_close(expected_current, current, 5)) {
> @@ -162,13 +186,15 @@ static int hugetlb_test_program(const char *cgroup, void *arg)
> return ret;
>
> out_failed_munmap:
> - munmap(addr, LENGTH);
> + munmap(addr, length);
> return ret;
> }
>
> static int test_hugetlb_memcg(char *root)
> {
> int ret = KSFT_FAIL;
> + int num_pages = 20;
> + long hpage_size = get_hugepage_size();
> char *test_group;
>
> test_group = cg_name(root, "hugetlb_memcg_test");
> @@ -177,7 +203,7 @@ static int test_hugetlb_memcg(char *root)
> goto out;
> }
>
> - if (cg_write(test_group, "memory.max", "100M")) {
> + if (cg_write_numeric(test_group, "memory.max", num_pages * hpage_size * 1024)) {
> ksft_print_msg("fail to set cgroup memory limit.\n");
> goto out;
> }
> @@ -200,6 +226,7 @@ int main(int argc, char **argv)
> {
> char root[PATH_MAX];
> int ret = EXIT_SUCCESS, has_memory_hugetlb_acc;
> + long val;
>
> has_memory_hugetlb_acc = proc_mount_contains("memory_hugetlb_accounting");
> if (has_memory_hugetlb_acc < 0)
> @@ -208,12 +235,15 @@ int main(int argc, char **argv)
> ksft_exit_skip("memory hugetlb accounting is disabled\n");
>
> /* Unit is kB! */
> - if (get_hugepage_size() != 2048) {
> - ksft_print_msg("test_hugetlb_memcg requires 2MB hugepages\n");
> + val = get_hugepage_size();
> + if (val < 0) {
> + ksft_print_msg("Failed to read hugepage size\n");
> ksft_test_result_skip("test_hugetlb_memcg\n");
> return ret;
> }
>
> + ksft_print_msg("Hugepage size: %ld kB\n", val);
> +
> if (cg_find_unified_root(root, sizeof(root), NULL))
> ksft_exit_skip("cgroup v2 isn't mounted\n");
>
AI review question:
> > - addr = mmap(ADDR, LENGTH, PROTECTION, FLAGS, 0, 0);
> > + addr = mmap(ADDR, length, PROTECTION, FLAGS, 0, 0);
> > if (addr == MAP_FAILED) {
> > ksft_print_msg("fail to mmap.\n");
> > return EXIT_FAILURE;
> Should the test gracefully handle memory constraints by skipping
> instead of returning EXIT_FAILURE here?
> With the removal of the 2MB skip condition, the test will now run on systems
> with large default huge pages (e.g., 1GB on x86_64). Reserving 20 huge pages
> (20GB) and mapping 4 huge pages (4GB) might easily fail due to memory
> exhaustion in typical automated testing environments, causing false
> test regressions.
I’ll update it in v4 to skip on memory constraints instead of returning
EXIT_FAILURE.
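One possible shape for that change, as a minimal sketch and not the actual v4 code: classify the mmap() failure by errno and treat ENOMEM (the error a hugetlb mapping returns when the pool cannot back the request) as a skip. The helper name and the SKETCH_KSFT_* constants are hypothetical; the values mirror the exit codes in tools/testing/selftests/kselftest.h.

```c
#include <assert.h>
#include <errno.h>

/* Local stand-ins for the kselftest exit codes (KSFT_FAIL = 1, KSFT_SKIP = 4). */
enum { SKETCH_KSFT_FAIL = 1, SKETCH_KSFT_SKIP = 4 };

/*
 * Hypothetical helper: map the errno left behind by a failed mmap() to a
 * test verdict. ENOMEM is taken to mean "not enough huge pages on this
 * machine", which is an environment limit rather than a kernel bug.
 */
static int classify_mmap_failure(int err)
{
	assert(err != 0);	/* only meaningful after mmap() returned MAP_FAILED */

	return err == ENOMEM ? SKETCH_KSFT_SKIP : SKETCH_KSFT_FAIL;
}
```

In the test this would be called as classify_mmap_failure(errno) right after the MAP_FAILED check. How the verdict is propagated out of hugetlb_test_program() is left open here.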
> > @@ -124,10 +134,24 @@ static int hugetlb_test_program(const char *cgroup, void *arg)
> >
> > /* read the first page */
> > check_first(addr);
> > - expected_current = old_current + MB(2);
> > + nr_pages = hpage_size / pagesize;
> > + expected_current = old_current + hpage_size;
> > current = cg_read_long(test_group, "memory.current");
> > - if (!values_close(expected_current, current, 5)) {
> > - ksft_print_msg("memory usage should increase by around 2MB.\n");
> > + if (nr_pages < MEMCG_CHARGE_BATCH && current == old_current) {
> > + /*
> > + * Memory cgroup charging uses per-CPU stocks and batched updates to the
> > + * memcg usage counters. For hugetlb allocations, the number of pages
> > + * that memcg charges is expressed in base pages (nr_pages), not
> > + * in hugepage units. When the charge for an allocation is smaller than
> > + * the internal batching threshold (nr_pages < MEMCG_CHARGE_BATCH),
> > + * it may be fully satisfied from the CPU’s local stock. In such
> > + * cases memory.current does not necessarily
> > + * increase.
> > + * Therefore, Treat a zero delta as valid behaviour here.
> > + */
> > + ksft_print_msg("no visible memcg charge, allocation consumed from local stock.\n");
> > + } else if (!values_close(expected_current, current, 5)) {
> Does this exception fully account for memcg stock batching when the per-CPU
> stock is empty?
> If the stock lacks sufficient pages, charging the huge page triggers a stock
> refill that charges exactly MEMCG_CHARGE_BATCH (64) base pages from the global
> memcg counter.
> On an architecture with 64KB base pages and 2MB contiguous huge pages,
> nr_pages is 32. Refilling the stock would charge 64 base pages (4MB), which
> increases memory.current by 4MB instead of the expected 2MB. Since
> current != old_current, the test falls through to values_close(), which expects
> a 2MB increase and may fail the test.
Valid point.
The current exception does not fully account for this
scenario when the per-CPU stock is empty and a refill charges
MEMCG_CHARGE_BATCH. This can indeed lead to a larger-than-expected jump
in memory.current and cause the test to fail.
I’ll update the logic in v4 to handle this case more robustly.
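A rough sketch of what a more tolerant check could look like, assuming the behaviour described in the review (a small charge is either served entirely from the per-CPU stock or triggers a refill of one full batch). This is not the v4 code: SKETCH_CHARGE_BATCH, within_pct() and first_touch_delta_plausible() are hypothetical names, and the batch size of 64 base pages is taken from the review comment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

#define SKETCH_CHARGE_BATCH 64L	/* base pages per batch, per the review comment */

/* True if value lies within pct percent of expected. */
static bool within_pct(long value, long expected, int pct)
{
	return labs(value - expected) <= expected / 100 * pct;
}

/*
 * Decide whether the memory.current delta seen after touching one huge page
 * is plausible. All sizes are in bytes.
 */
static bool first_touch_delta_plausible(long delta, long hpage_bytes,
					long page_bytes)
{
	long nr_pages, batch_bytes;

	assert(hpage_bytes > 0 && page_bytes > 0);
	nr_pages = hpage_bytes / page_bytes;
	batch_bytes = SKETCH_CHARGE_BATCH * page_bytes;

	/* Charges at or above the batch size are expected to show up as-is. */
	if (nr_pages >= SKETCH_CHARGE_BATCH)
		return within_pct(delta, hpage_bytes, 5);

	/* Small charge: no visible change, one huge page, or one full batch. */
	return delta == 0 ||
	       within_pct(delta, hpage_bytes, 5) ||
	       within_pct(delta, batch_bytes, 5);
}
```

With 64KB base pages and 2MB huge pages (nr_pages = 32), this accepts a delta of 0, 2MB or 4MB and rejects anything else. With 4KB base pages the same huge page is 512 base pages, so only a delta of about 2MB passes.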
> > @@ -177,7 +203,7 @@ static int test_hugetlb_memcg(char *root)
> > goto out;
> > }
> >
> > - if (cg_write(test_group, "memory.max", "100M")) {
> > + if (cg_write_numeric(test_group, "memory.max", num_pages * hpage_size * 1024)) {
> Can this calculation overflow on 32-bit systems?
> Since long is 32 bits on 32-bit systems, num_pages * hpage_size * 1024 can
> exceed the 32-bit signed integer maximum if the architecture supports large
> huge pages (e.g., 256MB on MIPS). This would evaluate to 5,368,709,120,
> resulting in a negative or truncated value, which sets memory.max to an
> invalid or overly restrictive limit.
Yes, this can overflow on 32-bit systems. I’ll fix it in v4.
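A minimal sketch of one way to avoid the overflow, not the actual v4 fix: do the multiplication in 64 bits and check it explicitly. memcg_limit_bytes() is a hypothetical helper, and __builtin_mul_overflow is a GCC/Clang builtin, so this assumes one of those compilers.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Compute num_pages * hpage_kb * 1024 in 64-bit arithmetic.
 * hpage_kb is the huge page size in kB, as returned by get_hugepage_size().
 * Returns 0 if the product does not fit in 64 bits.
 */
static uint64_t memcg_limit_bytes(int num_pages, long hpage_kb)
{
	uint64_t bytes;

	assert(num_pages > 0 && hpage_kb > 0);

	if (__builtin_mul_overflow((uint64_t)num_pages, (uint64_t)hpage_kb, &bytes) ||
	    __builtin_mul_overflow(bytes, (uint64_t)1024, &bytes))
		return 0;

	return bytes;
}
```

For the 256MB case from the review, memcg_limit_bytes(20, 262144) yields 5368709120 regardless of the width of long. The result would still have to reach memory.max without being narrowed again, for example by formatting it into a string and writing that with cg_write().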
^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: [PATCH v3 00/13] selftests/mm: fix failures and robustness improvements
2026-03-27 7:15 [PATCH v3 00/13] selftests/mm: fix failures and robustness improvements Sayali Patil
` (12 preceding siblings ...)
2026-03-27 7:16 ` [PATCH v3 13/13] selftests/cgroup: extend test_hugetlb_memcg.c to support all huge page sizes Sayali Patil
@ 2026-03-27 18:11 ` Andrew Morton
2026-03-30 5:57 ` Sayali Patil
13 siblings, 1 reply; 45+ messages in thread
From: Andrew Morton @ 2026-03-27 18:11 UTC (permalink / raw)
To: Sayali Patil
Cc: Shuah Khan, linux-mm, linux-kernel, linux-kselftest,
Ritesh Harjani, David Hildenbrand, Zi Yan, Michal Hocko,
Oscar Salvador, Lorenzo Stoakes, Dev Jain, Liam.Howlett,
linuxppc-dev
On Fri, 27 Mar 2026 12:45:54 +0530 Sayali Patil <sayalip@linux.ibm.com> wrote:
> Powerpc systems with a 64K base page size exposed several issues while
> running mm selftests. Some tests assume specific hugetlb configurations,
> use incorrect interfaces, or fail instead of skipping when the required
> kernel features are not available.
>
> This series fixes these issues and improves test robustness.
>
> Please review the patches and provide any feedback or suggestions for
> improvement.
AI review asks many questions:
https://sashiko.dev/#/patchset/cover.1774591179.git.sayalip@linux.ibm.com
I never knew about that bash line continuation thing.
hp2:/home/akpm> cat t.sh
foo=\
bar
echo $foo
hp2:/home/akpm> bash t.sh
t.sh: line 3: bar: command not found
Huh. But it presumably passed your testing so confused.
I don't want to risk breaking selftests so I'll set v3 aside until
you're confident we should proceed.
Thanks.
^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: [PATCH v3 00/13] selftests/mm: fix failures and robustness improvements
2026-03-27 18:11 ` [PATCH v3 00/13] selftests/mm: fix failures and robustness improvements Andrew Morton
@ 2026-03-30 5:57 ` Sayali Patil
2026-03-30 22:11 ` Andrew Morton
0 siblings, 1 reply; 45+ messages in thread
From: Sayali Patil @ 2026-03-30 5:57 UTC (permalink / raw)
To: Andrew Morton
Cc: Shuah Khan, linux-mm, linux-kernel, linux-kselftest,
Ritesh Harjani, David Hildenbrand, Zi Yan, Michal Hocko,
Oscar Salvador, Lorenzo Stoakes, Dev Jain, Liam.Howlett,
linuxppc-dev
[-- Attachment #1: Type: text/plain, Size: 1128 bytes --]
On 27/03/26 23:41, Andrew Morton wrote:
> AI review asks many questions:
> https://sashiko.dev/#/patchset/cover.1774591179.git.sayalip@linux.ibm.com
Thanks, let me check them.
>
> I never knew about that bash line continuation thing.
>
> hp2:/home/akpm> cat t.sh
>
> foo=\
> bar
>
> echo $foo
> hp2:/home/akpm> bash t.sh
> t.sh: line 3: bar: command not found
>
> Huh. But it presumably passed your testing so confused.
>
>
> I don't want to risk breaking selftests so I'll set v3 aside until
> you're confident we should proceed.
>
> Thanks.
This line continuation pattern has been used in selftests for quite some
time. For example, a similar usage exists in
|charge_reserved_hugetlb.sh|, introduced here:
https://lore.kernel.org/all/20200211213128.73302-8-almasrymina@google.com/T/#u
echo "$reservation_limit" > \
$cgroup_path/$name/hugetlb.${MB}MB.$reservation_limit_file
In this case, it was primarily used to keep line length within 100
characters. I’ve tested the script and it behaved as expected.
Thanks,
Sayali
[-- Attachment #2: Type: text/html, Size: 2732 bytes --]
^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: [PATCH v3 00/13] selftests/mm: fix failures and robustness improvements
2026-03-30 5:57 ` Sayali Patil
@ 2026-03-30 22:11 ` Andrew Morton
2026-04-01 14:05 ` David Hildenbrand (Arm)
2026-04-01 15:03 ` Sayali Patil
0 siblings, 2 replies; 45+ messages in thread
From: Andrew Morton @ 2026-03-30 22:11 UTC (permalink / raw)
To: Sayali Patil
Cc: Shuah Khan, linux-mm, linux-kernel, linux-kselftest,
Ritesh Harjani, David Hildenbrand, Zi Yan, Michal Hocko,
Oscar Salvador, Lorenzo Stoakes, Dev Jain, Liam.Howlett,
linuxppc-dev
On Mon, 30 Mar 2026 11:27:04 +0530 Sayali Patil <sayalip@linux.ibm.com> wrote:
> > I don't want to risk breaking selftests so I'll set v3 aside until
> > you're confident we should proceed.
> >
> > Thanks.
>
> This line continuation pattern has been used in selftests for quite some
> time. For example, a similar usage exists in
> |charge_reserved_hugetlb.sh|, introduced here:
> https://lore.kernel.org/all/20200211213128.73302-8-almasrymina@google.com/T/#u
>
> echo "$reservation_limit" > \
> $cgroup_path/$name/hugetlb.${MB}MB.$reservation_limit_file
>
> In this case, it was primarily used to keep line length within 100
> characters. I’ve tested the script and it behaved as expected.
Great, thanks for checking.
Series is nicely reviewed and an earlier version spent time in mm.git.
And the bar tends to be lower for selftests. So I *could* break my rule
(https://lkml.kernel.org/r/20260323202941.08ddf2b0411501cae801ab4c@linux-foundation.org)
but would prefer not. What do others think?
Did Venkat's report
(https://lkml.kernel.org/r/cf815c21-138e-44c8-986d-d8496503ee32@linux.ibm.com)
get addressed? I'm not seeing that in the v2->v3 changelogging.
^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: [PATCH v3 00/13] selftests/mm: fix failures and robustness improvements
2026-03-30 22:11 ` Andrew Morton
@ 2026-04-01 14:05 ` David Hildenbrand (Arm)
2026-04-01 15:03 ` Sayali Patil
1 sibling, 0 replies; 45+ messages in thread
From: David Hildenbrand (Arm) @ 2026-04-01 14:05 UTC (permalink / raw)
To: Andrew Morton, Sayali Patil
Cc: Shuah Khan, linux-mm, linux-kernel, linux-kselftest,
Ritesh Harjani, Zi Yan, Michal Hocko, Oscar Salvador,
Lorenzo Stoakes, Dev Jain, Liam.Howlett, linuxppc-dev
On 3/31/26 00:11, Andrew Morton wrote:
> On Mon, 30 Mar 2026 11:27:04 +0530 Sayali Patil <sayalip@linux.ibm.com> wrote:
>
>>> I don't want to risk breaking selftests so I'll set v3 aside until
>>> you're confident we should proceed.
>>>
>>> Thanks.
>>
>> This line continuation pattern has been used in selftests for quite some
>> time. For example, a similar usage exists in
>> |charge_reserved_hugetlb.sh|, introduced here:
>> https://lore.kernel.org/all/20200211213128.73302-8-almasrymina@google.com/T/#u
>>
>> echo "$reservation_limit" > \
>> $cgroup_path/$name/hugetlb.${MB}MB.$reservation_limit_file
>>
>> In this case, it was primarily used to keep line length within 100
>> characters. I’ve tested the script and it behaved as expected.
>
> Great, thanks for checking.
>
> Series is nicely reviewed and an earlier version spent time in mm.git.
> And the bar tends to be lower for selftests. So I *could* break my rule
> (https://lkml.kernel.org/r/20260323202941.08ddf2b0411501cae801ab4c@linux-foundation.org)
> but would prefer not. What do others think?
I guess it doesn't really hurt to take this now.
--
Cheers,
David
^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: [PATCH v3 00/13] selftests/mm: fix failures and robustness improvements
2026-03-30 22:11 ` Andrew Morton
2026-04-01 14:05 ` David Hildenbrand (Arm)
@ 2026-04-01 15:03 ` Sayali Patil
1 sibling, 0 replies; 45+ messages in thread
From: Sayali Patil @ 2026-04-01 15:03 UTC (permalink / raw)
To: Andrew Morton
Cc: Shuah Khan, linux-mm, linux-kernel, linux-kselftest,
Ritesh Harjani, David Hildenbrand, Zi Yan, Michal Hocko,
Oscar Salvador, Lorenzo Stoakes, Dev Jain, Liam.Howlett,
linuxppc-dev
[-- Attachment #1: Type: text/plain, Size: 1677 bytes --]
On 31/03/26 03:41, Andrew Morton wrote:
> On Mon, 30 Mar 2026 11:27:04 +0530 Sayali Patil<sayalip@linux.ibm.com> wrote:
>
>>> I don't want to risk breaking selftests so I'll set v3 aside until
>>> you're confident we should proceed.
>>>
>>> Thanks.
>> This line continuation pattern has been used in selftests for quite some
>> time. For example, a similar usage exists in
>> |charge_reserved_hugetlb.sh|, introduced here:
>> https://lore.kernel.org/all/20200211213128.73302-8-almasrymina@google.com/T/#u
>>
>> echo "$reservation_limit" > \
>> $cgroup_path/$name/hugetlb.${MB}MB.$reservation_limit_file
>>
>> In this case, it was primarily used to keep line length within 100
>> characters. I’ve tested the script and it behaved as expected.
> Great, thanks for checking.
>
> Series is nicely reviewed and an earlier version spent time in mm.git.
> And the bar tends to be lower for selftests. So I *could* break my rule
> (https://lkml.kernel.org/r/20260323202941.08ddf2b0411501cae801ab4c@linux-foundation.org)
> but would prefer not. What do others think?
>
> Did Venkat's report
> (https://lkml.kernel.org/r/cf815c21-138e-44c8-986d-d8496503ee32@linux.ibm.com)
> get addressed? I'm not seeing that in the v2->v3 changelogging.
>
Hi Andrew,
I am making changes as per AI review comments and will include them in v4.
The comments were helpful and should improve the overall quality of the
series.
Also, Venkat's report has been addressed in v3 in the "selftest/mm: fix
cgroup task placement and drop memory.current checks in
hugetlb_reparenting_test.sh" patch.
Thanks,
Sayali
[-- Attachment #2: Type: text/html, Size: 3039 bytes --]
^ permalink raw reply [flat|nested] 45+ messages in thread