linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] selftests/mm: add test for (BATCH_PROCESS)MADV_DONTNEED
@ 2025-06-21 13:30 wang lian
  2025-06-23 12:35 ` Lorenzo Stoakes
  0 siblings, 1 reply; 6+ messages in thread
From: wang lian @ 2025-06-21 13:30 UTC (permalink / raw)
  To: Andrew Morton, Shuah Khan
  Cc: Christian Brauner, linux-kernel, linux-mm, linux-kselftest,
	SeongJae Park, zijing.zhang, ryncsn, p1ucky0923, gkwang,
	Lian Wang

From: Lian Wang <lianux.mm@gmail.com>

Let's add a simple test for MADV_DONTNEED and PROCESS_MADV_DONTNEED,
and inspired by SeongJae Park's test at GitHub[1] add batch test
for PROCESS_MADV_DONTNEED,but for now it influence by workload and
need add some race conditions test.We can add it later.

Signed-off-by: Lian Wang <lianux.mm@gmail.com>
References
==========

[1] https://github.com/sjp38/eval_proc_madvise

---
 tools/testing/selftests/mm/.gitignore      |   1 +
 tools/testing/selftests/mm/Makefile        |   1 +
 tools/testing/selftests/mm/madv_dontneed.c | 220 +++++++++++++++++++++
 tools/testing/selftests/mm/run_vmtests.sh  |   5 +
 4 files changed, 227 insertions(+)
 create mode 100644 tools/testing/selftests/mm/madv_dontneed.c

diff --git a/tools/testing/selftests/mm/.gitignore b/tools/testing/selftests/mm/.gitignore
index 824266982aa3..911f39d634be 100644
--- a/tools/testing/selftests/mm/.gitignore
+++ b/tools/testing/selftests/mm/.gitignore
@@ -25,6 +25,7 @@ pfnmap
 protection_keys
 protection_keys_32
 protection_keys_64
+madv_dontneed
 madv_populate
 uffd-stress
 uffd-unit-tests
diff --git a/tools/testing/selftests/mm/Makefile b/tools/testing/selftests/mm/Makefile
index ae6f994d3add..2352252f3914 100644
--- a/tools/testing/selftests/mm/Makefile
+++ b/tools/testing/selftests/mm/Makefile
@@ -67,6 +67,7 @@ TEST_GEN_FILES += hugepage-mremap
 TEST_GEN_FILES += hugepage-shm
 TEST_GEN_FILES += hugepage-vmemmap
 TEST_GEN_FILES += khugepaged
+TEST_GEN_FILES += madv_dontneed
 TEST_GEN_FILES += madv_populate
 TEST_GEN_FILES += map_fixed_noreplace
 TEST_GEN_FILES += map_hugetlb
diff --git a/tools/testing/selftests/mm/madv_dontneed.c b/tools/testing/selftests/mm/madv_dontneed.c
new file mode 100644
index 000000000000..b88444da7f9e
--- /dev/null
+++ b/tools/testing/selftests/mm/madv_dontneed.c
@@ -0,0 +1,220 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * MADV_DONTNEED and PROCESS_MADV_DONTNEED tests
+ *
+ * Copyright (C) 2025, Linx Software Corp.
+ *
+ * Author(s): Lian Wang <lianux.mm@gmail.com>
+ */
+#define _GNU_SOURCE
+#include <stdlib.h>
+#include <string.h>
+#include <stdbool.h>
+#include <stdint.h>
+#include <unistd.h>
+#include <errno.h>
+#include <fcntl.h>
+#include <linux/mman.h>
+#include <sys/mman.h>
+#include <sys/syscall.h>
+#include "vm_util.h"
+#include <time.h>
+
+#include "../kselftest.h"
+
+/*
+ * For now, we're using 2 MiB of private anonymous memory for all tests.
+ */
+#define SIZE (256 * 1024 * 1024)
+
+static size_t pagesize;
+
+static void sense_support(void)
+{
+	char *addr;
+	int ret;
+
+	addr = mmap(0, pagesize, PROT_READ | PROT_WRITE,
+		    MAP_ANONYMOUS | MAP_PRIVATE, 0, 0);
+	if (!addr)
+		ksft_exit_fail_msg("mmap failed\n");
+
+	ret = madvise(addr, pagesize, MADV_DONTNEED);
+	if (ret)
+		ksft_exit_skip("MADV_DONTNEED is not available\n");
+
+	munmap(addr, pagesize);
+}
+
+/*
+ * Read pagemap to check page is present in mermory
+ */
+static bool is_page_present(void *addr)
+{
+	uintptr_t vaddr = (uintptr_t)addr;
+	uintptr_t offset = (vaddr / pagesize) * sizeof(uint64_t);
+	ssize_t bytes_read;
+	uint64_t entry;
+	bool ret;
+	int fd;
+
+	fd = open("/proc/self/pagemap", O_RDONLY);
+	if (fd < 0) {
+		ksft_exit_fail_msg("opening pagemap failed\n");
+		ret = false;
+	}
+
+	if ((lseek(fd, offset, SEEK_SET)) == -1) {
+		close(fd);
+		ret = false;
+	}
+
+	bytes_read = read(fd, &entry, sizeof(entry));
+	close(fd);
+
+	if (bytes_read != sizeof(entry)) {
+		perror("read failed");
+		return false;
+	}
+
+	if (entry & (1ULL << 63))
+		ret = true;
+
+	return ret;
+}
+
+/*
+ * test madvsise_dontneed
+ */
+static void test_madv_dontneed(void)
+{
+	unsigned long rss_anon_before, rss_anon_after;
+	bool present, rss;
+	char *addr;
+	int ret;
+
+	ksft_print_msg("[RUN] %s\n", __func__);
+
+	addr = mmap(0, SIZE, PROT_READ | PROT_WRITE,
+		    MAP_ANONYMOUS | MAP_PRIVATE, 0, 0);
+	if (!addr)
+		ksft_exit_fail_msg("mmap failed\n");
+
+	memset(addr, 0x7A, SIZE);
+
+	rss_anon_before = rss_anon();
+	if (!rss_anon_before)
+		ksft_exit_fail_msg("No RssAnon is allocated before dontneed\n");
+	ret = madvise(addr, SIZE, MADV_DONTNEED);
+	ksft_test_result(!ret, "MADV_DONTNEED\n");
+
+	rss_anon_after = rss_anon();
+	if (rss_anon_after < rss_anon_before)
+		rss = true;
+	ksft_test_result(rss, "MADV_DONTNEED rss is correct\n");
+
+	for (size_t i = 0; i < SIZE; i += pagesize) {
+		present = is_page_present(addr + i);
+		if (present) {
+			ksft_print_msg("Page not zero at offset %zu\n",
+				       (size_t)i);
+		}
+	}
+
+	ksft_test_result(!present, "MADV_DONTNEED page is present\n");
+	munmap(addr, SIZE);
+}
+
+/*
+ * Measure performance of batched process_madvise vs madvise
+ */
+static int measure_process_madvise_batching(int hint, int total_size,
+					    int single_unit, int batch_size)
+{
+	struct iovec *vec = malloc(sizeof(*vec) * batch_size);
+	struct timespec start, end;
+	unsigned long elapsed_ns = 0;
+	unsigned long nr_measures = 0;
+	pid_t pid = getpid();
+	char *buf;
+	int pidfd;
+
+	pidfd = syscall(SYS_pidfd_open, pid, 0);
+	if (pidfd == -1) {
+		perror("pidfd_open fail");
+		return -1;
+	}
+
+	buf = mmap(NULL, total_size, PROT_READ | PROT_WRITE,
+		   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
+	if (buf == MAP_FAILED) {
+		perror("mmap fail");
+		goto out;
+	}
+
+	if (!vec) {
+		perror("malloc vec failed");
+		goto unmap_out;
+	}
+
+	while (elapsed_ns < 5UL * 1000 * 1000 * 1000) {
+		memset(buf, 0x7A, total_size);
+
+		clock_gettime(CLOCK_MONOTONIC, &start);
+
+		for (int off = 0; off < total_size;
+		     off += single_unit * batch_size) {
+			for (int i = 0; i < batch_size; i++) {
+				vec[i].iov_base = buf + off + i * single_unit;
+				vec[i].iov_len = single_unit;
+			}
+			syscall(SYS_process_madvise, pidfd, vec, batch_size,
+				hint, 0);
+		}
+
+		clock_gettime(CLOCK_MONOTONIC, &end);
+		elapsed_ns += (end.tv_sec - start.tv_sec) * 1e9 +
+			      (end.tv_nsec - start.tv_nsec);
+		nr_measures++;
+	}
+
+	ksft_print_msg("[RESULT] batch=%d time=%.3f us/op\n", batch_size,
+		       (double)(elapsed_ns / nr_measures) /
+			       (total_size / single_unit));
+
+	free(vec);
+unmap_out:
+	munmap(buf, total_size);
+out:
+	close(pidfd);
+	return 0;
+}
+
+static void test_perf_batch_process(void)
+{
+	ksft_print_msg("[RUN] %s\n", __func__);
+	measure_process_madvise_batching(MADV_DONTNEED, SIZE, pagesize, 1);
+	measure_process_madvise_batching(MADV_DONTNEED, SIZE, pagesize, 2);
+	measure_process_madvise_batching(MADV_DONTNEED, SIZE, pagesize, 4);
+	ksft_test_result(1, "All test were done\n");
+}
+
+int main(int argc, char **argv)
+{
+	int err;
+
+	pagesize = getpagesize();
+
+	ksft_print_header();
+	ksft_set_plan(4);
+
+	sense_support();
+	test_madv_dontneed();
+	test_perf_batch_process();
+
+	err = ksft_get_fail_cnt();
+	if (err)
+		ksft_exit_fail_msg("%d out of %d tests failed\n", err,
+				   ksft_test_num());
+	ksft_exit_pass();
+}
diff --git a/tools/testing/selftests/mm/run_vmtests.sh b/tools/testing/selftests/mm/run_vmtests.sh
index dddd1dd8af14..f96d43153fc0 100755
--- a/tools/testing/selftests/mm/run_vmtests.sh
+++ b/tools/testing/selftests/mm/run_vmtests.sh
@@ -47,6 +47,8 @@ separated by spaces:
 	hmm smoke tests
 - madv_guard
 	test madvise(2) MADV_GUARD_INSTALL and MADV_GUARD_REMOVE options
+- madv_dontneed
+	test memadvise(2) MADV_DONTNEED and PROCESS_MADV_DONTNEED options
 - madv_populate
 	test memadvise(2) MADV_POPULATE_{READ,WRITE} options
 - memfd_secret
@@ -422,6 +424,9 @@ CATEGORY="hmm" run_test bash ./test_hmm.sh smoke
 # MADV_GUARD_INSTALL and MADV_GUARD_REMOVE tests
 CATEGORY="madv_guard" run_test ./guard-regions
 
+# MADV_DONTNEED and PROCESS_DONTNEED tests
+CATEGORY="madv_dontneed" run_test ./madv_dontneed
+
 # MADV_POPULATE_READ and MADV_POPULATE_WRITE tests
 CATEGORY="madv_populate" run_test ./madv_populate
 
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH] selftests/mm: add test for (BATCH_PROCESS)MADV_DONTNEED
  2025-06-21 13:30 [PATCH] selftests/mm: add test for (BATCH_PROCESS)MADV_DONTNEED wang lian
@ 2025-06-23 12:35 ` Lorenzo Stoakes
  2025-06-23 13:49   ` David Hildenbrand
  0 siblings, 1 reply; 6+ messages in thread
From: Lorenzo Stoakes @ 2025-06-23 12:35 UTC (permalink / raw)
  To: wang lian
  Cc: Andrew Morton, Shuah Khan, Christian Brauner, linux-kernel,
	linux-mm, linux-kselftest, SeongJae Park, zijing.zhang, ryncsn,
	p1ucky0923, gkwang, Liam R. Howlett, David Hildenbrand,
	Vlastimil Babka, Jann Horn

+cc Liam, David, Vlastimil, Jann

(it might not be obvious from get_maintainers.pl but please cc
maintainers/reviewers of the thing you are adding a test for, thanks!)

Overall I'm not in favour of us taking this patch.

There are a number of issues with it (explained inline below), but those aside,
it seems to be:

- Checking whether a simple anon buffer of arbitrary size is zapped by
  MADV_DONTNEED.

- Printing out a dubious microbenchmark that seems to be mostly asserting that
  fewer sycalls are faster when using process_madvise() locally.

And I'm struggling to see the value of that.

The test is also slow and will slow down a test run for little benefit.

On Sat, Jun 21, 2025 at 09:30:03PM +0800, wang lian wrote:
> From: Lian Wang <lianux.mm@gmail.com>
>
> Let's add a simple test for MADV_DONTNEED and PROCESS_MADV_DONTNEED,

I'm not sure what PROCESS_MADV_DONTNEED is?

you mean process_madvise(..., MADV_DONTNEED, ...)?

> and inspired by SeongJae Park's test at GitHub[1] add batch test
> for PROCESS_MADV_DONTNEED,but for now it influence by workload and
> need add some race conditions test.We can add it later.
>
> Signed-off-by: Lian Wang <lianux.mm@gmail.com>
> References
> ==========
>
> [1] https://github.com/sjp38/eval_proc_madvise
>
> ---
>  tools/testing/selftests/mm/.gitignore      |   1 +
>  tools/testing/selftests/mm/Makefile        |   1 +
>  tools/testing/selftests/mm/madv_dontneed.c | 220 +++++++++++++++++++++
>  tools/testing/selftests/mm/run_vmtests.sh  |   5 +
>  4 files changed, 227 insertions(+)
>  create mode 100644 tools/testing/selftests/mm/madv_dontneed.c
>
> diff --git a/tools/testing/selftests/mm/.gitignore b/tools/testing/selftests/mm/.gitignore
> index 824266982aa3..911f39d634be 100644
> --- a/tools/testing/selftests/mm/.gitignore
> +++ b/tools/testing/selftests/mm/.gitignore
> @@ -25,6 +25,7 @@ pfnmap
>  protection_keys
>  protection_keys_32
>  protection_keys_64
> +madv_dontneed
>  madv_populate
>  uffd-stress
>  uffd-unit-tests
> diff --git a/tools/testing/selftests/mm/Makefile b/tools/testing/selftests/mm/Makefile
> index ae6f994d3add..2352252f3914 100644
> --- a/tools/testing/selftests/mm/Makefile
> +++ b/tools/testing/selftests/mm/Makefile
> @@ -67,6 +67,7 @@ TEST_GEN_FILES += hugepage-mremap
>  TEST_GEN_FILES += hugepage-shm
>  TEST_GEN_FILES += hugepage-vmemmap
>  TEST_GEN_FILES += khugepaged
> +TEST_GEN_FILES += madv_dontneed
>  TEST_GEN_FILES += madv_populate
>  TEST_GEN_FILES += map_fixed_noreplace
>  TEST_GEN_FILES += map_hugetlb
> diff --git a/tools/testing/selftests/mm/madv_dontneed.c b/tools/testing/selftests/mm/madv_dontneed.c
> new file mode 100644
> index 000000000000..b88444da7f9e
> --- /dev/null
> +++ b/tools/testing/selftests/mm/madv_dontneed.c
> @@ -0,0 +1,220 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/*
> + * MADV_DONTNEED and PROCESS_MADV_DONTNEED tests
> + *
> + * Copyright (C) 2025, Linx Software Corp.
> + *
> + * Author(s): Lian Wang <lianux.mm@gmail.com>
> + */
> +#define _GNU_SOURCE
> +#include <stdlib.h>
> +#include <string.h>
> +#include <stdbool.h>
> +#include <stdint.h>
> +#include <unistd.h>
> +#include <errno.h>
> +#include <fcntl.h>
> +#include <linux/mman.h>
> +#include <sys/mman.h>
> +#include <sys/syscall.h>
> +#include "vm_util.h"
> +#include <time.h>
> +
> +#include "../kselftest.h"
> +
> +/*
> + * For now, we're using 2 MiB of private anonymous memory for all tests.
> + */
> +#define SIZE (256 * 1024 * 1024)

This is 256 MB?

Also you don't need to do:

/*
 * foo bar baz bar foo
 */

Just do:

/* foo bar baz bar foo */

> +
> +static size_t pagesize;
> +
> +static void sense_support(void)
> +{
> +	char *addr;
> +	int ret;
> +
> +	addr = mmap(0, pagesize, PROT_READ | PROT_WRITE,
> +		    MAP_ANONYMOUS | MAP_PRIVATE, 0, 0);

This is incorrect, you need to specify -1 for fd for anon mappings, not zero.

Also the first parameter in mmap() is a pointer, use NULL.

> +	if (!addr)
> +		ksft_exit_fail_msg("mmap failed\n");

This is incorrect, you need to check for MAP_FAILED.

> +
> +	ret = madvise(addr, pagesize, MADV_DONTNEED);
> +	if (ret)
> +		ksft_exit_skip("MADV_DONTNEED is not available\n");

This is incorrect, you're testing for a current kernel, MADV_DONTNEED is very
definitely always available.

I'm guessing you've copy/pasted from madv_populate.c, which really shouldn't be
'sensing' if this option is available either - the tests are for the most recent
kernel, in which you know the behaviour is supported.

Anyway you should just drop this.

> +
> +	munmap(addr, pagesize);
> +}
> +
> +/*
> + * Read pagemap to check page is present in mermory
> + */
> +static bool is_page_present(void *addr)

This seems a flakey thing to test? While unlikely, a page might be swapped
out. So you will want to add a check for this also.

> +{
> +	uintptr_t vaddr = (uintptr_t)addr;
> +	uintptr_t offset = (vaddr / pagesize) * sizeof(uint64_t);
> +	ssize_t bytes_read;
> +	uint64_t entry;
> +	bool ret;
> +	int fd;
> +
> +	fd = open("/proc/self/pagemap", O_RDONLY);
> +	if (fd < 0) {
> +		ksft_exit_fail_msg("opening pagemap failed\n");
> +		ret = false;

Why are you doing stuff after ksft_exit_fail_msg()? As the name suggests, it
literally exits the whole process.

> +	}
> +
> +	if ((lseek(fd, offset, SEEK_SET)) == -1) {
> +		close(fd);
> +		ret = false;
> +	}
> +
> +	bytes_read = read(fd, &entry, sizeof(entry));
> +	close(fd);
> +
> +	if (bytes_read != sizeof(entry)) {
> +		perror("read failed");
> +		return false;
> +	}
> +
> +	if (entry & (1ULL << 63))

This is a 'magical number'. Please #define something like:

#define PAGEMAP_PRESENT_BIT (63)
#define PAGEMAP_PRESENT_MASK (1ULL << 63)

> +		ret = true;
> +
> +	return ret;
> +}
> +
> +/*
> + * test madvsise_dontneed

Useless comment.

> + */
> +static void test_madv_dontneed(void)
> +{
> +	unsigned long rss_anon_before, rss_anon_after;
> +	bool present, rss;
> +	char *addr;
> +	int ret;
> +
> +	ksft_print_msg("[RUN] %s\n", __func__);

Not sure why we need so much noise.

> +
> +	addr = mmap(0, SIZE, PROT_READ | PROT_WRITE,
> +		    MAP_ANONYMOUS | MAP_PRIVATE, 0, 0);

Again this is incorrect - you need to set -1 for the fd, NULL for the address.

> +	if (!addr)
> +		ksft_exit_fail_msg("mmap failed\n");

This is wildly incorrect and means you will segfault the test, check for
MAP_FAILED.

> +
> +	memset(addr, 0x7A, SIZE);

I've said it elsewhere but I'm not sure you should write 256 megabytes of data
when you could write 65,536 to fault in memory.

Or even use MADV_POPULATE_READ/WRITE...?

> +
> +	rss_anon_before = rss_anon();
> +	if (!rss_anon_before)
> +		ksft_exit_fail_msg("No RssAnon is allocated before dontneed\n");

This seems copy/pasted from split_huge_page_test.c.

I mean I'm not sure it's really useful to assert that RSS goes down. Again, if
RSS doesn't go down we have bigger problems...

> +	ret = madvise(addr, SIZE, MADV_DONTNEED);
> +	ksft_test_result(!ret, "MADV_DONTNEED\n");
> +
> +	rss_anon_after = rss_anon();
> +	if (rss_anon_after < rss_anon_before)
> +		rss = true;
> +	ksft_test_result(rss, "MADV_DONTNEED rss is correct\n");

This is incorrect, rss is uninitialised here so you're reading uninitialised
memory unless your condition is met.

I'm not sure why you don't just pass in the condition... or use
kselftest_harness rather than do everything manually here.

> +
> +	for (size_t i = 0; i < SIZE; i += pagesize) {
> +		present = is_page_present(addr + i);

You're doing this in a super slow way, you're checking 256 MB's worth of memory
via /proc/pagemap, opening this every single time

> +		if (present) {
> +			ksft_print_msg("Page not zero at offset %zu\n",
> +				       (size_t)i);

I'm not sure what 'page not zero' means.

And also if this failed, you're going to spam like crazy. 256 * 1024 / 4 =
65,536 lines of output if somehow, in an imagined scenario where zapping no
longer works, this fails.

> +		}
> +	}
> +
> +	ksft_test_result(!present, "MADV_DONTNEED page is present\n");

Are you not duplicating test result here? So this function named test_xxx()
isn't actually testing 1 thing it's testing multiple things output as separate
test results...

kselftest_harness again would help you a lot.

> +	munmap(addr, SIZE);
> +}
> +
> +/*
> + * Measure performance of batched process_madvise vs madvise
> + */
> +static int measure_process_madvise_batching(int hint, int total_size,

I have no idea what you mean by 'hint'?

You're passing in MADV_DONTNEED, surely this should be 'behavior'?

> +					    int single_unit, int batch_size)

On my system this took a long time to run, and I have a powerful system. I just
don't think this is any way appropriate or suited to mm self tests, sorry.

At the very least it should be hugely sped up.

> +{
> +	struct iovec *vec = malloc(sizeof(*vec) * batch_size);
> +	struct timespec start, end;
> +	unsigned long elapsed_ns = 0;
> +	unsigned long nr_measures = 0;
> +	pid_t pid = getpid();
> +	char *buf;
> +	int pidfd;
> +
> +	pidfd = syscall(SYS_pidfd_open, pid, 0);
> +	if (pidfd == -1) {
> +		perror("pidfd_open fail");
> +		return -1;

Why -1? And this is just ignored by caller?

In any case you don't need to do any of this, just pass PIDFD_SELF, I went to
great pains to add this and have process_madvise() work locally so you don't
have to do this kind of thing :)

> +	}
> +
> +	buf = mmap(NULL, total_size, PROT_READ | PROT_WRITE,
> +		   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

Hm strangely here you invoke mmap() correctly (+ test for failure correctly) but
not elsewhere?

I don't know why you get the pidfd first before doing this.

> +	if (buf == MAP_FAILED) {
> +		perror("mmap fail");

Are we good to just randomly perror() in test code? Not 100% sure that will
work?

> +		goto out;
> +	}
> +
> +	if (!vec) {
> +		perror("malloc vec failed");
> +		goto unmap_out;
> +	}

Why are you checking whether a malloc() failed a million years after
malloc()'ing?

> +
> +	while (elapsed_ns < 5UL * 1000 * 1000 * 1000) {

This is just horrible. Completely arbitrary and a magical number.

And making these tests run for 3 * 5 seconds to not assert anything but to just
print out some stats is not ok.

> +		memset(buf, 0x7A, total_size);

I don't know why you feel the need to memset() the whole thing each time? If the
point is faulting in maybe better to just fault in each page?

> +
> +		clock_gettime(CLOCK_MONOTONIC, &start);
> +
> +		for (int off = 0; off < total_size;
> +		     off += single_unit * batch_size) {
> +			for (int i = 0; i < batch_size; i++) {
> +				vec[i].iov_base = buf + off + i * single_unit;
> +				vec[i].iov_len = single_unit;
> +			}
> +			syscall(SYS_process_madvise, pidfd, vec, batch_size,
> +				hint, 0);

Yeah 'hint' here is awful as a name, it's actively misleading. You're not
'hinting' at a process_madvise() behaviour, you're specifying it.

Hint usually implies something like 'map this around <hint> address'.

> +		}
> +
> +		clock_gettime(CLOCK_MONOTONIC, &end);
> +		elapsed_ns += (end.tv_sec - start.tv_sec) * 1e9 +
> +			      (end.tv_nsec - start.tv_nsec);

> +		nr_measures++;
> +	}
> +
> +	ksft_print_msg("[RESULT] batch=%d time=%.3f us/op\n", batch_size,
> +		       (double)(elapsed_ns / nr_measures) /
> +			       (total_size / single_unit));
> +
> +	free(vec);

Surely this should be under out?

> +unmap_out:
> +	munmap(buf, total_size);
> +out:
> +	close(pidfd);
> +	return 0;
> +}

Anyway I seriously question the value of this. The syscall overhead savings are
pretty obvious, but I'm not sure what we're asserting here? Are we trying to
avoid perf regressions in future? Who will read this?

Just an 'advert for use of process_madvise()' I think is not really a great
justification.

> +
> +static void test_perf_batch_process(void)
> +{
> +	ksft_print_msg("[RUN] %s\n", __func__);
> +	measure_process_madvise_batching(MADV_DONTNEED, SIZE, pagesize, 1);
> +	measure_process_madvise_batching(MADV_DONTNEED, SIZE, pagesize, 2);
> +	measure_process_madvise_batching(MADV_DONTNEED, SIZE, pagesize, 4);

I wonder whether this really measures anything re: TLB batching, because you're
doing 4x fewer syscalls in the last instance over probably millions of
invocations.

Asserting 'fewer syscalls are cheaper' is you know, pretty obvious :)

> +	ksft_test_result(1, "All test were done\n");

This is really weird output, it reads like you're saying all tests have passed
or something, but these aren't tests, and this isn't a test result.

> +}
> +
> +int main(int argc, char **argv)
> +{
> +	int err;
> +
> +	pagesize = getpagesize();
> +
> +	ksft_print_header();
> +	ksft_set_plan(4);

I absolutely hate this manual style of writing tests. We kselftest_harness -
there's literally no need to write tests in this unmaintainable way any more.

> +
> +	sense_support();
> +	test_madv_dontneed();
> +	test_perf_batch_process();
> +
> +	err = ksft_get_fail_cnt();
> +	if (err)
> +		ksft_exit_fail_msg("%d out of %d tests failed\n", err,
> +				   ksft_test_num());
> +	ksft_exit_pass();

More useless manual stuff. Use kselftest_harness.

> +}
> diff --git a/tools/testing/selftests/mm/run_vmtests.sh b/tools/testing/selftests/mm/run_vmtests.sh
> index dddd1dd8af14..f96d43153fc0 100755
> --- a/tools/testing/selftests/mm/run_vmtests.sh
> +++ b/tools/testing/selftests/mm/run_vmtests.sh
> @@ -47,6 +47,8 @@ separated by spaces:
>  	hmm smoke tests
>  - madv_guard
>  	test madvise(2) MADV_GUARD_INSTALL and MADV_GUARD_REMOVE options
> +- madv_dontneed
> +	test memadvise(2) MADV_DONTNEED and PROCESS_MADV_DONTNEED options
>  - madv_populate
>  	test memadvise(2) MADV_POPULATE_{READ,WRITE} options
>  - memfd_secret
> @@ -422,6 +424,9 @@ CATEGORY="hmm" run_test bash ./test_hmm.sh smoke
>  # MADV_GUARD_INSTALL and MADV_GUARD_REMOVE tests
>  CATEGORY="madv_guard" run_test ./guard-regions
>
> +# MADV_DONTNEED and PROCESS_DONTNEED tests
> +CATEGORY="madv_dontneed" run_test ./madv_dontneed
> +
>  # MADV_POPULATE_READ and MADV_POPULATE_WRITE tests
>  CATEGORY="madv_populate" run_test ./madv_populate
>
> --
> 2.43.0
>
>
>

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] selftests/mm: add test for (BATCH_PROCESS)MADV_DONTNEED
  2025-06-23 12:35 ` Lorenzo Stoakes
@ 2025-06-23 13:49   ` David Hildenbrand
  2025-06-23 13:56     ` Lorenzo Stoakes
  0 siblings, 1 reply; 6+ messages in thread
From: David Hildenbrand @ 2025-06-23 13:49 UTC (permalink / raw)
  To: Lorenzo Stoakes, wang lian
  Cc: Andrew Morton, Shuah Khan, Christian Brauner, linux-kernel,
	linux-mm, linux-kselftest, SeongJae Park, zijing.zhang, ryncsn,
	p1ucky0923, gkwang, Liam R. Howlett, Vlastimil Babka, Jann Horn

On 23.06.25 14:35, Lorenzo Stoakes wrote:
> +cc Liam, David, Vlastimil, Jann
> 
> (it might not be obvious from get_maintainers.pl but please cc
> maintainers/reviewers of the thing you are adding a test for, thanks!)
> 
> Overall I'm not in favour of us taking this patch.
> 
> There are a number of issues with it (explained inline below), but those aside,
> it seems to be:
> 
> - Checking whether a simple anon buffer of arbitrary size is zapped by
>    MADV_DONTNEED.
> 
> - Printing out a dubious microbenchmark that seems to be mostly asserting that
>    fewer sycalls are faster when using process_madvise() locally.
> 
> And I'm struggling to see the value of that.

We have other tests that should already severely break if MADV_DONTNEED 
doesn't work ... but sure, we could think about more elaborate 
functional tests when they provide a clear benefit. (zapping all kinds 
of memory types, anon/ksm/huge zeropage/pagecache/hugetlb/ ... and using 
/proc/self/pagemap to see if the page table mappings are already gone)

I don't think we have a lot of process_madvise selftests, right?

hugtlb handling that was added recently is already tested to some degree 
in hugetlb-madvise.c.

In general, I'm not a fan of selftests that measure syscall performance ...


-- 
Cheers,

David / dhildenb


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] selftests/mm: add test for (BATCH_PROCESS)MADV_DONTNEED
  2025-06-23 13:49   ` David Hildenbrand
@ 2025-06-23 13:56     ` Lorenzo Stoakes
  0 siblings, 0 replies; 6+ messages in thread
From: Lorenzo Stoakes @ 2025-06-23 13:56 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: wang lian, Andrew Morton, Shuah Khan, Christian Brauner,
	linux-kernel, linux-mm, linux-kselftest, SeongJae Park,
	zijing.zhang, ryncsn, p1ucky0923, gkwang, Liam R. Howlett,
	Vlastimil Babka, Jann Horn

On Mon, Jun 23, 2025 at 03:49:05PM +0200, David Hildenbrand wrote:
> On 23.06.25 14:35, Lorenzo Stoakes wrote:
> > +cc Liam, David, Vlastimil, Jann
> >
> > (it might not be obvious from get_maintainers.pl but please cc
> > maintainers/reviewers of the thing you are adding a test for, thanks!)
> >
> > Overall I'm not in favour of us taking this patch.
> >
> > There are a number of issues with it (explained inline below), but those aside,
> > it seems to be:
> >
> > - Checking whether a simple anon buffer of arbitrary size is zapped by
> >    MADV_DONTNEED.
> >
> > - Printing out a dubious microbenchmark that seems to be mostly asserting that
> >    fewer sycalls are faster when using process_madvise() locally.
> >
> > And I'm struggling to see the value of that.
>
> We have other tests that should already severely break if MADV_DONTNEED
> doesn't work ... but sure, we could think about more elaborate functional
> tests when they provide a clear benefit. (zapping all kinds of memory types,
> anon/ksm/huge zeropage/pagecache/hugetlb/ ... and using /proc/self/pagemap
> to see if the page table mappings are already gone)

Yes right, exactly.

>
> I don't think we have a lot of process_madvise selftests, right?

No and that's an issue. It'd be good to have some of those, but not as a
benchmark.

I think the only stuff we have right now is in the guard-region tests which I
added to assert it worked with MADV_GUARD_INSTALL/REMOVE.

It'd be good to have stuff that tested remote process stuff (tricky to set up,
maybe test has to fork itself etc.) as well as the stuff I added allowing
self-madvise().

But again we'd probably really want to find a way to exercise this stuff
properly.

>
> hugtlb handling that was added recently is already tested to some degree in
> hugetlb-madvise.c.
>
> In general, I'm not a fan of selftests that measure syscall performance ...

:)

>
>
> --
> Cheers,
>
> David / dhildenb
>

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] selftests/mm: add test for (BATCH_PROCESS)MADV_DONTNEED
@ 2025-06-23 14:36 wang lian
  2025-06-23 14:39 ` Lorenzo Stoakes
  0 siblings, 1 reply; 6+ messages in thread
From: wang lian @ 2025-06-23 14:36 UTC (permalink / raw)
  To: lorenzo.stoakes
  Cc: Liam.Howlett, akpm, brauner, david, gkwang, jannh, lianux.mm,
	linux-kernel, linux-kselftest, linux-mm, p1ucky0923, ryncsn,
	shuah, sj, vbabka, zijing.zhang

I deeply appreciate for your criticism. This is my first patch and I will improve it based on what we have discussed so far.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] selftests/mm: add test for (BATCH_PROCESS)MADV_DONTNEED
  2025-06-23 14:36 wang lian
@ 2025-06-23 14:39 ` Lorenzo Stoakes
  0 siblings, 0 replies; 6+ messages in thread
From: Lorenzo Stoakes @ 2025-06-23 14:39 UTC (permalink / raw)
  To: wang lian
  Cc: Liam.Howlett, akpm, brauner, david, gkwang, jannh, linux-kernel,
	linux-kselftest, linux-mm, p1ucky0923, ryncsn, shuah, sj, vbabka,
	zijing.zhang

On Mon, Jun 23, 2025 at 10:36:48PM +0800, wang lian wrote:
> I deeply appreciate for your criticism. This is my first patch and I will improve it based on what we have discussed so far.

Sorry if it seemed harsh, I appreciate the first patch can be difficult (I
still remember mine!) but hopefully it's clear the focus is on getting
things right technically and this is all :)

Overall I think something more like a generalised test of process_madvise()
behaviour would be most valuable, as David suggested?

Thanks, Lorenzo

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2025-06-23 14:39 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-06-21 13:30 [PATCH] selftests/mm: add test for (BATCH_PROCESS)MADV_DONTNEED wang lian
2025-06-23 12:35 ` Lorenzo Stoakes
2025-06-23 13:49   ` David Hildenbrand
2025-06-23 13:56     ` Lorenzo Stoakes
  -- strict thread matches above, loose matches on Subject: below --
2025-06-23 14:36 wang lian
2025-06-23 14:39 ` Lorenzo Stoakes

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).