All of lore.kernel.org
 help / color / mirror / Atom feed
From: Sayali Patil <sayalip@linux.ibm.com>
To: Andrew Morton <akpm@linux-foundation.org>,
	Shuah Khan <shuah@kernel.org>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	linux-kselftest@vger.kernel.org,
	Ritesh Harjani <ritesh.list@gmail.com>
Cc: David Hildenbrand <david@kernel.org>, Zi Yan <ziy@nvidia.com>,
	Michal Hocko <mhocko@kernel.org>,
	Oscar Salvador <osalvador@suse.de>,
	Lorenzo Stoakes <lorenzo.stoakes@oracle.com>,
	Dev Jain <dev.jain@arm.com>,
	Liam.Howlett@oracle.com, linuxppc-dev@lists.ozlabs.org,
	Venkat Rao Bagalkote <venkat88@linux.ibm.com>
Subject: Re: [PATCH v3 13/13] selftests/cgroup: extend test_hugetlb_memcg.c to support all huge page sizes
Date: Fri, 3 Apr 2026 22:46:29 +0530	[thread overview]
Message-ID: <dd044148-b40f-47f5-a254-ef6f1b86cb0f@linux.ibm.com> (raw)
In-Reply-To: <41fc5be38d4205b5a4aee8499631cf60a9026163.1774591179.git.sayalip@linux.ibm.com>



On 27/03/26 12:46, Sayali Patil wrote:
> The hugetlb memcg selftest was previously skipped when the configured
> huge page size was not 2MB, preventing the test from running on systems
> using other default huge page sizes.
> 
> Detect the system's configured huge page size at runtime and use it for
> the allocation instead of assuming a fixed 2MB size. This allows the
> test to run on configurations using non-2MB huge pages and avoids
> unnecessary skips.
> 
> Fixes: c0dddb7aa5f8 ("selftests: add a selftest to verify hugetlb usage in memcg")
> Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
> Signed-off-by: Sayali Patil <sayalip@linux.ibm.com>
> ---
>   .../selftests/cgroup/test_hugetlb_memcg.c     | 66 ++++++++++++++-----
>   1 file changed, 48 insertions(+), 18 deletions(-)
> 
> diff --git a/tools/testing/selftests/cgroup/test_hugetlb_memcg.c b/tools/testing/selftests/cgroup/test_hugetlb_memcg.c
> index f451aa449be6..a449dbec16a8 100644
> --- a/tools/testing/selftests/cgroup/test_hugetlb_memcg.c
> +++ b/tools/testing/selftests/cgroup/test_hugetlb_memcg.c
> @@ -12,10 +12,15 @@
>   
>   #define ADDR ((void *)(0x0UL))
>   #define FLAGS (MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB)
> -/* mapping 8 MBs == 4 hugepages */
> -#define LENGTH (8UL*1024*1024)
>   #define PROTECTION (PROT_READ | PROT_WRITE)
>   
> +/*
> + * This value matches the kernel's MEMCG_CHARGE_BATCH definition:
> + * see include/linux/memcontrol.h. If the kernel value changes, this
> + * test constant must be updated accordingly to stay consistent.
> + */
> +#define MEMCG_CHARGE_BATCH 64U
> +
>   /* borrowed from mm/hmm-tests.c */
>   static long get_hugepage_size(void)
>   {
> @@ -84,11 +89,11 @@ static unsigned int check_first(char *addr)
>   	return *(unsigned int *)addr;
>   }
>   
> -static void write_data(char *addr)
> +static void write_data(char *addr, size_t length)
>   {
>   	unsigned long i;
>   
> -	for (i = 0; i < LENGTH; i++)
> +	for (i = 0; i < length; i++)
>   		*(addr + i) = (char)i;
>   }
>   
> @@ -96,26 +101,31 @@ static int hugetlb_test_program(const char *cgroup, void *arg)
>   {
>   	char *test_group = (char *)arg;
>   	void *addr;
> +	long hpage_size = get_hugepage_size() * 1024;
>   	long old_current, expected_current, current;
>   	int ret = EXIT_FAILURE;
> +	size_t length = 4 * hpage_size;
> +	int pagesize, nr_pages;
> +
> +	pagesize = getpagesize();
>   
>   	old_current = cg_read_long(test_group, "memory.current");
>   	set_nr_hugepages(20);
>   	current = cg_read_long(test_group, "memory.current");
> -	if (current - old_current >= MB(2)) {
> +	if (current - old_current >= hpage_size) {
>   		ksft_print_msg(
>   			"setting nr_hugepages should not increase hugepage usage.\n");
>   		ksft_print_msg("before: %ld, after: %ld\n", old_current, current);
>   		return EXIT_FAILURE;
>   	}
>   
> -	addr = mmap(ADDR, LENGTH, PROTECTION, FLAGS, 0, 0);
> +	addr = mmap(ADDR, length, PROTECTION, FLAGS, 0, 0);
>   	if (addr == MAP_FAILED) {
>   		ksft_print_msg("fail to mmap.\n");
>   		return EXIT_FAILURE;
>   	}
>   	current = cg_read_long(test_group, "memory.current");
> -	if (current - old_current >= MB(2)) {
> +	if (current - old_current >= hpage_size) {
>   		ksft_print_msg("mmap should not increase hugepage usage.\n");
>   		ksft_print_msg("before: %ld, after: %ld\n", old_current, current);
>   		goto out_failed_munmap;
> @@ -124,10 +134,24 @@ static int hugetlb_test_program(const char *cgroup, void *arg)
>   
>   	/* read the first page */
>   	check_first(addr);
> -	expected_current = old_current + MB(2);
> +	nr_pages = hpage_size / pagesize;
> +	expected_current = old_current + hpage_size;
>   	current = cg_read_long(test_group, "memory.current");
> -	if (!values_close(expected_current, current, 5)) {
> -		ksft_print_msg("memory usage should increase by around 2MB.\n");
> +	if (nr_pages < MEMCG_CHARGE_BATCH && current == old_current) {
> +		/*
> +		 * Memory cgroup charging uses per-CPU stocks and batched updates to the
> +		 *  memcg usage counters. For hugetlb allocations, the number of pages
> +		 *  that memcg charges is expressed in base pages (nr_pages), not
> +		 *  in hugepage units. When the charge for an allocation is smaller than
> +		 *  the internal batching threshold  (nr_pages <  MEMCG_CHARGE_BATCH),
> +		 *  it may be fully satisfied from the CPU’s local stock. In such
> +		 *  cases memory.current does not necessarily
> +		 *  increase.
> +		 *  Therefore, Treat a zero delta as valid behaviour here.
> +		 */
> +		ksft_print_msg("no visible memcg charge, allocation consumed from local stock.\n");
> +	} else if (!values_close(expected_current, current, 5)) {
> +		ksft_print_msg("memory usage should increase by ~1 huge page.\n");
>   		ksft_print_msg(
>   			"expected memory: %ld, actual memory: %ld\n",
>   			expected_current, current);
> @@ -135,11 +159,11 @@ static int hugetlb_test_program(const char *cgroup, void *arg)
>   	}
>   
>   	/* write to the whole range */
> -	write_data(addr);
> +	write_data(addr, length);
>   	current = cg_read_long(test_group, "memory.current");
> -	expected_current = old_current + MB(8);
> +	expected_current = old_current + length;
>   	if (!values_close(expected_current, current, 5)) {
> -		ksft_print_msg("memory usage should increase by around 8MB.\n");
> +		ksft_print_msg("memory usage should increase by around 4 huge pages.\n");
>   		ksft_print_msg(
>   			"expected memory: %ld, actual memory: %ld\n",
>   			expected_current, current);
> @@ -147,7 +171,7 @@ static int hugetlb_test_program(const char *cgroup, void *arg)
>   	}
>   
>   	/* unmap the whole range */
> -	munmap(addr, LENGTH);
> +	munmap(addr, length);
>   	current = cg_read_long(test_group, "memory.current");
>   	expected_current = old_current;
>   	if (!values_close(expected_current, current, 5)) {
> @@ -162,13 +186,15 @@ static int hugetlb_test_program(const char *cgroup, void *arg)
>   	return ret;
>   
>   out_failed_munmap:
> -	munmap(addr, LENGTH);
> +	munmap(addr, length);
>   	return ret;
>   }
>   
>   static int test_hugetlb_memcg(char *root)
>   {
>   	int ret = KSFT_FAIL;
> +	int num_pages = 20;
> +	long hpage_size = get_hugepage_size();
>   	char *test_group;
>   
>   	test_group = cg_name(root, "hugetlb_memcg_test");
> @@ -177,7 +203,7 @@ static int test_hugetlb_memcg(char *root)
>   		goto out;
>   	}
>   
> -	if (cg_write(test_group, "memory.max", "100M")) {
> +	if (cg_write_numeric(test_group, "memory.max", num_pages * hpage_size * 1024)) {
>   		ksft_print_msg("fail to set cgroup memory limit.\n");
>   		goto out;
>   	}
> @@ -200,6 +226,7 @@ int main(int argc, char **argv)
>   {
>   	char root[PATH_MAX];
>   	int ret = EXIT_SUCCESS, has_memory_hugetlb_acc;
> +	long val;
>   
>   	has_memory_hugetlb_acc = proc_mount_contains("memory_hugetlb_accounting");
>   	if (has_memory_hugetlb_acc < 0)
> @@ -208,12 +235,15 @@ int main(int argc, char **argv)
>   		ksft_exit_skip("memory hugetlb accounting is disabled\n");
>   
>   	/* Unit is kB! */
> -	if (get_hugepage_size() != 2048) {
> -		ksft_print_msg("test_hugetlb_memcg requires 2MB hugepages\n");
> +	val = get_hugepage_size();
> +	if (val < 0) {
> +		ksft_print_msg("Failed to read hugepage size\n");
>   		ksft_test_result_skip("test_hugetlb_memcg\n");
>   		return ret;
>   	}
>   
> +	ksft_print_msg("Hugepage size: %ld kB\n", val);
> +
>   	if (cg_find_unified_root(root, sizeof(root), NULL))
>   		ksft_exit_skip("cgroup v2 isn't mounted\n");
>   

AI review question:


> > -	addr = mmap(ADDR, LENGTH, PROTECTION, FLAGS, 0, 0);
> > +	addr = mmap(ADDR, length, PROTECTION, FLAGS, 0, 0);
> >  	if (addr == MAP_FAILED) {
> >  		ksft_print_msg("fail to mmap.\n");
> >  		return EXIT_FAILURE;
> Should the test gracefully handle memory constraints by skipping
> instead of returning EXIT_FAILURE here?
> With the removal of the 2MB skip condition, the test will now run on systems
> with large default huge pages (e.g., 1GB on x86_64). Reserving 20 
> huge pages
> (20GB) and mapping 4 huge pages (4GB) might easily fail due to memory
> exhaustion in typical automated testing environments, causing false 
> test regressions.

I’ll update it in v4 to skip on memory constraints instead of returning 
EXIT_FAILURE.

> > @@ -124,10 +134,24 @@ static int hugetlb_test_program(const char 
> *cgroup, void *arg)
> >
> >  	/* read the first page */
> >  	check_first(addr);
> > -	expected_current = old_current + MB(2);
> > +	nr_pages = hpage_size / pagesize;
> > +	expected_current = old_current + hpage_size;
> >  	current = cg_read_long(test_group, "memory.current");
> > -	if (!values_close(expected_current, current, 5)) {
> > -		ksft_print_msg("memory usage should increase by around 2MB.\n");
> > +	if (nr_pages < MEMCG_CHARGE_BATCH && current == old_current) {
> > +		/*
> > +		 * Memory cgroup charging uses per-CPU stocks and batched 
> updates to the
> > +		 *  memcg usage counters. For hugetlb allocations, the number of 
> pages
> > +		 *  that memcg charges is expressed in base pages (nr_pages), not
> > +		 *  in hugepage units. When the charge for an allocation is 
> smaller than
> > +		 *  the internal batching threshold  (nr_pages < 
> > MEMCG_CHARGE_BATCH),
> > +		 *  it may be fully satisfied from the CPU’s local stock. In such
> > +		 *  cases memory.current does not necessarily
> > +		 *  increase.
> > +		 *  Therefore, Treat a zero delta as valid behaviour here.
> > +		 */
> > +		ksft_print_msg("no visible memcg charge, allocation consumed 
> from local stock.\n");
> > +	} else if (!values_close(expected_current, current, 5)) {
> Does this exception fully account for memcg stock batching when the 
> per-CPU
> stock is empty?
> If the stock lacks sufficient pages, charging the huge page triggers 
> a stock
> refill that charges exactly MEMCG_CHARGE_BATCH (64) base pages from 
> the global
> memcg counter.
> On an architecture with 64KB base pages and 2MB contiguous huge pages,
> nr_pages is 32. Refilling the stock would charge 64 base pages (4MB), 
> which
> increases memory.current by 4MB instead of the expected 2MB. Since
> current != old_current, the test falls through to values_close(), 
> which expects
> a 2MB increase and may fail the test.

Valid point.
The current exception does not fully account for this
scenario when the per-CPU stock is empty and a refill charges
MEMCG_CHARGE_BATCH. This can indeed lead to a larger-than-expected jump
in memory.current and cause the test to fail.
I’ll update the logic in v4 to handle this case more robustly.

> > @@ -177,7 +203,7 @@ static int test_hugetlb_memcg(char *root)
> >  		goto out;
> >  	}
> >
> > -	if (cg_write(test_group, "memory.max", "100M")) {
> > +	if (cg_write_numeric(test_group, "memory.max", num_pages * 
> > hpage_size * 1024)) {
> Can this calculation overflow on 32-bit systems?
> Since long is 32 bits on 32-bit systems, num_pages * hpage_size * 1024 can
> exceed the 32-bit signed integer maximum if the architecture supports large> huge pages (e.g., 256MB on MIPS).This would evaluate to 5,368,709,120,
> resulting in a negative or truncated value, which sets memory.max to an
> invalid or overly restrictive limit.

Yes,  this can overflow on 32-bit systems. I’ll fix it in v4.

  reply	other threads:[~2026-04-03 17:17 UTC|newest]

Thread overview: 46+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-03-27  7:15 [PATCH v3 00/13] selftests/mm: fix failures and robustness improvements Sayali Patil
2026-03-27  7:15 ` [PATCH v3 01/13] selftests/mm: restore default nr_hugepages value during cleanup in charge_reserved_hugetlb.sh Sayali Patil
2026-04-01 14:52   ` Sayali Patil
2026-04-01 16:05     ` Sayali Patil
2026-03-27  7:15 ` [PATCH v3 02/13] selftests/mm: fix hugetlb pathname construction " Sayali Patil
2026-04-01 14:06   ` David Hildenbrand (Arm)
2026-03-27  7:15 ` [PATCH v3 03/13] selftests/mm: fix hugetlb pathname construction in hugetlb_reparenting_test.sh Sayali Patil
2026-04-01 14:06   ` David Hildenbrand (Arm)
2026-03-27  7:15 ` [PATCH v3 04/13] selftest/mm: fix cgroup task placement and drop memory.current checks " Sayali Patil
2026-04-01 14:08   ` David Hildenbrand (Arm)
2026-04-03 19:59     ` Sayali Patil
2026-03-27  7:15 ` [PATCH v3 05/13] selftests/mm: size tmpfs according to PMD page size in split_huge_page_test Sayali Patil
2026-04-01 16:20   ` Sayali Patil
2026-03-27  7:16 ` [PATCH v3 06/13] selftest/mm: adjust hugepage-mremap test size for large huge pages Sayali Patil
2026-04-01 14:10   ` David Hildenbrand (Arm)
2026-04-01 20:45     ` Sayali Patil
2026-03-27  7:16 ` [PATCH v3 07/13] selftest/mm: register existing mapping with userfaultfd in hugepage-mremap Sayali Patil
2026-04-01 14:18   ` David Hildenbrand (Arm)
2026-04-01 14:43     ` Sayali Patil
2026-04-02  7:31       ` David Hildenbrand (Arm)
2026-04-03 17:41         ` Sayali Patil
2026-03-27  7:16 ` [PATCH v3 08/13] selftests/mm: ensure destination is hugetlb-backed " Sayali Patil
2026-04-01 14:21   ` David Hildenbrand (Arm)
2026-04-01 14:40     ` Lorenzo Stoakes (Oracle)
2026-04-01 20:39       ` Sayali Patil
2026-04-02  7:33         ` David Hildenbrand (Arm)
2026-04-02  9:05           ` Lorenzo Stoakes (Oracle)
2026-04-03 17:41             ` Sayali Patil
2026-04-07 10:22               ` Lorenzo Stoakes (Oracle)
2026-03-27  7:16 ` [PATCH v3 09/13] selftests/mm: skip uffd-wp-mremap if UFFD write-protect is unsupported Sayali Patil
2026-04-02  6:59   ` Sayali Patil
2026-03-27  7:16 ` [PATCH v3 10/13] selftests/mm: skip uffd-stress test when nr_pages_per_cpu is zero Sayali Patil
2026-04-01 14:23   ` David Hildenbrand (Arm)
2026-03-27  7:16 ` [PATCH v3 11/13] selftests/mm: fix double increment in linked list cleanup in compaction_test Sayali Patil
2026-04-01 14:32   ` Sayali Patil
2026-04-01 14:39     ` David Hildenbrand (Arm)
2026-04-01 17:33       ` Sayali Patil
2026-03-27  7:16 ` [PATCH v3 12/13] selftests/mm: move hwpoison setup into run_test() and silence modprobe output for memory-failure category Sayali Patil
2026-04-02  7:15   ` Sayali Patil
2026-03-27  7:16 ` [PATCH v3 13/13] selftests/cgroup: extend test_hugetlb_memcg.c to support all huge page sizes Sayali Patil
2026-04-03 17:16   ` Sayali Patil [this message]
2026-03-27 18:11 ` [PATCH v3 00/13] selftests/mm: fix failures and robustness improvements Andrew Morton
2026-03-30  5:57   ` Sayali Patil
2026-03-30 22:11     ` Andrew Morton
2026-04-01 14:05       ` David Hildenbrand (Arm)
2026-04-01 15:03       ` Sayali Patil

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=dd044148-b40f-47f5-a254-ef6f1b86cb0f@linux.ibm.com \
    --to=sayalip@linux.ibm.com \
    --cc=Liam.Howlett@oracle.com \
    --cc=akpm@linux-foundation.org \
    --cc=david@kernel.org \
    --cc=dev.jain@arm.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-kselftest@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linuxppc-dev@lists.ozlabs.org \
    --cc=lorenzo.stoakes@oracle.com \
    --cc=mhocko@kernel.org \
    --cc=osalvador@suse.de \
    --cc=ritesh.list@gmail.com \
    --cc=shuah@kernel.org \
    --cc=venkat88@linux.ibm.com \
    --cc=ziy@nvidia.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.