From: Waiman Long <longman@redhat.com>
To: Li Wang <liwang@redhat.com>
Cc: "Johannes Weiner" <hannes@cmpxchg.org>,
"Michal Hocko" <mhocko@kernel.org>,
"Roman Gushchin" <roman.gushchin@linux.dev>,
"Shakeel Butt" <shakeel.butt@linux.dev>,
"Muchun Song" <muchun.song@linux.dev>,
"Andrew Morton" <akpm@linux-foundation.org>,
"Tejun Heo" <tj@kernel.org>, "Michal Koutný" <mkoutny@suse.com>,
"Shuah Khan" <shuah@kernel.org>,
"Mike Rapoport" <rppt@kernel.org>,
linux-kernel@vger.kernel.org, cgroups@vger.kernel.org,
linux-mm@kvack.org, linux-kselftest@vger.kernel.org,
"Sean Christopherson" <seanjc@google.com>,
"James Houghton" <jthoughton@google.com>,
"Sebastian Chlad" <sebastianchlad@gmail.com>,
"Guopeng Zhang" <zhangguopeng@kylinos.cn>,
"Li Wang" <liwan@redhat.com>
Subject: Re: [PATCH v2 4/7] selftests: memcg: Increase error tolerance in accordance with page size
Date: Wed, 25 Mar 2026 12:42:12 -0400 [thread overview]
Message-ID: <82661a8e-256c-4cf4-96b2-98d52cf62cde@redhat.com> (raw)
In-Reply-To: <acDzaouBPCIpB7Ij@redhat.com>
On 3/23/26 4:01 AM, Li Wang wrote:
> On Fri, Mar 20, 2026 at 04:42:38PM -0400, Waiman Long wrote:
>> It was found that some of the tests in test_memcontrol can fail more
>> readily if the system page size is larger than 4k. This is because the
>> actual memory.current value deviates more from the expected value with
>> a larger page size, likely due to the fact that there may be up to
>> MEMCG_CHARGE_BATCH pages of charge hidden in each one of the percpu
>> memcg_stock.
>>
>> To avoid this failure, the error tolerance is now increased in accordance
>> with the current system page size. The page size scale factor is
>> set to 2 for 64k pages and 1 for 16k pages.
>>
>> Changes are made in alloc_pagecache_max_30M(), test_memcg_protection()
>> and alloc_anon_50M_check_swap() to increase the error tolerance for
>> memory.current for larger page sizes. The current set of values is
>> chosen to ensure that the relevant test_memcontrol tests no longer
>> fail in 100 repeated runs of test_memcontrol with 4k/16k/64k page
>> size kernels on an arm64 system.
>>
>> Signed-off-by: Waiman Long <longman@redhat.com>
>> ---
>> .../cgroup/lib/include/cgroup_util.h | 3 ++-
>> .../selftests/cgroup/test_memcontrol.c | 23 ++++++++++++++-----
>> 2 files changed, 19 insertions(+), 7 deletions(-)
>>
>> diff --git a/tools/testing/selftests/cgroup/lib/include/cgroup_util.h b/tools/testing/selftests/cgroup/lib/include/cgroup_util.h
>> index 77f386dab5e8..2293e770e9b4 100644
>> --- a/tools/testing/selftests/cgroup/lib/include/cgroup_util.h
>> +++ b/tools/testing/selftests/cgroup/lib/include/cgroup_util.h
>> @@ -6,7 +6,8 @@
>> #define PAGE_SIZE 4096
>> #endif
>>
>> -#define MB(x) (x << 20)
>> +#define KB(x) ((x) << 10)
>> +#define MB(x) ((x) << 20)
>>
>> #define USEC_PER_SEC 1000000L
>> #define NSEC_PER_SEC 1000000000L
>> diff --git a/tools/testing/selftests/cgroup/test_memcontrol.c b/tools/testing/selftests/cgroup/test_memcontrol.c
>> index babbfad10aaf..c078fc458def 100644
>> --- a/tools/testing/selftests/cgroup/test_memcontrol.c
>> +++ b/tools/testing/selftests/cgroup/test_memcontrol.c
>> @@ -26,6 +26,7 @@
>> static bool has_localevents;
>> static bool has_recursiveprot;
>> static int page_size;
>> +static int pscale_factor; /* Page size scale factor */
>>
>> int get_temp_fd(void)
>> {
>> @@ -571,16 +572,17 @@ static int test_memcg_protection(const char *root, bool min)
>> if (cg_run(parent[2], alloc_anon, (void *)MB(148)))
>> goto cleanup;
>>
>> - if (!values_close(cg_read_long(parent[1], "memory.current"), MB(50), 3))
>> + if (!values_close(cg_read_long(parent[1], "memory.current"), MB(50),
>> + 3 + (min ? 0 : 4) * pscale_factor))
>> goto cleanup;
>>
>> for (i = 0; i < ARRAY_SIZE(children); i++)
>> c[i] = cg_read_long(children[i], "memory.current");
>>
>> - if (!values_close(c[0], MB(29), 15))
>> + if (!values_close(c[0], MB(29), 15 + 3 * pscale_factor))
>> goto cleanup;
>>
>> - if (!values_close(c[1], MB(21), 20))
>> + if (!values_close(c[1], MB(21), 20 + pscale_factor))
>> goto cleanup;
>>
>> if (c[3] != 0)
>> @@ -596,7 +598,8 @@ static int test_memcg_protection(const char *root, bool min)
>> }
>>
>> current = min ? MB(50) : MB(30);
>> - if (!values_close(cg_read_long(parent[1], "memory.current"), current, 3))
>> + if (!values_close(cg_read_long(parent[1], "memory.current"), current,
>> + 9 + (min ? 0 : 6) * pscale_factor))
>> goto cleanup;
>>
>> if (!reclaim_until(children[0], MB(10)))
>> @@ -684,7 +687,7 @@ static int alloc_pagecache_max_30M(const char *cgroup, void *arg)
>> goto cleanup;
>>
>> current = cg_read_long(cgroup, "memory.current");
>> - if (!values_close(current, MB(30), 5))
>> + if (!values_close(current, MB(30), 5 + (pscale_factor ? 2 : 0)))
>> goto cleanup;
>>
>> ret = 0;
>> @@ -1004,7 +1007,7 @@ static int alloc_anon_50M_check_swap(const char *cgroup, void *arg)
>> *ptr = 0;
>>
>> mem_current = cg_read_long(cgroup, "memory.current");
>> - if (!mem_current || !values_close(mem_current, mem_max, 3))
>> + if (!mem_current || !values_close(mem_current, mem_max, 6 + pscale_factor))
>> goto cleanup;
>>
>> swap_current = cg_read_long(cgroup, "memory.swap.current");
>> @@ -1684,6 +1687,14 @@ int main(int argc, char **argv)
>> if (page_size <= 0)
>> page_size = PAGE_SIZE;
>>
>> + /*
>> + * It is found that the actual memory.current value can deviate more
>> + * from the expected value with larger page size. So error tolerance
>> + * will have to be increased a bit more for larger page size.
>> + */
>> + if (page_size > KB(4))
>> + pscale_factor = (page_size >= KB(64)) ? 2 : 1;
> This is a good improvement, but I still think the pscale_factor adjustments
> are a bit fragile: each call site needs its own hand-tuned formula, and only
> three page sizes (4K/16K/64K) are handled. If a new page size shows up,
> every call site needs revisiting.
>
> How about centralizing the page size adjustment inside values_close()
> itself? Something like:
>
> static inline int values_close(long a, long b, int err)
> {
>         ssize_t page_adjusted_err = ffs(page_size >> 13) + err;
>
>         return 100 * labs(a - b) <= (a + b) * page_adjusted_err;
> }
>
> This adds one extra percent of tolerance per doubling above 4K, scales
> continuously for any power-of-two page size, and also fixes an integer
> truncation issue in the original: (a + b) / 100 * err loses precision
> when (a + b) < 100.
>
> With this, the callers wouldn't need any changes at all.
>
> This method is inspired from LTP:
> https://github.com/linux-test-project/ltp/blob/master/testcases/kernel/controllers/memcg/memcontrol_common.h#L27
Good point. I will implement something like that in the next version.
Cheers,
Longman
Thread overview: 24+ messages
2026-03-20 20:42 [PATCH v2 0/7] selftests: memcg: Fix test_memcontrol test failures with large page sizes Waiman Long
2026-03-20 20:42 ` [PATCH v2 1/7] memcg: Scale up vmstats flush threshold with int_sqrt(nr_cpus+2) Waiman Long
2026-03-23 12:46 ` Li Wang
2026-03-24 0:15 ` Yosry Ahmed
2026-03-25 16:47 ` Waiman Long
2026-03-25 17:23 ` Yosry Ahmed
2026-03-20 20:42 ` [PATCH v2 2/7] memcg: Scale down MEMCG_CHARGE_BATCH with increase in PAGE_SIZE Waiman Long
2026-03-23 12:47 ` Li Wang
2026-03-24 0:17 ` Yosry Ahmed
2026-03-20 20:42 ` [PATCH v2 3/7] selftests: memcg: Iterate pages based on the actual page size Waiman Long
2026-03-23 2:53 ` Li Wang
2026-03-23 2:56 ` Li Wang
2026-03-25 3:33 ` Waiman Long
2026-03-20 20:42 ` [PATCH v2 4/7] selftests: memcg: Increase error tolerance in accordance with " Waiman Long
2026-03-23 8:01 ` Li Wang
2026-03-25 16:42 ` Waiman Long [this message]
2026-03-20 20:42 ` [PATCH v2 5/7] selftests: memcg: Reduce the expected swap.peak with larger " Waiman Long
2026-03-23 8:24 ` Li Wang
2026-03-25 3:47 ` Waiman Long
2026-03-20 20:42 ` [PATCH v2 6/7] selftests: memcg: Don't call reclaim_until() if already in target Waiman Long
2026-03-23 8:53 ` Li Wang
2026-03-20 20:42 ` [PATCH v2 7/7] selftests: memcg: Treat failure for zeroing sock in test_memcg_sock as XFAIL Waiman Long
2026-03-23 9:44 ` Li Wang
2026-03-21 1:16 ` [PATCH v2 0/7] selftests: memcg: Fix test_memcontrol test failures with large page sizes Andrew Morton