Re: [PATCH v2 4/7] selftests: memcg: Increase error tolerance in accordance with page size

public inbox for linux-mm@kvack.org
 help / color / mirror / Atom feed

From: Li Wang <liwang@redhat.com>
To: Waiman Long <longman@redhat.com>
Cc: "Johannes Weiner" <hannes@cmpxchg.org>,
	"Michal Hocko" <mhocko@kernel.org>,
	"Roman Gushchin" <roman.gushchin@linux.dev>,
	"Shakeel Butt" <shakeel.butt@linux.dev>,
	"Muchun Song" <muchun.song@linux.dev>,
	"Andrew Morton" <akpm@linux-foundation.org>,
	"Tejun Heo" <tj@kernel.org>, "Michal Koutný" <mkoutny@suse.com>,
	"Shuah Khan" <shuah@kernel.org>,
	"Mike Rapoport" <rppt@kernel.org>,
	linux-kernel@vger.kernel.org, cgroups@vger.kernel.org,
	linux-mm@kvack.org, linux-kselftest@vger.kernel.org,
	"Sean Christopherson" <seanjc@google.com>,
	"James Houghton" <jthoughton@google.com>,
	"Sebastian Chlad" <sebastianchlad@gmail.com>,
	"Guopeng Zhang" <zhangguopeng@kylinos.cn>,
	"Li Wang" <liwan@redhat.com>
Subject: Re: [PATCH v2 4/7] selftests: memcg: Increase error tolerance in accordance with page size
Date: Mon, 23 Mar 2026 16:01:46 +0800	[thread overview]
Message-ID: <acDzaouBPCIpB7Ij@redhat.com> (raw)
In-Reply-To: <20260320204241.1613861-5-longman@redhat.com>

On Fri, Mar 20, 2026 at 04:42:38PM -0400, Waiman Long wrote:
> It was found that some of the tests in test_memcontrol can fail more
> readily if system page size is larger than 4k. It is because the
> actual memory.current value deviates more from the expected value with
> larger page size. This is likely due to the fact there may be up to
> MEMCG_CHARGE_BATCH pages of charge hidden in each one of the percpu
> memcg_stock.
> 
> To avoid this failure, the error tolerance is now increased in accordance
> to the current system page size value. The page size scale factor is
> set to 2 for 64k page and 1 for 16k page.
> 
> Changes are made in alloc_pagecache_max_30M(), test_memcg_protection()
> and alloc_anon_50M_check_swap() to increase the error tolerance for
> memory.current for larger page size. The current set of values are
> chosen to ensure that the relevant test_memcontrol tests no longer
> have any test failure in a 100 repeated run of test_memcontrol with a
> 4k/16k/64k page size kernels on an arm64 system.
> 
> Signed-off-by: Waiman Long <longman@redhat.com>
> ---
>  .../cgroup/lib/include/cgroup_util.h          |  3 ++-
>  .../selftests/cgroup/test_memcontrol.c        | 23 ++++++++++++++-----
>  2 files changed, 19 insertions(+), 7 deletions(-)
> 
> diff --git a/tools/testing/selftests/cgroup/lib/include/cgroup_util.h b/tools/testing/selftests/cgroup/lib/include/cgroup_util.h
> index 77f386dab5e8..2293e770e9b4 100644
> --- a/tools/testing/selftests/cgroup/lib/include/cgroup_util.h
> +++ b/tools/testing/selftests/cgroup/lib/include/cgroup_util.h
> @@ -6,7 +6,8 @@
>  #define PAGE_SIZE 4096
>  #endif
>  
> -#define MB(x) (x << 20)
> +#define KB(x) ((x) << 10)
> +#define MB(x) ((x) << 20)
>  
>  #define USEC_PER_SEC	1000000L
>  #define NSEC_PER_SEC	1000000000L
> diff --git a/tools/testing/selftests/cgroup/test_memcontrol.c b/tools/testing/selftests/cgroup/test_memcontrol.c
> index babbfad10aaf..c078fc458def 100644
> --- a/tools/testing/selftests/cgroup/test_memcontrol.c
> +++ b/tools/testing/selftests/cgroup/test_memcontrol.c
> @@ -26,6 +26,7 @@
>  static bool has_localevents;
>  static bool has_recursiveprot;
>  static int page_size;
> +static int pscale_factor;	/* Page size scale factor */
>  
>  int get_temp_fd(void)
>  {
> @@ -571,16 +572,17 @@ static int test_memcg_protection(const char *root, bool min)
>  	if (cg_run(parent[2], alloc_anon, (void *)MB(148)))
>  		goto cleanup;
>  
> -	if (!values_close(cg_read_long(parent[1], "memory.current"), MB(50), 3))
> +	if (!values_close(cg_read_long(parent[1], "memory.current"), MB(50),
> +				       3 + (min ? 0 : 4) * pscale_factor))
>  		goto cleanup;
>  
>  	for (i = 0; i < ARRAY_SIZE(children); i++)
>  		c[i] = cg_read_long(children[i], "memory.current");
>  
> -	if (!values_close(c[0], MB(29), 15))
> +	if (!values_close(c[0], MB(29), 15 + 3 * pscale_factor))
>  		goto cleanup;
>  
> -	if (!values_close(c[1], MB(21), 20))
> +	if (!values_close(c[1], MB(21), 20 + pscale_factor))
>  		goto cleanup;
>  
>  	if (c[3] != 0)
> @@ -596,7 +598,8 @@ static int test_memcg_protection(const char *root, bool min)
>  	}
>  
>  	current = min ? MB(50) : MB(30);
> -	if (!values_close(cg_read_long(parent[1], "memory.current"), current, 3))
> +	if (!values_close(cg_read_long(parent[1], "memory.current"), current,
> +				       9 + (min ? 0 : 6) * pscale_factor))
>  		goto cleanup;
>  
>  	if (!reclaim_until(children[0], MB(10)))
> @@ -684,7 +687,7 @@ static int alloc_pagecache_max_30M(const char *cgroup, void *arg)
>  		goto cleanup;
>  
>  	current = cg_read_long(cgroup, "memory.current");
> -	if (!values_close(current, MB(30), 5))
> +	if (!values_close(current, MB(30), 5 + (pscale_factor ? 2 : 0)))
>  		goto cleanup;
>  
>  	ret = 0;
> @@ -1004,7 +1007,7 @@ static int alloc_anon_50M_check_swap(const char *cgroup, void *arg)
>  		*ptr = 0;
>  
>  	mem_current = cg_read_long(cgroup, "memory.current");
> -	if (!mem_current || !values_close(mem_current, mem_max, 3))
> +	if (!mem_current || !values_close(mem_current, mem_max, 6 + pscale_factor))
>  		goto cleanup;
>  
>  	swap_current = cg_read_long(cgroup, "memory.swap.current");
> @@ -1684,6 +1687,14 @@ int main(int argc, char **argv)
>  	if (page_size <= 0)
>  		page_size = PAGE_SIZE;
>  
> +	/*
> +	 * It is found that the actual memory.current value can deviate more
> +	 * from the expected value with larger page size. So error tolerance
> +	 * will have to be increased a bit more for larger page size.
> +	 */
> +	if (page_size > KB(4))
> +		pscale_factor = (page_size >= KB(64)) ? 2 : 1;

This is a good improment but I still think the pscale_factor adjustments
are a bit fragile, each call site needs its own hand-tuned formula, and only
three page sizes (4K/16K/64K) are handled. If a new page size shows up,
every call site needs revisiting.

How about centralizing the page size adjustment inside values_close()
itself? Something like:

    static inline int values_close(long a, long b, int err)
    {
          ssize_t page_adjusted_err = ffs(page_size >> 13) + err;
    
          return 100 * labs(a - b) <= (a + b) * page_adjusted_err;
    }

This adds one extra percent of tolerance per doubling above 4K, scales
continuously for any power-of-two page size, and also fixes an integer
truncation issue in the original: (a + b) / 100 * err loses precision
when (a + b) < 100.

With this, the callers wouldn't need any changes at all.

This method is inspired from LTP:
  https://github.com/linux-test-project/ltp/blob/master/testcases/kernel/controllers/memcg/memcontrol_common.h#L27

-- 
Regards,
Li Wang

next prev parent reply	other threads:[~2026-03-23  8:01 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-03-20 20:42 [PATCH v2 0/7] selftests: memcg: Fix test_memcontrol test failures with large page sizes Waiman Long
2026-03-20 20:42 ` [PATCH v2 1/7] memcg: Scale up vmstats flush threshold with int_sqrt(nr_cpus+2) Waiman Long
2026-03-23 12:46   ` Li Wang
2026-03-24  0:15     ` Yosry Ahmed
2026-03-25 16:47       ` Waiman Long
2026-03-25 17:23         ` Yosry Ahmed
2026-03-20 20:42 ` [PATCH v2 2/7] memcg: Scale down MEMCG_CHARGE_BATCH with increase in PAGE_SIZE Waiman Long
2026-03-23 12:47   ` Li Wang
2026-03-24  0:17     ` Yosry Ahmed
2026-03-20 20:42 ` [PATCH v2 3/7] selftests: memcg: Iterate pages based on the actual page size Waiman Long
2026-03-23  2:53   ` Li Wang
2026-03-23  2:56     ` Li Wang
2026-03-25  3:33     ` Waiman Long
2026-03-20 20:42 ` [PATCH v2 4/7] selftests: memcg: Increase error tolerance in accordance with " Waiman Long
2026-03-23  8:01   ` Li Wang [this message]
2026-03-25 16:42     ` Waiman Long
2026-03-20 20:42 ` [PATCH v2 5/7] selftests: memcg: Reduce the expected swap.peak with larger " Waiman Long
2026-03-23  8:24   ` Li Wang
2026-03-25  3:47     ` Waiman Long
2026-03-20 20:42 ` [PATCH v2 6/7] selftests: memcg: Don't call reclaim_until() if already in target Waiman Long
2026-03-23  8:53   ` Li Wang
2026-03-20 20:42 ` [PATCH v2 7/7] selftests: memcg: Treat failure for zeroing sock in test_memcg_sock as XFAIL Waiman Long
2026-03-23  9:44   ` Li Wang
2026-03-21  1:16 ` [PATCH v2 0/7] selftests: memcg: Fix test_memcontrol test failures with large page sizes Andrew Morton

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=acDzaouBPCIpB7Ij@redhat.com \
    --to=liwang@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=cgroups@vger.kernel.org \
    --cc=hannes@cmpxchg.org \
    --cc=jthoughton@google.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-kselftest@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=liwan@redhat.com \
    --cc=longman@redhat.com \
    --cc=mhocko@kernel.org \
    --cc=mkoutny@suse.com \
    --cc=muchun.song@linux.dev \
    --cc=roman.gushchin@linux.dev \
    --cc=rppt@kernel.org \
    --cc=seanjc@google.com \
    --cc=sebastianchlad@gmail.com \
    --cc=shakeel.butt@linux.dev \
    --cc=shuah@kernel.org \
    --cc=tj@kernel.org \
    --cc=zhangguopeng@kylinos.cn \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox