Re: [ISSUE] cgroup: test_percpu_basic fails on PREEMPT_RT due to lazy percpu stat flushing

From: Li Wang <liwang@redhat.com>
To: Waiman Long <longman@redhat.com>, Lucas Liu <hongzliu@redhat.com>
Cc: cgroups@vger.kernel.org, linux-kselftest@vger.kernel.org,
	Li Wang <liwan@redhat.com>
Subject: Re: [ISSUE] cgroup: test_percpu_basic fails on PREEMPT_RT due to lazy percpu stat flushing
Date: Thu, 12 Mar 2026 18:30:21 +0800
Message-ID: <abKVvQc7NPAnoWq8@redhat.com>
In-Reply-To: <abKS4Qt72UP8rYS_@redhat.com>

On Thu, Mar 12, 2026 at 06:18:09PM +0800, Li Wang wrote:
> Waiman Long wrote:
> 
> > On 3/11/26 4:49 AM, Lucas Liu wrote:
> > > Hi, I recently hit this issue:
> > >   ./test_kmem
> > > ok 1 test_kmem_basic
> > > ok 2 test_kmem_memcg_deletion
> > > ok 3 test_kmem_proc_kpagecgroup
> > > ok 4 test_kmem_kernel_stacks
> > > ok 5 test_kmem_dead_cgroups
> > > memory.current 24514560
> > > percpu 15280000
> > > not ok 6 test_percpu_basic
> > > 
> > > In this test memory.current is 24514560 and percpu is 15280000, a
> > > difference of ~9.2MB.
> > > 
> > > #define MAX_VMSTAT_ERROR (4096 * 64 * get_nprocs())
> > > 
> > > With 8 CPUs, MAX_VMSTAT_ERROR is 2M (4096 * 64 * 8). On the RT kernel,
> > > labs(current - percpu) is 9.2M, which is the root cause of this
> > > failure. I am not sure what value is suitable for this case (2M per CPU
> > > maybe?)
> > 
> > Li Wang had posted patches to address some of the problems in this test.
> > 
> > https://lore.kernel.org/lkml/20260306071843.149147-2-liwang@redhat.com/
> > 
> > Lazy percpu stat flushing could also be a factor here. In that case,
> > we may need to reread the stat counters several times with some delay
> > to solve this problem.
> 
> When memory.stat is read, the kernel calls mem_cgroup_flush_stats(), which
> invokes cgroup_rstat_flush() to drain per-cpu counters before returning
> results. So in the normal read path the stats are flushed; they aren't
> arbitrarily stale at the point this test reads them.
> 
> The "lazy" aspect, as I understand it, is that background flushing may
> sometimes be skipped: __mem_cgroup_flush_stats() skips the flush if the
> total pending update is below a threshold, i.e.
> 
>   static bool memcg_vmstats_needs_flush(struct memcg_vmstats *vmstats)
>   {
>           return atomic64_read(&vmstats->stats_updates) >
>                   MEMCG_CHARGE_BATCH * num_online_cpus();
>   }
> 
> So the "lazy" skip matters mainly on machines with many CPUs, where that
> threshold can be non-trivial and could contribute a few MB of discrepancy.
> 
> But my failure was observed on a 3-CPU box, so the "lazy" skip should not
> be the cause.
> 
>  # ./test_kmem
>  TAP version 13
>  1..6
>  ok 1 test_kmem_basic
>  ok 2 test_kmem_memcg_deletion
>  ok 3 test_kmem_proc_kpagecgroup
>  ok 4 test_kmem_kernel_stacks
>  ok 5 test_kmem_dead_cgroups
>  memory.current 11530240
>  percpu 8440000
>  not ok 6 test_percpu_basic
>  # Totals: pass:5 fail:1 xfail:0 xpass:0 skip:0 error:0
>  
>  # uname -r
>  6.12.0-211.el10.aarch64
>  
>  # getconf PAGE_SIZE
>  4096
>  
>  # lscpu
>  Architecture:                aarch64
>    CPU op-mode(s):            32-bit, 64-bit
>    Byte Order:                Little Endian
>  CPU(s):                      3
>    On-line CPU(s) list:       0-2
>  ...
> 
> Even on Lucas's test system (8 CPUs), assuming a 4k page size, the
> threshold is 2M, which is still less than the observed discrepancy:
>   64 × 8 = 512 pages = 512 × 4096 = 2 MB
> 
> Based on the above two tests, the deviation produced by lazy flushing is
> unlikely to be the root cause.
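
To make the arithmetic above easy to recheck, here is a minimal sketch in C.
The function names are made up for illustration and are not taken from
test_kmem.c. It assumes MEMCG_CHARGE_BATCH is 64, as in the calculation
above, and takes the CPU count and page size as plain parameters.

```c
#include <stdlib.h>

/* Tolerance the test allows: MAX_VMSTAT_ERROR = 4096 * 64 * nr_cpus. */
long max_vmstat_error(long nr_cpus)
{
	return 4096L * 64 * nr_cpus;
}

/*
 * Upper bound on updates that a skipped flush can leave pending, in bytes.
 * Assumption: MEMCG_CHARGE_BATCH == 64 pages per online CPU.
 */
long flush_threshold_bytes(long nr_cpus, long page_size)
{
	return 64L * nr_cpus * page_size;
}

/* Gap between memory.current and the percpu entry of memory.stat. */
long stat_gap(long current, long percpu)
{
	return labs(current - percpu);
}
```

With the numbers from this thread, stat_gap(24514560, 15280000) is 9234560
against flush_threshold_bytes(8, 4096) = 2097152 on the 8-CPU system, and
stat_gap(11530240, 8440000) is 3090240 against
flush_threshold_bytes(3, 4096) = 786432 on the 3-CPU box. In both cases the
gap is several times larger than what a skipped flush could explain.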

BTW, if the lazy flush does become a problem on large-CPU machines
in real testing, we can add a retry loop (as Waiman suggested) in a
separate patch. But I'd prefer to keep this one focused on the
missing slab accounting first.

-- 
Regards,
Li Wang

Thread overview: 5+ messages
2026-03-11  8:49 [ISSUE] cgroup: test_percpu_basic fails on PREEMPT_RT due to lazy percpu stat flushing Lucas Liu
2026-03-11 14:17 ` Waiman Long
2026-03-12  6:27   ` Lucas Liu
2026-03-12 10:18   ` Li Wang
2026-03-12 10:30     ` Li Wang [this message]
