* [ISSUE] cgroup: test_percpu_basic fails on PREEMPT_RT due to lazy percpu stat flushing
@ 2026-03-11  8:49 UTC - Lucas Liu
From: Lucas Liu
To: cgroups, linux-kselftest

Hi, I recently hit this failure:

# ./test_kmem
ok 1 test_kmem_basic
ok 2 test_kmem_memcg_deletion
ok 3 test_kmem_proc_kpagecgroup
ok 4 test_kmem_kernel_stacks
ok 5 test_kmem_dead_cgroups
memory.current 24514560
percpu 15280000
not ok 6 test_percpu_basic

Here memory.current is 24514560 and the summed percpu stat is 15280000,
a difference of ~9.2MB. The test's tolerance is defined as:

#define MAX_VMSTAT_ERROR (4096 * 64 * get_nprocs())

On this machine (8 CPUs) that works out to 2MB. On the RT kernel,
labs(current - percpu) is ~9.2MB, which is the direct cause of the
failure. I am not sure what value would be suitable for this case
(2MB per CPU, maybe?).
* Re: [ISSUE] cgroup: test_percpu_basic fails on PREEMPT_RT due to lazy percpu stat flushing
@ 2026-03-11 14:17 UTC - Waiman Long
From: Waiman Long
To: Lucas Liu, cgroups, linux-kselftest
Cc: Li Wang

On 3/11/26 4:49 AM, Lucas Liu wrote:
> Hi, I recently hit this failure:
>
> # ./test_kmem
> ok 1 test_kmem_basic
> ok 2 test_kmem_memcg_deletion
> ok 3 test_kmem_proc_kpagecgroup
> ok 4 test_kmem_kernel_stacks
> ok 5 test_kmem_dead_cgroups
> memory.current 24514560
> percpu 15280000
> not ok 6 test_percpu_basic
>
> Here memory.current is 24514560 and the summed percpu stat is 15280000,
> a difference of ~9.2MB. The test's tolerance is defined as:
>
> #define MAX_VMSTAT_ERROR (4096 * 64 * get_nprocs())
>
> On this machine (8 CPUs) that works out to 2MB. On the RT kernel,
> labs(current - percpu) is ~9.2MB, which is the direct cause of the
> failure. I am not sure what value would be suitable for this case
> (2MB per CPU, maybe?).

Li Wang has posted patches to address some of the problems in this test:

https://lore.kernel.org/lkml/20260306071843.149147-2-liwang@redhat.com/

Lazy percpu stat flushing could also be a factor here. In that case, we
may need to reread the stat counters several times with some delay
between reads to solve this problem.

Cheers,
Longman
* Re: [ISSUE] cgroup: test_percpu_basic fails on PREEMPT_RT due to lazy percpu stat flushing
@ 2026-03-12  6:27 UTC - Lucas Liu
From: Lucas Liu
To: Waiman Long
Cc: cgroups, linux-kselftest, Li Wang

Hi Waiman,

Thanks for responding. I have tried Li Wang's patch and the problem is
fixed:

# ./test_kmem
ok 1 test_kmem_basic
ok 2 test_kmem_memcg_deletion
ok 3 test_kmem_proc_kpagecgroup
ok 4 test_kmem_kernel_stacks
ok 5 test_kmem_dead_cgroups
ok 6 test_percpu_basic

[root@localhost cgroup]# bash run.sh
run 100 times...
--------------------------------------
process: 100/100  status: [ OK ]  failure: 0
--------------------------------------
done
overall: 100  ok: 100  fail: 0

As for the lazy percpu stat flushing, I assume this is expected behavior
on RT kernels? If so, can Li Wang's patch be our final solution? Please
correct me if I am wrong.

Thanks

On Wed, Mar 11, 2026 at 10:17 PM Waiman Long <longman@redhat.com> wrote:
>
> On 3/11/26 4:49 AM, Lucas Liu wrote:
> > [test output and MAX_VMSTAT_ERROR analysis snipped]
>
> Li Wang has posted patches to address some of the problems in this test:
>
> https://lore.kernel.org/lkml/20260306071843.149147-2-liwang@redhat.com/
>
> Lazy percpu stat flushing could also be a factor here. In that case, we
> may need to reread the stat counters several times with some delay
> between reads to solve this problem.
>
> Cheers,
> Longman
* Re: [ISSUE] cgroup: test_percpu_basic fails on PREEMPT_RT due to lazy percpu stat flushing
@ 2026-03-12 10:18 UTC - Li Wang
From: Li Wang
To: Waiman Long, Lucas Liu
Cc: cgroups, linux-kselftest, Li Wang

Waiman Long wrote:

> On 3/11/26 4:49 AM, Lucas Liu wrote:
> > [test output snipped; memory.current 24514560 vs percpu 15280000,
> > a ~9.2MB difference against a 2MB MAX_VMSTAT_ERROR on 8 CPUs]
>
> Li Wang has posted patches to address some of the problems in this test:
>
> https://lore.kernel.org/lkml/20260306071843.149147-2-liwang@redhat.com/
>
> Lazy percpu stat flushing could also be a factor here. In that case, we
> may need to reread the stat counters several times with some delay
> between reads to solve this problem.

When memory.stat is read, the kernel calls mem_cgroup_flush_stats(),
which invokes cgroup_rstat_flush() to drain the per-cpu counters before
returning results. So on the normal read path the stats are flushed;
they are not arbitrarily stale at the point this test reads them.

The "lazy" aspect, as I understand it, is that the flush may sometimes
be skipped: __mem_cgroup_flush_stats() skips the flush if the total
pending update is below a threshold, i.e.

static bool memcg_vmstats_needs_flush(struct memcg_vmstats *vmstats)
{
	return atomic64_read(&vmstats->stats_updates) >
	       MEMCG_CHARGE_BATCH * num_online_cpus();
}

So the "lazy" skip could matter on a machine with many CPUs, where that
threshold becomes non-trivial and could contribute a few MB of
discrepancy.

But my failure was observed on a 3-CPU box, where the "lazy" skip should
not apply:

# ./test_kmem
TAP version 13
1..6
ok 1 test_kmem_basic
ok 2 test_kmem_memcg_deletion
ok 3 test_kmem_proc_kpagecgroup
ok 4 test_kmem_kernel_stacks
ok 5 test_kmem_dead_cgroups
memory.current 11530240
percpu 8440000
not ok 6 test_percpu_basic
# Totals: pass:5 fail:1 xfail:0 xpass:0 skip:0 error:0

# uname -r
6.12.0-211.el10.aarch64

# getconf PAGE_SIZE
4096

# lscpu
Architecture:         aarch64
CPU op-mode(s):       32-bit, 64-bit
Byte Order:           Little Endian
CPU(s):               3
On-line CPU(s) list:  0-2
...

Even on Lucas's test system (8 CPUs, and I assume a 4K page size), the
threshold is only 2MB, still well below the observed difference:

64 pages/cpu × 8 cpus = 512 pages = 512 × 4096 bytes = 2MB

Based on the above two test results, the deviation produced by lazy
flushing does not look like the root cause.

--
Regards,
Li Wang
* Re: [ISSUE] cgroup: test_percpu_basic fails on PREEMPT_RT due to lazy percpu stat flushing
@ 2026-03-12 10:30 UTC - Li Wang
From: Li Wang
To: Waiman Long, Lucas Liu
Cc: cgroups, linux-kselftest, Li Wang

On Thu, Mar 12, 2026 at 06:18:09PM +0800, Li Wang wrote:
> [flush-threshold analysis and test results snipped]
>
> Based on the above two test results, the deviation produced by lazy
> flushing does not look like the root cause.

BTW, if the lazy flush does become a problem on large-CPU machines in
real testing, we can add a retry loop (as Waiman suggested) in a
separate patch. But I'd prefer to keep this one focused on the missing
slab accounting first.

--
Regards,
Li Wang
end of thread

Thread overview: 5+ messages
2026-03-11  8:49 [ISSUE] cgroup: test_percpu_basic fails on PREEMPT_RT due to lazy percpu stat flushing (Lucas Liu)
2026-03-11 14:17 ` Waiman Long
2026-03-12  6:27   ` Lucas Liu
2026-03-12 10:18 ` Li Wang
2026-03-12 10:30   ` Li Wang