* Re: [LTP] Issue faced in memcg_stat_rss while running mainline kernels between 6.7 and 6.8
       [not found]   ` <20250115125241.GD648257@pevik>
@ 2025-01-15 22:59     ` Petr Vorel
  2025-01-16  7:53       ` Michal Hocko
  0 siblings, 1 reply; 7+ messages in thread
From: Petr Vorel @ 2025-01-15 22:59 UTC (permalink / raw)
  To: Harshvardhan Jha; +Cc: Li Wang, Cyril Hrubis, ltp, cgroups

Hi Harshvardhan,

[ Cc cgroups@vger.kernel.org: FYI problem in recent kernel using cgroup v1 ]

> Kind regards,
> Petr

> > Hi there,
> > I saw your name appear most often in the commit log of memcg_stat_rss.sh, so I was wondering whether you had any information on why this is happening. We have enough reason to believe this is due to an outdated testcase; it would be much appreciated if you could verify that.

> > Thanks & Regards,
> > Harshvardhan

> > From: ltp <ltp-bounces+harshvardhan.j.jha=oracle.com@lists.linux.it> on behalf of Harshvardhan Jha via ltp <ltp@lists.linux.it>
> > Date: Thursday, 28 November 2024 at 3:20 PM
> > To: ltp@lists.linux.it <ltp@lists.linux.it>
> > Subject: [LTP] Issue faced in memcg_stat_rss while running mainline kernels between 6.7 and 6.8
> > Hi there,

> > I've been getting test failures in the memcg_stat_rss testcase on
> > mainline 6.12 kernels, with 3 tests failing and one broken.

> > Running tests.......
> > <<<test_start>>>
> > tag=memcg_stat_rss stime=1732003500
> > cmdline="memcg_stat_rss.sh"
> > contacts=""
> > analysis=exit
> > <<<test_output>>>
> > incrementing stop
> > memcg_stat_rss 1 TINFO: Running: memcg_stat_rss.sh
> > memcg_stat_rss 1 TINFO: Tested kernel: Linux harjha-ol9kdevltp
> > 6.12.0-master.20241021.el9.v1.x86_64 #1 SMP PREEMPT_DYNAMIC Mon Oct 21
> > 06:24:22 PDT 2024 x86_64 x86_64 x86_64 GNU/Linux
> > memcg_stat_rss 1 TINFO: Using
> > /tempdir/ltp-Y4AEUmKVIE/LTP_memcg_stat_rss.kEhD0QvvMw as tmpdir (xfs
> > filesystem)
> > memcg_stat_rss 1 TINFO: timeout per run is 0h 5m 0s
> > memcg_stat_rss 1 TINFO: set /sys/fs/cgroup/memory/memory.use_hierarchy
> > to 0 failed
> > memcg_stat_rss 1 TINFO: Setting shmmax
> > memcg_stat_rss 1 TINFO: Running memcg_process --mmap-anon -s 266240
> > memcg_stat_rss 1 TINFO: Warming up pid: 9367
> > memcg_stat_rss 1 TINFO: Process is still here after warm up: 9367
> > memcg_stat_rss 1 TFAIL: rss is 0, 266240 expected
> > memcg_stat_rss 2 TINFO: Running memcg_process --mmap-file -s 4096
> > memcg_stat_rss 2 TINFO: Warming up pid: 9383
> > memcg_stat_rss 2 TINFO: Process is still here after warm up: 9383
> > memcg_stat_rss 2 TPASS: rss is 0 as expected
> > memcg_stat_rss 3 TINFO: Running memcg_process --shm -k 3 -s 4096
> > memcg_stat_rss 3 TINFO: Warming up pid: 9446
> > memcg_stat_rss 3 TINFO: Process is still here after warm up: 9446
> > memcg_stat_rss 3 TPASS: rss is 0 as expected
> > memcg_stat_rss 4 TINFO: Running memcg_process --mmap-anon --mmap-file
> > --shm -s 266240
> > memcg_stat_rss 4 TINFO: Warming up pid: 9462
> > memcg_stat_rss 4 TINFO: Process is still here after warm up: 9462
> > memcg_stat_rss 4 TPASS: rss is 266240 as expected
> > memcg_stat_rss 5 TINFO: Running memcg_process --mmap-lock1 -s 266240
> > memcg_stat_rss 5 TINFO: Warming up pid: 9479
> > memcg_stat_rss 5 TINFO: Process is still here after warm up: 9479
> > memcg_stat_rss 5 TFAIL: rss is 0, 266240 expected
> > memcg_stat_rss 6 TINFO: Running memcg_process --mmap-anon -s 266240
> > memcg_stat_rss 6 TINFO: Warming up pid: 9495
> > memcg_stat_rss 6 TINFO: Process is still here after warm up: 9495
> > memcg_stat_rss 6 TFAIL: rss is 0, 266240 expected
> > memcg_stat_rss 6 TBROK: timed out on memory.usage_in_bytes 4096 266240
> > 266240
> > /opt/ltp-20240930/testcases/bin/tst_test.sh: line 158:  9495
> > Killed                  memcg_process "$@"  (wd:
> > /sys/fs/cgroup/memory/ltp/test-9308/ltp_9308)

> > Summary:
> > passed   3
> > failed   3
> > broken   1
> > skipped  0
> > warnings 0
> > <<<execution_status>>>
> > initiation_status="ok"
> > duration=17 termination_type=exited termination_id=3 corefile=no
> > cutime=13 cstime=58
> > <<<test_end>>>
> > INFO: ltp-pan reported some tests FAIL
> > LTP Version: 20240930

> > I'm not sure whether this error is due to the kernel or the testcase
> > being outdated. I know that since cgroup v2 is the default upstream and
> > cgroup v1 is now a legacy option, this specific testcase is not

Yes, exactly. I have a system with cgroup v1, but it's based on 4.12.14.
Even an old Debian VM with 5.10 uses cgroup v2. Therefore I have no chance
to debug the problem.

> > particularly high on the priority list, but just to be sure, I wanted
> > to verify this from your side. Please let me know whether this error is
> > due to the testcase being outdated or is in fact a valid kernel error.

> > I ran a bisect of the memcg_stat_rss test on mainline kernels and saw the
> > bisect range narrow down to between 6.7 and 6.8, which was further isolated to:
> > https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=7d7ef0a4686abe43cd76a141b340a348f45ecdf2

This was the reason for Cc'ing cgroups@vger.kernel.org.

> > This commit was part of a 5-patch series, and I wasn't able to revert it
> > on 6.12 without getting a series of conflicts.
> > So what I did was check out the SHA before this patch series
> > (4a3bfbd1699e2306731809d50d480634012ed4de) and after the patch series
> > (7d7ef0a4686abe43cd76a141b340a348f45ecdf2) and run this test on each.
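> >
> > The per-checkout loop was roughly the following (a sketch; the kernel
> > build, install and reboot steps are omitted):
> >
> > git checkout 4a3bfbd1699e2306731809d50d480634012ed4de   # before the series
> > # build, install, reboot into this kernel, then:
> > ./runltp -d /tempdir -s memcg_stat_rss
> > git checkout 7d7ef0a4686abe43cd76a141b340a348f45ecdf2   # after the series
> > # build, install, reboot, and re-run the test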

> > The machine had 32 GB RAM and 4 CPUs.

> > The steps to reproduce this are:

> > #!/bin/bash

> > # After setting the default kernel to the desired one, enable cgroup v1
> > # and disable SELinux, then reboot:
> > if ! grep -q "unified_cgroup_hierarchy=0" /proc/cmdline; then
> >         sudo grubby --update-kernel DEFAULT --args="systemd.unified_cgroup_hierarchy=0"
> >         sudo grubby --update-kernel DEFAULT --args="systemd.legacy_systemd_cgroup_controller"
> >         sudo grubby --update-kernel DEFAULT --args="selinux=0"
> >         sudo sed -i "/^SELINUX=/s/=.*/=disabled/" /etc/selinux/config
> >         sudo reboot
> > fi

> > cd /opt/ltp
> > rm -rf /tempdir
> > mkdir /tempdir
> > ./runltp -d /tempdir -s memcg_stat_rss

Or just:

# PATH="/opt/ltp/testcases/bin:$PATH" memcg_stat_rss.sh

Kind regards,
Petr

> > The results obtained were:

> > Pre bisect culprit (4a3bfbd1699e2306731809d50d480634012ed4de):

> > <<<test_start>>>
> > tag=memcg_stat_rss stime=1731754078
> > cmdline="memcg_stat_rss.sh"
> > contacts=""
> > analysis=exit
> > <<<test_output>>>
> > incrementing stop
> > memcg_stat_rss 1 TINFO: Running: memcg_stat_rss.sh
> > memcg_stat_rss 1 TINFO: Tested kernel: Linux harjha-ol9kdevltp
> > 6.7.0-masterpre.2024111.el9.rc1.x86_64 #1 SMP PREEMPT_DYNAMIC Fri Nov 15
> > 11:56:10 PST 2024 x86_64 x86_64 x86_64 GNU/Linux
> > memcg_stat_rss 1 TINFO: Using
> > /tempdir/ltp-SzE9ADK6MM/LTP_memcg_stat_rss.6op28sMXO2 as tmpdir (xfs
> > filesystem)
> > memcg_stat_rss 1 TINFO: timeout per run is 0h 5m 0s
> > memcg_stat_rss 1 TINFO: set /sys/fs/cgroup/memory/memory.use_hierarchy
> > to 0 failed
> > memcg_stat_rss 1 TINFO: Setting shmmax
> > memcg_stat_rss 1 TINFO: Running memcg_process --mmap-anon -s 266240
> > memcg_stat_rss 1 TINFO: Warming up pid: 34237
> > memcg_stat_rss 1 TINFO: Process is still here after warm up: 34237
> > memcg_stat_rss 1 TPASS: rss is 266240 as expected
> > memcg_stat_rss 1 TBROK: timed out on memory.usage_in_bytes 4096 266240
> > 266240
> > /opt/ltp-20240930/testcases/bin/tst_test.sh: line 158: 34237
> > Killed                  memcg_process "$@"  (wd:
> > /sys/fs/cgroup/memory/ltp/test-34180/ltp_34180)

> > Summary:
> > passed   1
> > failed   0
> > broken   1
> > skipped  0
> > warnings 0
> > <<<execution_status>>>


> > Post bisect culprit (7d7ef0a4686abe43cd76a141b340a348f45ecdf2):

> > <<<test_start>>>
> > tag=memcg_stat_rss stime=1731755339
> > cmdline="memcg_stat_rss.sh"
> > contacts=""
> > analysis=exit
> > <<<test_output>>>
> > incrementing stop
> > memcg_stat_rss 1 TINFO: Running: memcg_stat_rss.sh
> > memcg_stat_rss 1 TINFO: Tested kernel: Linux harjha-ol9kdevltp
> > 6.7.0-masterpost.2024111.el9.rc1.x86_64 #1 SMP PREEMPT_DYNAMIC Fri Nov
> > 15 11:55:41 PST 2024 x86_64 x86_64 x86_64 GNU/Linux
> > memcg_stat_rss 1 TINFO: Using
> > /tempdir/ltp-G6cge4CkrR/LTP_memcg_stat_rss.1zrm6X02CO as tmpdir (xfs
> > filesystem)
> > memcg_stat_rss 1 TINFO: timeout per run is 0h 5m 0s
> > memcg_stat_rss 1 TINFO: set /sys/fs/cgroup/memory/memory.use_hierarchy
> > to 0 failed
> > memcg_stat_rss 1 TINFO: Setting shmmax
> > memcg_stat_rss 1 TINFO: Running memcg_process --mmap-anon -s 266240
> > memcg_stat_rss 1 TINFO: Warming up pid: 9083
> > memcg_stat_rss 1 TINFO: Process is still here after warm up: 9083
> > memcg_stat_rss 1 TFAIL: rss is 0, 266240 expected
> > memcg_stat_rss 1 TBROK: timed out on memory.usage_in_bytes 4096 266240
> > 266240
> > /opt/ltp-20240930/testcases/bin/tst_test.sh: line 158:  9083
> > Killed                  memcg_process "$@"  (wd:
> > /sys/fs/cgroup/memory/ltp/test-9024/ltp_9024)

> > Summary:
> > passed   0
> > failed   1
> > broken   1
> > skipped  0
> > warnings 0
> > <<<execution_status>>>

> > Thanks & Regards,
> > Harshvardhan


* Re: [LTP] Issue faced in memcg_stat_rss while running mainline kernels between 6.7 and 6.8
  2025-01-15 22:59     ` [LTP] Issue faced in memcg_stat_rss while running mainline kernels between 6.7 and 6.8 Petr Vorel
@ 2025-01-16  7:53       ` Michal Hocko
  2025-01-16  8:07         ` Harshvardhan Jha
  0 siblings, 1 reply; 7+ messages in thread
From: Michal Hocko @ 2025-01-16  7:53 UTC (permalink / raw)
  To: Petr Vorel; +Cc: Harshvardhan Jha, Li Wang, Cyril Hrubis, ltp, cgroups

Hi,

On Wed 15-01-25 23:59:20, Petr Vorel wrote:
> Hi Harshvardhan,
> 
> [ Cc cgroups@vger.kernel.org: FYI problem in recent kernel using cgroup v1 ]

It is hard to decipher the output and nail down the actual failure. Could
somebody do a TL;DR summary of the failure: since when does it happen, and
is it really v1 specific?

Thanks!

-- 
Michal Hocko
SUSE Labs


* Re: [LTP] Issue faced in memcg_stat_rss while running mainline kernels between 6.7 and 6.8
  2025-01-16  7:53       ` Michal Hocko
@ 2025-01-16  8:07         ` Harshvardhan Jha
  2025-01-16  9:06           ` Michal Hocko
  0 siblings, 1 reply; 7+ messages in thread
From: Harshvardhan Jha @ 2025-01-16  8:07 UTC (permalink / raw)
  To: Michal Hocko, Petr Vorel; +Cc: Li Wang, Cyril Hrubis, ltp, cgroups

Hello Michal
On 16/01/25 1:23 PM, Michal Hocko wrote:
> Hi,
>
> On Wed 15-01-25 23:59:20, Petr Vorel wrote:
>> Hi Harshvardhan,
>>
>> [ Cc cgroups@vger.kernel.org: FYI problem in recent kernel using cgroup v1 ]
> It is hard to decipher the output and nail down the actual failure. Could
> somebody do a TL;DR summary of the failure: since when does it happen, and
> is it really v1 specific?

The test ltp_memcg_stat_rss is indeed cgroup v1 specific.

The test started failing right after commit 7d7ef0a4686ab ("mm: memcg:
restore subtree stats flushing").

This commit was part of a 5-patch series:

508bed884767a mm: memcg: change flush_next_time to flush_last_time
e0bf1dc859fdd mm: memcg: move vmstats structs definition above flushing code
8d59d2214c236 mm: memcg: make stats flushing threshold per-memcg
b006847222623 mm: workingset: move the stats flush into workingset_test_recent()
7d7ef0a4686ab mm: memcg: restore subtree stats flushing
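
For reference, the whole series can be listed on a mainline tree with
(assuming the five commits are contiguous there):

git log --oneline 508bed884767a^..7d7ef0a4686ab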

The test log returns this:

<<<test_start>>>
tag=memcg_stat_rss stime=1731755339
cmdline="memcg_stat_rss.sh"
contacts=""
analysis=exit
<<<test_output>>>
incrementing stop
memcg_stat_rss 1 TINFO: Running: memcg_stat_rss.sh
memcg_stat_rss 1 TINFO: Tested kernel: Linux harjha-ol9kdevltp
6.7.0-masterpost.2024111.el9.rc1.x86_64 #1 SMP PREEMPT_DYNAMIC Fri Nov
15 11:55:41 PST 2024 x86_64 x86_64 x86_64 GNU/Linux
memcg_stat_rss 1 TINFO: Using
/tempdir/ltp-G6cge4CkrR/LTP_memcg_stat_rss.1zrm6X02CO as tmpdir (xfs
filesystem)
memcg_stat_rss 1 TINFO: timeout per run is 0h 5m 0s
memcg_stat_rss 1 TINFO: set /sys/fs/cgroup/memory/memory.use_hierarchy
to 0 failed
memcg_stat_rss 1 TINFO: Setting shmmax
memcg_stat_rss 1 TINFO: Running memcg_process --mmap-anon -s 266240
memcg_stat_rss 1 TINFO: Warming up pid: 9083
memcg_stat_rss 1 TINFO: Process is still here after warm up: 9083
memcg_stat_rss 1 TFAIL: rss is 0, 266240 expected
memcg_stat_rss 1 TBROK: timed out on memory.usage_in_bytes 4096 266240
266240
/opt/ltp-20240930/testcases/bin/tst_test.sh: line 158:  9083
Killed                  memcg_process "$@"  (wd:
/sys/fs/cgroup/memory/ltp/test-9024/ltp_9024)

Summary:
passed   0
failed   1
broken   1
skipped  0
warnings 0
<<<execution_status>>>

It is important to note that the entire test suite didn't even execute,
as the second test itself was broken.
The latest 6.12 also shows errors in this test suite when cgroup v1 is
explicitly enabled.
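
For completeness, cgroup v1 was forced the same way as in the repro script
earlier in the thread (the exact grubby flags may vary by distribution):

sudo grubby --update-kernel DEFAULT --args="systemd.unified_cgroup_hierarchy=0"
sudo reboot
# after reboot, the v1 memory controller should be mounted:
mount | grep '/sys/fs/cgroup/memory'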

Thanks & Regards,
Harshvardhan


* Re: [LTP] Issue faced in memcg_stat_rss while running mainline kernels between 6.7 and 6.8
  2025-01-16  8:07         ` Harshvardhan Jha
@ 2025-01-16  9:06           ` Michal Hocko
  2025-01-16 10:04             ` Harshvardhan Jha
  2025-01-16 10:12             ` Petr Vorel
  0 siblings, 2 replies; 7+ messages in thread
From: Michal Hocko @ 2025-01-16  9:06 UTC (permalink / raw)
  To: Harshvardhan Jha; +Cc: Petr Vorel, Li Wang, Cyril Hrubis, ltp, cgroups

On Thu 16-01-25 13:37:14, Harshvardhan Jha wrote:
> Hello Michal
> On 16/01/25 1:23 PM, Michal Hocko wrote:
> > Hi,
> >
> > On Wed 15-01-25 23:59:20, Petr Vorel wrote:
> >> Hi Harshvardhan,
> >>
> >> [ Cc cgroups@vger.kernel.org: FYI problem in recent kernel using cgroup v1 ]
> > It is hard to decipher the output and nail down the actual failure. Could
> > somebody do a TL;DR summary of the failure: since when does it happen, and
> > is it really v1 specific?
> 
> The test ltp_memcg_stat_rss is indeed cgroup v1 specific.

What does this test case aim to test?

-- 
Michal Hocko
SUSE Labs


* Re: [LTP] Issue faced in memcg_stat_rss while running mainline kernels between 6.7 and 6.8
  2025-01-16  9:06           ` Michal Hocko
@ 2025-01-16 10:04             ` Harshvardhan Jha
  2025-01-16 10:35               ` Michal Hocko
  2025-01-16 10:12             ` Petr Vorel
  1 sibling, 1 reply; 7+ messages in thread
From: Harshvardhan Jha @ 2025-01-16 10:04 UTC (permalink / raw)
  To: Michal Hocko; +Cc: Petr Vorel, Li Wang, Cyril Hrubis, ltp, cgroups

Hi Michal,

On 16/01/25 2:36 PM, Michal Hocko wrote:
> On Thu 16-01-25 13:37:14, Harshvardhan Jha wrote:
>> Hello Michal
>> On 16/01/25 1:23 PM, Michal Hocko wrote:
>>> Hi,
>>>
>>> On Wed 15-01-25 23:59:20, Petr Vorel wrote:
>>>> Hi Harshvardhan,
>>>>
>>>> [ Cc cgroups@vger.kernel.org: FYI problem in recent kernel using cgroup v1 ]
>>> It is hard to decipher the output and nail down the actual failure. Could
>>> somebody do a TL;DR summary of the failure: since when does it happen, and
>>> is it really v1 specific?
>> The test ltp_memcg_stat_rss is indeed cgroup v1 specific.
> What does this test case aim to test?
>
This test specifically tests the memory cgroup (memcg) subsystem,
focusing on the RSS accounting functionality.

The test verifies how the kernel tracks and reports memory usage within
cgroups, specifically:

- The accuracy of RSS accounting in memory cgroups
- How the kernel updates and maintains the RSS statistics for processes
within memory cgroups
- The proper reporting of memory usage through the cgroup interface

The test typically:

 1. Creates a memory cgroup
 2. Allocates various types of memory within it
 3. Verifies that the reported RSS statistics match the expected values
 4. Tests edge cases like shared pages and memory pressure situations
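
A minimal sketch of steps 1-3 under cgroup v1 (the group name and size are
illustrative, and this assumes the v1 memory controller is mounted at
/sys/fs/cgroup/memory):

mkdir /sys/fs/cgroup/memory/demo                     # 1. create a memory cgroup
echo $$ > /sys/fs/cgroup/memory/demo/tasks           # charge this shell to the group
buf=$(head -c 266240 /dev/zero | tr '\0' x)          # 2. allocate anonymous memory
grep '^rss ' /sys/fs/cgroup/memory/demo/memory.stat  # 3. compare rss with the expected size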

I hope I explained it right, @Petr?

Thanks & Regards,
Harshvardhan



* Re: [LTP] Issue faced in memcg_stat_rss while running mainline kernels between 6.7 and 6.8
  2025-01-16  9:06           ` Michal Hocko
  2025-01-16 10:04             ` Harshvardhan Jha
@ 2025-01-16 10:12             ` Petr Vorel
  1 sibling, 0 replies; 7+ messages in thread
From: Petr Vorel @ 2025-01-16 10:12 UTC (permalink / raw)
  To: Michal Hocko; +Cc: Harshvardhan Jha, Li Wang, Cyril Hrubis, ltp, cgroups

Hi Michal, all,

> On Thu 16-01-25 13:37:14, Harshvardhan Jha wrote:
> > Hello Michal
> > On 16/01/25 1:23 PM, Michal Hocko wrote:
> > > Hi,

> > > On Wed 15-01-25 23:59:20, Petr Vorel wrote:
> > >> Hi Harshvardhan,

> > >> [ Cc cgroups@vger.kernel.org: FYI problem in recent kernel using cgroup v1 ]
> > > It is hard to decipher the output and nail down the actual failure. Could
> > > somebody do a TL;DR summary of the failure: since when does it happen, and
> > > is it really v1 specific?

> > The test ltp_memcg_stat_rss is indeed cgroup v1 specific.

> What does this test case aim to test?

I'm not an expert on cgroup tests; maybe Li or Cyril can comment better.

memcg_stat_rss.sh [1] claims to "Test the management and counting of memory";
test_mem_stat() [2] checks memory.stat after doing some memory allocation.
Each test runs memcg_process.c [3], which performs various mmap() calls,
followed by checks.

These tests are quite old; I'm not sure how relevant they are. We have newer
tests written completely in C, which are more reliable.
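
Concretely, the failing check reduces to something like this (a sketch of
what memcg_lib.sh drives, not the actual code; memcg_process maps and
touches the memory when it receives SIGUSR1, and $cgroup_dir stands for the
test's v1 group):

memcg_process --mmap-anon -s 266240 &
echo $! > "$cgroup_dir/tasks"
kill -USR1 $!                  # trigger the anonymous mapping
sleep 1
rss=$(awk '/^rss /{print $2}' "$cgroup_dir/memory.stat")
[ "$rss" -eq 266240 ] || echo "TFAIL: rss is $rss, 266240 expected"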

Kind regards,
Petr

[1] https://github.com/linux-test-project/ltp/blob/master/testcases/kernel/controllers/memcg/functional/memcg_stat_rss.sh#L17C3-L17C45
[2] https://github.com/linux-test-project/ltp/blob/master/testcases/kernel/controllers/memcg/functional/memcg_lib.sh#L249
[3] https://github.com/linux-test-project/ltp/tree/master/testcases/kernel/controllers/memcg/functional/memcg_process.c



* Re: [LTP] Issue faced in memcg_stat_rss while running mainline kernels between 6.7 and 6.8
  2025-01-16 10:04             ` Harshvardhan Jha
@ 2025-01-16 10:35               ` Michal Hocko
  0 siblings, 0 replies; 7+ messages in thread
From: Michal Hocko @ 2025-01-16 10:35 UTC (permalink / raw)
  To: Harshvardhan Jha; +Cc: Petr Vorel, Li Wang, Cyril Hrubis, ltp, cgroups

On Thu 16-01-25 15:34:38, Harshvardhan Jha wrote:
> Hi Michal,
> 
> On 16/01/25 2:36 PM, Michal Hocko wrote:
> > On Thu 16-01-25 13:37:14, Harshvardhan Jha wrote:
> >> Hello Michal
> >> On 16/01/25 1:23 PM, Michal Hocko wrote:
> >>> Hi,
> >>>
> >>> On Wed 15-01-25 23:59:20, Petr Vorel wrote:
> >>>> Hi Harshvardhan,
> >>>>
> >>>> [ Cc cgroups@vger.kernel.org: FYI problem in recent kernel using cgroup v1 ]
> >>> It is hard to decipher the output and nail down the actual failure. Could
> >>> somebody do a TL;DR summary of the failure: since when does it happen, and
> >>> is it really v1 specific?
> >> The test ltp_memcg_stat_rss is indeed cgroup v1 specific.
> > What does this test case aim to test?
> >
> This test specifically tests the memory cgroup (memcg) subsystem,
> focusing on the RSS accounting functionality.
> 
> The test verifies how the kernel tracks and reports memory usage within
> cgroups, specifically:
> 
> - The accuracy of RSS accounting in memory cgroups
> - How the kernel updates and maintains the RSS statistics for processes
> within memory cgroups
> - The proper reporting of memory usage through the cgroup interface
> 
> The test typically:
> 
>  1. Creates a memory cgroup
>  2. Allocates various types of memory within it
>  3. Verifies that the reported RSS statistics match the expected values
>  4. Tests edge cases like shared pages and memory pressure situations
> 
> I hope I explained it right, @Petr?

Thanks. Yes, this does clarify the test case. Unfortunately, this could
be quite tricky to get right, especially with short-lived processes. Due
to stats accounting optimizations, the changes to counters might not be
visible right away. So some tuning is required, and to make it worse, that
tuning might just not work with future optimizations.

All that being said, it is a question whether this specific testcase
brings sufficient value to justify the likely false negatives and the
constant tuning to the existing kernel implementation.

If this local imprecision is a problem for real workloads, we might need
to provide a means to sync up stats (similar to what we have for
/proc/vmstat), and test cases could rely on that rather than trying to
estimate in-flight cached stats.
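
To illustrate, until such an interface exists a test can only poll rather
than assume an immediate flush; a sketch ($cgroup_dir is the test's group,
and the interval and retry count are arbitrary):

expected=266240
for i in $(seq 1 50); do
        rss=$(awk '/^rss /{print $2}' "$cgroup_dir/memory.stat")
        [ "$rss" -eq "$expected" ] && break
        sleep 0.1
done

(The vmstat equivalent of such syncing is /proc/sys/vm/stat_refresh.)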
-- 
Michal Hocko
SUSE Labs

