linux-kselftest.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Chen Ridong <chenridong@huaweicloud.com>
To: Tiffany Yang <ynaffit@google.com>
Cc: linux-kernel@vger.kernel.org, "John Stultz" <jstultz@google.com>,
	"Thomas Gleixner" <tglx@linutronix.de>,
	"Stephen Boyd" <sboyd@kernel.org>,
	"Anna-Maria Behnsen" <anna-maria@linutronix.de>,
	"Frederic Weisbecker" <frederic@kernel.org>,
	"Tejun Heo" <tj@kernel.org>,
	"Johannes Weiner" <hannes@cmpxchg.org>,
	"Michal Koutný" <mkoutny@suse.com>,
	"Rafael J. Wysocki" <rafael@kernel.org>,
	"Pavel Machek" <pavel@kernel.org>,
	"Roman Gushchin" <roman.gushchin@linux.dev>,
	"Chen Ridong" <chenridong@huawei.com>,
	kernel-team@android.com, "Jonathan Corbet" <corbet@lwn.net>,
	"Shuah Khan" <shuah@kernel.org>,
	cgroups@vger.kernel.org, linux-doc@vger.kernel.org,
	linux-kselftest@vger.kernel.org
Subject: Re: [PATCH v4 1/2] cgroup: cgroup.stat.local time accounting
Date: Sat, 23 Aug 2025 09:45:26 +0800	[thread overview]
Message-ID: <1b6498f3-ca07-41d5-9637-f20a58184e60@huaweicloud.com> (raw)
In-Reply-To: <dbx8ms7r885f.fsf@ynaffit-andsys.c.googlers.com>



On 2025/8/23 3:32, Tiffany Yang wrote:
> Hi Chen,
> 
> Thanks again for taking a look!
> 
> Chen Ridong <chenridong@huaweicloud.com> writes:
> 
>> On 2025/8/22 14:14, Chen Ridong wrote:
> 
> 
>>> On 2025/8/22 9:37, Tiffany Yang wrote:
>>>> There isn't yet a clear way to identify a set of "lost" time that
>>>> everyone (or at least a wider group of users) cares about. However,
>>>> users can perform some delay accounting by iterating over components of
>>>> interest. This patch allows cgroup v2 freezing time to be one of those
>>>> components.
> 
>>>> Track the cumulative time that each v2 cgroup spends freezing and expose
>>>> it to userland via a new local stat file in cgroupfs. Thank you to
>>>> Michal, who provided the ASCII art in the updated documentation.
> 
>>>> To access this value:
>>>>    $ mkdir /sys/fs/cgroup/test
>>>>    $ cat /sys/fs/cgroup/test/cgroup.stat.local
>>>>    freeze_time_total 0
> 
>>>> Ensure consistent freeze time reads with freeze_seq, a per-cgroup
>>>> sequence counter. Writes are serialized using the css_set_lock.
> 
> ...
> 
>>>>       spin_lock_irq(&css_set_lock);
>>>> -    if (freeze)
>>>> +    write_seqcount_begin(&cgrp->freezer.freeze_seq);
>>>> +    if (freeze) {
>>>>           set_bit(CGRP_FREEZE, &cgrp->flags);
>>>> -    else
>>>> +        cgrp->freezer.freeze_start_nsec = ts_nsec;
>>>> +    } else {
>>>>           clear_bit(CGRP_FREEZE, &cgrp->flags);
>>>> +        cgrp->freezer.frozen_nsec += (ts_nsec -
>>>> +            cgrp->freezer.freeze_start_nsec);
>>>> +    }
>>>> +    write_seqcount_end(&cgrp->freezer.freeze_seq);
>>>>       spin_unlock_irq(&css_set_lock);
> 
> 
>>> Hello Tiffany,
> 
>>> I wanted to check if there are any specific considerations regarding how we should input the ts_nsec
>>> value.
> 
>>> Would it be possible to define this directly within the cgroup_do_freeze function rather than
>>> passing it as a parameter? This approach might simplify the implementation and potentially improve
>>> timing accuracy when it have lots of descendants.
> 
> 
>> I revisited v3, and this was Michal's point.
>>     p
>>       /  |  \
>>      1  ...  n
>> When we freeze the parent group p, is it expected that all descendant cgroups (1 to n) should share
>> the same frozen timestamp?
> 
> 
> Yes, this is the expectation from the current change. I understand your
> concern about the accuracy of this measurement (especially when there
> are many descendants), but I agree with Michal's point that the time to
> traverse the descendant cgroups is basically noise relative to the
> quantity we're trying to measure here.
> 
>> If the cgroup tree structure is stable, the exact frozen time may not be really matter. However, if
>> the tree is not stable, obtaining the same frozen time is acceptable?
> 
> I'm a little unclear as to what you mean about when the cgroup tree is
> unstable. In the case where a new descendant of p is being created, I
> believe the cgroup_mutex prevents that from happening at the same time
> as we are freezing p's other descendants. If it won the race, was
> created unfrozen under p, and then became frozen during cgroup_freeze,
> it would have the same timestamp as the other descendants. If it lost
> the race and was created as a frozen cgroup under p, it would get its
> own timestamp in cgroup_create, so its freezing duration would be
> slightly less than that of the others in the hierarchy. Both values
> would be acceptable for our purposes, but if there was a different case
> you had in mind, please let me know!
> 
> Thanks,

What I mean by "stable" is that while cgroup 1 through n might be deleted or have more descendants
created. For example:

         n  n-1  n-2  ... 1
frozen   a  a+1  a+2     a+n
unfozen  b  b+1  b+2  ... b+n
nsec     b-a ...

In this case, all frozen_nsec values are b - a, which I believe is correct.
However, consider a scenario where some cgroups are deleted:

         n  n-1  n-2  ... 1
frozen   a  a+1  a+2     a+n
// 2 ... n-1 are deleted.
unfozen  b               b+1

Here, the frozen_nsec for cgroup n would be b - a, but for cgroup 1 it would be (b + 1) - (a + n).
This could introduce some discrepancy / timing inaccuracies.

-- 
Best regards,
Ridong


  reply	other threads:[~2025-08-23  1:45 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-08-22  1:37 [PATCH v4 0/2] cgroup: Track time in cgroup v2 freezer Tiffany Yang
2025-08-22  1:37 ` [PATCH v4 1/2] cgroup: cgroup.stat.local time accounting Tiffany Yang
2025-08-22  6:14   ` Chen Ridong
2025-08-22  6:58     ` Chen Ridong
2025-08-22 19:32       ` Tiffany Yang
2025-08-23  1:45         ` Chen Ridong [this message]
2025-08-25 21:00           ` Tiffany Yang
2025-08-22  1:37 ` [PATCH v4 2/2] cgroup: selftests: Add tests for freezer time Tiffany Yang
2025-08-22  7:19   ` Chen Ridong
2025-08-22 18:50     ` Tiffany Yang
2025-08-23  1:47       ` Chen Ridong
2025-08-22 17:51 ` [PATCH v4 0/2] cgroup: Track time in cgroup v2 freezer Tejun Heo

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1b6498f3-ca07-41d5-9637-f20a58184e60@huaweicloud.com \
    --to=chenridong@huaweicloud.com \
    --cc=anna-maria@linutronix.de \
    --cc=cgroups@vger.kernel.org \
    --cc=chenridong@huawei.com \
    --cc=corbet@lwn.net \
    --cc=frederic@kernel.org \
    --cc=hannes@cmpxchg.org \
    --cc=jstultz@google.com \
    --cc=kernel-team@android.com \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-kselftest@vger.kernel.org \
    --cc=mkoutny@suse.com \
    --cc=pavel@kernel.org \
    --cc=rafael@kernel.org \
    --cc=roman.gushchin@linux.dev \
    --cc=sboyd@kernel.org \
    --cc=shuah@kernel.org \
    --cc=tglx@linutronix.de \
    --cc=tj@kernel.org \
    --cc=ynaffit@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).