Re: [PATCH 1/4 v2] cgroup: move cgroup_rstat from cgroup to cgroup_subsys_state

public inbox for cgroups@vger.kernel.org
 help / color / mirror / Atom feed

From: JP Kobryn <inwardvessel@gmail.com>
To: Yosry Ahmed <yosry.ahmed@linux.dev>
Cc: tj@kernel.org, shakeel.butt@linux.dev, mhocko@kernel.org,
	hannes@cmpxchg.org, akpm@linux-foundation.org,
	linux-mm@kvack.org, cgroups@vger.kernel.org,
	kernel-team@meta.com
Subject: Re: [PATCH 1/4 v2] cgroup: move cgroup_rstat from cgroup to cgroup_subsys_state
Date: Fri, 28 Feb 2025 17:30:08 -0800	[thread overview]
Message-ID: <0a551dcc-6a95-46a4-9a60-7e62200e63ef@gmail.com> (raw)
In-Reply-To: <Z8Jh7-lN_qltU7WD@google.com>

On 2/28/25 5:25 PM, Yosry Ahmed wrote:
> On Fri, Feb 28, 2025 at 05:06:23PM -0800, JP Kobryn wrote:
> [..]
>>>
>>>>    		cgroup_idr_replace(&ss->css_idr, NULL, css->id);
>>>>    		if (ss->css_released)
>>> [..]
>>>> @@ -6188,6 +6186,9 @@ int __init cgroup_init(void)
>>>>    			css->id = cgroup_idr_alloc(&ss->css_idr, css, 1, 2,
>>>>    						   GFP_KERNEL);
>>>>    			BUG_ON(css->id < 0);
>>>> +
>>>> +			if (css->ss && css->ss->css_rstat_flush)
>>>> +				BUG_ON(cgroup_rstat_init(css));
>>>
>>> Why do we need this call here? We already call cgroup_rstat_init() in
>>> cgroup_init_subsys(). IIUC for subsystems with ss->early_init, we will
>>> have already called cgroup_init_subsys() in cgroup_init_early().
>>>
>>> Did I miss something?
>>
>> Hmmm it's a good question. cgroup_rstat_init() is deferred in the same
>> way that cgroup_idr_alloc() is. So for ss with early_init == true,
>> cgroup_rstat_init() is not called during cgroup_early_init().
> 
> Oh I didn't realize that the call here is only when early_init == false.
> I think we need a comment to clarify that cgroup_idr_alloc() and
> cgroup_rstat_init() are not called in cgroup_init_subsys() when
> early_init == true, and hence need to be called in cgroup_init().
> 
> Or maybe organize the code in a way to make this more obvious (put them
> in a helper with a descriptive name? idk).

I see what you're getting at. Let me think of something for v3.

> 
>>
>> Is it safe to call alloc_percpu() during early boot? If so, the
>> deferral can be removed and cgroup_rstat_init() can be called in one
>> place.
> 
> I don't think so. cgroup_init_early() is called before
> setup_per_cpu_areas().

Cool. Thanks for pointing that out.

> 
>>
>>>
>>>>    		} else {
>>>>    			cgroup_init_subsys(ss, false);
>>>>    		}
>>> [..]
>>>> @@ -300,27 +306,25 @@ static inline void __cgroup_rstat_unlock(struct cgroup *cgrp, int cpu_in_loop)
>>>>    }
>>>>    /* see cgroup_rstat_flush() */
>>>> -static void cgroup_rstat_flush_locked(struct cgroup *cgrp)
>>>> +static void cgroup_rstat_flush_locked(struct cgroup_subsys_state *css)
>>>>    	__releases(&cgroup_rstat_lock) __acquires(&cgroup_rstat_lock)
>>>>    {
>>>> +	struct cgroup *cgrp = css->cgroup;
>>>>    	int cpu;
>>>>    	lockdep_assert_held(&cgroup_rstat_lock);
>>>>    	for_each_possible_cpu(cpu) {
>>>> -		struct cgroup *pos = cgroup_rstat_updated_list(cgrp, cpu);
>>>> +		struct cgroup_subsys_state *pos;
>>>> +		pos = cgroup_rstat_updated_list(css, cpu);
>>>>    		for (; pos; pos = pos->rstat_flush_next) {
>>>> -			struct cgroup_subsys_state *css;
>>>> +			if (!pos->ss)
>>>> +				cgroup_base_stat_flush(pos->cgroup, cpu);
>>>> +			else
>>>> +				pos->ss->css_rstat_flush(pos, cpu);
>>>> -			cgroup_base_stat_flush(pos, cpu);
>>>> -			bpf_rstat_flush(pos, cgroup_parent(pos), cpu);
>>>> -
>>>> -			rcu_read_lock();
>>>> -			list_for_each_entry_rcu(css, &pos->rstat_css_list,
>>>> -						rstat_css_node)
>>>> -				css->ss->css_rstat_flush(css, cpu);
>>>> -			rcu_read_unlock();
>>>> +			bpf_rstat_flush(pos->cgroup, cgroup_parent(pos->cgroup), cpu);
>>>
>>> We should call bpf_rstat_flush() only if (!pos->ss) as well, right?
>>> Otherwise we will call BPF rstat flush whenever any subsystem is
>>> flushed.
>>>
>>> I guess it's because BPF can now pass any subsystem to
>>> cgroup_rstat_flush(), and we don't keep track. I think it would be
>>> better if we do not allow BPF programs to select a css and always make
>>> them flush the self css.
>>>
>>> We can perhaps introduce a bpf_cgroup_rstat_flush() wrapper that takes
>>> in a cgroup and passes cgroup->self internally to cgroup_rstat_flush().
>>
>> I'm fine with this if others are in agreement. A similar concept was
>> done in v1.
> 
> Let's wait for Shakeel to chime in here since he suggested removing this
> hook, but I am not sure if he intended to actually do it or not. Better
> not to waste effort if this will be gone soon anyway.
> 
>>
>>>
>>> But if the plan is to remove the bpf_rstat_flush() call here soon then
>>> it's probably not worth the hassle.
>>>
>>> Shakeel (and others), WDYT?
>>

next prev parent reply	other threads:[~2025-03-01  1:30 UTC|newest]

Thread overview: 39+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-02-27 21:55 [PATCH 0/4 v2] cgroup: separate rstat trees inwardvessel
2025-02-27 21:55 ` [PATCH 1/4 v2] cgroup: move cgroup_rstat from cgroup to cgroup_subsys_state inwardvessel
2025-02-27 22:43   ` Shakeel Butt
2025-02-28 19:04   ` Yosry Ahmed
2025-03-01  1:06     ` JP Kobryn
2025-03-01  1:25       ` Yosry Ahmed
2025-03-01  1:30         ` JP Kobryn [this message]
2025-03-03 18:18         ` Shakeel Butt
2025-03-03 18:21           ` Yosry Ahmed
2025-03-03 15:20   ` Michal Koutný
2025-03-03 19:31     ` JP Kobryn
2025-02-27 21:55 ` [PATCH 2/4 v2] cgroup: rstat lock indirection inwardvessel
2025-03-03 15:21   ` Michal Koutný
2025-02-27 21:55 ` [PATCH 3/4 v2] cgroup: separate rstat locks for subsystems inwardvessel
2025-02-27 22:52   ` Shakeel Butt
2025-02-28 16:07     ` JP Kobryn
2025-02-28 17:37   ` JP Kobryn
2025-02-28 19:20   ` Yosry Ahmed
2025-03-06 21:47     ` JP Kobryn
2025-03-01 23:00   ` kernel test robot
2025-03-03 15:22   ` Michal Koutný
2025-03-03 18:29     ` Yosry Ahmed
2025-03-03 18:40       ` Shakeel Butt
2025-03-03 19:23         ` JP Kobryn
2025-03-03 19:39           ` Shakeel Butt
2025-03-03 19:50         ` Yosry Ahmed
2025-03-03 20:09           ` Shakeel Butt
2025-03-03 18:49       ` Michal Koutný
2025-03-10 17:59         ` JP Kobryn
2025-03-11 13:49           ` Michal Koutný
2025-03-06 21:36       ` JP Kobryn
2025-03-03 23:49   ` kernel test robot
2025-02-27 21:55 ` [PATCH 4/4 v2] cgroup: separate rstat list pointers from base stats inwardvessel
2025-02-27 23:01   ` Shakeel Butt
2025-02-28 20:33   ` Yosry Ahmed
2025-02-28 18:22 ` [PATCH 0/4 v2] cgroup: separate rstat trees Yosry Ahmed
2025-03-03 15:19 ` Michal Koutný
2025-03-06  1:07   ` JP Kobryn
2025-03-11 13:49     ` Michal Koutný

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=0a551dcc-6a95-46a4-9a60-7e62200e63ef@gmail.com \
    --to=inwardvessel@gmail.com \
    --cc=akpm@linux-foundation.org \
    --cc=cgroups@vger.kernel.org \
    --cc=hannes@cmpxchg.org \
    --cc=kernel-team@meta.com \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@kernel.org \
    --cc=shakeel.butt@linux.dev \
    --cc=tj@kernel.org \
    --cc=yosry.ahmed@linux.dev \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox