linux-block.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Waiman Long <longman@redhat.com>
To: Ming Lei <ming.lei@redhat.com>
Cc: Tejun Heo <tj@kernel.org>, Jens Axboe <axboe@kernel.dk>,
	cgroups@vger.kernel.org, linux-block@vger.kernel.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH v6 3/3] blk-cgroup: Optimize blkcg_rstat_flush()
Date: Sun, 5 Jun 2022 21:59:50 -0400	[thread overview]
Message-ID: <c9ab0e91-76db-430f-272c-558c269d62ce@redhat.com> (raw)
In-Reply-To: <Yp1atoLkZPvA1Zd3@T590>

On 6/5/22 21:39, Ming Lei wrote:
> On Sun, Jun 05, 2022 at 07:15:27PM -0400, Waiman Long wrote:
>> On 6/3/22 23:58, Ming Lei wrote:
>>
>>> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
>>> index abec50f31fe6..8c4f204dbf5b 100644
>>> --- a/mm/memcontrol.c
>>> +++ b/mm/memcontrol.c
>>> @@ -622,7 +622,7 @@ static inline void memcg_rstat_updated(struct mem_cgroup *memcg, int val)
>>>    {
>>>    	unsigned int x;
>>> -	cgroup_rstat_updated(memcg->css.cgroup, smp_processor_id());
>>> +	cgroup_rstat_updated(memcg->css.cgroup, smp_processor_id(), NULL);
>>>    	x = __this_cpu_add_return(stats_updates, abs(val));
>>>    	if (x > MEMCG_CHARGE_BATCH) {
>> I think the rstat set of functions are doing that already. So flush will
>> only call CPUs that have called cgroup_rstat_updated() before. However, one
> Yeah, I guess the detail is in cgroup_rstat_cpu_pop_updated(), but the
> percpu lock(raw_spin_lock_irqsave) is still required, and cgroup_rstat_cpu_pop_updated()
> is still called even through there isn't any update on this CPU.
Yes, I think we may need to add a bitmask of what controllers have 
updates in cgroup_rstat_cpu structure.
>
>> deficiency that I am aware of is that there is no bitmap of which controller
>> have update. The problem that I saw in cgroup v2 is that in a cgroup with
>> both memory controller and block controller enabled, a
>> cgroup_rstat_updated() call from memory cgroup later causes the rstat
>> function to call into block cgroup flush method even though there is no
>> update in the block controller. This is an area that needs improvement.
>>
>> Your code does allow the block controller to be aware of that and avoid
>> further action, but I think it has to be done in the rstat code to be
>> applicable to all controllers instead of just specific to block controller.
> I guess it can be done by adding one percpu variable to 'struct cgroup'.
>
>> There is another problem that this approach. Suppose the system have 20
>> block devices and one of them has an IO operation. Now the flush method
>> still needs to iterate all the 20 blkg's to do an update. The block
>> controller is kind of special that the number of per-cgroup IO stats depends
>> on the number of block devices present. Other controllers just have one set
>> of stats per cgroup.
> Yeah, and this one is really blkio specific issue, and your patch does
> cover this one. Maybe you can add one callback to
> cgroup_rstat_updated(), so the "blkg_iostat_set" instance is added into
> percpu list under percpu lock of cgroup_rstat_cpu_lock, then the lockless
> list isn't needed.

The rstat API is generic. It may not be a good idea to put controller 
specific information into it. Yes, cgroup_rstat_cpu_lock is taken at the 
read side (flush). It may not taken on the write side (update). So it 
may not be easy to rely on this lock for synchronization between the 
read and write side.

Cheers,
Longman


  reply	other threads:[~2022-06-06  2:00 UTC|newest]

Thread overview: 21+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-06-02 19:20 [PATCH v6 0/3] blk-cgroup: Optimize blkcg_rstat_flush() Waiman Long
2022-06-02 19:20 ` [PATCH v6 1/3] blk-cgroup: Correctly free percpu iostat_cpu in blkg on error exit Waiman Long
2022-06-04  2:08   ` Ming Lei
2022-06-04  2:47     ` Waiman Long
2022-06-02 19:20 ` [PATCH v6 2/3] blk-cgroup: Return -ENOMEM directly in blkcg_css_alloc() error path Waiman Long
2022-06-02 20:39   ` Tejun Heo
2022-06-04  2:16   ` Ming Lei
2022-06-02 19:20 ` [PATCH v6 3/3] blk-cgroup: Optimize blkcg_rstat_flush() Waiman Long
2022-06-04  3:58   ` Ming Lei
2022-06-05 23:15     ` Waiman Long
2022-06-06  1:39       ` Ming Lei
2022-06-06  1:59         ` Waiman Long [this message]
2022-06-06  2:23           ` Ming Lei
2022-06-06  2:58             ` Waiman Long
2022-06-06  3:15               ` Ming Lei
2022-06-06  3:16   ` Ming Lei
2022-06-08 16:57   ` Michal Koutný
2022-06-08 18:16     ` Waiman Long
2022-06-08 21:12       ` Michal Koutný
2022-06-08 22:14         ` Michal Koutný
2022-09-30 18:34     ` Waiman Long

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=c9ab0e91-76db-430f-272c-558c269d62ce@redhat.com \
    --to=longman@redhat.com \
    --cc=axboe@kernel.dk \
    --cc=cgroups@vger.kernel.org \
    --cc=linux-block@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=ming.lei@redhat.com \
    --cc=tj@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).