From: Yosry Ahmed <yosryahmed@google.com>
To: Michal Hocko <mhocko@suse.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
    Johannes Weiner <hannes@cmpxchg.org>,
    Roman Gushchin <roman.gushchin@linux.dev>,
    Shakeel Butt <shakeelb@google.com>,
    Muchun Song <muchun.song@linux.dev>,
    Ivan Babrou <ivan@cloudflare.com>, Tejun Heo <tj@kernel.org>,
    linux-mm@kvack.org, cgroups@vger.kernel.org,
    linux-kernel@vger.kernel.org
Subject: Re: [PATCH 3/3] mm: memcg: use non-unified stats flushing for userspace reads
Date: Mon, 28 Aug 2023 09:15:04 -0700
Message-ID: <CAJD7tkakMcaR_6NygEXCt6GF8TOuzYAUQe1im+vu2F3G4jtz=w@mail.gmail.com>
In-Reply-To: <ZOzBgfzlGdrPD4gk@dhcp22.suse.cz>

On Mon, Aug 28, 2023 at 8:47 AM Michal Hocko <mhocko@suse.com> wrote:
>
> Done my homework and studied the rstat code more (sorry should have done
> that earlier).
>
> On Fri 25-08-23 08:14:54, Yosry Ahmed wrote:
> [...]
> > I guess what I am trying to say is, breaking down that lock is a major
> > surgery that might require re-designing or re-implementing some parts
> > of rstat. I would be extremely happy to be proven wrong. If we can
> > break down that lock then there is no need for unified flushing even
> > for in-kernel contexts, and we can all live happily ever after with
> > cheap(ish) and accurate stats flushing.
>
> Yes, this seems like a big change and also over complicating the whole
> thing. I am not sure this is worth it.
>
> > I really hope we can move forward with the problems at hand (sometimes
> > reads are expensive, sometimes reads are stale), and not block fixing
> > them until we can come up with an alternative to that global lock
> > (unless, of course, there is a simpler way of doing that).
>
> Well, I really have to say that I do not like the notion that reading
> stats is unpredictable. This just makes it really hard to use. If
> the precision is to be sacrificed then this should be preferable over
> potentially high global lock contention. We already have that model in
> place of /proc/vmstat (configurable timeout for flusher and a way to
> flush explicitly). I appreciate you would like to have a better
> precision but as you have explored the locking is really hard to get rid
> of here.

Reading the stats *is* unpredictable today, in terms of both
accuracy/staleness and cost. Avoiding the flush entirely on the read
path will surely make the cost very stable and cheap, but will make
accuracy even less predictable.

>
> So from my POV I would prefer to avoid flushing from the stats reading
> path and implement force flushing by writing to stat file. If the 2s
> flushing interval is considered too coarse I would be OK to allow setting
> it from userspace. This way this would be more in line with /proc/vmstat
> which seems to be working quite well.
>
> If this is not acceptable or deemed a wrong approach long term then it
> would be good to reconsider the current cgroup_rstat_lock at least.
> Either by turning it into a mutex or by dropping the yielding code which
> can severely affect the worst case latency AFAIU.

Honestly I think it's better if we do it the other way around. We make
flushing on the stats reading path non-unified and deterministic. That
model also exists and is used for cpu.stat. If we find a problem with
the lock being held from userspace, we can then remove flushing
from the read path and add interface(s) to configure the periodic
flusher and do a force flush.

I would like to avoid introducing additional interfaces and
configuration knobs unless it's necessary. Also, if we remove the
flush entirely, reads will become really cheap, and we will have a hard
time reversing that in the future if we want to change the
implementation.

IOW, moving forward with this change seems much more reversible than
adopting the /proc/vmstat model.
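
To make the two models concrete, here is a toy userspace sketch in
Python. It is not kernel code: ToyStats and all of its methods are
hypothetical names made up for illustration, and it models neither
rstat nor memcg, only the general trade-off between flushing on read
and reading a cached value that a periodic or forced flush refreshes.

```python
class ToyStats:
    """Toy model: per-CPU deltas are folded into a total by a flush."""

    def __init__(self, nr_cpus):
        self.percpu_delta = [0] * nr_cpus  # updates not yet flushed
        self.total = 0                     # value visible without a flush
        self.flushes = 0                   # number of flushes paid for

    def update(self, cpu, delta):
        # Updaters only touch their own per-CPU slot (cheap).
        self.percpu_delta[cpu] += delta

    def flush(self):
        # Fold every pending per-CPU delta into the total.
        self.total += sum(self.percpu_delta)
        self.percpu_delta = [0] * len(self.percpu_delta)
        self.flushes += 1

    def read_flushing(self):
        # Flush-on-read model: accurate, but every read pays for a flush.
        self.flush()
        return self.total

    def read_cached(self):
        # vmstat-like model: cheap, but stale until some flush has run.
        return self.total


stats = ToyStats(nr_cpus=4)
stats.update(0, 5)
stats.update(3, 2)
stale = stats.read_cached()     # no flush has run yet
fresh = stats.read_flushing()   # folds the pending deltas in first
```

In this sketch the cached read returns whatever the last flush left
behind, so its accuracy depends on when something else flushed, while
the flushing read has a cost that grows with the number of CPUs.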

If using a mutex will make things better, we can do that now. It
doesn't introduce performance issues in my testing. My only concern is
someone sleeping or getting preempted while holding the mutex, so I
would prefer disabling preemption while we flush if that doesn't cause
problems.
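
As a rough userspace analogy only, the sketch below serializes
concurrent flushers with a sleeping lock that is held for the whole
flush without yielding. threading.Lock stands in for a kernel mutex;
flush_lock, pending, total and flush() are hypothetical names, and
disabling preemption has no equivalent here, so it is not modeled.

```python
import threading

flush_lock = threading.Lock()   # stand-in for a mutex around the flush
pending = [3, 1, 4, 1]          # hypothetical per-CPU pending deltas
total = 0                       # flushed value


def flush():
    """Fold all pending deltas into total while holding the lock."""
    global total
    with flush_lock:
        # The lock is held across the whole loop: no yielding midway.
        for cpu in range(len(pending)):
            total += pending[cpu]
            pending[cpu] = 0


# Several concurrent "readers" all trigger a flush; they are serialized.
threads = [threading.Thread(target=flush) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Only the first flusher to take the lock finds pending deltas; the rest
wait their turn and then find nothing left to fold in.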

Thanks!