From: "Leonardo Brás" <leobras@redhat.com>
To: Roman Gushchin <roman.gushchin@linux.dev>,
Michal Hocko <mhocko@suse.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>,
Johannes Weiner <hannes@cmpxchg.org>,
Shakeel Butt <shakeelb@google.com>,
Muchun Song <muchun.song@linux.dev>,
Andrew Morton <akpm@linux-foundation.org>,
cgroups@vger.kernel.org, linux-mm@kvack.org,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2 0/5] Introduce memcg_stock_pcp remote draining
Date: Fri, 27 Jan 2023 04:14:19 -0300
Message-ID: <55ac6e3cbb97c7d13c49c3125c1455d8a2c785c3.camel@redhat.com>
In-Reply-To: <Y9MI42NSLooyVZNu@P9FQF9L96D.corp.robot.car>
On Thu, 2023-01-26 at 15:12 -0800, Roman Gushchin wrote:
> On Thu, Jan 26, 2023 at 08:41:34AM +0100, Michal Hocko wrote:
> > On Wed 25-01-23 15:14:48, Roman Gushchin wrote:
> > > On Wed, Jan 25, 2023 at 03:22:00PM -0300, Marcelo Tosatti wrote:
> > > > On Wed, Jan 25, 2023 at 08:06:46AM -0300, Leonardo Brás wrote:
> > > > > On Wed, 2023-01-25 at 09:33 +0100, Michal Hocko wrote:
> > > > > > On Wed 25-01-23 04:34:57, Leonardo Bras wrote:
> > > > > > > Disclaimer:
> > > > > > > a - The cover letter got bigger than expected, so I had to split it in
> > > > > > > sections to better organize myself. I am not very comfortable with it.
> > > > > > > b - Performance numbers below did not include patch 5/5 (Remove flags
> > > > > > > from memcg_stock_pcp), which could further improve performance for
> > > > > > > drain_all_stock(), but I only noticed that optimization at the
> > > > > > > last minute.
> > > > > > >
> > > > > > >
> > > > > > > 0 - Motivation:
> > > > > > > In the current codebase, when drain_all_stock() is run, it schedules a
> > > > > > > drain_local_stock() on each cpu whose percpu stock is associated with a
> > > > > > > descendant of the given root_memcg.
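
(For context: the flow in question looks roughly like the sketch below. This is
simplified from mm/memcontrol.c and not verbatim mainline code; the point is
only that the drain work is scheduled on, and runs on, each remote cpu.)

static void drain_all_stock(struct mem_cgroup *root_memcg)
{
	int cpu;

	for_each_online_cpu(cpu) {
		struct memcg_stock_pcp *stock = &per_cpu(memcg_stock, cpu);
		struct mem_cgroup *memcg = READ_ONCE(stock->cached);

		/* Only cpus caching charges for this hierarchy are drained. */
		if (!memcg || !READ_ONCE(stock->nr_pages) ||
		    !mem_cgroup_is_descendant(memcg, root_memcg))
			continue;

		/* Schedules drain_local_stock() to run on that remote cpu. */
		if (!test_and_set_bit(FLUSHING_CACHED_CHARGE, &stock->flags))
			schedule_work_on(cpu, &stock->work);
	}
}
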
> > >
> > > Do you know what caused those drain_all_stock() calls? I wonder if we should look
> > > into why we have so many of them and whether we really need them.
> > >
> > > It's either some user action (e.g. reducing memory.max), or some memcg
> > > entering pre-oom conditions. In the latter case a lot of drain calls can be
> > > scheduled without a good reason (assuming the cgroup contains multiple tasks running
> > > on multiple cpus).
> >
> > I believe I've never got a specific answer to that. We
> > have discussed that in the previous version submission
> > (20221102020243.522358-1-leobras@redhat.com and specifically
> > Y2TQLavnLVd4qHMT@dhcp22.suse.cz). Leonardo has mentioned a mix of RT and
> > isolcpus. I was wondering about using memcgs in RT workloads because
> > that just sounds odd, but let's say this is indeed the case. Then an RT
> > task, or whatever task is running on an isolated cpu, can have pcp
> > charges.
> >
> > > Essentially each cpu will try to grab the remains of the memory quota
> > > and move it locally. I wonder, in such circumstances, if we need to disable
> > > pcp-caching on a per-cgroup basis.
> >
> > I think it would be more than sufficient to disable pcp charging on an
> > isolated cpu.
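
(Just to make sure I read this right: I picture "disable pcp charging on an
isolated cpu" roughly as the sketch below. It is illustrative only, and
cpu_is_isolated() is a hypothetical stand-in for whatever isolation test we
would actually use.)

static bool consume_stock(struct mem_cgroup *memcg, unsigned int nr_pages)
{
	struct memcg_stock_pcp *stock;
	unsigned long flags;
	bool ret = false;

	if (nr_pages > MEMCG_CHARGE_BATCH)
		return ret;

	local_lock_irqsave(&memcg_stock.stock_lock, flags);

	/*
	 * Hypothetical check: on isolated cpus, never use the percpu cache,
	 * so cached charges never accumulate there and no remote drain is
	 * ever needed for them. Charging falls back to the slow path.
	 */
	if (!cpu_is_isolated(smp_processor_id())) {
		stock = this_cpu_ptr(&memcg_stock);
		if (memcg == stock->cached && stock->nr_pages >= nr_pages) {
			stock->nr_pages -= nr_pages;
			ret = true;
		}
	}

	local_unlock_irqrestore(&memcg_stock.stock_lock, flags);

	return ret;
}
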
>
> It might have significant performance consequences.
>
> I'd rather opt out of stock draining for isolated cpus: it might slightly reduce
> the accuracy of memory limits and slightly increase the memory footprint (all
> those dying memcgs...), but the impact will be limited. Actually, it is bounded
> by the number of cpus.

I was discussing this same idea with Marcelo yesterday morning.

The questions we had on the topic were:
a - Roughly how many pages will the pcp cache hold before draining itself?
b - Does it also cache larger pages (e.g. huge pages) in the same way?

Please let me know if I got anything wrong, but IIUC from a previous debug
session, the answer to (a) is 4 pages. That means even on bigger-page archs
such as powerpc with 64k pages, at most 256k of pcp cache would be 'wasted'
per cpu (very small by today's standards).

Please let me know if you have any info on (b), or any correction on (a).

The thing is: waiving drain_local_stock() only for isolated cpus would not
bring non-isolated cpus under high memory pressure the same benefits that,
as I understand it, this patchset brings.

OTOH, skipping drain_local_stock() entirely, for every cpu, may bring
performance gains (no remote CPU access), but it could be a problem if I got
the 'wasted pages' count in (a) wrong. I mean, drain_local_stock() was
introduced for a reason.
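
For reference, the way I picture the isolated-cpu opt-out is roughly the sketch
below, i.e. a skip inside the drain_all_stock() loop sketched earlier. This is
only illustrative: housekeeping_cpu()/HK_TYPE_DOMAIN (from
<linux/sched/isolation.h>) is one existing way to test for isolcpus, but the
exact isolation check (isolcpus vs. nohz_full) is an open question here, not
something I am asserting.

	for_each_online_cpu(cpu) {
		struct memcg_stock_pcp *stock = &per_cpu(memcg_stock, cpu);
		struct mem_cgroup *memcg = READ_ONCE(stock->cached);

		/* Leave isolated cpus alone: their few cached pages stay put. */
		if (!housekeeping_cpu(cpu, HK_TYPE_DOMAIN))
			continue;

		if (memcg && READ_ONCE(stock->nr_pages) &&
		    mem_cgroup_is_descendant(memcg, root_memcg) &&
		    !test_and_set_bit(FLUSHING_CACHED_CHARGE, &stock->flags))
			schedule_work_on(cpu, &stock->work);
	}
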
>
> > This is not a per memcg property.
>
> Sure, my point was that in pre-oom conditions several cpus might try to consolidate
> the remains of the memory quota, effectively working against each other. That is a
> separate issue, but it might be one reason why there are so many flush attempts in
> the case we are discussing.
>
> Thanks!
>
Thank you for reviewing!
Leo