From: Roman Gushchin <guro@fb.com>
To: Shakeel Butt <shakeelb@google.com>
Cc: Michal Hocko <mhocko@kernel.org>,
Johannes Weiner <hannes@cmpxchg.org>,
Yang Shi <yang.shi@linux.alibaba.com>,
David Rientjes <rientjes@google.com>,
Greg Thelen <gthelen@google.com>,
Andrew Morton <akpm@linux-foundation.org>,
Linux MM <linux-mm@kvack.org>,
LKML <linux-kernel@vger.kernel.org>,
Cgroups <cgroups@vger.kernel.org>
Subject: Re: [RFC PROPOSAL] memcg: per-memcg user space reclaim interface
Date: Mon, 6 Jul 2020 14:38:33 -0700
Message-ID: <20200706213404.GA152560@carbon.lan>
In-Reply-To: <CALvZod5Z4=1CijJp0QRnx+pdH=Me6sYPXASCxVATnshU0RW-Qw@mail.gmail.com>
On Fri, Jul 03, 2020 at 09:27:19AM -0700, Shakeel Butt wrote:
> On Fri, Jul 3, 2020 at 8:50 AM Roman Gushchin <guro@fb.com> wrote:
> >
> > On Fri, Jul 03, 2020 at 07:23:14AM -0700, Shakeel Butt wrote:
> > > On Thu, Jul 2, 2020 at 11:35 PM Michal Hocko <mhocko@kernel.org> wrote:
> > > >
> > > > On Thu 02-07-20 08:22:22, Shakeel Butt wrote:
> > > > [...]
> > > > > Interface options:
> > > > > ------------------
> > > > >
> > > > > 1) memcg interface e.g. 'echo 10M > memory.reclaim'
> > > > >
> > > > > + simple
> > > > > + can be extended to target specific type of memory (anon, file, kmem).
> > > > > - most probably restricted to cgroup v2.
> > > > >
> > > > > 2) fadvise(PAGEOUT) on cgroup_dir_fd
> > > > >
> > > > > + more general and applicable to other FSes (actually we are using
> > > > > something similar for tmpfs).
> > > > > + can be extended in future to just age the LRUs instead of reclaim or
> > > > > some new use cases.
> > > >
> > > > Could you explain why memory.high as an interface to trigger pro-active
> > > > memory reclaim is not sufficient. Also memory.low limit to protect
> > > > latency sensitive workloads?
> >
> > I initially liked the proposal, but after some thoughts I've realized
> > that I don't know a good use case where memory.high is less useful.
> > Shakeel, what's the typical use case you're thinking of?
> > Who will use the new interface, and how?
> >
> > >
> > > Yes, we can use memory.high to trigger [proactive] reclaim in a memcg
> > > but note that it can also introduce stalls in the application running
> > > in that memcg. Let's suppose the memory.current of a memcg is 100MiB
> > > and we want to reclaim 20MiB from it, we can set the memory.high to
> > > 80MiB but any allocation attempt from the application running in that
> > > memcg can get stalled/throttled. I want the functionality of the
> > > reclaim without potential stalls.
> >
> > But reclaiming some pagecache/swapping out anon pages can always
> > generate some stalls caused by pagefaults, no?
> >
>
> Thanks for looking into the proposal. Let me answer both of your
> questions together. I have added the two use-cases but let me explain
> the proactive reclaim a bit more as we actually use that in our
> production.
>
> We have defined tolerable refault rates for the applications based on
> their type (latency sensitive or not). Proactive reclaim is triggered
> in the application based on their current refault rates and usage. If
> the current refault rate exceeds the tolerable refault rate then
> stop/slowdown the proactive reclaim.
>
> For the second question, yes, each individual refault can induce the
> stall as well but we have more control on that stall as compared to
> stalls due to reclaim. For us almost all the reclaimable memory is
> anon and we use compression based swap, so, the cost of each refault
> is fixed and a couple of microseconds.
>
> I think the next question is what about the refaults from disk or
> source with highly variable cost. Usually the latency sensitive
> applications remove such uncertainty by mlocking the pages backed by
> such backends (e.g. mlocking the executable) or at least that is the
> case for us.
Got it.
It feels like you're suggesting something similar to memory.high, but with
something like a different set of gfp flags. In other words, the only
difference is which pages can be reclaimed and which cannot. I don't
have a definitive answer here, but I wonder whether we can somehow
generalize the existing interface. E.g. if the problem is the artificially
induced delays, we could have a config option/sysctl/sysfs knob/something else
that disables them. Otherwise we risk ending up with many different kinds
of soft memory limits.
Thanks!
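
For reference, the two mechanisms discussed above can be sketched roughly as
follows. This is only an illustration under stated assumptions, not code from
the proposal: the function names, the halving policy and the 1 MiB floor are
hypothetical. The first helper mirrors the "stop/slowdown when the refault
rate exceeds the tolerable rate" policy; the second mirrors the memory.high
example (memory.current of 100MiB, reclaim 20MiB, so memory.high = 80MiB).

```python
MIB = 1024 * 1024


def next_reclaim_bytes(refault_rate, tolerable_rate, step_bytes,
                       min_step_bytes=MIB):
    """Pick the size of the next proactive reclaim step.

    Hypothetical policy: keep the current step while the observed refault
    rate is within the tolerable rate; otherwise halve the step (slow down),
    and stop entirely (return 0) once the halved step drops below
    min_step_bytes.
    """
    if refault_rate <= tolerable_rate:
        return step_bytes
    halved = step_bytes // 2
    return halved if halved >= min_step_bytes else 0


def memory_high_target(memory_current, reclaim_bytes):
    """Value to write to memory.high so that reclaim_bytes get reclaimed.

    This is the memory.high-based approach from the thread; as noted there,
    allocations in the memcg may be throttled while usage is above it.
    """
    return max(memory_current - reclaim_bytes, 0)


if __name__ == "__main__":
    step = next_reclaim_bytes(refault_rate=5, tolerable_rate=10,
                              step_bytes=20 * MIB)
    print(memory_high_target(100 * MIB, step) // MIB, "MiB")
```

With the proposed interface, the step size would instead be written to
memory.reclaim directly, without lowering memory.high.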