From: Johannes Weiner <hannes-druUgvl0LCNAfugRpC6u6w@public.gmane.org>
To: Yu Zhao <yuzhao-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
Cc: Yosry Ahmed <yosryahmed-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>,
Andrew Morton
<akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org>,
Michal Hocko <mhocko-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>,
Roman Gushchin
<roman.gushchin-fxUVXftIFDnyG1zEObXtfA@public.gmane.org>,
Shakeel Butt <shakeelb-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>,
Muchun Song <songmuchun-EC8Uxl6Npydl57MIdRCFDg@public.gmane.org>,
Greg Thelen <gthelen-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>,
David Rientjes <rientjes-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>,
Cgroups <cgroups-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
Linux-MM <linux-mm-Bw31MaZKKs3YtjvyW6yDsg@public.gmane.org>
Subject: Re: [PATCH v2] mm/vmscan: check references from all memcgs for swapbacked memory
Date: Thu, 6 Oct 2022 11:38:23 -0400
Message-ID: <Yz72b1IjZkzk8CTl@cmpxchg.org>
In-Reply-To: <CAOUHufa+f-RB1Lddu3fQPof=eqduyxM3mcCBuk3OR-Tu=+VN+w-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
On Wed, Oct 05, 2022 at 11:10:37PM -0600, Yu Zhao wrote:
> On Wed, Oct 5, 2022 at 10:19 PM Johannes Weiner <hannes-druUgvl0LCNAfugRpC6u6w@public.gmane.org> wrote:
> >
> > On Wed, Oct 05, 2022 at 03:13:38PM -0600, Yu Zhao wrote:
> > > On Wed, Oct 5, 2022 at 3:02 PM Yosry Ahmed <yosryahmed-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org> wrote:
> > > >
> > > > On Wed, Oct 5, 2022 at 1:48 PM Yu Zhao <yuzhao-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org> wrote:
> > > > >
> > > > > On Wed, Oct 5, 2022 at 11:37 AM Yosry Ahmed <yosryahmed-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org> wrote:
> > > > > >
> > > > > > During page/folio reclaim, we check if a folio is referenced using
> > > > > > folio_referenced() to avoid reclaiming folios that have been recently
> > > > > > accessed (hot memory). The rationale is that this memory is likely to be
> > > > > > accessed soon, and hence reclaiming it will cause a refault.
> > > > > >
> > > > > > For memcg reclaim, we currently only check accesses to the folio from
> > > > > > processes in the subtree of the target memcg. This behavior was
> > > > > > originally introduced by commit bed7161a519a ("Memory controller: make
> > > > > > page_referenced() cgroup aware") a long time ago. Back then, refaulted
> > > > > > pages would get charged to the memcg of the process that was faulting them
> > > > > > in. It made sense to only consider accesses coming from processes in the
> > > > > > subtree of target_mem_cgroup. If a page was charged to memcg A but only
> > > > > > being accessed by a sibling memcg B, we would reclaim it if memcg A is
> > > > > > > the reclaim target. memcg B can then fault it back in and get charged
> > > > > > for it appropriately.
> > > > > >
> > > > > > Today, this behavior still makes sense for file pages. However, unlike
> > > > > > file pages, when swapbacked pages are refaulted they are charged to the
> > > > > > memcg that was originally charged for them during swapping out. Which
> > > > > > means that if a swapbacked page is charged to memcg A but only used by
> > > > > > memcg B, and we reclaim it from memcg A, it would simply be faulted back
> > > > > > in and charged again to memcg A once memcg B accesses it. In that sense,
> > > > > > accesses from all memcgs matter equally when considering if a swapbacked
> > > > > > page/folio is a viable reclaim target.
> > > > > >
> > > > > > Modify folio_referenced() to always consider accesses from all memcgs if
> > > > > > the folio is swapbacked.
> > > > >
> > > > > It seems to me this change can potentially increase the number of
> > > > > zombie memcgs. Any risk assessment done on this?
> > > >
> > > > Do you mind elaborating the case(s) where this could happen? Is this
> > > > the cgroup v1 case in mem_cgroup_swapout() where we are reclaiming
> > > > from a zombie memcg and swapping out would let us move the charge to
> > > > the parent?
> > >
> > > The scenario is quite straightforward: for a page charged to memcg A
> > > and also actively used by memcg B, if we don't ignore the access from
> > > memcg B, we won't be able to reclaim it after memcg A is deleted.
> >
> > This patch changes the behavior of limit-induced reclaim. There is no
> > limit reclaim on A after it's been deleted. And parental/global
> > reclaim has always recognized outside references.
>
> We use memory.reclaim to scrape memcgs right before rmdir so that they
> are unlikely to stick around. Otherwise our job scheduler would see
> less available memory and become less eager to increase load. This in
> turn reduces the chance of global reclaim, and deleted memcgs would
> stick around even longer.

Thanks for the context.

It's not great that we have to design reclaim policy around this
implementation detail of past-EOF-pins. But such is life until we get
rid of them.
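
For readers following the thread, the rule under discussion can be
sketched as a small userspace model. This is only an illustration under
stated assumptions, not the kernel code or the patch itself: the types
memcg_model and folio_model and the helpers in_subtree(),
count_references() and sibling_access_refs() are hypothetical names
invented here. The model counts an access only if it comes from the
subtree of the reclaim target, except that a swapbacked folio counts
accesses from every memcg, as the patch description proposes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a memcg: only the parent link matters here. */
struct memcg_model {
    const struct memcg_model *parent;
};

/* Hypothetical stand-in for a folio and the memcgs that accessed it. */
struct folio_model {
    bool swapbacked;
    const struct memcg_model *const *accessors;
    size_t nr_accessors;
};

/* Return true if cg is root itself or a descendant of root. */
static bool in_subtree(const struct memcg_model *cg,
                       const struct memcg_model *root)
{
    for (; cg != NULL; cg = cg->parent)
        if (cg == root)
            return true;
    return false;
}

/*
 * Count the accesses that matter when reclaiming on behalf of target.
 * A NULL target models global reclaim, which counts every access.
 */
static int count_references(const struct folio_model *folio,
                            const struct memcg_model *target)
{
    int refs = 0;

    assert(folio != NULL);

    /* Proposed rule: swapbacked folios ignore the target filter. */
    if (folio->swapbacked)
        target = NULL;

    for (size_t i = 0; i < folio->nr_accessors; i++)
        if (target == NULL || in_subtree(folio->accessors[i], target))
            refs++;

    return refs;
}

/*
 * Scenario from the thread: the folio is charged to memcg A, is accessed
 * only by sibling memcg B, and memcg A is the reclaim target.
 */
int sibling_access_refs(bool swapbacked)
{
    const struct memcg_model root = { NULL };
    const struct memcg_model a = { &root };
    const struct memcg_model b = { &root };
    const struct memcg_model *const accessors[] = { &b };
    const struct folio_model folio = { swapbacked, accessors, 1 };

    return count_references(&folio, &a);
}
```

In this model sibling_access_refs(false) returns 0, so a file folio used
only by B stays reclaimable from A, while sibling_access_refs(true)
returns 1, so a swapbacked folio used only by B is treated as referenced.
The second result is the behavior the zombie-memcg concern above is about.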