From: Roman Gushchin <guro@fb.com>
To: Shakeel Butt <shakeelb@google.com>
Cc: Roman Gushchin <guroan@gmail.com>,
Andrew Morton <akpm@linux-foundation.org>,
Linux MM <linux-mm@kvack.org>,
LKML <linux-kernel@vger.kernel.org>,
Kernel Team <Kernel-team@fb.com>,
Johannes Weiner <hannes@cmpxchg.org>,
Michal Hocko <mhocko@kernel.org>, Rik van Riel <riel@surriel.com>,
"david@fromorbit.com" <david@fromorbit.com>,
Christoph Lameter <cl@linux.com>,
Pekka Enberg <penberg@kernel.org>,
Vladimir Davydov <vdavydov.dev@gmail.com>,
Cgroups <cgroups@vger.kernel.org>
Subject: Re: [PATCH 4/5] mm: rework non-root kmem_cache lifecycle management
Date: Thu, 18 Apr 2019 03:07:37 +0000
Message-ID: <20190418030729.GA5038@castle>
In-Reply-To: <CALvZod6UiTeN40RgpE-4zE5zagSifqh3o_AXaw8o-ubVUWf=4w@mail.gmail.com>
On Wed, Apr 17, 2019 at 06:55:12PM -0700, Shakeel Butt wrote:
> On Wed, Apr 17, 2019 at 5:39 PM Roman Gushchin <guro@fb.com> wrote:
> >
> > On Wed, Apr 17, 2019 at 04:41:01PM -0700, Shakeel Butt wrote:
> > > On Wed, Apr 17, 2019 at 2:55 PM Roman Gushchin <guroan@gmail.com> wrote:
> > > >
> > > > This commit makes several important changes in the lifecycle
> > > > of a non-root kmem_cache, which also affect the lifecycle
> > > > of a memory cgroup.
> > > >
> > > > Currently each charged slab page has a page->mem_cgroup pointer
> > > > to the memory cgroup and holds a reference to it.
> > > > Kmem_caches are held by the cgroup. On offlining, empty kmem_caches
> > > > are freed; all others are freed on cgroup release.
> > >
> > > No, they are not freed (i.e. destroyed) on offlining, only
> > > deactivated. All memcg kmem_caches are freed/destroyed on memcg's
> > > css_free.
> >
> > You're right, my bad. I was thinking about the corresponding sysfs entry
> > when I was writing it. We try to free it from the deactivation path too.
> >
> > >
> > > >
> > > > So the current scheme can be illustrated as:
> > > > page->mem_cgroup->kmem_cache.
> > > >
> > > > To implement the slab memory reparenting we need to invert the scheme
> > > > into: page->kmem_cache->mem_cgroup.
> > > >
> > > > Let's make every page hold a reference to the kmem_cache (we
> > > > already have a stable pointer), and make each kmem_cache hold a single
> > > > reference to the memory cgroup.
> > >
> > > What about memcg_kmem_get_cache()? That function assumes that by
> > > taking a reference on the memcg, its kmem_caches will stay. I think you
> > > need to take a reference on the kmem_cache in memcg_kmem_get_cache()
> > > within the rcu lock where you get the memcg through css_tryget_online.
> >
> > Yeah, a very good question.
> >
> > I believe it's safe because css_tryget_online() guarantees that
> > the cgroup is online and won't go offline before css_put() in
> > slab_post_alloc_hook(). I do initialize kmem_cache's refcount to 1
> > and drop it on offlining, so it protects the online kmem_cache.
> >
>
> Let's suppose a thread doing a remote charging calls
> memcg_kmem_get_cache() and gets an empty kmem_cache of the remote
> memcg having refcnt equal to 1. That thread got a reference on the
> remote memcg but no reference on the kmem_cache. Let's suppose that
> thread got stuck in reclaim and was scheduled away. In the meantime
> that remote memcg got offlined and decremented the refcnt of all of
> its kmem_caches. The empty kmem_cache which the thread stuck in
> reclaim has a pointer to can get deleted, and the thread may end up
> using an already destroyed kmem_cache after coming back from reclaim.
>
> I think the above situation is possible unless the thread gets the
> reference on the kmem_cache in memcg_kmem_get_cache().

Yes, you're right, and I was writing nonsense: css_tryget_online()
can't prevent the cgroup from being offlined.

So, the problem with getting a reference in memcg_kmem_get_cache()
is that it's an atomic operation on the hot path, something I'd like
to avoid.

I can make the refcounter percpu, but it'll add some complexity and size
to the kmem_cache object. Still an option, of course.

I wonder if we can use rcu_read_lock() instead, and bump the refcounter
only if we're going into reclaim.
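
Roughly, the "pin only before sleeping" part could look like the userspace
model below. This is an untested sketch, not kernel code: struct cache_model
and the cache_*() helpers are made-up names, C11 atomics stand in for the
kernel's refcounting, and RCU itself is not modelled. It assumes that the
real cache is only freed after an RCU grace period, so that a lookup under
rcu_read_lock() can safely attempt the tryget.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a non-root kmem_cache; not the kernel struct. */
struct cache_model {
	atomic_int refcnt;	/* starts at 1; dropped when the memcg goes offline */
	bool destroyed;		/* set when the last reference is dropped */
};

static void cache_init(struct cache_model *c)
{
	atomic_init(&c->refcnt, 1);
	c->destroyed = false;
}

/*
 * Take a reference unless the count has already reached zero.
 * Meant to be called only on the slow path, right before the
 * allocating thread may sleep in reclaim.
 */
static bool cache_tryget(struct cache_model *c)
{
	int old = atomic_load(&c->refcnt);

	while (old > 0) {
		/* On failure, 'old' is refreshed with the current value. */
		if (atomic_compare_exchange_weak(&c->refcnt, &old, old + 1))
			return true;
	}
	return false;	/* cache is going away; the caller must fall back */
}

static void cache_put(struct cache_model *c)
{
	int old = atomic_fetch_sub(&c->refcnt, 1);

	assert(old > 0);
	if (old == 1)
		c->destroyed = true;	/* the kernel would release the cache here */
}

/* Offlining drops the initial reference taken in cache_init(). */
static void cache_offline(struct cache_model *c)
{
	cache_put(c);
}
```

With this, a thread that pinned the cache before going into reclaim keeps
it alive across an offline, and a tryget after the last put fails, so the
caller has to fall back instead of touching a destroyed cache.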

What do you think?

Thanks!