From: Michal Hocko <mhocko@suse.cz>
To: Vladimir Davydov <vdavydov@parallels.com>
Cc: akpm@linux-foundation.org, hannes@cmpxchg.org, glommer@gmail.com,
linux-kernel@vger.kernel.org, linux-mm@kvack.org,
devel@openvz.org
Subject: Re: [PATCH RESEND -mm 02/12] memcg: fix race in memcg cache destruction path
Date: Mon, 17 Mar 2014 17:42:03 +0100 [thread overview]
Message-ID: <20140317164203.GC30623@dhcp22.suse.cz> (raw)
In-Reply-To: <94fc308b9074e45a2aac7a06cf357a33c5d97c9f.1394708827.git.vdavydov@parallels.com>
On Thu 13-03-14 19:06:40, Vladimir Davydov wrote:
> We schedule memcg cache shrink+destruction work (memcg_params::destroy)
> from two places: when we turn memcg offline
> (mem_cgroup_destroy_all_caches) and when the last page of the cache is
> freed (memcg_params::nr_pages reachs zero, see memcg_release_pages,
> mem_cgroup_destroy_cache).
This is just ugly! Why do we mem_cgroup_destroy_all_caches from the
offline code at all? Just calling kmem_cache_shrink and then wait for
the last pages to go away should be sufficient to fix this, no?
Whether the current code is good (no it's not) is another question. But
this should be fixed also in the stable trees (is the bug there since
the very beginning?) so the fix should be as simple as possible IMO.
So if there is a simpler solution I would prefer it. But I am drowning
in the kmem trickiness spread out all over the place so I might be
missing something very easily.
> Since the latter can happen while the work
> scheduled from mem_cgroup_destroy_all_caches is in progress or still
> pending, we need to be cautious to avoid races there - we should
> accurately bail out in one of those functions if we see that the other
> is in progress. Currently we only check if memcg_params::nr_pages is 0
> in the destruction work handler and do not destroy the cache if so. But
> that's not enough. An example of race we can get is shown below:
>
> CPU0 CPU1
> ---- ----
> kmem_cache_destroy_work_func: memcg_release_pages:
> atomic_sub_and_test(1<<order, &s->
> memcg_params->nr_pages)
> /* reached 0 => schedule destroy */
>
> atomic_read(&cachep->memcg_params->nr_pages)
> /* 0 => going to destroy the cache */
> kmem_cache_destroy(cachep);
>
> mem_cgroup_destroy_cache(s):
> /* the cache was destroyed on CPU0
> - use after free */
>
> An obvious way to fix this would be substituting the nr_pages counter
> with a reference counter and make memcg take a reference. The cache
> destruction would be then scheduled from that thread which decremented
> the refcount to 0. Generally, this is what this patch does, but there is
> one subtle thing here - the work handler serves not only for cache
> destruction, it also shrinks the cache if it's still in use (we can't
> call shrink directly from mem_cgroup_destroy_all_caches due to locking
> dependencies). We handle this by noting that we should only issue shrink
> if called from mem_cgroup_destroy_all_caches, because the cache is
> already empty when we release its last page. And if we drop the
> reference taken by memcg in the work handler, we can detect who exactly
> scheduled the worker - mem_cgroup_destroy_all_caches or
> memcg_release_pages.
>
> Signed-off-by: Vladimir Davydov <vdavydov@parallels.com>
> Cc: Johannes Weiner <hannes@cmpxchg.org>
> Cc: Michal Hocko <mhocko@suse.cz>
> Cc: Glauber Costa <glommer@gmail.com>
[...]
--
Michal Hocko
SUSE Labs
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
next prev parent reply other threads:[~2014-03-17 16:42 UTC|newest]
Thread overview: 21+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-03-13 15:06 [PATCH RESEND -mm 00/12] kmemcg reparenting Vladimir Davydov
2014-03-13 15:06 ` [PATCH RESEND -mm 01/12] memcg: flush cache creation works before memcg cache destruction Vladimir Davydov
2014-03-17 16:07 ` Michal Hocko
2014-03-18 8:14 ` Vladimir Davydov
2014-03-18 8:55 ` Michal Hocko
2014-03-18 9:28 ` Vladimir Davydov
2014-03-13 15:06 ` [PATCH RESEND -mm 02/12] memcg: fix race in memcg cache destruction path Vladimir Davydov
2014-03-17 16:42 ` Michal Hocko [this message]
2014-03-18 8:19 ` Vladimir Davydov
2014-03-18 10:01 ` Michal Hocko
2014-03-18 12:14 ` Vladimir Davydov
2014-03-13 15:06 ` [PATCH RESEND -mm 03/12] memcg: fix root vs memcg cache destruction race Vladimir Davydov
2014-03-13 15:06 ` [PATCH RESEND -mm 04/12] memcg: move slab caches list/mutex init to memcg creation Vladimir Davydov
2014-03-13 15:06 ` [PATCH RESEND -mm 05/12] memcg: add pointer from memcg_cache_params to cache Vladimir Davydov
2014-03-13 15:06 ` [PATCH RESEND -mm 06/12] memcg: keep all children of each root cache on a list Vladimir Davydov
2014-03-13 15:06 ` [PATCH RESEND -mm 07/12] memcg: rework slab charging Vladimir Davydov
2014-03-13 15:06 ` [PATCH RESEND -mm 08/12] memcg: do not charge kmalloc_large allocations Vladimir Davydov
2014-03-13 15:06 ` [PATCH RESEND -mm 09/12] fork: do not charge thread_info to kmemcg Vladimir Davydov
2014-03-13 15:06 ` [PATCH RESEND -mm 10/12] memcg: kill GFP_KMEMCG and stuff Vladimir Davydov
2014-03-13 15:06 ` [PATCH RESEND -mm 11/12] memcg: reparent slab on css offline Vladimir Davydov
2014-03-13 15:06 ` [PATCH RESEND -mm 12/12] slub: make sure all memcg caches have unique names on sysfs Vladimir Davydov
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20140317164203.GC30623@dhcp22.suse.cz \
--to=mhocko@suse.cz \
--cc=akpm@linux-foundation.org \
--cc=devel@openvz.org \
--cc=glommer@gmail.com \
--cc=hannes@cmpxchg.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=vdavydov@parallels.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).