From: Vladimir Davydov <vdavydov@parallels.com>
To: Christoph Lameter <cl@linux.com>
Cc: hannes@cmpxchg.org, mhocko@suse.cz, akpm@linux-foundation.org,
linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [PATCH RFC 1/3] slub: keep full slabs on list for per memcg caches
Date: Fri, 16 May 2014 17:06:30 +0400
Message-ID: <20140516130629.GE32113@esperanza>
In-Reply-To: <alpine.DEB.2.10.1405151011210.24665@gentwo.org>

On Thu, May 15, 2014 at 10:15:10AM -0500, Christoph Lameter wrote:
> On Thu, 15 May 2014, Vladimir Davydov wrote:
>
> > > That will significantly impact the fastpaths for alloc and free.
> > >
> > > Also a pretty significant change the logic of the fastpaths since they
> > > were not designed to handle the full lists. In debug mode all operations
> > > were only performed by the slow paths and only the slow paths so far
> > > supported tracking full slabs.
> >
> > That's the minimal price we have to pay for slab re-parenting, because
> > w/o it we won't be able to look up for all slabs of a particular per
> > memcg cache. The question is, can it be tolerated or I'd better try some
> > other way?
>
> AFACIT these modifications all together will have a significant impact on
> performance.
>
> You could avoid the refcounting on free relying on the atomic nature of
> cmpxchg operations. If you zap the per cpu slab then the fast path will be
> forced to fall back to the slowpaths where you could do what you need to
> do.

Hmm, looking at __slab_free once again, I tend to agree that we could
rely on cmpxchg to do re-parenting: we could freeze all slabs of the
cache being re-parented, forcing every on-going kfree to do only a
cmpxchg without touching any lists or taking any locks, and then
unfreeze all the frozen slabs onto the target cache. The ugly "slow
mode" I introduced in this patch set would then be unnecessary.

But without ref-counting, how can we make sure that all kfrees to the
cache we are going to re-parent have completed, so that it can be
safely destroyed? An example:

CPU0:                                           CPU1:
-----                                           -----
kfree(obj):
  page = virt_to_head_page(obj)
  s = page->slab_cache
  slab_free(s, page, obj):
    <<< gets preempted here
                                                reparent_slab_cache:
                                                  for each slab page:
                                                    [...]
                                                    page->slab_cache = target_cache;
                                                kmem_cache_destroy(old_cache)
    <<< continues execution
    c = s->cpu_slab  /* s points to the previous owner cache,
                        so we use-after-free here */

If kfree were not preemptible, we could make reparent_slab_cache wait
for all cpus to schedule() before destroying the cache to avoid this;
but since it is, we need ref-counting...

Thanks.
> There is no tracking of full slabs without adding much more logic to the
> fastpath. You could force any operation that affects tne full list into
> the slow path. But that also would have an impact.