From: Vladimir Davydov <vdavydov@parallels.com>
To: Christoph Lameter <cl@linux.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
Pekka Enberg <penberg@kernel.org>,
David Rientjes <rientjes@google.com>,
Joonsoo Kim <iamjoonsoo.kim@lge.com>,
Johannes Weiner <hannes@cmpxchg.org>,
Michal Hocko <mhocko@suse.cz>,
linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH -mm 1/3] slub: don't fail kmem_cache_shrink if slab placement optimization fails
Date: Mon, 26 Jan 2015 20:01:47 +0300 [thread overview]
Message-ID: <20150126170147.GB28978@esperanza> (raw)
In-Reply-To: <alpine.DEB.2.11.1501260944550.15849@gentwo.org>
Hi Christoph,
On Mon, Jan 26, 2015 at 09:48:00AM -0600, Christoph Lameter wrote:
> On Mon, 26 Jan 2015, Vladimir Davydov wrote:
>
> > SLUB's kmem_cache_shrink not only removes empty slabs from the cache,
> > but also sorts slabs by the number of objects in-use to cope with
> > fragmentation. To achieve that, it tries to allocate a temporary array.
> > If it fails, it will abort the whole procedure.
>
> I do not think its worth optimizing this. If we cannot allocate even a
> small object then the system is in an extremely bad state anyways.
Hmm, I've just checked my /proc/slabinfo and seen that I have 512
objects per slab at max, so that the temporary array will be 2 pages at
max. So you're right - this kmalloc will never fail on my system, simply
because we never fail GFP_KERNEL allocations of order < 3. However,
theoretically we can have as much as MAX_OBJS_PER_PAGE=32767 objects per
slab, which would result in a huge allocation.
Anyways, I think that silently relying on the fact that the allocator
never fails small allocations is kind of unreliable. What if this
behavior will change one day? So I'd prefer to either make
kmem_cache_shrink fall back to using a variable on stack in case of the
kmalloc failure, like this patch does, or place an explicit BUG_ON after
it. The latter looks dangerous to me, because, as I mentioned above, I'm
not sure that we always have less than 2048 objects per slab.
>
> > @@ -3400,7 +3407,9 @@ int __kmem_cache_shrink(struct kmem_cache *s)
> > * list_lock. page->inuse here is the upper limit.
> > */
> > list_for_each_entry_safe(page, t, &n->partial, lru) {
> > - list_move(&page->lru, slabs_by_inuse + page->inuse);
> > + if (page->inuse < objects)
> > + list_move(&page->lru,
> > + slabs_by_inuse + page->inuse);
> > if (!page->inuse)
> > n->nr_partial--;
> > }
>
> The condition is always true. A page that has page->inuse == objects
> would not be on the partial list.
>
This is in case we failed to allocate the slabs_by_inuse array. We only
have a list for empty slabs then (on stack).
Thanks,
Vladimir
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
next prev parent reply other threads:[~2015-01-26 17:02 UTC|newest]
Thread overview: 24+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-01-26 12:55 [PATCH -mm 0/3] slub: make dead caches discard free slabs immediately Vladimir Davydov
2015-01-26 12:55 ` [PATCH -mm 1/3] slub: don't fail kmem_cache_shrink if slab placement optimization fails Vladimir Davydov
2015-01-26 15:48 ` Christoph Lameter
2015-01-26 17:01 ` Vladimir Davydov [this message]
2015-01-26 18:24 ` Christoph Lameter
2015-01-26 19:36 ` Vladimir Davydov
2015-01-26 19:53 ` Christoph Lameter
2015-01-27 12:58 ` Vladimir Davydov
2015-01-27 17:02 ` Christoph Lameter
2015-01-28 15:00 ` Vladimir Davydov
2015-01-26 12:55 ` [PATCH -mm 2/3] slab: zap kmem_cache_shrink return value Vladimir Davydov
2015-01-26 15:49 ` Christoph Lameter
2015-01-26 17:04 ` Vladimir Davydov
2015-01-26 18:26 ` Christoph Lameter
2015-01-26 19:48 ` Vladimir Davydov
2015-01-26 19:55 ` Christoph Lameter
2015-01-26 20:16 ` Vladimir Davydov
2015-01-26 20:28 ` Christoph Lameter
2015-01-26 20:43 ` Vladimir Davydov
2015-01-26 12:55 ` [PATCH -mm 3/3] slub: make dead caches discard free slabs immediately Vladimir Davydov
2015-01-27 8:00 ` Joonsoo Kim
2015-01-27 8:23 ` Vladimir Davydov
2015-01-27 9:21 ` Joonsoo Kim
2015-01-27 9:28 ` Vladimir Davydov
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20150126170147.GB28978@esperanza \
--to=vdavydov@parallels.com \
--cc=akpm@linux-foundation.org \
--cc=cl@linux.com \
--cc=hannes@cmpxchg.org \
--cc=iamjoonsoo.kim@lge.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=mhocko@suse.cz \
--cc=penberg@kernel.org \
--cc=rientjes@google.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).