From: Jesper Dangaard Brouer <brouer@redhat.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: linux-mm@kvack.org, Christoph Lameter <cl@linux.com>,
Alexander Duyck <alexander.duyck@gmail.com>,
Pekka Enberg <penberg@kernel.org>,
netdev@vger.kernel.org, Joonsoo Kim <iamjoonsoo.kim@lge.com>,
David Rientjes <rientjes@google.com>,
brouer@redhat.com, Hannes Frederic Sowa <hannes@redhat.com>
Subject: Re: [MM PATCH V4.1 5/6] slub: support for bulk free with SLUB freelists
Date: Fri, 2 Oct 2015 15:40:39 +0200
Message-ID: <20151002154039.69f82bdc@redhat.com>
In-Reply-To: <20151002114118.75aae2f9@redhat.com>

On Fri, 2 Oct 2015 11:41:18 +0200
Jesper Dangaard Brouer <brouer@redhat.com> wrote:
> On Thu, 1 Oct 2015 15:10:15 -0700
> Andrew Morton <akpm@linux-foundation.org> wrote:
>
> > On Wed, 30 Sep 2015 13:44:19 +0200 Jesper Dangaard Brouer <brouer@redhat.com> wrote:
> >
> > > Make it possible to free a freelist with several objects by adjusting
> > > API of slab_free() and __slab_free() to have head, tail and an objects
> > > counter (cnt).
> > >
> > > Tail being NULL indicate single object free of head object. This
> > > allow compiler inline constant propagation in slab_free() and
> > > slab_free_freelist_hook() to avoid adding any overhead in case of
> > > single object free.
> > >
> > > This allows a freelist with several objects (all within the same
> > > slab-page) to be free'ed using a single locked cmpxchg_double in
> > > __slab_free() and with an unlocked cmpxchg_double in slab_free().
> > >
> > > Object debugging on the free path is also extended to handle these
> > > freelists. When CONFIG_SLUB_DEBUG is enabled it will also detect if
> > > objects don't belong to the same slab-page.
> > >
> > > These changes are needed for the next patch to bulk free the detached
> > > freelists it introduces and constructs.
> > >
> > > Micro benchmarking showed no performance reduction due to this change,
> > > when debugging is turned off (compiled with CONFIG_SLUB_DEBUG).
> > >
> >
> > checkpatch says
> >
> > WARNING: Avoid crashing the kernel - try using WARN_ON & recovery code rather than BUG() or BUG_ON()
> > #205: FILE: mm/slub.c:2888:
> > + BUG_ON(!size);
> >
> >
> > Linus will get mad at you if he finds out, and we wouldn't want that.
> >
> > --- a/mm/slub.c~slub-optimize-bulk-slowpath-free-by-detached-freelist-fix
> > +++ a/mm/slub.c
> > @@ -2885,7 +2885,8 @@ static int build_detached_freelist(struc
> >  /* Note that interrupts must be enabled when calling this function. */
> >  void kmem_cache_free_bulk(struct kmem_cache *s, size_t size, void **p)
> >  {
> > -	BUG_ON(!size);
> > +	if (WARN_ON(!size))
> > +		return;
> >
> >  	do {
> >  		struct detached_freelist df;
> > _
>
> My problem with this change is that WARN_ON generates (slightly) larger
> code size, which is critical for instruction-cache usage...
>
> [net-next-mm]$ ./scripts/bloat-o-meter vmlinux-with_BUG_ON vmlinux-with_WARN_ON
> add/remove: 0/0 grow/shrink: 1/0 up/down: 17/0 (17)
> function                                     old     new   delta
> kmem_cache_free_bulk                         438     455     +17
>
> My IP-forwarding benchmark is actually a very challenging use-case,
> because the code path "size" a packet has to travel is larger than the
> instruction-cache of the CPU.
>
> Thus, I need to introduce new code like this patch and at the same time
> reduce the number of instruction-cache misses/usage. In this
> case we solve the problem by kmem_cache_free_bulk() not getting called
> too often. Thus, +17 bytes will hopefully not matter too much... but on
> the other hand we sort-of know that calling kmem_cache_free_bulk() will
> cause icache misses.

I just tested this change on top of my net-use-case patchset... and for
some strange reason the code with this WARN_ON is faster and has far
fewer icache misses (1,278,276 vs 2,719,158 L1-icache-load-misses).

Thus, I think we should keep your fix.

I cannot explain why using WARN_ON() is better and causes fewer icache
misses. And I hate it when I don't understand every detail.

My theory, after reading the assembler code, is that the UD2
instruction (from BUG_ON) causes some kind of icache decoder stall
(Intel experts???). Now that should not be a problem, as UD2 is
obviously placed as an unlikely branch and left at the end of the asm
function. But the call to __slab_free() is also placed at the end
of the asm function (it gets inlined from slab_free() as unlikely). And
it is actually fairly likely that bulking calls __slab_free() (the SLUB
slowpath call).
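
For readers who want to inspect the two variants outside the kernel, here
is a minimal userspace sketch, assuming GCC or Clang. MY_BUG_ON,
MY_WARN_ON, my_warn_on, free_bulk_bug and free_bulk_warn are hypothetical
stand-ins, not the kernel's definitions, and the loop body only imitates
a bulk free. On x86 these compilers typically emit UD2 for
__builtin_trap(), and the unlikely() hint typically moves the cold path
to the end of the function, which is the layout described above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical userspace stand-ins; the kernel's real BUG_ON()/WARN_ON()
 * are defined per architecture and differ from these. */
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Fatal check: on x86, __builtin_trap() typically compiles to UD2. */
#define MY_BUG_ON(cond) do { if (unlikely(cond)) __builtin_trap(); } while (0)

static int warn_count;	/* number of warnings reported so far */

/* Report the problem and hand the condition back so the caller can recover. */
static int my_warn_on(int cond, const char *file, int line)
{
	if (unlikely(cond)) {
		warn_count++;
		fprintf(stderr, "WARNING: at %s:%d\n", file, line);
	}
	return cond;
}
#define MY_WARN_ON(cond) my_warn_on(!!(cond), __FILE__, __LINE__)

/* Variant 1: a zero size traps (the BUG_ON style). */
static size_t free_bulk_bug(size_t size, void **p)
{
	size_t freed = 0;

	assert(p != NULL);
	MY_BUG_ON(!size);
	for (size_t i = 0; i < size; i++) {
		p[i] = NULL;	/* stand-in for freeing one object */
		freed++;
	}
	return freed;
}

/* Variant 2: a zero size warns and returns early (the WARN_ON style). */
static size_t free_bulk_warn(size_t size, void **p)
{
	size_t freed = 0;

	assert(p != NULL);
	if (MY_WARN_ON(!size))
		return 0;
	for (size_t i = 0; i < size; i++) {
		p[i] = NULL;	/* stand-in for freeing one object */
		freed++;
	}
	return freed;
}
```

Compiling this with -O2 -S and comparing the two functions shows where
the compiler places the trap and the warning call relative to the loop.
It only illustrates code placement; it does not reproduce the
icache-miss numbers measured above.
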
--
Best regards,
Jesper Dangaard Brouer
MSc.CS, Principal Kernel Engineer at Red Hat
Author of http://www.iptv-analyzer.org
LinkedIn: http://www.linkedin.com/in/brouer