From: Jesper Dangaard Brouer <brouer@redhat.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: linux-mm@kvack.org, Christoph Lameter <cl@linux.com>,
Alexander Duyck <alexander.duyck@gmail.com>,
Pekka Enberg <penberg@kernel.org>,
netdev@vger.kernel.org, Joonsoo Kim <iamjoonsoo.kim@lge.com>,
David Rientjes <rientjes@google.com>,
brouer@redhat.com, Hannes Frederic Sowa <hannes@redhat.com>
Subject: Re: [MM PATCH V4.1 5/6] slub: support for bulk free with SLUB freelists
Date: Fri, 2 Oct 2015 15:40:39 +0200 [thread overview]
Message-ID: <20151002154039.69f82bdc@redhat.com> (raw)
In-Reply-To: <20151002114118.75aae2f9@redhat.com>
On Fri, 2 Oct 2015 11:41:18 +0200
Jesper Dangaard Brouer <brouer@redhat.com> wrote:
> On Thu, 1 Oct 2015 15:10:15 -0700
> Andrew Morton <akpm@linux-foundation.org> wrote:
>
> > On Wed, 30 Sep 2015 13:44:19 +0200 Jesper Dangaard Brouer <brouer@redhat.com> wrote:
> >
> > > Make it possible to free a freelist with several objects by adjusting
> > > the API of slab_free() and __slab_free() to take a head, a tail and an
> > > object counter (cnt).
> > >
> > > Tail being NULL indicates a single-object free of the head object.
> > > This allows compiler inline constant propagation in slab_free() and
> > > slab_free_freelist_hook() to avoid adding any overhead in the
> > > single-object free case.
> > >
> > > This allows a freelist with several objects (all within the same
> > > slab-page) to be freed using a single locked cmpxchg_double in
> > > __slab_free() and an unlocked cmpxchg_double in slab_free().
> > >
> > > Object debugging on the free path is also extended to handle these
> > > freelists. When CONFIG_SLUB_DEBUG is enabled it will also detect if
> > > objects don't belong to the same slab-page.
> > >
> > > These changes are needed for the next patch to bulk free the detached
> > > freelists it introduces and constructs.
> > >
> > > Micro benchmarking showed no performance reduction due to this
> > > change when runtime debugging is turned off (compiled with
> > > CONFIG_SLUB_DEBUG).
> > >
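[Editorial sketch: the head/tail/cnt API change described above can be modelled in a simplified, standalone form. The names `slab_free_sketch`, `struct obj` and `struct cache` are hypothetical; real SLUB splices the chain with a cmpxchg_double on the slab-page freelist rather than a plain store.]

```c
#include <assert.h>
#include <stddef.h>

/* Simplified model of the head/tail/cnt idea: objects embed a
 * "next" pointer forming a freelist. tail == NULL means a
 * single-object free of head; otherwise the whole detached chain
 * head..tail (cnt objects) is spliced on in one operation. */
struct obj { struct obj *next; };
struct cache { struct obj *freelist; size_t nr_free; };

static void slab_free_sketch(struct cache *c, struct obj *head,
			     struct obj *tail, size_t cnt)
{
	if (!tail) {	/* constant-propagated away when the caller
			 * passes a literal NULL tail */
		tail = head;
		cnt = 1;
	}
	/* splice the chain onto the cache freelist; SLUB performs
	 * this step as one (un)locked cmpxchg_double */
	tail->next = c->freelist;
	c->freelist = head;
	c->nr_free += cnt;
}
```

Because tail is a compile-time constant NULL at single-object call sites, the compiler can drop the branch entirely there, which is why the change adds no overhead on the common path.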
> >
> > checkpatch says
> >
> > WARNING: Avoid crashing the kernel - try using WARN_ON & recovery code rather than BUG() or BUG_ON()
> > #205: FILE: mm/slub.c:2888:
> > + BUG_ON(!size);
> >
> >
> > Linus will get mad at you if he finds out, and we wouldn't want that.
> >
> > --- a/mm/slub.c~slub-optimize-bulk-slowpath-free-by-detached-freelist-fix
> > +++ a/mm/slub.c
> > @@ -2885,7 +2885,8 @@ static int build_detached_freelist(struc
> > /* Note that interrupts must be enabled when calling this function. */
> > void kmem_cache_free_bulk(struct kmem_cache *s, size_t size, void **p)
> > {
> > - BUG_ON(!size);
> > + if (WARN_ON(!size))
> > + return;
> >
> > do {
> > struct detached_freelist df;
> > _
>
> My problem with this change is that WARN_ON generates (slightly) larger
> code size, which is critical for instruction-cache usage...
>
> [net-next-mm]$ ./scripts/bloat-o-meter vmlinux-with_BUG_ON vmlinux-with_WARN_ON
> add/remove: 0/0 grow/shrink: 1/0 up/down: 17/0 (17)
> function old new delta
> kmem_cache_free_bulk 438 455 +17
>
> My IP-forwarding benchmark is actually a very challenging use-case,
> because the "size" of the code path a packet has to travel is larger
> than the instruction cache of the CPU.
>
> Thus, I need to introduce new code like this patch and at the same
> time reduce instruction-cache misses/usage. In this case we solve the
> problem by not calling kmem_cache_free_bulk() too often. Thus, +17
> bytes will hopefully not matter too much... but on the other hand we
> sort-of know that calling kmem_cache_free_bulk() will cause icache
> misses.
I just tested this change on top of my net-use-case patchset... and for
some strange reason the code with this WARN_ON is faster and has far
fewer icache misses (1,278,276 vs 2,719,158 L1-icache-load-misses).
Thus, I think we should keep your fix.

I cannot explain why using WARN_ON() is better and causes fewer icache
misses. And I hate it when I don't understand every detail.
My theory, after reading the assembler code, is that the UD2
instruction (from BUG_ON) causes some kind of icache decoder stall
(Intel experts???). Now that should not be a problem, as UD2 is
obviously placed in an unlikely branch and left at the end of the asm
function. But the call to __slab_free() is also placed at the end of
the asm function (it gets inlined from slab_free() as unlikely). And it
is actually fairly likely that bulking calls __slab_free() (the slub
slowpath call).
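[Editorial sketch: the "unlikely branch" placement discussed above comes from the kernel's unlikely() macro, which wraps GCC's __builtin_expect to push the cold path out of the hot instruction stream. A minimal, self-contained illustration, not kernel code; `checked_div` is a hypothetical example function.]

```c
#include <assert.h>

/* unlikely() as defined in the kernel: a hint that lets the
 * compiler lay the cold block out away from the hot path */
#define unlikely(x) __builtin_expect(!!(x), 0)

/* hypothetical example: the error path is marked as cold */
static int checked_div(int a, int b, int *out)
{
	if (unlikely(b == 0))	/* cold path, emitted at the end */
		return -1;
	*out = a / b;
	return 0;
}
```

The semantics are unchanged; only the block layout (and thus icache footprint of the hot path) differs, which is exactly the effect being measured above.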
--
Best regards,
Jesper Dangaard Brouer
MSc.CS, Principal Kernel Engineer at Red Hat
Author of http://www.iptv-analyzer.org
LinkedIn: http://www.linkedin.com/in/brouer
Thread overview: 50+ messages
2015-09-28 12:26 [PATCH 0/7] Further optimizing SLAB/SLUB bulking Jesper Dangaard Brouer
2015-09-28 12:26 ` [PATCH 1/7] slub: create new ___slab_alloc function that can be called with irqs disabled Jesper Dangaard Brouer
2015-09-28 12:26 ` [PATCH 2/7] slub: Avoid irqoff/on in bulk allocation Jesper Dangaard Brouer
2015-09-28 12:26 ` [PATCH 3/7] slub: mark the dangling ifdef #else of CONFIG_SLUB_DEBUG Jesper Dangaard Brouer
2015-09-28 13:49 ` Christoph Lameter
2015-09-28 12:26 ` [PATCH 4/7] slab: implement bulking for SLAB allocator Jesper Dangaard Brouer
2015-09-28 15:11 ` Christoph Lameter
2015-09-28 12:26 ` [PATCH 5/7] slub: support for bulk free with SLUB freelists Jesper Dangaard Brouer
2015-09-28 15:16 ` Christoph Lameter
2015-09-28 15:51 ` Jesper Dangaard Brouer
2015-09-28 16:28 ` Christoph Lameter
2015-09-29 7:32 ` Jesper Dangaard Brouer
2015-09-28 16:30 ` Christoph Lameter
2015-09-29 7:12 ` Jesper Dangaard Brouer
2015-09-28 12:26 ` [PATCH 6/7] slub: optimize bulk slowpath free by detached freelist Jesper Dangaard Brouer
2015-09-28 15:22 ` Christoph Lameter
2015-09-28 12:26 ` [PATCH 7/7] slub: do prefetching in kmem_cache_alloc_bulk() Jesper Dangaard Brouer
2015-09-28 14:53 ` Alexander Duyck
2015-09-28 15:59 ` Jesper Dangaard Brouer
2015-09-29 15:46 ` [MM PATCH V4 0/6] Further optimizing SLAB/SLUB bulking Jesper Dangaard Brouer
2015-09-29 15:47 ` [MM PATCH V4 1/6] slub: create new ___slab_alloc function that can be called with irqs disabled Jesper Dangaard Brouer
2015-09-29 15:47 ` [MM PATCH V4 2/6] slub: Avoid irqoff/on in bulk allocation Jesper Dangaard Brouer
2015-09-29 15:47 ` [MM PATCH V4 3/6] slub: mark the dangling ifdef #else of CONFIG_SLUB_DEBUG Jesper Dangaard Brouer
2015-09-29 15:48 ` [MM PATCH V4 4/6] slab: implement bulking for SLAB allocator Jesper Dangaard Brouer
2015-09-29 15:48 ` [MM PATCH V4 5/6] slub: support for bulk free with SLUB freelists Jesper Dangaard Brouer
2015-09-29 16:38 ` Alexander Duyck
2015-09-29 17:00 ` Jesper Dangaard Brouer
2015-09-29 17:20 ` Alexander Duyck
2015-09-29 18:16 ` Jesper Dangaard Brouer
2015-09-30 11:44 ` [MM PATCH V4.1 " Jesper Dangaard Brouer
2015-09-30 16:03 ` Christoph Lameter
2015-10-01 22:10 ` Andrew Morton
2015-10-02 9:41 ` Jesper Dangaard Brouer
2015-10-02 10:10 ` Christoph Lameter
2015-10-02 10:40 ` Jesper Dangaard Brouer
2015-10-02 13:40 ` Jesper Dangaard Brouer [this message]
2015-10-02 21:50 ` Andrew Morton
2015-10-05 19:26 ` Jesper Dangaard Brouer
2015-10-05 21:20 ` Andi Kleen
2015-10-05 23:07 ` Jesper Dangaard Brouer
2015-10-07 12:31 ` Jesper Dangaard Brouer
2015-10-07 13:36 ` Arnaldo Carvalho de Melo
2015-10-07 15:44 ` Andi Kleen
2015-10-07 16:06 ` Andi Kleen
2015-10-05 23:53 ` Jesper Dangaard Brouer
2015-10-07 10:39 ` Jesper Dangaard Brouer
2015-09-29 15:48 ` [MM PATCH V4 6/6] slub: optimize bulk slowpath free by detached freelist Jesper Dangaard Brouer
2015-10-14 5:15 ` Joonsoo Kim
2015-10-21 7:57 ` Jesper Dangaard Brouer
2015-11-05 5:09 ` Joonsoo Kim