From: Jesper Dangaard Brouer <brouer@redhat.com>
To: linux-mm@kvack.org, Christoph Lameter <cl@linux.com>,
Andrew Morton <akpm@linux-foundation.org>
Cc: netdev@vger.kernel.org,
Alexander Duyck <alexander.duyck@gmail.com>,
Jesper Dangaard Brouer <brouer@redhat.com>
Subject: [PATCH 0/7] slub: bulk alloc and free for slub allocator
Date: Mon, 15 Jun 2015 17:51:45 +0200
Message-ID: <20150615155053.18824.617.stgit@devil>
With this patchset, the SLUB allocator now has both bulk alloc and
bulk free implemented.
(This patchset is based on DaveM's net-next tree, on top of commit
c3eee1fb1d308. Tested with the patchset applied on top of volatile
linux-next commit aa036f86e1bf ("slub bulk alloc: extract objects from
the per cpu slab"))
This mostly optimizes the "fastpath", where objects are available on
the per-CPU fastpath page. The gain comes mostly from amortizing the
less-heavy non-locked cmpxchg_double used on the fastpath.
The "fallback bulking" (e.g. __kmem_cache_free_bulk) provides a good
basis for comparison, but to avoid counting the overhead of the
function call in benchmarking[1], I've used inlined versions of
these.
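
For readers unfamiliar with the fallback, the following is a minimal
userspace sketch of the pattern, assuming the fallback simply loops over
the single-object calls. The names fallback_alloc_bulk() and
fallback_free_bulk() are hypothetical, and malloc()/free() stand in for
kmem_cache_alloc()/kmem_cache_free(); this is not the code from the
patchset.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * Hypothetical userspace stand-in for a fallback bulk free:
 * release nr objects one at a time, paying the full per-object cost.
 */
void fallback_free_bulk(size_t nr, void **p)
{
	for (size_t i = 0; i < nr; i++) {
		free(p[i]);
		p[i] = NULL;	/* sketch only: make stale pointers obvious */
	}
}

/*
 * Hypothetical userspace stand-in for a fallback bulk alloc:
 * allocate nr objects one at a time. If any allocation fails, free
 * the objects obtained so far so the caller sees all-or-nothing.
 */
bool fallback_alloc_bulk(size_t size, size_t nr, void **p)
{
	for (size_t i = 0; i < nr; i++) {
		p[i] = malloc(size);
		if (!p[i]) {
			fallback_free_bulk(i, p);
			return false;
		}
	}
	return true;
}
```

Because every object still goes through the single-object path, a loop
like this gains nothing from bulking, which is what makes it a useful
baseline for comparison.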
Tested on a (very fast) i7-4790K CPU @ 4.00GHz, so look at the cycle
counts (the nanosecond measurements are very low given the clock rate).

Baseline normal fastpath (alloc+free cost): 43 cycles(tsc) 10.814 ns

Bulk - Fallback bulking          - fastpath-bulking
   1 -  47 cycles(tsc) 11.921 ns -  45 cycles(tsc) 11.461 ns  improved  4.3%
   2 -  46 cycles(tsc) 11.649 ns -  28 cycles(tsc)  7.023 ns  improved 39.1%
   3 -  46 cycles(tsc) 11.550 ns -  22 cycles(tsc)  5.671 ns  improved 52.2%
   4 -  45 cycles(tsc) 11.398 ns -  19 cycles(tsc)  4.967 ns  improved 57.8%
   8 -  45 cycles(tsc) 11.303 ns -  17 cycles(tsc)  4.298 ns  improved 62.2%
  16 -  44 cycles(tsc) 11.221 ns -  17 cycles(tsc)  4.423 ns  improved 61.4%
  30 -  75 cycles(tsc) 18.894 ns -  57 cycles(tsc) 14.497 ns  improved 24.0%
  32 -  73 cycles(tsc) 18.491 ns -  56 cycles(tsc) 14.227 ns  improved 23.3%
  34 -  75 cycles(tsc) 18.962 ns -  58 cycles(tsc) 14.638 ns  improved 22.7%
  48 -  80 cycles(tsc) 20.049 ns -  64 cycles(tsc) 16.247 ns  improved 20.0%
  64 -  87 cycles(tsc) 21.929 ns -  74 cycles(tsc) 18.598 ns  improved 14.9%
 128 -  98 cycles(tsc) 24.511 ns -  89 cycles(tsc) 22.295 ns  improved  9.2%
 158 - 101 cycles(tsc) 25.389 ns -  93 cycles(tsc) 23.390 ns  improved  7.9%
 250 - 104 cycles(tsc) 26.170 ns - 100 cycles(tsc) 25.112 ns  improved  3.8%

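
The "improved" figures above appear to be the relative reduction in
cycles of fastpath-bulking versus fallback bulking. A small sketch of
that arithmetic, with a hypothetical helper name, assuming the
percentage is taken relative to the fallback cycle count:

```c
/*
 * Relative cycle saving of fastpath-bulking over fallback bulking,
 * in percent. Assumption: the percentage is computed relative to the
 * fallback cycle count. Returns 0.0 if the fallback count is zero.
 */
double improvement_pct(unsigned int fallback_cycles, unsigned int bulk_cycles)
{
	if (fallback_cycles == 0)
		return 0.0;
	return 100.0 * ((double)fallback_cycles - (double)bulk_cycles)
		/ (double)fallback_cycles;
}
```

For the bulk size 2 row, improvement_pct(46, 28) is 100 * 18 / 46, about
39.1, which matches the reported value after rounding to one decimal.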
Benchmarking shows impressive improvements in the "fastpath" with a
small number of objects in the working set. Once the working set
grows and activates the "slowpath" (which contains the heavier locked
cmpxchg_double), the improvement decreases.
I'm currently also working on optimizing the "slowpath" (as the network
stack use case hits this), but this patchset should provide a good
foundation for further improvements.
The rest of my patch queue in this area needs some more work, but
preliminary results are good. I'm attending the Netfilter Workshop[2]
next week, and I'll hopefully return to working on further improvements
in this area afterwards.
[1] https://github.com/netoptimizer/prototype-kernel/blob/master/kernel/mm/slab_bulk_test01.c
[2] http://workshop.netfilter.org/2015/
---

Christoph Lameter (2):
      slab: infrastructure for bulk object allocation and freeing
      slub bulk alloc: extract objects from the per cpu slab

Jesper Dangaard Brouer (5):
      slub: reduce indention level in kmem_cache_alloc_bulk()
      slub: fix error path bug in kmem_cache_alloc_bulk
      slub: kmem_cache_alloc_bulk() move clearing outside IRQ disabled section
      slub: improve bulk alloc strategy
      slub: initial bulk free implementation

 include/linux/slab.h |   10 +++++
 mm/slab.c            |   13 +++++++
 mm/slab.h            |    9 +++++
 mm/slab_common.c     |   23 ++++++++++++
 mm/slob.c            |   13 +++++++
 mm/slub.c            |   93 ++++++++++++++++++++++++++++++++++++++++++++++++++
 6 files changed, 161 insertions(+)
--
Best regards,
Jesper Dangaard Brouer
MSc.CS, Sr. Network Kernel Developer at Red Hat
Author of http://www.iptv-analyzer.org
LinkedIn: http://www.linkedin.com/in/brouer