[PATCH RT 5/5] allow preemption in slab_alloc_node and slab_free
From: Nicholas Mc Guire @ 2014-02-10 15:40 UTC
To: linux-rt-users
Cc: LKML, Sebastian Andrzej Siewior, Steven Rostedt, Peter Zijlstra,
Carsten Emde, Thomas Gleixner, Andreas Platschek
drop preempt_disable/enable in slab_alloc_node and slab_free

__slab_alloc is only called from slub.c:slab_alloc_node. It runs with
local irqs disabled, so it cannot be pushed off this CPU asynchronously,
and the preempt_disable/enable is thus not needed. Aside from that, the
later this_cpu_cmpxchg_double would catch such a migration event anyway.
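For illustration only (not part of the patch), a condensed sketch of the
slab_alloc_node() fast path with the change applied; names follow
mm/slub.c, but debug hooks, prefetching and statistics are omitted:

/*
 * Condensed, illustrative sketch of the slab_alloc_node() fast path
 * with this patch applied; not the literal mm/slub.c code.
 */
static __always_inline void *slab_alloc_node_sketch(struct kmem_cache *s,
		gfp_t gfpflags, int node, unsigned long addr)
{
	void *object;
	struct kmem_cache_cpu *c;
	struct page *page;
	unsigned long tid;

redo:
	/* No preempt_disable(): a migration in this window is caught below. */
	c = __this_cpu_ptr(s->cpu_slab);
	tid = c->tid;
	object = c->freelist;
	page = c->page;

	if (unlikely(!object || !node_match(page, node))) {
		/* Slow path: __slab_alloc() runs with local irqs disabled. */
		object = __slab_alloc(s, gfpflags, node, addr, c);
	} else {
		void *next_object = get_freepointer_safe(s, object);

		/*
		 * this_cpu_cmpxchg_double() operates on the CPU we run on
		 * *now*; a stale tid from another CPU makes it fail and we
		 * retry via redo.
		 */
		if (unlikely(!this_cpu_cmpxchg_double(
				s->cpu_slab->freelist, s->cpu_slab->tid,
				object, tid,
				next_object, next_tid(tid))))
			goto redo;
	}
	return object;
}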
slab_free:

slowpath: a free of an object that was allocated on a different CPU is
detected by the (page == c->page) check failing, with c pointing to the
per-cpu slab. This does not need a consistent reference to tid, so the
slow path is safe without the preempt_disable/enable.

fastpath: if the allocation was on the same CPU but we get migrated
between fetching the cpu_slab and the actual push onto the free list,
then this_cpu_cmpxchg_double catches this case and loops via redo. So
the fast path is also safe without the preempt_disable/enable.
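Again for illustration only (not part of the patch), a condensed sketch
of the corresponding slab_free() fast/slow path split; debug hooks and
statistics are omitted:

/*
 * Condensed, illustrative sketch of slab_free() with this patch
 * applied; not the literal mm/slub.c code.
 */
static __always_inline void slab_free_sketch(struct kmem_cache *s,
		struct page *page, void *x, unsigned long addr)
{
	void *object = x;
	struct kmem_cache_cpu *c;
	unsigned long tid;

redo:
	/* No preempt_disable(): tid does not need to stay consistent here. */
	c = __this_cpu_ptr(s->cpu_slab);
	tid = c->tid;

	if (likely(page == c->page)) {
		/* Fast path: the object belongs to this CPU's slab page. */
		set_freepointer(s, object, c->freelist);

		/*
		 * A migration between reading tid and this point makes the
		 * cmpxchg fail on the new CPU and we retry via redo.
		 */
		if (unlikely(!this_cpu_cmpxchg_double(
				s->cpu_slab->freelist, s->cpu_slab->tid,
				c->freelist, tid,
				object, next_tid(tid))))
			goto redo;
	} else {
		/* Slow path: free of an object from some other slab page. */
		__slab_free(s, page, x, addr);
	}
}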
Testing:
while : ; do ./hackbench 120 thread 10000 ; done
Time: 296.631
Time: 298.723
Time: 301.468
Time: 303.880
Time: 301.988
Time: 300.038
Time: 299.634
Time: 301.488
which seems to be a good way to stress-test slub
Impact on performance:
The change could negatively impact performance if the removal of the
preempt_disable/enable resulted in a significant increase in how often
the slow path is taken or in looping via goto redo. This was checked by:

static instrumentation:
instrumentation was added to count how often the redo loop is taken.
The results show that the redo loop is taken very rarely (< 1 in 10000)
and less often than with the preempt_disable/enable present. Further,
the slowpath-to-fastpath ratio improves slightly (though it is not clear
whether this is statistically significant).
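A minimal sketch of the kind of per-cpu counters used for this check
(counter names and the readout are illustrative only and not part of the
posted patch):

/* One counter per fast-path attempt, one per "goto redo" retry. */
static DEFINE_PER_CPU(unsigned long, slub_fastpath_attempts);
static DEFINE_PER_CPU(unsigned long, slub_fastpath_redos);

/* Called at the redo: label of slab_alloc_node()/slab_free(). */
static inline void count_fastpath_attempt(void)
{
	this_cpu_inc(slub_fastpath_attempts);
}

/* Called when this_cpu_cmpxchg_double() fails and we jump back to redo. */
static inline void count_fastpath_redo(void)
{
	this_cpu_inc(slub_fastpath_redos);
}

/* Readout, e.g. at module exit or from a debugfs file. */
static void dump_fastpath_stats(void)
{
	unsigned long attempts = 0, redos = 0;
	int cpu;

	for_each_possible_cpu(cpu) {
		attempts += per_cpu(slub_fastpath_attempts, cpu);
		redos += per_cpu(slub_fastpath_redos, cpu);
	}
	pr_info("slub fastpath: %lu attempts, %lu redos\n", attempts, redos);
}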
running slab_test.c:
the slub benchmark from Christoph Lameter and Mathieu Desnoyers was
used, the only change being that asm/system.h was dropped from the list
of includes. The results indicate that the removal of
preempt_disable/enable reduces the cycles needed slightly (though quite
a few test systems would need to be checked before this can be
confirmed).
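For reference, a minimal, self-contained sketch of the style of cycle
measurement slab_test.c performs (this is not the actual benchmark, just
an illustration of the measurement loop):

#include <linux/module.h>
#include <linux/slab.h>
#include <linux/timex.h>

#define TEST_COUNT 10000

static void *objs[TEST_COUNT];

static int __init slab_cycle_test_init(void)
{
	cycles_t start;
	int i;

	start = get_cycles();
	for (i = 0; i < TEST_COUNT; i++)
		objs[i] = kmalloc(64, GFP_KERNEL);
	pr_info("kmalloc: %llu cycles/object\n",
		(unsigned long long)(get_cycles() - start) / TEST_COUNT);

	start = get_cycles();
	for (i = 0; i < TEST_COUNT; i++)
		kfree(objs[i]);
	pr_info("kfree: %llu cycles/object\n",
		(unsigned long long)(get_cycles() - start) / TEST_COUNT);

	return -EAGAIN; /* measurement only, do not keep the module loaded */
}

module_init(slab_cycle_test_init);
MODULE_LICENSE("GPL");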
Tested-by: Andreas Platschek <platschek@ict.tuwien.ac.at>
Tested-by: Carsten Emde <C.Emde@osadl.org>
Signed-off-by: Nicholas Mc Guire <der.herr@hofr.at>
---
mm/slub.c | 4 ----
1 files changed, 0 insertions(+), 4 deletions(-)
diff --git a/mm/slub.c b/mm/slub.c
index 546bd9a..c422988 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -2424,7 +2424,6 @@ redo:
* on a different processor between the determination of the pointer
* and the retrieval of the tid.
*/
- preempt_disable();
c = __this_cpu_ptr(s->cpu_slab);
/*
@@ -2434,7 +2433,6 @@ redo:
* linked list in between.
*/
tid = c->tid;
- preempt_enable();
object = c->freelist;
page = c->page;
@@ -2683,11 +2681,9 @@ redo:
* data is retrieved via this pointer. If we are on the same cpu
* during the cmpxchg then the free will succedd.
*/
- preempt_disable();
c = __this_cpu_ptr(s->cpu_slab);
tid = c->tid;
- preempt_enable();
if (likely(page == c->page)) {
set_freepointer(s, object, c->freelist);
--
1.7.2.5
Re: [PATCH RT 5/5] allow preemption in slab_alloc_node and slab_free
From: Sebastian Andrzej Siewior @ 2014-02-14 14:07 UTC
To: Nicholas Mc Guire
Cc: linux-rt-users, LKML, Steven Rostedt, Peter Zijlstra,
Carsten Emde, Thomas Gleixner, Andreas Platschek
* Nicholas Mc Guire | 2014-02-10 16:40:16 [+0100]:
>__slab_alloc is only called from slub.c:slab_alloc_node. It runs with
>local irqs disabled, so it cannot be pushed off this CPU asynchronously,
>and the preempt_disable/enable is thus not needed. Aside from that, the
>later this_cpu_cmpxchg_double would catch such a migration event anyway.
Not sure what to do with this one. You do write a longer chapter on why
it is okay to drop the preemption-disabled section and that
this_cpu_cmpxchg_double() would catch it. And I haven't figured out so
far why we need to keep preemption disabled while looking at c->tid but
not at c->page.
However, it seems that Christoph Lameter found it important to add a
note in the comment that this preemption disable here is important.
Looking at commit 7cccd80 ("slub: tid must be retrieved from the percpu
area of the current processor") it seems that Steven Rostedt ran into
trouble, and that is why we now have that preempt_disable() here.
So if you really get better performance and you haven't seen anything
bad happen, then you might want to check with Lameter & Rostedt about
your patch and getting it merged upstream.
The commit I mentioned has been upstream since v3.11-rc1 and I can see
it in the v3.8-RT tree, so it looks serious.
I fail to see it in v3.2-RT. Steven, isn't this something we want there,
too?
Sebastian