Re: [cpuops cmpxchg double V1 4/4] Lockless (and preemptless) fastpaths for slub


From: Tejun Heo <tj@kernel.org>
To: Christoph Lameter <cl@linux.com>
Cc: akpm@linux-foundation.org, Pekka Enberg <penberg@cs.helsinki.fi>,
	linux-kernel@vger.kernel.org,
	Eric Dumazet <eric.dumazet@gmail.com>,
	"H. Peter Anvin" <hpa@zytor.com>,
	Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Subject: Re: [cpuops cmpxchg double V1 4/4] Lockless (and preemptless) fastpaths for slub
Date: Wed, 15 Dec 2010 17:51:39 +0100
Message-ID: <4D08F21B.3000706@kernel.org>
In-Reply-To: <20101214174901.827380447@linux.com>

On 12/14/2010 06:48 PM, Christoph Lameter wrote:
> Use the this_cpu_cmpxchg_double functionality to implement a lockless
> allocation algorithm on arches that support fast this_cpu_ops.
> 
> Each of the per cpu pointers is paired with a transaction id that ensures
> that updates of the per cpu information can only occur in sequence on
> a certain cpu.
> 
> A transaction id is a "long" integer that is comprised of an event number
> and the cpu number. The event number is incremented for every change to the
> per cpu state. This means that the cmpxchg instruction can verify for an
> update that nothing interfered and that we are updating the percpu structure
> for the processor where we picked up the information and that we are also
> currently on that processor when we update the information.
> 
> This results in a significant decrease of the overhead in the fastpaths. It
> also makes it easy to adopt the fast path for realtime kernels since this
> is lockless and does not require the use of the current per cpu area
> over the critical section. It is only important that the per cpu area is
> current at the beginning of the critical section and at the end.
> 
> So there is no need even to disable preemption.
> 
> Test results show that the fastpath cycle count is reduced by up to ~ 40%
> (alloc/free test goes from ~140 cycles down to ~80). The slowpath for kfree
> adds a few cycles.
> 
> Sadly this does nothing for the slowpath which is where the main issues with
> performance in slub are but the best case performance rises significantly.
> (For that see the more complex slub patches that require cmpxchg_double)
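
[Editorial sketch, not part of the original message or of the patch itself.]
The transaction id scheme in the quoted description can be modelled in a few
lines of user-space C. Everything here is an assumption made for illustration:
the names (cpu_slab_sketch, cmpxchg_double_sketch, alloc_fastpath_sketch), the
CPU count, and the layout of the id (cpu number in the low bits, event number
above them). The compare-and-exchange is an ordinary function and is NOT
atomic; it only shows what the real this_cpu_cmpxchg_double would have to
verify.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical CPU count for the sketch; a power of two so the cpu number
 * fits in the low bits of the transaction id. */
#define NR_CPUS_SKETCH 4UL
#define TID_STEP       NR_CPUS_SKETCH

/* Per-cpu state: a freelist pointer paired with its transaction id. */
struct cpu_slab_sketch {
	void *freelist;
	unsigned long tid;
};

/* A tid starts out as the cpu number (event number 0). */
static unsigned long init_tid(unsigned long cpu) { return cpu; }

/* Every change to the per-cpu state bumps the event number by one. */
static unsigned long next_tid(unsigned long tid) { return tid + TID_STEP; }

static unsigned long tid_to_cpu(unsigned long tid) { return tid % TID_STEP; }
static unsigned long tid_to_event(unsigned long tid) { return tid / TID_STEP; }

/* Models the semantics of a double-word compare-and-exchange: both words
 * must still hold the values read earlier, otherwise nothing is written.
 * Assumption: non-atomic stand-in, single-threaded use only. */
static bool cmpxchg_double_sketch(struct cpu_slab_sketch *c,
				  void *old_free, unsigned long old_tid,
				  void *new_free, unsigned long new_tid)
{
	if (c->freelist != old_free || c->tid != old_tid)
		return false;
	c->freelist = new_free;
	c->tid = new_tid;
	return true;
}

/* Fastpath allocation: read tid and freelist, then commit both at once.
 * A stale tid (another update, or a different cpu's area) fails the
 * exchange and the loop retries with fresh values. */
static void *alloc_fastpath_sketch(struct cpu_slab_sketch *c)
{
	for (;;) {
		unsigned long tid = c->tid;
		void *object = c->freelist;
		void *next;

		if (!object)
			return NULL;	/* a real allocator would take the slowpath */

		next = *(void **)object;	/* free objects link to the next one */
		if (cmpxchg_double_sketch(c, object, tid, next, next_tid(tid)))
			return object;
	}
}
```

In this model an update prepared with an old tid is rejected even if the
freelist pointer happens to match, which is the property the description
relies on to avoid disabling preemption.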

The first two look good to me but I frankly don't have much idea about
the latter two.  Pekka, can you please ack those?  Alternatively, you
can later pull the percpu tree in and apply the allocator bits in your
tree, which I actually prefer.

Thanks.

-- 
tejun


Thread overview: 29+ messages
2010-12-14 17:48 [cpuops cmpxchg double V1 0/4] this_cpu_cmpxchg_double support Christoph Lameter
2010-12-14 17:48 ` [cpuops cmpxchg double V1 1/4] Generic support for this_cpu_cmpxchg_double Christoph Lameter
2010-12-18 14:47   ` Tejun Heo
2010-12-18 14:51     ` Tejun Heo
2010-12-21 22:36     ` Christoph Lameter
2010-12-21 23:24       ` H. Peter Anvin
2010-12-22  9:14         ` Tejun Heo
2010-12-24  0:16           ` Christoph Lameter
2010-12-24  0:22             ` H. Peter Anvin
2010-12-25  4:53               ` Christoph Lameter
2010-12-25  6:11                 ` H. Peter Anvin
2010-12-25 16:52                 ` Tejun Heo
2010-12-25 23:55                   ` Christoph Lameter
2010-12-27 10:52                     ` Tejun Heo
2011-01-03 22:43                       ` Christoph Lameter
2010-12-14 17:48 ` [cpuops cmpxchg double V1 2/4] x86: this_cpu_cmpxchg_double() support Christoph Lameter
2010-12-15  0:46   ` H. Peter Anvin
2010-12-15  0:56     ` H. Peter Anvin
2010-12-15 16:12       ` Christoph Lameter
2010-12-15 16:20         ` Christoph Lameter
2010-12-15 17:36           ` H. Peter Anvin
2010-12-15 17:53             ` Christoph Lameter
2010-12-15 16:32         ` H. Peter Anvin
2010-12-15 16:41           ` Christoph Lameter
2010-12-14 17:48 ` [cpuops cmpxchg double V1 3/4] slub: Get rid of slab_free_hook_irq() Christoph Lameter
2010-12-14 17:48 ` [cpuops cmpxchg double V1 4/4] Lockless (and preemptless) fastpaths for slub Christoph Lameter
2010-12-15 16:51   ` Tejun Heo [this message]
2010-12-15 16:55     ` Pekka Enberg
2010-12-15 16:57       ` Tejun Heo
