linux-arch.vger.kernel.org archive mirror
* [patch 0/7] [RFC] SLUB: Improve allocpercpu to reduce per cpu access overhead
@ 2007-11-01  0:02 Christoph Lameter
From: Christoph Lameter @ 2007-11-01  0:02 UTC
  To: akpm; +Cc: linux-arch, linux-kernel, Mathieu Desnoyers, Pekka Enberg

This patchset speeds up the SLUB fastpath by improving the per cpu
allocator and making it usable for SLUB.

Currently allocpercpu manages arrays of pointers to per cpu objects.
This means that it has to allocate the arrays and then populate them
as needed with objects. Although these objects are called per cpu
objects, they cannot be handled in the same way as per cpu objects,
that is, by adding the per cpu offset of the respective cpu.
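
To make the pointer-array scheme concrete, here is a minimal user-space
sketch. It is not the kernel code: the names (ptr_array_percpu,
ptr_array_alloc, ptr_array_cpu_ptr, SKETCH_NR_CPUS) are hypothetical and
calloc() stands in for the kernel allocators. It only illustrates that
every lookup has to load a pointer from the array before the object can
be touched.

```c
#include <assert.h>
#include <stdlib.h>

#define SKETCH_NR_CPUS 4	/* hypothetical cpu count for this sketch */

/* One handle per allocation: an array with one object pointer per cpu. */
struct ptr_array_percpu {
	void *ptrs[SKETCH_NR_CPUS];
};

static void ptr_array_free(struct ptr_array_percpu *pa)
{
	if (!pa)
		return;
	for (int cpu = 0; cpu < SKETCH_NR_CPUS; cpu++)
		free(pa->ptrs[cpu]);	/* free(NULL) is a no-op */
	free(pa);
}

/* Allocate the array, then populate it with one zeroed object per cpu. */
static struct ptr_array_percpu *ptr_array_alloc(size_t size)
{
	struct ptr_array_percpu *pa = calloc(1, sizeof(*pa));

	if (!pa)
		return NULL;
	for (int cpu = 0; cpu < SKETCH_NR_CPUS; cpu++) {
		pa->ptrs[cpu] = calloc(1, size);
		if (!pa->ptrs[cpu]) {
			ptr_array_free(pa);
			return NULL;
		}
	}
	return pa;
}

/* Lookup needs a memory load from the array; no simple offset addition. */
static void *ptr_array_cpu_ptr(struct ptr_array_percpu *pa, int cpu)
{
	assert(cpu >= 0 && cpu < SKETCH_NR_CPUS);
	return pa->ptrs[cpu];
}
```

The objects of one cpu end up wherever the allocator placed them, so
nothing guarantees that they sit close to each other.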

The patchset changes that. We create a small memory pool in the
percpu area and allocate from there when allocpercpu is called.
As a result we do not need the per cpu pointer arrays for each
object. This reduces memory usage and also the cache footprint
of allocpercpu users. In addition, the per cpu objects for a single
processor are tightly packed next to each other, which decreases the
cache footprint even further and makes it possible to access multiple
objects in the same cacheline.
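
The pool idea can be sketched in user space as follows. This is an
illustration under assumptions, not the code from the patches: the names
(pool_init, pool_alloc, pool_cpu_ptr, SKETCH_AREA_SIZE) are hypothetical,
the per cpu areas are modelled as one calloc()ed block, and the sketch
only bumps a cursor and never frees. An allocation returns a single
offset that is valid in every cpu's area, so no per-object pointer array
is needed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define SKETCH_NR_CPUS   4		/* hypothetical cpu count */
#define SKETCH_AREA_SIZE 4096		/* hypothetical pool size per cpu */
#define SKETCH_NO_SPACE  ((size_t)-1)	/* returned when the pool is full */

static unsigned char *areas;	/* SKETCH_NR_CPUS consecutive per cpu areas */
static size_t pool_used;	/* bump cursor, shared by all areas */

static int pool_init(void)
{
	areas = calloc(SKETCH_NR_CPUS, SKETCH_AREA_SIZE);
	pool_used = 0;
	return areas ? 0 : -1;
}

/* Reserve size bytes at the same offset in every cpu's area. */
static size_t pool_alloc(size_t size, size_t align)
{
	size_t start;

	assert(align && (align & (align - 1)) == 0);	/* power of two */
	start = (pool_used + align - 1) & ~(align - 1);
	if (start > SKETCH_AREA_SIZE || size > SKETCH_AREA_SIZE - start)
		return SKETCH_NO_SPACE;
	pool_used = start + size;
	return start;
}

/* Address of the object for one cpu: area base plus offset, no table of
 * per-object pointers involved. */
static void *pool_cpu_ptr(size_t off, int cpu)
{
	assert(areas && cpu >= 0 && cpu < SKETCH_NR_CPUS);
	assert(off < SKETCH_AREA_SIZE);
	return areas + (size_t)cpu * SKETCH_AREA_SIZE + off;
}
```

Two consecutive allocations land next to each other inside each cpu's
area, which is the packing effect described above.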

SLUB already implements the same mechanism itself. After fixing up
allocpercpu we throw the SLUB method out and use the new
allocpercpu handling. Then we optimize allocpercpu addressing
by adding a new function

	this_cpu_ptr()

that allows the determination of the per cpu pointer for the
current processor in a more efficient way on many platforms.
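
A rough user-space model of the difference, assuming that this_cpu_ptr()
amounts to adding the current cpu's offset to a base pointer. All names
here (sketch_per_cpu_ptr, sketch_this_cpu_ptr, set_current_cpu,
this_cpu_off) are hypothetical stand-ins. In the kernel the current
offset would come from an architecture-specific source such as a
register or a segment prefix rather than from a plain variable.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define SKETCH_NR_CPUS   4	/* hypothetical cpu count */
#define SKETCH_AREA_SIZE 256	/* hypothetical size of one per cpu area */

static size_t cpu_offset[SKETCH_NR_CPUS];	/* offset of each cpu's area */
static size_t this_cpu_off;	/* stand-in for the cheaply reachable offset
				 * of the cpu we are running on */

static void sketch_setup_offsets(void)
{
	for (int cpu = 0; cpu < SKETCH_NR_CPUS; cpu++)
		cpu_offset[cpu] = (size_t)cpu * SKETCH_AREA_SIZE;
}

/* Models a context switch onto the given cpu. */
static void set_current_cpu(int cpu)
{
	assert(cpu >= 0 && cpu < SKETCH_NR_CPUS);
	this_cpu_off = cpu_offset[cpu];
}

/* Generic form: needs the cpu number and a load from the offset table. */
static void *sketch_per_cpu_ptr(void *base, int cpu)
{
	assert(cpu >= 0 && cpu < SKETCH_NR_CPUS);
	return (unsigned char *)base + cpu_offset[cpu];
}

/* Local form: a single addition of the current cpu's offset. */
static void *sketch_this_cpu_ptr(void *base)
{
	return (unsigned char *)base + this_cpu_off;
}
```

Both functions yield the same address for the cpu that is currently
running; the local form just skips the cpu number lookup and the table
access.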

This increases the speed of SLUB (and likely other kernel subsystems
that benefit from the allocpercpu enhancements):


        SLAB    SLUB    SLUB+   SLUB-o  SLUB-a
   8    96      86      45      44      38      3 *
  16    84      92      49      48      43      2 *
  32    84      106     61      59      53      +++
  64    102     129     82      88      75      ++
 128    147     226     188     181     176     -
 256    200     248     207     285     204     =
 512    300     301     260     209     250     +
1024    416     440     398     264     391     ++
2048    720     542     530     390     511     +++
4096    1254    342     342     336     376     3 *

alloc/free test
        SLAB    SLUB    SLUB+   SLUB-o  SLUB-a
        137-146 151     68-72   68-74   56-58   3 *

Note: The per cpu optimizations are only half way there because of the screwed
up way that x86_64 handles its cpu area, which causes additional cycles to be
spent retrieving a pointer from memory and adding it to the address.
The i386 code is much less cycle intensive since it can get to per cpu
data using a segment prefix. If we can get that to work on x86_64
then we may be able to get the cycle count for the fastpath down to 20-30
cycles.

-- 

Thread overview: 62+ messages
2007-11-01  0:02 [patch 0/7] [RFC] SLUB: Improve allocpercpu to reduce per cpu access overhead Christoph Lameter
2007-11-01  0:02 ` [patch 1/7] allocpercpu: Make it a true per cpu allocator by allocating from a per cpu array Christoph Lameter
2007-11-01  7:24   ` Eric Dumazet
2007-11-01 12:59     ` Christoph Lameter
2007-11-01  0:02 ` [patch 2/7] allocpercpu: Remove functions that are rarely used Christoph Lameter
2007-11-01  0:02 ` [patch 3/7] Allocpercpu: Do __percpu_disguise() only if CONFIG_DEBUG_VM is set Christoph Lameter
2007-11-01  7:25   ` Eric Dumazet
2007-11-01  0:02 ` [patch 4/7] Percpu: Add support for this_cpu_offset() to be able to create this_cpu_ptr() Christoph Lameter
2007-11-01  0:02 ` [patch 5/7] SLUB: Use allocpercpu to allocate per cpu data instead of running our own per cpu allocator Christoph Lameter
2007-11-01  0:02 ` [patch 6/7] SLUB: No need to cache kmem_cache data in kmem_cache_cpu anymore Christoph Lameter
2007-11-01  0:02 ` [patch 7/7] SLUB: Optimize per cpu access on the local cpu using this_cpu_ptr() Christoph Lameter
2007-11-01  0:24 ` [patch 0/7] [RFC] SLUB: Improve allocpercpu to reduce per cpu access overhead David Miller
2007-11-01  0:26   ` Christoph Lameter
2007-11-01  0:27     ` David Miller
2007-11-01  0:31       ` Christoph Lameter
2007-11-01  0:51         ` David Miller
2007-11-01  0:53           ` Christoph Lameter
2007-11-01  1:00             ` David Miller
2007-11-01  1:01               ` Christoph Lameter
2007-11-01  1:09                 ` David Miller
2007-11-01  1:12                   ` Christoph Lameter
2007-11-01  1:13                     ` David Miller
2007-11-01  1:21                       ` Christoph Lameter
2007-11-01  5:27                         ` David Miller
2007-11-01  4:16                       ` Christoph Lameter
2007-11-01  5:38                         ` David Miller
2007-11-01  7:01                         ` David Miller
2007-11-01  9:14                           ` David Miller
2007-11-01 13:03                             ` Christoph Lameter
2007-11-01 21:29                               ` David Miller
2007-11-01 22:15                                 ` Christoph Lameter
2007-11-01 22:38                                   ` David Miller
2007-11-01 22:48                                     ` Christoph Lameter
2007-11-01 22:58                                       ` David Miller
2007-11-02  1:06                                         ` Christoph Lameter
2007-11-02  2:51                                           ` David Miller
2007-11-02 10:28                                         ` Peter Zijlstra
2007-11-02 14:35                                           ` Christoph Lameter
2007-11-02 15:20                                             ` Peter Zijlstra
2007-11-02 15:29                                               ` Christoph Lameter
2007-11-12 10:52                                         ` Herbert Xu
2007-11-12 19:14                                           ` Christoph Lameter
2007-11-12 19:48                                             ` Eric Dumazet
2007-11-12 19:56                                               ` Christoph Lameter
2007-11-12 20:18                                                 ` Eric Dumazet
2007-11-12 22:46                                                   ` David Miller
2007-11-12 19:57                                               ` Luck, Tony
2007-11-12 20:14                                                 ` Eric Dumazet
2007-11-12 22:46                                                   ` David Miller
2007-11-12 21:28                                           ` David Miller
2007-11-01 23:00                                       ` Eric Dumazet
2007-11-02  0:58                                         ` Christoph Lameter
2007-11-02  1:40                                         ` Christoph Lameter
2007-11-01  7:17 ` Eric Dumazet
2007-11-01  7:57   ` David Miller
2007-11-01 13:01     ` Christoph Lameter
2007-11-01 21:25       ` David Miller
2007-11-01 12:57   ` Christoph Lameter
2007-11-01 21:28     ` David Miller
2007-11-01 22:11       ` Christoph Lameter
2007-11-01 22:14         ` David Miller
2007-11-01 22:16           ` Christoph Lameter
