From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S964808AbWAWQ5q (ORCPT ); Mon, 23 Jan 2006 11:57:46 -0500 Received: (majordomo@vger.kernel.org) by vger.kernel.org id S964814AbWAWQ5q (ORCPT ); Mon, 23 Jan 2006 11:57:46 -0500 Received: from courier.cs.helsinki.fi ([128.214.9.1]:39613 "EHLO mail.cs.helsinki.fi") by vger.kernel.org with ESMTP id S964808AbWAWQ5Y (ORCPT ); Mon, 23 Jan 2006 11:57:24 -0500 Subject: [RFC/PATCH 3/3] slab: update documentation From: Pekka Enberg To: manfred@colorfullife.com, christoph@lameter.com Cc: linux-kernel@vger.kernel.org Date: Mon, 23 Jan 2006 18:57:22 +0200 Message-Id: <1138035442.10527.23.camel@localhost> Mime-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Transfer-Encoding: 7bit X-Mailer: Evolution 2.4.2.1 Sender: linux-kernel-owner@vger.kernel.org X-Mailing-List: linux-kernel@vger.kernel.org From: Pekka Enberg This patch updates documentation in mm/slab.c and replaces the redundant changelog with copyright statements. Signed-off-by: Pekka Enberg --- mm/slab.c | 153 +++++++++++++++++++++++++++++++------------------------------- 1 file changed, 78 insertions(+), 75 deletions(-) Index: 2.6/mm/slab.c =================================================================== --- 2.6.orig/mm/slab.c +++ 2.6/mm/slab.c @@ -1,57 +1,52 @@ /* - * linux/mm/slab.c - * Written by Mark Hemment, 1996/97. - * (markhe@nextd.demon.co.uk) - * - * kmem_cache_destroy() + some cleanup - 1999 Andrea Arcangeli - * - * Major cleanup, different bufctl logic, per-cpu arrays - * (c) 2000 Manfred Spraul - * - * Cleanup, make the head arrays unconditional, preparation for NUMA - * (c) 2002 Manfred Spraul + * mm/slab.c - An object-caching memory allocator + * Copyright (c) 1996-1997 Mark Hemment + * Copyright (c) 1999 Andrea Arcangeli + * Copyright (c) 2000, 2002 Manfred Spraul + * Copyright (c) 2005 Shai Fultheim + * Copyright (c) 2005 Shobhit Dayal + * Copyright (c) 2005 Alok N Kataria + * Copyright (c) 2005 Christoph Lameter * * An implementation of the Slab Allocator as described in outline in; * UNIX Internals: The New Frontiers by Uresh Vahalia * Pub: Prentice Hall ISBN 0-13-101908-2 + * * or with a little more detail in; * The Slab Allocator: An Object-Caching Kernel Memory Allocator * Jeff Bonwick (Sun Microsystems). * Presented at: USENIX Summer 1994 Technical Conference * - * The memory is organized in caches, one cache for each object type. - * (e.g. inode_cache, dentry_cache, buffer_head, vm_area_struct) - * Each cache consists out of many slabs (they are small (usually one - * page long) and always contiguous), and each slab contains multiple - * initialized objects. - * - * This means, that your constructor is used only for newly allocated - * slabs and you must pass objects with the same intializations to - * kmem_cache_free. - * - * Each cache can only support one memory type (GFP_DMA, GFP_HIGHMEM, - * normal). If you need a special memory type, then must create a new - * cache for that memory type. - * - * In order to reduce fragmentation, the slabs are sorted in 3 groups: - * full slabs with 0 free objects - * partial slabs - * empty slabs with no allocated objects - * - * If partial slabs exist, then new allocations come from these slabs, - * otherwise from empty slabs or new slabs are allocated. - * - * kmem_cache_destroy() CAN CRASH if you try to allocate from the cache - * during kmem_cache_destroy(). The caller must prevent concurrent allocs. - * - * Each cache has a short per-cpu head array, most allocs - * and frees go into that array, and if that array overflows, then 1/2 - * of the entries in the array are given back into the global cache. - * The head array is strictly LIFO and should improve the cache hit rates. - * On SMP, it additionally reduces the spinlock operations. - * - * The c_cpuarray may not be read with enabled local interrupts - - * it's changed with a smp_call_function(). + * Magazines and Vmem: Extending the Slab Allocator to Many CPUs and + * Arbitrary Resources + * Jeff Bonwick, Sun Microsystems + * Jonathan Adams, California Institute of Technology + * Presented at: USENIX 2001 + * + * The slab allocator organizes memory in object caches where each + * type has its own cache. Examples of types that have their own cache + * are struct inode, struct dentry, struct buffer_head, and so + * on. There is also a general purpose allocator, kmalloc, which is + * backed by number of fixed-size object caches ranging from 32 bytes + * up to 128 KB. + * + * Physical memory is allocated with the page allocator. A slab can + * contain one or more contiguous pages that are carved into + * equal-sized blocks called buffers. Each object cache has an + * optional constructor and destructor which are applied to a buffer + * before returning an object from the allocator. The destructor is + * applied to an object before the underlying buffer is retured to the + * page allocator. + * + * The object cache is organized into per-CPU and per-NUMA-node + * caches. The per-CPU cache improves SMP scalability as most + * allocation and free operations are satisfied from it. Object cache + * slabs are stored in three per-NUMA-node lists: the partial list, + * the full list, and the free list depending on how many objects are + * available for allocation in a slab. Each node also has a shared + * array cache that is used to refill the per-CPU caches and an alien + * cache that is used to free objects that were allocated on a + * different node in a batch. * * SMP synchronization: * constructors and destructors are called without any locking. @@ -61,29 +56,48 @@ * and local interrupts are disabled so slab code is preempt-safe. * The non-constant members are protected with a per-cache irq spinlock. * + * The slab layout is as follows: + * + * 0 ... PAGE_SIZE << gfp_order + * +-------+----------+----------+----------+----------+----------+----------+ + * | | | | : : | | + * | Mgmt | Buffer 1 | Buffer 2 | .... : .... : .... | Buffer n | + * | | | | : : | | + * +-------+----------+----------+----------+----------+----------+----------+ + * + * Legend: + * + * Mgmt Slab management and buffer control (optional). Use the + * slab_mgmt_size() function to retrieve size of this block. + * + * Buffer 1..n Equally-sized buffers. Buffer size is stored in the + * buffer_size member of struct kmem_cache. + * + * + * The buffer layout is as follows: + * + * 0 ... buffer_size + * +---------+---+---------------------------------------------------+---+---+ + * | : | : : | + * | Padding : R | Object : R : C | + * | : | : : | + * +---------+---+---------------------------------------------------+---+---+ + * + * Legend: + * + * Padding Empty space to ensure alignment restrictions of this object + * cache. + * Object The actual object. Use the obj_offset() function to retrieve + * the start offset of an object and obj_size() function for the + * object size. + * R Red-zone word. These two redzone words are BYTES_PER_WORD in + * size and they are used to detect buffer overruns. + * C Address of last user of this buffer. This field is + * BYTES_PER_WORD in size. + * * Many thanks to Mark Hemment, who wrote another per-cpu slab patch * in 2000 - many ideas in the current implementation are derived from * his patch. - * - * Further notes from the original documentation: - * - * 11 April '97. Started multi-threading - markhe - * The global cache-chain is protected by the semaphore 'cache_chain_sem'. - * The sem is only needed when accessing/extending the cache-chain, which - * can never happen inside an interrupt (kmem_cache_create(), - * kmem_cache_shrink() and kmem_cache_reap()). - * - * At present, each engine can be growing a cache. This should be blocked. - * - * 15 March 2005. NUMA slab allocator. - * Shai Fultheim . - * Shobhit Dayal - * Alok N Kataria - * Christoph Lameter - * - * Modified the slab allocator to be node aware on NUMA systems. - * Each node has its own list of partial, free and full slabs. - * All object allocations for a node occur from node specific slab lists. */ #include @@ -506,17 +520,6 @@ struct kmem_cache { #define POISON_FREE 0x6b /* for use-after-free poisoning */ #define POISON_END 0xa5 /* end-byte of poisoning */ -/* memory layout of objects: - * 0 : objp - * 0 .. cachep->obj_offset - BYTES_PER_WORD - 1: padding. This ensures that - * the end of an object is aligned with the end of the real - * allocation. Catches writes behind the end of the allocation. - * cachep->obj_offset - BYTES_PER_WORD .. cachep->obj_offset - 1: - * redzone word. - * cachep->obj_offset: The real object. - * cachep->buffer_size - 2* BYTES_PER_WORD: redzone word [BYTES_PER_WORD long] - * cachep->buffer_size - 1* BYTES_PER_WORD: last caller address [BYTES_PER_WORD long] - */ static int obj_offset(struct kmem_cache *cachep) { return cachep->obj_offset;