linux-mm.kvack.org archive mirror
* [RFC] SL[AUO]B common code 0/9] Sl[auo]b: Common functionality V1
@ 2012-05-14 20:15 Christoph Lameter
  2012-05-14 20:15 ` [RFC] SL[AUO]B common code 1/9] [slob] define page struct fields used in mm_types.h Christoph Lameter
                   ` (9 more replies)
  0 siblings, 10 replies; 38+ messages in thread
From: Christoph Lameter @ 2012-05-14 20:15 UTC (permalink / raw)
  To: Pekka Enberg; +Cc: linux-mm, David Rientjes, Matt Mackall

This is a series of patches that extracts common functionality from
slab allocators into a common code base. The intent is to standardize
as much of the allocator behavior as possible while keeping the
distinctive features of each allocator, which are mostly due to their
storage formats and serialization approaches.

This patchset starts by extracting the common functionality of
kmem_cache_create() and kmem_cache_destroy(). However, there are
numerous other areas where such work could be beneficial:

1. Extract the sysfs support from SLUB and make it common. That way
   all allocators have a common sysfs API and can be handled in the same
   way regardless of the allocator chosen.

2. Extract the error reporting and checking from SLUB and make
   it available for all allocators. This means that all allocators
   gain the same resiliency and error-handling capabilities.

3. Extract the memory hotplug and cpu hotplug handling. SLAB seems
   to be more sophisticated here. Having common code here will
   make it easier to maintain the allocator-specific code.

4. Extract the aliasing capability of SLUB. This will enable fast
   slab creation without creating too many additional slab caches,
   so that the arrays of caches of varying sizes in numerous subsystems
   do not cause the creation of numerous slab caches. Storage
   density is increased and the cache footprint is reduced (see the
   sketch below).
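
As a rough illustration of what aliasing means (a hedged sketch, not the
SLUB code that would be extracted; it assumes SLUB's existing
find_mergeable() helper and the low-level create path introduced later
in this series, and omits locking and error handling):

	/*
	 * Reuse a compatible existing cache instead of instantiating
	 * a new one.
	 */
	struct kmem_cache *create_or_alias(const char *name, size_t size,
			size_t align, unsigned long flags, void (*ctor)(void *))
	{
		struct kmem_cache *s;

		s = find_mergeable(size, align, flags, name, ctor);
		if (s) {
			s->refcount++;		/* share the existing cache */
			return s;
		}
		return __kmem_cache_create(name, size, align, flags, ctor);
	}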

Ultimately the hope is that the allocator-specific code shrinks to a
minimum. This will also make it easier to make modifications to the
allocators.

In the far future one could envision that the current allocators will
just become storage algorithms that can be chosen based on the needs of
the subsystem. For example:

CPU cache dependent performance		= Bonwick allocator (SLAB)
Minimal cycle count and cache footprint	= SLUB
Maximum storage density			= SLOB

But that could be controversial and inefficient if indirect calls are needed.



* [RFC] SL[AUO]B common code 1/9] [slob] define page struct fields used in mm_types.h
  2012-05-14 20:15 [RFC] SL[AUO]B common code 0/9] Sl[auo]b: Common functionality V1 Christoph Lameter
@ 2012-05-14 20:15 ` Christoph Lameter
  2012-05-16  7:31   ` Glauber Costa
  2012-05-14 20:15 ` [RFC] SL[AUO]B common code 2/9] [slab]: Use page struct fields instead of casting Christoph Lameter
                   ` (8 subsequent siblings)
  9 siblings, 1 reply; 38+ messages in thread
From: Christoph Lameter @ 2012-05-14 20:15 UTC (permalink / raw)
  To: Pekka Enberg; +Cc: linux-mm, David Rientjes, Matt Mackall

[-- Attachment #1: slob_use_page_struct --]
[-- Type: text/plain, Size: 9015 bytes --]

Define the fields used by slob in mm_types.h and use struct page instead
of struct slob_page in slob. This cleans up a lot of typecasts in slob.c and
makes readers aware of slob's use of page struct fields.

[Also cleans up some bitrot in slob.c: the page struct field layout used
there is an old one and no longer matches the layout in mm_types.h.]
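
For reference, a condensed view of the struct page fields that slob uses
after this patch (a simplified sketch; the real definition in mm_types.h
keeps these fields inside unions shared with the other users of struct
page):

	struct page {
		unsigned long flags;	/* PG_slab / PG_slob_free state bits */
		void *freelist;		/* first free slob_t in the page */
		int units;		/* free units left in the page */
		struct list_head lru;	/* link on the free_slob_* lists */
		unsigned long private;	/* size of big kmalloc allocations */
		/* ... remaining struct page fields ... */
	};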

Signed-off-by: Christoph Lameter <cl@linux.com>

---
 include/linux/mm_types.h |    5 +-
 mm/slob.c                |  105 ++++++++++++++++++-----------------------------
 2 files changed, 45 insertions(+), 65 deletions(-)

Index: linux-2.6/mm/slob.c
===================================================================
--- linux-2.6.orig/mm/slob.c	2012-04-11 07:07:16.672129272 -0500
+++ linux-2.6/mm/slob.c	2012-04-11 07:12:34.344122598 -0500
@@ -92,33 +92,12 @@ struct slob_block {
 typedef struct slob_block slob_t;
 
 /*
- * We use struct page fields to manage some slob allocation aspects,
- * however to avoid the horrible mess in include/linux/mm_types.h, we'll
- * just define our own struct page type variant here.
- */
-struct slob_page {
-	union {
-		struct {
-			unsigned long flags;	/* mandatory */
-			atomic_t _count;	/* mandatory */
-			slobidx_t units;	/* free units left in page */
-			unsigned long pad[2];
-			slob_t *free;		/* first free slob_t in page */
-			struct list_head list;	/* linked list of free pages */
-		};
-		struct page page;
-	};
-};
-static inline void struct_slob_page_wrong_size(void)
-{ BUILD_BUG_ON(sizeof(struct slob_page) != sizeof(struct page)); }
-
-/*
  * free_slob_page: call before a slob_page is returned to the page allocator.
  */
-static inline void free_slob_page(struct slob_page *sp)
+static inline void free_slob_page(struct page *sp)
 {
-	reset_page_mapcount(&sp->page);
-	sp->page.mapping = NULL;
+	reset_page_mapcount(sp);
+	sp->mapping = NULL;
 }
 
 /*
@@ -133,44 +112,44 @@ static LIST_HEAD(free_slob_large);
 /*
  * is_slob_page: True for all slob pages (false for bigblock pages)
  */
-static inline int is_slob_page(struct slob_page *sp)
+static inline int is_slob_page(struct page *sp)
 {
-	return PageSlab((struct page *)sp);
+	return PageSlab(sp);
 }
 
-static inline void set_slob_page(struct slob_page *sp)
+static inline void set_slob_page(struct page *sp)
 {
-	__SetPageSlab((struct page *)sp);
+	__SetPageSlab(sp);
 }
 
-static inline void clear_slob_page(struct slob_page *sp)
+static inline void clear_slob_page(struct page *sp)
 {
-	__ClearPageSlab((struct page *)sp);
+	__ClearPageSlab(sp);
 }
 
-static inline struct slob_page *slob_page(const void *addr)
+static inline struct page *slob_page(const void *addr)
 {
-	return (struct slob_page *)virt_to_page(addr);
+	return virt_to_page(addr);
 }
 
 /*
  * slob_page_free: true for pages on free_slob_pages list.
  */
-static inline int slob_page_free(struct slob_page *sp)
+static inline int slob_page_free(struct page *sp)
 {
-	return PageSlobFree((struct page *)sp);
+	return PageSlobFree(sp);
 }
 
-static void set_slob_page_free(struct slob_page *sp, struct list_head *list)
+static void set_slob_page_free(struct page *sp, struct list_head *list)
 {
-	list_add(&sp->list, list);
-	__SetPageSlobFree((struct page *)sp);
+	list_add(&sp->lru, list);
+	__SetPageSlobFree(sp);
 }
 
-static inline void clear_slob_page_free(struct slob_page *sp)
+static inline void clear_slob_page_free(struct page *sp)
 {
-	list_del(&sp->list);
-	__ClearPageSlobFree((struct page *)sp);
+	list_del(&sp->lru);
+	__ClearPageSlobFree(sp);
 }
 
 #define SLOB_UNIT sizeof(slob_t)
@@ -267,12 +246,12 @@ static void slob_free_pages(void *b, int
 /*
  * Allocate a slob block within a given slob_page sp.
  */
-static void *slob_page_alloc(struct slob_page *sp, size_t size, int align)
+static void *slob_page_alloc(struct page *sp, size_t size, int align)
 {
 	slob_t *prev, *cur, *aligned = NULL;
 	int delta = 0, units = SLOB_UNITS(size);
 
-	for (prev = NULL, cur = sp->free; ; prev = cur, cur = slob_next(cur)) {
+	for (prev = NULL, cur = sp->freelist; ; prev = cur, cur = slob_next(cur)) {
 		slobidx_t avail = slob_units(cur);
 
 		if (align) {
@@ -296,12 +275,12 @@ static void *slob_page_alloc(struct slob
 				if (prev)
 					set_slob(prev, slob_units(prev), next);
 				else
-					sp->free = next;
+					sp->freelist = next;
 			} else { /* fragment */
 				if (prev)
 					set_slob(prev, slob_units(prev), cur + units);
 				else
-					sp->free = cur + units;
+					sp->freelist = cur + units;
 				set_slob(cur + units, avail - units, next);
 			}
 
@@ -320,7 +299,7 @@ static void *slob_page_alloc(struct slob
  */
 static void *slob_alloc(size_t size, gfp_t gfp, int align, int node)
 {
-	struct slob_page *sp;
+	struct page *sp;
 	struct list_head *prev;
 	struct list_head *slob_list;
 	slob_t *b = NULL;
@@ -335,13 +314,13 @@ static void *slob_alloc(size_t size, gfp
 
 	spin_lock_irqsave(&slob_lock, flags);
 	/* Iterate through each partially free page, try to find room */
-	list_for_each_entry(sp, slob_list, list) {
+	list_for_each_entry(sp, slob_list, lru) {
 #ifdef CONFIG_NUMA
 		/*
 		 * If there's a node specification, search for a partial
 		 * page with a matching node id in the freelist.
 		 */
-		if (node != -1 && page_to_nid(&sp->page) != node)
+		if (node != -1 && page_to_nid(sp) != node)
 			continue;
 #endif
 		/* Enough room on this page? */
@@ -349,7 +328,7 @@ static void *slob_alloc(size_t size, gfp
 			continue;
 
 		/* Attempt to alloc */
-		prev = sp->list.prev;
+		prev = sp->lru.prev;
 		b = slob_page_alloc(sp, size, align);
 		if (!b)
 			continue;
@@ -374,8 +353,8 @@ static void *slob_alloc(size_t size, gfp
 
 		spin_lock_irqsave(&slob_lock, flags);
 		sp->units = SLOB_UNITS(PAGE_SIZE);
-		sp->free = b;
-		INIT_LIST_HEAD(&sp->list);
+		sp->freelist = b;
+		INIT_LIST_HEAD(&sp->lru);
 		set_slob(b, SLOB_UNITS(PAGE_SIZE), b + SLOB_UNITS(PAGE_SIZE));
 		set_slob_page_free(sp, slob_list);
 		b = slob_page_alloc(sp, size, align);
@@ -392,7 +371,7 @@ static void *slob_alloc(size_t size, gfp
  */
 static void slob_free(void *block, int size)
 {
-	struct slob_page *sp;
+	struct page *sp;
 	slob_t *prev, *next, *b = (slob_t *)block;
 	slobidx_t units;
 	unsigned long flags;
@@ -421,7 +400,7 @@ static void slob_free(void *block, int s
 	if (!slob_page_free(sp)) {
 		/* This slob page is about to become partially free. Easy! */
 		sp->units = units;
-		sp->free = b;
+		sp->freelist = b;
 		set_slob(b, units,
 			(void *)((unsigned long)(b +
 					SLOB_UNITS(PAGE_SIZE)) & PAGE_MASK));
@@ -441,15 +420,15 @@ static void slob_free(void *block, int s
 	 */
 	sp->units += units;
 
-	if (b < sp->free) {
-		if (b + units == sp->free) {
-			units += slob_units(sp->free);
-			sp->free = slob_next(sp->free);
+	if (b < (slob_t *)sp->freelist) {
+		if (b + units == sp->freelist) {
+			units += slob_units(sp->freelist);
+			sp->freelist = slob_next(sp->freelist);
 		}
-		set_slob(b, units, sp->free);
-		sp->free = b;
+		set_slob(b, units, sp->freelist);
+		sp->freelist = b;
 	} else {
-		prev = sp->free;
+		prev = sp->freelist;
 		next = slob_next(prev);
 		while (b > next) {
 			prev = next;
@@ -522,7 +501,7 @@ EXPORT_SYMBOL(__kmalloc_node);
 
 void kfree(const void *block)
 {
-	struct slob_page *sp;
+	struct page *sp;
 
 	trace_kfree(_RET_IP_, block);
 
@@ -536,14 +515,14 @@ void kfree(const void *block)
 		unsigned int *m = (unsigned int *)(block - align);
 		slob_free(m, *m + align);
 	} else
-		put_page(&sp->page);
+		put_page(sp);
 }
 EXPORT_SYMBOL(kfree);
 
 /* can't use ksize for kmem_cache_alloc memory, only kmalloc */
 size_t ksize(const void *block)
 {
-	struct slob_page *sp;
+	struct page *sp;
 
 	BUG_ON(!block);
 	if (unlikely(block == ZERO_SIZE_PTR))
@@ -555,7 +534,7 @@ size_t ksize(const void *block)
 		unsigned int *m = (unsigned int *)(block - align);
 		return SLOB_UNITS(*m) * SLOB_UNIT;
 	} else
-		return sp->page.private;
+		return sp->private;
 }
 EXPORT_SYMBOL(ksize);
 
Index: linux-2.6/include/linux/mm_types.h
===================================================================
--- linux-2.6.orig/include/linux/mm_types.h	2012-04-11 07:07:16.660129272 -0500
+++ linux-2.6/include/linux/mm_types.h	2012-04-11 07:11:01.944124607 -0500
@@ -52,7 +52,7 @@ struct page {
 	struct {
 		union {
 			pgoff_t index;		/* Our offset within mapping. */
-			void *freelist;		/* slub first free object */
+			void *freelist;		/* slub/slob first free object */
 		};
 
 		union {
@@ -80,11 +80,12 @@ struct page {
 					 */
 					atomic_t _mapcount;
 
-					struct {
+					struct { /* SLUB */
 						unsigned inuse:16;
 						unsigned objects:15;
 						unsigned frozen:1;
 					};
+					int units;	/* SLOB */
 				};
 				atomic_t _count;		/* Usage count, see below. */
 			};



* [RFC] SL[AUO]B common code 2/9] [slab]: Use page struct fields instead of casting
  2012-05-14 20:15 [RFC] SL[AUO]B common code 0/9] Sl[auo]b: Common functionality V1 Christoph Lameter
  2012-05-14 20:15 ` [RFC] SL[AUO]B common code 1/9] [slob] define page struct fields used in mm_types.h Christoph Lameter
@ 2012-05-14 20:15 ` Christoph Lameter
  2012-05-16  7:52   ` Glauber Costa
  2012-05-14 20:15 ` [RFC] SL[AUO]B common code 3/9] Extract common fields from struct kmem_cache Christoph Lameter
                   ` (7 subsequent siblings)
  9 siblings, 1 reply; 38+ messages in thread
From: Christoph Lameter @ 2012-05-14 20:15 UTC (permalink / raw)
  To: Pekka Enberg; +Cc: linux-mm, David Rientjes, Matt Mackall

[-- Attachment #1: slab_page_struct_fields --]
[-- Type: text/plain, Size: 2141 bytes --]

Add fields to the page struct so that it is properly documented that
slab overlays the lru fields.

This cleans up some casts in slab.
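
As an illustration of the effect (slightly simplified from the helpers in
the patch), an object can now be mapped back to its cache through a named
field instead of an overloaded lru pointer:

	static inline struct kmem_cache *virt_to_cache(const void *obj)
	{
		struct page *page = virt_to_head_page(obj);

		BUG_ON(!PageSlab(page));
		return page->slab_cache;  /* was (struct kmem_cache *)page->lru.next */
	}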

Signed-off-by: Christoph Lameter <cl@linux.com>

---
 include/linux/mm_types.h |    4 ++++
 mm/slab.c                |    8 ++++----
 2 files changed, 8 insertions(+), 4 deletions(-)

Index: linux-2.6/mm/slab.c
===================================================================
--- linux-2.6.orig/mm/slab.c	2012-04-19 09:05:04.397660703 -0500
+++ linux-2.6/mm/slab.c	2012-04-19 09:15:23.561647873 -0500
@@ -496,25 +496,25 @@ static bool slab_max_order_set __initdat
  */
 static inline void page_set_cache(struct page *page, struct kmem_cache *cache)
 {
-	page->lru.next = (struct list_head *)cache;
+	page->slab_cache = cache;
 }
 
 static inline struct kmem_cache *page_get_cache(struct page *page)
 {
 	page = compound_head(page);
 	BUG_ON(!PageSlab(page));
-	return (struct kmem_cache *)page->lru.next;
+	return page->slab_cache;
 }
 
 static inline void page_set_slab(struct page *page, struct slab *slab)
 {
-	page->lru.prev = (struct list_head *)slab;
+	page->slab_page = slab;
 }
 
 static inline struct slab *page_get_slab(struct page *page)
 {
 	BUG_ON(!PageSlab(page));
-	return (struct slab *)page->lru.prev;
+	return page->slab_page;
 }
 
 static inline struct kmem_cache *virt_to_cache(const void *obj)
Index: linux-2.6/include/linux/mm_types.h
===================================================================
--- linux-2.6.orig/include/linux/mm_types.h	2012-04-19 09:05:10.221660582 -0500
+++ linux-2.6/include/linux/mm_types.h	2012-04-19 09:18:10.657644409 -0500
@@ -107,6 +107,10 @@ struct page {
 			short int pobjects;
 #endif
 		};
+		struct {		/* SLAB */
+			struct kmem_cache *slab_cache;
+			struct slab *slab_page;
+		};
 	};
 
 	/* Remainder is not double word aligned */



* [RFC] SL[AUO]B common code 3/9] Extract common fields from struct kmem_cache
  2012-05-14 20:15 [RFC] SL[AUO]B common code 0/9] Sl[auo]b: Common functionality V1 Christoph Lameter
  2012-05-14 20:15 ` [RFC] SL[AUO]B common code 1/9] [slob] define page struct fields used in mm_types.h Christoph Lameter
  2012-05-14 20:15 ` [RFC] SL[AUO]B common code 2/9] [slab]: Use page struct fields instead of casting Christoph Lameter
@ 2012-05-14 20:15 ` Christoph Lameter
  2012-05-16  7:59   ` Glauber Costa
       [not found]   ` <alpine.LFD.2.02.1205160943180.2249@tux.localdomain>
  2012-05-14 20:15 ` [RFC] SL[AUO]B common code 4/9] slabs: Extract common code for kmem_cache_create Christoph Lameter
                   ` (6 subsequent siblings)
  9 siblings, 2 replies; 38+ messages in thread
From: Christoph Lameter @ 2012-05-14 20:15 UTC (permalink / raw)
  To: Pekka Enberg; +Cc: linux-mm, David Rientjes, Matt Mackall

[-- Attachment #1: common_fields --]
[-- Type: text/plain, Size: 9180 bytes --]

Define "COMMON" to include definitions for fields used in all
slab allocators. After that it will be possible to share code that
only operates on those fields of kmem_cache.

The patch basically takes the slob definition of kmem cache and
uses the field namees for the other allocators.

The slob definition of kmem_cache is moved from slob.c to slob_def.h
so that the location of the kmem_cache definition is the same for
all allocators.
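
A hypothetical example of the kind of code that becomes allocator
independent once the common fields exist (dump_caches() is made up for
illustration, and slab_caches/slab_mutex are only introduced later in
this series; the point is that the field accesses work for SLAB, SLUB
and SLOB alike):

	static void dump_caches(void)
	{
		struct kmem_cache *s;

		mutex_lock(&slab_mutex);
		list_for_each_entry(s, &slab_caches, list)
			printk(KERN_INFO "%s: size=%u align=%u refcount=%d\n",
				s->name, s->size, s->align, s->refcount);
		mutex_unlock(&slab_mutex);
	}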

Signed-off-by: Christoph Lameter <cl@linux.com>

---
 include/linux/slab.h     |   11 +++++++++++
 include/linux/slab_def.h |    8 ++------
 include/linux/slub_def.h |   11 ++++-------
 mm/slab.c                |   30 +++++++++++++++---------------
 mm/slob.c                |    7 -------
 5 files changed, 32 insertions(+), 35 deletions(-)

Index: linux-2.6/include/linux/slab.h
===================================================================
--- linux-2.6.orig/include/linux/slab.h	2012-05-11 08:34:30.272522792 -0500
+++ linux-2.6/include/linux/slab.h	2012-05-11 09:36:35.292445608 -0500
@@ -93,6 +93,17 @@
 				(unsigned long)ZERO_SIZE_PTR)
 
 /*
+ * Common fields provided in kmem_cache by all slab allocators
+ */
+#define SLAB_COMMON \
+	unsigned int size, align;					\
+	unsigned long flags;						\
+	const char *name;						\
+	int refcount;							\
+	void (*ctor)(void *);						\
+	struct list_head list;
+
+/*
  * struct kmem_cache related prototypes
  */
 void __init kmem_cache_init(void);
Index: linux-2.6/include/linux/slab_def.h
===================================================================
--- linux-2.6.orig/include/linux/slab_def.h	2012-05-11 08:34:30.272522793 -0500
+++ linux-2.6/include/linux/slab_def.h	2012-05-11 08:34:32.708522751 -0500
@@ -31,7 +31,6 @@ struct kmem_cache {
 	u32 reciprocal_buffer_size;
 /* 2) touched by every alloc & free from the backend */
 
-	unsigned int flags;		/* constant flags */
 	unsigned int num;		/* # of objs per slab */
 
 /* 3) cache_grow/shrink */
@@ -47,12 +46,9 @@ struct kmem_cache {
 	unsigned int slab_size;
 	unsigned int dflags;		/* dynamic flags */
 
-	/* constructor func */
-	void (*ctor)(void *obj);
-
 /* 4) cache creation/removal */
-	const char *name;
-	struct list_head next;
+
+	SLAB_COMMON
 
 /* 5) statistics */
 #ifdef CONFIG_DEBUG_SLAB
Index: linux-2.6/include/linux/slub_def.h
===================================================================
--- linux-2.6.orig/include/linux/slub_def.h	2012-05-11 08:34:30.272522793 -0500
+++ linux-2.6/include/linux/slub_def.h	2012-05-11 08:34:32.708522751 -0500
@@ -80,9 +80,7 @@ struct kmem_cache_order_objects {
 struct kmem_cache {
 	struct kmem_cache_cpu __percpu *cpu_slab;
 	/* Used for retriving partial slabs etc */
-	unsigned long flags;
 	unsigned long min_partial;
-	int size;		/* The size of an object including meta data */
 	int objsize;		/* The size of an object without meta data */
 	int offset;		/* Free pointer offset. */
 	int cpu_partial;	/* Number of per cpu partial objects to keep around */
@@ -92,13 +90,12 @@ struct kmem_cache {
 	struct kmem_cache_order_objects max;
 	struct kmem_cache_order_objects min;
 	gfp_t allocflags;	/* gfp flags to use on each alloc */
-	int refcount;		/* Refcount for slab cache destroy */
-	void (*ctor)(void *);
+
+	SLAB_COMMON
+
 	int inuse;		/* Offset to metadata */
-	int align;		/* Alignment */
 	int reserved;		/* Reserved bytes at the end of slabs */
-	const char *name;	/* Name (only for display!) */
-	struct list_head list;	/* List of slab caches */
+
 #ifdef CONFIG_SYSFS
 	struct kobject kobj;	/* For sysfs */
 #endif
Index: linux-2.6/mm/slob.c
===================================================================
--- linux-2.6.orig/mm/slob.c	2012-05-11 08:34:31.792522763 -0500
+++ linux-2.6/mm/slob.c	2012-05-11 09:42:52.032437799 -0500
@@ -538,13 +538,6 @@ size_t ksize(const void *block)
 }
 EXPORT_SYMBOL(ksize);
 
-struct kmem_cache {
-	unsigned int size, align;
-	unsigned long flags;
-	const char *name;
-	void (*ctor)(void *);
-};
-
 struct kmem_cache *kmem_cache_create(const char *name, size_t size,
 	size_t align, unsigned long flags, void (*ctor)(void *))
 {
Index: linux-2.6/mm/slab.c
===================================================================
--- linux-2.6.orig/mm/slab.c	2012-05-11 08:34:32.292522756 -0500
+++ linux-2.6/mm/slab.c	2012-05-11 09:36:35.308445605 -0500
@@ -1153,7 +1153,7 @@ static int init_cache_nodelists_node(int
 	struct kmem_list3 *l3;
 	const int memsize = sizeof(struct kmem_list3);
 
-	list_for_each_entry(cachep, &cache_chain, next) {
+	list_for_each_entry(cachep, &cache_chain, list) {
 		/*
 		 * Set up the size64 kmemlist for cpu before we can
 		 * begin anything. Make sure some other cpu on this
@@ -1191,7 +1191,7 @@ static void __cpuinit cpuup_canceled(lon
 	int node = cpu_to_mem(cpu);
 	const struct cpumask *mask = cpumask_of_node(node);
 
-	list_for_each_entry(cachep, &cache_chain, next) {
+	list_for_each_entry(cachep, &cache_chain, list) {
 		struct array_cache *nc;
 		struct array_cache *shared;
 		struct array_cache **alien;
@@ -1241,7 +1241,7 @@ free_array_cache:
 	 * the respective cache's slabs,  now we can go ahead and
 	 * shrink each nodelist to its limit.
 	 */
-	list_for_each_entry(cachep, &cache_chain, next) {
+	list_for_each_entry(cachep, &cache_chain, list) {
 		l3 = cachep->nodelists[node];
 		if (!l3)
 			continue;
@@ -1270,7 +1270,7 @@ static int __cpuinit cpuup_prepare(long
 	 * Now we can go ahead with allocating the shared arrays and
 	 * array caches
 	 */
-	list_for_each_entry(cachep, &cache_chain, next) {
+	list_for_each_entry(cachep, &cache_chain, list) {
 		struct array_cache *nc;
 		struct array_cache *shared = NULL;
 		struct array_cache **alien = NULL;
@@ -1402,7 +1402,7 @@ static int __meminit drain_cache_nodelis
 	struct kmem_cache *cachep;
 	int ret = 0;
 
-	list_for_each_entry(cachep, &cache_chain, next) {
+	list_for_each_entry(cachep, &cache_chain, list) {
 		struct kmem_list3 *l3;
 
 		l3 = cachep->nodelists[node];
@@ -1545,7 +1545,7 @@ void __init kmem_cache_init(void)
 
 	/* 1) create the cache_cache */
 	INIT_LIST_HEAD(&cache_chain);
-	list_add(&cache_cache.next, &cache_chain);
+	list_add(&cache_cache.list, &cache_chain);
 	cache_cache.colour_off = cache_line_size();
 	cache_cache.array[smp_processor_id()] = &initarray_cache.cache;
 	cache_cache.nodelists[node] = &initkmem_list3[CACHE_CACHE + node];
@@ -1690,7 +1690,7 @@ void __init kmem_cache_init_late(void)
 
 	/* 6) resize the head arrays to their final sizes */
 	mutex_lock(&cache_chain_mutex);
-	list_for_each_entry(cachep, &cache_chain, next)
+	list_for_each_entry(cachep, &cache_chain, list)
 		if (enable_cpucache(cachep, GFP_NOWAIT))
 			BUG();
 	mutex_unlock(&cache_chain_mutex);
@@ -2300,7 +2300,7 @@ kmem_cache_create (const char *name, siz
 		mutex_lock(&cache_chain_mutex);
 	}
 
-	list_for_each_entry(pc, &cache_chain, next) {
+	list_for_each_entry(pc, &cache_chain, list) {
 		char tmp;
 		int res;
 
@@ -2545,7 +2545,7 @@ kmem_cache_create (const char *name, siz
 	}
 
 	/* cache setup completed, link it into the list */
-	list_add(&cachep->next, &cache_chain);
+	list_add(&cachep->list, &cache_chain);
 oops:
 	if (!cachep && (flags & SLAB_PANIC))
 		panic("kmem_cache_create(): failed to create slab `%s'\n",
@@ -2740,10 +2740,10 @@ void kmem_cache_destroy(struct kmem_cach
 	/*
 	 * the chain is never empty, cache_cache is never destroyed
 	 */
-	list_del(&cachep->next);
+	list_del(&cachep->list);
 	if (__cache_shrink(cachep)) {
 		slab_error(cachep, "Can't free all objects");
-		list_add(&cachep->next, &cache_chain);
+		list_add(&cachep->list, &cache_chain);
 		mutex_unlock(&cache_chain_mutex);
 		put_online_cpus();
 		return;
@@ -4030,7 +4030,7 @@ static int alloc_kmemlist(struct kmem_ca
 	return 0;
 
 fail:
-	if (!cachep->next.next) {
+	if (!cachep->list.next) {
 		/* Cache is not active yet. Roll back what we did */
 		node--;
 		while (node >= 0) {
@@ -4215,7 +4215,7 @@ static void cache_reap(struct work_struc
 		/* Give up. Setup the next iteration. */
 		goto out;
 
-	list_for_each_entry(searchp, &cache_chain, next) {
+	list_for_each_entry(searchp, &cache_chain, list) {
 		check_irq_on();
 
 		/*
@@ -4308,7 +4308,7 @@ static void s_stop(struct seq_file *m, v
 
 static int s_show(struct seq_file *m, void *p)
 {
-	struct kmem_cache *cachep = list_entry(p, struct kmem_cache, next);
+	struct kmem_cache *cachep = list_entry(p, struct kmem_cache, list);
 	struct slab *slabp;
 	unsigned long active_objs;
 	unsigned long num_objs;
@@ -4456,7 +4456,7 @@ static ssize_t slabinfo_write(struct fil
 	/* Find the cache in the chain of caches. */
 	mutex_lock(&cache_chain_mutex);
 	res = -EINVAL;
-	list_for_each_entry(cachep, &cache_chain, next) {
+	list_for_each_entry(cachep, &cache_chain, list) {
 		if (!strcmp(cachep->name, kbuf)) {
 			if (limit < 1 || batchcount < 1 ||
 					batchcount > limit || shared < 0) {



* [RFC] SL[AUO]B common code 4/9] slabs: Extract common code for kmem_cache_create
  2012-05-14 20:15 [RFC] SL[AUO]B common code 0/9] Sl[auo]b: Common functionality V1 Christoph Lameter
                   ` (2 preceding siblings ...)
  2012-05-14 20:15 ` [RFC] SL[AUO]B common code 3/9] Extract common fields from struct kmem_cache Christoph Lameter
@ 2012-05-14 20:15 ` Christoph Lameter
  2012-05-14 20:15 ` [RFC] SL[AUO]B common code 5/9] slabs: Common definition for boot state of the slab allocators Christoph Lameter
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 38+ messages in thread
From: Christoph Lameter @ 2012-05-14 20:15 UTC (permalink / raw)
  To: Pekka Enberg; +Cc: linux-mm, David Rientjes, Matt Mackall

[-- Attachment #1: common_kmem_cache_checks --]
[-- Type: text/plain, Size: 9749 bytes --]

kmem_cache_create() does a variety of sanity checks, but they vary
depending on the allocator. Use the strictest tests and put them into
a common slab_common.c file. Make the tests conditional on CONFIG_DEBUG_VM.

This patch has the effect of adding sanity checks for SLUB and SLOB
under CONFIG_DEBUG_VM and of removing the checks in SLAB for !CONFIG_DEBUG_VM.
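
From the caller's point of view nothing changes. A typical user still
looks like the sketch below (struct foo and the cache name are made up
for illustration); the argument checks and the SLAB_PANIC handling are
simply performed in the common wrapper for all three allocators now:

	struct foo {
		int a;
		struct list_head list;
	};

	static struct kmem_cache *foo_cachep;

	static int __init foo_init(void)
	{
		foo_cachep = kmem_cache_create("foo_cache", sizeof(struct foo),
						0, SLAB_HWCACHE_ALIGN, NULL);
		if (!foo_cachep)
			return -ENOMEM;
		return 0;
	}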

Signed-off-by: Christoph Lameter <cl@linux.com>

---
 include/linux/slab.h |    4 +++
 mm/Makefile          |    2 -
 mm/slab.c            |   24 ++++++------------
 mm/slab_common.c     |   67 +++++++++++++++++++++++++++++++++++++++++++++++++++
 mm/slob.c            |    8 ++----
 mm/slub.c            |   11 --------
 6 files changed, 85 insertions(+), 31 deletions(-)

Index: linux-2.6/mm/slab.c
===================================================================
--- linux-2.6.orig/mm/slab.c	2012-05-11 09:36:35.308445605 -0500
+++ linux-2.6/mm/slab.c	2012-05-11 09:43:33.160436947 -0500
@@ -1585,7 +1585,7 @@ void __init kmem_cache_init(void)
 	 * bug.
 	 */
 
-	sizes[INDEX_AC].cs_cachep = kmem_cache_create(names[INDEX_AC].name,
+	sizes[INDEX_AC].cs_cachep = __kmem_cache_create(names[INDEX_AC].name,
 					sizes[INDEX_AC].cs_size,
 					ARCH_KMALLOC_MINALIGN,
 					ARCH_KMALLOC_FLAGS|SLAB_PANIC,
@@ -1593,7 +1593,7 @@ void __init kmem_cache_init(void)
 
 	if (INDEX_AC != INDEX_L3) {
 		sizes[INDEX_L3].cs_cachep =
-			kmem_cache_create(names[INDEX_L3].name,
+			__kmem_cache_create(names[INDEX_L3].name,
 				sizes[INDEX_L3].cs_size,
 				ARCH_KMALLOC_MINALIGN,
 				ARCH_KMALLOC_FLAGS|SLAB_PANIC,
@@ -1611,14 +1611,14 @@ void __init kmem_cache_init(void)
 		 * allow tighter packing of the smaller caches.
 		 */
 		if (!sizes->cs_cachep) {
-			sizes->cs_cachep = kmem_cache_create(names->name,
+			sizes->cs_cachep = __kmem_cache_create(names->name,
 					sizes->cs_size,
 					ARCH_KMALLOC_MINALIGN,
 					ARCH_KMALLOC_FLAGS|SLAB_PANIC,
 					NULL);
 		}
 #ifdef CONFIG_ZONE_DMA
-		sizes->cs_dmacachep = kmem_cache_create(
+		sizes->cs_dmacachep = __kmem_cache_create(
 					names->name_dma,
 					sizes->cs_size,
 					ARCH_KMALLOC_MINALIGN,
@@ -2247,7 +2247,7 @@ static int __init_refok setup_cpu_cache(
 }
 
 /**
- * kmem_cache_create - Create a cache.
+ * __kmem_cache_create - Create a cache.
  * @name: A string which is used in /proc/slabinfo to identify this cache.
  * @size: The size of objects to be created in this cache.
  * @align: The required alignment for the objects.
@@ -2274,7 +2274,7 @@ static int __init_refok setup_cpu_cache(
  * as davem.
  */
 struct kmem_cache *
-kmem_cache_create (const char *name, size_t size, size_t align,
+__kmem_cache_create (const char *name, size_t size, size_t align,
 	unsigned long flags, void (*ctor)(void *))
 {
 	size_t left_over, slab_size, ralign;
@@ -2415,7 +2415,7 @@ kmem_cache_create (const char *name, siz
 	/* Get cache's description obj. */
 	cachep = kmem_cache_zalloc(&cache_cache, gfp);
 	if (!cachep)
-		goto oops;
+		return NULL;
 
 	cachep->nodelists = (struct kmem_list3 **)&cachep->array[nr_cpu_ids];
 #if DEBUG
@@ -2471,8 +2471,7 @@ kmem_cache_create (const char *name, siz
 		printk(KERN_ERR
 		       "kmem_cache_create: couldn't create cache %s.\n", name);
 		kmem_cache_free(&cache_cache, cachep);
-		cachep = NULL;
-		goto oops;
+		return NULL;
 	}
 	slab_size = ALIGN(cachep->num * sizeof(kmem_bufctl_t)
 			  + sizeof(struct slab), align);
@@ -2530,8 +2529,7 @@ kmem_cache_create (const char *name, siz
 
 	if (setup_cpu_cache(cachep, gfp)) {
 		__kmem_cache_destroy(cachep);
-		cachep = NULL;
-		goto oops;
+		return NULL;
 	}
 
 	if (flags & SLAB_DEBUG_OBJECTS) {
@@ -2547,16 +2545,12 @@ kmem_cache_create (const char *name, siz
 	/* cache setup completed, link it into the list */
 	list_add(&cachep->list, &cache_chain);
 oops:
-	if (!cachep && (flags & SLAB_PANIC))
-		panic("kmem_cache_create(): failed to create slab `%s'\n",
-		      name);
 	if (slab_is_available()) {
 		mutex_unlock(&cache_chain_mutex);
 		put_online_cpus();
 	}
 	return cachep;
 }
-EXPORT_SYMBOL(kmem_cache_create);
 
 #if DEBUG
 static void check_irq_off(void)
Index: linux-2.6/mm/slab_common.c
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ linux-2.6/mm/slab_common.c	2012-05-11 09:43:33.160436947 -0500
@@ -0,0 +1,67 @@
+/*
+ * Slab allocator functions that are independent of the allocator strategy
+ *
+ * (C) 2012 Christoph Lameter <cl@linux.com>
+ */
+#include <linux/slab.h>
+
+#include <linux/mm.h>
+#include <linux/poison.h>
+#include <linux/interrupt.h>
+#include <linux/memory.h>
+#include <linux/compiler.h>
+#include <linux/module.h>
+
+#include <asm/cacheflush.h>
+#include <asm/tlbflush.h>
+#include <asm/page.h>
+
+/*
+ * kmem_cache_create - Create a cache.
+ * @name: A string which is used in /proc/slabinfo to identify this cache.
+ * @size: The size of objects to be created in this cache.
+ * @align: The required alignment for the objects.
+ * @flags: SLAB flags
+ * @ctor: A constructor for the objects.
+ *
+ * Returns a ptr to the cache on success, NULL on failure.
+ * Cannot be called within a interrupt, but can be interrupted.
+ * The @ctor is run when new pages are allocated by the cache.
+ *
+ * The flags are
+ *
+ * %SLAB_POISON - Poison the slab with a known test pattern (a5a5a5a5)
+ * to catch references to uninitialised memory.
+ *
+ * %SLAB_RED_ZONE - Insert `Red' zones around the allocated memory to check
+ * for buffer overruns.
+ *
+ * %SLAB_HWCACHE_ALIGN - Align the objects in this cache to a hardware
+ * cacheline.  This can be beneficial if you're counting cycles as closely
+ * as davem.
+ */
+
+struct kmem_cache *kmem_cache_create(const char *name, size_t size, size_t align,
+		unsigned long flags, void (*ctor)(void *))
+{
+	struct kmem_cache *s = NULL;
+
+#ifdef CONFIG_DEBUG_VM
+	if (!name || in_interrupt() || size < sizeof(void *) ||
+		size > KMALLOC_MAX_SIZE) {
+		printk(KERN_ERR "kmem_cache_create(%s) integrity check"
+			" failed\n", name);
+		goto out;
+	}
+#endif
+
+	s = __kmem_cache_create(name, size, align, flags, ctor);
+
+out:
+	if (!s && (flags & SLAB_PANIC))
+		panic("kmem_cache_create: Failed to create slab '%s'\n", name);
+
+	return s;
+}
+EXPORT_SYMBOL(kmem_cache_create);
+
Index: linux-2.6/mm/slub.c
===================================================================
--- linux-2.6.orig/mm/slub.c	2012-05-11 09:36:35.320445605 -0500
+++ linux-2.6/mm/slub.c	2012-05-11 09:43:33.164436947 -0500
@@ -3921,15 +3921,12 @@ static struct kmem_cache *find_mergeable
 	return NULL;
 }
 
-struct kmem_cache *kmem_cache_create(const char *name, size_t size,
+struct kmem_cache *__kmem_cache_create(const char *name, size_t size,
 		size_t align, unsigned long flags, void (*ctor)(void *))
 {
 	struct kmem_cache *s;
 	char *n;
 
-	if (WARN_ON(!name))
-		return NULL;
-
 	down_write(&slub_lock);
 	s = find_mergeable(size, align, flags, name, ctor);
 	if (s) {
@@ -3973,14 +3970,8 @@ struct kmem_cache *kmem_cache_create(con
 	}
 err:
 	up_write(&slub_lock);
-
-	if (flags & SLAB_PANIC)
-		panic("Cannot create slabcache %s\n", name);
-	else
-		s = NULL;
 	return s;
 }
-EXPORT_SYMBOL(kmem_cache_create);
 
 #ifdef CONFIG_SMP
 /*
Index: linux-2.6/mm/slob.c
===================================================================
--- linux-2.6.orig/mm/slob.c	2012-05-11 09:42:52.032437799 -0500
+++ linux-2.6/mm/slob.c	2012-05-11 09:43:33.164436947 -0500
@@ -538,7 +538,7 @@ size_t ksize(const void *block)
 }
 EXPORT_SYMBOL(ksize);
 
-struct kmem_cache *kmem_cache_create(const char *name, size_t size,
+struct kmem_cache *__kmem_cache_create(const char *name, size_t size,
 	size_t align, unsigned long flags, void (*ctor)(void *))
 {
 	struct kmem_cache *c;
@@ -561,13 +561,11 @@ struct kmem_cache *kmem_cache_create(con
 			c->align = ARCH_SLAB_MINALIGN;
 		if (c->align < align)
 			c->align = align;
-	} else if (flags & SLAB_PANIC)
-		panic("Cannot create slab cache %s\n", name);
 
-	kmemleak_alloc(c, sizeof(struct kmem_cache), 1, GFP_KERNEL);
+		kmemleak_alloc(c, sizeof(struct kmem_cache), 1, GFP_KERNEL);
+	}
 	return c;
 }
-EXPORT_SYMBOL(kmem_cache_create);
 
 void kmem_cache_destroy(struct kmem_cache *c)
 {
Index: linux-2.6/mm/Makefile
===================================================================
--- linux-2.6.orig/mm/Makefile	2012-05-11 09:36:35.328445605 -0500
+++ linux-2.6/mm/Makefile	2012-05-11 09:43:33.164436947 -0500
@@ -13,7 +13,7 @@ obj-y			:= filemap.o mempool.o oom_kill.
 			   readahead.o swap.o truncate.o vmscan.o shmem.o \
 			   prio_tree.o util.o mmzone.o vmstat.o backing-dev.o \
 			   page_isolation.o mm_init.o mmu_context.o percpu.o \
-			   $(mmu-y)
+			   slab_common.o $(mmu-y)
 obj-y += init-mm.o
 
 ifdef CONFIG_NO_BOOTMEM
Index: linux-2.6/include/linux/slab.h
===================================================================
--- linux-2.6.orig/include/linux/slab.h	2012-05-11 09:36:35.292445608 -0500
+++ linux-2.6/include/linux/slab.h	2012-05-11 09:43:33.164436947 -0500
@@ -117,6 +117,10 @@ int kmem_cache_shrink(struct kmem_cache
 void kmem_cache_free(struct kmem_cache *, void *);
 unsigned int kmem_cache_size(struct kmem_cache *);
 
+/* Slab internal function */
+struct kmem_cache *__kmem_cache_create(const char *, size_t, size_t,
+			unsigned long,
+			void (*)(void *));
 /*
  * Please use this macro to create slab caches. Simply specify the
  * name of the structure and maybe some flags that are listed above.



* [RFC] SL[AUO]B common code 5/9] slabs: Common definition for boot state of the slab allocators
  2012-05-14 20:15 [RFC] SL[AUO]B common code 0/9] Sl[auo]b: Common functionality V1 Christoph Lameter
                   ` (3 preceding siblings ...)
  2012-05-14 20:15 ` [RFC] SL[AUO]B common code 4/9] slabs: Extract common code for kmem_cache_create Christoph Lameter
@ 2012-05-14 20:15 ` Christoph Lameter
  2012-05-16  8:19   ` Glauber Costa
  2012-05-14 20:15 ` [RFC] SL[AUO]B common code 6/9] slabs: Use a common mutex definition Christoph Lameter
                   ` (4 subsequent siblings)
  9 siblings, 1 reply; 38+ messages in thread
From: Christoph Lameter @ 2012-05-14 20:15 UTC (permalink / raw)
  To: Pekka Enberg; +Cc: linux-mm, David Rientjes, Matt Mackall

[-- Attachment #1: slab_internal --]
[-- Type: text/plain, Size: 9064 bytes --]

All allocators have some sort of support for tracking their bootstrap
status.

Set up a common definition of the boot states and make all slab
allocators use that definition.
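
An illustrative (hypothetical) use of the common state: early boot code
that cannot rely on the slab allocator yet can key off slab_is_available(),
which after this patch is simply a comparison against the shared enum:

	/* Sketch only: early_or_slab_alloc() is made up for illustration. */
	void *early_or_slab_alloc(size_t size)
	{
		if (slab_is_available())	/* slab_state >= UP */
			return kmalloc(size, GFP_KERNEL);
		return alloc_bootmem(size);	/* boot-time fallback */
	}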

Signed-off-by: Christoph Lameter <cl@linux.com>

---
 include/linux/slab.h |    4 ----
 mm/slab.c            |   42 +++++++++++-------------------------------
 mm/slab.h            |   30 ++++++++++++++++++++++++++++++
 mm/slab_common.c     |    9 +++++++++
 mm/slob.c            |   14 +++++---------
 mm/slub.c            |   21 +++++----------------
 6 files changed, 60 insertions(+), 60 deletions(-)

Index: linux-2.6/mm/slab.c
===================================================================
--- linux-2.6.orig/mm/slab.c	2012-05-11 09:43:33.160436947 -0500
+++ linux-2.6/mm/slab.c	2012-05-11 09:43:53.448436526 -0500
@@ -87,6 +87,7 @@
  */
 
 #include	<linux/slab.h>
+#include	"slab.h"
 #include	<linux/mm.h>
 #include	<linux/poison.h>
 #include	<linux/swap.h>
@@ -590,27 +591,6 @@ static struct kmem_cache cache_cache = {
 
 #define BAD_ALIEN_MAGIC 0x01020304ul
 
-/*
- * chicken and egg problem: delay the per-cpu array allocation
- * until the general caches are up.
- */
-static enum {
-	NONE,
-	PARTIAL_AC,
-	PARTIAL_L3,
-	EARLY,
-	LATE,
-	FULL
-} g_cpucache_up;
-
-/*
- * used by boot code to determine if it can use slab based allocator
- */
-int slab_is_available(void)
-{
-	return g_cpucache_up >= EARLY;
-}
-
 #ifdef CONFIG_LOCKDEP
 
 /*
@@ -676,7 +656,7 @@ static void init_node_lock_keys(int q)
 {
 	struct cache_sizes *s = malloc_sizes;
 
-	if (g_cpucache_up < LATE)
+	if (slab_state < UP)
 		return;
 
 	for (s = malloc_sizes; s->cs_size != ULONG_MAX; s++) {
@@ -1676,14 +1656,14 @@ void __init kmem_cache_init(void)
 		}
 	}
 
-	g_cpucache_up = EARLY;
+	slab_state = UP;
 }
 
 void __init kmem_cache_init_late(void)
 {
 	struct kmem_cache *cachep;
 
-	g_cpucache_up = LATE;
+	slab_state = UP;
 
 	/* Annotate slab for lockdep -- annotate the malloc caches */
 	init_lock_keys();
@@ -1696,7 +1676,7 @@ void __init kmem_cache_init_late(void)
 	mutex_unlock(&cache_chain_mutex);
 
 	/* Done! */
-	g_cpucache_up = FULL;
+	slab_state = FULL;
 
 	/*
 	 * Register a cpu startup notifier callback that initializes
@@ -2194,10 +2174,10 @@ static size_t calculate_slab_order(struc
 
 static int __init_refok setup_cpu_cache(struct kmem_cache *cachep, gfp_t gfp)
 {
-	if (g_cpucache_up == FULL)
+	if (slab_state == FULL)
 		return enable_cpucache(cachep, gfp);
 
-	if (g_cpucache_up == NONE) {
+	if (slab_state == DOWN) {
 		/*
 		 * Note: the first kmem_cache_create must create the cache
 		 * that's used by kmalloc(24), otherwise the creation of
@@ -2212,16 +2192,16 @@ static int __init_refok setup_cpu_cache(
 		 */
 		set_up_list3s(cachep, SIZE_AC);
 		if (INDEX_AC == INDEX_L3)
-			g_cpucache_up = PARTIAL_L3;
+			slab_state = PARTIAL_L3;
 		else
-			g_cpucache_up = PARTIAL_AC;
+			slab_state = PARTIAL_ARRAYCACHE;
 	} else {
 		cachep->array[smp_processor_id()] =
 			kmalloc(sizeof(struct arraycache_init), gfp);
 
-		if (g_cpucache_up == PARTIAL_AC) {
+		if (slab_state == PARTIAL_ARRAYCACHE) {
 			set_up_list3s(cachep, SIZE_L3);
-			g_cpucache_up = PARTIAL_L3;
+			slab_state = PARTIAL_L3;
 		} else {
 			int node;
 			for_each_online_node(node) {
Index: linux-2.6/mm/slab.h
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ linux-2.6/mm/slab.h	2012-05-11 09:43:53.448436526 -0500
@@ -0,0 +1,30 @@
+#ifndef MM_SLAB_H
+#define MM_SLAB_H
+/*
+ * Internal slab definitions
+ */
+
+/*
+ * State of the slab allocator.
+ *
+ * This is used to describe the states of the allocator during bootup.
+ * Allocators use this to gradually bootstrap themselves. Most allocators
+ * have the problem that the structures used for managing slab caches are
+ * allocated from slab caches themselves.
+ */
+enum slab_state {
+	DOWN,			/* No slab functionality yet */
+	PARTIAL,		/* SLUB: kmem_cache_node available */
+	PARTIAL_ARRAYCACHE,	/* SLAB: kmalloc size for arraycache available */
+	PARTIAL_L3,		/* SLAB: kmalloc size for l3 struct available */
+	UP,			/* Slab caches usable but not all extras yet */
+	FULL			/* Everything is working */
+};
+
+extern enum slab_state slab_state;
+
+struct kmem_cache *__kmem_cache_create (const char *name, size_t size,
+	size_t align, unsigned long flags, void (*ctor)(void *));
+
+#endif
+
Index: linux-2.6/mm/slob.c
===================================================================
--- linux-2.6.orig/mm/slob.c	2012-05-11 09:43:33.164436947 -0500
+++ linux-2.6/mm/slob.c	2012-05-11 09:43:53.448436526 -0500
@@ -59,6 +59,8 @@
 
 #include <linux/kernel.h>
 #include <linux/slab.h>
+#include "slab.h"
+
 #include <linux/mm.h>
 #include <linux/swap.h> /* struct reclaim_state */
 #include <linux/cache.h>
@@ -563,6 +565,7 @@ struct kmem_cache *__kmem_cache_create(c
 			c->align = align;
 
 		kmemleak_alloc(c, sizeof(struct kmem_cache), 1, GFP_KERNEL);
+		c->refcount = 1;
 	}
 	return c;
 }
@@ -648,19 +651,12 @@ int kmem_cache_shrink(struct kmem_cache
 }
 EXPORT_SYMBOL(kmem_cache_shrink);
 
-static unsigned int slob_ready __read_mostly;
-
-int slab_is_available(void)
-{
-	return slob_ready;
-}
-
 void __init kmem_cache_init(void)
 {
-	slob_ready = 1;
+	slab_state = UP;
 }
 
 void __init kmem_cache_init_late(void)
 {
-	/* Nothing to do */
+	slab_state = FULL;
 }
Index: linux-2.6/mm/slub.c
===================================================================
--- linux-2.6.orig/mm/slub.c	2012-05-11 09:43:33.164436947 -0500
+++ linux-2.6/mm/slub.c	2012-05-11 09:43:53.448436526 -0500
@@ -16,6 +16,7 @@
 #include <linux/interrupt.h>
 #include <linux/bitops.h>
 #include <linux/slab.h>
+#include "slab.h"
 #include <linux/proc_fs.h>
 #include <linux/seq_file.h>
 #include <linux/kmemcheck.h>
@@ -182,13 +183,6 @@ static int kmem_size = sizeof(struct kme
 static struct notifier_block slab_notifier;
 #endif
 
-static enum {
-	DOWN,		/* No slab functionality available */
-	PARTIAL,	/* Kmem_cache_node works */
-	UP,		/* Everything works but does not show up in sysfs */
-	SYSFS		/* Sysfs up */
-} slab_state = DOWN;
-
 /* A list of all slab caches on the system */
 static DECLARE_RWSEM(slub_lock);
 static LIST_HEAD(slab_caches);
@@ -237,11 +231,6 @@ static inline void stat(const struct kme
  * 			Core slab cache functions
  *******************************************************************/
 
-int slab_is_available(void)
-{
-	return slab_state >= UP;
-}
-
 static inline struct kmem_cache_node *get_node(struct kmem_cache *s, int node)
 {
 	return s->node[node];
@@ -5274,7 +5263,7 @@ static int sysfs_slab_add(struct kmem_ca
 	const char *name;
 	int unmergeable;
 
-	if (slab_state < SYSFS)
+	if (slab_state < FULL)
 		/* Defer until later */
 		return 0;
 
@@ -5319,7 +5308,7 @@ static int sysfs_slab_add(struct kmem_ca
 
 static void sysfs_slab_remove(struct kmem_cache *s)
 {
-	if (slab_state < SYSFS)
+	if (slab_state < FULL)
 		/*
 		 * Sysfs has not been setup yet so no need to remove the
 		 * cache from sysfs.
@@ -5347,7 +5336,7 @@ static int sysfs_slab_alias(struct kmem_
 {
 	struct saved_alias *al;
 
-	if (slab_state == SYSFS) {
+	if (slab_state == FULL) {
 		/*
 		 * If we have a leftover link then remove it.
 		 */
@@ -5380,7 +5369,7 @@ static int __init slab_sysfs_init(void)
 		return -ENOSYS;
 	}
 
-	slab_state = SYSFS;
+	slab_state = FULL;
 
 	list_for_each_entry(s, &slab_caches, list) {
 		err = sysfs_slab_add(s);
Index: linux-2.6/mm/slab_common.c
===================================================================
--- linux-2.6.orig/mm/slab_common.c	2012-05-11 09:43:33.160436947 -0500
+++ linux-2.6/mm/slab_common.c	2012-05-11 09:43:53.448436526 -0500
@@ -16,6 +16,10 @@
 #include <asm/tlbflush.h>
 #include <asm/page.h>
 
+#include "slab.h"
+
+enum slab_state slab_state;
+
 /*
  * kmem_cache_create - Create a cache.
  * @name: A string which is used in /proc/slabinfo to identify this cache.
@@ -65,3 +69,8 @@ out:
 }
 EXPORT_SYMBOL(kmem_cache_create);
 
+int slab_is_available(void)
+{
+	return slab_state >= UP;
+}
+
Index: linux-2.6/include/linux/slab.h
===================================================================
--- linux-2.6.orig/include/linux/slab.h	2012-05-11 09:43:33.164436947 -0500
+++ linux-2.6/include/linux/slab.h	2012-05-11 09:43:53.448436526 -0500
@@ -117,10 +117,6 @@ int kmem_cache_shrink(struct kmem_cache
 void kmem_cache_free(struct kmem_cache *, void *);
 unsigned int kmem_cache_size(struct kmem_cache *);
 
-/* Slab internal function */
-struct kmem_cache *__kmem_cache_create(const char *, size_t, size_t,
-			unsigned long,
-			void (*)(void *));
 /*
  * Please use this macro to create slab caches. Simply specify the
  * name of the structure and maybe some flags that are listed above.



* [RFC] SL[AUO]B common code 6/9] slabs: Use a common mutex definition
  2012-05-14 20:15 [RFC] SL[AUO]B common code 0/9] Sl[auo]b: Common functionality V1 Christoph Lameter
                   ` (4 preceding siblings ...)
  2012-05-14 20:15 ` [RFC] SL[AUO]B common code 5/9] slabs: Common definition for boot state of the slab allocators Christoph Lameter
@ 2012-05-14 20:15 ` Christoph Lameter
  2012-05-16  8:34   ` Glauber Costa
  2012-05-14 20:15 ` [RFC] SL[AUO]B common code 7/9] slabs: Move kmem_cache_create mutex handling to common code Christoph Lameter
                   ` (3 subsequent siblings)
  9 siblings, 1 reply; 38+ messages in thread
From: Christoph Lameter @ 2012-05-14 20:15 UTC (permalink / raw)
  To: Pekka Enberg; +Cc: linux-mm, David Rientjes, Matt Mackall

[-- Attachment #1: common_mutex --]
[-- Type: text/plain, Size: 18679 bytes --]

Use the mutex definition from SLAB and make it the common way to take a sleeping lock.

This has the effect of using a mutex instead of an rw semaphore for SLUB.

SLOB gains a mutex for kmem_cache_create() serialization. It is not
needed now, but SLOB may acquire more features later (like slabinfo /
sysfs support) through the expansion of the common code, which will
need this.
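
A sketch of the serialization that SLOB gains through the common wrapper
(hedged: the actual move of the kmem_cache_create() locking into common
code happens in a later patch of this series):

	struct kmem_cache *kmem_cache_create(const char *name, size_t size,
			size_t align, unsigned long flags, void (*ctor)(void *))
	{
		struct kmem_cache *s;

		mutex_lock(&slab_mutex);
		s = __kmem_cache_create(name, size, align, flags, ctor);
		mutex_unlock(&slab_mutex);
		return s;
	}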

Signed-off-by: Christoph Lameter <cl@linux.com>

---
 mm/slab.c        |  104 +++++++++++++++++++++++++------------------------------
 mm/slab.h        |    4 ++
 mm/slab_common.c |    2 +
 mm/slub.c        |   54 +++++++++++++---------------
 4 files changed, 80 insertions(+), 84 deletions(-)

Index: linux-2.6/mm/slab.c
===================================================================
--- linux-2.6.orig/mm/slab.c	2012-05-11 09:43:53.448436526 -0500
+++ linux-2.6/mm/slab.c	2012-05-11 09:43:58.984436408 -0500
@@ -68,7 +68,7 @@
  * Further notes from the original documentation:
  *
  * 11 April '97.  Started multi-threading - markhe
- *	The global cache-chain is protected by the mutex 'cache_chain_mutex'.
+ *	The global cache-chain is protected by the mutex 'slab_mutex'.
  *	The sem is only needed when accessing/extending the cache-chain, which
  *	can never happen inside an interrupt (kmem_cache_create(),
  *	kmem_cache_shrink() and kmem_cache_reap()).
@@ -696,12 +696,6 @@ static void slab_set_debugobj_lock_class
 }
 #endif
 
-/*
- * Guard access to the cache-chain.
- */
-static DEFINE_MUTEX(cache_chain_mutex);
-static struct list_head cache_chain;
-
 static DEFINE_PER_CPU(struct delayed_work, slab_reap_work);
 
 static inline struct array_cache *cpu_cache_get(struct kmem_cache *cachep)
@@ -1125,7 +1119,7 @@ static inline int cache_free_alien(struc
  * When hotplugging memory or a cpu, existing nodelists are not replaced if
  * already in use.
  *
- * Must hold cache_chain_mutex.
+ * Must hold slab_mutex.
  */
 static int init_cache_nodelists_node(int node)
 {
@@ -1133,7 +1127,7 @@ static int init_cache_nodelists_node(int
 	struct kmem_list3 *l3;
 	const int memsize = sizeof(struct kmem_list3);
 
-	list_for_each_entry(cachep, &cache_chain, list) {
+	list_for_each_entry(cachep, &slab_caches, list) {
 		/*
 		 * Set up the size64 kmemlist for cpu before we can
 		 * begin anything. Make sure some other cpu on this
@@ -1149,7 +1143,7 @@ static int init_cache_nodelists_node(int
 
 			/*
 			 * The l3s don't come and go as CPUs come and
-			 * go.  cache_chain_mutex is sufficient
+			 * go.  slab_mutex is sufficient
 			 * protection here.
 			 */
 			cachep->nodelists[node] = l3;
@@ -1171,7 +1165,7 @@ static void __cpuinit cpuup_canceled(lon
 	int node = cpu_to_mem(cpu);
 	const struct cpumask *mask = cpumask_of_node(node);
 
-	list_for_each_entry(cachep, &cache_chain, list) {
+	list_for_each_entry(cachep, &slab_caches, list) {
 		struct array_cache *nc;
 		struct array_cache *shared;
 		struct array_cache **alien;
@@ -1221,7 +1215,7 @@ free_array_cache:
 	 * the respective cache's slabs,  now we can go ahead and
 	 * shrink each nodelist to its limit.
 	 */
-	list_for_each_entry(cachep, &cache_chain, list) {
+	list_for_each_entry(cachep, &slab_caches, list) {
 		l3 = cachep->nodelists[node];
 		if (!l3)
 			continue;
@@ -1250,7 +1244,7 @@ static int __cpuinit cpuup_prepare(long
 	 * Now we can go ahead with allocating the shared arrays and
 	 * array caches
 	 */
-	list_for_each_entry(cachep, &cache_chain, list) {
+	list_for_each_entry(cachep, &slab_caches, list) {
 		struct array_cache *nc;
 		struct array_cache *shared = NULL;
 		struct array_cache **alien = NULL;
@@ -1318,9 +1312,9 @@ static int __cpuinit cpuup_callback(stru
 	switch (action) {
 	case CPU_UP_PREPARE:
 	case CPU_UP_PREPARE_FROZEN:
-		mutex_lock(&cache_chain_mutex);
+		mutex_lock(&slab_mutex);
 		err = cpuup_prepare(cpu);
-		mutex_unlock(&cache_chain_mutex);
+		mutex_unlock(&slab_mutex);
 		break;
 	case CPU_ONLINE:
 	case CPU_ONLINE_FROZEN:
@@ -1330,7 +1324,7 @@ static int __cpuinit cpuup_callback(stru
   	case CPU_DOWN_PREPARE:
   	case CPU_DOWN_PREPARE_FROZEN:
 		/*
-		 * Shutdown cache reaper. Note that the cache_chain_mutex is
+		 * Shutdown cache reaper. Note that the slab_mutex is
 		 * held so that if cache_reap() is invoked it cannot do
 		 * anything expensive but will only modify reap_work
 		 * and reschedule the timer.
@@ -1357,9 +1351,9 @@ static int __cpuinit cpuup_callback(stru
 #endif
 	case CPU_UP_CANCELED:
 	case CPU_UP_CANCELED_FROZEN:
-		mutex_lock(&cache_chain_mutex);
+		mutex_lock(&slab_mutex);
 		cpuup_canceled(cpu);
-		mutex_unlock(&cache_chain_mutex);
+		mutex_unlock(&slab_mutex);
 		break;
 	}
 	return notifier_from_errno(err);
@@ -1375,14 +1369,14 @@ static struct notifier_block __cpuinitda
  * Returns -EBUSY if all objects cannot be drained so that the node is not
  * removed.
  *
- * Must hold cache_chain_mutex.
+ * Must hold slab_mutex.
  */
 static int __meminit drain_cache_nodelists_node(int node)
 {
 	struct kmem_cache *cachep;
 	int ret = 0;
 
-	list_for_each_entry(cachep, &cache_chain, list) {
+	list_for_each_entry(cachep, &slab_caches, list) {
 		struct kmem_list3 *l3;
 
 		l3 = cachep->nodelists[node];
@@ -1413,14 +1407,14 @@ static int __meminit slab_memory_callbac
 
 	switch (action) {
 	case MEM_GOING_ONLINE:
-		mutex_lock(&cache_chain_mutex);
+		mutex_lock(&slab_mutex);
 		ret = init_cache_nodelists_node(nid);
-		mutex_unlock(&cache_chain_mutex);
+		mutex_unlock(&slab_mutex);
 		break;
 	case MEM_GOING_OFFLINE:
-		mutex_lock(&cache_chain_mutex);
+		mutex_lock(&slab_mutex);
 		ret = drain_cache_nodelists_node(nid);
-		mutex_unlock(&cache_chain_mutex);
+		mutex_unlock(&slab_mutex);
 		break;
 	case MEM_ONLINE:
 	case MEM_OFFLINE:
@@ -1524,8 +1518,8 @@ void __init kmem_cache_init(void)
 	node = numa_mem_id();
 
 	/* 1) create the cache_cache */
-	INIT_LIST_HEAD(&cache_chain);
-	list_add(&cache_cache.list, &cache_chain);
+	INIT_LIST_HEAD(&slab_caches);
+	list_add(&cache_cache.list, &slab_caches);
 	cache_cache.colour_off = cache_line_size();
 	cache_cache.array[smp_processor_id()] = &initarray_cache.cache;
 	cache_cache.nodelists[node] = &initkmem_list3[CACHE_CACHE + node];
@@ -1669,11 +1663,11 @@ void __init kmem_cache_init_late(void)
 	init_lock_keys();
 
 	/* 6) resize the head arrays to their final sizes */
-	mutex_lock(&cache_chain_mutex);
-	list_for_each_entry(cachep, &cache_chain, list)
+	mutex_lock(&slab_mutex);
+	list_for_each_entry(cachep, &slab_caches, list)
 		if (enable_cpucache(cachep, GFP_NOWAIT))
 			BUG();
-	mutex_unlock(&cache_chain_mutex);
+	mutex_unlock(&slab_mutex);
 
 	/* Done! */
 	slab_state = FULL;
@@ -2523,10 +2517,10 @@ __kmem_cache_create (const char *name, s
 	}
 
 	/* cache setup completed, link it into the list */
-	list_add(&cachep->list, &cache_chain);
+	list_add(&cachep->list, &slab_caches);
 oops:
 	if (slab_is_available()) {
-		mutex_unlock(&cache_chain_mutex);
+		mutex_unlock(&slab_mutex);
 		put_online_cpus();
 	}
 	return cachep;
@@ -2645,7 +2639,7 @@ out:
 	return nr_freed;
 }
 
-/* Called with cache_chain_mutex held to protect against cpu hotplug */
+/* Called with slab_mutex held to protect against cpu hotplug */
 static int __cache_shrink(struct kmem_cache *cachep)
 {
 	int ret = 0, i = 0;
@@ -2680,9 +2674,9 @@ int kmem_cache_shrink(struct kmem_cache
 	BUG_ON(!cachep || in_interrupt());
 
 	get_online_cpus();
-	mutex_lock(&cache_chain_mutex);
+	mutex_lock(&slab_mutex);
 	ret = __cache_shrink(cachep);
-	mutex_unlock(&cache_chain_mutex);
+	mutex_unlock(&slab_mutex);
 	put_online_cpus();
 	return ret;
 }
@@ -2710,15 +2704,15 @@ void kmem_cache_destroy(struct kmem_cach
 
 	/* Find the cache in the chain of caches. */
 	get_online_cpus();
-	mutex_lock(&cache_chain_mutex);
+	mutex_lock(&slab_mutex);
 	/*
 	 * the chain is never empty, cache_cache is never destroyed
 	 */
 	list_del(&cachep->list);
 	if (__cache_shrink(cachep)) {
 		slab_error(cachep, "Can't free all objects");
-		list_add(&cachep->list, &cache_chain);
-		mutex_unlock(&cache_chain_mutex);
+		list_add(&cachep->list, &slab_caches);
+		mutex_unlock(&slab_mutex);
 		put_online_cpus();
 		return;
 	}
@@ -2727,7 +2721,7 @@ void kmem_cache_destroy(struct kmem_cach
 		rcu_barrier();
 
 	__kmem_cache_destroy(cachep);
-	mutex_unlock(&cache_chain_mutex);
+	mutex_unlock(&slab_mutex);
 	put_online_cpus();
 }
 EXPORT_SYMBOL(kmem_cache_destroy);
@@ -4039,7 +4033,7 @@ static void do_ccupdate_local(void *info
 	new->new[smp_processor_id()] = old;
 }
 
-/* Always called with the cache_chain_mutex held */
+/* Always called with the slab_mutex held */
 static int do_tune_cpucache(struct kmem_cache *cachep, int limit,
 				int batchcount, int shared, gfp_t gfp)
 {
@@ -4083,7 +4077,7 @@ static int do_tune_cpucache(struct kmem_
 	return alloc_kmemlist(cachep, gfp);
 }
 
-/* Called with cache_chain_mutex held always */
+/* Called with slab_mutex held always */
 static int enable_cpucache(struct kmem_cache *cachep, gfp_t gfp)
 {
 	int err;
@@ -4185,11 +4179,11 @@ static void cache_reap(struct work_struc
 	int node = numa_mem_id();
 	struct delayed_work *work = to_delayed_work(w);
 
-	if (!mutex_trylock(&cache_chain_mutex))
+	if (!mutex_trylock(&slab_mutex))
 		/* Give up. Setup the next iteration. */
 		goto out;
 
-	list_for_each_entry(searchp, &cache_chain, list) {
+	list_for_each_entry(searchp, &slab_caches, list) {
 		check_irq_on();
 
 		/*
@@ -4227,7 +4221,7 @@ next:
 		cond_resched();
 	}
 	check_irq_on();
-	mutex_unlock(&cache_chain_mutex);
+	mutex_unlock(&slab_mutex);
 	next_reap_node();
 out:
 	/* Set up the next iteration */
@@ -4263,21 +4257,21 @@ static void *s_start(struct seq_file *m,
 {
 	loff_t n = *pos;
 
-	mutex_lock(&cache_chain_mutex);
+	mutex_lock(&slab_mutex);
 	if (!n)
 		print_slabinfo_header(m);
 
-	return seq_list_start(&cache_chain, *pos);
+	return seq_list_start(&slab_caches, *pos);
 }
 
 static void *s_next(struct seq_file *m, void *p, loff_t *pos)
 {
-	return seq_list_next(p, &cache_chain, pos);
+	return seq_list_next(p, &slab_caches, pos);
 }
 
 static void s_stop(struct seq_file *m, void *p)
 {
-	mutex_unlock(&cache_chain_mutex);
+	mutex_unlock(&slab_mutex);
 }
 
 static int s_show(struct seq_file *m, void *p)
@@ -4428,9 +4422,9 @@ static ssize_t slabinfo_write(struct fil
 		return -EINVAL;
 
 	/* Find the cache in the chain of caches. */
-	mutex_lock(&cache_chain_mutex);
+	mutex_lock(&slab_mutex);
 	res = -EINVAL;
-	list_for_each_entry(cachep, &cache_chain, list) {
+	list_for_each_entry(cachep, &slab_caches, list) {
 		if (!strcmp(cachep->name, kbuf)) {
 			if (limit < 1 || batchcount < 1 ||
 					batchcount > limit || shared < 0) {
@@ -4443,7 +4437,7 @@ static ssize_t slabinfo_write(struct fil
 			break;
 		}
 	}
-	mutex_unlock(&cache_chain_mutex);
+	mutex_unlock(&slab_mutex);
 	if (res >= 0)
 		res = count;
 	return res;
@@ -4466,8 +4460,8 @@ static const struct file_operations proc
 
 static void *leaks_start(struct seq_file *m, loff_t *pos)
 {
-	mutex_lock(&cache_chain_mutex);
-	return seq_list_start(&cache_chain, *pos);
+	mutex_lock(&slab_mutex);
+	return seq_list_start(&slab_caches, *pos);
 }
 
 static inline int add_caller(unsigned long *n, unsigned long v)
@@ -4566,17 +4560,17 @@ static int leaks_show(struct seq_file *m
 	name = cachep->name;
 	if (n[0] == n[1]) {
 		/* Increase the buffer size */
-		mutex_unlock(&cache_chain_mutex);
+		mutex_unlock(&slab_mutex);
 		m->private = kzalloc(n[0] * 4 * sizeof(unsigned long), GFP_KERNEL);
 		if (!m->private) {
 			/* Too bad, we are really out */
 			m->private = n;
-			mutex_lock(&cache_chain_mutex);
+			mutex_lock(&slab_mutex);
 			return -ENOMEM;
 		}
 		*(unsigned long *)m->private = n[0] * 2;
 		kfree(n);
-		mutex_lock(&cache_chain_mutex);
+		mutex_lock(&slab_mutex);
 		/* Now make sure this entry will be retried */
 		m->count = m->size;
 		return 0;
Index: linux-2.6/mm/slab.h
===================================================================
--- linux-2.6.orig/mm/slab.h	2012-05-11 09:43:53.448436526 -0500
+++ linux-2.6/mm/slab.h	2012-05-11 09:43:58.988436412 -0500
@@ -23,6 +23,10 @@ enum slab_state {
 
 extern enum slab_state slab_state;
 
+/* The slab cache mutex protects the management structures during changes */
+extern struct mutex slab_mutex;
+extern struct list_head slab_caches;
+
 struct kmem_cache *__kmem_cache_create (const char *name, size_t size,
 	size_t align, unsigned long flags, void (*ctor)(void *));
 
Index: linux-2.6/mm/slub.c
===================================================================
--- linux-2.6.orig/mm/slub.c	2012-05-11 09:43:53.448436526 -0500
+++ linux-2.6/mm/slub.c	2012-05-11 09:43:58.988436412 -0500
@@ -36,13 +36,13 @@
 
 /*
  * Lock order:
- *   1. slub_lock (Global Semaphore)
+ *   1. slab_mutex (Global Mutex)
  *   2. node->list_lock
  *   3. slab_lock(page) (Only on some arches and for debugging)
  *
- *   slub_lock
+ *   slab_mutex
  *
- *   The role of the slub_lock is to protect the list of all the slabs
+ *   The role of the slab_mutex is to protect the list of all the slabs
  *   and to synchronize major metadata changes to slab cache structures.
  *
  *   The slab_lock is only used for debugging and on arches that do not
@@ -183,10 +183,6 @@ static int kmem_size = sizeof(struct kme
 static struct notifier_block slab_notifier;
 #endif
 
-/* A list of all slab caches on the system */
-static DECLARE_RWSEM(slub_lock);
-static LIST_HEAD(slab_caches);
-
 /*
  * Tracking user of a slab.
  */
@@ -3178,11 +3174,11 @@ static inline int kmem_cache_close(struc
  */
 void kmem_cache_destroy(struct kmem_cache *s)
 {
-	down_write(&slub_lock);
+	mutex_lock(&slab_mutex);
 	s->refcount--;
 	if (!s->refcount) {
 		list_del(&s->list);
-		up_write(&slub_lock);
+		mutex_unlock(&slab_mutex);
 		if (kmem_cache_close(s)) {
 			printk(KERN_ERR "SLUB %s: %s called for cache that "
 				"still has objects.\n", s->name, __func__);
@@ -3192,7 +3188,7 @@ void kmem_cache_destroy(struct kmem_cach
 			rcu_barrier();
 		sysfs_slab_remove(s);
 	} else
-		up_write(&slub_lock);
+		mutex_unlock(&slab_mutex);
 }
 EXPORT_SYMBOL(kmem_cache_destroy);
 
@@ -3254,7 +3250,7 @@ static struct kmem_cache *__init create_
 
 	/*
 	 * This function is called with IRQs disabled during early-boot on
-	 * single CPU so there's no need to take slub_lock here.
+	 * single CPU so there's no need to take slab_mutex here.
 	 */
 	if (!kmem_cache_open(s, name, size, ARCH_KMALLOC_MINALIGN,
 								flags, NULL))
@@ -3539,10 +3535,10 @@ static int slab_mem_going_offline_callba
 {
 	struct kmem_cache *s;
 
-	down_read(&slub_lock);
+	mutex_lock(&slab_mutex);
 	list_for_each_entry(s, &slab_caches, list)
 		kmem_cache_shrink(s);
-	up_read(&slub_lock);
+	mutex_unlock(&slab_mutex);
 
 	return 0;
 }
@@ -3563,7 +3559,7 @@ static void slab_mem_offline_callback(vo
 	if (offline_node < 0)
 		return;
 
-	down_read(&slub_lock);
+	mutex_lock(&slab_mutex);
 	list_for_each_entry(s, &slab_caches, list) {
 		n = get_node(s, offline_node);
 		if (n) {
@@ -3579,7 +3575,7 @@ static void slab_mem_offline_callback(vo
 			kmem_cache_free(kmem_cache_node, n);
 		}
 	}
-	up_read(&slub_lock);
+	mutex_unlock(&slab_mutex);
 }
 
 static int slab_mem_going_online_callback(void *arg)
@@ -3602,7 +3598,7 @@ static int slab_mem_going_online_callbac
 	 * allocate a kmem_cache_node structure in order to bring the node
 	 * online.
 	 */
-	down_read(&slub_lock);
+	mutex_lock(&slab_mutex);
 	list_for_each_entry(s, &slab_caches, list) {
 		/*
 		 * XXX: kmem_cache_alloc_node will fallback to other nodes
@@ -3618,7 +3614,7 @@ static int slab_mem_going_online_callbac
 		s->node[nid] = n;
 	}
 out:
-	up_read(&slub_lock);
+	mutex_unlock(&slab_mutex);
 	return ret;
 }
 
@@ -3916,7 +3912,7 @@ struct kmem_cache *__kmem_cache_create(c
 	struct kmem_cache *s;
 	char *n;
 
-	down_write(&slub_lock);
+	mutex_lock(&slab_mutex);
 	s = find_mergeable(size, align, flags, name, ctor);
 	if (s) {
 		s->refcount++;
@@ -3931,7 +3927,7 @@ struct kmem_cache *__kmem_cache_create(c
 			s->refcount--;
 			goto err;
 		}
-		up_write(&slub_lock);
+		mutex_unlock(&slab_mutex);
 		return s;
 	}
 
@@ -3944,9 +3940,9 @@ struct kmem_cache *__kmem_cache_create(c
 		if (kmem_cache_open(s, n,
 				size, align, flags, ctor)) {
 			list_add(&s->list, &slab_caches);
-			up_write(&slub_lock);
+			mutex_unlock(&slab_mutex);
 			if (sysfs_slab_add(s)) {
-				down_write(&slub_lock);
+				mutex_lock(&slab_mutex);
 				list_del(&s->list);
 				kfree(n);
 				kfree(s);
@@ -3958,7 +3954,7 @@ struct kmem_cache *__kmem_cache_create(c
 		kfree(s);
 	}
 err:
-	up_write(&slub_lock);
+	mutex_unlock(&slab_mutex);
 	return s;
 }
 
@@ -3979,13 +3975,13 @@ static int __cpuinit slab_cpuup_callback
 	case CPU_UP_CANCELED_FROZEN:
 	case CPU_DEAD:
 	case CPU_DEAD_FROZEN:
-		down_read(&slub_lock);
+		mutex_lock(&slab_mutex);
 		list_for_each_entry(s, &slab_caches, list) {
 			local_irq_save(flags);
 			__flush_cpu_slab(s, cpu);
 			local_irq_restore(flags);
 		}
-		up_read(&slub_lock);
+		mutex_unlock(&slab_mutex);
 		break;
 	default:
 		break;
@@ -5360,11 +5356,11 @@ static int __init slab_sysfs_init(void)
 	struct kmem_cache *s;
 	int err;
 
-	down_write(&slub_lock);
+	mutex_lock(&slab_mutex);
 
 	slab_kset = kset_create_and_add("slab", &slab_uevent_ops, kernel_kobj);
 	if (!slab_kset) {
-		up_write(&slub_lock);
+		mutex_unlock(&slab_mutex);
 		printk(KERN_ERR "Cannot register slab subsystem.\n");
 		return -ENOSYS;
 	}
@@ -5389,7 +5385,7 @@ static int __init slab_sysfs_init(void)
 		kfree(al);
 	}
 
-	up_write(&slub_lock);
+	mutex_unlock(&slab_mutex);
 	resiliency_test();
 	return 0;
 }
@@ -5415,7 +5411,7 @@ static void *s_start(struct seq_file *m,
 {
 	loff_t n = *pos;
 
-	down_read(&slub_lock);
+	mutex_lock(&slab_mutex);
 	if (!n)
 		print_slabinfo_header(m);
 
@@ -5429,7 +5425,7 @@ static void *s_next(struct seq_file *m,
 
 static void s_stop(struct seq_file *m, void *p)
 {
-	up_read(&slub_lock);
+	mutex_unlock(&slab_mutex);
 }
 
 static int s_show(struct seq_file *m, void *p)
Index: linux-2.6/mm/slab_common.c
===================================================================
--- linux-2.6.orig/mm/slab_common.c	2012-05-11 09:43:53.448436526 -0500
+++ linux-2.6/mm/slab_common.c	2012-05-11 09:43:58.988436412 -0500
@@ -19,6 +19,8 @@
 #include "slab.h"
 
 enum slab_state slab_state;
+LIST_HEAD(slab_caches);
+DEFINE_MUTEX(slab_mutex);
 
 /*
  * kmem_cache_create - Create a cache.


^ permalink raw reply	[flat|nested] 38+ messages in thread

* [RFC] SL[AUO]B common code 7/9] slabs: Move kmem_cache_create mutex handling to common code
  2012-05-14 20:15 [RFC] SL[AUO]B common code 0/9] Sl[auo]b: Common functionality V1 Christoph Lameter
                   ` (5 preceding siblings ...)
  2012-05-14 20:15 ` [RFC] SL[AUO]B common code 6/9] slabs: Use a common mutex definition Christoph Lameter
@ 2012-05-14 20:15 ` Christoph Lameter
  2012-05-16 10:15   ` Glauber Costa
  2012-05-14 20:15 ` [RFC] SL[AUO]B common code 8/9] slabs: list addition move to slab_common Christoph Lameter
                   ` (2 subsequent siblings)
  9 siblings, 1 reply; 38+ messages in thread
From: Christoph Lameter @ 2012-05-14 20:15 UTC (permalink / raw)
  To: Pekka Enberg; +Cc: linux-mm, David Rientjes, Matt Mackall

[-- Attachment #1: move_mutex_to_common --]
[-- Type: text/plain, Size: 5607 bytes --]

Move the mutex handling into the common kmem_cache_create()
function.

Then we can also move more checks out of SLAB's kmem_cache_create()
into the common code.

Signed-off-by: Christoph Lameter <cl@linux.com>

---
 mm/slab.c        |   52 +---------------------------------------------------
 mm/slab_common.c |   41 ++++++++++++++++++++++++++++++++++++++++-
 mm/slub.c        |   30 ++++++++++++++----------------
 3 files changed, 55 insertions(+), 68 deletions(-)

Index: linux-2.6/mm/slab.c
===================================================================
--- linux-2.6.orig/mm/slab.c	2012-05-14 08:51:51.819130415 -0500
+++ linux-2.6/mm/slab.c	2012-05-14 08:52:41.707129382 -0500
@@ -2252,55 +2252,10 @@ __kmem_cache_create (const char *name, s
 	unsigned long flags, void (*ctor)(void *))
 {
 	size_t left_over, slab_size, ralign;
-	struct kmem_cache *cachep = NULL, *pc;
+	struct kmem_cache *cachep = NULL;
 	gfp_t gfp;
 
-	/*
-	 * Sanity checks... these are all serious usage bugs.
-	 */
-	if (!name || in_interrupt() || (size < BYTES_PER_WORD) ||
-	    size > KMALLOC_MAX_SIZE) {
-		printk(KERN_ERR "%s: Early error in slab %s\n", __func__,
-				name);
-		BUG();
-	}
-
-	/*
-	 * We use cache_chain_mutex to ensure a consistent view of
-	 * cpu_online_mask as well.  Please see cpuup_callback
-	 */
-	if (slab_is_available()) {
-		get_online_cpus();
-		mutex_lock(&cache_chain_mutex);
-	}
-
-	list_for_each_entry(pc, &cache_chain, list) {
-		char tmp;
-		int res;
-
-		/*
-		 * This happens when the module gets unloaded and doesn't
-		 * destroy its slab cache and no-one else reuses the vmalloc
-		 * area of the module.  Print a warning.
-		 */
-		res = probe_kernel_address(pc->name, tmp);
-		if (res) {
-			printk(KERN_ERR
-			       "SLAB: cache with size %d has lost its name\n",
-			       pc->buffer_size);
-			continue;
-		}
-
-		if (!strcmp(pc->name, name)) {
-			printk(KERN_ERR
-			       "kmem_cache_create: duplicate cache %s\n", name);
-			dump_stack();
-			goto oops;
-		}
-	}
-
 #if DEBUG
-	WARN_ON(strchr(name, ' '));	/* It confuses parsers */
 #if FORCED_DEBUG
 	/*
 	 * Enable redzoning and last user accounting, except for caches with
@@ -2518,11 +2473,6 @@ __kmem_cache_create (const char *name, s
 
 	/* cache setup completed, link it into the list */
 	list_add(&cachep->list, &slab_caches);
-oops:
-	if (slab_is_available()) {
-		mutex_unlock(&slab_mutex);
-		put_online_cpus();
-	}
 	return cachep;
 }
 
Index: linux-2.6/mm/slab_common.c
===================================================================
--- linux-2.6.orig/mm/slab_common.c	2012-05-14 08:51:51.839130415 -0500
+++ linux-2.6/mm/slab_common.c	2012-05-14 08:52:11.407130009 -0500
@@ -11,7 +11,8 @@
 #include <linux/memory.h>
 #include <linux/compiler.h>
 #include <linux/module.h>
-
+#include <linux/cpu.h>
+#include <linux/uaccess.h>
 #include <asm/cacheflush.h>
 #include <asm/tlbflush.h>
 #include <asm/page.h>
@@ -61,8 +62,46 @@ struct kmem_cache *kmem_cache_create(con
 	}
 #endif
 
+	get_online_cpus();
+	mutex_lock(&slab_mutex);
+
+#ifdef CONFIG_DEBUG_VM
+	list_for_each_entry(s, &slab_caches, list) {
+		char tmp;
+		int res;
+
+		/*
+		 * This happens when the module gets unloaded and doesn't
+		 * destroy its slab cache and no-one else reuses the vmalloc
+		 * area of the module.  Print a warning.
+		 */
+		res = probe_kernel_address(s->name, tmp);
+		if (res) {
+			printk(KERN_ERR
+			       "Slab cache with size %d has lost its name\n",
+			       s->size);
+			continue;
+		}
+
+		if (!strcmp(s->name, name)) {
+			printk(KERN_ERR "kmem_cache_create(%s): Cache name"
+				" already exists.\n",
+				name);
+			dump_stack();
+			s = NULL;
+			goto oops;
+		}
+	}
+
+	WARN_ON(strchr(name, ' '));	/* It confuses parsers */
+#endif
+
 	s = __kmem_cache_create(name, size, align, flags, ctor);
 
+oops:
+	mutex_unlock(&slab_mutex);
+	put_online_cpus();
+
 out:
 	if (!s && (flags & SLAB_PANIC))
 		panic("kmem_cache_create: Failed to create slab '%s'\n", name);
Index: linux-2.6/mm/slub.c
===================================================================
--- linux-2.6.orig/mm/slub.c	2012-05-14 08:51:51.827130415 -0500
+++ linux-2.6/mm/slub.c	2012-05-14 08:52:11.411130009 -0500
@@ -3912,7 +3912,6 @@ struct kmem_cache *__kmem_cache_create(c
 	struct kmem_cache *s;
 	char *n;
 
-	mutex_lock(&slab_mutex);
 	s = find_mergeable(size, align, flags, name, ctor);
 	if (s) {
 		s->refcount++;
@@ -3925,37 +3924,36 @@ struct kmem_cache *__kmem_cache_create(c
 
 		if (sysfs_slab_alias(s, name)) {
 			s->refcount--;
-			goto err;
+			return NULL;
 		}
-		mutex_unlock(&slab_mutex);
 		return s;
 	}
 
 	n = kstrdup(name, GFP_KERNEL);
 	if (!n)
-		goto err;
+		return NULL;
 
 	s = kmalloc(kmem_size, GFP_KERNEL);
 	if (s) {
 		if (kmem_cache_open(s, n,
 				size, align, flags, ctor)) {
+			int r;
+
 			list_add(&s->list, &slab_caches);
 			mutex_unlock(&slab_mutex);
-			if (sysfs_slab_add(s)) {
-				mutex_lock(&slab_mutex);
-				list_del(&s->list);
-				kfree(n);
-				kfree(s);
-				goto err;
-			}
-			return s;
+			r = sysfs_slab_add(s);
+			mutex_lock(&slab_mutex);
+
+			if (!r)
+				return s;
+
+			list_del(&s->list);
+			kmem_cache_close(s);
 		}
-		kfree(n);
 		kfree(s);
 	}
-err:
-	mutex_unlock(&slab_mutex);
-	return s;
+	kfree(n);
+	return NULL;
 }
 
 #ifdef CONFIG_SMP


^ permalink raw reply	[flat|nested] 38+ messages in thread

* [RFC] SL[AUO]B common code 8/9] slabs: list addition move to slab_common
  2012-05-14 20:15 [RFC] SL[AUO]B common code 0/9] Sl[auo]b: Common functionality V1 Christoph Lameter
                   ` (6 preceding siblings ...)
  2012-05-14 20:15 ` [RFC] SL[AUO]B common code 7/9] slabs: Move kmem_cache_create mutex handling to common code Christoph Lameter
@ 2012-05-14 20:15 ` Christoph Lameter
  2012-05-16 10:09   ` Glauber Costa
  2012-05-14 20:15 ` [RFC] SL[AUO]B common code 9/9] slabs: Extract a common function for kmem_cache_destroy Christoph Lameter
  2012-05-16  8:08 ` [RFC] SL[AUO]B common code 0/9] Sl[auo]b: Common functionality V1 Glauber Costa
  9 siblings, 1 reply; 38+ messages in thread
From: Christoph Lameter @ 2012-05-14 20:15 UTC (permalink / raw)
  To: Pekka Enberg; +Cc: linux-mm, David Rientjes, Matt Mackall

[-- Attachment #1: move_list_add --]
[-- Type: text/plain, Size: 3449 bytes --]

Move the code that appends the new kmem_cache to the list of slab caches
into kmem_cache_create() in the shared code.

This is possible now since the acquisition of the mutex was moved into
kmem_cache_create().

Signed-off-by: Christoph Lameter <cl@linux.com>


---
 mm/slab.c        |    7 +++++--
 mm/slab_common.c |    3 +++
 mm/slub.c        |    2 --
 3 files changed, 8 insertions(+), 4 deletions(-)

Index: linux-2.6/mm/slab_common.c
===================================================================
--- linux-2.6.orig/mm/slab_common.c	2012-05-14 08:39:27.859145830 -0500
+++ linux-2.6/mm/slab_common.c	2012-05-14 08:39:29.827145790 -0500
@@ -98,6 +98,9 @@ struct kmem_cache *kmem_cache_create(con
 
 	s = __kmem_cache_create(name, size, align, flags, ctor);
 
+	if (s && s->refcount == 1)
+		list_add(&s->list, &slab_caches);
+
 oops:
 	mutex_unlock(&slab_mutex);
 	put_online_cpus();
Index: linux-2.6/mm/slab.c
===================================================================
--- linux-2.6.orig/mm/slab.c	2012-05-14 08:39:28.563145816 -0500
+++ linux-2.6/mm/slab.c	2012-05-14 08:39:29.831145790 -0500
@@ -1565,6 +1565,7 @@ void __init kmem_cache_init(void)
 					ARCH_KMALLOC_FLAGS|SLAB_PANIC,
 					NULL);
 
+	list_add(&sizes[INDEX_AC].cs_cachep->list, &slab_caches);
 	if (INDEX_AC != INDEX_L3) {
 		sizes[INDEX_L3].cs_cachep =
 			__kmem_cache_create(names[INDEX_L3].name,
@@ -1572,6 +1573,7 @@ void __init kmem_cache_init(void)
 				ARCH_KMALLOC_MINALIGN,
 				ARCH_KMALLOC_FLAGS|SLAB_PANIC,
 				NULL);
+		list_add(&sizes[INDEX_L3].cs_cachep->list, &slab_caches);
 	}
 
 	slab_early_init = 0;
@@ -1590,6 +1592,7 @@ void __init kmem_cache_init(void)
 					ARCH_KMALLOC_MINALIGN,
 					ARCH_KMALLOC_FLAGS|SLAB_PANIC,
 					NULL);
+			list_add(&sizes->cs_cachep->list, &slab_caches);
 		}
 #ifdef CONFIG_ZONE_DMA
 		sizes->cs_dmacachep = __kmem_cache_create(
@@ -1599,6 +1602,7 @@ void __init kmem_cache_init(void)
 					ARCH_KMALLOC_FLAGS|SLAB_CACHE_DMA|
 						SLAB_PANIC,
 					NULL);
+		list_add(&sizes->cs_dmacachep->list, &slab_caches);
 #endif
 		sizes++;
 		names++;
@@ -2455,6 +2459,7 @@ __kmem_cache_create (const char *name, s
 	}
 	cachep->ctor = ctor;
 	cachep->name = name;
+	cachep->refcount = 1;
 
 	if (setup_cpu_cache(cachep, gfp)) {
 		__kmem_cache_destroy(cachep);
@@ -2471,8 +2476,6 @@ __kmem_cache_create (const char *name, s
 		slab_set_debugobj_lock_classes(cachep);
 	}
 
-	/* cache setup completed, link it into the list */
-	list_add(&cachep->list, &slab_caches);
 	return cachep;
 }
 
Index: linux-2.6/mm/slub.c
===================================================================
--- linux-2.6.orig/mm/slub.c	2012-05-14 08:39:27.859145830 -0500
+++ linux-2.6/mm/slub.c	2012-05-14 08:39:29.831145790 -0500
@@ -3939,7 +3939,6 @@ struct kmem_cache *__kmem_cache_create(c
 				size, align, flags, ctor)) {
 			int r;
 
-			list_add(&s->list, &slab_caches);
 			mutex_unlock(&slab_mutex);
 			r = sysfs_slab_add(s);
 			mutex_lock(&slab_mutex);
@@ -3947,7 +3946,6 @@ struct kmem_cache *__kmem_cache_create(c
 			if (!r)
 				return s;
 
-			list_del(&s->list);
 			kmem_cache_close(s);
 		}
 		kfree(s);


^ permalink raw reply	[flat|nested] 38+ messages in thread

* [RFC] SL[AUO]B common code 9/9] slabs: Extract a common function for kmem_cache_destroy
  2012-05-14 20:15 [RFC] SL[AUO]B common code 0/9] Sl[auo]b: Common functionality V1 Christoph Lameter
                   ` (7 preceding siblings ...)
  2012-05-14 20:15 ` [RFC] SL[AUO]B common code 8/9] slabs: list addition move to slab_common Christoph Lameter
@ 2012-05-14 20:15 ` Christoph Lameter
  2012-05-16  8:08 ` [RFC] SL[AUO]B common code 0/9] Sl[auo]b: Common functionality V1 Glauber Costa
  9 siblings, 0 replies; 38+ messages in thread
From: Christoph Lameter @ 2012-05-14 20:15 UTC (permalink / raw)
  To: Pekka Enberg; +Cc: linux-mm, David Rientjes, Matt Mackall

[-- Attachment #1: kmem_cache_destroy --]
[-- Type: text/plain, Size: 6831 bytes --]

kmem_cache_destroy() does basically the same thing in all allocators.

Extract the common code, which is easy since we already have common mutex handling.

Signed-off-by: Christoph Lameter <cl@linux.com>


---
 mm/slab.c        |   55 +++----------------------------------------------------
 mm/slab.h        |    4 +++-
 mm/slab_common.c |   22 ++++++++++++++++++++++
 mm/slob.c        |   11 +++++++----
 mm/slub.c        |   29 ++++++++---------------------
 5 files changed, 43 insertions(+), 78 deletions(-)

Index: linux-2.6/mm/slab_common.c
===================================================================
--- linux-2.6.orig/mm/slab_common.c	2012-05-14 08:39:29.827145790 -0500
+++ linux-2.6/mm/slab_common.c	2012-05-14 08:39:42.867145519 -0500
@@ -113,6 +113,28 @@ out:
 }
 EXPORT_SYMBOL(kmem_cache_create);
 
+void kmem_cache_destroy(struct kmem_cache *s)
+{
+	get_online_cpus();
+	mutex_lock(&slab_mutex);
+	list_del(&s->list);
+
+	if (!__kmem_cache_shutdown(s)) {
+		if (s->flags & SLAB_DESTROY_BY_RCU)
+			rcu_barrier();
+
+		__kmem_cache_destroy(s);
+	} else {
+		list_add(&s->list, &slab_caches);
+		printk(KERN_ERR "kmem_cache_destroy %s: Slab cache still has objects\n",
+			s->name);
+		dump_stack();
+	}
+	mutex_unlock(&slab_mutex);
+	put_online_cpus();
+}
+EXPORT_SYMBOL(kmem_cache_destroy);
+
 int slab_is_available(void)
 {
 	return slab_state >= UP;
Index: linux-2.6/mm/slab.c
===================================================================
--- linux-2.6.orig/mm/slab.c	2012-05-14 08:39:29.831145790 -0500
+++ linux-2.6/mm/slab.c	2012-05-14 08:39:42.871145519 -0500
@@ -804,16 +804,6 @@ static void cache_estimate(unsigned long
 	*left_over = slab_size - nr_objs*buffer_size - mgmt_size;
 }
 
-#define slab_error(cachep, msg) __slab_error(__func__, cachep, msg)
-
-static void __slab_error(const char *function, struct kmem_cache *cachep,
-			char *msg)
-{
-	printk(KERN_ERR "slab error in %s(): cache `%s': %s\n",
-	       function, cachep->name, msg);
-	dump_stack();
-}
-
 /*
  * By default on NUMA we use alien caches to stage the freeing of
  * objects allocated from other nodes. This causes massive memory
@@ -2079,7 +2069,7 @@ static void slab_destroy(struct kmem_cac
 	}
 }
 
-static void __kmem_cache_destroy(struct kmem_cache *cachep)
+void __kmem_cache_destroy(struct kmem_cache *cachep)
 {
 	int i;
 	struct kmem_list3 *l3;
@@ -2635,49 +2625,10 @@ int kmem_cache_shrink(struct kmem_cache
 }
 EXPORT_SYMBOL(kmem_cache_shrink);
 
-/**
- * kmem_cache_destroy - delete a cache
- * @cachep: the cache to destroy
- *
- * Remove a &struct kmem_cache object from the slab cache.
- *
- * It is expected this function will be called by a module when it is
- * unloaded.  This will remove the cache completely, and avoid a duplicate
- * cache being allocated each time a module is loaded and unloaded, if the
- * module doesn't have persistent in-kernel storage across loads and unloads.
- *
- * The cache must be empty before calling this function.
- *
- * The caller must guarantee that no one will allocate memory from the cache
- * during the kmem_cache_destroy().
- */
-void kmem_cache_destroy(struct kmem_cache *cachep)
+int __kmem_cache_shutdown(struct kmem_cache *cachep)
 {
-	BUG_ON(!cachep || in_interrupt());
-
-	/* Find the cache in the chain of caches. */
-	get_online_cpus();
-	mutex_lock(&slab_mutex);
-	/*
-	 * the chain is never empty, cache_cache is never destroyed
-	 */
-	list_del(&cachep->list);
-	if (__cache_shrink(cachep)) {
-		slab_error(cachep, "Can't free all objects");
-		list_add(&cachep->list, &slab_caches);
-		mutex_unlock(&slab_mutex);
-		put_online_cpus();
-		return;
-	}
-
-	if (unlikely(cachep->flags & SLAB_DESTROY_BY_RCU))
-		rcu_barrier();
-
-	__kmem_cache_destroy(cachep);
-	mutex_unlock(&slab_mutex);
-	put_online_cpus();
+	return __cache_shrink(cachep);
 }
-EXPORT_SYMBOL(kmem_cache_destroy);
 
 /*
  * Get the memory for a slab management obj.
Index: linux-2.6/mm/slab.h
===================================================================
--- linux-2.6.orig/mm/slab.h	2012-05-14 08:39:27.195145844 -0500
+++ linux-2.6/mm/slab.h	2012-05-14 08:39:42.871145519 -0500
@@ -30,5 +30,7 @@ extern struct list_head slab_caches;
 struct kmem_cache *__kmem_cache_create (const char *name, size_t size,
 	size_t align, unsigned long flags, void (*ctor)(void *));
 
-#endif
+int __kmem_cache_shutdown(struct kmem_cache *);
+void __kmem_cache_destroy(struct kmem_cache *);
 
+#endif
Index: linux-2.6/mm/slob.c
===================================================================
--- linux-2.6.orig/mm/slob.c	2012-05-14 08:39:26.455145860 -0500
+++ linux-2.6/mm/slob.c	2012-05-14 08:39:42.871145519 -0500
@@ -570,14 +570,11 @@ struct kmem_cache *__kmem_cache_create(c
 	return c;
 }
 
-void kmem_cache_destroy(struct kmem_cache *c)
+void __kmem_cache_destroy(struct kmem_cache *c)
 {
 	kmemleak_free(c);
-	if (c->flags & SLAB_DESTROY_BY_RCU)
-		rcu_barrier();
 	slob_free(c, sizeof(struct kmem_cache));
 }
-EXPORT_SYMBOL(kmem_cache_destroy);
 
 void *kmem_cache_alloc_node(struct kmem_cache *c, gfp_t flags, int node)
 {
@@ -645,6 +642,12 @@ unsigned int kmem_cache_size(struct kmem
 }
 EXPORT_SYMBOL(kmem_cache_size);
 
+int __kmem_cache_shutdown(struct kmem_cache *c)
+{
+	/* No way to check for remaining objects */
+	return 0;
+}
+
 int kmem_cache_shrink(struct kmem_cache *d)
 {
 	return 0;
Index: linux-2.6/mm/slub.c
===================================================================
--- linux-2.6.orig/mm/slub.c	2012-05-14 08:39:29.831145790 -0500
+++ linux-2.6/mm/slub.c	2012-05-14 08:39:42.871145519 -0500
@@ -3168,29 +3168,16 @@ static inline int kmem_cache_close(struc
 	return 0;
 }
 
-/*
- * Close a cache and release the kmem_cache structure
- * (must be used for caches created using kmem_cache_create)
- */
-void kmem_cache_destroy(struct kmem_cache *s)
+int __kmem_cache_shutdown(struct kmem_cache *s)
 {
-	mutex_lock(&slab_mutex);
-	s->refcount--;
-	if (!s->refcount) {
-		list_del(&s->list);
-		mutex_unlock(&slab_mutex);
-		if (kmem_cache_close(s)) {
-			printk(KERN_ERR "SLUB %s: %s called for cache that "
-				"still has objects.\n", s->name, __func__);
-			dump_stack();
-		}
-		if (s->flags & SLAB_DESTROY_BY_RCU)
-			rcu_barrier();
-		sysfs_slab_remove(s);
-	} else
-		mutex_unlock(&slab_mutex);
+	return kmem_cache_close(s);
+}
+
+void __kmem_cache_destroy(struct kmem_cache *s)
+{
+	kfree(s);
+	sysfs_slab_remove(s);
 }
-EXPORT_SYMBOL(kmem_cache_destroy);
 
 /********************************************************************
  *		Kmalloc subsystem


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC] SL[AUO]B common code 1/9] [slob] define page struct fields used in mm_types.h
  2012-05-14 20:15 ` [RFC] SL[AUO]B common code 1/9] [slob] define page struct fields used in mm_types.h Christoph Lameter
@ 2012-05-16  7:31   ` Glauber Costa
  2012-05-16 14:26     ` Christoph Lameter
  0 siblings, 1 reply; 38+ messages in thread
From: Glauber Costa @ 2012-05-16  7:31 UTC (permalink / raw)
  To: Christoph Lameter; +Cc: Pekka Enberg, linux-mm, David Rientjes, Matt Mackall

On 05/15/2012 12:15 AM, Christoph Lameter wrote:
> - * We use struct page fields to manage some slob allocation aspects,
> - * however to avoid the horrible mess in include/linux/mm_types.h, we'll
> - * just define our own struct page type variant here.
> - */
> -struct slob_page {
> -	union {
> -		struct {
> -			unsigned long flags;	/* mandatory */
> -			atomic_t _count;	/* mandatory */
> -			slobidx_t units;	/* free units left in page */
> -			unsigned long pad[2];
> -			slob_t *free;		/* first free slob_t in page */
> -			struct list_head list;	/* linked list of free pages */
> -		};
> -		struct page page;
> -	};
> -};

I am generally in favor of this, but since this list inside the 
structure doesn't seem to have any particular order, I think it should 
not be called LRU.

It is of course ok to reuse the field, but what about making it a union 
between "list" and "lru"?

It may seem stupid because they all have the same storage size, but the 
word "lru" does trigger a lot of assumptions in people reading the code.


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC] SL[AUO]B common code 2/9] [slab]: Use page struct fields instead of casting
  2012-05-14 20:15 ` [RFC] SL[AUO]B common code 2/9] [slab]: Use page struct fields instead of casting Christoph Lameter
@ 2012-05-16  7:52   ` Glauber Costa
  0 siblings, 0 replies; 38+ messages in thread
From: Glauber Costa @ 2012-05-16  7:52 UTC (permalink / raw)
  To: Christoph Lameter; +Cc: Pekka Enberg, linux-mm, David Rientjes, Matt Mackall

On 05/15/2012 12:15 AM, Christoph Lameter wrote:
> Add fields to the page struct so that it is properly documented that
> slab overlays the lru fields.
>
> This cleans up some casts in slab.
>
> Signed-off-by: Christoph Lameter<cl@linux.com>
Reviewed-by: Glauber Costa <glommer@parallels.com>


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC] SL[AUO]B common code 3/9] Extract common fields from struct kmem_cache
  2012-05-14 20:15 ` [RFC] SL[AUO]B common code 3/9] Extract common fields from struct kmem_cache Christoph Lameter
@ 2012-05-16  7:59   ` Glauber Costa
  2012-05-16 14:28     ` Christoph Lameter
  2012-05-16 15:41     ` Christoph Lameter
       [not found]   ` <alpine.LFD.2.02.1205160943180.2249@tux.localdomain>
  1 sibling, 2 replies; 38+ messages in thread
From: Glauber Costa @ 2012-05-16  7:59 UTC (permalink / raw)
  To: Christoph Lameter; +Cc: Pekka Enberg, linux-mm, David Rientjes, Matt Mackall

On 05/15/2012 12:15 AM, Christoph Lameter wrote:
> Define "COMMON" to include definitions for fields used in all
> slab allocators. After that it will be possible to share code that
> only operates on those fields of kmem_cache.
>
> The patch basically takes the slob definition of kmem cache and
> uses the field namees for the other allocators.
>
> The slob definition of kmem_cache is moved from slob.c to slob_def.h
> so that the location of the kmem_cache definition is the same for
> all allocators.
>
> Signed-off-by: Christoph Lameter<cl@linux.com>
>
> ---
>   include/linux/slab.h     |   11 +++++++++++
>   include/linux/slab_def.h |    8 ++------
>   include/linux/slub_def.h |   11 ++++-------
>   mm/slab.c                |   30 +++++++++++++++---------------
>   mm/slob.c                |    7 -------
>   5 files changed, 32 insertions(+), 35 deletions(-)
>
> Index: linux-2.6/include/linux/slab.h
> ===================================================================
> --- linux-2.6.orig/include/linux/slab.h	2012-05-11 08:34:30.272522792 -0500
> +++ linux-2.6/include/linux/slab.h	2012-05-11 09:36:35.292445608 -0500
> @@ -93,6 +93,17 @@
>   				(unsigned long)ZERO_SIZE_PTR)
>
>   /*
> + * Common fields provided in kmem_cache by all slab allocators
> + */
> +#define SLAB_COMMON \
> +	unsigned int size, align;					\
> +	unsigned long flags;						\
> +	const char *name;						\
> +	int refcount;							\
> +	void (*ctor)(void *);						\
> +	struct list_head list;
> +
> +/*
>    * struct kmem_cache related prototypes

Isn't it better to define struct kmem_cache here, and then put the 
non-common fields under proper ifdefs ?

I myself prefer that style, but style aside, we should aim for what you
call SLAB_COMMON to encompass as many fields as we can, so that the
allocator-specific ones are kept to a minimum... It makes more sense when
looked at from that angle.
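
Roughly something like this (a sketch only; the field selection and the
config symbols are just to illustrate the shape):

	/* include/linux/slab.h */
	struct kmem_cache {
		/* common fields shared by all allocators */
		unsigned int size, align;
		unsigned long flags;
		const char *name;
		int refcount;
		void (*ctor)(void *);
		struct list_head list;
	#ifdef CONFIG_SLUB
		/* slub-only fields */
	#endif
	#ifdef CONFIG_SLAB
		/* slab-only fields */
	#endif
	};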

> Index: linux-2.6/mm/slob.c
> ===================================================================
> --- linux-2.6.orig/mm/slob.c	2012-05-11 08:34:31.792522763 -0500
> +++ linux-2.6/mm/slob.c	2012-05-11 09:42:52.032437799 -0500
> @@ -538,13 +538,6 @@ size_t ksize(const void *block)
>   }
>   EXPORT_SYMBOL(ksize);
>
> -struct kmem_cache {
> -	unsigned int size, align;
> -	unsigned long flags;
> -	const char *name;
> -	void (*ctor)(void *);
> -};
> -

Who defines struct kmem_cache for the slob now ?


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC] SL[AUO]B common code 0/9] Sl[auo]b: Common functionality V1
  2012-05-14 20:15 [RFC] SL[AUO]B common code 0/9] Sl[auo]b: Common functionality V1 Christoph Lameter
                   ` (8 preceding siblings ...)
  2012-05-14 20:15 ` [RFC] SL[AUO]B common code 9/9] slabs: Extract a common function for kmem_cache_destroy Christoph Lameter
@ 2012-05-16  8:08 ` Glauber Costa
  2012-05-16 14:28   ` Christoph Lameter
  9 siblings, 1 reply; 38+ messages in thread
From: Glauber Costa @ 2012-05-16  8:08 UTC (permalink / raw)
  To: Christoph Lameter; +Cc: Pekka Enberg, linux-mm, David Rientjes, Matt Mackall

On 05/15/2012 12:15 AM, Christoph Lameter wrote:
> In the far future one could envision that the current allocators will
> just become storage algorithms that can be chosen based on the need of
> the subsystem. F.e.
>
> Cpu cache dependend performance		= Bonwick allocator (SLAB)
> Minimal cycle count and cache footprint	= SLUB
> Maximum storage density			= SLOB

While we are at it, do we plan to lift the vowel-only usage restriction 
and start allowing consonants for the 3rd letter? We're running short of 
vowels, so we can only accommodate 2 more allocators if this restriction 
stays.


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC] SL[AUO]B common code 5/9] slabs: Common definition for boot state of the slab allocators
  2012-05-14 20:15 ` [RFC] SL[AUO]B common code 5/9] slabs: Common definition for boot state of the slab allocators Christoph Lameter
@ 2012-05-16  8:19   ` Glauber Costa
  2012-05-16 14:31     ` Christoph Lameter
  2012-05-16 15:44     ` Christoph Lameter
  0 siblings, 2 replies; 38+ messages in thread
From: Glauber Costa @ 2012-05-16  8:19 UTC (permalink / raw)
  To: Christoph Lameter; +Cc: Pekka Enberg, linux-mm, David Rientjes, Matt Mackall

On 05/15/2012 12:15 AM, Christoph Lameter wrote:
> All allocators have some sort of support for the bootstrap status.
>
> Setup a common definition for the boot states and make all slab
> allocators use that definition.
>
> Signed-off-by: Christoph Lameter<cl@linux.com>

This is a good change.

> Index: linux-2.6/mm/slab.c
> ===================================================================
> --- linux-2.6.orig/mm/slab.c	2012-05-11 09:43:33.160436947 -0500
> +++ linux-2.6/mm/slab.c	2012-05-11 09:43:53.448436526 -0500
> @@ -87,6 +87,7 @@
>    */
>
>   #include	<linux/slab.h>
> +#include	"slab.h"

Why do we need a separate file for that?
I know some people do prefer it... I am not one of them, just feel 
forced to ask =)

>   static int __init_refok setup_cpu_cache(struct kmem_cache *cachep, gfp_t gfp)
>   {
> -	if (g_cpucache_up == FULL)
> +	if (slab_state == FULL)
>   		return enable_cpucache(cachep, gfp);
>
> -	if (g_cpucache_up == NONE) {
> +	if (slab_state == DOWN) {

Can we avoid doing == tests here?

There are a couple of places where that test seems to be okay (I 
remember 1 in the slub), but at least for the "FULL" test here, we 
should be testing >= FULL.

Also, I don't like the name FULL too much, since I do intend to add a 
new one soon (MEMCG, as you can see in my series)

Since we are using slab-specific states like PARTIAL_L3 here, maybe we 
can use slub-specific ones like SYSFS here with no problem.

If we stick to >= and <= whenever needed, that should reflect much 
better what the algorithm is really doing.
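
E.g., taking the hunk above (illustrative only):

	/* an exact match only works if FULL stays the last state ... */
	if (slab_state == FULL)
		return enable_cpucache(cachep, gfp);

	/* ... while "at least this far through bootstrap" keeps working
	 * if another state is ever added after FULL:
	 */
	if (slab_state >= FULL)
		return enable_cpucache(cachep, gfp);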


> Index: linux-2.6/include/linux/slab.h
> ===================================================================
> --- linux-2.6.orig/include/linux/slab.h	2012-05-11 09:43:33.164436947 -0500
> +++ linux-2.6/include/linux/slab.h	2012-05-11 09:43:53.448436526 -0500
> @@ -117,10 +117,6 @@ int kmem_cache_shrink(struct kmem_cache
>   void kmem_cache_free(struct kmem_cache *, void *);
>   unsigned int kmem_cache_size(struct kmem_cache *);
>
> -/* Slab internal function */
> -struct kmem_cache *__kmem_cache_create(const char *, size_t, size_t,
> -			unsigned long,
> -			void (*)(void *));
>   /*
>    * Please use this macro to create slab caches. Simply specify the
>    * name of the structure and maybe some flags that are listed above.
>

Should be in an earlier patch...


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC] SL[AUO]B common code 6/9] slabs: Use a common mutex definition
  2012-05-14 20:15 ` [RFC] SL[AUO]B common code 6/9] slabs: Use a common mutex definition Christoph Lameter
@ 2012-05-16  8:34   ` Glauber Costa
  2012-05-16 14:32     ` Christoph Lameter
  0 siblings, 1 reply; 38+ messages in thread
From: Glauber Costa @ 2012-05-16  8:34 UTC (permalink / raw)
  To: Christoph Lameter; +Cc: Pekka Enberg, linux-mm, David Rientjes, Matt Mackall

On 05/15/2012 12:15 AM, Christoph Lameter wrote:
> Use the mutex definition from SLAB and make it the common way to take a sleeping lock.
>
> This has the effect of using a mutex instead of a rw semaphore for SLUB.

This is very good, IMHO.

> SLOB gains the use of a mutex for kmem_cache_create serialization.
> Not needed now but SLOB may acquire some more features later (like slabinfo
> / sysfs support) through the expansion of the common code that will
> need this.

Now, won't this hurt the performance of the slob allocator, which seems to 
gain its edge from its simplicity?

But I'll let whoever cares comment on that. From where I stand:

> Signed-off-by: Christoph Lameter<cl@linux.com>
>
Reviewed-by: Glauber Costa <glommer@parallels.com>



^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC] SL[AUO]B common code 8/9] slabs: list addition move to slab_common
  2012-05-14 20:15 ` [RFC] SL[AUO]B common code 8/9] slabs: list addition move to slab_common Christoph Lameter
@ 2012-05-16 10:09   ` Glauber Costa
  2012-05-16 14:33     ` Christoph Lameter
  0 siblings, 1 reply; 38+ messages in thread
From: Glauber Costa @ 2012-05-16 10:09 UTC (permalink / raw)
  To: Christoph Lameter; +Cc: Pekka Enberg, linux-mm, David Rientjes, Matt Mackall

On 05/15/2012 12:15 AM, Christoph Lameter wrote:
> Move the code to append the new kmem_cache to the list of slab caches to
> the kmem_cache_create code in the shared code.
>
> This is possible now since the acquisition of the mutex was moved into
> kmem_cache_create().
>
> Signed-off-by: Christoph Lameter<cl@linux.com>
>
>
> ---
>   mm/slab.c        |    7 +++++--
>   mm/slab_common.c |    3 +++
>   mm/slub.c        |    2 --
>   3 files changed, 8 insertions(+), 4 deletions(-)
>
> Index: linux-2.6/mm/slab_common.c
> ===================================================================
> --- linux-2.6.orig/mm/slab_common.c	2012-05-14 08:39:27.859145830 -0500
> +++ linux-2.6/mm/slab_common.c	2012-05-14 08:39:29.827145790 -0500
> @@ -98,6 +98,9 @@ struct kmem_cache *kmem_cache_create(con
>
>   	s = __kmem_cache_create(name, size, align, flags, ctor);
>
> > +	if (s && s->refcount == 1)
> > +		list_add(&s->list, &slab_caches);
> +
>   oops:

I personally think that the refcount == 1 test is too fragile.
It happens to be true, and is likely to remain true in the future, but there 
is no particular reason that it *has* to be true forever.

Also, the only reason it exists seems to be to work around the fact that 
slab already adds the kmalloc caches to a list in a slightly 
different way. And there have to be cleaner ways to achieve that.


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC] SL[AUO]B common code 7/9] slabs: Move kmem_cache_create mutex handling to common code
  2012-05-14 20:15 ` [RFC] SL[AUO]B common code 7/9] slabs: Move kmem_cache_create mutex handling to common code Christoph Lameter
@ 2012-05-16 10:15   ` Glauber Costa
  0 siblings, 0 replies; 38+ messages in thread
From: Glauber Costa @ 2012-05-16 10:15 UTC (permalink / raw)
  To: Christoph Lameter; +Cc: Pekka Enberg, linux-mm, David Rientjes, Matt Mackall

On 05/15/2012 12:15 AM, Christoph Lameter wrote:
> Move the mutex handling into the common kmem_cache_create()
> function.
>
> Then we can also move more checks out of SLAB's kmem_cache_create()
> into the common code.
>
> Signed-off-by: Christoph Lameter<cl@linux.com>
>
> ---
>   mm/slab.c        |   52 +---------------------------------------------------
>   mm/slab_common.c |   41 ++++++++++++++++++++++++++++++++++++++++-
>   mm/slub.c        |   30 ++++++++++++++----------------
>   3 files changed, 55 insertions(+), 68 deletions(-)

I see you are moving a lot of the other tests I mentioned in your other 
e-mail here

Reviewed-by: Glauber Costa <glommer@parallels.com>


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC] SL[AUO]B common code 1/9] [slob] define page struct fields used in mm_types.h
  2012-05-16  7:31   ` Glauber Costa
@ 2012-05-16 14:26     ` Christoph Lameter
  2012-05-16 15:38       ` Christoph Lameter
  0 siblings, 1 reply; 38+ messages in thread
From: Christoph Lameter @ 2012-05-16 14:26 UTC (permalink / raw)
  To: Glauber Costa; +Cc: Pekka Enberg, linux-mm, David Rientjes, Matt Mackall

On Wed, 16 May 2012, Glauber Costa wrote:

> It is of course ok to reuse the field, but what about we make it a union
> between "list" and "lru" ?

That is what this patch does. You are commenting on code that was
removed.


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC] SL[AUO]B common code 3/9] Extract common fields from struct kmem_cache
  2012-05-16  7:59   ` Glauber Costa
@ 2012-05-16 14:28     ` Christoph Lameter
  2012-05-16 15:41     ` Christoph Lameter
  1 sibling, 0 replies; 38+ messages in thread
From: Christoph Lameter @ 2012-05-16 14:28 UTC (permalink / raw)
  To: Glauber Costa; +Cc: Pekka Enberg, linux-mm, David Rientjes, Matt Mackall

On Wed, 16 May 2012, Glauber Costa wrote:

> >   /*
> > + * Common fields provided in kmem_cache by all slab allocators
> > + */
> > +#define SLAB_COMMON \
> > +	unsigned int size, align;					\
> > +	unsigned long flags;						\
> > +	const char *name;						\
> > +	int refcount;							\
> > +	void (*ctor)(void *);						\
> > +	struct list_head list;
> > +
> > +/*
> >    * struct kmem_cache related prototypes
>
> Isn't it better to define struct kmem_cache here, and then put the non-common
> fields under proper ifdefs ?

Then we would move the definition to include/linux/slab.h. The allocator
kmem_cache struct definitions rely on other allocator-specific
declarations at this point. Won't work.
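
With SLAB_COMMON each allocator keeps its own kmem_cache definition in
its *_def.h and only the shared fields are factored out. Roughly (slub
fields abbreviated here, names as in the existing slub_def.h):

	/* include/linux/slub_def.h */
	struct kmem_cache {
		SLAB_COMMON	/* size, align, flags, name, refcount, ctor, list */
		struct kmem_cache_cpu __percpu *cpu_slab;
		/* further slub-only fields that need slub declarations */
		struct kmem_cache_node *node[MAX_NUMNODES];
	};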

> > Index: linux-2.6/mm/slob.c
> > ===================================================================
> > --- linux-2.6.orig/mm/slob.c	2012-05-11 08:34:31.792522763 -0500
> > +++ linux-2.6/mm/slob.c	2012-05-11 09:42:52.032437799 -0500
> > @@ -538,13 +538,6 @@ size_t ksize(const void *block)
> >   }
> >   EXPORT_SYMBOL(ksize);
> >
> > -struct kmem_cache {
> > -	unsigned int size, align;
> > -	unsigned long flags;
> > -	const char *name;
> > -	void (*ctor)(void *);
> > -};
> > -
>
> Who defines struct kmem_cache for the slob now ?

Its defined in include/linux/slob_def.h


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC] SL[AUO]B common code 0/9] Sl[auo]b: Common functionality V1
  2012-05-16  8:08 ` [RFC] SL[AUO]B common code 0/9] Sl[auo]b: Common functionality V1 Glauber Costa
@ 2012-05-16 14:28   ` Christoph Lameter
  0 siblings, 0 replies; 38+ messages in thread
From: Christoph Lameter @ 2012-05-16 14:28 UTC (permalink / raw)
  To: Glauber Costa; +Cc: Pekka Enberg, linux-mm, David Rientjes, Matt Mackall

On Wed, 16 May 2012, Glauber Costa wrote:

> On 05/15/2012 12:15 AM, Christoph Lameter wrote:
> > In the far future one could envision that the current allocators will
> > just become storage algorithms that can be chosen based on the need of
> > the subsystem. F.e.
> >
> > Cpu cache dependend performance		= Bonwick allocator (SLAB)
> > Minimal cycle count and cache footprint	= SLUB
> > Maximum storage density			= SLOB
>
> While we are at it, do we plan to lift the vowel-only usage restriction and
> start allowing consonants for the 3rd letter? We're running short of vowels,
> so we can only accommodate 2 more allocators if this restriction stays.

Well if we go to storage algorithms then we can have free naming.


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC] SL[AUO]B common code 5/9] slabs: Common definition for boot state of the slab allocators
  2012-05-16  8:19   ` Glauber Costa
@ 2012-05-16 14:31     ` Christoph Lameter
  2012-05-17  9:38       ` Glauber Costa
  2012-05-16 15:44     ` Christoph Lameter
  1 sibling, 1 reply; 38+ messages in thread
From: Christoph Lameter @ 2012-05-16 14:31 UTC (permalink / raw)
  To: Glauber Costa; +Cc: Pekka Enberg, linux-mm, David Rientjes, Matt Mackall

On Wed, 16 May 2012, Glauber Costa wrote:

> > Index: linux-2.6/mm/slab.c
> > ===================================================================
> > --- linux-2.6.orig/mm/slab.c	2012-05-11 09:43:33.160436947 -0500
> > +++ linux-2.6/mm/slab.c	2012-05-11 09:43:53.448436526 -0500
> > @@ -87,6 +87,7 @@
> >    */
> >
> >   #include	<linux/slab.h>
> > +#include	"slab.h"
>
> Why do we need a separate file for that?
> I know some people do prefer it... I am not being one of them, just feel
> forced to ask =)

These are local definitions only relevant for slab allocators using the
common slab code.
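
I.e. mm/slab.h only carries what the allocators share among themselves
(as in the patches of this series), e.g.:

	/* mm/slab.h */
	extern enum slab_state slab_state;
	extern struct mutex slab_mutex;		/* protects slab_caches */
	extern struct list_head slab_caches;

	struct kmem_cache *__kmem_cache_create(const char *name, size_t size,
		size_t align, unsigned long flags, void (*ctor)(void *));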

> >   static int __init_refok setup_cpu_cache(struct kmem_cache *cachep, gfp_t
> > gfp)
> >   {
> > -	if (g_cpucache_up == FULL)
> > +	if (slab_state == FULL)
> >   		return enable_cpucache(cachep, gfp);
> >
> > -	if (g_cpucache_up == NONE) {
> > +	if (slab_state == DOWN) {
>
> Can we avoid doing == tests here?

We could.

> There are a couple of places where that test seems to be okay (I remember 1 in
> the slub), but at least for the "FULL" test here, we should be testing >=
> FULL.
>
> Also, I don't like the name FULL too much, since I do intend to add a new one
> soon (MEMCG, as you can see in my series)

Ok. Why would memcg need an additional state?

> Since we are using slab-specific states like PARTIAL_L3 here, maybe we can use
> slub's like SYSFS here with no problem.

Sure. I thought there would only be special states before UP.

> If we stick to >= and <= whenever needed, that should reflect a lot better
> what the algorithm is really doing

How so?

> > Index: linux-2.6/include/linux/slab.h
> > ===================================================================
> > --- linux-2.6.orig/include/linux/slab.h	2012-05-11 09:43:33.164436947
> > -0500
> > +++ linux-2.6/include/linux/slab.h	2012-05-11 09:43:53.448436526 -0500
> > @@ -117,10 +117,6 @@ int kmem_cache_shrink(struct kmem_cache
> >   void kmem_cache_free(struct kmem_cache *, void *);
> >   unsigned int kmem_cache_size(struct kmem_cache *);
> >
> > -/* Slab internal function */
> > -struct kmem_cache *__kmem_cache_create(const char *, size_t, size_t,
> > -			unsigned long,
> > -			void (*)(void *));
> >   /*
> >    * Please use this macro to create slab caches. Simply specify the
> >    * name of the structure and maybe some flags that are listed above.
> >
>
> Should be in an earlier patch...

Yup.


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC] SL[AUO]B common code 6/9] slabs: Use a common mutex definition
  2012-05-16  8:34   ` Glauber Costa
@ 2012-05-16 14:32     ` Christoph Lameter
  0 siblings, 0 replies; 38+ messages in thread
From: Christoph Lameter @ 2012-05-16 14:32 UTC (permalink / raw)
  To: Glauber Costa; +Cc: Pekka Enberg, linux-mm, David Rientjes, Matt Mackall

On Wed, 16 May 2012, Glauber Costa wrote:

> Now, won't this hurt performance of the slob allocator, that seems to gain its
> edge from its simplicity ?

Well, cache creation and removal are usually not that performance
critical. The code exists anyway for various other things, so we add some
features to SLOB in the process.


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC] SL[AUO]B common code 8/9] slabs: list addition move to slab_common
  2012-05-16 10:09   ` Glauber Costa
@ 2012-05-16 14:33     ` Christoph Lameter
  2012-05-17  9:39       ` Glauber Costa
  0 siblings, 1 reply; 38+ messages in thread
From: Christoph Lameter @ 2012-05-16 14:33 UTC (permalink / raw)
  To: Glauber Costa; +Cc: Pekka Enberg, linux-mm, David Rientjes, Matt Mackall

On Wed, 16 May 2012, Glauber Costa wrote:

> > Index: linux-2.6/mm/slab_common.c
> > ===================================================================
> > --- linux-2.6.orig/mm/slab_common.c	2012-05-14 08:39:27.859145830 -0500
> > +++ linux-2.6/mm/slab_common.c	2012-05-14 08:39:29.827145790 -0500
> > @@ -98,6 +98,9 @@ struct kmem_cache *kmem_cache_create(con
> >
> >   	s = __kmem_cache_create(name, size, align, flags, ctor);
> >
> > +	if (s && s->refcount == 1)
> > +		list_add(&s->list, &slab_caches);
> > +
> >   oops:
>
> I personally think that the refcount == 1 test is too fragile.
> It happens to be true, and is likely to be true in the future, but there is no
> particular reason that is *has* to be true forever.

It's not fragile, since the refcount will always be one for a slab cache that
was just created. There can be no other reference to it, since the
subsystem using it has never received a pointer to the kmem_cache struct
yet.

> Also, the only reasons it exists, seems to be to go around the fact that the
> slab already adds the kmalloc caches to a list in a slightly different way.
> And there has to be cleaner ways to achieve that.

The reason it exists is to distinguish the case of an alias creation from
a true kmem_cache instantiation. The alias does not need to be added to the
list of slabs.
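
I.e. in the common code (same hunk as above, with the reasoning spelled
out):

	s = __kmem_cache_create(name, size, align, flags, ctor);

	/*
	 * A brand new cache comes back with refcount == 1. If slub merged
	 * the request into an existing cache (an alias), that cache is
	 * already on slab_caches and its refcount was bumped past 1, so it
	 * must not be added again.
	 */
	if (s && s->refcount == 1)
		list_add(&s->list, &slab_caches);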


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC] SL[AUO]B common code 1/9] [slob] define page struct fields used in mm_types.h
  2012-05-16 14:26     ` Christoph Lameter
@ 2012-05-16 15:38       ` Christoph Lameter
  2012-05-17  9:41         ` Glauber Costa
  0 siblings, 1 reply; 38+ messages in thread
From: Christoph Lameter @ 2012-05-16 15:38 UTC (permalink / raw)
  To: Glauber Costa; +Cc: Pekka Enberg, linux-mm, David Rientjes, Matt Mackall

On Wed, 16 May 2012, Christoph Lameter wrote:

> On Wed, 16 May 2012, Glauber Costa wrote:
>
> > It is of course ok to reuse the field, but what about we make it a union
> > between "list" and "lru" ?
>
> That is what this patch does. You are commenting on code that was
> removed.

Argh. No, it doesn't... It will be easy to add though. But then you have
two list_head definitions in struct page that just differ in name.

Index: linux-2.6/mm/slob.c
===================================================================
--- linux-2.6.orig/mm/slob.c	2012-05-16 04:36:42.531867096 -0500
+++ linux-2.6/mm/slob.c	2012-05-16 04:36:13.611867636 -0500
@@ -142,13 +142,13 @@ static inline int slob_page_free(struct

 static void set_slob_page_free(struct page *sp, struct list_head *list)
 {
-	list_add(&sp->lru, list);
+	list_add(&sp->list, list);
 	__SetPageSlobFree(sp);
 }

 static inline void clear_slob_page_free(struct page *sp)
 {
-	list_del(&sp->lru);
+	list_del(&sp->list);
 	__ClearPageSlobFree(sp);
 }

@@ -314,7 +314,7 @@ static void *slob_alloc(size_t size, gfp

 	spin_lock_irqsave(&slob_lock, flags);
 	/* Iterate through each partially free page, try to find room */
-	list_for_each_entry(sp, slob_list, lru) {
+	list_for_each_entry(sp, slob_list, list) {
 #ifdef CONFIG_NUMA
 		/*
 		 * If there's a node specification, search for a partial
@@ -328,7 +328,7 @@ static void *slob_alloc(size_t size, gfp
 			continue;

 		/* Attempt to alloc */
-		prev = sp->lru.prev;
+		prev = sp->list.prev;
 		b = slob_page_alloc(sp, size, align);
 		if (!b)
 			continue;
@@ -354,7 +354,7 @@ static void *slob_alloc(size_t size, gfp
 		spin_lock_irqsave(&slob_lock, flags);
 		sp->units = SLOB_UNITS(PAGE_SIZE);
 		sp->freelist = b;
-		INIT_LIST_HEAD(&sp->lru);
+		INIT_LIST_HEAD(&sp->list);
 		set_slob(b, SLOB_UNITS(PAGE_SIZE), b + SLOB_UNITS(PAGE_SIZE));
 		set_slob_page_free(sp, slob_list);
 		b = slob_page_alloc(sp, size, align);
Index: linux-2.6/include/linux/mm_types.h
===================================================================
--- linux-2.6.orig/include/linux/mm_types.h	2012-05-16 04:36:42.535867105 -0500
+++ linux-2.6/include/linux/mm_types.h	2012-05-16 04:35:37.963868439 -0500
@@ -97,6 +97,7 @@ struct page {
 		struct list_head lru;	/* Pageout list, eg. active_list
 					 * protected by zone->lru_lock !
 					 */
+		struct list_head list;	/* slobs list of pages */
 		struct {		/* slub per cpu partial pages */
 			struct page *next;	/* Next partial slab */
 #ifdef CONFIG_64BIT


* Re: [RFC] SL[AUO]B common code 3/9] Extract common fields from struct kmem_cache
  2012-05-16  7:59   ` Glauber Costa
  2012-05-16 14:28     ` Christoph Lameter
@ 2012-05-16 15:41     ` Christoph Lameter
  1 sibling, 0 replies; 38+ messages in thread
From: Christoph Lameter @ 2012-05-16 15:41 UTC (permalink / raw)
  To: Glauber Costa; +Cc: Pekka Enberg, linux-mm, David Rientjes, Matt Mackall

On Wed, 16 May 2012, Glauber Costa wrote:

> Who defines struct kmem_cache for the slob now ?

The hunk that added this to include/linux/slob_def.h was dropped. The next
post will include it.



* Re: [RFC] SL[AUO]B common code 5/9] slabs: Common definition for boot state of the slab allocators
  2012-05-16  8:19   ` Glauber Costa
  2012-05-16 14:31     ` Christoph Lameter
@ 2012-05-16 15:44     ` Christoph Lameter
  1 sibling, 0 replies; 38+ messages in thread
From: Christoph Lameter @ 2012-05-16 15:44 UTC (permalink / raw)
  To: Glauber Costa; +Cc: Pekka Enberg, linux-mm, David Rientjes, Matt Mackall

On Wed, 16 May 2012, Glauber Costa wrote:

> > @@ -117,10 +117,6 @@ int kmem_cache_shrink(struct kmem_cache
> >   void kmem_cache_free(struct kmem_cache *, void *);
> >   unsigned int kmem_cache_size(struct kmem_cache *);
> >
> > -/* Slab internal function */
> > -struct kmem_cache *__kmem_cache_create(const char *, size_t, size_t,
> > -			unsigned long,
> > -			void (*)(void *));
> >   /*
> >    * Please use this macro to create slab caches. Simply specify the
> >    * name of the structure and maybe some flags that are listed above.
> >
>
> Should be in an earlier patch...

This patch moves the definition to mm/slab.h since it is only used by the
allocators.


* Re: [RFC] SL[AUO]B common code 5/9] slabs: Common definition for boot state of the slab allocators
  2012-05-16 14:31     ` Christoph Lameter
@ 2012-05-17  9:38       ` Glauber Costa
  2012-05-17 14:07         ` Christoph Lameter
  0 siblings, 1 reply; 38+ messages in thread
From: Glauber Costa @ 2012-05-17  9:38 UTC (permalink / raw)
  To: Christoph Lameter; +Cc: Pekka Enberg, linux-mm, David Rientjes, Matt Mackall

On 05/16/2012 06:31 PM, Christoph Lameter wrote:
>> There are a couple of places where that test seems to be okay (I remember 1 in
>> >  the slub), but at least for the "FULL" test here, we should be testing >=
>> >  FULL.
>> >
>> >  Also, I don't like the name FULL too much, since I do intend to add a new one
>> >  soon (MEMCG, as you can see in my series)
> Ok. Why would memcg need an additional state?

Please refer to my patchset for the full story.
I add state both to the slab and to the slub for that.

But in summary, it is not unlike the "SYSFS" state: we depend on 
something else outside of the slab domain to be ready before we can proceed.

Specifically, we need to register each cache with an index. And for 
that, we use idr/ida. When it is ready, we run code to register indexes 
for all caches that are already available. After that, we just grab an 
index right away - much like sysfs state for aliases.
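
A minimal sketch of that registration step, assuming ida_simple_get() is the
mechanism used and with a made-up field name for the per-cache index
(illustrative only, not the actual memcg patches):

#include <linux/slab.h>
#include <linux/idr.h>
#include <linux/gfp.h>

static DEFINE_IDA(slab_cache_ida);

static int register_cache_index(struct kmem_cache *s)
{
	/* hand out the next free index once the ida machinery is ready */
	int id = ida_simple_get(&slab_cache_ida, 0, 0, GFP_KERNEL);

	if (id < 0)
		return id;
	s->memcg_index = id;	/* hypothetical field, for illustration */
	return 0;
}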

>> >  Since we are using slab-specific states like PARTIAL_L3 here, maybe we can use
>> >  slub's like SYSFS here with no problem.
> Sure. I thought there would only be special states before UP.
>
>> >  If we stick to >= and <= whenever needed, that should reflect a lot better
>> >  what the algorithm is really doing
> How so?

In the sense that we very rarely want to do some action *at a specific 
moment*. Most of the time we want to separate the world into before and 
after a state. We test == instead of <= and >=, and it happens to work 
because of the specific order of things, which is subject to change in
one rework or another...



* Re: [RFC] SL[AUO]B common code 8/9] slabs: list addition move to slab_common
  2012-05-16 14:33     ` Christoph Lameter
@ 2012-05-17  9:39       ` Glauber Costa
  2012-05-17 14:09         ` Christoph Lameter
  0 siblings, 1 reply; 38+ messages in thread
From: Glauber Costa @ 2012-05-17  9:39 UTC (permalink / raw)
  To: Christoph Lameter; +Cc: Pekka Enberg, linux-mm, David Rientjes, Matt Mackall

On 05/16/2012 06:33 PM, Christoph Lameter wrote:
>> >  Also, the only reasons it exists, seems to be to go around the fact that the
>> >  slab already adds the kmalloc caches to a list in a slightly different way.
>> >  And there has to be cleaner ways to achieve that.
> The reason it exists is to distinguish the case of an alias creation from
> a true kmem_cache instatiation. The alias does not need to be added to the
> list of slabs.
>
Worth a comment, then, maybe? It tricked me a bit.


* Re: [RFC] SL[AUO]B common code 1/9] [slob] define page struct fields used in mm_types.h
  2012-05-16 15:38       ` Christoph Lameter
@ 2012-05-17  9:41         ` Glauber Costa
  2012-05-17 14:09           ` Christoph Lameter
  0 siblings, 1 reply; 38+ messages in thread
From: Glauber Costa @ 2012-05-17  9:41 UTC (permalink / raw)
  To: Christoph Lameter; +Cc: Pekka Enberg, linux-mm, David Rientjes, Matt Mackall

On 05/16/2012 07:38 PM, Christoph Lameter wrote:
> On Wed, 16 May 2012, Christoph Lameter wrote:
>
>> >  On Wed, 16 May 2012, Glauber Costa wrote:
>> >
>>> >  >  It is of course ok to reuse the field, but what about we make it a union
>>> >  >  between "list" and "lru" ?
>> >
>> >  That is what this patch does. You are commenting on code that was
>> >  removed.
> Argh. No, it doesn't... It will be easy to add, though. But then you have
> two list_head definitions in page struct that just differ in name.
As I said previously, it sounds stupid if you look at it from the typing
system's point of view.

But when I read something like list_add(&sp->lru, list), very specific
assumptions about list ordering come to mind. It's something that should
be done for the sake of the readers.
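
A sketch of the union variant being suggested: both names alias the same
storage, so slob code reads "list" while no space is added to struct page
(illustrative only, not the actual mm_types.h layout):

#include <linux/list.h>

struct page_layout_sketch {
	union {
		struct list_head lru;	/* pageout list, LRU users */
		struct list_head list;	/* slob's list of partially free pages */
	};
};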


* Re: [RFC] SL[AUO]B common code 5/9] slabs: Common definition for boot state of the slab allocators
  2012-05-17  9:38       ` Glauber Costa
@ 2012-05-17 14:07         ` Christoph Lameter
  2012-05-17 14:08           ` Glauber Costa
  0 siblings, 1 reply; 38+ messages in thread
From: Christoph Lameter @ 2012-05-17 14:07 UTC (permalink / raw)
  To: Glauber Costa; +Cc: Pekka Enberg, linux-mm, David Rientjes, Matt Mackall

On Thu, 17 May 2012, Glauber Costa wrote:

> On 05/16/2012 06:31 PM, Christoph Lameter wrote:
> > > There are a couple of places where that test seems to be okay (I remember
> > > 1 in
> > > >  the slub), but at least for the "FULL" test here, we should be
> > > testing >=
> > > >  FULL.
> > > >
> > > >  Also, I don't like the name FULL too much, since I do intend to add a
> > > new one
> > > >  soon (MEMCG, as you can see in my series)
> > Ok. Why would memcg need an additional state?
>
> Please refer to my patchset for the full story.
> I add state both to the slab and to the slub for that.
>
> But in summary, it is not unlike the "SYSFS" state: we depend on something
> else outside of the slab domain to be ready before we can proceed.
>
> Specifically, we need to register each cache with an index. And for that, we
> use idr/ida. When it is ready, we run code to register indexes for all caches
> that are already available. After that, we just grab an index right away -
> much like sysfs state for aliases.

Why can this processing not be done when sysfs has just been initialized?

> > > >  Since we are using slab-specific states like PARTIAL_L3 here, maybe we
> > > can use
> > > >  slub's like SYSFS here with no problem.
> > Sure. I thought there would only be special states before UP.
> >
> > > >  If we stick to >= and <= whenever needed, that should reflect a lot
> > > better
> > > >  what the algorithm is really doing
> > How so?
>
> In the sense that we very rarely want to do some action *at a specific
> moment*. Most of the time we want to separate the world into before and after
> a state. We test == instead of <= and >=, and it happens to work because of
> the specific order of things, which is subject to change in one rework or
> another...

The reason to use == is that we want things to happen only at a
particular stage. The == SYSFS test means we will only do an action
if the slab system is fully functional. Such things will have to be
reevaluated if the number of states changes.
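
For reference, the two comparison styles under discussion, shown with a
hypothetical slab_state enum whose names follow this thread rather than any
particular tree:

enum slab_state { DOWN, PARTIAL, PARTIAL_L3, UP, SYSFS };

static enum slab_state slab_state;

static void state_check_example(void)
{
	if (slab_state == SYSFS) {
		/* acts only in the exact final state; silently stops
		 * matching if a state is ever added after SYSFS */
	}

	if (slab_state >= SYSFS) {
		/* acts in SYSFS and any later state; survives the
		 * addition of further states */
	}
}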


* Re: [RFC] SL[AUO]B common code 5/9] slabs: Common definition for boot state of the slab allocators
  2012-05-17 14:07         ` Christoph Lameter
@ 2012-05-17 14:08           ` Glauber Costa
  2012-05-17 14:16             ` Christoph Lameter
  0 siblings, 1 reply; 38+ messages in thread
From: Glauber Costa @ 2012-05-17 14:08 UTC (permalink / raw)
  To: Christoph Lameter; +Cc: Pekka Enberg, linux-mm, David Rientjes, Matt Mackall

On 05/17/2012 06:07 PM, Christoph Lameter wrote:
> On Thu, 17 May 2012, Glauber Costa wrote:
>
>> On 05/16/2012 06:31 PM, Christoph Lameter wrote:
>>>> There are a couple of places where that test seems to be okay (I remember
>>>> 1 in
>>>>>   the slub), but at least for the "FULL" test here, we should be
>>>> testing >=
>>>>>   FULL.
>>>>>
>>>>>   Also, I don't like the name FULL too much, since I do intend to add a
>>>> new one
>>>>>   soon (MEMCG, as you can see in my series)
>>> Ok. Why would memcg need an additional state?
>>
>> Please refer to my patchset for the full story.
>> I add state both to the slab and to the slub for that.
>>
>> But in summary, it is not unlike the "SYSFS" state: we depend on something
>> else outside of the slab domain to be ready before we can proceed.
>>
>> Specifically, we need to register each cache with an index. And for that, we
>> use idr/ida. When it is ready, we run code to register indexes for all caches
>> that are already available. After that, we just grab an index right away -
>> much like sysfs state for aliases.
>
> Why can this processing not be done when sysfs has just been initialized?

If we can be 100 % sure that idr/ida is always initialized before sysfs, 
then yes, we can.

>>>>>   Since we are using slab-specific states like PARTIAL_L3 here, maybe we
>>>> can use
>>>>>   slub's like SYSFS here with no problem.
>>> Sure. I thought there would only be special states before UP.
>>>
>>>>>   If we stick to >= and <= whenever needed, that should reflect a lot
>>>> better
>>>>>   what the algorithm is really doing
>>> How so?
>>
>> In the sense that we very rarely want to do some action *at a specific
>> moment*. Most of the time we want to separate the world into before and after
>> a state. We test == instead of <= and >=, and it happens to work because of
>> the specific order of things, which are subject to change in a rework or
>> another...
>
> The reason to use == is because we want things to happen only at a
> particular stage of things. The == SYSFS means we will only do an action
> if the slab system is fully functional. Such things will have to be
> reevaluated if the number of states change.

Yes, but you are actually arguing in my favor. "fully functional" means 
 >= SYSFS, not == SYSFS.

If, because of whatever reordering people may decide to do, another state
is added or this function is called later, that will fail.



* Re: [RFC] SL[AUO]B common code 8/9] slabs: list addition move to slab_common
  2012-05-17  9:39       ` Glauber Costa
@ 2012-05-17 14:09         ` Christoph Lameter
  0 siblings, 0 replies; 38+ messages in thread
From: Christoph Lameter @ 2012-05-17 14:09 UTC (permalink / raw)
  To: Glauber Costa; +Cc: Pekka Enberg, linux-mm, David Rientjes, Matt Mackall

On Thu, 17 May 2012, Glauber Costa wrote:

> On 05/16/2012 06:33 PM, Christoph Lameter wrote:
> > > >  Also, the only reasons it exists, seems to be to go around the fact
> > > that the
> > > >  slab already adds the kmalloc caches to a list in a slightly different
> > > way.
> > > >  And there has to be cleaner ways to achieve that.
> > The reason it exists is to distinguish the case of an alias creation from
> > a true kmem_cache instatiation. The alias does not need to be added to the
> > list of slabs.
> >
> Worth a comment, then, maybe? It tricked me a bit.

Ok.


* Re: [RFC] SL[AUO]B common code 1/9] [slob] define page struct fields used in mm_types.h
  2012-05-17  9:41         ` Glauber Costa
@ 2012-05-17 14:09           ` Christoph Lameter
  0 siblings, 0 replies; 38+ messages in thread
From: Christoph Lameter @ 2012-05-17 14:09 UTC (permalink / raw)
  To: Glauber Costa; +Cc: Pekka Enberg, linux-mm, David Rientjes, Matt Mackall

On Thu, 17 May 2012, Glauber Costa wrote:

> On 05/16/2012 07:38 PM, Christoph Lameter wrote:
> > On Wed, 16 May 2012, Christoph Lameter wrote:
> >
> > > >  On Wed, 16 May 2012, Glauber Costa wrote:
> > > >
> > > > >  >  It is of course ok to reuse the field, but what about we make it a
> > > > union
> > > > >  >  between "list" and "lru" ?
> > > >
> > > >  That is what this patch does. You are commenting on code that was
> > > >  removed.
> > Argh. No it doesnt..... It will be easy to add though. But then you have
> > two list_head definitions in page struct that just differ in name.
> As I said previously, it sounds stupid if you look at it from the typing
> system's point of view.
>
> But when I read something like list_add(&sp->lru, list), very specific
> assumptions about list ordering come to mind. It's something that should
> be done for the sake of the readers.

All right, I will merge the changes that I posted into the patch.


* Re: [RFC] SL[AUO]B common code 5/9] slabs: Common definition for boot state of the slab allocators
  2012-05-17 14:08           ` Glauber Costa
@ 2012-05-17 14:16             ` Christoph Lameter
  2012-05-17 14:19               ` Glauber Costa
  0 siblings, 1 reply; 38+ messages in thread
From: Christoph Lameter @ 2012-05-17 14:16 UTC (permalink / raw)
  To: Glauber Costa; +Cc: Pekka Enberg, linux-mm, David Rientjes, Matt Mackall

On Thu, 17 May 2012, Glauber Costa wrote:

> > Why can this processing not be done when sysfs has just been initialized?
>
> If we can be 100 % sure that idr/ida is always initialized before sysfs, then
> yes, we can.

idr_init_cache() is run even before kmem_cache_init_late(). Have a look at
init/main.c. No need to muck with the bootup sequence.

> > The reason to use == is because we want things to happen only at a
> > particular stage of things. The == SYSFS means we will only do an action
> > if the slab system is fully functional. Such things will have to be
> > reevaluated if the number of states change.
>
> Yes, but you are actually arguing in my favor. "fully functional" means >=
> SYSFS, not == SYSFS.

That is only true if you add another state.

> If, because of whatever reordering people may decide to do, another state is
> added or this function is called later, that will fail

Then the assumption that SYSFS is the final state is no longer true, and
therefore the code needs to be inspected to see whether this change affects
anything.


* Re: [RFC] SL[AUO]B common code 5/9] slabs: Common definition for boot state of the slab allocators
  2012-05-17 14:16             ` Christoph Lameter
@ 2012-05-17 14:19               ` Glauber Costa
  2012-05-17 14:27                 ` Christoph Lameter
  0 siblings, 1 reply; 38+ messages in thread
From: Glauber Costa @ 2012-05-17 14:19 UTC (permalink / raw)
  To: Christoph Lameter; +Cc: Pekka Enberg, linux-mm, David Rientjes, Matt Mackall

On 05/17/2012 06:16 PM, Christoph Lameter wrote:
> That is only true if you add another state.

precisely.

>> >  If for whatever reordering people may decide doing another state is added, or
>> >  this function is called later, that will fail
> Then the assumptions that SYSFS is the final state is no longer true and
> therefore the code needs to be inspected if this change affects anything.
>
Yes, by humans, who are known to make mistakes. Using >= is a small
precaution that protects against failures in this realm.


* Re: [RFC] SL[AUO]B common code 5/9] slabs: Common definition for boot state of the slab allocators
  2012-05-17 14:19               ` Glauber Costa
@ 2012-05-17 14:27                 ` Christoph Lameter
  0 siblings, 0 replies; 38+ messages in thread
From: Christoph Lameter @ 2012-05-17 14:27 UTC (permalink / raw)
  To: Glauber Costa; +Cc: Pekka Enberg, linux-mm, David Rientjes, Matt Mackall

On Thu, 17 May 2012, Glauber Costa wrote:

> > > >  If for whatever reordering people may decide doing another state is
> > > added, or
> > > >  this function is called later, that will fail
> > Then the assumptions that SYSFS is the final state is no longer true and
> > therefore the code needs to be inspected if this change affects anything.
> >
> Yes, by humans, who are known to make mistakes. Using >= is a small
> precaution that protects against failures in this realm.

No, it risks breakage because the code would now run under a condition in
which the system has not been brought up fully.


* Re: [RFC] SL[AUO]B common code 3/9] Extract common fields from struct kmem_cache
       [not found]       ` <alpine.LFD.2.02.1205181221570.3899@tux.localdomain>
@ 2012-05-18 13:57         ` Christoph Lameter
  0 siblings, 0 replies; 38+ messages in thread
From: Christoph Lameter @ 2012-05-18 13:57 UTC (permalink / raw)
  To: Pekka Enberg; +Cc: Matt Mackall, linux-mm

On Fri, 18 May 2012, Pekka Enberg wrote:

> Why not make a "struct kmem_cache_common" structure and use that? You can
> embed it in "struct kmem_cache" just fine.

Tried that, but I ended up having to qualify all common variables.

I.e.

struct kmem_cache {
	struct kmem_cache_common {
		int a, b;
	} common;
} *kmem_cache;


requires

	kmem_cache->common.a

instead of

	kmem_cache->a

That in turn requires significant changes to all allocators.
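
One possible way around the qualification, sketched with the anonymous
member extension (a GCC extension at the time, later standardized in C11);
the common field names here are purely illustrative:

#include <linux/list.h>

struct kmem_cache {
	struct {			/* unnamed common part */
		int refcount;
		unsigned int size;
		struct list_head list;
		const char *name;
	};
	/* allocator specific fields follow */
};

The fields stay reachable as s->refcount, but an untagged anonymous member
cannot be reused elsewhere as a named kmem_cache_common type (that would
need -fms-extensions), so it only partly addresses the problem.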


end of thread, other threads:[~2012-05-18 13:57 UTC | newest]

Thread overview: 38+ messages
2012-05-14 20:15 [RFC] SL[AUO]B common code 0/9] Sl[auo]b: Common functionality V1 Christoph Lameter
2012-05-14 20:15 ` [RFC] SL[AUO]B common code 1/9] [slob] define page struct fields used in mm_types.h Christoph Lameter
2012-05-16  7:31   ` Glauber Costa
2012-05-16 14:26     ` Christoph Lameter
2012-05-16 15:38       ` Christoph Lameter
2012-05-17  9:41         ` Glauber Costa
2012-05-17 14:09           ` Christoph Lameter
2012-05-14 20:15 ` [RFC] SL[AUO]B common code 2/9] [slab]: Use page struct fields instead of casting Christoph Lameter
2012-05-16  7:52   ` Glauber Costa
2012-05-14 20:15 ` [RFC] SL[AUO]B common code 3/9] Extract common fields from struct kmem_cache Christoph Lameter
2012-05-16  7:59   ` Glauber Costa
2012-05-16 14:28     ` Christoph Lameter
2012-05-16 15:41     ` Christoph Lameter
     [not found]   ` <alpine.LFD.2.02.1205160943180.2249@tux.localdomain>
     [not found]     ` <alpine.DEB.2.00.1205160922520.25512@router.home>
     [not found]       ` <alpine.LFD.2.02.1205181221570.3899@tux.localdomain>
2012-05-18 13:57         ` Christoph Lameter
2012-05-14 20:15 ` [RFC] SL[AUO]B common code 4/9] slabs: Extract common code for kmem_cache_create Christoph Lameter
2012-05-14 20:15 ` [RFC] SL[AUO]B common code 5/9] slabs: Common definition for boot state of the slab allocators Christoph Lameter
2012-05-16  8:19   ` Glauber Costa
2012-05-16 14:31     ` Christoph Lameter
2012-05-17  9:38       ` Glauber Costa
2012-05-17 14:07         ` Christoph Lameter
2012-05-17 14:08           ` Glauber Costa
2012-05-17 14:16             ` Christoph Lameter
2012-05-17 14:19               ` Glauber Costa
2012-05-17 14:27                 ` Christoph Lameter
2012-05-16 15:44     ` Christoph Lameter
2012-05-14 20:15 ` [RFC] SL[AUO]B common code 6/9] slabs: Use a common mutex definition Christoph Lameter
2012-05-16  8:34   ` Glauber Costa
2012-05-16 14:32     ` Christoph Lameter
2012-05-14 20:15 ` [RFC] SL[AUO]B common code 7/9] slabs: Move kmem_cache_create mutex handling to common code Christoph Lameter
2012-05-16 10:15   ` Glauber Costa
2012-05-14 20:15 ` [RFC] SL[AUO]B common code 8/9] slabs: list addition move to slab_common Christoph Lameter
2012-05-16 10:09   ` Glauber Costa
2012-05-16 14:33     ` Christoph Lameter
2012-05-17  9:39       ` Glauber Costa
2012-05-17 14:09         ` Christoph Lameter
2012-05-14 20:15 ` [RFC] SL[AUO]B common code 9/9] slabs: Extract a common function for kmem_cache_destroy Christoph Lameter
2012-05-16  8:08 ` [RFC] SL[AUO]B common code 0/9] Sl[auo]b: Common functionality V1 Glauber Costa
2012-05-16 14:28   ` Christoph Lameter
