* [REPOST PATCH 0/4] slab: implement byte sized indexes for the freelist of a slab @ 2013-09-06 5:57 Joonsoo Kim 2013-09-06 5:57 ` [REPOST PATCH 1/4] slab: factor out calculate nr objects in cache_estimate Joonsoo Kim ` (3 more replies) 0 siblings, 4 replies; 17+ messages in thread From: Joonsoo Kim @ 2013-09-06 5:57 UTC (permalink / raw) To: Pekka Enberg Cc: Christoph Lameter, Andrew Morton, Joonsoo Kim, David Rientjes, linux-mm, linux-kernel, Joonsoo Kim * THIS IS JUST REPOSTED ACCORDING TO MAINTAINER'S REQUEST * * Changes from original post Correct the position of the results. Attach more results about cache-misses and elapsed time on a hackbench test. ----------------------------------------------------- This patchset implements byte sized indexes for the freelist of a slab. Currently, the freelist of a slab consist of unsigned int sized indexes. Most of slabs have less number of objects than 256, so much space is wasted. To reduce this overhead, this patchset implements byte sized indexes for the freelist of a slab. With it, we can save 3 bytes for each objects. This introduce one likely branch to functions used for setting/getting objects to/from the freelist, but we may get more benefits from this change. Below is some numbers of 'cat /proc/slabinfo' related to my previous posting and this patchset. * Before * # name <active_objs> <num_objs> <objsize> <objperslab> <pagesperslab> : tunables [snip...] kmalloc-512 527 600 512 8 1 : tunables 54 27 0 : slabdata 75 75 0 kmalloc-256 210 210 256 15 1 : tunables 120 60 0 : slabdata 14 14 0 kmalloc-192 1040 1040 192 20 1 : tunables 120 60 0 : slabdata 52 52 0 kmalloc-96 750 750 128 30 1 : tunables 120 60 0 : slabdata 25 25 0 kmalloc-64 2773 2773 64 59 1 : tunables 120 60 0 : slabdata 47 47 0 kmalloc-128 660 690 128 30 1 : tunables 120 60 0 : slabdata 23 23 0 kmalloc-32 11200 11200 32 112 1 : tunables 120 60 0 : slabdata 100 100 0 kmem_cache 197 200 192 20 1 : tunables 120 60 0 : slabdata 10 10 0 * After my previous posting(overload struct slab over struct page) * # name <active_objs> <num_objs> <objsize> <objperslab> <pagesperslab> : tunables [snip...] kmalloc-512 525 640 512 8 1 : tunables 54 27 0 : slabdata 80 80 0 kmalloc-256 210 210 256 15 1 : tunables 120 60 0 : slabdata 14 14 0 kmalloc-192 1016 1040 192 20 1 : tunables 120 60 0 : slabdata 52 52 0 kmalloc-96 560 620 128 31 1 : tunables 120 60 0 : slabdata 20 20 0 kmalloc-64 2148 2280 64 60 1 : tunables 120 60 0 : slabdata 38 38 0 kmalloc-128 647 682 128 31 1 : tunables 120 60 0 : slabdata 22 22 0 kmalloc-32 11360 11413 32 113 1 : tunables 120 60 0 : slabdata 101 101 0 kmem_cache 197 200 192 20 1 : tunables 120 60 0 : slabdata 10 10 0 kmem_caches consisting of objects less than or equal to 128 byte have one more objects in a slab. You can see it at objperslab. We can improve further with this patchset. * My previous posting + this patchset * # name <active_objs> <num_objs> <objsize> <objperslab> <pagesperslab> : tunables [snip...] 
kmalloc-512 521 648 512 8 1 : tunables 54 27 0 : slabdata 81 81 0 kmalloc-256 208 208 256 16 1 : tunables 120 60 0 : slabdata 13 13 0 kmalloc-192 1029 1029 192 21 1 : tunables 120 60 0 : slabdata 49 49 0 kmalloc-96 529 589 128 31 1 : tunables 120 60 0 : slabdata 19 19 0 kmalloc-64 2142 2142 64 63 1 : tunables 120 60 0 : slabdata 34 34 0 kmalloc-128 660 682 128 31 1 : tunables 120 60 0 : slabdata 22 22 0 kmalloc-32 11716 11780 32 124 1 : tunables 120 60 0 : slabdata 95 95 0 kmem_cache 197 210 192 21 1 : tunables 120 60 0 : slabdata 10 10 0 kmem_caches consisting of objects less than or equal to 256 byte have one or more objects than before. In the case of kmalloc-32, we have 11 more objects, so 352 bytes (11 * 32) are saved and this is roughly 9% saving of memory. Of couse, this percentage decreases as the number of objects in a slab decreases. Here are the performance results on my 4 cpus machine. * Before * Performance counter stats for 'perf bench sched messaging -g 50 -l 1000' (10 runs): 238,309,671 cache-misses ( +- 0.40% ) 12.010172090 seconds time elapsed ( +- 0.21% ) * After my previous posting * Performance counter stats for 'perf bench sched messaging -g 50 -l 1000' (10 runs): 229,945,138 cache-misses ( +- 0.23% ) 11.627897174 seconds time elapsed ( +- 0.14% ) * My previous posting + this patchset * Performance counter stats for 'perf bench sched messaging -g 50 -l 1000' (10 runs): 218,640,472 cache-misses ( +- 0.42% ) 11.504999837 seconds time elapsed ( +- 0.21% ) cache-misses are reduced by each patchset, roughly 5% respectively. And elapsed times are also improved by 3.1% and 4.2% to baseline, respectively. I think that all patchsets deserve to be merged, since it reduces memory usage and also improves performance. :) Please let me know expert's opinions :) Thanks. This patchset comes from a Christoph's idea. https://lkml.org/lkml/2013/8/23/315 Patches are on top of my previous posting. https://lkml.org/lkml/2013/8/22/137 Joonsoo Kim (4): slab: factor out calculate nr objects in cache_estimate slab: introduce helper functions to get/set free object slab: introduce byte sized index for the freelist of a slab slab: make more slab management structure off the slab mm/slab.c | 138 +++++++++++++++++++++++++++++++++++++++++++++---------------- 1 file changed, 103 insertions(+), 35 deletions(-) -- 1.7.9.5 -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 17+ messages in thread
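To make the space arithmetic above concrete, the following is a minimal standalone userspace sketch (not kernel code) of the objects-per-slab calculation this series relies on; it mirrors the calculate_nr_objs() helper factored out in patch 1. The 4 KiB slab size and the 8-byte freelist alignment are assumptions of the example rather than guaranteed kernel values, but with these inputs it reproduces the kmalloc-32 objperslab figures shown above: 113 objects with an unsigned int index and 124 with a byte index.

#include <stdio.h>
#include <stddef.h>

#define ALIGN(x, a)     (((x) + (a) - 1) & ~((size_t)(a) - 1))

static int nr_objs(size_t slab_size, size_t obj_size,
                   size_t idx_size, size_t align)
{
        /* Initial guess: ignore the alignment padding of the freelist. */
        int n = slab_size / (obj_size + idx_size);

        /* The guess can be one too high once padding is accounted for. */
        if (slab_size - n * obj_size < ALIGN(n * idx_size, align))
                n--;

        return n;
}

int main(void)
{
        size_t slab = 4096, obj = 32, align = 8;   /* assumed example values */

        printf("unsigned int index : %d objects per slab\n",
               nr_objs(slab, obj, sizeof(unsigned int), align));   /* 113 */
        printf("unsigned char index: %d objects per slab\n",
               nr_objs(slab, obj, sizeof(unsigned char), align));  /* 124 */
        return 0;
}

Any C compiler will do for trying it out, e.g. gcc -Wall nr_objs.c && ./a.out.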
* [REPOST PATCH 1/4] slab: factor out calculate nr objects in cache_estimate 2013-09-06 5:57 [REPOST PATCH 0/4] slab: implement byte sized indexes for the freelist of a slab Joonsoo Kim @ 2013-09-06 5:57 ` Joonsoo Kim 2013-09-06 15:48 ` Christoph Lameter 2013-09-06 5:57 ` [REPOST PATCH 2/4] slab: introduce helper functions to get/set free object Joonsoo Kim ` (2 subsequent siblings) 3 siblings, 1 reply; 17+ messages in thread From: Joonsoo Kim @ 2013-09-06 5:57 UTC (permalink / raw) To: Pekka Enberg Cc: Christoph Lameter, Andrew Morton, Joonsoo Kim, David Rientjes, linux-mm, linux-kernel, Joonsoo Kim This logic is not simple to understand so that making separate function helping readability. Additionally, we can use this change in the following patch which implement for freelist to have another sized index in according to nr objects. Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com> diff --git a/mm/slab.c b/mm/slab.c index f3868fe..9d4bad5 100644 --- a/mm/slab.c +++ b/mm/slab.c @@ -565,9 +565,31 @@ static inline struct array_cache *cpu_cache_get(struct kmem_cache *cachep) return cachep->array[smp_processor_id()]; } -static size_t slab_mgmt_size(size_t nr_objs, size_t align) +static int calculate_nr_objs(size_t slab_size, size_t buffer_size, + size_t idx_size, size_t align) { - return ALIGN(nr_objs * sizeof(unsigned int), align); + int nr_objs; + size_t freelist_size; + + /* + * Ignore padding for the initial guess. The padding + * is at most @align-1 bytes, and @buffer_size is at + * least @align. In the worst case, this result will + * be one greater than the number of objects that fit + * into the memory allocation when taking the padding + * into account. + */ + nr_objs = slab_size / (buffer_size + idx_size); + + /* + * This calculated number will be either the right + * amount, or one greater than what we want. + */ + freelist_size = slab_size - nr_objs * buffer_size; + if (freelist_size < ALIGN(nr_objs * idx_size, align)) + nr_objs--; + + return nr_objs; } /* @@ -600,28 +622,12 @@ static void cache_estimate(unsigned long gfporder, size_t buffer_size, nr_objs = slab_size / buffer_size; } else { - /* - * Ignore padding for the initial guess. The padding - * is at most @align-1 bytes, and @buffer_size is at - * least @align. In the worst case, this result will - * be one greater than the number of objects that fit - * into the memory allocation when taking the padding - * into account. - */ - nr_objs = (slab_size) / (buffer_size + sizeof(unsigned int)); - - /* - * This calculated number will be either the right - * amount, or one greater than what we want. - */ - if (slab_mgmt_size(nr_objs, align) + nr_objs*buffer_size - > slab_size) - nr_objs--; - - mgmt_size = slab_mgmt_size(nr_objs, align); + nr_objs = calculate_nr_objs(slab_size, buffer_size, + sizeof(unsigned int), align); + mgmt_size = ALIGN(nr_objs * sizeof(unsigned int), align); } *num = nr_objs; - *left_over = slab_size - nr_objs*buffer_size - mgmt_size; + *left_over = slab_size - (nr_objs * buffer_size) - mgmt_size; } #if DEBUG -- 1.7.9.5 -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply related [flat|nested] 17+ messages in thread
* Re: [REPOST PATCH 1/4] slab: factor out calculate nr objects in cache_estimate 2013-09-06 5:57 ` [REPOST PATCH 1/4] slab: factor out calculate nr objects in cache_estimate Joonsoo Kim @ 2013-09-06 15:48 ` Christoph Lameter 2013-09-09 4:32 ` Joonsoo Kim 0 siblings, 1 reply; 17+ messages in thread From: Christoph Lameter @ 2013-09-06 15:48 UTC (permalink / raw) To: Joonsoo Kim Cc: Pekka Enberg, Andrew Morton, Joonsoo Kim, David Rientjes, linux-mm, linux-kernel On Fri, 6 Sep 2013, Joonsoo Kim wrote: > } > *num = nr_objs; > - *left_over = slab_size - nr_objs*buffer_size - mgmt_size; > + *left_over = slab_size - (nr_objs * buffer_size) - mgmt_size; > } What is the point of this change? Drop it. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [REPOST PATCH 1/4] slab: factor out calculate nr objects in cache_estimate 2013-09-06 15:48 ` Christoph Lameter @ 2013-09-09 4:32 ` Joonsoo Kim 0 siblings, 0 replies; 17+ messages in thread From: Joonsoo Kim @ 2013-09-09 4:32 UTC (permalink / raw) To: Christoph Lameter Cc: Pekka Enberg, Andrew Morton, David Rientjes, linux-mm, linux-kernel On Fri, Sep 06, 2013 at 03:48:04PM +0000, Christoph Lameter wrote: > On Fri, 6 Sep 2013, Joonsoo Kim wrote: > > > } > > *num = nr_objs; > > - *left_over = slab_size - nr_objs*buffer_size - mgmt_size; > > + *left_over = slab_size - (nr_objs * buffer_size) - mgmt_size; > > } > > What is the point of this change? Drop it. Okay. I will drop it. > > > -- > To unsubscribe, send a message with 'unsubscribe linux-mm' in > the body to majordomo@kvack.org. For more info on Linux MM, > see: http://www.linux-mm.org/ . > Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 17+ messages in thread
* [REPOST PATCH 2/4] slab: introduce helper functions to get/set free object 2013-09-06 5:57 [REPOST PATCH 0/4] slab: implement byte sized indexes for the freelist of a slab Joonsoo Kim 2013-09-06 5:57 ` [REPOST PATCH 1/4] slab: factor out calculate nr objects in cache_estimate Joonsoo Kim @ 2013-09-06 5:57 ` Joonsoo Kim 2013-09-06 15:49 ` Christoph Lameter 2013-09-06 5:57 ` [REPOST PATCH 3/4] slab: introduce byte sized index for the freelist of a slab Joonsoo Kim 2013-09-06 5:57 ` [REPOST PATCH 4/4] slab: make more slab management structure off the slab Joonsoo Kim 3 siblings, 1 reply; 17+ messages in thread From: Joonsoo Kim @ 2013-09-06 5:57 UTC (permalink / raw) To: Pekka Enberg Cc: Christoph Lameter, Andrew Morton, Joonsoo Kim, David Rientjes, linux-mm, linux-kernel, Joonsoo Kim In the following patches, to get/set free objects from the freelist is changed so that simple casting doesn't work for it. Therefore, introduce helper functions. Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com> diff --git a/mm/slab.c b/mm/slab.c index 9d4bad5..a0e49bb 100644 --- a/mm/slab.c +++ b/mm/slab.c @@ -2545,9 +2545,15 @@ static struct freelist *alloc_slabmgmt(struct kmem_cache *cachep, return freelist; } -static inline unsigned int *slab_freelist(struct page *page) +static inline unsigned int get_free_obj(struct page *page, unsigned int idx) { - return (unsigned int *)(page->freelist); + return ((unsigned int *)page->freelist)[idx]; +} + +static inline void set_free_obj(struct page *page, + unsigned int idx, unsigned int val) +{ + ((unsigned int *)(page->freelist))[idx] = val; } static void cache_init_objs(struct kmem_cache *cachep, @@ -2592,7 +2598,7 @@ static void cache_init_objs(struct kmem_cache *cachep, if (cachep->ctor) cachep->ctor(objp); #endif - slab_freelist(page)[i] = i; + set_free_obj(page, i, i); } } @@ -2611,7 +2617,7 @@ static void *slab_get_obj(struct kmem_cache *cachep, struct page *page, { void *objp; - objp = index_to_obj(cachep, page, slab_freelist(page)[page->active]); + objp = index_to_obj(cachep, page, get_free_obj(page, page->active)); page->active++; #if DEBUG WARN_ON(page_to_nid(virt_to_page(objp)) != nodeid); @@ -2632,7 +2638,7 @@ static void slab_put_obj(struct kmem_cache *cachep, struct page *page, /* Verify double free bug */ for (i = page->active; i < cachep->num; i++) { - if (slab_freelist(page)[i] == objnr) { + if (get_free_obj(page, i) == objnr) { printk(KERN_ERR "slab: double free detected in cache " "'%s', objp %p\n", cachep->name, objp); BUG(); @@ -2640,7 +2646,7 @@ static void slab_put_obj(struct kmem_cache *cachep, struct page *page, } #endif page->active--; - slab_freelist(page)[page->active] = objnr; + set_free_obj(page, page->active, objnr); } /* @@ -4214,7 +4220,7 @@ static void handle_slab(unsigned long *n, struct kmem_cache *c, for (j = page->active; j < c->num; j++) { /* Skip freed item */ - if (slab_freelist(page)[j] == i) { + if (get_free_obj(page, j) == i) { active = false; break; } -- 1.7.9.5 -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply related [flat|nested] 17+ messages in thread
* Re: [REPOST PATCH 2/4] slab: introduce helper functions to get/set free object 2013-09-06 5:57 ` [REPOST PATCH 2/4] slab: introduce helper functions to get/set free object Joonsoo Kim @ 2013-09-06 15:49 ` Christoph Lameter 0 siblings, 0 replies; 17+ messages in thread From: Christoph Lameter @ 2013-09-06 15:49 UTC (permalink / raw) To: Joonsoo Kim Cc: Pekka Enberg, Andrew Morton, Joonsoo Kim, David Rientjes, linux-mm, linux-kernel On Fri, 6 Sep 2013, Joonsoo Kim wrote: > In the following patches, to get/set free objects from the freelist > is changed so that simple casting doesn't work for it. Therefore, > introduce helper functions. Acked-by: Christoph Lameter <cl@linux.com> -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 17+ messages in thread
* [REPOST PATCH 3/4] slab: introduce byte sized index for the freelist of a slab 2013-09-06 5:57 [REPOST PATCH 0/4] slab: implement byte sized indexes for the freelist of a slab Joonsoo Kim 2013-09-06 5:57 ` [REPOST PATCH 1/4] slab: factor out calculate nr objects in cache_estimate Joonsoo Kim 2013-09-06 5:57 ` [REPOST PATCH 2/4] slab: introduce helper functions to get/set free object Joonsoo Kim @ 2013-09-06 5:57 ` Joonsoo Kim 2013-09-06 15:58 ` Christoph Lameter 2013-09-06 5:57 ` [REPOST PATCH 4/4] slab: make more slab management structure off the slab Joonsoo Kim 3 siblings, 1 reply; 17+ messages in thread From: Joonsoo Kim @ 2013-09-06 5:57 UTC (permalink / raw) To: Pekka Enberg Cc: Christoph Lameter, Andrew Morton, Joonsoo Kim, David Rientjes, linux-mm, linux-kernel, Joonsoo Kim Currently, the freelist of a slab consist of unsigned int sized indexes. Most of slabs have less number of objects than 256, since restriction for page order is at most 1 in default configuration. For example, consider a slab consisting of 32 byte sized objects on two continous pages. In this case, 256 objects is possible and these number fit to byte sized indexes. 256 objects is maximum possible value in default configuration, since 32 byte is minimum object size in the SLAB. (8192 / 32 = 256). Therefore, if we use byte sized index, we can save 3 bytes for each object. This introduce one likely branch to functions used for setting/getting objects to/from the freelist, but we may get more benefits from this change. Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com> diff --git a/mm/slab.c b/mm/slab.c index a0e49bb..bd366e5 100644 --- a/mm/slab.c +++ b/mm/slab.c @@ -565,8 +565,16 @@ static inline struct array_cache *cpu_cache_get(struct kmem_cache *cachep) return cachep->array[smp_processor_id()]; } -static int calculate_nr_objs(size_t slab_size, size_t buffer_size, - size_t idx_size, size_t align) +static inline bool can_byte_index(int nr_objs) +{ + if (likely(nr_objs <= (sizeof(unsigned char) << 8))) + return true; + + return false; +} + +static int __calculate_nr_objs(size_t slab_size, size_t buffer_size, + unsigned int idx_size, size_t align) { int nr_objs; size_t freelist_size; @@ -592,6 +600,29 @@ static int calculate_nr_objs(size_t slab_size, size_t buffer_size, return nr_objs; } +static int calculate_nr_objs(size_t slab_size, size_t buffer_size, + size_t align) +{ + int nr_objs; + int byte_nr_objs; + + nr_objs = __calculate_nr_objs(slab_size, buffer_size, + sizeof(unsigned int), align); + if (!can_byte_index(nr_objs)) + return nr_objs; + + byte_nr_objs = __calculate_nr_objs(slab_size, buffer_size, + sizeof(unsigned char), align); + /* + * nr_objs can be larger when using byte index, + * so that it cannot be indexed by byte index. + */ + if (can_byte_index(byte_nr_objs)) + return byte_nr_objs; + else + return nr_objs; +} + /* * Calculate the number of objects and left-over bytes for a given buffer size. */ @@ -618,13 +649,18 @@ static void cache_estimate(unsigned long gfporder, size_t buffer_size, * correct alignment when allocated. 
*/ if (flags & CFLGS_OFF_SLAB) { - mgmt_size = 0; nr_objs = slab_size / buffer_size; + mgmt_size = 0; } else { - nr_objs = calculate_nr_objs(slab_size, buffer_size, - sizeof(unsigned int), align); - mgmt_size = ALIGN(nr_objs * sizeof(unsigned int), align); + nr_objs = calculate_nr_objs(slab_size, buffer_size, align); + if (can_byte_index(nr_objs)) { + mgmt_size = + ALIGN(nr_objs * sizeof(unsigned char), align); + } else { + mgmt_size = + ALIGN(nr_objs * sizeof(unsigned int), align); + } } *num = nr_objs; *left_over = slab_size - (nr_objs * buffer_size) - mgmt_size; @@ -2012,7 +2048,10 @@ static size_t calculate_slab_order(struct kmem_cache *cachep, * looping condition in cache_grow(). */ offslab_limit = size; - offslab_limit /= sizeof(unsigned int); + if (can_byte_index(num)) + offslab_limit /= sizeof(unsigned char); + else + offslab_limit /= sizeof(unsigned int); if (num > offslab_limit) break; @@ -2253,8 +2292,13 @@ __kmem_cache_create (struct kmem_cache *cachep, unsigned long flags) if (!cachep->num) return -E2BIG; - freelist_size = - ALIGN(cachep->num * sizeof(unsigned int), cachep->align); + if (can_byte_index(cachep->num)) { + freelist_size = ALIGN(cachep->num * sizeof(unsigned char), + cachep->align); + } else { + freelist_size = ALIGN(cachep->num * sizeof(unsigned int), + cachep->align); + } /* * If the slab has been placed off-slab, and we have enough space then @@ -2267,7 +2311,10 @@ __kmem_cache_create (struct kmem_cache *cachep, unsigned long flags) if (flags & CFLGS_OFF_SLAB) { /* really off slab. No need for manual alignment */ - freelist_size = cachep->num * sizeof(unsigned int); + if (can_byte_index(cachep->num)) + freelist_size = cachep->num * sizeof(unsigned char); + else + freelist_size = cachep->num * sizeof(unsigned int); #ifdef CONFIG_PAGE_POISONING /* If we're going to use the generic kernel_map_pages() @@ -2545,15 +2592,22 @@ static struct freelist *alloc_slabmgmt(struct kmem_cache *cachep, return freelist; } -static inline unsigned int get_free_obj(struct page *page, unsigned int idx) +static inline unsigned int get_free_obj(struct kmem_cache *cachep, + struct page *page, unsigned int idx) { - return ((unsigned int *)page->freelist)[idx]; + if (likely(can_byte_index(cachep->num))) + return ((unsigned char *)page->freelist)[idx]; + else + return ((unsigned int *)page->freelist)[idx]; } -static inline void set_free_obj(struct page *page, +static inline void set_free_obj(struct kmem_cache *cachep, struct page *page, unsigned int idx, unsigned int val) { - ((unsigned int *)(page->freelist))[idx] = val; + if (likely(can_byte_index(cachep->num))) + ((unsigned char *)(page->freelist))[idx] = (unsigned char)val; + else + ((unsigned int *)(page->freelist))[idx] = val; } static void cache_init_objs(struct kmem_cache *cachep, @@ -2598,7 +2652,7 @@ static void cache_init_objs(struct kmem_cache *cachep, if (cachep->ctor) cachep->ctor(objp); #endif - set_free_obj(page, i, i); + set_free_obj(cachep, page, i, i); } } @@ -2615,9 +2669,11 @@ static void kmem_flagcheck(struct kmem_cache *cachep, gfp_t flags) static void *slab_get_obj(struct kmem_cache *cachep, struct page *page, int nodeid) { + unsigned int index; void *objp; - objp = index_to_obj(cachep, page, get_free_obj(page, page->active)); + index = get_free_obj(cachep, page, page->active); + objp = index_to_obj(cachep, page, index); page->active++; #if DEBUG WARN_ON(page_to_nid(virt_to_page(objp)) != nodeid); @@ -2638,7 +2694,7 @@ static void slab_put_obj(struct kmem_cache *cachep, struct page *page, /* Verify double free bug 
*/ for (i = page->active; i < cachep->num; i++) { - if (get_free_obj(page, i) == objnr) { + if (get_free_obj(cachep, page, i) == objnr) { printk(KERN_ERR "slab: double free detected in cache " "'%s', objp %p\n", cachep->name, objp); BUG(); @@ -2646,7 +2702,7 @@ static void slab_put_obj(struct kmem_cache *cachep, struct page *page, } #endif page->active--; - set_free_obj(page, page->active, objnr); + set_free_obj(cachep, page, page->active, objnr); } /* @@ -4220,7 +4276,7 @@ static void handle_slab(unsigned long *n, struct kmem_cache *c, for (j = page->active; j < c->num; j++) { /* Skip freed item */ - if (get_free_obj(page, j) == i) { + if (get_free_obj(c, page, j) == i) { active = false; break; } -- 1.7.9.5 -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply related [flat|nested] 17+ messages in thread
* Re: [REPOST PATCH 3/4] slab: introduce byte sized index for the freelist of a slab 2013-09-06 5:57 ` [REPOST PATCH 3/4] slab: introduce byte sized index for the freelist of a slab Joonsoo Kim @ 2013-09-06 15:58 ` Christoph Lameter 2013-09-09 4:32 ` Joonsoo Kim 0 siblings, 1 reply; 17+ messages in thread From: Christoph Lameter @ 2013-09-06 15:58 UTC (permalink / raw) To: Joonsoo Kim Cc: Pekka Enberg, Andrew Morton, Joonsoo Kim, David Rientjes, linux-mm, linux-kernel On Fri, 6 Sep 2013, Joonsoo Kim wrote: > Currently, the freelist of a slab consist of unsigned int sized indexes. > Most of slabs have less number of objects than 256, since restriction > for page order is at most 1 in default configuration. For example, > consider a slab consisting of 32 byte sized objects on two continous > pages. In this case, 256 objects is possible and these number fit to byte > sized indexes. 256 objects is maximum possible value in default > configuration, since 32 byte is minimum object size in the SLAB. > (8192 / 32 = 256). Therefore, if we use byte sized index, we can save > 3 bytes for each object. Ok then why is the patch making slab do either byte sized or int sized indexes? Seems that you could do a clean cutover? As you said: The mininum object size is 32 bytes for slab. 32 * 256 = 8k. So we are fine unless the page size is > 8k. This is true for IA64 and powerpc only I believe. The page size can be determined at compile time and depending on that page size you could then choose a different size for the indexes. Or the alternative is to increase the minimum slab object size. A 16k page size would require a 64 byte minimum allocation. But thats no good I guess. byte sized or short int sized index support would be enough. > This introduce one likely branch to functions used for setting/getting > objects to/from the freelist, but we may get more benefits from > this change. Lets not do that. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 17+ messages in thread
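As a rough illustration of the compile-time cutover suggested in this reply, the sketch below fixes the index width once per build instead of branching at runtime in get_free_obj()/set_free_obj() (the helpers from patch 2). It is not code from the posted series: the type name freelist_idx_t, the 32-byte minimum object size, the order-1 slab bound and the other constants are assumptions made up for the example.

#include <stdint.h>
#include <assert.h>

#define EXAMPLE_PAGE_SIZE       4096    /* assumed; use the arch page size */
#define SLAB_OBJ_MIN_SIZE       32      /* assumed minimum object size */
#define SLAB_OBJ_MAX_NUM        (2 * EXAMPLE_PAGE_SIZE / SLAB_OBJ_MIN_SIZE)

#if SLAB_OBJ_MAX_NUM <= 256
typedef uint8_t freelist_idx_t;         /* byte index is always enough */
#else
typedef uint16_t freelist_idx_t;
#endif

static inline unsigned int get_free_obj(void *freelist, unsigned int idx)
{
        return ((freelist_idx_t *)freelist)[idx];
}

static inline void set_free_obj(void *freelist, unsigned int idx,
                                unsigned int val)
{
        ((freelist_idx_t *)freelist)[idx] = (freelist_idx_t)val;
}

int main(void)
{
        freelist_idx_t freelist[SLAB_OBJ_MAX_NUM];
        unsigned int i;

        for (i = 0; i < SLAB_OBJ_MAX_NUM; i++)
                set_free_obj(freelist, i, i);
        for (i = 0; i < SLAB_OBJ_MAX_NUM; i++)
                assert(get_free_obj(freelist, i) == i);
        return 0;
}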
* Re: [REPOST PATCH 3/4] slab: introduce byte sized index for the freelist of a slab 2013-09-06 15:58 ` Christoph Lameter @ 2013-09-09 4:32 ` Joonsoo Kim 2013-09-09 14:44 ` Christoph Lameter 0 siblings, 1 reply; 17+ messages in thread From: Joonsoo Kim @ 2013-09-09 4:32 UTC (permalink / raw) To: Christoph Lameter Cc: Pekka Enberg, Andrew Morton, David Rientjes, linux-mm, linux-kernel On Fri, Sep 06, 2013 at 03:58:18PM +0000, Christoph Lameter wrote: > On Fri, 6 Sep 2013, Joonsoo Kim wrote: > > > Currently, the freelist of a slab consist of unsigned int sized indexes. > > Most of slabs have less number of objects than 256, since restriction > > for page order is at most 1 in default configuration. For example, > > consider a slab consisting of 32 byte sized objects on two continous > > pages. In this case, 256 objects is possible and these number fit to byte > > sized indexes. 256 objects is maximum possible value in default > > configuration, since 32 byte is minimum object size in the SLAB. > > (8192 / 32 = 256). Therefore, if we use byte sized index, we can save > > 3 bytes for each object. > > Ok then why is the patch making slab do either byte sized or int sized > indexes? Seems that you could do a clean cutover? > > > As you said: The mininum object size is 32 bytes for slab. 32 * 256 = > 8k. So we are fine unless the page size is > 8k. This is true for IA64 and > powerpc only I believe. The page size can be determined at compile time > and depending on that page size you could then choose a different size for > the indexes. Or the alternative is to increase the minimum slab object size. > A 16k page size would require a 64 byte minimum allocation. But thats no > good I guess. byte sized or short int sized index support would be enough. Sorry for misleading commit message. 32 byte is not minimum object size, minimum *kmalloc* object size in default configuration. There are some slabs that their object size is less than 32 byte. If we have a 8 byte sized kmem_cache, it has 512 objects in 4K page. Moreover, we can configure slab_max_order in boot time so that we can't know how many object are in a certain slab in compile time. Therefore we can't decide the size of the index in compile time. I think that byte and short int sized index support would be enough, but it should be determined at runtime. > > > This introduce one likely branch to functions used for setting/getting > > objects to/from the freelist, but we may get more benefits from > > this change. > > Lets not do that. IMHO, this is as best as we can. Do you have any better idea? Thanks. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [REPOST PATCH 3/4] slab: introduce byte sized index for the freelist of a slab 2013-09-09 4:32 ` Joonsoo Kim @ 2013-09-09 14:44 ` Christoph Lameter 2013-09-10 5:43 ` Joonsoo Kim 0 siblings, 1 reply; 17+ messages in thread From: Christoph Lameter @ 2013-09-09 14:44 UTC (permalink / raw) To: Joonsoo Kim Cc: Pekka Enberg, Andrew Morton, David Rientjes, linux-mm, linux-kernel On Mon, 9 Sep 2013, Joonsoo Kim wrote: > 32 byte is not minimum object size, minimum *kmalloc* object size > in default configuration. There are some slabs that their object size is > less than 32 byte. If we have a 8 byte sized kmem_cache, it has 512 objects > in 4K page. As far as I can recall only SLUB supports 8 byte objects. SLABs mininum has always been 32 bytes. > Moreover, we can configure slab_max_order in boot time so that we can't know > how many object are in a certain slab in compile time. Therefore we can't > decide the size of the index in compile time. You can ignore the slab_max_order if necessary. > I think that byte and short int sized index support would be enough, but > it should be determined at runtime. On x86 f.e. it would add useless branching. The branches are never taken. You only need these if you do bad things to the system like requiring large contiguous allocs. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [REPOST PATCH 3/4] slab: introduce byte sized index for the freelist of a slab 2013-09-09 14:44 ` Christoph Lameter @ 2013-09-10 5:43 ` Joonsoo Kim 2013-09-10 21:25 ` Christoph Lameter 0 siblings, 1 reply; 17+ messages in thread From: Joonsoo Kim @ 2013-09-10 5:43 UTC (permalink / raw) To: Christoph Lameter Cc: Pekka Enberg, Andrew Morton, David Rientjes, linux-mm, linux-kernel On Mon, Sep 09, 2013 at 02:44:03PM +0000, Christoph Lameter wrote: > On Mon, 9 Sep 2013, Joonsoo Kim wrote: > > > 32 byte is not minimum object size, minimum *kmalloc* object size > > in default configuration. There are some slabs that their object size is > > less than 32 byte. If we have a 8 byte sized kmem_cache, it has 512 objects > > in 4K page. > > As far as I can recall only SLUB supports 8 byte objects. SLABs mininum > has always been 32 bytes. No. There are many slabs that their object size are less than 32 byte. And I can also create a 8 byte sized slab in my kernel with SLAB. js1304@js1304-P5Q-DELUXE:~/Projects/remote_git/linux$ sudo cat /proc/slabinfo | awk '{if($4 < 32) print $0}' slabinfo - version: 2.1 ecryptfs_file_cache 0 0 16 240 1 : tunables 120 60 8 : slabdata 0 0 0 jbd2_revoke_table_s 2 240 16 240 1 : tunables 120 60 8 : slabdata 1 1 0 journal_handle 0 0 24 163 1 : tunables 120 60 8 : slabdata 0 0 0 revoke_table 0 0 16 240 1 : tunables 120 60 8 : slabdata 0 0 0 scsi_data_buffer 0 0 24 163 1 : tunables 120 60 8 : slabdata 0 0 0 fsnotify_event_holder 0 0 24 163 1 : tunables 120 60 8 : slabdata 0 0 0 numa_policy 3 163 24 163 1 : tunables 120 60 8 : slabdata 1 1 0 > > > Moreover, we can configure slab_max_order in boot time so that we can't know > > how many object are in a certain slab in compile time. Therefore we can't > > decide the size of the index in compile time. > > You can ignore the slab_max_order if necessary. > > > I think that byte and short int sized index support would be enough, but > > it should be determined at runtime. > > On x86 f.e. it would add useless branching. The branches are never taken. > You only need these if you do bad things to the system like requiring > large contiguous allocs. As I said before, since there is a possibility that some runtime loaded modules use a 8 byte sized slab, we can't determine index size in compile time. Otherwise we should always use short int sized index and I think that it is worse than adding a branch. Thanks. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [REPOST PATCH 3/4] slab: introduce byte sized index for the freelist of a slab 2013-09-10 5:43 ` Joonsoo Kim @ 2013-09-10 21:25 ` Christoph Lameter 2013-09-11 1:04 ` Joonsoo Kim 0 siblings, 1 reply; 17+ messages in thread From: Christoph Lameter @ 2013-09-10 21:25 UTC (permalink / raw) To: Joonsoo Kim Cc: Pekka Enberg, Andrew Morton, David Rientjes, linux-mm, linux-kernel On Tue, 10 Sep 2013, Joonsoo Kim wrote: > On Mon, Sep 09, 2013 at 02:44:03PM +0000, Christoph Lameter wrote: > > On Mon, 9 Sep 2013, Joonsoo Kim wrote: > > > > > 32 byte is not minimum object size, minimum *kmalloc* object size > > > in default configuration. There are some slabs that their object size is > > > less than 32 byte. If we have a 8 byte sized kmem_cache, it has 512 objects > > > in 4K page. > > > > As far as I can recall only SLUB supports 8 byte objects. SLABs mininum > > has always been 32 bytes. > > No. > There are many slabs that their object size are less than 32 byte. > And I can also create a 8 byte sized slab in my kernel with SLAB. Well the minimum size for the kmalloc array is 32 bytes. These are custom slabs. KMALLOC_SHIFT_LOW is set to 5 in include/linux/slab.h. Ok so there are some slabs like that. Hmmm.. We have sizes 16 and 24 in your list. 16*256 is still 4096. So this would still work fine if we would forbid a size of 8 or increase that by default to 16. > > On x86 f.e. it would add useless branching. The branches are never taken. > > You only need these if you do bad things to the system like requiring > > large contiguous allocs. > > As I said before, since there is a possibility that some runtime loaded modules > use a 8 byte sized slab, we can't determine index size in compile time. Otherwise > we should always use short int sized index and I think that it is worse than > adding a branch. We can enforce a mininum slab size and an order limit so that it fits. And then there would be no additional branching. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 17+ messages in thread
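A compact sketch of the enforcement direction described here (assumed constants, not the posted code): cap the slab order so the object count never exceeds what a byte-wide index can address, which is the precondition for dropping the runtime branch entirely. It assumes order 0 always fits, which holds for objects of 16 bytes and up with 4 KiB pages.

#include <stdio.h>

#define EXAMPLE_PAGE_SIZE       4096    /* assumed; use the arch page size */
#define FREELIST_BYTE_MAX       256     /* counts addressable by an 8-bit index */

/* Largest page order whose object count still fits a byte-wide index. */
static unsigned int max_byte_indexable_order(unsigned int obj_size,
                                             unsigned int order_limit)
{
        unsigned int order;

        for (order = 0; order <= order_limit; order++) {
                unsigned int num = (EXAMPLE_PAGE_SIZE << order) / (obj_size + 1);

                if (num > FREELIST_BYTE_MAX)
                        break;          /* would overflow the byte index */
        }
        return order ? order - 1 : 0;
}

int main(void)
{
        printf("32-byte objects: up to order %u\n",
               max_byte_indexable_order(32, 1));        /* prints 1 */
        printf("16-byte objects: up to order %u\n",
               max_byte_indexable_order(16, 1));        /* prints 0 */
        return 0;
}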
* Re: [REPOST PATCH 3/4] slab: introduce byte sized index for the freelist of a slab 2013-09-10 21:25 ` Christoph Lameter @ 2013-09-11 1:04 ` Joonsoo Kim 2013-09-11 14:22 ` Christoph Lameter 0 siblings, 1 reply; 17+ messages in thread From: Joonsoo Kim @ 2013-09-11 1:04 UTC (permalink / raw) To: Christoph Lameter Cc: Pekka Enberg, Andrew Morton, David Rientjes, linux-mm, linux-kernel On Tue, Sep 10, 2013 at 09:25:05PM +0000, Christoph Lameter wrote: > On Tue, 10 Sep 2013, Joonsoo Kim wrote: > > > On Mon, Sep 09, 2013 at 02:44:03PM +0000, Christoph Lameter wrote: > > > On Mon, 9 Sep 2013, Joonsoo Kim wrote: > > > > > > > 32 byte is not minimum object size, minimum *kmalloc* object size > > > > in default configuration. There are some slabs that their object size is > > > > less than 32 byte. If we have a 8 byte sized kmem_cache, it has 512 objects > > > > in 4K page. > > > > > > As far as I can recall only SLUB supports 8 byte objects. SLABs mininum > > > has always been 32 bytes. > > > > No. > > There are many slabs that their object size are less than 32 byte. > > And I can also create a 8 byte sized slab in my kernel with SLAB. > > Well the minimum size for the kmalloc array is 32 bytes. These are custom > slabs. KMALLOC_SHIFT_LOW is set to 5 in include/linux/slab.h. > > Ok so there are some slabs like that. Hmmm.. We have sizes 16 and 24 in > your list. 16*256 is still 4096. So this would still work fine if we would > forbid a size of 8 or increase that by default to 16. > > > > On x86 f.e. it would add useless branching. The branches are never taken. > > > You only need these if you do bad things to the system like requiring > > > large contiguous allocs. > > > > As I said before, since there is a possibility that some runtime loaded modules > > use a 8 byte sized slab, we can't determine index size in compile time. Otherwise > > we should always use short int sized index and I think that it is worse than > > adding a branch. > > We can enforce a mininum slab size and an order limit so that it fits. And > then there would be no additional branching. > Okay. I will respin this patchset with your suggestion. Anyway, could you review my previous patchset, that is, 'overload struct slab over struct page to reduce memory usage'? I'm not sure whether your answer is ack or not. Thanks. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [REPOST PATCH 3/4] slab: introduce byte sized index for the freelist of a slab 2013-09-11 1:04 ` Joonsoo Kim @ 2013-09-11 14:22 ` Christoph Lameter 2013-09-12 6:52 ` Joonsoo Kim 0 siblings, 1 reply; 17+ messages in thread From: Christoph Lameter @ 2013-09-11 14:22 UTC (permalink / raw) To: Joonsoo Kim Cc: Pekka Enberg, Andrew Morton, David Rientjes, linux-mm, linux-kernel On Wed, 11 Sep 2013, Joonsoo Kim wrote: > Anyway, could you review my previous patchset, that is, 'overload struct slab > over struct page to reduce memory usage'? I'm not sure whether your answer is > ack or not. I scanned over it before but I was not able to see if it was correct on first glance. I will look at it again. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [REPOST PATCH 3/4] slab: introduce byte sized index for the freelist of a slab 2013-09-11 14:22 ` Christoph Lameter @ 2013-09-12 6:52 ` Joonsoo Kim 0 siblings, 0 replies; 17+ messages in thread From: Joonsoo Kim @ 2013-09-12 6:52 UTC (permalink / raw) To: Christoph Lameter Cc: Pekka Enberg, Andrew Morton, David Rientjes, linux-mm, linux-kernel On Wed, Sep 11, 2013 at 02:22:25PM +0000, Christoph Lameter wrote: > On Wed, 11 Sep 2013, Joonsoo Kim wrote: > > > Anyway, could you review my previous patchset, that is, 'overload struct slab > > over struct page to reduce memory usage'? I'm not sure whether your answer is > > ack or not. > > I scanned over it before but I was not able to see if it was correct on > first glance. I will look at it again. Sounds good! Thanks :) -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 17+ messages in thread
* [REPOST PATCH 4/4] slab: make more slab management structure off the slab 2013-09-06 5:57 [REPOST PATCH 0/4] slab: implement byte sized indexes for the freelist of a slab Joonsoo Kim ` (2 preceding siblings ...) 2013-09-06 5:57 ` [REPOST PATCH 3/4] slab: introduce byte sized index for the freelist of a slab Joonsoo Kim @ 2013-09-06 5:57 ` Joonsoo Kim 2013-09-06 15:59 ` Christoph Lameter 3 siblings, 1 reply; 17+ messages in thread From: Joonsoo Kim @ 2013-09-06 5:57 UTC (permalink / raw) To: Pekka Enberg Cc: Christoph Lameter, Andrew Morton, Joonsoo Kim, David Rientjes, linux-mm, linux-kernel, Joonsoo Kim Now, the size of the freelist for the slab management diminish, so that the on-slab management structure can waste large space if the object of the slab is large. Consider a 128 byte sized slab. If on-slab is used, 31 objects can be in the slab. The size of the freelist for this case would be 31 bytes so that 97 bytes, that is, more than 75% of object size, are wasted. In a 64 byte sized slab case, no space is wasted if we use on-slab. So set off-slab determining constraint to 128 bytes. Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com> diff --git a/mm/slab.c b/mm/slab.c index bd366e5..d01a2f0 100644 --- a/mm/slab.c +++ b/mm/slab.c @@ -2277,7 +2277,7 @@ __kmem_cache_create (struct kmem_cache *cachep, unsigned long flags) * it too early on. Always use on-slab management when * SLAB_NOLEAKTRACE to avoid recursive calls into kmemleak) */ - if ((size >= (PAGE_SIZE >> 3)) && !slab_early_init && + if ((size >= (PAGE_SIZE >> 5)) && !slab_early_init && !(flags & SLAB_NOLEAKTRACE)) /* * Size is large, assume best to place the slab management obj -- 1.7.9.5 -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply related [flat|nested] 17+ messages in thread
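The trade-off behind the new threshold can be illustrated with a small userspace sketch. Its assumptions: a 4 KiB slab, the byte-sized freelist index from patch 3, 8-byte freelist alignment, the rare one-off alignment correction ignored, and no accounting for the memory the off-slab freelist itself consumes in another cache.

#include <stdio.h>

#define SLAB_SIZE       4096
#define ALIGN8(x)       (((x) + 7u) & ~7u)

int main(void)
{
        unsigned int sizes[] = { 64, 96, 128, 256 };
        unsigned int i;

        for (i = 0; i < sizeof(sizes) / sizeof(sizes[0]); i++) {
                unsigned int size = sizes[i];
                /* on-slab: the freelist (1 byte per object) lives in the slab */
                unsigned int on = SLAB_SIZE / (size + 1);
                unsigned int on_left = SLAB_SIZE - on * size - ALIGN8(on);
                /* off-slab: the whole slab holds objects */
                unsigned int off = SLAB_SIZE / size;
                unsigned int off_left = SLAB_SIZE - off * size;

                printf("size %3u: on-slab %2u objs, %3u bytes unused; "
                       "off-slab %2u objs, %3u bytes unused\n",
                       size, on, on_left, off, off_left);
        }
        return 0;
}

In this simplified model, 64-byte objects leave nothing unused on-slab, and 96-byte objects waste less on-slab than off-slab, while from 128 bytes upward moving the freelist off the slab gains an object and removes the leftover, matching the waste described in the changelog.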
* Re: [REPOST PATCH 4/4] slab: make more slab management structure off the slab 2013-09-06 5:57 ` [REPOST PATCH 4/4] slab: make more slab management structure off the slab Joonsoo Kim @ 2013-09-06 15:59 ` Christoph Lameter 0 siblings, 0 replies; 17+ messages in thread From: Christoph Lameter @ 2013-09-06 15:59 UTC (permalink / raw) To: Joonsoo Kim Cc: Pekka Enberg, Andrew Morton, Joonsoo Kim, David Rientjes, linux-mm, linux-kernel On Fri, 6 Sep 2013, Joonsoo Kim wrote: > In a 64 byte sized slab case, no space is wasted if we use on-slab. > So set off-slab determining constraint to 128 bytes. Acked-by: Christoph Lameter <cl@linux.com> -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 17+ messages in thread