From: js1304@gmail.com
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Christoph Lameter <cl@linux.com>,
Pekka Enberg <penberg@kernel.org>,
David Rientjes <rientjes@google.com>,
Jesper Dangaard Brouer <brouer@redhat.com>,
Vlastimil Babka <vbabka@suse.cz>,
linux-mm@kvack.org, linux-kernel@vger.kernel.org,
Joonsoo Kim <iamjoonsoo.kim@lge.com>
Subject: [PATCH v2 09/17] mm/slab: put the freelist at the end of slab page
Date: Fri, 26 Feb 2016 15:01:16 +0900
Message-ID: <1456466484-3442-10-git-send-email-iamjoonsoo.kim@lge.com>
In-Reply-To: <1456466484-3442-1-git-send-email-iamjoonsoo.kim@lge.com>
From: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Currently, the freelist is at the front of the slab page. This requires
extra space to meet the object alignment requirement. If we put the
freelist at the end of the slab page, objects can start at the page
boundary and will be at the correct alignment. This is possible because
the freelist itself has no alignment constraint.
This gives us two benefits. It removes the extra memory space used for
freelist alignment and removes the complex calculation at the cache
initialization step. I can't think of any notable drawback here.
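
To illustrate the two layouts, here is a minimal sketch. It is not kernel
code: the helper names (align_up, objs_offset_front, freelist_offset_end)
are hypothetical, freelist_idx_t is assumed to be one byte, and cache
colouring is ignored.

```c
#include <stddef.h>

/* Assumption: one-byte freelist index, as used with small slabs. */
typedef unsigned char freelist_idx_t;

/* Round x up to a multiple of a; a must be a power of two. */
static size_t align_up(size_t x, size_t a)
{
	return (x + a - 1) & ~(a - 1);
}

/*
 * Front layout: [freelist][padding][objects].
 * The first object starts after the freelist, rounded up to @align.
 */
static size_t objs_offset_front(unsigned int num, size_t align)
{
	return align_up(num * sizeof(freelist_idx_t), align);
}

/*
 * End layout: [objects][unused][freelist].
 * Objects start at offset 0; the freelist takes the last bytes of the slab.
 */
static size_t freelist_offset_end(size_t slab_size, unsigned int num)
{
	return slab_size - num * sizeof(freelist_idx_t);
}
```

For example, with a 4096-byte slab, 10 objects and a 64-byte alignment, the
front layout places the first object at offset 64 (10 bytes of freelist plus
54 bytes of padding), while the end layout places the first object at
offset 0 and the freelist at offset 4086.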
I mentioned that this would reduce extra memory space, but this benefit
is rather theoretical because it applies to very few cases. The following
are example cache types that can benefit from this change.
size  align  num  before  after
  32      8  124    4100   4092
  64      8   63    4103   4095
  88      8   46    4102   4094
 272      8   15    4103   4095
 408      8   10    4098   4090
  32     16  124    4108   4092
  64     16   63    4111   4095
  32     32  124    4124   4092
  64     32   63    4127   4095
  96     32   42    4106   4074
"before" is the total size of the objects plus the aligned freelist before
applying this patch, and "after" is the result with this patch applied.
Since "before" is more than 4096, the number of objects has to decrease
and memory is wasted.
Anyway, this patch removes a complex calculation, so it looks beneficial to me.
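
The "after" column can be reproduced with the simplified on-slab estimate
that this patch introduces. The following is a minimal sketch, assuming a
4096-byte slab and a one-byte freelist_idx_t; struct slab_estimate and
estimate_on_slab() are hypothetical names used only for illustration, and
buffer_size is assumed to be already rounded up to the object alignment.

```c
#include <stddef.h>

/* Assumption: one-byte freelist index. */
typedef unsigned char freelist_idx_t;

/* Hypothetical result type, for illustration only. */
struct slab_estimate {
	unsigned int num;	/* objects that fit in the slab */
	size_t used;		/* bytes taken by objects plus freelist */
	size_t left_over;	/* bytes left unused */
};

/*
 * Each object costs buffer_size bytes plus one freelist entry.
 * With the freelist at the end of the slab, no alignment padding
 * has to be accounted for.
 */
static struct slab_estimate estimate_on_slab(size_t slab_size,
					     size_t buffer_size)
{
	size_t per_obj = buffer_size + sizeof(freelist_idx_t);
	struct slab_estimate e;

	e.num = (unsigned int)(slab_size / per_obj);
	e.used = e.num * per_obj;
	e.left_over = slab_size % per_obj;
	return e;
}
```

For size 32 this gives num = 124 and 4092 bytes used; for size 96 it gives
num = 42 and 4074 bytes used, matching the "after" column above.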
v2: fix kerneldoc by Andrew
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Acked-by: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
mm/slab.c | 90 ++++++++++++++++-----------------------------------------------
1 file changed, 22 insertions(+), 68 deletions(-)
diff --git a/mm/slab.c b/mm/slab.c
index 02be9d9..b3d91b0 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -456,55 +456,12 @@ static inline struct array_cache *cpu_cache_get(struct kmem_cache *cachep)
return this_cpu_ptr(cachep->cpu_cache);
}
-static size_t calculate_freelist_size(int nr_objs, size_t align)
-{
- size_t freelist_size;
-
- freelist_size = nr_objs * sizeof(freelist_idx_t);
- if (align)
- freelist_size = ALIGN(freelist_size, align);
-
- return freelist_size;
-}
-
-static int calculate_nr_objs(size_t slab_size, size_t buffer_size,
- size_t idx_size, size_t align)
-{
- int nr_objs;
- size_t remained_size;
- size_t freelist_size;
-
- /*
- * Ignore padding for the initial guess. The padding
- * is at most @align-1 bytes, and @buffer_size is at
- * least @align. In the worst case, this result will
- * be one greater than the number of objects that fit
- * into the memory allocation when taking the padding
- * into account.
- */
- nr_objs = slab_size / (buffer_size + idx_size);
-
- /*
- * This calculated number will be either the right
- * amount, or one greater than what we want.
- */
- remained_size = slab_size - nr_objs * buffer_size;
- freelist_size = calculate_freelist_size(nr_objs, align);
- if (remained_size < freelist_size)
- nr_objs--;
-
- return nr_objs;
-}
-
/*
* Calculate the number of objects and left-over bytes for a given buffer size.
*/
static void cache_estimate(unsigned long gfporder, size_t buffer_size,
- size_t align, int flags, size_t *left_over,
- unsigned int *num)
+ unsigned long flags, size_t *left_over, unsigned int *num)
{
- int nr_objs;
- size_t mgmt_size;
size_t slab_size = PAGE_SIZE << gfporder;
/*
@@ -512,9 +469,12 @@ static void cache_estimate(unsigned long gfporder, size_t buffer_size,
* on it. For the latter case, the memory allocated for a
* slab is used for:
*
- * - One freelist_idx_t for each object
- * - Padding to respect alignment of @align
* - @buffer_size bytes for each object
+ * - One freelist_idx_t for each object
+ *
+ * We don't need to consider alignment of freelist because
+ * freelist will be at the end of slab page. The objects will be
+ * at the correct alignment.
*
* If the slab management structure is off the slab, then the
* alignment will already be calculated into the size. Because
@@ -522,16 +482,13 @@ static void cache_estimate(unsigned long gfporder, size_t buffer_size,
* correct alignment when allocated.
*/
if (flags & CFLGS_OFF_SLAB) {
- mgmt_size = 0;
- nr_objs = slab_size / buffer_size;
-
+ *num = slab_size / buffer_size;
+ *left_over = slab_size % buffer_size;
} else {
- nr_objs = calculate_nr_objs(slab_size, buffer_size,
- sizeof(freelist_idx_t), align);
- mgmt_size = calculate_freelist_size(nr_objs, align);
+ *num = slab_size / (buffer_size + sizeof(freelist_idx_t));
+ *left_over = slab_size %
+ (buffer_size + sizeof(freelist_idx_t));
}
- *num = nr_objs;
- *left_over = slab_size - nr_objs*buffer_size - mgmt_size;
}
#if DEBUG
@@ -1911,7 +1868,6 @@ static void slabs_destroy(struct kmem_cache *cachep, struct list_head *list)
* calculate_slab_order - calculate size (page order) of slabs
* @cachep: pointer to the cache that is being created
* @size: size of objects to be created in this cache.
- * @align: required alignment for the objects.
* @flags: slab allocation flags
*
* Also calculates the number of objects per slab.
@@ -1921,7 +1877,7 @@ static void slabs_destroy(struct kmem_cache *cachep, struct list_head *list)
* towards high-order requests, this should be changed.
*/
static size_t calculate_slab_order(struct kmem_cache *cachep,
- size_t size, size_t align, unsigned long flags)
+ size_t size, unsigned long flags)
{
unsigned long offslab_limit;
size_t left_over = 0;
@@ -1931,7 +1887,7 @@ static size_t calculate_slab_order(struct kmem_cache *cachep,
unsigned int num;
size_t remainder;
- cache_estimate(gfporder, size, align, flags, &remainder, &num);
+ cache_estimate(gfporder, size, flags, &remainder, &num);
if (!num)
continue;
@@ -2207,12 +2163,12 @@ __kmem_cache_create (struct kmem_cache *cachep, unsigned long flags)
if (FREELIST_BYTE_INDEX && size < SLAB_OBJ_MIN_SIZE)
size = ALIGN(SLAB_OBJ_MIN_SIZE, cachep->align);
- left_over = calculate_slab_order(cachep, size, cachep->align, flags);
+ left_over = calculate_slab_order(cachep, size, flags);
if (!cachep->num)
return -E2BIG;
- freelist_size = calculate_freelist_size(cachep->num, cachep->align);
+ freelist_size = cachep->num * sizeof(freelist_idx_t);
/*
* If the slab has been placed off-slab, and we have enough space then
@@ -2223,11 +2179,6 @@ __kmem_cache_create (struct kmem_cache *cachep, unsigned long flags)
left_over -= freelist_size;
}
- if (flags & CFLGS_OFF_SLAB) {
- /* really off slab. No need for manual alignment */
- freelist_size = calculate_freelist_size(cachep->num, 0);
- }
-
cachep->colour_off = cache_line_size();
/* Offset must be a multiple of the alignment. */
if (cachep->colour_off < cachep->align)
@@ -2443,6 +2394,9 @@ static void *alloc_slabmgmt(struct kmem_cache *cachep,
void *freelist;
void *addr = page_address(page);
+ page->s_mem = addr + colour_off;
+ page->active = 0;
+
if (OFF_SLAB(cachep)) {
/* Slab management obj is off-slab. */
freelist = kmem_cache_alloc_node(cachep->freelist_cache,
@@ -2450,11 +2404,11 @@ static void *alloc_slabmgmt(struct kmem_cache *cachep,
if (!freelist)
return NULL;
} else {
- freelist = addr + colour_off;
- colour_off += cachep->freelist_size;
+ /* We will use last bytes at the slab for freelist */
+ freelist = addr + (PAGE_SIZE << cachep->gfporder) -
+ cachep->freelist_size;
}
- page->active = 0;
- page->s_mem = addr + colour_off;
+
return freelist;
}
--
1.9.1