All of lore.kernel.org
 help / color / mirror / Atom feed
From: Joonsoo Kim <iamjoonsoo.kim@lge.com>
To: Pekka Enberg <penberg@kernel.org>
Cc: Christoph Lameter <cl@linux.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	David Rientjes <rientjes@google.com>,
	Wanpeng Li <liwanp@linux.vnet.ibm.com>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v3 4/5] slab: introduce byte sized index for the freelist of a slab
Date: Tue, 3 Dec 2013 11:25:39 +0900	[thread overview]
Message-ID: <20131203022539.GF31168@lge.com> (raw)
In-Reply-To: <1385974183-31423-5-git-send-email-iamjoonsoo.kim@lge.com>

On Mon, Dec 02, 2013 at 05:49:42PM +0900, Joonsoo Kim wrote:
> Currently, the freelist of a slab consist of unsigned int sized indexes.
> Since most of slabs have less number of objects than 256, large sized
> indexes is needless. For example, consider the minimum kmalloc slab. It's
> object size is 32 byte and it would consist of one page, so 256 indexes
> through byte sized index are enough to contain all possible indexes.
> 
> There can be some slabs whose object size is 8 byte. We cannot handle
> this case with byte sized index, so we need to restrict minimum
> object size. Since these slabs are not major, wasted memory from these
> slabs would be negligible.
> 
> Some architectures' page size isn't 4096 bytes and rather larger than
> 4096 bytes (One example is 64KB page size on PPC or IA64) so that
> byte sized index doesn't fit to them. In this case, we will use
> two bytes sized index.
> 
> Below is some number for this patch.
> 
> * Before *
> kmalloc-512          525    640    512    8    1 : tunables   54   27    0 : slabdata     80     80      0
> kmalloc-256          210    210    256   15    1 : tunables  120   60    0 : slabdata     14     14      0
> kmalloc-192         1016   1040    192   20    1 : tunables  120   60    0 : slabdata     52     52      0
> kmalloc-96           560    620    128   31    1 : tunables  120   60    0 : slabdata     20     20      0
> kmalloc-64          2148   2280     64   60    1 : tunables  120   60    0 : slabdata     38     38      0
> kmalloc-128          647    682    128   31    1 : tunables  120   60    0 : slabdata     22     22      0
> kmalloc-32         11360  11413     32  113    1 : tunables  120   60    0 : slabdata    101    101      0
> kmem_cache           197    200    192   20    1 : tunables  120   60    0 : slabdata     10     10      0
> 
> * After *
> kmalloc-512          521    648    512    8    1 : tunables   54   27    0 : slabdata     81     81      0
> kmalloc-256          208    208    256   16    1 : tunables  120   60    0 : slabdata     13     13      0
> kmalloc-192         1029   1029    192   21    1 : tunables  120   60    0 : slabdata     49     49      0
> kmalloc-96           529    589    128   31    1 : tunables  120   60    0 : slabdata     19     19      0
> kmalloc-64          2142   2142     64   63    1 : tunables  120   60    0 : slabdata     34     34      0
> kmalloc-128          660    682    128   31    1 : tunables  120   60    0 : slabdata     22     22      0
> kmalloc-32         11716  11780     32  124    1 : tunables  120   60    0 : slabdata     95     95      0
> kmem_cache           197    210    192   21    1 : tunables  120   60    0 : slabdata     10     10      0
> 
> kmem_caches consisting of objects less than or equal to 256 byte have
> one or more objects than before. In the case of kmalloc-32, we have 11 more
> objects, so 352 bytes (11 * 32) are saved and this is roughly 9% saving of
> memory. Of couse, this percentage decreases as the number of objects
> in a slab decreases.
> 
> Here are the performance results on my 4 cpus machine.
> 
> * Before *
> 
>  Performance counter stats for 'perf bench sched messaging -g 50 -l 1000' (10 runs):
> 
>        229,945,138 cache-misses                                                  ( +-  0.23% )
> 
>       11.627897174 seconds time elapsed                                          ( +-  0.14% )
> 
> * After *
> 
>  Performance counter stats for 'perf bench sched messaging -g 50 -l 1000' (10 runs):
> 
>        218,640,472 cache-misses                                                  ( +-  0.42% )
> 
>       11.504999837 seconds time elapsed                                          ( +-  0.21% )
> 
> cache-misses are reduced by this patchset, roughly 5%.
> And elapsed times are improved by 1%.
> 
> Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> 

Hello, Christoph.

Can I get your ACK for this patch?

Thanks.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

WARNING: multiple messages have this Message-ID (diff)
From: Joonsoo Kim <iamjoonsoo.kim@lge.com>
To: Pekka Enberg <penberg@kernel.org>
Cc: Christoph Lameter <cl@linux.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	David Rientjes <rientjes@google.com>,
	Wanpeng Li <liwanp@linux.vnet.ibm.com>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v3 4/5] slab: introduce byte sized index for the freelist of a slab
Date: Tue, 3 Dec 2013 11:25:39 +0900	[thread overview]
Message-ID: <20131203022539.GF31168@lge.com> (raw)
In-Reply-To: <1385974183-31423-5-git-send-email-iamjoonsoo.kim@lge.com>

On Mon, Dec 02, 2013 at 05:49:42PM +0900, Joonsoo Kim wrote:
> Currently, the freelist of a slab consist of unsigned int sized indexes.
> Since most of slabs have less number of objects than 256, large sized
> indexes is needless. For example, consider the minimum kmalloc slab. It's
> object size is 32 byte and it would consist of one page, so 256 indexes
> through byte sized index are enough to contain all possible indexes.
> 
> There can be some slabs whose object size is 8 byte. We cannot handle
> this case with byte sized index, so we need to restrict minimum
> object size. Since these slabs are not major, wasted memory from these
> slabs would be negligible.
> 
> Some architectures' page size isn't 4096 bytes and rather larger than
> 4096 bytes (One example is 64KB page size on PPC or IA64) so that
> byte sized index doesn't fit to them. In this case, we will use
> two bytes sized index.
> 
> Below is some number for this patch.
> 
> * Before *
> kmalloc-512          525    640    512    8    1 : tunables   54   27    0 : slabdata     80     80      0
> kmalloc-256          210    210    256   15    1 : tunables  120   60    0 : slabdata     14     14      0
> kmalloc-192         1016   1040    192   20    1 : tunables  120   60    0 : slabdata     52     52      0
> kmalloc-96           560    620    128   31    1 : tunables  120   60    0 : slabdata     20     20      0
> kmalloc-64          2148   2280     64   60    1 : tunables  120   60    0 : slabdata     38     38      0
> kmalloc-128          647    682    128   31    1 : tunables  120   60    0 : slabdata     22     22      0
> kmalloc-32         11360  11413     32  113    1 : tunables  120   60    0 : slabdata    101    101      0
> kmem_cache           197    200    192   20    1 : tunables  120   60    0 : slabdata     10     10      0
> 
> * After *
> kmalloc-512          521    648    512    8    1 : tunables   54   27    0 : slabdata     81     81      0
> kmalloc-256          208    208    256   16    1 : tunables  120   60    0 : slabdata     13     13      0
> kmalloc-192         1029   1029    192   21    1 : tunables  120   60    0 : slabdata     49     49      0
> kmalloc-96           529    589    128   31    1 : tunables  120   60    0 : slabdata     19     19      0
> kmalloc-64          2142   2142     64   63    1 : tunables  120   60    0 : slabdata     34     34      0
> kmalloc-128          660    682    128   31    1 : tunables  120   60    0 : slabdata     22     22      0
> kmalloc-32         11716  11780     32  124    1 : tunables  120   60    0 : slabdata     95     95      0
> kmem_cache           197    210    192   21    1 : tunables  120   60    0 : slabdata     10     10      0
> 
> kmem_caches consisting of objects less than or equal to 256 byte have
> one or more objects than before. In the case of kmalloc-32, we have 11 more
> objects, so 352 bytes (11 * 32) are saved and this is roughly 9% saving of
> memory. Of couse, this percentage decreases as the number of objects
> in a slab decreases.
> 
> Here are the performance results on my 4 cpus machine.
> 
> * Before *
> 
>  Performance counter stats for 'perf bench sched messaging -g 50 -l 1000' (10 runs):
> 
>        229,945,138 cache-misses                                                  ( +-  0.23% )
> 
>       11.627897174 seconds time elapsed                                          ( +-  0.14% )
> 
> * After *
> 
>  Performance counter stats for 'perf bench sched messaging -g 50 -l 1000' (10 runs):
> 
>        218,640,472 cache-misses                                                  ( +-  0.42% )
> 
>       11.504999837 seconds time elapsed                                          ( +-  0.21% )
> 
> cache-misses are reduced by this patchset, roughly 5%.
> And elapsed times are improved by 1%.
> 
> Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> 

Hello, Christoph.

Can I get your ACK for this patch?

Thanks.

  reply	other threads:[~2013-12-03  2:23 UTC|newest]

Thread overview: 36+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-12-02  8:49 [PATCH v3 0/5] slab: implement byte sized indexes for the freelist of a slab Joonsoo Kim
2013-12-02  8:49 ` Joonsoo Kim
2013-12-02  8:49 ` [PATCH v3 1/5] slab: factor out calculate nr objects in cache_estimate Joonsoo Kim
2013-12-02  8:49   ` Joonsoo Kim
2014-01-15  4:54   ` David Rientjes
2014-01-15  4:54     ` David Rientjes
2013-12-02  8:49 ` [PATCH v3 2/5] slab: introduce helper functions to get/set free object Joonsoo Kim
2013-12-02  8:49   ` Joonsoo Kim
2014-01-15  4:57   ` David Rientjes
2014-01-15  4:57     ` David Rientjes
2013-12-02  8:49 ` [PATCH v3 3/5] slab: restrict the number of objects in a slab Joonsoo Kim
2013-12-02  8:49   ` Joonsoo Kim
2013-12-02 19:45   ` Christoph Lameter
2013-12-02 19:45     ` Christoph Lameter
2014-01-15  5:05   ` David Rientjes
2014-01-15  5:05     ` David Rientjes
2013-12-02  8:49 ` [PATCH v3 4/5] slab: introduce byte sized index for the freelist of " Joonsoo Kim
2013-12-02  8:49   ` Joonsoo Kim
2013-12-03  2:25   ` Joonsoo Kim [this message]
2013-12-03  2:25     ` Joonsoo Kim
2013-12-03 15:24     ` Christoph Lameter
2013-12-03 15:24       ` Christoph Lameter
2014-01-15  5:08   ` David Rientjes
2014-01-15  5:08     ` David Rientjes
2013-12-02  8:49 ` [PATCH v3 5/5] slab: make more slab management structure off the slab Joonsoo Kim
2013-12-02  8:49   ` Joonsoo Kim
2013-12-02 14:58   ` Christoph Lameter
2013-12-02 14:58     ` Christoph Lameter
2013-12-03  2:13     ` Joonsoo Kim
2013-12-03  2:13       ` Joonsoo Kim
2013-12-13  7:03       ` Joonsoo Kim
2013-12-13  7:03         ` Joonsoo Kim
2014-02-08 10:17         ` Pekka Enberg
2014-02-08 10:17           ` Pekka Enberg
2014-01-15  5:09   ` David Rientjes
2014-01-15  5:09     ` David Rientjes

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20131203022539.GF31168@lge.com \
    --to=iamjoonsoo.kim@lge.com \
    --cc=akpm@linux-foundation.org \
    --cc=cl@linux.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=liwanp@linux.vnet.ibm.com \
    --cc=penberg@kernel.org \
    --cc=rientjes@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.