public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [ANNOUNCE] GIT 1.0.0
@ 2005-12-21  8:00 Junio C Hamano
  2005-12-21  9:11 ` [POLL] SLAB : Are the 32 and 192 bytes caches really usefull on x86_64 machines ? Eric Dumazet
  0 siblings, 1 reply; 36+ messages in thread
From: Junio C Hamano @ 2005-12-21  8:00 UTC (permalink / raw)
  To: git, linux-kernel

GIT 1.0.0 is found at the usual places:

	Tarball	http://www.kernel.org/pub/software/scm/git/
	RPM	http://www.kernel.org/pub/software/scm/git/RPMS/
	Debian	http://www.kernel.org/pub/software/scm/git/debian/
	GIT	git://git.kernel.org/pub/scm/git/git.git/

The name "1.0.0" ought to mean a significant milestone, but
actually it does not.  The pre-1.0 versions have been in production
use by the kernel folks for quite some time, and the changes since
1.0rc are pretty small, consisting primarily of documentation
updates, clone/fetch enhancements and miscellaneous bugfixes.

Thank you all who gave patches, comments and time.

Happy hacking, and a little early ho-ho-ho.



^ permalink raw reply	[flat|nested] 36+ messages in thread

* [POLL] SLAB : Are the 32 and 192 bytes caches really usefull on x86_64 machines ?
  2005-12-21  8:00 [ANNOUNCE] GIT 1.0.0 Junio C Hamano
@ 2005-12-21  9:11 ` Eric Dumazet
  2005-12-21  9:22   ` David S. Miller
                     ` (3 more replies)
  0 siblings, 4 replies; 36+ messages in thread
From: Eric Dumazet @ 2005-12-21  9:11 UTC (permalink / raw)
  To: linux-kernel; +Cc: Andi Kleen

I wonder if the 32 and 192 byte caches are worth declaring in
include/linux/kmalloc_sizes.h, at least on x86_64.

(x86_64 : PAGE_SIZE = 4096, L1_CACHE_BYTES = 64)

On my machines, I can say that the 32 and 192 byte sizes could be dropped
in favor of spending fewer CPU cycles in __find_general_cachep().
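For context, the lookup in question is a linear scan of the compiled-in size table. A user-space sketch of that scan (the size list below is an illustrative subset assumed for this sketch, not the exact kernel table, and find_general_size is a hypothetical stand-in for __find_general_cachep):

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative subset of the general-cache size classes declared via
 * include/linux/kmalloc_sizes.h (assumed list for this sketch; the
 * real table is generated by CACHE() macros). */
static const size_t cache_sizes[] = {
	32, 64, 96, 128, 192, 256, 512, 1024, 2048,
	4096, 8192, 16384, 32768, 65536, 131072
};

/* Sketch of what __find_general_cachep() does: scan for the first
 * class large enough for the request.  Every class kept in the table
 * costs one extra compare for all larger allocations, which is the
 * cycle cost this poll is asking about. */
static size_t find_general_size(size_t size)
{
	size_t i;

	for (i = 0; i < sizeof(cache_sizes) / sizeof(cache_sizes[0]); i++)
		if (size <= cache_sizes[i])
			return cache_sizes[i];
	return 0;	/* too large for the general caches */
}
```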

Could some of you post the result of the following command on your machines:

# grep "size-" /proc/slabinfo |grep -v DMA|cut -c1-40

size-131072            0      0 131072
size-65536             0      0  65536
size-32768             2      2  32768
size-16384             0      0  16384
size-8192             13     13   8192
size-4096            161    161   4096
size-2048          40564  42976   2048
size-1024            681    800   1024
size-512           19792  37168    512
size-256              81    105    256
size-192            1218   1280    192
size-64            31278  86907     64
size-128            5457  10380    128
size-32              594    784     32

Thank you

PS: I have no idea why the last lines (size-192, 64, 128, 32) are not ordered...

Eric


* Re: [POLL] SLAB : Are the 32 and 192 bytes caches really usefull on x86_64 machines ?
  2005-12-21  9:11 ` [POLL] SLAB : Are the 32 and 192 bytes caches really usefull on x86_64 machines ? Eric Dumazet
@ 2005-12-21  9:22   ` David S. Miller
  2005-12-21 10:03     ` Jan-Benedict Glaw
  2005-12-21  9:46   ` Alok kataria
                     ` (2 subsequent siblings)
  3 siblings, 1 reply; 36+ messages in thread
From: David S. Miller @ 2005-12-21  9:22 UTC (permalink / raw)
  To: dada1; +Cc: linux-kernel, ak

From: Eric Dumazet <dada1@cosmosbay.com>
Date: Wed, 21 Dec 2005 10:11:51 +0100

> Could some of you post the result of the following command on your machines :

sparc64, PAGE_SIZE=8192, L1_CACHE_BYTES=32

size-131072            0      0 131072  
size-65536            13     13  65536  
size-32768             2      2  32768  
size-16384             2      2  16384  
size-8192             67     67   8192  
size-4096             75     76   4096  
size-2048            303    308   2048  
size-1024            176    176   1024  
size-512             251    255    512  
size-256             217    217    256  
size-192            1230   1230    192  
size-128             106    122    128  
size-96             1098   1134     96  
size-64            29387  30226     64  


* Re: [POLL] SLAB : Are the 32 and 192 bytes caches really usefull on x86_64 machines ?
  2005-12-21  9:11 ` [POLL] SLAB : Are the 32 and 192 bytes caches really usefull on x86_64 machines ? Eric Dumazet
  2005-12-21  9:22   ` David S. Miller
@ 2005-12-21  9:46   ` Alok kataria
  2005-12-21 12:44   ` Ed Tomlinson
  2005-12-28  8:32   ` Denis Vlasenko
  3 siblings, 0 replies; 36+ messages in thread
From: Alok kataria @ 2005-12-21  9:46 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: linux-kernel, Andi Kleen

On 12/21/05, Eric Dumazet <dada1@cosmosbay.com> wrote:
> I wonder if the 32 and 192 bytes caches are worth to be declared in
> include/linux/kmalloc_sizes.h, at least on x86_64
>
> (x86_64 : PAGE_SIZE = 4096, L1_CACHE_BYTES = 64)
>
> On my machines, I can say that the 32 and 192 sizes could be avoided in favor
> in spending less cpu cycles in __find_general_cachep()
>
> Could some of you post the result of the following command on your machines :
>
> # grep "size-" /proc/slabinfo |grep -v DMA|cut -c1-40
>
> size-131072            0      0 131072
> size-65536             0      0  65536
> size-32768             2      2  32768
> size-16384             0      0  16384
> size-8192             13     13   8192
> size-4096            161    161   4096
> size-2048          40564  42976   2048
> size-1024            681    800   1024
> size-512           19792  37168    512
> size-256              81    105    256
> size-192            1218   1280    192
> size-64            31278  86907     64
> size-128            5457  10380    128
> size-32              594    784     32
>
> Thank you
>
> PS : I have no idea why the last lines (size-192, 64, 128, 32) are not ordered...

The size-32 and size-128 caches are created before any other cache, as
the array_caches (arraycache_init) and the kmem_list3 structures come
from these caches.  Thus these caches are added to the cache_chain
before the others, and s_show() just walks this chain and prints info
for each cache.

Before l3 was converted into a pointer (per-node slabs) we could
initialize the caches in order, as we knew that the arraycache_init
would always fit in the first cache.
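The creation-order effect described above can be mimicked with a toy chain (hypothetical names; the kernel's cache_chain is a struct list_head, and kmem_cache_create() prepends to it):

```c
#include <assert.h>
#include <string.h>

/* Toy model of the slab cache_chain: each new cache is linked at the
 * head of the chain (like list_add()), and the /proc walker starts at
 * the head, so the bootstrap caches created first come out last. */
struct toy_cache {
	const char *name;
	struct toy_cache *next;
};

static struct toy_cache pool[16];
static struct toy_cache *cache_chain;
static int ncaches;

static void toy_cache_create(const char *name)
{
	pool[ncaches].name = name;
	pool[ncaches].next = cache_chain;	/* prepend, like list_add() */
	cache_chain = &pool[ncaches++];
}
```

Creating "size-32" first and larger caches afterwards leaves "size-32" at the tail of the chain, matching the out-of-order tail of the slabinfo output.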

Thanks & Regards,
Alok
>
> Eric


* Re: [POLL] SLAB : Are the 32 and 192 bytes caches really usefull on x86_64 machines ?
  2005-12-21  9:22   ` David S. Miller
@ 2005-12-21 10:03     ` Jan-Benedict Glaw
  0 siblings, 0 replies; 36+ messages in thread
From: Jan-Benedict Glaw @ 2005-12-21 10:03 UTC (permalink / raw)
  To: David S. Miller; +Cc: dada1, linux-kernel, ak

[-- Attachment #1: Type: text/plain, Size: 1370 bytes --]

On Wed, 2005-12-21 01:22:12 -0800, David S. Miller <davem@davemloft.net> wrote:
> From: Eric Dumazet <dada1@cosmosbay.com>
> Date: Wed, 21 Dec 2005 10:11:51 +0100
> 
> > Could some of you post the result of the following command on your machines :

VAX KA650 (simulated), 4k pages (the hardware page size is 512 bytes,
though), L1_CACHE_BYTES=32

# grep "size-" /proc/slabinfo |grep -v DMA|cut -c1-40
size-131072            0      0 131072  
size-65536             0      0  65536  
size-32768             0      0  32768  
size-16384             0      0  16384  
size-8192              0      0   8192  
size-4096             21     21   4096  
size-2048             39     42   2060  
size-1024             18     21   1036  
size-512              70     70    524  
size-256               5     14    268  
size-192             722    722    204  
size-128             145    168    140  
size-96              382    396    108  
size-32             1040   1092     44  
size-64              338    350     76  

Regards, JBG

-- 
Jan-Benedict Glaw       jbglaw@lug-owl.de    . +49-172-7608481             _ O _
"Eine Freie Meinung in  einem Freien Kopf    | Gegen Zensur | Gegen Krieg  _ _ O
 für einen Freien Staat voll Freier Bürger"  | im Internet! |   im Irak!   O O O
ret = do_actions((curr | FREE_SPEECH) & ~(NEW_COPYRIGHT_LAW | DRM | TCPA));

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 189 bytes --]


* Re: [POLL] SLAB : Are the 32 and 192 bytes caches really usefull on x86_64 machines ?
  2005-12-21  9:11 ` [POLL] SLAB : Are the 32 and 192 bytes caches really usefull on x86_64 machines ? Eric Dumazet
  2005-12-21  9:22   ` David S. Miller
  2005-12-21  9:46   ` Alok kataria
@ 2005-12-21 12:44   ` Ed Tomlinson
  2005-12-21 13:20     ` Folkert van Heusden
  2005-12-28  8:32   ` Denis Vlasenko
  3 siblings, 1 reply; 36+ messages in thread
From: Ed Tomlinson @ 2005-12-21 12:44 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: linux-kernel, Andi Kleen

On Wednesday 21 December 2005 04:11, Eric Dumazet wrote:
> (x86_64 : PAGE_SIZE = 4096, L1_CACHE_BYTES = 64)
> 
> On my machines, I can say that the 32 and 192 sizes could be avoided in favor 
> in spending less cpu cycles in __find_general_cachep()
> 
> Could some of you post the result of the following command on your machines :
> 
> # grep "size-" /proc/slabinfo |grep -v DMA|cut -c1-40
size-131072            0      0 131072
size-65536             3      3  65536
size-32768             0      0  32768
size-16384             3      3  16384
size-8192             28     28   8192
size-4096            184    184   4096
size-2048            272    272   2048
size-1024            300    300   1024
size-512             275    376    512
size-256             717    720    256
size-192            1120   1220    192
size-64             7720   8568     64
size-128           45019  65830    128
size-32             1627   3333     32

amd64 up 

Ed Tomlinson


* Re: [POLL] SLAB : Are the 32 and 192 bytes caches really usefull on x86_64 machines ?
  2005-12-21 12:44   ` Ed Tomlinson
@ 2005-12-21 13:20     ` Folkert van Heusden
  2005-12-21 13:38       ` Eric Dumazet
  0 siblings, 1 reply; 36+ messages in thread
From: Folkert van Heusden @ 2005-12-21 13:20 UTC (permalink / raw)
  To: Ed Tomlinson; +Cc: Eric Dumazet, linux-kernel, Andi Kleen


> > (x86_64 : PAGE_SIZE = 4096, L1_CACHE_BYTES = 64)
> > On my machines, I can say that the 32 and 192 sizes could be avoided in favor 
> > in spending less cpu cycles in __find_general_cachep()
> > Could some of you post the result of the following command on your machines :
> > # grep "size-" /proc/slabinfo |grep -v DMA|cut -c1-40
> size-131072            0      0 131072
> size-65536             3      3  65536
> size-32768             0      0  32768
> size-16384             3      3  16384
> size-8192             28     28   8192
> size-4096            184    184   4096
> size-2048            272    272   2048
> size-1024            300    300   1024
> size-512             275    376    512
> size-256             717    720    256
> size-192            1120   1220    192
> size-64             7720   8568     64
> size-128           45019  65830    128
> size-32             1627   3333     32

size-131072            0      0 131072
size-65536             0      0  65536
size-32768            20     20  32768
size-16384             8      9  16384
size-8192             37     38   8192
size-4096            269    269   4096
size-2048            793    910   2048
size-1024            564    608   1024
size-512             702    856    512
size-256            1485   4005    256
size-128            1209   1350    128
size-64             2858   3363     64
size-32             1538   2714     64
Intel(R) Xeon(TM) MP CPU 3.00GHz
address sizes   : 40 bits physical, 48 bits virtual


Folkert van Heusden

-- 
Try MultiTail! Multiple windows with logfiles, filtered with regular
expressions, colored output, etc. etc. www.vanheusden.com/multitail/
----------------------------------------------------------------------
Get your PGP/GPG key signed at www.biglumber.com!
----------------------------------------------------------------------
Phone: +31-6-41278122, PGP-key: 1F28D8AE, www.vanheusden.com


* Re: [POLL] SLAB : Are the 32 and 192 bytes caches really usefull on x86_64 machines ?
  2005-12-21 13:20     ` Folkert van Heusden
@ 2005-12-21 13:38       ` Eric Dumazet
  2005-12-21 14:09         ` Folkert van Heusden
  0 siblings, 1 reply; 36+ messages in thread
From: Eric Dumazet @ 2005-12-21 13:38 UTC (permalink / raw)
  To: Folkert van Heusden; +Cc: Ed Tomlinson, linux-kernel, Andi Kleen

Folkert van Heusden wrote:
> size-131072            0      0 131072
> size-65536             0      0  65536
> size-32768            20     20  32768
> size-16384             8      9  16384
> size-8192             37     38   8192
> size-4096            269    269   4096
> size-2048            793    910   2048
> size-1024            564    608   1024
> size-512             702    856    512
> size-256            1485   4005    256
> size-128            1209   1350    128
> size-64             2858   3363     64
> size-32             1538   2714     64
> Intel(R) Xeon(TM) MP CPU 3.00GHz
> address sizes   : 40 bits physical, 48 bits virtual
> 
> 
> Folkert van Heusden

Hi Folkert

Your results are interesting: size-32 seems to use objects of size 64!

 > size-32             1538   2714     64 <<HERE>>

So I guess the size-32 cache could be avoided, at least on EM64T (I take
it you run a 64-bit kernel?)
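The 64-byte object size is what cache-line alignment of the objects would produce. A sketch of that rounding, assuming L1_CACHE_BYTES = 64 as on x86_64 (hwcache_align is a hypothetical helper, not a kernel function):

```c
#include <assert.h>
#include <stddef.h>

#define L1_CACHE_BYTES 64	/* assumed x86_64 cache-line size */

/* When a slab cache aligns its objects to the hardware cache line,
 * the object size is rounded up to a multiple of L1_CACHE_BYTES, so
 * a 32-byte general cache effectively hands out 64-byte objects. */
static size_t hwcache_align(size_t objsize)
{
	return (objsize + L1_CACHE_BYTES - 1) & ~(size_t)(L1_CACHE_BYTES - 1);
}
```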

Eric


* Re: [POLL] SLAB : Are the 32 and 192 bytes caches really usefull on x86_64 machines ?
  2005-12-21 13:38       ` Eric Dumazet
@ 2005-12-21 14:09         ` Folkert van Heusden
  2005-12-21 16:40           ` Dave Jones
  0 siblings, 1 reply; 36+ messages in thread
From: Folkert van Heusden @ 2005-12-21 14:09 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Ed Tomlinson, linux-kernel, Andi Kleen


> >size-131072            0      0 131072
> >size-65536             0      0  65536
> >size-32768            20     20  32768
> >size-16384             8      9  16384
> >size-8192             37     38   8192
> >size-4096            269    269   4096
> >size-2048            793    910   2048
> >size-1024            564    608   1024
> >size-512             702    856    512
> >size-256            1485   4005    256
> >size-128            1209   1350    128
> >size-64             2858   3363     64
> >size-32             1538   2714     64
> >Intel(R) Xeon(TM) MP CPU 3.00GHz
> >address sizes   : 40 bits physical, 48 bits virtual
> 
> Your results are interesting : size-32 seems to use objects of size 64 !
> > size-32             1538   2714     64 <<HERE>>
> So I guess that size-32 cache could be avoided at least for EMT (I take you 
> run a 64 bits kernel ?)

I think I do, yes:
Linux xxxxx 2.4.21-37.EL #1 SMP Wed Sep 7 13:32:18 EDT 2005 x86_64 x86_64 x86_64 GNU/Linux
It is a Red Hat 4 x64 system.
Also from /proc/cpuinfo:
address sizes   : 40 bits physical, 48 bits virtual


Folkert van Heusden

-- 
Try MultiTail! Multiple windows with logfiles, filtered with regular
expressions, colored output, etc. etc. www.vanheusden.com/multitail/
----------------------------------------------------------------------
Get your PGP/GPG key signed at www.biglumber.com!
----------------------------------------------------------------------
Phone: +31-6-41278122, PGP-key: 1F28D8AE, www.vanheusden.com


* Re: [POLL] SLAB : Are the 32 and 192 bytes caches really usefull on x86_64 machines ?
  2005-12-21 14:09         ` Folkert van Heusden
@ 2005-12-21 16:40           ` Dave Jones
  2005-12-21 19:36             ` Folkert van Heusden
  0 siblings, 1 reply; 36+ messages in thread
From: Dave Jones @ 2005-12-21 16:40 UTC (permalink / raw)
  To: Folkert van Heusden; +Cc: Eric Dumazet, Ed Tomlinson, linux-kernel, Andi Kleen

On Wed, Dec 21, 2005 at 03:09:02PM +0100, Folkert van Heusden wrote:

 > > Your results are interesting : size-32 seems to use objects of size 64 !
 > > > size-32             1538   2714     64 <<HERE>>
 > > So I guess that size-32 cache could be avoided at least for EMT (I take you 
 > > run a 64 bits kernel ?)
 > 
 > I think I do yes:
 > Linux xxxxx 2.4.21-37.EL #1 SMP Wed Sep 7 13:32:18 EDT 2005 x86_64 x86_64 x86_64 GNU/Linux
 > It is a redhat 4 x64 system.

Looks more like RHEL3 judging from the kernel version.

		Dave



* Re: [POLL] SLAB : Are the 32 and 192 bytes caches really usefull on x86_64 machines ?
  2005-12-21 16:40           ` Dave Jones
@ 2005-12-21 19:36             ` Folkert van Heusden
  0 siblings, 0 replies; 36+ messages in thread
From: Folkert van Heusden @ 2005-12-21 19:36 UTC (permalink / raw)
  To: Dave Jones, Eric Dumazet, Ed Tomlinson, linux-kernel, Andi Kleen

>  > > Your results are interesting : size-32 seems to use objects of size 64 !
>  > > > size-32             1538   2714     64 <<HERE>>
>  > > So I guess that size-32 cache could be avoided at least for EMT (I take you 
>  > > run a 64 bits kernel ?)
>  > I think I do yes:
>  > Linux xxxxx 2.4.21-37.EL #1 SMP Wed Sep 7 13:32:18 EDT 2005 x86_64 x86_64 x86_64 GNU/Linux
>  > It is a redhat 4 x64 system.
> Looks more like RHEL3 judging from the kernel version.

Er, yes, you're totally right.


Folkert van Heusden

-- 
Try MultiTail! Multiple windows with logfiles, filtered with regular
expressions, colored output, etc. etc. www.vanheusden.com/multitail/
----------------------------------------------------------------------
Get your PGP/GPG key signed at www.biglumber.com!
----------------------------------------------------------------------
Phone: +31-6-41278122, PGP-key: 1F28D8AE, www.vanheusden.com


* Re: [POLL] SLAB : Are the 32 and 192 bytes caches really usefull on x86_64 machines ?
  2005-12-21  9:11 ` [POLL] SLAB : Are the 32 and 192 bytes caches really usefull on x86_64 machines ? Eric Dumazet
                     ` (2 preceding siblings ...)
  2005-12-21 12:44   ` Ed Tomlinson
@ 2005-12-28  8:32   ` Denis Vlasenko
  2005-12-28  8:54     ` Denis Vlasenko
  3 siblings, 1 reply; 36+ messages in thread
From: Denis Vlasenko @ 2005-12-28  8:32 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: linux-kernel, Andi Kleen

On Wednesday 21 December 2005 11:11, Eric Dumazet wrote:
> I wonder if the 32 and 192 bytes caches are worth to be declared in 
> include/linux/kmalloc_sizes.h, at least on x86_64
> 
> (x86_64 : PAGE_SIZE = 4096, L1_CACHE_BYTES = 64)
> 
> On my machines, I can say that the 32 and 192 sizes could be avoided in favor 
> in spending less cpu cycles in __find_general_cachep()
> 
> Could some of you post the result of the following command on your machines :
> 
> # grep "size-" /proc/slabinfo |grep -v DMA|cut -c1-40
> 
> size-131072            0      0 131072
> size-65536             0      0  65536
> size-32768             2      2  32768
> size-16384             0      0  16384
> size-8192             13     13   8192
> size-4096            161    161   4096
> size-2048          40564  42976   2048
> size-1024            681    800   1024
> size-512           19792  37168    512
> size-256              81    105    256
> size-192            1218   1280    192
> size-64            31278  86907     64
> size-128            5457  10380    128
> size-32              594    784     32

# grep "size-" /proc/slabinfo |grep -v DMA|cut -c1-40
size-131072            0      0 131072
size-65536             0      0  65536
size-32768             1      1  32768
size-16384             0      0  16384
size-8192            253    253   8192
size-4096             89     89   4096
size-2048            248    248   2048
size-1024            312    312   1024
size-512             545    648    512
size-256             213    270    256
size-128            5642   5642    128
size-64             1025   1586     64
size-32             2262   7854     32


* Re: [POLL] SLAB : Are the 32 and 192 bytes caches really usefull on x86_64 machines ?
  2005-12-28  8:32   ` Denis Vlasenko
@ 2005-12-28  8:54     ` Denis Vlasenko
  2005-12-28 17:57       ` Andreas Kleen
  0 siblings, 1 reply; 36+ messages in thread
From: Denis Vlasenko @ 2005-12-28  8:54 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: linux-kernel, Andi Kleen

On Wednesday 28 December 2005 10:32, Denis Vlasenko wrote:
> On Wednesday 21 December 2005 11:11, Eric Dumazet wrote:
> > I wonder if the 32 and 192 bytes caches are worth to be declared in 
> > include/linux/kmalloc_sizes.h, at least on x86_64
> > 
> > (x86_64 : PAGE_SIZE = 4096, L1_CACHE_BYTES = 64)
> > 
> > On my machines, I can say that the 32 and 192 sizes could be avoided in favor 
> > in spending less cpu cycles in __find_general_cachep()
> > 
> > Could some of you post the result of the following command on your machines :
> > 
> > # grep "size-" /proc/slabinfo |grep -v DMA|cut -c1-40
> > 
> > size-131072            0      0 131072
> > size-65536             0      0  65536
> > size-32768             2      2  32768
> > size-16384             0      0  16384
> > size-8192             13     13   8192
> > size-4096            161    161   4096
> > size-2048          40564  42976   2048
> > size-1024            681    800   1024
> > size-512           19792  37168    512
> > size-256              81    105    256
> > size-192            1218   1280    192
> > size-64            31278  86907     64
> > size-128            5457  10380    128
> > size-32              594    784     32
> 
> # grep "size-" /proc/slabinfo |grep -v DMA|cut -c1-40
> size-131072            0      0 131072
> size-65536             0      0  65536
> size-32768             1      1  32768
> size-16384             0      0  16384
> size-8192            253    253   8192
> size-4096             89     89   4096
> size-2048            248    248   2048
> size-1024            312    312   1024
> size-512             545    648    512
> size-256             213    270    256
> size-128            5642   5642    128
> size-64             1025   1586     64
> size-32             2262   7854     32

Wow... I overlooked that you were requesting data from x86_64 boxes.
Mine is not; it's i386...
--
vda


* Re: [POLL] SLAB : Are the 32 and 192 bytes caches really usefull on x86_64 machines ?
  2005-12-28  8:54     ` Denis Vlasenko
@ 2005-12-28 17:57       ` Andreas Kleen
  2005-12-28 21:01         ` Matt Mackall
                           ` (2 more replies)
  0 siblings, 3 replies; 36+ messages in thread
From: Andreas Kleen @ 2005-12-28 17:57 UTC (permalink / raw)
  To: Denis Vlasenko; +Cc: Eric Dumazet, linux-kernel

On Wed, 28 Dec 2005 at 09:54, Denis Vlasenko <vda@ilport.com.ua> wrote:

> > # grep "size-" /proc/slabinfo |grep -v DMA|cut -c1-40
> > size-131072            0      0 131072
> > size-65536             0      0  65536
> > size-32768             1      1  32768
> > size-16384             0      0  16384
> > size-8192            253    253   8192
> > size-4096             89     89   4096
> > size-2048            248    248   2048
> > size-1024            312    312   1024
> > size-512             545    648    512
> > size-256             213    270    256
> > size-128            5642   5642    128
> > size-64             1025   1586     64
> > size-32             2262   7854     32
>
> Wow... I overlooked that you are requesting data from x86_64 boxes.
> Mine is not, it's i386...

This whole discussion is pointless anyway, because most kmallocs are
constant-sized, and with a constant-sized kmalloc the slab is selected
at compile time.

What would be more interesting would be to redo the complete kmalloc
slab list.

I remember the original slab paper from Bonwick actually mentioned that
power-of-two slabs are the worst choice for a malloc - but for some
reason Linux chose them anyway.  Redoing the list would require a lot of
measurements of the actual kmalloc sizes in different workloads, and
then selecting a good list, but it could ultimately save a lot of memory
(ok, not that much anymore, because the memory-intensive allocations
should all have their own caches, but at least some).

Most likely the best list is different for 32-bit and 64-bit too.

Note that just looking at slabinfo is not enough for this - you need the
original sizes as passed to kmalloc, not the rounded values reported
there.  It should probably not be too hard to hack up a simple
monitoring script in systemtap to generate the data.
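Bonwick's point about power-of-two classes can be quantified with a small sketch: given the same requested sizes, compare the wasted bytes ("slack") under a power-of-two class list against a denser one. Both class lists and the sample request mix below are made up purely for illustration:

```c
#include <assert.h>
#include <stddef.h>

/* Round a request up to the first size class that fits; 0 if none. */
static size_t round_to_class(size_t req, const size_t *classes, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (req <= classes[i])
			return classes[i];
	return 0;
}

/* Total slack (bytes allocated minus bytes requested) over a set of
 * requests, for a given class list. */
static size_t total_slack(const size_t *reqs, size_t nreq,
			  const size_t *classes, size_t ncls)
{
	size_t slack = 0, i;

	for (i = 0; i < nreq; i++)
		slack += round_to_class(reqs[i], classes, ncls) - reqs[i];
	return slack;
}

/* Hypothetical class lists and request mix for the comparison. */
static const size_t pow2_classes[]  = { 32, 64, 128, 256, 512, 1024 };
static const size_t dense_classes[] = { 32, 64, 96, 128, 192, 256,
					384, 512, 768, 1024 };
static const size_t sample_reqs[]   = { 40, 100, 150, 300, 600 };
```

For this request mix the power-of-two list wastes 794 bytes against 346 for the denser list; real workload measurements, as suggested above, would be needed to pick the actual classes.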

-Andi




* Re: [POLL] SLAB : Are the 32 and 192 bytes caches really usefull on x86_64 machines ?
  2005-12-28 17:57       ` Andreas Kleen
@ 2005-12-28 21:01         ` Matt Mackall
  2005-12-29  1:26           ` Dave Jones
                             ` (2 more replies)
  2005-12-29 19:48         ` Steven Rostedt
  2006-01-02  8:37         ` Pekka Enberg
  2 siblings, 3 replies; 36+ messages in thread
From: Matt Mackall @ 2005-12-28 21:01 UTC (permalink / raw)
  To: Andreas Kleen; +Cc: Denis Vlasenko, Eric Dumazet, linux-kernel

On Wed, Dec 28, 2005 at 06:57:15PM +0100, Andreas Kleen wrote:
> Am Mi 28.12.2005 09:54 schrieb Denis Vlasenko <vda@ilport.com.ua>:
> 
> > > # grep "size-" /proc/slabinfo |grep -v DMA|cut -c1-40
> > > size-131072            0      0 131072
> > > size-65536             0      0  65536
> > > size-32768             1      1  32768
> > > size-16384             0      0  16384
> > > size-8192            253    253   8192
> > > size-4096             89     89   4096
> > > size-2048            248    248   2048
> > > size-1024            312    312   1024
> > > size-512             545    648    512
> > > size-256             213    270    256
> > > size-128            5642   5642    128
> > > size-64             1025   1586     64
> > > size-32             2262   7854     32
> >
> > Wow... I overlooked that you are requesting data from x86_64 boxes.
> > Mine is not, it's i386...
> 
> This whole discussion is pointless anyway, because most kmallocs are
> constant-sized, and with a constant-sized kmalloc the slab is selected
> at compile time.
> 
> What would be more interesting would be to redo the complete kmalloc
> slab list.
> 
> I remember the original slab paper from Bonwick actually mentioned that
> power-of-two slabs are the worst choice for a malloc - but for some
> reason Linux chose them anyway.  Redoing the list would require a lot
> of measurements of the actual kmalloc sizes in different workloads, and
> then selecting a good list, but it could ultimately save a lot of
> memory (ok, not that much anymore, because the memory-intensive
> allocations should all have their own caches, but at least some).
> 
> Most likely the best list is different for 32-bit and 64-bit too.
> 
> Note that just looking at slabinfo is not enough for this - you need
> the original sizes as passed to kmalloc, not the rounded values
> reported there.  It should probably not be too hard to hack up a
> simple monitoring script in systemtap to generate the data.

Something like this:

http://lwn.net/Articles/124374/

-- 
Mathematics is the supreme nostalgia of our time.


* Re: [POLL] SLAB : Are the 32 and 192 bytes caches really usefull on x86_64 machines ?
  2005-12-28 21:01         ` Matt Mackall
@ 2005-12-29  1:26           ` Dave Jones
  2005-12-30  4:06             ` Steven Rostedt
  2005-12-29  1:29           ` Dave Jones
  2005-12-30 21:13           ` Marcelo Tosatti
  2 siblings, 1 reply; 36+ messages in thread
From: Dave Jones @ 2005-12-29  1:26 UTC (permalink / raw)
  To: Matt Mackall; +Cc: Andreas Kleen, Denis Vlasenko, Eric Dumazet, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 300 bytes --]

On Wed, Dec 28, 2005 at 03:01:25PM -0600, Matt Mackall wrote:

 > Something like this:
 > 
 > http://lwn.net/Articles/124374/

Nice toy.  Attached is a variant that works on 2.6.15-rc7:
- the ->cs_size compile error is fixed
- kstrdup and kzalloc are inlined; otherwise these functions dominate
  the profile.

		Dave


[-- Attachment #2: linux-2.6-debug-account-kmalloc.patch --]
[-- Type: text/plain, Size: 12834 bytes --]


/proc/kmalloc allocation tracing


 tiny-mpm/fs/proc/proc_misc.c  |   21 ++++
 tiny-mpm/include/linux/slab.h |   19 ++++
 tiny-mpm/init/Kconfig         |    7 +
 tiny-mpm/mm/Makefile          |    2 
 tiny-mpm/mm/kmallocacct.c     |  182 ++++++++++++++++++++++++++++++++++++++++++
 tiny-mpm/mm/slab.c            |    7 +
 6 files changed, 237 insertions(+), 1 deletion(-)

Index: tiny/init/Kconfig
===================================================================
--- tiny.orig/init/Kconfig	2005-10-10 17:41:44.000000000 -0700
+++ tiny/init/Kconfig	2005-10-10 17:41:46.000000000 -0700
@@ -315,6 +315,13 @@ config BUG
           option for embedded systems with no facilities for reporting errors.
           Just say Y.
 
+config KMALLOC_ACCOUNTING
+	default n
+	bool "Enabled accounting of kmalloc/kfree allocations"
+	help
+	  This option records kmalloc and kfree activity and reports it via
+	  /proc/kmalloc.
+
 config BASE_FULL
 	default y
 	bool "Enable full-sized data structures for core" if EMBEDDED
Index: tiny/mm/slab.c
===================================================================
--- tiny.orig/mm/slab.c	2005-10-10 17:32:51.000000000 -0700
+++ tiny/mm/slab.c	2005-10-10 17:41:46.000000000 -0700
@@ -2911,6 +2911,8 @@ EXPORT_SYMBOL(kmalloc_node);
 void *__kmalloc(size_t size, unsigned int __nocast flags)
 {
 	kmem_cache_t *cachep;
+	struct cache_sizes *csizep = malloc_sizes;
+	void *a;
 
 	/* If you want to save a few bytes .text space: replace
 	 * __ with kmem_.
@@ -2920,7 +2921,11 @@ void *__kmalloc(size_t size, unsigned in
 	cachep = __find_general_cachep(size, flags);
 	if (unlikely(cachep == NULL))
 		return NULL;
-	return __cache_alloc(cachep, flags);
+	a = __cache_alloc(cachep, flags);
+	/* advance csizep to the size class actually used */
+	while (size > csizep->cs_size)
+		csizep++;
+	kmalloc_account(a, csizep->cs_size, size);
+	return a;
 }
 EXPORT_SYMBOL(__kmalloc);
 
@@ -3020,6 +3023,8 @@ void kfree(const void *objp)
 	kmem_cache_t *c;
 	unsigned long flags;
 
+	kfree_account(objp, ksize(objp));
+
 	if (unlikely(!objp))
 		return;
 	local_irq_save(flags);
Index: tiny/mm/kmallocacct.c
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ tiny/mm/kmallocacct.c	2005-10-10 17:41:46.000000000 -0700
@@ -0,0 +1,182 @@
+#include	<linux/config.h>
+#include	<linux/seq_file.h>
+#include	<linux/kallsyms.h>
+
+struct kma_caller {
+	const void *caller;
+	int total, net, slack, allocs, frees;
+};
+
+struct kma_list {
+	int callerhash;
+	const void *address;
+};
+
+#define MAX_CALLER_TABLE 512
+#define MAX_ALLOC_TRACK 4096
+
+#define kma_hash(address, size) (((u32)address / (u32)size) % size)
+
+static struct kma_list kma_alloc[MAX_ALLOC_TRACK];
+static struct kma_caller kma_caller[MAX_CALLER_TABLE];
+
+static int kma_callers;
+static int kma_lost_callers, kma_lost_allocs, kma_unknown_frees;
+static int kma_total, kma_net, kma_slack, kma_allocs, kma_frees;
+static spinlock_t kma_lock = SPIN_LOCK_UNLOCKED;
+
+void __kmalloc_account(const void *caller, const void *addr, int size, int req)
+{
+	int i, hasha, hashc;
+	unsigned long flags;
+
+	spin_lock_irqsave(&kma_lock, flags);
+	if(req >= 0) /* kmalloc */
+	{
+		/* find callers slot */
+		hashc = kma_hash(caller, MAX_CALLER_TABLE);
+		for (i = 0; i < MAX_CALLER_TABLE; i++) {
+			if (!kma_caller[hashc].caller ||
+			    kma_caller[hashc].caller == caller)
+				break;
+			hashc = (hashc + 1) % MAX_CALLER_TABLE;
+		}
+
+		if (!kma_caller[hashc].caller)
+			kma_callers++;
+
+		if (i < MAX_CALLER_TABLE) {
+			/* update callers stats */
+			kma_caller[hashc].caller = caller;
+			kma_caller[hashc].total += size;
+			kma_caller[hashc].net += size;
+			kma_caller[hashc].slack += size - req;
+			kma_caller[hashc].allocs++;
+
+			/* add malloc to list */
+			hasha = kma_hash(addr, MAX_ALLOC_TRACK);
+			for (i = 0; i < MAX_ALLOC_TRACK; i++) {
+				if (!kma_alloc[hasha].callerhash)
+					break;
+				hasha = (hasha + 1) % MAX_ALLOC_TRACK;
+			}
+
+			if(i < MAX_ALLOC_TRACK) {
+				kma_alloc[hasha].callerhash = hashc;
+				kma_alloc[hasha].address = addr;
+			}
+			else
+				kma_lost_allocs++;
+		}
+		else {
+			kma_lost_callers++;
+			kma_lost_allocs++;
+		}
+
+		kma_total += size;
+		kma_net += size;
+		kma_slack += size - req;
+		kma_allocs++;
+	}
+	else { /* kfree */
+		hasha = kma_hash(addr, MAX_ALLOC_TRACK);
+		for (i = 0; i < MAX_ALLOC_TRACK ; i++) {
+			if (kma_alloc[hasha].address == addr)
+				break;
+			hasha = (hasha + 1) % MAX_ALLOC_TRACK;
+		}
+
+		if (i < MAX_ALLOC_TRACK) {
+			hashc = kma_alloc[hasha].callerhash;
+			kma_alloc[hasha].callerhash = 0;
+			kma_caller[hashc].net -= size;
+			kma_caller[hashc].frees++;
+		}
+		else
+			kma_unknown_frees++;
+
+		kma_net -= size;
+		kma_frees++;
+	}
+	spin_unlock_irqrestore(&kma_lock, flags);
+}
+
+static void *as_start(struct seq_file *m, loff_t *pos)
+{
+	int i;
+	loff_t n = *pos;
+
+	if (!n) {
+		seq_printf(m, "total bytes allocated: %8d\n", kma_total);
+		seq_printf(m, "slack bytes allocated: %8d\n", kma_slack);
+		seq_printf(m, "net bytes allocated:   %8d\n", kma_net);
+		seq_printf(m, "number of allocs:      %8d\n", kma_allocs);
+		seq_printf(m, "number of frees:       %8d\n", kma_frees);
+		seq_printf(m, "number of callers:     %8d\n", kma_callers);
+		seq_printf(m, "lost callers:          %8d\n",
+			   kma_lost_callers);
+		seq_printf(m, "lost allocs:           %8d\n",
+			   kma_lost_allocs);
+		seq_printf(m, "unknown frees:         %8d\n",
+			   kma_unknown_frees);
+		seq_puts(m, "\n   total    slack      net alloc/free  caller\n");
+	}
+
+	for (i = 0; i < MAX_CALLER_TABLE; i++) {
+		if(kma_caller[i].caller)
+			n--;
+		if(n < 0)
+			return (void *)(i+1);
+	}
+
+	return 0;
+}
+
+static void *as_next(struct seq_file *m, void *p, loff_t *pos)
+{
+	int n = (int)p-1, i;
+	++*pos;
+
+	for (i = n + 1; i < MAX_CALLER_TABLE; i++)
+		if(kma_caller[i].caller)
+			return (void *)(i+1);
+
+	return 0;
+}
+
+static void as_stop(struct seq_file *m, void *p)
+{
+}
+
+static int as_show(struct seq_file *m, void *p)
+{
+	int n = (int)p-1;
+	struct kma_caller *c;
+#ifdef CONFIG_KALLSYMS
+	char *modname;
+	const char *name;
+	unsigned long offset = 0, size;
+	char namebuf[128];
+
+	c = &kma_caller[n];
+	name = kallsyms_lookup((int)c->caller, &size, &offset, &modname,
+			       namebuf);
+	seq_printf(m, "%8d %8d %8d %5d/%-5d %s+0x%lx\n",
+		   c->total, c->slack, c->net, c->allocs, c->frees,
+		   name, offset);
+#else
+	c = &kma_caller[n];
+	seq_printf(m, "%8d %8d %8d %5d/%-5d %p\n",
+		   c->total, c->slack, c->net, c->allocs, c->frees, c->caller);
+#endif
+
+	return 0;
+}
+
+struct seq_operations kmalloc_account_op = {
+	.start	= as_start,
+	.next	= as_next,
+	.stop	= as_stop,
+	.show	= as_show,
+};
+
Index: tiny/mm/Makefile
===================================================================
--- tiny.orig/mm/Makefile	2005-10-10 17:30:45.000000000 -0700
+++ tiny/mm/Makefile	2005-10-10 17:41:46.000000000 -0700
@@ -12,6 +12,7 @@ obj-y			:= bootmem.o filemap.o mempool.o
 			   readahead.o slab.o swap.o truncate.o vmscan.o \
 			   prio_tree.o $(mmu-y)
 
+obj-$(CONFIG_KMALLOC_ACCOUNTING) += kmallocacct.o
 obj-$(CONFIG_SWAP)	+= page_io.o swap_state.o swapfile.o thrash.o
 obj-$(CONFIG_HUGETLBFS)	+= hugetlb.o
 obj-$(CONFIG_NUMA) 	+= mempolicy.o
Index: tiny/include/linux/slab.h
===================================================================
--- tiny.orig/include/linux/slab.h	2005-10-10 17:32:41.000000000 -0700
+++ tiny/include/linux/slab.h	2005-10-10 17:41:46.000000000 -0700
@@ -53,6 +53,23 @@ typedef struct kmem_cache_s kmem_cache_t
 #define SLAB_CTOR_ATOMIC	0x002UL		/* tell constructor it can't sleep */
 #define	SLAB_CTOR_VERIFY	0x004UL		/* tell constructor it's a verify call */
 
+#ifdef CONFIG_KMALLOC_ACCOUNTING
+void __kmalloc_account(const void *, const void *, int, int);
+
+static void inline kmalloc_account(const void *addr, int size, int req)
+{
+	__kmalloc_account(__builtin_return_address(0), addr, size, req);
+}
+
+static void inline kfree_account(const void *addr, int size)
+{
+	__kmalloc_account(__builtin_return_address(0), addr, size, -1);
+}
+#else
+#define kmalloc_account(a, b, c)
+#define kfree_account(a, b)
+#endif
+
 /* prototypes */
 extern void __init kmem_cache_init(void);
 
@@ -78,6 +95,7 @@ extern void *__kmalloc(size_t, unsigned 
 
 static inline void *kmalloc(size_t size, unsigned int __nocast flags)
 {
+#ifndef CONFIG_KMALLOC_ACCOUNTING
 	if (__builtin_constant_p(size)) {
 		int i = 0;
 #define CACHE(x) \
@@ -96,6 +114,7 @@ found:
 			malloc_sizes[i].cs_dmacachep :
 			malloc_sizes[i].cs_cachep, flags);
 	}
+#endif
 	return __kmalloc(size, flags);
 }
 
Index: tiny/fs/proc/proc_misc.c
===================================================================
--- tiny.orig/fs/proc/proc_misc.c	2005-10-10 17:30:45.000000000 -0700
+++ tiny/fs/proc/proc_misc.c	2005-10-10 17:41:46.000000000 -0700
@@ -337,6 +337,24 @@ static struct file_operations proc_slabi
 	.release	= seq_release,
 };
 
+#ifdef CONFIG_KMALLOC_ACCOUNTING
+
+extern struct seq_operations kmalloc_account_op;
+
+static int kmalloc_account_open(struct inode *inode, struct file *file)
+{
+	return seq_open(file, &kmalloc_account_op);
+}
+
+static struct file_operations proc_kmalloc_account_operations = {
+	.open		= kmalloc_account_open,
+	.read		= seq_read,
+	.llseek		= seq_lseek,
+	.release	= seq_release,
+};
+
+#endif
+
 static int show_stat(struct seq_file *p, void *v)
 {
 	int i;
@@ -601,6 +619,9 @@ void __init proc_misc_init(void)
 	create_seq_entry("stat", 0, &proc_stat_operations);
 	create_seq_entry("interrupts", 0, &proc_interrupts_operations);
 	create_seq_entry("slabinfo",S_IWUSR|S_IRUGO,&proc_slabinfo_operations);
+#ifdef CONFIG_KMALLOC_ACCOUNTING
+	create_seq_entry("kmalloc",S_IRUGO,&proc_kmalloc_account_operations);
+#endif
 	create_seq_entry("buddyinfo",S_IRUGO, &fragmentation_file_operations);
 	create_seq_entry("vmstat",S_IRUGO, &proc_vmstat_file_operations);
 	create_seq_entry("zoneinfo",S_IRUGO, &proc_zoneinfo_file_operations);

--- linux-2.6.14/mm/slab.c~	2005-12-28 16:37:04.000000000 -0500
+++ linux-2.6.14/mm/slab.c	2005-12-28 16:37:14.000000000 -0500
@@ -3045,20 +3045,6 @@ void kmem_cache_free(kmem_cache_t *cache
 EXPORT_SYMBOL(kmem_cache_free);
 
 /**
- * kzalloc - allocate memory. The memory is set to zero.
- * @size: how many bytes of memory are required.
- * @flags: the type of memory to allocate.
- */
-void *kzalloc(size_t size, gfp_t flags)
-{
-	void *ret = kmalloc(size, flags);
-	if (ret)
-		memset(ret, 0, size);
-	return ret;
-}
-EXPORT_SYMBOL(kzalloc);
-
-/**
  * kfree - free previously allocated memory
  * @objp: pointer returned by kmalloc.
  *
--- linux-2.6.14/include/linux/slab.h~	2005-12-28 16:37:19.000000000 -0500
+++ linux-2.6.14/include/linux/slab.h	2005-12-28 16:38:51.000000000 -0500
@@ -118,7 +118,13 @@ found:
 	return __kmalloc(size, flags);
 }
 
-extern void *kzalloc(size_t, gfp_t);
+static inline void *kzalloc(size_t size, gfp_t flags)
+{
+	void *ret = kmalloc(size, flags);
+	if (ret)
+		memset(ret, 0, size);
+	return ret;
+}
 
 /**
  * kcalloc - allocate memory for an array. The memory is set to zero.

--- linux-2.6.14/include/linux/slab.h~	2005-12-28 19:04:06.000000000 -0500
+++ linux-2.6.14/include/linux/slab.h	2005-12-28 19:04:47.000000000 -0500
@@ -126,6 +126,27 @@ static inline void *kzalloc(size_t size,
 	return ret;
 }
 
+/*
+ * kstrdup - allocate space for and copy an existing string
+ *
+ * @s: the string to duplicate
+ * @gfp: the GFP mask used in the kmalloc() call when allocating memory
+ */
+static inline char *kstrdup(const char *s, gfp_t gfp)
+{
+	size_t len;
+	char *buf;
+
+	if (!s)
+		return NULL;
+
+	len = strlen(s) + 1;
+	buf = kmalloc(len, gfp);
+	if (buf)
+		memcpy(buf, s, len);
+	return buf;
+}
+
 /**
  * kcalloc - allocate memory for an array. The memory is set to zero.
  * @n: number of elements.
--- linux-2.6.14/mm/slab.c~	2005-12-28 19:04:54.000000000 -0500
+++ linux-2.6.14/mm/slab.c	2005-12-28 19:04:59.000000000 -0500
@@ -3669,25 +3669,3 @@ unsigned int ksize(const void *objp)
 	return obj_reallen(page_get_cache(virt_to_page(objp)));
 }
 
-
-/*
- * kstrdup - allocate space for and copy an existing string
- *
- * @s: the string to duplicate
- * @gfp: the GFP mask used in the kmalloc() call when allocating memory
- */
-char *kstrdup(const char *s, gfp_t gfp)
-{
-	size_t len;
-	char *buf;
-
-	if (!s)
-		return NULL;
-
-	len = strlen(s) + 1;
-	buf = kmalloc(len, gfp);
-	if (buf)
-		memcpy(buf, s, len);
-	return buf;
-}
-EXPORT_SYMBOL(kstrdup);

--- linux-2.6.14/include/linux/string.h~	2005-12-28 19:12:06.000000000 -0500
+++ linux-2.6.14/include/linux/string.h	2005-12-28 19:12:19.000000000 -0500
@@ -88,8 +88,6 @@ extern int memcmp(const void *,const voi
 extern void * memchr(const void *,int,__kernel_size_t);
 #endif
 
-extern char *kstrdup(const char *s, gfp_t gfp);
-
 #ifdef __cplusplus
 }
 #endif

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [POLL] SLAB : Are the 32 and 192 bytes caches really usefull on x86_64 machines ?
  2005-12-28 21:01         ` Matt Mackall
  2005-12-29  1:26           ` Dave Jones
@ 2005-12-29  1:29           ` Dave Jones
  2005-12-29  1:50             ` Keith Owens
  2005-12-30 21:13           ` Marcelo Tosatti
  2 siblings, 1 reply; 36+ messages in thread
From: Dave Jones @ 2005-12-29  1:29 UTC (permalink / raw)
  To: Matt Mackall; +Cc: linux-kernel


 > Something like this:
 > 
 > http://lwn.net/Articles/124374/

One thing that really sticks out like a sore thumb is soft_cursor().
That thing gets called a *lot*, and every time it does a kmalloc/free
pair that 99.9% of the time is going to be the same size alloc as
it was the last time.  This patch makes that alloc persistent
(and does a realloc if the size changes).
The only time the size should change is if the font/resolution changes, I think.
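
A userspace sketch of the same reuse pattern (hypothetical names, not
the kernel code below): keep the buffer and its size across calls, and
only reallocate when the requested size differs.

```c
#include <assert.h>
#include <stdlib.h>

/* Sketch of the allocation-reuse idea: get_scratch() hands back a
 * cached buffer and only reallocates when the requested size changes. */

static void *scratch;           /* persists across calls */
static size_t scratch_size;     /* size of the current allocation */
static unsigned long reallocs;  /* how many times we actually allocated */

void *get_scratch(size_t size)
{
	if (size != scratch_size) {
		free(scratch);          /* free(NULL) is a no-op */
		scratch = malloc(size);
		if (!scratch) {
			scratch_size = 0;   /* stay consistent on failure */
			return NULL;
		}
		scratch_size = size;
		reallocs++;
	}
	return scratch;
}

unsigned long scratch_reallocs(void)
{
	return reallocs;
}
```

One difference worth noting: this sketch resets scratch_size when the
allocation fails, whereas the patch below updates allocsize before the
kmalloc(), so a failed allocation would leave src NULL but allocsize
stale for the next call of the same size.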

Boot tested with vesafb & fbconsole, which had the desired effect.
With this patch, it almost falls off the profile.

Signed-off-by: Dave Jones <davej@redhat.com>

--- linux-2.6.14/drivers/video/console/softcursor.c~	2005-12-28 18:40:08.000000000 -0500
+++ linux-2.6.14/drivers/video/console/softcursor.c	2005-12-28 18:45:50.000000000 -0500
@@ -23,7 +23,9 @@ int soft_cursor(struct fb_info *info, st
 	unsigned int buf_align = info->pixmap.buf_align - 1;
 	unsigned int i, size, dsize, s_pitch, d_pitch;
 	struct fb_image *image;
-	u8 *dst, *src;
+	u8 *dst;
+	static u8 *src=NULL;
+	static int allocsize=0;
 
 	if (info->state != FBINFO_STATE_RUNNING)
 		return 0;
@@ -31,9 +33,15 @@ int soft_cursor(struct fb_info *info, st
 	s_pitch = (cursor->image.width + 7) >> 3;
 	dsize = s_pitch * cursor->image.height;
 
-	src = kmalloc(dsize + sizeof(struct fb_image), GFP_ATOMIC);
-	if (!src)
-		return -ENOMEM;
+	if (dsize + sizeof(struct fb_image) != allocsize) {
+		if (src != NULL)
+			kfree(src);
+		allocsize = dsize + sizeof(struct fb_image);
+
+		src = kmalloc(allocsize, GFP_ATOMIC);
+		if (!src)
+			return -ENOMEM;
+	}
 
 	image = (struct fb_image *) (src + dsize);
 	*image = cursor->image;
@@ -61,7 +69,6 @@ int soft_cursor(struct fb_info *info, st
 	fb_pad_aligned_buffer(dst, d_pitch, src, s_pitch, image->height);
 	image->data = dst;
 	info->fbops->fb_imageblit(info, image);
-	kfree(src);
 	return 0;
 }
 

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [POLL] SLAB : Are the 32 and 192 bytes caches really usefull on x86_64 machines ?
  2005-12-29  1:29           ` Dave Jones
@ 2005-12-29  1:50             ` Keith Owens
  2005-12-29  2:39               ` Dave Jones
  2006-01-04  5:26               ` Dave Jones
  0 siblings, 2 replies; 36+ messages in thread
From: Keith Owens @ 2005-12-29  1:50 UTC (permalink / raw)
  To: Dave Jones; +Cc: Matt Mackall, linux-kernel

Dave Jones (on Wed, 28 Dec 2005 20:29:15 -0500) wrote:
>
> > Something like this:
> > 
> > http://lwn.net/Articles/124374/
>
>One thing that really sticks out like a sore thumb is soft_cursor()
>That thing gets called a *lot*, and every time it does a kmalloc/free
>pair that 99.9% of the time is going to be the same size alloc as
>it was the last time.  This patch makes that alloc persistent
>(and does a realloc if the size changes).
>The only time it should change is if the font/resolution changes I think.

Can soft_cursor() be called from multiple processes at the same time,
in particular with dual head systems?  If so then a static variable is
not going to work.
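
One way around that (a hypothetical sketch in plain C, not a tested
kernel change) is to hang the cached buffer off the per-device
structure, so each device gets its own copy; concurrent access to a
single device would still need the usual locking.

```c
#include <assert.h>
#include <stdlib.h>

/* Per-context variant of the cached-buffer pattern: each "device"
 * carries its own scratch buffer instead of sharing one static, so
 * users of different devices never stomp on each other's state. */

struct scratch_ctx {
	void *buf;
	size_t bufsize;
};

void *ctx_scratch(struct scratch_ctx *c, size_t size)
{
	if (size != c->bufsize) {
		free(c->buf);                    /* free(NULL) is a no-op */
		c->buf = malloc(size);
		c->bufsize = c->buf ? size : 0;  /* stay consistent on failure */
	}
	return c->buf;
}
```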


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [POLL] SLAB : Are the 32 and 192 bytes caches really usefull on x86_64 machines ?
  2005-12-29  1:50             ` Keith Owens
@ 2005-12-29  2:39               ` Dave Jones
  2006-01-02 15:03                 ` Helge Hafting
  2006-01-04  5:26               ` Dave Jones
  1 sibling, 1 reply; 36+ messages in thread
From: Dave Jones @ 2005-12-29  2:39 UTC (permalink / raw)
  To: Keith Owens; +Cc: Matt Mackall, linux-kernel

On Thu, Dec 29, 2005 at 12:50:10PM +1100, Keith Owens wrote:
 > Dave Jones (on Wed, 28 Dec 2005 20:29:15 -0500) wrote:
 > >
 > > > Something like this:
 > > > 
 > > > http://lwn.net/Articles/124374/
 > >
 > >One thing that really sticks out like a sore thumb is soft_cursor()
 > >That thing gets called a *lot*, and every time it does a kmalloc/free
 > >pair that 99.9% of the time is going to be the same size alloc as
 > >it was the last time.  This patch makes that alloc persistent
 > >(and does a realloc if the size changes).
 > >The only time it should change is if the font/resolution changes I think.
 > 
 > Can soft_cursor() be called from multiple processes at the same time,
 > in particular with dual head systems?  If so then a static variable is
 > not going to work.

My dual-head system here displays a cloned image on the second
screen, which seems to dtrt.  I'm not sure how to make it show
something different on the other head to test further.

		Dave


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [POLL] SLAB : Are the 32 and 192 bytes caches really usefull on x86_64 machines ?
  2005-12-28 17:57       ` Andreas Kleen
  2005-12-28 21:01         ` Matt Mackall
@ 2005-12-29 19:48         ` Steven Rostedt
  2005-12-29 21:16           ` Andi Kleen
  2006-01-02  8:37         ` Pekka Enberg
  2 siblings, 1 reply; 36+ messages in thread
From: Steven Rostedt @ 2005-12-29 19:48 UTC (permalink / raw)
  To: Andreas Kleen
  Cc: linux-kernel, Eric Dumazet, Denis Vlasenko, Matt Mackall,
	Dave Jones

On Wed, 2005-12-28 at 18:57 +0100, Andreas Kleen wrote:
[...]
> 
> This whole discussion is pointless anyways because most kmallocs are
> constant sized, and with a constant-sized kmalloc the slab is selected
> at compile time.
> 
> What would be more interesting would be to redo the complete kmalloc
> slab list.
> 
> I remember the original slab paper from Bonwick actually mentioned
> that power-of-two slabs are the worst choice for a malloc - but for
> some reason Linux chose them anyway.  That would require a lot of
> measurements of the actual kmalloc sizes in different workloads and
> then selecting a good list, but could ultimately save a lot of memory
> (ok, not that much anymore, because the memory-intensive allocations
> should all have their own caches, but at least some).
> 
> Most likely the best list is different for 32bit and 64bit too.
> 
> Note that just looking at slabinfo is not enough for this - you need
> the original sizes as passed to kmalloc, not the rounded values
> reported there.  It should probably not be too hard to hack up a
> simple monitoring script for that in systemtap to generate the data.
> 

OK then, after reading this I figured there must be a way to dynamically
allocate slab sizes based on the constant kmalloc sizes.  So I spent last
night and some of this morning coming up with the patch below.

Right now it only works with i386, but I'm sure it can be hacked to work
with all archs.  At compile time it creates a table of sizes for all
kmallocs (outside of slab.c and arch/i386/mm/init.c) that use a
constant size.

This table is then walked in arch/i386/mm/init.c: for each recorded size
it either reuses an already-created cache (from the malloc_sizes array)
or creates a new cache of that size (L1 cache aligned), and then updates
the pointers to use that cache.
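
The boot-time fixup rounds each recorded size up to a cache-line
multiple before looking for an existing cache, so many distinct
constants collapse into a few caches.  A minimal userspace sketch of
that normalization step (assuming 64-byte cache lines, as on the test
box):

```c
#include <assert.h>
#include <stddef.h>

#define L1_CACHE_BYTES 64

/* Round a requested size up to the next cache-line multiple, the way
 * the boot-time table walk normalizes sizes before creating caches. */
size_t round_slab_size(size_t size)
{
	size = (size + (L1_CACHE_BYTES - 1)) & ~(size_t)(L1_CACHE_BYTES - 1);
	if (size < sizeof(void *))  /* never below one word */
		size = sizeof(void *);
	return size;
}

/* Count the distinct caches a list of requested sizes collapses into. */
size_t distinct_caches(const size_t *req, size_t n)
{
	size_t seen[64];
	size_t nseen = 0, i, j;

	for (i = 0; i < n; i++) {
		size_t s = round_slab_size(req[i]);
		for (j = 0; j < nseen && seen[j] != s; j++)
			;
		if (j == nseen && nseen < 64)
			seen[nseen++] = s;
	}
	return nseen;
}
```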

Here's what was created on my test box:

cat /proc/slabinfo
[...]
dynamic_dma-1536       0      0   1536    5    2 : tunables   24   12    0 : slabdata      0      0      0
dynamic-1536           1      5   1536    5    2 : tunables   24   12    0 : slabdata      1      1      0
dynamic_dma-1280       0      0   1280    3    1 : tunables   24   12    0 : slabdata      0      0      0
dynamic-1280           6      6   1280    3    1 : tunables   24   12    0 : slabdata      2      2      0
dynamic_dma-2176       0      0   2176    3    2 : tunables   24   12    0 : slabdata      0      0      0
dynamic-2176           0      0   2176    3    2 : tunables   24   12    0 : slabdata      0      0      0
dynamic_dma-1152       0      0   1152    7    2 : tunables   24   12    0 : slabdata      0      0      0
dynamic-1152           0      0   1152    7    2 : tunables   24   12    0 : slabdata      0      0      0
dynamic_dma-1408       0      0   1408    5    2 : tunables   24   12    0 : slabdata      0      0      0
dynamic-1408           0      0   1408    5    2 : tunables   24   12    0 : slabdata      0      0      0
dynamic_dma-640        0      0    640    6    1 : tunables   54   27    0 : slabdata      0      0      0
dynamic-640            0      0    640    6    1 : tunables   54   27    0 : slabdata      0      0      0
dynamic_dma-768        0      0    768    5    1 : tunables   54   27    0 : slabdata      0      0      0
dynamic-768            0      0    768    5    1 : tunables   54   27    0 : slabdata      0      0      0
dynamic_dma-3200       0      0   3200    2    2 : tunables   24   12    0 : slabdata      0      0      0
dynamic-3200           8      8   3200    2    2 : tunables   24   12    0 : slabdata      4      4      0
dynamic_dma-896        0      0    896    4    1 : tunables   54   27    0 : slabdata      0      0      0
dynamic-896            9     12    896    4    1 : tunables   54   27    0 : slabdata      3      3      0
dynamic_dma-384        0      0    384   10    1 : tunables   54   27    0 : slabdata      0      0      0
dynamic-384           40     40    384   10    1 : tunables   54   27    0 : slabdata      4      4      0
size-131072(DMA)       0      0 131072    1   32 : tunables    8    4    0 : slabdata      0      0      0
size-131072            0      0 131072    1   32 : tunables    8    4    0 : slabdata      0      0      0
size-65536(DMA)        0      0  65536    1   16 : tunables    8    4    0 : slabdata      0      0      0
size-65536             1      1  65536    1   16 : tunables    8    4    0 : slabdata      1      1      0
size-32768(DMA)        0      0  32768    1    8 : tunables    8    4    0 : slabdata      0      0      0
size-32768             0      0  32768    1    8 : tunables    8    4    0 : slabdata      0      0      0
size-16384(DMA)        0      0  16384    1    4 : tunables    8    4    0 : slabdata      0      0      0
size-16384             0      0  16384    1    4 : tunables    8    4    0 : slabdata      0      0      0
size-8192(DMA)         0      0   8192    1    2 : tunables    8    4    0 : slabdata      0      0      0
size-8192             40     40   8192    1    2 : tunables    8    4    0 : slabdata     40     40      0
size-4096(DMA)         0      0   4096    1    1 : tunables   24   12    0 : slabdata      0      0      0
size-4096             34     34   4096    1    1 : tunables   24   12    0 : slabdata     34     34      0
size-2048(DMA)         0      0   2048    2    1 : tunables   24   12    0 : slabdata      0      0      0
size-2048            266    266   2048    2    1 : tunables   24   12    0 : slabdata    133    133      0
size-1024(DMA)         0      0   1024    4    1 : tunables   54   27    0 : slabdata      0      0      0
size-1024             24     24   1024    4    1 : tunables   54   27    0 : slabdata      6      6      0
size-512(DMA)          0      0    512    8    1 : tunables   54   27    0 : slabdata      0      0      0
size-512              90    112    512    8    1 : tunables   54   27    0 : slabdata     14     14      0
size-256(DMA)          0      0    256   15    1 : tunables  120   60    0 : slabdata      0      0      0
size-256             735    735    256   15    1 : tunables  120   60    0 : slabdata     49     49      0
size-128(DMA)          0      0    128   30    1 : tunables  120   60    0 : slabdata      0      0      0
size-128            2750   2760    128   30    1 : tunables  120   60    0 : slabdata     92     92      0
size-64(DMA)           0      0     64   59    1 : tunables  120   60    0 : slabdata      0      0      0
size-32(DMA)           0      0     32  113    1 : tunables  120   60    0 : slabdata      0      0      0
size-64              418    472     64   59    1 : tunables  120   60    0 : slabdata      8      8      0
size-32             1175   1243     32  113    1 : tunables  120   60    0 : slabdata     11     11      0
[...]

Not sure if this is worth looking further into, but it might actually be
a way to use less memory.  For example, the 384-byte cache above packs
its 40 objects into only 4 4k pages, whereas the same 40 objects would
be padded to 512 bytes (in size-512), costing 5 4k pages.  Plus the
384-byte cache probably keeps its slab management ON_SLAB, whereas the
512 one does not.
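
The arithmetic checks out if you assume whole objects packed per page
(a sketch that ignores slab management overhead): 4096/384 gives 10
objects per page, so 40 objects fit in 4 pages, while 4096/512 gives 8
per page, needing 5.

```c
#include <assert.h>

/* Pages needed for n objects of a given size, assuming a slab never
 * splits an object across page boundaries (sketch; real slabs also
 * spend space on management data when it is kept ON_SLAB). */
unsigned pages_for(unsigned n, unsigned objsize, unsigned pagesize)
{
	unsigned per_page = pagesize / objsize;
	return (n + per_page - 1) / per_page;
}
```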

Comments?

-- Steve

Index: linux-2.6.15-rc7/arch/i386/Kconfig
===================================================================
--- linux-2.6.15-rc7.orig/arch/i386/Kconfig	2005-12-29 09:09:29.000000000 -0500
+++ linux-2.6.15-rc7/arch/i386/Kconfig	2005-12-29 09:09:53.000000000 -0500
@@ -173,6 +173,14 @@
 	depends on HPET_TIMER && RTC=y
 	default y
 
+config DYNAMIC_SLABS
+	bool "Dynamically create slabs for constant kmalloc"
+	default y
+	help
+	  This enables the creation of SLABS using information created at
+	  compile time.  Then on boot up, the slabs are created to fit
+	  more with what was asked for.
+
 config SMP
 	bool "Symmetric multi-processing support"
 	---help---
Index: linux-2.6.15-rc7/arch/i386/kernel/vmlinux.lds.S
===================================================================
--- linux-2.6.15-rc7.orig/arch/i386/kernel/vmlinux.lds.S	2005-12-29 09:09:29.000000000 -0500
+++ linux-2.6.15-rc7/arch/i386/kernel/vmlinux.lds.S	2005-12-29 09:09:53.000000000 -0500
@@ -68,6 +68,13 @@
 	*(.data.init_task)
   }
 
+#ifdef CONFIG_DYNAMIC_SLABS
+  . = ALIGN(16);		/* dynamic slab table */
+  __start____slab_addresses = .;
+  __slab_addresses : AT(ADDR(__slab_addresses) - LOAD_OFFSET) { *(__slab_addresses) }
+  __stop____slab_addresses = .;
+#endif
+
   /* will be freed after init */
   . = ALIGN(4096);		/* Init code and data */
   __init_begin = .;
@@ -107,6 +114,14 @@
   .altinstr_replacement : AT(ADDR(.altinstr_replacement) - LOAD_OFFSET) {
 	*(.altinstr_replacement)
   }
+#ifdef CONFIG_DYNAMIC_SLABS
+  . = ALIGN(16);		/* dynamic slab table */
+  __start____slab_preprocess = .;
+  __slab_preprocess : AT(ADDR(__slab_preprocess) - LOAD_OFFSET) { *(__slab_preprocess) }
+  __slab_process_ret : AT(ADDR(__slab_process_ret) - LOAD_OFFSET) { *(__slab_process_ret) }
+  __stop____slab_preprocess = .;
+#endif
+
   /* .exit.text is discard at runtime, not link time, to deal with references
      from .altinstructions and .eh_frame */
   .exit.text : AT(ADDR(.exit.text) - LOAD_OFFSET) { *(.exit.text) }
@@ -119,7 +134,7 @@
   __per_cpu_start = .;
   .data.percpu  : AT(ADDR(.data.percpu) - LOAD_OFFSET) { *(.data.percpu) }
   __per_cpu_end = .;
-  . = ALIGN(4096);
+ . = ALIGN(4096);
   __init_end = .;
   /* freed after init ends here */
 	
Index: linux-2.6.15-rc7/arch/i386/mm/init.c
===================================================================
--- linux-2.6.15-rc7.orig/arch/i386/mm/init.c	2005-12-29 09:09:29.000000000 -0500
+++ linux-2.6.15-rc7/arch/i386/mm/init.c	2005-12-29 14:31:08.000000000 -0500
@@ -6,6 +6,7 @@
  *  Support of BIGMEM added by Gerhard Wichert, Siemens AG, July 1999
  */
 
+#define DYNAMIC_SLABS_BOOTSTRAP
 #include <linux/config.h>
 #include <linux/module.h>
 #include <linux/signal.h>
@@ -748,3 +749,187 @@
 	}
 }
 #endif
+
+#ifdef CONFIG_DYNAMIC_SLABS
+extern void __start____slab_preprocess(void);
+extern unsigned long __start____slab_addresses;
+extern unsigned long __stop____slab_addresses;
+
+static __initdata LIST_HEAD(slablist);
+
+struct slab_links {
+	struct cache_sizes *c;
+	struct list_head list;
+};
+
+static struct cache_sizes *find_slab_size(int size)
+{
+	struct list_head *curr;
+	struct slab_links *s;
+
+	list_for_each(curr, &slablist) {
+		s = list_entry(curr, struct slab_links, list);
+		if (s->c->cs_size == size)
+			return s->c;
+	}
+	return NULL;
+}
+
+static void free_slablist(void)
+{
+	struct list_head *curr, *next;
+	struct slab_links *s;
+
+	list_for_each_safe(curr, next, &slablist) {
+		s = list_entry(curr, struct slab_links, list);
+		list_del(&s->list);
+		kfree(s);
+	}
+}
+
+#ifndef ARCH_KMALLOC_MINALIGN
+#define ARCH_KMALLOC_MINALIGN L1_CACHE_BYTES
+#endif
+#ifndef ARCH_KMALLOC_FLAGS
+#define ARCH_KMALLOC_FLAGS SLAB_HWCACHE_ALIGN
+#endif
+#define	BYTES_PER_WORD		sizeof(void *)
+
+#ifdef DEBUG_ADDR
+static __init void print_slab_addresses(int hex)
+{
+	unsigned long *slab_addresses = &__start____slab_addresses;
+	unsigned long *end = &__stop____slab_addresses;
+
+
+	for (; slab_addresses < end; slab_addresses++) {
+		if (hex)
+			printk("slab %p = %lx\n",slab_addresses, *slab_addresses);
+		else
+			printk("slab %p = %ld\n",slab_addresses, *slab_addresses);
+	}
+}
+#else
+# define print_slab_addresses(x) do {} while(0)
+#endif
+
+int __init dynamic_slab_init(void)
+{
+	unsigned long *slab_addresses = &__start____slab_addresses;
+	unsigned long *end = &__stop____slab_addresses;
+	struct cache_sizes *c;
+	struct slab_links *s;
+	unsigned long sizes[] = {
+#define CACHE(C) C,
+#include <linux/kmalloc_sizes.h>
+#undef CACHE
+	};
+	int i;
+
+
+	asm (".section __slab_process_ret,\"ax\"\n"
+	     "ret\n"
+	     ".previous\n");
+
+	__start____slab_preprocess();
+
+	printk("Before update!\n");
+	print_slab_addresses(0);
+
+	/*
+	 * DYNAMIC_SLABS_BOOTSTRAP is defined, so we don't need
+	 * to worry about kmalloc hardcoded.
+	 */
+
+	/*
+	 * This is really bad, but I don't want to go monkey up the
+	 * slab.c to get to the cache_chain.  So right now I just
+	 * allocate a pointer list to search for slabs that are
+	 * of the right size, and then free it at the end.
+	 *
+	 * Hey, you find a better way, then fix this ;)
+	 */
+	for (i=0; i < sizeof(sizes)/sizeof(sizes[0]); i++) {
+		s = kmalloc(sizeof(*s), GFP_ATOMIC);
+		if (!s)
+			panic("Can't create link list for slabs\n");
+		s->c = &malloc_sizes[i];
+		list_add_tail(&s->list, &slablist);
+	}
+
+	for (; slab_addresses < end; slab_addresses++) {
+		char *name;
+		char *name_dma;
+		unsigned long size = *slab_addresses;
+		struct cache_sizes **ptr = (struct cache_sizes**)slab_addresses;
+
+		if (!size)
+			continue;
+
+		size = (size + (L1_CACHE_BYTES-1)) & ~(L1_CACHE_BYTES-1);
+		if (size < BYTES_PER_WORD)
+			size = BYTES_PER_WORD;
+		if (size < ARCH_KMALLOC_MINALIGN)
+			size = ARCH_KMALLOC_MINALIGN;
+
+		c = find_slab_size(size);
+		if (c) {
+			*ptr = c;
+			continue;
+		}
+
+		/*
+		 * Create a cache for this specific size.
+		 */
+		name = kmalloc(25, GFP_ATOMIC);
+		if (!name)
+			panic("Can't allocate name for dynamic slab\n");
+
+		snprintf(name, 25, "dynamic-%ld", size);
+		name_dma = kmalloc(25, GFP_ATOMIC);
+		if (!name_dma)
+			panic("Can't allocate name for dynamic slab\n");
+
+		snprintf(name_dma, 25, "dynamic_dma-%ld", size);
+
+		c = kmalloc(sizeof(*c), GFP_ATOMIC);
+
+		if (!c)
+			panic("Can't allocate cache_size descriptor\n");
+
+		c->cs_size = size;
+
+		/*
+		 * For performance, all the general caches are L1 aligned.
+		 * This should be particularly beneficial on SMP boxes, as it
+		 * eliminates "false sharing".
+		 * Note for systems short on memory removing the alignment will
+		 * allow tighter packing of the smaller caches.
+		 */
+		c->cs_cachep = kmem_cache_create(name,
+				c->cs_size, ARCH_KMALLOC_MINALIGN,
+				(ARCH_KMALLOC_FLAGS | SLAB_PANIC), NULL, NULL);
+
+		c->cs_dmacachep = kmem_cache_create(name_dma,
+			c->cs_size, ARCH_KMALLOC_MINALIGN,
+			(ARCH_KMALLOC_FLAGS | SLAB_CACHE_DMA | SLAB_PANIC),
+			NULL, NULL);
+
+		s = kmalloc(sizeof(*s), GFP_ATOMIC);
+		if (!s)
+			panic("Can't create link list for slabs\n");
+		s->c = c;
+		list_add_tail(&s->list, &slablist);
+
+		*ptr = c;
+
+	}
+
+	free_slablist();
+
+	printk("\nAfter update!\n");
+	print_slab_addresses(1);
+
+	return 0;
+}
+#endif
Index: linux-2.6.15-rc7/include/asm-i386/dynamic_slab.h
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ linux-2.6.15-rc7/include/asm-i386/dynamic_slab.h	2005-12-29 09:09:53.000000000 -0500
@@ -0,0 +1,20 @@
+
+/*
+ * Included in slab.h
+ *
+ * @c    - cache pointer to return base on size
+ * @size - size of cache.
+ */
+__asm__ __volatile__ (
+	"jmp 2f\n"
+	".section __slab_preprocess,\"ax\"\n"
+	"movl %1,1f\n"
+	".previous\n"
+	".section __slab_addresses,\"aw\"\n"
+	".align 4\n"
+	"1:\n"
+	".long 0\n"
+	".previous\n"
+	"2:\n"
+	"movl 1b, %0\n"
+	: "=r"(c) : "i"(size));
Index: linux-2.6.15-rc7/include/linux/slab.h
===================================================================
--- linux-2.6.15-rc7.orig/include/linux/slab.h	2005-12-29 09:09:29.000000000 -0500
+++ linux-2.6.15-rc7/include/linux/slab.h	2005-12-29 09:23:44.000000000 -0500
@@ -80,6 +80,15 @@
 {
 	if (__builtin_constant_p(size)) {
 		int i = 0;
+#if defined(CONFIG_DYNAMIC_SLABS) && !defined(MODULE) && !defined(DYNAMIC_SLABS_BOOTSTRAP)
+		{
+			struct cache_sizes *c;
+# include <asm/dynamic_slab.h>
+		return kmem_cache_alloc((flags & GFP_DMA) ?
+			c->cs_dmacachep :
+			c->cs_cachep, flags);
+		}
+#endif
 #define CACHE(x) \
 		if (size <= x) \
 			goto found; \
Index: linux-2.6.15-rc7/mm/slab.c
===================================================================
--- linux-2.6.15-rc7.orig/mm/slab.c	2005-12-29 09:09:29.000000000 -0500
+++ linux-2.6.15-rc7/mm/slab.c	2005-12-29 14:04:44.000000000 -0500
@@ -86,6 +86,7 @@
  *	All object allocations for a node occur from node specific slab lists.
  */
 
+#define DYNAMIC_SLABS_BOOTSTRAP
 #include	<linux/config.h>
 #include	<linux/slab.h>
 #include	<linux/mm.h>
@@ -1165,6 +1166,19 @@
 	/* Done! */
 	g_cpucache_up = FULL;
 
+#ifdef CONFIG_DYNAMIC_SLABS
+	{
+		extern int dynamic_slab_init(void);
+		/*
+		 * Create the caches that will handle
+		 * kmallocs of constant sizes.
+		 */
+		dynamic_slab_init();
+	}
+#endif
+	/*
+	 */
+
 	/* Register a cpu startup notifier callback
 	 * that initializes ac_data for all new cpus
 	 */

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [POLL] SLAB : Are the 32 and 192 bytes caches really usefull on x86_64 machines ?
  2005-12-29 19:48         ` Steven Rostedt
@ 2005-12-29 21:16           ` Andi Kleen
  0 siblings, 0 replies; 36+ messages in thread
From: Andi Kleen @ 2005-12-29 21:16 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Andreas Kleen, linux-kernel, Eric Dumazet, Denis Vlasenko,
	Matt Mackall, Dave Jones

> OK then, after reading this I figured there must be a way to dynamically
> allocate slab sizes based on the kmalloc constants.  So I spent last
> night and some of this morning coming up with the below patch.

The canonical slab theory is that constant-size allocations are for
fixed objects. And if they are frequent they should in theory use a
kmem cache, because their object lifetimes should be similar and
clustering them together should give the best fragmentation
avoidance.

So in theory longer term the dynamic kmallocs are more important because
they cannot be handled like this - and these are not caught by
your patch.

So I'm not sure you're optimizing the right thing here.

Perhaps a good evolution of your patch would be to add some analysis of
the callers and generate a nice compile-time report that people can use as a 
guideline for converting kmalloc over to kmem_cache_alloc. But to do this
really well would require dynamic data from runtime.

Given that, I think a runtime patch is better. Ideally one that's easy
to use, with someone collecting data from users and then submitting a patch
for a better new set of default slabs.  It would need to be separate
for 32bit and 64bit too.

I guess one could run a fancy dynamic optimization algorithm to find
the best set of slabs from the data.

-Andi


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [POLL] SLAB : Are the 32 and 192 bytes caches really usefull on x86_64 machines ?
  2005-12-29  1:26           ` Dave Jones
@ 2005-12-30  4:06             ` Steven Rostedt
  2006-01-02  8:46               ` Pekka Enberg
  0 siblings, 1 reply; 36+ messages in thread
From: Steven Rostedt @ 2005-12-30  4:06 UTC (permalink / raw)
  To: Dave Jones
  Cc: linux-kernel, Eric Dumazet, Denis Vlasenko, Andreas Kleen,
	Matt Mackall

[-- Attachment #1: Type: text/plain, Size: 550 bytes --]

On Wed, 2005-12-28 at 20:26 -0500, Dave Jones wrote:
> On Wed, Dec 28, 2005 at 03:01:25PM -0600, Matt Mackall wrote:
> 
>  > Something like this:
>  > 
>  > http://lwn.net/Articles/124374/
> 
> Nice toy. Variant attached that works on 2.6.15rc7
> - ->cs_size compile error fixed
> - inlines kstrdup and kzalloc.
>   Otherwise these functions dominate the profile.

Attached is a variant that was refreshed against 2.6.15-rc7 and fixes
the logic bug that your compile error fix introduced ;)

It should be cachep->objsize, not csizep->cs_size.

-- Steve



[-- Attachment #2: linux-2.6-debug-account-kmalloc.patch --]
[-- Type: text/x-patch, Size: 11851 bytes --]


/proc/kmalloc allocation tracing


 tiny-mpm/fs/proc/proc_misc.c  |   21 ++++
 tiny-mpm/include/linux/slab.h |   19 ++++
 tiny-mpm/init/Kconfig         |    7 +
 tiny-mpm/mm/Makefile          |    2 
 tiny-mpm/mm/kmallocacct.c     |  182 ++++++++++++++++++++++++++++++++++++++++++
 tiny-mpm/mm/slab.c            |    7 +
 6 files changed, 237 insertions(+), 1 deletion(-)

Index: linux-2.6.15-rc7/init/Kconfig
===================================================================
--- linux-2.6.15-rc7.orig/init/Kconfig	2005-12-29 22:54:48.000000000 -0500
+++ linux-2.6.15-rc7/init/Kconfig	2005-12-29 22:55:29.000000000 -0500
@@ -328,6 +328,13 @@
           option for embedded systems with no facilities for reporting errors.
           Just say Y.
 
+config KMALLOC_ACCOUNTING
+	default n
+	bool "Enable accounting of kmalloc/kfree allocations"
+	help
+	  This option records kmalloc and kfree activity and reports it via
+	  /proc/kmalloc.
+
 config BASE_FULL
 	default y
 	bool "Enable full-sized data structures for core" if EMBEDDED
Index: linux-2.6.15-rc7/mm/slab.c
===================================================================
--- linux-2.6.15-rc7.orig/mm/slab.c	2005-12-29 22:54:48.000000000 -0500
+++ linux-2.6.15-rc7/mm/slab.c	2005-12-29 22:56:13.000000000 -0500
@@ -2924,6 +2924,7 @@
 void *__kmalloc(size_t size, gfp_t flags)
 {
 	kmem_cache_t *cachep;
+	void *a;
 
 	/* If you want to save a few bytes .text space: replace
 	 * __ with kmem_.
@@ -2933,7 +2934,9 @@
 	cachep = __find_general_cachep(size, flags);
 	if (unlikely(cachep == NULL))
 		return NULL;
-	return __cache_alloc(cachep, flags);
+	a = __cache_alloc(cachep, flags);
+	kmalloc_account(a, cachep->objsize, size);
+	return a;
 }
 EXPORT_SYMBOL(__kmalloc);
 
@@ -3006,20 +3009,6 @@
 EXPORT_SYMBOL(kmem_cache_free);
 
 /**
- * kzalloc - allocate memory. The memory is set to zero.
- * @size: how many bytes of memory are required.
- * @flags: the type of memory to allocate.
- */
-void *kzalloc(size_t size, gfp_t flags)
-{
-	void *ret = kmalloc(size, flags);
-	if (ret)
-		memset(ret, 0, size);
-	return ret;
-}
-EXPORT_SYMBOL(kzalloc);
-
-/**
  * kfree - free previously allocated memory
  * @objp: pointer returned by kmalloc.
  *
@@ -3033,6 +3022,8 @@
 	kmem_cache_t *c;
 	unsigned long flags;
 
+	kfree_account(objp, ksize(objp));
+
 	if (unlikely(!objp))
 		return;
 	local_irq_save(flags);
@@ -3610,25 +3601,3 @@
 	return obj_reallen(page_get_cache(virt_to_page(objp)));
 }
 
-
-/*
- * kstrdup - allocate space for and copy an existing string
- *
- * @s: the string to duplicate
- * @gfp: the GFP mask used in the kmalloc() call when allocating memory
- */
-char *kstrdup(const char *s, gfp_t gfp)
-{
-	size_t len;
-	char *buf;
-
-	if (!s)
-		return NULL;
-
-	len = strlen(s) + 1;
-	buf = kmalloc(len, gfp);
-	if (buf)
-		memcpy(buf, s, len);
-	return buf;
-}
-EXPORT_SYMBOL(kstrdup);
Index: linux-2.6.15-rc7/mm/kmallocacct.c
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ linux-2.6.15-rc7/mm/kmallocacct.c	2005-12-29 22:55:29.000000000 -0500
@@ -0,0 +1,182 @@
+#include	<linux/config.h>
+#include	<linux/seq_file.h>
+#include	<linux/kallsyms.h>
+
+struct kma_caller {
+	const void *caller;
+	int total, net, slack, allocs, frees;
+};
+
+struct kma_list {
+	int callerhash;
+	const void *address;
+};
+
+#define MAX_CALLER_TABLE 512
+#define MAX_ALLOC_TRACK 4096
+
+#define kma_hash(address, size) (((u32)address / (u32)size) % size)
+
+static struct kma_list kma_alloc[MAX_ALLOC_TRACK];
+static struct kma_caller kma_caller[MAX_CALLER_TABLE];
+
+static int kma_callers;
+static int kma_lost_callers, kma_lost_allocs, kma_unknown_frees;
+static int kma_total, kma_net, kma_slack, kma_allocs, kma_frees;
+static spinlock_t kma_lock = SPIN_LOCK_UNLOCKED;
+
+void __kmalloc_account(const void *caller, const void *addr, int size, int req)
+{
+	int i, hasha, hashc;
+	unsigned long flags;
+
+	spin_lock_irqsave(&kma_lock, flags);
+	if(req >= 0) /* kmalloc */
+	{
+		/* find callers slot */
+		hashc = kma_hash(caller, MAX_CALLER_TABLE);
+		for (i = 0; i < MAX_CALLER_TABLE; i++) {
+			if (!kma_caller[hashc].caller ||
+			    kma_caller[hashc].caller == caller)
+				break;
+			hashc = (hashc + 1) % MAX_CALLER_TABLE;
+		}
+
+		if (!kma_caller[hashc].caller)
+			kma_callers++;
+
+		if (i < MAX_CALLER_TABLE) {
+			/* update callers stats */
+			kma_caller[hashc].caller = caller;
+			kma_caller[hashc].total += size;
+			kma_caller[hashc].net += size;
+			kma_caller[hashc].slack += size - req;
+			kma_caller[hashc].allocs++;
+
+			/* add malloc to list */
+			hasha = kma_hash(addr, MAX_ALLOC_TRACK);
+			for (i = 0; i < MAX_ALLOC_TRACK; i++) {
+				if (!kma_alloc[hasha].callerhash)
+					break;
+				hasha = (hasha + 1) % MAX_ALLOC_TRACK;
+			}
+
+			if(i < MAX_ALLOC_TRACK) {
+				kma_alloc[hasha].callerhash = hashc;
+				kma_alloc[hasha].address = addr;
+			}
+			else
+				kma_lost_allocs++;
+		}
+		else {
+			kma_lost_callers++;
+			kma_lost_allocs++;
+		}
+
+		kma_total += size;
+		kma_net += size;
+		kma_slack += size - req;
+		kma_allocs++;
+	}
+	else { /* kfree */
+		hasha = kma_hash(addr, MAX_ALLOC_TRACK);
+		for (i = 0; i < MAX_ALLOC_TRACK ; i++) {
+			if (kma_alloc[hasha].address == addr)
+				break;
+			hasha = (hasha + 1) % MAX_ALLOC_TRACK;
+		}
+
+		if (i < MAX_ALLOC_TRACK) {
+			hashc = kma_alloc[hasha].callerhash;
+			kma_alloc[hasha].callerhash = 0;
+			kma_caller[hashc].net -= size;
+			kma_caller[hashc].frees++;
+		}
+		else
+			kma_unknown_frees++;
+
+		kma_net -= size;
+		kma_frees++;
+	}
+	spin_unlock_irqrestore(&kma_lock, flags);
+}
+
+static void *as_start(struct seq_file *m, loff_t *pos)
+{
+	int i;
+	loff_t n = *pos;
+
+	if (!n) {
+		seq_printf(m, "total bytes allocated: %8d\n", kma_total);
+		seq_printf(m, "slack bytes allocated: %8d\n", kma_slack);
+		seq_printf(m, "net bytes allocated:   %8d\n", kma_net);
+		seq_printf(m, "number of allocs:      %8d\n", kma_allocs);
+		seq_printf(m, "number of frees:       %8d\n", kma_frees);
+		seq_printf(m, "number of callers:     %8d\n", kma_callers);
+		seq_printf(m, "lost callers:          %8d\n",
+			   kma_lost_callers);
+		seq_printf(m, "lost allocs:           %8d\n",
+			   kma_lost_allocs);
+		seq_printf(m, "unknown frees:         %8d\n",
+			   kma_unknown_frees);
+		seq_puts(m, "\n   total    slack      net alloc/free  caller\n");
+	}
+
+	for (i = 0; i < MAX_CALLER_TABLE; i++) {
+		if(kma_caller[i].caller)
+			n--;
+		if(n < 0)
+			return (void *)(i+1);
+	}
+
+	return 0;
+}
+
+static void *as_next(struct seq_file *m, void *p, loff_t *pos)
+{
+	int n = (int)p-1, i;
+	++*pos;
+
+	for (i = n + 1; i < MAX_CALLER_TABLE; i++)
+		if(kma_caller[i].caller)
+			return (void *)(i+1);
+
+	return 0;
+}
+
+static void as_stop(struct seq_file *m, void *p)
+{
+}
+
+static int as_show(struct seq_file *m, void *p)
+{
+	int n = (int)p-1;
+	struct kma_caller *c;
+#ifdef CONFIG_KALLSYMS
+	char *modname;
+	const char *name;
+	unsigned long offset = 0, size;
+	char namebuf[128];
+
+	c = &kma_caller[n];
+	name = kallsyms_lookup((int)c->caller, &size, &offset, &modname,
+			       namebuf);
+	seq_printf(m, "%8d %8d %8d %5d/%-5d %s+0x%lx\n",
+		   c->total, c->slack, c->net, c->allocs, c->frees,
+		   name, offset);
+#else
+	c = &kma_caller[n];
+	seq_printf(m, "%8d %8d %8d %5d/%-5d %p\n",
+		   c->total, c->slack, c->net, c->allocs, c->frees, c->caller);
+#endif
+
+	return 0;
+}
+
+struct seq_operations kmalloc_account_op = {
+	.start	= as_start,
+	.next	= as_next,
+	.stop	= as_stop,
+	.show	= as_show,
+};
+
Index: linux-2.6.15-rc7/mm/Makefile
===================================================================
--- linux-2.6.15-rc7.orig/mm/Makefile	2005-12-29 22:54:48.000000000 -0500
+++ linux-2.6.15-rc7/mm/Makefile	2005-12-29 22:55:29.000000000 -0500
@@ -12,6 +12,7 @@
 			   readahead.o slab.o swap.o truncate.o vmscan.o \
 			   prio_tree.o $(mmu-y)
 
+obj-$(CONFIG_KMALLOC_ACCOUNTING) += kmallocacct.o
 obj-$(CONFIG_SWAP)	+= page_io.o swap_state.o swapfile.o thrash.o
 obj-$(CONFIG_HUGETLBFS)	+= hugetlb.o
 obj-$(CONFIG_NUMA) 	+= mempolicy.o
Index: linux-2.6.15-rc7/include/linux/slab.h
===================================================================
--- linux-2.6.15-rc7.orig/include/linux/slab.h	2005-12-29 22:54:48.000000000 -0500
+++ linux-2.6.15-rc7/include/linux/slab.h	2005-12-29 22:55:29.000000000 -0500
@@ -53,6 +53,23 @@
 #define SLAB_CTOR_ATOMIC	0x002UL		/* tell constructor it can't sleep */
 #define	SLAB_CTOR_VERIFY	0x004UL		/* tell constructor it's a verify call */
 
+#ifdef CONFIG_KMALLOC_ACCOUNTING
+void __kmalloc_account(const void *, const void *, int, int);
+
+static void inline kmalloc_account(const void *addr, int size, int req)
+{
+	__kmalloc_account(__builtin_return_address(0), addr, size, req);
+}
+
+static void inline kfree_account(const void *addr, int size)
+{
+	__kmalloc_account(__builtin_return_address(0), addr, size, -1);
+}
+#else
+#define kmalloc_account(a, b, c)
+#define kfree_account(a, b)
+#endif
+
 /* prototypes */
 extern void __init kmem_cache_init(void);
 
@@ -78,6 +95,7 @@
 
 static inline void *kmalloc(size_t size, gfp_t flags)
 {
+#ifndef CONFIG_KMALLOC_ACCOUNTING
 	if (__builtin_constant_p(size)) {
 		int i = 0;
 #define CACHE(x) \
@@ -96,10 +114,38 @@
 			malloc_sizes[i].cs_dmacachep :
 			malloc_sizes[i].cs_cachep, flags);
 	}
+#endif
 	return __kmalloc(size, flags);
 }
 
-extern void *kzalloc(size_t, gfp_t);
+static inline void *kzalloc(size_t size, gfp_t flags)
+{
+	void *ret = kmalloc(size, flags);
+	if (ret)
+		memset(ret, 0, size);
+	return ret;
+}
+
+/*
+ * kstrdup - allocate space for and copy an existing string
+ *
+ * @s: the string to duplicate
+ * @gfp: the GFP mask used in the kmalloc() call when allocating memory
+ */
+static inline char *kstrdup(const char *s, gfp_t gfp)
+{
+	size_t len;
+	char *buf;
+
+	if (!s)
+		return NULL;
+
+	len = strlen(s) + 1;
+	buf = kmalloc(len, gfp);
+	if (buf)
+		memcpy(buf, s, len);
+	return buf;
+}
 
 /**
  * kcalloc - allocate memory for an array. The memory is set to zero.
Index: linux-2.6.15-rc7/fs/proc/proc_misc.c
===================================================================
--- linux-2.6.15-rc7.orig/fs/proc/proc_misc.c	2005-12-29 22:54:48.000000000 -0500
+++ linux-2.6.15-rc7/fs/proc/proc_misc.c	2005-12-29 22:55:29.000000000 -0500
@@ -337,6 +337,24 @@
 	.release	= seq_release,
 };
 
+#ifdef CONFIG_KMALLOC_ACCOUNTING
+
+extern struct seq_operations kmalloc_account_op;
+
+static int kmalloc_account_open(struct inode *inode, struct file *file)
+{
+	return seq_open(file, &kmalloc_account_op);
+}
+
+static struct file_operations proc_kmalloc_account_operations = {
+	.open		= kmalloc_account_open,
+	.read		= seq_read,
+	.llseek		= seq_lseek,
+	.release	= seq_release,
+};
+
+#endif
+
 static int show_stat(struct seq_file *p, void *v)
 {
 	int i;
@@ -601,6 +619,9 @@
 	create_seq_entry("stat", 0, &proc_stat_operations);
 	create_seq_entry("interrupts", 0, &proc_interrupts_operations);
 	create_seq_entry("slabinfo",S_IWUSR|S_IRUGO,&proc_slabinfo_operations);
+#ifdef CONFIG_KMALLOC_ACCOUNTING
+	create_seq_entry("kmalloc",S_IRUGO,&proc_kmalloc_account_operations);
+#endif
 	create_seq_entry("buddyinfo",S_IRUGO, &fragmentation_file_operations);
 	create_seq_entry("vmstat",S_IRUGO, &proc_vmstat_file_operations);
 	create_seq_entry("zoneinfo",S_IRUGO, &proc_zoneinfo_file_operations);
Index: linux-2.6.15-rc7/include/linux/string.h
===================================================================
--- linux-2.6.15-rc7.orig/include/linux/string.h	2005-12-29 22:54:48.000000000 -0500
+++ linux-2.6.15-rc7/include/linux/string.h	2005-12-29 22:55:29.000000000 -0500
@@ -88,8 +88,6 @@
 extern void * memchr(const void *,int,__kernel_size_t);
 #endif
 
-extern char *kstrdup(const char *s, gfp_t gfp);
-
 #ifdef __cplusplus
 }
 #endif

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [POLL] SLAB : Are the 32 and 192 bytes caches really usefull on x86_64 machines ?
  2005-12-28 21:01         ` Matt Mackall
  2005-12-29  1:26           ` Dave Jones
  2005-12-29  1:29           ` Dave Jones
@ 2005-12-30 21:13           ` Marcelo Tosatti
  2005-12-31 20:13             ` Andi Kleen
  2 siblings, 1 reply; 36+ messages in thread
From: Marcelo Tosatti @ 2005-12-30 21:13 UTC (permalink / raw)
  To: Matt Mackall; +Cc: Andreas Kleen, Denis Vlasenko, Eric Dumazet, linux-kernel


<snip>

> > Note that just looking at slabinfo is not enough for this - you need the
> > original
> > sizes as passed to kmalloc, not the rounded values reported there.
> > Should be probably not too hard to hack a simple monitoring script up
> > for that
> > in systemtap to generate the data.
> 
> Something like this:
> 
> http://lwn.net/Articles/124374/

Written with a systemtap script: 
http://sourceware.org/ml/systemtap/2005-q3/msg00550.html



^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [POLL] SLAB : Are the 32 and 192 bytes caches really usefull on x86_64 machines ?
  2005-12-30 21:13           ` Marcelo Tosatti
@ 2005-12-31 20:13             ` Andi Kleen
  0 siblings, 0 replies; 36+ messages in thread
From: Andi Kleen @ 2005-12-31 20:13 UTC (permalink / raw)
  To: Marcelo Tosatti; +Cc: Matt Mackall, Denis Vlasenko, Eric Dumazet, linux-kernel

On Friday 30 December 2005 22:13, Marcelo Tosatti wrote:
> 
> <snip>
> 
> > > Note that just looking at slabinfo is not enough for this - you need the
> > > original
> > > sizes as passed to kmalloc, not the rounded values reported there.
> > > Should be probably not too hard to hack a simple monitoring script up
> > > for that
> > > in systemtap to generate the data.
> > 
> > Something like this:
> > 
> > http://lwn.net/Articles/124374/
> 
> Written with a systemtap script: 
> http://sourceware.org/ml/systemtap/2005-q3/msg00550.html

I had actually written a similar script on my own before,
but I found it was nearly unusable on a 4-core Opteron
system even under moderate load, because systemtap bombed out
whenever it needed more than one spin to take the lock of the
shared hash table.

(it basically did if (!spin_trylock()) ... stop script; ...) 

The problem was that the backtraces took so long that another
CPU very often ran into the locked lock.

Still, a stripped-down script without backtraces had some
interesting results. In particular, my init was reading some 
file in /proc 10 times a second, allocating 4K each time (wtf did
it do that?), plus some other somewhat surprising results.

-Andi

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [POLL] SLAB : Are the 32 and 192 bytes caches really usefull on x86_64 machines ?
  2005-12-28 17:57       ` Andreas Kleen
  2005-12-28 21:01         ` Matt Mackall
  2005-12-29 19:48         ` Steven Rostedt
@ 2006-01-02  8:37         ` Pekka Enberg
  2006-01-02 12:45           ` Andi Kleen
  2 siblings, 1 reply; 36+ messages in thread
From: Pekka Enberg @ 2006-01-02  8:37 UTC (permalink / raw)
  To: Andreas Kleen; +Cc: Denis Vlasenko, Eric Dumazet, linux-kernel

On 12/28/05, Andreas Kleen <ak@suse.de> wrote:
> I remember the original slab paper from Bonwick actually mentioned that
> power of two slabs are the worst choice for a malloc - but for some reason Linux
> chose them anyways.

Power of two sizes are bad because memory accesses tend to concentrate
on the same cache lines but slab coloring should take care of that. So
I don't think there's a problem with using power of twos for kmalloc()
caches.

                                   Pekka

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [POLL] SLAB : Are the 32 and 192 bytes caches really usefull on x86_64 machines ?
  2005-12-30  4:06             ` Steven Rostedt
@ 2006-01-02  8:46               ` Pekka Enberg
  2006-01-02  8:51                 ` Pekka Enberg
  2006-01-02 12:31                 ` Steven Rostedt
  0 siblings, 2 replies; 36+ messages in thread
From: Pekka Enberg @ 2006-01-02  8:46 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Dave Jones, linux-kernel, Eric Dumazet, Denis Vlasenko,
	Andreas Kleen, Matt Mackall

Hi,

On 12/30/05, Steven Rostedt <rostedt@goodmis.org> wrote:
> Attached is a variant that was refreshed against 2.6.15-rc7 and fixes
> the logical bug that your compile error fix made ;)
>
> It should be cachep->objsize not csizep->cs_size.

Isn't there any other way to do this patch other than making kzalloc()
and kstrdup() inline? I would like to see something like this in the
mainline but making them inline is not acceptable because they
increase kernel text a lot.

                       Pekka

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [POLL] SLAB : Are the 32 and 192 bytes caches really usefull on x86_64 machines ?
  2006-01-02  8:46               ` Pekka Enberg
@ 2006-01-02  8:51                 ` Pekka Enberg
  2006-01-02 12:33                   ` Steven Rostedt
  2006-01-02 12:31                 ` Steven Rostedt
  1 sibling, 1 reply; 36+ messages in thread
From: Pekka Enberg @ 2006-01-02  8:51 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Dave Jones, linux-kernel, Eric Dumazet, Denis Vlasenko,
	Andreas Kleen, Matt Mackall

On 12/30/05, Steven Rostedt <rostedt@goodmis.org> wrote:
> > Attached is a variant that was refreshed against 2.6.15-rc7 and fixes
> > the logical bug that your compile error fix made ;)
> >
> > It should be cachep->objsize not csizep->cs_size.

On 1/2/06, Pekka Enberg <penberg@cs.helsinki.fi> wrote:
> Isn't there any other way to do this patch other than making kzalloc()
> and kstrdup() inline? I would like to see something like this in the
> mainline but making them inline is not acceptable because they
> increase kernel text a lot.

Also, wouldn't it be better to track kmem_cache_alloc and
kmem_cache_alloc_node instead?

                                      Pekka

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [POLL] SLAB : Are the 32 and 192 bytes caches really usefull on x86_64 machines ?
  2006-01-02  8:46               ` Pekka Enberg
  2006-01-02  8:51                 ` Pekka Enberg
@ 2006-01-02 12:31                 ` Steven Rostedt
  1 sibling, 0 replies; 36+ messages in thread
From: Steven Rostedt @ 2006-01-02 12:31 UTC (permalink / raw)
  To: Pekka Enberg
  Cc: Dave Jones, linux-kernel, Eric Dumazet, Denis Vlasenko,
	Andreas Kleen, Matt Mackall

On Mon, 2 Jan 2006, Pekka Enberg wrote:

> Hi,
>
> On 12/30/05, Steven Rostedt <rostedt@goodmis.org> wrote:
> > Attached is a variant that was refreshed against 2.6.15-rc7 and fixes
> > the logical bug that your compile error fix made ;)
> >
> > It should be cachep->objsize not csizep->cs_size.
>
> Isn't there any other way to do this patch other than making kzalloc()
> and kstrdup() inline? I would like to see something like this in the
> mainline but making them inline is not acceptable because they
> increase kernel text a lot.

Actually, yes. I was adding to this patch something more specific,
to either pass the EIP through a parameter or a __FILE__/__LINE__ pair.

Using the following:

#ifdef CONFIG_KMALLOC_ACCOUNTING
# define __EIP__ , __builtin_return_address(0)
# define __DECLARE_EIP__ , void *eip
#else
# define __EIP__
# define __DECLARE_EIP__
#endif

#define kstrdup(s,g) __kstrdup(s, g __EIP__)
extern char *__kstrdup(const char *s, gfp_t g __DECLARE_EIP__);


Or a file line can be used:

# define __EIP__ , __FILE__, __LINE__
# define __DECLARE_EIP__ , char *file, int line



-- Steve



^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [POLL] SLAB : Are the 32 and 192 bytes caches really usefull on x86_64 machines ?
  2006-01-02  8:51                 ` Pekka Enberg
@ 2006-01-02 12:33                   ` Steven Rostedt
  0 siblings, 0 replies; 36+ messages in thread
From: Steven Rostedt @ 2006-01-02 12:33 UTC (permalink / raw)
  To: Pekka Enberg
  Cc: Dave Jones, linux-kernel, Eric Dumazet, Denis Vlasenko,
	Andreas Kleen, Matt Mackall


On Mon, 2 Jan 2006, Pekka Enberg wrote:

>
> Also, wouldn't it be better to track kmem_cache_alloc and
> kmem_cache_alloc_node instead?
>

I believe they are very interested in when kmalloc and kfree are used,
since those are the ones for the generic slabs.  And even then, they are
only profiling the ones that use a dynamic allocation (a kmalloc or
kfree of sizeof(x) is not profiled).  This was brought up earlier in the
thread.

-- Steve


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [POLL] SLAB : Are the 32 and 192 bytes caches really usefull on x86_64 machines ?
  2006-01-02  8:37         ` Pekka Enberg
@ 2006-01-02 12:45           ` Andi Kleen
  2006-01-02 13:04             ` Pekka J Enberg
  0 siblings, 1 reply; 36+ messages in thread
From: Andi Kleen @ 2006-01-02 12:45 UTC (permalink / raw)
  To: Pekka Enberg; +Cc: Denis Vlasenko, Eric Dumazet, linux-kernel

On Monday 02 January 2006 09:37, Pekka Enberg wrote:
> On 12/28/05, Andreas Kleen <ak@suse.de> wrote:
> > I remember the original slab paper from Bonwick actually mentioned that
> > power of two slabs are the worst choice for a malloc - but for some reason Linux
> > chose them anyways.
> 
> Power of two sizes are bad because memory accesses tend to concentrate
> on the same cache lines but slab coloring should take care of that. So
> I don't think there's a problem with using power of twos for kmalloc()
> caches.

There is - who tells you it's the best possible distribution of memory? 

-Andi



^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [POLL] SLAB : Are the 32 and 192 bytes caches really usefull on x86_64 machines ?
  2006-01-02 12:45           ` Andi Kleen
@ 2006-01-02 13:04             ` Pekka J Enberg
  2006-01-02 13:56               ` Andi Kleen
  0 siblings, 1 reply; 36+ messages in thread
From: Pekka J Enberg @ 2006-01-02 13:04 UTC (permalink / raw)
  To: Andi Kleen; +Cc: Denis Vlasenko, Eric Dumazet, linux-kernel

On 12/28/05, Andreas Kleen <ak@suse.de> wrote:
> > > I remember the original slab paper from Bonwick actually mentioned that
> > > power of two slabs are the worst choice for a malloc - but for some reason Linux
> > > chose them anyways.

On Monday 02 January 2006 09:37, Pekka Enberg wrote:
> > Power of two sizes are bad because memory accesses tend to concentrate
> > on the same cache lines but slab coloring should take care of that. So
> > I don't think there's a problem with using power of twos for kmalloc()
> > caches.
 
On Mon, 2 Jan 2006, Andi Kleen wrote:
> There is - who tells you it's the best possible distribution of memory?

Maybe it's not. But that's beside the point. The specific problem Bonwick 
mentioned is related to cache line distribution, and it should be taken care 
of by slab coloring. Internal fragmentation is painful, but the worst 
offenders can be fixed with kmem_cache_alloc(). So I really don't see the 
problem. On the other hand, I am not opposed to dynamic generic slabs if 
you can show a clear performance benefit from it. I just doubt you will.

			Pekka

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [POLL] SLAB : Are the 32 and 192 bytes caches really usefull on x86_64 machines ?
  2006-01-02 13:04             ` Pekka J Enberg
@ 2006-01-02 13:56               ` Andi Kleen
  2006-01-02 15:09                 ` Pekka J Enberg
  2006-01-02 15:46                 ` Jörn Engel
  0 siblings, 2 replies; 36+ messages in thread
From: Andi Kleen @ 2006-01-02 13:56 UTC (permalink / raw)
  To: Pekka J Enberg; +Cc: Denis Vlasenko, Eric Dumazet, linux-kernel

On Monday 02 January 2006 14:04, Pekka J Enberg wrote:

> Maybe it's not. But that's beside the point. 

It was my point. I don't know what your point was.

> The specific problem Bonwick  
> mentioned is related to cache line distribution and should be taken care 
> of by slab coloring. Internal fragmentation is painful but the worst 
> offenders can be fixed with kmem_cache_alloc(). So I really don't see the 
> problem. On the other hand, I am not opposed to dynamic generic slabs if 
> you can show a clear performance benefit from it. I just doubt you will.

I wasn't proposing fully dynamic slabs, just a better default set
of slabs based on real measurements instead of handwaving (which is
how the power-of-two slabs seem to have been chosen). With separate
sets for 32bit and 64bit. 

Also the goal wouldn't be better performance, but just less waste of memory.

I suspect such a move could save much more memory on small systems 
than any of these "make fundamental debugging tools a CONFIG" patches ever.

-Andi

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [POLL] SLAB : Are the 32 and 192 bytes caches really usefull on x86_64 machines ?
  2005-12-29  2:39               ` Dave Jones
@ 2006-01-02 15:03                 ` Helge Hafting
  0 siblings, 0 replies; 36+ messages in thread
From: Helge Hafting @ 2006-01-02 15:03 UTC (permalink / raw)
  To: Dave Jones, Keith Owens, Matt Mackall, linux-kernel

On Wed, Dec 28, 2005 at 09:39:06PM -0500, Dave Jones wrote:
> On Thu, Dec 29, 2005 at 12:50:10PM +1100, Keith Owens wrote:
>  > Dave Jones (on Wed, 28 Dec 2005 20:29:15 -0500) wrote:
>  > >
>  > > > Something like this:
>  > > > 
>  > > > http://lwn.net/Articles/124374/
>  > >
>  > >One thing that really sticks out like a sore thumb is soft_cursor()
>  > >That thing gets called a *lot*, and every time it does a kmalloc/free
>  > >pair that 99.9% of the time is going to be the same size alloc as
>  > >it was the last time.  This patch makes that alloc persistent
>  > >(and does a realloc if the size changes).
>  > >The only time it should change is if the font/resolution changes I think.
>  > 
>  > Can soft_cursor() be called from multiple processes at the same time,
>  > in particular with dual head systems?  If so then a static variable is
>  > not going to work.
> 
> My dual-head system here displays a cloned image on the second
> screen, which seems to dtrt.  I'm not sure how to make it show
> something different on the other head to test further.

Few dualhead drivers actually support two different framebuffers,
but the matrox G550 (and G400) drivers do.  Compile one
of those and make sure to configure dualhead support.
After booting up, use "matroxset" to set the
framebuffer-to-VGA-connector mapping so that the two
outputs actually show the different framebuffers.

Another way is to use several graphics cards (AGP getting
the first framebuffer and each PCI card getting others as
the drivers load.)

Helge Hafting

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [POLL] SLAB : Are the 32 and 192 bytes caches really usefull on x86_64 machines ?
  2006-01-02 13:56               ` Andi Kleen
@ 2006-01-02 15:09                 ` Pekka J Enberg
  2006-01-02 15:46                 ` Jörn Engel
  1 sibling, 0 replies; 36+ messages in thread
From: Pekka J Enberg @ 2006-01-02 15:09 UTC (permalink / raw)
  To: Andi Kleen; +Cc: Denis Vlasenko, Eric Dumazet, linux-kernel

On Mon, 2 Jan 2006, Andi Kleen wrote:
> I wasn't proposing fully dynamic slabs, just a better default set
> of slabs based on real measurements instead of handwaving (like
> the power of two slabs seemed to have been generated). With separate
> sets for 32bit and 64bit. 
> 
> Also the goal wouldn't be better performance, but just less waste of memory.
> 
> I suspect such a move could save much more memory on small systems 
> than any of these "make fundamental debugging tools a CONFIG" patches ever.

I misunderstood what you were proposing. Sorry. It makes sense to measure 
it.

			Pekka

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [POLL] SLAB : Are the 32 and 192 bytes caches really usefull on x86_64 machines ?
  2006-01-02 13:56               ` Andi Kleen
  2006-01-02 15:09                 ` Pekka J Enberg
@ 2006-01-02 15:46                 ` Jörn Engel
  1 sibling, 0 replies; 36+ messages in thread
From: Jörn Engel @ 2006-01-02 15:46 UTC (permalink / raw)
  To: Andi Kleen; +Cc: Pekka J Enberg, Denis Vlasenko, Eric Dumazet, linux-kernel

On Mon, 2 January 2006 14:56:22 +0100, Andi Kleen wrote:
> 
> I wasn't proposing fully dynamic slabs, just a better default set
> of slabs based on real measurements instead of handwaving (like
> the power of two slabs seemed to have been generated). With separate
> sets for 32bit and 64bit. 
> 
> Also the goal wouldn't be better performance, but just less waste of memory.

My fear would be that this leads to something like gperf: a
perfect distribution of slab caches - until any tiny detail changes.
But maybe there is a different distribution that is "pretty good" for
all configurations and better than powers of two.

> I suspect such a move could save much more memory on small systems 
> than any of these "make fundamental debugging tools a CONFIG" patches ever.

Unlikely.  SLOB should be better than SLAB for those purposes, no
matter how you arrange the slab caches.

Jörn

-- 
Fancy algorithms are slow when n is small, and n is usually small.
Fancy algorithms have big constants. Until you know that n is
frequently going to be big, don't get fancy.
-- Rob Pike

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [POLL] SLAB : Are the 32 and 192 bytes caches really usefull on x86_64 machines ?
  2005-12-29  1:50             ` Keith Owens
  2005-12-29  2:39               ` Dave Jones
@ 2006-01-04  5:26               ` Dave Jones
  1 sibling, 0 replies; 36+ messages in thread
From: Dave Jones @ 2006-01-04  5:26 UTC (permalink / raw)
  To: Keith Owens; +Cc: Matt Mackall, linux-kernel

On Thu, Dec 29, 2005 at 12:50:10PM +1100, Keith Owens wrote:
 > Dave Jones (on Wed, 28 Dec 2005 20:29:15 -0500) wrote:
 > >
 > > > Something like this:
 > > > 
 > > > http://lwn.net/Articles/124374/
 > >
 > >One thing that really sticks out like a sore thumb is soft_cursor()
 > >That thing gets called a *lot*, and every time it does a kmalloc/free
 > >pair that 99.9% of the time is going to be the same size alloc as
 > >it was the last time.  This patch makes that alloc persistent
 > >(and does a realloc if the size changes).
 > >The only time it should change is if the font/resolution changes I think.
 > 
 > Can soft_cursor() be called from multiple processes at the same time,
 > in particular with dual head systems?  If so then a static variable is
 > not going to work.

I looked at this a little more closely. If my understanding of the console/fb
layers is correct, soft_cursor() is serialised by the console_sem in
drivers/video/console/fbcon.c::fb_flashcursor().

		Dave
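The pattern under discussion — keep one buffer across calls and reallocate only when the required size changes — looks roughly like this. This is a userspace sketch with realloc() standing in for the kernel allocator, and the function name is made up; the real patch is in the LWN link above. In the kernel, the static state is only safe because fb_flashcursor() holds the console_sem around the call.

```c
#include <stdlib.h>

/* One buffer reused across calls; resized only when the cursor
 * image size changes (e.g. on a font or resolution change). */
static void *cursor_buf;
static size_t cursor_buf_size;

void *get_cursor_buffer(size_t size)
{
	if (size != cursor_buf_size) {
		void *p = realloc(cursor_buf, size);
		if (!p)
			return NULL;	/* old buffer is still valid on failure */
		cursor_buf = p;
		cursor_buf_size = size;
	}
	return cursor_buf;
}
```

A caller that previously did a kmalloc()/kfree() pair on every invocation now pays for an allocation only when the size changes — and since the size is fixed by font and resolution, that is almost never.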

^ permalink raw reply	[flat|nested] 36+ messages in thread

end of thread, other threads:[~2006-01-04  5:26 UTC | newest]

Thread overview: 36+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2005-12-21  8:00 [ANNOUNCE] GIT 1.0.0 Junio C Hamano
2005-12-21  9:11 ` [POLL] SLAB : Are the 32 and 192 bytes caches really usefull on x86_64 machines ? Eric Dumazet
2005-12-21  9:22   ` David S. Miller
2005-12-21 10:03     ` Jan-Benedict Glaw
2005-12-21  9:46   ` Alok kataria
2005-12-21 12:44   ` Ed Tomlinson
2005-12-21 13:20     ` Folkert van Heusden
2005-12-21 13:38       ` Eric Dumazet
2005-12-21 14:09         ` Folkert van Heusden
2005-12-21 16:40           ` Dave Jones
2005-12-21 19:36             ` Folkert van Heusden
2005-12-28  8:32   ` Denis Vlasenko
2005-12-28  8:54     ` Denis Vlasenko
2005-12-28 17:57       ` Andreas Kleen
2005-12-28 21:01         ` Matt Mackall
2005-12-29  1:26           ` Dave Jones
2005-12-30  4:06             ` Steven Rostedt
2006-01-02  8:46               ` Pekka Enberg
2006-01-02  8:51                 ` Pekka Enberg
2006-01-02 12:33                   ` Steven Rostedt
2006-01-02 12:31                 ` Steven Rostedt
2005-12-29  1:29           ` Dave Jones
2005-12-29  1:50             ` Keith Owens
2005-12-29  2:39               ` Dave Jones
2006-01-02 15:03                 ` Helge Hafting
2006-01-04  5:26               ` Dave Jones
2005-12-30 21:13           ` Marcelo Tosatti
2005-12-31 20:13             ` Andi Kleen
2005-12-29 19:48         ` Steven Rostedt
2005-12-29 21:16           ` Andi Kleen
2006-01-02  8:37         ` Pekka Enberg
2006-01-02 12:45           ` Andi Kleen
2006-01-02 13:04             ` Pekka J Enberg
2006-01-02 13:56               ` Andi Kleen
2006-01-02 15:09                 ` Pekka J Enberg
2006-01-02 15:46                 ` Jörn Engel

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox