* [ANNOUNCE] GIT 1.0.0
@ 2005-12-21  8:00 Junio C Hamano
  2005-12-21  9:11 ` [POLL] SLAB : Are the 32 and 192 bytes caches really usefull on x86_64 machines ? Eric Dumazet
  0 siblings, 1 reply; 36+ messages in thread
From: Junio C Hamano @ 2005-12-21  8:00 UTC (permalink / raw)
  To: git, linux-kernel

GIT 1.0.0 is found at the usual places:

  Tarball  http://www.kernel.org/pub/software/scm/git/
  RPM      http://www.kernel.org/pub/software/scm/git/RPMS/
  Debian   http://www.kernel.org/pub/software/scm/git/debian/
  GIT      git://git.kernel.org/pub/scm/git/git.git/

The name "1.0.0" ought to mean a significant milestone, but actually
it is not one.  Pre-1.0 versions have been in production use by the
kernel folks for quite some time, and the changes since 1.0rc are
pretty small, consisting primarily of documentation updates,
clone/fetch enhancements and miscellaneous bugfixes.

Thank you all who gave patches, comments and time.

Happy hacking, and a little early ho-ho-ho.

^ permalink raw reply	[flat|nested] 36+ messages in thread
* [POLL] SLAB : Are the 32 and 192 bytes caches really usefull on x86_64 machines ?
  2005-12-21  8:00 [ANNOUNCE] GIT 1.0.0 Junio C Hamano
@ 2005-12-21  9:11 ` Eric Dumazet
  2005-12-21  9:22   ` David S. Miller
  ` (3 more replies)
  0 siblings, 4 replies; 36+ messages in thread
From: Eric Dumazet @ 2005-12-21  9:11 UTC (permalink / raw)
  To: linux-kernel; +Cc: Andi Kleen

I wonder if the 32 and 192 byte caches are worth declaring in
include/linux/kmalloc_sizes.h, at least on x86_64.

(x86_64 : PAGE_SIZE = 4096, L1_CACHE_BYTES = 64)

On my machines, the 32 and 192 byte sizes could be avoided, in favor
of spending fewer CPU cycles in __find_general_cachep().

Could some of you post the result of the following command on your
machines?

# grep "size-" /proc/slabinfo |grep -v DMA|cut -c1-40

size-131072            0        0  131072
size-65536             0        0   65536
size-32768             2        2   32768
size-16384             0        0   16384
size-8192             13       13    8192
size-4096            161      161    4096
size-2048          40564    42976    2048
size-1024            681      800    1024
size-512           19792    37168     512
size-256              81      105     256
size-192            1218     1280     192
size-64            31278    86907      64
size-128            5457    10380     128
size-32              594      784      32

Thank you

PS: I have no idea why the last lines (size-192, 64, 128, 32) are not
ordered...

Eric

^ permalink raw reply	[flat|nested] 36+ messages in thread
* Re: [POLL] SLAB : Are the 32 and 192 bytes caches really usefull on x86_64 machines ?
  2005-12-21  9:11 ` [POLL] SLAB : Are the 32 and 192 bytes caches really usefull on x86_64 machines ? Eric Dumazet
@ 2005-12-21  9:22   ` David S. Miller
  2005-12-21 10:03     ` Jan-Benedict Glaw
  2005-12-21  9:46   ` Alok kataria
  ` (2 subsequent siblings)
  3 siblings, 1 reply; 36+ messages in thread
From: David S. Miller @ 2005-12-21  9:22 UTC (permalink / raw)
  To: dada1; +Cc: linux-kernel, ak

From: Eric Dumazet <dada1@cosmosbay.com>
Date: Wed, 21 Dec 2005 10:11:51 +0100

> Could some of you post the result of the following command on your machines :

sparc64, PAGE_SIZE=8192, L1_CACHE_BYTES=32

size-131072            0        0  131072
size-65536            13       13   65536
size-32768             2        2   32768
size-16384             2        2   16384
size-8192             67       67    8192
size-4096             75       76    4096
size-2048            303      308    2048
size-1024            176      176    1024
size-512             251      255     512
size-256             217      217     256
size-192            1230     1230     192
size-128             106      122     128
size-96             1098     1134      96
size-64            29387    30226      64

^ permalink raw reply	[flat|nested] 36+ messages in thread
* Re: [POLL] SLAB : Are the 32 and 192 bytes caches really usefull on x86_64 machines ?
  2005-12-21  9:22 ` David S. Miller
@ 2005-12-21 10:03   ` Jan-Benedict Glaw
  0 siblings, 0 replies; 36+ messages in thread
From: Jan-Benedict Glaw @ 2005-12-21 10:03 UTC (permalink / raw)
  To: David S. Miller; +Cc: dada1, linux-kernel, ak

[-- Attachment #1: Type: text/plain, Size: 1370 bytes --]

On Wed, 2005-12-21 01:22:12 -0800, David S. Miller <davem@davemloft.net> wrote:
> From: Eric Dumazet <dada1@cosmosbay.com>
> Date: Wed, 21 Dec 2005 10:11:51 +0100
> 
> > Could some of you post the result of the following command on your machines :

VAX KA650 (simulated), 4k pages (hw-size is 512 Bytes, though), L1_CACHE_BYTES=32

# grep "size-" /proc/slabinfo |grep -v DMA|cut -c1-40

size-131072            0        0  131072
size-65536             0        0   65536
size-32768             0        0   32768
size-16384             0        0   16384
size-8192              0        0    8192
size-4096             21       21    4096
size-2048             39       42    2060
size-1024             18       21    1036
size-512              70       70     524
size-256               5       14     268
size-192             722      722     204
size-128             145      168     140
size-96              382      396     108
size-32             1040     1092      44
size-64              338      350      76

MfG, JBG

-- 
Jan-Benedict Glaw       jbglaw@lug-owl.de    . +49-172-7608481             _ O _
"Eine Freie Meinung in  einem Freien Kopf    | Gegen Zensur | Gegen Krieg  _ _ O
 für einen Freien Staat voll Freier Bürger"  | im Internet! |   im Irak!   O O O
ret = do_actions((curr | FREE_SPEECH) & ~(NEW_COPYRIGHT_LAW | DRM | TCPA));

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 36+ messages in thread
* Re: [POLL] SLAB : Are the 32 and 192 bytes caches really usefull on x86_64 machines ?
  2005-12-21  9:11 ` [POLL] SLAB : Are the 32 and 192 bytes caches really usefull on x86_64 machines ? Eric Dumazet
  2005-12-21  9:22   ` David S. Miller
@ 2005-12-21  9:46   ` Alok kataria
  2005-12-21 12:44   ` Ed Tomlinson
  2005-12-28  8:32   ` Denis Vlasenko
  3 siblings, 0 replies; 36+ messages in thread
From: Alok kataria @ 2005-12-21  9:46 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: linux-kernel, Andi Kleen

On 12/21/05, Eric Dumazet <dada1@cosmosbay.com> wrote:
> I wonder if the 32 and 192 bytes caches are worth to be declared in
> include/linux/kmalloc_sizes.h, at least on x86_64
> 
> (x86_64 : PAGE_SIZE = 4096, L1_CACHE_BYTES = 64)
> 
> On my machines, I can say that the 32 and 192 sizes could be avoided in favor
> in spending less cpu cycles in __find_general_cachep()
> 
> Could some of you post the result of the following command on your machines :
> 
> # grep "size-" /proc/slabinfo |grep -v DMA|cut -c1-40
> 
> size-131072            0        0  131072
> size-65536             0        0   65536
> size-32768             2        2   32768
> size-16384             0        0   16384
> size-8192             13       13    8192
> size-4096            161      161    4096
> size-2048          40564    42976    2048
> size-1024            681      800    1024
> size-512           19792    37168     512
> size-256              81      105     256
> size-192            1218     1280     192
> size-64            31278    86907      64
> size-128            5457    10380     128
> size-32              594      784      32
> 
> Thank you
> 
> PS : I have no idea why the last lines (size-192, 64, 128, 32) are not ordered...

The size-32 and size-128 caches are created before any other cache, as
the array_caches (arraycache_init) and kmem_list3 structures come from
these caches.  Thus these caches are added to the cache_chain before
the other caches, and s_show just walks this chain and prints info for
each cache.

Before l3 was converted into a pointer (per-node slabs) we could
initialize the caches in order, as we knew that the arraycache_init
would always fit in the first cache.
Thanks & Regards, Alok > > Eric > - > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ > ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [POLL] SLAB : Are the 32 and 192 bytes caches really usefull on x86_64 machines ?
  2005-12-21  9:11 ` [POLL] SLAB : Are the 32 and 192 bytes caches really usefull on x86_64 machines ? Eric Dumazet
  2005-12-21  9:22   ` David S. Miller
  2005-12-21  9:46   ` Alok kataria
@ 2005-12-21 12:44   ` Ed Tomlinson
  2005-12-21 13:20     ` Folkert van Heusden
  2005-12-28  8:32   ` Denis Vlasenko
  3 siblings, 1 reply; 36+ messages in thread
From: Ed Tomlinson @ 2005-12-21 12:44 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: linux-kernel, Andi Kleen

On Wednesday 21 December 2005 04:11, Eric Dumazet wrote:
> (x86_64 : PAGE_SIZE = 4096, L1_CACHE_BYTES = 64)
> 
> On my machines, I can say that the 32 and 192 sizes could be avoided in favor
> in spending less cpu cycles in __find_general_cachep()
> 
> Could some of you post the result of the following command on your machines :
> 
> # grep "size-" /proc/slabinfo |grep -v DMA|cut -c1-40

size-131072            0        0  131072
size-65536             3        3   65536
size-32768             0        0   32768
size-16384             3        3   16384
size-8192             28       28    8192
size-4096            184      184    4096
size-2048            272      272    2048
size-1024            300      300    1024
size-512             275      376     512
size-256             717      720     256
size-192            1120     1220     192
size-64             7720     8568      64
size-128           45019    65830     128
size-32             1627     3333      32

amd64 up

Ed Tomlinson

^ permalink raw reply	[flat|nested] 36+ messages in thread
* Re: [POLL] SLAB : Are the 32 and 192 bytes caches really usefull on x86_64 machines ?
  2005-12-21 12:44 ` Ed Tomlinson
@ 2005-12-21 13:20   ` Folkert van Heusden
  2005-12-21 13:38     ` Eric Dumazet
  0 siblings, 1 reply; 36+ messages in thread
From: Folkert van Heusden @ 2005-12-21 13:20 UTC (permalink / raw)
  To: Ed Tomlinson; +Cc: Eric Dumazet, linux-kernel, Andi Kleen

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

> > (x86_64 : PAGE_SIZE = 4096, L1_CACHE_BYTES = 64)
> > On my machines, I can say that the 32 and 192 sizes could be avoided in favor
> > in spending less cpu cycles in __find_general_cachep()
> > Could some of you post the result of the following command on your machines :
> > # grep "size-" /proc/slabinfo |grep -v DMA|cut -c1-40
> size-131072            0        0  131072
> size-65536             3        3   65536
> size-32768             0        0   32768
> size-16384             3        3   16384
> size-8192             28       28    8192
> size-4096            184      184    4096
> size-2048            272      272    2048
> size-1024            300      300    1024
> size-512             275      376     512
> size-256             717      720     256
> size-192            1120     1220     192
> size-64             7720     8568      64
> size-128           45019    65830     128
> size-32             1627     3333      32

size-131072            0        0  131072
size-65536             0        0   65536
size-32768            20       20   32768
size-16384             8        9   16384
size-8192             37       38    8192
size-4096            269      269    4096
size-2048            793      910    2048
size-1024            564      608    1024
size-512             702      856     512
size-256            1485     4005     256
size-128            1209     1350     128
size-64             2858     3363      64
size-32             1538     2714      64

Intel(R) Xeon(TM) MP CPU 3.00GHz
address sizes   : 40 bits physical, 48 bits virtual

Folkert van Heusden

- -- 
Try MultiTail! Multiple windows with logfiles, filtered with regular
expressions, colored output, etc. etc. www.vanheusden.com/multitail/
- ----------------------------------------------------------------------
Get your PGP/GPG key signed at www.biglumber.com!
- ---------------------------------------------------------------------- Phone: +31-6-41278122, PGP-key: 1F28D8AE, www.vanheusden.com -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.1 (GNU/Linux) iIMEARECAEMFAkOpVq48Gmh0dHA6Ly93d3cudmFuaGV1c2Rlbi5jb20vZGF0YS1z aWduaW5nLXdpdGgtcGdwLXBvbGljeS5odG1sAAoJEDAZDowfKNiuUUEAnR9DJq5M x+Bj1R+djzCli3bFrJXKAJ9OmCx9FKDaGl6PocRwCZSKURerPA== =vQhF -----END PGP SIGNATURE----- ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [POLL] SLAB : Are the 32 and 192 bytes caches really usefull on x86_64 machines ?
  2005-12-21 13:20 ` Folkert van Heusden
@ 2005-12-21 13:38   ` Eric Dumazet
  2005-12-21 14:09     ` Folkert van Heusden
  0 siblings, 1 reply; 36+ messages in thread
From: Eric Dumazet @ 2005-12-21 13:38 UTC (permalink / raw)
  To: Folkert van Heusden; +Cc: Ed Tomlinson, linux-kernel, Andi Kleen

Folkert van Heusden a écrit :
> 
> size-131072            0        0  131072
> size-65536             0        0   65536
> size-32768            20       20   32768
> size-16384             8        9   16384
> size-8192             37       38    8192
> size-4096            269      269    4096
> size-2048            793      910    2048
> size-1024            564      608    1024
> size-512             702      856     512
> size-256            1485     4005     256
> size-128            1209     1350     128
> size-64             2858     3363      64
> size-32             1538     2714      64
> Intel(R) Xeon(TM) MP CPU 3.00GHz
> address sizes   : 40 bits physical, 48 bits virtual
> 
> Folkert van Heusden

Hi Folkert

Your results are interesting: size-32 seems to use objects of size 64!

> size-32             1538     2714      64   <<HERE>>

So I guess that the size-32 cache could be avoided, at least for EM64T
(I take it you run a 64-bit kernel?)

Eric

^ permalink raw reply	[flat|nested] 36+ messages in thread
* Re: [POLL] SLAB : Are the 32 and 192 bytes caches really usefull on x86_64 machines ? 2005-12-21 13:38 ` Eric Dumazet @ 2005-12-21 14:09 ` Folkert van Heusden 2005-12-21 16:40 ` Dave Jones 0 siblings, 1 reply; 36+ messages in thread From: Folkert van Heusden @ 2005-12-21 14:09 UTC (permalink / raw) To: Eric Dumazet; +Cc: Ed Tomlinson, linux-kernel, Andi Kleen -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 > >size-131072 0 0 131072 > >size-65536 0 0 65536 > >size-32768 20 20 32768 > >size-16384 8 9 16384 > >size-8192 37 38 8192 > >size-4096 269 269 4096 > >size-2048 793 910 2048 > >size-1024 564 608 1024 > >size-512 702 856 512 > >size-256 1485 4005 256 > >size-128 1209 1350 128 > >size-64 2858 3363 64 > >size-32 1538 2714 64 > >Intel(R) Xeon(TM) MP CPU 3.00GHz > >address sizes : 40 bits physical, 48 bits virtual > > Your results are interesting : size-32 seems to use objects of size 64 ! > > size-32 1538 2714 64 <<HERE>> > So I guess that size-32 cache could be avoided at least for EMT (I take you > run a 64 bits kernel ?) I think I do yes: Linux xxxxx 2.4.21-37.EL #1 SMP Wed Sep 7 13:32:18 EDT 2005 x86_64 x86_64 x86_64 GNU/Linux It is a redhat 4 x64 system. Also from /proc/cpuinfo: address sizes : 40 bits physical, 48 bits virtual Folkert van Heusden - -- Try MultiTail! Multiple windows with logfiles, filtered with regular expressions, colored output, etc. etc. www.vanheusden.com/multitail/ - ---------------------------------------------------------------------- Get your PGP/GPG key signed at www.biglumber.com! 
- ---------------------------------------------------------------------- Phone: +31-6-41278122, PGP-key: 1F28D8AE, www.vanheusden.com -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.1 (GNU/Linux) iIMEARECAEMFAkOpYf08Gmh0dHA6Ly93d3cudmFuaGV1c2Rlbi5jb20vZGF0YS1z aWduaW5nLXdpdGgtcGdwLXBvbGljeS5odG1sAAoJEDAZDowfKNiugqYAoJWSoI9M O1sYrhWfFCoyTWweGN29AKCfPy46A1XHYC598IN4TXRSV2u6QA== =xMjS -----END PGP SIGNATURE----- ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [POLL] SLAB : Are the 32 and 192 bytes caches really usefull on x86_64 machines ? 2005-12-21 14:09 ` Folkert van Heusden @ 2005-12-21 16:40 ` Dave Jones 2005-12-21 19:36 ` Folkert van Heusden 0 siblings, 1 reply; 36+ messages in thread From: Dave Jones @ 2005-12-21 16:40 UTC (permalink / raw) To: Folkert van Heusden; +Cc: Eric Dumazet, Ed Tomlinson, linux-kernel, Andi Kleen On Wed, Dec 21, 2005 at 03:09:02PM +0100, Folkert van Heusden wrote: > > Your results are interesting : size-32 seems to use objects of size 64 ! > > > size-32 1538 2714 64 <<HERE>> > > So I guess that size-32 cache could be avoided at least for EMT (I take you > > run a 64 bits kernel ?) > > I think I do yes: > Linux xxxxx 2.4.21-37.EL #1 SMP Wed Sep 7 13:32:18 EDT 2005 x86_64 x86_64 x86_64 GNU/Linux > It is a redhat 4 x64 system. Looks more like RHEL3 judging from the kernel version. Dave ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [POLL] SLAB : Are the 32 and 192 bytes caches really usefull on x86_64 machines ? 2005-12-21 16:40 ` Dave Jones @ 2005-12-21 19:36 ` Folkert van Heusden 0 siblings, 0 replies; 36+ messages in thread From: Folkert van Heusden @ 2005-12-21 19:36 UTC (permalink / raw) To: Dave Jones, Eric Dumazet, Ed Tomlinson, linux-kernel, Andi Kleen > > > Your results are interesting : size-32 seems to use objects of size 64 ! > > > > size-32 1538 2714 64 <<HERE>> > > > So I guess that size-32 cache could be avoided at least for EMT (I take you > > > run a 64 bits kernel ?) > > I think I do yes: > > Linux xxxxx 2.4.21-37.EL #1 SMP Wed Sep 7 13:32:18 EDT 2005 x86_64 x86_64 x86_64 GNU/Linux > > It is a redhat 4 x64 system. > Looks more like RHEL3 judging from the kernel version. Ehr yes, you're totally right. Folkert van Heusden -- Try MultiTail! Multiple windows with logfiles, filtered with regular expressions, colored output, etc. etc. www.vanheusden.com/multitail/ ---------------------------------------------------------------------- Get your PGP/GPG key signed at www.biglumber.com! ---------------------------------------------------------------------- Phone: +31-6-41278122, PGP-key: 1F28D8AE, www.vanheusden.com ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [POLL] SLAB : Are the 32 and 192 bytes caches really usefull on x86_64 machines ?
  2005-12-21  9:11 ` [POLL] SLAB : Are the 32 and 192 bytes caches really usefull on x86_64 machines ? Eric Dumazet
  ` (2 preceding siblings ...)
  2005-12-21 12:44   ` Ed Tomlinson
@ 2005-12-28  8:32   ` Denis Vlasenko
  2005-12-28  8:54     ` Denis Vlasenko
  3 siblings, 1 reply; 36+ messages in thread
From: Denis Vlasenko @ 2005-12-28  8:32 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: linux-kernel, Andi Kleen

On Wednesday 21 December 2005 11:11, Eric Dumazet wrote:
> I wonder if the 32 and 192 bytes caches are worth to be declared in
> include/linux/kmalloc_sizes.h, at least on x86_64
> 
> (x86_64 : PAGE_SIZE = 4096, L1_CACHE_BYTES = 64)
> 
> On my machines, I can say that the 32 and 192 sizes could be avoided in favor
> in spending less cpu cycles in __find_general_cachep()
> 
> Could some of you post the result of the following command on your machines :
> 
> # grep "size-" /proc/slabinfo |grep -v DMA|cut -c1-40

# grep "size-" /proc/slabinfo |grep -v DMA|cut -c1-40

size-131072            0        0  131072
size-65536             0        0   65536
size-32768             1        1   32768
size-16384             0        0   16384
size-8192            253      253    8192
size-4096             89       89    4096
size-2048            248      248    2048
size-1024            312      312    1024
size-512             545      648     512
size-256             213      270     256
size-128            5642     5642     128
size-64             1025     1586      64
size-32             2262     7854      32

^ permalink raw reply	[flat|nested] 36+ messages in thread
* Re: [POLL] SLAB : Are the 32 and 192 bytes caches really usefull on x86_64 machines ? 2005-12-28 8:32 ` Denis Vlasenko @ 2005-12-28 8:54 ` Denis Vlasenko 2005-12-28 17:57 ` Andreas Kleen 0 siblings, 1 reply; 36+ messages in thread From: Denis Vlasenko @ 2005-12-28 8:54 UTC (permalink / raw) To: Eric Dumazet; +Cc: linux-kernel, Andi Kleen On Wednesday 28 December 2005 10:32, Denis Vlasenko wrote: > On Wednesday 21 December 2005 11:11, Eric Dumazet wrote: > > I wonder if the 32 and 192 bytes caches are worth to be declared in > > include/linux/kmalloc_sizes.h, at least on x86_64 > > > > (x86_64 : PAGE_SIZE = 4096, L1_CACHE_BYTES = 64) > > > > On my machines, I can say that the 32 and 192 sizes could be avoided in favor > > in spending less cpu cycles in __find_general_cachep() > > > > Could some of you post the result of the following command on your machines : > > > > # grep "size-" /proc/slabinfo |grep -v DMA|cut -c1-40 > > > > size-131072 0 0 131072 > > size-65536 0 0 65536 > > size-32768 2 2 32768 > > size-16384 0 0 16384 > > size-8192 13 13 8192 > > size-4096 161 161 4096 > > size-2048 40564 42976 2048 > > size-1024 681 800 1024 > > size-512 19792 37168 512 > > size-256 81 105 256 > > size-192 1218 1280 192 > > size-64 31278 86907 64 > > size-128 5457 10380 128 > > size-32 594 784 32 > > # grep "size-" /proc/slabinfo |grep -v DMA|cut -c1-40 > size-131072 0 0 131072 > size-65536 0 0 65536 > size-32768 1 1 32768 > size-16384 0 0 16384 > size-8192 253 253 8192 > size-4096 89 89 4096 > size-2048 248 248 2048 > size-1024 312 312 1024 > size-512 545 648 512 > size-256 213 270 256 > size-128 5642 5642 128 > size-64 1025 1586 64 > size-32 2262 7854 32 Wow... I overlooked that you are requesting data from x86_64 boxes. Mine is not, it's i386... -- vda ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [POLL] SLAB : Are the 32 and 192 bytes caches really usefull on x86_64 machines ?
  2005-12-28  8:54 ` Denis Vlasenko
@ 2005-12-28 17:57   ` Andreas Kleen
  2005-12-28 21:01     ` Matt Mackall
  ` (2 more replies)
  2005-12-29 19:48     ` Steven Rostedt
  2006-01-02  8:37     ` Pekka Enberg
  2 siblings, 3 replies; 36+ messages in thread
From: Andreas Kleen @ 2005-12-28 17:57 UTC (permalink / raw)
  To: Denis Vlasenko; +Cc: Eric Dumazet, linux-kernel

On Wed, 28 Dec 2005 09:54, Denis Vlasenko <vda@ilport.com.ua> wrote:

> > # grep "size-" /proc/slabinfo |grep -v DMA|cut -c1-40
> > 
> > size-131072            0        0  131072
> > size-65536             0        0   65536
> > size-32768             1        1   32768
> > size-16384             0        0   16384
> > size-8192            253      253    8192
> > size-4096             89       89    4096
> > size-2048            248      248    2048
> > size-1024            312      312    1024
> > size-512             545      648     512
> > size-256             213      270     256
> > size-128            5642     5642     128
> > size-64             1025     1586      64
> > size-32             2262     7854      32
> 
> Wow... I overlooked that you are requesting data from x86_64 boxes.
> Mine is not, it's i386...

This whole discussion is pointless anyway, because most kmallocs are
constant sized, and with a constant sized kmalloc the slab is selected
at compile time.

What would be more interesting would be to redo the complete kmalloc
slab list.

I remember the original slab paper from Bonwick actually mentioned that
power-of-two slabs are the worst choice for a malloc - but for some
reason Linux chose them anyway.  That would require a lot of
measurements of the actual kmalloc sizes in different workloads, and
then selecting a good list, but it could ultimately save a lot of
memory (OK, not that much anymore, because the memory-intensive
allocations should all have their own caches, but at least some).

Most likely the best list is different for 32-bit and 64-bit too.

Note that just looking at slabinfo is not enough for this - you need
the original sizes as passed to kmalloc, not the rounded values
reported there.  It should probably not be too hard to hack up a
simple monitoring script for that in systemtap to generate the data.
-Andi ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [POLL] SLAB : Are the 32 and 192 bytes caches really usefull on x86_64 machines ? 2005-12-28 17:57 ` Andreas Kleen @ 2005-12-28 21:01 ` Matt Mackall 2005-12-29 1:26 ` Dave Jones ` (2 more replies) 2005-12-29 19:48 ` Steven Rostedt 2006-01-02 8:37 ` Pekka Enberg 2 siblings, 3 replies; 36+ messages in thread From: Matt Mackall @ 2005-12-28 21:01 UTC (permalink / raw) To: Andreas Kleen; +Cc: Denis Vlasenko, Eric Dumazet, linux-kernel On Wed, Dec 28, 2005 at 06:57:15PM +0100, Andreas Kleen wrote: > Am Mi 28.12.2005 09:54 schrieb Denis Vlasenko <vda@ilport.com.ua>: > > > > # grep "size-" /proc/slabinfo |grep -v DMA|cut -c1-40 > > > size-131072 0 0 131072 > > > size-65536 0 0 65536 > > > size-32768 1 1 32768 > > > size-16384 0 0 16384 > > > size-8192 253 253 8192 > > > size-4096 89 89 4096 > > > size-2048 248 248 2048 > > > size-1024 312 312 1024 > > > size-512 545 648 512 > > > size-256 213 270 256 > > > size-128 5642 5642 128 > > > size-64 1025 1586 64 > > > size-32 2262 7854 32 > > > > Wow... I overlooked that you are requesting data from x86_64 boxes. > > Mine is not, it's i386... > > This whole discussion is pointless anyways because most kmallocs are > constant > sized and with a constant sized kmalloc the slab is selected at compile > time. > > What would be more interesting would be to redo the complete kmalloc > slab list. > > I remember the original slab paper from Bonwick actually mentioned that > power of > two slabs are the worst choice for a malloc - but for some reason Linux > chose them > anyways. That would require a lot of measurements in different workloads > on the > actual kmalloc sizes and then select a good list, but could ultimately > safe > a lot of memory (ok not that much anymore because the memory intensive > allocations should all have their own caches, but at least some) > > Most likely the best list is different for 32bit and 64bit too. 
> > Note that just looking at slabinfo is not enough for this - you need the > original > sizes as passed to kmalloc, not the rounded values reported there. > Should be probably not too hard to hack a simple monitoring script up > for that > in systemtap to generate the data. Something like this: http://lwn.net/Articles/124374/ -- Mathematics is the supreme nostalgia of our time. ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [POLL] SLAB : Are the 32 and 192 bytes caches really usefull on x86_64 machines ?
  2005-12-28 21:01 ` Matt Mackall
@ 2005-12-29  1:26   ` Dave Jones
  2005-12-30  4:06     ` Steven Rostedt
  2005-12-29  1:29   ` Dave Jones
  2005-12-30 21:13   ` Marcelo Tosatti
  2 siblings, 1 reply; 36+ messages in thread
From: Dave Jones @ 2005-12-29  1:26 UTC (permalink / raw)
  To: Matt Mackall; +Cc: Andreas Kleen, Denis Vlasenko, Eric Dumazet, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 300 bytes --]

On Wed, Dec 28, 2005 at 03:01:25PM -0600, Matt Mackall wrote:

 > Something like this:
 > 
 > http://lwn.net/Articles/124374/

Nice toy.  Variant attached that works on 2.6.15rc7
- ->cs_size compile error fixed
- inlines kstrdup and kzalloc.  Otherwise these functions dominate the profile.

		Dave

[-- Attachment #2: linux-2.6-debug-account-kmalloc.patch --]
[-- Type: text/plain, Size: 12834 bytes --]

/proc/kmalloc allocation tracing

 tiny-mpm/fs/proc/proc_misc.c  |   21 ++++
 tiny-mpm/include/linux/slab.h |   19 ++++
 tiny-mpm/init/Kconfig         |    7 +
 tiny-mpm/mm/Makefile          |    2 
 tiny-mpm/mm/kmallocacct.c     |  182 ++++++++++++++++++++++++++++++++++++++++++
 tiny-mpm/mm/slab.c            |    7 +
 6 files changed, 237 insertions(+), 1 deletion(-)

Index: tiny/init/Kconfig
===================================================================
--- tiny.orig/init/Kconfig	2005-10-10 17:41:44.000000000 -0700
+++ tiny/init/Kconfig	2005-10-10 17:41:46.000000000 -0700
@@ -315,6 +315,13 @@ config BUG
 	  option for embedded systems with no facilities for reporting errors.
 	  Just say Y.
 
+config KMALLOC_ACCOUNTING
+	default n
+	bool "Enabled accounting of kmalloc/kfree allocations"
+	help
+	  This option records kmalloc and kfree activity and reports it via
+	  /proc/kmalloc.
+
 config BASE_FULL
 	default y
 	bool "Enable full-sized data structures for core" if EMBEDDED

Index: tiny/mm/slab.c
===================================================================
--- tiny.orig/mm/slab.c	2005-10-10 17:32:51.000000000 -0700
+++ tiny/mm/slab.c	2005-10-10 17:41:46.000000000 -0700
@@ -2911,6 +2911,8 @@ EXPORT_SYMBOL(kmalloc_node);
 void *__kmalloc(size_t size, unsigned int __nocast flags)
 {
 	kmem_cache_t *cachep;
+	struct cache_sizes *csizep = malloc_sizes;
+	void *a;
 
 	/* If you want to save a few bytes .text space: replace
 	 * __ with kmem_.
@@ -2920,7 +2921,9 @@ void *__kmalloc(size_t size, unsigned in
 	cachep = __find_general_cachep(size, flags);
 	if (unlikely(cachep == NULL))
 		return NULL;
-	return __cache_alloc(cachep, flags);
+	a = __cache_alloc(cachep, flags);
+	kmalloc_account(a, csizep->cs_size, size);
+	return a;
 }
 EXPORT_SYMBOL(__kmalloc);
 
@@ -3020,6 +3023,8 @@ void kfree(const void *objp)
 	kmem_cache_t *c;
 	unsigned long flags;
 
+	kfree_account(objp, ksize(objp));
+
 	if (unlikely(!objp))
 		return;
 	local_irq_save(flags);

Index: tiny/mm/kmallocacct.c
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ tiny/mm/kmallocacct.c	2005-10-10 17:41:46.000000000 -0700
@@ -0,0 +1,182 @@
+#include <linux/config.h>
+#include <linux/seq_file.h>
+#include <linux/kallsyms.h>
+
+struct kma_caller {
+	const void *caller;
+	int total, net, slack, allocs, frees;
+};
+
+struct kma_list {
+	int callerhash;
+	const void *address;
+};
+
+#define MAX_CALLER_TABLE 512
+#define MAX_ALLOC_TRACK 4096
+
+#define kma_hash(address, size) (((u32)address / (u32)size) % size)
+
+static struct kma_list kma_alloc[MAX_ALLOC_TRACK];
+static struct kma_caller kma_caller[MAX_CALLER_TABLE];
+
+static int kma_callers;
+static int kma_lost_callers, kma_lost_allocs, kma_unknown_frees;
+static int kma_total, kma_net, kma_slack, kma_allocs, kma_frees;
+static spinlock_t kma_lock = SPIN_LOCK_UNLOCKED;
+
+void __kmalloc_account(const void *caller, const void *addr, int size, int req)
+{
+	int i, hasha, hashc;
+	unsigned long flags;
+
+	spin_lock_irqsave(&kma_lock, flags);
+	if(req >= 0) /* kmalloc */
+	{
+		/* find callers slot */
+		hashc = kma_hash(caller, MAX_CALLER_TABLE);
+		for (i = 0; i < MAX_CALLER_TABLE; i++) {
+			if (!kma_caller[hashc].caller ||
+			    kma_caller[hashc].caller == caller)
+				break;
+			hashc = (hashc + 1) % MAX_CALLER_TABLE;
+		}
+
+		if (!kma_caller[hashc].caller)
+			kma_callers++;
+
+		if (i < MAX_CALLER_TABLE) {
+			/* update callers stats */
+			kma_caller[hashc].caller = caller;
+			kma_caller[hashc].total += size;
+			kma_caller[hashc].net += size;
+			kma_caller[hashc].slack += size - req;
+			kma_caller[hashc].allocs++;
+
+			/* add malloc to list */
+			hasha = kma_hash(addr, MAX_ALLOC_TRACK);
+			for (i = 0; i < MAX_ALLOC_TRACK; i++) {
+				if (!kma_alloc[hasha].callerhash)
+					break;
+				hasha = (hasha + 1) % MAX_ALLOC_TRACK;
+			}
+
+			if(i < MAX_ALLOC_TRACK) {
+				kma_alloc[hasha].callerhash = hashc;
+				kma_alloc[hasha].address = addr;
+			}
+			else
+				kma_lost_allocs++;
+		}
+		else {
+			kma_lost_callers++;
+			kma_lost_allocs++;
+		}
+
+		kma_total += size;
+		kma_net += size;
+		kma_slack += size - req;
+		kma_allocs++;
+	}
+	else { /* kfree */
+		hasha = kma_hash(addr, MAX_ALLOC_TRACK);
+		for (i = 0; i < MAX_ALLOC_TRACK ; i++) {
+			if (kma_alloc[hasha].address == addr)
+				break;
+			hasha = (hasha + 1) % MAX_ALLOC_TRACK;
+		}
+
+		if (i < MAX_ALLOC_TRACK) {
+			hashc = kma_alloc[hasha].callerhash;
+			kma_alloc[hasha].callerhash = 0;
+			kma_caller[hashc].net -= size;
+			kma_caller[hashc].frees++;
+		}
+		else
+			kma_unknown_frees++;
+
+		kma_net -= size;
+		kma_frees++;
+	}
+	spin_unlock_irqrestore(&kma_lock, flags);
+}
+
+static void *as_start(struct seq_file *m, loff_t *pos)
+{
+	int i;
+	loff_t n = *pos;
+
+	if (!n) {
+		seq_printf(m, "total bytes allocated: %8d\n", kma_total);
+		seq_printf(m, "slack bytes allocated: %8d\n", kma_slack);
+		seq_printf(m, "net bytes allocated:   %8d\n", kma_net);
+		seq_printf(m, "number of allocs:      %8d\n", kma_allocs);
+		seq_printf(m, "number of frees:       %8d\n", kma_frees);
+		seq_printf(m, "number of callers:     %8d\n", kma_callers);
+		seq_printf(m, "lost callers:          %8d\n",
+			   kma_lost_callers);
+		seq_printf(m, "lost allocs:           %8d\n",
+			   kma_lost_allocs);
+		seq_printf(m, "unknown frees:         %8d\n",
+			   kma_unknown_frees);
+		seq_puts(m, "\n   total    slack      net alloc/free  caller\n");
+	}
+
+	for (i = 0; i < MAX_CALLER_TABLE; i++) {
+		if(kma_caller[i].caller)
+			n--;
+		if(n < 0)
+			return (void *)(i+1);
+	}
+
+	return 0;
+}
+
+static void *as_next(struct seq_file *m, void *p, loff_t *pos)
+{
+	int n = (int)p-1, i;
+	++*pos;
+
+	for (i = n + 1; i < MAX_CALLER_TABLE; i++)
+		if(kma_caller[i].caller)
+			return (void *)(i+1);
+
+	return 0;
+}
+
+static void as_stop(struct seq_file *m, void *p)
+{
+}
+
+static int as_show(struct seq_file *m, void *p)
+{
+	int n = (int)p-1;
+	struct kma_caller *c;
+#ifdef CONFIG_KALLSYMS
+	char *modname;
+	const char *name;
+	unsigned long offset = 0, size;
+	char namebuf[128];
+
+	c = &kma_caller[n];
+	name = kallsyms_lookup((int)c->caller, &size, &offset, &modname,
+			       namebuf);
+	seq_printf(m, "%8d %8d %8d %5d/%-5d %s+0x%lx\n",
+		   c->total, c->slack, c->net, c->allocs, c->frees,
+		   name, offset);
+#else
+	c = &kma_caller[n];
+	seq_printf(m, "%8d %8d %8d %5d/%-5d %p\n",
+		   c->total, c->slack, c->net, c->allocs, c->frees, c->caller);
+#endif
+
+	return 0;
+}
+
+struct seq_operations kmalloc_account_op = {
+	.start	= as_start,
+	.next	= as_next,
+	.stop	= as_stop,
+	.show	= as_show,
+};
+

Index: tiny/mm/Makefile
===================================================================
--- tiny.orig/mm/Makefile	2005-10-10 17:30:45.000000000 -0700
+++ tiny/mm/Makefile	2005-10-10 17:41:46.000000000 -0700
@@ -12,6 +12,7 @@
 obj-y := bootmem.o filemap.o mempool.o readahead.o slab.o swap.o truncate.o vmscan.o \
 	 prio_tree.o $(mmu-y)
 
+obj-$(CONFIG_KMALLOC_ACCOUNTING) += kmallocacct.o
 obj-$(CONFIG_SWAP) += page_io.o swap_state.o swapfile.o thrash.o
 obj-$(CONFIG_HUGETLBFS) += hugetlb.o
 obj-$(CONFIG_NUMA) += mempolicy.o

Index: tiny/include/linux/slab.h
===================================================================
--- tiny.orig/include/linux/slab.h	2005-10-10 17:32:41.000000000 -0700
+++ tiny/include/linux/slab.h	2005-10-10 17:41:46.000000000 -0700
@@ -53,6 +53,23 @@ typedef struct kmem_cache_s kmem_cache_t
 #define SLAB_CTOR_ATOMIC	0x002UL	/* tell constructor it can't sleep */
 #define SLAB_CTOR_VERIFY	0x004UL	/* tell constructor it's a verify call */
 
+#ifdef CONFIG_KMALLOC_ACCOUNTING
+void __kmalloc_account(const void *, const void *, int, int);
+
+static void inline kmalloc_account(const void *addr, int size, int req)
+{
+	__kmalloc_account(__builtin_return_address(0), addr, size, req);
+}
+
+static void inline kfree_account(const void *addr, int size)
+{
+	__kmalloc_account(__builtin_return_address(0), addr, size, -1);
+}
+#else
+#define kmalloc_account(a, b, c)
+#define kfree_account(a, b)
+#endif
+
 /* prototypes */
 extern void __init kmem_cache_init(void);
 
@@ -78,6 +95,7 @@ extern void *__kmalloc(size_t, unsigned
 static inline void *kmalloc(size_t size, unsigned int __nocast flags)
 {
+#ifndef CONFIG_KMALLOC_ACCOUNTING
 	if (__builtin_constant_p(size)) {
 		int i = 0;
 #define CACHE(x) \
@@ -96,6 +114,7 @@ found:
 			malloc_sizes[i].cs_dmacachep :
 			malloc_sizes[i].cs_cachep, flags);
 	}
+#endif
 	return __kmalloc(size, flags);
 }

Index: tiny/fs/proc/proc_misc.c
===================================================================
--- tiny.orig/fs/proc/proc_misc.c	2005-10-10 17:30:45.000000000 -0700
+++ tiny/fs/proc/proc_misc.c	2005-10-10 17:41:46.000000000 -0700
@@ -337,6 +337,24 @@ static struct file_operations proc_slabi
 	.release	= seq_release,
 };
 
+#ifdef CONFIG_KMALLOC_ACCOUNTING
+
+extern struct seq_operations kmalloc_account_op;
+
+static int kmalloc_account_open(struct inode *inode, struct file *file)
+{
+	return seq_open(file, &kmalloc_account_op);
+}
+
+static struct file_operations proc_kmalloc_account_operations = {
+	.open		= kmalloc_account_open,
+	.read		= seq_read,
+	.llseek		= seq_lseek,
+	.release	= seq_release,
+};
+
+#endif
+
 static int show_stat(struct seq_file *p, void *v)
 {
 	int i;
@@ -601,6 +619,9 @@ void __init proc_misc_init(void)
 	create_seq_entry("stat", 0, &proc_stat_operations);
 	create_seq_entry("interrupts", 0, &proc_interrupts_operations);
 	create_seq_entry("slabinfo",S_IWUSR|S_IRUGO,&proc_slabinfo_operations);
+#ifdef CONFIG_KMALLOC_ACCOUNTING
+	create_seq_entry("kmalloc",S_IRUGO,&proc_kmalloc_account_operations);
+#endif
 	create_seq_entry("buddyinfo",S_IRUGO, &fragmentation_file_operations);
 	create_seq_entry("vmstat",S_IRUGO, &proc_vmstat_file_operations);
 	create_seq_entry("zoneinfo",S_IRUGO, &proc_zoneinfo_file_operations);

--- linux-2.6.14/mm/slab.c~	2005-12-28 16:37:04.000000000 -0500
+++ linux-2.6.14/mm/slab.c	2005-12-28 16:37:14.000000000 -0500
@@ -3045,20 +3045,6 @@ void kmem_cache_free(kmem_cache_t *cache
 EXPORT_SYMBOL(kmem_cache_free);
 
 /**
- * kzalloc - allocate memory. The memory is set to zero.
- * @size: how many bytes of memory are required.
- * @flags: the type of memory to allocate.
- */
-void *kzalloc(size_t size, gfp_t flags)
-{
-	void *ret = kmalloc(size, flags);
-	if (ret)
-		memset(ret, 0, size);
-	return ret;
-}
-EXPORT_SYMBOL(kzalloc);
-
-/**
  * kfree - free previously allocated memory
  * @objp: pointer returned by kmalloc.
  *

--- linux-2.6.14/include/linux/slab.h~	2005-12-28 16:37:19.000000000 -0500
+++ linux-2.6.14/include/linux/slab.h	2005-12-28 16:38:51.000000000 -0500
@@ -118,7 +118,13 @@ found:
 	return __kmalloc(size, flags);
 }
 
-extern void *kzalloc(size_t, gfp_t);
+static inline void *kzalloc(size_t size, gfp_t flags)
+{
+	void *ret = kmalloc(size, flags);
+	if (ret)
+		memset(ret, 0, size);
+	return ret;
+}
 
 /**
  * kcalloc - allocate memory for an array. The memory is set to zero.

--- linux-2.6.14/include/linux/slab.h~	2005-12-28 19:04:06.000000000 -0500
+++ linux-2.6.14/include/linux/slab.h	2005-12-28 19:04:47.000000000 -0500
@@ -126,6 +126,27 @@ static inline void *kzalloc(size_t size,
 	return ret;
 }
 
+/*
+ * kstrdup - allocate space for and copy an existing string
+ *
+ * @s: the string to duplicate
+ * @gfp: the GFP mask used in the kmalloc() call when allocating memory
+ */
+static inline char *kstrdup(const char *s, gfp_t gfp)
+{
+	size_t len;
+	char *buf;
+
+	if (!s)
+		return NULL;
+
+	len = strlen(s) + 1;
+	buf = kmalloc(len, gfp);
+	if (buf)
+		memcpy(buf, s, len);
+	return buf;
+}
+
 /**
  * kcalloc - allocate memory for an array. The memory is set to zero.
  * @n: number of elements.

--- linux-2.6.14/mm/slab.c~	2005-12-28 19:04:54.000000000 -0500
+++ linux-2.6.14/mm/slab.c	2005-12-28 19:04:59.000000000 -0500
@@ -3669,25 +3669,3 @@ unsigned int ksize(const void *objp)
 	return obj_reallen(page_get_cache(virt_to_page(objp)));
 }
-
-/*
- * kstrdup - allocate space for and copy an existing string
- *
- * @s: the string to duplicate
- * @gfp: the GFP mask used in the kmalloc() call when allocating memory
- */
-char *kstrdup(const char *s, gfp_t gfp)
-{
-	size_t len;
-	char *buf;
-
-	if (!s)
-		return NULL;
-
-	len = strlen(s) + 1;
-	buf = kmalloc(len, gfp);
-	if (buf)
-		memcpy(buf, s, len);
-	return buf;
-}
-EXPORT_SYMBOL(kstrdup);

--- linux-2.6.14/include/linux/string.h~	2005-12-28 19:12:06.000000000 -0500
+++ linux-2.6.14/include/linux/string.h	2005-12-28 19:12:19.000000000 -0500
@@ -88,8 +88,6 @@ extern int memcmp(const void *,const voi
 extern void * memchr(const void *,int,__kernel_size_t);
 #endif
 
-extern char *kstrdup(const char *s, gfp_t gfp);
-
 #ifdef __cplusplus
 }
 #endif

^ permalink raw reply	[flat|nested] 36+ messages in thread
* Re: [POLL] SLAB : Are the 32 and 192 bytes caches really usefull on x86_64 machines ? 2005-12-29 1:26 ` Dave Jones @ 2005-12-30 4:06 ` Steven Rostedt 2006-01-02 8:46 ` Pekka Enberg 0 siblings, 1 reply; 36+ messages in thread From: Steven Rostedt @ 2005-12-30 4:06 UTC (permalink / raw) To: Dave Jones Cc: linux-kernel, Eric Dumazet, Denis Vlasenko, Andreas Kleen, Matt Mackall [-- Attachment #1: Type: text/plain, Size: 550 bytes --] On Wed, 2005-12-28 at 20:26 -0500, Dave Jones wrote: > On Wed, Dec 28, 2005 at 03:01:25PM -0600, Matt Mackall wrote: > > > Something like this: > > > > http://lwn.net/Articles/124374/ > > Nice toy. Variant attached that works on 2.6.15rc7 > - ->cs_size compile error fixed > - inlines kstrdup and kzalloc. > Otherwise these functions dominate the profile. Attached is a variant that was refreshed against 2.6.15-rc7 and fixes the logical bug that your compile error fix made ;) It should be cachep->objsize not csizep->cs_size. -- Steve [-- Attachment #2: linux-2.6-debug-account-kmalloc.patch --] [-- Type: text/x-patch, Size: 11851 bytes --] /proc/kmalloc allocation tracing tiny-mpm/fs/proc/proc_misc.c | 21 ++++ tiny-mpm/include/linux/slab.h | 19 ++++ tiny-mpm/init/Kconfig | 7 + tiny-mpm/mm/Makefile | 2 tiny-mpm/mm/kmallocacct.c | 182 ++++++++++++++++++++++++++++++++++++++++++ tiny-mpm/mm/slab.c | 7 + 6 files changed, 237 insertions(+), 1 deletion(-) Index: linux-2.6.15-rc7/init/Kconfig =================================================================== --- linux-2.6.15-rc7.orig/init/Kconfig 2005-12-29 22:54:48.000000000 -0500 +++ linux-2.6.15-rc7/init/Kconfig 2005-12-29 22:55:29.000000000 -0500 @@ -328,6 +328,13 @@ option for embedded systems with no facilities for reporting errors. Just say Y. +config KMALLOC_ACCOUNTING + default n + bool "Enabled accounting of kmalloc/kfree allocations" + help + This option records kmalloc and kfree activity and reports it via + /proc/kmalloc. 
+ config BASE_FULL default y bool "Enable full-sized data structures for core" if EMBEDDED Index: linux-2.6.15-rc7/mm/slab.c =================================================================== --- linux-2.6.15-rc7.orig/mm/slab.c 2005-12-29 22:54:48.000000000 -0500 +++ linux-2.6.15-rc7/mm/slab.c 2005-12-29 22:56:13.000000000 -0500 @@ -2924,6 +2924,7 @@ void *__kmalloc(size_t size, gfp_t flags) { kmem_cache_t *cachep; + void *a; /* If you want to save a few bytes .text space: replace * __ with kmem_. @@ -2933,7 +2934,9 @@ cachep = __find_general_cachep(size, flags); if (unlikely(cachep == NULL)) return NULL; - return __cache_alloc(cachep, flags); + a = __cache_alloc(cachep, flags); + kmalloc_account(a, cachep->objsize, size); + return a; } EXPORT_SYMBOL(__kmalloc); @@ -3006,20 +3009,6 @@ EXPORT_SYMBOL(kmem_cache_free); /** - * kzalloc - allocate memory. The memory is set to zero. - * @size: how many bytes of memory are required. - * @flags: the type of memory to allocate. - */ -void *kzalloc(size_t size, gfp_t flags) -{ - void *ret = kmalloc(size, flags); - if (ret) - memset(ret, 0, size); - return ret; -} -EXPORT_SYMBOL(kzalloc); - -/** * kfree - free previously allocated memory * @objp: pointer returned by kmalloc. 
* @@ -3033,6 +3022,8 @@ kmem_cache_t *c; unsigned long flags; + kfree_account(objp, ksize(objp)); + if (unlikely(!objp)) return; local_irq_save(flags); @@ -3610,25 +3601,3 @@ return obj_reallen(page_get_cache(virt_to_page(objp))); } - -/* - * kstrdup - allocate space for and copy an existing string - * - * @s: the string to duplicate - * @gfp: the GFP mask used in the kmalloc() call when allocating memory - */ -char *kstrdup(const char *s, gfp_t gfp) -{ - size_t len; - char *buf; - - if (!s) - return NULL; - - len = strlen(s) + 1; - buf = kmalloc(len, gfp); - if (buf) - memcpy(buf, s, len); - return buf; -} -EXPORT_SYMBOL(kstrdup); Index: linux-2.6.15-rc7/mm/kmallocacct.c =================================================================== --- /dev/null 1970-01-01 00:00:00.000000000 +0000 +++ linux-2.6.15-rc7/mm/kmallocacct.c 2005-12-29 22:55:29.000000000 -0500 @@ -0,0 +1,182 @@ +#include <linux/config.h> +#include <linux/seq_file.h> +#include <linux/kallsyms.h> + +struct kma_caller { + const void *caller; + int total, net, slack, allocs, frees; +}; + +struct kma_list { + int callerhash; + const void *address; +}; + +#define MAX_CALLER_TABLE 512 +#define MAX_ALLOC_TRACK 4096 + +#define kma_hash(address, size) (((u32)address / (u32)size) % size) + +static struct kma_list kma_alloc[MAX_ALLOC_TRACK]; +static struct kma_caller kma_caller[MAX_CALLER_TABLE]; + +static int kma_callers; +static int kma_lost_callers, kma_lost_allocs, kma_unknown_frees; +static int kma_total, kma_net, kma_slack, kma_allocs, kma_frees; +static spinlock_t kma_lock = SPIN_LOCK_UNLOCKED; + +void __kmalloc_account(const void *caller, const void *addr, int size, int req) +{ + int i, hasha, hashc; + unsigned long flags; + + spin_lock_irqsave(&kma_lock, flags); + if(req >= 0) /* kmalloc */ + { + /* find callers slot */ + hashc = kma_hash(caller, MAX_CALLER_TABLE); + for (i = 0; i < MAX_CALLER_TABLE; i++) { + if (!kma_caller[hashc].caller || + kma_caller[hashc].caller == caller) + break; + hashc = 
(hashc + 1) % MAX_CALLER_TABLE; + } + + if (!kma_caller[hashc].caller) + kma_callers++; + + if (i < MAX_CALLER_TABLE) { + /* update callers stats */ + kma_caller[hashc].caller = caller; + kma_caller[hashc].total += size; + kma_caller[hashc].net += size; + kma_caller[hashc].slack += size - req; + kma_caller[hashc].allocs++; + + /* add malloc to list */ + hasha = kma_hash(addr, MAX_ALLOC_TRACK); + for (i = 0; i < MAX_ALLOC_TRACK; i++) { + if (!kma_alloc[hasha].callerhash) + break; + hasha = (hasha + 1) % MAX_ALLOC_TRACK; + } + + if(i < MAX_ALLOC_TRACK) { + kma_alloc[hasha].callerhash = hashc; + kma_alloc[hasha].address = addr; + } + else + kma_lost_allocs++; + } + else { + kma_lost_callers++; + kma_lost_allocs++; + } + + kma_total += size; + kma_net += size; + kma_slack += size - req; + kma_allocs++; + } + else { /* kfree */ + hasha = kma_hash(addr, MAX_ALLOC_TRACK); + for (i = 0; i < MAX_ALLOC_TRACK ; i++) { + if (kma_alloc[hasha].address == addr) + break; + hasha = (hasha + 1) % MAX_ALLOC_TRACK; + } + + if (i < MAX_ALLOC_TRACK) { + hashc = kma_alloc[hasha].callerhash; + kma_alloc[hasha].callerhash = 0; + kma_caller[hashc].net -= size; + kma_caller[hashc].frees++; + } + else + kma_unknown_frees++; + + kma_net -= size; + kma_frees++; + } + spin_unlock_irqrestore(&kma_lock, flags); +} + +static void *as_start(struct seq_file *m, loff_t *pos) +{ + int i; + loff_t n = *pos; + + if (!n) { + seq_printf(m, "total bytes allocated: %8d\n", kma_total); + seq_printf(m, "slack bytes allocated: %8d\n", kma_slack); + seq_printf(m, "net bytes allocated: %8d\n", kma_net); + seq_printf(m, "number of allocs: %8d\n", kma_allocs); + seq_printf(m, "number of frees: %8d\n", kma_frees); + seq_printf(m, "number of callers: %8d\n", kma_callers); + seq_printf(m, "lost callers: %8d\n", + kma_lost_callers); + seq_printf(m, "lost allocs: %8d\n", + kma_lost_allocs); + seq_printf(m, "unknown frees: %8d\n", + kma_unknown_frees); + seq_puts(m, "\n total slack net alloc/free caller\n"); + } + + for 
(i = 0; i < MAX_CALLER_TABLE; i++) { + if(kma_caller[i].caller) + n--; + if(n < 0) + return (void *)(i+1); + } + + return 0; +} + +static void *as_next(struct seq_file *m, void *p, loff_t *pos) +{ + int n = (int)p-1, i; + ++*pos; + + for (i = n + 1; i < MAX_CALLER_TABLE; i++) + if(kma_caller[i].caller) + return (void *)(i+1); + + return 0; +} + +static void as_stop(struct seq_file *m, void *p) +{ +} + +static int as_show(struct seq_file *m, void *p) +{ + int n = (int)p-1; + struct kma_caller *c; +#ifdef CONFIG_KALLSYMS + char *modname; + const char *name; + unsigned long offset = 0, size; + char namebuf[128]; + + c = &kma_caller[n]; + name = kallsyms_lookup((int)c->caller, &size, &offset, &modname, + namebuf); + seq_printf(m, "%8d %8d %8d %5d/%-5d %s+0x%lx\n", + c->total, c->slack, c->net, c->allocs, c->frees, + name, offset); +#else + c = &kma_caller[n]; + seq_printf(m, "%8d %8d %8d %5d/%-5d %p\n", + c->total, c->slack, c->net, c->allocs, c->frees, c->caller); +#endif + + return 0; +} + +struct seq_operations kmalloc_account_op = { + .start = as_start, + .next = as_next, + .stop = as_stop, + .show = as_show, +}; + Index: linux-2.6.15-rc7/mm/Makefile =================================================================== --- linux-2.6.15-rc7.orig/mm/Makefile 2005-12-29 22:54:48.000000000 -0500 +++ linux-2.6.15-rc7/mm/Makefile 2005-12-29 22:55:29.000000000 -0500 @@ -12,6 +12,7 @@ readahead.o slab.o swap.o truncate.o vmscan.o \ prio_tree.o $(mmu-y) +obj-$(CONFIG_KMALLOC_ACCOUNTING) += kmallocacct.o obj-$(CONFIG_SWAP) += page_io.o swap_state.o swapfile.o thrash.o obj-$(CONFIG_HUGETLBFS) += hugetlb.o obj-$(CONFIG_NUMA) += mempolicy.o Index: linux-2.6.15-rc7/include/linux/slab.h =================================================================== --- linux-2.6.15-rc7.orig/include/linux/slab.h 2005-12-29 22:54:48.000000000 -0500 +++ linux-2.6.15-rc7/include/linux/slab.h 2005-12-29 22:55:29.000000000 -0500 @@ -53,6 +53,23 @@ #define SLAB_CTOR_ATOMIC 0x002UL /* tell constructor 
it can't sleep */ #define SLAB_CTOR_VERIFY 0x004UL /* tell constructor it's a verify call */ +#ifdef CONFIG_KMALLOC_ACCOUNTING +void __kmalloc_account(const void *, const void *, int, int); + +static void inline kmalloc_account(const void *addr, int size, int req) +{ + __kmalloc_account(__builtin_return_address(0), addr, size, req); +} + +static void inline kfree_account(const void *addr, int size) +{ + __kmalloc_account(__builtin_return_address(0), addr, size, -1); +} +#else +#define kmalloc_account(a, b, c) +#define kfree_account(a, b) +#endif + /* prototypes */ extern void __init kmem_cache_init(void); @@ -78,6 +95,7 @@ static inline void *kmalloc(size_t size, gfp_t flags) { +#ifndef CONFIG_KMALLOC_ACCOUNTING if (__builtin_constant_p(size)) { int i = 0; #define CACHE(x) \ @@ -96,10 +114,38 @@ malloc_sizes[i].cs_dmacachep : malloc_sizes[i].cs_cachep, flags); } +#endif return __kmalloc(size, flags); } -extern void *kzalloc(size_t, gfp_t); +static inline void *kzalloc(size_t size, gfp_t flags) +{ + void *ret = kmalloc(size, flags); + if (ret) + memset(ret, 0, size); + return ret; +} + +/* + * kstrdup - allocate space for and copy an existing string + * + * @s: the string to duplicate + * @gfp: the GFP mask used in the kmalloc() call when allocating memory + */ +static inline char *kstrdup(const char *s, gfp_t gfp) +{ + size_t len; + char *buf; + + if (!s) + return NULL; + + len = strlen(s) + 1; + buf = kmalloc(len, gfp); + if (buf) + memcpy(buf, s, len); + return buf; +} /** * kcalloc - allocate memory for an array. The memory is set to zero. 
Index: linux-2.6.15-rc7/fs/proc/proc_misc.c =================================================================== --- linux-2.6.15-rc7.orig/fs/proc/proc_misc.c 2005-12-29 22:54:48.000000000 -0500 +++ linux-2.6.15-rc7/fs/proc/proc_misc.c 2005-12-29 22:55:29.000000000 -0500 @@ -337,6 +337,24 @@ .release = seq_release, }; +#ifdef CONFIG_KMALLOC_ACCOUNTING + +extern struct seq_operations kmalloc_account_op; + +static int kmalloc_account_open(struct inode *inode, struct file *file) +{ + return seq_open(file, &kmalloc_account_op); +} + +static struct file_operations proc_kmalloc_account_operations = { + .open = kmalloc_account_open, + .read = seq_read, + .llseek = seq_lseek, + .release = seq_release, +}; + +#endif + static int show_stat(struct seq_file *p, void *v) { int i; @@ -601,6 +619,9 @@ create_seq_entry("stat", 0, &proc_stat_operations); create_seq_entry("interrupts", 0, &proc_interrupts_operations); create_seq_entry("slabinfo",S_IWUSR|S_IRUGO,&proc_slabinfo_operations); +#ifdef CONFIG_KMALLOC_ACCOUNTING + create_seq_entry("kmalloc",S_IRUGO,&proc_kmalloc_account_operations); +#endif create_seq_entry("buddyinfo",S_IRUGO, &fragmentation_file_operations); create_seq_entry("vmstat",S_IRUGO, &proc_vmstat_file_operations); create_seq_entry("zoneinfo",S_IRUGO, &proc_zoneinfo_file_operations); Index: linux-2.6.15-rc7/include/linux/string.h =================================================================== --- linux-2.6.15-rc7.orig/include/linux/string.h 2005-12-29 22:54:48.000000000 -0500 +++ linux-2.6.15-rc7/include/linux/string.h 2005-12-29 22:55:29.000000000 -0500 @@ -88,8 +88,6 @@ extern void * memchr(const void *,int,__kernel_size_t); #endif -extern char *kstrdup(const char *s, gfp_t gfp); - #ifdef __cplusplus } #endif ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [POLL] SLAB : Are the 32 and 192 bytes caches really usefull on x86_64 machines ? 2005-12-30 4:06 ` Steven Rostedt @ 2006-01-02 8:46 ` Pekka Enberg 2006-01-02 8:51 ` Pekka Enberg 2006-01-02 12:31 ` Steven Rostedt 0 siblings, 2 replies; 36+ messages in thread From: Pekka Enberg @ 2006-01-02 8:46 UTC (permalink / raw) To: Steven Rostedt Cc: Dave Jones, linux-kernel, Eric Dumazet, Denis Vlasenko, Andreas Kleen, Matt Mackall Hi, On 12/30/05, Steven Rostedt <rostedt@goodmis.org> wrote: > Attached is a variant that was refreshed against 2.6.15-rc7 and fixes > the logical bug that your compile error fix made ;) > > It should be cachep->objsize not csizep->cs_size. Isn't there any other way to do this patch other than making kzalloc() and kstrdup() inline? I would like to see something like this in the mainline but making them inline is not acceptable because they increase kernel text a lot. Pekka ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [POLL] SLAB : Are the 32 and 192 bytes caches really usefull on x86_64 machines ? 2006-01-02 8:46 ` Pekka Enberg @ 2006-01-02 8:51 ` Pekka Enberg 2006-01-02 12:33 ` Steven Rostedt 2006-01-02 12:31 ` Steven Rostedt 1 sibling, 1 reply; 36+ messages in thread From: Pekka Enberg @ 2006-01-02 8:51 UTC (permalink / raw) To: Steven Rostedt Cc: Dave Jones, linux-kernel, Eric Dumazet, Denis Vlasenko, Andreas Kleen, Matt Mackall On 12/30/05, Steven Rostedt <rostedt@goodmis.org> wrote: > > Attached is a variant that was refreshed against 2.6.15-rc7 and fixes > > the logical bug that your compile error fix made ;) > > > > It should be cachep->objsize not csizep->cs_size. On 1/2/06, Pekka Enberg <penberg@cs.helsinki.fi> wrote: > Isn't there any other way to do this patch other than making kzalloc() > and kstrdup() inline? I would like to see something like this in the > mainline but making them inline is not acceptable because they > increase kernel text a lot. Also, wouldn't it be better to track kmem_cache_alloc and kmem_cache_alloc_node instead? Pekka ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [POLL] SLAB : Are the 32 and 192 bytes caches really usefull on x86_64 machines ? 2006-01-02 8:51 ` Pekka Enberg @ 2006-01-02 12:33 ` Steven Rostedt 0 siblings, 0 replies; 36+ messages in thread From: Steven Rostedt @ 2006-01-02 12:33 UTC (permalink / raw) To: Pekka Enberg Cc: Dave Jones, linux-kernel, Eric Dumazet, Denis Vlasenko, Andreas Kleen, Matt Mackall On Mon, 2 Jan 2006, Pekka Enberg wrote: > > Also, wouldn't it be better to track kmem_cache_alloc and > kmem_cache_alloc_node instead? > I believe they are very interested in when kmalloc and kfree are used, since those are the ones for the generic slabs. And even then, they are only profiling the ones that use a dynamic allocation. (the kmalloc and kfree of sizeof(x) is not profiled). This was brought up earlier in the thread. -- Steve ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [POLL] SLAB : Are the 32 and 192 bytes caches really usefull on x86_64 machines ? 2006-01-02 8:46 ` Pekka Enberg 2006-01-02 8:51 ` Pekka Enberg @ 2006-01-02 12:31 ` Steven Rostedt 1 sibling, 0 replies; 36+ messages in thread From: Steven Rostedt @ 2006-01-02 12:31 UTC (permalink / raw) To: Pekka Enberg Cc: Dave Jones, linux-kernel, Eric Dumazet, Denis Vlasenko, Andreas Kleen, Matt Mackall On Mon, 2 Jan 2006, Pekka Enberg wrote: > Hi, > > On 12/30/05, Steven Rostedt <rostedt@goodmis.org> wrote: > > Attached is a variant that was refreshed against 2.6.15-rc7 and fixes > > the logical bug that your compile error fix made ;) > > > > It should be cachep->objsize not csizep->cs_size. > > Isn't there any other way to do this patch other than making kzalloc() > and kstrdup() inline? I would like to see something like this in the > mainline but making them inline is not acceptable because they > increase kernel text a lot. Actually, yes. I was adding to this patch something to be more specific, and to either pass the EIP through the parameter or a __FILE__, __LINE__. Using the following: #ifdef CONFIG_KMALLOC_ACCOUNTING # define __EIP__ , __builtin_return_address(0) # define __DECLARE_EIP__ , void *eip #else # define __EIP__ # define __DECLARE_EIP__ #endif #define kstrdup(s,g) __kstrdup(s, g __EIP__) extern char *__kstrdup(const char *s, gfp_t g __DECLARE_EIP__); Or a file line can be used: # define __EIP__ , __FILE__, __LINE__ # define __DECLARE_EIP__ , char *file, int line -- Steve ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [POLL] SLAB : Are the 32 and 192 bytes caches really usefull on x86_64 machines ? 2005-12-28 21:01 ` Matt Mackall 2005-12-29 1:26 ` Dave Jones @ 2005-12-29 1:29 ` Dave Jones 2005-12-29 1:50 ` Keith Owens 2005-12-30 21:13 ` Marcelo Tosatti 2 siblings, 1 reply; 36+ messages in thread From: Dave Jones @ 2005-12-29 1:29 UTC (permalink / raw) To: Matt Mackall; +Cc: linux-kernel > Something like this: > > http://lwn.net/Articles/124374/ One thing that really sticks out like a sore thumb is soft_cursor() That thing gets called a *lot*, and every time it does a kmalloc/free pair that 99.9% of the time is going to be the same size alloc as it was the last time. This patch makes that alloc persistent (and does a realloc if the size changes). The only time it should change is if the font/resolution changes I think. Boot tested with vesafb & fbconsole, which had the desired effect. With this patch, it almost falls off the profile. Signed-off-by: Dave Jones <davej@redhat.com> --- linux-2.6.14/drivers/video/console/softcursor.c~ 2005-12-28 18:40:08.000000000 -0500 +++ linux-2.6.14/drivers/video/console/softcursor.c 2005-12-28 18:45:50.000000000 -0500 @@ -23,7 +23,9 @@ int soft_cursor(struct fb_info *info, st unsigned int buf_align = info->pixmap.buf_align - 1; unsigned int i, size, dsize, s_pitch, d_pitch; struct fb_image *image; - u8 *dst, *src; + u8 *dst; + static u8 *src=NULL; + static int allocsize=0; if (info->state != FBINFO_STATE_RUNNING) return 0; @@ -31,9 +33,15 @@ int soft_cursor(struct fb_info *info, st s_pitch = (cursor->image.width + 7) >> 3; dsize = s_pitch * cursor->image.height; - src = kmalloc(dsize + sizeof(struct fb_image), GFP_ATOMIC); - if (!src) - return -ENOMEM; + if (dsize + sizeof(struct fb_image) != allocsize) { + if (src != NULL) + kfree(src); + allocsize = dsize + sizeof(struct fb_image); + + src = kmalloc(allocsize, GFP_ATOMIC); + if (!src) + return -ENOMEM; + } image = (struct fb_image *) (src + dsize); *image = cursor->image; @@ -61,7 +69,6 
@@ int soft_cursor(struct fb_info *info, st fb_pad_aligned_buffer(dst, d_pitch, src, s_pitch, image->height); image->data = dst; info->fbops->fb_imageblit(info, image); - kfree(src); return 0; } ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [POLL] SLAB : Are the 32 and 192 bytes caches really usefull on x86_64 machines ? 2005-12-29 1:29 ` Dave Jones @ 2005-12-29 1:50 ` Keith Owens 2005-12-29 2:39 ` Dave Jones 2006-01-04 5:26 ` Dave Jones 0 siblings, 2 replies; 36+ messages in thread From: Keith Owens @ 2005-12-29 1:50 UTC (permalink / raw) To: Dave Jones; +Cc: Matt Mackall, linux-kernel Dave Jones (on Wed, 28 Dec 2005 20:29:15 -0500) wrote: > > > Something like this: > > > > http://lwn.net/Articles/124374/ > >One thing that really sticks out like a sore thumb is soft_cursor() >That thing gets called a *lot*, and every time it does a kmalloc/free >pair that 99.9% of the time is going to be the same size alloc as >it was the last time. This patch makes that alloc persistent >(and does a realloc if the size changes). >The only time it should change is if the font/resolution changes I think. Can soft_cursor() be called from multiple processes at the same time, in particular with dual head systems? If so then a static variable is not going to work. ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [POLL] SLAB : Are the 32 and 192 bytes caches really usefull on x86_64 machines ? 2005-12-29 1:50 ` Keith Owens @ 2005-12-29 2:39 ` Dave Jones 2006-01-02 15:03 ` Helge Hafting 2006-01-04 5:26 ` Dave Jones 1 sibling, 1 reply; 36+ messages in thread From: Dave Jones @ 2005-12-29 2:39 UTC (permalink / raw) To: Keith Owens; +Cc: Matt Mackall, linux-kernel On Thu, Dec 29, 2005 at 12:50:10PM +1100, Keith Owens wrote: > Dave Jones (on Wed, 28 Dec 2005 20:29:15 -0500) wrote: > > > > > Something like this: > > > > > > http://lwn.net/Articles/124374/ > > > >One thing that really sticks out like a sore thumb is soft_cursor() > >That thing gets called a *lot*, and every time it does a kmalloc/free > >pair that 99.9% of the time is going to be the same size alloc as > >it was the last time. This patch makes that alloc persistent > >(and does a realloc if the size changes). > >The only time it should change is if the font/resolution changes I think. > > Can soft_cursor() be called from multiple processes at the same time, > in particular with dual head systems? If so then a static variable is > not going to work. My dual-head system here displays a cloned image on the second screen, which seems to dtrt. I'm not sure how to make it show something different on the other head to test further. Dave ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [POLL] SLAB : Are the 32 and 192 bytes caches really usefull on x86_64 machines ? 2005-12-29 2:39 ` Dave Jones @ 2006-01-02 15:03 ` Helge Hafting 0 siblings, 0 replies; 36+ messages in thread From: Helge Hafting @ 2006-01-02 15:03 UTC (permalink / raw) To: Dave Jones, Keith Owens, Matt Mackall, linux-kernel On Wed, Dec 28, 2005 at 09:39:06PM -0500, Dave Jones wrote: > On Thu, Dec 29, 2005 at 12:50:10PM +1100, Keith Owens wrote: > > Dave Jones (on Wed, 28 Dec 2005 20:29:15 -0500) wrote: > > > > > > > Something like this: > > > > > > > > http://lwn.net/Articles/124374/ > > > > > >One thing that really sticks out like a sore thumb is soft_cursor() > > >That thing gets called a *lot*, and every time it does a kmalloc/free > > >pair that 99.9% of the time is going to be the same size alloc as > > >it was the last time. This patch makes that alloc persistent > > >(and does a realloc if the size changes). > > >The only time it should change is if the font/resolution changes I think. > > > > Can soft_cursor() be called from multiple processes at the same time, > > in particular with dual head systems? If so then a static variable is > > not going to work. > > My dual-head system here displays a cloned image on the second > screen, which seems to dtrt. I'm not sure how to make it show > something different on the other head to test further. Few dualhead drivers actually support two different framebuffers, but the matrox G550 (and G400) drivers do. Compile one of those, make sure to configure dualhead support. After booting up, use "matroxset" to set the framebuffer to vga-connector mapping so that the two outputs actually show the different framebuffers. Another way is to use several graphichs cards (AGP getting the first framebuffer and each PCI card getting others as the drivers load.) Helge Hafting ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [POLL] SLAB : Are the 32 and 192 bytes caches really usefull on x86_64 machines ? 2005-12-29 1:50 ` Keith Owens 2005-12-29 2:39 ` Dave Jones @ 2006-01-04 5:26 ` Dave Jones 1 sibling, 0 replies; 36+ messages in thread From: Dave Jones @ 2006-01-04 5:26 UTC (permalink / raw) To: Keith Owens; +Cc: Matt Mackall, linux-kernel On Thu, Dec 29, 2005 at 12:50:10PM +1100, Keith Owens wrote: > Dave Jones (on Wed, 28 Dec 2005 20:29:15 -0500) wrote: > > > > > Something like this: > > > > > > http://lwn.net/Articles/124374/ > > > >One thing that really sticks out like a sore thumb is soft_cursor() > >That thing gets called a *lot*, and every time it does a kmalloc/free > >pair that 99.9% of the time is going to be the same size alloc as > >it was the last time. This patch makes that alloc persistent > >(and does a realloc if the size changes). > >The only time it should change is if the font/resolution changes I think. > > Can soft_cursor() be called from multiple processes at the same time, > in particular with dual head systems? If so then a static variable is > not going to work. I looked at this a little closer. If my understanding of the console/fb layers is correct, soft_cursor() is serialised by the console_sem in drivers/video/console/fbcon.c::fb_flashcursor() Dave ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [POLL] SLAB : Are the 32 and 192 bytes caches really usefull on x86_64 machines ? 2005-12-28 21:01 ` Matt Mackall 2005-12-29 1:26 ` Dave Jones 2005-12-29 1:29 ` Dave Jones @ 2005-12-30 21:13 ` Marcelo Tosatti 2005-12-31 20:13 ` Andi Kleen 2 siblings, 1 reply; 36+ messages in thread From: Marcelo Tosatti @ 2005-12-30 21:13 UTC (permalink / raw) To: Matt Mackall; +Cc: Andreas Kleen, Denis Vlasenko, Eric Dumazet, linux-kernel <snip> > > Note that just looking at slabinfo is not enough for this - you need the > > original > > sizes as passed to kmalloc, not the rounded values reported there. > > Should be probably not too hard to hack a simple monitoring script up > > for that > > in systemtap to generate the data. > > Something like this: > > http://lwn.net/Articles/124374/ Written with a systemtap script: http://sourceware.org/ml/systemtap/2005-q3/msg00550.html ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [POLL] SLAB : Are the 32 and 192 bytes caches really usefull on x86_64 machines ? 2005-12-30 21:13 ` Marcelo Tosatti @ 2005-12-31 20:13 ` Andi Kleen 0 siblings, 0 replies; 36+ messages in thread From: Andi Kleen @ 2005-12-31 20:13 UTC (permalink / raw) To: Marcelo Tosatti; +Cc: Matt Mackall, Denis Vlasenko, Eric Dumazet, linux-kernel On Friday 30 December 2005 22:13, Marcelo Tosatti wrote: > > <snip> > > > > Note that just looking at slabinfo is not enough for this - you need the > > > original > > > sizes as passed to kmalloc, not the rounded values reported there. > > > Should be probably not too hard to hack a simple monitoring script up > > > for that > > > in systemtap to generate the data. > > > > Something like this: > > > > http://lwn.net/Articles/124374/ > > Written with a systemtap script: > http://sourceware.org/ml/systemtap/2005-q3/msg00550.html I had actually written a similar script on my own before, but I found it was near completely unusable on a 4core Opteron system even under moderate load because systemtap bombed out when it needed more than one spin to take the lock of the shared hash table. (it basically did if (!spin_trylock()) ... stop script; ...) The problem was that the backtraces took so long that another CPU very often run into the locked lock. Still with a stripped down script without backtraces had some interesting results. In particular my init was reading some file in /proc 10 times a second, allocating 4K (wtf did it do that?) and some other somewhat surprising results. -Andi ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [POLL] SLAB : Are the 32 and 192 bytes caches really usefull on x86_64 machines ? 2005-12-28 17:57 ` Andreas Kleen 2005-12-28 21:01 ` Matt Mackall @ 2005-12-29 19:48 ` Steven Rostedt 2005-12-29 21:16 ` Andi Kleen 2006-01-02 8:37 ` Pekka Enberg 2 siblings, 1 reply; 36+ messages in thread From: Steven Rostedt @ 2005-12-29 19:48 UTC (permalink / raw) To: Andreas Kleen Cc: linux-kernel, Eric Dumazet, Denis Vlasenko, Matt Mackall, Dave Jones On Wed, 2005-12-28 at 18:57 +0100, Andreas Kleen wrote: [...] > > This whole discussion is pointless anyways because most kmallocs are > constant > sized and with a constant sized kmalloc the slab is selected at compile > time. > > What would be more interesting would be to redo the complete kmalloc > slab list. > > I remember the original slab paper from Bonwick actually mentioned that > power of > two slabs are the worst choice for a malloc - but for some reason Linux > chose them > anyways. That would require a lot of measurements in different workloads > on the > actual kmalloc sizes and then select a good list, but could ultimately > safe > a lot of memory (ok not that much anymore because the memory intensive > allocations should all have their own caches, but at least some) > > Most likely the best list is different for 32bit and 64bit too. > > Note that just looking at slabinfo is not enough for this - you need the > original > sizes as passed to kmalloc, not the rounded values reported there. > Should be probably not too hard to hack a simple monitoring script up > for that > in systemtap to generate the data. > OK then, after reading this I figured there must be a way to dynamically allocate slab sizes based on the kmalloc constants. So I spent last night and some of this morning coming up with the below patch. Right now it only works with i386, but I'm sure it can be hacked to work with all archs. 
At compile time it creates a table of sizes for all kmallocs (outside of
slab.c and arch/i386/mm/init.c) that use a constant declaration.  This
table is then initialized in arch/i386/mm/init.c to use a cache that is
either already created (like the malloc_sizes array) or a newly created
cache of that size (L1 cache aligned), and the pointers are then updated
to use that cache.

Here's what was created on my test box:

cat /proc/slabinfo
[...]
dynamic_dma-1536       0      0   1536    5    2 : tunables   24   12    0 : slabdata      0      0      0
dynamic-1536           1      5   1536    5    2 : tunables   24   12    0 : slabdata      1      1      0
dynamic_dma-1280       0      0   1280    3    1 : tunables   24   12    0 : slabdata      0      0      0
dynamic-1280           6      6   1280    3    1 : tunables   24   12    0 : slabdata      2      2      0
dynamic_dma-2176       0      0   2176    3    2 : tunables   24   12    0 : slabdata      0      0      0
dynamic-2176           0      0   2176    3    2 : tunables   24   12    0 : slabdata      0      0      0
dynamic_dma-1152       0      0   1152    7    2 : tunables   24   12    0 : slabdata      0      0      0
dynamic-1152           0      0   1152    7    2 : tunables   24   12    0 : slabdata      0      0      0
dynamic_dma-1408       0      0   1408    5    2 : tunables   24   12    0 : slabdata      0      0      0
dynamic-1408           0      0   1408    5    2 : tunables   24   12    0 : slabdata      0      0      0
dynamic_dma-640        0      0    640    6    1 : tunables   54   27    0 : slabdata      0      0      0
dynamic-640            0      0    640    6    1 : tunables   54   27    0 : slabdata      0      0      0
dynamic_dma-768        0      0    768    5    1 : tunables   54   27    0 : slabdata      0      0      0
dynamic-768            0      0    768    5    1 : tunables   54   27    0 : slabdata      0      0      0
dynamic_dma-3200       0      0   3200    2    2 : tunables   24   12    0 : slabdata      0      0      0
dynamic-3200           8      8   3200    2    2 : tunables   24   12    0 : slabdata      4      4      0
dynamic_dma-896        0      0    896    4    1 : tunables   54   27    0 : slabdata      0      0      0
dynamic-896            9     12    896    4    1 : tunables   54   27    0 : slabdata      3      3      0
dynamic_dma-384        0      0    384   10    1 : tunables   54   27    0 : slabdata      0      0      0
dynamic-384           40     40    384   10    1 : tunables   54   27    0 : slabdata      4      4      0
size-131072(DMA)       0      0 131072    1   32 : tunables    8    4    0 : slabdata      0      0      0
size-131072            0      0 131072    1   32 : tunables    8    4    0 : slabdata      0      0      0
size-65536(DMA)        0      0  65536    1   16 : tunables    8    4    0 : slabdata      0      0      0
size-65536             1      1  65536    1   16 : tunables    8    4    0 : slabdata      1      1      0
size-32768(DMA)        2      2  32768    1    8 : tunables    8    4    0 : slabdata      0      0      0
size-32768             0      0  32768    1    8 : tunables    8    4    0 : slabdata      0      0      0
size-16384(DMA)        0      0  16384    1    4 : tunables    8    4    0 : slabdata      0      0      0
size-16384             0      0  16384    1    4 : tunables    8    4    0 : slabdata      0      0      0
size-8192(DMA)         0      0   8192    1    2 : tunables    8    4    0 : slabdata      0      0      0
size-8192             40     40   8192    1    2 : tunables    8    4    0 : slabdata     40     40      0
size-4096(DMA)         0      0   4096    1    1 : tunables   24   12    0 : slabdata      0      0      0
size-4096             34     34   4096    1    1 : tunables   24   12    0 : slabdata     34     34      0
size-2048(DMA)         0      0   2048    2    1 : tunables   24   12    0 : slabdata      0      0      0
size-2048            266    266   2048    2    1 : tunables   24   12    0 : slabdata    133    133      0
size-1024(DMA)         0      0   1024    4    1 : tunables   54   27    0 : slabdata      0      0      0
size-1024             24     24   1024    4    1 : tunables   54   27    0 : slabdata      6      6      0
size-512(DMA)          0      0    512    8    1 : tunables   54   27    0 : slabdata      0      0      0
size-512              90    112    512    8    1 : tunables   54   27    0 : slabdata     14     14      0
size-256(DMA)          0      0    256   15    1 : tunables  120   60    0 : slabdata      0      0      0
size-256             735    735    256   15    1 : tunables  120   60    0 : slabdata     49     49      0
size-128(DMA)          0      0    128   30    1 : tunables  120   60    0 : slabdata      0      0      0
size-128            2750   2760    128   30    1 : tunables  120   60    0 : slabdata     92     92      0
size-64(DMA)           0      0     64   59    1 : tunables  120   60    0 : slabdata      0      0      0
size-32(DMA)           0      0     32  113    1 : tunables  120   60    0 : slabdata      0      0      0
size-64              418    472     64   59    1 : tunables  120   60    0 : slabdata      8      8      0
size-32             1175   1243     32  113    1 : tunables  120   60    0 : slabdata     11     11      0
[...]

Not sure if this is worth looking into further, but it might actually be
a way to use less memory.  For example, the 384-byte cache above holds
its 40 objects in only 4 4K pages, whereas those same objects would be
40 512-byte objects (in size-512) costing 5 4K pages.  Plus the 384
cache probably has ON_SLAB management, whereas the 512 one does not.

Comments?
-- Steve

Index: linux-2.6.15-rc7/arch/i386/Kconfig
===================================================================
--- linux-2.6.15-rc7.orig/arch/i386/Kconfig	2005-12-29 09:09:29.000000000 -0500
+++ linux-2.6.15-rc7/arch/i386/Kconfig	2005-12-29 09:09:53.000000000 -0500
@@ -173,6 +173,14 @@
 	depends on HPET_TIMER && RTC=y
 	default y
 
+config DYNAMIC_SLABS
+	bool "Dynamically create slabs for constant kmalloc"
+	default y
+	help
+	  This enables the creation of SLABS using information created at
+	  compile time.  Then on boot up, the slabs are created to fit
+	  more with what was asked for.
+
 config SMP
 	bool "Symmetric multi-processing support"
 	---help---
Index: linux-2.6.15-rc7/arch/i386/kernel/vmlinux.lds.S
===================================================================
--- linux-2.6.15-rc7.orig/arch/i386/kernel/vmlinux.lds.S	2005-12-29 09:09:29.000000000 -0500
+++ linux-2.6.15-rc7/arch/i386/kernel/vmlinux.lds.S	2005-12-29 09:09:53.000000000 -0500
@@ -68,6 +68,13 @@
 	*(.data.init_task)
   }
 
+#ifdef CONFIG_DYNAMIC_SLABS
+  . = ALIGN(16);	/* dynamic slab table */
+  __start____slab_addresses = .;
+  __slab_addresses : AT(ADDR(__slab_addresses) - LOAD_OFFSET) { *(__slab_addresses) }
+  __stop____slab_addresses = .;
+#endif
+
   /* will be freed after init */
   . = ALIGN(4096);	/* Init code and data */
   __init_begin = .;
@@ -107,6 +114,14 @@
   .altinstr_replacement : AT(ADDR(.altinstr_replacement) - LOAD_OFFSET) {
 	*(.altinstr_replacement)
   }
+#ifdef CONFIG_DYNAMIC_SLABS
+  . = ALIGN(16);	/* dynamic slab table */
+  __start____slab_preprocess = .;
+  __slab_preprocess : AT(ADDR(__slab_preprocess) - LOAD_OFFSET) { *(__slab_preprocess) }
+  __slab_process_ret : AT(ADDR(__slab_process_ret) - LOAD_OFFSET) { *(__slab_process_ret) }
+  __stop____slab_preprocess = .;
+#endif
+
   /* .exit.text is discard at runtime, not link time, to deal with
      references from .altinstructions and .eh_frame */
   .exit.text : AT(ADDR(.exit.text) - LOAD_OFFSET) { *(.exit.text) }
@@ -119,7 +134,7 @@
   __per_cpu_start = .;
   .data.percpu  : AT(ADDR(.data.percpu) - LOAD_OFFSET) { *(.data.percpu) }
   __per_cpu_end = .;
-  . = ALIGN(4096);
+  . = ALIGN(4096);
   __init_end = .;
   /* freed after init ends here */
Index: linux-2.6.15-rc7/arch/i386/mm/init.c
===================================================================
--- linux-2.6.15-rc7.orig/arch/i386/mm/init.c	2005-12-29 09:09:29.000000000 -0500
+++ linux-2.6.15-rc7/arch/i386/mm/init.c	2005-12-29 14:31:08.000000000 -0500
@@ -6,6 +6,7 @@
  *	Support of BIGMEM added by Gerhard Wichert, Siemens AG, July 1999
  */
 
+#define DYNAMIC_SLABS_BOOTSTRAP
 #include <linux/config.h>
 #include <linux/module.h>
 #include <linux/signal.h>
@@ -748,3 +749,187 @@
 	}
 }
 #endif
+
+#ifdef CONFIG_DYNAMIC_SLABS
+extern void __start____slab_preprocess(void);
+extern unsigned long __start____slab_addresses;
+extern unsigned long __stop____slab_addresses;
+
+static __initdata LIST_HEAD(slablist);
+
+struct slab_links {
+	struct cache_sizes *c;
+	struct list_head list;
+};
+
+static struct cache_sizes *find_slab_size(int size)
+{
+	struct list_head *curr;
+	struct slab_links *s;
+
+	list_for_each(curr, &slablist) {
+		s = list_entry(curr, struct slab_links, list);
+		if (s->c->cs_size == size)
+			return s->c;
+	}
+	return NULL;
+}
+
+static void free_slablist(void)
+{
+	struct list_head *curr, *next;
+	struct slab_links *s;
+
+	list_for_each_safe(curr, next, &slablist) {
+		s = list_entry(curr, struct slab_links, list);
+		list_del(&s->list);
+		kfree(s);
+	}
+}
+
+#ifndef ARCH_KMALLOC_MINALIGN
+#define ARCH_KMALLOC_MINALIGN L1_CACHE_BYTES
+#endif
+#ifndef ARCH_KMALLOC_FLAGS
+#define ARCH_KMALLOC_FLAGS SLAB_HWCACHE_ALIGN
+#endif
+#define BYTES_PER_WORD sizeof(void *)
+
+#ifdef DEBUG_ADDR
+static __init void print_slab_addresses(int hex)
+{
+	unsigned long *slab_addresses = &__start____slab_addresses;
+	unsigned long *end = &__stop____slab_addresses;
+
+	for (; slab_addresses < end; slab_addresses++) {
+		if (hex)
+			printk("slab %p = %lx\n", slab_addresses, *slab_addresses);
+		else
+			printk("slab %p = %ld\n", slab_addresses, *slab_addresses);
+	}
+}
+#else
+# define print_slab_addresses(x) do {} while(0)
+#endif
+
+int __init dynamic_slab_init(void)
+{
+	unsigned long *slab_addresses = &__start____slab_addresses;
+	unsigned long *end = &__stop____slab_addresses;
+	struct cache_sizes *c;
+	struct slab_links *s;
+	unsigned long sizes[] = {
+#define CACHE(C) C,
+#include <linux/kmalloc_sizes.h>
+#undef CACHE
+	};
+	int i;
+
+	asm (".section __slab_process_ret,\"ax\"\n"
+	     "ret\n"
+	     ".previous\n");
+
+	__start____slab_preprocess();
+
+	printk("Before update!\n");
+	print_slab_addresses(0);
+
+	/*
+	 * DYNAMIC_SLABS_BOOTSTRAP is defined, so we don't need
+	 * to worry about kmalloc hardcoded.
+	 */
+
+	/*
+	 * This is really bad, but I don't want to go monkey up the
+	 * slab.c to get to the cache_chain.  So right now I just
+	 * allocate a pointer list to search for slabs that are
+	 * of the right size, and then free it at the end.
+	 *
+	 * Hey, you find a better way, then fix this ;)
+	 */
+	for (i=0; i < sizeof(sizes)/sizeof(sizes[0]); i++) {
+		s = kmalloc(sizeof(*s), GFP_ATOMIC);
+		if (!s)
+			panic("Can't create link list for slabs\n");
+		s->c = &malloc_sizes[i];
+		list_add_tail(&s->list, &slablist);
+	}
+
+	for (; slab_addresses < end; slab_addresses++) {
+		char *name;
+		char *name_dma;
+		unsigned long size = *slab_addresses;
+		struct cache_sizes **ptr = (struct cache_sizes**)slab_addresses;
+
+		if (!size)
+			continue;
+
+		size = (size + (L1_CACHE_BYTES-1)) & ~(L1_CACHE_BYTES-1);
+		if (size < BYTES_PER_WORD)
+			size = BYTES_PER_WORD;
+		if (size < ARCH_KMALLOC_MINALIGN)
+			size = ARCH_KMALLOC_MINALIGN;
+
+		c = find_slab_size(size);
+		if (c) {
+			*ptr = c;
+			continue;
+		}
+
+		/*
+		 * Create a cache for this specific size.
+		 */
+		name = kmalloc(25, GFP_ATOMIC);
+		if (!name)
+			panic("Can't allocate name for dynamic slab\n");
+
+		snprintf(name, 25, "dynamic-%ld", size);
+		name_dma = kmalloc(25, GFP_ATOMIC);
+		if (!name_dma)
+			panic("Can't allocate name for dynamic slab\n");
+
+		snprintf(name_dma, 25, "dynamic_dma-%ld", size);
+
+		c = kmalloc(sizeof(*c), GFP_ATOMIC);
+
+		if (!c)
+			panic("Can't allocate cache_size descriptor\n");
+
+		c->cs_size = size;
+
+		/*
+		 * For performance, all the general caches are L1 aligned.
+		 * This should be particularly beneficial on SMP boxes, as it
+		 * eliminates "false sharing".
+		 * Note for systems short on memory removing the alignment will
+		 * allow tighter packing of the smaller caches.
+		 */
+		c->cs_cachep = kmem_cache_create(name,
+				c->cs_size, ARCH_KMALLOC_MINALIGN,
+				(ARCH_KMALLOC_FLAGS | SLAB_PANIC), NULL, NULL);
+
+		c->cs_dmacachep = kmem_cache_create(name_dma,
+				c->cs_size, ARCH_KMALLOC_MINALIGN,
+				(ARCH_KMALLOC_FLAGS | SLAB_CACHE_DMA | SLAB_PANIC),
+				NULL, NULL);
+
+		s = kmalloc(sizeof(*s), GFP_ATOMIC);
+		if (!s)
+			panic("Can't create link list for slabs\n");
+		s->c = c;
+		list_add_tail(&s->list, &slablist);
+
+		*ptr = c;
+
+	}
+
+	free_slablist();
+
+	printk("\nAfter update!\n");
+	print_slab_addresses(1);
+
+	return 0;
+}
+#endif
Index: linux-2.6.15-rc7/include/asm-i386/dynamic_slab.h
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ linux-2.6.15-rc7/include/asm-i386/dynamic_slab.h	2005-12-29 09:09:53.000000000 -0500
@@ -0,0 +1,20 @@
+
+/*
+ * Included in slab.h
+ *
+ * @c - cache pointer to return base on size
+ * @size - size of cache.
+ */
+__asm__ __volatile__ (
+	"jmp 2f\n"
+	".section __slab_preprocess,\"ax\"\n"
+	"movl %1,1f\n"
+	".previous\n"
+	".section __slab_addresses,\"aw\"\n"
+	".align 4\n"
+	"1:\n"
+	".long 0\n"
+	".previous\n"
+	"2:\n"
+	"movl 1b, %0\n"
+	: "=r"(c) : "i"(size));
Index: linux-2.6.15-rc7/include/linux/slab.h
===================================================================
--- linux-2.6.15-rc7.orig/include/linux/slab.h	2005-12-29 09:09:29.000000000 -0500
+++ linux-2.6.15-rc7/include/linux/slab.h	2005-12-29 09:23:44.000000000 -0500
@@ -80,6 +80,15 @@
 {
 	if (__builtin_constant_p(size)) {
 		int i = 0;
+#if defined(CONFIG_DYNAMIC_SLABS) && !defined(MODULE) && !defined(DYNAMIC_SLABS_BOOTSTRAP)
+		{
+			struct cache_sizes *c;
+# include <asm/dynamic_slab.h>
+			return kmem_cache_alloc((flags & GFP_DMA) ?
+						c->cs_dmacachep :
+						c->cs_cachep, flags);
+		}
+#endif
 #define CACHE(x) \
 		if (size <= x) \
 			goto found; \
Index: linux-2.6.15-rc7/mm/slab.c
===================================================================
--- linux-2.6.15-rc7.orig/mm/slab.c	2005-12-29 09:09:29.000000000 -0500
+++ linux-2.6.15-rc7/mm/slab.c	2005-12-29 14:04:44.000000000 -0500
@@ -86,6 +86,7 @@
  *	All object allocations for a node occur from node specific slab lists.
  */
 
+#define DYNAMIC_SLABS_BOOTSTRAP
 #include <linux/config.h>
 #include <linux/slab.h>
 #include <linux/mm.h>
@@ -1165,6 +1166,19 @@
 	/* Done! */
 	g_cpucache_up = FULL;
 
+#ifdef CONFIG_DYNAMIC_SLABS
+	{
+		extern int dynamic_slab_init(void);
+		/*
+		 * Create the caches that will handle
+		 * kmallocs of constant sizes.
+		 */
+		dynamic_slab_init();
+	}
+#endif
+	/*
+	 */
+
 	/* Register a cpu startup notifier callback
 	 * that initializes ac_data for all new cpus
 	 */

^ permalink raw reply	[flat|nested] 36+ messages in thread
* Re: [POLL] SLAB : Are the 32 and 192 bytes caches really usefull on x86_64 machines ? 2005-12-29 19:48 ` Steven Rostedt @ 2005-12-29 21:16 ` Andi Kleen 0 siblings, 0 replies; 36+ messages in thread
From: Andi Kleen @ 2005-12-29 21:16 UTC (permalink / raw)
To: Steven Rostedt
Cc: Andreas Kleen, linux-kernel, Eric Dumazet, Denis Vlasenko, Matt Mackall, Dave Jones

> OK then, after reading this I figured there must be a way to dynamically
> allocate slab sizes based on the kmalloc constants.  So I spent last
> night and some of this morning coming up with the below patch.

The canonical slab theory is that constant-size allocations are for
fixed objects.  And if they are frequent they should in theory be a
kmem cache, because their object lifetimes should be similar and
clustering them together should give the best fragmentation avoidance.
So in theory, longer term, the dynamic kmallocs are more important
because they cannot be handled like this - and those are not caught by
your patch.  So I'm not sure you're optimizing the right thing here.

Perhaps a good evolution of your patch would be to add some analysis of
the callers and generate a nice compile-time report that people can use
as a guideline to convert kmalloc over to kmem_cache_alloc.

But to do this really well would require dynamic data from runtime.
Given that, I think a runtime patch is better - ideally one that's easy
to use, with someone collecting data from users and then submitting a
patch for a better new set of default slabs.  It would need to be
separate for 32bit and 64bit too.  I guess one could run a fancy
dynamic optimization algorithm to find the best set of slabs from the
data.

-Andi

^ permalink raw reply	[flat|nested] 36+ messages in thread
* Re: [POLL] SLAB : Are the 32 and 192 bytes caches really usefull on x86_64 machines ? 2005-12-28 17:57 ` Andreas Kleen 2005-12-28 21:01 ` Matt Mackall 2005-12-29 19:48 ` Steven Rostedt @ 2006-01-02 8:37 ` Pekka Enberg 2006-01-02 12:45 ` Andi Kleen 2 siblings, 1 reply; 36+ messages in thread From: Pekka Enberg @ 2006-01-02 8:37 UTC (permalink / raw) To: Andreas Kleen; +Cc: Denis Vlasenko, Eric Dumazet, linux-kernel On 12/28/05, Andreas Kleen <ak@suse.de> wrote: > I remember the original slab paper from Bonwick actually mentioned that > power of two slabs are the worst choice for a malloc - but for some reason Linux > chose them anyways. Power of two sizes are bad because memory accesses tend to concentrate on the same cache lines but slab coloring should take care of that. So I don't think there's a problem with using power of twos for kmalloc() caches. Pekka ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [POLL] SLAB : Are the 32 and 192 bytes caches really usefull on x86_64 machines ? 2006-01-02 8:37 ` Pekka Enberg @ 2006-01-02 12:45 ` Andi Kleen 2006-01-02 13:04 ` Pekka J Enberg 0 siblings, 1 reply; 36+ messages in thread From: Andi Kleen @ 2006-01-02 12:45 UTC (permalink / raw) To: Pekka Enberg; +Cc: Denis Vlasenko, Eric Dumazet, linux-kernel On Monday 02 January 2006 09:37, Pekka Enberg wrote: > On 12/28/05, Andreas Kleen <ak@suse.de> wrote: > > I remember the original slab paper from Bonwick actually mentioned that > > power of two slabs are the worst choice for a malloc - but for some reason Linux > > chose them anyways. > > Power of two sizes are bad because memory accesses tend to concentrate > on the same cache lines but slab coloring should take care of that. So > I don't think there's a problem with using power of twos for kmalloc() > caches. There is - who tells you it's the best possible distribution of memory? -Andi ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [POLL] SLAB : Are the 32 and 192 bytes caches really usefull on x86_64 machines ? 2006-01-02 12:45 ` Andi Kleen @ 2006-01-02 13:04 ` Pekka J Enberg 2006-01-02 13:56 ` Andi Kleen 0 siblings, 1 reply; 36+ messages in thread From: Pekka J Enberg @ 2006-01-02 13:04 UTC (permalink / raw) To: Andi Kleen; +Cc: Denis Vlasenko, Eric Dumazet, linux-kernel On 12/28/05, Andreas Kleen <ak@suse.de> wrote: > > > I remember the original slab paper from Bonwick actually mentioned that > > > power of two slabs are the worst choice for a malloc - but for some reason Linux > > > chose them anyways. On Monday 02 January 2006 09:37, Pekka Enberg wrote: > > Power of two sizes are bad because memory accesses tend to concentrate > > on the same cache lines but slab coloring should take care of that. So > > I don't think there's a problem with using power of twos for kmalloc() > > caches. On Mon, 2 Jan 2006, Andi Kleen wrote: > There is - who tells you it's the best possible distribution of memory? Maybe it's not. But that's besides the point. The specific problem Bonwick mentioned is related to cache line distribution and should be taken care of by slab coloring. Internal fragmentation is painful but the worst offenders can be fixed with kmem_cache_alloc(). So I really don't see the problem. On the other hand, I am not opposed to dynamic generic slabs if you can show a clear performance benefit from it. I just doubt you will. Pekka ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [POLL] SLAB : Are the 32 and 192 bytes caches really usefull on x86_64 machines ? 2006-01-02 13:04 ` Pekka J Enberg @ 2006-01-02 13:56 ` Andi Kleen 2006-01-02 15:09 ` Pekka J Enberg 2006-01-02 15:46 ` Jörn Engel 0 siblings, 2 replies; 36+ messages in thread From: Andi Kleen @ 2006-01-02 13:56 UTC (permalink / raw) To: Pekka J Enberg; +Cc: Denis Vlasenko, Eric Dumazet, linux-kernel On Monday 02 January 2006 14:04, Pekka J Enberg wrote: > Maybe it's not. But that's besides the point. It was my point. I don't know what your point was. > The specific problem Bonwick > mentioned is related to cache line distribution and should be taken care > of by slab coloring. Internal fragmentation is painful but the worst > offenders can be fixed with kmem_cache_alloc(). So I really don't see the > problem. On the other hand, I am not opposed to dynamic generic slabs if > you can show a clear performance benefit from it. I just doubt you will. I wasn't proposing fully dynamic slabs, just a better default set of slabs based on real measurements instead of handwaving (like the power of two slabs seemed to have been generated). With separate sets for 32bit and 64bit. Also the goal wouldn't be better performance, but just less waste of memory. I suspect such a move could save much more memory on small systems than any of these "make fundamental debugging tools a CONFIG" patches ever. -Andi ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [POLL] SLAB : Are the 32 and 192 bytes caches really usefull on x86_64 machines ? 2006-01-02 13:56 ` Andi Kleen @ 2006-01-02 15:09 ` Pekka J Enberg 2006-01-02 15:46 ` Jörn Engel 1 sibling, 0 replies; 36+ messages in thread From: Pekka J Enberg @ 2006-01-02 15:09 UTC (permalink / raw) To: Andi Kleen; +Cc: Denis Vlasenko, Eric Dumazet, linux-kernel On Mon, 2 Jan 2006, Andi Kleen wrote: > I wasn't proposing fully dynamic slabs, just a better default set > of slabs based on real measurements instead of handwaving (like > the power of two slabs seemed to have been generated). With separate > sets for 32bit and 64bit. > > Also the goal wouldn't be better performance, but just less waste of memory. > > I suspect such a move could save much more memory on small systems > than any of these "make fundamental debugging tools a CONFIG" patches ever. I misunderstood what you were proposing. Sorry. It makes sense to measure it. Pekka ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [POLL] SLAB : Are the 32 and 192 bytes caches really usefull on x86_64 machines ? 2006-01-02 13:56 ` Andi Kleen 2006-01-02 15:09 ` Pekka J Enberg @ 2006-01-02 15:46 ` Jörn Engel 1 sibling, 0 replies; 36+ messages in thread From: Jörn Engel @ 2006-01-02 15:46 UTC (permalink / raw) To: Andi Kleen; +Cc: Pekka J Enberg, Denis Vlasenko, Eric Dumazet, linux-kernel On Mon, 2 January 2006 14:56:22 +0100, Andi Kleen wrote: > > I wasn't proposing fully dynamic slabs, just a better default set > of slabs based on real measurements instead of handwaving (like > the power of two slabs seemed to have been generated). With separate > sets for 32bit and 64bit. > > Also the goal wouldn't be better performance, but just less waste of memory. My fear would be that this leads to something like the gperf: a perfect distribution of slab caches - until any tiny detail changes. But maybe there is a different distribution that is "pretty good" for all configurations and better than powers of two. > I suspect such a move could save much more memory on small systems > than any of these "make fundamental debugging tools a CONFIG" patches ever. Unlikely. SLOB should be better than SLAB for those purposes, no matter how you arrange the slab caches. Jörn -- Fancy algorithms are slow when n is small, and n is usually small. Fancy algorithms have big constants. Until you know that n is frequently going to be big, don't get fancy. -- Rob Pike ^ permalink raw reply [flat|nested] 36+ messages in thread
end of thread, other threads:[~2006-01-04 5:26 UTC | newest]

Thread overview: 36+ messages:

2005-12-21  8:00 [ANNOUNCE] GIT 1.0.0 Junio C Hamano
2005-12-21  9:11 ` [POLL] SLAB : Are the 32 and 192 bytes caches really usefull on x86_64 machines ? Eric Dumazet
2005-12-21  9:22 ` David S. Miller
2005-12-21 10:03 ` Jan-Benedict Glaw
2005-12-21  9:46 ` Alok kataria
2005-12-21 12:44 ` Ed Tomlinson
2005-12-21 13:20 ` Folkert van Heusden
2005-12-21 13:38 ` Eric Dumazet
2005-12-21 14:09 ` Folkert van Heusden
2005-12-21 16:40 ` Dave Jones
2005-12-21 19:36 ` Folkert van Heusden
2005-12-28  8:32 ` Denis Vlasenko
2005-12-28  8:54 ` Denis Vlasenko
2005-12-28 17:57 ` Andreas Kleen
2005-12-28 21:01 ` Matt Mackall
2005-12-29  1:26 ` Dave Jones
2005-12-30  4:06 ` Steven Rostedt
2006-01-02  8:46 ` Pekka Enberg
2006-01-02  8:51 ` Pekka Enberg
2006-01-02 12:33 ` Steven Rostedt
2006-01-02 12:31 ` Steven Rostedt
2005-12-29  1:29 ` Dave Jones
2005-12-29  1:50 ` Keith Owens
2005-12-29  2:39 ` Dave Jones
2006-01-02 15:03 ` Helge Hafting
2006-01-04  5:26 ` Dave Jones
2005-12-30 21:13 ` Marcelo Tosatti
2005-12-31 20:13 ` Andi Kleen
2005-12-29 19:48 ` Steven Rostedt
2005-12-29 21:16 ` Andi Kleen
2006-01-02  8:37 ` Pekka Enberg
2006-01-02 12:45 ` Andi Kleen
2006-01-02 13:04 ` Pekka J Enberg
2006-01-02 13:56 ` Andi Kleen
2006-01-02 15:09 ` Pekka J Enberg
2006-01-02 15:46 ` Jörn Engel