* [PATCH][RFC 2] cleaning up struct sock
From: Arnaldo Carvalho de Melo @ 2001-12-18  5:35 UTC (permalink / raw)
  To: David S. Miller
  Cc: SteveW, jschlst, ncorbic, eis, dag, torvalds, marcelo, netdev,
      Linux Kernel Mailing List

On Mon, Dec 10, 2001 at 11:18:26PM -0800, David S. Miller wrote:
>    From: Arnaldo Carvalho de Melo <acme@conectiva.com.br>
>    Date: Mon, 10 Dec 2001 23:08:10 -0200
>
>    David, if you don't like it, I'll happily switch to the big fat
>    union idea, but I think that this is cleaner and will avoid us
>    having to patch sock.h every time a new net stack is added to
>    the kernel.
>
> I'm a little concerned about having to allocate two objects instead
> of one.

OK, this new patch doesn't allocate two objects, just one, like it is
today; more below.

> These things aren't like inodes.  Inodes are cached and lookup
> read-multiple objects, whereas sockets are throw-away and recycled
> objects.  Inode allocation performance therefore isn't that critical,
> but socket allocation performance is.
>
> Then we go back to the old problem of protocols that can be used to
> embed IP and thus having to keep track of parallel state for multiple
> protocols.  I think your changes did not compromise that for what
> we currently support, though.
>
> I still need to think about this some more....
>
> You know, actually, the protocols are the ones which call sk_alloc().
> So we could just extend sk_alloc() to take a kmem_cache_t argument.
Well, in this patch I added two new fields to net_proto_family,
sk_cachep and sk_size. The current protocols pass these as zero without
modification, in which case net_families[proto]->sk_cachep gets the
current sk_cachep slab cache; the families that actually initialize the
sk_cachep and sk_size members of net_proto_family use a private slab
cache whose objsize == sizeof(struct sock) + sizeof(proto_opt). With
this I removed sk->tp_pinfo altogether and use the TCP_PINFO macro like
you suggested; net_pinfo is gone too, and sk->protinfo is just a void
pointer now.

> TCP could thus make a kmem_cache_t which is sizeof(struct sock)
> + sizeof(struct tcp_opt) and then set the TP_INFO pointer to
> "(sk + 1)".
>
> Oh yes, another overhead is all the extra dereferencing.  To fight
> that we could make a macro that knows the above layout:
>
> #define TCP_PINFO(SK) ((struct tcp_opt *)((SK) + 1))

Yes.

> So I guess we could do things your way without any of the potential
> performance problems.
>
> It is going to be a while before I can apply something like this; I
> would like to help Jens and Linus get the new block stuff in shape
> first.  This would obviously be a 2.5.x change too.

OK, this is just for comments; I'll do the modifications that people
agree on here. The changes were rather minimal, i.e., TCP was already
using this style:

int tcp_foo_function(struct sock *sk)
{
	struct tcp_opt *tp = &sk->tp_pinfo.af_tcp;

and in the patch this just changes to:

	struct tcp_opt *tp = TCP_PINFO(sk);

in most places. I could make patches so that the IP_SK case follows a
style similar to the current tcp_opt usage (i.e., like the code excerpt
above).

This message was sent using the patch 8)

All the protocols, khttpd, netfilter, etc. are already converted to
this new style, and the only thing that still has to be done is to
remove things like daddr, saddr, rcv_saddr, dport, sport and the other
IPv4-specific members of struct sock.
Ah, another thing is to try and make rtnetlink use sock_register before
calling sk_alloc, so that I can remove the check in sk_alloc and
net_proto_sk_size for net_families[family] being NULL.

Please let me know if this is something acceptable for 2.5. The patch
is still against 2.4.16, but it should apply cleanly to 2.5.1, I think.
Of course I'll make sure it works with 2.5.1 if it is considered OK.

I'll stop working on it for now until further comments are made by the
net stack maintainers, and concentrate on a new task: doing the same
thing for struct inode, where, it seems, we won't even need the per-fs
slab caches, just a private void pointer 8)

Here is an example of how the slab caches look:

[acme@rama2 acme]$ grep sock /proc/slabinfo
unix_sock      5   10   396   1   1   1 :   17   735   2   1   0
inet_sock     17   20   800   4   4   1 :   25   207   8   4   0
sock           0    0   332   0   0   1 :    0     0   0   0   0
[acme@rama2 acme]$

And without the patch, using 2.4.16-pre1:

[acme@brinquedo linux]$ grep sock /proc/slabinfo
sock          76   84  1056  11  12   2 :   87  4730  19   7   0

With this we get memory savings and performance gains, in addition to
the much needed (IMHO) cleanup of include/net/sock.h 8) On big busy
servers these savings can reach over one megabyte of kernel memory;
please correct me if I'm wrong :) IPv6 sockets use about 980 bytes, so
for a kernel with IPv6 compiled in, even as a module, one gets savings
even in the IPv4 sockets case.

Patch available at:

http://www.kernel.org/pub/linux/kernel/people/acme/v2.4/2.4.16/sock.cleanup-5.patch.bz2

A not so complete changelog is at:

http://www.kernel.org/pub/linux/kernel/people/acme/v2.4/2.4.16/patch-2.4.16.log

It can be of help in understanding this patch.

Waiting for comments and testers' results,

TIA,

- Arnaldo
* Re: [PATCH][RFC 2] cleaning up struct sock
From: Arnaldo Carvalho de Melo @ 2001-12-18 12:01 UTC (permalink / raw)
  To: David S. Miller
  Cc: SteveW, jschlst, ncorbic, eis, dag, torvalds, marcelo, netdev,
      linux-kernel

On Mon, Dec 17, 2001 at 10:51:34PM -0800, David S. Miller wrote:
>    From: Arnaldo Carvalho de Melo <acme@conectiva.com.br>
>    Date: Tue, 18 Dec 2001 03:35:52 -0200
>
>       the only thing that still has to be done is to remove things
>       like daddr, saddr, rcv_saddr, dport, sport and other ipv4
>       specific members of struct sock
>
> Actually, I'd like to keep the first couple of cache lines of struct
> sock the way they are :-(  For hash lookups, the identity plus the
> hash next pointer fit perfectly in one cache line on nearly all
> platforms.

Fair.

> Which brings me to...
>
>    Please let me know if this is something acceptable for 2.5.
>
> What kind of before/after effect do you see in lat_tcp/lat_connect
> (from lmbench) runs?

I will see today; I concentrated on the cleanup part, trying not to
harm performance, by following the suggestions for the first patch
(i.e., just one allocation, etc.). I'll test it later today at the
lab, UP and SMP (4- and 8-way), and submit the results here.

Apart from possible performance problems, does the patch look OK?

- Arnaldo
* Re: [PATCH][RFC 2] cleaning up struct sock
From: Arnaldo Carvalho de Melo @ 2001-12-18 20:52 UTC (permalink / raw)
  To: David S. Miller
  Cc: SteveW, jschlst, ncorbic, eis, dag, torvalds, marcelo, netdev,
      linux-kernel

On Mon, Dec 17, 2001 at 10:51:34PM -0800, David S. Miller wrote:
> Which brings me to...
>
>    Please let me know if this is something acceptable for 2.5.
>
> What kind of before/after effect do you see in lat_tcp/lat_connect
> (from lmbench) runs?

Improvements in the lat_connect case? :)

2.4.16

TCP latency using 127.0.0.1: 119.3369 microseconds
TCP latency using 127.0.0.1: 118.9847 microseconds
TCP latency using 127.0.0.1: 118.5139 microseconds
TCP latency using 127.0.0.1: 119.1301 microseconds
TCP latency using 127.0.0.1: 118.6322 microseconds

TCP/IP connection cost to 127.0.0.1: 429.6667 microseconds
TCP/IP connection cost to 127.0.0.1: 430.7692 microseconds
TCP/IP connection cost to 127.0.0.1: 431.4615 microseconds
TCP/IP connection cost to 127.0.0.1: 430.3846 microseconds
TCP/IP connection cost to 127.0.0.1: 435.4615 microseconds

2.4.16-acme5

TCP latency using 127.0.0.1: 119.2639 microseconds
TCP latency using 127.0.0.1: 118.6068 microseconds
TCP latency using 127.0.0.1: 119.0443 microseconds
TCP latency using 127.0.0.1: 119.5683 microseconds
TCP latency using 127.0.0.1: 118.9556 microseconds

TCP/IP connection cost to 127.0.0.1: 408.3571 microseconds
TCP/IP connection cost to 127.0.0.1: 409.6429 microseconds
TCP/IP connection cost to 127.0.0.1: 410.6429 microseconds
TCP/IP connection cost to 127.0.0.1: 409.2143 microseconds
TCP/IP connection cost to 127.0.0.1: 414.8333 microseconds

More results are coming, this time for local connections on an 8-way
box.

- Arnaldo
* Re: [PATCH][RFC 2] cleaning up struct sock
From: David S. Miller @ 2001-12-18 21:08 UTC (permalink / raw)
  To: acme
  Cc: SteveW, jschlst, ncorbic, eis, dag, torvalds, marcelo, netdev,
      linux-kernel

   From: Arnaldo Carvalho de Melo <acme@conectiva.com.br>
   Date: Tue, 18 Dec 2001 18:52:00 -0200

   On Mon, Dec 17, 2001 at 10:51:34PM -0800, David S. Miller wrote:
   > Which brings me to...
   >
   >    Please let me know if this is something acceptable for 2.5.
   >
   > What kind of before/after effect do you see in lat_tcp/lat_connect
   > (from lmbench) runs?

   Improvements in the lat_connect case? :)

Great.

I have no fundamental problems with your changes.

Now, when/if we move the hash-chain/identity members into the IPv4
struct (there are some issues with this wrt. ipv6, btw) I will be
interested in seeing the same tests done :-)
* [PATCH][RFC 3] cleaning up struct sock
From: Arnaldo Carvalho de Melo @ 2001-12-20  3:23 UTC (permalink / raw)
  To: David S. Miller
  Cc: SteveW, jschlst, ncorbic, eis, dag, torvalds, marcelo, netdev,
      linux-kernel

OK, patch for 2.5.1, without the bogus CVS $Id string hunks, in use on
this machine now. Available at:

http://www.kernel.org/pub/linux/kernel/people/acme/v2.5/2.5.1/sock.cleanup-2.5.1.patch.bz2

Ah, the lat_unix_connect results on a Pentium 300 MMX notebook:

2.5.1 + this patch

UNIX connection cost :  96.1749 microseconds
UNIX connection cost :  96.3361 microseconds
UNIX connection cost :  97.2310 microseconds
UNIX connection cost : 101.9180 microseconds
UNIX connection cost :  97.2461 microseconds

2.4.16 pristine

UNIX connection cost : 112.7034 microseconds
UNIX connection cost : 114.5494 microseconds
UNIX connection cost : 114.0923 microseconds
UNIX connection cost : 111.0959 microseconds
UNIX connection cost : 120.8419 microseconds

And about 100 KB of kernel memory saved for AF_UNIX sockets on a basic
KDE session (i.e., the AF_UNIX struct sock is now about 400 bytes,
where it is about 1200 bytes on a pristine kernel).

- Arnaldo
* Re: [PATCH][RFC 3] cleaning up struct sock
From: David S. Miller @ 2001-12-20  8:21 UTC (permalink / raw)
  To: acme
  Cc: SteveW, jschlst, ncorbic, eis, dag, torvalds, marcelo, netdev,
      linux-kernel

   From: Arnaldo Carvalho de Melo <acme@conectiva.com.br>
   Date: Thu, 20 Dec 2001 01:23:39 -0200

   Available at:

   http://www.kernel.org/pub/linux/kernel/people/acme/v2.5/2.5.1/sock.cleanup-2.5.1.patch.bz2

Looking pretty good.  I have one improvement.

I'd rather you pass the "kmem_cache_t" directly into sk_alloc, use
NULL for "I don't have any extra private area".

And then, for the IP case, lay it out like this:

	struct sock
	struct ip_opt
	struct {tcp,raw4,...}_opt

And use different kmem_cache_t's for each protocol instead of the same
one for tcp, raw4, etc.

RAW/UDP sockets waste a lot of space with your current layout.
* Re: [PATCH][RFC 3] cleaning up struct sock
From: Arnaldo Carvalho de Melo @ 2001-12-20 12:37 UTC (permalink / raw)
  To: David S. Miller
  Cc: SteveW, jschlst, ncorbic, eis, dag, torvalds, marcelo, netdev,
      linux-kernel

On Thu, Dec 20, 2001 at 12:21:26AM -0800, David S. Miller wrote:
>    From: Arnaldo Carvalho de Melo <acme@conectiva.com.br>
>    Date: Thu, 20 Dec 2001 01:23:39 -0200
>
>       Available at:
>
>       http://www.kernel.org/pub/linux/kernel/people/acme/v2.5/2.5.1/sock.cleanup-2.5.1.patch.bz2
>
> Looking pretty good.  I have one improvement.
>
> I'd rather you pass the "kmem_cache_t" directly into sk_alloc, use
> NULL for "I don't have any extra private area".

Hmm, I did it via sock_register to avoid changing all the sk_alloc
users, but in the end all the protocols were changed anyway, so... OK,
I'll do that; at least it'll simplify the "rtnetlink socket allocated
early in the boot process, before sock_register(rtnetlink) was called"
case.

> And then, for the IP case lay it out like this:
>
>	struct sock
>	struct ip_opt
>	struct {tcp,raw4,...}_opt
>
> And use different kmem_cache_t's for each protocol instead of
> the same one for tcp, raw4, etc.
>
> RAW/UDP sockets waste a lot of space with your current layout.

*grin* OK, OK, let's save more bytes 8) I'll look into this.

- Arnaldo
* Re: [PATCH][RFC 3] cleaning up struct sock
From: Arnaldo Carvalho de Melo @ 2001-12-21 13:54 UTC (permalink / raw)
  To: David S. Miller
  Cc: SteveW, jschlst, ncorbic, eis, dag, torvalds, marcelo, netdev,
      linux-kernel

On Thu, Dec 20, 2001 at 12:21:26AM -0800, David S. Miller wrote:
>    From: Arnaldo Carvalho de Melo <acme@conectiva.com.br>
>    Date: Thu, 20 Dec 2001 01:23:39 -0200
>
>       Available at:
>
>       http://www.kernel.org/pub/linux/kernel/people/acme/v2.5/2.5.1/sock.cleanup-2.5.1.patch.bz2
>
> Looking pretty good.  I have one improvement.
>
> I'd rather you pass the "kmem_cache_t" directly into sk_alloc, use
> NULL for "I don't have any extra private area".
>
> And then, for the IP case lay it out like this:
>
>	struct sock
>	struct ip_opt
>	struct {tcp,raw4,...}_opt
>
> And use different kmem_cache_t's for each protocol instead of
> the same one for tcp, raw4, etc.
>
> RAW/UDP sockets waste a lot of space with your current layout.

Indeed it does, but the current setup in the stock kernel wastes even
more ;)

Well, I did what you suggested, adding a slab parameter to sk_alloc. I
also overloaded zero_it, but its current behaviour is maintained, i.e.,
0 == don't zero the newly allocated sock, 1 == zero it; the overloading
is: 1 == sizeof(struct sock), > 1 == the objsize of the per-protocol
slab cache.
For now I did this only for UDPv4 sockets; I'll do the others this
afternoon. This is the result so far:

[rama2 kernel-acme]$ grep sock /proc/slabinfo
unix_sock      7   20   400   1   2   1 :   17   572   2   0   0
udp_sock       6   10   372   1   1   1 :    7    31   1   0   0
tcp_sock      13   15   800   3   3   1 :   13    46   3   0   0
sock           0    0   336   0   0   1 :    0     0   0   0   0

Now UDP sockets use only 372 bytes, while in the stock kernel they use
1280 bytes when all the protocols are selected (as modules or
statically linked; more than 1 KB even when just TCP/IP v4 is
selected).

More to come. The patch will be available later today.

- Arnaldo
* [PATCH][RFC 4] cleaning up struct sock
From: Arnaldo Carvalho de Melo @ 2001-12-22  3:28 UTC (permalink / raw)
  To: David S. Miller, SteveW, jschlst, ncorbic, eis, dag, torvalds,
      marcelo, netdev, linux-kernel

On Fri, Dec 21, 2001 at 11:54:38AM -0200, Arnaldo Carvalho de Melo
wrote:
> On Thu, Dec 20, 2001 at 12:21:26AM -0800, David S. Miller wrote:
> > I'd rather you pass the "kmem_cache_t" directly into sk_alloc, use
> > NULL for "I don't have any extra private area".
> >
> > And then, for the IP case lay it out like this:
> >
> >	struct sock
> >	struct ip_opt
> >	struct {tcp,raw4,...}_opt
> >
> > And use different kmem_cache_t's for each protocol instead of
> > the same one for tcp, raw4, etc.
> >
> > RAW/UDP sockets waste a lot of space with your current layout.

Done. Patch available at:

http://www.kernel.org/pub/linux/kernel/people/acme/v2.5/2.5.2-pre1/sock.cleanup-2.5.2-pre1.bz2

Current state of /proc/slabinfo:

[acme@rama2 acme]$ grep sock /proc/slabinfo
unix_sock      7   20   400   1   2   1 :   17   572   2   0   0
raw4_sock      0   10   376   0   1   1 :    1     3   1   0   0
udp_sock       6   10   372   1   1   1 :    7    31   1   0   0
tcp_sock      13   15   800   3   3   1 :   14    47   3   0   0
sock           0    0   336   0   0   1 :    0     0   0   0   0

TODO: do the same for IPv6, which for now has only one slab cache for
all its protocols.

- Arnaldo