netdev.vger.kernel.org archive mirror
* [RFC] tcp: use order-3 pages in tcp_sendmsg()
@ 2012-09-17  7:49 Eric Dumazet
  2012-09-17 16:12 ` David Miller
  2012-11-15  7:52 ` Yan, Zheng 
  0 siblings, 2 replies; 37+ messages in thread
From: Eric Dumazet @ 2012-09-17  7:49 UTC (permalink / raw)
  To: netdev

We currently use per socket page reserve for tcp_sendmsg() operations.

It's done to raise the probability of coalescing small write() calls into
single segments in the skbs.

But it wastes a lot of memory for applications handling a lot of mostly
idle sockets, since each socket holds one page in sk->sk_sndmsg_page.

I did a small experiment using order-3 pages and it gave me a 10%
performance boost, because each TSO skb needs only two frags of 32KB
instead of 16 frags of 4KB. We therefore spend less time in
ndo_start_xmit() setting up the TX descriptors, and less time in the TX
completion path unmapping the frags and freeing them.

We also spend less time in tcp_sendmsg(), because we call the page
allocator 8x less often.
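
The arithmetic behind those numbers can be sketched as below. This is a
userspace illustration only, not kernel code: frags_needed() is a
hypothetical helper, and the 64KB payload is an assumption (it is what
16 frags of 4KB implies).

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper, for illustration only.
 * Number of fragments needed to hold `payload` bytes when each fragment
 * is one allocation of (base_page_size << order) bytes.
 */
size_t frags_needed(size_t payload, size_t base_page_size, unsigned int order)
{
	size_t frag_size = base_page_size << order;

	assert(frag_size > 0);
	/* Round up: a partially filled fragment still costs one fragment. */
	return (payload + frag_size - 1) / frag_size;
}
```

With a 4KB base page, frags_needed(65536, 4096, 0) is 16 and
frags_needed(65536, 4096, 3) is 2, which is where the 8x reduction in
page allocator calls comes from.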

Now back to the per socket page: what about trying to factorize it?

Since we can sleep (and/or do a cpu migration) in tcp_sendmsg(), we can't
really use a percpu page reserve as we do in __netdev_alloc_frag().

We could instead use a per thread reserve, at the cost of adding a test
in the task exit handler.

Recap:

1) Use a per thread page reserve instead of a per socket one
2) Use order-3 pages (or order-0 pages if page size is >= 32768)
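
To make the recap concrete, here is a rough userspace sketch of what a
per thread reserve could look like. It is not the kernel implementation
and not a proposed patch: every name in it (struct chunk, struct frag,
reserve_get(), frag_put(), reserve_exit()) is hypothetical, malloc()
stands in for the page allocator, a plain counter stands in for the
page refcount, and CHUNK_SIZE assumes 4KB pages.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define CHUNK_SIZE 32768u	/* one order-3 allocation, assuming 4KB pages */

/* Stand-in for an order-3 page; `refs` mimics the page refcount. */
struct chunk {
	unsigned int refs;
	char data[CHUNK_SIZE];
};

/* Stand-in for an skb frag: a byte range inside a chunk. */
struct frag {
	struct chunk *chunk;
	size_t offset;
	size_t len;
};

/* The per thread reserve: current chunk and how much of it is used. */
static _Thread_local struct chunk *cur;
static _Thread_local size_t cur_off;

/* Not atomic: this sketch assumes frags are released on the same thread. */
static void chunk_put(struct chunk *c)
{
	if (--c->refs == 0)
		free(c);
}

/* Carve `len` bytes out of the calling thread's reserve. 0 on success. */
int reserve_get(size_t len, struct frag *out)
{
	if (len == 0 || len > CHUNK_SIZE)
		return -1;
	if (cur && CHUNK_SIZE - cur_off < len) {
		chunk_put(cur);		/* drop the reserve's own reference */
		cur = NULL;
	}
	if (!cur) {
		cur = malloc(sizeof(*cur));
		if (!cur)
			return -1;
		cur->refs = 1;		/* reference held by the reserve */
		cur_off = 0;
	}
	cur->refs++;			/* reference held by the new frag */
	out->chunk = cur;
	out->offset = cur_off;
	out->len = len;
	cur_off += len;
	return 0;
}

/* Release one frag, as the TX completion path would. */
void frag_put(struct frag *f)
{
	chunk_put(f->chunk);
	f->chunk = NULL;
}

/* The extra test in the task exit handler: drop the reserve, if any. */
void reserve_exit(void)
{
	if (cur) {
		chunk_put(cur);
		cur = NULL;
		cur_off = 0;
	}
}
```

Small consecutive requests from one thread land in the same chunk at
increasing offsets, whichever socket they are for, so an idle socket
holds no memory of its own. The chunk is freed once the reserve and all
frags carved from it have dropped their references.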


Thread overview: 37+ messages
2012-09-17  7:49 [RFC] tcp: use order-3 pages in tcp_sendmsg() Eric Dumazet
2012-09-17 16:12 ` David Miller
2012-09-17 17:02   ` Eric Dumazet
2012-09-17 17:04     ` Eric Dumazet
2012-09-17 17:07       ` David Miller
2012-09-19 15:14         ` Eric Dumazet
2012-09-19 17:28           ` Rick Jones
2012-09-19 17:55             ` Eric Dumazet
2012-09-19 17:56           ` David Miller
2012-09-19 19:04             ` Alexander Duyck
2012-09-19 20:18           ` Ben Hutchings
2012-09-19 22:20           ` Vijay Subramanian
2012-09-20  5:37             ` Eric Dumazet
2012-09-20 17:10               ` Rick Jones
2012-09-20 17:43                 ` Eric Dumazet
2012-09-20 18:37                   ` Yuchung Cheng
2012-09-20 19:40                 ` David Miller
2012-09-20 20:06                   ` Rick Jones
2012-09-20 20:25                     ` Eric Dumazet
2012-09-21 15:48                       ` Eric Dumazet
2012-09-21 16:27                         ` David Miller
2012-09-21 16:51                           ` Eric Dumazet
2012-09-21 17:04                             ` David Miller
2012-09-21 17:11                               ` Eric Dumazet
2012-09-23 12:47                           ` Jan Engelhardt
2012-09-23 16:16                             ` David Miller
2012-09-23 17:40                               ` Jan Engelhardt
2012-09-23 18:13                                 ` Eric Dumazet
2012-09-23 18:27                                 ` David Miller
2012-09-20 21:39               ` Vijay Subramanian
2012-09-20 22:01               ` Rick Jones
2012-11-15  7:52 ` Yan, Zheng 
2012-11-15 13:06   ` Eric Dumazet
2012-11-16  2:36     ` Yan, Zheng 
2012-11-15 13:47   ` Eric Dumazet
2012-11-21  8:05     ` Yan, Zheng
2012-11-15 18:33   ` Rick Jones
