* design for TSO performance fix
@ 2005-01-28 0:31 David S. Miller
2005-01-28 0:51 ` Rick Jones
` (3 more replies)
0 siblings, 4 replies; 13+ messages in thread
From: David S. Miller @ 2005-01-28 0:31 UTC (permalink / raw)
To: netdev
Ok, here is the best idea I've been able to come up with
so far.
The basic idea is that we stop trying to build TSO frames
in the actual transmit queue. Instead, TSO packets are
built on the fly when we actually output packets from the
transmit queue.
Advantages:
1) No knowledge of TSO frames needs to exist anywhere besides
tcp_write_xmit(), tcp_transmit_skb(), and
tcp_xmit_retransmit_queue()
2) As a result of #1, all the pcount crap goes away.
The need for two MSS state variables (mss_cache
and mss_cache_std) and the associated complexity is
eliminated as well.
3) Keeping TSO enabled after packet loss "just works".
4) CWND is sampled at the correct moment when deciding
the TSO packet arity.
The one disadvantage is that it might be a tiny bit more
expensive to build TSO frames. But I am sure we can find
ways to optimize that quite well.
The main element of the TSO output logic is a function
that is schemed as follows:
static inline int tcp_skb_data_all_paged(struct sk_buff *skb)
{
	return (skb->len == skb->data_len);
}

/* If possible, append paged data of SRC_SKB onto the
 * tail of DST_SKB.
 */
static int skb_append_pages(struct sk_buff *dst_skb, struct sk_buff *src_skb)
{
	int i;

	if (!tcp_skb_data_all_paged(src_skb))
		return -EINVAL;

	for (i = 0; i < skb_shinfo(src_skb)->nr_frags; i++) {
		skb_frag_t *src_frag = &skb_shinfo(src_skb)->frags[i];
		skb_frag_t *dst_frag;
		int dst_frag_idx;

		dst_frag_idx = skb_shinfo(dst_skb)->nr_frags;

		if (skb_can_coalesce(dst_skb, dst_frag_idx,
				     src_frag->page, src_frag->page_offset)) {
			dst_frag = &skb_shinfo(dst_skb)->frags[dst_frag_idx-1];
			dst_frag->size += src_frag->size;
		} else {
			if (dst_frag_idx >= MAX_SKB_FRAGS)
				return -EMSGSIZE;

			dst_frag = &skb_shinfo(dst_skb)->frags[dst_frag_idx];
			skb_shinfo(dst_skb)->nr_frags = dst_frag_idx + 1;

			dst_frag->page = src_frag->page;
			get_page(src_frag->page);

			dst_frag->page_offset = src_frag->page_offset;
			dst_frag->size = src_frag->size;
		}
		/* Keep len == data_len so the all-paged invariant holds. */
		dst_skb->len += src_frag->size;
		dst_skb->data_len += src_frag->size;
	}

	return 0;
}
static struct sk_buff *tcp_tso_build(struct sk_buff *head, int mss, int num)
{
	struct sk_buff *skb;
	struct sock *sk;
	int err;

	sk = head->sk;
	skb = alloc_skb(sk->sk_prot->max_header, GFP_ATOMIC);
	err = -ENOMEM;
	if (!skb)
		goto fail;

	err = 0;
	skb_shinfo(skb)->tso_size = mss;
	skb_shinfo(skb)->tso_segs = num;
	while (num--) {
		err = skb_append_pages(skb, head);
		if (err)
			goto fail;

		head = head->next;
	}
	return skb;

fail:
	/* kfree_skb() drops the page references taken by
	 * skb_append_pages(), so no explicit put_page() loop
	 * is needed here.
	 */
	if (skb)
		kfree_skb(skb);
	return NULL;
}
If tcp_tso_build() fails, the caller just falls back to the
normal path of sending the frames non-TSO one-by-one.
The logic is simple because if TSO is being done we know
that all of the SKB data is paged (since SG+CSUM is a
requirement for TSO). The one case where that
invariant might fail is due to a routing change (previous
device cannot do SG+CSUM, new device has full TSO capability)
and that is handled via the tcp_skb_data_all_paged() checks.
My thinking is that whatever added expense this new scheme
has is offset by the simplifications to the rest of the TCP
stack, since it will no longer need to know anything
about multiple MSS values and packet counts.
Comments?
^ permalink raw reply [flat|nested] 13+ messages in thread* Re: design for TSO performance fix 2005-01-28 0:31 design for TSO performance fix David S. Miller @ 2005-01-28 0:51 ` Rick Jones 2005-01-28 0:58 ` David S. Miller 2005-01-28 1:31 ` Herbert Xu ` (2 subsequent siblings) 3 siblings, 1 reply; 13+ messages in thread From: Rick Jones @ 2005-01-28 0:51 UTC (permalink / raw) To: netdev David S. Miller wrote: > Ok, here is the best idea I've been able to come up with > so far. > > The basic idea is that we stop trying to build TSO frames > in the actual transmit queue. Instead, TSO packets are > built impromptu when we actually output packets on the > transmit queue. > > Advantages: > > 1) No knowledge of TSO frames need exist anywhere besides > tcp_write_xmit(), tcp_transmit_skb(), and > tcp_xmit_retransmit_queue() > > 2) As a result of #1, all the pcount crap goes away. > The need for two MSS state variables (mss_cache, > and mss_cache_std) and assosciated complexity is > eliminated as well. > > 3) Keeping TSO enabled after packet loss "just works". Doubleplusgood. > > 4) CWND sampled at the correct moment when deciding > the TSO packet arity. > > The one disadvantage is that it might be a tiny bit more > expensive to build TSO frames. But I am sure we can find > ways to optimize that quite well. > > The main element of the TSO output logic is a function > that is schemed as follows: > > ... > > If tcp_tso_build() fails, the caller just falls back to the > normal path of sending the frames non-TSO one-by-one. > > The logic is simple because if TSO is being done we know > that all of the SKB data is paged (since SG+CSUM is a > requirement for TSO). The one case where that > invariant might fail is due to a routing change (previous > device cannot do SG+CSUM, new device has full TSO capability) > and that is handled via the tcp_skb_data_all_paged() checks. 
> > My thinking is that whatever added expensive this new scheme > has, is offset by the simplifications the rest of the TCP > stack will have since it will no longer need to know anything > about multiple MSS values and packet counts. > > Comments? Does anything (need to) change wrt getting the size of the TSO's to increase as cwnd increases? rick jones ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: design for TSO performance fix 2005-01-28 0:51 ` Rick Jones @ 2005-01-28 0:58 ` David S. Miller 0 siblings, 0 replies; 13+ messages in thread From: David S. Miller @ 2005-01-28 0:58 UTC (permalink / raw) To: Rick Jones; +Cc: netdev On Thu, 27 Jan 2005 16:51:31 -0800 Rick Jones <rick.jones2@hp.com> wrote: > Does anything (need to) change wrt getting the size of the TSO's to increase as > cwnd increases? Nope, we use the same algorithm we use currently to determine the "TSO mss", except that we compute and apply it at the correct moment. ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: design for TSO performance fix 2005-01-28 0:31 design for TSO performance fix David S. Miller 2005-01-28 0:51 ` Rick Jones @ 2005-01-28 1:31 ` Herbert Xu 2005-01-28 5:19 ` David S. Miller 2005-01-28 1:57 ` Thomas Graf 2005-01-28 6:25 ` Andi Kleen 3 siblings, 1 reply; 13+ messages in thread From: Herbert Xu @ 2005-01-28 1:31 UTC (permalink / raw) To: David S. Miller; +Cc: netdev David S. Miller <davem@davemloft.net> wrote: > > Ok, here is the best idea I've been able to come up with > so far. It sounds great! > 2) As a result of #1, all the pcount crap goes away. > The need for two MSS state variables (mss_cache, > and mss_cache_std) and assosciated complexity is > eliminated as well. Does this mean that we'll start counting bytes instead of packets? If not then please let me know on how you plan to do the packet counting. > static struct sk_buff *tcp_tso_build(struct sk_buff *head, int mss, int num) > { > struct sk_buff *skb; > struct sock *sk; > int err; > > sk = head->sk; > skb = alloc_skb(sk->sk_prot->max_header, GFP_ATOMIC); The other good thing about this is that if we do this for all packets including non-TSO ones, then the TCP stack doesn't have to own the TCP/IP headers at all. Then we can stop worrying about the TSO/COW mangling. Cheers, -- Visit Openswan at http://www.openswan.org/ Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au> Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: design for TSO performance fix 2005-01-28 1:31 ` Herbert Xu @ 2005-01-28 5:19 ` David S. Miller 2005-01-28 5:44 ` Herbert Xu 0 siblings, 1 reply; 13+ messages in thread From: David S. Miller @ 2005-01-28 5:19 UTC (permalink / raw) To: Herbert Xu; +Cc: netdev On Fri, 28 Jan 2005 12:31:53 +1100 Herbert Xu <herbert@gondor.apana.org.au> wrote: > > 2) As a result of #1, all the pcount crap goes away. > > The need for two MSS state variables (mss_cache, > > and mss_cache_std) and assosciated complexity is > > eliminated as well. > > Does this mean that we'll start counting bytes instead > of packets? > > If not then please let me know on how you plan to do the > packet counting. Things will be same as what we have now, except multi-packet SKBs will no longer exist in the retransmit queue. > The other good thing about this is that if we do this for all > packets including non-TSO ones, then the TCP stack doesn't have > to own the TCP/IP headers at all. Then we can stop worrying > about the TSO/COW mangling. Hmmm, have to think about that some more. ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: design for TSO performance fix 2005-01-28 5:19 ` David S. Miller @ 2005-01-28 5:44 ` Herbert Xu 2005-01-28 19:28 ` David S. Miller 0 siblings, 1 reply; 13+ messages in thread From: Herbert Xu @ 2005-01-28 5:44 UTC (permalink / raw) To: David S. Miller; +Cc: netdev On Thu, Jan 27, 2005 at 09:19:40PM -0800, David S. Miller wrote: > > > Does this mean that we'll start counting bytes instead > > of packets? > > > > If not then please let me know on how you plan to do the > > packet counting. > > Things will be same as what we have now, except multi-packet > SKBs will no longer exist in the retransmit queue. Colour me confused then. How are you going to remember the packet boundaries which we need to do if we're going to keep counting packets instead of bytes? -- Visit Openswan at http://www.openswan.org/ Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au> Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: design for TSO performance fix 2005-01-28 5:44 ` Herbert Xu @ 2005-01-28 19:28 ` David S. Miller 2005-01-29 10:12 ` Herbert Xu 0 siblings, 1 reply; 13+ messages in thread From: David S. Miller @ 2005-01-28 19:28 UTC (permalink / raw) To: Herbert Xu; +Cc: netdev On Fri, 28 Jan 2005 16:44:41 +1100 Herbert Xu <herbert@gondor.apana.org.au> wrote: > Colour me confused then. How are you going to remember the > packet boundaries which we need to do if we're going to keep > counting packets instead of bytes? It's just like how the code was before I added all of that tcp_pcount_t code. The retransmit queue only ever contains normal MSS sized frames. When we decide to send something off the queue, we try to build them up into TSO frames. Congestion control etc. decisions are still made by packet counting. When we get ACKs and SACKs back, we can just trim and mark the retransmit queue in the simplest way since we don't have TSO packets in there anymore. TSO packets only exist in the tcp_transmit_skb() path, nothing else in the stack sees them. ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: design for TSO performance fix 2005-01-28 19:28 ` David S. Miller @ 2005-01-29 10:12 ` Herbert Xu 0 siblings, 0 replies; 13+ messages in thread From: Herbert Xu @ 2005-01-29 10:12 UTC (permalink / raw) To: David S. Miller; +Cc: netdev On Fri, Jan 28, 2005 at 11:28:38AM -0800, David S. Miller wrote: > > TSO packets only exist in the tcp_transmit_skb() path, > nothing else in the stack sees them. Cool, that should be really good then. -- Visit Openswan at http://www.openswan.org/ Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au> Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: design for TSO performance fix 2005-01-28 0:31 design for TSO performance fix David S. Miller 2005-01-28 0:51 ` Rick Jones 2005-01-28 1:31 ` Herbert Xu @ 2005-01-28 1:57 ` Thomas Graf 2005-02-01 23:04 ` David S. Miller 2005-01-28 6:25 ` Andi Kleen 3 siblings, 1 reply; 13+ messages in thread From: Thomas Graf @ 2005-01-28 1:57 UTC (permalink / raw) To: David S. Miller; +Cc: netdev * David S. Miller <20050127163146.33b01e95.davem@davemloft.net> 2005-01-27 16:31 > The basic idea is that we stop trying to build TSO frames > in the actual transmit queue. Instead, TSO packets are > built impromptu when we actually output packets on the > transmit queue. Sound great. > static inline int tcp_skb_data_all_paged(struct sk_buff *skb) > { > return (skb->len == skb->data_len); > } You could also define this as (skb_headlen(skb) == 0) > The logic is simple because if TSO is being done we know > that all of the SKB data is paged (since SG+CSUM is a > requirement for TSO). The one case where that > invariant might fail is due to a routing change (previous > device cannot do SG+CSUM, new device has full TSO capability) > and that is handled via the tcp_skb_data_all_paged() checks. I assume the case when reroute changes oif to a device no longer capable of SG+CSUM stays the same and the skb remains paged until dev_queue_xmit? > My thinking is that whatever added expensive this new scheme > has, is offset by the simplifications the rest of the TCP > stack will have since it will no longer need to know anything > about multiple MSS values and packet counts. I think the overhead is really worth the complexity that can be removed with these changes. ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: design for TSO performance fix 2005-01-28 1:57 ` Thomas Graf @ 2005-02-01 23:04 ` David S. Miller 0 siblings, 0 replies; 13+ messages in thread From: David S. Miller @ 2005-02-01 23:04 UTC (permalink / raw) To: Thomas Graf; +Cc: netdev [-- Attachment #1: Type: text/plain, Size: 1631 bytes --] On Fri, 28 Jan 2005 02:57:51 +0100 Thomas Graf <tgraf@suug.ch> wrote: > > static inline int tcp_skb_data_all_paged(struct sk_buff *skb) > > { > > return (skb->len == skb->data_len); > > } > > You could also define this as (skb_headlen(skb) == 0) Good point, I'll do it that way. > I assume the case when reroute changes oif to a device no > longer capable of SG+CSUM stays the same and the skb remains > paged until dev_queue_xmit? That's correct. The only difference is that the TSO building path of send queue transmit will not be executed. I'm slowly piecing together an implementation. The most non- trivial aspect is the frame pushing logic. While building the queue from userspace, we wish to defer until either 1) the user will not supply more data or 2) there is enough in the send queue for an optimally sized TSO frame to be built. For the curious, there is attached my current state of implementation. It's very raw, but it starts to give the basic ideas. The first attachment are the design notes I've been jotting down casually while thinking about this, and the second is the rough beginnings of a patch. The patch implements the tp->tso_goal calculations, and the TSO segmentizer, but nothing else. The missing pieces are: 1) the push-pending-frames logic, it requires the most thought 1.5) the code in tcp_write_xmit() that tries to call the TSO segmenter with groups of SKBs to send 2) killing of tp->mss_cache_std, use tp->mss_cache for everything 3) kill all the code disabling TSO during packet drops 4) kill all the pcount stuff I'll continue trying to make more progress with this thing. 
[-- Attachment #2: tcp_tso.txt --] [-- Type: text/plain, Size: 1432 bytes --] Maintain some "TSO potential" state during segmentation at sendmsg()/sendpage() time. Use this at push-pending-frames time to defer tcp_write_xmit() calls and control it's behavior. Add tcp_flush_queue() which doesn't try to optimize TSO, it is invoked when getting packets out is more important than producing larger TSO chunks. These two cases are: 1) At end of sendmsg()/sendpage() call without MSG_MORE, indicating that we have no way to know for sure if the user will queue up more TCP data to send. 2) When sleeping within sendmsg()/sendpage() waiting for memory. Pushing out packets and receiving the ACKs may very well be the event that will free up send queue space for us. (Must consider interactions with Nagle and Minshall rules) Consider tcp_opt state which keeps a "TSO goal", it must be in sync with tcp_opt MSS state. Initially define "TSO goal" using tcp_tso_win_divisor and the current congestion window. Formally this is: max(1U, CWND / TCP_TSO_WIN_DIVISOR) We could either maintain this lazily, costing us a divide each time it is recalculated. Or, we can update it incrementally each time snd_cwnd is updated. To save some state testing during output decisions, define "TSO goal" as one for non-TSO flows. 
Possible send test logic: if (no new data possibly coming from user) send_now(); if (sending due to ACK queue advancement) send_now(); send_tso_goal_sized_chunks(); [-- Attachment #3: diff --] [-- Type: application/octet-stream, Size: 3165 bytes --] ===== include/linux/tcp.h 1.34 vs edited ===== --- 1.34/include/linux/tcp.h 2005-01-17 14:09:33 -08:00 +++ edited/include/linux/tcp.h 2005-01-31 16:03:32 -08:00 @@ -262,6 +262,7 @@ __u32 pmtu_cookie; /* Last pmtu seen by socket */ __u32 mss_cache; /* Cached effective mss, not including SACKS */ __u16 mss_cache_std; /* Like mss_cache, but without TSO */ + __u16 tso_goal; /* TSO packet count goal, 1 w/non-TSO paths */ __u16 mss_clamp; /* Maximal mss, negotiated at connection setup */ __u16 ext_header_len; /* Network protocol overhead (IP/IPv6 options) */ __u16 ext2_header_len;/* Options depending on route */ ===== net/ipv4/tcp_output.c 1.77 vs edited ===== --- 1.77/net/ipv4/tcp_output.c 2005-01-18 12:23:36 -08:00 +++ edited/net/ipv4/tcp_output.c 2005-02-01 14:32:46 -08:00 @@ -707,15 +707,103 @@ if (factor > limit) factor = limit; - tp->mss_cache = mss_now * factor; + /* If this ever triggers, change tp->tso_goal to + * a larger type and update this bug check. + */ + BUG_ON(factor > 65535); - mss_now = tp->mss_cache; - } + tp->tso_goal = factor; + } else + tp->tso_goal = 1; if (tp->eff_sacks) mss_now -= (TCPOLEN_SACK_BASE_ALIGNED + (tp->eff_sacks * TCPOLEN_SACK_PERBLOCK)); return mss_now; +} + +static inline int tcp_skb_data_all_paged(struct sk_buff *skb) +{ + return skb_headlen(skb) == 0; +} + +/* If possible, append paged data of SRC_SKB onto the + * tail of DST_SKB. 
+ */ +static int skb_append_pages(struct sk_buff *dst_skb, struct sk_buff *src_skb) +{ + int i; + + if (!tcp_skb_data_all_paged(src_skb)) + return -EINVAL; + + for (i = 0; i < skb_shinfo(src_skb)->nr_frags; i++) { + skb_frag_t *src_frag = &skb_shinfo(src_skb)->frags[i]; + skb_frag_t *dst_frag; + int dst_frag_idx; + + dst_frag_idx = skb_shinfo(dst_skb)->nr_frags; + + if (skb_can_coalesce(dst_skb, dst_frag_idx, + src_frag->page, src_frag->page_offset)) { + dst_frag = &skb_shinfo(dst_skb)->frags[dst_frag_idx-1]; + dst_frag->size += src_frag->size; + } else { + if (dst_frag_idx >= MAX_SKB_FRAGS) + return -EMSGSIZE; + + dst_frag = &skb_shinfo(dst_skb)->frags[dst_frag_idx]; + skb_shinfo(dst_skb)->nr_frags = dst_frag_idx + 1; + + dst_frag->page = src_frag->page; + get_page(src_frag->page); + + dst_frag->page_offset = src_frag->page_offset; + dst_frag->size = src_frag->size; + } + dst_skb->data_len += src_frag->size; + } + + return 0; +} + +static struct sk_buff *tcp_tso_build(struct sk_buff *head, int mss, int num) +{ + struct sk_buff *skb; + struct sock *sk; + int err; + + sk = head->sk; + skb = alloc_skb(sk->sk_prot->max_header, GFP_ATOMIC); + err = -ENOMEM; + if (!skb) + goto fail; + + err = 0; + skb_shinfo(skb)->tso_size = mss; + skb_shinfo(skb)->tso_segs = num; + while (num--) { + err = skb_append_pages(skb, head); + if (err) + goto fail; + + head = head->next; + } + return skb; + +fail: + if (skb) { + int i; + + for (i = 0; i < skb_shinfo(skb)->nr_frags; i++) { + skb_frag_t *frag = &skb_shinfo(skb)->frags[i]; + + put_page(frag->page); + } + + kfree_skb(skb); + } + return NULL; } /* This routine writes packets to the network. It advances the ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: design for TSO performance fix 2005-01-28 0:31 design for TSO performance fix David S. Miller ` (2 preceding siblings ...) 2005-01-28 1:57 ` Thomas Graf @ 2005-01-28 6:25 ` Andi Kleen 2005-01-28 6:44 ` Nivedita Singhvi 2005-01-28 19:30 ` David S. Miller 3 siblings, 2 replies; 13+ messages in thread From: Andi Kleen @ 2005-01-28 6:25 UTC (permalink / raw) To: David S. Miller; +Cc: netdev "David S. Miller" <davem@davemloft.net> writes: > Ok, here is the best idea I've been able to come up with > so far. > > The basic idea is that we stop trying to build TSO frames > in the actual transmit queue. Instead, TSO packets are > built impromptu when we actually output packets on the > transmit queue. I don't quite get how it should work. Currently tcp_sendmsg will always push the first packet when the send_head is empty way down to hard_queue_xmit, and then queue up some others and then finally push them out. You would always miss the first one with that right? (assuming MTU sized packets) I looked at this some time ago to pass lists of packets to qdisc and hard_queue_xmit, because that would allow less locking overhead and allow some drivers to send stuff more efficiently to the hardware registers (It was one of the items in my "how to speed up the stack" list ;-) I never ended up implementing it because TSO gave most of the advantages anyways. > Advantages: > > 1) No knowledge of TSO frames need exist anywhere besides > tcp_write_xmit(), tcp_transmit_skb(), and > tcp_xmit_retransmit_queue() > > 2) As a result of #1, all the pcount crap goes away. > The need for two MSS state variables (mss_cache, > and mss_cache_std) and assosciated complexity is > eliminated as well. > > 3) Keeping TSO enabled after packet loss "just works". > > 4) CWND sampled at the correct moment when deciding > the TSO packet arity. > > The one disadvantage is that it might be a tiny bit more > expensive to build TSO frames. But I am sure we can find > ways to optimize that quite well. 
Without lists of packets through qdiscs etc. it will likely need a lot more spin locking than it used to be (and spinlocks tend to be quite expensive). Luckily the high level queuing you need for this could be used to implement the list of packets too (and then finally pass them to hard_queue_xmit to allow drivers more optimizations) -Andi ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: design for TSO performance fix 2005-01-28 6:25 ` Andi Kleen @ 2005-01-28 6:44 ` Nivedita Singhvi 2005-01-28 19:30 ` David S. Miller 1 sibling, 0 replies; 13+ messages in thread From: Nivedita Singhvi @ 2005-01-28 6:44 UTC (permalink / raw) To: Andi Kleen; +Cc: David S. Miller, netdev Andi Kleen wrote: > I looked at this some time ago to pass lists of packets > to qdisc and hard_queue_xmit, because that would allow less locking > overhead and allow some drivers to send stuff more efficiently > to the hardware registers > (It was one of the items in my "how to speed up the stack" list ;-) > > I never ended up implementing it because TSO gave most of the advantages > anyways. I admit that it's been several months since I last looked at this - and was just handwaving, had no code. But I had thought the converse then - that it might be better to abandon TSO and just have the stack pass down the list of skbs in one pass. Had been mentioned by Andi as well as Anton. We'd get much of the gain, avoid a lot of the complexity, and the code would be simpler. And I'm not positive about this but it seemed it would handle memory fragmentation better, too. Bogus? thanks, Nivedita ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: design for TSO performance fix
2005-01-28 6:25 ` Andi Kleen
2005-01-28 6:44 ` Nivedita Singhvi
@ 2005-01-28 19:30 ` David S. Miller
1 sibling, 0 replies; 13+ messages in thread
From: David S. Miller @ 2005-01-28 19:30 UTC (permalink / raw)
To: Andi Kleen; +Cc: netdev
On Fri, 28 Jan 2005 07:25:54 +0100 Andi Kleen <ak@muc.de> wrote:
> Currently tcp_sendmsg will always push the first packet when the send_head
> is empty way down to hard_queue_xmit, and then queue up some others
> and then finally push them out. You would always miss the first
> one with that right? (assuming MTU sized packets)
We could make push_pending_frames defer if we're doing TSO and
might potentially be building such frames.
It's just a detail. The main idea is what counts, which is to keep
all the TSO packets out of the view of most of the stack, which is
where all the complexity came from.
^ permalink raw reply [flat|nested] 13+ messages in thread