From: David Miller <davem@davemloft.net>
To: baruch@ev-en.org
Cc: netdev@vger.kernel.org
Subject: Re: [PATCH 1/3] Advance fast path pointer for first block only
Date: Wed, 31 Jan 2007 12:48:30 -0800 (PST) [thread overview]
Message-ID: <20070131.124830.93210236.davem@davemloft.net> (raw)
In-Reply-To: <20070129071339.24050.21052.sendpatchset@galon.ev-en.org>
From: Baruch Even <baruch@ev-en.org>
Date: Mon, 29 Jan 2007 09:13:39 +0200
> Only advance the SACK fast-path pointer for the first block, the fast-path
> assumes that only the first block advances next time so we should not move the
> skb for the next sack blocks.
>
> Signed-Off-By: Baruch Even <baruch@ev-en.org>
>
> ---
>
> I'm not sure about the fack_count part, this patch changes the value that is
> maintained in this case.
I'm not sure about this patch :-)
The fastpath is being used for two things here, I believe.
First, it's being used to monitor the expanding of the initial
SACK block, as you note. This is clear by the fact that we
only use the fastpath cache on future calls when this code path
triggers:
if (flag)
num_sacks = 1;
because otherwise we clear the fastpath cache to NULL.
But it's also being used in this loop you are editing to remember
where we stopped in the previous iteration of the loop. With your
change it will always walk the whole retransmit queue from the end of
the first SACK block, whereas before it would iterate starting at the
last SKB visited at the end of the previous sack block processed by
the loop.
This sounds trivial, but on deep pipes it could bring back some of
the performance problems the hinting was meant to cure.
The fack_count issue should be OK, since we're using the value
properly calculated at the end of the first SACK block as we
iterate to find the SKBs within the SACK block. Another way
of saying this is that if the algorithm is correct when the
fastpath cache is missed, it should be correct if we only cache
for the first SACK block.
I'll use this opportunity to say I'm rather unhappy with all of
these fastpath hinting schemes. They chew up a ton of space in
the tcp_sock because we have several pointers for each of the
different queue states we can cache and those eat 8 bytes a
piece on 64-bit.
We can probably fix the bug and preserve the inter-iteration
end-SKB caching by having local variable SKB/fack_count caches
in this function, and only updating the tcp_sock cache values
for the first SACK block.
Maybe something like this?
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index c26076f..b470a85 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -936,12 +936,14 @@ tcp_sacktag_write_queue(struct sock *sk, struct sk_buff *ack_skb, u32 prior_snd_
struct tcp_sock *tp = tcp_sk(sk);
unsigned char *ptr = ack_skb->h.raw + TCP_SKB_CB(ack_skb)->sacked;
struct tcp_sack_block_wire *sp = (struct tcp_sack_block_wire *)(ptr+2);
+ struct sk_buff *cached_skb;
int num_sacks = (ptr[1] - TCPOLEN_SACK_BASE)>>3;
int reord = tp->packets_out;
int prior_fackets;
u32 lost_retrans = 0;
int flag = 0;
int dup_sack = 0;
+ int cached_fack_count;
int i;
if (!tp->sacked_out)
@@ -1025,20 +1027,22 @@ tcp_sacktag_write_queue(struct sock *sk, struct sk_buff *ack_skb, u32 prior_snd_
/* clear flag as used for different purpose in following code */
flag = 0;
+ /* Use SACK fastpath hint if valid */
+ cached_skb = tp->fastpath_skb_hint;
+ cached_fack_count = tp->fastpath_cnt_hint;
+ if (!cached_skb) {
+ cached_skb = sk->sk_write_queue.next;
+ cached_fack_count = 0;
+ }
+
for (i=0; i<num_sacks; i++, sp++) {
struct sk_buff *skb;
__u32 start_seq = ntohl(sp->start_seq);
__u32 end_seq = ntohl(sp->end_seq);
int fack_count;
- /* Use SACK fastpath hint if valid */
- if (tp->fastpath_skb_hint) {
- skb = tp->fastpath_skb_hint;
- fack_count = tp->fastpath_cnt_hint;
- } else {
- skb = sk->sk_write_queue.next;
- fack_count = 0;
- }
+ skb = cached_skb;
+ fack_count = cached_fack_count;
/* Event "B" in the comment above. */
if (after(end_seq, tp->high_seq))
@@ -1048,8 +1052,12 @@ tcp_sacktag_write_queue(struct sock *sk, struct sk_buff *ack_skb, u32 prior_snd_
int in_sack, pcount;
u8 sacked;
- tp->fastpath_skb_hint = skb;
- tp->fastpath_cnt_hint = fack_count;
+ cached_skb = skb;
+ cached_fack_count = fack_count;
+ if (i == 0) {
+ tp->fastpath_skb_hint = skb;
+ tp->fastpath_cnt_hint = fack_count;
+ }
/* The retransmission queue is always in order, so
* we can short-circuit the walk early.
next parent reply other threads:[~2007-01-31 20:48 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <20070129071339.24050.21052.sendpatchset@galon.ev-en.org>
2007-01-31 20:48 ` David Miller [this message]
2007-02-01 7:38 ` [PATCH 1/3] Advance fast path pointer for first block only Baruch Even
[not found] ` <20070129071344.24050.14971.sendpatchset@galon.ev-en.org>
2007-01-31 20:49 ` [PATCH 2/3] Seperate DSACK from SACK fast path David Miller
[not found] ` <20070129071349.24050.56022.sendpatchset@galon.ev-en.org>
2007-01-31 20:52 ` [PATCH 3/3] Check num sacks in " David Miller
2007-02-01 7:22 ` Baruch Even
2007-02-01 22:38 ` David Miller
[not found] <20070202144116.26863.3722.sendpatchset@galon.ev-en.org>
2007-02-05 7:36 ` [PATCH 1/3] Advance fast path pointer for first block only David Miller
2007-01-27 16:47 [PATCH 0/3] Fix issues with SACK processing Baruch Even
2007-01-27 16:48 ` [PATCH 1/3] Advance fast path pointer for first block only Baruch Even
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20070131.124830.93210236.davem@davemloft.net \
--to=davem@davemloft.net \
--cc=baruch@ev-en.org \
--cc=netdev@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).