From: David Miller <davem@davemloft.net>
To: ilpo.jarvinen@helsinki.fi
Cc: netdev@vger.kernel.org
Subject: Re: [RFC PATCH 09/10] [TCP]: Rewrite SACK block processing & sack_recv_cache use
Date: Thu, 15 Nov 2007 20:00:32 -0800 (PST) [thread overview]
Message-ID: <20071115.200032.115040183.davem@davemloft.net> (raw)
In-Reply-To: <11951338541351-git-send-email-ilpo.jarvinen@helsinki.fi>
From: "Ilpo_Järvinen" <ilpo.jarvinen@helsinki.fi>
Date: Thu, 15 Nov 2007 15:37:33 +0200
> Key points of this patch are:
>
> - In case new SACK information is advance only type, no skb
> processing below previously discovered highest point is done
> - Optimize cases below highest point too since there's no need
> to always go up to highest point (which is very likely still
> present in that SACK), this is not entirely true though
> because I'm dropping the fastpath_skb_hint which could
> previously optimize those cases even better. Whether that's
> significant, I'm not too sure.
>
> Corrently it will provide skipping by walking. Combined with
> RB-tree, all skipping would become fast too regardless of window
> size (can be done incrementally later).
>
> Previously a number of cases in TCP SACK processing fails to
> take advantage of costly stored information in sack_recv_cache,
> most importantly, expected events such as cumulative ACK and new
> hole ACKs. Processing on such ACKs result in rather long walks
> building up latencies (which easily gets nasty when window is
> huge). Those latencies are often completely unnecessary
> compared with the amount of _new_ information received, usually
> for cumulative ACK there's no new information at all, yet TCP
> walks whole queue unnecessary potentially taking a number of
> costly cache misses on the way, etc.!
>
> Since the inclusion of highest_sack, there's a lot information
> that is very likely redundant (SACK fastpath hint stuff,
> fackets_out, highest_sack), though there's no ultimate guarantee
> that they'll remain the same whole the time (in all unearthly
> scenarios). Take advantage of this knowledge here and drop
> fastpath hint and use direct access to highest SACKed skb as
> a replacement.
>
> Effectively "special cased" fastpath is dropped. This change
> adds some complexity to introduce better coveraged "fastpath",
> though the added complexity should make TCP behave more cache
> friendly.
>
> The current ACK's SACK blocks are compared against each cached
> block individially and only ranges that are new are then scanned
> by the high constant walk. For other parts of write queue, even
> when in previously known part of the SACK blocks, a faster skip
> function is used (if necessary at all). In addition, whenever
> possible, TCP fast-forwards to highest_sack skb that was made
> available by an earlier patch. In typical case, no other things
> but this fast-forward and mandatory markings after that occur
> making the access pattern quite similar to the former fastpath
> "special case".
>
> DSACKs are special case that must always be walked.
>
> The local to recv_sack_cache copying could be more intelligent
> w.r.t DSACKs which are likely to be there only once but that
> is left to a separate patch.
>
> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
I like this patch a lot, and if there are any errors in it
they are in the details not the design, and we have lots of
time to sort that out before the 2.6.25 merge window so
I have applied this.
I have been thinking off an on about the core SACK cost problem.
One thing that keeps occuring to me is that to conquer this fully we
have to change the domain of the problem in our favor.
Basically, even if we have RB-trie and all of these nice fastpaths, we
still can walk the complete queue for a receiver which maliciously
advertises lots of SACK information in one ACK (say covering every
retransmit queue SKB except one) and then nearly no SACK information
in the next one. Repeating this cycle we can get into excessive usage
of cpu cycles.
Only in situations of severe memory constraints will a receiver
release out-of-order packets and thus remove sequences from previously
advertised SACK blocks (which by RFCs we must handle). And in such
cases it is OK to resort to timeout based recovery.
This simplifies everything and puts a very strict upper bound of
the amount of processing the most aggressive "bad guy" receiver
can impose upon us. One full SKB queue scan per window/RTT.
With all of those other weird cases out of the way, there is only
one event to process: SACKS cover more data.
Ilpo what do you think about these ideas?
next prev parent reply other threads:[~2007-11-16 4:00 UTC|newest]
Thread overview: 23+ messages / expand[flat|nested] mbox.gz Atom feed top
2007-11-15 13:37 [RFC PATCH net-2.6.25 0/10] [TCP]: Cleanups, tweaks & sacktag recode Ilpo Järvinen
2007-11-15 13:37 ` [PATCH 01/10] [TCP]: Move !in_sack test earlier in sacktag & reorganize if()s Ilpo Järvinen
2007-11-15 13:37 ` [PATCH 02/10] [TCP]: Extend reordering detection to cover CA_Loss partially Ilpo Järvinen
2007-11-15 13:37 ` [PATCH 3/10] [TCP]: non-FACK SACK follows conservative SACK loss recovery Ilpo Järvinen
2007-11-15 13:37 ` [PATCH 04/10] [TCP]: Convert highest_sack to sk_buff to allow direct access Ilpo Järvinen
2007-11-15 13:37 ` [PATCH 05/10] [TCP]: Make lost retrans detection more self-contained Ilpo Järvinen
2007-11-15 13:37 ` [PATCH 06/10] [TCP]: Prior_fackets can be replaced by highest_sack seq Ilpo Järvinen
2007-11-15 13:37 ` [PATCH 07/10] [TCP]: Create tcp_sacktag_one() Ilpo Järvinen
2007-11-15 13:37 ` [PATCH 08/10] [TCP]: Earlier SACK block verification & simplify access to them Ilpo Järvinen
2007-11-15 13:37 ` [RFC PATCH 09/10] [TCP]: Rewrite SACK block processing & sack_recv_cache use Ilpo Järvinen
2007-11-15 13:37 ` [PATCH 10/10] [TCP]: Track sacktag (DEVEL PATCH) Ilpo Järvinen
2007-11-16 4:00 ` David Miller [this message]
2007-11-16 13:44 ` [RFC PATCH 09/10] [TCP]: Rewrite SACK block processing & sack_recv_cache use Ilpo Järvinen
2007-11-20 6:08 ` David Miller
2007-11-20 9:22 ` Ilpo Järvinen
2007-11-16 3:50 ` [PATCH 08/10] [TCP]: Earlier SACK block verification & simplify access to them David Miller
2007-11-16 3:45 ` [PATCH 07/10] [TCP]: Create tcp_sacktag_one() David Miller
2007-11-16 3:44 ` [PATCH 06/10] [TCP]: Prior_fackets can be replaced by highest_sack seq David Miller
2007-11-16 3:43 ` [PATCH 05/10] [TCP]: Make lost retrans detection more self-contained David Miller
2007-11-16 3:42 ` [PATCH 04/10] [TCP]: Convert highest_sack to sk_buff to allow direct access David Miller
2007-11-16 3:41 ` [PATCH 3/10] [TCP]: non-FACK SACK follows conservative SACK loss recovery David Miller
2007-11-16 3:35 ` [PATCH 02/10] [TCP]: Extend reordering detection to cover CA_Loss partially David Miller
2007-11-16 3:33 ` [PATCH 01/10] [TCP]: Move !in_sack test earlier in sacktag & reorganize if()s David Miller
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20071115.200032.115040183.davem@davemloft.net \
--to=davem@davemloft.net \
--cc=ilpo.jarvinen@helsinki.fi \
--cc=netdev@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).