From: David Miller <davem@davemloft.net>
To: ilpo.jarvinen@helsinki.fi
Cc: lachlan.andrew@gmail.com, netdev@vger.kernel.org, quetchen@caltech.edu
Subject: Re: [RFC] TCP illinois max rtt aging
Date: Fri, 07 Dec 2007 04:41:50 -0800 (PST)
Message-ID: <20071207.044150.02925935.davem@davemloft.net>
In-Reply-To: <Pine.LNX.4.64.0712071254070.18529@kivilampi-30.cs.helsinki.fi>

From: "Ilpo_Järvinen" <ilpo.jarvinen@helsinki.fi>
Date: Fri, 7 Dec 2007 13:05:46 +0200 (EET)

> I guess if you get a large cumulative ACK, the amount of processing is 
> still overwhelming (added DaveM if he has some idea how to combat it).
> 
> Even a simple scenario (this isn't anything fancy at all, will occur all 
> the time): Just one loss => rest skbs grow one by one into a single 
> very large SACK block (and we do that efficiently for sure) => then the 
> fast retransmit gets delivered and a cumulative ACK for whole orig_window 
> arrives => clean_rtx_queue has to do a lot of processing. In this case we 
> could optimize RB-tree cleanup away (by just blanking it all) but still 
> getting rid of all those skbs is going to take a larger moment than I'd 
> like to see.
> 
> That tree blanking could be extended to cover anything which ACK more than 
> half of the tree by just replacing the root (and dealing with potential 
> recolorization of the root).

Yes, it's the classic problem.  But it ought to be at least
partially masked when TSO is in use, because we'll only process
a handful of SKBs.  The more effectively TSO batches, the
less work clean_rtx_queue() will do.
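
As a rough illustration of the batching effect, the count of SKBs that a
large cumulative ACK forces the cleanup loop to visit can be estimated
with simple arithmetic.  This is only a sketch: skbs_to_clean() and its
parameters are hypothetical names, not kernel code, and it assumes every
SKB is filled to the same number of MSS-sized segments.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper (not a kernel function): estimate how many SKBs a
 * cumulative ACK covering `acked_bytes` makes the cleanup loop visit,
 * assuming each SKB carries `segs_per_skb` segments of `mss` bytes.
 * segs_per_skb == 1 models the non-TSO case.
 */
static size_t skbs_to_clean(size_t acked_bytes, size_t mss, size_t segs_per_skb)
{
	size_t bytes_per_skb;

	assert(mss > 0 && segs_per_skb > 0);
	bytes_per_skb = mss * segs_per_skb;

	/* Round up: a partially covered SKB still has to be looked at. */
	return (acked_bytes + bytes_per_skb - 1) / bytes_per_skb;
}
```

For a window of 1000 segments of 1448 bytes, this gives 1000 SKBs
without TSO and 23 SKBs if each SKB batches 44 segments (an assumed
figure chosen for the example).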

When not doing TSO the behavior is super-stupid: we bump reference
counts on the same page multiple times while running over the SKBs,
since consecutive SKBs cover data in different spans of the same
page.

The core issue is that we have a poorly behaving data container,
and therefore that's obviously what we need to change.

Conceptually, what we probably need to do is separate the data
maintenance from the SKB objects themselves.  There would be a blob
that maintains the paged data state for everything in the
retransmit queue.  SKBs are built and get the page pointers
but don't actually grab references to the pages; the blob
does that, and it keeps track of how many SKB references to each
page there are, non-atomically.
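
One way to picture such a blob is the user-space sketch below.  It is
purely illustrative and makes several assumptions: the names
rtx_page_blob, blob_attach() and blob_detach() are invented here, pages
are stood in for by opaque pointers, the table has a fixed capacity, and
a single owner touches the counters (which is what lets them be plain
integers).  It does not attempt to handle references still held outside
the queue.

```c
#include <assert.h>
#include <stddef.h>

#define BLOB_MAX_PAGES 64	/* arbitrary capacity for this sketch */

struct page_slot {
	const void *page;	/* stand-in for a struct page pointer */
	unsigned int users;	/* SKBs pointing at this page; plain int */
};

struct rtx_page_blob {
	struct page_slot slots[BLOB_MAX_PAGES];
	size_t nr_slots;
	unsigned int real_refs_taken;	/* how often a real page ref was needed */
};

/* An SKB starts using `page`.  Returns 0 on success, -1 if the table is full. */
static int blob_attach(struct rtx_page_blob *b, const void *page)
{
	size_t i;

	for (i = 0; i < b->nr_slots; i++) {
		if (b->slots[i].page == page) {
			b->slots[i].users++;	/* cheap, non-atomic bump */
			return 0;
		}
	}
	if (b->nr_slots == BLOB_MAX_PAGES)
		return -1;

	b->slots[b->nr_slots].page = page;
	b->slots[b->nr_slots].users = 1;
	b->nr_slots++;
	b->real_refs_taken++;	/* the one place a real page ref would be taken */
	return 0;
}

/*
 * An SKB stops using `page`.  Returns 1 if it was the last user (the real
 * page ref would be dropped here), 0 if other SKBs remain, -1 if untracked.
 */
static int blob_detach(struct rtx_page_blob *b, const void *page)
{
	size_t i;

	for (i = 0; i < b->nr_slots; i++) {
		if (b->slots[i].page != page)
			continue;
		if (--b->slots[i].users > 0)
			return 0;
		b->slots[i] = b->slots[b->nr_slots - 1];	/* swap-remove */
		b->nr_slots--;
		return 1;
	}
	return -1;
}
```

With three SKBs covering different spans of one page, the sketch takes a
single real reference instead of three, and releases it only when the
third SKB detaches.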

The hardest part is dealing with the page lifetime issues.
Unfortunately, when we trim the rtx queue, references to the clones
can still exist in the driver output path.  It's a difficult problem
to overcome in fact, so in the end my suggestion above might not
even be workable.

> No idea about what it could do, haven't yet looked web100, I was planning 
> at some point of time...

Web100 just provides statistics and other kinds of connection data
to userspace; all the actual algorithm modifications and the like have
been merged upstream and yanked out of the web100 patch.  I was looking
at it the other night and it's frankly totally uninteresting these
days :-)

Thread overview: 44+ messages
     [not found] <aa7d2c6d0711261023m3d2dd850o76a8f44aef022f39@mail.gmail.com>
     [not found] ` <001001c83063$9adbc9d0$d5897e82@csp.uiuc.edu>
2007-11-28 23:47   ` [PATCH] tcp-illinois: incorrect beta usage Stephen Hemminger
2007-11-29  0:25     ` Lachlan Andrew
2007-11-29  0:43       ` Stephen Hemminger
2007-11-29  5:26         ` Shao Liu
2007-12-03 22:52           ` [RFC] TCP illinois max rtt aging Stephen Hemminger
2007-12-03 23:06             ` Lachlan Andrew
2007-12-03 23:59               ` Shao Liu
2007-12-04  0:32                 ` Stephen Hemminger
2007-12-04  1:23                   ` Lachlan Andrew
2007-12-04  8:37                     ` Ilpo Järvinen
2007-12-07  3:27                       ` Lachlan Andrew
2007-12-07 11:05                         ` Ilpo Järvinen
2007-12-07 12:41                           ` David Miller [this message]
2007-12-07 13:05                             ` Ilpo Järvinen
2007-12-07 18:27                               ` Ilpo Järvinen
2007-12-08  1:32                               ` David Miller
2007-12-11 11:59                                 ` [RFC PATCH net-2.6.25 uncompilable] [TCP]: Avoid breaking GSOed skbs when SACKed one-by-one (Was: Re: [RFC] TCP illinois max rtt aging) Ilpo Järvinen
2007-12-11 12:32                                   ` [RFC PATCH net-2.6.25 uncompilable] [TCP]: Avoid breaking GSOed skbs when SACKed one-by-one David Miller
2007-12-12  0:14                                     ` Lachlan Andrew
2007-12-12 15:11                                       ` David Miller
2007-12-12 23:35                                         ` Lachlan Andrew
2007-12-12 23:38                                           ` David Miller
2007-12-13  0:00                                           ` Stephen Hemminger
2007-12-15  9:51                                     ` SACK scoreboard (Was: Re: [RFC PATCH net-2.6.25 uncompilable] [TCP]: Avoid breaking GSOed skbs when SACKed one-by-one) Ilpo Järvinen
2008-01-08  7:36                                       ` SACK scoreboard David Miller
2008-01-08 12:12                                         ` Ilpo Järvinen
2008-01-09  7:58                                           ` David Miller
2008-01-08 16:51                                         ` John Heffner
2008-01-08 22:44                                           ` David Miller
2008-01-09  1:34                                             ` Lachlan Andrew
2008-01-09  6:35                                               ` David Miller
2008-01-09  2:25                                             ` Andi Kleen
2008-01-09  4:27                                               ` John Heffner
2008-01-09  6:41                                                 ` David Miller
2008-01-09 14:56                                                   ` John Heffner
2008-01-09 18:14                                                     ` SANGTAE HA
2008-01-09 18:23                                                       ` John Heffner
2008-01-09 12:55                                                 ` Ilpo Järvinen
2008-01-09  6:39                                               ` David Miller
2008-01-09  7:03                                                 ` Andi Kleen
2008-01-09  7:16                                                   ` David Miller
2008-01-09  9:47                                                   ` Evgeniy Polyakov
2008-01-09 14:02                                                     ` Andi Kleen
2007-11-29 14:12     ` [PATCH] tcp-illinois: incorrect beta usage Herbert Xu
