* TCP acking too fast
  From: Mika Liljeberg @ 2001-10-14  0:23 UTC
  To: linux-kernel

Hi all,

It seems that recent (and maybe not so recent) linux kernels have a TCP
problem that causes them to acknowledge almost every segment. While,
strictly speaking, this is not against the spec, any sane TCP only acks
every second segment in steady state. The statistics appended below
illustrate the problem.

Why do I care? Because I'm connected through a cable network with a
severe bandwidth asymmetry: the upstream is rate limited to 256 kbps,
while the downstream can theoretically yield 10 Mbps (assuming a quiet
period). Right now, the excessive ack rate seems to be a limiting
factor on peak performance.

I've already disabled quickacks, replaced the receive MSS estimate with
the advertised MSS in the ack sending policy (two places), and removed
one dubious "immediate ack" condition from send_delay_ack(). The
annoying thing is that none of this seems to make any real difference.
I must be missing something huge that's right in front of my nose, but
I'm starting to run out of steam.

Any thoughts on this?

Regards,

	MikaL

Some stats from a unidirectional 6MB transfer:

   c->d:                                d->c:
     total packets:          3643         total packets:          4498
     ack pkts sent:          3642         ack pkts sent:          4498
     pure acks sent:         3640         pure acks sent:            2
     unique bytes sent:       108         unique bytes sent:   6161570
     actual data pkts:          1         actual data pkts:       4494
     actual data bytes:       108         actual data bytes:   6161570
     rexmt data pkts:           0         rexmt data pkts:           0
     rexmt data bytes:          0         rexmt data bytes:          0
     outoforder pkts:           0         outoforder pkts:          10
     pushed data pkts:          1         pushed data pkts:       3043
     SYN/FIN pkts sent:       1/1         SYN/FIN pkts sent:       1/1
     req 1323 ws/ts:          Y/Y         req 1323 ws/ts:          Y/Y
     adv wind scale:            0         adv wind scale:            0
     req sack:                  Y         req sack:                  Y
     sacks sent:               28         sacks sent:                0
     mss requested:    1460 bytes         mss requested:    1460 bytes
     max segm size:     108 bytes         max segm size:    1448 bytes
     min segm size:     108 bytes         min segm size:      92 bytes
     avg segm size:     107 bytes         avg segm size:    1371 bytes
     max win adv:     63712 bytes         max win adv:     32120 bytes
     min win adv:      5840 bytes         min win adv:     32120 bytes
     zero win adv:        0 times         zero win adv:        0 times
     avg win adv:     63515 bytes         avg win adv:     32120 bytes
     initial window:    108 bytes         initial window:    489 bytes
     initial window:       1 pkts         initial window:       1 pkts
     ttl stream length: 108 bytes         ttl stream length: 6161570 bytes
     missed data:         0 bytes         missed data:         0 bytes
     truncated data:     46 bytes         truncated data: 5882942 bytes
     truncated packets:    1 pkts         truncated packets: 4494 pkts
     data xmit time:   0.000 secs         data xmit time:  10.663 secs
     idletime max:        98.1 ms         idletime max:        98.0 ms
     throughput:           10 Bps         throughput:       572079 Bps
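For reference, the steady-state policy the poster expects (an ACK at
least every second full-sized segment, following the RFC 1122
delayed-ACK rule) can be sketched roughly as follows. This is a
simplified, hypothetical illustration; the function and parameter
names are invented for the example and are not the kernel's actual
code:

	#include <stdbool.h>
	#include <stdint.h>

	/* Decide whether an arriving data segment should be acked
	 * immediately, or whether the ACK may be held for the
	 * delayed-ack timer: ack at least every second full-sized
	 * segment, and ack out-of-order data at once so the sender's
	 * fast retransmit can trigger.
	 */
	static bool ack_now(uint32_t unacked_bytes, uint32_t seg_len,
			    uint32_t rcv_mss, bool out_of_order)
	{
		if (out_of_order)
			return true;	/* duplicate ACK wanted immediately */

		if (unacked_bytes + seg_len >= 2 * rcv_mss)
			return true;	/* second full-sized segment: ack now */

		return false;		/* wait for the delayed-ack timer */
	}

Note that a rcv_mss estimate stuck too low (e.g. an initial 536-byte
guess that is never raised) makes the 2*rcv_mss test fire on every
arriving 1448-byte segment, which matches the behaviour reported
above.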
* Re: TCP acking too fast
  From: David S. Miller @ 2001-10-14  6:40 UTC
  To: Mika.Liljeberg; +Cc: linux-kernel

You need to post for us a tcpdump trace of a connection you feel
exhibits bad behavior.

Otherwise we can do nothing but guess; effectively, your statistics
aren't helpful at all if we have no idea what is happening on the wire.

Franks a lot,
David S. Miller
davem@redhat.com
* Re: TCP acking too fast
  From: Mika Liljeberg @ 2001-10-14  7:05 UTC
  To: David S. Miller; +Cc: linux-kernel

[-- Attachment #1: Type: text/plain, Size: 1178 bytes --]

"David S. Miller" wrote:
>
> You need to post for us a tcpdump trace of a connection you feel
> exhibits bad behavior.
>
> Otherwise we can do nothing but guess; effectively, your statistics
> aren't helpful at all if we have no idea what is happening on the
> wire.

Fair enough; chalk it down to lack of sleep addling my brain. I also
forgot to mention my kernel version, which is 2.4.10-ac10.

I've attached a fragment of tcpdump output from the middle of a
steady-state transfer. Looking at the dump, it seems that most arriving
segments have the PSH bit set. This leads me to believe that the
transfer is mostly application limited at the sender side. For some
reason, this causes the receiver to ack every segment immediately
(which is not suggested by the spec, as far as I know).

I'm guessing that this is some kind of optimization for HTTP (i.e.,
avoid Nagle on the last runt segment by acking pushed segments
immediately). However, this seems to produce less than desirable
behaviour on sender-limited bulk transfers. Yet, despite appearances, I
can't seem to find the bit of code that tests the PSH flag for an
immediate ack. Still sleepy, I guess.

Regards,

	MikaL

[-- Attachment #2: tcpdump.txt.gz --]
[-- Type: application/x-gzip, Size: 7912 bytes --]
* Re: TCP acking too fast
  From: David S. Miller @ 2001-10-14  7:47 UTC
  To: Mika.Liljeberg; +Cc: linux-kernel

   From: Mika Liljeberg <Mika.Liljeberg@welho.com>
   Date: Sun, 14 Oct 2001 10:05:33 +0300

   I've attached a fragment of tcpdump output from the middle of a
   steady-state transfer. Looking at the dump, it seems that most
   arriving segments have the PSH bit set. This leads me to believe
   that the transfer is mostly application limited at the sender side.

This means the application is doing many small writes. To be honest,
the only sure way to cure any performance problems from that is to fix
the application in question. What is this application?

Franks a lot,
David S. Miller
davem@redhat.com
* Re: TCP acking too fast
  From: Mika Liljeberg @ 2001-10-14  7:51 UTC
  To: David S. Miller, linux-kernel

"David S. Miller" wrote:
>
>    From: Mika Liljeberg <Mika.Liljeberg@welho.com>
>    Date: Sun, 14 Oct 2001 10:05:33 +0300
>
>    I've attached a fragment of tcpdump output from the middle of a
>    steady-state transfer. Looking at the dump, it seems that most
>    arriving segments have the PSH bit set. This leads me to believe
>    that the transfer is mostly application limited at the sender side.
>
> This means the application is doing many small writes.

Nope, it simply means that the remote machine has a 100 Mbit Ethernet
card that keeps emptying the transmit queue faster than it can be
filled.

> To be honest, the only sure way to cure any performance problems from
> that is to fix the application in question. What is this application?

I don't control the remote machine, but it's linux (don't know which
version). I tried with both HTTP (Apache 1.3.9) and FTP. I doubt it's
the application. :-)

Regards,

	MikaL
* Re: TCP acking too fast
  From: David S. Miller @ 2001-10-14  8:12 UTC
  To: Mika.Liljeberg; +Cc: linux-kernel

   From: Mika Liljeberg <Mika.Liljeberg@welho.com>
   Date: Sun, 14 Oct 2001 10:51:56 +0300

   I don't control the remote machine, but it's linux (don't know which
   version). I tried with both HTTP (Apache 1.3.9) and FTP. I doubt
   it's the application. :-)

Well, the version of the kernel is pretty important. Setting PSH all
the time does sound like a possibly familiar bug.

Franks a lot,
David S. Miller
davem@redhat.com
* Re: TCP acking too fast
  From: Mika Liljeberg @ 2001-10-14  8:39 UTC
  To: David S. Miller; +Cc: linux-kernel

"David S. Miller" wrote:
>    I don't control the remote machine, but it's linux (don't know
>    which version). I tried with both HTTP (Apache 1.3.9) and FTP. I
>    doubt it's the application. :-)
>
> Well, the version of the kernel is pretty important.

Unfortunately I have no way to ascertain that, but I do know it's
running Debian. I would venture a guess that it's a series 2.2 kernel.
I tried an nmap fingerprint, but it couldn't identify the kernel.

> Setting PSH all the time does sound like a possibly familiar bug.

You have no problem with the receiver immediately acking PSH segments?
Shouldn't we be robust against this kind of behaviour? [Otherwise a
sender can force us into a permanent quickack mode simply by setting
PSH on every segment.]

Regards,

	MikaL
* Re: TCP acking too fast
  From: David S. Miller @ 2001-10-14  9:03 UTC
  To: Mika.Liljeberg; +Cc: linux-kernel

   From: Mika Liljeberg <Mika.Liljeberg@welho.com>
   Date: Sun, 14 Oct 2001 11:39:22 +0300

   [Otherwise a sender can force us into a permanent quickack mode
   simply by setting PSH on every segment.]

"A sending TCP can send us garbage so bad that it hinders performance."

So, your point is? :-) A sensible sending application, and a sensible
TCP, should not be setting PSH on every single segment. And we're not
coding up hacks to make the Linux receiver handle this case better.

You'll have much better luck convincing us to implement ECN black hole
workarounds :-)

Franks a lot,
David S. Miller
davem@redhat.com
* Re: TCP acking too fast
  From: Mika Liljeberg @ 2001-10-14  9:15 UTC
  To: David S. Miller; +Cc: linux-kernel

"David S. Miller" wrote:
>
>    From: Mika Liljeberg <Mika.Liljeberg@welho.com>
>    Date: Sun, 14 Oct 2001 11:39:22 +0300
>
>    [Otherwise a sender can force us into a permanent quickack mode
>    simply by setting PSH on every segment.]
>
> "A sending TCP can send us garbage so bad that it hinders
> performance."
>
> So, your point is? :-) A sensible sending application, and a sensible
> TCP, should not be setting PSH on every single segment.

Like apache and linux? :-)

> And we're not coding up hacks to make the Linux receiver handle this
> case better.

By the same logic we could throw away Nagle and SWS avoidance! Whatever
happened to "be conservative in what you send" (i.e., acks, in this
case)?

Frankly, I see no reason for acking PSH segments immediately. What's
the rationale for doing so? Looks like a hack to me... I don't mean to
be a pest, but it would be nice to get some technical grounds for this
behaviour, since you're obviously convinced that there are some.
Please?

> You'll have much better luck convincing us to implement ECN black
> hole workarounds :-)

Oh, no. I'm not going to be dragged into that discussion! :)
[Do we have such workarounds for PMTUD, I wonder...]

Cheers,

	MikaL
* Re: TCP acking too fast
  From: David S. Miller @ 2001-10-14  9:16 UTC
  To: Mika.Liljeberg; +Cc: linux-kernel

   From: Mika Liljeberg <Mika.Liljeberg@welho.com>
   Date: Sun, 14 Oct 2001 12:15:24 +0300

   Like apache and linux? :-)

"BROKEN LINUX"

I suspect it's just a buggy 2.2.x that machine has.

Franks a lot,
David S. Miller
davem@redhat.com
* Re: TCP acking too fast
  From: Andi Kleen @ 2001-10-14  9:25 UTC
  To: David S. Miller; +Cc: linux-kernel

In article <20011014.020326.18308527.davem@redhat.com>,
"David S. Miller" <davem@redhat.com> writes:

> So, your point is? :-) A sensible sending application, and a sensible
> TCP, should not be setting PSH on every single segment. And we're not
> coding up hacks to make the Linux receiver handle this case better.
> You'll have much better luck convincing us to implement ECN black
> hole workarounds :-)

Ignoring PSH completely on RX would probably not be a worse heuristic
than forcing an ACK on it. At least other stacks seem to do fine
without the force-ack-on-psh. I think you added it a long time ago, but
I do not remember why you did it; at least here is a counterexample now
that may be a good case for reconsidering it.

-Andi
* Re: TCP acking too fast
  From: David S. Miller @ 2001-10-14  9:39 UTC
  To: ak; +Cc: linux-kernel

   From: Andi Kleen <ak@muc.de>
   Date: 14 Oct 2001 11:25:09 +0200

   at least here is a counterexample now that may be a good case for
   reconsidering it.

A buggy 2.2.x kernel is not a good counterexample.

Franks a lot,
David S. Miller
davem@redhat.com
* Re: TCP acking too fast
  From: Andi Kleen @ 2001-10-14 11:30 UTC
  To: David S. Miller; +Cc: ak, linux-kernel, kuznet

On Sun, Oct 14, 2001 at 11:39:48AM +0200, David S. Miller wrote:
>    From: Andi Kleen <ak@muc.de>
>    Date: 14 Oct 2001 11:25:09 +0200
>
>    at least here is a counterexample now that may be a good case for
>    reconsidering it.
>
> A buggy 2.2.x kernel is not a good counterexample.

I just checked, and the 2.4 kernel doesn't have the PSH quickack check
anymore, so it cannot be the cause. The original poster didn't say
which kernel version he used, but he said "recent", so I'll assume 2.4.

The only special case for PSH in RX left that I can see is in rcv_mss
estimation, where it assumes that a packet with PSH set is not full
sized. On further look, the 2.4 tcp_measure_rcv_mss will never update
rcv_mss for packets which have PSH set, and in this case causes random
ack behaviour depending on the initial rcv_mss guess. Not very nice;
it definitely violates the "be liberal in what you accept" rule. I'm
not sure how to fix it; adding a fallback to an every-two-packets ack
would pollute the fast path a bit.

-Andi
* Re: TCP acking too fast
  From: Mika Liljeberg @ 2001-10-14 11:49 UTC
  To: Andi Kleen; +Cc: David S. Miller, linux-kernel, kuznet

Andi Kleen wrote:
> The only special case for PSH in RX left that I can see is in rcv_mss
> estimation, where it assumes that a packet with PSH set is not full
> sized.

A packet without PSH should be full size. Assuming the sender
implements SWS avoidance correctly, this should be a safe enough
assumption.

> On further look, the 2.4 tcp_measure_rcv_mss will never update
> rcv_mss for packets which have PSH set, and in this case causes
> random ack behaviour depending on the initial rcv_mss guess. Not very
> nice; it definitely violates the "be liberal in what you accept"
> rule. I'm not sure how to fix it; adding a fallback to an
> every-two-packets ack would pollute the fast path a bit.

You're right. As far as I can see, it's not necessary to set the
TCP_ACK_PUSHED flag at all (except maybe for SYN-ACK). I'm just writing
a patch to clean this up.

Regards,

	MikaL
* Re: TCP acking too fast
  From: Andi Kleen @ 2001-10-14 14:05 UTC
  To: Mika Liljeberg; +Cc: Andi Kleen, David S. Miller, linux-kernel, kuznet

On Sun, Oct 14, 2001 at 01:49:25PM +0200, Mika Liljeberg wrote:
> Andi Kleen wrote:
> > The only special case for PSH in RX left that I can see is in
> > rcv_mss estimation, where it assumes that a packet with PSH set is
> > not full sized.
>
> A packet without PSH should be full size. Assuming the sender
> implements SWS avoidance correctly, this should be a safe enough
> assumption.

It's not guaranteed by any spec; it's just common behaviour from BSD
derived stacks. SWS avoidance does not say anything about PSH flags.

> You're right. As far as I can see, it's not necessary to set the
> TCP_ACK_PUSHED flag at all (except maybe for SYN-ACK). I'm just
> writing a patch to clean this up.

Setting it for packets >= rcv_mss looks useful to me to catch mistakes.
Better too many acks than too few.

-Andi
* Re: TCP acking too fast
  From: Mika Liljeberg @ 2001-10-14 14:26 UTC
  To: Andi Kleen; +Cc: David S. Miller, linux-kernel, kuznet

Andi Kleen wrote:
> It's not guaranteed by any spec; it's just common behaviour from BSD
> derived stacks. SWS avoidance does not say anything about PSH flags.

True enough. This is a slightly dubious heuristic at best. Besides, if
the sender sets TCP_NODELAY and sends packets that are between
TCP_MIN_MSS and the true receive MSS, the estimate is probably totally
hosed.

My solution to this would be to recalculate rcv_mss once per window.
I.e., start new_rcv_mss from 0, keep increasing it for one window
width, and then copy it to rcv_mss. No funny heuristics, and it would
adjust to a shrunken MSS within one transmission window.

> Setting it for packets >= rcv_mss looks useful to me to catch
> mistakes. Better too many acks than too few.

Maybe so, but in that case I would only set it for packets > rcv_mss.
Otherwise, my ack-every-segment-with-PSH problem would come back.

Actually, I think it would be better to simply always ack every other
segment (except in quickack and fast recovery modes) and only use the
receive MSS estimate for window updates. This would guarantee
self-clocking in all cases.

Regards,

	MikaL
* Re: TCP acking too fast
  From: Andi Kleen @ 2001-10-14 16:12 UTC
  To: Mika Liljeberg; +Cc: Andi Kleen, David S. Miller, linux-kernel, kuznet

On Sun, Oct 14, 2001 at 04:26:53PM +0200, Mika Liljeberg wrote:
> My solution to this would be to recalculate rcv_mss once per window.
> I.e., start new_rcv_mss from 0, keep increasing it for one window
> width, and then copy it to rcv_mss. No funny heuristics, and it would
> adjust to a shrunken MSS within one transmission window.

Sounds complicated. How would you implement it?

> Maybe so, but in that case I would only set it for packets > rcv_mss.
> Otherwise, my ack-every-segment-with-PSH problem would come back.

Yes, > rcv_mss. Sorry for the typo.

> Actually, I think it would be better to simply always ack every other
> segment (except in quickack and fast recovery modes) and only use the
> receive MSS estimate for window updates. This would guarantee
> self-clocking in all cases.

The original "ack after 2*mss" had been carefully tuned to work well
with slow PPP links in all cases, after some bad experiences. It came
together with the variable length delayed ack.

The rcv_mss stuff was added later to fix some performance problems on
very big MTU links like HIPPI (where you have an MSS of 64k, but
stacks often send smaller packets like 48k; the ack-after-2*mss check
only triggered every third packet, causing bad performance).

Now if nobody used slow PPP links anymore, it would probably be ok to
go back to the simpler "ack every other packet" rule; but I'm afraid
that's not the case yet.

-Andi
* Re: TCP acking too fast
  From: Mika Liljeberg @ 2001-10-14 16:55 UTC
  To: Andi Kleen; +Cc: David S. Miller, linux-kernel, kuznet

Andi Kleen wrote:
>
> On Sun, Oct 14, 2001 at 04:26:53PM +0200, Mika Liljeberg wrote:
> > My solution to this would be to recalculate rcv_mss once per
> > window. I.e., start new_rcv_mss from 0, keep increasing it for one
> > window width, and then copy it to rcv_mss. No funny heuristics, and
> > it would adjust to a shrunken MSS within one transmission window.
>
> Sounds complicated. How would you implement it?

Not very hard at all. It could be done easily with a couple of extra
state variables. The following is rough pseudo code (it ignores
initialization of the state variables):

	if (seg.len > rcv.new_mss)
		rcv.new_mss = seg.len;

	if (rcv.nxt >= rcv.mss_seq || rcv.new_mss > rcv.mss) {
		rcv.mss = max(rcv.new_mss, TCP_MIN_MSS);
		rcv.new_mss = 0;
		rcv.mss_seq = rcv.nxt + measurement_window;
	}

The basic property is that you can balance the time required to detect
a decreased receive MSS against the reliability of the estimate by
tuning the measurement window. An increased receive MSS would be
detected immediately. Of course, I'm not claiming that there might not
be a better algorithm somewhere that doesn't require the two state
variables.

> The original "ack after 2*mss" had been carefully tuned to work well
> with slow PPP links in all cases, after some bad experiences. It came
> together with the variable length delayed ack.
>
> The rcv_mss stuff was added later to fix some performance problems on
> very big MTU links like HIPPI (where you have an MSS of 64k, but
> stacks often send smaller packets like 48k; the ack-after-2*mss check
> only triggered every third packet, causing bad performance).
>
> Now if nobody used slow PPP links anymore, it would probably be ok to
> go back to the simpler "ack every other packet" rule; but I'm afraid
> that's not the case yet.

Why would PPP links perform badly with ack-every-other? That isn't the
case in my experience, at least.

Regards,

	MikaL
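A self-contained rendering of the estimator sketched above might look
like this (a hypothetical illustration; the struct and names are
invented for the example and are not taken from any kernel source):

	#include <stdint.h>

	#define TCP_MIN_MSS 88	/* assumed floor for the estimate */

	struct rcv_mss_est {
		uint32_t mss;		/* current receive MSS estimate */
		uint32_t new_mss;	/* largest segment seen this window */
		uint32_t mss_seq;	/* sequence number ending the window */
	};

	/* Feed one arriving segment into the estimator. rcv_nxt is the
	 * next expected sequence number; win is the measurement window
	 * in bytes (e.g. one advertised window).
	 */
	static void rcv_mss_update(struct rcv_mss_est *e, uint32_t seg_len,
				   uint32_t rcv_nxt, uint32_t win)
	{
		if (seg_len > e->new_mss)
			e->new_mss = seg_len;

		/* Commit once per window, or at once if the MSS grew. */
		if ((int32_t)(rcv_nxt - e->mss_seq) >= 0 ||
		    e->new_mss > e->mss) {
			e->mss = e->new_mss > TCP_MIN_MSS ? e->new_mss
							  : TCP_MIN_MSS;
			e->new_mss = 0;
			e->mss_seq = rcv_nxt + win;
		}
	}

The signed-difference comparison keeps the once-per-window check
correct across 32-bit sequence number wraparound, where a plain
rcv.nxt >= rcv.mss_seq test would misfire.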
* Re: TCP acking too fast
  From: kuznet @ 2001-10-14 17:07 UTC
  To: Mika Liljeberg; +Cc: ak, davem, linux-kernel

Hello!

> Not very hard at all. It could be done easily with a couple of extra
> state variables.

Does the current heuristic not work? :-)

> state variables. The following is rough pseudo code (it ignores
> initialization of the state variables):

You missed one crucial moment: a stream may consist of remnants for a
long time, or even forever. It is a normal case. And rcv_mss is used
not only, and mostly not, for ACKing; it is used in really important
places (SWS avoidance et al.), where the specs propose to use your
advertised MSS, which does not work at all when you talk over high MTU
interfaces.

The approach (invented by Andi?) provided the necessary robustness,
checking for two segments in a row and suppressing MSS drops below
536. The check for PSH-less segments allows detecting a really low MTU
reliably.

Alexey
* Re: TCP acking too fast
  From: Mika Liljeberg @ 2001-10-14 17:26 UTC
  To: kuznet; +Cc: ak, davem, linux-kernel

kuznet@ms2.inr.ac.ru wrote:
>
> Does the current heuristic not work? :-)

Well, you should read the preceding messages to understand how we got
here. Andi had some reservations, and I tend to agree. The current
heuristic assumes specific TCP behaviour, which is left as an
implementation issue in the specifications. Conclusion: it works if
you're lucky. But it's true I can't show you any data to the contrary,
either. This is not the issue that started this thread.

> You missed one crucial moment: a stream may consist of remnants for a
> long time, or even forever. It is a normal case. And rcv_mss is used
> not only, and mostly not, for ACKing; it is used in really important
> places (SWS avoidance et al.), where the specs propose to use your
> advertised MSS, which does not work at all when you talk over high
> MTU interfaces.

I don't think I missed that point.

> The approach (invented by Andi?) provided the necessary robustness,
> checking for two segments in a row and suppressing MSS drops below
> 536. The check for PSH-less segments allows detecting a really low
> MTU reliably.

When you say "reliably", you should recognize the underlying
assumptions as well.

Regards,

	MikaL
* Re: TCP acking too fast
  From: kuznet @ 2001-10-14 17:35 UTC
  To: Mika Liljeberg; +Cc: ak, davem, linux-kernel

Hello!

> Well, you should read the preceding messages to understand how we got
> here.

I am reading them now, and so far I have not found why the problem of
calculating rcv_mss was raised at all. :-) You nicely understood the
reason for the problem, and it is surely not related to rcv_mss in any
way. :-)

> When you say "reliably", you should recognize the underlying
> assumptions as well.

The assumptions are so conservative that it is not worth telling about
them. The heuristic does not predict a fall of rcv_mss below 536 when
the sender sets PSH on each frame. And it is pretty evident that such
a prediction is theoretically impossible in this sad case. All that we
can do is to cry, to hold rcv_mss at 536, and to ack every 4th segment
with an MTU of 256.

Alexey
* Re: TCP acking too fast
  From: Mika Liljeberg @ 2001-10-14 17:56 UTC
  To: kuznet; +Cc: ak, davem, linux-kernel

kuznet@ms2.inr.ac.ru wrote:
>
> I am reading them now, and so far I have not found why the problem of
> calculating rcv_mss was raised at all. :-)

I think Andi brought it up. I was actually saying that it probably
works most of the time.

> You nicely understood the reason for the problem, and it is surely
> not related to rcv_mss in any way. :-)
>
> The assumptions are so conservative that it is not worth telling
> about them.

The assumption is that the peer is implemented the way you expect and
that the application doesn't toy with TCP_NODELAY.

> The heuristic does not predict a fall of rcv_mss below 536 when the
> sender sets PSH on each frame. And it is pretty evident that such a
> prediction is theoretically impossible in this sad case. All that we
> can do is to cry, to hold rcv_mss at 536, and to ack every 4th
> segment with an MTU of 256.

Not really. You could do one of two things: either ack every second
segment and leave the rcv_mss estimate only for window calculations, or
use an algorithm like the one I outlined. Either approach would work, I
think, and not produce stretch acks.

Regards,

	MikaL
* Re: TCP acking too fast
  From: kuznet @ 2001-10-14 18:20 UTC
  To: Mika Liljeberg; +Cc: ak, davem, linux-kernel

Hello!

> The assumption is that the peer is implemented the way you expect and
> that the application doesn't toy with TCP_NODELAY.

Sorry?? It is the most important _exactly_ for TCP_NODELAY, which
generates lots of remnants.

> Not really. You could do one of two things: either ack every second
> segment

I do not worry about this _at_ _all_. See? "Each other", "each two
mss" --- all this is a red herring.

I do understand your problem, which is not related to rcv_mss. When
the bandwidths in different directions differ by more than 20 times,
stretch ACKs are even preferred. Look into the tcplw work; using
stretch ACKs is even considered something normal.

I really commiserate, and think that removing the "final cut" clause
will help you. But sending an ACK on buffer drain, at least for short
packets, is a real demand which cannot be relaxed. The "final cut" is
actually also better not to remove, but the case where it is required
is probabilistically marginal.

Alexey
* Re: TCP acking too fast
  From: Mika Liljeberg @ 2001-10-14 18:48 UTC
  To: kuznet; +Cc: ak, davem, linux-kernel

kuznet@ms2.inr.ac.ru wrote:
>
> It is the most important _exactly_ for TCP_NODELAY, which generates
> lots of remnants.

I simply meant that with the application in control of the packet
size, you can't make a reliable estimate of the maximum receive MSS
unless our assumption holds that only maximum-sized segments lack PSH.

> I do not worry about this _at_ _all_. See? "Each other", "each two
> mss" --- all this is a red herring.

Whatever.

> I do understand your problem, which is not related to rcv_mss.

I know.

> When the bandwidths in different directions differ by more than 20
> times, stretch ACKs are even preferred. Look into the tcplw work;
> using stretch ACKs is even considered something normal.

I know. It's a difficult tradeoff between saving bandwidth on the
return path, trying to maintain self-clocking, and avoiding bursts
caused by ack compression.

> I really commiserate, and think that removing the "final cut" clause
> will help you.

Yes.

> But sending an ACK on buffer drain, at least for short packets, is a
> real demand which cannot be relaxed.

Why? This one has me stumped.

> The "final cut" is actually also better not to remove, but the case
> where it is required is probabilistically marginal.

Regards,

	MikaL
* Re: TCP acking too fast
  From: kuznet @ 2001-10-14 19:12 UTC
  To: Mika Liljeberg; +Cc: ak, davem, linux-kernel

Hello!

> > But sending an ACK on buffer drain, at least for short packets, is
> > a real demand which cannot be relaxed.
>
> Why? This one has me stumped.

To remove sick delays with nagling transfers (1), and to remove
deadlocks due to starvation on rcvbuf at the receiver (2) and on
sndbuf at the sender (3).

Actually, (2) is solved nowadays with a compressing queue. (3) can be
solved by acking every other segment. But (1) remains.

Actually, any alternative idea on how to solve this could be very
useful.

Alexey
* Re: TCP acking too fast
  From: Mika Liljeberg @ 2001-10-14 19:32 UTC
  To: kuznet; +Cc: ak, davem, linux-kernel

kuznet@ms2.inr.ac.ru wrote:
>
> To remove sick delays with nagling transfers (1), and to remove
> deadlocks due to starvation on rcvbuf at the receiver (2) and on
> sndbuf at the sender (3).
>
> Actually, (2) is solved nowadays with a compressing queue. (3) can be
> solved by acking every other segment. But (1) remains.
>
> Actually, any alternative idea on how to solve this could be very
> useful.

And why (1) is a problem is precisely what I don't understand. Nagle
is *supposed* to prevent you from sending multiple remnants. If you
don't like it, you disable it in the sender!

However: the only awkward Nagle-related delay I know of appears with
e.g. HTTP, when the last undersized segment cannot be sent before
everything else is acked. This can be solved using an idea from Greg
Minshall, which I thought was quite cool.

The normal Nagle rule goes:

 - You cannot send a remnant if there are any unacknowledged segments
   outstanding.

Minshall's version goes:

 - You cannot send a remnant if there is already one unacknowledged
   remnant outstanding.

This fixes the trailing remnant problem with HTTP and similar
request-reply protocols, while adhering to the spirit of Nagle. There
was even an I-D at some point, but for some reason it has not been
updated.

Regards,

	MikaL
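As a rough illustration of the difference between the two rules, here
is each expressed as the sender-side "may this remnant go out now?"
test (a simplified sketch with invented names; not the actual
implementation in any kernel):

	#include <stdbool.h>
	#include <stdint.h>

	/* Classic Nagle: a remnant (sub-MSS segment) may only be sent
	 * when nothing at all is unacknowledged.
	 */
	static bool nagle_ok_classic(uint32_t bytes_unacked,
				     uint32_t seg_len, uint32_t mss)
	{
		return seg_len >= mss || bytes_unacked == 0;
	}

	/* Minshall's variant: a remnant may be sent as long as no
	 * *remnant* is unacknowledged. snd_sml records the end of the
	 * last small segment sent, snd_una the first unacked byte.
	 */
	static bool nagle_ok_minshall(uint32_t snd_una, uint32_t snd_sml,
				      uint32_t seg_len, uint32_t mss)
	{
		/* true once the last small segment has been acked */
		bool small_acked = (int32_t)(snd_sml - snd_una) <= 0;

		return seg_len >= mss || small_acked;
	}

Tracking just the sequence number of the end of the last small segment
sent is enough to implement Minshall's rule: once that byte has been
acked, another remnant may be emitted, so a lone trailing remnant (the
HTTP case above) is never delayed.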
* Re: TCP acking too fast
  From: kuznet @ 2001-10-14 19:40 UTC
  To: Mika Liljeberg; +Cc: ak, davem, linux-kernel

Hello!

> And why (1) is a problem is precisely what I don't understand. Nagle
> is *supposed* to prevent you from sending multiple remnants.

It is not supposed to delay between sends for the delack timeout.
Nagle did not know about the brain damage his great idea would cause
when used together with delayed acks. :-)

> This can be solved using an idea from Greg Minshall, which I thought
> was quite cool.

It is the approach used in 2.4. :-) It does help when the sender is
also linux-2.4. :-)

Alexey
* Re: TCP acking too fast
  From: Mika Liljeberg @ 2001-10-14 20:06 UTC
  To: kuznet; +Cc: ak, davem, linux-kernel

kuznet@ms2.inr.ac.ru wrote:
> > And why (1) is a problem is precisely what I don't understand.
> > Nagle is *supposed* to prevent you from sending multiple remnants.
>
> It is not supposed to delay between sends for the delack timeout.
> Nagle did not know about the brain damage his great idea would cause
> when used together with delayed acks. :-)

Well, I think this "problem" is way overstated. With a low latency
path, the delayed ack estimator should already take care of this. With
a high latency path you're out of luck in any case. Besides, as I
said, you can always disable Nagle in an interactive application. I
suppose it would be nice to have a socket option to disable delayed
acks as well, just for completeness.

> > This can be solved using an idea from Greg Minshall, which I
> > thought was quite cool.
>
> It is the approach used in 2.4. :-)

Cool. :)

> It does help when the sender is also linux-2.4. :-)

Regards,

	MikaL
* Re: TCP acking too fast 2001-10-14 20:06 ` Mika Liljeberg @ 2001-10-15 18:40 ` kuznet 2001-10-15 19:15 ` Mika Liljeberg 0 siblings, 1 reply; 36+ messages in thread From: kuznet @ 2001-10-15 18:40 UTC (permalink / raw) To: Mika Liljeberg; +Cc: ak, davem, linux-kernel Hello! > Well, I think this "problem" is way overstated. Understated. :-) Actually, people who designed all this engine always kept in the mind only two cases: ftp and telnet. Who did care that some funny protocols sort of smtp work thousand times slower than they could? Nobody. Until the time when mail agents started to push really lots of mails. > Besides, as I said, you can always disable Nagle And you will finish with Nagle enabled only on ftp-data. I do not know another standard protosols which are not broken by delack+nagle. :-) This is sad but this is already truth: apache, samba etc, even ssh(!), each of them disable nagle by default, even despite of they are able to cure this problem with less of damage. Well, I answered to the question: "tcp is slow!" --- "Guy, you forgot to enable TCP_NODELAY. TCP is not supposed to work well in your case without this" so much of times, that started to suspect that nagling must be disabled by default. It would cause less of troubles. :-) Alexey ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: TCP acking too fast
  From: Mika Liljeberg @ 2001-10-15 19:15 UTC
  To: kuznet; +Cc: ak, davem, linux-kernel

kuznet@ms2.inr.ac.ru wrote:
> > Well, I think this "problem" is way overstated.
>
> Understated. :-)
>
> Actually, the people who designed all this engine always kept in mind
> only two cases: ftp and telnet. Who cared that some funny protocols
> of the smtp sort worked a thousand times slower than they could?

Well, if you ask me, it's smtp that is a prime example of braindead
protocol design. It's a wonder we're still using it. If you put that
many request-reply interactions into a protocol that could easily be
done in one, you're simply begging for a bloody nose. Nagle or not,
smtp sucks. :)

Anyway, Minshall's version of Nagle is ok with smtp, as long as the
smtp implementation isn't stupid enough to emit two remnants in one go
(yeah, right).

Anyway, it would be interesting to try an (even more) relaxed version
of Nagle that would allow a maximum of two remnants in flight. This
would basically cover all TCP request/reply cases (leading AND
trailing remnant). Coupled with a large initial window to get rid of
small-cwnd interactions, it might almost be all right.

Assuming the above, we wouldn't need your ack-every-pushed-remnant
policy, except for the following pathological bidirectional case: A
and B send two remnants to each other at the same time. Then both
block waiting for an ack, until finally one of them sends a delayed
ack. You could break this deadlock by using the following rule:

 - If we're blocked on Nagle (two remnants out) and the received
   segment has PSH, send an ACK immediately.

In other cases you wouldn't need to ack pushed segments. What do you
think? :-)

Regards,

	MikaL
* Re: TCP acking too fast
  From: Mika Liljeberg @ 2001-10-15 19:38 UTC
  To: kuznet, ak, davem, linux-kernel

Mika Liljeberg wrote:
> Anyway, it would be interesting to try an (even more) relaxed version
> of Nagle that would allow a maximum of two remnants in flight. This
> would basically cover all TCP request/reply cases (leading AND
> trailing remnant). Coupled with a large initial window to get rid of
> small-cwnd interactions, it might almost be all right.

Oops, bad idea. You can quench the objections; I already figured out
it won't work. :-(

I guess we're stuck with the current status quo: braindead application
protocols will perform badly no matter what we do. All we can really
do is prevent them from harming the network.

Regards,

	MikaL
* [PATCH] TCP acking too fast
  From: Mika Liljeberg @ 2001-10-14 13:14 UTC
  To: Andi Kleen; +Cc: David S. Miller, linux-kernel, kuznet

[-- Attachment #1: Type: text/plain, Size: 1281 bytes --]

Ok, here's the patch against 2.4.10-ac10. This seems to produce
acceptable behaviour in the cases I tested, at least. Someone with one
of those "ridiculously small MTU" links might give it a go to check
that the rcv_mss estimation still works as expected. It should,
though, as I didn't really make any changes to it.

Andi Kleen wrote:
> The only special case for PSH in RX left that I can see is in rcv_mss
> estimation, where it assumes that a packet with PSH set is not full
> sized. On further look, the 2.4 tcp_measure_rcv_mss will never update
> rcv_mss for packets which have PSH set, and in this case causes
> random ack behaviour depending on the initial rcv_mss guess.

A too low rcv_mss estimate isn't a problem, as the estimate is
immediately increased when the first larger segment arrives. A too
high estimate can be difficult to adjust down, though, if the sender
suddenly starts sending small segments with PSH set.

> Not very nice; it definitely violates the "be liberal in what you
> accept" rule. I'm not sure how to fix it; adding a fallback to an
> every-two-packets ack would pollute the fast path a bit.

Hopefully it's a bit more conservative now. I didn't implement the
fallback to ack-every-two-packets, though, as I had the exact opposite
problem. :)

Regards,

	MikaL

[-- Attachment #2: over_ack.patch --]
[-- Type: text/plain, Size: 1826 bytes --]

--- tcp_input.c.org	Sat Oct 13 23:24:38 2001
+++ tcp_input.c	Sun Oct 14 15:47:10 2001
@@ -126,24 +126,25 @@
 	 * sends good full-sized frames.
 	 */
 	len = skb->len;
+
 	if (len >= tp->ack.rcv_mss) {
 		tp->ack.rcv_mss = len;
-		/* Dubious? Rather, it is final cut. 8) */
-		if (tcp_flag_word(skb->h.th)&TCP_REMNANT)
-			tp->ack.pending |= TCP_ACK_PUSHED;
 	} else {
-		/* Otherwise, we make more careful check taking into account,
-		 * that SACKs block is variable.
+		/* If PSH is not set, packet should be full sized, assuming
+		 * that the peer implements Nagle correctly.
+		 * This observation (if it is correct 8)) allows
+		 * to handle super-low mtu links fairly.
 		 *
-		 * "len" is invariant segment length, including TCP header.
+		 * However, if sender sets TCP_NODELAY, this could effectively
+		 * turn receiver side SWS algorithms off. TCP_MIN_MSS guards
+		 * against a ridiculously small rcv_mss estimate.
+		 *
+		 * We also have to be careful checking the header size, since
+		 * the SACK option is variable length. "len" is the invariant
+		 * segment length, including TCP header.
 		 */
 		len += skb->data - skb->h.raw;
 		if (len >= TCP_MIN_RCVMSS + sizeof(struct tcphdr) ||
-		    /* If PSH is not set, packet should be
-		     * full sized, provided peer TCP is not badly broken.
-		     * This observation (if it is correct 8)) allows
-		     * to handle super-low mtu links fairly.
-		     */
 		    (len >= TCP_MIN_MSS + sizeof(struct tcphdr) &&
 		     !(tcp_flag_word(skb->h.th)&TCP_REMNANT))) {
 			/* Subtract also invariant (if peer is RFC compliant),
@@ -152,12 +153,9 @@
 			 */
 			len -= tp->tcp_header_len;
 			tp->ack.last_seg_size = len;
-			if (len == lss) {
+			if (len == lss)
 				tp->ack.rcv_mss = len;
-				return;
-			}
 		}
-		tp->ack.pending |= TCP_ACK_PUSHED;
 	}
 }
* Re: TCP acking too fast
  From: kuznet @ 2001-10-14 16:36 UTC
  To: Andi Kleen; +Cc: davem, ak, linux-kernel

Hello!

> I just checked, and the 2.4 kernel doesn't have the PSH quickack
> check anymore,

Right, it was removed because all the PSHed packets are acked as soon
as the rcvbuf is completely drained and the window is fully open. See?
That is the reason for the "too frequent" ACKs, and I daresay they are
not too frequent, and it is impossible to do anything about this.
These ACKs are an _absolute_ demand, and delaying them by some small
time helps nothing, destroying performance instead.

Well, it is the place commented with "Dubious? ... final cut." It is
enough to delete it to avoid "too frequent" ACKs, and to return to too
rare ACKs instead.

Alexey
* Re: TCP acking too fast
  From: David S. Miller @ 2001-10-14  7:50 UTC
  To: Mika.Liljeberg; +Cc: linux-kernel

   From: Mika Liljeberg <Mika.Liljeberg@welho.com>
   Date: Sun, 14 Oct 2001 10:05:33 +0300

   Looking at the dump, it seems that most arriving segments have the
   PSH bit set.

I know you said what is running on the receiver, but do you have any
clue what is running on the sender? It looks _really_ broken.

The transfer looks like a bulk one, but every segment (as you have
stated) has PSH set, which is completely stupid.

At least, I can guarantee you that the sender is not Linux. Or, if it
is Linux, it is running a really broken implementation of a web
server. :-)

Franks a lot,
David S. Miller
davem@redhat.com
* Re: TCP acking too fast
  From: Mika Liljeberg @ 2001-10-14  7:53 UTC
  To: David S. Miller; +Cc: linux-kernel

"David S. Miller" wrote:
>
>    From: Mika Liljeberg <Mika.Liljeberg@welho.com>
>    Date: Sun, 14 Oct 2001 10:05:33 +0300
>
>    Looking at the dump, it seems that most arriving segments have the
>    PSH bit set.
>
> I know you said what is running on the receiver, but do you have any
> clue what is running on the sender? It looks _really_ broken.
>
> The transfer looks like a bulk one, but every segment (as you have
> stated) has PSH set, which is completely stupid.
>
> At least, I can guarantee you that the sender is not Linux. Or, if it
> is Linux, it is running a really broken implementation of a web
> server. :-)

I've got a feeling you're going to rue saying that (see my other
email). ;-)

Regards,

	MikaL
* Re: TCP acking too fast
  From: Bill Davidsen @ 2001-10-15 20:59 UTC
  To: linux-kernel

In article <3BC8DAF0.3D16A546@welho.com> Mika.Liljeberg@welho.com
wrote:

> I've already disabled quickacks, replaced the receive MSS estimate
> with the advertised MSS in the ack sending policy (two places), and
> removed one dubious "immediate ack" condition from send_delay_ack().
> The annoying thing is that none of this seems to make any real
> difference. I must be missing something huge that's right in front of
> my nose, but I'm starting to run out of steam.
>
> Any thoughts on this?

The discussion has been most complete. I guess at this point, if you
can't fix the sender to stop this anti-social behaviour, you might try
using iptables to "mangle" the PSH off from this host, or rate limit
the ACKs, or some other hack. None of which is a "solution," just some
interesting things to try.

As noted, the core problem is that TCP doesn't like really asymmetric
bandwidth.

-- 
bill davidsen <davidsen@tmr.com>
  "If I were a diplomat, in the best case I'd go hungry.  In the worst
   case, people would die."
          -- Robert Lipe