Re: TCP prequeue performance

netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* Re: TCP prequeue performance
       [not found] <BED5FA3B.2A0%cndougla@purdue.edu>
@ 2005-06-15 20:51 ` David S. Miller
  2005-06-15 23:34 ` Andi Kleen
  1 sibling, 0 replies; 4+ messages in thread
From: David S. Miller @ 2005-06-15 20:51 UTC (permalink / raw)
  To: cndougla; +Cc: linux-kernel, netdev

From: Chase Douglas <cndougla@purdue.edu>
Date: Wed, 15 Jun 2005 15:31:07 -0500

> Note the decreases in the system and real times. These numbers are fairly
> stable through 10 consecutive benchmarks of each. If I change message sizes
> and number of connections, the difference can narrow or widen, but usually
> the non-prequeue beats the prequeue with respect to system and real time.

Please take this discussion to the networking development list,
netdev@vger.kernel.org.  It is an interesting issue, but let's discuss
it in the right place. :-)

Prequeue has many advantages, in that processes are properly charged
for TCP processing overhead, and copying to userspace happens directly
in the TCP input path.

This paces TCP senders, in that ACKs do not come back faster than the
kernel can get the process on the cpu to drain the recvmsg() queue.
ACKs sent immediately (without prequeue) give the sender the illusion
that the system can handle a higher data rate than is actually
feasible.

Unfortunately, if there are bugs or bad heuristics in the process
scheduler, this can impact TCP performance quite a bit.

Also, applications using small messages and which are sensitive to
latency can also be harmed by prequeue, that's why we have the
"tcp_low_latency" sysctl.  It actually has a slight bug, in that one
of the checks (where you placed the "if (0") was missing, which is
fixed by the patch below:

[TCP]: Fix sysctl_tcp_low_latency

When enabled, this should disable UCOPY prequeue'ing altogether,
but it does not due to a missing test.

Signed-off-by: David S. Miller <davem@davemloft.net>

--- 1/net/ipv4/tcp.c.~1~	2005-06-09 12:29:41.000000000 -0700
+++ 2/net/ipv4/tcp.c	2005-06-09 16:39:46.000000000 -0700
@@ -1345,7 +1345,7 @@

 		cleanup_rbuf(sk, copied);

-		if (tp->ucopy.task == user_recv) {
+		if (!sysctl_tcp_low_latency && tp->ucopy.task == user_recv) {
 			/* Install new reader */
 			if (!user_recv && !(flags & (MSG_TRUNC | MSG_PEEK))) {
 				user_recv = current;

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: TCP prequeue performance
       [not found] <BED5FA3B.2A0%cndougla@purdue.edu>
  2005-06-15 20:51 ` TCP prequeue performance David S. Miller
@ 2005-06-15 23:34 ` Andi Kleen
  2005-06-15 23:41   ` David S. Miller
  1 sibling, 1 reply; 4+ messages in thread
From: Andi Kleen @ 2005-06-15 23:34 UTC (permalink / raw)
  To: Chase Douglas; +Cc: linux-kernel, netdev

Chase Douglas <cndougla@purdue.edu> writes:
>
> I then disabled the prequeue mechanism by changing net/ipv4/tcp.c:1347 of
> 2.6.11:
>
> if (tp->ucopy.task == user_recv) {
>     to
> if (0 && tp->ucopy.task == user_recv) {

You actually didn't disable it completely - it would still be filled. 
To really disable it set net.ipv4.tcp_lowlatency, that disables
even the early queueing and will process all incoming TCP in softirq context
only.

>
> The same benchmark then yielded:
>
> time ./client 10000 10000 100000 1 500000000 recv
>
> real    1m21.928s
> user    0m1.579s
> sys     0m8.330ss
>
> Note the decreases in the system and real times. These numbers are fairly
> stable through 10 consecutive benchmarks of each. If I change message sizes
> and number of connections, the difference can narrow or widen, but usually
> the non-prequeue beats the prequeue with respect to system and real time.
>
> It might be that I've just found an instance where the prequeue is slower
> than the "slow" path. I'm not quite sure why this would be. Does anyone have
> any thoughts on this?

prequeue adds latency. Its original purpose was to be able to 
do combined checksum copy to user space, but that is not very useful
anymore with modern NICs which all do hardware checksumming. 
The only purpose it has left is to batch the TCP processing
better and in particular to account it to the scheduler.

When the receiver does not process the data in time 
the delayed ack timer takes over and processes the data.

Now the way you disabled it is interesting. The data would
be still queued, but in user process would be never emptied.

This means data is always processed later in the delack
timer in your hacked kernel. 

This will lead to batching of the processing (because
after upto 200ms there will be many more packet in the queues), 
and seems to save CPU time in your case.

Basically you added something similar to the the anticipatory scheduler
which adds artifical delays into disk scheduling to your TCP receiver
to get better batching. It seems to work for you. 

I think it is unlikely adding artificial processing delays like this
will help in many cases though, normally early delivery of received
data to user space should be better.

-Andi

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: TCP prequeue performance
  2005-06-15 23:34 ` Andi Kleen
@ 2005-06-15 23:41   ` David S. Miller
  2005-06-16  0:23     ` Andi Kleen
  0 siblings, 1 reply; 4+ messages in thread
From: David S. Miller @ 2005-06-15 23:41 UTC (permalink / raw)
  To: ak; +Cc: cndougla, linux-kernel, netdev

From: Andi Kleen <ak@muc.de>
Subject: Re: TCP prequeue performance
Date: Thu, 16 Jun 2005 01:34:48 +0200

> Chase Douglas <cndougla@purdue.edu> writes:
> >
> > I then disabled the prequeue mechanism by changing net/ipv4/tcp.c:1347 of
> > 2.6.11:
> >
> > if (tp->ucopy.task == user_recv) {
> >     to
> > if (0 && tp->ucopy.task == user_recv) {
> 
> You actually didn't disable it completely - it would still be filled. 

Not true, if this check does not pass, tp->ucopy.task is
never set, therefore prequeue processing is never performed.

This test must pass the first time, when both tp->ucopy.task
and user_recv are both NULL, in order for prequeue processing
to occur at all.

So his change did totally disable prequeue.

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: TCP prequeue performance
  2005-06-15 23:41   ` David S. Miller
@ 2005-06-16  0:23     ` Andi Kleen
  0 siblings, 0 replies; 4+ messages in thread
From: Andi Kleen @ 2005-06-16  0:23 UTC (permalink / raw)
  To: David S. Miller; +Cc: cndougla, linux-kernel, netdev

"David S. Miller" <davem@davemloft.net> writes:
>
> Not true, if this check does not pass, tp->ucopy.task is
> never set, therefore prequeue processing is never performed.

Oh well, here goes my nice theory :)
>
> This test must pass the first time, when both tp->ucopy.task
> and user_recv are both NULL, in order for prequeue processing
> to occur at all.
>
> So his change did totally disable prequeue.

Then probably his test was latency bound somehow, but normally
that should not affect system time, just wall time.

I would perhaps compare context switch numbers and netstat -s
output between the different runs and see if anything pops out.

-Andi

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2005-06-16  0:23 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <BED5FA3B.2A0%cndougla@purdue.edu>
2005-06-15 20:51 ` TCP prequeue performance David S. Miller
2005-06-15 23:34 ` Andi Kleen
2005-06-15 23:41   ` David S. Miller
2005-06-16  0:23     ` Andi Kleen

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).