From: "Ed Swierk" <eswierk@arastra.com>
To: qemu-devel@nongnu.org
Subject: [Qemu-devel] [PATCH] Improve -net user (slirp) performance by 4x
Date: Sun, 30 Apr 2006 20:00:44 -0700
Message-ID: <c1bf1cf0604302000ree12430v7144109271cb6a2f@mail.gmail.com>

[-- Attachment #1: Type: text/plain, Size: 1049 bytes --]

Three bugs in the slirp code have an enormous adverse effect on
networking performance.

1. The maximum TCP segment size for data flowing from the VM to the
host is unnecessarily limited to 512 bytes. 1460 bytes is the
appropriate value for Ethernet.

2. TCP acknowledgements are being delayed unnecessarily, in violation
of the TCP Congestion Control RFC (2581). There is no reason to delay
TCP acknowledgements (and certainly no reason to give special
treatment to packets consisting of a single ESC character, as the code
does now!).

3. qemu sleeps soundly while packets back up in slirp's buffers. slirp
socket fds should be added to the main qemu select() loop to avoid
unnecessary delays.
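
As an illustration of point 3, here is a minimal, self-contained sketch of
folding an extra set of descriptors into a single select() call. It is not
the qemu code: extra_select_fill() is a hypothetical stand-in for
slirp_select_fill(), and wait_once() is a hypothetical, much simplified
stand-in for main_loop_wait() that watches only the read set.

```c
#include <assert.h>
#include <sys/select.h>
#include <unistd.h>

/* Hypothetical stand-in for slirp_select_fill(): add one extra descriptor
 * to the read set and raise *pnfds if that descriptor is the highest. */
static void extra_select_fill(int extra_fd, int *pnfds, fd_set *rfds)
{
    assert(extra_fd >= 0 && extra_fd < FD_SETSIZE);
    FD_SET(extra_fd, rfds);
    if (extra_fd > *pnfds)
        *pnfds = extra_fd;
}

/* One iteration of a select()-based loop (hypothetical helper).
 * main_fd may be -1 if there is no "ordinary" descriptor to watch.
 * Returns 1 if extra_fd became readable before the timeout,
 * 0 if it did not, and -1 if select() failed. */
int wait_once(int main_fd, int extra_fd, int timeout_ms)
{
    fd_set rfds;
    struct timeval tv;
    int nfds = -1;
    int ret;

    FD_ZERO(&rfds);
    if (main_fd >= 0) {
        FD_SET(main_fd, &rfds);
        nfds = main_fd;
    }

    /* The step the patch adds: let the second subsystem contribute its
     * descriptors before sleeping, so its traffic wakes the loop. */
    extra_select_fill(extra_fd, &nfds, &rfds);

    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    ret = select(nfds + 1, &rfds, NULL, NULL, &tv);
    if (ret < 0)
        return -1;
    if (ret == 0)
        return 0; /* timed out, nothing readable */
    return FD_ISSET(extra_fd, &rfds) ? 1 : 0;
}
```

Called with the read end of a pipe as extra_fd, wait_once() returns as soon
as a byte is written to the pipe instead of sleeping for the full timeout,
which is the behaviour the patch is after for slirp's sockets.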

As Ken Duda mentioned in an earlier thread, measurements with a simple
Python script indicate that the attached patch accelerates TCP
throughput from about 2 megabytes/sec to 9 megabytes/sec, in both
directions.
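
For background on point 1, the arithmetic behind the 536 and 1460 figures
can be sketched as follows. This is not a model of the measured speedup; it
only shows how the MSS follows from the MTU and how many segments a transfer
needs at each size. The constant and function names are hypothetical, and
the header sizes assume IPv4 and TCP headers without options.

```c
#include <assert.h>

enum {
    ETHERNET_MTU = 1500, /* standard Ethernet payload size in bytes */
    IPV4_HDR_MIN = 20,   /* IPv4 header without options */
    TCP_HDR_MIN  = 20    /* TCP header without options */
};

/* Largest TCP payload that fits in one IP datagram of size mtu. */
int mss_for_mtu(int mtu)
{
    assert(mtu > IPV4_HDR_MIN + TCP_HDR_MIN);
    return mtu - IPV4_HDR_MIN - TCP_HDR_MIN;
}

/* Number of segments needed to carry nbytes at a given MSS, rounded up. */
long segments_needed(long nbytes, int mss)
{
    assert(nbytes >= 0 && mss > 0);
    return (nbytes + mss - 1) / mss;
}
```

mss_for_mtu(ETHERNET_MTU) gives 1460 and mss_for_mtu(576) gives 536, the
value mentioned in the tcp.h comment. Sending 1 MiB (1048576 bytes) takes
2048 segments at an MSS of 512 but only 719 at 1460.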

I'm sure many folks would benefit from this improvement; please let me
know if there is anything I can do to help nudge it into CVS.

--Ed

[-- Attachment #2: qemu-slirp-performance.patch --]
[-- Type: text/x-patch, Size: 3193 bytes --]

diff -BurN qemu-snapshot-2006-03-27_23.orig/slirp/tcp.h qemu-snapshot-2006-03-27_23/slirp/tcp.h
--- qemu-snapshot-2006-03-27_23.orig/slirp/tcp.h	2004-04-21 17:10:47.000000000 -0700
+++ qemu-snapshot-2006-03-27_23/slirp/tcp.h	2006-04-11 15:22:05.000000000 -0700
@@ -100,8 +100,10 @@
  * With an IP MSS of 576, this is 536,
  * but 512 is probably more convenient.
  * This should be defined as MIN(512, IP_MSS - sizeof (struct tcpiphdr)).
+ *
+ * We make this 1460 because we only care about Ethernet in the qemu context.
  */
-#define	TCP_MSS	512
+#define	TCP_MSS	1460
 
 #define	TCP_MAXWIN	65535	/* largest value for (unscaled) window */
 
diff -BurN qemu-snapshot-2006-03-27_23.orig/slirp/tcp_input.c qemu-snapshot-2006-03-27_23/slirp/tcp_input.c
--- qemu-snapshot-2006-03-27_23.orig/slirp/tcp_input.c	2004-10-07 16:27:35.000000000 -0700
+++ qemu-snapshot-2006-03-27_23/slirp/tcp_input.c	2006-04-11 15:22:05.000000000 -0700
@@ -580,28 +580,11 @@
 			 *	congestion avoidance sender won't send more until
 			 *	he gets an ACK.
 			 * 
-			 * Here are 3 interpretations of what should happen.
-			 * The best (for me) is to delay-ack everything except
-			 * if it's a one-byte packet containing an ESC
-			 * (this means it's an arrow key (or similar) sent using
-			 * Nagel, hence there will be no echo)
-			 * The first of these is the original, the second is the
-			 * middle ground between the other 2
+			 * It is better to not delay acks at all to maximize
+			 * TCP throughput.  See RFC 2581.
 			 */ 
-/*			if (((unsigned)ti->ti_len < tp->t_maxseg)) {
- */			     
-/*			if (((unsigned)ti->ti_len < tp->t_maxseg && 
- *			     (so->so_iptos & IPTOS_LOWDELAY) == 0) ||
- *			    ((so->so_iptos & IPTOS_LOWDELAY) && 
- *			     ((struct tcpiphdr_2 *)ti)->first_char == (char)27)) {
- */
-			if ((unsigned)ti->ti_len == 1 &&
-			    ((struct tcpiphdr_2 *)ti)->first_char == (char)27) {
-				tp->t_flags |= TF_ACKNOW;
-				tcp_output(tp);
-			} else {
-				tp->t_flags |= TF_DELACK;
-			}
+			tp->t_flags |= TF_ACKNOW;
+			tcp_output(tp);
 			return;
 		}
 	} /* header prediction */
diff -BurN qemu-snapshot-2006-03-27_23.orig/vl.c qemu-snapshot-2006-03-27_23/vl.c
--- qemu-snapshot-2006-03-27_23.orig/vl.c	2006-04-11 15:21:27.000000000 -0700
+++ qemu-snapshot-2006-03-27_23/vl.c	2006-04-11 15:22:05.000000000 -0700
@@ -4026,7 +4026,7 @@
 void main_loop_wait(int timeout)
 {
     IOHandlerRecord *ioh, *ioh_next;
-    fd_set rfds, wfds;
+    fd_set rfds, wfds, xfds;
     int ret, nfds;
     struct timeval tv;
 
@@ -4041,6 +4041,7 @@
     nfds = -1;
     FD_ZERO(&rfds);
     FD_ZERO(&wfds);
+    FD_ZERO(&xfds);
     for(ioh = first_io_handler; ioh != NULL; ioh = ioh->next) {
         if (ioh->fd_read &&
             (!ioh->fd_read_poll ||
@@ -4062,7 +4063,12 @@
 #else
     tv.tv_usec = timeout * 1000;
 #endif
-    ret = select(nfds + 1, &rfds, &wfds, NULL, &tv);
+#if defined(CONFIG_SLIRP)
+    if (slirp_inited) {
+        slirp_select_fill(&nfds, &rfds, &wfds, &xfds);
+    }
+#endif
+    ret = select(nfds + 1, &rfds, &wfds, &xfds, &tv);
     if (ret > 0) {
         /* XXX: better handling of removal */
         for(ioh = first_io_handler; ioh != NULL; ioh = ioh_next) {


Thread overview: 3+ messages
2006-05-01  3:00 Ed Swierk [this message]
2006-05-01 12:19 ` [Qemu-devel] [PATCH] Improve -net user (slirp) performance by 4x Fabrice Bellard
2006-05-01 13:34   ` Fabrice Bellard
