qemu-devel.nongnu.org archive mirror
* [Qemu-devel] Network Performance between Win Host and Linux
@ 2006-04-11 17:20 Kenneth Duda
  2006-04-11 17:28 ` Paul Brook
  2006-04-11 22:36 ` [Qemu-devel] " Kenneth Duda
  0 siblings, 2 replies; 15+ messages in thread
From: Kenneth Duda @ 2006-04-11 17:20 UTC (permalink / raw)
  To: qemu-devel

I am also having severe performance problems using NFS-over-TCP on
qemu-0.8 with a Linux host and guest.  I will be looking at this
today.  My current theory is that the whole machine is going idle
before qemu decides to poll kernel ring buffers holding packets the
guest is transmitting, but if anyone has actual information, please
let me know.

Thanks,
    -Ken

> Hello,
>
> I tried the cvs version from about a week ago with the latest kqemu
> driver, but the network problem still exists. I am using:
>
> qemu -net nic -net tap,ifname=my-tap
>
> under Win2k with a Gentoo guest. The network throughput is about 20 MB
> (per minute!).  When I use qemu 0.7.2 with the tap patch:
>
> qemu -tap "my-tap"
>
> the performance is much better (about a factor of 10: 3 MB per second).
> What's going wrong there?
>
> Thanks
>
> Helmut
>

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Qemu-devel] Network Performance between Win Host and Linux
  2006-04-11 17:20 [Qemu-devel] Network Performance between Win Host and Linux Kenneth Duda
@ 2006-04-11 17:28 ` Paul Brook
  2006-04-11 17:49   ` Kenneth Duda
  2006-04-11 22:36 ` [Qemu-devel] " Kenneth Duda
  1 sibling, 1 reply; 15+ messages in thread
From: Paul Brook @ 2006-04-11 17:28 UTC (permalink / raw)
  To: qemu-devel

On Tuesday 11 April 2006 18:20, Kenneth Duda wrote:
> I am also having severe performance problems using NFS-over-TCP on
> qemu-0.8 with a Linux host and guest.  I will be looking at this
> today.  My current theory is that the whole machine is going idle
> before qemu decides to poll kernel ring buffers holding packets the
> guest is transmitting, but if anyone has actual information, please
> let me know.

You could be suffering from high interrupt latency. If the guest CPU is not 
idle then qemu only checks for interrupts (eg. the network RX interrupt) 
every 1ms or 1/host_HZ seconds, whichever is greater.
If the guest CPU is idle it should respond immediately.
I wouldn't be surprised if this problem is worse when using kqemu.

Paul


* Re: [Qemu-devel] Network Performance between Win Host and Linux
  2006-04-11 17:28 ` Paul Brook
@ 2006-04-11 17:49   ` Kenneth Duda
  2006-04-11 18:19     ` Helmut Auer
                       ` (2 more replies)
  0 siblings, 3 replies; 15+ messages in thread
From: Kenneth Duda @ 2006-04-11 17:49 UTC (permalink / raw)
  To: Paul Brook; +Cc: qemu-devel

Paul, thanks for the note.

In my case, the guest CPU is idle.  The host CPU utilization is only 5
or 10 percent when running "find / -print > /dev/null" on the guest. 
So I don't think guest interrupt latency is the issue for me in this
case.

My first guess is that qemu is asleep when the NFS response arrives on
the slirp socket, and stays asleep for several milliseconds before
deciding to check if anything has shown up via slirp.  The problem is
that vl.c's main_loop_wait() has separate calls to select() for slirp
versus non-slirp fd's.  I think this is the problem because strace
reveals qemu blocking for several milliseconds at a time in select(),
waking up with a SIGALRM, and then polling slirp and finding stuff to
do there.  These select calls don't appear hard to integrate, and the
author seems to feel this would be a good idea anyway; from vl.c:

#if defined(CONFIG_SLIRP)
    /* XXX: merge with the previous select() */
    if (slirp_inited) {

I will take a swing at this first.  Please let me know if there's
anything I should be aware of.

Thanks,
    -Ken



On 4/11/06, Paul Brook <paul@codesourcery.com> wrote:
> On Tuesday 11 April 2006 18:20, Kenneth Duda wrote:
> > I am also having severe performance problems using NFS-over-TCP on
> > qemu-0.8 with a Linux host and guest.  I will be looking at this
> > today.  My current theory is that the whole machine is going idle
> > before qemu decides to poll kernel ring buffers holding packets the
> > guest is transmitting, but if anyone has actual information, please
> > let me know.
>
> You could be suffering from high interrupt latency. If the guest CPU is not
> idle then qemu only checks for interrupts (eg. the network RX interrupt)
> every 1ms or 1/host_HZ seconds, whichever is greater.
> If the guest CPU is idle it should respond immediately.
> I wouldn't be surprised if this problem is worse when using kqemu.
>
> Paul
>


* Re: [Qemu-devel] Network Performance between Win Host and Linux
  2006-04-11 17:49   ` Kenneth Duda
@ 2006-04-11 18:19     ` Helmut Auer
  2006-04-12  2:10       ` Kazu
  2006-04-11 20:40     ` Leonardo E. Reiter
  2006-04-11 21:00     ` Leonardo E. Reiter
  2 siblings, 1 reply; 15+ messages in thread
From: Helmut Auer @ 2006-04-11 18:19 UTC (permalink / raw)
  To: qemu-devel

Hello
> In my case, the guest CPU is idle.  The host CPU utilization is only 5
> or 10 percent when running "find / -print > /dev/null" on the guest. 
> So I don't think guest interrupt latency is the issue for me in this
> case.
>   
In my environment, throughput of about 300 KB is the good case, i.e. when 
the cpu is idle. When the cpu is busy it degrades to 20 KB in the 
worst case.
As I said before, with the tap-patched qemu 0.7.2 it is about 10 times 
faster.

-- 
Helmut Auer, helmut@helmutauer.de 


* Re: [Qemu-devel] Network Performance between Win Host and Linux
  2006-04-11 17:49   ` Kenneth Duda
  2006-04-11 18:19     ` Helmut Auer
@ 2006-04-11 20:40     ` Leonardo E. Reiter
  2006-04-11 21:46       ` Kenneth Duda
  2006-04-11 21:00     ` Leonardo E. Reiter
  2 siblings, 1 reply; 15+ messages in thread
From: Leonardo E. Reiter @ 2006-04-11 20:40 UTC (permalink / raw)
  To: qemu-devel

[-- Attachment #1: Type: text/plain, Size: 2411 bytes --]

Hi Ken,

I'm attaching a pretty old patch I made (from the 0.7.1 days), which did 
a quick and dirty merge of the select's.  It's not something that is 
clean and it will need adapting to 0.8.0... but, I figure you could draw 
some quick hints on how to merge the 2.  Basically it fills the select 
bitmaps when it walks through the fd's the first time, then calls select 
instead of poll.  It also has slirp fill its own bits (fd's) in before 
calling select.  So this is condensed to 1 select call.

Do what you want with the code - like I said, it's messy and old.  But 
maybe you can at least use it to quickly test your hypothesis.  I'd be 
interested in learning about any benchmarks you come up with if you 
merge the select+poll.  Also, it may not be valid at all on Windows 
hosts since there is a question about select() being interrupted 
properly on those hosts - it should work on Linux/BSD.

Regards,

Leo Reiter

P.S. this patch should be applied with -p1, not -p0 like my newer 
patches are applied.  Sorry for that - like I said, it's quite old.

Kenneth Duda wrote:
> Paul, thanks for the note.
> 
> In my case, the guest CPU is idle.  The host CPU utilization is only 5
> or 10 percent when running "find / -print > /dev/null" on the guest. 
> So I don't think guest interrupt latency is the issue for me in this
> case.
> 
> My first guess is that qemu is asleep when the NFS response arrives on
> the slirp socket, and stays asleep for several milliseconds before
> deciding to check if anything has shown up via slirp.  The problem is
> that vl.c's main_loop_wait() has separate calls to select() for slirp
> versus non-slirp fd's.  I think this is the problem because strace
> reveals qemu blocking for several milliseconds at a time in select(),
> waking up with a SIGALRM, and then polling slirp and finding stuff to
> do there.  These select calls don't appear hard to integrate, and the
> author seems to feel this would be a good idea anyway; from vl.c:
> 
> #if defined(CONFIG_SLIRP)
>     /* XXX: merge with the previous select() */
>     if (slirp_inited) {
> 
> I will take a swing at this first.  Please let me know if there's
> anything I should be aware of.
> 
> Thanks,
>     -Ken

-- 
Leonardo E. Reiter
Vice President of Product Development, CTO

Win4Lin, Inc.
Virtual Computing from Desktop to Data Center
Main: +1 512 339 7979
Fax: +1 512 532 6501
http://www.win4lin.com

[-- Attachment #2: qemu-vl-select.patch --]
[-- Type: text/x-patch, Size: 4168 bytes --]

--- qemu/vl.c	2005-05-11 17:10:02.000000000 -0400
+++ qemu-select/vl.c	2005-05-11 17:13:24.000000000 -0400
@@ -2598,51 +2598,85 @@
 void main_loop_wait(int timeout)
 {
 #ifndef _WIN32
-    struct pollfd ufds[MAX_IO_HANDLERS + 1], *pf;
     IOHandlerRecord *ioh, *ioh_next;
     uint8_t buf[4096];
     int n, max_size;
 #endif
     int ret;
+#if defined(CONFIG_SLIRP) || !defined(_WIN32)
+    fd_set rfds, wfds, xfds;
+    int nfds;
+    struct timeval tv;
+#endif
+#if defined(CONFIG_SLIRP)
+    int slirp_nfds;
+#endif
 
 #ifdef _WIN32
         if (timeout > 0)
             Sleep(timeout);
+
+#if defined(CONFIG_SLIRP)
+        /* XXX: merge with poll() */
+        if (slirp_inited) {
+
+            nfds = -1;
+            FD_ZERO(&rfds);
+            FD_ZERO(&wfds);
+            FD_ZERO(&xfds);
+            slirp_select_fill(&nfds, &rfds, &wfds, &xfds);
+            tv.tv_sec = 0;
+            tv.tv_usec = 0;
+            ret = select(nfds + 1, &rfds, &wfds, &xfds, &tv);
+            if (ret >= 0) {
+                slirp_select_poll(&rfds, &wfds, &xfds);
+            }
+        }
+#endif
 #else
         /* poll any events */
         /* XXX: separate device handlers from system ones */
-        pf = ufds;
+	FD_ZERO(&rfds);
+	FD_ZERO(&wfds);
+	FD_ZERO(&xfds);
+	nfds = -1;
         for(ioh = first_io_handler; ioh != NULL; ioh = ioh->next) {
             if (!ioh->fd_can_read) {
+		FD_SET(ioh->fd, &rfds);
                 max_size = 0;
-                pf->fd = ioh->fd;
-                pf->events = POLLIN;
-                ioh->ufd = pf;
-                pf++;
+		if (ioh->fd > nfds)
+		    nfds = ioh->fd;
             } else {
                 max_size = ioh->fd_can_read(ioh->opaque);
                 if (max_size > 0) {
                     if (max_size > sizeof(buf))
                         max_size = sizeof(buf);
-                    pf->fd = ioh->fd;
-                    pf->events = POLLIN;
-                    ioh->ufd = pf;
-                    pf++;
-                } else {
-                    ioh->ufd = NULL;
+		    FD_SET(ioh->fd, &rfds);
+		    if (ioh->fd > nfds)
+			nfds = ioh->fd;
                 }
             }
             ioh->max_size = max_size;
         }
+
+#if defined(CONFIG_SLIRP)
+        if (slirp_inited) {
+	    slirp_nfds = -1;
+	    slirp_select_fill(&slirp_nfds, &rfds, &wfds, &xfds);
+	    if (slirp_nfds > nfds)
+		nfds = slirp_nfds;
+	}
+#endif	/* CONFIG_SLIRP */
+
+	tv.tv_sec = 0;
+	tv.tv_usec = timeout * 1000;
+	ret = select(nfds + 1, &rfds, &wfds, &xfds, &tv);
         
-        ret = poll(ufds, pf - ufds, timeout);
         if (ret > 0) {
             /* XXX: better handling of removal */
             for(ioh = first_io_handler; ioh != NULL; ioh = ioh_next) {
                 ioh_next = ioh->next;
-                pf = ioh->ufd;
-                if (pf) {
-                    if (pf->revents & POLLIN) {
+		if (FD_ISSET(ioh->fd, &rfds)) {
                         if (ioh->max_size == 0) {
                             /* just a read event */
                             ioh->fd_read(ioh->opaque, NULL, 0);
@@ -2654,31 +2688,16 @@
                                 ioh->fd_read(ioh->opaque, NULL, -errno);
                             }
                         }
-                    }
-                }
+		}
             }
-        }
-#endif /* !defined(_WIN32) */
-#if defined(CONFIG_SLIRP)
-        /* XXX: merge with poll() */
-        if (slirp_inited) {
-            fd_set rfds, wfds, xfds;
-            int nfds;
-            struct timeval tv;
 
-            nfds = -1;
-            FD_ZERO(&rfds);
-            FD_ZERO(&wfds);
-            FD_ZERO(&xfds);
-            slirp_select_fill(&nfds, &rfds, &wfds, &xfds);
-            tv.tv_sec = 0;
-            tv.tv_usec = 0;
-            ret = select(nfds + 1, &rfds, &wfds, &xfds, &tv);
-            if (ret >= 0) {
+#if defined(CONFIG_SLIRP)
+	    if (slirp_inited)
                 slirp_select_poll(&rfds, &wfds, &xfds);
-            }
         }
-#endif
+#endif	/* defined(CONFIG_SLIRP) */
+
+#endif /* !defined(_WIN32) */
 
         if (vm_running) {
             qemu_run_timers(&active_timers[QEMU_TIMER_VIRTUAL], 


* Re: [Qemu-devel] Network Performance between Win Host and Linux
  2006-04-11 17:49   ` Kenneth Duda
  2006-04-11 18:19     ` Helmut Auer
  2006-04-11 20:40     ` Leonardo E. Reiter
@ 2006-04-11 21:00     ` Leonardo E. Reiter
  2 siblings, 0 replies; 15+ messages in thread
From: Leonardo E. Reiter @ 2006-04-11 21:00 UTC (permalink / raw)
  To: qemu-devel

[-- Attachment #1: Type: text/plain, Size: 835 bytes --]

Hi Ken,

please disregard my last mail on this... here's a current patch against 
today's CVS.  I didn't realize that vl.c already converted from poll() 
to select(), so the patch logic is much easier and cleaner.

Check it out... I tested it minimally and it seems to work, but only on a 
Linux host.

Leo

P.S. you can apply this one with -p0 arg to patch.

Kenneth Duda wrote:
> Paul, thanks for the note.
> 
> In my case, the guest CPU is idle.  The host CPU utilization is only 5
> or 10 percent when running "find / -print > /dev/null" on the guest. 
> So I don't think guest interrupt latency is the issue for me in this
> case.

-- 
Leonardo E. Reiter
Vice President of Product Development, CTO

Win4Lin, Inc.
Virtual Computing from Desktop to Data Center
Main: +1 512 339 7979
Fax: +1 512 532 6501
http://www.win4lin.com

[-- Attachment #2: qemu-select-merge.patch --]
[-- Type: text/x-patch, Size: 2057 bytes --]

Index: vl.c
===================================================================
RCS file: /cvsroot/qemu/qemu/vl.c,v
retrieving revision 1.168
diff -a -u -r1.168 vl.c
--- vl.c	9 Apr 2006 01:32:52 -0000	1.168
+++ vl.c	11 Apr 2006 20:56:56 -0000
@@ -3952,8 +3952,11 @@
 void main_loop_wait(int timeout)
 {
     IOHandlerRecord *ioh, *ioh_next;
-    fd_set rfds, wfds;
+    fd_set rfds, wfds, xfds;
     int ret, nfds;
+#if defined(CONFIG_SLIRP)
+    int slirp_nfds;
+#endif
     struct timeval tv;
 
 #ifdef _WIN32
@@ -3967,6 +3970,7 @@
     nfds = -1;
     FD_ZERO(&rfds);
     FD_ZERO(&wfds);
+    FD_ZERO(&xfds);
     for(ioh = first_io_handler; ioh != NULL; ioh = ioh->next) {
         if (ioh->fd_read &&
             (!ioh->fd_read_poll ||
@@ -3988,7 +3992,14 @@
 #else
     tv.tv_usec = timeout * 1000;
 #endif
-    ret = select(nfds + 1, &rfds, &wfds, NULL, &tv);
+#if defined(CONFIG_SLIRP)
+    if (slirp_inited) {
+        slirp_select_fill(&slirp_nfds, &rfds, &wfds, &xfds);
+        if (slirp_nfds > nfds)
+            nfds = slirp_nfds;
+    }
+#endif
+    ret = select(nfds + 1, &rfds, &wfds, &xfds, &tv);
     if (ret > 0) {
         /* XXX: better handling of removal */
         for(ioh = first_io_handler; ioh != NULL; ioh = ioh_next) {
@@ -4000,30 +4011,14 @@
                 ioh->fd_write(ioh->opaque);
             }
         }
-    }
-#ifdef _WIN32
-    tap_win32_poll();
-#endif
-
 #if defined(CONFIG_SLIRP)
-    /* XXX: merge with the previous select() */
-    if (slirp_inited) {
-        fd_set rfds, wfds, xfds;
-        int nfds;
-        struct timeval tv;
-        
-        nfds = -1;
-        FD_ZERO(&rfds);
-        FD_ZERO(&wfds);
-        FD_ZERO(&xfds);
-        slirp_select_fill(&nfds, &rfds, &wfds, &xfds);
-        tv.tv_sec = 0;
-        tv.tv_usec = 0;
-        ret = select(nfds + 1, &rfds, &wfds, &xfds, &tv);
-        if (ret >= 0) {
+        if (slirp_inited)
             slirp_select_poll(&rfds, &wfds, &xfds);
-        }
+#endif
     }
+
+#ifdef _WIN32
+    tap_win32_poll();
 #endif
 
     if (vm_running) {


* Re: [Qemu-devel] Network Performance between Win Host and Linux
  2006-04-11 20:40     ` Leonardo E. Reiter
@ 2006-04-11 21:46       ` Kenneth Duda
  2006-04-11 21:58         ` Leonardo E. Reiter
  0 siblings, 1 reply; 15+ messages in thread
From: Kenneth Duda @ 2006-04-11 21:46 UTC (permalink / raw)
  To: qemu-devel

Thanks, Leo.  It appears your patch or something similar has made it
into 0.8.0.  I have already merged the select loops, but it didn't
help as much as I hoped, maybe 10%.  A much bigger improvement was
made by fixing the badly hacked slirp DELACK behavior.  Believe it or
not, slirp delays all TCP acks *unless* the segment data starts with
an escape character, I kid you not.  I threw that out, and have made
slirp's tcp_input rfc2581 compliant (to my shallow reading of the rfc)
and that boosted throughput from vm->host by 3.5x, to 56 megabits
(from 16 megabits).  The performance from host->vm was helped less,
and that was because of another hack in slirp that was causing it to
get the wrong MSS --- it was sending 512 byte segments.  Now, I'm
looking at excessive numbers of retransmissions (believe it or not)
--- I suspect the ne2000 ring buffer is overflowing but I'm not yet
sure.  I will post a patch including all of these things when I'm
done.  I'm expecting a significant aggregate improvement.

     -Ken

On 4/11/06, Leonardo E. Reiter <lreiter@win4lin.com> wrote:
> Hi Ken,
>
> I'm attaching a pretty old patch I made (from the 0.7.1 days), which did
> a quick and dirty merge of the select's.  It's not something that is
> clean and it will need adapting to 0.8.0... but, I figure you could draw
> some quick hints on how to merge the 2.  Basically it fills the select
> bitmaps when it walks through the fd's the first time, then calls select
> instead of poll.  It also has slirp fill its own bits (fd's) in before
> calling select.  So this is condensed to 1 select call.
>
> Do what you want with the code - like I said, it's messy and old.  But
> maybe you can at least use it to quickly test your hypothesis.  I'd be
> interested in learning about any benchmarks you come up with if you
> merge the select+poll.  Also, it may not be valid at all on Windows
> hosts since there is a question about select() being interrupted
> properly on those hosts - it should work on Linux/BSD.
>
> Regards,
>
> Leo Reiter
>
> P.S. this patch should be applied with -p1, not -p0 like my newer
> patches are applied.  Sorry for that - like I said, it's quite old.
>


* Re: [Qemu-devel] Network Performance between Win Host and Linux
  2006-04-11 21:46       ` Kenneth Duda
@ 2006-04-11 21:58         ` Leonardo E. Reiter
  2006-04-11 22:42           ` Kenneth Duda
  0 siblings, 1 reply; 15+ messages in thread
From: Leonardo E. Reiter @ 2006-04-11 21:58 UTC (permalink / raw)
  To: qemu-devel

Yes... I sent a follow-up note after I looked at the latest vl.c, with a 
newer patch applied. Much simpler.

As for the delay acks, I've seen this and removed the delay for testing 
before.  I read in the comment (not sure if it was Fabrice or the slirp 
author) about how the delay was 1 of 3 methods that had been chosen as 
sort of a "compromise."  I recall testing newer versions of the code and 
not having as much of an issue with the delayed ack as before, so I 
figured Paul's performance fixes had addressed that somewhat (they 
definitely helped tremendously for receiving data).  In any case, it's 
good that you are taking a scientific approach to addressing this.  I 
personally think that slirp is a great idea for networking, for most 
uses, because it's totally in userspace, etc., etc.  But let's keep in 
mind that the original code was designed to meet the performance 
criteria of a serial line ;)  The work you are doing should help in 
bringing that more up to date.  I'd be glad to help with any testing 
if/when you have patches.

Thanks,

Leo Reiter

Kenneth Duda wrote:
> Thanks, Leo.  It appears your patch or something similar has made it
> into 0.8.0.  I have already merged the select loops, but it didn't
> help as much as I hoped, maybe 10%.  A much bigger improvement was
> made by fixing the badly hacked slirp DELACK behavior.  Believe it or
> not, slirp delays all TCP acks *unless* the segment data starts with
> an escape character, I kid you not.  I threw that out, and have made
> slirp's tcp_input rfc2581 compliant (to my shallow reading of the rfc)
> and that boosted throughput from vm->host by 3.5x, to 56 megabits
> (from 16 megabits).  The performance from host->vm was helped less,
> and that was because of another hack in slirp that was causing it to
> get the wrong MSS --- it was sending 512 byte segments.  Now, I'm
> looking at excessive numbers of retransmissions (believe it or not)
> --- I suspect the ne2000 ring buffer is overflowing but I'm not yet
> sure.  I will post a patch including all of these things when I'm
> done.  I'm expecting a significant aggregate improvement.
> 
>      -Ken
> 
> On 4/11/06, Leonardo E. Reiter <lreiter@win4lin.com> wrote:
> 
>>Hi Ken,
>>
>>I'm attaching a pretty old patch I made (from the 0.7.1 days), which did
>>a quick and dirty merge of the select's.  It's not something that is
>>clean and it will need adapting to 0.8.0... but, I figure you could draw
>>some quick hints on how to merge the 2.  Basically it fills the select
>>bitmaps when it walks through the fd's the first time, then calls select
>>instead of poll.  It also has slirp fill its own bits (fd's) in before
>>calling select.  So this is condensed to 1 select call.
>>
>>Do what you want with the code - like I said, it's messy and old.  But
>>maybe you can at least use it to quickly test your hypothesis.  I'd be
>>interested in learning about any benchmarks you come up with if you
>>merge the select+poll.  Also, it may not be valid at all on Windows
>>hosts since there is a question about select() being interrupted
>>properly on those hosts - it should work on Linux/BSD.
>>
>>Regards,
>>
>>Leo Reiter
>>
>>P.S. this patch should be applied with -p1, not -p0 like my newer
>>patches are applied.  Sorry for that - like I said, it's quite old.
>>
> 
> 
> 
> _______________________________________________
> Qemu-devel mailing list
> Qemu-devel@nongnu.org
> http://lists.nongnu.org/mailman/listinfo/qemu-devel

-- 
Leonardo E. Reiter
Vice President of Product Development, CTO

Win4Lin, Inc.
Virtual Computing from Desktop to Data Center
Main: +1 512 339 7979
Fax: +1 512 532 6501
http://www.win4lin.com


* [Qemu-devel] Re: Network Performance between Win Host and Linux
  2006-04-11 17:20 [Qemu-devel] Network Performance between Win Host and Linux Kenneth Duda
  2006-04-11 17:28 ` Paul Brook
@ 2006-04-11 22:36 ` Kenneth Duda
  2006-04-12 14:04   ` Leonardo E. Reiter
  2006-04-12 14:31   ` Leonardo E. Reiter
  1 sibling, 2 replies; 15+ messages in thread
From: Kenneth Duda @ 2006-04-11 22:36 UTC (permalink / raw)
  To: qemu-devel

[-- Attachment #1: Type: text/plain, Size: 1476 bytes --]

The "qemu-slirp-performance" patch contains three improvements to qemu
slirp networking performance.  Booting my virtual machine (which
NFS-mounts its root filesystem from the host) has been accelerated by
8x, from over 5 minutes to 40 seconds.  TCP throughput has been
accelerated from about 2 megabytes/sec to 9 megabytes/sec, in both
directions (measured using a simple python script).  The system is
subjectively more responsive (for activities such as logging in or
running simple python scripts).

The specific problems fixed are:

   - the mss for the slirp-to-vm direction was 512 bytes (now 1460);
   - qemu would block in select() for up to four milliseconds at a
time, even when data was waiting on slirp sockets;
   - slirp was deliberately delaying acks until timer expiration
(TF_DELACK), preventing the vm from opening its send window, in
violation of rfc2581.

These fixes are together in one patch (qemu-slirp-performance.patch).

I have also attached some related patches that fix fairly serious
slirp bugs for IP datagrams larger than 4k.  Before these patches,
large packets can corrupt the heap or get reassembled in some
entertaining but incorrect orders (because ip_off was being sorted as
though it was signed!)  These patches are attached in the order I
apply them.

I hope they are helpful.  If there's anything I can do to make them
more likely to be accepted into the mainline, please let me know.

Thanks,
    -Ken

[-- Attachment #2: qemu-slirp-mbuf-bug.patch --]
[-- Type: text/plain, Size: 888 bytes --]

diff -BurN qemu-snapshot-2006-03-27_23.orig/slirp/mbuf.c qemu-snapshot-2006-03-27_23/slirp/mbuf.c
--- qemu-snapshot-2006-03-27_23.orig/slirp/mbuf.c	2004-04-22 00:10:47.000000000 +0000
+++ qemu-snapshot-2006-03-27_23/slirp/mbuf.c	2006-04-05 13:03:03.000000000 +0000
@@ -146,18 +146,19 @@
         struct mbuf *m;
         int size;
 {
+	int datasize;
+
 	/* some compiles throw up on gotos.  This one we can fake. */
         if(m->m_size>size) return;
 
         if (m->m_flags & M_EXT) {
-	  /* datasize = m->m_data - m->m_ext; */
+	  datasize = m->m_data - m->m_ext;
 	  m->m_ext = (char *)realloc(m->m_ext,size);
 /*		if (m->m_ext == NULL)
  *			return (struct mbuf *)NULL;
  */		
-	  /* m->m_data = m->m_ext + datasize; */
+	  m->m_data = m->m_ext + datasize;
         } else {
-	  int datasize;
 	  char *dat;
 	  datasize = m->m_data - m->m_dat;
 	  dat = (char *)malloc(size);





[-- Attachment #3: qemu-slirp-reassembly-bug.patch --]
[-- Type: text/plain, Size: 467 bytes --]

diff -BurN qemu-snapshot-2006-03-27_23.orig/slirp/ip_input.c qemu-snapshot-2006-03-27_23/slirp/ip_input.c
--- qemu-snapshot-2006-03-27_23.orig/slirp/ip_input.c	2004-04-22 00:10:47.000000000 +0000
+++ qemu-snapshot-2006-03-27_23/slirp/ip_input.c	2006-04-06 06:02:52.000000000 +0000
@@ -344,8 +344,8 @@
 	while (q != (struct ipasfrag *)fp) {
 	  struct mbuf *t;
 	  t = dtom(q);
-	  m_cat(m, t);
 	  q = (struct ipasfrag *) q->ipf_next;
+	  m_cat(m, t);
 	}
 
 	/*





[-- Attachment #4: qemu-slirp-32k-packets.patch --]
[-- Type: text/plain, Size: 4434 bytes --]

diff -burN qemu-snapshot-2006-03-27_23.orig/slirp/ip.h qemu-snapshot-2006-03-27_23/slirp/ip.h
--- qemu-snapshot-2006-03-27_23.orig/slirp/ip.h	2004-04-21 17:10:47.000000000 -0700
+++ qemu-snapshot-2006-03-27_23/slirp/ip.h	2006-04-06 00:28:49.000000000 -0700
@@ -79,6 +79,11 @@
  * We declare ip_len and ip_off to be short, rather than u_short
  * pragmatically since otherwise unsigned comparisons can result
  * against negative integers quite easily, and fail in subtle ways.
+ *
+ * The only problem with the above theory is that these quantities
+ * are in fact unsigned, and sorting fragments by a signed version
+ * of ip_off doesn't work very well, nor does length checks on
+ * ip packets with a signed version of their length!
  */
 struct ip {
 #ifdef WORDS_BIGENDIAN
@@ -101,6 +106,9 @@
 	struct	in_addr ip_src,ip_dst;	/* source and dest address */
 };
 
+#define IP_OFF(ip) (*(u_int16_t *)&((ip)->ip_off))
+#define IP_LEN(ip) (*(u_int16_t *)&((ip)->ip_len))
+
 #define	IP_MAXPACKET	65535		/* maximum packet size */
 
 /*
diff -burN qemu-snapshot-2006-03-27_23.orig/slirp/ip_input.c qemu-snapshot-2006-03-27_23/slirp/ip_input.c
--- qemu-snapshot-2006-03-27_23.orig/slirp/ip_input.c	2004-04-21 17:10:47.000000000 -0700
+++ qemu-snapshot-2006-03-27_23/slirp/ip_input.c	2006-04-06 00:32:19.000000000 -0700
@@ -111,7 +111,7 @@
 	 * Convert fields to host representation.
 	 */
 	NTOHS(ip->ip_len);
-	if (ip->ip_len < hlen) {
+	if (IP_LEN(ip) < hlen) {
 		ipstat.ips_badlen++;
 		goto bad;
 	}
@@ -124,13 +124,13 @@
 	 * Trim mbufs if longer than we expect.
 	 * Drop packet if shorter than we expect.
 	 */
-	if (m->m_len < ip->ip_len) {
+	if (m->m_len < IP_LEN(ip)) {
 		ipstat.ips_tooshort++;
 		goto bad;
 	}
 	/* Should drop packet if mbuf too long? hmmm... */
-	if (m->m_len > ip->ip_len)
-	   m_adj(m, ip->ip_len - m->m_len);
+	if (m->m_len > IP_LEN(ip))
+	   m_adj(m, IP_LEN(ip) - m->m_len);
 
 	/* check ip_ttl for a correct ICMP reply */
 	if(ip->ip_ttl==0 || ip->ip_ttl==1) {
@@ -191,7 +191,7 @@
 		 * or if this is not the first fragment,
 		 * attempt reassembly; if it succeeds, proceed.
 		 */
-		if (((struct ipasfrag *)ip)->ipf_mff & 1 || ip->ip_off) {
+		if (((struct ipasfrag *)ip)->ipf_mff & 1 || IP_OFF(ip)) {
 			ipstat.ips_fragments++;
 			ip = ip_reass((struct ipasfrag *)ip, fp);
 			if (ip == 0)
@@ -281,7 +281,7 @@
 	 */
 	for (q = (struct ipasfrag *)fp->ipq_next; q != (struct ipasfrag *)fp;
 	    q = (struct ipasfrag *)q->ipf_next)
-		if (q->ip_off > ip->ip_off)
+		if (IP_OFF(q) > IP_OFF(ip))
 			break;
 
 	/*
@@ -290,10 +290,10 @@
 	 * segment.  If it provides all of our data, drop us.
 	 */
 	if (q->ipf_prev != (ipasfragp_32)fp) {
-		i = ((struct ipasfrag *)(q->ipf_prev))->ip_off +
-		  ((struct ipasfrag *)(q->ipf_prev))->ip_len - ip->ip_off;
+		i = IP_OFF((struct ipasfrag *)(q->ipf_prev)) +
+		  IP_LEN((struct ipasfrag *)(q->ipf_prev)) - IP_OFF(ip);
 		if (i > 0) {
-			if (i >= ip->ip_len)
+			if (i >= IP_LEN(ip))
 				goto dropfrag;
 			m_adj(dtom(ip), i);
 			ip->ip_off += i;
@@ -305,9 +305,9 @@
 	 * While we overlap succeeding segments trim them or,
 	 * if they are completely covered, dequeue them.
 	 */
-	while (q != (struct ipasfrag *)fp && ip->ip_off + ip->ip_len > q->ip_off) {
-		i = (ip->ip_off + ip->ip_len) - q->ip_off;
-		if (i < q->ip_len) {
+	while (q != (struct ipasfrag *)fp && IP_OFF(ip) + IP_LEN(ip) > IP_OFF(q)) {
+		i = (IP_OFF(ip) + IP_LEN(ip)) - IP_OFF(q);
+		if (i < IP_LEN(q)) {
 			q->ip_len -= i;
 			q->ip_off += i;
 			m_adj(dtom(q), i);
@@ -327,9 +327,9 @@
 	next = 0;
 	for (q = (struct ipasfrag *) fp->ipq_next; q != (struct ipasfrag *)fp;
 	     q = (struct ipasfrag *) q->ipf_next) {
-		if (q->ip_off != next)
+		if (IP_OFF(q) != next)
 			return (0);
-		next += q->ip_len;
+		next += IP_LEN(q);
 	}
 	if (((struct ipasfrag *)(q->ipf_prev))->ipf_mff & 1)
 		return (0);
diff -burN qemu-snapshot-2006-03-27_23.orig/slirp/udp.c qemu-snapshot-2006-03-27_23/slirp/udp.c
--- qemu-snapshot-2006-03-27_23.orig/slirp/udp.c	2006-04-06 00:24:30.000000000 -0700
+++ qemu-snapshot-2006-03-27_23/slirp/udp.c	2006-04-06 00:32:55.000000000 -0700
@@ -111,12 +111,12 @@
 	 */
 	len = ntohs((u_int16_t)uh->uh_ulen);
 
-	if (ip->ip_len != len) {
-		if (len > ip->ip_len) {
+	if (IP_LEN(ip) != len) {
+		if (len > IP_LEN(ip)) {
 			udpstat.udps_badlen++;
 			goto bad;
 		}
-		m_adj(m, len - ip->ip_len);
+		m_adj(m, len - IP_LEN(ip));
 		ip->ip_len = len;
 	}
 	





[-- Attachment #5: qemu-slirp-performance.patch --]
[-- Type: text/plain, Size: 3196 bytes --]

diff -BurN qemu-snapshot-2006-03-27_23.orig/slirp/tcp.h qemu-snapshot-2006-03-27_23/slirp/tcp.h
--- qemu-snapshot-2006-03-27_23.orig/slirp/tcp.h	2004-04-21 17:10:47.000000000 -0700
+++ qemu-snapshot-2006-03-27_23/slirp/tcp.h	2006-04-11 15:22:05.000000000 -0700
@@ -100,8 +100,10 @@
  * With an IP MSS of 576, this is 536,
  * but 512 is probably more convenient.
  * This should be defined as MIN(512, IP_MSS - sizeof (struct tcpiphdr)).
+ *
+ * We make this 1460 because we only care about Ethernet in the qemu context.
  */
-#define	TCP_MSS	512
+#define	TCP_MSS	1460
 
 #define	TCP_MAXWIN	65535	/* largest value for (unscaled) window */
 
diff -BurN qemu-snapshot-2006-03-27_23.orig/slirp/tcp_input.c qemu-snapshot-2006-03-27_23/slirp/tcp_input.c
--- qemu-snapshot-2006-03-27_23.orig/slirp/tcp_input.c	2004-10-07 16:27:35.000000000 -0700
+++ qemu-snapshot-2006-03-27_23/slirp/tcp_input.c	2006-04-11 15:22:05.000000000 -0700
@@ -580,28 +580,11 @@
 			 *	congestion avoidance sender won't send more until
 			 *	he gets an ACK.
 			 * 
-			 * Here are 3 interpretations of what should happen.
-			 * The best (for me) is to delay-ack everything except
-			 * if it's a one-byte packet containing an ESC
-			 * (this means it's an arrow key (or similar) sent using
-			 * Nagel, hence there will be no echo)
-			 * The first of these is the original, the second is the
-			 * middle ground between the other 2
+			 * It is better to not delay acks at all to maximize
+			 * TCP throughput.  See RFC 2581.
 			 */ 
-/*			if (((unsigned)ti->ti_len < tp->t_maxseg)) {
- */			     
-/*			if (((unsigned)ti->ti_len < tp->t_maxseg && 
- *			     (so->so_iptos & IPTOS_LOWDELAY) == 0) ||
- *			    ((so->so_iptos & IPTOS_LOWDELAY) && 
- *			     ((struct tcpiphdr_2 *)ti)->first_char == (char)27)) {
- */
-			if ((unsigned)ti->ti_len == 1 &&
-			    ((struct tcpiphdr_2 *)ti)->first_char == (char)27) {
-				tp->t_flags |= TF_ACKNOW;
-				tcp_output(tp);
-			} else {
-				tp->t_flags |= TF_DELACK;
-			}
+			tp->t_flags |= TF_ACKNOW;
+			tcp_output(tp);
 			return;
 		}
 	} /* header prediction */
diff -BurN qemu-snapshot-2006-03-27_23.orig/vl.c qemu-snapshot-2006-03-27_23/vl.c
--- qemu-snapshot-2006-03-27_23.orig/vl.c	2006-04-11 15:21:27.000000000 -0700
+++ qemu-snapshot-2006-03-27_23/vl.c	2006-04-11 15:22:05.000000000 -0700
@@ -4026,7 +4026,7 @@
 void main_loop_wait(int timeout)
 {
     IOHandlerRecord *ioh, *ioh_next;
-    fd_set rfds, wfds;
+    fd_set rfds, wfds, xfds;
     int ret, nfds;
     struct timeval tv;
 
@@ -4041,6 +4041,7 @@
     nfds = -1;
     FD_ZERO(&rfds);
     FD_ZERO(&wfds);
+    FD_ZERO(&xfds);
     for(ioh = first_io_handler; ioh != NULL; ioh = ioh->next) {
         if (ioh->fd_read &&
             (!ioh->fd_read_poll ||
@@ -4062,7 +4063,12 @@
 #else
     tv.tv_usec = timeout * 1000;
 #endif
-    ret = select(nfds + 1, &rfds, &wfds, NULL, &tv);
+#if defined(CONFIG_SLIRP)
+    if (slirp_inited) {
+        slirp_select_fill(&nfds, &rfds, &wfds, &xfds);
+    }
+#endif
+    ret = select(nfds + 1, &rfds, &wfds, &xfds, &tv);
     if (ret > 0) {
         /* XXX: better handling of removal */
         for(ioh = first_io_handler; ioh != NULL; ioh = ioh_next) {





^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Qemu-devel] Network Performance between Win Host and Linux
  2006-04-11 21:58         ` Leonardo E. Reiter
@ 2006-04-11 22:42           ` Kenneth Duda
  0 siblings, 0 replies; 15+ messages in thread
From: Kenneth Duda @ 2006-04-11 22:42 UTC (permalink / raw)
  To: qemu-devel

I was confused by the comments around the delaying of acks.  Delaying
these acks didn't make intuitive sense to me and is inconsistent with
RFC 2581, which states:

   ... a TCP receiver MUST NOT excessively delay
   acknowledgments.  Specifically, an ACK SHOULD be generated for at
   least every second full-sized segment, and MUST be generated within
   500 ms of the arrival of the first unacknowledged packet.

I have implemented things so that acks are never delayed, which is
simplest and seems fine in the environment where I imagine
slirp-within-qemu is being used (simulated ethernets).  I'm interested
in other viewpoints.
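
In code terms, the patch replaces slirp's escape-character heuristic with an
unconditional immediate ACK.  A minimal side-by-side sketch of the two
policies (the helper names are mine for illustration, not slirp's actual
functions):

```c
#include <assert.h>

/* Flag values mirroring slirp's t_flags bits (illustrative only). */
#define TF_ACKNOW 0x01  /* send an ACK immediately */
#define TF_DELACK 0x02  /* delay the ACK until the periodic timer fires */

/* Old slirp policy: ACK at once only for a 1-byte segment starting with
 * ESC (an interactive arrow-key press); everything else waits for the
 * timer, stalling the sender's window growth. */
static int old_ack_policy(unsigned seg_len, char first_char)
{
    if (seg_len == 1 && first_char == (char)27)
        return TF_ACKNOW;
    return TF_DELACK;
}

/* New policy per the patch: never delay, so the sender's congestion
 * window opens as fast as RFC 2581 allows. */
static int new_ack_policy(unsigned seg_len, char first_char)
{
    (void)seg_len;
    (void)first_char;
    return TF_ACKNOW;
}
```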

    -Ken


On 4/11/06, Leonardo E. Reiter <lreiter@win4lin.com> wrote:
> Yes... I sent a follow-up note after I looked at the latest vl.c with a
> newer patch applied.  Much simpler.
>
> As for the delay acks, I've seen this and removed the delay for testing
> before.  I read in the comment (not sure if it was Fabrice or the slirp
> author) about how the delay was 1 of 3 methods that had been chosen as
> sort of a "compromise."  I recall testing newer versions of the code and
> not having as much of an issue with the delayed ack as before, so I
> figured Paul's performance fixes had addressed that somewhat (they
> definitely helped tremendously for receiving data).  In any case, it's
> good that you are taking a scientific approach to addressing this.  I
> personally think that slirp is a great idea for networking, for most
> uses, because it's totally in userspace, etc., etc.  But let's keep in
> mind that the original code was designed to meet the performance
> criteria of a serial line ;)  The work you are doing should help in
> bringing that more up to date.  I'd be glad to help with any testing
> if/when you have patches.
>
> Thanks,
>
> Leo Reiter
>
> Kenneth Duda wrote:
> > Thanks, Leo.  It appears your patch or something similar has made it
> > into 0.8.0.  I have already merged the select loops, but it didn't
> > help as much as I hoped, maybe 10%.  A much bigger improvement was
> > made by fixing the badly hacked slirp DELACK behavior.  Believe it or
> > not, slirp delays all TCP acks *unless* the segment data starts with
> > an escape character, I kid you not.  I threw that out, and have made
> > slirp's tcp_input rfc2581 compliant (to my shallow reading of the rfc)
> > and that boosted throughput from vm->host by 3.5x, to 56 megabits
> > (from 16 megabits).  The performance from host->vm was helped less,
> > and that was because of another hack in slirp that was causing it to
> > get the wrong MSS --- it was sending 512 byte segments.  Now, I'm
> > looking at excessive numbers of retransmissions (believe it or not)
> > --- I suspect the ne2000 ring buffer is overflowing but I'm not yet
> > sure.  I will post a patch including all of these things when I'm
> > done.  I'm expecting a significant aggregate improvement.
> >
> >      -Ken
> >
> > On 4/11/06, Leonardo E. Reiter <lreiter@win4lin.com> wrote:
> >
> >>Hi Ken,
> >>
> >>I'm attaching a pretty old patch I made (from the 0.7.1 days), which did
> >>a quick and dirty merge of the select's.  It's not something that is
> >>clean and it will need adapting to 0.8.0... but, I figure you could draw
> >>some quick hints on how to merge the 2.  Basically it fills the select
> >>bitmaps when it walks through the fd's the first time, then calls select
> >>instead of poll.  It also has slirp fill its own bits (fd's) in before
> >>calling select.  So this is condensed to 1 select call.
> >>
> >>Do what you want with the code - like I said, it's messy and old.  But
> >>maybe you can at least use it to quickly test your hypothesis.  I'd be
> >>interested in learning about any benchmarks you come up with if you
> >>merge the select+poll.  Also, it may not be valid at all on Windows
> >>hosts since there is a question about select() being interrupted
> >>properly on those hosts - it should work on Linux/BSD.
> >>
> >>Regards,
> >>
> >>Leo Reiter
> >>
> >>P.S. this patch should be applied with -p1, not -p0 like my newer
> >>patches are applied.  Sorry for that - like I said, it's quite old.
> >>
> >
> >
> >
> > _______________________________________________
> > Qemu-devel mailing list
> > Qemu-devel@nongnu.org
> > http://lists.nongnu.org/mailman/listinfo/qemu-devel
>
> --
> Leonardo E. Reiter
> Vice President of Product Development, CTO
>
> Win4Lin, Inc.
> Virtual Computing from Desktop to Data Center
> Main: +1 512 339 7979
> Fax: +1 512 532 6501
> http://www.win4lin.com
>
>
> _______________________________________________
> Qemu-devel mailing list
> Qemu-devel@nongnu.org
> http://lists.nongnu.org/mailman/listinfo/qemu-devel
>
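
For anyone following along, the merged-select structure Leo describes (fill
all the fd bitmaps first, slirp's included, then issue one select) can be
sketched like this; fake_slirp_fill() and merged_wait() are stand-ins of
mine, not qemu's actual slirp_select_fill() or main_loop_wait():

```c
#include <sys/select.h>
#include <unistd.h>
#include <assert.h>

/* Stand-in for slirp_select_fill(): slirp adds its own sockets to the
 * same bitmap qemu's I/O handlers use, so a single select() covers both. */
static void fake_slirp_fill(int slirp_fd, int *nfds, fd_set *rfds)
{
    FD_SET(slirp_fd, rfds);
    if (slirp_fd > *nfds)
        *nfds = slirp_fd;
}

/* One merged wait: returns select()'s count of ready descriptors. */
static int merged_wait(int ioh_fd, int slirp_fd, int timeout_ms, fd_set *rfds)
{
    int nfds = -1;
    struct timeval tv = { timeout_ms / 1000, (timeout_ms % 1000) * 1000 };

    FD_ZERO(rfds);
    FD_SET(ioh_fd, rfds);                    /* qemu's own I/O handler fd */
    if (ioh_fd > nfds)
        nfds = ioh_fd;
    fake_slirp_fill(slirp_fd, &nfds, rfds);  /* slirp's sockets, same set */
    return select(nfds + 1, rfds, NULL, NULL, &tv);
}
```

A write on either descriptor wakes the single select() immediately, which is
exactly the property that removes the fixed per-iteration delay.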


* Re: [Qemu-devel] Network Performance between Win Host and Linux
  2006-04-11 18:19     ` Helmut Auer
@ 2006-04-12  2:10       ` Kazu
  0 siblings, 0 replies; 15+ messages in thread
From: Kazu @ 2006-04-12  2:10 UTC (permalink / raw)
  To: qemu-devel


Hi,

I have already made a patch. Try this.

http://lists.gnu.org/archive/html/qemu-devel/2006-03/msg00041.html

Regards,
Kazu

Sent: Wednesday, April 12, 2006 3:19 AM Helmut Auer wrote:

> Hello
>> In my case, the guest CPU is idle.  The host CPU utilization is only 5
>> or 10 percent when running "find / -print > /dev/null" on the guest. 
>> So I don't think guest interrupt latency is the issue for me in this
>> case.
>>   
> In my environment, the throughput of about 300 KB is the good case,
> which means the CPU is idle. When the CPU is busy, it degrades to 20 KB
> in the worst case.
> As I said before, with the tap-patched qemu 0.7.2 it is about 10 times
> faster.
> 
> -- 
> Helmut Auer, helmut@helmutauer.de 
> 


* Re: [Qemu-devel] Re: Network Performance between Win Host and Linux
  2006-04-11 22:36 ` [Qemu-devel] " Kenneth Duda
@ 2006-04-12 14:04   ` Leonardo E. Reiter
  2006-04-12 18:19     ` Kenneth Duda
  2006-04-12 14:31   ` Leonardo E. Reiter
  1 sibling, 1 reply; 15+ messages in thread
From: Leonardo E. Reiter @ 2006-04-12 14:04 UTC (permalink / raw)
  To: qemu-devel

Hi Ken,

(all) the patches seem to work very well and be very stable with Windows 
2000 guests here.  I measured some SMB over TCP/IP transfers, and got 
about a 1.5x downstream improvement and a 2x upstream improvement.  You 
will likely get more boost from less convoluted protocols like FTP or 
something, but I didn't get around to testing that.  Plus it's not clear 
how much Windows itself is impeding the bandwidth.  I am using 
-kernel-kqemu.

2 additional things I noticed:

1. before your patches, the upstream transfers (guest->host) consumed 
almost no CPU at all, but of course were much slower.  Now, about half 
the CPU gets used under heavy upstream load.  The downstream, with 
Windows guests at least, consumes 100% CPU the same as before.  I 
suspect you addressed this specifically with your select hack to avoid 
the delay if there is pending slirp activity.

2. overall latency "feels" improved as well, at least for basic stuff 
like web browsing, etc.  This is purely subjective.

Nice work!  I'll be testing with a Linux VM soon and try to pin down 
some better benchmarks, free of Windows clutter.

- Leo Reiter

Kenneth Duda wrote:
> The "qemu-slirp-performance" patch contains three improvements to qemu
> slirp networking performance.  Booting my virtual machine (which
> NFS-mounts its root filesystem from the host) has been accelerated by
> 8x, from over 5 minutes to 40 seconds.  TCP throughput has been
> accelerated from about 2 megabytes/sec to 9 megabytes/sec, in both
> directions (measured using a simple python script).  The system is
> subjectively more responsive (for activities such as logging in or
> running simple python scripts).
> 
> The specific problems fixed are:
> 
>    - the mss for the slirp-to-vm direction was 512 bytes (now 1460);
>    - qemu would block in select() for up to four milliseconds at a
> time, even when data was waiting on slirp sockets;
>    - slirp was deliberately delaying acks until timer expiration
> (TF_DELACK), preventing the vm from opening its send window, in
> violation of rfc2581.
> 
> These fixes are together in one patch (qemu-slirp-performance.patch).
><snip> 
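
For what it's worth, the two MSS figures in the first item fall straight out
of header arithmetic: 1460 is the Ethernet MTU minus the IP and TCP headers,
and 512 is the conservative MIN(512, IP_MSS - headers) default quoted in the
tcp.h comment.  A quick check (constants are illustrative, not slirp's):

```c
#include <assert.h>

enum {
    ETH_MTU = 1500,  /* Ethernet payload size */
    IP_HDR  = 20,    /* IPv4 header, no options */
    TCP_HDR = 20,    /* TCP header, no options */
    IP_MSS  = 576    /* minimum IP datagram size from the tcp.h comment */
};

/* MSS usable on an Ethernet link: MTU minus IP and TCP headers. */
static int eth_mss(void)
{
    return ETH_MTU - IP_HDR - TCP_HDR;
}

/* The conservative default slirp used before the patch:
 * MIN(512, IP_MSS - 40), i.e. MIN(512, 536). */
static int old_mss(void)
{
    int m = IP_MSS - (IP_HDR + TCP_HDR);  /* 536 */
    return m < 512 ? m : 512;
}
```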

-- 
Leonardo E. Reiter
Vice President of Product Development, CTO

Win4Lin, Inc.
Virtual Computing from Desktop to Data Center
Main: +1 512 339 7979
Fax: +1 512 532 6501
http://www.win4lin.com


* Re: [Qemu-devel] Re: Network Performance between Win Host and Linux
  2006-04-11 22:36 ` [Qemu-devel] " Kenneth Duda
  2006-04-12 14:04   ` Leonardo E. Reiter
@ 2006-04-12 14:31   ` Leonardo E. Reiter
  1 sibling, 0 replies; 15+ messages in thread
From: Leonardo E. Reiter @ 2006-04-12 14:31 UTC (permalink / raw)
  To: qemu-devel

On an additional note, Windows host users may want to try moving the 
arbitrary Sleep() in main_loop_wait() to the end of the function, and 
making that conditional if there are no I/O events pending.  Otherwise, 
there is a fixed penalty and this does not take advantage of Ken's new 
patch to avoid the delay if there are pending slirp requests for 
example.  When I have some time I will see if there is a better way to 
multiplex poll in Windows, so that you can use something like select() 
but still get interrupted.  There might for example be something 
relevant in the newer winsock libraries (i.e. V2), but of course it has 
to be general enough to work on any type of fd.
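
Sketched out, the idea is to dispatch any ready handlers first and pay the
Sleep() penalty only when nothing was pending.  This is a hypothetical
restructuring with Sleep() stubbed out for portability, not the actual vl.c
code:

```c
#include <assert.h>

static int sleeps_taken;  /* counts calls to the stubbed Sleep() */

/* Stand-in for the Win32 Sleep() call. */
static void win_sleep(int ms)
{
    (void)ms;
    sleeps_taken++;
}

/* Tail of a hypothetical main_loop_wait(): 'ready' is the select()
 * return value.  If I/O is pending, service it and skip the artificial
 * delay entirely; only an idle iteration yields the CPU. */
static void loop_tail(int ready, int timeout_ms)
{
    if (ready > 0) {
        /* ... dispatch fd_read/fd_write handlers here ... */
        return;               /* pending I/O: no fixed penalty */
    }
    win_sleep(timeout_ms);    /* idle: yield as before */
}
```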

I apologize but I have not yet been able to successfully build QEMU on 
Windows, even after mucking with the mingw stuff.  I probably need to 
spend more time on it at some point.  But if anyone is using Windows and 
can compile QEMU from source, you can try moving the Sleep to see if 
that helps, especially after applying Ken's new patches.  Actually 
Kazu's patch for TAP performance addresses this for TAP for example, so 
it should be easy to adapt to slirp... the code is in very close proximity.

- Leo Reiter

Kenneth Duda wrote:
> The "qemu-slirp-performance" patch contains three improvements to qemu
> slirp networking performance.  Booting my virtual machine (which
> NFS-mounts its root filesystem from the host) has been accelerated by
> 8x, from over 5 minutes to 40 seconds.  TCP throughput has been
> accelerated from about 2 megabytes/sec to 9 megabytes/sec, in both
> directions (measured using a simple python script).  The system is
> subjectively more responsive (for activities such as logging in or
> running simple python scripts).

-- 
Leonardo E. Reiter
Vice President of Product Development, CTO

Win4Lin, Inc.
Virtual Computing from Desktop to Data Center
Main: +1 512 339 7979
Fax: +1 512 532 6501
http://www.win4lin.com


* Re: [Qemu-devel] Re: Network Performance between Win Host and Linux
  2006-04-12 14:04   ` Leonardo E. Reiter
@ 2006-04-12 18:19     ` Kenneth Duda
  2006-04-12 18:26       ` Leonardo E. Reiter
  0 siblings, 1 reply; 15+ messages in thread
From: Kenneth Duda @ 2006-04-12 18:19 UTC (permalink / raw)
  To: qemu-devel

Leo, thank you for exercising this stuff.

> 1. before your patches, the upstream transfers (guest->host) consumed
> almost no CPU at all, but of course were much slower.  Now, about half
> the CPU gets used under heavy upstream load.

I am surprised that only half the CPU gets consumed --- that suggests
there's another factor of two improvement waiting to be made.  If you
see anything like this with Linux-on-Linux, please let me know and
I'll try to track it down.

Separately, I'm curious about the path for getting these changes into
the qemu mainline.  If that's something you're in tune with and are in
the mood to summarize for me, I'd appreciate that.  We love qemu but
there are some rough edges and I think we have something like 16
patches we're maintaining internally, many of which might be helpful
for others.

     -Ken

On 4/12/06, Leonardo E. Reiter <lreiter@win4lin.com> wrote:
> Hi Ken,
>
> (all) the patches seem to work very well and be very stable with Windows
> 2000 guests here.  I measured some SMB over TCP/IP transfers, and got
> about a 1.5x downstream improvement and a 2x upstream improvement.  You
> will likely get more boost from less convoluted protocols like FTP or
> something, but I didn't get around to testing that.  Plus it's not clear
> how much Windows itself is impeding the bandwidth.  I am using
> -kernel-kqemu.
>
> 2 additional things I noticed:
>
> 1. before your patches, the upstream transfers (guest->host) consumed
> almost no CPU at all, but of course were much slower.  Now, about half
> the CPU gets used under heavy upstream load.  The downstream, with
> Windows guests at least, consumes 100% CPU the same as before.  I
> suspect you addressed this specifically with your select hack to avoid
> the delay if there is pending slirp activity
>
> 2. overall latency "feels" improved as well, at least for basic stuff
> like web browsing, etc.  This is purely subjective.
>
> Nice work!  I'll be testing with a Linux VM soon and try to pin down
> some better benchmarks, free of Windows clutter.
>
> - Leo Reiter


* Re: [Qemu-devel] Re: Network Performance between Win Host and Linux
  2006-04-12 18:19     ` Kenneth Duda
@ 2006-04-12 18:26       ` Leonardo E. Reiter
  0 siblings, 0 replies; 15+ messages in thread
From: Leonardo E. Reiter @ 2006-04-12 18:26 UTC (permalink / raw)
  To: qemu-devel

Ken,

I'll check that on Linux-on-Linux... it's likely just some Windows 
overhead.  Windows is my guest OS priority, which is why I tested on 
Windows.

As for getting patches into the mainline, this is a job for the 
maintainers.  Fabrice is the main person, but Paul Brook also merges a 
lot of patches in.  I'm not sure what their process is, or to what 
extent they communicate with each other.  I'm sure Paul and/or Fabrice 
would be kind enough to explain.  I agree that there are lots of pending 
patches... in the case of yours specifically though, since it's so 
sweeping, I would guess that it probably needs more field testing before 
it becomes mainline.

Regards,

Leo Reiter

Kenneth Duda wrote:
> Leo, thank you for exercising this stuff.
> 
> 
>>1. before your patches, the upstream transfers (guest->host) consumed
>>almost no CPU at all, but of course were much slower.  Now, about half
>>the CPU gets used under heavy upstream load.
> 
> 
> I am surprised that only half the CPU gets consumed --- that suggests
> there's another factor of two improvement waiting to be made.  If you
> see anything like this with Linux-on-Linux, please let me know and
> I'll try to track it down.
> 
> Separately, I'm curious about the path for getting these changes into
> the qemu mainline.  If that's something you're in tune with and are in
> the mood to summarize for me, I'd appreciate that.  We love qemu but
> there are some rough edges and I think we have something like 16
> patches we're maintaining internally, many of which might be helpful
> for others.
> 
>      -Ken
> 
> On 4/12/06, Leonardo E. Reiter <lreiter@win4lin.com> wrote:
> 
>>Hi Ken,
>>
>>(all) the patches seem to work very well and be very stable with Windows
>>2000 guests here.  I measured some SMB over TCP/IP transfers, and got
>>about a 1.5x downstream improvement and a 2x upstream improvement.  You
>>will likely get more boost from less convoluted protocols like FTP or
>>something, but I didn't get around to testing that.  Plus it's not clear
>>how much Windows itself is impeding the bandwidth.  I am using
>>-kernel-kqemu.
>>
>>2 additional things I noticed:
>>
>>1. before your patches, the upstream transfers (guest->host) consumed
>>almost no CPU at all, but of course were much slower.  Now, about half
>>the CPU gets used under heavy upstream load.  The downstream, with
>>Windows guests at least, consumes 100% CPU the same as before.  I
>>suspect you addressed this specifically with your select hack to avoid
>>the delay if there is pending slirp activity
>>
>>2. overall latency "feels" improved as well, at least for basic stuff
>>like web browsing, etc.  This is purely subjective.
>>
>>Nice work!  I'll be testing with a Linux VM soon and try to pin down
>>some better benchmarks, free of Windows clutter.
>>
>>- Leo Reiter
> 
> 
> 
> _______________________________________________
> Qemu-devel mailing list
> Qemu-devel@nongnu.org
> http://lists.nongnu.org/mailman/listinfo/qemu-devel

-- 
Leonardo E. Reiter
Vice President of Product Development, CTO

Win4Lin, Inc.
Virtual Computing from Desktop to Data Center
Main: +1 512 339 7979
Fax: +1 512 532 6501
http://www.win4lin.com


end of thread, other threads:[~2006-04-12 18:26 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
2006-04-11 17:20 [Qemu-devel] Network Performance between Win Host and Linux Kenneth Duda
2006-04-11 17:28 ` Paul Brook
2006-04-11 17:49   ` Kenneth Duda
2006-04-11 18:19     ` Helmut Auer
2006-04-12  2:10       ` Kazu
2006-04-11 20:40     ` Leonardo E. Reiter
2006-04-11 21:46       ` Kenneth Duda
2006-04-11 21:58         ` Leonardo E. Reiter
2006-04-11 22:42           ` Kenneth Duda
2006-04-11 21:00     ` Leonardo E. Reiter
2006-04-11 22:36 ` [Qemu-devel] " Kenneth Duda
2006-04-12 14:04   ` Leonardo E. Reiter
2006-04-12 18:19     ` Kenneth Duda
2006-04-12 18:26       ` Leonardo E. Reiter
2006-04-12 14:31   ` Leonardo E. Reiter
