Re: RFT: virtio_net: limit xmit polling

From: Roopa Prabhu <roprabhu@cisco.com>
To: "Michael S. Tsirkin" <mst@redhat.com>,
	Tom Lendacky <tahm@linux.vnet.ibm.com>
Cc: Krishna Kumar2 <krkumar2@in.ibm.com>,
	Christian Borntraeger <borntraeger@de.ibm.com>,
	Carsten Otte <cotte@de.ibm.com>, <habanero@linux.vnet.ibm.com>,
	Heiko Carstens <heiko.carstens@de.ibm.com>, <kvm@vger.kernel.org>,
	<lguest@lists.ozlabs.org>, <linux-kernel@vger.kernel.org>,
	<linux-s390@vger.kernel.org>, <linux390@de.ibm.com>,
	<netdev@vger.kernel.org>, Rusty Russell <rusty@rustcorp.com.au>,
	Martin Schwidefsky <schwidefsky@de.ibm.com>, <steved@us.ibm.com>,
	<virtualization@lists.linux-foundation.org>,
	Shirley Ma <xma@us.ibm.com>
Subject: Re: RFT: virtio_net: limit xmit polling
Date: Thu, 14 Jul 2011 12:38:05 -0700
Message-ID: <CA4493AD.2E273%roprabhu@cisco.com>
In-Reply-To: <20110629084206.GA14627@redhat.com>




On 6/29/11 1:42 AM, "Michael S. Tsirkin" <mst@redhat.com> wrote:

> On Tue, Jun 28, 2011 at 11:08:07AM -0500, Tom Lendacky wrote:
>> On Sunday, June 19, 2011 05:27:00 AM Michael S. Tsirkin wrote:
>>> OK, different people seem to test different trees.  In the hope to get
>>> everyone on the same page, I created several variants of this patch so
>>> they can be compared. Whoever's interested, please check out the
>>> following, and tell me how these compare:
>>> 
>>> kernel:
>>> 
>>> git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost.git
>>> 
>>> virtio-net-limit-xmit-polling/base - this is net-next baseline to test against
>>> virtio-net-limit-xmit-polling/v0 - fixes checks on out of capacity
>>> virtio-net-limit-xmit-polling/v1 - previous revision of the patch
>>>             this does xmit,free,xmit,2*free,free
>>> virtio-net-limit-xmit-polling/v2 - new revision of the patch
>>>             this does free,xmit,2*free,free
>>> 
>> 
>> Here's a summary of the results.  I've also attached an ODS format
>> spreadsheet (30 KB in size) that might be easier to analyze and also has
>> some pinned VM results data.  I broke the tests down into a local
>> guest-to-guest scenario and a remote host-to-guest scenario.
>> 
>> Within the local guest-to-guest scenario I ran:
>>   - TCP_RR tests using two different message sizes and four different
>>     instance counts among 1 pair of VMs and 2 pairs of VMs.
>>   - TCP_STREAM tests using four different message sizes and two different
>>     instance counts among 1 pair of VMs and 2 pairs of VMs.
>> 
>> Within the remote host-to-guest scenario I ran:
>>   - TCP_RR tests using two different message sizes and four different
>>     instance counts to 1 VM and 4 VMs.
>>   - TCP_STREAM and TCP_MAERTS tests using four different message sizes and
>>     two different instance counts to 1 VM and 4 VMs.
>> over a 10GbE link.
> 
> roprabhu, Tom,
> 
> Thanks very much for the testing. So on the first glance
> one seems to see a significant performance gain in V0 here,
> and a slightly less significant in V2, with V1
> being worse than base. But I'm afraid that's not the
> whole story, and we'll need to work some more to
> know what really goes on, please see below.
> 
> 
> Some comments on the results: I found out that V0, because of a mistake
> on my part, was actually almost identical to base.
> I pushed out virtio-net-limit-xmit-polling/v1a instead that
> actually does what I intended to check. However,
> the fact we get such a huge distribution in the results by Tom
> most likely means that the noise factor is very large.
> 
> 
> From my experience one way to get stable results is to
> divide the throughput by the host CPU utilization
> (measured by something like mpstat).
> Sometimes throughput doesn't increase (e.g. guest-host)
> but CPU utilization does decrease. So it's interesting.
> 
> 
> Another issue is that we are trying to improve the latency
> of a busy queue here. However STREAM/MAERTS tests ignore the latency
> (more or less) while TCP_RR by default runs a single packet per queue.
> Without arguing about whether these are practically interesting
> workloads, these results are thus unlikely to be significantly affected
> by the optimization in question.
> 
> What we are interested in, thus, is either TCP_RR with a -b flag
> (configure with  --enable-burst) or multiple concurrent
> TCP_RRs.
> 
> 
> 
Michael, below are some numbers I got from one round of runs.
Thanks,
Roopa

256-byte request/response.
vCPUs and IRQs were pinned to 4 cores, and the host CPU utilization is
the average across those 4 cores.

base:
Num of concurrent TCP_RRs   Transactions/sec   Host CPU util (%)
1                           7982.93            15.72
25                          67873              28.84
50                          112534             52.25
100                         192057             86.54


v1:
Num of concurrent TCP_RRs   Transactions/sec   Host CPU util (%)
1                           7970.94            10.8
25                          65496.8            28
50                          109858             53.22
100                         190155             87.5


v1a:
Num of concurrent TCP_RRs   Transactions/sec   Host CPU util (%)
1                           7979.81            9.5
25                          66786.1            28
50                          109552             51
100                         190876             88


v2:
Num of concurrent TCP_RRs   Transactions/sec   Host CPU util (%)
1                           7969.87            16.5
25                          67780.1            28.44
50                          114966             54.29
100                         177982             79.9
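
The quoted suggestion to divide throughput by host CPU utilization can be applied to the numbers above with a few lines of Python. This is only a sketch: the figures are copied from the tables in this message, the names RESULTS, tps_per_cpu and normalized are made up for illustration, and since the data comes from a single round of runs, small differences between variants may well be noise.

```python
# Transactions/sec per percent of host CPU, per variant and connection count.
# Tuples are (concurrent TCP_RRs, transactions/sec, host CPU util %),
# copied from the tables in this message. All names here are hypothetical.
RESULTS = {
    "base": [(1, 7982.93, 15.72), (25, 67873.0, 28.84),
             (50, 112534.0, 52.25), (100, 192057.0, 86.54)],
    "v1":   [(1, 7970.94, 10.8), (25, 65496.8, 28.0),
             (50, 109858.0, 53.22), (100, 190155.0, 87.5)],
    "v1a":  [(1, 7979.81, 9.5), (25, 66786.1, 28.0),
             (50, 109552.0, 51.0), (100, 190876.0, 88.0)],
    "v2":   [(1, 7969.87, 16.5), (25, 67780.1, 28.44),
             (50, 114966.0, 54.29), (100, 177982.0, 79.9)],
}


def tps_per_cpu(tps, cpu_util_pct):
    """Throughput normalized by host CPU utilization (in percent)."""
    return tps / cpu_util_pct


def normalized(results):
    """Map variant -> {concurrent TCP_RRs: transactions/sec per % CPU}."""
    return {
        variant: {n: tps_per_cpu(tps, cpu) for n, tps, cpu in rows}
        for variant, rows in results.items()
    }


if __name__ == "__main__":
    for variant, by_count in normalized(RESULTS).items():
        for n, eff in by_count.items():
            print(f"{variant:>4}  {n:>3} TCP_RRs  {eff:8.1f} trans/sec per % CPU")
```

Comparing rows with the same number of concurrent TCP_RRs across variants gives the normalized view; repeating the runs would be needed before reading much into any single row.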
