From: "Michael S. Tsirkin" <mst@redhat.com>
To: Krishna Kumar <krkumar2@in.ibm.com>
Cc: rusty@rustcorp.com.au, davem@davemloft.net,
netdev@vger.kernel.org, kvm@vger.kernel.org,
anthony@codemonkey.ws
Subject: Re: [RFC PATCH 0/4] Implement multiqueue virtio-net
Date: Wed, 8 Sep 2010 11:10:11 +0300
Message-ID: <20100908081011.GC23051@redhat.com>
In-Reply-To: <20100908072859.23769.97363.sendpatchset@krkumar2.in.ibm.com>

On Wed, Sep 08, 2010 at 12:58:59PM +0530, Krishna Kumar wrote:
> The following patches implement transmit mq (multiqueue) in virtio-net.
> The corresponding userspace qemu changes are also included.
>
> 1. This feature was first implemented with a single vhost.
> Testing showed a 3-8% performance gain for up to 8 netperf
> sessions (and sometimes 16), but BW dropped with more
> sessions. However, implementing per-txq vhost improved
> BW significantly all the way to 128 sessions.
> 2. For this mq TX patch, 1 daemon is created for RX and 'n'
> daemons for the 'n' TXQ's, for a total of (n+1) daemons (see
> the sketch below). The (subsequent) RX mq patch changes that
> to a total of 'n' daemons, where RX and TX vq's share 1 daemon.
> 3. Service Demand increases for TCP, but significantly
> improves for UDP.
> 4. Interoperability: many, but not all, combinations of qemu,
> host and guest were tested together.
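>
> Roughly, the daemon layout from point 2 above looks like the sketch
> below (struct and field names are illustrative only, not the actual
> patch code):
>
> 	struct vhost_net_mq {
> 		struct vhost_virtqueue  rx_vq;                /* served by 1 RX daemon */
> 		struct vhost_virtqueue  tx_vq[MAX_TXQS];      /* 'n' TX vq's           */
> 		struct task_struct      *rx_worker;           /* the RX daemon         */
> 		struct task_struct      *tx_worker[MAX_TXQS]; /* one daemon per TXQ    */
> 		int                     ntxqs;                /* => n + 1 daemons      */
> 	};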
>
>
> Enabling mq on virtio:
> -----------------------
>
> When the following options are passed to qemu:
> - smp > 1
> - vhost=on
> - mq=on (new option, default:off)
> then #txqueues = #cpus. The #txqueues can be changed with the
> optional 'numtxqs' option, e.g. for an smp=4 guest:
> vhost=on,mq=on -> #txqueues = 4
> vhost=on,mq=on,numtxqs=8 -> #txqueues = 8
> vhost=on,mq=on,numtxqs=2 -> #txqueues = 2
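>
> i.e. the queue count is resolved roughly like this (simplified sketch,
> the function name is illustrative):
>
> 	static int resolve_num_txqs(bool mq_on, int numtxqs, int smp_cpus)
> 	{
> 		if (!mq_on)
> 			return 1;          /* mq=off (default): single TXQ */
> 		if (numtxqs)
> 			return numtxqs;    /* explicit numtxqs=N wins      */
> 		return smp_cpus;           /* default with mq=on: #cpus    */
> 	}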
>
>
> Performance (guest -> local host):
> -----------------------------------
>
> System configuration:
> Host: 8 Intel Xeon, 8 GB memory
> Guest: 4 cpus, 2 GB memory
> All testing was done without any tuning; TCP netperf used 64K I/O
> _______________________________________________________________________________
>                              TCP (#numtxqs=2)
> N#   BW1     BW2     (%)        SD1     SD2     (%)        RSD1    RSD2    (%)
> _______________________________________________________________________________
> 4    26387   40716   (54.30)    20      28      (40.00)    86      85      (-1.16)
> 8    24356   41843   (71.79)    88      129     (46.59)    372     362     (-2.68)
> 16   23587   40546   (71.89)    375     564     (50.40)    1558    1519    (-2.50)
> 32   22927   39490   (72.24)    1617    2171    (34.26)    6694    5722    (-14.52)
> 48   23067   39238   (70.10)    3931    5170    (31.51)    15823   13552   (-14.35)
> 64   22927   38750   (69.01)    7142    9914    (38.81)    28972   26173   (-9.66)
> 96   22568   38520   (70.68)    16258   27844   (71.26)    65944   73031   (10.74)
That's a significant hit in TCP SD. Is it caused by the imbalance between
number of queues for TX and RX? Since you mention RX is complete,
maybe measure with a balanced TX/RX?
> __________________________________________________________
>                   UDP (#numtxqs=8)
> N#    BW1     BW2     (%)         SD1     SD2     (%)
> __________________________________________________________
> 4     29836   56761   (90.24)     67      63      (-5.97)
> 8     27666   63767   (130.48)    326     265     (-18.71)
> 16    25452   60665   (138.35)    1396    1269    (-9.09)
> 32    26172   63491   (142.59)    5617    4202    (-25.19)
> 48    26146   64629   (147.18)    12813   9316    (-27.29)
> 64    25575   65448   (155.90)    23063   16346   (-29.12)
> 128   26454   63772   (141.06)    91054   85051   (-6.59)
> __________________________________________________________
> N#: Number of netperf sessions, 90 sec runs
> BW1,SD1,RSD1: Bandwidth (sum across 2 runs in mbps), SD and Remote
> SD for original code
> BW2,SD2,RSD2: Bandwidth (sum across 2 runs in mbps), SD and Remote
> SD for new code. e.g. BW2=40716 means average BW2 was
> 20358 mbps.
>
What happens with a single netperf session?
Host -> guest performance with TCP, and small-packet performance, are
also worth measuring.
> Next steps:
> -----------
>
> 1. mq RX patch is also complete - plan to submit once TX is OK.
> 2. Cache-align data structures: I didn't see any BW/SD improvement
> after making the sq's (and similarly for vhost) cache-aligned
> statically:
> 	struct virtnet_info {
> 		...
> 		struct send_queue sq[16] ____cacheline_aligned_in_smp;
> 		...
> 	};
>
At some level, host/guest communication is easy in that we don't really
care which queue is used. I would like to give some thought (and
testing) to how this is going to work with a real NIC card and packet
steering at the backend.
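For example, the guest could simply spread flows with a hash, along the
lines of skb_tx_hash() (illustrative sketch only, not from this patch):

	static u16 pick_txq(u32 flow_hash, u16 num_txqs)
	{
		/* scale the 32-bit flow hash onto [0, num_txqs) */
		return (u16)(((u64)flow_hash * num_txqs) >> 32);
	}

but then the NIC's RX-side steering needs to land the return traffic of
each flow on a queue/cpu that matches the guest's choice.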
Any idea?
> Guest interrupts for a 4 TXQ device after a 5 min test:
> # egrep "virtio0|CPU" /proc/interrupts
>            CPU0       CPU1       CPU2       CPU3
>  40:          0          0          0          0   PCI-MSI-edge  virtio0-config
>  41:     126955     126912     126505     126940   PCI-MSI-edge  virtio0-input
>  42:     108583     107787     107853     107716   PCI-MSI-edge  virtio0-output.0
>  43:     300278     297653     299378     300554   PCI-MSI-edge  virtio0-output.1
>  44:     372607     374884     371092     372011   PCI-MSI-edge  virtio0-output.2
>  45:     162042     162261     163623     162923   PCI-MSI-edge  virtio0-output.3
Does this mean each interrupt is constantly bouncing between CPUs?
> Review/feedback appreciated.
>
> Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com>
> ---