From: "Michael S. Tsirkin" <mst@redhat.com>
To: Krishna Kumar <krkumar2@in.ibm.com>
Cc: rusty@rustcorp.com.au, davem@davemloft.net,
netdev@vger.kernel.org, kvm@vger.kernel.org,
anthony@codemonkey.ws
Subject: Re: [RFC PATCH 0/4] Implement multiqueue virtio-net
Date: Wed, 8 Sep 2010 11:10:11 +0300
Message-ID: <20100908081011.GC23051@redhat.com>
In-Reply-To: <20100908072859.23769.97363.sendpatchset@krkumar2.in.ibm.com>
On Wed, Sep 08, 2010 at 12:58:59PM +0530, Krishna Kumar wrote:
> The following patches implement transmit mq in virtio-net. Also
> included are the userspace qemu changes.
>
> 1. This feature was first implemented with a single vhost.
> Testing showed 3-8% performance gain for up to 8 netperf
> sessions (and sometimes 16), but BW dropped with more
> sessions. However, implementing per-txq vhost improved
> BW significantly all the way to 128 sessions.
> 2. For this mq TX patch, 1 daemon is created for RX and 'n'
> daemons for the 'n' TXQ's, for a total of (n+1) daemons.
> The (subsequent) RX mq patch changes that to a total of
> 'n' daemons, where RX and TX vq's share 1 daemon.
> 3. Service Demand increases for TCP, but significantly
> improves for UDP.
> 4. Interoperability: Many combinations, but not all, of
> qemu, host, guest tested together.
>
>
> Enabling mq on virtio:
> -----------------------
>
> When the following options are passed to qemu:
> - smp > 1
> - vhost=on
> - mq=on (new option, default:off)
> then #txqueues = #cpus. The #txqueues can be changed by using
> an optional 'numtxqs' option. e.g. for a smp=4 guest:
> vhost=on,mq=on -> #txqueues = 4
> vhost=on,mq=on,numtxqs=8 -> #txqueues = 8
> vhost=on,mq=on,numtxqs=2 -> #txqueues = 2
>
>
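To make the option rules quoted above concrete, here is a minimal Python
sketch of how I read them. The function name num_txqueues is hypothetical,
and falling back to a single queue when mq is not in effect is my
assumption, not something the patch description states.

```python
from typing import Optional


def num_txqueues(smp: int, vhost: bool, mq: bool,
                 numtxqs: Optional[int] = None) -> int:
    """Return the TX queue count implied by the quoted option rules."""
    # mq takes effect only with smp > 1, vhost=on and mq=on.
    # Assumption: otherwise the device keeps a single TX queue.
    if not (smp > 1 and vhost and mq):
        return 1
    # An explicit numtxqs overrides the default of one queue per guest CPU.
    return numtxqs if numtxqs is not None else smp
```

For the smp=4 guest in the examples, this gives 4 queues by default and
8 or 2 queues when numtxqs=8 or numtxqs=2 is passed.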
> Performance (guest -> local host):
> -----------------------------------
>
> System configuration:
> Host: 8 Intel Xeon, 8 GB memory
> Guest: 4 cpus, 2 GB memory
> All testing without any tuning, and TCP netperf with 64K I/O
> _______________________________________________________________________________
>                               TCP (#numtxqs=2)
> N#    BW1     BW2     (%)        SD1     SD2     (%)        RSD1    RSD2    (%)
> _______________________________________________________________________________
> 4     26387   40716   (54.30)    20      28      (40.00)    86      85      (-1.16)
> 8     24356   41843   (71.79)    88      129     (46.59)    372     362     (-2.68)
> 16    23587   40546   (71.89)    375     564     (50.40)    1558    1519    (-2.50)
> 32    22927   39490   (72.24)    1617    2171    (34.26)    6694    5722    (-14.52)
> 48    23067   39238   (70.10)    3931    5170    (31.51)    15823   13552   (-14.35)
> 64    22927   38750   (69.01)    7142    9914    (38.81)    28972   26173   (-9.66)
> 96    22568   38520   (70.68)    16258   27844   (71.26)    65944   73031   (10.74)
That's a significant hit in TCP SD. Is it caused by the imbalance between
the number of queues for TX and RX? Since you mention RX is complete,
maybe measure with a balanced TX/RX?
> _______________________________________________________________________________
>                   UDP (#numtxqs=8)
> N#    BW1     BW2     (%)        SD1     SD2     (%)
> __________________________________________________________
> 4     29836   56761   (90.24)    67      63      (-5.97)
> 8     27666   63767   (130.48)   326     265     (-18.71)
> 16    25452   60665   (138.35)   1396    1269    (-9.09)
> 32    26172   63491   (142.59)   5617    4202    (-25.19)
> 48    26146   64629   (147.18)   12813   9316    (-27.29)
> 64    25575   65448   (155.90)   23063   16346   (-29.12)
> 128   26454   63772   (141.06)   91054   85051   (-6.59)
> __________________________________________________________
> N#: Number of netperf sessions, 90 sec runs
> BW1,SD1,RSD1: Bandwidth (sum across 2 runs in mbps), SD and Remote
> SD for original code
> BW2,SD2,RSD2: Bandwidth (sum across 2 runs in mbps), SD and Remote
> SD for new code. e.g. BW2=40716 means average BW2 was
> 20358 mbps.
>
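For anyone re-deriving the (%) columns: they appear to be the relative
change from the original code to the new code, and the BW figures are
sums across 2 runs. A small Python sketch under that reading, with
hypothetical helper names:

```python
def pct_change(old: float, new: float) -> float:
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100.0


def per_run_average(total: float, runs: int = 2) -> float:
    """Average per run, given a figure summed across `runs` runs."""
    return total / runs


# First TCP row: BW1=26387, BW2=40716 (sums across 2 runs, in mbps).
bw_gain = pct_change(26387, 40716)
bw2_avg = per_run_average(40716)
```

A positive value in an SD column means service demand got worse, which
is why the TCP SD numbers stand out.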
What happens with a single netperf session?
Host -> guest TCP performance and small-packet throughput
are also worth measuring.
> Next steps:
> -----------
>
> 1. mq RX patch is also complete - plan to submit once TX is OK.
> 2. Cache-align data structures: I didn't see any BW/SD improvement
> after making the sq's (and similarly for vhost) cache-aligned
> statically:
> struct virtnet_info {
> ...
> struct send_queue sq[16] ____cacheline_aligned_in_smp;
> ...
> };
>
At some level, host/guest communication is easy in that we don't really
care which queue is used. I would like to give some thought (and
testing) to how this is going to work with a real NIC and packet
steering at the backend.
Any ideas?
> Guest interrupts for a 4 TXQ device after a 5 min test:
> # egrep "virtio0|CPU" /proc/interrupts
>          CPU0     CPU1     CPU2     CPU3
>  40:        0        0        0        0   PCI-MSI-edge   virtio0-config
>  41:   126955   126912   126505   126940   PCI-MSI-edge   virtio0-input
>  42:   108583   107787   107853   107716   PCI-MSI-edge   virtio0-output.0
>  43:   300278   297653   299378   300554   PCI-MSI-edge   virtio0-output.1
>  44:   372607   374884   371092   372011   PCI-MSI-edge   virtio0-output.2
>  45:   162042   162261   163623   162923   PCI-MSI-edge   virtio0-output.3
Does this mean each interrupt is constantly bouncing between CPUs?
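One way to quantify this from the output above is to compute each CPU's
share of a given interrupt line. A rough Python sketch; the helper name
and the assumption that a /proc/interrupts row is whitespace-separated
with one count per CPU right after the IRQ number are mine, for
illustration only:

```python
from typing import List


def per_cpu_share(row: str, ncpus: int) -> List[float]:
    """Fraction of an IRQ's interrupts handled by each CPU."""
    fields = row.split()
    # fields[0] is the IRQ label such as "42:"; the next ncpus are counts.
    counts = [int(x) for x in fields[1:1 + ncpus]]
    total = sum(counts)
    return [c / total if total else 0.0 for c in counts]


# The virtio0-output.0 row from the quoted output, 4 guest CPUs.
row = "42: 108583 107787 107853 107716 PCI-MSI-edge virtio0-output.0"
shares = per_cpu_share(row, 4)
```

Shares close to 1/ncpus on every CPU, as in the rows quoted here, would
be consistent with the interrupt moving between CPUs rather than staying
on one.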
> Review/feedback appreciated.
>
> Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com>
> ---