From: "Michael S. Tsirkin" <mst@redhat.com>
To: Anthony Liguori <aliguori@us.ibm.com>
Cc: netdev@vger.kernel.org, Tom Lendacky <toml@us.ibm.com>,
Cristian Viana <vianac@br.ibm.com>
Subject: Re: [PATCH 2/2] vhost-net: add a spin_threshold parameter
Date: Sun, 19 Feb 2012 16:51:01 +0200 [thread overview]
Message-ID: <20120219145100.GB16620@redhat.com> (raw)
In-Reply-To: <1329519726-25763-3-git-send-email-aliguori@us.ibm.com>
On Fri, Feb 17, 2012 at 05:02:06PM -0600, Anthony Liguori wrote:
> With workloads that are dominated by very high rates of small packets, we see
> considerable overhead in virtio notifications.
>
> The best strategy we've been able to come up with to deal with this is adaptive
> polling.
> This patch simply adds the infrastructure needed to experiment with
> polling strategies. It is not meant for inclusion.
>
> Here are the results with various polling values. The spinning is not currently
> a net win due to the high mutex contention caused by the broadcast wakeup. With
> a patch attempting to signal wakeup, we see up to 170+ transactions per second
> with TCP_RR 60 instance.
>
> N Baseline Spin 0 Spin 1000 Spin 5000
>
> TCP_RR
>
> 1 9,639.66 10,164.06 9,825.43 9,827.45 101.95%
> 10 62,819.55 54,059.78 63,114.30 60,767.23 96.73%
> 30 84,715.60 131,241.86 120,922.38 89,776.39 105.97%
> 60 124,614.71 148,720.66 158,678.08 141,400.05 113.47%
>
> UDP_RR
>
> 1 9,652.50 10,343.72 9,493.95 9,569.54 99.14%
> 10 53,830.26 58,235.90 50,145.29 48,820.53 90.69%
> 30 89,471.01 97,634.53 95,108.34 91,263.65 102.00%
> 60 103,640.59 164,035.01 157,002.22 128,646.73 124.13%
>
> TCP_STREAM
> 1 2,622.63 2,610.71 2,688.49 2,678.61 102.13%
> 4 4,928.02 4,812.05 4,971.00 5,104.57 103.58%
>
> 1 5,639.89 5,751.28 5,819.81 5,593.62 99.18%
> 4 5,874.72 6,575.55 6,324.87 6,502.33 110.68%
>
> 1 6,257.42 7,655.22 7,610.52 7,424.74 118.65%
> 4 5,370.78 6,044.83 5,784.23 6,209.93 115.62%
>
> 1 6,346.63 7,267.44 7,567.39 7,677.93 120.98%
> 4 5,198.02 5,657.12 5,528.94 5,792.42 111.44%
>
> TCP_MAERTS
>
> 1 2,091.38 1,765.62 2,142.56 2,312.94 110.59%
> 4 5,319.52 5,619.49 5,544.50 5,645.81 106.13%
>
> 1 7,030.66 7,593.61 7,575.67 7,622.07 108.41%
> 4 9,040.53 7,275.84 7,322.07 6,681.34 73.90%
>
> 1 9,160.93 9,318.15 9,065.82 8,586.82 93.73%
> 4 9,372.49 8,875.63 8,959.03 9,056.07 96.62%
>
> 1 9,183.28 9,134.02 8,945.12 8,657.72 94.28%
> 4 9,377.17 8,877.52 8,959.54 9,071.53 96.74%
An obvious question would be how are BW divided by CPU
numbers affected.
> Cc: Tom Lendacky <toml@us.ibm.com>
> Cc: Cristian Viana <vianac@br.ibm.com>
> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
> ---
> drivers/vhost/net.c | 14 ++++++++++++++
> 1 files changed, 14 insertions(+), 0 deletions(-)
>
> diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
> index 47175cd..e9e5866 100644
> --- a/drivers/vhost/net.c
> +++ b/drivers/vhost/net.c
> @@ -37,6 +37,10 @@ static int workers = 2;
> module_param(workers, int, 0444);
> MODULE_PARM_DESC(workers, "Set the number of worker threads");
>
> +static ulong spin_threshold = 0;
> +module_param(spin_threshold, ulong, 0444);
> +MODULE_PARM_DESC(spin_threshold, "The polling threshold for the tx queue");
> +
> /* Max number of bytes transferred before requeueing the job.
> * Using this limit prevents one virtqueue from starving others. */
> #define VHOST_NET_WEIGHT 0x80000
> @@ -65,6 +69,7 @@ struct vhost_net {
> * We only do this when socket buffer fills up.
> * Protected by tx vq lock. */
> enum vhost_net_poll_state tx_poll_state;
> + size_t spin_threshold;
> };
>
> static bool vhost_sock_zcopy(struct socket *sock)
> @@ -149,6 +154,7 @@ static void handle_tx(struct vhost_net *net)
> size_t hdr_size;
> struct socket *sock;
> struct vhost_ubuf_ref *uninitialized_var(ubufs);
> + size_t spin_count;
> bool zcopy;
>
> /* TODO: check that we are running from vhost_worker? */
> @@ -172,6 +178,7 @@ static void handle_tx(struct vhost_net *net)
> hdr_size = vq->vhost_hlen;
> zcopy = vhost_sock_zcopy(sock);
>
> + spin_count = 0;
> for (;;) {
> /* Release DMAs done buffers first */
> if (zcopy)
> @@ -205,9 +212,15 @@ static void handle_tx(struct vhost_net *net)
> set_bit(SOCK_ASYNC_NOSPACE, &sock->flags);
> break;
> }
> + if (spin_count < net->spin_threshold) {
> + spin_count++;
> + continue;
> + }
> if (unlikely(vhost_enable_notify(&net->dev, vq))) {
> vhost_disable_notify(&net->dev, vq);
> continue;
> + } else {
> + spin_count = 0;
> }
> break;
> }
> @@ -506,6 +519,7 @@ static int vhost_net_open(struct inode *inode, struct file *f)
> return -ENOMEM;
>
> dev = &n->dev;
> + n->spin_threshold = spin_threshold;
> n->vqs[VHOST_NET_VQ_TX].handle_kick = handle_tx_kick;
> n->vqs[VHOST_NET_VQ_RX].handle_kick = handle_rx_kick;
> r = vhost_dev_init(dev, n->vqs, workers, VHOST_NET_VQ_MAX);
> --
> 1.7.4.1
next prev parent reply other threads:[~2012-02-19 14:51 UTC|newest]
Thread overview: 23+ messages / expand[flat|nested] mbox.gz Atom feed top
2012-02-17 23:02 [PATCH 0/2][RFC] vhost: improve transmit rate with virtqueue polling Anthony Liguori
2012-02-17 23:02 ` [PATCH 1/2] vhost: allow multiple workers threads Anthony Liguori
2012-02-19 14:41 ` Michael S. Tsirkin
2012-02-20 15:50 ` Tom Lendacky
2012-02-20 19:27 ` Michael S. Tsirkin
2012-02-20 19:46 ` Anthony Liguori
2012-02-20 21:00 ` Michael S. Tsirkin
2012-02-21 1:04 ` Shirley Ma
2012-02-21 3:21 ` Michael S. Tsirkin
2012-02-21 4:03 ` Shirley Ma
2012-03-05 13:21 ` Anthony Liguori
2012-03-05 20:43 ` Shirley Ma
2012-02-21 4:32 ` Jason Wang
2012-02-21 4:51 ` Jason Wang
2012-02-17 23:02 ` [PATCH 2/2] vhost-net: add a spin_threshold parameter Anthony Liguori
2012-02-19 14:51 ` Michael S. Tsirkin [this message]
2012-02-21 1:35 ` Shirley Ma
2012-02-21 5:34 ` Jason Wang
2012-02-21 6:28 ` Shirley Ma
2012-02-21 6:38 ` Jason Wang
2012-02-21 11:09 ` Shirley Ma
2012-02-21 16:08 ` Sridhar Samudrala
2012-03-12 8:12 ` Dor Laor
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20120219145100.GB16620@redhat.com \
--to=mst@redhat.com \
--cc=aliguori@us.ibm.com \
--cc=netdev@vger.kernel.org \
--cc=toml@us.ibm.com \
--cc=vianac@br.ibm.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).