netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Anthony Liguori <aliguori@us.ibm.com>
To: netdev@vger.kernel.org
Cc: Michael Tsirkin <mst@redhat.com>,
	Anthony Liguori <aliguori@us.ibm.com>,
	Tom Lendacky <toml@us.ibm.com>,
	Cristian Viana <vianac@br.ibm.com>
Subject: [PATCH 2/2] vhost-net: add a spin_threshold parameter
Date: Fri, 17 Feb 2012 17:02:06 -0600	[thread overview]
Message-ID: <1329519726-25763-3-git-send-email-aliguori@us.ibm.com> (raw)
In-Reply-To: <1329519726-25763-1-git-send-email-aliguori@us.ibm.com>

With workloads that are dominated by very high rates of small packets, we see
considerable overhead in virtio notifications.

The best strategy we've been able to come up with to deal with this is adaptive
polling.  This patch simply adds the infrastructure needed to experiment with
polling strategies.  It is not meant for inclusion.

Here are the results with various polling values.  The spinning is not currently
a net win due to the high mutex contention caused by the broadcast wakeup.  With
a patch attempting to signal wakeup, we see up to 170+ transactions per second
with TCP_RR 60 instance.

N  Baseline	Spin 0		Spin 1000	Spin 5000

TCP_RR

1  9,639.66	10,164.06	9,825.43	9,827.45	101.95%
10 62,819.55	54,059.78	63,114.30	60,767.23	96.73%
30 84,715.60	131,241.86	120,922.38	89,776.39	105.97%
60 124,614.71	148,720.66	158,678.08	141,400.05	113.47%

UDP_RR

1  9,652.50	10,343.72	9,493.95	9,569.54	99.14%
10 53,830.26	58,235.90	50,145.29	48,820.53	90.69%
30 89,471.01	97,634.53	95,108.34	91,263.65	102.00%
60 103,640.59	164,035.01	157,002.22	128,646.73	124.13%

TCP_STREAM
1  2,622.63	2,610.71	2,688.49	2,678.61	102.13%
4  4,928.02	4,812.05	4,971.00	5,104.57	103.58%

1  5,639.89	5,751.28	5,819.81	5,593.62	99.18%
4  5,874.72	6,575.55	6,324.87	6,502.33	110.68%

1  6,257.42	7,655.22	7,610.52	7,424.74	118.65%
4  5,370.78	6,044.83	5,784.23	6,209.93	115.62%

1  6,346.63	7,267.44	7,567.39	7,677.93	120.98%
4  5,198.02	5,657.12	5,528.94	5,792.42	111.44%

TCP_MAERTS

1  2,091.38	1,765.62	2,142.56	2,312.94	110.59%
4  5,319.52	5,619.49	5,544.50	5,645.81	106.13%

1  7,030.66	7,593.61	7,575.67	7,622.07	108.41%
4  9,040.53	7,275.84	7,322.07	6,681.34	73.90%

1  9,160.93	9,318.15	9,065.82	8,586.82	93.73%
4  9,372.49	8,875.63	8,959.03	9,056.07	96.62%

1  9,183.28	9,134.02	8,945.12	8,657.72	94.28%
4  9,377.17	8,877.52	8,959.54	9,071.53	96.74%

Cc: Tom Lendacky <toml@us.ibm.com>
Cc: Cristian Viana <vianac@br.ibm.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
---
 drivers/vhost/net.c |   14 ++++++++++++++
 1 files changed, 14 insertions(+), 0 deletions(-)

diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
index 47175cd..e9e5866 100644
--- a/drivers/vhost/net.c
+++ b/drivers/vhost/net.c
@@ -37,6 +37,10 @@ static int workers = 2;
 module_param(workers, int, 0444);
 MODULE_PARM_DESC(workers, "Set the number of worker threads");
 
+static ulong spin_threshold = 0;
+module_param(spin_threshold, ulong, 0444);
+MODULE_PARM_DESC(spin_threshold, "The polling threshold for the tx queue");
+
 /* Max number of bytes transferred before requeueing the job.
  * Using this limit prevents one virtqueue from starving others. */
 #define VHOST_NET_WEIGHT 0x80000
@@ -65,6 +69,7 @@ struct vhost_net {
 	 * We only do this when socket buffer fills up.
 	 * Protected by tx vq lock. */
 	enum vhost_net_poll_state tx_poll_state;
+	size_t spin_threshold;
 };
 
 static bool vhost_sock_zcopy(struct socket *sock)
@@ -149,6 +154,7 @@ static void handle_tx(struct vhost_net *net)
 	size_t hdr_size;
 	struct socket *sock;
 	struct vhost_ubuf_ref *uninitialized_var(ubufs);
+	size_t spin_count;
 	bool zcopy;
 
 	/* TODO: check that we are running from vhost_worker? */
@@ -172,6 +178,7 @@ static void handle_tx(struct vhost_net *net)
 	hdr_size = vq->vhost_hlen;
 	zcopy = vhost_sock_zcopy(sock);
 
+	spin_count = 0;
 	for (;;) {
 		/* Release DMAs done buffers first */
 		if (zcopy)
@@ -205,9 +212,15 @@ static void handle_tx(struct vhost_net *net)
 				set_bit(SOCK_ASYNC_NOSPACE, &sock->flags);
 				break;
 			}
+			if (spin_count < net->spin_threshold) {
+				spin_count++;
+				continue;
+			}
 			if (unlikely(vhost_enable_notify(&net->dev, vq))) {
 				vhost_disable_notify(&net->dev, vq);
 				continue;
+			} else {
+				spin_count = 0;
 			}
 			break;
 		}
@@ -506,6 +519,7 @@ static int vhost_net_open(struct inode *inode, struct file *f)
 		return -ENOMEM;
 
 	dev = &n->dev;
+	n->spin_threshold = spin_threshold;
 	n->vqs[VHOST_NET_VQ_TX].handle_kick = handle_tx_kick;
 	n->vqs[VHOST_NET_VQ_RX].handle_kick = handle_rx_kick;
 	r = vhost_dev_init(dev, n->vqs, workers, VHOST_NET_VQ_MAX);
-- 
1.7.4.1

  parent reply	other threads:[~2012-02-17 23:03 UTC|newest]

Thread overview: 23+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-02-17 23:02 [PATCH 0/2][RFC] vhost: improve transmit rate with virtqueue polling Anthony Liguori
2012-02-17 23:02 ` [PATCH 1/2] vhost: allow multiple workers threads Anthony Liguori
2012-02-19 14:41   ` Michael S. Tsirkin
2012-02-20 15:50     ` Tom Lendacky
2012-02-20 19:27       ` Michael S. Tsirkin
2012-02-20 19:46         ` Anthony Liguori
2012-02-20 21:00           ` Michael S. Tsirkin
2012-02-21  1:04             ` Shirley Ma
2012-02-21  3:21               ` Michael S. Tsirkin
2012-02-21  4:03                 ` Shirley Ma
2012-03-05 13:21                   ` Anthony Liguori
2012-03-05 20:43                     ` Shirley Ma
2012-02-21  4:32           ` Jason Wang
2012-02-21  4:51     ` Jason Wang
2012-02-17 23:02 ` Anthony Liguori [this message]
2012-02-19 14:51   ` [PATCH 2/2] vhost-net: add a spin_threshold parameter Michael S. Tsirkin
2012-02-21  1:35     ` Shirley Ma
2012-02-21  5:34       ` Jason Wang
2012-02-21  6:28         ` Shirley Ma
2012-02-21  6:38           ` Jason Wang
2012-02-21 11:09             ` Shirley Ma
2012-02-21 16:08             ` Sridhar Samudrala
2012-03-12  8:12   ` Dor Laor

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1329519726-25763-3-git-send-email-aliguori@us.ibm.com \
    --to=aliguori@us.ibm.com \
    --cc=mst@redhat.com \
    --cc=netdev@vger.kernel.org \
    --cc=toml@us.ibm.com \
    --cc=vianac@br.ibm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).