* [RFC PATCH 0/1] vhost: Reduce TX used buffer signal for performance
@ 2010-10-27 21:00 Shirley Ma
From: Shirley Ma @ 2010-10-27 21:00 UTC (permalink / raw)
To: mst@redhat.com, David Miller; +Cc: netdev, kvm, linux-kernel
This patch changes vhost TX used-buffer signaling to the guest from
one-by-one to batches of 3/4 of the ring size. I tried several batch
sizes (4, 16, 1/4 ring size, 1/2 ring size) and found that the largest
worked best for message sizes between 256 bytes and 4K in netperf
TCP_STREAM tests, so 3/4 of the ring size was chosen as the signaling
threshold.
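For concreteness, here is the arithmetic behind that threshold as a
minimal standalone sketch (the ring size of 256 is just an assumed
example; only the num - (num >> 2) expression comes from the patch):

#include <stdio.h>

int main(void)
{
        /* The patch signals the guest once the number of pending used
         * buffers reaches num - (num >> 2), i.e. 3/4 of the vring. */
        unsigned int num = 256;         /* assumed example ring size */
        unsigned int threshold = num - (num >> 2);

        /* For num = 256 this prints a threshold of 192 buffers. */
        printf("vring size %u -> signal every %u used buffers\n",
               num, threshold);
        return 0;
}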
I tested both UDP and TCP guest-to-host performance with a 2-vcpu
guest. The results of the 60-second netperf runs are below.
TCP_STREAM

Message size    Guest CPU (%)    BW (Mb/s)
                before:after     before:after
 256            57.84:58.42      1678.47:1908.75
 512            68.68:60.21      1844.18:3387.33
1024            68.01:58.70      1945.14:3384.72
2048            65.36:54.25      2342.45:3799.31
4096            63.25:54.62      3307.11:4451.78
8192            59.57:57.89      6038.64:6694.04

UDP_STREAM

Message size    Guest CPU (%)    BW (Mb/s)
                before:after     before:after
1024            49.64:26.69      1161.0:1687.6
2048            49.88:29.25      2326.8:2850.9
4096            49.59:29.15      3871.1:4880.3
8192            46.09:32.66      6822.9:7825.1
 16K            42.90:34.96      11347.1:11767.4
For large message sizes, the 60-second results remain almost the same
before and after the change. My guess is that signaling does not play a
big role in large-message transmission.
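As a rough sanity check on that guess, some back-of-envelope arithmetic
from the table above (assuming roughly one signal per message in the
unpatched case):

  512-byte messages at 1844.18 Mb/s -> 1844.18e6 / (512 * 8)  ~ 450K signals/sec
  8K messages at 6038.64 Mb/s       -> 6038.64e6 / (8192 * 8) ~  92K signals/sec

At 8K the unpatched signal rate is already about a fifth of the
512-byte rate, so there is much less signaling overhead left to batch
away.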
Shirley
* Re: [RFC PATCH 0/1] vhost: Reduce TX used buffer signal for performance
@ 2010-10-27 21:05 Shirley Ma
From: Shirley Ma @ 2010-10-27 21:05 UTC (permalink / raw)
To: mst@redhat.com; +Cc: David Miller, netdev, kvm, linux-kernel
This patch changes vhost TX used-buffer signaling to the guest from
one-by-one to batches of up to 3/4 of the vring size. The change
improves both bandwidth and CPU utilization for TX message sizes from
256 bytes to 8K, without introducing any regression.
Signed-off-by: Shirley Ma <xma@us.ibm.com>
---
drivers/vhost/net.c | 19 ++++++++++++++++++-
drivers/vhost/vhost.c | 31 +++++++++++++++++++++++++++++++
drivers/vhost/vhost.h | 3 +++
3 files changed, 52 insertions(+), 1 deletions(-)
diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
index 4b4da5b..bd1ba71 100644
--- a/drivers/vhost/net.c
+++ b/drivers/vhost/net.c
@@ -198,7 +198,24 @@ static void handle_tx(struct vhost_net *net)
                 if (err != len)
                         pr_debug("Truncated TX packet: "
                                  " len %d != %zd\n", err, len);
-                vhost_add_used_and_signal(&net->dev, vq, head, 0);
+                /*
+                 * If no pending-buffer array has been allocated, signal used
+                 * buffers one by one; otherwise batch them and signal once
+                 * 3/4 of the ring size is pending, to reduce CPU utilization.
+                 */
+                if (unlikely(vq->pend))
+                        vhost_add_used_and_signal(&net->dev, vq, head, 0);
+                else {
+                        vq->pend[vq->num_pend].id = head;
+                        vq->pend[vq->num_pend].len = 0;
+                        ++vq->num_pend;
+                        if (vq->num_pend == (vq->num - (vq->num >> 2))) {
+                                vhost_add_used_and_signal_n(&net->dev, vq,
+                                                            vq->pend,
+                                                            vq->num_pend);
+                                vq->num_pend = 0;
+                        }
+                }
                 total_len += len;
                 if (unlikely(total_len >= VHOST_NET_WEIGHT)) {
                         vhost_poll_queue(&vq->poll);
diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index 94701ff..47696d2 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -170,6 +170,16 @@ static void vhost_vq_reset(struct vhost_dev *dev,
         vq->call_ctx = NULL;
         vq->call = NULL;
         vq->log_ctx = NULL;
+        /* signal pending used buffers */
+        if (vq->pend) {
+                if (vq->num_pend != 0) {
+                        vhost_add_used_and_signal_n(dev, vq, vq->pend,
+                                                    vq->num_pend);
+                        vq->num_pend = 0;
+                }
+                kfree(vq->pend);
+        }
+        vq->pend = NULL;
 }

 static int vhost_worker(void *data)
@@ -273,7 +283,13 @@ long vhost_dev_init(struct vhost_dev *dev,
                 dev->vqs[i].heads = NULL;
                 dev->vqs[i].dev = dev;
                 mutex_init(&dev->vqs[i].mutex);
+                dev->vqs[i].num_pend = 0;
+                dev->vqs[i].pend = NULL;
                 vhost_vq_reset(dev, dev->vqs + i);
+                /* signal 3/4 of ring size used buffers */
+                dev->vqs[i].pend = kmalloc((dev->vqs[i].num -
+                                            (dev->vqs[i].num >> 2)) *
+                                           sizeof *vq->peed, GFP_KERNEL);
                 if (dev->vqs[i].handle_kick)
                         vhost_poll_init(&dev->vqs[i].poll,
                                         dev->vqs[i].handle_kick, POLLIN, dev);
@@ -599,6 +615,21 @@ static long vhost_set_vring(struct vhost_dev *d, int ioctl, void __user *argp)
                         r = -EINVAL;
                         break;
                 }
+                if (vq->num != s.num) {
+                        /* signal used buffers first */
+                        if (vq->pend) {
+                                if (vq->num_pend != 0) {
+                                        vhost_add_used_and_signal_n(vq->dev, vq,
+                                                                    vq->pend,
+                                                                    vq->num_pend);
+                                        vq->num_pend = 0;
+                                }
+                                kfree(vq->pend);
+                        }
+                        /* realloc pending used buffers size */
+                        vq->pend = kmalloc((s.num - (s.num >> 2)) *
+                                           sizeof *vq->pend, GFP_KERNEL);
+                }
                 vq->num = s.num;
                 break;
         case VHOST_SET_VRING_BASE:
diff --git a/drivers/vhost/vhost.h b/drivers/vhost/vhost.h
index 073d06a..78949c0 100644
--- a/drivers/vhost/vhost.h
+++ b/drivers/vhost/vhost.h
@@ -108,6 +108,9 @@ struct vhost_virtqueue {
         /* Log write descriptors */
         void __user *log_base;
         struct vhost_log *log;
+        /* batch multiple used buffers into a single guest signal */
+        int num_pend;
+        struct vring_used_elem *pend;
 };

 struct vhost_dev {
* Re: [RFC PATCH 0/1] vhost: Reduce TX used buffer signal for performance
@ 2010-10-28 8:57 Stefan Hajnoczi
From: Stefan Hajnoczi @ 2010-10-28 8:57 UTC (permalink / raw)
To: Shirley Ma; +Cc: mst@redhat.com, David Miller, netdev, kvm, linux-kernel
On Wed, Oct 27, 2010 at 10:05 PM, Shirley Ma <mashirle@us.ibm.com> wrote:
> This patch changes vhost TX used-buffer signaling to the guest from
> one-by-one to batches of up to 3/4 of the vring size. The change
> improves both bandwidth and CPU utilization for TX message sizes from
> 256 bytes to 8K, without introducing any regression.
Any concerns about introducing latency or does the guest not care when
TX completions come in?
> Signed-off-by: Shirley Ma <xma@us.ibm.com>
> ---
>
> drivers/vhost/net.c | 19 ++++++++++++++++++-
> drivers/vhost/vhost.c | 31 +++++++++++++++++++++++++++++++
> drivers/vhost/vhost.h | 3 +++
> 3 files changed, 52 insertions(+), 1 deletions(-)
>
> diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
> index 4b4da5b..bd1ba71 100644
> --- a/drivers/vhost/net.c
> +++ b/drivers/vhost/net.c
> @@ -198,7 +198,24 @@ static void handle_tx(struct vhost_net *net)
>                  if (err != len)
>                          pr_debug("Truncated TX packet: "
>                                   " len %d != %zd\n", err, len);
> -                vhost_add_used_and_signal(&net->dev, vq, head, 0);
> +                /*
> +                 * If no pending-buffer array has been allocated, signal used
> +                 * buffers one by one; otherwise batch them and signal once
> +                 * 3/4 of the ring size is pending, to reduce CPU utilization.
> +                 */
> +                if (unlikely(vq->pend))
> +                        vhost_add_used_and_signal(&net->dev, vq, head, 0);
> +                else {
> +                        vq->pend[vq->num_pend].id = head;
I don't understand the logic here: if !vq->pend then we assign to
vq->pend[vq->num_pend].
> +                        vq->pend[vq->num_pend].len = 0;
> +                        ++vq->num_pend;
> +                        if (vq->num_pend == (vq->num - (vq->num >> 2))) {
> +                                vhost_add_used_and_signal_n(&net->dev, vq,
> +                                                            vq->pend,
> +                                                            vq->num_pend);
> +                                vq->num_pend = 0;
> +                        }
> +                }
>                  total_len += len;
>                  if (unlikely(total_len >= VHOST_NET_WEIGHT)) {
>                          vhost_poll_queue(&vq->poll);
> diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
> index 94701ff..47696d2 100644
> --- a/drivers/vhost/vhost.c
> +++ b/drivers/vhost/vhost.c
> @@ -170,6 +170,16 @@ static void vhost_vq_reset(struct vhost_dev *dev,
>          vq->call_ctx = NULL;
>          vq->call = NULL;
>          vq->log_ctx = NULL;
> +        /* signal pending used buffers */
> +        if (vq->pend) {
> +                if (vq->num_pend != 0) {
> +                        vhost_add_used_and_signal_n(dev, vq, vq->pend,
> +                                                    vq->num_pend);
> +                        vq->num_pend = 0;
> +                }
> +                kfree(vq->pend);
> +        }
> +        vq->pend = NULL;
>  }
>
>  static int vhost_worker(void *data)
> @@ -273,7 +283,13 @@ long vhost_dev_init(struct vhost_dev *dev,
>                  dev->vqs[i].heads = NULL;
>                  dev->vqs[i].dev = dev;
>                  mutex_init(&dev->vqs[i].mutex);
> +                dev->vqs[i].num_pend = 0;
> +                dev->vqs[i].pend = NULL;
>                  vhost_vq_reset(dev, dev->vqs + i);
> +                /* signal 3/4 of ring size used buffers */
> +                dev->vqs[i].pend = kmalloc((dev->vqs[i].num -
> +                                            (dev->vqs[i].num >> 2)) *
> +                                           sizeof *vq->peed, GFP_KERNEL);
Has this patch been compile tested? vq->peed?
Stefan
* Re: [RFC PATCH 0/1] vhost: Reduce TX used buffer signal for performance
@ 2010-10-28 8:59 Stefan Hajnoczi
From: Stefan Hajnoczi @ 2010-10-28 8:59 UTC (permalink / raw)
To: Shirley Ma; +Cc: mst@redhat.com, David Miller, netdev, kvm, linux-kernel
On Thu, Oct 28, 2010 at 9:57 AM, Stefan Hajnoczi <stefanha@gmail.com> wrote:
Just read the patch 1/1 discussion and it looks like you're already on
it. Sorry for the noise.
Stefan