From: "Michael S. Tsirkin" <mst@redhat.com>
To: Sridhar Samudrala <sri@us.ibm.com>
Cc: Steve Dobbelstein <steved@us.ibm.com>,
David Miller <davem@davemloft.net>,
kvm@vger.kernel.org, mashirle@linux.vnet.ibm.com,
netdev@vger.kernel.org
Subject: Re: Network performance with small packets
Date: Tue, 1 Feb 2011 07:56:27 +0200
Message-ID: <20110201055627.GG9124@redhat.com>
In-Reply-To: <1296523838.30191.39.camel@sridhar.beaverton.ibm.com>
On Mon, Jan 31, 2011 at 05:30:38PM -0800, Sridhar Samudrala wrote:
> On Mon, 2011-01-31 at 18:24 -0600, Steve Dobbelstein wrote:
> > "Michael S. Tsirkin" <mst@redhat.com> wrote on 01/28/2011 06:16:16 AM:
> >
> > > OK, so thinking about it more, maybe the issue is this:
> > > tx becomes full. We process one request and interrupt the guest,
> > > then it adds one request and the queue is full again.
> > >
> > > Maybe the following will help it stabilize?
> > > By itself it does nothing, but if you set
> > > all the parameters to a huge value we will
> > > only interrupt when we see an empty ring.
> > > Which might be too much: pls try other values
> > > in the middle: e.g. make bufs half the ring,
> > > or bytes some small value, or packets some
> > > small value etc.
> > >
> > > Warning: completely untested.
> > >
> > > diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
> > > index aac05bc..6769cdc 100644
> > > --- a/drivers/vhost/net.c
> > > +++ b/drivers/vhost/net.c
> > > @@ -32,6 +32,13 @@
> > > * Using this limit prevents one virtqueue from starving others. */
> > > #define VHOST_NET_WEIGHT 0x80000
> > >
> > > +int tx_bytes_coalesce = 0;
> > > +module_param(tx_bytes_coalesce, int, 0644);
> > > +int tx_bufs_coalesce = 0;
> > > +module_param(tx_bufs_coalesce, int, 0644);
> > > +int tx_packets_coalesce = 0;
> > > +module_param(tx_packets_coalesce, int, 0644);
> > > +
> > > enum {
> > > VHOST_NET_VQ_RX = 0,
> > > VHOST_NET_VQ_TX = 1,
> > > @@ -127,6 +134,9 @@ static void handle_tx(struct vhost_net *net)
> > > int err, wmem;
> > > size_t hdr_size;
> > > struct socket *sock;
> > > + int bytes_coalesced = 0;
> > > + int bufs_coalesced = 0;
> > > + int packets_coalesced = 0;
> > >
> > > /* TODO: check that we are running from vhost_worker? */
> > > sock = rcu_dereference_check(vq->private_data, 1);
> > > @@ -196,14 +206,26 @@ static void handle_tx(struct vhost_net *net)
> > > if (err != len)
> > > pr_debug("Truncated TX packet: "
> > > " len %d != %zd\n", err, len);
> > > - vhost_add_used_and_signal(&net->dev, vq, head, 0);
> > > total_len += len;
> > > + packets_coalesced += 1;
> > > + bytes_coalesced += len;
> > > + bufs_coalesced += in;
> >
> > Should this instead be:
> > bufs_coalesced += out;
> >
> > Perusing the code I see that earlier there is a check to see if "in" is not
> > zero, and, if so, error out of the loop. After the check, "in" is not
> > touched until it is added to bufs_coalesced, effectively not changing
> > bufs_coalesced, meaning bufs_coalesced will never trigger the conditions
> > below.
>
> Yes. It definitely should be 'out'. 'in' should be 0 in the tx path.
>
> I tried a simpler version of this patch without any tunables by
> delaying the signaling until we come out of the for loop.
> It definitely reduced the number of vmexits significantly for small message
> guest to host stream test and the throughput went up a little.
>
> diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
> index 9b3ca10..5f9fae9 100644
> --- a/drivers/vhost/net.c
> +++ b/drivers/vhost/net.c
> @@ -197,7 +197,7 @@ static void handle_tx(struct vhost_net *net)
> if (err != len)
> pr_debug("Truncated TX packet: "
> " len %d != %zd\n", err, len);
> - vhost_add_used_and_signal(&net->dev, vq, head, 0);
> + vhost_add_used(vq, head, 0);
> total_len += len;
> if (unlikely(total_len >= VHOST_NET_WEIGHT)) {
> vhost_poll_queue(&vq->poll);
> @@ -205,6 +205,8 @@ static void handle_tx(struct vhost_net *net)
> }
> }
>
> + if (total_len > 0)
> + vhost_signal(&net->dev, vq);
> mutex_unlock(&vq->mutex);
> }
>
>
> >
> > Or am I missing something?
> >
> > > + if (unlikely(packets_coalesced > tx_packets_coalesce ||
> > > + bytes_coalesced > tx_bytes_coalesce ||
> > > + bufs_coalesced > tx_bufs_coalesce))
> > > + vhost_add_used_and_signal(&net->dev, vq, head, 0);
> > > + else
> > > + vhost_add_used(vq, head, 0);
> > > if (unlikely(total_len >= VHOST_NET_WEIGHT)) {
> > > vhost_poll_queue(&vq->poll);
> > > break;
> > > }
> > > }
> > >
> > > + if (likely(packets_coalesced > tx_packets_coalesce ||
> > > + bytes_coalesced > tx_bytes_coalesce ||
> > > + bufs_coalesced > tx_bufs_coalesce))
> > > + vhost_signal(&net->dev, vq);
> > > mutex_unlock(&vq->mutex);
> > > }
>
> It is possible that we can miss signaling the guest even after
> processing a few pkts, if we don't hit any of these conditions.
Yes. It really should be

	if (likely(packets_coalesced && bytes_coalesced && bufs_coalesced))
		vhost_signal(&net->dev, vq);

> > >
> >
> > Steve D.
> >
> > --
> > To unsubscribe from this list: send the line "unsubscribe netdev" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at http://vger.kernel.org/majordomo-info.html