* Re: [PATCHv2 RFC 3/4] virtio_net: limit xmit polling
[not found] ` <a80199422de16ae355e56ee1b2abc9b2bf91a7f6.1307029009.git.mst@redhat.com>
@ 2011-06-02 18:09 ` Sridhar Samudrala
From: Sridhar Samudrala @ 2011-06-02 18:09 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: linux-kernel, Rusty Russell, Carsten Otte, Christian Borntraeger,
linux390, Martin Schwidefsky, Heiko Carstens, Shirley Ma, lguest,
virtualization, netdev, linux-s390, kvm, Krishna Kumar,
Tom Lendacky, steved, habanero
On Thu, 2011-06-02 at 18:43 +0300, Michael S. Tsirkin wrote:
> Current code might introduce a lot of latency variation
> if there are many pending bufs at the time we
> attempt to transmit a new one. This is bad for
> real-time applications and can't be good for TCP either.
>
> Free up just enough to both clean up all buffers
> eventually and to be able to xmit the next packet.
>
> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> ---
> drivers/net/virtio_net.c | 106 +++++++++++++++++++++++++++++-----------------
> 1 files changed, 67 insertions(+), 39 deletions(-)
>
> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> index a0ee78d..b25db1c 100644
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
> @@ -509,17 +509,33 @@ again:
>  	return received;
>  }
>  
> -static void free_old_xmit_skbs(struct virtnet_info *vi)
> +static bool free_old_xmit_skb(struct virtnet_info *vi)
>  {
>  	struct sk_buff *skb;
>  	unsigned int len;
>  
> -	while ((skb = virtqueue_get_buf(vi->svq, &len)) != NULL) {
> -		pr_debug("Sent skb %p\n", skb);
> -		vi->dev->stats.tx_bytes += skb->len;
> -		vi->dev->stats.tx_packets++;
> -		dev_kfree_skb_any(skb);
> -	}
> +	skb = virtqueue_get_buf(vi->svq, &len);
> +	if (unlikely(!skb))
> +		return false;
> +	pr_debug("Sent skb %p\n", skb);
> +	vi->dev->stats.tx_bytes += skb->len;
> +	vi->dev->stats.tx_packets++;
> +	dev_kfree_skb_any(skb);
> +	return true;
> +}
> +
> +/* Check capacity and try to free enough pending old buffers to enable queueing
> + * new ones. Return true if we can guarantee that a following
> + * virtqueue_add_buf will succeed. */
> +static bool free_xmit_capacity(struct virtnet_info *vi)
> +{
> +	struct sk_buff *skb;
> +	unsigned int len;
> +
> +	while (virtqueue_min_capacity(vi->svq) < MAX_SKB_FRAGS + 2)
> +		if (unlikely(!free_old_xmit_skb(vi)))
> +			return false;
If we are using INDIRECT descriptors, 1 descriptor entry is good enough
to guarantee that an skb can be queued unless we run out of memory.
Is it worth checking if 'indirect' is set on the svq and then only freeing
1 descriptor? Otherwise, we will be dropping the packet if there are
fewer than 18 free descriptors although we only need 1.
> +	return true;
> +}
>
> static int xmit_skb(struct virtnet_info *vi, struct sk_buff *skb)
> @@ -572,30 +588,34 @@ static int xmit_skb(struct virtnet_info *vi, struct sk_buff *skb)
>  static netdev_tx_t start_xmit(struct sk_buff *skb, struct net_device *dev)
>  {
>  	struct virtnet_info *vi = netdev_priv(dev);
> -	int capacity;
> -
> -	/* Free up any pending old buffers before queueing new ones. */
> -	free_old_xmit_skbs(vi);
> -
> -	/* Try to transmit */
> -	capacity = xmit_skb(vi, skb);
> -
> -	/* This can happen with OOM and indirect buffers. */
> -	if (unlikely(capacity < 0)) {
> -		if (net_ratelimit()) {
> -			if (likely(capacity == -ENOMEM)) {
> -				dev_warn(&dev->dev,
> -					 "TX queue failure: out of memory\n");
> -			} else {
> -				dev->stats.tx_fifo_errors++;
> +	int ret, i;
> +
> +	/* We normally do have space in the ring, so try to queue the skb as
> +	 * fast as possible. */
> +	ret = xmit_skb(vi, skb);
> +	if (unlikely(ret < 0)) {
> +		/* This triggers on the first xmit after ring full condition.
> +		 * We need to free up some skbs first. */
> +		if (likely(free_xmit_capacity(vi))) {
> +			ret = xmit_skb(vi, skb);
> +			/* This should never fail. Check, just in case. */
> +			if (unlikely(ret < 0)) {
>  				dev_warn(&dev->dev,
>  					 "Unexpected TX queue failure: %d\n",
> -					 capacity);
> +					 ret);
> +				dev->stats.tx_fifo_errors++;
> +				dev->stats.tx_dropped++;
> +				kfree_skb(skb);
> +				return NETDEV_TX_OK;
>  			}
> +		} else {
> +			/* Ring full: it might happen if we get a callback while
> +			 * the queue is still mostly full. This should be
> +			 * extremely rare. */
> +			dev->stats.tx_dropped++;
> +			kfree_skb(skb);
> +			goto stop;
>  		}
> -		dev->stats.tx_dropped++;
> -		kfree_skb(skb);
> -		return NETDEV_TX_OK;
>  	}
>  	virtqueue_kick(vi->svq);
>
> @@ -603,18 +623,26 @@ static netdev_tx_t start_xmit(struct sk_buff *skb, struct net_device *dev)
>  	skb_orphan(skb);
>  	nf_reset(skb);
>  
> -	/* Apparently nice girls don't return TX_BUSY; stop the queue
> -	 * before it gets out of hand. Naturally, this wastes entries. */
> -	if (capacity < 2+MAX_SKB_FRAGS) {
> -		netif_stop_queue(dev);
> -		if (unlikely(!virtqueue_enable_cb_delayed(vi->svq))) {
> -			/* More just got used, free them then recheck. */
> -			free_old_xmit_skbs(vi);
> -			capacity = virtqueue_min_capacity(vi->svq);
> -			if (capacity >= 2+MAX_SKB_FRAGS) {
> -				netif_start_queue(dev);
> -				virtqueue_disable_cb(vi->svq);
> -			}
> +	/* We transmit one skb, so try to free at least two pending skbs.
> +	 * This is so that we don't hog the skb memory unnecessarily.
> +	 * Doing this after kick means there's a chance we'll free
> +	 * the skb we have just sent, which is hot in cache. */
> +	for (i = 0; i < 2; i++)
> +		free_old_xmit_skb(vi);
> +
> +	if (likely(free_xmit_capacity(vi)))
> +		return NETDEV_TX_OK;
> +
> +stop:
> +	/* Apparently nice girls don't return TX_BUSY; check capacity and stop
> +	 * the queue before it gets out of hand.
> +	 * Naturally, this wastes entries. */
> +	netif_stop_queue(dev);
> +	if (unlikely(!virtqueue_enable_cb_delayed(vi->svq))) {
> +		/* More just got used, free them and recheck. */
> +		if (free_xmit_capacity(vi)) {
> +			netif_start_queue(dev);
> +			virtqueue_disable_cb(vi->svq);
>  		}
>  	}
> 
* Re: [PATCHv2 RFC 4/4] Revert "virtio: make add_buf return capacity remaining:
[not found] ` <20110607155457.GA17436@redhat.com>
@ 2011-06-08 0:19 ` Rusty Russell
From: Rusty Russell @ 2011-06-08 0:19 UTC (permalink / raw)
To: Michael S. Tsirkin, linux-kernel
Cc: Carsten Otte, Christian Borntraeger, linux390, Martin Schwidefsky,
Heiko Carstens, Shirley Ma, lguest, virtualization, netdev,
linux-s390, kvm, Krishna Kumar, Tom Lendacky, steved, habanero
On Tue, 7 Jun 2011 18:54:57 +0300, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> On Thu, Jun 02, 2011 at 06:43:25PM +0300, Michael S. Tsirkin wrote:
> > This reverts commit 3c1b27d5043086a485f8526353ae9fe37bfa1065.
> > The only user was virtio_net, and it switched to
> > min_capacity instead.
> >
> > Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
>
> It turns out another place in virtio_net - receive
> buf processing - relies on the old behaviour:
>
> try_fill_recv:
> 	do {
> 		if (vi->mergeable_rx_bufs)
> 			err = add_recvbuf_mergeable(vi, gfp);
> 		else if (vi->big_packets)
> 			err = add_recvbuf_big(vi, gfp);
> 		else
> 			err = add_recvbuf_small(vi, gfp);
> 
> 		oom = err == -ENOMEM;
> 		if (err < 0)
> 			break;
> 		++vi->num;
> 	} while (err > 0);
>
> The point is to avoid allocating a buf if
> the ring is out of space and we are sure
> add_buf will fail.
>
> It works well for mergeable buffers and for big
> packets if we are not OOM. Small packets and
> OOM will do extra get_page/put_page calls
> (but maybe we don't care).
>
> So this is RX, I intend to drop it from this patchset and focus on the
> TX side for starters.
We could do some hack where we get the capacity, and estimate how many
packets we need to fill it, then try to do that many.
I say hack, because knowing whether we're doing indirect buffers is a
layering violation. But that's life when you're trying to do
microoptimizations.
Cheers,
Rusty.
* Re: [PATCHv2 RFC 0/4] virtio and vhost-net capacity handling
[not found] ` <20110607160830.GB17581@redhat.com>
@ 2011-06-13 13:32 ` Krishna Kumar2
From: Krishna Kumar2 @ 2011-06-13 13:32 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: Christian Borntraeger, Carsten Otte, habanero, Heiko Carstens,
kvm, lguest, linux-kernel, linux-s390, linux390, netdev,
Rusty Russell, Martin Schwidefsky, steved, Tom Lendacky,
virtualization, Shirley Ma
"Michael S. Tsirkin" <mst@redhat.com> wrote on 06/07/2011 09:38:30 PM:
> > This is on top of the patches applied by Rusty.
> >
> > Warning: untested. Posting now to give people chance to
> > comment on the API.
>
> OK, this seems to have survived some testing so far,
> after I dropped patch 4 and fixed build for patch 3
> (build fixup patch sent in reply to the original).
>
> I'll be mostly offline until Sunday, would appreciate
> testing reports.
Hi Michael,
I ran the latest patches with 1K I/O (guest->local host) and
the results are (60 sec run for each test case):
______________________________
#sessions     BW%       SD%
______________________________
1           -25.6      47.0
2           -29.3      22.9
4              .8       1.6
8             1.6        0
16           -1.6       4.1
32           -5.3       2.1
48           11.3      -7.8
64           -2.8        .7
96           -6.2        .6
128         -10.6      12.7
______________________________
BW: -4.8    SD: 5.4
I tested it again to see if the regression is fleeting (since
the numbers vary quite a bit for 1K I/O even between guest->
local host), but:
______________________________
#sessions     BW%       SD%
______________________________
1            14.0     -17.3
2            19.9     -11.1
4             7.9     -15.3
8             9.6     -13.1
16            1.2      -7.3
32            -.6     -13.5
48          -28.7      10.0
64           -5.7       -.7
96           -9.4      -8.1
128          -9.4        .7
______________________________
BW: -3.7    SD: -2.0
With 16K, there was an improvement in SD, but
higher sessions seem to slightly degrade BW/SD:
______________________________
#sessions     BW%       SD%
______________________________
1            30.9     -25.0
2            16.5     -19.4
4            -1.3       7.9
8             1.4       6.2
16            3.9      -5.4
32             0        4.3
48            -.5        .1
64           32.1      -1.5
96           -2.1      23.2
128          -7.4       3.8
______________________________
BW: 5.0    SD: 7.5
Thanks,
- KK