* [PATCH] virtio_net: Fix queue full check
From: Krishna Kumar @ 2010-10-28 5:10 UTC
To: rusty, davem; +Cc: netdev, Krishna Kumar

I get many queue-full errors being wrongly reported when running
parallel netperfs:

Oct 17 10:22:40 localhost kernel: net eth0: Unexpected TX queue failure: -28
Oct 17 10:28:22 localhost kernel: net eth0: Unexpected TX queue failure: -28
Oct 17 10:35:58 localhost kernel: net eth0: Unexpected TX queue failure: -28
Oct 17 10:41:06 localhost kernel: net eth0: Unexpected TX queue failure: -28

I initially changed the check from -ENOMEM to -ENOSPC, but
virtqueue_add_buf can return only -ENOSPC when it doesn't have
space for a new request. The patch removes the redundant checks
but still displays the failure errno.

Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com>
---
 drivers/net/virtio_net.c |   15 ++++-----------
 1 file changed, 4 insertions(+), 11 deletions(-)

diff -ruNp org/drivers/net/virtio_net.c new/drivers/net/virtio_net.c
--- org/drivers/net/virtio_net.c	2010-10-11 10:20:02.000000000 +0530
+++ new/drivers/net/virtio_net.c	2010-10-21 17:37:45.000000000 +0530
@@ -570,17 +570,10 @@ static netdev_tx_t start_xmit(struct sk_

 	/* This can happen with OOM and indirect buffers. */
 	if (unlikely(capacity < 0)) {
-		if (net_ratelimit()) {
-			if (likely(capacity == -ENOMEM)) {
-				dev_warn(&dev->dev,
-					 "TX queue failure: out of memory\n");
-			} else {
-				dev->stats.tx_fifo_errors++;
-				dev_warn(&dev->dev,
-					 "Unexpected TX queue failure: %d\n",
-					 capacity);
-			}
-		}
+		if (net_ratelimit())
+			dev_warn(&dev->dev,
+				 "TX queue failure (%d): out of memory\n",
+				 capacity);
 		dev->stats.tx_dropped++;
 		kfree_skb(skb);
 		return NETDEV_TX_OK;
* Re: [PATCH] virtio_net: Fix queue full check
From: Rusty Russell @ 2010-10-29 9:47 UTC
To: Krishna Kumar; +Cc: davem, netdev

On Thu, 28 Oct 2010 03:40:36 pm Krishna Kumar wrote:
> I get many queue-full errors being wrongly reported when running
> parallel netperfs:
>
> Oct 17 10:22:40 localhost kernel: net eth0: Unexpected TX queue failure: -28
> Oct 17 10:28:22 localhost kernel: net eth0: Unexpected TX queue failure: -28
> Oct 17 10:35:58 localhost kernel: net eth0: Unexpected TX queue failure: -28
> Oct 17 10:41:06 localhost kernel: net eth0: Unexpected TX queue failure: -28
>
> I initially changed the check from -ENOMEM to -ENOSPC, but
> virtqueue_add_buf can return only -ENOSPC when it doesn't have
> space for a new request. The patch removes the redundant checks
> but still displays the failure errno.
>
> Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com>
>
> [...]

Hold on... you were getting -ENOSPC, which shouldn't happen.  What
makes you think it's out of memory?

Confused,
Rusty.
* Re: [PATCH] virtio_net: Fix queue full check
From: Krishna Kumar2 @ 2010-10-29 10:55 UTC
To: Rusty Russell; +Cc: davem, netdev

Rusty Russell <rusty@rustcorp.com.au> wrote on 10/29/2010 03:17:24 PM:

> > I initially changed the check from -ENOMEM to -ENOSPC, but
> > virtqueue_add_buf can return only -ENOSPC when it doesn't have
> > space for a new request. The patch removes the redundant checks
> > but still displays the failure errno.
> >
> > [...]
>
> Hold on... you were getting -ENOSPC, which shouldn't happen.  What
> makes you think it's out of memory?

virtqueue_add_buf_gfp returns only -ENOSPC on failure, whether
direct or indirect descriptors are used, so isn't -ENOSPC
"expected"?  (vring_add_indirect returns -ENOMEM on memory
failure, but that is masked out and we fall back to direct
descriptors, which is the actual failure point.)

Thanks,

- KK
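A simplified sketch of the 2.6.36-era virtqueue_add_buf_gfp() control
flow described above, abridged and reconstructed from context (locking,
debug macros, and descriptor setup omitted; details approximate).  The
indirect attempt's -ENOMEM is discarded, and the direct path then
reports -ENOSPC:

/* Sketch only - not the verbatim kernel source. */
int virtqueue_add_buf_gfp(struct virtqueue *_vq, struct scatterlist sg[],
			  unsigned int out, unsigned int in,
			  void *data, gfp_t gfp)
{
	struct vring_virtqueue *vq = to_vvq(_vq);
	int head;

	/* If the host supports indirect descriptors and the request has
	 * more than one element, try to allocate an indirect table. */
	if (vq->indirect && (out + in) > 1 && vq->num_free) {
		head = vring_add_indirect(vq, sg, out, in, gfp);
		if (head >= 0)
			goto add_head;
		/* A failed kmalloc returns -ENOMEM here, but the error
		 * is dropped: we silently fall through to the direct
		 * path below. */
	}

	/* Direct path: needs out + in free ring entries.  When the
	 * capacity was computed on the optimistic indirect assumption,
	 * this is where the "unexpected" -ENOSPC comes from. */
	if (vq->num_free < out + in)
		return -ENOSPC;

	/* (direct descriptor setup continues here) */
add_head:
	/* (record the caller's token and publish the head index) */
	return vq->num_free;
}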
* Re: [PATCH] virtio_net: Fix queue full check
From: Rusty Russell @ 2010-10-29 11:28 UTC
To: Krishna Kumar2; +Cc: davem, netdev, Michael S. Tsirkin

On Fri, 29 Oct 2010 09:25:09 pm Krishna Kumar2 wrote:
> virtqueue_add_buf_gfp returns only -ENOSPC on failure, whether
> direct or indirect descriptors are used, so isn't -ENOSPC
> "expected"?  (vring_add_indirect returns -ENOMEM on memory
> failure, but that is masked out and we fall back to direct
> descriptors, which is the actual failure point.)

Ah, OK, gotchya.

I'm not even sure the fallback to linear makes sense; if we're failing
kmallocs we should probably just return -ENOMEM.  That would mean we
could tell the difference between "out of space" (which should never
happen, since we stop the queue when we have < 2+MAX_SKB_FRAGS slots
left) and this case.

Michael, what do you think?

Thanks,
Rusty.
* Re: [PATCH] virtio_net: Fix queue full check
From: Michael S. Tsirkin @ 2010-11-02 16:17 UTC
To: Rusty Russell; +Cc: Krishna Kumar2, davem, netdev, yvugenfi

On Fri, Oct 29, 2010 at 09:58:40PM +1030, Rusty Russell wrote:
> I'm not even sure the fallback to linear makes sense; if we're failing
> kmallocs we should probably just return -ENOMEM.  That would mean we
> could tell the difference between "out of space" (which should never
> happen, since we stop the queue when we have < 2+MAX_SKB_FRAGS slots
> left) and this case.
>
> Michael, what do you think?

Let's make sure I understand the issue: we use indirect buffers, so we
assume there's still a lot of room in the ring; then the allocation for
the indirect table fails, and so we return -ENOSPC?

So first, I agree it's a bug.  But I am not sure killing the fallback
is such a good idea: recovering from add_buf failure is hard
generally, so we should try to accommodate it if we can.  Let's just
fix the return code for now?

And generally, we should be smarter: as long as the ring is almost
empty and the s/g list is short, it is a waste to use indirect buffers.
BTW, we have had a FIXME there for a long while; I think Yan suggested
increasing that threshold to 3.  Yan?

Further, maybe preallocating some memory for the indirect buffers might
be a good idea.

In short, lots of good ideas; let's start with the minimal patch, which
is a good 2.6.37 candidate too.  How about the following (untested)?

virtio: fix add_buf return code for OOM

add_buf returned ENOSPC on out of memory: this is a bug, as at least
virtio-net expects ENOMEM and handles it specially.  Fix that.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---

diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index 1475ed6..0a89098 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -165,7 +165,7 @@ int virtqueue_add_buf_gfp(struct virtqueue *_vq,
 {
 	struct vring_virtqueue *vq = to_vvq(_vq);
 	unsigned int i, avail, uninitialized_var(prev);
-	int head;
+	int head = -ENOSPC;

 	START_USE(vq);

@@ -191,7 +191,7 @@ int virtqueue_add_buf_gfp(struct virtqueue *_vq,
 		if (out)
 			vq->notify(&vq->vq);
 		END_USE(vq);
-		return -ENOSPC;
+		return head;
 	}

 	/* We're about to use some buffers from the free list. */

--
MST
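For context on where the -ENOMEM that this patch propagates originates:
the indirect descriptor table is itself a runtime allocation.  A sketch
of the 2.6.36-era vring_add_indirect(), abridged and reconstructed from
memory, so details are approximate:

/* Sketch only - not the verbatim kernel source. */
static int vring_add_indirect(struct vring_virtqueue *vq,
			      struct scatterlist sg[],
			      unsigned int out, unsigned int in,
			      gfp_t gfp)
{
	struct vring_desc *desc;

	/* The indirect table is kmalloc'ed per request; under memory
	 * pressure this is the call that fails, which is also why
	 * preallocating it (as suggested above) would help. */
	desc = kmalloc((out + in) * sizeof(struct vring_desc), gfp);
	if (!desc)
		return -ENOMEM;

	/* (fill in the table, link it into the ring as a single
	 * descriptor, and return the head index on success) */
}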
* Re: [PATCH] virtio_net: Fix queue full check
From: Michael S. Tsirkin @ 2010-11-04 12:24 UTC
To: Rusty Russell; +Cc: Krishna Kumar2, davem, netdev, yvugenfi

On Tue, Nov 02, 2010 at 06:17:30PM +0200, Michael S. Tsirkin wrote:
> [...]
>
> Let's make sure I understand the issue: we use indirect buffers, so we
> assume there's still a lot of room in the ring; then the allocation for
> the indirect table fails, and so we return -ENOSPC?
>
> So first, I agree it's a bug.
> But I am not sure killing the fallback is such a good idea: recovering
> from add_buf failure is hard generally, so we should try to accommodate
> it if we can.  Let's just fix the return code for now?
>
> [...]
>
> virtio: fix add_buf return code for OOM
>
> add_buf returned ENOSPC on out of memory: this is a bug, as at least
> virtio-net expects ENOMEM and handles it specially.  Fix that.
>
> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

I thought about this some more.  I think the original code is actually
correct in returning ENOSPC: indirect buffers are nice, but it's a
mistake to rely on them, as a memory allocation might fail.

And if you look at virtio-net, it is dropping packets under memory
pressure, which is not really a happy outcome: the packet will get
freed and reallocated, and we get another one, adding pressure on the
allocator instead of releasing it until we free up some buffers.

So I now think we should calculate the capacity assuming non-indirect
entries; if we manage to use indirect, all the better.

So below is what I propose now - as a replacement for my original
patch.  Krishna Kumar, Rusty, what do you think?

Separately, I'm also considering moving the

	if (vq->num_free < out + in)

check earlier in the function to keep all users honest, but I need to
check what the implications are for e.g. block.  Thoughts on this?

---->

virtio: return correct capacity to users

We can't rely on indirect buffers for capacity calculations because
they need a memory allocation which might fail.  So return the number
of buffers we can guarantee users.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index 1475ed6..cc2f73e 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -230,9 +230,6 @@ add_head:
 	pr_debug("Added buffer head %i to %p\n", head, vq);
 	END_USE(vq);

-	/* If we're indirect, we can fit many (assuming not OOM). */
-	if (vq->indirect)
-		return vq->num_free ? vq->vring.num : 0;
 	return vq->num_free;
 }
 EXPORT_SYMBOL_GPL(virtqueue_add_buf_gfp);
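To see why the value returned by virtqueue_add_buf_gfp() matters, here
is the tail of virtio_net's start_xmit() from that era, abridged and
reconstructed from context (exact code may differ).  The driver stops
the queue based on the reported capacity, so the old optimistic
vq->vring.num return value let it keep queueing packets that would only
fit if the indirect kmalloc kept succeeding:

/* Sketch only - not the verbatim kernel source. */
static netdev_tx_t start_xmit(struct sk_buff *skb, struct net_device *dev)
{
	struct virtnet_info *vi = netdev_priv(dev);
	int capacity;

	/* Reclaim transmitted skbs before queueing new ones. */
	free_old_xmit_skbs(vi);

	/* capacity is whatever add_buf reported was left in the ring. */
	capacity = xmit_skb(vi, skb);
	/* (the capacity < 0 error handling discussed in this thread
	 * sits here) */

	/* Stop the queue before it can overflow: a worst-case skb needs
	 * 2 + MAX_SKB_FRAGS descriptors.  With the optimistic return
	 * value this check almost never fired. */
	if (capacity < 2 + MAX_SKB_FRAGS) {
		netif_stop_queue(dev);
		if (unlikely(!virtqueue_enable_cb(vi->svq))) {
			/* More packets completed meanwhile: reclaim them
			 * and restart the queue if there is room now. */
			capacity += free_old_xmit_skbs(vi);
			if (capacity >= 2 + MAX_SKB_FRAGS) {
				netif_start_queue(dev);
				virtqueue_disable_cb(vi->svq);
			}
		}
	}
	return NETDEV_TX_OK;
}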
* Re: [PATCH] virtio_net: Fix queue full check
From: Krishna Kumar2 @ 2010-11-04 16:17 UTC
To: Michael S. Tsirkin; +Cc: davem, netdev, Rusty Russell, yvugenfi

"Michael S. Tsirkin" <mst@redhat.com> wrote on 11/04/2010 05:54:24 PM:

> I thought about this some more.  I think the original code is actually
> correct in returning ENOSPC: indirect buffers are nice, but it's a
> mistake to rely on them, as a memory allocation might fail.
>
> [...]
>
> So below is what I propose now - as a replacement for my original
> patch.  Krishna Kumar, Rusty, what do you think?
>
> Separately, I'm also considering moving the
>
> 	if (vq->num_free < out + in)
>
> check earlier in the function to keep all users honest, but I need to
> check what the implications are for e.g. block.  Thoughts on this?

This looks like the right thing to do.  Besides this, I think
virtio-net still needs to remove the check for ENOMEM?  I will test
this patch tomorrow.

Another question about add_recvbuf_small and add_recvbuf_big: both
call virtqueue_add_buf_gfp with in+out > 1, and that can fail with
-ENOSPC.  So try_fill_recv gets -ENOSPC.  When that happens, oom is
not set to true; I thought it should have been set.  Is this a bug?

Thanks,

- KK
* Re: [PATCH] virtio_net: Fix queue full check
From: Michael S. Tsirkin @ 2010-11-04 16:45 UTC
To: Krishna Kumar2; +Cc: davem, netdev, Rusty Russell, yvugenfi

On Thu, Nov 04, 2010 at 09:47:04PM +0530, Krishna Kumar2 wrote:
> This looks like the right thing to do.  Besides this, I think
> virtio-net still needs to remove the check for ENOMEM?

Yes, the only valid reason for failure would be an unexpected error.
No need to special-case ENOMEM anymore.

> I will test this patch tomorrow.
>
> Another question about add_recvbuf_small and add_recvbuf_big: both
> call virtqueue_add_buf_gfp with in+out > 1, and that can fail with
> -ENOSPC.  So try_fill_recv gets -ENOSPC.  When that happens, oom is
> not set to true; I thought it should have been set.  Is this a bug?

I don't see a bug: on ENOSPC we don't need to (and can't) add any more
buffers - we know we will make progress, since there must already be
some buffers in the ring.  ENOMEM makes us try again later with more
buffers (and possibly a more aggressive GFP flag).  What's wrong?

--
MST
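The distinction being made here is visible in a sketch of the
2.6.36-era try_fill_recv() loop, abridged and reconstructed from
context (helper names as discussed above; details approximate).  Only
-ENOMEM marks the oom condition that triggers a delayed refill;
-ENOSPC simply ends the loop because the ring is already full of
receive buffers:

/* Sketch only - not the verbatim kernel source. */
static bool try_fill_recv(struct virtnet_info *vi, gfp_t gfp)
{
	int err;
	bool oom = false;

	do {
		if (vi->mergeable_rx_bufs)
			err = add_recvbuf_mergeable(vi, gfp);
		else if (vi->big_packets)
			err = add_recvbuf_big(vi, gfp);
		else
			err = add_recvbuf_small(vi, gfp);

		/* -ENOSPC means the ring already holds buffers, so the
		 * device can make progress; only a failed allocation
		 * needs to be retried later. */
		oom = (err == -ENOMEM);
		if (err < 0)
			break;
		++vi->num;
	} while (err > 0);	/* err is the remaining capacity */

	virtqueue_kick(vi->rvq);
	return !oom;		/* false schedules the refill work */
}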
* Re: [PATCH] virtio_net: Fix queue full check
From: Rusty Russell @ 2010-11-07 23:08 UTC
To: Michael S. Tsirkin; +Cc: Krishna Kumar2, davem, netdev, yvugenfi

On Thu, 4 Nov 2010 10:54:24 pm Michael S. Tsirkin wrote:
> I thought about this some more.  I think the original code is actually
> correct in returning ENOSPC: indirect buffers are nice, but it's a
> mistake to rely on them, as a memory allocation might fail.
>
> [...]
>
> So I now think we should calculate the capacity assuming non-indirect
> entries; if we manage to use indirect, all the better.

I've long said it's a weakness in the network stack that it insists
drivers stop the tx queue before they *might* run out of room, leading
to worst-case assumptions and underutilization of the tx ring.

However, I lost that debate, and so your patch is the way it's supposed
to work.  The other main indirect user (block) doesn't care, as its
queue allows for post-attempt blocking.

I enhanced your commentary a little:

Subject: virtio: return correct capacity to users
Date: Thu, 4 Nov 2010 14:24:24 +0200
From: "Michael S. Tsirkin" <mst@redhat.com>

We can't rely on indirect buffers for capacity calculations because
they need a memory allocation which might fail.  In particular,
virtio_net can get into this situation under stress, and it drops
packets and performs badly.

So return the number of buffers we can guarantee users.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Reported-By: Krishna Kumar2 <krkumar2@in.ibm.com>

Thanks!
Rusty.
* Re: [PATCH] virtio_net: Fix queue full check
From: Krishna Kumar2 @ 2010-11-09 4:26 UTC
To: Rusty Russell; +Cc: davem, Michael S. Tsirkin, netdev, yvugenfi

Rusty Russell <rusty@rustcorp.com.au> wrote on 11/08/2010 04:38:47 AM:

> I enhanced your commentary a little:
>
> Subject: virtio: return correct capacity to users
>
> [...]
>
> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
> Reported-By: Krishna Kumar2 <krkumar2@in.ibm.com>

I have tested this patch for 3-4 hours, and so far I have not got the
tx full error.  I am not sure if "Tested-By" applies to this
situation, but just in case:

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Reported-By: Krishna Kumar2 <krkumar2@in.ibm.com>
Tested-By: Krishna Kumar2 <krkumar2@in.ibm.com>

I think both this patch and the original patch I submitted are
needed?  That patch removes the ENOMEM check and the increment of
dev->stats.tx_fifo_errors, and reports "memory failure".

Thanks,

- KK
* Re: [PATCH] virtio_net: Fix queue full check
From: Michael S. Tsirkin @ 2010-11-09 13:15 UTC
To: Krishna Kumar2; +Cc: Rusty Russell, davem, netdev, yvugenfi

On Tue, Nov 09, 2010 at 09:56:03AM +0530, Krishna Kumar2 wrote:
> [...]
>
> I think both this patch and the original patch I submitted are
> needed?  That patch removes the ENOMEM check and the increment of
> dev->stats.tx_fifo_errors, and reports "memory failure".

So I think your patch on top of this one would be wrong: we actually
make sure out of memory does not affect the TX path at all, so any
error would be unexpected.

Incrementing tx fifo errors is probably also helpful for debugging.

If you like, we could kill the special handling for ENOMEM.
Not sure whether it matters.

--
MST
* Re: [PATCH] virtio_net: Fix queue full check
From: Krishna Kumar2 @ 2010-11-09 15:30 UTC
To: Michael S. Tsirkin; +Cc: davem, netdev, Rusty Russell, yvugenfi

"Michael S. Tsirkin" <mst@redhat.com> wrote on 11/09/2010 06:45:55 PM:

> So I think your patch on top of this one would be wrong: we actually
> make sure out of memory does not affect the TX path at all, so any
> error would be unexpected.
>
> Incrementing tx fifo errors is probably also helpful for debugging.
>
> If you like, we could kill the special handling for ENOMEM.
> Not sure whether it matters.

Since that is dead code, we could remove it (and the fifo error should
disappear too - tx_dropped should be the only counter to be
incremented?).  Sorry if I misunderstood something.

Thanks,

- KK
* Re: [PATCH] virtio_net: Fix queue full check
From: Michael S. Tsirkin @ 2010-11-09 15:30 UTC
To: Krishna Kumar2; +Cc: davem, netdev, Rusty Russell, yvugenfi

On Tue, Nov 09, 2010 at 09:00:58PM +0530, Krishna Kumar2 wrote:
> > So I think your patch on top of this one would be wrong: we actually
> > make sure out of memory does not affect the TX path at all, so any
> > error would be unexpected.
> >
> > Incrementing tx fifo errors is probably also helpful for debugging.
> >
> > If you like, we could kill the special handling for ENOMEM.
> > Not sure whether it matters.
>
> Since that is dead code, we could remove it (and the fifo error should
> disappear too - tx_dropped should be the only counter to be
> incremented?).  Sorry if I misunderstood something.

It's just a sanity check.  The capacity checking is tricky enough that
I'm happier with some way to detect overflow both from ifconfig and
dmesg.  I don't really care which counter gets incremented; since this
is some TX bug, fifo error seems appropriate, but I don't care much.

--
MST
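Putting the thread's conclusion together, the start_xmit() error path
being argued for would look roughly like the following sketch; this is
illustrative only, not the committed code.  With the capacity fix, any
add_buf failure is unexpected, so it is logged with its errno and
counted as a fifo error before the packet is dropped:

	/* Sketch: with the capacity fix, add_buf should never fail
	 * here; any error indicates a bug, so make it visible in both
	 * dmesg (rate-limited warning) and ifconfig (fifo counter). */
	if (unlikely(capacity < 0)) {
		if (net_ratelimit())
			dev_warn(&dev->dev,
				 "Unexpected TX queue failure: %d\n",
				 capacity);
		dev->stats.tx_fifo_errors++;
		dev->stats.tx_dropped++;
		kfree_skb(skb);
		return NETDEV_TX_OK;
	}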