netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC PATCH] Regression in linux 2.6.32 virtio_net seen with vhost-net
@ 2009-12-08 22:50 Sridhar Samudrala
  2009-12-13 12:25 ` Herbert Xu
  0 siblings, 1 reply; 58+ messages in thread
From: Sridhar Samudrala @ 2009-12-08 22:50 UTC (permalink / raw)
  To: rusty, mst; +Cc: netdev

When testing vhost-net patches with a guest running linux 2.6.32, i am
running into "Unexpected full queue" warning messages from start_xmit() in
virtio_net.c causing a lot of requeues and a drastic reduction in throughput.

With a guest running 2.6.31, i get guest to host throughput around 7000Mb/s,
but it drops to around 3200Mb/s with 2.6.32.

I don't see this problem with usermode virtio in qemu, but i get a thruput of
only 2700Mb/s with both 2.6.31 and 2.6.32.

The following patch fixes this problem by dropping the skb and not requeuing
to qdisc when it cannot be added to ring buffer. With this patch, i see
similar performance as 2.6.31 with vhost-net.

Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index b9e002f..307cfd6 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -528,7 +528,11 @@ again:
 			netif_start_queue(dev);
 			goto again;
 		}
-		return NETDEV_TX_BUSY;
+
+		/* drop the skb under stress. */ 
+ 		vi->dev->stats.tx_dropped++;
+		kfree_skb(skb);
+		return NETDEV_TX_OK;
 	}
 	vi->svq->vq_ops->kick(vi->svq);
 

Thanks
Sridhar






^ permalink raw reply related	[flat|nested] 58+ messages in thread
* Re: [RFC PATCH] Regression in linux 2.6.32 virtio_net seen with vhost-net
@ 2009-12-17 11:20 Krishna Kumar
  2009-12-17 19:28 ` Jarek Poplawski
  0 siblings, 1 reply; 58+ messages in thread
From: Krishna Kumar @ 2009-12-17 11:20 UTC (permalink / raw)
  To: davem; +Cc: herbert, mst, netdev, rusty, Krishna Kumar, sri

> On Thu, Dec 17, 2009, Krishna Kumar2/India/IBM@IBMIN wrote on
>
> Subject: Re: [RFC PATCH] Regression in linux 2.6.32 virtio_net seen with vhost-net
>
> > >> I think sch_direct_xmit() is not even calling dev_hard_start_xmit() as
> > >> the tx queue is stopped
> > >> and does a dev_requeue_skb() and returns NETDEV_TX_BUSY.
> > >
> > > Yes but if the queue was stopped then we shouldn't even get into
> > > sch_direct_xmit.
> > I don't see any checks for txq_stopped in the callers of
> sch_direct_xmit() :
> > __dev_xmit_skb() and qdisc_restart().  Both these routines get the txq
> > and call
> > sch_direct_xmit() which checks if tx queue is stopped or frozen.
> >
> > Am i missing something?
> 
> Yes - dequeue_skb.
> 
> The final skb, before the queue was stopped, is transmitted by
> the driver. The next time sch_direct_xmit is called, it gets a
> skb and finds the device is stopped and requeue's the skb. For
> all subsequent xmits, dequeue_skb returns NULL (and the other
> caller - __dev_xmit_skb can never be called since qdisc_qlen is
> true) and thus requeue's will not happen. This also means that
> the number of requeues you see (eg 283K in one run) is the number
> of times the queue was stopped and restarted. So it looks like
> driver either:
> 
> 1. didn't stop the queue when xmiting a packet successfully (the
>       condition being that it would not be possible to xmit the
>       next skb). But this doesn't seem to be the case.
> 2. wrongly restarted the queue. Possible - since a few places
>       use both the start & wake queue api's.

[ Resending, since I sent to wrong id last time - sorry for
some duplicates ]

On a (slightly) related note, qdisc_restart() has this code:
	/* Dequeue packet */
	skb = dequeue_skb(q);
	if (unlikely(!skb))
		return 0;

When a txq is stopped, all subsequent dev_queue_xmits will
execute this path, pass the "unlikely" code, and return. Is
it reasonable to remove "unlikely" in both dequeue_skb and
qdisc_restart, if so patch inlined below:

thanks,

- KK

From: Krishna Kumar <krkumar2@in.ibm.com>

1. Remove two unlikely checks since stopped queue's will result
	int getting a NULL skb.
2. Remove an extra space after unlikely check.

Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com>
---
 net/sched/sch_generic.c |    6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff -ruNp org/net/sched/sch_generic.c new/net/sched/sch_generic.c
--- org/net/sched/sch_generic.c	2009-12-17 16:29:59.000000000 +0530
+++ new/net/sched/sch_generic.c	2009-12-17 16:30:55.000000000 +0530
@@ -51,7 +51,7 @@ static inline struct sk_buff *dequeue_sk
 {
 	struct sk_buff *skb = q->gso_skb;
 
-	if (unlikely(skb)) {
+	if (skb) {
 		struct net_device *dev = qdisc_dev(q);
 		struct netdev_queue *txq;
 
@@ -134,7 +134,7 @@ int sch_direct_xmit(struct sk_buff *skb,
 		ret = handle_dev_cpu_collision(skb, txq, q);
 	} else {
 		/* Driver returned NETDEV_TX_BUSY - requeue skb */
-		if (unlikely (ret != NETDEV_TX_BUSY && net_ratelimit()))
+		if (unlikely(ret != NETDEV_TX_BUSY && net_ratelimit()))
 			printk(KERN_WARNING "BUG %s code %d qlen %d\n",
 			       dev->name, ret, q->q.qlen);
 
@@ -176,7 +176,7 @@ static inline int qdisc_restart(struct Q
 
 	/* Dequeue packet */
 	skb = dequeue_skb(q);
-	if (unlikely(!skb))
+	if (!skb)
 		return 0;
 
 	root_lock = qdisc_lock(q);

^ permalink raw reply	[flat|nested] 58+ messages in thread
[parent not found: <20091217111219.9809.27432.sendpatchset@krkumar2.in.ibm.com>]

end of thread, other threads:[~2009-12-18 19:13 UTC | newest]

Thread overview: 58+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2009-12-08 22:50 [RFC PATCH] Regression in linux 2.6.32 virtio_net seen with vhost-net Sridhar Samudrala
2009-12-13 12:25 ` Herbert Xu
2009-12-13 23:40   ` Michael S. Tsirkin
2009-12-15 14:42     ` Herbert Xu
2009-12-15 16:26       ` Sridhar Samudrala
2009-12-16  1:21         ` Herbert Xu
2009-12-15 23:32       ` Michael S. Tsirkin
2009-12-16  1:58         ` Herbert Xu
2009-12-16  4:37         ` Rusty Russell
2009-12-16 10:37           ` Michael S. Tsirkin
2009-12-16  2:41   ` Rusty Russell
2009-12-16  2:53     ` Herbert Xu
2009-12-16 12:45       ` Rusty Russell
2009-12-16 13:22         ` Michael S. Tsirkin
2009-12-16 13:35           ` Herbert Xu
2009-12-16 13:38             ` Michael S. Tsirkin
2009-12-16 13:48               ` Herbert Xu
2009-12-17  2:02           ` Rusty Russell
2009-12-17  9:25             ` Michael S. Tsirkin
2009-12-18  1:55               ` Rusty Russell
2009-12-16 13:30         ` Herbert Xu
2009-12-17  1:43         ` Sridhar Samudrala
2009-12-17  3:12           ` Herbert Xu
2009-12-17  5:02             ` Sridhar Samudrala
2009-12-17  3:15           ` Herbert Xu
2009-12-17  5:05             ` Sridhar Samudrala
2009-12-17  6:28               ` Herbert Xu
2009-12-17  6:45                 ` Sridhar Samudrala
2009-12-17 10:03                   ` Krishna Kumar2
2009-12-17 11:27                     ` Jarek Poplawski
2009-12-17 11:45                       ` Herbert Xu
2009-12-17 11:49                         ` Herbert Xu
2009-12-17 12:08                           ` Herbert Xu
2009-12-17 12:27                             ` Krishna Kumar2
2009-12-17 12:42                               ` Jarek Poplawski
2009-12-17 12:56                                 ` Herbert Xu
2009-12-17 13:22                                   ` Krishna Kumar2
2009-12-17 13:04                                 ` Krishna Kumar2
2009-12-17 13:44                               ` Herbert Xu
2009-12-17 14:35                                 ` Krishna Kumar2
2009-12-17 14:36                                   ` Herbert Xu
2009-12-17 21:50                                     ` Sridhar Samudrala
2009-12-17 22:28                                       ` Sridhar Samudrala
2009-12-17 22:41                                         ` Jarek Poplawski
2009-12-18 13:46                                       ` Krishna Kumar2
2009-12-18 19:13                                       ` Sridhar Samudrala
2009-12-17 11:59                         ` Krishna Kumar2
2009-12-17 12:19                         ` Jarek Poplawski
2009-12-17 11:56                       ` Krishna Kumar2
2009-12-17 13:17                         ` Jarek Poplawski
2009-12-17 14:10                           ` Krishna Kumar2
2009-12-17 14:16                             ` Herbert Xu
2009-12-16 17:42     ` Sridhar Samudrala
  -- strict thread matches above, loose matches on Subject: below --
2009-12-17 11:20 Krishna Kumar
2009-12-17 19:28 ` Jarek Poplawski
     [not found] <20091217111219.9809.27432.sendpatchset@krkumar2.in.ibm.com>
     [not found] ` <20091217123153.GA31131@gondor.apana.org.au>
2009-12-17 12:56   ` Krishna Kumar2
2009-12-17 13:40     ` Herbert Xu
2009-12-17 13:56       ` Krishna Kumar2

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).