* [RFC] [ver3 PATCH 0/6] Implement multiqueue virtio-net
From: Krishna Kumar @ 2011-11-11 13:02 UTC (permalink / raw)
  To: rusty, mst; +Cc: netdev, kvm, davem, Krishna Kumar, virtualization

This patch series resurrects the earlier multiple TX/RX queue
functionality for virtio_net and addresses the issues pointed out
in earlier reviews.  It also includes an API to share IRQs, e.g.
amongst the TX vqs.

I plan to run TCP/UDP STREAM and RR tests for local->host and
local->remote, and send the results in the next couple of days.


patch #1: Introduce VIRTIO_NET_F_MULTIQUEUE
patch #2: Move 'num_queues' to virtqueue
patch #3: virtio_net driver changes
patch #4: vhost_net changes
patch #5: Implement find_vqs_irq()
patch #6: Convert virtio_net driver to use find_vqs_irq()


		Changes from rev2:
Michael:
-------
1. Added functions to handle setting RX/TX/CTRL vq's.
2. num_queue_pairs instead of numtxqs.
3. Experimental support for fewer IRQs in find_vqs.

Rusty:
------
4. Cleaned up some existing "while (1)".
5. rvq/svq and rx_sg/tx_sg changed to vq and sg respectively.
6. Cleaned up some "#if 1" code.


Issue when using patch5:
-------------------------

The new API is designed to minimize code duplication.  E.g.
vp_find_vqs() is implemented as:

static int vp_find_vqs(...)
{
	return vp_find_vqs_irq(vdev, nvqs, vqs, callbacks, names, NULL);
}
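
The sharing API itself is what patch #5 adds.  As a rough usage sketch
(this is my reading of the interface, not code from the patches: I am
assuming the extra argument is an array assigning each vq to an
interrupt group, with NULL meaning one vector per vq as before), a
driver could group its TX vqs onto a single vector like this:

/*
 * Hypothetical caller of find_vqs_irq() -- the helper name, the
 * config-op placement and the grouping semantics are all assumptions.
 */
static int example_find_vqs(struct virtio_device *vdev,
			    struct virtqueue *vqs[4],
			    vq_callback_t *callbacks[4],
			    const char *names[4])
{
	/* RX vqs get private vectors; both TX vqs share group 2 */
	int irqs[4] = { 0, 2, 1, 2 };

	return vdev->config->find_vqs_irq(vdev, 4, vqs, callbacks,
					  names, irqs);
}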

In my testing, when multiple TX/RX queues are used with multiple netperf
sessions, all the device TX queues stop a few thousand times and are
subsequently woken up by skb_xmit_done.  But after some 40K-50K
stop/wake iterations, some of the TX queues stop and no wake
interrupt ever arrives (modprobe -r followed by modprobe recovers,
so it is not a system hang).  At the time of the hang (#txqs = #rxqs = 4):

# egrep "CPU|virtio0" /proc/interrupts | grep -v config
       CPU0     CPU1     CPU2    CPU3
41:    49057    49262    48828   49421  PCI-MSI-edge    virtio0-input.0
42:    5066     5213     5221    5109   PCI-MSI-edge    virtio0-output.0
43:    43380    43770    43007   43148  PCI-MSI-edge    virtio0-input.1
44:    41433    41727    42101   41175  PCI-MSI-edge    virtio0-input.2
45:    38465    37629    38468   38768  PCI-MSI-edge    virtio0-input.3

# tc -s qdisc show dev eth0
qdisc mq 0: root      
	Sent 393196939897 bytes 271191624 pkt (dropped 59897,
	overlimits 0 requeues 67156) backlog 25375720b 1601p
	requeues 67156  

I am not sure if patch #5 is responsible for the hang.  Separately,
without patches #5 and #6, I changed vp_find_vqs() to:
static int vp_find_vqs(...)
{
	return vp_try_to_find_vqs(vdev, nvqs, vqs, callbacks, names,
				  false, false);
}
No packets were getting transmitted with this change when #txqs > 1.
This is with the MQ-only patch, which doesn't touch the drivers/virtio/
directory.

Also, the MQ patch works reasonably well with only 2 MSI-X vectors,
i.e. with use_msix=1 and per_vq_vectors=0 in vp_find_vqs(), where one
vector handles configuration changes and the other is shared by all
virtqueues.
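
For context, the stop/wake cycle referred to above is virtio_net's
usual TX flow control: the transmit path stops a queue when its vq is
full and re-arms the completion interrupt, and skb_xmit_done() wakes
the queue once the host has consumed buffers; the hang means that
second half never happens.  A rough per-queue sketch (the subqueue
naming, 'qnum' and 'vq_to_txq()' are placeholders of mine, not code
from the patches):

/* transmit path: called when no descriptors are left on this TX vq */
static void example_stop_tx(struct net_device *dev, struct virtqueue *vq,
			    int qnum)
{
	netif_stop_subqueue(dev, qnum);	/* the txq "stops" here */
	virtqueue_enable_cb(vq);	/* ask for a completion interrupt */
}

/* TX completion callback (registered via find_vqs) */
static void skb_xmit_done(struct virtqueue *vq)
{
	struct virtnet_info *vi = vq->vdev->priv;

	virtqueue_disable_cb(vq);	/* suppress further interrupts */
	netif_wake_subqueue(vi->dev, vq_to_txq(vq));	/* ...and "wakes" here */
}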

Patch against net-next - please review.

Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com>
---



* Re: [RFC] [ver3 PATCH 0/6] Implement multiqueue virtio-net
From: Sasha Levin @ 2011-11-11 22:02 UTC (permalink / raw)
  To: Krishna Kumar; +Cc: kvm, mst, netdev, virtualization, davem

Hi,

I'm seeing this BUG() sometimes when running it with a small patch I
did for the KVM tool:

[    1.280766] BUG: unable to handle kernel NULL pointer dereference at 0000000000000010
[    1.281531] IP: [<ffffffff810b3ac7>] free_percpu+0x9a/0x104
[    1.281531] PGD 0
[    1.281531] Oops: 0000 [#1] PREEMPT SMP
[    1.281531] CPU 0
[    1.281531] Pid: 1, comm: swapper Not tainted 3.1.0-sasha-19665-gef3d2b7 #39
[    1.281531] RIP: 0010:[<ffffffff810b3ac7>]  [<ffffffff810b3ac7>] free_percpu+0x9a/0x104
[    1.281531] RSP: 0018:ffff88001383fd50  EFLAGS: 00010046
[    1.281531] RAX: 0000000000000000 RBX: 0000000000000282 RCX: 00000000000f4400
[    1.281531] RDX: 00003ffffffff000 RSI: ffff880000000240 RDI: 0000000001c06063
[    1.281531] RBP: ffff880013fcb7c0 R08: ffffea00004e30c0 R09: ffffffff8138ba64
[    1.281531] R10: 0000000000001880 R11: 0000000000001880 R12: ffff881213c00000
[    1.281531] R13: ffff8800138c0e00 R14: 0000000000000010 R15: ffff8800138c0d00
[    1.281531] FS:  0000000000000000(0000) GS:ffff880013c00000(0000) knlGS:0000000000000000
[    1.281531] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[    1.281531] CR2: 0000000000000010 CR3: 0000000001c05000 CR4: 00000000000406f0
[    1.281531] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[    1.281531] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[    1.281531] Process swapper (pid: 1, threadinfo ffff88001383e000, task ffff880013848000)
[    1.281531] Stack:
[    1.281531]  ffff880013846ec0 0000000000000000 0000000000000000 ffffffff8138a0e5
[    1.281531]  ffff880013846ec0 ffff880013846800 ffff880013b6c000 ffffffff8138bb63
[    1.281531]  0000000000000011 000000000000000f ffff8800fffffff0 0000000181239bcd
[    1.281531] Call Trace:
[    1.281531]  [<ffffffff8138a0e5>] ? free_rq_sq+0x2c/0xce
[    1.281531]  [<ffffffff8138bb63>] ? virtnet_probe+0x81c/0x855
[    1.281531]  [<ffffffff8129c9e7>] ? virtio_dev_probe+0xa7/0xc6
[    1.281531]  [<ffffffff8134d2c3>] ? driver_probe_device+0xb2/0x142
[    1.281531]  [<ffffffff8134d3a2>] ? __driver_attach+0x4f/0x6f
[    1.281531]  [<ffffffff8134d353>] ? driver_probe_device+0x142/0x142
[    1.281531]  [<ffffffff8134c3ab>] ? bus_for_each_dev+0x47/0x72
[    1.281531]  [<ffffffff8134c90d>] ? bus_add_driver+0xa2/0x1e6
[    1.281531]  [<ffffffff81cc1b36>] ? tun_init+0x89/0x89
[    1.281531]  [<ffffffff8134db59>] ? driver_register+0x8d/0xf8
[    1.281531]  [<ffffffff81cc1b36>] ? tun_init+0x89/0x89
[    1.281531]  [<ffffffff81c98ac1>] ? do_one_initcall+0x78/0x130
[    1.281531]  [<ffffffff81c98c0e>] ? kernel_init+0x95/0x113
[    1.281531]  [<ffffffff81658274>] ? kernel_thread_helper+0x4/0x10
[    1.281531]  [<ffffffff81c98b79>] ? do_one_initcall+0x130/0x130
[    1.281531]  [<ffffffff81658270>] ? gs_change+0x13/0x13
[    1.281531] Code: c2 85 d2 48 0f 45 2d d1 39 ce 00 eb 22 65 8b 14 25 90 cc 00 00 48 8b 05 f0 a6 bc 00 48 63 d2 4c 89 e7 48 03 3c d0 e8 83 dd 00 00
[    1.281531]  8b 68 10 44 89 e6 48 89 ef 2b 75 18 e8 e4 f1 ff ff 8b 05 fd
[    1.281531] RIP  [<ffffffff810b3ac7>] free_percpu+0x9a/0x104
[    1.281531]  RSP <ffff88001383fd50>
[    1.281531] CR2: 0000000000000010
[    1.281531] ---[ end trace 68cbc23dfe2fe62a ]---

I don't have time today to dig into it, sorry.

On Fri, 2011-11-11 at 18:32 +0530, Krishna Kumar wrote:
> [snip]

-- 

Sasha.


* Re: [RFC] [ver3 PATCH 0/6] Implement multiqueue virtio-net
From: Krishna Kumar @ 2011-11-12  5:45 UTC (permalink / raw)
  To: levinsasha928
  Cc: kvm, mst, netdev, rusty, virtualization, davem, Krishna Kumar

Sasha Levin <levinsasha928@gmail.com> wrote on 11/12/2011 03:32:04 AM:

> I'm seeing this BUG() sometimes when running it using a small patch I
> did for KVM tool:
> 
> [    1.281531] Call Trace:
> [    1.281531]  [<ffffffff8138a0e5>] ? free_rq_sq+0x2c/0xce
> [    1.281531]  [<ffffffff8138bb63>] ? virtnet_probe+0x81c/0x855
> [    1.281531]  [<ffffffff8129c9e7>] ? virtio_dev_probe+0xa7/0xc6
> [    1.281531]  [<ffffffff8134d2c3>] ? driver_probe_device+0xb2/0x142
> [    1.281531]  [<ffffffff8134d3a2>] ? __driver_attach+0x4f/0x6f
> [    1.281531]  [<ffffffff8134d353>] ? driver_probe_device+0x142/0x142
> [    1.281531]  [<ffffffff8134c3ab>] ? bus_for_each_dev+0x47/0x72
> [    1.281531]  [<ffffffff8134c90d>] ? bus_add_driver+0xa2/0x1e6
> [    1.281531]  [<ffffffff81cc1b36>] ? tun_init+0x89/0x89
> [    1.281531]  [<ffffffff8134db59>] ? driver_register+0x8d/0xf8
> [    1.281531]  [<ffffffff81cc1b36>] ? tun_init+0x89/0x89
> [    1.281531]  [<ffffffff81c98ac1>] ? do_one_initcall+0x78/0x130
> [    1.281531]  [<ffffffff81c98c0e>] ? kernel_init+0x95/0x113
> [    1.281531]  [<ffffffff81658274>] ? kernel_thread_helper+0x4/0x10
> [    1.281531]  [<ffffffff81c98b79>] ? do_one_initcall+0x130/0x130
> [    1.281531]  [<ffffffff81658270>] ? gs_change+0x13/0x13
> [    1.281531] Code: c2 85 d2 48 0f 45 2d d1 39 ce 00 eb 22 65 8b 14 25
> 90 cc 00 00 48 8b 05 f0 a6 bc 00 48 63 d2 4c 89 e7 48 03 3c d0 e8 83 dd
> 00 00 
> [    1.281531]  8b 68 10 44 89 e6 48 89 ef 2b 75 18 e8 e4 f1 ff ff 8b 05
> fd 
> [    1.281531] RIP  [<ffffffff810b3ac7>] free_percpu+0x9a/0x104
> [    1.281531]  RSP <ffff88001383fd50>
> [    1.281531] CR2: 0000000000000010
> [    1.281531] ---[ end trace 68cbc23dfe2fe62a ]---
> 
> I don't have time today to dig into it, sorry.

Thanks for the report.

free_rq_sq() was being called twice in the failure path; the second
call panicked because the same pointers had already been freed.

1. free_rq_sq() was being called twice in the failure path:
   virtnet_setup_vqs() had already freed rq/sq on error, and
   virtnet_probe() tried to do it again. Fix it in virtnet_probe()
   by moving the call up.
2. Make free_rq_sq() re-entrant by setting freed pointers to NULL
   (see the sketch below).
3. Remove free_stats(), as it had only a single caller.
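
Point 2 is the standard free-and-NULL idiom, which makes a cleanup
helper safe to run more than once.  As a minimal standalone sketch
(the struct and field names here are placeholders, not the driver's
actual ones):

static void example_free_queues(struct example_info *vi)
{
	int i;

	if (!vi->rq)			/* already freed by an earlier call */
		return;

	for (i = 0; i < vi->num_queue_pairs; i++) {
		kfree(vi->rq[i]);
		vi->rq[i] = NULL;	/* kfree(NULL) is harmless if rerun */
	}

	kfree(vi->rq);
	vi->rq = NULL;			/* a second call now returns early */
}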

Sasha, could you please try this patch on top of existing patches?

thanks!

Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com>
---
 drivers/net/virtio_net.c |   41 +++++++++++--------------------------
 1 file changed, 13 insertions(+), 28 deletions(-)

diff -ruNp n6/drivers/net/virtio_net.c n7/drivers/net/virtio_net.c
--- n6/drivers/net/virtio_net.c	2011-11-12 11:03:48.000000000 +0530
+++ n7/drivers/net/virtio_net.c	2011-11-12 10:39:28.000000000 +0530
@@ -782,23 +782,6 @@ static void virtnet_netpoll(struct net_d
 }
 #endif
 
-static void free_stats(struct virtnet_info *vi)
-{
-	int i;
-
-	for (i = 0; i < vi->num_queue_pairs; i++) {
-		if (vi->sq && vi->sq[i]) {
-			free_percpu(vi->sq[i]->stats);
-			vi->sq[i]->stats = NULL;
-		}
-
-		if (vi->rq && vi->rq[i]) {
-			free_percpu(vi->rq[i]->stats);
-			vi->rq[i]->stats = NULL;
-		}
-	}
-}
-
 static int virtnet_open(struct net_device *dev)
 {
 	struct virtnet_info *vi = netdev_priv(dev);
@@ -1054,19 +1037,22 @@ static void free_rq_sq(struct virtnet_in
 {
 	int i;
 
-	free_stats(vi);
-
-	if (vi->rq) {
-		for (i = 0; i < vi->num_queue_pairs; i++)
+	for (i = 0; i < vi->num_queue_pairs; i++) {
+		if (vi->rq && vi->rq[i]) {
+			free_percpu(vi->rq[i]->stats);
 			kfree(vi->rq[i]);
-		kfree(vi->rq);
-	}
+			vi->rq[i] = NULL;
+		}
 
-	if (vi->sq) {
-		for (i = 0; i < vi->num_queue_pairs; i++)
+		if (vi->sq && vi->sq[i]) {
+			free_percpu(vi->sq[i]->stats);
 			kfree(vi->sq[i]);
-		kfree(vi->sq);
+			vi->sq[i] = NULL;
+		}
 	}
+
+	kfree(vi->rq);
+	kfree(vi->sq);
 }
 
 static void free_unused_bufs(struct virtnet_info *vi)
@@ -1387,10 +1373,9 @@ free_vqs:
 	for (i = 0; i < num_queue_pairs; i++)
 		cancel_delayed_work_sync(&vi->rq[i]->refill);
 	vdev->config->del_vqs(vdev);
-
-free_netdev:
 	free_rq_sq(vi);
 
+free_netdev:
 	free_netdev(dev);
 	return err;
 }



* Re: [RFC] [ver3 PATCH 0/6] Implement multiqueue virtio-net
From: Sasha Levin @ 2011-11-12  7:20 UTC (permalink / raw)
  To: Krishna Kumar; +Cc: kvm, mst, netdev, rusty, virtualization, davem

On Sat, 2011-11-12 at 11:15 +0530, Krishna Kumar wrote:
> Sasha Levin <levinsasha928@gmail.com> wrote on 11/12/2011 03:32:04 AM:
> 
> > I'm seeing this BUG() sometimes when running it using a small patch I
> > did for KVM tool:
> > 
> > [snip]
> > 
> > I don't have time today to dig into it, sorry.
> 
> Thanks for the report.
> 
> free_rq_sq() was being called twice in the failure path. The second
> call panic'd since it had freed the same pointers earlier.
> 
> 1. free_rq_sq() was being called twice in the failure path.
>    virtnet_setup_vqs() had already freed up rq/sq on error, and
>    virtnet_probe() tried to do it again. Fix it in virtnet_probe
>    by moving the call up.
> 2. Make free_rq_sq() re-entrant by setting freed pointers to NULL.
> 3. Remove free_stats() as it was being called only once.
> 
> Sasha, could you please try this patch on top of existing patches?
> 
> thanks!
> 
> Signed-off-by: krkumar2@in.ibm.com
> ---
[snip]

I've tested it and it looks good, no BUG() or panic.

I would suggest adding some output if we take the failure path: while
the guest does boot peacefully, it ends up with no network interface,
and there is nothing in the boot log to indicate that anything went
wrong.
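
Something along these lines would leave a hint in the boot log; the
exact wording is just a suggestion, and 'free_vqs' is the error label
already present in virtnet_probe():

	/* e.g. at the top of the free_vqs: error path in virtnet_probe() */
	dev_err(&vdev->dev,
		"virtio_net: probe failed with error %d; no network interface will be registered\n",
		err);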

-- 

Sasha.


* Re: [RFC] [ver3 PATCH 0/6] Implement multiqueue virtio-net
From: Michael S. Tsirkin @ 2011-11-13 11:40 UTC (permalink / raw)
  To: Krishna Kumar; +Cc: kvm, netdev, virtualization, Ben Hutchings, davem

On Fri, Nov 11, 2011 at 06:32:23PM +0530, Krishna Kumar wrote:
> This patch series resurrects the earlier multiple TX/RX queues
> functionality for virtio_net, and addresses the issues pointed
> out.

Some general questions/issues with the approach this patchset takes:
1. Lack of host-guest synchronization for the flow hash.
   On the host side, things will scale if the same vhost thread
   handles both transmit and receive for a specific flow.  Further,
   things will scale if packets from distinct guest queues get routed
   to distinct queues on the NIC and tap devices in the host.  It
   seems that to achieve both, host and guest need to pass the flow
   hash information to each other.  Ben Hutchings suggested
   effectively pushing the guest's RFS socket map out to the host.
   Any thoughts on this?  (A rough guest-side illustration follows
   after this list.)
2. Reduced batching/increased number of exits.
   It's easy to see that the amount of work per VQ is reduced with
   this patch.  Thus it's easy to imagine that under some workloads,
   where we previously had X packets per VM exit/interrupt, we'll now
   have X/N, with N the number of virtqueues.  For example, a burst
   that used to complete 32 packets per TX interrupt would, spread
   evenly over 4 queues, complete only about 8 per interrupt, i.e.
   roughly four times as many interrupts and exits for the same
   traffic.  Since both a VM exit and an interrupt are expensive
   operations, one wonders whether this can lead to performance
   regressions.  It seems that to reduce the chance of that, some
   adaptive strategy would work better.  But how would we ensure
   packets aren't reordered then?  Any thoughts?
3. Lack of userspace resource control.
   A vhost-net device already uses quite a lot of resources, and this
   patch seems to make the problem worse.  At the moment, management
   can control that to some extent by using a file descriptor per
   virtio device.  So using a file descriptor per VQ has the advantage
   of limiting the amount of resources qemu can consume.  In April,
   Jason posted a qemu patch that supported a multiqueue guest through
   the existing vhost interfaces, by opening multiple devices, one per
   queue.  It seems that this can be improved upon if we allow e.g.
   sharing of memory maps between file descriptors; that might also
   make adaptive queueing strategies possible.  Would it be possible
   to do this instead?
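
To make point 1 concrete, the guest side of such a scheme would be a
queue-selection hook keyed on the flow hash, roughly like the
illustration below (this is not from the patchset; skb_get_rxhash()
is simply the flow hash the stack already computes, and nothing here
makes the host use the same mapping, which is exactly the open
question above):

static u16 example_select_queue(struct net_device *dev, struct sk_buff *skb)
{
	u32 hash = skb_get_rxhash(skb);	/* per-flow hash from the stack */

	/* keep a flow on one TX queue; the host would need the same
	 * hash to steer the matching vhost/NIC queue work */
	return hash % dev->real_num_tx_queues;
}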

>  It also includes an API to share irq's, f.e.  amongst the
> TX vqs. 
> I plan to run TCP/UDP STREAM and RR tests for local->host and
> local->remote, and send the results in the next couple of days.

Please do. Small message throughput would be especially interesting.

> [snip]

