* virtio_net and SMP guests
@ 2007-12-14 12:12 Christian Borntraeger
[not found] ` <200712141312.05562.borntraeger-tA70FqPdS9bQT0dZR+AlfA@public.gmane.org>
0 siblings, 1 reply; 11+ messages in thread
From: Christian Borntraeger @ 2007-12-14 12:12 UTC (permalink / raw)
To: Rusty Russell, Dor Laor, Anthony Liguori; +Cc: kvm-devel, virtualization
Rusty, Anthony, Dor,
I need your brain power :-)
On smp guests I have seen a problem with virtio (the version in curent Avi's
git) which do not occur on single processor guests:
kernel BUG at /space/kvm/drivers/virtio/virtio_ring.c:228!
illegal operation: 0001 [#1]
Modules linked in: ipv6
CPU: 2 Not tainted
Process swapper (pid: 0, task: 000000000f83e038, ksp: 000000000f877d70)
Krnl PSW : 0704000180000000 000000000045df2a (vring_restart+0x5a/0x70)
R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:0 CC:0 PM:0 EA:3
Krnl GPRS: 00000000c0a80101 0000000000000000 000000000eb35000 0000000010005800
000000000045ded0 000000000000192f 000000000eb21000 000000000eb21000
000000000000000e 000000000eb21900 000000000eb21920 000000000f867cb8
0700000000d9b058 0000000000000010 000000000045c06a 000000000f867cb8
Krnl Code: 000000000045df1e: e3b0b0700004 lg %r11,112(%r11)
000000000045df24: 07fe bcr 15,%r14
000000000045df26: a7f40001 brc 15,45df28
>000000000045df2a: a7f4ffe1 brc 15,45deec
000000000045df2e: e31020300004 lg %r1,48(%r2)
000000000045df34: a7480000 lhi %r4,0
000000000045df38: 96011001 oi 1(%r1),1
000000000045df3c: a7f4ffef brc 15,45df1a
Call Trace:
([<000000000045c016>] virtnet_poll+0x96/0x42c)
[<000000000048cda2>] net_rx_action+0xca/0x150
[<0000000000137f7a>] __do_softirq+0x9e/0x130
[<00000000001105d6>] do_softirq+0xae/0xb4
[<0000000000138182>] irq_exit+0x96/0x9c
[<000000000010d710>] do_extint+0xcc/0xf8
[<00000000001135d0>] ext_no_vtime+0x16/0x1a
[<000000000010a57e>] cpu_idle+0x216/0x238
I think there is a valid code path, triggering this bug:
CPU1 CPU2
----------------------- -----------------------
- virtnet_poll found no
more packets on queue
- netif_rx_complete allow
poll to be called
- vq_ops->restart is called
- vq Interrupts are enabled
. <new packets arrive>
<vcpu is scheduled away>
. - interrupt is delivered
. - poll is called
. - poll work is done
. - netif_rx_complete
. - vq_ops->restart is called
. - check if vq interrupts are
. enable --> BUG
The first idea was to remove this check? (See patch below). I am not sure
if the proper fix also requires to change vring.avail->flags to be only
changed by atomic bitops. Any ideas, comments?
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
CC: Anthony Liguori <aliguori@us.ibm.com>
CC: Dor Laor <dor.laor@qumranet.com>
CC: Rusty Russell <rusty@rustcorp.com.au>
---
drivers/virtio/virtio_ring.c | 2 --
1 file changed, 2 deletions(-)
Index: kvm/drivers/virtio/virtio_ring.c
===================================================================
--- kvm.orig/drivers/virtio/virtio_ring.c
+++ kvm/drivers/virtio/virtio_ring.c
@@ -225,8 +225,6 @@ static bool vring_restart(struct virtque
struct vring_virtqueue *vq = to_vvq(_vq);
START_USE(vq);
- BUG_ON(!(vq->vring.avail->flags & VRING_AVAIL_F_NO_INTERRUPT));
-
/* We optimistically turn back on interrupts, then check if there was
* more to do. */
vq->vring.avail->flags &= ~VRING_AVAIL_F_NO_INTERRUPT;
^ permalink raw reply [flat|nested] 11+ messages in thread[parent not found: <200712141312.05562.borntraeger-tA70FqPdS9bQT0dZR+AlfA@public.gmane.org>]
* Re: virtio_net and SMP guests [not found] ` <200712141312.05562.borntraeger-tA70FqPdS9bQT0dZR+AlfA@public.gmane.org> @ 2007-12-16 11:55 ` Dor Laor 2007-12-18 6:51 ` Rusty Russell 1 sibling, 0 replies; 11+ messages in thread From: Dor Laor @ 2007-12-16 11:55 UTC (permalink / raw) To: Christian Borntraeger Cc: kvm-devel, virtualization-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA Christian Borntraeger wrote: > Rusty, Anthony, Dor, > > I need your brain power :-) > > On smp guests I have seen a problem with virtio (the version in curent Avi's > git) which do not occur on single processor guests: > > kernel BUG at /space/kvm/drivers/virtio/virtio_ring.c:228! > illegal operation: 0001 [#1] > Modules linked in: ipv6 > CPU: 2 Not tainted > Process swapper (pid: 0, task: 000000000f83e038, ksp: 000000000f877d70) > Krnl PSW : 0704000180000000 000000000045df2a (vring_restart+0x5a/0x70) > R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:0 CC:0 PM:0 EA:3 > Krnl GPRS: 00000000c0a80101 0000000000000000 000000000eb35000 0000000010005800 > 000000000045ded0 000000000000192f 000000000eb21000 000000000eb21000 > 000000000000000e 000000000eb21900 000000000eb21920 000000000f867cb8 > 0700000000d9b058 0000000000000010 000000000045c06a 000000000f867cb8 > Krnl Code: 000000000045df1e: e3b0b0700004 lg %r11,112(%r11) > 000000000045df24: 07fe bcr 15,%r14 > 000000000045df26: a7f40001 brc 15,45df28 > >000000000045df2a: a7f4ffe1 brc 15,45deec > 000000000045df2e: e31020300004 lg %r1,48(%r2) > 000000000045df34: a7480000 lhi %r4,0 > 000000000045df38: 96011001 oi 1(%r1),1 > 000000000045df3c: a7f4ffef brc 15,45df1a > Call Trace: > ([<000000000045c016>] virtnet_poll+0x96/0x42c) > [<000000000048cda2>] net_rx_action+0xca/0x150 > [<0000000000137f7a>] __do_softirq+0x9e/0x130 > [<00000000001105d6>] do_softirq+0xae/0xb4 > [<0000000000138182>] irq_exit+0x96/0x9c > [<000000000010d710>] do_extint+0xcc/0xf8 > [<00000000001135d0>] ext_no_vtime+0x16/0x1a > [<000000000010a57e>] cpu_idle+0x216/0x238 > > > I think there is a valid code path, triggering this bug: > > CPU1 CPU2 > ----------------------- ----------------------- > - virtnet_poll found no > more packets on queue > - netif_rx_complete allow > poll to be called > - vq_ops->restart is called > - vq Interrupts are enabled > . <new packets arrive> > <vcpu is scheduled away> > . - interrupt is delivered > . - poll is called > . - poll work is done > . - netif_rx_complete > . - vq_ops->restart is called > . - check if vq interrupts are > . enable --> BUG > > I didn't understand how its possible: <vcpu is scheduled away> . - interrupt is delivered -vring_interrupt is called -> - skb_recv_done callback return false -> vq->vring.avail->flags |= VRING_AVAIL_F_NO_INTERRUPT; So when the restart callback will be called the BUG_ON(!(vq->vring.avail->flags & VRING_AVAIL_F_NO_INTERRUPT)); will not be issued. . - poll is called . - poll work is done . - netif_rx_complete . - vq_ops->restart is called . - check if vq interrupts are . enable --> BUG > The first idea was to remove this check? (See patch below). I am not sure > if the proper fix also requires to change vring.avail->flags to be only > changed by atomic bitops. Any ideas, comments? > As for now no harm can be done since it is only used in two place: 1. vring_restart inside a napi poll calback which is protected by napi poll lock 2. vring_interrupt in the interrupt handler. While only the VRING_AVAIL_F_NO_INTERRUPT bit is touched there is no possible harm, once we'll use more bits it might be an issue. So Maybe we should use atomics. > Signed-off-by: Christian Borntraeger <borntraeger-tA70FqPdS9bQT0dZR+AlfA@public.gmane.org> > CC: Anthony Liguori <aliguori-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org> > CC: Dor Laor <dor.laor-atKUWr5tajBWk0Htik3J/w@public.gmane.org> > CC: Rusty Russell <rusty-8n+1lVoiYb80n/F98K4Iww@public.gmane.org> > > --- > drivers/virtio/virtio_ring.c | 2 -- > 1 file changed, 2 deletions(-) > > Index: kvm/drivers/virtio/virtio_ring.c > =================================================================== > --- kvm.orig/drivers/virtio/virtio_ring.c > +++ kvm/drivers/virtio/virtio_ring.c > @@ -225,8 +225,6 @@ static bool vring_restart(struct virtque > struct vring_virtqueue *vq = to_vvq(_vq); > > START_USE(vq); > - BUG_ON(!(vq->vring.avail->flags & VRING_AVAIL_F_NO_INTERRUPT)); > - > /* We optimistically turn back on interrupts, then check if there was > * more to do. */ > vq->vring.avail->flags &= ~VRING_AVAIL_F_NO_INTERRUPT; > > ------------------------------------------------------------------------- SF.Net email is sponsored by: Check out the new SourceForge.net Marketplace. It's the best place to buy or sell services for just about anything Open Source. http://ad.doubleclick.net/clk;164216239;13503038;w?http://sf.net/marketplace ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: virtio_net and SMP guests [not found] ` <200712141312.05562.borntraeger-tA70FqPdS9bQT0dZR+AlfA@public.gmane.org> 2007-12-16 11:55 ` Dor Laor @ 2007-12-18 6:51 ` Rusty Russell [not found] ` <200712181751.24692.rusty-8n+1lVoiYb80n/F98K4Iww@public.gmane.org> 1 sibling, 1 reply; 11+ messages in thread From: Rusty Russell @ 2007-12-18 6:51 UTC (permalink / raw) To: Christian Borntraeger Cc: kvm-devel, virtualization-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA On Friday 14 December 2007 23:12:05 Christian Borntraeger wrote: > Rusty, Anthony, Dor, > > I need your brain power :-) > > On smp guests I have seen a problem with virtio (the version in curent > Avi's git) which do not occur on single processor guests: > > kernel BUG at /space/kvm/drivers/virtio/virtio_ring.c:228! > illegal operation: 0001 [#1] > Modules linked in: ipv6 > CPU: 2 Not tainted > Process swapper (pid: 0, task: 000000000f83e038, ksp: 000000000f877d70) > Krnl PSW : 0704000180000000 000000000045df2a (vring_restart+0x5a/0x70) > R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:0 CC:0 PM:0 EA:3 > Krnl GPRS: 00000000c0a80101 0000000000000000 000000000eb35000 > 0000000010005800 000000000045ded0 000000000000192f 000000000eb21000 > 000000000eb21000 000000000000000e 000000000eb21900 000000000eb21920 > 000000000f867cb8 0700000000d9b058 0000000000000010 000000000045c06a > 000000000f867cb8 Krnl Code: 000000000045df1e: e3b0b0700004 lg > %r11,112(%r11) 000000000045df24: 07fe bcr 15,%r14 > 000000000045df26: a7f40001 brc 15,45df28 > > >000000000045df2a: a7f4ffe1 brc 15,45deec > > 000000000045df2e: e31020300004 lg %r1,48(%r2) > 000000000045df34: a7480000 lhi %r4,0 > 000000000045df38: 96011001 oi 1(%r1),1 > 000000000045df3c: a7f4ffef brc 15,45df1a > Call Trace: > ([<000000000045c016>] virtnet_poll+0x96/0x42c) > [<000000000048cda2>] net_rx_action+0xca/0x150 > [<0000000000137f7a>] __do_softirq+0x9e/0x130 > [<00000000001105d6>] do_softirq+0xae/0xb4 > [<0000000000138182>] irq_exit+0x96/0x9c > [<000000000010d710>] do_extint+0xcc/0xf8 > [<00000000001135d0>] ext_no_vtime+0x16/0x1a > [<000000000010a57e>] cpu_idle+0x216/0x238 > > > I think there is a valid code path, triggering this bug: > > CPU1 CPU2 > ----------------------- ----------------------- > - virtnet_poll found no > more packets on queue > - netif_rx_complete allow > poll to be called > - vq_ops->restart is called > - vq Interrupts are enabled > . <new packets arrive> > <vcpu is scheduled away> > . - interrupt is delivered > . - poll is called > . - poll work is done > . - netif_rx_complete > . - vq_ops->restart is called > . - check if vq interrupts are > . enable --> BUG Shouldn't this be BUG()ing in START_USE()? Or did you disable that because of previous false positives? virtnet_poll() should not be reentering, AFAICT. virtio relies on the caller to ensure it never calls two virtio functions on the same queue simultaneously, and virtio_net in turn relies on the core net code to enforce this. So, how did this happen? Looks like we can have an interrupt on one CPU, which does: if (vq->vq.callback && !vq->vq.callback(&vq->vq)) vq->vring.avail->flags |= VRING_AVAIL_F_NO_INTERRUPT; The callback schedules the softirq to run, which usually runs on same CPU... but if somehow it runs somewhere else, it could hit restart before the flag is set. This means removing the BUG_ON() is bad, because we could restart, and then set the NO_INTERRUPT flag from the handler, and stall. Note that interrupt suppression is advisory, so we can be sloppy on the setting side. To me this points to doing interrupt suppression a different way. If we have a ->disable_cb() virtio function, and call it before we call netif_rx_schedule, does that fix it? How's this? Rusty. --- virtio: explicit enable_cb/disable_cb rather than callback return. It seems that virtio_net wants to disable callbacks (interrupts) before calling netif_rx_schedule(), so we can't use the return value to do so. Rename "restart" to "cb_enable" and introduce "cb_disable" hook: callback now returns void, rather than a boolean. Signed-off-by: Rusty Russell <rusty-8n+1lVoiYb80n/F98K4Iww@public.gmane.org> diff -r 0eabf082c13a drivers/block/virtio_blk.c --- a/drivers/block/virtio_blk.c Tue Dec 18 13:51:13 2007 +1100 +++ b/drivers/block/virtio_blk.c Tue Dec 18 15:49:18 2007 +1100 @@ -36,7 +36,7 @@ struct virtblk_req struct virtio_blk_inhdr in_hdr; }; -static bool blk_done(struct virtqueue *vq) +static void blk_done(struct virtqueue *vq) { struct virtio_blk *vblk = vq->vdev->priv; struct virtblk_req *vbr; @@ -65,7 +65,6 @@ static bool blk_done(struct virtqueue *v /* In case queue is stopped waiting for more buffers. */ blk_start_queue(vblk->disk->queue); spin_unlock_irqrestore(&vblk->lock, flags); - return true; } static bool do_req(struct request_queue *q, struct virtio_blk *vblk, diff -r 0eabf082c13a drivers/lguest/lguest_device.c --- a/drivers/lguest/lguest_device.c Tue Dec 18 13:51:13 2007 +1100 +++ b/drivers/lguest/lguest_device.c Tue Dec 18 15:49:18 2007 +1100 @@ -193,7 +193,7 @@ static void lg_notify(struct virtqueue * * So we provide devices with a "find virtqueue and set it up" function. */ static struct virtqueue *lg_find_vq(struct virtio_device *vdev, unsigned index, - bool (*callback)(struct virtqueue *vq)) + void (*callback)(struct virtqueue *vq)) { struct lguest_device *ldev = to_lgdev(vdev); struct lguest_vq_info *lvq; diff -r 0eabf082c13a drivers/net/virtio_net.c --- a/drivers/net/virtio_net.c Tue Dec 18 13:51:13 2007 +1100 +++ b/drivers/net/virtio_net.c Tue Dec 18 15:49:18 2007 +1100 @@ -59,13 +59,12 @@ static inline void vnet_hdr_to_sg(struct sg_init_one(sg, skb_vnet_hdr(skb), sizeof(struct virtio_net_hdr)); } -static bool skb_xmit_done(struct virtqueue *rvq) +static void skb_xmit_done(struct virtqueue *rvq) { struct virtnet_info *vi = rvq->vdev->priv; /* In case we were waiting for output buffers. */ netif_wake_queue(vi->dev); - return true; } static void receive_skb(struct net_device *dev, struct sk_buff *skb, @@ -177,12 +176,12 @@ static void try_fill_recv(struct virtnet vi->rvq->vq_ops->kick(vi->rvq); } -static bool skb_recv_done(struct virtqueue *rvq) +static void skb_recv_done(struct virtqueue *rvq) { struct virtnet_info *vi = rvq->vdev->priv; + /* Suppress further interrupts. */ + rvq->vq_ops->disable_cb(rvq); netif_rx_schedule(vi->dev, &vi->napi); - /* Suppress further interrupts. */ - return false; } static int virtnet_poll(struct napi_struct *napi, int budget) @@ -208,7 +207,7 @@ again: /* Out of packets? */ if (received < budget) { netif_rx_complete(vi->dev, napi); - if (unlikely(!vi->rvq->vq_ops->restart(vi->rvq)) + if (unlikely(!vi->rvq->vq_ops->enable_cb(vi->rvq)) && netif_rx_reschedule(vi->dev, napi)) goto again; } diff -r 0eabf082c13a drivers/virtio/virtio_pci.c --- a/drivers/virtio/virtio_pci.c Tue Dec 18 13:51:13 2007 +1100 +++ b/drivers/virtio/virtio_pci.c Tue Dec 18 15:49:18 2007 +1100 @@ -190,7 +190,7 @@ static irqreturn_t vp_interrupt(int irq, /* the config->find_vq() implementation */ static struct virtqueue *vp_find_vq(struct virtio_device *vdev, unsigned index, - bool (*callback)(struct virtqueue *vq)) + void (*callback)(struct virtqueue *vq)) { struct virtio_pci_device *vp_dev = to_vp_device(vdev); struct virtio_pci_vq_info *info; diff -r 0eabf082c13a drivers/virtio/virtio_ring.c --- a/drivers/virtio/virtio_ring.c Tue Dec 18 13:51:13 2007 +1100 +++ b/drivers/virtio/virtio_ring.c Tue Dec 18 15:49:18 2007 +1100 @@ -220,7 +220,17 @@ static void *vring_get_buf(struct virtqu return ret; } -static bool vring_restart(struct virtqueue *_vq) +static void vring_disable_cb(struct virtqueue *_vq) +{ + struct vring_virtqueue *vq = to_vvq(_vq); + + START_USE(vq); + BUG_ON(vq->vring.avail->flags & VRING_AVAIL_F_NO_INTERRUPT); + vq->vring.avail->flags |= VRING_AVAIL_F_NO_INTERRUPT; + END_USE(vq); +} + +static bool vring_enable_cb(struct virtqueue *_vq) { struct vring_virtqueue *vq = to_vvq(_vq); @@ -254,8 +264,8 @@ irqreturn_t vring_interrupt(int irq, voi return IRQ_HANDLED; pr_debug("virtqueue callback for %p (%p)\n", vq, vq->vq.callback); - if (vq->vq.callback && !vq->vq.callback(&vq->vq)) - vq->vring.avail->flags |= VRING_AVAIL_F_NO_INTERRUPT; + if (vq->vq.callback) + vq->vq.callback(&vq->vq); return IRQ_HANDLED; } @@ -266,7 +276,8 @@ static struct virtqueue_ops vring_vq_ops .add_buf = vring_add_buf, .get_buf = vring_get_buf, .kick = vring_kick, - .restart = vring_restart, + .disable_cb = vring_disable_cb, + .enable_cb = vring_enable_cb, .shutdown = vring_shutdown, }; @@ -274,7 +285,7 @@ struct virtqueue *vring_new_virtqueue(un struct virtio_device *vdev, void *pages, void (*notify)(struct virtqueue *), - bool (*callback)(struct virtqueue *)) + void (*callback)(struct virtqueue *)) { struct vring_virtqueue *vq; unsigned int i; diff -r 0eabf082c13a include/linux/virtio.h --- a/include/linux/virtio.h Tue Dec 18 13:51:13 2007 +1100 +++ b/include/linux/virtio.h Tue Dec 18 15:49:18 2007 +1100 @@ -11,15 +11,13 @@ /** * virtqueue - a queue to register buffers for sending or receiving. * @callback: the function to call when buffers are consumed (can be NULL). - * If this returns false, callbacks are suppressed until vq_ops->restart - * is called. * @vdev: the virtio device this queue was created for. * @vq_ops: the operations for this virtqueue (see below). * @priv: a pointer for the virtqueue implementation to use. */ struct virtqueue { - bool (*callback)(struct virtqueue *vq); + void (*callback)(struct virtqueue *vq); struct virtio_device *vdev; struct virtqueue_ops *vq_ops; void *priv; @@ -41,7 +39,9 @@ struct virtqueue * vq: the struct virtqueue we're talking about. * len: the length written into the buffer * Returns NULL or the "data" token handed to add_buf. - * @restart: restart callbacks after callback returned false. + * @disable_cb: disable callbacks + * vq: the struct virtqueue we're talking about. + * @enable_cb: restart callbacks after disable_cb. * vq: the struct virtqueue we're talking about. * This returns "false" (and doesn't re-enable) if there are pending * buffers in the queue, to avoid a race. @@ -65,7 +65,8 @@ struct virtqueue_ops { void *(*get_buf)(struct virtqueue *vq, unsigned int *len); - bool (*restart)(struct virtqueue *vq); + void (*disable_cb)(struct virtqueue *vq); + bool (*enable_cb)(struct virtqueue *vq); void (*shutdown)(struct virtqueue *vq); }; diff -r 0eabf082c13a include/linux/virtio_config.h --- a/include/linux/virtio_config.h Tue Dec 18 13:51:13 2007 +1100 +++ b/include/linux/virtio_config.h Tue Dec 18 15:49:18 2007 +1100 @@ -61,7 +61,7 @@ struct virtio_config_ops void (*set_status)(struct virtio_device *vdev, u8 status); struct virtqueue *(*find_vq)(struct virtio_device *vdev, unsigned index, - bool (*callback)(struct virtqueue *)); + void (*callback)(struct virtqueue *)); void (*del_vq)(struct virtqueue *vq); }; diff -r 0eabf082c13a include/linux/virtio_ring.h --- a/include/linux/virtio_ring.h Tue Dec 18 13:51:13 2007 +1100 +++ b/include/linux/virtio_ring.h Tue Dec 18 15:49:18 2007 +1100 @@ -114,7 +114,7 @@ struct virtqueue *vring_new_virtqueue(un struct virtio_device *vdev, void *pages, void (*notify)(struct virtqueue *vq), - bool (*callback)(struct virtqueue *vq)); + void (*callback)(struct virtqueue *vq)); void vring_del_virtqueue(struct virtqueue *vq); irqreturn_t vring_interrupt(int irq, void *_vq); ------------------------------------------------------------------------- SF.Net email is sponsored by: Check out the new SourceForge.net Marketplace. It's the best place to buy or sell services for just about anything Open Source. http://ad.doubleclick.net/clk;164216239;13503038;w?http://sf.net/marketplace ^ permalink raw reply [flat|nested] 11+ messages in thread
[parent not found: <200712181751.24692.rusty-8n+1lVoiYb80n/F98K4Iww@public.gmane.org>]
* Re: virtio_net and SMP guests [not found] ` <200712181751.24692.rusty-8n+1lVoiYb80n/F98K4Iww@public.gmane.org> @ 2007-12-23 23:19 ` Dor Laor [not found] ` <476EECF7.9000204-atKUWr5tajBWk0Htik3J/w@public.gmane.org> 2008-01-10 12:37 ` Christian Borntraeger 1 sibling, 1 reply; 11+ messages in thread From: Dor Laor @ 2007-12-23 23:19 UTC (permalink / raw) To: Rusty Russell Cc: kvm-devel, Christian Borntraeger, virtualization-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA Rusty Russell wrote: > On Friday 14 December 2007 23:12:05 Christian Borntraeger wrote: > >> Rusty, Anthony, Dor, >> >> I need your brain power :-) >> >> On smp guests I have seen a problem with virtio (the version in curent >> Avi's git) which do not occur on single processor guests: >> >> kernel BUG at /space/kvm/drivers/virtio/virtio_ring.c:228! >> illegal operation: 0001 [#1] >> Modules linked in: ipv6 >> CPU: 2 Not tainted >> Process swapper (pid: 0, task: 000000000f83e038, ksp: 000000000f877d70) >> Krnl PSW : 0704000180000000 000000000045df2a (vring_restart+0x5a/0x70) >> R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:0 CC:0 PM:0 EA:3 >> Krnl GPRS: 00000000c0a80101 0000000000000000 000000000eb35000 >> 0000000010005800 000000000045ded0 000000000000192f 000000000eb21000 >> 000000000eb21000 000000000000000e 000000000eb21900 000000000eb21920 >> 000000000f867cb8 0700000000d9b058 0000000000000010 000000000045c06a >> 000000000f867cb8 Krnl Code: 000000000045df1e: e3b0b0700004 lg >> %r11,112(%r11) 000000000045df24: 07fe bcr 15,%r14 >> 000000000045df26: a7f40001 brc 15,45df28 >> >> >000000000045df2a: a7f4ffe1 brc 15,45deec >> >> 000000000045df2e: e31020300004 lg %r1,48(%r2) >> 000000000045df34: a7480000 lhi %r4,0 >> 000000000045df38: 96011001 oi 1(%r1),1 >> 000000000045df3c: a7f4ffef brc 15,45df1a >> Call Trace: >> ([<000000000045c016>] virtnet_poll+0x96/0x42c) >> [<000000000048cda2>] net_rx_action+0xca/0x150 >> [<0000000000137f7a>] __do_softirq+0x9e/0x130 >> [<00000000001105d6>] do_softirq+0xae/0xb4 >> [<0000000000138182>] irq_exit+0x96/0x9c >> [<000000000010d710>] do_extint+0xcc/0xf8 >> [<00000000001135d0>] ext_no_vtime+0x16/0x1a >> [<000000000010a57e>] cpu_idle+0x216/0x238 >> >> >> I think there is a valid code path, triggering this bug: >> >> CPU1 CPU2 >> ----------------------- ----------------------- >> - virtnet_poll found no >> more packets on queue >> - netif_rx_complete allow >> poll to be called >> - vq_ops->restart is called >> - vq Interrupts are enabled >> . <new packets arrive> >> <vcpu is scheduled away> >> . - interrupt is delivered >> . - poll is called >> . - poll work is done >> . - netif_rx_complete >> . - vq_ops->restart is called >> . - check if vq interrupts are >> . enable --> BUG >> > > Shouldn't this be BUG()ing in START_USE()? Or did you disable that because > of previous false positives? > > virtnet_poll() should not be reentering, AFAICT. virtio relies on the caller > to ensure it never calls two virtio functions on the same queue > simultaneously, and virtio_net in turn relies on the core net code to enforce > this. > > So, how did this happen? > > Looks like we can have an interrupt on one CPU, which does: > > if (vq->vq.callback && !vq->vq.callback(&vq->vq)) > vq->vring.avail->flags |= VRING_AVAIL_F_NO_INTERRUPT; > > The callback schedules the softirq to run, which usually runs on same CPU... > but if somehow it runs somewhere else, it could hit restart before the flag > is set. This means removing the BUG_ON() is bad, because we could restart, > and then set the NO_INTERRUPT flag from the handler, and stall. > > Note that interrupt suppression is advisory, so we can be sloppy on the > setting side. > > To me this points to doing interrupt suppression a different way. If we > have a ->disable_cb() virtio function, and call it before we call > netif_rx_schedule, does that fix it? > > How's this? > Rusty. > > Looks good to me. The only thing is the naming.. Maybe one can find better name than [dis|en]able_cb since it is more like disable interrupts than disable_cb and enable_cb is more like run_callbacks (and enable interrupts). Actually while looking at it some more, there's might be a performance optimization using your new patch: Up to now, if the tx/rx paths were running out of descriptors they called enable_cb. enable_cb told the host it should trigger an irq the next time it has data for the guest. From now on, enable_cb will run the callbacks inside shortening latency. btw, I tried to apply this patch on my source tree without luck, after doing a manual merge, the guest opssed on the BUG_ON. I think it's because my sources are not aligned with yours. Can you please post a mercurial/git repo? Please specify the relatively repository in case you choose mercurial. Thanks, Dor > --- > virtio: explicit enable_cb/disable_cb rather than callback return. > > It seems that virtio_net wants to disable callbacks (interrupts) before > calling netif_rx_schedule(), so we can't use the return value to do so. > > Rename "restart" to "cb_enable" and introduce "cb_disable" hook: callback > now returns void, rather than a boolean. > > Signed-off-by: Rusty Russell <rusty-8n+1lVoiYb80n/F98K4Iww@public.gmane.org> > > diff -r 0eabf082c13a drivers/block/virtio_blk.c > --- a/drivers/block/virtio_blk.c Tue Dec 18 13:51:13 2007 +1100 > +++ b/drivers/block/virtio_blk.c Tue Dec 18 15:49:18 2007 +1100 > @@ -36,7 +36,7 @@ struct virtblk_req > struct virtio_blk_inhdr in_hdr; > }; > > -static bool blk_done(struct virtqueue *vq) > +static void blk_done(struct virtqueue *vq) > { > struct virtio_blk *vblk = vq->vdev->priv; > struct virtblk_req *vbr; > @@ -65,7 +65,6 @@ static bool blk_done(struct virtqueue *v > /* In case queue is stopped waiting for more buffers. */ > blk_start_queue(vblk->disk->queue); > spin_unlock_irqrestore(&vblk->lock, flags); > - return true; > } > > static bool do_req(struct request_queue *q, struct virtio_blk *vblk, > diff -r 0eabf082c13a drivers/lguest/lguest_device.c > --- a/drivers/lguest/lguest_device.c Tue Dec 18 13:51:13 2007 +1100 > +++ b/drivers/lguest/lguest_device.c Tue Dec 18 15:49:18 2007 +1100 > @@ -193,7 +193,7 @@ static void lg_notify(struct virtqueue * > * So we provide devices with a "find virtqueue and set it up" function. */ > static struct virtqueue *lg_find_vq(struct virtio_device *vdev, > unsigned index, > - bool (*callback)(struct virtqueue *vq)) > + void (*callback)(struct virtqueue *vq)) > { > struct lguest_device *ldev = to_lgdev(vdev); > struct lguest_vq_info *lvq; > diff -r 0eabf082c13a drivers/net/virtio_net.c > --- a/drivers/net/virtio_net.c Tue Dec 18 13:51:13 2007 +1100 > +++ b/drivers/net/virtio_net.c Tue Dec 18 15:49:18 2007 +1100 > @@ -59,13 +59,12 @@ static inline void vnet_hdr_to_sg(struct > sg_init_one(sg, skb_vnet_hdr(skb), sizeof(struct virtio_net_hdr)); > } > > -static bool skb_xmit_done(struct virtqueue *rvq) > +static void skb_xmit_done(struct virtqueue *rvq) > { > struct virtnet_info *vi = rvq->vdev->priv; > > /* In case we were waiting for output buffers. */ > netif_wake_queue(vi->dev); > - return true; > } > > static void receive_skb(struct net_device *dev, struct sk_buff *skb, > @@ -177,12 +176,12 @@ static void try_fill_recv(struct virtnet > vi->rvq->vq_ops->kick(vi->rvq); > } > > -static bool skb_recv_done(struct virtqueue *rvq) > +static void skb_recv_done(struct virtqueue *rvq) > { > struct virtnet_info *vi = rvq->vdev->priv; > + /* Suppress further interrupts. */ > + rvq->vq_ops->disable_cb(rvq); > netif_rx_schedule(vi->dev, &vi->napi); > - /* Suppress further interrupts. */ > - return false; > } > > static int virtnet_poll(struct napi_struct *napi, int budget) > @@ -208,7 +207,7 @@ again: > /* Out of packets? */ > if (received < budget) { > netif_rx_complete(vi->dev, napi); > - if (unlikely(!vi->rvq->vq_ops->restart(vi->rvq)) > + if (unlikely(!vi->rvq->vq_ops->enable_cb(vi->rvq)) > && netif_rx_reschedule(vi->dev, napi)) > goto again; > } > diff -r 0eabf082c13a drivers/virtio/virtio_pci.c > --- a/drivers/virtio/virtio_pci.c Tue Dec 18 13:51:13 2007 +1100 > +++ b/drivers/virtio/virtio_pci.c Tue Dec 18 15:49:18 2007 +1100 > @@ -190,7 +190,7 @@ static irqreturn_t vp_interrupt(int irq, > > /* the config->find_vq() implementation */ > static struct virtqueue *vp_find_vq(struct virtio_device *vdev, unsigned index, > - bool (*callback)(struct virtqueue *vq)) > + void (*callback)(struct virtqueue *vq)) > { > struct virtio_pci_device *vp_dev = to_vp_device(vdev); > struct virtio_pci_vq_info *info; > diff -r 0eabf082c13a drivers/virtio/virtio_ring.c > --- a/drivers/virtio/virtio_ring.c Tue Dec 18 13:51:13 2007 +1100 > +++ b/drivers/virtio/virtio_ring.c Tue Dec 18 15:49:18 2007 +1100 > @@ -220,7 +220,17 @@ static void *vring_get_buf(struct virtqu > return ret; > } > > -static bool vring_restart(struct virtqueue *_vq) > +static void vring_disable_cb(struct virtqueue *_vq) > +{ > + struct vring_virtqueue *vq = to_vvq(_vq); > + > + START_USE(vq); > + BUG_ON(vq->vring.avail->flags & VRING_AVAIL_F_NO_INTERRUPT); > + vq->vring.avail->flags |= VRING_AVAIL_F_NO_INTERRUPT; > + END_USE(vq); > +} > + > +static bool vring_enable_cb(struct virtqueue *_vq) > { > struct vring_virtqueue *vq = to_vvq(_vq); > > @@ -254,8 +264,8 @@ irqreturn_t vring_interrupt(int irq, voi > return IRQ_HANDLED; > > pr_debug("virtqueue callback for %p (%p)\n", vq, vq->vq.callback); > - if (vq->vq.callback && !vq->vq.callback(&vq->vq)) > - vq->vring.avail->flags |= VRING_AVAIL_F_NO_INTERRUPT; > + if (vq->vq.callback) > + vq->vq.callback(&vq->vq); > > return IRQ_HANDLED; > } > @@ -266,7 +276,8 @@ static struct virtqueue_ops vring_vq_ops > .add_buf = vring_add_buf, > .get_buf = vring_get_buf, > .kick = vring_kick, > - .restart = vring_restart, > + .disable_cb = vring_disable_cb, > + .enable_cb = vring_enable_cb, > .shutdown = vring_shutdown, > }; > > @@ -274,7 +285,7 @@ struct virtqueue *vring_new_virtqueue(un > struct virtio_device *vdev, > void *pages, > void (*notify)(struct virtqueue *), > - bool (*callback)(struct virtqueue *)) > + void (*callback)(struct virtqueue *)) > { > struct vring_virtqueue *vq; > unsigned int i; > diff -r 0eabf082c13a include/linux/virtio.h > --- a/include/linux/virtio.h Tue Dec 18 13:51:13 2007 +1100 > +++ b/include/linux/virtio.h Tue Dec 18 15:49:18 2007 +1100 > @@ -11,15 +11,13 @@ > /** > * virtqueue - a queue to register buffers for sending or receiving. > * @callback: the function to call when buffers are consumed (can be NULL). > - * If this returns false, callbacks are suppressed until vq_ops->restart > - * is called. > * @vdev: the virtio device this queue was created for. > * @vq_ops: the operations for this virtqueue (see below). > * @priv: a pointer for the virtqueue implementation to use. > */ > struct virtqueue > { > - bool (*callback)(struct virtqueue *vq); > + void (*callback)(struct virtqueue *vq); > struct virtio_device *vdev; > struct virtqueue_ops *vq_ops; > void *priv; > @@ -41,7 +39,9 @@ struct virtqueue > * vq: the struct virtqueue we're talking about. > * len: the length written into the buffer > * Returns NULL or the "data" token handed to add_buf. > - * @restart: restart callbacks after callback returned false. > + * @disable_cb: disable callbacks > + * vq: the struct virtqueue we're talking about. > + * @enable_cb: restart callbacks after disable_cb. > * vq: the struct virtqueue we're talking about. > * This returns "false" (and doesn't re-enable) if there are pending > * buffers in the queue, to avoid a race. > @@ -65,7 +65,8 @@ struct virtqueue_ops { > > void *(*get_buf)(struct virtqueue *vq, unsigned int *len); > > - bool (*restart)(struct virtqueue *vq); > + void (*disable_cb)(struct virtqueue *vq); > + bool (*enable_cb)(struct virtqueue *vq); > > void (*shutdown)(struct virtqueue *vq); > }; > diff -r 0eabf082c13a include/linux/virtio_config.h > --- a/include/linux/virtio_config.h Tue Dec 18 13:51:13 2007 +1100 > +++ b/include/linux/virtio_config.h Tue Dec 18 15:49:18 2007 +1100 > @@ -61,7 +61,7 @@ struct virtio_config_ops > void (*set_status)(struct virtio_device *vdev, u8 status); > struct virtqueue *(*find_vq)(struct virtio_device *vdev, > unsigned index, > - bool (*callback)(struct virtqueue *)); > + void (*callback)(struct virtqueue *)); > void (*del_vq)(struct virtqueue *vq); > }; > > diff -r 0eabf082c13a include/linux/virtio_ring.h > --- a/include/linux/virtio_ring.h Tue Dec 18 13:51:13 2007 +1100 > +++ b/include/linux/virtio_ring.h Tue Dec 18 15:49:18 2007 +1100 > @@ -114,7 +114,7 @@ struct virtqueue *vring_new_virtqueue(un > struct virtio_device *vdev, > void *pages, > void (*notify)(struct virtqueue *vq), > - bool (*callback)(struct virtqueue *vq)); > + void (*callback)(struct virtqueue *vq)); > void vring_del_virtqueue(struct virtqueue *vq); > > irqreturn_t vring_interrupt(int irq, void *_vq); > > ------------------------------------------------------------------------- This SF.net email is sponsored by: Microsoft Defy all challenges. Microsoft(R) Visual Studio 2005. http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/ ^ permalink raw reply [flat|nested] 11+ messages in thread
[parent not found: <476EECF7.9000204-atKUWr5tajBWk0Htik3J/w@public.gmane.org>]
* Re: virtio_net and SMP guests [not found] ` <476EECF7.9000204-atKUWr5tajBWk0Htik3J/w@public.gmane.org> @ 2007-12-24 0:54 ` Rusty Russell [not found] ` <200712241154.20885.rusty-8n+1lVoiYb80n/F98K4Iww@public.gmane.org> 0 siblings, 1 reply; 11+ messages in thread From: Rusty Russell @ 2007-12-24 0:54 UTC (permalink / raw) To: dor.laor-atKUWr5tajBWk0Htik3J/w Cc: kvm-devel, Christian Borntraeger, virtualization-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA On Monday 24 December 2007 10:19:19 Dor Laor wrote: > Rusty Russell wrote: > Looks good to me. The only thing is the naming.. Maybe one can find > better name than [dis|en]able_cb since > it is more like disable interrupts than disable_cb and enable_cb is more > like run_callbacks (and enable interrupts). Yes, the nomenclature is a little wishy-washy. Perhaps we should call them "interrupts" even though (all together) the s390 doesn't have interrupts. > Actually while looking at it some more, there's might be a performance > optimization using your new patch: > Up to now, if the tx/rx paths were running out of descriptors they > called enable_cb. > enable_cb told the host it should trigger an irq the next time it has > data for the guest. > From now on, enable_cb will run the callbacks inside shortening latency. > > btw, I tried to apply this patch on my source tree without luck, after > doing a manual merge, the > guest opssed on the BUG_ON. I think it's because my sources are not > aligned with yours. > Can you please post a mercurial/git repo? Please specify the relatively > repository in case you choose mercurial. Hmm, I currently publish a patch queue and a subset of that for Linus to pull from. I could create a git tree from it but it'd be useless to you, since it'd be re-created every time I modify the patch queue... Rusty. > Thanks, > Dor > > > --- > > virtio: explicit enable_cb/disable_cb rather than callback return. > > > > It seems that virtio_net wants to disable callbacks (interrupts) before > > calling netif_rx_schedule(), so we can't use the return value to do so. > > > > Rename "restart" to "cb_enable" and introduce "cb_disable" hook: callback > > now returns void, rather than a boolean. > > > > Signed-off-by: Rusty Russell <rusty-8n+1lVoiYb80n/F98K4Iww@public.gmane.org> > > > > diff -r 0eabf082c13a drivers/block/virtio_blk.c > > --- a/drivers/block/virtio_blk.c Tue Dec 18 13:51:13 2007 +1100 > > +++ b/drivers/block/virtio_blk.c Tue Dec 18 15:49:18 2007 +1100 > > @@ -36,7 +36,7 @@ struct virtblk_req > > struct virtio_blk_inhdr in_hdr; > > }; > > > > -static bool blk_done(struct virtqueue *vq) > > +static void blk_done(struct virtqueue *vq) > > { > > struct virtio_blk *vblk = vq->vdev->priv; > > struct virtblk_req *vbr; > > @@ -65,7 +65,6 @@ static bool blk_done(struct virtqueue *v > > /* In case queue is stopped waiting for more buffers. */ > > blk_start_queue(vblk->disk->queue); > > spin_unlock_irqrestore(&vblk->lock, flags); > > - return true; > > } > > > > static bool do_req(struct request_queue *q, struct virtio_blk *vblk, > > diff -r 0eabf082c13a drivers/lguest/lguest_device.c > > --- a/drivers/lguest/lguest_device.c Tue Dec 18 13:51:13 2007 +1100 > > +++ b/drivers/lguest/lguest_device.c Tue Dec 18 15:49:18 2007 +1100 > > @@ -193,7 +193,7 @@ static void lg_notify(struct virtqueue * > > * So we provide devices with a "find virtqueue and set it up" function. > > */ static struct virtqueue *lg_find_vq(struct virtio_device *vdev, > > unsigned index, > > - bool (*callback)(struct virtqueue *vq)) > > + void (*callback)(struct virtqueue *vq)) > > { > > struct lguest_device *ldev = to_lgdev(vdev); > > struct lguest_vq_info *lvq; > > diff -r 0eabf082c13a drivers/net/virtio_net.c > > --- a/drivers/net/virtio_net.c Tue Dec 18 13:51:13 2007 +1100 > > +++ b/drivers/net/virtio_net.c Tue Dec 18 15:49:18 2007 +1100 > > @@ -59,13 +59,12 @@ static inline void vnet_hdr_to_sg(struct > > sg_init_one(sg, skb_vnet_hdr(skb), sizeof(struct virtio_net_hdr)); > > } > > > > -static bool skb_xmit_done(struct virtqueue *rvq) > > +static void skb_xmit_done(struct virtqueue *rvq) > > { > > struct virtnet_info *vi = rvq->vdev->priv; > > > > /* In case we were waiting for output buffers. */ > > netif_wake_queue(vi->dev); > > - return true; > > } > > > > static void receive_skb(struct net_device *dev, struct sk_buff *skb, > > @@ -177,12 +176,12 @@ static void try_fill_recv(struct virtnet > > vi->rvq->vq_ops->kick(vi->rvq); > > } > > > > -static bool skb_recv_done(struct virtqueue *rvq) > > +static void skb_recv_done(struct virtqueue *rvq) > > { > > struct virtnet_info *vi = rvq->vdev->priv; > > + /* Suppress further interrupts. */ > > + rvq->vq_ops->disable_cb(rvq); > > netif_rx_schedule(vi->dev, &vi->napi); > > - /* Suppress further interrupts. */ > > - return false; > > } > > > > static int virtnet_poll(struct napi_struct *napi, int budget) > > @@ -208,7 +207,7 @@ again: > > /* Out of packets? */ > > if (received < budget) { > > netif_rx_complete(vi->dev, napi); > > - if (unlikely(!vi->rvq->vq_ops->restart(vi->rvq)) > > + if (unlikely(!vi->rvq->vq_ops->enable_cb(vi->rvq)) > > && netif_rx_reschedule(vi->dev, napi)) > > goto again; > > } > > diff -r 0eabf082c13a drivers/virtio/virtio_pci.c > > --- a/drivers/virtio/virtio_pci.c Tue Dec 18 13:51:13 2007 +1100 > > +++ b/drivers/virtio/virtio_pci.c Tue Dec 18 15:49:18 2007 +1100 > > @@ -190,7 +190,7 @@ static irqreturn_t vp_interrupt(int irq, > > > > /* the config->find_vq() implementation */ > > static struct virtqueue *vp_find_vq(struct virtio_device *vdev, unsigned > > index, - bool (*callback)(struct virtqueue *vq)) > > + void (*callback)(struct virtqueue *vq)) > > { > > struct virtio_pci_device *vp_dev = to_vp_device(vdev); > > struct virtio_pci_vq_info *info; > > diff -r 0eabf082c13a drivers/virtio/virtio_ring.c > > --- a/drivers/virtio/virtio_ring.c Tue Dec 18 13:51:13 2007 +1100 > > +++ b/drivers/virtio/virtio_ring.c Tue Dec 18 15:49:18 2007 +1100 > > @@ -220,7 +220,17 @@ static void *vring_get_buf(struct virtqu > > return ret; > > } > > > > -static bool vring_restart(struct virtqueue *_vq) > > +static void vring_disable_cb(struct virtqueue *_vq) > > +{ > > + struct vring_virtqueue *vq = to_vvq(_vq); > > + > > + START_USE(vq); > > + BUG_ON(vq->vring.avail->flags & VRING_AVAIL_F_NO_INTERRUPT); > > + vq->vring.avail->flags |= VRING_AVAIL_F_NO_INTERRUPT; > > + END_USE(vq); > > +} > > + > > +static bool vring_enable_cb(struct virtqueue *_vq) > > { > > struct vring_virtqueue *vq = to_vvq(_vq); > > > > @@ -254,8 +264,8 @@ irqreturn_t vring_interrupt(int irq, voi > > return IRQ_HANDLED; > > > > pr_debug("virtqueue callback for %p (%p)\n", vq, vq->vq.callback); > > - if (vq->vq.callback && !vq->vq.callback(&vq->vq)) > > - vq->vring.avail->flags |= VRING_AVAIL_F_NO_INTERRUPT; > > + if (vq->vq.callback) > > + vq->vq.callback(&vq->vq); > > > > return IRQ_HANDLED; > > } > > @@ -266,7 +276,8 @@ static struct virtqueue_ops vring_vq_ops > > .add_buf = vring_add_buf, > > .get_buf = vring_get_buf, > > .kick = vring_kick, > > - .restart = vring_restart, > > + .disable_cb = vring_disable_cb, > > + .enable_cb = vring_enable_cb, > > .shutdown = vring_shutdown, > > }; > > > > @@ -274,7 +285,7 @@ struct virtqueue *vring_new_virtqueue(un > > struct virtio_device *vdev, > > void *pages, > > void (*notify)(struct virtqueue *), > > - bool (*callback)(struct virtqueue *)) > > + void (*callback)(struct virtqueue *)) > > { > > struct vring_virtqueue *vq; > > unsigned int i; > > diff -r 0eabf082c13a include/linux/virtio.h > > --- a/include/linux/virtio.h Tue Dec 18 13:51:13 2007 +1100 > > +++ b/include/linux/virtio.h Tue Dec 18 15:49:18 2007 +1100 > > @@ -11,15 +11,13 @@ > > /** > > * virtqueue - a queue to register buffers for sending or receiving. > > * @callback: the function to call when buffers are consumed (can be > > NULL). - * If this returns false, callbacks are suppressed until > > vq_ops->restart - * is called. > > * @vdev: the virtio device this queue was created for. > > * @vq_ops: the operations for this virtqueue (see below). > > * @priv: a pointer for the virtqueue implementation to use. > > */ > > struct virtqueue > > { > > - bool (*callback)(struct virtqueue *vq); > > + void (*callback)(struct virtqueue *vq); > > struct virtio_device *vdev; > > struct virtqueue_ops *vq_ops; > > void *priv; > > @@ -41,7 +39,9 @@ struct virtqueue > > * vq: the struct virtqueue we're talking about. > > * len: the length written into the buffer > > * Returns NULL or the "data" token handed to add_buf. > > - * @restart: restart callbacks after callback returned false. > > + * @disable_cb: disable callbacks > > + * vq: the struct virtqueue we're talking about. > > + * @enable_cb: restart callbacks after disable_cb. > > * vq: the struct virtqueue we're talking about. > > * This returns "false" (and doesn't re-enable) if there are pending > > * buffers in the queue, to avoid a race. > > @@ -65,7 +65,8 @@ struct virtqueue_ops { > > > > void *(*get_buf)(struct virtqueue *vq, unsigned int *len); > > > > - bool (*restart)(struct virtqueue *vq); > > + void (*disable_cb)(struct virtqueue *vq); > > + bool (*enable_cb)(struct virtqueue *vq); > > > > void (*shutdown)(struct virtqueue *vq); > > }; > > diff -r 0eabf082c13a include/linux/virtio_config.h > > --- a/include/linux/virtio_config.h Tue Dec 18 13:51:13 2007 +1100 > > +++ b/include/linux/virtio_config.h Tue Dec 18 15:49:18 2007 +1100 > > @@ -61,7 +61,7 @@ struct virtio_config_ops > > void (*set_status)(struct virtio_device *vdev, u8 status); > > struct virtqueue *(*find_vq)(struct virtio_device *vdev, > > unsigned index, > > - bool (*callback)(struct virtqueue *)); > > + void (*callback)(struct virtqueue *)); > > void (*del_vq)(struct virtqueue *vq); > > }; > > > > diff -r 0eabf082c13a include/linux/virtio_ring.h > > --- a/include/linux/virtio_ring.h Tue Dec 18 13:51:13 2007 +1100 > > +++ b/include/linux/virtio_ring.h Tue Dec 18 15:49:18 2007 +1100 > > @@ -114,7 +114,7 @@ struct virtqueue *vring_new_virtqueue(un > > struct virtio_device *vdev, > > void *pages, > > void (*notify)(struct virtqueue *vq), > > - bool (*callback)(struct virtqueue *vq)); > > + void (*callback)(struct virtqueue *vq)); > > void vring_del_virtqueue(struct virtqueue *vq); > > > > irqreturn_t vring_interrupt(int irq, void *_vq); ------------------------------------------------------------------------- This SF.net email is sponsored by: Microsoft Defy all challenges. Microsoft(R) Visual Studio 2005. http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/ ^ permalink raw reply [flat|nested] 11+ messages in thread
[parent not found: <200712241154.20885.rusty-8n+1lVoiYb80n/F98K4Iww@public.gmane.org>]
* Re: virtio_net and SMP guests [not found] ` <200712241154.20885.rusty-8n+1lVoiYb80n/F98K4Iww@public.gmane.org> @ 2007-12-25 12:22 ` Dor Laor [not found] ` <4770F60D.4010904-atKUWr5tajBWk0Htik3J/w@public.gmane.org> 0 siblings, 1 reply; 11+ messages in thread From: Dor Laor @ 2007-12-25 12:22 UTC (permalink / raw) To: Rusty Russell Cc: kvm-devel, Christian Borntraeger, virtualization-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA Rusty Russell wrote: > On Monday 24 December 2007 10:19:19 Dor Laor wrote: > >> Rusty Russell wrote: >> Looks good to me. The only thing is the naming.. Maybe one can find >> better name than [dis|en]able_cb since >> it is more like disable interrupts than disable_cb and enable_cb is more >> like run_callbacks (and enable interrupts). >> > > Yes, the nomenclature is a little wishy-washy. Perhaps we should call > them "interrupts" even though (all together) the s390 doesn't have > interrupts. > > >> Actually while looking at it some more, there's might be a performance >> optimization using your new patch: >> Up to now, if the tx/rx paths were running out of descriptors they >> called enable_cb. >> enable_cb told the host it should trigger an irq the next time it has >> data for the guest. >> From now on, enable_cb will run the callbacks inside shortening latency. >> >> btw, I tried to apply this patch on my source tree without luck, after >> doing a manual merge, the >> guest opssed on the BUG_ON. I think it's because my sources are not >> aligned with yours. >> Can you please post a mercurial/git repo? Please specify the relatively >> repository in case you choose mercurial. >> > > Hmm, I currently publish a patch queue and a subset of that for Linus to pull > from. I could create a git tree from it but it'd be useless to you, since > it'd be re-created every time I modify the patch queue... > > Rusty. > It's ok with me, anyway I always pull from Avi & you and try push new patches back. btw: I checked your Linux tree and found it a bit old, there was no tx coalescing timer, it's probably good for 2.6.24. Do you plan to post patchset for 25 windows? Thanks, Dor ------------------------------------------------------------------------- This SF.net email is sponsored by: Microsoft Defy all challenges. Microsoft(R) Visual Studio 2005. http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/ ^ permalink raw reply [flat|nested] 11+ messages in thread
[parent not found: <4770F60D.4010904-atKUWr5tajBWk0Htik3J/w@public.gmane.org>]
* Re: virtio_net and SMP guests [not found] ` <4770F60D.4010904-atKUWr5tajBWk0Htik3J/w@public.gmane.org> @ 2007-12-26 0:48 ` Rusty Russell 0 siblings, 0 replies; 11+ messages in thread From: Rusty Russell @ 2007-12-26 0:48 UTC (permalink / raw) To: dor.laor-atKUWr5tajBWk0Htik3J/w Cc: kvm-devel, Christian Borntraeger, virtualization-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA On Tuesday 25 December 2007 23:22:37 Dor Laor wrote: > btw: I checked your Linux tree and found it a bit old, there was no tx > coalescing timer, it's probably good > for 2.6.24. Do you plan to post patchset for 25 windows? > Thanks, > Dor Hmm, should be updated now. Will be pushing all these virtio patches for 2.6.25... Cheers, Rusty. ------------------------------------------------------------------------- This SF.net email is sponsored by: Microsoft Defy all challenges. Microsoft(R) Visual Studio 2005. http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/ ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: virtio_net and SMP guests [not found] ` <200712181751.24692.rusty-8n+1lVoiYb80n/F98K4Iww@public.gmane.org> 2007-12-23 23:19 ` Dor Laor @ 2008-01-10 12:37 ` Christian Borntraeger [not found] ` <200801101337.40433.borntraeger-tA70FqPdS9bQT0dZR+AlfA@public.gmane.org> 1 sibling, 1 reply; 11+ messages in thread From: Christian Borntraeger @ 2008-01-10 12:37 UTC (permalink / raw) To: Rusty Russell Cc: kvm-devel, virtualization-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA Am Dienstag, 18. Dezember 2007 schrieb Rusty Russell: > To me this points to doing interrupt suppression a different way. If we > have a ->disable_cb() virtio function, and call it before we call > netif_rx_schedule, does that fix it? The fix looks good and I agree with it. There is one problem that I try to find for some days, but the following BUG_ON triggers: static void vring_disable_cb(struct virtqueue *_vq) { struct vring_virtqueue *vq = to_vvq(_vq); START_USE(vq); ----> BUG_ON(vq->vring.avail->flags & VRING_AVAIL_F_NO_INTERRUPT); vq->vring.avail->flags |= VRING_AVAIL_F_NO_INTERRUPT; END_USE(vq); } The funny thing is, that this bit is correct during probe but changes later before open. It seems that there is still something fishy on s390 and virtio. I looked several times at the virtio_ring code and it really seems to be ok. Any ideas? Christian -------- the oops that I cannot explain ------------- kernel BUG at /space/kvm/drivers/virtio/virtio_ring.c:232! illegal operation: 0001 [#1] Modules linked in: CPU: 0 Not tainted Process ip (pid: 1583, task: 000000000eee7038, ksp: 000000000ec4beb8) Krnl PSW : 0704300180000000 000000000045dcd4 (vring_disable_cb+0x30/0x34) R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:0 CC:3 PM:0 EA:3 Krnl GPRS: 000003ff00000002 0000000000000001 0000000010001800 000000000045dca4 000000000f054800 0000000000595a80 0000000000000000 0000000000000000 0000000000897768 000003ff00001002 000000000f09c920 000000000ec4bbf8 000000000f09c900 0000000000596b20 000000000045c9c6 000000000ec4bbf8 Krnl Code: 000000000045dcc8: e3b0b0700004 lg %r11,112(%r11) 000000000045dcce: 07fe bcr 15,%r14 000000000045dcd0: a7f40001 brc 15,45dcd2 >000000000045dcd4: a7f4fff6 brc 15,45dcc0 000000000045dcd8: eb7ff0500024 stmg %r7,%r15,80(%r15) 000000000045dcde: a7f13e00 tmll %r15,15872 000000000045dce2: b90400ef lgr %r14,%r15 000000000045dce6: a7840001 brc 8,45dce8 Call Trace: ([<000000000045c95e>] virtnet_open+0x32/0x114) [<000000000048cbac>] dev_open+0xb0/0xe8 [<000000000048b6a2>] dev_change_flags+0x156/0x1cc [<00000000004ead02>] devinet_ioctl+0x5ae/0x728 [<00000000004eb564>] inet_ioctl+0xa4/0xf0 [<00000000004798dc>] sock_ioctl+0x90/0x2e4 [<00000000001b8bd2>] do_ioctl+0x4a/0xd4 [<00000000001b8cda>] vfs_ioctl+0x7e/0x3c8 [<00000000001b90b6>] sys_ioctl+0x92/0xa4 [<0000000000112e7c>] sysc_noemu+0x10/0x16 [<000002000012c7e6>] 0x2000012c7e6 ------------------------------------------------------------------------- Check out the new SourceForge.net Marketplace. It's the best place to buy or sell services for just about anything Open Source. http://ad.doubleclick.net/clk;164216239;13503038;w?http://sf.net/marketplace ^ permalink raw reply [flat|nested] 11+ messages in thread
[parent not found: <200801101337.40433.borntraeger-tA70FqPdS9bQT0dZR+AlfA@public.gmane.org>]
* Re: virtio_net and SMP guests [not found] ` <200801101337.40433.borntraeger-tA70FqPdS9bQT0dZR+AlfA@public.gmane.org> @ 2008-01-10 15:39 ` Christian Borntraeger 2008-01-10 15:51 ` Christian Borntraeger 0 siblings, 1 reply; 11+ messages in thread From: Christian Borntraeger @ 2008-01-10 15:39 UTC (permalink / raw) To: virtualization-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA; +Cc: kvm-devel Am Donnerstag, 10. Januar 2008 schrieb Christian Borntraeger: > Am Dienstag, 18. Dezember 2007 schrieb Rusty Russell: > > To me this points to doing interrupt suppression a different way. If we > > have a ->disable_cb() virtio function, and call it before we call > > netif_rx_schedule, does that fix it? > > The fix looks good and I agree with it. > > There is one problem that I try to find for some days, but the following > BUG_ON triggers: > > static void vring_disable_cb(struct virtqueue *_vq) > { > struct vring_virtqueue *vq = to_vvq(_vq); > > START_USE(vq); > ----> BUG_ON(vq->vring.avail->flags & VRING_AVAIL_F_NO_INTERRUPT); > vq->vring.avail->flags |= VRING_AVAIL_F_NO_INTERRUPT; > END_USE(vq); > } Ok, I found it: static int virtnet_open(struct net_device *dev) { struct virtnet_info *vi = netdev_priv(dev); try_fill_recv(vi); /* If we didn't even get one input buffer, we're useless. */ if (vi->num == 0) return -ENOMEM; ---> int for new packet static void skb_recv_done(struct virtqueue *rvq) { struct virtnet_info *vi = rvq->vdev->priv; /* Suppress further interrupts. */ rvq->vq_ops->disable_cb(rvq); netif_rx_schedule(vi->dev, &vi->napi); } - poll is not yet possible, no softirq - return from interrupt napi_enable(&vi->napi); vi->rvq->vq_ops->disable_cb(vi->rvq); ---> BUG: its already disabled Christian ------------------------------------------------------------------------- Check out the new SourceForge.net Marketplace. It's the best place to buy or sell services for just about anything Open Source. http://ad.doubleclick.net/clk;164216239;13503038;w?http://sf.net/marketplace ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: virtio_net and SMP guests 2008-01-10 15:39 ` Christian Borntraeger @ 2008-01-10 15:51 ` Christian Borntraeger 2008-01-11 9:53 ` Rusty Russell 0 siblings, 1 reply; 11+ messages in thread From: Christian Borntraeger @ 2008-01-10 15:51 UTC (permalink / raw) To: virtualization; +Cc: kvm-devel, Anthony Liguori, Dor Laor, netdev Am Donnerstag, 10. Januar 2008 schrieb Christian Borntraeger: > Am Donnerstag, 10. Januar 2008 schrieb Christian Borntraeger: > > Am Dienstag, 18. Dezember 2007 schrieb Rusty Russell: > > > To me this points to doing interrupt suppression a different way. If we > > > have a ->disable_cb() virtio function, and call it before we call > > > netif_rx_schedule, does that fix it? > > > > The fix looks good and I agree with it. > > > > There is one problem that I try to find for some days, but the following > > BUG_ON triggers: > > > > static void vring_disable_cb(struct virtqueue *_vq) > > { > > struct vring_virtqueue *vq = to_vvq(_vq); > > > > START_USE(vq); > > ----> BUG_ON(vq->vring.avail->flags & VRING_AVAIL_F_NO_INTERRUPT); > > vq->vring.avail->flags |= VRING_AVAIL_F_NO_INTERRUPT; > > END_USE(vq); > > } > > Ok, I found it: > > static int virtnet_open(struct net_device *dev) > { > struct virtnet_info *vi = netdev_priv(dev); > try_fill_recv(vi); > /* If we didn't even get one input buffer, we're useless. */ > if (vi->num == 0) > return -ENOMEM; > ---> int for new packet > static void skb_recv_done(struct virtqueue *rvq) > { > struct virtnet_info *vi = rvq->vdev->priv; > /* Suppress further interrupts. */ > rvq->vq_ops->disable_cb(rvq); > netif_rx_schedule(vi->dev, &vi->napi); > } > - poll is not yet possible, no softirq > - return from interrupt > napi_enable(&vi->napi); > vi->rvq->vq_ops->disable_cb(vi->rvq); > ---> BUG: its already disabled Btw. this problem also happens on single processor guests. What about the following patch: --- drivers/net/virtio_net.c | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-) Index: kvm/drivers/net/virtio_net.c =================================================================== --- kvm.orig/drivers/net/virtio_net.c +++ kvm/drivers/net/virtio_net.c @@ -179,9 +179,12 @@ static void try_fill_recv(struct virtnet static void skb_recv_done(struct virtqueue *rvq) { struct virtnet_info *vi = rvq->vdev->priv; - /* Suppress further interrupts. */ - rvq->vq_ops->disable_cb(rvq); - netif_rx_schedule(vi->dev, &vi->napi); + /* Schedule NAPI, Suppress further interrupts if successful. */ + + if (netif_rx_schedule_prep(vi->dev, &vi->napi)) { + rvq->vq_ops->disable_cb(rvq); + __netif_rx_schedule(vi->dev, &vi->napi); + } } static int virtnet_poll(struct napi_struct *napi, int budget) ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: virtio_net and SMP guests 2008-01-10 15:51 ` Christian Borntraeger @ 2008-01-11 9:53 ` Rusty Russell 0 siblings, 0 replies; 11+ messages in thread From: Rusty Russell @ 2008-01-11 9:53 UTC (permalink / raw) To: virtualization Cc: Christian Borntraeger, kvm-devel, netdev, Anthony Liguori, Dor Laor On Friday 11 January 2008 02:51:58 Christian Borntraeger wrote: > What about the following patch: Looks correct and in fact pretty orthodox. I've folded this in, thanks! Rusty. ^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2008-01-11 9:53 UTC | newest]
Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2007-12-14 12:12 virtio_net and SMP guests Christian Borntraeger
[not found] ` <200712141312.05562.borntraeger-tA70FqPdS9bQT0dZR+AlfA@public.gmane.org>
2007-12-16 11:55 ` Dor Laor
2007-12-18 6:51 ` Rusty Russell
[not found] ` <200712181751.24692.rusty-8n+1lVoiYb80n/F98K4Iww@public.gmane.org>
2007-12-23 23:19 ` Dor Laor
[not found] ` <476EECF7.9000204-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
2007-12-24 0:54 ` Rusty Russell
[not found] ` <200712241154.20885.rusty-8n+1lVoiYb80n/F98K4Iww@public.gmane.org>
2007-12-25 12:22 ` Dor Laor
[not found] ` <4770F60D.4010904-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
2007-12-26 0:48 ` Rusty Russell
2008-01-10 12:37 ` Christian Borntraeger
[not found] ` <200801101337.40433.borntraeger-tA70FqPdS9bQT0dZR+AlfA@public.gmane.org>
2008-01-10 15:39 ` Christian Borntraeger
2008-01-10 15:51 ` Christian Borntraeger
2008-01-11 9:53 ` Rusty Russell
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox