All of lore.kernel.org
 help / color / mirror / Atom feed
From: Dor Laor <dor.laor-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
To: Rusty Russell <rusty-8n+1lVoiYb80n/F98K4Iww@public.gmane.org>
Cc: kvm-devel
	<kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f@public.gmane.org>,
	Christian Borntraeger
	<borntraeger-tA70FqPdS9bQT0dZR+AlfA@public.gmane.org>,
	virtualization-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org
Subject: Re: virtio_net and SMP guests
Date: Mon, 24 Dec 2007 01:19:19 +0200	[thread overview]
Message-ID: <476EECF7.9000204@qumranet.com> (raw)
In-Reply-To: <200712181751.24692.rusty-8n+1lVoiYb80n/F98K4Iww@public.gmane.org>

Rusty Russell wrote:
> On Friday 14 December 2007 23:12:05 Christian Borntraeger wrote:
>   
>> Rusty, Anthony, Dor,
>>
>> I need your brain power :-)
>>
>> On smp guests I have seen a problem with virtio (the version in curent
>> Avi's git) which do not occur on single processor guests:
>>
>> kernel BUG at /space/kvm/drivers/virtio/virtio_ring.c:228!
>> illegal operation: 0001 [#1]
>> Modules linked in: ipv6
>> CPU:    2    Not tainted
>> Process swapper (pid: 0, task: 000000000f83e038, ksp: 000000000f877d70)
>> Krnl PSW : 0704000180000000 000000000045df2a (vring_restart+0x5a/0x70)
>>            R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:0 CC:0 PM:0 EA:3
>> Krnl GPRS: 00000000c0a80101 0000000000000000 000000000eb35000
>> 0000000010005800 000000000045ded0 000000000000192f 000000000eb21000
>> 000000000eb21000 000000000000000e 000000000eb21900 000000000eb21920
>> 000000000f867cb8 0700000000d9b058 0000000000000010 000000000045c06a
>> 000000000f867cb8 Krnl Code: 000000000045df1e: e3b0b0700004       lg     
>> %r11,112(%r11) 000000000045df24: 07fe               bcr     15,%r14
>>            000000000045df26: a7f40001           brc     15,45df28
>>
>>           >000000000045df2a: a7f4ffe1           brc     15,45deec
>>
>>            000000000045df2e: e31020300004       lg      %r1,48(%r2)
>>            000000000045df34: a7480000           lhi     %r4,0
>>            000000000045df38: 96011001           oi      1(%r1),1
>>            000000000045df3c: a7f4ffef           brc     15,45df1a
>> Call Trace:
>> ([<000000000045c016>] virtnet_poll+0x96/0x42c)
>>  [<000000000048cda2>] net_rx_action+0xca/0x150
>>  [<0000000000137f7a>] __do_softirq+0x9e/0x130
>>  [<00000000001105d6>] do_softirq+0xae/0xb4
>>  [<0000000000138182>] irq_exit+0x96/0x9c
>>  [<000000000010d710>] do_extint+0xcc/0xf8
>>  [<00000000001135d0>] ext_no_vtime+0x16/0x1a
>>  [<000000000010a57e>] cpu_idle+0x216/0x238
>>
>>
>> I think there is a valid code path, triggering this bug:
>>
>> 	CPU1						CPU2
>> -----------------------				-----------------------
>> - virtnet_poll found no
>>   more packets on queue
>> - netif_rx_complete allow
>>   poll to be called
>> - vq_ops->restart is called
>> - vq Interrupts are enabled
>> 	.		     <new packets arrive>
>> <vcpu is scheduled away>
>> 	.					- interrupt is delivered
>> 	.					- poll is called
>> 	.					- poll work is done
>> 	.					- netif_rx_complete
>> 	.					- vq_ops->restart is called
>> 	.					- check if vq interrupts are
>> 	.					  enable --> BUG
>>     
>
> Shouldn't this be BUG()ing in START_USE()?  Or did you disable that because
> of previous false positives?
>
> virtnet_poll() should not be reentering, AFAICT.  virtio relies on the caller
> to ensure it never calls two virtio functions on the same queue
> simultaneously, and virtio_net in turn relies on the core net code to enforce
> this.
>
> So, how did this happen?
>
> Looks like we can have an interrupt on one CPU, which does:
>
> 	if (vq->vq.callback && !vq->vq.callback(&vq->vq))
> 		vq->vring.avail->flags |= VRING_AVAIL_F_NO_INTERRUPT;
>
> The callback schedules the softirq to run, which usually runs on same CPU...
> but if somehow it runs somewhere else, it could hit restart before the flag
> is set.  This means removing the BUG_ON() is bad, because we could restart,
> and then set the NO_INTERRUPT flag from the handler, and stall.
>
> Note that interrupt suppression is advisory, so we can be sloppy on the
> setting side.
>
> To me this points to doing interrupt suppression a different way.  If we
> have a ->disable_cb() virtio function, and call it before we call
> netif_rx_schedule, does that fix it?
>
> How's this?
> Rusty.
>
>   
Looks good to me. The only thing is the naming.. Maybe one can find 
better name than [dis|en]able_cb since
it is more like disable interrupts than disable_cb and enable_cb is more 
like run_callbacks (and enable interrupts).

Actually while looking at it some more, there's might be a performance 
optimization using your new patch:
Up to now, if the tx/rx paths were running out of descriptors they 
called enable_cb.
enable_cb told the host it should trigger an irq the next time it has 
data for the guest.
 From now on, enable_cb will run the callbacks inside shortening latency.

btw, I tried to apply this patch on my source tree without luck,  after 
doing a manual merge, the
guest opssed on the BUG_ON. I think it's because my sources are not 
aligned with yours.
Can you please post a mercurial/git repo? Please specify the relatively 
repository in case you choose mercurial.
Thanks,
Dor
> ---
> virtio: explicit enable_cb/disable_cb rather than callback return.
>
> It seems that virtio_net wants to disable callbacks (interrupts) before
> calling netif_rx_schedule(), so we can't use the return value to do so.
>
> Rename "restart" to "cb_enable" and introduce "cb_disable" hook: callback
> now returns void, rather than a boolean.
>
> Signed-off-by: Rusty Russell <rusty-8n+1lVoiYb80n/F98K4Iww@public.gmane.org>
>
> diff -r 0eabf082c13a drivers/block/virtio_blk.c
> --- a/drivers/block/virtio_blk.c	Tue Dec 18 13:51:13 2007 +1100
> +++ b/drivers/block/virtio_blk.c	Tue Dec 18 15:49:18 2007 +1100
> @@ -36,7 +36,7 @@ struct virtblk_req
>  	struct virtio_blk_inhdr in_hdr;
>  };
>  
> -static bool blk_done(struct virtqueue *vq)
> +static void blk_done(struct virtqueue *vq)
>  {
>  	struct virtio_blk *vblk = vq->vdev->priv;
>  	struct virtblk_req *vbr;
> @@ -65,7 +65,6 @@ static bool blk_done(struct virtqueue *v
>  	/* In case queue is stopped waiting for more buffers. */
>  	blk_start_queue(vblk->disk->queue);
>  	spin_unlock_irqrestore(&vblk->lock, flags);
> -	return true;
>  }
>  
>  static bool do_req(struct request_queue *q, struct virtio_blk *vblk,
> diff -r 0eabf082c13a drivers/lguest/lguest_device.c
> --- a/drivers/lguest/lguest_device.c	Tue Dec 18 13:51:13 2007 +1100
> +++ b/drivers/lguest/lguest_device.c	Tue Dec 18 15:49:18 2007 +1100
> @@ -193,7 +193,7 @@ static void lg_notify(struct virtqueue *
>   * So we provide devices with a "find virtqueue and set it up" function. */
>  static struct virtqueue *lg_find_vq(struct virtio_device *vdev,
>  				    unsigned index,
> -				    bool (*callback)(struct virtqueue *vq))
> +				    void (*callback)(struct virtqueue *vq))
>  {
>  	struct lguest_device *ldev = to_lgdev(vdev);
>  	struct lguest_vq_info *lvq;
> diff -r 0eabf082c13a drivers/net/virtio_net.c
> --- a/drivers/net/virtio_net.c	Tue Dec 18 13:51:13 2007 +1100
> +++ b/drivers/net/virtio_net.c	Tue Dec 18 15:49:18 2007 +1100
> @@ -59,13 +59,12 @@ static inline void vnet_hdr_to_sg(struct
>  	sg_init_one(sg, skb_vnet_hdr(skb), sizeof(struct virtio_net_hdr));
>  }
>  
> -static bool skb_xmit_done(struct virtqueue *rvq)
> +static void skb_xmit_done(struct virtqueue *rvq)
>  {
>  	struct virtnet_info *vi = rvq->vdev->priv;
>  
>  	/* In case we were waiting for output buffers. */
>  	netif_wake_queue(vi->dev);
> -	return true;
>  }
>  
>  static void receive_skb(struct net_device *dev, struct sk_buff *skb,
> @@ -177,12 +176,12 @@ static void try_fill_recv(struct virtnet
>  	vi->rvq->vq_ops->kick(vi->rvq);
>  }
>  
> -static bool skb_recv_done(struct virtqueue *rvq)
> +static void skb_recv_done(struct virtqueue *rvq)
>  {
>  	struct virtnet_info *vi = rvq->vdev->priv;
> +	/* Suppress further interrupts. */
> +	rvq->vq_ops->disable_cb(rvq);
>  	netif_rx_schedule(vi->dev, &vi->napi);
> -	/* Suppress further interrupts. */
> -	return false;
>  }
>  
>  static int virtnet_poll(struct napi_struct *napi, int budget)
> @@ -208,7 +207,7 @@ again:
>  	/* Out of packets? */
>  	if (received < budget) {
>  		netif_rx_complete(vi->dev, napi);
> -		if (unlikely(!vi->rvq->vq_ops->restart(vi->rvq))
> +		if (unlikely(!vi->rvq->vq_ops->enable_cb(vi->rvq))
>  		    && netif_rx_reschedule(vi->dev, napi))
>  			goto again;
>  	}
> diff -r 0eabf082c13a drivers/virtio/virtio_pci.c
> --- a/drivers/virtio/virtio_pci.c	Tue Dec 18 13:51:13 2007 +1100
> +++ b/drivers/virtio/virtio_pci.c	Tue Dec 18 15:49:18 2007 +1100
> @@ -190,7 +190,7 @@ static irqreturn_t vp_interrupt(int irq,
>  
>  /* the config->find_vq() implementation */
>  static struct virtqueue *vp_find_vq(struct virtio_device *vdev, unsigned index,
> -				    bool (*callback)(struct virtqueue *vq))
> +				    void (*callback)(struct virtqueue *vq))
>  {
>  	struct virtio_pci_device *vp_dev = to_vp_device(vdev);
>  	struct virtio_pci_vq_info *info;
> diff -r 0eabf082c13a drivers/virtio/virtio_ring.c
> --- a/drivers/virtio/virtio_ring.c	Tue Dec 18 13:51:13 2007 +1100
> +++ b/drivers/virtio/virtio_ring.c	Tue Dec 18 15:49:18 2007 +1100
> @@ -220,7 +220,17 @@ static void *vring_get_buf(struct virtqu
>  	return ret;
>  }
>  
> -static bool vring_restart(struct virtqueue *_vq)
> +static void vring_disable_cb(struct virtqueue *_vq)
> +{
> +	struct vring_virtqueue *vq = to_vvq(_vq);
> +
> +	START_USE(vq);
> +	BUG_ON(vq->vring.avail->flags & VRING_AVAIL_F_NO_INTERRUPT);
> +	vq->vring.avail->flags |= VRING_AVAIL_F_NO_INTERRUPT;
> +	END_USE(vq);
> +}
> +
> +static bool vring_enable_cb(struct virtqueue *_vq)
>  {
>  	struct vring_virtqueue *vq = to_vvq(_vq);
>  
> @@ -254,8 +264,8 @@ irqreturn_t vring_interrupt(int irq, voi
>  		return IRQ_HANDLED;
>  
>  	pr_debug("virtqueue callback for %p (%p)\n", vq, vq->vq.callback);
> -	if (vq->vq.callback && !vq->vq.callback(&vq->vq))
> -		vq->vring.avail->flags |= VRING_AVAIL_F_NO_INTERRUPT;
> +	if (vq->vq.callback)
> +		vq->vq.callback(&vq->vq);
>  
>  	return IRQ_HANDLED;
>  }
> @@ -266,7 +276,8 @@ static struct virtqueue_ops vring_vq_ops
>  	.add_buf = vring_add_buf,
>  	.get_buf = vring_get_buf,
>  	.kick = vring_kick,
> -	.restart = vring_restart,
> +	.disable_cb = vring_disable_cb,
> +	.enable_cb = vring_enable_cb,
>  	.shutdown = vring_shutdown,
>  };
>  
> @@ -274,7 +285,7 @@ struct virtqueue *vring_new_virtqueue(un
>  				      struct virtio_device *vdev,
>  				      void *pages,
>  				      void (*notify)(struct virtqueue *),
> -				      bool (*callback)(struct virtqueue *))
> +				      void (*callback)(struct virtqueue *))
>  {
>  	struct vring_virtqueue *vq;
>  	unsigned int i;
> diff -r 0eabf082c13a include/linux/virtio.h
> --- a/include/linux/virtio.h	Tue Dec 18 13:51:13 2007 +1100
> +++ b/include/linux/virtio.h	Tue Dec 18 15:49:18 2007 +1100
> @@ -11,15 +11,13 @@
>  /**
>   * virtqueue - a queue to register buffers for sending or receiving.
>   * @callback: the function to call when buffers are consumed (can be NULL).
> - *    If this returns false, callbacks are suppressed until vq_ops->restart
> - *    is called.
>   * @vdev: the virtio device this queue was created for.
>   * @vq_ops: the operations for this virtqueue (see below).
>   * @priv: a pointer for the virtqueue implementation to use.
>   */
>  struct virtqueue
>  {
> -	bool (*callback)(struct virtqueue *vq);
> +	void (*callback)(struct virtqueue *vq);
>  	struct virtio_device *vdev;
>  	struct virtqueue_ops *vq_ops;
>  	void *priv;
> @@ -41,7 +39,9 @@ struct virtqueue
>   *	vq: the struct virtqueue we're talking about.
>   *	len: the length written into the buffer
>   *	Returns NULL or the "data" token handed to add_buf.
> - * @restart: restart callbacks after callback returned false.
> + * @disable_cb: disable callbacks
> + *	vq: the struct virtqueue we're talking about.
> + * @enable_cb: restart callbacks after disable_cb.
>   *	vq: the struct virtqueue we're talking about.
>   *	This returns "false" (and doesn't re-enable) if there are pending
>   *	buffers in the queue, to avoid a race.
> @@ -65,7 +65,8 @@ struct virtqueue_ops {
>  
>  	void *(*get_buf)(struct virtqueue *vq, unsigned int *len);
>  
> -	bool (*restart)(struct virtqueue *vq);
> +	void (*disable_cb)(struct virtqueue *vq);
> +	bool (*enable_cb)(struct virtqueue *vq);
>  
>  	void (*shutdown)(struct virtqueue *vq);
>  };
> diff -r 0eabf082c13a include/linux/virtio_config.h
> --- a/include/linux/virtio_config.h	Tue Dec 18 13:51:13 2007 +1100
> +++ b/include/linux/virtio_config.h	Tue Dec 18 15:49:18 2007 +1100
> @@ -61,7 +61,7 @@ struct virtio_config_ops
>  	void (*set_status)(struct virtio_device *vdev, u8 status);
>  	struct virtqueue *(*find_vq)(struct virtio_device *vdev,
>  				     unsigned index,
> -				     bool (*callback)(struct virtqueue *));
> +				     void (*callback)(struct virtqueue *));
>  	void (*del_vq)(struct virtqueue *vq);
>  };
>  
> diff -r 0eabf082c13a include/linux/virtio_ring.h
> --- a/include/linux/virtio_ring.h	Tue Dec 18 13:51:13 2007 +1100
> +++ b/include/linux/virtio_ring.h	Tue Dec 18 15:49:18 2007 +1100
> @@ -114,7 +114,7 @@ struct virtqueue *vring_new_virtqueue(un
>  				      struct virtio_device *vdev,
>  				      void *pages,
>  				      void (*notify)(struct virtqueue *vq),
> -				      bool (*callback)(struct virtqueue *vq));
> +				      void (*callback)(struct virtqueue *vq));
>  void vring_del_virtqueue(struct virtqueue *vq);
>  
>  irqreturn_t vring_interrupt(int irq, void *_vq);
>
>   


-------------------------------------------------------------------------
This SF.net email is sponsored by: Microsoft
Defy all challenges. Microsoft(R) Visual Studio 2005.
http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/

  parent reply	other threads:[~2007-12-23 23:19 UTC|newest]

Thread overview: 21+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2007-12-14 12:12 virtio_net and SMP guests Christian Borntraeger
2007-12-16 11:55 ` Dor Laor
     [not found] ` <200712141312.05562.borntraeger-tA70FqPdS9bQT0dZR+AlfA@public.gmane.org>
2007-12-16 11:55   ` Dor Laor
2007-12-18  6:51   ` Rusty Russell
2007-12-23 23:19     ` Dor Laor
2008-01-10 12:37     ` Christian Borntraeger
     [not found]     ` <200712181751.24692.rusty-8n+1lVoiYb80n/F98K4Iww@public.gmane.org>
2007-12-23 23:19       ` Dor Laor [this message]
2007-12-24  0:54         ` Rusty Russell
     [not found]         ` <476EECF7.9000204-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
2007-12-24  0:54           ` Rusty Russell
     [not found]             ` <200712241154.20885.rusty-8n+1lVoiYb80n/F98K4Iww@public.gmane.org>
2007-12-25 12:22               ` Dor Laor
2007-12-26  0:48                 ` Rusty Russell
     [not found]                 ` <4770F60D.4010904-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
2007-12-26  0:48                   ` Rusty Russell
2007-12-25 12:22             ` Dor Laor
2008-01-10 12:37       ` Christian Borntraeger
2008-01-10 15:39         ` Christian Borntraeger
     [not found]         ` <200801101337.40433.borntraeger-tA70FqPdS9bQT0dZR+AlfA@public.gmane.org>
2008-01-10 15:39           ` Christian Borntraeger
2008-01-10 15:51             ` Christian Borntraeger
2008-01-11  9:53               ` Rusty Russell
2008-01-11  9:53               ` Rusty Russell
2008-01-10 15:51             ` Christian Borntraeger
2007-12-18  6:51 ` Rusty Russell

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=476EECF7.9000204@qumranet.com \
    --to=dor.laor-re5jqeeqqe8avxtiumwx3w@public.gmane.org \
    --cc=borntraeger-tA70FqPdS9bQT0dZR+AlfA@public.gmane.org \
    --cc=dor.laor-atKUWr5tajBWk0Htik3J/w@public.gmane.org \
    --cc=kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f@public.gmane.org \
    --cc=rusty-8n+1lVoiYb80n/F98K4Iww@public.gmane.org \
    --cc=virtualization-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.