From: Dor Laor <dor.laor-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
To: Christian Borntraeger
	<borntraeger-tA70FqPdS9bQT0dZR+AlfA@public.gmane.org>
Cc: kvm-devel
	<kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f@public.gmane.org>,
	virtualization-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org
Subject: Re: virtio_net and SMP guests
Date: Sun, 16 Dec 2007 13:55:21 +0200
Message-ID: <47651229.4050001@qumranet.com>
In-Reply-To: <200712141312.05562.borntraeger-tA70FqPdS9bQT0dZR+AlfA@public.gmane.org>

Christian Borntraeger wrote:
> Rusty, Anthony, Dor,
>
> I need your brain power :-)
>
> On smp guests I have seen a problem with virtio (the version in Avi's current
> git) which does not occur on single processor guests:
>
> kernel BUG at /space/kvm/drivers/virtio/virtio_ring.c:228!
> illegal operation: 0001 [#1]
> Modules linked in: ipv6
> CPU:    2    Not tainted
> Process swapper (pid: 0, task: 000000000f83e038, ksp: 000000000f877d70)
> Krnl PSW : 0704000180000000 000000000045df2a (vring_restart+0x5a/0x70)
>            R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:0 CC:0 PM:0 EA:3
> Krnl GPRS: 00000000c0a80101 0000000000000000 000000000eb35000 0000000010005800
>            000000000045ded0 000000000000192f 000000000eb21000 000000000eb21000
>            000000000000000e 000000000eb21900 000000000eb21920 000000000f867cb8
>            0700000000d9b058 0000000000000010 000000000045c06a 000000000f867cb8
> Krnl Code: 000000000045df1e: e3b0b0700004       lg      %r11,112(%r11)
>            000000000045df24: 07fe               bcr     15,%r14
>            000000000045df26: a7f40001           brc     15,45df28
>           >000000000045df2a: a7f4ffe1           brc     15,45deec
>            000000000045df2e: e31020300004       lg      %r1,48(%r2)
>            000000000045df34: a7480000           lhi     %r4,0
>            000000000045df38: 96011001           oi      1(%r1),1
>            000000000045df3c: a7f4ffef           brc     15,45df1a
> Call Trace:
> ([<000000000045c016>] virtnet_poll+0x96/0x42c)
>  [<000000000048cda2>] net_rx_action+0xca/0x150
>  [<0000000000137f7a>] __do_softirq+0x9e/0x130
>  [<00000000001105d6>] do_softirq+0xae/0xb4
>  [<0000000000138182>] irq_exit+0x96/0x9c
>  [<000000000010d710>] do_extint+0xcc/0xf8
>  [<00000000001135d0>] ext_no_vtime+0x16/0x1a
>  [<000000000010a57e>] cpu_idle+0x216/0x238
>
>
> I think there is a valid code path, triggering this bug:
>
> 	CPU1						CPU2
> -----------------------				-----------------------
> - virtnet_poll found no
>   more packets on queue
> - netif_rx_complete allow
>   poll to be called
> - vq_ops->restart is called
> - vq Interrupts are enabled	
> 	.		     <new packets arrive>
> <vcpu is scheduled away>
> 	.					- interrupt is delivered
> 	.					- poll is called
> 	.					- poll work is done
> 	.					- netif_rx_complete
> 	.					- vq_ops->restart is called
> 	.					- check if vq interrupts are
> 	.					  enabled --> BUG
>
>   
I don't understand how this is possible:

<vcpu is scheduled away>
	.					- interrupt is delivered
	.					- vring_interrupt is called
	.					- skb_recv_done callback returns false, so
	.					  vq->vring.avail->flags |= VRING_AVAIL_F_NO_INTERRUPT;

So when the restart callback is called afterwards, the
BUG_ON(!(vq->vring.avail->flags & VRING_AVAIL_F_NO_INTERRUPT));
will not trigger.

	.					- poll is called
	.					- poll work is done
	.					- netif_rx_complete
	.					- vq_ops->restart is called
	.					- check if vq interrupts are
	.					  enabled --> BUG
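
To make the argument above concrete, here is a minimal single-threaded sketch in plain userspace C. It is not the driver code: struct model_vq, model_interrupt, model_restart and sequence_hits_bug are hypothetical names, and the only things taken from the thread are the flag name and the two flag updates. It replays the steps in the order listed above and reports whether the BUG_ON check would have fired. It does not model real concurrency between the two CPUs, so it cannot rule out a race; it only shows that this particular ordering leaves the flag set when the second restart runs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define VRING_AVAIL_F_NO_INTERRUPT 1

/* Hypothetical stand-in for vq->vring.avail->flags. */
struct model_vq {
	uint16_t avail_flags;
};

/* Models vring_interrupt when the callback (skb_recv_done) returns
 * false: further interrupts are suppressed. */
static void model_interrupt(struct model_vq *vq)
{
	vq->avail_flags |= VRING_AVAIL_F_NO_INTERRUPT;
}

/* Models vring_restart: returns true if the BUG_ON at the top would
 * have fired, then re-enables interrupts by clearing the flag. */
static bool model_restart(struct model_vq *vq)
{
	bool would_bug = !(vq->avail_flags & VRING_AVAIL_F_NO_INTERRUPT);

	vq->avail_flags &= (uint16_t)~VRING_AVAIL_F_NO_INTERRUPT;
	return would_bug;
}

/* Replays the sequence from the diagram; assumes polling starts with
 * interrupts suppressed. */
static bool sequence_hits_bug(void)
{
	struct model_vq vq = { .avail_flags = VRING_AVAIL_F_NO_INTERRUPT };
	bool bug = false;

	bug |= model_restart(&vq);	/* CPU1: restart, interrupts enabled */
	model_interrupt(&vq);		/* CPU2: interrupt, callback returns false */
	bug |= model_restart(&vq);	/* CPU2: poll work done, restart again */
	return bug;
}
```

In this model the check only fires if model_restart runs twice with no model_interrupt in between, which is the situation the BUG_ON is meant to catch.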


> The first idea was to remove this check? (See patch below). I am not sure
> if the proper fix also requires to change vring.avail->flags to be only
> changed by atomic bitops. Any ideas, comments?
>   
For now no harm can be done, since the flags are only changed in two places:
1. vring_restart, inside the napi poll callback, which is protected by the
   napi poll lock.
2. vring_interrupt, in the interrupt handler.
As long as only the VRING_AVAIL_F_NO_INTERRUPT bit is touched there is no
possible harm; once we use more bits it might become an issue.
So maybe we should use atomics.
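
As a rough illustration of what atomic updates would buy once more bits are in use, here is a userspace sketch using C11 <stdatomic.h>. This is only a model: in the kernel one would use the kernel's own atomic bitops rather than C11 atomics, and avail_flags, flags_set, flags_clear and HYPOTHETICAL_OTHER_FLAG are made-up names for this example (the real flags field is also narrower than the atomic_uint used here).

```c
#include <assert.h>
#include <stdatomic.h>

#define VRING_AVAIL_F_NO_INTERRUPT 1u
/* Hypothetical second flag bit; no such bit is defined in this thread. */
#define HYPOTHETICAL_OTHER_FLAG    2u

/* Hypothetical stand-in for vq->vring.avail->flags. */
static atomic_uint avail_flags;

/* Atomically set bits; returns the flags as they were before the update. */
static unsigned int flags_set(unsigned int bits)
{
	return atomic_fetch_or(&avail_flags, bits);
}

/* Atomically clear bits; returns the flags as they were before the update. */
static unsigned int flags_clear(unsigned int bits)
{
	return atomic_fetch_and(&avail_flags, ~bits);
}
```

Because each update is a single atomic read-modify-write, clearing VRING_AVAIL_F_NO_INTERRUPT on one CPU cannot lose a different bit that another CPU sets at the same time, which a plain "flags &= ~bit" could.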
> Signed-off-by: Christian Borntraeger <borntraeger-tA70FqPdS9bQT0dZR+AlfA@public.gmane.org>
> CC: Anthony Liguori <aliguori-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
> CC: Dor Laor <dor.laor-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
> CC: Rusty Russell <rusty-8n+1lVoiYb80n/F98K4Iww@public.gmane.org>
>
> ---
>  drivers/virtio/virtio_ring.c |    2 --
>  1 file changed, 2 deletions(-)
>
> Index: kvm/drivers/virtio/virtio_ring.c
> ===================================================================
> --- kvm.orig/drivers/virtio/virtio_ring.c
> +++ kvm/drivers/virtio/virtio_ring.c
> @@ -225,8 +225,6 @@ static bool vring_restart(struct virtque
>  	struct vring_virtqueue *vq = to_vvq(_vq);
>  
>  	START_USE(vq);
> -	BUG_ON(!(vq->vring.avail->flags & VRING_AVAIL_F_NO_INTERRUPT));
> -
>  	/* We optimistically turn back on interrupts, then check if there was
>  	 * more to do. */
>  	vq->vring.avail->flags &= ~VRING_AVAIL_F_NO_INTERRUPT;
>
>   

