* [kvm-ppc-devel] virtio network traffic issues
@ 2008-03-25 13:45 Christian Ehrhardt
  2008-03-25 13:52 ` Hollis Blanchard
  2008-03-26 16:06 ` Christian Ehrhardt
  0 siblings, 2 replies; 7+ messages in thread
From: Christian Ehrhardt @ 2008-03-25 13:45 UTC (permalink / raw)
  To: kvm-ppc

Hi,
I analyzed the network traffic to get a feeling for where our bug may be located.
Below I summarize what I think I see in that data.

Environment:
192.168.1.2 = nfs host
192.168.1.3 = ppc host
192.168.1.4 = ppc host 2nd network interface (not active, but bridged to tap)
192.168.1.10 = ppc kvm guest - configured via dhcp (working)


time 0-25: host nfs and telnet
time ~25:   DHCP discover from kvm guest + offer answer from dhcp server
       after that there are arp requests for the nfs host which are answered correctly
       the guest gets a portmap response for nfs and mounts the nfs share
       Then the guest does some nfs getattr and statfs calls followed by some larger read calls
       The read calls are single packets while the replies (the data) are fragmented
       In the wireshark trace we even see that it accesses dev/console first and then /sbin/init (this is where the reads go)
       This continues with ld.so.1 and other libs which are read via nfs (responses look good)
packet 350 is retransmitted as packet 356 - there is no visible error, so it must have been a timeout
then we get more retransmissions of that request (there are replies to these requests all the time - does the guest not "see" them?)
packet 365 is then a reply to an earlier getattr call

=> it looks as if no packet was actually received by the guest after #353
   after that only timeouts, retransmits and unanswered arp requests are seen

packets 368-385 are some kind of arp flood asking for 192.168.1.10 (the kvm guest), but there is no reply to those either
First these are direct arp requests, then broadcasts - this looks (as described above) as if no packet is received by the guest after packet 353

The rest is my telnet connection aborting the kvm guest and some host->nfs server traffic that is not interesting for us.

Summary:
=> from some not yet identified point on, our guest seems to receive absolutely nothing
=> when the guest is hanging it sends nfs requests which are seen externally, but it does not seem to get the responses
=> the arp requests for the guest are repeated - maybe we can add some very verbose debug in the guest virtio code and activate it once we see that it is already hanging
=> ideas:
  - maybe some buffer/memory runs out, so incoming packets can no longer be received
  - we break something in virtio / incoming interrupts (or maybe a lock) and from that point on no receive is possible
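
The first idea (running out of receive buffers) can be illustrated with a small user-space toy model. This is only a sketch under assumptions: the struct and function names are made up for illustration and are not the driver's API, and the only thing taken from the real receive path is the idea that refilling is triggered from packet reception once fewer than half of the buffers remain.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of a receive ring; all names here are hypothetical. */
struct rx_ring {
    int num;   /* buffers currently posted for incoming packets */
    int max;   /* ring capacity */
};

/* Refill to capacity unless the allocation "fails" (simulated OOM). */
static void try_fill(struct rx_ring *r, bool alloc_ok)
{
    if (alloc_ok)
        r->num = r->max;
}

/* Deliver one incoming packet; returns false if it had to be dropped. */
static bool deliver_packet(struct rx_ring *r, bool alloc_ok)
{
    if (r->num == 0)
        return false;            /* no buffer left: the packet is lost */
    r->num--;
    if (r->num < r->max / 2)     /* refill only from the receive path */
        try_fill(r, alloc_ok);
    return true;
}

/* Count how many of n packets get through while allocation keeps failing. */
static int delivered_during_oom(int max, int n)
{
    struct rx_ring r = { .num = max, .max = max };
    int ok = 0;

    for (int i = 0; i < n; i++)
        if (deliver_packet(&r, false))
            ok++;
    assert(r.num >= 0);
    return ok;
}
```

In this model a ring of 8 buffers passes exactly 8 packets during the simulated OOM. Once it is empty, later packets are dropped even if memory is available again, because nothing outside the receive path refills the ring. If the real driver behaves like this, the symptom would match what the trace shows.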


I'll continue on that and keep you informed

-- 

Grüsse / regards, 
Christian Ehrhardt
IBM Linux Technology Center, Open Virtualization

-------------------------------------------------------------------------
This SF.net email is sponsored by: Microsoft
Defy all challenges. Microsoft(R) Visual Studio 2008.
http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/
_______________________________________________
kvm-ppc-devel mailing list
kvm-ppc-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/kvm-ppc-devel

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [kvm-ppc-devel] virtio network traffic issues
  2008-03-25 13:45 [kvm-ppc-devel] virtio network traffic issues Christian Ehrhardt
@ 2008-03-25 13:52 ` Hollis Blanchard
  2008-03-26 16:07     ` Christian Ehrhardt
  2008-03-26 16:06 ` Christian Ehrhardt
  1 sibling, 1 reply; 7+ messages in thread
From: Hollis Blanchard @ 2008-03-25 13:52 UTC (permalink / raw)
  To: kvm-ppc

On Tue, 2008-03-25 at 14:45 +0100, Christian Ehrhardt wrote:
> 
> => from one not yet defined point our guest seems to receive
> absolutely nothing
> => when the guest is hanging it sends nfs requests which are seen
> externally, but it does not seem to get the responses
> => the arp requests for the guest are repeated - maybe we can add some
> very verbose debug in the guest virtio code and activate it when we
> see that it is already hanging
> => ideas:
>   - maybe some buffer/memory runs out and so incoming packets won't be
> received anymore 
>   - we break something in virtio / incoming interrupts (or maybe a
> lock) and from that point no receive is possible
> 
Could we be missing interrupts? Can you add a guest kernel timer that
calls virtio_poll() regularly?
> 
-- 
Hollis Blanchard
IBM Linux Technology Center



^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [kvm-ppc-devel] virtio network traffic issues
  2008-03-25 13:45 [kvm-ppc-devel] virtio network traffic issues Christian Ehrhardt
  2008-03-25 13:52 ` Hollis Blanchard
@ 2008-03-26 16:06 ` Christian Ehrhardt
  1 sibling, 0 replies; 7+ messages in thread
From: Christian Ehrhardt @ 2008-03-26 16:06 UTC (permalink / raw)
  To: kvm-ppc

[-- Attachment #1: Type: text/plain, Size: 1615 bytes --]

Hollis Blanchard wrote:
> On Tue, 2008-03-25 at 14:45 +0100, Christian Ehrhardt wrote:
>> => from one not yet defined point our guest seems to receive absolutely nothing
>> => when the guest is hanging it sends nfs requests which are seen externally, but it does not seem to get the responses
>> => the arp requests for the guest are repeated - maybe we can add some very verbose debug in the guest virtio code and activate it when we see that it is already hanging
>> => ideas:
>>   - maybe some buffer/memory runs out and so incoming packets won't be received anymore 
>>   - we break something in virtio / incoming interrupts (or maybe a lock) and from that point no receive is possible
>>
> Could we be missing interrupts? Can you add a guest kernel timer that
> calls virtio_poll() regularly?


That works, but it is not the final solution.
A preliminary workaround patch is attached.

Atm I think we might disable interrupts and polling at the same time.
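
To make that suspicion concrete, here is a minimal user-space sketch of the receive state as I understand it. It is a toy model under assumptions, not the driver code: `rx_state`, `host_deliver`, `scheduled_poll`, `rx_stalled` and `timer_poll` are hypothetical names. The point is only that once callbacks are disabled and no poll is scheduled, nothing in the normal path can restart reception, while a timer-driven poll still drains the queue.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the guest receive state. */
struct rx_state {
    bool cb_enabled;      /* device is allowed to raise an rx interrupt */
    bool poll_scheduled;  /* a poll run is pending */
    int  pending;         /* packets waiting to be picked up */
};

/* Host queues one packet; it only interrupts if callbacks are enabled. */
static void host_deliver(struct rx_state *s)
{
    s->pending++;
    if (s->cb_enabled) {
        s->cb_enabled = false;     /* interrupt path disables callbacks ... */
        s->poll_scheduled = true;  /* ... and schedules the poll */
    }
}

/* Regular poll: runs only if scheduled, then re-enables callbacks. */
static int scheduled_poll(struct rx_state *s)
{
    if (!s->poll_scheduled)
        return 0;
    int got = s->pending;
    s->pending = 0;
    s->poll_scheduled = false;
    s->cb_enabled = true;
    return got;
}

/* The suspected bad state: neither interrupts nor polling can fire. */
static bool rx_stalled(const struct rx_state *s)
{
    return !s->cb_enabled && !s->poll_scheduled;
}

/* Timer workaround: drain unconditionally, leave the flags untouched. */
static int timer_poll(struct rx_state *s)
{
    int got = s->pending;

    assert(got >= 0);
    s->pending = 0;
    return got;
}
```

Starting from the stalled state, any number of `host_deliver()` calls only increases `pending`, and `scheduled_poll()` returns 0. A `timer_poll()` picks the packets up, but the state stays stalled, which would explain why the workaround helps without being a fix.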

Good news: it was slow, but I saw a login prompt ;-)
Here is a guest "cat /proc/cpuinfo":
cat /proc/cpuinfo
processor       : 0
cpu             : unknown (00000000)
clock           : 666.666660MHz
revision        : 0.0 (pvr 0000 0000)
bogomips        : 2490.36
timebase        : 666666660
platform        : Bamboo

That means once we have found the reason for the starving virtio-net device we should have a basic working linux guest.

P.S. added virtualization@lists.linux-foundation.org to get any virtio-net related suggestions from there too

-- 

Grüsse / regards, 
Christian Ehrhardt
IBM Linux Technology Center, Open Virtualization

[-- Attachment #2: virtio-net-poll-on-timer --]
[-- Type: text/plain, Size: 3047 bytes --]

Subject: [PATCH] kvmppc virtio-net: workaround for lost interrupt/polling

From: Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com>

This patch is (atm) just a debug workaround. The issue is that virtio-net
works fine for a while but then "something" happens and we see neither
vp_interrupts nor calls to virtnet_poll anymore.
Looking at the network traffic shows that the kvm guest still sends packets
via virtio-net and that userspace tries to deliver things to the guest, but
the guest receives nothing.
Somehow it looks like polling and interrupts are disabled (more debugging
needed).
For now anyone can continue with that workaround patch (which is
very slow, I had no time to tune the polling interval yet).
There's an ugly fixme, but I don't yet know what exactly causes this BUG()
to trigger, so that's the way to get it out for now.
I'll update the patch once I have an improved version.
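
On the polling interval: the patch re-arms the timer with HZ/150 after a poll that received something and HZ/25 otherwise. Both are integer divisions, so the effective delay depends on the kernel's HZ. The following user-space sketch just does that arithmetic; the helper names are hypothetical and the HZ values in the note below are examples, not the value used on the guest.

```c
#include <assert.h>

/* Jiffies added to the timer for a given HZ and divisor, using the same
 * integer division as the expressions HZ/150 and HZ/25. */
static unsigned long poll_delay_jiffies(unsigned long hz, unsigned long divisor)
{
    assert(divisor != 0);
    return hz / divisor;
}

/* Convert a jiffy count to milliseconds for a given HZ. */
static unsigned long jiffies_to_ms(unsigned long hz, unsigned long j)
{
    assert(hz != 0);
    return j * 1000UL / hz;
}
```

For HZ=1000 the fast interval is 6 jiffies (6 ms), for HZ=250 it is 1 jiffy (4 ms), and for HZ=100 it truncates to 0 jiffies, which should simply mean expiry at the next timer tick. The slow interval HZ/25 comes out as 40 ms in all three cases.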

Signed-off-by: Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com>
---

[diffstat]
 virtio_net.c |   36 ++++++++++++++++++++++++++++++------
 1 files changed, 30 insertions(+), 6 deletions(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -194,17 +194,22 @@ again:
 		received++;
 	}
 
+
 	/* FIXME: If we oom and completely run out of inbufs, we need
 	 * to start a timer trying to fill more. */
 	if (vi->num < vi->max / 2)
 		try_fill_recv(vi);
 
-	/* Out of packets? */
-	if (received < budget) {
-		netif_rx_complete(vi->dev, napi);
-		if (unlikely(!vi->rvq->vq_ops->enable_cb(vi->rvq))
-		    && netif_rx_reschedule(vi->dev, napi))
-			goto again;
+	/* FIXME - fails when called by workaround timer polling (sometimes) */
+	if (budget != 42)
+	{
+		/* Out of packets? */
+		if (received < budget) {
+				netif_rx_complete(vi->dev, napi);
+			if (unlikely(!vi->rvq->vq_ops->enable_cb(vi->rvq))
+			    && netif_rx_reschedule(vi->dev, napi))
+				goto again;
+		}
 	}
 
 	return received;
@@ -294,6 +299,17 @@ again:
 	return 0;
 }
 
+static struct timer_list viopoll_timer;
+static void virtio_poll_wrap(unsigned long dev)
+{
+        struct virtnet_info *vi = netdev_priv((struct net_device *)dev);
+	/* poll more often if polling received something */
+	if (virtnet_poll(&vi->napi, 42))
+		mod_timer(&viopoll_timer, get_jiffies_64() + (HZ/150));
+	else
+		mod_timer(&viopoll_timer, get_jiffies_64() + (HZ/25));
+}
+
 static int virtnet_open(struct net_device *dev)
 {
 	struct virtnet_info *vi = netdev_priv(dev);
@@ -308,12 +324,20 @@ static int virtnet_open(struct net_devic
 		vi->rvq->vq_ops->disable_cb(vi->rvq);
 		__netif_rx_schedule(dev, &vi->napi);
 	}
+
+	// DEBUG (Missing interrupts ?)
+	setup_timer(&viopoll_timer, virtio_poll_wrap, (unsigned long)(dev));
+	mod_timer(&viopoll_timer, get_jiffies_64() + HZ);
+	printk("%s - set up virtnet_poll timer\n", __func__);
+
 	return 0;
 }
 
 static int virtnet_close(struct net_device *dev)
 {
 	struct virtnet_info *vi = netdev_priv(dev);
+
+        del_timer(&viopoll_timer);
 
 	napi_disable(&vi->napi);
 


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [kvm-ppc-devel] virtio network traffic issues
  2008-03-25 13:52 ` Hollis Blanchard
@ 2008-03-26 16:07     ` Christian Ehrhardt
  0 siblings, 0 replies; 7+ messages in thread
From: Christian Ehrhardt @ 2008-03-26 16:07 UTC (permalink / raw)
  To: Hollis Blanchard; +Cc: kvm-ppc-devel, virtualization

[-- Attachment #1: Type: text/plain, Size: 1629 bytes --]

Hollis Blanchard wrote:
> On Tue, 2008-03-25 at 14:45 +0100, Christian Ehrhardt wrote:
>> => from one not yet defined point our guest seems to receive absolutely nothing
>> => when the guest is hanging it sends nfs requests which are seen externally, but it does not seem to get the responses
>> => the arp requests for the guest are repeated - maybe we can add some very verbose debug in the guest virtio code and activate it when we see that it is already hanging
>> => ideas:
>>   - maybe some buffer/memory runs out and so incoming packets won't be received anymore 
>>   - we break something in virtio / incoming interrupts (or maybe a lock) and from that point no receive is possible
>>
> Could we be missing interrupts? Can you add a guest kernel timer that
> calls virtio_poll() regularly?


That works, but it is not the final solution.
A preliminary workaround patch is attached.

Atm I think we might disable interrupts and polling at the same time.

Good news: it was slow, but I saw a login prompt ;-)
Here is a guest "cat /proc/cpuinfo":
cat /proc/cpuinfo
processor       : 0
cpu             : unknown (00000000)
clock           : 666.666660MHz
revision        : 0.0 (pvr 0000 0000)
bogomips        : 2490.36
timebase        : 666666660
platform        : Bamboo

That means once we have found the reason for the starving virtio-net device we should have a basic working linux guest.

P.S. added virtualization@lists.linux-foundation.org (this time) to get any virtio-net related suggestions from there too

-- 

Grüsse / regards, 
Christian Ehrhardt
IBM Linux Technology Center, Open Virtualization


[-- Attachment #2: virtio-net-poll-on-timer --]
[-- Type: text/plain, Size: 3048 bytes --]

Subject: [PATCH] kvmppc virtio-net: workaround for lost interrupt/polling

From: Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com>

This patch is (atm) just a debug workaround. The issue is that virtio-net
works fine for a while but then "something" happens and we see neither
vp_interrupts nor calls to virtnet_poll anymore.
Looking at the network traffic shows that the kvm guest still sends packets
via virtio-net and that userspace tries to deliver things to the guest, but
the guest receives nothing.
Somehow it looks like polling and interrupts are disabled (more debugging
needed).
For now anyone can continue with that workaround patch (which is
very slow, I had no time to tune the polling interval yet).
There's an ugly fixme, but I don't yet know what exactly causes this BUG()
to trigger, so that's the way to get it out for now.
I'll update the patch once I have an improved version.

Signed-off-by: Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com>
---

[diffstat]
 virtio_net.c |   36 ++++++++++++++++++++++++++++++------
 1 files changed, 30 insertions(+), 6 deletions(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -194,17 +194,22 @@ again:
 		received++;
 	}
 
+
 	/* FIXME: If we oom and completely run out of inbufs, we need
 	 * to start a timer trying to fill more. */
 	if (vi->num < vi->max / 2)
 		try_fill_recv(vi);
 
-	/* Out of packets? */
-	if (received < budget) {
-		netif_rx_complete(vi->dev, napi);
-		if (unlikely(!vi->rvq->vq_ops->enable_cb(vi->rvq))
-		    && netif_rx_reschedule(vi->dev, napi))
-			goto again;
+	/* FIXME - fails when called by workaround timer polling (sometimes) */
+	if (budget != 42)
+	{
+		/* Out of packets? */
+		if (received < budget) {
+				netif_rx_complete(vi->dev, napi);
+			if (unlikely(!vi->rvq->vq_ops->enable_cb(vi->rvq))
+			    && netif_rx_reschedule(vi->dev, napi))
+				goto again;
+		}
 	}
 
 	return received;
@@ -294,6 +299,17 @@ again:
 	return 0;
 }
 
+static struct timer_list viopoll_timer;
+static void virtio_poll_wrap(unsigned long dev)
+{
+        struct virtnet_info *vi = netdev_priv((struct net_device *)dev);
+	/* poll more often if polling received something */
+	if (virtnet_poll(&vi->napi, 42))
+		mod_timer(&viopoll_timer, get_jiffies_64() + (HZ/150));
+	else
+		mod_timer(&viopoll_timer, get_jiffies_64() + (HZ/25));
+}
+
 static int virtnet_open(struct net_device *dev)
 {
 	struct virtnet_info *vi = netdev_priv(dev);
@@ -308,12 +324,20 @@ static int virtnet_open(struct net_devic
 		vi->rvq->vq_ops->disable_cb(vi->rvq);
 		__netif_rx_schedule(dev, &vi->napi);
 	}
+
+	// DEBUG (Missing interrupts ?)
+	setup_timer(&viopoll_timer, virtio_poll_wrap, (unsigned long)(dev));
+	mod_timer(&viopoll_timer, get_jiffies_64() + HZ);
+	printk("%s - set up virtnet_poll timer\n", __func__);
+
 	return 0;
 }
 
 static int virtnet_close(struct net_device *dev)
 {
 	struct virtnet_info *vi = netdev_priv(dev);
+
+        del_timer(&viopoll_timer);
 
 	napi_disable(&vi->napi);
 



^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [kvm-ppc-devel] virtio network traffic issues
  2008-03-26 16:07     ` Christian Ehrhardt
@ 2008-03-26 17:03       ` Dor Laor
  -1 siblings, 0 replies; 7+ messages in thread
From: Dor Laor @ 2008-03-26 17:03 UTC (permalink / raw)
  To: Christian Ehrhardt; +Cc: kvm-ppc-devel, Hollis Blanchard, virtualization


On Wed, 2008-03-26 at 17:07 +0100, Christian Ehrhardt wrote:
> Hollis Blanchard wrote:
> > On Tue, 2008-03-25 at 14:45 +0100, Christian Ehrhardt wrote:
> >> => from one not yet defined point our guest seems to receive absolutely nothing
> >> => when the guest is hanging it sends nfs requests which are seen externally, but it does not seem to get the responses
> >> => the arp requests for the guest are repeated - maybe we can add some very verbose debug in the guest virtio code and activate it when we see that it is already hanging
> >> => ideas:
> >>   - maybe some buffer/memory runs out and so incoming packets won't be received anymore 
> >>   - we break something in virtio / incoming interrupts (or maybe a lock) and from that point no receive is possible
> >>
> > Could we be missing interrupts? Can you add a guest kernel timer that
> > calls virtio_poll() regularly?
> 
> 
> Working but not the final solution.
> Preliminary workaround patch attached
> 
> Atm I think we might disable interrupts and polling at the same time.

By disabled interrupts, do you mean that the VRING_AVAIL_F_NO_INTERRUPT
flag is set, or really your s390 replacement for cli?

If it's just the flag you can ignore it in the host (virtio-net.c).


btw: I didn't see any VRING_USED_F_NO_NOTIFY on the guest side.
It is only for optimizations so just put it on your todo list.
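
To spell out the two flags, here is a small user-space sketch. The flag values are the ones I know from linux/virtio_ring.h; the two helper functions and the `ignore_flag` switch are hypothetical and only illustrate the "ignore it in the host" suggestion above, they are not the actual host code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Both flags are bit 0 of their own flags field. */
#define VRING_USED_F_NO_NOTIFY     1  /* host to guest: "don't kick me" */
#define VRING_AVAIL_F_NO_INTERRUPT 1  /* guest to host: "don't interrupt me" */

/* Host side: inject an interrupt after consuming a buffer?
 * With ignore_flag set the host interrupts unconditionally. */
static bool host_should_interrupt(uint16_t avail_flags, bool ignore_flag)
{
    if (ignore_flag)
        return true;
    return !(avail_flags & VRING_AVAIL_F_NO_INTERRUPT);
}

/* Guest side: notify the host after adding a buffer? */
static bool guest_should_kick(uint16_t used_flags)
{
    return !(used_flags & VRING_USED_F_NO_NOTIFY);
}
```

Both flags are only hints for suppressing events, so ignoring VRING_AVAIL_F_NO_INTERRUPT on the host costs extra interrupts but should not break correctness. That makes it a cheap experiment to see whether the stall is related to the flag.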

> 
> Good message: it was slow, but I saw a login prompt ;-)
> here a guest "cat /proc/cpuinfo"
> cat /proc/cpuinfo
> processor       : 0
> cpu             : unknown (00000000)
> clock           : 666.666660MHz
> revision        : 0.0 (pvr 0000 0000)
> bogomips        : 2490.36
> timebase        : 666666660
> platform        : Bamboo
> 
> That means once we have found the reason for the starving virtio-net device we should have a basic working linux guest.
> 
> P.S. added virtualization@lists.linux-foundation.org (this time) to get any virtio-net related suggestions from there too
> 
> plain text document attachment (virtio-net-poll-on-timer)
> Subject: [PATCH] kvmppc virtio-net: workaround for lost interrupt/polling
> 
> From: Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com>
> 
> This patch is (atm) just a debug workaround. The issue is that virtio-net
> works fine for a while but then "something" happens and we see neither
> vp_interrupts nor calls to virtnet_poll anymore.
> Looking at the network traffic shows that the kvm guest still sends packets
> via virtio-net and that userspace tries to deliver things to the guest, but
> the guest receives nothing.
> Somehow it looks like polling and interrupts are disabled (more debugging
> needed).
> For now anyone can continue with that workaround patch (which is
> very slow, I had no time to tune the polling interval yet).
> There's an ugly fixme, but I don't yet know what exactly causes this BUG()
> to trigger, so that's the way to get it out for now.
> I'll update the patch once I have an improved version.
> 
> Signed-off-by: Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com>
> ---
> 
> [diffstat]
>  virtio_net.c |   36 ++++++++++++++++++++++++++++++------
>  1 files changed, 30 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
> @@ -194,17 +194,22 @@ again:
>  		received++;
>  	}
>  
> +
>  	/* FIXME: If we oom and completely run out of inbufs, we need
>  	 * to start a timer trying to fill more. */
>  	if (vi->num < vi->max / 2)
>  		try_fill_recv(vi);
>  
> -	/* Out of packets? */
> -	if (received < budget) {
> -		netif_rx_complete(vi->dev, napi);
> -		if (unlikely(!vi->rvq->vq_ops->enable_cb(vi->rvq))
> -		    && netif_rx_reschedule(vi->dev, napi))
> -			goto again;
> +	/* FIXME - fails when called by workaround timer polling (sometimes) */
> +	if (budget != 42)
> +	{
> +		/* Out of packets? */
> +		if (received < budget) {
> +				netif_rx_complete(vi->dev, napi);
> +			if (unlikely(!vi->rvq->vq_ops->enable_cb(vi->rvq))
> +			    && netif_rx_reschedule(vi->dev, napi))
> +				goto again;
> +		}
>  	}
>  
>  	return received;
> @@ -294,6 +299,17 @@ again:
>  	return 0;
>  }
>  
> +static struct timer_list viopoll_timer;
> +static void virtio_poll_wrap(unsigned long dev)
> +{
> +        struct virtnet_info *vi = netdev_priv((struct net_device *)dev);
> +	/* poll more often if polling received something */
> +	if (virtnet_poll(&vi->napi, 42))
> +		mod_timer(&viopoll_timer, get_jiffies_64() + (HZ/150));
> +	else
> +		mod_timer(&viopoll_timer, get_jiffies_64() + (HZ/25));
> +}
> +
>  static int virtnet_open(struct net_device *dev)
>  {
>  	struct virtnet_info *vi = netdev_priv(dev);
> @@ -308,12 +324,20 @@ static int virtnet_open(struct net_devic
>  		vi->rvq->vq_ops->disable_cb(vi->rvq);
>  		__netif_rx_schedule(dev, &vi->napi);
>  	}
> +
> +	// DEBUG (Missing interrupts ?)
> +	setup_timer(&viopoll_timer, virtio_poll_wrap, (unsigned long)(dev));
> +	mod_timer(&viopoll_timer, get_jiffies_64() + HZ);
> +	printk("%s - set up virtnet_poll timer\n", __func__);
> +
>  	return 0;
>  }
>  
>  static int virtnet_close(struct net_device *dev)
>  {
>  	struct virtnet_info *vi = netdev_priv(dev);
> +
> +        del_timer(&viopoll_timer);
>  
>  	napi_disable(&vi->napi);
>  
> 
> _______________________________________________
> Virtualization mailing list
> Virtualization@lists.linux-foundation.org
> https://lists.linux-foundation.org/mailman/listinfo/virtualization



^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2008-03-26 17:03 UTC | newest]

Thread overview: 7+ messages
2008-03-25 13:45 [kvm-ppc-devel] virtio network traffic issues Christian Ehrhardt
2008-03-25 13:52 ` Hollis Blanchard
2008-03-26 16:07   ` Christian Ehrhardt
2008-03-26 16:07     ` Christian Ehrhardt
2008-03-26 17:03     ` Dor Laor
2008-03-26 17:03       ` Dor Laor
2008-03-26 16:06 ` Christian Ehrhardt
