netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] tun: Persistent devices can get stuck in xoff state
@ 2008-07-09  6:24 Max Krasnyansky
  2008-07-09 15:15 ` Christian Borntraeger
  2008-07-09 15:59 ` Christian Borntraeger
  0 siblings, 2 replies; 7+ messages in thread
From: Max Krasnyansky @ 2008-07-09  6:24 UTC (permalink / raw)
  To: davem; +Cc: netdev, Max Krasnyansky

The scenario goes like this. App stops reading from tun/tap.
TX queue gets full and driver does netif_stop_queue().
App closes fd and TX queue gets flushed as part of the cleanup.
Next time the app opens tun/tap and starts reading from it but
the xoff state is not cleared. We're stuck.
Normally xoff state is cleared when netdev is brought up. But
in the case of persistent devices this happens only during
initial setup.

The fix is trivial. If device is already up when an app opens
it we clear xoff state and that gets things moving again.

Signed-off-by: Max Krasnyansky <maxk@qualcomm.com>
---
 drivers/net/tun.c |    5 +++++
 1 files changed, 5 insertions(+), 0 deletions(-)

diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index 5b5d875..e665f72 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -576,6 +576,11 @@ static int tun_set_iff(struct file *file, struct ifreq *ifr)
 	file->private_data = tun;
 	tun->attached = 1;
 
+	/* Make sure persistent devices do not get stuck in
+	 * xoff state */
+	if (netif_running(tun->dev))
+		netif_wake_queue(tun->dev);
+
 	strcpy(ifr->ifr_name, tun->dev->name);
 	return 0;
 
-- 
1.5.5.1


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH] tun: Persistent devices can get stuck in xoff state
  2008-07-09  6:24 [PATCH] tun: Persistent devices can get stuck in xoff state Max Krasnyansky
@ 2008-07-09 15:15 ` Christian Borntraeger
  2008-07-09 21:15   ` Max Krasnyansky
  2008-07-09 15:59 ` Christian Borntraeger
  1 sibling, 1 reply; 7+ messages in thread
From: Christian Borntraeger @ 2008-07-09 15:15 UTC (permalink / raw)
  To: Max Krasnyansky; +Cc: davem, netdev

Am Mittwoch, 9. Juli 2008 schrieb Max Krasnyansky:
> The scenario goes like this. App stops reading from tun/tap.
> TX queue gets full and driver does netif_stop_queue().
> App closes fd and TX queue gets flushed as part of the cleanup.
> Next time the app opens tun/tap and starts reading from it but
> the xoff state is not cleared. We're stuck.
> Normally xoff state is cleared when netdev is brought up. But
> in the case of persistent devices this happens only during
> initial setup.

Thats interesting. I believe we have seen exactly this behaviour
with KVM and lots of preallocated tap devices. 
[...]
> +++ b/drivers/net/tun.c
> @@ -576,6 +576,11 @@ static int tun_set_iff(struct file *file, struct ifreq 
*ifr)
>  	file->private_data = tun;
>  	tun->attached = 1;
> 
> +	/* Make sure persistent devices do not get stuck in
> +	 * xoff state */
> +	if (netif_running(tun->dev))
> +		netif_wake_queue(tun->dev);
> +
>  	strcpy(ifr->ifr_name, tun->dev->name);
>  	return 0;

I think that patch looks ok, but I am curious why you dont clear the xoff 
state on application close at the same time when the TX queue gets flushed?

Christian

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] tun: Persistent devices can get stuck in xoff state
  2008-07-09  6:24 [PATCH] tun: Persistent devices can get stuck in xoff state Max Krasnyansky
  2008-07-09 15:15 ` Christian Borntraeger
@ 2008-07-09 15:59 ` Christian Borntraeger
  2008-07-09 21:19   ` Max Krasnyansky
  1 sibling, 1 reply; 7+ messages in thread
From: Christian Borntraeger @ 2008-07-09 15:59 UTC (permalink / raw)
  To: Max Krasnyansky; +Cc: davem, netdev

Am Mittwoch, 9. Juli 2008 schrieb Max Krasnyansky:
> The scenario goes like this. App stops reading from tun/tap.
> TX queue gets full and driver does netif_stop_queue().
> App closes fd and TX queue gets flushed as part of the cleanup.
> Next time the app opens tun/tap and starts reading from it but
> the xoff state is not cleared. We're stuck.
> Normally xoff state is cleared when netdev is brought up. But
> in the case of persistent devices this happens only during
> initial setup.

I was able to reproduce the problem. The patch seems to fix it.

Tested-by: Christian Borntraeger <borntraeger@de.ibm.com>


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] tun: Persistent devices can get stuck in xoff state
  2008-07-09 15:15 ` Christian Borntraeger
@ 2008-07-09 21:15   ` Max Krasnyansky
  0 siblings, 0 replies; 7+ messages in thread
From: Max Krasnyansky @ 2008-07-09 21:15 UTC (permalink / raw)
  To: Christian Borntraeger; +Cc: davem, netdev

Christian Borntraeger wrote:
> Am Mittwoch, 9. Juli 2008 schrieb Max Krasnyansky:
>> The scenario goes like this. App stops reading from tun/tap.
>> TX queue gets full and driver does netif_stop_queue().
>> App closes fd and TX queue gets flushed as part of the cleanup.
>> Next time the app opens tun/tap and starts reading from it but
>> the xoff state is not cleared. We're stuck.
>> Normally xoff state is cleared when netdev is brought up. But
>> in the case of persistent devices this happens only during
>> initial setup.
> 
> Thats interesting. I believe we have seen exactly this behaviour
> with KVM and lots of preallocated tap devices. 
Yeah, it's interesting given that it's (the bug that is) been there for 
at least 2-3 years now :).

> [...]
>> +++ b/drivers/net/tun.c
>> @@ -576,6 +576,11 @@ static int tun_set_iff(struct file *file, struct ifreq 
> *ifr)
>>  	file->private_data = tun;
>>  	tun->attached = 1;
>>
>> +	/* Make sure persistent devices do not get stuck in
>> +	 * xoff state */
>> +	if (netif_running(tun->dev))
>> +		netif_wake_queue(tun->dev);
>> +
>>  	strcpy(ifr->ifr_name, tun->dev->name);
>>  	return 0;
> 
> I think that patch looks ok, but I am curious why you dont clear the xoff 
> state on application close at the same time when the TX queue gets flushed?
Why bother. I mean the packets will be dropped anyway.

Max


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] tun: Persistent devices can get stuck in xoff state
  2008-07-09 15:59 ` Christian Borntraeger
@ 2008-07-09 21:19   ` Max Krasnyansky
  2008-07-11  0:00     ` David Miller
  0 siblings, 1 reply; 7+ messages in thread
From: Max Krasnyansky @ 2008-07-09 21:19 UTC (permalink / raw)
  To: Christian Borntraeger; +Cc: davem, netdev

Christian Borntraeger wrote:
> Am Mittwoch, 9. Juli 2008 schrieb Max Krasnyansky:
>> The scenario goes like this. App stops reading from tun/tap.
>> TX queue gets full and driver does netif_stop_queue().
>> App closes fd and TX queue gets flushed as part of the cleanup.
>> Next time the app opens tun/tap and starts reading from it but
>> the xoff state is not cleared. We're stuck.
>> Normally xoff state is cleared when netdev is brought up. But
>> in the case of persistent devices this happens only during
>> initial setup.
> 
> I was able to reproduce the problem. The patch seems to fix it.
> 
> Tested-by: Christian Borntraeger <borntraeger@de.ibm.com>

Oh it definitely fixes that problem. It's very easy to reproduce by just 
suspending (ctrl-z) an app that is ready from the tun, waiting a bit for 
the queue to fill up and then killing the app. I have a simple test app 
that simulates this. It's just that in real life that does not happen 
very often. ie Applications either constantly read from tun or they 
close it.

Thanks a lot for testing and confirming.

Max

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] tun: Persistent devices can get stuck in xoff state
  2008-07-09 21:19   ` Max Krasnyansky
@ 2008-07-11  0:00     ` David Miller
  2008-07-11  0:53       ` Max Krasnyansky
  0 siblings, 1 reply; 7+ messages in thread
From: David Miller @ 2008-07-11  0:00 UTC (permalink / raw)
  To: maxk; +Cc: borntraeger, netdev

From: Max Krasnyansky <maxk@qualcomm.com>
Date: Wed, 09 Jul 2008 14:19:28 -0700

> Christian Borntraeger wrote:
> > Am Mittwoch, 9. Juli 2008 schrieb Max Krasnyansky:
> >> The scenario goes like this. App stops reading from tun/tap.
> >> TX queue gets full and driver does netif_stop_queue().
> >> App closes fd and TX queue gets flushed as part of the cleanup.
> >> Next time the app opens tun/tap and starts reading from it but
> >> the xoff state is not cleared. We're stuck.
> >> Normally xoff state is cleared when netdev is brought up. But
> >> in the case of persistent devices this happens only during
> >> initial setup.
> > 
> > I was able to reproduce the problem. The patch seems to fix it.
> > 
> > Tested-by: Christian Borntraeger <borntraeger@de.ibm.com>
> 
> Oh it definitely fixes that problem. It's very easy to reproduce by just 
> suspending (ctrl-z) an app that is ready from the tun, waiting a bit for 
> the queue to fill up and then killing the app. I have a simple test app 
> that simulates this. It's just that in real life that does not happen 
> very often. ie Applications either constantly read from tun or they 
> close it.
> 
> Thanks a lot for testing and confirming.

I've applied this, thanks.

But I had to apply it by hand.  You made this against an older
tree, because in the current code there is a get_net() call
right before the new lines you added but in your patch context
it isn't there.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] tun: Persistent devices can get stuck in xoff state
  2008-07-11  0:00     ` David Miller
@ 2008-07-11  0:53       ` Max Krasnyansky
  0 siblings, 0 replies; 7+ messages in thread
From: Max Krasnyansky @ 2008-07-11  0:53 UTC (permalink / raw)
  To: David Miller; +Cc: borntraeger, netdev

David Miller wrote:
> From: Max Krasnyansky <maxk@qualcomm.com>
> Date: Wed, 09 Jul 2008 14:19:28 -0700
> 
>> Christian Borntraeger wrote:
>>> Am Mittwoch, 9. Juli 2008 schrieb Max Krasnyansky:
>>>> The scenario goes like this. App stops reading from tun/tap.
>>>> TX queue gets full and driver does netif_stop_queue().
>>>> App closes fd and TX queue gets flushed as part of the cleanup.
>>>> Next time the app opens tun/tap and starts reading from it but
>>>> the xoff state is not cleared. We're stuck.
>>>> Normally xoff state is cleared when netdev is brought up. But
>>>> in the case of persistent devices this happens only during
>>>> initial setup.
>>> I was able to reproduce the problem. The patch seems to fix it.
>>>
>>> Tested-by: Christian Borntraeger <borntraeger@de.ibm.com>
>> Oh it definitely fixes that problem. It's very easy to reproduce by just 
>> suspending (ctrl-z) an app that is ready from the tun, waiting a bit for 
>> the queue to fill up and then killing the app. I have a simple test app 
>> that simulates this. It's just that in real life that does not happen 
>> very often. ie Applications either constantly read from tun or they 
>> close it.
>>
>> Thanks a lot for testing and confirming.
> 
> I've applied this, thanks.
> 
> But I had to apply it by hand.  You made this against an older
> tree, because in the current code there is a get_net() call
> right before the new lines you added but in your patch context
> it isn't there.

Hmm, I though I used latest git tree. Oh, turns out it was 2.6.25.4. 
Sorry. Thanx for fixing it.

Max




^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2008-07-11  0:53 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2008-07-09  6:24 [PATCH] tun: Persistent devices can get stuck in xoff state Max Krasnyansky
2008-07-09 15:15 ` Christian Borntraeger
2008-07-09 21:15   ` Max Krasnyansky
2008-07-09 15:59 ` Christian Borntraeger
2008-07-09 21:19   ` Max Krasnyansky
2008-07-11  0:00     ` David Miller
2008-07-11  0:53       ` Max Krasnyansky

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).