* [PATCH -next] tun: stop tx queue when limit is hit
@ 2014-07-20 18:51 Florian Westphal
2014-07-21 5:33 ` David Miller
0 siblings, 1 reply; 6+ messages in thread
From: Florian Westphal @ 2014-07-20 18:51 UTC (permalink / raw)
To: netdev; +Cc: Florian Westphal
Currently tun just frees the skb and returns NETDEV_TX_OK
when queue length exceeds txqlen.
This causes severe packetloss and unneeded resource
consumption on host when sending to vm connected via tun.
Instead, lets stop the transmit queue and start it once
packets are consumed from the queue. This allows the network
stack to control applications that send data via tun device.
Before:
time netperf -H 10.1.1.2 -t UDP_STREAM -- -m 1024
Socket Message Elapsed Messages
Size Size Time Okay Errors Throughput
bytes bytes secs # # 10^6bits/sec
212992 1024 10.00 5712540 0 4679.65
212992 10.00 804441 658.99
0.40s user 9.62s system 71% cpu 14.030 total
After:
time netperf -H 10.1.1.2 -t UDP_STREAM -- -m 1024
Socket Message Elapsed Messages
Size Size Time Okay Errors Throughput
bytes bytes secs # # 10^6bits/sec
212992 1024 10.00 2060473 0 1687.92
212992 10.00 904187 740.70
0.18s user 5.14s system 37% cpu 14.028 total
Signed-off-by: Florian Westphal <fw@strlen.de>
---
drivers/net/tun.c | 42 ++++++++++++++++++++++++++++++++++++------
1 file changed, 36 insertions(+), 6 deletions(-)
diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index acaaf67..c020215 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -733,6 +733,24 @@ static int tun_net_close(struct net_device *dev)
return 0;
}
+static unsigned long tun_queue_len(const struct tun_file *tfile, u32 numqueues)
+{
+ return skb_queue_len(&tfile->socket.sk->sk_receive_queue) * numqueues;
+}
+
+static bool tun_queue_should_stop(const struct net_device *dev,
+ const struct tun_file *tfile, u32 numqueues)
+{
+ return tun_queue_len(tfile, numqueues) > dev->tx_queue_len;
+}
+
+static bool tun_queue_should_wake(const struct net_device *dev,
+ const struct tun_file *tfile, u32 numqueues)
+{
+ return !tun_queue_should_stop(dev, tfile, numqueues);
+}
+
+
/* Net device start xmit */
static netdev_tx_t tun_net_xmit(struct sk_buff *skb, struct net_device *dev)
{
@@ -779,12 +797,19 @@ static netdev_tx_t tun_net_xmit(struct sk_buff *skb, struct net_device *dev)
sk_filter(tfile->socket.sk, skb))
goto drop;
- /* Limit the number of packets queued by dividing txq length with the
- * number of queues.
- */
- if (skb_queue_len(&tfile->socket.sk->sk_receive_queue) * numqueues
- >= dev->tx_queue_len)
- goto drop;
+ if (tun_queue_should_stop(dev, tfile, numqueues)) {
+ if (netif_subqueue_stopped(dev, skb))
+ goto drop;
+
+ netif_stop_subqueue(dev, txq);
+
+ /* concurrent reader must not miss 'stopped queue' */
+ smp_mb__after_atomic();
+
+ /* reader might have drained queue before stop_subqueue */
+ if (tun_queue_should_wake(dev, tfile, numqueues))
+ netif_start_subqueue(dev, txq);
+ }
if (unlikely(skb_orphan_frags(skb, GFP_ATOMIC)))
goto drop;
@@ -1346,6 +1371,11 @@ static ssize_t tun_do_read(struct tun_struct *tun, struct tun_file *tfile,
&peeked, &off, &err);
if (skb) {
ret = tun_put_user(tun, tfile, skb, iv, len);
+
+ if (netif_subqueue_stopped(tun->dev, skb) &&
+ tun_queue_should_wake(tun->dev, tfile, tun->numqueues))
+ netif_wake_subqueue(tun->dev, skb->queue_mapping);
+
kfree_skb(skb);
} else
ret = err;
--
1.8.1.5
^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [PATCH -next] tun: stop tx queue when limit is hit
2014-07-20 18:51 [PATCH -next] tun: stop tx queue when limit is hit Florian Westphal
@ 2014-07-21 5:33 ` David Miller
2014-07-21 6:54 ` Florian Westphal
0 siblings, 1 reply; 6+ messages in thread
From: David Miller @ 2014-07-21 5:33 UTC (permalink / raw)
To: fw; +Cc: netdev
From: Florian Westphal <fw@strlen.de>
Date: Sun, 20 Jul 2014 20:51:25 +0200
> Currently tun just frees the skb and returns NETDEV_TX_OK
> when queue length exceeds txqlen.
>
> This causes severe packetloss and unneeded resource
> consumption on host when sending to vm connected via tun.
>
> Instead, lets stop the transmit queue and start it once
> packets are consumed from the queue. This allows the network
> stack to control applications that send data via tun device.
I strongly suspect the current behavior is intentional, see
commit:
commit 5d097109257c03a71845729f8db6b5770c4bbedc
Author: Michael S. Tsirkin <mst@redhat.com>
Date: Mon Dec 3 10:07:14 2012 +0000
tun: only queue packets on device
for example.
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH -next] tun: stop tx queue when limit is hit
2014-07-21 5:33 ` David Miller
@ 2014-07-21 6:54 ` Florian Westphal
2014-07-21 8:31 ` Florian Westphal
0 siblings, 1 reply; 6+ messages in thread
From: Florian Westphal @ 2014-07-21 6:54 UTC (permalink / raw)
To: David Miller; +Cc: fw, netdev
David Miller <davem@davemloft.net> wrote:
> From: Florian Westphal <fw@strlen.de>
> Date: Sun, 20 Jul 2014 20:51:25 +0200
>
> > Currently tun just frees the skb and returns NETDEV_TX_OK
> > when queue length exceeds txqlen.
> >
> > This causes severe packetloss and unneeded resource
> > consumption on host when sending to vm connected via tun.
> >
> > Instead, lets stop the transmit queue and start it once
> > packets are consumed from the queue. This allows the network
> > stack to control applications that send data via tun device.
>
> I strongly suspect the current behavior is intentional, see
> commit:
>
> commit 5d097109257c03a71845729f8db6b5770c4bbedc
> Author: Michael S. Tsirkin <mst@redhat.com>
> Date: Mon Dec 3 10:07:14 2012 +0000
>
> tun: only queue packets on device
Looks like you're right :-/
Alright, please ignore my patch.
That being said, the current behaviour isn't ideal either.
It took me quite some time to realize that packetloss
was on the sender side inside tun driver and not on the receiver
vm. Not stopping the queue was a bit ... unexpected.
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH -next] tun: stop tx queue when limit is hit
2014-07-21 6:54 ` Florian Westphal
@ 2014-07-21 8:31 ` Florian Westphal
2014-07-21 8:56 ` Eric Dumazet
0 siblings, 1 reply; 6+ messages in thread
From: Florian Westphal @ 2014-07-21 8:31 UTC (permalink / raw)
To: Florian Westphal; +Cc: David Miller, netdev, mst
Florian Westphal <fw@strlen.de> wrote:
> David Miller <davem@davemloft.net> wrote:
> > From: Florian Westphal <fw@strlen.de>
> > Date: Sun, 20 Jul 2014 20:51:25 +0200
> >
> > > Currently tun just frees the skb and returns NETDEV_TX_OK
> > > when queue length exceeds txqlen.
> > >
> > > This causes severe packetloss and unneeded resource
> > > consumption on host when sending to vm connected via tun.
> > >
> > > Instead, lets stop the transmit queue and start it once
> > > packets are consumed from the queue. This allows the network
> > > stack to control applications that send data via tun device.
> >
> > I strongly suspect the current behavior is intentional, see
> > commit:
> >
> > commit 5d097109257c03a71845729f8db6b5770c4bbedc
> > Author: Michael S. Tsirkin <mst@redhat.com>
> > Date: Mon Dec 3 10:07:14 2012 +0000
> >
> > tun: only queue packets on device
>
> Looks like you're right :-/
>
> Alright, please ignore my patch.
>
> That being said, the current behaviour isn't ideal either.
>
> It took me quite some time to realize that packetloss
> was on the sender side inside tun driver and not on the receiver
> vm. Not stopping the queue was a bit ... unexpected.
Michael, do you think we could restore the 'stop queue' default
behaviour?
Looking at your changelog, the only concern seems to be the
'packets never consumed'/'receiver is stuck forever' case.
What about reverting 5d097109257c03a7184, and then adding some
type of tun watchdog that will zap the rcv queue + tx queue wakeup?
That should be quite noisy if we combine it with the WARN_ON
you usually get when (physical) NIC driver detect stalled tx unit.
What do you think?
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH -next] tun: stop tx queue when limit is hit
2014-07-21 8:31 ` Florian Westphal
@ 2014-07-21 8:56 ` Eric Dumazet
2014-07-21 9:19 ` Florian Westphal
0 siblings, 1 reply; 6+ messages in thread
From: Eric Dumazet @ 2014-07-21 8:56 UTC (permalink / raw)
To: Florian Westphal; +Cc: David Miller, netdev, mst
On Mon, 2014-07-21 at 10:31 +0200, Florian Westphal wrote:
> Florian Westphal <fw@strlen.de> wrote:
> > David Miller <davem@davemloft.net> wrote:
> > > From: Florian Westphal <fw@strlen.de>
> > > Date: Sun, 20 Jul 2014 20:51:25 +0200
> > >
> > > > Currently tun just frees the skb and returns NETDEV_TX_OK
> > > > when queue length exceeds txqlen.
> > > >
> > > > This causes severe packetloss and unneeded resource
> > > > consumption on host when sending to vm connected via tun.
> > > >
> > > > Instead, lets stop the transmit queue and start it once
> > > > packets are consumed from the queue. This allows the network
> > > > stack to control applications that send data via tun device.
> > >
> > > I strongly suspect the current behavior is intentional, see
> > > commit:
> > >
> > > commit 5d097109257c03a71845729f8db6b5770c4bbedc
> > > Author: Michael S. Tsirkin <mst@redhat.com>
> > > Date: Mon Dec 3 10:07:14 2012 +0000
> > >
> > > tun: only queue packets on device
> >
> > Looks like you're right :-/
> >
> > Alright, please ignore my patch.
> >
> > That being said, the current behaviour isn't ideal either.
> >
> > It took me quite some time to realize that packetloss
> > was on the sender side inside tun driver and not on the receiver
> > vm. Not stopping the queue was a bit ... unexpected.
>
> Michael, do you think we could restore the 'stop queue' default
> behaviour?
>
> Looking at your changelog, the only concern seems to be the
> 'packets never consumed'/'receiver is stuck forever' case.
>
> What about reverting 5d097109257c03a7184, and then adding some
> type of tun watchdog that will zap the rcv queue + tx queue wakeup?
>
> That should be quite noisy if we combine it with the WARN_ON
> you usually get when (physical) NIC driver detect stalled tx unit.
>
> What do you think?
Note that this patch had the ability to choose the behavior (drop or
back pressure)
http://patchwork.ozlabs.org/patch/338951/ ?
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH -next] tun: stop tx queue when limit is hit
2014-07-21 8:56 ` Eric Dumazet
@ 2014-07-21 9:19 ` Florian Westphal
0 siblings, 0 replies; 6+ messages in thread
From: Florian Westphal @ 2014-07-21 9:19 UTC (permalink / raw)
To: Eric Dumazet; +Cc: Florian Westphal, David Miller, netdev, mst
Eric Dumazet <eric.dumazet@gmail.com> wrote:
> On Mon, 2014-07-21 at 10:31 +0200, Florian Westphal wrote:
> > Michael, do you think we could restore the 'stop queue' default
> > behaviour?
> >
> > Looking at your changelog, the only concern seems to be the
> > 'packets never consumed'/'receiver is stuck forever' case.
> >
> > What about reverting 5d097109257c03a7184, and then adding some
> > type of tun watchdog that will zap the rcv queue + tx queue wakeup?
> >
> > That should be quite noisy if we combine it with the WARN_ON
> > you usually get when (physical) NIC driver detect stalled tx unit.
> >
> > What do you think?
>
> Note that this patch had the ability to choose the behavior (drop or
> back pressure)
>
> http://patchwork.ozlabs.org/patch/338951/ ?
Eric, many thanks for this pointer. In fact, the related discussions
in v2 of the patchset (http://patchwork.ozlabs.org/patch/338803/)
also answers my questions above.
[ the concern is with applications that can get stuck just because
packet they sent is queued forever ]
I need to think about this some more, but in any case Michael outlined
a possible solution (limit queue delay) in the thread above.
Thanks,
Florian
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2014-07-21 9:19 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-07-20 18:51 [PATCH -next] tun: stop tx queue when limit is hit Florian Westphal
2014-07-21 5:33 ` David Miller
2014-07-21 6:54 ` Florian Westphal
2014-07-21 8:31 ` Florian Westphal
2014-07-21 8:56 ` Eric Dumazet
2014-07-21 9:19 ` Florian Westphal
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).