Tun congestion/BQL - David Woodhouse

From: David Woodhouse <dwmw2@infradead.org>
To: netdev@vger.kernel.org
Subject: Tun congestion/BQL
Date: Wed, 10 Apr 2019 15:01:58 +0300
Message-ID: <2e310fc6ee847d20dd23692fd1db733e607602f5.camel@infradead.org>

I've been working on OpenConnect VPN performance. After fixing some
local stupidities I am basically crypto-bound as I suck packets out of
the tun device and feed them out over the public network as fast as the
crypto library can encrypt them.

However, the tun device is dropping packets.

I'm testing with an ESP setup that the kernel happens to support. If I
do netperf UDP_STREAM testing with the kernel doing ESP, I get this:

Socket  Message  Elapsed      Messages                
Size    Size     Time         Okay Errors   Throughput
bytes   bytes    secs            #      #   10^6bits/sec

212992    1400   10.00     1198093      0    1341.86
212992           10.00     1198044           1341.80

Change to doing it in userspace through the tun device, though, and it
looks more like this:

Socket  Message  Elapsed      Messages                
Size    Size     Time         Okay Errors   Throughput
bytes   bytes    secs            #      #   10^6bits/sec

212992    1400   10.00     8194693      0    9178.04
212992           10.00     1536155           1720.49

The discrepancy between sent and received packets is all seen as packet
loss on the tun0 interface, where userspace is not reading the packets
out fast enough:

$ netstat -i
Kernel Interface table
Iface             MTU    RX-OK RX-ERR RX-DRP RX-OVR    TX-OK TX-ERR TX-DRP TX-OVR Flg
eth0             9001 56790193      0 1718127 0      129849068      0      0      0 BMRU
lo              65536       42      0      0 0            42      0      0      0 LRU
tun0             1500        9      0      0 0       1546968      0 6647739      0 MOPRU

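So... I threw together something to stop the queue when the tx_ring was
full (which I know is incomplete but was enough for my test; note that
the check as written is !ptr_ring_empty(), so it stops the queue
whenever anything is still queued, not only when the ring is full):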

diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index e9ca1c088d0b..a15fca23ef45 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -1125,7 +1128,9 @@ static netdev_tx_t tun_net_xmit(struct sk_buff *skb, struct net_device *dev)
 	if (tfile->flags & TUN_FASYNC)
 		kill_fasync(&tfile->fasync, SIGIO, POLL_IN);
 	tfile->socket.sk->sk_data_ready(tfile->socket.sk);

+	if (!ptr_ring_empty(&tfile->tx_ring))
+		netif_stop_queue(tun->dev);
 	rcu_read_unlock();
 	return NETDEV_TX_OK;

@@ -2237,7 +2239,7 @@ static ssize_t tun_do_read(struct tun_struct *tun, struct tun_file *tfile,
 		else
 			consume_skb(skb);
 	}
-
+	netif_wake_queue(tun->dev);
 	return ret;
 }

So now netperf doesn't send lots of packets that get dropped by the tun
device. But it doesn't send anywhere near as many packets successfully,
either...

Socket  Message  Elapsed      Messages                
Size    Size     Time         Okay Errors   Throughput
bytes   bytes    secs            #      #   10^6bits/sec

212992    1400   10.00     1250223      0    1400.25
212992           10.00     1245458           1394.91

That's actually dropped me back to the performance I was getting with
the kernel's ESP implementation. Is that something we should expect?

I don't think it's purely the overhead I've added in the driver. If I
leave the netif_wake_queue() and add '&& 0' to the condition for the
netif_stop_queue(), which should still leave the locking in the
ptr_ring_empty() to happen, the performance goes back up.

What's going on? Am I actually better off letting it drop packets
silently?
