From: Sabrina Dubroca <sd@queasysnail.net>
To: Antonio Quartulli <antonio@openvpn.net>
Cc: netdev@vger.kernel.org, Jakub Kicinski <kuba@kernel.org>,
Sergey Ryazanov <ryazanov.s.a@gmail.com>,
Paolo Abeni <pabeni@redhat.com>,
Eric Dumazet <edumazet@google.com>, Andrew Lunn <andrew@lunn.ch>,
Esben Haabendal <esben@geanix.com>
Subject: Re: [PATCH net-next v3 13/24] ovpn: implement TCP transport
Date: Tue, 14 May 2024 10:58:15 +0200 [thread overview]
Message-ID: <ZkMnpy3_T8YO3eHD@hog> (raw)
In-Reply-To: <73433bdf-763b-4023-8cb9-ffd9487744e0@openvpn.net>
2024-05-14, 00:20:24 +0200, Antonio Quartulli wrote:
> On 13/05/2024 16:50, Sabrina Dubroca wrote:
> > 2024-05-06, 03:16:26 +0200, Antonio Quartulli wrote:
> > > diff --git a/drivers/net/ovpn/main.c b/drivers/net/ovpn/main.c
> > > index 9ae9844dd281..a04d6e55a473 100644
> > > --- a/drivers/net/ovpn/main.c
> > > +++ b/drivers/net/ovpn/main.c
> > > @@ -23,6 +23,7 @@
> > > #include "io.h"
> > > #include "packet.h"
> > > #include "peer.h"
> > > +#include "tcp.h"
> > > /* Driver info */
> > > #define DRV_DESCRIPTION "OpenVPN data channel offload (ovpn)"
> > > @@ -247,8 +248,14 @@ static struct pernet_operations ovpn_pernet_ops = {
> > > static int __init ovpn_init(void)
> > > {
> > > - int err = register_netdevice_notifier(&ovpn_netdev_notifier);
> > > + int err = ovpn_tcp_init();
> > > + if (err) {
> >
> > ovpn_tcp_init cannot fail (and if it could, you'd need to clean up
> > when register_netdevice_notifier fails). I'd make ovpn_tcp_init void
> > and kill this check.
>
> I like to have all init functions returning int by design, even though they
> may not fail.
>
> But I can undersand this is not necessarily good practice (somebody will
> always ask "when does it fail?" and there will will be no answer, which is
> confusing)
Yes, pretty much.
> > > diff --git a/drivers/net/ovpn/peer.h b/drivers/net/ovpn/peer.h
> > > index b5ff59a4b40f..ac4907705d98 100644
> > > --- a/drivers/net/ovpn/peer.h
> > > +++ b/drivers/net/ovpn/peer.h
> > > + * @tcp.raw_len: next packet length as read from the stream (TCP only)
> > > + * @tcp.skb: next packet being filled with data from the stream (TCP only)
> > > + * @tcp.offset: position of the next byte to write in the skb (TCP only)
> > > + * @tcp.data_len: next packet length converted to host order (TCP only)
> >
> > It would be nice to add information about whether they're used for TX or RX.
>
> they are all about "from the stream" and "to the skb", meaning that we are
> doing RX.
> Will make it more explicit.
Maybe group them in a struct rx?
> > > + * @tcp.sk_cb.sk_data_ready: pointer to original cb
> > > + * @tcp.sk_cb.sk_write_space: pointer to original cb
> > > + * @tcp.sk_cb.prot: pointer to original prot object
> > > * @crypto: the crypto configuration (ciphers, keys, etc..)
> > > * @dst_cache: cache for dst_entry used to send to peer
> > > * @bind: remote peer binding
> > > @@ -59,6 +69,25 @@ struct ovpn_peer {
> > > struct ptr_ring netif_rx_ring;
> > > struct napi_struct napi;
> > > struct ovpn_socket *sock;
> > > + /* state of the TCP reading. Needed to keep track of how much of a
> > > + * single packet has already been read from the stream and how much is
> > > + * missing
> > > + */
> > > + struct {
> > > + struct ptr_ring tx_ring;
> > > + struct work_struct tx_work;
> > > + struct work_struct rx_work;
> > > +
> > > + u8 raw_len[sizeof(u16)];
> >
> > Why not u16 or __be16 for this one?
>
> because in this array we are putting the bytes as we get them from the
> stream.
> We may be at the point where one out of two bytes is available on the
> stream. For this reason I use an array to store this u16 byte by byte.
>
> Once thw two bytes are ready, we convert the content in an actual int and
> store it in "data_len" (a few lines below).
Ok, I see. Hopefully you can switch to strparser and make this one go
away.
> > > diff --git a/drivers/net/ovpn/socket.c b/drivers/net/ovpn/socket.c
> > > index e099a61b03fa..004db5b13663 100644
> > > --- a/drivers/net/ovpn/socket.c
> > > +++ b/drivers/net/ovpn/socket.c
> > > @@ -16,6 +16,7 @@
> > > #include "packet.h"
> > > #include "peer.h"
> > > #include "socket.h"
> > > +#include "tcp.h"
> > > #include "udp.h"
> > > /* Finalize release of socket, called after RCU grace period */
> > > @@ -26,6 +27,8 @@ static void ovpn_socket_detach(struct socket *sock)
> > > if (sock->sk->sk_protocol == IPPROTO_UDP)
> > > ovpn_udp_socket_detach(sock);
> > > + else if (sock->sk->sk_protocol == IPPROTO_TCP)
> > > + ovpn_tcp_socket_detach(sock);
> > > sockfd_put(sock);
> > > }
> > > @@ -69,6 +72,8 @@ static int ovpn_socket_attach(struct socket *sock, struct ovpn_peer *peer)
> > > if (sock->sk->sk_protocol == IPPROTO_UDP)
> > > ret = ovpn_udp_socket_attach(sock, peer->ovpn);
> > > + else if (sock->sk->sk_protocol == IPPROTO_TCP)
> > > + ret = ovpn_tcp_socket_attach(sock, peer);
> > > return ret;
> > > }
> > > @@ -124,6 +129,21 @@ struct ovpn_socket *ovpn_socket_new(struct socket *sock, struct ovpn_peer *peer)
> > > ovpn_sock->sock = sock;
> >
> > The line above this is:
> >
> > ovpn_sock->ovpn = peer->ovpn;
> >
> > It's technically fine since you then overwrite this with peer in case
> > we're on TCP, but ovpn_sock->ovpn only exists on UDP since you moved
> > it into a union in this patch.
>
> Yeah, I did not want to make another branch, but having a UDP specific case
> will make code easier to read.
Either that, or drop the union.
> > > diff --git a/drivers/net/ovpn/tcp.c b/drivers/net/ovpn/tcp.c
> > > new file mode 100644
> > > index 000000000000..84ad7cd4fc4f
> > > --- /dev/null
> > > +++ b/drivers/net/ovpn/tcp.c
> > > @@ -0,0 +1,511 @@
> > > +static int ovpn_tcp_read_sock(read_descriptor_t *desc, struct sk_buff *in_skb,
> > > + unsigned int in_offset, size_t in_len)
> > > +{
> > > + struct sock *sk = desc->arg.data;
> > > + struct ovpn_socket *sock;
> > > + struct ovpn_skb_cb *cb;
> > > + struct ovpn_peer *peer;
> > > + size_t chunk, copied = 0;
> > > + void *data;
> > > + u16 len;
> > > + int st;
> > > +
> > > + rcu_read_lock();
> > > + sock = rcu_dereference_sk_user_data(sk);
> > > + rcu_read_unlock();
> >
> > You can't just release rcu_read_lock and keep using sock (here and in
> > the rest of this file). Either you keep rcu_read_lock, or you can take
> > a reference on the ovpn_socket.
>
> I was just staring at this today, after having worked on the
> rcu_read_lock/unlock for the peer get()s..
>
> I thinkt the assumption was: if we are in this read_sock callback, it's
> impossible that the ovpn_socket was invalidated, because it gets invalidated
> upon detach, which also prevents any further calling of this callback. But
> this sounds racy, and I guess we should somewhat hold a reference..
ovpn_tcp_read_sock starts
detach
kfree_rcu(ovpn_socket)
...
ovpn_socket actually freed
...
ovpn_tcp_read_sock continues with freed ovpn_socket
I don't think anything in the current code prevents this.
> > > +/* Set TCP encapsulation callbacks */
> > > +int ovpn_tcp_socket_attach(struct socket *sock, struct ovpn_peer *peer)
> > > +{
> > > + void *old_data;
> > > + int ret;
> > > +
> > > + INIT_WORK(&peer->tcp.tx_work, ovpn_tcp_tx_work);
> > > +
> > > + ret = ptr_ring_init(&peer->tcp.tx_ring, OVPN_QUEUE_LEN, GFP_KERNEL);
> > > + if (ret < 0) {
> > > + netdev_err(peer->ovpn->dev, "cannot allocate TCP TX ring\n");
> > > + return ret;
> > > + }
> > > +
> > > + peer->tcp.skb = NULL;
> > > + peer->tcp.offset = 0;
> > > + peer->tcp.data_len = 0;
> > > +
> > > + write_lock_bh(&sock->sk->sk_callback_lock);
> > > +
> > > + /* make sure no pre-existing encapsulation handler exists */
> > > + rcu_read_lock();
> > > + old_data = rcu_dereference_sk_user_data(sock->sk);
> > > + rcu_read_unlock();
> > > + if (old_data) {
> > > + netdev_err(peer->ovpn->dev,
> > > + "provided socket already taken by other user\n");
> > > + ret = -EBUSY;
> > > + goto err;
> >
> > The UDP code differentiates "socket already owned by this interface"
> > from "already taken by other user". That doesn't apply to TCP?
>
> This makes me wonder: how safe it is to interpret the user data as an object
> of type ovpn_socket?
>
> When we find the user data already assigned, we don't know what was really
> stored in there, right?
> Technically this socket could have gone through another module which
> assigned its own state.
>
> Therefore I think that what UDP does [ dereferencing ((struct ovpn_socket
> *)user_data)->ovpn ] is probably not safe. Would you agree?
Hmmm, yeah, I think you're right. If you checked encap_type ==
UDP_ENCAP_OVPNINUDP before (sk_prot for TCP), then you'd know it's
really your data. Basically call ovpn_from_udp_sock during attach if
you want to check something beyond EBUSY.
Once you're in your own callbacks, it should be safe. If some other
code sends packet with a non-ovpn socket to ovpn's ->encap_rcv,
something is really broken.
> > > +int __init ovpn_tcp_init(void)
> > > +{
> > > + /* We need to substitute the recvmsg and the sock_is_readable
> > > + * callbacks in the sk_prot member of the sock object for TCP
> > > + * sockets.
> > > + *
> > > + * However sock->sk_prot is a pointer to a static variable and
> > > + * therefore we can't directly modify it, otherwise every socket
> > > + * pointing to it will be affected.
> > > + *
> > > + * For this reason we create our own static copy and modify what
> > > + * we need. Then we make sk_prot point to this copy
> > > + * (in ovpn_tcp_socket_attach())
> > > + */
> > > + ovpn_tcp_prot = tcp_prot;
> >
> > Don't you need a separate variant for IPv6, like TLS does?
>
> Never did so far.
>
> My wild wild wild guess: for the time this socket is owned by ovpn, we only
> use callbacks that are IPvX agnostic, hence v4 vs v6 doesn't make any
> difference.
> When this socket is released, we reassigned the original prot.
That seems a bit suspicious to me. For example, tcpv6_prot has a
different backlog_rcv. And you don't control if the socket is detached
before being closed, or which callbacks are needed. Your userspace
client doesn't use them, but someone else's might.
> > > + ovpn_tcp_prot.recvmsg = ovpn_tcp_recvmsg;
> >
> > You don't need to replace ->sendmsg as well? The userspace client is
> > not expected to send messages?
>
> It is, but my assumption is that those packets will just go through the
> socket as usual. No need to be handled by ovpn (those packets are not
> encrypted/decrypted, like data traffic is).
> And this is how it has worked so far.
>
> Makes sense?
Two things come to mind:
- userspace is expected to prefix the messages it inserts on the
stream with the 2-byte length field? otherwise, the peer won't be
able to parse them out of the stream
- I'm not convinced this would be safe wrt kernel writing partial
messages. if ovpn_tcp_send_one doesn't send the full message, you
could interleave two messages:
+------+-------------------+------+--------+----------------+
| len1 | (bytes from msg1) | len2 | (msg2) | (rest of msg1) |
+------+-------------------+------+--------+----------------+
and the RX side would parse that as:
+------+-----------------------------------+------+---------
| len1 | (bytes from msg1) | len2 | (msg2) | ???? | ...
+------+-------------------+---------------+------+---------
and try to interpret some random bytes out of either msg1 or msg2 as
a length prefix, resulting in a broken stream.
The stream format looks identical to ESP in TCP [1] (2B length prefix
followed by the actual message), so I think the espintcp code (both tx
and rx, except for actual protocol parsing) should look very
similar. The problems that need to be solved for both protocols are
pretty much the same.
[1] https://www.rfc-editor.org/rfc/rfc8229#section-3
--
Sabrina
next prev parent reply other threads:[~2024-05-14 8:58 UTC|newest]
Thread overview: 117+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-05-06 1:16 [PATCH net-next v3 00/24] Introducing OpenVPN Data Channel Offload Antonio Quartulli
2024-05-06 1:16 ` [PATCH net-next v3 01/24] netlink: add NLA_POLICY_MAX_LEN macro Antonio Quartulli
2024-05-06 1:16 ` [PATCH net-next v3 02/24] net: introduce OpenVPN Data Channel Offload (ovpn) Antonio Quartulli
2024-05-06 1:16 ` [PATCH net-next v3 03/24] ovpn: add basic netlink support Antonio Quartulli
2024-05-08 0:10 ` Jakub Kicinski
2024-05-08 7:42 ` Antonio Quartulli
2024-05-08 14:42 ` Sabrina Dubroca
2024-05-08 14:51 ` Antonio Quartulli
2024-05-06 1:16 ` [PATCH net-next v3 04/24] ovpn: add basic interface creation/destruction/management routines Antonio Quartulli
2024-05-08 0:18 ` Jakub Kicinski
2024-05-08 7:53 ` Antonio Quartulli
2024-05-08 14:52 ` Sabrina Dubroca
2024-05-09 1:06 ` Jakub Kicinski
2024-05-09 8:25 ` Antonio Quartulli
2024-05-09 10:09 ` Sabrina Dubroca
2024-05-09 10:35 ` Antonio Quartulli
2024-05-09 12:16 ` Sabrina Dubroca
2024-05-09 13:25 ` Antonio Quartulli
2024-05-09 13:52 ` Sabrina Dubroca
2024-05-06 1:16 ` [PATCH net-next v3 05/24] ovpn: implement interface creation/destruction via netlink Antonio Quartulli
2024-05-08 0:21 ` Jakub Kicinski
2024-05-08 9:49 ` Antonio Quartulli
2024-05-09 1:09 ` Jakub Kicinski
2024-05-09 8:30 ` Antonio Quartulli
2024-05-06 1:16 ` [PATCH net-next v3 06/24] ovpn: keep carrier always on Antonio Quartulli
2024-05-06 1:16 ` [PATCH net-next v3 07/24] ovpn: introduce the ovpn_peer object Antonio Quartulli
2024-05-08 16:06 ` Sabrina Dubroca
2024-05-08 20:31 ` Antonio Quartulli
2024-05-09 13:04 ` Sabrina Dubroca
2024-05-09 13:24 ` Andrew Lunn
2024-05-10 18:57 ` Antonio Quartulli
2024-05-11 0:28 ` Jakub Kicinski
2024-05-09 13:44 ` Antonio Quartulli
2024-05-09 13:55 ` Andrew Lunn
2024-05-09 14:17 ` Sabrina Dubroca
2024-05-09 14:36 ` Antonio Quartulli
2024-05-09 14:53 ` Antonio Quartulli
2024-05-10 10:30 ` Sabrina Dubroca
2024-05-10 12:34 ` Antonio Quartulli
2024-05-10 14:11 ` Sabrina Dubroca
2024-05-13 10:09 ` Simon Horman
2024-05-13 10:53 ` Antonio Quartulli
2024-05-13 15:04 ` Simon Horman
2024-05-06 1:16 ` [PATCH net-next v3 08/24] ovpn: introduce the ovpn_socket object Antonio Quartulli
2024-05-08 17:10 ` Sabrina Dubroca
2024-05-08 20:38 ` Antonio Quartulli
2024-05-09 13:32 ` Sabrina Dubroca
2024-05-09 13:46 ` Antonio Quartulli
2024-05-06 1:16 ` [PATCH net-next v3 09/24] ovpn: implement basic TX path (UDP) Antonio Quartulli
2024-05-10 13:01 ` Sabrina Dubroca
2024-05-10 13:39 ` Antonio Quartulli
2024-05-12 21:35 ` Sabrina Dubroca
2024-05-13 7:37 ` Antonio Quartulli
2024-05-13 9:36 ` Sabrina Dubroca
2024-05-13 9:47 ` Antonio Quartulli
2024-05-06 1:16 ` [PATCH net-next v3 10/24] ovpn: implement basic RX " Antonio Quartulli
2024-05-10 13:45 ` Sabrina Dubroca
2024-05-10 14:41 ` Antonio Quartulli
2024-07-18 10:46 ` Sabrina Dubroca
2024-07-18 13:06 ` Antonio Quartulli
2024-07-18 13:11 ` Sabrina Dubroca
2024-07-18 13:27 ` Antonio Quartulli
2024-07-18 13:40 ` Sabrina Dubroca
2024-07-18 14:15 ` Antonio Quartulli
2024-05-06 1:16 ` [PATCH net-next v3 11/24] ovpn: implement packet processing Antonio Quartulli
2024-05-12 8:46 ` Sabrina Dubroca
2024-05-13 7:14 ` Antonio Quartulli
2024-05-13 9:24 ` Sabrina Dubroca
2024-05-13 9:31 ` Antonio Quartulli
2024-05-22 14:08 ` Antonio Quartulli
2024-05-22 14:28 ` Andrew Lunn
2024-05-06 1:16 ` [PATCH net-next v3 12/24] ovpn: store tunnel and transport statistics Antonio Quartulli
2024-05-12 8:47 ` Sabrina Dubroca
2024-05-13 7:25 ` Antonio Quartulli
2024-05-13 9:19 ` Sabrina Dubroca
2024-05-13 9:33 ` Antonio Quartulli
2024-05-06 1:16 ` [PATCH net-next v3 13/24] ovpn: implement TCP transport Antonio Quartulli
2024-05-13 13:37 ` Antonio Quartulli
2024-05-13 15:34 ` Jakub Kicinski
2024-05-13 14:50 ` Sabrina Dubroca
2024-05-13 22:20 ` Antonio Quartulli
2024-05-14 8:58 ` Sabrina Dubroca [this message]
2024-05-14 22:11 ` Antonio Quartulli
2024-05-15 10:19 ` Sabrina Dubroca
2024-05-15 12:54 ` Antonio Quartulli
2024-05-15 14:55 ` Sabrina Dubroca
2024-05-15 19:44 ` Antonio Quartulli
2024-05-15 20:35 ` Sabrina Dubroca
2024-05-15 20:39 ` Antonio Quartulli
2024-05-06 1:16 ` [PATCH net-next v3 14/24] ovpn: implement multi-peer support Antonio Quartulli
2024-05-28 14:44 ` Sabrina Dubroca
2024-05-28 19:41 ` Antonio Quartulli
2024-05-29 15:16 ` Sabrina Dubroca
2024-05-29 20:15 ` Antonio Quartulli
2024-05-29 20:45 ` Sabrina Dubroca
2024-05-06 1:16 ` [PATCH net-next v3 15/24] ovpn: implement peer lookup logic Antonio Quartulli
2024-05-28 16:42 ` Sabrina Dubroca
2024-05-28 20:09 ` Antonio Quartulli
2024-05-29 16:42 ` Sabrina Dubroca
2024-05-29 20:19 ` Antonio Quartulli
2024-05-06 1:16 ` [PATCH net-next v3 16/24] ovpn: implement keepalive mechanism Antonio Quartulli
2024-05-06 1:16 ` [PATCH net-next v3 17/24] ovpn: add support for updating local UDP endpoint Antonio Quartulli
2024-05-06 1:16 ` [PATCH net-next v3 18/24] ovpn: add support for peer floating Antonio Quartulli
2024-05-06 1:16 ` [PATCH net-next v3 19/24] ovpn: implement peer add/dump/delete via netlink Antonio Quartulli
2024-05-06 1:16 ` [PATCH net-next v3 20/24] ovpn: implement key add/del/swap " Antonio Quartulli
2024-05-06 1:16 ` [PATCH net-next v3 21/24] ovpn: kill key and notify userspace in case of IV exhaustion Antonio Quartulli
2024-05-06 1:16 ` [PATCH net-next v3 22/24] ovpn: notify userspace when a peer is deleted Antonio Quartulli
2024-05-06 1:16 ` [PATCH net-next v3 23/24] ovpn: add basic ethtool support Antonio Quartulli
2024-05-06 1:16 ` [PATCH net-next v3 24/24] testing/selftest: add test tool and scripts for ovpn module Antonio Quartulli
2024-05-07 23:55 ` Jakub Kicinski
2024-05-08 9:51 ` Antonio Quartulli
2024-05-09 0:50 ` Jakub Kicinski
2024-05-09 8:40 ` Antonio Quartulli
2024-05-07 23:48 ` [PATCH net-next v3 00/24] Introducing OpenVPN Data Channel Offload Jakub Kicinski
2024-05-08 9:56 ` Antonio Quartulli
2024-05-09 0:53 ` Jakub Kicinski
2024-05-09 8:41 ` Antonio Quartulli
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=ZkMnpy3_T8YO3eHD@hog \
--to=sd@queasysnail.net \
--cc=andrew@lunn.ch \
--cc=antonio@openvpn.net \
--cc=edumazet@google.com \
--cc=esben@geanix.com \
--cc=kuba@kernel.org \
--cc=netdev@vger.kernel.org \
--cc=pabeni@redhat.com \
--cc=ryazanov.s.a@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).