netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v3 0/7] TCP-AO fixes
@ 2023-11-28 20:57 Dmitry Safonov
  2023-11-28 20:57 ` [PATCH v3 1/7] Documentation/tcp: Fix an obvious typo Dmitry Safonov
                   ` (6 more replies)
  0 siblings, 7 replies; 10+ messages in thread
From: Dmitry Safonov @ 2023-11-28 20:57 UTC (permalink / raw)
  To: David Ahern, Eric Dumazet, Paolo Abeni, Jakub Kicinski,
	David S. Miller
  Cc: linux-kernel, Dmitry Safonov, Dmitry Safonov, Francesco Ruggeri,
	Salam Noureddine, Simon Horman, netdev

Hi,

Changes from v2:
- rwlocks are problematic in net code (Paolo)
  Changed the SNE code to avoid spin/rw locks on RX/TX fastpath by
  double-accounting SEQ numbers for TCP-AO enabled connections.

Changes from v1:
- Use tcp_can_repair_sock() helper to limit TCP_AO_REPAIR (Eric)
- Instead of hook to listen() syscall, allow removing current/rnext keys
  on TCP_LISTEN (addressing Eric's objection)
- Add sne_lock to protect snd_sne/rcv_sne
- Don't move used_tcp_ao in struct tcp_request_sock (Eric)

I've been working on TCP-AO key-rotation selftests and as a result
exercised some corner-cases that are not usually met in production.

Here are a bunch of semi-related fixes:
- Documentation typo (reported by Markus Elfring)
- Proper alignment for TCP-AO option in TCP header that has MAC length
  of non 4 bytes (now a selftest with randomized maclen/algorithm/etc
  passes)
- 3 uAPI restricting patches that disallow more things to userspace in
  order to prevent it shooting itself in any parts of the body
- SNEs READ_ONCE()/WRITE_ONCE() that went missing by my human factor
- Avoid storing MAC length from SYN header as SYN-ACK will use
  rnext_key.maclen (drops an extra check that fails on new selftests)

Please, consider applying/pulling.

The following changes since commit df60cee26a2e3d937a319229e335cb3f9c1f16d2:

  Merge tag '6.7-rc3-smb3-server-fixes' of git://git.samba.org/ksmbd (2023-11-27 17:17:23 -0800)

are available in the Git repository at:

  git@github.com:0x7f454c46/linux.git tcp-ao-post-merge-v3

for you to fetch changes up to 822e6f2d14a1e1de98835fcc3940c04d28582656:

  net/tcp: Don't store TCP-AO maclen on reqsk (2023-11-28 17:57:32 +0000)

----------------------------------------------------------------

Thanks,
             Dmitry

Cc: David Ahern <dsahern@kernel.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Dmitry Safonov <0x7f454c46@gmail.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Francesco Ruggeri <fruggeri05@gmail.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: Salam Noureddine <noureddine@arista.com>
Cc: Simon Horman <horms@kernel.org>
Cc: netdev@vger.kernel.org
Cc: linux-kernel@vger.kernel.org

Dmitry Safonov (7):
  Documentation/tcp: Fix an obvious typo
  net/tcp: Consistently align TCP-AO option in the header
  net/tcp: Limit TCP_AO_REPAIR to non-listen sockets
  net/tcp: Allow removing current/rnext TCP-AO keys on TCP_LISTEN
    sockets
  net/tcp: Don't add key with non-matching VRF on connected sockets
  net/tcp: Store SNEs + SEQs on ao_info
  net/tcp: Don't store TCP-AO maclen on reqsk

 Documentation/networking/tcp_ao.rst |  2 +-
 include/linux/tcp.h                 |  8 +---
 include/net/tcp_ao.h                | 31 ++++++++++--
 net/ipv4/tcp.c                      | 13 ++++-
 net/ipv4/tcp_ao.c                   | 74 ++++++++++++++++++-----------
 net/ipv4/tcp_fastopen.c             |  2 +
 net/ipv4/tcp_input.c                | 26 ++++++----
 net/ipv4/tcp_ipv4.c                 |  4 +-
 net/ipv4/tcp_minisocks.c            |  2 +-
 net/ipv4/tcp_output.c               | 16 +++----
 net/ipv6/tcp_ipv6.c                 |  2 +-
 11 files changed, 116 insertions(+), 64 deletions(-)


base-commit: df60cee26a2e3d937a319229e335cb3f9c1f16d2
-- 
2.43.0


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH v3 1/7] Documentation/tcp: Fix an obvious typo
  2023-11-28 20:57 [PATCH v3 0/7] TCP-AO fixes Dmitry Safonov
@ 2023-11-28 20:57 ` Dmitry Safonov
  2023-11-28 20:57 ` [PATCH v3 2/7] net/tcp: Consistently align TCP-AO option in the header Dmitry Safonov
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 10+ messages in thread
From: Dmitry Safonov @ 2023-11-28 20:57 UTC (permalink / raw)
  To: David Ahern, Eric Dumazet, Paolo Abeni, Jakub Kicinski,
	David S. Miller
  Cc: linux-kernel, Dmitry Safonov, Dmitry Safonov, Francesco Ruggeri,
	Salam Noureddine, Simon Horman, netdev, Markus Elfring,
	Jonathan Corbet, linux-doc

Yep, my VIM spellchecker is not good enough for typos like this one.

Fixes: 7fe0e38bb669 ("Documentation/tcp: Add TCP-AO documentation")
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: linux-doc@vger.kernel.org
Reported-by: Markus Elfring <Markus.Elfring@web.de>
Closes: https://lore.kernel.org/all/2745ab4e-acac-40d4-83bf-37f2600d0c3d@web.de/
Signed-off-by: Dmitry Safonov <dima@arista.com>
---
 Documentation/networking/tcp_ao.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/networking/tcp_ao.rst b/Documentation/networking/tcp_ao.rst
index cfa5bf1cc542..8a58321acce7 100644
--- a/Documentation/networking/tcp_ao.rst
+++ b/Documentation/networking/tcp_ao.rst
@@ -99,7 +99,7 @@ also [6.1]::
    when it is no longer considered permitted.
 
 Linux TCP-AO will try its best to prevent you from removing a key that's
-being used, considering it a key management failure. But sine keeping
+being used, considering it a key management failure. But since keeping
 an outdated key may become a security issue and as a peer may
 unintentionally prevent the removal of an old key by always setting
 it as RNextKeyID - a forced key removal mechanism is provided, where
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH v3 2/7] net/tcp: Consistently align TCP-AO option in the header
  2023-11-28 20:57 [PATCH v3 0/7] TCP-AO fixes Dmitry Safonov
  2023-11-28 20:57 ` [PATCH v3 1/7] Documentation/tcp: Fix an obvious typo Dmitry Safonov
@ 2023-11-28 20:57 ` Dmitry Safonov
  2023-11-28 20:57 ` [PATCH v3 3/7] net/tcp: Limit TCP_AO_REPAIR to non-listen sockets Dmitry Safonov
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 10+ messages in thread
From: Dmitry Safonov @ 2023-11-28 20:57 UTC (permalink / raw)
  To: David Ahern, Eric Dumazet, Paolo Abeni, Jakub Kicinski,
	David S. Miller
  Cc: linux-kernel, Dmitry Safonov, Dmitry Safonov, Francesco Ruggeri,
	Salam Noureddine, Simon Horman, netdev

Currently functions that pre-calculate TCP header options length use
unaligned TCP-AO header + MAC-length for skb reservation.
And the functions that actually write TCP-AO options into skb do align
the header. Nothing good can come out of this for ((maclen % 4) != 0).

Provide tcp_ao_len_aligned() helper and use it everywhere for TCP
header options space calculations.

Fixes: 1e03d32bea8e ("net/tcp: Add TCP-AO sign to outgoing packets")
Signed-off-by: Dmitry Safonov <dima@arista.com>
---
 include/net/tcp_ao.h     | 6 ++++++
 net/ipv4/tcp_ao.c        | 4 ++--
 net/ipv4/tcp_ipv4.c      | 4 ++--
 net/ipv4/tcp_minisocks.c | 2 +-
 net/ipv4/tcp_output.c    | 6 +++---
 net/ipv6/tcp_ipv6.c      | 2 +-
 6 files changed, 15 insertions(+), 9 deletions(-)

diff --git a/include/net/tcp_ao.h b/include/net/tcp_ao.h
index b56be10838f0..647781080613 100644
--- a/include/net/tcp_ao.h
+++ b/include/net/tcp_ao.h
@@ -62,11 +62,17 @@ static inline int tcp_ao_maclen(const struct tcp_ao_key *key)
 	return key->maclen;
 }
 
+/* Use tcp_ao_len_aligned() for TCP header calculations */
 static inline int tcp_ao_len(const struct tcp_ao_key *key)
 {
 	return tcp_ao_maclen(key) + sizeof(struct tcp_ao_hdr);
 }
 
+static inline int tcp_ao_len_aligned(const struct tcp_ao_key *key)
+{
+	return round_up(tcp_ao_len(key), 4);
+}
+
 static inline unsigned int tcp_ao_digest_size(struct tcp_ao_key *key)
 {
 	return key->digest_size;
diff --git a/net/ipv4/tcp_ao.c b/net/ipv4/tcp_ao.c
index 7696417d0640..c8be1d526eac 100644
--- a/net/ipv4/tcp_ao.c
+++ b/net/ipv4/tcp_ao.c
@@ -1100,7 +1100,7 @@ void tcp_ao_connect_init(struct sock *sk)
 			ao_info->current_key = key;
 		if (!ao_info->rnext_key)
 			ao_info->rnext_key = key;
-		tp->tcp_header_len += tcp_ao_len(key);
+		tp->tcp_header_len += tcp_ao_len_aligned(key);
 
 		ao_info->lisn = htonl(tp->write_seq);
 		ao_info->snd_sne = 0;
@@ -1346,7 +1346,7 @@ static int tcp_ao_parse_crypto(struct tcp_ao_add *cmd, struct tcp_ao_key *key)
 	syn_tcp_option_space -= TCPOLEN_MSS_ALIGNED;
 	syn_tcp_option_space -= TCPOLEN_TSTAMP_ALIGNED;
 	syn_tcp_option_space -= TCPOLEN_WSCALE_ALIGNED;
-	if (tcp_ao_len(key) > syn_tcp_option_space) {
+	if (tcp_ao_len_aligned(key) > syn_tcp_option_space) {
 		err = -EMSGSIZE;
 		goto err_kfree;
 	}
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 5f693bbd578d..0c50c5a32b84 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -690,7 +690,7 @@ static bool tcp_v4_ao_sign_reset(const struct sock *sk, struct sk_buff *skb,
 
 	reply_options[0] = htonl((TCPOPT_AO << 24) | (tcp_ao_len(key) << 16) |
 				 (aoh->rnext_keyid << 8) | keyid);
-	arg->iov[0].iov_len += round_up(tcp_ao_len(key), 4);
+	arg->iov[0].iov_len += tcp_ao_len_aligned(key);
 	reply->doff = arg->iov[0].iov_len / 4;
 
 	if (tcp_ao_hash_hdr(AF_INET, (char *)&reply_options[1],
@@ -978,7 +978,7 @@ static void tcp_v4_send_ack(const struct sock *sk,
 					  (tcp_ao_len(key->ao_key) << 16) |
 					  (key->ao_key->sndid << 8) |
 					  key->rcv_next);
-		arg.iov[0].iov_len += round_up(tcp_ao_len(key->ao_key), 4);
+		arg.iov[0].iov_len += tcp_ao_len_aligned(key->ao_key);
 		rep.th.doff = arg.iov[0].iov_len / 4;
 
 		tcp_ao_hash_hdr(AF_INET, (char *)&rep.opt[offset],
diff --git a/net/ipv4/tcp_minisocks.c b/net/ipv4/tcp_minisocks.c
index a9807eeb311c..9e85f2a0bddd 100644
--- a/net/ipv4/tcp_minisocks.c
+++ b/net/ipv4/tcp_minisocks.c
@@ -615,7 +615,7 @@ struct sock *tcp_create_openreq_child(const struct sock *sk,
 	ao_key = treq->af_specific->ao_lookup(sk, req,
 				tcp_rsk(req)->ao_keyid, -1);
 	if (ao_key)
-		newtp->tcp_header_len += tcp_ao_len(ao_key);
+		newtp->tcp_header_len += tcp_ao_len_aligned(ao_key);
  #endif
 	if (skb->len >= TCP_MSS_DEFAULT + newtp->tcp_header_len)
 		newicsk->icsk_ack.last_seg_size = skb->len - newtp->tcp_header_len;
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index eb13a55d660c..93eef1dbbc55 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -825,7 +825,7 @@ static unsigned int tcp_syn_options(struct sock *sk, struct sk_buff *skb,
 		timestamps = READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_timestamps);
 		if (tcp_key_is_ao(key)) {
 			opts->options |= OPTION_AO;
-			remaining -= tcp_ao_len(key->ao_key);
+			remaining -= tcp_ao_len_aligned(key->ao_key);
 		}
 	}
 
@@ -915,7 +915,7 @@ static unsigned int tcp_synack_options(const struct sock *sk,
 			ireq->tstamp_ok &= !ireq->sack_ok;
 	} else if (tcp_key_is_ao(key)) {
 		opts->options |= OPTION_AO;
-		remaining -= tcp_ao_len(key->ao_key);
+		remaining -= tcp_ao_len_aligned(key->ao_key);
 		ireq->tstamp_ok &= !ireq->sack_ok;
 	}
 
@@ -982,7 +982,7 @@ static unsigned int tcp_established_options(struct sock *sk, struct sk_buff *skb
 		size += TCPOLEN_MD5SIG_ALIGNED;
 	} else if (tcp_key_is_ao(key)) {
 		opts->options |= OPTION_AO;
-		size += tcp_ao_len(key->ao_key);
+		size += tcp_ao_len_aligned(key->ao_key);
 	}
 
 	if (likely(tp->rx_opt.tstamp_ok)) {
diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index 937a02c2e534..8c6623496dd7 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -881,7 +881,7 @@ static void tcp_v6_send_response(const struct sock *sk, struct sk_buff *skb, u32
 	if (tcp_key_is_md5(key))
 		tot_len += TCPOLEN_MD5SIG_ALIGNED;
 	if (tcp_key_is_ao(key))
-		tot_len += tcp_ao_len(key->ao_key);
+		tot_len += tcp_ao_len_aligned(key->ao_key);
 
 #ifdef CONFIG_MPTCP
 	if (rst && !tcp_key_is_md5(key)) {
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH v3 3/7] net/tcp: Limit TCP_AO_REPAIR to non-listen sockets
  2023-11-28 20:57 [PATCH v3 0/7] TCP-AO fixes Dmitry Safonov
  2023-11-28 20:57 ` [PATCH v3 1/7] Documentation/tcp: Fix an obvious typo Dmitry Safonov
  2023-11-28 20:57 ` [PATCH v3 2/7] net/tcp: Consistently align TCP-AO option in the header Dmitry Safonov
@ 2023-11-28 20:57 ` Dmitry Safonov
  2023-11-28 20:57 ` [PATCH v3 4/7] net/tcp: Allow removing current/rnext TCP-AO keys on TCP_LISTEN sockets Dmitry Safonov
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 10+ messages in thread
From: Dmitry Safonov @ 2023-11-28 20:57 UTC (permalink / raw)
  To: David Ahern, Eric Dumazet, Paolo Abeni, Jakub Kicinski,
	David S. Miller
  Cc: linux-kernel, Dmitry Safonov, Dmitry Safonov, Francesco Ruggeri,
	Salam Noureddine, Simon Horman, netdev

Listen socket is not an established TCP connection, so
setsockopt(TCP_AO_REPAIR) doesn't have any impact.

Restrict this uAPI for listen sockets.

Fixes: faadfaba5e01 ("net/tcp: Add TCP_AO_REPAIR")
Signed-off-by: Dmitry Safonov <dima@arista.com>
---
 net/ipv4/tcp.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 53bcc17c91e4..b1fe4eb01829 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -3594,6 +3594,10 @@ int do_tcp_setsockopt(struct sock *sk, int level, int optname,
 		break;
 
 	case TCP_AO_REPAIR:
+		if (!tcp_can_repair_sock(sk)) {
+			err = -EPERM;
+			break;
+		}
 		err = tcp_ao_set_repair(sk, optval, optlen);
 		break;
 #ifdef CONFIG_TCP_AO
@@ -4293,6 +4297,8 @@ int do_tcp_getsockopt(struct sock *sk, int level,
 	}
 #endif
 	case TCP_AO_REPAIR:
+		if (!tcp_can_repair_sock(sk))
+			return -EPERM;
 		return tcp_ao_get_repair(sk, optval, optlen);
 	case TCP_AO_GET_KEYS:
 	case TCP_AO_INFO: {
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH v3 4/7] net/tcp: Allow removing current/rnext TCP-AO keys on TCP_LISTEN sockets
  2023-11-28 20:57 [PATCH v3 0/7] TCP-AO fixes Dmitry Safonov
                   ` (2 preceding siblings ...)
  2023-11-28 20:57 ` [PATCH v3 3/7] net/tcp: Limit TCP_AO_REPAIR to non-listen sockets Dmitry Safonov
@ 2023-11-28 20:57 ` Dmitry Safonov
  2023-11-28 20:57 ` [PATCH v3 5/7] net/tcp: Don't add key with non-matching VRF on connected sockets Dmitry Safonov
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 10+ messages in thread
From: Dmitry Safonov @ 2023-11-28 20:57 UTC (permalink / raw)
  To: David Ahern, Eric Dumazet, Paolo Abeni, Jakub Kicinski,
	David S. Miller
  Cc: linux-kernel, Dmitry Safonov, Dmitry Safonov, Francesco Ruggeri,
	Salam Noureddine, Simon Horman, netdev

TCP_LISTEN sockets are not connected to any peer, so having
current_key/rnext_key doesn't make sense.

The userspace may falter over this issue by setting current or rnext
TCP-AO key before listen() syscall. setsockopt(TCP_AO_DEL_KEY) doesn't
allow removing a key that is in use (in accordance to RFC 5925), so
it might be inconvenient to have keys that can be destroyed only with
listener socket.

Fixes: 4954f17ddefc ("net/tcp: Introduce TCP_AO setsockopt()s")
Signed-off-by: Dmitry Safonov <dima@arista.com>
---
 net/ipv4/tcp_ao.c | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/net/ipv4/tcp_ao.c b/net/ipv4/tcp_ao.c
index c8be1d526eac..bf41be6d4721 100644
--- a/net/ipv4/tcp_ao.c
+++ b/net/ipv4/tcp_ao.c
@@ -1818,8 +1818,16 @@ static int tcp_ao_del_cmd(struct sock *sk, unsigned short int family,
 		if (!new_rnext)
 			return -ENOENT;
 	}
-	if (cmd.del_async && sk->sk_state != TCP_LISTEN)
-		return -EINVAL;
+	if (sk->sk_state == TCP_LISTEN) {
+		/* Cleaning up possible "stale" current/rnext keys state,
+		 * that may have preserved from TCP_CLOSE, before sys_listen()
+		 */
+		ao_info->current_key = NULL;
+		ao_info->rnext_key = NULL;
+	} else {
+		if (cmd.del_async)
+			return -EINVAL;
+	}
 
 	if (family == AF_INET) {
 		struct sockaddr_in *sin = (struct sockaddr_in *)&cmd.addr;
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH v3 5/7] net/tcp: Don't add key with non-matching VRF on connected sockets
  2023-11-28 20:57 [PATCH v3 0/7] TCP-AO fixes Dmitry Safonov
                   ` (3 preceding siblings ...)
  2023-11-28 20:57 ` [PATCH v3 4/7] net/tcp: Allow removing current/rnext TCP-AO keys on TCP_LISTEN sockets Dmitry Safonov
@ 2023-11-28 20:57 ` Dmitry Safonov
  2023-11-29  1:34   ` David Ahern
  2023-11-28 20:57 ` [PATCH v3 6/7] net/tcp: Store SNEs + SEQs on ao_info Dmitry Safonov
  2023-11-28 20:57 ` [PATCH v3 7/7] net/tcp: Don't store TCP-AO maclen on reqsk Dmitry Safonov
  6 siblings, 1 reply; 10+ messages in thread
From: Dmitry Safonov @ 2023-11-28 20:57 UTC (permalink / raw)
  To: David Ahern, Eric Dumazet, Paolo Abeni, Jakub Kicinski,
	David S. Miller
  Cc: linux-kernel, Dmitry Safonov, Dmitry Safonov, Francesco Ruggeri,
	Salam Noureddine, Simon Horman, netdev

If the connection was established, don't allow adding TCP-AO keys that
don't match the peer. Currently, there are checks for ip-address
matching, but L3 index check is missing. Add it to restrict userspace
shooting itself somewhere.

Fixes: 248411b8cb89 ("net/tcp: Wire up l3index to TCP-AO")
Signed-off-by: Dmitry Safonov <dima@arista.com>
---
 net/ipv4/tcp_ao.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/net/ipv4/tcp_ao.c b/net/ipv4/tcp_ao.c
index bf41be6d4721..2d000e275ce7 100644
--- a/net/ipv4/tcp_ao.c
+++ b/net/ipv4/tcp_ao.c
@@ -1608,6 +1608,9 @@ static int tcp_ao_add_cmd(struct sock *sk, unsigned short int family,
 		if (!dev || !l3index)
 			return -EINVAL;
 
+		if (!((1 << sk->sk_state) & (TCPF_LISTEN | TCPF_CLOSE)))
+			return -EINVAL;
+
 		/* It's still possible to bind after adding keys or even
 		 * re-bind to a different dev (with CAP_NET_RAW).
 		 * So, no reason to return error here, rather try to be
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH v3 6/7] net/tcp: Store SNEs + SEQs on ao_info
  2023-11-28 20:57 [PATCH v3 0/7] TCP-AO fixes Dmitry Safonov
                   ` (4 preceding siblings ...)
  2023-11-28 20:57 ` [PATCH v3 5/7] net/tcp: Don't add key with non-matching VRF on connected sockets Dmitry Safonov
@ 2023-11-28 20:57 ` Dmitry Safonov
  2023-11-28 20:57 ` [PATCH v3 7/7] net/tcp: Don't store TCP-AO maclen on reqsk Dmitry Safonov
  6 siblings, 0 replies; 10+ messages in thread
From: Dmitry Safonov @ 2023-11-28 20:57 UTC (permalink / raw)
  To: David Ahern, Eric Dumazet, Paolo Abeni, Jakub Kicinski,
	David S. Miller
  Cc: linux-kernel, Dmitry Safonov, Dmitry Safonov, Francesco Ruggeri,
	Salam Noureddine, Simon Horman, netdev

RFC 5925 (6.2):
> TCP-AO emulates a 64-bit sequence number space by inferring when to
> increment the high-order 32-bit portion (the SNE) based on
> transitions in the low-order portion (the TCP sequence number).

snd_sne and rcv_sne are the upper 4 bytes of extended SEQ number.
Unfortunately, reading two 4-bytes pointers can't be performed
atomically (without synchronization).

In order to avoid locks on TCP fastpath, let's just double-account for
SEQ changes: snd_una/rcv_nxt will be lower 4 bytes of snd_sne/rcv_sne.

Fixes: 64382c71a557 ("net/tcp: Add TCP-AO SNE support")
Signed-off-by: Dmitry Safonov <dima@arista.com>
---
 include/net/tcp_ao.h    | 25 +++++++++++++++++---
 net/ipv4/tcp.c          |  7 ++++--
 net/ipv4/tcp_ao.c       | 51 ++++++++++++++++++++++-------------------
 net/ipv4/tcp_fastopen.c |  2 ++
 net/ipv4/tcp_input.c    | 21 ++++++++++-------
 net/ipv4/tcp_output.c   |  1 +
 6 files changed, 71 insertions(+), 36 deletions(-)

diff --git a/include/net/tcp_ao.h b/include/net/tcp_ao.h
index 647781080613..b8ef25d4b632 100644
--- a/include/net/tcp_ao.h
+++ b/include/net/tcp_ao.h
@@ -121,8 +121,8 @@ struct tcp_ao_info {
 	 * - for time-wait sockets the basis is tw_rcv_nxt/tw_snd_nxt.
 	 *   tw_snd_nxt is not expected to change, while tw_rcv_nxt may.
 	 */
-	u32			snd_sne;
-	u32			rcv_sne;
+	u64			snd_sne;
+	u64			rcv_sne;
 	refcount_t		refcnt;		/* Protects twsk destruction */
 	struct rcu_head		rcu;
 };
@@ -212,7 +212,6 @@ enum skb_drop_reason tcp_inbound_ao_hash(struct sock *sk,
 			const struct sk_buff *skb, unsigned short int family,
 			const struct request_sock *req, int l3index,
 			const struct tcp_ao_hdr *aoh);
-u32 tcp_ao_compute_sne(u32 next_sne, u32 next_seq, u32 seq);
 struct tcp_ao_key *tcp_ao_do_lookup(const struct sock *sk, int l3index,
 				    const union tcp_ao_addr *addr,
 				    int family, int sndid, int rcvid);
@@ -353,6 +352,26 @@ static inline int tcp_ao_set_repair(struct sock *sk,
 }
 #endif
 
+static inline void tcp_ao_sne_set(struct tcp_sock *tp, bool send, u64 sne)
+{
+#ifdef CONFIG_TCP_AO
+	struct tcp_ao_info *ao;
+
+	if (!static_branch_unlikely(&tcp_ao_needed.key))
+		return;
+
+	ao = rcu_dereference_protected(tp->ao_info,
+				       lockdep_sock_is_held((struct sock *)tp));
+	if (!ao)
+		return;
+
+	if (send)
+		WRITE_ONCE(ao->snd_sne, sne);
+	else
+		WRITE_ONCE(ao->rcv_sne, sne);
+#endif
+}
+
 #if defined(CONFIG_TCP_MD5SIG) || defined(CONFIG_TCP_AO)
 int tcp_do_parse_auth_options(const struct tcphdr *th,
 			      const u8 **md5_hash, const u8 **ao_hash);
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index b1fe4eb01829..431c10917d27 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -3545,16 +3545,19 @@ int do_tcp_setsockopt(struct sock *sk, int level, int optname,
 		if (sk->sk_state != TCP_CLOSE) {
 			err = -EPERM;
 		} else if (tp->repair_queue == TCP_SEND_QUEUE) {
-			if (!tcp_rtx_queue_empty(sk))
+			if (!tcp_rtx_queue_empty(sk)) {
 				err = -EPERM;
-			else
+			} else {
 				WRITE_ONCE(tp->write_seq, val);
+				tcp_ao_sne_set(tp, true, val);
+			}
 		} else if (tp->repair_queue == TCP_RECV_QUEUE) {
 			if (tp->rcv_nxt != tp->copied_seq) {
 				err = -EPERM;
 			} else {
 				WRITE_ONCE(tp->rcv_nxt, val);
 				WRITE_ONCE(tp->copied_seq, val);
+				tcp_ao_sne_set(tp, false, val);
 			}
 		} else {
 			err = -EINVAL;
diff --git a/net/ipv4/tcp_ao.c b/net/ipv4/tcp_ao.c
index 2d000e275ce7..fe68983fcf26 100644
--- a/net/ipv4/tcp_ao.c
+++ b/net/ipv4/tcp_ao.c
@@ -472,9 +472,10 @@ static int tcp_ao_hash_pseudoheader(unsigned short int family,
 	return -EAFNOSUPPORT;
 }
 
-u32 tcp_ao_compute_sne(u32 next_sne, u32 next_seq, u32 seq)
+static u32 tcp_ao_compute_sne(u64 seq_sne, u32 seq)
 {
-	u32 sne = next_sne;
+	u32 next_seq = (u32)(seq_sne & 0xffffffff);
+	u32 sne = seq_sne >> 32;
 
 	if (before(seq, next_seq)) {
 		if (seq > next_seq)
@@ -483,7 +484,6 @@ u32 tcp_ao_compute_sne(u32 next_sne, u32 next_seq, u32 seq)
 		if (seq < next_seq)
 			sne++;
 	}
-
 	return sne;
 }
 
@@ -731,7 +731,7 @@ int tcp_ao_prepare_reset(const struct sock *sk, struct sk_buff *skb,
 
 			sisn = htonl(tcp_rsk(req)->rcv_isn);
 			disn = htonl(tcp_rsk(req)->snt_isn);
-			*sne = tcp_ao_compute_sne(0, tcp_rsk(req)->snt_isn, seq);
+			*sne = tcp_ao_compute_sne(tcp_rsk(req)->snt_isn, seq);
 		} else {
 			sisn = th->seq;
 			disn = 0;
@@ -763,14 +763,11 @@ int tcp_ao_prepare_reset(const struct sock *sk, struct sk_buff *skb,
 		*keyid = (*key)->rcvid;
 	} else {
 		struct tcp_ao_key *rnext_key;
-		u32 snd_basis;
 
 		if (sk->sk_state == TCP_TIME_WAIT) {
 			ao_info = rcu_dereference(tcp_twsk(sk)->ao_info);
-			snd_basis = tcp_twsk(sk)->tw_snd_nxt;
 		} else {
 			ao_info = rcu_dereference(tcp_sk(sk)->ao_info);
-			snd_basis = tcp_sk(sk)->snd_una;
 		}
 		if (!ao_info)
 			return -ENOENT;
@@ -781,8 +778,7 @@ int tcp_ao_prepare_reset(const struct sock *sk, struct sk_buff *skb,
 		*traffic_key = snd_other_key(*key);
 		rnext_key = READ_ONCE(ao_info->rnext_key);
 		*keyid = rnext_key->rcvid;
-		*sne = tcp_ao_compute_sne(READ_ONCE(ao_info->snd_sne),
-					  snd_basis, seq);
+		*sne = tcp_ao_compute_sne(READ_ONCE(ao_info->snd_sne), seq);
 	}
 	return 0;
 }
@@ -816,8 +812,7 @@ int tcp_ao_transmit_skb(struct sock *sk, struct sk_buff *skb,
 		tp->af_specific->ao_calc_key_sk(key, traffic_key,
 						sk, ao->lisn, disn, true);
 	}
-	sne = tcp_ao_compute_sne(READ_ONCE(ao->snd_sne), READ_ONCE(tp->snd_una),
-				 ntohl(th->seq));
+	sne = tcp_ao_compute_sne(READ_ONCE(ao->snd_sne), ntohl(th->seq));
 	tp->af_specific->calc_ao_hash(hash_location, key, sk, skb, traffic_key,
 				      hash_location - (u8 *)th, sne);
 	kfree(tkey_buf);
@@ -938,8 +933,8 @@ tcp_inbound_ao_hash(struct sock *sk, const struct sk_buff *skb,
 
 	/* Fast-path */
 	if (likely((1 << sk->sk_state) & TCP_AO_ESTABLISHED)) {
-		enum skb_drop_reason err;
 		struct tcp_ao_key *current_key;
+		enum skb_drop_reason err;
 
 		/* Check if this socket's rnext_key matches the keyid in the
 		 * packet. If not we lookup the key based on the keyid
@@ -956,8 +951,7 @@ tcp_inbound_ao_hash(struct sock *sk, const struct sk_buff *skb,
 		if (unlikely(th->syn && !th->ack))
 			goto verify_hash;
 
-		sne = tcp_ao_compute_sne(info->rcv_sne, tcp_sk(sk)->rcv_nxt,
-					 ntohl(th->seq));
+		sne = tcp_ao_compute_sne(READ_ONCE(info->rcv_sne), ntohl(th->seq));
 		/* Established socket, traffic key are cached */
 		traffic_key = rcv_other_key(key);
 		err = tcp_ao_verify_hash(sk, skb, family, info, aoh, key,
@@ -992,7 +986,7 @@ tcp_inbound_ao_hash(struct sock *sk, const struct sk_buff *skb,
 	if ((1 << sk->sk_state) & (TCPF_LISTEN | TCPF_NEW_SYN_RECV)) {
 		/* Make the initial syn the likely case here */
 		if (unlikely(req)) {
-			sne = tcp_ao_compute_sne(0, tcp_rsk(req)->rcv_isn,
+			sne = tcp_ao_compute_sne(tcp_rsk(req)->rcv_isn,
 						 ntohl(th->seq));
 			sisn = htonl(tcp_rsk(req)->rcv_isn);
 			disn = htonl(tcp_rsk(req)->snt_isn);
@@ -1000,8 +994,7 @@ tcp_inbound_ao_hash(struct sock *sk, const struct sk_buff *skb,
 			/* Possible syncookie packet */
 			sisn = htonl(ntohl(th->seq) - 1);
 			disn = htonl(ntohl(th->ack_seq) - 1);
-			sne = tcp_ao_compute_sne(0, ntohl(sisn),
-						 ntohl(th->seq));
+			sne = tcp_ao_compute_sne(ntohl(sisn), ntohl(th->seq));
 		} else if (unlikely(!th->syn)) {
 			/* no way to figure out initial sisn/disn - drop */
 			return SKB_DROP_REASON_TCP_FLAGS;
@@ -1103,7 +1096,8 @@ void tcp_ao_connect_init(struct sock *sk)
 		tp->tcp_header_len += tcp_ao_len_aligned(key);
 
 		ao_info->lisn = htonl(tp->write_seq);
-		ao_info->snd_sne = 0;
+		ao_info->snd_sne = htonl(tp->write_seq);
+		ao_info->rcv_sne = 0;
 	} else {
 		/* Can't happen: tcp_connect() verifies that there's
 		 * at least one tcp-ao key that matches the remote peer.
@@ -1139,7 +1133,7 @@ void tcp_ao_finish_connect(struct sock *sk, struct sk_buff *skb)
 		return;
 
 	WRITE_ONCE(ao->risn, tcp_hdr(skb)->seq);
-	ao->rcv_sne = 0;
+	WRITE_ONCE(ao->rcv_sne, ntohl(tcp_hdr(skb)->seq));
 
 	hlist_for_each_entry_rcu(key, &ao->head, node)
 		tcp_ao_cache_traffic_keys(sk, ao, key);
@@ -1169,6 +1163,8 @@ int tcp_ao_copy_all_matching(const struct sock *sk, struct sock *newsk,
 		return -ENOMEM;
 	new_ao->lisn = htonl(tcp_rsk(req)->snt_isn);
 	new_ao->risn = htonl(tcp_rsk(req)->rcv_isn);
+	new_ao->snd_sne = tcp_rsk(req)->snt_isn;
+	new_ao->rcv_sne = tcp_rsk(req)->rcv_isn;
 	new_ao->ao_required = ao->ao_required;
 	new_ao->accept_icmps = ao->accept_icmps;
 
@@ -1694,6 +1690,8 @@ static int tcp_ao_add_cmd(struct sock *sk, unsigned short int family,
 			goto err_free_sock;
 		}
 		sk_gso_disable(sk);
+		WRITE_ONCE(ao_info->snd_sne, tcp_sk(sk)->snd_una);
+		WRITE_ONCE(ao_info->rcv_sne, tcp_sk(sk)->rcv_nxt);
 		rcu_assign_pointer(tcp_sk(sk)->ao_info, ao_info);
 	}
 
@@ -2334,6 +2332,7 @@ int tcp_ao_set_repair(struct sock *sk, sockptr_t optval, unsigned int optlen)
 	struct tcp_ao_repair cmd;
 	struct tcp_ao_key *key;
 	struct tcp_ao_info *ao;
+	u64 sne;
 	int err;
 
 	if (optlen < sizeof(cmd))
@@ -2354,8 +2353,14 @@ int tcp_ao_set_repair(struct sock *sk, sockptr_t optval, unsigned int optlen)
 
 	WRITE_ONCE(ao->lisn, cmd.snt_isn);
 	WRITE_ONCE(ao->risn, cmd.rcv_isn);
-	WRITE_ONCE(ao->snd_sne, cmd.snd_sne);
-	WRITE_ONCE(ao->rcv_sne, cmd.rcv_sne);
+
+	sne = READ_ONCE(ao->snd_sne) & 0xffffffff;
+	sne += (u64)cmd.snd_sne << 32;
+	WRITE_ONCE(ao->snd_sne, sne);
+
+	sne = READ_ONCE(ao->rcv_sne) & 0xffffffff;
+	sne += (u64)cmd.rcv_sne << 32;
+	WRITE_ONCE(ao->rcv_sne, sne);
 
 	hlist_for_each_entry_rcu(key, &ao->head, node)
 		tcp_ao_cache_traffic_keys(sk, ao, key);
@@ -2388,8 +2393,8 @@ int tcp_ao_get_repair(struct sock *sk, sockptr_t optval, sockptr_t optlen)
 
 	opt.snt_isn	= ao->lisn;
 	opt.rcv_isn	= ao->risn;
-	opt.snd_sne	= READ_ONCE(ao->snd_sne);
-	opt.rcv_sne	= READ_ONCE(ao->rcv_sne);
+	opt.snd_sne	= READ_ONCE(ao->snd_sne) >> 32;
+	opt.rcv_sne	= READ_ONCE(ao->rcv_sne) >> 32;
 	rcu_read_unlock();
 
 	if (copy_to_sockptr(optval, &opt, min_t(int, len, sizeof(opt))))
diff --git a/net/ipv4/tcp_fastopen.c b/net/ipv4/tcp_fastopen.c
index 8ed54e7334a9..d28d0df300d3 100644
--- a/net/ipv4/tcp_fastopen.c
+++ b/net/ipv4/tcp_fastopen.c
@@ -194,6 +194,7 @@ void tcp_fastopen_add_skb(struct sock *sk, struct sk_buff *skb)
 	TCP_SKB_CB(skb)->tcp_flags &= ~TCPHDR_SYN;
 
 	tp->rcv_nxt = TCP_SKB_CB(skb)->end_seq;
+	tcp_ao_sne_set(tp, false, TCP_SKB_CB(skb)->end_seq);
 	__skb_queue_tail(&sk->sk_receive_queue, skb);
 	tp->syn_data_acked = 1;
 
@@ -282,6 +283,7 @@ static struct sock *tcp_fastopen_create_child(struct sock *sk,
 	tcp_init_transfer(child, BPF_SOCK_OPS_PASSIVE_ESTABLISHED_CB, skb);
 
 	tp->rcv_nxt = TCP_SKB_CB(skb)->seq + 1;
+	tcp_ao_sne_set(tp, false, TCP_SKB_CB(skb)->seq + 1);
 
 	tcp_fastopen_add_skb(child, skb);
 
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index bcb55d98004c..0a58447c33b1 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -3572,7 +3572,7 @@ static inline bool tcp_may_update_window(const struct tcp_sock *tp,
 		(ack_seq == tp->snd_wl1 && (nwin > tp->snd_wnd || !nwin));
 }
 
-static void tcp_snd_sne_update(struct tcp_sock *tp, u32 ack)
+static void tcp_ao_snd_sne_update(struct tcp_sock *tp, u32 delta)
 {
 #ifdef CONFIG_TCP_AO
 	struct tcp_ao_info *ao;
@@ -3582,8 +3582,9 @@ static void tcp_snd_sne_update(struct tcp_sock *tp, u32 ack)
 
 	ao = rcu_dereference_protected(tp->ao_info,
 				       lockdep_sock_is_held((struct sock *)tp));
-	if (ao && ack < tp->snd_una)
-		ao->snd_sne++;
+	if (!ao)
+		return;
+	WRITE_ONCE(ao->snd_sne, ao->snd_sne + delta);
 #endif
 }
 
@@ -3594,11 +3595,11 @@ static void tcp_snd_una_update(struct tcp_sock *tp, u32 ack)
 
 	sock_owned_by_me((struct sock *)tp);
 	tp->bytes_acked += delta;
-	tcp_snd_sne_update(tp, ack);
+	tcp_ao_snd_sne_update(tp, delta);
 	tp->snd_una = ack;
 }
 
-static void tcp_rcv_sne_update(struct tcp_sock *tp, u32 seq)
+static void tcp_ao_rcv_sne_update(struct tcp_sock *tp, u32 delta)
 {
 #ifdef CONFIG_TCP_AO
 	struct tcp_ao_info *ao;
@@ -3608,8 +3609,9 @@ static void tcp_rcv_sne_update(struct tcp_sock *tp, u32 seq)
 
 	ao = rcu_dereference_protected(tp->ao_info,
 				       lockdep_sock_is_held((struct sock *)tp));
-	if (ao && seq < tp->rcv_nxt)
-		ao->rcv_sne++;
+	if (!ao)
+		return;
+	WRITE_ONCE(ao->rcv_sne, ao->rcv_sne + delta);
 #endif
 }
 
@@ -3620,7 +3622,7 @@ static void tcp_rcv_nxt_update(struct tcp_sock *tp, u32 seq)
 
 	sock_owned_by_me((struct sock *)tp);
 	tp->bytes_received += delta;
-	tcp_rcv_sne_update(tp, seq);
+	tcp_ao_rcv_sne_update(tp, delta);
 	WRITE_ONCE(tp->rcv_nxt, seq);
 }
 
@@ -6400,6 +6402,7 @@ static int tcp_rcv_synsent_state_process(struct sock *sk, struct sk_buff *skb,
 		 * move to established.
 		 */
 		WRITE_ONCE(tp->rcv_nxt, TCP_SKB_CB(skb)->seq + 1);
+		tcp_ao_sne_set(tp, false, TCP_SKB_CB(skb)->seq + 1);
 		tp->rcv_wup = TCP_SKB_CB(skb)->seq + 1;
 
 		/* RFC1323: The window in SYN & SYN/ACK segments is
@@ -6510,6 +6513,7 @@ static int tcp_rcv_synsent_state_process(struct sock *sk, struct sk_buff *skb,
 		}
 
 		WRITE_ONCE(tp->rcv_nxt, TCP_SKB_CB(skb)->seq + 1);
+		tcp_ao_sne_set(tp, false, TCP_SKB_CB(skb)->seq + 1);
 		WRITE_ONCE(tp->copied_seq, tp->rcv_nxt);
 		tp->rcv_wup = TCP_SKB_CB(skb)->seq + 1;
 
@@ -6722,6 +6726,7 @@ int tcp_rcv_state_process(struct sock *sk, struct sk_buff *skb)
 		if (sk->sk_socket)
 			sk_wake_async(sk, SOCK_WAKE_IO, POLL_OUT);
 
+		tcp_ao_sne_set(tp, true, TCP_SKB_CB(skb)->ack_seq);
 		tp->snd_una = TCP_SKB_CB(skb)->ack_seq;
 		tp->snd_wnd = ntohs(th->window) << tp->rx_opt.snd_wscale;
 		tcp_init_wl(tp, TCP_SKB_CB(skb)->seq);
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index 93eef1dbbc55..3ddd057fb6f7 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -3882,6 +3882,7 @@ static void tcp_connect_init(struct sock *sk)
 	tp->snd_wnd = 0;
 	tcp_init_wl(tp, 0);
 	tcp_write_queue_purge(sk);
+	tcp_ao_sne_set(tp, true, tp->write_seq);
 	tp->snd_una = tp->write_seq;
 	tp->snd_sml = tp->write_seq;
 	tp->snd_up = tp->write_seq;
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH v3 7/7] net/tcp: Don't store TCP-AO maclen on reqsk
  2023-11-28 20:57 [PATCH v3 0/7] TCP-AO fixes Dmitry Safonov
                   ` (5 preceding siblings ...)
  2023-11-28 20:57 ` [PATCH v3 6/7] net/tcp: Store SNEs + SEQs on ao_info Dmitry Safonov
@ 2023-11-28 20:57 ` Dmitry Safonov
  6 siblings, 0 replies; 10+ messages in thread
From: Dmitry Safonov @ 2023-11-28 20:57 UTC (permalink / raw)
  To: David Ahern, Eric Dumazet, Paolo Abeni, Jakub Kicinski,
	David S. Miller
  Cc: linux-kernel, Dmitry Safonov, Dmitry Safonov, Francesco Ruggeri,
	Salam Noureddine, Simon Horman, netdev

This extra check doesn't work for a handshake when SYN segment has
(current_key.maclen != rnext_key.maclen). It could be amended to
preserve rnext_key.maclen instead of current_key.maclen, but that
requires a lookup on listen socket.

Originally, this extra maclen check was introduced just because it was
cheap. Drop it and convert tcp_request_sock::maclen into boolean
tcp_request_sock::used_tcp_ao.

Fixes: 06b22ef29591 ("net/tcp: Wire TCP-AO to request sockets")
Signed-off-by: Dmitry Safonov <dima@arista.com>
---
 include/linux/tcp.h   | 8 ++------
 net/ipv4/tcp_ao.c     | 4 ++--
 net/ipv4/tcp_input.c  | 5 +++--
 net/ipv4/tcp_output.c | 9 +++------
 4 files changed, 10 insertions(+), 16 deletions(-)

diff --git a/include/linux/tcp.h b/include/linux/tcp.h
index 68f3d315d2e1..b646b574b060 100644
--- a/include/linux/tcp.h
+++ b/include/linux/tcp.h
@@ -169,7 +169,7 @@ struct tcp_request_sock {
 #ifdef CONFIG_TCP_AO
 	u8				ao_keyid;
 	u8				ao_rcv_next;
-	u8				maclen;
+	bool				used_tcp_ao;
 #endif
 };
 
@@ -180,14 +180,10 @@ static inline struct tcp_request_sock *tcp_rsk(const struct request_sock *req)
 
 static inline bool tcp_rsk_used_ao(const struct request_sock *req)
 {
-	/* The real length of MAC is saved in the request socket,
-	 * signing anything with zero-length makes no sense, so here is
-	 * a little hack..
-	 */
 #ifndef CONFIG_TCP_AO
 	return false;
 #else
-	return tcp_rsk(req)->maclen != 0;
+	return tcp_rsk(req)->used_tcp_ao;
 #endif
 }
 
diff --git a/net/ipv4/tcp_ao.c b/net/ipv4/tcp_ao.c
index fe68983fcf26..88c0a858534e 100644
--- a/net/ipv4/tcp_ao.c
+++ b/net/ipv4/tcp_ao.c
@@ -846,7 +846,7 @@ void tcp_ao_syncookie(struct sock *sk, const struct sk_buff *skb,
 	const struct tcp_ao_hdr *aoh;
 	struct tcp_ao_key *key;
 
-	treq->maclen = 0;
+	treq->used_tcp_ao = false;
 
 	if (tcp_parse_auth_options(th, NULL, &aoh) || !aoh)
 		return;
@@ -858,7 +858,7 @@ void tcp_ao_syncookie(struct sock *sk, const struct sk_buff *skb,
 
 	treq->ao_rcv_next = aoh->keyid;
 	treq->ao_keyid = aoh->rnext_keyid;
-	treq->maclen = tcp_ao_maclen(key);
+	treq->used_tcp_ao = true;
 }
 
 static enum skb_drop_reason
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 0a58447c33b1..9bcbde89ab5c 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -7187,11 +7187,12 @@ int tcp_conn_request(struct request_sock_ops *rsk_ops,
 	if (tcp_parse_auth_options(tcp_hdr(skb), NULL, &aoh))
 		goto drop_and_release; /* Invalid TCP options */
 	if (aoh) {
-		tcp_rsk(req)->maclen = aoh->length - sizeof(struct tcp_ao_hdr);
+		tcp_rsk(req)->used_tcp_ao = true;
 		tcp_rsk(req)->ao_rcv_next = aoh->keyid;
 		tcp_rsk(req)->ao_keyid = aoh->rnext_keyid;
+
 	} else {
-		tcp_rsk(req)->maclen = 0;
+		tcp_rsk(req)->used_tcp_ao = false;
 	}
 #endif
 	tcp_rsk(req)->snt_isn = isn;
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index 3ddd057fb6f7..335ab90afe65 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -3720,7 +3720,6 @@ struct sk_buff *tcp_make_synack(const struct sock *sk, struct dst_entry *dst,
 	if (tcp_rsk_used_ao(req)) {
 #ifdef CONFIG_TCP_AO
 		struct tcp_ao_key *ao_key = NULL;
-		u8 maclen = tcp_rsk(req)->maclen;
 		u8 keyid = tcp_rsk(req)->ao_keyid;
 
 		ao_key = tcp_sk(sk)->af_specific->ao_lookup(sk, req_to_sk(req),
@@ -3730,13 +3729,11 @@ struct sk_buff *tcp_make_synack(const struct sock *sk, struct dst_entry *dst,
 		 * for another peer-matching key, but the peer has requested
 		 * ao_keyid (RFC5925 RNextKeyID), so let's keep it simple here.
 		 */
-		if (unlikely(!ao_key || tcp_ao_maclen(ao_key) != maclen)) {
-			u8 key_maclen = ao_key ? tcp_ao_maclen(ao_key) : 0;
-
+		if (unlikely(!ao_key)) {
 			rcu_read_unlock();
 			kfree_skb(skb);
-			net_warn_ratelimited("TCP-AO: the keyid %u with maclen %u|%u from SYN packet is not present - not sending SYNACK\n",
-					     keyid, maclen, key_maclen);
+			net_warn_ratelimited("TCP-AO: the keyid %u from SYN packet is not present - not sending SYNACK\n",
+					     keyid);
 			return NULL;
 		}
 		key.ao_key = ao_key;
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH v3 5/7] net/tcp: Don't add key with non-matching VRF on connected sockets
  2023-11-28 20:57 ` [PATCH v3 5/7] net/tcp: Don't add key with non-matching VRF on connected sockets Dmitry Safonov
@ 2023-11-29  1:34   ` David Ahern
  2023-11-29 15:42     ` Dmitry Safonov
  0 siblings, 1 reply; 10+ messages in thread
From: David Ahern @ 2023-11-29  1:34 UTC (permalink / raw)
  To: Dmitry Safonov, Eric Dumazet, Paolo Abeni, Jakub Kicinski,
	David S. Miller
  Cc: linux-kernel, Dmitry Safonov, Francesco Ruggeri, Salam Noureddine,
	Simon Horman, netdev

On 11/28/23 1:57 PM, Dmitry Safonov wrote:
> If the connection was established, don't allow adding TCP-AO keys that
> don't match the peer. Currently, there are checks for ip-address
> matching, but L3 index check is missing. Add it to restrict userspace

you say L3 index check is missing - add it. yet ...

> shooting itself somewhere.
> 
> Fixes: 248411b8cb89 ("net/tcp: Wire up l3index to TCP-AO")
> Signed-off-by: Dmitry Safonov <dima@arista.com>
> ---
>  net/ipv4/tcp_ao.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/net/ipv4/tcp_ao.c b/net/ipv4/tcp_ao.c
> index bf41be6d4721..2d000e275ce7 100644
> --- a/net/ipv4/tcp_ao.c
> +++ b/net/ipv4/tcp_ao.c
> @@ -1608,6 +1608,9 @@ static int tcp_ao_add_cmd(struct sock *sk, unsigned short int family,
>  		if (!dev || !l3index)
>  			return -EINVAL;
>  
> +		if (!((1 << sk->sk_state) & (TCPF_LISTEN | TCPF_CLOSE)))
> +			return -EINVAL;

... this is checking socket state.


> +
>  		/* It's still possible to bind after adding keys or even
>  		 * re-bind to a different dev (with CAP_NET_RAW).
>  		 * So, no reason to return error here, rather try to be


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v3 5/7] net/tcp: Don't add key with non-matching VRF on connected sockets
  2023-11-29  1:34   ` David Ahern
@ 2023-11-29 15:42     ` Dmitry Safonov
  0 siblings, 0 replies; 10+ messages in thread
From: Dmitry Safonov @ 2023-11-29 15:42 UTC (permalink / raw)
  To: David Ahern
  Cc: linux-kernel, Dmitry Safonov, Francesco Ruggeri, Salam Noureddine,
	Simon Horman, netdev, Eric Dumazet, Paolo Abeni, Jakub Kicinski,
	David S. Miller

Hi David,

On 11/29/23 01:34, David Ahern wrote:
> On 11/28/23 1:57 PM, Dmitry Safonov wrote:
>> If the connection was established, don't allow adding TCP-AO keys that
>> don't match the peer. Currently, there are checks for ip-address
>> matching, but L3 index check is missing. Add it to restrict userspace
> 
> you say L3 index check is missing - add it. yet ...
> 
>> shooting itself somewhere.
>>
>> Fixes: 248411b8cb89 ("net/tcp: Wire up l3index to TCP-AO")
>> Signed-off-by: Dmitry Safonov <dima@arista.com>
>> ---
>>  net/ipv4/tcp_ao.c | 3 +++
>>  1 file changed, 3 insertions(+)
>>
>> diff --git a/net/ipv4/tcp_ao.c b/net/ipv4/tcp_ao.c
>> index bf41be6d4721..2d000e275ce7 100644
>> --- a/net/ipv4/tcp_ao.c
>> +++ b/net/ipv4/tcp_ao.c
>> @@ -1608,6 +1608,9 @@ static int tcp_ao_add_cmd(struct sock *sk, unsigned short int family,
>>  		if (!dev || !l3index)
>>  			return -EINVAL;
>>  
>> +		if (!((1 << sk->sk_state) & (TCPF_LISTEN | TCPF_CLOSE)))
>> +			return -EINVAL;
> 
> ... this is checking socket state.

Right you are, it should have been under check for
: if (bound_dev_if != cmd.ifindex)

Currently it's warning for all sockets (which can be re-bound), but for
sockets in the connected state it doesn't make sense as the key lookup
is not expecting non peer-matching key post connect()/accept().

In this patch version the check will restrict adding a key on a
connected socket with VRF regardless if it's matching the bound VRF.
Will fix!

Thanks for spotting this,
             Dmitry


^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2023-11-29 15:42 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-11-28 20:57 [PATCH v3 0/7] TCP-AO fixes Dmitry Safonov
2023-11-28 20:57 ` [PATCH v3 1/7] Documentation/tcp: Fix an obvious typo Dmitry Safonov
2023-11-28 20:57 ` [PATCH v3 2/7] net/tcp: Consistently align TCP-AO option in the header Dmitry Safonov
2023-11-28 20:57 ` [PATCH v3 3/7] net/tcp: Limit TCP_AO_REPAIR to non-listen sockets Dmitry Safonov
2023-11-28 20:57 ` [PATCH v3 4/7] net/tcp: Allow removing current/rnext TCP-AO keys on TCP_LISTEN sockets Dmitry Safonov
2023-11-28 20:57 ` [PATCH v3 5/7] net/tcp: Don't add key with non-matching VRF on connected sockets Dmitry Safonov
2023-11-29  1:34   ` David Ahern
2023-11-29 15:42     ` Dmitry Safonov
2023-11-28 20:57 ` [PATCH v3 6/7] net/tcp: Store SNEs + SEQs on ao_info Dmitry Safonov
2023-11-28 20:57 ` [PATCH v3 7/7] net/tcp: Don't store TCP-AO maclen on reqsk Dmitry Safonov

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).