From: Willem de Bruijn <willemdebruijn.kernel@gmail.com>
To: Jakub Kicinski <kuba@kernel.org>,
Willem de Bruijn <willemdebruijn.kernel@gmail.com>
Cc: netdev@vger.kernel.org, pabeni@redhat.com, borisp@nvidia.com,
gal@nvidia.com, cratiu@nvidia.com, rrameshbabu@nvidia.com,
steffen.klassert@secunet.com, tariqt@nvidia.com,
mingtao@meta.com, knekritz@meta.com,
Lance Richardson <lance604@gmail.com>
Subject: Re: [RFC net-next 01/15] psp: add documentation
Date: Wed, 05 Jun 2024 16:11:31 -0400
Message-ID: <6660c673921ff_35916d294ef@willemb.c.googlers.com.notmuch>
In-Reply-To: <20240604170849.110d56c1@kernel.org>
Jakub Kicinski wrote:
> On Fri, 31 May 2024 09:56:42 -0400 Willem de Bruijn wrote:
> > > > If one peer can enter the state where it drops all plaintext, while
> > > > the other decides to close the connection before completing the
> > > > upgrade, and thus sends a plaintext FIN.
> > > >
> > > > If (big if) that can happen, then the connection cannot be cleanly
> > > > closed.
> > >
> > > Hm. And we can avoid this by only enforcing encryption of data-less
> > > segments once we've seen some encrypted data?
> >
> > That would help. It may also be needed to accept a pure ACK right at
> > the upgrade seqno. Depends on the upgrade process.
> >
> > Which may be worth documenting explicitly: the system call and network
> > packet exchange from when one peer initiates (by generating its local
> > key) until the connection is fully encrypted. That also allows poking
> > at the various edge cases that may happen if packets are lost, or when
> > actions can race.
>
> Dunno if the format below is good, but you're very right.
> At least to me writing the diagram was an hour well spent :)
Great :)
> > One unexpected example of the latter that I came across was Tx SADB
> > key insertion in tail edge cases taking longer than network RTT, for
> > instance.
> >
> > The kernel API can be exercised in a variety of ways, not all of them
> > will uphold the correctness. Documenting how it should be used should
> > help.
> >
> > Even better when it reduces the option space. As it already does by
> > failing a Tx key install until Rx is configured.
>
> Something along these lines?
>
> "Sequence" diagram of the worst case scenario:
>
> 01 p   Host A                       Host B
> 02 l t ~~~~~~~~~~~[TCP 3 WHS]~~~~~~~~~~
> 03 a e ~~~~~~[crypto negotiation]~~~~~~
> 04 i x                              [Rx key alloc = K-B]
> 05 n t                         <--- [app] K-B key send
> 06     ------[Rx key alloc = K-A]-
> 07     [app] K-A key send -->|
> 08     [TCP] K-B input <-----
> 08 P   [TCP] K-B ACK    ---->|
> 09 S R [app] recv(K-B)       |
> 10 P x [app] [Tx key set]    |
> 11     --------------------------
> 12 P T [app] send(RPC) #####>|
> 13 S x                       |<---- [TCP] Seq OoO! queue RPC, SACK
> 14 P   [TCP] retr K-A    --->|
> 15                           |  `-> [TCP] K-A input
> 16                           | <--- [TCP] K-A ACK (or FIN)
> 17                           |      [app] recv(K-A)
> 18                           |      [app] [Tx key set]
> 19     -----------------------------------
> 20
>
> There is a causal dependency between Host B allocating the key (line 4),
> sending it (line 5) and Host A receiving it (line 8). Since Host B will
> accept PSP packets as soon as it allocated the key, Host A does not
> need to wait to start using the key (line 12). Host B will queue the
> RPC to the socket (line 13).
>
> [Problem #1]
>
> However, because Host B does not have a Tx key, the ACK / SACK packet
> (line 13) will not be encrypted. (Similarly if Host B decided to close
> the connection at this point, the resulting FIN packet would not be
> encrypted.)
Or if it plays SO_LINGER games the resulting RST.
> Host A needs to accept unencrypted non-data segments
> (pure acks, pure FIN) until it sees an encrypted packet from Host B.
>
> [Problem #2]
>
> The retransmissions of K-A are unencrypted, to avoid sending the same
> data in encrypted and unencrypted form. This poses a risk if an ACK
> gets lost but both hosts end up in the PSP Tx state. Assume that Host A
> did not send the RPC (line 12), and the retransmission (line 14)
> happens as an RTO or TLP. Host B may already reach PSP Tx state (line
> "20") and expect encrypted data. Plain text retransmissions (with
> sequence number before rcv_nxt) must be accepted until Host B sees
> encrypted data from Host A.
Is that sufficient if an initial encrypted packet could get reordered
by the network to arrive before a plaintext retransmit of a lower
seqno?
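
To make the reordering case concrete, here is a rough sketch in Python.
It is a toy model, not kernel code: `Segment`, `Receiver` and `replay`
are hypothetical names, the sequence numbers are made up, and treating
"before rcv_nxt" as "the whole segment ends at or before rcv_nxt" is my
assumption. It only shows what the proposed rule does for the two
arrival orders; it does not claim the drop is harmful.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    seq: int          # first sequence number carried by the segment
    length: int       # payload length in bytes
    encrypted: bool   # True if the segment arrived as PSP


class Receiver:
    """Toy receiver that has Rx and Tx keys but has not seen PSP data yet."""

    def __init__(self, rcv_nxt: int):
        self.rcv_nxt = rcv_nxt
        self.seen_encrypted = False

    def input(self, seg: Segment) -> str:
        if seg.encrypted:
            # First encrypted data closes the plaintext window.
            self.seen_encrypted = True
            if seg.seq == self.rcv_nxt:
                self.rcv_nxt += seg.length
            return "accept"
        # Assumed reading of "sequence number before rcv_nxt".
        is_rtx = seg.seq + seg.length <= self.rcv_nxt
        if is_rtx and not self.seen_encrypted:
            return "accept"
        return "drop"


def replay(segments, rcv_nxt=1100):
    """Feed segments to a fresh receiver and collect the verdicts."""
    rx = Receiver(rcv_nxt)
    return [rx.input(s) for s in segments]


# Plaintext retransmit of already-received data, then new encrypted data.
rtx = Segment(seq=1000, length=100, encrypted=False)
rpc = Segment(seq=1100, length=200, encrypted=True)

in_order = replay([rtx, rpc])
reordered = replay([rpc, rtx])
```

In this model `in_order` accepts both segments, while `reordered`
accepts the encrypted segment and then drops the late plaintext
retransmit.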

Both scenarios make sense. It is unfortunately harder to be sure that
we have captured all edge cases.

An issue related to the rcv_nxt cut-point, not sure how important: the
plaintext packet contents are protected by user crypto before upgrade,
but the TCP headers are not. PSP relies on TCP PAWS for replay
protection. It is possible for a MITM to offset all seqno from the
start of connection establishment. I don't see an immediate issue. But
at a minimum it could be possible to insert or delete data before PSP
is upgraded.
>
> With that I think the state machine needs to be amended:
>
> Event          | Normal TCP | Rx PSP     | Tx PSP      | PSP full    |
> ----------------------------------------------------------------------
> Rx plain (new) | accept     | accept     | drop        | drop        |
>
> Rx plain       | accept     | accept     | accept      | drop        |
> (ACK|FIN|rtx)  |            |            |             |             |
>
> Rx PSP (good)  | drop       | accept     | accept      | accept      |
>
> Rx PSP (bad    | drop       | drop       | drop        | drop        |
> (crypt, !=SPI) |            |            |             |             |
>
> Tx             | plain text | plain text | encrypted   | encrypted   |
>                |            |            | (excl. rtx) | (excl. rtx) |
>
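
For poking at the table, here is a rough Python transcription. It is
only a sketch, not kernel code: the names `State`, `Event`, `rx_accept`
and `tx_encrypts` are made up for illustration, and reading
"(excl. rtx)" as "retransmits of data first sent in plain text stay
plain text" is my assumption based on Problem #2.

```python
from enum import Enum


class State(Enum):
    # Values double as column indexes into the table below.
    NORMAL_TCP = 0
    RX_PSP = 1
    TX_PSP = 2
    PSP_FULL = 3


class Event(Enum):
    RX_PLAIN_NEW = "Rx plain (new)"
    RX_PLAIN_CTRL_OR_RTX = "Rx plain (ACK|FIN|rtx)"
    RX_PSP_GOOD = "Rx PSP (good)"
    RX_PSP_BAD = "Rx PSP (bad crypt, !=SPI)"


# One row per Rx event; columns are Normal TCP, Rx PSP, Tx PSP, PSP full.
# True means accept, False means drop.
RX_ACCEPT = {
    Event.RX_PLAIN_NEW:         (True,  True,  False, False),
    Event.RX_PLAIN_CTRL_OR_RTX: (True,  True,  True,  False),
    Event.RX_PSP_GOOD:          (False, True,  True,  True),
    Event.RX_PSP_BAD:           (False, False, False, False),
}


def rx_accept(state: State, event: Event) -> bool:
    """Look up the accept/drop verdict for an incoming segment."""
    return RX_ACCEPT[event][state.value]


def tx_encrypts(state: State, rtx_of_plaintext: bool = False) -> bool:
    """Tx row: encrypt only once a Tx key is set, except plaintext rtx."""
    if state in (State.NORMAL_TCP, State.RX_PSP):
        return False
    return not rtx_of_plaintext
```

For example, `rx_accept(State.TX_PSP, Event.RX_PLAIN_CTRL_OR_RTX)` is
the Problem #1 case: a plaintext ACK or FIN is still accepted in the
Tx PSP column and only dropped in PSP full.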
> > > > Another example where a peer stays open and stays retrying if it has
> > > > upgraded and drops all plaintext.
> >
> > May want to always allow plaintext RSTs. This is a potential DoS
> > vector.
>
> Because of key exhaustion? Or we can be tricked into spamming someone
> with retransmissions and ignoring their RST?
Simpler: this falls back onto unencrypted TCP, where someone capable of
spoofing valid data is capable of terminating a connection.
If all plaintext is denied after upgrade, PSP protects against this.

It is arguably low on the list of concerns, especially in a closed
world hyperscaler setting, as it is hardly the only DoS vector.
> > In all these cases, I suppose this has already been figured
> > out for TLS.
>
> Assuming the answer above is "key exhaustion" - I wouldn't be surprised
> if it wasn't :(