* [PATCH net v2] net/tls: prevent skb_orphan() from leaking TLS plain text with offload
@ 2019-08-07 6:06 Jakub Kicinski
2019-08-07 16:59 ` Willem de Bruijn
0 siblings, 1 reply; 6+ messages in thread
From: Jakub Kicinski @ 2019-08-07 6:06 UTC (permalink / raw)
To: davem
Cc: netdev, davejwatson, borisp, aviadye, john.fastabend, daniel,
willemb, edumazet, alexei.starovoitov, oss-drivers,
Jakub Kicinski
sk_validate_xmit_skb() and drivers depend on the sk member of
struct sk_buff to identify segments requiring encryption.
Any operation which removes or does not preserve the original TLS
socket such as skb_orphan() or skb_clone() will cause clear text
leaks.
Make the TCP socket underlying an offloaded TLS connection
mark all skbs as decrypted, if TLS TX is in offload mode.
Then in sk_validate_xmit_skb() catch skbs which have no socket
(or a socket with no validation) and decrypted flag set.
Note that CONFIG_SOCK_VALIDATE_XMIT, CONFIG_TLS_DEVICE and
sk->sk_validate_xmit_skb are slightly interchangeable right now,
they all imply TLS offload. The new checks are guarded by
CONFIG_TLS_DEVICE because that's the option guarding the
sk_buff->decrypted member.
Second, smaller issue with orphaning is that it breaks
the guarantee that packets will be delivered to device
queues in-order. All TLS offload drivers depend on that
scheduling property. This means skb_orphan_partial()'s
trick of preserving partial socket references will cause
issues in the drivers. We need a full orphan, and as a
result netem delay/throttling will cause all TLS offload
skbs to be dropped.
Reusing the sk_buff->decrypted flag also protects from
leaking clear text when incoming, decrypted skb is redirected
(e.g. by TC).
See commit 0608c69c9a80 ("bpf: sk_msg, sock{map|hash} redirect
through ULP") for justification why the internal flag is safe.
v2:
- remove superfluous decrypted mark copy (Willem);
- remove the stale doc entry (Boris);
- rely entirely on EOR marking to prevent coalescing (Boris);
- use an internal sendpages flag instead of marking the socket
(Boris).
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
---
Documentation/networking/tls-offload.rst | 18 ------------------
include/linux/skbuff.h | 8 ++++++++
include/linux/socket.h | 3 +++
include/net/sock.h | 10 +++++++++-
net/core/sock.c | 20 +++++++++++++++-----
net/ipv4/tcp.c | 3 +++
net/ipv4/tcp_output.c | 3 +++
net/tls/tls_device.c | 9 +++++++--
8 files changed, 48 insertions(+), 26 deletions(-)
diff --git a/Documentation/networking/tls-offload.rst b/Documentation/networking/tls-offload.rst
index b70b70dc4524..0dd3f748239f 100644
--- a/Documentation/networking/tls-offload.rst
+++ b/Documentation/networking/tls-offload.rst
@@ -506,21 +506,3 @@ Drivers should ignore the changes to TLS the device feature flags.
These flags will be acted upon accordingly by the core ``ktls`` code.
TLS device feature flags only control adding of new TLS connection
offloads, old connections will remain active after flags are cleared.
-
-Known bugs
-==========
-
-skb_orphan() leaks clear text
------------------------------
-
-Currently drivers depend on the :c:member:`sk` member of
-:c:type:`struct sk_buff <sk_buff>` to identify segments requiring
-encryption. Any operation which removes or does not preserve the socket
-association such as :c:func:`skb_orphan` or :c:func:`skb_clone`
-will cause the driver to miss the packets and lead to clear text leaks.
-
-Redirects leak clear text
--------------------------
-
-In the RX direction, if segment has already been decrypted by the device
-and it gets redirected or mirrored - clear text will be transmitted out.
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index d8af86d995d6..ba5583522d24 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -1374,6 +1374,14 @@ static inline void skb_copy_hash(struct sk_buff *to, const struct sk_buff *from)
to->l4_hash = from->l4_hash;
};
+static inline void skb_copy_decrypted(struct sk_buff *to,
+ const struct sk_buff *from)
+{
+#ifdef CONFIG_TLS_DEVICE
+ to->decrypted = from->decrypted;
+#endif
+}
+
#ifdef NET_SKBUFF_DATA_USES_OFFSET
static inline unsigned char *skb_end_pointer(const struct sk_buff *skb)
{
diff --git a/include/linux/socket.h b/include/linux/socket.h
index 97523818cb14..fc0bed59fc84 100644
--- a/include/linux/socket.h
+++ b/include/linux/socket.h
@@ -292,6 +292,9 @@ struct ucred {
#define MSG_BATCH 0x40000 /* sendmmsg(): more messages coming */
#define MSG_EOF MSG_FIN
#define MSG_NO_SHARED_FRAGS 0x80000 /* sendpage() internal : page frags are not shared */
+#define MSG_SENDPAGE_DECRYPTED 0x100000 /* sendpage() internal : page may carry
+ * plain text and require encryption
+ */
#define MSG_ZEROCOPY 0x4000000 /* Use user data in kernel path */
#define MSG_FASTOPEN 0x20000000 /* Send data in TCP SYN */
diff --git a/include/net/sock.h b/include/net/sock.h
index 228db3998e46..2c53f1a1d905 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -2482,6 +2482,7 @@ static inline bool sk_fullsock(const struct sock *sk)
/* Checks if this SKB belongs to an HW offloaded socket
* and whether any SW fallbacks are required based on dev.
+ * Check decrypted mark in case skb_orphan() cleared socket.
*/
static inline struct sk_buff *sk_validate_xmit_skb(struct sk_buff *skb,
struct net_device *dev)
@@ -2489,8 +2490,15 @@ static inline struct sk_buff *sk_validate_xmit_skb(struct sk_buff *skb,
#ifdef CONFIG_SOCK_VALIDATE_XMIT
struct sock *sk = skb->sk;
- if (sk && sk_fullsock(sk) && sk->sk_validate_xmit_skb)
+ if (sk && sk_fullsock(sk) && sk->sk_validate_xmit_skb) {
skb = sk->sk_validate_xmit_skb(sk, dev, skb);
+#ifdef CONFIG_TLS_DEVICE
+ } else if (unlikely(skb->decrypted)) {
+ pr_warn_ratelimited("unencrypted skb with no associated socket - dropping\n");
+ kfree_skb(skb);
+ skb = NULL;
+#endif
+ }
#endif
return skb;
diff --git a/net/core/sock.c b/net/core/sock.c
index d57b0cc995a0..0f9619b0892f 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -1992,6 +1992,20 @@ void skb_set_owner_w(struct sk_buff *skb, struct sock *sk)
}
EXPORT_SYMBOL(skb_set_owner_w);
+static bool can_skb_orphan_partial(const struct sk_buff *skb)
+{
+#ifdef CONFIG_TLS_DEVICE
+ /* Drivers depend on in-order delivery for crypto offload,
+ * partial orphan breaks out-of-order-OK logic.
+ */
+ if (skb->decrypted)
+ return false;
+#endif
+ return (IS_ENABLED(CONFIG_INET) &&
+ skb->destructor == tcp_wfree) ||
+ skb->destructor == sock_wfree;
+}
+
/* This helper is used by netem, as it can hold packets in its
* delay queue. We want to allow the owner socket to send more
* packets, as if they were already TX completed by a typical driver.
@@ -2003,11 +2017,7 @@ void skb_orphan_partial(struct sk_buff *skb)
if (skb_is_tcp_pure_ack(skb))
return;
- if (skb->destructor == sock_wfree
-#ifdef CONFIG_INET
- || skb->destructor == tcp_wfree
-#endif
- ) {
+ if (can_skb_orphan_partial(skb)) {
struct sock *sk = skb->sk;
if (refcount_inc_not_zero(&sk->sk_refcnt)) {
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 776905899ac0..77b485d60b9d 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -984,6 +984,9 @@ ssize_t do_tcp_sendpages(struct sock *sk, struct page *page, int offset,
if (!skb)
goto wait_for_memory;
+#ifdef CONFIG_TLS_DEVICE
+ skb->decrypted = !!(flags & MSG_SENDPAGE_DECRYPTED);
+#endif
skb_entail(sk, skb);
copy = size_goal;
}
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index 6e4afc48d7bb..979520e46e33 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -1320,6 +1320,7 @@ int tcp_fragment(struct sock *sk, enum tcp_queue tcp_queue,
buff = sk_stream_alloc_skb(sk, nsize, gfp, true);
if (!buff)
return -ENOMEM; /* We'll just try again later. */
+ skb_copy_decrypted(buff, skb);
sk->sk_wmem_queued += buff->truesize;
sk_mem_charge(sk, buff->truesize);
@@ -1874,6 +1875,7 @@ static int tso_fragment(struct sock *sk, struct sk_buff *skb, unsigned int len,
buff = sk_stream_alloc_skb(sk, 0, gfp, true);
if (unlikely(!buff))
return -ENOMEM;
+ skb_copy_decrypted(buff, skb);
sk->sk_wmem_queued += buff->truesize;
sk_mem_charge(sk, buff->truesize);
@@ -2143,6 +2145,7 @@ static int tcp_mtu_probe(struct sock *sk)
sk_mem_charge(sk, nskb->truesize);
skb = tcp_send_head(sk);
+ skb_copy_decrypted(nskb, skb);
TCP_SKB_CB(nskb)->seq = TCP_SKB_CB(skb)->seq;
TCP_SKB_CB(nskb)->end_seq = TCP_SKB_CB(skb)->seq + probe_size;
diff --git a/net/tls/tls_device.c b/net/tls/tls_device.c
index 7c0b2b778703..43922d86e510 100644
--- a/net/tls/tls_device.c
+++ b/net/tls/tls_device.c
@@ -373,9 +373,9 @@ static int tls_push_data(struct sock *sk,
struct tls_context *tls_ctx = tls_get_ctx(sk);
struct tls_prot_info *prot = &tls_ctx->prot_info;
struct tls_offload_context_tx *ctx = tls_offload_ctx_tx(tls_ctx);
- int tls_push_record_flags = flags | MSG_SENDPAGE_NOTLAST;
int more = flags & (MSG_SENDPAGE_NOTLAST | MSG_MORE);
struct tls_record_info *record = ctx->open_record;
+ int tls_push_record_flags;
struct page_frag *pfrag;
size_t orig_size = size;
u32 max_open_record_len;
@@ -390,6 +390,9 @@ static int tls_push_data(struct sock *sk,
if (sk->sk_err)
return -sk->sk_err;
+ flags |= MSG_SENDPAGE_DECRYPTED;
+ tls_push_record_flags = flags | MSG_SENDPAGE_NOTLAST;
+
timeo = sock_sndtimeo(sk, flags & MSG_DONTWAIT);
if (tls_is_partially_sent_record(tls_ctx)) {
rc = tls_push_partial_record(sk, tls_ctx, flags);
@@ -576,7 +579,9 @@ void tls_device_write_space(struct sock *sk, struct tls_context *ctx)
gfp_t sk_allocation = sk->sk_allocation;
sk->sk_allocation = GFP_ATOMIC;
- tls_push_partial_record(sk, ctx, MSG_DONTWAIT | MSG_NOSIGNAL);
+ tls_push_partial_record(sk, ctx,
+ MSG_DONTWAIT | MSG_NOSIGNAL |
+ MSG_SENDPAGE_DECRYPTED);
sk->sk_allocation = sk_allocation;
}
}
--
2.21.0
^ permalink raw reply related [flat|nested] 6+ messages in thread* Re: [PATCH net v2] net/tls: prevent skb_orphan() from leaking TLS plain text with offload
2019-08-07 6:06 [PATCH net v2] net/tls: prevent skb_orphan() from leaking TLS plain text with offload Jakub Kicinski
@ 2019-08-07 16:59 ` Willem de Bruijn
2019-08-07 18:00 ` Jakub Kicinski
0 siblings, 1 reply; 6+ messages in thread
From: Willem de Bruijn @ 2019-08-07 16:59 UTC (permalink / raw)
To: Jakub Kicinski
Cc: David Miller, Network Development, davejwatson, borisp, aviadye,
John Fastabend, Daniel Borkmann, Eric Dumazet, Alexei Starovoitov,
oss-drivers
On Wed, Aug 7, 2019 at 2:06 AM Jakub Kicinski
<jakub.kicinski@netronome.com> wrote:
>
> sk_validate_xmit_skb() and drivers depend on the sk member of
> struct sk_buff to identify segments requiring encryption.
> Any operation which removes or does not preserve the original TLS
> socket such as skb_orphan() or skb_clone() will cause clear text
> leaks.
>
> Make the TCP socket underlying an offloaded TLS connection
> mark all skbs as decrypted, if TLS TX is in offload mode.
> Then in sk_validate_xmit_skb() catch skbs which have no socket
> (or a socket with no validation) and decrypted flag set.
>
> Note that CONFIG_SOCK_VALIDATE_XMIT, CONFIG_TLS_DEVICE and
> sk->sk_validate_xmit_skb are slightly interchangeable right now,
> they all imply TLS offload. The new checks are guarded by
> CONFIG_TLS_DEVICE because that's the option guarding the
> sk_buff->decrypted member.
>
> Second, smaller issue with orphaning is that it breaks
> the guarantee that packets will be delivered to device
> queues in-order. All TLS offload drivers depend on that
> scheduling property. This means skb_orphan_partial()'s
> trick of preserving partial socket references will cause
> issues in the drivers. We need a full orphan, and as a
> result netem delay/throttling will cause all TLS offload
> skbs to be dropped.
>
> Reusing the sk_buff->decrypted flag also protects from
> leaking clear text when incoming, decrypted skb is redirected
> (e.g. by TC).
>
> See commit 0608c69c9a80 ("bpf: sk_msg, sock{map|hash} redirect
> through ULP") for justification why the internal flag is safe.
>
> v2:
> - remove superfluous decrypted mark copy (Willem);
> - remove the stale doc entry (Boris);
> - rely entirely on EOR marking to prevent coalescing (Boris);
> - use an internal sendpages flag instead of marking the socket
> (Boris).
>
> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
> ---
> Documentation/networking/tls-offload.rst | 18 ------------------
> include/linux/skbuff.h | 8 ++++++++
> include/linux/socket.h | 3 +++
> include/net/sock.h | 10 +++++++++-
> net/core/sock.c | 20 +++++++++++++++-----
> net/ipv4/tcp.c | 3 +++
> net/ipv4/tcp_output.c | 3 +++
> net/tls/tls_device.c | 9 +++++++--
> 8 files changed, 48 insertions(+), 26 deletions(-)
> diff --git a/net/core/sock.c b/net/core/sock.c
> index d57b0cc995a0..0f9619b0892f 100644
> --- a/net/core/sock.c
> +++ b/net/core/sock.c
> @@ -1992,6 +1992,20 @@ void skb_set_owner_w(struct sk_buff *skb, struct sock *sk)
> }
> EXPORT_SYMBOL(skb_set_owner_w);
>
> +static bool can_skb_orphan_partial(const struct sk_buff *skb)
> +{
> +#ifdef CONFIG_TLS_DEVICE
> + /* Drivers depend on in-order delivery for crypto offload,
> + * partial orphan breaks out-of-order-OK logic.
> + */
> + if (skb->decrypted)
> + return false;
> +#endif
> + return (IS_ENABLED(CONFIG_INET) &&
> + skb->destructor == tcp_wfree) ||
Please add parentheses around IS_ENABLED(CONFIG_INET) &&
skb->destructor == tcp_wfree
I was also surprised that this works when tcp_wfree is not defined if
!CONFIG_INET. But apparently it does (at -O2?) :)
> @@ -984,6 +984,9 @@ ssize_t do_tcp_sendpages(struct sock *sk, struct page *page, int offset,
> if (!skb)
> goto wait_for_memory;
>
> +#ifdef CONFIG_TLS_DEVICE
> + skb->decrypted = !!(flags & MSG_SENDPAGE_DECRYPTED);
> +#endif
Nothing is stopping userspace from passing this new flag. In send
(tcp_sendmsg_locked) it is ignored. But can it reach do_tcp_sendpages
through tcp_bpf_sendmsg?
> skb_entail(sk, skb);
> copy = size_goal;
> }
> diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
> index 6e4afc48d7bb..979520e46e33 100644
> --- a/net/ipv4/tcp_output.c
> +++ b/net/ipv4/tcp_output.c
> @@ -1320,6 +1320,7 @@ int tcp_fragment(struct sock *sk, enum tcp_queue tcp_queue,
> buff = sk_stream_alloc_skb(sk, nsize, gfp, true);
> if (!buff)
> return -ENOMEM; /* We'll just try again later. */
> + skb_copy_decrypted(buff, skb);
This code has to copy timestamps, tx_flags, zerocopy state and now
this in three locations. Eventually we'll want a single helper for all
of them..
^ permalink raw reply [flat|nested] 6+ messages in thread* Re: [PATCH net v2] net/tls: prevent skb_orphan() from leaking TLS plain text with offload
2019-08-07 16:59 ` Willem de Bruijn
@ 2019-08-07 18:00 ` Jakub Kicinski
2019-08-07 18:46 ` Willem de Bruijn
0 siblings, 1 reply; 6+ messages in thread
From: Jakub Kicinski @ 2019-08-07 18:00 UTC (permalink / raw)
To: Willem de Bruijn
Cc: David Miller, Network Development, davejwatson, borisp, aviadye,
John Fastabend, Daniel Borkmann, Eric Dumazet, Alexei Starovoitov,
oss-drivers
On Wed, 7 Aug 2019 12:59:00 -0400, Willem de Bruijn wrote:
> On Wed, Aug 7, 2019 at 2:06 AM Jakub Kicinski wrote:
> > diff --git a/net/core/sock.c b/net/core/sock.c
> > index d57b0cc995a0..0f9619b0892f 100644
> > --- a/net/core/sock.c
> > +++ b/net/core/sock.c
> > @@ -1992,6 +1992,20 @@ void skb_set_owner_w(struct sk_buff *skb, struct sock *sk)
> > }
> > EXPORT_SYMBOL(skb_set_owner_w);
> >
> > +static bool can_skb_orphan_partial(const struct sk_buff *skb)
> > +{
> > +#ifdef CONFIG_TLS_DEVICE
> > + /* Drivers depend on in-order delivery for crypto offload,
> > + * partial orphan breaks out-of-order-OK logic.
> > + */
> > + if (skb->decrypted)
> > + return false;
> > +#endif
> > + return (IS_ENABLED(CONFIG_INET) &&
> > + skb->destructor == tcp_wfree) ||
>
> Please add parentheses around IS_ENABLED(CONFIG_INET) &&
> skb->destructor == tcp_wfree
Mm.. there are parenthesis around them, maybe I'm being slow,
could you show me how?
> I was also surprised that this works when tcp_wfree is not defined if
> !CONFIG_INET. But apparently it does (at -O2?) :)
I was surprised to but in essence it should work the same as
if (IS_ENABLED(CONFIG_xyz))
call_some_xyz_code();
from compiler's perspective, and we do that a lot. Perhaps kbuild
bot will prove us wrong :)
> > @@ -984,6 +984,9 @@ ssize_t do_tcp_sendpages(struct sock *sk, struct page *page, int offset,
> > if (!skb)
> > goto wait_for_memory;
> >
> > +#ifdef CONFIG_TLS_DEVICE
> > + skb->decrypted = !!(flags & MSG_SENDPAGE_DECRYPTED);
> > +#endif
>
> Nothing is stopping userspace from passing this new flag. In send
> (tcp_sendmsg_locked) it is ignored. But can it reach do_tcp_sendpages
> through tcp_bpf_sendmsg?
Ah, I think you're right, thanks for checking that :( I don't entirely
follow how 0608c69c9a80 ("bpf: sk_msg, sock{map|hash} redirect through
ULP") is safe then.
One option would be to clear the flags kernel would previously ignore
in tcp_bpf_sendmsg(). But I feel like we should just go back to marking
the socket, since we don't need the per-message flexibility of a flag.
WDYT?
> > skb_entail(sk, skb);
> > copy = size_goal;
> > }
> > diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
> > index 6e4afc48d7bb..979520e46e33 100644
> > --- a/net/ipv4/tcp_output.c
> > +++ b/net/ipv4/tcp_output.c
> > @@ -1320,6 +1320,7 @@ int tcp_fragment(struct sock *sk, enum tcp_queue tcp_queue,
> > buff = sk_stream_alloc_skb(sk, nsize, gfp, true);
> > if (!buff)
> > return -ENOMEM; /* We'll just try again later. */
> > + skb_copy_decrypted(buff, skb);
>
> This code has to copy timestamps, tx_flags, zerocopy state and now
> this in three locations. Eventually we'll want a single helper for all
> of them..
Ack, should I take an action on that for net-next or was it a
note-to-self? :)
^ permalink raw reply [flat|nested] 6+ messages in thread* Re: [PATCH net v2] net/tls: prevent skb_orphan() from leaking TLS plain text with offload
2019-08-07 18:00 ` Jakub Kicinski
@ 2019-08-07 18:46 ` Willem de Bruijn
2019-08-07 19:54 ` Willem de Bruijn
2019-08-07 22:49 ` Jakub Kicinski
0 siblings, 2 replies; 6+ messages in thread
From: Willem de Bruijn @ 2019-08-07 18:46 UTC (permalink / raw)
To: Jakub Kicinski
Cc: Willem de Bruijn, David Miller, Network Development, davejwatson,
borisp, aviadye, John Fastabend, Daniel Borkmann, Eric Dumazet,
Alexei Starovoitov, oss-drivers
On Wed, Aug 7, 2019 at 2:01 PM Jakub Kicinski
<jakub.kicinski@netronome.com> wrote:
>
> On Wed, 7 Aug 2019 12:59:00 -0400, Willem de Bruijn wrote:
> > On Wed, Aug 7, 2019 at 2:06 AM Jakub Kicinski wrote:
> > > diff --git a/net/core/sock.c b/net/core/sock.c
> > > index d57b0cc995a0..0f9619b0892f 100644
> > > --- a/net/core/sock.c
> > > +++ b/net/core/sock.c
> > > @@ -1992,6 +1992,20 @@ void skb_set_owner_w(struct sk_buff *skb, struct sock *sk)
> > > }
> > > EXPORT_SYMBOL(skb_set_owner_w);
> > >
> > > +static bool can_skb_orphan_partial(const struct sk_buff *skb)
> > > +{
> > > +#ifdef CONFIG_TLS_DEVICE
> > > + /* Drivers depend on in-order delivery for crypto offload,
> > > + * partial orphan breaks out-of-order-OK logic.
> > > + */
> > > + if (skb->decrypted)
> > > + return false;
> > > +#endif
> > > + return (IS_ENABLED(CONFIG_INET) &&
> > > + skb->destructor == tcp_wfree) ||
> >
> > Please add parentheses around IS_ENABLED(CONFIG_INET) &&
> > skb->destructor == tcp_wfree
>
> Mm.. there are parenthesis around them, maybe I'm being slow,
> could you show me how?
I mean
return (skb->destructor == sock_wfree ||
(IS_ENABLED(CONFIG_INET) && skb->destructor == tcp_wfree))
In other words, (a || (b && c)) instead of (a || b && c). Though the
existing code also eschews the extra parentheses.
> > I was also surprised that this works when tcp_wfree is not defined if
> > !CONFIG_INET. But apparently it does (at -O2?) :)
>
> I was surprised to but in essence it should work the same as
>
> if (IS_ENABLED(CONFIG_xyz))
> call_some_xyz_code();
>
> from compiler's perspective, and we do that a lot. Perhaps kbuild
> bot will prove us wrong :)
>
> > > @@ -984,6 +984,9 @@ ssize_t do_tcp_sendpages(struct sock *sk, struct page *page, int offset,
> > > if (!skb)
> > > goto wait_for_memory;
> > >
> > > +#ifdef CONFIG_TLS_DEVICE
> > > + skb->decrypted = !!(flags & MSG_SENDPAGE_DECRYPTED);
> > > +#endif
> >
> > Nothing is stopping userspace from passing this new flag. In send
> > (tcp_sendmsg_locked) it is ignored. But can it reach do_tcp_sendpages
> > through tcp_bpf_sendmsg?
>
> Ah, I think you're right, thanks for checking that :( I don't entirely
> follow how 0608c69c9a80 ("bpf: sk_msg, sock{map|hash} redirect through
> ULP") is safe then.
>
> One option would be to clear the flags kernel would previously ignore
> in tcp_bpf_sendmsg(). But I feel like we should just go back to marking
> the socket, since we don't need the per-message flexibility of a flag.
>
> WDYT?
I don't feel strongly either way. Passing flags from send through
tcp_bpf_sendmsg is probably unintentional, so should probably be
addressed anyway? Then this is a bit simpler.
> > > skb_entail(sk, skb);
> > > copy = size_goal;
> > > }
> > > diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
> > > index 6e4afc48d7bb..979520e46e33 100644
> > > --- a/net/ipv4/tcp_output.c
> > > +++ b/net/ipv4/tcp_output.c
> > > @@ -1320,6 +1320,7 @@ int tcp_fragment(struct sock *sk, enum tcp_queue tcp_queue,
> > > buff = sk_stream_alloc_skb(sk, nsize, gfp, true);
> > > if (!buff)
> > > return -ENOMEM; /* We'll just try again later. */
> > > + skb_copy_decrypted(buff, skb);
> >
> > This code has to copy timestamps, tx_flags, zerocopy state and now
> > this in three locations. Eventually we'll want a single helper for all
> > of them..
>
> Ack, should I take an action on that for net-next or was it a
> note-to-self? :)
Note-to-self :)
As a matter of fact, your patch showed me that we actually miss the
tstamp case in tcp_mtu_probe..
^ permalink raw reply [flat|nested] 6+ messages in thread* Re: [PATCH net v2] net/tls: prevent skb_orphan() from leaking TLS plain text with offload
2019-08-07 18:46 ` Willem de Bruijn
@ 2019-08-07 19:54 ` Willem de Bruijn
2019-08-07 22:49 ` Jakub Kicinski
1 sibling, 0 replies; 6+ messages in thread
From: Willem de Bruijn @ 2019-08-07 19:54 UTC (permalink / raw)
To: Jakub Kicinski
Cc: David Miller, Network Development, davejwatson, borisp, aviadye,
John Fastabend, Daniel Borkmann, Eric Dumazet, Alexei Starovoitov,
oss-drivers
On Wed, Aug 7, 2019 at 2:47 PM Willem de Bruijn
<willemdebruijn.kernel@gmail.com> wrote:
>
> On Wed, Aug 7, 2019 at 2:01 PM Jakub Kicinski
> <jakub.kicinski@netronome.com> wrote:
> >
> > On Wed, 7 Aug 2019 12:59:00 -0400, Willem de Bruijn wrote:
> > > On Wed, Aug 7, 2019 at 2:06 AM Jakub Kicinski wrote:
> > > > diff --git a/net/core/sock.c b/net/core/sock.c
> > > > index d57b0cc995a0..0f9619b0892f 100644
> > > > --- a/net/core/sock.c
> > > > +++ b/net/core/sock.c
> > > > @@ -1992,6 +1992,20 @@ void skb_set_owner_w(struct sk_buff *skb, struct sock *sk)
> > > > }
> > > > EXPORT_SYMBOL(skb_set_owner_w);
> > > >
> > > > +static bool can_skb_orphan_partial(const struct sk_buff *skb)
> > > > +{
> > > > +#ifdef CONFIG_TLS_DEVICE
> > > > + /* Drivers depend on in-order delivery for crypto offload,
> > > > + * partial orphan breaks out-of-order-OK logic.
> > > > + */
> > > > + if (skb->decrypted)
> > > > + return false;
> > > > +#endif
> > > > + return (IS_ENABLED(CONFIG_INET) &&
> > > > + skb->destructor == tcp_wfree) ||
> > >
> > > Please add parentheses around IS_ENABLED(CONFIG_INET) &&
> > > skb->destructor == tcp_wfree
> >
> > Mm.. there are parenthesis around them, maybe I'm being slow,
> > could you show me how?
>
> I mean
>
> return (skb->destructor == sock_wfree ||
> (IS_ENABLED(CONFIG_INET) && skb->destructor == tcp_wfree))
>
> In other words, (a || (b && c)) instead of (a || b && c). Though the
> existing code also eschews the extra parentheses.
No, ignore that last bit. It uses #ifdef, of course.
^ permalink raw reply [flat|nested] 6+ messages in thread* Re: [PATCH net v2] net/tls: prevent skb_orphan() from leaking TLS plain text with offload
2019-08-07 18:46 ` Willem de Bruijn
2019-08-07 19:54 ` Willem de Bruijn
@ 2019-08-07 22:49 ` Jakub Kicinski
1 sibling, 0 replies; 6+ messages in thread
From: Jakub Kicinski @ 2019-08-07 22:49 UTC (permalink / raw)
To: Willem de Bruijn
Cc: David Miller, Network Development, davejwatson, borisp, aviadye,
John Fastabend, Daniel Borkmann, Eric Dumazet, Alexei Starovoitov,
oss-drivers
On Wed, 7 Aug 2019 14:46:23 -0400, Willem de Bruijn wrote:
> > > > @@ -984,6 +984,9 @@ ssize_t do_tcp_sendpages(struct sock *sk, struct page *page, int offset,
> > > > if (!skb)
> > > > goto wait_for_memory;
> > > >
> > > > +#ifdef CONFIG_TLS_DEVICE
> > > > + skb->decrypted = !!(flags & MSG_SENDPAGE_DECRYPTED);
> > > > +#endif
> > >
> > > Nothing is stopping userspace from passing this new flag. In send
> > > (tcp_sendmsg_locked) it is ignored. But can it reach do_tcp_sendpages
> > > through tcp_bpf_sendmsg?
> >
> > Ah, I think you're right, thanks for checking that :( I don't entirely
> > follow how 0608c69c9a80 ("bpf: sk_msg, sock{map|hash} redirect through
> > ULP") is safe then.
> >
> > One option would be to clear the flags kernel would previously ignore
> > in tcp_bpf_sendmsg(). But I feel like we should just go back to marking
> > the socket, since we don't need the per-message flexibility of a flag.
> >
> > WDYT?
>
> I don't feel strongly either way. Passing flags from send through
> tcp_bpf_sendmsg is probably unintentional, so should probably be
> addressed anyway? Then this is a bit simpler.
FWIW I had a closer look at the tcp_bpf_sendmsg() flags, and
MSG_SENDPAGE_NOPOLICY should be okay to let though there.
That flag is only meaningful to tls in case sockmap is layered
on top of tls and we'd always set it before calling tls.
v3 coming soon..
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2019-08-07 22:50 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2019-08-07 6:06 [PATCH net v2] net/tls: prevent skb_orphan() from leaking TLS plain text with offload Jakub Kicinski
2019-08-07 16:59 ` Willem de Bruijn
2019-08-07 18:00 ` Jakub Kicinski
2019-08-07 18:46 ` Willem de Bruijn
2019-08-07 19:54 ` Willem de Bruijn
2019-08-07 22:49 ` Jakub Kicinski
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox