* [PATCH net] net: tls: avoid discarding data on record close
@ 2023-08-04 22:59 Jakub Kicinski
2023-08-06 7:40 ` patchwork-bot+netdevbpf
0 siblings, 1 reply; 2+ messages in thread
From: Jakub Kicinski @ 2023-08-04 22:59 UTC (permalink / raw)
To: davem
Cc: netdev, edumazet, pabeni, Jakub Kicinski, borisp, john.fastabend,
dirk.vandermerwe, Tariq Toukan
TLS records end with a 16B tag. For TLS device offload we only
need to make space for this tag in the stream, the device will
generate and replace it with the actual calculated tag.
Long time ago the code would just re-reference the head frag
which mostly worked but was suboptimal because it prevented TCP
from combining the record into a single skb frag. I'm not sure
if it was correct as the first frag may be shorter than the tag.
The commit under fixes tried to replace that with using the page
frag and if the allocation failed rolling back the data, if record
was long enough. It achieves better fragment coalescing but is
also buggy.
We don't roll back the iterator, so unless we're at the end of
send we'll skip the data we designated as tag and start the
next record as if the rollback never happened.
There's also the possibility that the record was constructed
with MSG_MORE and the data came from a different syscall and
we already told the user space that we "got it".
Allocate a single dummy page and use it as fallback.
Found by code inspection, and proven by forcing allocation
failures.
Fixes: e7b159a48ba6 ("net/tls: remove the record tail optimization")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
---
CC: borisp@nvidia.com
CC: john.fastabend@gmail.com
CC: dirk.vandermerwe@netronome.com
CC: Tariq Toukan <tariqt@nvidia.com>
Tariq, this is the "another bug" I mentioned yesterday.
The fix won't help the bug you were reporting, tho.
---
net/tls/tls_device.c | 64 +++++++++++++++++++++++---------------------
1 file changed, 33 insertions(+), 31 deletions(-)
diff --git a/net/tls/tls_device.c b/net/tls/tls_device.c
index 2021fe557e50..529101eb20bd 100644
--- a/net/tls/tls_device.c
+++ b/net/tls/tls_device.c
@@ -52,6 +52,8 @@ static LIST_HEAD(tls_device_list);
static LIST_HEAD(tls_device_down_list);
static DEFINE_SPINLOCK(tls_device_lock);
+static struct page *dummy_page;
+
static void tls_device_free_ctx(struct tls_context *ctx)
{
if (ctx->tx_conf == TLS_HW) {
@@ -312,36 +314,33 @@ static int tls_push_record(struct sock *sk,
return tls_push_sg(sk, ctx, offload_ctx->sg_tx_data, 0, flags);
}
-static int tls_device_record_close(struct sock *sk,
- struct tls_context *ctx,
- struct tls_record_info *record,
- struct page_frag *pfrag,
- unsigned char record_type)
+static void tls_device_record_close(struct sock *sk,
+ struct tls_context *ctx,
+ struct tls_record_info *record,
+ struct page_frag *pfrag,
+ unsigned char record_type)
{
struct tls_prot_info *prot = &ctx->prot_info;
- int ret;
+ struct page_frag dummy_tag_frag;
/* append tag
* device will fill in the tag, we just need to append a placeholder
* use socket memory to improve coalescing (re-using a single buffer
* increases frag count)
- * if we can't allocate memory now, steal some back from data
+ * if we can't allocate memory now use the dummy page
*/
- if (likely(skb_page_frag_refill(prot->tag_size, pfrag,
- sk->sk_allocation))) {
- ret = 0;
- tls_append_frag(record, pfrag, prot->tag_size);
- } else {
- ret = prot->tag_size;
- if (record->len <= prot->overhead_size)
- return -ENOMEM;
+ if (unlikely(pfrag->size - pfrag->offset < prot->tag_size) &&
+ !skb_page_frag_refill(prot->tag_size, pfrag, sk->sk_allocation)) {
+ dummy_tag_frag.page = dummy_page;
+ dummy_tag_frag.offset = 0;
+ pfrag = &dummy_tag_frag;
}
+ tls_append_frag(record, pfrag, prot->tag_size);
/* fill prepend */
tls_fill_prepend(ctx, skb_frag_address(&record->frags[0]),
record->len - prot->overhead_size,
record_type);
- return ret;
}
static int tls_create_new_record(struct tls_offload_context_tx *offload_ctx,
@@ -541,18 +540,8 @@ static int tls_push_data(struct sock *sk,
if (done || record->len >= max_open_record_len ||
(record->num_frags >= MAX_SKB_FRAGS - 1)) {
- rc = tls_device_record_close(sk, tls_ctx, record,
- pfrag, record_type);
- if (rc) {
- if (rc > 0) {
- size += rc;
- } else {
- size = orig_size;
- destroy_record(record);
- ctx->open_record = NULL;
- break;
- }
- }
+ tls_device_record_close(sk, tls_ctx, record,
+ pfrag, record_type);
rc = tls_push_record(sk,
tls_ctx,
@@ -1450,14 +1439,26 @@ int __init tls_device_init(void)
{
int err;
- destruct_wq = alloc_workqueue("ktls_device_destruct", 0, 0);
- if (!destruct_wq)
+ dummy_page = alloc_page(GFP_KERNEL);
+ if (!dummy_page)
return -ENOMEM;
+ destruct_wq = alloc_workqueue("ktls_device_destruct", 0, 0);
+ if (!destruct_wq) {
+ err = -ENOMEM;
+ goto err_free_dummy;
+ }
+
err = register_netdevice_notifier(&tls_dev_notifier);
if (err)
- destroy_workqueue(destruct_wq);
+ goto err_destroy_wq;
+ return 0;
+
+err_destroy_wq:
+ destroy_workqueue(destruct_wq);
+err_free_dummy:
+ put_page(dummy_page);
return err;
}
@@ -1466,4 +1467,5 @@ void __exit tls_device_cleanup(void)
unregister_netdevice_notifier(&tls_dev_notifier);
destroy_workqueue(destruct_wq);
clean_acked_data_flush();
+ put_page(dummy_page);
}
--
2.41.0
^ permalink raw reply related [flat|nested] 2+ messages in thread
* Re: [PATCH net] net: tls: avoid discarding data on record close
2023-08-04 22:59 [PATCH net] net: tls: avoid discarding data on record close Jakub Kicinski
@ 2023-08-06 7:40 ` patchwork-bot+netdevbpf
0 siblings, 0 replies; 2+ messages in thread
From: patchwork-bot+netdevbpf @ 2023-08-06 7:40 UTC (permalink / raw)
To: Jakub Kicinski
Cc: davem, netdev, edumazet, pabeni, borisp, john.fastabend,
dirk.vandermerwe, tariqt
Hello:
This patch was applied to netdev/net.git (main)
by David S. Miller <davem@davemloft.net>:
On Fri, 4 Aug 2023 15:59:51 -0700 you wrote:
> TLS records end with a 16B tag. For TLS device offload we only
> need to make space for this tag in the stream, the device will
> generate and replace it with the actual calculated tag.
>
> Long time ago the code would just re-reference the head frag
> which mostly worked but was suboptimal because it prevented TCP
> from combining the record into a single skb frag. I'm not sure
> if it was correct as the first frag may be shorter than the tag.
>
> [...]
Here is the summary with links:
- [net] net: tls: avoid discarding data on record close
https://git.kernel.org/netdev/net/c/6b47808f223c
You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html
^ permalink raw reply [flat|nested] 2+ messages in thread
end of thread, other threads:[~2023-08-06 7:40 UTC | newest]
Thread overview: 2+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-08-04 22:59 [PATCH net] net: tls: avoid discarding data on record close Jakub Kicinski
2023-08-06 7:40 ` patchwork-bot+netdevbpf
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).