* TLS zerocopy sendfile offset causes data corruption
From: Adrien Moulin @ 2023-03-03 12:07 UTC (permalink / raw)
To: Boris Pismenny, John Fastabend, Jakub Kicinski
Cc: netdev, linux-kernel, Tariq Toukan
Hi,
When doing a sendfile call on a TLS_TX_ZEROCOPY_RO-enabled socket with an offset that is neither zero nor 4k-aligned, and with a "count" bigger than a single TLS record, part of the data received will be corrupted.
I am seeing this on 5.19 and 6.2.1 (x86_64) with a ConnectX-6 Dx NIC. TLS NIC offload, including sendfile, otherwise works perfectly when TLS_TX_ZEROCOPY_RO is not used.
I have a simple reproducer program available here: https://gist.github.com/elyosh/922e6c15f8d4d7102c8ac9508b0cdc3b
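For readers without the reproducer at hand, the call sequence being described looks roughly like the sketch below. It is not the code from the gist. The helper name `send_range_zerocopy` is made up for illustration, the socket is assumed to already have the "tls" ULP attached and TLS_TX keys installed, and the fallback constants mirror the values in the kernel's uapi headers for toolchains whose headers predate 5.19.

```c
#include <errno.h>
#include <sys/sendfile.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Fallbacks for older userspace headers; values as in the kernel uapi. */
#ifndef SOL_TLS
#define SOL_TLS 282
#endif
#ifndef TLS_TX_ZEROCOPY_RO
#define TLS_TX_ZEROCOPY_RO 3
#endif

/*
 * Hypothetical helper: enable zerocopy sendfile on an already configured
 * kTLS socket, then send `count` bytes of `file_fd` starting at `offset`.
 * Returns the number of bytes sent, or -1 with errno set.
 */
static ssize_t send_range_zerocopy(int sock_fd, int file_fd,
                                   off_t offset, size_t count)
{
    int one = 1;

    if (setsockopt(sock_fd, SOL_TLS, TLS_TX_ZEROCOPY_RO,
                   &one, sizeof(one)) < 0)
        return -1;

    /* A non-page-aligned offset such as 8 is what triggers the report. */
    return sendfile(sock_fd, file_fd, &offset, count);
}
```

In the runs that follow, this would correspond to an offset of 8 and a count of 32760.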
Doing a sendfile of a 32K file with an 8-byte offset, first without zerocopy:
# ./ktls_test -i testfile -p 443 -c cert.pem -k key.pem -o 8
Serving file testfile, will send 32760 bytes (8 - 32768) with SHA1 sum 83fc1e3900cf900025311f2c27378a357f9f4d2c
sendfile(5, 3, 8, 32760) = 32760
% wget -S -q -O test_copy https://xxxxxx/; shasum test_copy
HTTP/1.1 200 OK
Content-Type: application/octet-stream
Content-Length: 32760
X-Source-SHA1: 83fc1e3900cf900025311f2c27378a357f9f4d2c
83fc1e3900cf900025311f2c27378a357f9f4d2c test_copy
The same with TLS_TX_ZEROCOPY_RO enabled; the received data is corrupted:
# ./ktls_test -i testfile -p 443 -c cert.pem -k key.pem -o 8 -z
Serving file testfile, will send 32760 bytes (8 - 32768) with SHA1 sum 83fc1e3900cf900025311f2c27378a357f9f4d2c
TLS_TX_ZEROCOPY_RO enabled
sendfile(5, 3, 8, 32760) = 32760
% wget -S -q -O test_zerocopy https://xxxxxx/; shasum test_zerocopy
HTTP/1.1 200 OK
Content-Type: application/octet-stream
Content-Length: 32760
X-Source-SHA1: 83fc1e3900cf900025311f2c27378a357f9f4d2c
03374f669f98d5f56837660a3817ce1d2a2819f8 test_zerocopy
% diff -U 1 -d <(xxd test_copy) <(xxd test_zerocopy)
--- /dev/fd/11 2023-03-03 10:13:26
+++ /dev/fd/12 2023-03-03 10:13:26
@@ -1087,3 +1087,3 @@
000043e0: 1010 1010 1010 1010 1010 1010 1010 1010 ................
-000043f0: 1010 1010 1010 1010 1111 1111 1111 1111 ................
+000043f0: 1010 1010 1010 1010 1010 1010 1010 1010 ................
00004400: 1111 1111 1111 1111 1111 1111 1111 1111 ................
@@ -1151,3 +1151,3 @@
000047e0: 1111 1111 1111 1111 1111 1111 1111 1111 ................
-000047f0: 1111 1111 1111 1111 1212 1212 1212 1212 ................
+000047f0: 1111 1111 1111 1111 1111 1111 1111 1111 ................
00004800: 1212 1212 1212 1212 1212 1212 1212 1212 ................
@@ -1215,3 +1215,3 @@
00004be0: 1212 1212 1212 1212 1212 1212 1212 1212 ................
-00004bf0: 1212 1212 1212 1212 1313 1313 1313 1313 ................
+00004bf0: 1212 1212 1212 1212 1212 1212 1212 1212 ................
00004c00: 1313 1313 1313 1313 1313 1313 1313 1313 ................
For context, I noticed this issue while trying to serve cached files with nginx. For static files this works fine (the sendfile offset is zero at first, then 16k-aligned), but cached files are stored with a ~500-byte header that is skipped in the sendfile call, which triggers this issue.
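One way to see why only unaligned offsets with a large count are affected is to count how many page fragments would cross a TLS record boundary. The sketch below is a simplified model, not kernel code: it assumes 4096-byte pages, a 16384-byte maximum record payload, data handed over one page fragment at a time, and records that are always filled completely. The function name `straddling_frags` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SZ        4096u   /* assumed page size (x86_64) */
#define TLS_RECORD_MAX 16384u  /* assumed maximum TLS record payload */

/*
 * Count the page fragments that cross a record boundary when the byte
 * range [offset, offset + count) is queued one page fragment at a time.
 */
static unsigned straddling_frags(size_t offset, size_t count)
{
    size_t sent = 0;          /* bytes queued so far */
    unsigned straddles = 0;

    assert(count > 0);
    while (sent < count) {
        /* Fragment runs from the current position to the end of its page. */
        size_t in_page = (offset + sent) % PAGE_SZ;
        size_t frag = PAGE_SZ - in_page;
        if (frag > count - sent)
            frag = count - sent;

        /* Space left in the record that is currently being filled. */
        size_t room = TLS_RECORD_MAX - (sent % TLS_RECORD_MAX);
        if (frag > room)
            straddles++;

        sent += frag;
    }
    return straddles;
}
```

Under these assumptions, `straddling_frags(8, 32760)` gives 1, page-aligned offsets such as 0 or 16384 give 0, and an offset of 8 with a count of exactly 16384 also gives 0, which agrees with the conditions described at the top of this message.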
Best regards
--
Adrien Moulin
* Re: TLS zerocopy sendfile offset causes data corruption
From: Jakub Kicinski @ 2023-03-04 1:17 UTC (permalink / raw)
To: Adrien Moulin
Cc: Boris Pismenny, John Fastabend, netdev, linux-kernel,
Tariq Toukan
On Fri, 3 Mar 2023 13:07:15 +0100 (CET) Adrien Moulin wrote:
> When doing a sendfile call on a TLS_TX_ZEROCOPY_RO-enabled socket with an offset that is neither zero nor 4k-aligned, and with a "count" bigger than a single TLS record, part of the data received will be corrupted.
>
> I am seeing this on 5.19 and 6.2.1 (x86_64) with a ConnectX-6 Dx NIC, with TLS NIC offload including sendfile otherwise working perfectly when not using TLS_TX_ZEROCOPY_RO.
> I have a simple reproducer program available here https://gist.github.com/elyosh/922e6c15f8d4d7102c8ac9508b0cdc3b
Would you be able to test potential fixes? Unfortunately testing
requires access to the right HW :(
I think the offset needs to be incremented, so:
diff --git a/net/tls/tls_device.c b/net/tls/tls_device.c
index 6c593788dc25..a7cc4f9faac2 100644
--- a/net/tls/tls_device.c
+++ b/net/tls/tls_device.c
@@ -508,6 +508,8 @@ static int tls_push_data(struct sock *sk,
 			zc_pfrag.offset = iter_offset.offset;
 			zc_pfrag.size = copy;
 			tls_append_frag(record, &zc_pfrag, copy);
+
+			iter_offset.offset += copy;
 		} else if (copy) {
 			copy = min_t(size_t, copy, pfrag->size - pfrag->offset);
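The effect of the missing increment can be illustrated with a small user-space model. This is a sketch under stated assumptions, not the tls_device.c code: pages are 4096 bytes, records hold at most 16384 bytes, and the test file is assumed to be filled so that every 1 KiB block contains its own index, which is consistent with the hexdump in the report. All function names here are hypothetical. The `fixed` flag switches the source-offset increment on and off.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PAGE_SZ        4096u   /* assumed page size */
#define TLS_RECORD_MAX 16384u  /* assumed maximum record payload */

/* Assumed test pattern: every 1 KiB block is filled with its index. */
static void fill_pattern(uint8_t *buf, size_t n)
{
    for (size_t i = 0; i < n; i++)
        buf[i] = (uint8_t)(i / 1024);
}

/*
 * Append `len` bytes of `page`, starting at `page_off`, to `out`. A new
 * record starts every TLS_RECORD_MAX bytes, so a fragment may be split.
 * With fixed == 0 the source offset is not advanced after a partial copy.
 */
static size_t push_frag(uint8_t *out, size_t out_len, const uint8_t *page,
                        size_t page_off, size_t len, int fixed)
{
    while (len > 0) {
        size_t room = TLS_RECORD_MAX - (out_len % TLS_RECORD_MAX);
        size_t copy = len < room ? len : room;

        memcpy(out + out_len, page + page_off, copy);
        out_len += copy;
        len -= copy;
        if (fixed)
            page_off += copy;  /* models "iter_offset.offset += copy" */
    }
    return out_len;
}

/* Walk [offset, offset + count) of `file` one page fragment at a time. */
static size_t model_sendfile(uint8_t *out, const uint8_t *file,
                             size_t offset, size_t count, int fixed)
{
    size_t out_len = 0, pos = offset, end = offset + count;

    assert(count > 0);
    while (pos < end) {
        size_t page_off = pos % PAGE_SZ;
        size_t len = PAGE_SZ - page_off;
        if (len > end - pos)
            len = end - pos;
        out_len = push_frag(out, out_len, file + (pos - page_off),
                            page_off, len, fixed);
        pos += len;
    }
    return out_len;
}

/* Index of the first differing byte, or -1 if the buffers are equal. */
static long first_mismatch(const uint8_t *a, const uint8_t *b, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (a[i] != b[i])
            return (long)i;
    return -1;
}

/* Number of differing bytes between the two buffers. */
static size_t count_mismatches(const uint8_t *a, const uint8_t *b, size_t n)
{
    size_t diff = 0;
    for (size_t i = 0; i < n; i++)
        diff += (a[i] != b[i]);
    return diff;
}
```

Running the model on a 32768-byte patterned buffer with offset 8 and count 32760, the output with `fixed == 0` first differs from the output with `fixed == 1` at byte 0x43f8, and 24 bytes differ in total. That matches the three 8-byte hunks at 0x43f8, 0x47f8 and 0x4bf8 in the reported diff: the remainder of the page that straddles the first record boundary is re-sent from page offset 0 instead of offset 8.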
* Re: TLS zerocopy sendfile offset causes data corruption
From: Adrien Moulin @ 2023-03-04 10:40 UTC (permalink / raw)
To: Jakub Kicinski
Cc: Boris Pismenny, John Fastabend, netdev, linux-kernel,
Tariq Toukan
> From: "Jakub Kicinski" <kuba@kernel.org>
> Would you be able to test potential fixes? Unfortunately testing
> requires access to the right HW :(
>
> I think the offset needs to be incremented, so:
I confirm that this completely fixes the issue in my testing, thanks!