* [PATCH net] tls: avoid zc receive for file-backed pages
From: Yeonju Bae @ 2026-05-21 16:53 UTC
To: netdev
Cc: gregkh, security, john.fastabend, kuba, sd, davem, edumazet,
pabeni, horms, linux-kernel, Yeonju Bae
kTLS RX zc decrypt writes unauthenticated AEAD output directly into
pages pinned from the recvmsg iterator via tls_setup_from_iter().
For MAP_SHARED, PROT_WRITE file-backed destinations, those pages are
live page-cache pages rather than anonymous copies: MAP_SHARED does not
trigger copy-on-write, so FOLL_WRITE returns the actual page-cache page.

crypto_aead_decrypt() writes CTR-mode decryption output into the
scatter-gather list before the authentication tag is verified. If the
tag check fails (-EBADMSG), the plaintext-like output is already
resident in the page-cache page. exit_free_pages() calls put_page()
without any content cleanup, so the modification persists through the
backing file. An independent open(O_RDONLY)/read() of the same file
returns different content and its SHA-256 changes. MAP_PRIVATE is safe
via COW; PROT_READ-only destinations fail at iov_iter_get_pages2()
before any decryption occurs.
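
The MAP_SHARED versus MAP_PRIVATE distinction above can be observed from
userspace without kTLS at all. The following is a minimal sketch, not the
reproducer used for this patch: write_reaches_file() is a hypothetical helper
name, and the temporary path under /tmp is an assumption. It writes one byte
through a mapping and then reads the file back through an independent
O_RDONLY descriptor.

```c
#include <fcntl.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper (illustration only): map a fresh one-page temporary
 * file with the given flags (MAP_SHARED or MAP_PRIVATE), store one byte
 * through the mapping, then read the file via a separate O_RDONLY fd.
 * Returns 1 if the store is visible in the file, 0 if not, -1 on error.
 */
static int write_reaches_file(int map_flags)
{
	char path[] = "/tmp/zc-demo-XXXXXX";	/* assumed writable location */
	long page = sysconf(_SC_PAGESIZE);
	char byte = 0;
	int result = -1;
	int fd, rfd;
	char *p;

	if (page <= 0)
		return -1;
	fd = mkstemp(path);
	if (fd < 0)
		return -1;
	if (ftruncate(fd, page) != 0)		/* file now holds zero bytes */
		goto out;
	p = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE, map_flags, fd, 0);
	if (p == MAP_FAILED)
		goto out;

	p[0] = 'X';				/* store through the mapping */

	rfd = open(path, O_RDONLY);		/* independent reader */
	if (rfd >= 0) {
		if (read(rfd, &byte, 1) == 1)
			result = (byte == 'X');
		close(rfd);
	}
	munmap(p, (size_t)page);
out:
	close(fd);
	unlink(path);
	return result;
}
```

On a typical Linux filesystem, write_reaches_file(MAP_SHARED) is expected to
return 1 and write_reaches_file(MAP_PRIVATE) to return 0, since the private
mapping takes a copy-on-write page on the first store. This shows only the
mapping semantics; it does not exercise the kTLS receive path.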

Avoid zc receive for file-backed destination pages. In
tls_setup_from_iter(), after iov_iter_get_pages2() pins pages, check
each page with folio_mapping(page_folio(page)). If any pinned page is
file-backed (mapping != NULL), release the pinned pages and return
-EOPNOTSUPP. Handle -EOPNOTSUPP in tls_decrypt_sw() by clearing
darg->zc and retrying, which causes tls_decrypt_sg() to allocate a
kernel bounce buffer instead. Decryption output never reaches the
file-backed page; on tag failure the bounce buffer is discarded.
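
The per-page check needs to know how many pages were pinned, which the patch
derives from the byte count and the offset into the first page. A userspace
sketch of that arithmetic follows; the function names are hypothetical, and
the closed form is an alternative shown for comparison (it assumes the offset
is smaller than the page size), not what the patch uses.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Count the pages touched by `len` bytes that start `offset` bytes into the
 * first page, walking page by page the way the patch's loop does.
 */
static size_t pages_spanned_loop(size_t offset, size_t len, size_t page_size)
{
	size_t np = 0;

	assert(page_size > 0 && offset < page_size);
	while (len > 0) {
		size_t chunk = page_size - offset;	/* room left in this page */

		if (chunk > len)
			chunk = len;
		len -= chunk;
		offset = 0;	/* later pages start at offset 0 */
		np++;
	}
	return np;
}

/* Same count as a rounded-up division (the DIV_ROUND_UP idiom). */
static size_t pages_spanned_closed(size_t offset, size_t len, size_t page_size)
{
	assert(page_size > 0 && offset < page_size);
	return len ? (offset + len + page_size - 1) / page_size : 0;
}
```

For example, 4096 bytes starting at offset 100 in a 4096-byte page span two
pages under both functions.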

This follows the existing opportunistic zc retry pattern already used
for TLS 1.3 record type mismatches in tls_decrypt_sw().
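
The control flow of the fallback can be modelled in userspace. This is only a
sketch of the idea under stated assumptions: struct rx_dest,
setup_zero_copy(), model_decrypt() and receive_record() are hypothetical
stand-ins, the "decrypt" is a plain copy that reports a tag failure after
writing its output, and the 256-byte bounce buffer size is arbitrary. It is
not the kernel code.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of a receive destination. */
struct rx_dest {
	char *buf;
	size_t len;
	bool file_backed;
};

/* Stand-in for the zero-copy setup step: refuse file-backed destinations. */
static int setup_zero_copy(const struct rx_dest *d)
{
	return d->file_backed ? -EOPNOTSUPP : 0;
}

/* Stand-in for AEAD decrypt: output is written before the tag verdict. */
static int model_decrypt(char *out, const char *in, size_t len, bool tag_ok)
{
	memcpy(out, in, len);
	return tag_ok ? 0 : -EBADMSG;
}

static int receive_record(struct rx_dest *d, const char *rec, size_t len,
			  bool tag_ok, bool zc)
{
	char bounce[256];
	int err;

	if (len > d->len || len > sizeof(bounce))
		return -EMSGSIZE;

	if (zc) {
		err = setup_zero_copy(d);
		if (err == -EOPNOTSUPP)		/* retry once without zc */
			return receive_record(d, rec, len, tag_ok, false);
		if (err)
			return err;
		/* zero-copy: output lands directly in the destination */
		return model_decrypt(d->buf, rec, len, tag_ok);
	}

	/* bounce path: the destination is touched only after the tag passes */
	err = model_decrypt(bounce, rec, len, tag_ok);
	if (err)
		return err;
	memcpy(d->buf, bounce, len);
	return 0;
}
```

In this model a file-backed destination is left untouched when the tag check
fails, while a destination that takes the zero-copy path still receives the
unauthenticated output, as it does today.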

Verified on linux-7.0-rc3 QEMU (x86-64), four destination types:
  MAP_SHARED+RW:   file_changed=0 (was 4077/4096 bytes before patch)
  MAP_PRIVATE+RW:  file_changed=0 (COW isolation; unchanged)
  anonymous heap:  no file backing (unchanged)
  PROT_READ only:  file_changed=0 (EFAULT before decrypt; unchanged)

Signed-off-by: Yeonju Bae <iwasbaeyz@gmail.com>
---
net/tls/tls_sw.c | 37 +++++++++++++++++++++++++++++++++++++
1 file changed, 37 insertions(+)
diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c
index a977b0434..c312a83b4 100644
--- a/net/tls/tls_sw.c
+++ b/net/tls/tls_sw.c
@@ -36,6 +36,7 @@
*/
#include <linux/bug.h>
+#include <linux/pagemap.h>
#include <linux/sched/signal.h>
#include <linux/module.h>
#include <linux/kernel.h>
@@ -1443,6 +1444,34 @@ static int tls_setup_from_iter(struct iov_iter *from,
length -= copied;
size += copied;
+ /* Reject file-backed destination pages. Writing unauthenticated
+ * AEAD output into a page-cache page before tag verification
+ * leaves the backing file modified even when recvmsg() returns
+ * -EBADMSG. Return -EOPNOTSUPP so the caller retries via the
+ * non-ZC bounce-buffer path.
+ */
+ {
+ ssize_t remain = copied;
+ size_t off = offset;
+ int np = 0, j;
+
+ while (remain > 0) {
+ remain -= min_t(ssize_t, remain,
+ (ssize_t)(PAGE_SIZE - off));
+ off = 0;
+ np++;
+ }
+ for (j = 0; j < np; j++) {
+ if (folio_mapping(page_folio(pages[j]))) {
+ int k;
+
+ for (k = 0; k < np; k++)
+ put_page(pages[k]);
+ rc = -EOPNOTSUPP;
+ goto out;
+ }
+ }
+ }
while (copied) {
use = min_t(int, copied, PAGE_SIZE - offset);
@@ -1699,6 +1728,14 @@ tls_decrypt_sw(struct sock *sk, struct tls_context *tls_ctx,
if (err < 0) {
if (err == -EBADMSG)
TLS_INC_STATS(sock_net(sk), LINUX_MIB_TLSDECRYPTERROR);
+ if (err == -EOPNOTSUPP && darg->zc) {
+ /* tls_setup_from_iter detected file-backed destination
+ * pages; retry without ZC via the bounce-buffer path.
+ */
+ darg->zc = false;
+ TLS_INC_STATS(sock_net(sk), LINUX_MIB_TLSDECRYPTRETRY);
+ return tls_decrypt_sw(sk, tls_ctx, msg, darg);
+ }
return err;
}
/* keep going even for ->async, the code below is TLS 1.3 */
--
2.43.0
* Re: [PATCH net] tls: avoid zc receive for file-backed pages
From: Jakub Kicinski @ 2026-05-25 17:54 UTC
To: Alexander Viro
Cc: Yeonju Bae, netdev, gregkh, security, john.fastabend, sd, davem,
edumazet, pabeni, horms, linux-kernel, linux-fsdevel
On Fri, 22 May 2026 01:53:28 +0900 Yeonju Bae wrote:
> kTLS RX zc decrypt writes unauthenticated AEAD output directly into
> pages pinned from the recvmsg iterator via tls_setup_from_iter().
> For MAP_SHARED, PROT_WRITE file-backed destinations, those pages are
> live page-cache pages rather than anonymous copies: MAP_SHARED does not
> trigger copy-on-write, so FOLL_WRITE returns the actual page-cache page.
>
> crypto_aead_decrypt() writes CTR-mode decryption output into the
> scatter-gather list before the authentication tag is verified. If the
> tag check fails (-EBADMSG), the plaintext-like output is already
> resident in the page-cache page. exit_free_pages() calls put_page()
> without any content cleanup, so the modification persists through the
> backing file. An independent open(O_RDONLY)/read() of the same file
> returns different content and its SHA-256 changes. MAP_PRIVATE is safe
> via COW; PROT_READ-only destinations fail at iov_iter_get_pages2()
> before any decryption occurs.
>
> Avoid zc receive for file-backed destination pages. In
> tls_setup_from_iter(), after iov_iter_get_pages2() pins pages, check
> each page with folio_mapping(page_folio(page)). If any pinned page is
> file-backed (mapping != NULL), release the pinned pages and return
> -EOPNOTSUPP. Handle -EOPNOTSUPP in tls_decrypt_sw() by clearing
> darg->zc and retrying, which causes tls_decrypt_sg() to allocate a
> kernel bounce buffer instead. Decryption output never reaches the
> file-backed page; on tag failure the bounce buffer is discarded.
>
> This follows the existing opportunistic zc retry pattern already used
> for TLS 1.3 record type mismatches in tls_decrypt_sw().
>
> Verified on linux-7.0-rc3 QEMU (x86-64), four destination types:
> MAP_SHARED+RW: file_changed=0 (was 4077/4096 bytes before patch)
> MAP_PRIVATE+RW: file_changed=0 (COW isolation; unchanged)
> anonymous heap: no file backing (unchanged)
> PROT_READ only: file_changed=0 (EFAULT before decrypt; unchanged)

I'm not seeing anything unusual here from the high-level API use side.
We feed the iov_iter constructed by recvmsg in socket code into
iov_iter_get_pages2(). Either:
 - the way we construct the iov_iter is wrong; or
 - iov_iter_get_pages2() should return an error; or
 - we should use a different iov_* API; or
 - the current behavior you describe is expected / correct.

I don't think that TLS open-coding page checks is the right move.

Al, would you mind glancing over this?
I have no idea what the expected page cache behavior is in this scenario.

> diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c
> index a977b0434..c312a83b4 100644
> --- a/net/tls/tls_sw.c
> +++ b/net/tls/tls_sw.c
> @@ -36,6 +36,7 @@
> */
>
> #include <linux/bug.h>
> +#include <linux/pagemap.h>
> #include <linux/sched/signal.h>
> #include <linux/module.h>
> #include <linux/kernel.h>
> @@ -1443,6 +1444,34 @@ static int tls_setup_from_iter(struct iov_iter *from,
>
> length -= copied;
> size += copied;
> + /* Reject file-backed destination pages. Writing unauthenticated
> + * AEAD output into a page-cache page before tag verification
> + * leaves the backing file modified even when recvmsg() returns
> + * -EBADMSG. Return -EOPNOTSUPP so the caller retries via the
> + * non-ZC bounce-buffer path.
> + */
> + {
> + ssize_t remain = copied;
> + size_t off = offset;
> + int np = 0, j;
> +
> + while (remain > 0) {
> + remain -= min_t(ssize_t, remain,
> + (ssize_t)(PAGE_SIZE - off));
> + off = 0;
> + np++;
> + }
> + for (j = 0; j < np; j++) {
> + if (folio_mapping(page_folio(pages[j]))) {
> + int k;
> +
> + for (k = 0; k < np; k++)
> + put_page(pages[k]);
> + rc = -EOPNOTSUPP;
> + goto out;
> + }
> + }
> + }
> while (copied) {
> use = min_t(int, copied, PAGE_SIZE - offset);
>
> @@ -1699,6 +1728,14 @@ tls_decrypt_sw(struct sock *sk, struct tls_context *tls_ctx,
> if (err < 0) {
> if (err == -EBADMSG)
> TLS_INC_STATS(sock_net(sk), LINUX_MIB_TLSDECRYPTERROR);
> + if (err == -EOPNOTSUPP && darg->zc) {
> + /* tls_setup_from_iter detected file-backed destination
> + * pages; retry without ZC via the bounce-buffer path.
> + */
> + darg->zc = false;
> + TLS_INC_STATS(sock_net(sk), LINUX_MIB_TLSDECRYPTRETRY);
> + return tls_decrypt_sw(sk, tls_ctx, msg, darg);
> + }
> return err;
> }
> /* keep going even for ->async, the code below is TLS 1.3 */