From: Jakub Kicinski <kuba@kernel.org>
To: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Yeonju Bae <iwasbaeyz@gmail.com>,
	netdev@vger.kernel.org, gregkh@linuxfoundation.org,
	security@kernel.org, john.fastabend@gmail.com,
	sd@queasysnail.net, davem@davemloft.net, edumazet@google.com,
	pabeni@redhat.com, horms@kernel.org,
	linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org
Subject: Re: [PATCH net] tls: avoid zc receive for file-backed pages
Date: Mon, 25 May 2026 10:54:59 -0700
Message-ID: <20260525105459.5ae73c2b@kernel.org>
In-Reply-To: <20260521165328.16112-1-iwasbaeyz@gmail.com>

On Fri, 22 May 2026 01:53:28 +0900 Yeonju Bae wrote:
> kTLS RX zc decrypt writes unauthenticated AEAD output directly into
> pages pinned from the recvmsg iterator via tls_setup_from_iter().
> For MAP_SHARED, PROT_WRITE file-backed destinations, those pages are
> live page-cache pages rather than anonymous copies: MAP_SHARED does not
> trigger copy-on-write, so FOLL_WRITE returns the actual page-cache page.
> 
> crypto_aead_decrypt() writes CTR-mode decryption output into the
> scatter-gather list before the authentication tag is verified.  If the
> tag check fails (-EBADMSG), the plaintext-like output is already
> resident in the page-cache page.  exit_free_pages() calls put_page()
> without any content cleanup, so the modification persists through the
> backing file.  An independent open(O_RDONLY)/read() of the same file
> returns different content and its SHA-256 changes.  MAP_PRIVATE is safe
> via COW; PROT_READ-only destinations fail at iov_iter_get_pages2()
> before any decryption occurs.
> 
> Avoid zc receive for file-backed destination pages.  In
> tls_setup_from_iter(), after iov_iter_get_pages2() pins pages, check
> each page with folio_mapping(page_folio(page)).  If any pinned page is
> file-backed (mapping != NULL), release the pinned pages and return
> -EOPNOTSUPP.  Handle -EOPNOTSUPP in tls_decrypt_sw() by clearing
> darg->zc and retrying, which causes tls_decrypt_sg() to allocate a
> kernel bounce buffer instead.  Decryption output never reaches the
> file-backed page; on tag failure the bounce buffer is discarded.
> 
> This follows the existing opportunistic zc retry pattern already used
> for TLS 1.3 record type mismatches in tls_decrypt_sw().
> 
> Verified on linux-7.0-rc3 QEMU (x86-64), four destination types:
>   MAP_SHARED+RW:   file_changed=0  (was 4077/4096 bytes before patch)
>   MAP_PRIVATE+RW:  file_changed=0  (COW isolation; unchanged)
>   anonymous heap:  no file backing  (unchanged)
>   PROT_READ only:  file_changed=0  (EFAULT before decrypt; unchanged)
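
For reference, the MAP_SHARED vs MAP_PRIVATE distinction the commit
message relies on can be observed from user space without kTLS at all.
The following is a minimal sketch, not a reproducer for the reported
issue: write_reaches_file() is a hypothetical helper name, the scratch
file under /tmp is an assumption, and the single byte store merely
stands in for the decrypt output landing in the destination page.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Store one byte through a mapping of a scratch file, then read the
 * file back with pread().  map_flags is MAP_SHARED or MAP_PRIVATE.
 * Returns 1 if the file content changed, 0 if it did not, -1 on a
 * setup error.
 */
int write_reaches_file(int map_flags)
{
	char path[] = "/tmp/mapdemo-XXXXXX";	/* assumed scratch location */
	char before = 'A', after = 0;
	int fd, ret = -1;
	char *p;

	fd = mkstemp(path);
	if (fd < 0)
		return -1;
	unlink(path);		/* file stays alive until fd is closed */

	if (pwrite(fd, &before, 1, 0) != 1)
		goto out;

	p = mmap(NULL, 1, PROT_READ | PROT_WRITE, map_flags, fd, 0);
	if (p == MAP_FAILED)
		goto out;

	p[0] = 'B';		/* stand-in for data written into the page */
	munmap(p, 1);

	/* Independent read of the file, bypassing the mapping. */
	if (pread(fd, &after, 1, 0) == 1)
		ret = (after != before);
out:
	close(fd);
	return ret;
}
```

On Linux, calling write_reaches_file(MAP_SHARED) should report that
the file changed, while write_reaches_file(MAP_PRIVATE) should not,
since the private mapping gets its own copy on the first write.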

I'm not seeing anything unusual here from a high-level API usage
standpoint. We feed the iov_iter constructed by recvmsg in socket code
into iov_iter_get_pages2(). Either:
 - the way we construct the iov_iter is wrong; or
 - iov_iter_get_pages2() should return an error; or
 - we should use a different iov_* API; or
 - the current behavior you describe is expected / correct.

I don't think that TLS open-coding page checks is the right move.

Al, would you mind glancing over this?
I have no idea what the expected page cache behavior is in this scenario.

> diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c
> index a977b0434..c312a83b4 100644
> --- a/net/tls/tls_sw.c
> +++ b/net/tls/tls_sw.c
> @@ -36,6 +36,7 @@
>   */
>  
>  #include <linux/bug.h>
> +#include <linux/pagemap.h>
>  #include <linux/sched/signal.h>
>  #include <linux/module.h>
>  #include <linux/kernel.h>
> @@ -1443,6 +1444,34 @@ static int tls_setup_from_iter(struct iov_iter *from,
>  
>  		length -= copied;
>  		size += copied;
> +		/* Reject file-backed destination pages.  Writing unauthenticated
> +		 * AEAD output into a page-cache page before tag verification
> +		 * leaves the backing file modified even when recvmsg() returns
> +		 * -EBADMSG.  Return -EOPNOTSUPP so the caller retries via the
> +		 * non-ZC bounce-buffer path.
> +		 */
> +		{
> +			ssize_t remain = copied;
> +			size_t  off    = offset;
> +			int     np = 0, j;
> +
> +			while (remain > 0) {
> +				remain -= min_t(ssize_t, remain,
> +						(ssize_t)(PAGE_SIZE - off));
> +				off = 0;
> +				np++;
> +			}
> +			for (j = 0; j < np; j++) {
> +				if (folio_mapping(page_folio(pages[j]))) {
> +					int k;
> +
> +					for (k = 0; k < np; k++)
> +						put_page(pages[k]);
> +					rc = -EOPNOTSUPP;
> +					goto out;
> +				}
> +			}
> +		}
>  		while (copied) {
>  			use = min_t(int, copied, PAGE_SIZE - offset);
>  
> @@ -1699,6 +1728,14 @@ tls_decrypt_sw(struct sock *sk, struct tls_context *tls_ctx,
>  	if (err < 0) {
>  		if (err == -EBADMSG)
>  			TLS_INC_STATS(sock_net(sk), LINUX_MIB_TLSDECRYPTERROR);
> +		if (err == -EOPNOTSUPP && darg->zc) {
> +			/* tls_setup_from_iter detected file-backed destination
> +			 * pages; retry without ZC via the bounce-buffer path.
> +			 */
> +			darg->zc = false;
> +			TLS_INC_STATS(sock_net(sk), LINUX_MIB_TLSDECRYPTRETRY);
> +			return tls_decrypt_sw(sk, tls_ctx, msg, darg);
> +		}
>  		return err;
>  	}
>  	/* keep going even for ->async, the code below is TLS 1.3 */
