From: Jakub Kicinski <kuba@kernel.org>
To: Yiming Qian <yimingqian591@gmail.com>
Cc: security@kernel.org, John Fastabend <john.fastabend@gmail.com>,
Jakub Sitnicki <jakub@cloudflare.com>,
Sabrina Dubroca <sd@queasysnail.net>,
"David S. Miller" <davem@davemloft.net>,
Eric Dumazet <edumazet@google.com>,
Paolo Abeni <pabeni@redhat.com>, Simon Horman <horms@kernel.org>,
keenanat2000@gmail.com, netdev@vger.kernel.org,
bpf@vger.kernel.org, linux-kernel@vger.kernel.org,
stable@vger.kernel.org
Subject: Re: [PATCH net] net/tls: preserve sk_msg sg.copy when splitting records
Date: Mon, 8 Jun 2026 20:00:32 -0700 [thread overview]
Message-ID: <20260608200032.77aebf5c@kernel.org> (raw)
In-Reply-To: <20260604134019.39161-1-yimingqian591@gmail.com>
On Thu, 4 Jun 2026 13:40:11 +0000 Yiming Qian wrote:
> tls_split_open_record() copies scatterlist entries from the current
> plaintext sk_msg into a newly allocated plaintext sk_msg when an open
> record is split.
>
> The scatterlist entry and the corresponding msg->sg.copy bit are one
> ownership record. Splice-backed entries are created by sk_msg_page_add()
> with the copy bit set so sk_msg_compute_data_pointers() does not expose
> them as writable BPF msg->data.
>
> The split path used memcpy() to copy both partial and whole tail entries
> but left the new sk_msg copy bitmap clear. A subsequent SK_MSG verdict on
> the split tail could therefore receive a writable data pointer to a page
> that was only supposed to be copied, allowing BPF to overwrite externally
> owned page cache.
>
> Add a helper for copying one sg.copy bit and use it for the partial tmp
> entry and for each copied tail entry.
>
> Fixes: d3b18ad31f93 ("tls: add bpf support to sk_msg handling")
> Reported-by: Yiming Qian <yimingqian591@gmail.com>
> Reported-by: Keenan Dong <keenanat2000@gmail.com>
> Signed-off-by: Yiming Qian <yimingqian591@gmail.com>
> Signed-off-by: Keenan Dong <keenanat2000@gmail.com>
> ---
> include/linux/skmsg.h | 9 +++++++++
> net/tls/tls_sw.c | 7 +++++++
> 2 files changed, 16 insertions(+)
>
> diff --git a/include/linux/skmsg.h b/include/linux/skmsg.h
> index 19f4f253b4f90..f3988ce2219db 100644
> --- a/include/linux/skmsg.h
> +++ b/include/linux/skmsg.h
> @@ -283,6 +283,15 @@ static inline void sk_msg_sg_copy(struct sk_msg *msg, u32 i, bool copy_state)
> } while (1);
> }
>
> +static inline void sk_msg_sg_copy_one(struct sk_msg *dst, u32 dst_i,
> + const struct sk_msg *src, u32 src_i)
> +{
> + if (test_bit(src_i, src->sg.copy))
> + __set_bit(dst_i, dst->sg.copy);
> + else
> + __clear_bit(dst_i, dst->sg.copy);
__assign_bit()?
Also _assign() may be a better suffix for the helper than _one
> +}
> +
> static inline void sk_msg_sg_copy_set(struct sk_msg *msg, u32 start)
> {
> sk_msg_sg_copy(msg, start, true);
> diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c
> index 964ebc268ee46..434753de8aadd 100644
> --- a/net/tls/tls_sw.c
> +++ b/net/tls/tls_sw.c
> @@ -623,6 +623,7 @@ static int tls_split_open_record(struct sock *sk, struct tls_rec *from,
> struct scatterlist *sge, *osge, *nsge;
> u32 orig_size = msg_opl->sg.size;
> struct scatterlist tmp = { };
> + u32 tmp_i = NR_MSG_FRAG_IDS;
> struct sk_msg *msg_npl;
> struct tls_rec *new;
> int ret;
> @@ -644,6 +645,7 @@ static int tls_split_open_record(struct sock *sk, struct tls_rec *from,
> if (sge->length > apply) {
> u32 len = sge->length - apply;
>
> + tmp_i = i;
> get_page(sg_page(sge));
> sg_set_page(&tmp, sg_page(sge), len,
> sge->offset + apply);
> @@ -675,6 +677,10 @@ static int tls_split_open_record(struct sock *sk, struct tls_rec *from,
> nsge = sk_msg_elem(msg_npl, j);
> if (tmp.length) {
> memcpy(nsge, &tmp, sizeof(*nsge));
> + if (WARN_ON_ONCE(tmp_i == NR_MSG_FRAG_IDS))
> + __clear_bit(j, msg_npl->sg.copy);
> + else
> + sk_msg_sg_copy_one(msg_npl, j, msg_opl, tmp_i);
Not immediately obvious from the diff or commit message why this
special handling is needed.
> sk_msg_iter_var_next(j);
> nsge = sk_msg_elem(msg_npl, j);
> }
> @@ -682,6 +688,7 @@ static int tls_split_open_record(struct sock *sk, struct tls_rec *from,
> osge = sk_msg_elem(msg_opl, i);
> while (osge->length) {
> memcpy(nsge, osge, sizeof(*nsge));
> + sk_msg_sg_copy_one(msg_npl, j, msg_opl, i);
> sg_unmark_end(nsge);
> sk_msg_iter_var_next(i);
> sk_msg_iter_var_next(j);
This patch looks a little short. I have a recollection of trying to fix
this and giving up after realizing how big the diff would be just to fix
all the cases within skmsg. Could you look harder for places that
shuffle skmsg entries around, not just the ones in TLS? Maybe I'm
misremembering but I thought there was a lot more of them...
prev parent reply other threads:[~2026-06-09 3:00 UTC|newest]
Thread overview: 4+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-06-04 13:40 [PATCH net] net/tls: preserve sk_msg sg.copy when splitting records Yiming Qian
2026-06-05 13:40 ` sashiko-bot
2026-06-07 7:12 ` 钱一铭
2026-06-09 3:00 ` Jakub Kicinski [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260608200032.77aebf5c@kernel.org \
--to=kuba@kernel.org \
--cc=bpf@vger.kernel.org \
--cc=davem@davemloft.net \
--cc=edumazet@google.com \
--cc=horms@kernel.org \
--cc=jakub@cloudflare.com \
--cc=john.fastabend@gmail.com \
--cc=keenanat2000@gmail.com \
--cc=linux-kernel@vger.kernel.org \
--cc=netdev@vger.kernel.org \
--cc=pabeni@redhat.com \
--cc=sd@queasysnail.net \
--cc=security@kernel.org \
--cc=stable@vger.kernel.org \
--cc=yimingqian591@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox