public inbox for netdev@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH] bpf: add is_locked_tcp_sock guard for sock_ops rtt_min access
@ 2026-04-06 22:49 Werner Kasselman
  2026-04-07  1:25 ` Martin KaFai Lau
  0 siblings, 1 reply; 3+ messages in thread
From: Werner Kasselman @ 2026-04-06 22:49 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko
  Cc: Martin KaFai Lau, John Fastabend, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Lawrence Brakmo, bpf@vger.kernel.org,
	netdev@vger.kernel.org, stable@vger.kernel.org

sock_ops_convert_ctx_access() generates BPF instructions to inline
context field accesses for BPF_PROG_TYPE_SOCK_OPS programs. For
tcp_sock-specific fields like snd_cwnd, srtt_us, etc., it uses the
SOCK_OPS_GET_TCP_SOCK_FIELD() macro which checks is_locked_tcp_sock
and returns 0 when the socket is not a locked full TCP socket.

However, the rtt_min field bypasses this guard entirely: it emits a raw
two-instruction load sequence (load sk pointer, then load from
tcp_sock->rtt_min offset) without checking is_locked_tcp_sock first.

This is a problem because bpf_skops_hdr_opt_len() and
bpf_skops_write_hdr_opt() in tcp_output.c set sock_ops.sk to a
tcp_request_sock (cast from request_sock) during SYN-ACK processing,
with is_fullsock=0 and is_locked_tcp_sock=0. If a SOCK_OPS program
with BPF_SOCK_OPS_WRITE_HDR_OPT_CB_FLAG reads ctx->rtt_min in this
callback, the generated code treats the tcp_request_sock pointer as a
tcp_sock and reads at offsetof(struct tcp_sock, rtt_min) -- which is
well past the end of the tcp_request_sock allocation, causing an
out-of-bounds slab read.

The rtt_min field was introduced in the same commit as the other
tcp_sock fields but was given hand-rolled access code because it reads
a sub-field (rtt_min.s[0].v, a minmax_sample) rather than a direct
struct member, making it incompatible with the SOCK_OPS_GET_FIELD()
macro. This hand-rolled code omitted the is_fullsock guard that the
macro provides. The guard was later renamed to is_locked_tcp_sock in
commit fd93eaffb3f9 ("bpf: Prevent unsafe access to the sock fields in the BPF timestamping callback").

Add the is_locked_tcp_sock guard to the rtt_min case, replicating the
exact instruction pattern used by SOCK_OPS_GET_FIELD() including
proper handling of the dst_reg==src_reg case with temp register
save/restore. Use offsetof(struct minmax_sample, v) for the sub-field
offset to match the style in bpf_tcp_sock_convert_ctx_access().

Found via AST-based call-graph analysis using sqry.

Fixes: 44f0e43037d3 ("bpf: Add support for reading sk_state and more")
Cc: stable@vger.kernel.org
Signed-off-by: Werner Kasselman <werner@verivus.com>
---
 net/core/filter.c | 47 ++++++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 44 insertions(+), 3 deletions(-)

diff --git a/net/core/filter.c b/net/core/filter.c
index 78b548158fb0..58f0735b18d9 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -10830,13 +10830,54 @@ static u32 sock_ops_convert_ctx_access(enum bpf_access_type type,
 		BUILD_BUG_ON(sizeof(struct minmax) <
 			     sizeof(struct minmax_sample));
 
+		/* Unlike other tcp_sock fields that use
+		 * SOCK_OPS_GET_TCP_SOCK_FIELD(), rtt_min requires a
+		 * custom access pattern because it reads a sub-field
+		 * (rtt_min.s[0].v) rather than a direct struct member.
+		 * We must still guard the access with is_locked_tcp_sock
+		 * to prevent an OOB read when sk points to a
+		 * tcp_request_sock (e.g., during SYN-ACK processing via
+		 * bpf_skops_hdr_opt_len/bpf_skops_write_hdr_opt).
+		 */
+		off = offsetof(struct tcp_sock, rtt_min) +
+		      offsetof(struct minmax_sample, v);
+	{
+		int fullsock_reg = si->dst_reg, reg = BPF_REG_9, jmp = 2;
+
+		if (si->dst_reg == reg || si->src_reg == reg)
+			reg--;
+		if (si->dst_reg == reg || si->src_reg == reg)
+			reg--;
+		if (si->dst_reg == si->src_reg) {
+			*insn++ = BPF_STX_MEM(BPF_DW, si->src_reg, reg,
+					  offsetof(struct bpf_sock_ops_kern,
+					  temp));
+			fullsock_reg = reg;
+			jmp += 2;
+		}
+		*insn++ = BPF_LDX_MEM(BPF_FIELD_SIZEOF(
+						struct bpf_sock_ops_kern,
+						is_locked_tcp_sock),
+				      fullsock_reg, si->src_reg,
+				      offsetof(struct bpf_sock_ops_kern,
+					       is_locked_tcp_sock));
+		*insn++ = BPF_JMP_IMM(BPF_JEQ, fullsock_reg, 0, jmp);
+		if (si->dst_reg == si->src_reg)
+			*insn++ = BPF_LDX_MEM(BPF_DW, reg, si->src_reg,
+				      offsetof(struct bpf_sock_ops_kern,
+				      temp));
 		*insn++ = BPF_LDX_MEM(BPF_FIELD_SIZEOF(
 						struct bpf_sock_ops_kern, sk),
 				      si->dst_reg, si->src_reg,
 				      offsetof(struct bpf_sock_ops_kern, sk));
-		*insn++ = BPF_LDX_MEM(BPF_W, si->dst_reg, si->dst_reg,
-				      offsetof(struct tcp_sock, rtt_min) +
-				      sizeof_field(struct minmax_sample, t));
+		*insn++ = BPF_LDX_MEM(BPF_W, si->dst_reg, si->dst_reg, off);
+		if (si->dst_reg == si->src_reg) {
+			*insn++ = BPF_JMP_A(1);
+			*insn++ = BPF_LDX_MEM(BPF_DW, reg, si->src_reg,
+				      offsetof(struct bpf_sock_ops_kern,
+				      temp));
+		}
+	}
 		break;
 
 	case offsetof(struct bpf_sock_ops, bpf_sock_ops_cb_flags):
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2026-04-07  1:56 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-04-06 22:49 [PATCH] bpf: add is_locked_tcp_sock guard for sock_ops rtt_min access Werner Kasselman
2026-04-07  1:25 ` Martin KaFai Lau
2026-04-07  1:56   ` Werner Kasselman

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox