* [PATCH bpf v2] bpf, sockmap: fix use-after-free when the stream parser resizes the skb
@ 2026-06-12 12:35 Sechang Lim
2026-06-17 22:36 ` Bobby Eshleman
` (2 more replies)
0 siblings, 3 replies; 5+ messages in thread
From: Sechang Lim @ 2026-06-12 12:35 UTC (permalink / raw)
To: John Fastabend, Jakub Sitnicki, David S . Miller, Eric Dumazet,
Jakub Kicinski, Paolo Abeni
Cc: Simon Horman, Bobby Eshleman, netdev, bpf, linux-kernel
sk_psock_strp_parse() runs the BPF_PROG_TYPE_SK_SKB stream-parser program
to find the length of the next message. strparser assembles a message out
of several received skbs by chaining them onto the head's frag_list and
recording where to append the next one in strp->skb_nextp:
*strp->skb_nextp = skb;
strp->skb_nextp = &skb->next;
and then calls the parser on the head:
len = (*strp->cb.parse_msg)(strp, head);
The parser is only meant to inspect the skb, but the program may call
bpf_skb_change_tail() -- or the sibling bpf_skb_pull_data(),
bpf_skb_change_head(), bpf_skb_adjust_room(), all allowed for SK_SKB.
Once the head carries a frag_list these go
... -> skb_ensure_writable -> pskb_may_pull -> __pskb_pull_tail
and __pskb_pull_tail() frees the frag_list skbs that strparser still
tracks through skb_nextp:
while ((list = skb_shinfo(skb)->frag_list) != insp) {
skb_shinfo(skb)->frag_list = list->next;
consume_skb(list);
}
strp->skb_nextp now points into a freed sk_buff. The next segment of
the same message arrives in __strp_recv(), which links it with
*strp->skb_nextp = skb, an 8-byte write into the freed skb. The free
and the write happen in different __strp_recv() calls, so the message
has to span at least three segments before it triggers.
BUG: KASAN: slab-use-after-free in __strp_recv+0x447/0xda0
Write of size 8 at addr ffff88810db86140 by task repro/349
Call Trace:
<IRQ>
__strp_recv+0x447/0xda0
__tcp_read_sock+0x13d/0x590
tcp_bpf_strp_read_sock+0x195/0x320
strp_data_ready+0x267/0x340
sk_psock_strp_data_ready+0x1ce/0x350
tcp_data_queue+0x1364/0x2fd0
tcp_rcv_established+0xe07/0x1640
[...]
Allocated by task 349:
skb_clone+0x17b/0x210
__strp_recv+0x2c3/0xda0
__tcp_read_sock+0x13d/0x590
[...]
Freed by task 349:
kmem_cache_free+0x150/0x570
__pskb_pull_tail+0x57b/0xc20
skb_ensure_writable+0x236/0x260
__bpf_skb_change_tail+0x1d4/0x590
sk_skb_change_tail+0x2a/0x40
bpf_prog_1b285dcd6c41373e+0x27/0x30
bpf_prog_run_pin_on_cpu+0xf3/0x260
sk_psock_strp_parse+0x118/0x1e0
__strp_recv+0x4f6/0xda0
[...]
The same resize also leaves the head's length inconsistent with its
frags, so a later __pskb_pull_tail() can instead hit the
BUG_ON(skb_copy_bits(...)) in net/core/skbuff.c.
Run the parser on a private clone of the head when the message spans more
than one skb and the program can modify the packet
(prog->aux->changes_pkt_data), so a resizing helper can only touch the
clone and strparser's head and skb_nextp stay valid. Single-skb messages
have no frag_list and read-only parsers cannot resize, so both are still
parsed in place. If the clone cannot be allocated, return 0 so the caller
retries on the next read rather than failing the parser.
Fixes: 8a31db561566 ("bpf: add access to sock fields and pkt data from sk_skb programs")
Signed-off-by: Sechang Lim <rhkrqnwk98@gmail.com>
---
v2:
- clone only when prog->aux->changes_pkt_data (Bobby Eshleman)
- return 0 on clone failure instead of -ENOMEM (Bobby Eshleman)
- free the clone with consume_skb() instead of kfree_skb()
- drop the unrelated guard(rcu)() change (Bobby Eshleman)
v1:
- https://lore.kernel.org/all/20260609112316.3685738-1-rhkrqnwk98@gmail.com/
net/core/skmsg.c | 26 +++++++++++++++++++++++---
1 file changed, 23 insertions(+), 3 deletions(-)
diff --git a/net/core/skmsg.c b/net/core/skmsg.c
index e1850caf1a71..97e5bc5f38c3 100644
--- a/net/core/skmsg.c
+++ b/net/core/skmsg.c
@@ -1149,9 +1149,29 @@ static int sk_psock_strp_parse(struct strparser *strp, struct sk_buff *skb)
rcu_read_lock();
prog = READ_ONCE(psock->progs.stream_parser);
if (likely(prog)) {
- skb->sk = psock->sk;
- ret = bpf_prog_run_pin_on_cpu(prog, skb);
- skb->sk = NULL;
+ struct sk_buff *parse_skb = skb;
+
+ /*
+ * strparser chains the message skbs through skb->frag_list and
+ * keeps a pointer into that list in strp->skb_nextp. The parser
+ * program may call bpf_skb_change_tail() and friends, which go
+ * through __pskb_pull_tail() and free the frag_list skbs that
+ * strparser still tracks. Run the program on a clone when the head
+ * has a frag_list and the program can modify the packet, so it
+ * cannot drop frags strparser owns.
+ */
+ if (skb_has_frag_list(skb) && prog->aux->changes_pkt_data) {
+ parse_skb = skb_clone(skb, GFP_ATOMIC);
+ if (!parse_skb) {
+ rcu_read_unlock();
+ return 0;
+ }
+ }
+ parse_skb->sk = psock->sk;
+ ret = bpf_prog_run_pin_on_cpu(prog, parse_skb);
+ parse_skb->sk = NULL;
+ if (parse_skb != skb)
+ consume_skb(parse_skb);
}
rcu_read_unlock();
return ret;
--
2.43.0
^ permalink raw reply related [flat|nested] 5+ messages in thread* Re: [PATCH bpf v2] bpf, sockmap: fix use-after-free when the stream parser resizes the skb
2026-06-12 12:35 [PATCH bpf v2] bpf, sockmap: fix use-after-free when the stream parser resizes the skb Sechang Lim
@ 2026-06-17 22:36 ` Bobby Eshleman
2026-06-18 0:25 ` Kuniyuki Iwashima
2026-06-18 2:45 ` Jiayuan Chen
2 siblings, 0 replies; 5+ messages in thread
From: Bobby Eshleman @ 2026-06-17 22:36 UTC (permalink / raw)
To: Sechang Lim
Cc: John Fastabend, Jakub Sitnicki, David S . Miller, Eric Dumazet,
Jakub Kicinski, Paolo Abeni, Simon Horman, netdev, bpf,
linux-kernel
On Fri, Jun 12, 2026 at 12:35:51PM +0000, Sechang Lim wrote:
> sk_psock_strp_parse() runs the BPF_PROG_TYPE_SK_SKB stream-parser program
> to find the length of the next message. strparser assembles a message out
> of several received skbs by chaining them onto the head's frag_list and
> recording where to append the next one in strp->skb_nextp:
>
> *strp->skb_nextp = skb;
> strp->skb_nextp = &skb->next;
>
> and then calls the parser on the head:
>
> len = (*strp->cb.parse_msg)(strp, head);
>
> The parser is only meant to inspect the skb, but the program may call
> bpf_skb_change_tail() -- or the sibling bpf_skb_pull_data(),
> bpf_skb_change_head(), bpf_skb_adjust_room(), all allowed for SK_SKB.
> Once the head carries a frag_list these go
>
> ... -> skb_ensure_writable -> pskb_may_pull -> __pskb_pull_tail
>
> and __pskb_pull_tail() frees the frag_list skbs that strparser still
> tracks through skb_nextp:
>
> while ((list = skb_shinfo(skb)->frag_list) != insp) {
> skb_shinfo(skb)->frag_list = list->next;
> consume_skb(list);
> }
>
> strp->skb_nextp now points into a freed sk_buff. The next segment of
> the same message arrives in __strp_recv(), which links it with
> *strp->skb_nextp = skb, an 8-byte write into the freed skb. The free
> and the write happen in different __strp_recv() calls, so the message
> has to span at least three segments before it triggers.
>
> BUG: KASAN: slab-use-after-free in __strp_recv+0x447/0xda0
> Write of size 8 at addr ffff88810db86140 by task repro/349
>
> Call Trace:
> <IRQ>
> __strp_recv+0x447/0xda0
> __tcp_read_sock+0x13d/0x590
> tcp_bpf_strp_read_sock+0x195/0x320
> strp_data_ready+0x267/0x340
> sk_psock_strp_data_ready+0x1ce/0x350
> tcp_data_queue+0x1364/0x2fd0
> tcp_rcv_established+0xe07/0x1640
> [...]
>
> Allocated by task 349:
> skb_clone+0x17b/0x210
> __strp_recv+0x2c3/0xda0
> __tcp_read_sock+0x13d/0x590
> [...]
>
> Freed by task 349:
> kmem_cache_free+0x150/0x570
> __pskb_pull_tail+0x57b/0xc20
> skb_ensure_writable+0x236/0x260
> __bpf_skb_change_tail+0x1d4/0x590
> sk_skb_change_tail+0x2a/0x40
> bpf_prog_1b285dcd6c41373e+0x27/0x30
> bpf_prog_run_pin_on_cpu+0xf3/0x260
> sk_psock_strp_parse+0x118/0x1e0
> __strp_recv+0x4f6/0xda0
> [...]
>
> The same resize also leaves the head's length inconsistent with its
> frags, so a later __pskb_pull_tail() can instead hit the
> BUG_ON(skb_copy_bits(...)) in net/core/skbuff.c.
>
> Run the parser on a private clone of the head when the message spans more
> than one skb and the program can modify the packet
> (prog->aux->changes_pkt_data), so a resizing helper can only touch the
> clone and strparser's head and skb_nextp stay valid. Single-skb messages
> have no frag_list and read-only parsers cannot resize, so both are still
> parsed in place. If the clone cannot be allocated, return 0 so the caller
> retries on the next read rather than failing the parser.
>
> Fixes: 8a31db561566 ("bpf: add access to sock fields and pkt data from sk_skb programs")
> Signed-off-by: Sechang Lim <rhkrqnwk98@gmail.com>
> ---
> v2:
> - clone only when prog->aux->changes_pkt_data (Bobby Eshleman)
> - return 0 on clone failure instead of -ENOMEM (Bobby Eshleman)
> - free the clone with consume_skb() instead of kfree_skb()
> - drop the unrelated guard(rcu)() change (Bobby Eshleman)
>
> v1:
> - https://lore.kernel.org/all/20260609112316.3685738-1-rhkrqnwk98@gmail.com/
>
> net/core/skmsg.c | 26 +++++++++++++++++++++++---
> 1 file changed, 23 insertions(+), 3 deletions(-)
>
> diff --git a/net/core/skmsg.c b/net/core/skmsg.c
> index e1850caf1a71..97e5bc5f38c3 100644
> --- a/net/core/skmsg.c
> +++ b/net/core/skmsg.c
> @@ -1149,9 +1149,29 @@ static int sk_psock_strp_parse(struct strparser *strp, struct sk_buff *skb)
> rcu_read_lock();
> prog = READ_ONCE(psock->progs.stream_parser);
> if (likely(prog)) {
> - skb->sk = psock->sk;
> - ret = bpf_prog_run_pin_on_cpu(prog, skb);
> - skb->sk = NULL;
> + struct sk_buff *parse_skb = skb;
> +
> + /*
> + * strparser chains the message skbs through skb->frag_list and
> + * keeps a pointer into that list in strp->skb_nextp. The parser
> + * program may call bpf_skb_change_tail() and friends, which go
> + * through __pskb_pull_tail() and free the frag_list skbs that
> + * strparser still tracks. Run the program on a clone when the head
> + * has a frag_list and the program can modify the packet, so it
> + * cannot drop frags strparser owns.
> + */
> + if (skb_has_frag_list(skb) && prog->aux->changes_pkt_data) {
> + parse_skb = skb_clone(skb, GFP_ATOMIC);
> + if (!parse_skb) {
> + rcu_read_unlock();
> + return 0;
> + }
> + }
> + parse_skb->sk = psock->sk;
> + ret = bpf_prog_run_pin_on_cpu(prog, parse_skb);
> + parse_skb->sk = NULL;
> + if (parse_skb != skb)
> + consume_skb(parse_skb);
> }
> rcu_read_unlock();
> return ret;
> --
> 2.43.0
>
Hey Sechang,
I'm still on the fence about "return 0" vs ENOMEM. I hate to flip-flop
on you here, but now I'm not sure if it is worth the complication to
return 0 since we're really only buying a single timer interval in which
we need 1) suddenly more memory to alloc the clone, and 2) another data
ready event to cause the stream parsing to pick up again. If any one
doesn't happen, the end result is the same. Not sure its a good
trade-off for the complexity of basically tricking the caller with the
zero return. Maybe let's go back to ENOMEM?
BTW, based on the comm name "repro", it sounds like you have a decent
reproducer for this. I wonder if it is possible to add something to the
selftests to catch this?
Best,
Bobby
^ permalink raw reply [flat|nested] 5+ messages in thread* Re: [PATCH bpf v2] bpf, sockmap: fix use-after-free when the stream parser resizes the skb
2026-06-12 12:35 [PATCH bpf v2] bpf, sockmap: fix use-after-free when the stream parser resizes the skb Sechang Lim
2026-06-17 22:36 ` Bobby Eshleman
@ 2026-06-18 0:25 ` Kuniyuki Iwashima
2026-06-18 2:45 ` Jiayuan Chen
2 siblings, 0 replies; 5+ messages in thread
From: Kuniyuki Iwashima @ 2026-06-18 0:25 UTC (permalink / raw)
To: rhkrqnwk98
Cc: bobbyeshleman, bpf, davem, edumazet, horms, jakub, john.fastabend,
kuba, linux-kernel, netdev, pabeni
From: Sechang Lim <rhkrqnwk98@gmail.com>
Date: Fri, 12 Jun 2026 12:35:51 +0000
> sk_psock_strp_parse() runs the BPF_PROG_TYPE_SK_SKB stream-parser program
> to find the length of the next message. strparser assembles a message out
> of several received skbs by chaining them onto the head's frag_list and
> recording where to append the next one in strp->skb_nextp:
>
> *strp->skb_nextp = skb;
> strp->skb_nextp = &skb->next;
>
> and then calls the parser on the head:
>
> len = (*strp->cb.parse_msg)(strp, head);
>
> The parser is only meant to inspect the skb, but the program may call
> bpf_skb_change_tail() -- or the sibling bpf_skb_pull_data(),
> bpf_skb_change_head(), bpf_skb_adjust_room(), all allowed for SK_SKB.
It's bpf prog's responsibility not to abuse them.
Even setting aside that, why not simply block such BPF prog ?
It cannot be done at load time, but doable at attach time.
---8<---
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index 630d530782fe..4d60b77da8ef 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -4556,6 +4556,12 @@ static int bpf_prog_attach(const union bpf_attr *attr)
switch (ptype) {
case BPF_PROG_TYPE_SK_SKB:
+ if (attr->attach_type == BPF_SK_SKB_STREAM_PARSER &&
+ prog->aux->changes_pkt_data) {
+ ret = -EINVAL;
+ goto out;
+ }
+ fallthrough;
case BPF_PROG_TYPE_SK_MSG:
ret = sock_map_get_from_fd(attr, prog);
break;
---8<---
> Once the head carries a frag_list these go
>
> ... -> skb_ensure_writable -> pskb_may_pull -> __pskb_pull_tail
>
> and __pskb_pull_tail() frees the frag_list skbs that strparser still
> tracks through skb_nextp:
>
> while ((list = skb_shinfo(skb)->frag_list) != insp) {
> skb_shinfo(skb)->frag_list = list->next;
> consume_skb(list);
> }
>
> strp->skb_nextp now points into a freed sk_buff. The next segment of
> the same message arrives in __strp_recv(), which links it with
> *strp->skb_nextp = skb, an 8-byte write into the freed skb. The free
> and the write happen in different __strp_recv() calls, so the message
> has to span at least three segments before it triggers.
>
> BUG: KASAN: slab-use-after-free in __strp_recv+0x447/0xda0
> Write of size 8 at addr ffff88810db86140 by task repro/349
>
> Call Trace:
> <IRQ>
> __strp_recv+0x447/0xda0
> __tcp_read_sock+0x13d/0x590
> tcp_bpf_strp_read_sock+0x195/0x320
> strp_data_ready+0x267/0x340
> sk_psock_strp_data_ready+0x1ce/0x350
> tcp_data_queue+0x1364/0x2fd0
> tcp_rcv_established+0xe07/0x1640
> [...]
>
> Allocated by task 349:
> skb_clone+0x17b/0x210
> __strp_recv+0x2c3/0xda0
> __tcp_read_sock+0x13d/0x590
> [...]
>
> Freed by task 349:
> kmem_cache_free+0x150/0x570
> __pskb_pull_tail+0x57b/0xc20
> skb_ensure_writable+0x236/0x260
> __bpf_skb_change_tail+0x1d4/0x590
> sk_skb_change_tail+0x2a/0x40
> bpf_prog_1b285dcd6c41373e+0x27/0x30
> bpf_prog_run_pin_on_cpu+0xf3/0x260
> sk_psock_strp_parse+0x118/0x1e0
> __strp_recv+0x4f6/0xda0
> [...]
>
> The same resize also leaves the head's length inconsistent with its
> frags, so a later __pskb_pull_tail() can instead hit the
> BUG_ON(skb_copy_bits(...)) in net/core/skbuff.c.
>
> Run the parser on a private clone of the head when the message spans more
> than one skb and the program can modify the packet
> (prog->aux->changes_pkt_data), so a resizing helper can only touch the
> clone and strparser's head and skb_nextp stay valid. Single-skb messages
> have no frag_list and read-only parsers cannot resize, so both are still
> parsed in place. If the clone cannot be allocated, return 0 so the caller
> retries on the next read rather than failing the parser.
>
> Fixes: 8a31db561566 ("bpf: add access to sock fields and pkt data from sk_skb programs")
> Signed-off-by: Sechang Lim <rhkrqnwk98@gmail.com>
> ---
> v2:
> - clone only when prog->aux->changes_pkt_data (Bobby Eshleman)
> - return 0 on clone failure instead of -ENOMEM (Bobby Eshleman)
> - free the clone with consume_skb() instead of kfree_skb()
> - drop the unrelated guard(rcu)() change (Bobby Eshleman)
>
> v1:
> - https://lore.kernel.org/all/20260609112316.3685738-1-rhkrqnwk98@gmail.com/
>
> net/core/skmsg.c | 26 +++++++++++++++++++++++---
> 1 file changed, 23 insertions(+), 3 deletions(-)
>
> diff --git a/net/core/skmsg.c b/net/core/skmsg.c
> index e1850caf1a71..97e5bc5f38c3 100644
> --- a/net/core/skmsg.c
> +++ b/net/core/skmsg.c
> @@ -1149,9 +1149,29 @@ static int sk_psock_strp_parse(struct strparser *strp, struct sk_buff *skb)
> rcu_read_lock();
> prog = READ_ONCE(psock->progs.stream_parser);
> if (likely(prog)) {
> - skb->sk = psock->sk;
> - ret = bpf_prog_run_pin_on_cpu(prog, skb);
> - skb->sk = NULL;
> + struct sk_buff *parse_skb = skb;
> +
> + /*
> + * strparser chains the message skbs through skb->frag_list and
> + * keeps a pointer into that list in strp->skb_nextp. The parser
> + * program may call bpf_skb_change_tail() and friends, which go
> + * through __pskb_pull_tail() and free the frag_list skbs that
> + * strparser still tracks. Run the program on a clone when the head
> + * has a frag_list and the program can modify the packet, so it
> + * cannot drop frags strparser owns.
> + */
> + if (skb_has_frag_list(skb) && prog->aux->changes_pkt_data) {
> + parse_skb = skb_clone(skb, GFP_ATOMIC);
> + if (!parse_skb) {
> + rcu_read_unlock();
> + return 0;
> + }
> + }
> + parse_skb->sk = psock->sk;
> + ret = bpf_prog_run_pin_on_cpu(prog, parse_skb);
> + parse_skb->sk = NULL;
> + if (parse_skb != skb)
> + consume_skb(parse_skb);
> }
> rcu_read_unlock();
> return ret;
> --
> 2.43.0
>
^ permalink raw reply related [flat|nested] 5+ messages in thread* Re: [PATCH bpf v2] bpf, sockmap: fix use-after-free when the stream parser resizes the skb
2026-06-12 12:35 [PATCH bpf v2] bpf, sockmap: fix use-after-free when the stream parser resizes the skb Sechang Lim
2026-06-17 22:36 ` Bobby Eshleman
2026-06-18 0:25 ` Kuniyuki Iwashima
@ 2026-06-18 2:45 ` Jiayuan Chen
2 siblings, 0 replies; 5+ messages in thread
From: Jiayuan Chen @ 2026-06-18 2:45 UTC (permalink / raw)
To: Sechang Lim, John Fastabend, Jakub Sitnicki, David S . Miller,
Eric Dumazet, Jakub Kicinski, Paolo Abeni
Cc: Simon Horman, Bobby Eshleman, netdev, bpf, linux-kernel
On 6/12/26 8:35 PM, Sechang Lim wrote:
> sk_psock_strp_parse() runs the BPF_PROG_TYPE_SK_SKB stream-parser program
> to find the length of the next message. strparser assembles a message out
> of several received skbs by chaining them onto the head's frag_list and
> recording where to append the next one in strp->skb_nextp:
>
> *strp->skb_nextp = skb;
> strp->skb_nextp = &skb->next;
>
> and then calls the parser on the head:
>
> len = (*strp->cb.parse_msg)(strp, head);
>
> The parser is only meant to inspect the skb, but the program may call
> bpf_skb_change_tail() -- or the sibling bpf_skb_pull_data(),
> bpf_skb_change_head(), bpf_skb_adjust_room(), all allowed for SK_SKB.
> Once the head carries a frag_list these go
>
> ... -> skb_ensure_writable -> pskb_may_pull -> __pskb_pull_tail
>
> and __pskb_pull_tail() frees the frag_list skbs that strparser still
> tracks through skb_nextp:
>
> while ((list = skb_shinfo(skb)->frag_list) != insp) {
> skb_shinfo(skb)->frag_list = list->next;
> consume_skb(list);
> }
>
> strp->skb_nextp now points into a freed sk_buff. The next segment of
> the same message arrives in __strp_recv(), which links it with
> *strp->skb_nextp = skb, an 8-byte write into the freed skb. The free
> and the write happen in different __strp_recv() calls, so the message
> has to span at least three segments before it triggers.
>
> BUG: KASAN: slab-use-after-free in __strp_recv+0x447/0xda0
> Write of size 8 at addr ffff88810db86140 by task repro/349
>
> Call Trace:
> <IRQ>
> __strp_recv+0x447/0xda0
> __tcp_read_sock+0x13d/0x590
> tcp_bpf_strp_read_sock+0x195/0x320
> strp_data_ready+0x267/0x340
> sk_psock_strp_data_ready+0x1ce/0x350
> tcp_data_queue+0x1364/0x2fd0
> tcp_rcv_established+0xe07/0x1640
> [...]
>
> Allocated by task 349:
> skb_clone+0x17b/0x210
> __strp_recv+0x2c3/0xda0
> __tcp_read_sock+0x13d/0x590
> [...]
>
> Freed by task 349:
> kmem_cache_free+0x150/0x570
> __pskb_pull_tail+0x57b/0xc20
> skb_ensure_writable+0x236/0x260
> __bpf_skb_change_tail+0x1d4/0x590
> sk_skb_change_tail+0x2a/0x40
> bpf_prog_1b285dcd6c41373e+0x27/0x30
> bpf_prog_run_pin_on_cpu+0xf3/0x260
> sk_psock_strp_parse+0x118/0x1e0
> __strp_recv+0x4f6/0xda0
> [...]
>
> The same resize also leaves the head's length inconsistent with its
> frags, so a later __pskb_pull_tail() can instead hit the
> BUG_ON(skb_copy_bits(...)) in net/core/skbuff.c.
>
> Run the parser on a private clone of the head when the message spans more
> than one skb and the program can modify the packet
> (prog->aux->changes_pkt_data), so a resizing helper can only touch the
> clone and strparser's head and skb_nextp stay valid. Single-skb messages
> have no frag_list and read-only parsers cannot resize, so both are still
> parsed in place. If the clone cannot be allocated, return 0 so the caller
> retries on the next read rather than failing the parser.
>
> Fixes: 8a31db561566 ("bpf: add access to sock fields and pkt data from sk_skb programs")
Please consider Kuniyuki Iwashima's suggestion.
But it only covers the ATTACH path; the other two paths should be
covered as well:
- BPF_PROG_ATTACH → sock_map_get_from_fd → sock_map_prog_update
- BPF_LINK_CREATE → sock_map_link_create → sock_map_prog_update
- replace prog → sock_map_link_update_prog
A new helper for this check is probably needed, called from both
sock_map_prog_update() and sock_map_link_update_prog().
Since this rejects the program at attach time rather than fixing a
runtime crash,
I'm not sure a Fixes tag is appropriate here - thoughts?
^ permalink raw reply [flat|nested] 5+ messages in thread
* [PATCH bpf v2] bpf, sockmap: fix use-after-free when the stream parser resizes the skb
@ 2026-06-12 11:34 Sechang Lim
0 siblings, 0 replies; 5+ messages in thread
From: Sechang Lim @ 2026-06-12 11:34 UTC (permalink / raw)
To: John Fastabend, Jakub Sitnicki, David S . Miller, Eric Dumazet,
Jakub Kicinski, Paolo Abeni
Cc: Simon Horman, Bobby Eshleman, netdev, bpf, linux-kernel
sk_psock_strp_parse() runs the BPF_PROG_TYPE_SK_SKB stream-parser program
to find the length of the next message. strparser assembles a message out
of several received skbs by chaining them onto the head's frag_list and
recording where to append the next one in strp->skb_nextp:
*strp->skb_nextp = skb;
strp->skb_nextp = &skb->next;
and then calls the parser on the head:
len = (*strp->cb.parse_msg)(strp, head);
The parser is only meant to inspect the skb, but the program may call
bpf_skb_change_tail() -- or the sibling bpf_skb_pull_data(),
bpf_skb_change_head(), bpf_skb_adjust_room(), all allowed for SK_SKB.
Once the head carries a frag_list these go
... -> skb_ensure_writable -> pskb_may_pull -> __pskb_pull_tail
and __pskb_pull_tail() frees the frag_list skbs that strparser still
tracks through skb_nextp:
while ((list = skb_shinfo(skb)->frag_list) != insp) {
skb_shinfo(skb)->frag_list = list->next;
consume_skb(list);
}
strp->skb_nextp now points into a freed sk_buff. The next segment of
the same message arrives in __strp_recv(), which links it with
*strp->skb_nextp = skb, an 8-byte write into the freed skb. The free
and the write happen in different __strp_recv() calls, so the message
has to span at least three segments before it triggers.
BUG: KASAN: slab-use-after-free in __strp_recv+0x447/0xda0
Write of size 8 at addr ffff88810db86140 by task repro/349
Call Trace:
<IRQ>
__strp_recv+0x447/0xda0
__tcp_read_sock+0x13d/0x590
tcp_bpf_strp_read_sock+0x195/0x320
strp_data_ready+0x267/0x340
sk_psock_strp_data_ready+0x1ce/0x350
tcp_data_queue+0x1364/0x2fd0
tcp_rcv_established+0xe07/0x1640
[...]
Allocated by task 349:
skb_clone+0x17b/0x210
__strp_recv+0x2c3/0xda0
__tcp_read_sock+0x13d/0x590
[...]
Freed by task 349:
kmem_cache_free+0x150/0x570
__pskb_pull_tail+0x57b/0xc20
skb_ensure_writable+0x236/0x260
__bpf_skb_change_tail+0x1d4/0x590
sk_skb_change_tail+0x2a/0x40
bpf_prog_1b285dcd6c41373e+0x27/0x30
bpf_prog_run_pin_on_cpu+0xf3/0x260
sk_psock_strp_parse+0x118/0x1e0
__strp_recv+0x4f6/0xda0
[...]
The same resize also leaves the head's length inconsistent with its
frags, so a later __pskb_pull_tail() can instead hit the
BUG_ON(skb_copy_bits(...)) in net/core/skbuff.c.
Run the parser on a private clone of the head when the message spans more
than one skb and the program can modify the packet
(prog->aux->changes_pkt_data), so a resizing helper can only touch the
clone and strparser's head and skb_nextp stay valid. Single-skb messages
have no frag_list and read-only parsers cannot resize, so both are still
parsed in place. If the clone cannot be allocated, return 0 so the caller
retries on the next read rather than failing the parser.
Fixes: 8a31db561566 ("bpf: add access to sock fields and pkt data from sk_skb programs")
Signed-off-by: Sechang Lim <rhkrqnwk98@gmail.com>
---
v2:
- clone only when prog->aux->changes_pkt_data (Bobby Eshleman)
- return 0 on clone failure instead of -ENOMEM (Bobby Eshleman)
- free the clone with consume_skb() instead of kfree_skb()
- drop the unrelated guard(rcu)() change (Bobby Eshleman)
v1:
- https://lore.kernel.org/all/20260609112316.3685738-1-rhkrqnwk98@gmail.com/
net/core/skmsg.c | 25 ++++++++++++++++++++++---
1 file changed, 22 insertions(+), 3 deletions(-)
diff --git a/net/core/skmsg.c b/net/core/skmsg.c
index e1850caf1a71..501e8cfc7505 100644
--- a/net/core/skmsg.c
+++ b/net/core/skmsg.c
@@ -1149,9 +1149,28 @@ static int sk_psock_strp_parse(struct strparser *strp, struct sk_buff *skb)
rcu_read_lock();
prog = READ_ONCE(psock->progs.stream_parser);
if (likely(prog)) {
- skb->sk = psock->sk;
- ret = bpf_prog_run_pin_on_cpu(prog, skb);
- skb->sk = NULL;
+ struct sk_buff *parse_skb = skb;
+
+ /*
+ * strparser chains the message skbs through skb->frag_list and
+ * keeps a pointer into that list in strp->skb_nextp. The parser
+ * program may call bpf_skb_change_tail() and friends, which go
+ * through __pskb_pull_tail() and free the frag_list skbs that
+ * strparser still tracks. Run the program on a clone when the head
+ * has a frag_list and the program can modify the packet, so it
+ * cannot drop frags strparser owns.
+ */
+ if (skb_has_frag_list(skb) && prog->aux->changes_pkt_data) {
+ parse_skb = skb_clone(skb, GFP_ATOMIC);
+ if (!parse_skb) {
+ rcu_read_unlock();
+ return 0;
+ }
+ }
+ parse_skb->sk = psock->sk;
+ ret = bpf_prog_run_pin_on_cpu(prog, parse_skb);
+ parse_skb->sk = NULL;
+ if (parse_skb != skb)
+ consume_skb(parse_skb);
}
rcu_read_unlock();
return ret;
--
2.43.0
^ permalink raw reply related [flat|nested] 5+ messages in thread
end of thread, other threads:[~2026-06-18 2:45 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-06-12 12:35 [PATCH bpf v2] bpf, sockmap: fix use-after-free when the stream parser resizes the skb Sechang Lim
2026-06-17 22:36 ` Bobby Eshleman
2026-06-18 0:25 ` Kuniyuki Iwashima
2026-06-18 2:45 ` Jiayuan Chen
-- strict thread matches above, loose matches on Subject: below --
2026-06-12 11:34 Sechang Lim
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox