netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH net v5] skmsg: skip zero length skb in sk_msg_recvmsg
@ 2024-07-03  8:39 Geliang Tang
  2024-07-09  0:33 ` John Fastabend
  2024-07-09  8:40 ` patchwork-bot+netdevbpf
  0 siblings, 2 replies; 3+ messages in thread
From: Geliang Tang @ 2024-07-03  8:39 UTC (permalink / raw)
  To: John Fastabend, Jakub Sitnicki, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Alexei Starovoitov, Daniel Borkmann
  Cc: Geliang Tang, David Ahern, Eduard Zingerman, Mykola Lysenko,
	Martin KaFai Lau, Song Liu, Yonghong Song, KP Singh,
	Stanislav Fomichev, Hao Luo, Jiri Olsa, Shuah Khan,
	Mykyta Yatsenko, Miao Xu, Yuran Pereira, Huacai Chen, Tiezhu Yang,
	D . Wythe, netdev, bpf, linux-kselftest

From: Geliang Tang <tanggeliang@kylinos.cn>

Run this BPF selftests (./test_progs -t sockmap_basic) on a Loongarch
platform, a kernel panic occurs:

'''
Oops[#1]:
CPU: 22 PID: 2824 Comm: test_progs Tainted: G           OE  6.10.0-rc2+ #18
Hardware name: LOONGSON Dabieshan/Loongson-TC542F0, BIOS Loongson-UDK2018
   ... ...
   ra: 90000000048bf6c0 sk_msg_recvmsg+0x120/0x560
  ERA: 9000000004162774 copy_page_to_iter+0x74/0x1c0
 CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE)
 PRMD: 0000000c (PPLV0 +PIE +PWE)
 EUEN: 00000007 (+FPE +SXE +ASXE -BTE)
 ECFG: 00071c1d (LIE=0,2-4,10-12 VS=7)
ESTAT: 00010000 [PIL] (IS= ECode=1 EsubCode=0)
 BADV: 0000000000000040
 PRID: 0014c011 (Loongson-64bit, Loongson-3C5000)
Modules linked in: bpf_testmod(OE) xt_CHECKSUM xt_MASQUERADE xt_conntrack
Process test_progs (pid: 2824, threadinfo=0000000000863a31, task=...)
Stack : ...
        ...
Call Trace:
[<9000000004162774>] copy_page_to_iter+0x74/0x1c0
[<90000000048bf6c0>] sk_msg_recvmsg+0x120/0x560
[<90000000049f2b90>] tcp_bpf_recvmsg_parser+0x170/0x4e0
[<90000000049aae34>] inet_recvmsg+0x54/0x100
[<900000000481ad5c>] sock_recvmsg+0x7c/0xe0
[<900000000481e1a8>] __sys_recvfrom+0x108/0x1c0
[<900000000481e27c>] sys_recvfrom+0x1c/0x40
[<9000000004c076ec>] do_syscall+0x8c/0xc0
[<9000000003731da4>] handle_syscall+0xc4/0x160

Code: ...

---[ end trace 0000000000000000 ]---
Kernel panic - not syncing: Fatal exception
Kernel relocated by 0x3510000
 .text @ 0x9000000003710000
 .data @ 0x9000000004d70000
 .bss  @ 0x9000000006469400
---[ end Kernel panic - not syncing: Fatal exception ]---
'''

This crash happens every time when running sockmap_skb_verdict_shutdown
subtest in sockmap_basic.

This crash is because a NULL pointer is passed to page_address() in
sk_msg_recvmsg(). Due to the difference implementations depending on the
architecture, page_address(NULL) will trigger a panic on Loongarch
platform but not on X86 platform. So this bug was hidden on X86 platform
for a while, but now it is exposed on Loongarch platform.

The root cause is a zero length skb (skb->len == 0) is put on the queue.

This zero length skb is a TCP FIN packet, which is sent by shutdown(),
invoked in test_sockmap_skb_verdict_shutdown():

	shutdown(p1, SHUT_WR);

In this case, in sk_psock_skb_ingress_enqueue(), num_sge is zero, and no
page is put to this sge (see sg_set_page in sg_set_page), but this empty
sge is queued into ingress_msg list.

And in sk_msg_recvmsg(), this empty sge is used, and a NULL page is got by
sg_page(sge). Pass this NULL page to copy_page_to_iter(), which passes it
to kmap_local_page() and to page_address(), then kernel panics.

To solve this, we should skip this zero length skb. So in sk_msg_recvmsg(),
if copy is zero, that means it's a zero length skb, skip invoking
copy_page_to_iter(). We are using the EFAULT return triggered by
copy_page_to_iter to check for is_fin in tcp_bpf.c.

Fixes: 604326b41a6f ("bpf, sockmap: convert to generic sk_msg interface")
Suggested-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
---
v5:
 - update v5 as John suggested.
 - skmsg: skip zero length skb in sk_msg_recvmsg

v4:
 - skmsg: skip empty sge in sk_msg_recvmsg

v3:
 - skmsg: prevent empty ingress skb from enqueuing
   
v2:
 - skmsg: null check for sg_page in sk_msg_recvmsg
---
 net/core/skmsg.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/net/core/skmsg.c b/net/core/skmsg.c
index fd20aae30be2..bbf40b999713 100644
--- a/net/core/skmsg.c
+++ b/net/core/skmsg.c
@@ -434,7 +434,8 @@ int sk_msg_recvmsg(struct sock *sk, struct sk_psock *psock, struct msghdr *msg,
 			page = sg_page(sge);
 			if (copied + copy > len)
 				copy = len - copied;
-			copy = copy_page_to_iter(page, sge->offset, copy, iter);
+			if (copy)
+				copy = copy_page_to_iter(page, sge->offset, copy, iter);
 			if (!copy) {
 				copied = copied ? copied : -EFAULT;
 				goto out;
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 3+ messages in thread

* RE: [PATCH net v5] skmsg: skip zero length skb in sk_msg_recvmsg
  2024-07-03  8:39 [PATCH net v5] skmsg: skip zero length skb in sk_msg_recvmsg Geliang Tang
@ 2024-07-09  0:33 ` John Fastabend
  2024-07-09  8:40 ` patchwork-bot+netdevbpf
  1 sibling, 0 replies; 3+ messages in thread
From: John Fastabend @ 2024-07-09  0:33 UTC (permalink / raw)
  To: Geliang Tang, John Fastabend, Jakub Sitnicki, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Alexei Starovoitov,
	Daniel Borkmann
  Cc: Geliang Tang, David Ahern, Eduard Zingerman, Mykola Lysenko,
	Martin KaFai Lau, Song Liu, Yonghong Song, KP Singh,
	Stanislav Fomichev, Hao Luo, Jiri Olsa, Shuah Khan,
	Mykyta Yatsenko, Miao Xu, Yuran Pereira, Huacai Chen, Tiezhu Yang,
	D . Wythe, netdev, bpf, linux-kselftest

Geliang Tang wrote:
> From: Geliang Tang <tanggeliang@kylinos.cn>
> 
> Run this BPF selftests (./test_progs -t sockmap_basic) on a Loongarch
> platform, a kernel panic occurs:
> 
> '''
> Oops[#1]:
> CPU: 22 PID: 2824 Comm: test_progs Tainted: G           OE  6.10.0-rc2+ #18
> Hardware name: LOONGSON Dabieshan/Loongson-TC542F0, BIOS Loongson-UDK2018
>    ... ...
>    ra: 90000000048bf6c0 sk_msg_recvmsg+0x120/0x560
>   ERA: 9000000004162774 copy_page_to_iter+0x74/0x1c0
>  CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE)
>  PRMD: 0000000c (PPLV0 +PIE +PWE)
>  EUEN: 00000007 (+FPE +SXE +ASXE -BTE)
>  ECFG: 00071c1d (LIE=0,2-4,10-12 VS=7)
> ESTAT: 00010000 [PIL] (IS= ECode=1 EsubCode=0)
>  BADV: 0000000000000040
>  PRID: 0014c011 (Loongson-64bit, Loongson-3C5000)
> Modules linked in: bpf_testmod(OE) xt_CHECKSUM xt_MASQUERADE xt_conntrack
> Process test_progs (pid: 2824, threadinfo=0000000000863a31, task=...)
> Stack : ...
>         ...
> Call Trace:
> [<9000000004162774>] copy_page_to_iter+0x74/0x1c0
> [<90000000048bf6c0>] sk_msg_recvmsg+0x120/0x560
> [<90000000049f2b90>] tcp_bpf_recvmsg_parser+0x170/0x4e0
> [<90000000049aae34>] inet_recvmsg+0x54/0x100
> [<900000000481ad5c>] sock_recvmsg+0x7c/0xe0
> [<900000000481e1a8>] __sys_recvfrom+0x108/0x1c0
> [<900000000481e27c>] sys_recvfrom+0x1c/0x40
> [<9000000004c076ec>] do_syscall+0x8c/0xc0
> [<9000000003731da4>] handle_syscall+0xc4/0x160
> 
> Code: ...
> 
> ---[ end trace 0000000000000000 ]---
> Kernel panic - not syncing: Fatal exception
> Kernel relocated by 0x3510000
>  .text @ 0x9000000003710000
>  .data @ 0x9000000004d70000
>  .bss  @ 0x9000000006469400
> ---[ end Kernel panic - not syncing: Fatal exception ]---
> '''
> 
> This crash happens every time when running sockmap_skb_verdict_shutdown
> subtest in sockmap_basic.
> 
> This crash is because a NULL pointer is passed to page_address() in
> sk_msg_recvmsg(). Due to the difference implementations depending on the
> architecture, page_address(NULL) will trigger a panic on Loongarch
> platform but not on X86 platform. So this bug was hidden on X86 platform
> for a while, but now it is exposed on Loongarch platform.
> 
> The root cause is a zero length skb (skb->len == 0) is put on the queue.
> 
> This zero length skb is a TCP FIN packet, which is sent by shutdown(),
> invoked in test_sockmap_skb_verdict_shutdown():
> 
> 	shutdown(p1, SHUT_WR);
> 
> In this case, in sk_psock_skb_ingress_enqueue(), num_sge is zero, and no
> page is put to this sge (see sg_set_page in sg_set_page), but this empty
> sge is queued into ingress_msg list.
> 
> And in sk_msg_recvmsg(), this empty sge is used, and a NULL page is got by
> sg_page(sge). Pass this NULL page to copy_page_to_iter(), which passes it
> to kmap_local_page() and to page_address(), then kernel panics.
> 
> To solve this, we should skip this zero length skb. So in sk_msg_recvmsg(),
> if copy is zero, that means it's a zero length skb, skip invoking
> copy_page_to_iter(). We are using the EFAULT return triggered by
> copy_page_to_iter to check for is_fin in tcp_bpf.c.
> 
> Fixes: 604326b41a6f ("bpf, sockmap: convert to generic sk_msg interface")
> Suggested-by: John Fastabend <john.fastabend@gmail.com>
> Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
> ---
> v5:
>  - update v5 as John suggested.
>  - skmsg: skip zero length skb in sk_msg_recvmsg
> 
> v4:
>  - skmsg: skip empty sge in sk_msg_recvmsg
> 
> v3:
>  - skmsg: prevent empty ingress skb from enqueuing
>    
> v2:
>  - skmsg: null check for sg_page in sk_msg_recvmsg
> ---
>  net/core/skmsg.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)

I don't have any better ideas so lets use this.

Reviewed-by: John Fastabend <john.fastabend@gmail.com>

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH net v5] skmsg: skip zero length skb in sk_msg_recvmsg
  2024-07-03  8:39 [PATCH net v5] skmsg: skip zero length skb in sk_msg_recvmsg Geliang Tang
  2024-07-09  0:33 ` John Fastabend
@ 2024-07-09  8:40 ` patchwork-bot+netdevbpf
  1 sibling, 0 replies; 3+ messages in thread
From: patchwork-bot+netdevbpf @ 2024-07-09  8:40 UTC (permalink / raw)
  To: Geliang Tang
  Cc: john.fastabend, jakub, davem, edumazet, kuba, pabeni, ast, daniel,
	tanggeliang, dsahern, eddyz87, mykolal, martin.lau, song,
	yonghong.song, kpsingh, sdf, haoluo, jolsa, shuah, yatsenko,
	miaxu, yuran.pereira, chenhuacai, yangtiezhu, alibuda, netdev,
	bpf, linux-kselftest

Hello:

This patch was applied to bpf/bpf.git (master)
by Daniel Borkmann <daniel@iogearbox.net>:

On Wed,  3 Jul 2024 16:39:31 +0800 you wrote:
> From: Geliang Tang <tanggeliang@kylinos.cn>
> 
> Run this BPF selftests (./test_progs -t sockmap_basic) on a Loongarch
> platform, a kernel panic occurs:
> 
> '''
> Oops[#1]:
> CPU: 22 PID: 2824 Comm: test_progs Tainted: G           OE  6.10.0-rc2+ #18
> Hardware name: LOONGSON Dabieshan/Loongson-TC542F0, BIOS Loongson-UDK2018
>    ... ...
>    ra: 90000000048bf6c0 sk_msg_recvmsg+0x120/0x560
>   ERA: 9000000004162774 copy_page_to_iter+0x74/0x1c0
>  CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE)
>  PRMD: 0000000c (PPLV0 +PIE +PWE)
>  EUEN: 00000007 (+FPE +SXE +ASXE -BTE)
>  ECFG: 00071c1d (LIE=0,2-4,10-12 VS=7)
> ESTAT: 00010000 [PIL] (IS= ECode=1 EsubCode=0)
>  BADV: 0000000000000040
>  PRID: 0014c011 (Loongson-64bit, Loongson-3C5000)
> Modules linked in: bpf_testmod(OE) xt_CHECKSUM xt_MASQUERADE xt_conntrack
> Process test_progs (pid: 2824, threadinfo=0000000000863a31, task=...)
> Stack : ...
>         ...
> Call Trace:
> [<9000000004162774>] copy_page_to_iter+0x74/0x1c0
> [<90000000048bf6c0>] sk_msg_recvmsg+0x120/0x560
> [<90000000049f2b90>] tcp_bpf_recvmsg_parser+0x170/0x4e0
> [<90000000049aae34>] inet_recvmsg+0x54/0x100
> [<900000000481ad5c>] sock_recvmsg+0x7c/0xe0
> [<900000000481e1a8>] __sys_recvfrom+0x108/0x1c0
> [<900000000481e27c>] sys_recvfrom+0x1c/0x40
> [<9000000004c076ec>] do_syscall+0x8c/0xc0
> [<9000000003731da4>] handle_syscall+0xc4/0x160
> 
> [...]

Here is the summary with links:
  - [net,v5] skmsg: skip zero length skb in sk_msg_recvmsg
    https://git.kernel.org/bpf/bpf/c/f0c180256937

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html



^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2024-07-09  8:40 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-07-03  8:39 [PATCH net v5] skmsg: skip zero length skb in sk_msg_recvmsg Geliang Tang
2024-07-09  0:33 ` John Fastabend
2024-07-09  8:40 ` patchwork-bot+netdevbpf

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).