BPF List
 help / color / mirror / Atom feed
From: John Fastabend <john.fastabend@gmail.com>
To: zijianzhang@bytedance.com,  bpf@vger.kernel.org
Cc: edumazet@google.com,  john.fastabend@gmail.com,
	 jakub@cloudflare.com,  davem@davemloft.net,  dsahern@kernel.org,
	 kuba@kernel.org,  pabeni@redhat.com,  ast@kernel.org,
	 daniel@iogearbox.net,  andrii@kernel.org,  martin.lau@linux.dev,
	 eddyz87@gmail.com,  song@kernel.org,  yonghong.song@linux.dev,
	 kpsingh@kernel.org,  sdf@fomichev.me,  haoluo@google.com,
	 jolsa@kernel.org,  mykolal@fb.com,  shuah@kernel.org,
	 wangyufen@huawei.com,  xiyou.wangcong@gmail.com,
	 zijianzhang@bytedance.com
Subject: RE: [PATCH bpf 2/2] tcp_bpf: Fix the sk_mem_uncharge logic in tcp_bpf_sendmsg
Date: Wed, 20 Nov 2024 21:43:48 -0800	[thread overview]
Message-ID: <673ec8941b96d_157a20829@john.notmuch> (raw)
In-Reply-To: <20241016234838.3167769-3-zijianzhang@bytedance.com>

zijianzhang@ wrote:
> From: Zijian Zhang <zijianzhang@bytedance.com>
> 
> The current sk memory accounting logic in __SK_REDIRECT is pre-uncharging
> tosend bytes, which is either msg->sg.size or a smaller value apply_bytes.
> Potential problems with this strategy are as follows:
> - If the actual sent bytes are smaller than tosend, we need to charge some
> bytes back, as in line 487, which is okay but seems not clean.
> - When tosend is set to apply_bytes, as in line 417, and (ret < 0), we may
> miss uncharging (msg->sg.size - apply_bytes) bytes.
> 
> 415 tosend = msg->sg.size;
> 416 if (psock->apply_bytes && psock->apply_bytes < tosend)
> 417   tosend = psock->apply_bytes;
> ...
> 443 sk_msg_return(sk, msg, tosend);
> 444 release_sock(sk);
> 446 origsize = msg->sg.size;
> 447 ret = tcp_bpf_sendmsg_redir(sk_redir, redir_ingress,
> 448                             msg, tosend, flags);
> 449 sent = origsize - msg->sg.size;
> ...
> 454 lock_sock(sk);
> 455 if (unlikely(ret < 0)) {
> 456   int free = sk_msg_free_nocharge(sk, msg);
> 458   if (!cork)
> 459     *copied -= free;
> 460 }
> ...
> 487 if (eval == __SK_REDIRECT)
> 488   sk_mem_charge(sk, tosend - sent);
> 
> When running the selftest test_txmsg_redir_wait_sndmem with txmsg_apply,
> the following warning will be reported,
> ------------[ cut here ]------------
> WARNING: CPU: 6 PID: 57 at net/ipv4/af_inet.c:156 inet_sock_destruct+0x190/0x1a0
> Modules linked in:
> CPU: 6 UID: 0 PID: 57 Comm: kworker/6:0 Not tainted 6.12.0-rc1.bm.1-amd64+ #43
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
> Workqueue: events sk_psock_destroy
> RIP: 0010:inet_sock_destruct+0x190/0x1a0
> RSP: 0018:ffffad0a8021fe08 EFLAGS: 00010206
> RAX: 0000000000000011 RBX: ffff9aab4475b900 RCX: ffff9aab481a0800
> RDX: 0000000000000303 RSI: 0000000000000011 RDI: ffff9aab4475b900
> RBP: ffff9aab4475b990 R08: 0000000000000000 R09: ffff9aab40050ec0
> R10: 0000000000000000 R11: ffff9aae6fdb1d01 R12: ffff9aab49c60400
> R13: ffff9aab49c60598 R14: ffff9aab49c60598 R15: dead000000000100
> FS:  0000000000000000(0000) GS:ffff9aae6fd80000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00007ffec7e47bd8 CR3: 00000001a1a1c004 CR4: 0000000000770ef0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> PKRU: 55555554
> Call Trace:
> <TASK>
> ? __warn+0x89/0x130
> ? inet_sock_destruct+0x190/0x1a0
> ? report_bug+0xfc/0x1e0
> ? handle_bug+0x5c/0xa0
> ? exc_invalid_op+0x17/0x70
> ? asm_exc_invalid_op+0x1a/0x20
> ? inet_sock_destruct+0x190/0x1a0
> __sk_destruct+0x25/0x220
> sk_psock_destroy+0x2b2/0x310
> process_scheduled_works+0xa3/0x3e0
> worker_thread+0x117/0x240
> ? __pfx_worker_thread+0x10/0x10
> kthread+0xcf/0x100
> ? __pfx_kthread+0x10/0x10
> ret_from_fork+0x31/0x40
> ? __pfx_kthread+0x10/0x10
> ret_from_fork_asm+0x1a/0x30
> </TASK>
> ---[ end trace 0000000000000000 ]---
> 
> In __SK_REDIRECT, a more concise way is delaying the uncharging after sent
> bytes are finalized, and uncharge this value. When (ret < 0), we shall
> invoke sk_msg_free.
> 
> Same thing happens in case __SK_DROP, when tosend is set to apply_bytes,
> we may miss uncharging (msg->sg.size - apply_bytes) bytes. The same
> warning will be reported in selftest.
> 
> 468 case __SK_DROP:
> 469 default:
> 470 sk_msg_free_partial(sk, msg, tosend);
> 471 sk_msg_apply_bytes(psock, tosend);
> 472 *copied -= (tosend + delta);
> 473 return -EACCES;
> 
> So instead of sk_msg_free_partial we can do sk_msg_free here.
> 
> Fixes: 604326b41a6f ("bpf, sockmap: convert to generic sk_msg interface")
> Fixes: 8ec95b94716a ("bpf, sockmap: Fix the sk->sk_forward_alloc warning of sk_stream_kill_queues")
> Signed-off-by: Zijian Zhang <zijianzhang@bytedance.com>
> ---

Agree thanks.

Acked-by: John Fastabend <john.fastabend@gmail.com>

  reply	other threads:[~2024-11-21  5:43 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-10-16 23:48 [PATCH bpf 0/2] tcp_bpf: Fix the sk_mem_uncharge logic in tcp_bpf_sendmsg zijianzhang
2024-10-16 23:48 ` [PATCH bpf 1/2] selftests/bpf: Add apply_bytes test to test_txmsg_redir_wait_sndmem in test_sockmap zijianzhang
2024-11-21  5:38   ` John Fastabend
2024-10-16 23:48 ` [PATCH bpf 2/2] tcp_bpf: Fix the sk_mem_uncharge logic in tcp_bpf_sendmsg zijianzhang
2024-11-21  5:43   ` John Fastabend [this message]
2024-11-21  5:49 ` [PATCH bpf 0/2] " John Fastabend
2024-11-26 19:50 ` patchwork-bot+netdevbpf

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=673ec8941b96d_157a20829@john.notmuch \
    --to=john.fastabend@gmail.com \
    --cc=andrii@kernel.org \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=davem@davemloft.net \
    --cc=dsahern@kernel.org \
    --cc=eddyz87@gmail.com \
    --cc=edumazet@google.com \
    --cc=haoluo@google.com \
    --cc=jakub@cloudflare.com \
    --cc=jolsa@kernel.org \
    --cc=kpsingh@kernel.org \
    --cc=kuba@kernel.org \
    --cc=martin.lau@linux.dev \
    --cc=mykolal@fb.com \
    --cc=pabeni@redhat.com \
    --cc=sdf@fomichev.me \
    --cc=shuah@kernel.org \
    --cc=song@kernel.org \
    --cc=wangyufen@huawei.com \
    --cc=xiyou.wangcong@gmail.com \
    --cc=yonghong.song@linux.dev \
    --cc=zijianzhang@bytedance.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox