From: Martin KaFai Lau <martin.lau@linux.dev>
To: Kuniyuki Iwashima <kuniyu@google.com>
Cc: Alexei Starovoitov <ast@kernel.org>,
Andrii Nakryiko <andrii@kernel.org>,
Daniel Borkmann <daniel@iogearbox.net>,
John Fastabend <john.fastabend@gmail.com>,
Stanislav Fomichev <sdf@fomichev.me>,
Johannes Weiner <hannes@cmpxchg.org>,
Michal Hocko <mhocko@kernel.org>,
Roman Gushchin <roman.gushchin@linux.dev>,
Shakeel Butt <shakeel.butt@linux.dev>,
"David S. Miller" <davem@davemloft.net>,
Eric Dumazet <edumazet@google.com>,
Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
Neal Cardwell <ncardwell@google.com>,
Willem de Bruijn <willemb@google.com>,
Mina Almasry <almasrymina@google.com>,
Kuniyuki Iwashima <kuni1840@gmail.com>,
bpf@vger.kernel.org, netdev@vger.kernel.org
Subject: Re: [PATCH v4 bpf-next/net 5/5] selftest: bpf: Add test for SK_BPF_MEMCG_SOCK_ISOLATED.
Date: Wed, 3 Sep 2025 22:50:53 -0700
Message-ID: <40ed29b3-84d7-4812-890d-3676957d503f@linux.dev>
In-Reply-To: <CAAVpQUCyPPO1dfkkU4Hxz67JFcW6dhSfYnmUp0foNMYua_doyg@mail.gmail.com>
On 9/3/25 10:08 AM, Kuniyuki Iwashima wrote:
> On Wed, Sep 3, 2025 at 9:59 AM Kuniyuki Iwashima <kuniyu@google.com> wrote:
>>
>> On Tue, Sep 2, 2025 at 1:49 PM Kuniyuki Iwashima <kuniyu@google.com> wrote:
>>>
>>> On Tue, Sep 2, 2025 at 1:26 PM Martin KaFai Lau <martin.lau@linux.dev> wrote:
>>>>
>>>> On 8/28/25 6:00 PM, Kuniyuki Iwashima wrote:
>>>>> The test does the following for IPv4/IPv6 x TCP/UDP sockets
>>>>> with/without BPF prog.
>>>>>
>>>>> 1. Create socket pairs
>>>>> 2. Send a bunch of data that requires more than 256 pages
>>>>> 3. Read memory_allocated from the 3rd column in /proc/net/protocols
>>>>> 4. Check if unread data is charged to memory_allocated
>>>>>
>>>>> If BPF prog is attached, memory_allocated should not be changed,
>>>>> but we allow a small error (up to 10 pages) in case other processes
>>>>> on the host use some amounts of TCP/UDP memory.
>>>>>
>>>>> At 2., the test actually sends more than 1024 pages because the sysctl
>>>>> net.core.mem_pcpu_rsv is 256 by default, which means 256 pages are
>>>>> buffered per cpu before reporting to sk->sk_prot->memory_allocated.
>>>>>
>>>>> BUF_SINGLE (1024) * NR_SEND (64) * NR_SOCKETS (64) / 4096
>>>>> = 1024 pages
>>>>>
>>>>> When I reduced it to 512 pages, the following assertion for the
>>>>> non-isolated case got flaky.
>>>>>
>>>>> ASSERT_GT(memory_allocated[1], memory_allocated[0] + 256, ...)
>>>>>
>>>>> Another contributor to slowness is 150ms sleep to make sure 1 RCU
>>>>> grace period passes because UDP recv queue is destroyed after that.
>>>>
>>>> There is a kern_sync_rcu() in testing_helpers.c.
>>>
>>> Nice helper :) Will use it.
>>>
>>>>
>>>>>
>>>>> # time ./test_progs -t sk_memcg
>>>>> #370/1 sk_memcg/TCP :OK
>>>>> #370/2 sk_memcg/UDP :OK
>>>>> #370/3 sk_memcg/TCPv6 :OK
>>>>> #370/4 sk_memcg/UDPv6 :OK
>>>>> #370 sk_memcg:OK
>>>>> Summary: 1/4 PASSED, 0 SKIPPED, 0 FAILED
>>>>>
>>>>> real 0m1.214s
>>>>> user 0m0.014s
>>>>> sys 0m0.318s
>>>>
>>>> Thanks. It finished much faster in my setup also comparing with the earlier
>>>> revision. However, it is a bit flaky when I run it in a loop:
>>>>
>>>> check_isolated:FAIL:not isolated unexpected not isolated: actual 861 <= expected 861
>>>>
>>>> I usually can hit this at ~40-th iteration.
>>>
>>> Oh.. I tested ~10 times manually but will try in a tight loop.
>>
>> This didn't reproduce on my QEMU with/without --enable-kvm.
>>
>> Changing the assert from _GT to _GE will address the very case
>> above, but I'm not sure if it's enough.
>
> I doubled NR_SEND and it was still faster with kern_sync_rcu()
> than usleep(), so I'll simply double NR_SEND in v5
>
> # time ./test_progs -t sk_memcg
> ...
> Summary: 1/4 PASSED, 0 SKIPPED, 0 FAILED
> real 0m0.483s
> user 0m0.010s
> sys 0m0.191s
>
>
>>
>> Does the bpf CI run tests repeatedly or is this only a manual
>> scenario ?
I haven't seen bpf CI hit it yet; it shows up in my manual bash while loop. It
should not be dismissed so easily, though: some flaky CI tests were eventually
reproduced in a loop before and then fixed. This time I kept the bash loop
running until I could grep a "0" from the error output:

check_isolated:FAIL:not isolated unexpected not isolated: actual 0 <= expected 256

The "long memory_allocated[2]" values read from /proc/net/protocols are printed
as 0, but the underlying counter is probably negative:
static inline long
proto_memory_allocated(const struct proto *prot)
{
	return max(0L, atomic_long_read(prot->memory_allocated));
}
afaict, prot->memory_allocated can be negative, but it is printed as 0 in
/proc/net/protocols. Even if the machine is network-quiet after test_progs has
started, "prot->memory_allocated" and "proto->per_cpu_fw_alloc" could be in
some random state before test_progs starts. Once I hit "0", it takes some
effort (sending random traffic to the machine) to get the test working again. :(
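To illustrate why a printed 0 is ambiguous, here is a minimal userspace sketch,
assuming the clamp behaves like the max(0L, ...) in proto_memory_allocated().
clamp_allocated() and observed_delta() are made-up names for illustration, not
kernel functions:

```c
/* Hypothetical userspace stand-in for the max(0L, ...) clamp applied
 * before the counter is printed.
 */
static long clamp_allocated(long raw)
{
	return raw > 0 ? raw : 0;
}

/* Delta a test would compute from two clamped readings, given the raw
 * (possibly negative) counter values before and after sending data.
 */
static long observed_delta(long raw_before, long raw_after)
{
	return clamp_allocated(raw_after) - clamp_allocated(raw_before);
}
```

With a raw counter of -300 before and -44 after, the real change is 256 pages
but observed_delta(-300, -44) is 0, because both readings are clamped.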
Also, after reading the selftest more closely, I am not sure I understand why
"+ 256" is used. "proto->per_cpu_fw_alloc" can start at -255 or at +255.
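A rough single-CPU model of the per-cpu reserve shows how much the starting
state matters. This is only a sketch under assumptions: one CPU, pages charged
one at a time, and the whole reserve folded into the global counter once it
reaches the threshold of 256 (the mem_pcpu_rsv default quoted above). The
struct and function names are hypothetical, not the kernel's:

```c
#define MEM_PCPU_RSV 256	/* assumed default of net.core.mem_pcpu_rsv */

/* Hypothetical model of one protocol's accounting state. */
struct proto_model {
	long memory_allocated;	/* global counter */
	int per_cpu_fw_alloc;	/* reserve of a single CPU */
};

/* Charge pages one at a time; once the reserve reaches the threshold,
 * fold it into the global counter and reset it (assumed drain behavior).
 */
static void model_charge(struct proto_model *p, int pages)
{
	for (int i = 0; i < pages; i++) {
		p->per_cpu_fw_alloc++;
		if (p->per_cpu_fw_alloc >= MEM_PCPU_RSV) {
			p->memory_allocated += p->per_cpu_fw_alloc;
			p->per_cpu_fw_alloc = 0;
		}
	}
}

/* Change of the global counter after charging `pages`, starting from a
 * given reserve value.
 */
static long model_delta(int initial_fw_alloc, int pages)
{
	struct proto_model p = {
		.memory_allocated = 0,
		.per_cpu_fw_alloc = initial_fw_alloc,
	};

	model_charge(&p, pages);
	return p.memory_allocated;
}
```

In this model, model_delta(-255, 512) is 256 while model_delta(255, 512) is 512,
so the same 512-page charge may or may not exceed "before + 256" depending only
on where the reserve started. Real systems have more CPUs and larger charge
steps, so the exact numbers will differ.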
I don't think changing NR_SEND helps here. It needs a better way. Maybe some
function can be traced so that prot->memory_allocated can be read directly?
If the fentry and fexit of that function see different memory_allocated values,
the test could also become more straightforward.