* [PATCH net 0/2] fix OOM and order check in msg_zerocopy selftest
@ 2024-07-01 22:53 zijianzhang
2024-07-01 22:53 ` [PATCH net 1/2] selftests: fix OOM " zijianzhang
` (2 more replies)
0 siblings, 3 replies; 10+ messages in thread
From: zijianzhang @ 2024-07-01 22:53 UTC (permalink / raw)
To: netdev; +Cc: willemdebruijn.kernel, cong.wang, xiaochun.lu, Zijian Zhang
From: Zijian Zhang <zijianzhang@bytedance.com>
In selftests/net/msg_zerocopy.c, it has a while loop keeps calling sendmsg
on a socket with MSG_ZEROCOPY flag, and it will recv the notifications
until the socket is not writable. Typically, it will start the receiving
process after around 30+ sendmsgs. However, as the introduction of commit
dfa2f0483360 ("tcp: get rid of sysctl_tcp_adv_win_scale"), the sender is
always writable and does not get any chance to run recv notifications.
The selftest always exits with OUT_OF_MEMORY because the memory used by
opt_skb exceeds the net.core.optmem_max. Meanwhile, it could be set to a
different value to trigger OOM on older kernels too.
Thus, we introduce "cfg_notification_limit" to force sender to receive
notifications after some number of sendmsgs.
And, we find that when lock debugging is on, notifications may not come in
order. Thus, we have order checking outputs managed by cfg_verbose, to
avoid too many outputs in this case.
Zijian Zhang (2):
selftests: fix OOM in msg_zerocopy selftest
selftests: make order checking verbose in msg_zerocopy selftest
tools/testing/selftests/net/msg_zerocopy.c | 14 ++++++++++++--
1 file changed, 12 insertions(+), 2 deletions(-)
--
2.20.1
^ permalink raw reply [flat|nested] 10+ messages in thread
* [PATCH net 1/2] selftests: fix OOM in msg_zerocopy selftest
2024-07-01 22:53 [PATCH net 0/2] fix OOM and order check in msg_zerocopy selftest zijianzhang
@ 2024-07-01 22:53 ` zijianzhang
2024-07-02 13:17 ` Willem de Bruijn
2024-07-04 1:50 ` Jakub Kicinski
2024-07-01 22:53 ` [PATCH net 2/2] selftests: make order checking verbose " zijianzhang
2024-07-04 2:50 ` [PATCH net 0/2] fix OOM and order check " patchwork-bot+netdevbpf
2 siblings, 2 replies; 10+ messages in thread
From: zijianzhang @ 2024-07-01 22:53 UTC (permalink / raw)
To: netdev; +Cc: willemdebruijn.kernel, cong.wang, xiaochun.lu, Zijian Zhang
From: Zijian Zhang <zijianzhang@bytedance.com>
In selftests/net/msg_zerocopy.c, it has a while loop keeps calling sendmsg
on a socket with MSG_ZEROCOPY flag, and it will recv the notifications
until the socket is not writable. Typically, it will start the receiving
process after around 30+ sendmsgs. However, as the introduction of commit
dfa2f0483360 ("tcp: get rid of sysctl_tcp_adv_win_scale"), the sender is
always writable and does not get any chance to run recv notifications.
The selftest always exits with OUT_OF_MEMORY because the memory used by
opt_skb exceeds the net.core.optmem_max. Meanwhile, it could be set to a
different value to trigger OOM on older kernels too.
Thus, we introduce "cfg_notification_limit" to force sender to receive
notifications after some number of sendmsgs.
Fixes: 07b65c5b31ce ("test: add msg_zerocopy test")
Signed-off-by: Zijian Zhang <zijianzhang@bytedance.com>
Signed-off-by: Xiaochun Lu <xiaochun.lu@bytedance.com>
---
tools/testing/selftests/net/msg_zerocopy.c | 12 +++++++++++-
1 file changed, 11 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/net/msg_zerocopy.c b/tools/testing/selftests/net/msg_zerocopy.c
index bdc03a2097e8..926556febc83 100644
--- a/tools/testing/selftests/net/msg_zerocopy.c
+++ b/tools/testing/selftests/net/msg_zerocopy.c
@@ -85,6 +85,7 @@ static bool cfg_rx;
static int cfg_runtime_ms = 4200;
static int cfg_verbose;
static int cfg_waittime_ms = 500;
+static int cfg_notification_limit = 32;
static bool cfg_zerocopy;
static socklen_t cfg_alen;
@@ -95,6 +96,7 @@ static char payload[IP_MAXPACKET];
static long packets, bytes, completions, expected_completions;
static int zerocopied = -1;
static uint32_t next_completion;
+static uint32_t sends_since_notify;
static unsigned long gettimeofday_ms(void)
{
@@ -208,6 +210,7 @@ static bool do_sendmsg(int fd, struct msghdr *msg, bool do_zerocopy, int domain)
error(1, errno, "send");
if (cfg_verbose && ret != len)
fprintf(stderr, "send: ret=%u != %u\n", ret, len);
+ sends_since_notify++;
if (len) {
packets++;
@@ -460,6 +463,7 @@ static bool do_recv_completion(int fd, int domain)
static void do_recv_completions(int fd, int domain)
{
while (do_recv_completion(fd, domain)) {}
+ sends_since_notify = 0;
}
/* Wait for all remaining completions on the errqueue */
@@ -549,6 +553,9 @@ static void do_tx(int domain, int type, int protocol)
else
do_sendmsg(fd, &msg, cfg_zerocopy, domain);
+ if (cfg_zerocopy && sends_since_notify >= cfg_notification_limit)
+ do_recv_completions(fd, domain);
+
while (!do_poll(fd, POLLOUT)) {
if (cfg_zerocopy)
do_recv_completions(fd, domain);
@@ -708,7 +715,7 @@ static void parse_opts(int argc, char **argv)
cfg_payload_len = max_payload_len;
- while ((c = getopt(argc, argv, "46c:C:D:i:mp:rs:S:t:vz")) != -1) {
+ while ((c = getopt(argc, argv, "46c:C:D:i:l:mp:rs:S:t:vz")) != -1) {
switch (c) {
case '4':
if (cfg_family != PF_UNSPEC)
@@ -736,6 +743,9 @@ static void parse_opts(int argc, char **argv)
if (cfg_ifindex == 0)
error(1, errno, "invalid iface: %s", optarg);
break;
+ case 'l':
+ cfg_notification_limit = strtoul(optarg, NULL, 0);
+ break;
case 'm':
cfg_cork_mixed = true;
break;
--
2.20.1
^ permalink raw reply related [flat|nested] 10+ messages in thread
* [PATCH net 2/2] selftests: make order checking verbose in msg_zerocopy selftest
2024-07-01 22:53 [PATCH net 0/2] fix OOM and order check in msg_zerocopy selftest zijianzhang
2024-07-01 22:53 ` [PATCH net 1/2] selftests: fix OOM " zijianzhang
@ 2024-07-01 22:53 ` zijianzhang
2024-07-02 13:18 ` Willem de Bruijn
2024-07-04 2:50 ` [PATCH net 0/2] fix OOM and order check " patchwork-bot+netdevbpf
2 siblings, 1 reply; 10+ messages in thread
From: zijianzhang @ 2024-07-01 22:53 UTC (permalink / raw)
To: netdev; +Cc: willemdebruijn.kernel, cong.wang, xiaochun.lu, Zijian Zhang
From: Zijian Zhang <zijianzhang@bytedance.com>
We find that when lock debugging is on, notifications may not come in
order. Thus, we have order checking outputs managed by cfg_verbose, to
avoid too many outputs in this case.
Fixes: 07b65c5b31ce ("test: add msg_zerocopy test")
Signed-off-by: Zijian Zhang <zijianzhang@bytedance.com>
Signed-off-by: Xiaochun Lu <xiaochun.lu@bytedance.com>
---
tools/testing/selftests/net/msg_zerocopy.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/selftests/net/msg_zerocopy.c b/tools/testing/selftests/net/msg_zerocopy.c
index 926556febc83..7ea5fb28c93d 100644
--- a/tools/testing/selftests/net/msg_zerocopy.c
+++ b/tools/testing/selftests/net/msg_zerocopy.c
@@ -438,7 +438,7 @@ static bool do_recv_completion(int fd, int domain)
/* Detect notification gaps. These should not happen often, if at all.
* Gaps can occur due to drops, reordering and retransmissions.
*/
- if (lo != next_completion)
+ if (cfg_verbose && lo != next_completion)
fprintf(stderr, "gap: %u..%u does not append to %u\n",
lo, hi, next_completion);
next_completion = hi + 1;
--
2.20.1
^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [PATCH net 1/2] selftests: fix OOM in msg_zerocopy selftest
2024-07-01 22:53 ` [PATCH net 1/2] selftests: fix OOM " zijianzhang
@ 2024-07-02 13:17 ` Willem de Bruijn
2024-07-04 1:50 ` Jakub Kicinski
1 sibling, 0 replies; 10+ messages in thread
From: Willem de Bruijn @ 2024-07-02 13:17 UTC (permalink / raw)
To: zijianzhang, netdev
Cc: willemdebruijn.kernel, cong.wang, xiaochun.lu, Zijian Zhang
zijianzhang@ wrote:
> From: Zijian Zhang <zijianzhang@bytedance.com>
>
> In selftests/net/msg_zerocopy.c, it has a while loop keeps calling sendmsg
> on a socket with MSG_ZEROCOPY flag, and it will recv the notifications
> until the socket is not writable. Typically, it will start the receiving
> process after around 30+ sendmsgs. However, as the introduction of commit
> dfa2f0483360 ("tcp: get rid of sysctl_tcp_adv_win_scale"), the sender is
> always writable and does not get any chance to run recv notifications.
> The selftest always exits with OUT_OF_MEMORY because the memory used by
> opt_skb exceeds the net.core.optmem_max. Meanwhile, it could be set to a
> different value to trigger OOM on older kernels too.
>
> Thus, we introduce "cfg_notification_limit" to force sender to receive
> notifications after some number of sendmsgs.
>
> Fixes: 07b65c5b31ce ("test: add msg_zerocopy test")
> Signed-off-by: Zijian Zhang <zijianzhang@bytedance.com>
> Signed-off-by: Xiaochun Lu <xiaochun.lu@bytedance.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH net 2/2] selftests: make order checking verbose in msg_zerocopy selftest
2024-07-01 22:53 ` [PATCH net 2/2] selftests: make order checking verbose " zijianzhang
@ 2024-07-02 13:18 ` Willem de Bruijn
2024-07-02 18:05 ` Zijian Zhang
0 siblings, 1 reply; 10+ messages in thread
From: Willem de Bruijn @ 2024-07-02 13:18 UTC (permalink / raw)
To: zijianzhang, netdev
Cc: willemdebruijn.kernel, cong.wang, xiaochun.lu, Zijian Zhang
zijianzhang@ wrote:
> From: Zijian Zhang <zijianzhang@bytedance.com>
>
> We find that when lock debugging is on, notifications may not come in
> order. Thus, we have order checking outputs managed by cfg_verbose, to
> avoid too many outputs in this case.
>
> Fixes: 07b65c5b31ce ("test: add msg_zerocopy test")
> Signed-off-by: Zijian Zhang <zijianzhang@bytedance.com>
> Signed-off-by: Xiaochun Lu <xiaochun.lu@bytedance.com>
Why did you split this trivial change out? Anyway..
Reviewed-by: Willem de Bruijn <willemb@google.com>
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH net 2/2] selftests: make order checking verbose in msg_zerocopy selftest
2024-07-02 13:18 ` Willem de Bruijn
@ 2024-07-02 18:05 ` Zijian Zhang
0 siblings, 0 replies; 10+ messages in thread
From: Zijian Zhang @ 2024-07-02 18:05 UTC (permalink / raw)
To: Willem de Bruijn, netdev; +Cc: cong.wang, xiaochun.lu
On 7/2/24 6:18 AM, Willem de Bruijn wrote:
> zijianzhang@ wrote:
>> From: Zijian Zhang <zijianzhang@bytedance.com>
>>
>> We find that when lock debugging is on, notifications may not come in
>> order. Thus, we have order checking outputs managed by cfg_verbose, to
>> avoid too many outputs in this case.
>>
>> Fixes: 07b65c5b31ce ("test: add msg_zerocopy test")
>> Signed-off-by: Zijian Zhang <zijianzhang@bytedance.com>
>> Signed-off-by: Xiaochun Lu <xiaochun.lu@bytedance.com>
>
> Why did you split this trivial change out? Anyway..
>
> Reviewed-by: Willem de Bruijn <willemb@google.com>
>
>
I am also conflicted, this change is trivial, but I think it's unrelated
to OOM, thus split it out. Sorry for the confusion, I'll be careful next
time. Thanks for the reviewing and quick reply :)
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH net 1/2] selftests: fix OOM in msg_zerocopy selftest
2024-07-01 22:53 ` [PATCH net 1/2] selftests: fix OOM " zijianzhang
2024-07-02 13:17 ` Willem de Bruijn
@ 2024-07-04 1:50 ` Jakub Kicinski
2024-07-04 2:32 ` Zijian Zhang
1 sibling, 1 reply; 10+ messages in thread
From: Jakub Kicinski @ 2024-07-04 1:50 UTC (permalink / raw)
To: zijianzhang; +Cc: netdev, willemdebruijn.kernel, cong.wang, xiaochun.lu
On Mon, 1 Jul 2024 22:53:48 +0000 zijianzhang@bytedance.com wrote:
> In selftests/net/msg_zerocopy.c, it has a while loop keeps calling sendmsg
> on a socket with MSG_ZEROCOPY flag, and it will recv the notifications
> until the socket is not writable. Typically, it will start the receiving
> process after around 30+ sendmsgs. However, as the introduction of commit
> dfa2f0483360 ("tcp: get rid of sysctl_tcp_adv_win_scale"), the sender is
> always writable and does not get any chance to run recv notifications.
> The selftest always exits with OUT_OF_MEMORY because the memory used by
> opt_skb exceeds the net.core.optmem_max. Meanwhile, it could be set to a
> different value to trigger OOM on older kernels too.
This test doesn't fail in netdev CI. Is the problem fix in net-next
somehow? Or the "always exits with OUT_OF_MEMORY" is an exaggerations?
(TBH I'm not even sure what it means to "exit with OUT_OF_MEMORY" in
this context.)
TAP version 13
1..1
# timeout set to 3600
# selftests: net: msg_zerocopy.sh
# ipv4 tcp -t 1
# tx=164425 (10260 MB) txc=0 zc=n
# rx=59526 (10260 MB)
# ipv4 tcp -z -t 1
# tx=111332 (6947 MB) txc=111332 zc=n
# rx=55245 (6947 MB)
# ok
# ipv6 tcp -t 1
# tx=168140 (10492 MB) txc=0 zc=n
# rx=64633 (10492 MB)
# ipv6 tcp -z -t 1
# tx=108388 (6763 MB) txc=108388 zc=n
# rx=54146 (6763 MB)
# ok
# ipv4 udp -t 1
# tx=173970 (10856 MB) txc=0 zc=n
# rx=173936 (10854 MB)
# ipv4 udp -z -t 1
# tx=117728 (7346 MB) txc=117728 zc=n
# rx=117703 (7345 MB)
# ok
# ipv6 udp -t 1
# tx=225502 (14072 MB) txc=0 zc=n
# rx=225502 (14072 MB)
# ipv6 udp -z -t 1
# tx=111215 (6940 MB) txc=111215 zc=n
# rx=111212 (6940 MB)
# ok
# OK. All tests passed
ok 1 selftests: net: msg_zerocopy.sh
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH net 1/2] selftests: fix OOM in msg_zerocopy selftest
2024-07-04 1:50 ` Jakub Kicinski
@ 2024-07-04 2:32 ` Zijian Zhang
2024-07-04 2:42 ` Jakub Kicinski
0 siblings, 1 reply; 10+ messages in thread
From: Zijian Zhang @ 2024-07-04 2:32 UTC (permalink / raw)
To: Jakub Kicinski; +Cc: netdev, willemdebruijn.kernel, cong.wang, xiaochun.lu
On 7/3/24 6:50 PM, Jakub Kicinski wrote:
> On Mon, 1 Jul 2024 22:53:48 +0000 zijianzhang@bytedance.com wrote:
>> In selftests/net/msg_zerocopy.c, it has a while loop keeps calling sendmsg
>> on a socket with MSG_ZEROCOPY flag, and it will recv the notifications
>> until the socket is not writable. Typically, it will start the receiving
>> process after around 30+ sendmsgs. However, as the introduction of commit
>> dfa2f0483360 ("tcp: get rid of sysctl_tcp_adv_win_scale"), the sender is
>> always writable and does not get any chance to run recv notifications.
>> The selftest always exits with OUT_OF_MEMORY because the memory used by
>> opt_skb exceeds the net.core.optmem_max. Meanwhile, it could be set to a
>> different value to trigger OOM on older kernels too.
>
> This test doesn't fail in netdev CI. Is the problem fix in net-next
> somehow? Or the "always exits with OUT_OF_MEMORY" is an exaggerations?
> (TBH I'm not even sure what it means to "exit with OUT_OF_MEMORY" in
> this context.)
>
The reason why this test doesn't fail in CI:
According to the test output,
# ipv4 tcp -z -t 1
# tx=111332 (6947 MB) txc=111332 zc=n
zerocopy is false here.
This is because of some limitation of zerocopy in localhost.
Specifically, the subsection "Notification Latency" in the sendmsg
zerocopy the paper.
In order to make "zc=y", we may need to update skb_orphan_frags_rx to
the same as skb_orphan_frags, recompile the kernel, and run the test.
By OUT_OF_MEMORY I mean:
Each calling of sendmsg with zerocopy will allocate an skb with
sock_omalloc. If users never recv the notifications but keep calling
sendmsg with zerocopy. The send system call will finally return with
-ENOMEM.
I hope this clarifies your confusion :)
> TAP version 13
> 1..1
> # timeout set to 3600
> # selftests: net: msg_zerocopy.sh
> # ipv4 tcp -t 1
> # tx=164425 (10260 MB) txc=0 zc=n
> # rx=59526 (10260 MB)
> # ipv4 tcp -z -t 1
> # tx=111332 (6947 MB) txc=111332 zc=n
> # rx=55245 (6947 MB)
> # ok
> # ipv6 tcp -t 1
> # tx=168140 (10492 MB) txc=0 zc=n
> # rx=64633 (10492 MB)
> # ipv6 tcp -z -t 1
> # tx=108388 (6763 MB) txc=108388 zc=n
> # rx=54146 (6763 MB)
> # ok
> # ipv4 udp -t 1
> # tx=173970 (10856 MB) txc=0 zc=n
> # rx=173936 (10854 MB)
> # ipv4 udp -z -t 1
> # tx=117728 (7346 MB) txc=117728 zc=n
> # rx=117703 (7345 MB)
> # ok
> # ipv6 udp -t 1
> # tx=225502 (14072 MB) txc=0 zc=n
> # rx=225502 (14072 MB)
> # ipv6 udp -z -t 1
> # tx=111215 (6940 MB) txc=111215 zc=n
> # rx=111212 (6940 MB)
> # ok
> # OK. All tests passed
> ok 1 selftests: net: msg_zerocopy.sh
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH net 1/2] selftests: fix OOM in msg_zerocopy selftest
2024-07-04 2:32 ` Zijian Zhang
@ 2024-07-04 2:42 ` Jakub Kicinski
0 siblings, 0 replies; 10+ messages in thread
From: Jakub Kicinski @ 2024-07-04 2:42 UTC (permalink / raw)
To: Zijian Zhang; +Cc: netdev, willemdebruijn.kernel, cong.wang, xiaochun.lu
On Wed, 3 Jul 2024 19:32:33 -0700 Zijian Zhang wrote:
> > This test doesn't fail in netdev CI. Is the problem fix in net-next
> > somehow? Or the "always exits with OUT_OF_MEMORY" is an exaggerations?
> > (TBH I'm not even sure what it means to "exit with OUT_OF_MEMORY" in
> > this context.)
> >
> The reason why this test doesn't fail in CI:
>
> According to the test output,
> # ipv4 tcp -z -t 1
> # tx=111332 (6947 MB) txc=111332 zc=n
> zerocopy is false here.
>
> This is because of some limitation of zerocopy in localhost.
> Specifically, the subsection "Notification Latency" in the sendmsg
> zerocopy the paper.
>
> In order to make "zc=y", we may need to update skb_orphan_frags_rx to
> the same as skb_orphan_frags, recompile the kernel, and run the test.
>
> By OUT_OF_MEMORY I mean:
>
> Each calling of sendmsg with zerocopy will allocate an skb with
> sock_omalloc. If users never recv the notifications but keep calling
> sendmsg with zerocopy. The send system call will finally return with
> -ENOMEM.
>
> I hope this clarifies your confusion :)
It does, thanks!
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH net 0/2] fix OOM and order check in msg_zerocopy selftest
2024-07-01 22:53 [PATCH net 0/2] fix OOM and order check in msg_zerocopy selftest zijianzhang
2024-07-01 22:53 ` [PATCH net 1/2] selftests: fix OOM " zijianzhang
2024-07-01 22:53 ` [PATCH net 2/2] selftests: make order checking verbose " zijianzhang
@ 2024-07-04 2:50 ` patchwork-bot+netdevbpf
2 siblings, 0 replies; 10+ messages in thread
From: patchwork-bot+netdevbpf @ 2024-07-04 2:50 UTC (permalink / raw)
To: Zijian Zhang; +Cc: netdev, willemdebruijn.kernel, cong.wang, xiaochun.lu
Hello:
This series was applied to netdev/net.git (main)
by Jakub Kicinski <kuba@kernel.org>:
On Mon, 1 Jul 2024 22:53:47 +0000 you wrote:
> From: Zijian Zhang <zijianzhang@bytedance.com>
>
> In selftests/net/msg_zerocopy.c, it has a while loop keeps calling sendmsg
> on a socket with MSG_ZEROCOPY flag, and it will recv the notifications
> until the socket is not writable. Typically, it will start the receiving
> process after around 30+ sendmsgs. However, as the introduction of commit
> dfa2f0483360 ("tcp: get rid of sysctl_tcp_adv_win_scale"), the sender is
> always writable and does not get any chance to run recv notifications.
> The selftest always exits with OUT_OF_MEMORY because the memory used by
> opt_skb exceeds the net.core.optmem_max. Meanwhile, it could be set to a
> different value to trigger OOM on older kernels too.
>
> [...]
Here is the summary with links:
- [net,1/2] selftests: fix OOM in msg_zerocopy selftest
https://git.kernel.org/netdev/net/c/af2b7e5b741a
- [net,2/2] selftests: make order checking verbose in msg_zerocopy selftest
https://git.kernel.org/netdev/net/c/7d6d8f0c8b70
You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html
^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2024-07-04 2:50 UTC | newest]
Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-07-01 22:53 [PATCH net 0/2] fix OOM and order check in msg_zerocopy selftest zijianzhang
2024-07-01 22:53 ` [PATCH net 1/2] selftests: fix OOM " zijianzhang
2024-07-02 13:17 ` Willem de Bruijn
2024-07-04 1:50 ` Jakub Kicinski
2024-07-04 2:32 ` Zijian Zhang
2024-07-04 2:42 ` Jakub Kicinski
2024-07-01 22:53 ` [PATCH net 2/2] selftests: make order checking verbose " zijianzhang
2024-07-02 13:18 ` Willem de Bruijn
2024-07-02 18:05 ` Zijian Zhang
2024-07-04 2:50 ` [PATCH net 0/2] fix OOM and order check " patchwork-bot+netdevbpf
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).