* [PATCH net] net: avoid indirect memory pressure calls
@ 2023-02-24 18:46 Florian Westphal
2023-02-27 23:27 ` Jakub Kicinski
2023-02-28 16:44 ` Eric Dumazet
0 siblings, 2 replies; 9+ messages in thread
From: Florian Westphal @ 2023-02-24 18:46 UTC (permalink / raw)
To: netdev; +Cc: davem, edumazet, kuba, pabeni, shakeelb, soheil, Florian Westphal
There is a noticeable tcp performance regression (loopback or cross-netns),
seen with iperf3 -Z (sendfile mode) when generic retpolines are needed.
With SK_RECLAIM_THRESHOLD checks gone number of calls to enter/leave
memory pressure happen much more often. For TCP indirect calls are
used.
We can't remove the if-set-return short-circuit check in
tcp_enter_memory_pressure because there are callers other than
sk_enter_memory_pressure. Doing a check in the sk wrapper too
reduces the indirect calls enough to recover some performance.
Before,
0.00-60.00 sec 322 GBytes 46.1 Gbits/sec receiver
After:
0.00-60.04 sec 359 GBytes 51.4 Gbits/sec receiver
"iperf3 -c $peer -t 60 -Z -f g", connected via veth in another netns.
Fixes: 4890b686f408 ("net: keep sk->sk_forward_alloc as small as possible")
Signed-off-by: Florian Westphal <fw@strlen.de>
---
net/core/sock.c | 22 +++++++++++++---------
1 file changed, 13 insertions(+), 9 deletions(-)
diff --git a/net/core/sock.c b/net/core/sock.c
index 341c565dbc26..45d247112aa5 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -2809,22 +2809,26 @@ EXPORT_SYMBOL(sock_cmsg_send);
static void sk_enter_memory_pressure(struct sock *sk)
{
- if (!sk->sk_prot->enter_memory_pressure)
+ unsigned long *memory_pressure = sk->sk_prot->memory_pressure;
+
+ if (!memory_pressure || READ_ONCE(*memory_pressure))
return;
- sk->sk_prot->enter_memory_pressure(sk);
+ if (sk->sk_prot->enter_memory_pressure)
+ sk->sk_prot->enter_memory_pressure(sk);
}
static void sk_leave_memory_pressure(struct sock *sk)
{
- if (sk->sk_prot->leave_memory_pressure) {
- sk->sk_prot->leave_memory_pressure(sk);
- } else {
- unsigned long *memory_pressure = sk->sk_prot->memory_pressure;
+ unsigned long *memory_pressure = sk->sk_prot->memory_pressure;
- if (memory_pressure && READ_ONCE(*memory_pressure))
- WRITE_ONCE(*memory_pressure, 0);
- }
+ if (!memory_pressure || READ_ONCE(*memory_pressure) == 0)
+ return;
+
+ if (sk->sk_prot->leave_memory_pressure)
+ sk->sk_prot->leave_memory_pressure(sk);
+ else
+ WRITE_ONCE(*memory_pressure, 0);
}
DEFINE_STATIC_KEY_FALSE(net_high_order_alloc_disable_key);
--
2.39.2
^ permalink raw reply related [flat|nested] 9+ messages in thread* Re: [PATCH net] net: avoid indirect memory pressure calls 2023-02-24 18:46 [PATCH net] net: avoid indirect memory pressure calls Florian Westphal @ 2023-02-27 23:27 ` Jakub Kicinski 2023-02-28 16:28 ` Alexander Lobakin 2023-02-28 16:44 ` Eric Dumazet 1 sibling, 1 reply; 9+ messages in thread From: Jakub Kicinski @ 2023-02-27 23:27 UTC (permalink / raw) To: edumazet; +Cc: Florian Westphal, netdev, davem, pabeni, shakeelb, soheil On Fri, 24 Feb 2023 19:46:06 +0100 Florian Westphal wrote: > There is a noticeable tcp performance regression (loopback or cross-netns), > seen with iperf3 -Z (sendfile mode) when generic retpolines are needed. > > With SK_RECLAIM_THRESHOLD checks gone number of calls to enter/leave > memory pressure happen much more often. For TCP indirect calls are > used. > > We can't remove the if-set-return short-circuit check in > tcp_enter_memory_pressure because there are callers other than > sk_enter_memory_pressure. Doing a check in the sk wrapper too > reduces the indirect calls enough to recover some performance. > > Before, > 0.00-60.00 sec 322 GBytes 46.1 Gbits/sec receiver > > After: > 0.00-60.04 sec 359 GBytes 51.4 Gbits/sec receiver > > "iperf3 -c $peer -t 60 -Z -f g", connected via veth in another netns. > > Fixes: 4890b686f408 ("net: keep sk->sk_forward_alloc as small as possible") > Signed-off-by: Florian Westphal <fw@strlen.de> Looks acceptable, Eric? ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH net] net: avoid indirect memory pressure calls 2023-02-27 23:27 ` Jakub Kicinski @ 2023-02-28 16:28 ` Alexander Lobakin 2023-02-28 16:34 ` Florian Westphal 0 siblings, 1 reply; 9+ messages in thread From: Alexander Lobakin @ 2023-02-28 16:28 UTC (permalink / raw) To: Jakub Kicinski, Florian Westphal Cc: edumazet, netdev, davem, pabeni, shakeelb, soheil From: Jakub Kicinski <kuba@kernel.org> Date: Mon, 27 Feb 2023 15:27:41 -0800 > On Fri, 24 Feb 2023 19:46:06 +0100 Florian Westphal wrote: >> There is a noticeable tcp performance regression (loopback or cross-netns), >> seen with iperf3 -Z (sendfile mode) when generic retpolines are needed. >> >> With SK_RECLAIM_THRESHOLD checks gone number of calls to enter/leave >> memory pressure happen much more often. For TCP indirect calls are >> used. >> >> We can't remove the if-set-return short-circuit check in >> tcp_enter_memory_pressure because there are callers other than >> sk_enter_memory_pressure. Doing a check in the sk wrapper too >> reduces the indirect calls enough to recover some performance. >> >> Before, >> 0.00-60.00 sec 322 GBytes 46.1 Gbits/sec receiver >> >> After: >> 0.00-60.04 sec 359 GBytes 51.4 Gbits/sec receiver >> >> "iperf3 -c $peer -t 60 -Z -f g", connected via veth in another netns. >> >> Fixes: 4890b686f408 ("net: keep sk->sk_forward_alloc as small as possible") >> Signed-off-by: Florian Westphal <fw@strlen.de> > > Looks acceptable, Eric? > I'm no Eric, but I'd only change this: + if (!memory_pressure || READ_ONCE(*memory_pressure) == 0) to + if (!memory_pressure || !READ_ONCE(*memory_pressure)) :p The perf boost looks gross, love that *_* Thanks, Olek ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH net] net: avoid indirect memory pressure calls 2023-02-28 16:28 ` Alexander Lobakin @ 2023-02-28 16:34 ` Florian Westphal 2023-02-28 16:35 ` Eric Dumazet 0 siblings, 1 reply; 9+ messages in thread From: Florian Westphal @ 2023-02-28 16:34 UTC (permalink / raw) To: Alexander Lobakin Cc: Jakub Kicinski, Florian Westphal, edumazet, netdev, davem, pabeni, shakeelb, soheil Alexander Lobakin <aleksander.lobakin@intel.com> wrote: > From: Jakub Kicinski <kuba@kernel.org> > Date: Mon, 27 Feb 2023 15:27:41 -0800 > > > On Fri, 24 Feb 2023 19:46:06 +0100 Florian Westphal wrote: > >> There is a noticeable tcp performance regression (loopback or cross-netns), > >> seen with iperf3 -Z (sendfile mode) when generic retpolines are needed. > >> > >> With SK_RECLAIM_THRESHOLD checks gone number of calls to enter/leave > >> memory pressure happen much more often. For TCP indirect calls are > >> used. > >> > >> We can't remove the if-set-return short-circuit check in > >> tcp_enter_memory_pressure because there are callers other than > >> sk_enter_memory_pressure. Doing a check in the sk wrapper too > >> reduces the indirect calls enough to recover some performance. > >> > >> Before, > >> 0.00-60.00 sec 322 GBytes 46.1 Gbits/sec receiver > >> > >> After: > >> 0.00-60.04 sec 359 GBytes 51.4 Gbits/sec receiver > >> > >> "iperf3 -c $peer -t 60 -Z -f g", connected via veth in another netns. > >> > >> Fixes: 4890b686f408 ("net: keep sk->sk_forward_alloc as small as possible") > >> Signed-off-by: Florian Westphal <fw@strlen.de> > > > > Looks acceptable, Eric? > > > I'm no Eric, but I'd only change this: > > + if (!memory_pressure || READ_ONCE(*memory_pressure) == 0) > > to > > + if (!memory_pressure || !READ_ONCE(*memory_pressure)) I intentioanlly used '== 0', i found it too easy to miss the '!' before 'R'. But maybe I just need better glasses. ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH net] net: avoid indirect memory pressure calls 2023-02-28 16:34 ` Florian Westphal @ 2023-02-28 16:35 ` Eric Dumazet 0 siblings, 0 replies; 9+ messages in thread From: Eric Dumazet @ 2023-02-28 16:35 UTC (permalink / raw) To: Florian Westphal Cc: Alexander Lobakin, Jakub Kicinski, netdev, davem, pabeni, shakeelb, soheil On Tue, Feb 28, 2023 at 5:35 PM Florian Westphal <fw@strlen.de> wrote: > > Alexander Lobakin <aleksander.lobakin@intel.com> wrote: > > From: Jakub Kicinski <kuba@kernel.org> > > Date: Mon, 27 Feb 2023 15:27:41 -0800 > > > > > On Fri, 24 Feb 2023 19:46:06 +0100 Florian Westphal wrote: > > >> There is a noticeable tcp performance regression (loopback or cross-netns), > > >> seen with iperf3 -Z (sendfile mode) when generic retpolines are needed. > > >> > > >> With SK_RECLAIM_THRESHOLD checks gone number of calls to enter/leave > > >> memory pressure happen much more often. For TCP indirect calls are > > >> used. > > >> > > >> We can't remove the if-set-return short-circuit check in > > >> tcp_enter_memory_pressure because there are callers other than > > >> sk_enter_memory_pressure. Doing a check in the sk wrapper too > > >> reduces the indirect calls enough to recover some performance. > > >> > > >> Before, > > >> 0.00-60.00 sec 322 GBytes 46.1 Gbits/sec receiver > > >> > > >> After: > > >> 0.00-60.04 sec 359 GBytes 51.4 Gbits/sec receiver > > >> > > >> "iperf3 -c $peer -t 60 -Z -f g", connected via veth in another netns. > > >> > > >> Fixes: 4890b686f408 ("net: keep sk->sk_forward_alloc as small as possible") > > >> Signed-off-by: Florian Westphal <fw@strlen.de> > > > > > > Looks acceptable, Eric? > > > > > I'm no Eric, but I'd only change this: > > > > + if (!memory_pressure || READ_ONCE(*memory_pressure) == 0) > > > > to > > > > + if (!memory_pressure || !READ_ONCE(*memory_pressure)) > > I intentioanlly used '== 0', i found it too easy to miss the '!' before > 'R'. But maybe I just need better glasses. Sorry for the delay, I will take a look. ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH net] net: avoid indirect memory pressure calls 2023-02-24 18:46 [PATCH net] net: avoid indirect memory pressure calls Florian Westphal 2023-02-27 23:27 ` Jakub Kicinski @ 2023-02-28 16:44 ` Eric Dumazet 2023-02-28 17:42 ` Eric Dumazet 1 sibling, 1 reply; 9+ messages in thread From: Eric Dumazet @ 2023-02-28 16:44 UTC (permalink / raw) To: Florian Westphal; +Cc: netdev, davem, kuba, pabeni, shakeelb, soheil On Fri, Feb 24, 2023 at 7:49 PM Florian Westphal <fw@strlen.de> wrote: > > There is a noticeable tcp performance regression (loopback or cross-netns), > seen with iperf3 -Z (sendfile mode) when generic retpolines are needed. > > With SK_RECLAIM_THRESHOLD checks gone number of calls to enter/leave > memory pressure happen much more often. For TCP indirect calls are > used. > > We can't remove the if-set-return short-circuit check in > tcp_enter_memory_pressure because there are callers other than > sk_enter_memory_pressure. Doing a check in the sk wrapper too > reduces the indirect calls enough to recover some performance. > > Before, > 0.00-60.00 sec 322 GBytes 46.1 Gbits/sec receiver > > After: > 0.00-60.04 sec 359 GBytes 51.4 Gbits/sec receiver > > "iperf3 -c $peer -t 60 -Z -f g", connected via veth in another netns. > > Fixes: 4890b686f408 ("net: keep sk->sk_forward_alloc as small as possible") > Signed-off-by: Florian Westphal <fw@strlen.de> > --- > net/core/sock.c | 22 +++++++++++++--------- > 1 file changed, 13 insertions(+), 9 deletions(-) > > diff --git a/net/core/sock.c b/net/core/sock.c > index 341c565dbc26..45d247112aa5 100644 > --- a/net/core/sock.c > +++ b/net/core/sock.c > @@ -2809,22 +2809,26 @@ EXPORT_SYMBOL(sock_cmsg_send); > > static void sk_enter_memory_pressure(struct sock *sk) This one is probably not called under normal circumstances. > { > - if (!sk->sk_prot->enter_memory_pressure) > + unsigned long *memory_pressure = sk->sk_prot->memory_pressure; > + > + if (!memory_pressure || READ_ONCE(*memory_pressure)) > return; > > - sk->sk_prot->enter_memory_pressure(sk); > + if (sk->sk_prot->enter_memory_pressure) > + sk->sk_prot->enter_memory_pressure(sk); > } > > static void sk_leave_memory_pressure(struct sock *sk) > { > - if (sk->sk_prot->leave_memory_pressure) { > - sk->sk_prot->leave_memory_pressure(sk); > - } else { > - unsigned long *memory_pressure = sk->sk_prot->memory_pressure; > + unsigned long *memory_pressure = sk->sk_prot->memory_pressure; > > - if (memory_pressure && READ_ONCE(*memory_pressure)) > - WRITE_ONCE(*memory_pressure, 0); > - } > + if (!memory_pressure || READ_ONCE(*memory_pressure) == 0) > + return; > + > + if (sk->sk_prot->leave_memory_pressure) > + sk->sk_prot->leave_memory_pressure(sk); > + else > + WRITE_ONCE(*memory_pressure, 0); > } > For this one, we have too callers. First one (__sk_mem_reduce_allocated) is using if (sk_under_memory_pressure(sk) ... So I wonder if for consistency we could use the same heuristic [1] ? Note that [1] is memcg aware/ready. Refactoring of sk_enter_memory_pressure() and sk_leave_memory_pressure() could wait net-next maybe... [1] diff --git a/net/core/sock.c b/net/core/sock.c index 341c565dbc262fcece1c5b410609d910a68edcb0..0472f8559a10136672ce6647f133367c81a93cb7 100644 --- a/net/core/sock.c +++ b/net/core/sock.c @@ -2993,7 +2993,8 @@ int __sk_mem_raise_allocated(struct sock *sk, int size, int amt, int kind) /* Under limit. */ if (allocated <= sk_prot_mem_limits(sk, 0)) { - sk_leave_memory_pressure(sk); + if (sk_under_memory_pressure(sk)) + sk_leave_memory_pressure(sk); return 1; } ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH net] net: avoid indirect memory pressure calls 2023-02-28 16:44 ` Eric Dumazet @ 2023-02-28 17:42 ` Eric Dumazet 2023-03-01 12:31 ` Florian Westphal 0 siblings, 1 reply; 9+ messages in thread From: Eric Dumazet @ 2023-02-28 17:42 UTC (permalink / raw) To: Florian Westphal, Brian Vazquez Cc: netdev, davem, kuba, pabeni, shakeelb, soheil On Tue, Feb 28, 2023 at 5:44 PM Eric Dumazet <edumazet@google.com> wrote: > > On Fri, Feb 24, 2023 at 7:49 PM Florian Westphal <fw@strlen.de> wrote: > > > > There is a noticeable tcp performance regression (loopback or cross-netns), > > seen with iperf3 -Z (sendfile mode) when generic retpolines are needed. > > > > With SK_RECLAIM_THRESHOLD checks gone number of calls to enter/leave > > memory pressure happen much more often. For TCP indirect calls are > > used. > > > > We can't remove the if-set-return short-circuit check in > > tcp_enter_memory_pressure because there are callers other than > > sk_enter_memory_pressure. Doing a check in the sk wrapper too > > reduces the indirect calls enough to recover some performance. > > > > Before, > > 0.00-60.00 sec 322 GBytes 46.1 Gbits/sec receiver > > > > After: > > 0.00-60.04 sec 359 GBytes 51.4 Gbits/sec receiver > > BTW I was curious why Google was not seeing this, and it appears Brian Vasquez forgot to upstream this change... commit 5ea2f21d6c1078d2c563cb455ad5877b4ada94e1 Author: Brian Vazquez <brianvv@google.com> Date: Thu Mar 3 19:09:49 2022 -0800 PRODKERNEL: net-directcall: annotate tcp_leave_memory_pressure and tcp_getsockopt Switch to upstream infra to make rebase easier Tested: built and booted on lab machine Upstream-Plan: 150254871 Effort: net-directcall diff --git a/net/core/sock.c b/net/core/sock.c index 05032b399c873984e5297898d647905ca9f21f2e..54cb989dc162f3982380ac12cf5a150214e209a2 100644 --- a/net/core/sock.c +++ b/net/core/sock.c @@ -2647,10 +2647,13 @@ static void sk_enter_memory_pressure(struct sock *sk) sk->sk_prot->enter_memory_pressure(sk); } +INDIRECT_CALLABLE_DECLARE(void tcp_leave_memory_pressure(struct sock *sk)); + static void sk_leave_memory_pressure(struct sock *sk) { if (sk->sk_prot->leave_memory_pressure) { - sk->sk_prot->leave_memory_pressure(sk); + INDIRECT_CALL_1(sk->sk_prot->leave_memory_pressure, + tcp_leave_memory_pressure, sk); } else { unsigned long *memory_pressure = sk->sk_prot->memory_pressure; @@ -3439,6 +3442,10 @@ int sock_recv_errqueue(struct sock *sk, struct msghdr *msg, int len, } EXPORT_SYMBOL(sock_recv_errqueue); +INDIRECT_CALLABLE_DECLARE(int tcp_getsockopt(struct sock *sk, int level, + int optname, char __user *optval, + int __user *optlen)); + /* * Get a socket option on an socket. * @@ -3451,7 +3458,8 @@ int sock_common_getsockopt(struct socket *sock, int level, int optname, { struct sock *sk = sock->sk; - return sk->sk_prot->getsockopt(sk, level, optname, optval, optlen); + return INDIRECT_CALL_1(sk->sk_prot->getsockopt, tcp_getsockopt, + sk, level, optname, optval, optlen); } EXPORT_SYMBOL(sock_common_getsockopt); ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH net] net: avoid indirect memory pressure calls 2023-02-28 17:42 ` Eric Dumazet @ 2023-03-01 12:31 ` Florian Westphal 2023-03-01 12:51 ` Eric Dumazet 0 siblings, 1 reply; 9+ messages in thread From: Florian Westphal @ 2023-03-01 12:31 UTC (permalink / raw) To: Eric Dumazet Cc: Florian Westphal, Brian Vazquez, netdev, davem, kuba, pabeni, shakeelb, soheil Eric Dumazet <edumazet@google.com> wrote: > BTW I was curious why Google was not seeing this, and it appears Brian Vasquez > forgot to upstream this change... > > commit 5ea2f21d6c1078d2c563cb455ad5877b4ada94e1 > Author: Brian Vazquez <brianvv@google.com> > Date: Thu Mar 3 19:09:49 2022 -0800 > > PRODKERNEL: net-directcall: annotate tcp_leave_memory_pressure and > tcp_getsockopt > > diff --git a/net/core/sock.c b/net/core/sock.c > index 05032b399c873984e5297898d647905ca9f21f2e..54cb989dc162f3982380ac12cf5a150214e209a2 > 100644 > --- a/net/core/sock.c > +++ b/net/core/sock.c > @@ -2647,10 +2647,13 @@ static void sk_enter_memory_pressure(struct sock *sk) > sk->sk_prot->enter_memory_pressure(sk); > } > > +INDIRECT_CALLABLE_DECLARE(void tcp_leave_memory_pressure(struct sock *sk)); > + > static void sk_leave_memory_pressure(struct sock *sk) > { > if (sk->sk_prot->leave_memory_pressure) { > - sk->sk_prot->leave_memory_pressure(sk); > + INDIRECT_CALL_1(sk->sk_prot->leave_memory_pressure, > + tcp_leave_memory_pressure, sk); > } else { > unsigned long *memory_pressure = sk->sk_prot->memory_pressure; re-tested: this change also resolves the regression i was seeing. If you prefer to upstream this instead of the proposed change then I'm fine with that. Thanks. ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH net] net: avoid indirect memory pressure calls 2023-03-01 12:31 ` Florian Westphal @ 2023-03-01 12:51 ` Eric Dumazet 0 siblings, 0 replies; 9+ messages in thread From: Eric Dumazet @ 2023-03-01 12:51 UTC (permalink / raw) To: Florian Westphal Cc: Brian Vazquez, netdev, davem, kuba, pabeni, shakeelb, soheil On Wed, Mar 1, 2023 at 1:31 PM Florian Westphal <fw@strlen.de> wrote: > > Eric Dumazet <edumazet@google.com> wrote: > > BTW I was curious why Google was not seeing this, and it appears Brian Vasquez > > forgot to upstream this change... > > > > commit 5ea2f21d6c1078d2c563cb455ad5877b4ada94e1 > > Author: Brian Vazquez <brianvv@google.com> > > Date: Thu Mar 3 19:09:49 2022 -0800 > > > > PRODKERNEL: net-directcall: annotate tcp_leave_memory_pressure and > > tcp_getsockopt > > > > diff --git a/net/core/sock.c b/net/core/sock.c > > index 05032b399c873984e5297898d647905ca9f21f2e..54cb989dc162f3982380ac12cf5a150214e209a2 > > 100644 > > --- a/net/core/sock.c > > +++ b/net/core/sock.c > > @@ -2647,10 +2647,13 @@ static void sk_enter_memory_pressure(struct sock *sk) > > sk->sk_prot->enter_memory_pressure(sk); > > } > > > > +INDIRECT_CALLABLE_DECLARE(void tcp_leave_memory_pressure(struct sock *sk)); > > + > > static void sk_leave_memory_pressure(struct sock *sk) > > { > > if (sk->sk_prot->leave_memory_pressure) { > > - sk->sk_prot->leave_memory_pressure(sk); > > + INDIRECT_CALL_1(sk->sk_prot->leave_memory_pressure, > > + tcp_leave_memory_pressure, sk); > > } else { > > unsigned long *memory_pressure = sk->sk_prot->memory_pressure; > > re-tested: this change also resolves the regression i was seeing. > > If you prefer to upstream this instead of the proposed change then I'm > fine with that. This seems a bit less risky, if the plan is to add this to stable trees. ( We had this mitigation for ~4 years at Google) I will rebase Brian patch (only the tcp_leave_memory_pressure part) and send it. Thanks ! ^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2023-03-01 12:51 UTC | newest] Thread overview: 9+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2023-02-24 18:46 [PATCH net] net: avoid indirect memory pressure calls Florian Westphal 2023-02-27 23:27 ` Jakub Kicinski 2023-02-28 16:28 ` Alexander Lobakin 2023-02-28 16:34 ` Florian Westphal 2023-02-28 16:35 ` Eric Dumazet 2023-02-28 16:44 ` Eric Dumazet 2023-02-28 17:42 ` Eric Dumazet 2023-03-01 12:31 ` Florian Westphal 2023-03-01 12:51 ` Eric Dumazet
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).