* [PATCH net 0/3] udp: behave under memory pressure
@ 2020-01-17 17:27 Paolo Abeni
2020-01-17 17:27 ` [PATCH net 1/3] net: generic enter_memory_pressure implementation Paolo Abeni
` (2 more replies)
0 siblings, 3 replies; 7+ messages in thread
From: Paolo Abeni @ 2020-01-17 17:27 UTC (permalink / raw)
To: netdev; +Cc: David S. Miller, Willem de Bruijn
Williem reported that in some scenarios the UDP protocol can keep a lot of
memory in use on an idle system. He also diagnosed the root cause in the
forward allocated memory bulk free.
This series addresses the issue adding memory pressure tracking for the UDP
protocol, and flushing the fwd allocated memory if the protocol is under
memory pressure.
The first two patches clean-up the current memory pressure helpers for UDP
usage, and the 3rd one is the actual fix.
Targeting the net tree, as this addresses a reported issue. I guess even
net-next can be considered a valid target, as this also changes slightly the
protocol behavior under memory pressure. Please advise on the preferred option.
Paolo Abeni (3):
net: generic enter_memory_pressure implementation.
net: add annotation to memory_pressure lockless access
udp: avoid bulk memory scheduling on memory pressure.
include/net/sock.h | 4 ++--
include/net/udp.h | 2 ++
net/core/sock.c | 10 +++++++---
net/ipv4/udp.c | 13 ++++++++++++-
net/ipv6/udp.c | 2 ++
5 files changed, 25 insertions(+), 6 deletions(-)
--
2.21.0
^ permalink raw reply [flat|nested] 7+ messages in thread* [PATCH net 1/3] net: generic enter_memory_pressure implementation. 2020-01-17 17:27 [PATCH net 0/3] udp: behave under memory pressure Paolo Abeni @ 2020-01-17 17:27 ` Paolo Abeni 2020-01-17 17:27 ` [PATCH net 2/3] net: add annotations to memory_pressure lockless access Paolo Abeni 2020-01-17 17:27 ` [PATCH net 3/3] udp: avoid bulk memory scheduling on memory pressure Paolo Abeni 2 siblings, 0 replies; 7+ messages in thread From: Paolo Abeni @ 2020-01-17 17:27 UTC (permalink / raw) To: netdev; +Cc: David S. Miller, Willem de Bruijn Currently sk_leave_memory_pressure() offers a generic implementation for protocol lacking the leave_memory_pressure() helper, but supporting the memory pressure model. sk_enter_memory_pressure() lacks such bits. As a result we get code duplication and additional, avoidable indirect calls. This change provides a generic implementation for entering memory pressure, similar to the existing sk_leave_memory_pressure(). Note: no existing protocol is affected, until the existing enter_memory_pressure helper is removed from the relevant 'proto' struct. Signed-off-by: Paolo Abeni <pabeni@redhat.com> --- net/core/sock.c | 10 +++++++--- 1 file changed, 7 insertions(+), 3 deletions(-) diff --git a/net/core/sock.c b/net/core/sock.c index 8459ad579f73..8cf24dca9bde 100644 --- a/net/core/sock.c +++ b/net/core/sock.c @@ -2323,10 +2323,14 @@ EXPORT_SYMBOL(sock_cmsg_send); static void sk_enter_memory_pressure(struct sock *sk) { - if (!sk->sk_prot->enter_memory_pressure) - return; + if (sk->sk_prot->enter_memory_pressure) { + sk->sk_prot->enter_memory_pressure(sk); + } else { + unsigned long *memory_pressure = sk->sk_prot->memory_pressure; - sk->sk_prot->enter_memory_pressure(sk); + if (memory_pressure && !READ_ONCE(*memory_pressure)) + WRITE_ONCE(*memory_pressure, 1); + } } static void sk_leave_memory_pressure(struct sock *sk) -- 2.21.0 ^ permalink raw reply related [flat|nested] 7+ messages in thread
* [PATCH net 2/3] net: add annotations to memory_pressure lockless access 2020-01-17 17:27 [PATCH net 0/3] udp: behave under memory pressure Paolo Abeni 2020-01-17 17:27 ` [PATCH net 1/3] net: generic enter_memory_pressure implementation Paolo Abeni @ 2020-01-17 17:27 ` Paolo Abeni 2020-01-17 17:27 ` [PATCH net 3/3] udp: avoid bulk memory scheduling on memory pressure Paolo Abeni 2 siblings, 0 replies; 7+ messages in thread From: Paolo Abeni @ 2020-01-17 17:27 UTC (permalink / raw) To: netdev; +Cc: David S. Miller, Willem de Bruijn The proto memory pressure status is updated without any related lock held. This patch adds annotations to document this fact and avoid future syzbot complains. Signed-off-by: Paolo Abeni <pabeni@redhat.com> --- include/net/sock.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/include/net/sock.h b/include/net/sock.h index 8dff68b4c316..08383624b8cb 100644 --- a/include/net/sock.h +++ b/include/net/sock.h @@ -1264,7 +1264,7 @@ static inline bool sk_under_memory_pressure(const struct sock *sk) mem_cgroup_under_socket_pressure(sk->sk_memcg)) return true; - return !!*sk->sk_prot->memory_pressure; + return !!READ_ONCE(*sk->sk_prot->memory_pressure); } static inline long @@ -1318,7 +1318,7 @@ proto_memory_pressure(struct proto *prot) { if (!prot->memory_pressure) return false; - return !!*prot->memory_pressure; + return !!READ_ONCE(*prot->memory_pressure); } -- 2.21.0 ^ permalink raw reply related [flat|nested] 7+ messages in thread
* [PATCH net 3/3] udp: avoid bulk memory scheduling on memory pressure. 2020-01-17 17:27 [PATCH net 0/3] udp: behave under memory pressure Paolo Abeni 2020-01-17 17:27 ` [PATCH net 1/3] net: generic enter_memory_pressure implementation Paolo Abeni 2020-01-17 17:27 ` [PATCH net 2/3] net: add annotations to memory_pressure lockless access Paolo Abeni @ 2020-01-17 17:27 ` Paolo Abeni 2020-01-17 17:51 ` Eric Dumazet 2 siblings, 1 reply; 7+ messages in thread From: Paolo Abeni @ 2020-01-17 17:27 UTC (permalink / raw) To: netdev; +Cc: David S. Miller, Willem de Bruijn Williem reported that after commit 0d4a6608f68c ("udp: do rmem bulk free even if the rx sk queue is empty") the memory allocated by an almost idle system with many UDP sockets can grow a lot. This change addresses the issue enabling memory pressure tracking for UDP and flushing the fwd allocated memory on dequeue if the UDP protocol is under memory pressure. Note that with this patch applied, the system allocates more liberally memory for UDP sockets while the total memory usage is below udp_mem[1], while the vanilla kernel would allow at most a single page per socket when UDP memory usage goes above udp_mem[0] - see __sk_mem_raise_allocated(). Reported-and-diagnosed-by: Willem de Bruijn <willemdebruijn.kernel@gmail.com> Fixes: commit 0d4a6608f68c ("udp: do rmem bulk free even if the rx sk queue is empty") Signed-off-by: Paolo Abeni <pabeni@redhat.com> --- include/net/udp.h | 2 ++ net/ipv4/udp.c | 13 ++++++++++++- net/ipv6/udp.c | 2 ++ 3 files changed, 16 insertions(+), 1 deletion(-) diff --git a/include/net/udp.h b/include/net/udp.h index bad74f780831..cff730798291 100644 --- a/include/net/udp.h +++ b/include/net/udp.h @@ -94,6 +94,8 @@ static inline struct udp_hslot *udp_hashslot2(struct udp_table *table, extern struct proto udp_prot; extern atomic_long_t udp_memory_allocated; +extern unsigned long udp_memory_pressure; +extern struct percpu_counter udp_sockets_allocated; /* sysctl variables for udp */ extern long sysctl_udp_mem[3]; diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c index 93a355b6b092..3a68ec6c3410 100644 --- a/net/ipv4/udp.c +++ b/net/ipv4/udp.c @@ -119,6 +119,12 @@ EXPORT_SYMBOL(udp_table); long sysctl_udp_mem[3] __read_mostly; EXPORT_SYMBOL(sysctl_udp_mem); +unsigned long udp_memory_pressure __read_mostly; +EXPORT_SYMBOL(udp_memory_pressure); + +struct percpu_counter udp_sockets_allocated; +EXPORT_SYMBOL(udp_sockets_allocated); + atomic_long_t udp_memory_allocated; EXPORT_SYMBOL(udp_memory_allocated); @@ -1368,7 +1374,8 @@ static void udp_rmem_release(struct sock *sk, int size, int partial, if (likely(partial)) { up->forward_deficit += size; size = up->forward_deficit; - if (size < (sk->sk_rcvbuf >> 2)) + if (size < (sk->sk_rcvbuf >> 2) && + !READ_ONCE(udp_memory_pressure)) return; } else { size += up->forward_deficit; @@ -2789,7 +2796,9 @@ struct proto udp_prot = { .unhash = udp_lib_unhash, .rehash = udp_v4_rehash, .get_port = udp_v4_get_port, + .memory_pressure = &udp_memory_pressure, .memory_allocated = &udp_memory_allocated, + .sockets_allocated = &udp_sockets_allocated, .sysctl_mem = sysctl_udp_mem, .sysctl_wmem_offset = offsetof(struct net, ipv4.sysctl_udp_wmem_min), .sysctl_rmem_offset = offsetof(struct net, ipv4.sysctl_udp_rmem_min), @@ -3062,6 +3071,8 @@ void __init udp_init(void) sysctl_udp_mem[1] = limit; sysctl_udp_mem[2] = sysctl_udp_mem[0] * 2; + percpu_counter_init(&udp_sockets_allocated, 0, GFP_KERNEL); + __udp_sysctl_init(&init_net); /* 16 spinlocks per cpu */ diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c index 9fec580c968e..b29d92574ccc 100644 --- a/net/ipv6/udp.c +++ b/net/ipv6/udp.c @@ -1670,7 +1670,9 @@ struct proto udpv6_prot = { .unhash = udp_lib_unhash, .rehash = udp_v6_rehash, .get_port = udp_v6_get_port, + .memory_pressure = &udp_memory_pressure, .memory_allocated = &udp_memory_allocated, + .sockets_allocated = &udp_sockets_allocated, .sysctl_mem = sysctl_udp_mem, .sysctl_wmem_offset = offsetof(struct net, ipv4.sysctl_udp_wmem_min), .sysctl_rmem_offset = offsetof(struct net, ipv4.sysctl_udp_rmem_min), -- 2.21.0 ^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH net 3/3] udp: avoid bulk memory scheduling on memory pressure. 2020-01-17 17:27 ` [PATCH net 3/3] udp: avoid bulk memory scheduling on memory pressure Paolo Abeni @ 2020-01-17 17:51 ` Eric Dumazet 2020-01-17 18:37 ` Paolo Abeni 2020-01-17 18:38 ` Willem de Bruijn 0 siblings, 2 replies; 7+ messages in thread From: Eric Dumazet @ 2020-01-17 17:51 UTC (permalink / raw) To: Paolo Abeni, netdev; +Cc: David S. Miller, Willem de Bruijn On 1/17/20 9:27 AM, Paolo Abeni wrote: > Williem reported that after commit 0d4a6608f68c ("udp: do rmem bulk > free even if the rx sk queue is empty") the memory allocated by > an almost idle system with many UDP sockets can grow a lot. > > This change addresses the issue enabling memory pressure tracking > for UDP and flushing the fwd allocated memory on dequeue if the > UDP protocol is under memory pressure. > > Note that with this patch applied, the system allocates more > liberally memory for UDP sockets while the total memory usage is > below udp_mem[1], while the vanilla kernel would allow at most a > single page per socket when UDP memory usage goes above udp_mem[0] > - see __sk_mem_raise_allocated(). > > Reported-and-diagnosed-by: Willem de Bruijn <willemdebruijn.kernel@gmail.com> > Fixes: commit 0d4a6608f68c ("udp: do rmem bulk free even if the rx sk queue is empty") Not a proper Fixes: tag Frankly I would rather revert this patch, unless you show how much things were improved. Where in the UDP code the forward allocations will be released while udp_memory_pressure is hit ? TCP has many calls to sk_mem_reclaim() and sk_mem_reclaim_partial() to try to gracefully exit memory pressure. ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH net 3/3] udp: avoid bulk memory scheduling on memory pressure. 2020-01-17 17:51 ` Eric Dumazet @ 2020-01-17 18:37 ` Paolo Abeni 2020-01-17 18:38 ` Willem de Bruijn 1 sibling, 0 replies; 7+ messages in thread From: Paolo Abeni @ 2020-01-17 18:37 UTC (permalink / raw) To: Eric Dumazet, netdev; +Cc: David S. Miller, Willem de Bruijn Hi, On Fri, 2020-01-17 at 09:51 -0800, Eric Dumazet wrote: > On 1/17/20 9:27 AM, Paolo Abeni wrote: > > Williem reported that after commit 0d4a6608f68c ("udp: do rmem bulk > > free even if the rx sk queue is empty") the memory allocated by > > an almost idle system with many UDP sockets can grow a lot. > > > > This change addresses the issue enabling memory pressure tracking > > for UDP and flushing the fwd allocated memory on dequeue if the > > UDP protocol is under memory pressure. > > > > Note that with this patch applied, the system allocates more > > liberally memory for UDP sockets while the total memory usage is > > below udp_mem[1], while the vanilla kernel would allow at most a > > single page per socket when UDP memory usage goes above udp_mem[0] > > - see __sk_mem_raise_allocated(). > > > > Reported-and-diagnosed-by: Willem de Bruijn <willemdebruijn.kernel@gmail.com> > > Fixes: commit 0d4a6608f68c ("udp: do rmem bulk free even if the rx sk queue is empty") Thank you for the feedback. > Not a proper Fixes: tag > > Frankly I would rather revert this patch, unless you show how much things were improved. unpatched version: # ensure we will hit memory pressure echo "5000 10000 20000" > /proc/sys/net/ipv4/udp_mem # run the repro from Willem ./repro.py # it get stuck after opening a bunch of sockets, because # __udp_enqueue_schedule_skb() hits the memory limit # and packets are dropped. patched kernel: echo "5000 10000 20000" > /proc/sys/net/ipv4/udp_mem ./repro.py # completes successfully, output trimmed sockets: used 10179 TCP: inuse 4 orphan 0 tw 0 alloc 7 mem 1 UDP: inuse 4 mem 19860 UDPLITE: inuse 0 RAW: inuse 0 FRAG: inuse 0 memory 0 To complete successfully with an unpatched kernel the reproducer requires memory limits 1 or 2 order of magnitude greaters. > Where in the UDP code the forward allocations will be released while udp_memory_pressure > is hit ? fwd memory is released by udp_rmem_release(), so every time some process tries to read from any UDP socket. Surely less effective than the TCP infrastructure, but that other option looks overkill for UDP?!? Thanks, Paolo ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH net 3/3] udp: avoid bulk memory scheduling on memory pressure. 2020-01-17 17:51 ` Eric Dumazet 2020-01-17 18:37 ` Paolo Abeni @ 2020-01-17 18:38 ` Willem de Bruijn 1 sibling, 0 replies; 7+ messages in thread From: Willem de Bruijn @ 2020-01-17 18:38 UTC (permalink / raw) To: Eric Dumazet; +Cc: Paolo Abeni, Network Development, David S. Miller On Fri, Jan 17, 2020 at 12:51 PM Eric Dumazet <eric.dumazet@gmail.com> wrote: > > > > On 1/17/20 9:27 AM, Paolo Abeni wrote: > > Williem reported that after commit 0d4a6608f68c ("udp: do rmem bulk > > free even if the rx sk queue is empty") the memory allocated by > > an almost idle system with many UDP sockets can grow a lot. > > > > This change addresses the issue enabling memory pressure tracking > > for UDP and flushing the fwd allocated memory on dequeue if the > > UDP protocol is under memory pressure. > > > > Note that with this patch applied, the system allocates more > > liberally memory for UDP sockets while the total memory usage is > > below udp_mem[1], while the vanilla kernel would allow at most a > > single page per socket when UDP memory usage goes above udp_mem[0] > > - see __sk_mem_raise_allocated(). > > > > Reported-and-diagnosed-by: Willem de Bruijn <willemdebruijn.kernel@gmail.com> > > Fixes: commit 0d4a6608f68c ("udp: do rmem bulk free even if the rx sk queue is empty") Thanks a lot for the quick follow-up, Paolo! > Not a proper Fixes: tag And to give credit where it's due: Eric diagnosed the issue. > Frankly I would rather revert this patch, unless you show how much things were improved. In response to your question in the cover letter, I also think that for net the three-line revert patch is more obviously correct. Memory pressure might be helpful in net-next with some iteration, of course. > Where in the UDP code the forward allocations will be released while udp_memory_pressure > is hit ? > > TCP has many calls to sk_mem_reclaim() and sk_mem_reclaim_partial() to try > to gracefully exit memory pressure. > ^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2020-01-17 18:39 UTC | newest] Thread overview: 7+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2020-01-17 17:27 [PATCH net 0/3] udp: behave under memory pressure Paolo Abeni 2020-01-17 17:27 ` [PATCH net 1/3] net: generic enter_memory_pressure implementation Paolo Abeni 2020-01-17 17:27 ` [PATCH net 2/3] net: add annotations to memory_pressure lockless access Paolo Abeni 2020-01-17 17:27 ` [PATCH net 3/3] udp: avoid bulk memory scheduling on memory pressure Paolo Abeni 2020-01-17 17:51 ` Eric Dumazet 2020-01-17 18:37 ` Paolo Abeni 2020-01-17 18:38 ` Willem de Bruijn
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).