* [PATCH v2] memcg: net: track network throttling due to memcg memory pressure
@ 2025-10-16 16:10 Shakeel Butt
  2025-10-16 19:46 ` Andrew Morton
  0 siblings, 1 reply; 3+ messages in thread
From: Shakeel Butt @ 2025-10-16 16:10 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Johannes Weiner, Michal Hocko, Roman Gushchin, Muchun Song,
	Tejun Heo, Eric Dumazet, Kuniyuki Iwashima, Paolo Abeni,
	Willem de Bruijn, Jakub Kicinski, David S . Miller, Matyas Hurtik,
	Daniel Sedlak, Simon Horman, Neal Cardwell, Wei Wang, netdev,
	linux-mm, cgroups, linux-kernel, Meta kernel team

The kernel can throttle network sockets if the memory cgroup associated
with the corresponding socket is under memory pressure. The throttling
actions include clamping the transmit window, failing to expand receive
or send buffers, aggressively pruning the out-of-order receive queue,
deferring the FIN to a retransmitted packet, and more. Let's add a memcg
metric to track such throttling actions.

At the moment memcg memory pressure is defined through vmpressure; in
the future it may be defined using PSI, or we may add a more flexible way
for users to define memory pressure, maybe through eBPF. However, the
potential throttling actions will remain the same, so this newly
introduced metric will continue to track throttling actions irrespective
of how memcg memory pressure is defined.

Signed-off-by: Shakeel Butt <shakeel.butt@linux.dev>
Acked-by: Roman Gushchin <roman.gushchin@linux.dev>
Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com>
Reviewed-by: Daniel Sedlak <daniel.sedlak@cdn77.com>
---
Changes since v1:
- renamed socks_throttled & MEMCG_SOCKS_THROTTLED to sock_throttled &
  MEMCG_SOCK_THROTTLED, as suggested by Roman
http://lore.kernel.org/20251016013116.3093530-1-shakeel.butt@linux.dev
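
For anyone wanting to watch the new counter from userspace, here is a
minimal sketch of reading it. It assumes memory.events keeps its flat
"key value" layout and that the key is spelled sock_throttled as in this
patch. The helper names read_event_count() and read_sock_throttled() are
hypothetical, and the cgroup path is whatever your hierarchy uses.

```c
#include <stdio.h>
#include <string.h>

/*
 * Look up one "key value" line in a flat keyed file such as memory.events.
 * Returns 0 and stores the value on success, -1 if the key is absent
 * (for example on a kernel without this patch).
 */
static int read_event_count(FILE *f, const char *key, unsigned long *out)
{
	char name[64];
	unsigned long val;

	while (fscanf(f, "%63s %lu", name, &val) == 2) {
		if (strcmp(name, key) == 0) {
			*out = val;
			return 0;
		}
	}
	return -1;
}

/* Convenience wrapper: open a memory.events file and read sock_throttled. */
static int read_sock_throttled(const char *path, unsigned long *out)
{
	FILE *f = fopen(path, "r");
	int ret;

	if (!f)
		return -1;
	ret = read_event_count(f, "sock_throttled", out);
	fclose(f);
	return ret;
}
```

A monitoring agent could call read_sock_throttled() on the cgroup's
memory.events path at intervals and report the delta. A return of -1
means either the file could not be opened or the key is not present.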

 Documentation/admin-guide/cgroup-v2.rst | 4 ++++
 include/linux/memcontrol.h              | 1 +
 include/net/sock.h                      | 6 +++++-
 kernel/cgroup/cgroup.c                  | 1 +
 mm/memcontrol.c                         | 3 +++
 5 files changed, 14 insertions(+), 1 deletion(-)

diff --git a/Documentation/admin-guide/cgroup-v2.rst b/Documentation/admin-guide/cgroup-v2.rst
index 0e6c67ac585a..3345961c30ac 100644
--- a/Documentation/admin-guide/cgroup-v2.rst
+++ b/Documentation/admin-guide/cgroup-v2.rst
@@ -1515,6 +1515,10 @@ The following nested keys are defined.
           oom_group_kill
                 The number of times a group OOM has occurred.
 
+          sock_throttled
+                The number of times network sockets associated with
+                this cgroup are throttled.
+
   memory.events.local
 	Similar to memory.events but the fields in the file are local
 	to the cgroup i.e. not hierarchical. The file modified event
diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index 7ed15f858dc4..e0240560cea4 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -52,6 +52,7 @@ enum memcg_memory_event {
 	MEMCG_SWAP_HIGH,
 	MEMCG_SWAP_MAX,
 	MEMCG_SWAP_FAIL,
+	MEMCG_SOCK_THROTTLED,
 	MEMCG_NR_MEMORY_EVENTS,
 };
 
diff --git a/include/net/sock.h b/include/net/sock.h
index 60bcb13f045c..ff7d49af1619 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -2635,8 +2635,12 @@ static inline bool mem_cgroup_sk_under_memory_pressure(const struct sock *sk)
 #endif /* CONFIG_MEMCG_V1 */
 
 	do {
-		if (time_before64(get_jiffies_64(), mem_cgroup_get_socket_pressure(memcg)))
+		if (time_before64(get_jiffies_64(),
+				  mem_cgroup_get_socket_pressure(memcg))) {
+			memcg_memory_event(mem_cgroup_from_sk(sk),
+					   MEMCG_SOCK_THROTTLED);
 			return true;
+		}
 	} while ((memcg = parent_mem_cgroup(memcg)));
 
 	return false;
diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
index fdee387f0d6b..8df671c59987 100644
--- a/kernel/cgroup/cgroup.c
+++ b/kernel/cgroup/cgroup.c
@@ -4704,6 +4704,7 @@ void cgroup_file_notify(struct cgroup_file *cfile)
 	}
 	spin_unlock_irqrestore(&cgroup_file_kn_lock, flags);
 }
+EXPORT_SYMBOL_GPL(cgroup_file_notify);
 
 /**
  * cgroup_file_show - show or hide a hidden cgroup file
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 3ae5cbcaed75..976412c8196e 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -81,6 +81,7 @@ struct cgroup_subsys memory_cgrp_subsys __read_mostly;
 EXPORT_SYMBOL(memory_cgrp_subsys);
 
 struct mem_cgroup *root_mem_cgroup __read_mostly;
+EXPORT_SYMBOL(root_mem_cgroup);
 
 /* Active memory cgroup to use from an interrupt context */
 DEFINE_PER_CPU(struct mem_cgroup *, int_active_memcg);
@@ -4463,6 +4464,8 @@ static void __memory_events_show(struct seq_file *m, atomic_long_t *events)
 		   atomic_long_read(&events[MEMCG_OOM_KILL]));
 	seq_printf(m, "oom_group_kill %lu\n",
 		   atomic_long_read(&events[MEMCG_OOM_GROUP_KILL]));
+	seq_printf(m, "sock_throttled %lu\n",
+		   atomic_long_read(&events[MEMCG_SOCK_THROTTLED]));
 }
 
 static int memory_events_show(struct seq_file *m, void *v)
-- 
2.47.3
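
To illustrate the accounting in the sock.h hunk, here is a simplified
userspace model. It is not kernel code: struct model_memcg and
model_sk_under_pressure() are hypothetical stand-ins, time_before64() is
approximated by a plain comparison, and the model assumes a non-NULL
memcg. The point it shows is that the event is counted against the
socket's own memcg even when the pressure is found on an ancestor.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct mem_cgroup; not the kernel type. */
struct model_memcg {
	struct model_memcg *parent;	/* NULL for the root */
	uint64_t pressure_until;	/* pressure holds while now < this */
	unsigned long sock_throttled;	/* models the new event counter */
};

/*
 * Model of the patched loop: walk from the socket's memcg up to the root.
 * On the first level still under pressure, count one event against the
 * socket's own memcg (not the level that matched) and report pressure.
 */
static bool model_sk_under_pressure(struct model_memcg *sk_memcg, uint64_t now)
{
	struct model_memcg *memcg = sk_memcg;

	do {
		if (now < memcg->pressure_until) {
			sk_memcg->sock_throttled++;
			return true;
		}
	} while ((memcg = memcg->parent));

	return false;
}
```

The real memcg_memory_event() does more than bump one counter: it also
updates the hierarchical event counts and notifies the events files,
which this model leaves out.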



* Re: [PATCH v2] memcg: net: track network throttling due to memcg memory pressure
  2025-10-16 16:10 [PATCH v2] memcg: net: track network throttling due to memcg memory pressure Shakeel Butt
@ 2025-10-16 19:46 ` Andrew Morton
  2025-10-16 19:57   ` Shakeel Butt
  0 siblings, 1 reply; 3+ messages in thread
From: Andrew Morton @ 2025-10-16 19:46 UTC (permalink / raw)
  To: Shakeel Butt
  Cc: Johannes Weiner, Michal Hocko, Roman Gushchin, Muchun Song,
	Tejun Heo, Eric Dumazet, Kuniyuki Iwashima, Paolo Abeni,
	Willem de Bruijn, Jakub Kicinski, David S . Miller, Matyas Hurtik,
	Daniel Sedlak, Simon Horman, Neal Cardwell, Wei Wang, netdev,
	linux-mm, cgroups, linux-kernel, Meta kernel team

On Thu, 16 Oct 2025 09:10:35 -0700 Shakeel Butt <shakeel.butt@linux.dev> wrote:

> The kernel can throttle network sockets if the memory cgroup associated
> with the corresponding socket is under memory pressure. The throttling
> actions include clamping the transmit window, failing to expand receive
> or send buffers, aggressively pruning the out-of-order receive queue,
> deferring the FIN to a retransmitted packet, and more. Let's add a memcg
> metric to track such throttling actions.
> 
> At the moment memcg memory pressure is defined through vmpressure; in
> the future it may be defined using PSI, or we may add a more flexible way
> for users to define memory pressure, maybe through eBPF. However, the
> potential throttling actions will remain the same, so this newly
> introduced metric will continue to track throttling actions irrespective
> of how memcg memory pressure is defined.
> 
> ...
>
> --- a/include/net/sock.h
> +++ b/include/net/sock.h
> @@ -2635,8 +2635,12 @@ static inline bool mem_cgroup_sk_under_memory_pressure(const struct sock *sk)
>  #endif /* CONFIG_MEMCG_V1 */
>  
>  	do {
> -		if (time_before64(get_jiffies_64(), mem_cgroup_get_socket_pressure(memcg)))
> +		if (time_before64(get_jiffies_64(),
> +				  mem_cgroup_get_socket_pressure(memcg))) {
> +			memcg_memory_event(mem_cgroup_from_sk(sk),
> +					   MEMCG_SOCK_THROTTLED);
>  			return true;
> +		}
>  	} while ((memcg = parent_mem_cgroup(memcg)));
>  

Totally OT, but that's one bigass inlined function.  A quick test
indicates that uninlining just this function reduces the size of
tcp_input.o and tcp_output.o nicely.  x86_64 defconfig:

   text	   data	    bss	    dec	    hex	filename
  52130	   1686	      0	  53816	   d238	net/ipv4/tcp_input.o
  32335	   1221	      0	  33556	   8314	net/ipv4/tcp_output.o

   text	   data	    bss	    dec	    hex	filename
  51346	   1494	      0	  52840	   ce68	net/ipv4/tcp_input.o
  31911	   1125	      0	  33036	   810c	net/ipv4/tcp_output.o
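
The savings implied by the two tables can be worked out with a short
sketch. It assumes the first table is the current (inlined) build and
the second is the uninlined one, which matches the stated size
reduction. The struct and function names are hypothetical.

```c
#include <stdio.h>

/* One row of size(1) output; "dec" is text + data + bss. */
struct obj_size {
	const char *file;
	long text, data, bss;
};

/* Assumed to be the inlined (current) build: first table above. */
static const struct obj_size inlined[] = {
	{ "net/ipv4/tcp_input.o",  52130, 1686, 0 },
	{ "net/ipv4/tcp_output.o", 32335, 1221, 0 },
};

/* Assumed to be the uninlined build: second table above. */
static const struct obj_size uninlined[] = {
	{ "net/ipv4/tcp_input.o",  51346, 1494, 0 },
	{ "net/ipv4/tcp_output.o", 31911, 1125, 0 },
};

/* Total bytes for one object file (the "dec" column). */
static long obj_total(const struct obj_size *s)
{
	return s->text + s->data + s->bss;
}

/* Bytes saved going from 'before' to 'after'. */
static long obj_saved(const struct obj_size *before,
		      const struct obj_size *after)
{
	return obj_total(before) - obj_total(after);
}

/* Print the per-file saving for both object files. */
static void print_savings(void)
{
	for (int i = 0; i < 2; i++)
		printf("%s: %ld bytes saved\n", inlined[i].file,
		       obj_saved(&inlined[i], &uninlined[i]));
}
```

With these numbers the difference is 976 bytes for tcp_input.o (784 text
plus 192 data) and 520 bytes for tcp_output.o (424 text plus 96 data).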




* Re: [PATCH v2] memcg: net: track network throttling due to memcg memory pressure
  2025-10-16 19:46 ` Andrew Morton
@ 2025-10-16 19:57   ` Shakeel Butt
  0 siblings, 0 replies; 3+ messages in thread
From: Shakeel Butt @ 2025-10-16 19:57 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Johannes Weiner, Michal Hocko, Roman Gushchin, Muchun Song,
	Tejun Heo, Eric Dumazet, Kuniyuki Iwashima, Paolo Abeni,
	Willem de Bruijn, Jakub Kicinski, David S . Miller, Matyas Hurtik,
	Daniel Sedlak, Simon Horman, Neal Cardwell, Wei Wang, netdev,
	linux-mm, cgroups, linux-kernel, Meta kernel team

On Thu, Oct 16, 2025 at 12:46:10PM -0700, Andrew Morton wrote:
> On Thu, 16 Oct 2025 09:10:35 -0700 Shakeel Butt <shakeel.butt@linux.dev> wrote:
> 
> > The kernel can throttle network sockets if the memory cgroup associated
> > with the corresponding socket is under memory pressure. The throttling
> > actions include clamping the transmit window, failing to expand receive
> > or send buffers, aggressively pruning the out-of-order receive queue,
> > deferring the FIN to a retransmitted packet, and more. Let's add a memcg
> > metric to track such throttling actions.
> > 
> > At the moment memcg memory pressure is defined through vmpressure; in
> > the future it may be defined using PSI, or we may add a more flexible way
> > for users to define memory pressure, maybe through eBPF. However, the
> > potential throttling actions will remain the same, so this newly
> > introduced metric will continue to track throttling actions irrespective
> > of how memcg memory pressure is defined.
> > 
> > ...
> >
> > --- a/include/net/sock.h
> > +++ b/include/net/sock.h
> > @@ -2635,8 +2635,12 @@ static inline bool mem_cgroup_sk_under_memory_pressure(const struct sock *sk)
> >  #endif /* CONFIG_MEMCG_V1 */
> >  
> >  	do {
> > -		if (time_before64(get_jiffies_64(), mem_cgroup_get_socket_pressure(memcg)))
> > +		if (time_before64(get_jiffies_64(),
> > +				  mem_cgroup_get_socket_pressure(memcg))) {
> > +			memcg_memory_event(mem_cgroup_from_sk(sk),
> > +					   MEMCG_SOCK_THROTTLED);
> >  			return true;
> > +		}
> >  	} while ((memcg = parent_mem_cgroup(memcg)));
> >  
> 
> Totally OT, but that's one bigass inlined function.  A quick test
> indicates that uninlining just this function reduces the size of
> tcp_input.o and tcp_output.o nicely.  x86_64 defconfig:
> 
>    text	   data	    bss	    dec	    hex	filename
>   52130	   1686	      0	  53816	   d238	net/ipv4/tcp_input.o
>   32335	   1221	      0	  33556	   8314	net/ipv4/tcp_output.o
> 
>    text	   data	    bss	    dec	    hex	filename
>   51346	   1494	      0	  52840	   ce68	net/ipv4/tcp_input.o
>   31911	   1125	      0	  33036	   810c	net/ipv4/tcp_output.o
> 

Nice find; this inlining might be hurting instead of helping. I will
look into it if no one else gets to it before me.
