From: Kuniyuki Iwashima <kuniyu@google.com>
To: "David S. Miller" <davem@davemloft.net>,
	"Eric Dumazet" <edumazet@google.com>,
	"Jakub Kicinski" <kuba@kernel.org>,
	"Neal Cardwell" <ncardwell@google.com>,
	"Paolo Abeni" <pabeni@redhat.com>,
	"Willem de Bruijn" <willemb@google.com>,
	"Matthieu Baerts" <matttbe@kernel.org>,
	"Mat Martineau" <martineau@kernel.org>,
	"Johannes Weiner" <hannes@cmpxchg.org>,
	"Michal Hocko" <mhocko@kernel.org>,
	"Roman Gushchin" <roman.gushchin@linux.dev>,
	"Shakeel Butt" <shakeel.butt@linux.dev>,
	"Andrew Morton" <akpm@linux-foundation.org>,
	"Michal Koutný" <mkoutny@suse.com>, "Tejun Heo" <tj@kernel.org>
Cc: Simon Horman <horms@kernel.org>,
	Geliang Tang <geliang@kernel.org>,
	 Muchun Song <muchun.song@linux.dev>,
	Mina Almasry <almasrymina@google.com>,
	 Kuniyuki Iwashima <kuniyu@google.com>,
	Kuniyuki Iwashima <kuni1840@gmail.com>,
	netdev@vger.kernel.org,  mptcp@lists.linux.dev,
	cgroups@vger.kernel.org, linux-mm@kvack.org
Subject: [PATCH v3 net-next 00/12] net-memcg: Decouple controlled memcg from sk->sk_prot->memory_allocated.
Date: Tue, 12 Aug 2025 17:58:18 +0000
Message-ID: <20250812175848.512446-1-kuniyu@google.com>

Some protocols (e.g., TCP, UDP) have their own memory accounting for
socket buffers and charge memory to global per-protocol counters, whose
limits are set via knobs such as /proc/sys/net/ipv4/tcp_mem.

When a socket belongs to a non-root cgroup, this memory is also charged
to the memcg and reported as "sock" in memory.stat.

Sockets of such protocols are still subject to the global limits, and
can thus be affected by a noisy neighbour outside the cgroup.

This makes it difficult to accurately estimate and configure appropriate
global limits.

If all workloads were guaranteed to be controlled under memcg, the issue
could be worked around by setting tcp_mem[0~2] to UINT_MAX.

However, this assumption does not always hold, and processes that belong
to the root cgroup or opt out of memcg can consume memory up to the global
limit, which is problematic.

This series decouples a memcg from the global per-protocol memory
accounting if the memcg's memory.max is not "max".  This simplifies the
memcg configuration while keeping the global limits within a reasonable
range; by default they amount to only 10% of the physical memory.
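
As a rough illustration of the decoupling described above, here is a
minimal userspace sketch.  It is not code from the series: every name in
it (toy_memcg, toy_charge(), toy_global_allocated, TOY_MAX_UNLIMITED) is
hypothetical, and the limit checks are simplified.  It only shows the
idea that a socket whose memcg has a finite memory.max is charged to the
memcg alone, while any other socket is still counted against the global
per-protocol counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
#define TOY_MAX_UNLIMITED (-1L)		/* models memory.max == "max" */

struct toy_memcg {
	long charged;			/* pages charged to this memcg */
	long max;			/* models memory.max, in pages */
};

/* Stands in for the global counter behind sk->sk_prot->memory_allocated. */
static long toy_global_allocated;

/* A memcg is treated as isolated when its limit is finite. */
static bool toy_memcg_isolated(const struct toy_memcg *cg)
{
	return cg && cg->max != TOY_MAX_UNLIMITED;
}

/* Try to charge @pages; returns true on success. */
static bool toy_charge(struct toy_memcg *cg, long pages, long global_limit)
{
	assert(pages >= 0);

	if (toy_memcg_isolated(cg)) {
		/* Only the memcg limit applies; the global counter is untouched. */
		if (cg->charged + pages > cg->max)
			return false;
		cg->charged += pages;
		return true;
	}

	/* Root cgroup (cg == NULL) or unlimited memcg: global limit applies. */
	if (toy_global_allocated + pages > global_limit)
		return false;
	toy_global_allocated += pages;
	if (cg)
		cg->charged += pages;	/* still reported by the memcg */
	return true;
}
```

In this model, a memcg with max = 100 can be charged 60 pages even when
global_limit is 10, and toy_global_allocated stays at 0, whereas a memcg
with max = TOY_MAX_UNLIMITED fails once the global counter would exceed
the limit.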

Overview of the series:

  patch 1 is a bug fix for MPTCP
  patch 2 ~ 9 move sk->sk_memcg accesses to a single place
  patch 10 moves sk_memcg under CONFIG_MEMCG
  patch 11 stores a flag in the lowest bit of sk->sk_memcg
  patch 12 decouples memcg from sk_prot->memory_allocated based on the flag
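
For readers unfamiliar with the trick used in patch 11, the following is
a minimal userspace sketch of keeping a flag in the lowest bit of a
pointer.  It is an illustration under assumptions, not the code from the
patch: the names (tag_sock, tag_memcg, TAG_SOCK_ISOLATED and the helpers)
are made up, and it relies on the pointed-to object being aligned to at
least 2 bytes so that bit 0 of a valid pointer is always zero.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names for illustration; not the definitions from patch 11. */
#define TAG_SOCK_ISOLATED	((uintptr_t)1)
#define TAG_FLAG_MASK		TAG_SOCK_ISOLATED

struct tag_memcg {
	long charged;		/* alignment > 1, so bit 0 of its address is 0 */
};

struct tag_sock {
	uintptr_t memcg_tagged;	/* pointer value with the flag in bit 0 */
};

/* Store the pointer and, for a non-NULL memcg, the isolation flag. */
static void tag_sk_set_memcg(struct tag_sock *sk, struct tag_memcg *cg,
			     bool isolated)
{
	uintptr_t val = (uintptr_t)cg;

	assert((val & TAG_FLAG_MASK) == 0);	/* bit 0 must be free */
	if (cg && isolated)
		val |= TAG_SOCK_ISOLATED;
	sk->memcg_tagged = val;
}

/* Mask the flag off before using the value as a pointer. */
static struct tag_memcg *tag_memcg_from_sk(const struct tag_sock *sk)
{
	return (struct tag_memcg *)(sk->memcg_tagged & ~TAG_FLAG_MASK);
}

/* Test the flag without dereferencing the memcg. */
static bool tag_sk_isolated(const struct tag_sock *sk)
{
	return (sk->memcg_tagged & TAG_SOCK_ISOLATED) != 0;
}
```

Funnelling every access through one accessor, as patches 2 ~ 9 do for
sk->sk_memcg, is what makes such tagging safe: no caller dereferences
the raw tagged value by accident.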


Changes:
  v3:
    * Patch 12
      * Fix build failure for kTLS (include <net/proto_memory.h>)

  v2: https://lore.kernel.org/netdev/20250811173116.2829786-1-kuniyu@google.com/
    * Remove per-memcg knob
    * Patch 11
      * Set flag on sk_memcg based on memory.max
    * Patch 12
      * Add sk_should_enter_memory_pressure() and cover
        tcp_enter_memory_pressure() calls
      * Update examples in changelog

  v1: https://lore.kernel.org/netdev/20250721203624.3807041-1-kuniyu@google.com/


Kuniyuki Iwashima (12):
  mptcp: Fix up subflow's memcg when CONFIG_SOCK_CGROUP_DATA=n.
  mptcp: Use tcp_under_memory_pressure() in mptcp_epollin_ready().
  tcp: Simplify error path in inet_csk_accept().
  net: Call trace_sock_exceed_buf_limit() for memcg failure with
    SK_MEM_RECV.
  net: Clean up __sk_mem_raise_allocated().
  net-memcg: Introduce mem_cgroup_from_sk().
  net-memcg: Introduce mem_cgroup_sk_enabled().
  net-memcg: Pass struct sock to mem_cgroup_sk_(un)?charge().
  net-memcg: Pass struct sock to mem_cgroup_sk_under_memory_pressure().
  net: Define sk_memcg under CONFIG_MEMCG.
  net-memcg: Store MEMCG_SOCK_ISOLATED in sk->sk_memcg.
  net-memcg: Decouple controlled memcg from global protocol memory
    accounting.

 include/linux/memcontrol.h      | 45 +++++++++-------
 include/net/proto_memory.h      | 15 ++++--
 include/net/sock.h              | 67 +++++++++++++++++++++++
 include/net/tcp.h               | 10 ++--
 mm/memcontrol.c                 | 48 +++++++++++++----
 net/core/sock.c                 | 94 +++++++++++++++++++++------------
 net/ipv4/inet_connection_sock.c | 35 +++++++-----
 net/ipv4/tcp.c                  |  3 +-
 net/ipv4/tcp_output.c           | 13 +++--
 net/mptcp/protocol.c            |  4 +-
 net/mptcp/protocol.h            |  4 +-
 net/mptcp/subflow.c             | 11 ++--
 net/tls/tls_device.c            |  4 +-
 13 files changed, 254 insertions(+), 99 deletions(-)

-- 
2.51.0.rc0.205.g4a044479a3-goog


