linux-mm.kvack.org archive mirror
From: Shakeel Butt <shakeel.butt@linux.dev>
To: Kuniyuki Iwashima <kuniyu@google.com>
Cc: "David S. Miller" <davem@davemloft.net>,
	"Eric Dumazet" <edumazet@google.com>,
	"Jakub Kicinski" <kuba@kernel.org>,
	"Neal Cardwell" <ncardwell@google.com>,
	"Paolo Abeni" <pabeni@redhat.com>,
	"Willem de Bruijn" <willemb@google.com>,
	"Matthieu Baerts" <matttbe@kernel.org>,
	"Mat Martineau" <martineau@kernel.org>,
	"Johannes Weiner" <hannes@cmpxchg.org>,
	"Michal Hocko" <mhocko@kernel.org>,
	"Roman Gushchin" <roman.gushchin@linux.dev>,
	"Andrew Morton" <akpm@linux-foundation.org>,
	"Michal Koutný" <mkoutny@suse.com>, "Tejun Heo" <tj@kernel.org>,
	"Simon Horman" <horms@kernel.org>,
	"Geliang Tang" <geliang@kernel.org>,
	"Muchun Song" <muchun.song@linux.dev>,
	"Mina Almasry" <almasrymina@google.com>,
	"Kuniyuki Iwashima" <kuni1840@gmail.com>,
	netdev@vger.kernel.org, mptcp@lists.linux.dev,
	cgroups@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [PATCH v3 net-next 12/12] net-memcg: Decouple controlled memcg from global protocol memory accounting.
Date: Wed, 13 Aug 2025 00:11:26 -0700
Message-ID: <w6klr435a4rygmnifuujg6x4k77ch7cwoq6dspmyknqt24cpjz@bbz4wzmxjsfk>
In-Reply-To: <20250812175848.512446-13-kuniyu@google.com>

On Tue, Aug 12, 2025 at 05:58:30PM +0000, Kuniyuki Iwashima wrote:
> Some protocols (e.g., TCP, UDP) implement memory accounting for socket
> buffers and charge memory to per-protocol global counters pointed to by
> sk->sk_proto->memory_allocated.
> 
> When running under a non-root cgroup, this memory is also charged to the
> memcg as "sock" in memory.stat.
> 
> Even when a memcg controls memory usage, sockets of such protocols are
> still subject to global limits (e.g., /proc/sys/net/ipv4/tcp_mem).
> 
> This makes it difficult to accurately estimate and configure appropriate
> global limits, especially in multi-tenant environments.
> 
> If all workloads were guaranteed to be controlled under memcg, the issue
> could be worked around by setting tcp_mem[0~2] to UINT_MAX.
> 
> In reality, this assumption does not always hold, and processes that
> belong to the root cgroup or opt out of memcg can consume memory up to
> the global limit, becoming a noisy neighbour.

Processes running in the root memcg (I am not sure what 'opt out of
memcg' means) imply that the admin has intentionally allowed scenarios
where a noisy-neighbour situation can happen, so I am not really
following your argument here.

> 
> Let's decouple memcg from the global per-protocol memory accounting if
> it has a finite memory.max (!= "max").

Why decouple only for some? (Also, if you really want to check memcg
limits, you need to check the limits of all ancestors and not just the
given memcg.)
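
To illustrate the ancestor point, here is a minimal userspace sketch, not
kernel code. struct cg_node, LIMIT_MAX, effective_max() and
has_finite_limit() are hypothetical names invented for this example; the
only assumption taken from cgroup semantics is that a limit set on any
ancestor also constrains its descendants.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for memory.max == "max" (no limit). */
#define LIMIT_MAX UINT64_MAX

/* Hypothetical, simplified model of a memcg in the hierarchy. */
struct cg_node {
	struct cg_node *parent;	/* NULL for the root */
	uint64_t max;		/* this level's limit, LIMIT_MAX if unset */
};

/* The effective limit is the smallest limit on the path to the root. */
static uint64_t effective_max(const struct cg_node *cg)
{
	uint64_t limit = LIMIT_MAX;

	for (; cg; cg = cg->parent)
		if (cg->max < limit)
			limit = cg->max;

	return limit;
}

/* A memcg is limited if it or any of its ancestors has a finite limit. */
static int has_finite_limit(const struct cg_node *cg)
{
	return effective_max(cg) != LIMIT_MAX;
}
```

In this model, a child with max == LIMIT_MAX under a parent with a finite
max is still limited, which a check of only the given memcg would miss.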

Why not start with just two global options (maybe start with a boot
parameter)?

Option 1: Existing behavior where memcg and global TCP accounting are
coupled.

Option 2: Completely decouple memcg and global TCP accounting, i.e. use
mem_cgroup_sockets_enabled to do either global TCP accounting or memcg
accounting.

Keep option 1 as the default.
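
As a rough sketch of how the two options differ on the charge path, the
userspace model below may help. It is not the kernel implementation:
enum acct_mode, struct memcg_sketch, struct sock_sketch, global_pages and
charge_pages() are all hypothetical names, and a socket with a NULL memcg
pointer stands in for a socket with no memcg to charge.

```c
#include <stddef.h>

/* Hypothetical global switch, e.g. chosen once at boot. */
enum acct_mode {
	ACCT_COUPLED,	/* option 1: charge memcg and global counter */
	ACCT_DECOUPLED,	/* option 2: charge memcg or global counter */
};

/* Hypothetical per-memcg "sock" counter. */
struct memcg_sketch {
	long sock_pages;
};

/* Hypothetical socket: memcg is NULL when there is no memcg to charge. */
struct sock_sketch {
	struct memcg_sketch *memcg;
};

static long global_pages;			/* models the per-protocol counter */
static enum acct_mode acct_mode = ACCT_COUPLED;	/* option 1 is the default */

static void charge_pages(struct sock_sketch *sk, long pages)
{
	/* A socket that belongs to a memcg is always charged to it. */
	if (sk->memcg)
		sk->memcg->sock_pages += pages;

	/*
	 * The global counter is charged for sockets without a memcg, and
	 * for every socket when the two accountings are coupled.
	 */
	if (!sk->memcg || acct_mode == ACCT_COUPLED)
		global_pages += pages;
}
```

With a single global switch there is no per-memcg condition to evaluate on
the charge path, which is what keeps the first step simple.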

I assume you want a third option where a mix of these options can
happen, i.e. some sockets are accounted only to a memcg and some are
accounted to both memcg and global TCP. I would recommend making that a
follow-up patch series. Keep this series simple and non-controversial.


