On Mon, Jul 21, 2025 at 08:35:32PM +0000, Kuniyuki Iwashima <kuniyu@google.com> wrote:
> Some protocols (e.g., TCP, UDP) implement memory accounting for socket
> buffers and charge memory to per-protocol global counters pointed to by
> sk->sk_proto->memory_allocated.
> 
> When running under a non-root cgroup, this memory is also charged to the
> memcg as sock in memory.stat.
> 
> Even when memory usage is controlled by memcg, sockets using such protocols
> are still subject to global limits (e.g., /proc/sys/net/ipv4/tcp_mem).

IIUC the envisioned use case is that some cgroups draw from the global
resource while others are bound by their own limit.
That means the admin knows both:
  a) how to configure an individual cgroup,
  b) how to configure the global limit (for the rest).
So why can't they stick to a single model?

> This makes it difficult to accurately estimate and configure appropriate
> global limits, especially in multi-tenant environments.
> 
> If all workloads were guaranteed to be controlled under memcg, the issue
> could be worked around by setting tcp_mem[0~2] to UINT_MAX.
> 
> In reality, this assumption does not always hold, and a single workload
> that opts out of memcg can consume memory up to the global limit,
> becoming a noisy neighbour.

It doesn't look like a good idea to remove limits from possibly noisy
units.

> Let's decouple memcg from the global per-protocol memory accounting.
> 
> This simplifies memcg configuration while keeping the global limits
> within a reasonable range.

I think this is a configuration issue only, i.e. instead of preserving
the global limit because of _some_ memcgs, the configuration management
could have a default memcg limit that is applied to those memcgs, so
that there's no risk of runaways even in the absence of a global limit.
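
To illustrate what I mean by a default memcg limit, here is a rough
sketch of what configuration management could do. It assumes cgroup v2,
where an unlimited memory.max reads back as "max". The function name,
the 1 GiB default and the idea of walking the whole tree are made up
for the example, they are not taken from any existing tool.

```python
import os

# Hypothetical default; a real deployment would derive this from policy.
DEFAULT_LIMIT_BYTES = 1 << 30  # 1 GiB


def apply_default_memcg_limit(root, default=DEFAULT_LIMIT_BYTES):
    """Write `default` to memory.max of every cgroup under `root`
    that is still unlimited. Returns the list of changed cgroups."""
    changed = []
    for dirpath, _dirs, files in os.walk(root):
        if "memory.max" not in files:
            continue  # e.g. the root cgroup has no memory.max
        path = os.path.join(dirpath, "memory.max")
        with open(path) as f:
            current = f.read().strip()
        # cgroup v2 reports "max" when no limit has been configured.
        if current == "max":
            with open(path, "w") as f:
                f.write(str(default))
            changed.append(dirpath)
    return changed
```

On a real system this would be called with the cgroup2 mount point
(usually /sys/fs/cgroup) and needs the privileges to write there.
Cgroups that already carry an explicit limit are left alone.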

Regards,
Michal