All of lore.kernel.org
 help / color / mirror / Atom feed
From: Simon Horman <horms@kernel.org>
To: Abel Wu <wuyun.abel@bytedance.com>
Cc: "David S. Miller" <davem@davemloft.net>,
	Eric Dumazet <edumazet@google.com>,
	Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Shakeel Butt <shakeelb@google.com>,
	Roman Gushchin <roman.gushchin@linux.dev>,
	Michal Hocko <mhocko@suse.com>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Yosry Ahmed <yosryahmed@google.com>, Yu Zhao <yuzhao@google.com>,
	"Matthew Wilcox (Oracle)" <willy@infradead.org>,
	Kefeng Wang <wangkefeng.wang@huawei.com>,
	Yafang Shao <laoar.shao@gmail.com>,
	Kuniyuki Iwashima <kuniyu@amazon.com>,
	Martin KaFai Lau <martin.lau@kernel.org>,
	Breno Leitao <leitao@debian.org>,
	Alexander Mikhalitsyn <alexander@mihalicyn.com>,
	David Howells <dhowells@redhat.com>,
	Jason Xing <kernelxing@tencent.com>,
	open list <linux-kernel@vger.kernel.org>,
	"open list:NETWORKING [GENERAL]" <netdev@vger.kernel.org>,
	"open list:MEMORY MANAGEMENT" <linux-mm@kvack.org>
Subject: Re: [RFC PATCH net-next 3/3] sock: Throttle pressure-aware sockets under pressure
Date: Fri, 1 Sep 2023 15:59:32 +0200	[thread overview]
Message-ID: <20230901135932.GH140739@kernel.org> (raw)
In-Reply-To: <20230901062141.51972-4-wuyun.abel@bytedance.com>

On Fri, Sep 01, 2023 at 02:21:28PM +0800, Abel Wu wrote:
> A socket is pressure-aware when its protocol has pressure defined, that
> is sk_has_memory_pressure(sk) != NULL, e.g. TCP. These protocols might
> want to limit the usage of socket memory depending on both the state of
> global & memcg pressure through sk_under_memory_pressure(sk).
> 
> While for allocation, memcg pressure will be simply ignored when usage
> is under global limit (sysctl_mem[0]). This behavior has different impacts
> on different cgroup modes. In cgroupv2 socket and other purposes share a
> same memory limit, thus allowing sockmem to burst under memcg reclaiming
> pressure could lead to longer stall, sometimes even OOM. While cgroupv1
> has no such worries.
> 
> As a cloud service provider, we encountered a problem in our production
> environment during the transition from cgroup v1 to v2 (partly due to the
> heavy taxes of accounting socket memory in v1). Say one workload behaves
> fine in cgroupv1 with memcg limit configured to 10GB memory and another
> 1GB tcpmem, but will suck (or even be OOM-killed) in v2 with 11GB memory
> due to burst memory usage on socket, since there is no specific limit for
> socket memory in cgroupv2 and relies largely on workloads doing traffic
> control themselves.
> 
> It's rational for the workloads to build some traffic control to better
> utilize the resources they bought, but from kernel's point of view it's
> also reasonable to suppress the allocation of socket memory once there is
> a shortage of free memory, given that performance degradation is better
> than failure.
> 
> As per the above, this patch aims to be more conservative on allocation
> for the pressure-aware sockets under global and/or memcg pressure. While
> OTOH throttling on incoming traffic could hurt latency badly possibly
> due to SACKed segs get dropped from the OFO queue. See a related commit
> 720ca52bcef22 ("net-memcg: avoid stalls when under memory pressure").
> This patch preserves this decision by throttling RX allocation only at
> critical pressure level when it hardly makes sense to continue receive
> data.
> 
> No functional change intended for pressure-unaware protocols.
> 
> Signed-off-by: Abel Wu <wuyun.abel@bytedance.com>

...

> @@ -3087,8 +3100,20 @@ int __sk_mem_raise_allocated(struct sock *sk, int size, int amt, int kind)
>  	if (sk_has_memory_pressure(sk)) {
>  		u64 alloc;
>  
> -		if (!sk_under_memory_pressure(sk))
> +		/* Be more conservative if the socket's memcg (or its
> +		 * parents) is under reclaim pressure, try to possibly
> +		 * avoid further memstall.
> +		 */
> +		if (under_memcg_pressure)
> +			goto suppress_allocation;
> +
> +		if (!sk_under_global_memory_pressure(sk))
>  			return 1;
> +
> +		/* Trying to be fair among all the sockets of same
> +		 * protocal under global memory pressure, by allowing

nit: checkpatch.pl --codespell says, protocal -> protocol

> +		 * the ones that under average usage to raise.
> +		 */
>  		alloc = sk_sockets_allocated_read_positive(sk);
>  		if (sk_prot_mem_limits(sk, 2) > alloc *
>  		    sk_mem_pages(sk->sk_wmem_queued +
> -- 
> 2.37.3
> 
> 


  reply	other threads:[~2023-09-01 14:00 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-09-01  6:21 [RFC PATCH net-next 0/3] sock: Be aware of memcg pressure on alloc Abel Wu
2023-09-01  6:21 ` [RFC PATCH net-next 1/3] sock: Code cleanup on __sk_mem_raise_allocated() Abel Wu
2023-09-14  5:47   ` Shakeel Butt
2023-09-01  6:21 ` [RFC PATCH net-next 2/3] net-memcg: Record pressure level when under pressure Abel Wu
2023-09-01  6:21 ` [RFC PATCH net-next 3/3] sock: Throttle pressure-aware sockets " Abel Wu
2023-09-01 13:59   ` Simon Horman [this message]
2023-09-03  4:54     ` Abel Wu
2023-09-18  7:48   ` Abel Wu
2023-09-18 15:49     ` Shakeel Butt
2023-09-08  7:55 ` [RFC PATCH net-next 0/3] sock: Be aware of memcg pressure on alloc Abel Wu
2023-09-08 15:42   ` Shakeel Butt
2023-09-10  5:09     ` Abel Wu
2023-09-14 21:20 ` Shakeel Butt
2023-09-15  8:47   ` Abel Wu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20230901135932.GH140739@kernel.org \
    --to=horms@kernel.org \
    --cc=akpm@linux-foundation.org \
    --cc=alexander@mihalicyn.com \
    --cc=davem@davemloft.net \
    --cc=dhowells@redhat.com \
    --cc=edumazet@google.com \
    --cc=hannes@cmpxchg.org \
    --cc=kernelxing@tencent.com \
    --cc=kuba@kernel.org \
    --cc=kuniyu@amazon.com \
    --cc=laoar.shao@gmail.com \
    --cc=leitao@debian.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=martin.lau@kernel.org \
    --cc=mhocko@suse.com \
    --cc=netdev@vger.kernel.org \
    --cc=pabeni@redhat.com \
    --cc=roman.gushchin@linux.dev \
    --cc=shakeelb@google.com \
    --cc=wangkefeng.wang@huawei.com \
    --cc=willy@infradead.org \
    --cc=wuyun.abel@bytedance.com \
    --cc=yosryahmed@google.com \
    --cc=yuzhao@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.