* [PATCH net v3 1/3] sock: Code cleanup on __sk_mem_raise_allocated()
@ 2023-10-19 12:00 Abel Wu
2023-10-19 12:00 ` [PATCH net v3 2/3] sock: Doc behaviors for pressure heurisitics Abel Wu
` (3 more replies)
0 siblings, 4 replies; 10+ messages in thread
From: Abel Wu @ 2023-10-19 12:00 UTC (permalink / raw)
To: David S . Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
Shakeel Butt
Cc: netdev, linux-kernel, Abel Wu
Code cleanup for both better simplicity and readability.
No functional change intended.
Signed-off-by: Abel Wu <wuyun.abel@bytedance.com>
Acked-by: Shakeel Butt <shakeelb@google.com>
---
net/core/sock.c | 22 ++++++++++++----------
1 file changed, 12 insertions(+), 10 deletions(-)
diff --git a/net/core/sock.c b/net/core/sock.c
index 16584e2dd648..4412c47466a7 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -3041,17 +3041,19 @@ EXPORT_SYMBOL(sk_wait_data);
*/
int __sk_mem_raise_allocated(struct sock *sk, int size, int amt, int kind)
{
- bool memcg_charge = mem_cgroup_sockets_enabled && sk->sk_memcg;
+ struct mem_cgroup *memcg = mem_cgroup_sockets_enabled ? sk->sk_memcg : NULL;
struct proto *prot = sk->sk_prot;
- bool charged = true;
+ bool charged = false;
long allocated;
sk_memory_allocated_add(sk, amt);
allocated = sk_memory_allocated(sk);
- if (memcg_charge &&
- !(charged = mem_cgroup_charge_skmem(sk->sk_memcg, amt,
- gfp_memcg_charge())))
- goto suppress_allocation;
+
+ if (memcg) {
+ if (!mem_cgroup_charge_skmem(memcg, amt, gfp_memcg_charge()))
+ goto suppress_allocation;
+ charged = true;
+ }
/* Under limit. */
if (allocated <= sk_prot_mem_limits(sk, 0)) {
@@ -3106,8 +3108,8 @@ int __sk_mem_raise_allocated(struct sock *sk, int size, int amt, int kind)
*/
if (sk->sk_wmem_queued + size >= sk->sk_sndbuf) {
/* Force charge with __GFP_NOFAIL */
- if (memcg_charge && !charged) {
- mem_cgroup_charge_skmem(sk->sk_memcg, amt,
+ if (memcg && !charged) {
+ mem_cgroup_charge_skmem(memcg, amt,
gfp_memcg_charge() | __GFP_NOFAIL);
}
return 1;
@@ -3119,8 +3121,8 @@ int __sk_mem_raise_allocated(struct sock *sk, int size, int amt, int kind)
sk_memory_allocated_sub(sk, amt);
- if (memcg_charge && charged)
- mem_cgroup_uncharge_skmem(sk->sk_memcg, amt);
+ if (charged)
+ mem_cgroup_uncharge_skmem(memcg, amt);
return 0;
}
--
2.37.3
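For readers following the control flow, the charge/rollback pattern that this
cleanup makes explicit can be sketched in user space as below. This is an
illustrative model only, not the kernel code: struct toy_memcg, toy_charge(),
toy_uncharge() and toy_raise_allocated() are hypothetical names, and the
global_ok parameter is an assumption that stands in for all of the
protocol-wide limit checks.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct mem_cgroup: a budget counted in pages. */
struct toy_memcg {
	long usage;
	long limit;
};

/* Toy charge: fails, leaving usage untouched, if the limit would be exceeded. */
static bool toy_charge(struct toy_memcg *memcg, long amt)
{
	if (memcg->usage + amt > memcg->limit)
		return false;
	memcg->usage += amt;
	return true;
}

/* Toy uncharge: gives back a previously charged amount. */
static void toy_uncharge(struct toy_memcg *memcg, long amt)
{
	memcg->usage -= amt;
}

/*
 * Returns 1 if the allocation is granted, 0 if it is suppressed.
 * memcg may be NULL, meaning no memcg accounting applies.
 * global_ok (an assumption of this sketch) models the outcome of the
 * protocol-wide checks that follow the memcg charge.
 */
static int toy_raise_allocated(struct toy_memcg *memcg, long amt, bool global_ok)
{
	bool charged = false;

	if (memcg) {
		if (!toy_charge(memcg, amt))
			goto suppress;
		charged = true;
	}

	if (global_ok)
		return 1;

suppress:
	/* charged is only ever true when memcg is non-NULL. */
	if (charged)
		toy_uncharge(memcg, amt);
	return 0;
}
```

Since charged becomes true only after a successful charge, the suppress path
can test charged alone, which mirrors the simplification in the last hunk.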
* [PATCH net v3 2/3] sock: Doc behaviors for pressure heurisitics
2023-10-19 12:00 [PATCH net v3 1/3] sock: Code cleanup on __sk_mem_raise_allocated() Abel Wu
@ 2023-10-19 12:00 ` Abel Wu
2023-10-23 7:49 ` Simon Horman
2023-10-19 12:00 ` [PATCH net v3 3/3] sock: Ignore memcg pressure heuristics when raising allocated Abel Wu
` (2 subsequent siblings)
3 siblings, 1 reply; 10+ messages in thread
From: Abel Wu @ 2023-10-19 12:00 UTC (permalink / raw)
To: David S . Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
Shakeel Butt
Cc: netdev, linux-kernel, Abel Wu
There are now two accounting infrastructures for skmem, while the
heuristics in __sk_mem_raise_allocated() were actually introduced
before memcg was born.
Add some comments to clarify whether they can be applied to both
infrastructures or not.
Suggested-by: Shakeel Butt <shakeelb@google.com>
Signed-off-by: Abel Wu <wuyun.abel@bytedance.com>
Acked-by: Shakeel Butt <shakeelb@google.com>
---
net/core/sock.c | 14 +++++++++++++-
1 file changed, 13 insertions(+), 1 deletion(-)
diff --git a/net/core/sock.c b/net/core/sock.c
index 4412c47466a7..45841a5689b6 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -3069,7 +3069,14 @@ int __sk_mem_raise_allocated(struct sock *sk, int size, int amt, int kind)
if (allocated > sk_prot_mem_limits(sk, 2))
goto suppress_allocation;
- /* guarantee minimum buffer size under pressure */
+ /* Guarantee minimum buffer size under pressure (either global
+ * or memcg) to make sure features described in RFC 7323 (TCP
+ * Extensions for High Performance) work properly.
+ *
+ * This rule does NOT stand when exceeds global or memcg's hard
+ * limit, or else a DoS attack can be taken place by spawning
+ * lots of sockets whose usage are under minimum buffer size.
+ */
if (kind == SK_MEM_RECV) {
if (atomic_read(&sk->sk_rmem_alloc) < sk_get_rmem0(sk, prot))
return 1;
@@ -3090,6 +3097,11 @@ int __sk_mem_raise_allocated(struct sock *sk, int size, int amt, int kind)
if (!sk_under_memory_pressure(sk))
return 1;
+
+ /* Try to be fair among all the sockets under global
+ * pressure by allowing the ones that below average
+ * usage to raise.
+ */
alloc = sk_sockets_allocated_read_positive(sk);
if (sk_prot_mem_limits(sk, 2) > alloc *
sk_mem_pages(sk->sk_wmem_queued +
--
2.37.3
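The "below average usage" rule documented in the second hunk can be
illustrated with a small user-space sketch. It is a simplified model under
stated assumptions, not the kernel implementation: toy_mem_pages() is a
hypothetical stand-in for sk_mem_pages() that assumes 4096-byte pages, and
bytes_used is a single number standing in for the socket's total buffer usage.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed page size for this sketch only. */
#define TOY_PAGE_SIZE 4096u

/* Round a byte count up to whole pages. */
static uint64_t toy_mem_pages(uint64_t bytes)
{
	return (bytes + TOY_PAGE_SIZE - 1) / TOY_PAGE_SIZE;
}

/*
 * True when this socket's usage, multiplied by the number of sockets,
 * still fits under the hard limit, i.e. the socket consumes less than
 * an equal share (hard_limit_pages / nr_sockets) of the global budget.
 */
static bool toy_below_average(uint64_t hard_limit_pages,
			      uint64_t nr_sockets,
			      uint64_t bytes_used)
{
	return hard_limit_pages > nr_sockets * toy_mem_pages(bytes_used);
}
```

With a hard limit of 1000 pages and 100 sockets, a socket holding 8 pages
passes the test (1000 > 800) while one holding 10 pages does not, because the
comparison is strict.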
* Re: [PATCH net v3 2/3] sock: Doc behaviors for pressure heurisitics
2023-10-19 12:00 ` [PATCH net v3 2/3] sock: Doc behaviors for pressure heurisitics Abel Wu
@ 2023-10-23 7:49 ` Simon Horman
0 siblings, 0 replies; 10+ messages in thread
From: Simon Horman @ 2023-10-23 7:49 UTC (permalink / raw)
To: Abel Wu
Cc: David S . Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
Shakeel Butt, netdev, linux-kernel
On Thu, Oct 19, 2023 at 08:00:25PM +0800, Abel Wu wrote:
> There are now two accounting infrastructures for skmem, while the
> heuristics in __sk_mem_raise_allocated() were actually introduced
> before memcg was born.
>
> Add some comments to clarify whether they can be applied to both
> infrastructures or not.
>
> Suggested-by: Shakeel Butt <shakeelb@google.com>
> Signed-off-by: Abel Wu <wuyun.abel@bytedance.com>
> Acked-by: Shakeel Butt <shakeelb@google.com>
Reviewed-by: Simon Horman <horms@kernel.org>
* [PATCH net v3 3/3] sock: Ignore memcg pressure heuristics when raising allocated
2023-10-19 12:00 [PATCH net v3 1/3] sock: Code cleanup on __sk_mem_raise_allocated() Abel Wu
2023-10-19 12:00 ` [PATCH net v3 2/3] sock: Doc behaviors for pressure heurisitics Abel Wu
@ 2023-10-19 12:00 ` Abel Wu
2023-10-23 7:49 ` Simon Horman
2023-10-24 7:08 ` Paolo Abeni
2023-10-23 7:49 ` [PATCH net v3 1/3] sock: Code cleanup on __sk_mem_raise_allocated() Simon Horman
2023-10-24 8:50 ` patchwork-bot+netdevbpf
3 siblings, 2 replies; 10+ messages in thread
From: Abel Wu @ 2023-10-19 12:00 UTC (permalink / raw)
To: David S . Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
Shakeel Butt
Cc: netdev, linux-kernel, Abel Wu
Before sockets became aware of net-memcg's memory pressure since
commit e1aab161e013 ("socket: initial cgroup code."), the memory
usage would be granted to raise if below average even when under
protocol's pressure. This provides fairness among the sockets of
same protocol.
That commit changes this because the heuristic will also be
effective when only memcg is under pressure which makes no sense.
So revert that behavior.
After reverting, __sk_mem_raise_allocated() no longer considers
memcg's pressure. As memcgs are isolated from each other w.r.t.
memory accounting, consuming one's budget won't affect others.
So except the places where buffer sizes are needed to be tuned,
allow workloads to use the memory they are provisioned.
Signed-off-by: Abel Wu <wuyun.abel@bytedance.com>
Acked-by: Shakeel Butt <shakeelb@google.com>
Acked-by: Paolo Abeni <pabeni@redhat.com>
---
net/core/sock.c | 14 ++++++++++++--
1 file changed, 12 insertions(+), 2 deletions(-)
diff --git a/net/core/sock.c b/net/core/sock.c
index 45841a5689b6..0ec3f5d70715 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -3037,7 +3037,13 @@ EXPORT_SYMBOL(sk_wait_data);
* @amt: pages to allocate
* @kind: allocation type
*
- * Similar to __sk_mem_schedule(), but does not update sk_forward_alloc
+ * Similar to __sk_mem_schedule(), but does not update sk_forward_alloc.
+ *
+ * Unlike the globally shared limits among the sockets under same protocol,
+ * consuming the budget of a memcg won't have direct effect on other ones.
+ * So be optimistic about memcg's tolerance, and leave the callers to decide
+ * whether or not to raise allocated through sk_under_memory_pressure() or
+ * its variants.
*/
int __sk_mem_raise_allocated(struct sock *sk, int size, int amt, int kind)
{
@@ -3095,7 +3101,11 @@ int __sk_mem_raise_allocated(struct sock *sk, int size, int amt, int kind)
if (sk_has_memory_pressure(sk)) {
u64 alloc;
- if (!sk_under_memory_pressure(sk))
+ /* The following 'average' heuristic is within the
+ * scope of global accounting, so it only makes
+ * sense for global memory pressure.
+ */
+ if (!sk_under_global_memory_pressure(sk))
return 1;
/* Try to be fair among all the sockets under global
--
2.37.3
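The behavioral difference introduced by the last hunk can be summarized with
a small user-space model. This is a sketch under simplifying assumptions, not
the kernel helpers themselves: struct toy_pressure and the toy_* functions are
hypothetical, and the combined check is reduced here to "memcg pressure or
global pressure".

```c
#include <stdbool.h>

/* Hypothetical snapshot of the two pressure signals seen by one socket. */
struct toy_pressure {
	bool global;	/* protocol-wide memory pressure */
	bool memcg;	/* pressure in the socket's memory cgroup */
};

/* Simplified model of the combined check: either source counts. */
static bool toy_under_memory_pressure(struct toy_pressure p)
{
	return p.memcg || p.global;
}

/* Simplified model of the global-only check. */
static bool toy_under_global_memory_pressure(struct toy_pressure p)
{
	return p.global;
}

/* Before the patch: grant early only if neither source reports pressure. */
static bool toy_granted_early_old(struct toy_pressure p)
{
	return !toy_under_memory_pressure(p);
}

/* After the patch: memcg pressure alone no longer blocks the early grant. */
static bool toy_granted_early_new(struct toy_pressure p)
{
	return !toy_under_global_memory_pressure(p);
}
```

The two variants differ only when the memcg is under pressure and the
protocol is not: the old check falls through to the average heuristic, while
the new check grants the allocation immediately.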
* Re: [PATCH net v3 3/3] sock: Ignore memcg pressure heuristics when raising allocated
2023-10-19 12:00 ` [PATCH net v3 3/3] sock: Ignore memcg pressure heuristics when raising allocated Abel Wu
@ 2023-10-23 7:49 ` Simon Horman
2023-10-24 7:08 ` Paolo Abeni
1 sibling, 0 replies; 10+ messages in thread
From: Simon Horman @ 2023-10-23 7:49 UTC (permalink / raw)
To: Abel Wu
Cc: David S . Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
Shakeel Butt, netdev, linux-kernel
On Thu, Oct 19, 2023 at 08:00:26PM +0800, Abel Wu wrote:
> Before sockets became aware of net-memcg's memory pressure since
> commit e1aab161e013 ("socket: initial cgroup code."), the memory
> usage would be granted to raise if below average even when under
> protocol's pressure. This provides fairness among the sockets of
> same protocol.
>
> That commit changes this because the heuristic will also be
> effective when only memcg is under pressure which makes no sense.
> So revert that behavior.
>
> After reverting, __sk_mem_raise_allocated() no longer considers
> memcg's pressure. As memcgs are isolated from each other w.r.t.
> memory accounting, consuming one's budget won't affect others.
> So except the places where buffer sizes are needed to be tuned,
> allow workloads to use the memory they are provisioned.
>
> Signed-off-by: Abel Wu <wuyun.abel@bytedance.com>
> Acked-by: Shakeel Butt <shakeelb@google.com>
> Acked-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Simon Horman <horms@kernel.org>
* Re: [PATCH net v3 3/3] sock: Ignore memcg pressure heuristics when raising allocated
2023-10-19 12:00 ` [PATCH net v3 3/3] sock: Ignore memcg pressure heuristics when raising allocated Abel Wu
2023-10-23 7:49 ` Simon Horman
@ 2023-10-24 7:08 ` Paolo Abeni
2023-10-24 7:35 ` Shakeel Butt
2023-10-24 8:21 ` Abel Wu
1 sibling, 2 replies; 10+ messages in thread
From: Paolo Abeni @ 2023-10-24 7:08 UTC (permalink / raw)
To: Abel Wu, David S . Miller, Eric Dumazet, Jakub Kicinski,
Shakeel Butt
Cc: netdev, linux-kernel
On Thu, 2023-10-19 at 20:00 +0800, Abel Wu wrote:
> Before sockets became aware of net-memcg's memory pressure since
> commit e1aab161e013 ("socket: initial cgroup code."), the memory
> usage would be granted to raise if below average even when under
> protocol's pressure. This provides fairness among the sockets of
> same protocol.
>
> That commit changes this because the heuristic will also be
> effective when only memcg is under pressure which makes no sense.
> So revert that behavior.
>
> After reverting, __sk_mem_raise_allocated() no longer considers
> memcg's pressure. As memcgs are isolated from each other w.r.t.
> memory accounting, consuming one's budget won't affect others.
> So except the places where buffer sizes are needed to be tuned,
> allow workloads to use the memory they are provisioned.
>
> Signed-off-by: Abel Wu <wuyun.abel@bytedance.com>
> Acked-by: Shakeel Butt <shakeelb@google.com>
> Acked-by: Paolo Abeni <pabeni@redhat.com>
It's totally not clear to me why you changed the target tree from net-
next to net ?!? This is net-next material, I asked to strip the fixes
tag exactly for that reason.
Since there is agreement on this series and we are late in the cycle, I
would avoid a re-post (we can apply the series to net-next anyway) but
any clarification on the target tree change will be appreciated,
thanks!
Paolo
* Re: [PATCH net v3 3/3] sock: Ignore memcg pressure heuristics when raising allocated
2023-10-24 7:08 ` Paolo Abeni
@ 2023-10-24 7:35 ` Shakeel Butt
2023-10-24 8:21 ` Abel Wu
1 sibling, 0 replies; 10+ messages in thread
From: Shakeel Butt @ 2023-10-24 7:35 UTC (permalink / raw)
To: Paolo Abeni
Cc: Abel Wu, David S . Miller, Eric Dumazet, Jakub Kicinski, netdev,
linux-kernel
On Tue, Oct 24, 2023 at 12:08 AM Paolo Abeni <pabeni@redhat.com> wrote:
>
> On Thu, 2023-10-19 at 20:00 +0800, Abel Wu wrote:
> > Before sockets became aware of net-memcg's memory pressure since
> > commit e1aab161e013 ("socket: initial cgroup code."), the memory
> > usage would be granted to raise if below average even when under
> > protocol's pressure. This provides fairness among the sockets of
> > same protocol.
> >
> > That commit changes this because the heuristic will also be
> > effective when only memcg is under pressure which makes no sense.
> > So revert that behavior.
> >
> > After reverting, __sk_mem_raise_allocated() no longer considers
> > memcg's pressure. As memcgs are isolated from each other w.r.t.
> > memory accounting, consuming one's budget won't affect others.
> > So except the places where buffer sizes are needed to be tuned,
> > allow workloads to use the memory they are provisioned.
> >
> > Signed-off-by: Abel Wu <wuyun.abel@bytedance.com>
> > Acked-by: Shakeel Butt <shakeelb@google.com>
> > Acked-by: Paolo Abeni <pabeni@redhat.com>
>
> It's totally not clear to me why you changed the target tree from net-
> next to net ?!? This is net-next material, I asked to strip the fixes
> tag exactly for that reason.
>
> Since there is agreement on this series and we are late in the cycle, I
> would avoid a re-post (we can apply the series to net-next anyway) but
> any clarification on the target tree change will be appreciated,
> thanks!
>
I didn't even notice the change in the target tree. I would say let's
keep this for net-next as there are no urgent fixes here.
* Re: Re: [PATCH net v3 3/3] sock: Ignore memcg pressure heuristics when raising allocated
2023-10-24 7:08 ` Paolo Abeni
2023-10-24 7:35 ` Shakeel Butt
@ 2023-10-24 8:21 ` Abel Wu
1 sibling, 0 replies; 10+ messages in thread
From: Abel Wu @ 2023-10-24 8:21 UTC (permalink / raw)
To: Paolo Abeni, David S . Miller, Eric Dumazet, Jakub Kicinski,
Shakeel Butt
Cc: netdev, linux-kernel
On 10/24/23 3:08 PM, Paolo Abeni Wrote:
> On Thu, 2023-10-19 at 20:00 +0800, Abel Wu wrote:
>> Before sockets became aware of net-memcg's memory pressure since
>> commit e1aab161e013 ("socket: initial cgroup code."), the memory
>> usage would be granted to raise if below average even when under
>> protocol's pressure. This provides fairness among the sockets of
>> same protocol.
>>
>> That commit changes this because the heuristic will also be
>> effective when only memcg is under pressure which makes no sense.
>> So revert that behavior.
>>
>> After reverting, __sk_mem_raise_allocated() no longer considers
>> memcg's pressure. As memcgs are isolated from each other w.r.t.
>> memory accounting, consuming one's budget won't affect others.
>> So except the places where buffer sizes are needed to be tuned,
>> allow workloads to use the memory they are provisioned.
>>
>> Signed-off-by: Abel Wu <wuyun.abel@bytedance.com>
>> Acked-by: Shakeel Butt <shakeelb@google.com>
>> Acked-by: Paolo Abeni <pabeni@redhat.com>
>
> It's totally not clear to me why you changed the target tree from net-
> next to net ?!? This is net-next material, I asked to strip the fixes
> tag exactly for that reason.
Sorry, I misunderstood your suggestion.
>
> Since there is agreement on this series and we are late in the cycle, I
> would avoid a re-post (we can apply the series to net-next anyway) but
> any clarification on the target tree change will be appreciated,
> thanks!
Please apply to net-next.
Thanks!
Abel
* Re: [PATCH net v3 1/3] sock: Code cleanup on __sk_mem_raise_allocated()
2023-10-19 12:00 [PATCH net v3 1/3] sock: Code cleanup on __sk_mem_raise_allocated() Abel Wu
2023-10-19 12:00 ` [PATCH net v3 2/3] sock: Doc behaviors for pressure heurisitics Abel Wu
2023-10-19 12:00 ` [PATCH net v3 3/3] sock: Ignore memcg pressure heuristics when raising allocated Abel Wu
@ 2023-10-23 7:49 ` Simon Horman
2023-10-24 8:50 ` patchwork-bot+netdevbpf
3 siblings, 0 replies; 10+ messages in thread
From: Simon Horman @ 2023-10-23 7:49 UTC (permalink / raw)
To: Abel Wu
Cc: David S . Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
Shakeel Butt, netdev, linux-kernel
On Thu, Oct 19, 2023 at 08:00:24PM +0800, Abel Wu wrote:
> Code cleanup for both better simplicity and readability.
> No functional change intended.
>
> Signed-off-by: Abel Wu <wuyun.abel@bytedance.com>
> Acked-by: Shakeel Butt <shakeelb@google.com>
Reviewed-by: Simon Horman <horms@kernel.org>
* Re: [PATCH net v3 1/3] sock: Code cleanup on __sk_mem_raise_allocated()
2023-10-19 12:00 [PATCH net v3 1/3] sock: Code cleanup on __sk_mem_raise_allocated() Abel Wu
` (2 preceding siblings ...)
2023-10-23 7:49 ` [PATCH net v3 1/3] sock: Code cleanup on __sk_mem_raise_allocated() Simon Horman
@ 2023-10-24 8:50 ` patchwork-bot+netdevbpf
3 siblings, 0 replies; 10+ messages in thread
From: patchwork-bot+netdevbpf @ 2023-10-24 8:50 UTC (permalink / raw)
To: Abel Wu; +Cc: davem, edumazet, kuba, pabeni, shakeelb, netdev, linux-kernel
Hello:
This series was applied to netdev/net-next.git (main)
by Paolo Abeni <pabeni@redhat.com>:
On Thu, 19 Oct 2023 20:00:24 +0800 you wrote:
> Code cleanup for both better simplicity and readability.
> No functional change intended.
>
> Signed-off-by: Abel Wu <wuyun.abel@bytedance.com>
> Acked-by: Shakeel Butt <shakeelb@google.com>
> ---
> net/core/sock.c | 22 ++++++++++++----------
> 1 file changed, 12 insertions(+), 10 deletions(-)
Here is the summary with links:
- [net,v3,1/3] sock: Code cleanup on __sk_mem_raise_allocated()
https://git.kernel.org/netdev/net-next/c/2def8ff3fdb6
- [net,v3,2/3] sock: Doc behaviors for pressure heurisitics
https://git.kernel.org/netdev/net-next/c/2e12072c67b5
- [net,v3,3/3] sock: Ignore memcg pressure heuristics when raising allocated
https://git.kernel.org/netdev/net-next/c/66e6369e312d
You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html