* [PATCH net-next v2 1/3] sock: Code cleanup on __sk_mem_raise_allocated()
@ 2023-10-16 13:28 Abel Wu
2023-10-16 13:28 ` [PATCH net-next v2 2/3] sock: Doc behaviors for pressure heurisitics Abel Wu
2023-10-16 13:28 ` [PATCH net-next v2 3/3] sock: Fix improper heuristic on raising memory Abel Wu
0 siblings, 2 replies; 10+ messages in thread
From: Abel Wu @ 2023-10-16 13:28 UTC (permalink / raw)
To: David S . Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
Shakeel Butt
Cc: netdev, linux-kernel, Abel Wu
Code cleanup for both better simplicity and readability.
No functional change intended.
Signed-off-by: Abel Wu <wuyun.abel@bytedance.com>
Acked-by: Shakeel Butt <shakeelb@google.com>
---
net/core/sock.c | 22 ++++++++++++----------
1 file changed, 12 insertions(+), 10 deletions(-)
diff --git a/net/core/sock.c b/net/core/sock.c
index 290165954379..43842520db86 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -3039,17 +3039,19 @@ EXPORT_SYMBOL(sk_wait_data);
*/
int __sk_mem_raise_allocated(struct sock *sk, int size, int amt, int kind)
{
- bool memcg_charge = mem_cgroup_sockets_enabled && sk->sk_memcg;
+ struct mem_cgroup *memcg = mem_cgroup_sockets_enabled ? sk->sk_memcg : NULL;
struct proto *prot = sk->sk_prot;
- bool charged = true;
+ bool charged = false;
long allocated;
sk_memory_allocated_add(sk, amt);
allocated = sk_memory_allocated(sk);
- if (memcg_charge &&
- !(charged = mem_cgroup_charge_skmem(sk->sk_memcg, amt,
- gfp_memcg_charge())))
- goto suppress_allocation;
+
+ if (memcg) {
+ if (!mem_cgroup_charge_skmem(memcg, amt, gfp_memcg_charge()))
+ goto suppress_allocation;
+ charged = true;
+ }
/* Under limit. */
if (allocated <= sk_prot_mem_limits(sk, 0)) {
@@ -3104,8 +3106,8 @@ int __sk_mem_raise_allocated(struct sock *sk, int size, int amt, int kind)
*/
if (sk->sk_wmem_queued + size >= sk->sk_sndbuf) {
/* Force charge with __GFP_NOFAIL */
- if (memcg_charge && !charged) {
- mem_cgroup_charge_skmem(sk->sk_memcg, amt,
+ if (memcg && !charged) {
+ mem_cgroup_charge_skmem(memcg, amt,
gfp_memcg_charge() | __GFP_NOFAIL);
}
return 1;
@@ -3117,8 +3119,8 @@ int __sk_mem_raise_allocated(struct sock *sk, int size, int amt, int kind)
sk_memory_allocated_sub(sk, amt);
- if (memcg_charge && charged)
- mem_cgroup_uncharge_skmem(sk->sk_memcg, amt);
+ if (charged)
+ mem_cgroup_uncharge_skmem(memcg, amt);
return 0;
}
--
2.37.3
^ permalink raw reply related [flat|nested] 10+ messages in thread
* [PATCH net-next v2 2/3] sock: Doc behaviors for pressure heurisitics
2023-10-16 13:28 [PATCH net-next v2 1/3] sock: Code cleanup on __sk_mem_raise_allocated() Abel Wu
@ 2023-10-16 13:28 ` Abel Wu
2023-10-16 15:51 ` Shakeel Butt
2023-10-16 13:28 ` [PATCH net-next v2 3/3] sock: Fix improper heuristic on raising memory Abel Wu
1 sibling, 1 reply; 10+ messages in thread
From: Abel Wu @ 2023-10-16 13:28 UTC (permalink / raw)
To: David S . Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
Shakeel Butt
Cc: netdev, linux-kernel, Abel Wu
There are now two accounting infrastructures for skmem, while the
heuristics in __sk_mem_raise_allocated() were actually introduced
before memcg was born.
Add some comments to clarify whether they can be applied to both
infrastructures or not.
Suggested-by: Shakeel Butt <shakeelb@google.com>
Signed-off-by: Abel Wu <wuyun.abel@bytedance.com>
---
net/core/sock.c | 14 +++++++++++++-
1 file changed, 13 insertions(+), 1 deletion(-)
diff --git a/net/core/sock.c b/net/core/sock.c
index 43842520db86..9f969e3c2ddf 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -3067,7 +3067,14 @@ int __sk_mem_raise_allocated(struct sock *sk, int size, int amt, int kind)
if (allocated > sk_prot_mem_limits(sk, 2))
goto suppress_allocation;
- /* guarantee minimum buffer size under pressure */
+ /* Guarantee minimum buffer size under pressure (either global
+ * or memcg) to make sure features described in RFC 7323 (TCP
+ * Extensions for High Performance) work properly.
+ *
+ * This rule does NOT stand when exceeds global or memcg's hard
+ * limit, or else a DoS attack can be taken place by spawning
+ * lots of sockets whose usage are under minimum buffer size.
+ */
if (kind == SK_MEM_RECV) {
if (atomic_read(&sk->sk_rmem_alloc) < sk_get_rmem0(sk, prot))
return 1;
@@ -3088,6 +3095,11 @@ int __sk_mem_raise_allocated(struct sock *sk, int size, int amt, int kind)
if (!sk_under_memory_pressure(sk))
return 1;
+
+ /* Try to be fair among all the sockets under global
+ * pressure by allowing the ones that below average
+ * usage to raise.
+ */
alloc = sk_sockets_allocated_read_positive(sk);
if (sk_prot_mem_limits(sk, 2) > alloc *
sk_mem_pages(sk->sk_wmem_queued +
--
2.37.3
^ permalink raw reply related [flat|nested] 10+ messages in thread
* [PATCH net-next v2 3/3] sock: Fix improper heuristic on raising memory
2023-10-16 13:28 [PATCH net-next v2 1/3] sock: Code cleanup on __sk_mem_raise_allocated() Abel Wu
2023-10-16 13:28 ` [PATCH net-next v2 2/3] sock: Doc behaviors for pressure heurisitics Abel Wu
@ 2023-10-16 13:28 ` Abel Wu
2023-10-16 15:52 ` Shakeel Butt
` (2 more replies)
1 sibling, 3 replies; 10+ messages in thread
From: Abel Wu @ 2023-10-16 13:28 UTC (permalink / raw)
To: David S . Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
Shakeel Butt
Cc: netdev, linux-kernel, Abel Wu
Before sockets became aware of net-memcg's memory pressure since
commit e1aab161e013 ("socket: initial cgroup code."), the memory
usage would be granted to raise if below average even when under
protocol's pressure. This provides fairness among the sockets of
same protocol.
That commit changes this because the heuristic will also be
effective when only memcg is under pressure which makes no sense.
Fix this by reverting to the behavior before that commit.
After this fix, __sk_mem_raise_allocated() no longer considers
memcg's pressure. As memcgs are isolated from each other w.r.t.
memory accounting, consuming one's budget won't affect others.
So except the places where buffer sizes are needed to be tuned,
allow workloads to use the memory they are provisioned.
Fixes: e1aab161e013 ("socket: initial cgroup code.")
Signed-off-by: Abel Wu <wuyun.abel@bytedance.com>
---
v2:
- Ignore memcg pressure when raising memory allocated.
---
net/core/sock.c | 14 ++++++++++++--
1 file changed, 12 insertions(+), 2 deletions(-)
diff --git a/net/core/sock.c b/net/core/sock.c
index 9f969e3c2ddf..1d28e3e87970 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -3035,7 +3035,13 @@ EXPORT_SYMBOL(sk_wait_data);
* @amt: pages to allocate
* @kind: allocation type
*
- * Similar to __sk_mem_schedule(), but does not update sk_forward_alloc
+ * Similar to __sk_mem_schedule(), but does not update sk_forward_alloc.
+ *
+ * Unlike the globally shared limits among the sockets under same protocol,
+ * consuming the budget of a memcg won't have direct effect on other ones.
+ * So be optimistic about memcg's tolerance, and leave the callers to decide
+ * whether or not to raise allocated through sk_under_memory_pressure() or
+ * its variants.
*/
int __sk_mem_raise_allocated(struct sock *sk, int size, int amt, int kind)
{
@@ -3093,7 +3099,11 @@ int __sk_mem_raise_allocated(struct sock *sk, int size, int amt, int kind)
if (sk_has_memory_pressure(sk)) {
u64 alloc;
- if (!sk_under_memory_pressure(sk))
+ /* The following 'average' heuristic is within the
+ * scope of global accounting, so it only makes
+ * sense for global memory pressure.
+ */
+ if (!sk_under_global_memory_pressure(sk))
return 1;
/* Try to be fair among all the sockets under global
--
2.37.3
^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [PATCH net-next v2 2/3] sock: Doc behaviors for pressure heurisitics
2023-10-16 13:28 ` [PATCH net-next v2 2/3] sock: Doc behaviors for pressure heurisitics Abel Wu
@ 2023-10-16 15:51 ` Shakeel Butt
0 siblings, 0 replies; 10+ messages in thread
From: Shakeel Butt @ 2023-10-16 15:51 UTC (permalink / raw)
To: Abel Wu
Cc: David S . Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
netdev, linux-kernel
On Mon, Oct 16, 2023 at 6:28 AM Abel Wu <wuyun.abel@bytedance.com> wrote:
>
> There are now two accounting infrastructures for skmem, while the
> heuristics in __sk_mem_raise_allocated() were actually introduced
> before memcg was born.
>
> Add some comments to clarify whether they can be applied to both
> infrastructures or not.
>
> Suggested-by: Shakeel Butt <shakeelb@google.com>
> Signed-off-by: Abel Wu <wuyun.abel@bytedance.com>
Acked-by: Shakeel Butt <shakeelb@google.com>
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH net-next v2 3/3] sock: Fix improper heuristic on raising memory
2023-10-16 13:28 ` [PATCH net-next v2 3/3] sock: Fix improper heuristic on raising memory Abel Wu
@ 2023-10-16 15:52 ` Shakeel Butt
2023-10-19 11:21 ` Abel Wu
2023-10-19 8:02 ` Paolo Abeni
2023-10-19 8:53 ` Paolo Abeni
2 siblings, 1 reply; 10+ messages in thread
From: Shakeel Butt @ 2023-10-16 15:52 UTC (permalink / raw)
To: Abel Wu
Cc: David S . Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
netdev, linux-kernel
On Mon, Oct 16, 2023 at 6:28 AM Abel Wu <wuyun.abel@bytedance.com> wrote:
>
> Before sockets became aware of net-memcg's memory pressure since
> commit e1aab161e013 ("socket: initial cgroup code."), the memory
> usage would be granted to raise if below average even when under
> protocol's pressure. This provides fairness among the sockets of
> same protocol.
>
> That commit changes this because the heuristic will also be
> effective when only memcg is under pressure which makes no sense.
> Fix this by reverting to the behavior before that commit.
>
> After this fix, __sk_mem_raise_allocated() no longer considers
> memcg's pressure. As memcgs are isolated from each other w.r.t.
> memory accounting, consuming one's budget won't affect others.
> So except the places where buffer sizes are needed to be tuned,
> allow workloads to use the memory they are provisioned.
>
> Fixes: e1aab161e013 ("socket: initial cgroup code.")
> Signed-off-by: Abel Wu <wuyun.abel@bytedance.com>
Acked-by: Shakeel Butt <shakeelb@google.com>
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH net-next v2 3/3] sock: Fix improper heuristic on raising memory
2023-10-16 13:28 ` [PATCH net-next v2 3/3] sock: Fix improper heuristic on raising memory Abel Wu
2023-10-16 15:52 ` Shakeel Butt
@ 2023-10-19 8:02 ` Paolo Abeni
2023-10-19 8:53 ` Paolo Abeni
2 siblings, 0 replies; 10+ messages in thread
From: Paolo Abeni @ 2023-10-19 8:02 UTC (permalink / raw)
To: Abel Wu, David S . Miller, Eric Dumazet, Jakub Kicinski,
Shakeel Butt
Cc: netdev, linux-kernel
On Mon, 2023-10-16 at 21:28 +0800, Abel Wu wrote:
> Before sockets became aware of net-memcg's memory pressure since
> commit e1aab161e013 ("socket: initial cgroup code."), the memory
> usage would be granted to raise if below average even when under
> protocol's pressure. This provides fairness among the sockets of
> same protocol.
>
> That commit changes this because the heuristic will also be
> effective when only memcg is under pressure which makes no sense.
> Fix this by reverting to the behavior before that commit.
>
> After this fix, __sk_mem_raise_allocated() no longer considers
> memcg's pressure. As memcgs are isolated from each other w.r.t.
> memory accounting, consuming one's budget won't affect others.
> So except the places where buffer sizes are needed to be tuned,
> allow workloads to use the memory they are provisioned.
>
> Fixes: e1aab161e013 ("socket: initial cgroup code.")
> Signed-off-by: Abel Wu <wuyun.abel@bytedance.com>
> ---
> v2:
> - Ignore memcg pressure when raising memory allocated.
> ---
> net/core/sock.c | 14 ++++++++++++--
> 1 file changed, 12 insertions(+), 2 deletions(-)
>
> diff --git a/net/core/sock.c b/net/core/sock.c
> index 9f969e3c2ddf..1d28e3e87970 100644
> --- a/net/core/sock.c
> +++ b/net/core/sock.c
> @@ -3035,7 +3035,13 @@ EXPORT_SYMBOL(sk_wait_data);
> * @amt: pages to allocate
> * @kind: allocation type
> *
> - * Similar to __sk_mem_schedule(), but does not update sk_forward_alloc
> + * Similar to __sk_mem_schedule(), but does not update sk_forward_alloc.
> + *
> + * Unlike the globally shared limits among the sockets under same protocol,
> + * consuming the budget of a memcg won't have direct effect on other ones.
> + * So be optimistic about memcg's tolerance, and leave the callers to decide
> + * whether or not to raise allocated through sk_under_memory_pressure() or
> + * its variants.
> */
> int __sk_mem_raise_allocated(struct sock *sk, int size, int amt, int kind)
> {
> @@ -3093,7 +3099,11 @@ int __sk_mem_raise_allocated(struct sock *sk, int size, int amt, int kind)
> if (sk_has_memory_pressure(sk)) {
> u64 alloc;
>
> - if (!sk_under_memory_pressure(sk))
> + /* The following 'average' heuristic is within the
> + * scope of global accounting, so it only makes
> + * sense for global memory pressure.
> + */
> + if (!sk_under_global_memory_pressure(sk))
> return 1;
Since the whole logic is fairly non trivial I'd like to explicitly note
(for my own future memory) that I think this is the correct approach.
The memcg granted the current allocation via the
mem_cgroup_charge_skmem() call above, the heuristic to eventually
suppress the allocation should be outside the memcg scope.
LGTM, thanks!
Paolo
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH net-next v2 3/3] sock: Fix improper heuristic on raising memory
2023-10-16 13:28 ` [PATCH net-next v2 3/3] sock: Fix improper heuristic on raising memory Abel Wu
2023-10-16 15:52 ` Shakeel Butt
2023-10-19 8:02 ` Paolo Abeni
@ 2023-10-19 8:53 ` Paolo Abeni
2023-10-19 11:23 ` Abel Wu
2 siblings, 1 reply; 10+ messages in thread
From: Paolo Abeni @ 2023-10-19 8:53 UTC (permalink / raw)
To: Abel Wu, David S . Miller, Eric Dumazet, Jakub Kicinski,
Shakeel Butt
Cc: netdev, linux-kernel
On Mon, 2023-10-16 at 21:28 +0800, Abel Wu wrote:
> Before sockets became aware of net-memcg's memory pressure since
> commit e1aab161e013 ("socket: initial cgroup code."), the memory
> usage would be granted to raise if below average even when under
> protocol's pressure. This provides fairness among the sockets of
> same protocol.
>
> That commit changes this because the heuristic will also be
> effective when only memcg is under pressure which makes no sense.
> Fix this by reverting to the behavior before that commit.
>
> After this fix, __sk_mem_raise_allocated() no longer considers
> memcg's pressure. As memcgs are isolated from each other w.r.t.
> memory accounting, consuming one's budget won't affect others.
> So except the places where buffer sizes are needed to be tuned,
> allow workloads to use the memory they are provisioned.
>
> Fixes: e1aab161e013 ("socket: initial cgroup code.")
I think it's better to drop this fixes tag. This is a functional change
and with such tag on at this point of the cycle, will land soon into
every stable tree. That feels not appropriate.
Please repost without such tag, thanks!
You can send the change to stables trees later, if needed.
Paolo
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: Re: [PATCH net-next v2 3/3] sock: Fix improper heuristic on raising memory
2023-10-16 15:52 ` Shakeel Butt
@ 2023-10-19 11:21 ` Abel Wu
0 siblings, 0 replies; 10+ messages in thread
From: Abel Wu @ 2023-10-19 11:21 UTC (permalink / raw)
To: Shakeel Butt
Cc: David S . Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
netdev, linux-kernel
On 10/16/23 11:52 PM, Shakeel Butt Wrote:
> On Mon, Oct 16, 2023 at 6:28 AM Abel Wu <wuyun.abel@bytedance.com> wrote:
>>
>> Before sockets became aware of net-memcg's memory pressure since
>> commit e1aab161e013 ("socket: initial cgroup code."), the memory
>> usage would be granted to raise if below average even when under
>> protocol's pressure. This provides fairness among the sockets of
>> same protocol.
>>
>> That commit changes this because the heuristic will also be
>> effective when only memcg is under pressure which makes no sense.
>> Fix this by reverting to the behavior before that commit.
>>
>> After this fix, __sk_mem_raise_allocated() no longer considers
>> memcg's pressure. As memcgs are isolated from each other w.r.t.
>> memory accounting, consuming one's budget won't affect others.
>> So except the places where buffer sizes are needed to be tuned,
>> allow workloads to use the memory they are provisioned.
>>
>> Fixes: e1aab161e013 ("socket: initial cgroup code.")
>> Signed-off-by: Abel Wu <wuyun.abel@bytedance.com>
>
> Acked-by: Shakeel Butt <shakeelb@google.com>
Thanks!
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: Re: [PATCH net-next v2 3/3] sock: Fix improper heuristic on raising memory
2023-10-19 8:53 ` Paolo Abeni
@ 2023-10-19 11:23 ` Abel Wu
2023-10-19 11:41 ` Paolo Abeni
0 siblings, 1 reply; 10+ messages in thread
From: Abel Wu @ 2023-10-19 11:23 UTC (permalink / raw)
To: Paolo Abeni, David S . Miller, Eric Dumazet, Jakub Kicinski,
Shakeel Butt
Cc: netdev, linux-kernel
On 10/19/23 4:53 PM, Paolo Abeni Wrote:
> On Mon, 2023-10-16 at 21:28 +0800, Abel Wu wrote:
>> Before sockets became aware of net-memcg's memory pressure since
>> commit e1aab161e013 ("socket: initial cgroup code."), the memory
>> usage would be granted to raise if below average even when under
>> protocol's pressure. This provides fairness among the sockets of
>> same protocol.
>>
>> That commit changes this because the heuristic will also be
>> effective when only memcg is under pressure which makes no sense.
>> Fix this by reverting to the behavior before that commit.
>>
>> After this fix, __sk_mem_raise_allocated() no longer considers
>> memcg's pressure. As memcgs are isolated from each other w.r.t.
>> memory accounting, consuming one's budget won't affect others.
>> So except the places where buffer sizes are needed to be tuned,
>> allow workloads to use the memory they are provisioned.
>>
>> Fixes: e1aab161e013 ("socket: initial cgroup code.")
>
> I think it's better to drop this fixes tag. This is a functional change
> and with such tag on at this point of the cycle, will land soon into
> every stable tree. That feels not appropriate.
>
> Please repost without such tag, thanks!
>
> You can send the change to stables trees later, if needed.
OK. Shall I add a Acked-by tag for you?
Thanks!
Abel
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: Re: [PATCH net-next v2 3/3] sock: Fix improper heuristic on raising memory
2023-10-19 11:23 ` Abel Wu
@ 2023-10-19 11:41 ` Paolo Abeni
0 siblings, 0 replies; 10+ messages in thread
From: Paolo Abeni @ 2023-10-19 11:41 UTC (permalink / raw)
To: Abel Wu, David S . Miller, Eric Dumazet, Jakub Kicinski,
Shakeel Butt
Cc: netdev, linux-kernel
On Thu, 2023-10-19 at 19:23 +0800, Abel Wu wrote:
> On 10/19/23 4:53 PM, Paolo Abeni Wrote:
> > On Mon, 2023-10-16 at 21:28 +0800, Abel Wu wrote:
> > > Before sockets became aware of net-memcg's memory pressure since
> > > commit e1aab161e013 ("socket: initial cgroup code."), the memory
> > > usage would be granted to raise if below average even when under
> > > protocol's pressure. This provides fairness among the sockets of
> > > same protocol.
> > >
> > > That commit changes this because the heuristic will also be
> > > effective when only memcg is under pressure which makes no sense.
> > > Fix this by reverting to the behavior before that commit.
> > >
> > > After this fix, __sk_mem_raise_allocated() no longer considers
> > > memcg's pressure. As memcgs are isolated from each other w.r.t.
> > > memory accounting, consuming one's budget won't affect others.
> > > So except the places where buffer sizes are needed to be tuned,
> > > allow workloads to use the memory they are provisioned.
> > >
> > > Fixes: e1aab161e013 ("socket: initial cgroup code.")
> >
> > I think it's better to drop this fixes tag. This is a functional change
> > and with such tag on at this point of the cycle, will land soon into
> > every stable tree. That feels not appropriate.
> >
> > Please repost without such tag, thanks!
> >
> > You can send the change to stables trees later, if needed.
>
> OK. Shall I add a Acked-by tag for you?
Let's be formal:
Acked-by: Paolo Abeni <pabeni@redhat.com>
/P
^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2023-10-19 11:41 UTC | newest]
Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-10-16 13:28 [PATCH net-next v2 1/3] sock: Code cleanup on __sk_mem_raise_allocated() Abel Wu
2023-10-16 13:28 ` [PATCH net-next v2 2/3] sock: Doc behaviors for pressure heurisitics Abel Wu
2023-10-16 15:51 ` Shakeel Butt
2023-10-16 13:28 ` [PATCH net-next v2 3/3] sock: Fix improper heuristic on raising memory Abel Wu
2023-10-16 15:52 ` Shakeel Butt
2023-10-19 11:21 ` Abel Wu
2023-10-19 8:02 ` Paolo Abeni
2023-10-19 8:53 ` Paolo Abeni
2023-10-19 11:23 ` Abel Wu
2023-10-19 11:41 ` Paolo Abeni
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).