From: Cathy Zhang <cathy.zhang@intel.com>
To: edumazet@google.com, davem@davemloft.net, kuba@kernel.org,
pabeni@redhat.com
Cc: jesse.brandeburg@intel.com, suresh.srinivas@intel.com,
tim.c.chen@intel.com, lizhen.you@intel.com,
cathy.zhang@intel.com, eric.dumazet@gmail.com,
netdev@vger.kernel.org
Subject: [PATCH net-next 2/2] net: Add sysctl_reclaim_threshold
Date: Sun, 7 May 2023 19:08:01 -0700
Message-ID: <20230508020801.10702-3-cathy.zhang@intel.com>
In-Reply-To: <20230508020801.10702-1-cathy.zhang@intel.com>
Add a new ABI, /proc/sys/net/core/reclaim_threshold, which sets how much
forward-allocated memory a socket may keep reserved before
sk_mem_uncharge() starts reclaiming. A small value keeps
sk->sk_forward_alloc as small as possible when the system is under memory
pressure; a larger value avoids memcg charge overhead and improves
performance when the system is not under memory pressure.

The original per-socket reclaim threshold for reserved memory is 2MB; it
is kept as the maximum value. The default value is 64KB, which is closer
to the maximum size of an sk_buff.

Run the following command as root to change the threshold from its default:

    echo 16384 > /proc/sys/net/core/reclaim_threshold

Signed-off-by: Cathy Zhang <cathy.zhang@intel.com>
Signed-off-by: Lizhen You <lizhen.you@intel.com>
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Reviewed-by: Tim Chen <tim.c.chen@linux.intel.com>
Reviewed-by: Suresh Srinivas <suresh.srinivas@intel.com>
---
Documentation/admin-guide/sysctl/net.rst | 12 ++++++++++++
include/net/sock.h | 13 +++++++++++--
net/core/sysctl_net_core.c | 14 ++++++++++++++
3 files changed, 37 insertions(+), 2 deletions(-)
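
Editor's sketch (not part of the patch): the threshold check described in
the commit message can be modelled in user space as below. It only
reproduces the comparison and subtraction from the sk_mem_uncharge() hunk.
How "reclaimable" is computed and what __sk_mem_reclaim() does with the
amount are outside this patch and are not modelled. The function name
reclaim_excess() is hypothetical.

```c
#include <stddef.h>

/*
 * Hypothetical user-space model of the check in sk_mem_uncharge().
 * Returns the number of bytes that would be passed to
 * __sk_mem_reclaim(), or 0 when nothing exceeds the threshold.
 */
static int reclaim_excess(int reclaimable, int reclaim_threshold)
{
	/* Only the part above the threshold is handed back. */
	if (reclaimable > reclaim_threshold)
		return reclaimable - reclaim_threshold;

	/* At or below the threshold: keep the reserve untouched. */
	return 0;
}
```

With the proposed 64KB default, a socket holding 100KB of reclaimable
memory would give back 36KB and keep 64KB reserved. With a threshold equal
to or above the held amount, nothing is reclaimed.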
diff --git a/Documentation/admin-guide/sysctl/net.rst b/Documentation/admin-guide/sysctl/net.rst
index 466c560b0c30..2981278af3d9 100644
--- a/Documentation/admin-guide/sysctl/net.rst
+++ b/Documentation/admin-guide/sysctl/net.rst
@@ -413,6 +413,18 @@ historical importance.
Default: 0
+reclaim_threshold
+------------------------
+
+The threshold determines when memory can be reclaimed during the lifetime of
+a TCP connection. If the per-socket forward-allocated memory exceeds the
+threshold, the part above this value is reclaimed. This keeps the per-socket
+forward-allocated memory at a reasonable size, which improves performance
+while keeping the system away from memory pressure. The threshold can be set
+to any value in the range [4096, 2097152] bytes.
+
+Default: 64 KB
+
2. /proc/sys/net/unix - Parameters for Unix domain sockets
----------------------------------------------------------
diff --git a/include/net/sock.h b/include/net/sock.h
index 6d2960479a80..3ca4c03a23ba 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -89,6 +89,10 @@ void SOCK_DEBUG(const struct sock *sk, const char *msg, ...)
}
#endif
+#if IS_ENABLED(CONFIG_SYSCTL)
+extern unsigned int sysctl_reclaim_threshold __read_mostly;
+#endif
+
/* This is the per-socket lock. The spinlock provides a synchronization
* between user contexts and software interrupt processing, whereas the
* mini-semaphore synchronizes multiple users amongst themselves.
@@ -1663,6 +1667,11 @@ static inline void sk_mem_charge(struct sock *sk, int size)
static inline void sk_mem_uncharge(struct sock *sk, int size)
{
int reclaimable;
+#if IS_ENABLED(CONFIG_SYSCTL)
+ int reclaim_threshold = READ_ONCE(sysctl_reclaim_threshold);
+#else
+ int reclaim_threshold = SK_RECLAIM_THRESHOLD;
+#endif
if (!sk_has_account(sk))
return;
@@ -1680,8 +1689,8 @@ static inline void sk_mem_uncharge(struct sock *sk, int size)
* In order to avoid the above issue, it's necessary to keep
* sk->sk_forward_alloc with a proper size while doing reclaim.
*/
- if (reclaimable > SK_RECLAIM_THRESHOLD) {
- reclaimable -= SK_RECLAIM_THRESHOLD;
+ if (reclaimable > reclaim_threshold) {
+ reclaimable -= reclaim_threshold;
__sk_mem_reclaim(sk, reclaimable);
}
}
diff --git a/net/core/sysctl_net_core.c b/net/core/sysctl_net_core.c
index 782273bb93c2..82aee37769ba 100644
--- a/net/core/sysctl_net_core.c
+++ b/net/core/sysctl_net_core.c
@@ -46,6 +46,11 @@ EXPORT_SYMBOL(sysctl_fb_tunnels_only_for_init_net);
int sysctl_devconf_inherit_init_net __read_mostly;
EXPORT_SYMBOL(sysctl_devconf_inherit_init_net);
+static unsigned int min_reclaim = PAGE_SIZE;
+static unsigned int max_reclaim = 2 * 1024 * 1024;
+unsigned int sysctl_reclaim_threshold __read_mostly = 64 * 1024;
+EXPORT_SYMBOL(sysctl_reclaim_threshold);
+
#if IS_ENABLED(CONFIG_NET_FLOW_LIMIT) || IS_ENABLED(CONFIG_RPS)
static void dump_cpumask(void *buffer, size_t *lenp, loff_t *ppos,
struct cpumask *mask)
@@ -407,6 +412,15 @@ static struct ctl_table net_core_table[] = {
.proc_handler = proc_dointvec_minmax,
.extra1 = &min_rcvbuf,
},
+ {
+ .procname = "reclaim_threshold",
+ .data = &sysctl_reclaim_threshold,
+ .maxlen = sizeof(int),
+ .mode = 0644,
+ .proc_handler = proc_douintvec_minmax,
+ .extra1 = &min_reclaim,
+ .extra2 = &max_reclaim,
+ },
{
.procname = "dev_weight",
.data = &weight_p,
--
2.34.1
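
Editor's sketch (not part of the patch): the sysctl table entry above
bounds writes with extra1/extra2. The range check can be modelled in user
space as below. The bounds are assumptions taken from the patch: 4096
stands in for PAGE_SIZE (the value quoted in the documentation hunk, which
differs on architectures with other page sizes), and the maximum is 2MB.
As far as I understand, proc_douintvec_minmax rejects an out-of-range
write rather than clamping it, which is what the boolean result here
represents. The names MODEL_MIN_RECLAIM, MODEL_MAX_RECLAIM and
reclaim_threshold_valid() are hypothetical.

```c
#include <stdbool.h>

/* Assumed bounds, mirroring min_reclaim and max_reclaim in the patch. */
#define MODEL_MIN_RECLAIM 4096u			/* PAGE_SIZE assumed 4096 */
#define MODEL_MAX_RECLAIM (2u * 1024u * 1024u)	/* 2MB */

/*
 * Hypothetical model of the range check applied to a value written to
 * reclaim_threshold: true if the write would be accepted.
 */
static bool reclaim_threshold_valid(unsigned int value)
{
	return value >= MODEL_MIN_RECLAIM && value <= MODEL_MAX_RECLAIM;
}
```

Under these assumptions the value 16384 from the commit message is
accepted, as are both endpoints of the range.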