* [PATCH v2] Documentation: sysctl: document net core sysctls
From: Shubham Chakraborty @ 2026-04-09 17:48 UTC (permalink / raw)
To: netdev
Cc: davem, edumazet, kuba, pabeni, horms, kuniyu, corbet, skhan,
linux-doc, linux-kernel, Shubham Chakraborty
Document missing net.core and net.unix sysctl entries in
admin-guide/sysctl/net.rst, and correct wording for defaults
that are derived from PAGE_SIZE, HZ, or CONFIG_MAX_SKB_FRAGS.
Also clarify that the RFS and flow-limit controls are only present
when CONFIG_RPS or CONFIG_NET_FLOW_LIMIT is enabled, and describe
rps_sock_flow_entries the way the handler implements it: non-zero
values are rounded up to the nearest power of two.
Signed-off-by: Shubham Chakraborty <chakrabortyshubham66@gmail.com>
---
Documentation/admin-guide/sysctl/net.rst | 66 +++++++++++++++++++++++-
1 file changed, 64 insertions(+), 2 deletions(-)
diff --git a/Documentation/admin-guide/sysctl/net.rst b/Documentation/admin-guide/sysctl/net.rst
index 3b2ad61995d4..05d301b8752c 100644
--- a/Documentation/admin-guide/sysctl/net.rst
+++ b/Documentation/admin-guide/sysctl/net.rst
@@ -210,7 +210,9 @@ Default: 0 (off)
mem_pcpu_rsv
------------
-Per-cpu reserved forward alloc cache size in page units. Default 1MB per CPU.
+Per-cpu reserved forward alloc cache size in page units.
+
+Default: 1MB per CPU, expressed in page units
bypass_prot_mem
---------------
@@ -238,6 +240,37 @@ rps_default_mask
The default RPS CPU mask used on newly created network devices. An empty
mask means RPS disabled by default.
+rps_sock_flow_entries
+---------------------
+
+The total number of entries in the RPS flow table. This is used by
+RFS (Receive Flow Steering) to track which CPU is currently processing
+a flow in userspace. Non-zero values are rounded up to the nearest
+power of two.
+Available only when ``CONFIG_RPS`` is enabled.
+
+Default: 0
+
+flow_limit_cpu_bitmap
+---------------------
+
+Bitmap of CPUs for which RPS flow limiting is enabled. Flow limiting
+prioritizes small flows during CPU contention by dropping packets
+from large flows slightly ahead of those from small flows.
+Available only when ``CONFIG_NET_FLOW_LIMIT`` is enabled.
+
+Default: 0 (disabled)
+
+flow_limit_table_len
+--------------------
+
+The number of buckets in the flow limit hashtable. This value is
+only consulted when a new table is allocated. Modifying it does
+not update active tables. This value should be a power of two.
+Available only when ``CONFIG_NET_FLOW_LIMIT`` is enabled.
+
+Default: 4096
+
tstamp_allow_data
-----------------
Allow processes to receive tx timestamps looped together with the original
@@ -290,6 +323,8 @@ probed in a round-robin manner. Also, a polling cycle may not exceed
netdev_budget_usecs microseconds, even if netdev_budget has not been
exhausted.
+Default: 300
+
netdev_budget_usecs
---------------------
@@ -297,12 +332,16 @@ Maximum number of microseconds in one NAPI polling cycle. Polling
will exit when either netdev_budget_usecs have elapsed during the
poll cycle or the number of packets processed reaches netdev_budget.
+Default: ``2 * USEC_PER_SEC / HZ`` (2000 when ``HZ`` is 1000)
+
netdev_max_backlog
------------------
Maximum number of packets, queued on the INPUT side, when the interface
receives packets faster than kernel can process them.
+Default: 1000
+
qdisc_max_burst
------------------
@@ -368,6 +407,15 @@ by the cpu which allocated them.
Default: 128
+max_skb_frags
+-------------
+
+The maximum number of fragments allowed per skb (socket buffer).
+This is mostly used for performance tuning of GSO (Generic
+Segmentation Offload).
+
+Default: ``CONFIG_MAX_SKB_FRAGS`` (17 if not overridden)
+
optmem_max
----------
@@ -377,6 +425,16 @@ optmem_max as a limit for its internal structures.
Default : 128 KB
+somaxconn
+---------
+
+Limit of the socket listen() backlog, known in userspace as SOMAXCONN.
+The maximum number of established sockets waiting to be accepted by
+accept(). If the backlog is greater than this value, it will be
+silently truncated to this value.
+
+Default: 4096
+
fb_tunnels_only_for_init_net
----------------------------
@@ -449,6 +507,8 @@ GRO has decided not to coalesce, it is placed on a per-NAPI list. This
list is then passed to the stack when the number of segments reaches the
gro_normal_batch limit.
+Default: 8
+
high_order_alloc_disable
------------------------
@@ -465,9 +525,11 @@ Default: 0
----------------------------------------------------------
There is only one file in this directory.
-unix_dgram_qlen limits the max number of datagrams queued in Unix domain
+max_dgram_qlen limits the max number of datagrams queued in Unix domain
socket's buffer. It will not take effect unless PF_UNIX flag is specified.
+Default: 10
+
3. /proc/sys/net/ipv4 - IPV4 settings
-------------------------------------
--
2.53.0
* Re: [PATCH v2] Documentation: sysctl: document net core sysctls
From: Simon Horman @ 2026-04-13 16:47 UTC (permalink / raw)
To: Shubham Chakraborty
Cc: netdev, davem, edumazet, kuba, pabeni, kuniyu, corbet, skhan,
linux-doc, linux-kernel
On Thu, Apr 09, 2026 at 11:18:59PM +0530, Shubham Chakraborty wrote:
> Document missing net.core and net.unix sysctl entries in
> admin-guide/sysctl/net.rst, and correct wording for defaults
> that are derived from PAGE_SIZE, HZ, or CONFIG_MAX_SKB_FRAGS.
>
> Also clarify that the RFS and flow-limit controls are only present
> when CONFIG_RPS or CONFIG_NET_FLOW_LIMIT is enabled, and describe
> rps_sock_flow_entries the way the handler implements it: non-zero
> values are rounded up to the nearest power of two.
>
> Signed-off-by: Shubham Chakraborty <chakrabortyshubham66@gmail.com>
...
> @@ -238,6 +240,37 @@ rps_default_mask
> The default RPS CPU mask used on newly created network devices. An empty
> mask means RPS disabled by default.
>
> +rps_sock_flow_entries
> +---------------------
> +
> +The total number of entries in the RPS flow table. This is used by
Maybe s/This/The table/ to make it clearer that it is the table,
rather than the number of entries, that tracks CPUs.
> +RFS (Receive Flow Steering) to track which CPU is currently processing
> +a flow in userspace. Non-zero values are rounded up to the nearest
> +power of two.
> +Available only when ``CONFIG_RPS`` is enabled.
I think it would be worth noting that a value of 0 disables RPS.
> +
> +Default: 0
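To make the rounding in the quoted text concrete, here is a minimal sketch in Python rather than kernel C. It is not the handler itself: the function name is hypothetical, and it only models "non-zero values are rounded up to the nearest power of two" with 0 passed through unchanged.

```python
def rounded_flow_entries(requested: int) -> int:
    """Model of the documented rounding for rps_sock_flow_entries.

    Hypothetical helper for illustration only; it is not kernel code.
    """
    if requested < 0:
        raise ValueError("entry count cannot be negative")
    if requested == 0:
        # Zero is left as zero (no table).
        return 0
    # Smallest power of two that is >= requested.
    return 1 << (requested - 1).bit_length()


if __name__ == "__main__":
    for value in (0, 1, 3, 32768, 32769):
        print(value, "->", rounded_flow_entries(value))
```

So a request that is already a power of two is kept as-is, and anything in between is bumped to the next one.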
...
> netdev_budget_usecs
> ---------------------
>
The lines above the following hunk are:
netdev_budget_usecs
---------------------
Maximum number of microseconds in one NAPI polling cycle. Polling
> @@ -297,12 +332,16 @@ Maximum number of microseconds in one NAPI polling cycle. Polling
> will exit when either netdev_budget_usecs have elapsed during the
> poll cycle or the number of packets processed reaches netdev_budget.
>
> +Default: ``2 * USEC_PER_SEC / HZ`` (2000 when ``HZ`` is 1000)
> +
Well, that is awkward.
Looking at git history, it seems that this sysctl was added by 7acf8a1e8a28
("Replace 2 jiffies with sysctl netdev_budget_usecs to enable softirq
tuning") in 2017. And at that time the unit was us, and the default was 2000 us.
But that was changed by a fix for that commit, a4837980fd9f ("net: revert
default NAPI poll timeout to 2 jiffies"), in 2020. As a side-effect of
that commit, the default was changed to what you have documented above,
and the unit changed to jiffies.
So while what you have is correct, it seems nonsensical to me for the unit
to be jiffies, because that's not a meaningful unit for users and because
the name of the sysctl ends in usecs.
But I'm unsure what to do about it, since changing the unit would
represent (another) KAPI break. Some options:
* Add another knob that shadows this one (But what to call it?)
* Simply remove this one (KAPI break)
* Change the unit of this knob (KAPI break)
If the code is left as is, then I think it should be documented that the
unit is jiffies.
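
For reference, the documented default expression can be evaluated for a few HZ values. This is only a sketch in Python, not kernel code: the function name is made up, the HZ values are just sample inputs, and integer division is assumed in order to mirror the C arithmetic. Each result is the length of two timer ticks expressed in microseconds.

```python
USEC_PER_SEC = 1_000_000  # microseconds per second


def default_netdev_budget_usecs(hz: int) -> int:
    """Evaluate 2 * USEC_PER_SEC / HZ using integer division.

    Hypothetical helper for illustration only; it is not kernel code.
    """
    if hz <= 0:
        raise ValueError("HZ must be positive")
    return 2 * USEC_PER_SEC // hz


if __name__ == "__main__":
    # Sample HZ values chosen for illustration.
    for hz in (100, 250, 300, 1000):
        print(f"HZ={hz}: default={default_netdev_budget_usecs(hz)}")
```

The point is that the default differs between builds, so a single number in the documentation would only be right for one HZ setting.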
...