netdev.vger.kernel.org archive mirror
* [PATCH net-next v3 0/3] Documentation and ynl: add flow control
@ 2025-08-20 13:10 Oleksij Rempel
  2025-08-20 13:10 ` [PATCH net-next v3 1/3] tools: ynl-gen: generate kdoc for attribute enums Oleksij Rempel
                   ` (2 more replies)
  0 siblings, 3 replies; 10+ messages in thread
From: Oleksij Rempel @ 2025-08-20 13:10 UTC
  To: Andrew Lunn, Heiner Kallweit, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Rob Herring, Krzysztof Kozlowski,
	Florian Fainelli, Maxime Chevallier, Kory Maincent,
	Lukasz Majewski, Jonathan Corbet, Donald Hunter, Vadim Fedorenko,
	Jiri Pirko, Vladimir Oltean, Alexei Starovoitov, Daniel Borkmann,
	Jesper Dangaard Brouer, John Fastabend
  Cc: Oleksij Rempel, kernel, linux-kernel, netdev, Russell King,
	Divya.Koppera, Sabrina Dubroca, Stanislav Fomichev

This series improves kernel documentation around Ethernet flow control
and enhances the ynl tooling to generate kernel-doc comments for
attribute enums.

Patch 1 extends the ynl generator to emit kdoc for enums based on YAML
attribute documentation.
Patch 2 regenerates all affected UAPI headers (dpll, ethtool, team,
net_shaper, netdev, ovpn) so that attribute enums now carry kernel-doc.
Patch 3 adds a new flow_control.rst document and annotates the ethtool
pause/pause-stat YAML definitions, relying on the kdoc generation
support from the earlier patches.

Oleksij Rempel (3):
  tools: ynl-gen: generate kdoc for attribute enums
  net: ynl: add generated kdoc to UAPI headers
  Documentation: net: add flow control guide and document ethtool API

 Documentation/netlink/specs/ethtool.yaml      |  27 ++
 Documentation/networking/flow_control.rst     | 379 ++++++++++++++++++
 Documentation/networking/index.rst            |   1 +
 Documentation/networking/phy.rst              |  12 +-
 include/linux/ethtool.h                       |  45 ++-
 include/uapi/linux/dpll.h                     |  30 ++
 .../uapi/linux/ethtool_netlink_generated.h    |  57 ++-
 include/uapi/linux/if_team.h                  |  11 +
 include/uapi/linux/net_shaper.h               |  50 +++
 include/uapi/linux/netdev.h                   | 165 ++++++++
 include/uapi/linux/ovpn.h                     |  62 +++
 net/dcb/dcbnl.c                               |   2 +
 net/ethtool/pause.c                           |   4 +
 tools/include/uapi/linux/netdev.h             | 165 ++++++++
 tools/net/ynl/pyynl/ynl_gen_c.py              |  23 ++
 15 files changed, 1018 insertions(+), 15 deletions(-)
 create mode 100644 Documentation/networking/flow_control.rst

--
2.39.5



* [PATCH net-next v3 1/3] tools: ynl-gen: generate kdoc for attribute enums
  2025-08-20 13:10 [PATCH net-next v3 0/3] Documentation and ynl: add flow control Oleksij Rempel
@ 2025-08-20 13:10 ` Oleksij Rempel
  2025-08-20 13:10 ` [PATCH net-next v3 2/3] net: ynl: add generated kdoc to UAPI headers Oleksij Rempel
  2025-08-20 13:10 ` [PATCH net-next v3 3/3] Documentation: net: add flow control guide and document ethtool API Oleksij Rempel
  2 siblings, 0 replies; 10+ messages in thread
From: Oleksij Rempel @ 2025-08-20 13:10 UTC
  To: Andrew Lunn, Heiner Kallweit, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Rob Herring, Krzysztof Kozlowski,
	Florian Fainelli, Maxime Chevallier, Kory Maincent,
	Lukasz Majewski, Jonathan Corbet, Donald Hunter, Vadim Fedorenko,
	Jiri Pirko, Vladimir Oltean, Alexei Starovoitov, Daniel Borkmann,
	Jesper Dangaard Brouer, John Fastabend
  Cc: Oleksij Rempel, kernel, linux-kernel, netdev, Russell King,
	Divya.Koppera, Sabrina Dubroca, Stanislav Fomichev

Parse 'doc' strings from the YAML spec to generate kernel-doc comments
for the corresponding enums in the C UAPI header, making the headers
self-documenting.

The generated comment format depends on the documentation available:
 - a full kdoc block ('/**') with @member tags is used if attributes are
   documented
 - a simple block comment ('/*') is used if only the set itself has a doc
   string
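
The selection between the two comment styles can be sketched in isolation
as follows. This is a simplified illustration, not the generator code:
render_enum_comment() and its arguments are hypothetical names, it returns
a list of lines instead of writing through the code writer, and it treats
an empty doc string the same as a missing one.

```python
def render_enum_comment(enum_name, set_doc, attr_docs):
    """Return comment lines for one attribute-set enum (hypothetical helper).

    enum_name: C enum name, e.g. "ethtool_pse_ntf"
    set_doc:   doc string of the attribute set itself, or None
    attr_docs: dict mapping enum member name -> doc string (may be empty)
    """
    # Keep only members that actually carry documentation
    documented = {name: doc for name, doc in attr_docs.items() if doc}
    if not set_doc and not documented:
        return []  # nothing documented: emit no comment at all

    if documented:
        # Full kernel-doc block with one @member tag per documented attribute
        head = f"enum {enum_name}"
        if set_doc:
            head += f" - {set_doc}"
        lines = ["/**", f" * {head}"]
        lines += [f" * @{name}: {doc}" for name, doc in documented.items()]
    else:
        # Only the set has a doc string: plain block comment
        lines = ["/*", f" * {set_doc}"]
    lines.append(" */")
    return lines


if __name__ == "__main__":
    docs = {"ETHTOOL_A_PSE_NTF_EVENTS":
            "List of events reported by the PSE controller"}
    print("\n".join(render_enum_comment("ethtool_pse_ntf", None, docs)))
```

Long descriptions are not wrapped in this sketch; it only shows which of
the two comment forms is chosen.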

Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
---
 tools/net/ynl/pyynl/ynl_gen_c.py | 23 +++++++++++++++++++++++
 1 file changed, 23 insertions(+)

diff --git a/tools/net/ynl/pyynl/ynl_gen_c.py b/tools/net/ynl/pyynl/ynl_gen_c.py
index ef032e17fec4..d7634560c461 100755
--- a/tools/net/ynl/pyynl/ynl_gen_c.py
+++ b/tools/net/ynl/pyynl/ynl_gen_c.py
@@ -3225,6 +3225,29 @@ def render_uapi(family, cw):
         if attr_set.subset_of:
             continue
 
+        # Write kdoc for attribute-set enums
+        has_main_doc = 'doc' in attr_set.yaml and attr_set.yaml['doc']
+        has_attr_doc = any('doc' in attr for _, attr in attr_set.items())
+
+        if has_main_doc or has_attr_doc:
+            if has_attr_doc:
+                cw.p('/**')
+                # Construct the main description line for the enum
+                doc_line = f"enum {c_lower(family.ident_name + '_' + attr_set.name)}"
+                if has_main_doc:
+                    doc_line += f" - {attr_set.yaml['doc']}"
+                cw.write_doc_line(doc_line)
+
+                # Write documentation for each attribute (enum member)
+                for _, attr in attr_set.items():
+                    if 'doc' in attr and attr['doc']:
+                        doc = f"@{attr.enum_name}: {attr['doc']}"
+                        cw.write_doc_line(doc)
+            else:  # Only has main doc, use a simpler comment block
+                cw.p('/*')
+                cw.write_doc_line(attr_set.yaml['doc'], indent=False)
+            cw.p(' */')
+
         max_value = f"({attr_set.cnt_name} - 1)"
 
         val = 0
-- 
2.39.5



* [PATCH net-next v3 2/3] net: ynl: add generated kdoc to UAPI headers
  2025-08-20 13:10 [PATCH net-next v3 0/3] Documentation and ynl: add flow control Oleksij Rempel
  2025-08-20 13:10 ` [PATCH net-next v3 1/3] tools: ynl-gen: generate kdoc for attribute enums Oleksij Rempel
@ 2025-08-20 13:10 ` Oleksij Rempel
  2025-08-22 14:11   ` ALOK TIWARI
  2025-08-20 13:10 ` [PATCH net-next v3 3/3] Documentation: net: add flow control guide and document ethtool API Oleksij Rempel
  2 siblings, 1 reply; 10+ messages in thread
From: Oleksij Rempel @ 2025-08-20 13:10 UTC
  To: Andrew Lunn, Heiner Kallweit, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Rob Herring, Krzysztof Kozlowski,
	Florian Fainelli, Maxime Chevallier, Kory Maincent,
	Lukasz Majewski, Jonathan Corbet, Donald Hunter, Vadim Fedorenko,
	Jiri Pirko, Vladimir Oltean, Alexei Starovoitov, Daniel Borkmann,
	Jesper Dangaard Brouer, John Fastabend
  Cc: Oleksij Rempel, kernel, linux-kernel, netdev, Russell King,
	Divya.Koppera, Sabrina Dubroca, Stanislav Fomichev

Run the ynl regeneration script to apply the kdoc generation
support added in the previous commit.

This updates the generated UAPI headers for dpll, ethtool, team,
net_shaper, netdev, and ovpn with documentation parsed from their
respective YAML specifications.
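
In the regenerated headers, long descriptions are folded to stay under 80
columns, with continuation lines indented by three spaces after the ' *'.
A rough sketch of such folding is shown here. wrap_doc_line() is a
hypothetical stand-in, not the generator's write_doc_line(), and the limit
of 79 is an assumption inferred from the output in this patch.

```python
def wrap_doc_line(text, limit=79, indent=True):
    """Fold one doc string into ' * ...' comment lines (hypothetical helper).

    limit:  assumed maximum line length before a word is moved to a new line
    indent: indent continuation lines, as done for kernel-doc @member entries
    """
    cont = " *" + ("  " if indent else "")
    out, line = [], " *"
    for word in text.split():
        # Start a new line when the next word would reach the limit, but
        # never emit a line that holds only the comment prefix
        if line.strip() != "*" and len(line) + len(word) >= limit:
            out.append(line)
            line = cont
        line += " " + word
    out.append(line)
    return out


if __name__ == "__main__":
    doc = ("@NET_SHAPER_A_WEIGHT: Relative weight for round robin scheduling "
           "of the given shaper. The scheduling is applied to all sibling "
           "shapers with the same priority.")
    print("\n".join(wrap_doc_line(doc)))
```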

Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
---
 include/uapi/linux/dpll.h                     |  30 ++++
 .../uapi/linux/ethtool_netlink_generated.h    |  29 +++
 include/uapi/linux/if_team.h                  |  11 ++
 include/uapi/linux/net_shaper.h               |  50 ++++++
 include/uapi/linux/netdev.h                   | 165 ++++++++++++++++++
 include/uapi/linux/ovpn.h                     |  62 +++++++
 tools/include/uapi/linux/netdev.h             | 165 ++++++++++++++++++
 7 files changed, 512 insertions(+)

diff --git a/include/uapi/linux/dpll.h b/include/uapi/linux/dpll.h
index 37b438ce8efc..23a4e3598650 100644
--- a/include/uapi/linux/dpll.h
+++ b/include/uapi/linux/dpll.h
@@ -203,6 +203,18 @@ enum dpll_feature_state {
 	DPLL_FEATURE_STATE_ENABLE,
 };
 
+/**
+ * enum dpll_dpll
+ * @DPLL_A_CLOCK_QUALITY_LEVEL: Level of quality of a clock device. This mainly
+ *   applies when the dpll lock-status is DPLL_LOCK_STATUS_HOLDOVER. This could
+ *   be put to message multiple times to indicate possible parallel quality
+ *   levels (e.g. one specified by ITU option 1 and another one specified by
+ *   option 2).
+ * @DPLL_A_PHASE_OFFSET_MONITOR: Receive or request state of phase offset
+ *   monitor feature. If enabled, dpll device shall monitor and notify all
+ *   currently available inputs for changes of their phase offset against the
+ *   dpll device.
+ */
 enum dpll_a {
 	DPLL_A_ID = 1,
 	DPLL_A_MODULE_NAME,
@@ -221,6 +233,24 @@ enum dpll_a {
 	DPLL_A_MAX = (__DPLL_A_MAX - 1)
 };
 
+/**
+ * enum dpll_pin
+ * @DPLL_A_PIN_FRACTIONAL_FREQUENCY_OFFSET: The FFO (Fractional Frequency
+ *   Offset) between the RX and TX symbol rate on the media associated with the
+ *   pin: (rx_frequency-tx_frequency)/rx_frequency Value is in PPM (parts per
+ *   million). This may be implemented for example for pin of type
+ *   PIN_TYPE_SYNCE_ETH_PORT.
+ * @DPLL_A_PIN_ESYNC_FREQUENCY: Frequency of Embedded SYNC signal. If provided,
+ *   the pin is configured with a SYNC signal embedded into its base clock
+ *   frequency.
+ * @DPLL_A_PIN_ESYNC_FREQUENCY_SUPPORTED: If provided a pin is capable of
+ *   embedding a SYNC signal (within given range) into its base frequency
+ *   signal.
+ * @DPLL_A_PIN_ESYNC_PULSE: A ratio of high to low state of a SYNC signal pulse
+ *   embedded into base clock frequency. Value is in percents.
+ * @DPLL_A_PIN_REFERENCE_SYNC: Capable pin provides list of pins that can be
+ *   bound to create a reference-sync pin pair.
+ */
 enum dpll_a_pin {
 	DPLL_A_PIN_ID = 1,
 	DPLL_A_PIN_PARENT_ID,
diff --git a/include/uapi/linux/ethtool_netlink_generated.h b/include/uapi/linux/ethtool_netlink_generated.h
index e3b8813465d7..46de09954042 100644
--- a/include/uapi/linux/ethtool_netlink_generated.h
+++ b/include/uapi/linux/ethtool_netlink_generated.h
@@ -197,6 +197,15 @@ enum {
 	ETHTOOL_A_RINGS_MAX = (__ETHTOOL_A_RINGS_CNT - 1)
 };
 
+/**
+ * enum ethtool_mm_stat - MAC Merge (802.3)
+ * @ETHTOOL_A_MM_STAT_REASSEMBLY_ERRORS: aMACMergeFrameAssErrorCount
+ * @ETHTOOL_A_MM_STAT_SMD_ERRORS: aMACMergeFrameSmdErrorCount
+ * @ETHTOOL_A_MM_STAT_REASSEMBLY_OK: aMACMergeFrameAssOkCount
+ * @ETHTOOL_A_MM_STAT_RX_FRAG_COUNT: aMACMergeFragCountRx
+ * @ETHTOOL_A_MM_STAT_TX_FRAG_COUNT: aMACMergeFragCountTx
+ * @ETHTOOL_A_MM_STAT_HOLD_COUNT: aMACMergeHoldCount
+ */
 enum {
 	ETHTOOL_A_MM_STAT_UNSPEC,
 	ETHTOOL_A_MM_STAT_PAD,
@@ -448,6 +457,12 @@ enum {
 	ETHTOOL_A_TSINFO_MAX = (__ETHTOOL_A_TSINFO_CNT - 1)
 };
 
+/**
+ * enum ethtool_cable_result
+ * @ETHTOOL_A_CABLE_RESULT_PAIR: ETHTOOL_A_CABLE_PAIR
+ * @ETHTOOL_A_CABLE_RESULT_CODE: ETHTOOL_A_CABLE_RESULT_CODE
+ * @ETHTOOL_A_CABLE_RESULT_SRC: ETHTOOL_A_CABLE_INF_SRC
+ */
 enum {
 	ETHTOOL_A_CABLE_RESULT_UNSPEC,
 	ETHTOOL_A_CABLE_RESULT_PAIR,
@@ -485,6 +500,10 @@ enum {
 	ETHTOOL_A_CABLE_TEST_MAX = (__ETHTOOL_A_CABLE_TEST_CNT - 1)
 };
 
+/**
+ * enum ethtool_cable_test_ntf
+ * @ETHTOOL_A_CABLE_TEST_NTF_STATUS: _STARTED/_COMPLETE
+ */
 enum {
 	ETHTOOL_A_CABLE_TEST_NTF_UNSPEC,
 	ETHTOOL_A_CABLE_TEST_NTF_HEADER,
@@ -678,6 +697,12 @@ enum {
 	ETHTOOL_A_PSE_MAX = (__ETHTOOL_A_PSE_CNT - 1)
 };
 
+/*
+ * Flow types, corresponding to those defined in the old ethtool header for
+ * RXFH and RXNFC as ${PROTO}_FLOW. The values are not matching the old ones to
+ * avoid carrying into Netlink the IP_USER_FLOW vs IPV4_FLOW vs IPV4_USER_FLOW
+ * confusion.
+ */
 enum {
 	ETHTOOL_A_FLOW_ETHER = 1,
 	ETHTOOL_A_FLOW_IP4,
@@ -783,6 +808,10 @@ enum {
 	ETHTOOL_A_TSCONFIG_MAX = (__ETHTOOL_A_TSCONFIG_CNT - 1)
 };
 
+/**
+ * enum ethtool_pse_ntf
+ * @ETHTOOL_A_PSE_NTF_EVENTS: List of events reported by the PSE controller
+ */
 enum {
 	ETHTOOL_A_PSE_NTF_HEADER = 1,
 	ETHTOOL_A_PSE_NTF_EVENTS,
diff --git a/include/uapi/linux/if_team.h b/include/uapi/linux/if_team.h
index a5c06243a435..22d68c0dad60 100644
--- a/include/uapi/linux/if_team.h
+++ b/include/uapi/linux/if_team.h
@@ -12,6 +12,12 @@
 #define TEAM_STRING_MAX_LEN			32
 #define TEAM_GENL_CHANGE_EVENT_MC_GRP_NAME	"change_event"
 
+/*
+ * The team nested layout of get/set msg looks like [TEAM_ATTR_LIST_OPTION]
+ * [TEAM_ATTR_ITEM_OPTION] [TEAM_ATTR_OPTION_*], ... [TEAM_ATTR_ITEM_OPTION]
+ * [TEAM_ATTR_OPTION_*], ... ... [TEAM_ATTR_LIST_PORT] [TEAM_ATTR_ITEM_PORT]
+ * [TEAM_ATTR_PORT_*], ... [TEAM_ATTR_ITEM_PORT] [TEAM_ATTR_PORT_*], ... ...
+ */
 enum {
 	TEAM_ATTR_UNSPEC,
 	TEAM_ATTR_TEAM_IFINDEX,
@@ -30,6 +36,11 @@ enum {
 	TEAM_ATTR_ITEM_OPTION_MAX = (__TEAM_ATTR_ITEM_OPTION_MAX - 1)
 };
 
+/**
+ * enum team_attr_option
+ * @TEAM_ATTR_OPTION_PORT_IFINDEX: for per-port options
+ * @TEAM_ATTR_OPTION_ARRAY_INDEX: for array options
+ */
 enum {
 	TEAM_ATTR_OPTION_UNSPEC,
 	TEAM_ATTR_OPTION_NAME,
diff --git a/include/uapi/linux/net_shaper.h b/include/uapi/linux/net_shaper.h
index d8834b59f7d7..1aeeb1d68fff 100644
--- a/include/uapi/linux/net_shaper.h
+++ b/include/uapi/linux/net_shaper.h
@@ -41,6 +41,28 @@ enum net_shaper_metric {
 	NET_SHAPER_METRIC_PPS,
 };
 
+/**
+ * enum net_shaper_net_shaper
+ * @NET_SHAPER_A_HANDLE: Unique identifier for the given shaper inside the
+ *   owning device.
+ * @NET_SHAPER_A_METRIC: Metric used by the given shaper for bw-min, bw-max and
+ *   burst.
+ * @NET_SHAPER_A_BW_MIN: Guaranteed bandwidth for the given shaper.
+ * @NET_SHAPER_A_BW_MAX: Maximum bandwidth for the given shaper or 0 when
+ *   unlimited.
+ * @NET_SHAPER_A_BURST: Maximum burst-size for shaping. Should not be
+ *   interpreted as a quantum.
+ * @NET_SHAPER_A_PRIORITY: Scheduling priority for the given shaper. The
+ *   priority scheduling is applied to sibling shapers.
+ * @NET_SHAPER_A_WEIGHT: Relative weight for round robin scheduling of the
+ *   given shaper. The scheduling is applied to all sibling shapers with the
+ *   same priority.
+ * @NET_SHAPER_A_IFINDEX: Interface index owning the specified shaper.
+ * @NET_SHAPER_A_PARENT: Identifier for the parent of the affected shaper. Only
+ *   needed for @group operation.
+ * @NET_SHAPER_A_LEAVES: Describes a set of leaves shapers for a @group
+ *   operation.
+ */
 enum {
 	NET_SHAPER_A_HANDLE = 1,
 	NET_SHAPER_A_METRIC,
@@ -57,6 +79,13 @@ enum {
 	NET_SHAPER_A_MAX = (__NET_SHAPER_A_MAX - 1)
 };
 
+/**
+ * enum net_shaper_handle
+ * @NET_SHAPER_A_HANDLE_SCOPE: Defines the shaper @id interpretation.
+ * @NET_SHAPER_A_HANDLE_ID: Numeric identifier of a shaper. The id semantic
+ *   depends on the scope. For @queue scope it's the queue id and for @node
+ *   scope it's the node identifier.
+ */
 enum {
 	NET_SHAPER_A_HANDLE_SCOPE = 1,
 	NET_SHAPER_A_HANDLE_ID,
@@ -65,6 +94,27 @@ enum {
 	NET_SHAPER_A_HANDLE_MAX = (__NET_SHAPER_A_HANDLE_MAX - 1)
 };
 
+/**
+ * enum net_shaper_caps
+ * @NET_SHAPER_A_CAPS_IFINDEX: Interface index queried for shapers
+ *   capabilities.
+ * @NET_SHAPER_A_CAPS_SCOPE: The scope to which the queried capabilities apply.
+ * @NET_SHAPER_A_CAPS_SUPPORT_METRIC_BPS: The device accepts 'bps' metric for
+ *   bw-min, bw-max and burst.
+ * @NET_SHAPER_A_CAPS_SUPPORT_METRIC_PPS: The device accepts 'pps' metric for
+ *   bw-min, bw-max and burst.
+ * @NET_SHAPER_A_CAPS_SUPPORT_NESTING: The device supports nesting shaper
+ *   belonging to this scope below 'node' scoped shapers. Only 'queue' and
+ *   'node' scope can have flag 'support-nesting'.
+ * @NET_SHAPER_A_CAPS_SUPPORT_BW_MIN: The device supports a minimum guaranteed
+ *   B/W.
+ * @NET_SHAPER_A_CAPS_SUPPORT_BW_MAX: The device supports maximum B/W shaping.
+ * @NET_SHAPER_A_CAPS_SUPPORT_BURST: The device supports a maximum burst size.
+ * @NET_SHAPER_A_CAPS_SUPPORT_PRIORITY: The device supports priority
+ *   scheduling.
+ * @NET_SHAPER_A_CAPS_SUPPORT_WEIGHT: The device supports weighted round robin
+ *   scheduling.
+ */
 enum {
 	NET_SHAPER_A_CAPS_IFINDEX = 1,
 	NET_SHAPER_A_CAPS_SCOPE,
diff --git a/include/uapi/linux/netdev.h b/include/uapi/linux/netdev.h
index 48eb49aa03d4..4d5169fc798d 100644
--- a/include/uapi/linux/netdev.h
+++ b/include/uapi/linux/netdev.h
@@ -82,6 +82,16 @@ enum netdev_napi_threaded {
 	NETDEV_NAPI_THREADED_ENABLED,
 };
 
+/**
+ * enum netdev_dev
+ * @NETDEV_A_DEV_IFINDEX: netdev ifindex
+ * @NETDEV_A_DEV_XDP_FEATURES: Bitmask of enabled xdp-features.
+ * @NETDEV_A_DEV_XDP_ZC_MAX_SEGS: max fragment count supported by ZC driver
+ * @NETDEV_A_DEV_XDP_RX_METADATA_FEATURES: Bitmask of supported XDP receive
+ *   metadata features. See Documentation/networking/xdp-rx-metadata.rst for
+ *   more details.
+ * @NETDEV_A_DEV_XSK_FEATURES: Bitmask of enabled AF_XDP features.
+ */
 enum {
 	NETDEV_A_DEV_IFINDEX = 1,
 	NETDEV_A_DEV_PAD,
@@ -99,6 +109,29 @@ enum {
 	NETDEV_A_IO_URING_PROVIDER_INFO_MAX = (__NETDEV_A_IO_URING_PROVIDER_INFO_MAX - 1)
 };
 
+/**
+ * enum netdev_page_pool
+ * @NETDEV_A_PAGE_POOL_ID: Unique ID of a Page Pool instance.
+ * @NETDEV_A_PAGE_POOL_IFINDEX: ifindex of the netdev to which the pool
+ *   belongs. May be reported as 0 if the page pool was allocated for a netdev
+ *   which got destroyed already (page pools may outlast their netdevs because
+ *   they wait for all memory to be returned).
+ * @NETDEV_A_PAGE_POOL_NAPI_ID: Id of NAPI using this Page Pool instance.
+ * @NETDEV_A_PAGE_POOL_INFLIGHT: Number of outstanding references to this page
+ *   pool (allocated but yet to be freed pages). Allocated pages may be held in
+ *   socket receive queues, driver receive ring, page pool recycling ring, the
+ *   page pool cache, etc.
+ * @NETDEV_A_PAGE_POOL_INFLIGHT_MEM: Amount of memory held by inflight pages.
+ * @NETDEV_A_PAGE_POOL_DETACH_TIME: Seconds in CLOCK_BOOTTIME of when Page Pool
+ *   was detached by the driver. Once detached Page Pool can no longer be used
+ *   to allocate memory. Page Pools wait for all the memory allocated from them
+ *   to be freed before truly disappearing. "Detached" Page Pools cannot be
+ *   "re-attached", they are just waiting to disappear. Attribute is absent if
+ *   Page Pool has not been detached, and can still be used to allocate new
+ *   memory.
+ * @NETDEV_A_PAGE_POOL_DMABUF: ID of the dmabuf this page-pool is attached to.
+ * @NETDEV_A_PAGE_POOL_IO_URING: io-uring memory provider information.
+ */
 enum {
 	NETDEV_A_PAGE_POOL_ID = 1,
 	NETDEV_A_PAGE_POOL_IFINDEX,
@@ -113,6 +146,11 @@ enum {
 	NETDEV_A_PAGE_POOL_MAX = (__NETDEV_A_PAGE_POOL_MAX - 1)
 };
 
+/**
+ * enum netdev_page_pool_stats - Page pool statistics, see docs for struct
+ *   page_pool_stats for information about individual statistics.
+ * @NETDEV_A_PAGE_POOL_STATS_INFO: Page pool identifying information.
+ */
 enum {
 	NETDEV_A_PAGE_POOL_STATS_INFO = 1,
 	NETDEV_A_PAGE_POOL_STATS_ALLOC_FAST = 8,
@@ -131,6 +169,28 @@ enum {
 	NETDEV_A_PAGE_POOL_STATS_MAX = (__NETDEV_A_PAGE_POOL_STATS_MAX - 1)
 };
 
+/**
+ * enum netdev_napi
+ * @NETDEV_A_NAPI_IFINDEX: ifindex of the netdevice to which NAPI instance
+ *   belongs.
+ * @NETDEV_A_NAPI_ID: ID of the NAPI instance.
+ * @NETDEV_A_NAPI_IRQ: The associated interrupt vector number for the napi
+ * @NETDEV_A_NAPI_PID: PID of the napi thread, if NAPI is configured to operate
+ *   in threaded mode. If NAPI is not in threaded mode (i.e. uses normal
+ *   softirq context), the attribute will be absent.
+ * @NETDEV_A_NAPI_DEFER_HARD_IRQS: The number of consecutive empty polls before
+ *   IRQ deferral ends and hardware IRQs are re-enabled.
+ * @NETDEV_A_NAPI_GRO_FLUSH_TIMEOUT: The timeout, in nanoseconds, of when to
+ *   trigger the NAPI watchdog timer which schedules NAPI processing.
+ *   Additionally, a non-zero value will also prevent GRO from flushing recent
+ *   super-frames at the end of a NAPI cycle. This may add receive latency in
+ *   exchange for reducing the number of frames processed by the network stack.
+ * @NETDEV_A_NAPI_IRQ_SUSPEND_TIMEOUT: The timeout, in nanoseconds, of how long
+ *   to suspend irq processing, if event polling finds events
+ * @NETDEV_A_NAPI_THREADED: Whether the NAPI is configured to operate in
+ *   threaded polling mode. If this is set to enabled then the NAPI context
+ *   operates in threaded polling mode.
+ */
 enum {
 	NETDEV_A_NAPI_IFINDEX = 1,
 	NETDEV_A_NAPI_ID,
@@ -150,6 +210,22 @@ enum {
 	NETDEV_A_XSK_INFO_MAX = (__NETDEV_A_XSK_INFO_MAX - 1)
 };
 
+/**
+ * enum netdev_queue
+ * @NETDEV_A_QUEUE_ID: Queue index; most queue types are indexed like a C
+ *   array, with indexes starting at 0 and ending at queue count - 1. Queue
+ *   indexes are scoped to an interface and queue type.
+ * @NETDEV_A_QUEUE_IFINDEX: ifindex of the netdevice to which the queue
+ *   belongs.
+ * @NETDEV_A_QUEUE_TYPE: Queue type as rx, tx. Each queue type defines a
+ *   separate ID space. XDP TX queues allocated in the kernel are not linked to
+ *   NAPIs and thus not listed. AF_XDP queues will have more information set in
+ *   the xsk attribute.
+ * @NETDEV_A_QUEUE_NAPI_ID: ID of the NAPI instance which services this queue.
+ * @NETDEV_A_QUEUE_DMABUF: ID of the dmabuf attached to this queue, if any.
+ * @NETDEV_A_QUEUE_IO_URING: io_uring memory provider information.
+ * @NETDEV_A_QUEUE_XSK: XSK information for this queue, if any.
+ */
 enum {
 	NETDEV_A_QUEUE_ID = 1,
 	NETDEV_A_QUEUE_IFINDEX,
@@ -163,6 +239,88 @@ enum {
 	NETDEV_A_QUEUE_MAX = (__NETDEV_A_QUEUE_MAX - 1)
 };
 
+/**
+ * enum netdev_qstats - Get device statistics, scoped to a device or a queue.
+ *   These statistics extend (and partially duplicate) statistics available in
+ *   struct rtnl_link_stats64. Value of the `scope` attribute determines how
+ *   statistics are aggregated. When aggregated for the entire device the
+ *   statistics represent the total number of events since last explicit reset
+ *   of the device (i.e. not a reconfiguration like changing queue count). When
+ *   reported per-queue, however, the statistics may not add up to the total
+ *   number of events, will only be reported for currently active objects, and
+ *   will likely report the number of events since last reconfiguration.
+ * @NETDEV_A_QSTATS_IFINDEX: ifindex of the netdevice to which stats belong.
+ * @NETDEV_A_QSTATS_QUEUE_TYPE: Queue type as rx, tx, for queue-id.
+ * @NETDEV_A_QSTATS_QUEUE_ID: Queue ID, if stats are scoped to a single queue
+ *   instance.
+ * @NETDEV_A_QSTATS_SCOPE: What object type should be used to iterate over the
+ *   stats.
+ * @NETDEV_A_QSTATS_RX_PACKETS: Number of wire packets successfully received
+ *   and passed to the stack. For drivers supporting XDP, XDP is considered the
+ *   first layer of the stack, so packets consumed by XDP are still counted
+ *   here.
+ * @NETDEV_A_QSTATS_RX_BYTES: Successfully received bytes, see `rx-packets`.
+ * @NETDEV_A_QSTATS_TX_PACKETS: Number of wire packets successfully sent.
+ *   Packet is considered to be successfully sent once it is in device memory
+ *   (usually this means the device has issued a DMA completion for the
+ *   packet).
+ * @NETDEV_A_QSTATS_TX_BYTES: Successfully sent bytes, see `tx-packets`.
+ * @NETDEV_A_QSTATS_RX_ALLOC_FAIL: Number of times skb or buffer allocation
+ *   failed on the Rx datapath. Allocation failure may, or may not result in a
+ *   packet drop, depending on driver implementation and whether system
+ *   recovers quickly.
+ * @NETDEV_A_QSTATS_RX_HW_DROPS: Number of all packets which entered the
+ *   device, but never left it, including but not limited to: packets dropped
+ *   due to lack of buffer space, processing errors, explicit or implicit
+ *   policies and packet filters.
+ * @NETDEV_A_QSTATS_RX_HW_DROP_OVERRUNS: Number of packets dropped due to
+ *   transient lack of resources, such as buffer space, host descriptors etc.
+ * @NETDEV_A_QSTATS_RX_CSUM_COMPLETE: Number of packets that were marked as
+ *   CHECKSUM_COMPLETE.
+ * @NETDEV_A_QSTATS_RX_CSUM_UNNECESSARY: Number of packets that were marked as
+ *   CHECKSUM_UNNECESSARY.
+ * @NETDEV_A_QSTATS_RX_CSUM_NONE: Number of packets that were not checksummed
+ *   by device.
+ * @NETDEV_A_QSTATS_RX_CSUM_BAD: Number of packets with bad checksum. The
+ *   packets are not discarded, but still delivered to the stack.
+ * @NETDEV_A_QSTATS_RX_HW_GRO_PACKETS: Number of packets that were coalesced
+ *   from smaller packets by the device. Counts only packets coalesced with the
+ *   HW-GRO netdevice feature, LRO-coalesced packets are not counted.
+ * @NETDEV_A_QSTATS_RX_HW_GRO_BYTES: See `rx-hw-gro-packets`.
+ * @NETDEV_A_QSTATS_RX_HW_GRO_WIRE_PACKETS: Number of packets that were
+ *   coalesced to bigger packets with the HW-GRO netdevice feature.
+ *   LRO-coalesced packets are not counted.
+ * @NETDEV_A_QSTATS_RX_HW_GRO_WIRE_BYTES: See `rx-hw-gro-wire-packets`.
+ * @NETDEV_A_QSTATS_RX_HW_DROP_RATELIMITS: Number of the packets dropped by the
+ *   device due to the received packets bitrate exceeding the device rate
+ *   limit.
+ * @NETDEV_A_QSTATS_TX_HW_DROPS: Number of packets that arrived at the device
+ *   but never left it, encompassing packets dropped for reasons such as
+ *   processing errors, as well as those affected by explicitly defined
+ *   policies and packet filtering criteria.
+ * @NETDEV_A_QSTATS_TX_HW_DROP_ERRORS: Number of packets dropped because they
+ *   were invalid or malformed.
+ * @NETDEV_A_QSTATS_TX_CSUM_NONE: Number of packets that did not require the
+ *   device to calculate the checksum.
+ * @NETDEV_A_QSTATS_TX_NEEDS_CSUM: Number of packets that required the device
+ *   to calculate the checksum. This counter includes the number of GSO wire
+ *   packets for which device calculated the L4 checksum.
+ * @NETDEV_A_QSTATS_TX_HW_GSO_PACKETS: Number of packets that necessitated
+ *   segmentation into smaller packets by the device.
+ * @NETDEV_A_QSTATS_TX_HW_GSO_BYTES: See `tx-hw-gso-packets`.
+ * @NETDEV_A_QSTATS_TX_HW_GSO_WIRE_PACKETS: Number of wire-sized packets
+ *   generated by processing `tx-hw-gso-packets`
+ * @NETDEV_A_QSTATS_TX_HW_GSO_WIRE_BYTES: See `tx-hw-gso-wire-packets`.
+ * @NETDEV_A_QSTATS_TX_HW_DROP_RATELIMITS: Number of the packets dropped by the
+ *   device due to the transmit packets bitrate exceeding the device rate
+ *   limit.
+ * @NETDEV_A_QSTATS_TX_STOP: Number of times driver paused accepting new tx
+ *   packets from the stack to this queue, because the queue was full. Note
+ *   that if BQL is supported and enabled on the device the networking stack
+ *   will avoid queuing a lot of data at once.
+ * @NETDEV_A_QSTATS_TX_WAKE: Number of times driver re-started accepting send
+ *   requests to this queue from the stack.
+ */
 enum {
 	NETDEV_A_QSTATS_IFINDEX = 1,
 	NETDEV_A_QSTATS_QUEUE_TYPE,
@@ -200,6 +358,13 @@ enum {
 	NETDEV_A_QSTATS_MAX = (__NETDEV_A_QSTATS_MAX - 1)
 };
 
+/**
+ * enum netdev_dmabuf
+ * @NETDEV_A_DMABUF_IFINDEX: netdev ifindex to bind the dmabuf to.
+ * @NETDEV_A_DMABUF_QUEUES: receive queues to bind the dmabuf to.
+ * @NETDEV_A_DMABUF_FD: dmabuf file descriptor to bind.
+ * @NETDEV_A_DMABUF_ID: id of the dmabuf binding
+ */
 enum {
 	NETDEV_A_DMABUF_IFINDEX = 1,
 	NETDEV_A_DMABUF_QUEUES,
diff --git a/include/uapi/linux/ovpn.h b/include/uapi/linux/ovpn.h
index 680d1522dc87..cff05828d79b 100644
--- a/include/uapi/linux/ovpn.h
+++ b/include/uapi/linux/ovpn.h
@@ -30,6 +30,43 @@ enum ovpn_key_slot {
 	OVPN_KEY_SLOT_SECONDARY,
 };
 
+/**
+ * enum ovpn_peer
+ * @OVPN_A_PEER_ID: The unique ID of the peer in the device context. To be used
+ *   to identify peers during operations for a specific device
+ * @OVPN_A_PEER_REMOTE_IPV4: The remote IPv4 address of the peer
+ * @OVPN_A_PEER_REMOTE_IPV6: The remote IPv6 address of the peer
+ * @OVPN_A_PEER_REMOTE_IPV6_SCOPE_ID: The scope id of the remote IPv6 address
+ *   of the peer (RFC2553)
+ * @OVPN_A_PEER_REMOTE_PORT: The remote port of the peer
+ * @OVPN_A_PEER_SOCKET: The socket to be used to communicate with the peer
+ * @OVPN_A_PEER_SOCKET_NETNSID: The ID of the netns the socket assigned to this
+ *   peer lives in
+ * @OVPN_A_PEER_VPN_IPV4: The IPv4 address assigned to the peer by the server
+ * @OVPN_A_PEER_VPN_IPV6: The IPv6 address assigned to the peer by the server
+ * @OVPN_A_PEER_LOCAL_IPV4: The local IPv4 to be used to send packets to the
+ *   peer (UDP only)
+ * @OVPN_A_PEER_LOCAL_IPV6: The local IPv6 to be used to send packets to the
+ *   peer (UDP only)
+ * @OVPN_A_PEER_LOCAL_PORT: The local port to be used to send packets to the
+ *   peer (UDP only)
+ * @OVPN_A_PEER_KEEPALIVE_INTERVAL: The number of seconds after which a keep
+ *   alive message is sent to the peer
+ * @OVPN_A_PEER_KEEPALIVE_TIMEOUT: The number of seconds from the last activity
+ *   after which the peer is assumed dead
+ * @OVPN_A_PEER_DEL_REASON: The reason why a peer was deleted
+ * @OVPN_A_PEER_VPN_RX_BYTES: Number of bytes received over the tunnel
+ * @OVPN_A_PEER_VPN_TX_BYTES: Number of bytes transmitted over the tunnel
+ * @OVPN_A_PEER_VPN_RX_PACKETS: Number of packets received over the tunnel
+ * @OVPN_A_PEER_VPN_TX_PACKETS: Number of packets transmitted over the tunnel
+ * @OVPN_A_PEER_LINK_RX_BYTES: Number of bytes received at the transport level
+ * @OVPN_A_PEER_LINK_TX_BYTES: Number of bytes transmitted at the transport
+ *   level
+ * @OVPN_A_PEER_LINK_RX_PACKETS: Number of packets received at the transport
+ *   level
+ * @OVPN_A_PEER_LINK_TX_PACKETS: Number of packets transmitted at the transport
+ *   level
+ */
 enum {
 	OVPN_A_PEER_ID = 1,
 	OVPN_A_PEER_REMOTE_IPV4,
@@ -59,6 +96,18 @@ enum {
 	OVPN_A_PEER_MAX = (__OVPN_A_PEER_MAX - 1)
 };
 
+/**
+ * enum ovpn_keyconf
+ * @OVPN_A_KEYCONF_PEER_ID: The unique ID of the peer in the device context. To
+ *   be used to identify peers during key operations
+ * @OVPN_A_KEYCONF_SLOT: The slot where the key should be stored
+ * @OVPN_A_KEYCONF_KEY_ID: The unique ID of the key in the peer context. Used
+ *   to fetch the correct key upon decryption
+ * @OVPN_A_KEYCONF_CIPHER_ALG: The cipher to be used when communicating with
+ *   the peer
+ * @OVPN_A_KEYCONF_ENCRYPT_DIR: Key material for encrypt direction
+ * @OVPN_A_KEYCONF_DECRYPT_DIR: Key material for decrypt direction
+ */
 enum {
 	OVPN_A_KEYCONF_PEER_ID = 1,
 	OVPN_A_KEYCONF_SLOT,
@@ -71,6 +120,12 @@ enum {
 	OVPN_A_KEYCONF_MAX = (__OVPN_A_KEYCONF_MAX - 1)
 };
 
+/**
+ * enum ovpn_keydir
+ * @OVPN_A_KEYDIR_CIPHER_KEY: The actual key to be used by the cipher
+ * @OVPN_A_KEYDIR_NONCE_TAIL: Random nonce to be concatenated to the packet ID,
+ *   in order to obtain the actual cipher IV
+ */
 enum {
 	OVPN_A_KEYDIR_CIPHER_KEY = 1,
 	OVPN_A_KEYDIR_NONCE_TAIL,
@@ -79,6 +134,13 @@ enum {
 	OVPN_A_KEYDIR_MAX = (__OVPN_A_KEYDIR_MAX - 1)
 };
 
+/**
+ * enum ovpn_ovpn
+ * @OVPN_A_IFINDEX: Index of the ovpn interface to operate on
+ * @OVPN_A_PEER: The peer object containing the attributes of interest for the
+ *   specific operation
+ * @OVPN_A_KEYCONF: Peer specific cipher configuration
+ */
 enum {
 	OVPN_A_IFINDEX = 1,
 	OVPN_A_PEER,
diff --git a/tools/include/uapi/linux/netdev.h b/tools/include/uapi/linux/netdev.h
index 48eb49aa03d4..4d5169fc798d 100644
--- a/tools/include/uapi/linux/netdev.h
+++ b/tools/include/uapi/linux/netdev.h
@@ -82,6 +82,16 @@ enum netdev_napi_threaded {
 	NETDEV_NAPI_THREADED_ENABLED,
 };
 
+/**
+ * enum netdev_dev
+ * @NETDEV_A_DEV_IFINDEX: netdev ifindex
+ * @NETDEV_A_DEV_XDP_FEATURES: Bitmask of enabled xdp-features.
+ * @NETDEV_A_DEV_XDP_ZC_MAX_SEGS: max fragment count supported by ZC driver
+ * @NETDEV_A_DEV_XDP_RX_METADATA_FEATURES: Bitmask of supported XDP receive
+ *   metadata features. See Documentation/networking/xdp-rx-metadata.rst for
+ *   more details.
+ * @NETDEV_A_DEV_XSK_FEATURES: Bitmask of enabled AF_XDP features.
+ */
 enum {
 	NETDEV_A_DEV_IFINDEX = 1,
 	NETDEV_A_DEV_PAD,
@@ -99,6 +109,29 @@ enum {
 	NETDEV_A_IO_URING_PROVIDER_INFO_MAX = (__NETDEV_A_IO_URING_PROVIDER_INFO_MAX - 1)
 };
 
+/**
+ * enum netdev_page_pool
+ * @NETDEV_A_PAGE_POOL_ID: Unique ID of a Page Pool instance.
+ * @NETDEV_A_PAGE_POOL_IFINDEX: ifindex of the netdev to which the pool
+ *   belongs. May be reported as 0 if the page pool was allocated for a netdev
+ *   which got destroyed already (page pools may outlast their netdevs because
+ *   they wait for all memory to be returned).
+ * @NETDEV_A_PAGE_POOL_NAPI_ID: Id of NAPI using this Page Pool instance.
+ * @NETDEV_A_PAGE_POOL_INFLIGHT: Number of outstanding references to this page
+ *   pool (allocated but yet to be freed pages). Allocated pages may be held in
+ *   socket receive queues, driver receive ring, page pool recycling ring, the
+ *   page pool cache, etc.
+ * @NETDEV_A_PAGE_POOL_INFLIGHT_MEM: Amount of memory held by inflight pages.
+ * @NETDEV_A_PAGE_POOL_DETACH_TIME: Seconds in CLOCK_BOOTTIME of when Page Pool
+ *   was detached by the driver. Once detached Page Pool can no longer be used
+ *   to allocate memory. Page Pools wait for all the memory allocated from them
+ *   to be freed before truly disappearing. "Detached" Page Pools cannot be
+ *   "re-attached", they are just waiting to disappear. Attribute is absent if
+ *   Page Pool has not been detached, and can still be used to allocate new
+ *   memory.
+ * @NETDEV_A_PAGE_POOL_DMABUF: ID of the dmabuf this page-pool is attached to.
+ * @NETDEV_A_PAGE_POOL_IO_URING: io-uring memory provider information.
+ */
 enum {
 	NETDEV_A_PAGE_POOL_ID = 1,
 	NETDEV_A_PAGE_POOL_IFINDEX,
@@ -113,6 +146,11 @@ enum {
 	NETDEV_A_PAGE_POOL_MAX = (__NETDEV_A_PAGE_POOL_MAX - 1)
 };
 
+/**
+ * enum netdev_page_pool_stats - Page pool statistics, see docs for struct
+ *   page_pool_stats for information about individual statistics.
+ * @NETDEV_A_PAGE_POOL_STATS_INFO: Page pool identifying information.
+ */
 enum {
 	NETDEV_A_PAGE_POOL_STATS_INFO = 1,
 	NETDEV_A_PAGE_POOL_STATS_ALLOC_FAST = 8,
@@ -131,6 +169,28 @@ enum {
 	NETDEV_A_PAGE_POOL_STATS_MAX = (__NETDEV_A_PAGE_POOL_STATS_MAX - 1)
 };
 
+/**
+ * enum netdev_napi
+ * @NETDEV_A_NAPI_IFINDEX: ifindex of the netdevice to which NAPI instance
+ *   belongs.
+ * @NETDEV_A_NAPI_ID: ID of the NAPI instance.
+ * @NETDEV_A_NAPI_IRQ: The associated interrupt vector number for the napi
+ * @NETDEV_A_NAPI_PID: PID of the napi thread, if NAPI is configured to operate
+ *   in threaded mode. If NAPI is not in threaded mode (i.e. uses normal
+ *   softirq context), the attribute will be absent.
+ * @NETDEV_A_NAPI_DEFER_HARD_IRQS: The number of consecutive empty polls before
+ *   IRQ deferral ends and hardware IRQs are re-enabled.
+ * @NETDEV_A_NAPI_GRO_FLUSH_TIMEOUT: The timeout, in nanoseconds, of when to
+ *   trigger the NAPI watchdog timer which schedules NAPI processing.
+ *   Additionally, a non-zero value will also prevent GRO from flushing recent
+ *   super-frames at the end of a NAPI cycle. This may add receive latency in
+ *   exchange for reducing the number of frames processed by the network stack.
+ * @NETDEV_A_NAPI_IRQ_SUSPEND_TIMEOUT: The timeout, in nanoseconds, of how long
+ *   to suspend irq processing, if event polling finds events
+ * @NETDEV_A_NAPI_THREADED: Whether the NAPI is configured to operate in
+ *   threaded polling mode. If this is set to enabled then the NAPI context
+ *   operates in threaded polling mode.
+ */
 enum {
 	NETDEV_A_NAPI_IFINDEX = 1,
 	NETDEV_A_NAPI_ID,
@@ -150,6 +210,22 @@ enum {
 	NETDEV_A_XSK_INFO_MAX = (__NETDEV_A_XSK_INFO_MAX - 1)
 };
 
+/**
+ * enum netdev_queue
+ * @NETDEV_A_QUEUE_ID: Queue index; most queue types are indexed like a C
+ *   array, with indexes starting at 0 and ending at queue count - 1. Queue
+ *   indexes are scoped to an interface and queue type.
+ * @NETDEV_A_QUEUE_IFINDEX: ifindex of the netdevice to which the queue
+ *   belongs.
+ * @NETDEV_A_QUEUE_TYPE: Queue type as rx, tx. Each queue type defines a
+ *   separate ID space. XDP TX queues allocated in the kernel are not linked to
+ *   NAPIs and thus not listed. AF_XDP queues will have more information set in
+ *   the xsk attribute.
+ * @NETDEV_A_QUEUE_NAPI_ID: ID of the NAPI instance which services this queue.
+ * @NETDEV_A_QUEUE_DMABUF: ID of the dmabuf attached to this queue, if any.
+ * @NETDEV_A_QUEUE_IO_URING: io_uring memory provider information.
+ * @NETDEV_A_QUEUE_XSK: XSK information for this queue, if any.
+ */
 enum {
 	NETDEV_A_QUEUE_ID = 1,
 	NETDEV_A_QUEUE_IFINDEX,
@@ -163,6 +239,88 @@ enum {
 	NETDEV_A_QUEUE_MAX = (__NETDEV_A_QUEUE_MAX - 1)
 };
 
+/**
+ * enum netdev_qstats - Get device statistics, scoped to a device or a queue.
+ *   These statistics extend (and partially duplicate) statistics available in
+ *   struct rtnl_link_stats64. Value of the `scope` attribute determines how
+ *   statistics are aggregated. When aggregated for the entire device the
+ *   statistics represent the total number of events since last explicit reset
+ *   of the device (i.e. not a reconfiguration like changing queue count). When
+ *   reported per-queue, however, the statistics may not add up to the total
+ *   number of events, will only be reported for currently active objects, and
+ *   will likely report the number of events since last reconfiguration.
+ * @NETDEV_A_QSTATS_IFINDEX: ifindex of the netdevice to which stats belong.
+ * @NETDEV_A_QSTATS_QUEUE_TYPE: Queue type as rx, tx, for queue-id.
+ * @NETDEV_A_QSTATS_QUEUE_ID: Queue ID, if stats are scoped to a single queue
+ *   instance.
+ * @NETDEV_A_QSTATS_SCOPE: What object type should be used to iterate over the
+ *   stats.
+ * @NETDEV_A_QSTATS_RX_PACKETS: Number of wire packets successfully received
+ *   and passed to the stack. For drivers supporting XDP, XDP is considered the
+ *   first layer of the stack, so packets consumed by XDP are still counted
+ *   here.
+ * @NETDEV_A_QSTATS_RX_BYTES: Successfully received bytes, see `rx-packets`.
+ * @NETDEV_A_QSTATS_TX_PACKETS: Number of wire packets successfully sent.
+ *   Packet is considered to be successfully sent once it is in device memory
+ *   (usually this means the device has issued a DMA completion for the
+ *   packet).
+ * @NETDEV_A_QSTATS_TX_BYTES: Successfully sent bytes, see `tx-packets`.
+ * @NETDEV_A_QSTATS_RX_ALLOC_FAIL: Number of times skb or buffer allocation
+ *   failed on the Rx datapath. Allocation failure may, or may not result in a
+ *   packet drop, depending on driver implementation and whether system
+ *   recovers quickly.
+ * @NETDEV_A_QSTATS_RX_HW_DROPS: Number of all packets which entered the
+ *   device, but never left it, including but not limited to: packets dropped
+ *   due to lack of buffer space, processing errors, explicit or implicit
+ *   policies and packet filters.
+ * @NETDEV_A_QSTATS_RX_HW_DROP_OVERRUNS: Number of packets dropped due to
+ *   transient lack of resources, such as buffer space, host descriptors etc.
+ * @NETDEV_A_QSTATS_RX_CSUM_COMPLETE: Number of packets that were marked as
+ *   CHECKSUM_COMPLETE.
+ * @NETDEV_A_QSTATS_RX_CSUM_UNNECESSARY: Number of packets that were marked as
+ *   CHECKSUM_UNNECESSARY.
+ * @NETDEV_A_QSTATS_RX_CSUM_NONE: Number of packets that were not checksummed
+ *   by device.
+ * @NETDEV_A_QSTATS_RX_CSUM_BAD: Number of packets with bad checksum. The
+ *   packets are not discarded, but still delivered to the stack.
+ * @NETDEV_A_QSTATS_RX_HW_GRO_PACKETS: Number of packets that were coalesced
+ *   from smaller packets by the device. Counts only packets coalesced with the
+ *   HW-GRO netdevice feature, LRO-coalesced packets are not counted.
+ * @NETDEV_A_QSTATS_RX_HW_GRO_BYTES: See `rx-hw-gro-packets`.
+ * @NETDEV_A_QSTATS_RX_HW_GRO_WIRE_PACKETS: Number of packets that were
+ *   coalesced to bigger packets with the HW-GRO netdevice feature.
+ *   LRO-coalesced packets are not counted.
+ * @NETDEV_A_QSTATS_RX_HW_GRO_WIRE_BYTES: See `rx-hw-gro-wire-packets`.
+ * @NETDEV_A_QSTATS_RX_HW_DROP_RATELIMITS: Number of the packets dropped by the
+ *   device due to the received packets bitrate exceeding the device rate
+ *   limit.
+ * @NETDEV_A_QSTATS_TX_HW_DROPS: Number of packets that arrived at the device
+ *   but never left it, encompassing packets dropped for reasons such as
+ *   processing errors, as well as those affected by explicitly defined
+ *   policies and packet filtering criteria.
+ * @NETDEV_A_QSTATS_TX_HW_DROP_ERRORS: Number of packets dropped because they
+ *   were invalid or malformed.
+ * @NETDEV_A_QSTATS_TX_CSUM_NONE: Number of packets that did not require the
+ *   device to calculate the checksum.
+ * @NETDEV_A_QSTATS_TX_NEEDS_CSUM: Number of packets that required the device
+ *   to calculate the checksum. This counter includes the number of GSO wire
+ *   packets for which device calculated the L4 checksum.
+ * @NETDEV_A_QSTATS_TX_HW_GSO_PACKETS: Number of packets that necessitated
+ *   segmentation into smaller packets by the device.
+ * @NETDEV_A_QSTATS_TX_HW_GSO_BYTES: See `tx-hw-gso-packets`.
+ * @NETDEV_A_QSTATS_TX_HW_GSO_WIRE_PACKETS: Number of wire-sized packets
+ *   generated by processing `tx-hw-gso-packets`
+ * @NETDEV_A_QSTATS_TX_HW_GSO_WIRE_BYTES: See `tx-hw-gso-wire-packets`.
+ * @NETDEV_A_QSTATS_TX_HW_DROP_RATELIMITS: Number of the packets dropped by the
+ *   device due to the transmit packets bitrate exceeding the device rate
+ *   limit.
+ * @NETDEV_A_QSTATS_TX_STOP: Number of times driver paused accepting new tx
+ *   packets from the stack to this queue, because the queue was full. Note
+ *   that if BQL is supported and enabled on the device the networking stack
+ *   will avoid queuing a lot of data at once.
+ * @NETDEV_A_QSTATS_TX_WAKE: Number of times driver re-started accepting send
+ *   requests to this queue from the stack.
+ */
 enum {
 	NETDEV_A_QSTATS_IFINDEX = 1,
 	NETDEV_A_QSTATS_QUEUE_TYPE,
@@ -200,6 +358,13 @@ enum {
 	NETDEV_A_QSTATS_MAX = (__NETDEV_A_QSTATS_MAX - 1)
 };
 
+/**
+ * enum netdev_dmabuf
+ * @NETDEV_A_DMABUF_IFINDEX: netdev ifindex to bind the dmabuf to.
+ * @NETDEV_A_DMABUF_QUEUES: receive queues to bind the dmabuf to.
+ * @NETDEV_A_DMABUF_FD: dmabuf file descriptor to bind.
+ * @NETDEV_A_DMABUF_ID: id of the dmabuf binding
+ */
 enum {
 	NETDEV_A_DMABUF_IFINDEX = 1,
 	NETDEV_A_DMABUF_QUEUES,
-- 
2.39.5



* [PATCH net-next v3 3/3] Documentation: net: add flow control guide and document ethtool API
  2025-08-20 13:10 [PATCH net-next v3 0/3] Documentation and ynl: add flow control Oleksij Rempel
  2025-08-20 13:10 ` [PATCH net-next v3 1/3] tools: ynl-gen: generate kdoc for attribute enums Oleksij Rempel
  2025-08-20 13:10 ` [PATCH net-next v3 2/3] net: ynl: add generated kdoc to UAPI headers Oleksij Rempel
@ 2025-08-20 13:10 ` Oleksij Rempel
  2025-08-22 11:35   ` Vladimir Oltean
  2 siblings, 1 reply; 10+ messages in thread
From: Oleksij Rempel @ 2025-08-20 13:10 UTC (permalink / raw)
  To: Andrew Lunn, Heiner Kallweit, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Rob Herring, Krzysztof Kozlowski,
	Florian Fainelli, Maxime Chevallier, Kory Maincent,
	Lukasz Majewski, Jonathan Corbet, Donald Hunter, Vadim Fedorenko,
	Jiri Pirko, Vladimir Oltean, Alexei Starovoitov, Daniel Borkmann,
	Jesper Dangaard Brouer, John Fastabend
  Cc: Oleksij Rempel, kernel, linux-kernel, netdev, Russell King,
	Divya.Koppera, Sabrina Dubroca, Stanislav Fomichev

Introduce a new document, flow_control.rst, to provide a comprehensive
guide on Ethernet Flow Control in Linux. The guide explains how flow
control works, how autonegotiation resolves pause capabilities, and how
to configure it using ethtool and Netlink.

In parallel, document the pause and pause-stat attributes in the
ethtool.yaml netlink spec. This enables the ynl tool to generate
kernel-doc comments for the corresponding enums in the UAPI header,
making the C interface self-documenting.

Finally, replace the legacy flow control section in phy.rst with a
reference to the new document and add pointers in the relevant C source
files.

Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
---
changes v3:
- add warning about half-duplex collision-based flow control on shared media
- clarify pause autoneg vs. generic autoneg and forced mode semantics
- document pause quanta defaults used by common MAC drivers, with time examples
- fix vague cross-reference, point to autonegotiation resolution section
- expand notes on PAUSE vs. PFC exclusivity
- include generated enums (pause / pause-stat) in UAPI with kernel-doc
changes v2:
- remove recommendations
- add note about autoneg resolution
---
 Documentation/netlink/specs/ethtool.yaml      |  27 ++
 Documentation/networking/flow_control.rst     | 379 ++++++++++++++++++
 Documentation/networking/index.rst            |   1 +
 Documentation/networking/phy.rst              |  12 +-
 include/linux/ethtool.h                       |  45 ++-
 .../uapi/linux/ethtool_netlink_generated.h    |  28 +-
 net/dcb/dcbnl.c                               |   2 +
 net/ethtool/pause.c                           |   4 +
 8 files changed, 483 insertions(+), 15 deletions(-)
 create mode 100644 Documentation/networking/flow_control.rst

diff --git a/Documentation/netlink/specs/ethtool.yaml b/Documentation/netlink/specs/ethtool.yaml
index 7a7594713f1f..13d8dcfa8dc5 100644
--- a/Documentation/netlink/specs/ethtool.yaml
+++ b/Documentation/netlink/specs/ethtool.yaml
@@ -864,7 +864,9 @@ attribute-sets:
 
   -
     name: pause-stat
+    doc: Statistics counters for link-wide PAUSE frames (IEEE 802.3 Annex 31B).
     attr-cnt-name: __ethtool-a-pause-stat-cnt
+    enum-name: ethtool_a_pause_stat
     attributes:
       -
         name: unspec
@@ -875,13 +877,17 @@ attribute-sets:
         type: pad
       -
         name: tx-frames
+        doc: Number of PAUSE frames transmitted.
         type: u64
       -
         name: rx-frames
+        doc: Number of PAUSE frames received.
         type: u64
   -
     name: pause
+    doc: Parameters for link-wide PAUSE (IEEE 802.3 Annex 31B).
     attr-cnt-name: __ethtool-a-pause-cnt
+    enum-name: ethtool_a_pause
     attributes:
       -
         name: unspec
@@ -893,19 +899,40 @@ attribute-sets:
         nested-attributes: header
       -
         name: autoneg
+        doc: |
+          Acts as a mode selector for the driver.
+          On GET: indicates the driver's behavior. If true, the driver will
+          respect the negotiated outcome; if false, the driver will use a
+          forced configuration.
+          On SET: if true, the driver configures the PHY's advertisement based
+          on the rx and tx attributes. If false, the driver forces the MAC
+          into the state defined by the rx and tx attributes.
         type: u8
       -
         name: rx
+        doc: |
+          Enable receiving PAUSE frames (pausing local TX).
+          On GET: reflects the currently preferred configuration state.
         type: u8
       -
         name: tx
+        doc: |
+          Enable transmitting PAUSE frames (pausing peer TX).
+          On GET: reflects the currently preferred configuration state.
         type: u8
       -
         name: stats
+        doc: |
+          Contains the pause statistics counters. The source of these
+          statistics is determined by stats-src.
         type: nest
         nested-attributes: pause-stat
       -
         name: stats-src
+        doc: |
+          Selects the source of the MAC statistics, values from
+          enum ethtool_mac_stats_src. This allows requesting statistics
+          from an aggregated MAC or a specific PHY, for example.
         type: u32
   -
     name: eee
diff --git a/Documentation/networking/flow_control.rst b/Documentation/networking/flow_control.rst
new file mode 100644
index 000000000000..ba315a5bcb87
--- /dev/null
+++ b/Documentation/networking/flow_control.rst
@@ -0,0 +1,379 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+.. _ethernet-flow-control:
+
+=====================
+Ethernet Flow Control
+=====================
+
+This document is a practical guide to Ethernet Flow Control in Linux, covering
+what it is, how it works, and how to configure it.
+
+What is Flow Control?
+=====================
+
+Flow control is a mechanism to prevent a fast sender from overwhelming a
+slow receiver with data, which would cause buffer overruns and dropped packets.
+The receiver can signal the sender to temporarily stop transmitting, giving it
+time to process its backlog.
+
+Standards references
+====================
+
+Ethernet flow control mechanisms are specified across consolidated IEEE base
+standards; some originated as amendments:
+
+- Collision-based flow control is part of CSMA/CD in **IEEE 802.3**
+  (half-duplex).
+- Link-wide PAUSE is defined in **IEEE 802.3 Annex 31B**
+  (originally **802.3x**).
+- Priority-based Flow Control (PFC) is defined in **IEEE 802.1Q Clause 36**
+  (originally **802.1Qbb**).
+
+In the remainder of this document, the consolidated clause numbers are used.
+
+How It Works: The Mechanisms
+============================
+
+The method used for flow control depends on the link's duplex mode.
+
+.. note::
+   The user-visible ``ethtool`` pause API described in this document controls
+   **link-wide PAUSE** (IEEE 802.3 Annex 31B) only. It does not control the
+   collision-based behavior that exists on half-duplex links.
+
+1. Half-Duplex: Collision-Based Flow Control
+--------------------------------------------
+On half-duplex links, a device cannot send and receive simultaneously, so PAUSE
+frames are not used. Flow control is achieved by leveraging the CSMA/CD
+(Carrier Sense Multiple Access with Collision Detection) protocol itself.
+
+* **How it works**: To inhibit incoming data, a receiving device can force a
+  collision on the line. When the sending station detects this collision, it
+  terminates its transmission, sends a "jam" signal, and then executes the
+  "Collision backoff and retransmission" procedure as defined in IEEE 802.3,
+  Section 4.2.3.2.5. This algorithm makes the sender wait for a random
+  period before attempting to retransmit. By repeatedly forcing collisions,
+  the receiver can effectively throttle the sender's transmission rate.
+
+.. note::
+    While this mechanism is part of the IEEE standard, there is currently no
+    generic kernel API to configure or control it. Drivers should not enable
+    this feature until a standardized interface is available.
+
+.. warning::
+   On shared-medium networks (e.g. 10BASE2, or twisted-pair networks using a
+   hub rather than a switch) forcing collisions inhibits traffic **across the
+   entire shared segment**, not just a single point-to-point link. Enabling
+   such behavior is generally undesirable.
+
+2. Full-Duplex: Link-wide PAUSE (IEEE 802.3 Annex 31B)
+------------------------------------------------------
+On full-duplex links, devices can send and receive at the same time. Flow
+control is achieved by sending a special **PAUSE frame**, defined by IEEE
+802.3 Annex 31B. This mechanism pauses all traffic on the link and is therefore
+called *link-wide PAUSE*.
+
+* **What it is**: A standard Ethernet frame with a globally reserved
+  destination MAC address (``01-80-C2-00-00-01``). This address is in a range
+  that standard IEEE 802.1D-compliant bridges do not forward. However, some
+  unmanaged or misconfigured bridges have been reported to forward these
+  frames, which can disrupt flow control across a network.
+
+* **How it works**: The frame contains a MAC Control opcode for PAUSE
+  (``0x0001``) and a ``pause_time`` value, telling the sender how long to
+  wait before sending more data frames. This time is specified in units of
+  "pause quanta," where one quantum is the time it takes to transmit 512 bits.
+  For example, one pause quantum is 51.2 microseconds on a 10 Mbit/s link,
+  and 512 nanoseconds on a 1 Gbit/s link.
+
+* **Who uses it**: Any full-duplex link, from 10 Mbit/s to multi-gigabit speeds.
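
The pause quantum arithmetic above can be checked with a short sketch. This
is illustrative only and not part of the patch; the function names are
hypothetical, and the only inputs are the 512-bit quantum and the 2-octet
``pause_time`` field.

```python
from fractions import Fraction

PAUSE_QUANTUM_BITS = 512  # one pause quantum = time to transmit 512 bits


def pause_quantum_seconds(link_bps: int) -> Fraction:
    """Duration of one pause quantum on a link running at link_bps bit/s."""
    return Fraction(PAUSE_QUANTUM_BITS, link_bps)


def pause_time_seconds(pause_time: int, link_bps: int) -> Fraction:
    """Duration requested by a pause_time value, given in pause quanta."""
    if not 0 <= pause_time <= 0xFFFF:
        raise ValueError("pause_time is a 2-octet field (0..0xFFFF)")
    return pause_time * pause_quantum_seconds(link_bps)


# One quantum, expressed in nanoseconds, for the two speeds used in the text.
quantum_10m_ns = pause_quantum_seconds(10**7) * 10**9   # 51200 ns = 51.2 us
quantum_1g_ns = pause_quantum_seconds(10**9) * 10**9    # 512 ns
```

With these definitions, the largest ``pause_time`` (``0xFFFF``) corresponds
to about 33.55 ms on a 1 Gbit/s link and about 3.36 s on a 10 Mbit/s link.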
+
+3. Full-Duplex: Priority-based Flow Control (PFC) (IEEE 802.1Q Clause 36)
+-------------------------------------------------------------------------
+Priority-based Flow Control is an enhancement to the standard PAUSE mechanism
+that allows flow control to be applied independently to different classes of
+traffic, identified by their priority level.
+
+* **What it is**: PFC allows a receiver to pause traffic for one or more of the
+  8 standard priority levels without stopping traffic for other priorities.
+  This is critical in data center environments for protocols that cannot
+  tolerate packet loss due to congestion (e.g., Fibre Channel over Ethernet
+  or RoCE).
+
+* **How it works**: PFC uses a specific PAUSE frame format. It shares the same
+  globally reserved destination MAC address (``01-80-C2-00-00-01``) as legacy
+  PAUSE frames but uses a unique opcode (``0x0101``). The frame payload
+  contains two key fields:
+
+  - ``priority_enable_vector``: An 8-bit mask where each bit corresponds to
+    one of the 8 priorities. If a bit is set to 1, it means the pause time
+    for that priority is active.
+  - ``time_vector``: A list of eight 2-octet fields, one for each priority.
+    Each field specifies the ``pause_time`` for its corresponding priority,
+    measured in units of ``pause_quanta`` (the time to transmit 512 bits).
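
As a rough illustration of the two fields described above, the following
sketch packs the MAC Control payload of a PFC frame. It is not taken from the
patch. ``build_pfc_payload`` is a hypothetical helper, and the layout assumes
that the 8-bit mask travels in a 2-octet field whose upper octet is zero;
check this against IEEE 802.1Q before relying on it.

```python
import struct

PFC_OPCODE = 0x0101
NUM_PRIORITIES = 8


def build_pfc_payload(pause_quanta: dict) -> bytes:
    """Pack opcode, priority_enable_vector and time_vector.

    pause_quanta maps a priority (0-7) to a pause_time in pause quanta.
    Priorities missing from the mapping stay disabled in the enable vector.
    """
    enable_vector = 0
    time_vector = [0] * NUM_PRIORITIES
    for prio, quanta in pause_quanta.items():
        if not 0 <= prio < NUM_PRIORITIES:
            raise ValueError(f"priority {prio} out of range")
        if not 0 <= quanta <= 0xFFFF:
            raise ValueError("pause_time is a 2-octet field")
        enable_vector |= 1 << prio
        time_vector[prio] = quanta
    # Network byte order: opcode, enable vector (assumed 2 octets on the
    # wire, upper octet zero), then eight 2-octet pause times.
    return struct.pack("!HH8H", PFC_OPCODE, enable_vector, *time_vector)


# Pause priority 6 for the maximum time and release priority 7 (zero quanta).
payload = build_pfc_payload({6: 0xFFFF, 7: 0})
```

An enabled priority with a ``pause_time`` of zero tells the peer to resume
that priority, in the same way that a zero-quanta PAUSE frame releases a
link-wide pause.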
+
+.. note::
+    When PFC is enabled for at least one priority on a port, the standard
+    **link-wide PAUSE** (IEEE 802.3 Annex 31B) must be disabled for that port.
+    The two mechanisms are mutually exclusive (IEEE 802.1Q Clause 36).
+
+Configuring Flow Control
+========================
+
+Link-wide PAUSE and Priority-based Flow Control are configured with different
+tools.
+
+Configuring Link-wide PAUSE with ``ethtool`` (IEEE 802.3 Annex 31B)
+-------------------------------------------------------------------
+Use ``ethtool -a <interface>`` to view and ``ethtool -A <interface>`` to change
+the link-wide PAUSE settings.
+
+.. code-block:: bash
+
+  # View current link-wide PAUSE settings
+  ethtool -a eth0
+
+  # Enable RX and TX pause, with autonegotiation
+  ethtool -A eth0 autoneg on rx on tx on
+
+**Key Configuration Concepts**:
+
+* **Pause Autoneg vs Generic Autoneg**: ``ethtool -A ... autoneg {on,off}``
+  controls **Pause Autoneg** (Annex 31B) only. It is independent from the
+  **Generic link autonegotiation** configured with ``ethtool -s``. A device can
+  have Generic autoneg **on** while Pause Autoneg is **off**, and vice versa.
+
+* **If Pause Autoneg is off** (``-A ... autoneg off``): the device will **not**
+  advertise pause in the PHY. The MAC PAUSE state is **forced** according to
+  ``rx``/``tx`` and does not depend on partner capabilities or resolution.
+  Ensure the peer is configured complementarily for PAUSE to be effective.
+
+* **If generic autoneg is off** but **Pause Autoneg is on**, the pause policy
+  is **remembered** by the kernel and applied later when Generic autoneg is
+  enabled again.
+
+* **Autonegotiation Mode**: The PHY will *advertise* the ``rx`` and ``tx``
+  capabilities. The final active state is determined by what both sides of the
+  link agree on. See the "Link-wide PAUSE Autonegotiation Details" section
+  below, especially the *Resolution* item, for the negotiation rules.
+
+* **Forced Mode**: This mode is necessary when autonegotiation is not used or
+  not possible. This includes links where one or both partners have
+  autonegotiation disabled, or in setups without a PHY (e.g., direct
+  MAC-to-MAC connections). The driver bypasses PHY advertisement and
+  directly forces the MAC into the specified ``rx``/``tx`` state. The
+  configuration on both sides of the link must be complementary. For
+  example, if one side is set to ``tx on`` ``rx off``, the link partner must be
+  set to ``tx off`` ``rx on`` for flow control to function correctly.
+
+Configuring PFC with ``dcb`` (IEEE 802.1Q Clause 36)
+----------------------------------------------------
+PFC is part of the Data Center Bridging (DCB) subsystem and is managed with the
+``dcb`` tool (iproute2). Some deployments use ``dcbtool`` (lldpad) instead; this
+document shows ``dcb(8)`` examples.
+
+**Viewing PFC Settings**:
+
+.. code-block:: text
+
+  $ dcb pfc show dev eth0
+  pfc-cap 8 macsec-bypass off delay 4096
+  prio-pfc 0:off 1:off 2:off 3:off 4:off 5:off 6:on 7:on
+
+This shows the PFC state (on/off) for each priority (0-7).
+
+**Changing PFC Settings**:
+
+.. code-block:: bash
+
+  # Enable PFC on priorities 6 and 7, leaving others as they are
+  $ dcb pfc set dev eth0 prio-pfc 6:on 7:on
+
+  # Disable PFC for all priorities except 6 and 7
+  $ dcb pfc set dev eth0 prio-pfc all:off 6:on 7:on
+
+Monitoring Flow Control
+=======================
+
+The standard way to check if flow control is actively being used is to view the
+pause-related statistics.
+
+**Monitoring Link-wide PAUSE**:
+Use ``ethtool --include-statistics -a <interface>``.
+
+.. code-block:: text
+
+  $ ethtool --include-statistics -a eth0
+  Pause parameters for eth0:
+  ...
+  Statistics:
+    tx_pause_frames: 0
+    rx_pause_frames: 0
+
+**Monitoring PFC**:
+PFC statistics (sent and received frames per priority) are available
+through the ``dcb`` tool.
+
+.. code-block:: text
+
+  $ dcb pfc show dev eth0 requests indications
+  requests 0:0 1:0 2:0 3:1024 4:2048 5:0 6:0 7:0
+  indications 0:0 1:0 2:0 3:512 4:4096 5:0 6:0 7:0
+
+The ``requests`` counters track transmitted PFC frames (TX), and the
+``indications`` counters track received PFC frames (RX).
+
+Link-wide PAUSE Autonegotiation Details
+=======================================
+
+The autonegotiation process for link-wide PAUSE is managed by the PHY and
+involves advertising capabilities and resolving the outcome.
+
+* Terminology (link-wide PAUSE):
+
+  - **Symmetric pause**: both directions are paused when requested (TX+RX
+    enabled).
+  - **Asymmetric pause**: only one direction is paused (e.g., RX-only or
+    TX-only).
+
+  In IEEE 802.3 advertisement/resolution, symmetric/asymmetric are encoded
+  using two bits (Pause/Asym) and resolved per the standard truth tables
+  below.
+
+* **Advertisement**: The PHY advertises the MAC's flow control capabilities.
+  This is done using two bits in the advertisement register: "Symmetric
+  Pause" (Pause) and "Asymmetric Pause" (Asym). These bits should be
+  interpreted as a combined value, not as independent flags. The kernel
+  converts the user's ``rx`` and ``tx`` settings into this two-bit value as
+  follows:
+
+  .. code-block:: text
+
+    tx  rx | Pause  Asym
+    -------+-------------
+     0   0 |   0      0
+     0   1 |   1      1
+     1   0 |   0      1
+     1   1 |   1      0
+
+* **Resolution**: After negotiation, the PHY reports the link partner's
+  advertised Pause and Asym bits. The final flow control mode is determined
+  by the combination of the local and partner advertisements, according to
+  the IEEE 802.3 standard:
+
+  .. code-block:: text
+
+    Local Device       | Link Partner       | Result
+    Pause  Asym        | Pause   Asym       |
+    -------------------+--------------------+---------
+      0      X         |  0       X         | Disabled
+      0      1         |  1       0         | Disabled
+      0      1         |  1       1         | TX only
+      1      0         |  0       X         | Disabled
+      1      X         |  1       X         | TX + RX
+      1      1         |  0       1         | RX only
+
+  It is important to note that the advertised bits reflect the *current
+  configuration* of the MAC, which may not represent its full hardware
+  capabilities.
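
The two tables above can be expressed as a pair of small functions. This is a
sketch derived only from the tables in this document, not a copy of any
kernel code; ``advertise`` and ``resolve`` are hypothetical names.

```python
def advertise(tx: bool, rx: bool) -> tuple:
    """Map the user's tx/rx settings to the (Pause, Asym) advertisement bits."""
    pause = rx
    asym = rx != tx
    return pause, asym


def resolve(local: tuple, partner: tuple) -> tuple:
    """Return (tx_pause, rx_pause) for the local side after negotiation.

    Both arguments are (Pause, Asym) pairs as produced by advertise().
    """
    lcl_pause, lcl_asym = local
    rmt_pause, rmt_asym = partner
    if lcl_pause and rmt_pause:
        return True, True            # TX + RX
    if lcl_asym and rmt_asym:
        if lcl_pause:
            return False, True       # RX only
        if rmt_pause:
            return True, False       # TX only
    return False, False              # Disabled


# Local side wants RX only; partner advertises symmetric pause only.
outcome = resolve(advertise(tx=False, rx=True), advertise(tx=True, rx=True))
```

In this example ``outcome`` is ``(True, True)``: a local request for RX-only
pause against a symmetric-only partner resolves to TX + RX, which shows that
the negotiated result can differ from the requested ``rx``/``tx`` settings.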
+
+Kernel Policy: "Set and Trust"
+==============================
+
+The ethtool pause API is defined as a **wish policy** for
+IEEE 802.3 link-wide PAUSE only. A user request is always accepted
+as the preferred configuration, but it may not be possible to apply
+it in all link states.
+
+Key constraints:
+
+- Link-wide PAUSE is not valid on half-duplex links.
+- Link-wide PAUSE cannot be used together with Priority-based Flow Control
+  (PFC, IEEE 802.1Q Clause 36).
+- If autonegotiation is active and the link is currently down, the future
+  mode is not yet known.
+
+Because of these constraints, the kernel stores the requested setting
+and applies it only when the link is in a compatible state.
+
+Implications for userspace:
+
+1. Set once (the "wish"): the requested Rx/Tx PAUSE policy is
+   remembered even if it cannot be applied immediately.
+2. Applied conditionally: when the link comes up, the kernel enables
+   PAUSE only if the active mode allows it.
+
+Component Roles in Flow Control
+===============================
+
+The configuration of flow control involves several components, each with a
+distinct role.
+
+The MAC (Media Access Controller)
+---------------------------------
+The MAC is the hardware component that actually sends and receives PAUSE
+frames. Its capabilities define the upper limit of what the driver can support.
+For link-wide PAUSE, MACs can vary in their support for symmetric (both
+directions) or asymmetric (independent TX/RX) flow control.
+
+For PFC, the MAC must be capable of generating and interpreting the
+priority-based PAUSE frames and managing separate pause states for each
+traffic class.
+
+Many MACs also implement automatic PAUSE frame transmission based on the fill
+level of their internal RX FIFO. This is typically configured with two
+thresholds:
+
+* **FLOW_ON (High Water Mark)**: When the RX FIFO usage reaches this
+  threshold, the MAC automatically transmits a PAUSE frame to stop the sender.
+
+* **FLOW_OFF (Low Water Mark)**: When the RX FIFO usage drops below this
+  threshold, the MAC transmits a PAUSE frame with a quanta of zero to tell
+  the sender it can resume transmission.
+
+The optimal values for these thresholds depend on the link's round-trip-time
+(RTT) and the peer's internal processing latency. The high water mark must be
+set low enough so that the MAC's RX FIFO does not overflow while waiting for
+the peer to react to the PAUSE frame. The driver is responsible for configuring
+sensible defaults according to the IEEE specification. User tuning should only
+be necessary in special cases, such as on links with unusually long cable
+lengths (e.g., long-haul fiber).
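
The threshold behavior described above amounts to a simple hysteresis. The
sketch below models it in software purely for illustration. Real MACs
implement this in hardware and may also refresh the PAUSE frame while the
FIFO stays full; the class name, the byte-based thresholds and the
``0xFFFF`` pause time are assumptions, not values from any driver.

```python
PAUSE_QUANTA_STOP = 0xFFFF  # assumed pause_time sent when asking the peer to stop


class RxFifoFlowControl:
    """Toy model of automatic PAUSE generation from RX FIFO fill level."""

    def __init__(self, high_water: int, low_water: int):
        if low_water >= high_water:
            raise ValueError("low water mark must be below high water mark")
        self.high_water = high_water   # FLOW_ON threshold, in bytes
        self.low_water = low_water     # FLOW_OFF threshold, in bytes
        self.peer_paused = False
        self.sent = []                 # pause_time of each frame "transmitted"

    def update(self, fill_level: int) -> None:
        """React to a new RX FIFO fill level, given in bytes."""
        if not self.peer_paused and fill_level >= self.high_water:
            self.sent.append(PAUSE_QUANTA_STOP)   # ask the peer to stop
            self.peer_paused = True
        elif self.peer_paused and fill_level < self.low_water:
            self.sent.append(0)                   # zero quanta: resume
            self.peer_paused = False


fc = RxFifoFlowControl(high_water=6000, low_water=2000)
for level in (1000, 5000, 6500, 7000, 3000, 1500, 6000):
    fc.update(level)
```

The gap between the two thresholds keeps the MAC from toggling PAUSE on every
small change in fill level. The space above the high water mark is what
absorbs data still in flight while the peer reacts.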
+
+The PHY (Physical Layer Transceiver)
+------------------------------------
+The PHY's role is distinct for each flow control mechanism:
+
+* **Link-wide PAUSE**: During the autonegotiation process, the PHY is
+  responsible for advertising the device's flow control capabilities. See the
+  "Link-wide PAUSE Autonegotiation Details" section for more information.
+
+* **Half-Duplex Collision-Based Flow Control**: The PHY is fundamental to the
+  CSMA/CD process. It performs carrier sensing (checking if the line is idle)
+  and collision detection, which is the mechanism leveraged to throttle the
+  sender.
+
+* **Priority-based Flow Control (PFC)**: The PHY is not directly involved in
+  negotiating PFC capabilities. Its role is to establish the physical link.
+  PFC negotiation happens at a higher layer via the Data Center Bridging
+  Capability Exchange Protocol (DCBX).
+
+User Space Interface
+====================
+The primary user space tools are ``ethtool`` for link-wide PAUSE and ``dcb`` for
+PFC. They communicate with the kernel to configure the network device driver
+and underlying hardware.
+
+**Link-wide PAUSE Netlink Interface (``ethtool``)**
+
+See the ethtool Netlink spec (``Documentation/netlink/specs/ethtool.yaml``)
+for the authoritative definition of the Pause control and Pause statistics
+attributes. The generated UAPI is in
+``include/uapi/linux/ethtool_netlink_generated.h``.
+
+**PFC Netlink Interface (``dcb``)**
+
+The authoritative definitions for DCB/PFC netlink attributes and commands are in
+``include/uapi/linux/dcbnl.h``. See also the ``dcb(8)`` manual page and the DCB
+subsystem documentation for userspace configuration details.
+
diff --git a/Documentation/networking/index.rst b/Documentation/networking/index.rst
index b7a4969e9bc9..243f4ceb4ab1 100644
--- a/Documentation/networking/index.rst
+++ b/Documentation/networking/index.rst
@@ -55,6 +55,7 @@ Contents:
    eql
    fib_trie
    filter
+   flow_control
    generic-hdlc
    generic_netlink
    ../netlink/specs/index
diff --git a/Documentation/networking/phy.rst b/Documentation/networking/phy.rst
index 7f159043ad5a..a900e18a93d3 100644
--- a/Documentation/networking/phy.rst
+++ b/Documentation/networking/phy.rst
@@ -343,16 +343,8 @@ Some of the interface modes are described below:
 Pause frames / flow control
 ===========================
 
-The PHY does not participate directly in flow control/pause frames except by
-making sure that the SUPPORTED_Pause and SUPPORTED_AsymPause bits are set in
-MII_ADVERTISE to indicate towards the link partner that the Ethernet MAC
-controller supports such a thing. Since flow control/pause frames generation
-involves the Ethernet MAC driver, it is recommended that this driver takes care
-of properly indicating advertisement and support for such features by setting
-the SUPPORTED_Pause and SUPPORTED_AsymPause bits accordingly. This can be done
-either before or after phy_connect() and/or as a result of implementing the
-ethtool::set_pauseparam feature.
-
+For detailed link-wide PAUSE and PFC behavior and configuration, see
+flow_control.rst.
 
 Keeping Close Tabs on the PAL
 =============================
diff --git a/include/linux/ethtool.h b/include/linux/ethtool.h
index de5bd76a400c..d921bd602064 100644
--- a/include/linux/ethtool.h
+++ b/include/linux/ethtool.h
@@ -931,9 +931,48 @@ struct kernel_ethtool_ts_info {
  * @get_pause_stats: Report pause frame statistics. Drivers must not zero
  *	statistics which they don't report. The stats structure is initialized
  *	to ETHTOOL_STAT_NOT_SET indicating driver does not report statistics.
- * @get_pauseparam: Report pause parameters
- * @set_pauseparam: Set pause parameters.  Returns a negative error code
- *	or zero.
+ *
+ * @get_pauseparam: Report the configured policy for link-wide PAUSE
+ *      (IEEE 802.3 Annex 31B). Drivers must fill struct ethtool_pauseparam
+ *      such that:
+ *      @autoneg:
+ *              This refers to **Pause Autoneg** (IEEE 802.3 Annex 31B) only
+ *              and is independent of generic link autonegotiation configured
+ *              via ethtool -s.
+ *              true  -> the device follows the negotiated result of pause
+ *                       autonegotiation (Pause/Asym);
+ *              false -> the device uses a forced MAC state independent of
+ *                       negotiation.
+ *      @rx_pause/@tx_pause:
+ *              represent the desired policy (preferred configuration).
+ *              In autoneg mode they describe what is to be advertised;
+ *              in forced mode they describe the MAC state to apply.
+ *
+ *      Drivers (and/or frameworks) should persist this policy across link
+ *      changes and reapply appropriate MAC programming when link parameters
+ *      change.
+ *
+ * @set_pauseparam: Apply a policy for link-wide PAUSE (IEEE 802.3 Annex 31B).
+ *      If @autoneg is true:
+ *              Arrange for pause advertisement (Pause/Asym) based on
+ *              @rx_pause/@tx_pause and program the MAC to follow the
+ *              negotiated result (which may be symmetric, asymmetric, or off
+ *              depending on the link partner).
+ *      If @autoneg is false:
+ *              Do not rely on autonegotiation; force the MAC RX/TX pause
+ *              state directly per @rx_pause/@tx_pause.
+ *
+ *      Implementations that integrate with PHYLIB/PHYLINK should cooperate
+ *      with those frameworks for advertisement and resolution; MAC drivers are
+ *      still responsible for applying the required MAC state.
+ *
+ *      Return: 0 on success or a negative errno. Return -EOPNOTSUPP if
+ *      link-wide PAUSE is unsupported. If only symmetric pause is supported,
+ *      reject unsupported asymmetric requests with -EINVAL (or document any
+ *      coercion policy).
+ *
+ *      See also: Documentation/networking/flow_control.rst
+ *
  * @self_test: Run specified self-tests
  * @get_strings: Return a set of strings that describe the requested objects
  * @set_phys_id: Identify the physical devices, e.g. by flashing an LED
diff --git a/include/uapi/linux/ethtool_netlink_generated.h b/include/uapi/linux/ethtool_netlink_generated.h
index 46de09954042..0af7b90101c1 100644
--- a/include/uapi/linux/ethtool_netlink_generated.h
+++ b/include/uapi/linux/ethtool_netlink_generated.h
@@ -384,7 +384,13 @@ enum {
 	ETHTOOL_A_COALESCE_MAX = (__ETHTOOL_A_COALESCE_CNT - 1)
 };
 
-enum {
+/**
+ * enum ethtool_a_pause_stat - Statistics counters for link-wide PAUSE frames
+ *   (IEEE 802.3 Annex 31B).
+ * @ETHTOOL_A_PAUSE_STAT_TX_FRAMES: Number of PAUSE frames transmitted.
+ * @ETHTOOL_A_PAUSE_STAT_RX_FRAMES: Number of PAUSE frames received.
+ */
+enum ethtool_a_pause_stat {
 	ETHTOOL_A_PAUSE_STAT_UNSPEC,
 	ETHTOOL_A_PAUSE_STAT_PAD,
 	ETHTOOL_A_PAUSE_STAT_TX_FRAMES,
@@ -394,7 +400,25 @@ enum {
 	ETHTOOL_A_PAUSE_STAT_MAX = (__ETHTOOL_A_PAUSE_STAT_CNT - 1)
 };
 
-enum {
+/**
+ * enum ethtool_a_pause - Parameters for link-wide PAUSE (IEEE 802.3 Annex 31B).
+ * @ETHTOOL_A_PAUSE_AUTONEG: Acts as a mode selector for the driver. On GET:
+ *   indicates the driver's behavior. If true, the driver will respect the
+ *   negotiated outcome; if false, the driver will use a forced configuration.
+ *   On SET: if true, the driver configures the PHY's advertisement based on
+ *   the rx and tx attributes. If false, the driver forces the MAC into the
+ *   state defined by the rx and tx attributes.
+ * @ETHTOOL_A_PAUSE_RX: Enable receiving PAUSE frames (pausing local TX). On
+ *   GET: reflects the currently preferred configuration state.
+ * @ETHTOOL_A_PAUSE_TX: Enable transmitting PAUSE frames (pausing peer TX). On
+ *   GET: reflects the currently preferred configuration state.
+ * @ETHTOOL_A_PAUSE_STATS: Contains the pause statistics counters. The source
+ *   of these statistics is determined by stats-src.
+ * @ETHTOOL_A_PAUSE_STATS_SRC: Selects the source of the MAC statistics, values
+ *   from enum ethtool_mac_stats_src. This allows requesting statistics from an
+ *   aggregated MAC or a specific PHY, for example.
+ */
+enum ethtool_a_pause {
 	ETHTOOL_A_PAUSE_UNSPEC,
 	ETHTOOL_A_PAUSE_HEADER,
 	ETHTOOL_A_PAUSE_AUTONEG,
diff --git a/net/dcb/dcbnl.c b/net/dcb/dcbnl.c
index 03eb1d941fca..91ee22f53774 100644
--- a/net/dcb/dcbnl.c
+++ b/net/dcb/dcbnl.c
@@ -27,6 +27,8 @@
  *
  * Priority-based Flow Control (PFC) - provides a flow control mechanism which
  *   can work independently for each 802.1p priority.
+ *   See Documentation/networking/flow_control.rst for a high level description
+ *   of the user space interface for Priority-based Flow Control (PFC).
  *
  * Congestion Notification - provides a mechanism for end-to-end congestion
  *   control for protocols which do not have built-in congestion management.
diff --git a/net/ethtool/pause.c b/net/ethtool/pause.c
index 0f9af1e66548..eacf6a4859bf 100644
--- a/net/ethtool/pause.c
+++ b/net/ethtool/pause.c
@@ -1,5 +1,9 @@
 // SPDX-License-Identifier: GPL-2.0-only
 
+/* See Documentation/networking/flow_control.rst for a high level description of
+ * the userspace interface.
+ */
+
 #include "netlink.h"
 #include "common.h"
 
-- 
2.39.5
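
As an illustration of the @autoneg semantics described in the
get_pauseparam/set_pauseparam kernel-doc in the patch above, here is a
minimal Python sketch of how an rx/tx policy could map to Pause/Asym
advertisement bits and how the negotiated result is resolved against the
link partner. It is modelled on my understanding of the usual IEEE 802.3
pause resolution rules (the kernel's C helpers linkmode_set_pause() and
linkmode_resolve_pause() do the equivalent work). The names Adv,
advertise, resolve and effective are hypothetical and exist only for
this example.

```python
from typing import NamedTuple, Tuple


class Adv(NamedTuple):
    """Advertised pause capability bits (Pause, Asym_Pause)."""
    pause: bool
    asym: bool


def advertise(rx: bool, tx: bool) -> Adv:
    """Map a desired rx/tx policy to advertisement bits."""
    # Pause follows rx; Asym is set when rx and tx differ.
    return Adv(pause=rx, asym=(rx != tx))


def resolve(local: Adv, partner: Adv) -> Tuple[bool, bool]:
    """Return (tx_pause, rx_pause) for the local MAC after negotiation."""
    if local.pause and partner.pause:
        return True, True  # symmetric pause
    if local.asym and partner.asym:
        # Asymmetric: local TX is enabled if the partner advertised Pause,
        # local RX is enabled if the local side advertised Pause.
        return partner.pause, local.pause
    return False, False  # no flow control


def effective(autoneg: bool, rx: bool, tx: bool, partner: Adv) -> Tuple[bool, bool]:
    """Model the policy: negotiated result if autoneg, else forced state."""
    if not autoneg:
        return tx, rx  # forced mode ignores the link partner
    return resolve(advertise(rx, tx), partner)
```

For example, effective(True, True, False, Adv(True, False)) yields
(True, True): an rx-only request advertised to a symmetric-only partner
resolves to symmetric pause, which is one way the negotiated result can
differ from the requested policy.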


* Re: [PATCH net-next v3 3/3] Documentation: net: add flow control guide and document ethtool API
  2025-08-20 13:10 ` [PATCH net-next v3 3/3] Documentation: net: add flow control guide and document ethtool API Oleksij Rempel
@ 2025-08-22 11:35   ` Vladimir Oltean
  2025-08-22 12:12     ` Oleksij Rempel
  0 siblings, 1 reply; 10+ messages in thread
From: Vladimir Oltean @ 2025-08-22 11:35 UTC (permalink / raw)
  To: Oleksij Rempel
  Cc: Andrew Lunn, Heiner Kallweit, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Rob Herring, Krzysztof Kozlowski,
	Florian Fainelli, Maxime Chevallier, Kory Maincent,
	Lukasz Majewski, Jonathan Corbet, Donald Hunter, Vadim Fedorenko,
	Jiri Pirko, Alexei Starovoitov, Daniel Borkmann,
	Jesper Dangaard Brouer, John Fastabend, kernel, linux-kernel,
	netdev, Russell King, Divya.Koppera, Sabrina Dubroca,
	Stanislav Fomichev

On Wed, Aug 20, 2025 at 03:10:23PM +0200, Oleksij Rempel wrote:
>          name: stats-src
> +        doc: |
> +          Selects the source of the MAC statistics, values from
> +          enum ethtool_mac_stats_src. This allows requesting statistics
> +          from an aggregated MAC or a specific PHY, for example.

"This allows requesting statistics from the individual components of the
MAC Merge layer" would be better - nothing to do with PHYs.
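
For readers unfamiliar with the values behind stats-src, a small Python
sketch mirroring enum ethtool_mac_stats_src from
include/uapi/linux/ethtool.h. The numeric values and the describe()
helper are my own transcription for illustration (assumptions, not part
of the patch), so check them against the header before relying on them.

```python
from enum import IntEnum


class MacStatsSrc(IntEnum):
    """Illustrative mirror of enum ethtool_mac_stats_src (values assumed)."""
    AGGREGATE = 0  # eMAC + pMAC combined, or the single MAC without MAC Merge
    EMAC = 1       # express MAC
    PMAC = 2       # preemptible MAC


def describe(src: MacStatsSrc) -> str:
    """Return a short human-readable description of a statistics source."""
    return {
        MacStatsSrc.AGGREGATE: "aggregate of eMAC and pMAC",
        MacStatsSrc.EMAC: "express MAC (eMAC)",
        MacStatsSrc.PMAC: "preemptible MAC (pMAC)",
    }[src]
```

All three sources are MAC-level components, which is why the wording
should refer to the MAC Merge layer rather than to PHYs.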

>          type: u32
>    -
>      name: eee
> diff --git a/Documentation/networking/flow_control.rst b/Documentation/networking/flow_control.rst
> new file mode 100644
> index 000000000000..ba315a5bcb87
> --- /dev/null
> +++ b/Documentation/networking/flow_control.rst
> @@ -0,0 +1,379 @@
> +.. SPDX-License-Identifier: GPL-2.0
> +
> +.. _ethernet-flow-control:
> +
> +=====================
> +Ethernet Flow Control
> +=====================
> +
> +This document is a practical guide to Ethernet Flow Control in Linux, covering
> +what it is, how it works, and how to configure it.
> +
> +What is Flow Control?
> +=====================
> +
> +Flow control is a mechanism to prevent a fast sender from overwhelming a
> +slow receiver with data, which would cause buffer overruns and dropped packets.
> +The receiver can signal the sender to temporarily stop transmitting, giving it
> +time to process its backlog.
> +
> +Standards references
> +====================
> +
> +Ethernet flow control mechanisms are specified across consolidated IEEE base
> +standards; some originated as amendments:
> +
> +- Collision-based flow control is part of CSMA/CD in **IEEE 802.3**
> +  (half-duplex).
> +- Link‑wide PAUSE is defined in **IEEE 802.3 Annex 31B**

There are some odd characters here.

> +  (originally **802.3x**).
> +- Priority-based Flow Control (PFC) is defined in **IEEE 802.1Q Clause 36**
> +  (originally **802.1Qbb**).
> +
> +In the remainder of this document, the consolidated clause numbers are used.
> +
> +How It Works: The Mechanisms
> +============================
> +
> +The method used for flow control depends on the link's duplex mode.
> +
> +.. note::
> +   The user-visible ``ethtool`` pause API described in this document controls
> +   **link-wide PAUSE** (IEEE 802.3 Annex 31B) only. It does not control the
> +   collision-based behavior that exists on half-duplex links.
> +
> +2. Full-Duplex: Link-wide PAUSE (IEEE 802.3 Annex 31B)
> +------------------------------------------------------
> +On full-duplex links, devices can send and receive at the same time. Flow
> +control is achieved by sending a special **PAUSE frame**, defined by IEEE
> +802.3 Annex 31B. This mechanism pauses all traffic on the link and is therefore
> +called *link-wide PAUSE*.
> +
> +* **What it is**: A standard Ethernet frame with a globally reserved
> +    destination MAC address (``01-80-C2-00-00-01``). This address is in a range
> +    that standard IEEE 802.1D-compliant bridges do not forward. However, some
> +    unmanaged or misconfigured bridges have been reported to forward these
> +    frames, which can disrupt flow control across a network.
> +
> +* **How it works**: The frame contains a MAC Control opcode for PAUSE
> +    (``0x0001``) and a ``pause_time`` value, telling the sender how long to
> +    wait before sending more data frames. This time is specified in units of
> +    "pause quanta," where one quantum is the time it takes to transmit 512 bits.
> +    For example, one pause quantum is 51.2 microseconds on a 10 Mbit/s link,
> +    and 512 nanoseconds on a 1 Gbit/s link.

I might also mention that a pause_time value of 0 is special: it means
that the transmitter can resume immediately, even if a previously
requested pause time has not yet elapsed.
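
To make the quanta arithmetic from the quoted paragraph concrete, a
minimal Python sketch that converts a pause_time value into a duration.
The function name pause_duration_ns is hypothetical. The 512-bit quantum
and the example speeds come from the text above, and pause_time is
treated as the 16-bit field carried in the PAUSE frame.

```python
PAUSE_QUANTUM_BITS = 512  # one pause quantum = 512 bit times


def pause_duration_ns(pause_time: int, link_speed_bps: int) -> float:
    """Duration requested by a PAUSE frame, in nanoseconds."""
    if not 0 <= pause_time <= 0xFFFF:
        raise ValueError("pause_time is a 16-bit field")
    if link_speed_bps <= 0:
        raise ValueError("link speed must be positive")
    # bits to wait, divided by bits per nanosecond
    return pause_time * PAUSE_QUANTUM_BITS * 1e9 / link_speed_bps
```

pause_duration_ns(1, 10_000_000) gives 51200 ns (51.2 microseconds) and
pause_duration_ns(1, 1_000_000_000) gives 512 ns, matching the figures
in the document. A pause_time of 0 gives 0, i.e. resume immediately.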

> +
> +* **Who uses it**: Any full-duplex link, from 10 Mbit/s to multi-gigabit speeds.
> +
> +The MAC (Media Access Controller)
> +---------------------------------
> +The MAC is the hardware component that actually sends and receives PAUSE
> +frames. Its capabilities define the upper limit of what the driver can support.
> +For link-wide PAUSE, MACs can vary in their support for symmetric (both
> +directions) or asymmetric (independent TX/RX) flow control.
> +
> +For PFC, the MAC must be capable of generating and interpreting the
> +priority-based PAUSE frames and managing separate pause states for each
> +traffic class.
> +
> +Many MACs also implement automatic PAUSE frame transmission based on the fill
> +level of their internal RX FIFO. This is typically configured with two
> +thresholds:
> +
> +* **FLOW_ON (High Water Mark)**: When the RX FIFO usage reaches this
> +  threshold, the MAC automatically transmits a PAUSE frame to stop the sender.
> +
> +* **FLOW_OFF (Low Water Mark)**: When the RX FIFO usage drops below this
> +  threshold, the MAC transmits a PAUSE frame with a quanta of zero to tell

I think quanta is plural.

> +  the sender it can resume transmission.
> +
> +The optimal values for these thresholds depend on the link's round-trip-time
> +(RTT) and the peer's internal processing latency. The high water mark must be
> +set low enough so that the MAC's RX FIFO does not overflow while waiting for
> +the peer to react to the PAUSE frame. The driver is responsible for configuring
> +sensible defaults according to the IEEE specification. User tuning should only
> +be necessary in special cases, such as on links with unusually long cable
> +lengths (e.g., long-haul fiber).

How would user tuning be achieved?

> diff --git a/include/uapi/linux/ethtool_netlink_generated.h b/include/uapi/linux/ethtool_netlink_generated.h
> index 46de09954042..0af7b90101c1 100644
> --- a/include/uapi/linux/ethtool_netlink_generated.h
> +++ b/include/uapi/linux/ethtool_netlink_generated.h
> @@ -394,7 +400,25 @@ enum {
>  	ETHTOOL_A_PAUSE_STAT_MAX = (__ETHTOOL_A_PAUSE_STAT_CNT - 1)
>  };
>  
> -enum {
> +/**
> + * enum ethtool_pause - Parameters for link-wide PAUSE (IEEE 802.3 Annex 31B).
> + * @ETHTOOL_A_PAUSE_AUTONEG: Acts as a mode selector for the driver. On GET:
> + *   indicates the driver's behavior. If true, the driver will respect the
> + *   negotiated outcome; if false, the driver will use a forced configuration.
> + *   On SET: if true, the driver configures the PHY's advertisement based on
> + *   the rx and tx attributes. If false, the driver forces the MAC into the
> + *   state defined by the rx and tx attributes.
> + * @ETHTOOL_A_PAUSE_RX: Enable receiving PAUSE frames (pausing local TX). On
> + *   GET: reflects the currently preferred configuration state.
> + * @ETHTOOL_A_PAUSE_TX: Enable transmitting PAUSE frames (pausing peer TX). On
> + *   GET: reflects the currently preferred configuration state.
> + * @ETHTOOL_A_PAUSE_STATS: Contains the pause statistics counters. The source
> + *   of these statistics is determined by stats-src.
> + * @ETHTOOL_A_PAUSE_STATS_SRC: Selects the source of the MAC statistics, values
> + *   from enum ethtool_mac_stats_src. This allows requesting statistics from an
> + *   aggregated MAC or a specific PHY, for example.

Same here.

> + */
> +enum ethtool_a_pause {
>  	ETHTOOL_A_PAUSE_UNSPEC,
>  	ETHTOOL_A_PAUSE_HEADER,
>  	ETHTOOL_A_PAUSE_AUTONEG,

* Re: [PATCH net-next v3 3/3] Documentation: net: add flow control guide and document ethtool API
  2025-08-22 11:35   ` Vladimir Oltean
@ 2025-08-22 12:12     ` Oleksij Rempel
  2025-08-22 14:19       ` Vladimir Oltean
  0 siblings, 1 reply; 10+ messages in thread
From: Oleksij Rempel @ 2025-08-22 12:12 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: Andrew Lunn, Heiner Kallweit, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Rob Herring, Krzysztof Kozlowski,
	Florian Fainelli, Maxime Chevallier, Kory Maincent,
	Lukasz Majewski, Jonathan Corbet, Donald Hunter, Vadim Fedorenko,
	Jiri Pirko, Alexei Starovoitov, Daniel Borkmann,
	Jesper Dangaard Brouer, John Fastabend, kernel, linux-kernel,
	netdev, Russell King, Divya.Koppera, Sabrina Dubroca,
	Stanislav Fomichev

On Fri, Aug 22, 2025 at 02:35:19PM +0300, Vladimir Oltean wrote:
...
> > +
> > +* **Who uses it**: Any full-duplex link, from 10 Mbit/s to multi-gigabit speeds.
> > +
> > +The MAC (Media Access Controller)
> > +---------------------------------
> > +The MAC is the hardware component that actually sends and receives PAUSE
> > +frames. Its capabilities define the upper limit of what the driver can support.
> > +For link-wide PAUSE, MACs can vary in their support for symmetric (both
> > +directions) or asymmetric (independent TX/RX) flow control.
> > +
> > +For PFC, the MAC must be capable of generating and interpreting the
> > +priority-based PAUSE frames and managing separate pause states for each
> > +traffic class.
> > +
> > +Many MACs also implement automatic PAUSE frame transmission based on the fill
> > +level of their internal RX FIFO. This is typically configured with two
> > +thresholds:
> > +
> > +* **FLOW_ON (High Water Mark)**: When the RX FIFO usage reaches this
> > +  threshold, the MAC automatically transmits a PAUSE frame to stop the sender.
> > +
> > +* **FLOW_OFF (Low Water Mark)**: When the RX FIFO usage drops below this
> > +  threshold, the MAC transmits a PAUSE frame with a quanta of zero to tell
> > +  the sender it can resume transmission.
> > +
> > +The optimal values for these thresholds depend on the link's round-trip-time
> > +(RTT) and the peer's internal processing latency. The high water mark must be
> > +set low enough so that the MAC's RX FIFO does not overflow while waiting for
> > +the peer to react to the PAUSE frame. The driver is responsible for configuring
> > +sensible defaults according to the IEEE specification. User tuning should only
> > +be necessary in special cases, such as on links with unusually long cable
> > +lengths (e.g., long-haul fiber).
> 
> How would user tuning be achieved?

Do you mean how such tuning could be exposed to user space (e.g. via
ethtool/sysfs), or rather whether it makes sense to provide a user
interface for this at all, since drivers normally set safe defaults?

-- 
Pengutronix e.K.                           |                             |
Steuerwalder Str. 21                       | http://www.pengutronix.de/  |
31137 Hildesheim, Germany                  | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |

* Re: [PATCH net-next v3 2/3] net: ynl: add generated kdoc to UAPI headers
  2025-08-20 13:10 ` [PATCH net-next v3 2/3] net: ynl: add generated kdoc to UAPI headers Oleksij Rempel
@ 2025-08-22 14:11   ` ALOK TIWARI
  2025-08-24  8:10     ` Oleksij Rempel
  0 siblings, 1 reply; 10+ messages in thread
From: ALOK TIWARI @ 2025-08-22 14:11 UTC (permalink / raw)
  To: Oleksij Rempel, Andrew Lunn, Heiner Kallweit, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Rob Herring,
	Krzysztof Kozlowski, Florian Fainelli, Maxime Chevallier,
	Kory Maincent, Lukasz Majewski, Jonathan Corbet, Donald Hunter,
	Vadim Fedorenko, Jiri Pirko, Vladimir Oltean, Alexei Starovoitov,
	Daniel Borkmann, Jesper Dangaard Brouer, John Fastabend
  Cc: kernel, linux-kernel, netdev, Russell King, Divya.Koppera,
	Sabrina Dubroca, Stanislav Fomichev



On 8/20/2025 6:40 PM, Oleksij Rempel wrote:
> Run the ynl regeneration script to apply the kdoc generation
> support added in the previous commit.
> 
> This updates the generated UAPI headers for dpll, ethtool, team,
> net_shaper, netdev, and ovpn with documentation parsed from their
> respective YAML specifications.
> 
> Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
> ---
>   include/uapi/linux/dpll.h                     |  30 ++++
>   .../uapi/linux/ethtool_netlink_generated.h    |  29 +++
>   include/uapi/linux/if_team.h                  |  11 ++
>   include/uapi/linux/net_shaper.h               |  50 ++++++
>   include/uapi/linux/netdev.h                   | 165 ++++++++++++++++++
>   include/uapi/linux/ovpn.h                     |  62 +++++++
>   tools/include/uapi/linux/netdev.h             | 165 ++++++++++++++++++
>   7 files changed, 512 insertions(+)
> 
> diff --git a/include/uapi/linux/dpll.h b/include/uapi/linux/dpll.h
> index 37b438ce8efc..23a4e3598650 100644
> --- a/include/uapi/linux/dpll.h
> +++ b/include/uapi/linux/dpll.h
> @@ -203,6 +203,18 @@ enum dpll_feature_state {
>   	DPLL_FEATURE_STATE_ENABLE,
>   };
>   
> +/**
> + * enum dpll_dpll
> + * @DPLL_A_CLOCK_QUALITY_LEVEL: Level of quality of a clock device. This mainly
> + *   applies when the dpll lock-status is DPLL_LOCK_STATUS_HOLDOVER. This could
> + *   be put to message multiple times to indicate possible parallel quality
> + *   levels (e.g. one specified by ITU option 1 and another one specified by
> + *   option 2).
> + * @DPLL_A_PHASE_OFFSET_MONITOR: Receive or request state of phase offset
> + *   monitor feature. If enabled, dpll device shall monitor and notify all
> + *   currently available inputs for changes of their phase offset against the
> + *   dpll device.
> + */
>   enum dpll_a {
>   	DPLL_A_ID = 1,
>   	DPLL_A_MODULE_NAME,
> @@ -221,6 +233,24 @@ enum dpll_a {
>   	DPLL_A_MAX = (__DPLL_A_MAX - 1)
>   };
>   
> +/**
> + * enum dpll_pin
> + * @DPLL_A_PIN_FRACTIONAL_FREQUENCY_OFFSET: The FFO (Fractional Frequency
> + *   Offset) between the RX and TX symbol rate on the media associated with the
> + *   pin: (rx_frequency-tx_frequency)/rx_frequency Value is in PPM (parts per

spacing for clarity (rx_frequency - tx_frequency) / rx_frequency

> + *   million). This may be implemented for example for pin of type
> + *   PIN_TYPE_SYNCE_ETH_PORT.
> + * @DPLL_A_PIN_ESYNC_FREQUENCY: Frequency of Embedded SYNC signal. If provided,
> + *   the pin is configured with a SYNC signal embedded into its base clock
> + *   frequency.
> + * @DPLL_A_PIN_ESYNC_FREQUENCY_SUPPORTED: If provided a pin is capable of
> + *   embedding a SYNC signal (within given range) into its base frequency
> + *   signal.
> + * @DPLL_A_PIN_ESYNC_PULSE: A ratio of high to low state of a SYNC signal pulse
> + *   embedded into base clock frequency. Value is in percents.

should be "percent"

> + * @DPLL_A_PIN_REFERENCE_SYNC: Capable pin provides list of pins that can be
> + *   bound to create a reference-sync pin pair.
> + */
[clip]
> +/**
> + * enum ovpn_keyconf
> + * @OVPN_A_KEYCONF_PEER_ID: The unique ID of the peer in the device context. To
> + *   be used to identify peers during key operations
> + * @OVPN_A_KEYCONF_SLOT: The slot where the key should be stored
> + * @OVPN_A_KEYCONF_KEY_ID: The unique ID of the key in the peer context. Used
> + *   to fetch the correct key upon decryption
> + * @OVPN_A_KEYCONF_CIPHER_ALG: The cipher to be used when communicating with
> + *   the peer
> + * @OVPN_A_KEYCONF_ENCRYPT_DIR: Key material for encrypt direction
> + * @OVPN_A_KEYCONF_DECRYPT_DIR: Key material for decrypt direction
> + */
>   enum {
>   	OVPN_A_KEYCONF_PEER_ID = 1,
>   	OVPN_A_KEYCONF_SLOT,
> @@ -71,6 +120,12 @@ enum {
>   	OVPN_A_KEYCONF_MAX = (__OVPN_A_KEYCONF_MAX - 1)
>   };
>   
> +/**
> + * enum ovpn_keydir
> + * @OVPN_A_KEYDIR_CIPHER_KEY: The actual key to be used by the cipher
> + * @OVPN_A_KEYDIR_NONCE_TAIL: Random nonce to be concatenated to the packet ID,
> + *   in order to obtain the actual cipher IV
> + */
>   enum {
>   	OVPN_A_KEYDIR_CIPHER_KEY = 1,
>   	OVPN_A_KEYDIR_NONCE_TAIL,
> @@ -79,6 +134,13 @@ enum {
>   	OVPN_A_KEYDIR_MAX = (__OVPN_A_KEYDIR_MAX - 1)
>   };
>   
> +/**
> + * enum ovpn_ovpn
> + * @OVPN_A_IFINDEX: Index of the ovpn interface to operate on
> + * @OVPN_A_PEER: The peer object containing the attributed of interest for the

typo attributed -> attributes

> + *   specific operation
> + * @OVPN_A_KEYCONF: Peer specific cipher configuration
> + */
>   enum {
>   	OVPN_A_IFINDEX = 1,
>   	OVPN_A_PEER,
> diff --git a/tools/include/uapi/linux/netdev.h b/tools/include/uapi/linux/netdev.h
> index 48eb49aa03d4..4d5169fc798d 100644
> --- a/tools/include/uapi/linux/netdev.h
> +++ b/tools/include/uapi/linux/netdev.h
> @@ -82,6 +82,16 @@ enum netdev_napi_threaded {
>   	NETDEV_NAPI_THREADED_ENABLED,
>   };
>   
[clip]
> +/**
> + * enum netdev_qstats - Get device statistics, scoped to a device or a queue.
> + *   These statistics extend (and partially duplicate) statistics available in
> + *   struct rtnl_link_stats64. Value of the `scope` attribute determines how
> + *   statistics are aggregated. When aggregated for the entire device the
> + *   statistics represent the total number of events since last explicit reset
> + *   of the device (i.e. not a reconfiguration like changing queue count). When
> + *   reported per-queue, however, the statistics may not add up to the total
> + *   number of events, will only be reported for currently active objects, and
> + *   will likely report the number of events since last reconfiguration.
> + * @NETDEV_A_QSTATS_IFINDEX: ifindex of the netdevice to which stats belong.
> + * @NETDEV_A_QSTATS_QUEUE_TYPE: Queue type as rx, tx, for queue-id.
> + * @NETDEV_A_QSTATS_QUEUE_ID: Queue ID, if stats are scoped to a single queue
> + *   instance.
> + * @NETDEV_A_QSTATS_SCOPE: What object type should be used to iterate over the
> + *   stats.
> + * @NETDEV_A_QSTATS_RX_PACKETS: Number of wire packets successfully received
> + *   and passed to the stack. For drivers supporting XDP, XDP is considered the
> + *   first layer of the stack, so packets consumed by XDP are still counted
> + *   here.
> + * @NETDEV_A_QSTATS_RX_BYTES: Successfully received bytes, see `rx-packets`.
> + * @NETDEV_A_QSTATS_TX_PACKETS: Number of wire packets successfully sent.
> + *   Packet is considered to be successfully sent once it is in device memory
> + *   (usually this means the device has issued a DMA completion for the
> + *   packet).
> + * @NETDEV_A_QSTATS_TX_BYTES: Successfully sent bytes, see `tx-packets`.
> + * @NETDEV_A_QSTATS_RX_ALLOC_FAIL: Number of times skb or buffer allocation
> + *   failed on the Rx datapath. Allocation failure may, or may not result in a
> + *   packet drop, depending on driver implementation and whether system
> + *   recovers quickly.
> + * @NETDEV_A_QSTATS_RX_HW_DROPS: Number of all packets which entered the
> + *   device, but never left it, including but not limited to: packets dropped
> + *   due to lack of buffer space, processing errors, explicit or implicit
> + *   policies and packet filters.
> + * @NETDEV_A_QSTATS_RX_HW_DROP_OVERRUNS: Number of packets dropped due to
> + *   transient lack of resources, such as buffer space, host descriptors etc.
> + * @NETDEV_A_QSTATS_RX_CSUM_COMPLETE: Number of packets that were marked as
> + *   CHECKSUM_COMPLETE.
> + * @NETDEV_A_QSTATS_RX_CSUM_UNNECESSARY: Number of packets that were marked as
> + *   CHECKSUM_UNNECESSARY.
> + * @NETDEV_A_QSTATS_RX_CSUM_NONE: Number of packets that were not checksummed
> + *   by device.
> + * @NETDEV_A_QSTATS_RX_CSUM_BAD: Number of packets with bad checksum. The
> + *   packets are not discarded, but still delivered to the stack.
> + * @NETDEV_A_QSTATS_RX_HW_GRO_PACKETS: Number of packets that were coalesced
> + *   from smaller packets by the device. Counts only packets coalesced with the
> + *   HW-GRO netdevice feature, LRO-coalesced packets are not counted.
> + * @NETDEV_A_QSTATS_RX_HW_GRO_BYTES: See `rx-hw-gro-packets`.
> + * @NETDEV_A_QSTATS_RX_HW_GRO_WIRE_PACKETS: Number of packets that were
> + *   coalesced to bigger packetss with the HW-GRO netdevice feature.

packetss -> packets

> + *   LRO-coalesced packets are not counted.
> + * @NETDEV_A_QSTATS_RX_HW_GRO_WIRE_BYTES: See `rx-hw-gro-wire-packets`.
> + * @NETDEV_A_QSTATS_RX_HW_DROP_RATELIMITS: Number of the packets dropped by the
> + *   device due to the received packets bitrate exceeding the device rate

Thanks,
Alok

* Re: [PATCH net-next v3 3/3] Documentation: net: add flow control guide and document ethtool API
  2025-08-22 12:12     ` Oleksij Rempel
@ 2025-08-22 14:19       ` Vladimir Oltean
  0 siblings, 0 replies; 10+ messages in thread
From: Vladimir Oltean @ 2025-08-22 14:19 UTC (permalink / raw)
  To: Oleksij Rempel
  Cc: Andrew Lunn, Heiner Kallweit, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Rob Herring, Krzysztof Kozlowski,
	Florian Fainelli, Maxime Chevallier, Kory Maincent,
	Lukasz Majewski, Jonathan Corbet, Donald Hunter, Vadim Fedorenko,
	Jiri Pirko, Alexei Starovoitov, Daniel Borkmann,
	Jesper Dangaard Brouer, John Fastabend, kernel, linux-kernel,
	netdev, Russell King, Divya.Koppera, Sabrina Dubroca,
	Stanislav Fomichev

On Fri, Aug 22, 2025 at 02:12:26PM +0200, Oleksij Rempel wrote:
> > > +The optimal values for these thresholds depend on the link's round-trip-time
> > > +(RTT) and the peer's internal processing latency. The high water mark must be
> > > +set low enough so that the MAC's RX FIFO does not overflow while waiting for
> > > +the peer to react to the PAUSE frame. The driver is responsible for configuring
> > > +sensible defaults according to the IEEE specification. User tuning should only
> > > +be necessary in special cases, such as on links with unusually long cable
> > > +lengths (e.g., long-haul fiber).
> > 
> > How would user tuning be achieved?
> 
> Do you mean how such tuning could be exposed to user space (e.g. via
> ethtool/sysfs), or rather whether it makes sense to provide a user
> interface for this at all, since drivers normally set safe defaults?

Sorry for not being clear. I think that by saying that user tuning
should only be necessary in certain cases, you're giving the impression
that it's supported in the current API. You might want to clarify that
it's not.

Also, I'm not sure that the length of the cable runs would be a factor
in tuning the flow control watermarks, do you have a reference for that?
I'm mentally debating the value of the last sentence.
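
To put rough numbers on the cable-length question, a back-of-the-envelope
sketch in Python. The idea: after the MAC decides to send PAUSE, the peer
keeps transmitting for about one round trip plus its own reaction time,
so the RX FIFO needs at least that much headroom above the high water
mark. The ~5 ns/m propagation delay is a common approximation, and the
reaction time parameter and the function name headroom_bytes are
hypothetical; none of these values come from the patch or from any
driver.

```python
PROPAGATION_NS_PER_M = 5.0  # assumed propagation delay, roughly 5 ns per metre


def headroom_bytes(link_speed_bps: int, cable_m: float,
                   peer_reaction_ns: float = 0.0) -> float:
    """Bytes that can still arrive after the PAUSE decision is made."""
    if link_speed_bps <= 0 or cable_m < 0 or peer_reaction_ns < 0:
        raise ValueError("speed must be positive, other inputs non-negative")
    # PAUSE travels to the peer, and data already sent travels back.
    round_trip_ns = 2 * cable_m * PROPAGATION_NS_PER_M
    in_flight_bits = link_speed_bps * (round_trip_ns + peer_reaction_ns) / 1e9
    return in_flight_bits / 8
```

Under these assumptions, headroom_bytes(10_000_000_000, 100) is about
1250 bytes, while headroom_bytes(10_000_000_000, 10_000) is about
125000 bytes. This ignores frames already queued ahead of the PAUSE
frame and the peer's frame in progress, so it is a lower bound at best.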

* Re: [PATCH net-next v3 2/3] net: ynl: add generated kdoc to UAPI headers
  2025-08-22 14:11   ` ALOK TIWARI
@ 2025-08-24  8:10     ` Oleksij Rempel
  2025-08-25 17:01       ` Jakub Kicinski
  0 siblings, 1 reply; 10+ messages in thread
From: Oleksij Rempel @ 2025-08-24  8:10 UTC (permalink / raw)
  To: ALOK TIWARI
  Cc: Andrew Lunn, Heiner Kallweit, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Rob Herring, Krzysztof Kozlowski,
	Florian Fainelli, Maxime Chevallier, Kory Maincent,
	Lukasz Majewski, Jonathan Corbet, Donald Hunter, Vadim Fedorenko,
	Jiri Pirko, Vladimir Oltean, Alexei Starovoitov, Daniel Borkmann,
	Jesper Dangaard Brouer, John Fastabend, kernel, linux-kernel,
	netdev, Russell King, Divya.Koppera, Sabrina Dubroca,
	Stanislav Fomichev

On Fri, Aug 22, 2025 at 07:41:39PM +0530, ALOK TIWARI wrote:
> 
> 
> On 8/20/2025 6:40 PM, Oleksij Rempel wrote:
> > Run the ynl regeneration script to apply the kdoc generation
> > support added in the previous commit.
> > 
> > This updates the generated UAPI headers for dpll, ethtool, team,
> > net_shaper, netdev, and ovpn with documentation parsed from their
> > respective YAML specifications.
> > 
> > Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
> > ---
> >   include/uapi/linux/dpll.h                     |  30 ++++
> >   .../uapi/linux/ethtool_netlink_generated.h    |  29 +++
> >   include/uapi/linux/if_team.h                  |  11 ++
> >   include/uapi/linux/net_shaper.h               |  50 ++++++
> >   include/uapi/linux/netdev.h                   | 165 ++++++++++++++++++
> >   include/uapi/linux/ovpn.h                     |  62 +++++++
> >   tools/include/uapi/linux/netdev.h             | 165 ++++++++++++++++++
> >   7 files changed, 512 insertions(+)
> > 
> > diff --git a/include/uapi/linux/dpll.h b/include/uapi/linux/dpll.h
> > index 37b438ce8efc..23a4e3598650 100644
> > --- a/include/uapi/linux/dpll.h
> > +++ b/include/uapi/linux/dpll.h
> > @@ -203,6 +203,18 @@ enum dpll_feature_state {
> >   	DPLL_FEATURE_STATE_ENABLE,
> >   };
> > +/**
> > + * enum dpll_dpll
> > + * @DPLL_A_CLOCK_QUALITY_LEVEL: Level of quality of a clock device. This mainly
> > + *   applies when the dpll lock-status is DPLL_LOCK_STATUS_HOLDOVER. This could
> > + *   be put to message multiple times to indicate possible parallel quality
> > + *   levels (e.g. one specified by ITU option 1 and another one specified by
> > + *   option 2).
> > + * @DPLL_A_PHASE_OFFSET_MONITOR: Receive or request state of phase offset
> > + *   monitor feature. If enabled, dpll device shall monitor and notify all
> > + *   currently available inputs for changes of their phase offset against the
> > + *   dpll device.
> > + */
> >   enum dpll_a {
> >   	DPLL_A_ID = 1,
> >   	DPLL_A_MODULE_NAME,
> > @@ -221,6 +233,24 @@ enum dpll_a {
> >   	DPLL_A_MAX = (__DPLL_A_MAX - 1)
> >   };
> > +/**
> > + * enum dpll_pin
> > + * @DPLL_A_PIN_FRACTIONAL_FREQUENCY_OFFSET: The FFO (Fractional Frequency
> > + *   Offset) between the RX and TX symbol rate on the media associated with the
> > + *   pin: (rx_frequency-tx_frequency)/rx_frequency Value is in PPM (parts per
> 
> spacing for clarity (rx_frequency - tx_frequency) / rx_frequency

Thank you for the review. The comments you refer to are autogenerated
from the YAML specs. Extending my patch to adjust or clean up those
generated comments would mean adding side-quests outside the scope of
the actual change. I’d rather keep this series focused, otherwise I risk
not being able to complete it.

Best Regards,
Oleksij
-- 
Pengutronix e.K.                           |                             |
Steuerwalder Str. 21                       | http://www.pengutronix.de/  |
31137 Hildesheim, Germany                  | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |

* Re: [PATCH net-next v3 2/3] net: ynl: add generated kdoc to UAPI headers
  2025-08-24  8:10     ` Oleksij Rempel
@ 2025-08-25 17:01       ` Jakub Kicinski
  0 siblings, 0 replies; 10+ messages in thread
From: Jakub Kicinski @ 2025-08-25 17:01 UTC (permalink / raw)
  To: Oleksij Rempel
  Cc: ALOK TIWARI, Andrew Lunn, Heiner Kallweit, David S. Miller,
	Eric Dumazet, Paolo Abeni, Rob Herring, Krzysztof Kozlowski,
	Florian Fainelli, Maxime Chevallier, Kory Maincent,
	Lukasz Majewski, Jonathan Corbet, Donald Hunter, Vadim Fedorenko,
	Jiri Pirko, Vladimir Oltean, Alexei Starovoitov, Daniel Borkmann,
	Jesper Dangaard Brouer, John Fastabend, kernel, linux-kernel,
	netdev, Russell King, Divya.Koppera, Sabrina Dubroca,
	Stanislav Fomichev

On Sun, 24 Aug 2025 10:10:43 +0200 Oleksij Rempel wrote:
> > > +/**
> > > + * enum dpll_pin
> > > + * @DPLL_A_PIN_FRACTIONAL_FREQUENCY_OFFSET: The FFO (Fractional Frequency
> > > + *   Offset) between the RX and TX symbol rate on the media associated with the
> > > + *   pin: (rx_frequency-tx_frequency)/rx_frequency Value is in PPM (parts per  
> > 
> > spacing for clarity (rx_frequency - tx_frequency) / rx_frequency  
> 
> Thank you for the review. The comments you refer to are autogenerated
> from the YAML specs. Extending my patch to adjust or clean up those
> generated comments would mean adding side-quests outside the scope of
> the actual change. I’d rather keep this series focused, otherwise I risk
> not being able to complete it.

SG, FWIW.

Thread overview: 10+ messages
2025-08-20 13:10 [PATCH net-next v3 0/3] Documentation and ynl: add flow control Oleksij Rempel
2025-08-20 13:10 ` [PATCH net-next v3 1/3] tools: ynl-gen: generate kdoc for attribute enums Oleksij Rempel
2025-08-20 13:10 ` [PATCH net-next v3 2/3] net: ynl: add generated kdoc to UAPI headers Oleksij Rempel
2025-08-22 14:11   ` ALOK TIWARI
2025-08-24  8:10     ` Oleksij Rempel
2025-08-25 17:01       ` Jakub Kicinski
2025-08-20 13:10 ` [PATCH net-next v3 3/3] Documentation: net: add flow control guide and document ethtool API Oleksij Rempel
2025-08-22 11:35   ` Vladimir Oltean
2025-08-22 12:12     ` Oleksij Rempel
2025-08-22 14:19       ` Vladimir Oltean