* [PATCH 0/5] doc: ethdev documentation grammar and typo corrections
@ 2026-01-16 21:29 Stephen Hemminger
2026-01-16 21:29 ` [PATCH 1/5] doc: correct grammar and punctuation errors in ethdev guide Stephen Hemminger
` (5 more replies)
0 siblings, 6 replies; 18+ messages in thread
From: Stephen Hemminger @ 2026-01-16 21:29 UTC (permalink / raw)
To: dev; +Cc: Stephen Hemminger
This series corrects grammar errors, typos, and punctuation issues
in the Ethernet device library section of the DPDK Programmer's Guide.
Initial work on this was done as part of a review by a technical writer,
with further changes made as a result of AI reviews.
The first patch provides a comprehensive update to the Poll Mode Driver
documentation, modernizing the description and improving readability.
The remaining patches address smaller issues in the rte_flow, QoS
framework, switch representation, and traffic management guides.
Stephen Hemminger (5):
doc: correct grammar and punctuation errors in ethdev guide
doc: correct grammar in rte_flow guide
doc: correct grammar in QoS framework guide
doc: correct typos in switch representation guide
doc: correct typos in traffic management guide
doc/guides/prog_guide/ethdev/ethdev.rst | 557 +++++++++---------
doc/guides/prog_guide/ethdev/flow_offload.rst | 2 +-
.../prog_guide/ethdev/qos_framework.rst | 8 +-
.../ethdev/switch_representation.rst | 4 +-
.../prog_guide/ethdev/traffic_management.rst | 5 +-
5 files changed, 286 insertions(+), 290 deletions(-)
--
2.51.0
^ permalink raw reply [flat|nested] 18+ messages in thread* [PATCH 1/5] doc: correct grammar and punctuation errors in ethdev guide 2026-01-16 21:29 [PATCH 0/5] doc: ethdev documentation grammar and typo corrections Stephen Hemminger @ 2026-01-16 21:29 ` Stephen Hemminger 2026-01-16 21:29 ` [PATCH 2/5] doc: correct grammar in rte_flow guide Stephen Hemminger ` (4 subsequent siblings) 5 siblings, 0 replies; 18+ messages in thread From: Stephen Hemminger @ 2026-01-16 21:29 UTC (permalink / raw) To: dev; +Cc: Stephen Hemminger, Nandini Persad Change various grammar, punctuation, and typographical errors throughout the Poll Mode Driver documentation. - Change extra spaces after emphasized terms (*run-to-completion*, *pipe-line*) - Correct possessive forms ("port's" -> "ports'", "processors" -> "processor's") - Change subject-verb agreement ("VFs detects" -> "VFs detect") - Add missing articles and words ("It is duty" -> "It is the duty", "allows the application create" -> "allows the application to create") - Remove extraneous words ("release of all" -> "release all", "ensures sure" -> "ensures") - Change typos ("dev_unint()" -> "dev_uninit()", "receive of transmit" -> "receive or transmit", "UDP/TCP/ SCTP" -> "UDP/TCP/SCTP") - Add missing punctuation (period at end of bullet point) - Change spacing around inline code markup - Clarify awkward sentence about PROACTIVE vs PASSIVE error Signed-off-by: Nandini Persad <nandinipersad361@gmail.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> --- doc/guides/prog_guide/ethdev/ethdev.rst | 557 ++++++++++++------------ 1 file changed, 277 insertions(+), 280 deletions(-) diff --git a/doc/guides/prog_guide/ethdev/ethdev.rst b/doc/guides/prog_guide/ethdev/ethdev.rst index daaf43ea3b..31b5e973ae 100644 --- a/doc/guides/prog_guide/ethdev/ethdev.rst +++ b/doc/guides/prog_guide/ethdev/ethdev.rst @@ -4,126 +4,134 @@ Poll Mode Driver ================ -The DPDK includes 1 Gigabit, 10 Gigabit and 40 Gigabit and para 
virtualized virtio Poll Mode Drivers. +The Data Plane Development Kit (DPDK) supports a wide range of Ethernet speeds, +from 10 Megabits to 400 Gigabits, depending on hardware capability. -A Poll Mode Driver (PMD) consists of APIs, provided through the BSD driver running in user space, -to configure the devices and their respective queues. -In addition, a PMD accesses the RX and TX descriptors directly without any interrupts -(with the exception of Link Status Change interrupts) to quickly receive, -process and deliver packets in the user's application. -This section describes the requirements of the PMDs, -their global design principles and proposes a high-level architecture and a generic external API for the Ethernet PMDs. +DPDK's Poll Mode Drivers (PMDs) are high-performance, optimized drivers for various +network interface cards that bypass the traditional kernel network stack to reduce +latency and improve throughput. They access Rx and Tx descriptors directly in a polling +mode without relying on interrupts (except for Link Status Change notifications), enabling +efficient packet reception and transmission in user-space applications. + +This section outlines the requirements of Ethernet PMDs, their design principles, +and presents a high-level architecture along with a generic external API. Requirements and Assumptions ---------------------------- -The DPDK environment for packet processing applications allows for two models, run-to-completion and pipe-line: +The DPDK environment for packet processing applications supports two models: run-to-completion and pipeline. -* In the *run-to-completion* model, a specific port's RX descriptor ring is polled for packets through an API. - Packets are then processed on the same core and placed on a port's TX descriptor ring through an API for transmission. +* In the *run-to-completion* model, a specific port's Rx descriptor ring is polled for packets through an API. 
+ The application then processes packets on the same core and transmits them via the port's Tx descriptor ring using another API. -* In the *pipe-line* model, one core polls one or more port's RX descriptor ring through an API. - Packets are received and passed to another core via a ring. - The other core continues to process the packet which then may be placed on a port's TX descriptor ring through an API for transmission. +* In the *pipeline* model, one core polls the Rx descriptor ring(s) of one or more ports via an API. + The application then passes received packets to another core via a ring for further processing, + which may include transmission through the Tx descriptor ring using an API. -In a synchronous run-to-completion model, -each logical core assigned to the DPDK executes a packet processing loop that includes the following steps: +In a synchronous run-to-completion model, a logical core (lcore) +assigned to DPDK executes a packet processing loop that includes the following steps: -* Retrieve input packets through the PMD receive API +* Retrieve input packets using the PMD receive API -* Process each received packet one at a time, up to its forwarding +* Process each received packet individually, up to its forwarding -* Send pending output packets through the PMD transmit API +* Transmit output packets using the PMD transmit API -Conversely, in an asynchronous pipe-line model, some logical cores may be dedicated to the retrieval of received packets and -other logical cores to the processing of previously received packets. -Received packets are exchanged between logical cores through rings. -The loop for packet retrieval includes the following steps: +In contrast, the asynchronous pipeline model assigns some logical cores to retrieve packets +and others to process them. The application exchanges packets between cores via rings. 
+ +The packet retrieval loop includes: * Retrieve input packets through the PMD receive API * Provide received packets to processing lcores through packet queues -The loop for packet processing includes the following steps: - -* Retrieve the received packet from the packet queue +The packet processing loop includes: -* Process the received packet, up to its retransmission if forwarded +* Dequeue received packets from the packet queue -To avoid any unnecessary interrupt processing overhead, the execution environment must not use any asynchronous notification mechanisms. -Whenever needed and appropriate, asynchronous communication should be introduced as much as possible through the use of rings. +* Process packets, including retransmission if forwarded -Avoiding lock contention is a key issue in a multi-core environment. -To address this issue, PMDs are designed to work with per-core private resources as much as possible. -For example, a PMD maintains a separate transmit queue per-core, per-port, if the PMD is not ``RTE_ETH_TX_OFFLOAD_MT_LOCKFREE`` capable. -In the same way, every receive queue of a port is assigned to and polled by a single logical core (lcore). +To minimize interrupt-related overhead, the execution environment should avoid asynchronous +notification mechanisms. When asynchronous communication is required, implement it +using rings where possible. Minimizing lock contention is critical in multi-core environments. +To support this, PMDs use per-core private resources whenever possible. +For example, if a PMD does not support ``RTE_ETH_TX_OFFLOAD_MT_LOCKFREE``, it maintains a separate +transmit queue per core and per port. Similarly, each receive queue is assigned to and polled by a single lcore. -To comply with Non-Uniform Memory Access (NUMA), memory management is designed to assign to each logical core -a private buffer pool in local memory to minimize remote memory access. 
-The configuration of packet buffer pools should take into account the underlying physical memory architecture in terms of DIMMS, -channels and ranks. -The application must ensure that appropriate parameters are given at memory pool creation time. +To support Non-Uniform Memory Access (NUMA), the memory management design assigns each logical +core a private buffer pool in local memory to reduce remote memory access. Configuration of packet +buffer pools should consider the underlying physical memory layout, such as DIMMs, channels, and ranks. +The application must set proper parameters during memory pool creation. See :doc:`../mempool_lib`. Design Principles ----------------- -The API and architecture of the Ethernet* PMDs are designed with the following guidelines in mind. +The API and architecture of the Ethernet PMDs follow these design principles: -PMDs must help global policy-oriented decisions to be enforced at the upper application level. -Conversely, NIC PMD functions should not impede the benefits expected by upper-level global policies, -or worse prevent such policies from being applied. +PMDs should support the enforcement of global, policy-driven decisions at the upper application level. +At the same time, NIC PMD functions must not hinder the performance gains expected by these higher-level policies, +or worse, prevent them from being implemented. -For instance, both the receive and transmit functions of a PMD have a maximum number of packets/descriptors to poll. -This allows a run-to-completion processing stack to statically fix or -to dynamically adapt its overall behavior through different global loop policies, such as: +For example, both the receive and transmit functions of a PMD define a maximum number of packets to poll. 
+This enables a run-to-completion processing stack to either statically configure or dynamically adjust its +behavior according to different global loop strategies, such as: -* Receive, process immediately and transmit packets one at a time in a piecemeal fashion. +* Receiving, processing, and transmitting packets one at a time in a piecemeal fashion -* Receive as many packets as possible, then process all received packets, transmitting them immediately. +* Receiving as many packets as possible, then processing and transmitting them all immediately -* Receive a given maximum number of packets, process the received packets, accumulate them and finally send all accumulated packets to transmit. +* Receiving a set number of packets, processing them, and batching them for transmission at once -To achieve optimal performance, overall software design choices and pure software optimization techniques must be considered and -balanced against available low-level hardware-based optimization features (CPU cache properties, bus speed, NIC PCI bandwidth, and so on). -The case of packet transmission is an example of this software/hardware tradeoff issue when optimizing burst-oriented network packet processing engines. -In the initial case, the PMD could export only an rte_eth_tx_one function to transmit one packet at a time on a given queue. -On top of that, one can easily build an rte_eth_tx_burst function that loops invoking the rte_eth_tx_one function to transmit several packets at a time. -However, an rte_eth_tx_burst function is effectively implemented by the PMD to minimize the driver-level transmit cost per packet through the following optimizations: +To maximize performance, developers must consider overall software architecture and optimization techniques +alongside available low-level hardware optimizations (e.g., CPU cache behavior, bus speed, and NIC PCI bandwidth). -* Share among multiple packets the un-amortized cost of invoking the rte_eth_tx_one function. 
+Packet transmission in burst-oriented network engines illustrates this software/hardware tradeoff. +A PMD could expose only the ``rte_eth_tx_one`` function to transmit a single packet at a time on a given queue. +While it is possible to build an ``rte_eth_tx_burst`` function by repeatedly calling ``rte_eth_tx_one``, +most PMDs implement ``rte_eth_tx_burst`` directly to reduce per-packet transmission overhead. -* Enable the rte_eth_tx_burst function to take advantage of burst-oriented hardware features (prefetch data in cache, use of NIC head/tail registers) - to minimize the number of CPU cycles per packet, for example by avoiding unnecessary read memory accesses to ring transmit descriptors, - or by systematically using arrays of pointers that exactly fit cache line boundaries and sizes. +This implementation includes several key optimizations: -* Apply burst-oriented software optimization techniques to remove operations that would otherwise be unavoidable, such as ring index wrap back management. +* Sharing the fixed cost of invoking ``rte_eth_tx_one`` across multiple packets -Burst-oriented functions are also introduced via the API for services that are intensively used by the PMD. -This applies in particular to buffer allocators used to populate NIC rings, which provide functions to allocate/free several buffers at a time. -For example, an mbuf_multiple_alloc function returning an array of pointers to rte_mbuf buffers which speeds up the receive poll function of the PMD when -replenishing multiple descriptors of the receive ring. 
+* Leveraging burst-oriented hardware features (e.g., data prefetching, NIC head/tail registers, vector extensions) + to reduce CPU cycles per packet, including minimizing unnecessary memory accesses and aligning pointer arrays + with cache line boundaries and sizes + +* Applying software-level burst optimizations to eliminate otherwise unavoidable overheads, such as ring index wrap-around handling + +The API also introduces burst-oriented functions for PMD-intensive services, such as buffer allocation. +For instance, buffer allocators used to populate NIC rings often support functions that allocate or free multiple buffers in a single call. +An example is ``rte_pktmbuf_alloc_bulk``, which returns an array of rte_mbuf pointers, significantly improving PMD performance +when replenishing multiple descriptors in the receive ring. Logical Cores, Memory and NIC Queues Relationships -------------------------------------------------- -The DPDK supports NUMA allowing for better performance when a processor's logical cores and interfaces utilize its local memory. -Therefore, mbuf allocation associated with local PCIe* interfaces should be allocated from memory pools created in the local memory. -The buffers should, if possible, remain on the local processor to obtain the best performance results and RX and TX buffer descriptors -should be populated with mbufs allocated from a mempool allocated from local memory. +DPDK supports NUMA, which improves performance when a processor's logical cores and network interfaces +use memory local to that processor. To maximize this benefit, allocate mbufs associated with local PCIe* interfaces +from memory pools located in the same NUMA node. + +Ideally, keep these buffers on the local processor to achieve optimal performance. Populate Rx and Tx buffer +descriptors with mbufs from mempools created in local memory. 
-The run-to-completion model also performs better if packet or data manipulation is in local memory instead of a remote processors memory. -This is also true for the pipe-line model provided all logical cores used are located on the same processor. +The run-to-completion model also benefits from performing packet data operations in local memory, +rather than accessing remote memory across NUMA nodes. +The same applies to the pipeline model, provided all logical cores involved reside on the same processor. -Multiple logical cores should never share receive or transmit queues for interfaces since this would require global locks and hinder performance. +Never share receive and transmit queues between multiple logical cores, as doing so requires +global locks and severely impacts performance. -If the PMD is ``RTE_ETH_TX_OFFLOAD_MT_LOCKFREE`` capable, multiple threads can invoke ``rte_eth_tx_burst()`` -concurrently on the same tx queue without SW lock. This PMD feature found in some NICs and useful in the following use cases: +If the PMD supports the ``RTE_ETH_TX_OFFLOAD_MT_LOCKFREE`` offload, +multiple threads can call ``rte_eth_tx_burst()`` concurrently on the same Tx queue without a software lock. +This capability, available in some NICs, proves advantageous in these scenarios: -* Remove explicit spinlock in some applications where lcores are not mapped to Tx queues with 1:1 relation. +* Eliminating explicit spinlocks in applications where Tx queues do not map 1:1 to logical cores -* In the eventdev use case, avoid dedicating a separate TX core for transmitting and thus - enables more scaling as all workers can send the packets. +* In eventdev-based workloads, allowing all worker threads to transmit packets, removing the need for a dedicated + Tx core and enabling greater scalability See `Hardware Offload`_ for ``RTE_ETH_TX_OFFLOAD_MT_LOCKFREE`` capability probing details. 
@@ -133,11 +141,10 @@ Device Identification, Ownership and Configuration Device Identification ~~~~~~~~~~~~~~~~~~~~~ -Each NIC port is uniquely designated by its (bus/bridge, device, function) PCI -identifiers assigned by the PCI probing/enumeration function executed at DPDK initialization. -Based on their PCI identifier, NIC ports are assigned two other identifiers: +The PCI probing/enumeration function executed at DPDK initialization assigns each NIC port a unique PCI +identifier (bus/bridge, device, function). Based on this PCI identifier, DPDK assigns each NIC port two additional identifiers: -* A port index used to designate the NIC port in all functions exported by the PMD API. +* A port index used to designate the NIC port in all functions exported by the PMD API * A port name used to designate the port in console messages, for administration or debugging purposes. For ease of use, the port name includes the port index. @@ -145,83 +152,82 @@ Based on their PCI identifier, NIC ports are assigned two other identifiers: Port Ownership ~~~~~~~~~~~~~~ -The Ethernet devices ports can be owned by a single DPDK entity (application, library, PMD, process, etc). -The ownership mechanism is controlled by ethdev APIs and allows to set/remove/get a port owner by DPDK entities. -It prevents Ethernet ports to be managed by different entities. +A single DPDK entity (application, library, PMD, process, etc.) can own Ethernet device ports. +The ethdev APIs control the ownership mechanism and allow DPDK entities to set, remove, or get a port owner. +This prevents different entities from managing the same Ethernet ports. .. note:: - It is the DPDK entity responsibility to set the port owner before using it and to manage the port usage synchronization between different threads or processes. + The DPDK entity must set port ownership before using the port and manage usage synchronization between different threads or processes. 
-It is recommended to set port ownership early, -like during the probing notification ``RTE_ETH_EVENT_NEW``. +Set port ownership early, for instance during the probing notification ``RTE_ETH_EVENT_NEW``. Device Configuration ~~~~~~~~~~~~~~~~~~~~ -The configuration of each NIC port includes the following operations: +Configuring each NIC port includes the following operations: * Allocate PCI resources -* Reset the hardware (issue a Global Reset) to a well-known default state +* Reset the hardware to a well-known default state (issue a Global Reset) * Set up the PHY and the link * Initialize statistics counters -The PMD API must also export functions to start/stop the all-multicast feature of a port and functions to set/unset the port in promiscuous mode. +The PMD API must also export functions to start/stop the all-multicast feature of a port and functions to set/unset promiscuous mode. -Some hardware offload features must be individually configured at port initialization through specific configuration parameters. -This is the case for the Receive Side Scaling (RSS) and Data Center Bridging (DCB) features for example. +Some hardware offload features require individual configuration at port initialization through specific parameters. +This includes Receive Side Scaling (RSS) and Data Center Bridging (DCB) features. On-the-Fly Configuration ~~~~~~~~~~~~~~~~~~~~~~~~ -All device features that can be started or stopped "on the fly" (that is, without stopping the device) do not require the PMD API to export dedicated functions for this purpose. +Device features that can start or stop "on the fly" (without stopping the device) do not require the PMD API to export dedicated functions. -All that is required is the mapping address of the device PCI registers to implement the configuration of these features in specific functions outside of the drivers. 
+Implementing the configuration of these features in specific functions outside of the drivers requires only the mapping address of the device PCI registers. For this purpose, -the PMD API exports a function that provides all the information associated with a device that can be used to set up a given device feature outside of the driver. -This includes the PCI vendor identifier, the PCI device identifier, the mapping address of the PCI device registers, and the name of the driver. +the PMD API exports a function that provides all device information needed to set up a given feature outside of the driver. +This includes the PCI vendor identifier, the PCI device identifier, the mapping address of the PCI device registers, and the driver name. -The main advantage of this approach is that it gives complete freedom on the choice of the API used to configure, to start, and to stop such features. +The main advantage of this approach is complete freedom in choosing the API to configure, start, and stop such features. As an example, refer to the configuration of the IEEE1588 feature for the Intel® 82576 Gigabit Ethernet Controller and -the Intel® 82599 10 Gigabit Ethernet Controller controllers in the testpmd application. +the Intel® 82599 10 Gigabit Ethernet Controller in the testpmd application. -Other features such as the L3/L4 5-Tuple packet filtering feature of a port can be configured in the same way. -Ethernet* flow control (pause frame) can be configured on the individual port. +Configure other features such as the L3/L4 5-Tuple packet filtering feature of a port in the same way. +Configure Ethernet* flow control (pause frame) on the individual port. Refer to the testpmd source code for details. -Also, L4 (UDP/TCP/ SCTP) checksum offload by the NIC can be enabled for an individual packet as long as the packet mbuf is set up correctly. See `Hardware Offload`_ for details. 
+Also, enable L4 (UDP/TCP/SCTP) checksum offload by the NIC for an individual packet by setting up the packet mbuf correctly. See `Hardware Offload`_ for details. Configuration of Transmit Queues ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -Each transmit queue is independently configured with the following information: +Configure each transmit queue independently with the following information: * The number of descriptors of the transmit ring -* The socket identifier used to identify the appropriate DMA memory zone from which to allocate the transmit ring in NUMA architectures +* The socket identifier used to identify the appropriate DMA memory zone for allocating the transmit ring in NUMA architectures * The values of the Prefetch, Host and Write-Back threshold registers of the transmit queue * The *minimum* transmit packets to free threshold (tx_free_thresh). - When the number of descriptors used to transmit packets exceeds this threshold, the network adaptor should be checked to see if it has written back descriptors. - A value of 0 can be passed during the TX queue configuration to indicate the default value should be used. + When the number of descriptors used to transmit packets exceeds this threshold, check the network adaptor to see if it has written back descriptors. + Pass a value of 0 during Tx queue configuration to use the default value. The default value for tx_free_thresh is 32. - This ensures that the PMD does not search for completed descriptors until at least 32 have been processed by the NIC for this queue. + This ensures the PMD does not search for completed descriptors until the NIC has processed at least 32 for this queue. -* The *minimum* RS bit threshold. The minimum number of transmit descriptors to use before setting the Report Status (RS) bit in the transmit descriptor. +* The *minimum* RS bit threshold. The minimum number of transmit descriptors to use before setting the Report Status (RS) bit in the transmit descriptor. 
Note that this parameter may only be valid for Intel 10 GbE network adapters. - The RS bit is set on the last descriptor used to transmit a packet if the number of descriptors used since the last RS bit setting, + Set the RS bit on the last descriptor used to transmit a packet if the number of descriptors used since the last RS bit setting, up to the first descriptor used to transmit the packet, exceeds the transmit RS bit threshold (tx_rs_thresh). - In short, this parameter controls which transmit descriptors are written back to host memory by the network adapter. - A value of 0 can be passed during the TX queue configuration to indicate that the default value should be used. + In short, this parameter controls which transmit descriptors the network adapter writes back to host memory. + Pass a value of 0 during Tx queue configuration to use the default value. The default value for tx_rs_thresh is 32. - This ensures that at least 32 descriptors are used before the network adapter writes back the most recently used descriptor. - This saves upstream PCIe* bandwidth resulting from TX descriptor write-backs. - It is important to note that the TX Write-back threshold (TX wthresh) should be set to 0 when tx_rs_thresh is greater than 1. + This ensures the PMD uses at least 32 descriptors before the network adapter writes back the most recently used descriptor. + This saves upstream PCIe* bandwidth that would be used for Tx descriptor write-backs. + Set the Tx Write-back threshold (Tx wthresh) to 0 when tx_rs_thresh is greater than 1. Refer to the Intel® 82599 10 Gigabit Ethernet Controller Datasheet for more details. The following constraints must be satisfied for tx_free_thresh and tx_rs_thresh: @@ -236,46 +242,45 @@ The following constraints must be satisfied for tx_free_thresh and tx_rs_thresh: * tx_free_thresh must be less than the size of the ring minus 3. -* For optimal performance, TX wthresh should be set to 0 when tx_rs_thresh is greater than 1. 
+* For optimal performance, set Tx wthresh to 0 when tx_rs_thresh is greater than 1. -One descriptor in the TX ring is used as a sentinel to avoid a hardware race condition, hence the maximum threshold constraints. +One descriptor in the Tx ring serves as a sentinel to avoid a hardware race condition, hence the maximum threshold constraints. .. note:: - When configuring for DCB operation, at port initialization, both the number of transmit queues and the number of receive queues must be set to 128. + When configuring for DCB operation at port initialization, set both the number of transmit queues and the number of receive queues to 128. Free Tx mbuf on Demand ~~~~~~~~~~~~~~~~~~~~~~ -Many of the drivers do not release the mbuf back to the mempool, or local cache, -immediately after the packet has been transmitted. +Many drivers do not release the mbuf back to the mempool or local cache immediately after packet transmission. Instead, they leave the mbuf in their Tx ring and either perform a bulk release when the ``tx_rs_thresh`` has been crossed or free the mbuf when a slot in the Tx ring is needed. An application can request the driver to release used mbufs with the ``rte_eth_tx_done_cleanup()`` API. -This API requests the driver to release mbufs that are no longer in use, -independent of whether or not the ``tx_rs_thresh`` has been crossed. -There are two scenarios when an application may want the mbuf released immediately: +This API requests the driver to release mbufs no longer in use, +independent of whether the ``tx_rs_thresh`` has been crossed. +Two scenarios exist where an application may want the mbuf released immediately: * When a given packet needs to be sent to multiple destination interfaces (either for Layer 2 flooding or Layer 3 multi-cast). - One option is to make a copy of the packet or a copy of the header portion that needs to be manipulated. + One option is to copy the packet or the header portion that needs manipulation. 
A second option is to transmit the packet and then poll the ``rte_eth_tx_done_cleanup()`` API - until the reference count on the packet is decremented. - Then the same packet can be transmitted to the next destination interface. - The application is still responsible for managing any packet manipulations needed - between the different destination interfaces, but a packet copy can be avoided. - This API is independent of whether the packet was transmitted or dropped, + until the reference count on the packet decrements. + Then, transmit the same packet to the next destination interface. + The application remains responsible for managing any packet manipulations needed + between the different destination interfaces, but avoids a packet copy. + This API operates independently of whether the interface transmitted or dropped the packet, only that the mbuf is no longer in use by the interface. -* Some applications are designed to make multiple runs, like a packet generator. +* Some applications make multiple runs, like a packet generator. For performance reasons and consistency between runs, the application may want to reset back to an initial state between each run, where all mbufs are returned to the mempool. - In this case, it can call the ``rte_eth_tx_done_cleanup()`` API - for each destination interface it has been using - to request it to release of all its used mbufs. + In this case, call the ``rte_eth_tx_done_cleanup()`` API + for each destination interface used + to request it to release all used mbufs. To determine if a driver supports this API, check for the *Free Tx mbuf on demand* feature in the *Network Interface Controller Drivers* document. @@ -285,49 +290,49 @@ Hardware Offload Depending on driver capabilities advertised by ``rte_eth_dev_info_get()``, the PMD may support hardware offloading -feature like checksumming, TCP segmentation, VLAN insertion or -lockfree multithreaded TX burst on the same TX queue. 
+features like checksumming, TCP segmentation, VLAN insertion, or +lockfree multithreaded Tx burst on the same Tx queue. -The support of these offload features implies the addition of dedicated -status bit(s) and value field(s) into the rte_mbuf data structure, along -with their appropriate handling by the receive/transmit functions -exported by each PMD. The list of flags and their precise meaning is -described in the mbuf API documentation and in the :ref:`mbuf_meta` chapter. +Supporting these offload features requires adding dedicated +status bit(s) and value field(s) to the rte_mbuf data structure, along +with appropriate handling by the receive/transmit functions +exported by each PMD. The mbuf API documentation and the :ref:`mbuf_meta` chapter +describe the list of flags and their precise meanings. Per-Port and Per-Queue Offloads ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ -In the DPDK offload API, offloads are divided into per-port and per-queue offloads as follows: +In the DPDK offload API, offloads divide into per-port and per-queue offloads as follows: -* A per-queue offloading can be enabled on a queue and disabled on another queue at the same time. -* A pure per-port offload is the one supported by device but not per-queue type. -* A pure per-port offloading can't be enabled on a queue and disabled on another queue at the same time. -* A pure per-port offloading must be enabled or disabled on all queues at the same time. -* Any offloading is per-queue or pure per-port type, but can't be both types at same devices. +* A per-queue offload can be enabled on one queue and disabled on another queue simultaneously. +* A pure per-port offload is supported by a device but not as a per-queue type. +* A pure per-port offload cannot be enabled on one queue and disabled on another queue simultaneously. +* A pure per-port offload must be enabled or disabled on all queues simultaneously. 
+* An offload is either per-queue or pure per-port type; it cannot be both types on the same device. * Port capabilities = per-queue capabilities + pure per-port capabilities. -* Any supported offloading can be enabled on all queues. +* Any supported offload can be enabled on all queues. -The different offloads capabilities can be queried using ``rte_eth_dev_info_get()``. +Query the different offload capabilities using ``rte_eth_dev_info_get()``. The ``dev_info->[rt]x_queue_offload_capa`` returned from ``rte_eth_dev_info_get()`` includes all per-queue offloading capabilities. The ``dev_info->[rt]x_offload_capa`` returned from ``rte_eth_dev_info_get()`` includes all pure per-port and per-queue offloading capabilities. Supported offloads can be either per-port or per-queue. -Offloads are enabled using the existing ``RTE_ETH_TX_OFFLOAD_*`` or ``RTE_ETH_RX_OFFLOAD_*`` flags. -Any requested offloading by an application must be within the device capabilities. -Any offloading is disabled by default if it is not set in the parameter +Enable offloads using the existing ``RTE_ETH_TX_OFFLOAD_*`` or ``RTE_ETH_RX_OFFLOAD_*`` flags. +Any offload requested by an application must be within the device capabilities. +Any offload is disabled by default if it is not set in the parameter ``dev_conf->[rt]xmode.offloads`` to ``rte_eth_dev_configure()`` and ``[rt]x_conf->offloads`` to ``rte_eth_[rt]x_queue_setup()``. -If any offloading is enabled in ``rte_eth_dev_configure()`` by an application, -it is enabled on all queues no matter whether it is per-queue or -per-port type and no matter whether it is set or cleared in +If an application enables any offload in ``rte_eth_dev_configure()``, +it is enabled on all queues regardless of whether it is per-queue or +per-port type and regardless of whether it is set or cleared in ``[rt]x_conf->offloads`` to ``rte_eth_[rt]x_queue_setup()``. 
-If a per-queue offloading hasn't been enabled in ``rte_eth_dev_configure()``, -it can be enabled or disabled in ``rte_eth_[rt]x_queue_setup()`` for individual queue. -A newly added offloads in ``[rt]x_conf->offloads`` to ``rte_eth_[rt]x_queue_setup()`` input by application -is the one which hasn't been enabled in ``rte_eth_dev_configure()`` and is requested to be enabled -in ``rte_eth_[rt]x_queue_setup()``. It must be per-queue type, otherwise trigger an error log. +If a per-queue offload has not been enabled in ``rte_eth_dev_configure()``, +it can be enabled or disabled in ``rte_eth_[rt]x_queue_setup()`` for an individual queue. +A newly added offload in ``[rt]x_conf->offloads`` to ``rte_eth_[rt]x_queue_setup()`` input by the application +is one that has not been enabled in ``rte_eth_dev_configure()`` and is requested to be enabled +in ``rte_eth_[rt]x_queue_setup()``. It must be per-queue type; otherwise, an error is logged. Poll Mode Driver API -------------------- @@ -335,44 +340,43 @@ Poll Mode Driver API Generalities ~~~~~~~~~~~~ -By default, all functions exported by a PMD are lock-free functions that are assumed -not to be invoked in parallel on different logical cores to work on the same target object. -For instance, a PMD receive function cannot be invoked in parallel on two logical cores to poll the same RX queue of the same port. -Of course, this function can be invoked in parallel by different logical cores on different RX queues. -It is the responsibility of the upper-level application to enforce this rule.
-If needed, parallel accesses by multiple logical cores to shared queues can be explicitly protected by dedicated inline lock-aware functions +If needed, explicitly protect parallel accesses by multiple logical cores to shared queues using dedicated inline lock-aware functions built on top of their corresponding lock-free functions of the PMD API. Generic Packet Representation ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -A packet is represented by an rte_mbuf structure, which is a generic metadata structure containing all necessary housekeeping information. -This includes fields and status bits corresponding to offload hardware features, such as checksum computation of IP headers or VLAN tags. +An rte_mbuf structure represents a packet. This generic metadata structure contains all necessary housekeeping information, +including fields and status bits corresponding to offload hardware features, such as checksum computation of IP headers or VLAN tags. The rte_mbuf data structure includes specific fields to represent, in a generic way, the offload features provided by network controllers. -For an input packet, most fields of the rte_mbuf structure are filled in by the PMD receive function with the information contained in the receive descriptor. -Conversely, for output packets, most fields of rte_mbuf structures are used by the PMD transmit function to initialize transmit descriptors. +For an input packet, the PMD receive function fills in most fields of the rte_mbuf structure with information contained in the receive descriptor. +Conversely, for output packets, the PMD transmit function uses most fields of rte_mbuf structures to initialize transmit descriptors. See :doc:`../mbuf_lib` chapter for more details. Ethernet Device API ~~~~~~~~~~~~~~~~~~~ -The Ethernet device API exported by the Ethernet PMDs is described in the *DPDK API Reference*. +The *DPDK API Reference* describes the Ethernet device API exported by the Ethernet PMDs. .. 
_ethernet_device_standard_device_arguments: Ethernet Device Standard Device Arguments ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -Standard Ethernet device arguments allow for a set of commonly used arguments/ -parameters which are applicable to all Ethernet devices to be available to for -specification of specific device and for passing common configuration -parameters to those ports. +Standard Ethernet device arguments provide a set of commonly used arguments/ +parameters applicable to all Ethernet devices. Use these arguments/parameters to +specify specific devices and pass common configuration parameters to those ports. -* ``representor`` for a device which supports the creation of representor ports - this argument allows user to specify which switch ports to enable port +* Use ``representor`` for a device that supports creating representor ports. + This argument allows the user to specify which switch ports to enable port representors for:: -a DBDF,representor=vf0 @@ -380,8 +384,8 @@ parameters to those ports. -a DBDF,representor=vf[0-31] -a DBDF,representor=vf[0,2-4,7,9-11] - These examples will attach VF representors relative to DBDF. - The VF IDs can be a list, a range or a mix. + These examples attach VF representors relative to DBDF. + The VF IDs can be a list, a range, or a mix. SF representors follow the same syntax:: -a DBDF,representor=sf0 @@ -389,47 +393,47 @@ parameters to those ports. -a DBDF,representor=sf[0-1023] -a DBDF,representor=sf[0,2-4,7,9-11] - If there are multiple PFs associated with the same PCI device, - the PF ID must be used to distinguish between representors relative to different PFs:: + If multiple PFs are associated with the same PCI device, + use the PF ID to distinguish between representors relative to different PFs:: -a DBDF,representor=pf1vf0 -a DBDF,representor=pf[0-1]vf0 - The example above will attach 4 representors pf0vf0, pf1vf0, pf0 and pf1. 
- If only VF representors are required, the PF part must be enclosed with parentheses:: + The example above attaches 4 representors pf0vf0, pf1vf0, pf0, and pf1. + If only VF representors are required, enclose the PF part in parentheses:: -a DBDF,representor=(pf[0-1])vf0 - The example above will attach 2 representors pf0vf0, pf1vf0. + The example above attaches 2 representors pf0vf0 and pf1vf0. - List of representors for the same PCI device is enclosed in square brackets:: + Enclose the list of representors for the same PCI device in square brackets:: -a DBDF,representor=[pf[0-1],pf2vf[0-2],pf3[3,5-8]] - Note: PMDs may have additional extensions for the representor parameter, and users - should consult the relevant PMD documentation to see support devargs. + Note: PMDs may have additional extensions for the representor parameter. Consult + the relevant PMD documentation for supported devargs. Extended Statistics API ~~~~~~~~~~~~~~~~~~~~~~~ -The extended statistics API allows a PMD to expose all statistics that are -available to it, including statistics that are unique to the device. -Each statistic has three properties ``name``, ``id`` and ``value``: +The extended statistics API allows a PMD to expose all available statistics, +including statistics unique to the device. +Each statistic has three properties: ``name``, ``id``, and ``value``: -* ``name``: A human readable string formatted by the scheme detailed below. +* ``name``: A human-readable string formatted by the scheme detailed below. * ``id``: An integer that represents only that statistic. -* ``value``: A unsigned 64-bit integer that is the value of the statistic. +* ``value``: An unsigned 64-bit integer that is the value of the statistic. -Note that extended statistic identifiers are -driver-specific, and hence might not be the same for different ports. -The API consists of various ``rte_eth_xstats_*()`` functions, and allows an -application to be flexible in how it retrieves statistics. 
+Note that extended statistic identifiers are driver-specific, +and therefore might not be the same for different ports. +The API consists of various ``rte_eth_xstats_*()`` functions and provides +applications flexibility in how they retrieve statistics. Scheme for Human Readable Names ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ -A naming scheme exists for the strings exposed to clients of the API. This is -to allow scraping of the API for statistics of interest. The naming scheme uses +A naming scheme governs the strings exposed to clients of the API. This scheme +allows scraping of the API for statistics of interest. The naming scheme uses strings split by a single underscore ``_``. The scheme is as follows: * direction @@ -438,69 +442,67 @@ strings split by a single underscore ``_``. The scheme is as follows: * detail n * unit -Examples of common statistics xstats strings, formatted to comply to the scheme +Examples of common statistics xstats strings, formatted to comply with the scheme proposed above: * ``rx_bytes`` * ``rx_crc_errors`` * ``tx_multicast_packets`` -The scheme, although quite simple, allows flexibility in presenting and reading +The scheme, although simple, provides flexibility in presenting and reading information from the statistic strings. The following example illustrates the -naming scheme:``rx_packets``. In this example, the string is split into two -components. The first component ``rx`` indicates that the statistic is -associated with the receive side of the NIC. The second component ``packets`` +naming scheme: ``rx_packets``. In this example, the string splits into two +components. The first component ``rx`` indicates that the statistic +is associated with the receive side of the NIC. The second component ``packets`` indicates that the unit of measure is packets. A more complicated example: ``tx_size_128_to_255_packets``. 
In this example, -``tx`` indicates transmission, ``size`` is the first detail, ``128`` etc are +``tx`` indicates transmission, ``size`` is the first detail, ``128`` etc. are more details, and ``packets`` indicates that this is a packet counter. Some additions in the metadata scheme are as follows: * If the first part does not match ``rx`` or ``tx``, the statistic does not - have an affinity with either receive of transmit. + have an affinity with either receive or transmit. * If the first letter of the second part is ``q`` and this ``q`` is followed by a number, this statistic is part of a specific queue. -An example where queue numbers are used is as follows: ``tx_q7_bytes`` which -indicates this statistic applies to queue number 7, and represents the number +An example where queue numbers are used: ``tx_q7_bytes`` indicates this statistic applies to queue number 7 and represents the number of transmitted bytes on that queue. API Design ^^^^^^^^^^ -The xstats API uses the ``name``, ``id``, and ``value`` to allow performant -lookup of specific statistics. Performant lookup means two things; +The xstats API uses ``name``, ``id``, and ``value`` to allow performant +lookup of specific statistics. Performant lookup means two things: -* No string comparisons with the ``name`` of the statistic in fast-path -* Allow requesting of only the statistics of interest +* No string comparisons with the ``name`` of the statistic in the fast path +* Allow requesting only the statistics of interest -The API ensures these requirements are met by mapping the ``name`` of the -statistic to a unique ``id``, which is used as a key for lookup in the fast-path. -The API allows applications to request an array of ``id`` values, so that the -PMD only performs the required calculations. Expected usage is that the -application scans the ``name`` of each statistic, and caches the ``id`` -if it has an interest in that statistic. 
On the fast-path, the integer can be used +The API meets these requirements by mapping the ``name`` of the +statistic to a unique ``id``, which serves as a key for lookup in the fast path. +The API allows applications to request an array of ``id`` values, so the +PMD only performs the required calculations. The expected usage is that the +application scans the ``name`` of each statistic and caches the ``id`` +if it has an interest in that statistic. On the fast path, the integer can be used to retrieve the actual ``value`` of the statistic that the ``id`` represents. API Functions ^^^^^^^^^^^^^ -The API is built out of a small number of functions, which can be used to -retrieve the number of statistics and the names, IDs and values of those -statistics. +The API is built from a small number of functions, which retrieve the number of statistics +and the names, IDs, and values of those statistics. -* ``rte_eth_xstats_get_names_by_id()``: returns the names of the statistics. When given a - ``NULL`` parameter the function returns the number of statistics that are available. +* ``rte_eth_xstats_get_names_by_id()``: Returns the names of the statistics. When given a + ``NULL`` parameter, the function returns the number of available statistics. * ``rte_eth_xstats_get_id_by_name()``: Searches for the statistic ID that matches - ``xstat_name``. If found, the ``id`` integer is set. + ``xstat_name``. If found, sets the ``id`` integer. * ``rte_eth_xstats_get_by_id()``: Fills in an array of ``uint64_t`` values - with matching the provided ``ids`` array. If the ``ids`` array is NULL, it - returns all statistics that are available. + matching the provided ``ids`` array. If the ``ids`` array is NULL, it + returns all available statistics. Application Usage @@ -509,11 +511,11 @@ Application Usage Imagine an application that wants to view the dropped packet count. If no packets are dropped, the application does not read any other metrics for performance reasons. 
If packets are dropped, the application has a particular -set of statistics that it requests. This "set" of statistics allows the app to -decide what next steps to perform. The following code-snippets show how the -xstats API can be used to achieve this goal. +set of statistics that it requests. This "set" of statistics allows the application to +decide what next steps to perform. The following code snippets show how the +xstats API achieves this goal. -First step is to get all statistics names and list them: +The first step is to get all statistics names and list them: .. code-block:: c @@ -557,9 +559,9 @@ First step is to get all statistics names and list them: printf("%s: %"PRIu64"\n", xstats_names[i].name, values[i]); } -The application has access to the names of all of the statistics that the PMD -exposes. The application can decide which statistics are of interest, cache the -ids of those statistics by looking up the name as follows: +The application has access to the names of all statistics that the PMD +exposes. The application can decide which statistics are of interest and cache the +IDs of those statistics by looking up the name as follows: .. code-block:: c @@ -576,10 +578,8 @@ ids of those statistics by looking up the name as follows: goto err; } -The API provides flexibility to the application so that it can look up multiple -statistics using an array containing multiple ``id`` numbers. This reduces the -function call overhead of retrieving statistics, and makes lookup of multiple -statistics simpler for the application. +The API allows the application to look up multiple statistics using an array containing multiple ``id`` numbers. +This reduces function call overhead when retrieving statistics and simplifies looking up multiple statistics. .. code-block:: c @@ -597,12 +597,12 @@ statistics simpler for the application. 
} -This array lookup API for xstats allows the application create multiple -"groups" of statistics, and look up the values of those IDs using a single API -call. As an end result, the application is able to achieve its goal of -monitoring a single statistic ("rx_errors" in this case), and if that shows +This array lookup API for xstats allows the application to create multiple +"groups" of statistics and look up the values of those IDs using a single API +call. As a result, the application achieves its goal of +monitoring a single statistic (in this case, "rx_errors"). If that shows packets being dropped, it can easily retrieve a "set" of statistics using the -IDs array parameter to ``rte_eth_xstats_get_by_id`` function. +IDs array parameter to the ``rte_eth_xstats_get_by_id`` function. NIC Reset API ~~~~~~~~~~~~~ @@ -611,84 +611,81 @@ NIC Reset API int rte_eth_dev_reset(uint16_t port_id); -Sometimes a port has to be reset passively. For example when a PF is -reset, all its VFs should also be reset by the application to make them -consistent with the PF. A DPDK application also can call this function -to trigger a port reset. Normally, a DPDK application would invokes this -function when an RTE_ETH_EVENT_INTR_RESET event is detected. +Sometimes a port must be reset passively. For example, when a PF is +reset, the application should also reset all its VFs to maintain consistency +with the PF. A DPDK application can also call this function +to trigger a port reset. Normally, a DPDK application invokes this +function when it detects an RTE_ETH_EVENT_INTR_RESET event. -It is the duty of the PMD to trigger RTE_ETH_EVENT_INTR_RESET events and -the application should register a callback function to handle these -events. When a PMD needs to trigger a reset, it can trigger an +The PMD triggers RTE_ETH_EVENT_INTR_RESET events. +The application should register a callback function to handle these +events. 
When a PMD needs to trigger a reset, it triggers an RTE_ETH_EVENT_INTR_RESET event. On receiving an RTE_ETH_EVENT_INTR_RESET -event, applications can handle it as follows: Stop working queues, stop +event, applications should: stop working queues, stop calling Rx and Tx functions, and then call rte_eth_dev_reset(). For -thread safety all these operations should be called from the same thread. +thread safety, call all these operations from the same thread. -For example when PF is reset, the PF sends a message to notify VFs of -this event and also trigger an interrupt to VFs. Then in the interrupt -service routine the VFs detects this notification message and calls +For example, when a PF is reset, it sends a message to notify VFs of +this event and also triggers an interrupt to VFs. Then, in the interrupt +service routine, the VFs detect this notification message and call rte_eth_dev_callback_process(dev, RTE_ETH_EVENT_INTR_RESET, NULL). This means that a PF reset triggers an RTE_ETH_EVENT_INTR_RESET -event within VFs. The function rte_eth_dev_callback_process() will -call the registered callback function. The callback function can trigger -the application to handle all operations the VF reset requires including +event within VFs. The function rte_eth_dev_callback_process() +calls the registered callback function. The callback function can trigger +the application to handle all operations the VF reset requires, including stopping Rx/Tx queues and calling rte_eth_dev_reset(). -The rte_eth_dev_reset() itself is a generic function which only does -some hardware reset operations through calling dev_unint() and -dev_init(), and itself does not handle synchronization, which is handled -by application. +The rte_eth_dev_reset() function is a generic function that only performs hardware reset operations by calling dev_uninit() and +dev_init(). It does not handle synchronization; the application handles that. -The PMD itself should not call rte_eth_dev_reset(). 
The PMD can trigger -the application to handle reset event. It is duty of application to -handle all synchronization before it calls rte_eth_dev_reset(). +The PMD should not call rte_eth_dev_reset(). The PMD can trigger +the application to handle the reset event. The application must +handle all synchronization before calling rte_eth_dev_reset(). The above error handling mode is known as ``RTE_ETH_ERROR_HANDLE_MODE_PASSIVE``. Proactive Error Handling Mode ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -This mode is known as ``RTE_ETH_ERROR_HANDLE_MODE_PROACTIVE``, -different from the application invokes recovery in PASSIVE mode, -the PMD automatically recovers from error in PROACTIVE mode, -and only a small amount of work is required for the application. +This mode is known as ``RTE_ETH_ERROR_HANDLE_MODE_PROACTIVE``, which +differs from PASSIVE mode where the application invokes recovery. +In PROACTIVE mode, the PMD automatically recovers from errors, +and the application requires only minimal handling. During error detection and automatic recovery, the PMD sets the data path pointers to dummy functions -(which will prevent the crash), -and also make sure the control path operations fail with a return code ``-EBUSY``. +(which prevent crashes) +and ensures control path operations fail with return code ``-EBUSY``. Because the PMD recovers automatically, -the application can only sense that the data flow is disconnected for a while -and the control API returns an error in this period. +the application only senses that the data flow is disconnected for a while +and that the control API returns an error during this period. -In order to sense the error happening/recovering, -as well as to restore some additional configuration, +To sense error occurrence and recovery, +as well as to restore additional configuration, three events are available: ``RTE_ETH_EVENT_ERR_RECOVERING`` - Notify the application that an error is detected - and the recovery is being started. 
+ Notifies the application that an error is detected + and recovery is beginning. Upon receiving the event, the application should not invoke - any control path function until receiving + any control path function until receiving the ``RTE_ETH_EVENT_RECOVERY_SUCCESS`` or ``RTE_ETH_EVENT_RECOVERY_FAILED`` event. .. note:: Before the PMD reports the recovery result, - the PMD may report the ``RTE_ETH_EVENT_ERR_RECOVERING`` event again, - because a larger error may occur during the recovery. + it may report the ``RTE_ETH_EVENT_ERR_RECOVERING`` event again + because a larger error may occur during recovery. ``RTE_ETH_EVENT_RECOVERY_SUCCESS`` - Notify the application that the recovery from error is successful, - the PMD already re-configures the port, + Notifies the application that recovery from the error was successful. + The PMD has reconfigured the port, and the effect is the same as a restart operation. ``RTE_ETH_EVENT_RECOVERY_FAILED`` - Notify the application that the recovery from error failed, - the port should not be usable anymore. + Notifies the application that recovery from the error failed. + The port should not be usable anymore. The application should close the port. -The error handling mode supported by the PMD can be reported through -``rte_eth_dev_info_get``. +Query the error handling mode supported by the PMD using ``rte_eth_dev_info_get()``. -- 2.51.0 ^ permalink raw reply related [flat|nested] 18+ messages in thread
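The id-caching pattern described in the xstats section of this patch (resolve a statistic name to its id once on the slow path, then do integer-only lookups on the fast path) can be sketched standalone. The name table and helper functions below are hypothetical stand-ins for rte_eth_xstats_get_id_by_name() / rte_eth_xstats_get_by_id(), not DPDK code:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for the driver's xstats tables: a fixed name
 * array and a parallel counter array replace the PMD so that only the
 * lookup pattern itself is visible. */
static const char *xstat_names[] = { "rx_bytes", "rx_errors", "tx_q7_bytes" };
static uint64_t xstat_values[]   = { 1000, 3, 42 };

/* Slow path: resolve a statistic name to its id once, at startup.
 * Mirrors rte_eth_xstats_get_id_by_name(): returns 0 and sets *id on
 * success, a negative value if the name is unknown. */
static int xstat_id_by_name(const char *name, uint64_t *id)
{
    for (uint64_t i = 0; i < 3; i++) {
        if (strcmp(xstat_names[i], name) == 0) {
            *id = i;
            return 0;
        }
    }
    return -1; /* not found */
}

/* Fast path: integer-indexed lookup, no string comparison. */
static uint64_t xstat_value_by_id(uint64_t id)
{
    return xstat_values[id];
}
```

An application would call xstat_id_by_name() once per statistic of interest during initialization and keep the returned ids in an array, so the fast path never touches the strings.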
* [PATCH 2/5] doc: correct grammar in rte_flow guide 2026-01-16 21:29 [PATCH 0/5] doc: ethdev documentation grammar and typo corrections Stephen Hemminger 2026-01-16 21:29 ` [PATCH 1/5] doc: correct grammar and punctuation errors in ethdev guide Stephen Hemminger @ 2026-01-16 21:29 ` Stephen Hemminger 2026-01-16 21:29 ` [PATCH 3/5] doc: correct grammar in QoS framework guide Stephen Hemminger ` (3 subsequent siblings) 5 siblings, 0 replies; 18+ messages in thread From: Stephen Hemminger @ 2026-01-16 21:29 UTC (permalink / raw) To: dev; +Cc: Stephen Hemminger, Ori Kam Remove duplicate article "an" in phrase "a single an item". Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> --- doc/guides/prog_guide/ethdev/flow_offload.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/doc/guides/prog_guide/ethdev/flow_offload.rst b/doc/guides/prog_guide/ethdev/flow_offload.rst index 1cd904e1ee..d51926a47c 100644 --- a/doc/guides/prog_guide/ethdev/flow_offload.rst +++ b/doc/guides/prog_guide/ethdev/flow_offload.rst @@ -109,7 +109,7 @@ However, ``rte_eth_dev_configure()`` may fail if any rules remain, so the application must flush them before attempting a reconfiguration. Keeping may be unsupported for some types of rule items and actions, as well as depending on the value of flow attributes transfer bit. -A combination of a single an item or action type +A combination of a single item or action type and a value of the transfer bit is called a rule feature. For example: a COUNT action with the transfer bit set. To test if rules with a particular feature are kept, the application must try -- 2.51.0 ^ permalink raw reply related [flat|nested] 18+ messages in thread
* [PATCH 3/5] doc: correct grammar in QoS framework guide 2026-01-16 21:29 [PATCH 0/5] doc: ethdev documentation grammar and typo corrections Stephen Hemminger 2026-01-16 21:29 ` [PATCH 1/5] doc: correct grammar and punctuation errors in ethdev guide Stephen Hemminger 2026-01-16 21:29 ` [PATCH 2/5] doc: correct grammar in rte_flow guide Stephen Hemminger @ 2026-01-16 21:29 ` Stephen Hemminger 2026-01-16 21:29 ` [PATCH 4/5] doc: correct typos in switch representation guide Stephen Hemminger ` (2 subsequent siblings) 5 siblings, 0 replies; 18+ messages in thread From: Stephen Hemminger @ 2026-01-16 21:29 UTC (permalink / raw) To: dev; +Cc: Stephen Hemminger, Cristian Dumitrescu Add missing article "a" in phrase "As result" which should be "As a result" in four locations including table cell descriptions. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> --- doc/guides/prog_guide/ethdev/qos_framework.rst | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/doc/guides/prog_guide/ethdev/qos_framework.rst b/doc/guides/prog_guide/ethdev/qos_framework.rst index 1144037dfa..53a72ab677 100644 --- a/doc/guides/prog_guide/ethdev/qos_framework.rst +++ b/doc/guides/prog_guide/ethdev/qos_framework.rst @@ -621,7 +621,7 @@ The token bucket generic parameters and operations are presented in :numref:`tab | | | while the bucket is full are dropped. | | | | | +---+------------------------+------------------------------------------------------------------------------+ - | 3 | Credit consumption | As result of packet scheduling, the necessary number of credits is removed | + | 3 | Credit consumption | As a result of packet scheduling, the necessary number of credits is removed | | | | from the bucket. The packet can only be sent if enough credits are in the | | | | bucket to send the full packet (packet bytes and framing overhead for the | | | | packet). | @@ -716,7 +716,7 @@ where, r = port line rate (in bytes per second). 
| | | * tb_time += n_periods * tb_period; | | | | | +---+-------------------------+-----------------------------------------------------------------------------+ - | 3 | Credit consumption | As result of packet scheduling, the necessary number of credits is removed | + | 3 | Credit consumption | As a result of packet scheduling, the necessary number of credits is removed| | | (on packet scheduling) | from the bucket. The packet can only be sent if enough credits are in the | | | | bucket to send the full packet (packet bytes and framing overhead for the | | | | packet). | @@ -805,7 +805,7 @@ as described in :numref:`table_qos_10` and :numref:`table_qos_11`. | | | } | | | | | +---+--------------------------+----------------------------------------------------------------------------+ - | 3 | Credit consumption | As result of packet scheduling, the TC limit is decreased with the | + | 3 | Credit consumption | As a result of packet scheduling, the TC limit is decreased with the | | | (on packet scheduling) | necessary number of credits. The packet can only be sent if enough credits | | | | are currently available in the TC limit to send the full packet | | | | (packet bytes and framing overhead for the packet). | @@ -1720,7 +1720,7 @@ Traffic Metering The traffic metering component implements the Single Rate Three Color Marker (srTCM) and Two Rate Three Color Marker (trTCM) algorithms, as defined by IETF RFC 2697 and 2698 respectively. These algorithms meter the stream of incoming packets based on the allowance defined in advance for each traffic flow. -As result, each incoming packet is tagged as green, +As a result, each incoming packet is tagged as green, yellow or red based on the monitored consumption of the flow the packet belongs to. Functional Overview -- 2.51.0 ^ permalink raw reply related [flat|nested] 18+ messages in thread
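The credit update and credit consumption steps corrected in the tables above can be sketched as a standalone token bucket. The field names mirror the guide's pseudocode (tb_time, tb_period, tb_credits_per_period, tb_size), but this is an illustration under those assumptions, not the librte_sched implementation:

```c
#include <assert.h>
#include <stdint.h>

/* Toy token bucket following the guide's pseudocode fields. */
struct tb {
    uint64_t tb_time;               /* time of last credit update */
    uint64_t tb_period;             /* time between credit updates */
    uint64_t tb_credits_per_period; /* credits added per period */
    uint64_t tb_size;               /* bucket capacity */
    uint64_t tb_credits;            /* credits currently in the bucket */
};

/* Credit update: add credits for each elapsed full period, saturate at
 * the bucket size (credits arriving while the bucket is full are lost),
 * and advance tb_time by whole periods only. */
static void tb_update(struct tb *b, uint64_t now)
{
    uint64_t n_periods = (now - b->tb_time) / b->tb_period;

    b->tb_credits += n_periods * b->tb_credits_per_period;
    if (b->tb_credits > b->tb_size)
        b->tb_credits = b->tb_size;
    b->tb_time += n_periods * b->tb_period;
}

/* Credit consumption: the packet is sent only if enough credits are in
 * the bucket for the full packet. Returns 1 if sent, 0 otherwise. */
static int tb_consume(struct tb *b, uint64_t pkt_len)
{
    if (b->tb_credits < pkt_len)
        return 0; /* not enough credits; packet is not sent */
    b->tb_credits -= pkt_len;
    return 1;
}
```

With tb_period = 10 and tb_credits_per_period = 100, an update at time 35 adds three periods' worth of credits (saturating at tb_size) and leaves tb_time at 30, matching the "tb_time += n_periods * tb_period" step quoted in the table.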
* [PATCH 4/5] doc: correct typos in switch representation guide 2026-01-16 21:29 [PATCH 0/5] doc: ethdev documentation grammar and typo corrections Stephen Hemminger ` (2 preceding siblings ...) 2026-01-16 21:29 ` [PATCH 3/5] doc: correct grammar in QoS framework guide Stephen Hemminger @ 2026-01-16 21:29 ` Stephen Hemminger 2026-01-16 21:29 ` [PATCH 5/5] doc: correct typos in traffic management guide Stephen Hemminger 2026-01-28 19:45 ` [PATCH v2 0/8] doc: ethdev documentation grammar and typo corrections Stephen Hemminger 5 siblings, 0 replies; 18+ messages in thread From: Stephen Hemminger @ 2026-01-16 21:29 UTC (permalink / raw) To: dev; +Cc: Stephen Hemminger, Thomas Monjalon, Andrew Rybchenko Two typos corrected: - "according on" to "according to" - "physical of virtual" to "physical or virtual" Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> --- doc/guides/prog_guide/ethdev/switch_representation.rst | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/doc/guides/prog_guide/ethdev/switch_representation.rst b/doc/guides/prog_guide/ethdev/switch_representation.rst index 2ef2772afb..0bc88bda8f 100644 --- a/doc/guides/prog_guide/ethdev/switch_representation.rst +++ b/doc/guides/prog_guide/ethdev/switch_representation.rst @@ -19,7 +19,7 @@ managed by the host system and fully transparent to users and applications. On the other hand, applications typically found on hypervisors that process layer 2 (L2) traffic (such as OVS) need to steer traffic themselves -according on their own criteria. +according to their own criteria. Without a standard software interface to manage traffic steering rules between VFs, SFs, PFs and the various physical ports of a given device, @@ -84,7 +84,7 @@ thought as a software "patch panel" front-end for applications. - Among other things, they can be used to assign MAC addresses to the resource they represent. 
-- Applications can tell port representors apart from other physical of virtual +- Applications can tell port representors apart from other physical or virtual port by checking the dev_flags field within their device information structure for the RTE_ETH_DEV_REPRESENTOR bit-field. -- 2.51.0 ^ permalink raw reply related [flat|nested] 18+ messages in thread
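The representor check described in the corrected sentence above can be sketched as follows. This is a stand-alone mock: the flag value and the struct are illustrative stand-ins for the real RTE_ETH_DEV_REPRESENTOR definition and device-information structure in rte_ethdev.h, which real code would query via rte_eth_dev_info_get().

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative bit value; real code uses the rte_ethdev.h definition. */
#define MOCK_ETH_DEV_REPRESENTOR (1u << 5)

/* Mock of the dev_flags field carried in the device information. */
struct mock_dev_info {
    uint32_t dev_flags;
};

/* A port is a representor if the representor bit-field is set in its
 * device flags, as the guide describes. */
int
port_is_representor(const struct mock_dev_info *info)
{
    return (info->dev_flags & MOCK_ETH_DEV_REPRESENTOR) != 0;
}

/* Count representor ports in an array of device infos, e.g. to split
 * representors from physical or virtual ports during setup. */
unsigned int
count_representors(const struct mock_dev_info *infos, unsigned int n)
{
    unsigned int i, count = 0;

    for (i = 0; i < n; i++)
        if (port_is_representor(&infos[i]))
            count++;
    return count;
}
```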
* [PATCH 5/5] doc: correct typos in traffic management guide 2026-01-16 21:29 [PATCH 0/5] doc: ethdev documentation grammar and typo corrections Stephen Hemminger ` (3 preceding siblings ...) 2026-01-16 21:29 ` [PATCH 4/5] doc: correct typos in switch representation guide Stephen Hemminger @ 2026-01-16 21:29 ` Stephen Hemminger 2026-01-28 19:45 ` [PATCH v2 0/8] doc: ethdev documentation grammar and typo corrections Stephen Hemminger 5 siblings, 0 replies; 18+ messages in thread From: Stephen Hemminger @ 2026-01-16 21:29 UTC (permalink / raw) To: dev; +Cc: Stephen Hemminger, Cristian Dumitrescu Two documentation issues corrected: - Remove errant asterisk from "Head Drop*" - Remove duplicate phrase in WRED algorithm description Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> --- doc/guides/prog_guide/ethdev/traffic_management.rst | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) diff --git a/doc/guides/prog_guide/ethdev/traffic_management.rst b/doc/guides/prog_guide/ethdev/traffic_management.rst index c356791a45..086eb9d3af 100644 --- a/doc/guides/prog_guide/ethdev/traffic_management.rst +++ b/doc/guides/prog_guide/ethdev/traffic_management.rst @@ -109,15 +109,14 @@ They are made available for every leaf node in the hierarchy, subject to the specific implementation supporting them. On request of writing a new packet into the current queue while the queue is full, the Tail Drop algorithm drops the new packet while leaving the queue -unmodified, as opposed to the Head Drop* algorithm, which drops the packet +unmodified, as opposed to the Head Drop algorithm, which drops the packet at the head of the queue (the oldest packet waiting in the queue) and admits the new packet at the tail of the queue. The Random Early Detection (RED) algorithm works by proactively dropping more and more input packets as the queue occupancy builds up. When the queue is full or almost full, RED effectively works as Tail Drop. 
The Weighted RED (WRED) -algorithm uses a separate set of RED thresholds for each packet color and uses -separate set of RED thresholds for each packet color. +algorithm uses a separate set of RED thresholds for each packet color. Each hierarchy leaf node with WRED enabled as its congestion management mode has zero or one private WRED context (only one leaf node using it) and/or zero, -- 2.51.0 ^ permalink raw reply related [flat|nested] 18+ messages in thread
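The per-color threshold idea behind the corrected WRED sentence can be sketched deterministically. The threshold values and the linear drop ramp below are illustrative only; real RED operates on a smoothed queue-occupancy average and a random draw, and DPDK's actual WRED profiles are configured through the traffic management API.

```c
#include <assert.h>
#include <stdint.h>

enum pkt_color { COLOR_GREEN, COLOR_YELLOW, COLOR_RED, COLOR_MAX };

/* WRED keeps a separate set of RED thresholds for each packet color,
 * as the guide describes. Values used here are illustrative. */
struct wred_thresholds {
    uint32_t min_th;   /* below this occupancy: always enqueue */
    uint32_t max_th;   /* at or above this occupancy: always drop */
};

/* Simplified, deterministic WRED decision: returns the drop probability
 * in percent for a packet of the given color at the given (averaged)
 * queue occupancy. At max_th and above it behaves like Tail Drop. */
unsigned int
wred_drop_percent(const struct wred_thresholds th[COLOR_MAX],
                  enum pkt_color color, uint32_t occupancy)
{
    const struct wred_thresholds *t = &th[color];

    if (occupancy < t->min_th)
        return 0;       /* queue lightly loaded: admit packet */
    if (occupancy >= t->max_th)
        return 100;     /* full or almost full: drop, like Tail Drop */
    /* Linear ramp between min_th and max_th. */
    return 100 * (occupancy - t->min_th) / (t->max_th - t->min_th);
}
```

Giving red packets lower thresholds than green ones means the queue sheds out-of-profile traffic first as occupancy builds up, which is the point of per-color threshold sets.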
* [PATCH v2 0/8] doc: ethdev documentation grammar and typo corrections 2026-01-16 21:29 [PATCH 0/5] doc: ethdev documentation grammar and typo corrections Stephen Hemminger ` (4 preceding siblings ...) 2026-01-16 21:29 ` [PATCH 5/5] doc: correct typos in traffic management guide Stephen Hemminger @ 2026-01-28 19:45 ` Stephen Hemminger 2026-01-28 19:46 ` [PATCH v2 1/8] doc: correct grammar and punctuation errors in ethdev guide Stephen Hemminger ` (8 more replies) 5 siblings, 9 replies; 18+ messages in thread From: Stephen Hemminger @ 2026-01-28 19:45 UTC (permalink / raw) To: dev; +Cc: Stephen Hemminger This series corrects grammar errors, typos, and punctuation issues in the Ethernet device library section of the DPDK Programmer's Guide. Initial work on this was done as part of a review by a technical writer, then more changes were made as a result of AI reviews. The first patch provides a comprehensive update to the Poll Mode Driver documentation, modernizing the description and improving readability. The remaining patches address smaller issues in the rte_flow, QoS framework, switch representation, traffic management, and traffic metering guides. 
v2: - Add patch to fix minor grammar issue in ethdev introduction - Add patch to correct alphabetical ordering in ethdev toctree - Add patch to fix grammar issues in traffic metering and policing guide Stephen Hemminger (8): doc: correct grammar and punctuation errors in ethdev guide doc: correct grammar in flow guide doc: correct grammar in QoS framework guide doc: correct typos in switch representation guide doc: correct typos in traffic management guide doc: correct grammar and improve clarity in ethdev guide doc: correct alphabetical ordering in ethdev toctree doc: correct grammar and improve clarity in MTR guide doc/guides/prog_guide/ethdev/ethdev.rst | 557 +++++++++--------- doc/guides/prog_guide/ethdev/flow_offload.rst | 58 +- doc/guides/prog_guide/ethdev/index.rst | 6 +- .../prog_guide/ethdev/qos_framework.rst | 59 +- .../ethdev/switch_representation.rst | 6 +- .../prog_guide/ethdev/traffic_management.rst | 25 +- .../ethdev/traffic_metering_and_policing.rst | 8 +- 7 files changed, 360 insertions(+), 359 deletions(-) -- 2.51.0 ^ permalink raw reply [flat|nested] 18+ messages in thread
* [PATCH v2 1/8] doc: correct grammar and punctuation errors in ethdev guide 2026-01-28 19:45 ` [PATCH v2 0/8] doc: ethdev documentation grammar and typo corrections Stephen Hemminger @ 2026-01-28 19:46 ` Stephen Hemminger 2026-01-28 19:46 ` [PATCH v2 2/8] doc: correct grammar in flow guide Stephen Hemminger ` (7 subsequent siblings) 8 siblings, 0 replies; 18+ messages in thread From: Stephen Hemminger @ 2026-01-28 19:46 UTC (permalink / raw) To: dev; +Cc: Stephen Hemminger, Nandini Persad Change various grammar, punctuation, and typographical errors throughout the Poll Mode Driver documentation. - Change extra spaces after emphasized terms (*run-to-completion*, *pipe-line*) - Correct possessive forms ("port's" -> "ports'", "processors" -> "processor's") - Change subject-verb agreement ("VFs detects" -> "VFs detect") - Add missing articles and words ("It is duty" -> "It is the duty", "allows the application create" -> "allows the application to create") - Remove extraneous words ("release of all" -> "release all", "ensures sure" -> "ensures") - Change typos ("dev_unint()" -> "dev_uninit()", "receive of transmit" -> "receive or transmit", "UDP/TCP/ SCTP" -> "UDP/TCP/SCTP") - Add missing punctuation (period at end of bullet point) - Change spacing around inline code markup - Clarify awkward sentence about PROACTIVE vs PASSIVE error Signed-off-by: Nandini Persad <nandinipersad361@gmail.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> --- doc/guides/prog_guide/ethdev/ethdev.rst | 557 ++++++++++++------------ 1 file changed, 277 insertions(+), 280 deletions(-) diff --git a/doc/guides/prog_guide/ethdev/ethdev.rst b/doc/guides/prog_guide/ethdev/ethdev.rst index daaf43ea3b..ffe3fb1416 100644 --- a/doc/guides/prog_guide/ethdev/ethdev.rst +++ b/doc/guides/prog_guide/ethdev/ethdev.rst @@ -4,126 +4,134 @@ Poll Mode Driver ================ -The DPDK includes 1 Gigabit, 10 Gigabit and 40 Gigabit and para virtualized virtio Poll Mode Drivers. 
+The Data Plane Development Kit (DPDK) supports a wide range of Ethernet speeds, +from 10 Megabits to 400 Gigabits, depending on hardware capability. -A Poll Mode Driver (PMD) consists of APIs, provided through the BSD driver running in user space, -to configure the devices and their respective queues. -In addition, a PMD accesses the RX and TX descriptors directly without any interrupts -(with the exception of Link Status Change interrupts) to quickly receive, -process and deliver packets in the user's application. -This section describes the requirements of the PMDs, -their global design principles and proposes a high-level architecture and a generic external API for the Ethernet PMDs. +DPDK's Poll Mode Drivers (PMDs) are high-performance, optimized drivers for various +network interface cards that bypass the traditional kernel network stack to reduce +latency and improve throughput. They access Rx and Tx descriptors directly in a polling +mode without relying on interrupts (except for Link Status Change notifications), enabling +efficient packet reception and transmission in user-space applications. + +This section outlines the requirements of Ethernet PMDs, their design principles, +and presents a high-level architecture along with a generic external API. Requirements and Assumptions ---------------------------- -The DPDK environment for packet processing applications allows for two models, run-to-completion and pipe-line: +The DPDK environment for packet processing applications supports two models: run-to-completion and pipeline. -* In the *run-to-completion* model, a specific port's RX descriptor ring is polled for packets through an API. - Packets are then processed on the same core and placed on a port's TX descriptor ring through an API for transmission. +* In the *run-to-completion* model, a specific port's Rx descriptor ring is polled for packets through an API. 
+ The application then processes packets on the same core and transmits them via the port's Tx descriptor ring using another API. -* In the *pipe-line* model, one core polls one or more port's RX descriptor ring through an API. - Packets are received and passed to another core via a ring. - The other core continues to process the packet which then may be placed on a port's TX descriptor ring through an API for transmission. +* In the *pipeline* model, one core polls the Rx descriptor ring(s) of one or more ports via an API. + The application then passes received packets to another core via a ring for further processing, + which may include transmission through the Tx descriptor ring using an API. -In a synchronous run-to-completion model, -each logical core assigned to the DPDK executes a packet processing loop that includes the following steps: +In a synchronous run-to-completion model, a logical core (lcore) +assigned to DPDK executes a packet processing loop that includes the following steps: -* Retrieve input packets through the PMD receive API +* Retrieve input packets using the PMD receive API -* Process each received packet one at a time, up to its forwarding +* Process each received packet individually, up to its forwarding -* Send pending output packets through the PMD transmit API +* Transmit output packets using the PMD transmit API -Conversely, in an asynchronous pipe-line model, some logical cores may be dedicated to the retrieval of received packets and -other logical cores to the processing of previously received packets. -Received packets are exchanged between logical cores through rings. -The loop for packet retrieval includes the following steps: +In contrast, the asynchronous pipeline model assigns some logical cores to retrieve packets +and others to process them. The application exchanges packets between cores via rings. 
+ +The packet retrieval loop includes: * Retrieve input packets through the PMD receive API * Provide received packets to processing lcores through packet queues -The loop for packet processing includes the following steps: - -* Retrieve the received packet from the packet queue +The packet processing loop includes: -* Process the received packet, up to its retransmission if forwarded +* Dequeue received packets from the packet queue -To avoid any unnecessary interrupt processing overhead, the execution environment must not use any asynchronous notification mechanisms. -Whenever needed and appropriate, asynchronous communication should be introduced as much as possible through the use of rings. +* Process packets, including retransmission if forwarded -Avoiding lock contention is a key issue in a multi-core environment. -To address this issue, PMDs are designed to work with per-core private resources as much as possible. -For example, a PMD maintains a separate transmit queue per-core, per-port, if the PMD is not ``RTE_ETH_TX_OFFLOAD_MT_LOCKFREE`` capable. -In the same way, every receive queue of a port is assigned to and polled by a single logical core (lcore). +To minimize interrupt-related overhead, the execution environment should avoid asynchronous +notification mechanisms. When asynchronous communication is required, implement it +using rings where possible. Minimizing lock contention is critical in multi-core environments. +To support this, PMDs use per-core private resources whenever possible. +For example, if a PMD does not support ``RTE_ETH_TX_OFFLOAD_MT_LOCKFREE``, it maintains a separate +transmit queue per core and per port. Similarly, each receive queue is assigned to and polled by a single lcore. -To comply with Non-Uniform Memory Access (NUMA), memory management is designed to assign to each logical core -a private buffer pool in local memory to minimize remote memory access. 
-The configuration of packet buffer pools should take into account the underlying physical memory architecture in terms of DIMMS, -channels and ranks. -The application must ensure that appropriate parameters are given at memory pool creation time. +To support Non-Uniform Memory Access (NUMA), the memory management design assigns each logical +core a private buffer pool in local memory to reduce remote memory access. Configuration of packet +buffer pools should consider the underlying physical memory layout, such as DIMMs, channels, and ranks. +The application must set proper parameters during memory pool creation. See :doc:`../mempool_lib`. Design Principles ----------------- -The API and architecture of the Ethernet* PMDs are designed with the following guidelines in mind. +The API and architecture of the Ethernet PMDs follow these design principles: -PMDs must help global policy-oriented decisions to be enforced at the upper application level. -Conversely, NIC PMD functions should not impede the benefits expected by upper-level global policies, -or worse prevent such policies from being applied. +PMDs should support the enforcement of global, policy-driven decisions at the upper application level. +At the same time, NIC PMD functions must not hinder the performance gains expected by these higher-level policies, +or worse, prevent them from being implemented. -For instance, both the receive and transmit functions of a PMD have a maximum number of packets/descriptors to poll. -This allows a run-to-completion processing stack to statically fix or -to dynamically adapt its overall behavior through different global loop policies, such as: +For example, both the receive and transmit functions of a PMD define a maximum number of packets to poll. 
+This enables a run-to-completion processing stack to either statically configure or dynamically adjust its +behavior according to different global loop strategies, such as: -* Receive, process immediately and transmit packets one at a time in a piecemeal fashion. +* Receiving, processing, and transmitting packets one at a time in a piecemeal fashion -* Receive as many packets as possible, then process all received packets, transmitting them immediately. +* Receiving as many packets as possible, then processing and transmitting them all immediately -* Receive a given maximum number of packets, process the received packets, accumulate them and finally send all accumulated packets to transmit. +* Receiving a set number of packets, processing them, and batching them for transmission at once -To achieve optimal performance, overall software design choices and pure software optimization techniques must be considered and -balanced against available low-level hardware-based optimization features (CPU cache properties, bus speed, NIC PCI bandwidth, and so on). -The case of packet transmission is an example of this software/hardware tradeoff issue when optimizing burst-oriented network packet processing engines. -In the initial case, the PMD could export only an rte_eth_tx_one function to transmit one packet at a time on a given queue. -On top of that, one can easily build an rte_eth_tx_burst function that loops invoking the rte_eth_tx_one function to transmit several packets at a time. -However, an rte_eth_tx_burst function is effectively implemented by the PMD to minimize the driver-level transmit cost per packet through the following optimizations: +To maximize performance, developers must consider overall software architecture and optimization techniques +alongside available low-level hardware optimizations (e.g., CPU cache behavior, bus speed, and NIC PCI bandwidth). -* Share among multiple packets the un-amortized cost of invoking the rte_eth_tx_one function. 
+Packet transmission in burst-oriented network engines illustrates this software/hardware tradeoff. +A PMD could expose only the ``rte_eth_tx_one`` function to transmit a single packet at a time on a given queue. +While it is possible to build an ``rte_eth_tx_burst`` function by repeatedly calling ``rte_eth_tx_one``, +most PMDs implement ``rte_eth_tx_burst`` directly to reduce per-packet transmission overhead. -* Enable the rte_eth_tx_burst function to take advantage of burst-oriented hardware features (prefetch data in cache, use of NIC head/tail registers) - to minimize the number of CPU cycles per packet, for example by avoiding unnecessary read memory accesses to ring transmit descriptors, - or by systematically using arrays of pointers that exactly fit cache line boundaries and sizes. +This implementation includes several key optimizations: -* Apply burst-oriented software optimization techniques to remove operations that would otherwise be unavoidable, such as ring index wrap back management. +* Sharing the fixed cost of invoking ``rte_eth_tx_one`` across multiple packets -Burst-oriented functions are also introduced via the API for services that are intensively used by the PMD. -This applies in particular to buffer allocators used to populate NIC rings, which provide functions to allocate/free several buffers at a time. -For example, an mbuf_multiple_alloc function returning an array of pointers to rte_mbuf buffers which speeds up the receive poll function of the PMD when -replenishing multiple descriptors of the receive ring. 
+* Leveraging burst-oriented hardware features (e.g., data prefetching, NIC head/tail registers, vector extensions) + to reduce CPU cycles per packet, including minimizing unnecessary memory accesses and aligning pointer arrays + with cache line boundaries and sizes + +* Applying software-level burst optimizations to eliminate otherwise unavoidable overheads, such as ring index wrap-around handling + +The API also introduces burst-oriented functions for PMD-intensive services, such as buffer allocation. +For instance, buffer allocators used to populate NIC rings often support functions that allocate or free multiple buffers in a single call. +An example is ``rte_pktmbuf_alloc_bulk``, which returns an array of rte_mbuf pointers, significantly improving PMD performance +when replenishing multiple descriptors in the receive ring. Logical Cores, Memory and NIC Queues Relationships -------------------------------------------------- -The DPDK supports NUMA allowing for better performance when a processor's logical cores and interfaces utilize its local memory. -Therefore, mbuf allocation associated with local PCIe* interfaces should be allocated from memory pools created in the local memory. -The buffers should, if possible, remain on the local processor to obtain the best performance results and RX and TX buffer descriptors -should be populated with mbufs allocated from a mempool allocated from local memory. +DPDK supports NUMA, which improves performance when a processor's logical cores and network interfaces +use memory local to that processor. To maximize this benefit, allocate mbufs associated with local PCIe* interfaces +from memory pools located in the same NUMA node. + +Ideally, keep these buffers on the local processor to achieve optimal performance. Populate Rx and Tx buffer +descriptors with mbufs from mempools created in local memory. 
-The run-to-completion model also performs better if packet or data manipulation is in local memory instead of a remote processors memory. -This is also true for the pipe-line model provided all logical cores used are located on the same processor. +The run-to-completion model also benefits from performing packet data operations in local memory, +rather than accessing remote memory across NUMA nodes. +The same applies to the pipeline model, provided all logical cores involved reside on the same processor. -Multiple logical cores should never share receive or transmit queues for interfaces since this would require global locks and hinder performance. +Never share receive and transmit queues between multiple logical cores, as doing so requires +global locks and severely impacts performance. -If the PMD is ``RTE_ETH_TX_OFFLOAD_MT_LOCKFREE`` capable, multiple threads can invoke ``rte_eth_tx_burst()`` -concurrently on the same tx queue without SW lock. This PMD feature found in some NICs and useful in the following use cases: +If the PMD supports the ``RTE_ETH_TX_OFFLOAD_MT_LOCKFREE`` offload, +multiple threads can call ``rte_eth_tx_burst()`` concurrently on the same Tx queue without a software lock. +This capability, available in some NICs, proves advantageous in these scenarios: -* Remove explicit spinlock in some applications where lcores are not mapped to Tx queues with 1:1 relation. +* Eliminating explicit spinlocks in applications where Tx queues do not map 1:1 to logical cores -* In the eventdev use case, avoid dedicating a separate TX core for transmitting and thus - enables more scaling as all workers can send the packets. +* In eventdev-based workloads, allowing all worker threads to transmit packets, removing the need for a dedicated + Tx core and enabling greater scalability See `Hardware Offload`_ for ``RTE_ETH_TX_OFFLOAD_MT_LOCKFREE`` capability probing details. 
@@ -133,11 +141,10 @@ Device Identification, Ownership and Configuration Device Identification ~~~~~~~~~~~~~~~~~~~~~ -Each NIC port is uniquely designated by its (bus/bridge, device, function) PCI -identifiers assigned by the PCI probing/enumeration function executed at DPDK initialization. -Based on their PCI identifier, NIC ports are assigned two other identifiers: +The PCI probing/enumeration function executed at DPDK initialization assigns each NIC port a unique PCI +identifier (bus/bridge, device, function). Based on this PCI identifier, DPDK assigns each NIC port two additional identifiers: -* A port index used to designate the NIC port in all functions exported by the PMD API. +* A port index used to designate the NIC port in all functions exported by the PMD API * A port name used to designate the port in console messages, for administration or debugging purposes. For ease of use, the port name includes the port index. @@ -145,83 +152,82 @@ Based on their PCI identifier, NIC ports are assigned two other identifiers: Port Ownership ~~~~~~~~~~~~~~ -The Ethernet devices ports can be owned by a single DPDK entity (application, library, PMD, process, etc). -The ownership mechanism is controlled by ethdev APIs and allows to set/remove/get a port owner by DPDK entities. -It prevents Ethernet ports to be managed by different entities. +A single DPDK entity (application, library, PMD, process, etc.) can own Ethernet device ports. +The ethdev APIs control the ownership mechanism and allow DPDK entities to set, remove, or get a port owner. +This prevents different entities from managing the same Ethernet ports. .. note:: - It is the DPDK entity responsibility to set the port owner before using it and to manage the port usage synchronization between different threads or processes. + The DPDK entity must set port ownership before using the port and manage usage synchronization between different threads or processes. 
-It is recommended to set port ownership early, -like during the probing notification ``RTE_ETH_EVENT_NEW``. +Set port ownership early, for instance during the probing notification ``RTE_ETH_EVENT_NEW``. Device Configuration ~~~~~~~~~~~~~~~~~~~~ -The configuration of each NIC port includes the following operations: +Configuring each NIC port includes the following operations: * Allocate PCI resources -* Reset the hardware (issue a Global Reset) to a well-known default state +* Reset the hardware to a well-known default state (issue a Global Reset) * Set up the PHY and the link * Initialize statistics counters -The PMD API must also export functions to start/stop the all-multicast feature of a port and functions to set/unset the port in promiscuous mode. +The PMD API must also export functions to start/stop the all-multicast feature of a port and functions to set/unset promiscuous mode. -Some hardware offload features must be individually configured at port initialization through specific configuration parameters. -This is the case for the Receive Side Scaling (RSS) and Data Center Bridging (DCB) features for example. +Some hardware offload features require individual configuration at port initialization through specific parameters. +This includes Receive Side Scaling (RSS) and Data Center Bridging (DCB) features. On-the-Fly Configuration ~~~~~~~~~~~~~~~~~~~~~~~~ -All device features that can be started or stopped "on the fly" (that is, without stopping the device) do not require the PMD API to export dedicated functions for this purpose. +Device features that can start or stop "on the fly" (without stopping the device) do not require the PMD API to export dedicated functions. -All that is required is the mapping address of the device PCI registers to implement the configuration of these features in specific functions outside of the drivers. 
+Implementing the configuration of these features in specific functions outside of the drivers requires only the mapping address of the device PCI registers. For this purpose, -the PMD API exports a function that provides all the information associated with a device that can be used to set up a given device feature outside of the driver. -This includes the PCI vendor identifier, the PCI device identifier, the mapping address of the PCI device registers, and the name of the driver. +the PMD API exports a function that provides all device information needed to set up a given feature outside of the driver. +This includes the PCI vendor identifier, the PCI device identifier, the mapping address of the PCI device registers, and the driver name. -The main advantage of this approach is that it gives complete freedom on the choice of the API used to configure, to start, and to stop such features. +The main advantage of this approach is complete freedom in choosing the API to configure, start, and stop such features. As an example, refer to the configuration of the IEEE1588 feature for the Intel® 82576 Gigabit Ethernet Controller and -the Intel® 82599 10 Gigabit Ethernet Controller controllers in the testpmd application. +the Intel® 82599 10 Gigabit Ethernet Controller in the testpmd application. -Other features such as the L3/L4 5-Tuple packet filtering feature of a port can be configured in the same way. -Ethernet* flow control (pause frame) can be configured on the individual port. +Configure other features such as the L3/L4 5-Tuple packet filtering feature of a port in the same way. +Configure Ethernet* flow control (pause frame) on an individual port. Refer to the testpmd source code for details. -Also, L4 (UDP/TCP/ SCTP) checksum offload by the NIC can be enabled for an individual packet as long as the packet mbuf is set up correctly. See `Hardware Offload`_ for details. 
+Also, enable L4 (UDP/TCP/SCTP) checksum offload by the NIC for an individual packet by setting up the packet mbuf correctly. See `Hardware Offload`_ for details. Configuration of Transmit Queues ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -Each transmit queue is independently configured with the following information: +Configure each transmit queue independently with the following information: * The number of descriptors of the transmit ring -* The socket identifier used to identify the appropriate DMA memory zone from which to allocate the transmit ring in NUMA architectures +* The socket identifier used to identify the appropriate DMA memory zone for allocating the transmit ring in NUMA architectures * The values of the Prefetch, Host and Write-Back threshold registers of the transmit queue * The *minimum* transmit packets to free threshold (tx_free_thresh). - When the number of descriptors used to transmit packets exceeds this threshold, the network adaptor should be checked to see if it has written back descriptors. - A value of 0 can be passed during the TX queue configuration to indicate the default value should be used. + When the number of descriptors used to transmit packets exceeds this threshold, check the network adapter to see if it has written back descriptors. + Pass a value of 0 during Tx queue configuration to use the default value. The default value for tx_free_thresh is 32. - This ensures that the PMD does not search for completed descriptors until at least 32 have been processed by the NIC for this queue. + This ensures the PMD does not search for completed descriptors until the NIC has processed at least 32 for this queue. -* The *minimum* RS bit threshold. The minimum number of transmit descriptors to use before setting the Report Status (RS) bit in the transmit descriptor. +* The *minimum* RS bit threshold. The minimum number of transmit descriptors to use before setting the Report Status (RS) bit in the transmit descriptor. 
Note that this parameter may only be valid for Intel 10 GbE network adapters. - The RS bit is set on the last descriptor used to transmit a packet if the number of descriptors used since the last RS bit setting, + Set the RS bit on the last descriptor used to transmit a packet if the number of descriptors used since the last RS bit setting, up to the first descriptor used to transmit the packet, exceeds the transmit RS bit threshold (tx_rs_thresh). - In short, this parameter controls which transmit descriptors are written back to host memory by the network adapter. - A value of 0 can be passed during the TX queue configuration to indicate that the default value should be used. + In short, this parameter controls which transmit descriptors the network adapter writes back to host memory. + Pass a value of 0 during Tx queue configuration to use the default value. The default value for tx_rs_thresh is 32. - This ensures that at least 32 descriptors are used before the network adapter writes back the most recently used descriptor. - This saves upstream PCIe* bandwidth resulting from TX descriptor write-backs. - It is important to note that the TX Write-back threshold (TX wthresh) should be set to 0 when tx_rs_thresh is greater than 1. + This ensures the PMD uses at least 32 descriptors before the network adapter writes back the most recently used descriptor. + This saves upstream PCIe* bandwidth that would be used for Tx descriptor write-backs. + Set the Tx Write-back threshold (Tx wthresh) to 0 when tx_rs_thresh is greater than 1. Refer to the Intel® 82599 10 Gigabit Ethernet Controller Datasheet for more details. The following constraints must be satisfied for tx_free_thresh and tx_rs_thresh: @@ -236,46 +242,45 @@ The following constraints must be satisfied for tx_free_thresh and tx_rs_thresh: * tx_free_thresh must be less than the size of the ring minus 3. -* For optimal performance, TX wthresh should be set to 0 when tx_rs_thresh is greater than 1. 
+* For optimal performance, set Tx wthresh to 0 when tx_rs_thresh is greater than 1. -One descriptor in the TX ring is used as a sentinel to avoid a hardware race condition, hence the maximum threshold constraints. +One descriptor in the Tx ring serves as a sentinel to avoid a hardware race condition, hence the maximum threshold constraints. .. note:: - When configuring for DCB operation, at port initialization, both the number of transmit queues and the number of receive queues must be set to 128. + When configuring for DCB operation at port initialization, set both the number of transmit queues and the number of receive queues to 128. Free Tx mbuf on Demand ~~~~~~~~~~~~~~~~~~~~~~ -Many of the drivers do not release the mbuf back to the mempool, or local cache, -immediately after the packet has been transmitted. +Many drivers do not release the mbuf back to the mempool or local cache immediately after packet transmission. Instead, they leave the mbuf in their Tx ring and either perform a bulk release when the ``tx_rs_thresh`` has been crossed or free the mbuf when a slot in the Tx ring is needed. An application can request the driver to release used mbufs with the ``rte_eth_tx_done_cleanup()`` API. -This API requests the driver to release mbufs that are no longer in use, -independent of whether or not the ``tx_rs_thresh`` has been crossed. -There are two scenarios when an application may want the mbuf released immediately: +This API requests the driver to release mbufs no longer in use, +independent of whether the ``tx_rs_thresh`` has been crossed. +Two scenarios exist where an application may want the mbuf released immediately: * When a given packet needs to be sent to multiple destination interfaces (either for Layer 2 flooding or Layer 3 multi-cast). - One option is to make a copy of the packet or a copy of the header portion that needs to be manipulated. + One option is to copy the packet or the header portion that needs manipulation. 
A second option is to transmit the packet and then poll the ``rte_eth_tx_done_cleanup()`` API - until the reference count on the packet is decremented. - Then the same packet can be transmitted to the next destination interface. - The application is still responsible for managing any packet manipulations needed - between the different destination interfaces, but a packet copy can be avoided. - This API is independent of whether the packet was transmitted or dropped, + until the reference count on the packet decrements. + Then, transmit the same packet to the next destination interface. + The application remains responsible for managing any packet manipulations needed + between the different destination interfaces, but avoids a packet copy. + This API operates independently of whether the interface transmitted or dropped the packet, only that the mbuf is no longer in use by the interface. -* Some applications are designed to make multiple runs, like a packet generator. +* Some applications make multiple runs, like a packet generator. For performance reasons and consistency between runs, the application may want to reset back to an initial state between each run, where all mbufs are returned to the mempool. - In this case, it can call the ``rte_eth_tx_done_cleanup()`` API - for each destination interface it has been using - to request it to release of all its used mbufs. + In this case, call the ``rte_eth_tx_done_cleanup()`` API + for each destination interface used + to request it to release all used mbufs. To determine if a driver supports this API, check for the *Free Tx mbuf on demand* feature in the *Network Interface Controller Drivers* document. @@ -285,49 +290,49 @@ Hardware Offload Depending on driver capabilities advertised by ``rte_eth_dev_info_get()``, the PMD may support hardware offloading -feature like checksumming, TCP segmentation, VLAN insertion or -lockfree multithreaded TX burst on the same TX queue. 
+features like checksumming, TCP segmentation, VLAN insertion, or +lockfree multithreaded Tx burst on the same Tx queue. -The support of these offload features implies the addition of dedicated -status bit(s) and value field(s) into the rte_mbuf data structure, along -with their appropriate handling by the receive/transmit functions -exported by each PMD. The list of flags and their precise meaning is -described in the mbuf API documentation and in the :ref:`mbuf_meta` chapter. +Supporting these offload features requires adding dedicated +status bit(s) and value field(s) to the rte_mbuf data structure, along +with appropriate handling by the receive/transmit functions +exported by each PMD. The mbuf API documentation and the :ref:`mbuf_meta` chapter +describe the list of flags and their precise meanings. Per-Port and Per-Queue Offloads ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ -In the DPDK offload API, offloads are divided into per-port and per-queue offloads as follows: +In the DPDK offload API, offloads divide into per-port and per-queue offloads as follows: -* A per-queue offloading can be enabled on a queue and disabled on another queue at the same time. -* A pure per-port offload is the one supported by device but not per-queue type. -* A pure per-port offloading can't be enabled on a queue and disabled on another queue at the same time. -* A pure per-port offloading must be enabled or disabled on all queues at the same time. -* Any offloading is per-queue or pure per-port type, but can't be both types at same devices. +* A per-queue offload can be enabled on one queue and disabled on another queue simultaneously. +* A pure per-port offload is supported by a device but not as a per-queue type. +* A pure per-port offload cannot be enabled on one queue and disabled on another queue simultaneously. +* A pure per-port offload must be enabled or disabled on all queues simultaneously. 
+* An offload is either per-queue or pure per-port type; it cannot be both types on the same device. * Port capabilities = per-queue capabilities + pure per-port capabilities. -* Any supported offloading can be enabled on all queues. +* Any supported offload can be enabled on all queues. -The different offloads capabilities can be queried using ``rte_eth_dev_info_get()``. +Query the different offload capabilities using ``rte_eth_dev_info_get()``. The ``dev_info->[rt]x_queue_offload_capa`` returned from ``rte_eth_dev_info_get()`` includes all per-queue offloading capabilities. The ``dev_info->[rt]x_offload_capa`` returned from ``rte_eth_dev_info_get()`` includes all pure per-port and per-queue offloading capabilities. Supported offloads can be either per-port or per-queue. -Offloads are enabled using the existing ``RTE_ETH_TX_OFFLOAD_*`` or ``RTE_ETH_RX_OFFLOAD_*`` flags. -Any requested offloading by an application must be within the device capabilities. -Any offloading is disabled by default if it is not set in the parameter +Enable offloads using the existing ``RTE_ETH_TX_OFFLOAD_*`` or ``RTE_ETH_RX_OFFLOAD_*`` flags. +Any offload requested by an application must be within the device capabilities. +Any offload is disabled by default if it is not set in the parameter ``dev_conf->[rt]xmode.offloads`` to ``rte_eth_dev_configure()`` and ``[rt]x_conf->offloads`` to ``rte_eth_[rt]x_queue_setup()``. -If any offloading is enabled in ``rte_eth_dev_configure()`` by an application, -it is enabled on all queues no matter whether it is per-queue or -per-port type and no matter whether it is set or cleared in +If an application enables any offload in ``rte_eth_dev_configure()``, +it is enabled on all queues regardless of whether it is per-queue or +per-port type and regardless of whether it is set or cleared in ``[rt]x_conf->offloads`` to ``rte_eth_[rt]x_queue_setup()``. 
-If a per-queue offloading hasn't been enabled in ``rte_eth_dev_configure()``, -it can be enabled or disabled in ``rte_eth_[rt]x_queue_setup()`` for individual queue. -A newly added offloads in ``[rt]x_conf->offloads`` to ``rte_eth_[rt]x_queue_setup()`` input by application -is the one which hasn't been enabled in ``rte_eth_dev_configure()`` and is requested to be enabled -in ``rte_eth_[rt]x_queue_setup()``. It must be per-queue type, otherwise trigger an error log. +If a per-queue offload has not been enabled in ``rte_eth_dev_configure()``, +it can be enabled or disabled in ``rte_eth_[rt]x_queue_setup()`` for an individual queue. +A newly added offload in ``[rt]x_conf->offloads`` to ``rte_eth_[rt]x_queue_setup()`` input by the application +is one that has not been enabled in ``rte_eth_dev_configure()`` and is requested to be enabled +in ``rte_eth_[rt]x_queue_setup()``. It must be per-queue type; otherwise an error is logged. Poll Mode Driver API -------------------- @@ -335,44 +340,43 @@ Poll Mode Driver API Generalities ~~~~~~~~~~~~ -By default, all functions exported by a PMD are lock-free functions that are assumed -not to be invoked in parallel on different logical cores to work on the same target object. -For instance, a PMD receive function cannot be invoked in parallel on two logical cores to poll the same RX queue of the same port. -Of course, this function can be invoked in parallel by different logical cores on different RX queues. -It is the responsibility of the upper-level application to enforce this rule. +By default, all functions exported by a PMD are lock-free functions assumed +not to be invoked in parallel on different logical cores working on the same target object. +For instance, a PMD receive function cannot be invoked in parallel on two logical cores polling the same Rx queue of the same port. +This function can be invoked in parallel by different logical cores on different Rx queues. +The upper-level application must enforce this rule. 
-If needed, parallel accesses by multiple logical cores to shared queues can be explicitly protected by dedicated inline lock-aware functions +If needed, explicitly protect parallel accesses by multiple logical cores to shared queues using dedicated inline lock-aware functions built on top of their corresponding lock-free functions of the PMD API. Generic Packet Representation ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -A packet is represented by an rte_mbuf structure, which is a generic metadata structure containing all necessary housekeeping information. -This includes fields and status bits corresponding to offload hardware features, such as checksum computation of IP headers or VLAN tags. +An rte_mbuf structure represents a packet. This generic metadata structure contains all necessary housekeeping information, +including fields and status bits corresponding to offload hardware features, such as checksum computation of IP headers or VLAN tags. The rte_mbuf data structure includes specific fields to represent, in a generic way, the offload features provided by network controllers. -For an input packet, most fields of the rte_mbuf structure are filled in by the PMD receive function with the information contained in the receive descriptor. -Conversely, for output packets, most fields of rte_mbuf structures are used by the PMD transmit function to initialize transmit descriptors. +For an input packet, the PMD receive function fills in most fields of the rte_mbuf structure with information contained in the receive descriptor. +Conversely, for output packets, the PMD transmit function uses most fields of rte_mbuf structures to initialize transmit descriptors. See :doc:`../mbuf_lib` chapter for more details. Ethernet Device API ~~~~~~~~~~~~~~~~~~~ -The Ethernet device API exported by the Ethernet PMDs is described in the *DPDK API Reference*. +The *DPDK API Reference* describes the Ethernet device API exported by the Ethernet PMDs. .. 
_ethernet_device_standard_device_arguments: Ethernet Device Standard Device Arguments ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -Standard Ethernet device arguments allow for a set of commonly used arguments/ -parameters which are applicable to all Ethernet devices to be available to for -specification of specific device and for passing common configuration -parameters to those ports. +Standard Ethernet device arguments provide a set of commonly used arguments/ +parameters applicable to all Ethernet devices. Use these arguments/parameters to +specify specific devices and pass common configuration parameters to those ports. -* ``representor`` for a device which supports the creation of representor ports - this argument allows user to specify which switch ports to enable port +* Use ``representor`` for a device that supports creating representor ports. + This argument allows the user to specify which switch ports to enable port representors for:: -a DBDF,representor=vf0 @@ -380,8 +384,8 @@ parameters to those ports. -a DBDF,representor=vf[0-31] -a DBDF,representor=vf[0,2-4,7,9-11] - These examples will attach VF representors relative to DBDF. - The VF IDs can be a list, a range or a mix. + These examples attach VF representors relative to DBDF. + The VF IDs can be a list, a range, or a mix. SF representors follow the same syntax:: -a DBDF,representor=sf0 @@ -389,47 +393,47 @@ parameters to those ports. -a DBDF,representor=sf[0-1023] -a DBDF,representor=sf[0,2-4,7,9-11] - If there are multiple PFs associated with the same PCI device, - the PF ID must be used to distinguish between representors relative to different PFs:: + If multiple PFs are associated with the same PCI device, + use the PF ID to distinguish between representors relative to different PFs:: -a DBDF,representor=pf1vf0 -a DBDF,representor=pf[0-1]vf0 - The example above will attach 4 representors pf0vf0, pf1vf0, pf0 and pf1. 
- If only VF representors are required, the PF part must be enclosed with parentheses:: + The example above attaches 4 representors pf0vf0, pf1vf0, pf0, and pf1. + If only VF representors are required, enclose the PF part in parentheses:: -a DBDF,representor=(pf[0-1])vf0 - The example above will attach 2 representors pf0vf0, pf1vf0. + The example above attaches 2 representors pf0vf0 and pf1vf0. - List of representors for the same PCI device is enclosed in square brackets:: + Enclose the list of representors for the same PCI device in square brackets:: -a DBDF,representor=[pf[0-1],pf2vf[0-2],pf3[3,5-8]] - Note: PMDs may have additional extensions for the representor parameter, and users - should consult the relevant PMD documentation to see support devargs. + Note: PMDs may have additional extensions for the representor parameter. Consult + the relevant PMD documentation for supported devargs. Extended Statistics API ~~~~~~~~~~~~~~~~~~~~~~~ -The extended statistics API allows a PMD to expose all statistics that are -available to it, including statistics that are unique to the device. -Each statistic has three properties ``name``, ``id`` and ``value``: +The extended statistics API allows a PMD to expose all available statistics, +including statistics unique to the device. +Each statistic has three properties: ``name``, ``id``, and ``value``: -* ``name``: A human readable string formatted by the scheme detailed below. +* ``name``: A human-readable string formatted by the scheme detailed below. * ``id``: An integer that represents only that statistic. -* ``value``: A unsigned 64-bit integer that is the value of the statistic. +* ``value``: An unsigned 64-bit integer that is the value of the statistic. -Note that extended statistic identifiers are -driver-specific, and hence might not be the same for different ports. -The API consists of various ``rte_eth_xstats_*()`` functions, and allows an -application to be flexible in how it retrieves statistics. 
+Note that extended statistic identifiers are driver-specific, +and therefore might not be the same for different ports. +The API consists of various ``rte_eth_xstats_*()`` functions and provides +applications flexibility in how they retrieve statistics. Scheme for Human Readable Names ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ -A naming scheme exists for the strings exposed to clients of the API. This is -to allow scraping of the API for statistics of interest. The naming scheme uses +A naming scheme governs the strings exposed to clients of the API. This scheme +allows scraping of the API for statistics of interest. The naming scheme uses strings split by a single underscore ``_``. The scheme is as follows: * direction @@ -438,69 +442,67 @@ strings split by a single underscore ``_``. The scheme is as follows: * detail n * unit -Examples of common statistics xstats strings, formatted to comply to the scheme +Examples of common statistics xstats strings, formatted to comply with the scheme proposed above: * ``rx_bytes`` * ``rx_crc_errors`` * ``tx_multicast_packets`` -The scheme, although quite simple, allows flexibility in presenting and reading +The scheme, although simple, provides flexibility in presenting and reading information from the statistic strings. The following example illustrates the -naming scheme:``rx_packets``. In this example, the string is split into two -components. The first component ``rx`` indicates that the statistic is -associated with the receive side of the NIC. The second component ``packets`` +naming scheme: ``rx_packets``. In this example, the string splits into two +components. The first component ``rx`` indicates that the statistic +is associated with the receive side of the NIC. The second component ``packets`` indicates that the unit of measure is packets. A more complicated example: ``tx_size_128_to_255_packets``. 
In this example, -``tx`` indicates transmission, ``size`` is the first detail, ``128`` etc are +``tx`` indicates transmission, ``size`` is the first detail, ``128`` etc. are more details, and ``packets`` indicates that this is a packet counter. Some additions in the metadata scheme are as follows: * If the first part does not match ``rx`` or ``tx``, the statistic does not - have an affinity with either receive of transmit. + have an affinity with either receive or transmit. * If the first letter of the second part is ``q`` and this ``q`` is followed by a number, this statistic is part of a specific queue. -An example where queue numbers are used is as follows: ``tx_q7_bytes`` which -indicates this statistic applies to queue number 7, and represents the number +An example where queue numbers are used: ``tx_q7_bytes`` indicates this statistic applies to queue number 7 and represents the number of transmitted bytes on that queue. API Design ^^^^^^^^^^ -The xstats API uses the ``name``, ``id``, and ``value`` to allow performant -lookup of specific statistics. Performant lookup means two things; +The xstats API uses ``name``, ``id``, and ``value`` to allow performant +lookup of specific statistics. Performant lookup means two things: -* No string comparisons with the ``name`` of the statistic in fast-path -* Allow requesting of only the statistics of interest +* No string comparisons with the ``name`` of the statistic in the fast path +* Allow requesting only the statistics of interest -The API ensures these requirements are met by mapping the ``name`` of the -statistic to a unique ``id``, which is used as a key for lookup in the fast-path. -The API allows applications to request an array of ``id`` values, so that the -PMD only performs the required calculations. Expected usage is that the -application scans the ``name`` of each statistic, and caches the ``id`` -if it has an interest in that statistic. 
On the fast-path, the integer can be used +The API meets these requirements by mapping the ``name`` of the +statistic to a unique ``id``, which serves as a key for lookup in the fast path. +The API allows applications to request an array of ``id`` values, so the +PMD only performs the required calculations. The expected usage is that the +application scans the ``name`` of each statistic and caches the ``id`` +if it has an interest in that statistic. On the fast path, the integer can be used to retrieve the actual ``value`` of the statistic that the ``id`` represents. API Functions ^^^^^^^^^^^^^ -The API is built out of a small number of functions, which can be used to -retrieve the number of statistics and the names, IDs and values of those -statistics. +The API is built from a small number of functions, which retrieve the number of statistics +and the names, IDs, and values of those statistics. -* ``rte_eth_xstats_get_names_by_id()``: returns the names of the statistics. When given a - ``NULL`` parameter the function returns the number of statistics that are available. +* ``rte_eth_xstats_get_names_by_id()``: Returns the names of the statistics. When given a + ``NULL`` parameter, the function returns the number of available statistics. * ``rte_eth_xstats_get_id_by_name()``: Searches for the statistic ID that matches - ``xstat_name``. If found, the ``id`` integer is set. + ``xstat_name``. If found, sets the ``id`` integer. * ``rte_eth_xstats_get_by_id()``: Fills in an array of ``uint64_t`` values - with matching the provided ``ids`` array. If the ``ids`` array is NULL, it - returns all statistics that are available. + matching the provided ``ids`` array. If the ``ids`` array is NULL, it + returns all available statistics. Application Usage @@ -509,11 +511,11 @@ Application Usage Imagine an application that wants to view the dropped packet count. If no packets are dropped, the application does not read any other metrics for performance reasons. 
If packets are dropped, the application has a particular -set of statistics that it requests. This "set" of statistics allows the app to -decide what next steps to perform. The following code-snippets show how the -xstats API can be used to achieve this goal. +set of statistics that it requests. This "set" of statistics allows the application to +decide what next steps to perform. The following code snippets show how the +xstats API achieves this goal. -First step is to get all statistics names and list them: +The first step is to get all statistics names and list them: .. code-block:: c @@ -557,9 +559,9 @@ First step is to get all statistics names and list them: printf("%s: %"PRIu64"\n", xstats_names[i].name, values[i]); } -The application has access to the names of all of the statistics that the PMD -exposes. The application can decide which statistics are of interest, cache the -ids of those statistics by looking up the name as follows: +The application has access to the names of all statistics that the PMD +exposes. The application can decide which statistics are of interest and cache the +IDs of those statistics by looking up the name as follows: .. code-block:: c @@ -576,10 +578,8 @@ ids of those statistics by looking up the name as follows: goto err; } -The API provides flexibility to the application so that it can look up multiple -statistics using an array containing multiple ``id`` numbers. This reduces the -function call overhead of retrieving statistics, and makes lookup of multiple -statistics simpler for the application. +The API allows the application to look up multiple statistics using an array containing multiple ``id`` numbers. +This reduces function call overhead when retrieving statistics and simplifies looking up multiple statistics. .. code-block:: c @@ -597,12 +597,12 @@ statistics simpler for the application. 
} -This array lookup API for xstats allows the application create multiple -"groups" of statistics, and look up the values of those IDs using a single API -call. As an end result, the application is able to achieve its goal of -monitoring a single statistic ("rx_errors" in this case), and if that shows +This array lookup API for xstats allows the application to create multiple +"groups" of statistics and look up the values of those IDs using a single API +call. As a result, the application achieves its goal of +monitoring a single statistic (in this case, "rx_errors"). If that shows packets being dropped, it can easily retrieve a "set" of statistics using the -IDs array parameter to ``rte_eth_xstats_get_by_id`` function. +IDs array parameter to the ``rte_eth_xstats_get_by_id`` function. NIC Reset API ~~~~~~~~~~~~~ @@ -611,84 +611,81 @@ NIC Reset API int rte_eth_dev_reset(uint16_t port_id); -Sometimes a port has to be reset passively. For example when a PF is -reset, all its VFs should also be reset by the application to make them -consistent with the PF. A DPDK application also can call this function -to trigger a port reset. Normally, a DPDK application would invokes this -function when an RTE_ETH_EVENT_INTR_RESET event is detected. +Sometimes a port must be reset passively. For example, when a PF is +reset, the application should also reset all its VFs to maintain consistency +with the PF. A DPDK application can also call this function +to trigger a port reset. Normally, a DPDK application invokes this +function when it detects an RTE_ETH_EVENT_INTR_RESET event. -It is the duty of the PMD to trigger RTE_ETH_EVENT_INTR_RESET events and -the application should register a callback function to handle these -events. When a PMD needs to trigger a reset, it can trigger an +The PMD triggers RTE_ETH_EVENT_INTR_RESET events. +The application should register a callback function to handle these +events. 
When a PMD needs to trigger a reset, it triggers an RTE_ETH_EVENT_INTR_RESET event. On receiving an RTE_ETH_EVENT_INTR_RESET -event, applications can handle it as follows: Stop working queues, stop +event, applications should: stop working queues, stop calling Rx and Tx functions, and then call rte_eth_dev_reset(). For -thread safety all these operations should be called from the same thread. +thread safety, call all these operations from the same thread. -For example when PF is reset, the PF sends a message to notify VFs of -this event and also trigger an interrupt to VFs. Then in the interrupt -service routine the VFs detects this notification message and calls +For example, when a PF is reset, it sends a message to notify VFs of +this event and also triggers an interrupt to VFs. Then, in the interrupt +service routine, the VFs detect this notification message and call rte_eth_dev_callback_process(dev, RTE_ETH_EVENT_INTR_RESET, NULL). This means that a PF reset triggers an RTE_ETH_EVENT_INTR_RESET -event within VFs. The function rte_eth_dev_callback_process() will -call the registered callback function. The callback function can trigger -the application to handle all operations the VF reset requires including +event within VFs. The function rte_eth_dev_callback_process() +calls the registered callback function. The callback function can trigger +the application to handle all operations the VF reset requires, including stopping Rx/Tx queues and calling rte_eth_dev_reset(). -The rte_eth_dev_reset() itself is a generic function which only does -some hardware reset operations through calling dev_unint() and -dev_init(), and itself does not handle synchronization, which is handled -by application. +The rte_eth_dev_reset() function is a generic function that only performs hardware reset operations by calling dev_uninit() and +dev_init(). It does not handle synchronization; the application handles that. -The PMD itself should not call rte_eth_dev_reset(). 
The PMD can trigger -the application to handle reset event. It is duty of application to -handle all synchronization before it calls rte_eth_dev_reset(). +The PMD should not call rte_eth_dev_reset(). The PMD can trigger +the application to handle the reset event. The application must +handle all synchronization before calling rte_eth_dev_reset(). The above error handling mode is known as ``RTE_ETH_ERROR_HANDLE_MODE_PASSIVE``. Proactive Error Handling Mode ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -This mode is known as ``RTE_ETH_ERROR_HANDLE_MODE_PROACTIVE``, -different from the application invokes recovery in PASSIVE mode, -the PMD automatically recovers from error in PROACTIVE mode, -and only a small amount of work is required for the application. +This mode is known as ``RTE_ETH_ERROR_HANDLE_MODE_PROACTIVE``, which +differs from PASSIVE mode where the application invokes recovery. +In PROACTIVE mode, the PMD automatically recovers from errors, +and the application requires only minimal handling. During error detection and automatic recovery, the PMD sets the data path pointers to dummy functions -(which will prevent the crash), -and also make sure the control path operations fail with a return code ``-EBUSY``. +(which prevent crashes) +and ensures control path operations fail with return code ``-EBUSY``. Because the PMD recovers automatically, -the application can only sense that the data flow is disconnected for a while -and the control API returns an error in this period. +the application only senses that the data flow is disconnected for a while +and that the control API returns an error during this period. -In order to sense the error happening/recovering, -as well as to restore some additional configuration, +To sense error occurrence and recovery, +as well as to restore additional configuration, three events are available: ``RTE_ETH_EVENT_ERR_RECOVERING`` - Notify the application that an error is detected - and the recovery is being started. 
+ Notifies the application that an error is detected + and recovery is beginning. Upon receiving the event, the application should not invoke - any control path function until receiving + any control path function until receiving the ``RTE_ETH_EVENT_RECOVERY_SUCCESS`` or ``RTE_ETH_EVENT_RECOVERY_FAILED`` event. .. note:: Before the PMD reports the recovery result, - the PMD may report the ``RTE_ETH_EVENT_ERR_RECOVERING`` event again, - because a larger error may occur during the recovery. + it may report the ``RTE_ETH_EVENT_ERR_RECOVERING`` event again + because a larger error may occur during recovery. ``RTE_ETH_EVENT_RECOVERY_SUCCESS`` - Notify the application that the recovery from error is successful, - the PMD already re-configures the port, + Notifies the application that recovery from the error was successful. + The PMD has reconfigured the port, and the effect is the same as a restart operation. ``RTE_ETH_EVENT_RECOVERY_FAILED`` - Notify the application that the recovery from error failed, - the port should not be usable anymore. + Notifies the application that recovery from the error failed. + The port should not be usable anymore. The application should close the port. -The error handling mode supported by the PMD can be reported through -``rte_eth_dev_info_get``. +Query the error handling mode supported by the PMD using ``rte_eth_dev_info_get()``. -- 2.51.0 ^ permalink raw reply related [flat|nested] 18+ messages in thread
* [PATCH v2 2/8] doc: correct grammar in flow guide 2026-01-28 19:45 ` [PATCH v2 0/8] doc: ethdev documentation grammar and typo corrections Stephen Hemminger 2026-01-28 19:46 ` [PATCH v2 1/8] doc: correct grammar and punctuation errors in ethdev guide Stephen Hemminger @ 2026-01-28 19:46 ` Stephen Hemminger 2026-01-28 19:46 ` [PATCH v2 3/8] doc: correct grammar in QoS framework guide Stephen Hemminger ` (6 subsequent siblings) 8 siblings, 0 replies; 18+ messages in thread From: Stephen Hemminger @ 2026-01-28 19:46 UTC (permalink / raw) To: dev; +Cc: Stephen Hemminger, Ori Kam Clarify several sections of the generic flow API documentation for better readability and technical accuracy. Fix terminology to use consistent DPDK conventions and improve explanations of complex concepts like transfer flows, indirect actions, and template tables. Changes: - Remove duplicate article "an" in phrase "a single an item". - Replace "can not" with "cannot" throughout document - Clarify that transfer flows use PORT_REPRESENTOR and REPRESENTED_PORT items instead of direction attributes - Improve explanation of indirect action persistence behavior - Fix grammar in flow isolated mode section - Standardize terminology for "ethdev" vs "port ID" - Clarify template table specialization description - Fix minor punctuation and spacing issues Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> --- doc/guides/prog_guide/ethdev/flow_offload.rst | 58 ++++++++++--------- 1 file changed, 30 insertions(+), 28 deletions(-) diff --git a/doc/guides/prog_guide/ethdev/flow_offload.rst b/doc/guides/prog_guide/ethdev/flow_offload.rst index 1cd904e1ee..d5cca86c82 100644 --- a/doc/guides/prog_guide/ethdev/flow_offload.rst +++ b/doc/guides/prog_guide/ethdev/flow_offload.rst @@ -109,7 +109,7 @@ However, ``rte_eth_dev_configure()`` may fail if any rules remain, so the application must flush them before attempting a reconfiguration. 
Keeping may be unsupported for some types of rule items and actions, as well as depending on the value of flow attributes transfer bit. -A combination of a single an item or action type +A combination of a single item or action type and a value of the transfer bit is called a rule feature. For example: a COUNT action with the transfer bit set. To test if rules with a particular feature are kept, the application must try @@ -198,15 +198,15 @@ Attribute: Transfer ^^^^^^^^^^^^^^^^^^^ Instead of simply matching the properties of traffic as it would appear on a -given DPDK port ID, enabling this attribute transfers a flow rule to the +given ethdev, enabling this attribute transfers a flow rule to the lowest possible level of any device endpoints found in the pattern. When supported, this effectively enables an application to reroute traffic not necessarily intended for it (e.g. coming from or addressed to different physical ports, VFs or applications) at the device level. -In "transfer" flows, the use of `Attribute: Traffic direction`_ in not allowed. -One may use `Item: PORT_REPRESENTOR`_ and `Item: REPRESENTED_PORT`_ instead. +In "transfer" flows, the use of `Attribute: Traffic direction`_ is not allowed. +Use `Item: PORT_REPRESENTOR`_ and `Item: REPRESENTED_PORT`_ instead. Pattern item ~~~~~~~~~~~~ @@ -535,11 +535,11 @@ Item: ``PORT_ID`` ^^^^^^^^^^^^^^^^^ This item is deprecated. Consider: - - `Item: PORT_REPRESENTOR`_ - - `Item: REPRESENTED_PORT`_ -Matches traffic originating from (ingress) or going to (egress) a given DPDK -port ID. +- `Item: PORT_REPRESENTOR`_ +- `Item: REPRESENTED_PORT`_ + +Matches traffic originating from (ingress) or going to (egress) a given ethdev. Normally only supported if the port ID in question is known by the underlying PMD and related to the device the flow rule is created against. @@ -566,7 +566,7 @@ Item: ``MARK`` Matches an arbitrary integer value which was set using the ``MARK`` action in a previously matched rule. 
-This item can only specified once as a match criteria as the ``MARK`` action can +This item can only be specified once as a match criteria as the ``MARK`` action can only be specified once in a flow action. Note the value of MARK field is arbitrary and application defined. @@ -2058,11 +2058,11 @@ Action: ``PF`` ^^^^^^^^^^^^^^ This action is deprecated. Consider: - - `Action: PORT_REPRESENTOR`_ - - `Action: REPRESENTED_PORT`_ -Directs matching traffic to the physical function (PF) of the current -device. +- `Action: PORT_REPRESENTOR`_ +- `Action: REPRESENTED_PORT`_ + +Directs matching traffic to the physical function (PF) of the current device. - No configurable properties. @@ -2080,8 +2080,9 @@ Action: ``VF`` ^^^^^^^^^^^^^^ This action is deprecated. Consider: - - `Action: PORT_REPRESENTOR`_ - - `Action: REPRESENTED_PORT`_ + +- `Action: PORT_REPRESENTOR`_ +- `Action: REPRESENTED_PORT`_ Directs matching traffic to a given virtual function of the current device. @@ -2105,8 +2106,9 @@ rule or if packets are not addressed to a VF in the first place. Action: ``PORT_ID`` ^^^^^^^^^^^^^^^^^^^ This action is deprecated. Consider: - - `Action: PORT_REPRESENTOR`_ - - `Action: REPRESENTED_PORT`_ + +- `Action: PORT_REPRESENTOR`_ +- `Action: REPRESENTED_PORT`_ Directs matching traffic to a given DPDK port ID. @@ -2336,8 +2338,8 @@ VXLAN tunnel as defined in the``rte_flow_action_vxlan_encap`` flow items definition. This action modifies the payload of matched flows. The flow definition specified -in the ``rte_flow_action_tunnel_encap`` action structure must define a valid -VLXAN network overlay which conforms with RFC 7348 (Virtual eXtensible Local +in the ``rte_flow_action_vxlan_encap`` action structure must define a valid +VXLAN network overlay which conforms with RFC 7348 (Virtual eXtensible Local Area Network (VXLAN): A Framework for Overlaying Virtualized Layer 2 Networks over Layer 3 Networks). The pattern must be terminated with the RTE_FLOW_ITEM_TYPE_END item type. 
@@ -2391,7 +2393,7 @@ NVGRE tunnel as defined in the``rte_flow_action_tunnel_encap`` flow item definition. This action modifies the payload of matched flows. The flow definition specified -in the ``rte_flow_action_tunnel_encap`` action structure must defined a valid +in the ``rte_flow_action_tunnel_encap`` action structure must define a valid NVGRE network overlay which conforms with RFC 7637 (NVGRE: Network Virtualization Using Generic Routing Encapsulation). The pattern must be terminated with the RTE_FLOW_ITEM_TYPE_END item type. @@ -2967,10 +2969,10 @@ The indirect action specified data (e.g. counter) can be queried by The following description of indirect action persistence is an experimental behavior that may change without a prior notice. -If ``RTE_ETH_DEV_CAPA_FLOW_SHARED_OBJECT_KEEP`` is not advertised, -indirect actions cannot be created until the device is started for the first time -and cannot be kept when the device is stopped. -However, PMD also does not flush them automatically on stop, +If the ``RTE_ETH_DEV_CAPA_FLOW_SHARED_OBJECT_KEEP`` capability is not advertised, +indirect actions cannot be created until the device starts for the first time +and are not kept when the device is stopped. +However, the PMD also does not flush them automatically on stop, so the application must call ``rte_flow_action_handle_destroy()`` before stopping the device to ensure no indirect actions remain. @@ -3010,7 +3012,7 @@ Indirect API creates a shared flow action with unique action handle. Flow rules can access the shared flow action and resources related to that action through the indirect action handle. In addition, the API allows to update existing shared flow action configuration. -After the update completes, new action configuration +After the update completes, the new action configuration is available to all flows that reference that shared action. 
Indirect actions list expands the indirect action API: @@ -3020,7 +3022,7 @@ Indirect actions list expands the indirect action API: single action only. Input flow actions arranged in END terminated list. -- Flow rule can provide rule specific configuration parameters to +- Flow rules can provide rule-specific configuration parameters to existing shared handle. Updates of flow rule specific configuration will not change the base action configuration. @@ -3834,8 +3836,8 @@ Group Miss Actions ~~~~~~~~~~~~~~~~~~ In an application, many flow rules share common group attributes, meaning they can be grouped and -classified together. A user can explicitly specify a set of actions performed on a packet when it -did not match any flows rules in a group using the following API: +classified together. A user can explicitly specify a set of actions performed on a packet when +it did not match any flow rules in a group using the following API: .. code-block:: c -- 2.51.0 ^ permalink raw reply related [flat|nested] 18+ messages in thread
* [PATCH v2 3/8] doc: correct grammar in QoS framework guide 2026-01-28 19:45 ` [PATCH v2 0/8] doc: ethdev documentation grammar and typo corrections Stephen Hemminger 2026-01-28 19:46 ` [PATCH v2 1/8] doc: correct grammar and punctuation errors in ethdev guide Stephen Hemminger 2026-01-28 19:46 ` [PATCH v2 2/8] doc: correct grammar in flow guide Stephen Hemminger @ 2026-01-28 19:46 ` Stephen Hemminger 2026-01-28 19:46 ` [PATCH v2 4/8] doc: correct typos in switch representation guide Stephen Hemminger ` (5 subsequent siblings) 8 siblings, 0 replies; 18+ messages in thread From: Stephen Hemminger @ 2026-01-28 19:46 UTC (permalink / raw) To: dev; +Cc: Stephen Hemminger, Cristian Dumitrescu Add missing article "a" in phrase "As result" which should be "As a result" in four locations including table cell descriptions. Improve readability and correct grammatical issues: - Correct missing word "The" in dropper description - Add missing article in EWMA filter description - Correct run-on sentence in traffic metering section - Improve clarity of color aware mode explanation - Correct inconsistent hyphenation in "run-time" Correct various issues in the QoS framework documentation: - Correct "an so on" to "and so on" - Correct "weights weights" duplicate word - Correct duplicate "the" in token bucket operations description - Correct "Pipelevel" to "Pipe level" for consistency - Correct "out performs" to "outperforms" - Correct "They could made" grammar error to "They could be made" - Correct inconsistent spacing in section references - Update hardcoded section numbers to use proper internal references Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> --- .../prog_guide/ethdev/qos_framework.rst | 59 ++++++++++--------- 1 file changed, 30 insertions(+), 29 deletions(-) diff --git a/doc/guides/prog_guide/ethdev/qos_framework.rst b/doc/guides/prog_guide/ethdev/qos_framework.rst index 1144037dfa..0f61264ccc 100644 --- a/doc/guides/prog_guide/ethdev/qos_framework.rst 
+++ b/doc/guides/prog_guide/ethdev/qos_framework.rst @@ -189,7 +189,7 @@ The functionality of each hierarchical level is detailed in the following table. | | | | called Best Effort (BE), has 4 queues. | | | | | | | | | | #. Queues of the lowest priority TC (BE) are serviced using | - | | | | Weighted Round Robin (WRR) according to predefined weights| + | | | | Weighted Round Robin (WRR) according to predefined | | | | | weights. | | | | | | +---+--------------------+----------------------------+---------------------------------------------------------------+ @@ -368,7 +368,7 @@ The expected drop in performance is due to: #. Need to make the queue and bitmap operations thread safe, which requires either using locking primitives for access serialization (for example, spinlocks/ semaphores) or - using atomic primitives for lockless access (for example, Test and Set, Compare And Swap, an so on). + using atomic primitives for lockless access (for example, Test and Set, Compare And Swap, and so on). The impact is much higher in the former case. #. Ping-pong of cache lines storing the shared data structures between the cache hierarchies of the two cores @@ -621,7 +621,7 @@ The token bucket generic parameters and operations are presented in :numref:`tab | | | while the bucket is full are dropped. | | | | | +---+------------------------+------------------------------------------------------------------------------+ - | 3 | Credit consumption | As result of packet scheduling, the necessary number of credits is removed | + | 3 | Credit consumption | As a result of packet scheduling, the necessary number of credits is removed | | | | from the bucket. The packet can only be sent if enough credits are in the | | | | bucket to send the full packet (packet bytes and framing overhead for the | | | | packet). | @@ -681,7 +681,7 @@ where, r = port line rate (in bytes per second). 
+---+-------------------------+-----------------------------------------------------------------------------+ | 2 | Credit update | Credit update options: | | | | | - | | | * Every time a packet is sent for a port, update the credits of all the | + | | | * Every time a packet is sent for a port, update the credits of all | | | | the subports and pipes of that port. Not feasible. | | | | | | | | * Every time a packet is sent, update the credits for the pipe and | @@ -716,7 +716,7 @@ where, r = port line rate (in bytes per second). | | | * tb_time += n_periods * tb_period; | | | | | +---+-------------------------+-----------------------------------------------------------------------------+ - | 3 | Credit consumption | As result of packet scheduling, the necessary number of credits is removed | + | 3 | Credit consumption | As a result of packet scheduling, the necessary number of credits is removed| | | (on packet scheduling) | from the bucket. The packet can only be sent if enough credits are in the | | | | bucket to send the full packet (packet bytes and framing overhead for the | | | | packet). | @@ -805,7 +805,7 @@ as described in :numref:`table_qos_10` and :numref:`table_qos_11`. | | | } | | | | | +---+--------------------------+----------------------------------------------------------------------------+ - | 3 | Credit consumption | As result of packet scheduling, the TC limit is decreased with the | + | 3 | Credit consumption | As a result of packet scheduling, the TC limit is decreased with the | | | (on packet scheduling) | necessary number of credits. The packet can only be sent if enough credits | | | | are currently available in the TC limit to send the full packet | | | | (packet bytes and framing overhead for the packet). 
| @@ -1042,7 +1042,7 @@ The highest value for the watermark is picked as the highest rate configured for | | | | | | | } | | | | | - | | | **Pipelevel:** | + | | | **Pipe level:** | | | | | | | | if(pipe_period_id != subport_period_id) | | | | | @@ -1163,7 +1163,7 @@ Droppers The purpose of the DPDK dropper is to drop packets arriving at a packet scheduler to avoid congestion. The dropper supports the Proportional Integral Controller Enhanced (PIE), Random Early Detection (RED), -Weighted Random Early Detection (WRED) and tail drop algorithms. +Weighted Random Early Detection (WRED), and tail drop algorithms. :numref:`figure_blk_diag_dropper` illustrates how the dropper integrates with the scheduler. The DPDK currently does not support congestion management so the dropper provides the only method for congestion avoidance. @@ -1175,7 +1175,7 @@ so the dropper provides the only method for congestion avoidance. High-level Block Diagram of the DPDK Dropper -The dropper uses one of two congestion avoidance algorithms: +The dropper uses one of the following congestion avoidance algorithms: - the Random Early Detection (RED) as documented in the reference publication. - the Proportional Integral Controller Enhanced (PIE) as documented in RFC8033 publication. @@ -1196,7 +1196,7 @@ In the case of severe congestion, the dropper resorts to tail drop. This occurs when a packet queue has reached maximum capacity and cannot store any more packets. In this situation, all arriving packets are dropped. -The flow through the dropper is illustrated in :numref:`figure_flow_tru_dropper`. +The flow through the dropper is illustrated in :numref:`figure_flow_tru_dropper`, The RED/WRED/PIE algorithm is exercised first and tail drop second. .. _figure_flow_tru_dropper: @@ -1205,7 +1205,7 @@ The RED/WRED/PIE algorithm is exercised first and tail drop second. Flow Through the Dropper -The PIE algorithm periodically updates the drop probability based on the latency samples. 
+The PIE algorithm periodically updates the drop probability based on latency samples. The current latency sample but also analyze whether the latency is trending up or down. This is the classical Proportional Integral (PI) controller method, which is known for eliminating steady state errors. @@ -1226,7 +1226,7 @@ The use cases supported by the dropper are: * * Mark empty (record the time at which a packet queue becomes empty) -The configuration use case is explained in :ref:`Section2.23.3.1 <Configuration>`, +The configuration use case is explained in :ref:`Section 2.23.3.1 <Configuration>`, the enqueue operation is explained in :ref:`Section 2.23.3.2 <Enqueue_Operation>` and the mark empty operation is explained in :ref:`Section 2.23.3.3 <Queue_Empty_Operation>`. @@ -1262,7 +1262,7 @@ The meaning of these parameters is explained in more detail in the following sec The format of these parameters as specified to the dropper module API corresponds to the format used by Cisco* in their RED implementation. The minimum and maximum threshold parameters are specified to the dropper module in terms of number of packets. -The mark probability parameter is specified as an inverse value, for example, +The mark probability parameter is specified as an inverse value; for example, an inverse mark probability parameter value of 10 corresponds to a mark probability of 1/10 (that is, 1 in 10 packets will be dropped). The EWMA filter weight parameter is specified as an inverse log value, @@ -1278,7 +1278,7 @@ A PIE configuration contains the parameters given in :numref:`table_qos_16a`. 
| Parameter | Minimum | Maximum | Default | | | | | | +==========================+=========+=========+==================+ - | Queue delay reference | 1 | uint16 | 15 | + | Queue delay reference | 1 | uint16_t| 15 | | Latency Target Value | | | | | Unit: ms | | | | +--------------------------+---------+---------+------------------+ @@ -1286,7 +1286,7 @@ A PIE configuration contains the parameters given in :numref:`table_qos_16a`. | Unit: ms | | | | +--------------------------+---------+---------+------------------+ | Tail Drop Threshold | 1 | uint16 | 64 | - | Unit: bytes | | | | + | Unit: number of packets | | | | +--------------------------+---------+---------+------------------+ | Period to calculate | 1 | uint16 | 15 | | drop probability | | | | @@ -1294,8 +1294,9 @@ A PIE configuration contains the parameters given in :numref:`table_qos_16a`. +--------------------------+---------+---------+------------------+ The meaning of these parameters is explained in more detail in the next sections. -The format of these parameters as specified to the dropper module API. -They could made self calculated for fine tuning, within the apps. +The format of these parameters is specified to the dropper module API +and can be fine-tuned within applications. +They could be made self-calculated for fine tuning within the apps. .. _Enqueue_Operation: @@ -1316,7 +1317,7 @@ decision is the output value and the remaining values are configuration paramete EWMA Filter Microblock ^^^^^^^^^^^^^^^^^^^^^^ -The purpose of the EWMA Filter microblock is to filter queue size values to smooth out transient changes +The purpose of the EWMA filter microblock is to filter queue size values to smooth out transient changes that result from "bursty" traffic. The output value is the average queue size which gives a more stable view of the current congestion level in the queue. 
@@ -1426,7 +1427,7 @@ These approaches include: * Large look-up table (76 KB) -The method that was finally selected (described above in Section 26.3.2.2.1) out performs all of these approaches +The method that was finally selected (described above in Section 26.3.2.2.1) outperforms all of these approaches in terms of run-time performance and memory requirements and also achieves accuracy comparable to floating-point evaluation. :numref:`table_qos_17` lists the performance of each of these alternative approaches relative to the method that is used in the dropper. @@ -1700,7 +1701,7 @@ The arguments passed to the enqueue API are configuration data, run-time data, the current size of the packet queue (in packets) and a value representing the current time. The time reference is in units of bytes, where a byte signifies the time duration required by the physical interface to send out a byte on the transmission medium -(see Section 26.2.4.5.1 "Internal Time Reference" ). +(see `Internal Time Reference`_). The dropper reuses the scheduler time stamps for performance reasons. Empty API @@ -1720,7 +1721,7 @@ Traffic Metering The traffic metering component implements the Single Rate Three Color Marker (srTCM) and Two Rate Three Color Marker (trTCM) algorithms, as defined by IETF RFC 2697 and 2698 respectively. These algorithms meter the stream of incoming packets based on the allowance defined in advance for each traffic flow. -As result, each incoming packet is tagged as green, +As a result, each incoming packet is tagged as green, yellow or red based on the monitored consumption of the flow the packet belongs to. 
Functional Overview @@ -1739,12 +1740,12 @@ with the two buckets sharing the same token update rate: The trTCM algorithm defines two token buckets for each traffic flow, with the two buckets being updated with tokens at independent rates: -* Committed (C) bucket: fed with tokens at the rate defined by the Committed Information Rate (CIR) parameter +* Committed (C) bucket: fed with tokens at the rate defined by Committed Information Rate (CIR) parameter (measured in bytes of IP packet per second). The size of the C bucket is defined by the Committed Burst Size (CBS) parameter (measured in bytes); -* Peak (P) bucket: fed with tokens at the rate defined by the Peak Information Rate (PIR) parameter - (measured in IP packet bytes per second). +* Peak (P) bucket: fed with tokens at the rate defined by Peak Information Rate (PIR) parameter + (measured in bytes of IP packet per second). The size of the P bucket is defined by the Peak Burst Size (PBS) parameter (measured in bytes). Please refer to RFC 2697 (for srTCM) and RFC 2698 (for trTCM) for details on how tokens are consumed @@ -1755,7 +1756,7 @@ Color Blind and Color Aware Modes For both algorithms, the color blind mode is functionally equivalent to the color aware mode with input color set as green. For color aware mode, a packet with red input color can only get the red output color, -while a packet with yellow input color can only get the yellow or red output colors. +while a packet with yellow input color can only get yellow or red output colors. The reason why the color blind mode is still implemented distinctly than the color aware mode is that color blind mode can be implemented with fewer operations than the color aware mode. @@ -1766,12 +1767,12 @@ Implementation Overview For each input packet, the steps for the srTCM / trTCM algorithms are: * Update the C and E / P token buckets. 
This is done by reading the current time (from the CPU timestamp counter), - identifying the amount of time since the last bucket update and computing the associated number of tokens - (according to the pre-configured bucket rate). + identifying the amount of time since the last bucket update, and computing the associated number of tokens + according to the pre-configured bucket rate. The number of tokens in the bucket is limited by the pre-configured bucket size; * Identify the output color for the current packet based on the size of the IP packet - and the amount of tokens currently available in the C and E / P buckets; for color aware mode only, + and the amount of tokens currently available in the C and E / P buckets. For color aware mode only, the input color of the packet is also considered. When the output color is not red, a number of tokens equal to the length of the IP packet are - subtracted from the C or E /P or both buckets, depending on the algorithm and the output color of the packet. + subtracted from the C or E / P or both buckets, depending on the algorithm and the output color of the packet. -- 2.51.0 ^ permalink raw reply related [flat|nested] 18+ messages in thread
* [PATCH v2 4/8] doc: correct typos in switch representation guide 2026-01-28 19:45 ` [PATCH v2 0/8] doc: ethdev documentation grammar and typo corrections Stephen Hemminger ` (2 preceding siblings ...) 2026-01-28 19:46 ` [PATCH v2 3/8] doc: correct grammar in QoS framework guide Stephen Hemminger @ 2026-01-28 19:46 ` Stephen Hemminger 2026-01-28 19:46 ` [PATCH v2 5/8] doc: correct typos in traffic management guide Stephen Hemminger ` (4 subsequent siblings) 8 siblings, 0 replies; 18+ messages in thread From: Stephen Hemminger @ 2026-01-28 19:46 UTC (permalink / raw) To: dev; +Cc: Stephen Hemminger, Thomas Monjalon, Andrew Rybchenko Two typos corrected: - "according on" to "according to" - "physical of virtual" to "physical or virtual" Correct minor grammatical issue in the switch representation guide: - Add missing article "a" before "large number of ports" Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> --- doc/guides/prog_guide/ethdev/switch_representation.rst | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/doc/guides/prog_guide/ethdev/switch_representation.rst b/doc/guides/prog_guide/ethdev/switch_representation.rst index 2ef2772afb..3d9d8d8d60 100644 --- a/doc/guides/prog_guide/ethdev/switch_representation.rst +++ b/doc/guides/prog_guide/ethdev/switch_representation.rst @@ -19,7 +19,7 @@ managed by the host system and fully transparent to users and applications. On the other hand, applications typically found on hypervisors that process layer 2 (L2) traffic (such as OVS) need to steer traffic themselves -according on their own criteria. +according to their own criteria. Without a standard software interface to manage traffic steering rules between VFs, SFs, PFs and the various physical ports of a given device, @@ -84,7 +84,7 @@ thought as a software "patch panel" front-end for applications. - Among other things, they can be used to assign MAC addresses to the resource they represent. 
-- Applications can tell port representors apart from other physical of virtual +- Applications can tell port representors apart from other physical or virtual port by checking the dev_flags field within their device information structure for the RTE_ETH_DEV_REPRESENTOR bit-field. @@ -124,7 +124,7 @@ thought as a software "patch panel" front-end for applications. - For some PMDs, memory usage of representors is huge when number of representor grows, mbufs are allocated for each descriptor of Rx queue. - Polling large number of ports brings more CPU load, cache miss and + Polling a large number of ports brings more CPU load, cache miss and latency. Shared Rx queue can be used to share Rx queue between PF and representors among same Rx domain. ``RTE_ETH_DEV_CAPA_RXQ_SHARE`` in device info is used to indicate the capability. Setting non-zero share -- 2.51.0 ^ permalink raw reply related [flat|nested] 18+ messages in thread
* [PATCH v2 5/8] doc: correct typos in traffic management guide 2026-01-28 19:45 ` [PATCH v2 0/8] doc: ethdev documentation grammar and typo corrections Stephen Hemminger ` (3 preceding siblings ...) 2026-01-28 19:46 ` [PATCH v2 4/8] doc: correct typos in switch representation guide Stephen Hemminger @ 2026-01-28 19:46 ` Stephen Hemminger 2026-01-28 19:46 ` [PATCH v2 6/8] doc: correct grammar and improve clarity in ethdev guide Stephen Hemminger ` (3 subsequent siblings) 8 siblings, 0 replies; 18+ messages in thread From: Stephen Hemminger @ 2026-01-28 19:46 UTC (permalink / raw) To: dev; +Cc: Stephen Hemminger, Cristian Dumitrescu Two documentation issues corrected: - Remove errant asterisk from "Head Drop*" - Remove duplicate phrase in WRED algorithm description Correct spelling error: "Weighed" to "Weighted" for consistency with standard industry terminology (Weighted Fair Queuing). Correct grammar and punctuation issues: - Add comma after "i.e." per standard usage - Correct "does meet the needs to" to "meets the needs of" - Add missing space before parenthesis - Simplify awkward phrasing "In case, when" - Add missing comma after "etc." - Correct subject-verb agreement "APIs supports" to "APIs support" - Remove extra space before comma in "Queuing , etc." 
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> --- .../prog_guide/ethdev/traffic_management.rst | 25 +++++++++---------- 1 file changed, 12 insertions(+), 13 deletions(-) diff --git a/doc/guides/prog_guide/ethdev/traffic_management.rst b/doc/guides/prog_guide/ethdev/traffic_management.rst index c356791a45..91c032f480 100644 --- a/doc/guides/prog_guide/ethdev/traffic_management.rst +++ b/doc/guides/prog_guide/ethdev/traffic_management.rst @@ -17,7 +17,7 @@ Main features: * Part of DPDK rte_ethdev API * Capability query API per port, per hierarchy level and per hierarchy node -* Scheduling algorithms: Strict Priority (SP), Weighed Fair Queuing (WFQ) +* Scheduling algorithms: Strict Priority (SP), Weighted Fair Queuing (WFQ) * Traffic shaping: single/dual rate, private (per node) and shared (by multiple nodes) shapers * Congestion management for hierarchy leaf nodes: algorithms of tail drop, head @@ -30,24 +30,24 @@ Main features: Capability API -------------- -The aim of these APIs is to advertise the capability information (i.e critical +The aim of these APIs is to advertise the capability information (i.e., critical parameter values) that the TM implementation (HW/SW) is able to support for the -application. The APIs supports the information disclosure at the TM level, at +application. The APIs support the information disclosure at the TM level, at any hierarchical level of the TM and at any node level of the specific -hierarchical level. Such information helps towards rapid understanding of -whether a specific implementation does meet the needs to the user application. +hierarchical level. Such information helps for rapid understanding of +whether a specific implementation meets the needs of the user application. 
At the TM level, users can get high level idea with the help of various parameters such as maximum number of nodes, maximum number of hierarchical levels, maximum number of shapers, maximum number of private shapers, type of -scheduling algorithm (Strict Priority, Weighted Fair Queuing , etc.), etc., +scheduling algorithm (Strict Priority, Weighted Fair Queuing, etc.), etc., supported by the implementation. Likewise, users can query the capability of the TM at the hierarchical level to have more granular knowledge about the specific level. The various parameters such as maximum number of nodes at the level, maximum number of leaf/non-leaf -nodes at the level, type of the shaper(dual rate, single rate) supported at -the level if node is non-leaf type etc., are exposed as a result of +nodes at the level, type of the shaper (dual rate, single rate) supported at +the level if node is non-leaf type, etc., are exposed as a result of hierarchical level capability query. Finally, the node level capability API offers knowledge about the capability @@ -66,7 +66,7 @@ level/position in the tree. The SP algorithm is used to schedule between sibling nodes with different priority, while WFQ is used to schedule between groups of siblings that have the same priority. -Algorithms such as Weighed Round Robin (WRR), byte-level WRR, Deficit WRR +Algorithms such as Weighted Round Robin (WRR), byte-level WRR, Deficit WRR (DWRR), etc are considered approximations of the ideal WFQ and are therefore assimilated to WFQ, although an associated implementation-dependent accuracy, performance and resource usage trade-off might exist. @@ -109,15 +109,14 @@ They are made available for every leaf node in the hierarchy, subject to the specific implementation supporting them. 
On request of writing a new packet into the current queue while the queue is full, the Tail Drop algorithm drops the new packet while leaving the queue -unmodified, as opposed to the Head Drop* algorithm, which drops the packet +unmodified, as opposed to the Head Drop algorithm, which drops the packet at the head of the queue (the oldest packet waiting in the queue) and admits the new packet at the tail of the queue. The Random Early Detection (RED) algorithm works by proactively dropping more and more input packets as the queue occupancy builds up. When the queue is full or almost full, RED effectively works as Tail Drop. The Weighted RED (WRED) -algorithm uses a separate set of RED thresholds for each packet color and uses -separate set of RED thresholds for each packet color. +algorithm uses a separate set of RED thresholds for each packet color. Each hierarchy leaf node with WRED enabled as its congestion management mode has zero or one private WRED context (only one leaf node using it) and/or zero, @@ -144,7 +143,7 @@ The TM APIs have been provided to support various types of packet marking such as VLAN DEI packet marking (IEEE 802.1Q), IPv4/IPv6 ECN marking of TCP and SCTP packets (IETF RFC 3168) and IPv4/IPv6 DSCP packet marking (IETF RFC 2597). All VLAN frames of a given color get their DEI bit set if marking is enabled -for this color. In case, when marking for a given color is not enabled, the +for this color. When marking for a given color is not enabled, the DEI bit is left as is (either set or not). All IPv4/IPv6 packets of a given color with ECN set to 2’b01 or 2’b10 carrying -- 2.51.0 ^ permalink raw reply related [flat|nested] 18+ messages in thread
* [PATCH v2 6/8] doc: correct grammar and improve clarity in ethdev guide 2026-01-28 19:45 ` [PATCH v2 0/8] doc: ethdev documentation grammar and typo corrections Stephen Hemminger ` (4 preceding siblings ...) 2026-01-28 19:46 ` [PATCH v2 5/8] doc: correct typos in traffic management guide Stephen Hemminger @ 2026-01-28 19:46 ` Stephen Hemminger 2026-03-25 11:41 ` Thomas Monjalon 2026-01-28 19:46 ` [PATCH v2 7/8] doc: correct alphabetical ordering in ethdev toctree Stephen Hemminger ` (2 subsequent siblings) 8 siblings, 1 reply; 18+ messages in thread From: Stephen Hemminger @ 2026-01-28 19:46 UTC (permalink / raw) To: dev; +Cc: Stephen Hemminger Improve readability and correct minor grammar issues in the ethdev documentation. Changes include: - Correct subject-verb agreement and article usage - Improve sentence structure for better clarity - Standardize terminology usage throughout - Remove redundant phrases - Correct punctuation in lists Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> --- doc/guides/prog_guide/ethdev/ethdev.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/doc/guides/prog_guide/ethdev/ethdev.rst b/doc/guides/prog_guide/ethdev/ethdev.rst index ffe3fb1416..7afcf0956d 100644 --- a/doc/guides/prog_guide/ethdev/ethdev.rst +++ b/doc/guides/prog_guide/ethdev/ethdev.rst @@ -7,7 +7,7 @@ Poll Mode Driver The Data Plane Development Kit (DPDK) supports a wide range of Ethernet speeds, from 10 Megabits to 400 Gigabits, depending on hardware capability. -DPDK's Poll Mode Drivers (PMDs) are high-performance, optimized drivers for various +DPDK Poll Mode Drivers (PMDs) are high-performance, optimized drivers for various network interface cards that bypass the traditional kernel network stack to reduce latency and improve throughput. 
They access Rx and Tx descriptors directly in a polling mode without relying on interrupts (except for Link Status Change notifications), enabling -- 2.51.0 ^ permalink raw reply related [flat|nested] 18+ messages in thread
* Re: [PATCH v2 6/8] doc: correct grammar and improve clarity in ethdev guide
  2026-01-28 19:46 ` [PATCH v2 6/8] doc: correct grammar and improve clarity in ethdev guide Stephen Hemminger
@ 2026-03-25 11:41   ` Thomas Monjalon
  0 siblings, 0 replies; 18+ messages in thread
From: Thomas Monjalon @ 2026-03-25 11:41 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: dev

28/01/2026 20:46, Stephen Hemminger:
> Improve readability and correct minor grammar issues in the ethdev
> documentation. Changes include:
>
> - Correct subject-verb agreement and article usage
> - Improve sentence structure for better clarity
> - Standardize terminology usage throughout
> - Remove redundant phrases
> - Correct punctuation in lists
>
> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
> ---
>  doc/guides/prog_guide/ethdev/ethdev.rst | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> -DPDK's Poll Mode Drivers (PMDs) are high-performance, optimized drivers for various
> +DPDK Poll Mode Drivers (PMDs) are high-performance, optimized drivers for various

This single change does not match the commit description.
I suppose I should just squash it in the commit changing the whole guide.

^ permalink raw reply	[flat|nested] 18+ messages in thread
* [PATCH v2 7/8] doc: correct alphabetical ordering in ethdev toctree
  2026-01-28 19:45 ` [PATCH v2 0/8] doc: ethdev documentation grammar and typo corrections Stephen Hemminger
                     ` (5 preceding siblings ...)
  2026-01-28 19:46 ` [PATCH v2 6/8] doc: correct grammar and improve clarity in ethdev guide Stephen Hemminger
@ 2026-01-28 19:46 ` Stephen Hemminger
  2026-03-25 11:45   ` Thomas Monjalon
  2026-01-28 19:46 ` [PATCH v2 8/8] doc: correct grammar and improve clarity in MTR guide Stephen Hemminger
  2026-02-05 21:29 ` [PATCH v2 0/8] doc: ethdev documentation grammar and typo corrections Stephen Hemminger
  8 siblings, 1 reply; 18+ messages in thread
From: Stephen Hemminger @ 2026-01-28 19:46 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger

The toctree entries should be alphabetically ordered for consistency
with DPDK documentation guidelines. Move "flow_offload" before
"qos_framework" and "switch_representation" to maintain alphabetical
order.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
 doc/guides/prog_guide/ethdev/index.rst | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/doc/guides/prog_guide/ethdev/index.rst b/doc/guides/prog_guide/ethdev/index.rst
index 392ced0a2e..82a749e2b3 100644
--- a/doc/guides/prog_guide/ethdev/index.rst
+++ b/doc/guides/prog_guide/ethdev/index.rst
@@ -8,8 +8,8 @@ Ethernet Device Library
    :maxdepth: 1

    ethdev
-   switch_representation
    flow_offload
-   traffic_metering_and_policing
-   traffic_management
    qos_framework
+   switch_representation
+   traffic_management
+   traffic_metering_and_policing
--
2.51.0

^ permalink raw reply related	[flat|nested] 18+ messages in thread
* Re: [PATCH v2 7/8] doc: correct alphabetical ordering in ethdev toctree
  2026-01-28 19:46 ` [PATCH v2 7/8] doc: correct alphabetical ordering in ethdev toctree Stephen Hemminger
@ 2026-03-25 11:45   ` Thomas Monjalon
  0 siblings, 0 replies; 18+ messages in thread
From: Thomas Monjalon @ 2026-03-25 11:45 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: dev

28/01/2026 20:46, Stephen Hemminger:
> The toctree entries should be alphabetically ordered for consistency
> with DPDK documentation guidelines. Move "flow_offload" before
> "qos_framework" and "switch_representation" to maintain alphabetical
> order.

Many toctrees are not in alphabetical order, but logical order.
I think we decided alphabetical order only for driver lists.

^ permalink raw reply	[flat|nested] 18+ messages in thread
* [PATCH v2 8/8] doc: correct grammar and improve clarity in MTR guide
  2026-01-28 19:45 ` [PATCH v2 0/8] doc: ethdev documentation grammar and typo corrections Stephen Hemminger
                     ` (6 preceding siblings ...)
  2026-01-28 19:46 ` [PATCH v2 7/8] doc: correct alphabetical ordering in ethdev toctree Stephen Hemminger
@ 2026-01-28 19:46 ` Stephen Hemminger
  2026-02-05 21:29 ` [PATCH v2 0/8] doc: ethdev documentation grammar and typo corrections Stephen Hemminger
  8 siblings, 0 replies; 18+ messages in thread
From: Stephen Hemminger @ 2026-01-28 19:46 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Cristian Dumitrescu

Correct grammatical issues and improve clarity in the traffic metering
and policing documentation:

- Correct grammar: "override the color the packet" to
  "override the color of the packet"
- Correct typo: "show" to "shown" for correct past participle
- Standardize terminology: use "color-aware" and "color-blind"
  consistently with hyphens as compound adjectives

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
 .../prog_guide/ethdev/traffic_metering_and_policing.rst | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/doc/guides/prog_guide/ethdev/traffic_metering_and_policing.rst b/doc/guides/prog_guide/ethdev/traffic_metering_and_policing.rst
index 7f9faf36e2..de8d643fb1 100644
--- a/doc/guides/prog_guide/ethdev/traffic_metering_and_policing.rst
+++ b/doc/guides/prog_guide/ethdev/traffic_metering_and_policing.rst
@@ -47,15 +47,17 @@
 Traffic metering determines the color for the current packet (green, yellow,
 red) based on the previous history for this flow as maintained by the MTR
-object. The policer can do nothing, override the color the packet or drop the
-packet. Statistics counters are maintained for MTR object, as configured.
+object. The policer can do nothing, override the color of the packet, or drop the
+packet. Statistics counters are maintained for each MTR object, as configured.

 The processing done for each input packet hitting an MTR object is:

 * Traffic metering: The packet is assigned a color (the meter output color)
   based on the previous traffic history reflected in the current state of the
   MTR object, according to the specific traffic metering algorithm. The
-  traffic metering algorithm can typically work in color aware mode, in which
+  traffic metering algorithm can typically work in color-aware mode, in which
   case the input packet already has an initial color (the input color), or in
-  color blind mode, which is equivalent to considering all input packets
+  color-blind mode, which is equivalent to considering all input packets
   initially colored as green.

 * There is a meter policy API to manage pre-defined policies for meter.

@@ -105,7 +107,7 @@ traffic meter and policing library.

 * Adding one (or multiple) actions of the type ``RTE_FLOW_ACTION_TYPE_METER``
   to the list of meter actions (``struct rte_mtr_meter_policy_params::actions``)
-  specified per color as show in :numref:`figure_rte_mtr_chaining`.
+  specified per color as shown in :numref:`figure_rte_mtr_chaining`.

 #. The ``rte_mtr_meter_profile_get()`` and ``rte_mtr_meter_policy_get()`` API
    functions are available for getting the object pointers directly.
--
2.51.0

^ permalink raw reply related	[flat|nested] 18+ messages in thread
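[Editor's note: the color-blind metering described in this hunk can be illustrated with a small token-bucket model. This is a sketch in the spirit of a single-rate three-color marker (RFC 2697), written for illustration only — it is not the rte_mtr implementation, and the class name and parameters (`cir`, `cbs`, `ebs`) are chosen here for the example.]

```python
GREEN, YELLOW, RED = "green", "yellow", "red"

class SrTcmColorBlind:
    """Toy color-blind single-rate three-color marker (RFC 2697 style)."""

    def __init__(self, cir, cbs, ebs):
        self.cir = cir     # committed information rate, bytes per second
        self.cbs = cbs     # committed burst size (capacity of bucket C)
        self.ebs = ebs     # excess burst size (capacity of bucket E)
        self.tc = cbs      # bucket C starts full
        self.te = ebs      # bucket E starts full
        self.last = 0.0    # timestamp of the previous update

    def color(self, pkt_len, now):
        # Refill: bucket C fills first at CIR; overflow spills into bucket E.
        tokens = (now - self.last) * self.cir
        self.last = now
        fill_c = min(tokens, self.cbs - self.tc)
        self.tc += fill_c
        self.te = min(self.ebs, self.te + (tokens - fill_c))

        # Color-blind marking: every packet is treated as initially green.
        if pkt_len <= self.tc:
            self.tc -= pkt_len
            return GREEN
        if pkt_len <= self.te:
            self.te -= pkt_len
            return YELLOW
        return RED  # red packets consume no tokens
```

A color-aware variant would additionally take the packet's input color and never upgrade it — e.g. a packet arriving yellow could only stay yellow or be demoted to red.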
* Re: [PATCH v2 0/8] doc: ethdev documentation grammar and typo corrections
  2026-01-28 19:45 ` [PATCH v2 0/8] doc: ethdev documentation grammar and typo corrections Stephen Hemminger
                     ` (7 preceding siblings ...)
  2026-01-28 19:46 ` [PATCH v2 8/8] doc: correct grammar and improve clarity in MTR guide Stephen Hemminger
@ 2026-02-05 21:29 ` Stephen Hemminger
  8 siblings, 0 replies; 18+ messages in thread
From: Stephen Hemminger @ 2026-02-05 21:29 UTC (permalink / raw)
  To: dev

On Wed, 28 Jan 2026 11:45:59 -0800
Stephen Hemminger <stephen@networkplumber.org> wrote:

> This series corrects grammar errors, typos, and punctuation issues
> in the Ethernet device library section of the DPDK Programmer's Guide.
> Initial work on this was done as part of review by technical writer
> then more changes as result of AI reviews.
>
> The first patch provides a comprehensive update to the Poll Mode Driver
> documentation, modernizing the description and improving readability.
> The remaining patches address smaller issues in the rte_flow, QoS
> framework, switch representation, traffic management, and traffic
> metering guides.
>
> v2:
> - Add patch to fix minor grammar issue in ethdev introduction
> - Add patch to correct alphabetical ordering in ethdev toctree
> - Add patch to fix grammar issues in traffic metering and policing guide

Queued this to next-net.

^ permalink raw reply	[flat|nested] 18+ messages in thread
end of thread, other threads:[~2026-03-25 11:45 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2026-01-16 21:29 [PATCH 0/5] doc: ethdev documentation grammar and typo corrections Stephen Hemminger
2026-01-16 21:29 ` [PATCH 1/5] doc: correct grammar and punctuation errors in ethdev guide Stephen Hemminger
2026-01-16 21:29 ` [PATCH 2/5] doc: correct grammar in rte_flow guide Stephen Hemminger
2026-01-16 21:29 ` [PATCH 3/5] doc: correct grammar in QoS framework guide Stephen Hemminger
2026-01-16 21:29 ` [PATCH 4/5] doc: correct typos in switch representation guide Stephen Hemminger
2026-01-16 21:29 ` [PATCH 5/5] doc: correct typos in traffic management guide Stephen Hemminger
2026-01-28 19:45 ` [PATCH v2 0/8] doc: ethdev documentation grammar and typo corrections Stephen Hemminger
2026-01-28 19:46   ` [PATCH v2 1/8] doc: correct grammar and punctuation errors in ethdev guide Stephen Hemminger
2026-01-28 19:46   ` [PATCH v2 2/8] doc: correct grammar in flow guide Stephen Hemminger
2026-01-28 19:46   ` [PATCH v2 3/8] doc: correct grammar in QoS framework guide Stephen Hemminger
2026-01-28 19:46   ` [PATCH v2 4/8] doc: correct typos in switch representation guide Stephen Hemminger
2026-01-28 19:46   ` [PATCH v2 5/8] doc: correct typos in traffic management guide Stephen Hemminger
2026-01-28 19:46   ` [PATCH v2 6/8] doc: correct grammar and improve clarity in ethdev guide Stephen Hemminger
2026-03-25 11:41     ` Thomas Monjalon
2026-01-28 19:46   ` [PATCH v2 7/8] doc: correct alphabetical ordering in ethdev toctree Stephen Hemminger
2026-03-25 11:45     ` Thomas Monjalon
2026-01-28 19:46   ` [PATCH v2 8/8] doc: correct grammar and improve clarity in MTR guide Stephen Hemminger
2026-02-05 21:29   ` [PATCH v2 0/8] doc: ethdev documentation grammar and typo corrections Stephen Hemminger