From mboxrd@z Thu Jan 1 00:00:00 1970
From: Hemant Agrawal
Subject: Re: [PATCH 2/2] ethdev: add hierarchical scheduler API
Date: Tue, 21 Feb 2017 16:05:57 +0530
Message-ID: <4cffb9bd-98e7-1e37-5b2e-028a995e2e46@nxp.com>
References: <1486735550-149878-1-git-send-email-cristian.dumitrescu@intel.com>
 <1486735550-149878-3-git-send-email-cristian.dumitrescu@intel.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="utf-8"; format=flowed
Content-Transfer-Encoding: 8bit
To: Cristian Dumitrescu
In-Reply-To: <1486735550-149878-3-git-send-email-cristian.dumitrescu@intel.com>
List-Id: DPDK patches and discussions
Errors-To: dev-bounces@dpdk.org
Sender: "dev"

On 2/10/2017 7:35 PM, Cristian Dumitrescu wrote:
> This patch introduces the generic ethdev API for the hierarchical scheduler
> capability.
>
> Main features:
> - Exposed as ethdev plugin capability (similar to rte_flow approach)
> - Capability query API per port and per hierarchy node
> - Scheduling algorithms: strict priority (SP), Weighed Fair Queuing (WFQ),
>   Weighted Round Robin (WRR)
> - Traffic shaping: single/dual rate, private (per node) and shared (by multiple
>   nodes) shapers
> - Congestion management for hierarchy leaf nodes: algorithms of tail drop,
>   head drop, WRED; private (per node) and shared (by multiple nodes) WRED
>   contexts
> - Packet marking: IEEE 802.1q (VLAN DEI), IETF RFC 3168 (IPv4/IPv6 ECN for
>   TCP and SCTP), IETF RFC 2597 (IPv4 / IPv6 DSCP)
>
> Changes since RFC [1]:
> - Implemented as ethdev plugin (similar to rte_flow) as opposed to more
>   monolithic additions to ethdev itself
> - Implemented feedback from Jerin [2] and Hemant [3].
>   Implemented all the
>   suggested items with only one exception, see the long list below, hopefully
>   nothing was forgotten.
> - The item not done (hopefully for a good reason): driver-generated object
>   IDs. IMO the choice to have application-generated object IDs adds marginal
>   complexity to the driver (search ID function required), but it provides
>   huge simplification for the application. The app does not need to worry
>   about building & managing tree-like structure for storing driver-generated
>   object IDs, the app can use its own convention for node IDs depending on
>   the specific hierarchy that it needs. Trivial example: identify all
>   level-2 nodes with IDs like 100, 200, 300, … and the level-3 nodes based
>   on their level-2 parents: 110, 120, 130, 140, …, 210, 220, 230, 240, …,
>   310, 320, 330, … and level-4 nodes based on their level-3 parents: 111,
>   112, 113, 114, …, 121, 122, 123, 124, …). Moreover, see the change log for
>   the other related simplification that was implemented: leaf nodes now have
>   predefined IDs that are the same with their Ethernet TX queue ID (
>   therefore no translation is required for leaf nodes).
> - Capability API. Done per port and per node as well.
> - Dual rate shapers
> - Added configuration of private shaper (per node) directly from the shaper
>   profile as part of node API (no shaper ID needed for private shapers), while
>   the shared shapers are configured outside of the node API using shaper profile
>   and communicated to the node using shared shaper ID. So there is no
>   configuration overhead for shared shapers if the app does not use any of them.
> - Leaf nodes now have predefined IDs that are the same with their Ethernet TX
>   queue ID (therefore no translation is required for leaf nodes). This is also
>   used to differentiate between a leaf node and a non-leaf node.
> - Domain-specific errors to give a precise indication of the error cause (same
>   as done by rte_flow)
> - Packet marking API
> - Packet length optional adjustment for shapers, positive (e.g. for adding
>   Ethernet framing overhead of 20 bytes) or negative (e.g. for rate limiting
>   based on IP packet bytes)
>
> Next steps:
> - SW fallback based on librte_sched library (to be later introduced by
>   standalone patch set)
>
> [1] RFC: http://dpdk.org/ml/archives/dev/2016-November/050956.html
> [2] Jerin’s feedback on RFC: http://www.dpdk.org/ml/archives/dev/2017-January/054484.html
> [3] Hemants’s feedback on RFC: http://www.dpdk.org/ml/archives/dev/2017-January/054866.html
>
> Signed-off-by: Cristian Dumitrescu
> ---
>  MAINTAINERS                            |    4 +
>  lib/librte_ether/Makefile              |    5 +-
>  lib/librte_ether/rte_ether_version.map |   30 +
>  lib/librte_ether/rte_scheddev.c        |  790 ++++++++++++++++++++
>  lib/librte_ether/rte_scheddev.h        | 1273 ++++++++++++++++++++++++++++++++
>  lib/librte_ether/rte_scheddev_driver.h |  374 ++++++++++
>  6 files changed, 2475 insertions(+), 1 deletion(-)
>  create mode 100644 lib/librte_ether/rte_scheddev.c
>  create mode 100644 lib/librte_ether/rte_scheddev.h
>  create mode 100644 lib/librte_ether/rte_scheddev_driver.h
> ...
> +
> +#ifndef __INCLUDE_RTE_SCHEDDEV_H__
> +#define __INCLUDE_RTE_SCHEDDEV_H__
> +
> +/**
> + * @file
> + * RTE Generic Hierarchical Scheduler API
> + *
> + * This interface provides the ability to configure the hierarchical scheduler
> + * feature in a generic way.
> + */
> +
> +#include
> +
> +#include
> +
> +#ifdef __cplusplus
> +extern "C" {
> +#endif
> +
> +/** Ethernet framing overhead
> + *
> + * Overhead fields per Ethernet frame:
> + * 1. Preamble: 7 bytes;
> + * 2. Start of Frame Delimiter (SFD): 1 byte;
> + * 3. Inter-Frame Gap (IFG): 12 bytes.
> + */
> +#define RTE_SCHEDDEV_ETH_FRAMING_OVERHEAD 20
> +
> +/**
> + * Ethernet framing overhead plus Frame Check Sequence (FCS).
> + * Useful when FCS
> + * is generated and added at the end of the Ethernet frame on TX side without
> + * any SW intervention.
> + */
> +#define RTE_SCHEDDEV_ETH_FRAMING_OVERHEAD_FCS 24
> +
> +/**< Invalid WRED profile ID */
> +#define RTE_SCHEDDEV_WRED_PROFILE_ID_NONE UINT32_MAX
> +
> +/**< Invalid shaper profile ID */
> +#define RTE_SCHEDDEV_SHAPER_PROFILE_ID_NONE UINT32_MAX
> +
> +/**< Scheduler hierarchy root node ID */
> +#define RTE_SCHEDDEV_ROOT_NODE_ID UINT32_MAX
> +
> +
> +/**
> + * Scheduler node capabilities
> + */
> +struct rte_scheddev_node_capabilities {
> +	/**< Private shaper support. */
> +	int shaper_private_supported;
> +
> +	/**< Dual rate shaping support for private shaper. Valid only when
> +	 * private shaper is supported.
> +	 */
> +	int shaper_private_dual_rate_supported;
> +
> +	/**< Minimum committed/peak rate (bytes per second) for private
> +	 * shaper. Valid only when private shaper is supported.
> +	 */
> +	uint64_t shaper_private_rate_min;
> +
> +	/**< Maximum committed/peak rate (bytes per second) for private
> +	 * shaper. Valid only when private shaper is supported.
> +	 */
> +	uint64_t shaper_private_rate_max;
> +
> +	/**< Maximum number of supported shared shapers. The value of zero
> +	 * indicates that shared shapers are not supported.
> +	 */
> +	uint32_t shaper_shared_n_max;
> +
> +	/**< Items valid only for non-leaf nodes. */
> +	struct {
> +		/**< Maximum number of children nodes. */
> +		uint32_t n_children_max;
> +
> +		/**< Lowest priority supported. The value of 1 indicates that
> +		 * only priority 0 is supported, which essentially means that
> +		 * Strict Priority (SP) algorithm is not supported.
> +		 */
> +		uint32_t sp_priority_min;
> +

This could simply be sp_priority_level, with 0 indicating no support;
1 indicates priorities '0' and '1', and 7 indicates priorities '0' to '7',
i.e. 8 priorities in total.

> +		/**< Maximum number of sibling nodes that can have the same
> +		 * priority at any given time.
> +		 * When equal to *n_children_max*,
> +		 * it indicates that WFQ/WRR algorithms are not supported.
> +		 */
> +		uint32_t sp_n_children_max;

This is not clear to me.
OK, more than one child can have the same priority, and then you apply
WRR/WFQ among them.
However, there can be different sets, e.g. priorities '0' and '1' each have
only one child, while priority '2' has 6 children, and then you apply
WRR/WFQ among them.

> +
> +		/**< WFQ algorithm support. */
> +		int scheduling_wfq_supported;
> +
> +		/**< WRR algorithm support. */
> +		int scheduling_wrr_supported;
> +
> +		/**< Maximum WFQ/WRR weight. */
> +		uint32_t scheduling_wfq_wrr_weight_max;
> +	} nonleaf;
> +
> +	/**< Items valid only for leaf nodes. */
> +	struct {
> +		/**< Head drop algorithm support. */
> +		int cman_head_drop_supported;
> +
> +		/**< Private WRED context support. */
> +		int cman_wred_context_private_supported;
> +

The context part is not clear to me.

> +		/**< Maximum number of shared WRED contexts supported. The value
> +		 * of zero indicates that shared WRED contexts are not
> +		 * supported.
> +		 */
> +		uint32_t cman_wred_context_shared_n_max;
> +	} leaf;

Non-leaf nodes may have different capabilities.

Your leaf node is like a QoS queue; are you supporting a shaper on the leaf
node as well?

I would still prefer that you separate the QoS queue from a standard sched
node: the capabilities are different, and it will be cleaner, at the cost of
more structures and a larger number of APIs.

> +};
> +
> +/**
> + * Scheduler capabilities
> + */
> +struct rte_scheddev_capabilities {
> +	/**< Maximum number of nodes. */
> +	uint32_t n_nodes_max;
> +
> +	/**< Maximum number of levels (i.e. number of nodes connecting the root
> +	 * node with any leaf node, including the root and the leaf).
> +	 */
> +	uint32_t n_levels_max;
> +
> +	/**< Maximum number of shapers, either private or shared.
> +	 * In case the
> +	 * implementation does not share any resource between private and
> +	 * shared shapers, it is typically equal to the sum between
> +	 * *shaper_private_n_max* and *shaper_shared_n_max*.
> +	 */
> +	uint32_t shaper_n_max;
> +
> +	/**< Maximum number of private shapers. Indicates the maximum number of
> +	 * nodes that can concurrently have the private shaper enabled.
> +	 */
> +	uint32_t shaper_private_n_max;
> +
> +	/**< Maximum number of shared shapers. The value of zero indicates that
> +	 * shared shapers are not supported.
> +	 */
> +	uint32_t shaper_shared_n_max;
> +
> +	/**< Maximum number of nodes that can share the same shared shaper. Only
> +	 * valid when shared shapers are supported.
> +	 */
> +	uint32_t shaper_shared_n_nodes_max;
> +
> +	/**< Maximum number of shared shapers that can be configured with dual
> +	 * rate shaping. The value of zero indicates that dual rate shaping
> +	 * support is not available for shared shapers.
> +	 */
> +	uint32_t shaper_shared_dual_rate_n_max;
> +
> +	/**< Minimum committed/peak rate (bytes per second) for shared
> +	 * shapers. Only valid when shared shapers are supported.
> +	 */
> +	uint64_t shaper_shared_rate_min;
> +
> +	/**< Maximum committed/peak rate (bytes per second) for shared
> +	 * shaper. Only valid when shared shapers are supported.
> +	 */
> +	uint64_t shaper_shared_rate_max;
> +
> +	/**< Minimum value allowed for packet length adjustment for
> +	 * private/shared shapers.
> +	 */
> +	int shaper_pkt_length_adjust_min;
> +
> +	/**< Maximum value allowed for packet length adjustment for
> +	 * private/shared shapers.
> +	 */
> +	int shaper_pkt_length_adjust_max;
> +
> +	/**< Maximum number of WRED contexts. */
> +	uint32_t cman_wred_context_n_max;
> +
> +	/**< Maximum number of private WRED contexts. Indicates the maximum
> +	 * number of leaf nodes that can concurrently have the private WRED
> +	 * context enabled.
> +	 */
> +	uint32_t cman_wred_context_private_n_max;
> +
> +	/**< Maximum number of shared WRED contexts. The value of zero indicates
> +	 * that shared WRED contexts are not supported.
> +	 */
> +	uint32_t cman_wred_context_shared_n_max;
> +
> +	/**< Maximum number of leaf nodes that can share the same WRED context.
> +	 * Only valid when shared WRED contexts are supported.
> +	 */
> +	uint32_t cman_wred_context_shared_n_nodes_max;
> +
> +	/**< Support for VLAN DEI packet marking. */
> +	int mark_vlan_dei_supported;
> +
> +	/**< Support for IPv4/IPv6 ECN marking of TCP packets. */
> +	int mark_ip_ecn_tcp_supported;
> +
> +	/**< Support for IPv4/IPv6 ECN marking of SCTP packets. */
> +	int mark_ip_ecn_sctp_supported;
> +
> +	/**< Support for IPv4/IPv6 DSCP packet marking. */
> +	int mark_ip_dscp_supported;
> +
> +	/**< Summary of node-level capabilities across all nodes. */
> +	struct rte_scheddev_node_capabilities node;

This should be an array with one entry per level supported in the system.
A non-leaf node at level 2 can have different capabilities than a level 3
node.

> +};
> +
> +/**
> + * Congestion management (CMAN) mode
> + *
> + * This is used for controlling the admission of packets into a packet queue or
> + * group of packet queues on congestion. On request of writing a new packet
> + * into the current queue while the queue is full, the *tail drop* algorithm
> + * drops the new packet while leaving the queue unmodified, as opposed to *head
> + * drop* algorithm, which drops the packet at the head of the queue (the oldest
> + * packet waiting in the queue) and admits the new packet at the tail of the
> + * queue.
> + *
> + * The *Random Early Detection (RED)* algorithm works by proactively dropping
> + * more and more input packets as the queue occupancy builds up. When the queue
> + * is full or almost full, RED effectively works as *tail drop*. The *Weighted
> + * RED* algorithm uses a separate set of RED thresholds for each packet color.
> + */
> +enum rte_scheddev_cman_mode {
> +	RTE_SCHEDDEV_CMAN_TAIL_DROP = 0, /**< Tail drop */
> +	RTE_SCHEDDEV_CMAN_HEAD_DROP, /**< Head drop */
> +	RTE_SCHEDDEV_CMAN_WRED, /**< Weighted Random Early Detection (WRED) */
> +};
> +
> +/**
> + * Color
> + */
> +enum rte_scheddev_color {
> +	e_RTE_SCHEDDEV_GREEN = 0, /**< Green */
> +	e_RTE_SCHEDDEV_YELLOW, /**< Yellow */
> +	e_RTE_SCHEDDEV_RED, /**< Red */
> +	e_RTE_SCHEDDEV_COLORS /**< Number of colors */
> +};
> +
> +/**
> + * WRED profile
> + */
> +struct rte_scheddev_wred_params {
> +	/**< One set of RED parameters per packet color */
> +	struct rte_red_params red_params[e_RTE_SCHEDDEV_COLORS];
> +};
> +
> +/**
> + * Token bucket
> + */
> +struct rte_scheddev_token_bucket {
> +	/**< Token bucket rate (bytes per second) */
> +	uint64_t rate;
> +
> +	/**< Token bucket size (bytes), a.k.a. max burst size */
> +	uint64_t size;
> +};
> +
> +/**
> + * Shaper (rate limiter) profile
> + *
> + * Multiple shaper instances can share the same shaper profile. Each node has
> + * zero or one private shaper (only one node using it) and/or zero, one or
> + * several shared shapers (multiple nodes use the same shaper instance).
> + *
> + * Single rate shapers use a single token bucket. A single rate shaper can be
> + * configured by setting the rate of the committed bucket to zero, which
> + * effectively disables this bucket. The peak bucket is used to limit the rate
> + * and the burst size for the current shaper.
> + *
> + * Dual rate shapers use both the committed and the peak token buckets. The
> + * rate of the committed bucket has to be less than or equal to the rate of the
> + * peak bucket.
> + */
> +struct rte_scheddev_shaper_params {
> +	/**< Committed token bucket */
> +	struct rte_scheddev_token_bucket committed;
> +
> +	/**< Peak token bucket */
> +	struct rte_scheddev_token_bucket peak;
> +
> +	/**< Signed value to be added to the length of each packet for the
> +	 * purpose of shaping.
> +	 * Can be used to correct the packet length with
> +	 * the framing overhead bytes that are also consumed on the wire (e.g.
> +	 * RTE_SCHEDDEV_ETH_FRAMING_OVERHEAD_FCS).
> +	 */
> +	int32_t pkt_length_adjust;
> +};
> +
> +/**
> + * Node parameters
> + *
> + * Each scheduler hierarchy node has multiple inputs (children nodes of the
> + * current parent node) and a single output (which is input to its parent
> + * node). The current node arbitrates its inputs using Strict Priority (SP),
> + * Weighted Fair Queuing (WFQ) and Weighted Round Robin (WRR) algorithms to
> + * schedule input packets on its output while observing its shaping (rate
> + * limiting) constraints.
> + *
> + * Algorithms such as byte-level WRR, Deficit WRR (DWRR), etc are considered
> + * approximations of the ideal of WFQ and are assimilated to WFQ, although
> + * an associated implementation-dependent trade-off on accuracy, performance
> + * and resource usage might exist.
> + *
> + * Children nodes with different priorities are scheduled using the SP
> + * algorithm, based on their priority, with zero (0) as the highest priority.
> + * Children with same priority are scheduled using the WFQ or WRR algorithm,
> + * based on their weight, which is relative to the sum of the weights of all
> + * siblings with same priority, with one (1) as the lowest weight.
> + *
> + * Each leaf node sits on on top of a TX queue of the current Ethernet port.
> + * Therefore, the leaf nodes are predefined with the node IDs of 0 .. (N-1),
> + * where N is the number of TX queues configured for the current Ethernet port.
> + * The non-leaf nodes have their IDs generated by the application.
> + */

OK, that means 0 to N-1 is reserved for leaf nodes. Will the application
choose any value for non-leaf nodes?
What will be the parent node ID for the root node?

> +struct rte_scheddev_node_params {
> +	/**< Shaper profile for the private shaper.
> +	 * The absence of the private
> +	 * shaper for the current node is indicated by setting this parameter
> +	 * to RTE_SCHEDDEV_SHAPER_PROFILE_ID_NONE.
> +	 */
> +	uint32_t shaper_profile_id;
> +
> +	/**< User allocated array of valid shared shaper IDs. */
> +	uint32_t *shared_shaper_id;
> +
> +	/**< Number of shared shaper IDs in the *shared_shaper_id* array. */
> +	uint32_t n_shared_shapers;
> +
> +	union {
> +		/**< Parameters only valid for non-leaf nodes. */
> +		struct {
> +			/**< For each priority, indicates whether the children
> +			 * nodes sharing the same priority are to be scheduled
> +			 * by WFQ or by WRR. When NULL, it indicates that WFQ
> +			 * is to be used for all priorities. When non-NULL, it
> +			 * points to a pre-allocated array of *n_priority*
> +			 * elements, with a non-zero value element indicating
> +			 * WFQ and a zero value element for WRR.
> +			 */
> +			int *scheduling_mode_per_priority;

What is the structure of the pointed-to element? Just a bool array?

> +
> +			/**< Number of priorities. */
> +			uint32_t n_priorities;
> +		} nonleaf;
> +
> +		/**< Parameters only valid for leaf nodes. */
> +		struct {
> +			/**< Congestion management mode */
> +			enum rte_scheddev_cman_mode cman;
> +
> +			/**< WRED parameters (valid when *cman* is WRED). */
> +			struct {
> +				/**< WRED profile for private WRED context. */
> +				uint32_t wred_profile_id;
> +
> +				/**< User allocated array of shared WRED context
> +				 * IDs. The absence of a private WRED context
> +				 * for current leaf node is indicated by value
> +				 * RTE_SCHEDDEV_WRED_PROFILE_ID_NONE.
> +				 */
> +				uint32_t *shared_wred_context_id;
> +
> +				/**< Number of shared WRED context IDs in the
> +				 * *shared_wred_context_id* array.
> +				 */
> +				uint32_t n_shared_wred_contexts;
> +			} wred;
> +		} leaf;

A bool is_leaf is needed here to differentiate between leaf and non-leaf
nodes.

> +	};
> +};
> +
> +/**
> + * Node statistics counter type
> + */
> +enum rte_scheddev_stats_counter {
> +	/**< Number of packets scheduled from current node.
> +	 */
> +	RTE_SCHEDDEV_STATS_COUNTER_N_PKTS = 1 << 0,
> +
> +	/**< Number of bytes scheduled from current node. */
> +	RTE_SCHEDDEV_STATS_COUNTER_N_BYTES = 1 << 1,
> +
> +	/**< Number of packets dropped by current node. */
> +	RTE_SCHEDDEV_STATS_COUNTER_N_PKTS_DROPPED = 1 << 2,
> +
> +	/**< Number of bytes dropped by current node. */
> +	RTE_SCHEDDEV_STATS_COUNTER_N_BYTES_DROPPED = 1 << 3,
> +
> +	/**< Number of packets currently waiting in the packet queue of current
> +	 * leaf node.
> +	 */
> +	RTE_SCHEDDEV_STATS_COUNTER_N_PKTS_QUEUED = 1 << 4,
> +
> +	/**< Number of bytes currently waiting in the packet queue of current
> +	 * leaf node.
> +	 */
> +	RTE_SCHEDDEV_STATS_COUNTER_N_BYTES_QUEUED = 1 << 5,
> +};
> +
> +/**
> + * Node statistics counters
> + */
> +struct rte_scheddev_node_stats {
> +	/**< Number of packets scheduled from current node. */
> +	uint64_t n_pkts;
> +
> +	/**< Number of bytes scheduled from current node. */
> +	uint64_t n_bytes;
> +
> +	/**< Statistics counters for leaf nodes only. */
> +	struct {
> +		/**< Number of packets dropped by current leaf node. */
> +		uint64_t n_pkts_dropped;
> +
> +		/**< Number of bytes dropped by current leaf node. */
> +		uint64_t n_bytes_dropped;
> +
> +		/**< Number of packets currently waiting in the packet queue of
> +		 * current leaf node.
> +		 */
> +		uint64_t n_pkts_queued;
> +
> +		/**< Number of bytes currently waiting in the packet queue of
> +		 * current leaf node.
> +		 */
> +		uint64_t n_bytes_queued;
> +	} leaf;
> +};
> +
> +/**
> + * Verbose error types.
> + *
> + * Most of them provide the type of the object referenced by struct
> + * rte_scheddev_error::cause.
> + */
> +enum rte_scheddev_error_type {
> +	RTE_SCHEDDEV_ERROR_TYPE_NONE, /**< No error. */
> +	RTE_SCHEDDEV_ERROR_TYPE_UNSPECIFIED, /**< Cause unspecified.
> +	 */
> +	RTE_SCHEDDEV_ERROR_TYPE_WRED_PROFILE,
> +	RTE_SCHEDDEV_ERROR_TYPE_WRED_PROFILE_GREEN,
> +	RTE_SCHEDDEV_ERROR_TYPE_WRED_PROFILE_YELLOW,
> +	RTE_SCHEDDEV_ERROR_TYPE_WRED_PROFILE_RED,
> +	RTE_SCHEDDEV_ERROR_TYPE_WRED_PROFILE_ID,
> +	RTE_SCHEDDEV_ERROR_TYPE_SHARED_WRED_CONTEXT_ID,
> +	RTE_SCHEDDEV_ERROR_TYPE_SHAPER_PROFILE,
> +	RTE_SCHEDDEV_ERROR_TYPE_SHARED_SHAPER_ID,
> +	RTE_SCHEDDEV_ERROR_TYPE_NODE_PARAMS,
> +	RTE_SCHEDDEV_ERROR_TYPE_NODE_PARAMS_PARENT_NODE_ID,
> +	RTE_SCHEDDEV_ERROR_TYPE_NODE_PARAMS_PRIORITY,
> +	RTE_SCHEDDEV_ERROR_TYPE_NODE_PARAMS_WEIGHT,
> +	RTE_SCHEDDEV_ERROR_TYPE_NODE_PARAMS_SCHEDULING_MODE,
> +	RTE_SCHEDDEV_ERROR_TYPE_NODE_PARAMS_SHAPER_PROFILE_ID,
> +	RTE_SCHEDDEV_ERROR_TYPE_NODE_PARAMS_SHARED_SHAPER_ID,
> +	RTE_SCHEDDEV_ERROR_TYPE_NODE_PARAMS_LEAF,
> +	RTE_SCHEDDEV_ERROR_TYPE_NODE_PARAMS_LEAF_CMAN,
> +	RTE_SCHEDDEV_ERROR_TYPE_NODE_PARAMS_LEAF_WRED_PROFILE_ID,
> +	RTE_SCHEDDEV_ERROR_TYPE_NODE_PARAMS_LEAF_SHARED_WRED_CONTEXT_ID,
> +	RTE_SCHEDDEV_ERROR_TYPE_NODE_ID,
> +};
> +
> +/**
> + * Verbose error structure definition.
> + *
> + * This object is normally allocated by applications and set by PMDs, the
> + * message points to a constant string which does not need to be freed by
> + * the application, however its pointer can be considered valid only as long
> + * as its associated DPDK port remains configured. Closing the underlying
> + * device or unloading the PMD invalidates it.
> + *
> + * Both cause and message may be NULL regardless of the error type.
> + */
> +struct rte_scheddev_error {
> +	enum rte_scheddev_error_type type; /**< Cause field and error type. */
> +	const void *cause; /**< Object responsible for the error. */
> +	const char *message; /**< Human-readable error message. */
> +};
> +
> +/**
> + * Scheduler capabilities get
> + *
> + * @param port_id
> + *   The port identifier of the Ethernet device.
> + * @param cap
> + *   Scheduler capabilities. Needs to be pre-allocated and valid.
> + * @param error
> + *   Error details. Filled in only on error, when not NULL.
> + * @return
> + *   0 on success, non-zero error code otherwise.
> + */
> +int rte_scheddev_capabilities_get(uint8_t port_id,
> +	struct rte_scheddev_capabilities *cap,
> +	struct rte_scheddev_error *error);
> +
> +/**
> + * Scheduler node capabilities get
> + *
> + * @param port_id
> + *   The port identifier of the Ethernet device.
> + * @param node_id
> + *   Node ID. Needs to be valid.
> + * @param cap
> + *   Scheduler node capabilities. Needs to be pre-allocated and valid.
> + * @param error
> + *   Error details. Filled in only on error, when not NULL.
> + * @return
> + *   0 on success, non-zero error code otherwise.
> + */
> +int rte_scheddev_node_capabilities_get(uint8_t port_id,
> +	uint32_t node_id,
> +	struct rte_scheddev_node_capabilities *cap,
> +	struct rte_scheddev_error *error);
> +

Aren't node capabilities already part of scheddev_capabilities?

What do you expect to be different here? Unless you support different
capabilities for each level, this may not be useful.

> +/**
> + * Scheduler WRED profile add
> + *
> + * Create a new WRED profile with ID set to *wred_profile_id*. The new profile
> + * is used to create one or several WRED contexts.
> + *
> + * @param port_id
> + *   The port identifier of the Ethernet device.
> + * @param wred_profile_id
> + *   WRED profile ID for the new profile. Needs to be unused.
> + * @param profile
> + *   WRED profile parameters. Needs to be pre-allocated and valid.
> + * @param error
> + *   Error details. Filled in only on error, when not NULL.
> + * @return
> + *   0 on success, non-zero error code otherwise.
> + */
> +int rte_scheddev_wred_profile_add(uint8_t port_id,
> +	uint32_t wred_profile_id,
> +	struct rte_scheddev_wred_params *profile,
> +	struct rte_scheddev_error *error);
> +
> +/**
> + * Scheduler WRED profile delete
> + *
> + * Delete an existing WRED profile. This operation fails when there is currently
> + * at least one user (i.e.
> + * WRED context) of this WRED profile.
> + *
> + * @param port_id
> + *   The port identifier of the Ethernet device.
> + * @param wred_profile_id
> + *   WRED profile ID. Needs to be the valid.
> + * @param error
> + *   Error details. Filled in only on error, when not NULL.
> + * @return
> + *   0 on success, non-zero error code otherwise.
> + */
> +int rte_scheddev_wred_profile_delete(uint8_t port_id,
> +	uint32_t wred_profile_id,
> +	struct rte_scheddev_error *error);
> +
> +/**
> + * Scheduler shared WRED context add or update
> + *
> + * When *shared_wred_context_id* is invalid, a new WRED context with this ID is
> + * created by using the WRED profile identified by *wred_profile_id*.
> + *
> + * When *shared_wred_context_id* is valid, this WRED context is no longer using
> + * the profile previously assigned to it and is updated to use the profile
> + * identified by *wred_profile_id*.
> + *
> + * A valid shared WRED context can be assigned to several scheduler hierarchy
> + * leaf nodes configured to use WRED as the congestion management mode.
> + *
> + * @param port_id
> + *   The port identifier of the Ethernet device.
> + * @param shared_wred_context_id
> + *   Shared WRED context ID
> + * @param wred_profile_id
> + *   WRED profile ID. Needs to be the valid.
> + * @param error
> + *   Error details. Filled in only on error, when not NULL.
> + * @return
> + *   0 on success, non-zero error code otherwise.
> + */
> +int rte_scheddev_shared_wred_context_add_update(uint8_t port_id,
> +	uint32_t shared_wred_context_id,
> +	uint32_t wred_profile_id,
> +	struct rte_scheddev_error *error);
> +
> +/**
> + * Scheduler shared WRED context delete
> + *
> + * Delete an existing shared WRED context. This operation fails when there is
> + * currently at least one user (i.e. scheduler hierarchy leaf node) of this
> + * shared WRED context.
> + *
> + * @param port_id
> + *   The port identifier of the Ethernet device.
> + * @param shared_wred_context_id
> + *   Shared WRED context ID.
> + *   Needs to be the valid.
> + * @param error
> + *   Error details. Filled in only on error, when not NULL.
> + * @return
> + *   0 on success, non-zero error code otherwise.
> + */
> +int rte_scheddev_shared_wred_context_delete(uint8_t port_id,
> +	uint32_t shared_wred_context_id,
> +	struct rte_scheddev_error *error);
> +
> +/**
> + * Scheduler shaper profile add
> + *
> + * Create a new shaper profile with ID set to *shaper_profile_id*. The new
> + * shaper profile is used to create one or several shapers.
> + *
> + * @param port_id
> + *   The port identifier of the Ethernet device.
> + * @param shaper_profile_id
> + *   Shaper profile ID for the new profile. Needs to be unused.
> + * @param profile
> + *   Shaper profile parameters. Needs to be pre-allocated and valid.
> + * @param error
> + *   Error details. Filled in only on error, when not NULL.
> + * @return
> + *   0 on success, non-zero error code otherwise.
> + */
> +int rte_scheddev_shaper_profile_add(uint8_t port_id,
> +	uint32_t shaper_profile_id,
> +	struct rte_scheddev_shaper_params *profile,
> +	struct rte_scheddev_error *error);
> +
> +/**
> + * Scheduler shaper profile delete
> + *
> + * Delete an existing shaper profile. This operation fails when there is
> + * currently at least one user (i.e. shaper) of this shaper profile.
> + *
> + * @param port_id
> + *   The port identifier of the Ethernet device.
> + * @param shaper_profile_id
> + *   Shaper profile ID. Needs to be the valid.
> + * @param error
> + *   Error details. Filled in only on error, when not NULL.
> + * @return
> + *   0 on success, non-zero error code otherwise.
> + */
> +int rte_scheddev_shaper_profile_delete(uint8_t port_id,
> +	uint32_t shaper_profile_id,
> +	struct rte_scheddev_error *error);
> +
> +/**
> + * Scheduler shared shaper add or update
> + *
> + * When *shared_shaper_id* is not a valid shared shaper ID, a new shared shaper
> + * with this ID is created using the shaper profile identified by
> + * *shaper_profile_id*.
> + *
> + * When *shared_shaper_id* is a valid shared shaper ID, this shared shaper is no
> + * longer using the shaper profile previously assigned to it and is updated to
> + * use the shaper profile identified by *shaper_profile_id*.
> + *
> + * @param port_id
> + *   The port identifier of the Ethernet device.
> + * @param shared_shaper_id
> + *   Shared shaper ID
> + * @param shaper_profile_id
> + *   Shaper profile ID. Needs to be the valid.
> + * @param error
> + *   Error details. Filled in only on error, when not NULL.
> + * @return
> + *   0 on success, non-zero error code otherwise.
> + */
> +int rte_scheddev_shared_shaper_add_update(uint8_t port_id,
> +	uint32_t shared_shaper_id,
> +	uint32_t shaper_profile_id,
> +	struct rte_scheddev_error *error);
> +
> +/**
> + * Scheduler shared shaper delete
> + *
> + * Delete an existing shared shaper. This operation fails when there is
> + * currently at least one user (i.e. scheduler hierarchy node) of this shared
> + * shaper.
> + *
> + * @param port_id
> + *   The port identifier of the Ethernet device.
> + * @param shared_shaper_id
> + *   Shared shaper ID. Needs to be the valid.
> + * @param error
> + *   Error details. Filled in only on error, when not NULL.
> + * @return
> + *   0 on success, non-zero error code otherwise.
> + */
> +int rte_scheddev_shared_shaper_delete(uint8_t port_id,
> +	uint32_t shared_shaper_id,
> +	struct rte_scheddev_error *error);
> +
> +/**
> + * Scheduler node add
> + *
> + * When *node_id* is not a valid node ID, a new node with this ID is created and
> + * connected as child to the existing node identified by *parent_node_id*.
> + *
> + * When *node_id* is a valid node ID, this node is disconnected from its current
> + * parent and connected as child to another existing node identified by
> + * *parent_node_id *.
> + *
> + * This function can be called during port initialization phase (before the
> + * Ethernet port is started) for building the scheduler start-up hierarchy.
> + * Subject to the specific Ethernet port supporting on-the-fly scheduler
> + * hierarchy updates, this function can also be called during run-time (after
> + * the Ethernet port is started).

This should be a capability: whether dynamic_hierarchy_updates are supported
or not.

> + *
> + * @param port_id
> + *   The port identifier of the Ethernet device.
> + * @param node_id
> + *   Node ID
> + * @param parent_node_id
> + *   Parent node ID. Needs to be the valid.

What will be the parent node ID for the root node? How is the root node
created on the Ethernet port?

> + * @param priority
> + *   Node priority. The highest node priority is zero. Used by the SP algorithm
> + *   running on the parent of the current node for scheduling this child node.
> + * @param weight
> + *   Node weight. The node weight is relative to the weight sum of all siblings
> + *   that have the same priority. The lowest weight is one. Used by the WFQ/WRR
> + *   algorithm running on the parent of the current node for scheduling this
> + *   child node.
> + * @param params
> + *   Node parameters. Needs to be pre-allocated and valid.
> + * @param error
> + *   Error details. Filled in only on error, when not NULL.
> + * @return
> + *   0 on success, non-zero error code otherwise.
> + */
> +int rte_scheddev_node_add(uint8_t port_id,
> +	uint32_t node_id,
> +	uint32_t parent_node_id,
> +	uint32_t priority,
> +	uint32_t weight,
> +	struct rte_scheddev_node_params *params,
> +	struct rte_scheddev_error *error);
> +
> +/**
> + * Scheduler node delete
> + *
> + * Delete an existing node. This operation fails when this node currently has at
> + * least one user (i.e. child node).
> + *
> + * @param port_id
> + *   The port identifier of the Ethernet device.
> + * @param node_id
> + *   Node ID. Needs to be valid.
> + * @param error
> + *   Error details. Filled in only on error, when not NULL.
> + * @return
> + *   0 on success, non-zero error code otherwise.
> + */ > +int rte_scheddev_node_delete(uint8_t port_id, > + uint32_t node_id, > + struct rte_scheddev_error *error); > + > +/** > + * Scheduler node suspend > + * > + * Suspend an existing node. > + * > + * @param port_id > + * The port identifier of the Ethernet device. > + * @param node_id > + * Node ID. Needs to be valid. > + * @param error > + * Error details. Filled in only on error, when not NULL. > + * @return > + * 0 on success, non-zero error code otherwise. > + */ > +int rte_scheddev_node_suspend(uint8_t port_id, > + uint32_t node_id, > + struct rte_scheddev_error *error); > + > +/** > + * Scheduler node resume > + * > + * Resume an existing node that was previously suspended. > + * > + * @param port_id > + * The port identifier of the Ethernet device. > + * @param node_id > + * Node ID. Needs to be valid. > + * @param error > + * Error details. Filled in only on error, when not NULL. > + * @return > + * 0 on success, non-zero error code otherwise. > + */ > +int rte_scheddev_node_resume(uint8_t port_id, > + uint32_t node_id, > + struct rte_scheddev_error *error); > + > +/** > + * Scheduler hierarchy set > + * > + * This function is called during the port initialization phase (before the > + * Ethernet port is started) to freeze the scheduler start-up hierarchy. > + * > + * This function fails when the currently configured scheduler hierarchy is not > + * supported by the Ethernet port, in which case the user can abort or try out > + * another hierarchy configuration (e.g. a hierarchy with less leaf nodes), > + * which can be build from scratch (when *clear_on_fail* is enabled) or by > + * modifying the existing hierarchy configuration (when *clear_on_fail* is > + * disabled). > + * > + * Note that, even when the configured scheduler hierarchy is supported (so this > + * function is successful), the Ethernet port start might still fail due to e.g. > + * not enough memory being available in the system, etc. 
> + * > + * @param port_id > + * The port identifier of the Ethernet device. > + * @param clear_on_fail > + * On function call failure, hierarchy is cleared when this parameter is > + * non-zero and preserved when this parameter is equal to zero. > + * @param error > + * Error details. Filled in only on error, when not NULL. > + * @return > + * 0 on success, non-zero error code otherwise. > + */ > +int rte_scheddev_hierarchy_set(uint8_t port_id, > + int clear_on_fail, > + struct rte_scheddev_error *error); > + > +/** > + * Scheduler node parent update > + * > + * @param port_id > + * The port identifier of the Ethernet device. > + * @param node_id > + * Node ID. Needs to be valid. > + * @param parent_node_id > + * Node ID for the new parent. Needs to be valid. > + * @param priority > + * Node priority. The highest node priority is zero. Used by the SP algorithm > + * running on the parent of the current node for scheduling this child node. > + * @param weight > + * Node weight. The node weight is relative to the weight sum of all siblings > + * that have the same priority. The lowest weight is zero. Used by the WFQ/WRR > + * algorithm running on the parent of the current node for scheduling this > + * child node. > + * @param error > + * Error details. Filled in only on error, when not NULL. > + * @return > + * 0 on success, non-zero error code otherwise. > + */ > +int rte_scheddev_node_parent_update(uint8_t port_id, > + uint32_t node_id, > + uint32_t parent_node_id, > + uint32_t priority, > + uint32_t weight, > + struct rte_scheddev_error *error); > +

The usage is not clear. How is it different from the node_add API? Is the intention to update a specific node, or to change the connection of a specific node to an existing or new parent?

> +/** > + * Scheduler node private shaper update > + * > + * @param port_id > + * The port identifier of the Ethernet device. > + * @param node_id > + * Node ID. Needs to be valid.
> + * @param shaper_profile_id > + * Shaper profile ID for the private shaper of the current node. Needs to be > + * either valid shaper profile ID or RTE_SCHEDDEV_SHAPER_PROFILE_ID_NONE, with > + * the latter disabling the private shaper of the current node. > + * @param error > + * Error details. Filled in only on error, when not NULL. > + * @return > + * 0 on success, non-zero error code otherwise. > + */ > +int rte_scheddev_node_shaper_update(uint8_t port_id, > + uint32_t node_id, > + uint32_t shaper_profile_id, > + struct rte_scheddev_error *error); > + > +/** > + * Scheduler node shared shapers update > + * > + * @param port_id > + * The port identifier of the Ethernet device. > + * @param node_id > + * Node ID. Needs to be valid. > + * @param shared_shaper_id > + * Shared shaper ID. Needs to be valid. > + * @param add > + * Set to non-zero value to add this shared shaper to current node or to zero > + * to delete this shared shaper from current node. > + * @param error > + * Error details. Filled in only on error, when not NULL. > + * @return > + * 0 on success, non-zero error code otherwise. > + */ > +int rte_scheddev_node_shared_shaper_update(uint8_t port_id, > + uint32_t node_id, > + uint32_t shared_shaper_id, > + int add, > + struct rte_scheddev_error *error); > + > +/** > + * Scheduler node scheduling mode update > + * > + * @param port_id > + * The port identifier of the Ethernet device. > + * @param node_id > + * Node ID. Needs to be valid leaf node ID. > + * @param scheduling_mode_per_priority > + * For each priority, indicates whether the children nodes sharing the same > + * priority are to be scheduled by WFQ or by WRR. When NULL, it indicates that > + * WFQ is to be used for all priorities. When non-NULL, it points to a > + * pre-allocated array of *n_priority* elements, with a non-zero value element > + * indicating WFQ and a zero value element for WRR. > + * @param n_priorities > + * Number of priorities. > + * @param error > + * Error details. 
Filled in only on error, when not NULL. > + * @return > + * 0 on success, non-zero error code otherwise. > + */ > +int rte_scheddev_node_scheduling_mode_update(uint8_t port_id, > + uint32_t node_id, > + int *scheduling_mode_per_priority, > + uint32_t n_priorities, > + struct rte_scheddev_error *error); > + > +/** > + * Scheduler node congestion management mode update > + * > + * @param port_id > + * The port identifier of the Ethernet device. > + * @param node_id > + * Node ID. Needs to be valid leaf node ID. > + * @param cman > + * Congestion management mode. > + * @param error > + * Error details. Filled in only on error, when not NULL. > + * @return > + * 0 on success, non-zero error code otherwise. > + */ > +int rte_scheddev_node_cman_update(uint8_t port_id, > + uint32_t node_id, > + enum rte_scheddev_cman_mode cman, > + struct rte_scheddev_error *error); > + > +/** > + * Scheduler node private WRED context update > + * > + * @param port_id > + * The port identifier of the Ethernet device. > + * @param node_id > + * Node ID. Needs to be valid leaf node ID. > + * @param wred_profile_id > + * WRED profile ID for the private WRED context of the current node. Needs to > + * be either valid WRED profile ID or RTE_SCHEDDEV_WRED_PROFILE_ID_NONE, with > + * the latter disabling the private WRED context of the current node. > + * @param error > + * Error details. Filled in only on error, when not NULL. > + * @return > + * 0 on success, non-zero error code otherwise. > + */ > +int rte_scheddev_node_wred_context_update(uint8_t port_id, > + uint32_t node_id, > + uint32_t wred_profile_id, > + struct rte_scheddev_error *error); > + > +/** > + * Scheduler node shared WRED context update > + * > + * @param port_id > + * The port identifier of the Ethernet device. > + * @param node_id > + * Node ID. Needs to be valid leaf node ID. > + * @param shared_wred_context_id > + * Shared WRED context ID. Needs to be valid. 
> + * @param add > + * Set to non-zero value to add this shared WRED context to current node or to > + * zero to delete this shared WRED context from current node. > + * @param error > + * Error details. Filled in only on error, when not NULL. > + * @return > + * 0 on success, non-zero error code otherwise. > + */ > +int rte_scheddev_node_shared_wred_context_update(uint8_t port_id, > + uint32_t node_id, > + uint32_t shared_wred_context_id, > + int add, > + struct rte_scheddev_error *error); > + > +/** > + * Scheduler packet marking - VLAN DEI (IEEE 802.1Q) > + * > + * IEEE 802.1p maps the traffic class to the VLAN Priority Code Point (PCP) > + * field (3 bits), while IEEE 802.1q maps the drop priority to the VLAN Drop > + * Eligible Indicator (DEI) field (1 bit), which was previously named Canonical > + * Format Indicator (CFI). > + * > + * All VLAN frames of a given color get their DEI bit set if marking is enabled > + * for this color; otherwise, their DEI bit is left as is (either set or not). > + * > + * @param port_id > + * The port identifier of the Ethernet device. > + * @param mark_green > + * Set to non-zero value to enable marking of green packets and to zero to > + * disable it. > + * @param mark_yellow > + * Set to non-zero value to enable marking of yellow packets and to zero to > + * disable it. > + * @param mark_red > + * Set to non-zero value to enable marking of red packets and to zero to > + * disable it. > + * @param error > + * Error details. Filled in only on error, when not NULL. > + * @return > + * 0 on success, non-zero error code otherwise. 
> + */ > +int rte_scheddev_mark_vlan_dei(uint8_t port_id, > + int mark_green, > + int mark_yellow, > + int mark_red, > + struct rte_scheddev_error *error); > + > +/** > + * Scheduler packet marking - IPv4 / IPv6 ECN (IETF RFC 3168) > + * > + * IETF RFCs 2474 and 3168 reorganize the IPv4 Type of Service (TOS) field > + * (8 bits) and the IPv6 Traffic Class (TC) field (8 bits) into Differentiated > + * Services Codepoint (DSCP) field (6 bits) and Explicit Congestion Notification > + * (ECN) field (2 bits). The DSCP field is typically used to encode the traffic > + * class and/or drop priority (RFC 2597), while the ECN field is used by RFC > + * 3168 to implement a congestion notification mechanism to be leveraged by > + * transport layer protocols such as TCP and SCTP that have congestion control > + * mechanisms. > + * > + * When congestion is experienced, as alternative to dropping the packet, > + * routers can change the ECN field of input packets from 2'b01 or 2'b10 (values > + * indicating that source endpoint is ECN-capable) to 2'b11 (meaning that > + * congestion is experienced). The destination endpoint can use the ECN-Echo > + * (ECE) TCP flag to relay the congestion indication back to the source > + * endpoint, which acknowledges it back to the destination endpoint with the > + * Congestion Window Reduced (CWR) TCP flag. > + * > + * All IPv4/IPv6 packets of a given color with ECN set to 2’b01 or 2’b10 > + * carrying TCP or SCTP have their ECN set to 2’b11 if the marking feature is > + * enabled for the current color, otherwise the ECN field is left as is. > + * > + * @param port_id > + * The port identifier of the Ethernet device. > + * @param mark_green > + * Set to non-zero value to enable marking of green packets and to zero to > + * disable it. > + * @param mark_yellow > + * Set to non-zero value to enable marking of yellow packets and to zero to > + * disable it. 
> + * @param mark_red > + * Set to non-zero value to enable marking of red packets and to zero to > + * disable it. > + * @param error > + * Error details. Filled in only on error, when not NULL. > + * @return > + * 0 on success, non-zero error code otherwise. > + */ > +int rte_scheddev_mark_ip_ecn(uint8_t port_id, > + int mark_green, > + int mark_yellow, > + int mark_red, > + struct rte_scheddev_error *error); > + > +/** > + * Scheduler packet marking - IPv4 / IPv6 DSCP (IETF RFC 2597) > + * > + * IETF RFC 2597 maps the traffic class and the drop priority to the IPv4/IPv6 > + * Differentiated Services Codepoint (DSCP) field (6 bits). Here are the DSCP > + * values proposed by this RFC: > + * > + * Class 1 Class 2 Class 3 Class 4 > + * +----------+----------+----------+----------+ > + * Low Drop Prec | 001010 | 010010 | 011010 | 100010 | > + * Medium Drop Prec | 001100 | 010100 | 011100 | 100100 | > + * High Drop Prec | 001110 | 010110 | 011110 | 100110 | > + * +----------+----------+----------+----------+ > + * > + * There are 4 traffic classes (classes 1 .. 4) encoded by DSCP bits 1 and 2, as > + * well as 3 drop priorities (low/medium/high) encoded by DSCP bits 3 and 4. > + * > + * All IPv4/IPv6 packets have their color marked into DSCP bits 3 and 4 as > + * follows: green mapped to Low Drop Precedence (2’b01), yellow to Medium > + * (2’b10) and red to High (2’b11). Marking needs to be explicitly enabled > + * for each color; when not enabled for a given color, the DSCP field of all > + * packets with that color is left as is. > + * > + * @param port_id > + * The port identifier of the Ethernet device. > + * @param mark_green > + * Set to non-zero value to enable marking of green packets and to zero to > + * disable it. > + * @param mark_yellow > + * Set to non-zero value to enable marking of yellow packets and to zero to > + * disable it. > + * @param mark_red > + * Set to non-zero value to enable marking of red packets and to zero to > + * disable it. 
> + * @param error > + * Error details. Filled in only on error, when not NULL. > + * @return > + * 0 on success, non-zero error code otherwise. > + */ > +int rte_scheddev_mark_ip_dscp(uint8_t port_id, > + int mark_green, > + int mark_yellow, > + int mark_red, > + struct rte_scheddev_error *error); > + > +/** > + * Scheduler get statistics counter types enabled for all nodes > + * > + * @param port_id > + * The port identifier of the Ethernet device. > + * @param nonleaf_node_capability_stats_mask > + * Statistics counter types available per node for all non-leaf nodes. Needs > + * to be pre-allocated. > + * @param nonleaf_node_enabled_stats_mask > + * Statistics counter types currently enabled per node for each non-leaf node. > + * This is a subset of *nonleaf_node_capability_stats_mask*. Needs to be > + * pre-allocated. > + * @param leaf_node_capability_stats_mask > + * Statistics counter types available per node for all leaf nodes. Needs to > + * be pre-allocated. > + * @param leaf_node_enabled_stats_mask > + * Statistics counter types currently enabled for each leaf node. This is > + * a subset of *leaf_node_capability_stats_mask*. Needs to be pre-allocated. > + * @param error > + * Error details. Filled in only on error, when not NULL. > + * @return > + * 0 on success, non-zero error code otherwise. > + */ > +int rte_scheddev_stats_get_enabled(uint8_t port_id, > + uint64_t *nonleaf_node_capability_stats_mask, > + uint64_t *nonleaf_node_enabled_stats_mask, > + uint64_t *leaf_node_capability_stats_mask, > + uint64_t *leaf_node_enabled_stats_mask, > + struct rte_scheddev_error *error); > + > +/** > + * Scheduler enable selected statistics counters for all nodes > + * > + * @param port_id > + * The port identifier of the Ethernet device. > + * @param nonleaf_node_enabled_stats_mask > + * Statistics counter types to be enabled per node for each non-leaf node. 
> + * This needs to be a subset of the statistics counter types available per > + * node for all non-leaf nodes. Any statistics counter type not included in > + * this set is to be disabled for all non-leaf nodes. > + * @param leaf_node_enabled_stats_mask > + * Statistics counter types to be enabled per node for each leaf node. This > + * needs to be a subset of the statistics counter types available per node for > + * all leaf nodes. Any statistics counter type not included in this set is to > + * be disabled for all leaf nodes. > + * @param error > + * Error details. Filled in only on error, when not NULL. > + * @return > + * 0 on success, non-zero error code otherwise. > + */ > +int rte_scheddev_stats_enable(uint8_t port_id, > + uint64_t nonleaf_node_enabled_stats_mask, > + uint64_t leaf_node_enabled_stats_mask, > + struct rte_scheddev_error *error); > + > +/** > + * Scheduler get statistics counter types enabled for current node > + * > + * @param port_id > + * The port identifier of the Ethernet device. > + * @param node_id > + * Node ID. Needs to be valid. > + * @param capability_stats_mask > + * Statistics counter types available for the current node. Needs to be > + * pre-allocated. > + * @param enabled_stats_mask > + * Statistics counter types currently enabled for the current node. This is > + * a subset of *capability_stats_mask*. Needs to be pre-allocated. > + * @param error > + * Error details. Filled in only on error, when not NULL. > + * @return > + * 0 on success, non-zero error code otherwise. > + */ > +int rte_scheddev_node_stats_get_enabled(uint8_t port_id, > + uint32_t node_id, > + uint64_t *capability_stats_mask, > + uint64_t *enabled_stats_mask, > + struct rte_scheddev_error *error); > + > +/** > + * Scheduler enable selected statistics counters for current node > + * > + * @param port_id > + * The port identifier of the Ethernet device. > + * @param node_id > + * Node ID. Needs to be valid. 
> + * @param enabled_stats_mask > + * Statistics counter types to be enabled for the current node. This needs to > + * be a subset of the statistics counter types available for the current node. > + * Any statistics counter type not included in this set is to be disabled for > + * the current node. > + * @param error > + * Error details. Filled in only on error, when not NULL. > + * @return > + * 0 on success, non-zero error code otherwise. > + */ > +int rte_scheddev_node_stats_enable(uint8_t port_id, > + uint32_t node_id, > + uint64_t enabled_stats_mask, > + struct rte_scheddev_error *error); > + > +/** > + * Scheduler node statistics counters read > + * > + * @param port_id > + * The port identifier of the Ethernet device. > + * @param node_id > + * Node ID. Needs to be valid. > + * @param stats > + * When non-NULL, it contains the current value for the statistics counters > + * enabled for the current node. > + * @param clear > + * When this parameter has a non-zero value, the statistics counters are > + * cleared (i.e. set to zero) immediately after they have been read, otherwise > + * the statistics counters are left untouched. > + * @param error > + * Error details. Filled in only on error, when not NULL. > + * @return > + * 0 on success, non-zero error code otherwise. > + */ > +int rte_scheddev_node_stats_read(uint8_t port_id, > + uint32_t node_id, > + struct rte_scheddev_node_stats *stats, > + int clear, > + struct rte_scheddev_error *error); > + > +#ifdef __cplusplus > +} > +#endif > + > +#endif /* __INCLUDE_RTE_SCHEDDEV_H__ */ > diff --git a/lib/librte_ether/rte_scheddev_driver.h b/lib/librte_ether/rte_scheddev_driver.h > new file mode 100644 > index 0000000..c0a0321 > --- /dev/null > +++ b/lib/librte_ether/rte_scheddev_driver.h > @@ -0,0 +1,374 @@ > +/*- > + * BSD LICENSE > + * > + * Copyright(c) 2017 Intel Corporation. All rights reserved. > + * All rights reserved. 
> + * > + * Redistribution and use in source and binary forms, with or without > + * modification, are permitted provided that the following conditions > + * are met: > + * > + * * Redistributions of source code must retain the above copyright > + * notice, this list of conditions and the following disclaimer. > + * * Redistributions in binary form must reproduce the above copyright > + * notice, this list of conditions and the following disclaimer in > + * the documentation and/or other materials provided with the > + * distribution. > + * * Neither the name of Intel Corporation nor the names of its > + * contributors may be used to endorse or promote products derived > + * from this software without specific prior written permission. > + * > + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS > + * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT > + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR > + * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT > + * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, > + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT > + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, > + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY > + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT > + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE > + * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. > + */ > + > +#ifndef __INCLUDE_RTE_SCHEDDEV_DRIVER_H__ > +#define __INCLUDE_RTE_SCHEDDEV_DRIVER_H__ > + > +/** > + * @file > + * RTE Generic Hierarchical Scheduler API (Driver Side) > + * > + * This file provides implementation helpers for internal use by PMDs, they > + * are not intended to be exposed to applications and are not subject to ABI > + * versioning. 
> + */ > + > +#include > + > +#include > +#include "rte_ethdev.h" > +#include "rte_scheddev.h" > + > +#ifdef __cplusplus > +extern "C" { > +#endif > + > +typedef int (*rte_scheddev_capabilities_get_t)(struct rte_eth_dev *dev, > + struct rte_scheddev_capabilities *cap, > + struct rte_scheddev_error *error); > +/**< @internal Scheduler capabilities get */ > + > +typedef int (*rte_scheddev_node_capabilities_get_t)(struct rte_eth_dev *dev, > + uint32_t node_id, > + struct rte_scheddev_node_capabilities *cap, > + struct rte_scheddev_error *error); > +/**< @internal Scheduler node capabilities get */ > + > +typedef int (*rte_scheddev_wred_profile_add_t)(struct rte_eth_dev *dev, > + uint32_t wred_profile_id, > + struct rte_scheddev_wred_params *profile, > + struct rte_scheddev_error *error); > +/**< @internal Scheduler WRED profile add */ > + > +typedef int (*rte_scheddev_wred_profile_delete_t)(struct rte_eth_dev *dev, > + uint32_t wred_profile_id, > + struct rte_scheddev_error *error); > +/**< @internal Scheduler WRED profile delete */ > + > +typedef int (*rte_scheddev_shared_wred_context_add_update_t)( > + struct rte_eth_dev *dev, > + uint32_t shared_wred_context_id, > + uint32_t wred_profile_id, > + struct rte_scheddev_error *error); > +/**< @internal Scheduler shared WRED context add */ > + > +typedef int (*rte_scheddev_shared_wred_context_delete_t)( > + struct rte_eth_dev *dev, > + uint32_t shared_wred_context_id, > + struct rte_scheddev_error *error); > +/**< @internal Scheduler shared WRED context delete */ > + > +typedef int (*rte_scheddev_shaper_profile_add_t)(struct rte_eth_dev *dev, > + uint32_t shaper_profile_id, > + struct rte_scheddev_shaper_params *profile, > + struct rte_scheddev_error *error); > +/**< @internal Scheduler shaper profile add */ > + > +typedef int (*rte_scheddev_shaper_profile_delete_t)(struct rte_eth_dev *dev, > + uint32_t shaper_profile_id, > + struct rte_scheddev_error *error); > +/**< @internal Scheduler shaper profile delete */ > + > 
+typedef int (*rte_scheddev_shared_shaper_add_update_t)(struct rte_eth_dev *dev, > + uint32_t shared_shaper_id, > + uint32_t shaper_profile_id, > + struct rte_scheddev_error *error); > +/**< @internal Scheduler shared shaper add/update */ > + > +typedef int (*rte_scheddev_shared_shaper_delete_t)(struct rte_eth_dev *dev, > + uint32_t shared_shaper_id, > + struct rte_scheddev_error *error); > +/**< @internal Scheduler shared shaper delete */ > + > +typedef int (*rte_scheddev_node_add_t)(struct rte_eth_dev *dev, > + uint32_t node_id, > + uint32_t parent_node_id, > + uint32_t priority, > + uint32_t weight, > + struct rte_scheddev_node_params *params, > + struct rte_scheddev_error *error); > +/**< @internal Scheduler node add */ > + > +typedef int (*rte_scheddev_node_delete_t)(struct rte_eth_dev *dev, > + uint32_t node_id, > + struct rte_scheddev_error *error); > +/**< @internal Scheduler node delete */ > + > +typedef int (*rte_scheddev_node_suspend_t)(struct rte_eth_dev *dev, > + uint32_t node_id, > + struct rte_scheddev_error *error); > +/**< @internal Scheduler node suspend */ > + > +typedef int (*rte_scheddev_node_resume_t)(struct rte_eth_dev *dev, > + uint32_t node_id, > + struct rte_scheddev_error *error); > +/**< @internal Scheduler node resume */ > + > +typedef int (*rte_scheddev_hierarchy_set_t)(struct rte_eth_dev *dev, > + int clear_on_fail, > + struct rte_scheddev_error *error); > +/**< @internal Scheduler hierarchy set */ > + > +typedef int (*rte_scheddev_node_parent_update_t)(struct rte_eth_dev *dev, > + uint32_t node_id, > + uint32_t parent_node_id, > + uint32_t priority, > + uint32_t weight, > + struct rte_scheddev_error *error); > +/**< @internal Scheduler node parent update */ > + > +typedef int (*rte_scheddev_node_shaper_update_t)(struct rte_eth_dev *dev, > + uint32_t node_id, > + uint32_t shaper_profile_id, > + struct rte_scheddev_error *error); > +/**< @internal Scheduler node shaper update */ > + > +typedef int 
(*rte_scheddev_node_shared_shaper_update_t)(struct rte_eth_dev *dev, > + uint32_t node_id, > + uint32_t shared_shaper_id, > + int32_t add, > + struct rte_scheddev_error *error); > +/**< @internal Scheduler node shaper update */ > + > +typedef int (*rte_scheddev_node_scheduling_mode_update_t)( > + struct rte_eth_dev *dev, > + uint32_t node_id, > + int *scheduling_mode_per_priority, > + uint32_t n_priorities, > + struct rte_scheddev_error *error); > +/**< @internal Scheduler node scheduling mode update */ > + > +typedef int (*rte_scheddev_node_cman_update_t)(struct rte_eth_dev *dev, > + uint32_t node_id, > + enum rte_scheddev_cman_mode cman, > + struct rte_scheddev_error *error); > +/**< @internal Scheduler node congestion management mode update */ > + > +typedef int (*rte_scheddev_node_wred_context_update_t)( > + struct rte_eth_dev *dev, > + uint32_t node_id, > + uint32_t wred_profile_id, > + struct rte_scheddev_error *error); > +/**< @internal Scheduler node WRED context update */ > + > +typedef int (*rte_scheddev_node_shared_wred_context_update_t)( > + struct rte_eth_dev *dev, > + uint32_t node_id, > + uint32_t shared_wred_context_id, > + int add, > + struct rte_scheddev_error *error); > +/**< @internal Scheduler node WRED context update */ > + > +typedef int (*rte_scheddev_mark_vlan_dei_t)(struct rte_eth_dev *dev, > + int mark_green, > + int mark_yellow, > + int mark_red, > + struct rte_scheddev_error *error); > +/**< @internal Scheduler packet marking - VLAN DEI */ > + > +typedef int (*rte_scheddev_mark_ip_ecn_t)(struct rte_eth_dev *dev, > + int mark_green, > + int mark_yellow, > + int mark_red, > + struct rte_scheddev_error *error); > +/**< @internal Scheduler packet marking - IPv4/IPv6 ECN */ > + > +typedef int (*rte_scheddev_mark_ip_dscp_t)(struct rte_eth_dev *dev, > + int mark_green, > + int mark_yellow, > + int mark_red, > + struct rte_scheddev_error *error); > +/**< @internal Scheduler packet marking - IPv4/IPv6 DSCP */ > + > +typedef int 
(*rte_scheddev_stats_get_enabled_t)(struct rte_eth_dev *dev, > + uint64_t *nonleaf_node_capability_stats_mask, > + uint64_t *nonleaf_node_enabled_stats_mask, > + uint64_t *leaf_node_capability_stats_mask, > + uint64_t *leaf_node_enabled_stats_mask, > + struct rte_scheddev_error *error); > +/**< @internal Scheduler get set of stats counters enabled for all nodes */ > + > +typedef int (*rte_scheddev_stats_enable_t)(struct rte_eth_dev *dev, > + uint64_t nonleaf_node_enabled_stats_mask, > + uint64_t leaf_node_enabled_stats_mask, > + struct rte_scheddev_error *error); > +/**< @internal Scheduler enable selected stats counters for all nodes */ > + > +typedef int (*rte_scheddev_node_stats_get_enabled_t)(struct rte_eth_dev *dev, > + uint32_t node_id, > + uint64_t *capability_stats_mask, > + uint64_t *enabled_stats_mask, > + struct rte_scheddev_error *error); > +/**< @internal Scheduler get set of stats counters enabled for specific node */ > + > +typedef int (*rte_scheddev_node_stats_enable_t)(struct rte_eth_dev *dev, > + uint32_t node_id, > + uint64_t enabled_stats_mask, > + struct rte_scheddev_error *error); > +/**< @internal Scheduler enable selected stats counters for specific node */ > + > +typedef int (*rte_scheddev_node_stats_read_t)(struct rte_eth_dev *dev, > + uint32_t node_id, > + struct rte_scheddev_node_stats *stats, > + int clear, > + struct rte_scheddev_error *error); > +/**< @internal Scheduler read stats counters for specific node */ > + > +struct rte_scheddev_ops { > + /** Scheduler capabilities_get */ > + rte_scheddev_capabilities_get_t capabilities_get; > + /** Scheduler node capabilities get */ > + rte_scheddev_node_capabilities_get_t node_capabilities_get; > + > + /** Scheduler WRED profile add */ > + rte_scheddev_wred_profile_add_t wred_profile_add; > + /** Scheduler WRED profile delete */ > + rte_scheddev_wred_profile_delete_t wred_profile_delete; > + /** Scheduler shared WRED context add/update */ > + rte_scheddev_shared_wred_context_add_update_t > 
+ shared_wred_context_add_update; > + /** Scheduler shared WRED context delete */ > + rte_scheddev_shared_wred_context_delete_t > + shared_wred_context_delete; > + /** Scheduler shaper profile add */ > + rte_scheddev_shaper_profile_add_t shaper_profile_add; > + /** Scheduler shaper profile delete */ > + rte_scheddev_shaper_profile_delete_t shaper_profile_delete; > + /** Scheduler shared shaper add/update */ > + rte_scheddev_shared_shaper_add_update_t shared_shaper_add_update; > + /** Scheduler shared shaper delete */ > + rte_scheddev_shared_shaper_delete_t shared_shaper_delete; > + > + /** Scheduler node add */ > + rte_scheddev_node_add_t node_add; > + /** Scheduler node delete */ > + rte_scheddev_node_delete_t node_delete; > + /** Scheduler node suspend */ > + rte_scheddev_node_suspend_t node_suspend; > + /** Scheduler node resume */ > + rte_scheddev_node_resume_t node_resume; > + /** Scheduler hierarchy set */ > + rte_scheddev_hierarchy_set_t hierarchy_set; > + > + /** Scheduler node parent update */ > + rte_scheddev_node_parent_update_t node_parent_update; > + /** Scheduler node shaper update */ > + rte_scheddev_node_shaper_update_t node_shaper_update; > + /** Scheduler node shared shaper update */ > + rte_scheddev_node_shared_shaper_update_t node_shared_shaper_update; > + /** Scheduler node scheduling mode update */ > + rte_scheddev_node_scheduling_mode_update_t node_scheduling_mode_update; > + /** Scheduler node congestion management mode update */ > + rte_scheddev_node_cman_update_t node_cman_update; > + /** Scheduler node WRED context update */ > + rte_scheddev_node_wred_context_update_t node_wred_context_update; > + /** Scheduler node shared WRED context update */ > + rte_scheddev_node_shared_wred_context_update_t > + node_shared_wred_context_update; > + > + /** Scheduler packet marking - VLAN DEI */ > + rte_scheddev_mark_vlan_dei_t mark_vlan_dei; > + /** Scheduler packet marking - IPv4/IPv6 ECN */ > + rte_scheddev_mark_ip_ecn_t mark_ip_ecn; > + /** 
Scheduler packet marking - IPv4/IPv6 DSCP */ > + rte_scheddev_mark_ip_dscp_t mark_ip_dscp; > + > + /** Scheduler get statistics counter type enabled for all nodes */ > + rte_scheddev_stats_get_enabled_t stats_get_enabled; > + /** Scheduler enable selected statistics counters for all nodes */ > + rte_scheddev_stats_enable_t stats_enable; > + /** Scheduler get statistics counter type enabled for current node */ > + rte_scheddev_node_stats_get_enabled_t node_stats_get_enabled; > + /** Scheduler enable selected statistics counters for current node */ > + rte_scheddev_node_stats_enable_t node_stats_enable; > + /** Scheduler read statistics counters for current node */ > + rte_scheddev_node_stats_read_t node_stats_read; > +}; > + > +/** > + * Initialize generic error structure. > + * > + * This function also sets rte_errno to a given value. > + * > + * @param error > + * Pointer to error structure (may be NULL). > + * @param code > + * Related error code (rte_errno). > + * @param type > + * Cause field and error type. > + * @param cause > + * Object responsible for the error. > + * @param message > + * Human-readable error message. > + * > + * @return > + * Error code. > + */ > +static inline int > +rte_scheddev_error_set(struct rte_scheddev_error *error, > + int code, > + enum rte_scheddev_error_type type, > + const void *cause, > + const char *message) > +{ > + if (error) { > + *error = (struct rte_scheddev_error){ > + .type = type, > + .cause = cause, > + .message = message, > + }; > + } > + rte_errno = code; > + return code; > +} > + > +/** > + * Get generic hierarchical scheduler operations structure from a port > + * > + * @param port_id > + * The port identifier of the Ethernet device. > + * @param error > + * Error details > + * > + * @return > + * The hierarchical scheduler operations structure associated with port_id on > + * success, NULL otherwise. 
> + */ > +const struct rte_scheddev_ops * > +rte_scheddev_ops_get(uint8_t port_id, struct rte_scheddev_error *error); > + > +#ifdef __cplusplus > +} > +#endif > + > +#endif /* __INCLUDE_RTE_SCHEDDEV_DRIVER_H__ */ >
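
For discussion, here is a minimal sketch of how an API-side wrapper could dispatch through the ops structure quoted above and report a missing driver callback through the error helper. It is not taken from the patch: every sketch_* and toy_* name, the error type values, the field layout of the error structure and the choice of ENODEV/ENOSYS are assumptions made only so that the fragment compiles on its own without the DPDK headers. Only the shape of the error helper mirrors the quoted rte_scheddev_error_set().

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for the quoted declarations. The enum values and the field
 * layout are assumptions, not copied from the patch. */
enum sketch_error_type {
	SKETCH_ERROR_TYPE_NONE = 0,
	SKETCH_ERROR_TYPE_UNSPECIFIED,
};

struct sketch_error {
	enum sketch_error_type type;
	const void *cause;
	const char *message;
};

static int sketch_errno; /* plays the role of rte_errno in this sketch */

/* Same shape as the quoted rte_scheddev_error_set() helper: fill in the
 * error structure when one is supplied, record the code, return it. */
static inline int
sketch_error_set(struct sketch_error *error, int code,
		 enum sketch_error_type type, const void *cause,
		 const char *message)
{
	if (error) {
		*error = (struct sketch_error){
			.type = type,
			.cause = cause,
			.message = message,
		};
	}
	sketch_errno = code;
	return code;
}

struct sketch_dev;

typedef int (*sketch_node_suspend_t)(struct sketch_dev *dev,
				     uint32_t node_id,
				     struct sketch_error *error);

/* Reduced ops table: a driver leaves a callback NULL when it does not
 * implement the corresponding operation. */
struct sketch_ops {
	sketch_node_suspend_t node_suspend;
};

struct sketch_dev {
	const struct sketch_ops *ops;
	uint32_t suspended_node; /* toy driver state */
};

/* API-side wrapper: check the ops table, then call into the driver. */
static int
sketch_node_suspend(struct sketch_dev *dev, uint32_t node_id,
		    struct sketch_error *error)
{
	if (dev == NULL || dev->ops == NULL)
		return sketch_error_set(error, ENODEV,
					SKETCH_ERROR_TYPE_UNSPECIFIED, NULL,
					"port has no scheduler ops");
	if (dev->ops->node_suspend == NULL)
		return sketch_error_set(error, ENOSYS,
					SKETCH_ERROR_TYPE_UNSPECIFIED, NULL,
					"node suspend not implemented");
	return dev->ops->node_suspend(dev, node_id, error);
}

/* Toy driver callback that only records which node was suspended. */
static int
toy_node_suspend(struct sketch_dev *dev, uint32_t node_id,
		 struct sketch_error *error)
{
	(void)error;
	assert(dev != NULL);
	dev->suspended_node = node_id;
	return 0;
}
```

With an ops table whose node_suspend is NULL, the wrapper returns ENOSYS and fills in the error message; once the table points at toy_node_suspend, the same call returns 0. Whether the real wrappers should return positive or negative codes is a separate point and is left open here.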