DPDK-dev Archive on lore.kernel.org
* Re: Why IP_PIPELINE is faster than L2FWD
From: Royce Niu @ 2016-12-22 12:48 UTC (permalink / raw)
  To: Bruce Richardson; +Cc: Royce Niu, dev
In-Reply-To: <20161222111528.GA11104@bricha3-MOBL3.ger.corp.intel.com>

But actually, L3FWD in IP_PIPELINE also modifies the MAC address, and it is
still faster than stock L2FWD. How can this be explained?

What I really want to know is why IP_PIPELINE is so much faster, so that I
can learn from it and write our own program.

However, its documentation is not detailed enough. If possible, could you
tell me what the key to the speedup is? Thanks!

On Thu, Dec 22, 2016 at 7:15 PM, Bruce Richardson <
bruce.richardson@intel.com> wrote:

> On Thu, Dec 22, 2016 at 12:18:12AM +0800, Royce Niu wrote:
> > Hi all,
> >
> > I tested default L2FWD and IP_PIPELINE (pass-through). The throughput of
> > IP_PIPELINE is higher immensely.
> >
> > There are only two virtual NICs in KVM. The experiment is just moving
> > packet from vNIC0  to vNIC1. I think the function is so simple. Why L2FWD
> > is much slower?
> >
> > How can I improve L2FWD, to make L2FWD faster?
> >
> Is IP_PIPELINE in passthrough mode modifying the packets? L2FWD swaps
> the mac addresses on each packet as it processes them, which can slow it
> down. L2FWD is also more an example of how the APIs work than anything
> else. For fastest possible port-to-port forwarding, testpmd should give
> the highest performance.
>
> /Bruce
>



-- 
Regards,

Royce


* [PATCH v14 0/8] add Tx preparation
From: Tomasz Kulasek @ 2016-12-22 13:05 UTC (permalink / raw)
  To: dev
In-Reply-To: <1481650914-40324-1-git-send-email-tomaszx.kulasek@intel.com>

As discussed in that thread:

http://dpdk.org/ml/archives/dev/2015-September/023603.html

Different NIC models, depending on the HW offload requested, might impose
different requirements on packets to be TX-ed in terms of:

 - Max number of fragments per packet allowed
 - Max number of fragments per TSO segment
 - The way pseudo-header checksum should be pre-calculated
 - L3/L4 header fields filling
 - etc.


MOTIVATION:
-----------

1) Some work cannot (and should not) be done in rte_eth_tx_burst.
   However, this work is sometimes required, and currently it is left
   to the application.

2) Different hardware may have different requirements for TX offloads;
   a different subset of offloads may be supported, and so on.

3) Some parameters (e.g. the number of segments in the ixgbe driver) may
   hang the device. These parameters may vary between devices.

   For example, i40e HW allows 8 fragments per packet, but that is after
   TSO segmentation, while ixgbe has a 38-fragment pre-TSO limit.

4) Fields in the packet may require different initialization (e.g.
   pseudo-header checksum precalculation, sometimes done in a different
   way depending on packet type, and so on). Currently the application
   needs to take care of it.

5) Using an additional API (rte_eth_tx_prepare) before rte_eth_tx_burst
   makes it possible to prepare the packet burst in a form acceptable
   to the specific device.

6) Some additional checks may be done in debug mode, keeping the tx_burst
   implementation clean.


PROPOSAL:
---------

To help the user deal with all these variations, we propose to:

1) Introduce the rte_eth_tx_prepare() function to do the necessary
   preparation of a packet burst so that it can be safely transmitted on
   the device with the desired HW offloads (set/reset checksum fields
   according to the hardware requirements) and to check HW constraints
   (number of segments per packet, etc).

   Since the limitations and requirements may differ between devices,
   this requires extending the rte_eth_dev structure with a new function
   pointer, "tx_pkt_prepare", which can be implemented in the driver to
   prepare and verify packets in a device-specific way before the burst.
   This should prevent the application from sending malformed packets.

2) Introduce new fields in rte_eth_desc_lim: nb_seg_max and
   nb_mtu_seg_max, providing information about the maximum number of
   segments in TSO and non-TSO packets accepted by the device.

   This information lets the application avoid creating packets that
   exceed these limits.
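
The segment limits from point 2 could be checked on the application side
roughly as follows. This is only a sketch: struct desc_lim_sketch is a
simplified stand-in for the two new rte_eth_desc_lim fields, and
segs_within_limit() is a hypothetical helper, not part of the patch. It
assumes nb_seg_max applies to TSO packets and nb_mtu_seg_max to non-TSO
packets, as described above.

```c
#include <stdint.h>

/* Simplified stand-in for the two new fields of rte_eth_desc_lim. */
struct desc_lim_sketch {
	uint16_t nb_seg_max;     /* max segments per whole (TSO) packet */
	uint16_t nb_mtu_seg_max; /* max segments per one MTU (non-TSO) */
};

/*
 * Hypothetical helper: return 1 if a packet made of nb_segs segments
 * stays within the limit that applies to it, 0 otherwise.
 */
static int
segs_within_limit(const struct desc_lim_sketch *lim, uint16_t nb_segs,
		int is_tso)
{
	uint16_t max = is_tso ? lim->nb_seg_max : lim->nb_mtu_seg_max;

	return nb_segs <= max;
}
```

In a real application the limits would come from the device info reported
by the driver, and the segment count from the packet's mbuf chain.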


APPLICATION (CASE OF USE):
--------------------------

1) The application should initialize the burst of packets to send and
   set the required tx offload flags and fields, like l2_len, l3_len,
   l4_len, and tso_segsz.

2) The application passes the burst to rte_eth_tx_prepare to check the
   conditions required to send packets through the NIC.

3) The result of rte_eth_tx_prepare can be used to send the valid
   packets and/or to restore the invalid ones if the function fails.

e.g.

	for (i = 0; i < nb_pkts; i++) {

		/* initialize or process packet */

		bufs[i]->tso_segsz = 800;
		bufs[i]->ol_flags = PKT_TX_TCP_SEG | PKT_TX_IPV4
				| PKT_TX_IP_CKSUM;
		bufs[i]->l2_len = sizeof(struct ether_hdr);
		bufs[i]->l3_len = sizeof(struct ipv4_hdr);
		bufs[i]->l4_len = sizeof(struct tcp_hdr);
	}

	/* Prepare burst of TX packets */
	nb_prep = rte_eth_tx_prepare(port, 0, bufs, nb_pkts);

	if (nb_prep < nb_pkts) {
		printf("Tx prepare failed\n");

		/* nb_prep is the index of the first invalid packet here.
		 * rte_eth_tx_prepare can be called on the remaining packets
		 * to find further invalid ones.
		 */

	}

	/* Send burst of TX packets */
	nb_tx = rte_eth_tx_burst(port, 0, bufs, nb_prep);

	/* Free any unsent packets. */
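
One possible way to act on the comment above, that is, to keep calling the
prepare step on the remaining packets and skip each invalid one, is
sketched below. Everything here is a stand-in: struct pkt replaces
rte_mbuf, mock_tx_prepare() mimics only the documented return convention
of rte_eth_tx_prepare (number of leading packets that passed), and
filter_burst() is a hypothetical helper that is not part of the patch.

```c
#include <stdint.h>

/* Hypothetical stand-in for an mbuf: only records whether it is valid. */
struct pkt {
	int valid;
};

/* Mock of the prepare step: stops at the first invalid packet. */
static uint16_t
mock_tx_prepare(struct pkt **pkts, uint16_t nb_pkts)
{
	uint16_t i;

	for (i = 0; i < nb_pkts; i++)
		if (!pkts[i]->valid)
			break;
	return i;
}

/*
 * Compact the burst in place so that it holds only packets accepted by
 * the prepare step. Returns the number of packets left to send and
 * stores the number of rejected packets in *nb_dropped.
 */
static uint16_t
filter_burst(struct pkt **pkts, uint16_t nb_pkts, uint16_t *nb_dropped)
{
	uint16_t done = 0;        /* packets already accepted */
	uint16_t total = nb_pkts; /* packets still in the burst */
	uint16_t i;

	*nb_dropped = 0;
	while (done < total) {
		done += mock_tx_prepare(&pkts[done], total - done);
		if (done == total)
			break;
		/* pkts[done] was rejected; a real application would free
		 * or repair it here before it is overwritten. */
		for (i = done; i + 1 < total; i++)
			pkts[i] = pkts[i + 1];
		total--;
		(*nb_dropped)++;
	}
	return total;
}
```

Only the not-yet-accepted tail of the burst is passed to the prepare step
on each iteration, so packets that already passed are not processed twice.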

v14 changes:
 - added support for ena
 - introduced rte_net_intel_cksum_flags_prepare(m, ol_flags) function
   in rte_net.h to allow the application to choose which offloads are
   computed if not all are required
 - all drivers support tx preparation API for now, so removed
   csum txprep command from test-pmd as redundant, and use Tx 
   preparation by default 

v13 changes:
 - added support for vmxnet3
 - reworded help information for "csum txprep" command
 - renamed RTE_ETHDEV_TX_PREPARE to RTE_ETHDEV_TX_PREPARE_NOOP to
   better suit its purpose.

v12 changes:
 - renamed API function from "rte_eth_tx_prep" to "rte_eth_tx_prepare"
   (so as not to be confused with "prepend")
 - changed "rte_phdr_cksum_fix" to "rte_net_intel_cksum_prepare"
 - added "csum txprep (on|off)" command to the csum engine allowing to
   select txprep path for packet processing

v11 changes:
 - updated comments
 - added information to the API description about packet data
   requirements/limitations.

v10 changes:
 - moved driver's tx callback check in rte_eth_tx_prep after queue_id check

v9 changes:
 - fixed headers structure fragmentation check
 - moved fragmentation check into rte_validate_tx_offload()

v8 changes:
 - mbuf argument in rte_validate_tx_offload declared as const

v7 changes:
 - comments reworded/added
 - changed errno values returned from Tx prep API
 - added check in rte_phdr_cksum_fix if headers are in the first
   data segment and can be safely modified
 - moved rte_validate_tx_offload to rte_mbuf
 - moved rte_phdr_cksum_fix to rte_net.h
 - removed rte_pkt.h new file as useless

v6 changes:
 - added performance impact test results to the patch description

v5 changes:
 - rebased csum engine modification
 - added information to the csum engine about performance tests
 - some performance improvements

v4 changes:
 - tx_prep is now set to default behavior (NULL) for simple/vector path
   in fm10k, i40e and ixgbe drivers to increase performance, when
   Tx offloads are not intentionally available

v3 changes:
 - reworked csum testpmd engine instead of adding a new one,
 - fixed checksum initialization procedure to include also outer
   checksum offloads,
 - some minor formatting changes and optimizations

v2 changes:
 - rte_eth_tx_prep() returns number of packets when device doesn't
   support tx_prep functionality,
 - introduced CONFIG_RTE_ETHDEV_TX_PREP allowing to turn off tx_prep


Konstantin Ananyev (2):
  ena: add Tx preparation
  vmxnet3: add Tx preparation

Tomasz Kulasek (6):
  ethdev: add Tx preparation
  e1000: add Tx preparation
  fm10k: add Tx preparation
  i40e: add Tx preparation
  ixgbe: add Tx preparation
  testpmd: use Tx preparation in csum engine

 app/test-pmd/csumonly.c              |   37 +++++------
 app/test-pmd/testpmd.c               |    5 ++
 app/test-pmd/testpmd.h               |    2 +
 config/common_base                   |    9 +++
 drivers/net/e1000/e1000_ethdev.h     |   11 ++++
 drivers/net/e1000/em_ethdev.c        |    5 +-
 drivers/net/e1000/em_rxtx.c          |   48 +++++++++++++-
 drivers/net/e1000/igb_ethdev.c       |    4 ++
 drivers/net/e1000/igb_rxtx.c         |   53 +++++++++++++++-
 drivers/net/ena/ena_ethdev.c         |   51 +++++++++++++++
 drivers/net/fm10k/fm10k.h            |    6 ++
 drivers/net/fm10k/fm10k_ethdev.c     |    5 ++
 drivers/net/fm10k/fm10k_rxtx.c       |   50 ++++++++++++++-
 drivers/net/i40e/i40e_ethdev.c       |    3 +
 drivers/net/i40e/i40e_rxtx.c         |   74 +++++++++++++++++++++-
 drivers/net/i40e/i40e_rxtx.h         |    8 +++
 drivers/net/ixgbe/ixgbe_ethdev.c     |    3 +
 drivers/net/ixgbe/ixgbe_ethdev.h     |    5 +-
 drivers/net/ixgbe/ixgbe_rxtx.c       |   57 +++++++++++++++++
 drivers/net/ixgbe/ixgbe_rxtx.h       |    2 +
 drivers/net/vmxnet3/vmxnet3_ethdev.c |    6 ++
 drivers/net/vmxnet3/vmxnet3_ethdev.h |    2 +
 drivers/net/vmxnet3/vmxnet3_rxtx.c   |   56 +++++++++++++++++
 lib/librte_ether/rte_ethdev.h        |  115 ++++++++++++++++++++++++++++++++++
 lib/librte_mbuf/rte_mbuf.h           |   64 +++++++++++++++++++
 lib/librte_net/rte_net.h             |  110 ++++++++++++++++++++++++++++++++
 26 files changed, 764 insertions(+), 27 deletions(-)

-- 
1.7.9.5


* [PATCH v14 1/8] ethdev: add Tx preparation
From: Tomasz Kulasek @ 2016-12-22 13:05 UTC (permalink / raw)
  To: dev
In-Reply-To: <1482411919-7620-1-git-send-email-tomaszx.kulasek@intel.com>

Added API for `rte_eth_tx_prepare`

uint16_t rte_eth_tx_prepare(uint8_t port_id, uint16_t queue_id,
	struct rte_mbuf **tx_pkts, uint16_t nb_pkts)

Added fields to the `struct rte_eth_desc_lim`:

	uint16_t nb_seg_max;
		/**< Max number of segments per whole packet. */

	uint16_t nb_mtu_seg_max;
		/**< Max number of segments per one MTU */

Added functions:

int
rte_validate_tx_offload(const struct rte_mbuf *m)

  to validate general requirements for the tx offloads set in the packet's
  mbuf, such as flag completeness. In the current implementation this
  function is called optionally, when RTE_LIBRTE_ETHDEV_DEBUG is enabled.


int rte_net_intel_cksum_prepare(struct rte_mbuf *m)

  to prepare the pseudo-header checksum for TSO and non-TSO tcp/udp packets
  before hardware tx checksum offload.
   - for non-TSO tcp/udp packets the full pseudo-header checksum is
     calculated and set.
   - for TSO the IP payload length is not included.


int
rte_net_intel_cksum_flags_prepare(struct rte_mbuf *m, uint64_t ol_flags)

  this function uses same logic as rte_net_intel_cksum_prepare, but
  allows application to choose which offloads should be taken into
  account, if full preparation is not required.
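
To illustrate the difference between the TSO and non-TSO cases, here is a
standalone sketch of an IPv4 pseudo-header sum. It is not the code from
this patch: ipv4_phdr_sum() and fold16() are hypothetical names, addresses
and lengths are taken in host byte order for readability, and the result
is the plain (non-complemented) one's-complement sum in host order. The
only point it tries to show is that the L4 length is left out when TSO is
requested.

```c
#include <stdint.h>

/* Fold a 32-bit accumulator into a 16-bit one's-complement sum. */
static uint16_t
fold16(uint32_t sum)
{
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)sum;
}

/*
 * Sum the IPv4 pseudo-header: source address, destination address,
 * zero byte + protocol, and L4 length. With tso != 0 the length word
 * is omitted, matching the "IP payload length is not included" rule.
 */
static uint16_t
ipv4_phdr_sum(uint32_t src, uint32_t dst, uint8_t proto, uint16_t l4_len,
		int tso)
{
	uint32_t sum = 0;

	sum += (src >> 16) + (src & 0xffff);
	sum += (dst >> 16) + (dst & 0xffff);
	sum += proto; /* 16-bit word made of a zero byte and the protocol */
	if (!tso)
		sum += l4_len;
	return fold16(sum);
}
```

For example, with src 192.168.0.1, dst 192.168.0.2, protocol 6 (TCP) and
an L4 length of 20 bytes, the two variants differ exactly by the length
word.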


PERFORMANCE TESTS
-----------------

This feature was tested with modified csum engine from test-pmd.

The packet checksum preparation was moved from application to Tx
preparation step placed before burst.

We may expect some overhead costs caused by:
1) using additional callback before burst,
2) rescanning burst,
3) additional condition checking (packet validation),
4) worse optimization (e.g. packet data access, etc.)

We tested it using ixgbe Tx preparation implementation with some parts
disabled to have comparable information about the impact of different
parts of implementation.

IMPACT:

1) With an unimplemented Tx preparation callback, the performance impact
   is negligible,
2) With packet condition checks only, without checksum modifications
   (nb_segs, available offloads, etc.): 14626628/14252168 (~2.62% drop),
3) With full support in the ixgbe driver (point 2 + packet checksum
   initialization): 14060924/13588094 (~3.48% drop)
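
The percentages above can be reproduced from the two figures in each
item. The sketch below uses hypothetical helper names and assumes the
first figure of each pair is the result without the feature and the
second the result with it (the message does not state the units). Worked
through, the quoted ~2.62% and ~3.48% correspond to the difference
divided by the second (lower) figure; relative to the first figure the
drops come out at about 2.56% and 3.36%.

```c
#include <stdint.h>

/* Drop from `before` to `after`, as a percentage of `before`. */
static double
drop_pct_of_baseline(uint64_t before, uint64_t after)
{
	return 100.0 * (double)(before - after) / (double)before;
}

/* The same difference, expressed as a percentage of `after`. */
static double
drop_pct_of_result(uint64_t before, uint64_t after)
{
	return 100.0 * (double)(before - after) / (double)after;
}
```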

Signed-off-by: Tomasz Kulasek <tomaszx.kulasek@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
---
 config/common_base            |    9 ++++
 lib/librte_ether/rte_ethdev.h |  115 +++++++++++++++++++++++++++++++++++++++++
 lib/librte_mbuf/rte_mbuf.h    |   64 +++++++++++++++++++++++
 lib/librte_net/rte_net.h      |  110 +++++++++++++++++++++++++++++++++++++++
 4 files changed, 298 insertions(+)

diff --git a/config/common_base b/config/common_base
index edb6a54..92c413a 100644
--- a/config/common_base
+++ b/config/common_base
@@ -123,6 +123,15 @@ CONFIG_RTE_ETHDEV_QUEUE_STAT_CNTRS=16
 CONFIG_RTE_ETHDEV_RXTX_CALLBACKS=y
 
 #
+# Use real NOOP to turn off TX preparation stage
+#
+# Since the behaviour of ``rte_eth_tx_prepare`` may change after turning on
+# real NOOP, this option should never be enabled globally; it may only be
+# used in an appropriate target configuration file.
+#
+CONFIG_RTE_ETHDEV_TX_PREPARE_NOOP=n
+
+#
 # Support NIC bypass logic
 #
 CONFIG_RTE_NIC_BYPASS=n
diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
index 52119af..10be095 100644
--- a/lib/librte_ether/rte_ethdev.h
+++ b/lib/librte_ether/rte_ethdev.h
@@ -182,6 +182,7 @@
 #include <rte_pci.h>
 #include <rte_dev.h>
 #include <rte_devargs.h>
+#include <rte_errno.h>
 #include "rte_ether.h"
 #include "rte_eth_ctrl.h"
 #include "rte_dev_info.h"
@@ -702,6 +703,8 @@ struct rte_eth_desc_lim {
 	uint16_t nb_max;   /**< Max allowed number of descriptors. */
 	uint16_t nb_min;   /**< Min allowed number of descriptors. */
 	uint16_t nb_align; /**< Number of descriptors should be aligned to. */
+	uint16_t nb_seg_max;  /**< Max number of segments per whole packet. */
+	uint16_t nb_mtu_seg_max; /**< Max number of segments per one MTU */
 };
 
 /**
@@ -1191,6 +1194,11 @@ typedef uint16_t (*eth_tx_burst_t)(void *txq,
 				   uint16_t nb_pkts);
 /**< @internal Send output packets on a transmit queue of an Ethernet device. */
 
+typedef uint16_t (*eth_tx_prep_t)(void *txq,
+				   struct rte_mbuf **tx_pkts,
+				   uint16_t nb_pkts);
+/**< @internal Prepare output packets on a transmit queue of an Ethernet device. */
+
 typedef int (*flow_ctrl_get_t)(struct rte_eth_dev *dev,
 			       struct rte_eth_fc_conf *fc_conf);
 /**< @internal Get current flow control parameter on an Ethernet device */
@@ -1625,6 +1633,7 @@ struct rte_eth_rxtx_callback {
 struct rte_eth_dev {
 	eth_rx_burst_t rx_pkt_burst; /**< Pointer to PMD receive function. */
 	eth_tx_burst_t tx_pkt_burst; /**< Pointer to PMD transmit function. */
+	eth_tx_prep_t tx_pkt_prepare; /**< Pointer to PMD transmit prepare function. */
 	struct rte_eth_dev_data *data;  /**< Pointer to device data */
 	const struct eth_driver *driver;/**< Driver for this device */
 	const struct eth_dev_ops *dev_ops; /**< Functions exported by PMD */
@@ -2832,6 +2841,112 @@ int rte_eth_dev_set_vlan_ether_type(uint8_t port_id,
 	return (*dev->tx_pkt_burst)(dev->data->tx_queues[queue_id], tx_pkts, nb_pkts);
 }
 
+/**
+ * Process a burst of output packets on a transmit queue of an Ethernet device.
+ *
+ * The rte_eth_tx_prepare() function is invoked to prepare output packets to be
+ * transmitted on the output queue *queue_id* of the Ethernet device designated
+ * by its *port_id*.
+ * The *nb_pkts* parameter is the number of packets to be prepared which are
+ * supplied in the *tx_pkts* array of *rte_mbuf* structures, each of them
+ * allocated from a pool created with rte_pktmbuf_pool_create().
+ * For each packet to send, the rte_eth_tx_prepare() function performs
+ * the following operations:
+ *
+ * - Check if packet meets devices requirements for tx offloads.
+ *
+ * - Check limitations about number of segments.
+ *
+ * - Check additional requirements when debug is enabled.
+ *
+ * - Update and/or reset required checksums when tx offload is set for packet.
+ *
+ * Since this function can modify packet data, provided mbufs must be safely
+ * writable (e.g. modified data cannot be in shared segment).
+ *
+ * The rte_eth_tx_prepare() function returns the number of packets ready to be
+ * sent. A return value equal to *nb_pkts* means that all packets are valid and
+ * ready to be sent; otherwise, processing stops at the first invalid packet and
+ * the remaining packets are left untouched.
+ *
+ * When this functionality is not implemented in the driver, all packets are
+ * returned untouched.
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ *   The value must be a valid port id.
+ * @param queue_id
+ *   The index of the transmit queue through which output packets must be
+ *   sent.
+ *   The value must be in the range [0, nb_tx_queue - 1] previously supplied
+ *   to rte_eth_dev_configure().
+ * @param tx_pkts
+ *   The address of an array of *nb_pkts* pointers to *rte_mbuf* structures
+ *   which contain the output packets.
+ * @param nb_pkts
+ *   The maximum number of packets to process.
+ * @return
+ *   The number of packets correct and ready to be sent. The return value can be
+ *   less than the value of the *nb_pkts* parameter when some packet doesn't
+ *   meet the device's requirements, with rte_errno set appropriately:
+ *   - -EINVAL: offload flags are not correctly set
+ *   - -ENOTSUP: the offload feature is not supported by the hardware
+ *
+ */
+
+#ifndef RTE_ETHDEV_TX_PREPARE_NOOP
+
+static inline uint16_t
+rte_eth_tx_prepare(uint8_t port_id, uint16_t queue_id,
+		struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
+{
+	struct rte_eth_dev *dev;
+
+#ifdef RTE_LIBRTE_ETHDEV_DEBUG
+	if (!rte_eth_dev_is_valid_port(port_id)) {
+		RTE_PMD_DEBUG_TRACE("Invalid TX port_id=%d\n", port_id);
+		rte_errno = -EINVAL;
+		return 0;
+	}
+#endif
+
+	dev = &rte_eth_devices[port_id];
+
+#ifdef RTE_LIBRTE_ETHDEV_DEBUG
+	if (queue_id >= dev->data->nb_tx_queues) {
+		RTE_PMD_DEBUG_TRACE("Invalid TX queue_id=%d\n", queue_id);
+		rte_errno = -EINVAL;
+		return 0;
+	}
+#endif
+
+	if (!dev->tx_pkt_prepare)
+		return nb_pkts;
+
+	return (*dev->tx_pkt_prepare)(dev->data->tx_queues[queue_id],
+			tx_pkts, nb_pkts);
+}
+
+#else
+
+/*
+ * Native NOOP operation for compilation targets which don't require any
+ * preparation steps, and where a functional NOOP may introduce an unnecessary
+ * performance drop.
+ *
+ * Generally it is not a good idea to turn it on globally, and it should not
+ * be used if the behavior of tx_preparation can change.
+ */
+
+static inline uint16_t
+rte_eth_tx_prepare(__rte_unused uint8_t port_id, __rte_unused uint16_t queue_id,
+		__rte_unused struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
+{
+	return nb_pkts;
+}
+
+#endif
+
 typedef void (*buffer_tx_error_fn)(struct rte_mbuf **unsent, uint16_t count,
 		void *userdata);
 
diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
index ead7c6e..39ee5ed 100644
--- a/lib/librte_mbuf/rte_mbuf.h
+++ b/lib/librte_mbuf/rte_mbuf.h
@@ -283,6 +283,19 @@
  */
 #define PKT_TX_OUTER_IPV6    (1ULL << 60)
 
+/**
+ * Bit Mask of all supported packet Tx offload features flags, which can be set
+ * for packet.
+ */
+#define PKT_TX_OFFLOAD_MASK (    \
+		PKT_TX_IP_CKSUM |        \
+		PKT_TX_L4_MASK |         \
+		PKT_TX_OUTER_IP_CKSUM |  \
+		PKT_TX_TCP_SEG |         \
+		PKT_TX_QINQ_PKT |        \
+		PKT_TX_VLAN_PKT |        \
+		PKT_TX_TUNNEL_MASK)
+
 #define __RESERVED           (1ULL << 61) /**< reserved for future mbuf use */
 
 #define IND_ATTACHED_MBUF    (1ULL << 62) /**< Indirect attached mbuf */
@@ -1647,6 +1660,57 @@ static inline int rte_pktmbuf_chain(struct rte_mbuf *head, struct rte_mbuf *tail
 }
 
 /**
+ * Validate general requirements for tx offload in mbuf.
+ *
+ * This function checks correctness and completeness of Tx offload settings.
+ *
+ * @param m
+ *   The packet mbuf to be validated.
+ * @return
+ *   0 if packet is valid
+ */
+static inline int
+rte_validate_tx_offload(const struct rte_mbuf *m)
+{
+	uint64_t ol_flags = m->ol_flags;
+	uint64_t inner_l3_offset = m->l2_len;
+
+	/* Does packet set any of available offloads? */
+	if (!(ol_flags & PKT_TX_OFFLOAD_MASK))
+		return 0;
+
+	if (ol_flags & PKT_TX_OUTER_IP_CKSUM)
+		inner_l3_offset += m->outer_l2_len + m->outer_l3_len;
+
+	/* Headers are fragmented */
+	if (rte_pktmbuf_data_len(m) < inner_l3_offset + m->l3_len + m->l4_len)
+		return -ENOTSUP;
+
+	/* IP checksum can be counted only for IPv4 packet */
+	if ((ol_flags & PKT_TX_IP_CKSUM) && (ol_flags & PKT_TX_IPV6))
+		return -EINVAL;
+
+	/* IP type not set when required */
+	if (ol_flags & (PKT_TX_L4_MASK | PKT_TX_TCP_SEG))
+		if (!(ol_flags & (PKT_TX_IPV4 | PKT_TX_IPV6)))
+			return -EINVAL;
+
+	/* Check requirements for TSO packet */
+	if (ol_flags & PKT_TX_TCP_SEG)
+		if ((m->tso_segsz == 0) ||
+				((ol_flags & PKT_TX_IPV4) &&
+				!(ol_flags & PKT_TX_IP_CKSUM)))
+			return -EINVAL;
+
+	/* PKT_TX_OUTER_IP_CKSUM set for non outer IPv4 packet. */
+	if ((ol_flags & PKT_TX_OUTER_IP_CKSUM) &&
+			!(ol_flags & PKT_TX_OUTER_IPV4))
+		return -EINVAL;
+
+	return 0;
+}
+
+/**
  * Dump an mbuf structure to a file.
  *
  * Dump all fields for the given packet mbuf and all its associated
diff --git a/lib/librte_net/rte_net.h b/lib/librte_net/rte_net.h
index d4156ae..548eaed 100644
--- a/lib/librte_net/rte_net.h
+++ b/lib/librte_net/rte_net.h
@@ -38,6 +38,11 @@
 extern "C" {
 #endif
 
+#include <rte_ip.h>
+#include <rte_udp.h>
+#include <rte_tcp.h>
+#include <rte_sctp.h>
+
 /**
  * Structure containing header lengths associated to a packet, filled
  * by rte_net_get_ptype().
@@ -86,6 +91,111 @@ struct rte_net_hdr_lens {
 uint32_t rte_net_get_ptype(const struct rte_mbuf *m,
 	struct rte_net_hdr_lens *hdr_lens, uint32_t layers);
 
+/**
+ * Prepare pseudo header checksum
+ *
+ * This function prepares pseudo header checksum for TSO and non-TSO tcp/udp in
+ * provided mbufs packet data and based on the requested offload flags.
+ *
+ * - for non-TSO tcp/udp packets full pseudo-header checksum is counted and set
+ *   in packet data,
+ * - for TSO the IP payload length is not included in pseudo header.
+ *
+ * This function expects that used headers are in the first data segment of
+ * mbuf, are not fragmented and can be safely modified.
+ *
+ * @param m
+ *   The packet mbuf to be fixed.
+ * @param ol_flags
+ *   TX offloads flags to use with this packet.
+ * @return
+ *   0 if checksum is initialized properly
+ */
+static inline int
+rte_net_intel_cksum_flags_prepare(struct rte_mbuf *m, uint64_t ol_flags)
+{
+	struct ipv4_hdr *ipv4_hdr;
+	struct ipv6_hdr *ipv6_hdr;
+	struct tcp_hdr *tcp_hdr;
+	struct udp_hdr *udp_hdr;
+	uint64_t inner_l3_offset = m->l2_len;
+
+	if (ol_flags & PKT_TX_OUTER_IP_CKSUM)
+		inner_l3_offset += m->outer_l2_len + m->outer_l3_len;
+
+	if ((ol_flags & PKT_TX_UDP_CKSUM) == PKT_TX_UDP_CKSUM) {
+		if (ol_flags & PKT_TX_IPV4) {
+			ipv4_hdr = rte_pktmbuf_mtod_offset(m, struct ipv4_hdr *,
+					inner_l3_offset);
+
+			if (ol_flags & PKT_TX_IP_CKSUM)
+				ipv4_hdr->hdr_checksum = 0;
+
+			udp_hdr = (struct udp_hdr *)((char *)ipv4_hdr +
+					m->l3_len);
+			udp_hdr->dgram_cksum = rte_ipv4_phdr_cksum(ipv4_hdr,
+					ol_flags);
+		} else {
+			ipv6_hdr = rte_pktmbuf_mtod_offset(m, struct ipv6_hdr *,
+					inner_l3_offset);
+			/* non-TSO udp */
+			udp_hdr = rte_pktmbuf_mtod_offset(m, struct udp_hdr *,
+					inner_l3_offset + m->l3_len);
+			udp_hdr->dgram_cksum = rte_ipv6_phdr_cksum(ipv6_hdr,
+					ol_flags);
+		}
+	} else if ((ol_flags & PKT_TX_TCP_CKSUM) ||
+			(ol_flags & PKT_TX_TCP_SEG)) {
+		if (ol_flags & PKT_TX_IPV4) {
+			ipv4_hdr = rte_pktmbuf_mtod_offset(m, struct ipv4_hdr *,
+					inner_l3_offset);
+
+			if (ol_flags & PKT_TX_IP_CKSUM)
+				ipv4_hdr->hdr_checksum = 0;
+
+			/* non-TSO tcp or TSO */
+			tcp_hdr = (struct tcp_hdr *)((char *)ipv4_hdr +
+					m->l3_len);
+			tcp_hdr->cksum = rte_ipv4_phdr_cksum(ipv4_hdr,
+					ol_flags);
+		} else {
+			ipv6_hdr = rte_pktmbuf_mtod_offset(m, struct ipv6_hdr *,
+					inner_l3_offset);
+			/* non-TSO tcp or TSO */
+			tcp_hdr = rte_pktmbuf_mtod_offset(m, struct tcp_hdr *,
+					inner_l3_offset + m->l3_len);
+			tcp_hdr->cksum = rte_ipv6_phdr_cksum(ipv6_hdr,
+					ol_flags);
+		}
+	}
+
+	return 0;
+}
+
+/**
+ * Prepare pseudo header checksum
+ *
+ * This function prepares pseudo header checksum for TSO and non-TSO tcp/udp in
+ * provided mbufs packet data.
+ *
+ * - for non-TSO tcp/udp packets full pseudo-header checksum is counted and set
+ *   in packet data,
+ * - for TSO the IP payload length is not included in pseudo header.
+ *
+ * This function expects that used headers are in the first data segment of
+ * mbuf, are not fragmented and can be safely modified.
+ *
+ * @param m
+ *   The packet mbuf to be fixed.
+ * @return
+ *   0 if checksum is initialized properly
+ */
+static inline int
+rte_net_intel_cksum_prepare(struct rte_mbuf *m)
+{
+	return rte_net_intel_cksum_flags_prepare(m, m->ol_flags);
+}
+
 #ifdef __cplusplus
 }
 #endif
-- 
1.7.9.5


* [PATCH v14 2/8] e1000: add Tx preparation
From: Tomasz Kulasek @ 2016-12-22 13:05 UTC (permalink / raw)
  To: dev
In-Reply-To: <1482411919-7620-1-git-send-email-tomaszx.kulasek@intel.com>

Signed-off-by: Tomasz Kulasek <tomaszx.kulasek@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
---
 drivers/net/e1000/e1000_ethdev.h |   11 ++++++++
 drivers/net/e1000/em_ethdev.c    |    5 +++-
 drivers/net/e1000/em_rxtx.c      |   48 +++++++++++++++++++++++++++++++++-
 drivers/net/e1000/igb_ethdev.c   |    4 +++
 drivers/net/e1000/igb_rxtx.c     |   53 +++++++++++++++++++++++++++++++++++++-
 5 files changed, 118 insertions(+), 3 deletions(-)

diff --git a/drivers/net/e1000/e1000_ethdev.h b/drivers/net/e1000/e1000_ethdev.h
index 6c25c8d..bd0f277 100644
--- a/drivers/net/e1000/e1000_ethdev.h
+++ b/drivers/net/e1000/e1000_ethdev.h
@@ -138,6 +138,11 @@
 #define E1000_MISC_VEC_ID               RTE_INTR_VEC_ZERO_OFFSET
 #define E1000_RX_VEC_START              RTE_INTR_VEC_RXTX_OFFSET
 
+#define IGB_TX_MAX_SEG     UINT8_MAX
+#define IGB_TX_MAX_MTU_SEG UINT8_MAX
+#define EM_TX_MAX_SEG      UINT8_MAX
+#define EM_TX_MAX_MTU_SEG  UINT8_MAX
+
 /* structure for interrupt relative data */
 struct e1000_interrupt {
 	uint32_t flags;
@@ -315,6 +320,9 @@ int eth_igb_tx_queue_setup(struct rte_eth_dev *dev, uint16_t tx_queue_id,
 uint16_t eth_igb_xmit_pkts(void *txq, struct rte_mbuf **tx_pkts,
 		uint16_t nb_pkts);
 
+uint16_t eth_igb_prep_pkts(void *txq, struct rte_mbuf **tx_pkts,
+		uint16_t nb_pkts);
+
 uint16_t eth_igb_recv_pkts(void *rxq, struct rte_mbuf **rx_pkts,
 		uint16_t nb_pkts);
 
@@ -376,6 +384,9 @@ int eth_em_tx_queue_setup(struct rte_eth_dev *dev, uint16_t tx_queue_id,
 uint16_t eth_em_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts,
 		uint16_t nb_pkts);
 
+uint16_t eth_em_prep_pkts(void *txq, struct rte_mbuf **tx_pkts,
+		uint16_t nb_pkts);
+
 uint16_t eth_em_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts,
 		uint16_t nb_pkts);
 
diff --git a/drivers/net/e1000/em_ethdev.c b/drivers/net/e1000/em_ethdev.c
index 866a5cf..00d5996 100644
--- a/drivers/net/e1000/em_ethdev.c
+++ b/drivers/net/e1000/em_ethdev.c
@@ -1,7 +1,7 @@
 /*-
  *   BSD LICENSE
  *
- *   Copyright(c) 2010-2015 Intel Corporation. All rights reserved.
+ *   Copyright(c) 2010-2016 Intel Corporation. All rights reserved.
  *   All rights reserved.
  *
  *   Redistribution and use in source and binary forms, with or without
@@ -300,6 +300,7 @@ static int eth_em_set_mc_addr_list(struct rte_eth_dev *dev,
 	eth_dev->dev_ops = &eth_em_ops;
 	eth_dev->rx_pkt_burst = (eth_rx_burst_t)&eth_em_recv_pkts;
 	eth_dev->tx_pkt_burst = (eth_tx_burst_t)&eth_em_xmit_pkts;
+	eth_dev->tx_pkt_prepare = (eth_tx_prep_t)&eth_em_prep_pkts;
 
 	/* for secondary processes, we don't initialise any further as primary
 	 * has already done this work. Only check we don't need a different
@@ -1079,6 +1080,8 @@ static int eth_em_set_mc_addr_list(struct rte_eth_dev *dev,
 		.nb_max = E1000_MAX_RING_DESC,
 		.nb_min = E1000_MIN_RING_DESC,
 		.nb_align = EM_TXD_ALIGN,
+		.nb_seg_max = EM_TX_MAX_SEG,
+		.nb_mtu_seg_max = EM_TX_MAX_MTU_SEG,
 	};
 
 	dev_info->speed_capa = ETH_LINK_SPEED_10M_HD | ETH_LINK_SPEED_10M |
diff --git a/drivers/net/e1000/em_rxtx.c b/drivers/net/e1000/em_rxtx.c
index 41f51c0..7e271ad 100644
--- a/drivers/net/e1000/em_rxtx.c
+++ b/drivers/net/e1000/em_rxtx.c
@@ -1,7 +1,7 @@
 /*-
  *   BSD LICENSE
  *
- *   Copyright(c) 2010-2015 Intel Corporation. All rights reserved.
+ *   Copyright(c) 2010-2016 Intel Corporation. All rights reserved.
  *   All rights reserved.
  *
  *   Redistribution and use in source and binary forms, with or without
@@ -66,6 +66,7 @@
 #include <rte_udp.h>
 #include <rte_tcp.h>
 #include <rte_sctp.h>
+#include <rte_net.h>
 #include <rte_string_fns.h>
 
 #include "e1000_logs.h"
@@ -77,6 +78,14 @@
 
 #define E1000_RXDCTL_GRAN	0x01000000 /* RXDCTL Granularity */
 
+#define E1000_TX_OFFLOAD_MASK ( \
+		PKT_TX_IP_CKSUM |       \
+		PKT_TX_L4_MASK |        \
+		PKT_TX_VLAN_PKT)
+
+#define E1000_TX_OFFLOAD_NOTSUP_MASK \
+		(PKT_TX_OFFLOAD_MASK ^ E1000_TX_OFFLOAD_MASK)
+
 /**
  * Structure associated with each descriptor of the RX ring of a RX queue.
  */
@@ -618,6 +627,43 @@ struct em_tx_queue {
 
 /*********************************************************************
  *
+ *  TX prep functions
+ *
+ **********************************************************************/
+uint16_t
+eth_em_prep_pkts(__rte_unused void *tx_queue, struct rte_mbuf **tx_pkts,
+		uint16_t nb_pkts)
+{
+	int i, ret;
+	struct rte_mbuf *m;
+
+	for (i = 0; i < nb_pkts; i++) {
+		m = tx_pkts[i];
+
+		if (m->ol_flags & E1000_TX_OFFLOAD_NOTSUP_MASK) {
+			rte_errno = -ENOTSUP;
+			return i;
+		}
+
+#ifdef RTE_LIBRTE_ETHDEV_DEBUG
+		ret = rte_validate_tx_offload(m);
+		if (ret != 0) {
+			rte_errno = ret;
+			return i;
+		}
+#endif
+		ret = rte_net_intel_cksum_prepare(m);
+		if (ret != 0) {
+			rte_errno = ret;
+			return i;
+		}
+	}
+
+	return i;
+}
+
+/*********************************************************************
+ *
  *  RX functions
  *
  **********************************************************************/
diff --git a/drivers/net/e1000/igb_ethdev.c b/drivers/net/e1000/igb_ethdev.c
index 08f2a68..cfe1180 100644
--- a/drivers/net/e1000/igb_ethdev.c
+++ b/drivers/net/e1000/igb_ethdev.c
@@ -369,6 +369,8 @@ static void eth_igbvf_interrupt_handler(struct rte_intr_handle *handle,
 	.nb_max = E1000_MAX_RING_DESC,
 	.nb_min = E1000_MIN_RING_DESC,
 	.nb_align = IGB_RXD_ALIGN,
+	.nb_seg_max = IGB_TX_MAX_SEG,
+	.nb_mtu_seg_max = IGB_TX_MAX_MTU_SEG,
 };
 
 static const struct eth_dev_ops eth_igb_ops = {
@@ -760,6 +762,7 @@ struct rte_igb_xstats_name_off {
 	eth_dev->dev_ops = &eth_igb_ops;
 	eth_dev->rx_pkt_burst = &eth_igb_recv_pkts;
 	eth_dev->tx_pkt_burst = &eth_igb_xmit_pkts;
+	eth_dev->tx_pkt_prepare = &eth_igb_prep_pkts;
 
 	/* for secondary processes, we don't initialise any further as primary
 	 * has already done this work. Only check we don't need a different
@@ -963,6 +966,7 @@ struct rte_igb_xstats_name_off {
 	eth_dev->dev_ops = &igbvf_eth_dev_ops;
 	eth_dev->rx_pkt_burst = &eth_igb_recv_pkts;
 	eth_dev->tx_pkt_burst = &eth_igb_xmit_pkts;
+	eth_dev->tx_pkt_prepare = &eth_igb_prep_pkts;
 
 	/* for secondary processes, we don't initialise any further as primary
 	 * has already done this work. Only check we don't need a different
diff --git a/drivers/net/e1000/igb_rxtx.c b/drivers/net/e1000/igb_rxtx.c
index dbd37ac..5d0d3cd 100644
--- a/drivers/net/e1000/igb_rxtx.c
+++ b/drivers/net/e1000/igb_rxtx.c
@@ -1,7 +1,7 @@
 /*-
  *   BSD LICENSE
  *
- *   Copyright(c) 2010-2015 Intel Corporation. All rights reserved.
+ *   Copyright(c) 2010-2016 Intel Corporation. All rights reserved.
  *   All rights reserved.
  *
  *   Redistribution and use in source and binary forms, with or without
@@ -65,6 +65,7 @@
 #include <rte_udp.h>
 #include <rte_tcp.h>
 #include <rte_sctp.h>
+#include <rte_net.h>
 #include <rte_string_fns.h>
 
 #include "e1000_logs.h"
@@ -78,6 +79,9 @@
 		PKT_TX_L4_MASK |		 \
 		PKT_TX_TCP_SEG)
 
+#define IGB_TX_OFFLOAD_NOTSUP_MASK \
+		(PKT_TX_OFFLOAD_MASK ^ IGB_TX_OFFLOAD_MASK)
+
 /**
  * Structure associated with each descriptor of the RX ring of a RX queue.
  */
@@ -616,6 +620,52 @@ struct igb_tx_queue {
 
 /*********************************************************************
  *
+ *  TX prep functions
+ *
+ **********************************************************************/
+uint16_t
+eth_igb_prep_pkts(__rte_unused void *tx_queue, struct rte_mbuf **tx_pkts,
+		uint16_t nb_pkts)
+{
+	int i, ret;
+	struct rte_mbuf *m;
+
+	for (i = 0; i < nb_pkts; i++) {
+		m = tx_pkts[i];
+
+		/* Check some limitations for TSO in hardware */
+		if (m->ol_flags & PKT_TX_TCP_SEG)
+			if ((m->tso_segsz > IGB_TSO_MAX_MSS) ||
+					(m->l2_len + m->l3_len + m->l4_len >
+					IGB_TSO_MAX_HDRLEN)) {
+				rte_errno = -EINVAL;
+				return i;
+			}
+
+		if (m->ol_flags & IGB_TX_OFFLOAD_NOTSUP_MASK) {
+			rte_errno = -ENOTSUP;
+			return i;
+		}
+
+#ifdef RTE_LIBRTE_ETHDEV_DEBUG
+		ret = rte_validate_tx_offload(m);
+		if (ret != 0) {
+			rte_errno = ret;
+			return i;
+		}
+#endif
+		ret = rte_net_intel_cksum_prepare(m);
+		if (ret != 0) {
+			rte_errno = ret;
+			return i;
+		}
+	}
+
+	return i;
+}
+
+/*********************************************************************
+ *
  *  RX functions
  *
  **********************************************************************/
@@ -1364,6 +1414,7 @@ struct igb_tx_queue {
 
 	igb_reset_tx_queue(txq, dev);
 	dev->tx_pkt_burst = eth_igb_xmit_pkts;
+	dev->tx_pkt_prepare = &eth_igb_prep_pkts;
 	dev->data->tx_queues[queue_idx] = txq;
 
 	return 0;
-- 
1.7.9.5

* [PATCH v14 3/8] fm10k: add Tx preparation
From: Tomasz Kulasek @ 2016-12-22 13:05 UTC (permalink / raw)
  To: dev
In-Reply-To: <1482411919-7620-1-git-send-email-tomaszx.kulasek@intel.com>

Signed-off-by: Tomasz Kulasek <tomaszx.kulasek@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
---
 drivers/net/fm10k/fm10k.h        |    6 +++++
 drivers/net/fm10k/fm10k_ethdev.c |    5 ++++
 drivers/net/fm10k/fm10k_rxtx.c   |   50 +++++++++++++++++++++++++++++++++++++-
 3 files changed, 60 insertions(+), 1 deletion(-)

diff --git a/drivers/net/fm10k/fm10k.h b/drivers/net/fm10k/fm10k.h
index 05aa1a2..c6fed21 100644
--- a/drivers/net/fm10k/fm10k.h
+++ b/drivers/net/fm10k/fm10k.h
@@ -69,6 +69,9 @@
 #define FM10K_MAX_RX_DESC  (FM10K_MAX_RX_RING_SZ / sizeof(union fm10k_rx_desc))
 #define FM10K_MAX_TX_DESC  (FM10K_MAX_TX_RING_SZ / sizeof(struct fm10k_tx_desc))
 
+#define FM10K_TX_MAX_SEG     UINT8_MAX
+#define FM10K_TX_MAX_MTU_SEG UINT8_MAX
+
 /*
  * byte aligment for HW RX data buffer
  * Datasheet requires RX buffer addresses shall either be 512-byte aligned or
@@ -356,6 +359,9 @@ uint16_t fm10k_recv_scattered_pkts(void *rx_queue,
 uint16_t fm10k_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts,
 	uint16_t nb_pkts);
 
+uint16_t fm10k_prep_pkts(void *tx_queue, struct rte_mbuf **tx_pkts,
+	uint16_t nb_pkts);
+
 int fm10k_rxq_vec_setup(struct fm10k_rx_queue *rxq);
 int fm10k_rx_vec_condition_check(struct rte_eth_dev *);
 void fm10k_rx_queue_release_mbufs_vec(struct fm10k_rx_queue *rxq);
diff --git a/drivers/net/fm10k/fm10k_ethdev.c b/drivers/net/fm10k/fm10k_ethdev.c
index fe74f6d..6648468 100644
--- a/drivers/net/fm10k/fm10k_ethdev.c
+++ b/drivers/net/fm10k/fm10k_ethdev.c
@@ -1447,6 +1447,8 @@ static int fm10k_xstats_get_names(__rte_unused struct rte_eth_dev *dev,
 		.nb_max = FM10K_MAX_TX_DESC,
 		.nb_min = FM10K_MIN_TX_DESC,
 		.nb_align = FM10K_MULT_TX_DESC,
+		.nb_seg_max = FM10K_TX_MAX_SEG,
+		.nb_mtu_seg_max = FM10K_TX_MAX_MTU_SEG,
 	};
 
 	dev_info->speed_capa = ETH_LINK_SPEED_1G | ETH_LINK_SPEED_2_5G |
@@ -2755,8 +2757,10 @@ static void __attribute__((cold))
 			fm10k_txq_vec_setup(txq);
 		}
 		dev->tx_pkt_burst = fm10k_xmit_pkts_vec;
+		dev->tx_pkt_prepare = NULL;
 	} else {
 		dev->tx_pkt_burst = fm10k_xmit_pkts;
+		dev->tx_pkt_prepare = fm10k_prep_pkts;
 		PMD_INIT_LOG(DEBUG, "Use regular Tx func");
 	}
 }
@@ -2835,6 +2839,7 @@ static void __attribute__((cold))
 	dev->dev_ops = &fm10k_eth_dev_ops;
 	dev->rx_pkt_burst = &fm10k_recv_pkts;
 	dev->tx_pkt_burst = &fm10k_xmit_pkts;
+	dev->tx_pkt_prepare = &fm10k_prep_pkts;
 
 	/* only initialize in the primary process */
 	if (rte_eal_process_type() != RTE_PROC_PRIMARY)
diff --git a/drivers/net/fm10k/fm10k_rxtx.c b/drivers/net/fm10k/fm10k_rxtx.c
index 32cc7ff..144e5e6 100644
--- a/drivers/net/fm10k/fm10k_rxtx.c
+++ b/drivers/net/fm10k/fm10k_rxtx.c
@@ -1,7 +1,7 @@
 /*-
  *   BSD LICENSE
  *
- *   Copyright(c) 2013-2015 Intel Corporation. All rights reserved.
+ *   Copyright(c) 2013-2016 Intel Corporation. All rights reserved.
  *   All rights reserved.
  *
  *   Redistribution and use in source and binary forms, with or without
@@ -35,6 +35,7 @@
 
 #include <rte_ethdev.h>
 #include <rte_common.h>
+#include <rte_net.h>
 #include "fm10k.h"
 #include "base/fm10k_type.h"
 
@@ -65,6 +66,15 @@ static inline void dump_rxd(union fm10k_rx_desc *rxd)
 }
 #endif
 
+#define FM10K_TX_OFFLOAD_MASK (  \
+		PKT_TX_VLAN_PKT |        \
+		PKT_TX_IP_CKSUM |        \
+		PKT_TX_L4_MASK |         \
+		PKT_TX_TCP_SEG)
+
+#define FM10K_TX_OFFLOAD_NOTSUP_MASK \
+		(PKT_TX_OFFLOAD_MASK ^ FM10K_TX_OFFLOAD_MASK)
+
 /* @note: When this function is changed, make corresponding change to
  * fm10k_dev_supported_ptypes_get()
  */
@@ -597,3 +607,41 @@ static inline void tx_xmit_pkt(struct fm10k_tx_queue *q, struct rte_mbuf *mb)
 
 	return count;
 }
+
+uint16_t
+fm10k_prep_pkts(__rte_unused void *tx_queue, struct rte_mbuf **tx_pkts,
+		uint16_t nb_pkts)
+{
+	int i, ret;
+	struct rte_mbuf *m;
+
+	for (i = 0; i < nb_pkts; i++) {
+		m = tx_pkts[i];
+
+		if ((m->ol_flags & PKT_TX_TCP_SEG) &&
+				(m->tso_segsz < FM10K_TSO_MINMSS)) {
+			rte_errno = -EINVAL;
+			return i;
+		}
+
+		if (m->ol_flags & FM10K_TX_OFFLOAD_NOTSUP_MASK) {
+			rte_errno = -ENOTSUP;
+			return i;
+		}
+
+#ifdef RTE_LIBRTE_ETHDEV_DEBUG
+		ret = rte_validate_tx_offload(m);
+		if (ret != 0) {
+			rte_errno = ret;
+			return i;
+		}
+#endif
+		ret = rte_net_intel_cksum_prepare(m);
+		if (ret != 0) {
+			rte_errno = ret;
+			return i;
+		}
+	}
+
+	return i;
+}
-- 
1.7.9.5

* [PATCH v14 4/8] i40e: add Tx preparation
From: Tomasz Kulasek @ 2016-12-22 13:05 UTC (permalink / raw)
  To: dev
In-Reply-To: <1482411919-7620-1-git-send-email-tomaszx.kulasek@intel.com>

Signed-off-by: Tomasz Kulasek <tomaszx.kulasek@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
---
 drivers/net/i40e/i40e_ethdev.c |    3 ++
 drivers/net/i40e/i40e_rxtx.c   |   74 +++++++++++++++++++++++++++++++++++++++-
 drivers/net/i40e/i40e_rxtx.h   |    8 +++++
 3 files changed, 84 insertions(+), 1 deletion(-)

diff --git a/drivers/net/i40e/i40e_ethdev.c b/drivers/net/i40e/i40e_ethdev.c
index b0c0fbf..0e20178 100644
--- a/drivers/net/i40e/i40e_ethdev.c
+++ b/drivers/net/i40e/i40e_ethdev.c
@@ -944,6 +944,7 @@ static inline void i40e_GLQF_reg_init(struct i40e_hw *hw)
 	dev->dev_ops = &i40e_eth_dev_ops;
 	dev->rx_pkt_burst = i40e_recv_pkts;
 	dev->tx_pkt_burst = i40e_xmit_pkts;
+	dev->tx_pkt_prepare = i40e_prep_pkts;
 
 	/* for secondary processes, we don't initialise any further as primary
 	 * has already done this work. Only check we don't need a different
@@ -2646,6 +2647,8 @@ static int i40e_dev_xstats_get_names(__rte_unused struct rte_eth_dev *dev,
 		.nb_max = I40E_MAX_RING_DESC,
 		.nb_min = I40E_MIN_RING_DESC,
 		.nb_align = I40E_ALIGN_RING_DESC,
+		.nb_seg_max = I40E_TX_MAX_SEG,
+		.nb_mtu_seg_max = I40E_TX_MAX_MTU_SEG,
 	};
 
 	if (pf->flags & I40E_FLAG_VMDQ) {
diff --git a/drivers/net/i40e/i40e_rxtx.c b/drivers/net/i40e/i40e_rxtx.c
index 7ae7d9f..1c9a6c8 100644
--- a/drivers/net/i40e/i40e_rxtx.c
+++ b/drivers/net/i40e/i40e_rxtx.c
@@ -1,7 +1,7 @@
 /*-
  *   BSD LICENSE
  *
- *   Copyright(c) 2010-2015 Intel Corporation. All rights reserved.
+ *   Copyright(c) 2010-2016 Intel Corporation. All rights reserved.
  *   All rights reserved.
  *
  *   Redistribution and use in source and binary forms, with or without
@@ -50,6 +50,8 @@
 #include <rte_tcp.h>
 #include <rte_sctp.h>
 #include <rte_udp.h>
+#include <rte_ip.h>
+#include <rte_net.h>
 
 #include "i40e_logs.h"
 #include "base/i40e_prototype.h"
@@ -79,6 +81,17 @@
 		PKT_TX_TCP_SEG |		 \
 		PKT_TX_OUTER_IP_CKSUM)
 
+#define I40E_TX_OFFLOAD_MASK (  \
+		PKT_TX_IP_CKSUM |       \
+		PKT_TX_L4_MASK |        \
+		PKT_TX_OUTER_IP_CKSUM | \
+		PKT_TX_TCP_SEG |        \
+		PKT_TX_QINQ_PKT |       \
+		PKT_TX_VLAN_PKT)
+
+#define I40E_TX_OFFLOAD_NOTSUP_MASK \
+		(PKT_TX_OFFLOAD_MASK ^ I40E_TX_OFFLOAD_MASK)
+
 static uint16_t i40e_xmit_pkts_simple(void *tx_queue,
 				      struct rte_mbuf **tx_pkts,
 				      uint16_t nb_pkts);
@@ -1411,6 +1424,63 @@ static inline int __attribute__((always_inline))
 	return nb_tx;
 }
 
+/*********************************************************************
+ *
+ *  TX prep functions
+ *
+ **********************************************************************/
+uint16_t
+i40e_prep_pkts(__rte_unused void *tx_queue, struct rte_mbuf **tx_pkts,
+		uint16_t nb_pkts)
+{
+	int i, ret;
+	uint64_t ol_flags;
+	struct rte_mbuf *m;
+
+	for (i = 0; i < nb_pkts; i++) {
+		m = tx_pkts[i];
+		ol_flags = m->ol_flags;
+
+		/**
+		 * m->nb_segs is uint8_t, so nb_segs is always less than
+		 * I40E_TX_MAX_SEG.
+		 * We check only a condition for nb_segs > I40E_TX_MAX_MTU_SEG.
+		 */
+		if (!(ol_flags & PKT_TX_TCP_SEG)) {
+			if (m->nb_segs > I40E_TX_MAX_MTU_SEG) {
+				rte_errno = -EINVAL;
+				return i;
+			}
+		} else if ((m->tso_segsz < I40E_MIN_TSO_MSS) ||
+				(m->tso_segsz > I40E_MAX_TSO_MSS)) {
+			/* MSS outside the range (256B - 9674B) are considered
+			 * malicious
+			 */
+			rte_errno = -EINVAL;
+			return i;
+		}
+
+		if (ol_flags & I40E_TX_OFFLOAD_NOTSUP_MASK) {
+			rte_errno = -ENOTSUP;
+			return i;
+		}
+
+#ifdef RTE_LIBRTE_ETHDEV_DEBUG
+		ret = rte_validate_tx_offload(m);
+		if (ret != 0) {
+			rte_errno = ret;
+			return i;
+		}
+#endif
+		ret = rte_net_intel_cksum_prepare(m);
+		if (ret != 0) {
+			rte_errno = ret;
+			return i;
+		}
+	}
+	return i;
+}
+
 /*
  * Find the VSI the queue belongs to. 'queue_idx' is the queue index
  * application used, which assume having sequential ones. But from driver's
@@ -2763,9 +2833,11 @@ void __attribute__((cold))
 			PMD_INIT_LOG(DEBUG, "Simple tx finally be used.");
 			dev->tx_pkt_burst = i40e_xmit_pkts_simple;
 		}
+		dev->tx_pkt_prepare = NULL;
 	} else {
 		PMD_INIT_LOG(DEBUG, "Xmit tx finally be used.");
 		dev->tx_pkt_burst = i40e_xmit_pkts;
+		dev->tx_pkt_prepare = i40e_prep_pkts;
 	}
 }
 
diff --git a/drivers/net/i40e/i40e_rxtx.h b/drivers/net/i40e/i40e_rxtx.h
index ecdb13c..9df8a56 100644
--- a/drivers/net/i40e/i40e_rxtx.h
+++ b/drivers/net/i40e/i40e_rxtx.h
@@ -63,6 +63,12 @@
 #define	I40E_MIN_RING_DESC	64
 #define	I40E_MAX_RING_DESC	4096
 
+#define I40E_MIN_TSO_MSS          256
+#define I40E_MAX_TSO_MSS          9674
+
+#define I40E_TX_MAX_SEG     UINT8_MAX
+#define I40E_TX_MAX_MTU_SEG 8
+
 #undef container_of
 #define container_of(ptr, type, member) ({ \
 		typeof(((type *)0)->member)(*__mptr) = (ptr); \
@@ -223,6 +229,8 @@ uint16_t i40e_recv_scattered_pkts(void *rx_queue,
 uint16_t i40e_xmit_pkts(void *tx_queue,
 			struct rte_mbuf **tx_pkts,
 			uint16_t nb_pkts);
+uint16_t i40e_prep_pkts(void *tx_queue, struct rte_mbuf **tx_pkts,
+		uint16_t nb_pkts);
 int i40e_tx_queue_init(struct i40e_tx_queue *txq);
 int i40e_rx_queue_init(struct i40e_rx_queue *rxq);
 void i40e_free_tx_resources(struct i40e_tx_queue *txq);
-- 
1.7.9.5

* [PATCH v14 5/8] ixgbe: add Tx preparation
From: Tomasz Kulasek @ 2016-12-22 13:05 UTC (permalink / raw)
  To: dev
In-Reply-To: <1482411919-7620-1-git-send-email-tomaszx.kulasek@intel.com>

Signed-off-by: Tomasz Kulasek <tomaszx.kulasek@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
---
 drivers/net/ixgbe/ixgbe_ethdev.c |    3 ++
 drivers/net/ixgbe/ixgbe_ethdev.h |    5 +++-
 drivers/net/ixgbe/ixgbe_rxtx.c   |   57 ++++++++++++++++++++++++++++++++++++++
 drivers/net/ixgbe/ixgbe_rxtx.h   |    2 ++
 4 files changed, 66 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ixgbe/ixgbe_ethdev.c b/drivers/net/ixgbe/ixgbe_ethdev.c
index baffc71..d726a2b 100644
--- a/drivers/net/ixgbe/ixgbe_ethdev.c
+++ b/drivers/net/ixgbe/ixgbe_ethdev.c
@@ -517,6 +517,8 @@ static int ixgbe_dev_udp_tunnel_port_del(struct rte_eth_dev *dev,
 	.nb_max = IXGBE_MAX_RING_DESC,
 	.nb_min = IXGBE_MIN_RING_DESC,
 	.nb_align = IXGBE_TXD_ALIGN,
+	.nb_seg_max = IXGBE_TX_MAX_SEG,
+	.nb_mtu_seg_max = IXGBE_TX_MAX_SEG,
 };
 
 static const struct eth_dev_ops ixgbe_eth_dev_ops = {
@@ -1103,6 +1105,7 @@ struct rte_ixgbe_xstats_name_off {
 	eth_dev->dev_ops = &ixgbe_eth_dev_ops;
 	eth_dev->rx_pkt_burst = &ixgbe_recv_pkts;
 	eth_dev->tx_pkt_burst = &ixgbe_xmit_pkts;
+	eth_dev->tx_pkt_prepare = &ixgbe_prep_pkts;
 
 	/*
 	 * For secondary processes, we don't initialise any further as primary
diff --git a/drivers/net/ixgbe/ixgbe_ethdev.h b/drivers/net/ixgbe/ixgbe_ethdev.h
index 4ff6338..e229cf5 100644
--- a/drivers/net/ixgbe/ixgbe_ethdev.h
+++ b/drivers/net/ixgbe/ixgbe_ethdev.h
@@ -1,7 +1,7 @@
 /*-
  *   BSD LICENSE
  *
- *   Copyright(c) 2010-2015 Intel Corporation. All rights reserved.
+ *   Copyright(c) 2010-2016 Intel Corporation. All rights reserved.
  *   All rights reserved.
  *
  *   Redistribution and use in source and binary forms, with or without
@@ -396,6 +396,9 @@ uint16_t ixgbe_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts,
 uint16_t ixgbe_xmit_pkts_simple(void *tx_queue, struct rte_mbuf **tx_pkts,
 		uint16_t nb_pkts);
 
+uint16_t ixgbe_prep_pkts(void *tx_queue, struct rte_mbuf **tx_pkts,
+		uint16_t nb_pkts);
+
 int ixgbe_dev_rss_hash_update(struct rte_eth_dev *dev,
 			      struct rte_eth_rss_conf *rss_conf);
 
diff --git a/drivers/net/ixgbe/ixgbe_rxtx.c b/drivers/net/ixgbe/ixgbe_rxtx.c
index b2d9f45..0bbc583 100644
--- a/drivers/net/ixgbe/ixgbe_rxtx.c
+++ b/drivers/net/ixgbe/ixgbe_rxtx.c
@@ -70,6 +70,7 @@
 #include <rte_string_fns.h>
 #include <rte_errno.h>
 #include <rte_ip.h>
+#include <rte_net.h>
 
 #include "ixgbe_logs.h"
 #include "base/ixgbe_api.h"
@@ -87,6 +88,9 @@
 		PKT_TX_TCP_SEG |		 \
 		PKT_TX_OUTER_IP_CKSUM)
 
+#define IXGBE_TX_OFFLOAD_NOTSUP_MASK \
+		(PKT_TX_OFFLOAD_MASK ^ IXGBE_TX_OFFLOAD_MASK)
+
 #if 1
 #define RTE_PMD_USE_PREFETCH
 #endif
@@ -905,6 +909,57 @@ static inline int __attribute__((always_inline))
 
 /*********************************************************************
  *
+ *  TX prep functions
+ *
+ **********************************************************************/
+uint16_t
+ixgbe_prep_pkts(void *tx_queue, struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
+{
+	int i, ret;
+	uint64_t ol_flags;
+	struct rte_mbuf *m;
+	struct ixgbe_tx_queue *txq = (struct ixgbe_tx_queue *)tx_queue;
+
+	for (i = 0; i < nb_pkts; i++) {
+		m = tx_pkts[i];
+		ol_flags = m->ol_flags;
+
+		/**
+		 * Check if packet meets requirements for number of segments
+		 *
+		 * NOTE: for ixgbe it's always (40 - WTHRESH) for both TSO and
+		 *       non-TSO
+		 */
+
+		if (m->nb_segs > IXGBE_TX_MAX_SEG - txq->wthresh) {
+			rte_errno = -EINVAL;
+			return i;
+		}
+
+		if (ol_flags & IXGBE_TX_OFFLOAD_NOTSUP_MASK) {
+			rte_errno = -ENOTSUP;
+			return i;
+		}
+
+#ifdef RTE_LIBRTE_ETHDEV_DEBUG
+		ret = rte_validate_tx_offload(m);
+		if (ret != 0) {
+			rte_errno = ret;
+			return i;
+		}
+#endif
+		ret = rte_net_intel_cksum_prepare(m);
+		if (ret != 0) {
+			rte_errno = ret;
+			return i;
+		}
+	}
+
+	return i;
+}
+
+/*********************************************************************
+ *
  *  RX functions
  *
  **********************************************************************/
@@ -2282,6 +2337,7 @@ void __attribute__((cold))
 	if (((txq->txq_flags & IXGBE_SIMPLE_FLAGS) == IXGBE_SIMPLE_FLAGS)
 			&& (txq->tx_rs_thresh >= RTE_PMD_IXGBE_TX_MAX_BURST)) {
 		PMD_INIT_LOG(DEBUG, "Using simple tx code path");
+		dev->tx_pkt_prepare = NULL;
 #ifdef RTE_IXGBE_INC_VECTOR
 		if (txq->tx_rs_thresh <= RTE_IXGBE_TX_MAX_FREE_BUF_SZ &&
 				(rte_eal_process_type() != RTE_PROC_PRIMARY ||
@@ -2302,6 +2358,7 @@ void __attribute__((cold))
 				(unsigned long)txq->tx_rs_thresh,
 				(unsigned long)RTE_PMD_IXGBE_TX_MAX_BURST);
 		dev->tx_pkt_burst = ixgbe_xmit_pkts;
+		dev->tx_pkt_prepare = ixgbe_prep_pkts;
 	}
 }
 
diff --git a/drivers/net/ixgbe/ixgbe_rxtx.h b/drivers/net/ixgbe/ixgbe_rxtx.h
index 2608b36..7bbd9b8 100644
--- a/drivers/net/ixgbe/ixgbe_rxtx.h
+++ b/drivers/net/ixgbe/ixgbe_rxtx.h
@@ -80,6 +80,8 @@
 #define RTE_IXGBE_WAIT_100_US               100
 #define RTE_IXGBE_VMTXSW_REGISTER_COUNT     2
 
+#define IXGBE_TX_MAX_SEG                    40
+
 #define IXGBE_PACKET_TYPE_MASK_82599        0X7F
 #define IXGBE_PACKET_TYPE_MASK_X550         0X10FF
 #define IXGBE_PACKET_TYPE_MASK_TUNNEL       0XFF
-- 
1.7.9.5

* [PATCH v14 6/8] vmxnet3: add Tx preparation
From: Tomasz Kulasek @ 2016-12-22 13:05 UTC (permalink / raw)
  To: dev; +Cc: Ananyev, Konstantin
In-Reply-To: <1482411919-7620-1-git-send-email-tomaszx.kulasek@intel.com>

From: "Ananyev, Konstantin" <konstantin.ananyev@intel.com>

Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
---
 drivers/net/vmxnet3/vmxnet3_ethdev.c |    6 ++++
 drivers/net/vmxnet3/vmxnet3_ethdev.h |    2 ++
 drivers/net/vmxnet3/vmxnet3_rxtx.c   |   56 ++++++++++++++++++++++++++++++++++
 3 files changed, 64 insertions(+)

diff --git a/drivers/net/vmxnet3/vmxnet3_ethdev.c b/drivers/net/vmxnet3/vmxnet3_ethdev.c
index 93c9ac9..e31896f 100644
--- a/drivers/net/vmxnet3/vmxnet3_ethdev.c
+++ b/drivers/net/vmxnet3/vmxnet3_ethdev.c
@@ -69,6 +69,8 @@
 
 #define PROCESS_SYS_EVENTS 0
 
+#define	VMXNET3_TX_MAX_SEG	UINT8_MAX
+
 static int eth_vmxnet3_dev_init(struct rte_eth_dev *eth_dev);
 static int eth_vmxnet3_dev_uninit(struct rte_eth_dev *eth_dev);
 static int vmxnet3_dev_configure(struct rte_eth_dev *dev);
@@ -237,6 +239,7 @@ static void vmxnet3_mac_addr_set(struct rte_eth_dev *dev,
 	eth_dev->dev_ops = &vmxnet3_eth_dev_ops;
 	eth_dev->rx_pkt_burst = &vmxnet3_recv_pkts;
 	eth_dev->tx_pkt_burst = &vmxnet3_xmit_pkts;
+	eth_dev->tx_pkt_prepare = vmxnet3_prep_pkts;
 	pci_dev = eth_dev->pci_dev;
 
 	/*
@@ -326,6 +329,7 @@ static void vmxnet3_mac_addr_set(struct rte_eth_dev *dev,
 	eth_dev->dev_ops = NULL;
 	eth_dev->rx_pkt_burst = NULL;
 	eth_dev->tx_pkt_burst = NULL;
+	eth_dev->tx_pkt_prepare = NULL;
 
 	rte_free(eth_dev->data->mac_addrs);
 	eth_dev->data->mac_addrs = NULL;
@@ -728,6 +732,8 @@ static void vmxnet3_mac_addr_set(struct rte_eth_dev *dev,
 		.nb_max = VMXNET3_TX_RING_MAX_SIZE,
 		.nb_min = VMXNET3_DEF_TX_RING_SIZE,
 		.nb_align = 1,
+		.nb_seg_max = VMXNET3_TX_MAX_SEG,
+		.nb_mtu_seg_max = VMXNET3_MAX_TXD_PER_PKT,
 	};
 
 	dev_info->rx_offload_capa =
diff --git a/drivers/net/vmxnet3/vmxnet3_ethdev.h b/drivers/net/vmxnet3/vmxnet3_ethdev.h
index 7d3b11e..469db71 100644
--- a/drivers/net/vmxnet3/vmxnet3_ethdev.h
+++ b/drivers/net/vmxnet3/vmxnet3_ethdev.h
@@ -171,5 +171,7 @@ uint16_t vmxnet3_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts,
 			   uint16_t nb_pkts);
 uint16_t vmxnet3_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts,
 			   uint16_t nb_pkts);
+uint16_t vmxnet3_prep_pkts(void *tx_queue, struct rte_mbuf **tx_pkts,
+			uint16_t nb_pkts);
 
 #endif /* _VMXNET3_ETHDEV_H_ */
diff --git a/drivers/net/vmxnet3/vmxnet3_rxtx.c b/drivers/net/vmxnet3/vmxnet3_rxtx.c
index b109168..3651369 100644
--- a/drivers/net/vmxnet3/vmxnet3_rxtx.c
+++ b/drivers/net/vmxnet3/vmxnet3_rxtx.c
@@ -69,6 +69,7 @@
 #include <rte_sctp.h>
 #include <rte_string_fns.h>
 #include <rte_errno.h>
+#include <rte_net.h>
 
 #include "base/vmxnet3_defs.h"
 #include "vmxnet3_ring.h"
@@ -76,6 +77,14 @@
 #include "vmxnet3_logs.h"
 #include "vmxnet3_ethdev.h"
 
+#define	VMXNET3_TX_OFFLOAD_MASK	( \
+		PKT_TX_VLAN_PKT | \
+		PKT_TX_L4_MASK |  \
+		PKT_TX_TCP_SEG)
+
+#define	VMXNET3_TX_OFFLOAD_NOTSUP_MASK	\
+	(PKT_TX_OFFLOAD_MASK ^ VMXNET3_TX_OFFLOAD_MASK)
+
 static const uint32_t rxprod_reg[2] = {VMXNET3_REG_RXPROD, VMXNET3_REG_RXPROD2};
 
 static int vmxnet3_post_rx_bufs(vmxnet3_rx_queue_t*, uint8_t);
@@ -350,6 +359,53 @@
 }
 
 uint16_t
+vmxnet3_prep_pkts(__rte_unused void *tx_queue, struct rte_mbuf **tx_pkts,
+	uint16_t nb_pkts)
+{
+	int32_t ret;
+	uint32_t i;
+	uint64_t ol_flags;
+	struct rte_mbuf *m;
+
+	for (i = 0; i != nb_pkts; i++) {
+		m = tx_pkts[i];
+		ol_flags = m->ol_flags;
+
+		/* Non-TSO packet cannot occupy more than
+		 * VMXNET3_MAX_TXD_PER_PKT TX descriptors.
+		 */
+		if ((ol_flags & PKT_TX_TCP_SEG) == 0 &&
+				m->nb_segs > VMXNET3_MAX_TXD_PER_PKT) {
+			rte_errno = -EINVAL;
+			return i;
+		}
+
+		/* check that only supported TX offloads are requested. */
+		if ((ol_flags & VMXNET3_TX_OFFLOAD_NOTSUP_MASK) != 0 ||
+				(ol_flags & PKT_TX_L4_MASK) ==
+				PKT_TX_SCTP_CKSUM) {
+			rte_errno = -ENOTSUP;
+			return i;
+		}
+
+#ifdef RTE_LIBRTE_ETHDEV_DEBUG
+		ret = rte_validate_tx_offload(m);
+		if (ret != 0) {
+			rte_errno = ret;
+			return i;
+		}
+#endif
+		ret = rte_net_intel_cksum_prepare(m);
+		if (ret != 0) {
+			rte_errno = ret;
+			return i;
+		}
+	}
+
+	return i;
+}
+
+uint16_t
 vmxnet3_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts,
 		  uint16_t nb_pkts)
 {
-- 
1.7.9.5

* [PATCH v14 7/8] ena: add Tx preparation
From: Tomasz Kulasek @ 2016-12-22 13:05 UTC (permalink / raw)
  To: dev; +Cc: Konstantin Ananyev
In-Reply-To: <1482411919-7620-1-git-send-email-tomaszx.kulasek@intel.com>

From: Konstantin Ananyev <konstantin.ananyev@intel.com>

Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
---
 drivers/net/ena/ena_ethdev.c |   51 ++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 51 insertions(+)

diff --git a/drivers/net/ena/ena_ethdev.c b/drivers/net/ena/ena_ethdev.c
index 555fb31..51af723 100644
--- a/drivers/net/ena/ena_ethdev.c
+++ b/drivers/net/ena/ena_ethdev.c
@@ -39,6 +39,7 @@
 #include <rte_errno.h>
 #include <rte_version.h>
 #include <rte_eal_memconfig.h>
+#include <rte_net.h>
 
 #include "ena_ethdev.h"
 #include "ena_logs.h"
@@ -168,6 +169,14 @@ struct ena_stats {
 #define PCI_DEVICE_ID_ENA_VF	0xEC20
 #define PCI_DEVICE_ID_ENA_LLQ_VF	0xEC21
 
+#define	ENA_TX_OFFLOAD_MASK	(\
+	PKT_TX_L4_MASK |         \
+	PKT_TX_IP_CKSUM |        \
+	PKT_TX_TCP_SEG)
+
+#define	ENA_TX_OFFLOAD_NOTSUP_MASK	\
+	(PKT_TX_OFFLOAD_MASK ^ ENA_TX_OFFLOAD_MASK)
+
 static struct rte_pci_id pci_id_ena_map[] = {
 	{ RTE_PCI_DEVICE(PCI_VENDOR_ID_AMAZON, PCI_DEVICE_ID_ENA_VF) },
 	{ RTE_PCI_DEVICE(PCI_VENDOR_ID_AMAZON, PCI_DEVICE_ID_ENA_LLQ_VF) },
@@ -179,6 +188,8 @@ static int ena_device_init(struct ena_com_dev *ena_dev,
 static int ena_dev_configure(struct rte_eth_dev *dev);
 static uint16_t eth_ena_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts,
 				  uint16_t nb_pkts);
+static uint16_t eth_ena_prep_pkts(void *tx_queue, struct rte_mbuf **tx_pkts,
+		uint16_t nb_pkts);
 static int ena_tx_queue_setup(struct rte_eth_dev *dev, uint16_t queue_idx,
 			      uint16_t nb_desc, unsigned int socket_id,
 			      const struct rte_eth_txconf *tx_conf);
@@ -1272,6 +1283,7 @@ static int eth_ena_dev_init(struct rte_eth_dev *eth_dev)
 	eth_dev->dev_ops = &ena_dev_ops;
 	eth_dev->rx_pkt_burst = &eth_ena_recv_pkts;
 	eth_dev->tx_pkt_burst = &eth_ena_xmit_pkts;
+	eth_dev->tx_pkt_prepare = &eth_ena_prep_pkts;
 	adapter->rte_eth_dev_data = eth_dev->data;
 	adapter->rte_dev = eth_dev;
 
@@ -1570,6 +1582,45 @@ static uint16_t eth_ena_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts,
 	return recv_idx;
 }
 
+static uint16_t
+eth_ena_prep_pkts(__rte_unused void *tx_queue, struct rte_mbuf **tx_pkts,
+		uint16_t nb_pkts)
+{
+	int32_t ret;
+	uint32_t i;
+	struct rte_mbuf *m;
+	uint64_t ol_flags;
+
+	for (i = 0; i != nb_pkts; i++) {
+		m = tx_pkts[i];
+		ol_flags = m->ol_flags;
+
+		if ((ol_flags & ENA_TX_OFFLOAD_NOTSUP_MASK) != 0 ||
+				(ol_flags & PKT_TX_L4_MASK) ==
+				PKT_TX_SCTP_CKSUM) {
+			rte_errno = -ENOTSUP;
+			return i;
+		}
+
+#ifdef RTE_LIBRTE_ETHDEV_DEBUG
+		ret = rte_validate_tx_offload(m);
+		if (ret != 0) {
+			rte_errno = ret;
+			return i;
+		}
+#endif
+		/* ENA doesn't need different phdr cksum for TSO */
+		ret = rte_net_intel_cksum_flags_prepare(m,
+			ol_flags & ~PKT_TX_TCP_SEG);
+		if (ret != 0) {
+			rte_errno = ret;
+			return i;
+		}
+	}
+
+	return i;
+}
+
 static uint16_t eth_ena_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts,
 				  uint16_t nb_pkts)
 {
-- 
1.7.9.5

* [PATCH v14 8/8] testpmd: use Tx preparation in csum engine
From: Tomasz Kulasek @ 2016-12-22 13:05 UTC (permalink / raw)
  To: dev
In-Reply-To: <1482411919-7620-1-git-send-email-tomaszx.kulasek@intel.com>

Since all current drivers support the Tx preparation API, it is used
in the csum forwarding engine by default for all drivers.

Adding this extra step to the csum engine costs a performance drop of
about 3-4% on my setup with the ixgbe driver. The drop is caused mostly
by the need to re-access and modify the packet data.
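
For readers who want the calling pattern in isolation, here is a minimal
sketch of "prepare, then send only what passed". It does not use DPDK:
struct pkt, mock_tx_prepare(), mock_tx_burst() and send_prepared() are
hypothetical stand-ins invented for this illustration, so it only mirrors
the shape of the rte_eth_tx_prepare()/rte_eth_tx_burst() sequence used in
the csum engine, not the real API.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for struct rte_mbuf: only what this sketch needs. */
struct pkt {
	int valid; /* 0 means the prepare stage would reject this packet */
};

/* Mock prepare stage: returns the number of leading packets that passed. */
static uint16_t
mock_tx_prepare(struct pkt **pkts, uint16_t nb_pkts)
{
	uint16_t i;

	for (i = 0; i < nb_pkts; i++)
		if (!pkts[i]->valid)
			break;
	return i;
}

/* Mock burst stage: pretends every packet it is handed gets sent. */
static uint16_t
mock_tx_burst(struct pkt **pkts, uint16_t nb_pkts)
{
	(void)pkts;
	return nb_pkts;
}

/* Prepare first, then transmit only the packets that passed preparation. */
static uint16_t
send_prepared(struct pkt **pkts, uint16_t nb_rx)
{
	uint16_t nb_prep = mock_tx_prepare(pkts, nb_rx);

	if (nb_prep != nb_rx)
		printf("prepare stopped at packet %u of %u\n",
				(unsigned)nb_prep, (unsigned)nb_rx);

	return mock_tx_burst(pkts, nb_prep);
}
```

The point of the pattern is that the burst call receives nb_prep rather
than nb_rx, so a packet that fails preparation, and everything queued
after it, is never handed to the transmit function.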

Signed-off-by: Tomasz Kulasek <tomaszx.kulasek@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
---
 app/test-pmd/csumonly.c |   37 ++++++++++++++++---------------------
 app/test-pmd/testpmd.c  |    5 +++++
 app/test-pmd/testpmd.h  |    2 ++
 3 files changed, 23 insertions(+), 21 deletions(-)

diff --git a/app/test-pmd/csumonly.c b/app/test-pmd/csumonly.c
index 57e6ae2..806f957 100644
--- a/app/test-pmd/csumonly.c
+++ b/app/test-pmd/csumonly.c
@@ -112,15 +112,6 @@ struct simple_gre_hdr {
 } __attribute__((__packed__));
 
 static uint16_t
-get_psd_sum(void *l3_hdr, uint16_t ethertype, uint64_t ol_flags)
-{
-	if (ethertype == _htons(ETHER_TYPE_IPv4))
-		return rte_ipv4_phdr_cksum(l3_hdr, ol_flags);
-	else /* assume ethertype == ETHER_TYPE_IPv6 */
-		return rte_ipv6_phdr_cksum(l3_hdr, ol_flags);
-}
-
-static uint16_t
 get_udptcp_checksum(void *l3_hdr, void *l4_hdr, uint16_t ethertype)
 {
 	if (ethertype == _htons(ETHER_TYPE_IPv4))
@@ -370,11 +361,9 @@ struct simple_gre_hdr {
 		/* do not recalculate udp cksum if it was 0 */
 		if (udp_hdr->dgram_cksum != 0) {
 			udp_hdr->dgram_cksum = 0;
-			if (testpmd_ol_flags & TESTPMD_TX_OFFLOAD_UDP_CKSUM) {
+			if (testpmd_ol_flags & TESTPMD_TX_OFFLOAD_UDP_CKSUM)
 				ol_flags |= PKT_TX_UDP_CKSUM;
-				udp_hdr->dgram_cksum = get_psd_sum(l3_hdr,
-					info->ethertype, ol_flags);
-			} else {
+			else {
 				udp_hdr->dgram_cksum =
 					get_udptcp_checksum(l3_hdr, udp_hdr,
 						info->ethertype);
@@ -383,15 +372,11 @@ struct simple_gre_hdr {
 	} else if (info->l4_proto == IPPROTO_TCP) {
 		tcp_hdr = (struct tcp_hdr *)((char *)l3_hdr + info->l3_len);
 		tcp_hdr->cksum = 0;
-		if (tso_segsz) {
+		if (tso_segsz)
 			ol_flags |= PKT_TX_TCP_SEG;
-			tcp_hdr->cksum = get_psd_sum(l3_hdr, info->ethertype,
-				ol_flags);
-		} else if (testpmd_ol_flags & TESTPMD_TX_OFFLOAD_TCP_CKSUM) {
+		else if (testpmd_ol_flags & TESTPMD_TX_OFFLOAD_TCP_CKSUM)
 			ol_flags |= PKT_TX_TCP_CKSUM;
-			tcp_hdr->cksum = get_psd_sum(l3_hdr, info->ethertype,
-				ol_flags);
-		} else {
+		else {
 			tcp_hdr->cksum =
 				get_udptcp_checksum(l3_hdr, tcp_hdr,
 					info->ethertype);
@@ -648,6 +633,7 @@ struct simple_gre_hdr {
 	void *l3_hdr = NULL, *outer_l3_hdr = NULL; /* can be IPv4 or IPv6 */
 	uint16_t nb_rx;
 	uint16_t nb_tx;
+	uint16_t nb_prep;
 	uint16_t i;
 	uint64_t rx_ol_flags, tx_ol_flags;
 	uint16_t testpmd_ol_flags;
@@ -857,7 +843,16 @@ struct simple_gre_hdr {
 			printf("\n");
 		}
 	}
-	nb_tx = rte_eth_tx_burst(fs->tx_port, fs->tx_queue, pkts_burst, nb_rx);
+
+	nb_prep = rte_eth_tx_prepare(fs->tx_port, fs->tx_queue,
+			pkts_burst, nb_rx);
+	if (nb_prep != nb_rx)
+		printf("Preparing packet burst to transmit failed: %s\n",
+				rte_strerror(rte_errno));
+
+	nb_tx = rte_eth_tx_burst(fs->tx_port, fs->tx_queue, pkts_burst,
+			nb_prep);
+
 	/*
 	 * Retry if necessary
 	 */
diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
index a0332c2..634f10b 100644
--- a/app/test-pmd/testpmd.c
+++ b/app/test-pmd/testpmd.c
@@ -180,6 +180,11 @@ struct fwd_engine * fwd_engines[] = {
 enum tx_pkt_split tx_pkt_split = TX_PKT_SPLIT_OFF;
 /**< Split policy for packets to TX. */
 
+/*
+ * Enable Tx preparation path in the "csum" engine.
+ */
+uint8_t tx_prepare;
+
 uint16_t nb_pkt_per_burst = DEF_PKT_BURST; /**< Number of packets per burst. */
 uint16_t mb_mempool_cache = DEF_MBUF_CACHE; /**< Size of mbuf mempool cache. */
 
diff --git a/app/test-pmd/testpmd.h b/app/test-pmd/testpmd.h
index 9c1e703..488a6e1 100644
--- a/app/test-pmd/testpmd.h
+++ b/app/test-pmd/testpmd.h
@@ -383,6 +383,8 @@ enum tx_pkt_split {
 
 extern enum tx_pkt_split tx_pkt_split;
 
+extern uint8_t tx_prepare;
+
 extern uint16_t nb_pkt_per_burst;
 extern uint16_t mb_mempool_cache;
 extern int8_t rx_pthresh;
-- 
1.7.9.5

* Example(Load_balancer) Tx Flush Bug(This bug dpdk each version)
From: Maple @ 2016-12-22 13:07 UTC (permalink / raw)
  To: dev; +Cc: maintainers
In-Reply-To: <201612221005440765861@raisecom.com>

From: Maple <liujian@raisecom.com>
To: <dev@dpdk.org>
Cc: <thomas.monjalon@6wind.com>
Subject: [PATCH] Load_balancer Tx Flush Bug
Date: Thu, 22 Dec 2016 09:57:48 +0800
Message-Id: <1482371868-19669-1-git-send-email-liujian@raisecom.com>
X-Mailer: git-send-email 1.9.1
In-Reply-To: <2016122122394164225248@raisecom.com>
References: <2016122122394164225248@raisecom.com>

We found a bug in the load_balancer example, and it is present in every
DPDK version. The I/O TX flush only flushes port 0, so if more than one
port is enabled, the ports other than port 0 are never flushed.
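
As an illustration only, the sketch below shows one way a flush loop can
reach every configured port: iterate by index and look the port ID up in
the port list on each pass. It assumes that is the intended behaviour.
struct tx_params, MAX_PORTS, the flushed[] array and flush_all_ports() are
hypothetical simplifications made up for this example and are not the
example application's real definitions.

```c
#include <stdint.h>

#define MAX_PORTS 16

/* Hypothetical, trimmed-down stand-in for the TX part of the lcore params. */
struct tx_params {
	uint8_t nic_ports[MAX_PORTS]; /* port IDs this lcore transmits on */
	uint32_t n_nic_ports;         /* number of valid entries in nic_ports */
	uint8_t flushed[MAX_PORTS];   /* set to 1 once a port has been flushed */
};

/* Walk the list by index and look up the port ID for each entry. */
static void
flush_all_ports(struct tx_params *tx)
{
	uint32_t i;

	for (i = 0; i < tx->n_nic_ports; i++) {
		uint8_t port = tx->nic_ports[i];

		tx->flushed[port] = 1; /* stand-in for the real per-port flush */
	}
}
```

With nic_ports = {3, 5}, this marks ports 3 and 5 as flushed and leaves
port 0 untouched, because the loop counter is never used as a port ID.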

Signed-off-by: Maple <liujian@raisecom.com>
---
 a/examples/load_balancer/runtime.c | 667 ++++++++++++++++++++++++++++++++++++
 b/examples/load_balancer/runtime.c | 669 +++++++++++++++++++++++++++++++++++++
 2 files changed, 1336 insertions(+)
 create mode 100644 a/examples/load_balancer/runtime.c
 create mode 100644 b/examples/load_balancer/runtime.c

diff --git a/examples/load_balancer/runtime.c b/examples/load_balancer/runtime.c
index 9612392..3a2e900 100644
--- a/test/a/examples/load_balancer/runtime.c
+++ b/test/b/examples/load_balancer/runtime.c
@@ -418,9 +418,11 @@ app_lcore_io_tx(
 static inline void
 app_lcore_io_tx_flush(struct app_lcore_params_io *lp)
 {
+       uint8_t i;
        uint8_t port;

-       for (port = 0; port < lp->tx.n_nic_ports; port ++) {
+       port = lp->tx.nic_ports[0];
+       for (i = 0; i < lp->tx.n_nic_ports; i ++) {
                uint32_t n_pkts;

                if (likely((lp->tx.mbuf_out_flush[port] == 0) ||

* [PATCH v3] ethdev: cleanup device ops struct whitespace
From: Ferruh Yigit @ 2016-12-22 13:10 UTC (permalink / raw)
  To: dev; +Cc: Thomas Monjalon, Ferruh Yigit
In-Reply-To: <20161222115330.7164-1-ferruh.yigit@intel.com>

- Grouped related items using empty lines
- Aligned arguments to the same column
- All item comments that don't fit on the same line are placed below the
  item itself
- Moved some comments to the same line if the overall line is < 100 chars

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>

---

- ! This patch has the problem of trashing the git history for the struct,
  which is indeed a valid argument.
- Some re-ordering may also be required, which I hesitate to do
- Some item comments don't give extra information and can be removed

v3:
- group MAC, MTU, promisc and allmulti functions together
- group rxq/txq_info_get with dev_infos_get
- group l2_tunnel_* and udp_tunnel_* functions together

v2:
- extract mtu_set into new group
- move rss_hash_* to reta_* group
- move set_mc_addr_list to mac_addr_* group
- move set_vf_rate_limit to set_vf_* group
- move get_dcb_info out of timesync_* group

To make it easy to comment on the latest struct, it is copy-pasted here:

struct eth_dev_ops {
	eth_dev_configure_t        dev_configure; /**< Configure device. */
	eth_dev_start_t            dev_start;     /**< Start device. */
	eth_dev_stop_t             dev_stop;      /**< Stop device. */
	eth_dev_set_link_up_t      dev_set_link_up;   /**< Device link up. */
	eth_dev_set_link_down_t    dev_set_link_down; /**< Device link down. */
	eth_dev_close_t            dev_close;     /**< Close device. */
	eth_link_update_t          link_update;   /**< Get device link state. */

	eth_promiscuous_enable_t   promiscuous_enable; /**< Promiscuous ON. */
	eth_promiscuous_disable_t  promiscuous_disable;/**< Promiscuous OFF. */
	eth_allmulticast_enable_t  allmulticast_enable;/**< RX multicast ON. */
	eth_allmulticast_disable_t allmulticast_disable;/**< RX multicast OF. */
	eth_mac_addr_remove_t      mac_addr_remove; /**< Remove MAC address. */
	eth_mac_addr_add_t         mac_addr_add;  /**< Add a MAC address. */
	eth_mac_addr_set_t         mac_addr_set;  /**< Set a MAC address. */
	eth_set_mc_addr_list_t     set_mc_addr_list; /**< set list of mcast addrs. */
	mtu_set_t                  mtu_set;       /**< Set MTU. */

	eth_stats_get_t            stats_get;     /**< Get generic device statistics. */
	eth_stats_reset_t          stats_reset;   /**< Reset generic device statistics. */
	eth_xstats_get_t           xstats_get;    /**< Get extended device statistics. */
	eth_xstats_reset_t         xstats_reset;  /**< Reset extended device statistics. */
	eth_xstats_get_names_t     xstats_get_names;
	/**< Get names of extended statistics. */
	eth_queue_stats_mapping_set_t queue_stats_mapping_set;
	/**< Configure per queue stat counter mapping. */

	eth_dev_infos_get_t        dev_infos_get; /**< Get device info. */
	eth_rxq_info_get_t         rxq_info_get; /**< retrieve RX queue information. */
	eth_txq_info_get_t         txq_info_get; /**< retrieve TX queue information. */
	eth_dev_supported_ptypes_get_t dev_supported_ptypes_get;
	/**< Get packet types supported and identified by device. */

	vlan_filter_set_t          vlan_filter_set; /**< Filter VLAN Setup. */
	vlan_tpid_set_t            vlan_tpid_set; /**< Outer/Inner VLAN TPID Setup. */
	vlan_strip_queue_set_t     vlan_strip_queue_set; /**< VLAN Stripping on queue. */
	vlan_offload_set_t         vlan_offload_set; /**< Set VLAN Offload. */
	vlan_pvid_set_t            vlan_pvid_set; /**< Set port based TX VLAN insertion. */

	eth_queue_start_t          rx_queue_start;/**< Start RX for a queue. */
	eth_queue_stop_t           rx_queue_stop; /**< Stop RX for a queue. */
	eth_queue_start_t          tx_queue_start;/**< Start TX for a queue. */
	eth_queue_stop_t           tx_queue_stop; /**< Stop TX for a queue. */
	eth_rx_queue_setup_t       rx_queue_setup;/**< Set up device RX queue. */
	eth_queue_release_t        rx_queue_release; /**< Release RX queue. */
	eth_rx_queue_count_t       rx_queue_count;/**< Get Rx queue count. */
	eth_rx_descriptor_done_t   rx_descriptor_done; /**< Check rxd DD bit. */
	eth_rx_enable_intr_t       rx_queue_intr_enable;  /**< Enable Rx queue interrupt. */
	eth_rx_disable_intr_t      rx_queue_intr_disable; /**< Disable Rx queue interrupt. */
	eth_tx_queue_setup_t       tx_queue_setup;/**< Set up device TX queue. */
	eth_queue_release_t        tx_queue_release; /**< Release TX queue. */

	eth_dev_led_on_t           dev_led_on;    /**< Turn on LED. */
	eth_dev_led_off_t          dev_led_off;   /**< Turn off LED. */

	flow_ctrl_get_t            flow_ctrl_get; /**< Get flow control. */
	flow_ctrl_set_t            flow_ctrl_set; /**< Setup flow control. */
	priority_flow_ctrl_set_t   priority_flow_ctrl_set; /**< Setup priority flow control. */

	eth_uc_hash_table_set_t    uc_hash_table_set; /**< Set Unicast Table Array. */
	eth_uc_all_hash_table_set_t uc_all_hash_table_set; /**< Set Unicast hash bitmap. */

	eth_mirror_rule_set_t	   mirror_rule_set; /**< Add a traffic mirror rule. */
	eth_mirror_rule_reset_t	   mirror_rule_reset; /**< reset a traffic mirror rule. */

	eth_set_vf_rx_mode_t       set_vf_rx_mode;/**< Set VF RX mode. */
	eth_set_vf_rx_t            set_vf_rx;     /**< enable/disable a VF receive. */
	eth_set_vf_tx_t            set_vf_tx;     /**< enable/disable a VF transmit. */
	eth_set_vf_vlan_filter_t   set_vf_vlan_filter; /**< Set VF VLAN filter. */
	eth_set_vf_rate_limit_t    set_vf_rate_limit; /**< Set VF rate limit. */

	eth_udp_tunnel_port_add_t  udp_tunnel_port_add; /** Add UDP tunnel port. */
	eth_udp_tunnel_port_del_t  udp_tunnel_port_del; /** Del UDP tunnel port. */
	eth_l2_tunnel_eth_type_conf_t l2_tunnel_eth_type_conf;
	/** Config ether type of l2 tunnel. */
	eth_l2_tunnel_offload_set_t   l2_tunnel_offload_set;
	/** Enable/disable l2 tunnel offload functions. */

	eth_set_queue_rate_limit_t set_queue_rate_limit; /**< Set queue rate limit. */

	rss_hash_update_t          rss_hash_update; /** Configure RSS hash protocols. */
	rss_hash_conf_get_t        rss_hash_conf_get; /** Get current RSS hash configuration. */
	reta_update_t              reta_update;   /** Update redirection table. */
	reta_query_t               reta_query;    /** Query redirection table. */

	eth_get_reg_t              get_reg;           /**< Get registers. */
	eth_get_eeprom_length_t    get_eeprom_length; /**< Get eeprom length. */
	eth_get_eeprom_t           get_eeprom;        /**< Get eeprom data. */
	eth_set_eeprom_t           set_eeprom;        /**< Set eeprom. */

	/* bypass control */
#ifdef RTE_NIC_BYPASS
	bypass_init_t              bypass_init;
	bypass_state_set_t         bypass_state_set;
	bypass_state_show_t        bypass_state_show;
	bypass_event_set_t         bypass_event_set;
	bypass_event_show_t        bypass_event_show;
	bypass_wd_timeout_set_t    bypass_wd_timeout_set;
	bypass_wd_timeout_show_t   bypass_wd_timeout_show;
	bypass_ver_show_t          bypass_ver_show;
	bypass_wd_reset_t          bypass_wd_reset;
#endif

	eth_filter_ctrl_t          filter_ctrl; /**< common filter control. */

	eth_get_dcb_info           get_dcb_info; /** Get DCB information. */

	eth_timesync_enable_t      timesync_enable;
	/** Turn IEEE1588/802.1AS timestamping on. */
	eth_timesync_disable_t     timesync_disable;
	/** Turn IEEE1588/802.1AS timestamping off. */
	eth_timesync_read_rx_timestamp_t timesync_read_rx_timestamp;
	/** Read the IEEE1588/802.1AS RX timestamp. */
	eth_timesync_read_tx_timestamp_t timesync_read_tx_timestamp;
	/** Read the IEEE1588/802.1AS TX timestamp. */
	eth_timesync_adjust_time   timesync_adjust_time; /** Adjust the device clock. */
	eth_timesync_read_time     timesync_read_time; /** Get the device clock time. */
	eth_timesync_write_time    timesync_write_time; /** Set the device clock time. */
};
---
 lib/librte_ether/rte_ethdev.h | 174 +++++++++++++++++++++---------------------
 1 file changed, 85 insertions(+), 89 deletions(-)

diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
index 52119af..272fd41 100644
--- a/lib/librte_ether/rte_ethdev.h
+++ b/lib/librte_ether/rte_ethdev.h
@@ -1431,11 +1431,18 @@ struct eth_dev_ops {
 	eth_dev_set_link_up_t      dev_set_link_up;   /**< Device link up. */
 	eth_dev_set_link_down_t    dev_set_link_down; /**< Device link down. */
 	eth_dev_close_t            dev_close;     /**< Close device. */
+	eth_link_update_t          link_update;   /**< Get device link state. */
+
 	eth_promiscuous_enable_t   promiscuous_enable; /**< Promiscuous ON. */
 	eth_promiscuous_disable_t  promiscuous_disable;/**< Promiscuous OFF. */
 	eth_allmulticast_enable_t  allmulticast_enable;/**< RX multicast ON. */
 	eth_allmulticast_disable_t allmulticast_disable;/**< RX multicast OF. */
-	eth_link_update_t          link_update;   /**< Get device link state. */
+	eth_mac_addr_remove_t      mac_addr_remove; /**< Remove MAC address. */
+	eth_mac_addr_add_t         mac_addr_add;  /**< Add a MAC address. */
+	eth_mac_addr_set_t         mac_addr_set;  /**< Set a MAC address. */
+	eth_set_mc_addr_list_t     set_mc_addr_list; /**< set list of mcast addrs. */
+	mtu_set_t                  mtu_set;       /**< Set MTU. */
+
 	eth_stats_get_t            stats_get;     /**< Get generic device statistics. */
 	eth_stats_reset_t          stats_reset;   /**< Reset generic device statistics. */
 	eth_xstats_get_t           xstats_get;    /**< Get extended device statistics. */
@@ -1444,109 +1451,98 @@ struct eth_dev_ops {
 	/**< Get names of extended statistics. */
 	eth_queue_stats_mapping_set_t queue_stats_mapping_set;
 	/**< Configure per queue stat counter mapping. */
+
 	eth_dev_infos_get_t        dev_infos_get; /**< Get device info. */
+	eth_rxq_info_get_t         rxq_info_get; /**< retrieve RX queue information. */
+	eth_txq_info_get_t         txq_info_get; /**< retrieve TX queue information. */
 	eth_dev_supported_ptypes_get_t dev_supported_ptypes_get;
-	/**< Get packet types supported and identified by device*/
-	mtu_set_t                  mtu_set; /**< Set MTU. */
-	vlan_filter_set_t          vlan_filter_set;  /**< Filter VLAN Setup. */
-	vlan_tpid_set_t            vlan_tpid_set;      /**< Outer/Inner VLAN TPID Setup. */
+	/**< Get packet types supported and identified by device. */
+
+	vlan_filter_set_t          vlan_filter_set; /**< Filter VLAN Setup. */
+	vlan_tpid_set_t            vlan_tpid_set; /**< Outer/Inner VLAN TPID Setup. */
 	vlan_strip_queue_set_t     vlan_strip_queue_set; /**< VLAN Stripping on queue. */
 	vlan_offload_set_t         vlan_offload_set; /**< Set VLAN Offload. */
-	vlan_pvid_set_t            vlan_pvid_set; /**< Set port based TX VLAN insertion */
-	eth_queue_start_t          rx_queue_start;/**< Start RX for a queue.*/
-	eth_queue_stop_t           rx_queue_stop;/**< Stop RX for a queue.*/
-	eth_queue_start_t          tx_queue_start;/**< Start TX for a queue.*/
-	eth_queue_stop_t           tx_queue_stop;/**< Stop TX for a queue.*/
-	eth_rx_queue_setup_t       rx_queue_setup;/**< Set up device RX queue.*/
-	eth_queue_release_t        rx_queue_release;/**< Release RX queue.*/
-	eth_rx_queue_count_t       rx_queue_count; /**< Get Rx queue count. */
-	eth_rx_descriptor_done_t   rx_descriptor_done;  /**< Check rxd DD bit */
-	/**< Enable Rx queue interrupt. */
-	eth_rx_enable_intr_t       rx_queue_intr_enable;
-	/**< Disable Rx queue interrupt.*/
-	eth_rx_disable_intr_t      rx_queue_intr_disable;
-	eth_tx_queue_setup_t       tx_queue_setup;/**< Set up device TX queue.*/
-	eth_queue_release_t        tx_queue_release;/**< Release TX queue.*/
+	vlan_pvid_set_t            vlan_pvid_set; /**< Set port based TX VLAN insertion. */
+
+	eth_queue_start_t          rx_queue_start;/**< Start RX for a queue. */
+	eth_queue_stop_t           rx_queue_stop; /**< Stop RX for a queue. */
+	eth_queue_start_t          tx_queue_start;/**< Start TX for a queue. */
+	eth_queue_stop_t           tx_queue_stop; /**< Stop TX for a queue. */
+	eth_rx_queue_setup_t       rx_queue_setup;/**< Set up device RX queue. */
+	eth_queue_release_t        rx_queue_release; /**< Release RX queue. */
+	eth_rx_queue_count_t       rx_queue_count;/**< Get Rx queue count. */
+	eth_rx_descriptor_done_t   rx_descriptor_done; /**< Check rxd DD bit. */
+	eth_rx_enable_intr_t       rx_queue_intr_enable;  /**< Enable Rx queue interrupt. */
+	eth_rx_disable_intr_t      rx_queue_intr_disable; /**< Disable Rx queue interrupt. */
+	eth_tx_queue_setup_t       tx_queue_setup;/**< Set up device TX queue. */
+	eth_queue_release_t        tx_queue_release; /**< Release TX queue. */
+
 	eth_dev_led_on_t           dev_led_on;    /**< Turn on LED. */
 	eth_dev_led_off_t          dev_led_off;   /**< Turn off LED. */
+
 	flow_ctrl_get_t            flow_ctrl_get; /**< Get flow control. */
 	flow_ctrl_set_t            flow_ctrl_set; /**< Setup flow control. */
-	priority_flow_ctrl_set_t   priority_flow_ctrl_set; /**< Setup priority flow control.*/
-	eth_mac_addr_remove_t      mac_addr_remove; /**< Remove MAC address */
-	eth_mac_addr_add_t         mac_addr_add;  /**< Add a MAC address */
-	eth_mac_addr_set_t         mac_addr_set;  /**< Set a MAC address */
-	eth_uc_hash_table_set_t    uc_hash_table_set;  /**< Set Unicast Table Array */
-	eth_uc_all_hash_table_set_t uc_all_hash_table_set;  /**< Set Unicast hash bitmap */
-	eth_mirror_rule_set_t	   mirror_rule_set;  /**< Add a traffic mirror rule.*/
-	eth_mirror_rule_reset_t	   mirror_rule_reset;  /**< reset a traffic mirror rule.*/
-	eth_set_vf_rx_mode_t       set_vf_rx_mode;   /**< Set VF RX mode */
-	eth_set_vf_rx_t            set_vf_rx;  /**< enable/disable a VF receive */
-	eth_set_vf_tx_t            set_vf_tx;  /**< enable/disable a VF transmit */
-	eth_set_vf_vlan_filter_t   set_vf_vlan_filter;  /**< Set VF VLAN filter */
-	/** Add UDP tunnel port. */
-	eth_udp_tunnel_port_add_t udp_tunnel_port_add;
-	/** Del UDP tunnel port. */
-	eth_udp_tunnel_port_del_t udp_tunnel_port_del;
-	eth_set_queue_rate_limit_t set_queue_rate_limit;   /**< Set queue rate limit */
-	eth_set_vf_rate_limit_t    set_vf_rate_limit;   /**< Set VF rate limit */
-	/** Update redirection table. */
-	reta_update_t reta_update;
-	/** Query redirection table. */
-	reta_query_t reta_query;
-
-	eth_get_reg_t get_reg;
-	/**< Get registers */
-	eth_get_eeprom_length_t get_eeprom_length;
-	/**< Get eeprom length */
-	eth_get_eeprom_t get_eeprom;
-	/**< Get eeprom data */
-	eth_set_eeprom_t set_eeprom;
-	/**< Set eeprom */
-  /* bypass control */
+	priority_flow_ctrl_set_t   priority_flow_ctrl_set; /**< Setup priority flow control. */
+
+	eth_uc_hash_table_set_t    uc_hash_table_set; /**< Set Unicast Table Array. */
+	eth_uc_all_hash_table_set_t uc_all_hash_table_set; /**< Set Unicast hash bitmap. */
+
+	eth_mirror_rule_set_t	   mirror_rule_set; /**< Add a traffic mirror rule. */
+	eth_mirror_rule_reset_t	   mirror_rule_reset; /**< reset a traffic mirror rule. */
+
+	eth_set_vf_rx_mode_t       set_vf_rx_mode;/**< Set VF RX mode. */
+	eth_set_vf_rx_t            set_vf_rx;     /**< enable/disable a VF receive. */
+	eth_set_vf_tx_t            set_vf_tx;     /**< enable/disable a VF transmit. */
+	eth_set_vf_vlan_filter_t   set_vf_vlan_filter; /**< Set VF VLAN filter. */
+	eth_set_vf_rate_limit_t    set_vf_rate_limit; /**< Set VF rate limit. */
+
+	eth_udp_tunnel_port_add_t  udp_tunnel_port_add; /** Add UDP tunnel port. */
+	eth_udp_tunnel_port_del_t  udp_tunnel_port_del; /** Del UDP tunnel port. */
+	eth_l2_tunnel_eth_type_conf_t l2_tunnel_eth_type_conf;
+	/** Config ether type of l2 tunnel. */
+	eth_l2_tunnel_offload_set_t   l2_tunnel_offload_set;
+	/** Enable/disable l2 tunnel offload functions. */
+
+	eth_set_queue_rate_limit_t set_queue_rate_limit; /**< Set queue rate limit. */
+
+	rss_hash_update_t          rss_hash_update; /** Configure RSS hash protocols. */
+	rss_hash_conf_get_t        rss_hash_conf_get; /** Get current RSS hash configuration. */
+	reta_update_t              reta_update;   /** Update redirection table. */
+	reta_query_t               reta_query;    /** Query redirection table. */
+
+	eth_get_reg_t              get_reg;           /**< Get registers. */
+	eth_get_eeprom_length_t    get_eeprom_length; /**< Get eeprom length. */
+	eth_get_eeprom_t           get_eeprom;        /**< Get eeprom data. */
+	eth_set_eeprom_t           set_eeprom;        /**< Set eeprom. */
+
+	/* bypass control */
 #ifdef RTE_NIC_BYPASS
-  bypass_init_t bypass_init;
-  bypass_state_set_t bypass_state_set;
-  bypass_state_show_t bypass_state_show;
-  bypass_event_set_t bypass_event_set;
-  bypass_event_show_t bypass_event_show;
-  bypass_wd_timeout_set_t bypass_wd_timeout_set;
-  bypass_wd_timeout_show_t bypass_wd_timeout_show;
-  bypass_ver_show_t bypass_ver_show;
-  bypass_wd_reset_t bypass_wd_reset;
+	bypass_init_t              bypass_init;
+	bypass_state_set_t         bypass_state_set;
+	bypass_state_show_t        bypass_state_show;
+	bypass_event_set_t         bypass_event_set;
+	bypass_event_show_t        bypass_event_show;
+	bypass_wd_timeout_set_t    bypass_wd_timeout_set;
+	bypass_wd_timeout_show_t   bypass_wd_timeout_show;
+	bypass_ver_show_t          bypass_ver_show;
+	bypass_wd_reset_t          bypass_wd_reset;
 #endif
 
-	/** Configure RSS hash protocols. */
-	rss_hash_update_t rss_hash_update;
-	/** Get current RSS hash configuration. */
-	rss_hash_conf_get_t rss_hash_conf_get;
-	eth_filter_ctrl_t              filter_ctrl;
-	/**< common filter control. */
-	eth_set_mc_addr_list_t set_mc_addr_list; /**< set list of mcast addrs */
-	eth_rxq_info_get_t rxq_info_get;
-	/**< retrieve RX queue information. */
-	eth_txq_info_get_t txq_info_get;
-	/**< retrieve TX queue information. */
+	eth_filter_ctrl_t          filter_ctrl; /**< common filter control. */
+
+	eth_get_dcb_info           get_dcb_info; /** Get DCB information. */
+
+	eth_timesync_enable_t      timesync_enable;
 	/** Turn IEEE1588/802.1AS timestamping on. */
-	eth_timesync_enable_t timesync_enable;
+	eth_timesync_disable_t     timesync_disable;
 	/** Turn IEEE1588/802.1AS timestamping off. */
-	eth_timesync_disable_t timesync_disable;
-	/** Read the IEEE1588/802.1AS RX timestamp. */
 	eth_timesync_read_rx_timestamp_t timesync_read_rx_timestamp;
-	/** Read the IEEE1588/802.1AS TX timestamp. */
+	/** Read the IEEE1588/802.1AS RX timestamp. */
 	eth_timesync_read_tx_timestamp_t timesync_read_tx_timestamp;
-
-	/** Get DCB information */
-	eth_get_dcb_info get_dcb_info;
-	/** Adjust the device clock.*/
-	eth_timesync_adjust_time timesync_adjust_time;
-	/** Get the device clock time. */
-	eth_timesync_read_time timesync_read_time;
-	/** Set the device clock time. */
-	eth_timesync_write_time timesync_write_time;
-	/** Config ether type of l2 tunnel */
-	eth_l2_tunnel_eth_type_conf_t l2_tunnel_eth_type_conf;
-	/** Enable/disable l2 tunnel offload functions */
-	eth_l2_tunnel_offload_set_t l2_tunnel_offload_set;
+	/** Read the IEEE1588/802.1AS TX timestamp. */
+	eth_timesync_adjust_time   timesync_adjust_time; /** Adjust the device clock. */
+	eth_timesync_read_time     timesync_read_time; /** Get the device clock time. */
+	eth_timesync_write_time    timesync_write_time; /** Set the device clock time. */
 };
 
 /**
-- 
2.9.3

^ permalink raw reply related

* Re: [PATCH v13 6/7] vmxnet3: add Tx preparation
From: Thomas Monjalon @ 2016-12-22 13:10 UTC (permalink / raw)
  To: Ferruh Yigit; +Cc: dev, Tomasz Kulasek, Ananyev, Konstantin
In-Reply-To: <3c86020d-b81f-bbe0-4cfc-746f7714af01@intel.com>

2016-12-20 13:36, Ferruh Yigit:
> On 12/13/2016 5:41 PM, Tomasz Kulasek wrote:
> > From: "Ananyev, Konstantin" <konstantin.ananyev@intel.com>
> > 
> > Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
> > ---
> 
> <...>
> 
> >  
> >  uint16_t
> > +vmxnet3_prep_pkts(__rte_unused void *tx_queue, struct rte_mbuf **tx_pkts,
> > +	uint16_t nb_pkts)
> > +{
> <...>
> > +
> > +#ifdef RTE_LIBRTE_ETHDEV_DEBUG
> > +		ret = rte_validate_tx_offload(m);
> > +		if (ret != 0) {
> > +			rte_errno = ret;
> > +			return i;
> > +		}
> > +#endif
> > +		ret = rte_net_intel_cksum_prepare(m);
> 
> Since this API used beyond Intel drivers, what do you think renaming it?
> rte_net_generic_cksum_prepare() ?

I think it is good to have Intel in its name because it is where it
comes from.
Hopefully we won't have to care about this specific API when tx_prepare
will be well accepted.

^ permalink raw reply

* Re: [PATCH v12 1/6] ethdev: add Tx preparation
From: Thomas Monjalon @ 2016-12-22 13:14 UTC (permalink / raw)
  To: Ananyev, Konstantin
  Cc: Kulasek, TomaszX, dev, olivier.matz@6wind.com, Richardson, Bruce
In-Reply-To: <2601191342CEEE43887BDE71AB9772583F0E2A74@irsmsx105.ger.corp.intel.com>

2016-12-02 00:10, Ananyev, Konstantin:
> I have absolutely no problem to remove the RTE_ETHDEV_TX_PREPARE and associated logic.
> I personally don't use ARM boxes and don't plan to,
> and in theory users can still do conditional compilation at the upper layer, if they want to. 

Yes you're right. The application can avoid calling tx_prepare at all.
No need of an ifdef inside DPDK.

^ permalink raw reply

* Re: Why IP_PIPELINE is faster than L2FWD
From: Bruce Richardson @ 2016-12-22 13:25 UTC (permalink / raw)
  To: Royce Niu; +Cc: dev, cristian.dumitrescu
In-Reply-To: <CAOwUCNtJPX9wn_0heWkOtYW3e932sG_OJP-55+9dg3SBk05aXQ@mail.gmail.com>

On Thu, Dec 22, 2016 at 08:48:50PM +0800, Royce Niu wrote:
> But, actually, L3FWD of IP_PIPELINE is also faster than stock L2FWD, which
> also modifies mac addr. How can explain this?
> 
> Actually, I want to know why IP_PIPELINE is much faster and I can learn
> from IP_PIPELINE and make our own program.
> 
> But, the documentation of that is not detailed enough. if it is possible,
> could you tell me where is the key to boost? Thanks!
>

Adding Cristian as IP Pipeline maintainer.

A lot of tuning work went into IP Pipeline and the table and port
libraries it uses, so I'm not sure that there is just one or two key
changes which give it such good performance. L2 forward just hasn't had
the same level of tuning and, while performing well, is also simplified
to make it understandable as an example. Contrast the code in l2fwd
against equivalent vector code in l3fwd-lpm* files e.g. l3fwd_lpm_sse.h.
The latter is very high performing, the former is more readable.

Regards,
/Bruce

> On Thu, Dec 22, 2016 at 7:15 PM, Bruce Richardson <
> bruce.richardson@intel.com> wrote:
> 
> > On Thu, Dec 22, 2016 at 12:18:12AM +0800, Royce Niu wrote:
> > > Hi all,
> > >
> > > I tested default L2FWD and IP_PIPELINE (pass-through). The throughput of
> > > IP_PIPELINE is higher immensely.
> > >
> > > There are only two virtual NICs in KVM. The experiment is just moving
> > > packet from vNIC0  to vNIC1. I think the function is so simple. Why L2FWD
> > > is much slower?
> > >
> > > How can I improve L2FWD, to make L2FWD faster?
> > >
> > Is IP_PIPELINE in passthrough mode modifying the packets? L2FWD swaps
> > the mac addresses on each packet as it processes them, which can slow it
> > down. L2FWD is also more an example of how the APIs work than anything
> > else. For fastest possible port-to-port forwarding, testpmd should give
> > the highest performance.
> >
> > /Bruce
> >
> 
> 
> 
> -- 
> Regards,
> 
> Royce

^ permalink raw reply

* Re: [PATCH v12 1/6] ethdev: add Tx preparation
From: Thomas Monjalon @ 2016-12-22 13:30 UTC (permalink / raw)
  To: Ananyev, Konstantin; +Cc: Kulasek, TomaszX, Olivier Matz, dev
In-Reply-To: <2601191342CEEE43887BDE71AB9772583F0E689E@irsmsx105.ger.corp.intel.com>

2016-12-12 11:51, Ananyev, Konstantin:
> > > The application gets few information from tx_prepare() about what should
> > > be done to make the packet accepted by the hw, and the actions will
> > > probably be different depending on hardware.
> 
> That's true.
> I am open to suggestions how in future to provide extra information to the upper layer.
> Set rte_errno to different values depending on type of error,
> OR extra parameter in tx_prepare() that will provide more detailed error information,
> OR something else?

That's one of the reasons which gives me the feeling that it is safer
to introduce tx_prepare as an experimental API in 17.02.
So the users will know that it can change in the next release.
What do you think?
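
To make the first option quoted above concrete (a distinct errno value per
type of failure), here is a minimal, self-contained sketch. It does not use
DPDK: sketch_pkt, sketch_errno, sketch_tx_prepare(), SKETCH_MAX_SEGS and
SKETCH_SUPPORTED_OL are all hypothetical stand-ins, and the limits are
made-up numbers. It only assumes the convention seen in the posted drivers:
return the index of the first rejected packet and record the reason.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; none of these are real DPDK definitions. */
#define SKETCH_MAX_SEGS      8      /* assumed per-packet segment limit */
#define SKETCH_SUPPORTED_OL  0x3ULL /* assumed mask of supported offloads */

struct sketch_pkt {
	uint16_t nb_segs;  /* number of chained buffers */
	uint64_t ol_flags; /* requested Tx offloads */
};

static int sketch_errno; /* plays the role of rte_errno in this sketch */

/*
 * Check a burst before transmit. Returns the number of leading packets
 * that passed. When the result is less than nb_pkts, sketch_errno says
 * why the packet at that index was rejected.
 */
static uint16_t
sketch_tx_prepare(struct sketch_pkt **pkts, uint16_t nb_pkts)
{
	uint16_t i;

	assert(pkts != NULL);
	sketch_errno = 0;

	for (i = 0; i < nb_pkts; i++) {
		if (pkts[i]->ol_flags & ~SKETCH_SUPPORTED_OL) {
			sketch_errno = ENOTSUP; /* unsupported offload requested */
			return i;
		}
		if (pkts[i]->nb_segs > SKETCH_MAX_SEGS) {
			sketch_errno = EINVAL;  /* too many segments */
			return i;
		}
	}
	return nb_pkts;
}
```

With distinct codes the caller can at least tell "fix the packet layout"
apart from "do this offload in software", but it still learns nothing
about packets after the first failure, which is one argument for the
extra-parameter variant.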

^ permalink raw reply

* Re: Why IP_PIPELINE is faster than L2FWD
From: Royce Niu @ 2016-12-22 13:36 UTC (permalink / raw)
  To: Bruce Richardson; +Cc: Royce Niu, dev, cristian.dumitrescu
In-Reply-To: <20161222132542.GA44940@bricha3-MOBL3.ger.corp.intel.com>

Dear Bruce,

Thanks for your kind explanation.

I will try to follow your suggestion and see the source code.

On Thu, Dec 22, 2016 at 9:25 PM, Bruce Richardson <
bruce.richardson@intel.com> wrote:

> On Thu, Dec 22, 2016 at 08:48:50PM +0800, Royce Niu wrote:
> > But, actually, L3FWD of IP_PIPELINE is also faster than stock L2FWD,
> which
> > also modifies mac addr. How can explain this?
> >
> > Actually, I want to know why IP_PIPELINE is much faster and I can learn
> > from IP_PIPELINE and make our own program.
> >
> > But, the documentation of that is not detailed enough. if it is possible,
> > could you tell me where is the key to boost? Thanks!
> >
>
> Adding Cristian as IP Pipeline maintainer.
>
> A lot of tuning work went into IP Pipeline and the table and port
> libraries it uses, so I'm not sure that there is just one or two key
> changes which give it such good performance. L2 forward just hasn't had
> the same level of tuning and, while performing well, is also simplified
> to make it understandable as an example. Contrast the code in l2fwd
> against equivalent vector code in l3fwd-lpm* files e.g. l3fwd_lpm_sse.h.
> The latter is very high performing, the former is more readable.
>
> Regards,
> /Bruce
>
> > On Thu, Dec 22, 2016 at 7:15 PM, Bruce Richardson <
> > bruce.richardson@intel.com> wrote:
> >
> > > On Thu, Dec 22, 2016 at 12:18:12AM +0800, Royce Niu wrote:
> > > > Hi all,
> > > >
> > > > I tested default L2FWD and IP_PIPELINE (pass-through). The
> throughput of
> > > > IP_PIPELINE is higher immensely.
> > > >
> > > > There are only two virtual NICs in KVM. The experiment is just moving
> > > > packet from vNIC0  to vNIC1. I think the function is so simple. Why
> L2FWD
> > > > is much slower?
> > > >
> > > > How can I improve L2FWD, to make L2FWD faster?
> > > >
> > > Is IP_PIPELINE in passthrough mode modifying the packets? L2FWD swaps
> > > the mac addresses on each packet as it processes them, which can slow
> it
> > > down. L2FWD is also more an example of how the APIs work than anything
> > > else. For fastest possible port-to-port forwarding, testpmd should give
> > > the highest performance.
> > >
> > > /Bruce
> > >
> >
> >
> >
> > --
> > Regards,
> >
> > Royce
>



-- 
Regards,

Royce

^ permalink raw reply

* Re: [PATCH v12 1/6] ethdev: add Tx preparation
From: Jerin Jacob @ 2016-12-22 13:37 UTC (permalink / raw)
  To: Thomas Monjalon
  Cc: Ananyev, Konstantin, Kulasek, TomaszX, dev,
	olivier.matz@6wind.com, Richardson, Bruce
In-Reply-To: <7594197.BWrLiCir60@xps13>

On Thu, Dec 22, 2016 at 02:14:45PM +0100, Thomas Monjalon wrote:
> 2016-12-02 00:10, Ananyev, Konstantin:
> > I have absolutely no problem to remove the RTE_ETHDEV_TX_PREPARE and associated logic.
> > I personally don't use ARM boxes and don't plan to,
> > and in theory users can still do conditional compilation at the upper layer, if they want to. 
> 
> Yes you're right. The application can avoid calling tx_prepare at all.

There are applications inside the DPDK repo which will be using tx_prep, so
in that case, IMHO, keep the ifdef inside the DPDK library, off by default.
Then, if required, we can disable tx_prep in one shot on integrated
controller targets, where the system has only one integrated controller and
that controller does not need tx_prep.


> No need of an ifdef inside DPDK.

^ permalink raw reply

* Re: [PATCH v5 00/29] Support VFD and DPDK PF + kernel VF on i40e
From: Vincent JARDIN @ 2016-12-22 13:46 UTC (permalink / raw)
  To: Chen, Jing D
  Cc: Thomas Monjalon, dev@dpdk.org, Yigit, Ferruh, Wu, Jingjing,
	Zhang, Helin
In-Reply-To: <4341B239C0EFF9468EE453F9E9F4604D3C5C6B22@shsmsx102.ccr.corp.intel.com>

Le 22/12/2016 à 09:10, Chen, Jing D a écrit :
> In the meanwhile, we have some test models ongoing to validate combination of Linux and
> DPDK drivers for VF and PF. We'll fully support below 4 cases going forward.
> 1. DPDK PF + DPDK VF
> 2. DPDK PF + Linux VF

+ DPDK PF + FreeBSD VF
+ DPDK PF + Windows VF
+ DPDK PF + OS xyz VF

> 3. Linux PF + DPDK VF
> 4. Linux PF + Linux VF (it's not our scope)

So, you confirm the issue: with DPDK becoming a PF, even though the SR-IOV
protocol includes versioning, the number of combinations doubles.

>
> After applied this patch, i've done below test without observing compatibility issue.
> 1. DPDK PF + DPDK VF (middle of 16.11 and 17.02 code base). PF to support API 1.0 while VF
>     to support API 1.1/1.0	
> 2. DPDK PF + Linux VF 1.5.14. PF to support 1.0, while Linux to support 1.1/1.0
>
> Linux PF + DPDK VF has been tested with 1.0 API long time ago. There is some test activities
> ongoing.
>
> Finally, please give strong reasons to support your NAC.

I feel bad because I do recognize the strong and hard work that you have 
done on this PF development, but I feel we need first to assess if DPDK 
should become a PF or not. I know ixgbe did open the path and that there
is some historical DPDK PF support in Intel NICs, but before we
generalize it, we have to make sure we are not turning this DataPlane 
development Kit into a ControlPlane Driver Kit that we are scared to 
upstream into Linux kernel. Even if "DPDK is not Linux", it does not 
mean that Linux should be ignored. In case of DPDK on other OS, same, 
their PF could be extended too.

So currently, yes, I do keep my NACK.

Since DPDK PF features can go into the Linux PF too, and since Linux
(and other hypervisors) already has some tools to manage the PF (see
iproute2, etc.), why should we have another management path with DPDK?
DPDK is aimed to be a Dataplane Development Kit, not a
management/control plane driver kit.

Assuming you want to use DPDK PF for dataplane feature, that could be OK 
then, using:
   - configure one VF on the hypervisor from Linux's PF, let's name it
VF_forPFtraffic, see http://dpdk.org/doc/guides/howto/flow_bifurcation.html
   - have no (or few) IOs to the PF's queues
   - assign the traffic to all VF_forPFtraffic's queues of the hypervisor,
   - run DPDK into the hypervisor's VF_forPFtraffic

Doing so, we get the same benefit of running DPDK over PF or running 
DPDK over VF_forPFtraffic, don't we? It is a benefit of:
   http://dpdk.org/doc/guides/howto/flow_bifurcation.html

Thank you,
   Vincent

^ permalink raw reply

* Re: [PATCH v12 1/6] ethdev: add Tx preparation
From: Ananyev, Konstantin @ 2016-12-22 14:11 UTC (permalink / raw)
  To: Thomas Monjalon; +Cc: Kulasek, TomaszX, Olivier Matz, dev@dpdk.org
In-Reply-To: <2376458.ueV7yndzPH@xps13>

> 
> 2016-12-12 11:51, Ananyev, Konstantin:
> > > > The application gets few information from tx_prepare() about what should
> > > > be done to make the packet accepted by the hw, and the actions will
> > > > probably be different depending on hardware.
> >
> > That's true.
> > I am open to suggestions how in future to provide extra information to the upper layer.
> > Set rte_errno to different values depending on type of error,
> > OR extra parameter in tx_prepare() that will provide more detailed error information,
> > OR something else?
> 
> That's one of the reason which give me a feeling that it is safer
> to introduce tx_prepare as an experimental API in 17.02.
> So the users will know that it can change in the next release.
> What do you think?

I think that's a good reason and I am OK with it.
Konstantin

^ permalink raw reply

* Re: [PATCH v14 1/8] ethdev: add Tx preparation
From: Thomas Monjalon @ 2016-12-22 14:24 UTC (permalink / raw)
  To: Tomasz Kulasek; +Cc: dev
In-Reply-To: <1482411919-7620-2-git-send-email-tomaszx.kulasek@intel.com>

Hi Tomasz,

2016-12-22 14:05, Tomasz Kulasek:
> Added API for `rte_eth_tx_prepare`
> 
> uint16_t rte_eth_tx_prepare(uint8_t port_id, uint16_t queue_id,
> 	struct rte_mbuf **tx_pkts, uint16_t nb_pkts)

As discussed earlier and agreed by Konstantin, please mark this API
as experimental.
We could make some changes in 17.05 to improve error description
or add some flags to modify the behaviour.


> int rte_net_intel_cksum_prepare(struct rte_mbuf *m)
> 
>   to prepare pseudo header checksum for TSO and non-TSO tcp/udp packets
>   before hardware tx checksum offload.
>    - for non-TSO tcp/udp packets full pseudo-header checksum is
>      counted and set.
>    - for TSO the IP payload length is not included.
> 
> 
> int
> rte_net_intel_cksum_flags_prepare(struct rte_mbuf *m, uint64_t ol_flags)
> 
>   this function uses same logic as rte_net_intel_cksum_prepare, but
>   allows application to choose which offloads should be taken into
>   account, if full preparation is not required.

How does the application know which offload flags should be taken into account?
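
For reference, the difference between the two cases described for
rte_net_intel_cksum_prepare can be sketched in a few lines of plain C. This
is not the DPDK implementation: sketch_phdr_cksum() and its parameters are
hypothetical, addresses are taken in host byte order for readability, and
only IPv4 is covered. It shows a folded (not complemented) one's-complement
sum over the pseudo header, with the L4 length left out when TSO is
requested.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: IPv4 pseudo-header sum for TCP (6) or UDP (17).
 * src/dst are host-order addresses; l4_len is the L4 header plus payload
 * length. For TSO the length is not included in the sum.
 */
static uint16_t
sketch_phdr_cksum(uint32_t src, uint32_t dst, uint8_t proto,
	uint16_t l4_len, bool tso)
{
	uint32_t sum = 0;

	assert(proto == 6 || proto == 17);

	sum += src >> 16;
	sum += src & 0xffff;
	sum += dst >> 16;
	sum += dst & 0xffff;
	sum += proto;           /* zero byte + protocol form one 16-bit word */
	if (!tso)
		sum += l4_len;  /* non-TSO: full pseudo header */

	/* Fold the carries back into the low 16 bits. */
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);

	return (uint16_t)sum;
}
```

For 192.168.0.1 -> 192.168.0.2, TCP, with a 20-byte L4 length, the sketch
gives 0x816e for the non-TSO case and 0x815a for the TSO case.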


>  #
> +# Use real NOOP to turn off TX preparation stage
> +#
> +# While the behaviour of ``rte_ethdev_tx_prepare`` may change after turning on
> +# real NOOP, this configuration shouldn't be never enabled globaly, and can be
> +# used in appropriate target configuration file with a following restrictions
> +#
> +CONFIG_RTE_ETHDEV_TX_PREPARE_NOOP=n

As discussed earlier, it would be easier to not call tx_prepare at all.
However, this option allows an optimization when compiling DPDK for a
known environment without modifying the application.
So it is worth keeping.

The text explaining the option should be improved.
I suggest this text:

# Turn off Tx preparation stage
#
# Warning: rte_eth_tx_prepare() can be safely disabled only if using a
# driver which does not implement any Tx preparation.


> +	uint16_t nb_seg_max;  /**< Max number of segments per whole packet. */
> +	uint16_t nb_mtu_seg_max; /**< Max number of segments per one MTU */

In another mail, you've added this explanation:
* For a non-TSO packet, a single transmit packet may span up to "nb_mtu_seg_max" buffers.
* For a TSO packet, the total number of data descriptors may be up to "nb_seg_max", and each segment within the TSO may span up to "nb_mtu_seg_max".

Maybe you can try to mix these comments to improve the doxygen.
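
To make the two limits concrete, here is a minimal sketch of the check
they imply. The struct and function names are hypothetical stand-ins
(only the two field names come from the patch), and the per-MTU-segment
rule for TSO is deliberately not modelled, only the whole-packet limit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in holding the two limits discussed above. */
struct seg_limits {
	uint16_t nb_seg_max;     /* max segments per whole (TSO) packet */
	uint16_t nb_mtu_seg_max; /* max segments per MTU-sized packet */
};

/*
 * Return true if a packet made of nb_segs buffers fits the limits:
 * - non-TSO: at most nb_mtu_seg_max buffers
 * - TSO:     at most nb_seg_max buffers in total
 */
static bool pkt_segs_ok(const struct seg_limits *lim, uint16_t nb_segs,
			bool is_tso)
{
	assert(lim != NULL);
	uint16_t max = is_tso ? lim->nb_seg_max : lim->nb_mtu_seg_max;

	return nb_segs <= max;
}
```

With limits of 40 and 8, a 9-segment packet would be rejected as non-TSO
but accepted as TSO.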

^ permalink raw reply

* Re: [PATCH v14 8/8] testpmd: use Tx preparation in csum engine
From: Thomas Monjalon @ 2016-12-22 14:28 UTC (permalink / raw)
  To: Tomasz Kulasek; +Cc: dev
In-Reply-To: <1482411919-7620-9-git-send-email-tomaszx.kulasek@intel.com>

2016-12-22 14:05, Tomasz Kulasek:
> Since all current drivers supports Tx preparation API, it is used
> in csum forwarding engine by default for all drivers.
[...]
> +/*
> + * Enable Tx preparation path in the "csum" engine.
> + */
> +uint8_t tx_prepare;

It seems this variable is not used.

^ permalink raw reply

* Re: [PATCH v2 0/5] example/ethtool: add bus info and fw version get
From: Ferruh Yigit @ 2016-12-22 14:36 UTC (permalink / raw)
  To: Thomas Monjalon, Qiming Yang; +Cc: dev, Remy Horton
In-Reply-To: <1578263.GeZ0IiYehl@xps13>

On 12/22/2016 11:07 AM, Thomas Monjalon wrote:
> 2016-12-08 16:34, Remy Horton:
>>
>> On 06/12/2016 15:16, Qiming Yang wrote:
>> [..]
>>> Qiming Yang (5):
>>>   ethdev: add firmware version get
>>>   net/e1000: add firmware version get
>>>   net/ixgbe: add firmware version get
>>>   net/i40e: add firmware version get
>>>   ethtool: dispaly bus info and firmware version
>>
>> s/dispaly/display
>>
>> doc/guides/rel_notes/release_17_02.rst ought to be updated as well. Code 
>> itself looks ok though..
>>
>> Acked-by: Remy Horton <remy.horton@intel.com>
> 
> It must be a feature in the table (doc/guides/nics/features/).
> The deprecation notice must be removed also.
> 
> I think it is OK to add a new dev_ops and a new API function for firmware
> query. Generally speaking, it is a good thing to avoid putting all
> informations in the same structure (e.g. rte_eth_dev_info). 

OK.

> However, there
> is a balance to find. Could we plan to add more info to this new query?
> Instead of
> 	rte_eth_dev_fwver_get(uint8_t port_id, char *fw_version, int fw_length)

There is another problem here: the content and the format of the string
are not defined. In this patchset they are not the same for different PMDs.
This is OK for just printing the data, but not good for an API. How can
the application know what to expect?

> could it fill a struct?
> 	rte_eth_dev_fw_info_get(uint8_t port_id, struct rte_eth_dev_fw_info *fw_info)

I believe this is better. But the problem we are having with this usage
is: ABI breakage.

Since this struct will be a public structure, adding a new field to it
in the future will break the ABI, and this change alone will force a
new version of the whole ethdev library!
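
A small sketch of why a new field breaks the ABI. The struct layouts and
field names below are purely hypothetical (the patchset does not define
struct rte_eth_dev_fw_info); the point is only that existing offsets stay
put while the size grows.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout an application was compiled against. */
struct fw_info_v1 {
	uint32_t major;
	uint32_t minor;
};

/* Hypothetical layout after the library adds one field. */
struct fw_info_v2 {
	uint32_t major;
	uint32_t minor;
	uint32_t nvm_version;
};

/* Existing fields keep their offsets... */
static_assert(offsetof(struct fw_info_v2, minor) ==
	      offsetof(struct fw_info_v1, minor), "offsets unchanged");

/*
 * ...but a library built with v2 fills sizeof(struct fw_info_v2) bytes,
 * while an old application only reserved sizeof(struct fw_info_v1).
 * This returns how many bytes would be written past the old buffer.
 */
static size_t fw_info_overrun(void)
{
	return sizeof(struct fw_info_v2) - sizeof(struct fw_info_v1);
}
```

On common platforms the overrun is 4 bytes, which is why the old binary
cannot safely run against the new library without symbol versioning.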

When all required fields are passed as individual arguments instead of
a struct, ABI versioning can at least be done on the API function when
a new field is added, which makes it possible to avoid ABI breakage.
But this will get ugly as the number of arguments increases.

Or any other opinions on how to define the API to reduce ABI breakage?

> 
> We already have
> 	rte_eth_dev_get_reg_info(uint8_t port_id, struct rte_dev_reg_info *info)
> with
> 	uint32_t version; /**< Device version */
> 
> There are also these functions (a bit related):
> 	rte_eth_dev_get_eeprom_length(uint8_t port_id)
> 	rte_eth_dev_get_eeprom(uint8_t port_id, struct rte_dev_eeprom_info *info)
> 

^ permalink raw reply

* Re: [PATCH v6 23/25] app/testpmd: handle i40e in VF VLAN filter command
From: Iremonger, Bernard @ 2016-12-22 14:47 UTC (permalink / raw)
  To: Yigit, Ferruh, Lu, Wenzhuo, dev@dpdk.org
In-Reply-To: <c2b4b530-66f3-5df6-5dd8-70412b6ed227@intel.com>


Hi Ferruh,

> -----Original Message-----
> From: Yigit, Ferruh
> Sent: Thursday, December 22, 2016 10:57 AM
> To: Lu, Wenzhuo <wenzhuo.lu@intel.com>; dev@dpdk.org
> Cc: Iremonger, Bernard <bernard.iremonger@intel.com>
> Subject: Re: [dpdk-dev] [PATCH v6 23/25] app/testpmd: handle i40e in VF
> VLAN filter command
> 
> On 12/21/2016 6:34 AM, Wenzhuo Lu wrote:
> > modify set_vf_rx_vlan function to handle the i40e PMD.
> >
> > Signed-off-by: Bernard Iremonger <bernard.iremonger@intel.com>
> > ---
> 
> <...>
> 
> > +
> > +	switch (ret) {
> > +	case 0:
> > +		break;
> > +	case -EINVAL:
> > +		printf("invalid vlan_id %d or vf_mask %lu\n",
> 
> To fix 32bit compilation:
> printf("invalid vlan_id %d or vf_mask %"PRIu64"\n",
> 
> 
> <...>
I will fix this in v7.
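
For context, a minimal standalone sketch of the suggested format fix:
"%lu" expects an unsigned long, which is only 32 bits wide on 32-bit
targets, whereas PRIu64 from <inttypes.h> expands to the right
conversion for uint64_t everywhere. The helper name and the use of
snprintf instead of printf are illustrative assumptions, not testpmd code.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdio.h>

/*
 * Hypothetical helper: format the error message into buf using PRIu64
 * so that the uint64_t mask is printed correctly on 32-bit and 64-bit
 * builds. Returns the snprintf() result.
 */
static int format_vlan_error(char *buf, size_t len, int vlan_id,
			     uint64_t vf_mask)
{
	assert(buf != NULL);
	return snprintf(buf, len,
			"invalid vlan_id %d or vf_mask %" PRIu64 "\n",
			vlan_id, vf_mask);
}
```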

Regards,

Bernard.

^ permalink raw reply

* Re: [PATCH v2 0/5] example/ethtool: add bus info and fw version get
From: Thomas Monjalon @ 2016-12-22 14:47 UTC (permalink / raw)
  To: Ferruh Yigit; +Cc: Qiming Yang, dev, Remy Horton
In-Reply-To: <a191e031-c2fd-2a18-ea02-af5fff9668ec@intel.com>

2016-12-22 14:36, Ferruh Yigit:
> On 12/22/2016 11:07 AM, Thomas Monjalon wrote:
> > I think it is OK to add a new dev_ops and a new API function for firmware
> > query. Generally speaking, it is a good thing to avoid putting all
> > informations in the same structure (e.g. rte_eth_dev_info). 
> 
> OK.
> 
> > However, there
> > is a balance to find. Could we plan to add more info to this new query?
> > Instead of
> > 	rte_eth_dev_fwver_get(uint8_t port_id, char *fw_version, int fw_length)
[...]
> > could it fill a struct?
> > 	rte_eth_dev_fw_info_get(uint8_t port_id, struct rte_eth_dev_fw_info *fw_info)
> 
> I believe this is better. But the problem we are having with this usage
> is: ABI breakage.
> 
> Since this struct will be a public structure, in the future if we want
> to add a new field to the struct, it will break the ABI, and just this
> change will cause a new version for whole ethdev library!
> 
> When all required fields received via arguments, one by one, instead of
> struct, at least ABI versioning can be done on the API when new field
> added, and can be possible to escape from ABI breakage. But this will be
> ugly when number of arguments increased.
> 
> Or any other opinion on how to define API to reduce ABI breakage?

You're right.
But I don't think we should have one function per piece of data, just
because it would be ugly :)
I hope the ABI will become stable with time.

^ permalink raw reply

