DPDK-dev Archive on lore.kernel.org
* Re: KNI Questions
From: Stephen Hemminger @ 2016-12-15 17:16 UTC (permalink / raw)
  To: Ferruh Yigit; +Cc: dev
In-Reply-To: <53ad7e36-380c-e5b7-a002-1690d2e63603@intel.com>

On Thu, 15 Dec 2016 11:53:59 +0000
Ferruh Yigit <ferruh.yigit@intel.com> wrote:

> Hi Stephen,
> 
> <...>
> 
> > 
> > Which raises a couple of questions:
> >  1. Why is DPDK still keeping KNI support for Intel-specific ethtool functionality?
> >     This always breaks, is code bloat, and means a 3rd copy of base code (Linux, DPDK PMD, + KNI)
> 
> I agree with your comments about the ethtool functionality,
> but right now it is functionality that people may be using, so I think
> we should not remove it without providing an alternative.
> 
> > 
> >  2. Why is KNI not upstream?
> >     If not acceptable due to security or supportability then why does it still exist?
> 
> I believe you are one of the most knowledgeable people on the mailing list
> when it comes to upstreaming; any support is welcome.

It should be upstreamable, but I doubt it would make it past the maintainer,
mostly because it supports DPDK, which he is not in favor of, but also because
it is a specialized interface only usable by DPDK, i.e. not general infrastructure.


* Re: [PATCH v2 1/6] eventdev: introduce event driven programming model
From: Van Haaren, Harry @ 2016-12-15 16:54 UTC (permalink / raw)
  To: Jerin Jacob
  Cc: dev@dpdk.org, thomas.monjalon@6wind.com, Richardson, Bruce,
	hemant.agrawal@nxp.com, Eads, Gage
In-Reply-To: <20161214131356.GA4224@localhost.localdomain>


> From: Jerin Jacob [mailto:jerin.jacob@caviumnetworks.com]
> Sent: Wednesday, December 14, 2016 1:14 PM
> To: Van Haaren, Harry <harry.van.haaren@intel.com>
<snip>

> So incorporating my latest suggestions on moving sub_event_type field around:
> 
> union {
> 	uint64_t event;
> 	struct {
> 		uint32_t flow_id: 20;
> 		uint32_t sub_event_type : 8;
> 		uint32_t event_type : 4;
> 
> 		uint8_t operation  : 2; /* new fwd drop */
> 		uint8_t rsvd: 4; /* for future additions */
> 		uint8_t sched_type : 2;
> 
> 		uint8_t queue_id;
> 		uint8_t priority;
> 		uint8_t impl_opaque;
> 	};
> };

Thanks, looks good to me!
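
For anyone following along, here is a minimal sketch of how the proposed
layout packs into 64 bits. The union name and the helper function are
hypothetical (the thread only shows an anonymous union), and the exact bit
positions of the fields are compiler-defined, so the sketch checks only the
total size and field round-trips. It also relies on uint8_t bit-fields and
anonymous structs, which GCC and Clang accept.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical name; mirrors the layout quoted above. */
union example_event {
	uint64_t event;
	struct {
		uint32_t flow_id : 20;
		uint32_t sub_event_type : 8;
		uint32_t event_type : 4;

		uint8_t operation : 2;   /* new / fwd / drop */
		uint8_t rsvd : 4;        /* for future additions */
		uint8_t sched_type : 2;

		uint8_t queue_id;
		uint8_t priority;
		uint8_t impl_opaque;
	};
};

/* 20 + 8 + 4 + 2 + 4 + 2 + 8 + 8 + 8 = 64 bits. */
static_assert(sizeof(union example_event) == sizeof(uint64_t),
	      "event must fit in a single 64-bit word");

/* Illustrative helper: build an event with all other fields zeroed. */
static union example_event
example_event_make(uint32_t flow_id, uint8_t queue_id, uint8_t sched_type)
{
	union example_event ev = { .event = 0 };

	ev.flow_id = flow_id & 0xFFFFF;    /* keep the low 20 bits */
	ev.queue_id = queue_id;
	ev.sched_type = sched_type & 0x3;  /* keep the low 2 bits */
	return ev;
}
```

Keeping the whole event in one 64-bit word means it can be copied or
compared through the .event member in a single operation.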


* Re: [PATCH v3] drivers: advertise kmod dependencies in pmdinfo
From: Neil Horman @ 2016-12-15 16:09 UTC (permalink / raw)
  To: Olivier Matz
  Cc: dev, thomas.monjalon, vido, fiona.trahe, stephen,
	adrien.mazarguil
In-Reply-To: <1481809599-27896-1-git-send-email-olivier.matz@6wind.com>

On Thu, Dec 15, 2016 at 02:46:39PM +0100, Olivier Matz wrote:
> Add a new macro RTE_PMD_REGISTER_KMOD_DEP() that allows a driver to
> declare the list of kernel modules required to run properly.
> 
> Today, most PCI drivers require uio/vfio.
> 
> Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
> Acked-by: Fiona Trahe <fiona.trahe@intel.com>
> ---
> 
> v2 -> v3:
> - fix kmods deps advertised by mellanox drivers as pointed out
>   by Adrien
> 
> v1 -> v2:
> - do not advertise uio_pci_generic for vf drivers
> - rebase on top of head: use new driver names and prefix
>   macro with RTE_
> 
> rfc -> v1:
> - the kmod information can be per-device using a modalias-like
>   pattern
> - change syntax to use '&' and '|' instead of ',' and ':'
> - remove useless prerequisites in kmod list: no need to
>   specify both uio and uio_pci_generic, only the latter is
>   required
> - update kmod list in szedata2 driver
> - remove kmod list in qat driver: it requires more than just loading
>   a kmod, which is described in documentation
> 
>  buildtools/pmdinfogen/pmdinfogen.c      |  1 +
>  buildtools/pmdinfogen/pmdinfogen.h      |  1 +
>  drivers/net/bnx2x/bnx2x_ethdev.c        |  2 ++
>  drivers/net/bnxt/bnxt_ethdev.c          |  1 +
>  drivers/net/cxgbe/cxgbe_ethdev.c        |  1 +
>  drivers/net/e1000/em_ethdev.c           |  1 +
>  drivers/net/e1000/igb_ethdev.c          |  2 ++
>  drivers/net/ena/ena_ethdev.c            |  1 +
>  drivers/net/enic/enic_ethdev.c          |  1 +
>  drivers/net/fm10k/fm10k_ethdev.c        |  1 +
>  drivers/net/i40e/i40e_ethdev.c          |  1 +
>  drivers/net/i40e/i40e_ethdev_vf.c       |  1 +
>  drivers/net/ixgbe/ixgbe_ethdev.c        |  2 ++
>  drivers/net/mlx4/mlx4.c                 |  2 ++
>  drivers/net/mlx5/mlx5.c                 |  1 +
>  drivers/net/nfp/nfp_net.c               |  1 +
>  drivers/net/qede/qede_ethdev.c          |  2 ++
>  drivers/net/szedata2/rte_eth_szedata2.c |  2 ++
>  drivers/net/thunderx/nicvf_ethdev.c     |  1 +
>  drivers/net/virtio/virtio_ethdev.c      |  1 +
>  drivers/net/vmxnet3/vmxnet3_ethdev.c    |  1 +
>  lib/librte_eal/common/include/rte_dev.h | 25 +++++++++++++++++++++++++
>  tools/dpdk-pmdinfo.py                   |  5 ++++-
>  23 files changed, 56 insertions(+), 1 deletion(-)
> 
It's odd that all devices, regardless of vendor, should depend on the igb_uio
module.  It seems to me that depending on uio_pci_generic or vfio is sufficient.
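
To make the "alternatives" point concrete, here is a rough sketch (not the
pmdinfo implementation) of how a dependency string using the '&' and '|'
syntax from the changelog could be checked against a list of loaded
modules. The function names are hypothetical, the precedence ('&' binding
tighter than '|') is an assumption, and the per-device pattern part of the
string is left out entirely.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* True if name[0..len) appears in the NULL-terminated list 'loaded'. */
static bool
kmod_is_loaded(const char *name, size_t len, const char *const *loaded)
{
	for (; *loaded != NULL; loaded++)
		if (strlen(*loaded) == len && strncmp(*loaded, name, len) == 0)
			return true;
	return false;
}

/*
 * Evaluate "a & b | c": groups separated by '|' are alternatives, and
 * every module inside a '&' group must be loaded (assumed precedence).
 * An expression with no module names counts as satisfied.
 */
static bool
kmod_deps_satisfied(const char *expr, const char *const *loaded)
{
	bool any_group_ok = false;   /* some '|' alternative is satisfied */
	bool group_ok = true;        /* current '&' group is still satisfied */
	bool group_has_name = false;
	bool saw_name = false;
	const char *p = expr;

	for (;;) {
		while (isspace((unsigned char)*p))
			p++;
		if (*p == '|' || *p == '\0') {
			/* Close the current '&' group. */
			if (group_has_name && group_ok)
				any_group_ok = true;
			if (*p == '\0')
				break;
			group_ok = true;
			group_has_name = false;
			p++;
		} else if (*p == '&') {
			p++;
		} else {
			/* A module name runs up to the next operator or space. */
			size_t len = strcspn(p, "&| \t\n");

			if (!kmod_is_loaded(p, len, loaded))
				group_ok = false;
			group_has_name = true;
			saw_name = true;
			p += len;
		}
	}
	return any_group_ok || !saw_name;
}
```

Under these assumptions, "uio_pci_generic | vfio" is satisfied on a system
with only vfio loaded, with no igb_uio alternative needed.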

Neil


* [PATCH v2 0/3] AESNI MB PMD updates
From: Pablo de Lara @ 2016-12-15 16:00 UTC (permalink / raw)
  To: declan.doherty; +Cc: dev, Pablo de Lara

The library used in AESNI MB PMD, Intel Multi Buffer Crypto for IPsec,
has been migrated to a new location on GitHub (see documentation patch
for the link).

The library has also been updated, so single crypto operations
are supported (cipher only and authentication only). Therefore, the PMD
has been updated to support these operations.

This patchset depends on patchset "Add scatter-gather list capability to
Intel QuickAssist Technology driver" (http://dpdk.org/ml/archives/dev/2016-November/050947.html)

Changes in v2:
- Fixed hash only tests, including truncated digest length
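
As a rough illustration of what a truncated digest length means in the
tests: only the first truncated_len bytes of the full digest are compared.
The helper below is a hypothetical sketch and not code from the test suite.
The lengths 20 and 12 in the usage note come from the HMAC-SHA1 test vector
in patch 2/3.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: compare only the leading truncated_len bytes of a
 * computed digest against the expected digest. memcmp() is not
 * constant-time, so this is suitable for test code only.
 */
static bool
truncated_digest_matches(const uint8_t *computed, size_t computed_len,
			 const uint8_t *expected, size_t truncated_len)
{
	if (truncated_len == 0 || truncated_len > computed_len)
		return false;
	return memcmp(computed, expected, truncated_len) == 0;
}
```

For HMAC-SHA1, a caller would pass computed_len = 20 and
truncated_len = 12, so differences in the last 8 bytes are ignored.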


Pablo de Lara (3):
  doc: update AESNI MB PMD guide
  crypto/aesni_mb: add single operation functionality
  doc: add missing supported algos for AESNI MB PMD

 app/test/test_cryptodev.c                   | 34 ++++++++++++++++++
 app/test/test_cryptodev_aes_test_vectors.h  | 36 ++++++++++++-------
 app/test/test_cryptodev_hash_test_vectors.h | 54 +++++++++++++++++++----------
 doc/guides/cryptodevs/aesni_mb.rst          | 14 +++-----
 doc/guides/rel_notes/release_17_02.rst      |  8 +++++
 drivers/crypto/aesni_mb/rte_aesni_mb_pmd.c  | 49 +++++++++++++++++---------
 6 files changed, 140 insertions(+), 55 deletions(-)

-- 
2.7.4


* [PATCH v2 3/3] doc: add missing supported algos for AESNI MB PMD
From: Pablo de Lara @ 2016-12-15 16:00 UTC (permalink / raw)
  To: declan.doherty; +Cc: dev, Pablo de Lara
In-Reply-To: <1481817632-183082-1-git-send-email-pablo.de.lara.guarch@intel.com>

AESNI MB PMD supports SHA224-HMAC and SHA384-HMAC,
but the documentation did not list them.

Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
---
 doc/guides/cryptodevs/aesni_mb.rst | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/doc/guides/cryptodevs/aesni_mb.rst b/doc/guides/cryptodevs/aesni_mb.rst
index cb429d7..8b18eba 100644
--- a/doc/guides/cryptodevs/aesni_mb.rst
+++ b/doc/guides/cryptodevs/aesni_mb.rst
@@ -55,7 +55,9 @@ Cipher algorithms:
 Hash algorithms:
 
 * RTE_CRYPTO_HASH_SHA1_HMAC
+* RTE_CRYPTO_HASH_SHA224_HMAC
 * RTE_CRYPTO_HASH_SHA256_HMAC
+* RTE_CRYPTO_HASH_SHA384_HMAC
 * RTE_CRYPTO_HASH_SHA512_HMAC
 
 Limitations
-- 
2.7.4


* [PATCH v2 2/3] crypto/aesni_mb: add single operation functionality
From: Pablo de Lara @ 2016-12-15 16:00 UTC (permalink / raw)
  To: declan.doherty; +Cc: dev, Pablo de Lara
In-Reply-To: <1481817632-183082-1-git-send-email-pablo.de.lara.guarch@intel.com>

Update driver to use new AESNI Multibuffer IPSec library single
operation functionality (cipher only and authentication only).
This patch also adds tests for this new feature.

Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
---
 app/test/test_cryptodev.c                   | 34 ++++++++++++++++++
 app/test/test_cryptodev_aes_test_vectors.h  | 36 ++++++++++++-------
 app/test/test_cryptodev_hash_test_vectors.h | 54 +++++++++++++++++++----------
 doc/guides/cryptodevs/aesni_mb.rst          |  2 --
 doc/guides/rel_notes/release_17_02.rst      |  1 +
 drivers/crypto/aesni_mb/rte_aesni_mb_pmd.c  | 49 +++++++++++++++++---------
 6 files changed, 128 insertions(+), 48 deletions(-)
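
Note for reviewers: the new chain-order selection can be modelled in
isolation as below. This is a simplified sketch with stand-in types; the
enum, struct, and function names are hypothetical and do not match the
driver or rte_crypto_sym.h. The decision table is meant to follow the
rte_aesni_mb_pmd.c hunk in this patch.

```c
#include <stddef.h>

/* Stand-in types, assumed for illustration only. */
enum xform_type { XFORM_AUTH, XFORM_CIPHER };
enum cipher_op { CIPHER_OP_ENCRYPT, CIPHER_OP_DECRYPT };
enum chain_order {
	ORDER_INVALID = -1,
	ORDER_HASH_CIPHER,
	ORDER_CIPHER_HASH
};

struct xform {
	enum xform_type type;
	enum cipher_op cipher_op;  /* only meaningful for XFORM_CIPHER */
	const struct xform *next;
};

static enum chain_order
get_chain_order(const struct xform *xf)
{
	if (xf == NULL)
		return ORDER_INVALID;

	if (xf->type == XFORM_CIPHER) {
		/* Encrypt, alone or followed by auth: cipher first. */
		if (xf->cipher_op == CIPHER_OP_ENCRYPT &&
		    (xf->next == NULL || xf->next->type == XFORM_AUTH))
			return ORDER_CIPHER_HASH;
		/* Decrypt only: handled on the hash-then-cipher path. */
		if (xf->cipher_op == CIPHER_OP_DECRYPT && xf->next == NULL)
			return ORDER_HASH_CIPHER;
		return ORDER_INVALID;
	}

	/* Auth, alone or followed by cipher: hash first. */
	if (xf->next == NULL || xf->next->type == XFORM_CIPHER)
		return ORDER_HASH_CIPHER;

	return ORDER_INVALID;
}
```

In this model a decrypt xform followed by an auth xform is rejected, and a
missing second xform is what later selects NULL_HASH or NULL_CIPHER in the
session setup.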

diff --git a/app/test/test_cryptodev.c b/app/test/test_cryptodev.c
index f1f3542..5895d99 100644
--- a/app/test/test_cryptodev.c
+++ b/app/test/test_cryptodev.c
@@ -1466,6 +1466,38 @@ test_AES_CBC_HMAC_SHA512_decrypt_perform(struct rte_cryptodev_sym_session *sess,
 }
 
 static int
+test_AES_cipheronly_mb_all(void)
+{
+	struct crypto_testsuite_params *ts_params = &testsuite_params;
+	int status;
+
+	status = test_blockcipher_all_tests(ts_params->mbuf_pool,
+		ts_params->op_mpool, ts_params->valid_devs[0],
+		RTE_CRYPTODEV_AESNI_MB_PMD,
+		BLKCIPHER_AES_CIPHERONLY_TYPE);
+
+	TEST_ASSERT_EQUAL(status, 0, "Test failed");
+
+	return TEST_SUCCESS;
+}
+
+static int
+test_authonly_mb_all(void)
+{
+	struct crypto_testsuite_params *ts_params = &testsuite_params;
+	int status;
+
+	status = test_blockcipher_all_tests(ts_params->mbuf_pool,
+		ts_params->op_mpool, ts_params->valid_devs[0],
+		RTE_CRYPTODEV_AESNI_MB_PMD,
+		BLKCIPHER_AUTHONLY_TYPE);
+
+	TEST_ASSERT_EQUAL(status, 0, "Test failed");
+
+	return TEST_SUCCESS;
+}
+
+static int
 test_AES_chain_mb_all(void)
 {
 	struct crypto_testsuite_params *ts_params = &testsuite_params;
@@ -6559,6 +6591,8 @@ static struct unit_test_suite cryptodev_aesni_mb_testsuite  = {
 	.teardown = testsuite_teardown,
 	.unit_test_cases = {
 		TEST_CASE_ST(ut_setup, ut_teardown, test_AES_chain_mb_all),
+		TEST_CASE_ST(ut_setup, ut_teardown, test_AES_cipheronly_mb_all),
+		TEST_CASE_ST(ut_setup, ut_teardown, test_authonly_mb_all),
 
 		TEST_CASES_END() /**< NULL terminate unit test array */
 	}
diff --git a/app/test/test_cryptodev_aes_test_vectors.h b/app/test/test_cryptodev_aes_test_vectors.h
index efbe7da..898aae1 100644
--- a/app/test/test_cryptodev_aes_test_vectors.h
+++ b/app/test/test_cryptodev_aes_test_vectors.h
@@ -1025,84 +1025,96 @@ static const struct blockcipher_test_case aes_cipheronly_test_cases[] = {
 		.test_data = &aes_test_data_4,
 		.op_mask = BLOCKCIPHER_TEST_OP_ENCRYPT,
 		.pmd_mask = BLOCKCIPHER_TEST_TARGET_PMD_OPENSSL |
-			BLOCKCIPHER_TEST_TARGET_PMD_QAT
+			BLOCKCIPHER_TEST_TARGET_PMD_QAT |
+			BLOCKCIPHER_TEST_TARGET_PMD_MB
 	},
 	{
 		.test_descr = "AES-128-CBC Decryption",
 		.test_data = &aes_test_data_4,
 		.op_mask = BLOCKCIPHER_TEST_OP_DECRYPT,
 		.pmd_mask = BLOCKCIPHER_TEST_TARGET_PMD_OPENSSL |
-			BLOCKCIPHER_TEST_TARGET_PMD_QAT
+			BLOCKCIPHER_TEST_TARGET_PMD_QAT |
+			BLOCKCIPHER_TEST_TARGET_PMD_MB
 	},
 	{
 		.test_descr = "AES-192-CBC Encryption",
 		.test_data = &aes_test_data_10,
 		.op_mask = BLOCKCIPHER_TEST_OP_ENCRYPT,
 		.pmd_mask = BLOCKCIPHER_TEST_TARGET_PMD_OPENSSL |
-			BLOCKCIPHER_TEST_TARGET_PMD_QAT
+			BLOCKCIPHER_TEST_TARGET_PMD_QAT |
+			BLOCKCIPHER_TEST_TARGET_PMD_MB
 	},
 	{
 		.test_descr = "AES-192-CBC Decryption",
 		.test_data = &aes_test_data_10,
 		.op_mask = BLOCKCIPHER_TEST_OP_DECRYPT,
 		.pmd_mask = BLOCKCIPHER_TEST_TARGET_PMD_OPENSSL |
-			BLOCKCIPHER_TEST_TARGET_PMD_QAT
+			BLOCKCIPHER_TEST_TARGET_PMD_QAT |
+			BLOCKCIPHER_TEST_TARGET_PMD_MB
 	},
 	{
 		.test_descr = "AES-256-CBC Encryption",
 		.test_data = &aes_test_data_11,
 		.op_mask = BLOCKCIPHER_TEST_OP_ENCRYPT,
 		.pmd_mask = BLOCKCIPHER_TEST_TARGET_PMD_OPENSSL |
-			BLOCKCIPHER_TEST_TARGET_PMD_QAT
+			BLOCKCIPHER_TEST_TARGET_PMD_QAT |
+			BLOCKCIPHER_TEST_TARGET_PMD_MB
 	},
 	{
 		.test_descr = "AES-256-CBC Decryption",
 		.test_data = &aes_test_data_11,
 		.op_mask = BLOCKCIPHER_TEST_OP_DECRYPT,
 		.pmd_mask = BLOCKCIPHER_TEST_TARGET_PMD_OPENSSL |
-			BLOCKCIPHER_TEST_TARGET_PMD_QAT
+			BLOCKCIPHER_TEST_TARGET_PMD_QAT |
+			BLOCKCIPHER_TEST_TARGET_PMD_MB
 	},
 	{
 		.test_descr = "AES-128-CTR Encryption",
 		.test_data = &aes_test_data_1,
 		.op_mask = BLOCKCIPHER_TEST_OP_ENCRYPT,
 		.pmd_mask = BLOCKCIPHER_TEST_TARGET_PMD_OPENSSL |
-			BLOCKCIPHER_TEST_TARGET_PMD_QAT
+			BLOCKCIPHER_TEST_TARGET_PMD_QAT |
+			BLOCKCIPHER_TEST_TARGET_PMD_MB
 	},
 	{
 		.test_descr = "AES-128-CTR Decryption",
 		.test_data = &aes_test_data_1,
 		.op_mask = BLOCKCIPHER_TEST_OP_DECRYPT,
 		.pmd_mask = BLOCKCIPHER_TEST_TARGET_PMD_OPENSSL |
-			BLOCKCIPHER_TEST_TARGET_PMD_QAT
+			BLOCKCIPHER_TEST_TARGET_PMD_QAT |
+			BLOCKCIPHER_TEST_TARGET_PMD_MB
 	},
 	{
 		.test_descr = "AES-192-CTR Encryption",
 		.test_data = &aes_test_data_2,
 		.op_mask = BLOCKCIPHER_TEST_OP_ENCRYPT,
 		.pmd_mask = BLOCKCIPHER_TEST_TARGET_PMD_OPENSSL |
-			BLOCKCIPHER_TEST_TARGET_PMD_QAT
+			BLOCKCIPHER_TEST_TARGET_PMD_QAT |
+			BLOCKCIPHER_TEST_TARGET_PMD_MB
 	},
 	{
 		.test_descr = "AES-192-CTR Decryption",
 		.test_data = &aes_test_data_2,
 		.op_mask = BLOCKCIPHER_TEST_OP_DECRYPT,
 		.pmd_mask = BLOCKCIPHER_TEST_TARGET_PMD_OPENSSL |
-			BLOCKCIPHER_TEST_TARGET_PMD_QAT
+			BLOCKCIPHER_TEST_TARGET_PMD_QAT |
+			BLOCKCIPHER_TEST_TARGET_PMD_MB
 	},
 	{
 		.test_descr = "AES-256-CTR Encryption",
 		.test_data = &aes_test_data_3,
 		.op_mask = BLOCKCIPHER_TEST_OP_ENCRYPT,
 		.pmd_mask = BLOCKCIPHER_TEST_TARGET_PMD_OPENSSL |
-			BLOCKCIPHER_TEST_TARGET_PMD_QAT
+			BLOCKCIPHER_TEST_TARGET_PMD_QAT |
+			BLOCKCIPHER_TEST_TARGET_PMD_MB
 	},
 	{
 		.test_descr = "AES-256-CTR Decryption",
 		.test_data = &aes_test_data_3,
 		.op_mask = BLOCKCIPHER_TEST_OP_DECRYPT,
 		.pmd_mask = BLOCKCIPHER_TEST_TARGET_PMD_OPENSSL |
-			BLOCKCIPHER_TEST_TARGET_PMD_QAT
+			BLOCKCIPHER_TEST_TARGET_PMD_QAT |
+			BLOCKCIPHER_TEST_TARGET_PMD_MB
 	},
 };
 
diff --git a/app/test/test_cryptodev_hash_test_vectors.h b/app/test/test_cryptodev_hash_test_vectors.h
index 9f095cf..a8f9da0 100644
--- a/app/test/test_cryptodev_hash_test_vectors.h
+++ b/app/test/test_cryptodev_hash_test_vectors.h
@@ -97,7 +97,8 @@ hmac_md5_test_vector = {
 			0x50, 0xE8, 0xDE, 0xC5, 0xC1, 0x76, 0xAC, 0xAE,
 			0x15, 0x4A, 0xF1, 0x7F, 0x7E, 0x04, 0x42, 0x9B
 		},
-		.len = 16
+		.len = 16,
+		.truncated_len = 12
 	}
 };
 
@@ -139,7 +140,8 @@ hmac_sha1_test_vector = {
 			0x7E, 0x2E, 0x8F, 0xFC, 0x48, 0x39, 0x46, 0x17,
 			0x3F, 0x91, 0x64, 0x59
 		},
-		.len = 20
+		.len = 20,
+		.truncated_len = 12
 	}
 };
 
@@ -184,7 +186,8 @@ hmac_sha224_test_vector = {
 			0xF1, 0x8A, 0x63, 0xBB, 0x5D, 0x1D, 0xE3, 0x9F,
 			0x92, 0xF6, 0xAA, 0x19
 		},
-		.len = 28
+		.len = 28,
+		.truncated_len = 14
 	}
 };
 
@@ -229,7 +232,8 @@ hmac_sha256_test_vector = {
 			0x06, 0x4D, 0x64, 0x09, 0x0A, 0xCC, 0x02, 0x77,
 			0x71, 0x83, 0x48, 0x71, 0x07, 0x02, 0x25, 0x17
 		},
-		.len = 32
+		.len = 32,
+		.truncated_len = 16
 	}
 };
 
@@ -280,7 +284,8 @@ hmac_sha384_test_vector = {
 			0x10, 0x90, 0x0A, 0xE3, 0xF0, 0x59, 0xDD, 0xC0,
 			0x6F, 0xE6, 0x8C, 0x84, 0xD5, 0x03, 0xF8, 0x9E
 		},
-		.len = 48
+		.len = 48,
+		.truncated_len = 24
 	}
 };
 
@@ -337,7 +342,8 @@ hmac_sha512_test_vector = {
 			0x97, 0x37, 0x0F, 0xBE, 0xC2, 0x45, 0xA0, 0x87,
 			0xAF, 0x24, 0x27, 0x0C, 0x78, 0xBA, 0xBE, 0x20
 		},
-		.len = 64
+		.len = 64,
+		.truncated_len = 32
 	}
 };
 
@@ -358,13 +364,15 @@ static const struct blockcipher_test_case hash_test_cases[] = {
 		.test_descr = "HMAC-MD5 Digest",
 		.test_data = &hmac_md5_test_vector,
 		.op_mask = BLOCKCIPHER_TEST_OP_AUTH_GEN,
-		.pmd_mask = BLOCKCIPHER_TEST_TARGET_PMD_OPENSSL
+		.pmd_mask = BLOCKCIPHER_TEST_TARGET_PMD_OPENSSL |
+			BLOCKCIPHER_TEST_TARGET_PMD_MB
 	},
 	{
 		.test_descr = "HMAC-MD5 Digest Verify",
 		.test_data = &hmac_md5_test_vector,
 		.op_mask = BLOCKCIPHER_TEST_OP_AUTH_VERIFY,
-		.pmd_mask = BLOCKCIPHER_TEST_TARGET_PMD_OPENSSL
+		.pmd_mask = BLOCKCIPHER_TEST_TARGET_PMD_OPENSSL |
+			BLOCKCIPHER_TEST_TARGET_PMD_MB
 	},
 	{
 		.test_descr = "SHA1 Digest",
@@ -382,13 +390,15 @@ static const struct blockcipher_test_case hash_test_cases[] = {
 		.test_descr = "HMAC-SHA1 Digest",
 		.test_data = &hmac_sha1_test_vector,
 		.op_mask = BLOCKCIPHER_TEST_OP_AUTH_GEN,
-		.pmd_mask = BLOCKCIPHER_TEST_TARGET_PMD_OPENSSL
+		.pmd_mask = BLOCKCIPHER_TEST_TARGET_PMD_OPENSSL |
+			BLOCKCIPHER_TEST_TARGET_PMD_MB
 	},
 	{
 		.test_descr = "HMAC-SHA1 Digest Verify",
 		.test_data = &hmac_sha1_test_vector,
 		.op_mask = BLOCKCIPHER_TEST_OP_AUTH_VERIFY,
-		.pmd_mask = BLOCKCIPHER_TEST_TARGET_PMD_OPENSSL
+		.pmd_mask = BLOCKCIPHER_TEST_TARGET_PMD_OPENSSL |
+			BLOCKCIPHER_TEST_TARGET_PMD_MB
 	},
 	{
 		.test_descr = "SHA224 Digest",
@@ -406,13 +416,15 @@ static const struct blockcipher_test_case hash_test_cases[] = {
 		.test_descr = "HMAC-SHA224 Digest",
 		.test_data = &hmac_sha224_test_vector,
 		.op_mask = BLOCKCIPHER_TEST_OP_AUTH_GEN,
-		.pmd_mask = BLOCKCIPHER_TEST_TARGET_PMD_OPENSSL
+		.pmd_mask = BLOCKCIPHER_TEST_TARGET_PMD_OPENSSL |
+			BLOCKCIPHER_TEST_TARGET_PMD_MB
 	},
 	{
 		.test_descr = "HMAC-SHA224 Digest Verify",
 		.test_data = &hmac_sha224_test_vector,
 		.op_mask = BLOCKCIPHER_TEST_OP_AUTH_VERIFY,
-		.pmd_mask = BLOCKCIPHER_TEST_TARGET_PMD_OPENSSL
+		.pmd_mask = BLOCKCIPHER_TEST_TARGET_PMD_OPENSSL |
+			BLOCKCIPHER_TEST_TARGET_PMD_MB
 	},
 	{
 		.test_descr = "SHA256 Digest",
@@ -430,13 +442,15 @@ static const struct blockcipher_test_case hash_test_cases[] = {
 		.test_descr = "HMAC-SHA256 Digest",
 		.test_data = &hmac_sha256_test_vector,
 		.op_mask = BLOCKCIPHER_TEST_OP_AUTH_GEN,
-		.pmd_mask = BLOCKCIPHER_TEST_TARGET_PMD_OPENSSL
+		.pmd_mask = BLOCKCIPHER_TEST_TARGET_PMD_OPENSSL |
+			BLOCKCIPHER_TEST_TARGET_PMD_MB
 	},
 	{
 		.test_descr = "HMAC-SHA256 Digest Verify",
 		.test_data = &hmac_sha256_test_vector,
 		.op_mask = BLOCKCIPHER_TEST_OP_AUTH_VERIFY,
-		.pmd_mask = BLOCKCIPHER_TEST_TARGET_PMD_OPENSSL
+		.pmd_mask = BLOCKCIPHER_TEST_TARGET_PMD_OPENSSL |
+			BLOCKCIPHER_TEST_TARGET_PMD_MB
 	},
 	{
 		.test_descr = "SHA384 Digest",
@@ -454,13 +468,15 @@ static const struct blockcipher_test_case hash_test_cases[] = {
 		.test_descr = "HMAC-SHA384 Digest",
 		.test_data = &hmac_sha384_test_vector,
 		.op_mask = BLOCKCIPHER_TEST_OP_AUTH_GEN,
-		.pmd_mask = BLOCKCIPHER_TEST_TARGET_PMD_OPENSSL
+		.pmd_mask = BLOCKCIPHER_TEST_TARGET_PMD_OPENSSL |
+			BLOCKCIPHER_TEST_TARGET_PMD_MB
 	},
 	{
 		.test_descr = "HMAC-SHA384 Digest Verify",
 		.test_data = &hmac_sha384_test_vector,
 		.op_mask = BLOCKCIPHER_TEST_OP_AUTH_VERIFY,
-		.pmd_mask = BLOCKCIPHER_TEST_TARGET_PMD_OPENSSL
+		.pmd_mask = BLOCKCIPHER_TEST_TARGET_PMD_OPENSSL |
+			BLOCKCIPHER_TEST_TARGET_PMD_MB
 	},
 	{
 		.test_descr = "SHA512 Digest",
@@ -478,13 +494,15 @@ static const struct blockcipher_test_case hash_test_cases[] = {
 		.test_descr = "HMAC-SHA512 Digest",
 		.test_data = &hmac_sha512_test_vector,
 		.op_mask = BLOCKCIPHER_TEST_OP_AUTH_GEN,
-		.pmd_mask = BLOCKCIPHER_TEST_TARGET_PMD_OPENSSL
+		.pmd_mask = BLOCKCIPHER_TEST_TARGET_PMD_OPENSSL |
+			BLOCKCIPHER_TEST_TARGET_PMD_MB
 	},
 	{
 		.test_descr = "HMAC-SHA512 Digest Verify",
 		.test_data = &hmac_sha512_test_vector,
 		.op_mask = BLOCKCIPHER_TEST_OP_AUTH_VERIFY,
-		.pmd_mask = BLOCKCIPHER_TEST_TARGET_PMD_OPENSSL
+		.pmd_mask = BLOCKCIPHER_TEST_TARGET_PMD_OPENSSL |
+			BLOCKCIPHER_TEST_TARGET_PMD_MB
 	},
 };
 
diff --git a/doc/guides/cryptodevs/aesni_mb.rst b/doc/guides/cryptodevs/aesni_mb.rst
index b47cb6a..cb429d7 100644
--- a/doc/guides/cryptodevs/aesni_mb.rst
+++ b/doc/guides/cryptodevs/aesni_mb.rst
@@ -62,8 +62,6 @@ Limitations
 -----------
 
 * Chained mbufs are not supported.
-* Hash only is not supported.
-* Cipher only is not supported.
 * Only in-place is currently supported (destination address is the same as source address).
 * Only supports session-oriented API implementation (session-less APIs are not supported).
 
diff --git a/doc/guides/rel_notes/release_17_02.rst b/doc/guides/rel_notes/release_17_02.rst
index 4f666df..5aa8a94 100644
--- a/doc/guides/rel_notes/release_17_02.rst
+++ b/doc/guides/rel_notes/release_17_02.rst
@@ -49,6 +49,7 @@ New Features
 
   * The Intel(R) Multi Buffer Crypto for IPsec library used in
     AESNI MB PMD has been moved to a new repository, in github.
+  * Support for single operations (cipher only and authentication only).
 
 
 Resolved Issues
diff --git a/drivers/crypto/aesni_mb/rte_aesni_mb_pmd.c b/drivers/crypto/aesni_mb/rte_aesni_mb_pmd.c
index f07cd07..7591cc5 100644
--- a/drivers/crypto/aesni_mb/rte_aesni_mb_pmd.c
+++ b/drivers/crypto/aesni_mb/rte_aesni_mb_pmd.c
@@ -110,21 +110,22 @@ calculate_auth_precomputes(hash_one_block_t one_block_hash,
 static int
 aesni_mb_get_chain_order(const struct rte_crypto_sym_xform *xform)
 {
-	/*
-	 * Multi-buffer only supports HASH_CIPHER or CIPHER_HASH chained
-	 * operations, all other options are invalid, so we must have exactly
-	 * 2 xform structs chained together
-	 */
-	if (xform->next == NULL || xform->next->next != NULL)
+	if (xform == NULL)
 		return -1;
 
-	if (xform->type == RTE_CRYPTO_SYM_XFORM_AUTH &&
-			xform->next->type == RTE_CRYPTO_SYM_XFORM_CIPHER)
-		return HASH_CIPHER;
-
-	if (xform->type == RTE_CRYPTO_SYM_XFORM_CIPHER &&
-				xform->next->type == RTE_CRYPTO_SYM_XFORM_AUTH)
+	if ((xform->type == RTE_CRYPTO_SYM_XFORM_CIPHER) &&
+			(xform->cipher.op == RTE_CRYPTO_CIPHER_OP_ENCRYPT) &&
+			((xform->next == NULL) ||
+			(xform->next->type == RTE_CRYPTO_SYM_XFORM_AUTH)))
 		return CIPHER_HASH;
+	if ((xform->type == RTE_CRYPTO_SYM_XFORM_CIPHER) &&
+			(xform->cipher.op == RTE_CRYPTO_CIPHER_OP_DECRYPT) &&
+			(xform->next == NULL))
+		return HASH_CIPHER;
+	if ((xform->type == RTE_CRYPTO_SYM_XFORM_AUTH) &&
+			(xform->next == NULL ||
+			xform->next->type == RTE_CRYPTO_SYM_XFORM_CIPHER))
+		return HASH_CIPHER;
 
 	return -1;
 }
@@ -137,6 +138,11 @@ aesni_mb_set_session_auth_parameters(const struct aesni_mb_ops *mb_ops,
 {
 	hash_one_block_t hash_oneblock_fn;
 
+	if (xform == NULL) {
+		sess->auth.algo = NULL_HASH;
+		return 0;
+	}
+
 	if (xform->type != RTE_CRYPTO_SYM_XFORM_AUTH) {
 		MB_LOG_ERR("Crypto xform struct not of type auth");
 		return -1;
@@ -199,6 +205,11 @@ aesni_mb_set_session_cipher_parameters(const struct aesni_mb_ops *mb_ops,
 {
 	aes_keyexp_t aes_keyexp_fn;
 
+	if (xform == NULL) {
+		sess->cipher.mode = NULL_CIPHER;
+		return 0;
+	}
+
 	if (xform->type != RTE_CRYPTO_SYM_XFORM_CIPHER) {
 		MB_LOG_ERR("Crypto xform struct not of type cipher");
 		return -1;
@@ -270,8 +281,13 @@ aesni_mb_set_session_parameters(const struct aesni_mb_ops *mb_ops,
 	switch (aesni_mb_get_chain_order(xform)) {
 	case HASH_CIPHER:
 		sess->chain_order = HASH_CIPHER;
-		auth_xform = xform;
-		cipher_xform = xform->next;
+		if (xform->type == RTE_CRYPTO_SYM_XFORM_AUTH) {
+			auth_xform = xform;
+			cipher_xform = xform->next;
+		} else {
+			cipher_xform = xform;
+			auth_xform = xform->next;
+		}
 		break;
 	case CIPHER_HASH:
 		sess->chain_order = CIPHER_HASH;
@@ -396,7 +412,7 @@ process_crypto_op(struct aesni_mb_qp *qp, struct rte_crypto_op *op,
 	}
 
 	/* Set digest output location */
-	if (job->cipher_direction == DECRYPT) {
+	if (job->cipher_direction == DECRYPT && job->hash_alg != NULL_HASH) {
 		job->auth_tag_output = (uint8_t *)rte_pktmbuf_append(m_dst,
 				get_digest_byte_length(job->hash_alg));
 
@@ -471,7 +487,8 @@ post_process_mb_job(struct aesni_mb_qp *qp, JOB_AES_HMAC *job)
 		return op;
 	} else if (job->chain_order == HASH_CIPHER) {
 		/* Verify digest if required */
-		if (memcmp(job->auth_tag_output, op->sym->auth.digest.data,
+		if (job->hash_alg != NULL_HASH && memcmp(job->auth_tag_output,
+				op->sym->auth.digest.data,
 				job->auth_tag_output_len_in_bytes) != 0)
 			op->status = RTE_CRYPTO_OP_STATUS_AUTH_FAILED;
 
-- 
2.7.4


* [PATCH v2 1/3] doc: update AESNI MB PMD guide
From: Pablo de Lara @ 2016-12-15 16:00 UTC (permalink / raw)
  To: declan.doherty; +Cc: dev, Pablo de Lara
In-Reply-To: <1481817632-183082-1-git-send-email-pablo.de.lara.guarch@intel.com>

The Intel(R) Multi Buffer Crypto library used in the AESNI MB PMD
has been moved to a new repository on GitHub.
This patch updates the link where it can be downloaded.

Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
---
 doc/guides/cryptodevs/aesni_mb.rst     | 10 +++-------
 doc/guides/rel_notes/release_17_02.rst |  7 +++++++
 2 files changed, 10 insertions(+), 7 deletions(-)

diff --git a/doc/guides/cryptodevs/aesni_mb.rst b/doc/guides/cryptodevs/aesni_mb.rst
index e812e95..b47cb6a 100644
--- a/doc/guides/cryptodevs/aesni_mb.rst
+++ b/doc/guides/cryptodevs/aesni_mb.rst
@@ -66,21 +66,17 @@ Limitations
 * Cipher only is not supported.
 * Only in-place is currently supported (destination address is the same as source address).
 * Only supports session-oriented API implementation (session-less APIs are not supported).
-*  Not performance tuned.
 
 Installation
 ------------
 
 To build DPDK with the AESNI_MB_PMD the user is required to download the mult-
-buffer library from `here <https://downloadcenter.intel.com/download/22972>`_
-and compile it on their user system before building DPDK. When building the
-multi-buffer library it is necessary to have YASM package installed and also
-requires the overriding of YASM path when building, as a path is hard coded in
-the Makefile of the release package.
+buffer library from `here <https://github.com/01org/intel-ipsec-mb>`_
+and compile it on their user system before building DPDK.
 
 .. code-block:: console
 
-	make YASM=/usr/bin/yasm
+	make
 
 Initialization
 --------------
diff --git a/doc/guides/rel_notes/release_17_02.rst b/doc/guides/rel_notes/release_17_02.rst
index 873333b..4f666df 100644
--- a/doc/guides/rel_notes/release_17_02.rst
+++ b/doc/guides/rel_notes/release_17_02.rst
@@ -44,6 +44,13 @@ New Features
 
   * Scatter-gather list (SGL) support.
 
+
+* **Updated the AESNI MB PMD.**
+
+  * The Intel(R) Multi Buffer Crypto for IPsec library used in
+    AESNI MB PMD has been moved to a new repository, in github.
+
+
 Resolved Issues
 ---------------
 
-- 
2.7.4


* Re: [PATCH v4] net/kni: add KNI PMD
From: Ferruh Yigit @ 2016-12-15 15:55 UTC (permalink / raw)
  To: Yong Wang, dev@dpdk.org
In-Reply-To: <BY2PR05MB235997D7301F907E66E29555AF9A0@BY2PR05MB2359.namprd05.prod.outlook.com>

On 12/14/2016 7:25 PM, Yong Wang wrote:
>> -----Original Message-----
>> From: Ferruh Yigit [mailto:ferruh.yigit@intel.com]
>> Sent: Wednesday, December 14, 2016 8:00 AM
>> To: Yong Wang <yongwang@vmware.com>; dev@dpdk.org
>> Subject: Re: [dpdk-dev] [PATCH v4] net/kni: add KNI PMD
>>
>> On 12/12/2016 9:59 PM, Yong Wang wrote:
>>>> -----Original Message-----
>>>> From: Ferruh Yigit [mailto:ferruh.yigit@intel.com]
>>>> Sent: Wednesday, November 30, 2016 10:12 AM
>>>> To: dev@dpdk.org
>>>> Cc: Ferruh Yigit <ferruh.yigit@intel.com>; Yong Wang
>>>> <yongwang@vmware.com>
>>>> Subject: [PATCH v4] net/kni: add KNI PMD
>>>>
>>>> Add KNI PMD which wraps librte_kni for ease of use.
>>>>
>>>> KNI PMD can be used as any regular PMD to send / receive packets to the
>>>> Linux networking stack.
>>>>
>>>> Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
>>>> ---
>>>>
>>>> v4:
>>>> * allow only single queue
>>>> * use driver.name as name
>>>>
>>>> v3:
>>>> * rebase on top of latest master
>>>>
>>>> v2:
>>>> * updated driver name eth_kni -> net_kni
>>>> ---
>>>>  config/common_base                      |   1 +
>>>>  config/common_linuxapp                  |   1 +
>>>>  drivers/net/Makefile                    |   1 +
>>>>  drivers/net/kni/Makefile                |  63 +++++
>>>>  drivers/net/kni/rte_eth_kni.c           | 462
>>>> ++++++++++++++++++++++++++++++++
>>>>  drivers/net/kni/rte_pmd_kni_version.map |   4 +
>>>>  mk/rte.app.mk                           |  10 +-
>>>>  7 files changed, 537 insertions(+), 5 deletions(-)
>>>>  create mode 100644 drivers/net/kni/Makefile
>>>>  create mode 100644 drivers/net/kni/rte_eth_kni.c
>>>>  create mode 100644 drivers/net/kni/rte_pmd_kni_version.map
>>>>
>>>> diff --git a/config/common_base b/config/common_base
>>>> index 4bff83a..3385879 100644
>>>> --- a/config/common_base
>>>> +++ b/config/common_base
>>>> @@ -543,6 +543,7 @@ CONFIG_RTE_PIPELINE_STATS_COLLECT=n
>>>>  # Compile librte_kni
>>>>  #
>>>>  CONFIG_RTE_LIBRTE_KNI=n
>>>> +CONFIG_RTE_LIBRTE_PMD_KNI=n
>>>>  CONFIG_RTE_KNI_KMOD=n
>>>>  CONFIG_RTE_KNI_PREEMPT_DEFAULT=y
>>>>  CONFIG_RTE_KNI_VHOST=n
>>>> diff --git a/config/common_linuxapp b/config/common_linuxapp
>>>> index 2483dfa..2ecd510 100644
>>>> --- a/config/common_linuxapp
>>>> +++ b/config/common_linuxapp
>>>> @@ -39,6 +39,7 @@ CONFIG_RTE_EAL_IGB_UIO=y
>>>>  CONFIG_RTE_EAL_VFIO=y
>>>>  CONFIG_RTE_KNI_KMOD=y
>>>>  CONFIG_RTE_LIBRTE_KNI=y
>>>> +CONFIG_RTE_LIBRTE_PMD_KNI=y
>>>>  CONFIG_RTE_LIBRTE_VHOST=y
>>>>  CONFIG_RTE_LIBRTE_PMD_VHOST=y
>>>>  CONFIG_RTE_LIBRTE_PMD_AF_PACKET=y
>>>> diff --git a/drivers/net/Makefile b/drivers/net/Makefile
>>>> index bc93230..c4771cd 100644
>>>> --- a/drivers/net/Makefile
>>>> +++ b/drivers/net/Makefile
>>>> @@ -41,6 +41,7 @@ DIRS-$(CONFIG_RTE_LIBRTE_ENIC_PMD) += enic
>>>>  DIRS-$(CONFIG_RTE_LIBRTE_FM10K_PMD) += fm10k
>>>>  DIRS-$(CONFIG_RTE_LIBRTE_I40E_PMD) += i40e
>>>>  DIRS-$(CONFIG_RTE_LIBRTE_IXGBE_PMD) += ixgbe
>>>> +DIRS-$(CONFIG_RTE_LIBRTE_PMD_KNI) += kni
>>>>  DIRS-$(CONFIG_RTE_LIBRTE_MLX4_PMD) += mlx4
>>>>  DIRS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5
>>>>  DIRS-$(CONFIG_RTE_LIBRTE_MPIPE_PMD) += mpipe
>>>> diff --git a/drivers/net/kni/Makefile b/drivers/net/kni/Makefile
>>>> new file mode 100644
>>>> index 0000000..0b7cf91
>>>> --- /dev/null
>>>> +++ b/drivers/net/kni/Makefile
>>>> @@ -0,0 +1,63 @@
>>>> +#   BSD LICENSE
>>>> +#
>>>> +#   Copyright(c) 2016 Intel Corporation. All rights reserved.
>>>> +#
>>>> +#   Redistribution and use in source and binary forms, with or without
>>>> +#   modification, are permitted provided that the following conditions
>>>> +#   are met:
>>>> +#
>>>> +#     * Redistributions of source code must retain the above copyright
>>>> +#       notice, this list of conditions and the following disclaimer.
>>>> +#     * Redistributions in binary form must reproduce the above copyright
>>>> +#       notice, this list of conditions and the following disclaimer in
>>>> +#       the documentation and/or other materials provided with the
>>>> +#       distribution.
>>>> +#     * Neither the name of Intel Corporation nor the names of its
>>>> +#       contributors may be used to endorse or promote products derived
>>>> +#       from this software without specific prior written permission.
>>>> +#
>>>> +#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND
>>>> CONTRIBUTORS
>>>> +#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING,
>> BUT
>>>> NOT
>>>> +#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND
>>>> FITNESS FOR
>>>> +#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
>>>> COPYRIGHT
>>>> +#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
>>>> INCIDENTAL,
>>>> +#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,
>> BUT
>>>> NOT
>>>> +#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
>> LOSS
>>>> OF USE,
>>>> +#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED
>>>> AND ON ANY
>>>> +#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR
>>>> TORT
>>>> +#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT
>> OF
>>>> THE USE
>>>> +#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH
>>>> DAMAGE.
>>>> +
>>>> +include $(RTE_SDK)/mk/rte.vars.mk
>>>> +
>>>> +#
>>>> +# library name
>>>> +#
>>>> +LIB = librte_pmd_kni.a
>>>> +
>>>> +CFLAGS += -O3
>>>> +CFLAGS += $(WERROR_FLAGS)
>>>> +LDLIBS += -lpthread
>>>> +
>>>> +EXPORT_MAP := rte_pmd_kni_version.map
>>>> +
>>>> +LIBABIVER := 1
>>>> +
>>>> +#
>>>> +# all source are stored in SRCS-y
>>>> +#
>>>> +SRCS-$(CONFIG_RTE_LIBRTE_PMD_KNI) += rte_eth_kni.c
>>>> +
>>>> +#
>>>> +# Export include files
>>>> +#
>>>> +SYMLINK-y-include +=
>>>> +
>>>> +# this lib depends upon:
>>>> +DEPDIRS-$(CONFIG_RTE_LIBRTE_PMD_KNI) += lib/librte_eal
>>>> +DEPDIRS-$(CONFIG_RTE_LIBRTE_PMD_KNI) += lib/librte_ether
>>>> +DEPDIRS-$(CONFIG_RTE_LIBRTE_PMD_KNI) += lib/librte_kni
>>>> +DEPDIRS-$(CONFIG_RTE_LIBRTE_PMD_KNI) += lib/librte_mbuf
>>>> +DEPDIRS-$(CONFIG_RTE_LIBRTE_PMD_KNI) += lib/librte_mempool
>>>> +
>>>> +include $(RTE_SDK)/mk/rte.lib.mk
>>>> diff --git a/drivers/net/kni/rte_eth_kni.c b/drivers/net/kni/rte_eth_kni.c
>>>> new file mode 100644
>>>> index 0000000..6c4df96
>>>> --- /dev/null
>>>> +++ b/drivers/net/kni/rte_eth_kni.c
>>>> @@ -0,0 +1,462 @@
>>>> +/*-
>>>> + *   BSD LICENSE
>>>> + *
>>>> + *   Copyright(c) 2016 Intel Corporation. All rights reserved.
>>>> + *   All rights reserved.
>>>> + *
>>>> + *   Redistribution and use in source and binary forms, with or without
>>>> + *   modification, are permitted provided that the following conditions
>>>> + *   are met:
>>>> + *
>>>> + *     * Redistributions of source code must retain the above copyright
>>>> + *       notice, this list of conditions and the following disclaimer.
>>>> + *     * Redistributions in binary form must reproduce the above copyright
>>>> + *       notice, this list of conditions and the following disclaimer in
>>>> + *       the documentation and/or other materials provided with the
>>>> + *       distribution.
>>>> + *     * Neither the name of Intel Corporation nor the names of its
>>>> + *       contributors may be used to endorse or promote products derived
>>>> + *       from this software without specific prior written permission.
>>>> + *
>>>> + *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
>>>> + *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
>>>> + *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
>>>> + *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
>>>> + *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
>>>> + *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
>>>> + *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
>>>> + *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
>>>> + *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
>>>> + *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
>>>> + *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
>>>> + */
>>>> +
>>>> +#include <fcntl.h>
>>>> +#include <pthread.h>
>>>> +#include <unistd.h>
>>>> +
>>>> +#include <rte_ethdev.h>
>>>> +#include <rte_kni.h>
>>>> +#include <rte_malloc.h>
>>>> +#include <rte_vdev.h>
>>>> +
>>>> +/* Only single queue supported */
>>>> +#define KNI_MAX_QUEUE_PER_PORT 1
>>>> +
>>>> +#define MAX_PACKET_SZ 2048
>>>> +#define MAX_KNI_PORTS 8
>>>> +
>>>> +struct pmd_queue_stats {
>>>> +	uint64_t pkts;
>>>> +	uint64_t bytes;
>>>> +	uint64_t err_pkts;
>>>> +};
>>>> +
>>>> +struct pmd_queue {
>>>> +	struct pmd_internals *internals;
>>>> +	struct rte_mempool *mb_pool;
>>>> +
>>>> +	struct pmd_queue_stats rx;
>>>> +	struct pmd_queue_stats tx;
>>>> +};
>>>> +
>>>> +struct pmd_internals {
>>>> +	struct rte_kni *kni;
>>>> +	int is_kni_started;
>>>> +
>>>> +	pthread_t thread;
>>>> +	int stop_thread;
>>>> +
>>>> +	struct pmd_queue rx_queues[KNI_MAX_QUEUE_PER_PORT];
>>>> +	struct pmd_queue tx_queues[KNI_MAX_QUEUE_PER_PORT];
>>>> +};
>>>> +
>>>> +static struct ether_addr eth_addr;
>>>> +static struct rte_eth_link pmd_link = {
>>>> +		.link_speed = ETH_SPEED_NUM_10G,
>>>> +		.link_duplex = ETH_LINK_FULL_DUPLEX,
>>>> +		.link_status = 0
>>>> +};
>>>> +static int is_kni_initialized;
>>>> +
>>>> +static uint16_t
>>>> +eth_kni_rx(void *q, struct rte_mbuf **bufs, uint16_t nb_bufs)
>>>> +{
>>>> +	struct pmd_queue *kni_q = q;
>>>> +	struct rte_kni *kni = kni_q->internals->kni;
>>>> +	uint16_t nb_pkts;
>>>> +
>>>> +	nb_pkts = rte_kni_rx_burst(kni, bufs, nb_bufs);
>>>> +
>>>> +	kni_q->rx.pkts += nb_pkts;
>>>> +	kni_q->rx.err_pkts += nb_bufs - nb_pkts;
>>>> +
>>>> +	return nb_pkts;
>>>> +}
>>>> +
>>>> +static uint16_t
>>>> +eth_kni_tx(void *q, struct rte_mbuf **bufs, uint16_t nb_bufs)
>>>> +{
>>>> +	struct pmd_queue *kni_q = q;
>>>> +	struct rte_kni *kni = kni_q->internals->kni;
>>>> +	uint16_t nb_pkts;
>>>> +
>>>> +	nb_pkts =  rte_kni_tx_burst(kni, bufs, nb_bufs);
>>>> +
>>>> +	kni_q->tx.pkts += nb_pkts;
>>>> +	kni_q->tx.err_pkts += nb_bufs - nb_pkts;
>>>> +
>>>> +	return nb_pkts;
>>>> +}
>>>> +
>>>> +static void *
>>>> +kni_handle_request(void *param)
>>>> +{
>>>> +	struct pmd_internals *internals = param;
>>>> +#define MS 1000
>>>> +
>>>> +	while (!internals->stop_thread) {
>>>> +		rte_kni_handle_request(internals->kni);
>>>> +		usleep(500 * MS);
>>>> +	}
>>>> +
>>>> +	return param;
>>>> +}
>>>> +
>>>
>>> Do we really need a thread to handle requests by default? I know there are
>>> apps that handle requests their own way, and having a separate thread could
>>> add synchronization problems.  Can we at least add an option to disable this?
>>
>> I hadn't considered that there could be a use case that requires its own
>> request handling.
>>
>> But KNI requests must be handled for the KNI interface to run properly,
>> and handling them requires the interface's "kni" handle (internals->kni),
>> which this PMD doesn't expose.
>>
>> So, just disabling this thread won't work on its own.
> 
> I understand that and what I am asking is a way to at least disable this without having to make code changes for applications that have their own way of handling KNI request and the callback mentioned below sounds good to me.  I am fine with adding this capability with this commit or in a separate commit after you have this commit checked in.

I don't mind adding it in a new version; I am only trying to understand
the use case.

Normally, all this thread does is call the KNI library's
rte_kni_handle_request() API periodically on the KNI handle. What might an
app be doing in its own way, other than tweaking the period?
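
For illustration only, one thing an application might do differently is
drive request handling from its own main loop instead of a dedicated
thread. The sketch below is standalone C, not DPDK code: request_poller,
request_poller_tick() and count_requests() are hypothetical names, and the
handler callback merely stands in for a call such as
rte_kni_handle_request(). Time is passed in by the caller, which is an
assumption made to keep the sketch deterministic.

```c
#include <stdint.h>

/* Hypothetical stand-in for a call such as rte_kni_handle_request(). */
typedef int (*request_handler_t)(void *ctx);

struct request_poller {
	request_handler_t handler;
	void *ctx;
	uint64_t period_us;   /* app-chosen period between handler calls */
	uint64_t next_due_us; /* earliest time of the next handler call */
};

static void
request_poller_init(struct request_poller *p, request_handler_t handler,
		void *ctx, uint64_t period_us, uint64_t now_us)
{
	p->handler = handler;
	p->ctx = ctx;
	p->period_us = period_us;
	p->next_due_us = now_us; /* the first tick fires immediately */
}

/* Call from the app's own loop; returns 1 if the handler ran, else 0. */
static int
request_poller_tick(struct request_poller *p, uint64_t now_us)
{
	if (now_us < p->next_due_us)
		return 0;

	p->handler(p->ctx);
	p->next_due_us = now_us + p->period_us;
	return 1;
}

/* Example handler: only counts how often it was invoked. */
static int
count_requests(void *ctx)
{
	int *calls = ctx;

	(*calls)++;
	return 0;
}
```

With this shape the application decides both the period and the thread
that runs the handler, so no extra synchronization with a PMD-owned thread
would be needed.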

>  
>> A solution can be found, like callback registration or a get_handler API,
>> but if an application has custom request handling, perhaps it would prefer
>> to use the kni library directly instead of this wrapper, since the wrapper
>> already doesn't expose all kni features.
> 
> I think one of the motivations for having a KNI PMD is that it is abstracted the same way as other physical or virtual devices.  I think it makes sense to achieve feature parity with the KNI library as much as possible.  What is currently supported in the KNI library but missing in the KNI PMD, and is there any specific reason it is not supported?

Mainly what is missing is rte_kni_conf: it and some API arguments use
default values instead of being configurable.
Also, ethtool (the KNI control path) is not supported with the PMD.

Default values are used (instead of configurable devargs) to keep the PMD
simple. As for ethtool support, it is a) hard to add and b) doesn't quite
fit the KNI PMD logic.
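
As a rough idea of what configurable devargs could look like, here is a
minimal standalone sketch. It is not part of the patch and does not use the
DPDK kvargs library; the keys "mbuf_size" and "force_bind" are hypothetical
devargs names chosen to mirror the rte_kni_conf fields the PMD currently
hard-codes, and kni_pmd_args / kni_pmd_parse_args() are made-up names.

```c
#include <stdlib.h>
#include <string.h>

#define KNI_DEFAULT_MBUF_SIZE 2048 /* same value as MAX_PACKET_SZ in the patch */

/* Hypothetical per-device settings that devargs could override. */
struct kni_pmd_args {
	unsigned int mbuf_size;
	int force_bind;
};

/* Return 1 if the first len bytes of key spell exactly name. */
static int
key_is(const char *key, size_t len, const char *name)
{
	return len == strlen(name) && strncmp(key, name, len) == 0;
}

/*
 * Parse "key=value[,key=value...]" with unsigned decimal values.
 * Returns 0 on success, -1 on an unknown key or malformed input.
 */
static int
kni_pmd_parse_args(const char *params, struct kni_pmd_args *args)
{
	/* Start from the defaults the PMD uses today. */
	args->mbuf_size = KNI_DEFAULT_MBUF_SIZE;
	args->force_bind = 0;

	if (params == NULL)
		return 0;

	while (*params != '\0') {
		size_t key_len = strcspn(params, "=,");
		const char *val;
		char *end;
		unsigned long num;

		if (params[key_len] != '=')
			return -1;

		val = params + key_len + 1;
		num = strtoul(val, &end, 10);
		if (end == val || (*end != ',' && *end != '\0'))
			return -1;

		if (key_is(params, key_len, "mbuf_size"))
			args->mbuf_size = (unsigned int)num;
		else if (key_is(params, key_len, "force_bind"))
			args->force_bind = (num != 0);
		else
			return -1;

		/* Step over the separator, if any. */
		params = (*end == ',') ? end + 1 : end;
	}

	return 0;
}
```

Passing NULL or an empty string keeps the current defaults, so existing
users would see no behavior change under this assumption.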

> 
>>>
>>>> +static int
>>>> +eth_kni_start(struct rte_eth_dev *dev)
>>>> +{
>>>> +	struct pmd_internals *internals = dev->data->dev_private;
>>>> +	uint16_t port_id = dev->data->port_id;
>>>> +	struct rte_mempool *mb_pool;
>>>> +	struct rte_kni_conf conf;
>>>> +	const char *name = dev->data->name + 4; /* remove net_ */
>>>> +
>>>> +	snprintf(conf.name, RTE_KNI_NAMESIZE, "%s", name);
>>>> +	conf.force_bind = 0;
>>>> +	conf.group_id = port_id;
>>>> +	conf.mbuf_size = MAX_PACKET_SZ;
>>>> +	mb_pool = internals->rx_queues[0].mb_pool;
>>>> +
>>>> +	internals->kni = rte_kni_alloc(mb_pool, &conf, NULL);
>>>> +	if (internals->kni == NULL) {
>>>> +		RTE_LOG(ERR, PMD,
>>>> +			"Fail to create kni for port: %d\n", port_id);
>>>> +		return -1;
>>>> +	}
>>>> +
>>>> +	return 0;
>>>> +}
>>>> +
>>>> +static int
>>>> +eth_kni_dev_start(struct rte_eth_dev *dev)
>>>> +{
>>>> +	struct pmd_internals *internals = dev->data->dev_private;
>>>> +	int ret;
>>>> +
>>>> +	if (internals->is_kni_started == 0) {
>>>> +		ret = eth_kni_start(dev);
>>>> +		if (ret)
>>>> +			return -1;
>>>> +		internals->is_kni_started = 1;
>>>> +	}
>>>> +
>>>
>>> In case is_kni_started is 1 already, shouldn't we return directly instead of
>>> proceeding?
>>
>> "is_kni_started" is just to protect "eth_kni_start()"; as you can see, it
>> doesn't have a counterpart in eth_kni_dev_stop(). This flag is there to
>> make sure "eth_kni_start()" is called only once during the PMD life cycle.
>>
>> The check you mentioned is already done: start() / stop() calls are already
>> balanced by the APIs that call these functions.
> 
> What about the KNI request handling thread then?  Is it safe to have multiple threads calling into rte_kni_handle_request()? My understanding is that this is not safe as kni_fifo is not multi-thread safe.  It's also a bit wasteful to create multiple threads here.

That thread is created within start() and canceled in stop().
It is not possible for start() to be called twice: the API that calls
start(), rte_eth_dev_start(), already prevents multiple calls. The same
applies to stop().
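
To illustrate the kind of guard meant here, below is a minimal standalone
sketch. It is not the ethdev code: sketch_dev, generic_dev_start() and the
other names are hypothetical, and the "started" flag only models the idea
that the generic layer tracks device state and skips the driver callback
when the device is already in the requested state.

```c
/* Hypothetical device model; "started" is owned by the generic layer. */
struct sketch_dev {
	int started;
	int pmd_start_calls; /* how often the driver start op actually ran */
	int pmd_stop_calls;  /* how often the driver stop op actually ran */
};

/* Driver-level start op: this is where a PMD would create its thread. */
static int
pmd_start(struct sketch_dev *dev)
{
	dev->pmd_start_calls++;
	return 0;
}

/* Driver-level stop op: this is where a PMD would stop its thread. */
static void
pmd_stop(struct sketch_dev *dev)
{
	dev->pmd_stop_calls++;
}

/* Generic start: calls the driver op only if not already started. */
static int
generic_dev_start(struct sketch_dev *dev)
{
	if (dev->started)
		return 0;

	if (pmd_start(dev) != 0)
		return -1;

	dev->started = 1;
	return 0;
}

/* Generic stop: calls the driver op only if currently started. */
static void
generic_dev_stop(struct sketch_dev *dev)
{
	if (!dev->started)
		return;

	dev->started = 0;
	pmd_stop(dev);
}
```

Under this model a second start request never reaches the driver, so at
most one request thread exists per device at any time.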

> 
>>>
>>>> +	ret = pthread_create(&internals->thread, NULL, kni_handle_request,
>>>> +			internals);
>>>> +	if (ret) {
>>>> +		RTE_LOG(ERR, PMD, "Fail to create kni request thread\n");
>>>> +		return -1;
>>>> +	}
>>>> +
>>>> +	dev->data->dev_link.link_status = 1;
>>>> +
>>>> +	return 0;
>>>> +}
>>>> +
>>>> +static void
>>>> +eth_kni_dev_stop(struct rte_eth_dev *dev)
>>>> +{
>>>> +	struct pmd_internals *internals = dev->data->dev_private;
>>>> +	int ret;
>>>> +
>>>> +	internals->stop_thread = 1;
>>>> +
>>>> +	ret = pthread_cancel(internals->thread);
>>>> +	if (ret)
>>>> +		RTE_LOG(ERR, PMD, "Can't cancel the thread\n");
>>>> +
>>>> +	ret = pthread_join(internals->thread, NULL);
>>>> +	if (ret)
>>>> +		RTE_LOG(ERR, PMD, "Can't join the thread\n");
>>>> +
>>>> +	internals->stop_thread = 0;
>>>> +
>>>> +	dev->data->dev_link.link_status = 0;
>>>> +}
>>>> +
>>>> +static int
>>>> +eth_kni_dev_configure(struct rte_eth_dev *dev __rte_unused)
>>>> +{
>>>> +	return 0;
>>>> +}
>>>> +
>>>> +static void
>>>> +eth_kni_dev_info(struct rte_eth_dev *dev, struct rte_eth_dev_info *dev_info)
>>>> +{
>>>> +	struct rte_eth_dev_data *data = dev->data;
>>>> +
>>>> +	dev_info->driver_name = data->drv_name;
>>>> +	dev_info->max_mac_addrs = 1;
>>>> +	dev_info->max_rx_pktlen = (uint32_t)-1;
>>>> +	dev_info->max_rx_queues = KNI_MAX_QUEUE_PER_PORT;
>>>> +	dev_info->max_tx_queues = KNI_MAX_QUEUE_PER_PORT;
>>>> +	dev_info->min_rx_bufsize = 0;
>>>> +	dev_info->pci_dev = NULL;
>>>> +}
>>>> +
>>>> +static int
>>>> +eth_kni_rx_queue_setup(struct rte_eth_dev *dev,
>>>> +		uint16_t rx_queue_id,
>>>> +		uint16_t nb_rx_desc __rte_unused,
>>>> +		unsigned int socket_id __rte_unused,
>>>> +		const struct rte_eth_rxconf *rx_conf __rte_unused,
>>>> +		struct rte_mempool *mb_pool)
>>>> +{
>>>> +	struct pmd_internals *internals = dev->data->dev_private;
>>>> +	struct pmd_queue *q;
>>>> +
>>>> +	q = &internals->rx_queues[rx_queue_id];
>>>> +	q->internals = internals;
>>>> +	q->mb_pool = mb_pool;
>>>> +
>>>> +	dev->data->rx_queues[rx_queue_id] = q;
>>>> +
>>>> +	return 0;
>>>> +}
>>>> +
>>>> +static int
>>>> +eth_kni_tx_queue_setup(struct rte_eth_dev *dev,
>>>> +		uint16_t tx_queue_id,
>>>> +		uint16_t nb_tx_desc __rte_unused,
>>>> +		unsigned int socket_id __rte_unused,
>>>> +		const struct rte_eth_txconf *tx_conf __rte_unused)
>>>> +{
>>>> +	struct pmd_internals *internals = dev->data->dev_private;
>>>> +	struct pmd_queue *q;
>>>> +
>>>> +	q = &internals->tx_queues[tx_queue_id];
>>>> +	q->internals = internals;
>>>> +
>>>> +	dev->data->tx_queues[tx_queue_id] = q;
>>>> +
>>>> +	return 0;
>>>> +}
>>>> +
>>>> +static void
>>>> +eth_kni_queue_release(void *q __rte_unused)
>>>> +{
>>>> +}
>>>> +
>>>> +static int
>>>> +eth_kni_link_update(struct rte_eth_dev *dev __rte_unused,
>>>> +		int wait_to_complete __rte_unused)
>>>> +{
>>>> +	return 0;
>>>> +}
>>>> +
>>>> +static void
>>>> +eth_kni_stats_get(struct rte_eth_dev *dev, struct rte_eth_stats *stats)
>>>> +{
>>>> +	unsigned long rx_packets_total = 0, rx_bytes_total = 0;
>>>> +	unsigned long tx_packets_total = 0, tx_bytes_total = 0;
>>>> +	struct rte_eth_dev_data *data = dev->data;
>>>> +	unsigned long tx_packets_err_total = 0;
>>>> +	unsigned int i, num_stats;
>>>> +	struct pmd_queue *q;
>>>> +
>>>> +	num_stats = RTE_MIN((unsigned int)RTE_ETHDEV_QUEUE_STAT_CNTRS,
>>>> +			data->nb_rx_queues);
>>>> +	for (i = 0; i < num_stats; i++) {
>>>> +		q = data->rx_queues[i];
>>>> +		stats->q_ipackets[i] = q->rx.pkts;
>>>> +		stats->q_ibytes[i] = q->rx.bytes;
>>>> +		rx_packets_total += stats->q_ipackets[i];
>>>> +		rx_bytes_total += stats->q_ibytes[i];
>>>> +	}
>>>> +
>>>> +	num_stats = RTE_MIN((unsigned int)RTE_ETHDEV_QUEUE_STAT_CNTRS,
>>>> +			data->nb_tx_queues);
>>>> +	for (i = 0; i < num_stats; i++) {
>>>> +		q = data->tx_queues[i];
>>>> +		stats->q_opackets[i] = q->tx.pkts;
>>>> +		stats->q_obytes[i] = q->tx.bytes;
>>>> +		stats->q_errors[i] = q->tx.err_pkts;
>>>> +		tx_packets_total += stats->q_opackets[i];
>>>> +		tx_bytes_total += stats->q_obytes[i];
>>>> +		tx_packets_err_total += stats->q_errors[i];
>>>> +	}
>>>> +
>>>> +	stats->ipackets = rx_packets_total;
>>>> +	stats->ibytes = rx_bytes_total;
>>>> +	stats->opackets = tx_packets_total;
>>>> +	stats->obytes = tx_bytes_total;
>>>> +	stats->oerrors = tx_packets_err_total;
>>>> +}
>>>> +
>>>> +static void
>>>> +eth_kni_stats_reset(struct rte_eth_dev *dev)
>>>> +{
>>>> +	struct rte_eth_dev_data *data = dev->data;
>>>> +	struct pmd_queue *q;
>>>> +	unsigned int i;
>>>> +
>>>> +	for (i = 0; i < data->nb_rx_queues; i++) {
>>>> +		q = data->rx_queues[i];
>>>> +		q->rx.pkts = 0;
>>>> +		q->rx.bytes = 0;
>>>> +	}
>>>> +	for (i = 0; i < data->nb_tx_queues; i++) {
>>>> +		q = data->tx_queues[i];
>>>> +		q->tx.pkts = 0;
>>>> +		q->tx.bytes = 0;
>>>> +		q->tx.err_pkts = 0;
>>>> +	}
>>>> +}
>>>> +
>>>> +static const struct eth_dev_ops eth_kni_ops = {
>>>> +	.dev_start = eth_kni_dev_start,
>>>> +	.dev_stop = eth_kni_dev_stop,
>>>> +	.dev_configure = eth_kni_dev_configure,
>>>> +	.dev_infos_get = eth_kni_dev_info,
>>>> +	.rx_queue_setup = eth_kni_rx_queue_setup,
>>>> +	.tx_queue_setup = eth_kni_tx_queue_setup,
>>>> +	.rx_queue_release = eth_kni_queue_release,
>>>> +	.tx_queue_release = eth_kni_queue_release,
>>>> +	.link_update = eth_kni_link_update,
>>>> +	.stats_get = eth_kni_stats_get,
>>>> +	.stats_reset = eth_kni_stats_reset,
>>>> +};
>>>> +
>>>> +static struct rte_vdev_driver eth_kni_drv;
>>>> +
>>>> +static struct rte_eth_dev *
>>>> +eth_kni_create(const char *name, unsigned int numa_node)
>>>> +{
>>>> +	struct pmd_internals *internals = NULL;
>>>> +	struct rte_eth_dev_data *data;
>>>> +	struct rte_eth_dev *eth_dev;
>>>> +
>>>> +	RTE_LOG(INFO, PMD, "Creating kni ethdev on numa socket %u\n",
>>>> +			numa_node);
>>>> +
>>>> +	data = rte_zmalloc_socket(name, sizeof(*data), 0, numa_node);
>>>> +	if (data == NULL)
>>>> +		goto error;
>>>> +
>>>> +	internals = rte_zmalloc_socket(name, sizeof(*internals), 0, numa_node);
>>>> +	if (internals == NULL)
>>>> +		goto error;
>>>> +
>>>> +	/* reserve an ethdev entry */
>>>> +	eth_dev = rte_eth_dev_allocate(name);
>>>> +	if (eth_dev == NULL)
>>>> +		goto error;
>>>> +
>>>> +	data->dev_private = internals;
>>>> +	data->port_id = eth_dev->data->port_id;
>>>> +	memmove(data->name, eth_dev->data->name, sizeof(data->name));
>>>> +	data->nb_rx_queues = 1;
>>>> +	data->nb_tx_queues = 1;
>>>> +	data->dev_link = pmd_link;
>>>> +	data->mac_addrs = &eth_addr;
>>>> +
>>>> +	eth_dev->data = data;
>>>> +	eth_dev->dev_ops = &eth_kni_ops;
>>>> +	eth_dev->driver = NULL;
>>>> +
>>>> +	data->dev_flags = RTE_ETH_DEV_DETACHABLE;
>>>> +	data->kdrv = RTE_KDRV_NONE;
>>>> +	data->drv_name = eth_kni_drv.driver.name;
>>>> +	data->numa_node = numa_node;
>>>> +
>>>> +	return eth_dev;
>>>> +
>>>> +error:
>>>> +	rte_free(data);
>>>> +	rte_free(internals);
>>>> +
>>>> +	return NULL;
>>>> +}
>>>> +
>>>> +static int
>>>> +kni_init(void)
>>>> +{
>>>> +	if (is_kni_initialized == 0)
>>>> +		rte_kni_init(MAX_KNI_PORTS);
>>>> +
>>>> +	is_kni_initialized += 1;
>>>> +
>>>> +	return 0;
>>>> +}
>>>> +
>>>> +static int
>>>> +eth_kni_probe(const char *name, const char *params __rte_unused)
>>>> +{
>>>> +	struct rte_eth_dev *eth_dev;
>>>> +	int ret;
>>>> +
>>>> +	RTE_LOG(INFO, PMD, "Initializing eth_kni for %s\n", name);
>>>> +
>>>> +	ret = kni_init();
>>>> +	if (ret < 0)
>>>> +		/* Not return error to prevent panic in rte_eal_init() */
>>>> +		return 0;
>>>
>>> If we don't return an error here, the application that needs to add KNI ports
>>> will eventually fail.  If it's a fail-stop situation, isn't it better to return
>>> the error where it happened?
>>
>> I am not sure this is a fail-stop situation; rather, this gives the
>> application a chance to exit gracefully.
>>
>> If an error value is returned here, it will lead to an rte_panic() and
>> the application will terminate abnormally!
>>
>> But if we return success at this point, since no ethernet device is
>> created, there is no handle for the application to use, which also means
>> no KNI interface is created.
>> The application can check the number of ports and recognize that the KNI
>> port is missing. It may choose to terminate or not, and if it prefers to
>> terminate, it can do so properly.
> 
> I might be wrong but as far as I know,  other virtual or physical PMDS do not have this behavior.  What you proposed makes sense but it also means that the application needs extra logic (checking if all ports are successfully initialized) to handle such failures (depending on the application, it might be able to proceed or it might need to fail-stop).  Personally I would prefer consistency across all PMDs here no matter what behavior we choose here as that's the "contract" the application needs to know.

Right, other PMDs don't have this behavior, I will update this to be
consistent with others.

>  
>>>
>>>> +	eth_dev = eth_kni_create(name, rte_socket_id());
>>>> +	if (eth_dev == NULL)
>>>> +		return -1;
>>>> +
>>>> +	eth_dev->rx_pkt_burst = eth_kni_rx;
>>>> +	eth_dev->tx_pkt_burst = eth_kni_tx;
>>>> +
>>>> +	return 0;
>>>> +}
>>>> +
>>>> +static int
>>>> +eth_kni_remove(const char *name)
>>>> +{
>>>> +	struct rte_eth_dev *eth_dev;
>>>> +	struct pmd_internals *internals;
>>>> +
>>>> +	RTE_LOG(INFO, PMD, "Un-Initializing eth_kni for %s\n", name);
>>>> +
>>>> +	/* find the ethdev entry */
>>>> +	eth_dev = rte_eth_dev_allocated(name);
>>>> +	if (eth_dev == NULL)
>>>> +		return -1;
>>>> +
>>>> +	eth_kni_dev_stop(eth_dev);
>>>> +
>>>> +	if (eth_dev->data) {
>>>> +		internals = eth_dev->data->dev_private;
>>>> +		rte_kni_release(internals->kni);
>>>> +
>>>> +		rte_free(internals);
>>>> +	}
>>>> +	rte_free(eth_dev->data);
>>>> +
>>>> +	rte_eth_dev_release_port(eth_dev);
>>>> +
>>>> +	is_kni_initialized -= 1;
>>>> +	if (is_kni_initialized == 0)
>>>> +		rte_kni_close();
>>>> +
>>>> +	return 0;
>>>> +}
>>>> +
>>>> +static struct rte_vdev_driver eth_kni_drv = {
>>>> +	.probe = eth_kni_probe,
>>>> +	.remove = eth_kni_remove,
>>>> +};
>>>> +
>>>> +RTE_PMD_REGISTER_VDEV(net_kni, eth_kni_drv);
>>>> diff --git a/drivers/net/kni/rte_pmd_kni_version.map b/drivers/net/kni/rte_pmd_kni_version.map
>>>> new file mode 100644
>>>> index 0000000..31eca32
>>>> --- /dev/null
>>>> +++ b/drivers/net/kni/rte_pmd_kni_version.map
>>>> @@ -0,0 +1,4 @@
>>>> +DPDK_17.02 {
>>>> +
>>>> +	local: *;
>>>> +};
>>>> diff --git a/mk/rte.app.mk b/mk/rte.app.mk
>>>> index f75f0e2..af02816 100644
>>>> --- a/mk/rte.app.mk
>>>> +++ b/mk/rte.app.mk
>>>> @@ -59,11 +59,6 @@ _LDLIBS-y += -L$(RTE_SDK_BIN)/lib
>>>>  #
>>>>  # Order is important: from higher level to lower level
>>>>  #
>>>> -
>>>> -ifeq ($(CONFIG_RTE_EXEC_ENV_LINUXAPP),y)
>>>> -_LDLIBS-$(CONFIG_RTE_LIBRTE_KNI)            += -lrte_kni
>>>> -endif
>>>> -
>>>>  _LDLIBS-$(CONFIG_RTE_LIBRTE_PIPELINE)       += -lrte_pipeline
>>>>  _LDLIBS-$(CONFIG_RTE_LIBRTE_TABLE)          += -lrte_table
>>>>  _LDLIBS-$(CONFIG_RTE_LIBRTE_PORT)           += -lrte_port
>>>> @@ -84,6 +79,10 @@ _LDLIBS-$(CONFIG_RTE_LIBRTE_POWER)          += -lrte_power
>>>>
>>>>  _LDLIBS-y += --whole-archive
>>>>
>>>> +ifeq ($(CONFIG_RTE_EXEC_ENV_LINUXAPP),y)
>>>> +_LDLIBS-$(CONFIG_RTE_LIBRTE_KNI)            += -lrte_kni
>>>> +endif
>>>> +
>>>>  _LDLIBS-$(CONFIG_RTE_LIBRTE_TIMER)          += -lrte_timer
>>>>  _LDLIBS-$(CONFIG_RTE_LIBRTE_HASH)           += -lrte_hash
>>>>  _LDLIBS-$(CONFIG_RTE_LIBRTE_VHOST)          += -lrte_vhost
>>>> @@ -115,6 +114,7 @@ _LDLIBS-$(CONFIG_RTE_LIBRTE_ENIC_PMD)       += -lrte_pmd_enic
>>>>  _LDLIBS-$(CONFIG_RTE_LIBRTE_FM10K_PMD)      += -lrte_pmd_fm10k
>>>>  _LDLIBS-$(CONFIG_RTE_LIBRTE_I40E_PMD)       += -lrte_pmd_i40e
>>>>  _LDLIBS-$(CONFIG_RTE_LIBRTE_IXGBE_PMD)      += -lrte_pmd_ixgbe
>>>> +_LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_KNI)        += -lrte_pmd_kni
>>>>  _LDLIBS-$(CONFIG_RTE_LIBRTE_MLX4_PMD)       += -lrte_pmd_mlx4 -libverbs
>>>>  _LDLIBS-$(CONFIG_RTE_LIBRTE_MLX5_PMD)       += -lrte_pmd_mlx5 -libverbs
>>>>  _LDLIBS-$(CONFIG_RTE_LIBRTE_MPIPE_PMD)      += -lrte_pmd_mpipe -lgxio
>>>> --
>>>> 2.9.3
>>>
> 

^ permalink raw reply

* Re: [PATCH] doc: fix required tools list layout
From: Mcnamara, John @ 2016-12-15 15:09 UTC (permalink / raw)
  To: Baruch Siach, dev@dpdk.org; +Cc: David Marchand
In-Reply-To: <819ec07b51126e7877503b95fa30d7fd8da3a42a.1481623418.git.baruch@tkos.co.il>



> -----Original Message-----
> From: Baruch Siach [mailto:baruch@tkos.co.il]
> Sent: Tuesday, December 13, 2016 10:04 AM
> To: dev@dpdk.org
> Cc: Mcnamara, John <john.mcnamara@intel.com>; David Marchand
> <david.marchand@6wind.com>; Baruch Siach <baruch@tkos.co.il>
> Subject: [PATCH] doc: fix required tools list layout
> 
> The Python requirement should appear in the bullet list.
> 
> Signed-off-by: Baruch Siach <baruch@tkos.co.il>
> ---
>  doc/guides/linux_gsg/sys_reqs.rst | 4 +---
>  1 file changed, 1 insertion(+), 3 deletions(-)
> 
> diff --git a/doc/guides/linux_gsg/sys_reqs.rst b/doc/guides/linux_gsg/sys_reqs.rst
> index 3d743421595a..621cc9ddaef6 100644
> --- a/doc/guides/linux_gsg/sys_reqs.rst
> +++ b/doc/guides/linux_gsg/sys_reqs.rst
> @@ -84,9 +84,7 @@ Compilation of the DPDK
>      x86_x32 ABI is currently supported with distribution packages only on Ubuntu
>      higher than 13.10 or recent Debian distribution. The only supported compiler is gcc 4.9+.
> 
> -.. note::
> -
> -    Python, version 2.6 or 2.7, to use various helper scripts included in the DPDK package.
> +*   Python, version 2.6 or 2.7, to use various helper scripts included in the DPDK package.
> 

Hi Baruch,

In addition to this change, the note on the previous item should be indented to the level of the bullet item. It is probably worth making that change at the same time.

Also, the Python version should probably say 2.7+ and 3.2+ if this patch is accepted: 

    http://dpdk.org/dev/patchwork/patch/17775/

However, since that change hasn't been acked/merged yet you can leave that part of your patch as it is and I'll fix the version numbers in the other patch.

John

^ permalink raw reply

* Re: [PATCH] doc: fix environment variable typo
From: Mcnamara, John @ 2016-12-15 14:59 UTC (permalink / raw)
  To: Baruch Siach, dev@dpdk.org; +Cc: Horton, Remy
In-Reply-To: <fce9d7d4be77343de88e6b43e2323f173fffe088.1481636426.git.baruch@tkos.co.il>



> -----Original Message-----
> From: Baruch Siach [mailto:baruch@tkos.co.il]
> Sent: Tuesday, December 13, 2016 1:40 PM
> To: dev@dpdk.org
> Cc: Mcnamara, John <john.mcnamara@intel.com>; Horton, Remy
> <remy.horton@intel.com>; Baruch Siach <baruch@tkos.co.il>
> Subject: [PATCH] doc: fix environment variable typo
> 
> Signed-off-by: Baruch Siach <baruch@tkos.co.il>

Acked-by: John McNamara <john.mcnamara@intel.com>

^ permalink raw reply

* Re: [PATCH] doc: correct source extract command
From: Mcnamara, John @ 2016-12-15 14:59 UTC (permalink / raw)
  To: Baruch Siach, dev@dpdk.org; +Cc: David Marchand
In-Reply-To: <5f75987b3a17859f4c0961e3ddee72db938c4f07.1481627935.git.baruch@tkos.co.il>



> -----Original Message-----
> From: Baruch Siach [mailto:baruch@tkos.co.il]
> Sent: Tuesday, December 13, 2016 11:19 AM
> To: dev@dpdk.org
> Cc: Mcnamara, John <john.mcnamara@intel.com>; David Marchand
> <david.marchand@6wind.com>; Baruch Siach <baruch@tkos.co.il>
> Subject: [PATCH] doc: correct source extract command
> 
> DPDK source archives are .tar.xz or .tar.gz, not .zip. Use .tar.xz in the
> instructions, since that is what the main download page links to.
> 
> Also, correct the archive file and directory name capitalization.
> 
> Signed-off-by: Baruch Siach <baruch@tkos.co.il>

Acked-by: John McNamara <john.mcnamara@intel.com>

^ permalink raw reply

* Re: [PATCH v3] drivers: advertise kmod dependencies in pmdinfo
From: Ferruh Yigit @ 2016-12-15 14:52 UTC (permalink / raw)
  To: Olivier Matz, dev
  Cc: nhorman, thomas.monjalon, vido, fiona.trahe, stephen,
	adrien.mazarguil, Thomas Monjalon
In-Reply-To: <1481809599-27896-1-git-send-email-olivier.matz@6wind.com>

Hi Olivier, Thomas,

On 12/15/2016 1:46 PM, Olivier Matz wrote:
> Add a new macro RTE_PMD_REGISTER_KMOD_DEP() that allows a driver to
> declare the list of kernel modules required to run properly.
> 
> Today, most PCI drivers require uio/vfio.
> 
> Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
> Acked-by: Fiona Trahe <fiona.trahe@intel.com>

This patch is for the master branch; what do you think about targeting
it to the next-net tree instead, so that new PMDs can also be included in
the patch?

Thanks,
ferruh

<...>

^ permalink raw reply

* Re: [PATCH 09/32] lib/ether: add rte_device in rte_eth_dev
From: Ferruh Yigit @ 2016-12-15 14:41 UTC (permalink / raw)
  To: Hemant Agrawal, dev@dpdk.org
  Cc: thomas.monjalon@6wind.com, Richardson, Bruce,
	shreyansh.jain@nxp.com
In-Reply-To: <c359a60c-f4a1-ea4d-ddc1-9c961b9cb9f9@nxp.com>

On 12/7/2016 6:41 AM, Hemant Agrawal wrote:
> On 12/7/2016 1:18 AM, Ferruh Yigit wrote:
>> On 12/4/2016 6:17 PM, Hemant Agrawal wrote:
>>> Signed-off-by: Hemant Agrawal <hemant.agrawal@nxp.com>
>>> ---
>>>  lib/librte_ether/rte_ethdev.h | 1 +
>>>  1 file changed, 1 insertion(+)
>>>
>>> diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
>>> index 3c45a1f..6f5673f 100644
>>> --- a/lib/librte_ether/rte_ethdev.h
>>> +++ b/lib/librte_ether/rte_ethdev.h
>>> @@ -1626,6 +1626,7 @@ struct rte_eth_dev {
>>>  	eth_rx_burst_t rx_pkt_burst; /**< Pointer to PMD receive function. */
>>>  	eth_tx_burst_t tx_pkt_burst; /**< Pointer to PMD transmit function. */
>>>  	struct rte_eth_dev_data *data;  /**< Pointer to device data */
>>> +	struct rte_device *device;
>>
>> I believe this change should not be part of a PMD patchset. This change
>> is more generic than the PMD.
>>
>> Won't Shreyansh's patch already do this?
> 
> I agree that this patch is not a fit for this PMD patchset, Shreyansh's 
> patch is not yet doing it. He will be taking care of it next.
> 
> So till Shreyansh provide the support, we need it.

If you need it, what do you think about sending this as a separate patch?
Once it is accepted, your driver can use it.

> 
>>
>>>  	const struct eth_driver *driver;/**< Driver for this device */
>>>  	const struct eth_dev_ops *dev_ops; /**< Functions exported by PMD */
>>>  	struct rte_pci_device *pci_dev; /**< PCI info. supplied by probing */
>>>
>>
>>
> 
> 

^ permalink raw reply

* [PATCH v3] drivers: advertise kmod dependencies in pmdinfo
From: Olivier Matz @ 2016-12-15 13:46 UTC (permalink / raw)
  To: dev; +Cc: nhorman, thomas.monjalon, vido, fiona.trahe, stephen,
	adrien.mazarguil
In-Reply-To: <1479808257-8725-1-git-send-email-olivier.matz@6wind.com>

Add a new macro RTE_PMD_REGISTER_KMOD_DEP() that allows a driver to
declare the list of kernel modules required to run properly.

Today, most PCI drivers require uio/vfio.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Fiona Trahe <fiona.trahe@intel.com>
---

v2 -> v3:
- fix kmods deps advertised by mellanox drivers as pointed out
  by Adrien

v1 -> v2:
- do not advertise uio_pci_generic for vf drivers
- rebase on top of head: use new driver names and prefix
  macro with RTE_

rfc -> v1:
- the kmod information can be per-device using a modalias-like
  pattern
- change syntax to use '&' and '|' instead of ',' and ':'
- remove useless prerequisites in kmod list: no need to
  specify both uio and uio_pci_generic, only the latter is
  required
- update kmod list in szedata2 driver
- remove kmod list in qat driver: it requires more than just loading
  a kmod, which is described in documentation
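
As an illustration of how a tool could interpret a dependency string such
as "* igb_uio | uio_pci_generic | vfio", here is a minimal standalone
sketch. It is not part of this patch and is not how dpdk-pmdinfo.py works:
kmod_is_loaded() and kmod_dep_satisfied() are hypothetical helpers, the
device pattern before the first space is simply skipped, parentheses are
not handled, and '&' is assumed to bind tighter than '|'.

```c
#include <stddef.h>
#include <string.h>

/* Return 1 if name appears in the NULL-terminated list of loaded modules. */
static int
kmod_is_loaded(const char *name, const char *const *loaded)
{
	for (; *loaded != NULL; loaded++)
		if (strcmp(name, *loaded) == 0)
			return 1;
	return 0;
}

/*
 * Evaluate "<pattern> <mod> [&|] <mod> ..." against the loaded modules.
 * Returns 1 if at least one '|' alternative has all of its '&'-joined
 * modules loaded, 0 otherwise (including on malformed input).
 */
static int
kmod_dep_satisfied(const char *dep, const char *const *loaded)
{
	const char *p = strchr(dep, ' '); /* skip the device pattern */
	int all_loaded = 1; /* every module of the current alternative is loaded */
	int seen = 0;       /* current alternative named at least one module */

	if (p == NULL)
		return 0;

	while (*p != '\0') {
		char name[64];
		size_t len;

		p += strspn(p, " \t");
		if (*p == '|') {
			/* End of one alternative: succeed early if it holds. */
			if (seen && all_loaded)
				return 1;
			all_loaded = 1;
			seen = 0;
			p++;
			continue;
		}
		if (*p == '&') {
			p++;
			continue;
		}

		len = strcspn(p, " \t|&");
		if (len == 0)
			break; /* only trailing whitespace was left */
		if (len >= sizeof(name))
			return 0;

		memcpy(name, p, len);
		name[len] = '\0';
		seen = 1;
		if (!kmod_is_loaded(name, loaded))
			all_loaded = 0;
		p += len;
	}

	return seen && all_loaded;
}
```

With only "vfio" loaded, the example string above evaluates to true, while
an '&' expression requires every listed module to be present.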

 buildtools/pmdinfogen/pmdinfogen.c      |  1 +
 buildtools/pmdinfogen/pmdinfogen.h      |  1 +
 drivers/net/bnx2x/bnx2x_ethdev.c        |  2 ++
 drivers/net/bnxt/bnxt_ethdev.c          |  1 +
 drivers/net/cxgbe/cxgbe_ethdev.c        |  1 +
 drivers/net/e1000/em_ethdev.c           |  1 +
 drivers/net/e1000/igb_ethdev.c          |  2 ++
 drivers/net/ena/ena_ethdev.c            |  1 +
 drivers/net/enic/enic_ethdev.c          |  1 +
 drivers/net/fm10k/fm10k_ethdev.c        |  1 +
 drivers/net/i40e/i40e_ethdev.c          |  1 +
 drivers/net/i40e/i40e_ethdev_vf.c       |  1 +
 drivers/net/ixgbe/ixgbe_ethdev.c        |  2 ++
 drivers/net/mlx4/mlx4.c                 |  2 ++
 drivers/net/mlx5/mlx5.c                 |  1 +
 drivers/net/nfp/nfp_net.c               |  1 +
 drivers/net/qede/qede_ethdev.c          |  2 ++
 drivers/net/szedata2/rte_eth_szedata2.c |  2 ++
 drivers/net/thunderx/nicvf_ethdev.c     |  1 +
 drivers/net/virtio/virtio_ethdev.c      |  1 +
 drivers/net/vmxnet3/vmxnet3_ethdev.c    |  1 +
 lib/librte_eal/common/include/rte_dev.h | 25 +++++++++++++++++++++++++
 tools/dpdk-pmdinfo.py                   |  5 ++++-
 23 files changed, 56 insertions(+), 1 deletion(-)

diff --git a/buildtools/pmdinfogen/pmdinfogen.c b/buildtools/pmdinfogen/pmdinfogen.c
index 59ab956..5129c57 100644
--- a/buildtools/pmdinfogen/pmdinfogen.c
+++ b/buildtools/pmdinfogen/pmdinfogen.c
@@ -269,6 +269,7 @@ struct opt_tag {
 
 static const struct opt_tag opt_tags[] = {
 	{"_param_string_export", "params"},
+	{"_kmod_dep_export", "kmod"},
 };
 
 static int complete_pmd_entry(struct elf_info *info, struct pmd_driver *drv)
diff --git a/buildtools/pmdinfogen/pmdinfogen.h b/buildtools/pmdinfogen/pmdinfogen.h
index e9eabff..27bab30 100644
--- a/buildtools/pmdinfogen/pmdinfogen.h
+++ b/buildtools/pmdinfogen/pmdinfogen.h
@@ -89,6 +89,7 @@ else \
 
 enum opt_params {
 	PMD_PARAM_STRING = 0,
+	PMD_KMOD_DEP,
 	PMD_OPT_MAX
 };
 
diff --git a/drivers/net/bnx2x/bnx2x_ethdev.c b/drivers/net/bnx2x/bnx2x_ethdev.c
index 0eae433..0f1e4a2 100644
--- a/drivers/net/bnx2x/bnx2x_ethdev.c
+++ b/drivers/net/bnx2x/bnx2x_ethdev.c
@@ -643,5 +643,7 @@ static struct eth_driver rte_bnx2xvf_pmd = {
 
 RTE_PMD_REGISTER_PCI(net_bnx2x, rte_bnx2x_pmd.pci_drv);
 RTE_PMD_REGISTER_PCI_TABLE(net_bnx2x, pci_id_bnx2x_map);
+RTE_PMD_REGISTER_KMOD_DEP(net_bnx2x, "* igb_uio | uio_pci_generic | vfio");
 RTE_PMD_REGISTER_PCI(net_bnx2xvf, rte_bnx2xvf_pmd.pci_drv);
 RTE_PMD_REGISTER_PCI_TABLE(net_bnx2xvf, pci_id_bnx2xvf_map);
+RTE_PMD_REGISTER_KMOD_DEP(net_bnx2xvf, "* igb_uio | vfio");
diff --git a/drivers/net/bnxt/bnxt_ethdev.c b/drivers/net/bnxt/bnxt_ethdev.c
index 035fe07..a24e153 100644
--- a/drivers/net/bnxt/bnxt_ethdev.c
+++ b/drivers/net/bnxt/bnxt_ethdev.c
@@ -1173,3 +1173,4 @@ static struct eth_driver bnxt_rte_pmd = {
 
 RTE_PMD_REGISTER_PCI(net_bnxt, bnxt_rte_pmd.pci_drv);
 RTE_PMD_REGISTER_PCI_TABLE(net_bnxt, bnxt_pci_id_map);
+RTE_PMD_REGISTER_KMOD_DEP(net_bnxt, "* igb_uio | uio_pci_generic | vfio");
diff --git a/drivers/net/cxgbe/cxgbe_ethdev.c b/drivers/net/cxgbe/cxgbe_ethdev.c
index b7f28eb..317598d 100644
--- a/drivers/net/cxgbe/cxgbe_ethdev.c
+++ b/drivers/net/cxgbe/cxgbe_ethdev.c
@@ -1050,3 +1050,4 @@ static struct eth_driver rte_cxgbe_pmd = {
 
 RTE_PMD_REGISTER_PCI(net_cxgbe, rte_cxgbe_pmd.pci_drv);
 RTE_PMD_REGISTER_PCI_TABLE(net_cxgbe, cxgb4_pci_tbl);
+RTE_PMD_REGISTER_KMOD_DEP(net_cxgbe, "* igb_uio | uio_pci_generic | vfio");
diff --git a/drivers/net/e1000/em_ethdev.c b/drivers/net/e1000/em_ethdev.c
index aee3d34..866a5cf 100644
--- a/drivers/net/e1000/em_ethdev.c
+++ b/drivers/net/e1000/em_ethdev.c
@@ -1807,3 +1807,4 @@ eth_em_set_mc_addr_list(struct rte_eth_dev *dev,
 
 RTE_PMD_REGISTER_PCI(net_e1000_em, rte_em_pmd.pci_drv);
 RTE_PMD_REGISTER_PCI_TABLE(net_e1000_em, pci_id_em_map);
+RTE_PMD_REGISTER_KMOD_DEP(net_e1000_em, "* igb_uio | uio_pci_generic | vfio");
diff --git a/drivers/net/e1000/igb_ethdev.c b/drivers/net/e1000/igb_ethdev.c
index 2fddf0c..08f2a68 100644
--- a/drivers/net/e1000/igb_ethdev.c
+++ b/drivers/net/e1000/igb_ethdev.c
@@ -5240,5 +5240,7 @@ eth_igb_configure_msix_intr(struct rte_eth_dev *dev)
 
 RTE_PMD_REGISTER_PCI(net_e1000_igb, rte_igb_pmd.pci_drv);
 RTE_PMD_REGISTER_PCI_TABLE(net_e1000_igb, pci_id_igb_map);
+RTE_PMD_REGISTER_KMOD_DEP(net_e1000_igb, "* igb_uio | uio_pci_generic | vfio");
 RTE_PMD_REGISTER_PCI(net_e1000_igb_vf, rte_igbvf_pmd.pci_drv);
 RTE_PMD_REGISTER_PCI_TABLE(net_e1000_igb_vf, pci_id_igbvf_map);
+RTE_PMD_REGISTER_KMOD_DEP(net_e1000_igb_vf, "* igb_uio | vfio");
diff --git a/drivers/net/ena/ena_ethdev.c b/drivers/net/ena/ena_ethdev.c
index ab9a178..555fb31 100644
--- a/drivers/net/ena/ena_ethdev.c
+++ b/drivers/net/ena/ena_ethdev.c
@@ -1716,3 +1716,4 @@ static struct eth_driver rte_ena_pmd = {
 
 RTE_PMD_REGISTER_PCI(net_ena, rte_ena_pmd.pci_drv);
 RTE_PMD_REGISTER_PCI_TABLE(net_ena, pci_id_ena_map);
+RTE_PMD_REGISTER_KMOD_DEP(net_ena, "* igb_uio | uio_pci_generic | vfio");
diff --git a/drivers/net/enic/enic_ethdev.c b/drivers/net/enic/enic_ethdev.c
index 2b154ec..f997302 100644
--- a/drivers/net/enic/enic_ethdev.c
+++ b/drivers/net/enic/enic_ethdev.c
@@ -645,3 +645,4 @@ static struct eth_driver rte_enic_pmd = {
 
 RTE_PMD_REGISTER_PCI(net_enic, rte_enic_pmd.pci_drv);
 RTE_PMD_REGISTER_PCI_TABLE(net_enic, pci_id_enic_map);
+RTE_PMD_REGISTER_KMOD_DEP(net_enic, "* igb_uio | uio_pci_generic | vfio");
diff --git a/drivers/net/fm10k/fm10k_ethdev.c b/drivers/net/fm10k/fm10k_ethdev.c
index 923690c..fe74f6d 100644
--- a/drivers/net/fm10k/fm10k_ethdev.c
+++ b/drivers/net/fm10k/fm10k_ethdev.c
@@ -3074,3 +3074,4 @@ static struct eth_driver rte_pmd_fm10k = {
 
 RTE_PMD_REGISTER_PCI(net_fm10k, rte_pmd_fm10k.pci_drv);
 RTE_PMD_REGISTER_PCI_TABLE(net_fm10k, pci_id_fm10k_map);
+RTE_PMD_REGISTER_KMOD_DEP(net_fm10k, "* igb_uio | uio_pci_generic | vfio");
diff --git a/drivers/net/i40e/i40e_ethdev.c b/drivers/net/i40e/i40e_ethdev.c
index 67778ba..b0c0fbf 100644
--- a/drivers/net/i40e/i40e_ethdev.c
+++ b/drivers/net/i40e/i40e_ethdev.c
@@ -711,6 +711,7 @@ rte_i40e_dev_atomic_write_link_status(struct rte_eth_dev *dev,
 
 RTE_PMD_REGISTER_PCI(net_i40e, rte_i40e_pmd.pci_drv);
 RTE_PMD_REGISTER_PCI_TABLE(net_i40e, pci_id_i40e_map);
+RTE_PMD_REGISTER_KMOD_DEP(net_i40e, "* igb_uio | uio_pci_generic | vfio");
 
 #ifndef I40E_GLQF_ORT
 #define I40E_GLQF_ORT(_i)    (0x00268900 + ((_i) * 4))
diff --git a/drivers/net/i40e/i40e_ethdev_vf.c b/drivers/net/i40e/i40e_ethdev_vf.c
index aa306d6..7869b9b 100644
--- a/drivers/net/i40e/i40e_ethdev_vf.c
+++ b/drivers/net/i40e/i40e_ethdev_vf.c
@@ -1539,6 +1539,7 @@ static struct eth_driver rte_i40evf_pmd = {
 
 RTE_PMD_REGISTER_PCI(net_i40e_vf, rte_i40evf_pmd.pci_drv);
 RTE_PMD_REGISTER_PCI_TABLE(net_i40e_vf, pci_id_i40evf_map);
+RTE_PMD_REGISTER_KMOD_DEP(net_i40e_vf, "* igb_uio | vfio");
 
 static int
 i40evf_dev_configure(struct rte_eth_dev *dev)
diff --git a/drivers/net/ixgbe/ixgbe_ethdev.c b/drivers/net/ixgbe/ixgbe_ethdev.c
index edc9b22..baffc71 100644
--- a/drivers/net/ixgbe/ixgbe_ethdev.c
+++ b/drivers/net/ixgbe/ixgbe_ethdev.c
@@ -7594,5 +7594,7 @@ ixgbevf_dev_interrupt_handler(__rte_unused struct rte_intr_handle *handle,
 
 RTE_PMD_REGISTER_PCI(net_ixgbe, rte_ixgbe_pmd.pci_drv);
 RTE_PMD_REGISTER_PCI_TABLE(net_ixgbe, pci_id_ixgbe_map);
+RTE_PMD_REGISTER_KMOD_DEP(net_ixgbe, "* igb_uio | uio_pci_generic | vfio");
 RTE_PMD_REGISTER_PCI(net_ixgbe_vf, rte_ixgbevf_pmd.pci_drv);
 RTE_PMD_REGISTER_PCI_TABLE(net_ixgbe_vf, pci_id_ixgbevf_map);
+RTE_PMD_REGISTER_KMOD_DEP(net_ixgbe_vf, "* igb_uio | vfio");
diff --git a/drivers/net/mlx4/mlx4.c b/drivers/net/mlx4/mlx4.c
index da61a85..90c3c07 100644
--- a/drivers/net/mlx4/mlx4.c
+++ b/drivers/net/mlx4/mlx4.c
@@ -5937,3 +5937,5 @@ rte_mlx4_pmd_init(void)
 
 RTE_PMD_EXPORT_NAME(net_mlx4, __COUNTER__);
 RTE_PMD_REGISTER_PCI_TABLE(net_mlx4, mlx4_pci_id_map);
+RTE_PMD_REGISTER_KMOD_DEP(net_mlx4,
+	"* ib_uverbs & mlx4_en & mlx4_core & mlx4_ib");
diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index 90cc35e..206c9f9 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -759,3 +759,4 @@ rte_mlx5_pmd_init(void)
 
 RTE_PMD_EXPORT_NAME(net_mlx5, __COUNTER__);
 RTE_PMD_REGISTER_PCI_TABLE(net_mlx5, mlx5_pci_id_map);
+RTE_PMD_REGISTER_KMOD_DEP(net_mlx5, "* ib_uverbs & mlx5_core & mlx5_ib");
diff --git a/drivers/net/nfp/nfp_net.c b/drivers/net/nfp/nfp_net.c
index de80b46..e315dd8 100644
--- a/drivers/net/nfp/nfp_net.c
+++ b/drivers/net/nfp/nfp_net.c
@@ -2481,6 +2481,7 @@ static struct eth_driver rte_nfp_net_pmd = {
 
 RTE_PMD_REGISTER_PCI(net_nfp, rte_nfp_net_pmd.pci_drv);
 RTE_PMD_REGISTER_PCI_TABLE(net_nfp, pci_id_nfp_net_map);
+RTE_PMD_REGISTER_KMOD_DEP(net_nfp, "* igb_uio | uio_pci_generic | vfio");
 
 /*
  * Local variables:
diff --git a/drivers/net/qede/qede_ethdev.c b/drivers/net/qede/qede_ethdev.c
index d106dd0..001166a 100644
--- a/drivers/net/qede/qede_ethdev.c
+++ b/drivers/net/qede/qede_ethdev.c
@@ -1668,5 +1668,7 @@ static struct eth_driver rte_qede_pmd = {
 
 RTE_PMD_REGISTER_PCI(net_qede, rte_qede_pmd.pci_drv);
 RTE_PMD_REGISTER_PCI_TABLE(net_qede, pci_id_qede_map);
+RTE_PMD_REGISTER_KMOD_DEP(net_qede, "* igb_uio | uio_pci_generic | vfio");
 RTE_PMD_REGISTER_PCI(net_qede_vf, rte_qedevf_pmd.pci_drv);
 RTE_PMD_REGISTER_PCI_TABLE(net_qede_vf, pci_id_qedevf_map);
+RTE_PMD_REGISTER_KMOD_DEP(net_qede_vf, "* igb_uio | vfio");
diff --git a/drivers/net/szedata2/rte_eth_szedata2.c b/drivers/net/szedata2/rte_eth_szedata2.c
index f3cd52d..677ba9f 100644
--- a/drivers/net/szedata2/rte_eth_szedata2.c
+++ b/drivers/net/szedata2/rte_eth_szedata2.c
@@ -1583,3 +1583,5 @@ static struct eth_driver szedata2_eth_driver = {
 
 RTE_PMD_REGISTER_PCI(RTE_SZEDATA2_DRIVER_NAME, szedata2_eth_driver.pci_drv);
 RTE_PMD_REGISTER_PCI_TABLE(RTE_SZEDATA2_DRIVER_NAME, rte_szedata2_pci_id_table);
+RTE_PMD_REGISTER_KMOD_DEP(RTE_SZEDATA2_DRIVER_NAME,
+	"* combo6core & combov3 & szedata2 & szedata2_cv3");
diff --git a/drivers/net/thunderx/nicvf_ethdev.c b/drivers/net/thunderx/nicvf_ethdev.c
index 466e49c..db03fa8 100644
--- a/drivers/net/thunderx/nicvf_ethdev.c
+++ b/drivers/net/thunderx/nicvf_ethdev.c
@@ -2121,3 +2121,4 @@ static struct eth_driver rte_nicvf_pmd = {
 
 RTE_PMD_REGISTER_PCI(net_thunderx, rte_nicvf_pmd.pci_drv);
 RTE_PMD_REGISTER_PCI_TABLE(net_thunderx, pci_id_nicvf_map);
+RTE_PMD_REGISTER_KMOD_DEP(net_thunderx, "* igb_uio | uio_pci_generic | vfio");
diff --git a/drivers/net/virtio/virtio_ethdev.c b/drivers/net/virtio/virtio_ethdev.c
index 079fd6c..1bd60e9 100644
--- a/drivers/net/virtio/virtio_ethdev.c
+++ b/drivers/net/virtio/virtio_ethdev.c
@@ -1669,3 +1669,4 @@ __rte_unused uint8_t is_rx)
 
 RTE_PMD_EXPORT_NAME(net_virtio, __COUNTER__);
 RTE_PMD_REGISTER_PCI_TABLE(net_virtio, pci_id_virtio_map);
+RTE_PMD_REGISTER_KMOD_DEP(net_virtio, "* igb_uio | uio_pci_generic | vfio");
diff --git a/drivers/net/vmxnet3/vmxnet3_ethdev.c b/drivers/net/vmxnet3/vmxnet3_ethdev.c
index 8bb13e5..93c9ac9 100644
--- a/drivers/net/vmxnet3/vmxnet3_ethdev.c
+++ b/drivers/net/vmxnet3/vmxnet3_ethdev.c
@@ -962,3 +962,4 @@ vmxnet3_process_events(struct vmxnet3_hw *hw)
 
 RTE_PMD_REGISTER_PCI(net_vmxnet3, rte_vmxnet3_pmd.pci_drv);
 RTE_PMD_REGISTER_PCI_TABLE(net_vmxnet3, pci_id_vmxnet3_map);
+RTE_PMD_REGISTER_KMOD_DEP(net_vmxnet3, "* igb_uio | uio_pci_generic | vfio");
diff --git a/lib/librte_eal/common/include/rte_dev.h b/lib/librte_eal/common/include/rte_dev.h
index 8840380..1708244 100644
--- a/lib/librte_eal/common/include/rte_dev.h
+++ b/lib/librte_eal/common/include/rte_dev.h
@@ -239,6 +239,31 @@ RTE_STR(table)
 static const char DRV_EXP_TAG(name, param_string_export)[] \
 __attribute__((used)) = str
 
+/**
+ * Advertise the list of kernel modules required to run this driver
+ *
+ * This string lists the kernel modules required for the devices
+ * associated with a PMD. The format of each line of the string is:
+ * "<device-pattern> <kmod-expression>".
+ *
+ * The possible formats for the device pattern are:
+ *   "*"                     all devices supported by this driver
+ *   "pci:*"                 all PCI devices supported by this driver
+ *   "pci:v8086:d*:sv*:sd*"  all PCI devices supported by this driver
+ *                           whose vendor id is 0x8086.
+ *
+ * The format of the kernel modules list is a parenthesized expression
+ * containing logical-and (&) and logical-or (|).
+ *
+ * The device pattern and the kmod expression are separated by a space.
+ *
+ * Example:
+ * - "* igb_uio | uio_pci_generic | vfio"
+ */
+#define RTE_PMD_REGISTER_KMOD_DEP(name, str) \
+static const char DRV_EXP_TAG(name, kmod_dep_export)[] \
+__attribute__((used)) = str
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/tools/dpdk-pmdinfo.py b/tools/dpdk-pmdinfo.py
index 3db9819..17bfed4 100755
--- a/tools/dpdk-pmdinfo.py
+++ b/tools/dpdk-pmdinfo.py
@@ -312,7 +312,10 @@ def parse_pmd_info_string(self, mystring):
         global raw_output
         global pcidb
 
-        optional_pmd_info = [{'id': 'params', 'tag': 'PMD PARAMETERS'}]
+        optional_pmd_info = [
+            {'id': 'params', 'tag': 'PMD PARAMETERS'},
+            {'id': 'kmod', 'tag': 'PMD KMOD DEPENDENCIES'}
+        ]
 
         i = mystring.index("=")
         mystring = mystring[i + 2:]
-- 
2.8.1

^ permalink raw reply related
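
The "<device-pattern> <kmod-expression>" format documented in rte_dev.h above can be illustrated with a small sketch. This is only an assumption about how a consumer might split one line of such a string; split_kmod_dep() is a hypothetical helper, not part of DPDK or of this patch, and it does not evaluate the & / | expression itself.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not part of DPDK): split one line of a kmod
 * dependency string into its device pattern and its kmod expression.
 * Per the rte_dev.h comment, the two parts are separated by a space;
 * this sketch splits at the first space in the line.
 * Returns 0 on success, -1 if the line is malformed or a buffer is
 * too small.
 */
static int
split_kmod_dep(const char *line, char *pattern, size_t pattern_sz,
	       char *expr, size_t expr_sz)
{
	const char *sp = strchr(line, ' ');
	size_t plen;

	/* Need a non-empty pattern, a space, and a non-empty expression. */
	if (sp == NULL || sp == line || sp[1] == '\0')
		return -1;

	plen = (size_t)(sp - line);
	if (plen >= pattern_sz || strlen(sp + 1) >= expr_sz)
		return -1;

	memcpy(pattern, line, plen);
	pattern[plen] = '\0';
	strcpy(expr, sp + 1);

	return 0;
}
```

Called on "* igb_uio | uio_pci_generic | vfio", the sketch yields the pattern "*" and the expression "igb_uio | uio_pci_generic | vfio".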

* Re: [PATCH v2 1/6] eventdev: introduce event driven programming model
From: Jerin Jacob @ 2016-12-15 13:39 UTC (permalink / raw)
  To: Bruce Richardson
  Cc: dev, thomas.monjalon, hemant.agrawal, gage.eads, harry.van.haaren
In-Reply-To: <20161214151922.GB110884@bricha3-MOBL3.ger.corp.intel.com>

On Wed, Dec 14, 2016 at 03:19:22PM +0000, Bruce Richardson wrote:
> On Tue, Dec 06, 2016 at 09:22:15AM +0530, Jerin Jacob wrote:
> > In a polling model, lcores poll ethdev ports and associated
> > rx queues directly to look for packets. In an event driven model,
> > by contrast, lcores call the scheduler that selects packets for
> > them based on programmer-specified criteria. Eventdev library
> > adds support for event driven programming model, which offers
> > applications automatic multicore scaling, dynamic load balancing,
> > pipelining, packet ingress order maintenance and
> > synchronization services to simplify application packet processing.
> > 
> > By introducing event driven programming model, DPDK can support
> > both polling and event driven programming models for packet processing,
> > and applications are free to choose whatever model
> > (or combination of the two) that best suits their needs.
> > 
> > This patch adds the eventdev specification header file.
> > 
> > Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
> > ---
> <snip>
> > + *
> > + * The *nb_events* parameter is the number of event objects to enqueue which are
> > + * supplied in the *ev* array of *rte_event* structure.
> > + *
> > + * The rte_event_enqueue_burst() function returns the number of
> > + * events objects it actually enqueued. A return value equal to *nb_events*
> > + * means that all event objects have been enqueued.
> > + *
> > + * @param dev_id
> > + *   The identifier of the device.
> > + * @param port_id
> > + *   The identifier of the event port.
> > + * @param ev
> > + *   Points to an array of *nb_events* objects of type *rte_event* structure
> > + *   which contain the event object enqueue operations to be processed.
> > + * @param nb_events
> > + *   The number of event objects to enqueue, typically number of
> > + *   rte_event_port_enqueue_depth() available for this port.
> > + *
> > + * @return
> > + *   The number of event objects actually enqueued on the event device. The
> > + *   return value can be less than the value of the *nb_events* parameter when
> > + *   the event devices queue is full or if invalid parameters are specified in a
> > + *   *rte_event*. If return value is less than *nb_events*, the remaining events
> > + *   at the end of ev[] are not consumed, and the caller has to take care of them
> > + *
> > + * @see rte_event_port_enqueue_depth()
> > + */
> > +uint16_t
> > +rte_event_enqueue_burst(uint8_t dev_id, uint8_t port_id, struct rte_event ev[],
> > +			uint16_t nb_events);
> > +
> One suggestion - do we want to make the ev[] array const, to disallow
> drivers from modifying the events passed in? Since the event structure
> is only 16B big, it should be small enough to be copied around in
> scheduler instances, allowing the original events to remain unmodified.

Seems like a good idea to me. I will add it in v3.
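
For illustration only (this is not the v3 patch), a const-qualified enqueue could look roughly like the sketch below. The struct demo_event type and the demo_enqueue_burst() function are hypothetical stand-ins; the real rte_event layout and the enqueue prototype are whatever the eventdev header ends up defining.

```c
#include <stdint.h>

/* Stand-in for struct rte_event (16 bytes); the real layout may differ. */
struct demo_event {
	uint64_t event;
	uint64_t u64;
};

/*
 * Hypothetical enqueue with a const-qualified array: the callee may
 * read and copy events but cannot modify the caller's array.
 * Accepts at most 'room' events and returns the number accepted.
 */
static uint16_t
demo_enqueue_burst(struct demo_event *queue, uint16_t room,
		   const struct demo_event ev[], uint16_t nb_events)
{
	uint16_t n = nb_events < room ? nb_events : room;
	uint16_t i;

	for (i = 0; i < n; i++)
		queue[i] = ev[i]; /* 16-byte copy; ev[i] stays untouched */

	return n;
}
```

With the const qualifier, an attempt to write to ev[i] inside the callee is rejected at compile time, which is the guarantee Bruce is asking for.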

> 
> /Bruce

^ permalink raw reply

* Re: KNI broken again with 4.9 kernel
From: Jay Rolette @ 2016-12-15 12:55 UTC (permalink / raw)
  To: Mcnamara, John; +Cc: Stephen Hemminger, dev@dpdk.org, Yigit, Ferruh
In-Reply-To: <B27915DBBA3421428155699D51E4CFE202685DB8@IRSMSX103.ger.corp.intel.com>

On Thu, Dec 15, 2016 at 6:01 AM, Mcnamara, John <john.mcnamara@intel.com>
wrote:

>
> > -----Original Message-----
> > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Stephen Hemminger
> > Sent: Wednesday, December 14, 2016 11:41 PM
> > To: dev@dpdk.org
> > Subject: [dpdk-dev] KNI broken again with 4.9 kernel
> >
> > /build/lib/librte_eal/linuxapp/kni/igb_main.c:2317:21: error:
> > initialization from incompatible pointer type [-Werror=incompatible-
> > pointer-types]
> >   .ndo_set_vf_vlan = igb_ndo_set_vf_vlan,
> >                      ^~~~~~~~~~~~~~~~~~~
> >
> > I am sure Ferruh Yigit will fix it.
> >
> > Which raises a couple of questions:
> >  1. Why is DPDK still keeping KNI support for Intel specific ethtool
> > functionality.
> >     This always breaks, is code bloat, and means a 3rd copy of base code
> > (Linux, DPDK PMD, + KNI)
> >
> >  2. Why is KNI not upstream?
> >     If not acceptable due to security or supportability then why does it
> > still exist?
> >
> >  3. If not upstream, then maintainer should track upstream kernel changes
> > and fix DPDK before
> >     kernel is released.  The ABI is normally set early in the rc cycle
> > weeks before release.
>
>
> Hi Stephen,
>
> On point 2: The feedback we have always received is that the KNI code
> isn't upstreamable. Do you think there is an upstream path?
>
> > If not acceptable due to security or supportablity then why does it
> > still exist?
>
> The most commonly expressed reason when we have asked this question in the
> past (and we did again at Userspace a few months ago) is that the people
> who use it want the performance.
>

We use KNI in our product. In our case, it's because it allows "normal"
non-DPDK apps in the control plane to interact with traffic on the fastpath
as needed. Having everything under the sun live in DPDK's essentially flat
memory space is not great for security or stability.

It helps time to market by being able to use existing programs that
interface to the network via sockets instead of having to limit ourselves
to the relatively tiny set of libraries out there that work directly with DPDK.

Double bonus on the time-to-market argument since we can implement
functionality in other higher-level languages as appropriate.

Performance-wise, KNI is "ok" but not great. It's not clear to me why it is
so much slower than using a NIC normally (not via DPDK) via the Linux
network stack. Copying data between sk_buff and mbuf is never going to be
cheap, but comparing that to what happens through all the kernel network
stack layers, the end result seems too slow.

That said, it's still faster than TAP/TUN interfaces and similar approaches.

Jay

> On point 3: We do have an internal continuous integration system that runs
> nightly compiles of DPDK against the latest kernel and flags any issues.
>
> John
>

^ permalink raw reply

* [PATCH v2 32/32] net/sfc: support firmware-assisted TSOv2
From: Andrew Rybchenko @ 2016-12-15 12:51 UTC (permalink / raw)
  To: dev; +Cc: ferruh.yigit, Ivan Malov
In-Reply-To: <1481806283-10387-1-git-send-email-arybchenko@solarflare.com>

From: Ivan Malov <ivan.malov@oktetlabs.ru>

Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Andrew Lee <alee@solarflare.com>
Reviewed-by: Mark Spender <mspender@solarflare.com>
Reviewed-by: Robert Stonehouse <rstonehouse@solarflare.com>
---
 config/common_base                   |   1 +
 doc/guides/nics/features/sfc_efx.ini |   1 +
 doc/guides/nics/sfc_efx.rst          |   8 ++
 drivers/net/sfc/Makefile             |   4 +
 drivers/net/sfc/sfc.c                |   8 ++
 drivers/net/sfc/sfc.h                |   2 +
 drivers/net/sfc/sfc_ethdev.c         |   3 +
 drivers/net/sfc/sfc_tso.c            | 200 +++++++++++++++++++++++++++++++++++
 drivers/net/sfc/sfc_tx.c             |  89 +++++++++++++++-
 drivers/net/sfc/sfc_tx.h             |  28 +++++
 10 files changed, 341 insertions(+), 3 deletions(-)
 create mode 100644 drivers/net/sfc/sfc_tso.c
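
Note on the header handling in sfc_tso_do() (new file sfc_tso.c below): the header is copied into the per-descriptor tsoh buffer when it is not contiguous, i.e. when the first mbuf segment is shorter than the header or when the header would cross a segment boundary. A minimal standalone sketch of that test follows. The names are hypothetical, and the 4096-byte boundary is an assumption taken from the "4K memory boundary" comment; the driver itself uses SFC_TX_SEG_BOUNDARY and P2ROUNDUP.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed 4K boundary, per the comment in sfc_tso_do(). */
#define DEMO_SEG_BOUNDARY 4096u

/* Round x up to a multiple of align (align must be a power of two). */
#define DEMO_P2ROUNDUP(x, align) \
	(((x) + ((uint64_t)(align) - 1)) & ~((uint64_t)(align) - 1))

/*
 * Return 1 if the header must be copied into a contiguous buffer:
 * either the first segment is shorter than the header, or the header
 * would cross the next boundary after its start address.
 */
static int
demo_header_needs_copy(uint64_t header_paddr, size_t first_seg_len,
		       size_t header_len)
{
	uint64_t next_boundary =
		DEMO_P2ROUNDUP(header_paddr + 1, DEMO_SEG_BOUNDARY);

	return first_seg_len < header_len ||
	       (next_boundary - header_paddr) < header_len;
}
```

For example, a 54-byte header starting 10 bytes before a 4K boundary needs the copy, while the same header starting exactly on a boundary in a sufficiently long segment does not.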

diff --git a/config/common_base b/config/common_base
index 59cb830..faee944 100644
--- a/config/common_base
+++ b/config/common_base
@@ -343,6 +343,7 @@ CONFIG_RTE_LIBRTE_PMD_XENVIRT=n
 #
 CONFIG_RTE_LIBRTE_SFC_EFX_PMD=y
 CONFIG_RTE_LIBRTE_SFC_EFX_DEBUG=n
+CONFIG_RTE_LIBRTE_SFC_EFX_TSO=n
 
 #
 # Compile null PMD
diff --git a/doc/guides/nics/features/sfc_efx.ini b/doc/guides/nics/features/sfc_efx.ini
index 07c58d5..3a15baa 100644
--- a/doc/guides/nics/features/sfc_efx.ini
+++ b/doc/guides/nics/features/sfc_efx.ini
@@ -11,6 +11,7 @@ Queue start/stop     = Y
 MTU update           = Y
 Jumbo frame          = Y
 Scattered Rx         = Y
+TSO                  = Y
 Promiscuous mode     = Y
 Allmulticast mode    = Y
 Multicast MAC filter = Y
diff --git a/doc/guides/nics/sfc_efx.rst b/doc/guides/nics/sfc_efx.rst
index bc45b17..6be4fba 100644
--- a/doc/guides/nics/sfc_efx.rst
+++ b/doc/guides/nics/sfc_efx.rst
@@ -63,6 +63,8 @@ SFC EFX PMD has support for:
 
 - Allmulticast mode
 
+- TCP segmentation offload (TSO)
+
 - Multicast MAC filter
 
 - IPv4/IPv6 TCP/UDP receive checksum offload
@@ -169,6 +171,12 @@ Please note that enabling debugging options may affect system performance.
 
   Enable compilation of the extra run-time consistency checks.
 
+- ``CONFIG_RTE_LIBRTE_SFC_EFX_TSO`` (default **n**)
+
+  Toggle TCP segmentation offload support.
+  Enabling the feature limits the number of available transmit queues
+  significantly due to the limited number of adapter TSO contexts.
+
 
 Per-Device Parameters
 ~~~~~~~~~~~~~~~~~~~~~
diff --git a/drivers/net/sfc/Makefile b/drivers/net/sfc/Makefile
index dd099b2..14d6536 100644
--- a/drivers/net/sfc/Makefile
+++ b/drivers/net/sfc/Makefile
@@ -90,6 +90,8 @@ SRCS-$(CONFIG_RTE_LIBRTE_SFC_EFX_PMD) += sfc_port.c
 SRCS-$(CONFIG_RTE_LIBRTE_SFC_EFX_PMD) += sfc_rx.c
 SRCS-$(CONFIG_RTE_LIBRTE_SFC_EFX_PMD) += sfc_tx.c
 
+SRCS-$(CONFIG_RTE_LIBRTE_SFC_EFX_TSO) += sfc_tso.c
+
 VPATH += $(SRCDIR)/base
 
 SRCS-$(CONFIG_RTE_LIBRTE_SFC_EFX_PMD) += efx_bootcfg.c
@@ -139,4 +141,6 @@ DEPDIRS-$(CONFIG_RTE_LIBRTE_SFC_EFX_PMD) += lib/librte_ether
 DEPDIRS-$(CONFIG_RTE_LIBRTE_SFC_EFX_PMD) += lib/librte_mempool
 DEPDIRS-$(CONFIG_RTE_LIBRTE_SFC_EFX_PMD) += lib/librte_mbuf
 
+DEPDIRS-$(CONFIG_RTE_LIBRTE_SFC_EFX_TSO) += lib/librte_net
+
 include $(RTE_SDK)/mk/rte.lib.mk
diff --git a/drivers/net/sfc/sfc.c b/drivers/net/sfc/sfc.c
index e79367d..02a56f7 100644
--- a/drivers/net/sfc/sfc.c
+++ b/drivers/net/sfc/sfc.c
@@ -621,6 +621,14 @@ sfc_attach(struct sfc_adapter *sa)
 	if (rc != 0)
 		goto fail_set_rss_defaults;
 
+#ifdef RTE_LIBRTE_SFC_EFX_TSO
+	sa->tso = efx_nic_cfg_get(sa->nic)->enc_fw_assisted_tso_v2_enabled;
+	if (!sa->tso)
+		sfc_warn(sa, "TSO support isn't available on this adapter");
+#else /* !RTE_LIBRTE_SFC_EFX_TSO */
+	sa->tso = B_FALSE;
+#endif /* RTE_LIBRTE_SFC_EFX_TSO */
+
 	sfc_log_init(sa, "fini nic");
 	efx_nic_fini(enp);
 
diff --git a/drivers/net/sfc/sfc.h b/drivers/net/sfc/sfc.h
index d02d1c0..6716acd 100644
--- a/drivers/net/sfc/sfc.h
+++ b/drivers/net/sfc/sfc.h
@@ -195,6 +195,8 @@ struct sfc_adapter {
 	unsigned int			txq_count;
 	struct sfc_txq_info		*txq_info;
 
+	boolean_t			tso;
+
 	unsigned int			rss_channels;
 
 #if EFSYS_OPT_RX_SCALE
diff --git a/drivers/net/sfc/sfc_ethdev.c b/drivers/net/sfc/sfc_ethdev.c
index f45072c..dd5ca5c 100644
--- a/drivers/net/sfc/sfc_ethdev.c
+++ b/drivers/net/sfc/sfc_ethdev.c
@@ -93,6 +93,9 @@ sfc_dev_infos_get(struct rte_eth_dev *dev, struct rte_eth_dev_info *dev_info)
 	}
 #endif
 
+	if (sa->tso)
+		dev_info->tx_offload_capa |= DEV_TX_OFFLOAD_TCP_TSO;
+
 	dev_info->rx_desc_lim.nb_max = EFX_RXQ_MAXNDESCS;
 	dev_info->rx_desc_lim.nb_min = EFX_RXQ_MINNDESCS;
 	/* The RXQ hardware requires that the descriptor count is a power
diff --git a/drivers/net/sfc/sfc_tso.c b/drivers/net/sfc/sfc_tso.c
new file mode 100644
index 0000000..68d84c9
--- /dev/null
+++ b/drivers/net/sfc/sfc_tso.c
@@ -0,0 +1,200 @@
+/*-
+ * Copyright (c) 2016 Solarflare Communications Inc.
+ * All rights reserved.
+ *
+ * This software was jointly developed between OKTET Labs (under contract
+ * for Solarflare) and Solarflare Communications, Inc.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are met:
+ *
+ * 1. Redistributions of source code must retain the above copyright notice,
+ *    this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright notice,
+ *    this list of conditions and the following disclaimer in the documentation
+ *    and/or other materials provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+ * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO,
+ * THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
+ * PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR
+ * CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
+ * EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
+ * PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS;
+ * OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY,
+ * WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR
+ * OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE,
+ * EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <rte_ip.h>
+#include <rte_tcp.h>
+
+#include "sfc.h"
+#include "sfc_debug.h"
+#include "sfc_tx.h"
+#include "sfc_ev.h"
+
+/** Standard TSO header length */
+#define SFC_TSOH_STD_LEN        256
+
+/** The number of TSO option descriptors that precede the packet descriptors */
+#define SFC_TSO_OPDESCS_IDX_SHIFT	2
+
+int
+sfc_tso_alloc_tsoh_objs(struct sfc_tx_sw_desc *sw_ring,
+			unsigned int txq_entries, unsigned int socket_id)
+{
+	unsigned int i;
+
+	for (i = 0; i < txq_entries; ++i) {
+		sw_ring[i].tsoh = rte_malloc_socket("sfc-txq-tsoh-obj",
+						    SFC_TSOH_STD_LEN,
+						    SFC_TX_SEG_BOUNDARY,
+						    socket_id);
+		if (sw_ring[i].tsoh == NULL)
+			goto fail_alloc_tsoh_objs;
+	}
+
+	return 0;
+
+fail_alloc_tsoh_objs:
+	while (i > 0)
+		rte_free(sw_ring[--i].tsoh);
+
+	return ENOMEM;
+}
+
+void
+sfc_tso_free_tsoh_objs(struct sfc_tx_sw_desc *sw_ring, unsigned int txq_entries)
+{
+	unsigned int i;
+
+	for (i = 0; i < txq_entries; ++i) {
+		rte_free(sw_ring[i].tsoh);
+		sw_ring[i].tsoh = NULL;
+	}
+}
+
+static void
+sfc_tso_prepare_header(struct sfc_txq *txq, struct rte_mbuf **in_seg,
+		       size_t *in_off, unsigned int idx, size_t bytes_left)
+{
+	struct rte_mbuf *m = *in_seg;
+	size_t bytes_to_copy = 0;
+	uint8_t *tsoh = txq->sw_ring[idx & txq->ptr_mask].tsoh;
+
+	do {
+		bytes_to_copy = MIN(bytes_left, m->data_len);
+
+		rte_memcpy(tsoh, rte_pktmbuf_mtod(m, uint8_t *),
+			   bytes_to_copy);
+
+		bytes_left -= bytes_to_copy;
+		tsoh += bytes_to_copy;
+
+		if (bytes_left > 0) {
+			m = m->next;
+			SFC_ASSERT(m != NULL);
+		}
+	} while (bytes_left > 0);
+
+	if (bytes_to_copy == m->data_len) {
+		*in_seg = m->next;
+		*in_off = 0;
+	} else {
+		*in_seg = m;
+		*in_off = bytes_to_copy;
+	}
+}
+
+int
+sfc_tso_do(struct sfc_txq *txq, unsigned int idx, struct rte_mbuf **in_seg,
+	   size_t *in_off, efx_desc_t **pend, unsigned int *pkt_descs,
+	   size_t *pkt_len)
+{
+	uint8_t *tsoh;
+	const struct tcp_hdr *th;
+	efsys_dma_addr_t header_paddr;
+	efsys_dma_addr_t paddr_next_frag;
+	uint16_t packet_id;
+	uint32_t sent_seq;
+	struct rte_mbuf *m = *in_seg;
+	size_t nh_off = m->l2_len; /* IP header offset */
+	size_t tcph_off = m->l2_len + m->l3_len; /* TCP header offset */
+	size_t header_len = m->l2_len + m->l3_len + m->l4_len;
+	const efx_nic_cfg_t *encp = efx_nic_cfg_get(txq->evq->sa->nic);
+
+	idx += SFC_TSO_OPDESCS_IDX_SHIFT;
+
+	/* Packets which have too big headers should be discarded */
+	if (unlikely(header_len > SFC_TSOH_STD_LEN))
+		return EMSGSIZE;
+
+	/*
+	 * The TCP header must start at most 208 bytes into the frame.
+	 * If it starts later than this then the NIC won't realise
+	 * it's a TCP packet and TSO edits won't be applied
+	 */
+	if (unlikely(tcph_off > encp->enc_tx_tso_tcp_header_offset_limit))
+		return EMSGSIZE;
+
+	header_paddr = rte_pktmbuf_mtophys(m);
+	paddr_next_frag = P2ROUNDUP(header_paddr + 1, SFC_TX_SEG_BOUNDARY);
+
+	/*
+	 * Sometimes headers may be split across multiple mbufs. In such cases
+	 * we need to glue those pieces and store them in some temporary place.
+	 * Also, packet headers must be contiguous in memory, so that
+	 * they can be referred to with a single DMA descriptor. Hence, handle
+	 * the case where the original header crosses a 4K memory boundary
+	 */
+	if ((m->data_len < header_len) ||
+	    ((paddr_next_frag - header_paddr) < header_len)) {
+		sfc_tso_prepare_header(txq, in_seg, in_off, idx, header_len);
+		tsoh = txq->sw_ring[idx & txq->ptr_mask].tsoh;
+
+		header_paddr = rte_malloc_virt2phy((void *)tsoh);
+	} else {
+		if (m->data_len == header_len) {
+			*in_off = 0;
+			*in_seg = m->next;
+		} else {
+			*in_off = header_len;
+		}
+
+		tsoh = rte_pktmbuf_mtod(m, uint8_t *);
+	}
+
+	/* Handle IP header */
+	if (m->ol_flags & PKT_TX_IPV4) {
+		const struct ipv4_hdr *iphe4;
+
+		iphe4 = (const struct ipv4_hdr *)(tsoh + nh_off);
+		rte_memcpy(&packet_id, &iphe4->packet_id, sizeof(uint16_t));
+		packet_id = rte_be_to_cpu_16(packet_id);
+	} else if (m->ol_flags & PKT_TX_IPV6) {
+		packet_id = 0;
+	} else {
+		return EINVAL;
+	}
+
+	/* Handle TCP header */
+	th = (const struct tcp_hdr *)(tsoh + tcph_off);
+
+	rte_memcpy(&sent_seq, &th->sent_seq, sizeof(uint32_t));
+	sent_seq = rte_be_to_cpu_32(sent_seq);
+
+	efx_tx_qdesc_tso2_create(txq->common, packet_id, sent_seq, m->tso_segsz,
+				 *pend, EFX_TX_FATSOV2_OPT_NDESCS);
+
+	*pend += EFX_TX_FATSOV2_OPT_NDESCS;
+	*pkt_descs += EFX_TX_FATSOV2_OPT_NDESCS;
+
+	efx_tx_qdesc_dma_create(txq->common, header_paddr, header_len,
+				B_FALSE, (*pend)++);
+	(*pkt_descs)++;
+	*pkt_len -= header_len;
+
+	return 0;
+}
diff --git a/drivers/net/sfc/sfc_tx.c b/drivers/net/sfc/sfc_tx.c
index 86bcfec..3e64c0f 100644
--- a/drivers/net/sfc/sfc_tx.c
+++ b/drivers/net/sfc/sfc_tx.c
@@ -184,6 +184,13 @@ sfc_tx_qinit(struct sfc_adapter *sa, unsigned int sw_index,
 	if (txq->sw_ring == NULL)
 		goto fail_desc_alloc;
 
+	if (sa->tso) {
+		rc = sfc_tso_alloc_tsoh_objs(txq->sw_ring, txq_info->entries,
+					     socket_id);
+		if (rc != 0)
+			goto fail_alloc_tsoh_objs;
+	}
+
 	txq->state = SFC_TXQ_INITIALIZED;
 	txq->ptr_mask = txq_info->entries - 1;
 	txq->free_thresh = (tx_conf->tx_free_thresh) ? tx_conf->tx_free_thresh :
@@ -199,6 +206,9 @@ sfc_tx_qinit(struct sfc_adapter *sa, unsigned int sw_index,
 
 	return 0;
 
+fail_alloc_tsoh_objs:
+	rte_free(txq->sw_ring);
+
 fail_desc_alloc:
 	rte_free(txq->pend_desc);
 
@@ -234,6 +244,8 @@ sfc_tx_qfini(struct sfc_adapter *sa, unsigned int sw_index)
 	SFC_ASSERT(txq != NULL);
 	SFC_ASSERT(txq->state == SFC_TXQ_INITIALIZED);
 
+	sfc_tso_free_tsoh_objs(txq->sw_ring, txq_info->entries);
+
 	txq_info->txq = NULL;
 	txq_info->entries = 0;
 
@@ -300,6 +312,11 @@ sfc_tx_init(struct sfc_adapter *sa)
 
 	sa->txq_count = sa->eth_dev->data->nb_tx_queues;
 
+	if (sa->tso)
+		sa->txq_count = MIN(sa->txq_count,
+		   efx_nic_cfg_get(sa->nic)->enc_fw_assisted_tso_v2_n_contexts /
+		   efx_nic_cfg_get(sa->nic)->enc_hw_pf_count);
+
 	sa->txq_info = rte_calloc_socket("sfc-txqs", sa->txq_count,
 					 sizeof(sa->txq_info[0]), 0,
 					 sa->socket_id);
@@ -373,17 +390,25 @@ sfc_tx_qstart(struct sfc_adapter *sa, unsigned int sw_index)
 	 * hence, we always enable it here
 	 */
 	if ((txq->flags & ETH_TXQ_FLAGS_NOXSUMTCP) ||
-	    (txq->flags & ETH_TXQ_FLAGS_NOXSUMUDP))
+	    (txq->flags & ETH_TXQ_FLAGS_NOXSUMUDP)) {
 		flags = EFX_TXQ_CKSUM_IPV4;
-	else
+	} else {
 		flags = EFX_TXQ_CKSUM_IPV4 | EFX_TXQ_CKSUM_TCPUDP;
 
+		if (sa->tso)
+			flags |= EFX_TXQ_FATSOV2;
+	}
+
 	rc = efx_tx_qcreate(sa->nic, sw_index, 0, &txq->mem,
 			    txq_info->entries, 0 /* not used on EF10 */,
 			    flags, evq->common,
 			    &txq->common, &desc_index);
-	if (rc != 0)
+	if (rc != 0) {
+		if (sa->tso && (rc == ENOSPC))
+			sfc_err(sa, "ran out of TSO contexts");
+
 		goto fail_tx_qcreate;
+	}
 
 	txq->added = txq->pending = txq->completed = desc_index;
 	txq->hw_vlan_tci = 0;
@@ -494,6 +519,13 @@ sfc_tx_start(struct sfc_adapter *sa)
 
 	sfc_log_init(sa, "txq_count = %u", sa->txq_count);
 
+	if (sa->tso) {
+		if (!efx_nic_cfg_get(sa->nic)->enc_fw_assisted_tso_v2_enabled) {
+			sfc_warn(sa, "TSO support was unable to be restored");
+			sa->tso = B_FALSE;
+		}
+	}
+
 	rc = efx_tx_init(sa->nic);
 	if (rc != 0)
 		goto fail_efx_tx_init;
@@ -607,6 +639,7 @@ sfc_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
 		struct rte_mbuf		*m_seg = *pktp;
 		size_t			pkt_len = m_seg->pkt_len;
 		unsigned int		pkt_descs = 0;
+		size_t			in_off = 0;
 
 		/*
 		 * Here VLAN TCI is expected to be zero in case if no
@@ -617,6 +650,46 @@ sfc_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
 		 */
 		pkt_descs += sfc_tx_maybe_insert_tag(txq, m_seg, &pend);
 
+#ifdef RTE_LIBRTE_SFC_EFX_TSO
+		if (m_seg->ol_flags & PKT_TX_TCP_SEG) {
+			/*
+			 * We expect the 'pkt->l[2, 3, 4]_len' values
+			 * to be set correctly by the caller
+			 */
+			if (sfc_tso_do(txq, added, &m_seg, &in_off, &pend,
+				       &pkt_descs, &pkt_len) != 0) {
+				/* We may have reached this place for
+				 * one of the following reasons:
+				 *
+				 * 1) Packet header length is greater
+				 *    than SFC_TSOH_STD_LEN
+				 * 2) TCP header starts at more than
+				 *    208 bytes into the frame
+				 *
+				 * We will deceive RTE saying that we have sent
+				 * the packet, but we will actually drop it.
+				 * Hence, we should revert 'pend' to the
+				 * previous state (in case we have added
+				 * VLAN descriptor) and start processing
+				 * the next packet. But the original
+				 * mbuf shouldn't be orphaned
+				 */
+				pend -= pkt_descs;
+
+				rte_pktmbuf_free(*pktp);
+
+				continue;
+			}
+
+			/*
+			 * We've only added 2 FATSOv2 option descriptors
+			 * and 1 descriptor for the linearized packet header.
+			 * The outstanding work will be done in the same manner
+			 * as for the usual non-TSO path
+			 */
+		}
+#endif /* RTE_LIBRTE_SFC_EFX_TSO */
+
 		for (; m_seg != NULL; m_seg = m_seg->next) {
 			efsys_dma_addr_t	next_frag;
 			size_t			seg_len;
@@ -624,6 +697,16 @@ sfc_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
 			seg_len = m_seg->data_len;
 			next_frag = rte_mbuf_data_dma_addr(m_seg);
 
+			/*
+			 * If a TSO transaction was started a few steps
+			 * earlier, skip the packet header using an offset
+			 * in the current segment (which has been set to
+			 * the first one containing payload)
+			 */
+			seg_len -= in_off;
+			next_frag += in_off;
+			in_off = 0;
+
 			do {
 				efsys_dma_addr_t	frag_addr = next_frag;
 				size_t			frag_len;
diff --git a/drivers/net/sfc/sfc_tx.h b/drivers/net/sfc/sfc_tx.h
index 4d25c6a..581e2aa 100644
--- a/drivers/net/sfc/sfc_tx.h
+++ b/drivers/net/sfc/sfc_tx.h
@@ -50,6 +50,9 @@ struct sfc_evq;
 
 struct sfc_tx_sw_desc {
 	struct rte_mbuf		*mbuf;
+#ifdef RTE_LIBRTE_SFC_EFX_TSO
+	uint8_t			*tsoh;	/* Buffer to store TSO header */
+#endif /* RTE_LIBRTE_SFC_EFX_TSO */
 };
 
 enum sfc_txq_state_bit {
@@ -113,6 +116,31 @@ void sfc_tx_stop(struct sfc_adapter *sa);
 uint16_t sfc_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts,
 		       uint16_t nb_pkts);
 
+#ifdef RTE_LIBRTE_SFC_EFX_TSO
+/* From 'sfc_tso.c' */
+int sfc_tso_alloc_tsoh_objs(struct sfc_tx_sw_desc *sw_ring,
+			    unsigned int txq_entries, unsigned int socket_id);
+void sfc_tso_free_tsoh_objs(struct sfc_tx_sw_desc *sw_ring,
+			    unsigned int txq_entries);
+int sfc_tso_do(struct sfc_txq *txq, unsigned int idx, struct rte_mbuf **in_seg,
+	       size_t *in_off, efx_desc_t **pend, unsigned int *pkt_descs,
+	       size_t *pkt_len);
+#else /* !RTE_LIBRTE_SFC_EFX_TSO */
+static inline int
+sfc_tso_alloc_tsoh_objs(__rte_unused struct sfc_tx_sw_desc *sw_ring,
+			__rte_unused unsigned int txq_entries,
+			__rte_unused unsigned int socket_id)
+{
+	return 0;
+}
+
+static inline void
+sfc_tso_free_tsoh_objs(__rte_unused struct sfc_tx_sw_desc *sw_ring,
+		       __rte_unused unsigned int txq_entries)
+{
+}
+#endif /* RTE_LIBRTE_SFC_EFX_TSO */
+
 #ifdef __cplusplus
 }
 #endif
-- 
2.5.5


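The header-skip step in the transmit loop of the TSO patch can be illustrated in isolation. This is a minimal sketch, not the driver code: `struct seg` and `payload_bytes()` are hypothetical stand-ins for the mbuf chain and the descriptor loop, and the only point shown is that the offset is consumed by the first payload segment and then reset.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an mbuf segment; not struct rte_mbuf. */
struct seg {
	size_t		data_len;	/* bytes held by this segment */
	struct seg	*next;		/* next segment or NULL */
};

/*
 * Count the bytes that would be described to the hardware when the
 * first 'in_off' bytes of the first segment were already sent as the
 * linearized header.
 */
static size_t
payload_bytes(const struct seg *s, size_t in_off)
{
	size_t total = 0;

	for (; s != NULL; s = s->next) {
		assert(in_off <= s->data_len);
		total += s->data_len - in_off;
		in_off = 0;	/* the offset applies to one segment only */
	}

	return total;
}
```

With a 60-byte first segment, a 100-byte second segment and a 54-byte header, the sketch accounts for 106 payload bytes; with no header offset it accounts for all 160.
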
* [PATCH v2 31/32] net/sfc: add callback to update RSS redirection table
From: Andrew Rybchenko @ 2016-12-15 12:51 UTC (permalink / raw)
  To: dev; +Cc: ferruh.yigit, Ivan Malov
In-Reply-To: <1481806283-10387-1-git-send-email-arybchenko@solarflare.com>

From: Ivan Malov <ivan.malov@oktetlabs.ru>

Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Andrew Lee <alee@solarflare.com>
Reviewed-by: Robert Stonehouse <rstonehouse@solarflare.com>
---
 doc/guides/nics/features/sfc_efx.ini |  1 +
 drivers/net/sfc/sfc_ethdev.c         | 60 ++++++++++++++++++++++++++++++++++++
 2 files changed, 61 insertions(+)

diff --git a/doc/guides/nics/features/sfc_efx.ini b/doc/guides/nics/features/sfc_efx.ini
index 4f6f117..07c58d5 100644
--- a/doc/guides/nics/features/sfc_efx.ini
+++ b/doc/guides/nics/features/sfc_efx.ini
@@ -16,6 +16,7 @@ Allmulticast mode    = Y
 Multicast MAC filter = Y
 RSS hash             = Y
 RSS key update       = Y
+RSS reta update      = Y
 Flow control         = Y
 VLAN offload         = P
 L3 checksum offload  = Y
diff --git a/drivers/net/sfc/sfc_ethdev.c b/drivers/net/sfc/sfc_ethdev.c
index 0cd96ac..f45072c 100644
--- a/drivers/net/sfc/sfc_ethdev.c
+++ b/drivers/net/sfc/sfc_ethdev.c
@@ -1117,6 +1117,65 @@ sfc_dev_rss_reta_query(struct rte_eth_dev *dev,
 
 	return 0;
 }
+
+static int
+sfc_dev_rss_reta_update(struct rte_eth_dev *dev,
+			struct rte_eth_rss_reta_entry64 *reta_conf,
+			uint16_t reta_size)
+{
+	struct sfc_adapter *sa = dev->data->dev_private;
+	unsigned int *rss_tbl_new;
+	uint16_t entry;
+	int rc;
+
+
+	if ((sa->rss_channels == 1) ||
+	    (sa->rss_support != EFX_RX_SCALE_EXCLUSIVE)) {
+		sfc_err(sa, "RSS is not available");
+		return -ENOTSUP;
+	}
+
+	if (reta_size != EFX_RSS_TBL_SIZE) {
+		sfc_err(sa, "RETA size is wrong (should be %u)",
+			EFX_RSS_TBL_SIZE);
+		return -EINVAL;
+	}
+
+	rss_tbl_new = rte_zmalloc("rss_tbl_new", sizeof(sa->rss_tbl), 0);
+	if (rss_tbl_new == NULL)
+		return -ENOMEM;
+
+	sfc_adapter_lock(sa);
+
+	rte_memcpy(rss_tbl_new, sa->rss_tbl, sizeof(sa->rss_tbl));
+
+	for (entry = 0; entry < reta_size; entry++) {
+		int grp_idx = entry % RTE_RETA_GROUP_SIZE;
+		struct rte_eth_rss_reta_entry64 *grp;
+
+		grp = &reta_conf[entry / RTE_RETA_GROUP_SIZE];
+
+		if (grp->mask & (1ull << grp_idx)) {
+			if (grp->reta[grp_idx] >= sa->rss_channels) {
+				rc = EINVAL;
+				goto bad_reta_entry;
+			}
+			rss_tbl_new[entry] = grp->reta[grp_idx];
+		}
+	}
+
+	rc = efx_rx_scale_tbl_set(sa->nic, rss_tbl_new, EFX_RSS_TBL_SIZE);
+	if (rc == 0)
+		rte_memcpy(sa->rss_tbl, rss_tbl_new, sizeof(sa->rss_tbl));
+
+bad_reta_entry:
+	sfc_adapter_unlock(sa);
+
+	rte_free(rss_tbl_new);
+
+	SFC_ASSERT(rc >= 0);
+	return -rc;
+}
 #endif
 
 static const struct eth_dev_ops sfc_eth_dev_ops = {
@@ -1151,6 +1210,7 @@ static const struct eth_dev_ops sfc_eth_dev_ops = {
 	.flow_ctrl_set			= sfc_flow_ctrl_set,
 	.mac_addr_set			= sfc_mac_addr_set,
 #if EFSYS_OPT_RX_SCALE
+	.reta_update			= sfc_dev_rss_reta_update,
 	.reta_query			= sfc_dev_rss_reta_query,
 	.rss_hash_update		= sfc_dev_rss_hash_update,
 	.rss_hash_conf_get		= sfc_dev_rss_hash_conf_get,
-- 
2.5.5


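The group and mask indexing used by the RETA update callback may be easier to follow outside the driver. The sketch below assumes a 128-entry table and uses a hypothetical `struct reta_group` in place of `struct rte_eth_rss_reta_entry64`; `reta_apply()` is an illustrative name, not a DPDK or libefx function. As in the patch, changes are staged in a copy so that a bad entry leaves the live table untouched.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>

#define GROUP_SIZE	64	/* mirrors RTE_RETA_GROUP_SIZE */
#define TBL_SIZE	128	/* assumed table size for this sketch */

/* Hypothetical stand-in for struct rte_eth_rss_reta_entry64. */
struct reta_group {
	uint64_t	mask;			/* bit i set: update reta[i] */
	uint16_t	reta[GROUP_SIZE];
};

static int
reta_apply(unsigned int *tbl, const struct reta_group *conf,
	   unsigned int nb_channels)
{
	unsigned int staged[TBL_SIZE];
	unsigned int entry;

	assert(nb_channels > 0);
	memcpy(staged, tbl, sizeof(staged));

	for (entry = 0; entry < TBL_SIZE; entry++) {
		const struct reta_group *grp = &conf[entry / GROUP_SIZE];
		unsigned int idx = entry % GROUP_SIZE;

		if ((grp->mask & (UINT64_C(1) << idx)) == 0)
			continue;	/* not selected: keep old value */
		if (grp->reta[idx] >= nb_channels)
			return -EINVAL;	/* 'tbl' is left untouched */
		staged[entry] = grp->reta[idx];
	}

	memcpy(tbl, staged, sizeof(staged));
	return 0;
}
```

Entry 69 of the table, for example, lives in group 1 at index 5, so a caller selects it by setting bit 5 of `conf[1].mask`.
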
* [PATCH v2 25/32] net/sfc/base: do not use enum type when values are bitmask
From: Andrew Rybchenko @ 2016-12-15 12:51 UTC (permalink / raw)
  To: dev; +Cc: ferruh.yigit, Andrew Rybchenko
In-Reply-To: <1481806283-10387-1-git-send-email-arybchenko@solarflare.com>

From: Andrew Rybchenko <Andrew.Rybchenko@oktetlabs.ru>

ICC complains that an enumerated type is mixed with another type.

Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Andrew Lee <alee@solarflare.com>
Reviewed-by: Robert Stonehouse <rstonehouse@solarflare.com>
---
 drivers/net/sfc/base/ef10_rx.c |  8 ++++----
 drivers/net/sfc/base/efx.h     | 12 ++++++------
 drivers/net/sfc/base/efx_rx.c  |  8 ++++----
 3 files changed, 14 insertions(+), 14 deletions(-)

diff --git a/drivers/net/sfc/base/ef10_rx.c b/drivers/net/sfc/base/ef10_rx.c
index 2bcd823..b65faed 100644
--- a/drivers/net/sfc/base/ef10_rx.c
+++ b/drivers/net/sfc/base/ef10_rx.c
@@ -304,13 +304,13 @@ efx_mcdi_rss_context_set_flags(
 
 	MCDI_IN_POPULATE_DWORD_4(req, RSS_CONTEXT_SET_FLAGS_IN_FLAGS,
 	    RSS_CONTEXT_SET_FLAGS_IN_TOEPLITZ_IPV4_EN,
-	    (type & (1U << EFX_RX_HASH_IPV4)) ? 1 : 0,
+	    (type & EFX_RX_HASH_IPV4) ? 1 : 0,
 	    RSS_CONTEXT_SET_FLAGS_IN_TOEPLITZ_TCPV4_EN,
-	    (type & (1U << EFX_RX_HASH_TCPIPV4)) ? 1 : 0,
+	    (type & EFX_RX_HASH_TCPIPV4) ? 1 : 0,
 	    RSS_CONTEXT_SET_FLAGS_IN_TOEPLITZ_IPV6_EN,
-	    (type & (1U << EFX_RX_HASH_IPV6)) ? 1 : 0,
+	    (type & EFX_RX_HASH_IPV6) ? 1 : 0,
 	    RSS_CONTEXT_SET_FLAGS_IN_TOEPLITZ_TCPV6_EN,
-	    (type & (1U << EFX_RX_HASH_TCPIPV6)) ? 1 : 0);
+	    (type & EFX_RX_HASH_TCPIPV6) ? 1 : 0);
 
 	efx_mcdi_execute(enp, &req);
 
diff --git a/drivers/net/sfc/base/efx.h b/drivers/net/sfc/base/efx.h
index 025721f..0815d7a 100644
--- a/drivers/net/sfc/base/efx.h
+++ b/drivers/net/sfc/base/efx.h
@@ -1851,12 +1851,12 @@ typedef enum efx_rx_hash_alg_e {
 	EFX_RX_HASHALG_TOEPLITZ
 } efx_rx_hash_alg_t;
 
-typedef enum efx_rx_hash_type_e {
-	EFX_RX_HASH_IPV4 = 0,
-	EFX_RX_HASH_TCPIPV4,
-	EFX_RX_HASH_IPV6,
-	EFX_RX_HASH_TCPIPV6,
-} efx_rx_hash_type_t;
+#define	EFX_RX_HASH_IPV4	(1U << 0)
+#define	EFX_RX_HASH_TCPIPV4	(1U << 1)
+#define	EFX_RX_HASH_IPV6	(1U << 2)
+#define	EFX_RX_HASH_TCPIPV6	(1U << 3)
+
+typedef unsigned int efx_rx_hash_type_t;
 
 typedef enum efx_rx_hash_support_e {
 	EFX_RX_HASH_UNAVAILABLE = 0,	/* Hardware hash not inserted */
diff --git a/drivers/net/sfc/base/efx_rx.c b/drivers/net/sfc/base/efx_rx.c
index 330d2aa..c815634 100644
--- a/drivers/net/sfc/base/efx_rx.c
+++ b/drivers/net/sfc/base/efx_rx.c
@@ -786,12 +786,12 @@ siena_rx_scale_mode_set(
 
 	case EFX_RX_HASHALG_TOEPLITZ:
 		EFX_RX_TOEPLITZ_IPV4_HASH(enp, insert,
-		    type & (1 << EFX_RX_HASH_IPV4),
-		    type & (1 << EFX_RX_HASH_TCPIPV4));
+		    type & EFX_RX_HASH_IPV4,
+		    type & EFX_RX_HASH_TCPIPV4);
 
 		EFX_RX_TOEPLITZ_IPV6_HASH(enp,
-		    type & (1 << EFX_RX_HASH_IPV6),
-		    type & (1 << EFX_RX_HASH_TCPIPV6),
+		    type & EFX_RX_HASH_IPV6,
+		    type & EFX_RX_HASH_TCPIPV6,
 		    rc);
 		if (rc != 0)
 			goto fail1;
-- 
2.5.5


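The difference between enumerated bit positions and ready-made bitmask values can be shown with a small sketch. The `HASH_*` names and `hash_type_has()` below are hypothetical and only mirror the shape of the definitions in the patch. With the old enum, `IPV4` had the value 0, so a test like `type & IPV4` would always be false unless the caller remembered the shift; with mask values the test is direct.

```c
#include <assert.h>

/* Hypothetical flags shaped like the new EFX_RX_HASH_* definitions. */
#define HASH_IPV4	(1U << 0)
#define HASH_TCPIPV4	(1U << 1)
#define HASH_IPV6	(1U << 2)
#define HASH_TCPIPV6	(1U << 3)

typedef unsigned int hash_type_t;

/* Return non-zero if 'flag' is set in 'type'. */
static int
hash_type_has(hash_type_t type, hash_type_t flag)
{
	assert(flag != 0);	/* a zero flag can never be tested */
	return (type & flag) != 0;
}
```

Because the combined value is a plain `unsigned int`, OR-ing flags together no longer produces a value outside the enumeration, which is what the compiler warning was about.
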
* [PATCH v2 27/32] net/sfc: support RSS hash offload
From: Andrew Rybchenko @ 2016-12-15 12:51 UTC (permalink / raw)
  To: dev; +Cc: ferruh.yigit, Ivan Malov
In-Reply-To: <1481806283-10387-1-git-send-email-arybchenko@solarflare.com>

From: Ivan Malov <ivan.malov@oktetlabs.ru>

Extract the RSS hash provided by the HW in the packet prefix and store it in the mbuf.

Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Andrew Lee <alee@solarflare.com>
Reviewed-by: Robert Stonehouse <rstonehouse@solarflare.com>
---
 doc/guides/nics/features/sfc_efx.ini |  1 +
 doc/guides/nics/sfc_efx.rst          |  2 ++
 drivers/net/sfc/sfc_rx.c             | 31 ++++++++++++++++++++++++++++++-
 3 files changed, 33 insertions(+), 1 deletion(-)

diff --git a/doc/guides/nics/features/sfc_efx.ini b/doc/guides/nics/features/sfc_efx.ini
index e7a1143..debea27 100644
--- a/doc/guides/nics/features/sfc_efx.ini
+++ b/doc/guides/nics/features/sfc_efx.ini
@@ -14,6 +14,7 @@ Scattered Rx         = Y
 Promiscuous mode     = Y
 Allmulticast mode    = Y
 Multicast MAC filter = Y
+RSS hash             = Y
 Flow control         = Y
 VLAN offload         = P
 L3 checksum offload  = Y
diff --git a/doc/guides/nics/sfc_efx.rst b/doc/guides/nics/sfc_efx.rst
index 17e81dd..bc45b17 100644
--- a/doc/guides/nics/sfc_efx.rst
+++ b/doc/guides/nics/sfc_efx.rst
@@ -71,6 +71,8 @@ SFC EFX PMD has support for:
 
 - Receive side scaling (RSS)
 
+- RSS hash
+
 - Scattered Rx DMA for packet that are larger that a single Rx descriptor
 
 - Deferred receive and transmit queue start
diff --git a/drivers/net/sfc/sfc_rx.c b/drivers/net/sfc/sfc_rx.c
index 36a7d71..9b507c3 100644
--- a/drivers/net/sfc/sfc_rx.c
+++ b/drivers/net/sfc/sfc_rx.c
@@ -185,6 +185,28 @@ sfc_rx_desc_flags_to_packet_type(const unsigned int desc_flags)
 		((desc_flags & EFX_PKT_UDP) ? RTE_PTYPE_L4_UDP : 0);
 }
 
+static void
+sfc_rx_set_rss_hash(struct sfc_rxq *rxq, unsigned int flags, struct rte_mbuf *m)
+{
+#if EFSYS_OPT_RX_SCALE
+	uint8_t *mbuf_data;
+
+
+	if ((rxq->flags & SFC_RXQ_RSS_HASH) == 0)
+		return;
+
+	mbuf_data = rte_pktmbuf_mtod(m, uint8_t *);
+
+	if (flags & (EFX_PKT_IPV4 | EFX_PKT_IPV6)) {
+		m->hash.rss = efx_pseudo_hdr_hash_get(rxq->common,
+						      EFX_RX_HASHALG_TOEPLITZ,
+						      mbuf_data);
+
+		m->ol_flags |= PKT_RX_RSS_HASH;
+	}
+#endif
+}
+
 uint16_t
 sfc_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts, uint16_t nb_pkts)
 {
@@ -231,7 +253,6 @@ sfc_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts, uint16_t nb_pkts)
 			seg_len = rxd->size - prefix_size;
 		}
 
-		m->data_off += prefix_size;
 		rte_pktmbuf_data_len(m) = seg_len;
 		rte_pktmbuf_pkt_len(m) = seg_len;
 
@@ -261,6 +282,14 @@ sfc_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts, uint16_t nb_pkts)
 		m->ol_flags = sfc_rx_desc_flags_to_offload_flags(desc_flags);
 		m->packet_type = sfc_rx_desc_flags_to_packet_type(desc_flags);
 
+		/*
+		 * Extract RSS hash from the packet prefix and
+		 * set the corresponding field (if needed and possible)
+		 */
+		sfc_rx_set_rss_hash(rxq, desc_flags, m);
+
+		m->data_off += prefix_size;
+
 		*rx_pkts++ = m;
 		done_pkts++;
 		continue;
-- 
2.5.5


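The reason this patch moves `m->data_off += prefix_size` below the hash extraction is that the hash is read from the start of the buffer data, which still points at the prefix at that moment. A minimal sketch of that ordering follows. `struct buf`, `HASH_OFST` and the little-endian layout are assumptions made for illustration only; the real prefix layout is decoded by `efx_pseudo_hdr_hash_get()`.

```c
#include <assert.h>
#include <stdint.h>

#define HASH_OFST	0	/* assumed hash offset inside the prefix */

/* Hypothetical buffer descriptor; not struct rte_mbuf. */
struct buf {
	uint8_t		*base;		/* start of the buffer */
	uint16_t	data_off;	/* offset of the current data */
	uint32_t	hash;		/* extracted hash value */
};

static void
consume_prefix(struct buf *b, uint16_t prefix_size)
{
	const uint8_t *p = b->base + b->data_off + HASH_OFST;

	assert(prefix_size >= HASH_OFST + 4);

	/* Read the hash while data_off still points at the prefix */
	b->hash = (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
		  ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);

	/* Only now skip the prefix so that data starts at the frame */
	b->data_off += prefix_size;
}
```

Reversing the two steps would read four bytes of the Ethernet frame instead of the hash.
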
* [PATCH v2 23/32] net/sfc: support deferred start of transmit queues
From: Andrew Rybchenko @ 2016-12-15 12:51 UTC (permalink / raw)
  To: dev; +Cc: ferruh.yigit, Ivan Malov
In-Reply-To: <1481806283-10387-1-git-send-email-arybchenko@solarflare.com>

From: Ivan Malov <ivan.malov@oktetlabs.ru>

Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Andrew Lee <alee@solarflare.com>
Reviewed-by: Robert Stonehouse <rstonehouse@solarflare.com>
---
 doc/guides/nics/features/sfc_efx.ini |  2 +-
 doc/guides/nics/sfc_efx.rst          |  2 +-
 drivers/net/sfc/sfc_ethdev.c         | 51 ++++++++++++++++++++++++++++++++++++
 drivers/net/sfc/sfc_tx.c             | 18 +++++++------
 drivers/net/sfc/sfc_tx.h             |  2 ++
 5 files changed, 65 insertions(+), 10 deletions(-)

diff --git a/doc/guides/nics/features/sfc_efx.ini b/doc/guides/nics/features/sfc_efx.ini
index 4a887f0..38bf9d2 100644
--- a/doc/guides/nics/features/sfc_efx.ini
+++ b/doc/guides/nics/features/sfc_efx.ini
@@ -7,7 +7,7 @@
 Speed capabilities   = Y
 Link status          = Y
 Link status event    = Y
-Queue start/stop     = P
+Queue start/stop     = Y
 MTU update           = Y
 Jumbo frame          = Y
 Scattered Rx         = Y
diff --git a/doc/guides/nics/sfc_efx.rst b/doc/guides/nics/sfc_efx.rst
index b0beaf1..304dc95 100644
--- a/doc/guides/nics/sfc_efx.rst
+++ b/doc/guides/nics/sfc_efx.rst
@@ -71,7 +71,7 @@ SFC EFX PMD has support for:
 
 - Scattered Rx DMA for packet that are larger that a single Rx descriptor
 
-- Deferred receive queue start
+- Deferred receive and transmit queue start
 
 
 Non-supported Features
diff --git a/drivers/net/sfc/sfc_ethdev.c b/drivers/net/sfc/sfc_ethdev.c
index 5d0d774..ba3c838 100644
--- a/drivers/net/sfc/sfc_ethdev.c
+++ b/drivers/net/sfc/sfc_ethdev.c
@@ -865,6 +865,7 @@ sfc_tx_queue_info_get(struct rte_eth_dev *dev, uint16_t tx_queue_id,
 
 	qinfo->conf.txq_flags = txq_info->txq->flags;
 	qinfo->conf.tx_free_thresh = txq_info->txq->free_thresh;
+	qinfo->conf.tx_deferred_start = txq_info->deferred_start;
 	qinfo->nb_desc = txq_info->entries;
 
 	sfc_adapter_unlock(sa);
@@ -936,6 +937,54 @@ sfc_rx_queue_stop(struct rte_eth_dev *dev, uint16_t rx_queue_id)
 	return 0;
 }
 
+static int
+sfc_tx_queue_start(struct rte_eth_dev *dev, uint16_t tx_queue_id)
+{
+	struct sfc_adapter *sa = dev->data->dev_private;
+	int rc;
+
+	sfc_log_init(sa, "TxQ = %u", tx_queue_id);
+
+	sfc_adapter_lock(sa);
+
+	rc = EINVAL;
+	if (sa->state != SFC_ADAPTER_STARTED)
+		goto fail_not_started;
+
+	rc = sfc_tx_qstart(sa, tx_queue_id);
+	if (rc != 0)
+		goto fail_tx_qstart;
+
+	sa->txq_info[tx_queue_id].deferred_started = B_TRUE;
+
+	sfc_adapter_unlock(sa);
+	return 0;
+
+fail_tx_qstart:
+
+fail_not_started:
+	sfc_adapter_unlock(sa);
+	SFC_ASSERT(rc > 0);
+	return -rc;
+}
+
+static int
+sfc_tx_queue_stop(struct rte_eth_dev *dev, uint16_t tx_queue_id)
+{
+	struct sfc_adapter *sa = dev->data->dev_private;
+
+	sfc_log_init(sa, "TxQ = %u", tx_queue_id);
+
+	sfc_adapter_lock(sa);
+
+	sfc_tx_qstop(sa, tx_queue_id);
+
+	sa->txq_info[tx_queue_id].deferred_started = B_FALSE;
+
+	sfc_adapter_unlock(sa);
+	return 0;
+}
+
 static const struct eth_dev_ops sfc_eth_dev_ops = {
 	.dev_configure			= sfc_dev_configure,
 	.dev_start			= sfc_dev_start,
@@ -956,6 +1005,8 @@ static const struct eth_dev_ops sfc_eth_dev_ops = {
 	.mtu_set			= sfc_dev_set_mtu,
 	.rx_queue_start			= sfc_rx_queue_start,
 	.rx_queue_stop			= sfc_rx_queue_stop,
+	.tx_queue_start			= sfc_tx_queue_start,
+	.tx_queue_stop			= sfc_tx_queue_stop,
 	.rx_queue_setup			= sfc_rx_queue_setup,
 	.rx_queue_release		= sfc_rx_queue_release,
 	.rx_queue_count			= sfc_rx_queue_count,
diff --git a/drivers/net/sfc/sfc_tx.c b/drivers/net/sfc/sfc_tx.c
index 13b24f7..15a6f9f 100644
--- a/drivers/net/sfc/sfc_tx.c
+++ b/drivers/net/sfc/sfc_tx.c
@@ -72,11 +72,6 @@ sfc_tx_qcheck_conf(struct sfc_adapter *sa, uint16_t nb_tx_desc,
 		rc = EINVAL;
 	}
 
-	if (tx_conf->tx_deferred_start != 0) {
-		sfc_err(sa, "TX queue deferred start is not supported (yet)");
-		rc = EINVAL;
-	}
-
 	if (tx_conf->tx_thresh.pthresh != 0 ||
 	    tx_conf->tx_thresh.hthresh != 0 ||
 	    tx_conf->tx_thresh.wthresh != 0) {
@@ -198,6 +193,7 @@ sfc_tx_qinit(struct sfc_adapter *sa, unsigned int sw_index,
 	evq->txq = txq;
 
 	txq_info->txq = txq;
+	txq_info->deferred_start = (tx_conf->tx_deferred_start != 0);
 
 	return 0;
 
@@ -425,6 +421,9 @@ sfc_tx_qstop(struct sfc_adapter *sa, unsigned int sw_index)
 
 	txq = txq_info->txq;
 
+	if (txq->state == SFC_TXQ_INITIALIZED)
+		return;
+
 	SFC_ASSERT(txq->state & SFC_TXQ_STARTED);
 
 	txq->state &= ~SFC_TXQ_RUNNING;
@@ -497,9 +496,12 @@ sfc_tx_start(struct sfc_adapter *sa)
 		goto fail_efx_tx_init;
 
 	for (sw_index = 0; sw_index < sa->txq_count; ++sw_index) {
-		rc = sfc_tx_qstart(sa, sw_index);
-		if (rc != 0)
-			goto fail_tx_qstart;
+		if (!(sa->txq_info[sw_index].deferred_start) ||
+		    sa->txq_info[sw_index].deferred_started) {
+			rc = sfc_tx_qstart(sa, sw_index);
+			if (rc != 0)
+				goto fail_tx_qstart;
+		}
 	}
 
 	return 0;
diff --git a/drivers/net/sfc/sfc_tx.h b/drivers/net/sfc/sfc_tx.h
index f9eecc0..632e3be 100644
--- a/drivers/net/sfc/sfc_tx.h
+++ b/drivers/net/sfc/sfc_tx.h
@@ -91,6 +91,8 @@ sfc_txq_sw_index(const struct sfc_txq *txq)
 struct sfc_txq_info {
 	unsigned int		entries;
 	struct sfc_txq		*txq;
+	boolean_t		deferred_start;
+	boolean_t		deferred_started;
 };
 
 int sfc_tx_init(struct sfc_adapter *sa);
-- 
2.5.5


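The condition added to `sfc_tx_start()` reduces to a small predicate over the two new flags. The sketch below restates it with a hypothetical `struct txq_flags` and helper name: a queue is started together with the device unless it was configured for deferred start and has not yet been started explicitly.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical subset of struct sfc_txq_info. */
struct txq_flags {
	bool	deferred_start;		/* requested at queue setup */
	bool	deferred_started;	/* set by explicit queue start */
};

/* Should this queue be started as part of device start? */
static bool
start_with_device(const struct txq_flags *qi)
{
	assert(qi != NULL);
	return !qi->deferred_start || qi->deferred_started;
}
```

The second flag is what lets a queue that the application started by hand come back up after a device stop/start cycle.
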
* [PATCH v2 29/32] net/sfc: add callback to set RSS key and hash types config
From: Andrew Rybchenko @ 2016-12-15 12:51 UTC (permalink / raw)
  To: dev; +Cc: ferruh.yigit, Ivan Malov
In-Reply-To: <1481806283-10387-1-git-send-email-arybchenko@solarflare.com>

From: Ivan Malov <ivan.malov@oktetlabs.ru>

Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Andrew Lee <alee@solarflare.com>
Reviewed-by: Robert Stonehouse <rstonehouse@solarflare.com>
---
 doc/guides/nics/features/sfc_efx.ini |  1 +
 drivers/net/sfc/sfc_ethdev.c         | 63 ++++++++++++++++++++++++++++++++++++
 2 files changed, 64 insertions(+)

diff --git a/doc/guides/nics/features/sfc_efx.ini b/doc/guides/nics/features/sfc_efx.ini
index debea27..4f6f117 100644
--- a/doc/guides/nics/features/sfc_efx.ini
+++ b/doc/guides/nics/features/sfc_efx.ini
@@ -15,6 +15,7 @@ Promiscuous mode     = Y
 Allmulticast mode    = Y
 Multicast MAC filter = Y
 RSS hash             = Y
+RSS key update       = Y
 Flow control         = Y
 VLAN offload         = P
 L3 checksum offload  = Y
diff --git a/drivers/net/sfc/sfc_ethdev.c b/drivers/net/sfc/sfc_ethdev.c
index c78d798..f9a766c 100644
--- a/drivers/net/sfc/sfc_ethdev.c
+++ b/drivers/net/sfc/sfc_ethdev.c
@@ -1025,6 +1025,68 @@ sfc_dev_rss_hash_conf_get(struct rte_eth_dev *dev,
 
 	return 0;
 }
+
+static int
+sfc_dev_rss_hash_update(struct rte_eth_dev *dev,
+			struct rte_eth_rss_conf *rss_conf)
+{
+	struct sfc_adapter *sa = dev->data->dev_private;
+	unsigned int efx_hash_types;
+	int rc = 0;
+
+	if ((sa->rss_channels == 1) ||
+	    (sa->rss_support != EFX_RX_SCALE_EXCLUSIVE)) {
+		sfc_err(sa, "RSS is not available");
+		return -ENOTSUP;
+	}
+
+	if ((rss_conf->rss_key != NULL) &&
+	    (rss_conf->rss_key_len != sizeof(sa->rss_key))) {
+		sfc_err(sa, "RSS key size is wrong (should be %zu)",
+			sizeof(sa->rss_key));
+		return -EINVAL;
+	}
+
+	if ((rss_conf->rss_hf & ~SFC_RSS_OFFLOADS) != 0) {
+		sfc_err(sa, "unsupported hash functions requested");
+		return -EINVAL;
+	}
+
+	sfc_adapter_lock(sa);
+
+	efx_hash_types = sfc_rte_to_efx_hash_type(rss_conf->rss_hf);
+
+	rc = efx_rx_scale_mode_set(sa->nic, EFX_RX_HASHALG_TOEPLITZ,
+				   efx_hash_types, B_TRUE);
+	if (rc != 0)
+		goto fail_scale_mode_set;
+
+	if (rss_conf->rss_key != NULL) {
+		if (sa->state == SFC_ADAPTER_STARTED) {
+			rc = efx_rx_scale_key_set(sa->nic, rss_conf->rss_key,
+						  sizeof(sa->rss_key));
+			if (rc != 0)
+				goto fail_scale_key_set;
+		}
+
+		rte_memcpy(sa->rss_key, rss_conf->rss_key, sizeof(sa->rss_key));
+	}
+
+	sa->rss_hash_types = efx_hash_types;
+
+	sfc_adapter_unlock(sa);
+
+	return 0;
+
+fail_scale_key_set:
+	if (efx_rx_scale_mode_set(sa->nic, EFX_RX_HASHALG_TOEPLITZ,
+				  sa->rss_hash_types, B_TRUE) != 0)
+		sfc_err(sa, "failed to restore RSS mode");
+
+fail_scale_mode_set:
+	sfc_adapter_unlock(sa);
+	return -rc;
+}
 #endif
 
 static const struct eth_dev_ops sfc_eth_dev_ops = {
@@ -1059,6 +1121,7 @@ static const struct eth_dev_ops sfc_eth_dev_ops = {
 	.flow_ctrl_set			= sfc_flow_ctrl_set,
 	.mac_addr_set			= sfc_mac_addr_set,
 #if EFSYS_OPT_RX_SCALE
+	.rss_hash_update		= sfc_dev_rss_hash_update,
 	.rss_hash_conf_get		= sfc_dev_rss_hash_conf_get,
 #endif
 	.set_mc_addr_list		= sfc_set_mc_addr_list,
-- 
2.5.5


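The error handling in the hash update callback follows a program-then-roll-back pattern: the new mode is applied first, and if the key cannot be set, the previous mode is restored and the cached software state is not changed. A minimal sketch with a hypothetical fake device (`struct fake_nic`, `nic_mode_set()`, `nic_key_set()` and `rss_update()` are illustrative names, not libefx calls):

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>

#define KEY_SIZE	40

/* Hypothetical device model that records what was programmed. */
struct fake_nic {
	unsigned int	mode;
	uint8_t		key[KEY_SIZE];
	int		key_set_rc;	/* injected result of key programming */
};

/* Hypothetical cached software state. */
struct sw_state {
	unsigned int	mode;
	uint8_t		key[KEY_SIZE];
};

static int
nic_mode_set(struct fake_nic *nic, unsigned int mode)
{
	nic->mode = mode;
	return 0;
}

static int
nic_key_set(struct fake_nic *nic, const uint8_t *key)
{
	if (nic->key_set_rc != 0)
		return nic->key_set_rc;
	memcpy(nic->key, key, KEY_SIZE);
	return 0;
}

static int
rss_update(struct fake_nic *nic, struct sw_state *sw, unsigned int mode,
	   const uint8_t *key)
{
	int rc;

	assert(nic != NULL && sw != NULL);

	rc = nic_mode_set(nic, mode);
	if (rc != 0)
		return -rc;

	if (key != NULL) {
		rc = nic_key_set(nic, key);
		if (rc != 0) {
			/* Roll the device back to the cached mode */
			(void)nic_mode_set(nic, sw->mode);
			return -rc;
		}
		memcpy(sw->key, key, KEY_SIZE);
	}

	sw->mode = mode;	/* commit only after full success */
	return 0;
}
```

Updating the cached state last keeps the device and the software view consistent on every exit path.
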
* [PATCH v2 30/32] net/sfc: add callback to query RSS redirection table
From: Andrew Rybchenko @ 2016-12-15 12:51 UTC (permalink / raw)
  To: dev; +Cc: ferruh.yigit, Ivan Malov
In-Reply-To: <1481806283-10387-1-git-send-email-arybchenko@solarflare.com>

From: Ivan Malov <ivan.malov@oktetlabs.ru>

Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Andrew Lee <alee@solarflare.com>
Reviewed-by: Robert Stonehouse <rstonehouse@solarflare.com>
---
 drivers/net/sfc/sfc_ethdev.c | 31 +++++++++++++++++++++++++++++++
 1 file changed, 31 insertions(+)

diff --git a/drivers/net/sfc/sfc_ethdev.c b/drivers/net/sfc/sfc_ethdev.c
index f9a766c..0cd96ac 100644
--- a/drivers/net/sfc/sfc_ethdev.c
+++ b/drivers/net/sfc/sfc_ethdev.c
@@ -1087,6 +1087,36 @@ sfc_dev_rss_hash_update(struct rte_eth_dev *dev,
 	sfc_adapter_unlock(sa);
 	return -rc;
 }
+
+static int
+sfc_dev_rss_reta_query(struct rte_eth_dev *dev,
+		       struct rte_eth_rss_reta_entry64 *reta_conf,
+		       uint16_t reta_size)
+{
+	struct sfc_adapter *sa = dev->data->dev_private;
+	int entry;
+
+	if ((sa->rss_channels == 1) ||
+	    (sa->rss_support != EFX_RX_SCALE_EXCLUSIVE))
+		return -ENOTSUP;
+
+	if (reta_size != EFX_RSS_TBL_SIZE)
+		return -EINVAL;
+
+	sfc_adapter_lock(sa);
+
+	for (entry = 0; entry < reta_size; entry++) {
+		int grp = entry / RTE_RETA_GROUP_SIZE;
+		int grp_idx = entry % RTE_RETA_GROUP_SIZE;
+
+		if ((reta_conf[grp].mask >> grp_idx) & 1)
+			reta_conf[grp].reta[grp_idx] = sa->rss_tbl[entry];
+	}
+
+	sfc_adapter_unlock(sa);
+
+	return 0;
+}
 #endif
 
 static const struct eth_dev_ops sfc_eth_dev_ops = {
@@ -1121,6 +1151,7 @@ static const struct eth_dev_ops sfc_eth_dev_ops = {
 	.flow_ctrl_set			= sfc_flow_ctrl_set,
 	.mac_addr_set			= sfc_mac_addr_set,
 #if EFSYS_OPT_RX_SCALE
+	.reta_query			= sfc_dev_rss_reta_query,
 	.rss_hash_update		= sfc_dev_rss_hash_update,
 	.rss_hash_conf_get		= sfc_dev_rss_hash_conf_get,
 #endif
-- 
2.5.5


* [PATCH v2 26/32] net/sfc: add basic stubs for RSS support on driver attach
From: Andrew Rybchenko @ 2016-12-15 12:51 UTC (permalink / raw)
  To: dev; +Cc: ferruh.yigit, Ivan Malov
In-Reply-To: <1481806283-10387-1-git-send-email-arybchenko@solarflare.com>

From: Ivan Malov <ivan.malov@oktetlabs.ru>

Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Andrew Lee <alee@solarflare.com>
Reviewed-by: Robert Stonehouse <rstonehouse@solarflare.com>
---
 doc/guides/nics/sfc_efx.rst  |  2 ++
 drivers/net/sfc/efsys.h      |  2 +-
 drivers/net/sfc/sfc.c        | 76 +++++++++++++++++++++++++++++++++++++++++
 drivers/net/sfc/sfc.h        | 17 ++++++++++
 drivers/net/sfc/sfc_ethdev.c |  8 +++++
 drivers/net/sfc/sfc_rx.c     | 81 +++++++++++++++++++++++++++++++++++++++++++-
 drivers/net/sfc/sfc_rx.h     |  8 +++++
 7 files changed, 192 insertions(+), 2 deletions(-)

diff --git a/doc/guides/nics/sfc_efx.rst b/doc/guides/nics/sfc_efx.rst
index 2244e7a..17e81dd 100644
--- a/doc/guides/nics/sfc_efx.rst
+++ b/doc/guides/nics/sfc_efx.rst
@@ -69,6 +69,8 @@ SFC EFX PMD has support for:
 
 - Received packet type information
 
+- Receive side scaling (RSS)
+
 - Scattered Rx DMA for packet that are larger that a single Rx descriptor
 
 - Deferred receive and transmit queue start
diff --git a/drivers/net/sfc/efsys.h b/drivers/net/sfc/efsys.h
index 0f941e6..fb2f3b5 100644
--- a/drivers/net/sfc/efsys.h
+++ b/drivers/net/sfc/efsys.h
@@ -195,7 +195,7 @@ prefetch_read_once(const volatile void *addr)
 #define EFSYS_OPT_BOOTCFG 0
 
 #define EFSYS_OPT_DIAG 0
-#define EFSYS_OPT_RX_SCALE 0
+#define EFSYS_OPT_RX_SCALE 1
 #define EFSYS_OPT_QSTATS 0
 /* Filters support is required for SFN7xxx and SFN8xx */
 #define EFSYS_OPT_FILTER 1
diff --git a/drivers/net/sfc/sfc.c b/drivers/net/sfc/sfc.c
index e2e6c9e..e79367d 100644
--- a/drivers/net/sfc/sfc.c
+++ b/drivers/net/sfc/sfc.c
@@ -484,6 +484,73 @@ sfc_mem_bar_fini(struct sfc_adapter *sa)
 	memset(ebp, 0, sizeof(*ebp));
 }
 
+#if EFSYS_OPT_RX_SCALE
+/*
+ * A fixed RSS key which has a property of being symmetric
+ * (symmetrical flows are distributed to the same CPU)
+ * and also known to give a uniform distribution
+ * (a good distribution of traffic between different CPUs)
+ */
+static const uint8_t default_rss_key[SFC_RSS_KEY_SIZE] = {
+	0x6d, 0x5a, 0x6d, 0x5a, 0x6d, 0x5a, 0x6d, 0x5a,
+	0x6d, 0x5a, 0x6d, 0x5a, 0x6d, 0x5a, 0x6d, 0x5a,
+	0x6d, 0x5a, 0x6d, 0x5a, 0x6d, 0x5a, 0x6d, 0x5a,
+	0x6d, 0x5a, 0x6d, 0x5a, 0x6d, 0x5a, 0x6d, 0x5a,
+	0x6d, 0x5a, 0x6d, 0x5a, 0x6d, 0x5a, 0x6d, 0x5a,
+};
+#endif
+
+static int
+sfc_set_rss_defaults(struct sfc_adapter *sa)
+{
+#if EFSYS_OPT_RX_SCALE
+	int rc;
+
+	rc = efx_intr_init(sa->nic, sa->intr.type, NULL);
+	if (rc != 0)
+		goto fail_intr_init;
+
+	rc = efx_ev_init(sa->nic);
+	if (rc != 0)
+		goto fail_ev_init;
+
+	rc = efx_rx_init(sa->nic);
+	if (rc != 0)
+		goto fail_rx_init;
+
+	rc = efx_rx_scale_support_get(sa->nic, &sa->rss_support);
+	if (rc != 0)
+		goto fail_scale_support_get;
+
+	rc = efx_rx_hash_support_get(sa->nic, &sa->hash_support);
+	if (rc != 0)
+		goto fail_hash_support_get;
+
+	efx_rx_fini(sa->nic);
+	efx_ev_fini(sa->nic);
+	efx_intr_fini(sa->nic);
+
+	sa->rss_hash_types = sfc_rte_to_efx_hash_type(SFC_RSS_OFFLOADS);
+
+	rte_memcpy(sa->rss_key, default_rss_key, sizeof(sa->rss_key));
+
+	return 0;
+
+fail_hash_support_get:
+fail_scale_support_get:
+fail_rx_init:
+	efx_ev_fini(sa->nic);
+
+fail_ev_init:
+	efx_intr_fini(sa->nic);
+
+fail_intr_init:
+	return rc;
+#else
+	return 0;
+#endif
+}
+
 int
 sfc_attach(struct sfc_adapter *sa)
 {
@@ -550,6 +617,10 @@ sfc_attach(struct sfc_adapter *sa)
 	efx_phy_adv_cap_get(sa->nic, EFX_PHY_CAP_PERM,
 			    &sa->port.phy_adv_cap_mask);
 
+	rc = sfc_set_rss_defaults(sa);
+	if (rc != 0)
+		goto fail_set_rss_defaults;
+
 	sfc_log_init(sa, "fini nic");
 	efx_nic_fini(enp);
 
@@ -558,7 +629,12 @@ sfc_attach(struct sfc_adapter *sa)
 	sfc_log_init(sa, "done");
 	return 0;
 
+fail_set_rss_defaults:
+	sfc_intr_detach(sa);
+
 fail_intr_attach:
+	efx_nic_fini(sa->nic);
+
 fail_estimate_rsrc_limits:
 fail_nic_reset:
 	sfc_log_init(sa, "unprobe nic");
diff --git a/drivers/net/sfc/sfc.h b/drivers/net/sfc/sfc.h
index 7b135e1..d02d1c0 100644
--- a/drivers/net/sfc/sfc.h
+++ b/drivers/net/sfc/sfc.h
@@ -42,6 +42,13 @@
 extern "C" {
 #endif
 
+#if EFSYS_OPT_RX_SCALE
+/** RSS key length (bytes) */
+#define SFC_RSS_KEY_SIZE	40
+/** RSS hash offloads mask */
+#define SFC_RSS_OFFLOADS	(ETH_RSS_IP | ETH_RSS_TCP)
+#endif
+
 /*
  * +---------------+
  * | UNINITIALIZED |<-----------+
@@ -187,6 +194,16 @@ struct sfc_adapter {
 
 	unsigned int			txq_count;
 	struct sfc_txq_info		*txq_info;
+
+	unsigned int			rss_channels;
+
+#if EFSYS_OPT_RX_SCALE
+	efx_rx_scale_support_t		rss_support;
+	efx_rx_hash_support_t		hash_support;
+	efx_rx_hash_type_t		rss_hash_types;
+	unsigned int			rss_tbl[EFX_RSS_TBL_SIZE];
+	uint8_t				rss_key[SFC_RSS_KEY_SIZE];
+#endif
 };
 
 /*
diff --git a/drivers/net/sfc/sfc_ethdev.c b/drivers/net/sfc/sfc_ethdev.c
index 0de17ca..b17607f 100644
--- a/drivers/net/sfc/sfc_ethdev.c
+++ b/drivers/net/sfc/sfc_ethdev.c
@@ -85,6 +85,14 @@ sfc_dev_infos_get(struct rte_eth_dev *dev, struct rte_eth_dev_info *dev_info)
 	else
 		dev_info->tx_offload_capa |= DEV_TX_OFFLOAD_VLAN_INSERT;
 
+#if EFSYS_OPT_RX_SCALE
+	if (sa->rss_support != EFX_RX_SCALE_UNAVAILABLE) {
+		dev_info->reta_size = EFX_RSS_TBL_SIZE;
+		dev_info->hash_key_size = SFC_RSS_KEY_SIZE;
+		dev_info->flow_type_rss_offloads = SFC_RSS_OFFLOADS;
+	}
+#endif
+
 	dev_info->rx_desc_lim.nb_max = EFX_RXQ_MAXNDESCS;
 	dev_info->rx_desc_lim.nb_min = EFX_RXQ_MINNDESCS;
 	/* The RXQ hardware requires that the descriptor count is a power
diff --git a/drivers/net/sfc/sfc_rx.c b/drivers/net/sfc/sfc_rx.c
index 3bfce1c..36a7d71 100644
--- a/drivers/net/sfc/sfc_rx.c
+++ b/drivers/net/sfc/sfc_rx.c
@@ -411,7 +411,8 @@ sfc_rx_qstart(struct sfc_adapter *sa, unsigned int sw_index)
 
 	if (sw_index == 0) {
 		rc = efx_mac_filter_default_rxq_set(sa->nic, rxq->common,
-						    B_FALSE);
+						    (sa->rss_channels > 1) ?
+						    B_TRUE : B_FALSE);
 		if (rc != 0)
 			goto fail_mac_filter_default_rxq_set;
 	}
@@ -683,6 +684,11 @@ sfc_rx_qinit(struct sfc_adapter *sa, unsigned int sw_index,
 	rxq->batch_max = encp->enc_rx_batch_max;
 	rxq->prefix_size = encp->enc_rx_prefix_size;
 
+#if EFSYS_OPT_RX_SCALE
+	if (sa->hash_support == EFX_RX_HASH_AVAILABLE)
+		rxq->flags |= SFC_RXQ_RSS_HASH;
+#endif
+
 	rxq->state = SFC_RXQ_INITIALIZED;
 
 	rxq_info->rxq = rxq;
@@ -728,6 +734,56 @@ sfc_rx_qfini(struct sfc_adapter *sa, unsigned int sw_index)
 	rte_free(rxq);
 }
 
+#if EFSYS_OPT_RX_SCALE
+efx_rx_hash_type_t
+sfc_rte_to_efx_hash_type(uint64_t rss_hf)
+{
+	efx_rx_hash_type_t efx_hash_types = 0;
+
+	if ((rss_hf & (ETH_RSS_IPV4 | ETH_RSS_FRAG_IPV4 |
+		       ETH_RSS_NONFRAG_IPV4_OTHER)) != 0)
+		efx_hash_types |= EFX_RX_HASH_IPV4;
+
+	if ((rss_hf & ETH_RSS_NONFRAG_IPV4_TCP) != 0)
+		efx_hash_types |= EFX_RX_HASH_TCPIPV4;
+
+	if ((rss_hf & (ETH_RSS_IPV6 | ETH_RSS_FRAG_IPV6 |
+			ETH_RSS_NONFRAG_IPV6_OTHER | ETH_RSS_IPV6_EX)) != 0)
+		efx_hash_types |= EFX_RX_HASH_IPV6;
+
+	if ((rss_hf & (ETH_RSS_NONFRAG_IPV6_TCP | ETH_RSS_IPV6_TCP_EX)) != 0)
+		efx_hash_types |= EFX_RX_HASH_TCPIPV6;
+
+	return efx_hash_types;
+}
+#endif
+
+static int
+sfc_rx_rss_config(struct sfc_adapter *sa)
+{
+	int rc = 0;
+
+#if EFSYS_OPT_RX_SCALE
+	if (sa->rss_channels > 1) {
+		rc = efx_rx_scale_mode_set(sa->nic, EFX_RX_HASHALG_TOEPLITZ,
+					   sa->rss_hash_types, B_TRUE);
+		if (rc != 0)
+			goto finish;
+
+		rc = efx_rx_scale_key_set(sa->nic, sa->rss_key,
+					  sizeof(sa->rss_key));
+		if (rc != 0)
+			goto finish;
+
+		rc = efx_rx_scale_tbl_set(sa->nic, sa->rss_tbl,
+					  EFX_RSS_TBL_SIZE);
+	}
+
+finish:
+#endif
+	return rc;
+}
+
 int
 sfc_rx_start(struct sfc_adapter *sa)
 {
@@ -740,6 +796,10 @@ sfc_rx_start(struct sfc_adapter *sa)
 	if (rc != 0)
 		goto fail_rx_init;
 
+	rc = sfc_rx_rss_config(sa);
+	if (rc != 0)
+		goto fail_rss_config;
+
 	for (sw_index = 0; sw_index < sa->rxq_count; ++sw_index) {
 		if ((!sa->rxq_info[sw_index].deferred_start ||
 		     sa->rxq_info[sw_index].deferred_started)) {
@@ -755,6 +815,7 @@ sfc_rx_start(struct sfc_adapter *sa)
 	while (sw_index-- > 0)
 		sfc_rx_qstop(sa, sw_index);
 
+fail_rss_config:
 	efx_rx_fini(sa->nic);
 
 fail_rx_init:
@@ -801,6 +862,14 @@ sfc_rx_check_mode(struct sfc_adapter *sa, struct rte_eth_rxmode *rxmode)
 	case ETH_MQ_RX_NONE:
 		/* No special checks are required */
 		break;
+#if EFSYS_OPT_RX_SCALE
+	case ETH_MQ_RX_RSS:
+		if (sa->rss_support == EFX_RX_SCALE_UNAVAILABLE) {
+			sfc_err(sa, "RSS is not available");
+			rc = EINVAL;
+		}
+		break;
+#endif
 	default:
 		sfc_err(sa, "Rx multi-queue mode %u not supported",
 			rxmode->mq_mode);
@@ -876,6 +945,16 @@ sfc_rx_init(struct sfc_adapter *sa)
 			goto fail_rx_qinit_info;
 	}
 
+#if EFSYS_OPT_RX_SCALE
+	sa->rss_channels = (dev_conf->rxmode.mq_mode == ETH_MQ_RX_RSS) ?
+			   MIN(sa->rxq_count, EFX_MAXRSS) : 1;
+
+	if (sa->rss_channels > 1) {
+		for (sw_index = 0; sw_index < EFX_RSS_TBL_SIZE; ++sw_index)
+			sa->rss_tbl[sw_index] = sw_index % sa->rss_channels;
+	}
+#endif
+
 	return 0;
 
 fail_rx_qinit_info:
diff --git a/drivers/net/sfc/sfc_rx.h b/drivers/net/sfc/sfc_rx.h
index 4aa6aea..c0cb17a 100644
--- a/drivers/net/sfc/sfc_rx.h
+++ b/drivers/net/sfc/sfc_rx.h
@@ -83,6 +83,10 @@ struct sfc_rxq {
 	unsigned int		completed;
 	uint16_t		batch_max;
 	uint16_t		prefix_size;
+#if EFSYS_OPT_RX_SCALE
+	unsigned int		flags;
+#define SFC_RXQ_RSS_HASH	0x1
+#endif
 
 	/* Used on refill */
 	unsigned int		added;
@@ -146,6 +150,10 @@ unsigned int sfc_rx_qdesc_npending(struct sfc_adapter *sa,
 				   unsigned int sw_index);
 int sfc_rx_qdesc_done(struct sfc_rxq *rxq, unsigned int offset);
 
+#if EFSYS_OPT_RX_SCALE
+efx_rx_hash_type_t sfc_rte_to_efx_hash_type(uint64_t rss_hf);
+#endif
+
 #ifdef __cplusplus
 }
 #endif
-- 
2.5.5
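
For readers following the sfc_rx_init() hunk above: the default indirection
table is filled round-robin over the active RSS channels. Below is a minimal
standalone sketch of that fill, not the driver code itself. The demo_* names
and the table size of 128 are assumptions for illustration; the driver uses
EFX_RSS_TBL_SIZE from libefx and stores the table in 'sfc_adapter'.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed size for illustration; the driver uses EFX_RSS_TBL_SIZE. */
#define DEMO_RSS_TBL_SIZE 128

/*
 * Fill the indirection table round-robin, as the patch does with
 * rss_tbl[i] = i % rss_channels. Precondition: channels > 0.
 */
static void
demo_rss_tbl_fill(unsigned int *tbl, unsigned int n, unsigned int channels)
{
	unsigned int i;

	assert(tbl != NULL && channels > 0);
	for (i = 0; i < n; ++i)
		tbl[i] = i % channels;
}

/* Count how many table entries point at the given channel. */
static unsigned int
demo_rss_tbl_count(const unsigned int *tbl, unsigned int n,
		   unsigned int channel)
{
	unsigned int i;
	unsigned int count = 0;

	for (i = 0; i < n; ++i) {
		if (tbl[i] == channel)
			++count;
	}
	return count;
}
```

When the table size is not a multiple of the channel count, the spread is
uneven by at most one entry: with 128 entries and 3 channels, channels 0 and
1 each get 43 entries and channel 2 gets 42.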


* [PATCH v2 28/32] net/sfc: add callback to query RSS key and hash types config
From: Andrew Rybchenko @ 2016-12-15 12:51 UTC (permalink / raw)
  To: dev; +Cc: ferruh.yigit, Ivan Malov
In-Reply-To: <1481806283-10387-1-git-send-email-arybchenko@solarflare.com>

From: Ivan Malov <ivan.malov@oktetlabs.ru>

Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Andrew Lee <alee@solarflare.com>
Reviewed-by: Robert Stonehouse <rstonehouse@solarflare.com>
---
 drivers/net/sfc/sfc_ethdev.c | 33 +++++++++++++++++++++++++++++++++
 drivers/net/sfc/sfc_rx.c     | 22 ++++++++++++++++++++++
 drivers/net/sfc/sfc_rx.h     |  1 +
 3 files changed, 56 insertions(+)

diff --git a/drivers/net/sfc/sfc_ethdev.c b/drivers/net/sfc/sfc_ethdev.c
index b17607f..c78d798 100644
--- a/drivers/net/sfc/sfc_ethdev.c
+++ b/drivers/net/sfc/sfc_ethdev.c
@@ -997,6 +997,36 @@ sfc_tx_queue_stop(struct rte_eth_dev *dev, uint16_t tx_queue_id)
 	return 0;
 }
 
+#if EFSYS_OPT_RX_SCALE
+static int
+sfc_dev_rss_hash_conf_get(struct rte_eth_dev *dev,
+			  struct rte_eth_rss_conf *rss_conf)
+{
+	struct sfc_adapter *sa = dev->data->dev_private;
+
+	if ((sa->rss_channels == 1) ||
+	    (sa->rss_support != EFX_RX_SCALE_EXCLUSIVE))
+		return -ENOTSUP;
+
+	sfc_adapter_lock(sa);
+
+	/*
+	 * The mapping of hash configuration between RTE and EFX is not
+	 * one-to-one, so a conversion is done here to derive the set of
+	 * ETH_RSS flags that corresponds to the active EFX configuration,
+	 * which is stored locally in 'sfc_adapter' and kept up to date.
+	 */
+	rss_conf->rss_hf = sfc_efx_to_rte_hash_type(sa->rss_hash_types);
+	rss_conf->rss_key_len = SFC_RSS_KEY_SIZE;
+	if (rss_conf->rss_key != NULL)
+		rte_memcpy(rss_conf->rss_key, sa->rss_key, SFC_RSS_KEY_SIZE);
+
+	sfc_adapter_unlock(sa);
+
+	return 0;
+}
+#endif
+
 static const struct eth_dev_ops sfc_eth_dev_ops = {
 	.dev_configure			= sfc_dev_configure,
 	.dev_start			= sfc_dev_start,
@@ -1028,6 +1058,9 @@ static const struct eth_dev_ops sfc_eth_dev_ops = {
 	.flow_ctrl_get			= sfc_flow_ctrl_get,
 	.flow_ctrl_set			= sfc_flow_ctrl_set,
 	.mac_addr_set			= sfc_mac_addr_set,
+#if EFSYS_OPT_RX_SCALE
+	.rss_hash_conf_get		= sfc_dev_rss_hash_conf_get,
+#endif
 	.set_mc_addr_list		= sfc_set_mc_addr_list,
 	.rxq_info_get			= sfc_rx_queue_info_get,
 	.txq_info_get			= sfc_tx_queue_info_get,
diff --git a/drivers/net/sfc/sfc_rx.c b/drivers/net/sfc/sfc_rx.c
index 9b507c3..906536e 100644
--- a/drivers/net/sfc/sfc_rx.c
+++ b/drivers/net/sfc/sfc_rx.c
@@ -785,6 +785,28 @@ sfc_rte_to_efx_hash_type(uint64_t rss_hf)
 
 	return efx_hash_types;
 }
+
+uint64_t
+sfc_efx_to_rte_hash_type(efx_rx_hash_type_t efx_hash_types)
+{
+	uint64_t rss_hf = 0;
+
+	if ((efx_hash_types & EFX_RX_HASH_IPV4) != 0)
+		rss_hf |= (ETH_RSS_IPV4 | ETH_RSS_FRAG_IPV4 |
+			   ETH_RSS_NONFRAG_IPV4_OTHER);
+
+	if ((efx_hash_types & EFX_RX_HASH_TCPIPV4) != 0)
+		rss_hf |= ETH_RSS_NONFRAG_IPV4_TCP;
+
+	if ((efx_hash_types & EFX_RX_HASH_IPV6) != 0)
+		rss_hf |= (ETH_RSS_IPV6 | ETH_RSS_FRAG_IPV6 |
+			   ETH_RSS_NONFRAG_IPV6_OTHER | ETH_RSS_IPV6_EX);
+
+	if ((efx_hash_types & EFX_RX_HASH_TCPIPV6) != 0)
+		rss_hf |= (ETH_RSS_NONFRAG_IPV6_TCP | ETH_RSS_IPV6_TCP_EX);
+
+	return rss_hf;
+}
 #endif
 
 static int
diff --git a/drivers/net/sfc/sfc_rx.h b/drivers/net/sfc/sfc_rx.h
index c0cb17a..45b1d77 100644
--- a/drivers/net/sfc/sfc_rx.h
+++ b/drivers/net/sfc/sfc_rx.h
@@ -152,6 +152,7 @@ int sfc_rx_qdesc_done(struct sfc_rxq *rxq, unsigned int offset);
 
 #if EFSYS_OPT_RX_SCALE
 efx_rx_hash_type_t sfc_rte_to_efx_hash_type(uint64_t rss_hf);
+uint64_t sfc_efx_to_rte_hash_type(efx_rx_hash_type_t efx_hash_types);
 #endif
 
 #ifdef __cplusplus
-- 
2.5.5
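
To illustrate the "not one-to-one" remark in sfc_dev_rss_hash_conf_get(): one
EFX hash type covers several ETH_RSS flags, so the flags read back can be a
superset of the flags that were requested. The sketch below is standalone and
covers IPv4 only. The DEMO_* bit values are hypothetical stand-ins rather
than the real ETH_RSS_* or EFX_RX_HASH_* constants, and demo_rte_to_efx()
assumes the forward conversion sets the EFX bit when any flag of the group is
present (the body of sfc_rte_to_efx_hash_type() is not part of this patch).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bit values, for illustration only. */
#define DEMO_RSS_IPV4			(UINT64_C(1) << 0)
#define DEMO_RSS_FRAG_IPV4		(UINT64_C(1) << 1)
#define DEMO_RSS_NONFRAG_IPV4_OTHER	(UINT64_C(1) << 2)
#define DEMO_RSS_NONFRAG_IPV4_TCP	(UINT64_C(1) << 3)

#define DEMO_EFX_HASH_IPV4		(1u << 0)
#define DEMO_EFX_HASH_TCPIPV4		(1u << 1)

/* All RTE flags that collapse into the single EFX IPv4 hash type. */
#define DEMO_RSS_IPV4_GROUP \
	(DEMO_RSS_IPV4 | DEMO_RSS_FRAG_IPV4 | DEMO_RSS_NONFRAG_IPV4_OTHER)

/* Assumed forward mapping: any flag of the group selects the EFX type. */
static unsigned int
demo_rte_to_efx(uint64_t rss_hf)
{
	unsigned int efx = 0;

	if ((rss_hf & DEMO_RSS_IPV4_GROUP) != 0)
		efx |= DEMO_EFX_HASH_IPV4;
	if ((rss_hf & DEMO_RSS_NONFRAG_IPV4_TCP) != 0)
		efx |= DEMO_EFX_HASH_TCPIPV4;
	return efx;
}

/* Reverse mapping, mirroring the IPv4 part of sfc_efx_to_rte_hash_type(). */
static uint64_t
demo_efx_to_rte(unsigned int efx)
{
	uint64_t rss_hf = 0;

	if ((efx & DEMO_EFX_HASH_IPV4) != 0)
		rss_hf |= DEMO_RSS_IPV4_GROUP;
	if ((efx & DEMO_EFX_HASH_TCPIPV4) != 0)
		rss_hf |= DEMO_RSS_NONFRAG_IPV4_TCP;
	return rss_hf;
}

/* Convert RTE flags to EFX and back again. */
static uint64_t
demo_round_trip(uint64_t rss_hf)
{
	return demo_efx_to_rte(demo_rte_to_efx(rss_hf));
}

/* A single requested flag comes back as the whole group. */
static void
demo_self_check(void)
{
	assert(demo_round_trip(DEMO_RSS_IPV4) == DEMO_RSS_IPV4_GROUP);
}
```

Under these assumptions, an application that compares the returned rss_hf
with the value it configured should expect extra bits from the same group
rather than an exact match.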


