DPDK-dev Archive on lore.kernel.org
 help / color / mirror / Atom feed
* Re: [PATCH v8 18/25] app/testpmd: use VFD APIs on i40e
From: Lu, Wenzhuo @ 2017-01-11  1:27 UTC (permalink / raw)
  To: Yigit, Ferruh, dev@dpdk.org; +Cc: Chen, Jing D, Iremonger, Bernard
In-Reply-To: <598bd9db-073a-b40c-b60e-1d755b8f95ce@intel.com>

Hi Ferruh,

> -----Original Message-----
> From: Yigit, Ferruh
> Sent: Wednesday, January 11, 2017 4:09 AM
> To: Lu, Wenzhuo; dev@dpdk.org
> Cc: Chen, Jing D; Iremonger, Bernard
> Subject: Re: [dpdk-dev] [PATCH v8 18/25] app/testpmd: use VFD APIs on i40e
> 
> On 1/10/2017 11:29 AM, Ferruh Yigit wrote:
> > On 1/10/2017 7:16 AM, Wenzhuo Lu wrote:
> >> The new VF Daemon (VFD) APIs is implemented on i40e. Change testpmd
> >> code to use them, including VF MAC anti-spoofing, VF VLAN
> >> anti-spoofing, TX loopback, VF VLAN strip, VF VLAN insert.
> >>
> >> Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
> >> Signed-off-by: Chen Jing D(Mark) <jing.d.chen@intel.com>
> >> Signed-off-by: Bernard Iremonger <bernard.iremonger@intel.com>
> >> ---
> >
> > <...>
> >
> >> +#ifdef RTE_LIBRTE_IXGBE_PMD
> >
> > Within this ifdef, there is following call:
> >
> > ret = rte_pmd_ixgbe_set_all_queues_drop_en(res->port_id, is_on);
> >
> > It is not safe to make that call directly, port should be verified
> > first if it is ixgbe port, as done in other functions.
> 
> With a second thought, although this part is not correct, it is out of this
> patches scope, because it concerns ixgbe related part. So below comment also
> become invalid.
> 
> Wenzhuo,
> 
> What do you think not changing this patch, but fixing this in a separate patch?
Agree. I'll not change this patch but fix this issue with a separate patch.

> 
> Thanks,
> ferruh

^ permalink raw reply

* Re: [PATCH v1 2/2] Test cases for rte_memcmp functions
From: Wang, Zhihong @ 2017-01-11  1:28 UTC (permalink / raw)
  To: Thomas Monjalon; +Cc: Ravi Kerur, dev@dpdk.org
In-Reply-To: <3403725.gYqeoHWGyH@xps13>



> -----Original Message-----
> From: Thomas Monjalon [mailto:thomas.monjalon@6wind.com]
> Sent: Monday, January 9, 2017 7:09 PM
> To: Wang, Zhihong <zhihong.wang@intel.com>
> Cc: Ravi Kerur <rkerur@gmail.com>; dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v1 2/2] Test cases for rte_memcmp
> functions
> 
> 2017-01-09 05:29, Wang, Zhihong:
> > From: Thomas Monjalon [mailto:thomas.monjalon@6wind.com]
> > > 2016-06-07 11:09, Wang, Zhihong:
> > > > From: Ravi Kerur [mailto:rkerur@gmail.com]
> > > > > Zhilong, Thomas,
> > > > >
> > > > > If there is enough interest within DPDK community I can work on
> adding
> > > support
> > > > > for 'unaligned access' and 'test cases' for it. Please let me know either
> > > way.
> > > >
> > > > Hi Ravi,
> > > >
> > > > This rte_memcmp is proved with better performance than glibc's in
> aligned
> > > > cases, I think it has good value to DPDK lib.
> > > >
> > > > Though we don't have memcmp in critical pmd data path, it offers a
> better
> > > > choice for applications who do.
> > >
> > > Re-thinking about this series, could it be some values to have a
> rte_memcmp
> > > implementation?
> >
> > I think this series (rte_memcmp included) could help:
> >
> >  1. Potentially better performance in hot paths.
> >
> >  2. Agile for tuning.
> >
> >  3. Avoid performance complications -- unusual but possible,
> >     like the glibc memset issue I met while working on vhost
> >     enqueue.
> >
> > > What is the value compared to glibc one? Why not working on glibc?
> >
> > As to working on glibc, wider design consideration and test
> > coverage might be needed, and we'll face different release
> > cycles, can we have the same agility? Also working with old
> > glibc could be a problem.
> 
> Probably we need both: add the optimized version in DPDK while working
> on a glibc optimization.
> This strategy could be applicable to memcpy, memcmp and memset.

This does help in the long run if turned out feasible.

^ permalink raw reply

* Re: [PATCH v2 5/7] net/virtio_user: add vhost kernel support
From: Tan, Jianfeng @ 2017-01-11  2:30 UTC (permalink / raw)
  To: Jason Wang, dev; +Cc: yuanhan.liu, ferruh.yigit, cunming.liang
In-Reply-To: <46af618f-c01b-3571-78fc-12d10859a4a1@redhat.com>

Hi Jason,


On 1/9/2017 12:39 PM, Jason Wang wrote:
>> +    if (!enable) {
>> +        if (dev->tapfds[pair_idx]) {
>> +            close(dev->tapfds[pair_idx]);
>> +            dev->tapfds[pair_idx] = -1;
>> +        }
>> +        return vhost_kernel_set_backend(vhostfd, -1);
>
> If this is used to for thing like ethtool -L in guest, we should use 
> TUNSETQUEUE here.

To make it clear, why we need to ioctl(..., TUNSETQUEUE, ...) here. 
According to Linux/Documentation/networking/tuntap.txt,
     "A new ioctl(TUNSETQUEUE) were introduced to enable or disable a 
queue. When
     calling it with IFF_DETACH_QUEUE flag, the queue were disabled. And 
when
     calling it with IFF_ATTACH_QUEUE flag, the queue were enabled. The 
queue were
     enabled by default after it was created through TUNSETIFF."

As it's enabled by default, do you still see the necessity to call it 
explicitly?

Thanks,
Jianfeng

^ permalink raw reply

* [PATCH v4] mempool: use cache in single producer or consumer mode
From: Wenfeng Liu @ 2017-01-11  2:25 UTC (permalink / raw)
  To: olivier.matz, konstantin.ananyev; +Cc: dev
In-Reply-To: <1484036802-3031-1-git-send-email-liuwf@arraynetworks.com.cn>

Currently we will check mempool flags when we put/get objects from
mempool. However, this makes cache useless when mempool is SC|SP,
SC|MP, MC|SP cases.
This patch makes cache available in above cases and improves performance.

Signed-off-by: Wenfeng Liu <liuwf@arraynetworks.com.cn>
---
 lib/librte_mempool/rte_mempool.h | 26 +++++++++++---------------
 1 file changed, 11 insertions(+), 15 deletions(-)

diff --git a/lib/librte_mempool/rte_mempool.h b/lib/librte_mempool/rte_mempool.h
index d315d42..d0f5b27 100644
--- a/lib/librte_mempool/rte_mempool.h
+++ b/lib/librte_mempool/rte_mempool.h
@@ -1038,19 +1038,15 @@ static inline struct rte_mempool_cache *__attribute__((always_inline))
  */
 static inline void __attribute__((always_inline))
 __mempool_generic_put(struct rte_mempool *mp, void * const *obj_table,
-		      unsigned n, struct rte_mempool_cache *cache, int flags)
+		      unsigned n, struct rte_mempool_cache *cache)
 {
 	void **cache_objs;
 
 	/* increment stat now, adding in mempool always success */
 	__MEMPOOL_STAT_ADD(mp, put, n);
 
-	/* No cache provided or single producer */
-	if (unlikely(cache == NULL || flags & MEMPOOL_F_SP_PUT))
-		goto ring_enqueue;
-
-	/* Go straight to ring if put would overflow mem allocated for cache */
-	if (unlikely(n > RTE_MEMPOOL_CACHE_MAX_SIZE))
+	/* No cache provided or if put would overflow mem allocated for cache */
+	if (unlikely(cache == NULL || n > RTE_MEMPOOL_CACHE_MAX_SIZE))
 		goto ring_enqueue;
 
 	cache_objs = &cache->objs[cache->len];
@@ -1104,10 +1100,11 @@ static inline void __attribute__((always_inline))
  */
 static inline void __attribute__((always_inline))
 rte_mempool_generic_put(struct rte_mempool *mp, void * const *obj_table,
-			unsigned n, struct rte_mempool_cache *cache, int flags)
+			unsigned n, struct rte_mempool_cache *cache,
+			__rte_unused int flags)
 {
 	__mempool_check_cookies(mp, obj_table, n, 0);
-	__mempool_generic_put(mp, obj_table, n, cache, flags);
+	__mempool_generic_put(mp, obj_table, n, cache);
 }
 
 /**
@@ -1244,15 +1241,14 @@ static inline void __attribute__((always_inline))
  */
 static inline int __attribute__((always_inline))
 __mempool_generic_get(struct rte_mempool *mp, void **obj_table,
-		      unsigned n, struct rte_mempool_cache *cache, int flags)
+		      unsigned n, struct rte_mempool_cache *cache)
 {
 	int ret;
 	uint32_t index, len;
 	void **cache_objs;
 
-	/* No cache provided or single consumer */
-	if (unlikely(cache == NULL || flags & MEMPOOL_F_SC_GET ||
-		     n >= cache->size))
+	/* No cache provided or cannot be satisfied from cache */
+	if (unlikely(cache == NULL || n >= cache->size))
 		goto ring_dequeue;
 
 	cache_objs = cache->objs;
@@ -1326,10 +1322,10 @@ static inline int __attribute__((always_inline))
  */
 static inline int __attribute__((always_inline))
 rte_mempool_generic_get(struct rte_mempool *mp, void **obj_table, unsigned n,
-			struct rte_mempool_cache *cache, int flags)
+			struct rte_mempool_cache *cache, __rte_unused int flags)
 {
 	int ret;
-	ret = __mempool_generic_get(mp, obj_table, n, cache, flags);
+	ret = __mempool_generic_get(mp, obj_table, n, cache);
 	if (ret == 0)
 		__mempool_check_cookies(mp, obj_table, n, 1);
 	return ret;
-- 
1.8.3.1

^ permalink raw reply related

* Re: [PATCH v2 5/7] net/virtio_user: add vhost kernel support
From: Jason Wang @ 2017-01-11  2:42 UTC (permalink / raw)
  To: Tan, Jianfeng, dev; +Cc: yuanhan.liu, ferruh.yigit, cunming.liang
In-Reply-To: <4fa4d06c-d359-df12-a073-7c2c2540b634@intel.com>



On 2017年01月10日 14:11, Tan, Jianfeng wrote:
>
> Hi Jason,
>
>
> On 1/9/2017 12:39 PM, Jason Wang wrote:
>>
>>
>> On 2016年12月23日 15:14, Jianfeng Tan wrote:
>>> This patch add support vhost kernel as the backend for virtio_user.
>>> Three main hook functions are added:
>>>    - vhost_kernel_setup() to open char device, each vq pair needs one
>>>      vhostfd;
>>>    - vhost_kernel_ioctl() to communicate control messages with vhost
>>>      kernel module;
>>>    - vhost_kernel_enable_queue_pair() to open tap device and set it
>>>      as the backend of corresonding vhost fd (that is to say, vq pair).
>>>
>>> Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
>>> ---
>>>
> [...]
>>> +/* TUNSETIFF ifr flags */
>>> +#define IFF_TAP          0x0002
>>> +#define IFF_NO_PI        0x1000
>>> +#define IFF_ONE_QUEUE    0x2000
>>> +#define IFF_VNET_HDR     0x4000
>>> +#define IFF_MULTI_QUEUE  0x0100
>>> +#define IFF_ATTACH_QUEUE 0x0200
>>> +#define IFF_DETACH_QUEUE 0x0400
>>
>> Do we really want to duplicate those things which has been exposed by 
>> uapi here?
>
> You mean those defined in <linux/if_tun.h>? Redefine those common 
> macros, or include standard header file, with respective pros and 
> cons. DPDK prefers the redefinition way as far as I understand, 
> doesn't it?
>

Well, if you really want to do this, you may want to use an independent 
file. Then you can sync it with linux headers with a bash script.

>>
>>> +
>>> +/* Constants */
>>> +#define TUN_DEF_SNDBUF    (1ull << 20)
>>> +#define PATH_NET_TUN    "/dev/net/tun"
>>> +#define VHOST_KERNEL_MAX_REGIONS    64
>>
>> Unfortunate not a constant any more since c9ce42f72fd0 vhost: add 
>> max_mem_regions module parameter.
>
> Yes, I was considering to ignore this in the initial release. But it's 
> not a big effort, I'll try to fix it in latest version.
>
>>
>>> +
>>> +static uint64_t vhost_req_user_to_kernel[] = {
>>> +    [VHOST_USER_SET_OWNER] = VHOST_SET_OWNER,
>>> +    [VHOST_USER_RESET_OWNER] = VHOST_RESET_OWNER,
>>> +    [VHOST_USER_SET_FEATURES] = VHOST_SET_FEATURES,
>>> +    [VHOST_USER_GET_FEATURES] = VHOST_GET_FEATURES,
>>> +    [VHOST_USER_SET_VRING_CALL] = VHOST_SET_VRING_CALL,
>>> +    [VHOST_USER_SET_VRING_NUM] = VHOST_SET_VRING_NUM,
>>> +    [VHOST_USER_SET_VRING_BASE] = VHOST_SET_VRING_BASE,
>>> +    [VHOST_USER_GET_VRING_BASE] = VHOST_GET_VRING_BASE,
>>> +    [VHOST_USER_SET_VRING_ADDR] = VHOST_SET_VRING_ADDR,
>>> +    [VHOST_USER_SET_VRING_KICK] = VHOST_SET_VRING_KICK,
>>> +    [VHOST_USER_SET_MEM_TABLE] = VHOST_SET_MEM_TABLE,
>>> +};
>>> +
>>> +/* By default, vhost kernel module allows 64 regions, but DPDK allows
>>> + * 256 segments. As a relief, below function merges those virtually
>>> + * adjacent memsegs into one region.
>>> + */
>>> +static struct vhost_memory_kernel *
>>> +prepare_vhost_memory_kernel(void)
>>> +{
>>> +    uint32_t i, j, k = 0;
>>> +    struct rte_memseg *seg;
>>> +    struct vhost_memory_region *mr;
>>> +    struct vhost_memory_kernel *vm;
>>> +
>>> +    vm = malloc(sizeof(struct vhost_memory_kernel) +
>>> +            VHOST_KERNEL_MAX_REGIONS *
>>> +            sizeof(struct vhost_memory_region));
>>> +
>>> +    for (i = 0; i < RTE_MAX_MEMSEG; ++i) {
>>> +        seg = &rte_eal_get_configuration()->mem_config->memseg[i];
>>> +        if (!seg->addr)
>>> +            break;
>>
>> If we're sure the number of regions is less than 64(or the module 
>> parameter read from /sys), can we avoid the iteration here?
>
> The "if" statement under the "for" statement can save us from all 
> RTE_MAX_MEMSEG iteration.

But if we know the number of regions is short than the limit, there's 
even no need for this?

>
>>
>>> +
>>> +        int new_region = 1;
>>> +
>>> +        for (j = 0; j < k; ++j) {
>>> +            mr = &vm->regions[j];
>>> +
>>> +            if (mr->userspace_addr + mr->memory_size ==
>>> +                (uint64_t)seg->addr) {
>>> +                mr->memory_size += seg->len;
>>> +                new_region = 0;
>>> +                break;
>>> +            }
>>> +
>>> +            if ((uint64_t)seg->addr + seg->len ==
>>> +                mr->userspace_addr) {
>>> +                mr->guest_phys_addr = (uint64_t)seg->addr;
>>> +                mr->userspace_addr = (uint64_t)seg->addr;
>>> +                mr->memory_size += seg->len;
>>> +                new_region = 0;
>>> +                break;
>>> +            }
>>> +        }
>>> +
>>> +        if (new_region == 0)
>>> +            continue;
>>> +
>>> +        mr = &vm->regions[k++];
>>> +        mr->guest_phys_addr = (uint64_t)seg->addr; /* use vaddr 
>>> here! */
>>> +        mr->userspace_addr = (uint64_t)seg->addr;
>>> +        mr->memory_size = seg->len;
>>> +        mr->mmap_offset = 0;
>>> +
>>> +        if (k >= VHOST_KERNEL_MAX_REGIONS) {
>>> +            free(vm);
>>> +            return NULL;
>>> +        }
>>> +    }
>>> +
>>> +    vm->nregions = k;
>>> +    vm->padding = 0;
>>> +    return vm;
>>> +}
>>> +
>>> +static int
>>> +vhost_kernel_ioctl(struct virtio_user_dev *dev,
>>> +           enum vhost_user_request req,
>>> +           void *arg)
>>> +{
>>> +    int i, ret = -1;
>>> +    uint64_t req_kernel;
>>> +    struct vhost_memory_kernel *vm = NULL;
>>> +
>>> +    req_kernel = vhost_req_user_to_kernel[req];
>>> +
>>> +    if (req_kernel == VHOST_SET_MEM_TABLE) {
>>> +        vm = prepare_vhost_memory_kernel();
>>> +        if (!vm)
>>> +            return -1;
>>> +        arg = (void *)vm;
>>> +    }
>>> +
>>> +    /* Does not work when VIRTIO_F_IOMMU_PLATFORM now, why? */
>>
>> I think the reason is when VIRTIO_F_IOMMU_PLATFORM is negotiated, all 
>> address should be iova instead of gpa.
>>
>
> Yes, I agree. As we don't have to do memory protection in a single 
> process, so it's completely useless here, right?

Yes if there's no IOMMU concept in this case.

>
>>> +    if (req_kernel == VHOST_SET_FEATURES)
>>> +        *(uint64_t *)arg &= ~(1ULL << VIRTIO_F_IOMMU_PLATFORM);
>>> +
>>> +    for (i = 0; i < VHOST_KERNEL_MAX_QUEUES; ++i) {
>>> +        if (dev->vhostfds[i] < 0)
>>> +            continue;
>>> +
> [...]
>>> +    if (!enable) {
>>> +        if (dev->tapfds[pair_idx]) {
>>> +            close(dev->tapfds[pair_idx]);
>>> +            dev->tapfds[pair_idx] = -1;
>>> +        }
>>> +        return vhost_kernel_set_backend(vhostfd, -1);
>>
>> If this is used to for thing like ethtool -L in guest, we should use 
>> TUNSETQUEUE here.
>
> Oops, I was missing this ioctl operation. Let me fix it in next version.
>
>>
>>> +    } else if (dev->tapfds[pair_idx] >= 0) {
>>> +        return 0;
>>> +    }
>>> +
>>> +    if ((dev->features & (1ULL << VIRTIO_NET_F_MRG_RXBUF)) ||
>>> +        (dev->features & (1ULL << VIRTIO_F_VERSION_1)))
>>> +        hdr_size = sizeof(struct virtio_net_hdr_mrg_rxbuf);
>>> +    else
>>> +        hdr_size = sizeof(struct virtio_net_hdr);
>>> +
>>> +    /* TODO:
>>> +     * 1. verify we can get/set vnet_hdr_len, tap_probe_vnet_hdr_len
>>> +     * 2. get number of memory regions from vhost module parameter
>>> +     * max_mem_regions, supported in newer version linux kernel
>>> +     */
>>> +    tapfd = open(PATH_NET_TUN, O_RDWR);
>>> +    if (tapfd < 0) {
>>> +        PMD_DRV_LOG(ERR, "fail to open %s: %s",
>>> +                PATH_NET_TUN, strerror(errno));
>>> +        return -1;
>>> +    }
>>> +
>>> +    /* Construct ifr */
>>> +    memset(&ifr, 0, sizeof(ifr));
>>> +    ifr.ifr_flags = IFF_TAP | IFF_NO_PI;
>>> +
>>> +    if (ioctl(tapfd, TUNGETFEATURES, &features) == -1) {
>>> +        PMD_DRV_LOG(ERR, "TUNGETFEATURES failed: %s", 
>>> strerror(errno));
>>> +        goto error;
>>> +    }
>>> +    if (features & IFF_ONE_QUEUE)
>>> +        ifr.ifr_flags |= IFF_ONE_QUEUE;
>>> +
>>> +    /* Let tap instead of vhost-net handle vnet header, as the 
>>> latter does
>>> +     * not support offloading. And in this case, we should not set 
>>> feature
>>> +     * bit VHOST_NET_F_VIRTIO_NET_HDR.
>>> +     */
>>> +    if (features & IFF_VNET_HDR) {
>>> +        ifr.ifr_flags |= IFF_VNET_HDR;
>>> +    } else {
>>> +        PMD_DRV_LOG(ERR, "TAP does not support IFF_VNET_HDR");
>>> +        goto error;
>>> +    }
>>> +
>>> +    if (dev->ifname)
>>> +        strncpy(ifr.ifr_name, dev->ifname, IFNAMSIZ);
>>> +    else
>>> +        strncpy(ifr.ifr_name, "tap%d", IFNAMSIZ);
>>> +    if (ioctl(tapfd, TUNSETIFF, (void *)&ifr) == -1) {
>>> +        PMD_DRV_LOG(ERR, "TUNSETIFF failed: %s", strerror(errno));
>>> +        goto error;
>>> +    }
>>
>> This requires CAP_NET_ADMIN, so we should really consider to accept a 
>> pre-created fd here.
>
> It sounds like a future work for me. So far, all DPDK apps are running 
> in privileged mode (including CAP_NET_ADMIN?).
>

That's not safe. Accepting a per-created fd can solve this, and can even 
support macvtap.

>>
>>> +
>>> +    fcntl(tapfd, F_SETFL, O_NONBLOCK);
>>> +
>>> +    if (ioctl(tapfd, TUNSETVNETHDRSZ, &hdr_size) < 0) {
>>> +        PMD_DRV_LOG(ERR, "TUNSETVNETHDRSZ failed: %s", 
>>> strerror(errno));
>>> +        goto error;
>>> +    }
>>> +
>>> +    if (ioctl(tapfd, TUNSETSNDBUF, &sndbuf) < 0) {
>>> +        PMD_DRV_LOG(ERR, "TUNSETSNDBUF failed: %s", strerror(errno));
>>> +        goto error;
>>> +    }
>>
>> Let's use INT_MAX as default here to survive from evil consumer here.
>
> Oh yes, I will fix it.
>
> Thanks,
> Jianfeng

^ permalink raw reply

* Re: [PATCH v2 5/7] net/virtio_user: add vhost kernel support
From: Jason Wang @ 2017-01-11  2:45 UTC (permalink / raw)
  To: Tan, Jianfeng, dev; +Cc: yuanhan.liu, ferruh.yigit, cunming.liang
In-Reply-To: <271f1854-8cd8-8671-95ed-79cfa52d8ad7@intel.com>



On 2017年01月11日 10:30, Tan, Jianfeng wrote:
>
> Hi Jason,
>
>
> On 1/9/2017 12:39 PM, Jason Wang wrote:
>>> +    if (!enable) {
>>> +        if (dev->tapfds[pair_idx]) {
>>> +            close(dev->tapfds[pair_idx]);
>>> +            dev->tapfds[pair_idx] = -1;
>>> +        }
>>> +        return vhost_kernel_set_backend(vhostfd, -1);
>>
>> If this is used to for thing like ethtool -L in guest, we should use 
>> TUNSETQUEUE here.
>
> To make it clear, why we need to ioctl(..., TUNSETQUEUE, ...) here. 
> According to Linux/Documentation/networking/tuntap.txt,
>     "A new ioctl(TUNSETQUEUE) were introduced to enable or disable a 
> queue. When
>     calling it with IFF_DETACH_QUEUE flag, the queue were disabled. 
> And when
>     calling it with IFF_ATTACH_QUEUE flag, the queue were enabled. The 
> queue were
>     enabled by default after it was created through TUNSETIFF."
>
> As it's enabled by default, do you still see the necessity to call it 
> explicitly?

If you want to keep it enabled, no need. But if you want to disable one 
specific queue (which I believe is the case of !enable?), you need to 
call it.

Thanks

>
> Thanks,
> Jianfeng

^ permalink raw reply

* [PATCH] app/testpmd: fix ixgbe private API calling
From: Wenzhuo Lu @ 2017-01-11  2:47 UTC (permalink / raw)
  To: dev; +Cc: jingjing.wu, Wenzhuo Lu, stable

Some ixgbe private APIs are added to expose ixgbe
specific functions.
When they're used by testpmd, there's no check for
if the NICs are ixgbe. Other NICs also have chance
to  call these APIs.
This patch add the check and the feedback print.

Fixes: 425781ff5afe ("app/testpmd: add ixgbe VF management")

CC: stable@dpdk.org
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
---
 app/test-pmd/cmdline.c | 101 +++++++++++++++++++++++++++++++++++++++----------
 1 file changed, 81 insertions(+), 20 deletions(-)

diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index f768b6b..172861a 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -10883,11 +10883,16 @@ struct cmd_vf_vlan_anti_spoof_result {
 	__attribute__((unused)) void *data)
 {
 	struct cmd_vf_vlan_anti_spoof_result *res = parsed_result;
-	int ret = 0;
+	int ret = -ENOTSUP;
 	int is_on = (strcmp(res->on_off, "on") == 0) ? 1 : 0;
+	struct rte_eth_dev_info dev_info;
 
-	ret = rte_pmd_ixgbe_set_vf_vlan_anti_spoof(res->port_id, res->vf_id,
-			is_on);
+	rte_eth_dev_info_get(res->port_id, &dev_info);
+
+	if (strstr(dev_info.driver_name, "ixgbe") != NULL)
+		ret = rte_pmd_ixgbe_set_vf_vlan_anti_spoof(res->port_id,
+							   res->vf_id,
+							   is_on);
 	switch (ret) {
 	case 0:
 		break;
@@ -10897,6 +10902,9 @@ struct cmd_vf_vlan_anti_spoof_result {
 	case -ENODEV:
 		printf("invalid port_id %d\n", res->port_id);
 		break;
+	case -ENOTSUP:
+		printf("function not implemented\n");
+		break;
 	default:
 		printf("programming error: (%s)\n", strerror(-ret));
 	}
@@ -10968,11 +10976,16 @@ struct cmd_vf_mac_anti_spoof_result {
 	__attribute__((unused)) void *data)
 {
 	struct cmd_vf_mac_anti_spoof_result *res = parsed_result;
-	int ret;
+	int ret = -ENOTSUP;
 	int is_on = (strcmp(res->on_off, "on") == 0) ? 1 : 0;
+	struct rte_eth_dev_info dev_info;
 
-	ret = rte_pmd_ixgbe_set_vf_mac_anti_spoof(res->port_id, res->vf_id,
-			is_on);
+	rte_eth_dev_info_get(res->port_id, &dev_info);
+
+	if (strstr(dev_info.driver_name, "ixgbe") != NULL)
+		ret = rte_pmd_ixgbe_set_vf_mac_anti_spoof(res->port_id,
+							  res->vf_id,
+							  is_on);
 	switch (ret) {
 	case 0:
 		break;
@@ -10982,6 +10995,9 @@ struct cmd_vf_mac_anti_spoof_result {
 	case -ENODEV:
 		printf("invalid port_id %d\n", res->port_id);
 		break;
+	case -ENOTSUP:
+		printf("function not implemented\n");
+		break;
 	default:
 		printf("programming error: (%s)\n", strerror(-ret));
 	}
@@ -11053,10 +11069,15 @@ struct cmd_vf_vlan_stripq_result {
 	__attribute__((unused)) void *data)
 {
 	struct cmd_vf_vlan_stripq_result *res = parsed_result;
-	int ret = 0;
+	int ret = -ENOTSUP;
 	int is_on = (strcmp(res->on_off, "on") == 0) ? 1 : 0;
+	struct rte_eth_dev_info dev_info;
 
-	ret = rte_pmd_ixgbe_set_vf_vlan_stripq(res->port_id, res->vf_id, is_on);
+	rte_eth_dev_info_get(res->port_id, &dev_info);
+
+	if (strstr(dev_info.driver_name, "ixgbe") != NULL)
+		ret = rte_pmd_ixgbe_set_vf_vlan_stripq(res->port_id, res->vf_id,
+						       is_on);
 	switch (ret) {
 	case 0:
 		break;
@@ -11066,6 +11087,9 @@ struct cmd_vf_vlan_stripq_result {
 	case -ENODEV:
 		printf("invalid port_id %d\n", res->port_id);
 		break;
+	case -ENOTSUP:
+		printf("function not implemented\n");
+		break;
 	default:
 		printf("programming error: (%s)\n", strerror(-ret));
 	}
@@ -11137,9 +11161,14 @@ struct cmd_vf_vlan_insert_result {
 	__attribute__((unused)) void *data)
 {
 	struct cmd_vf_vlan_insert_result *res = parsed_result;
-	int ret;
+	int ret = -ENOTSUP;
+	struct rte_eth_dev_info dev_info;
 
-	ret = rte_pmd_ixgbe_set_vf_vlan_insert(res->port_id, res->vf_id, res->vlan_id);
+	rte_eth_dev_info_get(res->port_id, &dev_info);
+
+	if (strstr(dev_info.driver_name, "ixgbe") != NULL)
+		ret = rte_pmd_ixgbe_set_vf_vlan_insert(res->port_id, res->vf_id,
+						       res->vlan_id);
 	switch (ret) {
 	case 0:
 		break;
@@ -11149,6 +11178,9 @@ struct cmd_vf_vlan_insert_result {
 	case -ENODEV:
 		printf("invalid port_id %d\n", res->port_id);
 		break;
+	case -ENOTSUP:
+		printf("function not implemented\n");
+		break;
 	default:
 		printf("programming error: (%s)\n", strerror(-ret));
 	}
@@ -11210,10 +11242,14 @@ struct cmd_tx_loopback_result {
 	__attribute__((unused)) void *data)
 {
 	struct cmd_tx_loopback_result *res = parsed_result;
-	int ret;
+	int ret = -ENOTSUP;
 	int is_on = (strcmp(res->on_off, "on") == 0) ? 1 : 0;
+	struct rte_eth_dev_info dev_info;
 
-	ret = rte_pmd_ixgbe_set_tx_loopback(res->port_id, is_on);
+	rte_eth_dev_info_get(res->port_id, &dev_info);
+
+	if (strstr(dev_info.driver_name, "ixgbe") != NULL)
+		ret = rte_pmd_ixgbe_set_tx_loopback(res->port_id, is_on);
 	switch (ret) {
 	case 0:
 		break;
@@ -11223,6 +11259,9 @@ struct cmd_tx_loopback_result {
 	case -ENODEV:
 		printf("invalid port_id %d\n", res->port_id);
 		break;
+	case -ENOTSUP:
+		printf("function not implemented\n");
+		break;
 	default:
 		printf("programming error: (%s)\n", strerror(-ret));
 	}
@@ -11287,10 +11326,14 @@ struct cmd_all_queues_drop_en_result {
 	__attribute__((unused)) void *data)
 {
 	struct cmd_all_queues_drop_en_result *res = parsed_result;
-	int ret = 0;
+	int ret = -ENOTSUP;
 	int is_on = (strcmp(res->on_off, "on") == 0) ? 1 : 0;
+	struct rte_eth_dev_info dev_info;
 
-	ret = rte_pmd_ixgbe_set_all_queues_drop_en(res->port_id, is_on);
+	rte_eth_dev_info_get(res->port_id, &dev_info);
+
+	if (strstr(dev_info.driver_name, "ixgbe") != NULL)
+		ret = rte_pmd_ixgbe_set_all_queues_drop_en(res->port_id, is_on);
 	switch (ret) {
 	case 0:
 		break;
@@ -11300,6 +11343,9 @@ struct cmd_all_queues_drop_en_result {
 	case -ENODEV:
 		printf("invalid port_id %d\n", res->port_id);
 		break;
+	case -ENOTSUP:
+		printf("function not implemented\n");
+		break;
 	default:
 		printf("programming error: (%s)\n", strerror(-ret));
 	}
@@ -11370,11 +11416,16 @@ struct cmd_vf_split_drop_en_result {
 	__attribute__((unused)) void *data)
 {
 	struct cmd_vf_split_drop_en_result *res = parsed_result;
-	int ret;
+	int ret = -ENOTSUP;
 	int is_on = (strcmp(res->on_off, "on") == 0) ? 1 : 0;
+	struct rte_eth_dev_info dev_info;
 
-	ret = rte_pmd_ixgbe_set_vf_split_drop_en(res->port_id, res->vf_id,
-			is_on);
+	rte_eth_dev_info_get(res->port_id, &dev_info);
+
+	if (strstr(dev_info.driver_name, "ixgbe") != NULL)
+		ret = rte_pmd_ixgbe_set_vf_split_drop_en(res->port_id,
+							 res->vf_id,
+							 is_on);
 	switch (ret) {
 	case 0:
 		break;
@@ -11384,6 +11435,9 @@ struct cmd_vf_split_drop_en_result {
 	case -ENODEV:
 		printf("invalid port_id %d\n", res->port_id);
 		break;
+	case -ENOTSUP:
+		printf("function not implemented\n");
+		break;
 	default:
 		printf("programming error: (%s)\n", strerror(-ret));
 	}
@@ -11455,10 +11509,14 @@ struct cmd_set_vf_mac_addr_result {
 	__attribute__((unused)) void *data)
 {
 	struct cmd_set_vf_mac_addr_result *res = parsed_result;
-	int ret;
+	int ret = -ENOTSUP;
+	struct rte_eth_dev_info dev_info;
 
-	ret = rte_pmd_ixgbe_set_vf_mac_addr(res->port_id, res->vf_id,
-			&res->mac_addr);
+	rte_eth_dev_info_get(res->port_id, &dev_info);
+
+	if (strstr(dev_info.driver_name, "ixgbe") != NULL)
+		ret = rte_pmd_ixgbe_set_vf_mac_addr(res->port_id, res->vf_id,
+						    &res->mac_addr);
 	switch (ret) {
 	case 0:
 		break;
@@ -11468,6 +11526,9 @@ struct cmd_set_vf_mac_addr_result {
 	case -ENODEV:
 		printf("invalid port_id %d\n", res->port_id);
 		break;
+	case -ENOTSUP:
+		printf("function not implemented\n");
+		break;
 	default:
 		printf("programming error: (%s)\n", strerror(-ret));
 	}
-- 
1.9.3

^ permalink raw reply related

* Re: [PATCH v8 00/25] Support VFD on i40e
From: JOSHI, KAUSTUBH  (KAUSTUBH) @ 2017-01-11  2:59 UTC (permalink / raw)
  To: Vincent Jardin
  Cc: Zhang, Helin, Lu, Wenzhuo, dev@dpdk.org,
	DANIELS, EDWARD S (EDWARD), ZELEZNIAK, ALEX
In-Reply-To: <1598a0c7f80.27fc.bb328046f2889bc8f44aafa891a44dd2@6wind.com>

Hi Vincent, 

Greetings! Jumping into this debate a bit late, but let me share our point of view based on how we are using this code within AT&T for our NFV cloud. 

Actually, we first started with trying to do the configuration within the kernel drivers as you suggest, but quickly realized that besides the practical problem of kernel upstreaming being a much more arduous road (which can be overcome), the bigger problem was that there is no standardization in the configuration interfaces for the NICs in the kernel community. So different drivers do things differently and expose different settings, and no forum exists to drive towards such standardization. This was leading to vendors have to maintain patched versions of drivers for doing PF configuration, which is not a desirable situation. 

So, to build a portable (to multiple NICs) SRIOV VF manager like VFd, DPDK seemed like a good a forum with some hope for driving towards a standard set of interfaces and without having to worry about a lot of legacy baggage and old hardware. Especially since DPDK already takes on the role of configuring NICs for the data plane functions anyway - both PF and VF drivers will have to be included for data plane usage anyway - we viewed that adding VF config options will not cause any forking, but simply flush out the DPDK drivers and their interfaces to be more complete. These APIs could be optional, so new vendors aren’t obligated to add them. 

Furthermore, allowing VF config using the DPDK PF driver also has the side benefit of allowing a complete SRIOV system (both VF and PF) to be built entirely with DPDK, also making version alignment easier.

We started with Niantic, which already had PF and VF drivers, and things have worked out very well with it. However, we would like VFd to be a multi-NIC vendor agnostic VF management tool, which is why we’ve been asking for making the PF config APIs richer.    

Regards

KJ


> On Jan 10, 2017, at 3:23 PM, Vincent Jardin <vincent.jardin@6wind.com> wrote:
> 
> Nope. First one needs to assess if DPDK should be intensively used to become a PF knowing Linux can do the jobs. Linux kernel community does not like the forking of Kernel drivers, I tend to agree that we should not keep duplicating options that can be solved with the Linux kernel.
> 
> Best regards,
> Vincent
> 
> 


^ permalink raw reply

* Re: i40e_aq_get_phy_capabilities() fails when using SFP+ with no link
From: Christos Ricudis @ 2017-01-11  3:00 UTC (permalink / raw)
  To: Zhang, Helin; +Cc: Olivier MATZ, Rowden, Aaron F, dev@dpdk.org, Wu, Jingjing
In-Reply-To: <F35DEAC7BCE34641BA9FAC6BCA4A12E71A98F8FE@SHSMSX103.ccr.corp.intel.com>

Hi Helin, 

> On 11 Jan 2017, at 09:08, Zhang, Helin <helin.zhang@intel.com> wrote:
> 
> Hi Aaron
> 
> Is the SFP+ (Finisar FTLX8571D3BCL) supported and validated by Intel?
> It seems there is some PHY issue in this case.

As the original reporter of this issue, I will test with validated SFP+s and will report on my testing. 

Shouldn’t unsupported SFP+s be blacklisted in the I40E driver? 

Best regards, 
Christos Ricudis

> 
> Regards,
> Helin
> 
> -----Original Message-----
> From: Olivier MATZ [mailto:olivier.matz@6wind.com] 
> Sent: Tuesday, January 10, 2017 11:29 PM
> To: Christos Ricudis <ricudis.christos@gmail.com>
> Cc: dev@dpdk.org; Zhang, Helin <helin.zhang@intel.com>; Wu, Jingjing <jingjing.wu@intel.com>
> Subject: Re: [dpdk-dev] i40e_aq_get_phy_capabilities() fails when using SFP+ with no link
> 
> Hi Christos,
> 
> +CC i40e maintainers
> 
> On Tue, 10 Jan 2017 20:32:26 +0800, Christos Ricudis <ricudis.christos@gmail.com> wrote:
>> Hello,
>> 
>> Using a X710 based 4-port 4x10Gbit NIC, I have came across the 
>> following issue on the i40e PMD:
>> 
>> When an optical SFP+ (Finisar FTLX8571D3BCL) is used with no active 
>> link partner on the other end of the link (or fiber completely 
>> disconnected from the SFP+), i40e_aq_get_phy_capabilities() (called by 
>> i40e_dev_sync_phy_type() on port initialization), fails with a
>> 0x05 return value (EIO) on the AQ response structure. The struct 
>> i40e_aq_get_phy_abilities_resp buffer passed to the Get Phy Abilities 
>> command is unmodified upon return.
>> 
>> This prevents DPDK 16.11 from initializing the port, and ultimately 
>> fails with the following error:
>> 
>> PMD: eth_i40e_dev_init(): Failed to sync phy type: -95
>> 
>> The change introducing this issue was
>> http://dpdk.org/ml/archives/dev/2016-September/047663.html
>> 
>> Reading the X710 datasheet, I notice that no specific mention is given 
>> on the meaning of EIO as a response to Get PHY Abilities command 
>> (opcode 0x0600), whereas in most other commands, an explicit mention 
>> of the meaning of the possible error status responses is given.
>> 
>> This behaviour is the same across the following NVM releases: 
>> 
>> FW 4.33 API 1.2 NVM 04.04.02 eetrack 800018a6 FW 4.40 API 1.4 NVM 
>> 04.05.03 eetrack 80001cd8 FW 5.0 API 1.5 NVM 05.00.04 eetrack 800024da
>> 
>> I will try to get around the issue by falling back to PHY capabilities 
>> detection using the device ID in the case
>> i40e_aq_get_phy_capabilities() fails, but conceptually the 
>> capabilities of the PHY should not be dependent on whether PHY detects 
>> an active link or not.
>> 
>> I’d be happy to do more testing on this issue per your 
>> recommendations.
>> 
>> Moreover, while trying to debug this issue, I managed to get both 3 
>> NIC adapters on my test system on a state where the PHY has apparently 
>> died - no link indication at all on any ports. A reboot solved this, 
>> and I am now trying to replicate this behaviour under more controlled 
>> conditions.
>> 
> 
> I'm currently running into a similar issue (I think). I can reproduce it with testpmd with the following case:
> 
>  set link_check off
>  port stop 0
>  # don't wait between these 2 commands
>  port start 0
> 
> I added some logs that are displayed after the port start:
> 
>  PMD: i40e_set_tx_function(): Vector tx finally be used.
>  PMD: i40e_set_rx_function(): Vector rx enabled, please make sure RX
>    burst size no less than 4 (port=0).
>  PMD: i40e_dev_rx_queue_start():  >>
>  PMD: i40e_dev_tx_queue_start():  >>
>  PMD: i40e_dev_start(): applying link settings...
>  PMD: i40e_apply_link_speed(): abilities = 38, speed = 0
>  PMD: i40e_phy_conf_link(): i40e_aq_get_phy_capabilities failed -7
>  PMD: i40e_dev_start(): Fail to apply link setting
>  PMD: i40e_dev_clear_queues():  >>
> 
> The -7 corresponds to I40E_ERR_UNKNOWN_PHY. This happens in
> i40e_aq_get_phy_capabilities() in the following code, which makes me think it's the same problem than yours:
> 
> 	if (hw->aq.asq_last_status == I40E_AQ_RC_EIO)
> 		status = I40E_ERR_UNKNOWN_PHY;
> 
> A workaround in my usecase is to wait a bit between the stop and the start.
> 
> Any help is welcome.
> 
> Regards,
> Olivier
> 
> 

^ permalink raw reply

* Re: [PATCH v2 5/7] net/virtio_user: add vhost kernel support
From: Tan, Jianfeng @ 2017-01-11  3:13 UTC (permalink / raw)
  To: Jason Wang, dev; +Cc: yuanhan.liu, ferruh.yigit, cunming.liang
In-Reply-To: <fd33abf3-3daf-4963-5ab3-018ebe6c4a67@redhat.com>



On 1/11/2017 10:42 AM, Jason Wang wrote:
>
>
> On 2017年01月10日 14:11, Tan, Jianfeng wrote:
>>
>> Hi Jason,
>>
>>
>> On 1/9/2017 12:39 PM, Jason Wang wrote:
>>>
>>>
>>> On 2016年12月23日 15:14, Jianfeng Tan wrote:
>>>> This patch add support vhost kernel as the backend for virtio_user.
>>>> Three main hook functions are added:
>>>>    - vhost_kernel_setup() to open char device, each vq pair needs one
>>>>      vhostfd;
>>>>    - vhost_kernel_ioctl() to communicate control messages with vhost
>>>>      kernel module;
>>>>    - vhost_kernel_enable_queue_pair() to open tap device and set it
>>>>      as the backend of corresonding vhost fd (that is to say, vq 
>>>> pair).
>>>>
>>>> Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
>>>> ---
>>>>
>> [...]
>>>> +/* TUNSETIFF ifr flags */
>>>> +#define IFF_TAP          0x0002
>>>> +#define IFF_NO_PI        0x1000
>>>> +#define IFF_ONE_QUEUE    0x2000
>>>> +#define IFF_VNET_HDR     0x4000
>>>> +#define IFF_MULTI_QUEUE  0x0100
>>>> +#define IFF_ATTACH_QUEUE 0x0200
>>>> +#define IFF_DETACH_QUEUE 0x0400
>>>
>>> Do we really want to duplicate those things which has been exposed 
>>> by uapi here?
>>
>> You mean those defined in <linux/if_tun.h>? Redefine those common 
>> macros, or include standard header file, with respective pros and 
>> cons. DPDK prefers the redefinition way as far as I understand, 
>> doesn't it?
>>
>
> Well, if you really want to do this, you may want to use an 
> independent file. Then you can sync it with linux headers with a bash 
> script.

Agreed!

[...]
>>>> +static struct vhost_memory_kernel *
>>>> +prepare_vhost_memory_kernel(void)
>>>> +{
>>>> +    uint32_t i, j, k = 0;
>>>> +    struct rte_memseg *seg;
>>>> +    struct vhost_memory_region *mr;
>>>> +    struct vhost_memory_kernel *vm;
>>>> +
>>>> +    vm = malloc(sizeof(struct vhost_memory_kernel) +
>>>> +            VHOST_KERNEL_MAX_REGIONS *
>>>> +            sizeof(struct vhost_memory_region));
>>>> +
>>>> +    for (i = 0; i < RTE_MAX_MEMSEG; ++i) {
>>>> +        seg = &rte_eal_get_configuration()->mem_config->memseg[i];
>>>> +        if (!seg->addr)
>>>> +            break;
>>>
>>> If we're sure the number of regions is less than 64(or the module 
>>> parameter read from /sys), can we avoid the iteration here?
>>
>> The "if" statement under the "for" statement can save us from all 
>> RTE_MAX_MEMSEG iteration.
>
> But if we know the number of regions is short than the limit, there's 
> even no need for this?

The problem is: we don't have a variable to know how many segments are 
there in DPDK. Anyway, we need to report error in the situation that # 
of segments > # of regions. Or can you give a more specific example?

[...]
>>>> +    if (dev->ifname)
>>>> +        strncpy(ifr.ifr_name, dev->ifname, IFNAMSIZ);
>>>> +    else
>>>> +        strncpy(ifr.ifr_name, "tap%d", IFNAMSIZ);
>>>> +    if (ioctl(tapfd, TUNSETIFF, (void *)&ifr) == -1) {
>>>> +        PMD_DRV_LOG(ERR, "TUNSETIFF failed: %s", strerror(errno));
>>>> +        goto error;
>>>> +    }
>>>
>>> This requires CAP_NET_ADMIN, so we should really consider to accept 
>>> a pre-created fd here.
>>
>> It sounds like a future work for me. So far, all DPDK apps are 
>> running in privileged mode (including CAP_NET_ADMIN?).
>>
>
> That's not safe. 

Yes.

> Accepting a per-created fd can solve this, and can even support macvtap

So you are proposing a way to specify the virtio_user parameter like, 
--vdev=virtio_user,tapfd=fd1,xxx, right? As replied in your another 
comment, it's a great way to go. But I'm afraid we have no time to 
implement that in this cycle.

Thanks,
Jianfeng

^ permalink raw reply

* Re: [PATCH v2 5/7] net/virtio_user: add vhost kernel support
From: Jason Wang @ 2017-01-11  3:23 UTC (permalink / raw)
  To: Tan, Jianfeng, dev; +Cc: yuanhan.liu, ferruh.yigit, cunming.liang
In-Reply-To: <b82e84e0-7412-6928-529e-51072491af4a@intel.com>



On 2017年01月11日 11:13, Tan, Jianfeng wrote:
>
>
> On 1/11/2017 10:42 AM, Jason Wang wrote:
>>
>>
>> On 2017年01月10日 14:11, Tan, Jianfeng wrote:
>>>
>>> Hi Jason,
>>>
>>>
>>> On 1/9/2017 12:39 PM, Jason Wang wrote:
>>>>
>>>>
>>>> On 2016年12月23日 15:14, Jianfeng Tan wrote:
>>>>> This patch add support vhost kernel as the backend for virtio_user.
>>>>> Three main hook functions are added:
>>>>>    - vhost_kernel_setup() to open char device, each vq pair needs one
>>>>>      vhostfd;
>>>>>    - vhost_kernel_ioctl() to communicate control messages with vhost
>>>>>      kernel module;
>>>>>    - vhost_kernel_enable_queue_pair() to open tap device and set it
>>>>>      as the backend of corresonding vhost fd (that is to say, vq 
>>>>> pair).
>>>>>
>>>>> Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
>>>>> ---
>>>>>
>>> [...]
>>>>> +/* TUNSETIFF ifr flags */
>>>>> +#define IFF_TAP          0x0002
>>>>> +#define IFF_NO_PI        0x1000
>>>>> +#define IFF_ONE_QUEUE    0x2000
>>>>> +#define IFF_VNET_HDR     0x4000
>>>>> +#define IFF_MULTI_QUEUE  0x0100
>>>>> +#define IFF_ATTACH_QUEUE 0x0200
>>>>> +#define IFF_DETACH_QUEUE 0x0400
>>>>
>>>> Do we really want to duplicate those things which has been exposed 
>>>> by uapi here?
>>>
>>> You mean those defined in <linux/if_tun.h>? Redefine those common 
>>> macros, or include standard header file, with respective pros and 
>>> cons. DPDK prefers the redefinition way as far as I understand, 
>>> doesn't it?
>>>
>>
>> Well, if you really want to do this, you may want to use an 
>> independent file. Then you can sync it with linux headers with a bash 
>> script.
>
> Agreed!
>
> [...]
>>>>> +static struct vhost_memory_kernel *
>>>>> +prepare_vhost_memory_kernel(void)
>>>>> +{
>>>>> +    uint32_t i, j, k = 0;
>>>>> +    struct rte_memseg *seg;
>>>>> +    struct vhost_memory_region *mr;
>>>>> +    struct vhost_memory_kernel *vm;
>>>>> +
>>>>> +    vm = malloc(sizeof(struct vhost_memory_kernel) +
>>>>> +            VHOST_KERNEL_MAX_REGIONS *
>>>>> +            sizeof(struct vhost_memory_region));
>>>>> +
>>>>> +    for (i = 0; i < RTE_MAX_MEMSEG; ++i) {
>>>>> +        seg = &rte_eal_get_configuration()->mem_config->memseg[i];
>>>>> +        if (!seg->addr)
>>>>> +            break;
>>>>
>>>> If we're sure the number of regions is less than 64(or the module 
>>>> parameter read from /sys), can we avoid the iteration here?
>>>
>>> The "if" statement under the "for" statement can save us from all 
>>> RTE_MAX_MEMSEG iteration.
>>
>> But if we know the number of regions is short than the limit, there's 
>> even no need for this?
>
> The problem is: we don't have a variable to know how many segments are 
> there in DPDK. Anyway, we need to report error in the situation that # 
> of segments > # of regions. Or can you give a more specific example?
>
> [...]

If we don't know #segments, I'm fine with current code.

>>>>> +    if (dev->ifname)
>>>>> +        strncpy(ifr.ifr_name, dev->ifname, IFNAMSIZ);
>>>>> +    else
>>>>> +        strncpy(ifr.ifr_name, "tap%d", IFNAMSIZ);
>>>>> +    if (ioctl(tapfd, TUNSETIFF, (void *)&ifr) == -1) {
>>>>> +        PMD_DRV_LOG(ERR, "TUNSETIFF failed: %s", strerror(errno));
>>>>> +        goto error;
>>>>> +    }
>>>>
>>>> This requires CAP_NET_ADMIN, so we should really consider to accept 
>>>> a pre-created fd here.
>>>
>>> It sounds like a future work for me. So far, all DPDK apps are 
>>> running in privileged mode (including CAP_NET_ADMIN?).
>>>
>>
>> That's not safe. 
>
> Yes.
>
>> Accepting a per-created fd can solve this, and can even support macvtap
>
> So you are proposing a way to specify the virtio_user parameter like, 
> --vdev=virtio_user,tapfd=fd1,xxx, right? As replied in your another 
> comment, it's a great way to go. But I'm afraid we have no time to 
> implement that in this cycle.
>
> Thanks,
> Jianfeng

Ok, just want to show its advantages. It can be added on top.

And two more suggestions:
- better to split tap support out of vhost file
- kernel support more than 8 queues on recent kernel, so there's no need 
to limit it to 8. When running on old kernel, TUNSETIFF will fail which 
can give user a hint to reduce #queues.

Thanks

^ permalink raw reply

* [PATCH] kni: use bulk functions to allocate and free mbufs
From: Sergey Vyazmitinov @ 2016-12-29 21:50 UTC (permalink / raw)
  To: olivier.matz; +Cc: ferruh.yigit, dev, Sergey Vyazmitinov

Optimized kni_allocate_mbufs and kni_free_mbufs by using mbuf bulk
functions. This can improve performance more than two times.

Signed-off-by: Sergey Vyazmitinov <s.vyazmitinov@brain4net.com>
---
 lib/librte_kni/rte_kni.c         | 44 +++++++++++++++++-----------------------
 lib/librte_kni/rte_kni_fifo.h    | 18 ++++++++++++++++
 lib/librte_mbuf/rte_mbuf.h       | 32 +++++++++++++++++++++++++++++
 lib/librte_mempool/rte_mempool.h |  6 ++++++
 4 files changed, 75 insertions(+), 25 deletions(-)

diff --git a/lib/librte_kni/rte_kni.c b/lib/librte_kni/rte_kni.c
index a80cefd..cb4cfa6 100644
--- a/lib/librte_kni/rte_kni.c
+++ b/lib/librte_kni/rte_kni.c
@@ -590,22 +590,21 @@ rte_kni_rx_burst(struct rte_kni *kni, struct rte_mbuf **mbufs, unsigned num)
 static void
 kni_free_mbufs(struct rte_kni *kni)
 {
-	int i, ret;
+	unsigned freeing;
 	struct rte_mbuf *pkts[MAX_MBUF_BURST_NUM];
 
-	ret = kni_fifo_get(kni->free_q, (void **)pkts, MAX_MBUF_BURST_NUM);
-	if (likely(ret > 0)) {
-		for (i = 0; i < ret; i++)
-			rte_pktmbuf_free(pkts[i]);
+	freeing = kni_fifo_get(kni->free_q, (void **)pkts, MAX_MBUF_BURST_NUM);
+	if (likely(freeing > 0)) {
+		rte_pktmbuf_free_bulk(kni->pktmbuf_pool, pkts, freeing);
 	}
 }
 
 static void
 kni_allocate_mbufs(struct rte_kni *kni)
 {
-	int i, ret;
-	struct rte_mbuf *pkts[MAX_MBUF_BURST_NUM];
-	void *phys[MAX_MBUF_BURST_NUM];
+	unsigned count, allocated, put;
+	struct rte_mbuf *pkts[KNI_FIFO_COUNT_MAX];
+	void *phys[KNI_FIFO_COUNT_MAX];
 
 	RTE_BUILD_BUG_ON(offsetof(struct rte_mbuf, pool) !=
 			 offsetof(struct rte_kni_mbuf, pool));
@@ -628,28 +627,23 @@ kni_allocate_mbufs(struct rte_kni *kni)
 		return;
 	}
 
-	for (i = 0; i < MAX_MBUF_BURST_NUM; i++) {
-		pkts[i] = rte_pktmbuf_alloc(kni->pktmbuf_pool);
-		if (unlikely(pkts[i] == NULL)) {
-			/* Out of memory */
-			RTE_LOG(ERR, KNI, "Out of memory\n");
-			break;
-		}
-		phys[i] = va2pa(pkts[i]);
-	}
+	/* Calculate alloc queue free space */
+	count = kni_fifo_free_count(kni->alloc_q);
 
-	/* No pkt mbuf alocated */
-	if (i <= 0)
-		return;
+	/* Get buffers from mempool */
+	allocated = rte_pktmbuf_alloc_bulk(kni->pktmbuf_pool, pkts, count);
+	for (unsigned i = 0; i < allocated; i++)
+		phys[i] = va2pa(pkts[i]);
 
-	ret = kni_fifo_put(kni->alloc_q, phys, i);
+	/* Put buffers into alloc queue */
+	put = kni_fifo_put(kni->alloc_q, (void **)phys, allocated);
 
 	/* Check if any mbufs not put into alloc_q, and then free them */
-	if (ret >= 0 && ret < i && ret < MAX_MBUF_BURST_NUM) {
-		int j;
-
-		for (j = ret; j < i; j++)
+	if (unlikely(put < allocated)) {
+		for (unsigned j = put; j < allocated; j++) {
+			RTE_LOG(ERR, KNI, "Free allocated buffer\n");
 			rte_pktmbuf_free(pkts[j]);
+		}
 	}
 }
 
diff --git a/lib/librte_kni/rte_kni_fifo.h b/lib/librte_kni/rte_kni_fifo.h
index 8cb8587..361ddb0 100644
--- a/lib/librte_kni/rte_kni_fifo.h
+++ b/lib/librte_kni/rte_kni_fifo.h
@@ -91,3 +91,21 @@ kni_fifo_get(struct rte_kni_fifo *fifo, void **data, unsigned num)
 	fifo->read = new_read;
 	return i;
 }
+
+/**
+ * Get the num of elements in the fifo
+ */
+static inline unsigned
+kni_fifo_count(struct rte_kni_fifo *fifo)
+{
+	return (fifo->len + fifo->write - fifo->read) & (fifo->len - 1);
+}
+
+/**
+ * Get the num of available elements in the fifo
+ */
+static inline unsigned
+kni_fifo_free_count(struct rte_kni_fifo *fifo)
+{
+	return (fifo->read - fifo->write - 1) & (fifo->len - 1);
+}
diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
index 4476d75..707c300 100644
--- a/lib/librte_mbuf/rte_mbuf.h
+++ b/lib/librte_mbuf/rte_mbuf.h
@@ -1261,6 +1261,38 @@ static inline void rte_pktmbuf_free(struct rte_mbuf *m)
 }
 
 /**
+ * Free n packets mbuf back into its original mempool.
+ *
+ * Free each mbuf, and all its segments in case of chained buffers. Each
+ * segment is added back into its original mempool.
+ *
+ * @param mp
+ *   The packets mempool.
+ * @param mbufs
+ *   The packets mbufs array to be freed.
+ * @param n
+ *   Number of packets.
+ */
+static inline void rte_pktmbuf_free_bulk(struct rte_mempool *mp,
+		struct rte_mbuf **mbufs, unsigned n)
+{
+	struct rte_mbuf *mbuf, *m_next;
+	unsigned i;
+	for (i = 0; i < n; ++i) {
+		mbuf = mbufs[i];
+		__rte_mbuf_sanity_check(mbuf, 1);
+
+		mbuf = mbuf->next;
+		while (mbuf != NULL) {
+			m_next = mbuf->next;
+			rte_pktmbuf_free_seg(mbuf);
+			mbuf = m_next;
+		}
+	}
+	rte_mempool_put_bulk(mp, (void * const *)mbufs, n);
+}
+
+/**
  * Creates a "clone" of the given packet mbuf.
  *
  * Walks through all segments of the given packet mbuf, and for each of them:
diff --git a/lib/librte_mempool/rte_mempool.h b/lib/librte_mempool/rte_mempool.h
index d315d42..e612a0a 100644
--- a/lib/librte_mempool/rte_mempool.h
+++ b/lib/librte_mempool/rte_mempool.h
@@ -1497,6 +1497,12 @@ rte_mempool_get(struct rte_mempool *mp, void **obj_p)
 	return rte_mempool_get_bulk(mp, obj_p, 1);
 }
 
+static inline int __attribute__((always_inline))
+rte_mempool_get_n(struct rte_mempool *mp, void **obj_p, int n)
+{
+	return rte_mempool_get_bulk(mp, obj_p, n);
+}
+
 /**
  * Return the number of entries in the mempool.
  *
-- 
2.7.4

^ permalink raw reply related

* [PATCH] net/enic: remove unnecessary function parameter attributes
From: John Daley @ 2017-01-11  3:42 UTC (permalink / raw)
  To: ferruh.yigit; +Cc: dev, John Daley, stable
In-Reply-To: <20170111034212.5294-1-johndale@cisco.com>

Remove __rte_unused attributes in function declaration when
the parameters really are used.

Fixes: dfbd6a9cb504 ("net/enic: extend flow director support for 1300 series")

CC: stable@dpdk.org
Signed-off-by: John Daley <johndale@cisco.com>
---
 drivers/net/enic/enic.h | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/net/enic/enic.h b/drivers/net/enic/enic.h
index 865cd76..7ff994b 100644
--- a/drivers/net/enic/enic.h
+++ b/drivers/net/enic/enic.h
@@ -301,8 +301,7 @@ int enic_link_update(struct enic *enic);
 void enic_fdir_info(struct enic *enic);
 void enic_fdir_info_get(struct enic *enic, struct rte_eth_fdir_info *stats);
 void copy_fltr_v1(struct filter_v2 *fltr, struct rte_eth_fdir_input *input,
-		  struct rte_eth_fdir_masks *masks);
-void copy_fltr_v2(__rte_unused struct filter_v2 *fltr,
-		  __rte_unused struct rte_eth_fdir_input *input,
 		  __rte_unused struct rte_eth_fdir_masks *masks);
+void copy_fltr_v2(struct filter_v2 *fltr, struct rte_eth_fdir_input *input,
+		  struct rte_eth_fdir_masks *masks);
 #endif /* _ENIC_H_ */
-- 
2.10.0

^ permalink raw reply related

* [PATCH] net/enic: use new Rx checksum flags
From: John Daley @ 2017-01-11  3:42 UTC (permalink / raw)
  To: ferruh.yigit; +Cc: dev, John Daley

Use the new L3 and L4 ..CKSUM_GOOD  and ..CKSUM_UNKNOWN flags to
distinguish good checksums from unknown ones.

Signed-off-by: John Daley <johndale@cisco.com>
---
 drivers/net/enic/enic_rxtx.c | 29 +++++++++++++++++++----------
 1 file changed, 19 insertions(+), 10 deletions(-)

diff --git a/drivers/net/enic/enic_rxtx.c b/drivers/net/enic/enic_rxtx.c
index 5d59d8f..981be3a 100644
--- a/drivers/net/enic/enic_rxtx.c
+++ b/drivers/net/enic/enic_rxtx.c
@@ -260,16 +260,25 @@ enic_cq_rx_to_pkt_flags(struct cq_desc *cqd, struct rte_mbuf *mbuf)
 	}
 
 	/* checksum flags */
-	if (!enic_cq_rx_desc_csum_not_calc(cqrd) &&
-		(mbuf->packet_type & RTE_PTYPE_L3_IPV4)) {
-		uint32_t l4_flags = mbuf->packet_type & RTE_PTYPE_L4_MASK;
-
-		if (unlikely(!enic_cq_rx_desc_ipv4_csum_ok(cqrd)))
-			pkt_flags |= PKT_RX_IP_CKSUM_BAD;
-		if (l4_flags == RTE_PTYPE_L4_UDP ||
-		    l4_flags == RTE_PTYPE_L4_TCP) {
-			if (unlikely(!enic_cq_rx_desc_tcp_udp_csum_ok(cqrd)))
-				pkt_flags |= PKT_RX_L4_CKSUM_BAD;
+	if (mbuf->packet_type & RTE_PTYPE_L3_IPV4) {
+		if (enic_cq_rx_desc_csum_not_calc(cqrd))
+			pkt_flags |= (PKT_RX_IP_CKSUM_UNKNOWN &
+				     PKT_RX_L4_CKSUM_UNKNOWN);
+		else {
+			uint32_t l4_flags;
+			l4_flags = mbuf->packet_type & RTE_PTYPE_L4_MASK;
+
+			if (enic_cq_rx_desc_ipv4_csum_ok(cqrd))
+				pkt_flags |= PKT_RX_IP_CKSUM_GOOD;
+			else
+				pkt_flags |= PKT_RX_IP_CKSUM_BAD;
+
+			if (l4_flags & (RTE_PTYPE_L4_UDP | RTE_PTYPE_L4_TCP)) {
+				if (enic_cq_rx_desc_tcp_udp_csum_ok(cqrd))
+					pkt_flags |= PKT_RX_L4_CKSUM_GOOD;
+				else
+					pkt_flags |= PKT_RX_L4_CKSUM_BAD;
+			}
 		}
 	}
 
-- 
2.10.0

^ permalink raw reply related

* [PATCH v2] kni: use bulk functions to allocate and free mbufs
From: Sergey Vyazmitinov @ 2016-12-29 22:08 UTC (permalink / raw)
  To: olivier.matz; +Cc: ferruh.yigit, dev, Sergey Vyazmitinov

Optimized kni_allocate_mbufs and kni_free_mbufs by using mbuf bulk
functions. This can improve performance more than two times.

Signed-off-by: Sergey Vyazmitinov <s.vyazmitinov@brain4net.com>
---
 lib/librte_kni/rte_kni.c      | 44 +++++++++++++++++++------------------------
 lib/librte_kni/rte_kni_fifo.h | 18 ++++++++++++++++++
 lib/librte_mbuf/rte_mbuf.h    | 32 +++++++++++++++++++++++++++++++
 3 files changed, 69 insertions(+), 25 deletions(-)

diff --git a/lib/librte_kni/rte_kni.c b/lib/librte_kni/rte_kni.c
index a80cefd..8b0b2be 100644
--- a/lib/librte_kni/rte_kni.c
+++ b/lib/librte_kni/rte_kni.c
@@ -590,22 +590,21 @@ rte_kni_rx_burst(struct rte_kni *kni, struct rte_mbuf **mbufs, unsigned num)
 static void
 kni_free_mbufs(struct rte_kni *kni)
 {
-	int i, ret;
+	unsigned int freeing;
 	struct rte_mbuf *pkts[MAX_MBUF_BURST_NUM];
 
-	ret = kni_fifo_get(kni->free_q, (void **)pkts, MAX_MBUF_BURST_NUM);
-	if (likely(ret > 0)) {
-		for (i = 0; i < ret; i++)
-			rte_pktmbuf_free(pkts[i]);
+	freeing = kni_fifo_get(kni->free_q, (void **)pkts, MAX_MBUF_BURST_NUM);
+	if (likely(freeing > 0)) {
+		rte_pktmbuf_free_bulk(kni->pktmbuf_pool, pkts, freeing);
 	}
 }
 
 static void
 kni_allocate_mbufs(struct rte_kni *kni)
 {
-	int i, ret;
-	struct rte_mbuf *pkts[MAX_MBUF_BURST_NUM];
-	void *phys[MAX_MBUF_BURST_NUM];
+	unsigned int count, allocated, put;
+	struct rte_mbuf *pkts[KNI_FIFO_COUNT_MAX];
+	void *phys[KNI_FIFO_COUNT_MAX];
 
 	RTE_BUILD_BUG_ON(offsetof(struct rte_mbuf, pool) !=
 			 offsetof(struct rte_kni_mbuf, pool));
@@ -628,28 +627,23 @@ kni_allocate_mbufs(struct rte_kni *kni)
 		return;
 	}
 
-	for (i = 0; i < MAX_MBUF_BURST_NUM; i++) {
-		pkts[i] = rte_pktmbuf_alloc(kni->pktmbuf_pool);
-		if (unlikely(pkts[i] == NULL)) {
-			/* Out of memory */
-			RTE_LOG(ERR, KNI, "Out of memory\n");
-			break;
-		}
-		phys[i] = va2pa(pkts[i]);
-	}
+	/* Calculate alloc queue free space */
+	count = kni_fifo_free_count(kni->alloc_q);
 
-	/* No pkt mbuf alocated */
-	if (i <= 0)
-		return;
+	/* Get buffers from mempool */
+	allocated = rte_pktmbuf_alloc_bulk(kni->pktmbuf_pool, pkts, count);
+	for (unsigned int i = 0; i < allocated; i++)
+		phys[i] = va2pa(pkts[i]);
 
-	ret = kni_fifo_put(kni->alloc_q, phys, i);
+	/* Put buffers into alloc queue */
+	put = kni_fifo_put(kni->alloc_q, (void **)phys, allocated);
 
 	/* Check if any mbufs not put into alloc_q, and then free them */
-	if (ret >= 0 && ret < i && ret < MAX_MBUF_BURST_NUM) {
-		int j;
-
-		for (j = ret; j < i; j++)
+	if (unlikely(put < allocated)) {
+		for (unsigned int j = put; j < allocated; j++) {
+			RTE_LOG(ERR, KNI, "Free allocated buffer\n");
 			rte_pktmbuf_free(pkts[j]);
+		}
 	}
 }
 
diff --git a/lib/librte_kni/rte_kni_fifo.h b/lib/librte_kni/rte_kni_fifo.h
index 8cb8587..361ddb0 100644
--- a/lib/librte_kni/rte_kni_fifo.h
+++ b/lib/librte_kni/rte_kni_fifo.h
@@ -91,3 +91,21 @@ kni_fifo_get(struct rte_kni_fifo *fifo, void **data, unsigned num)
 	fifo->read = new_read;
 	return i;
 }
+
+/**
+ * Get the num of elements in the fifo
+ */
+static inline unsigned
+kni_fifo_count(struct rte_kni_fifo *fifo)
+{
+	return (fifo->len + fifo->write - fifo->read) & (fifo->len - 1);
+}
+
+/**
+ * Get the num of available elements in the fifo
+ */
+static inline unsigned
+kni_fifo_free_count(struct rte_kni_fifo *fifo)
+{
+	return (fifo->read - fifo->write - 1) & (fifo->len - 1);
+}
diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
index 4476d75..56e9ef7 100644
--- a/lib/librte_mbuf/rte_mbuf.h
+++ b/lib/librte_mbuf/rte_mbuf.h
@@ -1261,6 +1261,38 @@ static inline void rte_pktmbuf_free(struct rte_mbuf *m)
 }
 
 /**
+ * Free n packets mbuf back into its original mempool.
+ *
+ * Free each mbuf, and all its segments in case of chained buffers. Each
+ * segment is added back into its original mempool.
+ *
+ * @param mp
+ *   The packets mempool.
+ * @param mbufs
+ *   The packets mbufs array to be freed.
+ * @param n
+ *   Number of packets.
+ */
+static inline void rte_pktmbuf_free_bulk(struct rte_mempool *mp,
+		struct rte_mbuf **mbufs, unsigned int n)
+{
+	struct rte_mbuf *mbuf, *m_next;
+	unsigned int i;
+	for (i = 0; i < n; ++i) {
+		mbuf = mbufs[i];
+		__rte_mbuf_sanity_check(mbuf, 1);
+
+		mbuf = mbuf->next;
+		while (mbuf != NULL) {
+			m_next = mbuf->next;
+			rte_pktmbuf_free_seg(mbuf);
+			mbuf = m_next;
+		}
+	}
+	rte_mempool_put_bulk(mp, (void * const *)mbufs, n);
+}
+
+/**
  * Creates a "clone" of the given packet mbuf.
  *
  * Walks through all segments of the given packet mbuf, and for each of them:
-- 
2.7.4

^ permalink raw reply related

* [PATCH 0/2] net/virtio: optimize virtio net header reset
From: Yuanhan Liu @ 2017-01-11  4:27 UTC (permalink / raw)
  To: dev; +Cc: Tan Jianfeng, Wang Zhihong, Yuanhan Liu

This two patches optimized the virtio net header reset when TSO is not
actually used (though it could be enabled). The basic idea is to not reset
(assign 0) when it's already 0. This could avoid some severe cache issues.
Micro benchmarking shows it could boost the performance up to 20+%.

---
Yuanhan Liu (2):
  net/virtio: fix performance regression due to TSO enabling
  net/virtio: optimize header reset on any layout

 drivers/net/virtio/virtio_rxtx.c | 28 ++++++++++++++++++++--------
 1 file changed, 20 insertions(+), 8 deletions(-)

-- 
1.9.0

^ permalink raw reply

* [PATCH 1/2] net/virtio: fix performance regression due to TSO enabling
From: Yuanhan Liu @ 2017-01-11  4:27 UTC (permalink / raw)
  To: dev
  Cc: Tan Jianfeng, Wang Zhihong, Yuanhan Liu, Olivier Matz,
	Maxime Coquelin, Michael S. Tsirkin, stable
In-Reply-To: <1484108832-19907-1-git-send-email-yuanhan.liu@linux.intel.com>

TSO is now enabled, but it's not actually being used by default in a
simple L2 forward mode. In such case, we have to zero the virtio net
headers, to inform the vhost backend that no offload is being used:

    hdr->csum_start = 0;
    hdr->csum_offset = 0;
    hdr->flags = 0;

    hdr->gso_type = 0;
    hdr->gso_size = 0;
    hdr->hdr_len = 0;

Such writes could be very costly; it introduces severe cache issues:
The above operations introduce cache write for each packet, which
stalls the read operation from the vhost backend.

The fact that virtio net header is initiated to zero in PMD driver
init stage means that these costly writes are unnecessary and could
be avoided:

    if (hdr->csum_start != 0)
        hdr->csum_start = 0;

And that's what the macro ASSIGN_UNLESS_EQUAL does. With this, the
performance drop introduced by TSO enabling is recovered: it could
be up to 20% in micro benchmarking.

Fixes: 58169a9c8153 ("net/virtio: support Tx checksum offload")
Fixes: 696573046e9e ("net/virtio: support TSO")

Cc: Olivier Matz <olivier.matz@6wind.com>
Cc: Maxime Coquelin <maxime.coquelin@redhat.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: stable@dpdk.org
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
---
 drivers/net/virtio/virtio_rxtx.c | 18 ++++++++++++------
 1 file changed, 12 insertions(+), 6 deletions(-)

diff --git a/drivers/net/virtio/virtio_rxtx.c b/drivers/net/virtio/virtio_rxtx.c
index 1e5a6b9..8ec2f1a 100644
--- a/drivers/net/virtio/virtio_rxtx.c
+++ b/drivers/net/virtio/virtio_rxtx.c
@@ -258,6 +258,12 @@
 		vtpci_with_feature(hw, VIRTIO_NET_F_HOST_TSO6);
 }
 
+/* avoid write operation when necessary, to lessen cache issues */
+#define ASSIGN_UNLESS_EQUAL(var, val) do {	\
+	if ((var) != (val))			\
+		(var) = (val);			\
+} while (0)
+
 static inline void
 virtqueue_enqueue_xmit(struct virtnet_tx *txvq, struct rte_mbuf *cookie,
 		       uint16_t needed, int use_indirect, int can_push)
@@ -337,9 +343,9 @@
 			break;
 
 		default:
-			hdr->csum_start = 0;
-			hdr->csum_offset = 0;
-			hdr->flags = 0;
+			ASSIGN_UNLESS_EQUAL(hdr->csum_start, 0);
+			ASSIGN_UNLESS_EQUAL(hdr->csum_offset, 0);
+			ASSIGN_UNLESS_EQUAL(hdr->flags, 0);
 			break;
 		}
 
@@ -355,9 +361,9 @@
 				cookie->l3_len +
 				cookie->l4_len;
 		} else {
-			hdr->gso_type = 0;
-			hdr->gso_size = 0;
-			hdr->hdr_len = 0;
+			ASSIGN_UNLESS_EQUAL(hdr->gso_type, 0);
+			ASSIGN_UNLESS_EQUAL(hdr->gso_size, 0);
+			ASSIGN_UNLESS_EQUAL(hdr->hdr_len, 0);
 		}
 	}
 
-- 
1.9.0

^ permalink raw reply related

* [PATCH 2/2] net/virtio: optimize header reset on any layout
From: Yuanhan Liu @ 2017-01-11  4:27 UTC (permalink / raw)
  To: dev
  Cc: Tan Jianfeng, Wang Zhihong, Yuanhan Liu, Maxime Coquelin,
	Michael S. Tsirkin, stable
In-Reply-To: <1484108832-19907-1-git-send-email-yuanhan.liu@linux.intel.com>

When any layout is used, the header is stored in the head room of mbuf.
mbuf is allocated and filled by user, means there is no gurateen the
header is all zero for non TSO case. Therefore, we have to do the reset
by ourself:

    memest(hdr, 0, head_size);

The memset has two impacts on performance:

- memset could not be inlined, which is a bit costly.
- more importantly, it touches the mbuf, which could introduce severe
  cache issues as described by former patch.

Similiary, we could do the same trick: reset just when necessary, when
the corresponding field is already 0, which is likely true for a simple
l2 forward case. It could boost the performance up to 20+% in micro
benchmarking.

Cc: Maxime Coquelin <maxime.coquelin@redhat.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: stable@dpdk.org
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
---
 drivers/net/virtio/virtio_rxtx.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/drivers/net/virtio/virtio_rxtx.c b/drivers/net/virtio/virtio_rxtx.c
index 8ec2f1a..5ca3a88 100644
--- a/drivers/net/virtio/virtio_rxtx.c
+++ b/drivers/net/virtio/virtio_rxtx.c
@@ -292,8 +292,14 @@
 		hdr = (struct virtio_net_hdr *)
 			rte_pktmbuf_prepend(cookie, head_size);
 		/* if offload disabled, it is not zeroed below, do it now */
-		if (offload == 0)
-			memset(hdr, 0, head_size);
+		if (offload == 0) {
+			ASSIGN_UNLESS_EQUAL(hdr->csum_start, 0);
+			ASSIGN_UNLESS_EQUAL(hdr->csum_offset, 0);
+			ASSIGN_UNLESS_EQUAL(hdr->flags, 0);
+			ASSIGN_UNLESS_EQUAL(hdr->gso_type, 0);
+			ASSIGN_UNLESS_EQUAL(hdr->gso_size, 0);
+			ASSIGN_UNLESS_EQUAL(hdr->hdr_len, 0);
+		}
 	} else if (use_indirect) {
 		/* setup tx ring slot to point to indirect
 		 * descriptor list stored in reserved region.
-- 
1.9.0

^ permalink raw reply related

* [PATCH v6 0/6] Add MACsec offload support for ixgbe
From: Tiwei Bie @ 2017-01-11  4:31 UTC (permalink / raw)
  To: dev
  Cc: adrien.mazarguil, wenzhuo.lu, john.mcnamara, olivier.matz,
	thomas.monjalon, konstantin.ananyev, helin.zhang, wei.dai,
	xiao.w.wang
In-Reply-To: <1483514502-32841-1-git-send-email-tiwei.bie@intel.com>

This patch set adds the MACsec offload support for ixgbe.
The testpmd is also updated to support MACsec cmds.

v2:
- Update the documents for testpmd;
- Update the release notes;
- Reuse the functions provided by base code;

v3:
- Add the missing parts of MACsec mbuf flag and reorganize the patch set;
- Add an ethdev event type for MACsec;
- Advertise the MACsec offload capabilities based on the mac type;
- Minor fixes and improvements;

v4:
- Reserve bits in mbuf and ethdev for PMD specific API;
- Use the reserved bits in PMD specific API;

v5:
- Add MACsec offload in the NIC feature list;
- Minor improvements on comments;

v6:
- Revert the changes related to the reserved flags;
- Rebase the patch set on the latest branch, xstats code is changed recently;
- Update the feature list when adding the MACsec support for ixgbe;

Tiwei Bie (6):
  mbuf: add flag for MACsec
  ethdev: add event type for MACsec
  ethdev: add MACsec offload capability flags
  net/ixgbe: add MACsec offload support
  app/testpmd: add MACsec offload commands
  doc: add ixgbe specific APIs

 app/test-pmd/cmdline.c                      | 389 ++++++++++++++++++++++
 app/test-pmd/macfwd.c                       |   2 +
 app/test-pmd/macswap.c                      |   2 +
 app/test-pmd/testpmd.h                      |   2 +
 app/test-pmd/txonly.c                       |   2 +
 doc/guides/nics/features/default.ini        |   1 +
 doc/guides/nics/features/ixgbe.ini          |   1 +
 doc/guides/rel_notes/release_17_02.rst      |  10 +
 doc/guides/testpmd_app_ug/testpmd_funcs.rst |  32 ++
 drivers/net/ixgbe/ixgbe_ethdev.c            | 482 +++++++++++++++++++++++++++-
 drivers/net/ixgbe/ixgbe_ethdev.h            |  45 +++
 drivers/net/ixgbe/ixgbe_rxtx.c              |   3 +
 drivers/net/ixgbe/rte_pmd_ixgbe.h           | 100 ++++++
 drivers/net/ixgbe/rte_pmd_ixgbe_version.map |  11 +
 lib/librte_ether/rte_ethdev.h               |   3 +
 lib/librte_mbuf/rte_mbuf.c                  |   2 +
 lib/librte_mbuf/rte_mbuf.h                  |   6 +
 17 files changed, 1088 insertions(+), 5 deletions(-)

-- 
2.7.4

^ permalink raw reply

* [PATCH v6 1/6] mbuf: add flag for MACsec
From: Tiwei Bie @ 2017-01-11  4:31 UTC (permalink / raw)
  To: dev
  Cc: adrien.mazarguil, wenzhuo.lu, john.mcnamara, olivier.matz,
	thomas.monjalon, konstantin.ananyev, helin.zhang, wei.dai,
	xiao.w.wang
In-Reply-To: <1484109098-8936-1-git-send-email-tiwei.bie@intel.com>

Add a new Tx flag in mbuf, that can be set by applications to
enable the MACsec offload for a packet to be transmitted.

Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
---
 doc/guides/rel_notes/release_17_02.rst | 5 +++++
 lib/librte_mbuf/rte_mbuf.c             | 2 ++
 lib/librte_mbuf/rte_mbuf.h             | 6 ++++++
 3 files changed, 13 insertions(+)

diff --git a/doc/guides/rel_notes/release_17_02.rst b/doc/guides/rel_notes/release_17_02.rst
index 180af82..3958714 100644
--- a/doc/guides/rel_notes/release_17_02.rst
+++ b/doc/guides/rel_notes/release_17_02.rst
@@ -52,6 +52,11 @@ New Features
   See the :ref:`Generic flow API <Generic_flow_API>` documentation for more
   information.
 
+* **Improved offload support in mbuf.**
+
+  Added a new Tx mbuf flag for MACsec, which can be set by applications
+  to enable the MACsec offload for the packets to be transmitted.
+
 
 Resolved Issues
 ---------------
diff --git a/lib/librte_mbuf/rte_mbuf.c b/lib/librte_mbuf/rte_mbuf.c
index 63f43c8..72ad91e 100644
--- a/lib/librte_mbuf/rte_mbuf.c
+++ b/lib/librte_mbuf/rte_mbuf.c
@@ -404,6 +404,7 @@ const char *rte_get_tx_ol_flag_name(uint64_t mask)
 	case PKT_TX_TUNNEL_GRE: return "PKT_TX_TUNNEL_GRE";
 	case PKT_TX_TUNNEL_IPIP: return "PKT_TX_TUNNEL_IPIP";
 	case PKT_TX_TUNNEL_GENEVE: return "PKT_TX_TUNNEL_GENEVE";
+	case PKT_TX_MACSEC: return "PKT_TX_MACSEC";
 	default: return NULL;
 	}
 }
@@ -434,6 +435,7 @@ rte_get_tx_ol_flag_list(uint64_t mask, char *buf, size_t buflen)
 		  "PKT_TX_TUNNEL_NONE" },
 		{ PKT_TX_TUNNEL_GENEVE, PKT_TX_TUNNEL_MASK,
 		  "PKT_TX_TUNNEL_NONE" },
+		{ PKT_TX_MACSEC, PKT_TX_MACSEC, NULL },
 	};
 	const char *name;
 	unsigned int i;
diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
index 4476d75..ffbb4b3 100644
--- a/lib/librte_mbuf/rte_mbuf.h
+++ b/lib/librte_mbuf/rte_mbuf.h
@@ -182,6 +182,12 @@ extern "C" {
 /* add new TX flags here */
 
 /**
+ * Offload the MACsec. This flag must be set by the application to enable
+ * this offload feature for a packet to be transmitted.
+ */
+#define PKT_TX_MACSEC        (1ULL << 44)
+
+/**
  * Bits 45:48 used for the tunnel type.
  * When doing Tx offload like TSO or checksum, the HW needs to configure the
  * tunnel type into the HW descriptors.
-- 
2.7.4

^ permalink raw reply related

* [PATCH v6 2/6] ethdev: add event type for MACsec
From: Tiwei Bie @ 2017-01-11  4:31 UTC (permalink / raw)
  To: dev
  Cc: adrien.mazarguil, wenzhuo.lu, john.mcnamara, olivier.matz,
	thomas.monjalon, konstantin.ananyev, helin.zhang, wei.dai,
	xiao.w.wang
In-Reply-To: <1484109098-8936-1-git-send-email-tiwei.bie@intel.com>

This commit adds a below event type:

- RTE_ETH_EVENT_MACSEC

This event will occur when the PN counter in a MACsec connection
reaches the exhaustion threshold.

Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
---
 lib/librte_ether/rte_ethdev.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
index 1c356c1..d5d2956 100644
--- a/lib/librte_ether/rte_ethdev.h
+++ b/lib/librte_ether/rte_ethdev.h
@@ -3188,6 +3188,7 @@ enum rte_eth_event_type {
 	RTE_ETH_EVENT_INTR_RESET,
 			/**< reset interrupt event, sent to VF on PF reset */
 	RTE_ETH_EVENT_VF_MBOX,  /**< message from the VF received by PF */
+	RTE_ETH_EVENT_MACSEC,   /**< MACsec offload related event */
 	RTE_ETH_EVENT_MAX       /**< max value of this enum */
 };
 
-- 
2.7.4

^ permalink raw reply related

* [PATCH v6 3/6] ethdev: add MACsec offload capability flags
From: Tiwei Bie @ 2017-01-11  4:31 UTC (permalink / raw)
  To: dev
  Cc: adrien.mazarguil, wenzhuo.lu, john.mcnamara, olivier.matz,
	thomas.monjalon, konstantin.ananyev, helin.zhang, wei.dai,
	xiao.w.wang
In-Reply-To: <1484109098-8936-1-git-send-email-tiwei.bie@intel.com>

If these flags are advertised by a PMD, the NIC supports the MACsec
offload. The incoming MACsec traffics can be offloaded transparently
after the MACsec offload is configured correctly by the application.
And the application can set the PKT_TX_MACSEC flag in mbufs to enable
the MACsec offload for the packets to be transmitted.

Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
---
 lib/librte_ether/rte_ethdev.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
index d5d2956..49c073b 100644
--- a/lib/librte_ether/rte_ethdev.h
+++ b/lib/librte_ether/rte_ethdev.h
@@ -881,6 +881,7 @@ struct rte_eth_conf {
 #define DEV_RX_OFFLOAD_TCP_LRO     0x00000010
 #define DEV_RX_OFFLOAD_QINQ_STRIP  0x00000020
 #define DEV_RX_OFFLOAD_OUTER_IPV4_CKSUM 0x00000040
+#define DEV_RX_OFFLOAD_MACSEC_STRIP     0x00000080
 
 /**
  * TX offload capabilities of a device.
@@ -898,6 +899,7 @@ struct rte_eth_conf {
 #define DEV_TX_OFFLOAD_GRE_TNL_TSO      0x00000400    /**< Used for tunneling packet. */
 #define DEV_TX_OFFLOAD_IPIP_TNL_TSO     0x00000800    /**< Used for tunneling packet. */
 #define DEV_TX_OFFLOAD_GENEVE_TNL_TSO   0x00001000    /**< Used for tunneling packet. */
+#define DEV_TX_OFFLOAD_MACSEC_INSERT    0x00002000
 
 /**
  * Ethernet device information
-- 
2.7.4

^ permalink raw reply related

* [PATCH v6 4/6] net/ixgbe: add MACsec offload support
From: Tiwei Bie @ 2017-01-11  4:31 UTC (permalink / raw)
  To: dev
  Cc: adrien.mazarguil, wenzhuo.lu, john.mcnamara, olivier.matz,
	thomas.monjalon, konstantin.ananyev, helin.zhang, wei.dai,
	xiao.w.wang
In-Reply-To: <1484109098-8936-1-git-send-email-tiwei.bie@intel.com>

MACsec (or LinkSec, 802.1AE) is a MAC level encryption/authentication
scheme defined in IEEE 802.1AE that uses symmetric cryptography.
This commit adds the MACsec offload support for ixgbe.

Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Acked-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
---
 doc/guides/nics/features/default.ini        |   1 +
 doc/guides/nics/features/ixgbe.ini          |   1 +
 drivers/net/ixgbe/ixgbe_ethdev.c            | 482 +++++++++++++++++++++++++++-
 drivers/net/ixgbe/ixgbe_ethdev.h            |  45 +++
 drivers/net/ixgbe/ixgbe_rxtx.c              |   3 +
 drivers/net/ixgbe/rte_pmd_ixgbe.h           | 100 ++++++
 drivers/net/ixgbe/rte_pmd_ixgbe_version.map |  11 +
 7 files changed, 638 insertions(+), 5 deletions(-)

diff --git a/doc/guides/nics/features/default.ini b/doc/guides/nics/features/default.ini
index f1bf9bf..c412d6f 100644
--- a/doc/guides/nics/features/default.ini
+++ b/doc/guides/nics/features/default.ini
@@ -43,6 +43,7 @@ VLAN offload         =
 QinQ offload         =
 L3 checksum offload  =
 L4 checksum offload  =
+MACsec offload       =
 Inner L3 checksum    =
 Inner L4 checksum    =
 Packet type parsing  =
diff --git a/doc/guides/nics/features/ixgbe.ini b/doc/guides/nics/features/ixgbe.ini
index 4a5667f..24c02fb 100644
--- a/doc/guides/nics/features/ixgbe.ini
+++ b/doc/guides/nics/features/ixgbe.ini
@@ -36,6 +36,7 @@ VLAN offload         = Y
 QinQ offload         = Y
 L3 checksum offload  = Y
 L4 checksum offload  = Y
+MACsec offload       = Y
 Inner L3 checksum    = Y
 Inner L4 checksum    = Y
 Packet type parsing  = Y
diff --git a/drivers/net/ixgbe/ixgbe_ethdev.c b/drivers/net/ixgbe/ixgbe_ethdev.c
index 060772d..6aad7ef 100644
--- a/drivers/net/ixgbe/ixgbe_ethdev.c
+++ b/drivers/net/ixgbe/ixgbe_ethdev.c
@@ -231,6 +231,7 @@ static int ixgbe_dev_rss_reta_query(struct rte_eth_dev *dev,
 			uint16_t reta_size);
 static void ixgbe_dev_link_status_print(struct rte_eth_dev *dev);
 static int ixgbe_dev_lsc_interrupt_setup(struct rte_eth_dev *dev);
+static int ixgbe_dev_macsec_interrupt_setup(struct rte_eth_dev *dev);
 static int ixgbe_dev_rxq_interrupt_setup(struct rte_eth_dev *dev);
 static int ixgbe_dev_interrupt_get_status(struct rte_eth_dev *dev);
 static int ixgbe_dev_interrupt_action(struct rte_eth_dev *dev,
@@ -747,6 +748,51 @@ static const struct rte_ixgbe_xstats_name_off rte_ixgbe_stats_strings[] = {
 #define IXGBE_NB_HW_STATS (sizeof(rte_ixgbe_stats_strings) / \
 			   sizeof(rte_ixgbe_stats_strings[0]))
 
+/* MACsec statistics */
+static const struct rte_ixgbe_xstats_name_off rte_ixgbe_macsec_strings[] = {
+	{"out_pkts_untagged", offsetof(struct ixgbe_macsec_stats,
+		out_pkts_untagged)},
+	{"out_pkts_encrypted", offsetof(struct ixgbe_macsec_stats,
+		out_pkts_encrypted)},
+	{"out_pkts_protected", offsetof(struct ixgbe_macsec_stats,
+		out_pkts_protected)},
+	{"out_octets_encrypted", offsetof(struct ixgbe_macsec_stats,
+		out_octets_encrypted)},
+	{"out_octets_protected", offsetof(struct ixgbe_macsec_stats,
+		out_octets_protected)},
+	{"in_pkts_untagged", offsetof(struct ixgbe_macsec_stats,
+		in_pkts_untagged)},
+	{"in_pkts_badtag", offsetof(struct ixgbe_macsec_stats,
+		in_pkts_badtag)},
+	{"in_pkts_nosci", offsetof(struct ixgbe_macsec_stats,
+		in_pkts_nosci)},
+	{"in_pkts_unknownsci", offsetof(struct ixgbe_macsec_stats,
+		in_pkts_unknownsci)},
+	{"in_octets_decrypted", offsetof(struct ixgbe_macsec_stats,
+		in_octets_decrypted)},
+	{"in_octets_validated", offsetof(struct ixgbe_macsec_stats,
+		in_octets_validated)},
+	{"in_pkts_unchecked", offsetof(struct ixgbe_macsec_stats,
+		in_pkts_unchecked)},
+	{"in_pkts_delayed", offsetof(struct ixgbe_macsec_stats,
+		in_pkts_delayed)},
+	{"in_pkts_late", offsetof(struct ixgbe_macsec_stats,
+		in_pkts_late)},
+	{"in_pkts_ok", offsetof(struct ixgbe_macsec_stats,
+		in_pkts_ok)},
+	{"in_pkts_invalid", offsetof(struct ixgbe_macsec_stats,
+		in_pkts_invalid)},
+	{"in_pkts_notvalid", offsetof(struct ixgbe_macsec_stats,
+		in_pkts_notvalid)},
+	{"in_pkts_unusedsa", offsetof(struct ixgbe_macsec_stats,
+		in_pkts_unusedsa)},
+	{"in_pkts_notusingsa", offsetof(struct ixgbe_macsec_stats,
+		in_pkts_notusingsa)},
+};
+
+#define IXGBE_NB_MACSEC_STATS (sizeof(rte_ixgbe_macsec_strings) / \
+			   sizeof(rte_ixgbe_macsec_strings[0]))
+
 /* Per-queue statistics */
 static const struct rte_ixgbe_xstats_name_off rte_ixgbe_rxq_strings[] = {
 	{"mbuf_allocation_errors", offsetof(struct ixgbe_hw_stats, rnbc)},
@@ -2370,6 +2416,7 @@ ixgbe_dev_start(struct rte_eth_dev *dev)
 		/* check if lsc interrupt is enabled */
 		if (dev->data->dev_conf.intr_conf.lsc != 0)
 			ixgbe_dev_lsc_interrupt_setup(dev);
+		ixgbe_dev_macsec_interrupt_setup(dev);
 	} else {
 		rte_intr_callback_unregister(intr_handle,
 					     ixgbe_dev_interrupt_handler, dev);
@@ -2560,6 +2607,7 @@ ixgbe_dev_close(struct rte_eth_dev *dev)
 static void
 ixgbe_read_stats_registers(struct ixgbe_hw *hw,
 			   struct ixgbe_hw_stats *hw_stats,
+			   struct ixgbe_macsec_stats *macsec_stats,
 			   uint64_t *total_missed_rx, uint64_t *total_qbrc,
 			   uint64_t *total_qprc, uint64_t *total_qprdc)
 {
@@ -2729,6 +2777,40 @@ ixgbe_read_stats_registers(struct ixgbe_hw *hw,
 	/* Flow Director Stats registers */
 	hw_stats->fdirmatch += IXGBE_READ_REG(hw, IXGBE_FDIRMATCH);
 	hw_stats->fdirmiss += IXGBE_READ_REG(hw, IXGBE_FDIRMISS);
+
+	/* MACsec Stats registers */
+	macsec_stats->out_pkts_untagged += IXGBE_READ_REG(hw, IXGBE_LSECTXUT);
+	macsec_stats->out_pkts_encrypted +=
+		IXGBE_READ_REG(hw, IXGBE_LSECTXPKTE);
+	macsec_stats->out_pkts_protected +=
+		IXGBE_READ_REG(hw, IXGBE_LSECTXPKTP);
+	macsec_stats->out_octets_encrypted +=
+		IXGBE_READ_REG(hw, IXGBE_LSECTXOCTE);
+	macsec_stats->out_octets_protected +=
+		IXGBE_READ_REG(hw, IXGBE_LSECTXOCTP);
+	macsec_stats->in_pkts_untagged += IXGBE_READ_REG(hw, IXGBE_LSECRXUT);
+	macsec_stats->in_pkts_badtag += IXGBE_READ_REG(hw, IXGBE_LSECRXBAD);
+	macsec_stats->in_pkts_nosci += IXGBE_READ_REG(hw, IXGBE_LSECRXNOSCI);
+	macsec_stats->in_pkts_unknownsci +=
+		IXGBE_READ_REG(hw, IXGBE_LSECRXUNSCI);
+	macsec_stats->in_octets_decrypted +=
+		IXGBE_READ_REG(hw, IXGBE_LSECRXOCTD);
+	macsec_stats->in_octets_validated +=
+		IXGBE_READ_REG(hw, IXGBE_LSECRXOCTV);
+	macsec_stats->in_pkts_unchecked += IXGBE_READ_REG(hw, IXGBE_LSECRXUNCH);
+	macsec_stats->in_pkts_delayed += IXGBE_READ_REG(hw, IXGBE_LSECRXDELAY);
+	macsec_stats->in_pkts_late += IXGBE_READ_REG(hw, IXGBE_LSECRXLATE);
+	for (i = 0; i < 2; i++) {
+		macsec_stats->in_pkts_ok +=
+			IXGBE_READ_REG(hw, IXGBE_LSECRXOK(i));
+		macsec_stats->in_pkts_invalid +=
+			IXGBE_READ_REG(hw, IXGBE_LSECRXINV(i));
+		macsec_stats->in_pkts_notvalid +=
+			IXGBE_READ_REG(hw, IXGBE_LSECRXNV(i));
+	}
+	macsec_stats->in_pkts_unusedsa += IXGBE_READ_REG(hw, IXGBE_LSECRXUNSA);
+	macsec_stats->in_pkts_notusingsa +=
+		IXGBE_READ_REG(hw, IXGBE_LSECRXNUSA);
 }
 
 /*
@@ -2741,6 +2823,9 @@ ixgbe_dev_stats_get(struct rte_eth_dev *dev, struct rte_eth_stats *stats)
 			IXGBE_DEV_PRIVATE_TO_HW(dev->data->dev_private);
 	struct ixgbe_hw_stats *hw_stats =
 			IXGBE_DEV_PRIVATE_TO_STATS(dev->data->dev_private);
+	struct ixgbe_macsec_stats *macsec_stats =
+			IXGBE_DEV_PRIVATE_TO_MACSEC_STATS(
+				dev->data->dev_private);
 	uint64_t total_missed_rx, total_qbrc, total_qprc, total_qprdc;
 	unsigned i;
 
@@ -2749,8 +2834,8 @@ ixgbe_dev_stats_get(struct rte_eth_dev *dev, struct rte_eth_stats *stats)
 	total_qprc = 0;
 	total_qprdc = 0;
 
-	ixgbe_read_stats_registers(hw, hw_stats, &total_missed_rx, &total_qbrc,
-			&total_qprc, &total_qprdc);
+	ixgbe_read_stats_registers(hw, hw_stats, macsec_stats, &total_missed_rx,
+			&total_qbrc, &total_qprc, &total_qprdc);
 
 	if (stats == NULL)
 		return;
@@ -2802,7 +2887,7 @@ ixgbe_dev_stats_reset(struct rte_eth_dev *dev)
 /* This function calculates the number of xstats based on the current config */
 static unsigned
 ixgbe_xstats_calc_num(void) {
-	return IXGBE_NB_HW_STATS +
+	return IXGBE_NB_HW_STATS + IXGBE_NB_MACSEC_STATS +
 		(IXGBE_NB_RXQ_PRIO_STATS * IXGBE_NB_RXQ_PRIO_VALUES) +
 		(IXGBE_NB_TXQ_PRIO_STATS * IXGBE_NB_TXQ_PRIO_VALUES);
 }
@@ -2829,6 +2914,15 @@ static int ixgbe_dev_xstats_get_names(__rte_unused struct rte_eth_dev *dev,
 			count++;
 		}
 
+		/* MACsec Stats */
+		for (i = 0; i < IXGBE_NB_MACSEC_STATS; i++) {
+			snprintf(xstats_names[count].name,
+				sizeof(xstats_names[count].name),
+				"%s",
+				rte_ixgbe_macsec_strings[i].name);
+			count++;
+		}
+
 		/* RX Priority Stats */
 		for (stat = 0; stat < IXGBE_NB_RXQ_PRIO_STATS; stat++) {
 			for (i = 0; i < IXGBE_NB_RXQ_PRIO_VALUES; i++) {
@@ -2878,6 +2972,9 @@ ixgbe_dev_xstats_get(struct rte_eth_dev *dev, struct rte_eth_xstat *xstats,
 			IXGBE_DEV_PRIVATE_TO_HW(dev->data->dev_private);
 	struct ixgbe_hw_stats *hw_stats =
 			IXGBE_DEV_PRIVATE_TO_STATS(dev->data->dev_private);
+	struct ixgbe_macsec_stats *macsec_stats =
+			IXGBE_DEV_PRIVATE_TO_MACSEC_STATS(
+				dev->data->dev_private);
 	uint64_t total_missed_rx, total_qbrc, total_qprc, total_qprdc;
 	unsigned i, stat, count = 0;
 
@@ -2891,8 +2988,8 @@ ixgbe_dev_xstats_get(struct rte_eth_dev *dev, struct rte_eth_xstat *xstats,
 	total_qprc = 0;
 	total_qprdc = 0;
 
-	ixgbe_read_stats_registers(hw, hw_stats, &total_missed_rx, &total_qbrc,
-				   &total_qprc, &total_qprdc);
+	ixgbe_read_stats_registers(hw, hw_stats, macsec_stats, &total_missed_rx,
+			&total_qbrc, &total_qprc, &total_qprdc);
 
 	/* If this is a reset xstats is NULL, and we have cleared the
 	 * registers by reading them.
@@ -2909,6 +3006,14 @@ ixgbe_dev_xstats_get(struct rte_eth_dev *dev, struct rte_eth_xstat *xstats,
 		count++;
 	}
 
+	/* MACsec Stats */
+	for (i = 0; i < IXGBE_NB_MACSEC_STATS; i++) {
+		xstats[count].value = *(uint64_t *)(((char *)macsec_stats) +
+				rte_ixgbe_macsec_strings[i].offset);
+		xstats[count].id = count;
+		count++;
+	}
+
 	/* RX Priority Stats */
 	for (stat = 0; stat < IXGBE_NB_RXQ_PRIO_STATS; stat++) {
 		for (i = 0; i < IXGBE_NB_RXQ_PRIO_VALUES; i++) {
@@ -2938,6 +3043,9 @@ ixgbe_dev_xstats_reset(struct rte_eth_dev *dev)
 {
 	struct ixgbe_hw_stats *stats =
 			IXGBE_DEV_PRIVATE_TO_STATS(dev->data->dev_private);
+	struct ixgbe_macsec_stats *macsec_stats =
+			IXGBE_DEV_PRIVATE_TO_MACSEC_STATS(
+				dev->data->dev_private);
 
 	unsigned count = ixgbe_xstats_calc_num();
 
@@ -2946,6 +3054,7 @@ ixgbe_dev_xstats_reset(struct rte_eth_dev *dev)
 
 	/* Reset software totals */
 	memset(stats, 0, sizeof(*stats));
+	memset(macsec_stats, 0, sizeof(*macsec_stats));
 }
 
 static void
@@ -3078,6 +3187,10 @@ ixgbe_dev_info_get(struct rte_eth_dev *dev, struct rte_eth_dev_info *dev_info)
 	    !RTE_ETH_DEV_SRIOV(dev).active)
 		dev_info->rx_offload_capa |= DEV_RX_OFFLOAD_TCP_LRO;
 
+	if (hw->mac.type == ixgbe_mac_82599EB ||
+	    hw->mac.type == ixgbe_mac_X540)
+		dev_info->rx_offload_capa |= DEV_RX_OFFLOAD_MACSEC_STRIP;
+
 	if (hw->mac.type == ixgbe_mac_X550 ||
 	    hw->mac.type == ixgbe_mac_X550EM_x ||
 	    hw->mac.type == ixgbe_mac_X550EM_a)
@@ -3091,6 +3204,10 @@ ixgbe_dev_info_get(struct rte_eth_dev *dev, struct rte_eth_dev_info *dev_info)
 		DEV_TX_OFFLOAD_SCTP_CKSUM  |
 		DEV_TX_OFFLOAD_TCP_TSO;
 
+	if (hw->mac.type == ixgbe_mac_82599EB ||
+	    hw->mac.type == ixgbe_mac_X540)
+		dev_info->tx_offload_capa |= DEV_TX_OFFLOAD_MACSEC_INSERT;
+
 	if (hw->mac.type == ixgbe_mac_X550 ||
 	    hw->mac.type == ixgbe_mac_X550EM_x ||
 	    hw->mac.type == ixgbe_mac_X550EM_a)
@@ -3388,6 +3505,28 @@ ixgbe_dev_rxq_interrupt_setup(struct rte_eth_dev *dev)
 	return 0;
 }
 
+/**
+ * It clears the interrupt causes and enables the interrupt.
+ * It will be called once only during nic initialized.
+ *
+ * @param dev
+ *  Pointer to struct rte_eth_dev.
+ *
+ * @return
+ *  - On success, zero.
+ *  - On failure, a negative value.
+ */
+static int
+ixgbe_dev_macsec_interrupt_setup(struct rte_eth_dev *dev)
+{
+	struct ixgbe_interrupt *intr =
+		IXGBE_DEV_PRIVATE_TO_INTR(dev->data->dev_private);
+
+	intr->mask |= IXGBE_EICR_LINKSEC;
+
+	return 0;
+}
+
 /*
  * It reads ICR and sets flag (IXGBE_EICR_LSC) for the link_update.
  *
@@ -3422,6 +3561,9 @@ ixgbe_dev_interrupt_get_status(struct rte_eth_dev *dev)
 	if (eicr & IXGBE_EICR_MAILBOX)
 		intr->flags |= IXGBE_FLAG_MAILBOX;
 
+	if (eicr & IXGBE_EICR_LINKSEC)
+		intr->flags |= IXGBE_FLAG_MACSEC;
+
 	if (hw->mac.type ==  ixgbe_mac_X550EM_x &&
 	    hw->phy.type == ixgbe_phy_x550em_ext_t &&
 	    (eicr & IXGBE_EICR_GPI_SDP0_X550EM_x))
@@ -3576,6 +3718,12 @@ ixgbe_dev_interrupt_delayed_handler(void *param)
 		_rte_eth_dev_callback_process(dev, RTE_ETH_EVENT_INTR_LSC, NULL);
 	}
 
+	if (intr->flags & IXGBE_FLAG_MACSEC) {
+		_rte_eth_dev_callback_process(dev, RTE_ETH_EVENT_MACSEC,
+					      NULL);
+		intr->flags &= ~IXGBE_FLAG_MACSEC;
+	}
+
 	PMD_DRV_LOG(DEBUG, "enable intr in delayed handler S[%08x]", eicr);
 	ixgbe_enable_intr(dev);
 	rte_intr_enable(intr_handle);
@@ -7616,6 +7764,330 @@ ixgbevf_dev_interrupt_handler(__rte_unused struct rte_intr_handle *handle,
 	ixgbevf_dev_interrupt_action(dev);
 }
 
+/**
+ *  ixgbe_disable_sec_tx_path_generic - Stops the transmit data path
+ *  @hw: pointer to hardware structure
+ *
+ *  Stops the transmit data path and waits for the HW to internally empty
+ *  the Tx security block
+ **/
+int ixgbe_disable_sec_tx_path_generic(struct ixgbe_hw *hw)
+{
+#define IXGBE_MAX_SECTX_POLL 40
+
+	int i;
+	int sectxreg;
+
+	sectxreg = IXGBE_READ_REG(hw, IXGBE_SECTXCTRL);
+	sectxreg |= IXGBE_SECTXCTRL_TX_DIS;
+	IXGBE_WRITE_REG(hw, IXGBE_SECTXCTRL, sectxreg);
+	for (i = 0; i < IXGBE_MAX_SECTX_POLL; i++) {
+		sectxreg = IXGBE_READ_REG(hw, IXGBE_SECTXSTAT);
+		if (sectxreg & IXGBE_SECTXSTAT_SECTX_RDY)
+			break;
+		/* Use interrupt-safe sleep just in case */
+		usec_delay(1000);
+	}
+
+	/* For informational purposes only */
+	if (i >= IXGBE_MAX_SECTX_POLL)
+		PMD_DRV_LOG(DEBUG, "Tx unit being enabled before security "
+			 "path fully disabled.  Continuing with init.\n");
+
+	return IXGBE_SUCCESS;
+}
+
+/**
+ *  ixgbe_enable_sec_tx_path_generic - Enables the transmit data path
+ *  @hw: pointer to hardware structure
+ *
+ *  Enables the transmit data path.
+ **/
+int ixgbe_enable_sec_tx_path_generic(struct ixgbe_hw *hw)
+{
+	uint32_t sectxreg;
+
+	sectxreg = IXGBE_READ_REG(hw, IXGBE_SECTXCTRL);
+	sectxreg &= ~IXGBE_SECTXCTRL_TX_DIS;
+	IXGBE_WRITE_REG(hw, IXGBE_SECTXCTRL, sectxreg);
+	IXGBE_WRITE_FLUSH(hw);
+
+	return IXGBE_SUCCESS;
+}
+
+int
+rte_pmd_ixgbe_macsec_enable(uint8_t port, uint8_t en, uint8_t rp)
+{
+	struct ixgbe_hw *hw;
+	struct rte_eth_dev *dev;
+	uint32_t ctrl;
+
+	RTE_ETH_VALID_PORTID_OR_ERR_RET(port, -ENODEV);
+
+	dev = &rte_eth_devices[port];
+	hw = IXGBE_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+
+	/* Stop the data paths */
+	if (ixgbe_disable_sec_rx_path(hw) != IXGBE_SUCCESS)
+		return -ENOTSUP;
+	/*
+	 * Workaround:
+	 * As no ixgbe_disable_sec_rx_path equivalent is
+	 * implemented for tx in the base code, and we are
+	 * not allowed to modify the base code in DPDK, so
+	 * just call the hand-written one directly for now.
+	 * The hardware support has been checked by
+	 * ixgbe_disable_sec_rx_path().
+	 */
+	ixgbe_disable_sec_tx_path_generic(hw);
+
+	/* Enable Ethernet CRC (required by MACsec offload) */
+	ctrl = IXGBE_READ_REG(hw, IXGBE_HLREG0);
+	ctrl |= IXGBE_HLREG0_TXCRCEN | IXGBE_HLREG0_RXCRCSTRP;
+	IXGBE_WRITE_REG(hw, IXGBE_HLREG0, ctrl);
+
+	/* Enable the TX and RX crypto engines */
+	ctrl = IXGBE_READ_REG(hw, IXGBE_SECTXCTRL);
+	ctrl &= ~IXGBE_SECTXCTRL_SECTX_DIS;
+	IXGBE_WRITE_REG(hw, IXGBE_SECTXCTRL, ctrl);
+
+	ctrl = IXGBE_READ_REG(hw, IXGBE_SECRXCTRL);
+	ctrl &= ~IXGBE_SECRXCTRL_SECRX_DIS;
+	IXGBE_WRITE_REG(hw, IXGBE_SECRXCTRL, ctrl);
+
+	ctrl = IXGBE_READ_REG(hw, IXGBE_SECTXMINIFG);
+	ctrl &= ~IXGBE_SECTX_MINSECIFG_MASK;
+	ctrl |= 0x3;
+	IXGBE_WRITE_REG(hw, IXGBE_SECTXMINIFG, ctrl);
+
+	/* Enable SA lookup */
+	ctrl = IXGBE_READ_REG(hw, IXGBE_LSECTXCTRL);
+	ctrl &= ~IXGBE_LSECTXCTRL_EN_MASK;
+	ctrl |= en ? IXGBE_LSECTXCTRL_AUTH_ENCRYPT :
+		     IXGBE_LSECTXCTRL_AUTH;
+	ctrl |= IXGBE_LSECTXCTRL_AISCI;
+	ctrl &= ~IXGBE_LSECTXCTRL_PNTHRSH_MASK;
+	ctrl |= IXGBE_MACSEC_PNTHRSH & IXGBE_LSECTXCTRL_PNTHRSH_MASK;
+	IXGBE_WRITE_REG(hw, IXGBE_LSECTXCTRL, ctrl);
+
+	ctrl = IXGBE_READ_REG(hw, IXGBE_LSECRXCTRL);
+	ctrl &= ~IXGBE_LSECRXCTRL_EN_MASK;
+	ctrl |= IXGBE_LSECRXCTRL_STRICT << IXGBE_LSECRXCTRL_EN_SHIFT;
+	ctrl &= ~IXGBE_LSECRXCTRL_PLSH;
+	if (rp)
+		ctrl |= IXGBE_LSECRXCTRL_RP;
+	else
+		ctrl &= ~IXGBE_LSECRXCTRL_RP;
+	IXGBE_WRITE_REG(hw, IXGBE_LSECRXCTRL, ctrl);
+
+	/* Start the data paths */
+	ixgbe_enable_sec_rx_path(hw);
+	/*
+	 * Workaround:
+	 * As no ixgbe_enable_sec_rx_path equivalent is
+	 * implemented for tx in the base code, and we are
+	 * not allowed to modify the base code in DPDK, so
+	 * just call the hand-written one directly for now.
+	 */
+	ixgbe_enable_sec_tx_path_generic(hw);
+
+	return 0;
+}
+
+int
+rte_pmd_ixgbe_macsec_disable(uint8_t port)
+{
+	struct ixgbe_hw *hw;
+	struct rte_eth_dev *dev;
+	uint32_t ctrl;
+
+	RTE_ETH_VALID_PORTID_OR_ERR_RET(port, -ENODEV);
+
+	dev = &rte_eth_devices[port];
+	hw = IXGBE_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+
+	/* Stop the data paths */
+	if (ixgbe_disable_sec_rx_path(hw) != IXGBE_SUCCESS)
+		return -ENOTSUP;
+	/*
+	 * Workaround:
+	 * As no ixgbe_disable_sec_rx_path equivalent is
+	 * implemented for tx in the base code, and we are
+	 * not allowed to modify the base code in DPDK, so
+	 * just call the hand-written one directly for now.
+	 * The hardware support has been checked by
+	 * ixgbe_disable_sec_rx_path().
+	 */
+	ixgbe_disable_sec_tx_path_generic(hw);
+
+	/* Disable the TX and RX crypto engines */
+	ctrl = IXGBE_READ_REG(hw, IXGBE_SECTXCTRL);
+	ctrl |= IXGBE_SECTXCTRL_SECTX_DIS;
+	IXGBE_WRITE_REG(hw, IXGBE_SECTXCTRL, ctrl);
+
+	ctrl = IXGBE_READ_REG(hw, IXGBE_SECRXCTRL);
+	ctrl |= IXGBE_SECRXCTRL_SECRX_DIS;
+	IXGBE_WRITE_REG(hw, IXGBE_SECRXCTRL, ctrl);
+
+	/* Disable SA lookup */
+	ctrl = IXGBE_READ_REG(hw, IXGBE_LSECTXCTRL);
+	ctrl &= ~IXGBE_LSECTXCTRL_EN_MASK;
+	ctrl |= IXGBE_LSECTXCTRL_DISABLE;
+	IXGBE_WRITE_REG(hw, IXGBE_LSECTXCTRL, ctrl);
+
+	ctrl = IXGBE_READ_REG(hw, IXGBE_LSECRXCTRL);
+	ctrl &= ~IXGBE_LSECRXCTRL_EN_MASK;
+	ctrl |= IXGBE_LSECRXCTRL_DISABLE << IXGBE_LSECRXCTRL_EN_SHIFT;
+	IXGBE_WRITE_REG(hw, IXGBE_LSECRXCTRL, ctrl);
+
+	/* Start the data paths */
+	ixgbe_enable_sec_rx_path(hw);
+	/*
+	 * Workaround:
+	 * As no ixgbe_enable_sec_rx_path equivalent is
+	 * implemented for tx in the base code, and we are
+	 * not allowed to modify the base code in DPDK, so
+	 * just call the hand-written one directly for now.
+	 */
+	ixgbe_enable_sec_tx_path_generic(hw);
+
+	return 0;
+}
+
+int
+rte_pmd_ixgbe_macsec_config_txsc(uint8_t port, uint8_t *mac)
+{
+	struct ixgbe_hw *hw;
+	struct rte_eth_dev *dev;
+	uint32_t ctrl;
+
+	RTE_ETH_VALID_PORTID_OR_ERR_RET(port, -ENODEV);
+
+	dev = &rte_eth_devices[port];
+	hw = IXGBE_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+
+	ctrl = mac[0] | (mac[1] << 8) | (mac[2] << 16) | (mac[3] << 24);
+	IXGBE_WRITE_REG(hw, IXGBE_LSECTXSCL, ctrl);
+
+	ctrl = mac[4] | (mac[5] << 8);
+	IXGBE_WRITE_REG(hw, IXGBE_LSECTXSCH, ctrl);
+
+	return 0;
+}
+
+int
+rte_pmd_ixgbe_macsec_config_rxsc(uint8_t port, uint8_t *mac, uint16_t pi)
+{
+	struct ixgbe_hw *hw;
+	struct rte_eth_dev *dev;
+	uint32_t ctrl;
+
+	RTE_ETH_VALID_PORTID_OR_ERR_RET(port, -ENODEV);
+
+	dev = &rte_eth_devices[port];
+	hw = IXGBE_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+
+	ctrl = mac[0] | (mac[1] << 8) | (mac[2] << 16) | (mac[3] << 24);
+	IXGBE_WRITE_REG(hw, IXGBE_LSECRXSCL, ctrl);
+
+	pi = rte_cpu_to_be_16(pi);
+	ctrl = mac[4] | (mac[5] << 8) | (pi << 16);
+	IXGBE_WRITE_REG(hw, IXGBE_LSECRXSCH, ctrl);
+
+	return 0;
+}
+
+int
+rte_pmd_ixgbe_macsec_select_txsa(uint8_t port, uint8_t idx, uint8_t an,
+				 uint32_t pn, uint8_t *key)
+{
+	struct ixgbe_hw *hw;
+	struct rte_eth_dev *dev;
+	uint32_t ctrl, i;
+
+	RTE_ETH_VALID_PORTID_OR_ERR_RET(port, -ENODEV);
+
+	dev = &rte_eth_devices[port];
+	hw = IXGBE_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+
+	if (idx != 0 && idx != 1)
+		return -EINVAL;
+
+	if (an >= 4)
+		return -EINVAL;
+
+	hw = IXGBE_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+
+	/* Set the PN and key */
+	pn = rte_cpu_to_be_32(pn);
+	if (idx == 0) {
+		IXGBE_WRITE_REG(hw, IXGBE_LSECTXPN0, pn);
+
+		for (i = 0; i < 4; i++) {
+			ctrl = (key[i * 4 + 0] <<  0) |
+			       (key[i * 4 + 1] <<  8) |
+			       (key[i * 4 + 2] << 16) |
+			       (key[i * 4 + 3] << 24);
+			IXGBE_WRITE_REG(hw, IXGBE_LSECTXKEY0(i), ctrl);
+		}
+	} else {
+		IXGBE_WRITE_REG(hw, IXGBE_LSECTXPN1, pn);
+
+		for (i = 0; i < 4; i++) {
+			ctrl = (key[i * 4 + 0] <<  0) |
+			       (key[i * 4 + 1] <<  8) |
+			       (key[i * 4 + 2] << 16) |
+			       (key[i * 4 + 3] << 24);
+			IXGBE_WRITE_REG(hw, IXGBE_LSECTXKEY1(i), ctrl);
+		}
+	}
+
+	/* Set AN and select the SA */
+	ctrl = (an << idx * 2) | (idx << 4);
+	IXGBE_WRITE_REG(hw, IXGBE_LSECTXSA, ctrl);
+
+	return 0;
+}
+
+int
+rte_pmd_ixgbe_macsec_select_rxsa(uint8_t port, uint8_t idx, uint8_t an,
+				 uint32_t pn, uint8_t *key)
+{
+	struct ixgbe_hw *hw;
+	struct rte_eth_dev *dev;
+	uint32_t ctrl, i;
+
+	RTE_ETH_VALID_PORTID_OR_ERR_RET(port, -ENODEV);
+
+	dev = &rte_eth_devices[port];
+	hw = IXGBE_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+
+	if (idx != 0 && idx != 1)
+		return -EINVAL;
+
+	if (an >= 4)
+		return -EINVAL;
+
+	/* Set the PN */
+	pn = rte_cpu_to_be_32(pn);
+	IXGBE_WRITE_REG(hw, IXGBE_LSECRXPN(idx), pn);
+
+	/* Set the key */
+	for (i = 0; i < 4; i++) {
+		ctrl = (key[i * 4 + 0] <<  0) |
+		       (key[i * 4 + 1] <<  8) |
+		       (key[i * 4 + 2] << 16) |
+		       (key[i * 4 + 3] << 24);
+		IXGBE_WRITE_REG(hw, IXGBE_LSECRXKEY(idx, i), ctrl);
+	}
+
+	/* Set the AN and validate the SA */
+	ctrl = an | (1 << 2);
+	IXGBE_WRITE_REG(hw, IXGBE_LSECRXSA(idx), ctrl);
+
+	return 0;
+}
+
 RTE_PMD_REGISTER_PCI(net_ixgbe, rte_ixgbe_pmd.pci_drv);
 RTE_PMD_REGISTER_PCI_TABLE(net_ixgbe, pci_id_ixgbe_map);
 RTE_PMD_REGISTER_KMOD_DEP(net_ixgbe, "* igb_uio | uio_pci_generic | vfio");
diff --git a/drivers/net/ixgbe/ixgbe_ethdev.h b/drivers/net/ixgbe/ixgbe_ethdev.h
index 69b276f..14e1fd6 100644
--- a/drivers/net/ixgbe/ixgbe_ethdev.h
+++ b/drivers/net/ixgbe/ixgbe_ethdev.h
@@ -43,6 +43,7 @@
 #define IXGBE_FLAG_NEED_LINK_UPDATE (uint32_t)(1 << 0)
 #define IXGBE_FLAG_MAILBOX          (uint32_t)(1 << 1)
 #define IXGBE_FLAG_PHY_INTERRUPT    (uint32_t)(1 << 2)
+#define IXGBE_FLAG_MACSEC           (uint32_t)(1 << 3)
 
 /*
  * Defines that were not part of ixgbe_type.h as they are not used by the
@@ -130,6 +131,10 @@
 #define IXGBE_MISC_VEC_ID               RTE_INTR_VEC_ZERO_OFFSET
 #define IXGBE_RX_VEC_START              RTE_INTR_VEC_RXTX_OFFSET
 
+#define IXGBE_SECTX_MINSECIFG_MASK      0x0000000F
+
+#define IXGBE_MACSEC_PNTHRSH            0xFFFFFE00
+
 /*
  * Information about the fdir mode.
  */
@@ -265,11 +270,44 @@ struct ixgbe_filter_info {
 };
 
 /*
+ * Statistics counters collected by the MACsec
+ */
+struct ixgbe_macsec_stats {
+	/* TX port statistics */
+	uint64_t out_pkts_untagged;
+	uint64_t out_pkts_encrypted;
+	uint64_t out_pkts_protected;
+	uint64_t out_octets_encrypted;
+	uint64_t out_octets_protected;
+
+	/* RX port statistics */
+	uint64_t in_pkts_untagged;
+	uint64_t in_pkts_badtag;
+	uint64_t in_pkts_nosci;
+	uint64_t in_pkts_unknownsci;
+	uint64_t in_octets_decrypted;
+	uint64_t in_octets_validated;
+
+	/* RX SC statistics */
+	uint64_t in_pkts_unchecked;
+	uint64_t in_pkts_delayed;
+	uint64_t in_pkts_late;
+
+	/* RX SA statistics */
+	uint64_t in_pkts_ok;
+	uint64_t in_pkts_invalid;
+	uint64_t in_pkts_notvalid;
+	uint64_t in_pkts_unusedsa;
+	uint64_t in_pkts_notusingsa;
+};
+
+/*
  * Structure to store private data for each driver instance (for each port).
  */
 struct ixgbe_adapter {
 	struct ixgbe_hw             hw;
 	struct ixgbe_hw_stats       stats;
+	struct ixgbe_macsec_stats   macsec_stats;
 	struct ixgbe_hw_fdir_info   fdir;
 	struct ixgbe_interrupt      intr;
 	struct ixgbe_stat_mapping_registers stat_mappings;
@@ -300,6 +338,9 @@ struct ixgbe_adapter {
 #define IXGBE_DEV_PRIVATE_TO_STATS(adapter) \
 	(&((struct ixgbe_adapter *)adapter)->stats)
 
+#define IXGBE_DEV_PRIVATE_TO_MACSEC_STATS(adapter) \
+	(&((struct ixgbe_adapter *)adapter)->macsec_stats)
+
 #define IXGBE_DEV_PRIVATE_TO_INTR(adapter) \
 	(&((struct ixgbe_adapter *)adapter)->intr)
 
@@ -448,4 +489,8 @@ uint32_t ixgbe_convert_vm_rx_mask_to_val(uint16_t rx_mask, uint32_t orig_val);
 
 int ixgbe_fdir_ctrl_func(struct rte_eth_dev *dev,
 			enum rte_filter_op filter_op, void *arg);
+
+int ixgbe_disable_sec_tx_path_generic(struct ixgbe_hw *hw);
+
+int ixgbe_enable_sec_tx_path_generic(struct ixgbe_hw *hw);
 #endif /* _IXGBE_ETHDEV_H_ */
diff --git a/drivers/net/ixgbe/ixgbe_rxtx.c b/drivers/net/ixgbe/ixgbe_rxtx.c
index 0bbc583..00013c6 100644
--- a/drivers/net/ixgbe/ixgbe_rxtx.c
+++ b/drivers/net/ixgbe/ixgbe_rxtx.c
@@ -86,6 +86,7 @@
 		PKT_TX_IP_CKSUM |		 \
 		PKT_TX_L4_MASK |		 \
 		PKT_TX_TCP_SEG |		 \
+		PKT_TX_MACSEC |			 \
 		PKT_TX_OUTER_IP_CKSUM)
 
 #define IXGBE_TX_OFFLOAD_NOTSUP_MASK \
@@ -523,6 +524,8 @@ tx_desc_ol_flags_to_cmdtype(uint64_t ol_flags)
 		cmdtype |= IXGBE_ADVTXD_DCMD_TSE;
 	if (ol_flags & PKT_TX_OUTER_IP_CKSUM)
 		cmdtype |= (1 << IXGBE_ADVTXD_OUTERIPCS_SHIFT);
+	if (ol_flags & PKT_TX_MACSEC)
+		cmdtype |= IXGBE_ADVTXD_MAC_LINKSEC;
 	return cmdtype;
 }
 
diff --git a/drivers/net/ixgbe/rte_pmd_ixgbe.h b/drivers/net/ixgbe/rte_pmd_ixgbe.h
index c2fb826..22bc4c7 100644
--- a/drivers/net/ixgbe/rte_pmd_ixgbe.h
+++ b/drivers/net/ixgbe/rte_pmd_ixgbe.h
@@ -183,6 +183,106 @@ int
 rte_pmd_ixgbe_set_vf_vlan_stripq(uint8_t port, uint16_t vf, uint8_t on);
 
 /**
+ * Enable MACsec offload.
+ *
+ * @param port
+ *   The port identifier of the Ethernet device.
+ * @param en
+ *    1 - Enable encryption (encrypt and add integrity signature).
+ *    0 - Disable encryption (only add integrity signature).
+ * @param rp
+ *    1 - Enable replay protection.
+ *    0 - Disable replay protection.
+ * @return
+ *   - (0) if successful.
+ *   - (-ENODEV) if *port* invalid.
+ *   - (-ENOTSUP) if hardware doesn't support this feature.
+ */
+int rte_pmd_ixgbe_macsec_enable(uint8_t port, uint8_t en, uint8_t rp);
+
+/**
+ * Disable MACsec offload.
+ *
+ * @param port
+ *   The port identifier of the Ethernet device.
+ * @return
+ *   - (0) if successful.
+ *   - (-ENODEV) if *port* invalid.
+ *   - (-ENOTSUP) if hardware doesn't support this feature.
+ */
+int rte_pmd_ixgbe_macsec_disable(uint8_t port);
+
+/**
+ * Configure Tx SC (Secure Connection).
+ *
+ * @param port
+ *   The port identifier of the Ethernet device.
+ * @param mac
+ *   The MAC address on the local side.
+ * @return
+ *   - (0) if successful.
+ *   - (-ENODEV) if *port* invalid.
+ */
+int rte_pmd_ixgbe_macsec_config_txsc(uint8_t port, uint8_t *mac);
+
+/**
+ * Configure Rx SC (Secure Connection).
+ *
+ * @param port
+ *   The port identifier of the Ethernet device.
+ * @param mac
+ *   The MAC address on the remote side.
+ * @param pi
+ *   The PI (port identifier) on the remote side.
+ * @return
+ *   - (0) if successful.
+ *   - (-ENODEV) if *port* invalid.
+ */
+int rte_pmd_ixgbe_macsec_config_rxsc(uint8_t port, uint8_t *mac, uint16_t pi);
+
+/**
+ * Enable Tx SA (Secure Association).
+ *
+ * @param port
+ *   The port identifier of the Ethernet device.
+ * @param idx
+ *   The SA to be enabled (0 or 1).
+ * @param an
+ *   The association number on the local side.
+ * @param pn
+ *   The packet number on the local side.
+ * @param key
+ *   The key on the local side.
+ * @return
+ *   - (0) if successful.
+ *   - (-ENODEV) if *port* invalid.
+ *   - (-EINVAL) if bad parameter.
+ */
+int rte_pmd_ixgbe_macsec_select_txsa(uint8_t port, uint8_t idx, uint8_t an,
+		uint32_t pn, uint8_t *key);
+
+/**
+ * Enable Rx SA (Secure Association).
+ *
+ * @param port
+ *   The port identifier of the Ethernet device.
+ * @param idx
+ *   The SA to be enabled (0 or 1)
+ * @param an
+ *   The association number on the remote side.
+ * @param pn
+ *   The packet number on the remote side.
+ * @param key
+ *   The key on the remote side.
+ * @return
+ *   - (0) if successful.
+ *   - (-ENODEV) if *port* invalid.
+ *   - (-EINVAL) if bad parameter.
+ */
+int rte_pmd_ixgbe_macsec_select_rxsa(uint8_t port, uint8_t idx, uint8_t an,
+		uint32_t pn, uint8_t *key);
+
+/**
  * Response sent back to ixgbe driver from user app after callback
  */
 enum rte_pmd_ixgbe_mb_event_rsp {
diff --git a/drivers/net/ixgbe/rte_pmd_ixgbe_version.map b/drivers/net/ixgbe/rte_pmd_ixgbe_version.map
index 92434f3..6d68934 100644
--- a/drivers/net/ixgbe/rte_pmd_ixgbe_version.map
+++ b/drivers/net/ixgbe/rte_pmd_ixgbe_version.map
@@ -15,3 +15,14 @@ DPDK_16.11 {
 	rte_pmd_ixgbe_set_vf_vlan_insert;
 	rte_pmd_ixgbe_set_vf_vlan_stripq;
 } DPDK_2.0;
+
+DPDK_17.02 {
+	global:
+
+	rte_pmd_ixgbe_macsec_enable;
+	rte_pmd_ixgbe_macsec_disable;
+	rte_pmd_ixgbe_macsec_config_txsc;
+	rte_pmd_ixgbe_macsec_config_rxsc;
+	rte_pmd_ixgbe_macsec_select_txsa;
+	rte_pmd_ixgbe_macsec_select_rxsa;
+} DPDK_16.11;
-- 
2.7.4

^ permalink raw reply related

* [PATCH v6 5/6] app/testpmd: add MACsec offload commands
From: Tiwei Bie @ 2017-01-11  4:31 UTC (permalink / raw)
  To: dev
  Cc: adrien.mazarguil, wenzhuo.lu, john.mcnamara, olivier.matz,
	thomas.monjalon, konstantin.ananyev, helin.zhang, wei.dai,
	xiao.w.wang
In-Reply-To: <1484109098-8936-1-git-send-email-tiwei.bie@intel.com>

Below MACsec offload commands are added:

- set macsec offload <port_id> on encrypt on|off replay-protect on|off
- set macsec offload <port_id> off
- set macsec sc tx|rx <port_id> <mac> <pi>
- set macsec sa tx|rx <port_id> <idx> <an> <pn> <key>

Also update the testpmd user guide.

Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Acked-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
---
 app/test-pmd/cmdline.c                      | 389 ++++++++++++++++++++++++++++
 app/test-pmd/macfwd.c                       |   2 +
 app/test-pmd/macswap.c                      |   2 +
 app/test-pmd/testpmd.h                      |   2 +
 app/test-pmd/txonly.c                       |   2 +
 doc/guides/testpmd_app_ug/testpmd_funcs.rst |  32 +++
 6 files changed, 429 insertions(+)

diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index f768b6b..4a67894 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -275,6 +275,18 @@ static void cmd_help_long_parsed(void *parsed_result,
 
 			"set vf mac antispoof (port_id) (vf_id) (on|off).\n"
 			"    Set MAC antispoof for a VF from the PF.\n\n"
+
+			"set macsec offload (port_id) on encrypt (on|off) replay-protect (on|off)\n"
+			"    Enable MACsec offload.\n\n"
+
+			"set macsec offload (port_id) off\n"
+			"    Disable MACsec offload.\n\n"
+
+			"set macsec sc (tx|rx) (port_id) (mac) (pi)\n"
+			"    Configure MACsec secure connection (SC).\n\n"
+
+			"set macsec sa (tx|rx) (port_id) (idx) (an) (pn) (key)\n"
+			"    Configure MACsec secure association (SA).\n\n"
 #endif
 
 			"vlan set strip (on|off) (port_id)\n"
@@ -11488,6 +11500,379 @@ cmdline_parse_inst_t cmd_set_vf_mac_addr = {
 		NULL,
 	},
 };
+
+/* MACsec configuration */
+
+/* Common result structure for MACsec offload enable */
+struct cmd_macsec_offload_on_result {
+	cmdline_fixed_string_t set;
+	cmdline_fixed_string_t macsec;
+	cmdline_fixed_string_t offload;
+	uint8_t port_id;
+	cmdline_fixed_string_t on;
+	cmdline_fixed_string_t encrypt;
+	cmdline_fixed_string_t en_on_off;
+	cmdline_fixed_string_t replay_protect;
+	cmdline_fixed_string_t rp_on_off;
+};
+
+/* Common CLI fields for MACsec offload disable */
+cmdline_parse_token_string_t cmd_macsec_offload_on_set =
+	TOKEN_STRING_INITIALIZER
+		(struct cmd_macsec_offload_on_result,
+		 set, "set");
+cmdline_parse_token_string_t cmd_macsec_offload_on_macsec =
+	TOKEN_STRING_INITIALIZER
+		(struct cmd_macsec_offload_on_result,
+		 macsec, "macsec");
+cmdline_parse_token_string_t cmd_macsec_offload_on_offload =
+	TOKEN_STRING_INITIALIZER
+		(struct cmd_macsec_offload_on_result,
+		 offload, "offload");
+cmdline_parse_token_string_t cmd_macsec_offload_on_port_id =
+	TOKEN_NUM_INITIALIZER
+		(struct cmd_macsec_offload_on_result,
+		 port_id, UINT8);
+cmdline_parse_token_string_t cmd_macsec_offload_on_on =
+	TOKEN_STRING_INITIALIZER
+		(struct cmd_macsec_offload_on_result,
+		 on, "on");
+cmdline_parse_token_string_t cmd_macsec_offload_on_encrypt =
+	TOKEN_STRING_INITIALIZER
+		(struct cmd_macsec_offload_on_result,
+		 encrypt, "encrypt");
+cmdline_parse_token_string_t cmd_macsec_offload_on_en_on_off =
+	TOKEN_STRING_INITIALIZER
+		(struct cmd_macsec_offload_on_result,
+		 en_on_off, "on#off");
+cmdline_parse_token_string_t cmd_macsec_offload_on_replay_protect =
+	TOKEN_STRING_INITIALIZER
+		(struct cmd_macsec_offload_on_result,
+		 replay_protect, "replay-protect");
+cmdline_parse_token_string_t cmd_macsec_offload_on_rp_on_off =
+	TOKEN_STRING_INITIALIZER
+		(struct cmd_macsec_offload_on_result,
+		 rp_on_off, "on#off");
+
+static void
+cmd_set_macsec_offload_on_parsed(
+	void *parsed_result,
+	__attribute__((unused)) struct cmdline *cl,
+	__attribute__((unused)) void *data)
+{
+	struct cmd_macsec_offload_on_result *res = parsed_result;
+	int ret;
+	portid_t port_id = res->port_id;
+	int en = (strcmp(res->en_on_off, "on") == 0) ? 1 : 0;
+	int rp = (strcmp(res->rp_on_off, "on") == 0) ? 1 : 0;
+
+	if (port_id_is_invalid(port_id, ENABLED_WARN))
+		return;
+
+	ports[port_id].tx_ol_flags |= TESTPMD_TX_OFFLOAD_MACSEC;
+	ret = rte_pmd_ixgbe_macsec_enable(port_id, en, rp);
+
+	switch (ret) {
+	case 0:
+		break;
+	case -ENODEV:
+		printf("invalid port_id %d\n", port_id);
+		break;
+	default:
+		printf("programming error: (%s)\n", strerror(-ret));
+	}
+}
+
+cmdline_parse_inst_t cmd_set_macsec_offload_on = {
+	.f = cmd_set_macsec_offload_on_parsed,
+	.data = NULL,
+	.help_str = "set macsec offload <port_id> on "
+		"encrypt on|off replay-protect on|off",
+	.tokens = {
+		(void *)&cmd_macsec_offload_on_set,
+		(void *)&cmd_macsec_offload_on_macsec,
+		(void *)&cmd_macsec_offload_on_offload,
+		(void *)&cmd_macsec_offload_on_port_id,
+		(void *)&cmd_macsec_offload_on_on,
+		(void *)&cmd_macsec_offload_on_encrypt,
+		(void *)&cmd_macsec_offload_on_en_on_off,
+		(void *)&cmd_macsec_offload_on_replay_protect,
+		(void *)&cmd_macsec_offload_on_rp_on_off,
+		NULL,
+	},
+};
+
+/* Common result structure for MACsec offload disable */
+struct cmd_macsec_offload_off_result {
+	cmdline_fixed_string_t set;
+	cmdline_fixed_string_t macsec;
+	cmdline_fixed_string_t offload;
+	uint8_t port_id;
+	cmdline_fixed_string_t off;
+};
+
+/* Common CLI fields for MACsec offload disable */
+cmdline_parse_token_string_t cmd_macsec_offload_off_set =
+	TOKEN_STRING_INITIALIZER
+		(struct cmd_macsec_offload_off_result,
+		 set, "set");
+cmdline_parse_token_string_t cmd_macsec_offload_off_macsec =
+	TOKEN_STRING_INITIALIZER
+		(struct cmd_macsec_offload_off_result,
+		 macsec, "macsec");
+cmdline_parse_token_string_t cmd_macsec_offload_off_offload =
+	TOKEN_STRING_INITIALIZER
+		(struct cmd_macsec_offload_off_result,
+		 offload, "offload");
+cmdline_parse_token_string_t cmd_macsec_offload_off_port_id =
+	TOKEN_NUM_INITIALIZER
+		(struct cmd_macsec_offload_off_result,
+		 port_id, UINT8);
+cmdline_parse_token_string_t cmd_macsec_offload_off_off =
+	TOKEN_STRING_INITIALIZER
+		(struct cmd_macsec_offload_off_result,
+		 off, "off");
+
+static void
+cmd_set_macsec_offload_off_parsed(
+	void *parsed_result,
+	__attribute__((unused)) struct cmdline *cl,
+	__attribute__((unused)) void *data)
+{
+	struct cmd_macsec_offload_off_result *res = parsed_result;
+	int ret;
+	portid_t port_id = res->port_id;
+
+	if (port_id_is_invalid(port_id, ENABLED_WARN))
+		return;
+
+	ports[port_id].tx_ol_flags &= ~TESTPMD_TX_OFFLOAD_MACSEC;
+	ret = rte_pmd_ixgbe_macsec_disable(port_id);
+
+	switch (ret) {
+	case 0:
+		break;
+	case -ENODEV:
+		printf("invalid port_id %d\n", port_id);
+		break;
+	default:
+		printf("programming error: (%s)\n", strerror(-ret));
+	}
+}
+
+cmdline_parse_inst_t cmd_set_macsec_offload_off = {
+	.f = cmd_set_macsec_offload_off_parsed,
+	.data = NULL,
+	.help_str = "set macsec offload <port_id> off",
+	.tokens = {
+		(void *)&cmd_macsec_offload_off_set,
+		(void *)&cmd_macsec_offload_off_macsec,
+		(void *)&cmd_macsec_offload_off_offload,
+		(void *)&cmd_macsec_offload_off_port_id,
+		(void *)&cmd_macsec_offload_off_off,
+		NULL,
+	},
+};
+
+/* Common result structure for MACsec secure connection configure */
+struct cmd_macsec_sc_result {
+	cmdline_fixed_string_t set;
+	cmdline_fixed_string_t macsec;
+	cmdline_fixed_string_t sc;
+	cmdline_fixed_string_t tx_rx;
+	uint8_t port_id;
+	struct ether_addr mac;
+	uint16_t pi;
+};
+
+/* Common CLI fields for MACsec secure connection configure */
+cmdline_parse_token_string_t cmd_macsec_sc_set =
+	TOKEN_STRING_INITIALIZER
+		(struct cmd_macsec_sc_result,
+		 set, "set");
+cmdline_parse_token_string_t cmd_macsec_sc_macsec =
+	TOKEN_STRING_INITIALIZER
+		(struct cmd_macsec_sc_result,
+		 macsec, "macsec");
+cmdline_parse_token_string_t cmd_macsec_sc_sc =
+	TOKEN_STRING_INITIALIZER
+		(struct cmd_macsec_sc_result,
+		 sc, "sc");
+cmdline_parse_token_string_t cmd_macsec_sc_tx_rx =
+	TOKEN_STRING_INITIALIZER
+		(struct cmd_macsec_sc_result,
+		 tx_rx, "tx#rx");
+cmdline_parse_token_num_t cmd_macsec_sc_port_id =
+	TOKEN_NUM_INITIALIZER
+		(struct cmd_macsec_sc_result,
+		 port_id, UINT8);
+cmdline_parse_token_etheraddr_t cmd_macsec_sc_mac =
+	TOKEN_ETHERADDR_INITIALIZER
+		(struct cmd_macsec_sc_result,
+		 mac);
+cmdline_parse_token_num_t cmd_macsec_sc_pi =
+	TOKEN_NUM_INITIALIZER
+		(struct cmd_macsec_sc_result,
+		 pi, UINT16);
+
+static void
+cmd_set_macsec_sc_parsed(
+	void *parsed_result,
+	__attribute__((unused)) struct cmdline *cl,
+	__attribute__((unused)) void *data)
+{
+	struct cmd_macsec_sc_result *res = parsed_result;
+	int ret;
+	int is_tx = (strcmp(res->tx_rx, "tx") == 0) ? 1 : 0;
+
+	ret = is_tx ?
+		rte_pmd_ixgbe_macsec_config_txsc(res->port_id,
+				res->mac.addr_bytes) :
+		rte_pmd_ixgbe_macsec_config_rxsc(res->port_id,
+				res->mac.addr_bytes, res->pi);
+	switch (ret) {
+	case 0:
+		break;
+	case -ENODEV:
+		printf("invalid port_id %d\n", res->port_id);
+		break;
+	default:
+		printf("programming error: (%s)\n", strerror(-ret));
+	}
+}
+
+cmdline_parse_inst_t cmd_set_macsec_sc = {
+	.f = cmd_set_macsec_sc_parsed,
+	.data = NULL,
+	.help_str = "set macsec sc tx|rx <port_id> <mac> <pi>",
+	.tokens = {
+		(void *)&cmd_macsec_sc_set,
+		(void *)&cmd_macsec_sc_macsec,
+		(void *)&cmd_macsec_sc_sc,
+		(void *)&cmd_macsec_sc_tx_rx,
+		(void *)&cmd_macsec_sc_port_id,
+		(void *)&cmd_macsec_sc_mac,
+		(void *)&cmd_macsec_sc_pi,
+		NULL,
+	},
+};
+
+/* Common result structure for MACsec secure connection configure */
+struct cmd_macsec_sa_result {
+	cmdline_fixed_string_t set;
+	cmdline_fixed_string_t macsec;
+	cmdline_fixed_string_t sa;
+	cmdline_fixed_string_t tx_rx;
+	uint8_t port_id;
+	uint8_t idx;
+	uint8_t an;
+	uint32_t pn;
+	cmdline_fixed_string_t key;
+};
+
+/* Common CLI fields for MACsec secure connection configure */
+cmdline_parse_token_string_t cmd_macsec_sa_set =
+	TOKEN_STRING_INITIALIZER
+		(struct cmd_macsec_sa_result,
+		 set, "set");
+cmdline_parse_token_string_t cmd_macsec_sa_macsec =
+	TOKEN_STRING_INITIALIZER
+		(struct cmd_macsec_sa_result,
+		 macsec, "macsec");
+cmdline_parse_token_string_t cmd_macsec_sa_sa =
+	TOKEN_STRING_INITIALIZER
+		(struct cmd_macsec_sa_result,
+		 sa, "sa");
+cmdline_parse_token_string_t cmd_macsec_sa_tx_rx =
+	TOKEN_STRING_INITIALIZER
+		(struct cmd_macsec_sa_result,
+		 tx_rx, "tx#rx");
+cmdline_parse_token_num_t cmd_macsec_sa_port_id =
+	TOKEN_NUM_INITIALIZER
+		(struct cmd_macsec_sa_result,
+		 port_id, UINT8);
+cmdline_parse_token_num_t cmd_macsec_sa_idx =
+	TOKEN_NUM_INITIALIZER
+		(struct cmd_macsec_sa_result,
+		 idx, UINT8);
+cmdline_parse_token_num_t cmd_macsec_sa_an =
+	TOKEN_NUM_INITIALIZER
+		(struct cmd_macsec_sa_result,
+		 an, UINT8);
+cmdline_parse_token_num_t cmd_macsec_sa_pn =
+	TOKEN_NUM_INITIALIZER
+		(struct cmd_macsec_sa_result,
+		 pn, UINT32);
+cmdline_parse_token_string_t cmd_macsec_sa_key =
+	TOKEN_STRING_INITIALIZER
+		(struct cmd_macsec_sa_result,
+		 key, NULL);
+
+static void
+cmd_set_macsec_sa_parsed(
+	void *parsed_result,
+	__attribute__((unused)) struct cmdline *cl,
+	__attribute__((unused)) void *data)
+{
+	struct cmd_macsec_sa_result *res = parsed_result;
+	int ret;
+	int is_tx = (strcmp(res->tx_rx, "tx") == 0) ? 1 : 0;
+	uint8_t key[16] = { 0 };
+	uint8_t xdgt0;
+	uint8_t xdgt1;
+	int key_len;
+	int i;
+
+	key_len = strlen(res->key) / 2;
+	if (key_len > 16)
+		key_len = 16;
+
+	for (i = 0; i < key_len; i++) {
+		xdgt0 = parse_and_check_key_hexa_digit(res->key, (i * 2));
+		if (xdgt0 == 0xFF)
+			return;
+		xdgt1 = parse_and_check_key_hexa_digit(res->key, (i * 2) + 1);
+		if (xdgt1 == 0xFF)
+			return;
+		key[i] = (uint8_t) ((xdgt0 * 16) + xdgt1);
+	}
+
+	ret = is_tx ?
+		rte_pmd_ixgbe_macsec_select_txsa(res->port_id,
+			res->idx, res->an, res->pn, key) :
+		rte_pmd_ixgbe_macsec_select_rxsa(res->port_id,
+			res->idx, res->an, res->pn, key);
+	switch (ret) {
+	case 0:
+		break;
+	case -EINVAL:
+		printf("invalid idx %d or an %d\n", res->idx, res->an);
+		break;
+	case -ENODEV:
+		printf("invalid port_id %d\n", res->port_id);
+		break;
+	default:
+		printf("programming error: (%s)\n", strerror(-ret));
+	}
+}
+
+cmdline_parse_inst_t cmd_set_macsec_sa = {
+	.f = cmd_set_macsec_sa_parsed,
+	.data = NULL,
+	.help_str = "set macsec sa tx|rx <port_id> <idx> <an> <pn> <key>",
+	.tokens = {
+		(void *)&cmd_macsec_sa_set,
+		(void *)&cmd_macsec_sa_macsec,
+		(void *)&cmd_macsec_sa_sa,
+		(void *)&cmd_macsec_sa_tx_rx,
+		(void *)&cmd_macsec_sa_port_id,
+		(void *)&cmd_macsec_sa_idx,
+		(void *)&cmd_macsec_sa_an,
+		(void *)&cmd_macsec_sa_pn,
+		(void *)&cmd_macsec_sa_key,
+		NULL,
+	},
+};
 #endif
 
 /* ******************************************************************************** */
@@ -11656,6 +12041,10 @@ cmdline_parse_ctx_t main_ctx[] = {
 	(cmdline_parse_inst_t *)&cmd_set_all_queues_drop_en,
 	(cmdline_parse_inst_t *)&cmd_set_vf_split_drop_en,
 	(cmdline_parse_inst_t *)&cmd_set_vf_mac_addr,
+	(cmdline_parse_inst_t *)&cmd_set_macsec_offload_on,
+	(cmdline_parse_inst_t *)&cmd_set_macsec_offload_off,
+	(cmdline_parse_inst_t *)&cmd_set_macsec_sc,
+	(cmdline_parse_inst_t *)&cmd_set_macsec_sa,
 #endif
 	NULL,
 };
diff --git a/app/test-pmd/macfwd.c b/app/test-pmd/macfwd.c
index d361db1..cf7eab1 100644
--- a/app/test-pmd/macfwd.c
+++ b/app/test-pmd/macfwd.c
@@ -113,6 +113,8 @@ pkt_burst_mac_forward(struct fwd_stream *fs)
 		ol_flags = PKT_TX_VLAN_PKT;
 	if (txp->tx_ol_flags & TESTPMD_TX_OFFLOAD_INSERT_QINQ)
 		ol_flags |= PKT_TX_QINQ_PKT;
+	if (txp->tx_ol_flags & TESTPMD_TX_OFFLOAD_MACSEC)
+		ol_flags |= PKT_TX_MACSEC;
 	for (i = 0; i < nb_rx; i++) {
 		if (likely(i < nb_rx - 1))
 			rte_prefetch0(rte_pktmbuf_mtod(pkts_burst[i + 1],
diff --git a/app/test-pmd/macswap.c b/app/test-pmd/macswap.c
index f996039..3a09351 100644
--- a/app/test-pmd/macswap.c
+++ b/app/test-pmd/macswap.c
@@ -113,6 +113,8 @@ pkt_burst_mac_swap(struct fwd_stream *fs)
 		ol_flags = PKT_TX_VLAN_PKT;
 	if (txp->tx_ol_flags & TESTPMD_TX_OFFLOAD_INSERT_QINQ)
 		ol_flags |= PKT_TX_QINQ_PKT;
+	if (txp->tx_ol_flags & TESTPMD_TX_OFFLOAD_MACSEC)
+		ol_flags |= PKT_TX_MACSEC;
 	for (i = 0; i < nb_rx; i++) {
 		if (likely(i < nb_rx - 1))
 			rte_prefetch0(rte_pktmbuf_mtod(pkts_burst[i + 1],
diff --git a/app/test-pmd/testpmd.h b/app/test-pmd/testpmd.h
index 22ce2d6..0a9a1af 100644
--- a/app/test-pmd/testpmd.h
+++ b/app/test-pmd/testpmd.h
@@ -143,6 +143,8 @@ struct fwd_stream {
 #define TESTPMD_TX_OFFLOAD_INSERT_VLAN       0x0040
 /** Insert double VLAN header in forward engine */
 #define TESTPMD_TX_OFFLOAD_INSERT_QINQ       0x0080
+/** Offload MACsec in forward engine */
+#define TESTPMD_TX_OFFLOAD_MACSEC            0x0100
 
 /** Descriptor for a single flow. */
 struct port_flow {
diff --git a/app/test-pmd/txonly.c b/app/test-pmd/txonly.c
index e996f35..8b1a2af 100644
--- a/app/test-pmd/txonly.c
+++ b/app/test-pmd/txonly.c
@@ -215,6 +215,8 @@ pkt_burst_transmit(struct fwd_stream *fs)
 		ol_flags = PKT_TX_VLAN_PKT;
 	if (txp->tx_ol_flags & TESTPMD_TX_OFFLOAD_INSERT_QINQ)
 		ol_flags |= PKT_TX_QINQ_PKT;
+	if (txp->tx_ol_flags & TESTPMD_TX_OFFLOAD_MACSEC)
+		ol_flags |= PKT_TX_MACSEC;
 	for (nb_pkt = 0; nb_pkt < nb_pkt_per_burst; nb_pkt++) {
 		pkt = rte_mbuf_raw_alloc(mbp);
 		if (pkt == NULL) {
diff --git a/doc/guides/testpmd_app_ug/testpmd_funcs.rst b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
index c611dc5..e3222c0 100644
--- a/doc/guides/testpmd_app_ug/testpmd_funcs.rst
+++ b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
@@ -507,6 +507,38 @@ Set mac antispoof for a VF from the PF::
 
    testpmd> set vf mac antispoof  (port_id) (vf_id) (on|off)
 
+set macsec offload
+~~~~~~~~~~~~~~~~~~
+
+Enable/disable MACsec offload::
+
+   testpmd> set macsec offload (port_id) on encrypt (on|off) replay-protect (on|off)
+   testpmd> set macsec offload (port_id) off
+
+set macsec sc
+~~~~~~~~~~~~~
+
+Configure MACsec secure connection (SC)::
+
+   testpmd> set macsec sc (tx|rx) (port_id) (mac) (pi)
+
+.. note::
+
+   The pi argument is ignored for tx.
+   Check the NIC Datasheet for hardware limits.
+
+set macsec sa
+~~~~~~~~~~~~~
+
+Configure MACsec secure association (SA)::
+
+   testpmd> set macsec sa (tx|rx) (port_id) (idx) (an) (pn) (key)
+
+.. note::
+
+   The IDX value must be 0 or 1.
+   Check the NIC Datasheet for hardware limits.
+
 vlan set strip
 ~~~~~~~~~~~~~~
 
-- 
2.7.4

^ permalink raw reply related

* [PATCH v6 6/6] doc: add ixgbe specific APIs
From: Tiwei Bie @ 2017-01-11  4:31 UTC (permalink / raw)
  To: dev
  Cc: adrien.mazarguil, wenzhuo.lu, john.mcnamara, olivier.matz,
	thomas.monjalon, konstantin.ananyev, helin.zhang, wei.dai,
	xiao.w.wang
In-Reply-To: <1484109098-8936-1-git-send-email-tiwei.bie@intel.com>

Add information about the new ixgbe PMD APIs in the release notes.

Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
---
 doc/guides/rel_notes/release_17_02.rst | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/doc/guides/rel_notes/release_17_02.rst b/doc/guides/rel_notes/release_17_02.rst
index 3958714..a3490a5 100644
--- a/doc/guides/rel_notes/release_17_02.rst
+++ b/doc/guides/rel_notes/release_17_02.rst
@@ -52,6 +52,11 @@ New Features
   See the :ref:`Generic flow API <Generic_flow_API>` documentation for more
   information.
 
+* **Added APIs for MACsec offload support to the ixgbe PMD.**
+
+  Six new APIs have been added to the ixgbe PMD for MACsec offload support.
+  The declarations for the APIs can be found in ``rte_pmd_ixgbe.h``.
+
 * **Improved offload support in mbuf.**
 
   Added a new Tx mbuf flag for MACsec, which can be set by applications
-- 
2.7.4

^ permalink raw reply related


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox