Netdev List

Netdev List
 help / color / mirror / Atom feed

* Re: [BACKPORT 4.14.y 4/8] net: sctp: fix warning "NULL check before some freeing functions is not needed"
From: Baolin Wang @ 2019-09-04  2:38 UTC (permalink / raw)
  To: Greg KH
  Cc: Marcelo Ricardo Leitner, # 3.4.x, vyasevich, Neil Horman,
	David Miller, hariprasad.kelam, linux-sctp, Networking,
	Arnd Bergmann, Orson Zhai, Vincent Guittot, LKML
In-Reply-To: <20190903183331.GB26562@kroah.com>

On Wed, 4 Sep 2019 at 02:33, Greg KH <greg@kroah.com> wrote:
>
> On Tue, Sep 03, 2019 at 11:52:06AM -0300, Marcelo Ricardo Leitner wrote:
> > On Tue, Sep 03, 2019 at 02:58:16PM +0800, Baolin Wang wrote:
> > > From: Hariprasad Kelam <hariprasad.kelam@gmail.com>
> > >
> > > This patch removes NULL checks before calling kfree.
> > >
> > > fixes below issues reported by coccicheck
> > > net/sctp/sm_make_chunk.c:2586:3-8: WARNING: NULL check before some
> > > freeing functions is not needed.
> > > net/sctp/sm_make_chunk.c:2652:3-8: WARNING: NULL check before some
> > > freeing functions is not needed.
> > > net/sctp/sm_make_chunk.c:2667:3-8: WARNING: NULL check before some
> > > freeing functions is not needed.
> > > net/sctp/sm_make_chunk.c:2684:3-8: WARNING: NULL check before some
> > > freeing functions is not needed.
> >
> > Hi. This doesn't seem the kind of patch that should be backported to
> > such old/stable releases. After all, it's just a cleanup.
>
> I agree, this does not seem necessary _unless_ it is needed for a later
> real fix.

It can remove warnings from our product kernel since this patch
(c4964bfaf433 sctp: Free cookie before we memdup a new one) was merged
into stable, we still need backport it to our product kernel manually.

But if you still think this is unnecessary, please ignore this patch.
Thanks for your comments.

-- 
Baolin Wang
Best Regards

^ permalink raw reply

* [PATCH 1/3] ixgbe: Use kzfree() rather than its implementation.
From: zhong jiang @ 2019-09-04  2:39 UTC (permalink / raw)
  To: davem, anna.schumaker, trond.myklebust; +Cc: zhongjiang, netdev, linux-kernel
In-Reply-To: <1567564752-6430-1-git-send-email-zhongjiang@huawei.com>

Use kzfree() instead of memset() + kfree().

Signed-off-by: zhong jiang <zhongjiang@huawei.com>
---
 drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c | 9 +++------
 1 file changed, 3 insertions(+), 6 deletions(-)

diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
index 31629fc..113f608 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
@@ -960,11 +960,9 @@ int ixgbe_ipsec_vf_add_sa(struct ixgbe_adapter *adapter, u32 *msgbuf, u32 vf)
 	return 0;
 
 err_aead:
-	memset(xs->aead, 0, sizeof(*xs->aead));
-	kfree(xs->aead);
+	kzfree(xs->aead);
 err_xs:
-	memset(xs, 0, sizeof(*xs));
-	kfree(xs);
+	kzfree(xs);
 err_out:
 	msgbuf[1] = err;
 	return err;
@@ -1049,8 +1047,7 @@ int ixgbe_ipsec_vf_del_sa(struct ixgbe_adapter *adapter, u32 *msgbuf, u32 vf)
 	ixgbe_ipsec_del_sa(xs);
 
 	/* remove the xs that was made-up in the add request */
-	memset(xs, 0, sizeof(*xs));
-	kfree(xs);
+	kzfree(xs);
 
 	return 0;
 }
-- 
1.7.12.4


^ permalink raw reply related

* [PATCH 0/3] net: Use kzfree() directly
From: zhong jiang @ 2019-09-04  2:39 UTC (permalink / raw)
  To: davem, anna.schumaker, trond.myklebust; +Cc: zhongjiang, netdev, linux-kernel

With the help of Coccinelle. We find some place to replace.

@@
expression M, S;
@@

- memset(M, 0, S);
- kfree(M);
+ kzfree(M); 

zhong jiang (3):
  ixgbe: Use kzfree() rather than its implementation.
  sunrpc: Use kzfree rather than its implementation.
  net: mpoa: Use kzfree rather than its implementation.

 drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c | 9 +++------
 net/atm/mpoa_caches.c                          | 6 ++----
 net/sunrpc/auth_gss/gss_krb5_keys.c            | 9 +++------
 3 files changed, 8 insertions(+), 16 deletions(-)

-- 
1.7.12.4

^ permalink raw reply

* [PATCH 2/3] sunrpc: Use kzfree rather than its implementation.
From: zhong jiang @ 2019-09-04  2:39 UTC (permalink / raw)
  To: davem, anna.schumaker, trond.myklebust; +Cc: zhongjiang, netdev, linux-kernel
In-Reply-To: <1567564752-6430-1-git-send-email-zhongjiang@huawei.com>

Use kzfree instead of memset() + kfree().

Signed-off-by: zhong jiang <zhongjiang@huawei.com>
---
 net/sunrpc/auth_gss/gss_krb5_keys.c | 9 +++------
 1 file changed, 3 insertions(+), 6 deletions(-)

diff --git a/net/sunrpc/auth_gss/gss_krb5_keys.c b/net/sunrpc/auth_gss/gss_krb5_keys.c
index 550fdf1..3b7f721 100644
--- a/net/sunrpc/auth_gss/gss_krb5_keys.c
+++ b/net/sunrpc/auth_gss/gss_krb5_keys.c
@@ -228,14 +228,11 @@ u32 krb5_derive_key(const struct gss_krb5_enctype *gk5e,
 	ret = 0;
 
 err_free_raw:
-	memset(rawkey, 0, keybytes);
-	kfree(rawkey);
+	kzfree(rawkey);
 err_free_out:
-	memset(outblockdata, 0, blocksize);
-	kfree(outblockdata);
+	kzfree(outblockdata);
 err_free_in:
-	memset(inblockdata, 0, blocksize);
-	kfree(inblockdata);
+	kzfree(inblockdata);
 err_free_cipher:
 	crypto_free_sync_skcipher(cipher);
 err_return:
-- 
1.7.12.4


^ permalink raw reply related

* [PATCH 3/3] net: mpoa: Use kzfree rather than its implementation.
From: zhong jiang @ 2019-09-04  2:39 UTC (permalink / raw)
  To: davem, anna.schumaker, trond.myklebust; +Cc: zhongjiang, netdev, linux-kernel
In-Reply-To: <1567564752-6430-1-git-send-email-zhongjiang@huawei.com>

Use kzfree instead of memset() + kfree().

Signed-off-by: zhong jiang <zhongjiang@huawei.com>
---
 net/atm/mpoa_caches.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/net/atm/mpoa_caches.c b/net/atm/mpoa_caches.c
index 4bb4183..3286f9d 100644
--- a/net/atm/mpoa_caches.c
+++ b/net/atm/mpoa_caches.c
@@ -180,8 +180,7 @@ static int cache_hit(in_cache_entry *entry, struct mpoa_client *mpc)
 static void in_cache_put(in_cache_entry *entry)
 {
 	if (refcount_dec_and_test(&entry->use)) {
-		memset(entry, 0, sizeof(in_cache_entry));
-		kfree(entry);
+		kzfree(entry);
 	}
 }
 
@@ -416,8 +415,7 @@ static eg_cache_entry *eg_cache_get_by_src_ip(__be32 ipaddr,
 static void eg_cache_put(eg_cache_entry *entry)
 {
 	if (refcount_dec_and_test(&entry->use)) {
-		memset(entry, 0, sizeof(eg_cache_entry));
-		kfree(entry);
+		kzfree(entry);
 	}
 }
 
-- 
1.7.12.4


^ permalink raw reply related

* Re: [RFC v3] vhost: introduce mdev based hardware vhost backend
From: Tiwei Bie @ 2019-09-04  2:48 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: jasowang, alex.williamson, maxime.coquelin, linux-kernel, kvm,
	virtualization, netdev, dan.daly, cunming.liang, zhihong.wang,
	lingshan.zhu
In-Reply-To: <20190903043704-mutt-send-email-mst@kernel.org>

On Tue, Sep 03, 2019 at 07:26:03AM -0400, Michael S. Tsirkin wrote:
> On Wed, Aug 28, 2019 at 01:37:12PM +0800, Tiwei Bie wrote:
> > Details about this can be found here:
> > 
> > https://lwn.net/Articles/750770/
> > 
> > What's new in this version
> > ==========================
> > 
> > There are three choices based on the discussion [1] in RFC v2:
> > 
> > > #1. We expose a VFIO device, so we can reuse the VFIO container/group
> > >     based DMA API and potentially reuse a lot of VFIO code in QEMU.
> > >
> > >     But in this case, we have two choices for the VFIO device interface
> > >     (i.e. the interface on top of VFIO device fd):
> > >
> > >     A) we may invent a new vhost protocol (as demonstrated by the code
> > >        in this RFC) on VFIO device fd to make it work in VFIO's way,
> > >        i.e. regions and irqs.
> > >
> > >     B) Or as you proposed, instead of inventing a new vhost protocol,
> > >        we can reuse most existing vhost ioctls on the VFIO device fd
> > >        directly. There should be no conflicts between the VFIO ioctls
> > >        (type is 0x3B) and VHOST ioctls (type is 0xAF) currently.
> > >
> > > #2. Instead of exposing a VFIO device, we may expose a VHOST device.
> > >     And we will introduce a new mdev driver vhost-mdev to do this.
> > >     It would be natural to reuse the existing kernel vhost interface
> > >     (ioctls) on it as much as possible. But we will need to invent
> > >     some APIs for DMA programming (reusing VHOST_SET_MEM_TABLE is a
> > >     choice, but it's too heavy and doesn't support vIOMMU by itself).
> > 
> > This version is more like a quick PoC to try Jason's proposal on
> > reusing vhost ioctls. And the second way (#1/B) in above three
> > choices was chosen in this version to demonstrate the idea quickly.
> > 
> > Now the userspace API looks like this:
> > 
> > - VFIO's container/group based IOMMU API is used to do the
> >   DMA programming.
> > 
> > - Vhost's existing ioctls are used to setup the device.
> > 
> > And the device will report device_api as "vfio-vhost".
> > 
> > Note that, there are dirty hacks in this version. If we decide to
> > go this way, some refactoring in vhost.c/vhost.h may be needed.
> > 
> > PS. The direct mapping of the notify registers isn't implemented
> >     in this version.
> > 
> > [1] https://lkml.org/lkml/2019/7/9/101
> > 
> > Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
> 
> ....
> 
> > +long vhost_mdev_ioctl(struct mdev_device *mdev, unsigned int cmd,
> > +		      unsigned long arg)
> > +{
> > +	void __user *argp = (void __user *)arg;
> > +	struct vhost_mdev *vdpa;
> > +	unsigned long minsz;
> > +	int ret = 0;
> > +
> > +	if (!mdev)
> > +		return -EINVAL;
> > +
> > +	vdpa = mdev_get_drvdata(mdev);
> > +	if (!vdpa)
> > +		return -ENODEV;
> > +
> > +	switch (cmd) {
> > +	case VFIO_DEVICE_GET_INFO:
> > +	{
> > +		struct vfio_device_info info;
> > +
> > +		minsz = offsetofend(struct vfio_device_info, num_irqs);
> > +
> > +		if (copy_from_user(&info, (void __user *)arg, minsz)) {
> > +			ret = -EFAULT;
> > +			break;
> > +		}
> > +
> > +		if (info.argsz < minsz) {
> > +			ret = -EINVAL;
> > +			break;
> > +		}
> > +
> > +		info.flags = VFIO_DEVICE_FLAGS_VHOST;
> > +		info.num_regions = 0;
> > +		info.num_irqs = 0;
> > +
> > +		if (copy_to_user((void __user *)arg, &info, minsz)) {
> > +			ret = -EFAULT;
> > +			break;
> > +		}
> > +
> > +		break;
> > +	}
> > +	case VFIO_DEVICE_GET_REGION_INFO:
> > +	case VFIO_DEVICE_GET_IRQ_INFO:
> > +	case VFIO_DEVICE_SET_IRQS:
> > +	case VFIO_DEVICE_RESET:
> > +		ret = -EINVAL;
> > +		break;
> > +
> > +	case VHOST_MDEV_SET_STATE:
> > +		ret = vhost_set_state(vdpa, argp);
> > +		break;
> > +	case VHOST_GET_FEATURES:
> > +		ret = vhost_get_features(vdpa, argp);
> > +		break;
> > +	case VHOST_SET_FEATURES:
> > +		ret = vhost_set_features(vdpa, argp);
> > +		break;
> > +	case VHOST_GET_VRING_BASE:
> > +		ret = vhost_get_vring_base(vdpa, argp);
> > +		break;
> > +	default:
> > +		ret = vhost_dev_ioctl(&vdpa->dev, cmd, argp);
> > +		if (ret == -ENOIOCTLCMD)
> > +			ret = vhost_vring_ioctl(&vdpa->dev, cmd, argp);
> > +	}
> > +
> > +	return ret;
> > +}
> > +EXPORT_SYMBOL(vhost_mdev_ioctl);
> 
> 
> I don't have a problem with this approach. A small question:
> would it make sense to have two fds: send vhost ioctls
> on one and vfio ioctls on another?
> We can then pass vfio fd to the vhost fd with a
> SET_BACKEND ioctl.
> 
> What do you think?

I like this idea! I will give it a try.
So we can introduce /dev/vhost-mdev to have the vhost fd, and let
userspace pass vfio fd to the vhost fd with a SET_BACKEND ioctl.

Thanks a lot!
Tiwei

> 
> -- 
> MST

^ permalink raw reply

* [PATCH] net: hsr: remove an redundant null check before kfree_skb
From: zhong jiang @ 2019-09-04  3:09 UTC (permalink / raw)
  To: davem, arvid.brodin; +Cc: zhongjiang, netdev, linux-kernel

kfree_skb has taken the null pointer into account. Hence just remove
the null check before kfree_skb.

Signed-off-by: zhong jiang <zhongjiang@huawei.com>
---
 net/hsr/hsr_forward.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/net/hsr/hsr_forward.c b/net/hsr/hsr_forward.c
index ddd9605..0c9e5b0 100644
--- a/net/hsr/hsr_forward.c
+++ b/net/hsr/hsr_forward.c
@@ -367,10 +367,8 @@ void hsr_forward_skb(struct sk_buff *skb, struct hsr_port *port)
 		port->dev->stats.tx_bytes += skb->len;
 	}
 
-	if (frame.skb_hsr)
-		kfree_skb(frame.skb_hsr);
-	if (frame.skb_std)
-		kfree_skb(frame.skb_std);
+	kfree_skb(frame.skb_hsr);
+	kfree_skb(frame.skb_std);
 	return;
 
 out_drop:
-- 
1.7.12.4


^ permalink raw reply related

* Re: [PATCH] net-ipv6: fix excessive RTF_ADDRCONF flag on ::1/128 local route (and others)
From: Lorenzo Colitti @ 2019-09-04  3:17 UTC (permalink / raw)
  To: David Ahern; +Cc: Maciej Żenczykowski, David S . Miller, Linux NetDev
In-Reply-To: <60b98521-cf3a-1130-896d-2947fc4d5290@gmail.com>

On Wed, Sep 4, 2019 at 4:45 AM David Ahern <dsahern@gmail.com> wrote:
>
> exactly. It was shortsighted of me to add the ADDRCONF flag and removing
> it reverts back to the previous behavior.
>
> When I enable radvd, I do see the flag set when it should be and not for
> other addresses. I believe the patch is correct.

Ah, wait, I was confused. Sorry.

What I was saying is that RTF_ADDRCONF flag should be set on the local
table /128 route for the IPv6 addresses configured by RAs. The way
those are configured is ndisc_router_discovery -> addrconf_prefix_rcv
-> addrconf_prefix_rcv_add_addr -> ipv6_add_addr ->
addrconf_f6i_alloc. Because in this patch, addrconf_f6i_alloc
unconditionally clears RTF_ADDRCONF, I didn't see how the flag could
be set. But I now realize that that was not happening before David's
commit, either: in 5.1 those routes were not created with RTF_ADDRCONF
either.

In other words: before 5.1, the /128 routes in the local table to IPv6
addresses created by SLAAC did not have RTF_ADDRCONF. After David's
commit, they did, because *all* /128 routes to IPv6 addresses had
RTF_ADDRCONF (correct for SLAAC adresses, but definitely incorrect for
manual addresses, loopback, etc.). If this commit is applied, we'll go
back to the 5.1 state. In the future it would be good to ensure that
at least the /128 routes created by SLAAC do have RTF_ADDRCONF, but no
need to do so in this commit.

^ permalink raw reply

* [PATCHv2 net-next] ipmr: remove cache_resolve_queue_len
From: Hangbin Liu @ 2019-09-04  3:34 UTC (permalink / raw)
  To: netdev
  Cc: Phil Karn, Sukumar Gopalakrishnan, David S . Miller,
	Alexey Kuznetsov, Hideaki YOSHIFUJI, Hangbin Liu
In-Reply-To: <20190903084359.13310-1-liuhangbin@gmail.com>

This is a re-post of previous patch wrote by David Miller[1].

Phil Karn reported[2] that on busy networks with lots of unresolved
multicast routing entries, the creation of new multicast group routes
can be extremely slow and unreliable.

The reason is we hard-coded multicast route entries with unresolved source
addresses(cache_resolve_queue_len) to 10. If some multicast route never
resolves and the unresolved source addresses increased, there will
be no ability to create new multicast route cache.

To resolve this issue, we need either add a sysctl entry to make the
cache_resolve_queue_len configurable, or just remove cache_resolve_queue_len
directly, as we already have the socket receive queue limits of mrouted
socket, pointed by David.

From my side, I'd perfer to remove the cache_resolve_queue_len instead
of creating two more(IPv4 and IPv6 version) sysctl entry.

[1] https://lkml.org/lkml/2018/7/22/11
[2] https://lkml.org/lkml/2018/7/21/343

v2: hold the mfc_unres_lock while walking the unresolved list in
queue_count(), as Nikolay Aleksandrov remind.

Reported-by: Phil Karn <karn@ka9q.net>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
---
 include/linux/mroute_base.h |  2 --
 net/ipv4/ipmr.c             | 30 +++++++++++++++++++++---------
 net/ipv6/ip6mr.c            | 10 +++-------
 3 files changed, 24 insertions(+), 18 deletions(-)

diff --git a/include/linux/mroute_base.h b/include/linux/mroute_base.h
index 34de06b426ef..50fb89533029 100644
--- a/include/linux/mroute_base.h
+++ b/include/linux/mroute_base.h
@@ -235,7 +235,6 @@ struct mr_table_ops {
  * @mfc_hash: Hash table of all resolved routes for easy lookup
  * @mfc_cache_list: list of resovled routes for possible traversal
  * @maxvif: Identifier of highest value vif currently in use
- * @cache_resolve_queue_len: current size of unresolved queue
  * @mroute_do_assert: Whether to inform userspace on wrong ingress
  * @mroute_do_pim: Whether to receive IGMP PIMv1
  * @mroute_reg_vif_num: PIM-device vif index
@@ -252,7 +251,6 @@ struct mr_table {
 	struct rhltable		mfc_hash;
 	struct list_head	mfc_cache_list;
 	int			maxvif;
-	atomic_t		cache_resolve_queue_len;
 	bool			mroute_do_assert;
 	bool			mroute_do_pim;
 	bool			mroute_do_wrvifwhole;
diff --git a/net/ipv4/ipmr.c b/net/ipv4/ipmr.c
index c07bc82cbbe9..4d67c64d94a4 100644
--- a/net/ipv4/ipmr.c
+++ b/net/ipv4/ipmr.c
@@ -744,8 +744,6 @@ static void ipmr_destroy_unres(struct mr_table *mrt, struct mfc_cache *c)
 	struct sk_buff *skb;
 	struct nlmsgerr *e;
 
-	atomic_dec(&mrt->cache_resolve_queue_len);
-
 	while ((skb = skb_dequeue(&c->_c.mfc_un.unres.unresolved))) {
 		if (ip_hdr(skb)->version == 0) {
 			struct nlmsghdr *nlh = skb_pull(skb,
@@ -1133,9 +1131,11 @@ static int ipmr_cache_unresolved(struct mr_table *mrt, vifi_t vifi,
 	}
 
 	if (!found) {
+		bool was_empty;
+
 		/* Create a new entry if allowable */
-		if (atomic_read(&mrt->cache_resolve_queue_len) >= 10 ||
-		    (c = ipmr_cache_alloc_unres()) == NULL) {
+		c = ipmr_cache_alloc_unres();
+		if (!c) {
 			spin_unlock_bh(&mfc_unres_lock);
 
 			kfree_skb(skb);
@@ -1161,11 +1161,11 @@ static int ipmr_cache_unresolved(struct mr_table *mrt, vifi_t vifi,
 			return err;
 		}
 
-		atomic_inc(&mrt->cache_resolve_queue_len);
+		was_empty = list_empty(&mrt->mfc_unres_queue);
 		list_add(&c->_c.list, &mrt->mfc_unres_queue);
 		mroute_netlink_event(mrt, c, RTM_NEWROUTE);
 
-		if (atomic_read(&mrt->cache_resolve_queue_len) == 1)
+		if (was_empty)
 			mod_timer(&mrt->ipmr_expire_timer,
 				  c->_c.mfc_un.unres.expires);
 	}
@@ -1272,7 +1272,6 @@ static int ipmr_mfc_add(struct net *net, struct mr_table *mrt,
 		if (uc->mfc_origin == c->mfc_origin &&
 		    uc->mfc_mcastgrp == c->mfc_mcastgrp) {
 			list_del(&_uc->list);
-			atomic_dec(&mrt->cache_resolve_queue_len);
 			found = true;
 			break;
 		}
@@ -1328,7 +1327,7 @@ static void mroute_clean_tables(struct mr_table *mrt, int flags)
 	}
 
 	if (flags & MRT_FLUSH_MFC) {
-		if (atomic_read(&mrt->cache_resolve_queue_len) != 0) {
+		if (!list_empty(&mrt->mfc_unres_queue)) {
 			spin_lock_bh(&mfc_unres_lock);
 			list_for_each_entry_safe(c, tmp, &mrt->mfc_unres_queue, list) {
 				list_del(&c->list);
@@ -2750,9 +2749,22 @@ static int ipmr_rtm_route(struct sk_buff *skb, struct nlmsghdr *nlh,
 		return ipmr_mfc_delete(tbl, &mfcc, parent);
 }
 
+static int queue_count(struct mr_table *mrt)
+{
+	struct list_head *pos;
+	int count = 0;
+
+	spin_lock_bh(&mfc_unres_lock);
+	list_for_each(pos, &mrt->mfc_unres_queue)
+		count++;
+	spin_unlock_bh(&mfc_unres_lock);
+
+	return count;
+}
+
 static bool ipmr_fill_table(struct mr_table *mrt, struct sk_buff *skb)
 {
-	u32 queue_len = atomic_read(&mrt->cache_resolve_queue_len);
+	u32 queue_len = queue_count(mrt);
 
 	if (nla_put_u32(skb, IPMRA_TABLE_ID, mrt->id) ||
 	    nla_put_u32(skb, IPMRA_TABLE_CACHE_RES_QUEUE_LEN, queue_len) ||
diff --git a/net/ipv6/ip6mr.c b/net/ipv6/ip6mr.c
index e80d36c5073d..d02f0f903559 100644
--- a/net/ipv6/ip6mr.c
+++ b/net/ipv6/ip6mr.c
@@ -768,8 +768,6 @@ static void ip6mr_destroy_unres(struct mr_table *mrt, struct mfc6_cache *c)
 	struct net *net = read_pnet(&mrt->net);
 	struct sk_buff *skb;
 
-	atomic_dec(&mrt->cache_resolve_queue_len);
-
 	while ((skb = skb_dequeue(&c->_c.mfc_un.unres.unresolved)) != NULL) {
 		if (ipv6_hdr(skb)->version == 0) {
 			struct nlmsghdr *nlh = skb_pull(skb,
@@ -1148,8 +1146,8 @@ static int ip6mr_cache_unresolved(struct mr_table *mrt, mifi_t mifi,
 		 *	Create a new entry if allowable
 		 */
 
-		if (atomic_read(&mrt->cache_resolve_queue_len) >= 10 ||
-		    (c = ip6mr_cache_alloc_unres()) == NULL) {
+		c = ip6mr_cache_alloc_unres();
+		if (!c) {
 			spin_unlock_bh(&mfc_unres_lock);
 
 			kfree_skb(skb);
@@ -1176,7 +1174,6 @@ static int ip6mr_cache_unresolved(struct mr_table *mrt, mifi_t mifi,
 			return err;
 		}
 
-		atomic_inc(&mrt->cache_resolve_queue_len);
 		list_add(&c->_c.list, &mrt->mfc_unres_queue);
 		mr6_netlink_event(mrt, c, RTM_NEWROUTE);
 
@@ -1468,7 +1465,6 @@ static int ip6mr_mfc_add(struct net *net, struct mr_table *mrt,
 		if (ipv6_addr_equal(&uc->mf6c_origin, &c->mf6c_origin) &&
 		    ipv6_addr_equal(&uc->mf6c_mcastgrp, &c->mf6c_mcastgrp)) {
 			list_del(&_uc->list);
-			atomic_dec(&mrt->cache_resolve_queue_len);
 			found = true;
 			break;
 		}
@@ -1526,7 +1522,7 @@ static void mroute_clean_tables(struct mr_table *mrt, int flags)
 	}
 
 	if (flags & MRT6_FLUSH_MFC) {
-		if (atomic_read(&mrt->cache_resolve_queue_len) != 0) {
+		if (!list_empty(&mrt->mfc_unres_queue)) {
 			spin_lock_bh(&mfc_unres_lock);
 			list_for_each_entry_safe(c, tmp, &mrt->mfc_unres_queue, list) {
 				list_del(&c->list);
-- 
2.19.2


^ permalink raw reply related

* [PATCH 0/3] net: remove an redundant continue
From: zhong jiang @ 2019-09-04  3:46 UTC (permalink / raw)
  To: davem, kvalo, pkshih; +Cc: zhongjiang, netdev, linux-kernel

With the help of Coccinelle. we find some place to replace.

@@

for (...;...;...) {
   ...
   if (...) {
     ...
-   continue;
   }
}


zhong jiang (3):
  rtlwifi: Remove an unnecessary continue in
    _rtl8723be_phy_config_bb_with_pgheaderfile
  nfp: Drop unnecessary continue in nfp_net_pf_alloc_vnics
  ath10k: Drop unnecessary continue in ath10k_mac_update_vif_chan

 drivers/net/ethernet/netronome/nfp/nfp_net_main.c    | 4 +---
 drivers/net/wireless/ath/ath10k/mac.c                | 4 +---
 drivers/net/wireless/realtek/rtlwifi/rtl8723be/phy.c | 1 -
 3 files changed, 2 insertions(+), 7 deletions(-)

-- 
1.7.12.4


^ permalink raw reply

* [PATCH 3/3] ath10k: Drop unnecessary continue in ath10k_mac_update_vif_chan
From: zhong jiang @ 2019-09-04  3:46 UTC (permalink / raw)
  To: davem, kvalo, pkshih; +Cc: zhongjiang, netdev, linux-kernel
In-Reply-To: <1567568784-9669-1-git-send-email-zhongjiang@huawei.com>

Continue is not needed at the bottom of a loop. Hence just remove it.

Signed-off-by: zhong jiang <zhongjiang@huawei.com>
---
 drivers/net/wireless/ath/ath10k/mac.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/drivers/net/wireless/ath/ath10k/mac.c b/drivers/net/wireless/ath/ath10k/mac.c
index 12dad65..91e4635 100644
--- a/drivers/net/wireless/ath/ath10k/mac.c
+++ b/drivers/net/wireless/ath/ath10k/mac.c
@@ -7804,11 +7804,9 @@ static int ath10k_ampdu_action(struct ieee80211_hw *hw,
 			continue;
 
 		ret = ath10k_wmi_vdev_down(ar, arvif->vdev_id);
-		if (ret) {
+		if (ret)
 			ath10k_warn(ar, "failed to down vdev %d: %d\n",
 				    arvif->vdev_id, ret);
-			continue;
-		}
 	}
 
 	/* All relevant vdevs are downed and associated channel resources
-- 
1.7.12.4


^ permalink raw reply related

* [PATCH 2/3] nfp: Drop unnecessary continue in nfp_net_pf_alloc_vnics
From: zhong jiang @ 2019-09-04  3:46 UTC (permalink / raw)
  To: davem, kvalo, pkshih; +Cc: zhongjiang, netdev, linux-kernel
In-Reply-To: <1567568784-9669-1-git-send-email-zhongjiang@huawei.com>

Continue is not needed at the bottom of a loop.

Signed-off-by: zhong jiang <zhongjiang@huawei.com>
---
 drivers/net/ethernet/netronome/nfp/nfp_net_main.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/netronome/nfp/nfp_net_main.c b/drivers/net/ethernet/netronome/nfp/nfp_net_main.c
index 986464d..68db47d 100644
--- a/drivers/net/ethernet/netronome/nfp/nfp_net_main.c
+++ b/drivers/net/ethernet/netronome/nfp/nfp_net_main.c
@@ -205,10 +205,8 @@ static void nfp_net_pf_free_vnics(struct nfp_pf *pf)
 		ctrl_bar += NFP_PF_CSR_SLICE_SIZE;
 
 		/* Kill the vNIC if app init marked it as invalid */
-		if (nn->port && nn->port->type == NFP_PORT_INVALID) {
+		if (nn->port && nn->port->type == NFP_PORT_INVALID)
 			nfp_net_pf_free_vnic(pf, nn);
-			continue;
-		}
 	}
 
 	if (list_empty(&pf->vnics))
-- 
1.7.12.4


^ permalink raw reply related

* [PATCH 1/3] rtlwifi: Remove an unnecessary continue in _rtl8723be_phy_config_bb_with_pgheaderfile
From: zhong jiang @ 2019-09-04  3:46 UTC (permalink / raw)
  To: davem, kvalo, pkshih; +Cc: zhongjiang, netdev, linux-kernel
In-Reply-To: <1567568784-9669-1-git-send-email-zhongjiang@huawei.com>

Continue is not needed at the bottom of a loop. Hence just drop it.

Signed-off-by: zhong jiang <zhongjiang@huawei.com>
---
 drivers/net/wireless/realtek/rtlwifi/rtl8723be/phy.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/net/wireless/realtek/rtlwifi/rtl8723be/phy.c b/drivers/net/wireless/realtek/rtlwifi/rtl8723be/phy.c
index aa8a095..4805b84 100644
--- a/drivers/net/wireless/realtek/rtlwifi/rtl8723be/phy.c
+++ b/drivers/net/wireless/realtek/rtlwifi/rtl8723be/phy.c
@@ -732,7 +732,6 @@ static bool _rtl8723be_phy_config_bb_with_pgheaderfile(struct ieee80211_hw *hw,
 				else
 					_rtl8723be_store_tx_power_by_rate(hw,
 							v1, v2, v3, v4, v5, v6);
-				continue;
 			}
 		}
 	} else {
-- 
1.7.12.4


^ permalink raw reply related

* RE: [PATCH net-next 1/5] net/tls: use the full sk_proto pointer
From: John Fastabend @ 2019-09-04  4:27 UTC (permalink / raw)
  To: Jakub Kicinski, davem
  Cc: netdev, oss-drivers, davejwatson, borisp, aviadye, john.fastabend,
	daniel, Jakub Kicinski, John Hurley, Dirk van der Merwe
In-Reply-To: <20190903043106.27570-2-jakub.kicinski@netronome.com>

Jakub Kicinski wrote:
> Since we already have the pointer to the full original sk_proto
> stored use that instead of storing all individual callback
> pointers as well.
> 
> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
> Reviewed-by: John Hurley <john.hurley@netronome.com>
> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
> ---
>  drivers/crypto/chelsio/chtls/chtls_main.c |  6 +++--
>  include/net/tls.h                         | 10 ---------
>  net/tls/tls_main.c                        | 27 +++++++++--------------
>  3 files changed, 14 insertions(+), 29 deletions(-)
> 

I like it should probably do the same to tcp_bpf.c.

Acked-by: John Fastabend <john.fastabend@gmail.com>

^ permalink raw reply

* RE: [PATCH net-next 4/5] net/tls: clean up the number of #ifdefs for CONFIG_TLS_DEVICE
From: John Fastabend @ 2019-09-04  4:31 UTC (permalink / raw)
  To: Jakub Kicinski, davem
  Cc: netdev, oss-drivers, davejwatson, borisp, aviadye, john.fastabend,
	daniel, Jakub Kicinski, John Hurley, Dirk van der Merwe
In-Reply-To: <20190903043106.27570-5-jakub.kicinski@netronome.com>

Jakub Kicinski wrote:
> TLS code has a number of #ifdefs which make the code a little
> harder to follow. Recent fixes removed the ifdef around the
> TLS_HW define, so we can switch to the often used pattern
> of defining tls_device functions as empty static inlines
> in the header when CONFIG_TLS_DEVICE=n.
> 
> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
> Reviewed-by: John Hurley <john.hurley@netronome.com>
> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
> ---
>  include/net/tls.h  | 38 ++++++++++++++++++++++++++++++++------
>  net/tls/tls_main.c | 19 +------------------
>  net/tls/tls_sw.c   |  6 ++----
>  3 files changed, 35 insertions(+), 28 deletions(-)

Thanks I've been meaning to do this I agree it looks nicer.

Acked-by: John Fastabend <john.fastabend@gmail.com>

^ permalink raw reply

* Re: [RFC v3] vhost: introduce mdev based hardware vhost backend
From: Jason Wang @ 2019-09-04  4:32 UTC (permalink / raw)
  To: Tiwei Bie, Michael S. Tsirkin
  Cc: alex.williamson, maxime.coquelin, linux-kernel, kvm,
	virtualization, netdev, dan.daly, cunming.liang, zhihong.wang,
	lingshan.zhu
In-Reply-To: <20190904024801.GA5671@___>


On 2019/9/4 上午10:48, Tiwei Bie wrote:
> On Tue, Sep 03, 2019 at 07:26:03AM -0400, Michael S. Tsirkin wrote:
>> On Wed, Aug 28, 2019 at 01:37:12PM +0800, Tiwei Bie wrote:
>>> Details about this can be found here:
>>>
>>> https://lwn.net/Articles/750770/
>>>
>>> What's new in this version
>>> ==========================
>>>
>>> There are three choices based on the discussion [1] in RFC v2:
>>>
>>>> #1. We expose a VFIO device, so we can reuse the VFIO container/group
>>>>      based DMA API and potentially reuse a lot of VFIO code in QEMU.
>>>>
>>>>      But in this case, we have two choices for the VFIO device interface
>>>>      (i.e. the interface on top of VFIO device fd):
>>>>
>>>>      A) we may invent a new vhost protocol (as demonstrated by the code
>>>>         in this RFC) on VFIO device fd to make it work in VFIO's way,
>>>>         i.e. regions and irqs.
>>>>
>>>>      B) Or as you proposed, instead of inventing a new vhost protocol,
>>>>         we can reuse most existing vhost ioctls on the VFIO device fd
>>>>         directly. There should be no conflicts between the VFIO ioctls
>>>>         (type is 0x3B) and VHOST ioctls (type is 0xAF) currently.
>>>>
>>>> #2. Instead of exposing a VFIO device, we may expose a VHOST device.
>>>>      And we will introduce a new mdev driver vhost-mdev to do this.
>>>>      It would be natural to reuse the existing kernel vhost interface
>>>>      (ioctls) on it as much as possible. But we will need to invent
>>>>      some APIs for DMA programming (reusing VHOST_SET_MEM_TABLE is a
>>>>      choice, but it's too heavy and doesn't support vIOMMU by itself).
>>> This version is more like a quick PoC to try Jason's proposal on
>>> reusing vhost ioctls. And the second way (#1/B) in above three
>>> choices was chosen in this version to demonstrate the idea quickly.
>>>
>>> Now the userspace API looks like this:
>>>
>>> - VFIO's container/group based IOMMU API is used to do the
>>>    DMA programming.
>>>
>>> - Vhost's existing ioctls are used to setup the device.
>>>
>>> And the device will report device_api as "vfio-vhost".
>>>
>>> Note that, there are dirty hacks in this version. If we decide to
>>> go this way, some refactoring in vhost.c/vhost.h may be needed.
>>>
>>> PS. The direct mapping of the notify registers isn't implemented
>>>      in this version.
>>>
>>> [1] https://lkml.org/lkml/2019/7/9/101
>>>
>>> Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
>> ....
>>
>>> +long vhost_mdev_ioctl(struct mdev_device *mdev, unsigned int cmd,
>>> +		      unsigned long arg)
>>> +{
>>> +	void __user *argp = (void __user *)arg;
>>> +	struct vhost_mdev *vdpa;
>>> +	unsigned long minsz;
>>> +	int ret = 0;
>>> +
>>> +	if (!mdev)
>>> +		return -EINVAL;
>>> +
>>> +	vdpa = mdev_get_drvdata(mdev);
>>> +	if (!vdpa)
>>> +		return -ENODEV;
>>> +
>>> +	switch (cmd) {
>>> +	case VFIO_DEVICE_GET_INFO:
>>> +	{
>>> +		struct vfio_device_info info;
>>> +
>>> +		minsz = offsetofend(struct vfio_device_info, num_irqs);
>>> +
>>> +		if (copy_from_user(&info, (void __user *)arg, minsz)) {
>>> +			ret = -EFAULT;
>>> +			break;
>>> +		}
>>> +
>>> +		if (info.argsz < minsz) {
>>> +			ret = -EINVAL;
>>> +			break;
>>> +		}
>>> +
>>> +		info.flags = VFIO_DEVICE_FLAGS_VHOST;
>>> +		info.num_regions = 0;
>>> +		info.num_irqs = 0;
>>> +
>>> +		if (copy_to_user((void __user *)arg, &info, minsz)) {
>>> +			ret = -EFAULT;
>>> +			break;
>>> +		}
>>> +
>>> +		break;
>>> +	}
>>> +	case VFIO_DEVICE_GET_REGION_INFO:
>>> +	case VFIO_DEVICE_GET_IRQ_INFO:
>>> +	case VFIO_DEVICE_SET_IRQS:
>>> +	case VFIO_DEVICE_RESET:
>>> +		ret = -EINVAL;
>>> +		break;
>>> +
>>> +	case VHOST_MDEV_SET_STATE:
>>> +		ret = vhost_set_state(vdpa, argp);
>>> +		break;
>>> +	case VHOST_GET_FEATURES:
>>> +		ret = vhost_get_features(vdpa, argp);
>>> +		break;
>>> +	case VHOST_SET_FEATURES:
>>> +		ret = vhost_set_features(vdpa, argp);
>>> +		break;
>>> +	case VHOST_GET_VRING_BASE:
>>> +		ret = vhost_get_vring_base(vdpa, argp);
>>> +		break;
>>> +	default:
>>> +		ret = vhost_dev_ioctl(&vdpa->dev, cmd, argp);
>>> +		if (ret == -ENOIOCTLCMD)
>>> +			ret = vhost_vring_ioctl(&vdpa->dev, cmd, argp);
>>> +	}
>>> +
>>> +	return ret;
>>> +}
>>> +EXPORT_SYMBOL(vhost_mdev_ioctl);
>>
>> I don't have a problem with this approach. A small question:
>> would it make sense to have two fds: send vhost ioctls
>> on one and vfio ioctls on another?
>> We can then pass vfio fd to the vhost fd with a
>> SET_BACKEND ioctl.
>>
>> What do you think?
> I like this idea! I will give it a try.
> So we can introduce /dev/vhost-mdev to have the vhost fd,


You still need to think about how to connect it to current sysfs based 
mdev management interface, or you want to invent another API, or just 
use the /dev/vhost-net but pass vfio fd through ioctl to the file.

Thanks


>   and let
> userspace pass vfio fd to the vhost fd with a SET_BACKEND ioctl.
>
> Thanks a lot!
> Tiwei
>
>> -- 
>> MST

^ permalink raw reply

* [net-next 13/15] ice: Report stats when VSI is down
From: Jeff Kirsher @ 2019-09-04  4:35 UTC (permalink / raw)
  To: davem; +Cc: Dave Ertman, netdev, nhorman, sassmann, Andrew Bowers,
	Jeff Kirsher
In-Reply-To: <20190904043512.28066-1-jeffrey.t.kirsher@intel.com>

From: Dave Ertman <david.m.ertman@intel.com>

There is currently a check in get_ndo_stats that
returns before updating stats if the VSI is down
or there are no Tx or Rx queues.  This causes the
netdev to report zero stats with the netdev is down.

Remove the check so that the behavior of reporting
stats is the same as it was in IXGBE.

Signed-off-by: Dave Ertman <david.m.ertman@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/ice/ice_main.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c
index 732397a2e8fa..50a17a0337be 100644
--- a/drivers/net/ethernet/intel/ice/ice_main.c
+++ b/drivers/net/ethernet/intel/ice/ice_main.c
@@ -3477,12 +3477,16 @@ void ice_get_stats64(struct net_device *netdev, struct rtnl_link_stats64 *stats)
 
 	vsi_stats = &vsi->net_stats;
 
-	if (test_bit(__ICE_DOWN, vsi->state) || !vsi->num_txq || !vsi->num_rxq)
+	if (!vsi->num_txq || !vsi->num_rxq)
 		return;
+
 	/* netdev packet/byte stats come from ring counter. These are obtained
 	 * by summing up ring counters (done by ice_update_vsi_ring_stats).
+	 * But, only call the update routine and read the registers if VSI is
+	 * not down.
 	 */
-	ice_update_vsi_ring_stats(vsi);
+	if (!test_bit(__ICE_DOWN, vsi->state))
+		ice_update_vsi_ring_stats(vsi);
 	stats->tx_packets = vsi_stats->tx_packets;
 	stats->tx_bytes = vsi_stats->tx_bytes;
 	stats->rx_packets = vsi_stats->rx_packets;
-- 
2.21.0


^ permalink raw reply related

* [net-next 00/15][pull request] 100GbE Intel Wired LAN Driver Updates 2019-09-03
From: Jeff Kirsher @ 2019-09-04  4:34 UTC (permalink / raw)
  To: davem; +Cc: Jeff Kirsher, netdev, nhorman, sassmann

This series contains updates to ice driver only.

Anirudh adds the ability for the driver to handle EMP resets correctly
by adding the logic to the existing ice_reset_subtask().

Jeb fixes up the logic to properly free up the resources for a switch
rule whether or not it was successful in the removal.

Brett fixes up the reporting of ITR values to let the user know odd ITR
values are not allowed.  Fixes the driver to only disable VLAN pruning
on VLAN deletion when the VLAN being deleted is the last VLAN on the VF
VSI.

Chinh updates the driver to determine the TSA value from the priority
value when in CEE mode.

Bruce aligns the driver with the hardware specification by ensuring that
a PF reset is done as part of the unload logic.  Also update the driver
unloading field, based on the latest hardware specification, which
allows us to remove an unnecessary endian conversion.  Moves #defines
based on their need in the code.

Jesse adds the current state of auto-negotiation in the link up message.
In addition, adds additional information to inform the user of an issue
with the topology/configuration of the link.

Usha updates the driver to allow the maximum TCs that the firmware
supports, rather than hard coding to a set value.

Dave updates the DCB initialization flow to handle the case of an actual
error during DCB init.  Updated the driver to report the current stats,
even when the netdev is down, which aligns with our other drivers.

Mitch fixes the VF reset code flows to ensure that it properly calls
ice_dis_vsi_txq() to notify the firmware that the VF is being reset.

Michal fixes the driver so the DCB is not enabled when the SW LLDP is
activated, which was causing a communication issue with other NICs.  The
problem lies in that DCB was being enabled without checking the number
of TCs.

The following are changes since commit 67538eb5c00f08d7fe27f1bb703098b17302bdc0:
  Merge branch 'mvpp2-per-cpu-buffers'
and are available in the git repository at:
  git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue 100GbE

Anirudh Venkataramanan (1):
  ice: Fix EMP reset handling

Brett Creeley (2):
  ice: Report what the user set for coalesce [tx|rx]-usecs
  ice: Only disable VLAN pruning for the VF when all VLANs are removed

Bruce Allan (2):
  ice: add needed PFR during driver unload
  ice: update driver unloading field for Queue Shutdown AQ command

Chinh T Cao (1):
  ice: Deduce TSA value from the priority value in the CEE mode

Dave Ertman (2):
  ice: Correctly handle return values for init DCB
  ice: Report stats when VSI is down

Jeb Cramer (1):
  ice: Fix resource leak in ice_remove_rule_internal()

Jesse Brandeburg (2):
  ice: add print of autoneg state to link message
  ice: print extra message if topology issue

Michal Swiatkowski (1):
  ice: Remove enable DCB when SW LLDP is activated

Mitch Williams (1):
  ice: Always notify FW of VF reset

Tony Nguyen (1):
  ice: Cleanup defines in ice_type.h

Usha Ketineni (1):
  ice: Limit Max TCs on devices with more than 4 ports

 .../net/ethernet/intel/ice/ice_adminq_cmd.h   |  5 +-
 drivers/net/ethernet/intel/ice/ice_common.c   | 14 ++-
 drivers/net/ethernet/intel/ice/ice_dcb.c      |  8 +-
 drivers/net/ethernet/intel/ice/ice_dcb_lib.c  | 16 ++--
 drivers/net/ethernet/intel/ice/ice_ethtool.c  | 88 +++++++++----------
 drivers/net/ethernet/intel/ice/ice_main.c     | 43 ++++++++-
 drivers/net/ethernet/intel/ice/ice_switch.c   |  5 +-
 drivers/net/ethernet/intel/ice/ice_type.h     |  9 +-
 .../net/ethernet/intel/ice/ice_virtchnl_pf.c  | 30 ++++---
 9 files changed, 144 insertions(+), 74 deletions(-)

-- 
2.21.0

^ permalink raw reply

* [net-next 10/15] ice: Limit Max TCs on devices with more than 4 ports
From: Jeff Kirsher @ 2019-09-04  4:35 UTC (permalink / raw)
  To: davem
  Cc: Usha Ketineni, netdev, nhorman, sassmann, Tony Nguyen,
	Andrew Bowers, Jeff Kirsher
In-Reply-To: <20190904043512.28066-1-jeffrey.t.kirsher@intel.com>

From: Usha Ketineni <usha.k.ketineni@intel.com>

This patch limits the max TCs set by the driver to the value provided by
the firmware as per the capabilities of the device. Otherwise, hard coding
to 8 TC max would fail the device configurations with more than 4 ports.

Signed-off-by: Usha Ketineni <usha.k.ketineni@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/ice/ice_adminq_cmd.h |  1 +
 drivers/net/ethernet/intel/ice/ice_common.c     | 12 ++++++++++++
 drivers/net/ethernet/intel/ice/ice_dcb_lib.c    | 10 ++++++++--
 drivers/net/ethernet/intel/ice/ice_type.h       |  3 +++
 4 files changed, 24 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/intel/ice/ice_adminq_cmd.h b/drivers/net/ethernet/intel/ice/ice_adminq_cmd.h
index 8ebc695171b6..4da0cde9695b 100644
--- a/drivers/net/ethernet/intel/ice/ice_adminq_cmd.h
+++ b/drivers/net/ethernet/intel/ice/ice_adminq_cmd.h
@@ -91,6 +91,7 @@ struct ice_aqc_list_caps_elem {
 #define ICE_AQC_CAPS_SRIOV				0x0012
 #define ICE_AQC_CAPS_VF					0x0013
 #define ICE_AQC_CAPS_VSI				0x0017
+#define ICE_AQC_CAPS_DCB				0x0018
 #define ICE_AQC_CAPS_RSS				0x0040
 #define ICE_AQC_CAPS_RXQS				0x0041
 #define ICE_AQC_CAPS_TXQS				0x0042
diff --git a/drivers/net/ethernet/intel/ice/ice_common.c b/drivers/net/ethernet/intel/ice/ice_common.c
index 6c0abb284c10..9492cd34b09d 100644
--- a/drivers/net/ethernet/intel/ice/ice_common.c
+++ b/drivers/net/ethernet/intel/ice/ice_common.c
@@ -1594,6 +1594,18 @@ ice_parse_caps(struct ice_hw *hw, void *buf, u32 cap_count,
 					  prefix, func_p->guar_num_vsi);
 			}
 			break;
+		case ICE_AQC_CAPS_DCB:
+			caps->dcb = (number == 1);
+			caps->active_tc_bitmap = logical_id;
+			caps->maxtc = phys_id;
+			ice_debug(hw, ICE_DBG_INIT,
+				  "%s: DCB = %d\n", prefix, caps->dcb);
+			ice_debug(hw, ICE_DBG_INIT,
+				  "%s: active TC bitmap = %d\n", prefix,
+				  caps->active_tc_bitmap);
+			ice_debug(hw, ICE_DBG_INIT,
+				  "%s: TC max = %d\n", prefix, caps->maxtc);
+			break;
 		case ICE_AQC_CAPS_RSS:
 			caps->rss_table_size = number;
 			caps->rss_table_entry_width = logical_id;
diff --git a/drivers/net/ethernet/intel/ice/ice_dcb_lib.c b/drivers/net/ethernet/intel/ice/ice_dcb_lib.c
index d9578919aad8..4614ec95529b 100644
--- a/drivers/net/ethernet/intel/ice/ice_dcb_lib.c
+++ b/drivers/net/ethernet/intel/ice/ice_dcb_lib.c
@@ -413,7 +413,7 @@ static int ice_dcb_sw_dflt_cfg(struct ice_pf *pf, bool locked)
 	memset(&pi->local_dcbx_cfg, 0, sizeof(*dcbcfg));
 
 	dcbcfg->etscfg.willing = 1;
-	dcbcfg->etscfg.maxtcs = 8;
+	dcbcfg->etscfg.maxtcs = hw->func_caps.common_cap.maxtc;
 	dcbcfg->etscfg.tcbwtable[0] = 100;
 	dcbcfg->etscfg.tsatable[0] = ICE_IEEE_TSA_ETS;
 
@@ -422,7 +422,7 @@ static int ice_dcb_sw_dflt_cfg(struct ice_pf *pf, bool locked)
 	dcbcfg->etsrec.willing = 0;
 
 	dcbcfg->pfc.willing = 1;
-	dcbcfg->pfc.pfccap = IEEE_8021QAZ_MAX_TCS;
+	dcbcfg->pfc.pfccap = hw->func_caps.common_cap.maxtc;
 
 	dcbcfg->numapps = 1;
 	dcbcfg->app[0].selector = ICE_APP_SEL_ETHTYPE;
@@ -454,6 +454,9 @@ int ice_init_pf_dcb(struct ice_pf *pf, bool locked)
 	err = ice_init_dcb(hw);
 	if (err) {
 		/* FW LLDP is disabled, activate SW DCBX/LLDP mode */
+		dev_info(&pf->pdev->dev,
+			 "DCB is enabled in the hardware, max number of TCs supported on this port are %d\n",
+			 pf->hw.func_caps.common_cap.maxtc);
 		dev_info(&pf->pdev->dev,
 			 "FW LLDP is disabled, DCBx/LLDP in SW mode.\n");
 		port_info->is_sw_lldp = true;
@@ -484,6 +487,9 @@ int ice_init_pf_dcb(struct ice_pf *pf, bool locked)
 	if (err)
 		goto dcb_init_err;
 
+	dev_info(&pf->pdev->dev,
+		 "DCB is enabled in the hardware, max number of TCs supported on this port are %d\n",
+		 pf->hw.func_caps.common_cap.maxtc);
 	dev_info(&pf->pdev->dev, "DCBX offload supported\n");
 	return err;
 
diff --git a/drivers/net/ethernet/intel/ice/ice_type.h b/drivers/net/ethernet/intel/ice/ice_type.h
index 40b028e73234..4501d50a7dcc 100644
--- a/drivers/net/ethernet/intel/ice/ice_type.h
+++ b/drivers/net/ethernet/intel/ice/ice_type.h
@@ -139,6 +139,9 @@ struct ice_phy_info {
 /* Common HW capabilities for SW use */
 struct ice_hw_common_caps {
 	u32 valid_functions;
+	/* DCB capabilities */
+	u32 active_tc_bitmap;
+	u32 maxtc;
 
 	/* Tx/Rx queues */
 	u16 num_rxq;		/* Number/Total Rx queues */
-- 
2.21.0


^ permalink raw reply related

* [net-next 12/15] ice: Always notify FW of VF reset
From: Jeff Kirsher @ 2019-09-04  4:35 UTC (permalink / raw)
  To: davem
  Cc: Mitch Williams, netdev, nhorman, sassmann, Andrew Bowers,
	Jeff Kirsher
In-Reply-To: <20190904043512.28066-1-jeffrey.t.kirsher@intel.com>

From: Mitch Williams <mitch.a.williams@intel.com>

The call to ice_dis_vsi_txq() acts as the notification to the firmware
that the VF is being reset. Because of this, we need to make this call
every time we reset, regardless of whatever else we do to stop the Tx
queues.

Without this change, VF resets would fail to complete on interfaces that
were up and running.

Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 .../net/ethernet/intel/ice/ice_virtchnl_pf.c  | 25 ++++++++++++-------
 1 file changed, 16 insertions(+), 9 deletions(-)

diff --git a/drivers/net/ethernet/intel/ice/ice_virtchnl_pf.c b/drivers/net/ethernet/intel/ice/ice_virtchnl_pf.c
index b93324e9f4bc..c58e3e3212df 100644
--- a/drivers/net/ethernet/intel/ice/ice_virtchnl_pf.c
+++ b/drivers/net/ethernet/intel/ice/ice_virtchnl_pf.c
@@ -1074,9 +1074,16 @@ bool ice_reset_all_vfs(struct ice_pf *pf, bool is_vflr)
 	for (v = 0; v < pf->num_alloc_vfs; v++)
 		ice_trigger_vf_reset(&pf->vf[v], is_vflr);
 
-	for (v = 0; v < pf->num_alloc_vfs; v++)
-		if (test_bit(ICE_VF_STATE_QS_ENA, pf->vf[v].vf_states))
-			ice_dis_vf_qs(&pf->vf[v]);
+	for (v = 0; v < pf->num_alloc_vfs; v++) {
+		struct ice_vsi *vsi;
+
+		vf = &pf->vf[v];
+		vsi = pf->vsi[vf->lan_vsi_idx];
+		if (test_bit(ICE_VF_STATE_QS_ENA, vf->vf_states))
+			ice_dis_vf_qs(vf);
+		ice_dis_vsi_txq(vsi->port_info, vsi->idx, 0, 0, NULL, NULL,
+				NULL, ICE_VF_RESET, vf->vf_id, NULL);
+	}
 
 	/* HW requires some time to make sure it can flush the FIFO for a VF
 	 * when it resets it. Poll the VPGEN_VFRSTAT register for each VF in
@@ -1171,12 +1178,12 @@ static bool ice_reset_vf(struct ice_vf *vf, bool is_vflr)
 
 	if (test_bit(ICE_VF_STATE_QS_ENA, vf->vf_states))
 		ice_dis_vf_qs(vf);
-	else
-		/* Call Disable LAN Tx queue AQ call even when queues are not
-		 * enabled. This is needed for successful completion of VFR
-		 */
-		ice_dis_vsi_txq(vsi->port_info, vsi->idx, 0, 0, NULL, NULL,
-				NULL, ICE_VF_RESET, vf->vf_id, NULL);
+
+	/* Call Disable LAN Tx queue AQ whether or not queues are
+	 * enabled. This is needed for successful completion of VFR.
+	 */
+	ice_dis_vsi_txq(vsi->port_info, vsi->idx, 0, 0, NULL, NULL,
+			NULL, ICE_VF_RESET, vf->vf_id, NULL);
 
 	hw = &pf->hw;
 	/* poll VPGEN_VFRSTAT reg to make sure
-- 
2.21.0


^ permalink raw reply related

* [net-next 11/15] ice: Correctly handle return values for init DCB
From: Jeff Kirsher @ 2019-09-04  4:35 UTC (permalink / raw)
  To: davem; +Cc: Dave Ertman, netdev, nhorman, sassmann, Andrew Bowers,
	Jeff Kirsher
In-Reply-To: <20190904043512.28066-1-jeffrey.t.kirsher@intel.com>

From: Dave Ertman <david.m.ertman@intel.com>

In the init path for DCB, the call to ice_init_dcb()
can return a non-zero value for either an actual
error, or due to the FW lldp engine being stopped.

We are currently treating all non-zero values only as
an indication that the FW LLDP engine is stopped.

Check for an actual error in the DCB init flow.

Signed-off-by: Dave Ertman <david.m.ertman@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/ice/ice_dcb_lib.c | 17 ++++++++---------
 1 file changed, 8 insertions(+), 9 deletions(-)

diff --git a/drivers/net/ethernet/intel/ice/ice_dcb_lib.c b/drivers/net/ethernet/intel/ice/ice_dcb_lib.c
index 4614ec95529b..021e2e81d731 100644
--- a/drivers/net/ethernet/intel/ice/ice_dcb_lib.c
+++ b/drivers/net/ethernet/intel/ice/ice_dcb_lib.c
@@ -452,14 +452,18 @@ int ice_init_pf_dcb(struct ice_pf *pf, bool locked)
 	port_info = hw->port_info;
 
 	err = ice_init_dcb(hw);
+	if (err && !port_info->is_sw_lldp) {
+		dev_err(&pf->pdev->dev, "Error initializing DCB %d\n", err);
+		goto dcb_init_err;
+	}
+
+	dev_info(&pf->pdev->dev,
+		 "DCB is enabled in the hardware, max number of TCs supported on this port are %d\n",
+		 pf->hw.func_caps.common_cap.maxtc);
 	if (err) {
 		/* FW LLDP is disabled, activate SW DCBX/LLDP mode */
-		dev_info(&pf->pdev->dev,
-			 "DCB is enabled in the hardware, max number of TCs supported on this port are %d\n",
-			 pf->hw.func_caps.common_cap.maxtc);
 		dev_info(&pf->pdev->dev,
 			 "FW LLDP is disabled, DCBx/LLDP in SW mode.\n");
-		port_info->is_sw_lldp = true;
 		clear_bit(ICE_FLAG_FW_LLDP_AGENT, pf->flags);
 		err = ice_dcb_sw_dflt_cfg(pf, locked);
 		if (err) {
@@ -475,7 +479,6 @@ int ice_init_pf_dcb(struct ice_pf *pf, bool locked)
 		return 0;
 	}
 
-	port_info->is_sw_lldp = false;
 	set_bit(ICE_FLAG_FW_LLDP_AGENT, pf->flags);
 
 	/* DCBX in FW and LLDP enabled in FW */
@@ -487,10 +490,6 @@ int ice_init_pf_dcb(struct ice_pf *pf, bool locked)
 	if (err)
 		goto dcb_init_err;
 
-	dev_info(&pf->pdev->dev,
-		 "DCB is enabled in the hardware, max number of TCs supported on this port are %d\n",
-		 pf->hw.func_caps.common_cap.maxtc);
-	dev_info(&pf->pdev->dev, "DCBX offload supported\n");
 	return err;
 
 dcb_init_err:
-- 
2.21.0


^ permalink raw reply related

* [net-next 04/15] ice: Deduce TSA value from the priority value in the CEE mode
From: Jeff Kirsher @ 2019-09-04  4:35 UTC (permalink / raw)
  To: davem; +Cc: Chinh T Cao, netdev, nhorman, sassmann, Andrew Bowers,
	Jeff Kirsher
In-Reply-To: <20190904043512.28066-1-jeffrey.t.kirsher@intel.com>

From: Chinh T Cao <chinh.t.cao@intel.com>

In CEE mode, the TSA information can be derived from the reported
priority value.

Signed-off-by: Chinh T Cao <chinh.t.cao@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/ice/ice_dcb.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/intel/ice/ice_dcb.c b/drivers/net/ethernet/intel/ice/ice_dcb.c
index d60c942249e8..c5ee8d930611 100644
--- a/drivers/net/ethernet/intel/ice/ice_dcb.c
+++ b/drivers/net/ethernet/intel/ice/ice_dcb.c
@@ -444,9 +444,15 @@ ice_parse_cee_pgcfg_tlv(struct ice_cee_feat_tlv *tlv,
 	 *        |pg0|pg1|pg2|pg3|pg4|pg5|pg6|pg7|
 	 *        ---------------------------------
 	 */
-	ice_for_each_traffic_class(i)
+	ice_for_each_traffic_class(i) {
 		etscfg->tcbwtable[i] = buf[offset++];
 
+		if (etscfg->prio_table[i] == ICE_CEE_PGID_STRICT)
+			dcbcfg->etscfg.tsatable[i] = ICE_IEEE_TSA_STRICT;
+		else
+			dcbcfg->etscfg.tsatable[i] = ICE_IEEE_TSA_ETS;
+	}
+
 	/* Number of TCs supported (1 octet) */
 	etscfg->maxtcs = buf[offset];
 }
-- 
2.21.0


^ permalink raw reply related

* [net-next 06/15] ice: update driver unloading field for Queue Shutdown AQ command
From: Jeff Kirsher @ 2019-09-04  4:35 UTC (permalink / raw)
  To: davem; +Cc: Bruce Allan, netdev, nhorman, sassmann, Andrew Bowers,
	Jeff Kirsher
In-Reply-To: <20190904043512.28066-1-jeffrey.t.kirsher@intel.com>

From: Bruce Allan <bruce.w.allan@intel.com>

According to recent specification versions, the field in the Queue Shutdown
AdminQ command consisting of the "driver unloading" indication is not a 4
byte field (it is byte.bit 16.0).  Change it to a byte and remove the
unnecessary endian conversion.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/ice/ice_adminq_cmd.h | 4 ++--
 drivers/net/ethernet/intel/ice/ice_common.c     | 2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/intel/ice/ice_adminq_cmd.h b/drivers/net/ethernet/intel/ice/ice_adminq_cmd.h
index bf9aa533a7c6..8ebc695171b6 100644
--- a/drivers/net/ethernet/intel/ice/ice_adminq_cmd.h
+++ b/drivers/net/ethernet/intel/ice/ice_adminq_cmd.h
@@ -35,9 +35,9 @@ struct ice_aqc_get_ver {
 
 /* Queue Shutdown (direct 0x0003) */
 struct ice_aqc_q_shutdown {
-	__le32 driver_unloading;
+	u8 driver_unloading;
 #define ICE_AQC_DRIVER_UNLOADING	BIT(0)
-	u8 reserved[12];
+	u8 reserved[15];
 };
 
 /* Request resource ownership (direct 0x0008)
diff --git a/drivers/net/ethernet/intel/ice/ice_common.c b/drivers/net/ethernet/intel/ice/ice_common.c
index 302ad981129c..6c0abb284c10 100644
--- a/drivers/net/ethernet/intel/ice/ice_common.c
+++ b/drivers/net/ethernet/intel/ice/ice_common.c
@@ -1275,7 +1275,7 @@ enum ice_status ice_aq_q_shutdown(struct ice_hw *hw, bool unloading)
 	ice_fill_dflt_direct_cmd_desc(&desc, ice_aqc_opc_q_shutdown);
 
 	if (unloading)
-		cmd->driver_unloading = cpu_to_le32(ICE_AQC_DRIVER_UNLOADING);
+		cmd->driver_unloading = ICE_AQC_DRIVER_UNLOADING;
 
 	return ice_aq_send_cmd(hw, &desc, NULL, 0, NULL);
 }
-- 
2.21.0


^ permalink raw reply related

* [net-next 15/15] ice: Only disable VLAN pruning for the VF when all VLANs are removed
From: Jeff Kirsher @ 2019-09-04  4:35 UTC (permalink / raw)
  To: davem; +Cc: Brett Creeley, netdev, nhorman, sassmann, Andrew Bowers,
	Jeff Kirsher
In-Reply-To: <20190904043512.28066-1-jeffrey.t.kirsher@intel.com>

From: Brett Creeley <brett.creeley@intel.com>

Currently if the VF adds a VLAN, VLAN pruning will be enabled for that VSI.
Also, when a VLAN gets deleted it will disable VLAN pruning even if other
VLAN(s) exists for the VF. Fix this by only disabling VLAN pruning on the
VF VSI when removing the last VF (i.e. vf->num_vlan == 0).

Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/ice/ice_virtchnl_pf.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/intel/ice/ice_virtchnl_pf.c b/drivers/net/ethernet/intel/ice/ice_virtchnl_pf.c
index c58e3e3212df..c38939b1d496 100644
--- a/drivers/net/ethernet/intel/ice/ice_virtchnl_pf.c
+++ b/drivers/net/ethernet/intel/ice/ice_virtchnl_pf.c
@@ -2767,8 +2767,9 @@ static int ice_vc_process_vlan_msg(struct ice_vf *vf, u8 *msg, bool add_v)
 			}
 
 			vf->num_vlan--;
-			/* Disable VLAN pruning when removing VLAN */
-			ice_cfg_vlan_pruning(vsi, false, false);
+			/* Disable VLAN pruning when the last VLAN is removed */
+			if (!vf->num_vlan)
+				ice_cfg_vlan_pruning(vsi, false, false);
 
 			/* Disable Unicast/Multicast VLAN promiscuous mode */
 			if (vlan_promisc) {
-- 
2.21.0


^ permalink raw reply related

* [net-next 05/15] ice: add needed PFR during driver unload
From: Jeff Kirsher @ 2019-09-04  4:35 UTC (permalink / raw)
  To: davem; +Cc: Bruce Allan, netdev, nhorman, sassmann, Andrew Bowers,
	Jeff Kirsher
In-Reply-To: <20190904043512.28066-1-jeffrey.t.kirsher@intel.com>

From: Bruce Allan <bruce.w.allan@intel.com>

According to the specification, a PF Reset must be done as part of the
driver unload flow.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/ice/ice_main.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c
index b62c01ca9c28..8217b81eb9d8 100644
--- a/drivers/net/ethernet/intel/ice/ice_main.c
+++ b/drivers/net/ethernet/intel/ice/ice_main.c
@@ -2638,6 +2638,11 @@ static void ice_remove(struct pci_dev *pdev)
 	ice_deinit_pf(pf);
 	ice_deinit_hw(&pf->hw);
 	ice_clear_interrupt_scheme(pf);
+	/* Issue a PFR as part of the prescribed driver unload flow.  Do not
+	 * do it via ice_schedule_reset() since there is no need to rebuild
+	 * and the service task is already stopped.
+	 */
+	ice_reset(&pf->hw, ICE_RESET_PFR);
 	pci_disable_pcie_error_reporting(pdev);
 }
 
-- 
2.21.0


^ permalink raw reply related

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox