Netdev List
 help / color / mirror / Atom feed
* Re: Large performance regression with 6in4 tunnel (sit)
From: Eli Cooper @ 2016-11-27 16:22 UTC (permalink / raw)
  To: Stephen Rothwell; +Cc: netdev
In-Reply-To: <20161127130229.4c88cc2c@canb.auug.org.au>

Hi Stephen,


On 2016/11/27 10:02, Stephen Rothwell wrote:
> Hi Eli,
>
> On Sun, 27 Nov 2016 11:54:41 +1100 Stephen Rothwell <sfr@canb.auug.org.au> wrote:
>> On Fri, 25 Nov 2016 14:05:04 +0800 Eli Cooper <elicooper@gmx.com> wrote:
>>> I think this is similar to the bug I fixed in commit ae148b085876
>>> ("ip6_tunnel: Update skb->protocol to ETH_P_IPV6 in ip6_tnl_xmit()").
>>>
>>> I can reproduce a similar problem by applying xfrm to sit traffic.
>>> TSO/GSO packets are dropped when IPSec is enabled, and IPv6 throughput
>>> drops to 10s of Kbps. I am not sure if this is the same issue you
>>> experienced, but I wrote a patch that fixed at least the issue I had.
>>>
>>> Could you test the patch I sent to the mailing list just now?  
>> Thanks for the patch!
>>
>> Its a bit tricky to test since the problem only occurs in a production
>> machine (I tried reproducing in a VM, but the problem did not occur),
That's probably because the ethernet NIC in your VM does not support
segmentation offloading. You could, however, try reproducing it on
another (real) machine with the same driver.
>> but I will try to just rebuild the sit module and see if I can insert
>> the modified one.
> OK, I tried your patch and unfortunately, it doesn't seem to have
> worked ... I still get the large packets dropped and resent smaller.
>
It's a shame ... In my case, large packets are dropped only when xfrm is
in effect (therefore another output path is taken), and probably that's
not your case. Well, on the plus side, at least you reminded me that sit
device also needs to update skb's protocol.

Thanks,
Eli

^ permalink raw reply

* Re: [PATCH net-next 4/5] net/socket: add helpers for recvmmsg
From: Paolo Abeni @ 2016-11-27 16:21 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: netdev, David S. Miller, Eric Dumazet, Jesper Dangaard Brouer,
	Hannes Frederic Sowa, Sabrina Dubroca
In-Reply-To: <1480113022.8455.580.camel@edumazet-glaptop3.roam.corp.google.com>

Hi Eric,

On Fri, 2016-11-25 at 14:30 -0800, Eric Dumazet wrote:
> On Fri, 2016-11-25 at 16:39 +0100, Paolo Abeni wrote:
> > _skb_try_recv_datagram_batch dequeues multiple skb's from the
> > socket's receive queue, and runs the bulk_destructor callback under
> > the receive queue lock.
> 
> ...
> 
> > +	last = (struct sk_buff *)queue;
> > +	first = (struct sk_buff *)queue->next;
> > +	skb_queue_walk(queue, skb) {
> > +		last = skb;
> > +		totalsize += skb->truesize;
> > +		if (++datagrams == batch)
> > +			break;
> > +	}
> 
> This is absolutely not good.
> 
> Walking through a list, bringing 2 cache lines per skb, is not the
> proper way to deal with bulking.
> 
> And I do not see where 'batch' value coming from user space is capped ?
> 
> Is it really vlen argument coming from recvmmsg() system call ???
> 
> This code runs with BH masked, so you do not want to give user a way to
> make you loop there 1000 times 
> 
> Bulking is nice, only if you do not compromise with system stability and
> latency requirements from other users/applications.

Thank you for reviewing this.

You are right, the cacheline miss while accessing skb->truesize has
measurable performance impact, and the max burst length comes in from
recvmmsg(). 

We can easily cap the burst to some max value (e.g. 64 or less) and we
can pass to the bulk destructor the skb list and burst length without
accessing skb truesize beforehand. If the burst is short, say 8 skbs or
less, the bulk destructor walk again the list and release the memory,
elsewhere it defers the release after __skb_try_recv_datagram_batch()
completion: we walk the list without the lock held and we acquire it
later again to release all the memory. 

Thank you again,

Paolo

^ permalink raw reply

* [PATCH net] net/sched: act_pedit: limit negative offset
From: Amir Vadai @ 2016-11-27 15:58 UTC (permalink / raw)
  To: David S. Miller
  Cc: netdev, Jamal Hadi Salim, Or Gerlitz, Hadar Har-Zion, Jiri Pirko,
	Amir Vadai

Should not allow setting a negative offset that goes below the skb head.

Signed-off-by: Amir Vadai <amir@vadai.me>
---
Hi Dave,

Please pull to -stable branches.

Thanks,
Amir

 net/sched/act_pedit.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/net/sched/act_pedit.c b/net/sched/act_pedit.c
index b54d56d4959b..e79e8a88f2d2 100644
--- a/net/sched/act_pedit.c
+++ b/net/sched/act_pedit.c
@@ -154,8 +154,11 @@ static int tcf_pedit(struct sk_buff *skb, const struct tc_action *a,
 			}
 
 			ptr = skb_header_pointer(skb, off + offset, 4, &_data);
-			if (!ptr)
+			if ((unsigned char *)ptr < skb->head) {
+				pr_info("tc filter pedit offset out of bounds\n");
 				goto bad;
+			}
+
 			/* just do it, baby */
 			*ptr = ((*ptr & tkey->mask) ^ tkey->val);
 			if (ptr == &_data)
-- 
2.10.2

^ permalink raw reply related

* Re: [PATCH net-next 09/11] qede: Better utilize the qede_[rt]x_queue
From: kbuild test robot @ 2016-11-27 15:51 UTC (permalink / raw)
  To: Yuval Mintz; +Cc: kbuild-all, davem, netdev, Yuval Mintz
In-Reply-To: <1480258273-24973-10-git-send-email-Yuval.Mintz@cavium.com>

[-- Attachment #1: Type: text/plain, Size: 2946 bytes --]

Hi Yuval,

[auto build test WARNING on net-next/master]

url:    https://github.com/0day-ci/linux/commits/Yuval-Mintz/qed-Add-XDP-support/20161127-225956
config: tile-allmodconfig (attached as .config)
compiler: tilegx-linux-gcc (GCC) 4.6.2
reproduce:
        wget https://git.kernel.org/cgit/linux/kernel/git/wfg/lkp-tests.git/plain/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # save the attached .config to linux build tree
        make.cross ARCH=tile 

All warnings (new ones prefixed by >>):

   drivers/net/ethernet/qlogic/qede/qede_main.c: In function 'qede_alloc_mem_rxq':
>> drivers/net/ethernet/qlogic/qede/qede_main.c:2960:3: warning: large integer implicitly truncated to unsigned type [-Woverflow]

vim +2960 drivers/net/ethernet/qlogic/qede/qede_main.c

55482edc Manish Chopra 2016-03-04  2944  err:
55482edc Manish Chopra 2016-03-04  2945  	qede_free_sge_mem(edev, rxq);
55482edc Manish Chopra 2016-03-04  2946  	edev->gro_disable = 1;
55482edc Manish Chopra 2016-03-04  2947  	return -ENOMEM;
55482edc Manish Chopra 2016-03-04  2948  }
55482edc Manish Chopra 2016-03-04  2949  
2950219d Yuval Mintz   2015-10-26  2950  /* This function allocates all memory needed per Rx queue */
1a635e48 Yuval Mintz   2016-08-15  2951  static int qede_alloc_mem_rxq(struct qede_dev *edev, struct qede_rx_queue *rxq)
2950219d Yuval Mintz   2015-10-26  2952  {
f86af2df Manish Chopra 2016-04-20  2953  	int i, rc, size;
2950219d Yuval Mintz   2015-10-26  2954  
2950219d Yuval Mintz   2015-10-26  2955  	rxq->num_rx_buffers = edev->q_num_rx_buffers;
2950219d Yuval Mintz   2015-10-26  2956  
1a635e48 Yuval Mintz   2016-08-15  2957  	rxq->rx_buf_size = NET_IP_ALIGN + ETH_OVERHEAD + edev->ndev->mtu;
1a635e48 Yuval Mintz   2016-08-15  2958  
fc48b7a6 Yuval Mintz   2016-02-15  2959  	if (rxq->rx_buf_size > PAGE_SIZE)
fc48b7a6 Yuval Mintz   2016-02-15 @2960  		rxq->rx_buf_size = PAGE_SIZE;
fc48b7a6 Yuval Mintz   2016-02-15  2961  
fc48b7a6 Yuval Mintz   2016-02-15  2962  	/* Segment size to spilt a page in multiple equal parts */
fc48b7a6 Yuval Mintz   2016-02-15  2963  	rxq->rx_buf_seg_size = roundup_pow_of_two(rxq->rx_buf_size);
2950219d Yuval Mintz   2015-10-26  2964  
2950219d Yuval Mintz   2015-10-26  2965  	/* Allocate the parallel driver ring for Rx buffers */
fc48b7a6 Yuval Mintz   2016-02-15  2966  	size = sizeof(*rxq->sw_rx_ring) * RX_RING_SIZE;
2950219d Yuval Mintz   2015-10-26  2967  	rxq->sw_rx_ring = kzalloc(size, GFP_KERNEL);
2950219d Yuval Mintz   2015-10-26  2968  	if (!rxq->sw_rx_ring) {

:::::: The code at line 2960 was first introduced by commit
:::::: fc48b7a6148af974b49db145812a8b060324a503 qed/qede: use 8.7.3.0 FW.

:::::: TO: Yuval Mintz <Yuval.Mintz@qlogic.com>
:::::: CC: David S. Miller <davem@davemloft.net>

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 46485 bytes --]

^ permalink raw reply

* [PATCH net-next 02/10] net/mlx4: Replace ENOSYS with better fitting error codes
From: Tariq Toukan @ 2016-11-27 15:51 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev, Eran Ben Elisha, Tariq Toukan
In-Reply-To: <1480261877-19720-1-git-send-email-tariqt@mellanox.com>

Conform the following warning:
WARNING: ENOSYS means 'invalid syscall nr' and nothing else.

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx4/en_ethtool.c       | 2 +-
 drivers/net/ethernet/mellanox/mlx4/fw.c               | 2 +-
 drivers/net/ethernet/mellanox/mlx4/main.c             | 6 +++---
 drivers/net/ethernet/mellanox/mlx4/resource_tracker.c | 2 +-
 4 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx4/en_ethtool.c b/drivers/net/ethernet/mellanox/mlx4/en_ethtool.c
index 487a58f9c192..7e6d0425394d 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_ethtool.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_ethtool.c
@@ -1983,7 +1983,7 @@ static int mlx4_en_get_module_info(struct net_device *dev,
 		modinfo->eeprom_len = ETH_MODULE_SFF_8472_LEN;
 		break;
 	default:
-		return -ENOSYS;
+		return -EINVAL;
 	}
 
 	return 0;
diff --git a/drivers/net/ethernet/mellanox/mlx4/fw.c b/drivers/net/ethernet/mellanox/mlx4/fw.c
index b03b473a7b07..c2ba16e23169 100644
--- a/drivers/net/ethernet/mellanox/mlx4/fw.c
+++ b/drivers/net/ethernet/mellanox/mlx4/fw.c
@@ -672,7 +672,7 @@ int mlx4_QUERY_FUNC_CAP(struct mlx4_dev *dev, u8 gen_or_port,
 	MLX4_GET(field, outbox, QUERY_FUNC_CAP_PHYS_PORT_OFFSET);
 	func_cap->physical_port = field;
 	if (func_cap->physical_port != gen_or_port) {
-		err = -ENOSYS;
+		err = -EINVAL;
 		goto out;
 	}
 
diff --git a/drivers/net/ethernet/mellanox/mlx4/main.c b/drivers/net/ethernet/mellanox/mlx4/main.c
index 7cd1fb566f5a..41cd05f37b20 100644
--- a/drivers/net/ethernet/mellanox/mlx4/main.c
+++ b/drivers/net/ethernet/mellanox/mlx4/main.c
@@ -820,7 +820,7 @@ static int mlx4_slave_cap(struct mlx4_dev *dev)
 	 */
 	if (hca_param.global_caps) {
 		mlx4_err(dev, "Unknown hca global capabilities\n");
-		return -ENOSYS;
+		return -EINVAL;
 	}
 
 	mlx4_log_num_mgm_entry_size = hca_param.log_mc_entry_sz;
@@ -878,7 +878,7 @@ static int mlx4_slave_cap(struct mlx4_dev *dev)
 	    PF_CONTEXT_BEHAVIOUR_MASK) {
 		mlx4_err(dev, "Unknown pf context behaviour %x known flags %x\n",
 			 func_cap.pf_context_behaviour, PF_CONTEXT_BEHAVIOUR_MASK);
-		return -ENOSYS;
+		return -EINVAL;
 	}
 
 	dev->caps.num_ports		= func_cap.num_ports;
@@ -3476,7 +3476,7 @@ static int mlx4_load_one(struct pci_dev *pdev, int pci_dev_data,
 	mlx4_enable_msi_x(dev);
 	if ((mlx4_is_mfunc(dev)) &&
 	    !(dev->flags & MLX4_FLAG_MSI_X)) {
-		err = -ENOSYS;
+		err = -ENOTSUPP;
 		mlx4_err(dev, "INTx is not supported in multi-function mode, aborting\n");
 		goto err_free_eq;
 	}
diff --git a/drivers/net/ethernet/mellanox/mlx4/resource_tracker.c b/drivers/net/ethernet/mellanox/mlx4/resource_tracker.c
index ba7b70630d5d..05ba270c4862 100644
--- a/drivers/net/ethernet/mellanox/mlx4/resource_tracker.c
+++ b/drivers/net/ethernet/mellanox/mlx4/resource_tracker.c
@@ -1396,7 +1396,7 @@ static int remove_ok(struct res_common *res, enum mlx4_resource type, int extra)
 	case RES_MTT:
 		return remove_mtt_ok((struct res_mtt *)res, extra);
 	case RES_MAC:
-		return -ENOSYS;
+		return -ENOTSUPP;
 	case RES_EQ:
 		return remove_eq_ok((struct res_eq *)res);
 	case RES_COUNTER:
-- 
1.8.3.1

^ permalink raw reply related

* [PATCH net-next 04/10] net/mlx4_core: Device revision support
From: Tariq Toukan @ 2016-11-27 15:51 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev, Eran Ben Elisha, Yishai Hadas, Tariq Toukan
In-Reply-To: <1480261877-19720-1-git-send-email-tariqt@mellanox.com>

From: Yishai Hadas <yishaih@mellanox.com>

The device revision field returned by the NodeInfo MAD is incorrect
on ConnectX3 devices.

This patch is driver side handling to complete a FW fix added at 2.11.1172.
INIT_HCA - bit at offset 0x0C.12 is set to 1 so that FW will report
correct device revision.

Older FW versions won't be affected from turning on that bit,
no capability bit is needed.

Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx4/fw.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/mellanox/mlx4/fw.c b/drivers/net/ethernet/mellanox/mlx4/fw.c
index c2ba16e23169..53031cabf138 100644
--- a/drivers/net/ethernet/mellanox/mlx4/fw.c
+++ b/drivers/net/ethernet/mellanox/mlx4/fw.c
@@ -1888,7 +1888,7 @@ int mlx4_INIT_HCA(struct mlx4_dev *dev, struct mlx4_init_hca_param *param)
 	*((u8 *) mailbox->buf + INIT_HCA_VERSION_OFFSET) = INIT_HCA_VERSION;
 
 	*((u8 *) mailbox->buf + INIT_HCA_CACHELINE_SZ_OFFSET) =
-		(ilog2(cache_line_size()) - 4) << 5;
+		((ilog2(cache_line_size()) - 4) << 5) | (1 << 4);
 
 #if defined(__LITTLE_ENDIAN)
 	*(inbox + INIT_HCA_FLAGS_OFFSET / 4) &= ~cpu_to_be32(1 << 1);
-- 
1.8.3.1

^ permalink raw reply related

* [PATCH net-next 01/10] net/mlx4_core: Make each VF manage its own mac table
From: Tariq Toukan @ 2016-11-27 15:51 UTC (permalink / raw)
  To: David S. Miller
  Cc: netdev, Eran Ben Elisha, Erez Shitrit, Gal Pressman, Tariq Toukan
In-Reply-To: <1480261877-19720-1-git-send-email-tariqt@mellanox.com>

From: Erez Shitrit <erezsh@mellanox.com>

Each VF can catch up to the max number of MACs in the
MAC table (128) on the base of "first asks first gets".
The VF should know the total free number of MACs from
the PF.

Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
Signed-off-by: Gal Pressman <galp@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx4/fw.c            | 13 ++++
 drivers/net/ethernet/mellanox/mlx4/main.c          |  4 +-
 drivers/net/ethernet/mellanox/mlx4/mlx4.h          |  1 +
 drivers/net/ethernet/mellanox/mlx4/port.c          | 90 +++++++++++++++++++++-
 .../net/ethernet/mellanox/mlx4/resource_tracker.c  |  6 +-
 include/linux/mlx4/device.h                        |  1 +
 6 files changed, 110 insertions(+), 5 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx4/fw.c b/drivers/net/ethernet/mellanox/mlx4/fw.c
index 84bab9f0732e..b03b473a7b07 100644
--- a/drivers/net/ethernet/mellanox/mlx4/fw.c
+++ b/drivers/net/ethernet/mellanox/mlx4/fw.c
@@ -1415,6 +1415,9 @@ int mlx4_QUERY_PORT_wrapper(struct mlx4_dev *dev, int slave,
 			   MLX4_CMD_NATIVE);
 
 	if (!err && dev->caps.function != slave) {
+		u8 field;
+		u8 vlan;
+
 		def_mac = priv->mfunc.master.vf_oper[slave].vport[vhcr->in_modifier].state.mac;
 		MLX4_PUT(outbox->buf, def_mac, QUERY_PORT_MAC_OFFSET);
 
@@ -1455,6 +1458,16 @@ int mlx4_QUERY_PORT_wrapper(struct mlx4_dev *dev, int slave,
 		short_field = dev->caps.pkey_table_len[vhcr->in_modifier];
 		MLX4_PUT(outbox->buf, short_field,
 			 QUERY_PORT_CUR_MAX_PKEY_OFFSET);
+
+		/* Change the mac table size for the VF */
+		MLX4_GET(field, outbox, QUERY_PORT_MAX_MACVLAN_OFFSET);
+		/* keep the origin vlan of the VF */
+		vlan = field >> 4;
+		/* set the field with the prev vlan and the mac defined quota */
+		field = vlan << 4;
+		field |= ilog2(mlx4_get_port_free_macs(dev,
+						       priv->port->port + 1));
+		MLX4_PUT(outbox->buf, field, QUERY_PORT_MAX_MACVLAN_OFFSET);
 	}
 out:
 	return err;
diff --git a/drivers/net/ethernet/mellanox/mlx4/main.c b/drivers/net/ethernet/mellanox/mlx4/main.c
index 6f4e67bc3538..7cd1fb566f5a 100644
--- a/drivers/net/ethernet/mellanox/mlx4/main.c
+++ b/drivers/net/ethernet/mellanox/mlx4/main.c
@@ -2937,12 +2937,14 @@ static int mlx4_init_port_info(struct mlx4_dev *dev, int port)
 	info->dev = dev;
 	info->port = port;
 	if (!mlx4_is_slave(dev)) {
-		mlx4_init_mac_table(dev, &info->mac_table);
 		mlx4_init_vlan_table(dev, &info->vlan_table);
 		mlx4_init_roce_gid_table(dev, &info->gid_table);
 		info->base_qpn = mlx4_get_base_qpn(dev, port);
 	}
 
+	/* let the vf manage its own mac table state */
+	mlx4_init_mac_table(dev, &info->mac_table);
+
 	sprintf(info->dev_name, "mlx4_port%d", port);
 	info->port_attr.attr.name = info->dev_name;
 	if (mlx4_is_mfunc(dev))
diff --git a/drivers/net/ethernet/mellanox/mlx4/mlx4.h b/drivers/net/ethernet/mellanox/mlx4/mlx4.h
index 88ee7d8a5923..d953d6eb7d9e 100644
--- a/drivers/net/ethernet/mellanox/mlx4/mlx4.h
+++ b/drivers/net/ethernet/mellanox/mlx4/mlx4.h
@@ -741,6 +741,7 @@ struct mlx4_catas_err {
 
 #define MLX4_MAX_MAC_NUM	128
 #define MLX4_MAC_TABLE_SIZE	(MLX4_MAX_MAC_NUM << 3)
+#define MLX4_VF_MAC_QUOTA	2
 
 struct mlx4_mac_table {
 	__be64			entries[MLX4_MAX_MAC_NUM];
diff --git a/drivers/net/ethernet/mellanox/mlx4/port.c b/drivers/net/ethernet/mellanox/mlx4/port.c
index b656dd5772e5..86cb58690845 100644
--- a/drivers/net/ethernet/mellanox/mlx4/port.c
+++ b/drivers/net/ethernet/mellanox/mlx4/port.c
@@ -39,6 +39,7 @@
 
 #include "mlx4.h"
 #include "mlx4_stats.h"
+#include "fw.h"
 
 #define MLX4_MAC_VALID		(1ull << 63)
 
@@ -54,6 +55,34 @@
 #define MLX4_IGNORE_FCS_MASK			0x1
 #define MLX4_TC_MAX_NUMBER			8
 
+static void mlx4_inc_port_macs(struct mlx4_dev *mdev, int port)
+{
+	struct mlx4_port_info *info = &mlx4_priv(mdev)->port[port];
+
+	mutex_lock(&info->mac_table.mutex);
+	info->mac_table.total++;
+	mutex_unlock(&info->mac_table.mutex);
+	mlx4_info(mdev, "%s added mac for port: %d, now: %d\n",
+		  __func__, port, info->mac_table.total);
+}
+
+static void mlx4_dec_port_macs(struct mlx4_dev *mdev, int port)
+{
+	struct mlx4_port_info *info = &mlx4_priv(mdev)->port[port];
+
+	if (!info->mac_table.total) {
+		mlx4_warn(mdev, "No current macs for port: %d\n", port);
+		return;
+	}
+
+	mutex_lock(&info->mac_table.mutex);
+	info->mac_table.total--;
+	mutex_unlock(&info->mac_table.mutex);
+
+	mlx4_info(mdev, "%s removed mac, port: %d, now: %d\n",
+		  __func__, port, info->mac_table.total);
+}
+
 void mlx4_init_mac_table(struct mlx4_dev *dev, struct mlx4_mac_table *table)
 {
 	int i;
@@ -340,6 +369,8 @@ int mlx4_register_mac(struct mlx4_dev *dev, u8 port, u64 mac)
 	int err = -EINVAL;
 
 	if (mlx4_is_mfunc(dev)) {
+		u32 p_l;
+
 		if (!(dev->flags & MLX4_FLAG_OLD_REG_MAC)) {
 			err = mlx4_cmd_imm(dev, mac, &out_param,
 					   ((u32) port) << 8 | (u32) RES_MAC,
@@ -358,8 +389,13 @@ int mlx4_register_mac(struct mlx4_dev *dev, u8 port, u64 mac)
 		if (err)
 			return err;
 
-		return get_param_l(&out_param);
+		p_l = get_param_l(&out_param);
+		/* update vf table, the master updated via __register_mac */
+		if (p_l && mlx4_is_slave(dev))
+			mlx4_inc_port_macs(dev, port);
+		return p_l;
 	}
+
 	return __mlx4_register_mac(dev, port, mac);
 }
 EXPORT_SYMBOL_GPL(mlx4_register_mac);
@@ -459,6 +495,11 @@ void mlx4_unregister_mac(struct mlx4_dev *dev, u8 port, u64 mac)
 					    RES_OP_RESERVE_AND_MAP, MLX4_CMD_FREE_RES,
 					    MLX4_CMD_TIME_CLASS_A, MLX4_CMD_WRAPPED);
 		}
+
+		/* update vf mac table */
+		if (mlx4_is_slave(dev))
+			mlx4_dec_port_macs(dev, port);
+
 		return;
 	}
 	__mlx4_unregister_mac(dev, port, mac);
@@ -2016,3 +2057,50 @@ int mlx4_max_tc(struct mlx4_dev *dev)
 	return num_tc;
 }
 EXPORT_SYMBOL(mlx4_max_tc);
+
+static int mlx4_get_port_reserved_mac_num(struct mlx4_dev *mdev, int port)
+{
+	struct mlx4_priv *priv = mlx4_priv(mdev);
+	struct resource_allocator *res_alloc;
+	int reserved;
+
+	if (mlx4_is_slave(mdev))
+		return 0;
+
+	res_alloc = &priv->mfunc.master.res_tracker.res_alloc[RES_MAC];
+
+	reserved = (port > 0) ? res_alloc->res_port_rsvd[port - 1] :
+		res_alloc->res_reserved;
+
+	return reserved;
+}
+
+static int mlx4_get_port_max_macs(struct mlx4_dev *mdev, int port)
+{
+	struct mlx4_port_info *info = &mlx4_priv(mdev)->port[port];
+
+	/* The maximum value should considers the reserved macs for the vfs */
+	return info->mac_table.max - mlx4_get_port_reserved_mac_num(mdev, port);
+}
+
+static int mlx4_get_port_total_macs(struct mlx4_dev *mdev, int port)
+{
+	struct mlx4_port_info *info = &mlx4_priv(mdev)->port[port];
+
+	return info->mac_table.total;
+}
+
+int mlx4_get_port_free_macs(struct mlx4_dev *mdev, int port)
+{
+	/* slave will get the free macs (log2) from its master */
+	if (mlx4_is_slave(mdev)) {
+		struct mlx4_port_cap port_cap;
+
+		mlx4_QUERY_PORT(mdev, port, &port_cap);
+		return (1 << port_cap.log_max_macs);
+	}
+
+	return (mlx4_get_port_max_macs(mdev, port) -
+		mlx4_get_port_total_macs(mdev, port));
+}
+EXPORT_SYMBOL(mlx4_get_port_free_macs);
diff --git a/drivers/net/ethernet/mellanox/mlx4/resource_tracker.c b/drivers/net/ethernet/mellanox/mlx4/resource_tracker.c
index c548beaaf910..ba7b70630d5d 100644
--- a/drivers/net/ethernet/mellanox/mlx4/resource_tracker.c
+++ b/drivers/net/ethernet/mellanox/mlx4/resource_tracker.c
@@ -576,14 +576,14 @@ int mlx4_init_resource_tracker(struct mlx4_dev *dev)
 					}
 					res_alloc->quota[t] =
 						MLX4_MAX_MAC_NUM -
-						2 * max_vfs_pport;
-					res_alloc->guaranteed[t] = 2;
+						MLX4_VF_MAC_QUOTA * max_vfs_pport;
+					res_alloc->guaranteed[t] = MLX4_VF_MAC_QUOTA;
 					for (j = 0; j < MLX4_MAX_PORTS; j++)
 						res_alloc->res_port_free[j] =
 							MLX4_MAX_MAC_NUM;
 				} else {
 					res_alloc->quota[t] = MLX4_MAX_MAC_NUM;
-					res_alloc->guaranteed[t] = 2;
+					res_alloc->guaranteed[t] = MLX4_VF_MAC_QUOTA;
 				}
 				break;
 			case RES_VLAN:
diff --git a/include/linux/mlx4/device.h b/include/linux/mlx4/device.h
index 3be7abd6e722..4220fe8fe094 100644
--- a/include/linux/mlx4/device.h
+++ b/include/linux/mlx4/device.h
@@ -1493,6 +1493,7 @@ struct mlx4_slaves_pport mlx4_phys_to_slaves_pport_actv(
 int mlx4_get_base_gid_ix(struct mlx4_dev *dev, int slave, int port);
 
 int mlx4_config_vxlan_port(struct mlx4_dev *dev, __be16 udp_port);
+int mlx4_get_port_free_macs(struct mlx4_dev *mdev, int port);
 int mlx4_disable_rx_port_check(struct mlx4_dev *dev, bool dis);
 int mlx4_config_roce_v2_port(struct mlx4_dev *dev, u16 udp_port);
 int mlx4_virt2phy_port_map(struct mlx4_dev *dev, u32 port1, u32 port2);
-- 
1.8.3.1

^ permalink raw reply related

* [PATCH net-next 07/10] net/mlx4_core: Add resource alloc/dealloc debugging
From: Tariq Toukan @ 2016-11-27 15:51 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev, Eran Ben Elisha, Matan Barak, Tariq Toukan
In-Reply-To: <1480261877-19720-1-git-send-email-tariqt@mellanox.com>

From: Matan Barak <matanb@mellanox.com>

In order to aid debugging of functions that take a resource but
don't put it, add the last function name that successfully grabbed
this resource.

Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
---
 .../net/ethernet/mellanox/mlx4/resource_tracker.c  | 49 ++++++++++++++++++++--
 1 file changed, 45 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx4/resource_tracker.c b/drivers/net/ethernet/mellanox/mlx4/resource_tracker.c
index 05ba270c4862..f59efe59ce58 100644
--- a/drivers/net/ethernet/mellanox/mlx4/resource_tracker.c
+++ b/drivers/net/ethernet/mellanox/mlx4/resource_tracker.c
@@ -77,6 +77,7 @@ struct res_common {
 	int			from_state;
 	int			to_state;
 	int			removing;
+	const char		*func_name;
 };
 
 enum {
@@ -837,6 +838,36 @@ static int mpt_mask(struct mlx4_dev *dev)
 	return dev->caps.num_mpts - 1;
 }
 
+static const char *mlx4_resource_type_to_str(enum mlx4_resource t)
+{
+	switch (t) {
+	case RES_QP:
+		return "QP";
+	case RES_CQ:
+		return "CQ";
+	case RES_SRQ:
+		return "SRQ";
+	case RES_XRCD:
+		return "XRCD";
+	case RES_MPT:
+		return "MPT";
+	case RES_MTT:
+		return "MTT";
+	case RES_MAC:
+		return "MAC";
+	case RES_VLAN:
+		return "VLAN";
+	case RES_COUNTER:
+		return "COUNTER";
+	case RES_FS_RULE:
+		return "FS_RULE";
+	case RES_EQ:
+		return "EQ";
+	default:
+		return "INVALID RESOURCE";
+	}
+}
+
 static void *find_res(struct mlx4_dev *dev, u64 res_id,
 		      enum mlx4_resource type)
 {
@@ -846,9 +877,9 @@ static void *find_res(struct mlx4_dev *dev, u64 res_id,
 				  res_id);
 }
 
-static int get_res(struct mlx4_dev *dev, int slave, u64 res_id,
-		   enum mlx4_resource type,
-		   void *res)
+static int _get_res(struct mlx4_dev *dev, int slave, u64 res_id,
+		    enum mlx4_resource type,
+		    void *res, const char *func_name)
 {
 	struct res_common *r;
 	int err = 0;
@@ -861,6 +892,10 @@ static int get_res(struct mlx4_dev *dev, int slave, u64 res_id,
 	}
 
 	if (r->state == RES_ANY_BUSY) {
+		mlx4_warn(dev,
+			  "%s(%d) trying to get resource %llx of type %s, but it's already taken by %s\n",
+			  func_name, slave, res_id, mlx4_resource_type_to_str(type),
+			  r->func_name);
 		err = -EBUSY;
 		goto exit;
 	}
@@ -872,6 +907,7 @@ static int get_res(struct mlx4_dev *dev, int slave, u64 res_id,
 
 	r->from_state = r->state;
 	r->state = RES_ANY_BUSY;
+	r->func_name = func_name;
 
 	if (res)
 		*((struct res_common **)res) = r;
@@ -881,6 +917,9 @@ static int get_res(struct mlx4_dev *dev, int slave, u64 res_id,
 	return err;
 }
 
+#define get_res(dev, slave, res_id, type, res) \
+	_get_res((dev), (slave), (res_id), (type), (res), __func__)
+
 int mlx4_get_slave_from_resource_id(struct mlx4_dev *dev,
 				    enum mlx4_resource type,
 				    u64 res_id, int *slave)
@@ -911,8 +950,10 @@ static void put_res(struct mlx4_dev *dev, int slave, u64 res_id,
 
 	spin_lock_irq(mlx4_tlock(dev));
 	r = find_res(dev, res_id, type);
-	if (r)
+	if (r) {
 		r->state = r->from_state;
+		r->func_name = "";
+	}
 	spin_unlock_irq(mlx4_tlock(dev));
 }
 
-- 
1.8.3.1

^ permalink raw reply related

* [PATCH net-next 06/10] net/mlx4_core: Dynamically allocate structs at mlx4_slave_cap
From: Tariq Toukan @ 2016-11-27 15:51 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev, Eran Ben Elisha, Tal Alon, Tariq Toukan
In-Reply-To: <1480261877-19720-1-git-send-email-tariqt@mellanox.com>

From: Eran Ben Elisha <eranbe@mellanox.com>

In order to avoid temporary large structs on the stack,
allocate them dynamically.

Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Tal Alon <talal@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx4/main.c | 244 +++++++++++++++++-------------
 1 file changed, 142 insertions(+), 102 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx4/main.c b/drivers/net/ethernet/mellanox/mlx4/main.c
index 4a9497e9778d..65502df9fd96 100644
--- a/drivers/net/ethernet/mellanox/mlx4/main.c
+++ b/drivers/net/ethernet/mellanox/mlx4/main.c
@@ -799,40 +799,117 @@ static void slave_adjust_steering_mode(struct mlx4_dev *dev,
 		 mlx4_steering_mode_str(dev->caps.steering_mode));
 }
 
+static void mlx4_slave_destroy_special_qp_cap(struct mlx4_dev *dev)
+{
+	kfree(dev->caps.qp0_qkey);
+	kfree(dev->caps.qp0_tunnel);
+	kfree(dev->caps.qp0_proxy);
+	kfree(dev->caps.qp1_tunnel);
+	kfree(dev->caps.qp1_proxy);
+	dev->caps.qp0_qkey = NULL;
+	dev->caps.qp0_tunnel = NULL;
+	dev->caps.qp0_proxy = NULL;
+	dev->caps.qp1_tunnel = NULL;
+	dev->caps.qp1_proxy = NULL;
+}
+
+static int mlx4_slave_special_qp_cap(struct mlx4_dev *dev)
+{
+	struct mlx4_func_cap *func_cap = NULL;
+	int i, err = 0;
+
+	func_cap = kzalloc(sizeof(*func_cap), GFP_KERNEL);
+	dev->caps.qp0_qkey = kcalloc(dev->caps.num_ports,
+				     sizeof(u32), GFP_KERNEL);
+	dev->caps.qp0_tunnel = kcalloc(dev->caps.num_ports,
+				       sizeof(u32), GFP_KERNEL);
+	dev->caps.qp0_proxy = kcalloc(dev->caps.num_ports,
+				      sizeof(u32), GFP_KERNEL);
+	dev->caps.qp1_tunnel = kcalloc(dev->caps.num_ports,
+				       sizeof(u32), GFP_KERNEL);
+	dev->caps.qp1_proxy = kcalloc(dev->caps.num_ports,
+				      sizeof(u32), GFP_KERNEL);
+
+	if (!dev->caps.qp0_tunnel || !dev->caps.qp0_proxy ||
+	    !dev->caps.qp1_tunnel || !dev->caps.qp1_proxy ||
+	    !dev->caps.qp0_qkey || !func_cap) {
+		mlx4_err(dev, "Failed to allocate memory for special qps cap\n");
+		err = -ENOMEM;
+		goto err_mem;
+	}
+
+	for (i = 1; i <= dev->caps.num_ports; ++i) {
+		err = mlx4_QUERY_FUNC_CAP(dev, i, func_cap);
+		if (err) {
+			mlx4_err(dev, "QUERY_FUNC_CAP port command failed for port %d, aborting (%d)\n",
+				 i, err);
+			goto err_mem;
+		}
+		dev->caps.qp0_qkey[i - 1] = func_cap->qp0_qkey;
+		dev->caps.qp0_tunnel[i - 1] = func_cap->qp0_tunnel_qpn;
+		dev->caps.qp0_proxy[i - 1] = func_cap->qp0_proxy_qpn;
+		dev->caps.qp1_tunnel[i - 1] = func_cap->qp1_tunnel_qpn;
+		dev->caps.qp1_proxy[i - 1] = func_cap->qp1_proxy_qpn;
+		dev->caps.port_mask[i] = dev->caps.port_type[i];
+		dev->caps.phys_port_id[i] = func_cap->phys_port_id;
+		err = mlx4_get_slave_pkey_gid_tbl_len(dev, i,
+				    &dev->caps.gid_table_len[i],
+				    &dev->caps.pkey_table_len[i]);
+		if (err) {
+			mlx4_err(dev, "QUERY_PORT command failed for port %d, aborting (%d)\n",
+				 i, err);
+			goto err_mem;
+		}
+	}
+
+err_mem:
+	if (err)
+		mlx4_slave_destroy_special_qp_cap(dev);
+	kfree(func_cap);
+	return err;
+}
+
 static int mlx4_slave_cap(struct mlx4_dev *dev)
 {
 	int			   err;
 	u32			   page_size;
-	struct mlx4_dev_cap	   dev_cap;
-	struct mlx4_func_cap	   func_cap;
-	struct mlx4_init_hca_param hca_param;
-	u8			   i;
+	struct mlx4_dev_cap	   *dev_cap = NULL;
+	struct mlx4_func_cap	   *func_cap = NULL;
+	struct mlx4_init_hca_param *hca_param = NULL;
+
+	hca_param = kzalloc(sizeof(*hca_param), GFP_KERNEL);
+	func_cap = kzalloc(sizeof(*func_cap), GFP_KERNEL);
+	dev_cap = kzalloc(sizeof(*dev_cap), GFP_KERNEL);
+	if (!hca_param || !func_cap || !dev_cap) {
+		mlx4_err(dev, "Failed to allocate memory for slave_cap\n");
+		err = -ENOMEM;
+		goto free_mem;
+	}
 
-	memset(&hca_param, 0, sizeof(hca_param));
-	err = mlx4_QUERY_HCA(dev, &hca_param);
+	err = mlx4_QUERY_HCA(dev, hca_param);
 	if (err) {
 		mlx4_err(dev, "QUERY_HCA command failed, aborting\n");
-		return err;
+		goto free_mem;
 	}
 
 	/* fail if the hca has an unknown global capability
 	 * at this time global_caps should be always zeroed
 	 */
-	if (hca_param.global_caps) {
+	if (hca_param->global_caps) {
 		mlx4_err(dev, "Unknown hca global capabilities\n");
-		return -EINVAL;
+		err = -EINVAL;
+		goto free_mem;
 	}
 
-	mlx4_log_num_mgm_entry_size = hca_param.log_mc_entry_sz;
+	mlx4_log_num_mgm_entry_size = hca_param->log_mc_entry_sz;
 
-	dev->caps.hca_core_clock = hca_param.hca_core_clock;
+	dev->caps.hca_core_clock = hca_param->hca_core_clock;
 
-	memset(&dev_cap, 0, sizeof(dev_cap));
-	dev->caps.max_qp_dest_rdma = 1 << hca_param.log_rd_per_qp;
-	err = mlx4_dev_cap(dev, &dev_cap);
+	dev->caps.max_qp_dest_rdma = 1 << hca_param->log_rd_per_qp;
+	err = mlx4_dev_cap(dev, dev_cap);
 	if (err) {
 		mlx4_err(dev, "QUERY_DEV_CAP command failed, aborting\n");
-		return err;
+		goto free_mem;
 	}
 
 	err = mlx4_QUERY_FW(dev);
@@ -844,21 +921,23 @@ static int mlx4_slave_cap(struct mlx4_dev *dev)
 	if (page_size > PAGE_SIZE) {
 		mlx4_err(dev, "HCA minimum page size of %d bigger than kernel PAGE_SIZE of %ld, aborting\n",
 			 page_size, PAGE_SIZE);
-		return -ENODEV;
+		err = -ENODEV;
+		goto free_mem;
 	}
 
 	/* Set uar_page_shift for VF */
-	dev->uar_page_shift = hca_param.uar_page_sz + 12;
+	dev->uar_page_shift = hca_param->uar_page_sz + 12;
 
 	/* Make sure the master uar page size is valid */
 	if (dev->uar_page_shift > PAGE_SHIFT) {
 		mlx4_err(dev,
 			 "Invalid configuration: uar page size is larger than system page size\n");
-		return  -ENODEV;
+		err = -ENODEV;
+		goto free_mem;
 	}
 
 	/* Set reserved_uars based on the uar_page_shift */
-	mlx4_set_num_reserved_uars(dev, &dev_cap);
+	mlx4_set_num_reserved_uars(dev, dev_cap);
 
 	/* Although uar page size in FW differs from system page size,
 	 * upper software layers (mlx4_ib, mlx4_en and part of mlx4_core)
@@ -866,34 +945,35 @@ static int mlx4_slave_cap(struct mlx4_dev *dev)
 	 */
 	dev->caps.uar_page_size = PAGE_SIZE;
 
-	memset(&func_cap, 0, sizeof(func_cap));
-	err = mlx4_QUERY_FUNC_CAP(dev, 0, &func_cap);
+	err = mlx4_QUERY_FUNC_CAP(dev, 0, func_cap);
 	if (err) {
 		mlx4_err(dev, "QUERY_FUNC_CAP general command failed, aborting (%d)\n",
 			 err);
-		return err;
+		goto free_mem;
 	}
 
-	if ((func_cap.pf_context_behaviour | PF_CONTEXT_BEHAVIOUR_MASK) !=
+	if ((func_cap->pf_context_behaviour | PF_CONTEXT_BEHAVIOUR_MASK) !=
 	    PF_CONTEXT_BEHAVIOUR_MASK) {
 		mlx4_err(dev, "Unknown pf context behaviour %x known flags %x\n",
-			 func_cap.pf_context_behaviour, PF_CONTEXT_BEHAVIOUR_MASK);
-		return -EINVAL;
-	}
-
-	dev->caps.num_ports		= func_cap.num_ports;
-	dev->quotas.qp			= func_cap.qp_quota;
-	dev->quotas.srq			= func_cap.srq_quota;
-	dev->quotas.cq			= func_cap.cq_quota;
-	dev->quotas.mpt			= func_cap.mpt_quota;
-	dev->quotas.mtt			= func_cap.mtt_quota;
-	dev->caps.num_qps		= 1 << hca_param.log_num_qps;
-	dev->caps.num_srqs		= 1 << hca_param.log_num_srqs;
-	dev->caps.num_cqs		= 1 << hca_param.log_num_cqs;
-	dev->caps.num_mpts		= 1 << hca_param.log_mpt_sz;
-	dev->caps.num_eqs		= func_cap.max_eq;
-	dev->caps.reserved_eqs		= func_cap.reserved_eq;
-	dev->caps.reserved_lkey		= func_cap.reserved_lkey;
+			 func_cap->pf_context_behaviour,
+			 PF_CONTEXT_BEHAVIOUR_MASK);
+		err = -EINVAL;
+		goto free_mem;
+	}
+
+	dev->caps.num_ports		= func_cap->num_ports;
+	dev->quotas.qp			= func_cap->qp_quota;
+	dev->quotas.srq			= func_cap->srq_quota;
+	dev->quotas.cq			= func_cap->cq_quota;
+	dev->quotas.mpt			= func_cap->mpt_quota;
+	dev->quotas.mtt			= func_cap->mtt_quota;
+	dev->caps.num_qps		= 1 << hca_param->log_num_qps;
+	dev->caps.num_srqs		= 1 << hca_param->log_num_srqs;
+	dev->caps.num_cqs		= 1 << hca_param->log_num_cqs;
+	dev->caps.num_mpts		= 1 << hca_param->log_mpt_sz;
+	dev->caps.num_eqs		= func_cap->max_eq;
+	dev->caps.reserved_eqs		= func_cap->reserved_eq;
+	dev->caps.reserved_lkey		= func_cap->reserved_lkey;
 	dev->caps.num_pds               = MLX4_NUM_PDS;
 	dev->caps.num_mgms              = 0;
 	dev->caps.num_amgms             = 0;
@@ -906,38 +986,10 @@ static int mlx4_slave_cap(struct mlx4_dev *dev)
 
 	mlx4_replace_zero_macs(dev);
 
-	dev->caps.qp0_qkey = kcalloc(dev->caps.num_ports, sizeof(u32), GFP_KERNEL);
-	dev->caps.qp0_tunnel = kcalloc(dev->caps.num_ports, sizeof (u32), GFP_KERNEL);
-	dev->caps.qp0_proxy = kcalloc(dev->caps.num_ports, sizeof (u32), GFP_KERNEL);
-	dev->caps.qp1_tunnel = kcalloc(dev->caps.num_ports, sizeof (u32), GFP_KERNEL);
-	dev->caps.qp1_proxy = kcalloc(dev->caps.num_ports, sizeof (u32), GFP_KERNEL);
-
-	if (!dev->caps.qp0_tunnel || !dev->caps.qp0_proxy ||
-	    !dev->caps.qp1_tunnel || !dev->caps.qp1_proxy ||
-	    !dev->caps.qp0_qkey) {
-		err = -ENOMEM;
-		goto err_mem;
-	}
-
-	for (i = 1; i <= dev->caps.num_ports; ++i) {
-		err = mlx4_QUERY_FUNC_CAP(dev, i, &func_cap);
-		if (err) {
-			mlx4_err(dev, "QUERY_FUNC_CAP port command failed for port %d, aborting (%d)\n",
-				 i, err);
-			goto err_mem;
-		}
-		dev->caps.qp0_qkey[i - 1] = func_cap.qp0_qkey;
-		dev->caps.qp0_tunnel[i - 1] = func_cap.qp0_tunnel_qpn;
-		dev->caps.qp0_proxy[i - 1] = func_cap.qp0_proxy_qpn;
-		dev->caps.qp1_tunnel[i - 1] = func_cap.qp1_tunnel_qpn;
-		dev->caps.qp1_proxy[i - 1] = func_cap.qp1_proxy_qpn;
-		dev->caps.port_mask[i] = dev->caps.port_type[i];
-		dev->caps.phys_port_id[i] = func_cap.phys_port_id;
-		err = mlx4_get_slave_pkey_gid_tbl_len(dev, i,
-						      &dev->caps.gid_table_len[i],
-						      &dev->caps.pkey_table_len[i]);
-		if (err)
-			goto err_mem;
+	err = mlx4_slave_special_qp_cap(dev);
+	if (err) {
+		mlx4_err(dev, "Set special QP caps failed. aborting\n");
+		goto free_mem;
 	}
 
 	if (dev->caps.uar_page_size * (dev->caps.num_uars -
@@ -952,7 +1004,7 @@ static int mlx4_slave_cap(struct mlx4_dev *dev)
 		goto err_mem;
 	}
 
-	if (hca_param.dev_cap_enabled & MLX4_DEV_CAP_64B_EQE_ENABLED) {
+	if (hca_param->dev_cap_enabled & MLX4_DEV_CAP_64B_EQE_ENABLED) {
 		dev->caps.eqe_size   = 64;
 		dev->caps.eqe_factor = 1;
 	} else {
@@ -960,20 +1012,20 @@ static int mlx4_slave_cap(struct mlx4_dev *dev)
 		dev->caps.eqe_factor = 0;
 	}
 
-	if (hca_param.dev_cap_enabled & MLX4_DEV_CAP_64B_CQE_ENABLED) {
+	if (hca_param->dev_cap_enabled & MLX4_DEV_CAP_64B_CQE_ENABLED) {
 		dev->caps.cqe_size   = 64;
 		dev->caps.userspace_caps |= MLX4_USER_DEV_CAP_LARGE_CQE;
 	} else {
 		dev->caps.cqe_size   = 32;
 	}
 
-	if (hca_param.dev_cap_enabled & MLX4_DEV_CAP_EQE_STRIDE_ENABLED) {
-		dev->caps.eqe_size = hca_param.eqe_size;
+	if (hca_param->dev_cap_enabled & MLX4_DEV_CAP_EQE_STRIDE_ENABLED) {
+		dev->caps.eqe_size = hca_param->eqe_size;
 		dev->caps.eqe_factor = 0;
 	}
 
-	if (hca_param.dev_cap_enabled & MLX4_DEV_CAP_CQE_STRIDE_ENABLED) {
-		dev->caps.cqe_size = hca_param.cqe_size;
+	if (hca_param->dev_cap_enabled & MLX4_DEV_CAP_CQE_STRIDE_ENABLED) {
+		dev->caps.cqe_size = hca_param->cqe_size;
 		/* User still need to know when CQE > 32B */
 		dev->caps.userspace_caps |= MLX4_USER_DEV_CAP_LARGE_CQE;
 	}
@@ -981,31 +1033,24 @@ static int mlx4_slave_cap(struct mlx4_dev *dev)
 	dev->caps.flags2 &= ~MLX4_DEV_CAP_FLAG2_TS;
 	mlx4_warn(dev, "Timestamping is not supported in slave mode\n");
 
-	slave_adjust_steering_mode(dev, &dev_cap, &hca_param);
+	slave_adjust_steering_mode(dev, dev_cap, hca_param);
 	mlx4_dbg(dev, "RSS support for IP fragments is %s\n",
-		 hca_param.rss_ip_frags ? "on" : "off");
+		 hca_param->rss_ip_frags ? "on" : "off");
 
-	if (func_cap.extra_flags & MLX4_QUERY_FUNC_FLAGS_BF_RES_QP &&
+	if (func_cap->extra_flags & MLX4_QUERY_FUNC_FLAGS_BF_RES_QP &&
 	    dev->caps.bf_reg_size)
 		dev->caps.alloc_res_qp_mask |= MLX4_RESERVE_ETH_BF_QP;
 
-	if (func_cap.extra_flags & MLX4_QUERY_FUNC_FLAGS_A0_RES_QP)
+	if (func_cap->extra_flags & MLX4_QUERY_FUNC_FLAGS_A0_RES_QP)
 		dev->caps.alloc_res_qp_mask |= MLX4_RESERVE_A0_QP;
 
-	return 0;
-
 err_mem:
-	kfree(dev->caps.qp0_qkey);
-	kfree(dev->caps.qp0_tunnel);
-	kfree(dev->caps.qp0_proxy);
-	kfree(dev->caps.qp1_tunnel);
-	kfree(dev->caps.qp1_proxy);
-	dev->caps.qp0_qkey = NULL;
-	dev->caps.qp0_tunnel = NULL;
-	dev->caps.qp0_proxy = NULL;
-	dev->caps.qp1_tunnel = NULL;
-	dev->caps.qp1_proxy = NULL;
-
+	if (err)
+		mlx4_slave_destroy_special_qp_cap(dev);
+free_mem:
+	kfree(hca_param);
+	kfree(func_cap);
+	kfree(dev_cap);
 	return err;
 }
 
@@ -2381,13 +2426,8 @@ static int mlx4_init_hca(struct mlx4_dev *dev)
 	unmap_internal_clock(dev);
 	unmap_bf_area(dev);
 
-	if (mlx4_is_slave(dev)) {
-		kfree(dev->caps.qp0_qkey);
-		kfree(dev->caps.qp0_tunnel);
-		kfree(dev->caps.qp0_proxy);
-		kfree(dev->caps.qp1_tunnel);
-		kfree(dev->caps.qp1_proxy);
-	}
+	if (mlx4_is_slave(dev))
+		mlx4_slave_destroy_special_qp_cap(dev);
 
 err_close:
 	if (mlx4_is_slave(dev))
-- 
1.8.3.1

^ permalink raw reply related

* [PATCH net-next 08/10] net/mlx4_core: Get num_tc using netdev_get_num_tc
From: Tariq Toukan @ 2016-11-27 15:51 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev, Eran Ben Elisha, Alaa Hleihel, Tariq Toukan
In-Reply-To: <1480261877-19720-1-git-send-email-tariqt@mellanox.com>

From: Alaa Hleihel <alaa@mellanox.com>

Avoid reading num_tc directly from struct net_device, but use
the helper function netdev_get_num_tc.

Fixes: bc6a4744b827 ("net/mlx4_en: num cores tx rings for every UP")
Fixes: f5b6345ba8da ("net/mlx4_en: User prio mapping gets corrupted when changing number of channels")
Signed-off-by: Alaa Hleihel <alaa@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx4/en_ethtool.c | 2 +-
 drivers/net/ethernet/mellanox/mlx4/en_tx.c      | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx4/en_ethtool.c b/drivers/net/ethernet/mellanox/mlx4/en_ethtool.c
index 7e6d0425394d..9848302ace84 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_ethtool.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_ethtool.c
@@ -1791,7 +1791,7 @@ static int mlx4_en_set_channels(struct net_device *dev,
 	netif_set_real_num_tx_queues(dev, priv->tx_ring_num[TX]);
 	netif_set_real_num_rx_queues(dev, priv->rx_ring_num);
 
-	if (dev->num_tc)
+	if (netdev_get_num_tc(dev))
 		mlx4_en_setup_tc(dev, MLX4_EN_NUM_UP);
 
 	en_warn(priv, "Using %d TX rings\n", priv->tx_ring_num[TX]);
diff --git a/drivers/net/ethernet/mellanox/mlx4/en_tx.c b/drivers/net/ethernet/mellanox/mlx4/en_tx.c
index 4b597dca5c52..abe1c5994b73 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_tx.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_tx.c
@@ -710,7 +710,7 @@ u16 mlx4_en_select_queue(struct net_device *dev, struct sk_buff *skb,
 	u16 rings_p_up = priv->num_tx_rings_p_up;
 	u8 up = 0;
 
-	if (dev->num_tc)
+	if (netdev_get_num_tc(dev))
 		return skb_tx_hash(dev, skb);
 
 	if (skb_vlan_tag_present(skb))
-- 
1.8.3.1

^ permalink raw reply related

* [PATCH net-next 09/10] net/mlx4: Change number of max MSIXs from 64 to 1024
From: Tariq Toukan @ 2016-11-27 15:51 UTC (permalink / raw)
  To: David S. Miller
  Cc: netdev, Eran Ben Elisha, Matan Barak, Moshe Shemesh, Tariq Toukan
In-Reply-To: <1480261877-19720-1-git-send-email-tariqt@mellanox.com>

From: Matan Barak <matanb@mellanox.com>

Increase the number of max MSIXs in order to achieve
better performance on machines with high number of CPUs.

Fixes: 0b7ca5a928e2 ("mlx4: Changing interrupt scheme")
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
---
 include/linux/mlx4/device.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/linux/mlx4/device.h b/include/linux/mlx4/device.h
index 1dcd6ac3b1f3..44863e3c0981 100644
--- a/include/linux/mlx4/device.h
+++ b/include/linux/mlx4/device.h
@@ -47,7 +47,7 @@
 #define DEFAULT_UAR_PAGE_SHIFT  12
 
 #define MAX_MSIX_P_PORT		17
-#define MAX_MSIX		64
+#define MAX_MSIX		1024
 #define MIN_MSIX_P_PORT		5
 #define MLX4_IS_LEGACY_EQ_MODE(dev_cap) ((dev_cap).num_comp_vectors < \
 					 (dev_cap).num_ports * MIN_MSIX_P_PORT)
-- 
1.8.3.1

^ permalink raw reply related

* [PATCH net-next 03/10] net/mlx4_en: Add new FDB entry only if there is space in the mac table
From: Tariq Toukan @ 2016-11-27 15:51 UTC (permalink / raw)
  To: David S. Miller
  Cc: netdev, Eran Ben Elisha, Erez Shitrit, Gal Pressman, Tariq Toukan
In-Reply-To: <1480261877-19720-1-git-send-email-tariqt@mellanox.com>

From: Erez Shitrit <erezsh@mellanox.com>

Before adding a new mac to the FDB (Forwarding Database),
make sure there is space for it. Each port has 128
macs that are allocated between the hypervisor and the VFs.
If there is no space, return error.

Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
Signed-off-by: Gal Pressman <galp@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx4/en_netdev.c | 6 ++++++
 drivers/net/ethernet/mellanox/mlx4/port.c      | 8 ++++++++
 include/linux/mlx4/device.h                    | 1 +
 3 files changed, 15 insertions(+)

diff --git a/drivers/net/ethernet/mellanox/mlx4/en_netdev.c b/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
index 9018bb1b2e12..60c3b2da8714 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
@@ -1123,6 +1123,12 @@ static void mlx4_en_do_uc_filter(struct mlx4_en_priv *priv,
 			}
 			mac = mlx4_mac_to_u64(ha->addr);
 			memcpy(entry->mac, ha->addr, ETH_ALEN);
+
+			if (!mlx4_is_available_mac(mdev->dev, priv->port)) {
+				mlx4_warn(mdev, "Cannot add mac:%pM, no free macs.\n", &mac);
+				break;
+			}
+
 			err = mlx4_register_mac(mdev->dev, priv->port, mac);
 			if (err < 0) {
 				en_err(priv, "Failed registering MAC %pM on port %d: %d\n",
diff --git a/drivers/net/ethernet/mellanox/mlx4/port.c b/drivers/net/ethernet/mellanox/mlx4/port.c
index 86cb58690845..ccc4670f92b5 100644
--- a/drivers/net/ethernet/mellanox/mlx4/port.c
+++ b/drivers/net/ethernet/mellanox/mlx4/port.c
@@ -2104,3 +2104,11 @@ int mlx4_get_port_free_macs(struct mlx4_dev *mdev, int port)
 		mlx4_get_port_total_macs(mdev, port));
 }
 EXPORT_SYMBOL(mlx4_get_port_free_macs);
+
+bool mlx4_is_available_mac(struct mlx4_dev *mdev, int port)
+{
+	int free_macs = mlx4_get_port_free_macs(mdev, port);
+
+	return free_macs >= MLX4_VF_MAC_QUOTA;
+}
+EXPORT_SYMBOL(mlx4_is_available_mac);
diff --git a/include/linux/mlx4/device.h b/include/linux/mlx4/device.h
index 4220fe8fe094..1dcd6ac3b1f3 100644
--- a/include/linux/mlx4/device.h
+++ b/include/linux/mlx4/device.h
@@ -1494,6 +1494,7 @@ struct mlx4_slaves_pport mlx4_phys_to_slaves_pport_actv(
 
 int mlx4_config_vxlan_port(struct mlx4_dev *dev, __be16 udp_port);
 int mlx4_get_port_free_macs(struct mlx4_dev *mdev, int port);
+bool mlx4_is_available_mac(struct mlx4_dev *mdev, int port);
 int mlx4_disable_rx_port_check(struct mlx4_dev *dev, bool dis);
 int mlx4_config_roce_v2_port(struct mlx4_dev *dev, u16 udp_port);
 int mlx4_virt2phy_port_map(struct mlx4_dev *dev, u32 port1, u32 port2);
-- 
1.8.3.1

^ permalink raw reply related

* [PATCH net-next 05/10] net/mlx4_core: Set EQ affinity hint to local NUMA CPUs
From: Tariq Toukan @ 2016-11-27 15:51 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev, Eran Ben Elisha, Daniel Jurgens, Tariq Toukan
In-Reply-To: <1480261877-19720-1-git-send-email-tariqt@mellanox.com>

From: Daniel Jurgens <danielj@mellanox.com>

Use CPUs on the close NUMA when setting the EQ affinity hints.

Signed-off-by: Daniel Jurgens <danielj@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx4/main.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/mellanox/mlx4/main.c b/drivers/net/ethernet/mellanox/mlx4/main.c
index 41cd05f37b20..4a9497e9778d 100644
--- a/drivers/net/ethernet/mellanox/mlx4/main.c
+++ b/drivers/net/ethernet/mellanox/mlx4/main.c
@@ -2825,7 +2825,8 @@ static int mlx4_init_affinity_hint(struct mlx4_dev *dev, int port, int eqn)
 	if (!zalloc_cpumask_var(&eq->affinity_mask, GFP_KERNEL))
 		return -ENOMEM;
 
-	cpumask_set_cpu(requested_cpu, eq->affinity_mask);
+	cpumask_set_cpu(cpumask_local_spread(requested_cpu, dev->numa_node),
+			eq->affinity_mask);
 
 	return 0;
 }
-- 
1.8.3.1

^ permalink raw reply related

* [PATCH net-next 10/10] net/mlx4_core: Fix racy CQ (Completion Queue) free
From: Tariq Toukan @ 2016-11-27 15:51 UTC (permalink / raw)
  To: David S. Miller
  Cc: netdev, Eran Ben Elisha, Jack Morgenstein, Matan Barak,
	Tariq Toukan
In-Reply-To: <1480261877-19720-1-git-send-email-tariqt@mellanox.com>

From: Jack Morgenstein <jackm@dev.mellanox.co.il>

In function mlx4_cq_completion() and mlx4_cq_event(), the
radix_tree_lookup requires a rcu_read_lock.
This is mandatory: if another core frees the CQ, it could
run the radix_tree_node_rcu_free() call_rcu() callback while
its being used by the radix tree lookup function.

Additionally, in function mlx4_cq_event(), since we are adding
the rcu lock around the radix-tree lookup, we no longer need to take
the spinlock. Also, the synchronize_irq() call for the async event
eliminates the need for incrementing the cq reference count in
mlx4_cq_event().

Other changes:
1. In function mlx4_cq_free(), replace spin_lock_irq with spin_lock:
   we no longer take this spinlock in the interrupt context.
   The spinlock here, therefore, simply protects against different
   threads simultaneously invoking mlx4_cq_free() for different cq's.

2. In function mlx4_cq_free(), we move the radix tree delete to before
   the synchronize_irq() calls. This guarantees that we will not
   access this cq during any subsequent interrupts, and therefore can
   safely free the CQ after the synchronize_irq calls. The rcu_read_lock
   in the interrupt handlers only needs to protect against corrupting the
   radix tree; the interrupt handlers may access the cq outside the
   rcu_read_lock due to the synchronize_irq calls which protect against
   premature freeing of the cq.

3. In function mlx4_cq_event(), we change the mlx_warn message to mlx4_dbg.

4. We leave the cq reference count mechanism in place, because it is
   still needed for the cq completion tasklet mechanism.

Fixes: 6d90aa5cf17b ("net/mlx4_core: Make sure there are no pending async events when freeing CQ")
Fixes: 225c7b1feef1 ("IB/mlx4: Add a driver Mellanox ConnectX InfiniBand adapters")
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx4/cq.c | 38 +++++++++++++++++----------------
 1 file changed, 20 insertions(+), 18 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx4/cq.c b/drivers/net/ethernet/mellanox/mlx4/cq.c
index a849da92f857..6b8635378f1f 100644
--- a/drivers/net/ethernet/mellanox/mlx4/cq.c
+++ b/drivers/net/ethernet/mellanox/mlx4/cq.c
@@ -101,13 +101,19 @@ void mlx4_cq_completion(struct mlx4_dev *dev, u32 cqn)
 {
 	struct mlx4_cq *cq;
 
+	rcu_read_lock();
 	cq = radix_tree_lookup(&mlx4_priv(dev)->cq_table.tree,
 			       cqn & (dev->caps.num_cqs - 1));
+	rcu_read_unlock();
+
 	if (!cq) {
 		mlx4_dbg(dev, "Completion event for bogus CQ %08x\n", cqn);
 		return;
 	}
 
+	/* Acessing the CQ outside of rcu_read_lock is safe, because
+	 * the CQ is freed only after interrupt handling is completed.
+	 */
 	++cq->arm_sn;
 
 	cq->comp(cq);
@@ -118,23 +124,19 @@ void mlx4_cq_event(struct mlx4_dev *dev, u32 cqn, int event_type)
 	struct mlx4_cq_table *cq_table = &mlx4_priv(dev)->cq_table;
 	struct mlx4_cq *cq;
 
-	spin_lock(&cq_table->lock);
-
+	rcu_read_lock();
 	cq = radix_tree_lookup(&cq_table->tree, cqn & (dev->caps.num_cqs - 1));
-	if (cq)
-		atomic_inc(&cq->refcount);
-
-	spin_unlock(&cq_table->lock);
+	rcu_read_unlock();
 
 	if (!cq) {
-		mlx4_warn(dev, "Async event for bogus CQ %08x\n", cqn);
+		mlx4_dbg(dev, "Async event for bogus CQ %08x\n", cqn);
 		return;
 	}
 
+	/* Acessing the CQ outside of rcu_read_lock is safe, because
+	 * the CQ is freed only after interrupt handling is completed.
+	 */
 	cq->event(cq, event_type);
-
-	if (atomic_dec_and_test(&cq->refcount))
-		complete(&cq->free);
 }
 
 static int mlx4_SW2HW_CQ(struct mlx4_dev *dev, struct mlx4_cmd_mailbox *mailbox,
@@ -301,9 +303,9 @@ int mlx4_cq_alloc(struct mlx4_dev *dev, int nent,
 	if (err)
 		return err;
 
-	spin_lock_irq(&cq_table->lock);
+	spin_lock(&cq_table->lock);
 	err = radix_tree_insert(&cq_table->tree, cq->cqn, cq);
-	spin_unlock_irq(&cq_table->lock);
+	spin_unlock(&cq_table->lock);
 	if (err)
 		goto err_icm;
 
@@ -349,9 +351,9 @@ int mlx4_cq_alloc(struct mlx4_dev *dev, int nent,
 	return 0;
 
 err_radix:
-	spin_lock_irq(&cq_table->lock);
+	spin_lock(&cq_table->lock);
 	radix_tree_delete(&cq_table->tree, cq->cqn);
-	spin_unlock_irq(&cq_table->lock);
+	spin_unlock(&cq_table->lock);
 
 err_icm:
 	mlx4_cq_free_icm(dev, cq->cqn);
@@ -370,15 +372,15 @@ void mlx4_cq_free(struct mlx4_dev *dev, struct mlx4_cq *cq)
 	if (err)
 		mlx4_warn(dev, "HW2SW_CQ failed (%d) for CQN %06x\n", err, cq->cqn);
 
+	spin_lock(&cq_table->lock);
+	radix_tree_delete(&cq_table->tree, cq->cqn);
+	spin_unlock(&cq_table->lock);
+
 	synchronize_irq(priv->eq_table.eq[MLX4_CQ_TO_EQ_VECTOR(cq->vector)].irq);
 	if (priv->eq_table.eq[MLX4_CQ_TO_EQ_VECTOR(cq->vector)].irq !=
 	    priv->eq_table.eq[MLX4_EQ_ASYNC].irq)
 		synchronize_irq(priv->eq_table.eq[MLX4_EQ_ASYNC].irq);
 
-	spin_lock_irq(&cq_table->lock);
-	radix_tree_delete(&cq_table->tree, cq->cqn);
-	spin_unlock_irq(&cq_table->lock);
-
 	if (atomic_dec_and_test(&cq->refcount))
 		complete(&cq->free);
 	wait_for_completion(&cq->free);
-- 
1.8.3.1

^ permalink raw reply related

* [PATCH net-next 00/10] mlx4 misc improvements
From: Tariq Toukan @ 2016-11-27 15:51 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev, Eran Ben Elisha, Tariq Toukan

Hi Dave,

This patchset contains several improvements, cleanups, and bug fixes
from the team to the mlx4 Eth and core drivers.

Series generated against net-next commit:
e5f12b3f5ebb Merge branch 'mlxsw-trap-groups-and-policers'

Thanks,
Tariq.

Alaa Hleihel (1):
  net/mlx4_core: Get num_tc using netdev_get_num_tc

Daniel Jurgens (1):
  net/mlx4_core: Set EQ affinity hint to local NUMA CPUs

Eran Ben Elisha (1):
  net/mlx4_core: Dynamically allocate structs at mlx4_slave_cap

Erez Shitrit (2):
  net/mlx4_core: Make each VF manage its own mac table
  net/mlx4_en: Add new FDB entry only if there is space in the mac table

Jack Morgenstein (1):
  net/mlx4_core: Fix racy CQ (Completion Queue) free

Matan Barak (2):
  net/mlx4_core: Add resource alloc/dealloc debugging
  net/mlx4: Change number of max MSIXs from 64 to 1024

Tariq Toukan (1):
  net/mlx4: Replace ENOSYS with better fitting error codes

Yishai Hadas (1):
  net/mlx4_core: Device revision support

 drivers/net/ethernet/mellanox/mlx4/cq.c            |  38 ++--
 drivers/net/ethernet/mellanox/mlx4/en_ethtool.c    |   4 +-
 drivers/net/ethernet/mellanox/mlx4/en_netdev.c     |   6 +
 drivers/net/ethernet/mellanox/mlx4/en_tx.c         |   2 +-
 drivers/net/ethernet/mellanox/mlx4/fw.c            |  17 +-
 drivers/net/ethernet/mellanox/mlx4/main.c          | 253 ++++++++++++---------
 drivers/net/ethernet/mellanox/mlx4/mlx4.h          |   1 +
 drivers/net/ethernet/mellanox/mlx4/port.c          |  98 +++++++-
 .../net/ethernet/mellanox/mlx4/resource_tracker.c  |  57 ++++-
 include/linux/mlx4/device.h                        |   4 +-
 10 files changed, 342 insertions(+), 138 deletions(-)

-- 
1.8.3.1

^ permalink raw reply

* Re: [PATCH iproute2 0/2] Adding help and inline mode control to eswitch
From: Roi Dayan @ 2016-11-27 11:30 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: roid, netdev, Or Gerlitz
In-Reply-To: <1480245663-814-1-git-send-email-roid@mellanox.com>

Hi Stephen,

I forgot to mention that the patches are for net-next branch of iproute2.

Thanks,
Roi


On 27/11/2016 13:21, Roi Dayan wrote:
> Hi,
>
> The first patch is adding missing help message for eswitch.
> The second patch is adding functionality to show/modify eswitch inline mode.
>
> Thanks,
> Roi
>
>
> Roi Dayan (2):
>    devlink: Add usage help for eswitch subcommand
>    devlink: Add option to set and show eswitch inline mode
>
>   devlink/devlink.c       |   89 +++++++++++++++++++++++++++++++++++++++++++++-
>   include/linux/devlink.h |    8 ++++
>   man/man8/devlink-dev.8  |   21 ++++++++++-
>   3 files changed, 115 insertions(+), 3 deletions(-)
>

^ permalink raw reply

* [PATCH net-next V2 06/10] net/mlx5e: Add DCBX control interface
From: Saeed Mahameed @ 2016-11-27 15:02 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev, Huy Nguyen, Saeed Mahameed
In-Reply-To: <1480258932-10302-1-git-send-email-saeedm@mellanox.com>

From: Huy Nguyen <huyn@mellanox.com>

Use setdcbx interface to set the DCBX mode to firmware or os.
If setdcbx is called with mode value of zero, the DCBX mode
is set to firmware.

Signed-off-by: Huy Nguyen <huyn@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en_dcbnl.c | 27 +++++++++++++++++++---
 1 file changed, 24 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_dcbnl.c b/drivers/net/ethernet/mellanox/mlx5/core/en_dcbnl.c
index 1753182..7f6c225 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_dcbnl.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_dcbnl.c
@@ -287,13 +287,34 @@ static int mlx5e_dcbnl_ieee_setpfc(struct net_device *dev,
 
 static u8 mlx5e_dcbnl_getdcbx(struct net_device *dev)
 {
-	return DCB_CAP_DCBX_HOST |
-	       DCB_CAP_DCBX_VER_IEEE |
-	       DCB_CAP_DCBX_VER_CEE;
+	struct mlx5e_priv *priv = netdev_priv(dev);
+	struct mlx5e_dcbx *dcbx = &priv->dcbx;
+	u8 mode = DCB_CAP_DCBX_VER_IEEE | DCB_CAP_DCBX_VER_CEE;
+
+	if (dcbx->mode == MLX5E_DCBX_PARAM_VER_OPER_HOST)
+		mode |= DCB_CAP_DCBX_HOST;
+
+	return mode;
 }
 
 static u8 mlx5e_dcbnl_setdcbx(struct net_device *dev, u8 mode)
 {
+	struct mlx5e_priv *priv = netdev_priv(dev);
+	struct mlx5e_dcbx *dcbx = &priv->dcbx;
+
+	if ((!mode) && MLX5_CAP_GEN(priv->mdev, dcbx)) {
+		if (dcbx->mode == MLX5E_DCBX_PARAM_VER_OPER_AUTO)
+			return 0;
+
+		/* set dcbx to fw controlled */
+		if (!mlx5e_dcbnl_set_dcbx_mode(priv, MLX5E_DCBX_PARAM_VER_OPER_AUTO)) {
+			dcbx->mode = MLX5E_DCBX_PARAM_VER_OPER_AUTO;
+			return 0;
+		}
+
+		return 1;
+	}
+
 	if (mlx5e_dcbnl_switch_to_host_mode(netdev_priv(dev)))
 		return 1;
 
-- 
2.7.4

^ permalink raw reply related

* [PATCH net-next V2 10/10] net/mlx5e: Add CQE compression user control
From: Saeed Mahameed @ 2016-11-27 15:02 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev, Shaker Daibes, Saeed Mahameed
In-Reply-To: <1480258932-10302-1-git-send-email-saeedm@mellanox.com>

From: Shaker Daibes <shakerd@mellanox.com>

The user can now override the automatic driver decision using the
rx_cqe_compress flag, which is the preference for CQE compression.
The flag is initialized with the automatic driver decision.

Signed-off-by: Shaker Daibes <shakerd@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en.h       |  5 +--
 drivers/net/ethernet/mellanox/mlx5/core/en_clock.c |  3 +-
 .../net/ethernet/mellanox/mlx5/core/en_ethtool.c   | 39 ++++++++++++++++++++--
 drivers/net/ethernet/mellanox/mlx5/core/en_main.c  | 13 ++++----
 drivers/net/ethernet/mellanox/mlx5/core/en_rx.c    |  4 +--
 5 files changed, 51 insertions(+), 13 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h b/drivers/net/ethernet/mellanox/mlx5/core/en.h
index 84ac78f..442dbc3 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h
@@ -171,10 +171,12 @@ extern const char mlx5e_self_tests[][ETH_GSTRING_LEN];
 
 static const char mlx5e_priv_flags[][ETH_GSTRING_LEN] = {
 	"rx_cqe_moder",
+	"rx_cqe_compress",
 };
 
 enum mlx5e_priv_flag {
 	MLX5E_PFLAG_RX_CQE_BASED_MODER = (1 << 0),
+	MLX5E_PFLAG_RX_CQE_COMPRESS = (1 << 1),
 };
 
 #define MLX5E_SET_PFLAG(priv, pflag, enable)			\
@@ -205,8 +207,7 @@ struct mlx5e_params {
 	u16 num_channels;
 	u8  num_tc;
 	u8  rx_cq_period_mode;
-	bool rx_cqe_compress_admin;
-	bool rx_cqe_compress;
+	bool rx_cqe_compress_def;
 	struct mlx5e_cq_moder rx_cq_moderation;
 	struct mlx5e_cq_moder tx_cq_moderation;
 	u16 min_rx_wqes;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_clock.c b/drivers/net/ethernet/mellanox/mlx5/core/en_clock.c
index 13dc388..2cd8e56 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_clock.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_clock.c
@@ -94,7 +94,7 @@ int mlx5e_hwstamp_set(struct net_device *dev, struct ifreq *ifr)
 	switch (config.rx_filter) {
 	case HWTSTAMP_FILTER_NONE:
 		/* Reset CQE compression to Admin default */
-		mlx5e_modify_rx_cqe_compression(priv, priv->params.rx_cqe_compress_admin);
+		mlx5e_modify_rx_cqe_compression(priv, priv->params.rx_cqe_compress_def);
 		break;
 	case HWTSTAMP_FILTER_ALL:
 	case HWTSTAMP_FILTER_SOME:
@@ -111,6 +111,7 @@ int mlx5e_hwstamp_set(struct net_device *dev, struct ifreq *ifr)
 	case HWTSTAMP_FILTER_PTP_V2_SYNC:
 	case HWTSTAMP_FILTER_PTP_V2_DELAY_REQ:
 		/* Disable CQE compression */
+		netdev_warn(dev, "Disabling cqe compression");
 		mlx5e_modify_rx_cqe_compression(priv, false);
 		config.rx_filter = HWTSTAMP_FILTER_ALL;
 		break;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c b/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c
index d2bdccb..aa963d7 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c
@@ -1481,6 +1481,35 @@ static int set_pflag_rx_cqe_based_moder(struct net_device *netdev, bool enable)
 	return err;
 }
 
+static int set_pflag_rx_cqe_compress(struct net_device *netdev,
+				     bool enable)
+{
+	struct mlx5e_priv *priv = netdev_priv(netdev);
+	struct mlx5_core_dev *mdev = priv->mdev;
+	int err = 0;
+	bool reset;
+
+	if (!MLX5_CAP_GEN(mdev, cqe_compression))
+		return -ENOTSUPP;
+
+	if (enable && priv->tstamp.hwtstamp_config.rx_filter != HWTSTAMP_FILTER_NONE) {
+		netdev_err(netdev, "Can't enable cqe compression while timestamping is enabled.\n");
+		return -EINVAL;
+	}
+
+	reset = test_bit(MLX5E_STATE_OPENED, &priv->state);
+
+	if (reset)
+		mlx5e_close_locked(netdev);
+
+	MLX5E_SET_PFLAG(priv, MLX5E_PFLAG_RX_CQE_COMPRESS, enable);
+	priv->params.rx_cqe_compress_def = enable;
+
+	if (reset)
+		err = mlx5e_open_locked(netdev);
+	return err;
+}
+
 static int mlx5e_handle_pflag(struct net_device *netdev,
 			      u32 wanted_flags,
 			      enum mlx5e_priv_flag flag,
@@ -1511,13 +1540,19 @@ static int mlx5e_set_priv_flags(struct net_device *netdev, u32 pflags)
 	int err;
 
 	mutex_lock(&priv->state_lock);
-
 	err = mlx5e_handle_pflag(netdev, pflags,
 				 MLX5E_PFLAG_RX_CQE_BASED_MODER,
 				 set_pflag_rx_cqe_based_moder);
+	if (err)
+		goto out;
 
+	err = mlx5e_handle_pflag(netdev, pflags,
+				 MLX5E_PFLAG_RX_CQE_COMPRESS,
+				 set_pflag_rx_cqe_compress);
+
+out:
 	mutex_unlock(&priv->state_lock);
-	return err ? -EINVAL : 0;
+	return err;
 }
 
 static u32 mlx5e_get_priv_flags(struct net_device *netdev)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index 004940a..6b492ca 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -84,7 +84,8 @@ static void mlx5e_set_rq_type_params(struct mlx5e_priv *priv, u8 rq_type)
 	switch (priv->params.rq_wq_type) {
 	case MLX5_WQ_TYPE_LINKED_LIST_STRIDING_RQ:
 		priv->params.log_rq_size = MLX5E_PARAMS_DEFAULT_LOG_RQ_SIZE_MPW;
-		priv->params.mpwqe_log_stride_sz = priv->params.rx_cqe_compress ?
+		priv->params.mpwqe_log_stride_sz =
+			MLX5E_GET_PFLAG(priv, MLX5E_PFLAG_RX_CQE_COMPRESS) ?
 			MLX5_MPWRQ_LOG_STRIDE_SIZE_CQE_COMPRESS :
 			MLX5_MPWRQ_LOG_STRIDE_SIZE;
 		priv->params.mpwqe_log_num_strides = MLX5_MPWRQ_LOG_WQE_SZ -
@@ -101,7 +102,7 @@ static void mlx5e_set_rq_type_params(struct mlx5e_priv *priv, u8 rq_type)
 		       priv->params.rq_wq_type == MLX5_WQ_TYPE_LINKED_LIST_STRIDING_RQ,
 		       BIT(priv->params.log_rq_size),
 		       BIT(priv->params.mpwqe_log_stride_sz),
-		       priv->params.rx_cqe_compress_admin);
+		       MLX5E_GET_PFLAG(priv, MLX5E_PFLAG_RX_CQE_COMPRESS));
 }
 
 static void mlx5e_set_rq_priv_params(struct mlx5e_priv *priv)
@@ -1664,7 +1665,7 @@ static void mlx5e_build_rx_cq_param(struct mlx5e_priv *priv,
 	}
 
 	MLX5_SET(cqc, cqc, log_cq_size, log_cq_size);
-	if (priv->params.rx_cqe_compress) {
+	if (MLX5E_GET_PFLAG(priv, MLX5E_PFLAG_RX_CQE_COMPRESS)) {
 		MLX5_SET(cqc, cqc, mini_cqe_res_format, MLX5_CQE_FORMAT_CSUM);
 		MLX5_SET(cqc, cqc, cqe_comp_en, 1);
 	}
@@ -3447,17 +3448,16 @@ static void mlx5e_build_nic_netdev_priv(struct mlx5_core_dev *mdev,
 	priv->params.log_sq_size = MLX5E_PARAMS_DEFAULT_LOG_SQ_SIZE;
 
 	/* set CQE compression */
-	priv->params.rx_cqe_compress_admin = false;
+	priv->params.rx_cqe_compress_def = false;
 	if (MLX5_CAP_GEN(mdev, cqe_compression) &&
 	    MLX5_CAP_GEN(mdev, vport_group_manager)) {
 		mlx5e_get_max_linkspeed(mdev, &link_speed);
 		mlx5e_get_pci_bw(mdev, &pci_bw);
 		mlx5_core_dbg(mdev, "Max link speed = %d, PCI BW = %d\n",
 			      link_speed, pci_bw);
-		priv->params.rx_cqe_compress_admin =
+		priv->params.rx_cqe_compress_def =
 			cqe_compress_heuristic(link_speed, pci_bw);
 	}
-	priv->params.rx_cqe_compress = priv->params.rx_cqe_compress_admin;
 
 	mlx5e_set_rq_priv_params(priv);
 	if (priv->params.rq_wq_type == MLX5_WQ_TYPE_LINKED_LIST_STRIDING_RQ)
@@ -3490,6 +3490,7 @@ static void mlx5e_build_nic_netdev_priv(struct mlx5_core_dev *mdev,
 	/* Initialize pflags */
 	MLX5E_SET_PFLAG(priv, MLX5E_PFLAG_RX_CQE_BASED_MODER,
 			priv->params.rx_cq_period_mode == MLX5_CQ_PERIOD_MODE_START_FROM_CQE);
+	MLX5E_SET_PFLAG(priv, MLX5E_PFLAG_RX_CQE_COMPRESS, priv->params.rx_cqe_compress_def);
 
 	mutex_init(&priv->state_lock);
 
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
index c6de6fb..b036710 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
@@ -164,14 +164,14 @@ void mlx5e_modify_rx_cqe_compression(struct mlx5e_priv *priv, bool val)
 
 	mutex_lock(&priv->state_lock);
 
-	if (priv->params.rx_cqe_compress == val)
+	if (MLX5E_GET_PFLAG(priv, MLX5E_PFLAG_RX_CQE_COMPRESS) == val)
 		goto unlock;
 
 	was_opened = test_bit(MLX5E_STATE_OPENED, &priv->state);
 	if (was_opened)
 		mlx5e_close_locked(priv->netdev);
 
-	priv->params.rx_cqe_compress = val;
+	MLX5E_SET_PFLAG(priv, MLX5E_PFLAG_RX_CQE_COMPRESS, val);
 
 	if (was_opened)
 		mlx5e_open_locked(priv->netdev);
-- 
2.7.4

^ permalink raw reply related

* [PATCH net-next V2 02/10] net/mlx5e: Support DCBX CEE API
From: Saeed Mahameed @ 2016-11-27 15:02 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev, Huy Nguyen, Saeed Mahameed
In-Reply-To: <1480258932-10302-1-git-send-email-saeedm@mellanox.com>

From: Huy Nguyen <huyn@mellanox.com>

Add DCBX CEE API interface for ConnectX-4. Configurations are stored in
a temporary structure and are applied to the card's firmware when
the CEE's setall callback function is called.

Note:
  priority group in CEE is equivalent to traffic class in ConnectX-4
  hardware spec.

  bw allocation per priority in CEE is not supported because ConnectX-4
  only supports bw allocation per traffic class.

  user priority in CEE does not have an equivalent term in ConnectX-4.
  Therefore, user priority to priority mapping in CEE is not supported.

Signed-off-by: Huy Nguyen <huyn@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en.h       |  24 ++
 drivers/net/ethernet/mellanox/mlx5/core/en_dcbnl.c | 301 ++++++++++++++++++++-
 drivers/net/ethernet/mellanox/mlx5/core/port.c     |  43 +++
 include/linux/mlx5/port.h                          |   4 +
 4 files changed, 370 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h b/drivers/net/ethernet/mellanox/mlx5/core/en.h
index a2b32ed..31387ed 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h
@@ -221,6 +221,26 @@ struct mlx5e_params {
 	u32 lro_timeout;
 };
 
+#ifdef CONFIG_MLX5_CORE_EN_DCB
+struct mlx5e_cee_config {
+	/* bw pct for priority group */
+	u8                         pg_bw_pct[CEE_DCBX_MAX_PGS];
+	u8                         prio_to_pg_map[CEE_DCBX_MAX_PRIO];
+	bool                       pfc_setting[CEE_DCBX_MAX_PRIO];
+	bool                       pfc_enable;
+};
+
+enum {
+	MLX5_DCB_CHG_RESET,
+	MLX5_DCB_NO_CHG,
+	MLX5_DCB_CHG_NO_RESET,
+};
+
+struct mlx5e_dcbx {
+	struct mlx5e_cee_config    cee_cfg; /* pending configuration */
+};
+#endif
+
 struct mlx5e_tstamp {
 	rwlock_t                   lock;
 	struct cyclecounter        cycles;
@@ -688,6 +708,10 @@ struct mlx5e_priv {
 	struct mlx5e_stats         stats;
 	struct mlx5e_tstamp        tstamp;
 	u16 q_counter;
+#ifdef CONFIG_MLX5_CORE_EN_DCB
+	struct mlx5e_dcbx          dcbx;
+#endif
+
 	const struct mlx5e_profile *profile;
 	void                      *ppriv;
 };
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_dcbnl.c b/drivers/net/ethernet/mellanox/mlx5/core/en_dcbnl.c
index 762af16..0595243 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_dcbnl.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_dcbnl.c
@@ -38,6 +38,9 @@
 #define MLX5E_100MB (100000)
 #define MLX5E_1GB   (1000000)
 
+#define MLX5E_CEE_STATE_UP    1
+#define MLX5E_CEE_STATE_DOWN  0
+
 static int mlx5e_dcbnl_ieee_getets(struct net_device *netdev,
 				   struct ieee_ets *ets)
 {
@@ -222,13 +225,15 @@ static int mlx5e_dcbnl_ieee_setpfc(struct net_device *dev,
 
 static u8 mlx5e_dcbnl_getdcbx(struct net_device *dev)
 {
-	return DCB_CAP_DCBX_HOST | DCB_CAP_DCBX_VER_IEEE;
+	return DCB_CAP_DCBX_HOST |
+	       DCB_CAP_DCBX_VER_IEEE |
+	       DCB_CAP_DCBX_VER_CEE;
 }
 
 static u8 mlx5e_dcbnl_setdcbx(struct net_device *dev, u8 mode)
 {
 	if ((mode & DCB_CAP_DCBX_LLD_MANAGED) ||
-	    (mode & DCB_CAP_DCBX_VER_CEE) ||
+	    !(mode & DCB_CAP_DCBX_VER_CEE) ||
 	    !(mode & DCB_CAP_DCBX_VER_IEEE) ||
 	    !(mode & DCB_CAP_DCBX_HOST))
 		return 1;
@@ -304,6 +309,281 @@ static int mlx5e_dcbnl_ieee_setmaxrate(struct net_device *netdev,
 	return mlx5_modify_port_ets_rate_limit(mdev, max_bw_value, max_bw_unit);
 }
 
+static u8 mlx5e_dcbnl_setall(struct net_device *netdev)
+{
+	struct mlx5e_priv *priv = netdev_priv(netdev);
+	struct mlx5e_cee_config *cee_cfg = &priv->dcbx.cee_cfg;
+	struct mlx5_core_dev *mdev = priv->mdev;
+	struct ieee_ets ets;
+	struct ieee_pfc pfc;
+	int err;
+	int i;
+
+	memset(&ets, 0, sizeof(ets));
+	memset(&pfc, 0, sizeof(pfc));
+
+	ets.ets_cap = IEEE_8021QAZ_MAX_TCS;
+	for (i = 0; i < CEE_DCBX_MAX_PGS; i++) {
+		ets.tc_tx_bw[i] = cee_cfg->pg_bw_pct[i];
+		ets.tc_rx_bw[i] = cee_cfg->pg_bw_pct[i];
+		ets.tc_tsa[i]   = IEEE_8021QAZ_TSA_ETS;
+		ets.prio_tc[i]  = cee_cfg->prio_to_pg_map[i];
+	}
+
+	err = mlx5e_dbcnl_validate_ets(netdev, &ets);
+	if (err) {
+		netdev_err(netdev,
+			   "%s, Failed to validate ETS: %d\n", __func__, err);
+		goto out;
+	}
+
+	err = mlx5e_dcbnl_ieee_setets_core(priv, &ets);
+	if (err) {
+		netdev_err(netdev,
+			   "%s, Failed to set ETS: %d\n", __func__, err);
+		goto out;
+	}
+
+	/* Set PFC */
+	pfc.pfc_cap = mlx5_max_tc(mdev) + 1;
+	if (!cee_cfg->pfc_enable)
+		pfc.pfc_en = 0;
+	else
+		for (i = 0; i < CEE_DCBX_MAX_PRIO; i++)
+			pfc.pfc_en |= cee_cfg->pfc_setting[i] << i;
+
+	err = mlx5e_dcbnl_ieee_setpfc(netdev, &pfc);
+	if (err) {
+		netdev_err(netdev,
+			   "%s, Failed to set PFC: %d\n", __func__, err);
+		goto out;
+	}
+out:
+	return err ? MLX5_DCB_NO_CHG : MLX5_DCB_CHG_RESET;
+}
+
+static u8 mlx5e_dcbnl_getstate(struct net_device *netdev)
+{
+	return MLX5E_CEE_STATE_UP;
+}
+
+static void mlx5e_dcbnl_getpermhwaddr(struct net_device *netdev,
+				      u8 *perm_addr)
+{
+	struct mlx5e_priv *priv = netdev_priv(netdev);
+
+	if (!perm_addr)
+		return;
+
+	mlx5_query_nic_vport_mac_address(priv->mdev, 0, perm_addr);
+}
+
+static void mlx5e_dcbnl_setpgtccfgtx(struct net_device *netdev,
+				     int priority, u8 prio_type,
+				     u8 pgid, u8 bw_pct, u8 up_map)
+{
+	struct mlx5e_priv *priv = netdev_priv(netdev);
+	struct mlx5e_cee_config *cee_cfg = &priv->dcbx.cee_cfg;
+
+	if (priority >= CEE_DCBX_MAX_PRIO) {
+		netdev_err(netdev,
+			   "%s, priority is out of range\n", __func__);
+		return;
+	}
+
+	if (pgid >= CEE_DCBX_MAX_PGS) {
+		netdev_err(netdev,
+			   "%s, priority group is out of range\n", __func__);
+		return;
+	}
+
+	cee_cfg->prio_to_pg_map[priority] = pgid;
+}
+
+static void mlx5e_dcbnl_setpgbwgcfgtx(struct net_device *netdev,
+				      int pgid, u8 bw_pct)
+{
+	struct mlx5e_priv *priv = netdev_priv(netdev);
+	struct mlx5e_cee_config *cee_cfg = &priv->dcbx.cee_cfg;
+
+	if (pgid >= CEE_DCBX_MAX_PGS) {
+		netdev_err(netdev,
+			   "%s, priority group is out of range\n", __func__);
+		return;
+	}
+
+	cee_cfg->pg_bw_pct[pgid] = bw_pct;
+}
+
+static void mlx5e_dcbnl_getpgtccfgtx(struct net_device *netdev,
+				     int priority, u8 *prio_type,
+				     u8 *pgid, u8 *bw_pct, u8 *up_map)
+{
+	struct mlx5e_priv *priv = netdev_priv(netdev);
+	struct mlx5_core_dev *mdev = priv->mdev;
+
+	if (priority >= CEE_DCBX_MAX_PRIO) {
+		netdev_err(netdev,
+			   "%s, priority is out of range\n", __func__);
+		return;
+	}
+
+	*prio_type = 0;
+	*bw_pct = 0;
+	*up_map = 0;
+
+	if (mlx5_query_port_prio_tc(mdev, priority, pgid))
+		*pgid = 0;
+}
+
+static void mlx5e_dcbnl_getpgbwgcfgtx(struct net_device *netdev,
+				      int pgid, u8 *bw_pct)
+{
+	struct mlx5e_priv *priv = netdev_priv(netdev);
+	struct mlx5_core_dev *mdev = priv->mdev;
+
+	if (pgid >= CEE_DCBX_MAX_PGS) {
+		netdev_err(netdev,
+			   "%s, priority group is out of range\n", __func__);
+		return;
+	}
+
+	if (mlx5_query_port_tc_bw_alloc(mdev, pgid, bw_pct))
+		*bw_pct = 0;
+}
+
+static void mlx5e_dcbnl_setpfccfg(struct net_device *netdev,
+				  int priority, u8 setting)
+{
+	struct mlx5e_priv *priv = netdev_priv(netdev);
+	struct mlx5e_cee_config *cee_cfg = &priv->dcbx.cee_cfg;
+
+	if (priority >= CEE_DCBX_MAX_PRIO) {
+		netdev_err(netdev,
+			   "%s, priority is out of range\n", __func__);
+		return;
+	}
+
+	if (setting > 1)
+		return;
+
+	cee_cfg->pfc_setting[priority] = setting;
+}
+
+static int
+mlx5e_dcbnl_get_priority_pfc(struct net_device *netdev,
+			     int priority, u8 *setting)
+{
+	struct ieee_pfc pfc;
+	int err;
+
+	err = mlx5e_dcbnl_ieee_getpfc(netdev, &pfc);
+
+	if (err)
+		*setting = 0;
+	else
+		*setting = (pfc.pfc_en >> priority) & 0x01;
+
+	return err;
+}
+
+static void mlx5e_dcbnl_getpfccfg(struct net_device *netdev,
+				  int priority, u8 *setting)
+{
+	if (priority >= CEE_DCBX_MAX_PRIO) {
+		netdev_err(netdev,
+			   "%s, priority is out of range\n", __func__);
+		return;
+	}
+
+	if (!setting)
+		return;
+
+	mlx5e_dcbnl_get_priority_pfc(netdev, priority, setting);
+}
+
+static u8 mlx5e_dcbnl_getcap(struct net_device *netdev,
+			     int capid, u8 *cap)
+{
+	struct mlx5e_priv *priv = netdev_priv(netdev);
+	struct mlx5_core_dev *mdev = priv->mdev;
+	u8 rval = 0;
+
+	switch (capid) {
+	case DCB_CAP_ATTR_PG:
+		*cap = true;
+		break;
+	case DCB_CAP_ATTR_PFC:
+		*cap = true;
+		break;
+	case DCB_CAP_ATTR_UP2TC:
+		*cap = false;
+		break;
+	case DCB_CAP_ATTR_PG_TCS:
+		*cap = 1 << mlx5_max_tc(mdev);
+		break;
+	case DCB_CAP_ATTR_PFC_TCS:
+		*cap = 1 << mlx5_max_tc(mdev);
+		break;
+	case DCB_CAP_ATTR_GSP:
+		*cap = false;
+		break;
+	case DCB_CAP_ATTR_BCN:
+		*cap = false;
+		break;
+	case DCB_CAP_ATTR_DCBX:
+		*cap = (DCB_CAP_DCBX_LLD_MANAGED |
+			DCB_CAP_DCBX_VER_CEE |
+			DCB_CAP_DCBX_STATIC);
+		break;
+	default:
+		*cap = 0;
+		rval = 1;
+		break;
+	}
+
+	return rval;
+}
+
+static int mlx5e_dcbnl_getnumtcs(struct net_device *netdev,
+				 int tcs_id, u8 *num)
+{
+	struct mlx5e_priv *priv = netdev_priv(netdev);
+	struct mlx5_core_dev *mdev = priv->mdev;
+
+	switch (tcs_id) {
+	case DCB_NUMTCS_ATTR_PG:
+	case DCB_NUMTCS_ATTR_PFC:
+		*num = mlx5_max_tc(mdev) + 1;
+		break;
+	default:
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
+static u8 mlx5e_dcbnl_getpfcstate(struct net_device *netdev)
+{
+	struct ieee_pfc pfc;
+
+	if (mlx5e_dcbnl_ieee_getpfc(netdev, &pfc))
+		return MLX5E_CEE_STATE_DOWN;
+
+	return pfc.pfc_en ? MLX5E_CEE_STATE_UP : MLX5E_CEE_STATE_DOWN;
+}
+
+static void mlx5e_dcbnl_setpfcstate(struct net_device *netdev, u8 state)
+{
+	struct mlx5e_priv *priv = netdev_priv(netdev);
+	struct mlx5e_cee_config *cee_cfg = &priv->dcbx.cee_cfg;
+
+	if ((state != MLX5E_CEE_STATE_UP) && (state != MLX5E_CEE_STATE_DOWN))
+		return;
+
+	cee_cfg->pfc_enable = state;
+}
+
 const struct dcbnl_rtnl_ops mlx5e_dcbnl_ops = {
 	.ieee_getets	= mlx5e_dcbnl_ieee_getets,
 	.ieee_setets	= mlx5e_dcbnl_ieee_setets,
@@ -313,4 +593,21 @@ const struct dcbnl_rtnl_ops mlx5e_dcbnl_ops = {
 	.ieee_setpfc	= mlx5e_dcbnl_ieee_setpfc,
 	.getdcbx	= mlx5e_dcbnl_getdcbx,
 	.setdcbx	= mlx5e_dcbnl_setdcbx,
+
+/* CEE interfaces */
+	.setall         = mlx5e_dcbnl_setall,
+	.getstate       = mlx5e_dcbnl_getstate,
+	.getpermhwaddr  = mlx5e_dcbnl_getpermhwaddr,
+
+	.setpgtccfgtx   = mlx5e_dcbnl_setpgtccfgtx,
+	.setpgbwgcfgtx  = mlx5e_dcbnl_setpgbwgcfgtx,
+	.getpgtccfgtx   = mlx5e_dcbnl_getpgtccfgtx,
+	.getpgbwgcfgtx  = mlx5e_dcbnl_getpgbwgcfgtx,
+
+	.setpfccfg      = mlx5e_dcbnl_setpfccfg,
+	.getpfccfg      = mlx5e_dcbnl_getpfccfg,
+	.getcap         = mlx5e_dcbnl_getcap,
+	.getnumtcs      = mlx5e_dcbnl_getnumtcs,
+	.getpfcstate    = mlx5e_dcbnl_getpfcstate,
+	.setpfcstate    = mlx5e_dcbnl_setpfcstate,
 };
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/port.c b/drivers/net/ethernet/mellanox/mlx5/core/port.c
index b77928f..ed4898f 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/port.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/port.c
@@ -572,6 +572,28 @@ int mlx5_set_port_prio_tc(struct mlx5_core_dev *mdev, u8 *prio_tc)
 }
 EXPORT_SYMBOL_GPL(mlx5_set_port_prio_tc);
 
+int mlx5_query_port_prio_tc(struct mlx5_core_dev *mdev,
+			    u8 prio, u8 *tc)
+{
+	u32 in[MLX5_ST_SZ_DW(qtct_reg)];
+	u32 out[MLX5_ST_SZ_DW(qtct_reg)];
+	int err;
+
+	memset(in, 0, sizeof(in));
+	memset(out, 0, sizeof(out));
+
+	MLX5_SET(qtct_reg, in, port_number, 1);
+	MLX5_SET(qtct_reg, in, prio, prio);
+
+	err = mlx5_core_access_reg(mdev, in, sizeof(in), out,
+				   sizeof(out), MLX5_REG_QTCT, 0, 0);
+	if (!err)
+		*tc = MLX5_GET(qtct_reg, out, tclass);
+
+	return err;
+}
+EXPORT_SYMBOL_GPL(mlx5_query_port_prio_tc);
+
 static int mlx5_set_port_qetcr_reg(struct mlx5_core_dev *mdev, u32 *in,
 				   int inlen)
 {
@@ -625,6 +647,27 @@ int mlx5_set_port_tc_bw_alloc(struct mlx5_core_dev *mdev, u8 *tc_bw)
 }
 EXPORT_SYMBOL_GPL(mlx5_set_port_tc_bw_alloc);
 
+int mlx5_query_port_tc_bw_alloc(struct mlx5_core_dev *mdev,
+				u8 tc, u8 *bw_pct)
+{
+	u32 out[MLX5_ST_SZ_DW(qetc_reg)];
+	void *ets_tcn_conf;
+	int err;
+
+	err = mlx5_query_port_qetcr_reg(mdev, out, sizeof(out));
+	if (err)
+		return err;
+
+	ets_tcn_conf = MLX5_ADDR_OF(qetc_reg, out,
+				    tc_configuration[tc]);
+
+	*bw_pct = MLX5_GET(ets_tcn_config_reg, ets_tcn_conf,
+			   bw_allocation);
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(mlx5_query_port_tc_bw_alloc);
+
 int mlx5_modify_port_ets_rate_limit(struct mlx5_core_dev *mdev,
 				    u8 *max_bw_value,
 				    u8 *max_bw_units)
diff --git a/include/linux/mlx5/port.h b/include/linux/mlx5/port.h
index dde8c7e..bdee439 100644
--- a/include/linux/mlx5/port.h
+++ b/include/linux/mlx5/port.h
@@ -141,8 +141,12 @@ int mlx5_query_port_pfc(struct mlx5_core_dev *dev, u8 *pfc_en_tx,
 int mlx5_max_tc(struct mlx5_core_dev *mdev);
 
 int mlx5_set_port_prio_tc(struct mlx5_core_dev *mdev, u8 *prio_tc);
+int mlx5_query_port_prio_tc(struct mlx5_core_dev *mdev,
+			    u8 prio, u8 *tc);
 int mlx5_set_port_tc_group(struct mlx5_core_dev *mdev, u8 *tc_group);
 int mlx5_set_port_tc_bw_alloc(struct mlx5_core_dev *mdev, u8 *tc_bw);
+int mlx5_query_port_tc_bw_alloc(struct mlx5_core_dev *mdev,
+				u8 tc, u8 *bw_pct);
 int mlx5_modify_port_ets_rate_limit(struct mlx5_core_dev *mdev,
 				    u8 *max_bw_value,
 				    u8 *max_bw_unit);
-- 
2.7.4

^ permalink raw reply related

* [PATCH net-next V2 08/10] net/mlx5e: Add support for loopback selftest
From: Saeed Mahameed @ 2016-11-27 15:02 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev, Saeed Mahameed, Kamal Heib
In-Reply-To: <1480258932-10302-1-git-send-email-saeedm@mellanox.com>

Extend the self diagnostic tests to support loopback test.

The loopback test doesn't require the offline flag, it will use the
generic dev_queue_xmit and a dedicated packet_type to capture and verify
mlx5e selftest loopback packets.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Kamal Heib <kamalh@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en.h       |   3 +-
 .../net/ethernet/mellanox/mlx5/core/en_common.c    |   7 +-
 drivers/net/ethernet/mellanox/mlx5/core/en_main.c  |   2 +-
 .../net/ethernet/mellanox/mlx5/core/en_selftest.c  | 218 +++++++++++++++++++++
 4 files changed, 227 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h b/drivers/net/ethernet/mellanox/mlx5/core/en.h
index f7bb4a7..9cf32d3 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h
@@ -885,7 +885,8 @@ void mlx5e_destroy_tir(struct mlx5_core_dev *mdev,
 		       struct mlx5e_tir *tir);
 int mlx5e_create_mdev_resources(struct mlx5_core_dev *mdev);
 void mlx5e_destroy_mdev_resources(struct mlx5_core_dev *mdev);
-int mlx5e_refresh_tirs_self_loopback_enable(struct mlx5_core_dev *mdev);
+int mlx5e_refresh_tirs_self_loopback(struct mlx5_core_dev *mdev,
+				     bool enable_uc_lb);
 
 struct mlx5_eswitch_rep;
 int mlx5e_vport_rep_load(struct mlx5_eswitch *esw,
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_common.c b/drivers/net/ethernet/mellanox/mlx5/core/en_common.c
index 029e856..f175518 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_common.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_common.c
@@ -137,7 +137,8 @@ void mlx5e_destroy_mdev_resources(struct mlx5_core_dev *mdev)
 	mlx5_unmap_free_uar(mdev, &res->cq_uar);
 }
 
-int mlx5e_refresh_tirs_self_loopback_enable(struct mlx5_core_dev *mdev)
+int mlx5e_refresh_tirs_self_loopback(struct mlx5_core_dev *mdev,
+				     bool enable_uc_lb)
 {
 	struct mlx5e_tir *tir;
 	void *in;
@@ -149,6 +150,10 @@ int mlx5e_refresh_tirs_self_loopback_enable(struct mlx5_core_dev *mdev)
 	if (!in)
 		return -ENOMEM;
 
+	if (enable_uc_lb)
+		MLX5_SET(modify_tir_in, in, ctx.self_lb_block,
+			 MLX5_TIRC_SELF_LB_BLOCK_BLOCK_UNICAST_);
+
 	MLX5_SET(modify_tir_in, in, bitmask.self_lb_en, 1);
 
 	list_for_each_entry(tir, &mdev->mlx5e_res.td.tirs_list, list) {
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index f5b93c2..89d5c65 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -2136,7 +2136,7 @@ int mlx5e_open_locked(struct net_device *netdev)
 		goto err_clear_state_opened_flag;
 	}
 
-	err = mlx5e_refresh_tirs_self_loopback_enable(priv->mdev);
+	err = mlx5e_refresh_tirs_self_loopback(priv->mdev, false);
 	if (err) {
 		netdev_err(netdev, "%s: mlx5e_refresh_tirs_self_loopback_enable failed, %d\n",
 			   __func__, err);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_selftest.c b/drivers/net/ethernet/mellanox/mlx5/core/en_selftest.c
index 49504a7..c32af7d 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_selftest.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_selftest.c
@@ -30,12 +30,16 @@
  * SOFTWARE.
  */
 
+#include <linux/ip.h>
+#include <linux/udp.h>
+#include <net/udp.h>
 #include "en.h"
 
 enum {
 	MLX5E_ST_LINK_STATE,
 	MLX5E_ST_LINK_SPEED,
 	MLX5E_ST_HEALTH_INFO,
+	MLX5E_ST_LOOPBACK,
 	MLX5E_ST_NUM,
 };
 
@@ -43,6 +47,7 @@ const char mlx5e_self_tests[MLX5E_ST_NUM][ETH_GSTRING_LEN] = {
 	"Link Test",
 	"Speed Test",
 	"Health Test",
+	"Loopback Test",
 };
 
 int mlx5e_self_test_num(struct mlx5e_priv *priv)
@@ -88,10 +93,223 @@ static int mlx5e_test_link_speed(struct mlx5e_priv *priv)
 	return 1;
 }
 
+/* loopback test */
+#define MLX5E_TEST_PKT_SIZE (MLX5_MPWRQ_SMALL_PACKET_THRESHOLD - NET_IP_ALIGN)
+static const char mlx5e_test_text[ETH_GSTRING_LEN] = "MLX5E SELF TEST";
+#define MLX5E_TEST_MAGIC 0x5AEED15C001ULL
+
+struct mlx5ehdr {
+	__be32 version;
+	__be64 magic;
+	char   text[ETH_GSTRING_LEN];
+};
+
+static struct sk_buff *mlx5e_test_get_udp_skb(struct mlx5e_priv *priv)
+{
+	struct sk_buff *skb = NULL;
+	struct mlx5ehdr *mlxh;
+	struct ethhdr *ethh;
+	struct udphdr *udph;
+	struct iphdr *iph;
+	int datalen, iplen;
+
+	datalen = MLX5E_TEST_PKT_SIZE -
+		  (sizeof(*ethh) + sizeof(*iph) + sizeof(*udph));
+
+	skb = netdev_alloc_skb(priv->netdev, MLX5E_TEST_PKT_SIZE);
+	if (!skb) {
+		netdev_err(priv->netdev, "\tFailed to alloc loopback skb\n");
+		return NULL;
+	}
+
+	prefetchw(skb->data);
+	skb_reserve(skb, NET_IP_ALIGN);
+
+	/*  Reserve for ethernet and IP header  */
+	ethh = (struct ethhdr *)skb_push(skb, ETH_HLEN);
+	skb_reset_mac_header(skb);
+
+	skb_set_network_header(skb, skb->len);
+	iph = (struct iphdr *)skb_put(skb, sizeof(struct iphdr));
+
+	skb_set_transport_header(skb, skb->len);
+	udph = (struct udphdr *)skb_put(skb, sizeof(struct udphdr));
+
+	/* Fill ETH header */
+	ether_addr_copy(ethh->h_dest, priv->netdev->dev_addr);
+	eth_zero_addr(ethh->h_source);
+	ethh->h_proto = htons(ETH_P_IP);
+
+	/* Fill UDP header */
+	udph->source = htons(9);
+	udph->dest = htons(9); /* Discard Protocol */
+	udph->len = htons(datalen + sizeof(struct udphdr));
+	udph->check = 0;
+
+	/* Fill IP header */
+	iph->ihl = 5;
+	iph->ttl = 32;
+	iph->version = 4;
+	iph->protocol = IPPROTO_UDP;
+	iplen = sizeof(struct iphdr) + sizeof(struct udphdr) + datalen;
+	iph->tot_len = htons(iplen);
+	iph->frag_off = 0;
+	iph->saddr = 0;
+	iph->daddr = 0;
+	iph->tos = 0;
+	iph->id = 0;
+	ip_send_check(iph);
+
+	/* Fill test header and data */
+	mlxh = (struct mlx5ehdr *)skb_put(skb, sizeof(*mlxh));
+	mlxh->version = 0;
+	mlxh->magic = cpu_to_be64(MLX5E_TEST_MAGIC);
+	strlcpy(mlxh->text, mlx5e_test_text, sizeof(mlxh->text));
+	datalen -= sizeof(*mlxh);
+	memset(skb_put(skb, datalen), 0, datalen);
+
+	skb->csum = 0;
+	skb->ip_summed = CHECKSUM_PARTIAL;
+	udp4_hwcsum(skb, iph->saddr, iph->daddr);
+
+	skb->protocol = htons(ETH_P_IP);
+	skb->pkt_type = PACKET_HOST;
+	skb->dev = priv->netdev;
+
+	return skb;
+}
+
+struct mlx5e_lbt_priv {
+	struct packet_type pt;
+	struct completion comp;
+	bool loopback_ok;
+};
+
+static int
+mlx5e_test_loopback_validate(struct sk_buff *skb,
+			     struct net_device *ndev,
+			     struct packet_type *pt,
+			     struct net_device *orig_ndev)
+{
+	struct mlx5e_lbt_priv *lbtp = pt->af_packet_priv;
+	struct mlx5ehdr *mlxh;
+	struct ethhdr *ethh;
+	struct udphdr *udph;
+	struct iphdr *iph;
+
+	/* We are only going to peek, no need to clone the SKB */
+	if (skb->protocol != htons(ETH_P_IP))
+		goto out;
+
+	if (MLX5E_TEST_PKT_SIZE - ETH_HLEN > skb_headlen(skb))
+		goto out;
+
+	ethh = (struct ethhdr *)skb_mac_header(skb);
+	if (!ether_addr_equal(ethh->h_dest, orig_ndev->dev_addr))
+		goto out;
+
+	iph = ip_hdr(skb);
+	if (iph->protocol != IPPROTO_UDP)
+		goto out;
+
+	udph = udp_hdr(skb);
+	if (udph->dest != htons(9))
+		goto out;
+
+	mlxh = (struct mlx5ehdr *)((char *)udph + sizeof(*udph));
+	if (mlxh->magic != cpu_to_be64(MLX5E_TEST_MAGIC))
+		goto out; /* so close ! */
+
+	/* bingo */
+	lbtp->loopback_ok = true;
+	complete(&lbtp->comp);
+out:
+	kfree_skb(skb);
+	return 0;
+}
+
+static int mlx5e_test_loopback_setup(struct mlx5e_priv *priv,
+				     struct mlx5e_lbt_priv *lbtp)
+{
+	int err = 0;
+
+	err = mlx5e_refresh_tirs_self_loopback(priv->mdev, true);
+	if (err) {
+		netdev_err(priv->netdev,
+			   "\tFailed to enable UC loopback err(%d)\n", err);
+		return err;
+	}
+
+	lbtp->loopback_ok = false;
+	init_completion(&lbtp->comp);
+
+	lbtp->pt.type = htons(ETH_P_ALL);
+	lbtp->pt.func = mlx5e_test_loopback_validate;
+	lbtp->pt.dev = priv->netdev;
+	lbtp->pt.af_packet_priv = lbtp;
+	dev_add_pack(&lbtp->pt);
+	return err;
+}
+
+static void mlx5e_test_loopback_cleanup(struct mlx5e_priv *priv,
+					struct mlx5e_lbt_priv *lbtp)
+{
+	dev_remove_pack(&lbtp->pt);
+	mlx5e_refresh_tirs_self_loopback(priv->mdev, false);
+}
+
+#define MLX5E_LB_VERIFY_TIMEOUT (msecs_to_jiffies(200))
+static int mlx5e_test_loopback(struct mlx5e_priv *priv)
+{
+	struct mlx5e_lbt_priv *lbtp;
+	struct sk_buff *skb = NULL;
+	int err;
+
+	if (!test_bit(MLX5E_STATE_OPENED, &priv->state)) {
+		netdev_err(priv->netdev,
+			   "\tCan't perform loobpack test while device is down\n");
+		return -ENODEV;
+	}
+
+	lbtp = kzalloc(sizeof(*lbtp), GFP_KERNEL);
+	if (!lbtp)
+		return -ENOMEM;
+	lbtp->loopback_ok = false;
+
+	err = mlx5e_test_loopback_setup(priv, lbtp);
+	if (err)
+		goto out;
+
+	skb = mlx5e_test_get_udp_skb(priv);
+	if (!skb) {
+		err = -ENOMEM;
+		goto cleanup;
+	}
+
+	skb_set_queue_mapping(skb, 0);
+	err = dev_queue_xmit(skb);
+	if (err) {
+		netdev_err(priv->netdev,
+			   "\tFailed to xmit loopback packet err(%d)\n",
+			   err);
+		goto cleanup;
+	}
+
+	wait_for_completion_timeout(&lbtp->comp, MLX5E_LB_VERIFY_TIMEOUT);
+	err = !lbtp->loopback_ok;
+
+cleanup:
+	mlx5e_test_loopback_cleanup(priv, lbtp);
+out:
+	kfree(lbtp);
+	return err;
+}
+
 static int (*mlx5e_st_func[MLX5E_ST_NUM])(struct mlx5e_priv *) = {
 	mlx5e_test_link_state,
 	mlx5e_test_link_speed,
 	mlx5e_test_health_info,
+	mlx5e_test_loopback
 };
 
 void mlx5e_self_test(struct net_device *ndev, struct ethtool_test *etest,
-- 
2.7.4

^ permalink raw reply related

* [PATCH net-next V2 07/10] net/mlx5e: Add support for ethtool self diagnostics test
From: Saeed Mahameed @ 2016-11-27 15:02 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev, Kamal Heib, Saeed Mahameed
In-Reply-To: <1480258932-10302-1-git-send-email-saeedm@mellanox.com>

From: Kamal Heib <kamalh@mellanox.com>

The self diagnostics test implementaion include the following features:
1. Link Test: Check that link is in up state.
2. Speed Test: Check that link was negotiated correctly.
3. Health Test: Check the device health.

Signed-off-by: Kamal Heib <kamalh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/Makefile   |   2 +-
 drivers/net/ethernet/mellanox/mlx5/core/en.h       |   5 +
 .../net/ethernet/mellanox/mlx5/core/en_ethtool.c   |   8 +-
 .../net/ethernet/mellanox/mlx5/core/en_selftest.c  | 126 +++++++++++++++++++++
 4 files changed, 139 insertions(+), 2 deletions(-)
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/en_selftest.c

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/Makefile b/drivers/net/ethernet/mellanox/mlx5/core/Makefile
index 0343725..9f43beb 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/Makefile
+++ b/drivers/net/ethernet/mellanox/mlx5/core/Makefile
@@ -8,6 +8,6 @@ mlx5_core-y :=	main.o cmd.o debugfs.o fw.o eq.o uar.o pagealloc.o \
 mlx5_core-$(CONFIG_MLX5_CORE_EN) += wq.o eswitch.o eswitch_offloads.o \
 		en_main.o en_common.o en_fs.o en_ethtool.o en_tx.o \
 		en_rx.o en_rx_am.o en_txrx.o en_clock.o vxlan.o \
-		en_tc.o en_arfs.o en_rep.o en_fs_ethtool.o
+		en_tc.o en_arfs.o en_rep.o en_fs_ethtool.o en_selftest.o
 
 mlx5_core-$(CONFIG_MLX5_CORE_EN_DCB) +=  en_dcbnl.o
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h b/drivers/net/ethernet/mellanox/mlx5/core/en.h
index 6c954cd..f7bb4a7 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h
@@ -167,6 +167,8 @@ struct mlx5e_umr_wqe {
 	struct mlx5_wqe_data_seg       data;
 };
 
+extern const char mlx5e_self_tests[][ETH_GSTRING_LEN];
+
 static const char mlx5e_priv_flags[][ETH_GSTRING_LEN] = {
 	"rx_cqe_moder",
 };
@@ -754,6 +756,9 @@ int mlx5e_create_flow_steering(struct mlx5e_priv *priv);
 void mlx5e_destroy_flow_steering(struct mlx5e_priv *priv);
 void mlx5e_init_l2_addr(struct mlx5e_priv *priv);
 void mlx5e_destroy_flow_table(struct mlx5e_flow_table *ft);
+int mlx5e_self_test_num(struct mlx5e_priv *priv);
+void mlx5e_self_test(struct net_device *ndev, struct ethtool_test *etest,
+		     u64 *buf);
 int mlx5e_ethtool_get_flow(struct mlx5e_priv *priv, struct ethtool_rxnfc *info,
 			   int location);
 int mlx5e_ethtool_get_all_flows(struct mlx5e_priv *priv,
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c b/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c
index 9ea7b37..839c4e9 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c
@@ -180,6 +180,8 @@ static int mlx5e_get_sset_count(struct net_device *dev, int sset)
 
 	case ETH_SS_PRIV_FLAGS:
 		return ARRAY_SIZE(mlx5e_priv_flags);
+	case ETH_SS_TEST:
+		return mlx5e_self_test_num(priv);
 	/* fallthrough */
 	default:
 		return -EOPNOTSUPP;
@@ -286,6 +288,9 @@ static void mlx5e_get_strings(struct net_device *dev,
 		break;
 
 	case ETH_SS_TEST:
+		for (i = 0; i < mlx5e_self_test_num(priv); i++)
+			strcpy(data + i * ETH_GSTRING_LEN,
+			       mlx5e_self_tests[i]);
 		break;
 
 	case ETH_SS_STATS:
@@ -1573,5 +1578,6 @@ const struct ethtool_ops mlx5e_ethtool_ops = {
 	.get_module_info   = mlx5e_get_module_info,
 	.get_module_eeprom = mlx5e_get_module_eeprom,
 	.get_priv_flags    = mlx5e_get_priv_flags,
-	.set_priv_flags    = mlx5e_set_priv_flags
+	.set_priv_flags    = mlx5e_set_priv_flags,
+	.self_test         = mlx5e_self_test,
 };
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_selftest.c b/drivers/net/ethernet/mellanox/mlx5/core/en_selftest.c
new file mode 100644
index 0000000..49504a7
--- /dev/null
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_selftest.c
@@ -0,0 +1,126 @@
+/*
+ * Copyright (c) 2016, Mellanox Technologies, Ltd.  All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses.  You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ *     Redistribution and use in source and binary forms, with or
+ *     without modification, are permitted provided that the following
+ *     conditions are met:
+ *
+ *      - Redistributions of source code must retain the above
+ *        copyright notice, this list of conditions and the following
+ *        disclaimer.
+ *
+ *      - Redistributions in binary form must reproduce the above
+ *        copyright notice, this list of conditions and the following
+ *        disclaimer in the documentation and/or other materials
+ *        provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#include "en.h"
+
+enum {
+	MLX5E_ST_LINK_STATE,
+	MLX5E_ST_LINK_SPEED,
+	MLX5E_ST_HEALTH_INFO,
+	MLX5E_ST_NUM,
+};
+
+const char mlx5e_self_tests[MLX5E_ST_NUM][ETH_GSTRING_LEN] = {
+	"Link Test",
+	"Speed Test",
+	"Health Test",
+};
+
+int mlx5e_self_test_num(struct mlx5e_priv *priv)
+{
+	return ARRAY_SIZE(mlx5e_self_tests);
+}
+
+static int mlx5e_test_health_info(struct mlx5e_priv *priv)
+{
+	struct mlx5_core_health *health = &priv->mdev->priv.health;
+
+	return health->sick ? 1 : 0;
+}
+
+static int mlx5e_test_link_state(struct mlx5e_priv *priv)
+{
+	u8 port_state;
+
+	if (!netif_carrier_ok(priv->netdev))
+		return 1;
+
+	port_state = mlx5_query_vport_state(priv->mdev, MLX5_QUERY_VPORT_STATE_IN_OP_MOD_VNIC_VPORT, 0);
+	return port_state == VPORT_STATE_UP ? 0 : 1;
+}
+
+static int mlx5e_test_link_speed(struct mlx5e_priv *priv)
+{
+	u32 out[MLX5_ST_SZ_DW(ptys_reg)];
+	u32 eth_proto_oper;
+	int i;
+
+	if (!netif_carrier_ok(priv->netdev))
+		return 1;
+
+	if (mlx5_query_port_ptys(priv->mdev, out, sizeof(out), MLX5_PTYS_EN, 1))
+		return 1;
+
+	eth_proto_oper = MLX5_GET(ptys_reg, out, eth_proto_oper);
+	for (i = 0; i < MLX5E_LINK_MODES_NUMBER; i++) {
+		if (eth_proto_oper & MLX5E_PROT_MASK(i))
+			return 0;
+	}
+	return 1;
+}
+
+static int (*mlx5e_st_func[MLX5E_ST_NUM])(struct mlx5e_priv *) = {
+	mlx5e_test_link_state,
+	mlx5e_test_link_speed,
+	mlx5e_test_health_info,
+};
+
+void mlx5e_self_test(struct net_device *ndev, struct ethtool_test *etest,
+		     u64 *buf)
+{
+	struct mlx5e_priv *priv = netdev_priv(ndev);
+	int i;
+
+	memset(buf, 0, sizeof(u64) * MLX5E_ST_NUM);
+
+	mutex_lock(&priv->state_lock);
+	netdev_info(ndev, "Self test begin..\n");
+
+	for (i = 0; i < MLX5E_ST_NUM; i++) {
+		netdev_info(ndev, "\t[%d] %s start..\n",
+			    i, mlx5e_self_tests[i]);
+		buf[i] = mlx5e_st_func[i](priv);
+		netdev_info(ndev, "\t[%d] %s end: result(%lld)\n",
+			    i, mlx5e_self_tests[i], buf[i]);
+	}
+
+	mutex_unlock(&priv->state_lock);
+
+	for (i = 0; i < MLX5E_ST_NUM; i++) {
+		if (buf[i]) {
+			etest->flags |= ETH_TEST_FL_FAILED;
+			break;
+		}
+	}
+	netdev_info(ndev, "Self test out: status flags(0x%x)\n",
+		    etest->flags);
+}
-- 
2.7.4

^ permalink raw reply related

* [PATCH net-next V2 09/10] net/mlx5e: Moves pflags to priv->params
From: Saeed Mahameed @ 2016-11-27 15:02 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev, Shaker Daibes, Saeed Mahameed
In-Reply-To: <1480258932-10302-1-git-send-email-saeedm@mellanox.com>

From: Shaker Daibes <shakerd@mellanox.com>

pflags is a configuration parameter for the netdev, naturally it belongs
to priv->params.
Also introduce MLX5E_GET_PFLAG

Signed-off-by: Shaker Daibes <shakerd@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en.h         | 16 +++++++++-------
 drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c |  6 +++---
 drivers/net/ethernet/mellanox/mlx5/core/en_main.c    |  4 ++--
 3 files changed, 14 insertions(+), 12 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h b/drivers/net/ethernet/mellanox/mlx5/core/en.h
index 9cf32d3..84ac78f 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h
@@ -177,14 +177,16 @@ enum mlx5e_priv_flag {
 	MLX5E_PFLAG_RX_CQE_BASED_MODER = (1 << 0),
 };
 
-#define MLX5E_SET_PRIV_FLAG(priv, pflag, enable)    \
-	do {                                        \
-		if (enable)                         \
-			priv->pflags |= pflag;      \
-		else                                \
-			priv->pflags &= ~pflag;     \
+#define MLX5E_SET_PFLAG(priv, pflag, enable)			\
+	do {							\
+		if (enable)					\
+			(priv)->params.pflags |= (pflag);	\
+		else						\
+			(priv)->params.pflags &= ~(pflag);	\
 	} while (0)
 
+#define MLX5E_GET_PFLAG(priv, pflag) (!!((priv)->params.pflags & (pflag)))
+
 #ifdef CONFIG_MLX5_CORE_EN_DCB
 #define MLX5E_MAX_BW_ALLOC 100 /* Max percentage of BW allocation */
 #endif
@@ -218,6 +220,7 @@ struct mlx5e_params {
 	bool vlan_strip_disable;
 	bool rx_am_enabled;
 	u32 lro_timeout;
+	u32 pflags;
 };
 
 #ifdef CONFIG_MLX5_CORE_EN_DCB
@@ -705,7 +708,6 @@ struct mlx5e_priv {
 	struct work_struct         tx_timeout_work;
 	struct delayed_work        update_stats_work;
 
-	u32                        pflags;
 	struct mlx5_core_dev      *mdev;
 	struct net_device         *netdev;
 	struct mlx5e_stats         stats;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c b/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c
index 839c4e9..d2bdccb 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c
@@ -1488,7 +1488,7 @@ static int mlx5e_handle_pflag(struct net_device *netdev,
 {
 	struct mlx5e_priv *priv = netdev_priv(netdev);
 	bool enable = !!(wanted_flags & flag);
-	u32 changes = wanted_flags ^ priv->pflags;
+	u32 changes = wanted_flags ^ priv->params.pflags;
 	int err;
 
 	if (!(changes & flag))
@@ -1501,7 +1501,7 @@ static int mlx5e_handle_pflag(struct net_device *netdev,
 		return err;
 	}
 
-	MLX5E_SET_PRIV_FLAG(priv, flag, enable);
+	MLX5E_SET_PFLAG(priv, flag, enable);
 	return 0;
 }
 
@@ -1524,7 +1524,7 @@ static u32 mlx5e_get_priv_flags(struct net_device *netdev)
 {
 	struct mlx5e_priv *priv = netdev_priv(netdev);
 
-	return priv->pflags;
+	return priv->params.pflags;
 }
 
 static int mlx5e_set_rxnfc(struct net_device *dev, struct ethtool_rxnfc *cmd)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index 89d5c65..004940a 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -3488,8 +3488,8 @@ static void mlx5e_build_nic_netdev_priv(struct mlx5_core_dev *mdev,
 		SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
 
 	/* Initialize pflags */
-	MLX5E_SET_PRIV_FLAG(priv, MLX5E_PFLAG_RX_CQE_BASED_MODER,
-			    priv->params.rx_cq_period_mode == MLX5_CQ_PERIOD_MODE_START_FROM_CQE);
+	MLX5E_SET_PFLAG(priv, MLX5E_PFLAG_RX_CQE_BASED_MODER,
+			priv->params.rx_cq_period_mode == MLX5_CQ_PERIOD_MODE_START_FROM_CQE);
 
 	mutex_init(&priv->state_lock);
 
-- 
2.7.4

^ permalink raw reply related

* [PATCH net-next V2 04/10] net/mlx5: Add DCBX firmware commands support
From: Saeed Mahameed @ 2016-11-27 15:02 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev, Huy Nguyen, Saeed Mahameed
In-Reply-To: <1480258932-10302-1-git-send-email-saeedm@mellanox.com>

From: Huy Nguyen <huyn@mellanox.com>

Add set/query commands for DCBX_PARAM register

Signed-off-by: Huy Nguyen <huyn@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/port.c | 20 ++++++++++++++++++++
 include/linux/mlx5/driver.h                    |  7 +++++++
 include/linux/mlx5/port.h                      |  2 ++
 3 files changed, 29 insertions(+)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/port.c b/drivers/net/ethernet/mellanox/mlx5/core/port.c
index ed4898f..d2ec9d2 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/port.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/port.c
@@ -548,6 +548,26 @@ int mlx5_max_tc(struct mlx5_core_dev *mdev)
 	return num_tc - 1;
 }
 
+int mlx5_query_port_dcbx_param(struct mlx5_core_dev *mdev, u32 *out)
+{
+	u32 in[MLX5_ST_SZ_DW(dcbx_param)] = {0};
+
+	MLX5_SET(dcbx_param, in, port_number, 1);
+
+	return  mlx5_core_access_reg(mdev, in, sizeof(in), out,
+				    sizeof(in), MLX5_REG_DCBX_PARAM, 0, 0);
+}
+
+int mlx5_set_port_dcbx_param(struct mlx5_core_dev *mdev, u32 *in)
+{
+	u32 out[MLX5_ST_SZ_DW(dcbx_param)];
+
+	MLX5_SET(dcbx_param, in, port_number, 1);
+
+	return mlx5_core_access_reg(mdev, in, sizeof(out), out,
+				    sizeof(out), MLX5_REG_DCBX_PARAM, 0, 1);
+}
+
 int mlx5_set_port_prio_tc(struct mlx5_core_dev *mdev, u8 *prio_tc)
 {
 	u32 in[MLX5_ST_SZ_DW(qtct_reg)] = {0};
diff --git a/include/linux/mlx5/driver.h b/include/linux/mlx5/driver.h
index ae1f451..68b85ef 100644
--- a/include/linux/mlx5/driver.h
+++ b/include/linux/mlx5/driver.h
@@ -104,6 +104,8 @@ enum {
 enum {
 	MLX5_REG_QETCR		 = 0x4005,
 	MLX5_REG_QTCT		 = 0x400a,
+	MLX5_REG_DCBX_PARAM      = 0x4020,
+	MLX5_REG_DCBX_APP        = 0x4021,
 	MLX5_REG_PCAP		 = 0x5001,
 	MLX5_REG_PMTU		 = 0x5003,
 	MLX5_REG_PTYS		 = 0x5004,
@@ -124,6 +126,11 @@ enum {
 	MLX5_REG_MPCNT		 = 0x9051,
 };
 
+enum mlx5_dcbx_oper_mode {
+	MLX5E_DCBX_PARAM_VER_OPER_HOST  = 0x0,
+	MLX5E_DCBX_PARAM_VER_OPER_AUTO  = 0x3,
+};
+
 enum {
 	MLX5_ATOMIC_OPS_CMP_SWAP	= 1 << 0,
 	MLX5_ATOMIC_OPS_FETCH_ADD	= 1 << 1,
diff --git a/include/linux/mlx5/port.h b/include/linux/mlx5/port.h
index bdee439..e527732 100644
--- a/include/linux/mlx5/port.h
+++ b/include/linux/mlx5/port.h
@@ -162,4 +162,6 @@ void mlx5_query_port_fcs(struct mlx5_core_dev *mdev, bool *supported,
 int mlx5_query_module_eeprom(struct mlx5_core_dev *dev,
 			     u16 offset, u16 size, u8 *data);
 
+int mlx5_query_port_dcbx_param(struct mlx5_core_dev *mdev, u32 *out);
+int mlx5_set_port_dcbx_param(struct mlx5_core_dev *mdev, u32 *in);
 #endif /* __MLX5_PORT_H__ */
-- 
2.7.4

^ permalink raw reply related

* [PATCH net-next V2 05/10] net/mlx5e: ConnectX-4 firmware support for DCBX
From: Saeed Mahameed @ 2016-11-27 15:02 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev, Huy Nguyen, Saeed Mahameed
In-Reply-To: <1480258932-10302-1-git-send-email-saeedm@mellanox.com>

From: Huy Nguyen <huyn@mellanox.com>

DBCX by default is controlled by firmware where dcbx capability bit
is set. In this mode, firmware is responsible for reading/sending the
TLV packets from/to the remote partner.

This patch sets up the infrastructure to move between HOST/FW DCBX
control mode.

Signed-off-by: Huy Nguyen <huyn@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en.h       |  2 +
 drivers/net/ethernet/mellanox/mlx5/core/en_dcbnl.c | 91 ++++++++++++++++++++++
 drivers/net/ethernet/mellanox/mlx5/core/en_main.c  | 29 +------
 3 files changed, 94 insertions(+), 28 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h b/drivers/net/ethernet/mellanox/mlx5/core/en.h
index 60aa13b..6c954cd 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h
@@ -234,6 +234,7 @@ enum {
 };
 
 struct mlx5e_dcbx {
+	enum mlx5_dcbx_oper_mode   mode;
 	struct mlx5e_cee_config    cee_cfg; /* pending configuration */
 
 	/* The only setting that cannot be read from FW */
@@ -843,6 +844,7 @@ extern const struct ethtool_ops mlx5e_ethtool_ops;
 #ifdef CONFIG_MLX5_CORE_EN_DCB
 extern const struct dcbnl_rtnl_ops mlx5e_dcbnl_ops;
 int mlx5e_dcbnl_ieee_setets_core(struct mlx5e_priv *priv, struct ieee_ets *ets);
+void mlx5e_dcbnl_initialize(struct mlx5e_priv *priv);
 #endif
 
 #ifndef CONFIG_RFS_ACCEL
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_dcbnl.c b/drivers/net/ethernet/mellanox/mlx5/core/en_dcbnl.c
index 35e90d1..1753182 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_dcbnl.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_dcbnl.c
@@ -41,6 +41,45 @@
 #define MLX5E_CEE_STATE_UP    1
 #define MLX5E_CEE_STATE_DOWN  0
 
+/* If dcbx mode is non-host set the dcbx mode to host.
+ */
+static int mlx5e_dcbnl_set_dcbx_mode(struct mlx5e_priv *priv,
+				     enum mlx5_dcbx_oper_mode mode)
+{
+	struct mlx5_core_dev *mdev = priv->mdev;
+	u32 param[MLX5_ST_SZ_DW(dcbx_param)];
+	int err;
+
+	err = mlx5_query_port_dcbx_param(mdev, param);
+	if (err)
+		return err;
+
+	MLX5_SET(dcbx_param, param, version_admin, mode);
+	if (mode != MLX5E_DCBX_PARAM_VER_OPER_HOST)
+		MLX5_SET(dcbx_param, param, willing_admin, 1);
+
+	return mlx5_set_port_dcbx_param(mdev, param);
+}
+
+static int mlx5e_dcbnl_switch_to_host_mode(struct mlx5e_priv *priv)
+{
+	struct mlx5e_dcbx *dcbx = &priv->dcbx;
+	int err;
+
+	if (!MLX5_CAP_GEN(priv->mdev, dcbx))
+		return 0;
+
+	if (dcbx->mode == MLX5E_DCBX_PARAM_VER_OPER_HOST)
+		return 0;
+
+	err = mlx5e_dcbnl_set_dcbx_mode(priv, MLX5E_DCBX_PARAM_VER_OPER_HOST);
+	if (err)
+		return err;
+
+	dcbx->mode = MLX5E_DCBX_PARAM_VER_OPER_HOST;
+	return 0;
+}
+
 static int mlx5e_dcbnl_ieee_getets(struct net_device *netdev,
 				   struct ieee_ets *ets)
 {
@@ -255,6 +294,9 @@ static u8 mlx5e_dcbnl_getdcbx(struct net_device *dev)
 
 static u8 mlx5e_dcbnl_setdcbx(struct net_device *dev, u8 mode)
 {
+	if (mlx5e_dcbnl_switch_to_host_mode(netdev_priv(dev)))
+		return 1;
+
 	if ((mode & DCB_CAP_DCBX_LLD_MANAGED) ||
 	    !(mode & DCB_CAP_DCBX_VER_CEE) ||
 	    !(mode & DCB_CAP_DCBX_VER_IEEE) ||
@@ -637,3 +679,52 @@ const struct dcbnl_rtnl_ops mlx5e_dcbnl_ops = {
 	.getpfcstate    = mlx5e_dcbnl_getpfcstate,
 	.setpfcstate    = mlx5e_dcbnl_setpfcstate,
 };
+
+static void mlx5e_dcbnl_query_dcbx_mode(struct mlx5e_priv *priv,
+					enum mlx5_dcbx_oper_mode *mode)
+{
+	u32 out[MLX5_ST_SZ_DW(dcbx_param)];
+
+	*mode = MLX5E_DCBX_PARAM_VER_OPER_HOST;
+
+	if (!mlx5_query_port_dcbx_param(priv->mdev, out))
+		*mode = MLX5_GET(dcbx_param, out, version_oper);
+
+	/* From driver's point of view, we only care if the mode
+	 * is host (HOST) or non-host (AUTO)
+	 */
+	if (*mode != MLX5E_DCBX_PARAM_VER_OPER_HOST)
+		*mode = MLX5E_DCBX_PARAM_VER_OPER_AUTO;
+}
+
+static void mlx5e_ets_init(struct mlx5e_priv *priv)
+{
+	int i;
+	struct ieee_ets ets;
+
+	memset(&ets, 0, sizeof(ets));
+	ets.ets_cap = mlx5_max_tc(priv->mdev) + 1;
+	for (i = 0; i < ets.ets_cap; i++) {
+		ets.tc_tx_bw[i] = MLX5E_MAX_BW_ALLOC;
+		ets.tc_tsa[i] = IEEE_8021QAZ_TSA_VENDOR;
+		ets.prio_tc[i] = i;
+	}
+
+	memcpy(priv->dcbx.tc_tsa, ets.tc_tsa, sizeof(ets.tc_tsa));
+
+	/* tclass[prio=0]=1, tclass[prio=1]=0, tclass[prio=i]=i (for i>1) */
+	ets.prio_tc[0] = 1;
+	ets.prio_tc[1] = 0;
+
+	mlx5e_dcbnl_ieee_setets_core(priv, &ets);
+}
+
+void mlx5e_dcbnl_initialize(struct mlx5e_priv *priv)
+{
+	struct mlx5e_dcbx *dcbx = &priv->dcbx;
+
+	if (MLX5_CAP_GEN(priv->mdev, dcbx))
+		mlx5e_dcbnl_query_dcbx_mode(priv, &dcbx->mode);
+
+	mlx5e_ets_init(priv);
+}
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index b1e8ec5..f5b93c2 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -3325,33 +3325,6 @@ u16 mlx5e_get_max_inline_cap(struct mlx5_core_dev *mdev)
 	       2 /*sizeof(mlx5e_tx_wqe.inline_hdr_start)*/;
 }
 
-#ifdef CONFIG_MLX5_CORE_EN_DCB
-static void mlx5e_ets_init(struct mlx5e_priv *priv)
-{
-	struct ieee_ets ets;
-	int i;
-
-	if (!MLX5_CAP_GEN(priv->mdev, ets))
-		return;
-
-	memset(&ets, 0, sizeof(ets));
-	ets.ets_cap = mlx5_max_tc(priv->mdev) + 1;
-	for (i = 0; i < ets.ets_cap; i++) {
-		ets.tc_tx_bw[i] = MLX5E_MAX_BW_ALLOC;
-		ets.tc_tsa[i] = IEEE_8021QAZ_TSA_VENDOR;
-		ets.prio_tc[i] = i;
-	}
-
-	memcpy(priv->dcbx.tc_tsa, ets.tc_tsa, sizeof(ets.tc_tsa));
-
-	/* tclass[prio=0]=1, tclass[prio=1]=0, tclass[prio=i]=i (for i>1) */
-	ets.prio_tc[0] = 1;
-	ets.prio_tc[1] = 0;
-
-	mlx5e_dcbnl_ieee_setets_core(priv, &ets);
-}
-#endif
-
 void mlx5e_build_default_indir_rqt(struct mlx5_core_dev *mdev,
 				   u32 *indirection_rqt, int len,
 				   int num_channels)
@@ -3794,7 +3767,7 @@ static int mlx5e_init_nic_tx(struct mlx5e_priv *priv)
 	}
 
 #ifdef CONFIG_MLX5_CORE_EN_DCB
-	mlx5e_ets_init(priv);
+	mlx5e_dcbnl_initialize(priv);
 #endif
 	return 0;
 }
-- 
2.7.4

^ permalink raw reply related

* [PATCH net-next V2 00/10] Mellanox 100G mlx5 DCBX and ethtool updates
From: Saeed Mahameed @ 2016-11-27 15:02 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev, Saeed Mahameed

Hi Dave, 

This series provides the following mlx5 updates:

>From Huy:
DCBX CEE API and DCBX firmware/host modes support.
    - 1st patch ensures the dcbnl_rtnl_ops is published only when the qos
      capability bits is on. 
    - 2nd patch adds the support for CEE interfaces into mlx5 dcbnl_rtnl_ops
    - 3rd patch refactors ETS query to read ETS configuration directly from
      firmware rather than having a software shadow to it. The existing IEEE
      interfaces stays the same.
    - 4th patch adds the support for MLX5_REG_DCBX_PARAM and MLX5_REG_DCBX_APP
      firmware commands to manipulate mlx5 DCBX mode. 
    - 5th patch adds the driver support for the new DCBX firmware.  This ensures
      the backward compatibility versus the old and new firmware. With the new DCBX
      firmware, qos settings can be controlled by either firmware or software
      depending on the DCBX mode. 

>From Kamal and Saeed:
    - mlx5 self-test support.

>From Shaker:
    - Private flag to give the user the ability to enable/disable mlx5 CQE
      compression.

V1->V2:
    - Check ETS capability where needed in:
	("net/mlx5e: Read ETS settings directly from firmware")
    - Fix return value of mlx5e_dcbnl_switch_to_host_mode in:
	("net/mlx5e: ConnectX-4 firmware support for DCBX")
    - Update commit message of:
	("net/mlx5e: ConnectX-4 firmware support for DCBX")
    - Fix two sparse static check warnings in en_selftest.c

This series was generated against commit:
e5f12b3f5ebb ("Merge branch 'mlxsw-trap-groups-and-policers'")

Huy Nguyen (6):
  net/mlx5e: Add qos capability check
  net/mlx5e: Support DCBX CEE API
  net/mlx5e: Read ETS settings directly from firmware
  net/mlx5: Add DCBX firmware commands support
  net/mlx5e: ConnectX-4 firmware support for DCBX
  net/mlx5e: Add DCBX control interface

Kamal Heib (1):
  net/mlx5e: Add support for ethtool self diagnostics test

Saeed Mahameed (1):
  net/mlx5e: Add support for loopback selftest

Shaker Daibes (2):
  net/mlx5e: Moves pflags to priv->params
  net/mlx5e: Add CQE compression user control

 drivers/net/ethernet/mellanox/mlx5/core/Makefile   |   2 +-
 drivers/net/ethernet/mellanox/mlx5/core/en.h       |  61 ++-
 drivers/net/ethernet/mellanox/mlx5/core/en_clock.c |   3 +-
 .../net/ethernet/mellanox/mlx5/core/en_common.c    |   7 +-
 drivers/net/ethernet/mellanox/mlx5/core/en_dcbnl.c | 457 ++++++++++++++++++++-
 .../net/ethernet/mellanox/mlx5/core/en_ethtool.c   |  53 ++-
 drivers/net/ethernet/mellanox/mlx5/core/en_main.c  |  46 +--
 drivers/net/ethernet/mellanox/mlx5/core/en_rx.c    |   4 +-
 .../net/ethernet/mellanox/mlx5/core/en_selftest.c  | 344 ++++++++++++++++
 drivers/net/ethernet/mellanox/mlx5/core/port.c     |  63 +++
 include/linux/mlx5/driver.h                        |   7 +
 include/linux/mlx5/port.h                          |   6 +
 12 files changed, 985 insertions(+), 68 deletions(-)
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/en_selftest.c

-- 
2.7.4

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox