netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [pull request][for-next 00/12] Mellanox, mlx5 IPoIB Muli Pkey support 2017-10-11
@ 2017-10-14 18:48 Saeed Mahameed
  2017-10-14 18:48 ` [for-next 02/12] net/mlx5: PTP code migration to driver core section Saeed Mahameed
                   ` (9 more replies)
  0 siblings, 10 replies; 20+ messages in thread
From: Saeed Mahameed @ 2017-10-14 18:48 UTC (permalink / raw)
  To: David S. Miller, Doug Ledford
  Cc: netdev, linux-rdma, Leon Romanovsky, Saeed Mahameed

Hi Dave and Doug,

This series includes updates for mlx5 IPoIB offloading driver from Alex
and Feras to add the support for Muli Pkey in the mlx5i ipoib offloading netdev,
to be merged into net-next and rdma-next trees.

Doug, I am sorry I couldn't base this on rc2 since the series needs and conflicts
with a fix that was submitted to rc3, so to keep things simple I based it on rc4,
I hope this is ok with you..

Please pull and let me know if there's any problem.

Thanks,
Saeed.

---

The following changes since commit 8a5776a5f49812d29fe4b2d0a2d71675c3facf3f:

  Linux 4.14-rc4 (2017-10-08 20:53:29 -0700)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux.git tags/mlx5-updates-2017-10-11

for you to fetch changes up to b5ae577741bec22b584fa704076ccd8221cad19d:

  net/mlx5e: IPoIB, Modify rdma netdev allocate and free to support PKEY (2017-10-14 11:22:12 -0700)

----------------------------------------------------------------
mlx5-updates-2017-10-11: IPoIB Muli Pkey support

This series provides the support for IPoIB Multi Pkey.
InfiniBand Pkeys are the equivalent of Ethernet vlans.
Currently IPoIB device driver supports only default Pkey and IPoIB Pkey child
interfaces are not supported with IPoIB offloads mode, this series will add
the support for that by allowing creating mlx5 multiple IPoIB netdevices with
a non-default Pkey.

mlx5 IPoIB Pkey child interface is smaller version of mlx5i IPoIB interfaces and shares
most of its resources with the parent IPoIB interface, namely RX steering and ring
queue resources.

The only mlx5 resources a child Pkey interface will be creating are the TX rings,
since they should be assigned to a specific Pkey.

mlx5i Pkey netdev is implemented via new mlx5e netdev profile implemented in
mlx5/core/ipoib/ipoib_vlan.c.

The series starts with a refactoring of mlx5e PTP and mlx5 clock implementation
to move the code to be part of mlx5 core rather than mlx5e netdevice, in order to
make mlx5 clock and PTP registration part of the core to be shared with mlx5e
master Ethernet netdev/IPoIB parent netdev and mlx5_ib in the near future.

Add the support for attaching multiple underlay QPs for the different Pkeys
in mlx5 core RX steering.

Add Pkey index to rdma_netdev to add the ability to set PKEY index to lower
IPoIB offload netdev.

Use hash-table to map between DQPN (Destination QP number) to child netdev
for the IPoIB parent netdev to forward RX packets to the corresponding
child Pkey netdev, since the RX rings are shared.

The reset of the series adds the ipoib child Pkey: mlx5e netdev profile,
netdev nods implementation and minimal set of ethtool callbacks.

Thanks,
Saeed.

----------------------------------------------------------------
Alex Vesker (10):
      net/mlx5e: IPoIB, Move underlay QP init/uninit to separate functions
      net/mlx5: Support for attaching multiple underlay QPs to root flow table
      IB/ipoib: Grab rtnl lock on heavy flush when calling ndo_open/stop
      IB/ipoib: Add ability to set PKEY index to lower device driver
      net/mlx5e: IPoIB, Support for setting PKEY index to underlay QP
      net/mlx5e: IPoIB, Use hash-table to map between QPN to child netdev
      net/mlx5e: IPoIB, Add PKEY child interface nic profile
      net/mlx5e: IPoIB, Add PKEY child interface ndos
      net/mlx5e: IPoIB, Add PKEY child interface ethtool ops
      net/mlx5e: IPoIB, Modify rdma netdev allocate and free to support PKEY

Feras Daoud (2):
      net/mlx5: File renaming towards ptp core implementation
      net/mlx5: PTP code migration to driver core section

 drivers/infiniband/ulp/ipoib/ipoib_ib.c            |  15 +-
 drivers/net/ethernet/mellanox/mlx5/core/Kconfig    |   2 +-
 drivers/net/ethernet/mellanox/mlx5/core/Makefile   |   6 +-
 drivers/net/ethernet/mellanox/mlx5/core/en.h       |  39 +-
 drivers/net/ethernet/mellanox/mlx5/core/en_clock.c | 619 ---------------------
 .../net/ethernet/mellanox/mlx5/core/en_common.c    |   1 +
 .../net/ethernet/mellanox/mlx5/core/en_ethtool.c   |   7 +-
 drivers/net/ethernet/mellanox/mlx5/core/en_main.c  |  95 +++-
 drivers/net/ethernet/mellanox/mlx5/core/en_rx.c    |  37 +-
 drivers/net/ethernet/mellanox/mlx5/core/en_tx.c    |   6 +-
 drivers/net/ethernet/mellanox/mlx5/core/eq.c       |   3 +-
 drivers/net/ethernet/mellanox/mlx5/core/fs_cmd.c   |  13 +-
 drivers/net/ethernet/mellanox/mlx5/core/fs_cmd.h   |   4 +-
 drivers/net/ethernet/mellanox/mlx5/core/fs_core.c  | 123 +++-
 drivers/net/ethernet/mellanox/mlx5/core/fs_core.h  |   7 +-
 .../ethernet/mellanox/mlx5/core/ipoib/ethtool.c    |   5 +
 .../net/ethernet/mellanox/mlx5/core/ipoib/ipoib.c  | 260 ++++++---
 .../net/ethernet/mellanox/mlx5/core/ipoib/ipoib.h  |  36 ++
 .../ethernet/mellanox/mlx5/core/ipoib/ipoib_vlan.c | 350 ++++++++++++
 .../net/ethernet/mellanox/mlx5/core/lib/clock.c    | 525 +++++++++++++++++
 .../net/ethernet/mellanox/mlx5/core/lib/clock.h    |  51 ++
 drivers/net/ethernet/mellanox/mlx5/core/main.c     |   4 +
 .../net/ethernet/mellanox/mlx5/core/mlx5_core.h    |   1 +
 include/linux/mlx5/driver.h                        |  24 +
 24 files changed, 1437 insertions(+), 796 deletions(-)
 delete mode 100644 drivers/net/ethernet/mellanox/mlx5/core/en_clock.c
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib_vlan.c
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/lib/clock.c
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/lib/clock.h

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [for-next 01/12] net/mlx5: File renaming towards ptp core implementation
       [not found] ` <20171014184827.18491-1-saeedm-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
@ 2017-10-14 18:48   ` Saeed Mahameed
  2017-10-14 18:48   ` [for-next 03/12] net/mlx5e: IPoIB, Move underlay QP init/uninit to separate functions Saeed Mahameed
                     ` (3 subsequent siblings)
  4 siblings, 0 replies; 20+ messages in thread
From: Saeed Mahameed @ 2017-10-14 18:48 UTC (permalink / raw)
  To: David S. Miller, Doug Ledford
  Cc: netdev-u79uwXL29TY76Z2rM5mHXA, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	Leon Romanovsky, Feras Daoud, Eitan Rabin, Saeed Mahameed

From: Feras Daoud <ferasda-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>

en_clock.c renamed clock.c and moved to lib/ as first step
towards relocating code to core part of the driver to allow
sharing between Ethernet and Infiniband.

Signed-off-by: Feras Daoud <ferasda-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Signed-off-by: Eitan Rabin <rabin-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Signed-off-by: Saeed Mahameed <saeedm-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
 drivers/net/ethernet/mellanox/mlx5/core/Kconfig                     | 2 +-
 drivers/net/ethernet/mellanox/mlx5/core/Makefile                    | 4 ++--
 drivers/net/ethernet/mellanox/mlx5/core/{en_clock.c => lib/clock.c} | 0
 3 files changed, 3 insertions(+), 3 deletions(-)
 rename drivers/net/ethernet/mellanox/mlx5/core/{en_clock.c => lib/clock.c} (100%)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/Kconfig b/drivers/net/ethernet/mellanox/mlx5/core/Kconfig
index fdaef00465d7..25deaa5a534c 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/Kconfig
+++ b/drivers/net/ethernet/mellanox/mlx5/core/Kconfig
@@ -6,6 +6,7 @@ config MLX5_CORE
 	tristate "Mellanox Technologies ConnectX-4 and Connect-IB core driver"
 	depends on MAY_USE_DEVLINK
 	depends on PCI
+	imply PTP_1588_CLOCK
 	default n
 	---help---
 	  Core driver for low level functionality of the ConnectX-4 and
@@ -29,7 +30,6 @@ config MLX5_CORE_EN
 	bool "Mellanox Technologies ConnectX-4 Ethernet support"
 	depends on NETDEVICES && ETHERNET && INET && PCI && MLX5_CORE
 	depends on IPV6=y || IPV6=n || MLX5_CORE=m
-	imply PTP_1588_CLOCK
 	default n
 	---help---
 	  Ethernet support in Mellanox Technologies ConnectX-4 NIC.
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/Makefile b/drivers/net/ethernet/mellanox/mlx5/core/Makefile
index 87a3099808f3..d9621b2152d3 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/Makefile
+++ b/drivers/net/ethernet/mellanox/mlx5/core/Makefile
@@ -4,7 +4,7 @@ subdir-ccflags-y += -I$(src)
 mlx5_core-y :=	main.o cmd.o debugfs.o fw.o eq.o uar.o pagealloc.o \
 		health.o mcg.o cq.o srq.o alloc.o qp.o port.o mr.o pd.o \
 		mad.o transobj.o vport.o sriov.o fs_cmd.o fs_core.o \
-		fs_counters.o rl.o lag.o dev.o wq.o lib/gid.o \
+		fs_counters.o rl.o lag.o dev.o wq.o lib/gid.o lib/clock.o \
 		diag/fs_tracepoint.o
 
 mlx5_core-$(CONFIG_MLX5_ACCEL) += accel/ipsec.o
@@ -13,7 +13,7 @@ mlx5_core-$(CONFIG_MLX5_FPGA) += fpga/cmd.o fpga/core.o fpga/conn.o fpga/sdk.o \
 		fpga/ipsec.o
 
 mlx5_core-$(CONFIG_MLX5_CORE_EN) += en_main.o en_common.o en_fs.o en_ethtool.o \
-		en_tx.o en_rx.o en_rx_am.o en_txrx.o en_clock.o vxlan.o \
+		en_tx.o en_rx.o en_rx_am.o en_txrx.o vxlan.o \
 		en_arfs.o en_fs_ethtool.o en_selftest.o
 
 mlx5_core-$(CONFIG_MLX5_MPFS) += lib/mpfs.o
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_clock.c b/drivers/net/ethernet/mellanox/mlx5/core/lib/clock.c
similarity index 100%
rename from drivers/net/ethernet/mellanox/mlx5/core/en_clock.c
rename to drivers/net/ethernet/mellanox/mlx5/core/lib/clock.c
-- 
2.14.2

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [for-next 02/12] net/mlx5: PTP code migration to driver core section
  2017-10-14 18:48 [pull request][for-next 00/12] Mellanox, mlx5 IPoIB Muli Pkey support 2017-10-11 Saeed Mahameed
@ 2017-10-14 18:48 ` Saeed Mahameed
  2017-10-14 18:48 ` [for-next 04/12] net/mlx5: Support for attaching multiple underlay QPs to root flow table Saeed Mahameed
                   ` (8 subsequent siblings)
  9 siblings, 0 replies; 20+ messages in thread
From: Saeed Mahameed @ 2017-10-14 18:48 UTC (permalink / raw)
  To: David S. Miller, Doug Ledford
  Cc: netdev, linux-rdma, Leon Romanovsky, Feras Daoud, Eitan Rabin,
	Saeed Mahameed

From: Feras Daoud <ferasda@mellanox.com>

PTP code is moved to core section of mlx5 driver in order to share
it between ethernet and infiniband. This movement involves the following
changes:
- Change mlx5e_ prefix to be mlx5_
- Add clock structs to Core
- Add clock object to mlx5_core_dev
- Call Init/Uninit clock from core init/cleanup
- Rename mlx5e_tstamp to be mlx5_clock

Signed-off-by: Feras Daoud <ferasda@mellanox.com>
Signed-off-by: Eitan Rabin <rabin@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en.h       |  39 +-
 .../net/ethernet/mellanox/mlx5/core/en_ethtool.c   |   7 +-
 drivers/net/ethernet/mellanox/mlx5/core/en_main.c  |  95 +++-
 drivers/net/ethernet/mellanox/mlx5/core/en_rx.c    |  17 +-
 drivers/net/ethernet/mellanox/mlx5/core/en_tx.c    |   6 +-
 drivers/net/ethernet/mellanox/mlx5/core/eq.c       |   3 +-
 .../net/ethernet/mellanox/mlx5/core/ipoib/ipoib.c  |   3 +-
 .../net/ethernet/mellanox/mlx5/core/lib/clock.c    | 548 +++++++++------------
 .../net/ethernet/mellanox/mlx5/core/lib/clock.h    |  51 ++
 drivers/net/ethernet/mellanox/mlx5/core/main.c     |   4 +
 .../net/ethernet/mellanox/mlx5/core/mlx5_core.h    |   1 +
 include/linux/mlx5/driver.h                        |  24 +
 12 files changed, 416 insertions(+), 382 deletions(-)
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/lib/clock.h

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h b/drivers/net/ethernet/mellanox/mlx5/core/en.h
index cc13d3dbd366..2059122eb089 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h
@@ -267,28 +267,6 @@ struct mlx5e_dcbx {
 };
 #endif
 
-#define MAX_PIN_NUM	8
-struct mlx5e_pps {
-	u8                         pin_caps[MAX_PIN_NUM];
-	struct work_struct         out_work;
-	u64                        start[MAX_PIN_NUM];
-	u8                         enabled;
-};
-
-struct mlx5e_tstamp {
-	rwlock_t                   lock;
-	struct cyclecounter        cycles;
-	struct timecounter         clock;
-	struct hwtstamp_config     hwtstamp_config;
-	u32                        nominal_c_mult;
-	unsigned long              overflow_period;
-	struct delayed_work        overflow_work;
-	struct mlx5_core_dev      *mdev;
-	struct ptp_clock          *ptp;
-	struct ptp_clock_info      ptp_info;
-	struct mlx5e_pps           pps_info;
-};
-
 enum {
 	MLX5E_RQ_STATE_ENABLED,
 	MLX5E_RQ_STATE_AM,
@@ -375,9 +353,10 @@ struct mlx5e_txqsq {
 	u8                         min_inline_mode;
 	u16                        edge;
 	struct device             *pdev;
-	struct mlx5e_tstamp       *tstamp;
 	__be32                     mkey_be;
 	unsigned long              state;
+	struct hwtstamp_config    *tstamp;
+	struct mlx5_clock         *clock;
 
 	/* control path */
 	struct mlx5_wq_ctrl        wq_ctrl;
@@ -543,10 +522,11 @@ struct mlx5e_rq {
 	struct mlx5e_channel  *channel;
 	struct device         *pdev;
 	struct net_device     *netdev;
-	struct mlx5e_tstamp   *tstamp;
 	struct mlx5e_rq_stats  stats;
 	struct mlx5e_cq        cq;
 	struct mlx5e_page_cache page_cache;
+	struct hwtstamp_config *tstamp;
+	struct mlx5_clock      *clock;
 
 	mlx5e_fp_handle_rx_cqe handle_rx_cqe;
 	mlx5e_fp_post_rx_wqes  post_wqes;
@@ -588,7 +568,7 @@ struct mlx5e_channel {
 	/* control */
 	struct mlx5e_priv         *priv;
 	struct mlx5_core_dev      *mdev;
-	struct mlx5e_tstamp       *tstamp;
+	struct hwtstamp_config    *tstamp;
 	int                        ix;
 };
 
@@ -789,7 +769,7 @@ struct mlx5e_priv {
 	struct mlx5_core_dev      *mdev;
 	struct net_device         *netdev;
 	struct mlx5e_stats         stats;
-	struct mlx5e_tstamp        tstamp;
+	struct hwtstamp_config     tstamp;
 	u16 q_counter;
 #ifdef CONFIG_MLX5_CORE_EN_DCB
 	struct mlx5e_dcbx          dcbx;
@@ -873,12 +853,6 @@ void mlx5e_ethtool_init_steering(struct mlx5e_priv *priv);
 void mlx5e_ethtool_cleanup_steering(struct mlx5e_priv *priv);
 void mlx5e_set_rx_mode_work(struct work_struct *work);
 
-void mlx5e_fill_hwstamp(struct mlx5e_tstamp *clock, u64 timestamp,
-			struct skb_shared_hwtstamps *hwts);
-void mlx5e_timestamp_init(struct mlx5e_priv *priv);
-void mlx5e_timestamp_cleanup(struct mlx5e_priv *priv);
-void mlx5e_pps_event_handler(struct mlx5e_priv *priv,
-			     struct ptp_clock_event *event);
 int mlx5e_hwstamp_set(struct mlx5e_priv *priv, struct ifreq *ifr);
 int mlx5e_hwstamp_get(struct mlx5e_priv *priv, struct ifreq *ifr);
 int mlx5e_modify_rx_cqe_compression_locked(struct mlx5e_priv *priv, bool val);
@@ -889,6 +863,7 @@ int mlx5e_vlan_rx_kill_vid(struct net_device *dev, __always_unused __be16 proto,
 			   u16 vid);
 void mlx5e_enable_vlan_filter(struct mlx5e_priv *priv);
 void mlx5e_disable_vlan_filter(struct mlx5e_priv *priv);
+void mlx5e_timestamp_set(struct mlx5e_priv *priv);
 
 struct mlx5e_redirect_rqt_param {
 	bool is_rss;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c b/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c
index d12e9fc0d76b..81a112e40fe3 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c
@@ -1417,14 +1417,15 @@ static int mlx5e_set_pauseparam(struct net_device *netdev,
 int mlx5e_ethtool_get_ts_info(struct mlx5e_priv *priv,
 			      struct ethtool_ts_info *info)
 {
+	struct mlx5_core_dev *mdev = priv->mdev;
 	int ret;
 
 	ret = ethtool_op_get_ts_info(priv->netdev, info);
 	if (ret)
 		return ret;
 
-	info->phc_index = priv->tstamp.ptp ?
-			  ptp_clock_index(priv->tstamp.ptp) : -1;
+	info->phc_index = mdev->clock.ptp ?
+			  ptp_clock_index(mdev->clock.ptp) : -1;
 
 	if (!MLX5_CAP_GEN(priv->mdev, device_frequency_khz))
 		return 0;
@@ -1754,7 +1755,7 @@ static int set_pflag_rx_cqe_compress(struct net_device *netdev,
 	if (!MLX5_CAP_GEN(mdev, cqe_compression))
 		return -EOPNOTSUPP;
 
-	if (enable && priv->tstamp.hwtstamp_config.rx_filter != HWTSTAMP_FILTER_NONE) {
+	if (enable && priv->tstamp.rx_filter != HWTSTAMP_FILTER_NONE) {
 		netdev_err(netdev, "Can't enable cqe compression while timestamping is enabled.\n");
 		return -EINVAL;
 	}
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index cc11bbbd0309..6df00dd9745a 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -373,8 +373,6 @@ static void mlx5e_async_event(struct mlx5_core_dev *mdev, void *vpriv,
 			      enum mlx5_dev_event event, unsigned long param)
 {
 	struct mlx5e_priv *priv = vpriv;
-	struct ptp_clock_event ptp_event;
-	struct mlx5_eqe *eqe = NULL;
 
 	if (!test_bit(MLX5E_STATE_ASYNC_EVENTS_ENABLED, &priv->state))
 		return;
@@ -384,14 +382,6 @@ static void mlx5e_async_event(struct mlx5_core_dev *mdev, void *vpriv,
 	case MLX5_DEV_EVENT_PORT_DOWN:
 		queue_work(priv->wq, &priv->update_carrier_work);
 		break;
-	case MLX5_DEV_EVENT_PPS:
-		eqe = (struct mlx5_eqe *)param;
-		ptp_event.index = eqe->data.pps.pin;
-		ptp_event.timestamp =
-			timecounter_cyc2time(&priv->tstamp.clock,
-					     be64_to_cpu(eqe->data.pps.time_stamp));
-		mlx5e_pps_event_handler(vpriv, &ptp_event);
-		break;
 	default:
 		break;
 	}
@@ -585,6 +575,7 @@ static int mlx5e_alloc_rq(struct mlx5e_channel *c,
 	rq->pdev    = c->pdev;
 	rq->netdev  = c->netdev;
 	rq->tstamp  = c->tstamp;
+	rq->clock   = &mdev->clock;
 	rq->channel = c;
 	rq->ix      = c->ix;
 	rq->mdev    = mdev;
@@ -1123,6 +1114,7 @@ static int mlx5e_alloc_txqsq(struct mlx5e_channel *c,
 
 	sq->pdev      = c->pdev;
 	sq->tstamp    = c->tstamp;
+	sq->clock     = &mdev->clock;
 	sq->mkey_be   = c->mkey_be;
 	sq->channel   = c;
 	sq->txq_ix    = txq_ix;
@@ -2678,6 +2670,12 @@ void mlx5e_switch_priv_channels(struct mlx5e_priv *priv,
 		netif_carrier_on(netdev);
 }
 
+void mlx5e_timestamp_set(struct mlx5e_priv *priv)
+{
+	priv->tstamp.tx_type   = HWTSTAMP_TX_OFF;
+	priv->tstamp.rx_filter = HWTSTAMP_FILTER_NONE;
+}
+
 int mlx5e_open_locked(struct net_device *netdev)
 {
 	struct mlx5e_priv *priv = netdev_priv(netdev);
@@ -2693,7 +2691,7 @@ int mlx5e_open_locked(struct net_device *netdev)
 	mlx5e_activate_priv_channels(priv);
 	if (priv->profile->update_carrier)
 		priv->profile->update_carrier(priv);
-	mlx5e_timestamp_init(priv);
+	mlx5e_timestamp_set(priv);
 
 	if (priv->profile->update_stats)
 		queue_delayed_work(priv->wq, &priv->update_stats_work, 0);
@@ -2731,7 +2729,6 @@ int mlx5e_close_locked(struct net_device *netdev)
 
 	clear_bit(MLX5E_STATE_OPENED, &priv->state);
 
-	mlx5e_timestamp_cleanup(priv);
 	netif_carrier_off(priv->netdev);
 	mlx5e_deactivate_priv_channels(priv);
 	mlx5e_close_channels(&priv->channels);
@@ -3403,6 +3400,80 @@ static int mlx5e_change_mtu(struct net_device *netdev, int new_mtu)
 	return err;
 }
 
+int mlx5e_hwstamp_set(struct mlx5e_priv *priv, struct ifreq *ifr)
+{
+	struct hwtstamp_config config;
+	int err;
+
+	if (!MLX5_CAP_GEN(priv->mdev, device_frequency_khz))
+		return -EOPNOTSUPP;
+
+	if (copy_from_user(&config, ifr->ifr_data, sizeof(config)))
+		return -EFAULT;
+
+	/* TX HW timestamp */
+	switch (config.tx_type) {
+	case HWTSTAMP_TX_OFF:
+	case HWTSTAMP_TX_ON:
+		break;
+	default:
+		return -ERANGE;
+	}
+
+	mutex_lock(&priv->state_lock);
+	/* RX HW timestamp */
+	switch (config.rx_filter) {
+	case HWTSTAMP_FILTER_NONE:
+		/* Reset CQE compression to Admin default */
+		mlx5e_modify_rx_cqe_compression_locked(priv, priv->channels.params.rx_cqe_compress_def);
+		break;
+	case HWTSTAMP_FILTER_ALL:
+	case HWTSTAMP_FILTER_SOME:
+	case HWTSTAMP_FILTER_PTP_V1_L4_EVENT:
+	case HWTSTAMP_FILTER_PTP_V1_L4_SYNC:
+	case HWTSTAMP_FILTER_PTP_V1_L4_DELAY_REQ:
+	case HWTSTAMP_FILTER_PTP_V2_L4_EVENT:
+	case HWTSTAMP_FILTER_PTP_V2_L4_SYNC:
+	case HWTSTAMP_FILTER_PTP_V2_L4_DELAY_REQ:
+	case HWTSTAMP_FILTER_PTP_V2_L2_EVENT:
+	case HWTSTAMP_FILTER_PTP_V2_L2_SYNC:
+	case HWTSTAMP_FILTER_PTP_V2_L2_DELAY_REQ:
+	case HWTSTAMP_FILTER_PTP_V2_EVENT:
+	case HWTSTAMP_FILTER_PTP_V2_SYNC:
+	case HWTSTAMP_FILTER_PTP_V2_DELAY_REQ:
+	case HWTSTAMP_FILTER_NTP_ALL:
+		/* Disable CQE compression */
+		netdev_warn(priv->netdev, "Disabling cqe compression");
+		err = mlx5e_modify_rx_cqe_compression_locked(priv, false);
+		if (err) {
+			netdev_err(priv->netdev, "Failed disabling cqe compression err=%d\n", err);
+			mutex_unlock(&priv->state_lock);
+			return err;
+		}
+		config.rx_filter = HWTSTAMP_FILTER_ALL;
+		break;
+	default:
+		mutex_unlock(&priv->state_lock);
+		return -ERANGE;
+	}
+
+	memcpy(&priv->tstamp, &config, sizeof(config));
+	mutex_unlock(&priv->state_lock);
+
+	return copy_to_user(ifr->ifr_data, &config,
+			    sizeof(config)) ? -EFAULT : 0;
+}
+
+int mlx5e_hwstamp_get(struct mlx5e_priv *priv, struct ifreq *ifr)
+{
+	struct hwtstamp_config *cfg = &priv->tstamp;
+
+	if (!MLX5_CAP_GEN(priv->mdev, device_frequency_khz))
+		return -EOPNOTSUPP;
+
+	return copy_to_user(ifr->ifr_data, cfg, sizeof(*cfg)) ? -EFAULT : 0;
+}
+
 static int mlx5e_ioctl(struct net_device *dev, struct ifreq *ifr, int cmd)
 {
 	struct mlx5e_priv *priv = netdev_priv(dev);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
index 15a1687483cc..7e3bfe62ef6e 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
@@ -42,10 +42,11 @@
 #include "en_rep.h"
 #include "ipoib/ipoib.h"
 #include "en_accel/ipsec_rxtx.h"
+#include "lib/clock.h"
 
-static inline bool mlx5e_rx_hw_stamp(struct mlx5e_tstamp *tstamp)
+static inline bool mlx5e_rx_hw_stamp(struct hwtstamp_config *config)
 {
-	return tstamp->hwtstamp_config.rx_filter == HWTSTAMP_FILTER_ALL;
+	return config->rx_filter == HWTSTAMP_FILTER_ALL;
 }
 
 static inline void mlx5e_read_cqe_slot(struct mlx5e_cq *cq, u32 cqcc,
@@ -661,7 +662,6 @@ static inline void mlx5e_build_rx_skb(struct mlx5_cqe64 *cqe,
 				      struct sk_buff *skb)
 {
 	struct net_device *netdev = rq->netdev;
-	struct mlx5e_tstamp *tstamp = rq->tstamp;
 	int lro_num_seg;
 
 	lro_num_seg = be32_to_cpu(cqe->srqn) >> 24;
@@ -676,8 +676,9 @@ static inline void mlx5e_build_rx_skb(struct mlx5_cqe64 *cqe,
 		rq->stats.lro_bytes += cqe_bcnt;
 	}
 
-	if (unlikely(mlx5e_rx_hw_stamp(tstamp)))
-		mlx5e_fill_hwstamp(tstamp, get_cqe_ts(cqe), skb_hwtstamps(skb));
+	if (unlikely(mlx5e_rx_hw_stamp(rq->tstamp)))
+		skb_hwtstamps(skb)->hwtstamp =
+				mlx5_timecounter_cyc2time(rq->clock, get_cqe_ts(cqe));
 
 	skb_record_rx_queue(skb, rq->ix);
 
@@ -1163,7 +1164,6 @@ static inline void mlx5i_complete_rx_cqe(struct mlx5e_rq *rq,
 					 struct sk_buff *skb)
 {
 	struct net_device *netdev = rq->netdev;
-	struct mlx5e_tstamp *tstamp = rq->tstamp;
 	char *pseudo_header;
 	u8 *dgid;
 	u8 g;
@@ -1188,8 +1188,9 @@ static inline void mlx5i_complete_rx_cqe(struct mlx5e_rq *rq,
 	skb->ip_summed = CHECKSUM_COMPLETE;
 	skb->csum = csum_unfold((__force __sum16)cqe->check_sum);
 
-	if (unlikely(mlx5e_rx_hw_stamp(tstamp)))
-		mlx5e_fill_hwstamp(tstamp, get_cqe_ts(cqe), skb_hwtstamps(skb));
+	if (unlikely(mlx5e_rx_hw_stamp(rq->tstamp)))
+		skb_hwtstamps(skb)->hwtstamp =
+				mlx5_timecounter_cyc2time(rq->clock, get_cqe_ts(cqe));
 
 	skb_record_rx_queue(skb, rq->ix);
 
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c
index 1d6925d4369a..a7c208a1ad83 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c
@@ -35,6 +35,7 @@
 #include "en.h"
 #include "ipoib/ipoib.h"
 #include "en_accel/ipsec_rxtx.h"
+#include "lib/clock.h"
 
 #define MLX5E_SQ_NOPS_ROOM  MLX5_SEND_WQE_MAX_WQEBBS
 #define MLX5E_SQ_STOP_ROOM (MLX5_SEND_WQE_MAX_WQEBBS +\
@@ -452,8 +453,9 @@ bool mlx5e_poll_tx_cq(struct mlx5e_cq *cq, int napi_budget)
 				     SKBTX_HW_TSTAMP)) {
 				struct skb_shared_hwtstamps hwts = {};
 
-				mlx5e_fill_hwstamp(sq->tstamp,
-						   get_cqe_ts(cqe), &hwts);
+				hwts.hwtstamp =
+					mlx5_timecounter_cyc2time(sq->clock,
+								  get_cqe_ts(cqe));
 				skb_tstamp_tx(skb, &hwts);
 			}
 
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eq.c b/drivers/net/ethernet/mellanox/mlx5/core/eq.c
index fc606bfd1d6e..60771865c99c 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/eq.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eq.c
@@ -491,8 +491,7 @@ static irqreturn_t mlx5_eq_int(int irq, void *eq_ptr)
 			break;
 
 		case MLX5_EVENT_TYPE_PPS_EVENT:
-			if (dev->event)
-				dev->event(dev, MLX5_DEV_EVENT_PPS, (unsigned long)eqe);
+			mlx5_pps_event(dev, eqe);
 			break;
 
 		case MLX5_EVENT_TYPE_FPGA_ERROR:
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib.c b/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib.c
index 145e392ab849..14dfb577691b 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib.c
@@ -404,7 +404,7 @@ static int mlx5i_open(struct net_device *netdev)
 
 	mlx5e_refresh_tirs(priv, false);
 	mlx5e_activate_priv_channels(priv);
-	mlx5e_timestamp_init(priv);
+	mlx5e_timestamp_set(priv);
 
 	mutex_unlock(&priv->state_lock);
 	return 0;
@@ -429,7 +429,6 @@ static int mlx5i_close(struct net_device *netdev)
 
 	clear_bit(MLX5E_STATE_OPENED, &priv->state);
 
-	mlx5e_timestamp_cleanup(priv);
 	netif_carrier_off(priv->netdev);
 	mlx5e_deactivate_priv_channels(priv);
 	mlx5e_close_channels(&priv->channels);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lib/clock.c b/drivers/net/ethernet/mellanox/mlx5/core/lib/clock.c
index 84dd63e74041..fa8aed62b231 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/lib/clock.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/lib/clock.c
@@ -34,250 +34,164 @@
 #include "en.h"
 
 enum {
-	MLX5E_CYCLES_SHIFT	= 23
+	MLX5_CYCLES_SHIFT	= 23
 };
 
 enum {
-	MLX5E_PIN_MODE_IN		= 0x0,
-	MLX5E_PIN_MODE_OUT		= 0x1,
+	MLX5_PIN_MODE_IN		= 0x0,
+	MLX5_PIN_MODE_OUT		= 0x1,
 };
 
 enum {
-	MLX5E_OUT_PATTERN_PULSE		= 0x0,
-	MLX5E_OUT_PATTERN_PERIODIC	= 0x1,
+	MLX5_OUT_PATTERN_PULSE		= 0x0,
+	MLX5_OUT_PATTERN_PERIODIC	= 0x1,
 };
 
 enum {
-	MLX5E_EVENT_MODE_DISABLE	= 0x0,
-	MLX5E_EVENT_MODE_REPETETIVE	= 0x1,
-	MLX5E_EVENT_MODE_ONCE_TILL_ARM	= 0x2,
+	MLX5_EVENT_MODE_DISABLE	= 0x0,
+	MLX5_EVENT_MODE_REPETETIVE	= 0x1,
+	MLX5_EVENT_MODE_ONCE_TILL_ARM	= 0x2,
 };
 
 enum {
-	MLX5E_MTPPS_FS_ENABLE			= BIT(0x0),
-	MLX5E_MTPPS_FS_PATTERN			= BIT(0x2),
-	MLX5E_MTPPS_FS_PIN_MODE			= BIT(0x3),
-	MLX5E_MTPPS_FS_TIME_STAMP		= BIT(0x4),
-	MLX5E_MTPPS_FS_OUT_PULSE_DURATION	= BIT(0x5),
-	MLX5E_MTPPS_FS_ENH_OUT_PER_ADJ		= BIT(0x7),
+	MLX5_MTPPS_FS_ENABLE			= BIT(0x0),
+	MLX5_MTPPS_FS_PATTERN			= BIT(0x2),
+	MLX5_MTPPS_FS_PIN_MODE			= BIT(0x3),
+	MLX5_MTPPS_FS_TIME_STAMP		= BIT(0x4),
+	MLX5_MTPPS_FS_OUT_PULSE_DURATION	= BIT(0x5),
+	MLX5_MTPPS_FS_ENH_OUT_PER_ADJ		= BIT(0x7),
 };
 
-void mlx5e_fill_hwstamp(struct mlx5e_tstamp *tstamp, u64 timestamp,
-			struct skb_shared_hwtstamps *hwts)
+static u64 read_internal_timer(const struct cyclecounter *cc)
 {
-	u64 nsec;
+	struct mlx5_clock *clock = container_of(cc, struct mlx5_clock, cycles);
+	struct mlx5_core_dev *mdev = container_of(clock, struct mlx5_core_dev,
+						  clock);
 
-	read_lock(&tstamp->lock);
-	nsec = timecounter_cyc2time(&tstamp->clock, timestamp);
-	read_unlock(&tstamp->lock);
-
-	hwts->hwtstamp = ns_to_ktime(nsec);
-}
-
-static u64 mlx5e_read_internal_timer(const struct cyclecounter *cc)
-{
-	struct mlx5e_tstamp *tstamp = container_of(cc, struct mlx5e_tstamp,
-						   cycles);
-
-	return mlx5_read_internal_timer(tstamp->mdev) & cc->mask;
+	return mlx5_read_internal_timer(mdev) & cc->mask;
 }
 
-static void mlx5e_pps_out(struct work_struct *work)
+static void mlx5_pps_out(struct work_struct *work)
 {
-	struct mlx5e_pps *pps_info = container_of(work, struct mlx5e_pps,
-						  out_work);
-	struct mlx5e_tstamp *tstamp = container_of(pps_info, struct mlx5e_tstamp,
-						   pps_info);
+	struct mlx5_pps *pps_info = container_of(work, struct mlx5_pps,
+						 out_work);
+	struct mlx5_clock *clock = container_of(pps_info, struct mlx5_clock,
+						pps_info);
+	struct mlx5_core_dev *mdev = container_of(clock, struct mlx5_core_dev,
+						  clock);
 	u32 in[MLX5_ST_SZ_DW(mtpps_reg)] = {0};
 	unsigned long flags;
 	int i;
 
-	for (i = 0; i < tstamp->ptp_info.n_pins; i++) {
+	for (i = 0; i < clock->ptp_info.n_pins; i++) {
 		u64 tstart;
 
-		write_lock_irqsave(&tstamp->lock, flags);
-		tstart = tstamp->pps_info.start[i];
-		tstamp->pps_info.start[i] = 0;
-		write_unlock_irqrestore(&tstamp->lock, flags);
+		write_lock_irqsave(&clock->lock, flags);
+		tstart = clock->pps_info.start[i];
+		clock->pps_info.start[i] = 0;
+		write_unlock_irqrestore(&clock->lock, flags);
 		if (!tstart)
 			continue;
 
 		MLX5_SET(mtpps_reg, in, pin, i);
 		MLX5_SET64(mtpps_reg, in, time_stamp, tstart);
-		MLX5_SET(mtpps_reg, in, field_select, MLX5E_MTPPS_FS_TIME_STAMP);
-		mlx5_set_mtpps(tstamp->mdev, in, sizeof(in));
+		MLX5_SET(mtpps_reg, in, field_select, MLX5_MTPPS_FS_TIME_STAMP);
+		mlx5_set_mtpps(mdev, in, sizeof(in));
 	}
 }
 
-static void mlx5e_timestamp_overflow(struct work_struct *work)
+static void mlx5_timestamp_overflow(struct work_struct *work)
 {
 	struct delayed_work *dwork = to_delayed_work(work);
-	struct mlx5e_tstamp *tstamp = container_of(dwork, struct mlx5e_tstamp,
-						   overflow_work);
-	struct mlx5e_priv *priv = container_of(tstamp, struct mlx5e_priv, tstamp);
+	struct mlx5_clock *clock = container_of(dwork, struct mlx5_clock,
+						overflow_work);
 	unsigned long flags;
 
-	write_lock_irqsave(&tstamp->lock, flags);
-	timecounter_read(&tstamp->clock);
-	write_unlock_irqrestore(&tstamp->lock, flags);
-	queue_delayed_work(priv->wq, &tstamp->overflow_work,
-			   msecs_to_jiffies(tstamp->overflow_period * 1000));
-}
-
-int mlx5e_hwstamp_set(struct mlx5e_priv *priv, struct ifreq *ifr)
-{
-	struct hwtstamp_config config;
-	int err;
-
-	if (!MLX5_CAP_GEN(priv->mdev, device_frequency_khz))
-		return -EOPNOTSUPP;
-
-	if (copy_from_user(&config, ifr->ifr_data, sizeof(config)))
-		return -EFAULT;
-
-	/* TX HW timestamp */
-	switch (config.tx_type) {
-	case HWTSTAMP_TX_OFF:
-	case HWTSTAMP_TX_ON:
-		break;
-	default:
-		return -ERANGE;
-	}
-
-	mutex_lock(&priv->state_lock);
-	/* RX HW timestamp */
-	switch (config.rx_filter) {
-	case HWTSTAMP_FILTER_NONE:
-		/* Reset CQE compression to Admin default */
-		mlx5e_modify_rx_cqe_compression_locked(priv, priv->channels.params.rx_cqe_compress_def);
-		break;
-	case HWTSTAMP_FILTER_ALL:
-	case HWTSTAMP_FILTER_SOME:
-	case HWTSTAMP_FILTER_PTP_V1_L4_EVENT:
-	case HWTSTAMP_FILTER_PTP_V1_L4_SYNC:
-	case HWTSTAMP_FILTER_PTP_V1_L4_DELAY_REQ:
-	case HWTSTAMP_FILTER_PTP_V2_L4_EVENT:
-	case HWTSTAMP_FILTER_PTP_V2_L4_SYNC:
-	case HWTSTAMP_FILTER_PTP_V2_L4_DELAY_REQ:
-	case HWTSTAMP_FILTER_PTP_V2_L2_EVENT:
-	case HWTSTAMP_FILTER_PTP_V2_L2_SYNC:
-	case HWTSTAMP_FILTER_PTP_V2_L2_DELAY_REQ:
-	case HWTSTAMP_FILTER_PTP_V2_EVENT:
-	case HWTSTAMP_FILTER_PTP_V2_SYNC:
-	case HWTSTAMP_FILTER_PTP_V2_DELAY_REQ:
-	case HWTSTAMP_FILTER_NTP_ALL:
-		/* Disable CQE compression */
-		netdev_warn(priv->netdev, "Disabling cqe compression");
-		err = mlx5e_modify_rx_cqe_compression_locked(priv, false);
-		if (err) {
-			netdev_err(priv->netdev, "Failed disabling cqe compression err=%d\n", err);
-			mutex_unlock(&priv->state_lock);
-			return err;
-		}
-		config.rx_filter = HWTSTAMP_FILTER_ALL;
-		break;
-	default:
-		mutex_unlock(&priv->state_lock);
-		return -ERANGE;
-	}
-
-	memcpy(&priv->tstamp.hwtstamp_config, &config, sizeof(config));
-	mutex_unlock(&priv->state_lock);
-
-	return copy_to_user(ifr->ifr_data, &config,
-			    sizeof(config)) ? -EFAULT : 0;
-}
-
-int mlx5e_hwstamp_get(struct mlx5e_priv *priv, struct ifreq *ifr)
-{
-	struct hwtstamp_config *cfg = &priv->tstamp.hwtstamp_config;
-
-	if (!MLX5_CAP_GEN(priv->mdev, device_frequency_khz))
-		return -EOPNOTSUPP;
-
-	return copy_to_user(ifr->ifr_data, cfg, sizeof(*cfg)) ? -EFAULT : 0;
+	write_lock_irqsave(&clock->lock, flags);
+	timecounter_read(&clock->tc);
+	write_unlock_irqrestore(&clock->lock, flags);
+	schedule_delayed_work(&clock->overflow_work, clock->overflow_period);
 }
 
-static int mlx5e_ptp_settime(struct ptp_clock_info *ptp,
-			     const struct timespec64 *ts)
+static int mlx5_ptp_settime(struct ptp_clock_info *ptp,
+			    const struct timespec64 *ts)
 {
-	struct mlx5e_tstamp *tstamp = container_of(ptp, struct mlx5e_tstamp,
-						   ptp_info);
+	struct mlx5_clock *clock = container_of(ptp, struct mlx5_clock,
+						 ptp_info);
 	u64 ns = timespec64_to_ns(ts);
 	unsigned long flags;
 
-	write_lock_irqsave(&tstamp->lock, flags);
-	timecounter_init(&tstamp->clock, &tstamp->cycles, ns);
-	write_unlock_irqrestore(&tstamp->lock, flags);
+	write_lock_irqsave(&clock->lock, flags);
+	timecounter_init(&clock->tc, &clock->cycles, ns);
+	write_unlock_irqrestore(&clock->lock, flags);
 
 	return 0;
 }
 
-static int mlx5e_ptp_gettime(struct ptp_clock_info *ptp,
-			     struct timespec64 *ts)
+static int mlx5_ptp_gettime(struct ptp_clock_info *ptp, struct timespec64 *ts)
 {
-	struct mlx5e_tstamp *tstamp = container_of(ptp, struct mlx5e_tstamp,
-						   ptp_info);
+	struct mlx5_clock *clock = container_of(ptp, struct mlx5_clock,
+						ptp_info);
 	u64 ns;
 	unsigned long flags;
 
-	write_lock_irqsave(&tstamp->lock, flags);
-	ns = timecounter_read(&tstamp->clock);
-	write_unlock_irqrestore(&tstamp->lock, flags);
+	write_lock_irqsave(&clock->lock, flags);
+	ns = timecounter_read(&clock->tc);
+	write_unlock_irqrestore(&clock->lock, flags);
 
 	*ts = ns_to_timespec64(ns);
 
 	return 0;
 }
 
-static int mlx5e_ptp_adjtime(struct ptp_clock_info *ptp, s64 delta)
+static int mlx5_ptp_adjtime(struct ptp_clock_info *ptp, s64 delta)
 {
-	struct mlx5e_tstamp *tstamp = container_of(ptp, struct mlx5e_tstamp,
-						   ptp_info);
+	struct mlx5_clock *clock = container_of(ptp, struct mlx5_clock,
+						ptp_info);
 	unsigned long flags;
 
-	write_lock_irqsave(&tstamp->lock, flags);
-	timecounter_adjtime(&tstamp->clock, delta);
-	write_unlock_irqrestore(&tstamp->lock, flags);
+	write_lock_irqsave(&clock->lock, flags);
+	timecounter_adjtime(&clock->tc, delta);
+	write_unlock_irqrestore(&clock->lock, flags);
 
 	return 0;
 }
 
-static int mlx5e_ptp_adjfreq(struct ptp_clock_info *ptp, s32 delta)
+static int mlx5_ptp_adjfreq(struct ptp_clock_info *ptp, s32 delta)
 {
 	u64 adj;
 	u32 diff;
 	unsigned long flags;
 	int neg_adj = 0;
-	struct mlx5e_tstamp *tstamp = container_of(ptp, struct mlx5e_tstamp,
-						  ptp_info);
+	struct mlx5_clock *clock = container_of(ptp, struct mlx5_clock,
+						ptp_info);
 
 	if (delta < 0) {
 		neg_adj = 1;
 		delta = -delta;
 	}
 
-	adj = tstamp->nominal_c_mult;
+	adj = clock->nominal_c_mult;
 	adj *= delta;
 	diff = div_u64(adj, 1000000000ULL);
 
-	write_lock_irqsave(&tstamp->lock, flags);
-	timecounter_read(&tstamp->clock);
-	tstamp->cycles.mult = neg_adj ? tstamp->nominal_c_mult - diff :
-					tstamp->nominal_c_mult + diff;
-	write_unlock_irqrestore(&tstamp->lock, flags);
+	write_lock_irqsave(&clock->lock, flags);
+	timecounter_read(&clock->tc);
+	clock->cycles.mult = neg_adj ? clock->nominal_c_mult - diff :
+				       clock->nominal_c_mult + diff;
+	write_unlock_irqrestore(&clock->lock, flags);
 
 	return 0;
 }
 
-static int mlx5e_extts_configure(struct ptp_clock_info *ptp,
-				 struct ptp_clock_request *rq,
-				 int on)
+static int mlx5_extts_configure(struct ptp_clock_info *ptp,
+				struct ptp_clock_request *rq,
+				int on)
 {
-	struct mlx5e_tstamp *tstamp =
-		container_of(ptp, struct mlx5e_tstamp, ptp_info);
-	struct mlx5e_priv *priv =
-		container_of(tstamp, struct mlx5e_priv, tstamp);
+	struct mlx5_clock *clock =
+			container_of(ptp, struct mlx5_clock, ptp_info);
+	struct mlx5_core_dev *mdev =
+			container_of(clock, struct mlx5_core_dev, clock);
 	u32 in[MLX5_ST_SZ_DW(mtpps_reg)] = {0};
 	u32 field_select = 0;
 	u8 pin_mode = 0;
@@ -285,24 +199,24 @@ static int mlx5e_extts_configure(struct ptp_clock_info *ptp,
 	int pin = -1;
 	int err = 0;
 
-	if (!MLX5_PPS_CAP(priv->mdev))
+	if (!MLX5_PPS_CAP(mdev))
 		return -EOPNOTSUPP;
 
-	if (rq->extts.index >= tstamp->ptp_info.n_pins)
+	if (rq->extts.index >= clock->ptp_info.n_pins)
 		return -EINVAL;
 
 	if (on) {
-		pin = ptp_find_pin(tstamp->ptp, PTP_PF_EXTTS, rq->extts.index);
+		pin = ptp_find_pin(clock->ptp, PTP_PF_EXTTS, rq->extts.index);
 		if (pin < 0)
 			return -EBUSY;
-		pin_mode = MLX5E_PIN_MODE_IN;
+		pin_mode = MLX5_PIN_MODE_IN;
 		pattern = !!(rq->extts.flags & PTP_FALLING_EDGE);
-		field_select = MLX5E_MTPPS_FS_PIN_MODE |
-			       MLX5E_MTPPS_FS_PATTERN |
-			       MLX5E_MTPPS_FS_ENABLE;
+		field_select = MLX5_MTPPS_FS_PIN_MODE |
+			       MLX5_MTPPS_FS_PATTERN |
+			       MLX5_MTPPS_FS_ENABLE;
 	} else {
 		pin = rq->extts.index;
-		field_select = MLX5E_MTPPS_FS_ENABLE;
+		field_select = MLX5_MTPPS_FS_ENABLE;
 	}
 
 	MLX5_SET(mtpps_reg, in, pin, pin);
@@ -311,22 +225,22 @@ static int mlx5e_extts_configure(struct ptp_clock_info *ptp,
 	MLX5_SET(mtpps_reg, in, enable, on);
 	MLX5_SET(mtpps_reg, in, field_select, field_select);
 
-	err = mlx5_set_mtpps(priv->mdev, in, sizeof(in));
+	err = mlx5_set_mtpps(mdev, in, sizeof(in));
 	if (err)
 		return err;
 
-	return mlx5_set_mtppse(priv->mdev, pin, 0,
-			       MLX5E_EVENT_MODE_REPETETIVE & on);
+	return mlx5_set_mtppse(mdev, pin, 0,
+			       MLX5_EVENT_MODE_REPETETIVE & on);
 }
 
-static int mlx5e_perout_configure(struct ptp_clock_info *ptp,
-				  struct ptp_clock_request *rq,
-				  int on)
+static int mlx5_perout_configure(struct ptp_clock_info *ptp,
+				 struct ptp_clock_request *rq,
+				 int on)
 {
-	struct mlx5e_tstamp *tstamp =
-		container_of(ptp, struct mlx5e_tstamp, ptp_info);
-	struct mlx5e_priv *priv =
-		container_of(tstamp, struct mlx5e_priv, tstamp);
+	struct mlx5_clock *clock =
+			container_of(ptp, struct mlx5_clock, ptp_info);
+	struct mlx5_core_dev *mdev =
+			container_of(clock, struct mlx5_core_dev, clock);
 	u32 in[MLX5_ST_SZ_DW(mtpps_reg)] = {0};
 	u64 nsec_now, nsec_delta, time_stamp = 0;
 	u64 cycles_now, cycles_delta;
@@ -339,20 +253,20 @@ static int mlx5e_perout_configure(struct ptp_clock_info *ptp,
 	int err = 0;
 	s64 ns;
 
-	if (!MLX5_PPS_CAP(priv->mdev))
+	if (!MLX5_PPS_CAP(mdev))
 		return -EOPNOTSUPP;
 
-	if (rq->perout.index >= tstamp->ptp_info.n_pins)
+	if (rq->perout.index >= clock->ptp_info.n_pins)
 		return -EINVAL;
 
 	if (on) {
-		pin = ptp_find_pin(tstamp->ptp, PTP_PF_PEROUT,
+		pin = ptp_find_pin(clock->ptp, PTP_PF_PEROUT,
 				   rq->perout.index);
 		if (pin < 0)
 			return -EBUSY;
 
-		pin_mode = MLX5E_PIN_MODE_OUT;
-		pattern = MLX5E_OUT_PATTERN_PERIODIC;
+		pin_mode = MLX5_PIN_MODE_OUT;
+		pattern = MLX5_OUT_PATTERN_PERIODIC;
 		ts.tv_sec = rq->perout.period.sec;
 		ts.tv_nsec = rq->perout.period.nsec;
 		ns = timespec64_to_ns(&ts);
@@ -363,21 +277,21 @@ static int mlx5e_perout_configure(struct ptp_clock_info *ptp,
 		ts.tv_sec = rq->perout.start.sec;
 		ts.tv_nsec = rq->perout.start.nsec;
 		ns = timespec64_to_ns(&ts);
-		cycles_now = mlx5_read_internal_timer(tstamp->mdev);
-		write_lock_irqsave(&tstamp->lock, flags);
-		nsec_now = timecounter_cyc2time(&tstamp->clock, cycles_now);
+		cycles_now = mlx5_read_internal_timer(mdev);
+		write_lock_irqsave(&clock->lock, flags);
+		nsec_now = timecounter_cyc2time(&clock->tc, cycles_now);
 		nsec_delta = ns - nsec_now;
-		cycles_delta = div64_u64(nsec_delta << tstamp->cycles.shift,
-					 tstamp->cycles.mult);
-		write_unlock_irqrestore(&tstamp->lock, flags);
+		cycles_delta = div64_u64(nsec_delta << clock->cycles.shift,
+					 clock->cycles.mult);
+		write_unlock_irqrestore(&clock->lock, flags);
 		time_stamp = cycles_now + cycles_delta;
-		field_select = MLX5E_MTPPS_FS_PIN_MODE |
-			       MLX5E_MTPPS_FS_PATTERN |
-			       MLX5E_MTPPS_FS_ENABLE |
-			       MLX5E_MTPPS_FS_TIME_STAMP;
+		field_select = MLX5_MTPPS_FS_PIN_MODE |
+			       MLX5_MTPPS_FS_PATTERN |
+			       MLX5_MTPPS_FS_ENABLE |
+			       MLX5_MTPPS_FS_TIME_STAMP;
 	} else {
 		pin = rq->perout.index;
-		field_select = MLX5E_MTPPS_FS_ENABLE;
+		field_select = MLX5_MTPPS_FS_ENABLE;
 	}
 
 	MLX5_SET(mtpps_reg, in, pin, pin);
@@ -387,233 +301,225 @@ static int mlx5e_perout_configure(struct ptp_clock_info *ptp,
 	MLX5_SET64(mtpps_reg, in, time_stamp, time_stamp);
 	MLX5_SET(mtpps_reg, in, field_select, field_select);
 
-	err = mlx5_set_mtpps(priv->mdev, in, sizeof(in));
+	err = mlx5_set_mtpps(mdev, in, sizeof(in));
 	if (err)
 		return err;
 
-	return mlx5_set_mtppse(priv->mdev, pin, 0,
-			       MLX5E_EVENT_MODE_REPETETIVE & on);
+	return mlx5_set_mtppse(mdev, pin, 0,
+			       MLX5_EVENT_MODE_REPETETIVE & on);
 }
 
-static int mlx5e_pps_configure(struct ptp_clock_info *ptp,
-			       struct ptp_clock_request *rq,
-			       int on)
+static int mlx5_pps_configure(struct ptp_clock_info *ptp,
+			      struct ptp_clock_request *rq,
+			      int on)
 {
-	struct mlx5e_tstamp *tstamp =
-		container_of(ptp, struct mlx5e_tstamp, ptp_info);
+	struct mlx5_clock *clock =
+			container_of(ptp, struct mlx5_clock, ptp_info);
 
-	tstamp->pps_info.enabled = !!on;
+	clock->pps_info.enabled = !!on;
 	return 0;
 }
 
-static int mlx5e_ptp_enable(struct ptp_clock_info *ptp,
-			    struct ptp_clock_request *rq,
-			    int on)
+static int mlx5_ptp_enable(struct ptp_clock_info *ptp,
+			   struct ptp_clock_request *rq,
+			   int on)
 {
 	switch (rq->type) {
 	case PTP_CLK_REQ_EXTTS:
-		return mlx5e_extts_configure(ptp, rq, on);
+		return mlx5_extts_configure(ptp, rq, on);
 	case PTP_CLK_REQ_PEROUT:
-		return mlx5e_perout_configure(ptp, rq, on);
+		return mlx5_perout_configure(ptp, rq, on);
 	case PTP_CLK_REQ_PPS:
-		return mlx5e_pps_configure(ptp, rq, on);
+		return mlx5_pps_configure(ptp, rq, on);
 	default:
 		return -EOPNOTSUPP;
 	}
 	return 0;
 }
 
-static int mlx5e_ptp_verify(struct ptp_clock_info *ptp, unsigned int pin,
-			    enum ptp_pin_function func, unsigned int chan)
+static int mlx5_ptp_verify(struct ptp_clock_info *ptp, unsigned int pin,
+			   enum ptp_pin_function func, unsigned int chan)
 {
 	return (func == PTP_PF_PHYSYNC) ? -EOPNOTSUPP : 0;
 }
 
-static const struct ptp_clock_info mlx5e_ptp_clock_info = {
+static const struct ptp_clock_info mlx5_ptp_clock_info = {
 	.owner		= THIS_MODULE,
+	.name		= "mlx5_p2p",
 	.max_adj	= 100000000,
 	.n_alarm	= 0,
 	.n_ext_ts	= 0,
 	.n_per_out	= 0,
 	.n_pins		= 0,
 	.pps		= 0,
-	.adjfreq	= mlx5e_ptp_adjfreq,
-	.adjtime	= mlx5e_ptp_adjtime,
-	.gettime64	= mlx5e_ptp_gettime,
-	.settime64	= mlx5e_ptp_settime,
+	.adjfreq	= mlx5_ptp_adjfreq,
+	.adjtime	= mlx5_ptp_adjtime,
+	.gettime64	= mlx5_ptp_gettime,
+	.settime64	= mlx5_ptp_settime,
 	.enable		= NULL,
 	.verify		= NULL,
 };
 
-static void mlx5e_timestamp_init_config(struct mlx5e_tstamp *tstamp)
-{
-	tstamp->hwtstamp_config.tx_type = HWTSTAMP_TX_OFF;
-	tstamp->hwtstamp_config.rx_filter = HWTSTAMP_FILTER_NONE;
-}
-
-static int mlx5e_init_pin_config(struct mlx5e_tstamp *tstamp)
+static int mlx5_init_pin_config(struct mlx5_clock *clock)
 {
 	int i;
 
-	tstamp->ptp_info.pin_config =
-		kzalloc(sizeof(*tstamp->ptp_info.pin_config) *
-			       tstamp->ptp_info.n_pins, GFP_KERNEL);
-	if (!tstamp->ptp_info.pin_config)
+	clock->ptp_info.pin_config =
+			kzalloc(sizeof(*clock->ptp_info.pin_config) *
+				clock->ptp_info.n_pins, GFP_KERNEL);
+	if (!clock->ptp_info.pin_config)
 		return -ENOMEM;
-	tstamp->ptp_info.enable = mlx5e_ptp_enable;
-	tstamp->ptp_info.verify = mlx5e_ptp_verify;
-	tstamp->ptp_info.pps = 1;
+	clock->ptp_info.enable = mlx5_ptp_enable;
+	clock->ptp_info.verify = mlx5_ptp_verify;
+	clock->ptp_info.pps = 1;
 
-	for (i = 0; i < tstamp->ptp_info.n_pins; i++) {
-		snprintf(tstamp->ptp_info.pin_config[i].name,
-			 sizeof(tstamp->ptp_info.pin_config[i].name),
+	for (i = 0; i < clock->ptp_info.n_pins; i++) {
+		snprintf(clock->ptp_info.pin_config[i].name,
+			 sizeof(clock->ptp_info.pin_config[i].name),
 			 "mlx5_pps%d", i);
-		tstamp->ptp_info.pin_config[i].index = i;
-		tstamp->ptp_info.pin_config[i].func = PTP_PF_NONE;
-		tstamp->ptp_info.pin_config[i].chan = i;
+		clock->ptp_info.pin_config[i].index = i;
+		clock->ptp_info.pin_config[i].func = PTP_PF_NONE;
+		clock->ptp_info.pin_config[i].chan = i;
 	}
 
 	return 0;
 }
 
-static void mlx5e_get_pps_caps(struct mlx5e_priv *priv,
-			       struct mlx5e_tstamp *tstamp)
+static void mlx5_get_pps_caps(struct mlx5_core_dev *mdev)
 {
+	struct mlx5_clock *clock = &mdev->clock;
 	u32 out[MLX5_ST_SZ_DW(mtpps_reg)] = {0};
 
-	mlx5_query_mtpps(priv->mdev, out, sizeof(out));
-
-	tstamp->ptp_info.n_pins = MLX5_GET(mtpps_reg, out,
-					   cap_number_of_pps_pins);
-	tstamp->ptp_info.n_ext_ts = MLX5_GET(mtpps_reg, out,
-					     cap_max_num_of_pps_in_pins);
-	tstamp->ptp_info.n_per_out = MLX5_GET(mtpps_reg, out,
-					      cap_max_num_of_pps_out_pins);
-
-	tstamp->pps_info.pin_caps[0] = MLX5_GET(mtpps_reg, out, cap_pin_0_mode);
-	tstamp->pps_info.pin_caps[1] = MLX5_GET(mtpps_reg, out, cap_pin_1_mode);
-	tstamp->pps_info.pin_caps[2] = MLX5_GET(mtpps_reg, out, cap_pin_2_mode);
-	tstamp->pps_info.pin_caps[3] = MLX5_GET(mtpps_reg, out, cap_pin_3_mode);
-	tstamp->pps_info.pin_caps[4] = MLX5_GET(mtpps_reg, out, cap_pin_4_mode);
-	tstamp->pps_info.pin_caps[5] = MLX5_GET(mtpps_reg, out, cap_pin_5_mode);
-	tstamp->pps_info.pin_caps[6] = MLX5_GET(mtpps_reg, out, cap_pin_6_mode);
-	tstamp->pps_info.pin_caps[7] = MLX5_GET(mtpps_reg, out, cap_pin_7_mode);
+	mlx5_query_mtpps(mdev, out, sizeof(out));
+
+	clock->ptp_info.n_pins = MLX5_GET(mtpps_reg, out,
+					  cap_number_of_pps_pins);
+	clock->ptp_info.n_ext_ts = MLX5_GET(mtpps_reg, out,
+					    cap_max_num_of_pps_in_pins);
+	clock->ptp_info.n_per_out = MLX5_GET(mtpps_reg, out,
+					     cap_max_num_of_pps_out_pins);
+
+	clock->pps_info.pin_caps[0] = MLX5_GET(mtpps_reg, out, cap_pin_0_mode);
+	clock->pps_info.pin_caps[1] = MLX5_GET(mtpps_reg, out, cap_pin_1_mode);
+	clock->pps_info.pin_caps[2] = MLX5_GET(mtpps_reg, out, cap_pin_2_mode);
+	clock->pps_info.pin_caps[3] = MLX5_GET(mtpps_reg, out, cap_pin_3_mode);
+	clock->pps_info.pin_caps[4] = MLX5_GET(mtpps_reg, out, cap_pin_4_mode);
+	clock->pps_info.pin_caps[5] = MLX5_GET(mtpps_reg, out, cap_pin_5_mode);
+	clock->pps_info.pin_caps[6] = MLX5_GET(mtpps_reg, out, cap_pin_6_mode);
+	clock->pps_info.pin_caps[7] = MLX5_GET(mtpps_reg, out, cap_pin_7_mode);
 }
 
-void mlx5e_pps_event_handler(struct mlx5e_priv *priv,
-			     struct ptp_clock_event *event)
+void mlx5_pps_event(struct mlx5_core_dev *mdev,
+		    struct mlx5_eqe *eqe)
 {
-	struct net_device *netdev = priv->netdev;
-	struct mlx5e_tstamp *tstamp = &priv->tstamp;
+	struct mlx5_clock *clock = &mdev->clock;
+	struct ptp_clock_event ptp_event;
 	struct timespec64 ts;
 	u64 nsec_now, nsec_delta;
 	u64 cycles_now, cycles_delta;
-	int pin = event->index;
+	int pin = eqe->data.pps.pin;
 	s64 ns;
 	unsigned long flags;
 
-	switch (tstamp->ptp_info.pin_config[pin].func) {
+	switch (clock->ptp_info.pin_config[pin].func) {
 	case PTP_PF_EXTTS:
-		if (tstamp->pps_info.enabled) {
-			event->type = PTP_CLOCK_PPSUSR;
-			event->pps_times.ts_real = ns_to_timespec64(event->timestamp);
+		if (clock->pps_info.enabled) {
+			ptp_event.type = PTP_CLOCK_PPSUSR;
+			ptp_event.pps_times.ts_real = ns_to_timespec64(eqe->data.pps.time_stamp);
 		} else {
-			event->type = PTP_CLOCK_EXTTS;
+			ptp_event.type = PTP_CLOCK_EXTTS;
 		}
-		ptp_clock_event(tstamp->ptp, event);
+		ptp_clock_event(clock->ptp, &ptp_event);
 		break;
 	case PTP_PF_PEROUT:
-		mlx5e_ptp_gettime(&tstamp->ptp_info, &ts);
-		cycles_now = mlx5_read_internal_timer(tstamp->mdev);
+		mlx5_ptp_gettime(&clock->ptp_info, &ts);
+		cycles_now = mlx5_read_internal_timer(mdev);
 		ts.tv_sec += 1;
 		ts.tv_nsec = 0;
 		ns = timespec64_to_ns(&ts);
-		write_lock_irqsave(&tstamp->lock, flags);
-		nsec_now = timecounter_cyc2time(&tstamp->clock, cycles_now);
+		write_lock_irqsave(&clock->lock, flags);
+		nsec_now = timecounter_cyc2time(&clock->tc, cycles_now);
 		nsec_delta = ns - nsec_now;
-		cycles_delta = div64_u64(nsec_delta << tstamp->cycles.shift,
-					 tstamp->cycles.mult);
-		tstamp->pps_info.start[pin] = cycles_now + cycles_delta;
-		queue_work(priv->wq, &tstamp->pps_info.out_work);
-		write_unlock_irqrestore(&tstamp->lock, flags);
+		cycles_delta = div64_u64(nsec_delta << clock->cycles.shift,
+					 clock->cycles.mult);
+		clock->pps_info.start[pin] = cycles_now + cycles_delta;
+		schedule_work(&clock->pps_info.out_work);
+		write_unlock_irqrestore(&clock->lock, flags);
 		break;
 	default:
-		netdev_err(netdev, "%s: Unhandled event\n", __func__);
+		mlx5_core_err(mdev, " Unhandled event\n");
 	}
 }
 
-void mlx5e_timestamp_init(struct mlx5e_priv *priv)
+void mlx5_init_clock(struct mlx5_core_dev *mdev)
 {
-	struct mlx5e_tstamp *tstamp = &priv->tstamp;
+	struct mlx5_clock *clock = &mdev->clock;
 	u64 ns;
 	u64 frac = 0;
 	u32 dev_freq;
 
-	mlx5e_timestamp_init_config(tstamp);
-	dev_freq = MLX5_CAP_GEN(priv->mdev, device_frequency_khz);
+	dev_freq = MLX5_CAP_GEN(mdev, device_frequency_khz);
 	if (!dev_freq) {
-		mlx5_core_warn(priv->mdev, "invalid device_frequency_khz, aborting HW clock init\n");
+		mlx5_core_warn(mdev, "invalid device_frequency_khz, aborting HW clock init\n");
 		return;
 	}
-	rwlock_init(&tstamp->lock);
-	tstamp->cycles.read = mlx5e_read_internal_timer;
-	tstamp->cycles.shift = MLX5E_CYCLES_SHIFT;
-	tstamp->cycles.mult = clocksource_khz2mult(dev_freq,
-						   tstamp->cycles.shift);
-	tstamp->nominal_c_mult = tstamp->cycles.mult;
-	tstamp->cycles.mask = CLOCKSOURCE_MASK(41);
-	tstamp->mdev = priv->mdev;
-
-	timecounter_init(&tstamp->clock, &tstamp->cycles,
+	rwlock_init(&clock->lock);
+	clock->cycles.read = read_internal_timer;
+	clock->cycles.shift = MLX5_CYCLES_SHIFT;
+	clock->cycles.mult = clocksource_khz2mult(dev_freq,
+						  clock->cycles.shift);
+	clock->nominal_c_mult = clock->cycles.mult;
+	clock->cycles.mask = CLOCKSOURCE_MASK(41);
+
+	timecounter_init(&clock->tc, &clock->cycles,
 			 ktime_to_ns(ktime_get_real()));
 
 	/* Calculate period in seconds to call the overflow watchdog - to make
 	 * sure counter is checked at least once every wrap around.
 	 */
-	ns = cyclecounter_cyc2ns(&tstamp->cycles, tstamp->cycles.mask,
+	ns = cyclecounter_cyc2ns(&clock->cycles, clock->cycles.mask,
 				 frac, &frac);
 	do_div(ns, NSEC_PER_SEC / 2 / HZ);
-	tstamp->overflow_period = ns;
+	clock->overflow_period = ns;
 
-	INIT_WORK(&tstamp->pps_info.out_work, mlx5e_pps_out);
-	INIT_DELAYED_WORK(&tstamp->overflow_work, mlx5e_timestamp_overflow);
-	if (tstamp->overflow_period)
-		queue_delayed_work(priv->wq, &tstamp->overflow_work, 0);
+	INIT_WORK(&clock->pps_info.out_work, mlx5_pps_out);
+	INIT_DELAYED_WORK(&clock->overflow_work, mlx5_timestamp_overflow);
+	if (clock->overflow_period)
+		schedule_delayed_work(&clock->overflow_work, 0);
 	else
-		mlx5_core_warn(priv->mdev, "invalid overflow period, overflow_work is not scheduled\n");
+		mlx5_core_warn(mdev, "invalid overflow period, overflow_work is not scheduled\n");
 
 	/* Configure the PHC */
-	tstamp->ptp_info = mlx5e_ptp_clock_info;
-	snprintf(tstamp->ptp_info.name, 16, "mlx5 ptp");
+	clock->ptp_info = mlx5_ptp_clock_info;
 
 	/* Initialize 1PPS data structures */
-	if (MLX5_PPS_CAP(priv->mdev))
-		mlx5e_get_pps_caps(priv, tstamp);
-	if (tstamp->ptp_info.n_pins)
-		mlx5e_init_pin_config(tstamp);
-
-	tstamp->ptp = ptp_clock_register(&tstamp->ptp_info,
-					 &priv->mdev->pdev->dev);
-	if (IS_ERR(tstamp->ptp)) {
-		mlx5_core_warn(priv->mdev, "ptp_clock_register failed %ld\n",
-			       PTR_ERR(tstamp->ptp));
-		tstamp->ptp = NULL;
+	if (MLX5_PPS_CAP(mdev))
+		mlx5_get_pps_caps(mdev);
+	if (clock->ptp_info.n_pins)
+		mlx5_init_pin_config(clock);
+
+	clock->ptp = ptp_clock_register(&clock->ptp_info,
+					&mdev->pdev->dev);
+	if (IS_ERR(clock->ptp)) {
+		mlx5_core_warn(mdev, "ptp_clock_register failed %ld\n",
+			       PTR_ERR(clock->ptp));
+		clock->ptp = NULL;
 	}
 }
 
-void mlx5e_timestamp_cleanup(struct mlx5e_priv *priv)
+void mlx5_cleanup_clock(struct mlx5_core_dev *mdev)
 {
-	struct mlx5e_tstamp *tstamp = &priv->tstamp;
+	struct mlx5_clock *clock = &mdev->clock;
 
-	if (!MLX5_CAP_GEN(priv->mdev, device_frequency_khz))
+	if (!MLX5_CAP_GEN(mdev, device_frequency_khz))
 		return;
 
-	if (priv->tstamp.ptp) {
-		ptp_clock_unregister(priv->tstamp.ptp);
-		priv->tstamp.ptp = NULL;
+	if (clock->ptp) {
+		ptp_clock_unregister(clock->ptp);
+		clock->ptp = NULL;
 	}
 
-	cancel_work_sync(&tstamp->pps_info.out_work);
-	cancel_delayed_work_sync(&tstamp->overflow_work);
-	kfree(tstamp->ptp_info.pin_config);
+	cancel_work_sync(&clock->pps_info.out_work);
+	cancel_delayed_work_sync(&clock->overflow_work);
+	kfree(clock->ptp_info.pin_config);
 }
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lib/clock.h b/drivers/net/ethernet/mellanox/mlx5/core/lib/clock.h
new file mode 100644
index 000000000000..a8eecedd46c2
--- /dev/null
+++ b/drivers/net/ethernet/mellanox/mlx5/core/lib/clock.h
@@ -0,0 +1,51 @@
+/*
+ * Copyright (c) 2017, Mellanox Technologies, Ltd.  All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses.  You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ *     Redistribution and use in source and binary forms, with or
+ *     without modification, are permitted provided that the following
+ *     conditions are met:
+ *
+ *      - Redistributions of source code must retain the above
+ *        copyright notice, this list of conditions and the following
+ *        disclaimer.
+ *
+ *      - Redistributions in binary form must reproduce the above
+ *        copyright notice, this list of conditions and the following
+ *        disclaimer in the documentation and/or other materials
+ *        provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#ifndef __LIB_CLOCK_H__
+#define __LIB_CLOCK_H__
+
+void mlx5_init_clock(struct mlx5_core_dev *mdev);
+void mlx5_cleanup_clock(struct mlx5_core_dev *mdev);
+
+static inline ktime_t mlx5_timecounter_cyc2time(struct mlx5_clock *clock,
+						u64 timestamp)
+{
+	u64 nsec;
+
+	read_lock(&clock->lock);
+	nsec = timecounter_cyc2time(&clock->tc, timestamp);
+	read_unlock(&clock->lock);
+
+	return ns_to_ktime(nsec);
+}
+
+#endif
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/main.c b/drivers/net/ethernet/mellanox/mlx5/core/main.c
index 0d2c8dcd6eae..ecbe9fad22d8 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/main.c
@@ -59,6 +59,7 @@
 #include "lib/mlx5.h"
 #include "fpga/core.h"
 #include "accel/ipsec.h"
+#include "lib/clock.h"
 
 MODULE_AUTHOR("Eli Cohen <eli@mellanox.com>");
 MODULE_DESCRIPTION("Mellanox Connect-IB, ConnectX-4 core driver");
@@ -889,6 +890,8 @@ static int mlx5_init_once(struct mlx5_core_dev *dev, struct mlx5_priv *priv)
 
 	mlx5_init_reserved_gids(dev);
 
+	mlx5_init_clock(dev);
+
 	err = mlx5_init_rl_table(dev);
 	if (err) {
 		dev_err(&pdev->dev, "Failed to init rate limiting\n");
@@ -949,6 +952,7 @@ static void mlx5_cleanup_once(struct mlx5_core_dev *dev)
 	mlx5_eswitch_cleanup(dev->priv.eswitch);
 	mlx5_mpfs_cleanup(dev);
 	mlx5_cleanup_rl_table(dev);
+	mlx5_cleanup_clock(dev);
 	mlx5_cleanup_reserved_gids(dev);
 	mlx5_cleanup_mkey_table(dev);
 	mlx5_cleanup_srq_table(dev);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h b/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h
index b7c2900b75f9..8f00de2fe283 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h
@@ -93,6 +93,7 @@ void mlx5_core_event(struct mlx5_core_dev *dev, enum mlx5_dev_event event,
 		     unsigned long param);
 void mlx5_core_page_fault(struct mlx5_core_dev *dev,
 			  struct mlx5_pagefault *pfault);
+void mlx5_pps_event(struct mlx5_core_dev *dev, struct mlx5_eqe *eqe);
 void mlx5_port_module_event(struct mlx5_core_dev *dev, struct mlx5_eqe *eqe);
 void mlx5_enter_error_state(struct mlx5_core_dev *dev, bool force);
 void mlx5_disable_device(struct mlx5_core_dev *dev);
diff --git a/include/linux/mlx5/driver.h b/include/linux/mlx5/driver.h
index 401c8972cc3a..08c77b7e59cb 100644
--- a/include/linux/mlx5/driver.h
+++ b/include/linux/mlx5/driver.h
@@ -49,6 +49,8 @@
 #include <linux/mlx5/device.h>
 #include <linux/mlx5/doorbell.h>
 #include <linux/mlx5/srq.h>
+#include <linux/timecounter.h>
+#include <linux/ptp_clock_kernel.h>
 
 enum {
 	MLX5_BOARD_ID_LEN = 64,
@@ -760,6 +762,27 @@ struct mlx5_rsvd_gids {
 	struct ida ida;
 };
 
+#define MAX_PIN_NUM	8
+struct mlx5_pps {
+	u8                         pin_caps[MAX_PIN_NUM];
+	struct work_struct         out_work;
+	u64                        start[MAX_PIN_NUM];
+	u8                         enabled;
+};
+
+struct mlx5_clock {
+	rwlock_t                   lock;
+	struct cyclecounter        cycles;
+	struct timecounter         tc;
+	struct hwtstamp_config     hwtstamp_config;
+	u32                        nominal_c_mult;
+	unsigned long              overflow_period;
+	struct delayed_work        overflow_work;
+	struct ptp_clock          *ptp;
+	struct ptp_clock_info      ptp_info;
+	struct mlx5_pps            pps_info;
+};
+
 struct mlx5_core_dev {
 	struct pci_dev	       *pdev;
 	/* sync pci state */
@@ -800,6 +823,7 @@ struct mlx5_core_dev {
 #ifdef CONFIG_RFS_ACCEL
 	struct cpu_rmap         *rmap;
 #endif
+	struct mlx5_clock        clock;
 };
 
 struct mlx5_db {
-- 
2.14.2

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [for-next 03/12] net/mlx5e: IPoIB, Move underlay QP init/uninit to separate functions
       [not found] ` <20171014184827.18491-1-saeedm-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  2017-10-14 18:48   ` [for-next 01/12] net/mlx5: File renaming towards ptp core implementation Saeed Mahameed
@ 2017-10-14 18:48   ` Saeed Mahameed
  2017-10-14 18:48   ` [for-next 08/12] net/mlx5e: IPoIB, Use hash-table to map between QPN to child netdev Saeed Mahameed
                     ` (2 subsequent siblings)
  4 siblings, 0 replies; 20+ messages in thread
From: Saeed Mahameed @ 2017-10-14 18:48 UTC (permalink / raw)
  To: David S. Miller, Doug Ledford
  Cc: netdev-u79uwXL29TY76Z2rM5mHXA, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	Leon Romanovsky, Alex Vesker, Leon Romanovsky, Saeed Mahameed

From: Alex Vesker <valex-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>

During the creation of the underlay QP the PKEY index is unknown, the
PKEY index is known only when calling ndo_open.
PKEY index attached to the QP during state modification.

Splitting the functions will also make the code symmetric and more
readable. This split is also required for later PKEY support to be
called with the PKEY index during ndo_open.

Signed-off-by: Alex Vesker <valex-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Reviewed-by: Erez Shitrit <erezsh-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Signed-off-by: Leon Romanovsky <leon-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
Signed-off-by: Saeed Mahameed <saeedm-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
 .../net/ethernet/mellanox/mlx5/core/ipoib/ipoib.c  | 108 +++++++++++++--------
 1 file changed, 70 insertions(+), 38 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib.c b/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib.c
index 14dfb577691b..feb94db6b921 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib.c
@@ -108,11 +108,68 @@ static void mlx5i_cleanup(struct mlx5e_priv *priv)
 	/* Do nothing .. */
 }
 
+static int mlx5i_init_underlay_qp(struct mlx5e_priv *priv)
+{
+	struct mlx5_core_dev *mdev = priv->mdev;
+	struct mlx5i_priv *ipriv = priv->ppriv;
+	struct mlx5_core_qp *qp = &ipriv->qp;
+	struct mlx5_qp_context *context;
+	int ret;
+
+	/* QP states */
+	context = kzalloc(sizeof(*context), GFP_KERNEL);
+	if (!context)
+		return -ENOMEM;
+
+	context->flags = cpu_to_be32(MLX5_QP_PM_MIGRATED << 11);
+	context->pri_path.port = 1;
+	context->qkey = cpu_to_be32(IB_DEFAULT_Q_KEY);
+
+	ret = mlx5_core_qp_modify(mdev, MLX5_CMD_OP_RST2INIT_QP, 0, context, qp);
+	if (ret) {
+		mlx5_core_err(mdev, "Failed to modify qp RST2INIT, err: %d\n", ret);
+		goto err_qp_modify_to_err;
+	}
+	memset(context, 0, sizeof(*context));
+
+	ret = mlx5_core_qp_modify(mdev, MLX5_CMD_OP_INIT2RTR_QP, 0, context, qp);
+	if (ret) {
+		mlx5_core_err(mdev, "Failed to modify qp INIT2RTR, err: %d\n", ret);
+		goto err_qp_modify_to_err;
+	}
+
+	ret = mlx5_core_qp_modify(mdev, MLX5_CMD_OP_RTR2RTS_QP, 0, context, qp);
+	if (ret) {
+		mlx5_core_err(mdev, "Failed to modify qp RTR2RTS, err: %d\n", ret);
+		goto err_qp_modify_to_err;
+	}
+
+	kfree(context);
+	return 0;
+
+err_qp_modify_to_err:
+	mlx5_core_qp_modify(mdev, MLX5_CMD_OP_2ERR_QP, 0, &context, qp);
+	kfree(context);
+	return ret;
+}
+
+static void mlx5i_uninit_underlay_qp(struct mlx5e_priv *priv)
+{
+	struct mlx5i_priv *ipriv = priv->ppriv;
+	struct mlx5_core_dev *mdev = priv->mdev;
+	struct mlx5_qp_context context;
+	int err;
+
+	err = mlx5_core_qp_modify(mdev, MLX5_CMD_OP_2RST_QP, 0, &context,
+				  &ipriv->qp);
+	if (err)
+		mlx5_core_err(mdev, "Failed to modify qp 2RST, err: %d\n", err);
+}
+
 #define MLX5_QP_ENHANCED_ULP_STATELESS_MODE 2
 
 static int mlx5i_create_underlay_qp(struct mlx5_core_dev *mdev, struct mlx5_core_qp *qp)
 {
-	struct mlx5_qp_context *context = NULL;
 	u32 *in = NULL;
 	void *addr_path;
 	int ret = 0;
@@ -140,38 +197,7 @@ static int mlx5i_create_underlay_qp(struct mlx5_core_dev *mdev, struct mlx5_core
 		goto out;
 	}
 
-	/* QP states */
-	context = kzalloc(sizeof(*context), GFP_KERNEL);
-	if (!context) {
-		ret = -ENOMEM;
-		goto out;
-	}
-
-	context->flags = cpu_to_be32(MLX5_QP_PM_MIGRATED << 11);
-	context->pri_path.port = 1;
-	context->qkey = cpu_to_be32(IB_DEFAULT_Q_KEY);
-
-	ret = mlx5_core_qp_modify(mdev, MLX5_CMD_OP_RST2INIT_QP, 0, context, qp);
-	if (ret) {
-		mlx5_core_err(mdev, "Failed to modify qp RST2INIT, err: %d\n", ret);
-		goto out;
-	}
-	memset(context, 0, sizeof(*context));
-
-	ret = mlx5_core_qp_modify(mdev, MLX5_CMD_OP_INIT2RTR_QP, 0, context, qp);
-	if (ret) {
-		mlx5_core_err(mdev, "Failed to modify qp INIT2RTR, err: %d\n", ret);
-		goto out;
-	}
-
-	ret = mlx5_core_qp_modify(mdev, MLX5_CMD_OP_RTR2RTS_QP, 0, context, qp);
-	if (ret) {
-		mlx5_core_err(mdev, "Failed to modify qp RTR2RTS, err: %d\n", ret);
-		goto out;
-	}
-
 out:
-	kfree(context);
 	kvfree(in);
 	return ret;
 }
@@ -192,13 +218,23 @@ static int mlx5i_init_tx(struct mlx5e_priv *priv)
 		return err;
 	}
 
+	err = mlx5i_init_underlay_qp(priv);
+	if (err) {
+		mlx5_core_warn(priv->mdev, "intilize underlay QP failed, %d\n", err);
+		goto err_destroy_underlay_qp;
+	}
+
 	err = mlx5e_create_tis(priv->mdev, 0 /* tc */, ipriv->qp.qpn, &priv->tisn[0]);
 	if (err) {
 		mlx5_core_warn(priv->mdev, "create tis failed, %d\n", err);
-		return err;
+		goto err_destroy_underlay_qp;
 	}
 
 	return 0;
+
+err_destroy_underlay_qp:
+	mlx5i_destroy_underlay_qp(priv->mdev, &ipriv->qp);
+	return err;
 }
 
 static void mlx5i_cleanup_tx(struct mlx5e_priv *priv)
@@ -381,12 +417,8 @@ static int mlx5i_ioctl(struct net_device *dev, struct ifreq *ifr, int cmd)
 static void mlx5i_dev_cleanup(struct net_device *dev)
 {
 	struct mlx5e_priv    *priv   = mlx5i_epriv(dev);
-	struct mlx5_core_dev *mdev   = priv->mdev;
-	struct mlx5i_priv    *ipriv  = priv->ppriv;
-	struct mlx5_qp_context context;
 
-	/* detach qp from flow-steering by reset it */
-	mlx5_core_qp_modify(mdev, MLX5_CMD_OP_2RST_QP, 0, &context, &ipriv->qp);
+	mlx5i_uninit_underlay_qp(priv);
 }
 
 static int mlx5i_open(struct net_device *netdev)
-- 
2.14.2

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [for-next 04/12] net/mlx5: Support for attaching multiple underlay QPs to root flow table
  2017-10-14 18:48 [pull request][for-next 00/12] Mellanox, mlx5 IPoIB Muli Pkey support 2017-10-11 Saeed Mahameed
  2017-10-14 18:48 ` [for-next 02/12] net/mlx5: PTP code migration to driver core section Saeed Mahameed
@ 2017-10-14 18:48 ` Saeed Mahameed
  2017-10-14 18:48 ` [for-next 05/12] IB/ipoib: Grab rtnl lock on heavy flush when calling ndo_open/stop Saeed Mahameed
                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 20+ messages in thread
From: Saeed Mahameed @ 2017-10-14 18:48 UTC (permalink / raw)
  To: David S. Miller, Doug Ledford
  Cc: netdev, linux-rdma, Leon Romanovsky, Alex Vesker, Saeed Mahameed,
	Leon Romanovsky

From: Alex Vesker <valex@mellanox.com>

Previous support allowed connecting only a single QPN to the FT.
Now using a linked list multiple QPNs can be attached to the same FT.

Supporting attaching multiple underlay QPs is required for PKEY
support in which child and parent share the same FT.

The actual attaching/detaching FW commands will be called inside the
function symmetrically.

This change requires a change in IPoIB open and close functions, the
attaching/detaching to/from the FT is done each time we open/close.

Signed-off-by: Alex Vesker <valex@mellanox.com>
Reviewed-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
---
 drivers/net/ethernet/mellanox/mlx5/core/fs_cmd.c   |  13 ++-
 drivers/net/ethernet/mellanox/mlx5/core/fs_cmd.h   |   4 +-
 drivers/net/ethernet/mellanox/mlx5/core/fs_core.c  | 123 ++++++++++++++++++---
 drivers/net/ethernet/mellanox/mlx5/core/fs_core.h  |   7 +-
 .../net/ethernet/mellanox/mlx5/core/ipoib/ipoib.c  |  78 +++++++------
 5 files changed, 171 insertions(+), 54 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/fs_cmd.c b/drivers/net/ethernet/mellanox/mlx5/core/fs_cmd.c
index 36ecc2b2e187..881e2e55840c 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/fs_cmd.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/fs_cmd.c
@@ -40,7 +40,8 @@
 #include "eswitch.h"
 
 int mlx5_cmd_update_root_ft(struct mlx5_core_dev *dev,
-			    struct mlx5_flow_table *ft, u32 underlay_qpn)
+			    struct mlx5_flow_table *ft, u32 underlay_qpn,
+			    bool disconnect)
 {
 	u32 in[MLX5_ST_SZ_DW(set_flow_table_root_in)]   = {0};
 	u32 out[MLX5_ST_SZ_DW(set_flow_table_root_out)] = {0};
@@ -52,7 +53,15 @@ int mlx5_cmd_update_root_ft(struct mlx5_core_dev *dev,
 	MLX5_SET(set_flow_table_root_in, in, opcode,
 		 MLX5_CMD_OP_SET_FLOW_TABLE_ROOT);
 	MLX5_SET(set_flow_table_root_in, in, table_type, ft->type);
-	MLX5_SET(set_flow_table_root_in, in, table_id, ft->id);
+
+	if (disconnect) {
+		MLX5_SET(set_flow_table_root_in, in, op_mod, 1);
+		MLX5_SET(set_flow_table_root_in, in, table_id, 0);
+	} else {
+		MLX5_SET(set_flow_table_root_in, in, op_mod, 0);
+		MLX5_SET(set_flow_table_root_in, in, table_id, ft->id);
+	}
+
 	MLX5_SET(set_flow_table_root_in, in, underlay_qpn, underlay_qpn);
 	if (ft->vport) {
 		MLX5_SET(set_flow_table_root_in, in, vport_number, ft->vport);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/fs_cmd.h b/drivers/net/ethernet/mellanox/mlx5/core/fs_cmd.h
index c6d7bdf255b6..71e2d0f37ad9 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/fs_cmd.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/fs_cmd.h
@@ -71,8 +71,8 @@ int mlx5_cmd_delete_fte(struct mlx5_core_dev *dev,
 			unsigned int index);
 
 int mlx5_cmd_update_root_ft(struct mlx5_core_dev *dev,
-			    struct mlx5_flow_table *ft,
-			    u32 underlay_qpn);
+			    struct mlx5_flow_table *ft, u32 underlay_qpn,
+			    bool disconnect);
 
 int mlx5_cmd_fc_alloc(struct mlx5_core_dev *dev, u32 *id);
 int mlx5_cmd_fc_free(struct mlx5_core_dev *dev, u32 id);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c b/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c
index 5a7bea688ec8..8a1a7ba9fe53 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c
@@ -693,8 +693,10 @@ static int update_root_ft_create(struct mlx5_flow_table *ft, struct fs_prio
 				 *prio)
 {
 	struct mlx5_flow_root_namespace *root = find_root(&prio->node);
+	struct mlx5_ft_underlay_qp *uqp;
 	int min_level = INT_MAX;
 	int err;
+	u32 qpn;
 
 	if (root->root_ft)
 		min_level = root->root_ft->level;
@@ -702,10 +704,24 @@ static int update_root_ft_create(struct mlx5_flow_table *ft, struct fs_prio
 	if (ft->level >= min_level)
 		return 0;
 
-	err = mlx5_cmd_update_root_ft(root->dev, ft, root->underlay_qpn);
+	if (list_empty(&root->underlay_qpns)) {
+		/* Don't set any QPN (zero) in case QPN list is empty */
+		qpn = 0;
+		err = mlx5_cmd_update_root_ft(root->dev, ft, qpn, false);
+	} else {
+		list_for_each_entry(uqp, &root->underlay_qpns, list) {
+			qpn = uqp->qpn;
+			err = mlx5_cmd_update_root_ft(root->dev, ft, qpn,
+						      false);
+			if (err)
+				break;
+		}
+	}
+
 	if (err)
-		mlx5_core_warn(root->dev, "Update root flow table of id=%u failed\n",
-			       ft->id);
+		mlx5_core_warn(root->dev,
+			       "Update root flow table of id(%u) qpn(%d) failed\n",
+			       ft->id, qpn);
 	else
 		root->root_ft = ft;
 
@@ -1661,23 +1677,43 @@ static struct mlx5_flow_table *find_next_ft(struct mlx5_flow_table *ft)
 static int update_root_ft_destroy(struct mlx5_flow_table *ft)
 {
 	struct mlx5_flow_root_namespace *root = find_root(&ft->node);
+	struct mlx5_ft_underlay_qp *uqp;
 	struct mlx5_flow_table *new_root_ft = NULL;
+	int err = 0;
+	u32 qpn;
 
 	if (root->root_ft != ft)
 		return 0;
 
 	new_root_ft = find_next_ft(ft);
-	if (new_root_ft) {
-		int err = mlx5_cmd_update_root_ft(root->dev, new_root_ft,
-						  root->underlay_qpn);
 
-		if (err) {
-			mlx5_core_warn(root->dev, "Update root flow table of id=%u failed\n",
-				       ft->id);
-			return err;
+	if (!new_root_ft) {
+		root->root_ft = NULL;
+		return 0;
+	}
+
+	if (list_empty(&root->underlay_qpns)) {
+		/* Don't set any QPN (zero) in case QPN list is empty */
+		qpn = 0;
+		err = mlx5_cmd_update_root_ft(root->dev, new_root_ft, qpn,
+					      false);
+	} else {
+		list_for_each_entry(uqp, &root->underlay_qpns, list) {
+			qpn = uqp->qpn;
+			err = mlx5_cmd_update_root_ft(root->dev, new_root_ft,
+						      qpn, false);
+			if (err)
+				break;
 		}
 	}
-	root->root_ft = new_root_ft;
+
+	if (err)
+		mlx5_core_warn(root->dev,
+			       "Update root flow table of id(%u) qpn(%d) failed\n",
+			       ft->id, qpn);
+	else
+		root->root_ft = new_root_ft;
+
 	return 0;
 }
 
@@ -1965,6 +2001,8 @@ static struct mlx5_flow_root_namespace *create_root_ns(struct mlx5_flow_steering
 	root_ns->dev = steering->dev;
 	root_ns->table_type = table_type;
 
+	INIT_LIST_HEAD(&root_ns->underlay_qpns);
+
 	ns = &root_ns->ns;
 	fs_init_namespace(ns);
 	mutex_init(&root_ns->chain_lock);
@@ -2245,17 +2283,76 @@ int mlx5_init_fs(struct mlx5_core_dev *dev)
 int mlx5_fs_add_rx_underlay_qpn(struct mlx5_core_dev *dev, u32 underlay_qpn)
 {
 	struct mlx5_flow_root_namespace *root = dev->priv.steering->root_ns;
+	struct mlx5_ft_underlay_qp *new_uqp;
+	int err = 0;
+
+	new_uqp = kzalloc(sizeof(*new_uqp), GFP_KERNEL);
+	if (!new_uqp)
+		return -ENOMEM;
+
+	mutex_lock(&root->chain_lock);
+
+	if (!root->root_ft) {
+		err = -EINVAL;
+		goto update_ft_fail;
+	}
+
+	err = mlx5_cmd_update_root_ft(dev, root->root_ft, underlay_qpn, false);
+	if (err) {
+		mlx5_core_warn(dev, "Failed adding underlay QPN (%u) to root FT err(%d)\n",
+			       underlay_qpn, err);
+		goto update_ft_fail;
+	}
+
+	new_uqp->qpn = underlay_qpn;
+	list_add_tail(&new_uqp->list, &root->underlay_qpns);
+
+	mutex_unlock(&root->chain_lock);
 
-	root->underlay_qpn = underlay_qpn;
 	return 0;
+
+update_ft_fail:
+	mutex_unlock(&root->chain_lock);
+	kfree(new_uqp);
+	return err;
 }
 EXPORT_SYMBOL(mlx5_fs_add_rx_underlay_qpn);
 
 int mlx5_fs_remove_rx_underlay_qpn(struct mlx5_core_dev *dev, u32 underlay_qpn)
 {
 	struct mlx5_flow_root_namespace *root = dev->priv.steering->root_ns;
+	struct mlx5_ft_underlay_qp *uqp;
+	bool found = false;
+	int err = 0;
+
+	mutex_lock(&root->chain_lock);
+	list_for_each_entry(uqp, &root->underlay_qpns, list) {
+		if (uqp->qpn == underlay_qpn) {
+			found = true;
+			break;
+		}
+	}
+
+	if (!found) {
+		mlx5_core_warn(dev, "Failed finding underlay qp (%u) in qpn list\n",
+			       underlay_qpn);
+		err = -EINVAL;
+		goto out;
+	}
+
+	err = mlx5_cmd_update_root_ft(dev, root->root_ft, underlay_qpn, true);
+	if (err)
+		mlx5_core_warn(dev, "Failed removing underlay QPN (%u) from root FT err(%d)\n",
+			       underlay_qpn, err);
+
+	list_del(&uqp->list);
+	mutex_unlock(&root->chain_lock);
+	kfree(uqp);
 
-	root->underlay_qpn = 0;
 	return 0;
+
+out:
+	mutex_unlock(&root->chain_lock);
+	return err;
 }
 EXPORT_SYMBOL(mlx5_fs_remove_rx_underlay_qpn);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/fs_core.h b/drivers/net/ethernet/mellanox/mlx5/core/fs_core.h
index 48dd78975062..9bc048a89bc0 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/fs_core.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/fs_core.h
@@ -147,6 +147,11 @@ struct mlx5_fc {
 	struct mlx5_fc_cache cache ____cacheline_aligned_in_smp;
 };
 
+struct mlx5_ft_underlay_qp {
+	struct list_head list;
+	u32 qpn;
+};
+
 #define MLX5_FTE_MATCH_PARAM_RESERVED	reserved_at_600
 /* Calculate the fte_match_param length and without the reserved length.
  * Make sure the reserved field is the last.
@@ -212,7 +217,7 @@ struct mlx5_flow_root_namespace {
 	struct mlx5_flow_table		*root_ft;
 	/* Should be held when chaining flow tables */
 	struct mutex			chain_lock;
-	u32				underlay_qpn;
+	struct list_head		underlay_qpns;
 };
 
 int mlx5_init_fc_stats(struct mlx5_core_dev *dev);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib.c b/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib.c
index feb94db6b921..00f0e6a038bb 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib.c
@@ -218,12 +218,6 @@ static int mlx5i_init_tx(struct mlx5e_priv *priv)
 		return err;
 	}
 
-	err = mlx5i_init_underlay_qp(priv);
-	if (err) {
-		mlx5_core_warn(priv->mdev, "intilize underlay QP failed, %d\n", err);
-		goto err_destroy_underlay_qp;
-	}
-
 	err = mlx5e_create_tis(priv->mdev, 0 /* tc */, ipriv->qp.qpn, &priv->tisn[0]);
 	if (err) {
 		mlx5_core_warn(priv->mdev, "create tis failed, %d\n", err);
@@ -285,7 +279,6 @@ static void mlx5i_destroy_flow_steering(struct mlx5e_priv *priv)
 
 static int mlx5i_init_rx(struct mlx5e_priv *priv)
 {
-	struct mlx5i_priv *ipriv  = priv->ppriv;
 	int err;
 
 	err = mlx5e_create_indirect_rqt(priv);
@@ -304,18 +297,12 @@ static int mlx5i_init_rx(struct mlx5e_priv *priv)
 	if (err)
 		goto err_destroy_indirect_tirs;
 
-	err = mlx5_fs_add_rx_underlay_qpn(priv->mdev, ipriv->qp.qpn);
-	if (err)
-		goto err_destroy_direct_tirs;
-
 	err = mlx5i_create_flow_steering(priv);
 	if (err)
-		goto err_remove_rx_underlay_qpn;
+		goto err_destroy_direct_tirs;
 
 	return 0;
 
-err_remove_rx_underlay_qpn:
-	mlx5_fs_remove_rx_underlay_qpn(priv->mdev, ipriv->qp.qpn);
 err_destroy_direct_tirs:
 	mlx5e_destroy_direct_tirs(priv);
 err_destroy_indirect_tirs:
@@ -329,9 +316,6 @@ static int mlx5i_init_rx(struct mlx5e_priv *priv)
 
 static void mlx5i_cleanup_rx(struct mlx5e_priv *priv)
 {
-	struct mlx5i_priv *ipriv  = priv->ppriv;
-
-	mlx5_fs_remove_rx_underlay_qpn(priv->mdev, ipriv->qp.qpn);
 	mlx5i_destroy_flow_steering(priv);
 	mlx5e_destroy_direct_tirs(priv);
 	mlx5e_destroy_indirect_tirs(priv);
@@ -423,49 +407,71 @@ static void mlx5i_dev_cleanup(struct net_device *dev)
 
 static int mlx5i_open(struct net_device *netdev)
 {
-	struct mlx5e_priv *priv = mlx5i_epriv(netdev);
+	struct mlx5e_priv *epriv = mlx5i_epriv(netdev);
+	struct mlx5i_priv *ipriv = epriv->ppriv;
+	struct mlx5_core_dev *mdev = epriv->mdev;
 	int err;
 
-	mutex_lock(&priv->state_lock);
+	mutex_lock(&epriv->state_lock);
 
-	set_bit(MLX5E_STATE_OPENED, &priv->state);
+	set_bit(MLX5E_STATE_OPENED, &epriv->state);
 
-	err = mlx5e_open_channels(priv, &priv->channels);
-	if (err)
+	err = mlx5i_init_underlay_qp(epriv);
+	if (err) {
+		mlx5_core_warn(mdev, "prepare underlay qp state failed, %d\n", err);
 		goto err_clear_state_opened_flag;
+	}
 
-	mlx5e_refresh_tirs(priv, false);
-	mlx5e_activate_priv_channels(priv);
-	mlx5e_timestamp_set(priv);
+	err = mlx5_fs_add_rx_underlay_qpn(mdev, ipriv->qp.qpn);
+	if (err) {
+		mlx5_core_warn(mdev, "attach underlay qp to ft failed, %d\n", err);
+		goto err_reset_qp;
+	}
 
-	mutex_unlock(&priv->state_lock);
+	err = mlx5e_open_channels(epriv, &epriv->channels);
+	if (err)
+		goto err_remove_fs_underlay_qp;
+
+	mlx5e_refresh_tirs(epriv, false);
+	mlx5e_activate_priv_channels(epriv);
+	mlx5e_timestamp_set(epriv);
+
+	mutex_unlock(&epriv->state_lock);
 	return 0;
 
+err_remove_fs_underlay_qp:
+	mlx5_fs_remove_rx_underlay_qpn(mdev, ipriv->qp.qpn);
+err_reset_qp:
+	mlx5i_uninit_underlay_qp(epriv);
 err_clear_state_opened_flag:
-	clear_bit(MLX5E_STATE_OPENED, &priv->state);
-	mutex_unlock(&priv->state_lock);
+	clear_bit(MLX5E_STATE_OPENED, &epriv->state);
+	mutex_unlock(&epriv->state_lock);
 	return err;
 }
 
 static int mlx5i_close(struct net_device *netdev)
 {
-	struct mlx5e_priv *priv = mlx5i_epriv(netdev);
+	struct mlx5e_priv *epriv = mlx5i_epriv(netdev);
+	struct mlx5i_priv *ipriv = epriv->ppriv;
+	struct mlx5_core_dev *mdev = epriv->mdev;
 
 	/* May already be CLOSED in case a previous configuration operation
 	 * (e.g RX/TX queue size change) that involves close&open failed.
 	 */
-	mutex_lock(&priv->state_lock);
+	mutex_lock(&epriv->state_lock);
 
-	if (!test_bit(MLX5E_STATE_OPENED, &priv->state))
+	if (!test_bit(MLX5E_STATE_OPENED, &epriv->state))
 		goto unlock;
 
-	clear_bit(MLX5E_STATE_OPENED, &priv->state);
+	clear_bit(MLX5E_STATE_OPENED, &epriv->state);
 
-	netif_carrier_off(priv->netdev);
-	mlx5e_deactivate_priv_channels(priv);
-	mlx5e_close_channels(&priv->channels);
+	netif_carrier_off(epriv->netdev);
+	mlx5_fs_remove_rx_underlay_qpn(mdev, ipriv->qp.qpn);
+	mlx5i_uninit_underlay_qp(epriv);
+	mlx5e_deactivate_priv_channels(epriv);
+	mlx5e_close_channels(&epriv->channels);;
 unlock:
-	mutex_unlock(&priv->state_lock);
+	mutex_unlock(&epriv->state_lock);
 	return 0;
 }
 
-- 
2.14.2

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [for-next 05/12] IB/ipoib: Grab rtnl lock on heavy flush when calling ndo_open/stop
  2017-10-14 18:48 [pull request][for-next 00/12] Mellanox, mlx5 IPoIB Muli Pkey support 2017-10-11 Saeed Mahameed
  2017-10-14 18:48 ` [for-next 02/12] net/mlx5: PTP code migration to driver core section Saeed Mahameed
  2017-10-14 18:48 ` [for-next 04/12] net/mlx5: Support for attaching multiple underlay QPs to root flow table Saeed Mahameed
@ 2017-10-14 18:48 ` Saeed Mahameed
  2017-10-14 18:48 ` [for-next 06/12] IB/ipoib: Add ability to set PKEY index to lower device driver Saeed Mahameed
                   ` (6 subsequent siblings)
  9 siblings, 0 replies; 20+ messages in thread
From: Saeed Mahameed @ 2017-10-14 18:48 UTC (permalink / raw)
  To: David S. Miller, Doug Ledford
  Cc: netdev, linux-rdma, Leon Romanovsky, Alex Vesker, Saeed Mahameed

From: Alex Vesker <valex@mellanox.com>

When ndo_open and ndo_stop are called RTNL lock should be held.
In this specific case ipoib_ib_dev_open calls the offloaded ndo_open
which re-sets the number of TX queue assuming RTNL lock is held.
Since RTNL lock is not held, RTNL assert will fail.

Signed-off-by: Alex Vesker <valex@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 drivers/infiniband/ulp/ipoib/ipoib_ib.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/infiniband/ulp/ipoib/ipoib_ib.c b/drivers/infiniband/ulp/ipoib/ipoib_ib.c
index 6cd61638b441..c97384c914a4 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_ib.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_ib.c
@@ -1203,10 +1203,15 @@ static void __ipoib_ib_dev_flush(struct ipoib_dev_priv *priv,
 		ipoib_ib_dev_down(dev);
 
 	if (level == IPOIB_FLUSH_HEAVY) {
+		rtnl_lock();
 		if (test_bit(IPOIB_FLAG_INITIALIZED, &priv->flags))
 			ipoib_ib_dev_stop(dev);
-		if (ipoib_ib_dev_open(dev) != 0)
+
+		result = ipoib_ib_dev_open(dev);
+		rtnl_unlock();
+		if (result)
 			return;
+
 		if (netif_queue_stopped(dev))
 			netif_start_queue(dev);
 	}
-- 
2.14.2

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [for-next 06/12] IB/ipoib: Add ability to set PKEY index to lower device driver
  2017-10-14 18:48 [pull request][for-next 00/12] Mellanox, mlx5 IPoIB Muli Pkey support 2017-10-11 Saeed Mahameed
                   ` (2 preceding siblings ...)
  2017-10-14 18:48 ` [for-next 05/12] IB/ipoib: Grab rtnl lock on heavy flush when calling ndo_open/stop Saeed Mahameed
@ 2017-10-14 18:48 ` Saeed Mahameed
  2017-10-14 18:48 ` [for-next 07/12] net/mlx5e: IPoIB, Support for setting PKEY index to underlay QP Saeed Mahameed
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 20+ messages in thread
From: Saeed Mahameed @ 2017-10-14 18:48 UTC (permalink / raw)
  To: David S. Miller, Doug Ledford
  Cc: netdev, linux-rdma, Leon Romanovsky, Alex Vesker, Saeed Mahameed

From: Alex Vesker <valex@mellanox.com>

To support passing child interfaces to the lower device a new
rdma_netdev function was used, set_id. This will allow us to
attach the PKEY index lower device resources such as TIS/QP.
For devices that do not support offloads in IPoIB same logic
will be used, setting the PKEY index to priv struct.

Signed-off-by: Alex Vesker <valex@mellanox.com>
Reviewed-by: Erez Shitrit <erezsh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 drivers/infiniband/ulp/ipoib/ipoib_ib.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/infiniband/ulp/ipoib/ipoib_ib.c b/drivers/infiniband/ulp/ipoib/ipoib_ib.c
index c97384c914a4..fe690f82af29 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_ib.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_ib.c
@@ -893,13 +893,17 @@ int ipoib_ib_dev_open(struct net_device *dev)
 void ipoib_pkey_dev_check_presence(struct net_device *dev)
 {
 	struct ipoib_dev_priv *priv = ipoib_priv(dev);
+	struct rdma_netdev *rn = netdev_priv(dev);
 
 	if (!(priv->pkey & 0x7fff) ||
 	    ib_find_pkey(priv->ca, priv->port, priv->pkey,
-			 &priv->pkey_index))
+			 &priv->pkey_index)) {
 		clear_bit(IPOIB_PKEY_ASSIGNED, &priv->flags);
-	else
+	} else {
+		if (rn->set_id)
+			rn->set_id(dev, priv->pkey_index);
 		set_bit(IPOIB_PKEY_ASSIGNED, &priv->flags);
+	}
 }
 
 void ipoib_ib_dev_up(struct net_device *dev)
-- 
2.14.2

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [for-next 07/12] net/mlx5e: IPoIB, Support for setting PKEY index to underlay QP
  2017-10-14 18:48 [pull request][for-next 00/12] Mellanox, mlx5 IPoIB Muli Pkey support 2017-10-11 Saeed Mahameed
                   ` (3 preceding siblings ...)
  2017-10-14 18:48 ` [for-next 06/12] IB/ipoib: Add ability to set PKEY index to lower device driver Saeed Mahameed
@ 2017-10-14 18:48 ` Saeed Mahameed
  2017-10-14 18:48 ` [for-next 09/12] net/mlx5e: IPoIB, Add PKEY child interface nic profile Saeed Mahameed
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 20+ messages in thread
From: Saeed Mahameed @ 2017-10-14 18:48 UTC (permalink / raw)
  To: David S. Miller, Doug Ledford
  Cc: netdev, linux-rdma, Leon Romanovsky, Alex Vesker, Saeed Mahameed

From: Alex Vesker <valex@mellanox.com>

Added a function to set PKEY index to IPoIB device driver using the
already present set_id function. PKEY index is attached to the QP
during state modification.

Signed-off-by: Alex Vesker <valex@mellanox.com>
Reviewed-by: Erez Shitrit <erezsh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib.c | 9 +++++++++
 drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib.h | 1 +
 2 files changed, 10 insertions(+)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib.c b/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib.c
index 00f0e6a038bb..679c1f9af642 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib.c
@@ -123,6 +123,7 @@ static int mlx5i_init_underlay_qp(struct mlx5e_priv *priv)
 
 	context->flags = cpu_to_be32(MLX5_QP_PM_MIGRATED << 11);
 	context->pri_path.port = 1;
+	context->pri_path.pkey_index = cpu_to_be16(ipriv->pkey_index);
 	context->qkey = cpu_to_be32(IB_DEFAULT_Q_KEY);
 
 	ret = mlx5_core_qp_modify(mdev, MLX5_CMD_OP_RST2INIT_QP, 0, context, qp);
@@ -529,6 +530,13 @@ static int mlx5i_xmit(struct net_device *dev, struct sk_buff *skb,
 	return mlx5i_sq_xmit(sq, skb, &mah->av, dqpn, ipriv->qkey);
 }
 
+static void mlx5i_set_pkey_index(struct net_device *netdev, int id)
+{
+	struct mlx5i_priv *ipriv = netdev_priv(netdev);
+
+	ipriv->pkey_index = (u16)id;
+}
+
 static int mlx5i_check_required_hca_cap(struct mlx5_core_dev *mdev)
 {
 	if (MLX5_CAP_GEN(mdev, port_type) != MLX5_CAP_PORT_TYPE_IB)
@@ -593,6 +601,7 @@ struct net_device *mlx5_rdma_netdev_alloc(struct mlx5_core_dev *mdev,
 	rn->send = mlx5i_xmit;
 	rn->attach_mcast = mlx5i_attach_mcast;
 	rn->detach_mcast = mlx5i_detach_mcast;
+	rn->set_id = mlx5i_set_pkey_index;
 
 	return netdev;
 
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib.h b/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib.h
index a0f405f520f7..9a729883c3b3 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib.h
@@ -50,6 +50,7 @@ struct mlx5i_priv {
 	struct rdma_netdev rn; /* keep this first */
 	struct mlx5_core_qp qp;
 	u32    qkey;
+	u16    pkey_index;
 	char  *mlx5e_priv[0];
 };
 
-- 
2.14.2

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [for-next 08/12] net/mlx5e: IPoIB, Use hash-table to map between QPN to child netdev
       [not found] ` <20171014184827.18491-1-saeedm-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  2017-10-14 18:48   ` [for-next 01/12] net/mlx5: File renaming towards ptp core implementation Saeed Mahameed
  2017-10-14 18:48   ` [for-next 03/12] net/mlx5e: IPoIB, Move underlay QP init/uninit to separate functions Saeed Mahameed
@ 2017-10-14 18:48   ` Saeed Mahameed
       [not found]     ` <20171014184827.18491-9-saeedm-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  2017-10-14 18:48   ` [for-next 11/12] net/mlx5e: IPoIB, Add PKEY child interface ethtool ops Saeed Mahameed
  2017-10-14 21:19   ` [pull request][for-next 00/12] Mellanox, mlx5 IPoIB Muli Pkey support 2017-10-11 Doug Ledford
  4 siblings, 1 reply; 20+ messages in thread
From: Saeed Mahameed @ 2017-10-14 18:48 UTC (permalink / raw)
  To: David S. Miller, Doug Ledford
  Cc: netdev-u79uwXL29TY76Z2rM5mHXA, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	Leon Romanovsky, Alex Vesker, Saeed Mahameed

From: Alex Vesker <valex-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>

This change is needed for PKEY support, since the RQs are shared
between the child interface and the parent. The parent is responsible
for NAPI and the precessing of RX completions. Using the dqpn in the
completion descriptor we set the corresponding child IPoIB netdevice
on the SKB.
The mapping between the dqpn and the netdevice is done using a HT,
each mlx5 IPoIB interface registers its mapping on creation.

Signed-off-by: Alex Vesker <valex-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Reviewed-by: Erez Shitrit <erezsh-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Signed-off-by: Saeed Mahameed <saeedm-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
 drivers/net/ethernet/mellanox/mlx5/core/Makefile   |   2 +-
 drivers/net/ethernet/mellanox/mlx5/core/en_rx.c    |  20 ++-
 .../net/ethernet/mellanox/mlx5/core/ipoib/ipoib.c  |  16 +++
 .../net/ethernet/mellanox/mlx5/core/ipoib/ipoib.h  |  12 ++
 .../ethernet/mellanox/mlx5/core/ipoib/ipoib_vlan.c | 136 +++++++++++++++++++++
 5 files changed, 184 insertions(+), 2 deletions(-)
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib_vlan.c

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/Makefile b/drivers/net/ethernet/mellanox/mlx5/core/Makefile
index d9621b2152d3..100fe4ecad9b 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/Makefile
+++ b/drivers/net/ethernet/mellanox/mlx5/core/Makefile
@@ -22,7 +22,7 @@ mlx5_core-$(CONFIG_MLX5_ESWITCH) += eswitch.o eswitch_offloads.o en_rep.o en_tc.
 
 mlx5_core-$(CONFIG_MLX5_CORE_EN_DCB) +=  en_dcbnl.o
 
-mlx5_core-$(CONFIG_MLX5_CORE_IPOIB) += ipoib/ipoib.o ipoib/ethtool.o
+mlx5_core-$(CONFIG_MLX5_CORE_IPOIB) += ipoib/ipoib.o ipoib/ethtool.o ipoib/ipoib_vlan.o
 
 mlx5_core-$(CONFIG_MLX5_EN_IPSEC) += en_accel/ipsec.o en_accel/ipsec_rxtx.o \
 		en_accel/ipsec_stats.o
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
index 7e3bfe62ef6e..2c3f2e9b6983 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
@@ -1163,11 +1163,25 @@ static inline void mlx5i_complete_rx_cqe(struct mlx5e_rq *rq,
 					 u32 cqe_bcnt,
 					 struct sk_buff *skb)
 {
-	struct net_device *netdev = rq->netdev;
+	struct net_device *netdev;
 	char *pseudo_header;
+	u32 qpn;
 	u8 *dgid;
 	u8 g;
 
+	qpn = be32_to_cpu(cqe->sop_drop_qpn) & 0xffffff;
+	netdev = mlx5i_pkey_get_netdev(rq->netdev, qpn);
+
+	/* No mapping present, cannot process SKB. This might happen if a child
+	 * interface is going down while having unprocessed CQEs on parent RQ
+	 */
+	if (unlikely(!netdev)) {
+		/* TODO: add drop counters support */
+		skb->dev = NULL;
+		pr_warn_once("Unable to map QPN %u to dev - dropping skb\n", qpn);
+		return;
+	}
+
 	g = (be32_to_cpu(cqe->flags_rqpn) >> 28) & 3;
 	dgid = skb->data + MLX5_IB_GRH_DGID_OFFSET;
 	if ((!g) || dgid[0] != 0xff)
@@ -1230,6 +1244,10 @@ void mlx5i_handle_rx_cqe(struct mlx5e_rq *rq, struct mlx5_cqe64 *cqe)
 		goto wq_free_wqe;
 
 	mlx5i_complete_rx_cqe(rq, cqe, cqe_bcnt, skb);
+	if (unlikely(!skb->dev)) {
+		dev_kfree_skb_any(skb);
+		goto wq_free_wqe;
+	}
 	napi_gro_receive(rq->cq.napi, skb);
 
 wq_free_wqe:
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib.c b/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib.c
index 679c1f9af642..c479fe54a6ca 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib.c
@@ -382,6 +382,9 @@ static int mlx5i_dev_init(struct net_device *dev)
 	dev->dev_addr[2] = (ipriv->qp.qpn >>  8) & 0xff;
 	dev->dev_addr[3] = (ipriv->qp.qpn) & 0xff;
 
+	/* Add QPN to net-device mapping to HT */
+	mlx5i_pkey_add_qpn(dev ,ipriv->qp.qpn);
+
 	return 0;
 }
 
@@ -402,8 +405,12 @@ static int mlx5i_ioctl(struct net_device *dev, struct ifreq *ifr, int cmd)
 static void mlx5i_dev_cleanup(struct net_device *dev)
 {
 	struct mlx5e_priv    *priv   = mlx5i_epriv(dev);
+	struct mlx5i_priv    *ipriv = priv->ppriv;
 
 	mlx5i_uninit_underlay_qp(priv);
+
+	/* Delete QPN to net-device mapping from HT */
+	mlx5i_pkey_del_qpn(dev, ipriv->qp.qpn);
 }
 
 static int mlx5i_open(struct net_device *netdev)
@@ -590,6 +597,12 @@ struct net_device *mlx5_rdma_netdev_alloc(struct mlx5_core_dev *mdev,
 	if (!epriv->wq)
 		goto err_free_netdev;
 
+	err = mlx5i_pkey_qpn_ht_init(netdev);
+	if (err) {
+		mlx5_core_warn(mdev, "allocate qpn_to_netdev ht failed\n");
+		goto destroy_wq;
+	}
+
 	profile->init(mdev, netdev, profile, ipriv);
 
 	mlx5e_attach_netdev(epriv);
@@ -605,6 +618,8 @@ struct net_device *mlx5_rdma_netdev_alloc(struct mlx5_core_dev *mdev,
 
 	return netdev;
 
+destroy_wq:
+	destroy_workqueue(epriv->wq);
 err_free_netdev:
 	free_netdev(netdev);
 free_mdev_resources:
@@ -623,6 +638,7 @@ void mlx5_rdma_netdev_free(struct net_device *netdev)
 	mlx5e_detach_netdev(priv);
 	profile->cleanup(priv);
 	destroy_workqueue(priv->wq);
+	mlx5i_pkey_qpn_ht_cleanup(netdev);
 	free_netdev(netdev);
 
 	mlx5e_destroy_mdev_resources(mdev);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib.h b/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib.h
index 9a729883c3b3..e313f6d90729 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib.h
@@ -51,9 +51,21 @@ struct mlx5i_priv {
 	struct mlx5_core_qp qp;
 	u32    qkey;
 	u16    pkey_index;
+	struct mlx5i_pkey_qpn_ht *qpn_htbl;
 	char  *mlx5e_priv[0];
 };
 
+/* Allocate/Free underlay QPN to net-device hash table */
+int mlx5i_pkey_qpn_ht_init(struct net_device *netdev);
+void mlx5i_pkey_qpn_ht_cleanup(struct net_device *netdev);
+
+/* Add/Remove an underlay QPN to net-device mapping to/from the hash table */
+int mlx5i_pkey_add_qpn(struct net_device *netdev, u32 qpn);
+int mlx5i_pkey_del_qpn(struct net_device *netdev, u32 qpn);
+
+/* Get the net-device corresponding to the given underlay QPN */
+struct net_device *mlx5i_pkey_get_netdev(struct net_device *netdev, u32 qpn);
+
 /* Extract mlx5e_priv from IPoIB netdev */
 #define mlx5i_epriv(netdev) ((void *)(((struct mlx5i_priv *)netdev_priv(netdev))->mlx5e_priv))
 
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib_vlan.c b/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib_vlan.c
new file mode 100644
index 000000000000..e4d39aa1f552
--- /dev/null
+++ b/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib_vlan.c
@@ -0,0 +1,136 @@
+/*
+ * Copyright (c) 2017, Mellanox Technologies. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses.  You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ *     Redistribution and use in source and binary forms, with or
+ *     without modification, are permitted provided that the following
+ *     conditions are met:
+ *
+ *      - Redistributions of source code must retain the above
+ *        copyright notice, this list of conditions and the following
+ *        disclaimer.
+ *
+ *      - Redistributions in binary form must reproduce the above
+ *        copyright notice, this list of conditions and the following
+ *        disclaimer in the documentation and/or other materials
+ *        provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#include <linux/hash.h>
+#include "ipoib.h"
+
+#define MLX5I_MAX_LOG_PKEY_SUP 7
+
+struct qpn_to_netdev {
+	struct net_device *netdev;
+	struct hlist_node hlist;
+	u32 underlay_qpn;
+};
+
+struct mlx5i_pkey_qpn_ht {
+	struct hlist_head buckets[1 << MLX5I_MAX_LOG_PKEY_SUP];
+	spinlock_t ht_lock; /* Synchronise with NAPI */
+};
+
+int mlx5i_pkey_qpn_ht_init(struct net_device *netdev)
+{
+	struct mlx5i_priv *ipriv = netdev_priv(netdev);
+	struct mlx5i_pkey_qpn_ht *qpn_htbl;
+
+	qpn_htbl = kzalloc(sizeof(*qpn_htbl), GFP_KERNEL);
+	if (!qpn_htbl)
+		return -ENOMEM;
+
+	ipriv->qpn_htbl = qpn_htbl;
+	spin_lock_init(&qpn_htbl->ht_lock);
+
+	return 0;
+}
+
+void mlx5i_pkey_qpn_ht_cleanup(struct net_device *netdev)
+{
+	struct mlx5i_priv *ipriv = netdev_priv(netdev);
+
+	kfree(ipriv->qpn_htbl);
+}
+
+static struct qpn_to_netdev *mlx5i_find_qpn_to_netdev_node(struct hlist_head *buckets,
+							   u32 qpn)
+{
+	struct hlist_head *h = &buckets[hash_32(qpn, MLX5I_MAX_LOG_PKEY_SUP)];
+	struct qpn_to_netdev *node;
+
+	hlist_for_each_entry(node, h, hlist) {
+		if (node->underlay_qpn == qpn)
+			return node;
+	}
+
+	return NULL;
+}
+
+int mlx5i_pkey_add_qpn(struct net_device *netdev, u32 qpn)
+{
+	struct mlx5i_priv *ipriv = netdev_priv(netdev);
+	struct mlx5i_pkey_qpn_ht *ht = ipriv->qpn_htbl;
+	u8 key = hash_32(qpn, MLX5I_MAX_LOG_PKEY_SUP);
+	struct qpn_to_netdev *new_node;
+
+	new_node = kzalloc(sizeof(*new_node), GFP_KERNEL);
+	if (!new_node)
+		return -ENOMEM;
+
+	new_node->netdev = netdev;
+	new_node->underlay_qpn = qpn;
+	spin_lock_bh(&ht->ht_lock);
+	hlist_add_head(&new_node->hlist, &ht->buckets[key]);
+	spin_unlock_bh(&ht->ht_lock);
+
+	return 0;
+}
+
+int mlx5i_pkey_del_qpn(struct net_device *netdev, u32 qpn)
+{
+	struct mlx5e_priv *epriv = mlx5i_epriv(netdev);
+	struct mlx5i_priv *ipriv = epriv->ppriv;
+	struct mlx5i_pkey_qpn_ht *ht = ipriv->qpn_htbl;
+	struct qpn_to_netdev *node;
+
+	node = mlx5i_find_qpn_to_netdev_node(ht->buckets, qpn);
+	if (!node) {
+		mlx5_core_warn(epriv->mdev, "QPN to netdev delete from HT failed\n");
+		return -EINVAL;
+	}
+
+	spin_lock_bh(&ht->ht_lock);
+	hlist_del_init(&node->hlist);
+	spin_unlock_bh(&ht->ht_lock);
+	kfree(node);
+
+	return 0;
+}
+
+struct net_device *mlx5i_pkey_get_netdev(struct net_device *netdev, u32 qpn)
+{
+	struct mlx5i_priv *ipriv = netdev_priv(netdev);
+	struct qpn_to_netdev *node;
+
+	node = mlx5i_find_qpn_to_netdev_node(ipriv->qpn_htbl->buckets, qpn);
+	if (!node)
+		return NULL;
+
+	return node->netdev;
+}
-- 
2.14.2

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [for-next 09/12] net/mlx5e: IPoIB, Add PKEY child interface nic profile
  2017-10-14 18:48 [pull request][for-next 00/12] Mellanox, mlx5 IPoIB Muli Pkey support 2017-10-11 Saeed Mahameed
                   ` (4 preceding siblings ...)
  2017-10-14 18:48 ` [for-next 07/12] net/mlx5e: IPoIB, Support for setting PKEY index to underlay QP Saeed Mahameed
@ 2017-10-14 18:48 ` Saeed Mahameed
  2017-10-14 18:48 ` [for-next 10/12] net/mlx5e: IPoIB, Add PKEY child interface ndos Saeed Mahameed
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 20+ messages in thread
From: Saeed Mahameed @ 2017-10-14 18:48 UTC (permalink / raw)
  To: David S. Miller, Doug Ledford
  Cc: netdev, linux-rdma, Leon Romanovsky, Alex Vesker, Saeed Mahameed

From: Alex Vesker <valex@mellanox.com>

Child interface profile will be called to support child interface
specific behaviour. The child code is sparse compared to the parent
since the RX channels are shared between the interfaces.
Creating a septate profile for child and parent will make a smother
code with a better ability for future expansion.
The profile stuct is exposed to the parent using a getter function.

Signed-off-by: Alex Vesker <valex@mellanox.com>
Reviewed-by: Erez Shitrit <erezsh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 .../net/ethernet/mellanox/mlx5/core/ipoib/ipoib.c  | 12 ++--
 .../net/ethernet/mellanox/mlx5/core/ipoib/ipoib.h  | 13 ++++
 .../ethernet/mellanox/mlx5/core/ipoib/ipoib_vlan.c | 83 ++++++++++++++++++++++
 3 files changed, 102 insertions(+), 6 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib.c b/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib.c
index c479fe54a6ca..196771cc599e 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib.c
@@ -70,10 +70,10 @@ static void mlx5i_build_nic_params(struct mlx5_core_dev *mdev,
 }
 
 /* Called directly after IPoIB netdevice was created to initialize SW structs */
-static void mlx5i_init(struct mlx5_core_dev *mdev,
-		       struct net_device *netdev,
-		       const struct mlx5e_profile *profile,
-		       void *ppriv)
+void mlx5i_init(struct mlx5_core_dev *mdev,
+		struct net_device *netdev,
+		const struct mlx5e_profile *profile,
+		void *ppriv)
 {
 	struct mlx5e_priv *priv  = mlx5i_epriv(netdev);
 
@@ -169,7 +169,7 @@ static void mlx5i_uninit_underlay_qp(struct mlx5e_priv *priv)
 
 #define MLX5_QP_ENHANCED_ULP_STATELESS_MODE 2
 
-static int mlx5i_create_underlay_qp(struct mlx5_core_dev *mdev, struct mlx5_core_qp *qp)
+int mlx5i_create_underlay_qp(struct mlx5_core_dev *mdev, struct mlx5_core_qp *qp)
 {
 	u32 *in = NULL;
 	void *addr_path;
@@ -203,7 +203,7 @@ static int mlx5i_create_underlay_qp(struct mlx5_core_dev *mdev, struct mlx5_core
 	return ret;
 }
 
-static void mlx5i_destroy_underlay_qp(struct mlx5_core_dev *mdev, struct mlx5_core_qp *qp)
+void mlx5i_destroy_underlay_qp(struct mlx5_core_dev *mdev, struct mlx5_core_qp *qp)
 {
 	mlx5_core_destroy_qp(mdev, qp);
 }
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib.h b/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib.h
index e313f6d90729..c9895f7a2358 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib.h
@@ -55,6 +55,10 @@ struct mlx5i_priv {
 	char  *mlx5e_priv[0];
 };
 
+/* Underlay QP create/destroy functions */
+int mlx5i_create_underlay_qp(struct mlx5_core_dev *mdev, struct mlx5_core_qp *qp);
+void mlx5i_destroy_underlay_qp(struct mlx5_core_dev *mdev, struct mlx5_core_qp *qp);
+
 /* Allocate/Free underlay QPN to net-device hash table */
 int mlx5i_pkey_qpn_ht_init(struct net_device *netdev);
 void mlx5i_pkey_qpn_ht_cleanup(struct net_device *netdev);
@@ -66,6 +70,15 @@ int mlx5i_pkey_del_qpn(struct net_device *netdev, u32 qpn);
 /* Get the net-device corresponding to the given underlay QPN */
 struct net_device *mlx5i_pkey_get_netdev(struct net_device *netdev, u32 qpn);
 
+/* Parent profile functions */
+void mlx5i_init(struct mlx5_core_dev *mdev,
+		struct net_device *netdev,
+		const struct mlx5e_profile *profile,
+		void *ppriv);
+
+/* Get child interface nic profile */
+const struct mlx5e_profile *mlx5i_pkey_get_profile(void);
+
 /* Extract mlx5e_priv from IPoIB netdev */
 #define mlx5i_epriv(netdev) ((void *)(((struct mlx5i_priv *)netdev_priv(netdev))->mlx5e_priv))
 
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib_vlan.c b/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib_vlan.c
index e4d39aa1f552..17c508d98dbb 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib_vlan.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib_vlan.c
@@ -134,3 +134,86 @@ struct net_device *mlx5i_pkey_get_netdev(struct net_device *netdev, u32 qpn)
 
 	return node->netdev;
 }
+
+/* Called directly after IPoIB netdevice was created to initialize SW structs */
+static void mlx5i_pkey_init(struct mlx5_core_dev *mdev,
+			     struct net_device *netdev,
+			     const struct mlx5e_profile *profile,
+			     void *ppriv)
+{
+	struct mlx5e_priv *priv  = mlx5i_epriv(netdev);
+
+	mlx5i_init(mdev, netdev, profile, ppriv);
+
+	/* Override parent ndo */
+	netdev->netdev_ops = NULL;
+
+	/* Currently no ethtool support */
+	netdev->ethtool_ops = NULL;
+
+	/* Use dummy rqs */
+	priv->channels.params.log_rq_size = MLX5E_PARAMS_MINIMUM_LOG_RQ_SIZE;
+}
+
+/* Called directly before IPoIB netdevice is destroyed to cleanup SW structs */
+static void mlx5i_pkey_cleanup(struct mlx5e_priv *priv)
+{
+	/* Do nothing .. */
+}
+
+static int mlx5i_pkey_init_tx(struct mlx5e_priv *priv)
+{
+	struct mlx5i_priv *ipriv = priv->ppriv;
+	int err;
+
+	err = mlx5i_create_underlay_qp(priv->mdev, &ipriv->qp);
+	if (err) {
+		mlx5_core_warn(priv->mdev, "create child underlay QP failed, %d\n", err);
+		return err;
+	}
+
+	return 0;
+}
+
+static void mlx5i_pkey_cleanup_tx(struct mlx5e_priv *priv)
+{
+	struct mlx5i_priv *ipriv = priv->ppriv;
+
+	mlx5i_destroy_underlay_qp(priv->mdev, &ipriv->qp);
+}
+
+static int mlx5i_pkey_init_rx(struct mlx5e_priv *priv)
+{
+	/* Since the rx resources are shared between child and parent, the
+	 * parent interface is taking care of rx resource allocation and init
+	 */
+	return 0;
+}
+
+static void mlx5i_pkey_cleanup_rx(struct mlx5e_priv *priv)
+{
+	/* Since the rx resources are shared between child and parent, the
+	 * parent interface is taking care of rx resource free and de-init
+	 */
+}
+
+static const struct mlx5e_profile mlx5i_pkey_nic_profile = {
+	.init		   = mlx5i_pkey_init,
+	.cleanup	   = mlx5i_pkey_cleanup,
+	.init_tx	   = mlx5i_pkey_init_tx,
+	.cleanup_tx	   = mlx5i_pkey_cleanup_tx,
+	.init_rx	   = mlx5i_pkey_init_rx,
+	.cleanup_rx	   = mlx5i_pkey_cleanup_rx,
+	.enable		   = NULL,
+	.disable	   = NULL,
+	.update_stats	   = NULL,
+	.max_nch	   = mlx5e_get_max_num_channels,
+	.rx_handlers.handle_rx_cqe       = mlx5i_handle_rx_cqe,
+	.rx_handlers.handle_rx_cqe_mpwqe = NULL, /* Not supported */
+	.max_tc		   = MLX5I_MAX_NUM_TC,
+};
+
+const struct mlx5e_profile *mlx5i_pkey_get_profile(void)
+{
+	return &mlx5i_pkey_nic_profile;
+}
-- 
2.14.2

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [for-next 10/12] net/mlx5e: IPoIB, Add PKEY child interface ndos
  2017-10-14 18:48 [pull request][for-next 00/12] Mellanox, mlx5 IPoIB Muli Pkey support 2017-10-11 Saeed Mahameed
                   ` (5 preceding siblings ...)
  2017-10-14 18:48 ` [for-next 09/12] net/mlx5e: IPoIB, Add PKEY child interface nic profile Saeed Mahameed
@ 2017-10-14 18:48 ` Saeed Mahameed
  2017-10-14 18:48 ` [for-next 12/12] net/mlx5e: IPoIB, Modify rdma netdev allocate and free to support PKEY Saeed Mahameed
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 20+ messages in thread
From: Saeed Mahameed @ 2017-10-14 18:48 UTC (permalink / raw)
  To: David S. Miller, Doug Ledford
  Cc: netdev, linux-rdma, Leon Romanovsky, Alex Vesker, Saeed Mahameed

From: Alex Vesker <valex@mellanox.com>

Child interface ndos will be called to support child interface
specific behaviour.

ndo_init flow:
-Acquire shared QPN to net-device HT from parent
-Continue with the same flow as parent interface

ndo_open flow:
-Initialize child underlay QP and connect to shared FT
-Create child send TIS
-Open child send channels

Signed-off-by: Alex Vesker <valex@mellanox.com>
Reviewed-by: Erez Shitrit <erezsh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 .../net/ethernet/mellanox/mlx5/core/ipoib/ipoib.c  |  10 +-
 .../net/ethernet/mellanox/mlx5/core/ipoib/ipoib.h  |   8 ++
 .../ethernet/mellanox/mlx5/core/ipoib/ipoib_vlan.c | 133 ++++++++++++++++++++-
 3 files changed, 144 insertions(+), 7 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib.c b/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib.c
index 196771cc599e..70706eb70d3e 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib.c
@@ -40,8 +40,6 @@
 
 static int mlx5i_open(struct net_device *netdev);
 static int mlx5i_close(struct net_device *netdev);
-static int  mlx5i_dev_init(struct net_device *dev);
-static void mlx5i_dev_cleanup(struct net_device *dev);
 static int mlx5i_change_mtu(struct net_device *netdev, int new_mtu);
 static int mlx5i_ioctl(struct net_device *dev, struct ifreq *ifr, int cmd);
 
@@ -108,7 +106,7 @@ static void mlx5i_cleanup(struct mlx5e_priv *priv)
 	/* Do nothing .. */
 }
 
-static int mlx5i_init_underlay_qp(struct mlx5e_priv *priv)
+int mlx5i_init_underlay_qp(struct mlx5e_priv *priv)
 {
 	struct mlx5_core_dev *mdev = priv->mdev;
 	struct mlx5i_priv *ipriv = priv->ppriv;
@@ -154,7 +152,7 @@ static int mlx5i_init_underlay_qp(struct mlx5e_priv *priv)
 	return ret;
 }
 
-static void mlx5i_uninit_underlay_qp(struct mlx5e_priv *priv)
+void mlx5i_uninit_underlay_qp(struct mlx5e_priv *priv)
 {
 	struct mlx5i_priv *ipriv = priv->ppriv;
 	struct mlx5_core_dev *mdev = priv->mdev;
@@ -372,7 +370,7 @@ static int mlx5i_change_mtu(struct net_device *netdev, int new_mtu)
 	return err;
 }
 
-static int mlx5i_dev_init(struct net_device *dev)
+int mlx5i_dev_init(struct net_device *dev)
 {
 	struct mlx5e_priv    *priv   = mlx5i_epriv(dev);
 	struct mlx5i_priv    *ipriv  = priv->ppriv;
@@ -402,7 +400,7 @@ static int mlx5i_ioctl(struct net_device *dev, struct ifreq *ifr, int cmd)
 	}
 }
 
-static void mlx5i_dev_cleanup(struct net_device *dev)
+void mlx5i_dev_cleanup(struct net_device *dev)
 {
 	struct mlx5e_priv    *priv   = mlx5i_epriv(dev);
 	struct mlx5i_priv    *ipriv = priv->ppriv;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib.h b/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib.h
index c9895f7a2358..80c0cfee7164 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib.h
@@ -59,6 +59,10 @@ struct mlx5i_priv {
 int mlx5i_create_underlay_qp(struct mlx5_core_dev *mdev, struct mlx5_core_qp *qp);
 void mlx5i_destroy_underlay_qp(struct mlx5_core_dev *mdev, struct mlx5_core_qp *qp);
 
+/* Underlay QP state modification init/uninit functions */
+int mlx5i_init_underlay_qp(struct mlx5e_priv *priv);
+void mlx5i_uninit_underlay_qp(struct mlx5e_priv *priv);
+
 /* Allocate/Free underlay QPN to net-device hash table */
 int mlx5i_pkey_qpn_ht_init(struct net_device *netdev);
 void mlx5i_pkey_qpn_ht_cleanup(struct net_device *netdev);
@@ -70,6 +74,10 @@ int mlx5i_pkey_del_qpn(struct net_device *netdev, u32 qpn);
 /* Get the net-device corresponding to the given underlay QPN */
 struct net_device *mlx5i_pkey_get_netdev(struct net_device *netdev, u32 qpn);
 
+/* Shared ndo functionts */
+int mlx5i_dev_init(struct net_device *dev);
+void mlx5i_dev_cleanup(struct net_device *dev);
+
 /* Parent profile functions */
 void mlx5i_init(struct mlx5_core_dev *mdev,
 		struct net_device *netdev,
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib_vlan.c b/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib_vlan.c
index 17c508d98dbb..d99bec6855de 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib_vlan.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib_vlan.c
@@ -135,6 +135,137 @@ struct net_device *mlx5i_pkey_get_netdev(struct net_device *netdev, u32 qpn)
 	return node->netdev;
 }
 
+static int mlx5i_pkey_open(struct net_device *netdev);
+static int mlx5i_pkey_close(struct net_device *netdev);
+static int mlx5i_pkey_dev_init(struct net_device *dev);
+static void mlx5i_pkey_dev_cleanup(struct net_device *netdev);
+static int mlx5i_pkey_change_mtu(struct net_device *netdev, int new_mtu);
+
+static const struct net_device_ops mlx5i_pkey_netdev_ops = {
+	.ndo_open                = mlx5i_pkey_open,
+	.ndo_stop                = mlx5i_pkey_close,
+	.ndo_init                = mlx5i_pkey_dev_init,
+	.ndo_uninit              = mlx5i_pkey_dev_cleanup,
+	.ndo_change_mtu          = mlx5i_pkey_change_mtu,
+};
+
+/* Child NDOs */
+static int mlx5i_pkey_dev_init(struct net_device *dev)
+{
+	struct mlx5e_priv *priv = mlx5i_epriv(dev);
+	struct mlx5i_priv *ipriv, *parent_ipriv;
+	struct net_device *parent_dev;
+	int parent_ifindex;
+
+	ipriv = priv->ppriv;
+
+	/* Get QPN to netdevice hash table from parent */
+	parent_ifindex = dev->netdev_ops->ndo_get_iflink(dev);
+	parent_dev = dev_get_by_index(dev_net(dev), parent_ifindex);
+	if (!parent_dev) {
+		mlx5_core_warn(priv->mdev, "failed to get parent device\n");
+		return -EINVAL;
+	}
+
+	parent_ipriv = netdev_priv(parent_dev);
+	ipriv->qpn_htbl = parent_ipriv->qpn_htbl;
+	dev_put(parent_dev);
+
+	return mlx5i_dev_init(dev);
+}
+
+static void mlx5i_pkey_dev_cleanup(struct net_device *netdev)
+{
+	return mlx5i_dev_cleanup(netdev);
+}
+
+static int mlx5i_pkey_open(struct net_device *netdev)
+{
+	struct mlx5e_priv *epriv = mlx5i_epriv(netdev);
+	struct mlx5i_priv *ipriv = epriv->ppriv;
+	struct mlx5_core_dev *mdev = epriv->mdev;
+	int err;
+
+	mutex_lock(&epriv->state_lock);
+
+	set_bit(MLX5E_STATE_OPENED, &epriv->state);
+
+	err = mlx5i_init_underlay_qp(epriv);
+	if (err) {
+		mlx5_core_warn(mdev, "prepare child underlay qp state failed, %d\n", err);
+		goto err_release_lock;
+	}
+
+	err = mlx5_fs_add_rx_underlay_qpn(mdev, ipriv->qp.qpn);
+	if (err) {
+		mlx5_core_warn(mdev, "attach child underlay qp to ft failed, %d\n", err);
+		goto err_unint_underlay_qp;
+	}
+
+	err = mlx5e_create_tis(mdev, 0 /* tc */, ipriv->qp.qpn, &epriv->tisn[0]);
+	if (err) {
+		mlx5_core_warn(mdev, "create child tis failed, %d\n", err);
+		goto err_remove_rx_uderlay_qp;
+	}
+
+	err = mlx5e_open_channels(epriv, &epriv->channels);
+	if (err) {
+		mlx5_core_warn(mdev, "opening child channels failed, %d\n", err);
+		goto err_clear_state_opened_flag;
+	}
+	mlx5e_refresh_tirs(epriv, false);
+	mlx5e_activate_priv_channels(epriv);
+	mutex_unlock(&epriv->state_lock);
+
+	return 0;
+
+err_clear_state_opened_flag:
+	mlx5e_destroy_tis(mdev, epriv->tisn[0]);
+err_remove_rx_uderlay_qp:
+	mlx5_fs_remove_rx_underlay_qpn(mdev, ipriv->qp.qpn);
+err_unint_underlay_qp:
+	mlx5i_uninit_underlay_qp(epriv);
+err_release_lock:
+	clear_bit(MLX5E_STATE_OPENED, &epriv->state);
+	mutex_unlock(&epriv->state_lock);
+	return err;
+}
+
+static int mlx5i_pkey_close(struct net_device *netdev)
+{
+	struct mlx5e_priv *priv = mlx5i_epriv(netdev);
+	struct mlx5i_priv *ipriv = priv->ppriv;
+	struct mlx5_core_dev *mdev = priv->mdev;
+
+	mutex_lock(&priv->state_lock);
+
+	if (!test_bit(MLX5E_STATE_OPENED, &priv->state))
+		goto unlock;
+
+	clear_bit(MLX5E_STATE_OPENED, &priv->state);
+
+	netif_carrier_off(priv->netdev);
+	mlx5_fs_remove_rx_underlay_qpn(mdev, ipriv->qp.qpn);
+	mlx5i_uninit_underlay_qp(priv);
+	mlx5e_deactivate_priv_channels(priv);
+	mlx5e_close_channels(&priv->channels);
+	mlx5e_destroy_tis(mdev, priv->tisn[0]);
+unlock:
+	mutex_unlock(&priv->state_lock);
+	return 0;
+}
+
+static int mlx5i_pkey_change_mtu(struct net_device *netdev, int new_mtu)
+{
+	struct mlx5e_priv *priv = mlx5i_epriv(netdev);
+
+	mutex_lock(&priv->state_lock);
+	netdev->mtu = new_mtu;
+	mutex_unlock(&priv->state_lock);
+
+	return 0;
+}
+
 /* Called directly after IPoIB netdevice was created to initialize SW structs */
 static void mlx5i_pkey_init(struct mlx5_core_dev *mdev,
 			     struct net_device *netdev,
@@ -146,7 +277,7 @@ static void mlx5i_pkey_init(struct mlx5_core_dev *mdev,
 	mlx5i_init(mdev, netdev, profile, ppriv);
 
 	/* Override parent ndo */
-	netdev->netdev_ops = NULL;
+	netdev->netdev_ops = &mlx5i_pkey_netdev_ops;
 
 	/* Currently no ethtool support */
 	netdev->ethtool_ops = NULL;
-- 
2.14.2

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [for-next 11/12] net/mlx5e: IPoIB, Add PKEY child interface ethtool ops
       [not found] ` <20171014184827.18491-1-saeedm-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
                     ` (2 preceding siblings ...)
  2017-10-14 18:48   ` [for-next 08/12] net/mlx5e: IPoIB, Use hash-table to map between QPN to child netdev Saeed Mahameed
@ 2017-10-14 18:48   ` Saeed Mahameed
  2017-10-14 21:19   ` [pull request][for-next 00/12] Mellanox, mlx5 IPoIB Muli Pkey support 2017-10-11 Doug Ledford
  4 siblings, 0 replies; 20+ messages in thread
From: Saeed Mahameed @ 2017-10-14 18:48 UTC (permalink / raw)
  To: David S. Miller, Doug Ledford
  Cc: netdev-u79uwXL29TY76Z2rM5mHXA, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	Leon Romanovsky, Alex Vesker, Saeed Mahameed

From: Alex Vesker <valex-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>

Similar to VLAN interfaces child interfaces have limited ethtool
support. In current code the main limitation that does not
allow child interface ethtool configuration is due to shared
resources which are managed by the parent.

Signed-off-by: Alex Vesker <valex-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Reviewed-by: Erez Shitrit <erezsh-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Signed-off-by: Saeed Mahameed <saeedm-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
 drivers/net/ethernet/mellanox/mlx5/core/ipoib/ethtool.c    | 5 +++++
 drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib.h      | 1 +
 drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib_vlan.c | 4 ++--
 3 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ethtool.c b/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ethtool.c
index 43c126c63955..6f338a9219c8 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ethtool.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ethtool.c
@@ -250,3 +250,8 @@ const struct ethtool_ops mlx5i_ethtool_ops = {
 	.get_link_ksettings = mlx5i_get_link_ksettings,
 	.get_link           = ethtool_op_get_link,
 };
+
+const struct ethtool_ops mlx5i_pkey_ethtool_ops = {
+	.get_drvinfo        = mlx5i_get_drvinfo,
+	.get_link           = ethtool_op_get_link,
+};
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib.h b/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib.h
index 80c0cfee7164..a50c1a19550e 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib.h
@@ -39,6 +39,7 @@
 #define MLX5I_MAX_NUM_TC 1
 
 extern const struct ethtool_ops mlx5i_ethtool_ops;
+extern const struct ethtool_ops mlx5i_pkey_ethtool_ops;
 
 #define MLX5_IB_GRH_BYTES       40
 #define MLX5_IPOIB_ENCAP_LEN    4
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib_vlan.c b/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib_vlan.c
index d99bec6855de..531b02cc979b 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib_vlan.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib_vlan.c
@@ -279,8 +279,8 @@ static void mlx5i_pkey_init(struct mlx5_core_dev *mdev,
 	/* Override parent ndo */
 	netdev->netdev_ops = &mlx5i_pkey_netdev_ops;
 
-	/* Currently no ethtool support */
-	netdev->ethtool_ops = NULL;
+	/* Set child limited ethtool support */
+	netdev->ethtool_ops = &mlx5i_pkey_ethtool_ops;
 
 	/* Use dummy rqs */
 	priv->channels.params.log_rq_size = MLX5E_PARAMS_MINIMUM_LOG_RQ_SIZE;
-- 
2.14.2

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [for-next 12/12] net/mlx5e: IPoIB, Modify rdma netdev allocate and free to support PKEY
  2017-10-14 18:48 [pull request][for-next 00/12] Mellanox, mlx5 IPoIB Muli Pkey support 2017-10-11 Saeed Mahameed
                   ` (6 preceding siblings ...)
  2017-10-14 18:48 ` [for-next 10/12] net/mlx5e: IPoIB, Add PKEY child interface ndos Saeed Mahameed
@ 2017-10-14 18:48 ` Saeed Mahameed
       [not found] ` <20171014184827.18491-1-saeedm-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  2017-10-14 22:51 ` Parav Pandit
  9 siblings, 0 replies; 20+ messages in thread
From: Saeed Mahameed @ 2017-10-14 18:48 UTC (permalink / raw)
  To: David S. Miller, Doug Ledford
  Cc: netdev, linux-rdma, Leon Romanovsky, Alex Vesker, Saeed Mahameed

From: Alex Vesker <valex@mellanox.com>

Resources such as FT, QPN HT and mdev resources should be allocated
only by parent netdev. Shared resources are allocated and freed by the
parent interface since the parent is always present and created
before the IPoIB PKEY sub-interface.

Signed-off-by: Alex Vesker <valex@mellanox.com>
Reviewed-by: Erez Shitrit <erezsh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 .../net/ethernet/mellanox/mlx5/core/en_common.c    |  1 +
 .../net/ethernet/mellanox/mlx5/core/ipoib/ipoib.c  | 52 ++++++++++++++--------
 .../net/ethernet/mellanox/mlx5/core/ipoib/ipoib.h  |  1 +
 3 files changed, 36 insertions(+), 18 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_common.c b/drivers/net/ethernet/mellanox/mlx5/core/en_common.c
index ece3fb147e3e..157d02917237 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_common.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_common.c
@@ -134,6 +134,7 @@ void mlx5e_destroy_mdev_resources(struct mlx5_core_dev *mdev)
 	mlx5_core_destroy_mkey(mdev, &res->mkey);
 	mlx5_core_dealloc_transport_domain(mdev, res->td.tdn);
 	mlx5_core_dealloc_pd(mdev, res->pdn);
+	memset(res, 0, sizeof(*res));
 }
 
 int mlx5e_refresh_tirs(struct mlx5e_priv *priv, bool enable_uc_lb)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib.c b/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib.c
index 70706eb70d3e..abf270d7f556 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib.c
@@ -560,12 +560,13 @@ struct net_device *mlx5_rdma_netdev_alloc(struct mlx5_core_dev *mdev,
 					  const char *name,
 					  void (*setup)(struct net_device *))
 {
-	const struct mlx5e_profile *profile = &mlx5i_nic_profile;
-	int nch = profile->max_nch(mdev);
+	const struct mlx5e_profile *profile;
 	struct net_device *netdev;
 	struct mlx5i_priv *ipriv;
 	struct mlx5e_priv *epriv;
 	struct rdma_netdev *rn;
+	bool sub_interface;
+	int nch;
 	int err;
 
 	if (mlx5i_check_required_hca_cap(mdev)) {
@@ -573,10 +574,15 @@ struct net_device *mlx5_rdma_netdev_alloc(struct mlx5_core_dev *mdev,
 		return ERR_PTR(-EOPNOTSUPP);
 	}
 
-	/* This function should only be called once per mdev */
-	err = mlx5e_create_mdev_resources(mdev);
-	if (err)
-		return NULL;
+	/* TODO: Need to find a better way to check if child device*/
+	sub_interface = (mdev->mlx5e_res.pdn != 0);
+
+	if (sub_interface)
+		profile = mlx5i_pkey_get_profile();
+	else
+		profile = &mlx5i_nic_profile;
+
+	nch = profile->max_nch(mdev);
 
 	netdev = alloc_netdev_mqs(sizeof(struct mlx5i_priv) + sizeof(struct mlx5e_priv),
 				  name, NET_NAME_UNKNOWN,
@@ -585,7 +591,7 @@ struct net_device *mlx5_rdma_netdev_alloc(struct mlx5_core_dev *mdev,
 				  nch);
 	if (!netdev) {
 		mlx5_core_warn(mdev, "alloc_netdev_mqs failed\n");
-		goto free_mdev_resources;
+		return NULL;
 	}
 
 	ipriv = netdev_priv(netdev);
@@ -595,10 +601,18 @@ struct net_device *mlx5_rdma_netdev_alloc(struct mlx5_core_dev *mdev,
 	if (!epriv->wq)
 		goto err_free_netdev;
 
-	err = mlx5i_pkey_qpn_ht_init(netdev);
-	if (err) {
-		mlx5_core_warn(mdev, "allocate qpn_to_netdev ht failed\n");
-		goto destroy_wq;
+	ipriv->sub_interface = sub_interface;
+	if (!ipriv->sub_interface) {
+		err = mlx5i_pkey_qpn_ht_init(netdev);
+		if (err) {
+			mlx5_core_warn(mdev, "allocate qpn_to_netdev ht failed\n");
+			goto destroy_wq;
+		}
+
+		/* This should only be called once per mdev */
+		err = mlx5e_create_mdev_resources(mdev);
+		if (err)
+			goto destroy_ht;
 	}
 
 	profile->init(mdev, netdev, profile, ipriv);
@@ -616,12 +630,12 @@ struct net_device *mlx5_rdma_netdev_alloc(struct mlx5_core_dev *mdev,
 
 	return netdev;
 
+destroy_ht:
+	mlx5i_pkey_qpn_ht_cleanup(netdev);
 destroy_wq:
 	destroy_workqueue(epriv->wq);
 err_free_netdev:
 	free_netdev(netdev);
-free_mdev_resources:
-	mlx5e_destroy_mdev_resources(mdev);
 
 	return NULL;
 }
@@ -629,16 +643,18 @@ EXPORT_SYMBOL(mlx5_rdma_netdev_alloc);
 
 void mlx5_rdma_netdev_free(struct net_device *netdev)
 {
-	struct mlx5e_priv          *priv    = mlx5i_epriv(netdev);
+	struct mlx5e_priv *priv = mlx5i_epriv(netdev);
+	struct mlx5i_priv *ipriv = priv->ppriv;
 	const struct mlx5e_profile *profile = priv->profile;
-	struct mlx5_core_dev       *mdev    = priv->mdev;
 
 	mlx5e_detach_netdev(priv);
 	profile->cleanup(priv);
 	destroy_workqueue(priv->wq);
-	mlx5i_pkey_qpn_ht_cleanup(netdev);
-	free_netdev(netdev);
 
-	mlx5e_destroy_mdev_resources(mdev);
+	if (!ipriv->sub_interface) {
+		mlx5i_pkey_qpn_ht_cleanup(netdev);
+		mlx5e_destroy_mdev_resources(priv->mdev);
+	}
+	free_netdev(netdev);
 }
 EXPORT_SYMBOL(mlx5_rdma_netdev_free);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib.h b/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib.h
index a50c1a19550e..49008022c306 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib.h
@@ -50,6 +50,7 @@ extern const struct ethtool_ops mlx5i_pkey_ethtool_ops;
 struct mlx5i_priv {
 	struct rdma_netdev rn; /* keep this first */
 	struct mlx5_core_qp qp;
+	bool   sub_interface;
 	u32    qkey;
 	u16    pkey_index;
 	struct mlx5i_pkey_qpn_ht *qpn_htbl;
-- 
2.14.2

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [pull request][for-next 00/12] Mellanox, mlx5 IPoIB Muli Pkey support 2017-10-11
       [not found] ` <20171014184827.18491-1-saeedm-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
                     ` (3 preceding siblings ...)
  2017-10-14 18:48   ` [for-next 11/12] net/mlx5e: IPoIB, Add PKEY child interface ethtool ops Saeed Mahameed
@ 2017-10-14 21:19   ` Doug Ledford
  2017-10-15  6:30     ` Leon Romanovsky
  2017-10-16  4:45     ` [pull request][for-next 00/12] Mellanox, mlx5 IPoIB Muli Pkey support 2017-10-11,Re: " David Miller
  4 siblings, 2 replies; 20+ messages in thread
From: Doug Ledford @ 2017-10-14 21:19 UTC (permalink / raw)
  To: Saeed Mahameed, David S. Miller
  Cc: netdev-u79uwXL29TY76Z2rM5mHXA, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	Leon Romanovsky


[-- Attachment #1.1: Type: text/plain, Size: 1229 bytes --]

On 10/14/2017 2:48 PM, Saeed Mahameed wrote:
> Hi Dave and Doug,
> 
> This series includes updates for mlx5 IPoIB offloading driver from Alex
> and Feras to add the support for Muli Pkey in the mlx5i ipoib offloading netdev,
> to be merged into net-next and rdma-next trees.

As far as the two IPoIB patches are concerned, they're fine.

> Doug, I am sorry I couldn't base this on rc2 since the series needs and conflicts
> with a fix that was submitted to rc3, so to keep things simple I based it on rc4,
> I hope this is ok with you..

No worries, it just means I have to submit it under another branch.  But
I'm already holding one patch series in a stand alone branch, so no big
deal.  And, actually, the IPoIB changes are so small they can simply go
through Dave's tree if you don't have any dependent code in the IPoIB
driver to submit after this but still in this devel cycle.

> Please pull and let me know if there's any problem.

Once I hear that Dave is OK with the net changes, I'm ready to pull (if
I need to).

-- 
Doug Ledford <dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
    GPG Key ID: B826A3330E572FDD
    Key fingerprint = AE6B 1BDA 122B 23B4 265B  1274 B826 A333 0E57 2FDD


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 884 bytes --]

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [pull request][for-next 00/12] Mellanox, mlx5 IPoIB Muli Pkey support 2017-10-11
  2017-10-14 18:48 [pull request][for-next 00/12] Mellanox, mlx5 IPoIB Muli Pkey support 2017-10-11 Saeed Mahameed
                   ` (8 preceding siblings ...)
       [not found] ` <20171014184827.18491-1-saeedm-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
@ 2017-10-14 22:51 ` Parav Pandit
  9 siblings, 0 replies; 20+ messages in thread
From: Parav Pandit @ 2017-10-14 22:51 UTC (permalink / raw)
  To: Saeed Mahameed
  Cc: David S. Miller, Doug Ledford, netdev, linux-rdma,
	Leon Romanovsky

On Sat, Oct 14, 2017 at 1:48 PM, Saeed Mahameed <saeedm@mellanox.com> wrote:
> Hi Dave and Doug,
>
> This series includes updates for mlx5 IPoIB offloading driver from Alex
> and Feras to add the support for Muli Pkey in the mlx5i ipoib offloading netdev,

In description and cover letter subject, here
s/Muli/Multi

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [pull request][for-next 00/12] Mellanox, mlx5 IPoIB Muli Pkey support 2017-10-11
  2017-10-14 21:19   ` [pull request][for-next 00/12] Mellanox, mlx5 IPoIB Muli Pkey support 2017-10-11 Doug Ledford
@ 2017-10-15  6:30     ` Leon Romanovsky
  2017-10-16  4:45     ` [pull request][for-next 00/12] Mellanox, mlx5 IPoIB Muli Pkey support 2017-10-11,Re: " David Miller
  1 sibling, 0 replies; 20+ messages in thread
From: Leon Romanovsky @ 2017-10-15  6:30 UTC (permalink / raw)
  To: Doug Ledford; +Cc: Saeed Mahameed, David S. Miller, netdev, linux-rdma

[-- Attachment #1: Type: text/plain, Size: 1358 bytes --]

On Sat, Oct 14, 2017 at 05:19:34PM -0400, Doug Ledford wrote:
> On 10/14/2017 2:48 PM, Saeed Mahameed wrote:
> > Hi Dave and Doug,
> >
> > This series includes updates for mlx5 IPoIB offloading driver from Alex
> > and Feras to add the support for Muli Pkey in the mlx5i ipoib offloading netdev,
> > to be merged into net-next and rdma-next trees.
>
> As far as the two IPoIB patches are concerned, they're fine.
>
> > Doug, I am sorry I couldn't base this on rc2 since the series needs and conflicts
> > with a fix that was submitted to rc3, so to keep things simple I based it on rc4,
> > I hope this is ok with you..
>
> No worries, it just means I have to submit it under another branch.  But
> I'm already holding one patch series in a stand alone branch, so no big
> deal.  And, actually, the IPoIB changes are so small they can simply go
> through Dave's tree if you don't have any dependent code in the IPoIB
> driver to submit after this but still in this devel cycle.

I have IPoIB NAPI implementation waiting for this shared code.

Thanks

>
> > Please pull and let me know if there's any problem.
>
> Once I hear that Dave is OK with the net changes, I'm ready to pull (if
> I need to).
>
> --
> Doug Ledford <dledford@redhat.com>
>     GPG Key ID: B826A3330E572FDD
>     Key fingerprint = AE6B 1BDA 122B 23B4 265B  1274 B826 A333 0E57 2FDD
>




[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [pull request][for-next 00/12] Mellanox, mlx5 IPoIB Muli Pkey support 2017-10-11,Re: [pull request][for-next 00/12] Mellanox, mlx5 IPoIB Muli Pkey support 2017-10-11
  2017-10-14 21:19   ` [pull request][for-next 00/12] Mellanox, mlx5 IPoIB Muli Pkey support 2017-10-11 Doug Ledford
  2017-10-15  6:30     ` Leon Romanovsky
@ 2017-10-16  4:45     ` David Miller
       [not found]       ` <20171016.054503.459455425504112937.davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>
  1 sibling, 1 reply; 20+ messages in thread
From: David Miller @ 2017-10-16  4:45 UTC (permalink / raw)
  To: dledford; +Cc: saeedm, netdev, linux-rdma, leonro

From: Doug Ledford <dledford@redhat.com>
Date: Sat, 14 Oct 2017 17:19:34 -0400

> On 10/14/2017 2:48 PM, Saeed Mahameed wrote:
>> Hi Dave and Doug,
>> 
>> This series includes updates for mlx5 IPoIB offloading driver from Alex
>> and Feras to add the support for Muli Pkey in the mlx5i ipoib offloading netdev,
>> to be merged into net-next and rdma-next trees.
> 
> As far as the two IPoIB patches are concerned, they're fine.
> 
>> Doug, I am sorry I couldn't base this on rc2 since the series needs and conflicts
>> with a fix that was submitted to rc3, so to keep things simple I based it on rc4,
>> I hope this is ok with you..
> 
> No worries, it just means I have to submit it under another branch.  But
> I'm already holding one patch series in a stand alone branch, so no big
> deal.  And, actually, the IPoIB changes are so small they can simply go
> through Dave's tree if you don't have any dependent code in the IPoIB
> driver to submit after this but still in this devel cycle.
> 
>> Please pull and let me know if there's any problem.
> 
> Once I hear that Dave is OK with the net changes, I'm ready to pull (if
> I need to).

They look fine and I've pulled this into net-next.

Thanks.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [pull request][for-next 00/12] Mellanox, mlx5 IPoIB Muli Pkey support 2017-10-11,Re: [pull request][for-next 00/12] Mellanox, mlx5 IPoIB Muli Pkey support 2017-10-11
       [not found]       ` <20171016.054503.459455425504112937.davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>
@ 2017-10-18 15:08         ` Doug Ledford
  0 siblings, 0 replies; 20+ messages in thread
From: Doug Ledford @ 2017-10-18 15:08 UTC (permalink / raw)
  To: David Miller
  Cc: saeedm-VPRAkNaXOzVWk0Htik3J/w, netdev-u79uwXL29TY76Z2rM5mHXA,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, leonro-VPRAkNaXOzVWk0Htik3J/w

On Mon, 2017-10-16 at 05:45 +0100, David Miller wrote:
> > Once I hear that Dave is OK with the net changes, I'm ready to pull
> > (if
> > I need to).
> 
> They look fine and I've pulled this into net-next.

Thanks, pulled here as well.

-- 
Doug Ledford <dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
    GPG KeyID: B826A3330E572FDD
    Key fingerprint = AE6B 1BDA 122B 23B4 265B  1274 B826 A333 0E57 2FDD

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [for-next 08/12] net/mlx5e: IPoIB, Use hash-table to map between QPN to child netdev
       [not found]     ` <20171014184827.18491-9-saeedm-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
@ 2017-10-23 15:47       ` Jason Gunthorpe
  2017-10-24  6:28         ` Alex Vesker
  0 siblings, 1 reply; 20+ messages in thread
From: Jason Gunthorpe @ 2017-10-23 15:47 UTC (permalink / raw)
  To: Saeed Mahameed
  Cc: David S. Miller, Doug Ledford, netdev-u79uwXL29TY76Z2rM5mHXA,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, Leon Romanovsky, Alex Vesker

On Sat, Oct 14, 2017 at 11:48:23AM -0700, Saeed Mahameed wrote:
> From: Alex Vesker <valex-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
> 
> This change is needed for PKEY support, since the RQs are shared
> between the child interface and the parent. The parent is responsible
> for NAPI and the precessing of RX completions. Using the dqpn in the
> completion descriptor we set the corresponding child IPoIB netdevice
> on the SKB.
> The mapping between the dqpn and the netdevice is done using a HT,
> each mlx5 IPoIB interface registers its mapping on creation.

It seems really really weird to share the receive Q across all of the
children and do the sorting in software.. why is this done like this?

Wouldn't it be better to allow the children to progress concurrently,
potentially on multiple cores? They all have their own QPs after all..

Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [for-next 08/12] net/mlx5e: IPoIB, Use hash-table to map between QPN to child netdev
  2017-10-23 15:47       ` Jason Gunthorpe
@ 2017-10-24  6:28         ` Alex Vesker
  0 siblings, 0 replies; 20+ messages in thread
From: Alex Vesker @ 2017-10-24  6:28 UTC (permalink / raw)
  To: Jason Gunthorpe, Saeed Mahameed
  Cc: David S. Miller, Doug Ledford, netdev, linux-rdma,
	Leon Romanovsky



On 10/23/2017 6:47 PM, Jason Gunthorpe wrote:
> On Sat, Oct 14, 2017 at 11:48:23AM -0700, Saeed Mahameed wrote:
>> From: Alex Vesker <valex@mellanox.com>
>>
>> This change is needed for PKEY support, since the RQs are shared
>> between the child interface and the parent. The parent is responsible
>> for NAPI and the precessing of RX completions. Using the dqpn in the
>> completion descriptor we set the corresponding child IPoIB netdevice
>> on the SKB.
>> The mapping between the dqpn and the netdevice is done using a HT,
>> each mlx5 IPoIB interface registers its mapping on creation.
> It seems really really weird to share the receive Q across all of the
> children and do the sorting in software.. why is this done like this?
>
> Wouldn't it be better to allow the children to progress concurrently,
> potentially on multiple cores? They all have their own QPs after all..
>
> Jason
The child interface RQs are in minimum size since they are not used.
The reason for this is saving resources, since we want to be able to 
support many PKEYs
and allocating less memory while reusing HW resources such as flow 
steering table.
This solution allowed us save memory on allocating the RQs of the child 
and use the same
HW flow steering table, TIR, RQT. Using the hash table doesn't impact on 
performance.
The parent have multiple RQs on different cores running in parallel.

^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2017-10-24  6:29 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2017-10-14 18:48 [pull request][for-next 00/12] Mellanox, mlx5 IPoIB Muli Pkey support 2017-10-11 Saeed Mahameed
2017-10-14 18:48 ` [for-next 02/12] net/mlx5: PTP code migration to driver core section Saeed Mahameed
2017-10-14 18:48 ` [for-next 04/12] net/mlx5: Support for attaching multiple underlay QPs to root flow table Saeed Mahameed
2017-10-14 18:48 ` [for-next 05/12] IB/ipoib: Grab rtnl lock on heavy flush when calling ndo_open/stop Saeed Mahameed
2017-10-14 18:48 ` [for-next 06/12] IB/ipoib: Add ability to set PKEY index to lower device driver Saeed Mahameed
2017-10-14 18:48 ` [for-next 07/12] net/mlx5e: IPoIB, Support for setting PKEY index to underlay QP Saeed Mahameed
2017-10-14 18:48 ` [for-next 09/12] net/mlx5e: IPoIB, Add PKEY child interface nic profile Saeed Mahameed
2017-10-14 18:48 ` [for-next 10/12] net/mlx5e: IPoIB, Add PKEY child interface ndos Saeed Mahameed
2017-10-14 18:48 ` [for-next 12/12] net/mlx5e: IPoIB, Modify rdma netdev allocate and free to support PKEY Saeed Mahameed
     [not found] ` <20171014184827.18491-1-saeedm-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2017-10-14 18:48   ` [for-next 01/12] net/mlx5: File renaming towards ptp core implementation Saeed Mahameed
2017-10-14 18:48   ` [for-next 03/12] net/mlx5e: IPoIB, Move underlay QP init/uninit to separate functions Saeed Mahameed
2017-10-14 18:48   ` [for-next 08/12] net/mlx5e: IPoIB, Use hash-table to map between QPN to child netdev Saeed Mahameed
     [not found]     ` <20171014184827.18491-9-saeedm-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2017-10-23 15:47       ` Jason Gunthorpe
2017-10-24  6:28         ` Alex Vesker
2017-10-14 18:48   ` [for-next 11/12] net/mlx5e: IPoIB, Add PKEY child interface ethtool ops Saeed Mahameed
2017-10-14 21:19   ` [pull request][for-next 00/12] Mellanox, mlx5 IPoIB Muli Pkey support 2017-10-11 Doug Ledford
2017-10-15  6:30     ` Leon Romanovsky
2017-10-16  4:45     ` [pull request][for-next 00/12] Mellanox, mlx5 IPoIB Muli Pkey support 2017-10-11,Re: " David Miller
     [not found]       ` <20171016.054503.459455425504112937.davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>
2017-10-18 15:08         ` Doug Ledford
2017-10-14 22:51 ` Parav Pandit

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).