* [PATCH net-next 00/10] Convert mlx4 to use auxiliary bus
@ 2023-08-04 15:05 Petr Pavlu
2023-08-04 15:05 ` [PATCH net-next 01/10] mlx4: Get rid of the mlx4_interface.get_dev callback Petr Pavlu
` (11 more replies)
0 siblings, 12 replies; 28+ messages in thread
From: Petr Pavlu @ 2023-08-04 15:05 UTC
To: tariqt, yishaih, leon
Cc: davem, edumazet, kuba, pabeni, jgg, netdev, linux-rdma,
linux-kernel, Petr Pavlu
This series converts the mlx4 drivers to use the auxiliary bus, similarly
to how mlx5 was converted [1]. The first 6 patches are preparatory
changes, the remaining 4 are the final conversion.
The initial motivation for this change was to address a problem related
to loading mlx4_en/mlx4_ib by mlx4_core using request_module_nowait().
When such a load happens in initrd, the operation is asynchronous to any
init control and can get unexpectedly affected/interrupted by an eventual
root switch. Using an auxiliary bus leaves these module loads to udevd,
which integrates better with systemd processing. [2]
The general benefit is to get rid of custom interface logic and instead
use a common facility available for this task. An obvious risk is that
some new bug is introduced by the conversion.
Leon Romanovsky was kind enough to check for me that the series passes
their verification tests.
[1] https://lore.kernel.org/netdev/20201101201542.2027568-1-leon@kernel.org/
[2] https://lore.kernel.org/netdev/0a361ac2-c6bd-2b18-4841-b1b991f0635e@suse.com/
Petr Pavlu (10):
mlx4: Get rid of the mlx4_interface.get_dev callback
mlx4: Rename member mlx4_en_dev.nb to netdev_nb
mlx4: Replace the mlx4_interface.event callback with a notifier
mlx4: Get rid of the mlx4_interface.activate callback
mlx4: Move the bond work to the core driver
mlx4: Avoid resetting MLX4_INTFF_BONDING per driver
mlx4: Register mlx4 devices to an auxiliary virtual bus
mlx4: Connect the ethernet part to the auxiliary bus
mlx4: Connect the infiniband part to the auxiliary bus
mlx4: Delete custom device management logic
drivers/infiniband/hw/mlx4/main.c | 207 ++++++----
drivers/infiniband/hw/mlx4/mlx4_ib.h | 2 +
drivers/net/ethernet/mellanox/mlx4/Kconfig | 1 +
drivers/net/ethernet/mellanox/mlx4/en_main.c | 141 ++++---
.../net/ethernet/mellanox/mlx4/en_netdev.c | 64 +---
drivers/net/ethernet/mellanox/mlx4/intf.c | 361 ++++++++++++------
drivers/net/ethernet/mellanox/mlx4/main.c | 110 ++++--
drivers/net/ethernet/mellanox/mlx4/mlx4.h | 16 +-
drivers/net/ethernet/mellanox/mlx4/mlx4_en.h | 4 +-
include/linux/mlx4/device.h | 20 +
include/linux/mlx4/driver.h | 42 +-
11 files changed, 572 insertions(+), 396 deletions(-)
base-commit: 86b7e033d684a9d4ca20ad8e6f8b9300cf99668f
--
2.35.3
* [PATCH net-next 01/10] mlx4: Get rid of the mlx4_interface.get_dev callback
2023-08-04 15:05 [PATCH net-next 00/10] Convert mlx4 to use auxiliary bus Petr Pavlu
@ 2023-08-04 15:05 ` Petr Pavlu
2023-08-08 18:55 ` Leon Romanovsky
2023-08-04 15:05 ` [PATCH net-next 02/10] mlx4: Rename member mlx4_en_dev.nb to netdev_nb Petr Pavlu
` (10 subsequent siblings)
11 siblings, 1 reply; 28+ messages in thread
From: Petr Pavlu @ 2023-08-04 15:05 UTC
To: tariqt, yishaih, leon
Cc: davem, edumazet, kuba, pabeni, jgg, netdev, linux-rdma,
linux-kernel, Petr Pavlu
Simplify the mlx4 driver interface by removing mlx4_get_protocol_dev()
and the associated mlx4_interface.get_dev callbacks. This is done in
preparation for using an auxiliary bus to model the mlx4 driver structure.
The change is motivated by the following situation:
* The mlx4_en interface is being initialized by mlx4_en_add() and
mlx4_en_activate().
* The latter activate function calls mlx4_en_init_netdev() ->
register_netdev() to register a new net_device.
* A netdev event NETDEV_REGISTER is raised for the device.
* The netdev notifier mlx4_ib_netdev_event() is called and it invokes
mlx4_ib_scan_netdevs() -> mlx4_get_protocol_dev() ->
mlx4_en_get_netdev() [via mlx4_interface.get_dev].
This chain creates a problem when mlx4_en gets switched to be an
auxiliary driver. It contains two device calls which would both need to
take a respective device lock.
Avoid this situation by updating mlx4_ib_scan_netdevs() to no longer
call mlx4_get_protocol_dev() but instead to utilize the information
passed in net_device.parent and net_device.dev_port. This data is
sufficient to determine that an updated port is one that the mlx4_ib
driver should take care of and to keep mlx4_ib_dev.iboe.netdevs up to
date.
Following that, update mlx4_ib_get_netdev() to also not call
mlx4_get_protocol_dev() and instead scan all current netdevs to find a
matching one. Note that mlx4_ib_get_netdev() is called early from
ib_register_device() and cannot use data tracked in
mlx4_ib_dev.iboe.netdevs, which is not yet set at that point.
Finally, remove function mlx4_get_protocol_dev() and the
mlx4_interface.get_dev callbacks (only mlx4_en_get_netdev()) as they
became unused.
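Editorial sketch (not part of the patch): the matching rule described
above can be modeled in plain userspace C. A netdev belongs to a given
IB port when it shares the parent device and its 0-based dev_port plus
one equals the 1-based port number. All model_* names are hypothetical
stand-ins for the kernel structures, and RCU, reference counting and
the bonding case are deliberately left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct device; only identity matters here. */
struct model_device {
	int id;
};

/* Hypothetical stand-in for the two net_device fields the patch relies on. */
struct model_netdev {
	const struct model_device *parent; /* models net_device.dev.parent */
	unsigned int dev_port;             /* models net_device.dev_port (0-based) */
};

/*
 * Return the netdev that backs the 1-based IB port port_num of the device
 * identified by ib_parent, or NULL when no netdev matches.
 */
static const struct model_netdev *
model_find_netdev(const struct model_netdev *devs, size_t n,
		  const struct model_device *ib_parent, unsigned int port_num)
{
	size_t i;

	assert(port_num >= 1); /* IB port numbers start at 1 */

	for (i = 0; i < n; i++) {
		/* Same test as in the patch: parent and port must both match. */
		if (devs[i].parent != ib_parent ||
		    devs[i].dev_port + 1 != port_num)
			continue;
		return &devs[i];
	}
	return NULL;
}
```

In this model, a lookup for port 2 of a parent returns the entry whose
dev_port is 1, and a lookup for a port without a registered netdev
returns NULL, which mirrors the "ret = NULL" default in the patch.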
Signed-off-by: Petr Pavlu <petr.pavlu@suse.com>
Tested-by: Leon Romanovsky <leon@kernel.org>
---
drivers/infiniband/hw/mlx4/main.c | 89 ++++++++++----------
drivers/net/ethernet/mellanox/mlx4/en_main.c | 8 --
drivers/net/ethernet/mellanox/mlx4/intf.c | 21 -----
include/linux/mlx4/driver.h | 3 -
4 files changed, 43 insertions(+), 78 deletions(-)
diff --git a/drivers/infiniband/hw/mlx4/main.c b/drivers/infiniband/hw/mlx4/main.c
index b18e9f2adc82..7dd70d778b6b 100644
--- a/drivers/infiniband/hw/mlx4/main.c
+++ b/drivers/infiniband/hw/mlx4/main.c
@@ -125,12 +125,14 @@ static struct net_device *mlx4_ib_get_netdev(struct ib_device *device,
u32 port_num)
{
struct mlx4_ib_dev *ibdev = to_mdev(device);
- struct net_device *dev;
+ struct net_device *dev, *ret = NULL;
rcu_read_lock();
- dev = mlx4_get_protocol_dev(ibdev->dev, MLX4_PROT_ETH, port_num);
+ for_each_netdev_rcu(&init_net, dev) {
+ if (dev->dev.parent != ibdev->ib_dev.dev.parent ||
+ dev->dev_port + 1 != port_num)
+ continue;
- if (dev) {
if (mlx4_is_bonded(ibdev->dev)) {
struct net_device *upper = NULL;
@@ -143,11 +145,14 @@ static struct net_device *mlx4_ib_get_netdev(struct ib_device *device,
dev = active;
}
}
+
+ dev_hold(dev);
+ ret = dev;
+ break;
}
- dev_hold(dev);
rcu_read_unlock();
- return dev;
+ return ret;
}
static int mlx4_ib_update_gids_v1(struct gid_entry *gids,
@@ -2319,61 +2324,53 @@ static void mlx4_ib_update_qps(struct mlx4_ib_dev *ibdev,
mutex_unlock(&ibdev->qp1_proxy_lock[port - 1]);
}
-static void mlx4_ib_scan_netdevs(struct mlx4_ib_dev *ibdev,
- struct net_device *dev,
- unsigned long event)
+static void mlx4_ib_scan_netdev(struct mlx4_ib_dev *ibdev,
+ struct net_device *dev,
+ unsigned long event)
{
- struct mlx4_ib_iboe *iboe;
- int update_qps_port = -1;
- int port;
+ struct mlx4_ib_iboe *iboe = &ibdev->iboe;
ASSERT_RTNL();
- iboe = &ibdev->iboe;
+ if (dev->dev.parent != ibdev->ib_dev.dev.parent)
+ return;
spin_lock_bh(&iboe->lock);
- mlx4_foreach_ib_transport_port(port, ibdev->dev) {
-
- iboe->netdevs[port - 1] =
- mlx4_get_protocol_dev(ibdev->dev, MLX4_PROT_ETH, port);
- if (dev == iboe->netdevs[port - 1] &&
- (event == NETDEV_CHANGEADDR || event == NETDEV_REGISTER ||
- event == NETDEV_UP || event == NETDEV_CHANGE))
- update_qps_port = port;
+ iboe->netdevs[dev->dev_port] = event != NETDEV_UNREGISTER ? dev : NULL;
- if (dev == iboe->netdevs[port - 1] &&
- (event == NETDEV_UP || event == NETDEV_DOWN)) {
- enum ib_port_state port_state;
- struct ib_event ibev = { };
-
- if (ib_get_cached_port_state(&ibdev->ib_dev, port,
- &port_state))
- continue;
+ if (event == NETDEV_UP || event == NETDEV_DOWN) {
+ enum ib_port_state port_state;
+ struct ib_event ibev = { };
- if (event == NETDEV_UP &&
- (port_state != IB_PORT_ACTIVE ||
- iboe->last_port_state[port - 1] != IB_PORT_DOWN))
- continue;
- if (event == NETDEV_DOWN &&
- (port_state != IB_PORT_DOWN ||
- iboe->last_port_state[port - 1] != IB_PORT_ACTIVE))
- continue;
- iboe->last_port_state[port - 1] = port_state;
+ if (ib_get_cached_port_state(&ibdev->ib_dev, dev->dev_port + 1,
+ &port_state))
+ goto iboe_out;
- ibev.device = &ibdev->ib_dev;
- ibev.element.port_num = port;
- ibev.event = event == NETDEV_UP ? IB_EVENT_PORT_ACTIVE :
- IB_EVENT_PORT_ERR;
- ib_dispatch_event(&ibev);
- }
+ if (event == NETDEV_UP &&
+ (port_state != IB_PORT_ACTIVE ||
+ iboe->last_port_state[dev->dev_port] != IB_PORT_DOWN))
+ goto iboe_out;
+ if (event == NETDEV_DOWN &&
+ (port_state != IB_PORT_DOWN ||
+ iboe->last_port_state[dev->dev_port] != IB_PORT_ACTIVE))
+ goto iboe_out;
+ iboe->last_port_state[dev->dev_port] = port_state;
+ ibev.device = &ibdev->ib_dev;
+ ibev.element.port_num = dev->dev_port + 1;
+ ibev.event = event == NETDEV_UP ? IB_EVENT_PORT_ACTIVE :
+ IB_EVENT_PORT_ERR;
+ ib_dispatch_event(&ibev);
}
+
+iboe_out:
spin_unlock_bh(&iboe->lock);
- if (update_qps_port > 0)
- mlx4_ib_update_qps(ibdev, dev, update_qps_port);
+ if (event == NETDEV_CHANGEADDR || event == NETDEV_REGISTER ||
+ event == NETDEV_UP || event == NETDEV_CHANGE)
+ mlx4_ib_update_qps(ibdev, dev, dev->dev_port + 1);
}
static int mlx4_ib_netdev_event(struct notifier_block *this,
@@ -2386,7 +2383,7 @@ static int mlx4_ib_netdev_event(struct notifier_block *this,
return NOTIFY_DONE;
ibdev = container_of(this, struct mlx4_ib_dev, iboe.nb);
- mlx4_ib_scan_netdevs(ibdev, dev, event);
+ mlx4_ib_scan_netdev(ibdev, dev, event);
return NOTIFY_DONE;
}
diff --git a/drivers/net/ethernet/mellanox/mlx4/en_main.c b/drivers/net/ethernet/mellanox/mlx4/en_main.c
index f1259bdb1a29..6a42bec6bd85 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_main.c
@@ -183,13 +183,6 @@ static void mlx4_en_get_profile(struct mlx4_en_dev *mdev)
}
}
-static void *mlx4_en_get_netdev(struct mlx4_dev *dev, void *ctx, u8 port)
-{
- struct mlx4_en_dev *endev = ctx;
-
- return endev->pndev[port];
-}
-
static void mlx4_en_event(struct mlx4_dev *dev, void *endev_ptr,
enum mlx4_dev_event event, unsigned long port)
{
@@ -354,7 +347,6 @@ static struct mlx4_interface mlx4_en_interface = {
.add = mlx4_en_add,
.remove = mlx4_en_remove,
.event = mlx4_en_event,
- .get_dev = mlx4_en_get_netdev,
.protocol = MLX4_PROT_ETH,
.activate = mlx4_en_activate,
};
diff --git a/drivers/net/ethernet/mellanox/mlx4/intf.c b/drivers/net/ethernet/mellanox/mlx4/intf.c
index 65482f004e50..28d7da925d36 100644
--- a/drivers/net/ethernet/mellanox/mlx4/intf.c
+++ b/drivers/net/ethernet/mellanox/mlx4/intf.c
@@ -245,27 +245,6 @@ void mlx4_unregister_device(struct mlx4_dev *dev)
mutex_unlock(&intf_mutex);
}
-void *mlx4_get_protocol_dev(struct mlx4_dev *dev, enum mlx4_protocol proto, int port)
-{
- struct mlx4_priv *priv = mlx4_priv(dev);
- struct mlx4_device_context *dev_ctx;
- unsigned long flags;
- void *result = NULL;
-
- spin_lock_irqsave(&priv->ctx_lock, flags);
-
- list_for_each_entry(dev_ctx, &priv->ctx_list, list)
- if (dev_ctx->intf->protocol == proto && dev_ctx->intf->get_dev) {
- result = dev_ctx->intf->get_dev(dev, dev_ctx->context, port);
- break;
- }
-
- spin_unlock_irqrestore(&priv->ctx_lock, flags);
-
- return result;
-}
-EXPORT_SYMBOL_GPL(mlx4_get_protocol_dev);
-
struct devlink_port *mlx4_get_devlink_port(struct mlx4_dev *dev, int port)
{
struct mlx4_port_info *info = &mlx4_priv(dev)->port[port];
diff --git a/include/linux/mlx4/driver.h b/include/linux/mlx4/driver.h
index 1834c8fad12e..923951e19300 100644
--- a/include/linux/mlx4/driver.h
+++ b/include/linux/mlx4/driver.h
@@ -59,7 +59,6 @@ struct mlx4_interface {
void (*remove)(struct mlx4_dev *dev, void *context);
void (*event) (struct mlx4_dev *dev, void *context,
enum mlx4_dev_event event, unsigned long param);
- void * (*get_dev)(struct mlx4_dev *dev, void *context, u8 port);
void (*activate)(struct mlx4_dev *dev, void *context);
struct list_head list;
enum mlx4_protocol protocol;
@@ -88,8 +87,6 @@ struct mlx4_port_map {
int mlx4_port_map_set(struct mlx4_dev *dev, struct mlx4_port_map *v2p);
-void *mlx4_get_protocol_dev(struct mlx4_dev *dev, enum mlx4_protocol proto, int port);
-
struct devlink_port *mlx4_get_devlink_port(struct mlx4_dev *dev, int port);
#endif /* MLX4_DRIVER_H */
--
2.35.3
* [PATCH net-next 02/10] mlx4: Rename member mlx4_en_dev.nb to netdev_nb
2023-08-04 15:05 [PATCH net-next 00/10] Convert mlx4 to use auxiliary bus Petr Pavlu
2023-08-04 15:05 ` [PATCH net-next 01/10] mlx4: Get rid of the mlx4_interface.get_dev callback Petr Pavlu
@ 2023-08-04 15:05 ` Petr Pavlu
2023-08-08 18:55 ` Leon Romanovsky
2023-08-04 15:05 ` [PATCH net-next 03/10] mlx4: Replace the mlx4_interface.event callback with a notifier Petr Pavlu
` (9 subsequent siblings)
11 siblings, 1 reply; 28+ messages in thread
From: Petr Pavlu @ 2023-08-04 15:05 UTC
To: tariqt, yishaih, leon
Cc: davem, edumazet, kuba, pabeni, jgg, netdev, linux-rdma,
linux-kernel, Petr Pavlu
Rename the mlx4_en_dev.nb notifier_block member to netdev_nb in
preparation for adding a mlx4 core notifier_block.
Signed-off-by: Petr Pavlu <petr.pavlu@suse.com>
Tested-by: Leon Romanovsky <leon@kernel.org>
---
drivers/net/ethernet/mellanox/mlx4/en_main.c | 14 +++++++-------
drivers/net/ethernet/mellanox/mlx4/en_netdev.c | 2 +-
drivers/net/ethernet/mellanox/mlx4/mlx4_en.h | 2 +-
3 files changed, 9 insertions(+), 9 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx4/en_main.c b/drivers/net/ethernet/mellanox/mlx4/en_main.c
index 6a42bec6bd85..be8ba34c9025 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_main.c
@@ -235,8 +235,8 @@ static void mlx4_en_remove(struct mlx4_dev *dev, void *endev_ptr)
iounmap(mdev->uar_map);
mlx4_uar_free(dev, &mdev->priv_uar);
mlx4_pd_free(dev, mdev->priv_pdn);
- if (mdev->nb.notifier_call)
- unregister_netdevice_notifier(&mdev->nb);
+ if (mdev->netdev_nb.notifier_call)
+ unregister_netdevice_notifier(&mdev->netdev_nb);
kfree(mdev);
}
@@ -252,11 +252,11 @@ static void mlx4_en_activate(struct mlx4_dev *dev, void *ctx)
mdev->pndev[i] = NULL;
}
- /* register notifier */
- mdev->nb.notifier_call = mlx4_en_netdev_event;
- if (register_netdevice_notifier(&mdev->nb)) {
- mdev->nb.notifier_call = NULL;
- mlx4_err(mdev, "Failed to create notifier\n");
+ /* register netdev notifier */
+ mdev->netdev_nb.notifier_call = mlx4_en_netdev_event;
+ if (register_netdevice_notifier(&mdev->netdev_nb)) {
+ mdev->netdev_nb.notifier_call = NULL;
+ mlx4_err(mdev, "Failed to create netdev notifier\n");
}
}
diff --git a/drivers/net/ethernet/mellanox/mlx4/en_netdev.c b/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
index 403604ceebc8..7066c426b95c 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
@@ -2967,7 +2967,7 @@ int mlx4_en_netdev_event(struct notifier_block *this,
if (!net_eq(dev_net(ndev), &init_net))
return NOTIFY_DONE;
- mdev = container_of(this, struct mlx4_en_dev, nb);
+ mdev = container_of(this, struct mlx4_en_dev, netdev_nb);
dev = mdev->dev;
/* Go into this mode only when two network devices set on two ports
diff --git a/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h b/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
index 321f801c1d7c..72a3fea36702 100644
--- a/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
+++ b/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
@@ -432,7 +432,7 @@ struct mlx4_en_dev {
unsigned long last_overflow_check;
struct ptp_clock *ptp_clock;
struct ptp_clock_info ptp_clock_info;
- struct notifier_block nb;
+ struct notifier_block netdev_nb;
};
--
2.35.3
* [PATCH net-next 03/10] mlx4: Replace the mlx4_interface.event callback with a notifier
2023-08-04 15:05 [PATCH net-next 00/10] Convert mlx4 to use auxiliary bus Petr Pavlu
2023-08-04 15:05 ` [PATCH net-next 01/10] mlx4: Get rid of the mlx4_interface.get_dev callback Petr Pavlu
2023-08-04 15:05 ` [PATCH net-next 02/10] mlx4: Rename member mlx4_en_dev.nb to netdev_nb Petr Pavlu
@ 2023-08-04 15:05 ` Petr Pavlu
2023-08-05 14:29 ` Zhu Yanjun
2023-08-07 13:58 ` Simon Horman
2023-08-04 15:05 ` [PATCH net-next 04/10] mlx4: Get rid of the mlx4_interface.activate callback Petr Pavlu
` (8 subsequent siblings)
11 siblings, 2 replies; 28+ messages in thread
From: Petr Pavlu @ 2023-08-04 15:05 UTC
To: tariqt, yishaih, leon
Cc: davem, edumazet, kuba, pabeni, jgg, netdev, linux-rdma,
linux-kernel, Petr Pavlu
Use a notifier to implement mlx4_dispatch_event() in preparation for
switching mlx4_en and mlx4_ib to auxiliary devices.
A problem is that if the mlx4_interface.event callback was replaced with
something like mlx4_adrv.event then the implementation of
mlx4_dispatch_event() would need to acquire a lock on a given device
before executing this callback. That is necessary because otherwise
there is no guarantee that the associated driver cannot get unbound while
the callback is running. However, taking this lock is not possible
because mlx4_dispatch_event() can be invoked from the hardirq context.
Using an atomic notifier allows the driver to accurately record when it
wants to receive these events and solves this problem.
A handler registration is done by both mlx4_en and mlx4_ib at the end of
their mlx4_interface.add callback. This matches the current situation
where mlx4_add_device() enables events for a given device immediately
after this callback, by adding the device to mlx4_priv.ctx_list.
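Editorial sketch (not part of the patch): the register/dispatch pattern
described above can be illustrated with a small userspace model. The
handler recovers its driver context from the embedded notifier block,
as mlx4_en_event() does with container_of(), and stops receiving events
once it is unregistered. All model_* names are hypothetical. The real
atomic notifier chain additionally orders callbacks by priority and
protects traversal with RCU, neither of which is modeled here.

```c
#include <stddef.h>

/* Userspace equivalent of the kernel's container_of(). */
#define model_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical, simplified stand-in for struct notifier_block. */
struct model_notifier_block {
	int (*notifier_call)(struct model_notifier_block *nb,
			     unsigned long event, void *data);
	struct model_notifier_block *next;
};

/* Hypothetical stand-in for struct atomic_notifier_head (no locking). */
struct model_notifier_head {
	struct model_notifier_block *head;
};

static void model_chain_register(struct model_notifier_head *nh,
				 struct model_notifier_block *nb)
{
	nb->next = nh->head;
	nh->head = nb;
}

static void model_chain_unregister(struct model_notifier_head *nh,
				   struct model_notifier_block *nb)
{
	struct model_notifier_block **p;

	for (p = &nh->head; *p; p = &(*p)->next) {
		if (*p == nb) {
			*p = nb->next;
			return;
		}
	}
}

/* Models mlx4_dispatch_event(): param is passed by address, as in the patch. */
static void model_dispatch_event(struct model_notifier_head *nh,
				 unsigned long type, unsigned long param)
{
	struct model_notifier_block *nb;

	for (nb = nh->head; nb; nb = nb->next)
		nb->notifier_call(nb, type, &param);
}

/* Hypothetical driver context with an embedded notifier block. */
struct model_en_dev {
	unsigned long last_event;
	unsigned long last_port;
	struct model_notifier_block mlx_nb;
};

/* Models mlx4_en_event(): recover the context, then read the parameter. */
static int model_en_event(struct model_notifier_block *this,
			  unsigned long event, void *ptr)
{
	struct model_en_dev *mdev =
		model_container_of(this, struct model_en_dev, mlx_nb);

	mdev->last_event = event;
	mdev->last_port = *(unsigned long *)ptr;
	return 0; /* the patch returns NOTIFY_DONE here */
}
```

The point of the design is that the core never needs to know which
driver is bound: a driver that has unregistered its block is simply no
longer on the chain when the next event is dispatched.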
Signed-off-by: Petr Pavlu <petr.pavlu@suse.com>
Tested-by: Leon Romanovsky <leon@kernel.org>
---
drivers/infiniband/hw/mlx4/main.c | 41 +++++++++++++-------
drivers/infiniband/hw/mlx4/mlx4_ib.h | 2 +
drivers/net/ethernet/mellanox/mlx4/en_main.c | 25 ++++++++----
drivers/net/ethernet/mellanox/mlx4/intf.c | 24 ++++++++----
drivers/net/ethernet/mellanox/mlx4/main.c | 2 +
drivers/net/ethernet/mellanox/mlx4/mlx4.h | 2 +
drivers/net/ethernet/mellanox/mlx4/mlx4_en.h | 2 +
include/linux/mlx4/driver.h | 8 +++-
8 files changed, 76 insertions(+), 30 deletions(-)
diff --git a/drivers/infiniband/hw/mlx4/main.c b/drivers/infiniband/hw/mlx4/main.c
index 7dd70d778b6b..458b4b11dffa 100644
--- a/drivers/infiniband/hw/mlx4/main.c
+++ b/drivers/infiniband/hw/mlx4/main.c
@@ -82,6 +82,8 @@ static const char mlx4_ib_version[] =
static void do_slave_init(struct mlx4_ib_dev *ibdev, int slave, int do_init);
static enum rdma_link_layer mlx4_ib_port_link_layer(struct ib_device *device,
u32 port_num);
+static int mlx4_ib_event(struct notifier_block *this, unsigned long event,
+ void *ptr);
static struct workqueue_struct *wq;
@@ -2836,6 +2838,12 @@ static void *mlx4_ib_add(struct mlx4_dev *dev)
do_slave_init(ibdev, j, 1);
}
}
+
+ /* register mlx4 core notifier */
+ ibdev->mlx_nb.notifier_call = mlx4_ib_event;
+ err = mlx4_register_event_notifier(dev, &ibdev->mlx_nb);
+ WARN(err, "failed to register mlx4 event notifier (%d)", err);
+
return ibdev;
err_notif:
@@ -2953,6 +2961,8 @@ static void mlx4_ib_remove(struct mlx4_dev *dev, void *ibdev_ptr)
int p;
int i;
+ mlx4_unregister_event_notifier(dev, &ibdev->mlx_nb);
+
mlx4_foreach_port(i, dev, MLX4_PORT_TYPE_IB)
devlink_port_type_clear(mlx4_get_devlink_port(dev, i));
ibdev->ib_active = false;
@@ -3173,11 +3183,14 @@ void mlx4_sched_ib_sl2vl_update_work(struct mlx4_ib_dev *ibdev,
}
}
-static void mlx4_ib_event(struct mlx4_dev *dev, void *ibdev_ptr,
- enum mlx4_dev_event event, unsigned long param)
+static int mlx4_ib_event(struct notifier_block *this,
+ unsigned long event /*mlx4_dev_event*/, void *ptr)
{
+ struct mlx4_ib_dev *ibdev =
+ container_of(this, struct mlx4_ib_dev, mlx_nb);
+ struct mlx4_dev *dev = ibdev->dev;
+ unsigned long param = *(unsigned long *)ptr;
struct ib_event ibev;
- struct mlx4_ib_dev *ibdev = to_mdev((struct ib_device *) ibdev_ptr);
struct mlx4_eqe *eqe = NULL;
struct ib_event_work *ew;
int p = 0;
@@ -3187,11 +3200,11 @@ static void mlx4_ib_event(struct mlx4_dev *dev, void *ibdev_ptr,
(event == MLX4_DEV_EVENT_PORT_DOWN))) {
ew = kmalloc(sizeof(*ew), GFP_ATOMIC);
if (!ew)
- return;
+ return NOTIFY_DONE;
INIT_WORK(&ew->work, handle_bonded_port_state_event);
ew->ib_dev = ibdev;
queue_work(wq, &ew->work);
- return;
+ return NOTIFY_DONE;
}
if (event == MLX4_DEV_EVENT_PORT_MGMT_CHANGE)
@@ -3202,7 +3215,7 @@ static void mlx4_ib_event(struct mlx4_dev *dev, void *ibdev_ptr,
switch (event) {
case MLX4_DEV_EVENT_PORT_UP:
if (p > ibdev->num_ports)
- return;
+ return NOTIFY_DONE;
if (!mlx4_is_slave(dev) &&
rdma_port_get_link_layer(&ibdev->ib_dev, p) ==
IB_LINK_LAYER_INFINIBAND) {
@@ -3217,7 +3230,7 @@ static void mlx4_ib_event(struct mlx4_dev *dev, void *ibdev_ptr,
case MLX4_DEV_EVENT_PORT_DOWN:
if (p > ibdev->num_ports)
- return;
+ return NOTIFY_DONE;
ibev.event = IB_EVENT_PORT_ERR;
break;
@@ -3230,7 +3243,7 @@ static void mlx4_ib_event(struct mlx4_dev *dev, void *ibdev_ptr,
case MLX4_DEV_EVENT_PORT_MGMT_CHANGE:
ew = kmalloc(sizeof *ew, GFP_ATOMIC);
if (!ew)
- return;
+ return NOTIFY_DONE;
INIT_WORK(&ew->work, handle_port_mgmt_change_event);
memcpy(&ew->ib_eqe, eqe, sizeof *eqe);
@@ -3240,7 +3253,7 @@ static void mlx4_ib_event(struct mlx4_dev *dev, void *ibdev_ptr,
queue_work(wq, &ew->work);
else
handle_port_mgmt_change_event(&ew->work);
- return;
+ return NOTIFY_DONE;
case MLX4_DEV_EVENT_SLAVE_INIT:
/* here, p is the slave id */
@@ -3256,7 +3269,7 @@ static void mlx4_ib_event(struct mlx4_dev *dev, void *ibdev_ptr,
1);
}
}
- return;
+ return NOTIFY_DONE;
case MLX4_DEV_EVENT_SLAVE_SHUTDOWN:
if (mlx4_is_master(dev)) {
@@ -3272,22 +3285,22 @@ static void mlx4_ib_event(struct mlx4_dev *dev, void *ibdev_ptr,
}
/* here, p is the slave id */
do_slave_init(ibdev, p, 0);
- return;
+ return NOTIFY_DONE;
default:
- return;
+ return NOTIFY_DONE;
}
- ibev.device = ibdev_ptr;
+ ibev.device = &ibdev->ib_dev;
ibev.element.port_num = mlx4_is_bonded(ibdev->dev) ? 1 : (u8)p;
ib_dispatch_event(&ibev);
+ return NOTIFY_DONE;
}
static struct mlx4_interface mlx4_ib_interface = {
.add = mlx4_ib_add,
.remove = mlx4_ib_remove,
- .event = mlx4_ib_event,
.protocol = MLX4_PROT_IB_IPV6,
.flags = MLX4_INTFF_BONDING
};
diff --git a/drivers/infiniband/hw/mlx4/mlx4_ib.h b/drivers/infiniband/hw/mlx4/mlx4_ib.h
index 17fee1e73a45..41ca1114a995 100644
--- a/drivers/infiniband/hw/mlx4/mlx4_ib.h
+++ b/drivers/infiniband/hw/mlx4/mlx4_ib.h
@@ -38,6 +38,7 @@
#include <linux/list.h>
#include <linux/mutex.h>
#include <linux/idr.h>
+#include <linux/notifier.h>
#include <rdma/ib_verbs.h>
#include <rdma/ib_umem.h>
@@ -644,6 +645,7 @@ struct mlx4_ib_dev {
spinlock_t reset_flow_resource_lock;
struct list_head qp_list;
struct mlx4_ib_diag_counters diag_counters[MLX4_DIAG_COUNTERS_TYPES];
+ struct notifier_block mlx_nb;
};
struct ib_event_work {
diff --git a/drivers/net/ethernet/mellanox/mlx4/en_main.c b/drivers/net/ethernet/mellanox/mlx4/en_main.c
index be8ba34c9025..8384bff5c37d 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_main.c
@@ -183,17 +183,20 @@ static void mlx4_en_get_profile(struct mlx4_en_dev *mdev)
}
}
-static void mlx4_en_event(struct mlx4_dev *dev, void *endev_ptr,
- enum mlx4_dev_event event, unsigned long port)
+static int mlx4_en_event(struct notifier_block *this,
+ unsigned long event /*mlx4_dev_event*/, void *ptr)
{
- struct mlx4_en_dev *mdev = (struct mlx4_en_dev *) endev_ptr;
+ struct mlx4_en_dev *mdev =
+ container_of(this, struct mlx4_en_dev, mlx_nb);
+ struct mlx4_dev *dev = mdev->dev;
+ unsigned long port = *(unsigned long *)ptr;
struct mlx4_en_priv *priv;
switch (event) {
case MLX4_DEV_EVENT_PORT_UP:
case MLX4_DEV_EVENT_PORT_DOWN:
if (!mdev->pndev[port])
- return;
+ return NOTIFY_DONE;
priv = netdev_priv(mdev->pndev[port]);
/* To prevent races, we poll the link state in a separate
task rather than changing it here */
@@ -211,10 +214,12 @@ static void mlx4_en_event(struct mlx4_dev *dev, void *endev_ptr,
default:
if (port < 1 || port > dev->caps.num_ports ||
!mdev->pndev[port])
- return;
- mlx4_warn(mdev, "Unhandled event %d for port %d\n", event,
+ return NOTIFY_DONE;
+ mlx4_warn(mdev, "Unhandled event %d for port %d\n", (int) event,
(int) port);
}
+
+ return NOTIFY_DONE;
}
static void mlx4_en_remove(struct mlx4_dev *dev, void *endev_ptr)
@@ -222,6 +227,8 @@ static void mlx4_en_remove(struct mlx4_dev *dev, void *endev_ptr)
struct mlx4_en_dev *mdev = endev_ptr;
int i;
+ mlx4_unregister_event_notifier(dev, &mdev->mlx_nb);
+
mutex_lock(&mdev->state_lock);
mdev->device_up = false;
mutex_unlock(&mdev->state_lock);
@@ -326,6 +333,11 @@ static void *mlx4_en_add(struct mlx4_dev *dev)
mutex_init(&mdev->state_lock);
mdev->device_up = true;
+ /* register mlx4 core notifier */
+ mdev->mlx_nb.notifier_call = mlx4_en_event;
+ err = mlx4_register_event_notifier(dev, &mdev->mlx_nb);
+ WARN(err, "failed to register mlx4 event notifier (%d)", err);
+
return mdev;
err_mr:
@@ -346,7 +358,6 @@ static void *mlx4_en_add(struct mlx4_dev *dev)
static struct mlx4_interface mlx4_en_interface = {
.add = mlx4_en_add,
.remove = mlx4_en_remove,
- .event = mlx4_en_event,
.protocol = MLX4_PROT_ETH,
.activate = mlx4_en_activate,
};
diff --git a/drivers/net/ethernet/mellanox/mlx4/intf.c b/drivers/net/ethernet/mellanox/mlx4/intf.c
index 28d7da925d36..a7c3e2efa464 100644
--- a/drivers/net/ethernet/mellanox/mlx4/intf.c
+++ b/drivers/net/ethernet/mellanox/mlx4/intf.c
@@ -183,17 +183,27 @@ void mlx4_dispatch_event(struct mlx4_dev *dev, enum mlx4_dev_event type,
unsigned long param)
{
struct mlx4_priv *priv = mlx4_priv(dev);
- struct mlx4_device_context *dev_ctx;
- unsigned long flags;
- spin_lock_irqsave(&priv->ctx_lock, flags);
+ atomic_notifier_call_chain(&priv->event_nh, type, &param);
+}
- list_for_each_entry(dev_ctx, &priv->ctx_list, list)
- if (dev_ctx->intf->event)
- dev_ctx->intf->event(dev, dev_ctx->context, type, param);
+int mlx4_register_event_notifier(struct mlx4_dev *dev,
+ struct notifier_block *nb)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
- spin_unlock_irqrestore(&priv->ctx_lock, flags);
+ return atomic_notifier_chain_register(&priv->event_nh, nb);
+}
+EXPORT_SYMBOL(mlx4_register_event_notifier);
+
+int mlx4_unregister_event_notifier(struct mlx4_dev *dev,
+ struct notifier_block *nb)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+
+ return atomic_notifier_chain_unregister(&priv->event_nh, nb);
}
+EXPORT_SYMBOL(mlx4_unregister_event_notifier);
int mlx4_register_device(struct mlx4_dev *dev)
{
diff --git a/drivers/net/ethernet/mellanox/mlx4/main.c b/drivers/net/ethernet/mellanox/mlx4/main.c
index 8a5409b00530..5f3ba8385e23 100644
--- a/drivers/net/ethernet/mellanox/mlx4/main.c
+++ b/drivers/net/ethernet/mellanox/mlx4/main.c
@@ -3378,6 +3378,8 @@ static int mlx4_load_one(struct pci_dev *pdev, int pci_dev_data,
INIT_LIST_HEAD(&priv->ctx_list);
spin_lock_init(&priv->ctx_lock);
+ ATOMIC_INIT_NOTIFIER_HEAD(&priv->event_nh);
+
mutex_init(&priv->port_mutex);
mutex_init(&priv->bond_mutex);
diff --git a/drivers/net/ethernet/mellanox/mlx4/mlx4.h b/drivers/net/ethernet/mellanox/mlx4/mlx4.h
index 6ccf340660d9..10f12e4992f1 100644
--- a/drivers/net/ethernet/mellanox/mlx4/mlx4.h
+++ b/drivers/net/ethernet/mellanox/mlx4/mlx4.h
@@ -47,6 +47,7 @@
#include <linux/spinlock.h>
#include <net/devlink.h>
#include <linux/rwsem.h>
+#include <linux/notifier.h>
#include <linux/mlx4/device.h>
#include <linux/mlx4/driver.h>
@@ -878,6 +879,7 @@ struct mlx4_priv {
struct list_head dev_list;
struct list_head ctx_list;
spinlock_t ctx_lock;
+ struct atomic_notifier_head event_nh;
int pci_dev_data;
int removed;
diff --git a/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h b/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
index 72a3fea36702..efe3f97b874f 100644
--- a/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
+++ b/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
@@ -49,6 +49,7 @@
#include <linux/ptp_clock_kernel.h>
#include <linux/irq.h>
#include <net/xdp.h>
+#include <linux/notifier.h>
#include <linux/mlx4/device.h>
#include <linux/mlx4/qp.h>
@@ -433,6 +434,7 @@ struct mlx4_en_dev {
struct ptp_clock *ptp_clock;
struct ptp_clock_info ptp_clock_info;
struct notifier_block netdev_nb;
+ struct notifier_block mlx_nb;
};
diff --git a/include/linux/mlx4/driver.h b/include/linux/mlx4/driver.h
index 923951e19300..228da8ed7e75 100644
--- a/include/linux/mlx4/driver.h
+++ b/include/linux/mlx4/driver.h
@@ -34,6 +34,7 @@
#define MLX4_DRIVER_H
#include <net/devlink.h>
+#include <linux/notifier.h>
#include <linux/mlx4/device.h>
struct mlx4_dev;
@@ -57,8 +58,6 @@ enum {
struct mlx4_interface {
void * (*add) (struct mlx4_dev *dev);
void (*remove)(struct mlx4_dev *dev, void *context);
- void (*event) (struct mlx4_dev *dev, void *context,
- enum mlx4_dev_event event, unsigned long param);
void (*activate)(struct mlx4_dev *dev, void *context);
struct list_head list;
enum mlx4_protocol protocol;
@@ -87,6 +86,11 @@ struct mlx4_port_map {
int mlx4_port_map_set(struct mlx4_dev *dev, struct mlx4_port_map *v2p);
+int mlx4_register_event_notifier(struct mlx4_dev *dev,
+ struct notifier_block *nb);
+int mlx4_unregister_event_notifier(struct mlx4_dev *dev,
+ struct notifier_block *nb);
+
struct devlink_port *mlx4_get_devlink_port(struct mlx4_dev *dev, int port);
#endif /* MLX4_DRIVER_H */
--
2.35.3
* [PATCH net-next 04/10] mlx4: Get rid of the mlx4_interface.activate callback
2023-08-04 15:05 [PATCH net-next 00/10] Convert mlx4 to use auxiliary bus Petr Pavlu
` (2 preceding siblings ...)
2023-08-04 15:05 ` [PATCH net-next 03/10] mlx4: Replace the mlx4_interface.event callback with a notifier Petr Pavlu
@ 2023-08-04 15:05 ` Petr Pavlu
2023-08-08 18:56 ` Leon Romanovsky
2023-08-04 15:05 ` [PATCH net-next 05/10] mlx4: Move the bond work to the core driver Petr Pavlu
` (7 subsequent siblings)
11 siblings, 1 reply; 28+ messages in thread
From: Petr Pavlu @ 2023-08-04 15:05 UTC
To: tariqt, yishaih, leon
Cc: davem, edumazet, kuba, pabeni, jgg, netdev, linux-rdma,
linux-kernel, Petr Pavlu
The mlx4_interface.activate callback was introduced in commit
79857cd31fe7 ("net/mlx4: Postpone the registration of net_device"). It
dealt with a situation where a netdev notifier received a NETDEV_REGISTER
event for a new net_device created by mlx4_en but the same device was
not yet visible to mlx4_get_protocol_dev(). The callback can be removed
now that mlx4_get_protocol_dev() is gone.
Signed-off-by: Petr Pavlu <petr.pavlu@suse.com>
Tested-by: Leon Romanovsky <leon@kernel.org>
---
drivers/net/ethernet/mellanox/mlx4/en_main.c | 37 +++++++++-----------
drivers/net/ethernet/mellanox/mlx4/intf.c | 2 --
include/linux/mlx4/driver.h | 1 -
3 files changed, 16 insertions(+), 24 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx4/en_main.c b/drivers/net/ethernet/mellanox/mlx4/en_main.c
index 8384bff5c37d..3824884ab515 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_main.c
@@ -247,26 +247,6 @@ static void mlx4_en_remove(struct mlx4_dev *dev, void *endev_ptr)
kfree(mdev);
}
-static void mlx4_en_activate(struct mlx4_dev *dev, void *ctx)
-{
- int i;
- struct mlx4_en_dev *mdev = ctx;
-
- /* Create a netdev for each port */
- mlx4_foreach_port(i, dev, MLX4_PORT_TYPE_ETH) {
- mlx4_info(mdev, "Activating port:%d\n", i);
- if (mlx4_en_init_netdev(mdev, i, &mdev->profile.prof[i]))
- mdev->pndev[i] = NULL;
- }
-
- /* register netdev notifier */
- mdev->netdev_nb.notifier_call = mlx4_en_netdev_event;
- if (register_netdevice_notifier(&mdev->netdev_nb)) {
- mdev->netdev_nb.notifier_call = NULL;
- mlx4_err(mdev, "Failed to create netdev notifier\n");
- }
-}
-
static void *mlx4_en_add(struct mlx4_dev *dev)
{
struct mlx4_en_dev *mdev;
@@ -338,6 +318,22 @@ static void *mlx4_en_add(struct mlx4_dev *dev)
err = mlx4_register_event_notifier(dev, &mdev->mlx_nb);
WARN(err, "failed to register mlx4 event notifier (%d)", err);
+ /* Setup ports */
+
+ /* Create a netdev for each port */
+ mlx4_foreach_port(i, dev, MLX4_PORT_TYPE_ETH) {
+ mlx4_info(mdev, "Activating port:%d\n", i);
+ if (mlx4_en_init_netdev(mdev, i, &mdev->profile.prof[i]))
+ mdev->pndev[i] = NULL;
+ }
+
+ /* register netdev notifier */
+ mdev->netdev_nb.notifier_call = mlx4_en_netdev_event;
+ if (register_netdevice_notifier(&mdev->netdev_nb)) {
+ mdev->netdev_nb.notifier_call = NULL;
+ mlx4_err(mdev, "Failed to create netdev notifier\n");
+ }
+
return mdev;
err_mr:
@@ -359,7 +355,6 @@ static struct mlx4_interface mlx4_en_interface = {
.add = mlx4_en_add,
.remove = mlx4_en_remove,
.protocol = MLX4_PROT_ETH,
- .activate = mlx4_en_activate,
};
static void mlx4_en_verify_params(void)
diff --git a/drivers/net/ethernet/mellanox/mlx4/intf.c b/drivers/net/ethernet/mellanox/mlx4/intf.c
index a7c3e2efa464..8b2c1404cb66 100644
--- a/drivers/net/ethernet/mellanox/mlx4/intf.c
+++ b/drivers/net/ethernet/mellanox/mlx4/intf.c
@@ -64,8 +64,6 @@ static void mlx4_add_device(struct mlx4_interface *intf, struct mlx4_priv *priv)
spin_lock_irq(&priv->ctx_lock);
list_add_tail(&dev_ctx->list, &priv->ctx_list);
spin_unlock_irq(&priv->ctx_lock);
- if (intf->activate)
- intf->activate(&priv->dev, dev_ctx->context);
} else
kfree(dev_ctx);
diff --git a/include/linux/mlx4/driver.h b/include/linux/mlx4/driver.h
index 228da8ed7e75..0f8c9ba4c574 100644
--- a/include/linux/mlx4/driver.h
+++ b/include/linux/mlx4/driver.h
@@ -58,7 +58,6 @@ enum {
struct mlx4_interface {
void * (*add) (struct mlx4_dev *dev);
void (*remove)(struct mlx4_dev *dev, void *context);
- void (*activate)(struct mlx4_dev *dev, void *context);
struct list_head list;
enum mlx4_protocol protocol;
int flags;
--
2.35.3
* [PATCH net-next 05/10] mlx4: Move the bond work to the core driver
2023-08-04 15:05 [PATCH net-next 00/10] Convert mlx4 to use auxiliary bus Petr Pavlu
` (3 preceding siblings ...)
2023-08-04 15:05 ` [PATCH net-next 04/10] mlx4: Get rid of the mlx4_interface.activate callback Petr Pavlu
@ 2023-08-04 15:05 ` Petr Pavlu
2023-08-08 18:56 ` Leon Romanovsky
2023-08-04 15:05 ` [PATCH net-next 06/10] mlx4: Avoid resetting MLX4_INTFF_BONDING per driver Petr Pavlu
` (6 subsequent siblings)
11 siblings, 1 reply; 28+ messages in thread
From: Petr Pavlu @ 2023-08-04 15:05 UTC (permalink / raw)
To: tariqt, yishaih, leon
Cc: davem, edumazet, kuba, pabeni, jgg, netdev, linux-rdma,
linux-kernel, Petr Pavlu
Function mlx4_en_queue_bond_work() is used in mlx4_en to start a bond
reconfiguration. It gathers data about a new port map setting, takes
a reference on the netdev that triggered the change and queues a work
object on mlx4_en_priv.mdev.workqueue to perform the operation. The
scheduled work is mlx4_en_bond_work() which calls
mlx4_bond()/mlx4_unbond() and consequently mlx4_do_bond().
At the same time, function mlx4_change_port_types() in mlx4_core might
be invoked to change the port type configuration. As part of its logic,
it re-registers the whole device by calling mlx4_unregister_device(),
followed by mlx4_register_device().
The two operations can result in concurrent access to the data about
currently active interfaces on the device.
Functions mlx4_register_device() and mlx4_unregister_device() lock the
intf_mutex to gain exclusive access to this data. The current
implementation of mlx4_do_bond() does not take the lock, which could
result in unexpected behavior. The updated version of mlx4_do_bond()
for use with the auxiliary bus locks the intf_mutex when accessing the
new auxiliary device array.
However, doing so can then result in the following deadlock:
* A two-port mlx4 device is configured as an Ethernet bond.
* One of the ports is changed from eth to ib, for instance, by writing
into a mlx4_port<x> sysfs attribute file.
* mlx4_change_port_types() is called to update port types. It invokes
mlx4_unregister_device() to unregister the device which locks the
intf_mutex and starts removing all associated interfaces.
* Function mlx4_en_remove() gets invoked and starts destroying its first
netdev. This triggers mlx4_en_netdev_event() which recognizes that the
configured bond is broken. It runs mlx4_en_queue_bond_work() which
takes a reference on the netdev. Removing the netdev now cannot
proceed until the work is completed.
* Work function mlx4_en_bond_work() gets scheduled. It calls
mlx4_unbond() -> mlx4_do_bond(). The latter function tries to lock the
intf_mutex but that is not possible because it is held already by
mlx4_unregister_device().
This particular case could possibly be solved by unregistering the
mlx4_en_netdev_event() notifier earlier in mlx4_en_remove(), but it
seems better to decouple mlx4_en further and break this reference order.
Avoid this scenario by recognizing that the bond reconfiguration
operates only on a mlx4_dev. The logic to queue and execute the bond
work can be moved into the mlx4_core driver. Only a reference on the
respective mlx4_dev object needs to be held during the work's
lifetime. This removes a call from mlx4_en that can directly result in
needing to lock the intf_mutex; taking that lock remains a privilege of
the core driver.
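The deadlock described above can be pictured as a cycle in a wait-for
graph. The following is a user-space sketch only, not kernel code: the
enum names, struct wait_state, and the helpers deadlocked(),
before_patch() and after_patch() are all hypothetical and exist purely
to illustrate the reasoning. It assumes each actor blocks on at most one
resource and each resource has at most one holder.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model: two actors and the resources they hold or await. */
enum actor { UNREGISTER, BOND_WORK, NUM_ACTORS };
enum resource { NONE = -1, INTF_MUTEX, NETDEV_REF, MLX4_DEV_REF, NUM_RESOURCES };

struct wait_state {
	enum resource waits_for[NUM_ACTORS]; /* what each actor blocks on */
	int held_by[NUM_RESOURCES];          /* actor holding it, or -1 */
};

/* Follow "actor waits for a resource held by another actor" edges. */
static bool deadlocked(const struct wait_state *s, enum actor start)
{
	int cur = start;
	int steps;

	for (steps = 0; steps < NUM_ACTORS; steps++) {
		enum resource r = s->waits_for[cur];

		if (r == NONE || s->held_by[r] < 0)
			return false; /* chain ends, progress is possible */
		cur = s->held_by[r];
		if (cur == (int)start)
			return true;  /* back at the start: a cycle */
	}
	return false;
}

/* Before: the queued work pins the netdev that the unregister path,
 * while holding intf_mutex, is waiting to see released. */
static struct wait_state before_patch(void)
{
	struct wait_state s = {
		.waits_for = { [UNREGISTER] = NETDEV_REF,
			       [BOND_WORK] = INTF_MUTEX },
		.held_by = { [INTF_MUTEX] = UNREGISTER,
			     [NETDEV_REF] = BOND_WORK,
			     [MLX4_DEV_REF] = -1 },
	};
	return s;
}

/* After: the work pins only the mlx4_dev, so the unregister path does
 * not wait on it and will eventually drop intf_mutex. */
static struct wait_state after_patch(void)
{
	struct wait_state s = {
		.waits_for = { [UNREGISTER] = NONE,
			       [BOND_WORK] = INTF_MUTEX },
		.held_by = { [INTF_MUTEX] = UNREGISTER,
			     [NETDEV_REF] = -1,
			     [MLX4_DEV_REF] = BOND_WORK },
	};
	return s;
}
```

In this model the work item still has to wait for the intf_mutex after
the change, but nothing the unregister path waits on is held by the
work, so the chain ends instead of closing into a cycle.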
Signed-off-by: Petr Pavlu <petr.pavlu@suse.com>
Tested-by: Leon Romanovsky <leon@kernel.org>
---
.../net/ethernet/mellanox/mlx4/en_netdev.c | 62 +-----------------
drivers/net/ethernet/mellanox/mlx4/main.c | 65 +++++++++++++++++--
drivers/net/ethernet/mellanox/mlx4/mlx4.h | 5 ++
include/linux/mlx4/device.h | 13 ++++
include/linux/mlx4/driver.h | 19 ------
5 files changed, 77 insertions(+), 87 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx4/en_netdev.c b/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
index 7066c426b95c..33bbcced8105 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
@@ -2894,63 +2894,6 @@ static const struct xdp_metadata_ops mlx4_xdp_metadata_ops = {
.xmo_rx_hash = mlx4_en_xdp_rx_hash,
};
-struct mlx4_en_bond {
- struct work_struct work;
- struct mlx4_en_priv *priv;
- int is_bonded;
- struct mlx4_port_map port_map;
-};
-
-static void mlx4_en_bond_work(struct work_struct *work)
-{
- struct mlx4_en_bond *bond = container_of(work,
- struct mlx4_en_bond,
- work);
- int err = 0;
- struct mlx4_dev *dev = bond->priv->mdev->dev;
-
- if (bond->is_bonded) {
- if (!mlx4_is_bonded(dev)) {
- err = mlx4_bond(dev);
- if (err)
- en_err(bond->priv, "Fail to bond device\n");
- }
- if (!err) {
- err = mlx4_port_map_set(dev, &bond->port_map);
- if (err)
- en_err(bond->priv, "Fail to set port map [%d][%d]: %d\n",
- bond->port_map.port1,
- bond->port_map.port2,
- err);
- }
- } else if (mlx4_is_bonded(dev)) {
- err = mlx4_unbond(dev);
- if (err)
- en_err(bond->priv, "Fail to unbond device\n");
- }
- dev_put(bond->priv->dev);
- kfree(bond);
-}
-
-static int mlx4_en_queue_bond_work(struct mlx4_en_priv *priv, int is_bonded,
- u8 v2p_p1, u8 v2p_p2)
-{
- struct mlx4_en_bond *bond;
-
- bond = kzalloc(sizeof(*bond), GFP_ATOMIC);
- if (!bond)
- return -ENOMEM;
-
- INIT_WORK(&bond->work, mlx4_en_bond_work);
- bond->priv = priv;
- bond->is_bonded = is_bonded;
- bond->port_map.port1 = v2p_p1;
- bond->port_map.port2 = v2p_p2;
- dev_hold(priv->dev);
- queue_work(priv->mdev->workqueue, &bond->work);
- return 0;
-}
-
int mlx4_en_netdev_event(struct notifier_block *this,
unsigned long event, void *ptr)
{
@@ -2960,7 +2903,6 @@ int mlx4_en_netdev_event(struct notifier_block *this,
struct mlx4_dev *dev;
int i, num_eth_ports = 0;
bool do_bond = true;
- struct mlx4_en_priv *priv;
u8 v2p_port1 = 0;
u8 v2p_port2 = 0;
@@ -2995,7 +2937,6 @@ int mlx4_en_netdev_event(struct notifier_block *this,
if ((do_bond && (event != NETDEV_BONDING_INFO)) || !port)
return NOTIFY_DONE;
- priv = netdev_priv(ndev);
if (do_bond) {
struct netdev_notifier_bonding_info *notifier_info = ptr;
struct netdev_bonding_info *bonding_info =
@@ -3062,8 +3003,7 @@ int mlx4_en_netdev_event(struct notifier_block *this,
}
}
- mlx4_en_queue_bond_work(priv, do_bond,
- v2p_port1, v2p_port2);
+ mlx4_queue_bond_work(dev, do_bond, v2p_port1, v2p_port2);
return NOTIFY_DONE;
}
diff --git a/drivers/net/ethernet/mellanox/mlx4/main.c b/drivers/net/ethernet/mellanox/mlx4/main.c
index 5f3ba8385e23..0ed490b99163 100644
--- a/drivers/net/ethernet/mellanox/mlx4/main.c
+++ b/drivers/net/ethernet/mellanox/mlx4/main.c
@@ -1441,7 +1441,7 @@ static int mlx4_mf_unbond(struct mlx4_dev *dev)
return ret;
}
-int mlx4_bond(struct mlx4_dev *dev)
+static int mlx4_bond(struct mlx4_dev *dev)
{
int ret = 0;
struct mlx4_priv *priv = mlx4_priv(dev);
@@ -1467,9 +1467,8 @@ int mlx4_bond(struct mlx4_dev *dev)
return ret;
}
-EXPORT_SYMBOL_GPL(mlx4_bond);
-int mlx4_unbond(struct mlx4_dev *dev)
+static int mlx4_unbond(struct mlx4_dev *dev)
{
int ret = 0;
struct mlx4_priv *priv = mlx4_priv(dev);
@@ -1496,10 +1495,8 @@ int mlx4_unbond(struct mlx4_dev *dev)
return ret;
}
-EXPORT_SYMBOL_GPL(mlx4_unbond);
-
-int mlx4_port_map_set(struct mlx4_dev *dev, struct mlx4_port_map *v2p)
+static int mlx4_port_map_set(struct mlx4_dev *dev, struct mlx4_port_map *v2p)
{
u8 port1 = v2p->port1;
u8 port2 = v2p->port2;
@@ -1541,7 +1538,61 @@ int mlx4_port_map_set(struct mlx4_dev *dev, struct mlx4_port_map *v2p)
mutex_unlock(&priv->bond_mutex);
return err;
}
-EXPORT_SYMBOL_GPL(mlx4_port_map_set);
+
+struct mlx4_bond {
+ struct work_struct work;
+ struct mlx4_dev *dev;
+ int is_bonded;
+ struct mlx4_port_map port_map;
+};
+
+static void mlx4_bond_work(struct work_struct *work)
+{
+ struct mlx4_bond *bond = container_of(work, struct mlx4_bond, work);
+ int err = 0;
+
+ if (bond->is_bonded) {
+ if (!mlx4_is_bonded(bond->dev)) {
+ err = mlx4_bond(bond->dev);
+ if (err)
+ mlx4_err(bond->dev, "Fail to bond device\n");
+ }
+ if (!err) {
+ err = mlx4_port_map_set(bond->dev, &bond->port_map);
+ if (err)
+ mlx4_err(bond->dev,
+ "Fail to set port map [%d][%d]: %d\n",
+ bond->port_map.port1,
+ bond->port_map.port2, err);
+ }
+ } else if (mlx4_is_bonded(bond->dev)) {
+ err = mlx4_unbond(bond->dev);
+ if (err)
+ mlx4_err(bond->dev, "Fail to unbond device\n");
+ }
+ put_device(&bond->dev->persist->pdev->dev);
+ kfree(bond);
+}
+
+int mlx4_queue_bond_work(struct mlx4_dev *dev, int is_bonded, u8 v2p_p1,
+ u8 v2p_p2)
+{
+ struct mlx4_bond *bond;
+
+ bond = kzalloc(sizeof(*bond), GFP_ATOMIC);
+ if (!bond)
+ return -ENOMEM;
+
+ INIT_WORK(&bond->work, mlx4_bond_work);
+ get_device(&dev->persist->pdev->dev);
+ bond->dev = dev;
+ bond->is_bonded = is_bonded;
+ bond->port_map.port1 = v2p_p1;
+ bond->port_map.port2 = v2p_p2;
+ queue_work(mlx4_wq, &bond->work);
+ return 0;
+}
+EXPORT_SYMBOL(mlx4_queue_bond_work);
static int mlx4_load_fw(struct mlx4_dev *dev)
{
diff --git a/drivers/net/ethernet/mellanox/mlx4/mlx4.h b/drivers/net/ethernet/mellanox/mlx4/mlx4.h
index 10f12e4992f1..ece9acb6a869 100644
--- a/drivers/net/ethernet/mellanox/mlx4/mlx4.h
+++ b/drivers/net/ethernet/mellanox/mlx4/mlx4.h
@@ -863,6 +863,11 @@ struct mlx4_steer {
struct list_head steer_entries[MLX4_NUM_STEERS];
};
+struct mlx4_port_map {
+ u8 port1;
+ u8 port2;
+};
+
enum {
MLX4_PCI_DEV_IS_VF = 1 << 0,
MLX4_PCI_DEV_FORCE_SENSE_PORT = 1 << 1,
diff --git a/include/linux/mlx4/device.h b/include/linux/mlx4/device.h
index 6646634a0b9d..049d8a4b044d 100644
--- a/include/linux/mlx4/device.h
+++ b/include/linux/mlx4/device.h
@@ -1087,6 +1087,19 @@ static inline void *mlx4_buf_offset(struct mlx4_buf *buf, int offset)
(offset & (PAGE_SIZE - 1));
}
+static inline int mlx4_is_bonded(struct mlx4_dev *dev)
+{
+ return !!(dev->flags & MLX4_FLAG_BONDED);
+}
+
+static inline int mlx4_is_mf_bonded(struct mlx4_dev *dev)
+{
+ return (mlx4_is_bonded(dev) && mlx4_is_mfunc(dev));
+}
+
+int mlx4_queue_bond_work(struct mlx4_dev *dev, int is_bonded, u8 v2p_p1,
+ u8 v2p_p2);
+
int mlx4_pd_alloc(struct mlx4_dev *dev, u32 *pdn);
void mlx4_pd_free(struct mlx4_dev *dev, u32 pdn);
int mlx4_xrcd_alloc(struct mlx4_dev *dev, u32 *xrcdn);
diff --git a/include/linux/mlx4/driver.h b/include/linux/mlx4/driver.h
index 0f8c9ba4c574..781d5a0c2faa 100644
--- a/include/linux/mlx4/driver.h
+++ b/include/linux/mlx4/driver.h
@@ -66,25 +66,6 @@ struct mlx4_interface {
int mlx4_register_interface(struct mlx4_interface *intf);
void mlx4_unregister_interface(struct mlx4_interface *intf);
-int mlx4_bond(struct mlx4_dev *dev);
-int mlx4_unbond(struct mlx4_dev *dev);
-static inline int mlx4_is_bonded(struct mlx4_dev *dev)
-{
- return !!(dev->flags & MLX4_FLAG_BONDED);
-}
-
-static inline int mlx4_is_mf_bonded(struct mlx4_dev *dev)
-{
- return (mlx4_is_bonded(dev) && mlx4_is_mfunc(dev));
-}
-
-struct mlx4_port_map {
- u8 port1;
- u8 port2;
-};
-
-int mlx4_port_map_set(struct mlx4_dev *dev, struct mlx4_port_map *v2p);
-
int mlx4_register_event_notifier(struct mlx4_dev *dev,
struct notifier_block *nb);
int mlx4_unregister_event_notifier(struct mlx4_dev *dev,
--
2.35.3
* [PATCH net-next 06/10] mlx4: Avoid resetting MLX4_INTFF_BONDING per driver
2023-08-04 15:05 [PATCH net-next 00/10] Convert mlx4 to use auxiliary bus Petr Pavlu
` (4 preceding siblings ...)
2023-08-04 15:05 ` [PATCH net-next 05/10] mlx4: Move the bond work to the core driver Petr Pavlu
@ 2023-08-04 15:05 ` Petr Pavlu
2023-08-08 18:57 ` Leon Romanovsky
2023-08-04 15:05 ` [PATCH net-next 07/10] mlx4: Register mlx4 devices to an auxiliary virtual bus Petr Pavlu
` (5 subsequent siblings)
11 siblings, 1 reply; 28+ messages in thread
From: Petr Pavlu @ 2023-08-04 15:05 UTC (permalink / raw)
To: tariqt, yishaih, leon
Cc: davem, edumazet, kuba, pabeni, jgg, netdev, linux-rdma,
linux-kernel, Petr Pavlu
The mlx4_core driver has logic that allows a sub-driver to set the
MLX4_INTFF_BONDING flag, which then causes function mlx4_do_bond() to
ask the sub-driver to fully re-probe a device when its bonding
configuration changes.
Performing this operation is disallowed in mlx4_register_interface()
when it is detected that any mlx4 device is multifunction (SRIOV). The
code then resets MLX4_INTFF_BONDING in the driver flags.
Move this check directly into mlx4_do_bond(). This provides a better
separation because mlx4_core no longer directly modifies the sub-driver
flags, and it allows the intf.c code to stop explicitly keeping track of
all mlx4 devices once it is switched to an auxiliary bus.
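The difference between the two placements of the check can be sketched
in user space. This is an illustration under assumptions, not the
driver code: SKETCH_INTFF_BONDING stands in for MLX4_INTFF_BONDING, and
struct sketch_driver, struct sketch_device, old_register() and
new_should_reprobe() are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_INTFF_BONDING (1 << 0) /* stands in for MLX4_INTFF_BONDING */

struct sketch_driver { int flags; };     /* hypothetical sub-driver */
struct sketch_device { bool is_mfunc; }; /* hypothetical mlx4 device */

/* Old placement: at registration time the core clears the flag in the
 * shared driver structure if any known device is multifunction. */
static void old_register(struct sketch_driver *drv,
			 const struct sketch_device *devs, int n)
{
	int i;

	for (i = 0; i < n; i++)
		if (devs[i].is_mfunc && (drv->flags & SKETCH_INTFF_BONDING))
			drv->flags &= ~SKETCH_INTFF_BONDING;
}

/* New placement: decide per device at bond time and leave the driver
 * flags untouched. */
static bool new_should_reprobe(const struct sketch_driver *drv,
			       const struct sketch_device *dev)
{
	if (!(drv->flags & SKETCH_INTFF_BONDING))
		return false;
	return !dev->is_mfunc;
}
```

In this model the old approach needs to iterate over every known device
and mutates state owned by the sub-driver, while the new check only
needs the device that is being bonded.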
Signed-off-by: Petr Pavlu <petr.pavlu@suse.com>
Tested-by: Leon Romanovsky <leon@kernel.org>
---
drivers/net/ethernet/mellanox/mlx4/intf.c | 19 +++++++++++--------
1 file changed, 11 insertions(+), 8 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx4/intf.c b/drivers/net/ethernet/mellanox/mlx4/intf.c
index 8b2c1404cb66..30aead34ce08 100644
--- a/drivers/net/ethernet/mellanox/mlx4/intf.c
+++ b/drivers/net/ethernet/mellanox/mlx4/intf.c
@@ -96,11 +96,6 @@ int mlx4_register_interface(struct mlx4_interface *intf)
list_add_tail(&intf->list, &intf_list);
list_for_each_entry(priv, &dev_list, dev_list) {
- if (mlx4_is_mfunc(&priv->dev) && (intf->flags & MLX4_INTFF_BONDING)) {
- mlx4_dbg(&priv->dev,
- "SRIOV, disabling HA mode for intf proto %d\n", intf->protocol);
- intf->flags &= ~MLX4_INTFF_BONDING;
- }
mlx4_add_device(intf, priv);
}
@@ -155,10 +150,18 @@ int mlx4_do_bond(struct mlx4_dev *dev, bool enable)
spin_lock_irqsave(&priv->ctx_lock, flags);
list_for_each_entry_safe(dev_ctx, temp_dev_ctx, &priv->ctx_list, list) {
- if (dev_ctx->intf->flags & MLX4_INTFF_BONDING) {
- list_add_tail(&dev_ctx->bond_list, &bond_list);
- list_del(&dev_ctx->list);
+ if (!(dev_ctx->intf->flags & MLX4_INTFF_BONDING))
+ continue;
+
+ if (mlx4_is_mfunc(dev)) {
+ mlx4_dbg(dev,
+ "SRIOV, disabled HA mode for intf proto %d\n",
+ dev_ctx->intf->protocol);
+ continue;
}
+
+ list_add_tail(&dev_ctx->bond_list, &bond_list);
+ list_del(&dev_ctx->list);
}
spin_unlock_irqrestore(&priv->ctx_lock, flags);
--
2.35.3
* [PATCH net-next 07/10] mlx4: Register mlx4 devices to an auxiliary virtual bus
2023-08-04 15:05 [PATCH net-next 00/10] Convert mlx4 to use auxiliary bus Petr Pavlu
` (5 preceding siblings ...)
2023-08-04 15:05 ` [PATCH net-next 06/10] mlx4: Avoid resetting MLX4_INTFF_BONDING per driver Petr Pavlu
@ 2023-08-04 15:05 ` Petr Pavlu
2023-08-06 3:16 ` Zhu Yanjun
2023-08-08 18:57 ` Leon Romanovsky
2023-08-04 15:05 ` [PATCH net-next 08/10] mlx4: Connect the ethernet part to the auxiliary bus Petr Pavlu
` (4 subsequent siblings)
11 siblings, 2 replies; 28+ messages in thread
From: Petr Pavlu @ 2023-08-04 15:05 UTC (permalink / raw)
To: tariqt, yishaih, leon
Cc: davem, edumazet, kuba, pabeni, jgg, netdev, linux-rdma,
linux-kernel, Petr Pavlu
Add an auxiliary virtual bus to model the mlx4 driver structure. The
code is added alongside the current custom device management logic.
Subsequent patches switch mlx4_en and mlx4_ib to the auxiliary bus and
the old interface is then removed.
Structure mlx4_priv gains a new adev dynamic array to keep track of its
auxiliary devices. Access to the array is protected by the global
intf_mutex.
Functions mlx4_register_device() and mlx4_unregister_device() are
updated to expose auxiliary devices on the bus in order to load mlx4_en
and/or mlx4_ib. Functions mlx4_register_auxiliary_driver() and
mlx4_unregister_auxiliary_driver() are added to substitute
mlx4_register_interface() and mlx4_unregister_interface(), respectively.
Function mlx4_do_bond() is adjusted to walk over the adev array and
re-add a specific auxiliary device if its driver sets the
MLX4_INTFF_BONDING flag.
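For readers unfamiliar with how an auxiliary device is paired with a
driver, the following user-space sketch mirrors the name matching as I
understand it: the device is named "<module>.<name>.<id>" and a driver
id_table entry matches when it equals everything before the last dot.
The helper sketch_aux_match() is hypothetical and the exact kernel
behavior should be checked against drivers/base/auxiliary.c. This
naming is presumably why MLX4_ADEV_NAME is expected to stay in sync
with KBUILD_MODNAME.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* dev_name is assumed to look like "<module>.<name>.<id>"; id_name is
 * an id_table entry such as "mlx4_core.eth". */
static bool sketch_aux_match(const char *dev_name, const char *id_name)
{
	const char *last_dot = strrchr(dev_name, '.');
	size_t match_len;

	if (!last_dot)
		return false;

	/* Compare only the part before the trailing ".<id>". */
	match_len = (size_t)(last_dot - dev_name);
	return strlen(id_name) == match_len &&
	       strncmp(dev_name, id_name, match_len) == 0;
}
```

Under this assumption a device named "mlx4_core.eth.0" would bind to a
driver whose id_table lists "mlx4_core.eth", and udev can load the
matching module from the same string.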
Signed-off-by: Petr Pavlu <petr.pavlu@suse.com>
Tested-by: Leon Romanovsky <leon@kernel.org>
---
drivers/net/ethernet/mellanox/mlx4/Kconfig | 1 +
drivers/net/ethernet/mellanox/mlx4/intf.c | 230 ++++++++++++++++++++-
drivers/net/ethernet/mellanox/mlx4/main.c | 17 +-
drivers/net/ethernet/mellanox/mlx4/mlx4.h | 6 +
include/linux/mlx4/device.h | 7 +
include/linux/mlx4/driver.h | 11 +
6 files changed, 268 insertions(+), 4 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx4/Kconfig b/drivers/net/ethernet/mellanox/mlx4/Kconfig
index 1b4b1f642317..825e05fb8607 100644
--- a/drivers/net/ethernet/mellanox/mlx4/Kconfig
+++ b/drivers/net/ethernet/mellanox/mlx4/Kconfig
@@ -27,6 +27,7 @@ config MLX4_EN_DCB
config MLX4_CORE
tristate
depends on PCI
+ select AUXILIARY_BUS
select NET_DEVLINK
default n
diff --git a/drivers/net/ethernet/mellanox/mlx4/intf.c b/drivers/net/ethernet/mellanox/mlx4/intf.c
index 30aead34ce08..4b1e18e4a682 100644
--- a/drivers/net/ethernet/mellanox/mlx4/intf.c
+++ b/drivers/net/ethernet/mellanox/mlx4/intf.c
@@ -48,6 +48,89 @@ struct mlx4_device_context {
static LIST_HEAD(intf_list);
static LIST_HEAD(dev_list);
static DEFINE_MUTEX(intf_mutex);
+static DEFINE_IDA(mlx4_adev_ida);
+
+static const struct mlx4_adev_device {
+ const char *suffix;
+ bool (*is_supported)(struct mlx4_dev *dev);
+} mlx4_adev_devices[1] = {};
+
+int mlx4_adev_init(struct mlx4_dev *dev)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+
+ priv->adev_idx = ida_alloc(&mlx4_adev_ida, GFP_KERNEL);
+ if (priv->adev_idx < 0)
+ return priv->adev_idx;
+
+ priv->adev = kcalloc(ARRAY_SIZE(mlx4_adev_devices),
+ sizeof(struct mlx4_adev *), GFP_KERNEL);
+ if (!priv->adev) {
+ ida_free(&mlx4_adev_ida, priv->adev_idx);
+ return -ENOMEM;
+ }
+
+ return 0;
+}
+
+void mlx4_adev_cleanup(struct mlx4_dev *dev)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+
+ kfree(priv->adev);
+ ida_free(&mlx4_adev_ida, priv->adev_idx);
+}
+
+static void adev_release(struct device *dev)
+{
+ struct mlx4_adev *mlx4_adev =
+ container_of(dev, struct mlx4_adev, adev.dev);
+ struct mlx4_priv *priv = mlx4_priv(mlx4_adev->mdev);
+ int idx = mlx4_adev->idx;
+
+ kfree(mlx4_adev);
+ priv->adev[idx] = NULL;
+}
+
+static struct mlx4_adev *add_adev(struct mlx4_dev *dev, int idx)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ const char *suffix = mlx4_adev_devices[idx].suffix;
+ struct auxiliary_device *adev;
+ struct mlx4_adev *madev;
+ int ret;
+
+ madev = kzalloc(sizeof(*madev), GFP_KERNEL);
+ if (!madev)
+ return ERR_PTR(-ENOMEM);
+
+ adev = &madev->adev;
+ adev->id = priv->adev_idx;
+ adev->name = suffix;
+ adev->dev.parent = &dev->persist->pdev->dev;
+ adev->dev.release = adev_release;
+ madev->mdev = dev;
+ madev->idx = idx;
+
+ ret = auxiliary_device_init(adev);
+ if (ret) {
+ kfree(madev);
+ return ERR_PTR(ret);
+ }
+
+ ret = auxiliary_device_add(adev);
+ if (ret) {
+ auxiliary_device_uninit(adev);
+ return ERR_PTR(ret);
+ }
+ return madev;
+}
+
+static void del_adev(struct auxiliary_device *adev)
+{
+ auxiliary_device_delete(adev);
+ auxiliary_device_uninit(adev);
+}
static void mlx4_add_device(struct mlx4_interface *intf, struct mlx4_priv *priv)
{
@@ -120,12 +203,24 @@ void mlx4_unregister_interface(struct mlx4_interface *intf)
}
EXPORT_SYMBOL_GPL(mlx4_unregister_interface);
+int mlx4_register_auxiliary_driver(struct mlx4_adrv *madrv)
+{
+ return auxiliary_driver_register(&madrv->adrv);
+}
+EXPORT_SYMBOL_GPL(mlx4_register_auxiliary_driver);
+
+void mlx4_unregister_auxiliary_driver(struct mlx4_adrv *madrv)
+{
+ auxiliary_driver_unregister(&madrv->adrv);
+}
+EXPORT_SYMBOL_GPL(mlx4_unregister_auxiliary_driver);
+
int mlx4_do_bond(struct mlx4_dev *dev, bool enable)
{
struct mlx4_priv *priv = mlx4_priv(dev);
struct mlx4_device_context *dev_ctx = NULL, *temp_dev_ctx;
unsigned long flags;
- int ret;
+ int i, ret;
LIST_HEAD(bond_list);
if (!(dev->caps.flags2 & MLX4_DEV_CAP_FLAG2_PORT_REMAP))
@@ -177,6 +272,57 @@ int mlx4_do_bond(struct mlx4_dev *dev, bool enable)
dev_ctx->intf->protocol, enable ?
"enabled" : "disabled");
}
+
+ mutex_lock(&intf_mutex);
+
+ for (i = 0; i < ARRAY_SIZE(mlx4_adev_devices); i++) {
+ struct mlx4_adev *madev = priv->adev[i];
+ struct mlx4_adrv *madrv;
+ enum mlx4_protocol protocol;
+
+ if (!madev)
+ continue;
+
+ device_lock(&madev->adev.dev);
+ if (!madev->adev.dev.driver) {
+ device_unlock(&madev->adev.dev);
+ continue;
+ }
+
+ madrv = container_of(madev->adev.dev.driver, struct mlx4_adrv,
+ adrv.driver);
+ if (!(madrv->flags & MLX4_INTFF_BONDING)) {
+ device_unlock(&madev->adev.dev);
+ continue;
+ }
+
+ if (mlx4_is_mfunc(dev)) {
+ mlx4_dbg(dev,
+ "SRIOV, disabled HA mode for intf proto %d\n",
+ madrv->protocol);
+ device_unlock(&madev->adev.dev);
+ continue;
+ }
+
+ protocol = madrv->protocol;
+ device_unlock(&madev->adev.dev);
+
+ del_adev(&madev->adev);
+ priv->adev[i] = add_adev(dev, i);
+ if (IS_ERR(priv->adev[i])) {
+ mlx4_warn(dev, "Device[%d] (%s) failed to load\n", i,
+ mlx4_adev_devices[i].suffix);
+ priv->adev[i] = NULL;
+ continue;
+ }
+
+ mlx4_dbg(dev,
+ "Interface for protocol %d restarted with bonded mode %s\n",
+ protocol, enable ? "enabled" : "disabled");
+ }
+
+ mutex_unlock(&intf_mutex);
+
return 0;
}
@@ -206,10 +352,80 @@ int mlx4_unregister_event_notifier(struct mlx4_dev *dev,
}
EXPORT_SYMBOL(mlx4_unregister_event_notifier);
+static int add_drivers(struct mlx4_dev *dev)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ int i, ret = 0;
+
+ for (i = 0; i < ARRAY_SIZE(mlx4_adev_devices); i++) {
+ bool is_supported = false;
+
+ if (priv->adev[i])
+ continue;
+
+ if (mlx4_adev_devices[i].is_supported)
+ is_supported = mlx4_adev_devices[i].is_supported(dev);
+
+ if (!is_supported)
+ continue;
+
+ priv->adev[i] = add_adev(dev, i);
+ if (IS_ERR(priv->adev[i])) {
+ mlx4_warn(dev, "Device[%d] (%s) failed to load\n", i,
+ mlx4_adev_devices[i].suffix);
+ /* We continue to rescan drivers and leave to the caller
+ * to make decision if to release everything or
+ * continue. */
+ ret = PTR_ERR(priv->adev[i]);
+ priv->adev[i] = NULL;
+ }
+ }
+ return ret;
+}
+
+static void delete_drivers(struct mlx4_dev *dev)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ bool delete_all;
+ int i;
+
+ delete_all = !(dev->persist->interface_state & MLX4_INTERFACE_STATE_UP);
+
+ for (i = ARRAY_SIZE(mlx4_adev_devices) - 1; i >= 0; i--) {
+ bool is_supported = false;
+
+ if (!priv->adev[i])
+ continue;
+
+ if (mlx4_adev_devices[i].is_supported && !delete_all)
+ is_supported = mlx4_adev_devices[i].is_supported(dev);
+
+ if (is_supported)
+ continue;
+
+ del_adev(&priv->adev[i]->adev);
+ priv->adev[i] = NULL;
+ }
+}
+
+/* This function is used after mlx4_dev is reconfigured.
+ */
+static int rescan_drivers_locked(struct mlx4_dev *dev)
+{
+ lockdep_assert_held(&intf_mutex);
+
+ delete_drivers(dev);
+ if (!(dev->persist->interface_state & MLX4_INTERFACE_STATE_UP))
+ return 0;
+
+ return add_drivers(dev);
+}
+
int mlx4_register_device(struct mlx4_dev *dev)
{
struct mlx4_priv *priv = mlx4_priv(dev);
struct mlx4_interface *intf;
+ int ret;
mutex_lock(&intf_mutex);
@@ -218,10 +434,18 @@ int mlx4_register_device(struct mlx4_dev *dev)
list_for_each_entry(intf, &intf_list, list)
mlx4_add_device(intf, priv);
+ ret = rescan_drivers_locked(dev);
+
mutex_unlock(&intf_mutex);
+
+ if (ret) {
+ mlx4_unregister_device(dev);
+ return ret;
+ }
+
mlx4_start_catas_poll(dev);
- return 0;
+ return ret;
}
void mlx4_unregister_device(struct mlx4_dev *dev)
@@ -253,6 +477,8 @@ void mlx4_unregister_device(struct mlx4_dev *dev)
list_del(&priv->dev_list);
dev->persist->interface_state &= ~MLX4_INTERFACE_STATE_UP;
+ rescan_drivers_locked(dev);
+
mutex_unlock(&intf_mutex);
}
diff --git a/drivers/net/ethernet/mellanox/mlx4/main.c b/drivers/net/ethernet/mellanox/mlx4/main.c
index 0ed490b99163..c4ec7377aa71 100644
--- a/drivers/net/ethernet/mellanox/mlx4/main.c
+++ b/drivers/net/ethernet/mellanox/mlx4/main.c
@@ -3429,6 +3429,10 @@ static int mlx4_load_one(struct pci_dev *pdev, int pci_dev_data,
INIT_LIST_HEAD(&priv->ctx_list);
spin_lock_init(&priv->ctx_lock);
+ err = mlx4_adev_init(dev);
+ if (err)
+ return err;
+
ATOMIC_INIT_NOTIFIER_HEAD(&priv->event_nh);
mutex_init(&priv->port_mutex);
@@ -3455,10 +3459,11 @@ static int mlx4_load_one(struct pci_dev *pdev, int pci_dev_data,
err = mlx4_get_ownership(dev);
if (err) {
if (err < 0)
- return err;
+ goto err_adev;
else {
mlx4_warn(dev, "Multiple PFs not yet supported - Skipping PF\n");
- return -EINVAL;
+ err = -EINVAL;
+ goto err_adev;
}
}
@@ -3806,6 +3811,9 @@ static int mlx4_load_one(struct pci_dev *pdev, int pci_dev_data,
mlx4_free_ownership(dev);
kfree(dev_cap);
+
+err_adev:
+ mlx4_adev_cleanup(dev);
return err;
}
@@ -4186,6 +4194,8 @@ static void mlx4_unload_one(struct pci_dev *pdev)
mlx4_slave_destroy_special_qp_cap(dev);
kfree(dev->dev_vfs);
+ mlx4_adev_cleanup(dev);
+
mlx4_clean_dev(dev);
priv->pci_dev_data = pci_dev_data;
priv->removed = 1;
@@ -4573,6 +4583,9 @@ static int __init mlx4_init(void)
{
int ret;
+ WARN_ONCE(strcmp(MLX4_ADEV_NAME, KBUILD_MODNAME),
+ "mlx4_core name not in sync with kernel module name");
+
if (mlx4_verify_params())
return -EINVAL;
diff --git a/drivers/net/ethernet/mellanox/mlx4/mlx4.h b/drivers/net/ethernet/mellanox/mlx4/mlx4.h
index ece9acb6a869..d5050bfb342f 100644
--- a/drivers/net/ethernet/mellanox/mlx4/mlx4.h
+++ b/drivers/net/ethernet/mellanox/mlx4/mlx4.h
@@ -47,6 +47,7 @@
#include <linux/spinlock.h>
#include <net/devlink.h>
#include <linux/rwsem.h>
+#include <linux/auxiliary_bus.h>
#include <linux/notifier.h>
#include <linux/mlx4/device.h>
@@ -884,6 +885,8 @@ struct mlx4_priv {
struct list_head dev_list;
struct list_head ctx_list;
spinlock_t ctx_lock;
+ struct mlx4_adev **adev;
+ int adev_idx;
struct atomic_notifier_head event_nh;
int pci_dev_data;
@@ -1052,6 +1055,9 @@ void mlx4_catas_end(struct mlx4_dev *dev);
int mlx4_crdump_init(struct mlx4_dev *dev);
void mlx4_crdump_end(struct mlx4_dev *dev);
int mlx4_restart_one(struct pci_dev *pdev);
+
+int mlx4_adev_init(struct mlx4_dev *dev);
+void mlx4_adev_cleanup(struct mlx4_dev *dev);
int mlx4_register_device(struct mlx4_dev *dev);
void mlx4_unregister_device(struct mlx4_dev *dev);
void mlx4_dispatch_event(struct mlx4_dev *dev, enum mlx4_dev_event type,
diff --git a/include/linux/mlx4/device.h b/include/linux/mlx4/device.h
index 049d8a4b044d..27f42f713c89 100644
--- a/include/linux/mlx4/device.h
+++ b/include/linux/mlx4/device.h
@@ -33,6 +33,7 @@
#ifndef MLX4_DEVICE_H
#define MLX4_DEVICE_H
+#include <linux/auxiliary_bus.h>
#include <linux/if_ether.h>
#include <linux/pci.h>
#include <linux/completion.h>
@@ -889,6 +890,12 @@ struct mlx4_dev {
u8 uar_page_shift;
};
+struct mlx4_adev {
+ struct auxiliary_device adev;
+ struct mlx4_dev *mdev;
+ int idx;
+};
+
struct mlx4_clock_params {
u64 offset;
u8 bar;
diff --git a/include/linux/mlx4/driver.h b/include/linux/mlx4/driver.h
index 781d5a0c2faa..9cf157d381c6 100644
--- a/include/linux/mlx4/driver.h
+++ b/include/linux/mlx4/driver.h
@@ -34,9 +34,12 @@
#define MLX4_DRIVER_H
#include <net/devlink.h>
+#include <linux/auxiliary_bus.h>
#include <linux/notifier.h>
#include <linux/mlx4/device.h>
+#define MLX4_ADEV_NAME "mlx4_core"
+
struct mlx4_dev;
#define MLX4_MAC_MASK 0xffffffffffffULL
@@ -63,8 +66,16 @@ struct mlx4_interface {
int flags;
};
+struct mlx4_adrv {
+ struct auxiliary_driver adrv;
+ enum mlx4_protocol protocol;
+ int flags;
+};
+
int mlx4_register_interface(struct mlx4_interface *intf);
void mlx4_unregister_interface(struct mlx4_interface *intf);
+int mlx4_register_auxiliary_driver(struct mlx4_adrv *madrv);
+void mlx4_unregister_auxiliary_driver(struct mlx4_adrv *madrv);
int mlx4_register_event_notifier(struct mlx4_dev *dev,
struct notifier_block *nb);
--
2.35.3
* [PATCH net-next 08/10] mlx4: Connect the ethernet part to the auxiliary bus
2023-08-04 15:05 [PATCH net-next 00/10] Convert mlx4 to use auxiliary bus Petr Pavlu
` (6 preceding siblings ...)
2023-08-04 15:05 ` [PATCH net-next 07/10] mlx4: Register mlx4 devices to an auxiliary virtual bus Petr Pavlu
@ 2023-08-04 15:05 ` Petr Pavlu
2023-08-08 18:57 ` Leon Romanovsky
2023-08-04 15:05 ` [PATCH net-next 09/10] mlx4: Connect the infiniband " Petr Pavlu
` (3 subsequent siblings)
11 siblings, 1 reply; 28+ messages in thread
From: Petr Pavlu @ 2023-08-04 15:05 UTC (permalink / raw)
To: tariqt, yishaih, leon
Cc: davem, edumazet, kuba, pabeni, jgg, netdev, linux-rdma,
linux-kernel, Petr Pavlu
Use the auxiliary bus to perform device management of the ethernet part
of the mlx4 driver.
Signed-off-by: Petr Pavlu <petr.pavlu@suse.com>
Tested-by: Leon Romanovsky <leon@kernel.org>
---
drivers/net/ethernet/mellanox/mlx4/en_main.c | 67 ++++++++++++++------
drivers/net/ethernet/mellanox/mlx4/intf.c | 13 +++-
2 files changed, 59 insertions(+), 21 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx4/en_main.c b/drivers/net/ethernet/mellanox/mlx4/en_main.c
index 3824884ab515..2827d5373d9f 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_main.c
@@ -222,9 +222,11 @@ static int mlx4_en_event(struct notifier_block *this,
return NOTIFY_DONE;
}
-static void mlx4_en_remove(struct mlx4_dev *dev, void *endev_ptr)
+static void mlx4_en_remove(struct auxiliary_device *adev)
{
- struct mlx4_en_dev *mdev = endev_ptr;
+ struct mlx4_adev *madev = container_of(adev, struct mlx4_adev, adev);
+ struct mlx4_dev *dev = madev->mdev;
+ struct mlx4_en_dev *mdev = auxiliary_get_drvdata(adev);
int i;
mlx4_unregister_event_notifier(dev, &mdev->mlx_nb);
@@ -247,27 +249,36 @@ static void mlx4_en_remove(struct mlx4_dev *dev, void *endev_ptr)
kfree(mdev);
}
-static void *mlx4_en_add(struct mlx4_dev *dev)
+static int mlx4_en_probe(struct auxiliary_device *adev,
+ const struct auxiliary_device_id *id)
{
+ struct mlx4_adev *madev = container_of(adev, struct mlx4_adev, adev);
+ struct mlx4_dev *dev = madev->mdev;
struct mlx4_en_dev *mdev;
- int i;
+ int err, i;
printk_once(KERN_INFO "%s", mlx4_en_version);
mdev = kzalloc(sizeof(*mdev), GFP_KERNEL);
- if (!mdev)
+ if (!mdev) {
+ err = -ENOMEM;
goto err_free_res;
+ }
- if (mlx4_pd_alloc(dev, &mdev->priv_pdn))
+ err = mlx4_pd_alloc(dev, &mdev->priv_pdn);
+ if (err)
goto err_free_dev;
- if (mlx4_uar_alloc(dev, &mdev->priv_uar))
+ err = mlx4_uar_alloc(dev, &mdev->priv_uar);
+ if (err)
goto err_pd;
mdev->uar_map = ioremap((phys_addr_t) mdev->priv_uar.pfn << PAGE_SHIFT,
PAGE_SIZE);
- if (!mdev->uar_map)
+ if (!mdev->uar_map) {
+ err = -ENOMEM;
goto err_uar;
+ }
spin_lock_init(&mdev->uar_lock);
mdev->dev = dev;
@@ -279,13 +290,15 @@ static void *mlx4_en_add(struct mlx4_dev *dev)
if (!mdev->LSO_support)
mlx4_warn(mdev, "LSO not supported, please upgrade to later FW version to enable LSO\n");
- if (mlx4_mr_alloc(mdev->dev, mdev->priv_pdn, 0, ~0ull,
- MLX4_PERM_LOCAL_WRITE | MLX4_PERM_LOCAL_READ,
- 0, 0, &mdev->mr)) {
+ err = mlx4_mr_alloc(mdev->dev, mdev->priv_pdn, 0, ~0ull,
+ MLX4_PERM_LOCAL_WRITE | MLX4_PERM_LOCAL_READ, 0, 0,
+ &mdev->mr);
+ if (err) {
mlx4_err(mdev, "Failed allocating memory region\n");
goto err_map;
}
- if (mlx4_mr_enable(mdev->dev, &mdev->mr)) {
+ err = mlx4_mr_enable(mdev->dev, &mdev->mr);
+ if (err) {
mlx4_err(mdev, "Failed enabling memory region\n");
goto err_mr;
}
@@ -305,8 +318,10 @@ static void *mlx4_en_add(struct mlx4_dev *dev)
* Note: we cannot use the shared workqueue because of deadlocks caused
* by the rtnl lock */
mdev->workqueue = create_singlethread_workqueue("mlx4_en");
- if (!mdev->workqueue)
+ if (!mdev->workqueue) {
+ err = -ENOMEM;
goto err_mr;
+ }
/* At this stage all non-port specific tasks are complete:
* mark the card state as up */
@@ -334,7 +349,8 @@ static void *mlx4_en_add(struct mlx4_dev *dev)
mlx4_err(mdev, "Failed to create netdev notifier\n");
}
- return mdev;
+ auxiliary_set_drvdata(adev, mdev);
+ return 0;
err_mr:
(void) mlx4_mr_free(dev, &mdev->mr);
@@ -348,12 +364,23 @@ static void *mlx4_en_add(struct mlx4_dev *dev)
err_free_dev:
kfree(mdev);
err_free_res:
- return NULL;
+ return err;
}
-static struct mlx4_interface mlx4_en_interface = {
- .add = mlx4_en_add,
- .remove = mlx4_en_remove,
+static const struct auxiliary_device_id mlx4_en_id_table[] = {
+ { .name = MLX4_ADEV_NAME ".eth" },
+ {},
+};
+
+MODULE_DEVICE_TABLE(auxiliary, mlx4_en_id_table);
+
+static struct mlx4_adrv mlx4_en_adrv = {
+ .adrv = {
+ .name = "eth",
+ .probe = mlx4_en_probe,
+ .remove = mlx4_en_remove,
+ .id_table = mlx4_en_id_table,
+ },
.protocol = MLX4_PROT_ETH,
};
@@ -383,12 +410,12 @@ static int __init mlx4_en_init(void)
mlx4_en_verify_params();
mlx4_en_init_ptys2ethtool_map();
- return mlx4_register_interface(&mlx4_en_interface);
+ return mlx4_register_auxiliary_driver(&mlx4_en_adrv);
}
static void __exit mlx4_en_cleanup(void)
{
- mlx4_unregister_interface(&mlx4_en_interface);
+ mlx4_unregister_auxiliary_driver(&mlx4_en_adrv);
}
module_init(mlx4_en_init);
diff --git a/drivers/net/ethernet/mellanox/mlx4/intf.c b/drivers/net/ethernet/mellanox/mlx4/intf.c
index 4b1e18e4a682..0a27820ece2e 100644
--- a/drivers/net/ethernet/mellanox/mlx4/intf.c
+++ b/drivers/net/ethernet/mellanox/mlx4/intf.c
@@ -50,10 +50,21 @@ static LIST_HEAD(dev_list);
static DEFINE_MUTEX(intf_mutex);
static DEFINE_IDA(mlx4_adev_ida);
+static bool is_eth_supported(struct mlx4_dev *dev)
+{
+ for (int port = 1; port <= dev->caps.num_ports; port++)
+ if (dev->caps.port_type[port] == MLX4_PORT_TYPE_ETH)
+ return true;
+
+ return false;
+}
+
static const struct mlx4_adev_device {
const char *suffix;
bool (*is_supported)(struct mlx4_dev *dev);
-} mlx4_adev_devices[1] = {};
+} mlx4_adev_devices[] = {
+ { "eth", is_eth_supported },
+};
int mlx4_adev_init(struct mlx4_dev *dev)
{
--
2.35.3
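
The probe/remove conversion in the patch above follows a common shape: recover the wrapper from the embedded bus device with container_of(), propagate real error codes through the goto unwinding labels instead of returning NULL, and stash the per-driver state on the device for remove() to pick up. A minimal userspace sketch of that shape, not the driver code itself: every sketch_* name is hypothetical, the drvdata field stands in for auxiliary_set_drvdata()/auxiliary_get_drvdata(), and sketch_pd_alloc() is an invented resource step that only exists so that one error path can be exercised.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Userspace stand-in for the kernel's container_of(). */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical stand-in for struct auxiliary_device. */
struct sketch_aux_device {
	void *drvdata;	/* plays the role of the driver data pointer */
};

/* Hypothetical stand-in for struct mlx4_dev. */
struct sketch_core_dev {
	int free_pds;	/* invented resource counter, for illustration only */
};

/* Same role as struct mlx4_adev: a wrapper embedding the bus device. */
struct sketch_adev {
	struct sketch_aux_device adev;
	struct sketch_core_dev *mdev;
};

/* Per-driver state, the analogue of struct mlx4_en_dev. */
struct sketch_en_dev {
	struct sketch_core_dev *dev;
};

/* Invented allocation step that can fail with a real error code. */
static int sketch_pd_alloc(struct sketch_core_dev *dev)
{
	if (dev->free_pds == 0)
		return -ENOSPC;
	dev->free_pds--;
	return 0;
}

/* Returns 0 or a negative errno; the old add() could only return NULL. */
static int sketch_probe(struct sketch_aux_device *adev)
{
	struct sketch_adev *madev;
	struct sketch_core_dev *dev;
	struct sketch_en_dev *endev;
	int err;

	assert(adev != NULL);
	madev = container_of(adev, struct sketch_adev, adev);
	dev = madev->mdev;

	endev = calloc(1, sizeof(*endev));
	if (!endev) {
		err = -ENOMEM;
		goto err_out;
	}

	err = sketch_pd_alloc(dev);
	if (err)
		goto err_free_dev;

	endev->dev = dev;
	adev->drvdata = endev;	/* replaces "return mdev" of the old add() */
	return 0;

err_free_dev:
	free(endev);
err_out:
	return err;
}

/* Fetches the state saved by probe and releases what probe acquired. */
static void sketch_remove(struct sketch_aux_device *adev)
{
	struct sketch_adev *madev =
		container_of(adev, struct sketch_adev, adev);
	struct sketch_en_dev *endev = adev->drvdata;

	madev->mdev->free_pds++;
	free(endev);
	adev->drvdata = NULL;
}
```

With one free resource, the first sketch_probe() call succeeds and a second call on another wrapper returns -ENOSPC without leaving driver data behind, which is the behaviour the err = ...; goto pattern in the patch is meant to preserve.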
* [PATCH net-next 09/10] mlx4: Connect the infiniband part to the auxiliary bus
2023-08-04 15:05 [PATCH net-next 00/10] Convert mlx4 to use auxiliary bus Petr Pavlu
` (7 preceding siblings ...)
2023-08-04 15:05 ` [PATCH net-next 08/10] mlx4: Connect the ethernet part to the auxiliary bus Petr Pavlu
@ 2023-08-04 15:05 ` Petr Pavlu
2023-08-08 18:58 ` Leon Romanovsky
2023-08-04 15:05 ` [PATCH net-next 10/10] mlx4: Delete custom device management logic Petr Pavlu
` (2 subsequent siblings)
11 siblings, 1 reply; 28+ messages in thread
From: Petr Pavlu @ 2023-08-04 15:05 UTC (permalink / raw)
To: tariqt, yishaih, leon
Cc: davem, edumazet, kuba, pabeni, jgg, netdev, linux-rdma,
linux-kernel, Petr Pavlu
Use the auxiliary bus to perform device management of the infiniband
part of the mlx4 driver.
Signed-off-by: Petr Pavlu <petr.pavlu@suse.com>
Tested-by: Leon Romanovsky <leon@kernel.org>
---
drivers/infiniband/hw/mlx4/main.c | 77 ++++++++++++++++-------
drivers/net/ethernet/mellanox/mlx4/intf.c | 13 ++++
2 files changed, 67 insertions(+), 23 deletions(-)
diff --git a/drivers/infiniband/hw/mlx4/main.c b/drivers/infiniband/hw/mlx4/main.c
index 458b4b11dffa..1ca97c893bd8 100644
--- a/drivers/infiniband/hw/mlx4/main.c
+++ b/drivers/infiniband/hw/mlx4/main.c
@@ -2609,8 +2609,11 @@ static const struct ib_device_ops mlx4_ib_dev_fs_ops = {
.destroy_flow = mlx4_ib_destroy_flow,
};
-static void *mlx4_ib_add(struct mlx4_dev *dev)
+static int mlx4_ib_probe(struct auxiliary_device *adev,
+ const struct auxiliary_device_id *id)
{
+ struct mlx4_adev *madev = container_of(adev, struct mlx4_adev, adev);
+ struct mlx4_dev *dev = madev->mdev;
struct mlx4_ib_dev *ibdev;
int num_ports = 0;
int i, j;
@@ -2630,27 +2633,31 @@ static void *mlx4_ib_add(struct mlx4_dev *dev)
/* No point in registering a device with no ports... */
if (num_ports == 0)
- return NULL;
+ return -ENODEV;
ibdev = ib_alloc_device(mlx4_ib_dev, ib_dev);
if (!ibdev) {
dev_err(&dev->persist->pdev->dev,
"Device struct alloc failed\n");
- return NULL;
+ return -ENOMEM;
}
iboe = &ibdev->iboe;
- if (mlx4_pd_alloc(dev, &ibdev->priv_pdn))
+ err = mlx4_pd_alloc(dev, &ibdev->priv_pdn);
+ if (err)
goto err_dealloc;
- if (mlx4_uar_alloc(dev, &ibdev->priv_uar))
+ err = mlx4_uar_alloc(dev, &ibdev->priv_uar);
+ if (err)
goto err_pd;
ibdev->uar_map = ioremap((phys_addr_t) ibdev->priv_uar.pfn << PAGE_SHIFT,
PAGE_SIZE);
- if (!ibdev->uar_map)
+ if (!ibdev->uar_map) {
+ err = -ENOMEM;
goto err_uar;
+ }
MLX4_INIT_DOORBELL_LOCK(&ibdev->uar_lock);
ibdev->dev = dev;
@@ -2694,7 +2701,8 @@ static void *mlx4_ib_add(struct mlx4_dev *dev)
spin_lock_init(&iboe->lock);
- if (init_node_data(ibdev))
+ err = init_node_data(ibdev);
+ if (err)
goto err_map;
mlx4_init_sl2vl_tbl(ibdev);
@@ -2726,6 +2734,7 @@ static void *mlx4_ib_add(struct mlx4_dev *dev)
new_counter_index = kmalloc(sizeof(*new_counter_index),
GFP_KERNEL);
if (!new_counter_index) {
+ err = -ENOMEM;
if (allocated)
mlx4_counter_free(ibdev->dev, counter_index);
goto err_counter;
@@ -2743,8 +2752,10 @@ static void *mlx4_ib_add(struct mlx4_dev *dev)
new_counter_index =
kmalloc(sizeof(struct counter_index),
GFP_KERNEL);
- if (!new_counter_index)
+ if (!new_counter_index) {
+ err = -ENOMEM;
goto err_counter;
+ }
new_counter_index->index = counter_index;
new_counter_index->allocated = 0;
list_add_tail(&new_counter_index->list,
@@ -2773,8 +2784,10 @@ static void *mlx4_ib_add(struct mlx4_dev *dev)
ibdev->ib_uc_qpns_bitmap = bitmap_alloc(ibdev->steer_qpn_count,
GFP_KERNEL);
- if (!ibdev->ib_uc_qpns_bitmap)
+ if (!ibdev->ib_uc_qpns_bitmap) {
+ err = -ENOMEM;
goto err_steer_qp_release;
+ }
if (dev->caps.flags2 & MLX4_DEV_CAP_FLAG2_DMFS_IPOIB) {
bitmap_zero(ibdev->ib_uc_qpns_bitmap,
@@ -2794,17 +2807,21 @@ static void *mlx4_ib_add(struct mlx4_dev *dev)
for (j = 1; j <= ibdev->dev->caps.num_ports; j++)
atomic64_set(&iboe->mac[j - 1], ibdev->dev->caps.def_mac[j]);
- if (mlx4_ib_alloc_diag_counters(ibdev))
+ err = mlx4_ib_alloc_diag_counters(ibdev);
+ if (err)
goto err_steer_free_bitmap;
- if (ib_register_device(&ibdev->ib_dev, "mlx4_%d",
- &dev->persist->pdev->dev))
+ err = ib_register_device(&ibdev->ib_dev, "mlx4_%d",
+ &dev->persist->pdev->dev);
+ if (err)
goto err_diag_counters;
- if (mlx4_ib_mad_init(ibdev))
+ err = mlx4_ib_mad_init(ibdev);
+ if (err)
goto err_reg;
- if (mlx4_ib_init_sriov(ibdev))
+ err = mlx4_ib_init_sriov(ibdev);
+ if (err)
goto err_mad;
if (!iboe->nb.notifier_call) {
@@ -2844,7 +2861,8 @@ static void *mlx4_ib_add(struct mlx4_dev *dev)
err = mlx4_register_event_notifier(dev, &ibdev->mlx_nb);
WARN(err, "failed to register mlx4 event notifier (%d)", err);
- return ibdev;
+ auxiliary_set_drvdata(adev, ibdev);
+ return 0;
err_notif:
if (ibdev->iboe.nb.notifier_call) {
@@ -2888,7 +2906,7 @@ static void *mlx4_ib_add(struct mlx4_dev *dev)
err_dealloc:
ib_dealloc_device(&ibdev->ib_dev);
- return NULL;
+ return err;
}
int mlx4_ib_steer_qp_alloc(struct mlx4_ib_dev *dev, int count, int *qpn)
@@ -2955,9 +2973,11 @@ int mlx4_ib_steer_qp_reg(struct mlx4_ib_dev *mdev, struct mlx4_ib_qp *mqp,
return err;
}
-static void mlx4_ib_remove(struct mlx4_dev *dev, void *ibdev_ptr)
+static void mlx4_ib_remove(struct auxiliary_device *adev)
{
- struct mlx4_ib_dev *ibdev = ibdev_ptr;
+ struct mlx4_adev *madev = container_of(adev, struct mlx4_adev, adev);
+ struct mlx4_dev *dev = madev->mdev;
+ struct mlx4_ib_dev *ibdev = auxiliary_get_drvdata(adev);
int p;
int i;
@@ -3298,9 +3318,20 @@ static int mlx4_ib_event(struct notifier_block *this,
return NOTIFY_DONE;
}
-static struct mlx4_interface mlx4_ib_interface = {
- .add = mlx4_ib_add,
- .remove = mlx4_ib_remove,
+static const struct auxiliary_device_id mlx4_ib_id_table[] = {
+ { .name = MLX4_ADEV_NAME ".ib" },
+ {},
+};
+
+MODULE_DEVICE_TABLE(auxiliary, mlx4_ib_id_table);
+
+static struct mlx4_adrv mlx4_ib_adrv = {
+ .adrv = {
+ .name = "ib",
+ .probe = mlx4_ib_probe,
+ .remove = mlx4_ib_remove,
+ .id_table = mlx4_ib_id_table,
+ },
.protocol = MLX4_PROT_IB_IPV6,
.flags = MLX4_INTFF_BONDING
};
@@ -3325,7 +3356,7 @@ static int __init mlx4_ib_init(void)
if (err)
goto clean_cm;
- err = mlx4_register_interface(&mlx4_ib_interface);
+ err = mlx4_register_auxiliary_driver(&mlx4_ib_adrv);
if (err)
goto clean_mcg;
@@ -3347,7 +3378,7 @@ static int __init mlx4_ib_init(void)
static void __exit mlx4_ib_cleanup(void)
{
- mlx4_unregister_interface(&mlx4_ib_interface);
+ mlx4_unregister_auxiliary_driver(&mlx4_ib_adrv);
mlx4_ib_mcg_destroy();
mlx4_ib_cm_destroy();
mlx4_ib_qp_event_cleanup();
diff --git a/drivers/net/ethernet/mellanox/mlx4/intf.c b/drivers/net/ethernet/mellanox/mlx4/intf.c
index 0a27820ece2e..16b2c99ff737 100644
--- a/drivers/net/ethernet/mellanox/mlx4/intf.c
+++ b/drivers/net/ethernet/mellanox/mlx4/intf.c
@@ -59,11 +59,24 @@ static bool is_eth_supported(struct mlx4_dev *dev)
return false;
}
+static bool is_ib_supported(struct mlx4_dev *dev)
+{
+ for (int port = 1; port <= dev->caps.num_ports; port++)
+ if (dev->caps.port_type[port] == MLX4_PORT_TYPE_IB)
+ return true;
+
+ if (dev->caps.flags & MLX4_DEV_CAP_FLAG_IBOE)
+ return true;
+
+ return false;
+}
+
static const struct mlx4_adev_device {
const char *suffix;
bool (*is_supported)(struct mlx4_dev *dev);
} mlx4_adev_devices[] = {
{ "eth", is_eth_supported },
+ { "ib", is_ib_supported },
};
int mlx4_adev_init(struct mlx4_dev *dev)
--
2.35.3
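
The mlx4_adev_devices[] table extended in the patch above pairs a device-name suffix with a predicate that decides whether the hardware supports that function. A minimal userspace sketch of the table-driven idea, under stated assumptions: struct sketch_dev is a much-reduced, hypothetical stand-in for the mlx4 capability fields, the iboe flag stands in for MLX4_DEV_CAP_FLAG_IBOE, and sketch_supported() is an invented consumer, since the way the core walks the table is not part of this patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_MAX_PORTS 2

enum sketch_port_type { SKETCH_PORT_NONE, SKETCH_PORT_IB, SKETCH_PORT_ETH };

/* Hypothetical, reduced stand-in for the capabilities in struct mlx4_dev. */
struct sketch_dev {
	int num_ports;
	enum sketch_port_type port_type[SKETCH_MAX_PORTS + 1]; /* 1-based */
	bool iboe;	/* stands in for MLX4_DEV_CAP_FLAG_IBOE */
};

/* True if any port of the device has the given type. */
static bool sketch_has_port(const struct sketch_dev *dev,
			    enum sketch_port_type type)
{
	for (int port = 1; port <= dev->num_ports; port++)
		if (dev->port_type[port] == type)
			return true;

	return false;
}

static bool sketch_is_eth_supported(const struct sketch_dev *dev)
{
	return sketch_has_port(dev, SKETCH_PORT_ETH);
}

/* IB is offered for native IB ports or when the device can do IBoE. */
static bool sketch_is_ib_supported(const struct sketch_dev *dev)
{
	return sketch_has_port(dev, SKETCH_PORT_IB) || dev->iboe;
}

/* One row per auxiliary function: name suffix plus support predicate. */
static const struct sketch_adev_device {
	const char *suffix;
	bool (*is_supported)(const struct sketch_dev *dev);
} sketch_adev_devices[] = {
	{ "eth", sketch_is_eth_supported },
	{ "ib", sketch_is_ib_supported },
};

/* Invented consumer: collect the suffixes whose predicate holds. */
static size_t sketch_supported(const struct sketch_dev *dev,
			       const char *names[], size_t max)
{
	size_t count = 0;
	size_t rows = sizeof(sketch_adev_devices) /
		      sizeof(sketch_adev_devices[0]);

	assert(names != NULL);
	for (size_t i = 0; i < rows; i++)
		if (count < max && sketch_adev_devices[i].is_supported(dev))
			names[count++] = sketch_adev_devices[i].suffix;

	return count;
}
```

Adding a further function then means adding one predicate and one table row, without touching the loop that consumes the table.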
* [PATCH net-next 10/10] mlx4: Delete custom device management logic
2023-08-04 15:05 [PATCH net-next 00/10] Convert mlx4 to use auxiliary bus Petr Pavlu
` (8 preceding siblings ...)
2023-08-04 15:05 ` [PATCH net-next 09/10] mlx4: Connect the infiniband " Petr Pavlu
@ 2023-08-04 15:05 ` Petr Pavlu
2023-08-08 18:58 ` Leon Romanovsky
2023-08-04 16:49 ` [PATCH net-next 00/10] Convert mlx4 to use auxiliary bus Jason Gunthorpe
2023-08-09 11:12 ` Tariq Toukan
11 siblings, 1 reply; 28+ messages in thread
From: Petr Pavlu @ 2023-08-04 15:05 UTC (permalink / raw)
To: tariqt, yishaih, leon
Cc: davem, edumazet, kuba, pabeni, jgg, netdev, linux-rdma,
linux-kernel, Petr Pavlu
After the conversion to use the auxiliary bus, the custom device
management is not needed anymore and can be deleted.
Signed-off-by: Petr Pavlu <petr.pavlu@suse.com>
Tested-by: Leon Romanovsky <leon@kernel.org>
---
drivers/net/ethernet/mellanox/mlx4/intf.c | 125 ----------------------
drivers/net/ethernet/mellanox/mlx4/main.c | 28 -----
drivers/net/ethernet/mellanox/mlx4/mlx4.h | 3 -
include/linux/mlx4/driver.h | 10 --
4 files changed, 166 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx4/intf.c b/drivers/net/ethernet/mellanox/mlx4/intf.c
index 16b2c99ff737..c7697ee0dd05 100644
--- a/drivers/net/ethernet/mellanox/mlx4/intf.c
+++ b/drivers/net/ethernet/mellanox/mlx4/intf.c
@@ -38,15 +38,6 @@
#include "mlx4.h"
-struct mlx4_device_context {
- struct list_head list;
- struct list_head bond_list;
- struct mlx4_interface *intf;
- void *context;
-};
-
-static LIST_HEAD(intf_list);
-static LIST_HEAD(dev_list);
static DEFINE_MUTEX(intf_mutex);
static DEFINE_IDA(mlx4_adev_ida);
@@ -156,77 +147,6 @@ static void del_adev(struct auxiliary_device *adev)
auxiliary_device_uninit(adev);
}
-static void mlx4_add_device(struct mlx4_interface *intf, struct mlx4_priv *priv)
-{
- struct mlx4_device_context *dev_ctx;
-
- dev_ctx = kmalloc(sizeof(*dev_ctx), GFP_KERNEL);
- if (!dev_ctx)
- return;
-
- dev_ctx->intf = intf;
- dev_ctx->context = intf->add(&priv->dev);
-
- if (dev_ctx->context) {
- spin_lock_irq(&priv->ctx_lock);
- list_add_tail(&dev_ctx->list, &priv->ctx_list);
- spin_unlock_irq(&priv->ctx_lock);
- } else
- kfree(dev_ctx);
-
-}
-
-static void mlx4_remove_device(struct mlx4_interface *intf, struct mlx4_priv *priv)
-{
- struct mlx4_device_context *dev_ctx;
-
- list_for_each_entry(dev_ctx, &priv->ctx_list, list)
- if (dev_ctx->intf == intf) {
- spin_lock_irq(&priv->ctx_lock);
- list_del(&dev_ctx->list);
- spin_unlock_irq(&priv->ctx_lock);
-
- intf->remove(&priv->dev, dev_ctx->context);
- kfree(dev_ctx);
- return;
- }
-}
-
-int mlx4_register_interface(struct mlx4_interface *intf)
-{
- struct mlx4_priv *priv;
-
- if (!intf->add || !intf->remove)
- return -EINVAL;
-
- mutex_lock(&intf_mutex);
-
- list_add_tail(&intf->list, &intf_list);
- list_for_each_entry(priv, &dev_list, dev_list) {
- mlx4_add_device(intf, priv);
- }
-
- mutex_unlock(&intf_mutex);
-
- return 0;
-}
-EXPORT_SYMBOL_GPL(mlx4_register_interface);
-
-void mlx4_unregister_interface(struct mlx4_interface *intf)
-{
- struct mlx4_priv *priv;
-
- mutex_lock(&intf_mutex);
-
- list_for_each_entry(priv, &dev_list, dev_list)
- mlx4_remove_device(intf, priv);
-
- list_del(&intf->list);
-
- mutex_unlock(&intf_mutex);
-}
-EXPORT_SYMBOL_GPL(mlx4_unregister_interface);
-
int mlx4_register_auxiliary_driver(struct mlx4_adrv *madrv)
{
return auxiliary_driver_register(&madrv->adrv);
@@ -242,10 +162,7 @@ EXPORT_SYMBOL_GPL(mlx4_unregister_auxiliary_driver);
int mlx4_do_bond(struct mlx4_dev *dev, bool enable)
{
struct mlx4_priv *priv = mlx4_priv(dev);
- struct mlx4_device_context *dev_ctx = NULL, *temp_dev_ctx;
- unsigned long flags;
int i, ret;
- LIST_HEAD(bond_list);
if (!(dev->caps.flags2 & MLX4_DEV_CAP_FLAG2_PORT_REMAP))
return -EOPNOTSUPP;
@@ -267,36 +184,6 @@ int mlx4_do_bond(struct mlx4_dev *dev, bool enable)
dev->flags &= ~MLX4_FLAG_BONDED;
}
- spin_lock_irqsave(&priv->ctx_lock, flags);
- list_for_each_entry_safe(dev_ctx, temp_dev_ctx, &priv->ctx_list, list) {
- if (!(dev_ctx->intf->flags & MLX4_INTFF_BONDING))
- continue;
-
- if (mlx4_is_mfunc(dev)) {
- mlx4_dbg(dev,
- "SRIOV, disabled HA mode for intf proto %d\n",
- dev_ctx->intf->protocol);
- continue;
- }
-
- list_add_tail(&dev_ctx->bond_list, &bond_list);
- list_del(&dev_ctx->list);
- }
- spin_unlock_irqrestore(&priv->ctx_lock, flags);
-
- list_for_each_entry(dev_ctx, &bond_list, bond_list) {
- dev_ctx->intf->remove(dev, dev_ctx->context);
- dev_ctx->context = dev_ctx->intf->add(dev);
-
- spin_lock_irqsave(&priv->ctx_lock, flags);
- list_add_tail(&dev_ctx->list, &priv->ctx_list);
- spin_unlock_irqrestore(&priv->ctx_lock, flags);
-
- mlx4_dbg(dev, "Interface for protocol %d restarted with bonded mode %s\n",
- dev_ctx->intf->protocol, enable ?
- "enabled" : "disabled");
- }
-
mutex_lock(&intf_mutex);
for (i = 0; i < ARRAY_SIZE(mlx4_adev_devices); i++) {
@@ -447,16 +334,11 @@ static int rescan_drivers_locked(struct mlx4_dev *dev)
int mlx4_register_device(struct mlx4_dev *dev)
{
- struct mlx4_priv *priv = mlx4_priv(dev);
- struct mlx4_interface *intf;
int ret;
mutex_lock(&intf_mutex);
dev->persist->interface_state |= MLX4_INTERFACE_STATE_UP;
- list_add_tail(&priv->dev_list, &dev_list);
- list_for_each_entry(intf, &intf_list, list)
- mlx4_add_device(intf, priv);
ret = rescan_drivers_locked(dev);
@@ -474,9 +356,6 @@ int mlx4_register_device(struct mlx4_dev *dev)
void mlx4_unregister_device(struct mlx4_dev *dev)
{
- struct mlx4_priv *priv = mlx4_priv(dev);
- struct mlx4_interface *intf;
-
if (!(dev->persist->interface_state & MLX4_INTERFACE_STATE_UP))
return;
@@ -495,10 +374,6 @@ void mlx4_unregister_device(struct mlx4_dev *dev)
}
mutex_lock(&intf_mutex);
- list_for_each_entry(intf, &intf_list, list)
- mlx4_remove_device(intf, priv);
-
- list_del(&priv->dev_list);
dev->persist->interface_state &= ~MLX4_INTERFACE_STATE_UP;
rescan_drivers_locked(dev);
diff --git a/drivers/net/ethernet/mellanox/mlx4/main.c b/drivers/net/ethernet/mellanox/mlx4/main.c
index c4ec7377aa71..2581226836b5 100644
--- a/drivers/net/ethernet/mellanox/mlx4/main.c
+++ b/drivers/net/ethernet/mellanox/mlx4/main.c
@@ -42,7 +42,6 @@
#include <linux/slab.h>
#include <linux/io-mapping.h>
#include <linux/delay.h>
-#include <linux/kmod.h>
#include <linux/etherdevice.h>
#include <net/devlink.h>
@@ -1091,27 +1090,6 @@ static int mlx4_slave_cap(struct mlx4_dev *dev)
return err;
}
-static void mlx4_request_modules(struct mlx4_dev *dev)
-{
- int port;
- int has_ib_port = false;
- int has_eth_port = false;
-#define EN_DRV_NAME "mlx4_en"
-#define IB_DRV_NAME "mlx4_ib"
-
- for (port = 1; port <= dev->caps.num_ports; port++) {
- if (dev->caps.port_type[port] == MLX4_PORT_TYPE_IB)
- has_ib_port = true;
- else if (dev->caps.port_type[port] == MLX4_PORT_TYPE_ETH)
- has_eth_port = true;
- }
-
- if (has_eth_port)
- request_module_nowait(EN_DRV_NAME);
- if (has_ib_port || (dev->caps.flags & MLX4_DEV_CAP_FLAG_IBOE))
- request_module_nowait(IB_DRV_NAME);
-}
-
/*
* Change the port configuration of the device.
* Every user of this function must hold the port mutex.
@@ -1147,7 +1125,6 @@ int mlx4_change_port_types(struct mlx4_dev *dev,
mlx4_err(dev, "Failed to register device\n");
goto out;
}
- mlx4_request_modules(dev);
}
out:
@@ -3426,9 +3403,6 @@ static int mlx4_load_one(struct pci_dev *pdev, int pci_dev_data,
devl_assert_locked(devlink);
dev = &priv->dev;
- INIT_LIST_HEAD(&priv->ctx_list);
- spin_lock_init(&priv->ctx_lock);
-
err = mlx4_adev_init(dev);
if (err)
return err;
@@ -3732,8 +3706,6 @@ static int mlx4_load_one(struct pci_dev *pdev, int pci_dev_data,
if (err)
goto err_port;
- mlx4_request_modules(dev);
-
mlx4_sense_init(dev);
mlx4_start_sense(dev);
diff --git a/drivers/net/ethernet/mellanox/mlx4/mlx4.h b/drivers/net/ethernet/mellanox/mlx4/mlx4.h
index d5050bfb342f..d707b790536f 100644
--- a/drivers/net/ethernet/mellanox/mlx4/mlx4.h
+++ b/drivers/net/ethernet/mellanox/mlx4/mlx4.h
@@ -882,9 +882,6 @@ enum {
struct mlx4_priv {
struct mlx4_dev dev;
- struct list_head dev_list;
- struct list_head ctx_list;
- spinlock_t ctx_lock;
struct mlx4_adev **adev;
int adev_idx;
struct atomic_notifier_head event_nh;
diff --git a/include/linux/mlx4/driver.h b/include/linux/mlx4/driver.h
index 9cf157d381c6..69825223081f 100644
--- a/include/linux/mlx4/driver.h
+++ b/include/linux/mlx4/driver.h
@@ -58,22 +58,12 @@ enum {
MLX4_INTFF_BONDING = 1 << 0
};
-struct mlx4_interface {
- void * (*add) (struct mlx4_dev *dev);
- void (*remove)(struct mlx4_dev *dev, void *context);
- struct list_head list;
- enum mlx4_protocol protocol;
- int flags;
-};
-
struct mlx4_adrv {
struct auxiliary_driver adrv;
enum mlx4_protocol protocol;
int flags;
};
-int mlx4_register_interface(struct mlx4_interface *intf);
-void mlx4_unregister_interface(struct mlx4_interface *intf);
int mlx4_register_auxiliary_driver(struct mlx4_adrv *madrv);
void mlx4_unregister_auxiliary_driver(struct mlx4_adrv *madrv);
--
2.35.3
* Re: [PATCH net-next 00/10] Convert mlx4 to use auxiliary bus
2023-08-04 15:05 [PATCH net-next 00/10] Convert mlx4 to use auxiliary bus Petr Pavlu
` (9 preceding siblings ...)
2023-08-04 15:05 ` [PATCH net-next 10/10] mlx4: Delete custom device management logic Petr Pavlu
@ 2023-08-04 16:49 ` Jason Gunthorpe
2023-08-09 11:12 ` Tariq Toukan
11 siblings, 0 replies; 28+ messages in thread
From: Jason Gunthorpe @ 2023-08-04 16:49 UTC (permalink / raw)
To: Petr Pavlu
Cc: tariqt, yishaih, leon, davem, edumazet, kuba, pabeni, netdev,
linux-rdma, linux-kernel
On Fri, Aug 04, 2023 at 05:05:17PM +0200, Petr Pavlu wrote:
> This series converts the mlx4 drivers to use auxiliary bus, similarly to
> how mlx5 was converted [1]. The first 6 patches are preparatory changes,
> the remaining 4 are the final conversion.
>
> Initial motivation for this change was to address a problem related to
> loading mlx4_en/mlx4_ib by mlx4_core using request_module_nowait(). When
> doing such a load in initrd, the operation is asynchronous to any init
> control and can get unexpectedly affected/interrupted by an eventual
> root switch. Using an auxiliary bus leaves these module loads to udevd
> which better integrates with systemd processing. [2]
Neat, I didn't realize that was a pain point for distros.
Jason
* Re: [PATCH net-next 03/10] mlx4: Replace the mlx4_interface.event callback with a notifier
2023-08-04 15:05 ` [PATCH net-next 03/10] mlx4: Replace the mlx4_interface.event callback with a notifier Petr Pavlu
@ 2023-08-05 14:29 ` Zhu Yanjun
2023-08-08 12:13 ` Petr Pavlu
2023-08-07 13:58 ` Simon Horman
1 sibling, 1 reply; 28+ messages in thread
From: Zhu Yanjun @ 2023-08-05 14:29 UTC (permalink / raw)
To: Petr Pavlu, tariqt, yishaih, leon
Cc: davem, edumazet, kuba, pabeni, jgg, netdev, linux-rdma,
linux-kernel
On 2023/8/4 23:05, Petr Pavlu wrote:
> Use a notifier to implement mlx4_dispatch_event() in preparation to
> switch mlx4_en and mlx4_ib to be an auxiliary device.
>
> A problem is that if the mlx4_interface.event callback was replaced with
> something as mlx4_adrv.event then the implementation of
> mlx4_dispatch_event() would need to acquire a lock on a given device
> before executing this callback. That is necessary because otherwise
> there is no guarantee that the associated driver cannot get unbound when
> the callback is running. However, taking this lock is not possible
> because mlx4_dispatch_event() can be invoked from the hardirq context.
> Using an atomic notifier allows the driver to accurately record when it
> wants to receive these events and solves this problem.
>
> A handler registration is done by both mlx4_en and mlx4_ib at the end of
> their mlx4_interface.add callback. This matches the current situation
> when mlx4_add_device() would enable events for a given device
> immediately after this callback, by adding the device on the
> mlx4_priv.list.
>
> Signed-off-by: Petr Pavlu <petr.pavlu@suse.com>
> Tested-by: Leon Romanovsky <leon@kernel.org>
> ---
> drivers/infiniband/hw/mlx4/main.c | 41 +++++++++++++-------
> drivers/infiniband/hw/mlx4/mlx4_ib.h | 2 +
> drivers/net/ethernet/mellanox/mlx4/en_main.c | 25 ++++++++----
> drivers/net/ethernet/mellanox/mlx4/intf.c | 24 ++++++++----
> drivers/net/ethernet/mellanox/mlx4/main.c | 2 +
> drivers/net/ethernet/mellanox/mlx4/mlx4.h | 2 +
> drivers/net/ethernet/mellanox/mlx4/mlx4_en.h | 2 +
> include/linux/mlx4/driver.h | 8 +++-
> 8 files changed, 76 insertions(+), 30 deletions(-)
>
> diff --git a/drivers/infiniband/hw/mlx4/main.c b/drivers/infiniband/hw/mlx4/main.c
> index 7dd70d778b6b..458b4b11dffa 100644
> --- a/drivers/infiniband/hw/mlx4/main.c
> +++ b/drivers/infiniband/hw/mlx4/main.c
> @@ -82,6 +82,8 @@ static const char mlx4_ib_version[] =
> static void do_slave_init(struct mlx4_ib_dev *ibdev, int slave, int do_init);
> static enum rdma_link_layer mlx4_ib_port_link_layer(struct ib_device *device,
> u32 port_num);
> +static int mlx4_ib_event(struct notifier_block *this, unsigned long event,
> + void *ptr);
>
> static struct workqueue_struct *wq;
>
> @@ -2836,6 +2838,12 @@ static void *mlx4_ib_add(struct mlx4_dev *dev)
> do_slave_init(ibdev, j, 1);
> }
> }
> +
> + /* register mlx4 core notifier */
> + ibdev->mlx_nb.notifier_call = mlx4_ib_event;
> + err = mlx4_register_event_notifier(dev, &ibdev->mlx_nb);
> + WARN(err, "failed to register mlx4 event notifier (%d)", err);
> +
> return ibdev;
>
> err_notif:
> @@ -2953,6 +2961,8 @@ static void mlx4_ib_remove(struct mlx4_dev *dev, void *ibdev_ptr)
> int p;
> int i;
>
> + mlx4_unregister_event_notifier(dev, &ibdev->mlx_nb);
> +
> mlx4_foreach_port(i, dev, MLX4_PORT_TYPE_IB)
> devlink_port_type_clear(mlx4_get_devlink_port(dev, i));
> ibdev->ib_active = false;
> @@ -3173,11 +3183,14 @@ void mlx4_sched_ib_sl2vl_update_work(struct mlx4_ib_dev *ibdev,
> }
> }
>
> -static void mlx4_ib_event(struct mlx4_dev *dev, void *ibdev_ptr,
> - enum mlx4_dev_event event, unsigned long param)
> +static int mlx4_ib_event(struct notifier_block *this,
> + unsigned long event /*mlx4_dev_event*/, void *ptr)
Should the /*mlx4_dev_event*/ comment be removed?
Zhu Yanjun
> {
> + struct mlx4_ib_dev *ibdev =
> + container_of(this, struct mlx4_ib_dev, mlx_nb);
> + struct mlx4_dev *dev = ibdev->dev;
> + unsigned long param = *(unsigned long *)ptr;
> struct ib_event ibev;
> - struct mlx4_ib_dev *ibdev = to_mdev((struct ib_device *) ibdev_ptr);
> struct mlx4_eqe *eqe = NULL;
> struct ib_event_work *ew;
> int p = 0;
> @@ -3187,11 +3200,11 @@ static void mlx4_ib_event(struct mlx4_dev *dev, void *ibdev_ptr,
> (event == MLX4_DEV_EVENT_PORT_DOWN))) {
> ew = kmalloc(sizeof(*ew), GFP_ATOMIC);
> if (!ew)
> - return;
> + return NOTIFY_DONE;
> INIT_WORK(&ew->work, handle_bonded_port_state_event);
> ew->ib_dev = ibdev;
> queue_work(wq, &ew->work);
> - return;
> + return NOTIFY_DONE;
> }
>
> if (event == MLX4_DEV_EVENT_PORT_MGMT_CHANGE)
> @@ -3202,7 +3215,7 @@ static void mlx4_ib_event(struct mlx4_dev *dev, void *ibdev_ptr,
> switch (event) {
> case MLX4_DEV_EVENT_PORT_UP:
> if (p > ibdev->num_ports)
> - return;
> + return NOTIFY_DONE;
> if (!mlx4_is_slave(dev) &&
> rdma_port_get_link_layer(&ibdev->ib_dev, p) ==
> IB_LINK_LAYER_INFINIBAND) {
> @@ -3217,7 +3230,7 @@ static void mlx4_ib_event(struct mlx4_dev *dev, void *ibdev_ptr,
>
> case MLX4_DEV_EVENT_PORT_DOWN:
> if (p > ibdev->num_ports)
> - return;
> + return NOTIFY_DONE;
> ibev.event = IB_EVENT_PORT_ERR;
> break;
>
> @@ -3230,7 +3243,7 @@ static void mlx4_ib_event(struct mlx4_dev *dev, void *ibdev_ptr,
> case MLX4_DEV_EVENT_PORT_MGMT_CHANGE:
> ew = kmalloc(sizeof *ew, GFP_ATOMIC);
> if (!ew)
> - return;
> + return NOTIFY_DONE;
>
> INIT_WORK(&ew->work, handle_port_mgmt_change_event);
> memcpy(&ew->ib_eqe, eqe, sizeof *eqe);
> @@ -3240,7 +3253,7 @@ static void mlx4_ib_event(struct mlx4_dev *dev, void *ibdev_ptr,
> queue_work(wq, &ew->work);
> else
> handle_port_mgmt_change_event(&ew->work);
> - return;
> + return NOTIFY_DONE;
>
> case MLX4_DEV_EVENT_SLAVE_INIT:
> /* here, p is the slave id */
> @@ -3256,7 +3269,7 @@ static void mlx4_ib_event(struct mlx4_dev *dev, void *ibdev_ptr,
> 1);
> }
> }
> - return;
> + return NOTIFY_DONE;
>
> case MLX4_DEV_EVENT_SLAVE_SHUTDOWN:
> if (mlx4_is_master(dev)) {
> @@ -3272,22 +3285,22 @@ static void mlx4_ib_event(struct mlx4_dev *dev, void *ibdev_ptr,
> }
> /* here, p is the slave id */
> do_slave_init(ibdev, p, 0);
> - return;
> + return NOTIFY_DONE;
>
> default:
> - return;
> + return NOTIFY_DONE;
> }
>
> - ibev.device = ibdev_ptr;
> + ibev.device = &ibdev->ib_dev;
> ibev.element.port_num = mlx4_is_bonded(ibdev->dev) ? 1 : (u8)p;
>
> ib_dispatch_event(&ibev);
> + return NOTIFY_DONE;
> }
>
> static struct mlx4_interface mlx4_ib_interface = {
> .add = mlx4_ib_add,
> .remove = mlx4_ib_remove,
> - .event = mlx4_ib_event,
> .protocol = MLX4_PROT_IB_IPV6,
> .flags = MLX4_INTFF_BONDING
> };
> diff --git a/drivers/infiniband/hw/mlx4/mlx4_ib.h b/drivers/infiniband/hw/mlx4/mlx4_ib.h
> index 17fee1e73a45..41ca1114a995 100644
> --- a/drivers/infiniband/hw/mlx4/mlx4_ib.h
> +++ b/drivers/infiniband/hw/mlx4/mlx4_ib.h
> @@ -38,6 +38,7 @@
> #include <linux/list.h>
> #include <linux/mutex.h>
> #include <linux/idr.h>
> +#include <linux/notifier.h>
>
> #include <rdma/ib_verbs.h>
> #include <rdma/ib_umem.h>
> @@ -644,6 +645,7 @@ struct mlx4_ib_dev {
> spinlock_t reset_flow_resource_lock;
> struct list_head qp_list;
> struct mlx4_ib_diag_counters diag_counters[MLX4_DIAG_COUNTERS_TYPES];
> + struct notifier_block mlx_nb;
> };
>
> struct ib_event_work {
> diff --git a/drivers/net/ethernet/mellanox/mlx4/en_main.c b/drivers/net/ethernet/mellanox/mlx4/en_main.c
> index be8ba34c9025..8384bff5c37d 100644
> --- a/drivers/net/ethernet/mellanox/mlx4/en_main.c
> +++ b/drivers/net/ethernet/mellanox/mlx4/en_main.c
> @@ -183,17 +183,20 @@ static void mlx4_en_get_profile(struct mlx4_en_dev *mdev)
> }
> }
>
> -static void mlx4_en_event(struct mlx4_dev *dev, void *endev_ptr,
> - enum mlx4_dev_event event, unsigned long port)
> +static int mlx4_en_event(struct notifier_block *this,
> + unsigned long event /*mlx4_dev_event*/, void *ptr)
> {
> - struct mlx4_en_dev *mdev = (struct mlx4_en_dev *) endev_ptr;
> + struct mlx4_en_dev *mdev =
> + container_of(this, struct mlx4_en_dev, mlx_nb);
> + struct mlx4_dev *dev = mdev->dev;
> + unsigned long port = *(unsigned long *)ptr;
> struct mlx4_en_priv *priv;
>
> switch (event) {
> case MLX4_DEV_EVENT_PORT_UP:
> case MLX4_DEV_EVENT_PORT_DOWN:
> if (!mdev->pndev[port])
> - return;
> + return NOTIFY_DONE;
> priv = netdev_priv(mdev->pndev[port]);
> /* To prevent races, we poll the link state in a separate
> task rather than changing it here */
> @@ -211,10 +214,12 @@ static void mlx4_en_event(struct mlx4_dev *dev, void *endev_ptr,
> default:
> if (port < 1 || port > dev->caps.num_ports ||
> !mdev->pndev[port])
> - return;
> - mlx4_warn(mdev, "Unhandled event %d for port %d\n", event,
> + return NOTIFY_DONE;
> + mlx4_warn(mdev, "Unhandled event %d for port %d\n", (int) event,
> (int) port);
> }
> +
> + return NOTIFY_DONE;
> }
>
> static void mlx4_en_remove(struct mlx4_dev *dev, void *endev_ptr)
> @@ -222,6 +227,8 @@ static void mlx4_en_remove(struct mlx4_dev *dev, void *endev_ptr)
> struct mlx4_en_dev *mdev = endev_ptr;
> int i;
>
> + mlx4_unregister_event_notifier(dev, &mdev->mlx_nb);
> +
> mutex_lock(&mdev->state_lock);
> mdev->device_up = false;
> mutex_unlock(&mdev->state_lock);
> @@ -326,6 +333,11 @@ static void *mlx4_en_add(struct mlx4_dev *dev)
> mutex_init(&mdev->state_lock);
> mdev->device_up = true;
>
> + /* register mlx4 core notifier */
> + mdev->mlx_nb.notifier_call = mlx4_en_event;
> + err = mlx4_register_event_notifier(dev, &mdev->mlx_nb);
> + WARN(err, "failed to register mlx4 event notifier (%d)", err);
> +
> return mdev;
>
> err_mr:
> @@ -346,7 +358,6 @@ static void *mlx4_en_add(struct mlx4_dev *dev)
> static struct mlx4_interface mlx4_en_interface = {
> .add = mlx4_en_add,
> .remove = mlx4_en_remove,
> - .event = mlx4_en_event,
> .protocol = MLX4_PROT_ETH,
> .activate = mlx4_en_activate,
> };
> diff --git a/drivers/net/ethernet/mellanox/mlx4/intf.c b/drivers/net/ethernet/mellanox/mlx4/intf.c
> index 28d7da925d36..a7c3e2efa464 100644
> --- a/drivers/net/ethernet/mellanox/mlx4/intf.c
> +++ b/drivers/net/ethernet/mellanox/mlx4/intf.c
> @@ -183,17 +183,27 @@ void mlx4_dispatch_event(struct mlx4_dev *dev, enum mlx4_dev_event type,
> unsigned long param)
> {
> struct mlx4_priv *priv = mlx4_priv(dev);
> - struct mlx4_device_context *dev_ctx;
> - unsigned long flags;
>
> - spin_lock_irqsave(&priv->ctx_lock, flags);
> + atomic_notifier_call_chain(&priv->event_nh, type, &param);
> +}
>
> - list_for_each_entry(dev_ctx, &priv->ctx_list, list)
> - if (dev_ctx->intf->event)
> - dev_ctx->intf->event(dev, dev_ctx->context, type, param);
> +int mlx4_register_event_notifier(struct mlx4_dev *dev,
> + struct notifier_block *nb)
> +{
> + struct mlx4_priv *priv = mlx4_priv(dev);
>
> - spin_unlock_irqrestore(&priv->ctx_lock, flags);
> + return atomic_notifier_chain_register(&priv->event_nh, nb);
> +}
> +EXPORT_SYMBOL(mlx4_register_event_notifier);
> +
> +int mlx4_unregister_event_notifier(struct mlx4_dev *dev,
> + struct notifier_block *nb)
> +{
> + struct mlx4_priv *priv = mlx4_priv(dev);
> +
> + return atomic_notifier_chain_unregister(&priv->event_nh, nb);
> }
> +EXPORT_SYMBOL(mlx4_unregister_event_notifier);
>
> int mlx4_register_device(struct mlx4_dev *dev)
> {
> diff --git a/drivers/net/ethernet/mellanox/mlx4/main.c b/drivers/net/ethernet/mellanox/mlx4/main.c
> index 8a5409b00530..5f3ba8385e23 100644
> --- a/drivers/net/ethernet/mellanox/mlx4/main.c
> +++ b/drivers/net/ethernet/mellanox/mlx4/main.c
> @@ -3378,6 +3378,8 @@ static int mlx4_load_one(struct pci_dev *pdev, int pci_dev_data,
> INIT_LIST_HEAD(&priv->ctx_list);
> spin_lock_init(&priv->ctx_lock);
>
> + ATOMIC_INIT_NOTIFIER_HEAD(&priv->event_nh);
> +
> mutex_init(&priv->port_mutex);
> mutex_init(&priv->bond_mutex);
>
> diff --git a/drivers/net/ethernet/mellanox/mlx4/mlx4.h b/drivers/net/ethernet/mellanox/mlx4/mlx4.h
> index 6ccf340660d9..10f12e4992f1 100644
> --- a/drivers/net/ethernet/mellanox/mlx4/mlx4.h
> +++ b/drivers/net/ethernet/mellanox/mlx4/mlx4.h
> @@ -47,6 +47,7 @@
> #include <linux/spinlock.h>
> #include <net/devlink.h>
> #include <linux/rwsem.h>
> +#include <linux/notifier.h>
>
> #include <linux/mlx4/device.h>
> #include <linux/mlx4/driver.h>
> @@ -878,6 +879,7 @@ struct mlx4_priv {
> struct list_head dev_list;
> struct list_head ctx_list;
> spinlock_t ctx_lock;
> + struct atomic_notifier_head event_nh;
>
> int pci_dev_data;
> int removed;
> diff --git a/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h b/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
> index 72a3fea36702..efe3f97b874f 100644
> --- a/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
> +++ b/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
> @@ -49,6 +49,7 @@
> #include <linux/ptp_clock_kernel.h>
> #include <linux/irq.h>
> #include <net/xdp.h>
> +#include <linux/notifier.h>
>
> #include <linux/mlx4/device.h>
> #include <linux/mlx4/qp.h>
> @@ -433,6 +434,7 @@ struct mlx4_en_dev {
> struct ptp_clock *ptp_clock;
> struct ptp_clock_info ptp_clock_info;
> struct notifier_block netdev_nb;
> + struct notifier_block mlx_nb;
> };
>
>
> diff --git a/include/linux/mlx4/driver.h b/include/linux/mlx4/driver.h
> index 923951e19300..228da8ed7e75 100644
> --- a/include/linux/mlx4/driver.h
> +++ b/include/linux/mlx4/driver.h
> @@ -34,6 +34,7 @@
> #define MLX4_DRIVER_H
>
> #include <net/devlink.h>
> +#include <linux/notifier.h>
> #include <linux/mlx4/device.h>
>
> struct mlx4_dev;
> @@ -57,8 +58,6 @@ enum {
> struct mlx4_interface {
> void * (*add) (struct mlx4_dev *dev);
> void (*remove)(struct mlx4_dev *dev, void *context);
> - void (*event) (struct mlx4_dev *dev, void *context,
> - enum mlx4_dev_event event, unsigned long param);
> void (*activate)(struct mlx4_dev *dev, void *context);
> struct list_head list;
> enum mlx4_protocol protocol;
> @@ -87,6 +86,11 @@ struct mlx4_port_map {
>
> int mlx4_port_map_set(struct mlx4_dev *dev, struct mlx4_port_map *v2p);
>
> +int mlx4_register_event_notifier(struct mlx4_dev *dev,
> + struct notifier_block *nb);
> +int mlx4_unregister_event_notifier(struct mlx4_dev *dev,
> + struct notifier_block *nb);
> +
> struct devlink_port *mlx4_get_devlink_port(struct mlx4_dev *dev, int port);
>
> #endif /* MLX4_DRIVER_H */
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH net-next 07/10] mlx4: Register mlx4 devices to an auxiliary virtual bus
2023-08-04 15:05 ` [PATCH net-next 07/10] mlx4: Register mlx4 devices to an auxiliary virtual bus Petr Pavlu
@ 2023-08-06 3:16 ` Zhu Yanjun
2023-08-08 12:17 ` Petr Pavlu
2023-08-08 18:57 ` Leon Romanovsky
1 sibling, 1 reply; 28+ messages in thread
From: Zhu Yanjun @ 2023-08-06 3:16 UTC (permalink / raw)
To: Petr Pavlu, tariqt, yishaih, leon
Cc: davem, edumazet, kuba, pabeni, jgg, netdev, linux-rdma,
linux-kernel
On 2023/8/4 23:05, Petr Pavlu wrote:
> Add an auxiliary virtual bus to model the mlx4 driver structure. The
> code is added along the current custom device management logic.
> Subsequent patches switch mlx4_en and mlx4_ib to the auxiliary bus and
> the old interface is then removed.
>
> Structure mlx4_priv gains a new adev dynamic array to keep track of its
> auxiliary devices. Access to the array is protected by the global
> mlx4_intf mutex.
>
> Functions mlx4_register_device() and mlx4_unregister_device() are
> updated to expose auxiliary devices on the bus in order to load mlx4_en
> and/or mlx4_ib. Functions mlx4_register_auxiliary_driver() and
> mlx4_unregister_auxiliary_driver() are added to substitute
> mlx4_register_interface() and mlx4_unregister_interface(), respectively.
> Function mlx4_do_bond() is adjusted to walk over the adev array and
> re-adds a specific auxiliary device if its driver sets the
> MLX4_INTFF_BONDING flag.
>
> Signed-off-by: Petr Pavlu <petr.pavlu@suse.com>
> Tested-by: Leon Romanovsky <leon@kernel.org>
> ---
> drivers/net/ethernet/mellanox/mlx4/Kconfig | 1 +
> drivers/net/ethernet/mellanox/mlx4/intf.c | 230 ++++++++++++++++++++-
> drivers/net/ethernet/mellanox/mlx4/main.c | 17 +-
> drivers/net/ethernet/mellanox/mlx4/mlx4.h | 6 +
> include/linux/mlx4/device.h | 7 +
> include/linux/mlx4/driver.h | 11 +
> 6 files changed, 268 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/net/ethernet/mellanox/mlx4/Kconfig b/drivers/net/ethernet/mellanox/mlx4/Kconfig
> index 1b4b1f642317..825e05fb8607 100644
> --- a/drivers/net/ethernet/mellanox/mlx4/Kconfig
> +++ b/drivers/net/ethernet/mellanox/mlx4/Kconfig
> @@ -27,6 +27,7 @@ config MLX4_EN_DCB
> config MLX4_CORE
> tristate
> depends on PCI
> + select AUXILIARY_BUS
> select NET_DEVLINK
> default n
>
> diff --git a/drivers/net/ethernet/mellanox/mlx4/intf.c b/drivers/net/ethernet/mellanox/mlx4/intf.c
> index 30aead34ce08..4b1e18e4a682 100644
> --- a/drivers/net/ethernet/mellanox/mlx4/intf.c
> +++ b/drivers/net/ethernet/mellanox/mlx4/intf.c
> @@ -48,6 +48,89 @@ struct mlx4_device_context {
> static LIST_HEAD(intf_list);
> static LIST_HEAD(dev_list);
> static DEFINE_MUTEX(intf_mutex);
> +static DEFINE_IDA(mlx4_adev_ida);
> +
> +static const struct mlx4_adev_device {
> + const char *suffix;
> + bool (*is_supported)(struct mlx4_dev *dev);
> +} mlx4_adev_devices[1] = {};
> +
> +int mlx4_adev_init(struct mlx4_dev *dev)
> +{
> + struct mlx4_priv *priv = mlx4_priv(dev);
> +
> + priv->adev_idx = ida_alloc(&mlx4_adev_ida, GFP_KERNEL);
> + if (priv->adev_idx < 0)
> + return priv->adev_idx;
> +
> + priv->adev = kcalloc(ARRAY_SIZE(mlx4_adev_devices),
> + sizeof(struct mlx4_adev *), GFP_KERNEL);
> + if (!priv->adev) {
> + ida_free(&mlx4_adev_ida, priv->adev_idx);
> + return -ENOMEM;
> + }
> +
> + return 0;
> +}
> +
> +void mlx4_adev_cleanup(struct mlx4_dev *dev)
> +{
> + struct mlx4_priv *priv = mlx4_priv(dev);
> +
> + kfree(priv->adev);
> + ida_free(&mlx4_adev_ida, priv->adev_idx);
> +}
> +
> +static void adev_release(struct device *dev)
> +{
> + struct mlx4_adev *mlx4_adev =
> + container_of(dev, struct mlx4_adev, adev.dev);
> + struct mlx4_priv *priv = mlx4_priv(mlx4_adev->mdev);
> + int idx = mlx4_adev->idx;
> +
> + kfree(mlx4_adev);
> + priv->adev[idx] = NULL;
> +}
> +
> +static struct mlx4_adev *add_adev(struct mlx4_dev *dev, int idx)
> +{
> + struct mlx4_priv *priv = mlx4_priv(dev);
> + const char *suffix = mlx4_adev_devices[idx].suffix;
> + struct auxiliary_device *adev;
> + struct mlx4_adev *madev;
> + int ret;
> +
> + madev = kzalloc(sizeof(*madev), GFP_KERNEL);
> + if (!madev)
> + return ERR_PTR(-ENOMEM);
> +
> + adev = &madev->adev;
> + adev->id = priv->adev_idx;
> + adev->name = suffix;
> + adev->dev.parent = &dev->persist->pdev->dev;
> + adev->dev.release = adev_release;
> + madev->mdev = dev;
> + madev->idx = idx;
> +
> + ret = auxiliary_device_init(adev);
> + if (ret) {
> + kfree(madev);
> + return ERR_PTR(ret);
> + }
> +
> + ret = auxiliary_device_add(adev);
> + if (ret) {
madev is allocated, but it is not freed here when auxiliary_device_add()
fails. Should it be freed here, too?
That is, should "kfree(madev);" be added here?
If madev is freed elsewhere, please add a comment here that says where
that happens.
Zhu Yanjun
> + auxiliary_device_uninit(adev);
> + return ERR_PTR(ret);
> + }
> + return madev;
> +}
> +
> +static void del_adev(struct auxiliary_device *adev)
> +{
> + auxiliary_device_delete(adev);
> + auxiliary_device_uninit(adev);
> +}
>
> static void mlx4_add_device(struct mlx4_interface *intf, struct mlx4_priv *priv)
> {
> @@ -120,12 +203,24 @@ void mlx4_unregister_interface(struct mlx4_interface *intf)
> }
> EXPORT_SYMBOL_GPL(mlx4_unregister_interface);
>
> +int mlx4_register_auxiliary_driver(struct mlx4_adrv *madrv)
> +{
> + return auxiliary_driver_register(&madrv->adrv);
> +}
> +EXPORT_SYMBOL_GPL(mlx4_register_auxiliary_driver);
> +
> +void mlx4_unregister_auxiliary_driver(struct mlx4_adrv *madrv)
> +{
> + auxiliary_driver_unregister(&madrv->adrv);
> +}
> +EXPORT_SYMBOL_GPL(mlx4_unregister_auxiliary_driver);
> +
> int mlx4_do_bond(struct mlx4_dev *dev, bool enable)
> {
> struct mlx4_priv *priv = mlx4_priv(dev);
> struct mlx4_device_context *dev_ctx = NULL, *temp_dev_ctx;
> unsigned long flags;
> - int ret;
> + int i, ret;
> LIST_HEAD(bond_list);
>
> if (!(dev->caps.flags2 & MLX4_DEV_CAP_FLAG2_PORT_REMAP))
> @@ -177,6 +272,57 @@ int mlx4_do_bond(struct mlx4_dev *dev, bool enable)
> dev_ctx->intf->protocol, enable ?
> "enabled" : "disabled");
> }
> +
> + mutex_lock(&intf_mutex);
> +
> + for (i = 0; i < ARRAY_SIZE(mlx4_adev_devices); i++) {
> + struct mlx4_adev *madev = priv->adev[i];
> + struct mlx4_adrv *madrv;
> + enum mlx4_protocol protocol;
> +
> + if (!madev)
> + continue;
> +
> + device_lock(&madev->adev.dev);
> + if (!madev->adev.dev.driver) {
> + device_unlock(&madev->adev.dev);
> + continue;
> + }
> +
> + madrv = container_of(madev->adev.dev.driver, struct mlx4_adrv,
> + adrv.driver);
> + if (!(madrv->flags & MLX4_INTFF_BONDING)) {
> + device_unlock(&madev->adev.dev);
> + continue;
> + }
> +
> + if (mlx4_is_mfunc(dev)) {
> + mlx4_dbg(dev,
> + "SRIOV, disabled HA mode for intf proto %d\n",
> + madrv->protocol);
> + device_unlock(&madev->adev.dev);
> + continue;
> + }
> +
> + protocol = madrv->protocol;
> + device_unlock(&madev->adev.dev);
> +
> + del_adev(&madev->adev);
> + priv->adev[i] = add_adev(dev, i);
> + if (IS_ERR(priv->adev[i])) {
> + mlx4_warn(dev, "Device[%d] (%s) failed to load\n", i,
> + mlx4_adev_devices[i].suffix);
> + priv->adev[i] = NULL;
> + continue;
> + }
> +
> + mlx4_dbg(dev,
> + "Interface for protocol %d restarted with bonded mode %s\n",
> + protocol, enable ? "enabled" : "disabled");
> + }
> +
> + mutex_unlock(&intf_mutex);
> +
> return 0;
> }
>
> @@ -206,10 +352,80 @@ int mlx4_unregister_event_notifier(struct mlx4_dev *dev,
> }
> EXPORT_SYMBOL(mlx4_unregister_event_notifier);
>
> +static int add_drivers(struct mlx4_dev *dev)
> +{
> + struct mlx4_priv *priv = mlx4_priv(dev);
> + int i, ret = 0;
> +
> + for (i = 0; i < ARRAY_SIZE(mlx4_adev_devices); i++) {
> + bool is_supported = false;
> +
> + if (priv->adev[i])
> + continue;
> +
> + if (mlx4_adev_devices[i].is_supported)
> + is_supported = mlx4_adev_devices[i].is_supported(dev);
> +
> + if (!is_supported)
> + continue;
> +
> + priv->adev[i] = add_adev(dev, i);
> + if (IS_ERR(priv->adev[i])) {
> + mlx4_warn(dev, "Device[%d] (%s) failed to load\n", i,
> + mlx4_adev_devices[i].suffix);
> + /* We continue to rescan drivers and leave to the caller
> + * to make decision if to release everything or
> + * continue. */
> + ret = PTR_ERR(priv->adev[i]);
> + priv->adev[i] = NULL;
> + }
> + }
> + return ret;
> +}
> +
> +static void delete_drivers(struct mlx4_dev *dev)
> +{
> + struct mlx4_priv *priv = mlx4_priv(dev);
> + bool delete_all;
> + int i;
> +
> + delete_all = !(dev->persist->interface_state & MLX4_INTERFACE_STATE_UP);
> +
> + for (i = ARRAY_SIZE(mlx4_adev_devices) - 1; i >= 0; i--) {
> + bool is_supported = false;
> +
> + if (!priv->adev[i])
> + continue;
> +
> + if (mlx4_adev_devices[i].is_supported && !delete_all)
> + is_supported = mlx4_adev_devices[i].is_supported(dev);
> +
> + if (is_supported)
> + continue;
> +
> + del_adev(&priv->adev[i]->adev);
> + priv->adev[i] = NULL;
> + }
> +}
> +
> +/* This function is used after mlx4_dev is reconfigured.
> + */
> +static int rescan_drivers_locked(struct mlx4_dev *dev)
> +{
> + lockdep_assert_held(&intf_mutex);
> +
> + delete_drivers(dev);
> + if (!(dev->persist->interface_state & MLX4_INTERFACE_STATE_UP))
> + return 0;
> +
> + return add_drivers(dev);
> +}
> +
> int mlx4_register_device(struct mlx4_dev *dev)
> {
> struct mlx4_priv *priv = mlx4_priv(dev);
> struct mlx4_interface *intf;
> + int ret;
>
> mutex_lock(&intf_mutex);
>
> @@ -218,10 +434,18 @@ int mlx4_register_device(struct mlx4_dev *dev)
> list_for_each_entry(intf, &intf_list, list)
> mlx4_add_device(intf, priv);
>
> + ret = rescan_drivers_locked(dev);
> +
> mutex_unlock(&intf_mutex);
> +
> + if (ret) {
> + mlx4_unregister_device(dev);
> + return ret;
> + }
> +
> mlx4_start_catas_poll(dev);
>
> - return 0;
> + return ret;
> }
>
> void mlx4_unregister_device(struct mlx4_dev *dev)
> @@ -253,6 +477,8 @@ void mlx4_unregister_device(struct mlx4_dev *dev)
> list_del(&priv->dev_list);
> dev->persist->interface_state &= ~MLX4_INTERFACE_STATE_UP;
>
> + rescan_drivers_locked(dev);
> +
> mutex_unlock(&intf_mutex);
> }
>
> diff --git a/drivers/net/ethernet/mellanox/mlx4/main.c b/drivers/net/ethernet/mellanox/mlx4/main.c
> index 0ed490b99163..c4ec7377aa71 100644
> --- a/drivers/net/ethernet/mellanox/mlx4/main.c
> +++ b/drivers/net/ethernet/mellanox/mlx4/main.c
> @@ -3429,6 +3429,10 @@ static int mlx4_load_one(struct pci_dev *pdev, int pci_dev_data,
> INIT_LIST_HEAD(&priv->ctx_list);
> spin_lock_init(&priv->ctx_lock);
>
> + err = mlx4_adev_init(dev);
> + if (err)
> + return err;
> +
> ATOMIC_INIT_NOTIFIER_HEAD(&priv->event_nh);
>
> mutex_init(&priv->port_mutex);
> @@ -3455,10 +3459,11 @@ static int mlx4_load_one(struct pci_dev *pdev, int pci_dev_data,
> err = mlx4_get_ownership(dev);
> if (err) {
> if (err < 0)
> - return err;
> + goto err_adev;
> else {
> mlx4_warn(dev, "Multiple PFs not yet supported - Skipping PF\n");
> - return -EINVAL;
> + err = -EINVAL;
> + goto err_adev;
> }
> }
>
> @@ -3806,6 +3811,9 @@ static int mlx4_load_one(struct pci_dev *pdev, int pci_dev_data,
> mlx4_free_ownership(dev);
>
> kfree(dev_cap);
> +
> +err_adev:
> + mlx4_adev_cleanup(dev);
> return err;
> }
>
> @@ -4186,6 +4194,8 @@ static void mlx4_unload_one(struct pci_dev *pdev)
> mlx4_slave_destroy_special_qp_cap(dev);
> kfree(dev->dev_vfs);
>
> + mlx4_adev_cleanup(dev);
> +
> mlx4_clean_dev(dev);
> priv->pci_dev_data = pci_dev_data;
> priv->removed = 1;
> @@ -4573,6 +4583,9 @@ static int __init mlx4_init(void)
> {
> int ret;
>
> + WARN_ONCE(strcmp(MLX4_ADEV_NAME, KBUILD_MODNAME),
> + "mlx4_core name not in sync with kernel module name");
> +
> if (mlx4_verify_params())
> return -EINVAL;
>
> diff --git a/drivers/net/ethernet/mellanox/mlx4/mlx4.h b/drivers/net/ethernet/mellanox/mlx4/mlx4.h
> index ece9acb6a869..d5050bfb342f 100644
> --- a/drivers/net/ethernet/mellanox/mlx4/mlx4.h
> +++ b/drivers/net/ethernet/mellanox/mlx4/mlx4.h
> @@ -47,6 +47,7 @@
> #include <linux/spinlock.h>
> #include <net/devlink.h>
> #include <linux/rwsem.h>
> +#include <linux/auxiliary_bus.h>
> #include <linux/notifier.h>
>
> #include <linux/mlx4/device.h>
> @@ -884,6 +885,8 @@ struct mlx4_priv {
> struct list_head dev_list;
> struct list_head ctx_list;
> spinlock_t ctx_lock;
> + struct mlx4_adev **adev;
> + int adev_idx;
> struct atomic_notifier_head event_nh;
>
> int pci_dev_data;
> @@ -1052,6 +1055,9 @@ void mlx4_catas_end(struct mlx4_dev *dev);
> int mlx4_crdump_init(struct mlx4_dev *dev);
> void mlx4_crdump_end(struct mlx4_dev *dev);
> int mlx4_restart_one(struct pci_dev *pdev);
> +
> +int mlx4_adev_init(struct mlx4_dev *dev);
> +void mlx4_adev_cleanup(struct mlx4_dev *dev);
> int mlx4_register_device(struct mlx4_dev *dev);
> void mlx4_unregister_device(struct mlx4_dev *dev);
> void mlx4_dispatch_event(struct mlx4_dev *dev, enum mlx4_dev_event type,
> diff --git a/include/linux/mlx4/device.h b/include/linux/mlx4/device.h
> index 049d8a4b044d..27f42f713c89 100644
> --- a/include/linux/mlx4/device.h
> +++ b/include/linux/mlx4/device.h
> @@ -33,6 +33,7 @@
> #ifndef MLX4_DEVICE_H
> #define MLX4_DEVICE_H
>
> +#include <linux/auxiliary_bus.h>
> #include <linux/if_ether.h>
> #include <linux/pci.h>
> #include <linux/completion.h>
> @@ -889,6 +890,12 @@ struct mlx4_dev {
> u8 uar_page_shift;
> };
>
> +struct mlx4_adev {
> + struct auxiliary_device adev;
> + struct mlx4_dev *mdev;
> + int idx;
> +};
> +
> struct mlx4_clock_params {
> u64 offset;
> u8 bar;
> diff --git a/include/linux/mlx4/driver.h b/include/linux/mlx4/driver.h
> index 781d5a0c2faa..9cf157d381c6 100644
> --- a/include/linux/mlx4/driver.h
> +++ b/include/linux/mlx4/driver.h
> @@ -34,9 +34,12 @@
> #define MLX4_DRIVER_H
>
> #include <net/devlink.h>
> +#include <linux/auxiliary_bus.h>
> #include <linux/notifier.h>
> #include <linux/mlx4/device.h>
>
> +#define MLX4_ADEV_NAME "mlx4_core"
> +
> struct mlx4_dev;
>
> #define MLX4_MAC_MASK 0xffffffffffffULL
> @@ -63,8 +66,16 @@ struct mlx4_interface {
> int flags;
> };
>
> +struct mlx4_adrv {
> + struct auxiliary_driver adrv;
> + enum mlx4_protocol protocol;
> + int flags;
> +};
> +
> int mlx4_register_interface(struct mlx4_interface *intf);
> void mlx4_unregister_interface(struct mlx4_interface *intf);
> +int mlx4_register_auxiliary_driver(struct mlx4_adrv *madrv);
> +void mlx4_unregister_auxiliary_driver(struct mlx4_adrv *madrv);
>
> int mlx4_register_event_notifier(struct mlx4_dev *dev,
> struct notifier_block *nb);
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH net-next 03/10] mlx4: Replace the mlx4_interface.event callback with a notifier
2023-08-04 15:05 ` [PATCH net-next 03/10] mlx4: Replace the mlx4_interface.event callback with a notifier Petr Pavlu
2023-08-05 14:29 ` Zhu Yanjun
@ 2023-08-07 13:58 ` Simon Horman
2023-08-08 12:15 ` Petr Pavlu
1 sibling, 1 reply; 28+ messages in thread
From: Simon Horman @ 2023-08-07 13:58 UTC (permalink / raw)
To: Petr Pavlu
Cc: tariqt, yishaih, leon, davem, edumazet, kuba, pabeni, jgg, netdev,
linux-rdma, linux-kernel
On Fri, Aug 04, 2023 at 05:05:20PM +0200, Petr Pavlu wrote:
...
> diff --git a/drivers/net/ethernet/mellanox/mlx4/en_main.c b/drivers/net/ethernet/mellanox/mlx4/en_main.c
...
> @@ -326,6 +333,11 @@ static void *mlx4_en_add(struct mlx4_dev *dev)
> mutex_init(&mdev->state_lock);
> mdev->device_up = true;
>
> + /* register mlx4 core notifier */
> + mdev->mlx_nb.notifier_call = mlx4_en_event;
> + err = mlx4_register_event_notifier(dev, &mdev->mlx_nb);
Hi Petr.
This fails to build because err isn't declared in this context.
> + WARN(err, "failed to register mlx4 event notifier (%d)", err);
> +
> return mdev;
>
> err_mr:
...
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH net-next 03/10] mlx4: Replace the mlx4_interface.event callback with a notifier
2023-08-05 14:29 ` Zhu Yanjun
@ 2023-08-08 12:13 ` Petr Pavlu
0 siblings, 0 replies; 28+ messages in thread
From: Petr Pavlu @ 2023-08-08 12:13 UTC (permalink / raw)
To: Zhu Yanjun
Cc: tariqt, yishaih, leon, davem, edumazet, kuba, pabeni, jgg, netdev,
linux-rdma, linux-kernel
On 8/5/23 16:29, Zhu Yanjun wrote:
> On 2023/8/4 23:05, Petr Pavlu wrote:
>> Use a notifier to implement mlx4_dispatch_event() in preparation to
>> switch mlx4_en and mlx4_ib to be an auxiliary device.
>>
>> A problem is that if the mlx4_interface.event callback was replaced with
>> something like mlx4_adrv.event then the implementation of
>> mlx4_dispatch_event() would need to acquire a lock on a given device
>> before executing this callback. That is necessary because otherwise
>> there is no guarantee that the associated driver cannot get unbound when
>> the callback is running. However, taking this lock is not possible
>> because mlx4_dispatch_event() can be invoked from the hardirq context.
>> Using an atomic notifier allows the driver to accurately record when it
>> wants to receive these events and solves this problem.
>>
>> A handler registration is done by both mlx4_en and mlx4_ib at the end of
>> their mlx4_interface.add callback. This matches the current situation
>> when mlx4_add_device() would enable events for a given device
>> immediately after this callback, by adding the device on the
>> mlx4_priv.list.
>>
>> Signed-off-by: Petr Pavlu <petr.pavlu@suse.com>
>> Tested-by: Leon Romanovsky <leon@kernel.org>
>> ---
>> drivers/infiniband/hw/mlx4/main.c | 41 +++++++++++++-------
>> drivers/infiniband/hw/mlx4/mlx4_ib.h | 2 +
>> drivers/net/ethernet/mellanox/mlx4/en_main.c | 25 ++++++++----
>> drivers/net/ethernet/mellanox/mlx4/intf.c | 24 ++++++++----
>> drivers/net/ethernet/mellanox/mlx4/main.c | 2 +
>> drivers/net/ethernet/mellanox/mlx4/mlx4.h | 2 +
>> drivers/net/ethernet/mellanox/mlx4/mlx4_en.h | 2 +
>> include/linux/mlx4/driver.h | 8 +++-
>> 8 files changed, 76 insertions(+), 30 deletions(-)
>>
>> diff --git a/drivers/infiniband/hw/mlx4/main.c b/drivers/infiniband/hw/mlx4/main.c
>> index 7dd70d778b6b..458b4b11dffa 100644
>> --- a/drivers/infiniband/hw/mlx4/main.c
>> +++ b/drivers/infiniband/hw/mlx4/main.c
>> @@ -82,6 +82,8 @@ static const char mlx4_ib_version[] =
>> static void do_slave_init(struct mlx4_ib_dev *ibdev, int slave, int do_init);
>> static enum rdma_link_layer mlx4_ib_port_link_layer(struct ib_device *device,
>> u32 port_num);
>> +static int mlx4_ib_event(struct notifier_block *this, unsigned long event,
>> + void *ptr);
>>
>> static struct workqueue_struct *wq;
>>
>> @@ -2836,6 +2838,12 @@ static void *mlx4_ib_add(struct mlx4_dev *dev)
>> do_slave_init(ibdev, j, 1);
>> }
>> }
>> +
>> + /* register mlx4 core notifier */
>> + ibdev->mlx_nb.notifier_call = mlx4_ib_event;
>> + err = mlx4_register_event_notifier(dev, &ibdev->mlx_nb);
>> + WARN(err, "failed to register mlx4 event notifier (%d)", err);
>> +
>> return ibdev;
>>
>> err_notif:
>> @@ -2953,6 +2961,8 @@ static void mlx4_ib_remove(struct mlx4_dev *dev, void *ibdev_ptr)
>> int p;
>> int i;
>>
>> + mlx4_unregister_event_notifier(dev, &ibdev->mlx_nb);
>> +
>> mlx4_foreach_port(i, dev, MLX4_PORT_TYPE_IB)
>> devlink_port_type_clear(mlx4_get_devlink_port(dev, i));
>> ibdev->ib_active = false;
>> @@ -3173,11 +3183,14 @@ void mlx4_sched_ib_sl2vl_update_work(struct mlx4_ib_dev *ibdev,
>> }
>> }
>>
>> -static void mlx4_ib_event(struct mlx4_dev *dev, void *ibdev_ptr,
>> - enum mlx4_dev_event event, unsigned long param)
>> +static int mlx4_ib_event(struct notifier_block *this,
>> + unsigned long event /*mlx4_dev_event*/, void *ptr)
>
> /*mlx4_dev_event*/ should be removed?
The comment was meant to indicate the actual type of the event. I can
remove it.
Thanks,
Petr
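
For readers following along, here is a minimal userspace sketch of the
callback shape discussed above. It is not the mlx4 code: struct
notifier_block, NOTIFY_DONE and container_of are local stand-ins for the
kernel definitions, and the demo_* names and event values are hypothetical.
It only illustrates that the chain hands the handler a plain unsigned long,
which the handler converts back to its enum type, and that the parameter
arrives through the void pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-ins; the kernel versions live in <linux/notifier.h>. */
struct notifier_block {
	int (*notifier_call)(struct notifier_block *nb, unsigned long event,
			     void *ptr);
};
#define NOTIFY_DONE 0

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical event values, for illustration only. */
enum demo_dev_event {
	DEMO_EVENT_PORT_UP,
	DEMO_EVENT_PORT_DOWN,
};

/* Hypothetical driver state that embeds its notifier block. */
struct demo_dev {
	struct notifier_block mlx_nb;
	enum demo_dev_event last_event;
	unsigned long last_param;
};

static int demo_event(struct notifier_block *this, unsigned long event,
		      void *ptr)
{
	/* Recover the owning structure from the embedded notifier block. */
	struct demo_dev *ddev = container_of(this, struct demo_dev, mlx_nb);

	assert(ptr != NULL);
	/* The chain carries an unsigned long; convert it back once here. */
	ddev->last_event = (enum demo_dev_event)event;
	ddev->last_param = *(unsigned long *)ptr;
	return NOTIFY_DONE;
}

/* Models a dispatcher invoking one registered handler. */
void demo_dispatch(struct demo_dev *ddev, enum demo_dev_event type,
		   unsigned long param)
{
	ddev->mlx_nb.notifier_call(&ddev->mlx_nb, type, &param);
}
```

Because the real type is only recoverable by convention, a cast at the top
of the handler documents it about as well as the inline comment did.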
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH net-next 03/10] mlx4: Replace the mlx4_interface.event callback with a notifier
2023-08-07 13:58 ` Simon Horman
@ 2023-08-08 12:15 ` Petr Pavlu
0 siblings, 0 replies; 28+ messages in thread
From: Petr Pavlu @ 2023-08-08 12:15 UTC (permalink / raw)
To: Simon Horman
Cc: tariqt, yishaih, leon, davem, edumazet, kuba, pabeni, jgg, netdev,
linux-rdma, linux-kernel
On 8/7/23 15:58, Simon Horman wrote:
> On Fri, Aug 04, 2023 at 05:05:20PM +0200, Petr Pavlu wrote:
>
> ...
>
>> diff --git a/drivers/net/ethernet/mellanox/mlx4/en_main.c b/drivers/net/ethernet/mellanox/mlx4/en_main.c
>
> ...
>
>> @@ -326,6 +333,11 @@ static void *mlx4_en_add(struct mlx4_dev *dev)
>> mutex_init(&mdev->state_lock);
>> mdev->device_up = true;
>>
>> + /* register mlx4 core notifier */
>> + mdev->mlx_nb.notifier_call = mlx4_en_event;
>> + err = mlx4_register_event_notifier(dev, &mdev->mlx_nb);
>
> Hi Petr.
>
> This fails to build because err isn't declared in this context.
Ah, the err variable in mlx4_en_add() is only defined by a subsequent
patch in the series. I'll fix it in v2.
Thanks,
Petr
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH net-next 07/10] mlx4: Register mlx4 devices to an auxiliary virtual bus
2023-08-06 3:16 ` Zhu Yanjun
@ 2023-08-08 12:17 ` Petr Pavlu
0 siblings, 0 replies; 28+ messages in thread
From: Petr Pavlu @ 2023-08-08 12:17 UTC (permalink / raw)
To: Zhu Yanjun
Cc: tariqt, yishaih, leon, davem, edumazet, kuba, pabeni, jgg, netdev,
linux-rdma, linux-kernel
On 8/6/23 05:16, Zhu Yanjun wrote:
> On 2023/8/4 23:05, Petr Pavlu wrote:
>> Add an auxiliary virtual bus to model the mlx4 driver structure. The
>> code is added along the current custom device management logic.
>> Subsequent patches switch mlx4_en and mlx4_ib to the auxiliary bus and
>> the old interface is then removed.
>>
>> Structure mlx4_priv gains a new adev dynamic array to keep track of its
>> auxiliary devices. Access to the array is protected by the global
>> mlx4_intf mutex.
>>
>> Functions mlx4_register_device() and mlx4_unregister_device() are
>> updated to expose auxiliary devices on the bus in order to load mlx4_en
>> and/or mlx4_ib. Functions mlx4_register_auxiliary_driver() and
>> mlx4_unregister_auxiliary_driver() are added to substitute
>> mlx4_register_interface() and mlx4_unregister_interface(), respectively.
>> Function mlx4_do_bond() is adjusted to walk over the adev array and
>> re-adds a specific auxiliary device if its driver sets the
>> MLX4_INTFF_BONDING flag.
>>
>> Signed-off-by: Petr Pavlu <petr.pavlu@suse.com>
>> Tested-by: Leon Romanovsky <leon@kernel.org>
>> ---
>> drivers/net/ethernet/mellanox/mlx4/Kconfig | 1 +
>> drivers/net/ethernet/mellanox/mlx4/intf.c | 230 ++++++++++++++++++++-
>> drivers/net/ethernet/mellanox/mlx4/main.c | 17 +-
>> drivers/net/ethernet/mellanox/mlx4/mlx4.h | 6 +
>> include/linux/mlx4/device.h | 7 +
>> include/linux/mlx4/driver.h | 11 +
>> 6 files changed, 268 insertions(+), 4 deletions(-)
>>
>> diff --git a/drivers/net/ethernet/mellanox/mlx4/Kconfig b/drivers/net/ethernet/mellanox/mlx4/Kconfig
>> index 1b4b1f642317..825e05fb8607 100644
>> --- a/drivers/net/ethernet/mellanox/mlx4/Kconfig
>> +++ b/drivers/net/ethernet/mellanox/mlx4/Kconfig
>> @@ -27,6 +27,7 @@ config MLX4_EN_DCB
>> config MLX4_CORE
>> tristate
>> depends on PCI
>> + select AUXILIARY_BUS
>> select NET_DEVLINK
>> default n
>>
>> diff --git a/drivers/net/ethernet/mellanox/mlx4/intf.c b/drivers/net/ethernet/mellanox/mlx4/intf.c
>> index 30aead34ce08..4b1e18e4a682 100644
>> --- a/drivers/net/ethernet/mellanox/mlx4/intf.c
>> +++ b/drivers/net/ethernet/mellanox/mlx4/intf.c
>> @@ -48,6 +48,89 @@ struct mlx4_device_context {
>> static LIST_HEAD(intf_list);
>> static LIST_HEAD(dev_list);
>> static DEFINE_MUTEX(intf_mutex);
>> +static DEFINE_IDA(mlx4_adev_ida);
>> +
>> +static const struct mlx4_adev_device {
>> + const char *suffix;
>> + bool (*is_supported)(struct mlx4_dev *dev);
>> +} mlx4_adev_devices[1] = {};
>> +
>> +int mlx4_adev_init(struct mlx4_dev *dev)
>> +{
>> + struct mlx4_priv *priv = mlx4_priv(dev);
>> +
>> + priv->adev_idx = ida_alloc(&mlx4_adev_ida, GFP_KERNEL);
>> + if (priv->adev_idx < 0)
>> + return priv->adev_idx;
>> +
>> + priv->adev = kcalloc(ARRAY_SIZE(mlx4_adev_devices),
>> + sizeof(struct mlx4_adev *), GFP_KERNEL);
>> + if (!priv->adev) {
>> + ida_free(&mlx4_adev_ida, priv->adev_idx);
>> + return -ENOMEM;
>> + }
>> +
>> + return 0;
>> +}
>> +
>> +void mlx4_adev_cleanup(struct mlx4_dev *dev)
>> +{
>> + struct mlx4_priv *priv = mlx4_priv(dev);
>> +
>> + kfree(priv->adev);
>> + ida_free(&mlx4_adev_ida, priv->adev_idx);
>> +}
>> +
>> +static void adev_release(struct device *dev)
>> +{
>> + struct mlx4_adev *mlx4_adev =
>> + container_of(dev, struct mlx4_adev, adev.dev);
>> + struct mlx4_priv *priv = mlx4_priv(mlx4_adev->mdev);
>> + int idx = mlx4_adev->idx;
>> +
>> + kfree(mlx4_adev);
>> + priv->adev[idx] = NULL;
>> +}
>> +
>> +static struct mlx4_adev *add_adev(struct mlx4_dev *dev, int idx)
>> +{
>> + struct mlx4_priv *priv = mlx4_priv(dev);
>> + const char *suffix = mlx4_adev_devices[idx].suffix;
>> + struct auxiliary_device *adev;
>> + struct mlx4_adev *madev;
>> + int ret;
>> +
>> + madev = kzalloc(sizeof(*madev), GFP_KERNEL);
>> + if (!madev)
>> + return ERR_PTR(-ENOMEM);
>> +
>> + adev = &madev->adev;
>> + adev->id = priv->adev_idx;
>> + adev->name = suffix;
>> + adev->dev.parent = &dev->persist->pdev->dev;
>> + adev->dev.release = adev_release;
>> + madev->mdev = dev;
>> + madev->idx = idx;
>> +
>> + ret = auxiliary_device_init(adev);
>> + if (ret) {
>> + kfree(madev);
>> + return ERR_PTR(ret);
>> + }
>> +
>> + ret = auxiliary_device_add(adev);
>> + if (ret) {
>
> madev is allocated, but it is not freed here when auxiliary_device_add()
> fails. Should it be freed here, too?
> That is, should "kfree(madev);" be added here?
>
> If madev is freed elsewhere, please add a comment here that says where
> that happens.
A successful call to auxiliary_device_init() registers the device's
.release callback. The madev storage is freed by calling
auxiliary_device_uninit() which invokes adev_release() -> kfree().
Thanks,
Petr
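
The ownership hand-off described above can be modelled in userspace. The
sketch below is only an illustration under simplifying assumptions: the
fake_* types and functions are hypothetical stand-ins, the reference count
is a plain int rather than a kref, and the "add" step is hard-coded to
fail. It shows why the error path after a successful init must drop the
reference instead of freeing the storage directly.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct device: a refcount and a release hook. */
struct fake_device {
	int refcount;
	void (*release)(struct fake_device *dev);
};

/* Hypothetical stand-in for struct mlx4_adev, which embeds the device. */
struct fake_madev {
	struct fake_device dev;
	int idx;
};

static int release_calls; /* counts how often the release hook ran */

static void fake_release(struct fake_device *dev)
{
	/* dev is the first member, so the cast recovers the container. */
	struct fake_madev *madev = (struct fake_madev *)dev;

	release_calls++;
	free(madev);
}

/* Models the init step: takes the initial reference. */
static void fake_init(struct fake_device *dev)
{
	dev->refcount = 1;
}

/* Models the uninit step: drops a reference and releases at zero. */
static void fake_uninit(struct fake_device *dev)
{
	assert(dev->refcount > 0);
	if (--dev->refcount == 0)
		dev->release(dev);
}

/* Models an add step that always fails. */
static int fake_add_fails(struct fake_device *dev)
{
	(void)dev;
	return -1;
}

/* Mirrors the add_adev() error handling: no explicit free after init. */
struct fake_madev *fake_add_adev(int idx)
{
	struct fake_madev *madev = calloc(1, sizeof(*madev));

	if (!madev)
		return NULL;
	madev->idx = idx;
	madev->dev.release = fake_release;
	fake_init(&madev->dev);

	if (fake_add_fails(&madev->dev)) {
		fake_uninit(&madev->dev); /* release hook frees madev */
		return NULL;
	}
	return madev;
}
```

Adding a free() after fake_uninit() in this model would be a double free,
which is the same reason an extra kfree(madev) would be wrong in the patch.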
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH net-next 01/10] mlx4: Get rid of the mlx4_interface.get_dev callback
2023-08-04 15:05 ` [PATCH net-next 01/10] mlx4: Get rid of the mlx4_interface.get_dev callback Petr Pavlu
@ 2023-08-08 18:55 ` Leon Romanovsky
0 siblings, 0 replies; 28+ messages in thread
From: Leon Romanovsky @ 2023-08-08 18:55 UTC (permalink / raw)
To: Petr Pavlu
Cc: tariqt, yishaih, davem, edumazet, kuba, pabeni, jgg, netdev,
linux-rdma, linux-kernel
On Fri, Aug 04, 2023 at 05:05:18PM +0200, Petr Pavlu wrote:
> Simplify the mlx4 driver interface by removing mlx4_get_protocol_dev()
> and the associated mlx4_interface.get_dev callbacks. This is done in
> preparation to use an auxiliary bus to model the mlx4 driver structure.
>
> The change is motivated by the following situation:
> * The mlx4_en interface is being initialized by mlx4_en_add() and
> mlx4_en_activate().
> * The latter activate function calls mlx4_en_init_netdev() ->
> register_netdev() to register a new net_device.
> * A netdev event NETDEV_REGISTER is raised for the device.
> * The netdev notifier mlx4_ib_netdev_event() is called and it invokes
> mlx4_ib_scan_netdevs() -> mlx4_get_protocol_dev() ->
> mlx4_en_get_netdev() [via mlx4_interface.get_dev].
>
> This chain creates a problem when mlx4_en gets switched to be an
> auxiliary driver. It contains two device calls which would both need to
> take a respective device lock.
>
> Avoid this situation by updating mlx4_ib_scan_netdevs() to no longer
> call mlx4_get_protocol_dev() but instead to utilize the information
> passed in net_device.parent and net_device.dev_port. This data is
> sufficient to determine that an updated port is one that the mlx4_ib
> driver should take care of and to keep mlx4_ib_dev.iboe.netdevs up to
> date.
>
> Following that, update mlx4_ib_get_netdev() to also not call
> mlx4_get_protocol_dev() and instead scan all current netdevs to find
> a matching one. Note that mlx4_ib_get_netdev() is called early from
> ib_register_device() and cannot use data tracked in
> mlx4_ib_dev.iboe.netdevs which is not at that point yet set.
>
> Finally, remove function mlx4_get_protocol_dev() and the
> mlx4_interface.get_dev callbacks (only mlx4_en_get_netdev()) as they
> became unused.
>
> Signed-off-by: Petr Pavlu <petr.pavlu@suse.com>
> Tested-by: Leon Romanovsky <leon@kernel.org>
> ---
> drivers/infiniband/hw/mlx4/main.c | 89 ++++++++++----------
> drivers/net/ethernet/mellanox/mlx4/en_main.c | 8 --
> drivers/net/ethernet/mellanox/mlx4/intf.c | 21 -----
> include/linux/mlx4/driver.h | 3 -
> 4 files changed, 43 insertions(+), 78 deletions(-)
>
Thanks,
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH net-next 02/10] mlx4: Rename member mlx4_en_dev.nb to netdev_nb
2023-08-04 15:05 ` [PATCH net-next 02/10] mlx4: Rename member mlx4_en_dev.nb to netdev_nb Petr Pavlu
@ 2023-08-08 18:55 ` Leon Romanovsky
0 siblings, 0 replies; 28+ messages in thread
From: Leon Romanovsky @ 2023-08-08 18:55 UTC (permalink / raw)
To: Petr Pavlu
Cc: tariqt, yishaih, davem, edumazet, kuba, pabeni, jgg, netdev,
linux-rdma, linux-kernel
On Fri, Aug 04, 2023 at 05:05:19PM +0200, Petr Pavlu wrote:
> Rename the mlx4_en_dev.nb notifier_block member to netdev_nb in
> preparation to add a mlx4 core notifier_block.
>
> Signed-off-by: Petr Pavlu <petr.pavlu@suse.com>
> Tested-by: Leon Romanovsky <leon@kernel.org>
> ---
> drivers/net/ethernet/mellanox/mlx4/en_main.c | 14 +++++++-------
> drivers/net/ethernet/mellanox/mlx4/en_netdev.c | 2 +-
> drivers/net/ethernet/mellanox/mlx4/mlx4_en.h | 2 +-
> 3 files changed, 9 insertions(+), 9 deletions(-)
>
Thanks,
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
* Re: [PATCH net-next 04/10] mlx4: Get rid of the mlx4_interface.activate callback
2023-08-04 15:05 ` [PATCH net-next 04/10] mlx4: Get rid of the mlx4_interface.activate callback Petr Pavlu
@ 2023-08-08 18:56 ` Leon Romanovsky
0 siblings, 0 replies; 28+ messages in thread
From: Leon Romanovsky @ 2023-08-08 18:56 UTC (permalink / raw)
To: Petr Pavlu
Cc: tariqt, yishaih, davem, edumazet, kuba, pabeni, jgg, netdev,
linux-rdma, linux-kernel
On Fri, Aug 04, 2023 at 05:05:21PM +0200, Petr Pavlu wrote:
> The mlx4_interface.activate callback was introduced in commit
> 79857cd31fe7 ("net/mlx4: Postpone the registration of net_device"). It
> dealt with a situation where a netdev notifier received a NETDEV_REGISTER
> event for a new net_device created by mlx4_en but the same device was
> not yet visible to mlx4_get_protocol_dev(). The callback can be removed
> now that mlx4_get_protocol_dev() is gone.
>
> Signed-off-by: Petr Pavlu <petr.pavlu@suse.com>
> Tested-by: Leon Romanovsky <leon@kernel.org>
> ---
> drivers/net/ethernet/mellanox/mlx4/en_main.c | 37 +++++++++-----------
> drivers/net/ethernet/mellanox/mlx4/intf.c | 2 --
> include/linux/mlx4/driver.h | 1 -
> 3 files changed, 16 insertions(+), 24 deletions(-)
>
Thanks,
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
* Re: [PATCH net-next 05/10] mlx4: Move the bond work to the core driver
2023-08-04 15:05 ` [PATCH net-next 05/10] mlx4: Move the bond work to the core driver Petr Pavlu
@ 2023-08-08 18:56 ` Leon Romanovsky
0 siblings, 0 replies; 28+ messages in thread
From: Leon Romanovsky @ 2023-08-08 18:56 UTC (permalink / raw)
To: Petr Pavlu
Cc: tariqt, yishaih, davem, edumazet, kuba, pabeni, jgg, netdev,
linux-rdma, linux-kernel
On Fri, Aug 04, 2023 at 05:05:22PM +0200, Petr Pavlu wrote:
> Function mlx4_en_queue_bond_work() is used in mlx4_en to start a bond
> reconfiguration. It gathers data about a new port map setting, takes
> a reference on the netdev that triggered the change and queues a work
> object on mlx4_en_priv.mdev.workqueue to perform the operation. The
> scheduled work is mlx4_en_bond_work() which calls
> mlx4_bond()/mlx4_unbond() and consequently mlx4_do_bond().
>
> At the same time, function mlx4_change_port_types() in mlx4_core might
> be invoked to change the port type configuration. As part of its logic,
> it re-registers the whole device by calling mlx4_unregister_device(),
> followed by mlx4_register_device().
>
> The two operations can result in concurrent access to the data about
> currently active interfaces on the device.
>
> Functions mlx4_register_device() and mlx4_unregister_device() lock the
> intf_mutex to gain exclusive access to this data. The current
> implementation of mlx4_do_bond() doesn't do that, which could result in
> unexpected behavior. An updated version of mlx4_do_bond() for use
> with an auxiliary bus locks the intf_mutex when accessing a new
> auxiliary device array.
>
> However, doing so can then result in the following deadlock:
> * A two-port mlx4 device is configured as an Ethernet bond.
> * One of the ports is changed from eth to ib, for instance, by writing
> into a mlx4_port<x> sysfs attribute file.
> * mlx4_change_port_types() is called to update port types. It invokes
> mlx4_unregister_device() to unregister the device which locks the
> intf_mutex and starts removing all associated interfaces.
> * Function mlx4_en_remove() gets invoked and starts destroying its first
> netdev. This triggers mlx4_en_netdev_event() which recognizes that the
> configured bond is broken. It runs mlx4_en_queue_bond_work() which
> takes a reference on the netdev. Removing the netdev now cannot
> proceed until the work is completed.
> * Work function mlx4_en_bond_work() gets scheduled. It calls
> mlx4_unbond() -> mlx4_do_bond(). The latter function tries to lock the
> intf_mutex but that is not possible because it is held already by
> mlx4_unregister_device().
>
> This particular case could possibly be solved by unregistering the
> mlx4_en_netdev_event() notifier in mlx4_en_remove() earlier, but it
> seems better to decouple mlx4_en more and break this reference order.
>
> Avoid this scenario by recognizing that the bond reconfiguration
> operates only on a mlx4_dev. The logic to queue and execute the bond
> work can be moved into the mlx4_core driver. Only a reference on the
> respective mlx4_dev object needs to be taken during the work's
> lifetime. This removes a call from mlx4_en that can directly result in
> needing to lock the intf_mutex; locking it remains a privilege of the
> core driver.
>
> Signed-off-by: Petr Pavlu <petr.pavlu@suse.com>
> Tested-by: Leon Romanovsky <leon@kernel.org>
> ---
> .../net/ethernet/mellanox/mlx4/en_netdev.c | 62 +-----------------
> drivers/net/ethernet/mellanox/mlx4/main.c | 65 +++++++++++++++++--
> drivers/net/ethernet/mellanox/mlx4/mlx4.h | 5 ++
> include/linux/mlx4/device.h | 13 ++++
> include/linux/mlx4/driver.h | 19 ------
> 5 files changed, 77 insertions(+), 87 deletions(-)
>
Thanks,
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
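
The deadlock walked through in the quoted commit message for patch 05/10 can be modelled in a few lines of userspace C. This is a deliberately simplified, single-threaded sketch and not kernel code: a C11 atomic_flag stands in for intf_mutex, a try-lock replaces the blocking mutex_lock() so the model returns instead of hanging, and all model_* names are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Stand-in for intf_mutex; a set flag means "held". */
static atomic_flag model_intf_mutex = ATOMIC_FLAG_INIT;

static bool model_trylock(void)
{
	/* test_and_set returns the previous state: false means we got it. */
	return !atomic_flag_test_and_set(&model_intf_mutex);
}

static void model_unlock(void)
{
	atomic_flag_clear(&model_intf_mutex);
}

/* Models the bond work: returns true if it could take the lock and finish. */
static bool model_bond_work(void)
{
	if (!model_trylock())
		return false;	/* a real mutex_lock() would sleep here */
	model_unlock();
	return true;
}

/*
 * Models device unregistration: the lock is held while interfaces are
 * removed, and removal cannot finish until the pending bond work completes.
 * Returns true if the work could complete while the lock is held.
 */
static bool model_unregister_device(void)
{
	bool work_done;

	if (!model_trylock())
		return false;
	work_done = model_bond_work();
	model_unlock();
	return work_done;
}
```

In the model, model_unregister_device() reports that the bond work could not complete while the lock was held, which corresponds to the point where the real code would wait forever. Calling model_bond_work() on its own, outside the unregistration path, succeeds.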
* Re: [PATCH net-next 06/10] mlx4: Avoid resetting MLX4_INTFF_BONDING per driver
2023-08-04 15:05 ` [PATCH net-next 06/10] mlx4: Avoid resetting MLX4_INTFF_BONDING per driver Petr Pavlu
@ 2023-08-08 18:57 ` Leon Romanovsky
0 siblings, 0 replies; 28+ messages in thread
From: Leon Romanovsky @ 2023-08-08 18:57 UTC (permalink / raw)
To: Petr Pavlu
Cc: tariqt, yishaih, davem, edumazet, kuba, pabeni, jgg, netdev,
linux-rdma, linux-kernel
On Fri, Aug 04, 2023 at 05:05:23PM +0200, Petr Pavlu wrote:
> The mlx4_core driver has logic that allows a sub-driver to set the
> MLX4_INTFF_BONDING flag, which then causes function mlx4_do_bond() to
> ask the sub-driver to fully re-probe a device when its bonding
> configuration changes.
>
> Performing this operation is disallowed in mlx4_register_interface()
> when it is detected that any mlx4 device is multifunction (SRIOV). The
> code then resets MLX4_INTFF_BONDING in the driver flags.
>
> Move this check directly into mlx4_do_bond(). It provides a better
> separation as mlx4_core no longer directly modifies the sub-driver flags,
> and it will allow getting rid of explicitly keeping track of all mlx4
> devices by the intf.c code when it is switched to an auxiliary bus.
>
> Signed-off-by: Petr Pavlu <petr.pavlu@suse.com>
> Tested-by: Leon Romanovsky <leon@kernel.org>
> ---
> drivers/net/ethernet/mellanox/mlx4/intf.c | 19 +++++++++++--------
> 1 file changed, 11 insertions(+), 8 deletions(-)
>
Thanks,
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
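
The change described in the quoted commit message for patch 06/10 amounts to evaluating the multifunction condition at bond time instead of clearing the driver's flag at registration time. A minimal userspace sketch of that decision follows. The types and names (model_dev, model_drv, MODEL_INTFF_BONDING, model_wants_reprobe) are hypothetical and only mirror the idea, not the actual mlx4 structures.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_INTFF_BONDING (1u << 0)	/* models MLX4_INTFF_BONDING */

struct model_dev {
	bool multifunction;	/* true for an SRIOV (multifunction) device */
};

struct model_drv {
	unsigned int flags;	/* flags as set by the sub-driver, never modified here */
};

/* Decide at bond time whether this driver should re-probe this device. */
static bool model_wants_reprobe(const struct model_dev *dev,
				const struct model_drv *drv)
{
	if (dev->multifunction)
		return false;
	return (drv->flags & MODEL_INTFF_BONDING) != 0;
}
```

Because the check takes the device as an argument, the driver flags stay as the sub-driver set them, and no global list of devices has to be consulted when a driver registers.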
* Re: [PATCH net-next 07/10] mlx4: Register mlx4 devices to an auxiliary virtual bus
2023-08-04 15:05 ` [PATCH net-next 07/10] mlx4: Register mlx4 devices to an auxiliary virtual bus Petr Pavlu
2023-08-06 3:16 ` Zhu Yanjun
@ 2023-08-08 18:57 ` Leon Romanovsky
1 sibling, 0 replies; 28+ messages in thread
From: Leon Romanovsky @ 2023-08-08 18:57 UTC (permalink / raw)
To: Petr Pavlu
Cc: tariqt, yishaih, davem, edumazet, kuba, pabeni, jgg, netdev,
linux-rdma, linux-kernel
On Fri, Aug 04, 2023 at 05:05:24PM +0200, Petr Pavlu wrote:
> Add an auxiliary virtual bus to model the mlx4 driver structure. The
> code is added alongside the current custom device management logic.
> Subsequent patches switch mlx4_en and mlx4_ib to the auxiliary bus and
> the old interface is then removed.
>
> Structure mlx4_priv gains a new adev dynamic array to keep track of its
> auxiliary devices. Access to the array is protected by the global
> mlx4_intf mutex.
>
> Functions mlx4_register_device() and mlx4_unregister_device() are
> updated to expose auxiliary devices on the bus in order to load mlx4_en
> and/or mlx4_ib. Functions mlx4_register_auxiliary_driver() and
> mlx4_unregister_auxiliary_driver() are added to substitute
> mlx4_register_interface() and mlx4_unregister_interface(), respectively.
> Function mlx4_do_bond() is adjusted to walk over the adev array and
> re-add a specific auxiliary device if its driver sets the
> MLX4_INTFF_BONDING flag.
>
> Signed-off-by: Petr Pavlu <petr.pavlu@suse.com>
> Tested-by: Leon Romanovsky <leon@kernel.org>
> ---
> drivers/net/ethernet/mellanox/mlx4/Kconfig | 1 +
> drivers/net/ethernet/mellanox/mlx4/intf.c | 230 ++++++++++++++++++++-
> drivers/net/ethernet/mellanox/mlx4/main.c | 17 +-
> drivers/net/ethernet/mellanox/mlx4/mlx4.h | 6 +
> include/linux/mlx4/device.h | 7 +
> include/linux/mlx4/driver.h | 11 +
> 6 files changed, 268 insertions(+), 4 deletions(-)
>
Thanks,
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
* Re: [PATCH net-next 08/10] mlx4: Connect the ethernet part to the auxiliary bus
2023-08-04 15:05 ` [PATCH net-next 08/10] mlx4: Connect the ethernet part to the auxiliary bus Petr Pavlu
@ 2023-08-08 18:57 ` Leon Romanovsky
0 siblings, 0 replies; 28+ messages in thread
From: Leon Romanovsky @ 2023-08-08 18:57 UTC (permalink / raw)
To: Petr Pavlu
Cc: tariqt, yishaih, davem, edumazet, kuba, pabeni, jgg, netdev,
linux-rdma, linux-kernel
On Fri, Aug 04, 2023 at 05:05:25PM +0200, Petr Pavlu wrote:
> Use the auxiliary bus to perform device management of the ethernet part
> of the mlx4 driver.
>
> Signed-off-by: Petr Pavlu <petr.pavlu@suse.com>
> Tested-by: Leon Romanovsky <leon@kernel.org>
> ---
> drivers/net/ethernet/mellanox/mlx4/en_main.c | 67 ++++++++++++++------
> drivers/net/ethernet/mellanox/mlx4/intf.c | 13 +++-
> 2 files changed, 59 insertions(+), 21 deletions(-)
>
Thanks,
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
* Re: [PATCH net-next 09/10] mlx4: Connect the infiniband part to the auxiliary bus
2023-08-04 15:05 ` [PATCH net-next 09/10] mlx4: Connect the infiniband " Petr Pavlu
@ 2023-08-08 18:58 ` Leon Romanovsky
0 siblings, 0 replies; 28+ messages in thread
From: Leon Romanovsky @ 2023-08-08 18:58 UTC (permalink / raw)
To: Petr Pavlu
Cc: tariqt, yishaih, davem, edumazet, kuba, pabeni, jgg, netdev,
linux-rdma, linux-kernel
On Fri, Aug 04, 2023 at 05:05:26PM +0200, Petr Pavlu wrote:
> Use the auxiliary bus to perform device management of the infiniband
> part of the mlx4 driver.
>
> Signed-off-by: Petr Pavlu <petr.pavlu@suse.com>
> Tested-by: Leon Romanovsky <leon@kernel.org>
> ---
> drivers/infiniband/hw/mlx4/main.c | 77 ++++++++++++++++-------
> drivers/net/ethernet/mellanox/mlx4/intf.c | 13 ++++
> 2 files changed, 67 insertions(+), 23 deletions(-)
>
Thanks,
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
* Re: [PATCH net-next 10/10] mlx4: Delete custom device management logic
2023-08-04 15:05 ` [PATCH net-next 10/10] mlx4: Delete custom device management logic Petr Pavlu
@ 2023-08-08 18:58 ` Leon Romanovsky
0 siblings, 0 replies; 28+ messages in thread
From: Leon Romanovsky @ 2023-08-08 18:58 UTC (permalink / raw)
To: Petr Pavlu
Cc: tariqt, yishaih, davem, edumazet, kuba, pabeni, jgg, netdev,
linux-rdma, linux-kernel
On Fri, Aug 04, 2023 at 05:05:27PM +0200, Petr Pavlu wrote:
> After the conversion to use the auxiliary bus, the custom device
> management is not needed anymore and can be deleted.
>
> Signed-off-by: Petr Pavlu <petr.pavlu@suse.com>
> Tested-by: Leon Romanovsky <leon@kernel.org>
> ---
> drivers/net/ethernet/mellanox/mlx4/intf.c | 125 ----------------------
> drivers/net/ethernet/mellanox/mlx4/main.c | 28 -----
> drivers/net/ethernet/mellanox/mlx4/mlx4.h | 3 -
> include/linux/mlx4/driver.h | 10 --
> 4 files changed, 166 deletions(-)
>
Thanks,
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
* Re: [PATCH net-next 00/10] Convert mlx4 to use auxiliary bus
2023-08-04 15:05 [PATCH net-next 00/10] Convert mlx4 to use auxiliary bus Petr Pavlu
` (10 preceding siblings ...)
2023-08-04 16:49 ` [PATCH net-next 00/10] Convert mlx4 to use auxiliary bus Jason Gunthorpe
@ 2023-08-09 11:12 ` Tariq Toukan
11 siblings, 0 replies; 28+ messages in thread
From: Tariq Toukan @ 2023-08-09 11:12 UTC (permalink / raw)
To: Petr Pavlu, tariqt, yishaih, leon
Cc: davem, edumazet, kuba, pabeni, jgg, netdev, linux-rdma,
linux-kernel
On 04/08/2023 18:05, Petr Pavlu wrote:
> This series converts the mlx4 drivers to use auxiliary bus, similarly to
> how mlx5 was converted [1]. The first 6 patches are preparatory changes,
> the remaining 4 are the final conversion.
>
> Initial motivation for this change was to address a problem related to
> loading mlx4_en/mlx4_ib by mlx4_core using request_module_nowait(). When
> doing such a load in initrd, the operation is asynchronous to any init
> control and can get unexpectedly affected/interrupted by an eventual
> root switch. Using an auxiliary bus leaves these module loads to udevd
> which better integrates with systemd processing. [2]
>
> General benefit is to get rid of custom interface logic and instead use
> a common facility available for this task. An obvious risk is that some
> new bug is introduced by the conversion.
>
> Leon Romanovsky was kind enough to check for me that the series passes
> their verification tests.
>
> [1] https://lore.kernel.org/netdev/20201101201542.2027568-1-leon@kernel.org/
> [2] https://lore.kernel.org/netdev/0a361ac2-c6bd-2b18-4841-b1b991f0635e@suse.com/
>
> Petr Pavlu (10):
> mlx4: Get rid of the mlx4_interface.get_dev callback
> mlx4: Rename member mlx4_en_dev.nb to netdev_nb
> mlx4: Replace the mlx4_interface.event callback with a notifier
> mlx4: Get rid of the mlx4_interface.activate callback
> mlx4: Move the bond work to the core driver
> mlx4: Avoid resetting MLX4_INTFF_BONDING per driver
> mlx4: Register mlx4 devices to an auxiliary virtual bus
> mlx4: Connect the ethernet part to the auxiliary bus
> mlx4: Connect the infiniband part to the auxiliary bus
> mlx4: Delete custom device management logic
>
> drivers/infiniband/hw/mlx4/main.c | 207 ++++++----
> drivers/infiniband/hw/mlx4/mlx4_ib.h | 2 +
> drivers/net/ethernet/mellanox/mlx4/Kconfig | 1 +
> drivers/net/ethernet/mellanox/mlx4/en_main.c | 141 ++++---
> .../net/ethernet/mellanox/mlx4/en_netdev.c | 64 +---
> drivers/net/ethernet/mellanox/mlx4/intf.c | 361 ++++++++++++------
> drivers/net/ethernet/mellanox/mlx4/main.c | 110 ++++--
> drivers/net/ethernet/mellanox/mlx4/mlx4.h | 16 +-
> drivers/net/ethernet/mellanox/mlx4/mlx4_en.h | 4 +-
> include/linux/mlx4/device.h | 20 +
> include/linux/mlx4/driver.h | 42 +-
> 11 files changed, 572 insertions(+), 396 deletions(-)
>
>
> base-commit: 86b7e033d684a9d4ca20ad8e6f8b9300cf99668f
For the series:
Acked-by: Tariq Toukan <tariqt@nvidia.com>
Thread overview: 28+ messages
2023-08-04 15:05 [PATCH net-next 00/10] Convert mlx4 to use auxiliary bus Petr Pavlu
2023-08-04 15:05 ` [PATCH net-next 01/10] mlx4: Get rid of the mlx4_interface.get_dev callback Petr Pavlu
2023-08-08 18:55 ` Leon Romanovsky
2023-08-04 15:05 ` [PATCH net-next 02/10] mlx4: Rename member mlx4_en_dev.nb to netdev_nb Petr Pavlu
2023-08-08 18:55 ` Leon Romanovsky
2023-08-04 15:05 ` [PATCH net-next 03/10] mlx4: Replace the mlx4_interface.event callback with a notifier Petr Pavlu
2023-08-05 14:29 ` Zhu Yanjun
2023-08-08 12:13 ` Petr Pavlu
2023-08-07 13:58 ` Simon Horman
2023-08-08 12:15 ` Petr Pavlu
2023-08-04 15:05 ` [PATCH net-next 04/10] mlx4: Get rid of the mlx4_interface.activate callback Petr Pavlu
2023-08-08 18:56 ` Leon Romanovsky
2023-08-04 15:05 ` [PATCH net-next 05/10] mlx4: Move the bond work to the core driver Petr Pavlu
2023-08-08 18:56 ` Leon Romanovsky
2023-08-04 15:05 ` [PATCH net-next 06/10] mlx4: Avoid resetting MLX4_INTFF_BONDING per driver Petr Pavlu
2023-08-08 18:57 ` Leon Romanovsky
2023-08-04 15:05 ` [PATCH net-next 07/10] mlx4: Register mlx4 devices to an auxiliary virtual bus Petr Pavlu
2023-08-06 3:16 ` Zhu Yanjun
2023-08-08 12:17 ` Petr Pavlu
2023-08-08 18:57 ` Leon Romanovsky
2023-08-04 15:05 ` [PATCH net-next 08/10] mlx4: Connect the ethernet part to the auxiliary bus Petr Pavlu
2023-08-08 18:57 ` Leon Romanovsky
2023-08-04 15:05 ` [PATCH net-next 09/10] mlx4: Connect the infiniband " Petr Pavlu
2023-08-08 18:58 ` Leon Romanovsky
2023-08-04 15:05 ` [PATCH net-next 10/10] mlx4: Delete custom device management logic Petr Pavlu
2023-08-08 18:58 ` Leon Romanovsky
2023-08-04 16:49 ` [PATCH net-next 00/10] Convert mlx4 to use auxiliary bus Jason Gunthorpe
2023-08-09 11:12 ` Tariq Toukan