From: Leon Romanovsky <leon@kernel.org>
To: Petr Pavlu <petr.pavlu@suse.com>
Cc: tariqt@nvidia.com, yishaih@nvidia.com, davem@davemloft.net,
edumazet@google.com, kuba@kernel.org, pabeni@redhat.com,
jgg@ziepe.ca, netdev@vger.kernel.org, linux-rdma@vger.kernel.org,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH net-next 05/10] mlx4: Move the bond work to the core driver
Date: Tue, 8 Aug 2023 21:56:44 +0300
Message-ID: <20230808185644.GJ94631@unreal>
In-Reply-To: <20230804150527.6117-6-petr.pavlu@suse.com>

On Fri, Aug 04, 2023 at 05:05:22PM +0200, Petr Pavlu wrote:
> Function mlx4_en_queue_bond_work() is used in mlx4_en to start a bond
> reconfiguration. It gathers data about a new port map setting, takes
> a reference on the netdev that triggered the change and queues a work
> object on mlx4_en_priv.mdev.workqueue to perform the operation. The
> scheduled work is mlx4_en_bond_work() which calls
> mlx4_bond()/mlx4_unbond() and consequently mlx4_do_bond().
>
> At the same time, function mlx4_change_port_types() in mlx4_core might
> be invoked to change the port type configuration. As part of its logic,
> it re-registers the whole device by calling mlx4_unregister_device(),
> followed by mlx4_register_device().
>
> The two operations can result in concurrent access to the data about
> currently active interfaces on the device.
>
> Functions mlx4_register_device() and mlx4_unregister_device() lock the
> intf_mutex to gain exclusive access to this data. The current
> implementation of mlx4_do_bond() doesn't do that, which could result in
> unexpected behavior. An updated version of mlx4_do_bond() for use
> with an auxiliary bus locks the intf_mutex when accessing a new
> auxiliary device array.
>
> However, doing so can then result in the following deadlock:
> * A two-port mlx4 device is configured as an Ethernet bond.
> * One of the ports is changed from eth to ib, for instance, by writing
>   into a mlx4_port<x> sysfs attribute file.
> * mlx4_change_port_types() is called to update port types. It invokes
>   mlx4_unregister_device() to unregister the device which locks the
>   intf_mutex and starts removing all associated interfaces.
> * Function mlx4_en_remove() gets invoked and starts destroying its first
>   netdev. This triggers mlx4_en_netdev_event() which recognizes that the
>   configured bond is broken. It runs mlx4_en_queue_bond_work() which
>   takes a reference on the netdev. Removing the netdev now cannot
>   proceed until the work is completed.
> * Work function mlx4_en_bond_work() gets scheduled. It calls
>   mlx4_unbond() -> mlx4_do_bond(). The latter function tries to lock the
>   intf_mutex but that is not possible because it is held already by
>   mlx4_unregister_device().
>
> This particular case could possibly be solved by unregistering the
> mlx4_en_netdev_event() notifier in mlx4_en_remove() earlier, but it
> seems better to decouple mlx4_en more and break this reference order.
>
> Avoid this scenario by recognizing that the bond reconfiguration
> operates only on a mlx4_dev. The logic to queue and execute the bond
> work can be moved into the mlx4_core driver. Only a reference on the
> respective mlx4_dev object needs to be taken during the work's
> lifetime. This removes a call from mlx4_en that can directly result in
> needing to lock the intf_mutex; locking it remains a privilege of the
> core driver.
>
> Signed-off-by: Petr Pavlu <petr.pavlu@suse.com>
> Tested-by: Leon Romanovsky <leon@kernel.org>
> ---
> .../net/ethernet/mellanox/mlx4/en_netdev.c | 62 +-----------------
> drivers/net/ethernet/mellanox/mlx4/main.c | 65 +++++++++++++++++--
> drivers/net/ethernet/mellanox/mlx4/mlx4.h | 5 ++
> include/linux/mlx4/device.h | 13 ++++
> include/linux/mlx4/driver.h | 19 ------
> 5 files changed, 77 insertions(+), 87 deletions(-)
>

Thanks,
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>