* [patch rdma-next v5 1/2] net: mana: Change the function signature of mana_get_primary_netdev_rcu
@ 2025-03-06 19:24 longli
2025-03-06 19:24 ` [patch rdma-next v5 2/2] RDMA/mana_ib: Handle net event for pointing to the current netdev longli
0 siblings, 1 reply; 7+ messages in thread
From: longli @ 2025-03-06 19:24 UTC
To: Jason Gunthorpe, Leon Romanovsky, Konstantin Taranov,
David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni
Cc: linux-rdma, netdev, linux-kernel, linux-hyperv, Long Li
From: Long Li <longli@microsoft.com>
Change mana_get_primary_netdev_rcu() to mana_get_primary_netdev(), and
return the ndev with refcount held. The caller is responsible for dropping
the refcount.
Also drop the check for IFF_SLAVE as it is not necessary if the upper
device is present.
Signed-off-by: Long Li <longli@microsoft.com>
---
Changes
v4: use netdev_hold()/netdev_put() and remove the check for IFF_SLAVE
v5: use netdevice_tracker in mana_ib_dev for netdev_hold()/netdev_put()
drivers/infiniband/hw/mana/device.c | 7 +++---
drivers/infiniband/hw/mana/mana_ib.h | 1 +
drivers/net/ethernet/microsoft/mana/mana_en.c | 22 ++++++++++++-------
include/net/mana/mana.h | 4 +++-
4 files changed, 21 insertions(+), 13 deletions(-)
diff --git a/drivers/infiniband/hw/mana/device.c b/drivers/infiniband/hw/mana/device.c
index 3416a85f8738..363566095501 100644
--- a/drivers/infiniband/hw/mana/device.c
+++ b/drivers/infiniband/hw/mana/device.c
@@ -84,10 +84,8 @@ static int mana_ib_probe(struct auxiliary_device *adev,
dev->ib_dev.num_comp_vectors = mdev->gdma_context->max_num_queues;
dev->ib_dev.dev.parent = mdev->gdma_context->dev;
- rcu_read_lock(); /* required to get primary netdev */
- ndev = mana_get_primary_netdev_rcu(mc, 0);
+ ndev = mana_get_primary_netdev(mc, 0, &dev->dev_tracker);
if (!ndev) {
- rcu_read_unlock();
ret = -ENODEV;
ibdev_err(&dev->ib_dev, "Failed to get netdev for IB port 1");
goto free_ib_device;
@@ -95,7 +93,8 @@ static int mana_ib_probe(struct auxiliary_device *adev,
ether_addr_copy(mac_addr, ndev->dev_addr);
addrconf_addr_eui48((u8 *)&dev->ib_dev.node_guid, ndev->dev_addr);
ret = ib_device_set_netdev(&dev->ib_dev, ndev, 1);
- rcu_read_unlock();
+ /* mana_get_primary_netdev() returns ndev with refcount held */
+ netdev_put(ndev, &dev->dev_tracker);
if (ret) {
ibdev_err(&dev->ib_dev, "Failed to set ib netdev, ret %d", ret);
goto free_ib_device;
diff --git a/drivers/infiniband/hw/mana/mana_ib.h b/drivers/infiniband/hw/mana/mana_ib.h
index b53a5b4de908..2638688f2505 100644
--- a/drivers/infiniband/hw/mana/mana_ib.h
+++ b/drivers/infiniband/hw/mana/mana_ib.h
@@ -64,6 +64,7 @@ struct mana_ib_dev {
struct gdma_queue **eqs;
struct xarray qp_table_wq;
struct mana_ib_adapter_caps adapter_caps;
+ netdevice_tracker dev_tracker;
};
struct mana_ib_wq {
diff --git a/drivers/net/ethernet/microsoft/mana/mana_en.c b/drivers/net/ethernet/microsoft/mana/mana_en.c
index aa1e47233fe5..4e870b11f946 100644
--- a/drivers/net/ethernet/microsoft/mana/mana_en.c
+++ b/drivers/net/ethernet/microsoft/mana/mana_en.c
@@ -3131,21 +3131,27 @@ void mana_remove(struct gdma_dev *gd, bool suspending)
kfree(ac);
}
-struct net_device *mana_get_primary_netdev_rcu(struct mana_context *ac, u32 port_index)
+struct net_device *mana_get_primary_netdev(struct mana_context *ac,
+ u32 port_index,
+ netdevice_tracker *tracker)
{
struct net_device *ndev;
- RCU_LOCKDEP_WARN(!rcu_read_lock_held(),
- "Taking primary netdev without holding the RCU read lock");
if (port_index >= ac->num_ports)
return NULL;
- /* When mana is used in netvsc, the upper netdevice should be returned. */
- if (ac->ports[port_index]->flags & IFF_SLAVE)
- ndev = netdev_master_upper_dev_get_rcu(ac->ports[port_index]);
- else
+ rcu_read_lock();
+
+ /* If mana is used in netvsc, the upper netdevice should be returned. */
+ ndev = netdev_master_upper_dev_get_rcu(ac->ports[port_index]);
+
+ /* If there is no upper device, use the parent Ethernet device */
+ if (!ndev)
ndev = ac->ports[port_index];
+ netdev_hold(ndev, tracker, GFP_ATOMIC);
+ rcu_read_unlock();
+
return ndev;
}
-EXPORT_SYMBOL_NS(mana_get_primary_netdev_rcu, "NET_MANA");
+EXPORT_SYMBOL_NS(mana_get_primary_netdev, "NET_MANA");
diff --git a/include/net/mana/mana.h b/include/net/mana/mana.h
index 0d00b24eacaf..0f78065de8fe 100644
--- a/include/net/mana/mana.h
+++ b/include/net/mana/mana.h
@@ -827,5 +827,7 @@ int mana_cfg_vport(struct mana_port_context *apc, u32 protection_dom_id,
u32 doorbell_pg_id);
void mana_uncfg_vport(struct mana_port_context *apc);
-struct net_device *mana_get_primary_netdev_rcu(struct mana_context *ac, u32 port_index);
+struct net_device *mana_get_primary_netdev(struct mana_context *ac,
+ u32 port_index,
+ netdevice_tracker *tracker);
#endif /* _MANA_H */
--
2.34.1
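The contract introduced by this patch ("return the ndev with refcount held; the caller is responsible for dropping the refcount") can be modeled outside the kernel. The following is a minimal userspace sketch, not the driver code: `fake_netdev`, `fake_tracker`, `fake_hold()`, `fake_put()` and `fake_get_primary()` are hypothetical stand-ins for `struct net_device`, `netdevice_tracker`, `netdev_hold()`, `netdev_put()` and `mana_get_primary_netdev()`, and the RCU read-side section is omitted because the sketch is single-threaded.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for struct net_device. */
struct fake_netdev {
	int refcnt;                 /* outstanding references */
	struct fake_netdev *upper;  /* master/upper device, or NULL */
};

/* Hypothetical stand-in for netdevice_tracker: records one outstanding hold. */
struct fake_tracker {
	bool active;
};

/* Take a reference; the tracker is optional and may be NULL. */
static void fake_hold(struct fake_netdev *dev, struct fake_tracker *t)
{
	dev->refcnt++;
	if (t)
		t->active = true;
}

/* Drop a reference previously taken with fake_hold(). */
static void fake_put(struct fake_netdev *dev, struct fake_tracker *t)
{
	assert(dev->refcnt > 0);
	dev->refcnt--;
	if (t)
		t->active = false;
}

/*
 * Same shape as the patched lookup: prefer the upper device, fall back to
 * the port itself, and return with a reference held for the caller to drop.
 */
static struct fake_netdev *fake_get_primary(struct fake_netdev **ports,
					    unsigned int num_ports,
					    unsigned int port_index,
					    struct fake_tracker *t)
{
	struct fake_netdev *dev;

	if (port_index >= num_ports)
		return NULL;

	dev = ports[port_index]->upper;
	if (!dev)
		dev = ports[port_index];

	fake_hold(dev, t);
	return dev;
}
```

A caller pairs each successful `fake_get_primary()` with exactly one `fake_put()` on the returned device, passing the same tracker (or NULL) to both, which is the pattern `mana_ib_probe()` follows in the hunk above.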
* [patch rdma-next v5 2/2] RDMA/mana_ib: Handle net event for pointing to the current netdev
2025-03-06 19:24 [patch rdma-next v5 1/2] net: mana: Change the function signature of mana_get_primary_netdev_rcu longli
@ 2025-03-06 19:24 ` longli
2025-03-06 19:53 ` Jason Gunthorpe
0 siblings, 1 reply; 7+ messages in thread
From: longli @ 2025-03-06 19:24 UTC
To: Jason Gunthorpe, Leon Romanovsky, Konstantin Taranov,
David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni
Cc: linux-rdma, netdev, linux-kernel, linux-hyperv, Long Li
From: Long Li <longli@microsoft.com>
When running under Hyper-V, the master device to the RDMA device is always
bonded to this RDMA device. This is not user-configurable.
The master device can be unbound from and rebound to the kernel. When
that happens, the RDMA device should be set to the current netdev so that
it reflects the change of master device.
Signed-off-by: Long Li <longli@microsoft.com>
---
Changes
v2: Add missing error handling when register_netdevice_notifier() fails.
v3: Change mana_get_primary_netdev() to return with netdev refcount held.
v4: use netdev_put().
v5: use netdevice_tracker for netdev_hold()/netdev_put().
drivers/infiniband/hw/mana/device.c | 47 ++++++++++++++++++++++++++--
drivers/infiniband/hw/mana/mana_ib.h | 1 +
2 files changed, 46 insertions(+), 2 deletions(-)
diff --git a/drivers/infiniband/hw/mana/device.c b/drivers/infiniband/hw/mana/device.c
index 363566095501..b0b866b574a0 100644
--- a/drivers/infiniband/hw/mana/device.c
+++ b/drivers/infiniband/hw/mana/device.c
@@ -51,6 +51,38 @@ static const struct ib_device_ops mana_ib_dev_ops = {
ib_ind_table),
};
+static int mana_ib_netdev_event(struct notifier_block *this,
+ unsigned long event, void *ptr)
+{
+ struct mana_ib_dev *dev = container_of(this, struct mana_ib_dev, nb);
+ struct net_device *event_dev = netdev_notifier_info_to_dev(ptr);
+ struct gdma_context *gc = dev->gdma_dev->gdma_context;
+ struct mana_context *mc = gc->mana.driver_data;
+ struct net_device *ndev;
+
+ /* Only process events from our parent device */
+ if (event_dev != mc->ports[0])
+ return NOTIFY_DONE;
+
+ switch (event) {
+ case NETDEV_CHANGEUPPER:
+ ndev = mana_get_primary_netdev(mc, 0, &dev->dev_tracker);
+ /*
+ * RDMA core will setup GID based on updated netdev.
+ * It's not possible to race with the core as rtnl lock is being
+ * held.
+ */
+ ib_device_set_netdev(&dev->ib_dev, ndev, 1);
+
+ /* mana_get_primary_netdev() returns ndev with refcount held */
+ netdev_put(ndev, &dev->dev_tracker);
+
+ return NOTIFY_OK;
+ default:
+ return NOTIFY_DONE;
+ }
+}
+
static int mana_ib_probe(struct auxiliary_device *adev,
const struct auxiliary_device_id *id)
{
@@ -108,17 +140,25 @@ static int mana_ib_probe(struct auxiliary_device *adev,
}
dev->gdma_dev = &mdev->gdma_context->mana_ib;
+ dev->nb.notifier_call = mana_ib_netdev_event;
+ ret = register_netdevice_notifier(&dev->nb);
+ if (ret) {
+ ibdev_err(&dev->ib_dev, "Failed to register net notifier, %d",
+ ret);
+ goto deregister_device;
+ }
+
ret = mana_ib_gd_query_adapter_caps(dev);
if (ret) {
ibdev_err(&dev->ib_dev, "Failed to query device caps, ret %d",
ret);
- goto deregister_device;
+ goto deregister_net_notifier;
}
ret = mana_ib_create_eqs(dev);
if (ret) {
ibdev_err(&dev->ib_dev, "Failed to create EQs, ret %d", ret);
- goto deregister_device;
+ goto deregister_net_notifier;
}
ret = mana_ib_gd_create_rnic_adapter(dev);
@@ -147,6 +187,8 @@ static int mana_ib_probe(struct auxiliary_device *adev,
mana_ib_gd_destroy_rnic_adapter(dev);
destroy_eqs:
mana_ib_destroy_eqs(dev);
+deregister_net_notifier:
+ unregister_netdevice_notifier(&dev->nb);
deregister_device:
mana_gd_deregister_device(dev->gdma_dev);
free_ib_device:
@@ -162,6 +204,7 @@ static void mana_ib_remove(struct auxiliary_device *adev)
xa_destroy(&dev->qp_table_wq);
mana_ib_gd_destroy_rnic_adapter(dev);
mana_ib_destroy_eqs(dev);
+ unregister_netdevice_notifier(&dev->nb);
mana_gd_deregister_device(dev->gdma_dev);
ib_dealloc_device(&dev->ib_dev);
}
diff --git a/drivers/infiniband/hw/mana/mana_ib.h b/drivers/infiniband/hw/mana/mana_ib.h
index 2638688f2505..bb9c6b1af24e 100644
--- a/drivers/infiniband/hw/mana/mana_ib.h
+++ b/drivers/infiniband/hw/mana/mana_ib.h
@@ -65,6 +65,7 @@ struct mana_ib_dev {
struct xarray qp_table_wq;
struct mana_ib_adapter_caps adapter_caps;
netdevice_tracker dev_tracker;
+ struct notifier_block nb;
};
struct mana_ib_wq {
--
2.34.1
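The notifier added by this patch embeds a `struct notifier_block` in `struct mana_ib_dev` and recovers the owning device with `container_of()`, then ignores events that are not for its parent port. The following is a minimal userspace sketch of that pattern only. All `demo_*` and `DEMO_*` names are hypothetical; the real handler calls `ib_device_set_netdev()` where this sketch merely counts updates.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace equivalent of the kernel's container_of(), without type checks. */
#define demo_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

enum { DEMO_NOTIFY_DONE = 0, DEMO_NOTIFY_OK = 1 };
enum { DEMO_EVENT_CHANGEUPPER = 1, DEMO_EVENT_OTHER = 2 };

/* Hypothetical stand-in for struct notifier_block. */
struct demo_notifier {
	int (*call)(struct demo_notifier *nb, unsigned long event, void *ptr);
};

/* Hypothetical stand-in for struct mana_ib_dev. */
struct demo_ib_dev {
	void *parent_port;        /* the only device whose events matter */
	int updates;              /* how often the netdev was re-pointed */
	struct demo_notifier nb;  /* embedded, like the nb member in the patch */
};

static int demo_netdev_event(struct demo_notifier *nb, unsigned long event,
			     void *ptr)
{
	struct demo_ib_dev *dev;

	assert(nb != NULL);
	/* Recover the enclosing device from the embedded notifier. */
	dev = demo_container_of(nb, struct demo_ib_dev, nb);

	/* Only process events from our parent device. */
	if (ptr != dev->parent_port)
		return DEMO_NOTIFY_DONE;

	switch (event) {
	case DEMO_EVENT_CHANGEUPPER:
		dev->updates++;  /* the driver re-points the IB netdev here */
		return DEMO_NOTIFY_OK;
	default:
		return DEMO_NOTIFY_DONE;
	}
}
```

Embedding the notifier in the device structure means the handler needs no global lookup table: the callback argument alone identifies which device instance registered it.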
* Re: [patch rdma-next v5 2/2] RDMA/mana_ib: Handle net event for pointing to the current netdev
2025-03-06 19:24 ` [patch rdma-next v5 2/2] RDMA/mana_ib: Handle net event for pointing to the current netdev longli
@ 2025-03-06 19:53 ` Jason Gunthorpe
2025-03-06 20:01 ` [EXTERNAL] " Long Li
0 siblings, 1 reply; 7+ messages in thread
From: Jason Gunthorpe @ 2025-03-06 19:53 UTC
To: longli
Cc: Leon Romanovsky, Konstantin Taranov, David S. Miller,
Eric Dumazet, Jakub Kicinski, Paolo Abeni, linux-rdma, netdev,
linux-kernel, linux-hyperv, Long Li
On Thu, Mar 06, 2025 at 11:24:39AM -0800, longli@linuxonhyperv.com wrote:
> + switch (event) {
> + case NETDEV_CHANGEUPPER:
> + ndev = mana_get_primary_netdev(mc, 0, &dev->dev_tracker);
> + /*
> + * RDMA core will setup GID based on updated netdev.
> + * It's not possible to race with the core as rtnl lock is being
> + * held.
> + */
> + ib_device_set_netdev(&dev->ib_dev, ndev, 1);
> +
> + /* mana_get_primary_netdev() returns ndev with refcount held */
> + netdev_put(ndev, &dev->dev_tracker);
? What is the point of a tracker in dev if it never lasts outside this
scope?
ib_device_set_netdev() already has a tracker built into it.
Jason
* RE: [EXTERNAL] Re: [patch rdma-next v5 2/2] RDMA/mana_ib: Handle net event for pointing to the current netdev
2025-03-06 19:53 ` Jason Gunthorpe
@ 2025-03-06 20:01 ` Long Li
2025-03-10 21:46 ` Long Li
0 siblings, 1 reply; 7+ messages in thread
From: Long Li @ 2025-03-06 20:01 UTC
To: Jason Gunthorpe, longli@linuxonhyperv.com
Cc: Leon Romanovsky, Konstantin Taranov, David S. Miller,
Eric Dumazet, Jakub Kicinski, Paolo Abeni,
linux-rdma@vger.kernel.org, netdev@vger.kernel.org,
linux-kernel@vger.kernel.org, linux-hyperv@vger.kernel.org
> Subject: [EXTERNAL] Re: [patch rdma-next v5 2/2] RDMA/mana_ib: Handle net
> event for pointing to the current netdev
>
> On Thu, Mar 06, 2025 at 11:24:39AM -0800, longli@linuxonhyperv.com wrote:
> > + switch (event) {
> > + case NETDEV_CHANGEUPPER:
> > + ndev = mana_get_primary_netdev(mc, 0, &dev->dev_tracker);
> > + /*
> > + * RDMA core will setup GID based on updated netdev.
> > + * It's not possible to race with the core as rtnl lock is being
> > + * held.
> > + */
> > + ib_device_set_netdev(&dev->ib_dev, ndev, 1);
> > +
> > + /* mana_get_primary_netdev() returns ndev with refcount held
> */
> > + netdev_put(ndev, &dev->dev_tracker);
>
> ? What is the point of a tracker in dev if it never lasts outside this scope?
>
> ib_device_set_netdev() already has a tracker built into it.
>
> Jason
I was asked to use a tracker for netdev_hold()/netdev_put(). But this code (and the code in mana_ib_probe() of the 1st patch) is simple enough that everything is done in one scope.
Jakub, do you think it's okay to use NULL as the tracker in both patches?
Long
* RE: [EXTERNAL] Re: [patch rdma-next v5 2/2] RDMA/mana_ib: Handle net event for pointing to the current netdev
2025-03-06 20:01 ` [EXTERNAL] " Long Li
@ 2025-03-10 21:46 ` Long Li
2025-03-12 13:38 ` Leon Romanovsky
0 siblings, 1 reply; 7+ messages in thread
From: Long Li @ 2025-03-10 21:46 UTC
To: Jason Gunthorpe, longli@linuxonhyperv.com
Cc: Leon Romanovsky, Konstantin Taranov, David S. Miller,
Eric Dumazet, Jakub Kicinski, Paolo Abeni,
linux-rdma@vger.kernel.org, netdev@vger.kernel.org,
linux-kernel@vger.kernel.org, linux-hyperv@vger.kernel.org
> Subject: RE: [EXTERNAL] Re: [patch rdma-next v5 2/2] RDMA/mana_ib:
> Handle net event for pointing to the current netdev
>
> > Subject: [EXTERNAL] Re: [patch rdma-next v5 2/2] RDMA/mana_ib: Handle
> > net event for pointing to the current netdev
> >
> > On Thu, Mar 06, 2025 at 11:24:39AM -0800, longli@linuxonhyperv.com
> wrote:
> > > + switch (event) {
> > > + case NETDEV_CHANGEUPPER:
> > > + ndev = mana_get_primary_netdev(mc, 0, &dev->dev_tracker);
> > > + /*
> > > + * RDMA core will setup GID based on updated netdev.
> > > + * It's not possible to race with the core as rtnl lock is being
> > > + * held.
> > > + */
> > > + ib_device_set_netdev(&dev->ib_dev, ndev, 1);
> > > +
> > > + /* mana_get_primary_netdev() returns ndev with refcount
> held
> > */
> > > + netdev_put(ndev, &dev->dev_tracker);
> >
> > ? What is the point of a tracker in dev if it never lasts outside this scope?
> >
> > ib_device_set_netdev() already has a tracker built into it.
> >
> > Jason
>
> I was asked to use a tracker for netdev_hold()/netdev_put(). But this code
> (and the code in mana_ib_probe() of the 1st patch) is simple enough that
> everything is done in one scope.
>
> Jakub, do you think it's okay to use NULL as the tracker in both patches?
>
> Long
Hi,
If we don't want to use a tracker, can we take the v4 version of the patch set?
Otherwise, please take v5 (this patch) if a tracker is required.
Thanks,
Long
* Re: [EXTERNAL] Re: [patch rdma-next v5 2/2] RDMA/mana_ib: Handle net event for pointing to the current netdev
2025-03-10 21:46 ` Long Li
@ 2025-03-12 13:38 ` Leon Romanovsky
2025-03-12 23:16 ` Long Li
0 siblings, 1 reply; 7+ messages in thread
From: Leon Romanovsky @ 2025-03-12 13:38 UTC
To: Long Li
Cc: Jason Gunthorpe, longli@linuxonhyperv.com, Konstantin Taranov,
David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
linux-rdma@vger.kernel.org, netdev@vger.kernel.org,
linux-kernel@vger.kernel.org, linux-hyperv@vger.kernel.org
On Mon, Mar 10, 2025 at 09:46:08PM +0000, Long Li wrote:
> > Subject: RE: [EXTERNAL] Re: [patch rdma-next v5 2/2] RDMA/mana_ib:
> > Handle net event for pointing to the current netdev
> >
> > > Subject: [EXTERNAL] Re: [patch rdma-next v5 2/2] RDMA/mana_ib: Handle
> > > net event for pointing to the current netdev
> > >
> > > On Thu, Mar 06, 2025 at 11:24:39AM -0800, longli@linuxonhyperv.com
> > wrote:
> > > > + switch (event) {
> > > > + case NETDEV_CHANGEUPPER:
> > > > + ndev = mana_get_primary_netdev(mc, 0, &dev->dev_tracker);
> > > > + /*
> > > > + * RDMA core will setup GID based on updated netdev.
> > > > + * It's not possible to race with the core as rtnl lock is being
> > > > + * held.
> > > > + */
> > > > + ib_device_set_netdev(&dev->ib_dev, ndev, 1);
> > > > +
> > > > + /* mana_get_primary_netdev() returns ndev with refcount
> > held
> > > */
> > > > + netdev_put(ndev, &dev->dev_tracker);
> > >
> > > ? What is the point of a tracker in dev if it never lasts outside this scope?
> > >
> > > ib_device_set_netdev() already has a tracker built into it.
> > >
> > > Jason
> >
> > I was asked to use a tracker for netdev_hold()/netdev_put(). But this code
> > (and the code in mana_ib_probe() of the 1st patch) is simple enough that
> > everything is done in one scope.
> >
> > Jakub, do you think it's okay to use NULL as the tracker in both patches?
> >
> > Long
>
> Hi,
>
> If we don't want to use a tracker, can we take the v4 version of the patch set?
>
> Otherwise, please take v5 (this patch) if a tracker is required.
Let's use the v5 version, as it is the more complete variant. However, the
series needs to be rebased, as it doesn't apply after Konstantin's changes.
Thanks
>
> Thanks,
> Long
* RE: [EXTERNAL] Re: [patch rdma-next v5 2/2] RDMA/mana_ib: Handle net event for pointing to the current netdev
2025-03-12 13:38 ` Leon Romanovsky
@ 2025-03-12 23:16 ` Long Li
0 siblings, 0 replies; 7+ messages in thread
From: Long Li @ 2025-03-12 23:16 UTC
To: Leon Romanovsky
Cc: Jason Gunthorpe, longli@linuxonhyperv.com, Konstantin Taranov,
David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
linux-rdma@vger.kernel.org, netdev@vger.kernel.org,
linux-kernel@vger.kernel.org, linux-hyperv@vger.kernel.org
> > Hi,
> >
> > If we don't want to use a tracker, can we take the v4 version of the patch set?
> >
> > Otherwise, please take v5 (this patch) if a tracker is required.
>
> Let's use the v5 version, as it is the more complete variant. However, the
> series needs to be rebased, as it doesn't apply after Konstantin's changes.
>
> Thanks
I rebased the patch and sent v6.
Thanks,
Long