linux-hyperv.vger.kernel.org archive mirror
* RE: [PATCH rdma-next 1/1] RDMA/mana_ib: Set correct device into ib
       [not found]     ` <20240626121118.GP29266@unreal>
@ 2024-11-21  0:03       ` Long Li
  2024-11-25 15:56         ` Parav Pandit
  0 siblings, 1 reply; 7+ messages in thread
From: Long Li @ 2024-11-21  0:03 UTC (permalink / raw)
  To: Leon Romanovsky, Konstantin Taranov
  Cc: Konstantin Taranov, Wei Hu, sharmaajay@microsoft.com,
	jgg@ziepe.ca, linux-rdma@vger.kernel.org, linux-netdev,
	open list:Hyper-V/Azure CORE AND DRIVERS

> >
> > Actually, another alternative solution for mana_ib is always set the
> > slave device, but in the GID mgmt code we need the following patch.
> > The problem is that it may require testing/confirmation from other ib providers
> as in the worst case some GIDs will not be listed.
> 
> is_eth_active_slave_of_bonding_rcu() is for bonding.

Sorry, need to bring this issue up again.

This patch has broken user-space programs (e.g. DPDK) that need to export a kernel device to user-mode.

With this patch, the RDMA driver grabs a reference on the master device, so it's impossible to move the master device to user-mode.

I think the root cause is that the individual driver should not decide on which (master or slave) address should be used for GID. roce_gid_mgmt.c should handle this situation.

I think Konstantin's suggestion makes sense. How about we do this (no need to define netdev_is_slave(dev)):

--- a/drivers/infiniband/core/roce_gid_mgmt.c
+++ b/drivers/infiniband/core/roce_gid_mgmt.c
@@ -161,7 +161,7 @@ is_eth_port_of_netdev_filter(struct ib_device *ib_dev, u32 port,
        res = ((rdma_is_upper_dev_rcu(rdma_ndev, cookie) &&
               (is_eth_active_slave_of_bonding_rcu(rdma_ndev, real_dev) &
                REQUIRED_BOND_STATES)) ||
-              real_dev == rdma_ndev);
+              (real_dev == rdma_ndev && !netif_is_bond_slave(rdma_ndev)));

        rcu_read_unlock();
        return res;


is_eth_port_of_netdev_filter() should not return true if this netdev is a bonded slave. In this case, only use the address of its bonded master.
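
To make the effect of the one-line change concrete, here is a minimal user-space sketch of the predicate before and after the patch. It is not the kernel code: struct model_netdev and every model_* function are hypothetical stand-ins, VLAN handling is ignored (so real_dev is taken to be the cookie itself), and the REQUIRED_BOND_STATES test is reduced to a single boolean.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct net_device: only what this model needs. */
struct model_netdev {
	const struct model_netdev *bond_master; /* NULL when not enslaved */
	bool active_in_bond;                    /* simplification of the REQUIRED_BOND_STATES test */
};

/* Models netif_is_bond_slave(): true when the device is enslaved to a bond. */
static bool model_is_bond_slave(const struct model_netdev *dev)
{
	return dev->bond_master != NULL;
}

/*
 * Models the first half of the expression: cookie is the bond device
 * above rdma_ndev, and rdma_ndev is in an acceptable bonding state.
 */
static bool model_upper_match(const struct model_netdev *rdma_ndev,
			      const struct model_netdev *cookie)
{
	return cookie != NULL &&
	       rdma_ndev->bond_master == cookie &&
	       rdma_ndev->active_in_bond;
}

/* Behaviour before the change: the port's own netdev always matches. */
static bool model_filter_before(const struct model_netdev *rdma_ndev,
				const struct model_netdev *cookie)
{
	return model_upper_match(rdma_ndev, cookie) || cookie == rdma_ndev;
}

/* Proposed behaviour: the port's own netdev matches only if it is not a bond slave. */
static bool model_filter_after(const struct model_netdev *rdma_ndev,
			       const struct model_netdev *cookie)
{
	return model_upper_match(rdma_ndev, cookie) ||
	       (cookie == rdma_ndev && !model_is_bond_slave(rdma_ndev));
}
```

Under these assumptions, an enslaved rdma_ndev no longer matches itself, so its own address would not be used for GIDs, while it still matches its bond master. A device that is not enslaved behaves the same before and after.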

Thanks,

Long


* RE: [PATCH rdma-next 1/1] RDMA/mana_ib: Set correct device into ib
  2024-11-21  0:03       ` [PATCH rdma-next 1/1] RDMA/mana_ib: Set correct device into ib Long Li
@ 2024-11-25 15:56         ` Parav Pandit
  2024-11-25 20:10           ` Leon Romanovsky
  0 siblings, 1 reply; 7+ messages in thread
From: Parav Pandit @ 2024-11-25 15:56 UTC (permalink / raw)
  To: NBU-Contact-longli (EXTERNAL), Leon Romanovsky,
	Konstantin Taranov
  Cc: Konstantin Taranov, Wei Hu, sharmaajay@microsoft.com,
	jgg@ziepe.ca, linux-rdma@vger.kernel.org, linux-netdev,
	open list:Hyper-V/Azure CORE AND DRIVERS



> From: Long Li <longli@microsoft.com>
> Sent: Thursday, November 21, 2024 5:34 AM
> 
> > >
> > > Actually, another alternative solution for mana_ib is always set the
> > > slave device, but in the GID mgmt code we need the following patch.
> > > The problem is that it may require testing/confirmation from other
> > > ib providers
> > as in the worst case some GIDs will not be listed.
> >
> > is_eth_active_slave_of_bonding_rcu() is for bonding.
> 
> Sorry, need to bring this issue up again.
> 
> This patch has broken user-space programs (e.g DPDK) that requires to
> export a kernel device to user-mode.
> 
> With this patch, the RDMA driver grabbed a reference from the master
> device, it's impossible to move the master device to user-mode.
> 
> I think the root cause is that the individual driver should not decide on which
> (master or slave) address should be used for GID. roce_gid_mgmt.c should
> handle this situation.
> 
> I think Konstantin's suggestion makes sense, how about we do this (don't
> need to define netdev_is_slave(dev)):
> 
> --- a/drivers/infiniband/core/roce_gid_mgmt.c
> +++ b/drivers/infiniband/core/roce_gid_mgmt.c
> @@ -161,7 +161,7 @@ is_eth_port_of_netdev_filter(struct ib_device
> *ib_dev, u32 port,
>         res = ((rdma_is_upper_dev_rcu(rdma_ndev, cookie) &&
>                (is_eth_active_slave_of_bonding_rcu(rdma_ndev, real_dev) &
>                 REQUIRED_BOND_STATES)) ||
> -              real_dev == rdma_ndev);
> +              (real_dev == rdma_ndev &&
> + !netif_is_bond_slave(rdma_ndev)));
> 
>         rcu_read_unlock();
>         return res;
> 
> 
> is_eth_port_of_netdev_filter() should not return true if this netdev is a
> bonded slave. In this case, only use the address of its bonded master.
> 
Right. This change makes sense to me.
I don't have a setup presently to verify it to ensure I didn't miss a corner case.
Leon,
Can you or others please test the regression once with the formal patch?


* Re: [PATCH rdma-next 1/1] RDMA/mana_ib: Set correct device into ib
  2024-11-25 15:56         ` Parav Pandit
@ 2024-11-25 20:10           ` Leon Romanovsky
  2024-11-27 19:46             ` [EXTERNAL] " Long Li
  0 siblings, 1 reply; 7+ messages in thread
From: Leon Romanovsky @ 2024-11-25 20:10 UTC (permalink / raw)
  To: Parav Pandit
  Cc: NBU-Contact-longli (EXTERNAL), Konstantin Taranov,
	Konstantin Taranov, Wei Hu, sharmaajay@microsoft.com,
	jgg@ziepe.ca, linux-rdma@vger.kernel.org, linux-netdev,
	open list:Hyper-V/Azure CORE AND DRIVERS

On Mon, Nov 25, 2024 at 03:56:01PM +0000, Parav Pandit wrote:
> 
> 
> > From: Long Li <longli@microsoft.com>
> > Sent: Thursday, November 21, 2024 5:34 AM
> > 
> > > >
> > > > Actually, another alternative solution for mana_ib is always set the
> > > > slave device, but in the GID mgmt code we need the following patch.
> > > > The problem is that it may require testing/confirmation from other
> > > > ib providers
> > > as in the worst case some GIDs will not be listed.
> > >
> > > is_eth_active_slave_of_bonding_rcu() is for bonding.
> > 
> > Sorry, need to bring this issue up again.
> > 
> > This patch has broken user-space programs (e.g DPDK) that requires to
> > export a kernel device to user-mode.
> > 
> > With this patch, the RDMA driver grabbed a reference from the master
> > device, it's impossible to move the master device to user-mode.
> > 
> > I think the root cause is that the individual driver should not decide on which
> > (master or slave) address should be used for GID. roce_gid_mgmt.c should
> > handle this situation.
> > 
> > I think Konstantin's suggestion makes sense, how about we do this (don't
> > need to define netdev_is_slave(dev)):
> > 
> > --- a/drivers/infiniband/core/roce_gid_mgmt.c
> > +++ b/drivers/infiniband/core/roce_gid_mgmt.c
> > @@ -161,7 +161,7 @@ is_eth_port_of_netdev_filter(struct ib_device
> > *ib_dev, u32 port,
> >         res = ((rdma_is_upper_dev_rcu(rdma_ndev, cookie) &&
> >                (is_eth_active_slave_of_bonding_rcu(rdma_ndev, real_dev) &
> >                 REQUIRED_BOND_STATES)) ||
> > -              real_dev == rdma_ndev);
> > +              (real_dev == rdma_ndev &&
> > + !netif_is_bond_slave(rdma_ndev)));
> > 
> >         rcu_read_unlock();
> >         return res;
> > 
> > 
> > is_eth_port_of_netdev_filter() should not return true if this netdev is a
> > bonded slave. In this case, only use the address of its bonded master.
> > 
> Right. This change makes sense to me.
> I don't have a setup presently to verify it to ensure I didn't miss a corner case.
> Leon,
> Can you or others please test the regression once with the formal patch?

Sure, once Long sends the patch, I'll make sure that it is tested.

Thanks

> 


* RE: [EXTERNAL] Re: [PATCH rdma-next 1/1] RDMA/mana_ib: Set correct device into ib
  2024-11-25 20:10           ` Leon Romanovsky
@ 2024-11-27 19:46             ` Long Li
  2024-11-28  9:39               ` Leon Romanovsky
  0 siblings, 1 reply; 7+ messages in thread
From: Long Li @ 2024-11-27 19:46 UTC (permalink / raw)
  To: Leon Romanovsky, Parav Pandit
  Cc: Konstantin Taranov, Konstantin Taranov, Wei Hu,
	sharmaajay@microsoft.com, jgg@ziepe.ca,
	linux-rdma@vger.kernel.org, linux-netdev,
	open list:Hyper-V/Azure CORE AND DRIVERS


> > > I think Konstantin's suggestion makes sense, how about we do this
> > > (don't need to define netdev_is_slave(dev)):
> > >
> > > --- a/drivers/infiniband/core/roce_gid_mgmt.c
> > > +++ b/drivers/infiniband/core/roce_gid_mgmt.c
> > > @@ -161,7 +161,7 @@ is_eth_port_of_netdev_filter(struct ib_device
> > > *ib_dev, u32 port,
> > >         res = ((rdma_is_upper_dev_rcu(rdma_ndev, cookie) &&
> > >                (is_eth_active_slave_of_bonding_rcu(rdma_ndev, real_dev) &
> > >                 REQUIRED_BOND_STATES)) ||
> > > -              real_dev == rdma_ndev);
> > > +              (real_dev == rdma_ndev &&
> > > + !netif_is_bond_slave(rdma_ndev)));
> > >
> > >         rcu_read_unlock();
> > >         return res;
> > >
> > >
> > > is_eth_port_of_netdev_filter() should not return true if this netdev
> > > is a bonded slave. In this case, only use the address of its bonded master.
> > >
> > Right. This change makes sense to me.
> > I don't have a setup presently to verify it to ensure I didn't miss a corner case.
> > Leon,
> > Can you or others please test the regression once with the formal patch?
> 
> Sure, once Long will send the patch, I'll make sure that it is tested.
> 
> Thanks
> 

I posted patches for discussion.
https://lore.kernel.org/linux-rdma/1732736619-19941-1-git-send-email-longli@linuxonhyperv.com/T/#t

Thank you,
Long



* Re: [EXTERNAL] Re: [PATCH rdma-next 1/1] RDMA/mana_ib: Set correct device into ib
  2024-11-27 19:46             ` [EXTERNAL] " Long Li
@ 2024-11-28  9:39               ` Leon Romanovsky
  2024-12-03 18:32                 ` Long Li
  2025-02-07 21:39                 ` Long Li
  0 siblings, 2 replies; 7+ messages in thread
From: Leon Romanovsky @ 2024-11-28  9:39 UTC (permalink / raw)
  To: Long Li
  Cc: Parav Pandit, Konstantin Taranov, Konstantin Taranov, Wei Hu,
	sharmaajay@microsoft.com, jgg@ziepe.ca,
	linux-rdma@vger.kernel.org, linux-netdev,
	open list:Hyper-V/Azure CORE AND DRIVERS

On Wed, Nov 27, 2024 at 07:46:39PM +0000, Long Li wrote:
> 
> > > > I think Konstantin's suggestion makes sense, how about we do this
> > > > (don't need to define netdev_is_slave(dev)):
> > > >
> > > > --- a/drivers/infiniband/core/roce_gid_mgmt.c
> > > > +++ b/drivers/infiniband/core/roce_gid_mgmt.c
> > > > @@ -161,7 +161,7 @@ is_eth_port_of_netdev_filter(struct ib_device
> > > > *ib_dev, u32 port,
> > > >         res = ((rdma_is_upper_dev_rcu(rdma_ndev, cookie) &&
> > > >                (is_eth_active_slave_of_bonding_rcu(rdma_ndev, real_dev) &
> > > >                 REQUIRED_BOND_STATES)) ||
> > > > -              real_dev == rdma_ndev);
> > > > +              (real_dev == rdma_ndev &&
> > > > + !netif_is_bond_slave(rdma_ndev)));
> > > >
> > > >         rcu_read_unlock();
> > > >         return res;
> > > >
> > > >
> > > > is_eth_port_of_netdev_filter() should not return true if this netdev
> > > > is a bonded slave. In this case, only use the address of its bonded master.
> > > >
> > > Right. This change makes sense to me.
> > > I don't have a setup presently to verify it to ensure I didn't miss a corner case.
> > > Leon,
> > > Can you or others please test the regression once with the formal patch?
> > 
> > Sure, once Long will send the patch, I'll make sure that it is tested.
> > 
> > Thanks
> > 
> 
> I posted patches for discussion.
> https://lore.kernel.org/linux-rdma/1732736619-19941-1-git-send-email-longli@linuxonhyperv.com/T/#t

Please resend these patches as a series with a cover letter, and don't embed
the extra patch (the one which is not numbered) into the series.

Thanks

> 
> Thank you,
> Long
> 


* RE: [EXTERNAL] Re: [PATCH rdma-next 1/1] RDMA/mana_ib: Set correct device into ib
  2024-11-28  9:39               ` Leon Romanovsky
@ 2024-12-03 18:32                 ` Long Li
  2025-02-07 21:39                 ` Long Li
  1 sibling, 0 replies; 7+ messages in thread
From: Long Li @ 2024-12-03 18:32 UTC (permalink / raw)
  To: Leon Romanovsky
  Cc: Parav Pandit, Konstantin Taranov, Konstantin Taranov, Wei Hu,
	sharmaajay@microsoft.com, jgg@ziepe.ca,
	linux-rdma@vger.kernel.org, linux-netdev,
	open list:Hyper-V/Azure CORE AND DRIVERS

> Subject: Re: [EXTERNAL] Re: [PATCH rdma-next 1/1] RDMA/mana_ib: Set correct
> device into ib
> 
> On Wed, Nov 27, 2024 at 07:46:39PM +0000, Long Li wrote:
> >
> > > > > I think Konstantin's suggestion makes sense, how about we do
> > > > > this (don't need to define netdev_is_slave(dev)):
> > > > >
> > > > > --- a/drivers/infiniband/core/roce_gid_mgmt.c
> > > > > +++ b/drivers/infiniband/core/roce_gid_mgmt.c
> > > > > @@ -161,7 +161,7 @@ is_eth_port_of_netdev_filter(struct
> > > > > ib_device *ib_dev, u32 port,
> > > > >         res = ((rdma_is_upper_dev_rcu(rdma_ndev, cookie) &&
> > > > >                (is_eth_active_slave_of_bonding_rcu(rdma_ndev, real_dev) &
> > > > >                 REQUIRED_BOND_STATES)) ||
> > > > > -              real_dev == rdma_ndev);
> > > > > +              (real_dev == rdma_ndev &&
> > > > > + !netif_is_bond_slave(rdma_ndev)));
> > > > >
> > > > >         rcu_read_unlock();
> > > > >         return res;
> > > > >
> > > > >
> > > > > is_eth_port_of_netdev_filter() should not return true if this
> > > > > netdev is a bonded slave. In this case, only use the address of its bonded
> master.
> > > > >
> > > > Right. This change makes sense to me.
> > > > I don't have a setup presently to verify it to ensure I didn't miss a corner
> case.
> > > > Leon,
> > > > Can you or others please test the regression once with the formal patch?
> > >
> > > Sure, once Long will send the patch, I'll make sure that it is tested.
> > >
> > > Thanks
> > >
> >
> > I posted patches for discussion.
> > > https://lore.kernel.org/linux-rdma/1732736619-19941-1-git-send-email-longli@linuxonhyperv.com/T/#t
> 
> Please resend these patches as series with cover letter and don't embed extra
> patch (the one which is not numbered) into the series.
> 
> Thanks


I will resend those as a series after addressing the other comments on bonding.

Thanks



* RE: [EXTERNAL] Re: [PATCH rdma-next 1/1] RDMA/mana_ib: Set correct device into ib
  2024-11-28  9:39               ` Leon Romanovsky
  2024-12-03 18:32                 ` Long Li
@ 2025-02-07 21:39                 ` Long Li
  1 sibling, 0 replies; 7+ messages in thread
From: Long Li @ 2025-02-07 21:39 UTC (permalink / raw)
  To: Leon Romanovsky
  Cc: Parav Pandit, Konstantin Taranov, Konstantin Taranov, Wei Hu,
	sharmaajay@microsoft.com, jgg@ziepe.ca,
	linux-rdma@vger.kernel.org, linux-netdev,
	open list:Hyper-V/Azure CORE AND DRIVERS

> On Wed, Nov 27, 2024 at 07:46:39PM +0000, Long Li wrote:
> >
> > > > > I think Konstantin's suggestion makes sense, how about we do
> > > > > this (don't need to define netdev_is_slave(dev)):
> > > > >
> > > > > --- a/drivers/infiniband/core/roce_gid_mgmt.c
> > > > > +++ b/drivers/infiniband/core/roce_gid_mgmt.c
> > > > > @@ -161,7 +161,7 @@ is_eth_port_of_netdev_filter(struct
> > > > > ib_device *ib_dev, u32 port,
> > > > >         res = ((rdma_is_upper_dev_rcu(rdma_ndev, cookie) &&
> > > > >                (is_eth_active_slave_of_bonding_rcu(rdma_ndev, real_dev) &
> > > > >                 REQUIRED_BOND_STATES)) ||
> > > > > -              real_dev == rdma_ndev);
> > > > > +              (real_dev == rdma_ndev &&
> > > > > + !netif_is_bond_slave(rdma_ndev)));
> > > > >
> > > > >         rcu_read_unlock();
> > > > >         return res;
> > > > >
> > > > >
> > > > > is_eth_port_of_netdev_filter() should not return true if this
> > > > > netdev is a bonded slave. In this case, only use the address of its bonded
> master.
> > > > >
> > > > Right. This change makes sense to me.
> > > > I don't have a setup presently to verify it to ensure I didn't miss a corner
> case.
> > > > Leon,
> > > > Can you or others please test the regression once with the formal patch?
> > >
> > > Sure, once Long will send the patch, I'll make sure that it is tested.
> > >
> > > Thanks
> > >
> >
> > I posted patches for discussion.
> > > https://lore.kernel.org/linux-rdma/1732736619-19941-1-git-send-email-longli@linuxonhyperv.com/T/#t
> 
> Please resend these patches as series with cover letter and don't embed extra
> patch (the one which is not numbered) into the series.
> 
> Thanks

Sorry for the late reply. I have done some more testing and sent those patches as a series with a cover letter.

Please review the series.

Thanks,
Long



end of thread, other threads:[~2025-02-07 21:39 UTC | newest]

Thread overview: 7+ messages
     [not found] <1719311307-7920-1-git-send-email-kotaranov@linux.microsoft.com>
     [not found] ` <20240626054748.GN29266@unreal>
     [not found]   ` <PAXPR83MB0559F4678E73B0091A8ADFBBB4D62@PAXPR83MB0559.EURPRD83.prod.outlook.com>
     [not found]     ` <20240626121118.GP29266@unreal>
2024-11-21  0:03       ` [PATCH rdma-next 1/1] RDMA/mana_ib: Set correct device into ib Long Li
2024-11-25 15:56         ` Parav Pandit
2024-11-25 20:10           ` Leon Romanovsky
2024-11-27 19:46             ` [EXTERNAL] " Long Li
2024-11-28  9:39               ` Leon Romanovsky
2024-12-03 18:32                 ` Long Li
2025-02-07 21:39                 ` Long Li
