* [PATCH net 1/4] bonding: implement xfrm state xdo_dev_state_free API
2024-07-29 12:44 [PATCH net 0/4] Fixes for IPsec over bonding Tariq Toukan
@ 2024-07-29 12:44 ` Tariq Toukan
2024-07-31 3:27 ` Hangbin Liu
2024-07-29 12:44 ` [PATCH net 2/4] bonding: call xfrm state xdo_dev_state_free after deletion Tariq Toukan
` (3 subsequent siblings)
4 siblings, 1 reply; 12+ messages in thread
From: Tariq Toukan @ 2024-07-29 12:44 UTC (permalink / raw)
To: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet,
Jay Vosburgh, Andy Gospodarek
Cc: netdev, Saeed Mahameed, Gal Pressman, Leon Romanovsky, Jianbo Liu,
Cosmin Ratiu, Tariq Toukan
From: Jianbo Liu <jianbol@nvidia.com>
Add this implementation for bonding, so hardware resources can be
freed after xfrm state is deleted.
Fixes: 9a5605505d9c ("bonding: Add struct bond_ipesc to manage SA")
Signed-off-by: Jianbo Liu <jianbol@nvidia.com>
Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
---
drivers/net/bonding/bond_main.c | 10 ++++++++++
1 file changed, 10 insertions(+)
diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 1cd92c12e782..3b880ff2b82a 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -588,6 +588,15 @@ static void bond_ipsec_del_sa_all(struct bonding *bond)
rcu_read_unlock();
}
+static void bond_ipsec_free_sa(struct xfrm_state *xs)
+{
+ struct net_device *real_dev = xs->xso.real_dev;
+
+ if (real_dev && real_dev->xfrmdev_ops &&
+ real_dev->xfrmdev_ops->xdo_dev_state_free)
+ real_dev->xfrmdev_ops->xdo_dev_state_free(xs);
+}
+
/**
* bond_ipsec_offload_ok - can this packet use the xfrm hw offload
* @skb: current data packet
@@ -632,6 +641,7 @@ static bool bond_ipsec_offload_ok(struct sk_buff *skb, struct xfrm_state *xs)
static const struct xfrmdev_ops bond_xfrmdev_ops = {
.xdo_dev_state_add = bond_ipsec_add_sa,
.xdo_dev_state_delete = bond_ipsec_del_sa,
+ .xdo_dev_state_free = bond_ipsec_free_sa,
.xdo_dev_offload_ok = bond_ipsec_offload_ok,
};
#endif /* CONFIG_XFRM_OFFLOAD */
--
2.44.0
^ permalink raw reply related [flat|nested] 12+ messages in thread* Re: [PATCH net 1/4] bonding: implement xfrm state xdo_dev_state_free API
2024-07-29 12:44 ` [PATCH net 1/4] bonding: implement xfrm state xdo_dev_state_free API Tariq Toukan
@ 2024-07-31 3:27 ` Hangbin Liu
2024-07-31 6:52 ` Jianbo Liu
0 siblings, 1 reply; 12+ messages in thread
From: Hangbin Liu @ 2024-07-31 3:27 UTC (permalink / raw)
To: Tariq Toukan
Cc: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet,
Jay Vosburgh, Andy Gospodarek, netdev, Saeed Mahameed,
Gal Pressman, Leon Romanovsky, Jianbo Liu, Cosmin Ratiu
On Mon, Jul 29, 2024 at 03:44:02PM +0300, Tariq Toukan wrote:
> From: Jianbo Liu <jianbol@nvidia.com>
>
> Add this implementation for bonding, so hardware resources can be
> freed after xfrm state is deleted.
>
> Fixes: 9a5605505d9c ("bonding: Add struct bond_ipesc to manage SA")
> Signed-off-by: Jianbo Liu <jianbol@nvidia.com>
> Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com>
> Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
> ---
> drivers/net/bonding/bond_main.c | 10 ++++++++++
> 1 file changed, 10 insertions(+)
>
> diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
> index 1cd92c12e782..3b880ff2b82a 100644
> --- a/drivers/net/bonding/bond_main.c
> +++ b/drivers/net/bonding/bond_main.c
> @@ -588,6 +588,15 @@ static void bond_ipsec_del_sa_all(struct bonding *bond)
> rcu_read_unlock();
> }
>
> +static void bond_ipsec_free_sa(struct xfrm_state *xs)
> +{
> + struct net_device *real_dev = xs->xso.real_dev;
I think it's also good to check the bond/slave status like bond_ipsec_del_sa()
does, no?
> +
> + if (real_dev && real_dev->xfrmdev_ops &&
> + real_dev->xfrmdev_ops->xdo_dev_state_free)
> + real_dev->xfrmdev_ops->xdo_dev_state_free(xs);
> +}
And we should call this in bond_ipsec_del_sa_all() for each slave.
Thanks
Hangbin
> +
> /**
> * bond_ipsec_offload_ok - can this packet use the xfrm hw offload
> * @skb: current data packet
> @@ -632,6 +641,7 @@ static bool bond_ipsec_offload_ok(struct sk_buff *skb, struct xfrm_state *xs)
> static const struct xfrmdev_ops bond_xfrmdev_ops = {
> .xdo_dev_state_add = bond_ipsec_add_sa,
> .xdo_dev_state_delete = bond_ipsec_del_sa,
> + .xdo_dev_state_free = bond_ipsec_free_sa,
> .xdo_dev_offload_ok = bond_ipsec_offload_ok,
> };
> #endif /* CONFIG_XFRM_OFFLOAD */
> --
> 2.44.0
>
^ permalink raw reply [flat|nested] 12+ messages in thread* Re: [PATCH net 1/4] bonding: implement xfrm state xdo_dev_state_free API
2024-07-31 3:27 ` Hangbin Liu
@ 2024-07-31 6:52 ` Jianbo Liu
0 siblings, 0 replies; 12+ messages in thread
From: Jianbo Liu @ 2024-07-31 6:52 UTC (permalink / raw)
To: Tariq Toukan, liuhangbin@gmail.com
Cc: davem@davemloft.net, Leon Romanovsky, andy@greyhouse.net,
Gal Pressman, jv@jvosburgh.net, kuba@kernel.org,
pabeni@redhat.com, edumazet@google.com, Saeed Mahameed,
Cosmin Ratiu, netdev@vger.kernel.org
On Wed, 2024-07-31 at 11:27 +0800, Hangbin Liu wrote:
> On Mon, Jul 29, 2024 at 03:44:02PM +0300, Tariq Toukan wrote:
> > From: Jianbo Liu <jianbol@nvidia.com>
> >
> > Add this implementation for bonding, so hardware resources can be
> > freed after xfrm state is deleted.
> >
> > Fixes: 9a5605505d9c ("bonding: Add struct bond_ipesc to manage SA")
> > Signed-off-by: Jianbo Liu <jianbol@nvidia.com>
> > Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com>
> > Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
> > ---
> > drivers/net/bonding/bond_main.c | 10 ++++++++++
> > 1 file changed, 10 insertions(+)
> >
> > diff --git a/drivers/net/bonding/bond_main.c
> > b/drivers/net/bonding/bond_main.c
> > index 1cd92c12e782..3b880ff2b82a 100644
> > --- a/drivers/net/bonding/bond_main.c
> > +++ b/drivers/net/bonding/bond_main.c
> > @@ -588,6 +588,15 @@ static void bond_ipsec_del_sa_all(struct
> > bonding *bond)
> > rcu_read_unlock();
> > }
> >
> > +static void bond_ipsec_free_sa(struct xfrm_state *xs)
> > +{
> > + struct net_device *real_dev = xs->xso.real_dev;
>
> I think it's also good to check the bond/slave status like
> bond_ipsec_del_sa()
> does, no?
It seems no necessary, but I will try adding. Thanks!
>
> > +
> > + if (real_dev && real_dev->xfrmdev_ops &&
> > + real_dev->xfrmdev_ops->xdo_dev_state_free)
> > + real_dev->xfrmdev_ops->xdo_dev_state_free(xs);
> > +}
>
> And we should call this in bond_ipsec_del_sa_all() for each slave.
Yes, it's in the next patch.
>
> Thanks
> Hangbin
> > +
> > /**
> > * bond_ipsec_offload_ok - can this packet use the xfrm hw offload
> > * @skb: current data packet
> > @@ -632,6 +641,7 @@ static bool bond_ipsec_offload_ok(struct
> > sk_buff *skb, struct xfrm_state *xs)
> > static const struct xfrmdev_ops bond_xfrmdev_ops = {
> > .xdo_dev_state_add = bond_ipsec_add_sa,
> > .xdo_dev_state_delete = bond_ipsec_del_sa,
> > + .xdo_dev_state_free = bond_ipsec_free_sa,
> > .xdo_dev_offload_ok = bond_ipsec_offload_ok,
> > };
> > #endif /* CONFIG_XFRM_OFFLOAD */
> > --
> > 2.44.0
> >
^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH net 2/4] bonding: call xfrm state xdo_dev_state_free after deletion
2024-07-29 12:44 [PATCH net 0/4] Fixes for IPsec over bonding Tariq Toukan
2024-07-29 12:44 ` [PATCH net 1/4] bonding: implement xfrm state xdo_dev_state_free API Tariq Toukan
@ 2024-07-29 12:44 ` Tariq Toukan
2024-07-31 3:36 ` Hangbin Liu
2024-07-29 12:44 ` [PATCH net 3/4] bonding: extract the use of real_device into local variable Tariq Toukan
` (2 subsequent siblings)
4 siblings, 1 reply; 12+ messages in thread
From: Tariq Toukan @ 2024-07-29 12:44 UTC (permalink / raw)
To: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet,
Jay Vosburgh, Andy Gospodarek
Cc: netdev, Saeed Mahameed, Gal Pressman, Leon Romanovsky, Jianbo Liu,
Cosmin Ratiu, Tariq Toukan
From: Jianbo Liu <jianbol@nvidia.com>
Need to call xdo_dev_state_free API to avoid hardware resource leakage
when deleting all SAs from old active real interface.
Fixes: 9a5605505d9c ("bonding: Add struct bond_ipesc to manage SA")
Signed-off-by: Jianbo Liu <jianbol@nvidia.com>
Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
---
drivers/net/bonding/bond_main.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 3b880ff2b82a..551cebfa3261 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -581,6 +581,8 @@ static void bond_ipsec_del_sa_all(struct bonding *bond)
__func__);
} else {
slave->dev->xfrmdev_ops->xdo_dev_state_delete(ipsec->xs);
+ if (slave->dev->xfrmdev_ops->xdo_dev_state_free)
+ slave->dev->xfrmdev_ops->xdo_dev_state_free(ipsec->xs);
}
ipsec->xs->xso.real_dev = NULL;
}
--
2.44.0
^ permalink raw reply related [flat|nested] 12+ messages in thread* Re: [PATCH net 2/4] bonding: call xfrm state xdo_dev_state_free after deletion
2024-07-29 12:44 ` [PATCH net 2/4] bonding: call xfrm state xdo_dev_state_free after deletion Tariq Toukan
@ 2024-07-31 3:36 ` Hangbin Liu
2024-07-31 7:08 ` Jianbo Liu
0 siblings, 1 reply; 12+ messages in thread
From: Hangbin Liu @ 2024-07-31 3:36 UTC (permalink / raw)
To: Tariq Toukan
Cc: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet,
Jay Vosburgh, Andy Gospodarek, netdev, Saeed Mahameed,
Gal Pressman, Leon Romanovsky, Jianbo Liu, Cosmin Ratiu
On Mon, Jul 29, 2024 at 03:44:03PM +0300, Tariq Toukan wrote:
> From: Jianbo Liu <jianbol@nvidia.com>
>
> Need to call xdo_dev_state_free API to avoid hardware resource leakage
> when deleting all SAs from old active real interface.
>
> Fixes: 9a5605505d9c ("bonding: Add struct bond_ipesc to manage SA")
> Signed-off-by: Jianbo Liu <jianbol@nvidia.com>
> Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com>
> Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
> ---
> drivers/net/bonding/bond_main.c | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
> index 3b880ff2b82a..551cebfa3261 100644
> --- a/drivers/net/bonding/bond_main.c
> +++ b/drivers/net/bonding/bond_main.c
> @@ -581,6 +581,8 @@ static void bond_ipsec_del_sa_all(struct bonding *bond)
> __func__);
> } else {
> slave->dev->xfrmdev_ops->xdo_dev_state_delete(ipsec->xs);
> + if (slave->dev->xfrmdev_ops->xdo_dev_state_free)
> + slave->dev->xfrmdev_ops->xdo_dev_state_free(ipsec->xs);
OH, you do it here.
> }
> ipsec->xs->xso.real_dev = NULL;
I'm not sure if we should make xdo_dev_state_free() rely on
xdo_dev_state_delete(). In xfrm_state_find() the xfrm_dev_state_free()
is called whatever xfrm_dev_state_delete() is support or not. Although
usually the NIC driver will support the _delete() if the _free() supported.
BTW, For me this patch should merge with Patch 1/4
Thanks
Hangbin
^ permalink raw reply [flat|nested] 12+ messages in thread* Re: [PATCH net 2/4] bonding: call xfrm state xdo_dev_state_free after deletion
2024-07-31 3:36 ` Hangbin Liu
@ 2024-07-31 7:08 ` Jianbo Liu
0 siblings, 0 replies; 12+ messages in thread
From: Jianbo Liu @ 2024-07-31 7:08 UTC (permalink / raw)
To: Tariq Toukan, liuhangbin@gmail.com
Cc: davem@davemloft.net, Leon Romanovsky, andy@greyhouse.net,
Gal Pressman, jv@jvosburgh.net, kuba@kernel.org,
pabeni@redhat.com, edumazet@google.com, Saeed Mahameed,
Cosmin Ratiu, netdev@vger.kernel.org
On Wed, 2024-07-31 at 11:36 +0800, Hangbin Liu wrote:
> On Mon, Jul 29, 2024 at 03:44:03PM +0300, Tariq Toukan wrote:
> > From: Jianbo Liu <jianbol@nvidia.com>
> >
> > Need to call xdo_dev_state_free API to avoid hardware resource
> > leakage
> > when deleting all SAs from old active real interface.
> >
> > Fixes: 9a5605505d9c ("bonding: Add struct bond_ipesc to manage SA")
> > Signed-off-by: Jianbo Liu <jianbol@nvidia.com>
> > Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com>
> > Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
> > ---
> > drivers/net/bonding/bond_main.c | 2 ++
> > 1 file changed, 2 insertions(+)
> >
> > diff --git a/drivers/net/bonding/bond_main.c
> > b/drivers/net/bonding/bond_main.c
> > index 3b880ff2b82a..551cebfa3261 100644
> > --- a/drivers/net/bonding/bond_main.c
> > +++ b/drivers/net/bonding/bond_main.c
> > @@ -581,6 +581,8 @@ static void bond_ipsec_del_sa_all(struct
> > bonding *bond)
> > __func__);
> > } else {
> > slave->dev->xfrmdev_ops-
> > >xdo_dev_state_delete(ipsec->xs);
> > + if (slave->dev->xfrmdev_ops-
> > >xdo_dev_state_free)
> > + slave->dev->xfrmdev_ops-
> > >xdo_dev_state_free(ipsec->xs);
>
> OH, you do it here.
>
Yes :)
> > }
> > ipsec->xs->xso.real_dev = NULL;
>
> I'm not sure if we should make xdo_dev_state_free() rely on
> xdo_dev_state_delete(). In xfrm_state_find() the
> xfrm_dev_state_free()
> is called whatever xfrm_dev_state_delete() is support or not.
> Although
> usually the NIC driver will support the _delete() if the _free()
> supported.
>
I don't see they rely on each other, so I'd like to keep it as is.
> BTW, For me this patch should merge with Patch 1/4
Will do. Thanks!
>
> Thanks
> Hangbin
^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH net 3/4] bonding: extract the use of real_device into local variable
2024-07-29 12:44 [PATCH net 0/4] Fixes for IPsec over bonding Tariq Toukan
2024-07-29 12:44 ` [PATCH net 1/4] bonding: implement xfrm state xdo_dev_state_free API Tariq Toukan
2024-07-29 12:44 ` [PATCH net 2/4] bonding: call xfrm state xdo_dev_state_free after deletion Tariq Toukan
@ 2024-07-29 12:44 ` Tariq Toukan
2024-07-29 12:44 ` [PATCH net 4/4] bonding: change ipsec_lock from spin lock to mutex Tariq Toukan
2024-07-31 3:20 ` [PATCH net 0/4] Fixes for IPsec over bonding Hangbin Liu
4 siblings, 0 replies; 12+ messages in thread
From: Tariq Toukan @ 2024-07-29 12:44 UTC (permalink / raw)
To: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet,
Jay Vosburgh, Andy Gospodarek
Cc: netdev, Saeed Mahameed, Gal Pressman, Leon Romanovsky, Jianbo Liu,
Cosmin Ratiu, Tariq Toukan
From: Jianbo Liu <jianbol@nvidia.com>
Add a local variable for slave->dev, to prepare for the lock change in
the next patch. There is no functionality change.
Fixes: 9a5605505d9c ("bonding: Add struct bond_ipesc to manage SA")
Signed-off-by: Jianbo Liu <jianbol@nvidia.com>
Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
---
drivers/net/bonding/bond_main.c | 58 +++++++++++++++++++--------------
1 file changed, 33 insertions(+), 25 deletions(-)
diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 551cebfa3261..763d807be311 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -427,6 +427,7 @@ static int bond_ipsec_add_sa(struct xfrm_state *xs,
struct netlink_ext_ack *extack)
{
struct net_device *bond_dev = xs->xso.dev;
+ struct net_device *real_dev;
struct bond_ipsec *ipsec;
struct bonding *bond;
struct slave *slave;
@@ -443,9 +444,10 @@ static int bond_ipsec_add_sa(struct xfrm_state *xs,
return -ENODEV;
}
- if (!slave->dev->xfrmdev_ops ||
- !slave->dev->xfrmdev_ops->xdo_dev_state_add ||
- netif_is_bond_master(slave->dev)) {
+ real_dev = slave->dev;
+ if (!real_dev->xfrmdev_ops ||
+ !real_dev->xfrmdev_ops->xdo_dev_state_add ||
+ netif_is_bond_master(real_dev)) {
NL_SET_ERR_MSG_MOD(extack, "Slave does not support ipsec offload");
rcu_read_unlock();
return -EINVAL;
@@ -456,9 +458,9 @@ static int bond_ipsec_add_sa(struct xfrm_state *xs,
rcu_read_unlock();
return -ENOMEM;
}
- xs->xso.real_dev = slave->dev;
- err = slave->dev->xfrmdev_ops->xdo_dev_state_add(xs, extack);
+ xs->xso.real_dev = real_dev;
+ err = real_dev->xfrmdev_ops->xdo_dev_state_add(xs, extack);
if (!err) {
ipsec->xs = xs;
INIT_LIST_HEAD(&ipsec->list);
@@ -475,6 +477,7 @@ static int bond_ipsec_add_sa(struct xfrm_state *xs,
static void bond_ipsec_add_sa_all(struct bonding *bond)
{
struct net_device *bond_dev = bond->dev;
+ struct net_device *real_dev;
struct bond_ipsec *ipsec;
struct slave *slave;
@@ -483,12 +486,13 @@ static void bond_ipsec_add_sa_all(struct bonding *bond)
if (!slave)
goto out;
- if (!slave->dev->xfrmdev_ops ||
- !slave->dev->xfrmdev_ops->xdo_dev_state_add ||
- netif_is_bond_master(slave->dev)) {
+ real_dev = slave->dev;
+ if (!real_dev->xfrmdev_ops ||
+ !real_dev->xfrmdev_ops->xdo_dev_state_add ||
+ netif_is_bond_master(real_dev)) {
spin_lock_bh(&bond->ipsec_lock);
if (!list_empty(&bond->ipsec_list))
- slave_warn(bond_dev, slave->dev,
+ slave_warn(bond_dev, real_dev,
"%s: no slave xdo_dev_state_add\n",
__func__);
spin_unlock_bh(&bond->ipsec_lock);
@@ -497,9 +501,9 @@ static void bond_ipsec_add_sa_all(struct bonding *bond)
spin_lock_bh(&bond->ipsec_lock);
list_for_each_entry(ipsec, &bond->ipsec_list, list) {
- ipsec->xs->xso.real_dev = slave->dev;
- if (slave->dev->xfrmdev_ops->xdo_dev_state_add(ipsec->xs, NULL)) {
- slave_warn(bond_dev, slave->dev, "%s: failed to add SA\n", __func__);
+ ipsec->xs->xso.real_dev = real_dev;
+ if (real_dev->xfrmdev_ops->xdo_dev_state_add(ipsec->xs, NULL)) {
+ slave_warn(bond_dev, real_dev, "%s: failed to add SA\n", __func__);
ipsec->xs->xso.real_dev = NULL;
}
}
@@ -515,6 +519,7 @@ static void bond_ipsec_add_sa_all(struct bonding *bond)
static void bond_ipsec_del_sa(struct xfrm_state *xs)
{
struct net_device *bond_dev = xs->xso.dev;
+ struct net_device *real_dev;
struct bond_ipsec *ipsec;
struct bonding *bond;
struct slave *slave;
@@ -532,16 +537,17 @@ static void bond_ipsec_del_sa(struct xfrm_state *xs)
if (!xs->xso.real_dev)
goto out;
- WARN_ON(xs->xso.real_dev != slave->dev);
+ real_dev = slave->dev;
+ WARN_ON(xs->xso.real_dev != real_dev);
- if (!slave->dev->xfrmdev_ops ||
- !slave->dev->xfrmdev_ops->xdo_dev_state_delete ||
- netif_is_bond_master(slave->dev)) {
- slave_warn(bond_dev, slave->dev, "%s: no slave xdo_dev_state_delete\n", __func__);
+ if (!real_dev->xfrmdev_ops ||
+ !real_dev->xfrmdev_ops->xdo_dev_state_delete ||
+ netif_is_bond_master(real_dev)) {
+ slave_warn(bond_dev, real_dev, "%s: no slave xdo_dev_state_delete\n", __func__);
goto out;
}
- slave->dev->xfrmdev_ops->xdo_dev_state_delete(xs);
+ real_dev->xfrmdev_ops->xdo_dev_state_delete(xs);
out:
spin_lock_bh(&bond->ipsec_lock);
list_for_each_entry(ipsec, &bond->ipsec_list, list) {
@@ -558,6 +564,7 @@ static void bond_ipsec_del_sa(struct xfrm_state *xs)
static void bond_ipsec_del_sa_all(struct bonding *bond)
{
struct net_device *bond_dev = bond->dev;
+ struct net_device *real_dev;
struct bond_ipsec *ipsec;
struct slave *slave;
@@ -568,21 +575,22 @@ static void bond_ipsec_del_sa_all(struct bonding *bond)
return;
}
+ real_dev = slave->dev;
spin_lock_bh(&bond->ipsec_lock);
list_for_each_entry(ipsec, &bond->ipsec_list, list) {
if (!ipsec->xs->xso.real_dev)
continue;
- if (!slave->dev->xfrmdev_ops ||
- !slave->dev->xfrmdev_ops->xdo_dev_state_delete ||
- netif_is_bond_master(slave->dev)) {
- slave_warn(bond_dev, slave->dev,
+ if (!real_dev->xfrmdev_ops ||
+ !real_dev->xfrmdev_ops->xdo_dev_state_delete ||
+ netif_is_bond_master(real_dev)) {
+ slave_warn(bond_dev, real_dev,
"%s: no slave xdo_dev_state_delete\n",
__func__);
} else {
- slave->dev->xfrmdev_ops->xdo_dev_state_delete(ipsec->xs);
- if (slave->dev->xfrmdev_ops->xdo_dev_state_free)
- slave->dev->xfrmdev_ops->xdo_dev_state_free(ipsec->xs);
+ real_dev->xfrmdev_ops->xdo_dev_state_delete(ipsec->xs);
+ if (real_dev->xfrmdev_ops->xdo_dev_state_free)
+ real_dev->xfrmdev_ops->xdo_dev_state_free(ipsec->xs);
}
ipsec->xs->xso.real_dev = NULL;
}
--
2.44.0
^ permalink raw reply related [flat|nested] 12+ messages in thread* [PATCH net 4/4] bonding: change ipsec_lock from spin lock to mutex
2024-07-29 12:44 [PATCH net 0/4] Fixes for IPsec over bonding Tariq Toukan
` (2 preceding siblings ...)
2024-07-29 12:44 ` [PATCH net 3/4] bonding: extract the use of real_device into local variable Tariq Toukan
@ 2024-07-29 12:44 ` Tariq Toukan
2024-07-30 11:28 ` Paolo Abeni
2024-07-31 3:20 ` [PATCH net 0/4] Fixes for IPsec over bonding Hangbin Liu
4 siblings, 1 reply; 12+ messages in thread
From: Tariq Toukan @ 2024-07-29 12:44 UTC (permalink / raw)
To: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet,
Jay Vosburgh, Andy Gospodarek
Cc: netdev, Saeed Mahameed, Gal Pressman, Leon Romanovsky, Jianbo Liu,
Cosmin Ratiu, Tariq Toukan
From: Jianbo Liu <jianbol@nvidia.com>
In the cited commit, bond->ipsec_lock is added to protect ipsec_list,
hence xdo_dev_state_add and xdo_dev_state_delete are called inside
this lock. As ipsec_lock is spin lock and such xfrmdev ops may sleep,
"scheduling while atomic" will be triggered when changing bond's
active slave.
[ 101.055189] BUG: scheduling while atomic: bash/902/0x00000200
[ 101.055726] Modules linked in:
[ 101.058211] CPU: 3 PID: 902 Comm: bash Not tainted 6.9.0-rc4+ #1
[ 101.058760] Hardware name:
[ 101.059434] Call Trace:
[ 101.059436] <TASK>
[ 101.060873] dump_stack_lvl+0x51/0x60
[ 101.061275] __schedule_bug+0x4e/0x60
[ 101.061682] __schedule+0x612/0x7c0
[ 101.062078] ? __mod_timer+0x25c/0x370
[ 101.062486] schedule+0x25/0xd0
[ 101.062845] schedule_timeout+0x77/0xf0
[ 101.063265] ? asm_common_interrupt+0x22/0x40
[ 101.063724] ? __bpf_trace_itimer_state+0x10/0x10
[ 101.064215] __wait_for_common+0x87/0x190
[ 101.064648] ? usleep_range_state+0x90/0x90
[ 101.065091] cmd_exec+0x437/0xb20 [mlx5_core]
[ 101.065569] mlx5_cmd_do+0x1e/0x40 [mlx5_core]
[ 101.066051] mlx5_cmd_exec+0x18/0x30 [mlx5_core]
[ 101.066552] mlx5_crypto_create_dek_key+0xea/0x120 [mlx5_core]
[ 101.067163] ? bonding_sysfs_store_option+0x4d/0x80 [bonding]
[ 101.067738] ? kmalloc_trace+0x4d/0x350
[ 101.068156] mlx5_ipsec_create_sa_ctx+0x33/0x100 [mlx5_core]
[ 101.068747] mlx5e_xfrm_add_state+0x47b/0xaa0 [mlx5_core]
[ 101.069312] bond_change_active_slave+0x392/0x900 [bonding]
[ 101.069868] bond_option_active_slave_set+0x1c2/0x240 [bonding]
[ 101.070454] __bond_opt_set+0xa6/0x430 [bonding]
[ 101.070935] __bond_opt_set_notify+0x2f/0x90 [bonding]
[ 101.071453] bond_opt_tryset_rtnl+0x72/0xb0 [bonding]
[ 101.071965] bonding_sysfs_store_option+0x4d/0x80 [bonding]
[ 101.072567] kernfs_fop_write_iter+0x10c/0x1a0
[ 101.073033] vfs_write+0x2d8/0x400
[ 101.073416] ? alloc_fd+0x48/0x180
[ 101.073798] ksys_write+0x5f/0xe0
[ 101.074175] do_syscall_64+0x52/0x110
[ 101.074576] entry_SYSCALL_64_after_hwframe+0x4b/0x53
As bond_ipsec_add_sa_all and bond_ipsec_del_sa_all are only called
from bond_change_active_slave, which requires holding the RTNL lock.
And bond_ipsec_add_sa and bond_ipsec_del_sa are xfrm state
xdo_dev_state_add and xdo_dev_state_delete APIs, which are in user
context. So ipsec_lock doesn't have to be spin lock, change it to
mutex, and thus the above issue can be resolved.
Fixes: 9a5605505d9c ("bonding: Add struct bond_ipesc to manage SA")
Signed-off-by: Jianbo Liu <jianbol@nvidia.com>
Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
---
drivers/net/bonding/bond_main.c | 73 ++++++++++++++++++---------------
include/net/bonding.h | 2 +-
2 files changed, 41 insertions(+), 34 deletions(-)
diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 763d807be311..bced29813478 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -439,38 +439,33 @@ static int bond_ipsec_add_sa(struct xfrm_state *xs,
rcu_read_lock();
bond = netdev_priv(bond_dev);
slave = rcu_dereference(bond->curr_active_slave);
- if (!slave) {
- rcu_read_unlock();
+ real_dev = slave ? slave->dev : NULL;
+ rcu_read_unlock();
+ if (!real_dev)
return -ENODEV;
- }
- real_dev = slave->dev;
if (!real_dev->xfrmdev_ops ||
!real_dev->xfrmdev_ops->xdo_dev_state_add ||
netif_is_bond_master(real_dev)) {
NL_SET_ERR_MSG_MOD(extack, "Slave does not support ipsec offload");
- rcu_read_unlock();
return -EINVAL;
}
ipsec = kmalloc(sizeof(*ipsec), GFP_ATOMIC);
- if (!ipsec) {
- rcu_read_unlock();
+ if (!ipsec)
return -ENOMEM;
- }
xs->xso.real_dev = real_dev;
err = real_dev->xfrmdev_ops->xdo_dev_state_add(xs, extack);
if (!err) {
ipsec->xs = xs;
INIT_LIST_HEAD(&ipsec->list);
- spin_lock_bh(&bond->ipsec_lock);
+ mutex_lock(&bond->ipsec_lock);
list_add(&ipsec->list, &bond->ipsec_list);
- spin_unlock_bh(&bond->ipsec_lock);
+ mutex_unlock(&bond->ipsec_lock);
} else {
kfree(ipsec);
}
- rcu_read_unlock();
return err;
}
@@ -482,34 +477,44 @@ static void bond_ipsec_add_sa_all(struct bonding *bond)
struct slave *slave;
rcu_read_lock();
- slave = rcu_dereference(bond->curr_active_slave);
- if (!slave)
- goto out;
+ slave = rtnl_dereference(bond->curr_active_slave);
+ real_dev = slave ? slave->dev : NULL;
+ rcu_read_unlock();
+ if (!real_dev)
+ return;
- real_dev = slave->dev;
+ mutex_lock(&bond->ipsec_lock);
if (!real_dev->xfrmdev_ops ||
!real_dev->xfrmdev_ops->xdo_dev_state_add ||
netif_is_bond_master(real_dev)) {
- spin_lock_bh(&bond->ipsec_lock);
if (!list_empty(&bond->ipsec_list))
slave_warn(bond_dev, real_dev,
"%s: no slave xdo_dev_state_add\n",
__func__);
- spin_unlock_bh(&bond->ipsec_lock);
goto out;
}
- spin_lock_bh(&bond->ipsec_lock);
list_for_each_entry(ipsec, &bond->ipsec_list, list) {
+ struct net_device *dev = ipsec->xs->xso.real_dev;
+
+ /* If new state is added before ipsec_lock acquired */
+ if (dev) {
+ if (dev == real_dev)
+ continue;
+
+ dev->xfrmdev_ops->xdo_dev_state_delete(ipsec->xs);
+ if (dev->xfrmdev_ops->xdo_dev_state_free)
+ dev->xfrmdev_ops->xdo_dev_state_free(ipsec->xs);
+ }
+
ipsec->xs->xso.real_dev = real_dev;
if (real_dev->xfrmdev_ops->xdo_dev_state_add(ipsec->xs, NULL)) {
slave_warn(bond_dev, real_dev, "%s: failed to add SA\n", __func__);
ipsec->xs->xso.real_dev = NULL;
}
}
- spin_unlock_bh(&bond->ipsec_lock);
out:
- rcu_read_unlock();
+ mutex_unlock(&bond->ipsec_lock);
}
/**
@@ -530,6 +535,8 @@ static void bond_ipsec_del_sa(struct xfrm_state *xs)
rcu_read_lock();
bond = netdev_priv(bond_dev);
slave = rcu_dereference(bond->curr_active_slave);
+ real_dev = slave ? slave->dev : NULL;
+ rcu_read_unlock();
if (!slave)
goto out;
@@ -537,7 +544,6 @@ static void bond_ipsec_del_sa(struct xfrm_state *xs)
if (!xs->xso.real_dev)
goto out;
- real_dev = slave->dev;
WARN_ON(xs->xso.real_dev != real_dev);
if (!real_dev->xfrmdev_ops ||
@@ -549,7 +555,7 @@ static void bond_ipsec_del_sa(struct xfrm_state *xs)
real_dev->xfrmdev_ops->xdo_dev_state_delete(xs);
out:
- spin_lock_bh(&bond->ipsec_lock);
+ mutex_lock(&bond->ipsec_lock);
list_for_each_entry(ipsec, &bond->ipsec_list, list) {
if (ipsec->xs == xs) {
list_del(&ipsec->list);
@@ -557,8 +563,7 @@ static void bond_ipsec_del_sa(struct xfrm_state *xs)
break;
}
}
- spin_unlock_bh(&bond->ipsec_lock);
- rcu_read_unlock();
+ mutex_unlock(&bond->ipsec_lock);
}
static void bond_ipsec_del_sa_all(struct bonding *bond)
@@ -569,14 +574,13 @@ static void bond_ipsec_del_sa_all(struct bonding *bond)
struct slave *slave;
rcu_read_lock();
- slave = rcu_dereference(bond->curr_active_slave);
- if (!slave) {
- rcu_read_unlock();
+ slave = rtnl_dereference(bond->curr_active_slave);
+ real_dev = slave ? slave->dev : NULL;
+ rcu_read_unlock();
+ if (!real_dev)
return;
- }
- real_dev = slave->dev;
- spin_lock_bh(&bond->ipsec_lock);
+ mutex_lock(&bond->ipsec_lock);
list_for_each_entry(ipsec, &bond->ipsec_list, list) {
if (!ipsec->xs->xso.real_dev)
continue;
@@ -594,8 +598,7 @@ static void bond_ipsec_del_sa_all(struct bonding *bond)
}
ipsec->xs->xso.real_dev = NULL;
}
- spin_unlock_bh(&bond->ipsec_lock);
- rcu_read_unlock();
+ mutex_unlock(&bond->ipsec_lock);
}
static void bond_ipsec_free_sa(struct xfrm_state *xs)
@@ -5902,7 +5905,7 @@ void bond_setup(struct net_device *bond_dev)
/* set up xfrm device ops (only supported in active-backup right now) */
bond_dev->xfrmdev_ops = &bond_xfrmdev_ops;
INIT_LIST_HEAD(&bond->ipsec_list);
- spin_lock_init(&bond->ipsec_lock);
+ mutex_init(&bond->ipsec_lock);
#endif /* CONFIG_XFRM_OFFLOAD */
/* don't acquire bond device's netif_tx_lock when transmitting */
@@ -5951,6 +5954,10 @@ static void bond_uninit(struct net_device *bond_dev)
__bond_release_one(bond_dev, slave->dev, true, true);
netdev_info(bond_dev, "Released all slaves\n");
+#ifdef CONFIG_XFRM_OFFLOAD
+ mutex_destroy(&bond->ipsec_lock);
+#endif /* CONFIG_XFRM_OFFLOAD */
+
bond_set_slave_arr(bond, NULL, NULL);
list_del_rcu(&bond->bond_list);
diff --git a/include/net/bonding.h b/include/net/bonding.h
index b61fb1aa3a56..8bb5f016969f 100644
--- a/include/net/bonding.h
+++ b/include/net/bonding.h
@@ -260,7 +260,7 @@ struct bonding {
#ifdef CONFIG_XFRM_OFFLOAD
struct list_head ipsec_list;
/* protecting ipsec_list */
- spinlock_t ipsec_lock;
+ struct mutex ipsec_lock;
#endif /* CONFIG_XFRM_OFFLOAD */
struct bpf_prog *xdp_prog;
};
--
2.44.0
^ permalink raw reply related [flat|nested] 12+ messages in thread* Re: [PATCH net 4/4] bonding: change ipsec_lock from spin lock to mutex
2024-07-29 12:44 ` [PATCH net 4/4] bonding: change ipsec_lock from spin lock to mutex Tariq Toukan
@ 2024-07-30 11:28 ` Paolo Abeni
2024-07-31 6:29 ` Jianbo Liu
0 siblings, 1 reply; 12+ messages in thread
From: Paolo Abeni @ 2024-07-30 11:28 UTC (permalink / raw)
To: Tariq Toukan, David S. Miller, Jakub Kicinski, Eric Dumazet,
Jay Vosburgh, Andy Gospodarek
Cc: netdev, Saeed Mahameed, Gal Pressman, Leon Romanovsky, Jianbo Liu,
Cosmin Ratiu
On 7/29/24 14:44, Tariq Toukan wrote:
> From: Jianbo Liu <jianbol@nvidia.com>
>
> In the cited commit, bond->ipsec_lock is added to protect ipsec_list,
> hence xdo_dev_state_add and xdo_dev_state_delete are called inside
> this lock. As ipsec_lock is spin lock and such xfrmdev ops may sleep,
minor nit: missing 'a' here ^^
> diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
> index 763d807be311..bced29813478 100644
> --- a/drivers/net/bonding/bond_main.c
> +++ b/drivers/net/bonding/bond_main.c
> @@ -439,38 +439,33 @@ static int bond_ipsec_add_sa(struct xfrm_state *xs,
> rcu_read_lock();
> bond = netdev_priv(bond_dev);
> slave = rcu_dereference(bond->curr_active_slave);
Is even this caller always under RTNL lock? if so it would be better
replace rcu_dereference() with rtnl_dereference() and drop the rcu
lock, so it's clear that real_dev can't go away here.
Similar question for bond_ipsec_delete_sa, below.
> - if (!slave) {
> - rcu_read_unlock();
> + real_dev = slave ? slave->dev : NULL;
> + rcu_read_unlock();
> + if (!real_dev)
> return -ENODEV;
> - }
>
> - real_dev = slave->dev;
> if (!real_dev->xfrmdev_ops ||
> !real_dev->xfrmdev_ops->xdo_dev_state_add ||
> netif_is_bond_master(real_dev)) {
> NL_SET_ERR_MSG_MOD(extack, "Slave does not support ipsec offload");
> - rcu_read_unlock();
> return -EINVAL;
> }
>
> ipsec = kmalloc(sizeof(*ipsec), GFP_ATOMIC);
I guess at this point you can use GFP_KERNEL here.
[...]
> @@ -482,34 +477,44 @@ static void bond_ipsec_add_sa_all(struct bonding *bond)
> struct slave *slave;
>
> rcu_read_lock();
> - slave = rcu_dereference(bond->curr_active_slave);
> - if (!slave)
> - goto out;
> + slave = rtnl_dereference(bond->curr_active_slave);
> + real_dev = slave ? slave->dev : NULL;
> + rcu_read_unlock();
You can drop the rcu read lock/unlock here.
[...]
> @@ -569,14 +574,13 @@ static void bond_ipsec_del_sa_all(struct bonding *bond)
> struct slave *slave;
>
> rcu_read_lock();
> - slave = rcu_dereference(bond->curr_active_slave);
> - if (!slave) {
> - rcu_read_unlock();
> + slave = rtnl_dereference(bond->curr_active_slave);
> + real_dev = slave ? slave->dev : NULL;
> + rcu_read_unlock();
Same here.
Thanks,
Paolo
^ permalink raw reply [flat|nested] 12+ messages in thread* Re: [PATCH net 4/4] bonding: change ipsec_lock from spin lock to mutex
2024-07-30 11:28 ` Paolo Abeni
@ 2024-07-31 6:29 ` Jianbo Liu
0 siblings, 0 replies; 12+ messages in thread
From: Jianbo Liu @ 2024-07-31 6:29 UTC (permalink / raw)
To: Tariq Toukan, jv@jvosburgh.net, andy@greyhouse.net,
edumazet@google.com, davem@davemloft.net, kuba@kernel.org,
pabeni@redhat.com
Cc: Leon Romanovsky, Gal Pressman, Saeed Mahameed,
netdev@vger.kernel.org, Cosmin Ratiu
On Tue, 2024-07-30 at 13:28 +0200, Paolo Abeni wrote:
>
>
> On 7/29/24 14:44, Tariq Toukan wrote:
> > From: Jianbo Liu <jianbol@nvidia.com>
> >
> > In the cited commit, bond->ipsec_lock is added to protect
> > ipsec_list,
> > hence xdo_dev_state_add and xdo_dev_state_delete are called inside
> > this lock. As ipsec_lock is spin lock and such xfrmdev ops may
> > sleep,
>
> minor nit: missing 'a' here ^^
OK, thanks.
>
> > diff --git a/drivers/net/bonding/bond_main.c
> > b/drivers/net/bonding/bond_main.c
> > index 763d807be311..bced29813478 100644
> > --- a/drivers/net/bonding/bond_main.c
> > +++ b/drivers/net/bonding/bond_main.c
> > @@ -439,38 +439,33 @@ static int bond_ipsec_add_sa(struct
> > xfrm_state *xs,
> > rcu_read_lock();
> > bond = netdev_priv(bond_dev);
> > slave = rcu_dereference(bond->curr_active_slave);
>
> Is even this caller always under RTNL lock? if so it would be better
> replace rcu_dereference() with rtnl_dereference() and drop the rcu
> lock, so it's clear that real_dev can't go away here.
>
> Similar question for bond_ipsec_delete_sa, below.
>
No, I don't think they are called under RTNL lock.
> > - if (!slave) {
> > - rcu_read_unlock();
> > + real_dev = slave ? slave->dev : NULL;
> > + rcu_read_unlock();
> > + if (!real_dev)
> > return -ENODEV;
> > - }
> >
> > - real_dev = slave->dev;
> > if (!real_dev->xfrmdev_ops ||
> > !real_dev->xfrmdev_ops->xdo_dev_state_add ||
> > netif_is_bond_master(real_dev)) {
> > NL_SET_ERR_MSG_MOD(extack, "Slave does not support
> > ipsec offload");
> > - rcu_read_unlock();
> > return -EINVAL;
> > }
> >
> > ipsec = kmalloc(sizeof(*ipsec), GFP_ATOMIC);
>
> I guess at this point you can use GFP_KERNEL here.
>
Good point, thanks.
> [...]
> > @@ -482,34 +477,44 @@ static void bond_ipsec_add_sa_all(struct
> > bonding *bond)
> > struct slave *slave;
> >
> > rcu_read_lock();
> > - slave = rcu_dereference(bond->curr_active_slave);
> > - if (!slave)
> > - goto out;
> > + slave = rtnl_dereference(bond->curr_active_slave);
> > + real_dev = slave ? slave->dev : NULL;
> > + rcu_read_unlock();
>
> You can drop the rcu read lock/unlock here.
Yes, I will drop rcu read lock/unlock for these 4 functions.
>
> [...]
> > @@ -569,14 +574,13 @@ static void bond_ipsec_del_sa_all(struct
> > bonding *bond)
> > struct slave *slave;
> >
> > rcu_read_lock();
> > - slave = rcu_dereference(bond->curr_active_slave);
> > - if (!slave) {
> > - rcu_read_unlock();
> > + slave = rtnl_dereference(bond->curr_active_slave);
> > + real_dev = slave ? slave->dev : NULL;
> > + rcu_read_unlock();
>
> Same here.
>
> Thanks,
>
> Paolo
>
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH net 0/4] Fixes for IPsec over bonding
2024-07-29 12:44 [PATCH net 0/4] Fixes for IPsec over bonding Tariq Toukan
` (3 preceding siblings ...)
2024-07-29 12:44 ` [PATCH net 4/4] bonding: change ipsec_lock from spin lock to mutex Tariq Toukan
@ 2024-07-31 3:20 ` Hangbin Liu
4 siblings, 0 replies; 12+ messages in thread
From: Hangbin Liu @ 2024-07-31 3:20 UTC (permalink / raw)
To: Tariq Toukan
Cc: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet,
Jay Vosburgh, Andy Gospodarek, netdev, Saeed Mahameed,
Gal Pressman, Leon Romanovsky
On Mon, Jul 29, 2024 at 03:44:01PM +0300, Tariq Toukan wrote:
> Hi,
>
> This patchset by Jianbo provides bug fixes for IPsec over bonding
> driver.
>
> It adds the missing xdo_dev_state_free API, and fixes "scheduling while
> atomic" by using mutex lock instead.
Good, I'm also working on the bonding IPsec offload issues recently. This save
some of my works :)
Hangbin
^ permalink raw reply [flat|nested] 12+ messages in thread