From: Hangbin Liu <liuhangbin@gmail.com>
To: Nikolay Aleksandrov <razor@blackwall.org>
Cc: netdev@vger.kernel.org, Jay Vosburgh <jv@jvosburgh.net>,
	Andrew Lunn <andrew+netdev@lunn.ch>,
	"David S. Miller" <davem@davemloft.net>,
	Eric Dumazet <edumazet@google.com>,
	Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
	Simon Horman <horms@kernel.org>, Shuah Khan <shuah@kernel.org>,
	Tariq Toukan <tariqt@nvidia.com>, Jianbo Liu <jianbol@nvidia.com>,
	Jarod Wilson <jarod@redhat.com>,
	Steffen Klassert <steffen.klassert@secunet.com>,
	Cosmin Ratiu <cratiu@nvidia.com>,
	linux-kselftest@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCHv3 net 1/3] bonding: move IPsec deletion to bond_ipsec_free_sa
Date: Fri, 28 Feb 2025 02:20:26 +0000
Message-ID: <Z8EdatcTr9weRfHr@fedora>
In-Reply-To: <f88b234a-37ec-46a4-b920-35f598ab6c38@blackwall.org>

On Thu, Feb 27, 2025 at 03:31:01PM +0200, Nikolay Aleksandrov wrote:
> >> One more thing - note I'm not an xfrm expert by far but it seems to me here you have
> >> to also call xdo_dev_state_free() with the old active slave dev otherwise that will
> >> never get called with the original real_dev after the switch to a new
> >> active slave (or more accurately it might if the GC runs between the switching
> >> but it is a race), care must be taken wrt sequence of events because the XFRM
> > 
> > Can we just call xs->xso.real_dev->xfrmdev_ops->xdo_dev_state_free(xs)
> > no matter whether xs->xso.real_dev == real_dev or not? I'm afraid calling
> > xdo_dev_state_free() everywhere may make it easier for us to get lost.
> > 
> 
> You'd have to check all drivers that implement the callback to answer that and even then
> I'd stick to the canonical way of how it's done in xfrm and make the bond just passthrough.
> Any other games become dangerous and new code will have to be carefully reviewed every
> time, calling another device's free_sa when it wasn't added before doesn't sound good.
> 
> >> GC may be running in parallel which probably means that in bond_ipsec_free_sa()
> >> you'll have to take the mutex before calling xdo_dev_state_free() and check
> >> if the entry is still linked in the bond's ipsec list before calling the free_sa
> >> callback, if it isn't then del_sa_all got to it before the GC and there's nothing
> >> to do if it also called the dev's free_sa callback. The check for real_dev doesn't
> >> seem enough to protect against this race.
> > 
> > I agree that we need to take the mutex before calling xdo_dev_state_free()
> > in bond_ipsec_free_sa(). Do you think this is enough? I'm a bit lost here.
> > 
> > Thanks
> > Hangbin
> 
> Well, the race is between the xfrm GC and del_sa_all, in bond's free_sa if you
> walk the list under the mutex before calling real_dev's free callback and
> don't find the current element that's being freed in free_sa then it was
> cleaned up by del_sa_all, otherwise del_sa_all is waiting to walk that
> list and clean the entries. I think it should be fine as long as free_sa
> was called once with the proper device.

OK, so the free will be called either in del_sa_all() or free_sa().
Something like this?

 static void bond_ipsec_del_sa_all(struct bonding *bond)
@@ -620,6 +614,16 @@ static void bond_ipsec_del_sa_all(struct bonding *bond)
 		if (!ipsec->xs->xso.real_dev)
 			continue;
 
+		if (ipsec->xs->km.state == XFRM_STATE_DEAD) {
+			/* already dead no need to delete again */
+			if (real_dev->xfrmdev_ops->xdo_dev_state_free)
+				real_dev->xfrmdev_ops->xdo_dev_state_free(ipsec->xs);
+			list_del(&ipsec->list);
+			kfree(ipsec);
+			continue;
+		}
+
 		if (!real_dev->xfrmdev_ops ||
 		    !real_dev->xfrmdev_ops->xdo_dev_state_delete ||
 		    netif_is_bond_master(real_dev)) {
 
@@ -659,11 +664,22 @@ static void bond_ipsec_free_sa(struct xfrm_state *xs)
 	if (!xs->xso.real_dev)
 		goto out;
 
-	WARN_ON(xs->xso.real_dev != real_dev);
+	mutex_lock(&bond->ipsec_lock);
+	list_for_each_entry(ipsec, &bond->ipsec_list, list) {
+		if (ipsec->xs == xs) {
+			if (real_dev && xs->xso.real_dev == real_dev &&

                           ^^ It looks like we don't need this
			   xs->xso.real_dev == real_dev check if there is no
			   race, do we? Or should we just keep the WARN_ON()
			   in case a race does happen?

+			    real_dev->xfrmdev_ops &&
+			    real_dev->xfrmdev_ops->xdo_dev_state_free)
+				real_dev->xfrmdev_ops->xdo_dev_state_free(xs);
+			list_del(&ipsec->list);
+			kfree(ipsec);
+			break;
+		}
+	}
+	mutex_unlock(&bond->ipsec_lock);
 
-	if (real_dev && real_dev->xfrmdev_ops &&
-	    real_dev->xfrmdev_ops->xdo_dev_state_free)
-		real_dev->xfrmdev_ops->xdo_dev_state_free(xs);
 out:
 	netdev_put(real_dev, &tracker);
 }
-- 
2.39.5 (Apple Git-154)
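
To make the "free is called exactly once" rule easier to reason about, here is
a minimal userspace sketch of the same idea. It is only a model under stated
assumptions, not kernel code: struct model_bond, model_free_sa() and
model_del_sa_all() are hypothetical names, a pthread mutex stands in for
bond->ipsec_lock, a counter stands in for the device's xdo_dev_state_free()
callback, and every entry is treated as already dead. Whichever path finds the
entry still linked unlinks it and does the free; the other path finds nothing.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct sa_entry {
	struct sa_entry *next;
	int sa_id;              /* stands in for the xfrm_state pointer */
};

struct model_bond {
	pthread_mutex_t lock;   /* plays the role of bond->ipsec_lock */
	struct sa_entry *head;  /* plays the role of bond->ipsec_list */
	int dev_free_calls;     /* counts simulated device free callbacks */
};

static void model_bond_init(struct model_bond *b)
{
	int rc = pthread_mutex_init(&b->lock, NULL);

	assert(rc == 0);
	(void)rc;
	b->head = NULL;
	b->dev_free_calls = 0;
}

/* Link a new entry at the head of the list; returns false on OOM. */
static bool model_add_sa(struct model_bond *b, int sa_id)
{
	struct sa_entry *e = malloc(sizeof(*e));

	if (!e)
		return false;
	e->sa_id = sa_id;
	pthread_mutex_lock(&b->lock);
	e->next = b->head;
	b->head = e;
	pthread_mutex_unlock(&b->lock);
	return true;
}

/*
 * free_sa path: walk the list under the lock. Only if the entry is still
 * linked do we unlink it and run the "device free". Returns true if this
 * call did the free, false if someone else already cleaned it up.
 */
static bool model_free_sa(struct model_bond *b, int sa_id)
{
	struct sa_entry **pp, *e;
	bool freed = false;

	pthread_mutex_lock(&b->lock);
	for (pp = &b->head; (e = *pp) != NULL; pp = &e->next) {
		if (e->sa_id == sa_id) {
			*pp = e->next;          /* unlink */
			b->dev_free_calls++;    /* simulated device free */
			free(e);
			freed = true;
			break;
		}
	}
	pthread_mutex_unlock(&b->lock);
	return freed;
}

/*
 * del_sa_all path: drain every remaining entry under the same lock.
 * Returns how many entries this call freed.
 */
static int model_del_sa_all(struct model_bond *b)
{
	int n = 0;

	pthread_mutex_lock(&b->lock);
	while (b->head) {
		struct sa_entry *e = b->head;

		b->head = e->next;
		b->dev_free_calls++;
		free(e);
		n++;
	}
	pthread_mutex_unlock(&b->lock);
	return n;
}
```

In this model, calling model_free_sa() twice for the same id frees once and
then returns false, and a model_free_sa() that runs after model_del_sa_all()
also returns false, so dev_free_calls always equals the number of entries
added. The real patch additionally has to pick the right real_dev for the
callback, which this sketch does not attempt to model.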


Thanks
Hangbin
