netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Hangbin Liu <liuhangbin@gmail.com>
To: Cosmin Ratiu <cratiu@nvidia.com>
Cc: "shuah@kernel.org" <shuah@kernel.org>,
	"andrew+netdev@lunn.ch" <andrew+netdev@lunn.ch>,
	"davem@davemloft.net" <davem@davemloft.net>,
	"jv@jvosburgh.net" <jv@jvosburgh.net>,
	"herbert@gondor.apana.org.au" <herbert@gondor.apana.org.au>,
	"andy@greyhouse.net" <andy@greyhouse.net>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"pabeni@redhat.com" <pabeni@redhat.com>,
	"edumazet@google.com" <edumazet@google.com>,
	"sd@queasysnail.net" <sd@queasysnail.net>,
	Jianbo Liu <jianbol@nvidia.com>,
	"horms@kernel.org" <horms@kernel.org>,
	"kuba@kernel.org" <kuba@kernel.org>,
	Tariq Toukan <tariqt@nvidia.com>,
	"razor@blackwall.org" <razor@blackwall.org>,
	"netdev@vger.kernel.org" <netdev@vger.kernel.org>,
	"steffen.klassert@secunet.com" <steffen.klassert@secunet.com>,
	"linux-kselftest@vger.kernel.org"
	<linux-kselftest@vger.kernel.org>
Subject: Re: [PATCH net 0/2] bond: fix xfrm offload feature during init
Date: Thu, 20 Feb 2025 11:18:15 +0000	[thread overview]
Message-ID: <Z7cPdyDZ9OOIgU7c@fedora> (raw)
In-Reply-To: <40e2170b52f1f80fd72405df6901c4323f903e67.camel@nvidia.com>

On Thu, Feb 20, 2025 at 10:48:43AM +0000, Cosmin Ratiu wrote:
> On Mon, 2025-01-20 at 23:59 +0000, Hangbin Liu wrote:
> > > 
> > > I am not sure this can be fixed in bonding given that the
> > > xdo_dev_state_delete op could, in the general case, sleep while
> > > talking
> > > to the hardware. I don't think it's reasonable to expect devices to
> > > offload xfrm while the kernel holds a spinlock.
> > > Bonding just exposed this assumption mismatch because of the mutex
> > > that
> > > was added to replace a spinlock which exhibited the same problem we
> > > are
> > > talking about here.
> > > 
> > > Do the dev offload operations need to be synchronous? Couldn't
> > > __xfrm_state_delete instead schedule a wq to do the dev offload? I
> > > saw
> > > there's already an xfrm_state_gc_task that's invoked to call
> > > xfrm_dev_state_free, perhaps that could be used to do the delete as
> > > well?
> > 
> > Yes, I have tried to move the bonding gc work in bond_ipsec_del_sa()
> > to a wq
> > in https://lore.kernel.org/netdev/Z33nEKg4PxwReUu_@fedora/. i.e. move
> > the
> > following part out of spin lock via wq.
> > 
> >         mutex_lock(&bond->ipsec_lock);
> >         list_for_each_entry(ipsec, &bond->ipsec_list, list) {
> >                 if (ipsec->xs == xs) {
> >                         list_del(&ipsec->list);
> >                         kfree(ipsec);
> >                         break;
> >                 }
> >         }
> >         mutex_unlock(&bond->ipsec_lock);
> > 
> > But we can see there is an (ipsec->xs == xs). So we still need to
> > make
> > sure the xs is not released. Can we add a xs reference in
> > bond_ipsec_del_sa()
> > to achieve this?
> 
> Hello,
> 
> After staring at the issue a while longer, I am also converging on just
> moving that mutex part from bond_ipsec_del_sa out to a wq. I browsed
> through all driver implementations of .xdo_dev_state_delete() and found
> none that sleeps or allocates memory with GFP_KERNEL. So if we only fix
> bond_ipsec_del_sa, that would be enough to make it all work again.
> 
> So it should be perfectly safe to add a ref to xs in bond_ipsec_del_sa
> before firing up a wq to do the mutex lock + list traversal, before
> releasing the ref.
> xfrm_state is already unlinked from everything by __xfrm_state_delete
> before xfrm_dev_state_delete is called and the xfrm_state_alloc
> reference is dropped by the end of xfrm_dev_state_delete, so the only
> thing keeping it alive would be the reference added in
> bond_ipsec_del_sa. When that is put after the list traversal,
> __xfrm_state_destroy gets called with sync == false, which passes on
> the baton to another wq to do the gc for xs.
> 
> This all sounds reasonable.
> Will you chase this down or do you prefer me to send the proposed fix?

Thanks for the feedback and confirmation. Let me try it first. Hope
unregistering bond doesn't affect the gc works.

Thanks
Hangbin

  reply	other threads:[~2025-02-20 11:18 UTC|newest]

Thread overview: 29+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-12-11  7:11 [PATCH net 0/2] bond: fix xfrm offload feature during init Hangbin Liu
2024-12-11  7:11 ` [PATCH net 1/2] bonding: fix xfrm offload feature setup on active-backup mode Hangbin Liu
2024-12-12  9:19   ` Nikolay Aleksandrov
2024-12-12  9:39     ` Hangbin Liu
2024-12-12  9:43       ` Nikolay Aleksandrov
2024-12-13  3:10         ` Hangbin Liu
2024-12-11  7:11 ` [PATCH net 2/2] selftests: bonding: add ipsec offload test Hangbin Liu
2024-12-12 14:27 ` [PATCH net 0/2] bond: fix xfrm offload feature during init Jakub Kicinski
2024-12-13  7:18   ` Hangbin Liu
2024-12-14  3:31     ` Jakub Kicinski
2025-01-02  2:44       ` Hangbin Liu
2025-01-02  3:33         ` Jianbo Liu
2025-01-03 11:05           ` Hangbin Liu
2025-01-06 10:47           ` Hangbin Liu
2025-01-08  2:46             ` Hangbin Liu
2025-01-08  3:40               ` Jianbo Liu
2025-01-08  7:14                 ` Hangbin Liu
2025-01-09  1:26                   ` Jianbo Liu
2025-01-09  8:37                     ` Hangbin Liu
2025-01-09  9:51                       ` Jianbo Liu
2025-01-09 10:17                         ` Hangbin Liu
2025-01-09 12:21                           ` Jianbo Liu
2025-01-15  9:19                   ` Hangbin Liu
2025-01-17  7:54                     ` Steffen Klassert
2025-01-20 16:16                       ` Cosmin Ratiu
2025-01-20 23:59                         ` Hangbin Liu
2025-02-20 10:48                           ` Cosmin Ratiu
2025-02-20 11:18                             ` Hangbin Liu [this message]
2025-02-20 11:33                               ` Cosmin Ratiu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Z7cPdyDZ9OOIgU7c@fedora \
    --to=liuhangbin@gmail.com \
    --cc=andrew+netdev@lunn.ch \
    --cc=andy@greyhouse.net \
    --cc=cratiu@nvidia.com \
    --cc=davem@davemloft.net \
    --cc=edumazet@google.com \
    --cc=herbert@gondor.apana.org.au \
    --cc=horms@kernel.org \
    --cc=jianbol@nvidia.com \
    --cc=jv@jvosburgh.net \
    --cc=kuba@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-kselftest@vger.kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=pabeni@redhat.com \
    --cc=razor@blackwall.org \
    --cc=sd@queasysnail.net \
    --cc=shuah@kernel.org \
    --cc=steffen.klassert@secunet.com \
    --cc=tariqt@nvidia.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).