public inbox for netdev@vger.kernel.org
* [PATCH] net: tls: Use disable_delayed_work_sync() instead of cancel_delayed_work_sync() in tls_sw_cancel_work_tx()
@ 2026-02-16  9:51 Hyunwoo Kim
  2026-02-17 15:37 ` Simon Horman
  0 siblings, 1 reply; 5+ messages in thread
From: Hyunwoo Kim @ 2026-02-16  9:51 UTC
  To: john.fastabend, kuba, sd, davem, edumazet, pabeni; +Cc: horms, netdev, imv4bel

After cancel_delayed_work_sync() is called from tls_sk_proto_close(), 
tx_work_handler() can still be scheduled from paths such as the 
Delayed ACK handler or ksoftirqd.
As a result, the tx_work_handler() worker may dereference a freed 
TLS object.

To prevent this race condition, cancel_delayed_work_sync() is 
replaced with disable_delayed_work_sync().

Fixes: f87e62d45e51 ("net/tls: remove close callback sock unlock/lock around TX work flush")
Signed-off-by: Hyunwoo Kim <imv4bel@gmail.com>
---
 net/tls/tls_sw.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c
index 9937d4c810f2..b1fa62de9dab 100644
--- a/net/tls/tls_sw.c
+++ b/net/tls/tls_sw.c
@@ -2533,7 +2533,7 @@ void tls_sw_cancel_work_tx(struct tls_context *tls_ctx)
 
 	set_bit(BIT_TX_CLOSING, &ctx->tx_bitmask);
 	set_bit(BIT_TX_SCHEDULED, &ctx->tx_bitmask);
-	cancel_delayed_work_sync(&ctx->tx_work.work);
+	disable_delayed_work_sync(&ctx->tx_work.work);
 }
 
 void tls_sw_release_resources_tx(struct sock *sk)
-- 
2.43.0
---

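Hi,

The following simplified scenario illustrates how the race can occur. Because tls_sw_cancel_work_tx() runs without holding lock_sock(), it can race with tls_write_space(): cpu1 passes the BIT_TX_SCHEDULED test before cpu0 sets the bit, so its schedule_delayed_work() call can land after cpu0's cancel_delayed_work_sync() has already returned.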
```
          cpu0                                             cpu1

tls_sk_proto_close()
  tls_sw_cancel_work_tx()
                                                    tls_write_space()
                                                      tls_sw_write_space()
                                                      if (!test_and_set_bit(BIT_TX_SCHEDULED, &tx_ctx->tx_bitmask))
    set_bit(BIT_TX_SCHEDULED, &ctx->tx_bitmask);
    cancel_delayed_work_sync(&ctx->tx_work.work);
                                                      schedule_delayed_work(&tx_ctx->tx_work.work, 0);
```
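
To make the difference between the two calls concrete, here is a minimal userspace sketch in C. It is not kernel code: struct toy_work and the toy_* functions are hypothetical stand-ins invented for illustration. It assumes only the documented workqueue behaviour that a disabled work item refuses to be queued, whereas a cancelled one can be queued again.

```c
#include <stdbool.h>

/* Toy model of a work item; NOT the kernel's struct delayed_work. */
struct toy_work {
	bool pending;	/* queued and waiting to run */
	int disabled;	/* disable depth; queueing is refused while > 0 */
};

/* Stand-in for schedule_delayed_work(): returns true if the work was queued. */
static bool toy_schedule(struct toy_work *w)
{
	if (w->disabled || w->pending)
		return false;
	w->pending = true;
	return true;
}

/* Stand-in for cancel_delayed_work_sync(): drops pending work only. */
static bool toy_cancel_sync(struct toy_work *w)
{
	bool was_pending = w->pending;

	w->pending = false;
	return was_pending;
}

/* Stand-in for disable_delayed_work_sync(): cancel, and refuse later queueing. */
static bool toy_disable_sync(struct toy_work *w)
{
	w->disabled++;
	return toy_cancel_sync(w);
}

/*
 * Replays the cpu0/cpu1 interleaving from the diagram above.
 * Returns true if work is still pending after the close path has finished,
 * i.e. the handler could later run against a freed context.
 */
static bool toy_replay_race(bool use_disable)
{
	struct toy_work w = { .pending = false, .disabled = 0 };

	/* cpu1 has already passed the BIT_TX_SCHEDULED test at this point. */

	/* cpu0: tls_sw_cancel_work_tx() */
	if (use_disable)
		toy_disable_sync(&w);
	else
		toy_cancel_sync(&w);

	/* cpu1: the late schedule_delayed_work() call */
	toy_schedule(&w);

	return w.pending;
}
```

In this model toy_replay_race(false) leaves the work pending after the close path, while toy_replay_race(true) does not, because the late queueing attempt is refused.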

Best regards,
Hyunwoo Kim


* Re: [PATCH] net: tls: Use disable_delayed_work_sync() instead of cancel_delayed_work_sync() in tls_sw_cancel_work_tx()
  2026-02-16  9:51 [PATCH] net: tls: Use disable_delayed_work_sync() instead of cancel_delayed_work_sync() in tls_sw_cancel_work_tx() Hyunwoo Kim
@ 2026-02-17 15:37 ` Simon Horman
  2026-02-17 16:46   ` Hyunwoo Kim
  0 siblings, 1 reply; 5+ messages in thread
From: Simon Horman @ 2026-02-17 15:37 UTC
  To: Hyunwoo Kim; +Cc: john.fastabend, kuba, sd, davem, edumazet, pabeni, netdev

On Mon, Feb 16, 2026 at 06:51:50PM +0900, Hyunwoo Kim wrote:
> After cancel_delayed_work_sync() is called from tls_sk_proto_close(), 
> tx_work_handler() can still be scheduled from paths such as the 
> Delayed ACK handler or ksoftirqd.
> As a result, the tx_work_handler() worker may dereference a freed 
> TLS object.
> 
> To prevent this race condition, cancel_delayed_work_sync() is 
> replaced with disable_delayed_work_sync().
> 
> Fixes: f87e62d45e51 ("net/tls: remove close callback sock unlock/lock around TX work flush")
> Signed-off-by: Hyunwoo Kim <imv4bel@gmail.com>

Hi Hyunwoo,

Thanks for your patch(es).

Some feedback on process from my side.
You can read more about that at
https://docs.kernel.org/process/maintainer-netdev.html

* I think it would be good to mention how this problem was found.

* As a bug fix for code present in the net tree, it should be targeted
  at that tree like this.

  Subject: [PATCH net] ...

* Looking over git history, it seems that an appropriate prefix
  for patches for this code is 'tls: '

  Subject: [PATCH net] tls: ...

* Also, please try to make the subject a bit more succinct

> ---
>  net/tls/tls_sw.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c
> index 9937d4c810f2..b1fa62de9dab 100644
> --- a/net/tls/tls_sw.c
> +++ b/net/tls/tls_sw.c
> @@ -2533,7 +2533,7 @@ void tls_sw_cancel_work_tx(struct tls_context *tls_ctx)
>  
>  	set_bit(BIT_TX_CLOSING, &ctx->tx_bitmask);
>  	set_bit(BIT_TX_SCHEDULED, &ctx->tx_bitmask);
> -	cancel_delayed_work_sync(&ctx->tx_work.work);
> +	disable_delayed_work_sync(&ctx->tx_work.work);
>  }
>  
>  void tls_sw_release_resources_tx(struct sock *sk)
> -- 
> 2.43.0
> ---
> 
> Dear,
> 
> The following is a simplified scenario illustrating how each race can occur. Since tls_sw_cancel_work_tx() does not hold lock_sock(), it can race with tls_write_space().
> ```
>           cpu0                                             cpu1
> 
> tls_sk_proto_close()
>   tls_sw_cancel_work_tx()
>                                                     tls_write_space()
>                                                       tls_sw_write_space()
>                                                       if (!test_and_set_bit(BIT_TX_SCHEDULED, &tx_ctx->tx_bitmask))
>     set_bit(BIT_TX_SCHEDULED, &ctx->tx_bitmask);
>     cancel_delayed_work_sync(&ctx->tx_work.work);
>                                                       schedule_delayed_work(&tx_ctx->tx_work.work, 0);
> ```

I think that the text above would be best included in the patch description.
At least for me it is fundamental to understanding the problem.

> 
> Best regards,
> Hyunwoo Kim
> 

I see three similar patches on the mailing list.
The comments above go for them too.
And it would probably be useful to just handle one at a time,
to allow for proper feedback. Or bundle them in a patchset.

-- 
pw-bot: changes-requested


* Re: [PATCH] net: tls: Use disable_delayed_work_sync() instead of cancel_delayed_work_sync() in tls_sw_cancel_work_tx()
  2026-02-17 15:37 ` Simon Horman
@ 2026-02-17 16:46   ` Hyunwoo Kim
  2026-02-18  9:34     ` Simon Horman
  0 siblings, 1 reply; 5+ messages in thread
From: Hyunwoo Kim @ 2026-02-17 16:46 UTC
  To: Simon Horman
  Cc: john.fastabend, kuba, sd, davem, edumazet, pabeni, netdev,
	imv4bel

On Tue, Feb 17, 2026 at 03:37:01PM +0000, Simon Horman wrote:
> On Mon, Feb 16, 2026 at 06:51:50PM +0900, Hyunwoo Kim wrote:
> > After cancel_delayed_work_sync() is called from tls_sk_proto_close(), 
> > tx_work_handler() can still be scheduled from paths such as the 
> > Delayed ACK handler or ksoftirqd.
> > As a result, the tx_work_handler() worker may dereference a freed 
> > TLS object.
> > 
> > To prevent this race condition, cancel_delayed_work_sync() is 
> > replaced with disable_delayed_work_sync().
> > 
> > Fixes: f87e62d45e51 ("net/tls: remove close callback sock unlock/lock around TX work flush")
> > Signed-off-by: Hyunwoo Kim <imv4bel@gmail.com>
> 
> Hi Hyunwoo,
> 
> Thanks for your patch(es).
> 
> Some feedback on process from my side.
> You can read more about that at
> https://docs.kernel.org/process/maintainer-netdev.html
> 
> * I think it would be good to mention how this problem was found.

Hi Simon,

Thank you for the feedback.

This issue was found during a manual code audit. I will add this 
information to the commit message.


> 
> * As a bug fix for code present in the net tree, it should be targeted
>   at that tree like this.
> 
>   Subject: [PATCH net] ...
> 
> * Looking over git history, it seems that an appropriate prefix
>   for patches for this code is 'tls: '
> 
>   Subject: [PATCH net] tls: ...
> 
> * Also, please try to make the subject a bit more succinct
> 
> > ---
> >  net/tls/tls_sw.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c
> > index 9937d4c810f2..b1fa62de9dab 100644
> > --- a/net/tls/tls_sw.c
> > +++ b/net/tls/tls_sw.c
> > @@ -2533,7 +2533,7 @@ void tls_sw_cancel_work_tx(struct tls_context *tls_ctx)
> >  
> >  	set_bit(BIT_TX_CLOSING, &ctx->tx_bitmask);
> >  	set_bit(BIT_TX_SCHEDULED, &ctx->tx_bitmask);
> > -	cancel_delayed_work_sync(&ctx->tx_work.work);
> > +	disable_delayed_work_sync(&ctx->tx_work.work);
> >  }
> >  
> >  void tls_sw_release_resources_tx(struct sock *sk)
> > -- 
> > 2.43.0
> > ---
> > 
> > Dear,
> > 
> > The following is a simplified scenario illustrating how each race can occur. Since tls_sw_cancel_work_tx() does not hold lock_sock(), it can race with tls_write_space().
> > ```
> >           cpu0                                             cpu1
> > 
> > tls_sk_proto_close()
> >   tls_sw_cancel_work_tx()
> >                                                     tls_write_space()
> >                                                       tls_sw_write_space()
> >                                                       if (!test_and_set_bit(BIT_TX_SCHEDULED, &tx_ctx->tx_bitmask))
> >     set_bit(BIT_TX_SCHEDULED, &ctx->tx_bitmask);
> >     cancel_delayed_work_sync(&ctx->tx_work.work);
> >                                                       schedule_delayed_work(&tx_ctx->tx_work.work, 0);
> > ```
> 
> I think that the text above would be best included in the patch description.
> At least for me it is fundamental to understanding the problem.

Understood. I will add the race scenario description to the patch.

> 
> > 
> > Best regards,
> > Hyunwoo Kim
> > 
> 
> I see three similar patches on the mailing list.
> The comments above go for them too.
> And it would probably be useful to just handle one at a time,
> to allow for proper feedback. Or bundle them in a patchset.

Since the core of this bug pattern is espintcp, I will submit a revised 
espintcp v2 patch shortly. 
I would appreciate it if you could review the espintcp v2 patch. 
Once the espintcp review is done, I will apply the feedback to the 
remaining strparser and tls patches as well.

Best regards,
Hyunwoo Kim

> 
> -- 
> pw-bot: changes-requested


* Re: [PATCH] net: tls: Use disable_delayed_work_sync() instead of cancel_delayed_work_sync() in tls_sw_cancel_work_tx()
  2026-02-17 16:46   ` Hyunwoo Kim
@ 2026-02-18  9:34     ` Simon Horman
  2026-02-18  9:36       ` Simon Horman
  0 siblings, 1 reply; 5+ messages in thread
From: Simon Horman @ 2026-02-18  9:34 UTC
  To: Hyunwoo Kim; +Cc: john.fastabend, kuba, sd, davem, edumazet, pabeni, netdev

On Wed, Feb 18, 2026 at 01:46:00AM +0900, Hyunwoo Kim wrote:
> On Tue, Feb 17, 2026 at 03:37:01PM +0000, Simon Horman wrote:
> > On Mon, Feb 16, 2026 at 06:51:50PM +0900, Hyunwoo Kim wrote:
> > > After cancel_delayed_work_sync() is called from tls_sk_proto_close(), 
> > > tx_work_handler() can still be scheduled from paths such as the 
> > > Delayed ACK handler or ksoftirqd.
> > > As a result, the tx_work_handler() worker may dereference a freed 
> > > TLS object.
> > > 
> > > To prevent this race condition, cancel_delayed_work_sync() is 
> > > replaced with disable_delayed_work_sync().
> > > 
> > > Fixes: f87e62d45e51 ("net/tls: remove close callback sock unlock/lock around TX work flush")
> > > Signed-off-by: Hyunwoo Kim <imv4bel@gmail.com>
> > 
> > Hi Hyunwoo,
> > 
> > Thanks for your patch(es).
> > 
> > Some feedback on process from my side.
> > You can read more about that at
> > https://docs.kernel.org/process/maintainer-netdev.html
> > 
> > * I think it would be good to mention how this problem was found.
> 
> Hi Simon,
> 
> Thank you for the feedback.
> 
> This issue was found during a manual code audit. I will add this 
> information to the commit message.
> 
> 
> > 
> > * As a bug fix for code present in the net tree, it should be targeted
> >   at that tree like this.
> > 
> >   Subject: [PATCH net] ...
> > 
> > * Looking over git history, it seems that an appropriate prefix
> >   for patches for this code is 'tls: '
> > 
> >   Subject: [PATCH net] tls: ...
> > 
> > * Also, please try to make the subject a bit more succinct
> > 
> > > ---
> > >  net/tls/tls_sw.c | 2 +-
> > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > 
> > > diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c
> > > index 9937d4c810f2..b1fa62de9dab 100644
> > > --- a/net/tls/tls_sw.c
> > > +++ b/net/tls/tls_sw.c
> > > @@ -2533,7 +2533,7 @@ void tls_sw_cancel_work_tx(struct tls_context *tls_ctx)
> > >  
> > >  	set_bit(BIT_TX_CLOSING, &ctx->tx_bitmask);
> > >  	set_bit(BIT_TX_SCHEDULED, &ctx->tx_bitmask);
> > > -	cancel_delayed_work_sync(&ctx->tx_work.work);
> > > +	disable_delayed_work_sync(&ctx->tx_work.work);
> > >  }
> > >  
> > >  void tls_sw_release_resources_tx(struct sock *sk)
> > > -- 
> > > 2.43.0
> > > ---
> > > 
> > > Dear,
> > > 
> > > The following is a simplified scenario illustrating how each race can occur. Since tls_sw_cancel_work_tx() does not hold lock_sock(), it can race with tls_write_space().
> > > ```
> > >           cpu0                                             cpu1
> > > 
> > > tls_sk_proto_close()
> > >   tls_sw_cancel_work_tx()
> > >                                                     tls_write_space()
> > >                                                       tls_sw_write_space()
> > >                                                       if (!test_and_set_bit(BIT_TX_SCHEDULED, &tx_ctx->tx_bitmask))
> > >     set_bit(BIT_TX_SCHEDULED, &ctx->tx_bitmask);
> > >     cancel_delayed_work_sync(&ctx->tx_work.work);
> > >                                                       schedule_delayed_work(&tx_ctx->tx_work.work, 0);
> > > ```
> > 
> > I think that the text above would be best included in the patch description.
> > At least for me it is fundamental to understanding the problem.
> 
> Understood. I will add the race scenario description to the patch.
> 
> > 
> > > 
> > > Best regards,
> > > Hyunwoo Kim
> > > 
> > 
> > I see three similar patches on the mailing list.
> > The comments above go for them too.
> > And it would probably be useful to just handle one at a time,
> > to allow for proper feedback. Or bundle them in a patchset.
> 
> Since the core of this bug pattern is espintcp, I will submit a revised 
> espintcp v2 patch shortly. 
> I would appreciate it if you could review the espintcp v2 patch. 
> Once the espintcp review is done, I will apply the feedback to the 
> remaining strparser and tls patches as well.

Thanks.

I have looked over the espintcp patch.

It looks good to me. But I think it would be good to include
some sort of race analysis in the commit message: something similar
to the cpu0 / cpu1 text above, but perhaps with a different case explained.

Also, I think the other comments above apply to that patch too.

I will respond to that patch pointing to this message.

> 
> Best regards,
> Hyunwoo Kim
> 
> > 
> > -- 
> > pw-bot: changes-requested
> 


* Re: [PATCH] net: tls: Use disable_delayed_work_sync() instead of cancel_delayed_work_sync() in tls_sw_cancel_work_tx()
  2026-02-18  9:34     ` Simon Horman
@ 2026-02-18  9:36       ` Simon Horman
  0 siblings, 0 replies; 5+ messages in thread
From: Simon Horman @ 2026-02-18  9:36 UTC
  To: Hyunwoo Kim; +Cc: john.fastabend, kuba, sd, davem, edumazet, pabeni, netdev

On Wed, Feb 18, 2026 at 09:34:32AM +0000, Simon Horman wrote:
> On Wed, Feb 18, 2026 at 01:46:00AM +0900, Hyunwoo Kim wrote:
> > On Tue, Feb 17, 2026 at 03:37:01PM +0000, Simon Horman wrote:
> > > On Mon, Feb 16, 2026 at 06:51:50PM +0900, Hyunwoo Kim wrote:
> > > > After cancel_delayed_work_sync() is called from tls_sk_proto_close(), 
> > > > tx_work_handler() can still be scheduled from paths such as the 
> > > > Delayed ACK handler or ksoftirqd.
> > > > As a result, the tx_work_handler() worker may dereference a freed 
> > > > TLS object.
> > > > 
> > > > To prevent this race condition, cancel_delayed_work_sync() is 
> > > > replaced with disable_delayed_work_sync().
> > > > 
> > > > Fixes: f87e62d45e51 ("net/tls: remove close callback sock unlock/lock around TX work flush")
> > > > Signed-off-by: Hyunwoo Kim <imv4bel@gmail.com>
> > > 
> > > Hi Hyunwoo,
> > > 
> > > Thanks for your patch(es).
> > > 
> > > Some feedback on process from my side.
> > > You can read more about that at
> > > https://docs.kernel.org/process/maintainer-netdev.html
> > > 
> > > * I think it would be good to mention how this problem was found.
> > 
> > Hi Simon,
> > 
> > Thank you for the feedback.
> > 
> > This issue was found during a manual code audit. I will add this 
> > information to the commit message.
> > 
> > 
> > > 
> > > * As a bug fix for code present in the net tree, it should be targeted
> > >   at that tree like this.
> > > 
> > >   Subject: [PATCH net] ...
> > > 
> > > * Looking over git history, it seems that an appropriate prefix
> > >   for patches for this code is 'tls: '
> > > 
> > >   Subject: [PATCH net] tls: ...
> > > 
> > > * Also, please try to make the subject a bit more succinct
> > > 
> > > > ---
> > > >  net/tls/tls_sw.c | 2 +-
> > > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > > 
> > > > diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c
> > > > index 9937d4c810f2..b1fa62de9dab 100644
> > > > --- a/net/tls/tls_sw.c
> > > > +++ b/net/tls/tls_sw.c
> > > > @@ -2533,7 +2533,7 @@ void tls_sw_cancel_work_tx(struct tls_context *tls_ctx)
> > > >  
> > > >  	set_bit(BIT_TX_CLOSING, &ctx->tx_bitmask);
> > > >  	set_bit(BIT_TX_SCHEDULED, &ctx->tx_bitmask);
> > > > -	cancel_delayed_work_sync(&ctx->tx_work.work);
> > > > +	disable_delayed_work_sync(&ctx->tx_work.work);
> > > >  }
> > > >  
> > > >  void tls_sw_release_resources_tx(struct sock *sk)
> > > > -- 
> > > > 2.43.0
> > > > ---
> > > > 
> > > > Dear,
> > > > 
> > > > The following is a simplified scenario illustrating how each race can occur. Since tls_sw_cancel_work_tx() does not hold lock_sock(), it can race with tls_write_space().
> > > > ```
> > > >           cpu0                                             cpu1
> > > > 
> > > > tls_sk_proto_close()
> > > >   tls_sw_cancel_work_tx()
> > > >                                                     tls_write_space()
> > > >                                                       tls_sw_write_space()
> > > >                                                       if (!test_and_set_bit(BIT_TX_SCHEDULED, &tx_ctx->tx_bitmask))
> > > >     set_bit(BIT_TX_SCHEDULED, &ctx->tx_bitmask);
> > > >     cancel_delayed_work_sync(&ctx->tx_work.work);
> > > >                                                       schedule_delayed_work(&tx_ctx->tx_work.work, 0);
> > > > ```
> > > 
> > > I think that the text above would be best included in the patch description.
> > > At least for me it is fundamental to understanding the problem.
> > 
> > Understood. I will add the race scenario description to the patch.
> > 
> > > 
> > > > 
> > > > Best regards,
> > > > Hyunwoo Kim
> > > > 
> > > 
> > > I see three similar patches on the mailing list.
> > > The comments above go for them too.
> > > And it would probably be useful to just handle one at a time,
> > > to allow for proper feedback. Or bundle them in a patchset.
> > 
> > Since the core of this bug pattern is espintcp, I will submit a revised 
> > espintcp v2 patch shortly. 
> > I would appreciate it if you could review the espintcp v2 patch. 
> > Once the espintcp review is done, I will apply the feedback to the 
> > remaining strparser and tls patches as well.
> 
> Thanks.
> 
> I have looked over the espintcp patch.
> 
> It looks good to me. But I think it would be good to include
> some sort of race analysis in the commit message: something similar
> to the cpu0 / cpu1 text above, but perhaps with a different case explained.
> 
> Also, I think the other comments above apply to that patch too.
> 
> I will respond to that patch pointing to this message.

Sorry, I was a bit hasty there.
The comments above are for v1.
I'll look over v2 and respond there.

