* [PATCH] net: tls: Use disable_delayed_work_sync() instead of cancel_delayed_work_sync() in tls_sw_cancel_work_tx()
@ 2026-02-16 9:51 Hyunwoo Kim
2026-02-17 15:37 ` Simon Horman
0 siblings, 1 reply; 5+ messages in thread
From: Hyunwoo Kim @ 2026-02-16 9:51 UTC (permalink / raw)
To: john.fastabend, kuba, sd, davem, edumazet, pabeni; +Cc: horms, netdev, imv4bel
After cancel_delayed_work_sync() is called from tls_sk_proto_close(),
tx_work_handler() can still be scheduled from paths such as the
Delayed ACK handler or ksoftirqd.
As a result, the tx_work_handler() worker may dereference a freed
TLS object.
To prevent this race condition, cancel_delayed_work_sync() is
replaced with disable_delayed_work_sync().
Fixes: f87e62d45e51 ("net/tls: remove close callback sock unlock/lock around TX work flush")
Signed-off-by: Hyunwoo Kim <imv4bel@gmail.com>
---
net/tls/tls_sw.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c
index 9937d4c810f2..b1fa62de9dab 100644
--- a/net/tls/tls_sw.c
+++ b/net/tls/tls_sw.c
@@ -2533,7 +2533,7 @@ void tls_sw_cancel_work_tx(struct tls_context *tls_ctx)
set_bit(BIT_TX_CLOSING, &ctx->tx_bitmask);
set_bit(BIT_TX_SCHEDULED, &ctx->tx_bitmask);
- cancel_delayed_work_sync(&ctx->tx_work.work);
+ disable_delayed_work_sync(&ctx->tx_work.work);
}
void tls_sw_release_resources_tx(struct sock *sk)
--
2.43.0
---
Dear,
The following is a simplified scenario illustrating how each race can occur. Since tls_sw_cancel_work_tx() does not hold lock_sock(), it can race with tls_write_space().
```
cpu0 cpu1
tls_sk_proto_close()
tls_sw_cancel_work_tx()
tls_write_space()
tls_sw_write_space()
if (!test_and_set_bit(BIT_TX_SCHEDULED, &tx_ctx->tx_bitmask))
set_bit(BIT_TX_SCHEDULED, &ctx->tx_bitmask);
cancel_delayed_work_sync(&ctx->tx_work.work);
schedule_delayed_work(&tx_ctx->tx_work.work, 0);
```
Best regards,
Hyunwoo Kim
^ permalink raw reply related [flat|nested] 5+ messages in thread
* Re: [PATCH] net: tls: Use disable_delayed_work_sync() instead of cancel_delayed_work_sync() in tls_sw_cancel_work_tx()
2026-02-16 9:51 [PATCH] net: tls: Use disable_delayed_work_sync() instead of cancel_delayed_work_sync() in tls_sw_cancel_work_tx() Hyunwoo Kim
@ 2026-02-17 15:37 ` Simon Horman
2026-02-17 16:46 ` Hyunwoo Kim
0 siblings, 1 reply; 5+ messages in thread
From: Simon Horman @ 2026-02-17 15:37 UTC (permalink / raw)
To: Hyunwoo Kim; +Cc: john.fastabend, kuba, sd, davem, edumazet, pabeni, netdev
On Mon, Feb 16, 2026 at 06:51:50PM +0900, Hyunwoo Kim wrote:
> After cancel_delayed_work_sync() is called from tls_sk_proto_close(),
> tx_work_handler() can still be scheduled from paths such as the
> Delayed ACK handler or ksoftirqd.
> As a result, the tx_work_handler() worker may dereference a freed
> TLS object.
>
> To prevent this race condition, cancel_delayed_work_sync() is
> replaced with disable_delayed_work_sync().
>
> Fixes: f87e62d45e51 ("net/tls: remove close callback sock unlock/lock around TX work flush")
> Signed-off-by: Hyunwoo Kim <imv4bel@gmail.com>
Hi Hyunwoo,
Thanks for your patch(es).
Some feedback on process from my side.
You can read more about that at
https://docs.kernel.org/process/maintainer-netdev.html
* I think it would be good to mention how this problem was found.
* As a bug fix for code present in the net tree, it should be targeted
at that tree like this.
Subject: [PATCH net]: ...
* Looking over git history, it seems that an appropriate prefix
for patches for this code is 'tls: '
Subject [PATCH net]: tls: ...
* Also, please try to make the subject a bit more succinct
> ---
> net/tls/tls_sw.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c
> index 9937d4c810f2..b1fa62de9dab 100644
> --- a/net/tls/tls_sw.c
> +++ b/net/tls/tls_sw.c
> @@ -2533,7 +2533,7 @@ void tls_sw_cancel_work_tx(struct tls_context *tls_ctx)
>
> set_bit(BIT_TX_CLOSING, &ctx->tx_bitmask);
> set_bit(BIT_TX_SCHEDULED, &ctx->tx_bitmask);
> - cancel_delayed_work_sync(&ctx->tx_work.work);
> + disable_delayed_work_sync(&ctx->tx_work.work);
> }
>
> void tls_sw_release_resources_tx(struct sock *sk)
> --
> 2.43.0
> ---
>
> Dear,
>
> The following is a simplified scenario illustrating how each race can occur. Since tls_sw_cancel_work_tx() does not hold lock_sock(), it can race with tls_write_space().
> ```
> cpu0 cpu1
>
> tls_sk_proto_close()
> tls_sw_cancel_work_tx()
> tls_write_space()
> tls_sw_write_space()
> if (!test_and_set_bit(BIT_TX_SCHEDULED, &tx_ctx->tx_bitmask))
> set_bit(BIT_TX_SCHEDULED, &ctx->tx_bitmask);
> cancel_delayed_work_sync(&ctx->tx_work.work);
> schedule_delayed_work(&tx_ctx->tx_work.work, 0);
> ```
I think that the text above would be best included in the patch description.
At least for me it is fundamental to understanding the problem.
>
> Best regards,
> Hyunwoo Kim
>
I see three similar patches on the mailing list.
The comments above go for them too.
And It would probably be useful to just handle one at a time,
to allow for proper feedback. Or bundle them in a patchset.
--
pw-bot: changes-requested
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH] net: tls: Use disable_delayed_work_sync() instead of cancel_delayed_work_sync() in tls_sw_cancel_work_tx()
2026-02-17 15:37 ` Simon Horman
@ 2026-02-17 16:46 ` Hyunwoo Kim
2026-02-18 9:34 ` Simon Horman
0 siblings, 1 reply; 5+ messages in thread
From: Hyunwoo Kim @ 2026-02-17 16:46 UTC (permalink / raw)
To: Simon Horman
Cc: john.fastabend, kuba, sd, davem, edumazet, pabeni, netdev,
imv4bel
On Tue, Feb 17, 2026 at 03:37:01PM +0000, Simon Horman wrote:
> On Mon, Feb 16, 2026 at 06:51:50PM +0900, Hyunwoo Kim wrote:
> > After cancel_delayed_work_sync() is called from tls_sk_proto_close(),
> > tx_work_handler() can still be scheduled from paths such as the
> > Delayed ACK handler or ksoftirqd.
> > As a result, the tx_work_handler() worker may dereference a freed
> > TLS object.
> >
> > To prevent this race condition, cancel_delayed_work_sync() is
> > replaced with disable_delayed_work_sync().
> >
> > Fixes: f87e62d45e51 ("net/tls: remove close callback sock unlock/lock around TX work flush")
> > Signed-off-by: Hyunwoo Kim <imv4bel@gmail.com>
>
> Hi Hyunwoo,
>
> Thanks for your patch(es).
>
> Some feedback on process from my side.
> You can read more about that at
> https://docs.kernel.org/process/maintainer-netdev.html
>
> * I think it would be good to mention how this problem was found.
Hi Simon,
Thank you for the feedback.
This issue was found during a manual code audit. I will add this
information to the commit message.
>
> * As a bug fix for code present in the net tree, it should be targeted
> at that tree like this.
>
> Subject: [PATCH net]: ...
>
> * Looking over git history, it seems that an appropriate prefix
> for patches for this code is 'tls: '
>
> Subject [PATCH net]: tls: ...
>
> * Also, please try to make the subject a bit more succinct
>
> > ---
> > net/tls/tls_sw.c | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c
> > index 9937d4c810f2..b1fa62de9dab 100644
> > --- a/net/tls/tls_sw.c
> > +++ b/net/tls/tls_sw.c
> > @@ -2533,7 +2533,7 @@ void tls_sw_cancel_work_tx(struct tls_context *tls_ctx)
> >
> > set_bit(BIT_TX_CLOSING, &ctx->tx_bitmask);
> > set_bit(BIT_TX_SCHEDULED, &ctx->tx_bitmask);
> > - cancel_delayed_work_sync(&ctx->tx_work.work);
> > + disable_delayed_work_sync(&ctx->tx_work.work);
> > }
> >
> > void tls_sw_release_resources_tx(struct sock *sk)
> > --
> > 2.43.0
> > ---
> >
> > Dear,
> >
> > The following is a simplified scenario illustrating how each race can occur. Since tls_sw_cancel_work_tx() does not hold lock_sock(), it can race with tls_write_space().
> > ```
> > cpu0 cpu1
> >
> > tls_sk_proto_close()
> > tls_sw_cancel_work_tx()
> > tls_write_space()
> > tls_sw_write_space()
> > if (!test_and_set_bit(BIT_TX_SCHEDULED, &tx_ctx->tx_bitmask))
> > set_bit(BIT_TX_SCHEDULED, &ctx->tx_bitmask);
> > cancel_delayed_work_sync(&ctx->tx_work.work);
> > schedule_delayed_work(&tx_ctx->tx_work.work, 0);
> > ```
>
> I think that the text above would be best included in the patch description.
> At least for me it is fundamental to understanding the problem.
Understood. I will add the race scenario description to the patch.
>
> >
> > Best regards,
> > Hyunwoo Kim
> >
>
> I see three similar patches on the mailing list.
> The comments above go for them too.
> And It would probably be useful to just handle one at a time,
> to allow for proper feedback. Or bundle them in a patchset.
Since the core of this bug pattern is espintcp, I will submit a revised
espintcp v2 patch shortly.
I would appreciate it if you could review the espintcp v2 patch.
Once the espintcp review is done, I will apply the feedback to the
remaining strparser and tls patches as well.
Best regards,
Hyunwoo Kim
>
> --
> pw-bot: changes-requested
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH] net: tls: Use disable_delayed_work_sync() instead of cancel_delayed_work_sync() in tls_sw_cancel_work_tx()
2026-02-17 16:46 ` Hyunwoo Kim
@ 2026-02-18 9:34 ` Simon Horman
2026-02-18 9:36 ` Simon Horman
0 siblings, 1 reply; 5+ messages in thread
From: Simon Horman @ 2026-02-18 9:34 UTC (permalink / raw)
To: Hyunwoo Kim; +Cc: john.fastabend, kuba, sd, davem, edumazet, pabeni, netdev
On Wed, Feb 18, 2026 at 01:46:00AM +0900, Hyunwoo Kim wrote:
> On Tue, Feb 17, 2026 at 03:37:01PM +0000, Simon Horman wrote:
> > On Mon, Feb 16, 2026 at 06:51:50PM +0900, Hyunwoo Kim wrote:
> > > After cancel_delayed_work_sync() is called from tls_sk_proto_close(),
> > > tx_work_handler() can still be scheduled from paths such as the
> > > Delayed ACK handler or ksoftirqd.
> > > As a result, the tx_work_handler() worker may dereference a freed
> > > TLS object.
> > >
> > > To prevent this race condition, cancel_delayed_work_sync() is
> > > replaced with disable_delayed_work_sync().
> > >
> > > Fixes: f87e62d45e51 ("net/tls: remove close callback sock unlock/lock around TX work flush")
> > > Signed-off-by: Hyunwoo Kim <imv4bel@gmail.com>
> >
> > Hi Hyunwoo,
> >
> > Thanks for your patch(es).
> >
> > Some feedback on process from my side.
> > You can read more about that at
> > https://docs.kernel.org/process/maintainer-netdev.html
> >
> > * I think it would be good to mention how this problem was found.
>
> Hi Simon,
>
> Thank you for the feedback.
>
> This issue was found during a manual code audit. I will add this
> information to the commit message.
>
>
> >
> > * As a bug fix for code present in the net tree, it should be targeted
> > at that tree like this.
> >
> > Subject: [PATCH net]: ...
> >
> > * Looking over git history, it seems that an appropriate prefix
> > for patches for this code is 'tls: '
> >
> > Subject [PATCH net]: tls: ...
> >
> > * Also, please try to make the subject a bit more succinct
> >
> > > ---
> > > net/tls/tls_sw.c | 2 +-
> > > 1 file changed, 1 insertion(+), 1 deletion(-)
> > >
> > > diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c
> > > index 9937d4c810f2..b1fa62de9dab 100644
> > > --- a/net/tls/tls_sw.c
> > > +++ b/net/tls/tls_sw.c
> > > @@ -2533,7 +2533,7 @@ void tls_sw_cancel_work_tx(struct tls_context *tls_ctx)
> > >
> > > set_bit(BIT_TX_CLOSING, &ctx->tx_bitmask);
> > > set_bit(BIT_TX_SCHEDULED, &ctx->tx_bitmask);
> > > - cancel_delayed_work_sync(&ctx->tx_work.work);
> > > + disable_delayed_work_sync(&ctx->tx_work.work);
> > > }
> > >
> > > void tls_sw_release_resources_tx(struct sock *sk)
> > > --
> > > 2.43.0
> > > ---
> > >
> > > Dear,
> > >
> > > The following is a simplified scenario illustrating how each race can occur. Since tls_sw_cancel_work_tx() does not hold lock_sock(), it can race with tls_write_space().
> > > ```
> > > cpu0 cpu1
> > >
> > > tls_sk_proto_close()
> > > tls_sw_cancel_work_tx()
> > > tls_write_space()
> > > tls_sw_write_space()
> > > if (!test_and_set_bit(BIT_TX_SCHEDULED, &tx_ctx->tx_bitmask))
> > > set_bit(BIT_TX_SCHEDULED, &ctx->tx_bitmask);
> > > cancel_delayed_work_sync(&ctx->tx_work.work);
> > > schedule_delayed_work(&tx_ctx->tx_work.work, 0);
> > > ```
> >
> > I think that the text above would be best included in the patch description.
> > At least for me it is fundamental to understanding the problem.
>
> Understood. I will add the race scenario description to the patch.
>
> >
> > >
> > > Best regards,
> > > Hyunwoo Kim
> > >
> >
> > I see three similar patches on the mailing list.
> > The comments above go for them too.
> > And It would probably be useful to just handle one at a time,
> > to allow for proper feedback. Or bundle them in a patchset.
>
> Since the core of this bug pattern is espintcp, I will submit a revised
> espintcp v2 patch shortly.
> I would appreciate it if you could review the espintcp v2 patch.
> Once the espintcp review is done, I will apply the feedback to the
> remaining strparser and tls patches as well.
Thanks.
I have looked over the espintcp patch.
It looks good to me. But I think it would be good to include
some sort of race analysis in the commit message: something similar
to the cpu0 / cpu1 text above, but perhaps with a different case explained.
Also, I think the other comments above apply to that patch too.
I will respond to that patch pointing to this message.
>
> Best regards,
> Hyunwoo Kim
>
> >
> > --
> > pw-bot: changes-requested
>
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH] net: tls: Use disable_delayed_work_sync() instead of cancel_delayed_work_sync() in tls_sw_cancel_work_tx()
2026-02-18 9:34 ` Simon Horman
@ 2026-02-18 9:36 ` Simon Horman
0 siblings, 0 replies; 5+ messages in thread
From: Simon Horman @ 2026-02-18 9:36 UTC (permalink / raw)
To: Hyunwoo Kim; +Cc: john.fastabend, kuba, sd, davem, edumazet, pabeni, netdev
On Wed, Feb 18, 2026 at 09:34:32AM +0000, Simon Horman wrote:
> On Wed, Feb 18, 2026 at 01:46:00AM +0900, Hyunwoo Kim wrote:
> > On Tue, Feb 17, 2026 at 03:37:01PM +0000, Simon Horman wrote:
> > > On Mon, Feb 16, 2026 at 06:51:50PM +0900, Hyunwoo Kim wrote:
> > > > After cancel_delayed_work_sync() is called from tls_sk_proto_close(),
> > > > tx_work_handler() can still be scheduled from paths such as the
> > > > Delayed ACK handler or ksoftirqd.
> > > > As a result, the tx_work_handler() worker may dereference a freed
> > > > TLS object.
> > > >
> > > > To prevent this race condition, cancel_delayed_work_sync() is
> > > > replaced with disable_delayed_work_sync().
> > > >
> > > > Fixes: f87e62d45e51 ("net/tls: remove close callback sock unlock/lock around TX work flush")
> > > > Signed-off-by: Hyunwoo Kim <imv4bel@gmail.com>
> > >
> > > Hi Hyunwoo,
> > >
> > > Thanks for your patch(es).
> > >
> > > Some feedback on process from my side.
> > > You can read more about that at
> > > https://docs.kernel.org/process/maintainer-netdev.html
> > >
> > > * I think it would be good to mention how this problem was found.
> >
> > Hi Simon,
> >
> > Thank you for the feedback.
> >
> > This issue was found during a manual code audit. I will add this
> > information to the commit message.
> >
> >
> > >
> > > * As a bug fix for code present in the net tree, it should be targeted
> > > at that tree like this.
> > >
> > > Subject: [PATCH net]: ...
> > >
> > > * Looking over git history, it seems that an appropriate prefix
> > > for patches for this code is 'tls: '
> > >
> > > Subject [PATCH net]: tls: ...
> > >
> > > * Also, please try to make the subject a bit more succinct
> > >
> > > > ---
> > > > net/tls/tls_sw.c | 2 +-
> > > > 1 file changed, 1 insertion(+), 1 deletion(-)
> > > >
> > > > diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c
> > > > index 9937d4c810f2..b1fa62de9dab 100644
> > > > --- a/net/tls/tls_sw.c
> > > > +++ b/net/tls/tls_sw.c
> > > > @@ -2533,7 +2533,7 @@ void tls_sw_cancel_work_tx(struct tls_context *tls_ctx)
> > > >
> > > > set_bit(BIT_TX_CLOSING, &ctx->tx_bitmask);
> > > > set_bit(BIT_TX_SCHEDULED, &ctx->tx_bitmask);
> > > > - cancel_delayed_work_sync(&ctx->tx_work.work);
> > > > + disable_delayed_work_sync(&ctx->tx_work.work);
> > > > }
> > > >
> > > > void tls_sw_release_resources_tx(struct sock *sk)
> > > > --
> > > > 2.43.0
> > > > ---
> > > >
> > > > Dear,
> > > >
> > > > The following is a simplified scenario illustrating how each race can occur. Since tls_sw_cancel_work_tx() does not hold lock_sock(), it can race with tls_write_space().
> > > > ```
> > > > cpu0 cpu1
> > > >
> > > > tls_sk_proto_close()
> > > > tls_sw_cancel_work_tx()
> > > > tls_write_space()
> > > > tls_sw_write_space()
> > > > if (!test_and_set_bit(BIT_TX_SCHEDULED, &tx_ctx->tx_bitmask))
> > > > set_bit(BIT_TX_SCHEDULED, &ctx->tx_bitmask);
> > > > cancel_delayed_work_sync(&ctx->tx_work.work);
> > > > schedule_delayed_work(&tx_ctx->tx_work.work, 0);
> > > > ```
> > >
> > > I think that the text above would be best included in the patch description.
> > > At least for me it is fundamental to understanding the problem.
> >
> > Understood. I will add the race scenario description to the patch.
> >
> > >
> > > >
> > > > Best regards,
> > > > Hyunwoo Kim
> > > >
> > >
> > > I see three similar patches on the mailing list.
> > > The comments above go for them too.
> > > And It would probably be useful to just handle one at a time,
> > > to allow for proper feedback. Or bundle them in a patchset.
> >
> > Since the core of this bug pattern is espintcp, I will submit a revised
> > espintcp v2 patch shortly.
> > I would appreciate it if you could review the espintcp v2 patch.
> > Once the espintcp review is done, I will apply the feedback to the
> > remaining strparser and tls patches as well.
>
> Thanks.
>
> I have looked over the espintcp patch.
>
> It looks good to me. But I think it would be good to include
> some sort of race analysis in the commit message: something similar
> to the cpu0 / cpu1 text above, but perhaps with a different case explained.
>
> Also, I think the other comments above apply to that patch too.
>
> I will respond to that patch pointing to this message.
Sorry, I was a bit hasty there.
The comments above are for v1.
I'll look over v2 and respond there.
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2026-02-18 9:36 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-02-16 9:51 [PATCH] net: tls: Use disable_delayed_work_sync() instead of cancel_delayed_work_sync() in tls_sw_cancel_work_tx() Hyunwoo Kim
2026-02-17 15:37 ` Simon Horman
2026-02-17 16:46 ` Hyunwoo Kim
2026-02-18 9:34 ` Simon Horman
2026-02-18 9:36 ` Simon Horman
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox