* [PATCH] net: tls: Use disable_delayed_work_sync() instead of cancel_delayed_work_sync() in tls_sw_cancel_work_tx()
@ 2026-02-16 9:51 Hyunwoo Kim
2026-02-17 15:37 ` Simon Horman
0 siblings, 1 reply; 5+ messages in thread
From: Hyunwoo Kim @ 2026-02-16 9:51 UTC (permalink / raw)
To: john.fastabend, kuba, sd, davem, edumazet, pabeni; +Cc: horms, netdev, imv4bel
After cancel_delayed_work_sync() is called from tls_sk_proto_close(),
tx_work_handler() can still be scheduled from paths such as the
Delayed ACK handler or ksoftirqd.
As a result, the tx_work_handler() worker may dereference a freed
TLS object.
To prevent this race condition, cancel_delayed_work_sync() is
replaced with disable_delayed_work_sync().
Fixes: f87e62d45e51 ("net/tls: remove close callback sock unlock/lock around TX work flush")
Signed-off-by: Hyunwoo Kim <imv4bel@gmail.com>
---
net/tls/tls_sw.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c
index 9937d4c810f2..b1fa62de9dab 100644
--- a/net/tls/tls_sw.c
+++ b/net/tls/tls_sw.c
@@ -2533,7 +2533,7 @@ void tls_sw_cancel_work_tx(struct tls_context *tls_ctx)
set_bit(BIT_TX_CLOSING, &ctx->tx_bitmask);
set_bit(BIT_TX_SCHEDULED, &ctx->tx_bitmask);
- cancel_delayed_work_sync(&ctx->tx_work.work);
+ disable_delayed_work_sync(&ctx->tx_work.work);
}
void tls_sw_release_resources_tx(struct sock *sk)
--
2.43.0
---
Dear,
The following is a simplified scenario illustrating how each race can occur. Since tls_sw_cancel_work_tx() does not hold lock_sock(), it can race with tls_write_space().
```
cpu0 cpu1
tls_sk_proto_close()
tls_sw_cancel_work_tx()
tls_write_space()
tls_sw_write_space()
if (!test_and_set_bit(BIT_TX_SCHEDULED, &tx_ctx->tx_bitmask))
set_bit(BIT_TX_SCHEDULED, &ctx->tx_bitmask);
cancel_delayed_work_sync(&ctx->tx_work.work);
schedule_delayed_work(&tx_ctx->tx_work.work, 0);
```
Best regards,
Hyunwoo Kim
^ permalink raw reply related [flat|nested] 5+ messages in thread* Re: [PATCH] net: tls: Use disable_delayed_work_sync() instead of cancel_delayed_work_sync() in tls_sw_cancel_work_tx() 2026-02-16 9:51 [PATCH] net: tls: Use disable_delayed_work_sync() instead of cancel_delayed_work_sync() in tls_sw_cancel_work_tx() Hyunwoo Kim @ 2026-02-17 15:37 ` Simon Horman 2026-02-17 16:46 ` Hyunwoo Kim 0 siblings, 1 reply; 5+ messages in thread From: Simon Horman @ 2026-02-17 15:37 UTC (permalink / raw) To: Hyunwoo Kim; +Cc: john.fastabend, kuba, sd, davem, edumazet, pabeni, netdev On Mon, Feb 16, 2026 at 06:51:50PM +0900, Hyunwoo Kim wrote: > After cancel_delayed_work_sync() is called from tls_sk_proto_close(), > tx_work_handler() can still be scheduled from paths such as the > Delayed ACK handler or ksoftirqd. > As a result, the tx_work_handler() worker may dereference a freed > TLS object. > > To prevent this race condition, cancel_delayed_work_sync() is > replaced with disable_delayed_work_sync(). > > Fixes: f87e62d45e51 ("net/tls: remove close callback sock unlock/lock around TX work flush") > Signed-off-by: Hyunwoo Kim <imv4bel@gmail.com> Hi Hyunwoo, Thanks for your patch(es). Some feedback on process from my side. You can read more about that at https://docs.kernel.org/process/maintainer-netdev.html * I think it would be good to mention how this problem was found. * As a bug fix for code present in the net tree, it should be targeted at that tree like this. Subject: [PATCH net]: ... * Looking over git history, it seems that an appropriate prefix for patches for this code is 'tls: ' Subject [PATCH net]: tls: ... * Also, please try to make the subject a bit more succinct > --- > net/tls/tls_sw.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c > index 9937d4c810f2..b1fa62de9dab 100644 > --- a/net/tls/tls_sw.c > +++ b/net/tls/tls_sw.c > @@ -2533,7 +2533,7 @@ void tls_sw_cancel_work_tx(struct tls_context *tls_ctx) > > set_bit(BIT_TX_CLOSING, &ctx->tx_bitmask); > set_bit(BIT_TX_SCHEDULED, &ctx->tx_bitmask); > - cancel_delayed_work_sync(&ctx->tx_work.work); > + disable_delayed_work_sync(&ctx->tx_work.work); > } > > void tls_sw_release_resources_tx(struct sock *sk) > -- > 2.43.0 > --- > > Dear, > > The following is a simplified scenario illustrating how each race can occur. Since tls_sw_cancel_work_tx() does not hold lock_sock(), it can race with tls_write_space(). > ``` > cpu0 cpu1 > > tls_sk_proto_close() > tls_sw_cancel_work_tx() > tls_write_space() > tls_sw_write_space() > if (!test_and_set_bit(BIT_TX_SCHEDULED, &tx_ctx->tx_bitmask)) > set_bit(BIT_TX_SCHEDULED, &ctx->tx_bitmask); > cancel_delayed_work_sync(&ctx->tx_work.work); > schedule_delayed_work(&tx_ctx->tx_work.work, 0); > ``` I think that the text above would be best included in the patch description. At least for me it is fundamental to understanding the problem. > > Best regards, > Hyunwoo Kim > I see three similar patches on the mailing list. The comments above go for them too. And It would probably be useful to just handle one at a time, to allow for proper feedback. Or bundle them in a patchset. -- pw-bot: changes-requested ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH] net: tls: Use disable_delayed_work_sync() instead of cancel_delayed_work_sync() in tls_sw_cancel_work_tx() 2026-02-17 15:37 ` Simon Horman @ 2026-02-17 16:46 ` Hyunwoo Kim 2026-02-18 9:34 ` Simon Horman 0 siblings, 1 reply; 5+ messages in thread From: Hyunwoo Kim @ 2026-02-17 16:46 UTC (permalink / raw) To: Simon Horman Cc: john.fastabend, kuba, sd, davem, edumazet, pabeni, netdev, imv4bel On Tue, Feb 17, 2026 at 03:37:01PM +0000, Simon Horman wrote: > On Mon, Feb 16, 2026 at 06:51:50PM +0900, Hyunwoo Kim wrote: > > After cancel_delayed_work_sync() is called from tls_sk_proto_close(), > > tx_work_handler() can still be scheduled from paths such as the > > Delayed ACK handler or ksoftirqd. > > As a result, the tx_work_handler() worker may dereference a freed > > TLS object. > > > > To prevent this race condition, cancel_delayed_work_sync() is > > replaced with disable_delayed_work_sync(). > > > > Fixes: f87e62d45e51 ("net/tls: remove close callback sock unlock/lock around TX work flush") > > Signed-off-by: Hyunwoo Kim <imv4bel@gmail.com> > > Hi Hyunwoo, > > Thanks for your patch(es). > > Some feedback on process from my side. > You can read more about that at > https://docs.kernel.org/process/maintainer-netdev.html > > * I think it would be good to mention how this problem was found. Hi Simon, Thank you for the feedback. This issue was found during a manual code audit. I will add this information to the commit message. > > * As a bug fix for code present in the net tree, it should be targeted > at that tree like this. > > Subject: [PATCH net]: ... > > * Looking over git history, it seems that an appropriate prefix > for patches for this code is 'tls: ' > > Subject [PATCH net]: tls: ... > > * Also, please try to make the subject a bit more succinct > > > --- > > net/tls/tls_sw.c | 2 +- > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c > > index 9937d4c810f2..b1fa62de9dab 100644 > > --- a/net/tls/tls_sw.c > > +++ b/net/tls/tls_sw.c > > @@ -2533,7 +2533,7 @@ void tls_sw_cancel_work_tx(struct tls_context *tls_ctx) > > > > set_bit(BIT_TX_CLOSING, &ctx->tx_bitmask); > > set_bit(BIT_TX_SCHEDULED, &ctx->tx_bitmask); > > - cancel_delayed_work_sync(&ctx->tx_work.work); > > + disable_delayed_work_sync(&ctx->tx_work.work); > > } > > > > void tls_sw_release_resources_tx(struct sock *sk) > > -- > > 2.43.0 > > --- > > > > Dear, > > > > The following is a simplified scenario illustrating how each race can occur. Since tls_sw_cancel_work_tx() does not hold lock_sock(), it can race with tls_write_space(). > > ``` > > cpu0 cpu1 > > > > tls_sk_proto_close() > > tls_sw_cancel_work_tx() > > tls_write_space() > > tls_sw_write_space() > > if (!test_and_set_bit(BIT_TX_SCHEDULED, &tx_ctx->tx_bitmask)) > > set_bit(BIT_TX_SCHEDULED, &ctx->tx_bitmask); > > cancel_delayed_work_sync(&ctx->tx_work.work); > > schedule_delayed_work(&tx_ctx->tx_work.work, 0); > > ``` > > I think that the text above would be best included in the patch description. > At least for me it is fundamental to understanding the problem. Understood. I will add the race scenario description to the patch. > > > > > Best regards, > > Hyunwoo Kim > > > > I see three similar patches on the mailing list. > The comments above go for them too. > And It would probably be useful to just handle one at a time, > to allow for proper feedback. Or bundle them in a patchset. Since the core of this bug pattern is espintcp, I will submit a revised espintcp v2 patch shortly. I would appreciate it if you could review the espintcp v2 patch. Once the espintcp review is done, I will apply the feedback to the remaining strparser and tls patches as well. Best regards, Hyunwoo Kim > > -- > pw-bot: changes-requested ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH] net: tls: Use disable_delayed_work_sync() instead of cancel_delayed_work_sync() in tls_sw_cancel_work_tx() 2026-02-17 16:46 ` Hyunwoo Kim @ 2026-02-18 9:34 ` Simon Horman 2026-02-18 9:36 ` Simon Horman 0 siblings, 1 reply; 5+ messages in thread From: Simon Horman @ 2026-02-18 9:34 UTC (permalink / raw) To: Hyunwoo Kim; +Cc: john.fastabend, kuba, sd, davem, edumazet, pabeni, netdev On Wed, Feb 18, 2026 at 01:46:00AM +0900, Hyunwoo Kim wrote: > On Tue, Feb 17, 2026 at 03:37:01PM +0000, Simon Horman wrote: > > On Mon, Feb 16, 2026 at 06:51:50PM +0900, Hyunwoo Kim wrote: > > > After cancel_delayed_work_sync() is called from tls_sk_proto_close(), > > > tx_work_handler() can still be scheduled from paths such as the > > > Delayed ACK handler or ksoftirqd. > > > As a result, the tx_work_handler() worker may dereference a freed > > > TLS object. > > > > > > To prevent this race condition, cancel_delayed_work_sync() is > > > replaced with disable_delayed_work_sync(). > > > > > > Fixes: f87e62d45e51 ("net/tls: remove close callback sock unlock/lock around TX work flush") > > > Signed-off-by: Hyunwoo Kim <imv4bel@gmail.com> > > > > Hi Hyunwoo, > > > > Thanks for your patch(es). > > > > Some feedback on process from my side. > > You can read more about that at > > https://docs.kernel.org/process/maintainer-netdev.html > > > > * I think it would be good to mention how this problem was found. > > Hi Simon, > > Thank you for the feedback. > > This issue was found during a manual code audit. I will add this > information to the commit message. > > > > > > * As a bug fix for code present in the net tree, it should be targeted > > at that tree like this. > > > > Subject: [PATCH net]: ... > > > > * Looking over git history, it seems that an appropriate prefix > > for patches for this code is 'tls: ' > > > > Subject [PATCH net]: tls: ... > > > > * Also, please try to make the subject a bit more succinct > > > > > --- > > > net/tls/tls_sw.c | 2 +- > > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > > > diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c > > > index 9937d4c810f2..b1fa62de9dab 100644 > > > --- a/net/tls/tls_sw.c > > > +++ b/net/tls/tls_sw.c > > > @@ -2533,7 +2533,7 @@ void tls_sw_cancel_work_tx(struct tls_context *tls_ctx) > > > > > > set_bit(BIT_TX_CLOSING, &ctx->tx_bitmask); > > > set_bit(BIT_TX_SCHEDULED, &ctx->tx_bitmask); > > > - cancel_delayed_work_sync(&ctx->tx_work.work); > > > + disable_delayed_work_sync(&ctx->tx_work.work); > > > } > > > > > > void tls_sw_release_resources_tx(struct sock *sk) > > > -- > > > 2.43.0 > > > --- > > > > > > Dear, > > > > > > The following is a simplified scenario illustrating how each race can occur. Since tls_sw_cancel_work_tx() does not hold lock_sock(), it can race with tls_write_space(). > > > ``` > > > cpu0 cpu1 > > > > > > tls_sk_proto_close() > > > tls_sw_cancel_work_tx() > > > tls_write_space() > > > tls_sw_write_space() > > > if (!test_and_set_bit(BIT_TX_SCHEDULED, &tx_ctx->tx_bitmask)) > > > set_bit(BIT_TX_SCHEDULED, &ctx->tx_bitmask); > > > cancel_delayed_work_sync(&ctx->tx_work.work); > > > schedule_delayed_work(&tx_ctx->tx_work.work, 0); > > > ``` > > > > I think that the text above would be best included in the patch description. > > At least for me it is fundamental to understanding the problem. > > Understood. I will add the race scenario description to the patch. > > > > > > > > > Best regards, > > > Hyunwoo Kim > > > > > > > I see three similar patches on the mailing list. > > The comments above go for them too. > > And It would probably be useful to just handle one at a time, > > to allow for proper feedback. Or bundle them in a patchset. > > Since the core of this bug pattern is espintcp, I will submit a revised > espintcp v2 patch shortly. > I would appreciate it if you could review the espintcp v2 patch. > Once the espintcp review is done, I will apply the feedback to the > remaining strparser and tls patches as well. Thanks. I have looked over the espintcp patch. It looks good to me. But I think it would be good to include some sort of race analysis in the commit message: something similar to the cpu0 / cpu1 text above, but perhaps with a different case explained. Also, I think the other comments above apply to that patch too. I will respond to that patch pointing to this message. > > Best regards, > Hyunwoo Kim > > > > > -- > > pw-bot: changes-requested > ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH] net: tls: Use disable_delayed_work_sync() instead of cancel_delayed_work_sync() in tls_sw_cancel_work_tx() 2026-02-18 9:34 ` Simon Horman @ 2026-02-18 9:36 ` Simon Horman 0 siblings, 0 replies; 5+ messages in thread From: Simon Horman @ 2026-02-18 9:36 UTC (permalink / raw) To: Hyunwoo Kim; +Cc: john.fastabend, kuba, sd, davem, edumazet, pabeni, netdev On Wed, Feb 18, 2026 at 09:34:32AM +0000, Simon Horman wrote: > On Wed, Feb 18, 2026 at 01:46:00AM +0900, Hyunwoo Kim wrote: > > On Tue, Feb 17, 2026 at 03:37:01PM +0000, Simon Horman wrote: > > > On Mon, Feb 16, 2026 at 06:51:50PM +0900, Hyunwoo Kim wrote: > > > > After cancel_delayed_work_sync() is called from tls_sk_proto_close(), > > > > tx_work_handler() can still be scheduled from paths such as the > > > > Delayed ACK handler or ksoftirqd. > > > > As a result, the tx_work_handler() worker may dereference a freed > > > > TLS object. > > > > > > > > To prevent this race condition, cancel_delayed_work_sync() is > > > > replaced with disable_delayed_work_sync(). > > > > > > > > Fixes: f87e62d45e51 ("net/tls: remove close callback sock unlock/lock around TX work flush") > > > > Signed-off-by: Hyunwoo Kim <imv4bel@gmail.com> > > > > > > Hi Hyunwoo, > > > > > > Thanks for your patch(es). > > > > > > Some feedback on process from my side. > > > You can read more about that at > > > https://docs.kernel.org/process/maintainer-netdev.html > > > > > > * I think it would be good to mention how this problem was found. > > > > Hi Simon, > > > > Thank you for the feedback. > > > > This issue was found during a manual code audit. I will add this > > information to the commit message. > > > > > > > > > > * As a bug fix for code present in the net tree, it should be targeted > > > at that tree like this. > > > > > > Subject: [PATCH net]: ... > > > > > > * Looking over git history, it seems that an appropriate prefix > > > for patches for this code is 'tls: ' > > > > > > Subject [PATCH net]: tls: ... > > > > > > * Also, please try to make the subject a bit more succinct > > > > > > > --- > > > > net/tls/tls_sw.c | 2 +- > > > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > > > > > diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c > > > > index 9937d4c810f2..b1fa62de9dab 100644 > > > > --- a/net/tls/tls_sw.c > > > > +++ b/net/tls/tls_sw.c > > > > @@ -2533,7 +2533,7 @@ void tls_sw_cancel_work_tx(struct tls_context *tls_ctx) > > > > > > > > set_bit(BIT_TX_CLOSING, &ctx->tx_bitmask); > > > > set_bit(BIT_TX_SCHEDULED, &ctx->tx_bitmask); > > > > - cancel_delayed_work_sync(&ctx->tx_work.work); > > > > + disable_delayed_work_sync(&ctx->tx_work.work); > > > > } > > > > > > > > void tls_sw_release_resources_tx(struct sock *sk) > > > > -- > > > > 2.43.0 > > > > --- > > > > > > > > Dear, > > > > > > > > The following is a simplified scenario illustrating how each race can occur. Since tls_sw_cancel_work_tx() does not hold lock_sock(), it can race with tls_write_space(). > > > > ``` > > > > cpu0 cpu1 > > > > > > > > tls_sk_proto_close() > > > > tls_sw_cancel_work_tx() > > > > tls_write_space() > > > > tls_sw_write_space() > > > > if (!test_and_set_bit(BIT_TX_SCHEDULED, &tx_ctx->tx_bitmask)) > > > > set_bit(BIT_TX_SCHEDULED, &ctx->tx_bitmask); > > > > cancel_delayed_work_sync(&ctx->tx_work.work); > > > > schedule_delayed_work(&tx_ctx->tx_work.work, 0); > > > > ``` > > > > > > I think that the text above would be best included in the patch description. > > > At least for me it is fundamental to understanding the problem. > > > > Understood. I will add the race scenario description to the patch. > > > > > > > > > > > > > Best regards, > > > > Hyunwoo Kim > > > > > > > > > > I see three similar patches on the mailing list. > > > The comments above go for them too. > > > And It would probably be useful to just handle one at a time, > > > to allow for proper feedback. Or bundle them in a patchset. > > > > Since the core of this bug pattern is espintcp, I will submit a revised > > espintcp v2 patch shortly. > > I would appreciate it if you could review the espintcp v2 patch. > > Once the espintcp review is done, I will apply the feedback to the > > remaining strparser and tls patches as well. > > Thanks. > > I have looked over the espintcp patch. > > It looks good to me. But I think it would be good to include > some sort of race analysis in the commit message: something similar > to the cpu0 / cpu1 text above, but perhaps with a different case explained. > > Also, I think the other comments above apply to that patch too. > > I will respond to that patch pointing to this message. Sorry, I was a bit hasty there. The comments above are for v1. I'll look over v2 and respond there. ^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2026-02-18 9:36 UTC | newest] Thread overview: 5+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2026-02-16 9:51 [PATCH] net: tls: Use disable_delayed_work_sync() instead of cancel_delayed_work_sync() in tls_sw_cancel_work_tx() Hyunwoo Kim 2026-02-17 15:37 ` Simon Horman 2026-02-17 16:46 ` Hyunwoo Kim 2026-02-18 9:34 ` Simon Horman 2026-02-18 9:36 ` Simon Horman
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox