public inbox for netdev@vger.kernel.org
* [PATCH net v2] tls: Fix race condition in tls_sw_cancel_work_tx()
@ 2026-02-20  9:40 Hyunwoo Kim
  2026-02-22 17:44 ` Simon Horman
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: Hyunwoo Kim @ 2026-02-20  9:40 UTC
  To: john.fastabend, kuba, sd, davem, edumazet, pabeni, horms; +Cc: netdev, imv4bel

This issue was discovered during a code audit.

After tls_sk_proto_close() calls cancel_delayed_work_sync(),
tx_work_handler() can still be scheduled from paths such as the
delayed ACK handler or ksoftirqd.
As a result, the tx_work_handler() worker may run after the TLS
object has been freed and dereference it.

The following is a simple race scenario:

          cpu0                         cpu1

tls_sk_proto_close()
  tls_sw_cancel_work_tx()
                                 tls_write_space()
                                   tls_sw_write_space()
                                     if (!test_and_set_bit(BIT_TX_SCHEDULED, &tx_ctx->tx_bitmask))
    set_bit(BIT_TX_SCHEDULED, &ctx->tx_bitmask);
    cancel_delayed_work_sync(&ctx->tx_work.work);
                                     schedule_delayed_work(&tx_ctx->tx_work.work, 0);

To prevent this race condition, cancel_delayed_work_sync() is 
replaced with disable_delayed_work_sync().
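
The difference between the two calls can be illustrated with a small
userspace model. This is only a sketch and not kernel code: struct
model_work, the model_*() helpers and replay_race() are hypothetical
names made up for illustration, and the model is single-threaded, so
it replays the interleaving from the scenario above rather than
reproducing a real race. It assumes only the documented behaviour
that a disabled work item rejects later queueing attempts.

```c
#include <stdbool.h>

/* Toy model of a delayed work item. Hypothetical, not the kernel struct. */
struct model_work {
	bool pending;       /* queued, handler has not run yet */
	int  disable_depth; /* > 0: queueing attempts are rejected */
};

/* Models schedule_delayed_work(): returns true if the work was queued. */
static bool model_schedule(struct model_work *w)
{
	if (w->disable_depth > 0 || w->pending)
		return false;
	w->pending = true;
	return true;
}

/* Models cancel_delayed_work_sync(): drops a pending item, but later
 * queueing attempts are still accepted. */
static void model_cancel_sync(struct model_work *w)
{
	w->pending = false;
}

/* Models disable_delayed_work_sync(): drops a pending item and also
 * rejects queueing attempts made afterwards. */
static void model_disable_sync(struct model_work *w)
{
	w->disable_depth++;
	w->pending = false;
}

/* Replays the interleaving from the race scenario. Returns true if the
 * work is still queued once the close path has finished. */
static bool replay_race(bool use_disable)
{
	struct model_work w = { .pending = false, .disable_depth = 0 };
	bool scheduled_bit = false;

	/* cpu1: test_and_set_bit() sees the bit clear and sets it */
	bool cpu1_will_schedule = !scheduled_bit;
	scheduled_bit = true;

	/* cpu0: set_bit() is a no-op by now, then cancel or disable */
	scheduled_bit = true;
	if (use_disable)
		model_disable_sync(&w);
	else
		model_cancel_sync(&w);

	/* cpu1: schedule_delayed_work() arrives after cpu0 has finished */
	if (cpu1_will_schedule)
		model_schedule(&w);

	return w.pending;
}
```

In this model replay_race(false) leaves the work queued after the
close path, which corresponds to the handler later touching freed
memory, while replay_race(true) does not.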

Fixes: f87e62d45e51 ("net/tls: remove close callback sock unlock/lock around TX work flush")
Signed-off-by: Hyunwoo Kim <imv4bel@gmail.com>
---
Changes in v2:
- Shorten the patch subject
- Target the net tree
- Add the bug discovery background and the race scenario to the commit message
- v1: https://lore.kernel.org/all/aZLotq3aZY0b-dI8@v4bel/
---
 net/tls/tls_sw.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c
index 9937d4c810f2..b1fa62de9dab 100644
--- a/net/tls/tls_sw.c
+++ b/net/tls/tls_sw.c
@@ -2533,7 +2533,7 @@ void tls_sw_cancel_work_tx(struct tls_context *tls_ctx)
 
 	set_bit(BIT_TX_CLOSING, &ctx->tx_bitmask);
 	set_bit(BIT_TX_SCHEDULED, &ctx->tx_bitmask);
-	cancel_delayed_work_sync(&ctx->tx_work.work);
+	disable_delayed_work_sync(&ctx->tx_work.work);
 }
 
 void tls_sw_release_resources_tx(struct sock *sk)
-- 
2.43.0



* Re: [PATCH net v2] tls: Fix race condition in tls_sw_cancel_work_tx()
  2026-02-20  9:40 [PATCH net v2] tls: Fix race condition in tls_sw_cancel_work_tx() Hyunwoo Kim
@ 2026-02-22 17:44 ` Simon Horman
  2026-02-23 16:18 ` Sabrina Dubroca
  2026-02-24  1:30 ` patchwork-bot+netdevbpf
  2 siblings, 0 replies; 4+ messages in thread
From: Simon Horman @ 2026-02-22 17:44 UTC
  To: Hyunwoo Kim; +Cc: john.fastabend, kuba, sd, davem, edumazet, pabeni, netdev

On Fri, Feb 20, 2026 at 06:40:36PM +0900, Hyunwoo Kim wrote:
> This issue was discovered during a code audit.
> 
> After cancel_delayed_work_sync() is called from tls_sk_proto_close(), 
> tx_work_handler() can still be scheduled from paths such as the 
> Delayed ACK handler or ksoftirqd.
> As a result, the tx_work_handler() worker may dereference a freed 
> TLS object.
> 
> The following is a simple race scenario:
> 
>           cpu0                         cpu1
> 
> tls_sk_proto_close()
>   tls_sw_cancel_work_tx()
>                                  tls_write_space()
>                                    tls_sw_write_space()
>                                      if (!test_and_set_bit(BIT_TX_SCHEDULED, &tx_ctx->tx_bitmask))
>     set_bit(BIT_TX_SCHEDULED, &ctx->tx_bitmask);
>     cancel_delayed_work_sync(&ctx->tx_work.work);
>                                      schedule_delayed_work(&tx_ctx->tx_work.work, 0);
> 
> To prevent this race condition, cancel_delayed_work_sync() is 
> replaced with disable_delayed_work_sync().
> 
> Fixes: f87e62d45e51 ("net/tls: remove close callback sock unlock/lock around TX work flush")
> Signed-off-by: Hyunwoo Kim <imv4bel@gmail.com>
> ---
> Changes in v2:
> - Shorten the patch subject
> - Target the net tree
> - Add the bug discovery background and the race scenario to the commit message
> - v1: https://lore.kernel.org/all/aZLotq3aZY0b-dI8@v4bel/

Reviewed-by: Simon Horman <horms@kernel.org>



* Re: [PATCH net v2] tls: Fix race condition in tls_sw_cancel_work_tx()
  2026-02-20  9:40 [PATCH net v2] tls: Fix race condition in tls_sw_cancel_work_tx() Hyunwoo Kim
  2026-02-22 17:44 ` Simon Horman
@ 2026-02-23 16:18 ` Sabrina Dubroca
  2026-02-24  1:30 ` patchwork-bot+netdevbpf
  2 siblings, 0 replies; 4+ messages in thread
From: Sabrina Dubroca @ 2026-02-23 16:18 UTC
  To: Hyunwoo Kim; +Cc: john.fastabend, kuba, davem, edumazet, pabeni, horms, netdev

2026-02-20, 18:40:36 +0900, Hyunwoo Kim wrote:
> This issue was discovered during a code audit.
> 
> After cancel_delayed_work_sync() is called from tls_sk_proto_close(), 
> tx_work_handler() can still be scheduled from paths such as the 
> Delayed ACK handler or ksoftirqd.
> As a result, the tx_work_handler() worker may dereference a freed 
> TLS object.
> 
> The following is a simple race scenario:
> 
>           cpu0                         cpu1
> 
> tls_sk_proto_close()
>   tls_sw_cancel_work_tx()
>                                  tls_write_space()
>                                    tls_sw_write_space()
>                                      if (!test_and_set_bit(BIT_TX_SCHEDULED, &tx_ctx->tx_bitmask))
>     set_bit(BIT_TX_SCHEDULED, &ctx->tx_bitmask);
>     cancel_delayed_work_sync(&ctx->tx_work.work);
>                                      schedule_delayed_work(&tx_ctx->tx_work.work, 0);

And possibly something similar with tls_encrypt_done() if async crypto
gets delayed by just the right amount.

> To prevent this race condition, cancel_delayed_work_sync() is 
> replaced with disable_delayed_work_sync().
> 
> Fixes: f87e62d45e51 ("net/tls: remove close callback sock unlock/lock around TX work flush")
> Signed-off-by: Hyunwoo Kim <imv4bel@gmail.com>
> ---
> Changes in v2:
> - Shorten the patch subject
> - Target the net tree
> - Add the bug discovery background and the race scenario to the commit message
> - v1: https://lore.kernel.org/all/aZLotq3aZY0b-dI8@v4bel/
> ---
>  net/tls/tls_sw.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c
> index 9937d4c810f2..b1fa62de9dab 100644
> --- a/net/tls/tls_sw.c
> +++ b/net/tls/tls_sw.c
> @@ -2533,7 +2533,7 @@ void tls_sw_cancel_work_tx(struct tls_context *tls_ctx)
>  
>  	set_bit(BIT_TX_CLOSING, &ctx->tx_bitmask);
>  	set_bit(BIT_TX_SCHEDULED, &ctx->tx_bitmask);
> -	cancel_delayed_work_sync(&ctx->tx_work.work);
> +	disable_delayed_work_sync(&ctx->tx_work.work);

This seems ok.
Reviewed-by: Sabrina Dubroca <sd@queasysnail.net>

It would maybe be a bit "cleaner" to reorder the operations in
tls_sk_proto_close() to first stop all async encryptions and switch
the CBs so that tls_write_space can't be called anymore, and then
cancel the tx_work (once we know nothing can reschedule it anymore),
but this should be fine.
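
The ordering argument above can be sketched with a toy userspace
model. It is not a proposed patch and does not reflect the real
tls_sk_proto_close() code: struct model_tx, its fields and every
function below are hypothetical, and "producers" stands in for both
the write_space callback and in-flight async encryptions.

```c
#include <stdbool.h>

/* Hypothetical toy state; these fields do not exist in the TLS code. */
struct model_tx {
	bool producers_active; /* callbacks or async crypto can still fire */
	bool work_pending;     /* tx_work is queued */
};

/* A late write_space or encrypt-done event tries to queue the work. */
static void model_producer_fires(struct model_tx *t)
{
	if (t->producers_active)
		t->work_pending = true;
}

static void model_stop_producers(struct model_tx *t)
{
	t->producers_active = false;
}

static void model_cancel_work(struct model_tx *t)
{
	t->work_pending = false;
}

/* Returns true if the work is still queued after the close sequence. */
static bool work_left_after_close(bool stop_producers_first)
{
	struct model_tx t = { .producers_active = true, .work_pending = false };

	if (stop_producers_first) {
		model_stop_producers(&t);
		model_producer_fires(&t); /* late event, ignored */
		model_cancel_work(&t);
	} else {
		model_cancel_work(&t);
		model_producer_fires(&t); /* late event, requeues the work */
		model_stop_producers(&t);
	}
	return t.work_pending;
}
```

With plain cancel semantics, only the "stop producers first" order
leaves nothing queued in this model; the applied patch reaches the
same end state by disabling the work item instead of reordering.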

>  }
>  
>  void tls_sw_release_resources_tx(struct sock *sk)
> -- 
> 2.43.0
> 

-- 
Sabrina


* Re: [PATCH net v2] tls: Fix race condition in tls_sw_cancel_work_tx()
  2026-02-20  9:40 [PATCH net v2] tls: Fix race condition in tls_sw_cancel_work_tx() Hyunwoo Kim
  2026-02-22 17:44 ` Simon Horman
  2026-02-23 16:18 ` Sabrina Dubroca
@ 2026-02-24  1:30 ` patchwork-bot+netdevbpf
  2 siblings, 0 replies; 4+ messages in thread
From: patchwork-bot+netdevbpf @ 2026-02-24  1:30 UTC
  To: Hyunwoo Kim
  Cc: john.fastabend, kuba, sd, davem, edumazet, pabeni, horms, netdev

Hello:

This patch was applied to netdev/net.git (main)
by Jakub Kicinski <kuba@kernel.org>:

On Fri, 20 Feb 2026 18:40:36 +0900 you wrote:
> This issue was discovered during a code audit.
> 
> After cancel_delayed_work_sync() is called from tls_sk_proto_close(),
> tx_work_handler() can still be scheduled from paths such as the
> Delayed ACK handler or ksoftirqd.
> As a result, the tx_work_handler() worker may dereference a freed
> TLS object.
> 
> [...]

Here is the summary with links:
  - [net,v2] tls: Fix race condition in tls_sw_cancel_work_tx()
    https://git.kernel.org/netdev/net/c/7bb09315f93d

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html


