* [PATCH] net/tls: Fix slab-use-after-free in tls_encrypt_done
@ 2023-10-12 11:02 Juntong Deng
2023-10-16 9:50 ` Simon Horman
2023-10-17 10:31 ` Paolo Abeni
0 siblings, 2 replies; 5+ messages in thread
From: Juntong Deng @ 2023-10-12 11:02 UTC (permalink / raw)
To: borisp, john.fastabend, kuba, davem, edumazet, pabeni
Cc: netdev, linux-kernel, linux-kernel-mentees,
syzbot+29c22ea2d6b2c5fd2eae
In the current implementation, ctx->async_wait.completion is completed
after spin_lock_bh, which causes tls_sw_release_resources_tx to
continue executing and return to tls_sk_proto_cleanup, then return
to tls_sk_proto_close, and after that enter tls_sw_free_ctx_tx to kfree
the entire struct tls_context (including ctx->encrypt_compl_lock).
Since ctx->encrypt_compl_lock has been freed, subsequent spin_unlock_bh
will result in slab-use-after-free error. Due to SMP, even using
spin_lock_bh does not prevent tls_sw_release_resources_tx from continuing
on other CPUs. After tls_sw_release_resources_tx is woken up, there is no
attempt to hold ctx->encrypt_compl_lock again, therefore everything
described above is possible.
The fix is to put complete(&ctx->async_wait.completion) after
spin_unlock_bh, making the release after the unlock. Since complete is
only executed if pending is 0, which means this is the last record, there
is no need to worry about race condition causing duplicate completes.
Reported-by: syzbot+29c22ea2d6b2c5fd2eae@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=29c22ea2d6b2c5fd2eae
Signed-off-by: Juntong Deng <juntong.deng@outlook.com>
---
net/tls/tls_sw.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c
index 270712b8d391..7abe5a6aa989 100644
--- a/net/tls/tls_sw.c
+++ b/net/tls/tls_sw.c
@@ -441,6 +441,7 @@ static void tls_encrypt_done(void *data, int err)
struct sk_msg *msg_en;
bool ready = false;
struct sock *sk;
+ int async_notify;
int pending;
msg_en = &rec->msg_encrypted;
@@ -482,10 +483,11 @@ static void tls_encrypt_done(void *data, int err)
spin_lock_bh(&ctx->encrypt_compl_lock);
pending = atomic_dec_return(&ctx->encrypt_pending);
+ async_notify = ctx->async_notify;
+ spin_unlock_bh(&ctx->encrypt_compl_lock);
- if (!pending && ctx->async_notify)
+ if (!pending && async_notify)
complete(&ctx->async_wait.completion);
- spin_unlock_bh(&ctx->encrypt_compl_lock);
if (!ready)
return;
--
2.39.2
^ permalink raw reply related [flat|nested] 5+ messages in thread
* Re: [PATCH] net/tls: Fix slab-use-after-free in tls_encrypt_done
2023-10-12 11:02 [PATCH] net/tls: Fix slab-use-after-free in tls_encrypt_done Juntong Deng
@ 2023-10-16 9:50 ` Simon Horman
2023-10-17 10:25 ` Juntong Deng
2023-10-17 10:31 ` Paolo Abeni
1 sibling, 1 reply; 5+ messages in thread
From: Simon Horman @ 2023-10-16 9:50 UTC (permalink / raw)
To: Juntong Deng
Cc: borisp, john.fastabend, kuba, davem, edumazet, pabeni, netdev,
linux-kernel, linux-kernel-mentees, syzbot+29c22ea2d6b2c5fd2eae
On Thu, Oct 12, 2023 at 07:02:51PM +0800, Juntong Deng wrote:
> In the current implementation, ctx->async_wait.completion is completed
> after spin_lock_bh, which causes tls_sw_release_resources_tx to
> continue executing and return to tls_sk_proto_cleanup, then return
Hi Juntong Deng,
I'm slightly confused by "causes tls_sw_release_resources_tx to continue
executing".
What I see in tls_sw_release_resources_tx() is:
/* Wait for any pending async encryptions to complete */
spin_lock_bh(&ctx->encrypt_compl_lock);
ctx->async_notify = true;
pending = atomic_read(&ctx->encrypt_pending);
spin_unlock_bh(&ctx->encrypt_compl_lock);
Am I wrong in thinking the above will block because
(the same) ctx->encrypt_compl_lock is held in tls_encrypt_done?
> to tls_sk_proto_close, and after that enter tls_sw_free_ctx_tx to kfree
> the entire struct tls_context (including ctx->encrypt_compl_lock).
>
> Since ctx->encrypt_compl_lock has been freed, subsequent spin_unlock_bh
> will result in slab-use-after-free error. Due to SMP, even using
> spin_lock_bh does not prevent tls_sw_release_resources_tx from continuing
> on other CPUs. After tls_sw_release_resources_tx is woken up, there is no
> attempt to hold ctx->encrypt_compl_lock again, therefore everything
> described above is possible.
>
> The fix is to put complete(&ctx->async_wait.completion) after
> spin_unlock_bh, making the release after the unlock. Since complete is
> only executed if pending is 0, which means this is the last record, there
> is no need to worry about race condition causing duplicate completes.
>
> Reported-by: syzbot+29c22ea2d6b2c5fd2eae@syzkaller.appspotmail.com
> Closes: https://syzkaller.appspot.com/bug?extid=29c22ea2d6b2c5fd2eae
> Signed-off-by: Juntong Deng <juntong.deng@outlook.com>
> ---
> net/tls/tls_sw.c | 6 ++++--
> 1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c
> index 270712b8d391..7abe5a6aa989 100644
> --- a/net/tls/tls_sw.c
> +++ b/net/tls/tls_sw.c
> @@ -441,6 +441,7 @@ static void tls_encrypt_done(void *data, int err)
> struct sk_msg *msg_en;
> bool ready = false;
> struct sock *sk;
> + int async_notify;
> int pending;
>
> msg_en = &rec->msg_encrypted;
> @@ -482,10 +483,11 @@ static void tls_encrypt_done(void *data, int err)
>
> spin_lock_bh(&ctx->encrypt_compl_lock);
> pending = atomic_dec_return(&ctx->encrypt_pending);
> + async_notify = ctx->async_notify;
> + spin_unlock_bh(&ctx->encrypt_compl_lock);
>
> - if (!pending && ctx->async_notify)
> + if (!pending && async_notify)
> complete(&ctx->async_wait.completion);
> - spin_unlock_bh(&ctx->encrypt_compl_lock);
>
> if (!ready)
> return;
> --
> 2.39.2
>
>
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH] net/tls: Fix slab-use-after-free in tls_encrypt_done
2023-10-16 9:50 ` Simon Horman
@ 2023-10-17 10:25 ` Juntong Deng
0 siblings, 0 replies; 5+ messages in thread
From: Juntong Deng @ 2023-10-17 10:25 UTC (permalink / raw)
To: Simon Horman
Cc: borisp, john.fastabend, kuba, davem, edumazet, pabeni, netdev,
linux-kernel, linux-kernel-mentees, syzbot+29c22ea2d6b2c5fd2eae
On 2023/10/16 17:50, Simon Horman wrote:
> On Thu, Oct 12, 2023 at 07:02:51PM +0800, Juntong Deng wrote:
>> In the current implementation, ctx->async_wait.completion is completed
>> after spin_lock_bh, which causes tls_sw_release_resources_tx to
>> continue executing and return to tls_sk_proto_cleanup, then return
>
> Hi Juntong Deng,
>
> I'm slightly confused by "causes tls_sw_release_resources_tx to continue
> executing".
>
> What I see in tls_sw_release_resources_tx() is:
>
> /* Wait for any pending async encryptions to complete */
> spin_lock_bh(&ctx->encrypt_compl_lock);
> ctx->async_notify = true;
> pending = atomic_read(&ctx->encrypt_pending);
> spin_unlock_bh(&ctx->encrypt_compl_lock);
>
> Am I wrong in thinking the above will block because
> (the same) ctx->encrypt_compl_lock is held in tls_encrypt_done?
>
Hi Simon Horman,
What I mean is that tls_sw_release_resources_tx will pause at
crypto_wait_req(-EINPROGRESS, &ctx->async_wait) because crypto_wait_req
call wait_for_completion.
Then after tls_encrypt_done call complete(&ctx->async_wait.completion),
it will cause tls_sw_release_resources_tx to continue executing.
>> to tls_sk_proto_close, and after that enter tls_sw_free_ctx_tx to kfree
>> the entire struct tls_context (including ctx->encrypt_compl_lock).
>>
>> Since ctx->encrypt_compl_lock has been freed, subsequent spin_unlock_bh
>> will result in slab-use-after-free error. Due to SMP, even using
>> spin_lock_bh does not prevent tls_sw_release_resources_tx from continuing
>> on other CPUs. After tls_sw_release_resources_tx is woken up, there is no
>> attempt to hold ctx->encrypt_compl_lock again, therefore everything
>> described above is possible.
>>
>> The fix is to put complete(&ctx->async_wait.completion) after
>> spin_unlock_bh, making the release after the unlock. Since complete is
>> only executed if pending is 0, which means this is the last record, there
>> is no need to worry about race condition causing duplicate completes.
>>
>> Reported-by: syzbot+29c22ea2d6b2c5fd2eae@syzkaller.appspotmail.com
>> Closes: https://syzkaller.appspot.com/bug?extid=29c22ea2d6b2c5fd2eae
>> Signed-off-by: Juntong Deng <juntong.deng@outlook.com>
>> ---
>> net/tls/tls_sw.c | 6 ++++--
>> 1 file changed, 4 insertions(+), 2 deletions(-)
>>
>> diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c
>> index 270712b8d391..7abe5a6aa989 100644
>> --- a/net/tls/tls_sw.c
>> +++ b/net/tls/tls_sw.c
>> @@ -441,6 +441,7 @@ static void tls_encrypt_done(void *data, int err)
>> struct sk_msg *msg_en;
>> bool ready = false;
>> struct sock *sk;
>> + int async_notify;
>> int pending;
>>
>> msg_en = &rec->msg_encrypted;
>> @@ -482,10 +483,11 @@ static void tls_encrypt_done(void *data, int err)
>>
>> spin_lock_bh(&ctx->encrypt_compl_lock);
>> pending = atomic_dec_return(&ctx->encrypt_pending);
>> + async_notify = ctx->async_notify;
>> + spin_unlock_bh(&ctx->encrypt_compl_lock);
>>
>> - if (!pending && ctx->async_notify)
>> + if (!pending && async_notify)
>> complete(&ctx->async_wait.completion);
>> - spin_unlock_bh(&ctx->encrypt_compl_lock);
>>
>> if (!ready)
>> return;
>> --
>> 2.39.2
>>
>>
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH] net/tls: Fix slab-use-after-free in tls_encrypt_done
2023-10-12 11:02 [PATCH] net/tls: Fix slab-use-after-free in tls_encrypt_done Juntong Deng
2023-10-16 9:50 ` Simon Horman
@ 2023-10-17 10:31 ` Paolo Abeni
2023-10-17 11:49 ` Juntong Deng
1 sibling, 1 reply; 5+ messages in thread
From: Paolo Abeni @ 2023-10-17 10:31 UTC (permalink / raw)
To: Juntong Deng, borisp, john.fastabend, kuba, davem, edumazet
Cc: netdev, linux-kernel-mentees, linux-kernel,
syzbot+29c22ea2d6b2c5fd2eae
On Thu, 2023-10-12 at 19:02 +0800, Juntong Deng wrote:
> In the current implementation, ctx->async_wait.completion is completed
> after spin_lock_bh, which causes tls_sw_release_resources_tx to
> continue executing and return to tls_sk_proto_cleanup, then return
> to tls_sk_proto_close, and after that enter tls_sw_free_ctx_tx to kfree
> the entire struct tls_context (including ctx->encrypt_compl_lock).
>
> Since ctx->encrypt_compl_lock has been freed, subsequent spin_unlock_bh
> will result in slab-use-after-free error. Due to SMP, even using
> spin_lock_bh does not prevent tls_sw_release_resources_tx from continuing
> on other CPUs. After tls_sw_release_resources_tx is woken up, there is no
> attempt to hold ctx->encrypt_compl_lock again, therefore everything
> described above is possible.
>
> The fix is to put complete(&ctx->async_wait.completion) after
> spin_unlock_bh, making the release after the unlock. Since complete is
> only executed if pending is 0, which means this is the last record, there
> is no need to worry about race condition causing duplicate completes.
>
> Reported-by: syzbot+29c22ea2d6b2c5fd2eae@syzkaller.appspotmail.com
> Closes: https://syzkaller.appspot.com/bug?extid=29c22ea2d6b2c5fd2eae
> Signed-off-by: Juntong Deng <juntong.deng@outlook.com>
Have you tested this patch vs the syzbot reproducer?
I think the following race is still present:
CPU0 CPU1
tls_sw_release_resources_tx tls_encrypt_done
spin_lock_bh
spin_unlock_bh
spin_lock_bh
spin_unlock_bh
complete
wait
// ...
tls_sk_proto_close
test_and_set_bit(BIT_TX_SCHEDULED, &ctx->tx_bitmask
// UaF
regardless of 'complete()' being invoked before or after the
'spin_unlock_bh()'.
Paolo
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH] net/tls: Fix slab-use-after-free in tls_encrypt_done
2023-10-17 10:31 ` Paolo Abeni
@ 2023-10-17 11:49 ` Juntong Deng
0 siblings, 0 replies; 5+ messages in thread
From: Juntong Deng @ 2023-10-17 11:49 UTC (permalink / raw)
To: Paolo Abeni, borisp, john.fastabend, kuba, davem, edumazet
Cc: netdev, linux-kernel-mentees, linux-kernel,
syzbot+29c22ea2d6b2c5fd2eae
On 2023/10/17 18:31, Paolo Abeni wrote:
> On Thu, 2023-10-12 at 19:02 +0800, Juntong Deng wrote:
>> In the current implementation, ctx->async_wait.completion is completed
>> after spin_lock_bh, which causes tls_sw_release_resources_tx to
>> continue executing and return to tls_sk_proto_cleanup, then return
>> to tls_sk_proto_close, and after that enter tls_sw_free_ctx_tx to kfree
>> the entire struct tls_context (including ctx->encrypt_compl_lock).
>>
>> Since ctx->encrypt_compl_lock has been freed, subsequent spin_unlock_bh
>> will result in slab-use-after-free error. Due to SMP, even using
>> spin_lock_bh does not prevent tls_sw_release_resources_tx from continuing
>> on other CPUs. After tls_sw_release_resources_tx is woken up, there is no
>> attempt to hold ctx->encrypt_compl_lock again, therefore everything
>> described above is possible.
>>
>> The fix is to put complete(&ctx->async_wait.completion) after
>> spin_unlock_bh, making the release after the unlock. Since complete is
>> only executed if pending is 0, which means this is the last record, there
>> is no need to worry about race condition causing duplicate completes.
>>
>> Reported-by: syzbot+29c22ea2d6b2c5fd2eae@syzkaller.appspotmail.com
>> Closes: https://syzkaller.appspot.com/bug?extid=29c22ea2d6b2c5fd2eae
>> Signed-off-by: Juntong Deng <juntong.deng@outlook.com>
>
> Have you tested this patch vs the syzbot reproducer?
>
> I think the following race is still present:
>
> CPU0 CPU1
> tls_sw_release_resources_tx tls_encrypt_done
> spin_lock_bh
> spin_unlock_bh
> spin_lock_bh
> spin_unlock_bh
> complete
>
> wait
> // ...
> tls_sk_proto_close
>
> test_and_set_bit(BIT_TX_SCHEDULED, &ctx->tx_bitmask
> // UaF
>
> regardless of 'complete()' being invoked before or after the
> 'spin_unlock_bh()'.
>
> Paolo
>
Yes, I think you are right.
My previous thought was that test_and_set_bit() is only called if
'ready' is true, but 'ready' will only be true on the first record,
and complete() is only called when processing the last record.
I simply thought before that the first record would not be the last
record, so I thought before that the test_and_set_bit() would not be
called when complete() was called.
But your reply inspired me and I thought about it carefully and the
situation with only one record is possible.
I will make version 2 patch to solve this problem.
Thanks.
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2023-10-17 11:49 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-10-12 11:02 [PATCH] net/tls: Fix slab-use-after-free in tls_encrypt_done Juntong Deng
2023-10-16 9:50 ` Simon Horman
2023-10-17 10:25 ` Juntong Deng
2023-10-17 10:31 ` Paolo Abeni
2023-10-17 11:49 ` Juntong Deng
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).