Linux CIFS filesystem development
From: David Howells <dhowells@redhat.com>
To: Xin Long <lucien.xin@gmail.com>
Cc: network dev <netdev@vger.kernel.org>,
	davem@davemloft.net, kuba@kernel.org,
	Eric Dumazet <edumazet@google.com>,
	Paolo Abeni <pabeni@redhat.com>, Simon Horman <horms@kernel.org>,
	Stefan Metzmacher <metze@samba.org>,
	Moritz Buhl <mbuhl@openbsd.org>,
	Tyler Fanelli <tfanelli@redhat.com>,
	Pengtao He <hepengtao@xiaomi.com>,
	linux-cifs@vger.kernel.org, Steve French <smfrench@gmail.com>,
	Namjae Jeon <linkinjeon@kernel.org>,
	Paulo Alcantara <pc@manguebit.com>, Tom Talpey <tom@talpey.com>,
	kernel-tls-handshake@lists.linux.dev,
	Chuck Lever <chuck.lever@oracle.com>,
	Jeff Layton <jlayton@kernel.org>,
	Benjamin Coddington <bcodding@redhat.com>,
	Steve Dickson <steved@redhat.com>, Hannes Reinecke <hare@suse.de>,
	Alexander Aring <aahringo@redhat.com>,
	Cong Wang <xiyou.wangcong@gmail.com>,
	"D . Wythe" <alibuda@linux.alibaba.com>,
	Jason Baron <jbaron@akamai.com>,
	illiliti <illiliti@protonmail.com>,
	Sabrina Dubroca <sd@queasysnail.net>,
	Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>,
	Daniel Stenberg <daniel@haxx.se>,
	Andy Gospodarek <andrew.gospodarek@broadcom.com>
Subject: Re: [PATCH net-next 00/15] net: introduce QUIC infrastructure and core subcomponents
Date: Mon, 07 Jul 2025 09:40:44 +0100
Message-ID: <2334439.1751877644@warthog.procyon.org.uk>
In-Reply-To: <cover.1751743914.git.lucien.xin@gmail.com>


Xin Long <lucien.xin@gmail.com> wrote:

> Introduction
> ============
> 
> The QUIC protocol, as defined in RFC9000, offers a UDP-based, secure
> transport with flow-controlled streams for efficient communication,
> low-latency connection setup, and network path migration, ensuring
> confidentiality, integrity, and availability across various deployments.
> 
> This implementation introduces QUIC support in the Linux kernel, offering
> several key advantages:
> 
> - Seamless Integration for Kernel Subsystems: Kernel subsystems such as
>   SMB and NFS can operate over QUIC seamlessly after the handshake,
>   leveraging the net/handshake APIs.
> 
> - Standardized Socket APIs for QUIC: This implementation standardizes the
>   socket APIs for QUIC, covering essential operations like listen, accept,
>   connect, sendmsg, recvmsg, close, get/setsockopt, and getsock/peername().
> 
> - Efficient ALPN Routing: It incorporates ALPN routing within the kernel,
>   efficiently directing incoming requests to the appropriate applications
>   across different processes based on ALPN.
> 
> - Performance Enhancements: By minimizing data duplication through
>   zero-copy techniques such as sendfile(), and paving the way for crypto
>   offloading in NICs, this implementation enhances performance and prepares
>   for future optimizations.
> 
> This implementation offers fundamental support for the following RFCs:
> 
> - RFC9000 - QUIC: A UDP-Based Multiplexed and Secure Transport
> - RFC9001 - Using TLS to Secure QUIC
> - RFC9002 - QUIC Loss Detection and Congestion Control
> - RFC9221 - An Unreliable Datagram Extension to QUIC
> - RFC9287 - Greasing the QUIC Bit
> - RFC9368 - Compatible Version Negotiation for QUIC
> - RFC9369 - QUIC Version 2
> 
> The socket APIs for QUIC follow the RFC draft [1]:
> 
> - The Sockets API Extensions for In-kernel QUIC Implementations
> 
> Implementation
> ==============
> 
> The core idea is to implement QUIC within the kernel, using a userspace
> handshake approach.
> 
> Only the processing and creation of raw TLS Handshake Messages are handled
> in userspace, facilitated by a TLS library like GnuTLS. These messages are
> exchanged between kernel and userspace via sendmsg() and recvmsg(), with
> cryptographic details conveyed through control messages (cmsg).
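The exchange described above can be sketched from the userspace side as
follows. This is only an illustration of attaching a control message to
TLS handshake bytes before calling sendmsg(). The names EX_SOL_QUIC,
EX_QUIC_HANDSHAKE_INFO, struct ex_handshake_info and
ex_prepare_handshake_msg() are hypothetical, and the numeric values are
assumptions; the real definitions would come from the quic.h header in
the series.

```c
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>

/* Assumed values for illustration only; check the series' quic.h. */
#define EX_SOL_QUIC		288
#define EX_QUIC_HANDSHAKE_INFO	1

/* Hypothetical payload: the encryption level the TLS bytes belong to. */
struct ex_handshake_info {
	uint8_t crypto_level;
};

/*
 * Fill *msg so that it carries the raw TLS handshake bytes as data and one
 * control message describing them.  cbuf must be suitably aligned for a
 * struct cmsghdr and at least CMSG_SPACE(sizeof(struct ex_handshake_info))
 * bytes long.  Returns 0 on success, -1 if cbuf is too small.
 */
static int ex_prepare_handshake_msg(struct msghdr *msg, struct iovec *iov,
				    void *cbuf, size_t cbuf_len,
				    const void *tls_msg, size_t tls_len,
				    uint8_t level)
{
	struct ex_handshake_info info = { .crypto_level = level };
	struct cmsghdr *cmsg;

	if (cbuf_len < CMSG_SPACE(sizeof(info)))
		return -1;

	memset(msg, 0, sizeof(*msg));
	memset(cbuf, 0, cbuf_len);

	/* Data part: the TLS handshake message produced by the TLS library. */
	iov->iov_base = (void *)tls_msg;
	iov->iov_len = tls_len;
	msg->msg_iov = iov;
	msg->msg_iovlen = 1;

	/* Control part: a single cmsg carrying the handshake info. */
	msg->msg_control = cbuf;
	msg->msg_controllen = CMSG_SPACE(sizeof(info));

	cmsg = CMSG_FIRSTHDR(msg);
	cmsg->cmsg_level = EX_SOL_QUIC;
	cmsg->cmsg_type = EX_QUIC_HANDSHAKE_INFO;
	cmsg->cmsg_len = CMSG_LEN(sizeof(info));
	memcpy(CMSG_DATA(cmsg), &info, sizeof(info));

	return 0;
}
```

A caller would then pass the prepared msghdr to sendmsg() on the QUIC
socket. The receive direction would mirror this by walking the control
messages with CMSG_FIRSTHDR()/CMSG_NXTHDR() after recvmsg().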
> 
> The entire QUIC protocol, aside from the TLS Handshake Messages processing
> and creation, is managed within the kernel. Rather than using an Upper Layer
> Protocol (ULP) layer, this implementation establishes a socket of type
> IPPROTO_QUIC (similar to IPPROTO_MPTCP), operating over UDP tunnels.
> 
> Kernel consumers can initiate a handshake request from the kernel to
> userspace using the existing net/handshake netlink interface. A userspace
> component, such as the tlshd service [2], then processes the QUIC
> handshake request.
> 
> - Handshake Architecture:
> 
>   ┌──────┐  ┌──────┐
>   │ APP1 │  │ APP2 │ ...
>   └──────┘  └──────┘
>   ┌──────────────────────────────────────────┐
>   │     {quic_client/server_handshake()}     │<─────────────┐
>   └──────────────────────────────────────────┘       ┌─────────────┐
>    {send/recvmsg()}      {set/getsockopt()}          │    tlshd    │
>    [CMSG handshake_info] [SOCKOPT_CRYPTO_SECRET]     └─────────────┘
>                          [SOCKOPT_TRANSPORT_PARAM_EXT]    │   ^
>                 │ ^                  │ ^                  │   │
>   Userspace     │ │                  │ │                  │   │
>   ──────────────│─│──────────────────│─│──────────────────│───│───────
>   Kernel        │ │                  │ │                  │   │
>                 v │                  v │                  v   │
>   ┌──────────────────┬───────────────────────┐       ┌─────────────┐
>   │ protocol, timer, │ socket (IPPROTO_QUIC) │<──┐   │ handshake   │
>   │                  ├───────────────────────┤   │   │netlink APIs │
>   │ common, family,  │ outqueue  |  inqueue  │   │   └─────────────┘
>   │                  ├───────────────────────┤   │      │       │
>   │ stream, connid,  │         frame         │   │   ┌─────┐ ┌─────┐
>   │                  ├───────────────────────┤   │   │     │ │     │
>   │ path, pnspace,   │         packet        │   │───│ SMB │ │ NFS │...
>   │                  ├───────────────────────┤   │   │     │ │     │
>   │ cong, crypto     │       UDP tunnels     │   │   └─────┘ └─────┘
>   └──────────────────┴───────────────────────┘   └──────┴───────┘
> 
> - User Data Architecture:
> 
>   ┌──────┐  ┌──────┐
>   │ APP1 │  │ APP2 │ ...
>   └──────┘  └──────┘
>    {send/recvmsg()}   {set/getsockopt()}              {recvmsg()}
>    [CMSG stream_info] [SOCKOPT_KEY_UPDATE]            [EVENT conn update]
>                       [SOCKOPT_CONNECTION_MIGRATION]  [EVENT stream update]
>                       [SOCKOPT_STREAM_OPEN/RESET/STOP]
>                 │ ^               │ ^                     ^
>   Userspace     │ │               │ │                     │
>   ──────────────│─│───────────────│─│─────────────────────│───────────
>   Kernel        │ │               │ │                     │
>                 v │               v │  ┌──────────────────┘
>   ┌──────────────────┬───────────────────────┐
>   │ protocol, timer, │ socket (IPPROTO_QUIC) │<──┐{kernel_send/recvmsg()}
>   │                  ├───────────────────────┤   │{kernel_set/getsockopt()}
>   │ common, family,  │ outqueue  |  inqueue  │   │{kernel_recvmsg()}
>   │                  ├───────────────────────┤   │
>   │ stream, connid,  │         frame         │   │   ┌─────┐ ┌─────┐
>   │                  ├───────────────────────┤   │   │     │ │     │
>   │ path, pnspace,   │         packet        │   │───│ SMB │ │ NFS │...
>   │                  ├───────────────────────┤   │   │     │ │     │
>   │ cong, crypto     │       UDP tunnels     │   │   └─────┘ └─────┘
>   └──────────────────┴───────────────────────┘   └──────┴───────┘
> 
> Interface
> =========
> 
> This implementation maps QUIC onto the sockets API. Similar
> to TCP and SCTP, a typical Server and Client use the following system call
> sequence to communicate:
> 
>     Client                             Server
>   ──────────────────────────────────────────────────────────────────────
>   sockfd = socket(IPPROTO_QUIC)      listenfd = socket(IPPROTO_QUIC)
>   bind(sockfd)                       bind(listenfd)
>                                      listen(listenfd)
>   connect(sockfd)
>   quic_client_handshake(sockfd)
>                                      sockfd = accept(listenfd)
>                                      quic_server_handshake(sockfd, cert)
> 
>   sendmsg(sockfd)                    recvmsg(sockfd)
>   close(sockfd)                      close(sockfd)
>                                      close(listenfd)
> 
> Please note that the quic_client_handshake() and quic_server_handshake()
> functions are currently provided by libquic [3]. They receive and process
> the raw TLS handshake messages until the handshake completes.
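The client column of the table above might look roughly like this in C,
up to the point where the handshake would start. This is a minimal
sketch: the value 261 for IPPROTO_QUIC and the use of SOCK_DGRAM are
assumptions not stated in the text above, and ex_loopback_addr() and
ex_quic_client_open() are hypothetical helper names. On a kernel without
QUIC support, socket() is expected to fail and the helper returns a
negative errno.

```c
#include <arpa/inet.h>
#include <errno.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

#ifndef IPPROTO_QUIC
#define IPPROTO_QUIC 261	/* assumed; take the real value from the series */
#endif

/* Build an IPv4 loopback address for the given port (host byte order). */
static struct sockaddr_in ex_loopback_addr(uint16_t port)
{
	struct sockaddr_in sa;

	memset(&sa, 0, sizeof(sa));
	sa.sin_family = AF_INET;
	sa.sin_port = htons(port);
	sa.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
	return sa;
}

/*
 * socket() + bind() + connect() for a QUIC client.  Returns the socket fd
 * on success or a negative errno value on failure.  The TLS handshake
 * (quic_client_handshake() in the table above) is left to the caller.
 */
static int ex_quic_client_open(const struct sockaddr_in *local,
			       const struct sockaddr_in *peer)
{
	int fd, err;

	fd = socket(AF_INET, SOCK_DGRAM, IPPROTO_QUIC);
	if (fd < 0)
		return -errno;

	if (bind(fd, (const struct sockaddr *)local, sizeof(*local)) < 0 ||
	    connect(fd, (const struct sockaddr *)peer, sizeof(*peer)) < 0) {
		err = errno;	/* save before close() can overwrite it */
		close(fd);
		return -err;
	}

	return fd;
}
```

After a successful return, the caller would run the handshake on the fd
and only then use sendmsg()/recvmsg() for application data, as in the
table.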
> 
> For use by kernel consumers, the tlshd service [2] must be
> installed and running in userspace. This service receives
> and manages kernel handshake requests for kernel sockets. In the kernel,
> the APIs closely resemble those used in userspace:
> 
>     Client                             Server
>   ────────────────────────────────────────────────────────────────────────
>   __sock_create(IPPROTO_QUIC, &sock)  __sock_create(IPPROTO_QUIC, &sock)
>   kernel_bind(sock)                   kernel_bind(sock)
>                                       kernel_listen(sock)
>   kernel_connect(sock)
>   tls_client_hello_x509(args:{sock})
>                                       kernel_accept(sock, &newsock)
>                                       tls_server_hello_x509(args:{newsock})
> 
>   kernel_sendmsg(sock)                kernel_recvmsg(newsock)
>   sock_release(sock)                  sock_release(newsock)
>                                       sock_release(sock)
> 
> Please be aware that tls_client_hello_x509() and tls_server_hello_x509()
> are APIs from net/handshake/. They are used to dispatch the handshake
> request to the userspace tlshd service and subsequently block until the
> handshake process is completed.

Can you please put this (or something like this) into Documentation/
somewhere?

Thanks,
David

