From: "Michael S. Tsirkin" <mst@redhat.com>
To: Paolo Abeni <pabeni@redhat.com>
Cc: virtio-dev@lists.linux.dev, maxime.coquelin@redhat.com,
	Eelco Chaudron <echaudro@redhat.com>,
	Jason Wang <jasowang@redhat.com>
Subject: Re: [PATCH v3] virtio-net: define UDP tunnel offload feature
Date: Mon, 20 May 2024 22:55:42 -0400
Message-ID: <20240520225254-mutt-send-email-mst@kernel.org>
In-Reply-To: <f0bf2db203f8061a3ecb50b7a48cf26d8f3c7f68.1716193086.git.pabeni@redhat.com>

On Mon, May 20, 2024 at 10:24:53AM +0200, Paolo Abeni wrote:
> The VIRTIO_NET_HDR_GSO_UDP_TUNNEL is a gso_type flag allowing GSO over
> UDP tunnel. It can be negotiated on both the host and guest sides.
> 
> UDP tunnel usage is ubiquitous in container deployments, and the ability
> to offload UDP-encapsulated GSO traffic greatly affects the performance
> and the CPU utilization of such use cases.
> 
> One constraint addressed here is that the virtio side (either device or
> driver) receiving a UDP tunneled GSO packet must be able to fully
> reconstruct the inner and outer header offsets, to allow for later GSO.
> 
> To accommodate this need, new optional fields are introduced in the
> virtio_net header: outer_th_offset, inner_protocol, inner_mac_offset,
> inner_nh_offset. They map directly to the corresponding header
> information.
> 
> Note that the inner transport header is implied by the (inner) checksum
> offload, if present. Otherwise, it's up to the receiver to detect the
> inner transport header offset from the provided information, as is
> currently the case for plain (not UDP tunneled) GSO packets.
> 
> The outer UDP header may carry a second checksum, which can be offloaded
> independently from the inner one. Since UDP tunnel checksum offload
> support makes little sense without UDP tunnel GSO support, to avoid
> unnecessarily complex feature negotiation, the
> VIRTIO_NET_HDR_GSO_UDP_TUNNEL feature implies the support for the outer
> header checksum offload and the checksum itself is handled similarly to
> the inner header one.
> 
> Note that there is no concept of UDP tunnel type negotiation (e.g.
> vxlan, geneve, vxlan-gpe, etc.). That is intentional because:
> - given the information carried by the guest or host kernel, it's
>   impossible to reliably probe the UDP tunnel type. Specifically, the
>   outer UDP port numbers give a hint, but peers could use nonstandard
>   ones.
> - all the existing UDP tunnel protocols behave the same way WRT GSO
>   offload, carrying an immutable header on top of the outer transport
>   one.
> - if a new UDP tunnel protocol should surface in the future with
>   different constraints, the host and guest kernels will need explicit
>   support for it, including new, different GSO features. Additional
>   virtio support should be designed separately.
> 
> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
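
As a worked example of the header fields described in the quoted commit message, here is a minimal C sketch that computes the offsets for one concrete case: a TCP/IPv4 segment carried in VXLAN over IPv4, with no IP options on either header. The struct `tunnel_hdr_sketch` and the helper `vxlan_tcp4_hdr_sketch()` are hypothetical names used only for illustration; the struct is a simplified view rather than the wire layout, and little-endian conversion is omitted. The value to place in inner_protocol is taken as a parameter, because the patch only says it carries the Ethernet type of the inner protocol.

```c
#include <stdint.h>

/* Values as defined in the quoted patch. */
#define VIRTIO_NET_HDR_F_NEEDS_CSUM    1
#define VIRTIO_NET_HDR_GSO_TCPV4       1
#define VIRTIO_NET_HDR_GSO_UDP_TUNNEL  0x40

/* Hypothetical, simplified view of the fields discussed in the patch.
 * This is NOT the wire layout: unrelated fields are left out and the
 * le16 conversion is omitted. */
struct tunnel_hdr_sketch {
	uint8_t  flags;
	uint8_t  gso_type;
	uint16_t csum_start;
	uint16_t csum_offset;
	uint16_t outer_th_offset;
	uint16_t inner_protocol;
	uint16_t inner_mac_offset;
	uint16_t inner_nh_offset;
};

/* Header sizes assumed for this example only. */
enum {
	SKETCH_ETH_HLEN   = 14,
	SKETCH_IPV4_HLEN  = 20, /* assumes no IP options */
	SKETCH_UDP_HLEN   = 8,
	SKETCH_VXLAN_HLEN = 8,
	SKETCH_TCP_CSUM_OFFSET = 16,
};

/* Fill the header for TCP/IPv4 inside VXLAN over IPv4. */
static struct tunnel_hdr_sketch vxlan_tcp4_hdr_sketch(uint16_t inner_protocol)
{
	struct tunnel_hdr_sketch h = {0};

	h.flags = VIRTIO_NET_HDR_F_NEEDS_CSUM;
	h.gso_type = VIRTIO_NET_HDR_GSO_TCPV4 | VIRTIO_NET_HDR_GSO_UDP_TUNNEL;

	/* Outer UDP header follows the outer Ethernet and IPv4 headers. */
	h.outer_th_offset = SKETCH_ETH_HLEN + SKETCH_IPV4_HLEN;
	/* Inner Ethernet header follows the outer UDP and VXLAN headers. */
	h.inner_mac_offset = h.outer_th_offset + SKETCH_UDP_HLEN + SKETCH_VXLAN_HLEN;
	/* Inner IPv4 header follows the inner Ethernet header. */
	h.inner_nh_offset = h.inner_mac_offset + SKETCH_ETH_HLEN;
	h.inner_protocol = inner_protocol;

	/* With the tunnel bit set, the checksum fields refer to the inner
	 * transport header, here the inner TCP header. */
	h.csum_start = h.inner_nh_offset + SKETCH_IPV4_HLEN;
	h.csum_offset = SKETCH_TCP_CSUM_OFFSET;
	return h;
}
```

Under these assumptions the outer UDP header sits at offset 34, the inner Ethernet header at 50, the inner IPv4 header at 64 and the inner TCP header at 84.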


Paolo,
when you subscribed to virtio-dev you should have received this:



When to use this list:
- questions and change proposals for Virtio drivers and devices
  implementing the specification.

When not to use this list:
- questions and change proposals for the Virtio specification,
  including the specification of basic functionality, transports and
  devices (please use the virtio-comment mailing list
  [mailto:virtio-comment@lists.linux.dev] for this).


Did you receive this? If yes, was it somehow unclear? If not, why
are you sending a Virtio specification change proposal
to virtio-dev?



> ---
> v2 -> v3:
>  - UDP_TUNNEL -> UDP_TUNNEL_GSO
>  - add explicit fields for the inner meta-data
>  - more verbose changelog
>  https://lists.oasis-open.org/archives/virtio-dev/202206/msg00026.html
> 
> v1 -> v2:
>  - explicitly state that the outer header probing is mandatory
>  - explicitly state that GSO_UDP is not allowed with GSO_UDP_TUNNEL
>  - clarify hdr_len usage
>  - clarify UDP_TUNNEL_CSUM bit usage
>  - fix a few typos
>  https://lists.oasis-open.org/archives/virtio-dev/202205/msg00037.html
> ---
>  device-types/net/description.tex | 118 ++++++++++++++++++++++++++++---
>  1 file changed, 110 insertions(+), 8 deletions(-)
> 
> diff --git a/device-types/net/description.tex b/device-types/net/description.tex
> index 76585b0..2eae797 100644
> --- a/device-types/net/description.tex
> +++ b/device-types/net/description.tex
> @@ -88,6 +88,12 @@ \subsection{Feature bits}\label{sec:Device Types / Network Device / Feature bits
>  \item[VIRTIO_NET_F_CTRL_MAC_ADDR(23)] Set MAC address through control
>      channel.
>  
> +\item[VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO (49)] Driver can receive GSO packets
> +  carried by a UDP tunnel and can handle the outer checksum.
> +
> +\item[VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO (50)] Device can receive GSO packets
> +  carried by a UDP tunnel and can handle the outer checksum.
> +
>  \item[VIRTIO_NET_F_HASH_TUNNEL(51)] Device supports inner header hash for encapsulated packets.
>  
>  \item[VIRTIO_NET_F_VQ_NOTF_COAL(52)] Device supports virtqueue notification coalescing.
> @@ -133,12 +139,16 @@ \subsubsection{Feature bit requirements}\label{sec:Device Types / Network Device
>  \item[VIRTIO_NET_F_GUEST_UFO] Requires VIRTIO_NET_F_GUEST_CSUM.
>  \item[VIRTIO_NET_F_GUEST_USO4] Requires VIRTIO_NET_F_GUEST_CSUM.
>  \item[VIRTIO_NET_F_GUEST_USO6] Requires VIRTIO_NET_F_GUEST_CSUM.
> +\item[VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO] Requires VIRTIO_NET_F_GUEST_TSO4, VIRTIO_NET_F_GUEST_TSO6,
> +   VIRTIO_NET_F_GUEST_USO4 or VIRTIO_NET_F_GUEST_USO6.
>  
>  \item[VIRTIO_NET_F_HOST_TSO4] Requires VIRTIO_NET_F_CSUM.
>  \item[VIRTIO_NET_F_HOST_TSO6] Requires VIRTIO_NET_F_CSUM.
>  \item[VIRTIO_NET_F_HOST_ECN] Requires VIRTIO_NET_F_HOST_TSO4 or VIRTIO_NET_F_HOST_TSO6.
>  \item[VIRTIO_NET_F_HOST_UFO] Requires VIRTIO_NET_F_CSUM.
>  \item[VIRTIO_NET_F_HOST_USO] Requires VIRTIO_NET_F_CSUM.
> +\item[VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO] Requires VIRTIO_NET_F_HOST_TSO4, VIRTIO_NET_F_HOST_TSO6
> +   or VIRTIO_NET_F_HOST_USO.
>  
>  \item[VIRTIO_NET_F_CTRL_RX] Requires VIRTIO_NET_F_CTRL_VQ.
>  \item[VIRTIO_NET_F_CTRL_VLAN] Requires VIRTIO_NET_F_CTRL_VQ.
> @@ -374,6 +384,9 @@ \subsection{Device Initialization}\label{sec:Device Types / Network Device / Dev
>    segmentation/fragmentation offload by negotiating the VIRTIO_NET_F_HOST_TSO4 (IPv4
>    TCP), VIRTIO_NET_F_HOST_TSO6 (IPv6 TCP), VIRTIO_NET_F_HOST_UFO
>    (UDP fragmentation) and VIRTIO_NET_F_HOST_USO (UDP segmentation) features.
> +  Additionally, it can negotiate the VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO feature
> +  to use TCP segmentation or UDP segmentation on top of UDP encapsulation,
> +  respecting the other negotiated features.
>  
>  \item The converse features are also available: a driver can save
>    the virtual device some work by negotiating these features.\note{For example, a network packet transported between two guests on
> @@ -382,8 +395,9 @@ \subsection{Device Initialization}\label{sec:Device Types / Network Device / Dev
>     The VIRTIO_NET_F_GUEST_CSUM feature indicates that partially
>    checksummed packets can be received, and if it can do that then
>    the VIRTIO_NET_F_GUEST_TSO4, VIRTIO_NET_F_GUEST_TSO6,
> -  VIRTIO_NET_F_GUEST_UFO, VIRTIO_NET_F_GUEST_ECN, VIRTIO_NET_F_GUEST_USO4
> -  and VIRTIO_NET_F_GUEST_USO6 are the input equivalents of the features described above.
> +  VIRTIO_NET_F_GUEST_UFO, VIRTIO_NET_F_GUEST_ECN, VIRTIO_NET_F_GUEST_USO4,
> +  VIRTIO_NET_F_GUEST_USO6 and VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO
> +  are the input equivalents of the features described above.
>    See \ref{sec:Device Types / Network Device / Device Operation /
>  Setting Up Receive Buffers}~\nameref{sec:Device Types / Network
>  Device / Device Operation / Setting Up Receive Buffers} and
> @@ -407,12 +421,14 @@ \subsection{Device Operation}\label{sec:Device Types / Network Device / Device O
>  #define VIRTIO_NET_HDR_F_NEEDS_CSUM    1
>  #define VIRTIO_NET_HDR_F_DATA_VALID    2
>  #define VIRTIO_NET_HDR_F_RSC_INFO      4
> +#define VIRTIO_NET_HDR_F_UDP_TUNNEL_CSUM 8
>          u8 flags;
>  #define VIRTIO_NET_HDR_GSO_NONE        0
>  #define VIRTIO_NET_HDR_GSO_TCPV4       1
>  #define VIRTIO_NET_HDR_GSO_UDP         3
>  #define VIRTIO_NET_HDR_GSO_TCPV6       4
>  #define VIRTIO_NET_HDR_GSO_UDP_L4      5
> +#define VIRTIO_NET_HDR_GSO_UDP_TUNNEL 0x40
>  #define VIRTIO_NET_HDR_GSO_ECN      0x80
>          u8 gso_type;
>          le16 hdr_len;
> @@ -423,6 +439,10 @@ \subsection{Device Operation}\label{sec:Device Types / Network Device / Device O
>          le32 hash_value;        (Only if VIRTIO_NET_F_HASH_REPORT negotiated)
>          le16 hash_report;       (Only if VIRTIO_NET_F_HASH_REPORT negotiated)
>          le16 padding_reserved;  (Only if VIRTIO_NET_F_HASH_REPORT negotiated)
> +        le16 outer_th_offset;   (Only if VIRTIO_NET_F_UDP_TUNNEL_GSO negotiated)
> +        le16 inner_protocol;    (Only if VIRTIO_NET_F_UDP_TUNNEL_GSO negotiated)
> +        le16 inner_mac_offset;  (Only if VIRTIO_NET_F_UDP_TUNNEL_GSO negotiated)
> +        le16 inner_nh_offset;   (Only if VIRTIO_NET_F_UDP_TUNNEL_GSO negotiated)
>  };
>  \end{lstlisting}
>  
> @@ -480,6 +500,8 @@ \subsubsection{Packet Transmission}\label{sec:Device Types / Network Device / De
>  followed by the TCP header (with the TCP checksum field 16 bytes
>  into that header). \field{csum_start} will be 14+20 = 34 (the TCP
>  checksum includes the header), and \field{csum_offset} will be 16.
> +If the given packet has the VIRTIO_NET_HDR_GSO_UDP_TUNNEL bit set,
> +the above checksum fields refer to the inner header checksum.
>  \end{note}
>  
>  \item If the driver negotiated
> @@ -516,6 +538,30 @@ \subsubsection{Packet Transmission}\label{sec:Device Types / Network Device / De
>  specifically in the protocol.}.
>     \end{itemize}
>  
> +\item If the driver negotiated the VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO feature,
> +  the VIRTIO_NET_HDR_GSO_UDP_TUNNEL bit in \field{gso_type} indicates that
> +  the GSO protocol is encapsulated in a UDP tunnel.
> +  The other tunnel-related fields indicate how to replicate the inner packet
> +  header to cut it into smaller packets:
> +
> +  \begin{itemize}
> +  \item \field{outer_th_offset} indicates the offset of the outer transport
> +      header within the packet.
> +
> +  \item \field{inner_protocol} indicates the ethernet type of the inner
> +      protocol.
> +
> +  \item \field{inner_mac_offset} indicates the offset of the inner mac header within the packet.
> +
> +  \item \field{inner_nh_offset} indicates the offset of the inner network
> +      header within the packet.
> +
> +  \item If the \field{flags} field has the VIRTIO_NET_HDR_F_UDP_TUNNEL_CSUM set,
> +    the outer UDP checksum field carries the checksum for the UDP pseudo header
> +    and the complete UDP checksum can be computed in a similar way to the
> +    inner TCP.
> +  \end{itemize}
> +
>  \item \field{num_buffers} is set to zero.  This field is unused on transmitted packets.
>  
>  \item The header and packet are added as one output descriptor to the
> @@ -557,6 +603,14 @@ \subsubsection{Packet Transmission}\label{sec:Device Types / Network Device / De
>  driver MUST set the VIRTIO_NET_HDR_GSO_ECN bit in
>  \field{gso_type}.
>  
> +The driver MUST NOT send to the device TCP or UDP GSO packets over UDP tunnel
> +requiring segmentation offload, unless the VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO is
> +negotiated, in which case the driver MUST set the VIRTIO_NET_HDR_GSO_UDP_TUNNEL
> +bit in the \field{gso_type}.
> +
> +The driver MUST NOT set the VIRTIO_NET_HDR_GSO_UDP_TUNNEL together with
> +VIRTIO_NET_HDR_GSO_UDP.
> +
>  If the VIRTIO_NET_F_CSUM feature has been negotiated, the
>  driver MAY set the VIRTIO_NET_HDR_F_NEEDS_CSUM bit in
>  \field{flags}, if so:
> @@ -633,6 +687,18 @@ \subsubsection{Packet Transmission}\label{sec:Device Types / Network Device / De
>  	\end{note}
>  \end{itemize}
>  
> +If the VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO option has been negotiated:
> +\begin{itemize}
> +\item If the \field{gso_type} has the VIRTIO_NET_HDR_GSO_UDP_TUNNEL bit set,
> +	the device MUST use the \field{outer_th_offset}, \field{inner_protocol},
> +  \field{inner_mac_offset} and \field{inner_nh_offset} fields to
> +  locate the corresponding headers inside the packet.
> +\end{itemize}
> +
> +If VIRTIO_NET_HDR_GSO_UDP_TUNNEL bit in \field{gso_type} is not set, the
> +device MUST NOT use the \field{outer_th_offset}, \field{inner_protocol},
> +\field{inner_mac_offset} and \field{inner_nh_offset}.
> +
>  If VIRTIO_NET_HDR_F_NEEDS_CSUM is not set, the device MUST NOT
>  rely on the packet checksum being correct.
>  \paragraph{Packet Transmission Interrupt}\label{sec:Device Types / Network Device / Device Operation / Packet Transmission / Packet Transmission Interrupt}
> @@ -727,8 +793,8 @@ \subsubsection{Processing of Incoming Packets}\label{sec:Device Types / Network
>    has been validated.
>  \end{enumerate}
>  
> -Additionally, VIRTIO_NET_F_GUEST_CSUM, TSO4, TSO6, UDP and ECN
> -features enable receive checksum, large receive offload and ECN
> +Additionally, VIRTIO_NET_F_GUEST_CSUM, TSO4, TSO6, UDP, UDP_TUNNEL
> +and ECN features enable receive checksum, large receive offload and ECN
>  support which are the input equivalents of the transmit checksum,
>  transmit segmentation offloading and ECN features, as described
>  in \ref{sec:Device Types / Network Device / Device Operation /
> @@ -738,6 +804,15 @@ \subsubsection{Processing of Incoming Packets}\label{sec:Device Types / Network
>    negotiated, then \field{gso_type} MAY be something other than
>    VIRTIO_NET_HDR_GSO_NONE, and \field{gso_size} field indicates the
>    desired MSS (see Packet Transmission point 2).
> +\item If the VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO option was negotiated and
> +  \field{gso_type} is not VIRTIO_NET_HDR_GSO_NONE, the VIRTIO_NET_HDR_GSO_UDP_TUNNEL
> +  bit MAY be set. In such case the \field{outer_th_offset}, \field{inner_protocol},
> +  \field{inner_mac_offset} and \field{inner_nh_offset} fields indicate the corresponding
> +  header information.
> +  Additionally, the VIRTIO_NET_HDR_F_UDP_TUNNEL_CSUM bit in the
> +  \field{flags} MAY be set, indicating that the outer UDP header
> +  carries the UDP pseudo header csum and that the driver can compute
> +  the full UDP checksum on top of it (see Packet Transmission point 3).
>  \item If the VIRTIO_NET_F_RSC_EXT option was negotiated (this
>    implies one of VIRTIO_NET_F_GUEST_TSO4, TSO6), the
>    device processes also duplicated ACK segments, reports
> @@ -750,8 +825,9 @@ \subsubsection{Processing of Incoming Packets}\label{sec:Device Types / Network
>    from \field{csum_start} and any preceding checksums
>    have been validated.  The checksum on the packet is incomplete and
>    if bit VIRTIO_NET_HDR_F_RSC_INFO is not set in \field{flags},
> -  then \field{csum_start} and \field{csum_offset} indicate how to calculate it
> -  (see Packet Transmission point 1).
> +  then \field{csum_start} and \field{csum_offset} indicate how to calculate it.
> +  If the VIRTIO_NET_HDR_GSO_UDP_TUNNEL bit is set, the \field{csum_start} field
> +  refers to the inner transport header offset (see Packet Transmission point 1).
>  
>  \end{enumerate}
>  
> @@ -800,6 +876,20 @@ \subsubsection{Processing of Incoming Packets}\label{sec:Device Types / Network
>  which have the Explicit Congestion Notification bit set, unless the
>  VIRTIO_NET_F_GUEST_ECN feature is negotiated, in which case the
>  device MUST set the VIRTIO_NET_HDR_GSO_ECN bit in
>  \field{gso_type}.
> +
> +The device SHOULD NOT send to the driver TCP or UDP GSO packets encapsulated in UDP
> +tunnel and requiring segmentation offload, unless the
> +VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO is negotiated, in which case the device MUST set
> +the VIRTIO_NET_HDR_GSO_UDP_TUNNEL bit in \field{gso_type} and MUST set the
> +\field{outer_th_offset}, \field{inner_protocol}, \field{inner_mac_offset} and
> +\field{inner_nh_offset}. If the outer UDP header carries a non-zero checksum:
> +\begin{enumerate}
> +\item the device MUST set the VIRTIO_NET_HDR_F_UDP_TUNNEL_CSUM bit in
> +	\field{flags}
> +\item the device MUST set the outer UDP header checksum field to the outer
> +	UDP pseudo header sum
> +\end{enumerate}
> +
>  
>  If the VIRTIO_NET_F_GUEST_CSUM feature has been negotiated, the
> @@ -819,6 +909,12 @@ \subsubsection{Processing of Incoming Packets}\label{sec:Device Types / Network
>  	fully checksummed packet;
>  \end{enumerate}
>  
> +\begin{note}
> +If the VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO feature is negotiated and the
> +VIRTIO_NET_HDR_GSO_UDP_TUNNEL bit is set, the VIRTIO_NET_HDR_F_NEEDS_CSUM
> +bit refers to the inner header checksum.
> +\end{note}
> +
>  If none of the VIRTIO_NET_F_GUEST_TSO4, TSO6, UFO, USO4 or USO6 options have
>  been negotiated, the device MUST set \field{gso_type} to
>  VIRTIO_NET_HDR_GSO_NONE.
> @@ -842,8 +938,9 @@ \subsubsection{Processing of Incoming Packets}\label{sec:Device Types / Network
>  If the VIRTIO_NET_F_GUEST_CSUM feature has been negotiated, the
>  device MAY set the VIRTIO_NET_HDR_F_DATA_VALID bit in
>  \field{flags}, if so, the device MUST validate the packet
> -checksum (in case of multiple encapsulated protocols, one level
> -of checksums is validated).
> +checksum. If the VIRTIO_NET_HDR_F_UDP_TUNNEL_CSUM
> +bit in \field{flags} is also set, the device MUST additionally validate
> +the outer UDP header checksum.
>  
>  \drivernormative{\paragraph}{Processing of Incoming
>  Packets}{Device Types / Network Device / Device Operation /
> @@ -863,6 +960,10 @@ \subsubsection{Processing of Incoming Packets}\label{sec:Device Types / Network
>  This is due to various bugs in implementations.
>  \end{note}
>  
> +If the VIRTIO_NET_HDR_GSO_UDP_TUNNEL bit in \field{gso_type} is not set,
> +the driver MUST NOT use the \field{outer_th_offset}, \field{inner_protocol},
> +\field{inner_mac_offset} and \field{inner_nh_offset}.
> +
>  If neither VIRTIO_NET_HDR_F_NEEDS_CSUM nor
>  VIRTIO_NET_HDR_F_DATA_VALID is set, the driver MUST NOT
>  rely on the packet checksum being correct.
> @@ -1624,6 +1725,7 @@ \subsubsection{Control Virtqueue}\label{sec:Device Types / Network Device / Devi
>  #define VIRTIO_NET_F_GUEST_TSO6       8
>  #define VIRTIO_NET_F_GUEST_ECN        9
>  #define VIRTIO_NET_F_GUEST_UFO        10
> +#define VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO 49
>  #define VIRTIO_NET_F_GUEST_USO4       54
>  #define VIRTIO_NET_F_GUEST_USO6       55
>  
> -- 
> 2.43.2
> 
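
The gso_type rules in the quoted patch can be summarized as a small check. The following C sketch is one reading of those rules, not text from the patch: the tunnel bit is only acceptable when the matching UDP_TUNNEL_GSO feature was negotiated, it must not be combined with VIRTIO_NET_HDR_GSO_UDP, and it needs a TCP or UDP segmentation type underneath. The helper name `tunnel_gso_type_ok()` and its `tunnel_negotiated` parameter are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* gso_type values as listed in the quoted patch. */
#define VIRTIO_NET_HDR_GSO_NONE        0
#define VIRTIO_NET_HDR_GSO_TCPV4       1
#define VIRTIO_NET_HDR_GSO_UDP         3
#define VIRTIO_NET_HDR_GSO_TCPV6       4
#define VIRTIO_NET_HDR_GSO_UDP_L4      5
#define VIRTIO_NET_HDR_GSO_UDP_TUNNEL  0x40
#define VIRTIO_NET_HDR_GSO_ECN         0x80

/* Hypothetical helper: returns true if gso_type is consistent with the
 * tunnel rules as read from the patch. Packets without the tunnel bit
 * are not checked here. */
static bool tunnel_gso_type_ok(uint8_t gso_type, bool tunnel_negotiated)
{
	/* Strip the modifier bits to get the base segmentation type. */
	uint8_t base = gso_type &
		(uint8_t)~(VIRTIO_NET_HDR_GSO_UDP_TUNNEL | VIRTIO_NET_HDR_GSO_ECN);

	if (!(gso_type & VIRTIO_NET_HDR_GSO_UDP_TUNNEL))
		return true;

	/* The tunnel bit requires the feature to be negotiated. */
	if (!tunnel_negotiated)
		return false;

	switch (base) {
	case VIRTIO_NET_HDR_GSO_TCPV4:
	case VIRTIO_NET_HDR_GSO_TCPV6:
	case VIRTIO_NET_HDR_GSO_UDP_L4:
		return true;
	default:
		/* GSO_NONE and GSO_UDP (UFO) are rejected with the tunnel bit. */
		return false;
	}
}
```

For instance, TCPV4 with the tunnel bit passes when the feature is negotiated, while GSO_UDP with the tunnel bit is always rejected.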

