* [PATCH v7 0/2] virtio-net: define UDP tunnel offload
@ 2024-08-21 10:02 Paolo Abeni
2024-08-21 10:02 ` [PATCH v7 1/2] virtio-net: define UDP tunnel segmentation offload feature Paolo Abeni
2024-08-21 10:02 ` [PATCH v7 2/2] virtio-net: define UDP tunnel checksum " Paolo Abeni
0 siblings, 2 replies; 15+ messages in thread
From: Paolo Abeni @ 2024-08-21 10:02 UTC (permalink / raw)
To: virtio-comment
Cc: maxime.coquelin, Eelco Chaudron, Jason Wang, Stefano Garzarella,
Willem de Bruijn
UDP tunnel usage is ubiquitous in container deployment, and the ability
to offload UDP encapsulated GSO traffic impacts greatly the performances
and the CPU utilization of such use cases.
This series introduces separate features to handle both UDP tunnel
segmentation offload (patch 1) and UDP tunnel outer checksum offload
(patch 2).
Changes from v6:
- replaced inner_protocol_type with
VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV{4,6}
- many clarifications and cleanup, see the individual patches changelog
for the details
Changes from v5:
- split in 2 patches
- dropped outer_mh_offset field
- many csum related clarification
Paolo Abeni (2):
virtio-net: define UDP tunnel segmentation offload feature
virtio-net: define UDP tunnel checksum offload feature
device-types/net/description.tex | 266 ++++++++++++++++++++++++++++++-
1 file changed, 258 insertions(+), 8 deletions(-)
--
2.45.2
^ permalink raw reply [flat|nested] 15+ messages in thread* [PATCH v7 1/2] virtio-net: define UDP tunnel segmentation offload feature 2024-08-21 10:02 [PATCH v7 0/2] virtio-net: define UDP tunnel offload Paolo Abeni @ 2024-08-21 10:02 ` Paolo Abeni 2024-08-21 19:28 ` Willem de Bruijn 2024-08-21 10:02 ` [PATCH v7 2/2] virtio-net: define UDP tunnel checksum " Paolo Abeni 1 sibling, 1 reply; 15+ messages in thread From: Paolo Abeni @ 2024-08-21 10:02 UTC (permalink / raw) To: virtio-comment Cc: maxime.coquelin, Eelco Chaudron, Jason Wang, Stefano Garzarella, Willem de Bruijn The VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV{4,6} are two gso_type flags allowing respectively GSO over UDPv4 tunnel and GSO over UDPv6 tunnel. They can be negotiated on both the host and guest sides. One constraint addressed here is that the virtio side (either device or driver) receiving a UDP tunneled GSO packet must be able to reconstruct completely the inner and outer headers offset - to allow for later GSO. To accommodate such need, new fields are introduced in the virtio_net header: outer_th_offset and inner_nh_offset. They map directly to the corresponding header information. The inner transport header is implied by the (inner) checksum offload. Those fields are required because a virtio device H/W implementation performing segmentation for UDP tunneled packet will need to touch the outer transport protocol (for the UDP length filed), the inner network protocol (for the total length field, in the IPv4 case). Note that segmentation will additionally need to touch the outer network protocol and the inner transport protocol. The first is implied/easily found with trivial parsing, the latter is identified by the existing csum_start field. Note that there is no concept of UDP tunnel type negotiation (e.g. vxlan, geneve, vxlan-gpe, etc.), as a virtio device H/W implementation can perform segmentation for every possible UDP-tunnel given the specified new fields. In the reverse direction, if a virtio device H/W implementation receives some traffic for an unknown or unsupported UDP tunnel, it will simply not aggregate the wire packet in a GSO one. Signed-off-by: Paolo Abeni <pabeni@redhat.com> --- v6 -> v7: - replaced inner_protocol_type with VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV6 && VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV4 - UDP_TUNNEL now depends on all the TCP and USO segmentation features - reworded the 'VIRTIO_NET_HDR_GSO_UDP_TUNNEL' bit driver normative - specified that csum_start/offset points to the inner header in the driver normative, too - new fields presence depends on either VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO or VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO - e.g. -> i.e. - zero -> positive zero (0x0) - checksumming -> checksum validation - explain why we the outer UDP csum can be precomputed only when the GSO packet length is a multiple of gso_size - added an example note for the tunnel-related fields value https://lore.kernel.org/virtio-comment/cover.1722252302.git.pabeni@redhat.com/T/#t v5 -> v6: - split in 2 patches - dropped the unneeded outer_mh_offset field https://lore.kernel.org/virtio-comment/56ccba136272593d0d2af0303600c06f26bd84a1.1718621522.git.pabeni@redhat.com/ v4 -> v5: - add separate negotiation for UDP_TUNNEL_CSUM - much more verbose specification of outer csum handling - UDP_TUNNEL_CSUM requires GSO_UDP_TUNNEL https://lore.kernel.org/virtio-comment/cb89d3d6b6a7e4a7f13e60dd3fded5e950727890.1716448766.git.pabeni@redhat.com/ v3 -> v4: - new virtio_net_hdr fields depends on F_HOST_UDP_TUNNEL_GSO or F_GUEST_UDP_TUNNEL_GSO (Stefano) - Extended the changelog to answer Jason's questions. - Clarified outer csum handling (Jason) https://lore.kernel.org/virtio-dev/f0bf2db203f8061a3ecb50b7a48cf26d8f3c7f68.1716193086.git.pabeni@redhat.com v2 -> v3: - UDP_TUNNEL -> UDP_TUNNEL_GSO - add explicit fields for the inner meta-data - more verbose changelog https://lists.oasis-open.org/archives/virtio-dev/202206/msg00026.html v1 -> v2: - explicitly state that the outer header probing is mandatory - explicitly state that GSO_UDP is not allowed with GSO_UDP_TUNNEL - clarify hdr_len usage - clarify UDP_TUNNEL_CSUM bit usage - fix a few typos https://lists.oasis-open.org/archives/virtio-dev/202205/msg00037.html --- device-types/net/description.tex | 157 +++++++++++++++++++++++++++++-- 1 file changed, 151 insertions(+), 6 deletions(-) diff --git a/device-types/net/description.tex b/device-types/net/description.tex index 76585b0..e0e0927 100644 --- a/device-types/net/description.tex +++ b/device-types/net/description.tex @@ -88,6 +88,12 @@ \subsection{Feature bits}\label{sec:Device Types / Network Device / Feature bits \item[VIRTIO_NET_F_CTRL_MAC_ADDR(23)] Set MAC address through control channel. +\item[VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO (47)] Driver can receive GSO packets + carried by a UDP tunnel. + +\item[VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO (49)] Device can receive GSO packets + carried by a UDP tunnel. + \item[VIRTIO_NET_F_HASH_TUNNEL(51)] Device supports inner header hash for encapsulated packets. \item[VIRTIO_NET_F_VQ_NOTF_COAL(52)] Device supports virtqueue notification coalescing. @@ -133,12 +139,16 @@ \subsubsection{Feature bit requirements}\label{sec:Device Types / Network Device \item[VIRTIO_NET_F_GUEST_UFO] Requires VIRTIO_NET_F_GUEST_CSUM. \item[VIRTIO_NET_F_GUEST_USO4] Requires VIRTIO_NET_F_GUEST_CSUM. \item[VIRTIO_NET_F_GUEST_USO6] Requires VIRTIO_NET_F_GUEST_CSUM. +\item[VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO] Requires VIRTIO_NET_F_GUEST_TSO4, VIRTIO_NET_F_GUEST_TSO6, + VIRTIO_NET_F_GUEST_USO4 and VIRTIO_NET_F_GUEST_USO6. \item[VIRTIO_NET_F_HOST_TSO4] Requires VIRTIO_NET_F_CSUM. \item[VIRTIO_NET_F_HOST_TSO6] Requires VIRTIO_NET_F_CSUM. \item[VIRTIO_NET_F_HOST_ECN] Requires VIRTIO_NET_F_HOST_TSO4 or VIRTIO_NET_F_HOST_TSO6. \item[VIRTIO_NET_F_HOST_UFO] Requires VIRTIO_NET_F_CSUM. \item[VIRTIO_NET_F_HOST_USO] Requires VIRTIO_NET_F_CSUM. +\item[VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO] Requires VIRTIO_NET_F_HOST_TSO4, VIRTIO_NET_F_HOST_TSO6 + and VIRTIO_NET_F_HOST_USO. \item[VIRTIO_NET_F_CTRL_RX] Requires VIRTIO_NET_F_CTRL_VQ. \item[VIRTIO_NET_F_CTRL_VLAN] Requires VIRTIO_NET_F_CTRL_VQ. @@ -375,6 +385,13 @@ \subsection{Device Initialization}\label{sec:Device Types / Network Device / Dev TCP), VIRTIO_NET_F_HOST_TSO6 (IPv6 TCP), VIRTIO_NET_F_HOST_UFO (UDP fragmentation) and VIRTIO_NET_F_HOST_USO (UDP segmentation) features. +\item If the VIRTIO_NET_F_HOST_TSO6, VIRTIO_NET_F_HOST_TSO4 and VIRTIO_NET_F_HOST_USO + segmentation features are negotiated, a driver can + use TCP segmentation or UDP segmentation on top of UDP encapsulation + offload, when the outer header does not require checksumming - e.g. + the outer UDP checksum is zero - by negotiating the + VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO feature. + \item The converse features are also available: a driver can save the virtual device some work by negotiating these features.\note{For example, a network packet transported between two guests on the same system might not need checksumming at all, nor segmentation, @@ -382,8 +399,9 @@ \subsection{Device Initialization}\label{sec:Device Types / Network Device / Dev The VIRTIO_NET_F_GUEST_CSUM feature indicates that partially checksummed packets can be received, and if it can do that then the VIRTIO_NET_F_GUEST_TSO4, VIRTIO_NET_F_GUEST_TSO6, - VIRTIO_NET_F_GUEST_UFO, VIRTIO_NET_F_GUEST_ECN, VIRTIO_NET_F_GUEST_USO4 - and VIRTIO_NET_F_GUEST_USO6 are the input equivalents of the features described above. + VIRTIO_NET_F_GUEST_UFO, VIRTIO_NET_F_GUEST_ECN, VIRTIO_NET_F_GUEST_USO4, + VIRTIO_NET_F_GUEST_USO6 and VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO + are the input equivalents of the features described above. See \ref{sec:Device Types / Network Device / Device Operation / Setting Up Receive Buffers}~\nameref{sec:Device Types / Network Device / Device Operation / Setting Up Receive Buffers} and @@ -413,6 +431,8 @@ \subsection{Device Operation}\label{sec:Device Types / Network Device / Device O #define VIRTIO_NET_HDR_GSO_UDP 3 #define VIRTIO_NET_HDR_GSO_TCPV6 4 #define VIRTIO_NET_HDR_GSO_UDP_L4 5 +#define VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV4 0x20 +#define VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV6 0x40 #define VIRTIO_NET_HDR_GSO_ECN 0x80 u8 gso_type; le16 hdr_len; @@ -423,6 +443,8 @@ \subsection{Device Operation}\label{sec:Device Types / Network Device / Device O le32 hash_value; (Only if VIRTIO_NET_F_HASH_REPORT negotiated) le16 hash_report; (Only if VIRTIO_NET_F_HASH_REPORT negotiated) le16 padding_reserved; (Only if VIRTIO_NET_F_HASH_REPORT negotiated) + le16 outer_th_offset (Only if VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO or VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO negotiated) + le16 inner_nh_offset; (Only if VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO or VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO negotiated) }; \end{lstlisting} @@ -480,6 +502,10 @@ \subsubsection{Packet Transmission}\label{sec:Device Types / Network Device / De followed by the TCP header (with the TCP checksum field 16 bytes into that header). \field{csum_start} will be 14+20 = 34 (the TCP checksum includes the header), and \field{csum_offset} will be 16. +If the given packet has the VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV4 bit or the +VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV6 bit set, +the above checksum fields refer to the inner header checksum, see +the example below. \end{note} \item If the driver negotiated @@ -516,6 +542,39 @@ \subsubsection{Packet Transmission}\label{sec:Device Types / Network Device / De specifically in the protocol.}. \end{itemize} +\item If the driver negotiated the VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO feature and the + \field{gso_type} has the VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV4 bit or + VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV6 bit set, the GSO protocol is encapsulated + in a UDP tunnel. + The outer UDP header must not require checksum validation, i.e. the outer UDP + checksum must be a positive zero (0x0) as defined in UDP RFC 768. + The other tunnel-related fields indicate how to replicate the packet + headers to cut it into smaller packets: + + \begin{itemize} + \item \field{outer_th_offset} field indicates the outer transport header within + the packet. Note that such field differs from \field{csum_start} as the latter + points to the inner transport header within the packet. + + \item \field{inner_nh_offset} field indicates the inner network header within + the packet. + \end{itemize} + +\begin{note} +For example, consider a partially checksummed TCP (IPv4) packet carried over a +Geneve UDP tunnel (again IPv4) with no tunnel options. Note that the +only relevant variable related to the tunnel type is the tunnel header length. +The packet will have a 14 byte outer ethernet header, 20 byte outer IP header +followed by the 8 byte UDP header (with a 0 checksum value), 8 byte Geneve header, +14 byte inner ethernet header, 20 byte inner IP header +and the TCP header (with the TCP checksum field 16 bytes +into that header). \field{csum_start} will be 14+20+8+8+14+20 = 84 (the TCP +checksum includes the header), \field{csum_offset} will be 16. +\field{inner_nh_offset} will be 14+20+8+8+14 = 62, \field{outer_th_offset} will be +14+20+8 = 42 and \field{gso_type} will be +VIRTIO_NET_HDR_GSO_TCPV4 | VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV4 = 0x21 +\end{note} + \item \field{num_buffers} is set to zero. This field is unused on transmitted packets. \item The header and packet are added as one output descriptor to the @@ -557,6 +616,29 @@ \subsubsection{Packet Transmission}\label{sec:Device Types / Network Device / De driver MUST set the VIRTIO_NET_HDR_GSO_ECN bit in \field{gso_type}. +If VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO is negotiated, the driver MAY set +VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV4 bit or the VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV6 bit +in \field{gso_type} according to the inner network header protocol type +to request TCP or UDP GSO packets over UDPv4 or UDPv6 tunnel +segmentation, otherwise the driver MUST NOT set neither the +VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV4 bit nor the VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV6 bit +in \field{gso_type}. + +When requesting TCP or UDP GSO segmentation over UDP tunnel, the driver MUST SET the +VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV4 bit if the inner network header is IPv4, i.e. the +packet is a TCPv4 GSO one, otherwise, if the inner network header is IPv6, the driver +MUST SET the VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV6 bit. + +The driver MUST NOT send to the device TCP or UDP GSO packets over UDP tunnel +requiring segmentation and outer UDP checksum offload. + +The driver MUST NOT set the VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV4 bit or the +VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV6 bit together with VIRTIO_NET_HDR_GSO_UDP, as the +latter is deprecated in favor of UDP_L4 and no new feature will support it. + +The driver MUST NOT set the VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV4 bit and the +VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV6 bit together. + If the VIRTIO_NET_F_CSUM feature has been negotiated, the driver MAY set the VIRTIO_NET_HDR_F_NEEDS_CSUM bit in \field{flags}, if so: @@ -598,6 +680,36 @@ \subsubsection{Packet Transmission}\label{sec:Device Types / Network Device / De header. \end{itemize} +If the VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO option has been negotiated, the +driver MAY set the VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV4 bit or the +VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV6 bit in \field{gso_type}, if so: +\begin{itemize} +\item the driver MUST set \field{outer_th_offset} to the outer UDP header + offset and \field{inner_nh_offset} to the inner network header offset. + The \field{csum_start} and \field{csum_offset} fields point respectively + to the inner transport header and inner transport checksum field. +\end{itemize} + +If the VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV4 bit or the +VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV6 bit in \field{gso_type} are set, the +outer UDP header MUST NOT require checksum validation. That is, the +outer UDP checksum value MUST be 0 or the validated complete checksum +for such header. + +\begin{note} +The valid complete checksum of the outer UDP header of individual segments +can be computed by the driver prior to segmentation only if the GSO packet +size is a multiple of \field{gso_size}, because then all segments +have the same size and thus all data included in the outer UDP +checksum is the same for every segment. These pre-computed segment +length and checksum fields are different from those of the GSO +packet. +\end{note} + +If the VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO option has not +been negotiated, the driver MUST NOT set neither the VIRTIO_NET_HDR_F_UDP_TUNNEL_IPV4 +bit nor the VIRTIO_NET_HDR_F_UDP_TUNNEL_IPV6 in \field{gso_type}. + The driver SHOULD accept the VIRTIO_NET_F_GUEST_HDRLEN feature if it has been offered, and if it's able to provide the exact header length. @@ -633,6 +745,10 @@ \subsubsection{Packet Transmission}\label{sec:Device Types / Network Device / De \end{note} \end{itemize} +If the VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV4 bit and theVIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV6 +bit in \field{gso_type} are not set, the device MUST NOT use the +\field{outer_th_offset} and \field{inner_nh_offset}. + If VIRTIO_NET_HDR_F_NEEDS_CSUM is not set, the device MUST NOT rely on the packet checksum being correct. \paragraph{Packet Transmission Interrupt}\label{sec:Device Types / Network Device / Device Operation / Packet Transmission / Packet Transmission Interrupt} @@ -727,8 +843,8 @@ \subsubsection{Processing of Incoming Packets}\label{sec:Device Types / Network has been validated. \end{enumerate} -Additionally, VIRTIO_NET_F_GUEST_CSUM, TSO4, TSO6, UDP and ECN -features enable receive checksum, large receive offload and ECN +Additionally, VIRTIO_NET_F_GUEST_CSUM, TSO4, TSO6, UDP, UDP_TUNNEL +and ECN features enable receive checksum, large receive offload and ECN support which are the input equivalents of the transmit checksum, transmit segmentation offloading and ECN features, as described in \ref{sec:Device Types / Network Device / Device Operation / @@ -750,8 +866,16 @@ \subsubsection{Processing of Incoming Packets}\label{sec:Device Types / Network from \field{csum_start} and any preceding checksums have been validated. The checksum on the packet is incomplete and if bit VIRTIO_NET_HDR_F_RSC_INFO is not set in \field{flags}, - then \field{csum_start} and \field{csum_offset} indicate how to calculate it - (see Packet Transmission point 1). + then \field{csum_start} and \field{csum_offset} indicate how to calculate it. + If the VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV4 bit or VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV6 + bit are set, the \field{csum_start} field + refers to the inner transport header offset (see Packet Transmission point 1). +\item If the VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO option was negotiated and + \field{gso_type} is not VIRTIO_NET_HDR_GSO_NONE, the + VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV4 bit or the VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV6 + bit MAY be set. In such case the \field{outer_th_offset} and + \field{inner_nh_offset} fields indicate the corresponding + headers information. \end{enumerate} @@ -796,6 +920,10 @@ \subsubsection{Processing of Incoming Packets}\label{sec:Device Types / Network If none of VIRTIO_NET_F_GUEST_USO4 or VIRTIO_NET_F_GUEST_USO6 have been negotiated, the device MUST NOT set \field{gso_type} to VIRTIO_NET_HDR_GSO_UDP_L4. +If VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO is not negotiated, the device MUST NOT set +neither the VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV4 bit nor the +VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV6 bit in \field{gso_type}. + The device SHOULD NOT send to the driver TCP packets requiring segmentation offload which have the Explicit Congestion Notification bit set, unless the VIRTIO_NET_F_GUEST_ECN feature is negotiated, in which case the @@ -819,6 +947,18 @@ \subsubsection{Processing of Incoming Packets}\label{sec:Device Types / Network fully checksummed packet; \end{enumerate} +The device MUST NOT send to the driver TCP or UDP GSO packets encapsulated in UDP +tunnel and requiring segmentation offload, unless the +VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO is negotiated, in which case the device MUST set +the VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV4 bit or the VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV6 +bit in \field{gso_type} according to the inner network header protocol type, +MUST set the \field{outer_th_offset} and \field{inner_nh_offset} fields +to the corresponding header information, and the outer UDP header MUST NOT +require checksum offload. + +The device MUST NOT send the driver TCP or UDP GSO packets encapsulated in UDP +tunnel and requiring segmentation and outer checksum offload. + If none of the VIRTIO_NET_F_GUEST_TSO4, TSO6, UFO, USO4 or USO6 options have been negotiated, the device MUST set \field{gso_type} to VIRTIO_NET_HDR_GSO_NONE. @@ -867,6 +1007,10 @@ \subsubsection{Processing of Incoming Packets}\label{sec:Device Types / Network VIRTIO_NET_HDR_F_DATA_VALID is set, the driver MUST NOT rely on the packet checksum being correct. +If the VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV4 bit or the VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV6 +bit in \field{gso_type} are not set, the driver MUST NOT use the +\field{outer_th_offset} and \field{inner_nh_offset}. + \paragraph{Hash calculation for incoming packets} \label{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Hash calculation for incoming packets} @@ -1624,6 +1768,7 @@ \subsubsection{Control Virtqueue}\label{sec:Device Types / Network Device / Devi #define VIRTIO_NET_F_GUEST_TSO6 8 #define VIRTIO_NET_F_GUEST_ECN 9 #define VIRTIO_NET_F_GUEST_UFO 10 +#define VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO 47 #define VIRTIO_NET_F_GUEST_USO4 54 #define VIRTIO_NET_F_GUEST_USO6 55 -- 2.45.2 ^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH v7 1/2] virtio-net: define UDP tunnel segmentation offload feature 2024-08-21 10:02 ` [PATCH v7 1/2] virtio-net: define UDP tunnel segmentation offload feature Paolo Abeni @ 2024-08-21 19:28 ` Willem de Bruijn 2024-08-21 19:48 ` Willem de Bruijn 2024-08-22 14:11 ` Paolo Abeni 0 siblings, 2 replies; 15+ messages in thread From: Willem de Bruijn @ 2024-08-21 19:28 UTC (permalink / raw) To: Paolo Abeni Cc: virtio-comment, maxime.coquelin, Eelco Chaudron, Jason Wang, Stefano Garzarella On Wed, Aug 21, 2024 at 6:03 AM Paolo Abeni <pabeni@redhat.com> wrote: > > The VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV{4,6} are two gso_type flags > allowing respectively GSO over UDPv4 tunnel and GSO over UDPv6 tunnel. > They can be negotiated on both the host and guest sides. > > One constraint addressed here is that the virtio side (either device or > driver) receiving a UDP tunneled GSO packet must be able to reconstruct > completely the inner and outer headers offset - to allow for later GSO. > > To accommodate such need, new fields are introduced in the virtio_net > header: outer_th_offset and inner_nh_offset. > They map directly to the corresponding header information. The inner > transport header is implied by the (inner) checksum offload. > > Those fields are required because a virtio device H/W implementation > performing segmentation for UDP tunneled packet will need to touch > the outer transport protocol (for the UDP length filed), the > inner network protocol (for the total length field, in the IPv4 > case). > > Note that segmentation will additionally need to touch > the outer network protocol and the inner transport protocol. The first > is implied/easily found with trivial parsing, the latter is identified Trivial because Ethernet header followed by IPv4 or IPv6 header is assumed? I.e., no level 2.5 headers like vlan or mpls to complicate matters. > by the existing csum_start field. > > Note that there is no concept of UDP tunnel type negotiation (e.g. > vxlan, geneve, vxlan-gpe, etc.), as a virtio device H/W implementation > can perform segmentation for every possible UDP-tunnel given the > specified new fields. > In the reverse direction, if a virtio device H/W implementation receives > some traffic for an unknown or unsupported UDP tunnel, it will simply > not aggregate the wire packet in a GSO one. > > Signed-off-by: Paolo Abeni <pabeni@redhat.com> Overall this looks great to me. A few comments inline, mostly to clarify language. Happy to give my Reviewed-By as is too at this point. > --- > v6 -> v7: > - replaced inner_protocol_type with VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV6 && > VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV4 > - UDP_TUNNEL now depends on all the TCP and USO segmentation features > - reworded the 'VIRTIO_NET_HDR_GSO_UDP_TUNNEL' bit driver normative > - specified that csum_start/offset points to the inner header in the > driver normative, too > - new fields presence depends on either VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO or > VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO > - e.g. -> i.e. > - zero -> positive zero (0x0) > - checksumming -> checksum validation > - explain why we the outer UDP csum can be precomputed only > when the GSO packet length is a multiple of gso_size > - added an example note for the tunnel-related fields value > https://lore.kernel.org/virtio-comment/cover.1722252302.git.pabeni@redhat.com/T/#t > > v5 -> v6: > - split in 2 patches > - dropped the unneeded outer_mh_offset field > https://lore.kernel.org/virtio-comment/56ccba136272593d0d2af0303600c06f26bd84a1.1718621522.git.pabeni@redhat.com/ > > v4 -> v5: > - add separate negotiation for UDP_TUNNEL_CSUM > - much more verbose specification of outer csum handling > - UDP_TUNNEL_CSUM requires GSO_UDP_TUNNEL > https://lore.kernel.org/virtio-comment/cb89d3d6b6a7e4a7f13e60dd3fded5e950727890.1716448766.git.pabeni@redhat.com/ > > v3 -> v4: > - new virtio_net_hdr fields depends on F_HOST_UDP_TUNNEL_GSO or > F_GUEST_UDP_TUNNEL_GSO (Stefano) > - Extended the changelog to answer Jason's questions. > - Clarified outer csum handling (Jason) > https://lore.kernel.org/virtio-dev/f0bf2db203f8061a3ecb50b7a48cf26d8f3c7f68.1716193086.git.pabeni@redhat.com > > v2 -> v3: > - UDP_TUNNEL -> UDP_TUNNEL_GSO > - add explicit fields for the inner meta-data > - more verbose changelog > https://lists.oasis-open.org/archives/virtio-dev/202206/msg00026.html > > v1 -> v2: > - explicitly state that the outer header probing is mandatory > - explicitly state that GSO_UDP is not allowed with GSO_UDP_TUNNEL > - clarify hdr_len usage > - clarify UDP_TUNNEL_CSUM bit usage > - fix a few typos > https://lists.oasis-open.org/archives/virtio-dev/202205/msg00037.html > --- > device-types/net/description.tex | 157 +++++++++++++++++++++++++++++-- > 1 file changed, 151 insertions(+), 6 deletions(-) > > diff --git a/device-types/net/description.tex b/device-types/net/description.tex > index 76585b0..e0e0927 100644 > --- a/device-types/net/description.tex > +++ b/device-types/net/description.tex > @@ -88,6 +88,12 @@ \subsection{Feature bits}\label{sec:Device Types / Network Device / Feature bits > \item[VIRTIO_NET_F_CTRL_MAC_ADDR(23)] Set MAC address through control > channel. > > +\item[VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO (47)] Driver can receive GSO packets > + carried by a UDP tunnel. > + > +\item[VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO (49)] Device can receive GSO packets > + carried by a UDP tunnel. > + > \item[VIRTIO_NET_F_HASH_TUNNEL(51)] Device supports inner header hash for encapsulated packets. > > \item[VIRTIO_NET_F_VQ_NOTF_COAL(52)] Device supports virtqueue notification coalescing. > @@ -133,12 +139,16 @@ \subsubsection{Feature bit requirements}\label{sec:Device Types / Network Device > \item[VIRTIO_NET_F_GUEST_UFO] Requires VIRTIO_NET_F_GUEST_CSUM. > \item[VIRTIO_NET_F_GUEST_USO4] Requires VIRTIO_NET_F_GUEST_CSUM. > \item[VIRTIO_NET_F_GUEST_USO6] Requires VIRTIO_NET_F_GUEST_CSUM. > +\item[VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO] Requires VIRTIO_NET_F_GUEST_TSO4, VIRTIO_NET_F_GUEST_TSO6, > + VIRTIO_NET_F_GUEST_USO4 and VIRTIO_NET_F_GUEST_USO6. > > \item[VIRTIO_NET_F_HOST_TSO4] Requires VIRTIO_NET_F_CSUM. > \item[VIRTIO_NET_F_HOST_TSO6] Requires VIRTIO_NET_F_CSUM. > \item[VIRTIO_NET_F_HOST_ECN] Requires VIRTIO_NET_F_HOST_TSO4 or VIRTIO_NET_F_HOST_TSO6. > \item[VIRTIO_NET_F_HOST_UFO] Requires VIRTIO_NET_F_CSUM. > \item[VIRTIO_NET_F_HOST_USO] Requires VIRTIO_NET_F_CSUM. > +\item[VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO] Requires VIRTIO_NET_F_HOST_TSO4, VIRTIO_NET_F_HOST_TSO6 > + and VIRTIO_NET_F_HOST_USO. > > \item[VIRTIO_NET_F_CTRL_RX] Requires VIRTIO_NET_F_CTRL_VQ. > \item[VIRTIO_NET_F_CTRL_VLAN] Requires VIRTIO_NET_F_CTRL_VQ. > @@ -375,6 +385,13 @@ \subsection{Device Initialization}\label{sec:Device Types / Network Device / Dev > TCP), VIRTIO_NET_F_HOST_TSO6 (IPv6 TCP), VIRTIO_NET_F_HOST_UFO > (UDP fragmentation) and VIRTIO_NET_F_HOST_USO (UDP segmentation) features. > > +\item If the VIRTIO_NET_F_HOST_TSO6, VIRTIO_NET_F_HOST_TSO4 and VIRTIO_NET_F_HOST_USO > + segmentation features are negotiated, a driver can > + use TCP segmentation or UDP segmentation on top of UDP encapsulation > + offload, when the outer header does not require checksumming - e.g. > + the outer UDP checksum is zero - by negotiating the > + VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO feature. > + > \item The converse features are also available: a driver can save > the virtual device some work by negotiating these features.\note{For example, a network packet transported between two guests on > the same system might not need checksumming at all, nor segmentation, > @@ -382,8 +399,9 @@ \subsection{Device Initialization}\label{sec:Device Types / Network Device / Dev > The VIRTIO_NET_F_GUEST_CSUM feature indicates that partially > checksummed packets can be received, and if it can do that then > the VIRTIO_NET_F_GUEST_TSO4, VIRTIO_NET_F_GUEST_TSO6, > - VIRTIO_NET_F_GUEST_UFO, VIRTIO_NET_F_GUEST_ECN, VIRTIO_NET_F_GUEST_USO4 > - and VIRTIO_NET_F_GUEST_USO6 are the input equivalents of the features described above. > + VIRTIO_NET_F_GUEST_UFO, VIRTIO_NET_F_GUEST_ECN, VIRTIO_NET_F_GUEST_USO4, > + VIRTIO_NET_F_GUEST_USO6 and VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO > + are the input equivalents of the features described above. > See \ref{sec:Device Types / Network Device / Device Operation / > Setting Up Receive Buffers}~\nameref{sec:Device Types / Network > Device / Device Operation / Setting Up Receive Buffers} and > @@ -413,6 +431,8 @@ \subsection{Device Operation}\label{sec:Device Types / Network Device / Device O > #define VIRTIO_NET_HDR_GSO_UDP 3 > #define VIRTIO_NET_HDR_GSO_TCPV6 4 > #define VIRTIO_NET_HDR_GSO_UDP_L4 5 > +#define VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV4 0x20 > +#define VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV6 0x40 > #define VIRTIO_NET_HDR_GSO_ECN 0x80 > u8 gso_type; > le16 hdr_len; > @@ -423,6 +443,8 @@ \subsection{Device Operation}\label{sec:Device Types / Network Device / Device O > le32 hash_value; (Only if VIRTIO_NET_F_HASH_REPORT negotiated) > le16 hash_report; (Only if VIRTIO_NET_F_HASH_REPORT negotiated) > le16 padding_reserved; (Only if VIRTIO_NET_F_HASH_REPORT negotiated) > + le16 outer_th_offset (Only if VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO or VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO negotiated) > + le16 inner_nh_offset; (Only if VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO or VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO negotiated) > }; > \end{lstlisting} > > @@ -480,6 +502,10 @@ \subsubsection{Packet Transmission}\label{sec:Device Types / Network Device / De > followed by the TCP header (with the TCP checksum field 16 bytes > into that header). \field{csum_start} will be 14+20 = 34 (the TCP > checksum includes the header), and \field{csum_offset} will be 16. > +If the given packet has the VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV4 bit or the > +VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV6 bit set, > +the above checksum fields refer to the inner header checksum, see > +the example below. > \end{note} > > \item If the driver negotiated > @@ -516,6 +542,39 @@ \subsubsection{Packet Transmission}\label{sec:Device Types / Network Device / De > specifically in the protocol.}. > \end{itemize} > > +\item If the driver negotiated the VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO feature and the > + \field{gso_type} has the VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV4 bit or > + VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV6 bit set, the GSO protocol is encapsulated > + in a UDP tunnel. > + The outer UDP header must not require checksum validation, i.e. the outer UDP > + checksum must be a positive zero (0x0) as defined in UDP RFC 768. > + The other tunnel-related fields indicate how to replicate the packet > + headers to cut it into smaller packets: > + > + \begin{itemize} > + \item \field{outer_th_offset} field indicates the outer transport header within > + the packet. Note that such field differs from \field{csum_start} as the latter > + points to the inner transport header within the packet. > + > + \item \field{inner_nh_offset} field indicates the inner network header within > + the packet. > + \end{itemize} > + > +\begin{note} > +For example, consider a partially checksummed TCP (IPv4) packet carried over a partially checksummed here means CHECKSUM_PARTIAL? The term makes sense within Linux, but intuitively partially checksummed could be interpreted to mean something like UDPLite: a checksum over part of the packet. Perhaps checksum offloaded instead? And since I'm bikeshedding: "Note that" is a phrase that can almost always be dropped entirely. > +Geneve UDP tunnel (again IPv4) with no tunnel options. Note that the > +only relevant variable related to the tunnel type is the tunnel header length. > +The packet will have a 14 byte outer ethernet header, 20 byte outer IP header > +followed by the 8 byte UDP header (with a 0 checksum value), 8 byte Geneve header, > +14 byte inner ethernet header, 20 byte inner IP header > +and the TCP header (with the TCP checksum field 16 bytes > +into that header). \field{csum_start} will be 14+20+8+8+14+20 = 84 (the TCP > +checksum includes the header), \field{csum_offset} will be 16. The csum_offset field is not even needed in virtio_net_hdr if gso is enabled, as then the gso type already defines the protocol. So we could harvest that 16b field. But the new feature needs two 16b fields. Likely they could be two 8b fields with some relative offset gymnastics. But adding it all up that makes for a less obvious, harder to understand datapath. Only mentioning as a discussion point. Just adding two new fields to the struct seems more sensible to me, too. > +\field{inner_nh_offset} will be 14+20+8+8+14 = 62, \field{outer_th_offset} will be > +14+20+8 = 42 and \field{gso_type} will be > +VIRTIO_NET_HDR_GSO_TCPV4 | VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV4 = 0x21 > +\end{note} > + > \item \field{num_buffers} is set to zero. This field is unused on transmitted packets. > > \item The header and packet are added as one output descriptor to the > @@ -557,6 +616,29 @@ \subsubsection{Packet Transmission}\label{sec:Device Types / Network Device / De > driver MUST set the VIRTIO_NET_HDR_GSO_ECN bit in > \field{gso_type}. > > +If VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO is negotiated, the driver MAY set > +VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV4 bit or the VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV6 bit > +in \field{gso_type} according to the inner network header protocol type > +to request TCP or UDP GSO packets over UDPv4 or UDPv6 tunnel > +segmentation, otherwise the driver MUST NOT set neither the not neither is a double negative. Neither already means not either, so this because not not either. > +VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV4 bit nor the VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV6 bit > +in \field{gso_type}. > + > +When requesting TCP or UDP GSO segmentation over UDP tunnel, the driver MUST SET the > +VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV4 bit if the inner network header is IPv4, i.e. the > +packet is a TCPv4 GSO one, otherwise, if the inner network header is IPv6, the driver > +MUST SET the VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV6 bit. > + > +The driver MUST NOT send to the device TCP or UDP GSO packets over UDP tunnel > +requiring segmentation and outer UDP checksum offload. > + > +The driver MUST NOT set the VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV4 bit or the > +VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV6 bit together with VIRTIO_NET_HDR_GSO_UDP, as the > +latter is deprecated in favor of UDP_L4 and no new feature will support it. > + > +The driver MUST NOT set the VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV4 bit and the > +VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV6 bit together. > + > If the VIRTIO_NET_F_CSUM feature has been negotiated, the > driver MAY set the VIRTIO_NET_HDR_F_NEEDS_CSUM bit in > \field{flags}, if so: > @@ -598,6 +680,36 @@ \subsubsection{Packet Transmission}\label{sec:Device Types / Network Device / De > header. > \end{itemize} > > +If the VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO option has been negotiated, the > +driver MAY set the VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV4 bit or the > +VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV6 bit in \field{gso_type}, if so: > +\begin{itemize} > +\item the driver MUST set \field{outer_th_offset} to the outer UDP header > + offset and \field{inner_nh_offset} to the inner network header offset. > + The \field{csum_start} and \field{csum_offset} fields point respectively > + to the inner transport header and inner transport checksum field. > +\end{itemize} > + > +If the VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV4 bit or the > +VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV6 bit in \field{gso_type} are set, the > +outer UDP header MUST NOT require checksum validation. That is, the > +outer UDP checksum value MUST be 0 or the validated complete checksum > +for such header. > + > +\begin{note} > +The valid complete checksum of the outer UDP header of individual segments > +can be computed by the driver prior to segmentation only if the GSO packet > +size is a multiple of \field{gso_size}, because then all segments > +have the same size and thus all data included in the outer UDP > +checksum is the same for every segment. These pre-computed segment > +length and checksum fields are different from those of the GSO > +packet. > +\end{note} The second variant, gso_partial, also implies that the outer header already has udph->length set to the segment length. While the first variant does not. Right? Probably worth clarifying. > + > +If the VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO option has not > +been negotiated, the driver MUST NOT set neither the VIRTIO_NET_HDR_F_UDP_TUNNEL_IPV4 > +bit nor the VIRTIO_NET_HDR_F_UDP_TUNNEL_IPV6 in \field{gso_type}. > + > The driver SHOULD accept the VIRTIO_NET_F_GUEST_HDRLEN feature if it has > been offered, and if it's able to provide the exact header length. > > @@ -633,6 +745,10 @@ \subsubsection{Packet Transmission}\label{sec:Device Types / Network Device / De > \end{note} > \end{itemize} > > +If the VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV4 bit and theVIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV6 > +bit in \field{gso_type} are not set, the device MUST NOT use the > +\field{outer_th_offset} and \field{inner_nh_offset}. > + > If VIRTIO_NET_HDR_F_NEEDS_CSUM is not set, the device MUST NOT > rely on the packet checksum being correct. > \paragraph{Packet Transmission Interrupt}\label{sec:Device Types / Network Device / Device Operation / Packet Transmission / Packet Transmission Interrupt} > @@ -727,8 +843,8 @@ \subsubsection{Processing of Incoming Packets}\label{sec:Device Types / Network > has been validated. > \end{enumerate} > > -Additionally, VIRTIO_NET_F_GUEST_CSUM, TSO4, TSO6, UDP and ECN > -features enable receive checksum, large receive offload and ECN > +Additionally, VIRTIO_NET_F_GUEST_CSUM, TSO4, TSO6, UDP, UDP_TUNNEL > +and ECN features enable receive checksum, large receive offload and ECN > support which are the input equivalents of the transmit checksum, > transmit segmentation offloading and ECN features, as described > in \ref{sec:Device Types / Network Device / Device Operation / > @@ -750,8 +866,16 @@ \subsubsection{Processing of Incoming Packets}\label{sec:Device Types / Network > from \field{csum_start} and any preceding checksums > have been validated. The checksum on the packet is incomplete and > if bit VIRTIO_NET_HDR_F_RSC_INFO is not set in \field{flags}, > - then \field{csum_start} and \field{csum_offset} indicate how to calculate it > - (see Packet Transmission point 1). > + then \field{csum_start} and \field{csum_offset} indicate how to calculate it. > + If the VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV4 bit or VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV6 > + bit are set, the \field{csum_start} field > + refers to the inner transport header offset (see Packet Transmission point 1). > +\item If the VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO option was negotiated and > + \field{gso_type} is not VIRTIO_NET_HDR_GSO_NONE, the > + VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV4 bit or the VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV6 > + bit MAY be set. In such case the \field{outer_th_offset} and > + \field{inner_nh_offset} fields indicate the corresponding > + headers information. > > \end{enumerate} > > @@ -796,6 +920,10 @@ \subsubsection{Processing of Incoming Packets}\label{sec:Device Types / Network > If none of VIRTIO_NET_F_GUEST_USO4 or VIRTIO_NET_F_GUEST_USO6 have been negotiated, > the device MUST NOT set \field{gso_type} to VIRTIO_NET_HDR_GSO_UDP_L4. > > +If VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO is not negotiated, the device MUST NOT set > +neither the VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV4 bit nor the > +VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV6 bit in \field{gso_type}. > + > The device SHOULD NOT send to the driver TCP packets requiring segmentation offload > which have the Explicit Congestion Notification bit set, unless the > VIRTIO_NET_F_GUEST_ECN feature is negotiated, in which case the > @@ -819,6 +947,18 @@ \subsubsection{Processing of Incoming Packets}\label{sec:Device Types / Network > fully checksummed packet; > \end{enumerate} > > +The device MUST NOT send to the driver TCP or UDP GSO packets encapsulated in UDP > +tunnel and requiring segmentation offload, unless the > +VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO is negotiated, in which case the device MUST set > +the VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV4 bit or the VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV6 > +bit in \field{gso_type} according to the inner network header protocol type, > +MUST set the \field{outer_th_offset} and \field{inner_nh_offset} fields > +to the corresponding header information, and the outer UDP header MUST NOT > +require checksum offload. > + > +The device MUST NOT send the driver TCP or UDP GSO packets encapsulated in UDP > +tunnel and requiring segmentation and outer checksum offload. > + > If none of the VIRTIO_NET_F_GUEST_TSO4, TSO6, UFO, USO4 or USO6 options have > been negotiated, the device MUST set \field{gso_type} to > VIRTIO_NET_HDR_GSO_NONE. > @@ -867,6 +1007,10 @@ \subsubsection{Processing of Incoming Packets}\label{sec:Device Types / Network > VIRTIO_NET_HDR_F_DATA_VALID is set, the driver MUST NOT > rely on the packet checksum being correct. > > +If the VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV4 bit or the VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV6 > +bit in \field{gso_type} are not set, the driver MUST NOT use the > +\field{outer_th_offset} and \field{inner_nh_offset}. > + > \paragraph{Hash calculation for incoming packets} > \label{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Hash calculation for incoming packets} > > @@ -1624,6 +1768,7 @@ \subsubsection{Control Virtqueue}\label{sec:Device Types / Network Device / Devi > #define VIRTIO_NET_F_GUEST_TSO6 8 > #define VIRTIO_NET_F_GUEST_ECN 9 > #define VIRTIO_NET_F_GUEST_UFO 10 > +#define VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO 47 > #define VIRTIO_NET_F_GUEST_USO4 54 > #define VIRTIO_NET_F_GUEST_USO6 55 > > -- > 2.45.2 > ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v7 1/2] virtio-net: define UDP tunnel segmentation offload feature 2024-08-21 19:28 ` Willem de Bruijn @ 2024-08-21 19:48 ` Willem de Bruijn 2024-08-22 14:25 ` Paolo Abeni 2024-08-22 14:11 ` Paolo Abeni 1 sibling, 1 reply; 15+ messages in thread From: Willem de Bruijn @ 2024-08-21 19:48 UTC (permalink / raw) To: Paolo Abeni Cc: virtio-comment, maxime.coquelin, Eelco Chaudron, Jason Wang, Stefano Garzarella > > +Geneve UDP tunnel (again IPv4) with no tunnel options. Note that the > > +only relevant variable related to the tunnel type is the tunnel header length. > > +The packet will have a 14 byte outer ethernet header, 20 byte outer IP header > > +followed by the 8 byte UDP header (with a 0 checksum value), 8 byte Geneve header, > > +14 byte inner ethernet header, 20 byte inner IP header > > +and the TCP header (with the TCP checksum field 16 bytes > > +into that header). \field{csum_start} will be 14+20+8+8+14+20 = 84 (the TCP > > +checksum includes the header), \field{csum_offset} will be 16. > > The csum_offset field is not even needed in virtio_net_hdr if gso is > enabled, as then the gso type already defines the protocol. We recently started testing this invariant in the Linux code. We have learned the hard way to not trust userspace processes to set up any fields correctly. For the new fields, we should from the start think of such invariants that can detect bad input. For instance, outer_th_offset < inner_nh_offset inner_nh_offset < csum_start csum_start < pkt_len Maybe even more stringent invariant checks are possible. The only ones that truly matter are the ones that can affect device side correctness (such as host kernel crashes). Not sure that we have to call them out in the spec, but perhaps helpful too. Either way, good to think about possible attacks on the new interface even before coding, before the spec is complete. ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v7 1/2] virtio-net: define UDP tunnel segmentation offload feature 2024-08-21 19:48 ` Willem de Bruijn @ 2024-08-22 14:25 ` Paolo Abeni 2024-08-22 14:30 ` Willem de Bruijn 0 siblings, 1 reply; 15+ messages in thread From: Paolo Abeni @ 2024-08-22 14:25 UTC (permalink / raw) To: Willem de Bruijn Cc: virtio-comment, maxime.coquelin, Eelco Chaudron, Jason Wang, Stefano Garzarella On 8/21/24 21:48, Willem de Bruijn wrote: >>> +Geneve UDP tunnel (again IPv4) with no tunnel options. Note that the >>> +only relevant variable related to the tunnel type is the tunnel header length. >>> +The packet will have a 14 byte outer ethernet header, 20 byte outer IP header >>> +followed by the 8 byte UDP header (with a 0 checksum value), 8 byte Geneve header, >>> +14 byte inner ethernet header, 20 byte inner IP header >>> +and the TCP header (with the TCP checksum field 16 bytes >>> +into that header). \field{csum_start} will be 14+20+8+8+14+20 = 84 (the TCP >>> +checksum includes the header), \field{csum_offset} will be 16. >> >> The csum_offset field is not even needed in virtio_net_hdr if gso is >> enabled, as then the gso type already defines the protocol. > > We recently started testing this invariant in the Linux code. We have > learned the hard way to not trust userspace processes to set up any > fields correctly. > > For the new fields, we should from the start think of such invariants > that can detect bad input. For instance, > > outer_th_offset < inner_nh_offset > inner_nh_offset < csum_start > csum_start < pkt_len > > Maybe even more stringent invariant checks are possible. The only > ones that truly matter are the ones that can affect device side > correctness (such as host kernel crashes). FTR, I intend to implement something alike: outer_th_offset >= 14 + base outer network hdr size outer_th_offset + 8 <= inner_nh_offset inner_nh_offset + base inner network hdr size <= csum_start csum_start <= pkt_len - base inner transport hdr size where base inner/outer network hdr size is 20 or 40 depending on ipv4 vs ipv6 and base inner transport hdr size is 8 or 20 debending on udp vs tcp. > Not sure that we have to call them out in the spec, but perhaps > helpful too. Either way, good to think about possible attacks on > the new interface even before coding, before the spec is complete. AFIACS the current spec currently does not describe validation for the existing fields subject to such issues (e.g. csum_start/csum_offset). I think it could be added in a second moment, if needed. Thanks, Paolo ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v7 1/2] virtio-net: define UDP tunnel segmentation offload feature 2024-08-22 14:25 ` Paolo Abeni @ 2024-08-22 14:30 ` Willem de Bruijn 0 siblings, 0 replies; 15+ messages in thread From: Willem de Bruijn @ 2024-08-22 14:30 UTC (permalink / raw) To: Paolo Abeni Cc: virtio-comment, maxime.coquelin, Eelco Chaudron, Jason Wang, Stefano Garzarella On Thu, Aug 22, 2024 at 10:25 AM Paolo Abeni <pabeni@redhat.com> wrote: > > > > On 8/21/24 21:48, Willem de Bruijn wrote: > >>> +Geneve UDP tunnel (again IPv4) with no tunnel options. Note that the > >>> +only relevant variable related to the tunnel type is the tunnel header length. > >>> +The packet will have a 14 byte outer ethernet header, 20 byte outer IP header > >>> +followed by the 8 byte UDP header (with a 0 checksum value), 8 byte Geneve header, > >>> +14 byte inner ethernet header, 20 byte inner IP header > >>> +and the TCP header (with the TCP checksum field 16 bytes > >>> +into that header). \field{csum_start} will be 14+20+8+8+14+20 = 84 (the TCP > >>> +checksum includes the header), \field{csum_offset} will be 16. > >> > >> The csum_offset field is not even needed in virtio_net_hdr if gso is > >> enabled, as then the gso type already defines the protocol. > > > > We recently started testing this invariant in the Linux code. We have > > learned the hard way to not trust userspace processes to set up any > > fields correctly. > > > > For the new fields, we should from the start think of such invariants > > that can detect bad input. For instance, > > > > outer_th_offset < inner_nh_offset > > inner_nh_offset < csum_start > > csum_start < pkt_len > > > > Maybe even more stringent invariant checks are possible. The only > > ones that truly matter are the ones that can affect device side > > correctness (such as host kernel crashes). > > FTR, I intend to implement something alike: > > outer_th_offset >= 14 + base outer network hdr size > outer_th_offset + 8 <= inner_nh_offset > inner_nh_offset + base inner network hdr size <= csum_start > csum_start <= pkt_len - base inner transport hdr size > > where base inner/outer network hdr size is 20 or 40 depending on ipv4 vs > ipv6 and base inner transport hdr size is 8 or 20 debending on udp vs tcp. Great! Thanks Paolo. We learned the hard way that we need to sanitize these inputs. Good to see that included from the start for new fields. > > Not sure that we have to call them out in the spec, but perhaps > > helpful too. Either way, good to think about possible attacks on > > the new interface even before coding, before the spec is complete. > > AFIACS the current spec currently does not describe validation for the > existing fields subject to such issues (e.g. csum_start/csum_offset). I > think it could be added in a second moment, if needed. Either way. It may be helpful for other implementers of the spec. If not too much work to add. ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v7 1/2] virtio-net: define UDP tunnel segmentation offload feature 2024-08-21 19:28 ` Willem de Bruijn 2024-08-21 19:48 ` Willem de Bruijn @ 2024-08-22 14:11 ` Paolo Abeni 2024-08-26 12:23 ` [EXTERNAL] " Shiva Shankar Kommula 1 sibling, 1 reply; 15+ messages in thread From: Paolo Abeni @ 2024-08-22 14:11 UTC (permalink / raw) To: Willem de Bruijn Cc: virtio-comment, maxime.coquelin, Eelco Chaudron, Jason Wang, Stefano Garzarella On 8/21/24 21:28, Willem de Bruijn wrote: > On Wed, Aug 21, 2024 at 6:03 AM Paolo Abeni <pabeni@redhat.com> wrote: >> >> The VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV{4,6} are two gso_type flags >> allowing respectively GSO over UDPv4 tunnel and GSO over UDPv6 tunnel. >> They can be negotiated on both the host and guest sides. >> >> One constraint addressed here is that the virtio side (either device or >> driver) receiving a UDP tunneled GSO packet must be able to reconstruct >> completely the inner and outer headers offset - to allow for later GSO. >> >> To accommodate such need, new fields are introduced in the virtio_net >> header: outer_th_offset and inner_nh_offset. >> They map directly to the corresponding header information. The inner >> transport header is implied by the (inner) checksum offload. >> >> Those fields are required because a virtio device H/W implementation >> performing segmentation for UDP tunneled packet will need to touch >> the outer transport protocol (for the UDP length filed), the >> inner network protocol (for the total length field, in the IPv4 >> case). >> >> Note that segmentation will additionally need to touch >> the outer network protocol and the inner transport protocol. The first >> is implied/easily found with trivial parsing, the latter is identified > > Trivial because Ethernet header followed by IPv4 or IPv6 header is > assumed? I.e., no level 2.5 headers like vlan or mpls to complicate > matters. Yes ;) >> @@ -516,6 +542,39 @@ \subsubsection{Packet Transmission}\label{sec:Device Types / Network Device / De >> specifically in the protocol.}. >> \end{itemize} >> >> +\item If the driver negotiated the VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO feature and the >> + \field{gso_type} has the VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV4 bit or >> + VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV6 bit set, the GSO protocol is encapsulated >> + in a UDP tunnel. >> + The outer UDP header must not require checksum validation, i.e. the outer UDP >> + checksum must be a positive zero (0x0) as defined in UDP RFC 768. >> + The other tunnel-related fields indicate how to replicate the packet >> + headers to cut it into smaller packets: >> + >> + \begin{itemize} >> + \item \field{outer_th_offset} field indicates the outer transport header within >> + the packet. Note that such field differs from \field{csum_start} as the latter >> + points to the inner transport header within the packet. >> + >> + \item \field{inner_nh_offset} field indicates the inner network header within >> + the packet. >> + \end{itemize} >> + >> +\begin{note} >> +For example, consider a partially checksummed TCP (IPv4) packet carried over a > > partially checksummed here means CHECKSUM_PARTIAL? > > The term makes sense within Linux, but intuitively partially > checksummed could be interpreted to mean something like UDPLite: a > checksum over part of the packet. > > Perhaps checksum offloaded instead? I used the same wording of the similar note describing plain TCP GSO packet a few lines above, for cargo cul... consistency. So just a small consistency-related resistence here. > And since I'm bikeshedding: "Note that" is a phrase that can almost > always be dropped entirely. Will drop in the next revision. >> @@ -557,6 +616,29 @@ \subsubsection{Packet Transmission}\label{sec:Device Types / Network Device / De >> driver MUST set the VIRTIO_NET_HDR_GSO_ECN bit in >> \field{gso_type}. >> >> +If VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO is negotiated, the driver MAY set >> +VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV4 bit or the VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV6 bit >> +in \field{gso_type} according to the inner network header protocol type >> +to request TCP or UDP GSO packets over UDPv4 or UDPv6 tunnel >> +segmentation, otherwise the driver MUST NOT set neither the > > not neither is a double negative. Neither already means not either, so > this because not not either. The intended meaning is a single negative (not set). I'll replace neither with either in the next revision. >> @@ -598,6 +680,36 @@ \subsubsection{Packet Transmission}\label{sec:Device Types / Network Device / De >> header. >> \end{itemize} >> >> +If the VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO option has been negotiated, the >> +driver MAY set the VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV4 bit or the >> +VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV6 bit in \field{gso_type}, if so: >> +\begin{itemize} >> +\item the driver MUST set \field{outer_th_offset} to the outer UDP header >> + offset and \field{inner_nh_offset} to the inner network header offset. >> + The \field{csum_start} and \field{csum_offset} fields point respectively >> + to the inner transport header and inner transport checksum field. >> +\end{itemize} >> + >> +If the VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV4 bit or the >> +VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV6 bit in \field{gso_type} are set, the >> +outer UDP header MUST NOT require checksum validation. That is, the >> +outer UDP checksum value MUST be 0 or the validated complete checksum >> +for such header. >> + >> +\begin{note} >> +The valid complete checksum of the outer UDP header of individual segments >> +can be computed by the driver prior to segmentation only if the GSO packet >> +size is a multiple of \field{gso_size}, because then all segments >> +have the same size and thus all data included in the outer UDP >> +checksum is the same for every segment. These pre-computed segment >> +length and checksum fields are different from those of the GSO >> +packet. >> +\end{note} > > The second variant, gso_partial, also implies that the outer header > already has udph->length set to the segment length. > > While the first variant does not. Right? > > Probably worth clarifying. I will do in the next revision. Thanks, Paolo ^ permalink raw reply [flat|nested] 15+ messages in thread
* RE: [EXTERNAL] Re: [PATCH v7 1/2] virtio-net: define UDP tunnel segmentation offload feature 2024-08-22 14:11 ` Paolo Abeni @ 2024-08-26 12:23 ` Shiva Shankar Kommula 2024-08-29 15:49 ` Paolo Abeni 0 siblings, 1 reply; 15+ messages in thread From: Shiva Shankar Kommula @ 2024-08-26 12:23 UTC (permalink / raw) To: Paolo Abeni, Willem de Bruijn Cc: virtio-comment@lists.linux.dev, maxime.coquelin@redhat.com, Eelco Chaudron, Jason Wang, Stefano Garzarella, Nithin Kumar Dabilpuram > -----Original Message----- > From: Paolo Abeni <pabeni@redhat.com> > Sent: Thursday, August 22, 2024 7:41 PM > To: Willem de Bruijn <willemb@google.com> > Cc: virtio-comment@lists.linux.dev; maxime.coquelin@redhat.com; Eelco > Chaudron <echaudro@redhat.com>; Jason Wang <jasowang@redhat.com>; > Stefano Garzarella <sgarzare@redhat.com> > Subject: [EXTERNAL] Re: [PATCH v7 1/2] virtio-net: define UDP tunnel > segmentation offload feature > > On 8/21/24 21: 28, Willem de Bruijn wrote: > On Wed, Aug 21, 2024 at 6: 03 > AM Paolo Abeni <pabeni@ redhat. com> wrote: >> >> The > VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV{4,6} are two gso_type flags >> > allowing respectively GSO > On 8/21/24 21:28, Willem de Bruijn wrote: > > On Wed, Aug 21, 2024 at 6:03 AM Paolo Abeni <pabeni@redhat.com> > wrote: > >> > >> The VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV{4,6} are two gso_type flags > >> allowing respectively GSO over UDPv4 tunnel and GSO over UDPv6 tunnel. > >> They can be negotiated on both the host and guest sides. > >> > >> One constraint addressed here is that the virtio side (either device > >> or > >> driver) receiving a UDP tunneled GSO packet must be able to > >> reconstruct completely the inner and outer headers offset - to allow for > later GSO. > >> > >> To accommodate such need, new fields are introduced in the virtio_net > >> header: outer_th_offset and inner_nh_offset. > >> They map directly to the corresponding header information. The inner > >> transport header is implied by the (inner) checksum offload. > >> > >> Those fields are required because a virtio device H/W implementation > >> performing segmentation for UDP tunneled packet will need to touch > >> the outer transport protocol (for the UDP length filed), the inner > >> network protocol (for the total length field, in the IPv4 case). > >> > >> Note that segmentation will additionally need to touch the outer > >> network protocol and the inner transport protocol. The first is > >> implied/easily found with trivial parsing, the latter is identified > > > > Trivial because Ethernet header followed by IPv4 or IPv6 header is > > assumed? I.e., no level 2.5 headers like vlan or mpls to complicate > > matters. > > Yes ;) Hi Paolo Abeni, Apologies for joining the conversion bit late. We are using VDPA-based device along with DPDK in our environment and have noticed a significant performance drop (~8-10%) when reading packet data to determine the [outer]network header type in segment offloading cases as segmentation offload needs to know the start of network header. We considered submitting an RFC for the addition of [outer]network header start field in virtio_net_hdr_t. However, since this patch series is already proposing changes for tunnel related offload features, it may be beneficial to include this proposal as well This perf drop issue will become more noticeable when the inline IPSec feature (Marvell is planning to propose draft RFC) is added to the virtio specification I would appreciate your thoughts on this Thanks Shiva > > >> @@ -516,6 +542,39 @@ \subsubsection{Packet > Transmission}\label{sec:Device Types / Network Device / De > >> specifically in the protocol.}. > >> \end{itemize} > >> > >> +\item If the driver negotiated the > VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO > >> +feature and the > >> + \field{gso_type} has the VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV4 bit > or > >> + VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV6 bit set, the GSO protocol is > >> +encapsulated > >> + in a UDP tunnel. > >> + The outer UDP header must not require checksum validation, i.e. > >> +the outer UDP > >> + checksum must be a positive zero (0x0) as defined in UDP RFC 768. > >> + The other tunnel-related fields indicate how to replicate the > >> +packet > >> + headers to cut it into smaller packets: > >> + > >> + \begin{itemize} > >> + \item \field{outer_th_offset} field indicates the outer transport header > within > >> + the packet. Note that such field differs from \field{csum_start} as the > latter > >> + points to the inner transport header within the packet. > >> + > >> + \item \field{inner_nh_offset} field indicates the inner network header > within > >> + the packet. > >> + \end{itemize} > >> + > >> +\begin{note} > >> +For example, consider a partially checksummed TCP (IPv4) packet > >> +carried over a > > > > partially checksummed here means CHECKSUM_PARTIAL? > > > > The term makes sense within Linux, but intuitively partially > > checksummed could be interpreted to mean something like UDPLite: a > > checksum over part of the packet. > > > > Perhaps checksum offloaded instead? > > I used the same wording of the similar note describing plain TCP GSO packet a > few lines above, for cargo cul... consistency. > > So just a small consistency-related resistence here. > > > And since I'm bikeshedding: "Note that" is a phrase that can almost > > always be dropped entirely. > > Will drop in the next revision. > > >> @@ -557,6 +616,29 @@ \subsubsection{Packet > Transmission}\label{sec:Device Types / Network Device / De > >> driver MUST set the VIRTIO_NET_HDR_GSO_ECN bit in > >> \field{gso_type}. > >> > >> +If VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO is negotiated, the driver MAY > >> +set > >> +VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV4 bit or the > >> +VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV6 bit in \field{gso_type} > according > >> +to the inner network header protocol type to request TCP or UDP GSO > >> +packets over UDPv4 or UDPv6 tunnel segmentation, otherwise the > >> +driver MUST NOT set neither the > > > > not neither is a double negative. Neither already means not either, so > > this because not not either. > > The intended meaning is a single negative (not set). I'll replace neither with > either in the next revision. > > >> @@ -598,6 +680,36 @@ \subsubsection{Packet > Transmission}\label{sec:Device Types / Network Device / De > >> header. > >> \end{itemize} > >> > >> +If the VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO option has been > negotiated, > >> +the driver MAY set the VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV4 bit or > the > >> +VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV6 bit in \field{gso_type}, if so: > >> +\begin{itemize} > >> +\item the driver MUST set \field{outer_th_offset} to the outer UDP > >> +header > >> + offset and \field{inner_nh_offset} to the inner network header offset. > >> + The \field{csum_start} and \field{csum_offset} fields point > >> +respectively > >> + to the inner transport header and inner transport checksum field. > >> +\end{itemize} > >> + > >> +If the VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV4 bit or the > >> +VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV6 bit in \field{gso_type} are set, > >> +the outer UDP header MUST NOT require checksum validation. That is, > >> +the outer UDP checksum value MUST be 0 or the validated complete > >> +checksum for such header. > >> + > >> +\begin{note} > >> +The valid complete checksum of the outer UDP header of individual > >> +segments can be computed by the driver prior to segmentation only if > >> +the GSO packet size is a multiple of \field{gso_size}, because then > >> +all segments have the same size and thus all data included in the > >> +outer UDP checksum is the same for every segment. These pre-computed > >> +segment length and checksum fields are different from those of the > >> +GSO packet. > >> +\end{note} > > > > The second variant, gso_partial, also implies that the outer header > > already has udph->length set to the segment length. > > > > While the first variant does not. Right? > > > > Probably worth clarifying. > > I will do in the next revision. > > Thanks, > > Paolo > ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [EXTERNAL] Re: [PATCH v7 1/2] virtio-net: define UDP tunnel segmentation offload feature 2024-08-26 12:23 ` [EXTERNAL] " Shiva Shankar Kommula @ 2024-08-29 15:49 ` Paolo Abeni 2024-09-02 12:31 ` Shiva Shankar Kommula 0 siblings, 1 reply; 15+ messages in thread From: Paolo Abeni @ 2024-08-29 15:49 UTC (permalink / raw) To: Shiva Shankar Kommula, Willem de Bruijn Cc: virtio-comment@lists.linux.dev, maxime.coquelin@redhat.com, Eelco Chaudron, Jason Wang, Stefano Garzarella, Nithin Kumar Dabilpuram On 8/26/24 14:23, Shiva Shankar Kommula wrote: >> -----Original Message----- >> From: Paolo Abeni <pabeni@redhat.com> >> Sent: Thursday, August 22, 2024 7:41 PM >> To: Willem de Bruijn <willemb@google.com> >> Cc: virtio-comment@lists.linux.dev; maxime.coquelin@redhat.com; Eelco >> Chaudron <echaudro@redhat.com>; Jason Wang <jasowang@redhat.com>; >> Stefano Garzarella <sgarzare@redhat.com> >> Subject: [EXTERNAL] Re: [PATCH v7 1/2] virtio-net: define UDP tunnel >> segmentation offload feature >> >> On 8/21/24 21: 28, Willem de Bruijn wrote: > On Wed, Aug 21, 2024 at 6: 03 >> AM Paolo Abeni <pabeni@ redhat. com> wrote: >> >> The >> VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV{4,6} are two gso_type flags >> >> allowing respectively GSO >> On 8/21/24 21:28, Willem de Bruijn wrote: >>> On Wed, Aug 21, 2024 at 6:03 AM Paolo Abeni <pabeni@redhat.com> >> wrote: >>>> >>>> The VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV{4,6} are two gso_type flags >>>> allowing respectively GSO over UDPv4 tunnel and GSO over UDPv6 tunnel. >>>> They can be negotiated on both the host and guest sides. >>>> >>>> One constraint addressed here is that the virtio side (either device >>>> or >>>> driver) receiving a UDP tunneled GSO packet must be able to >>>> reconstruct completely the inner and outer headers offset - to allow for >> later GSO. >>>> >>>> To accommodate such need, new fields are introduced in the virtio_net >>>> header: outer_th_offset and inner_nh_offset. >>>> They map directly to the corresponding header information. The inner >>>> transport header is implied by the (inner) checksum offload. >>>> >>>> Those fields are required because a virtio device H/W implementation >>>> performing segmentation for UDP tunneled packet will need to touch >>>> the outer transport protocol (for the UDP length filed), the inner >>>> network protocol (for the total length field, in the IPv4 case). >>>> >>>> Note that segmentation will additionally need to touch the outer >>>> network protocol and the inner transport protocol. The first is >>>> implied/easily found with trivial parsing, the latter is identified >>> >>> Trivial because Ethernet header followed by IPv4 or IPv6 header is >>> assumed? I.e., no level 2.5 headers like vlan or mpls to complicate >>> matters. >> >> Yes ;) > Hi Paolo Abeni, > Apologies for joining the conversion bit late. > > We are using VDPA-based device along with DPDK in our environment > and have noticed a significant performance drop (~8-10%) when > reading packet data to determine the [outer]network header type > in segment offloading cases as segmentation offload needs to > know the start of network header. I'm unsure I parse the above correctly. DPDK is reading packets from the VDPA device, when does segmentation offload come into place? I guess the DPDK application is forwarding the packets towards another (VDPA?) device?. There is no encapsulation protocol at all right? The idea of adding additional virtio net header fields for plain GSO packets sounds a bit strange to me. I guess a more detailed description of your scenario would help. Thanks, Paolo ^ permalink raw reply [flat|nested] 15+ messages in thread
* RE: [EXTERNAL] Re: [PATCH v7 1/2] virtio-net: define UDP tunnel segmentation offload feature 2024-08-29 15:49 ` Paolo Abeni @ 2024-09-02 12:31 ` Shiva Shankar Kommula 2024-09-04 14:53 ` Paolo Abeni 0 siblings, 1 reply; 15+ messages in thread From: Shiva Shankar Kommula @ 2024-09-02 12:31 UTC (permalink / raw) To: Paolo Abeni Cc: virtio-comment@lists.linux.dev, maxime.coquelin@redhat.com, Eelco Chaudron, Jason Wang, Stefano Garzarella, Nithin Kumar Dabilpuram, Willem de Bruijn > -----Original Message----- > From: Paolo Abeni <pabeni@redhat.com> > Sent: Thursday, August 29, 2024 9:20 PM > To: Shiva Shankar Kommula <kshankar@marvell.com>; Willem de Bruijn > <willemb@google.com> > Cc: virtio-comment@lists.linux.dev; maxime.coquelin@redhat.com; Eelco > Chaudron <echaudro@redhat.com>; Jason Wang <jasowang@redhat.com>; > Stefano Garzarella <sgarzare@redhat.com>; Nithin Kumar Dabilpuram > <ndabilpuram@marvell.com> > Subject: Re: [EXTERNAL] Re: [PATCH v7 1/2] virtio-net: define UDP tunnel > segmentation offload feature > > On 8/26/24 14: 23, Shiva Shankar Kommula wrote: >> -----Original Message---- > - >> From: Paolo Abeni <pabeni@ redhat. com> >> Sent: Thursday, August 22, > 2024 7: 41 PM >> To: Willem de Bruijn <willemb@ google. com> > > On 8/26/24 14:23, Shiva Shankar Kommula wrote: > >> -----Original Message----- > >> From: Paolo Abeni <pabeni@redhat.com> > >> Sent: Thursday, August 22, 2024 7:41 PM > >> To: Willem de Bruijn <willemb@google.com> > >> Cc: virtio-comment@lists.linux.dev; maxime.coquelin@redhat.com; Eelco > >> Chaudron <echaudro@redhat.com>; Jason Wang > <jasowang@redhat.com>; > >> Stefano Garzarella <sgarzare@redhat.com> > >> Subject: [EXTERNAL] Re: [PATCH v7 1/2] virtio-net: define UDP tunnel > >> segmentation offload feature > >> > >> On 8/21/24 21: 28, Willem de Bruijn wrote: > On Wed, Aug 21, 2024 at > >> 6: 03 AM Paolo Abeni <pabeni@ redhat. com> wrote: >> >> The > >> VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV{4,6} are two gso_type flags >> > >> allowing respectively GSO On 8/21/24 21:28, Willem de Bruijn wrote: > >>> On Wed, Aug 21, 2024 at 6:03 AM Paolo Abeni <pabeni@redhat.com> > >> wrote: > >>>> > >>>> The VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV{4,6} are two gso_type > flags > >>>> allowing respectively GSO over UDPv4 tunnel and GSO over UDPv6 > tunnel. > >>>> They can be negotiated on both the host and guest sides. > >>>> > >>>> One constraint addressed here is that the virtio side (either > >>>> device or > >>>> driver) receiving a UDP tunneled GSO packet must be able to > >>>> reconstruct completely the inner and outer headers offset - to > >>>> allow for > >> later GSO. > >>>> > >>>> To accommodate such need, new fields are introduced in the > >>>> virtio_net > >>>> header: outer_th_offset and inner_nh_offset. > >>>> They map directly to the corresponding header information. The > >>>> inner transport header is implied by the (inner) checksum offload. > >>>> > >>>> Those fields are required because a virtio device H/W > >>>> implementation performing segmentation for UDP tunneled packet will > >>>> need to touch the outer transport protocol (for the UDP length > >>>> filed), the inner network protocol (for the total length field, in the IPv4 > case). > >>>> > >>>> Note that segmentation will additionally need to touch the outer > >>>> network protocol and the inner transport protocol. The first is > >>>> implied/easily found with trivial parsing, the latter is identified > >>> > >>> Trivial because Ethernet header followed by IPv4 or IPv6 header is > >>> assumed? I.e., no level 2.5 headers like vlan or mpls to complicate > >>> matters. > >> > >> Yes ;) > > Hi Paolo Abeni, > > Apologies for joining the conversion bit late. > > > > We are using VDPA-based device along with DPDK in our environment and > > have noticed a significant performance drop (~8-10%) when reading > > packet data to determine the [outer]network header type in segment > > offloading cases as segmentation offload needs to know the start of > > network header. > > I'm unsure I parse the above correctly. DPDK is reading packets from the VDPA > device, when does segmentation offload come into place? I guess the DPDK > application is forwarding the packets towards another (VDPA?) device?. > > There is no encapsulation protocol at all right? > > The idea of adding additional virtio net header fields for plain GSO packets > sounds a bit strange to me. > > I guess a more detailed description of your scenario would help. We are targeting vDPA offloading as our use case. This involves configuring vDPA based application on the host to directly transmit packets to a virtual ring(vring) with specific offload requests, such as IP checksum offloading or TCP segmentation offloading. On the other end, we have firmware running on ARM cores, which processes the incoming packets with hardware assistance and forwards them to the network. In scenarios, where the L3 network header start is not provided, firmware must parse the packet to identify the beginning of L3 layer, which introduces additional overhead. > > Thanks, > > Paolo > ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [EXTERNAL] Re: [PATCH v7 1/2] virtio-net: define UDP tunnel segmentation offload feature 2024-09-02 12:31 ` Shiva Shankar Kommula @ 2024-09-04 14:53 ` Paolo Abeni 0 siblings, 0 replies; 15+ messages in thread From: Paolo Abeni @ 2024-09-04 14:53 UTC (permalink / raw) To: Shiva Shankar Kommula Cc: virtio-comment@lists.linux.dev, maxime.coquelin@redhat.com, Eelco Chaudron, Jason Wang, Stefano Garzarella, Nithin Kumar Dabilpuram, Willem de Bruijn On 9/2/24 14:31, Shiva Shankar Kommula wrote> On Thursday, August 29, 2024 at 9:20 PM Paolo Abeni wrote: >> On 8/26/24 14: 23, Shiva Shankar Kommula wrote: >> -----Original Message---- >>> We are using VDPA-based device along with DPDK in our environment and >>> have noticed a significant performance drop (~8-10%) when reading >>> packet data to determine the [outer]network header type in segment >>> offloading cases as segmentation offload needs to know the start of >>> network header. >> >> I'm unsure I parse the above correctly. DPDK is reading packets from the VDPA >> device, when does segmentation offload come into place? I guess the DPDK >> application is forwarding the packets towards another (VDPA?) device?. >> >> There is no encapsulation protocol at all right? >> >> The idea of adding additional virtio net header fields for plain GSO packets >> sounds a bit strange to me. >> >> I guess a more detailed description of your scenario would help. > We are targeting vDPA offloading as our use case. This involves > configuring vDPA based application on the host to directly transmit > packets to a virtual ring(vring) with specific offload requests, such as > IP checksum offloading or TCP segmentation offloading. > > On the other end, we have firmware running on ARM cores, which > processes the incoming packets with hardware assistance and > forwards them to the network. In scenarios, where the L3 network > header start is not provided, firmware must parse the packet to identify > the beginning of L3 layer, which introduces additional overhead. It looks like such use-case is quite unrelated from what is under discussion here. I think we don't want to add additional fields for plain GSO packets. Perhaps you could consider resurrecting/reusing hdr_len? Possibly under some additional feature negotiation. In any case this discussion looks orthogonal to the UDP tunnel changes. Cheers, Paolo ^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH v7 2/2] virtio-net: define UDP tunnel checksum offload feature 2024-08-21 10:02 [PATCH v7 0/2] virtio-net: define UDP tunnel offload Paolo Abeni 2024-08-21 10:02 ` [PATCH v7 1/2] virtio-net: define UDP tunnel segmentation offload feature Paolo Abeni @ 2024-08-21 10:02 ` Paolo Abeni 2024-08-21 19:42 ` Willem de Bruijn 1 sibling, 1 reply; 15+ messages in thread From: Paolo Abeni @ 2024-08-21 10:02 UTC (permalink / raw) To: virtio-comment Cc: maxime.coquelin, Eelco Chaudron, Jason Wang, Stefano Garzarella, Willem de Bruijn This complements the previous change, additionally introducing the UDP tunnel checksum offload feature. Differently from the plain checksum offload feature, this depends on UDP tunnel segmentation being available, as outer checksum computation for non GSO packets is cheap and H/W implementation of such a feature is complex. UDP tunnel checksum offload does not introduce additional fields, instead it leverages the outer transport offset introduced by the UDP tunnel segmentation feature to locate the outer checksum inside the packet. When UDP tunnel checksum offload is enabled, the VIRTIO_NET_HDR_F_DATA_VALID bit semantic is extended, in the device -> driver direction, to cover both the inner and the outer checksum validation. Signed-off-by: Paolo Abeni <pabeni@redhat.com> --- v6 -> v7: - rebased - VIRTIO_NET_F_{HOST,GUEST}_UDP_TUNNEL_CSUM -> ...UDP_TUNNEL_GSO_CSUM - dropped unintended change to existing TCP gso spec --- device-types/net/description.tex | 121 +++++++++++++++++++++++++++++-- 1 file changed, 113 insertions(+), 8 deletions(-) diff --git a/device-types/net/description.tex b/device-types/net/description.tex index e0e0927..3f42e43 100644 --- a/device-types/net/description.tex +++ b/device-types/net/description.tex @@ -91,9 +91,15 @@ \subsection{Feature bits}\label{sec:Device Types / Network Device / Feature bits \item[VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO (47)] Driver can receive GSO packets carried by a UDP tunnel. +\item[VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO_CSUM (48)] Driver handles packets + carried by a UDP tunnel with partial csum for the outer header. + \item[VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO (49)] Device can receive GSO packets carried by a UDP tunnel. +\item[VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO_CSUM (50)] Device handles packets + carried by a UDP tunnel with partial csum for the outer header. + \item[VIRTIO_NET_F_HASH_TUNNEL(51)] Device supports inner header hash for encapsulated packets. \item[VIRTIO_NET_F_VQ_NOTF_COAL(52)] Device supports virtqueue notification coalescing. @@ -141,6 +147,7 @@ \subsubsection{Feature bit requirements}\label{sec:Device Types / Network Device \item[VIRTIO_NET_F_GUEST_USO6] Requires VIRTIO_NET_F_GUEST_CSUM. \item[VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO] Requires VIRTIO_NET_F_GUEST_TSO4, VIRTIO_NET_F_GUEST_TSO6, VIRTIO_NET_F_GUEST_USO4 and VIRTIO_NET_F_GUEST_USO6. +\item[VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO_CSUM] Requires VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO \item[VIRTIO_NET_F_HOST_TSO4] Requires VIRTIO_NET_F_CSUM. \item[VIRTIO_NET_F_HOST_TSO6] Requires VIRTIO_NET_F_CSUM. @@ -149,6 +156,7 @@ \subsubsection{Feature bit requirements}\label{sec:Device Types / Network Device \item[VIRTIO_NET_F_HOST_USO] Requires VIRTIO_NET_F_CSUM. \item[VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO] Requires VIRTIO_NET_F_HOST_TSO4, VIRTIO_NET_F_HOST_TSO6 and VIRTIO_NET_F_HOST_USO. +\item[VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO_CSUM] Requires VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO \item[VIRTIO_NET_F_CTRL_RX] Requires VIRTIO_NET_F_CTRL_VQ. \item[VIRTIO_NET_F_CTRL_VLAN] Requires VIRTIO_NET_F_CTRL_VQ. @@ -162,6 +170,13 @@ \subsubsection{Feature bit requirements}\label{sec:Device Types / Network Device \item[VIRTIO_NET_F_HASH_TUNNEL] Requires VIRTIO_NET_F_CTRL_VQ along with VIRTIO_NET_F_RSS or VIRTIO_NET_F_HASH_REPORT. \end{description} +\begin{note} +The dependency between UDP_TUNNEL_GSO_CSUM and UDP_TUNNEL_GSO is intentionally +in the opposite direction with respect to the plain GSO features and the plain +checksum offload because UDP tunnel checksum offload gives very little gain +for non GSO packets is quite complex to implement in H/W. +\end{note} + \subsubsection{Legacy Interface: Feature bits}\label{sec:Device Types / Network Device / Feature bits / Legacy Interface: Feature bits} \begin{description} \item[VIRTIO_NET_F_GSO (6)] Device handles packets with any GSO type. This was supposed to indicate segmentation offload support, but @@ -392,6 +407,11 @@ \subsection{Device Initialization}\label{sec:Device Types / Network Device / Dev the outer UDP checksum is zero - by negotiating the VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO feature. +\item If VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO is negotiated, a driver can + additionally use TCP segmentation or UDP segmentation on top of UDP + encapsulation with the outer header requiring checksum offload, + negotiating the VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO_CSUM feature. + \item The converse features are also available: a driver can save the virtual device some work by negotiating these features.\note{For example, a network packet transported between two guests on the same system might not need checksumming at all, nor segmentation, @@ -400,8 +420,9 @@ \subsection{Device Initialization}\label{sec:Device Types / Network Device / Dev checksummed packets can be received, and if it can do that then the VIRTIO_NET_F_GUEST_TSO4, VIRTIO_NET_F_GUEST_TSO6, VIRTIO_NET_F_GUEST_UFO, VIRTIO_NET_F_GUEST_ECN, VIRTIO_NET_F_GUEST_USO4, - VIRTIO_NET_F_GUEST_USO6 and VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO - are the input equivalents of the features described above. + VIRTIO_NET_F_GUEST_USO6 VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO and + VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO_CSUM are the input equivalents of + the features described above. See \ref{sec:Device Types / Network Device / Device Operation / Setting Up Receive Buffers}~\nameref{sec:Device Types / Network Device / Device Operation / Setting Up Receive Buffers} and @@ -425,6 +446,7 @@ \subsection{Device Operation}\label{sec:Device Types / Network Device / Device O #define VIRTIO_NET_HDR_F_NEEDS_CSUM 1 #define VIRTIO_NET_HDR_F_DATA_VALID 2 #define VIRTIO_NET_HDR_F_RSC_INFO 4 +#define VIRTIO_NET_HDR_F_UDP_TUNNEL_CSUM 8 u8 flags; #define VIRTIO_NET_HDR_GSO_NONE 0 #define VIRTIO_NET_HDR_GSO_TCPV4 1 @@ -546,8 +568,11 @@ \subsubsection{Packet Transmission}\label{sec:Device Types / Network Device / De \field{gso_type} has the VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV4 bit or VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV6 bit set, the GSO protocol is encapsulated in a UDP tunnel. - The outer UDP header must not require checksum validation, i.e. the outer UDP - checksum must be a positive zero (0x0) as defined in UDP RFC 768. + If the outer UDP header requires checksumming, the driver must have + additionally negotiated the VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO_CSUM feature + and offloaded the outer checksum accordingly, otherwise + the outer UDP header must not require checksum validation, i.e. the outer + UDP checksum must be positive zero (0x0) as defined in UDP RFC 768. The other tunnel-related fields indicate how to replicate the packet headers to cut it into smaller packets: @@ -575,6 +600,20 @@ \subsubsection{Packet Transmission}\label{sec:Device Types / Network Device / De VIRTIO_NET_HDR_GSO_TCPV4 | VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV4 = 0x21 \end{note} +\item If the driver negotiated the VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO_CSUM feature, + the transmitted packet is a GSO one encapsulated in a UDP tunnel, and + the outer UDP header requires checksumming, the driver can skip checksumming + the outer header: + + \begin{itemize} + \item \field{flags} has the VIRTIO_NET_HDR_F_UDP_TUNNEL_CSUM set, + + \item The outer UDP checksum field in the packet is set to the sum + of the UDP pseudo header, so that replacing it by the ones' + complement checksum of the outer UDP header and payload will give the + correct result. + \end{itemize} + \item \field{num_buffers} is set to zero. This field is unused on transmitted packets. \item The header and packet are added as one output descriptor to the @@ -630,6 +669,14 @@ \subsubsection{Packet Transmission}\label{sec:Device Types / Network Device / De MUST SET the VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV6 bit. The driver MUST NOT send to the device TCP or UDP GSO packets over UDP tunnel +requiring segmentation and outer UDP checksum offload, unless both the +VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO and VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO_CSUM features +are negotiated, in which case the driver MUST set the VIRTIO_NET_HDR_GSO_UDP_TUNNEL +bit in the \field{gso_type} and the VIRTIO_NET_HDR_F_UDP_TUNNEL_CSUM bit in the \field{flags}. + +If VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO_CSUM is not negotiated, the driver MUST not set +the VIRTIO_NET_HDR_F_UDP_TUNNEL_CSUM bit in the \field{flags} and +MUST NOT send to the device TCP or UDP GSO packets over UDP tunnel requiring segmentation and outer UDP checksum offload. The driver MUST NOT set the VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV4 bit or the @@ -639,6 +686,9 @@ \subsubsection{Packet Transmission}\label{sec:Device Types / Network Device / De The driver MUST NOT set the VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV4 bit and the VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV6 bit together. +The driver MUST NOT set the VIRTIO_NET_HDR_F_UDP_TUNNEL_CSUM bit \field{flags} +without setting the VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO bit in \field{gso_type}. + If the VIRTIO_NET_F_CSUM feature has been negotiated, the driver MAY set the VIRTIO_NET_HDR_F_NEEDS_CSUM bit in \field{flags}, if so: @@ -690,8 +740,29 @@ \subsubsection{Packet Transmission}\label{sec:Device Types / Network Device / De to the inner transport header and inner transport checksum field. \end{itemize} +If the VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO_CSUM feature has been negotiated, +and the VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV4 bit or +VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV6 bit in \field{gso_type} are set, +the driver MAY set the VIRTIO_NET_HDR_F_UDP_TUNNEL_CSUM bit in +\field{flags}, if so: +\begin{enumerate} +\item the driver MUST validate the packet checksum at + offset 6 from \field{outer_th_offset} as well as all + preceding offsets; +\item the driver MUST set the packet outer UDP header checksum + to the outer UDP pseudo header; +\end{enumerate} + +\begin{note} +calculating a ones' complement checksum from \field{outer_th_offset} +up until the end of the packet and storing the result at offset 6 +from \field{outer_th_offset} will result in a fully checksummed outer UDP packet; +\end{note} + If the VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV4 bit or the -VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV6 bit in \field{gso_type} are set, the +VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV6 bit in \field{gso_type} are set +and the VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO_CSUM feature has not +been negotiated, the outer UDP header MUST NOT require checksum validation. That is, the outer UDP checksum value MUST be 0 or the validated complete checksum for such header. @@ -710,6 +781,10 @@ \subsubsection{Packet Transmission}\label{sec:Device Types / Network Device / De been negotiated, the driver MUST NOT set neither the VIRTIO_NET_HDR_F_UDP_TUNNEL_IPV4 bit nor the VIRTIO_NET_HDR_F_UDP_TUNNEL_IPV6 in \field{gso_type}. +If the VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO_CSUM option has not been negotiated, +the driver MUST NOT set the VIRTIO_NET_HDR_F_UDP_TUNNEL_CSUM bit +in \field{flags}. + The driver SHOULD accept the VIRTIO_NET_F_GUEST_HDRLEN feature if it has been offered, and if it's able to provide the exact header length. @@ -876,6 +951,12 @@ \subsubsection{Processing of Incoming Packets}\label{sec:Device Types / Network bit MAY be set. In such case the \field{outer_th_offset} and \field{inner_nh_offset} fields indicate the corresponding headers information. +\item If the VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO_CSUM feature was negotiated, and + the VIRTIO_NET_HDR_GSO_UDP_TUNNEL bit is set in \field{gso_type}, the + VIRTIO_NET_HDR_F_UDP_TUNNEL_CSUM bit in the \field{flags} can be set: if so, + the outer UDP header checksum at offset 6 from \field{outer_th_offset} have + been validated. The checksum on the packet is incomplete and the + \field{outer_th_offset} indicates how to calculate it. \end{enumerate} @@ -924,6 +1005,9 @@ \subsubsection{Processing of Incoming Packets}\label{sec:Device Types / Network neither the VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV4 bit nor the VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV6 bit in \field{gso_type}. +If VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO_CSUM is not negotiated the device MUST NOT set +the VIRTIO_NET_HDR_F_UDP_TUNNEL_CSUM bit in \field{flags}. + The device SHOULD NOT send to the driver TCP packets requiring segmentation offload which have the Explicit Congestion Notification bit set, unless the VIRTIO_NET_F_GUEST_ECN feature is negotiated, in which case the @@ -956,7 +1040,8 @@ \subsubsection{Processing of Incoming Packets}\label{sec:Device Types / Network to the corresponding header information, and the outer UDP header MUST NOT require checksum offload. -The device MUST NOT send the driver TCP or UDP GSO packets encapsulated in UDP +If the VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO_CSUM feature has not been negotiated, +the device MUST NOT send the driver TCP or UDP GSO packets encapsulated in UDP tunnel and requiring segmentation and outer checksum offload. If none of the VIRTIO_NET_F_GUEST_TSO4, TSO6, UFO, USO4 or USO6 options have @@ -982,8 +1067,27 @@ \subsubsection{Processing of Incoming Packets}\label{sec:Device Types / Network If the VIRTIO_NET_F_GUEST_CSUM feature has been negotiated, the device MAY set the VIRTIO_NET_HDR_F_DATA_VALID bit in \field{flags}, if so, the device MUST validate the packet -checksum (in case of multiple encapsulated protocols, one level -of checksums is validated). +checksum. If the VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO_CSUM feature has been negotiated, +and the VIRTIO_NET_HDR_F_UDP_TUNNEL bit set, both the outer UDP checksum +and the inner transport checksum have been +validated, otherwise only one level of checksums (the inner one in case +of tunnels) has been validated. + +If the VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO_CSUM feature has been negotiated, +the bit VIRTIO_NET_HDR_GSO_UDP_TUNNEL is set in \field{gso_type} +device MAY set the VIRTIO_NET_HDR_F_UDP_TUNNEL_CSUM bit in +\field{flags}, if so: +\begin{enumerate} +\item the device MUST validate the packet checksum at + offset 6 from \field{outer_th_offset}; +\item the device MUST set the packet checksum stored in the + receive buffer to the outer UDP pseudo header; +\end{enumerate} + +Otherwise, if the VIRTIO_NET_HDR_GSO_UDP_TUNNEL is set in +\field{gso_type} and the bit VIRTIO_NET_HDR_F_UDP_TUNNEL_CSUM is cleared in +\field{flags}, the device MUST either provide a zero outer UDP header +checksum or a fully checksummed outer UDP header. \drivernormative{\paragraph}{Processing of Incoming Packets}{Device Types / Network Device / Device Operation / @@ -1769,6 +1873,7 @@ \subsubsection{Control Virtqueue}\label{sec:Device Types / Network Device / Devi #define VIRTIO_NET_F_GUEST_ECN 9 #define VIRTIO_NET_F_GUEST_UFO 10 #define VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO 47 +#define VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO_CSUM 48 #define VIRTIO_NET_F_GUEST_USO4 54 #define VIRTIO_NET_F_GUEST_USO6 55 -- 2.45.2 ^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH v7 2/2] virtio-net: define UDP tunnel checksum offload feature 2024-08-21 10:02 ` [PATCH v7 2/2] virtio-net: define UDP tunnel checksum " Paolo Abeni @ 2024-08-21 19:42 ` Willem de Bruijn 2024-08-22 15:18 ` Paolo Abeni 0 siblings, 1 reply; 15+ messages in thread From: Willem de Bruijn @ 2024-08-21 19:42 UTC (permalink / raw) To: Paolo Abeni Cc: virtio-comment, maxime.coquelin, Eelco Chaudron, Jason Wang, Stefano Garzarella On Wed, Aug 21, 2024 at 6:03 AM Paolo Abeni <pabeni@redhat.com> wrote: > > This complements the previous change, additionally > introducing the UDP tunnel checksum offload feature. > > Differently from the plain checksum offload feature, this > depends on UDP tunnel segmentation being available, as outer checksum > computation for non GSO packets is cheap and H/W implementation of > such a feature is complex. > > UDP tunnel checksum offload does not introduce additional fields, > instead it leverages the outer transport offset introduced by the > UDP tunnel segmentation feature to locate the outer checksum > inside the packet. > > When UDP tunnel checksum offload is enabled, the VIRTIO_NET_HDR_F_DATA_VALID > bit semantic is extended, in the device -> driver direction, to > cover both the inner and the outer checksum validation. > > Signed-off-by: Paolo Abeni <pabeni@redhat.com> > +\begin{note} > +The dependency between UDP_TUNNEL_GSO_CSUM and UDP_TUNNEL_GSO is intentionally > +in the opposite direction with respect to the plain GSO features and the plain > +checksum offload because UDP tunnel checksum offload gives very little gain > +for non GSO packets is quite complex to implement in H/W. packets [and] is quite > See \ref{sec:Device Types / Network Device / Device Operation / > Setting Up Receive Buffers}~\nameref{sec:Device Types / Network > Device / Device Operation / Setting Up Receive Buffers} and > @@ -425,6 +446,7 @@ \subsection{Device Operation}\label{sec:Device Types / Network Device / Device O > #define VIRTIO_NET_HDR_F_NEEDS_CSUM 1 > #define VIRTIO_NET_HDR_F_DATA_VALID 2 > #define VIRTIO_NET_HDR_F_RSC_INFO 4 > +#define VIRTIO_NET_HDR_F_UDP_TUNNEL_CSUM 8 Only now notice: since this is only valid with GSO, should this be a gso_type bit, like VIRTIO_NET_HDR_GSO_ECN? > +If the VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO_CSUM feature has been negotiated, > +and the VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV4 bit or > +VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV6 bit in \field{gso_type} are set, > +the driver MAY set the VIRTIO_NET_HDR_F_UDP_TUNNEL_CSUM bit in > +\field{flags}, if so: > +\begin{enumerate} > +\item the driver MUST validate the packet checksum at > + offset 6 from \field{outer_th_offset} as well as all > + preceding offsets; I don't follow this comment: when programming checksum offload, why does the driver have to validate any checksums? Same in two other locations. ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v7 2/2] virtio-net: define UDP tunnel checksum offload feature 2024-08-21 19:42 ` Willem de Bruijn @ 2024-08-22 15:18 ` Paolo Abeni 2024-08-22 15:20 ` Willem de Bruijn 0 siblings, 1 reply; 15+ messages in thread From: Paolo Abeni @ 2024-08-22 15:18 UTC (permalink / raw) To: Willem de Bruijn Cc: virtio-comment, maxime.coquelin, Eelco Chaudron, Jason Wang, Stefano Garzarella On 8/21/24 21:42, Willem de Bruijn wrote: > On Wed, Aug 21, 2024 at 6:03 AM Paolo Abeni <pabeni@redhat.com> wrote: >> @@ -425,6 +446,7 @@ \subsection{Device Operation}\label{sec:Device Types / Network Device / Device O >> #define VIRTIO_NET_HDR_F_NEEDS_CSUM 1 >> #define VIRTIO_NET_HDR_F_DATA_VALID 2 >> #define VIRTIO_NET_HDR_F_RSC_INFO 4 >> +#define VIRTIO_NET_HDR_F_UDP_TUNNEL_CSUM 8 > > Only now notice: since this is only valid with GSO, should this be a > gso_type bit, like VIRTIO_NET_HDR_GSO_ECN? I originally placed the flag here for consistency with NEEDS_CSUM, before making the TUNNEL_CSUM offload depending on TUNNEL_GSO offload. I think it's still a valid choice: the bit space in 'gso_type' is quite low; will be significantly lower then what will have in 'flags' with such change, I *think* it's more future proof this way. Do you have strong opinion otherwise? >> +If the VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO_CSUM feature has been negotiated, >> +and the VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV4 bit or >> +VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV6 bit in \field{gso_type} are set, >> +the driver MAY set the VIRTIO_NET_HDR_F_UDP_TUNNEL_CSUM bit in >> +\field{flags}, if so: >> +\begin{enumerate} >> +\item the driver MUST validate the packet checksum at >> + offset 6 from \field{outer_th_offset} as well as all >> + preceding offsets; > > I don't follow this comment: when programming checksum offload, why > does the driver have to validate any checksums? I must admit this is plain cargo cult programming, from the original csum fields specification. I naively (and forcefully) interpreted the original statement as 'the driver must validate the csums related offset' (yup, it requires quite a bit of good interpretation will ;). I now think we are better off dropping this sentence (and similar ones) completely. Thanks, Paolo > > Same in two other locations. > ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v7 2/2] virtio-net: define UDP tunnel checksum offload feature 2024-08-22 15:18 ` Paolo Abeni @ 2024-08-22 15:20 ` Willem de Bruijn 0 siblings, 0 replies; 15+ messages in thread From: Willem de Bruijn @ 2024-08-22 15:20 UTC (permalink / raw) To: Paolo Abeni Cc: virtio-comment, maxime.coquelin, Eelco Chaudron, Jason Wang, Stefano Garzarella On Thu, Aug 22, 2024 at 11:19 AM Paolo Abeni <pabeni@redhat.com> wrote: > > On 8/21/24 21:42, Willem de Bruijn wrote: > > On Wed, Aug 21, 2024 at 6:03 AM Paolo Abeni <pabeni@redhat.com> wrote: > >> @@ -425,6 +446,7 @@ \subsection{Device Operation}\label{sec:Device Types / Network Device / Device O > >> #define VIRTIO_NET_HDR_F_NEEDS_CSUM 1 > >> #define VIRTIO_NET_HDR_F_DATA_VALID 2 > >> #define VIRTIO_NET_HDR_F_RSC_INFO 4 > >> +#define VIRTIO_NET_HDR_F_UDP_TUNNEL_CSUM 8 > > > > Only now notice: since this is only valid with GSO, should this be a > > gso_type bit, like VIRTIO_NET_HDR_GSO_ECN? > > I originally placed the flag here for consistency with NEEDS_CSUM, > before making the TUNNEL_CSUM offload depending on TUNNEL_GSO offload. > > I think it's still a valid choice: the bit space in 'gso_type' is quite > low; will be significantly lower then what will have in 'flags' with > such change, I *think* it's more future proof this way. Do you have > strong opinion otherwise? That's a valid reason. SGTM. ^ permalink raw reply [flat|nested] 15+ messages in thread
end of thread, other threads:[~2024-09-04 14:53 UTC | newest] Thread overview: 15+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2024-08-21 10:02 [PATCH v7 0/2] virtio-net: define UDP tunnel offload Paolo Abeni 2024-08-21 10:02 ` [PATCH v7 1/2] virtio-net: define UDP tunnel segmentation offload feature Paolo Abeni 2024-08-21 19:28 ` Willem de Bruijn 2024-08-21 19:48 ` Willem de Bruijn 2024-08-22 14:25 ` Paolo Abeni 2024-08-22 14:30 ` Willem de Bruijn 2024-08-22 14:11 ` Paolo Abeni 2024-08-26 12:23 ` [EXTERNAL] " Shiva Shankar Kommula 2024-08-29 15:49 ` Paolo Abeni 2024-09-02 12:31 ` Shiva Shankar Kommula 2024-09-04 14:53 ` Paolo Abeni 2024-08-21 10:02 ` [PATCH v7 2/2] virtio-net: define UDP tunnel checksum " Paolo Abeni 2024-08-21 19:42 ` Willem de Bruijn 2024-08-22 15:18 ` Paolo Abeni 2024-08-22 15:20 ` Willem de Bruijn
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox