* [PATCH v8 1/2] virtio-net: define UDP tunnel segmentation offload feature
2024-09-04 15:49 [PATCH v8 0/2] virtio-net: define UDP tunnel offload Paolo Abeni
@ 2024-09-04 15:49 ` Paolo Abeni
2024-09-04 15:49 ` [PATCH v8 2/2] virtio-net: define UDP tunnel checksum " Paolo Abeni
` (2 subsequent siblings)
3 siblings, 0 replies; 15+ messages in thread
From: Paolo Abeni @ 2024-09-04 15:49 UTC (permalink / raw)
To: virtio-comment
Cc: maxime.coquelin, Eelco Chaudron, Jason Wang, Stefano Garzarella,
Willem de Bruijn, kshankar
The VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV{4,6} are two gso_type flags
allowing respectively GSO over UDPv4 tunnel and GSO over UDPv6 tunnel.
They can be negotiated on both the host and guest sides.
One constraint addressed here is that the virtio side (either device or
driver) receiving a UDP tunneled GSO packet must be able to reconstruct
completely the inner and outer headers offset - to allow for later GSO.
To accommodate such need, new fields are introduced in the virtio_net
header: outer_th_offset and inner_nh_offset.
They map directly to the corresponding header information. The inner
transport header is implied by the (inner) checksum offload.
Those fields are required because a virtio device H/W implementation
performing segmentation for UDP tunneled packet will need to touch
the outer transport protocol (for the UDP length filed), the
inner network protocol (for the total length field, in the IPv4
case).
Note that segmentation will additionally need to touch
the outer network protocol and the inner transport protocol. The first
is implied/easily found with trivial parsing, the latter is identified
by the existing csum_start field.
Note that there is no concept of UDP tunnel type negotiation (e.g.
vxlan, geneve, vxlan-gpe, etc.), as a virtio device H/W implementation
can perform segmentation for every possible UDP-tunnel given the
specified new fields.
In the reverse direction, if a virtio device H/W implementation receives
some traffic for an unknown or unsupported UDP tunnel, it will simply
not aggregate the wire packet in a GSO one.
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
---
v7 -> v8:
- dropped 'Note that' in note
- NOT set neither A nor B -> NOT set either A or B
- specify that 'gso_partial' needs uh->len set to the wire one
v6 -> v7:
- replaced inner_protocol_type with VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV6 &&
VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV4
- UDP_TUNNEL now depends on all the TCP and USO segmentation features
- reworded the 'VIRTIO_NET_HDR_GSO_UDP_TUNNEL' bit driver normative
- specified that csum_start/offset points to the inner header in the
driver normative, too
- new fields presence depends on either VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO or
VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO
- e.g. -> i.e.
- zero -> positive zero (0x0)
- checksumming -> checksum validation
- explain why we the outer UDP csum can be precomputed only
when the GSO packet length is a multiple of gso_size
- added an example note for the tunnel-related fields value
https://lore.kernel.org/virtio-comment/cover.1722252302.git.pabeni@redhat.com/T/#t
v5 -> v6:
- split in 2 patches
- dropped the unneeded outer_mh_offset field
https://lore.kernel.org/virtio-comment/56ccba136272593d0d2af0303600c06f26bd84a1.1718621522.git.pabeni@redhat.com/
v4 -> v5:
- add separate negotiation for UDP_TUNNEL_CSUM
- much more verbose specification of outer csum handling
- UDP_TUNNEL_CSUM requires GSO_UDP_TUNNEL
https://lore.kernel.org/virtio-comment/cb89d3d6b6a7e4a7f13e60dd3fded5e950727890.1716448766.git.pabeni@redhat.com/
v3 -> v4:
- new virtio_net_hdr fields depends on F_HOST_UDP_TUNNEL_GSO or
F_GUEST_UDP_TUNNEL_GSO (Stefano)
- Extended the changelog to answer Jason's questions.
- Clarified outer csum handling (Jason)
https://lore.kernel.org/virtio-dev/f0bf2db203f8061a3ecb50b7a48cf26d8f3c7f68.1716193086.git.pabeni@redhat.com
v2 -> v3:
- UDP_TUNNEL -> UDP_TUNNEL_GSO
- add explicit fields for the inner meta-data
- more verbose changelog
https://lists.oasis-open.org/archives/virtio-dev/202206/msg00026.html
v1 -> v2:
- explicitly state that the outer header probing is mandatory
- explicitly state that GSO_UDP is not allowed with GSO_UDP_TUNNEL
- clarify hdr_len usage
- clarify UDP_TUNNEL_CSUM bit usage
- fix a few typos
https://lists.oasis-open.org/archives/virtio-dev/202205/msg00037.html
---
device-types/net/description.tex | 159 +++++++++++++++++++++++++++++--
1 file changed, 153 insertions(+), 6 deletions(-)
diff --git a/device-types/net/description.tex b/device-types/net/description.tex
index 76585b0..3fb9940 100644
--- a/device-types/net/description.tex
+++ b/device-types/net/description.tex
@@ -88,6 +88,12 @@ \subsection{Feature bits}\label{sec:Device Types / Network Device / Feature bits
\item[VIRTIO_NET_F_CTRL_MAC_ADDR(23)] Set MAC address through control
channel.
+\item[VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO (47)] Driver can receive GSO packets
+ carried by a UDP tunnel.
+
+\item[VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO (49)] Device can receive GSO packets
+ carried by a UDP tunnel.
+
\item[VIRTIO_NET_F_HASH_TUNNEL(51)] Device supports inner header hash for encapsulated packets.
\item[VIRTIO_NET_F_VQ_NOTF_COAL(52)] Device supports virtqueue notification coalescing.
@@ -133,12 +139,16 @@ \subsubsection{Feature bit requirements}\label{sec:Device Types / Network Device
\item[VIRTIO_NET_F_GUEST_UFO] Requires VIRTIO_NET_F_GUEST_CSUM.
\item[VIRTIO_NET_F_GUEST_USO4] Requires VIRTIO_NET_F_GUEST_CSUM.
\item[VIRTIO_NET_F_GUEST_USO6] Requires VIRTIO_NET_F_GUEST_CSUM.
+\item[VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO] Requires VIRTIO_NET_F_GUEST_TSO4, VIRTIO_NET_F_GUEST_TSO6,
+ VIRTIO_NET_F_GUEST_USO4 and VIRTIO_NET_F_GUEST_USO6.
\item[VIRTIO_NET_F_HOST_TSO4] Requires VIRTIO_NET_F_CSUM.
\item[VIRTIO_NET_F_HOST_TSO6] Requires VIRTIO_NET_F_CSUM.
\item[VIRTIO_NET_F_HOST_ECN] Requires VIRTIO_NET_F_HOST_TSO4 or VIRTIO_NET_F_HOST_TSO6.
\item[VIRTIO_NET_F_HOST_UFO] Requires VIRTIO_NET_F_CSUM.
\item[VIRTIO_NET_F_HOST_USO] Requires VIRTIO_NET_F_CSUM.
+\item[VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO] Requires VIRTIO_NET_F_HOST_TSO4, VIRTIO_NET_F_HOST_TSO6
+ and VIRTIO_NET_F_HOST_USO.
\item[VIRTIO_NET_F_CTRL_RX] Requires VIRTIO_NET_F_CTRL_VQ.
\item[VIRTIO_NET_F_CTRL_VLAN] Requires VIRTIO_NET_F_CTRL_VQ.
@@ -375,6 +385,13 @@ \subsection{Device Initialization}\label{sec:Device Types / Network Device / Dev
TCP), VIRTIO_NET_F_HOST_TSO6 (IPv6 TCP), VIRTIO_NET_F_HOST_UFO
(UDP fragmentation) and VIRTIO_NET_F_HOST_USO (UDP segmentation) features.
+\item If the VIRTIO_NET_F_HOST_TSO6, VIRTIO_NET_F_HOST_TSO4 and VIRTIO_NET_F_HOST_USO
+ segmentation features are negotiated, a driver can
+ use TCP segmentation or UDP segmentation on top of UDP encapsulation
+ offload, when the outer header does not require checksumming - e.g.
+ the outer UDP checksum is zero - by negotiating the
+ VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO feature.
+
\item The converse features are also available: a driver can save
the virtual device some work by negotiating these features.\note{For example, a network packet transported between two guests on
the same system might not need checksumming at all, nor segmentation,
@@ -382,8 +399,9 @@ \subsection{Device Initialization}\label{sec:Device Types / Network Device / Dev
The VIRTIO_NET_F_GUEST_CSUM feature indicates that partially
checksummed packets can be received, and if it can do that then
the VIRTIO_NET_F_GUEST_TSO4, VIRTIO_NET_F_GUEST_TSO6,
- VIRTIO_NET_F_GUEST_UFO, VIRTIO_NET_F_GUEST_ECN, VIRTIO_NET_F_GUEST_USO4
- and VIRTIO_NET_F_GUEST_USO6 are the input equivalents of the features described above.
+ VIRTIO_NET_F_GUEST_UFO, VIRTIO_NET_F_GUEST_ECN, VIRTIO_NET_F_GUEST_USO4,
+ VIRTIO_NET_F_GUEST_USO6 and VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO
+ are the input equivalents of the features described above.
See \ref{sec:Device Types / Network Device / Device Operation /
Setting Up Receive Buffers}~\nameref{sec:Device Types / Network
Device / Device Operation / Setting Up Receive Buffers} and
@@ -413,6 +431,8 @@ \subsection{Device Operation}\label{sec:Device Types / Network Device / Device O
#define VIRTIO_NET_HDR_GSO_UDP 3
#define VIRTIO_NET_HDR_GSO_TCPV6 4
#define VIRTIO_NET_HDR_GSO_UDP_L4 5
+#define VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV4 0x20
+#define VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV6 0x40
#define VIRTIO_NET_HDR_GSO_ECN 0x80
u8 gso_type;
le16 hdr_len;
@@ -423,6 +443,8 @@ \subsection{Device Operation}\label{sec:Device Types / Network Device / Device O
le32 hash_value; (Only if VIRTIO_NET_F_HASH_REPORT negotiated)
le16 hash_report; (Only if VIRTIO_NET_F_HASH_REPORT negotiated)
le16 padding_reserved; (Only if VIRTIO_NET_F_HASH_REPORT negotiated)
+ le16 outer_th_offset (Only if VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO or VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO negotiated)
+ le16 inner_nh_offset; (Only if VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO or VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO negotiated)
};
\end{lstlisting}
@@ -480,6 +502,10 @@ \subsubsection{Packet Transmission}\label{sec:Device Types / Network Device / De
followed by the TCP header (with the TCP checksum field 16 bytes
into that header). \field{csum_start} will be 14+20 = 34 (the TCP
checksum includes the header), and \field{csum_offset} will be 16.
+If the given packet has the VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV4 bit or the
+VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV6 bit set,
+the above checksum fields refer to the inner header checksum, see
+the example below.
\end{note}
\item If the driver negotiated
@@ -516,6 +542,39 @@ \subsubsection{Packet Transmission}\label{sec:Device Types / Network Device / De
specifically in the protocol.}.
\end{itemize}
+\item If the driver negotiated the VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO feature and the
+ \field{gso_type} has the VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV4 bit or
+ VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV6 bit set, the GSO protocol is encapsulated
+ in a UDP tunnel.
+ The outer UDP header must not require checksum validation, i.e. the outer UDP
+ checksum must be a positive zero (0x0) as defined in UDP RFC 768.
+ The other tunnel-related fields indicate how to replicate the packet
+ headers to cut it into smaller packets:
+
+ \begin{itemize}
+ \item \field{outer_th_offset} field indicates the outer transport header within
+ the packet. This field differs from \field{csum_start} as the latter
+ points to the inner transport header within the packet.
+
+ \item \field{inner_nh_offset} field indicates the inner network header within
+ the packet.
+ \end{itemize}
+
+\begin{note}
+For example, consider a partially checksummed TCP (IPv4) packet carried over a
+Geneve UDP tunnel (again IPv4) with no tunnel options. The
+only relevant variable related to the tunnel type is the tunnel header length.
+The packet will have a 14 byte outer ethernet header, 20 byte outer IP header
+followed by the 8 byte UDP header (with a 0 checksum value), 8 byte Geneve header,
+14 byte inner ethernet header, 20 byte inner IP header
+and the TCP header (with the TCP checksum field 16 bytes
+into that header). \field{csum_start} will be 14+20+8+8+14+20 = 84 (the TCP
+checksum includes the header), \field{csum_offset} will be 16.
+\field{inner_nh_offset} will be 14+20+8+8+14 = 62, \field{outer_th_offset} will be
+14+20+8 = 42 and \field{gso_type} will be
+VIRTIO_NET_HDR_GSO_TCPV4 | VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV4 = 0x21
+\end{note}
+
\item \field{num_buffers} is set to zero. This field is unused on transmitted packets.
\item The header and packet are added as one output descriptor to the
@@ -557,6 +616,29 @@ \subsubsection{Packet Transmission}\label{sec:Device Types / Network Device / De
driver MUST set the VIRTIO_NET_HDR_GSO_ECN bit in
\field{gso_type}.
+If VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO is negotiated, the driver MAY set
+VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV4 bit or the VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV6 bit
+in \field{gso_type} according to the inner network header protocol type
+to request TCP or UDP GSO packets over UDPv4 or UDPv6 tunnel
+segmentation, otherwise the driver MUST NOT set either the
+VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV4 bit or the VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV6 bit
+in \field{gso_type}.
+
+When requesting TCP or UDP GSO segmentation over UDP tunnel, the driver MUST SET the
+VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV4 bit if the inner network header is IPv4, i.e. the
+packet is a TCPv4 GSO one, otherwise, if the inner network header is IPv6, the driver
+MUST SET the VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV6 bit.
+
+The driver MUST NOT send to the device TCP or UDP GSO packets over UDP tunnel
+requiring segmentation and outer UDP checksum offload.
+
+The driver MUST NOT set the VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV4 bit or the
+VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV6 bit together with VIRTIO_NET_HDR_GSO_UDP, as the
+latter is deprecated in favor of UDP_L4 and no new feature will support it.
+
+The driver MUST NOT set the VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV4 bit and the
+VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV6 bit together.
+
If the VIRTIO_NET_F_CSUM feature has been negotiated, the
driver MAY set the VIRTIO_NET_HDR_F_NEEDS_CSUM bit in
\field{flags}, if so:
@@ -598,6 +680,38 @@ \subsubsection{Packet Transmission}\label{sec:Device Types / Network Device / De
header.
\end{itemize}
+If the VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO option has been negotiated, the
+driver MAY set the VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV4 bit or the
+VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV6 bit in \field{gso_type}, if so:
+\begin{itemize}
+\item the driver MUST set \field{outer_th_offset} to the outer UDP header
+ offset and \field{inner_nh_offset} to the inner network header offset.
+ The \field{csum_start} and \field{csum_offset} fields point respectively
+ to the inner transport header and inner transport checksum field.
+\end{itemize}
+
+If the VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV4 bit or the
+VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV6 bit in \field{gso_type} are set, the
+outer UDP header MUST NOT require checksum validation. That is, the
+outer UDP checksum value MUST be 0 or the validated complete checksum
+for such header.
+
+\begin{note}
+The valid complete checksum of the outer UDP header of individual segments
+can be computed by the driver prior to segmentation only if the GSO packet
+size is a multiple of \field{gso_size}, because then all segments
+have the same size and thus all data included in the outer UDP
+checksum is the same for every segment. These pre-computed segment
+length and checksum fields are different from those of the GSO
+packet.
+In this scenario the outer UDP header of the GSO packet must carry the
+segmented UDP packet length.
+\end{note}
+
+If the VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO option has not
+been negotiated, the driver MUST NOT set either the VIRTIO_NET_HDR_F_UDP_TUNNEL_IPV4
+bit or the VIRTIO_NET_HDR_F_UDP_TUNNEL_IPV6 in \field{gso_type}.
+
The driver SHOULD accept the VIRTIO_NET_F_GUEST_HDRLEN feature if it has
been offered, and if it's able to provide the exact header length.
@@ -633,6 +747,10 @@ \subsubsection{Packet Transmission}\label{sec:Device Types / Network Device / De
\end{note}
\end{itemize}
+If the VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV4 bit and theVIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV6
+bit in \field{gso_type} are not set, the device MUST NOT use the
+\field{outer_th_offset} and \field{inner_nh_offset}.
+
If VIRTIO_NET_HDR_F_NEEDS_CSUM is not set, the device MUST NOT
rely on the packet checksum being correct.
\paragraph{Packet Transmission Interrupt}\label{sec:Device Types / Network Device / Device Operation / Packet Transmission / Packet Transmission Interrupt}
@@ -727,8 +845,8 @@ \subsubsection{Processing of Incoming Packets}\label{sec:Device Types / Network
has been validated.
\end{enumerate}
-Additionally, VIRTIO_NET_F_GUEST_CSUM, TSO4, TSO6, UDP and ECN
-features enable receive checksum, large receive offload and ECN
+Additionally, VIRTIO_NET_F_GUEST_CSUM, TSO4, TSO6, UDP, UDP_TUNNEL
+and ECN features enable receive checksum, large receive offload and ECN
support which are the input equivalents of the transmit checksum,
transmit segmentation offloading and ECN features, as described
in \ref{sec:Device Types / Network Device / Device Operation /
@@ -750,8 +868,16 @@ \subsubsection{Processing of Incoming Packets}\label{sec:Device Types / Network
from \field{csum_start} and any preceding checksums
have been validated. The checksum on the packet is incomplete and
if bit VIRTIO_NET_HDR_F_RSC_INFO is not set in \field{flags},
- then \field{csum_start} and \field{csum_offset} indicate how to calculate it
- (see Packet Transmission point 1).
+ then \field{csum_start} and \field{csum_offset} indicate how to calculate it.
+ If the VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV4 bit or VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV6
+ bit are set, the \field{csum_start} field
+ refers to the inner transport header offset (see Packet Transmission point 1).
+\item If the VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO option was negotiated and
+ \field{gso_type} is not VIRTIO_NET_HDR_GSO_NONE, the
+ VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV4 bit or the VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV6
+ bit MAY be set. In such case the \field{outer_th_offset} and
+ \field{inner_nh_offset} fields indicate the corresponding
+ headers information.
\end{enumerate}
@@ -796,6 +922,10 @@ \subsubsection{Processing of Incoming Packets}\label{sec:Device Types / Network
If none of VIRTIO_NET_F_GUEST_USO4 or VIRTIO_NET_F_GUEST_USO6 have been negotiated,
the device MUST NOT set \field{gso_type} to VIRTIO_NET_HDR_GSO_UDP_L4.
+If VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO is not negotiated, the device MUST NOT set
+either the VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV4 bit or the
+VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV6 bit in \field{gso_type}.
+
The device SHOULD NOT send to the driver TCP packets requiring segmentation offload
which have the Explicit Congestion Notification bit set, unless the
VIRTIO_NET_F_GUEST_ECN feature is negotiated, in which case the
@@ -819,6 +949,18 @@ \subsubsection{Processing of Incoming Packets}\label{sec:Device Types / Network
fully checksummed packet;
\end{enumerate}
+The device MUST NOT send to the driver TCP or UDP GSO packets encapsulated in UDP
+tunnel and requiring segmentation offload, unless the
+VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO is negotiated, in which case the device MUST set
+the VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV4 bit or the VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV6
+bit in \field{gso_type} according to the inner network header protocol type,
+MUST set the \field{outer_th_offset} and \field{inner_nh_offset} fields
+to the corresponding header information, and the outer UDP header MUST NOT
+require checksum offload.
+
+The device MUST NOT send the driver TCP or UDP GSO packets encapsulated in UDP
+tunnel and requiring segmentation and outer checksum offload.
+
If none of the VIRTIO_NET_F_GUEST_TSO4, TSO6, UFO, USO4 or USO6 options have
been negotiated, the device MUST set \field{gso_type} to
VIRTIO_NET_HDR_GSO_NONE.
@@ -867,6 +1009,10 @@ \subsubsection{Processing of Incoming Packets}\label{sec:Device Types / Network
VIRTIO_NET_HDR_F_DATA_VALID is set, the driver MUST NOT
rely on the packet checksum being correct.
+If the VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV4 bit or the VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV6
+bit in \field{gso_type} are not set, the driver MUST NOT use the
+\field{outer_th_offset} and \field{inner_nh_offset}.
+
\paragraph{Hash calculation for incoming packets}
\label{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Hash calculation for incoming packets}
@@ -1624,6 +1770,7 @@ \subsubsection{Control Virtqueue}\label{sec:Device Types / Network Device / Devi
#define VIRTIO_NET_F_GUEST_TSO6 8
#define VIRTIO_NET_F_GUEST_ECN 9
#define VIRTIO_NET_F_GUEST_UFO 10
+#define VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO 47
#define VIRTIO_NET_F_GUEST_USO4 54
#define VIRTIO_NET_F_GUEST_USO6 55
--
2.45.2
^ permalink raw reply related [flat|nested] 15+ messages in thread* [PATCH v8 2/2] virtio-net: define UDP tunnel checksum offload feature
2024-09-04 15:49 [PATCH v8 0/2] virtio-net: define UDP tunnel offload Paolo Abeni
2024-09-04 15:49 ` [PATCH v8 1/2] virtio-net: define UDP tunnel segmentation offload feature Paolo Abeni
@ 2024-09-04 15:49 ` Paolo Abeni
2024-09-09 2:19 ` [PATCH v8 0/2] virtio-net: define UDP tunnel offload Willem de Bruijn
2024-09-09 3:16 ` Jason Wang
3 siblings, 0 replies; 15+ messages in thread
From: Paolo Abeni @ 2024-09-04 15:49 UTC (permalink / raw)
To: virtio-comment
Cc: maxime.coquelin, Eelco Chaudron, Jason Wang, Stefano Garzarella,
Willem de Bruijn, kshankar
This complements the previous change, additionally
introducing the UDP tunnel checksum offload feature.
Differently from the plain checksum offload feature, this
depends on UDP tunnel segmentation being available, as outer checksum
computation for non GSO packets is cheap and H/W implementation of
such a feature is complex.
UDP tunnel checksum offload does not introduce additional fields,
instead it leverages the outer transport offset introduced by the
UDP tunnel segmentation feature to locate the outer checksum
inside the packet.
When UDP tunnel checksum offload is negotiated:
- the driver requests the outer UDP checksum offload setting the
VIRTIO_NET_HDR_F_UDP_TUNNEL_CSUM bit in the flag field. Such bit
is not allocated inside the gso_type field to prevent space
exhaustion there.
- in the device -> driver direction, the VIRTIO_NET_HDR_F_DATA_VALID
bit semantic is extended, covering both the inner and the outer
checksum validation.
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
---
v7 -> v8:
- dropped confusing wording around csum validation
- clarified VIRTIO_NET_HDR_F_UDP_TUNNEL_CSUM allocation
v6 -> v7:
- rebased
- VIRTIO_NET_F_{HOST,GUEST}_UDP_TUNNEL_CSUM -> ...UDP_TUNNEL_GSO_CSUM
- dropped unintended change to existing TCP gso spec
---
device-types/net/description.tex | 115 ++++++++++++++++++++++++++++---
1 file changed, 105 insertions(+), 10 deletions(-)
diff --git a/device-types/net/description.tex b/device-types/net/description.tex
index 3fb9940..447b692 100644
--- a/device-types/net/description.tex
+++ b/device-types/net/description.tex
@@ -91,9 +91,15 @@ \subsection{Feature bits}\label{sec:Device Types / Network Device / Feature bits
\item[VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO (47)] Driver can receive GSO packets
carried by a UDP tunnel.
+\item[VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO_CSUM (48)] Driver handles packets
+ carried by a UDP tunnel with partial csum for the outer header.
+
\item[VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO (49)] Device can receive GSO packets
carried by a UDP tunnel.
+\item[VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO_CSUM (50)] Device handles packets
+ carried by a UDP tunnel with partial csum for the outer header.
+
\item[VIRTIO_NET_F_HASH_TUNNEL(51)] Device supports inner header hash for encapsulated packets.
\item[VIRTIO_NET_F_VQ_NOTF_COAL(52)] Device supports virtqueue notification coalescing.
@@ -141,6 +147,7 @@ \subsubsection{Feature bit requirements}\label{sec:Device Types / Network Device
\item[VIRTIO_NET_F_GUEST_USO6] Requires VIRTIO_NET_F_GUEST_CSUM.
\item[VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO] Requires VIRTIO_NET_F_GUEST_TSO4, VIRTIO_NET_F_GUEST_TSO6,
VIRTIO_NET_F_GUEST_USO4 and VIRTIO_NET_F_GUEST_USO6.
+\item[VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO_CSUM] Requires VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO
\item[VIRTIO_NET_F_HOST_TSO4] Requires VIRTIO_NET_F_CSUM.
\item[VIRTIO_NET_F_HOST_TSO6] Requires VIRTIO_NET_F_CSUM.
@@ -149,6 +156,7 @@ \subsubsection{Feature bit requirements}\label{sec:Device Types / Network Device
\item[VIRTIO_NET_F_HOST_USO] Requires VIRTIO_NET_F_CSUM.
\item[VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO] Requires VIRTIO_NET_F_HOST_TSO4, VIRTIO_NET_F_HOST_TSO6
and VIRTIO_NET_F_HOST_USO.
+\item[VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO_CSUM] Requires VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO
\item[VIRTIO_NET_F_CTRL_RX] Requires VIRTIO_NET_F_CTRL_VQ.
\item[VIRTIO_NET_F_CTRL_VLAN] Requires VIRTIO_NET_F_CTRL_VQ.
@@ -162,6 +170,13 @@ \subsubsection{Feature bit requirements}\label{sec:Device Types / Network Device
\item[VIRTIO_NET_F_HASH_TUNNEL] Requires VIRTIO_NET_F_CTRL_VQ along with VIRTIO_NET_F_RSS or VIRTIO_NET_F_HASH_REPORT.
\end{description}
+\begin{note}
+The dependency between UDP_TUNNEL_GSO_CSUM and UDP_TUNNEL_GSO is intentionally
+in the opposite direction with respect to the plain GSO features and the plain
+checksum offload because UDP tunnel checksum offload gives very little gain
+for non GSO packets and is quite complex to implement in H/W.
+\end{note}
+
\subsubsection{Legacy Interface: Feature bits}\label{sec:Device Types / Network Device / Feature bits / Legacy Interface: Feature bits}
\begin{description}
\item[VIRTIO_NET_F_GSO (6)] Device handles packets with any GSO type. This was supposed to indicate segmentation offload support, but
@@ -392,6 +407,11 @@ \subsection{Device Initialization}\label{sec:Device Types / Network Device / Dev
the outer UDP checksum is zero - by negotiating the
VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO feature.
+\item If VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO is negotiated, a driver can
+ additionally use TCP segmentation or UDP segmentation on top of UDP
+ encapsulation with the outer header requiring checksum offload,
+ negotiating the VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO_CSUM feature.
+
\item The converse features are also available: a driver can save
the virtual device some work by negotiating these features.\note{For example, a network packet transported between two guests on
the same system might not need checksumming at all, nor segmentation,
@@ -400,8 +420,9 @@ \subsection{Device Initialization}\label{sec:Device Types / Network Device / Dev
checksummed packets can be received, and if it can do that then
the VIRTIO_NET_F_GUEST_TSO4, VIRTIO_NET_F_GUEST_TSO6,
VIRTIO_NET_F_GUEST_UFO, VIRTIO_NET_F_GUEST_ECN, VIRTIO_NET_F_GUEST_USO4,
- VIRTIO_NET_F_GUEST_USO6 and VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO
- are the input equivalents of the features described above.
+ VIRTIO_NET_F_GUEST_USO6 VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO and
+ VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO_CSUM are the input equivalents of
+ the features described above.
See \ref{sec:Device Types / Network Device / Device Operation /
Setting Up Receive Buffers}~\nameref{sec:Device Types / Network
Device / Device Operation / Setting Up Receive Buffers} and
@@ -425,6 +446,7 @@ \subsection{Device Operation}\label{sec:Device Types / Network Device / Device O
#define VIRTIO_NET_HDR_F_NEEDS_CSUM 1
#define VIRTIO_NET_HDR_F_DATA_VALID 2
#define VIRTIO_NET_HDR_F_RSC_INFO 4
+#define VIRTIO_NET_HDR_F_UDP_TUNNEL_CSUM 8
u8 flags;
#define VIRTIO_NET_HDR_GSO_NONE 0
#define VIRTIO_NET_HDR_GSO_TCPV4 1
@@ -546,8 +568,11 @@ \subsubsection{Packet Transmission}\label{sec:Device Types / Network Device / De
\field{gso_type} has the VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV4 bit or
VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV6 bit set, the GSO protocol is encapsulated
in a UDP tunnel.
- The outer UDP header must not require checksum validation, i.e. the outer UDP
- checksum must be a positive zero (0x0) as defined in UDP RFC 768.
+ If the outer UDP header requires checksumming, the driver must have
+ additionally negotiated the VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO_CSUM feature
+ and offloaded the outer checksum accordingly, otherwise
+ the outer UDP header must not require checksum validation, i.e. the outer
+ UDP checksum must be positive zero (0x0) as defined in UDP RFC 768.
The other tunnel-related fields indicate how to replicate the packet
headers to cut it into smaller packets:
@@ -575,6 +600,20 @@ \subsubsection{Packet Transmission}\label{sec:Device Types / Network Device / De
VIRTIO_NET_HDR_GSO_TCPV4 | VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV4 = 0x21
\end{note}
+\item If the driver negotiated the VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO_CSUM feature,
+ the transmitted packet is a GSO one encapsulated in a UDP tunnel, and
+ the outer UDP header requires checksumming, the driver can skip checksumming
+ the outer header:
+
+ \begin{itemize}
+ \item \field{flags} has the VIRTIO_NET_HDR_F_UDP_TUNNEL_CSUM set,
+
+ \item The outer UDP checksum field in the packet is set to the sum
+ of the UDP pseudo header, so that replacing it by the ones'
+ complement checksum of the outer UDP header and payload will give the
+ correct result.
+ \end{itemize}
+
\item \field{num_buffers} is set to zero. This field is unused on transmitted packets.
\item The header and packet are added as one output descriptor to the
@@ -630,6 +669,14 @@ \subsubsection{Packet Transmission}\label{sec:Device Types / Network Device / De
MUST SET the VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV6 bit.
The driver MUST NOT send to the device TCP or UDP GSO packets over UDP tunnel
+requiring segmentation and outer UDP checksum offload, unless both the
+VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO and VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO_CSUM features
+are negotiated, in which case the driver MUST set the VIRTIO_NET_HDR_GSO_UDP_TUNNEL
+bit in the \field{gso_type} and the VIRTIO_NET_HDR_F_UDP_TUNNEL_CSUM bit in the \field{flags}.
+
+If VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO_CSUM is not negotiated, the driver MUST not set
+the VIRTIO_NET_HDR_F_UDP_TUNNEL_CSUM bit in the \field{flags} and
+MUST NOT send to the device TCP or UDP GSO packets over UDP tunnel
requiring segmentation and outer UDP checksum offload.
The driver MUST NOT set the VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV4 bit or the
@@ -639,6 +686,9 @@ \subsubsection{Packet Transmission}\label{sec:Device Types / Network Device / De
The driver MUST NOT set the VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV4 bit and the
VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV6 bit together.
+The driver MUST NOT set the VIRTIO_NET_HDR_F_UDP_TUNNEL_CSUM bit \field{flags}
+without setting the VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO bit in \field{gso_type}.
+
If the VIRTIO_NET_F_CSUM feature has been negotiated, the
driver MAY set the VIRTIO_NET_HDR_F_NEEDS_CSUM bit in
\field{flags}, if so:
@@ -690,8 +740,23 @@ \subsubsection{Packet Transmission}\label{sec:Device Types / Network Device / De
to the inner transport header and inner transport checksum field.
\end{itemize}
+If the VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO_CSUM feature has been negotiated,
+and the VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV4 bit or
+VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV6 bit in \field{gso_type} are set,
+the driver MAY set the VIRTIO_NET_HDR_F_UDP_TUNNEL_CSUM bit in
+\field{flags}, if so the driver MUST set the packet outer UDP header checksum
+to the outer UDP pseudo header.
+
+\begin{note}
+calculating a ones' complement checksum from \field{outer_th_offset}
+up until the end of the packet and storing the result at offset 6
+from \field{outer_th_offset} will result in a fully checksummed outer UDP packet;
+\end{note}
+
If the VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV4 bit or the
-VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV6 bit in \field{gso_type} are set, the
+VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV6 bit in \field{gso_type} are set
+and the VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO_CSUM feature has not
+been negotiated, the
outer UDP header MUST NOT require checksum validation. That is, the
outer UDP checksum value MUST be 0 or the validated complete checksum
for such header.
@@ -712,6 +777,10 @@ \subsubsection{Packet Transmission}\label{sec:Device Types / Network Device / De
been negotiated, the driver MUST NOT set either the VIRTIO_NET_HDR_F_UDP_TUNNEL_IPV4
bit or the VIRTIO_NET_HDR_F_UDP_TUNNEL_IPV6 in \field{gso_type}.
+If the VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO_CSUM option has not been negotiated,
+the driver MUST NOT set the VIRTIO_NET_HDR_F_UDP_TUNNEL_CSUM bit
+in \field{flags}.
+
The driver SHOULD accept the VIRTIO_NET_F_GUEST_HDRLEN feature if it has
been offered, and if it's able to provide the exact header length.
@@ -841,8 +910,10 @@ \subsubsection{Processing of Incoming Packets}\label{sec:Device Types / Network
\item If the VIRTIO_NET_F_GUEST_CSUM feature was negotiated, the
VIRTIO_NET_HDR_F_DATA_VALID bit in \field{flags} can be
set: if so, device has validated the packet checksum.
- In case of multiple encapsulated protocols, one level of checksums
- has been validated.
+ If the VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO_CSUM feature has been negotiated,
+ and the VIRTIO_NET_HDR_F_UDP_TUNNEL bit set, both the outer UDP checksum
+ and the inner transport checksum have been validated, otherwise only one
+ level of checksums (the inner one in case of tunnels) has been validated.
\end{enumerate}
Additionally, VIRTIO_NET_F_GUEST_CSUM, TSO4, TSO6, UDP, UDP_TUNNEL
@@ -878,6 +949,11 @@ \subsubsection{Processing of Incoming Packets}\label{sec:Device Types / Network
bit MAY be set. In such case the \field{outer_th_offset} and
\field{inner_nh_offset} fields indicate the corresponding
headers information.
+\item If the VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO_CSUM feature was negotiated, and
+ the VIRTIO_NET_HDR_GSO_UDP_TUNNEL bit is set in \field{gso_type}, the
+ VIRTIO_NET_HDR_F_UDP_TUNNEL_CSUM bit in the \field{flags} can be set: if so,
+ the outer UDP header checksum at offset 6 from \field{outer_th_offset}
+ is set to the outer UDP pseudo header checksum.
\end{enumerate}
@@ -926,6 +1002,9 @@ \subsubsection{Processing of Incoming Packets}\label{sec:Device Types / Network
either the VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV4 bit or the
VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV6 bit in \field{gso_type}.
+If VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO_CSUM is not negotiated the device MUST NOT set
+the VIRTIO_NET_HDR_F_UDP_TUNNEL_CSUM bit in \field{flags}.
+
The device SHOULD NOT send to the driver TCP packets requiring segmentation offload
which have the Explicit Congestion Notification bit set, unless the
VIRTIO_NET_F_GUEST_ECN feature is negotiated, in which case the
@@ -958,7 +1037,8 @@ \subsubsection{Processing of Incoming Packets}\label{sec:Device Types / Network
to the corresponding header information, and the outer UDP header MUST NOT
require checksum offload.
-The device MUST NOT send the driver TCP or UDP GSO packets encapsulated in UDP
+If the VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO_CSUM feature has not been negotiated,
+the device MUST NOT send the driver TCP or UDP GSO packets encapsulated in UDP
tunnel and requiring segmentation and outer checksum offload.
If none of the VIRTIO_NET_F_GUEST_TSO4, TSO6, UFO, USO4 or USO6 options have
@@ -984,8 +1064,22 @@ \subsubsection{Processing of Incoming Packets}\label{sec:Device Types / Network
If the VIRTIO_NET_F_GUEST_CSUM feature has been negotiated, the
device MAY set the VIRTIO_NET_HDR_F_DATA_VALID bit in
\field{flags}, if so, the device MUST validate the packet
-checksum (in case of multiple encapsulated protocols, one level
-of checksums is validated).
+checksum. If the VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO_CSUM feature has been negotiated,
+and the VIRTIO_NET_HDR_F_UDP_TUNNEL bit set, both the outer UDP checksum
+and the inner transport checksum have been
+validated, otherwise only one level of checksums (the inner one in case
+of tunnels) has been validated.
+
+If the VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO_CSUM feature has been negotiated,
+the bit VIRTIO_NET_HDR_GSO_UDP_TUNNEL is set in \field{gso_type}
+device MAY set the VIRTIO_NET_HDR_F_UDP_TUNNEL_CSUM bit in
+\field{flags}, if so the device MUST set the packet checksum stored in the
+receive buffer to the outer UDP pseudo header.
+
+Otherwise, if the VIRTIO_NET_HDR_GSO_UDP_TUNNEL is set in
+\field{gso_type} and the bit VIRTIO_NET_HDR_F_UDP_TUNNEL_CSUM is cleared in
+\field{flags}, the device MUST either provide a zero outer UDP header
+checksum or a fully checksummed outer UDP header.
\drivernormative{\paragraph}{Processing of Incoming
Packets}{Device Types / Network Device / Device Operation /
@@ -1771,6 +1865,7 @@ \subsubsection{Control Virtqueue}\label{sec:Device Types / Network Device / Devi
#define VIRTIO_NET_F_GUEST_ECN 9
#define VIRTIO_NET_F_GUEST_UFO 10
#define VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO 47
+#define VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO_CSUM 48
#define VIRTIO_NET_F_GUEST_USO4 54
#define VIRTIO_NET_F_GUEST_USO6 55
--
2.45.2
^ permalink raw reply related [flat|nested] 15+ messages in thread* Re: [PATCH v8 0/2] virtio-net: define UDP tunnel offload
2024-09-04 15:49 [PATCH v8 0/2] virtio-net: define UDP tunnel offload Paolo Abeni
2024-09-04 15:49 ` [PATCH v8 1/2] virtio-net: define UDP tunnel segmentation offload feature Paolo Abeni
2024-09-04 15:49 ` [PATCH v8 2/2] virtio-net: define UDP tunnel checksum " Paolo Abeni
@ 2024-09-09 2:19 ` Willem de Bruijn
2024-09-09 3:16 ` Jason Wang
3 siblings, 0 replies; 15+ messages in thread
From: Willem de Bruijn @ 2024-09-09 2:19 UTC (permalink / raw)
To: Paolo Abeni
Cc: virtio-comment, maxime.coquelin, Eelco Chaudron, Jason Wang,
Stefano Garzarella, kshankar
On Wed, Sep 4, 2024 at 11:50 AM Paolo Abeni <pabeni@redhat.com> wrote:
>
> UDP tunnel usage is ubiquitous in container deployment, and the ability
> to offload UDP encapsulated GSO traffic impacts greatly the performances
> and the CPU utilization of such use cases.
>
> This series introduces separate features to handle both UDP tunnel
> segmentation offload (patch 1) and UDP tunnel outer checksum offload
> (patch 2).
>
> Changes from v7:
> - minor cleanup in patch 1/2
> - dropped confusing wording about csum validation in patch 2/2
>
> Changes from v6:
> - replaced inner_protocol_type with
> VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV{4,6}
> - many clarifications and cleanup, see the individual patches changelog
> for the details
>
> Changes from v5:
> - split in 2 patches
> - dropped outer_mh_offset field
> - many csum related clarification
>
> Paolo Abeni (2):
> virtio-net: define UDP tunnel segmentation offload feature
> virtio-net: define UDP tunnel checksum offload feature
Series looks great to me. No further comments from me. Thanks Paolo.
^ permalink raw reply [flat|nested] 15+ messages in thread* Re: [PATCH v8 0/2] virtio-net: define UDP tunnel offload
2024-09-04 15:49 [PATCH v8 0/2] virtio-net: define UDP tunnel offload Paolo Abeni
` (2 preceding siblings ...)
2024-09-09 2:19 ` [PATCH v8 0/2] virtio-net: define UDP tunnel offload Willem de Bruijn
@ 2024-09-09 3:16 ` Jason Wang
2024-09-09 8:05 ` Paolo Abeni
3 siblings, 1 reply; 15+ messages in thread
From: Jason Wang @ 2024-09-09 3:16 UTC (permalink / raw)
To: Paolo Abeni
Cc: virtio-comment, maxime.coquelin, Eelco Chaudron,
Stefano Garzarella, Willem de Bruijn, kshankar
On Wed, Sep 4, 2024 at 11:50 PM Paolo Abeni <pabeni@redhat.com> wrote:
>
> UDP tunnel usage is ubiquitous in container deployment, and the ability
> to offload UDP encapsulated GSO traffic impacts greatly the performances
> and the CPU utilization of such use cases.
>
> This series introduces separate features to handle both UDP tunnel
> segmentation offload (patch 1) and UDP tunnel outer checksum offload
> (patch 2).
Looks good to me.
And I wonder if there's a link to the prototype code so I can have a
double check.
Thanks
>
> Changes from v7:
> - minor cleanup in patch 1/2
> - dropped confusing wording about csum validation in patch 2/2
>
> Changes from v6:
> - replaced inner_protocol_type with
> VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV{4,6}
> - many clarifications and cleanup, see the individual patches changelog
> for the details
>
> Changes from v5:
> - split in 2 patches
> - dropped outer_mh_offset field
> - many csum related clarification
>
> Paolo Abeni (2):
> virtio-net: define UDP tunnel segmentation offload feature
> virtio-net: define UDP tunnel checksum offload feature
>
> device-types/net/description.tex | 262 +++++++++++++++++++++++++++++--
> 1 file changed, 252 insertions(+), 10 deletions(-)
>
> --
> 2.45.2
>
^ permalink raw reply [flat|nested] 15+ messages in thread* Re: [PATCH v8 0/2] virtio-net: define UDP tunnel offload
2024-09-09 3:16 ` Jason Wang
@ 2024-09-09 8:05 ` Paolo Abeni
2024-09-09 8:39 ` Jason Wang
2024-09-30 7:36 ` Paolo Abeni
0 siblings, 2 replies; 15+ messages in thread
From: Paolo Abeni @ 2024-09-09 8:05 UTC (permalink / raw)
To: Jason Wang
Cc: virtio-comment, maxime.coquelin, Eelco Chaudron,
Stefano Garzarella, Willem de Bruijn, kshankar
On 9/9/24 05:16, Jason Wang wrote:
> On Wed, Sep 4, 2024 at 11:50 PM Paolo Abeni <pabeni@redhat.com> wrote:
>>
>> UDP tunnel usage is ubiquitous in container deployment, and the ability
>> to offload UDP encapsulated GSO traffic impacts greatly the performances
>> and the CPU utilization of such use cases.
>>
>> This series introduces separate features to handle both UDP tunnel
>> segmentation offload (patch 1) and UDP tunnel outer checksum offload
>> (patch 2).
>
> Looks good to me.
>
> And I wonder if there's a link to the prototype code so I can have a
> double check.
I still have to refresh the code from first implementation - that was
the base for the first draft here some time ago.
In the next 2 weeks I'll be traveling and/or AFK, I hope I can share the
code a little after that.
Cheers,
Paolo
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v8 0/2] virtio-net: define UDP tunnel offload
2024-09-09 8:05 ` Paolo Abeni
@ 2024-09-09 8:39 ` Jason Wang
2024-09-30 7:36 ` Paolo Abeni
1 sibling, 0 replies; 15+ messages in thread
From: Jason Wang @ 2024-09-09 8:39 UTC (permalink / raw)
To: Paolo Abeni
Cc: virtio-comment, maxime.coquelin, Eelco Chaudron,
Stefano Garzarella, Willem de Bruijn, kshankar
On Mon, Sep 9, 2024 at 4:05 PM Paolo Abeni <pabeni@redhat.com> wrote:
>
>
>
> On 9/9/24 05:16, Jason Wang wrote:
> > On Wed, Sep 4, 2024 at 11:50 PM Paolo Abeni <pabeni@redhat.com> wrote:
> >>
> >> UDP tunnel usage is ubiquitous in container deployment, and the ability
> >> to offload UDP encapsulated GSO traffic impacts greatly the performances
> >> and the CPU utilization of such use cases.
> >>
> >> This series introduces separate features to handle both UDP tunnel
> >> segmentation offload (patch 1) and UDP tunnel outer checksum offload
> >> (patch 2).
> >
> > Looks good to me.
> >
> > And I wonder if there's a link to the prototype code so I can have a
> > double check.
>
> I still have to refresh the code from first implementation - that was
> the base for the first draft here some time ago.
>
> In the next 2 weeks I'll be traveling and/or AFK, I hope I can share the
> code a little after that.
That's fine.
Thanks
>
> Cheers,
>
> Paolo
>
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v8 0/2] virtio-net: define UDP tunnel offload
2024-09-09 8:05 ` Paolo Abeni
2024-09-09 8:39 ` Jason Wang
@ 2024-09-30 7:36 ` Paolo Abeni
2024-10-01 13:09 ` Willem de Bruijn
1 sibling, 1 reply; 15+ messages in thread
From: Paolo Abeni @ 2024-09-30 7:36 UTC (permalink / raw)
To: Jason Wang
Cc: virtio-comment, maxime.coquelin, Eelco Chaudron,
Stefano Garzarella, Willem de Bruijn, kshankar
Hi,
On 9/9/24 10:05, Paolo Abeni wrote:
> On 9/9/24 05:16, Jason Wang wrote:
>> On Wed, Sep 4, 2024 at 11:50 PM Paolo Abeni <pabeni@redhat.com> wrote:
>>>
>>> UDP tunnel usage is ubiquitous in container deployment, and the ability
>>> to offload UDP encapsulated GSO traffic impacts greatly the performances
>>> and the CPU utilization of such use cases.
>>>
>>> This series introduces separate features to handle both UDP tunnel
>>> segmentation offload (patch 1) and UDP tunnel outer checksum offload
>>> (patch 2).
>>
>> Looks good to me.
>>
>> And I wonder if there's a link to the prototype code so I can have a
>> double check.
>
> I still have to refresh the code from first implementation - that was
> the base for the first draft here some time ago.
>
> In the next 2 weeks I'll be traveling and/or AFK, I hope I can share the
> code a little after that.
A very early PoC is available here:
https://github.com/pabeni/linux-devel/commits/virtio_tunnel_gso/
It's very lightly tested vs a patched user-space:
https://github.com/pabeni/qemu/commits/virtio_tunnel/
"very lightly" is actually a strong understatement; the thing survived a
single iperf-over-udp-tunnel run, and that is. However I hope it should
suffice as showcase.
In the tun/user-space interface I used a separate ioctl to configure the
tunnel offload instead of re-using the existing ioctl for plain GSO,
because AFAICS the tun driver is currently unaware of the complete
virtio net hdr layout (e.g. ignores rx hash and VIRTIO_NET_F_MRG_RXBUF)
and it is just notified of the total hdr struct size.
With UDP tunnel offload enabled, the tun driver need to know the tunnel
fields offset inside the virtio net hdr. Piggybacking such info inside
the the current ioctl(TUNSETOFFLOAD) data (a single long int) is
feasible but looks hackish. Instead I preferred implementing a separate
ioctl(TUNSETTNLOFFLOAD) receiving as input from user-space a struct
containing the mentioned offset and the "csum offload enabled" flag
Any comments/suggestion WRT such interface more than welcome!
The PoC demonstrated that the pending spec changes need some fixlet:
- I wrote the patch on top of the current virtio-spec master branch, I
need to rebase on top of virtio-1.4, I guess.
- I underlooked to the VIRTIO_NET_HDR_F_DATA_VALID flag. The
specification change allows GSO over UDP tunnel offload with DATA_VALID,
but GSO over UDP tunnel needs the inner transport protocol information.
This point looks more difficult and potentially controversial.
With the proposed change, when DATA_VALID is set, the implementation
will have to determine the inner transport header location via parsing,
as currently is doing for plain GSO. Note that the PoC is currently
bugged WRT this scenario.
I'm wondering if we could revisit the spec change requiring that both
the driver and the device, for GSO over UDP tunnel packets only, will
fill the csum_start field when either the NEEDS_CSUM or the DATA_VALID
bits are set. Equivalently, for GSO over UDP tunnel packets only, the
driver and the device, to specify CHECKSUM_UNNECESSARY packets, could
set both the NEEDS_CSUM and DATA_VALID, with the latter flag 'taking
precedence' WRT the actual checksum type.
Note that the above will made the GSO over UDP tunnel path for
DATA_VALID slightly different from the current plain GSO handling, but
the difference should avoid the current rough edges of the plain GSO.
Other options would be resort to the parsing the inner hdr location (as
in the current plain GSO implementation) or not allowing DATA_VALID for
GSO over UDP tunnel. WDYT?
Thanks,
Paolo
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v8 0/2] virtio-net: define UDP tunnel offload
2024-09-30 7:36 ` Paolo Abeni
@ 2024-10-01 13:09 ` Willem de Bruijn
2024-10-01 14:02 ` Paolo Abeni
0 siblings, 1 reply; 15+ messages in thread
From: Willem de Bruijn @ 2024-10-01 13:09 UTC (permalink / raw)
To: Paolo Abeni
Cc: Jason Wang, virtio-comment, maxime.coquelin, Eelco Chaudron,
Stefano Garzarella, kshankar
On Mon, Sep 30, 2024 at 3:36 AM Paolo Abeni <pabeni@redhat.com> wrote:
>
> Hi,
>
> On 9/9/24 10:05, Paolo Abeni wrote:
> > On 9/9/24 05:16, Jason Wang wrote:
> >> On Wed, Sep 4, 2024 at 11:50 PM Paolo Abeni <pabeni@redhat.com> wrote:
> >>>
> >>> UDP tunnel usage is ubiquitous in container deployment, and the ability
> >>> to offload UDP encapsulated GSO traffic impacts greatly the performances
> >>> and the CPU utilization of such use cases.
> >>>
> >>> This series introduces separate features to handle both UDP tunnel
> >>> segmentation offload (patch 1) and UDP tunnel outer checksum offload
> >>> (patch 2).
> >>
> >> Looks good to me.
> >>
> >> And I wonder if there's a link to the prototype code so I can have a
> >> double check.
> >
> > I still have to refresh the code from first implementation - that was
> > the base for the first draft here some time ago.
> >
> > In the next 2 weeks I'll be traveling and/or AFK, I hope I can share the
> > code a little after that.
>
> A very early PoC is available here:
>
> https://github.com/pabeni/linux-devel/commits/virtio_tunnel_gso/
>
> It's very lightly tested vs a patched user-space:
>
> https://github.com/pabeni/qemu/commits/virtio_tunnel/
>
> "very lightly" is actually a strong understatement; the thing survived a
> single iperf-over-udp-tunnel run, and that is. However I hope it should
> suffice as showcase.
>
> In the tun/user-space interface I used a separate ioctl to configure the
> tunnel offload instead of re-using the existing ioctl for plain GSO,
> because AFAICS the tun driver is currently unaware of the complete
> virtio net hdr layout (e.g. ignores rx hash and VIRTIO_NET_F_MRG_RXBUF)
> and it is just notified of the total hdr struct size.
>
> With UDP tunnel offload enabled, the tun driver need to know the tunnel
> fields offset inside the virtio net hdr. Piggybacking such info inside
> the the current ioctl(TUNSETOFFLOAD) data (a single long int) is
> feasible but looks hackish. Instead I preferred implementing a separate
> ioctl(TUNSETTNLOFFLOAD) receiving as input from user-space a struct
> containing the mentioned offset and the "csum offload enabled" flag
>
> Any comments/suggestion WRT such interface more than welcome!
>
> The PoC demonstrated that the pending spec changes need some fixlet:
>
> - I wrote the patch on top of the current virtio-spec master branch, I
> need to rebase on top of virtio-1.4, I guess.
>
> - I underlooked to the VIRTIO_NET_HDR_F_DATA_VALID flag. The
> specification change allows GSO over UDP tunnel offload with DATA_VALID,
> but GSO over UDP tunnel needs the inner transport protocol information.
> This point looks more difficult and potentially controversial.
>
> With the proposed change, when DATA_VALID is set, the implementation
> will have to determine the inner transport header location via parsing,
> as currently is doing for plain GSO. Note that the PoC is currently
> bugged WRT this scenario.
>
> I'm wondering if we could revisit the spec change requiring that both
> the driver and the device, for GSO over UDP tunnel packets only, will
> fill the csum_start field when either the NEEDS_CSUM or the DATA_VALID
> bits are set.
This SGTM.
> Equivalently, for GSO over UDP tunnel packets only, the
> driver and the device, to specify CHECKSUM_UNNECESSARY packets, could
> set both the NEEDS_CSUM and DATA_VALID, with the latter flag 'taking
> precedence' WRT the actual checksum type.
Why do both need to be set to signal CHECKSUM_UNNECESSARY, rather
than only DATA_VALID?
> Note that the above will made the GSO over UDP tunnel path for
> DATA_VALID slightly different from the current plain GSO handling, but
> the difference should avoid the current rough edges of the plain GSO.
>
> Other options would be resort to the parsing the inner hdr location (as
> in the current plain GSO implementation) or not allowing DATA_VALID for
> GSO over UDP tunnel. WDYT?
>
> Thanks,
>
> Paolo
>
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v8 0/2] virtio-net: define UDP tunnel offload
2024-10-01 13:09 ` Willem de Bruijn
@ 2024-10-01 14:02 ` Paolo Abeni
2024-10-01 19:15 ` Willem de Bruijn
0 siblings, 1 reply; 15+ messages in thread
From: Paolo Abeni @ 2024-10-01 14:02 UTC (permalink / raw)
To: Willem de Bruijn
Cc: Jason Wang, virtio-comment, maxime.coquelin, Eelco Chaudron,
Stefano Garzarella, kshankar
On 10/1/24 15:09, Willem de Bruijn wrote:
> On Mon, Sep 30, 2024 at 3:36 AM Paolo Abeni <pabeni@redhat.com> wrote:
>> On 9/9/24 10:05, Paolo Abeni wrote:
>>> On 9/9/24 05:16, Jason Wang wrote:
>>>> On Wed, Sep 4, 2024 at 11:50 PM Paolo Abeni <pabeni@redhat.com> wrote:
>>>>>
>>>>> UDP tunnel usage is ubiquitous in container deployment, and the ability
>>>>> to offload UDP encapsulated GSO traffic impacts greatly the performances
>>>>> and the CPU utilization of such use cases.
>>>>>
>>>>> This series introduces separate features to handle both UDP tunnel
>>>>> segmentation offload (patch 1) and UDP tunnel outer checksum offload
>>>>> (patch 2).
>>>>
>>>> Looks good to me.
>>>>
>>>> And I wonder if there's a link to the prototype code so I can have a
>>>> double check.
>>>
>>> I still have to refresh the code from first implementation - that was
>>> the base for the first draft here some time ago.
>>>
>>> In the next 2 weeks I'll be traveling and/or AFK, I hope I can share the
>>> code a little after that.
>>
>> A very early PoC is available here:
>>
>> https://github.com/pabeni/linux-devel/commits/virtio_tunnel_gso/
>>
>> It's very lightly tested vs a patched user-space:
>>
>> https://github.com/pabeni/qemu/commits/virtio_tunnel/
>>
>> "very lightly" is actually a strong understatement; the thing survived a
>> single iperf-over-udp-tunnel run, and that is. However I hope it should
>> suffice as showcase.
>>
>> In the tun/user-space interface I used a separate ioctl to configure the
>> tunnel offload instead of re-using the existing ioctl for plain GSO,
>> because AFAICS the tun driver is currently unaware of the complete
>> virtio net hdr layout (e.g. ignores rx hash and VIRTIO_NET_F_MRG_RXBUF)
>> and it is just notified of the total hdr struct size.
>>
>> With UDP tunnel offload enabled, the tun driver need to know the tunnel
>> fields offset inside the virtio net hdr. Piggybacking such info inside
>> the the current ioctl(TUNSETOFFLOAD) data (a single long int) is
>> feasible but looks hackish. Instead I preferred implementing a separate
>> ioctl(TUNSETTNLOFFLOAD) receiving as input from user-space a struct
>> containing the mentioned offset and the "csum offload enabled" flag
>>
>> Any comments/suggestion WRT such interface more than welcome!
>>
>> The PoC demonstrated that the pending spec changes need some fixlet:
>>
>> - I wrote the patch on top of the current virtio-spec master branch, I
>> need to rebase on top of virtio-1.4, I guess.
>>
>> - I underlooked to the VIRTIO_NET_HDR_F_DATA_VALID flag. The
>> specification change allows GSO over UDP tunnel offload with DATA_VALID,
>> but GSO over UDP tunnel needs the inner transport protocol information.
>> This point looks more difficult and potentially controversial.
>>
>> With the proposed change, when DATA_VALID is set, the implementation
>> will have to determine the inner transport header location via parsing,
>> as currently is doing for plain GSO. Note that the PoC is currently
>> bugged WRT this scenario.
>>
>> I'm wondering if we could revisit the spec change requiring that both
>> the driver and the device, for GSO over UDP tunnel packets only, will
>> fill the csum_start field when either the NEEDS_CSUM or the DATA_VALID
>> bits are set.
>
> This SGTM.
>
>> Equivalently, for GSO over UDP tunnel packets only, the
>> driver and the device, to specify CHECKSUM_UNNECESSARY packets, could
>> set both the NEEDS_CSUM and DATA_VALID, with the latter flag 'taking
>> precedence' WRT the actual checksum type.
>
> Why do both need to be set to signal CHECKSUM_UNNECESSARY, rather
> than only DATA_VALID?
I think it's mostly a matter of preferences, and 'allowed delta' WRT to
the existing offload.
AFAICS, technically the plain GSO spec don't mandate that NEEDS_CSUM and
DATA_VALID as mutually exclusive. Instead the spec require the
NEEDS_CSUM flag to use csum_* fields: the first option would create a
more significant difference WRT plain GSO.
From an implementation PoV, the 2nd option (both NEEDS_CSUM and
NEEDS_CSUM) would allow for simpler integration/reuse of the existing
code: the UDP tunnel GSO 'to_skb()' helper could directly call the plain
GSO one and eventually later just set CHECKSUM_UNNECESSARY as needed.
Thanks for the feedback,
Paolo
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v8 0/2] virtio-net: define UDP tunnel offload
2024-10-01 14:02 ` Paolo Abeni
@ 2024-10-01 19:15 ` Willem de Bruijn
2024-10-02 8:35 ` Paolo Abeni
0 siblings, 1 reply; 15+ messages in thread
From: Willem de Bruijn @ 2024-10-01 19:15 UTC (permalink / raw)
To: Paolo Abeni
Cc: Jason Wang, virtio-comment, maxime.coquelin, Eelco Chaudron,
Stefano Garzarella, kshankar
On Tue, Oct 1, 2024 at 10:02 AM Paolo Abeni <pabeni@redhat.com> wrote:
>
> On 10/1/24 15:09, Willem de Bruijn wrote:
> > On Mon, Sep 30, 2024 at 3:36 AM Paolo Abeni <pabeni@redhat.com> wrote:
> >> On 9/9/24 10:05, Paolo Abeni wrote:
> >>> On 9/9/24 05:16, Jason Wang wrote:
> >>>> On Wed, Sep 4, 2024 at 11:50 PM Paolo Abeni <pabeni@redhat.com> wrote:
> >>>>>
> >>>>> UDP tunnel usage is ubiquitous in container deployment, and the ability
> >>>>> to offload UDP encapsulated GSO traffic impacts greatly the performances
> >>>>> and the CPU utilization of such use cases.
> >>>>>
> >>>>> This series introduces separate features to handle both UDP tunnel
> >>>>> segmentation offload (patch 1) and UDP tunnel outer checksum offload
> >>>>> (patch 2).
> >>>>
> >>>> Looks good to me.
> >>>>
> >>>> And I wonder if there's a link to the prototype code so I can have a
> >>>> double check.
> >>>
> >>> I still have to refresh the code from first implementation - that was
> >>> the base for the first draft here some time ago.
> >>>
> >>> In the next 2 weeks I'll be traveling and/or AFK, I hope I can share the
> >>> code a little after that.
> >>
> >> A very early PoC is available here:
> >>
> >> https://github.com/pabeni/linux-devel/commits/virtio_tunnel_gso/
> >>
> >> It's very lightly tested vs a patched user-space:
> >>
> >> https://github.com/pabeni/qemu/commits/virtio_tunnel/
> >>
> >> "very lightly" is actually a strong understatement; the thing survived a
> >> single iperf-over-udp-tunnel run, and that is. However I hope it should
> >> suffice as showcase.
> >>
> >> In the tun/user-space interface I used a separate ioctl to configure the
> >> tunnel offload instead of re-using the existing ioctl for plain GSO,
> >> because AFAICS the tun driver is currently unaware of the complete
> >> virtio net hdr layout (e.g. ignores rx hash and VIRTIO_NET_F_MRG_RXBUF)
> >> and it is just notified of the total hdr struct size.
> >>
> >> With UDP tunnel offload enabled, the tun driver need to know the tunnel
> >> fields offset inside the virtio net hdr. Piggybacking such info inside
> >> the the current ioctl(TUNSETOFFLOAD) data (a single long int) is
> >> feasible but looks hackish. Instead I preferred implementing a separate
> >> ioctl(TUNSETTNLOFFLOAD) receiving as input from user-space a struct
> >> containing the mentioned offset and the "csum offload enabled" flag
> >>
> >> Any comments/suggestion WRT such interface more than welcome!
> >>
> >> The PoC demonstrated that the pending spec changes need some fixlet:
> >>
> >> - I wrote the patch on top of the current virtio-spec master branch, I
> >> need to rebase on top of virtio-1.4, I guess.
> >>
> >> - I underlooked to the VIRTIO_NET_HDR_F_DATA_VALID flag. The
> >> specification change allows GSO over UDP tunnel offload with DATA_VALID,
> >> but GSO over UDP tunnel needs the inner transport protocol information.
> >> This point looks more difficult and potentially controversial.
> >>
> >> With the proposed change, when DATA_VALID is set, the implementation
> >> will have to determine the inner transport header location via parsing,
> >> as currently is doing for plain GSO. Note that the PoC is currently
> >> bugged WRT this scenario.
> >>
> >> I'm wondering if we could revisit the spec change requiring that both
> >> the driver and the device, for GSO over UDP tunnel packets only, will
> >> fill the csum_start field when either the NEEDS_CSUM or the DATA_VALID
> >> bits are set.
> >
> > This SGTM.
> >
> >> Equivalently, for GSO over UDP tunnel packets only, the
> >> driver and the device, to specify CHECKSUM_UNNECESSARY packets, could
> >> set both the NEEDS_CSUM and DATA_VALID, with the latter flag 'taking
> >> precedence' WRT the actual checksum type.
> >
> > Why do both need to be set to signal CHECKSUM_UNNECESSARY, rather
> > than only DATA_VALID?
>
> I think it's mostly a matter of preferences, and 'allowed delta' WRT to
> the existing offload.
>
> AFAICS, technically the plain GSO spec don't mandate that NEEDS_CSUM and
> DATA_VALID as mutually exclusive. Instead the spec require the
> NEEDS_CSUM flag to use csum_* fields: the first option would create a
> more significant difference WRT plain GSO.
True. The existing API is very permissive.
As you probably know, I'm strongly in favor of being more strict in
new additions, to avoid allowing weird unexpected packets that we then
have to detect and work around later.
From that PoV, explicitly specifying whether the bits are mutually
exclusive or, if not, what having both sets means, would be great.
Even if only for the new tunnel case, as we cannot do it
retroactively.
> From an implementation PoV, the 2nd option (both NEEDS_CSUM and
> NEEDS_CSUM) would allow for simpler integration/reuse of the existing
> code: the UDP tunnel GSO 'to_skb()' helper could directly call the plain
> GSO one and eventually later just set CHECKSUM_UNNECESSARY as needed.
I don't entirely follow. But that's probably fine at this point.
I did quickly skim your initial implementation code. At high level
looks entirely sensible to me. Thanks for sharing!
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v8 0/2] virtio-net: define UDP tunnel offload
2024-10-01 19:15 ` Willem de Bruijn
@ 2024-10-02 8:35 ` Paolo Abeni
2024-10-02 12:29 ` Willem de Bruijn
0 siblings, 1 reply; 15+ messages in thread
From: Paolo Abeni @ 2024-10-02 8:35 UTC (permalink / raw)
To: Willem de Bruijn
Cc: Jason Wang, virtio-comment, maxime.coquelin, Eelco Chaudron,
Stefano Garzarella, kshankar
On 10/1/24 21:15, Willem de Bruijn wrote:
> On Tue, Oct 1, 2024 at 10:02 AM Paolo Abeni <pabeni@redhat.com> wrote:
>> On 10/1/24 15:09, Willem de Bruijn wrote:
>>> On Mon, Sep 30, 2024 at 3:36 AM Paolo Abeni <pabeni@redhat.com> wrote:
>>>> On 9/9/24 10:05, Paolo Abeni wrote:
>>>> I'm wondering if we could revisit the spec change requiring that both
>>>> the driver and the device, for GSO over UDP tunnel packets only, will
>>>> fill the csum_start field when either the NEEDS_CSUM or the DATA_VALID
>>>> bits are set.
>>>
>>> This SGTM.
>>>
>>>> Equivalently, for GSO over UDP tunnel packets only, the
>>>> driver and the device, to specify CHECKSUM_UNNECESSARY packets, could
>>>> set both the NEEDS_CSUM and DATA_VALID, with the latter flag 'taking
>>>> precedence' WRT the actual checksum type.
>>>
>>> Why do both need to be set to signal CHECKSUM_UNNECESSARY, rather
>>> than only DATA_VALID?
>>
>> I think it's mostly a matter of preferences, and 'allowed delta' WRT to
>> the existing offload.
>>
>> AFAICS, technically the plain GSO spec don't mandate that NEEDS_CSUM and
>> DATA_VALID as mutually exclusive. Instead the spec require the
>> NEEDS_CSUM flag to use csum_* fields: the first option would create a
>> more significant difference WRT plain GSO.
>
> True. The existing API is very permissive.
>
> As you probably know, I'm strongly in favor of being more strict in
> new additions, to avoid allowing weird unexpected packets that we then
> have to detect and work around later.
>
> From that PoV, explicitly specifying whether the bits are mutually
> exclusive or, if not, what having both sets means, would be great.
> Even if only for the new tunnel case, as we cannot do it
> retroactively.
I must admit it's not clear to me if the above is a sort of endorsement
for both options, assuming each of them will be strict about the flags
definition.
If even the 2nd option is ok, I would go with that, as it sounds
preferable to me for the mentioned reasons.
>> From an implementation PoV, the 2nd option (both NEEDS_CSUM and
>> NEEDS_CSUM) would allow for simpler integration/reuse of the existing
>> code: the UDP tunnel GSO 'to_skb()' helper could directly call the plain
>> GSO one and eventually later just set CHECKSUM_UNNECESSARY as needed.
>
> I don't entirely follow. But that's probably fine at this point.
Let me try to make an example.
In order to reuse the existing code/avoid implementing multiple times
access and parsing of the csum_start/csum_offset fields, with opt 1, in
virtio_net_hdr_tnl_to_skb() I would need some hack alike:
// this is an UDP GSO packet, tnl fields already
// parsed/validated
if (!(hdr->flags & VIRTIO_NET_F_GUEST_CSUM)) {
clear_guest_csum_bit = true;
hdr->flags |= VIRTIO_NET_F_GUEST_CSUM;
}
ret = virtio_net_hdr_to_skb(skb, hdr, little_endian);
if (clear_guest_csum_bit)
hdr->flags &= ~VIRTIO_NET_F_GUEST_CSUM;
if (ret)
return ret;
with option 2, the above chunk would be:
// Incidentally this test is already present in the PoC code
// and it's a bug WRT
// v8 virtio-net: define UDP tunnel checksum offload feature
if (!(hdr->flags & VIRTIO_NET_F_GUEST_CSUM))
return -EINVAL;
ret = virtio_net_hdr_to_skb(skb, hdr, little_endian);
if (ret)
return ret;
I think the implementation would be cleaner/simpler, and the validation
stricter.
AFAICS, in both cases, when VIRTIO_NET_F_DATA_VALID is set, the
_tnl_to_skb() helper will have to inner csum field, and eventually the
outer one, if not zero, to the L4 pseudohdr csum. Basically
VIRTIO_NET_F_DATA_VALID with GSO says that csum is validated, but the
pseudo hdr csum is not provided.
Thanks,
Paolo
^ permalink raw reply [flat|nested] 15+ messages in thread* Re: [PATCH v8 0/2] virtio-net: define UDP tunnel offload
2024-10-02 8:35 ` Paolo Abeni
@ 2024-10-02 12:29 ` Willem de Bruijn
2024-10-02 14:29 ` Paolo Abeni
0 siblings, 1 reply; 15+ messages in thread
From: Willem de Bruijn @ 2024-10-02 12:29 UTC (permalink / raw)
To: Paolo Abeni
Cc: Jason Wang, virtio-comment, maxime.coquelin, Eelco Chaudron,
Stefano Garzarella, kshankar
On Wed, Oct 2, 2024 at 4:35 AM Paolo Abeni <pabeni@redhat.com> wrote:
>
> On 10/1/24 21:15, Willem de Bruijn wrote:
> > On Tue, Oct 1, 2024 at 10:02 AM Paolo Abeni <pabeni@redhat.com> wrote:
> >> On 10/1/24 15:09, Willem de Bruijn wrote:
> >>> On Mon, Sep 30, 2024 at 3:36 AM Paolo Abeni <pabeni@redhat.com> wrote:
> >>>> On 9/9/24 10:05, Paolo Abeni wrote:
> >>>> I'm wondering if we could revisit the spec change requiring that both
> >>>> the driver and the device, for GSO over UDP tunnel packets only, will
> >>>> fill the csum_start field when either the NEEDS_CSUM or the DATA_VALID
> >>>> bits are set.
> >>>
> >>> This SGTM.
> >>>
> >>>> Equivalently, for GSO over UDP tunnel packets only, the
> >>>> driver and the device, to specify CHECKSUM_UNNECESSARY packets, could
> >>>> set both the NEEDS_CSUM and DATA_VALID, with the latter flag 'taking
> >>>> precedence' WRT the actual checksum type.
> >>>
> >>> Why do both need to be set to signal CHECKSUM_UNNECESSARY, rather
> >>> than only DATA_VALID?
> >>
> >> I think it's mostly a matter of preferences, and 'allowed delta' WRT to
> >> the existing offload.
> >>
> >> AFAICS, technically the plain GSO spec don't mandate that NEEDS_CSUM and
> >> DATA_VALID as mutually exclusive. Instead the spec require the
> >> NEEDS_CSUM flag to use csum_* fields: the first option would create a
> >> more significant difference WRT plain GSO.
> >
> > True. The existing API is very permissive.
> >
> > As you probably know, I'm strongly in favor of being more strict in
> > new additions, to avoid allowing weird unexpected packets that we then
> > have to detect and work around later.
> >
> > From that PoV, explicitly specifying whether the bits are mutually
> > exclusive or, if not, what having both sets means, would be great.
> > Even if only for the new tunnel case, as we cannot do it
> > retroactively.
>
> I must admit it's not clear to me if the above is a sort of endorsement
> for both options, assuming each of them will be strict about the flags
> definition.
>
> If even the 2nd option is ok, I would go with that, as
Here option 1 is "requiring that both the driver and the device, for
GSO over UDP tunnel packets only, will fill the csum_start field when
either the NEEDS_CSUM or the DATA_VALID bits are set."
And option 2 is "Equivalently, for GSO over UDP tunnel packets only,
the driver and the device, to specify CHECKSUM_UNNECESSARY packets,
could set both the NEEDS_CSUM and DATA_VALID, with the latter flag
'taking precedence' WRT the actual checksum type."
Right?
Option 1 intuitively is simpler to follow: for tunnel offload,
csum_start must always be set.
> it sounds preferable to me for the mentioned reasons.
It's a bit unfortunate for implementation to dictate interface,
especially if it makes it less straightforward. But if the
implementation is significantly simpler with option 2, no strong
objections from me. This is not a huge design point, either way.
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v8 0/2] virtio-net: define UDP tunnel offload
2024-10-02 12:29 ` Willem de Bruijn
@ 2024-10-02 14:29 ` Paolo Abeni
2024-10-02 16:55 ` Willem de Bruijn
0 siblings, 1 reply; 15+ messages in thread
From: Paolo Abeni @ 2024-10-02 14:29 UTC (permalink / raw)
To: Willem de Bruijn, Jason Wang
Cc: virtio-comment, maxime.coquelin, Eelco Chaudron,
Stefano Garzarella, kshankar
On 10/2/24 14:29, Willem de Bruijn wrote:
> On Wed, Oct 2, 2024 at 4:35 AM Paolo Abeni <pabeni@redhat.com> wrote:
>>
>> On 10/1/24 21:15, Willem de Bruijn wrote:
>>> On Tue, Oct 1, 2024 at 10:02 AM Paolo Abeni <pabeni@redhat.com> wrote:
>>>> On 10/1/24 15:09, Willem de Bruijn wrote:
>>>>> On Mon, Sep 30, 2024 at 3:36 AM Paolo Abeni <pabeni@redhat.com> wrote:
>>>>>> On 9/9/24 10:05, Paolo Abeni wrote:
>>>>>> I'm wondering if we could revisit the spec change requiring that both
>>>>>> the driver and the device, for GSO over UDP tunnel packets only, will
>>>>>> fill the csum_start field when either the NEEDS_CSUM or the DATA_VALID
>>>>>> bits are set.
>>>>>
>>>>> This SGTM.
>>>>>
>>>>>> Equivalently, for GSO over UDP tunnel packets only, the
>>>>>> driver and the device, to specify CHECKSUM_UNNECESSARY packets, could
>>>>>> set both the NEEDS_CSUM and DATA_VALID, with the latter flag 'taking
>>>>>> precedence' WRT the actual checksum type.
>>>>>
>>>>> Why do both need to be set to signal CHECKSUM_UNNECESSARY, rather
>>>>> than only DATA_VALID?
>>>>
>>>> I think it's mostly a matter of preferences, and 'allowed delta' WRT to
>>>> the existing offload.
>>>>
>>>> AFAICS, technically the plain GSO spec don't mandate that NEEDS_CSUM and
>>>> DATA_VALID as mutually exclusive. Instead the spec require the
>>>> NEEDS_CSUM flag to use csum_* fields: the first option would create a
>>>> more significant difference WRT plain GSO.
>>>
>>> True. The existing API is very permissive.
>>>
>>> As you probably know, I'm strongly in favor of being more strict in
>>> new additions, to avoid allowing weird unexpected packets that we then
>>> have to detect and work around later.
>>>
>>> From that PoV, explicitly specifying whether the bits are mutually
>>> exclusive or, if not, what having both sets means, would be great.
>>> Even if only for the new tunnel case, as we cannot do it
>>> retroactively.
>>
>> I must admit it's not clear to me if the above is a sort of endorsement
>> for both options, assuming each of them will be strict about the flags
>> definition.
>>
>> If even the 2nd option is ok, I would go with that, as
>
> Here option 1 is "requiring that both the driver and the device, for
> GSO over UDP tunnel packets only, will fill the csum_start field when
> either the NEEDS_CSUM or the DATA_VALID bits are set."
>
> And option 2 is "Equivalently, for GSO over UDP tunnel packets only,
> the driver and the device, to specify CHECKSUM_UNNECESSARY packets,
> could set both the NEEDS_CSUM and DATA_VALID, with the latter flag
> 'taking precedence' WRT the actual checksum type."
>
> Right?
Yes!
> Option 1 intuitively is simpler to follow: for tunnel offload,
> csum_start must always be set.
>
>> it sounds preferable to me for the mentioned reasons.
>
> It's a bit unfortunate for implementation to dictate interface,
> especially if it makes it less straightforward. But if the
> implementation is significantly simpler with option 2, no strong
> objections from me. This is not a huge design point, either way.
Arguably the most relevant sell-point for option 2 is that it allows
enforcing simpler/stricter checks in virtio net hdr parsing (if
UDP_TUNNEL is set, NEEDS_CSUM and plain gso must be set, too).
Somewhat related: I just noted that the 'TX sides' of the spec (both in
device->driver and in the driver -> device direction) is already strict
WRT csum and gso dependency, but the corresponding 'RX sides'
({device,driver}normative/Processing of Incoming Packets) lack
completely any MUST NOT accept/MUST drop statement for bad/invalid inputs.
My understanding is that most/all of the trouble we have on virtio net
hdr parsing and validation steam from such lack. I'm adding explicit
'MUST NOT accept' statements for all the UDP-tunnel-related invalid
flags/gso_type permutation I can think of.
@Jason: do you know if there is a specific reason for the mentioned lack
validation in the current specs?
Cheers,
Paolo
^ permalink raw reply [flat|nested] 15+ messages in thread* Re: [PATCH v8 0/2] virtio-net: define UDP tunnel offload
2024-10-02 14:29 ` Paolo Abeni
@ 2024-10-02 16:55 ` Willem de Bruijn
0 siblings, 0 replies; 15+ messages in thread
From: Willem de Bruijn @ 2024-10-02 16:55 UTC (permalink / raw)
To: Paolo Abeni
Cc: Jason Wang, virtio-comment, maxime.coquelin, Eelco Chaudron,
Stefano Garzarella, kshankar
On Wed, Oct 2, 2024 at 10:29 AM Paolo Abeni <pabeni@redhat.com> wrote:
>
> On 10/2/24 14:29, Willem de Bruijn wrote:
> > On Wed, Oct 2, 2024 at 4:35 AM Paolo Abeni <pabeni@redhat.com> wrote:
> >>
> >> On 10/1/24 21:15, Willem de Bruijn wrote:
> >>> On Tue, Oct 1, 2024 at 10:02 AM Paolo Abeni <pabeni@redhat.com> wrote:
> >>>> On 10/1/24 15:09, Willem de Bruijn wrote:
> >>>>> On Mon, Sep 30, 2024 at 3:36 AM Paolo Abeni <pabeni@redhat.com> wrote:
> >>>>>> On 9/9/24 10:05, Paolo Abeni wrote:
> >>>>>> I'm wondering if we could revisit the spec change requiring that both
> >>>>>> the driver and the device, for GSO over UDP tunnel packets only, will
> >>>>>> fill the csum_start field when either the NEEDS_CSUM or the DATA_VALID
> >>>>>> bits are set.
> >>>>>
> >>>>> This SGTM.
> >>>>>
> >>>>>> Equivalently, for GSO over UDP tunnel packets only, the
> >>>>>> driver and the device, to specify CHECKSUM_UNNECESSARY packets, could
> >>>>>> set both the NEEDS_CSUM and DATA_VALID, with the latter flag 'taking
> >>>>>> precedence' WRT the actual checksum type.
> >>>>>
> >>>>> Why do both need to be set to signal CHECKSUM_UNNECESSARY, rather
> >>>>> than only DATA_VALID?
> >>>>
> >>>> I think it's mostly a matter of preferences, and 'allowed delta' WRT to
> >>>> the existing offload.
> >>>>
> >>>> AFAICS, technically the plain GSO spec don't mandate that NEEDS_CSUM and
> >>>> DATA_VALID as mutually exclusive. Instead the spec require the
> >>>> NEEDS_CSUM flag to use csum_* fields: the first option would create a
> >>>> more significant difference WRT plain GSO.
> >>>
> >>> True. The existing API is very permissive.
> >>>
> >>> As you probably know, I'm strongly in favor of being more strict in
> >>> new additions, to avoid allowing weird unexpected packets that we then
> >>> have to detect and work around later.
> >>>
> >>> From that PoV, explicitly specifying whether the bits are mutually
> >>> exclusive or, if not, what having both sets means, would be great.
> >>> Even if only for the new tunnel case, as we cannot do it
> >>> retroactively.
> >>
> >> I must admit it's not clear to me if the above is a sort of endorsement
> >> for both options, assuming each of them will be strict about the flags
> >> definition.
> >>
> >> If even the 2nd option is ok, I would go with that, as
> >
> > Here option 1 is "requiring that both the driver and the device, for
> > GSO over UDP tunnel packets only, will fill the csum_start field when
> > either the NEEDS_CSUM or the DATA_VALID bits are set."
> >
> > And option 2 is "Equivalently, for GSO over UDP tunnel packets only,
> > the driver and the device, to specify CHECKSUM_UNNECESSARY packets,
> > could set both the NEEDS_CSUM and DATA_VALID, with the latter flag
> > 'taking precedence' WRT the actual checksum type."
> >
> > Right?
>
> Yes!
>
> > Option 1 intuitively is simpler to follow: for tunnel offload,
> > csum_start must always be set.
> >
> >> it sounds preferable to me for the mentioned reasons.
> >
> > It's a bit unfortunate for implementation to dictate interface,
> > especially if it makes it less straightforward. But if the
> > implementation is significantly simpler with option 2, no strong
> > objections from me. This is not a huge design point, either way.
>
> Arguably the most relevant sell-point for option 2 is that it allows
> enforcing simpler/stricter checks in virtio net hdr parsing (if
> UDP_TUNNEL is set, NEEDS_CSUM and plain gso must be set, too).
GSO implies checksum offload, so that part is fine.
As long as it does not extend to UDP tunnels for non-GSO packets.
> Somewhat related: I just noted that the 'TX sides' of the spec (both in
> device->driver and in the driver -> device direction) is already strict
> WRT csum and gso dependency, but the corresponding 'RX sides'
> ({device,driver}normative/Processing of Incoming Packets) lack
> completely any MUST NOT accept/MUST drop statement for bad/invalid inputs.
>
> My understanding is that most/all of the trouble we have on virtio net
> hdr parsing and validation steam from such lack. I'm adding explicit
> 'MUST NOT accept' statements for all the UDP-tunnel-related invalid
> flags/gso_type permutation I can think of.
Thanks!
> @Jason: do you know if there is a specific reason for the mentioned lack
> validation in the current specs?
^ permalink raw reply [flat|nested] 15+ messages in thread