public inbox for virtio-comment@lists.linux.dev
 help / color / mirror / Atom feed
* [PATCH v6 0/2] virtio-net: define UDP tunnel offload
@ 2024-07-29 11:30 Paolo Abeni
  2024-07-29 11:30 ` [PATCH v6 1/2] virtio-net: define UDP tunnel segmentation offload feature Paolo Abeni
  2024-07-29 11:30 ` [PATCH v6 2/2] virtio-net: define UDP tunnel checksum " Paolo Abeni
  0 siblings, 2 replies; 27+ messages in thread
From: Paolo Abeni @ 2024-07-29 11:30 UTC (permalink / raw)
  To: virtio-comment
  Cc: maxime.coquelin, Eelco Chaudron, Jason Wang, Stefano Garzarella,
	Willem de Bruijn

UDP tunnel usage is ubiquitous in container deployment, and the ability
to offload UDP encapsulated GSO traffic impacts greatly the performances
and the CPU utilization of such use cases.

This series introduces separate features to handle both UDP tunnel 
segmentation offload (patch 1) and UDP tunnel outer checksum offload
(patch 2)

Changes from v5:
- split in 2 patches
- dropped outer_mh_offset field
- many csum related clarification

Paolo Abeni (2):
  virtio-net: define UDP tunnel segmentation offload feature
  virtio-net: define UDP tunnel checksum offload feature

 device-types/net/description.tex | 227 +++++++++++++++++++++++++++++--
 1 file changed, 218 insertions(+), 9 deletions(-)

-- 
2.45.2


^ permalink raw reply	[flat|nested] 27+ messages in thread

* [PATCH v6 1/2] virtio-net: define UDP tunnel segmentation offload feature
  2024-07-29 11:30 [PATCH v6 0/2] virtio-net: define UDP tunnel offload Paolo Abeni
@ 2024-07-29 11:30 ` Paolo Abeni
  2024-07-29 21:04   ` Willem de Bruijn
  2024-07-31  1:57   ` Jason Wang
  2024-07-29 11:30 ` [PATCH v6 2/2] virtio-net: define UDP tunnel checksum " Paolo Abeni
  1 sibling, 2 replies; 27+ messages in thread
From: Paolo Abeni @ 2024-07-29 11:30 UTC (permalink / raw)
  To: virtio-comment
  Cc: maxime.coquelin, Eelco Chaudron, Jason Wang, Stefano Garzarella,
	Willem de Bruijn

The VIRTIO_NET_HDR_GSO_UDP_TUNNEL is a gso_type flag allowing GSO over
UDP tunnel. It can be negotiated on both the host and guest sides.

One constraint addressed here is that the virtio side (either device or
driver) receiving a UDP tunneled GSO packet must be able to reconstruct
completely the inner and outer headers offset - to allow for later GSO.

To accommodate such need, new fields are introduced in the virtio_net
header: outer_th_offset, inner_protocol_type and inner_nh_offset.
They map directly to the corresponding header information. The inner
transport header is implied by the (inner) checksum offload.

Those fields are required because a virtio device H/W implementation
performing segmentation for UDP tunneled packet will need to touch
the outer transport protocol (for the UDP length filed), the
inner network protocol (for the total length field, in the IPv4
case).

Note that segmentation will additionally need to touch
the outer network protocol and the inner transport protocol. The first
is implied/easily found with trivial parsing, the latter is identified
by the existing csum_start field.

Note that there is no concept of UDP tunnel type negotiation (e.g.
vxlan, geneve, vxlan-gpe, etc.), as a virtio device H/W implementation
can perform segmentation for every possible UDP-tunnel given the
specified new fields.
In the reverse direction, if a virtio device H/W implementation receives
some traffic for an unknown or unsupported UDP tunnel, it will simply
not aggregate the wire packet in GSO one.

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
---
v5 -> v6:
 - split in 2 patches
 - dropped the unneeded outer_mh_offset field
https://lore.kernel.org/virtio-comment/56ccba136272593d0d2af0303600c06f26bd84a1.1718621522.git.pabeni@redhat.com/

v4 -> v5:
 - add separate negotiation for UDP_TUNNEL_CSUM
 - much more verbose specification of outer csum handling
 - UDP_TUNNEL_CSUM requires GSO_UDP_TUNNEL
 https://lore.kernel.org/virtio-comment/cb89d3d6b6a7e4a7f13e60dd3fded5e950727890.1716448766.git.pabeni@redhat.com/

v3 -> v4:
 - new virtio_net_hdr fields depends on F_HOST_UDP_TUNNEL_GSO or
   F_GUEST_UDP_TUNNEL_GSO (Stefano)
 - Extended the changelog to answer Jason's questions.
 - Clarified outer csum handling (Jason)
 https://lore.kernel.org/virtio-dev/f0bf2db203f8061a3ecb50b7a48cf26d8f3c7f68.1716193086.git.pabeni@redhat.com

v2 -> v3:
 - UDP_TUNNEL -> UDP_TUNNEL_GSO
 - add explicit fields for the inner meta-data
 - more verbose changelog
 https://lists.oasis-open.org/archives/virtio-dev/202206/msg00026.html

v1 -> v2:
 - explicitly state that the outer header probing is mandatory
 - explicitly state that GSO_UDP is not allowed with GSO_UDP_TUNNEL
 - clarify hdr_len usage
 - clarify UDP_TUNNEL_CSUM bit usage
 - fix a few typos
 https://lists.oasis-open.org/archives/virtio-dev/202205/msg00037.html
---
 device-types/net/description.tex | 118 +++++++++++++++++++++++++++++--
 1 file changed, 112 insertions(+), 6 deletions(-)

diff --git a/device-types/net/description.tex b/device-types/net/description.tex
index 76585b0..eda294d 100644
--- a/device-types/net/description.tex
+++ b/device-types/net/description.tex
@@ -88,6 +88,12 @@ \subsection{Feature bits}\label{sec:Device Types / Network Device / Feature bits
 \item[VIRTIO_NET_F_CTRL_MAC_ADDR(23)] Set MAC address through control
     channel.
 
+\item[VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO (47)] Driver can receive GSO packets
+  carried by a UDP tunnel.
+
+\item[VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO (49)] Device can receive GSO packets
+  carried by a UDP tunnel.
+
 \item[VIRTIO_NET_F_HASH_TUNNEL(51)] Device supports inner header hash for encapsulated packets.
 
 \item[VIRTIO_NET_F_VQ_NOTF_COAL(52)] Device supports virtqueue notification coalescing.
@@ -133,12 +139,16 @@ \subsubsection{Feature bit requirements}\label{sec:Device Types / Network Device
 \item[VIRTIO_NET_F_GUEST_UFO] Requires VIRTIO_NET_F_GUEST_CSUM.
 \item[VIRTIO_NET_F_GUEST_USO4] Requires VIRTIO_NET_F_GUEST_CSUM.
 \item[VIRTIO_NET_F_GUEST_USO6] Requires VIRTIO_NET_F_GUEST_CSUM.
+\item[VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO] Requires both VIRTIO_NET_F_GUEST_TSO4 and VIRTIO_NET_F_GUEST_TSO6, or
+   both VIRTIO_NET_F_GUEST_USO4 and VIRTIO_NET_F_GUEST_USO6.
 
 \item[VIRTIO_NET_F_HOST_TSO4] Requires VIRTIO_NET_F_CSUM.
 \item[VIRTIO_NET_F_HOST_TSO6] Requires VIRTIO_NET_F_CSUM.
 \item[VIRTIO_NET_F_HOST_ECN] Requires VIRTIO_NET_F_HOST_TSO4 or VIRTIO_NET_F_HOST_TSO6.
 \item[VIRTIO_NET_F_HOST_UFO] Requires VIRTIO_NET_F_CSUM.
 \item[VIRTIO_NET_F_HOST_USO] Requires VIRTIO_NET_F_CSUM.
+\item[VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO] Requires both VIRTIO_NET_F_HOST_TSO4 and VIRTIO_NET_F_HOST_TSO6, or
+   both VIRTIO_NET_F_GUEST_USO4 or VIRTIO_NET_F_GUEST_USO6.
 
 \item[VIRTIO_NET_F_CTRL_RX] Requires VIRTIO_NET_F_CTRL_VQ.
 \item[VIRTIO_NET_F_CTRL_VLAN] Requires VIRTIO_NET_F_CTRL_VQ.
@@ -375,6 +385,12 @@ \subsection{Device Initialization}\label{sec:Device Types / Network Device / Dev
   TCP), VIRTIO_NET_F_HOST_TSO6 (IPv6 TCP), VIRTIO_NET_F_HOST_UFO
   (UDP fragmentation) and VIRTIO_NET_F_HOST_USO (UDP segmentation) features.
 
+\item If any TCP or UDP segmentation feature is negotiated, a driver can
+  use TCP segmentation or UDP segmentation on top of UDP encapsulation
+  offload, when the outer header does not require checksumming - e.g.
+  the outer UDP checksum is zero - by negotiating the
+  VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO feature.
+
 \item The converse features are also available: a driver can save
   the virtual device some work by negotiating these features.\note{For example, a network packet transported between two guests on
 the same system might not need checksumming at all, nor segmentation,
@@ -382,8 +398,9 @@ \subsection{Device Initialization}\label{sec:Device Types / Network Device / Dev
    The VIRTIO_NET_F_GUEST_CSUM feature indicates that partially
   checksummed packets can be received, and if it can do that then
   the VIRTIO_NET_F_GUEST_TSO4, VIRTIO_NET_F_GUEST_TSO6,
-  VIRTIO_NET_F_GUEST_UFO, VIRTIO_NET_F_GUEST_ECN, VIRTIO_NET_F_GUEST_USO4
-  and VIRTIO_NET_F_GUEST_USO6 are the input equivalents of the features described above.
+  VIRTIO_NET_F_GUEST_UFO, VIRTIO_NET_F_GUEST_ECN, VIRTIO_NET_F_GUEST_USO4,
+  VIRTIO_NET_F_GUEST_USO6 and VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO
+  are the input equivalents of the features described above.
   See \ref{sec:Device Types / Network Device / Device Operation /
 Setting Up Receive Buffers}~\nameref{sec:Device Types / Network
 Device / Device Operation / Setting Up Receive Buffers} and
@@ -413,6 +430,7 @@ \subsection{Device Operation}\label{sec:Device Types / Network Device / Device O
 #define VIRTIO_NET_HDR_GSO_UDP         3
 #define VIRTIO_NET_HDR_GSO_TCPV6       4
 #define VIRTIO_NET_HDR_GSO_UDP_L4      5
+#define VIRTIO_NET_HDR_GSO_UDP_TUNNEL 0x40
 #define VIRTIO_NET_HDR_GSO_ECN      0x80
         u8 gso_type;
         le16 hdr_len;
@@ -423,6 +441,9 @@ \subsection{Device Operation}\label{sec:Device Types / Network Device / Device O
         le32 hash_value;        (Only if VIRTIO_NET_F_HASH_REPORT negotiated)
         le16 hash_report;       (Only if VIRTIO_NET_F_HASH_REPORT negotiated)
         le16 padding_reserved;  (Only if VIRTIO_NET_F_HASH_REPORT negotiated)
+        le16 outer_th_offset    (Only if VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO negotiated)
+        le16 inner_protocol_type;(Only if VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO negotiated)
+        le16 inner_nh_offset;   (Only if VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO negotiated)
 };
 \end{lstlisting}
 
@@ -480,6 +501,8 @@ \subsubsection{Packet Transmission}\label{sec:Device Types / Network Device / De
 followed by the TCP header (with the TCP checksum field 16 bytes
 into that header). \field{csum_start} will be 14+20 = 34 (the TCP
 checksum includes the header), and \field{csum_offset} will be 16.
+If the given packet has the VIRTIO_NET_HDR_GSO_UDP_TUNNEL bit set,
+the above checksum fields refer to the inner header checksum.
 \end{note}
 
 \item If the driver negotiated
@@ -516,6 +539,26 @@ \subsubsection{Packet Transmission}\label{sec:Device Types / Network Device / De
 specifically in the protocol.}.
    \end{itemize}
 
+\item If the driver negotiated the VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO feature,
+  the VIRTIO_NET_HDR_GSO_UDP_TUNNEL bit in \field{gso_type} indicates that
+  the GSO protocol is encapsulated in a UDP tunnel.
+  The outer UDP header must not require checksumming, e.g. the outer UDP checksum
+  must be zero.
+  The other tunnel-related fields indicate how to replicate the packet
+  headers to cut it into smaller packets:
+
+  \begin{itemize}
+  \item \field{outer_th_offset} field indicates the outer transport header within
+      the packet. Note that such field differs from \field{csum_start} as the latter
+      points to the inner transport header within the packet.
+
+  \item \field{inner_protocol_type} field indicates the ethernet type of the inner
+      protocol.
+
+  \item \field{inner_nh_offset} field indicates the inner network header within
+      the packet.
+  \end{itemize}
+
 \item \field{num_buffers} is set to zero.  This field is unused on transmitted packets.
 
 \item The header and packet are added as one output descriptor to the
@@ -557,6 +600,18 @@ \subsubsection{Packet Transmission}\label{sec:Device Types / Network Device / De
 driver MUST set the VIRTIO_NET_HDR_GSO_ECN bit in
 \field{gso_type}.
 
+The driver MUST NOT send to the device TCP or UDP GSO packets over UDP tunnel
+requiring segmentation offload, unless the VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO is
+negotiated, in which case the driver MUST set the VIRTIO_NET_HDR_GSO_UDP_TUNNEL
+bit in \field{gso_type}.
+
+The driver MUST NOT send to the device TCP or UDP GSO packets over UDP tunnel
+requiring segmentation and outer UDP checksum offload.
+
+The driver MUST NOT set the VIRTIO_NET_HDR_GSO_UDP_TUNNEL together with
+VIRTIO_NET_HDR_GSO_UDP, as the latter is deprecated in favor of UDP_L4
+and no new features will support it.
+
 If the VIRTIO_NET_F_CSUM feature has been negotiated, the
 driver MAY set the VIRTIO_NET_HDR_F_NEEDS_CSUM bit in
 \field{flags}, if so:
@@ -598,6 +653,28 @@ \subsubsection{Packet Transmission}\label{sec:Device Types / Network Device / De
 	header.
 \end{itemize}
 
+If the VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO option has been negotiated, the
+driver MAY set the VIRTIO_NET_HDR_GSO_UDP_TUNNEL bit in \field{gso_type}, if so:
+\begin{itemize}
+\item the driver MUST set \field{outer_th_offset} to the outer UDP header
+  offset, \field{inner_protocol_type} to the ethernet type of the inner
+  protocol, and \field{inner_nh_offset} to the inner network header offset.
+\end{itemize}
+
+If the bit VIRTIO_NET_HDR_GSO_UDP_TUNNEL in \field{gso_type} is set, the outer
+UDP header MUST NOT require checksumming. That is the, the outer UDP checksum
+value must be 0, or the valid complete checksum for such header.
+
+\begin{note}
+The valid complete checksum of the outer UDP header of individual segments
+can be computed by the driver prior to segmentation only if the GSO packet
+size is a multiple of \field{gso_size}.
+\end{note}
+
+If the VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO option has not
+been negotiated, the driver MUST NOT set the VIRTIO_NET_HDR_F_UDP_TUNNEL bit
+in \field{gso_type}.
+
 The driver SHOULD accept the VIRTIO_NET_F_GUEST_HDRLEN feature if it has
 been offered, and if it's able to provide the exact header length.
 
@@ -633,6 +710,10 @@ \subsubsection{Packet Transmission}\label{sec:Device Types / Network Device / De
 	\end{note}
 \end{itemize}
 
+If VIRTIO_NET_HDR_GSO_UDP_TUNNEL bit in \field{gso_type} is not set, the
+device MUST NOT use the \field{outer_th_offset}, \field{inner_protocol_type}
+ and \field{inner_nh_offset}.
+
 If VIRTIO_NET_HDR_F_NEEDS_CSUM is not set, the device MUST NOT
 rely on the packet checksum being correct.
 \paragraph{Packet Transmission Interrupt}\label{sec:Device Types / Network Device / Device Operation / Packet Transmission / Packet Transmission Interrupt}
@@ -727,8 +808,8 @@ \subsubsection{Processing of Incoming Packets}\label{sec:Device Types / Network
   has been validated.
 \end{enumerate}
 
-Additionally, VIRTIO_NET_F_GUEST_CSUM, TSO4, TSO6, UDP and ECN
-features enable receive checksum, large receive offload and ECN
+Additionally, VIRTIO_NET_F_GUEST_CSUM, TSO4, TSO6, UDP, UDP_TUNNEL
+and ECN features enable receive checksum, large receive offload and ECN
 support which are the input equivalents of the transmit checksum,
 transmit segmentation offloading and ECN features, as described
 in \ref{sec:Device Types / Network Device / Device Operation /
@@ -750,8 +831,14 @@ \subsubsection{Processing of Incoming Packets}\label{sec:Device Types / Network
   from \field{csum_start} and any preceding checksums
   have been validated.  The checksum on the packet is incomplete and
   if bit VIRTIO_NET_HDR_F_RSC_INFO is not set in \field{flags},
-  then \field{csum_start} and \field{csum_offset} indicate how to calculate it
-  (see Packet Transmission point 1).
+  then \field{csum_start} and \field{csum_offset} indicate how to calculate it.
+  If the VIRTIO_NET_HDR_GSO_UDP_TUNNEL bit is set, the \field{csum_start} field
+  refers to the inner transport header offset (see Packet Transmission point 1).
+\item If the VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO option was negotiated and
+  \field{gso_type} is not VIRTIO_NET_HDR_GSO_NONE, the VIRTIO_NET_HDR_GSO_UDP_TUNNEL
+  bit MAY be set. In such case the \field{outer_th_offset}, \field{inner_protocol_type},
+  and \field{inner_nh_offset} fields indicate the corresponding
+  headers information.
 
 \end{enumerate}
 
@@ -796,6 +883,9 @@ \subsubsection{Processing of Incoming Packets}\label{sec:Device Types / Network
 If none of VIRTIO_NET_F_GUEST_USO4 or VIRTIO_NET_F_GUEST_USO6 have been negotiated,
 the device MUST NOT set \field{gso_type} to VIRTIO_NET_HDR_GSO_UDP_L4.
 
+If VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO is not negotiated the device MUST NOT set
+the VIRTIO_NET_HDR_GSO_UDP_TUNNEL bit in \field{gso_type}.
+
 The device SHOULD NOT send to the driver TCP packets requiring segmentation offload
 which have the Explicit Congestion Notification bit set, unless the
 VIRTIO_NET_F_GUEST_ECN feature is negotiated, in which case the
@@ -819,6 +909,17 @@ \subsubsection{Processing of Incoming Packets}\label{sec:Device Types / Network
 	fully checksummed packet;
 \end{enumerate}
 
+The device MUST NOT send to the driver TCP or UDP GSO packets encapsulated in UDP
+tunnel and requiring segmentation offload, unless the
+VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO is negotiated, in which case the device MUST set
+the VIRTIO_NET_HDR_GSO_UDP_TUNNEL bit in \field{gso_type} and MUST set the
+\field{outer_th_offset}, \field{inner_protocol_type} and
+\field{inner_nh_offset} fields to the corresponding header information, and the outer
+UDP header MUST NOT require checksum offload.
+
+The device MUST NOT send the driver TCP or UDP GSO packets encapsulated in UDP
+tunnel and requiring segmentation and outer checksum offload.
+
 If none of the VIRTIO_NET_F_GUEST_TSO4, TSO6, UFO, USO4 or USO6 options have
 been negotiated, the device MUST set \field{gso_type} to
 VIRTIO_NET_HDR_GSO_NONE.
@@ -867,6 +968,10 @@ \subsubsection{Processing of Incoming Packets}\label{sec:Device Types / Network
 VIRTIO_NET_HDR_F_DATA_VALID is set, the driver MUST NOT
 rely on the packet checksum being correct.
 
+If the VIRTIO_NET_HDR_GSO_UDP_TUNNEL_GSO bit in \field{gso_type} is not set,
+the driver MUST NOT use the \field{outer_th_offset}, \field{inner_protocol_type},
+\field{inner_mac_offset} and \field{inner_nh_offset}.
+
 \paragraph{Hash calculation for incoming packets}
 \label{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Hash calculation for incoming packets}
 
@@ -1624,6 +1729,7 @@ \subsubsection{Control Virtqueue}\label{sec:Device Types / Network Device / Devi
 #define VIRTIO_NET_F_GUEST_TSO6       8
 #define VIRTIO_NET_F_GUEST_ECN        9
 #define VIRTIO_NET_F_GUEST_UFO        10
+#define VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO  47
 #define VIRTIO_NET_F_GUEST_USO4       54
 #define VIRTIO_NET_F_GUEST_USO6       55
 
-- 
2.45.2


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v6 2/2] virtio-net: define UDP tunnel checksum offload feature
  2024-07-29 11:30 [PATCH v6 0/2] virtio-net: define UDP tunnel offload Paolo Abeni
  2024-07-29 11:30 ` [PATCH v6 1/2] virtio-net: define UDP tunnel segmentation offload feature Paolo Abeni
@ 2024-07-29 11:30 ` Paolo Abeni
  2024-07-29 21:35   ` Willem de Bruijn
  1 sibling, 1 reply; 27+ messages in thread
From: Paolo Abeni @ 2024-07-29 11:30 UTC (permalink / raw)
  To: virtio-comment
  Cc: maxime.coquelin, Eelco Chaudron, Jason Wang, Stefano Garzarella,
	Willem de Bruijn

This complements the previous change, additionally
introducing the UDP tunnel checksum offload feature.

Differently from the plain checksum offload feature, this
depends on UDP tunnel segmentation being available, as outer checksum
computation for non GSO packets is cheap and H/W implementation of
such feature is complex.

UDP tunnel checksum offload does not introduce additional fields,
instead leverages the outer transport offset introduced by the
UDP tunnel segmentation feature to locate the outer checksum
inside the packet.

When  UDP tunnel checksum offload is enabled, the VIRTIO_NET_HDR_F_DATA_VALID
bit semantic is extended, in the device -> driver direction, to
cover both the inner and the outer checksum validation.

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
---
 device-types/net/description.tex | 123 ++++++++++++++++++++++++++++---
 1 file changed, 113 insertions(+), 10 deletions(-)

diff --git a/device-types/net/description.tex b/device-types/net/description.tex
index eda294d..1225b50 100644
--- a/device-types/net/description.tex
+++ b/device-types/net/description.tex
@@ -91,9 +91,15 @@ \subsection{Feature bits}\label{sec:Device Types / Network Device / Feature bits
 \item[VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO (47)] Driver can receive GSO packets
   carried by a UDP tunnel.
 
+\item[VIRTIO_NET_F_GUEST_UDP_TUNNEL_CSUM (48)] Driver handles packets
+  carried by a UDP tunnel with partial csum for the outer header.
+
 \item[VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO (49)] Device can receive GSO packets
   carried by a UDP tunnel.
 
+\item[VIRTIO_NET_F_HOST_UDP_TUNNEL_CSUM (50)] Device handles packets
+  carried by a UDP tunnel with partial csum for the outer header.
+
 \item[VIRTIO_NET_F_HASH_TUNNEL(51)] Device supports inner header hash for encapsulated packets.
 
 \item[VIRTIO_NET_F_VQ_NOTF_COAL(52)] Device supports virtqueue notification coalescing.
@@ -141,6 +147,7 @@ \subsubsection{Feature bit requirements}\label{sec:Device Types / Network Device
 \item[VIRTIO_NET_F_GUEST_USO6] Requires VIRTIO_NET_F_GUEST_CSUM.
 \item[VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO] Requires both VIRTIO_NET_F_GUEST_TSO4 and VIRTIO_NET_F_GUEST_TSO6, or
    both VIRTIO_NET_F_GUEST_USO4 and VIRTIO_NET_F_GUEST_USO6.
+\item[VIRTIO_NET_F_GUEST_UDP_TUNNEL_CSUM] Requires VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO
 
 \item[VIRTIO_NET_F_HOST_TSO4] Requires VIRTIO_NET_F_CSUM.
 \item[VIRTIO_NET_F_HOST_TSO6] Requires VIRTIO_NET_F_CSUM.
@@ -149,6 +156,7 @@ \subsubsection{Feature bit requirements}\label{sec:Device Types / Network Device
 \item[VIRTIO_NET_F_HOST_USO] Requires VIRTIO_NET_F_CSUM.
 \item[VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO] Requires both VIRTIO_NET_F_HOST_TSO4 and VIRTIO_NET_F_HOST_TSO6, or
    both VIRTIO_NET_F_GUEST_USO4 or VIRTIO_NET_F_GUEST_USO6.
++\item[VIRTIO_NET_F_HOST_UDP_TUNNEL_CSUM] Requires VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO
 
 \item[VIRTIO_NET_F_CTRL_RX] Requires VIRTIO_NET_F_CTRL_VQ.
 \item[VIRTIO_NET_F_CTRL_VLAN] Requires VIRTIO_NET_F_CTRL_VQ.
@@ -162,6 +170,13 @@ \subsubsection{Feature bit requirements}\label{sec:Device Types / Network Device
 \item[VIRTIO_NET_F_HASH_TUNNEL] Requires VIRTIO_NET_F_CTRL_VQ along with VIRTIO_NET_F_RSS or VIRTIO_NET_F_HASH_REPORT.
 \end{description}
 
+\begin{note}
+The dependency between UDP_TUNNEL_CSUM and UDP_TUNNEL_GSO is intentionally
+in the opposite direction with respect to the plain GSO features and the plain
+checksum offload because UDP tunnel checksum offload gives very little gain
+for non GSO packets is quite complex to implement in H/W.
+\end{note}
+
 \subsubsection{Legacy Interface: Feature bits}\label{sec:Device Types / Network Device / Feature bits / Legacy Interface: Feature bits}
 \begin{description}
 \item[VIRTIO_NET_F_GSO (6)] Device handles packets with any GSO type. This was supposed to indicate segmentation offload support, but
@@ -391,6 +406,11 @@ \subsection{Device Initialization}\label{sec:Device Types / Network Device / Dev
   the outer UDP checksum is zero - by negotiating the
   VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO feature.
 
+\item If VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO is negotiated, a driver can
+  additionally use TCP segmentation or UDP segmentation on top of UDP
+  encapsulation with the outer header requiring checksum offload,
+  negotiating the VIRTIO_NET_F_HOST_UDP_TUNNEL_CSUM feature.
+
 \item The converse features are also available: a driver can save
   the virtual device some work by negotiating these features.\note{For example, a network packet transported between two guests on
 the same system might not need checksumming at all, nor segmentation,
@@ -399,8 +419,9 @@ \subsection{Device Initialization}\label{sec:Device Types / Network Device / Dev
   checksummed packets can be received, and if it can do that then
   the VIRTIO_NET_F_GUEST_TSO4, VIRTIO_NET_F_GUEST_TSO6,
   VIRTIO_NET_F_GUEST_UFO, VIRTIO_NET_F_GUEST_ECN, VIRTIO_NET_F_GUEST_USO4,
-  VIRTIO_NET_F_GUEST_USO6 and VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO
-  are the input equivalents of the features described above.
+  VIRTIO_NET_F_GUEST_USO6 VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO and
+  VIRTIO_NET_F_GUEST_UDP_TUNNEL_CSUM are the input equivalents of
+  the features described above.
   See \ref{sec:Device Types / Network Device / Device Operation /
 Setting Up Receive Buffers}~\nameref{sec:Device Types / Network
 Device / Device Operation / Setting Up Receive Buffers} and
@@ -424,6 +445,7 @@ \subsection{Device Operation}\label{sec:Device Types / Network Device / Device O
 #define VIRTIO_NET_HDR_F_NEEDS_CSUM    1
 #define VIRTIO_NET_HDR_F_DATA_VALID    2
 #define VIRTIO_NET_HDR_F_RSC_INFO      4
+#define VIRTIO_NET_HDR_F_UDP_TUNNEL_CSUM 8
         u8 flags;
 #define VIRTIO_NET_HDR_GSO_NONE        0
 #define VIRTIO_NET_HDR_GSO_TCPV4       1
@@ -490,7 +512,7 @@ \subsubsection{Packet Transmission}\label{sec:Device Types / Network Device / De
     new (16 bit ones' complement) checksum is placed by the device.
 
   \item The TCP checksum field in the packet is set to the sum
-    of the TCP pseudo header, so that replacing it by the ones'
+    of the TCP pseudo header, so that replacing it with the ones'
     complement checksum of the TCP header and body will give the
     correct result.
   \end{itemize}
@@ -542,8 +564,11 @@ \subsubsection{Packet Transmission}\label{sec:Device Types / Network Device / De
 \item If the driver negotiated the VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO feature,
   the VIRTIO_NET_HDR_GSO_UDP_TUNNEL bit in \field{gso_type} indicates that
   the GSO protocol is encapsulated in a UDP tunnel.
-  The outer UDP header must not require checksumming, e.g. the outer UDP checksum
-  must be zero.
+  If the outer UDP header requires checksumming, the driver must have
+  additionally negotiated the VIRTIO_NET_F_HOST_UDP_TUNNEL_CSUM feature
+  and offloaded the outer checksum accordingly, otherwise
+  the outer UDP header must not require checksumming, e.g. the outer
+  UDP checksum must be zero.
   The other tunnel-related fields indicate how to replicate the packet
   headers to cut it into smaller packets:
 
@@ -559,6 +584,20 @@ \subsubsection{Packet Transmission}\label{sec:Device Types / Network Device / De
       the packet.
   \end{itemize}
 
+\item If the driver negotiated the VIRTIO_NET_F_HOST_UDP_TUNNEL_CSUM feature,
+  the transmitted packet is a GSO one encapsulated in a UDP tunnel, and
+  the outer UDP header requires checksumming, the driver can skip checksumming
+  the outer header:
+
+  \begin{itemize}
+  \item \field{flags} has the VIRTIO_NET_HDR_F_UDP_TUNNEL_CSUM set,
+
+  \item The outer UDP checksum field in the packet is set to the sum
+    of the UDP pseudo header, so that replacing it by the ones'
+    complement checksum of the outer UDP header and payload will give the
+    correct result.
+  \end{itemize}
+
 \item \field{num_buffers} is set to zero.  This field is unused on transmitted packets.
 
 \item The header and packet are added as one output descriptor to the
@@ -606,12 +645,23 @@ \subsubsection{Packet Transmission}\label{sec:Device Types / Network Device / De
 bit in \field{gso_type}.
 
 The driver MUST NOT send to the device TCP or UDP GSO packets over UDP tunnel
+requiring segmentation and outer UDP checksum offload, unless both the
+VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO and VIRTIO_NET_F_HOST_UDP_TUNNEL_CSUM features
+are negotiated, in which case the driver MUST set the VIRTIO_NET_HDR_GSO_UDP_TUNNEL
+bit in the \field{gso_type} and the VIRTIO_NET_HDR_F_UDP_TUNNEL_CSUM bit in the \field{flags}.
+
+If VIRTIO_NET_F_HOST_UDP_TUNNEL_CSUM is not negotiated, the driver MUST not set
+the VIRTIO_NET_HDR_F_UDP_TUNNEL_CSUM bit in the \field{flags} and
+MUST NOT send to the device TCP or UDP GSO packets over UDP tunnel
 requiring segmentation and outer UDP checksum offload.
 
 The driver MUST NOT set the VIRTIO_NET_HDR_GSO_UDP_TUNNEL together with
 VIRTIO_NET_HDR_GSO_UDP, as the latter is deprecated in favor of UDP_L4
 and no new features will support it.
 
+The driver MUST NOT set the VIRTIO_NET_HDR_F_UDP_TUNNEL_CSUM bit \field{flags}
+without setting the VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO bit in \field{gso_type}.
+
 If the VIRTIO_NET_F_CSUM feature has been negotiated, the
 driver MAY set the VIRTIO_NET_HDR_F_NEEDS_CSUM bit in
 \field{flags}, if so:
@@ -661,8 +711,27 @@ \subsubsection{Packet Transmission}\label{sec:Device Types / Network Device / De
   protocol, and \field{inner_nh_offset} to the inner network header offset.
 \end{itemize}
 
-If the bit VIRTIO_NET_HDR_GSO_UDP_TUNNEL in \field{gso_type} is set, the outer
-UDP header MUST NOT require checksumming. That is the, the outer UDP checksum
+If the VIRTIO_NET_F_HOST_UDP_TUNNEL_CSUM feature has been negotiated, and the
+bit VIRTIO_NET_HDR_GSO_UDP_TUNNEL in \field{gso_type} is set, the
+driver MAY set the VIRTIO_NET_HDR_F_UDP_TUNNEL_CSUM bit in
+\field{flags}, if so:
+\begin{enumerate}
+\item the driver MUST validate the packet checksum at
+	offset 6 from \field{outer_th_offset} as well as all
+	preceding offsets;
+\item the driver MUST set the packet outer UDP header checksum
+      to the outer UDP pseudo header;
+\end{enumerate}
+
+\begin{note}
+calculating a ones' complement checksum from \field{outer_th_offset}
+up until the end of the packet and storing the result at offset 6
+from \field{outer_th_offset} will result in a fully checksummed outer UDP packet;
+\end{note}
+
+If the bit VIRTIO_NET_HDR_GSO_UDP_TUNNEL in \field{gso_type} is set
+and the VIRTIO_NET_F_HOST_UDP_TUNNEL_CSUM feature has not been negotiated, the outer
+UDP header MUST NOT require checksumming. That is, the outer UDP checksum
 value must be 0, or the valid complete checksum for such header.
 
 \begin{note}
@@ -675,6 +744,10 @@ \subsubsection{Packet Transmission}\label{sec:Device Types / Network Device / De
 been negotiated, the driver MUST NOT set the VIRTIO_NET_HDR_F_UDP_TUNNEL bit
 in \field{gso_type}.
 
+If the VIRTIO_NET_F_HOST_UDP_TUNNEL_CSUM option has not been negotiated,
+the driver MUST NOT set the VIRTIO_NET_HDR_F_UDP_TUNNEL_CSUM bit
+in \field{flags}.
+
 The driver SHOULD accept the VIRTIO_NET_F_GUEST_HDRLEN feature if it has
 been offered, and if it's able to provide the exact header length.
 
@@ -839,6 +912,12 @@ \subsubsection{Processing of Incoming Packets}\label{sec:Device Types / Network
   bit MAY be set. In such case the \field{outer_th_offset}, \field{inner_protocol_type},
   and \field{inner_nh_offset} fields indicate the corresponding
   headers information.
+\item If the VIRTIO_NET_F_GUEST_UDP_TUNNEL_CSUM feature was negotiated, and
+  the VIRTIO_NET_HDR_GSO_UDP_TUNNEL bit is set in \field{gso_type}, the
+  VIRTIO_NET_HDR_F_UDP_TUNNEL_CSUM bit in the \field{flags} can be set: if so,
+  the outer UDP header checksum at offset 6 from \field{outer_th_offset} have
+  been validated. The checksum on the packet is incomplete and the
+  \field{outer_th_offset} indicates how to calculate it.
 
 \end{enumerate}
 
@@ -886,6 +965,9 @@ \subsubsection{Processing of Incoming Packets}\label{sec:Device Types / Network
 If VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO is not negotiated the device MUST NOT set
 the VIRTIO_NET_HDR_GSO_UDP_TUNNEL bit in \field{gso_type}.
 
+If VIRTIO_NET_F_GUEST_UDP_TUNNEL_CSUM is not negotiated the device MUST NOT set
+the VIRTIO_NET_HDR_F_UDP_TUNNEL_CSUM bit in \field{flags}.
+
 The device SHOULD NOT send to the driver TCP packets requiring segmentation offload
 which have the Explicit Congestion Notification bit set, unless the
 VIRTIO_NET_F_GUEST_ECN feature is negotiated, in which case the
@@ -917,7 +999,8 @@ \subsubsection{Processing of Incoming Packets}\label{sec:Device Types / Network
 \field{inner_nh_offset} fields to the corresponding header information, and the outer
 UDP header MUST NOT require checksum offload.
 
-The device MUST NOT send the driver TCP or UDP GSO packets encapsulated in UDP
+If the VIRTIO_NET_F_GUEST_UDP_TUNNEL_CSUM feature has not been negotiated,
+the device MUST NOT send the driver TCP or UDP GSO packets encapsulated in UDP
 tunnel and requiring segmentation and outer checksum offload.
 
 If none of the VIRTIO_NET_F_GUEST_TSO4, TSO6, UFO, USO4 or USO6 options have
@@ -943,8 +1026,27 @@ \subsubsection{Processing of Incoming Packets}\label{sec:Device Types / Network
 If the VIRTIO_NET_F_GUEST_CSUM feature has been negotiated, the
 device MAY set the VIRTIO_NET_HDR_F_DATA_VALID bit in
 \field{flags}, if so, the device MUST validate the packet
-checksum (in case of multiple encapsulated protocols, one level
-of checksums is validated).
+checksum. If the VIRTIO_NET_F_GUEST_UDP_TUNNEL_CSUM feature has been negotiated,
+and the VIRTIO_NET_HDR_F_UDP_TUNNEL bit set, both the outer UDP checksum
+and the inner transport checksum have been
+validated, otherwise only one level of checksums (the inner one in case
+of tunnels) has been validated.
+
+If the VIRTIO_NET_F_GUEST_UDP_TUNNEL_CSUM feature has been negotiated,
+the bit VIRTIO_NET_HDR_GSO_UDP_TUNNEL is set in \field{gso_type}
+device MAY set the VIRTIO_NET_HDR_F_UDP_TUNNEL_CSUM bit in
+\field{flags}, if so:
+\begin{enumerate}
+\item the device MUST validate the packet checksum at
+	offset 6 from \field{outer_th_offset};
+\item the device MUST set the packet checksum stored in the
+	receive buffer to the outer UDP pseudo header;
+\end{enumerate}
+
+Otherwise, if the VIRTIO_NET_HDR_GSO_UDP_TUNNEL is set in
+\field{gso_type} and the bit VIRTIO_NET_HDR_F_UDP_TUNNEL_CSUM is cleared in
+\field{flags}, the device MUST either provide a zero outer UDP header
+checksum or a fully checksummed outer UDP header.
 
 \drivernormative{\paragraph}{Processing of Incoming
 Packets}{Device Types / Network Device / Device Operation /
@@ -1730,6 +1832,7 @@ \subsubsection{Control Virtqueue}\label{sec:Device Types / Network Device / Devi
 #define VIRTIO_NET_F_GUEST_ECN        9
 #define VIRTIO_NET_F_GUEST_UFO        10
 #define VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO  47
+#define VIRTIO_NET_F_GUEST_UDP_TUNNEL_CSUM 48
 #define VIRTIO_NET_F_GUEST_USO4       54
 #define VIRTIO_NET_F_GUEST_USO6       55
 
-- 
2.45.2


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* Re: [PATCH v6 1/2] virtio-net: define UDP tunnel segmentation offload feature
  2024-07-29 11:30 ` [PATCH v6 1/2] virtio-net: define UDP tunnel segmentation offload feature Paolo Abeni
@ 2024-07-29 21:04   ` Willem de Bruijn
  2024-07-29 21:36     ` Willem de Bruijn
  2024-07-30  7:33     ` Paolo Abeni
  2024-07-31  1:57   ` Jason Wang
  1 sibling, 2 replies; 27+ messages in thread
From: Willem de Bruijn @ 2024-07-29 21:04 UTC (permalink / raw)
  To: Paolo Abeni
  Cc: virtio-comment, maxime.coquelin, Eelco Chaudron, Jason Wang,
	Stefano Garzarella

On Mon, Jul 29, 2024 at 7:30 AM Paolo Abeni <pabeni@redhat.com> wrote:
>
> The VIRTIO_NET_HDR_GSO_UDP_TUNNEL is a gso_type flag allowing GSO over
> UDP tunnel. It can be negotiated on both the host and guest sides.
>
> One constraint addressed here is that the virtio side (either device or
> driver) receiving a UDP tunneled GSO packet must be able to reconstruct
> completely the inner and outer headers offset - to allow for later GSO.
>
> To accommodate such need, new fields are introduced in the virtio_net
> header: outer_th_offset, inner_protocol_type and inner_nh_offset.
> They map directly to the corresponding header information. The inner
> transport header is implied by the (inner) checksum offload.
>
> Those fields are required because a virtio device H/W implementation
> performing segmentation for UDP tunneled packet will need to touch
> the outer transport protocol (for the UDP length filed), the
> inner network protocol (for the total length field, in the IPv4
> case).
>
> Note that segmentation will additionally need to touch
> the outer network protocol and the inner transport protocol. The first
> is implied/easily found with trivial parsing, the latter is identified
> by the existing csum_start field.
>
> Note that there is no concept of UDP tunnel type negotiation (e.g.
> vxlan, geneve, vxlan-gpe, etc.), as a virtio device H/W implementation
> can perform segmentation for every possible UDP-tunnel given the
> specified new fields.
> In the reverse direction, if a virtio device H/W implementation receives
> some traffic for an unknown or unsupported UDP tunnel, it will simply
> not aggregate the wire packet in GSO one.
>
> Signed-off-by: Paolo Abeni <pabeni@redhat.com>

Overall this looks good to me. Thanks for splitting the patches.

You have a working software implementation, right? Will this be sent
(as RFC) to the public netdev or so mailing list before this becomes a
standard? I assume that there is a standard process for virtio that
you are following. I'd just be more confident in a thumbs up if we
also review the code that implements the text.

> @@ -423,6 +441,9 @@ \subsection{Device Operation}\label{sec:Device Types / Network Device / Device O
>          le32 hash_value;        (Only if VIRTIO_NET_F_HASH_REPORT negotiated)
>          le16 hash_report;       (Only if VIRTIO_NET_F_HASH_REPORT negotiated)
>          le16 padding_reserved;  (Only if VIRTIO_NET_F_HASH_REPORT negotiated)
> +        le16 outer_th_offset    (Only if VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO negotiated)
> +        le16 inner_protocol_type;(Only if VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO negotiated)
> +        le16 inner_nh_offset;   (Only if VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO negotiated)
>  };

In relation to the above: I recall that expanding the virtio_net_hdr
was non-trivial when I last tried to add fields ("[rfc,2/3]
virtio-net: support receive timestamp"). If that is the experience
again, it might be worthwhile to expand with enough space to also
accommodate a few subsequent extensions. Headroom is not that
expensive.

> +\item If the driver negotiated the VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO feature,
> +  the VIRTIO_NET_HDR_GSO_UDP_TUNNEL bit in \field{gso_type} indicates that
> +  the GSO protocol is encapsulated in a UDP tunnel.
> +  The outer UDP header must not require checksumming, e.g. the outer UDP checksum
> +  must be zero.

nit: "e.g." should be "i.e.," here

and maybe must be a positive zero (0x0) checksum as defined in UDP RFC 768.

> +If the bit VIRTIO_NET_HDR_GSO_UDP_TUNNEL in \field{gso_type} is set, the outer
> +UDP header MUST NOT require checksumming. That is the, the outer UDP checksum
> +value must be 0, or the valid complete checksum for such header.

MUST NOT require checksum validation. [..] or the validated complete checksum

(nit-picking at this point, all are suggestions, feel free to ignore).

> +
> +\begin{note}
> +The valid complete checksum of the outer UDP header of individual segments
> +can be computed by the driver prior to segmentation only if the GSO packet
> +size is a multiple of \field{gso_size}.
> +\end{note}

Might be helpful to explain why. E.g., ", because then all segments
have the same size and thus all data included in the outer UDP
checksum is the same for every segment. These pre-computed segment
length and checksum fields are different from those of the GSO
packet."

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v6 2/2] virtio-net: define UDP tunnel checksum offload feature
  2024-07-29 11:30 ` [PATCH v6 2/2] virtio-net: define UDP tunnel checksum " Paolo Abeni
@ 2024-07-29 21:35   ` Willem de Bruijn
  2024-07-30  7:52     ` Paolo Abeni
  0 siblings, 1 reply; 27+ messages in thread
From: Willem de Bruijn @ 2024-07-29 21:35 UTC (permalink / raw)
  To: Paolo Abeni
  Cc: virtio-comment, maxime.coquelin, Eelco Chaudron, Jason Wang,
	Stefano Garzarella

On Mon, Jul 29, 2024 at 7:30 AM Paolo Abeni <pabeni@redhat.com> wrote:
>
> This complements the previous change, additionally
> introducing the UDP tunnel checksum offload feature.
>
> Differently from the plain checksum offload feature, this
> depends on UDP tunnel segmentation being available, as outer checksum
> computation for non GSO packets is cheap and H/W implementation of
> such feature is complex.
>
> UDP tunnel checksum offload does not introduce additional fields,
> instead leverages the outer transport offset introduced by the
> UDP tunnel segmentation feature to locate the outer checksum
> inside the packet.
>
> When  UDP tunnel checksum offload is enabled, the VIRTIO_NET_HDR_F_DATA_VALID
> bit semantic is extended, in the device -> driver direction, to
> cover both the inner and the outer checksum validation.

> +\item[VIRTIO_NET_F_GUEST_UDP_TUNNEL_CSUM (48)] Driver handles packets
> +  carried by a UDP tunnel with partial csum for the outer header.
> +
>
>  \item[VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO (49)] Device can receive GSO packets
>    carried by a UDP tunnel.
>
> +\item[VIRTIO_NET_F_HOST_UDP_TUNNEL_CSUM (50)] Device handles packets
> +  carried by a UDP tunnel with partial csum for the outer header.
> +
>  \item[VIRTIO_NET_F_HASH_TUNNEL(51)] Device supports inner header hash for encapsulated packets.

> +\begin{note}
> +The dependency between UDP_TUNNEL_CSUM and UDP_TUNNEL_GSO is intentionally
> +in the opposite direction with respect to the plain GSO features and the plain
> +checksum offload because UDP tunnel checksum offload gives very little gain
> +for non GSO packets is quite complex to implement in H/W.
> +\end{note}

Plan was to call this UDP_TUNNEL_GSO_CSUM, to make this implicit
dependency more explicit? It really only is checksum offload for the
outer checksum that this feature covers.

>  \item The converse features are also available: a driver can save
>    the virtual device some work by negotiating these features.\note{For example, a network packet transported between two guests on
>  the same system might not need checksumming at all, nor segmentation,
> @@ -399,8 +419,9 @@ \subsection{Device Initialization}\label{sec:Device Types / Network Device / Dev
>    checksummed packets can be received, and if it can do that then
>    the VIRTIO_NET_F_GUEST_TSO4, VIRTIO_NET_F_GUEST_TSO6,
>    VIRTIO_NET_F_GUEST_UFO, VIRTIO_NET_F_GUEST_ECN, VIRTIO_NET_F_GUEST_USO4,
> -  VIRTIO_NET_F_GUEST_USO6 and VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO
> -  are the input equivalents of the features described above.
> +  VIRTIO_NET_F_GUEST_USO6 VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO and
> +  VIRTIO_NET_F_GUEST_UDP_TUNNEL_CSUM are the input equivalents of
> +  the features described above.

Minor, not specific to this feature and more to the spec maintainers:

Does it make sense to put each constant on its own line? To avoid
having to edit existing lines when expanding. This is source text, so
that does not show in the generated output anyway:

and if it can do that then the
VIRTIO_NET_F_GUEST_TSO4,
VIRTIO_NET_F_GUEST_TSO6,
 VIRTIO_NET_F_GUEST_UFO,
VIRTIO_NET_F_GUEST_ECN,
VIRTIO_NET_F_GUEST_USO4,
 VIRTIO_NET_F_GUEST_USO6,
+ VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO,
are the input equivalents of the features described above.

The trailing comma is admittedly a bit ugly. If fixing that up, the
diff is still cleaner with one constant per line:

- VIRTIO_NET_F_GUEST_USO6
+ VIRTIO_NET_F_GUEST_USO6,
+ VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO

> @@ -490,7 +512,7 @@ \subsubsection{Packet Transmission}\label{sec:Device Types / Network Device / De
>      new (16 bit ones' complement) checksum is placed by the device.
>
>    \item The TCP checksum field in the packet is set to the sum
> -    of the TCP pseudo header, so that replacing it by the ones'
> +    of the TCP pseudo header, so that replacing it with the ones'

Unintentional change?

> +  \begin{itemize}
> +  \item \field{flags} has the VIRTIO_NET_HDR_F_UDP_TUNNEL_CSUM set,
> +
> +  \item The outer UDP checksum field in the packet is set to the sum
> +    of the UDP pseudo header, so that replacing it by the ones'
> +    complement checksum of the outer UDP header and payload will give the
> +    correct result.
> +  \end{itemize}

"set to the sum of the UDP pseudo header [of each segment]"?

as it will not be the UDP length of the gso packet.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v6 1/2] virtio-net: define UDP tunnel segmentation offload feature
  2024-07-29 21:04   ` Willem de Bruijn
@ 2024-07-29 21:36     ` Willem de Bruijn
  2024-07-30  7:38       ` Paolo Abeni
  2024-07-30  7:33     ` Paolo Abeni
  1 sibling, 1 reply; 27+ messages in thread
From: Willem de Bruijn @ 2024-07-29 21:36 UTC (permalink / raw)
  To: Paolo Abeni
  Cc: virtio-comment, maxime.coquelin, Eelco Chaudron, Jason Wang,
	Stefano Garzarella

On Mon, Jul 29, 2024 at 5:04 PM Willem de Bruijn <willemb@google.com> wrote:
>
> On Mon, Jul 29, 2024 at 7:30 AM Paolo Abeni <pabeni@redhat.com> wrote:
> >
> > The VIRTIO_NET_HDR_GSO_UDP_TUNNEL is a gso_type flag allowing GSO over
> > UDP tunnel. It can be negotiated on both the host and guest sides.
> >
> > One constraint addressed here is that the virtio side (either device or
> > driver) receiving a UDP tunneled GSO packet must be able to reconstruct
> > completely the inner and outer headers offset - to allow for later GSO.
> >
> > To accommodate such need, new fields are introduced in the virtio_net
> > header: outer_th_offset, inner_protocol_type and inner_nh_offset.
> > They map directly to the corresponding header information. The inner
> > transport header is implied by the (inner) checksum offload.
> >
> > Those fields are required because a virtio device H/W implementation
> > performing segmentation for UDP tunneled packet will need to touch
> > the outer transport protocol (for the UDP length filed), the
> > inner network protocol (for the total length field, in the IPv4
> > case).
> >
> > Note that segmentation will additionally need to touch
> > the outer network protocol and the inner transport protocol. The first
> > is implied/easily found with trivial parsing, the latter is identified
> > by the existing csum_start field.
> >
> > Note that there is no concept of UDP tunnel type negotiation (e.g.
> > vxlan, geneve, vxlan-gpe, etc.), as a virtio device H/W implementation
> > can perform segmentation for every possible UDP-tunnel given the
> > specified new fields.
> > In the reverse direction, if a virtio device H/W implementation receives
> > some traffic for an unknown or unsupported UDP tunnel, it will simply
> > not aggregate the wire packet in GSO one.
> >
> > Signed-off-by: Paolo Abeni <pabeni@redhat.com>

> > +
> > +\begin{note}
> > +The valid complete checksum of the outer UDP header of individual segments
> > +can be computed by the driver prior to segmentation only if the GSO packet
> > +size is a multiple of \field{gso_size}.
> > +\end{note}
>
> Might be helpful to explain why. E.g., ", because then all segments
> have the same size and thus all data included in the outer UDP
> checksum is the same for every segment. These pre-computed segment
> length and checksum fields are different from those of the GSO
> packet."

Does this part belong in patch 2/2?

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v6 1/2] virtio-net: define UDP tunnel segmentation offload feature
  2024-07-29 21:04   ` Willem de Bruijn
  2024-07-29 21:36     ` Willem de Bruijn
@ 2024-07-30  7:33     ` Paolo Abeni
  2024-07-31  4:10       ` Jason Wang
  1 sibling, 1 reply; 27+ messages in thread
From: Paolo Abeni @ 2024-07-30  7:33 UTC (permalink / raw)
  To: Willem de Bruijn
  Cc: virtio-comment, maxime.coquelin, Eelco Chaudron, Jason Wang,
	Stefano Garzarella

On 7/29/24 23:04, Willem de Bruijn wrote:
> You have a working software implementation, right? Will this be sent
> (as RFC) to the public netdev or so mailing list before this becomes a
> standard? I assume that there is a standard process for virtio that
> you are following. I'd just be more confident in a thumbs up if we
> also review the code that implements the text.

My plan is to write the S/W virtio_net implementation, too, but it's 
currently not ready. I had a working PoC for the first draft of this 
changeset, long time ago. Things are changed enough that I'll need a 
rewrite. I can try to shared such a thing on netdev ASAP (where "soon" 
here is a very relative terms ;)

Also I don't know the standardization process. I hope for some guidance 
from Jason and/or Michael.

> 
>> @@ -423,6 +441,9 @@ \subsection{Device Operation}\label{sec:Device Types / Network Device / Device O
>>           le32 hash_value;        (Only if VIRTIO_NET_F_HASH_REPORT negotiated)
>>           le16 hash_report;       (Only if VIRTIO_NET_F_HASH_REPORT negotiated)
>>           le16 padding_reserved;  (Only if VIRTIO_NET_F_HASH_REPORT negotiated)
>> +        le16 outer_th_offset    (Only if VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO negotiated)
>> +        le16 inner_protocol_type;(Only if VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO negotiated)
>> +        le16 inner_nh_offset;   (Only if VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO negotiated)
>>   };
> 
> In relation to the above: I recall that expanding the virtio_net_hdr
> was non-trivial when I last tried to add fields ("[rfc,2/3]
> virtio-net: support receive timestamp"). If that is the experience
> again, it might be worthwhile to expand with enough space to also
> accommodate a few subsequent extensions. Headroom is not that
> expensive.

Do you mean, by adding 'padding' or 'unused' fields?

Side note: the above is not 32 bit aligned, do we need such aligment?
If we would have VIRTIO_NET_HDR_GSO_UDPv4_L4 and 
VIRTIO_NET_HDR_GSO_UDPv6_L4 variants, we could avoid entirely 
'inner_protocol_type', as the network protocol will be implied by the 
GSO type itself.

>> +\item If the driver negotiated the VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO feature,
>> +  the VIRTIO_NET_HDR_GSO_UDP_TUNNEL bit in \field{gso_type} indicates that
>> +  the GSO protocol is encapsulated in a UDP tunnel.
>> +  The outer UDP header must not require checksumming, e.g. the outer UDP checksum
>> +  must be zero.
> 
> nit: "e.g." should be "i.e.," here
> 
> and maybe must be a positive zero (0x0) checksum as defined in UDP RFC 768.
> 
>> +If the bit VIRTIO_NET_HDR_GSO_UDP_TUNNEL in \field{gso_type} is set, the outer
>> +UDP header MUST NOT require checksumming. That is the, the outer UDP checksum
>> +value must be 0, or the valid complete checksum for such header.
> 
> MUST NOT require checksum validation. [..] or the validated complete checksum

Ack to all the above.

Thanks!

Paolo


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v6 1/2] virtio-net: define UDP tunnel segmentation offload feature
  2024-07-29 21:36     ` Willem de Bruijn
@ 2024-07-30  7:38       ` Paolo Abeni
  0 siblings, 0 replies; 27+ messages in thread
From: Paolo Abeni @ 2024-07-30  7:38 UTC (permalink / raw)
  To: Willem de Bruijn
  Cc: virtio-comment, maxime.coquelin, Eelco Chaudron, Jason Wang,
	Stefano Garzarella



On 7/29/24 23:36, Willem de Bruijn wrote:
> On Mon, Jul 29, 2024 at 5:04 PM Willem de Bruijn <willemb@google.com> wrote:
>>
>> On Mon, Jul 29, 2024 at 7:30 AM Paolo Abeni <pabeni@redhat.com> wrote:
>>>
>>> The VIRTIO_NET_HDR_GSO_UDP_TUNNEL is a gso_type flag allowing GSO over
>>> UDP tunnel. It can be negotiated on both the host and guest sides.
>>>
>>> One constraint addressed here is that the virtio side (either device or
>>> driver) receiving a UDP tunneled GSO packet must be able to reconstruct
>>> completely the inner and outer headers offset - to allow for later GSO.
>>>
>>> To accommodate such need, new fields are introduced in the virtio_net
>>> header: outer_th_offset, inner_protocol_type and inner_nh_offset.
>>> They map directly to the corresponding header information. The inner
>>> transport header is implied by the (inner) checksum offload.
>>>
>>> Those fields are required because a virtio device H/W implementation
>>> performing segmentation for UDP tunneled packet will need to touch
>>> the outer transport protocol (for the UDP length filed), the
>>> inner network protocol (for the total length field, in the IPv4
>>> case).
>>>
>>> Note that segmentation will additionally need to touch
>>> the outer network protocol and the inner transport protocol. The first
>>> is implied/easily found with trivial parsing, the latter is identified
>>> by the existing csum_start field.
>>>
>>> Note that there is no concept of UDP tunnel type negotiation (e.g.
>>> vxlan, geneve, vxlan-gpe, etc.), as a virtio device H/W implementation
>>> can perform segmentation for every possible UDP-tunnel given the
>>> specified new fields.
>>> In the reverse direction, if a virtio device H/W implementation receives
>>> some traffic for an unknown or unsupported UDP tunnel, it will simply
>>> not aggregate the wire packet in GSO one.
>>>
>>> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
> 
>>> +
>>> +\begin{note}
>>> +The valid complete checksum of the outer UDP header of individual segments
>>> +can be computed by the driver prior to segmentation only if the GSO packet
>>> +size is a multiple of \field{gso_size}.
>>> +\end{note}
>>
>> Might be helpful to explain why. E.g., ", because then all segments
>> have the same size and thus all data included in the outer UDP
>> checksum is the same for every segment. These pre-computed segment
>> length and checksum fields are different from those of the GSO
>> packet."
> 
> Does this part belong in patch 2/2?

Possibly is a bit border-line.
IMHO it could belong to this patch, too. The idea is to specify one of 
the scenarios what will not require checksum offload.

Thanks,

Paolo


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v6 2/2] virtio-net: define UDP tunnel checksum offload feature
  2024-07-29 21:35   ` Willem de Bruijn
@ 2024-07-30  7:52     ` Paolo Abeni
  0 siblings, 0 replies; 27+ messages in thread
From: Paolo Abeni @ 2024-07-30  7:52 UTC (permalink / raw)
  To: Willem de Bruijn
  Cc: virtio-comment, maxime.coquelin, Eelco Chaudron, Jason Wang,
	Stefano Garzarella

On 7/29/24 23:35, Willem de Bruijn wrote:
> On Mon, Jul 29, 2024 at 7:30 AM Paolo Abeni <pabeni@redhat.com> wrote:
>> +\begin{note}
>> +The dependency between UDP_TUNNEL_CSUM and UDP_TUNNEL_GSO is intentionally
>> +in the opposite direction with respect to the plain GSO features and the plain
>> +checksum offload because UDP tunnel checksum offload gives very little gain
>> +for non GSO packets is quite complex to implement in H/W.
>> +\end{note}
> 
> Plan was to call this UDP_TUNNEL_GSO_CSUM, to make this implicit
> dependency more explicit? It really only is checksum offload for the
> outer checksum that this feature covers.

Indeed. I lost that change in the split. I'll fix in the next revision.

>>   \item The converse features are also available: a driver can save
>>     the virtual device some work by negotiating these features.\note{For example, a network packet transported between two guests on
>>   the same system might not need checksumming at all, nor segmentation,
>> @@ -399,8 +419,9 @@ \subsection{Device Initialization}\label{sec:Device Types / Network Device / Dev
>>     checksummed packets can be received, and if it can do that then
>>     the VIRTIO_NET_F_GUEST_TSO4, VIRTIO_NET_F_GUEST_TSO6,
>>     VIRTIO_NET_F_GUEST_UFO, VIRTIO_NET_F_GUEST_ECN, VIRTIO_NET_F_GUEST_USO4,
>> -  VIRTIO_NET_F_GUEST_USO6 and VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO
>> -  are the input equivalents of the features described above.
>> +  VIRTIO_NET_F_GUEST_USO6 VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO and
>> +  VIRTIO_NET_F_GUEST_UDP_TUNNEL_CSUM are the input equivalents of
>> +  the features described above.
> 
> Minor, not specific to this feature and more to the spec maintainers:
> 
> Does it make sense to put each constant on its own line? To avoid
> having to edit existing lines when expanding. This is source text, so
> that does not show in the generated output anyway:
> 
> and if it can do that then the
> VIRTIO_NET_F_GUEST_TSO4,
> VIRTIO_NET_F_GUEST_TSO6,
>   VIRTIO_NET_F_GUEST_UFO,
> VIRTIO_NET_F_GUEST_ECN,
> VIRTIO_NET_F_GUEST_USO4,
>   VIRTIO_NET_F_GUEST_USO6,
> + VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO,
> are the input equivalents of the features described above.
> 
> The trailing comma is admittedly a bit ugly. If fixing that up, the
> diff is still cleaner with one constant per line:
> 
> - VIRTIO_NET_F_GUEST_USO6
> + VIRTIO_NET_F_GUEST_USO6,
> + VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO
> 
>> @@ -490,7 +512,7 @@ \subsubsection{Packet Transmission}\label{sec:Device Types / Network Device / De
>>       new (16 bit ones' complement) checksum is placed by the device.
>>
>>     \item The TCP checksum field in the packet is set to the sum
>> -    of the TCP pseudo header, so that replacing it by the ones'
>> +    of the TCP pseudo header, so that replacing it with the ones'
> 
> Unintentional change?

Yep, I piped the patch in a grammar checker and while reflecting the 
suggestions on the patch itself I unintentionally touched the context, too.

>> +  \begin{itemize}
>> +  \item \field{flags} has the VIRTIO_NET_HDR_F_UDP_TUNNEL_CSUM set,
>> +
>> +  \item The outer UDP checksum field in the packet is set to the sum
>> +    of the UDP pseudo header, so that replacing it by the ones'
>> +    complement checksum of the outer UDP header and payload will give the
>> +    correct result.
>> +  \end{itemize}
> 
> "set to the sum of the UDP pseudo header [of each segment]"?
> 
> as it will not be the UDP length of the gso packet.

AFAICS the Linux tunnel implementation uses the GSO length:

<each UDP tunnel> -> udp_tunnel_xmit_skb -> udp_set_csum

https://elixir.bootlin.com/linux/v6.10.2/source/net/ipv4/udp_tunnel_core.c#L170
https://elixir.bootlin.com/linux/v6.10.2/source/net/ipv4/udp.c#L890

Regardless of the Linux implementation, using the individual segment len 
will
makes things difficult for the last segment (len < gso_size). The caller 
will
have to touch the individual segment headers, which could be quite 
hard/costily.

Am I missing something?

Thanks,

Paolo


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v6 1/2] virtio-net: define UDP tunnel segmentation offload feature
  2024-07-29 11:30 ` [PATCH v6 1/2] virtio-net: define UDP tunnel segmentation offload feature Paolo Abeni
  2024-07-29 21:04   ` Willem de Bruijn
@ 2024-07-31  1:57   ` Jason Wang
  2024-07-31  8:43     ` Paolo Abeni
  1 sibling, 1 reply; 27+ messages in thread
From: Jason Wang @ 2024-07-31  1:57 UTC (permalink / raw)
  To: Paolo Abeni
  Cc: virtio-comment, maxime.coquelin, Eelco Chaudron,
	Stefano Garzarella, Willem de Bruijn

On Mon, Jul 29, 2024 at 7:30 PM Paolo Abeni <pabeni@redhat.com> wrote:
>
> The VIRTIO_NET_HDR_GSO_UDP_TUNNEL is a gso_type flag allowing GSO over
> UDP tunnel. It can be negotiated on both the host and guest sides.
>
> One constraint addressed here is that the virtio side (either device or
 > completely the inner and outer headers offset - to allow for later GSO.
>
> To accommodate such need, new fields are introduced in the virtio_net
> header: outer_th_offset, inner_protocol_type and inner_nh_offset.
> They map directly to the corresponding header information. The inner
> transport header is implied by the (inner) checksum offload.
>
> Those fields are required because a virtio device H/W implementation
> performing segmentation for UDP tunneled packet will need to touch
> the outer transport protocol (for the UDP length filed), the
> inner network protocol (for the total length field, in the IPv4
> case).
>
> Note that segmentation will additionally need to touch
> the outer network protocol and the inner transport protocol. The first
> is implied/easily found with trivial parsing, the latter is identified
> by the existing csum_start field.
>
> Note that there is no concept of UDP tunnel type negotiation (e.g.
> vxlan, geneve, vxlan-gpe, etc.), as a virtio device H/W implementation
> can perform segmentation for every possible UDP-tunnel given the
> specified new fields.
> In the reverse direction, if a virtio device H/W implementation receives
> some traffic for an unknown or unsupported UDP tunnel, it will simply
> not aggregate the wire packet in GSO one.
>
> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
> ---
> v5 -> v6:
>  - split in 2 patches
>  - dropped the unneeded outer_mh_offset field
> https://lore.kernel.org/virtio-comment/56ccba136272593d0d2af0303600c06f26bd84a1.1718621522.git.pabeni@redhat.com/
>
> v4 -> v5:
>  - add separate negotiation for UDP_TUNNEL_CSUM
>  - much more verbose specification of outer csum handling
>  - UDP_TUNNEL_CSUM requires GSO_UDP_TUNNEL
>  https://lore.kernel.org/virtio-comment/cb89d3d6b6a7e4a7f13e60dd3fded5e950727890.1716448766.git.pabeni@redhat.com/
>
> v3 -> v4:
>  - new virtio_net_hdr fields depends on F_HOST_UDP_TUNNEL_GSO or
>    F_GUEST_UDP_TUNNEL_GSO (Stefano)
>  - Extended the changelog to answer Jason's questions.
>  - Clarified outer csum handling (Jason)
>  https://lore.kernel.org/virtio-dev/f0bf2db203f8061a3ecb50b7a48cf26d8f3c7f68.1716193086.git.pabeni@redhat.com
>
> v2 -> v3:
>  - UDP_TUNNEL -> UDP_TUNNEL_GSO
>  - add explicit fields for the inner meta-data
>  - more verbose changelog
>  https://lists.oasis-open.org/archives/virtio-dev/202206/msg00026.html
>
> v1 -> v2:
>  - explicitly state that the outer header probing is mandatory
>  - explicitly state that GSO_UDP is not allowed with GSO_UDP_TUNNEL
>  - clarify hdr_len usage
>  - clarify UDP_TUNNEL_CSUM bit usage
>  - fix a few typos
>  https://lists.oasis-open.org/archives/virtio-dev/202205/msg00037.html
> ---
>  device-types/net/description.tex | 118 +++++++++++++++++++++++++++++--
>  1 file changed, 112 insertions(+), 6 deletions(-)
>
> diff --git a/device-types/net/description.tex b/device-types/net/description.tex
> index 76585b0..eda294d 100644
> --- a/device-types/net/description.tex
> +++ b/device-types/net/description.tex
> @@ -88,6 +88,12 @@ \subsection{Feature bits}\label{sec:Device Types / Network Device / Feature bits
>  \item[VIRTIO_NET_F_CTRL_MAC_ADDR(23)] Set MAC address through control
>      channel.
>
> +\item[VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO (47)] Driver can receive GSO packets
> +  carried by a UDP tunnel.
> +
> +\item[VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO (49)] Device can receive GSO packets
> +  carried by a UDP tunnel.
> +
>  \item[VIRTIO_NET_F_HASH_TUNNEL(51)] Device supports inner header hash for encapsulated packets.
>
>  \item[VIRTIO_NET_F_VQ_NOTF_COAL(52)] Device supports virtqueue notification coalescing.
> @@ -133,12 +139,16 @@ \subsubsection{Feature bit requirements}\label{sec:Device Types / Network Device
>  \item[VIRTIO_NET_F_GUEST_UFO] Requires VIRTIO_NET_F_GUEST_CSUM.
>  \item[VIRTIO_NET_F_GUEST_USO4] Requires VIRTIO_NET_F_GUEST_CSUM.
>  \item[VIRTIO_NET_F_GUEST_USO6] Requires VIRTIO_NET_F_GUEST_CSUM.
> +\item[VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO] Requires both VIRTIO_NET_F_GUEST_TSO4 and VIRTIO_NET_F_GUEST_TSO6, or
> +   both VIRTIO_NET_F_GUEST_USO4 and VIRTIO_NET_F_GUEST_USO6.

I think VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO requires all of the four?

>
>  \item[VIRTIO_NET_F_HOST_TSO4] Requires VIRTIO_NET_F_CSUM.
>  \item[VIRTIO_NET_F_HOST_TSO6] Requires VIRTIO_NET_F_CSUM.
>  \item[VIRTIO_NET_F_HOST_ECN] Requires VIRTIO_NET_F_HOST_TSO4 or VIRTIO_NET_F_HOST_TSO6.
>  \item[VIRTIO_NET_F_HOST_UFO] Requires VIRTIO_NET_F_CSUM.
>  \item[VIRTIO_NET_F_HOST_USO] Requires VIRTIO_NET_F_CSUM.
> +\item[VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO] Requires both VIRTIO_NET_F_HOST_TSO4 and VIRTIO_NET_F_HOST_TSO6, or
> +   both VIRTIO_NET_F_GUEST_USO4 or VIRTIO_NET_F_GUEST_USO6.
>
>  \item[VIRTIO_NET_F_CTRL_RX] Requires VIRTIO_NET_F_CTRL_VQ.
>  \item[VIRTIO_NET_F_CTRL_VLAN] Requires VIRTIO_NET_F_CTRL_VQ.
> @@ -375,6 +385,12 @@ \subsection{Device Initialization}\label{sec:Device Types / Network Device / Dev
>    TCP), VIRTIO_NET_F_HOST_TSO6 (IPv6 TCP), VIRTIO_NET_F_HOST_UFO
>    (UDP fragmentation) and VIRTIO_NET_F_HOST_USO (UDP segmentation) features.
>
> +\item If any TCP or UDP segmentation feature is negotiated,

Not a native speaker but this seems to conflict with the above. For
example, does "only VIRTIO_NET_F_GUEST_TSO4 is negotiated" mean any
TCP segmentation feature?

> a driver can
> +  use TCP segmentation or UDP segmentation on top of UDP encapsulation
> +  offload, when the outer header does not require checksumming - e.g.
> +  the outer UDP checksum is zero - by negotiating the
> +  VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO feature.
> +
>  \item The converse features are also available: a driver can save
>    the virtual device some work by negotiating these features.\note{For example, a network packet transported between two guests on
>  the same system might not need checksumming at all, nor segmentation,
> @@ -382,8 +398,9 @@ \subsection{Device Initialization}\label{sec:Device Types / Network Device / Dev
>     The VIRTIO_NET_F_GUEST_CSUM feature indicates that partially
>    checksummed packets can be received, and if it can do that then
>    the VIRTIO_NET_F_GUEST_TSO4, VIRTIO_NET_F_GUEST_TSO6,
> -  VIRTIO_NET_F_GUEST_UFO, VIRTIO_NET_F_GUEST_ECN, VIRTIO_NET_F_GUEST_USO4
> -  and VIRTIO_NET_F_GUEST_USO6 are the input equivalents of the features described above.
> +  VIRTIO_NET_F_GUEST_UFO, VIRTIO_NET_F_GUEST_ECN, VIRTIO_NET_F_GUEST_USO4,
> +  VIRTIO_NET_F_GUEST_USO6 and VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO
> +  are the input equivalents of the features described above.
>    See \ref{sec:Device Types / Network Device / Device Operation /
>  Setting Up Receive Buffers}~\nameref{sec:Device Types / Network
>  Device / Device Operation / Setting Up Receive Buffers} and
> @@ -413,6 +430,7 @@ \subsection{Device Operation}\label{sec:Device Types / Network Device / Device O
>  #define VIRTIO_NET_HDR_GSO_UDP         3
>  #define VIRTIO_NET_HDR_GSO_TCPV6       4
>  #define VIRTIO_NET_HDR_GSO_UDP_L4      5
> +#define VIRTIO_NET_HDR_GSO_UDP_TUNNEL 0x40
>  #define VIRTIO_NET_HDR_GSO_ECN      0x80
>          u8 gso_type;
>          le16 hdr_len;
> @@ -423,6 +441,9 @@ \subsection{Device Operation}\label{sec:Device Types / Network Device / Device O
>          le32 hash_value;        (Only if VIRTIO_NET_F_HASH_REPORT negotiated)
>          le16 hash_report;       (Only if VIRTIO_NET_F_HASH_REPORT negotiated)
>          le16 padding_reserved;  (Only if VIRTIO_NET_F_HASH_REPORT negotiated)
> +        le16 outer_th_offset    (Only if VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO negotiated)
> +        le16 inner_protocol_type;(Only if VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO negotiated)
> +        le16 inner_nh_offset;   (Only if VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO negotiated)
>  };
>  \end{lstlisting}
>
> @@ -480,6 +501,8 @@ \subsubsection{Packet Transmission}\label{sec:Device Types / Network Device / De
>  followed by the TCP header (with the TCP checksum field 16 bytes
>  into that header). \field{csum_start} will be 14+20 = 34 (the TCP
>  checksum includes the header), and \field{csum_offset} will be 16.
> +If the given packet has the VIRTIO_NET_HDR_GSO_UDP_TUNNEL bit set,
> +the above checksum fields refer to the inner header checksum.
>  \end{note}
>
>  \item If the driver negotiated
> @@ -516,6 +539,26 @@ \subsubsection{Packet Transmission}\label{sec:Device Types / Network Device / De
>  specifically in the protocol.}.
>     \end{itemize}
>
> +\item If the driver negotiated the VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO feature,
> +  the VIRTIO_NET_HDR_GSO_UDP_TUNNEL bit in \field{gso_type} indicates that
> +  the GSO protocol is encapsulated in a UDP tunnel.
> +  The outer UDP header must not require checksumming, e.g. the outer UDP checksum
> +  must be zero.
> +  The other tunnel-related fields indicate how to replicate the packet
> +  headers to cut it into smaller packets:
> +
> +  \begin{itemize}
> +  \item \field{outer_th_offset} field indicates the outer transport header within
> +      the packet. Note that such field differs from \field{csum_start} as the latter
> +      points to the inner transport header within the packet

I wonder if it's better to stick to outer_csum_start, and say it's
offset within the packet to begin checksumming for outer packet.  This
follows the existing name and could be reused for encapsulation other
than UDP tunnel.

> +
> +  \item \field{inner_protocol_type} field indicates the ethernet type of the inner
> +      protocol.
> +
> +  \item \field{inner_nh_offset} field indicates the inner network header within
> +      the packet.
> +  \end{itemize}
> +
>  \item \field{num_buffers} is set to zero.  This field is unused on transmitted packets.
>
>  \item The header and packet are added as one output descriptor to the
> @@ -557,6 +600,18 @@ \subsubsection{Packet Transmission}\label{sec:Device Types / Network Device / De
>  driver MUST set the VIRTIO_NET_HDR_GSO_ECN bit in
>  \field{gso_type}.
>
> +The driver MUST NOT send to the device TCP or UDP GSO packets over UDP tunnel
> +requiring segmentation offload, unless the VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO is
> +negotiated, in which case the driver MUST set the VIRTIO_NET_HDR_GSO_UDP_TUNNEL
> +bit in \field{gso_type}.

Is this better to follow the existing style of the driver normative?
For example, we had

"If VIRTIO_NET_F_HOST_TSO6 is negotiated, the driver MAY set gso_type
to VIRTIO_NET_HDR_GSO_TCPV6 to request TCPv6 segmentation, otherwise
the driver MUST NOT set gso_type to VIRTIO_NET_HDR_GSO_TCPV6."

We could do the same

"If VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO is negotiated, the driver MAY ...
otherwise the driver MUST NOT ..."

> +
> +The driver MUST NOT send to the device TCP or UDP GSO packets over UDP tunnel
> +requiring segmentation and outer UDP checksum offload.
> +
> +The driver MUST NOT set the VIRTIO_NET_HDR_GSO_UDP_TUNNEL together with
> +VIRTIO_NET_HDR_GSO_UDP, as the latter is deprecated in favor of UDP_L4
> +and no new features will support it.
> +
>  If the VIRTIO_NET_F_CSUM feature has been negotiated, the
>  driver MAY set the VIRTIO_NET_HDR_F_NEEDS_CSUM bit in
>  \field{flags}, if so:
> @@ -598,6 +653,28 @@ \subsubsection{Packet Transmission}\label{sec:Device Types / Network Device / De
>         header.
>  \end{itemize}
>
> +If the VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO option has been negotiated, the
> +driver MAY set the VIRTIO_NET_HDR_GSO_UDP_TUNNEL bit in \field{gso_type}, if so:
> +\begin{itemize}
> +\item the driver MUST set \field{outer_th_offset} to the outer UDP header
> +  offset, \field{inner_protocol_type} to the ethernet type of the inner
> +  protocol, and \field{inner_nh_offset} to the inner network header offset.
> +\end{itemize}

Do we need to say then csum_start and csum_offset is for the inner packet here?

> +
> +If the bit VIRTIO_NET_HDR_GSO_UDP_TUNNEL in \field{gso_type} is set, the outer
> +UDP header MUST NOT require checksumming. That is the, the outer UDP checksum
> +value must be 0, or the valid complete checksum for such header.
> +
> +\begin{note}
> +The valid complete checksum of the outer UDP header of individual segments
> +can be computed by the driver prior to segmentation only if the GSO packet
> +size is a multiple of \field{gso_size}.
> +\end{note}
> +
> +If the VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO option has not
> +been negotiated, the driver MUST NOT set the VIRTIO_NET_HDR_F_UDP_TUNNEL bit
> +in \field{gso_type}.
> +
>  The driver SHOULD accept the VIRTIO_NET_F_GUEST_HDRLEN feature if it has
>  been offered, and if it's able to provide the exact header length.
>
> @@ -633,6 +710,10 @@ \subsubsection{Packet Transmission}\label{sec:Device Types / Network Device / De
>         \end{note}
>  \end{itemize}
>
> +If VIRTIO_NET_HDR_GSO_UDP_TUNNEL bit in \field{gso_type} is not set, the
> +device MUST NOT use the \field{outer_th_offset}, \field{inner_protocol_type}
> + and \field{inner_nh_offset}.
> +
>  If VIRTIO_NET_HDR_F_NEEDS_CSUM is not set, the device MUST NOT
>  rely on the packet checksum being correct.
>  \paragraph{Packet Transmission Interrupt}\label{sec:Device Types / Network Device / Device Operation / Packet Transmission / Packet Transmission Interrupt}
> @@ -727,8 +808,8 @@ \subsubsection{Processing of Incoming Packets}\label{sec:Device Types / Network
>    has been validated.
>  \end{enumerate}
>
> -Additionally, VIRTIO_NET_F_GUEST_CSUM, TSO4, TSO6, UDP and ECN
> -features enable receive checksum, large receive offload and ECN
> +Additionally, VIRTIO_NET_F_GUEST_CSUM, TSO4, TSO6, UDP, UDP_TUNNEL
> +and ECN features enable receive checksum, large receive offload and ECN
>  support which are the input equivalents of the transmit checksum,
>  transmit segmentation offloading and ECN features, as described
>  in \ref{sec:Device Types / Network Device / Device Operation /
> @@ -750,8 +831,14 @@ \subsubsection{Processing of Incoming Packets}\label{sec:Device Types / Network
>    from \field{csum_start} and any preceding checksums
>    have been validated.  The checksum on the packet is incomplete and
>    if bit VIRTIO_NET_HDR_F_RSC_INFO is not set in \field{flags},
> -  then \field{csum_start} and \field{csum_offset} indicate how to calculate it
> -  (see Packet Transmission point 1).
> +  then \field{csum_start} and \field{csum_offset} indicate how to calculate it.
> +  If the VIRTIO_NET_HDR_GSO_UDP_TUNNEL bit is set, the \field{csum_start} field
> +  refers to the inner transport header offset (see Packet Transmission point 1).
> +\item If the VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO option was negotiated and
> +  \field{gso_type} is not VIRTIO_NET_HDR_GSO_NONE, the VIRTIO_NET_HDR_GSO_UDP_TUNNEL
> +  bit MAY be set. In such case the \field{outer_th_offset}, \field{inner_protocol_type},
> +  and \field{inner_nh_offset} fields indicate the corresponding
> +  headers information.
>
>  \end{enumerate}
>
> @@ -796,6 +883,9 @@ \subsubsection{Processing of Incoming Packets}\label{sec:Device Types / Network
>  If none of VIRTIO_NET_F_GUEST_USO4 or VIRTIO_NET_F_GUEST_USO6 have been negotiated,
>  the device MUST NOT set \field{gso_type} to VIRTIO_NET_HDR_GSO_UDP_L4.
>
> +If VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO is not negotiated the device MUST NOT set
> +the VIRTIO_NET_HDR_GSO_UDP_TUNNEL bit in \field{gso_type}.
> +
>  The device SHOULD NOT send to the driver TCP packets requiring segmentation offload
>  which have the Explicit Congestion Notification bit set, unless the
>  VIRTIO_NET_F_GUEST_ECN feature is negotiated, in which case the
> @@ -819,6 +909,17 @@ \subsubsection{Processing of Incoming Packets}\label{sec:Device Types / Network
>         fully checksummed packet;
>  \end{enumerate}
>
> +The device MUST NOT send to the driver TCP or UDP GSO packets encapsulated in UDP
> +tunnel and requiring segmentation offload, unless the
> +VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO is negotiated, in which case the device MUST set
> +the VIRTIO_NET_HDR_GSO_UDP_TUNNEL bit in \field{gso_type} and MUST set the
> +\field{outer_th_offset}, \field{inner_protocol_type} and
> +\field{inner_nh_offset} fields to the corresponding header information, and the outer
> +UDP header MUST NOT require checksum offload.

Similarly, do we need to say csum_start|offset is for inner?

> +
> +The device MUST NOT send the driver TCP or UDP GSO packets encapsulated in UDP
> +tunnel and requiring segmentation and outer checksum offload.
> +
>  If none of the VIRTIO_NET_F_GUEST_TSO4, TSO6, UFO, USO4 or USO6 options have
>  been negotiated, the device MUST set \field{gso_type} to
>  VIRTIO_NET_HDR_GSO_NONE.
> @@ -867,6 +968,10 @@ \subsubsection{Processing of Incoming Packets}\label{sec:Device Types / Network
>  VIRTIO_NET_HDR_F_DATA_VALID is set, the driver MUST NOT
>  rely on the packet checksum being correct.
>
> +If the VIRTIO_NET_HDR_GSO_UDP_TUNNEL_GSO bit in \field{gso_type} is not set,
> +the driver MUST NOT use the \field{outer_th_offset}, \field{inner_protocol_type},
> +\field{inner_mac_offset} and \field{inner_nh_offset}.
> +
>  \paragraph{Hash calculation for incoming packets}
>  \label{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Hash calculation for incoming packets}
>
> @@ -1624,6 +1729,7 @@ \subsubsection{Control Virtqueue}\label{sec:Device Types / Network Device / Devi
>  #define VIRTIO_NET_F_GUEST_TSO6       8
>  #define VIRTIO_NET_F_GUEST_ECN        9
>  #define VIRTIO_NET_F_GUEST_UFO        10
> +#define VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO  47
>  #define VIRTIO_NET_F_GUEST_USO4       54
>  #define VIRTIO_NET_F_GUEST_USO6       55
>
> --
> 2.45.2
>

Thanks


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v6 1/2] virtio-net: define UDP tunnel segmentation offload feature
  2024-07-30  7:33     ` Paolo Abeni
@ 2024-07-31  4:10       ` Jason Wang
  2024-07-31  9:02         ` Paolo Abeni
  2024-07-31 14:25         ` Willem de Bruijn
  0 siblings, 2 replies; 27+ messages in thread
From: Jason Wang @ 2024-07-31  4:10 UTC (permalink / raw)
  To: Paolo Abeni
  Cc: Willem de Bruijn, virtio-comment, maxime.coquelin, Eelco Chaudron,
	Stefano Garzarella

On Tue, Jul 30, 2024 at 3:33 PM Paolo Abeni <pabeni@redhat.com> wrote:
>
> On 7/29/24 23:04, Willem de Bruijn wrote:
> > You have a working software implementation, right? Will this be sent
> > (as RFC) to the public netdev or so mailing list before this becomes a
> > standard? I assume that there is a standard process for virtio that
> > you are following. I'd just be more confident in a thumbs up if we
> > also review the code that implements the text.
>
> My plan is to write the S/W virtio_net implementation, too, but it's
> currently not ready. I had a working PoC for the first draft of this
> changeset, long time ago. Things are changed enough that I'll need a
> rewrite. I can try to shared such a thing on netdev ASAP (where "soon"
> here is a very relative terms ;)
>
> Also I don't know the standardization process. I hope for some guidance
> from Jason and/or Michael.

It should be something like this. When there's no future comment, open
an issue https://github.com/oasis-tcs/virtio-spec/issues and ask for a
vote. When it passes the voting, it would be merged.

>
> >
> >> @@ -423,6 +441,9 @@ \subsection{Device Operation}\label{sec:Device Types / Network Device / Device O
> >>           le32 hash_value;        (Only if VIRTIO_NET_F_HASH_REPORT negotiated)
> >>           le16 hash_report;       (Only if VIRTIO_NET_F_HASH_REPORT negotiated)
> >>           le16 padding_reserved;  (Only if VIRTIO_NET_F_HASH_REPORT negotiated)
> >> +        le16 outer_th_offset    (Only if VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO negotiated)
> >> +        le16 inner_protocol_type;(Only if VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO negotiated)
> >> +        le16 inner_nh_offset;   (Only if VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO negotiated)
> >>   };
> >
> > In relation to the above: I recall that expanding the virtio_net_hdr
> > was non-trivial when I last tried to add fields ("[rfc,2/3]
> > virtio-net: support receive timestamp"). If that is the experience
> > again, it might be worthwhile to expand with enough space to also
> > accommodate a few subsequent extensions. Headroom is not that
> > expensive.
>

Yes, but it looks like an independent issue for this.

> Do you mean, by adding 'padding' or 'unused' fields?
>
> Side note: the above is not 32 bit aligned, do we need such aligment?

Maybe, but it should be an independent topic.

> If we would have VIRTIO_NET_HDR_GSO_UDPv4_L4 and
> VIRTIO_NET_HDR_GSO_UDPv6_L4 variants, we could avoid entirely
> 'inner_protocol_type', as the network protocol will be implied by the
> GSO type itself.

I may miss something, but can it be done by just introducing a new
gso_type without vnet header extension?

>
> >> +\item If the driver negotiated the VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO feature,
> >> +  the VIRTIO_NET_HDR_GSO_UDP_TUNNEL bit in \field{gso_type} indicates that
> >> +  the GSO protocol is encapsulated in a UDP tunnel.
> >> +  The outer UDP header must not require checksumming, e.g. the outer UDP checksum
> >> +  must be zero.
> >
> > nit: "e.g." should be "i.e.," here
> >
> > and maybe must be a positive zero (0x0) checksum as defined in UDP RFC 768.
> >
> >> +If the bit VIRTIO_NET_HDR_GSO_UDP_TUNNEL in \field{gso_type} is set, the outer
> >> +UDP header MUST NOT require checksumming. That is the, the outer UDP checksum
> >> +value must be 0, or the valid complete checksum for such header.
> >
> > MUST NOT require checksum validation. [..] or the validated complete checksum
>
> Ack to all the above.
>
> Thanks!
>
> Paolo
>

Thanks


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v6 1/2] virtio-net: define UDP tunnel segmentation offload feature
  2024-07-31  1:57   ` Jason Wang
@ 2024-07-31  8:43     ` Paolo Abeni
  2024-07-31  9:16       ` Paolo Abeni
  2024-08-01  3:01       ` Jason Wang
  0 siblings, 2 replies; 27+ messages in thread
From: Paolo Abeni @ 2024-07-31  8:43 UTC (permalink / raw)
  To: Jason Wang
  Cc: virtio-comment, maxime.coquelin, Eelco Chaudron,
	Stefano Garzarella, Willem de Bruijn

On 7/31/24 03:57, Jason Wang wrote:
>> @@ -133,12 +139,16 @@ \subsubsection{Feature bit requirements}\label{sec:Device Types / Network Device
>>   \item[VIRTIO_NET_F_GUEST_UFO] Requires VIRTIO_NET_F_GUEST_CSUM.
>>   \item[VIRTIO_NET_F_GUEST_USO4] Requires VIRTIO_NET_F_GUEST_CSUM.
>>   \item[VIRTIO_NET_F_GUEST_USO6] Requires VIRTIO_NET_F_GUEST_CSUM.
>> +\item[VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO] Requires both VIRTIO_NET_F_GUEST_TSO4 and VIRTIO_NET_F_GUEST_TSO6, or
>> +   both VIRTIO_NET_F_GUEST_USO4 and VIRTIO_NET_F_GUEST_USO6.
> 
> I think VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO requires all of the four?

Functionally speaking is not strictly necessary: the UDP tunnel feature 
needs just a single inner GSO type to assemble and process an UDP 
tunneled packet.

On the flip side I think is reasonable that modern implementations 
support all the mentioned inner GSO type and requiring all of them would 
possibly make the implementation simpler and/or less bug prone.

So I guess I can change the requirement as you say above.

>>   \item[VIRTIO_NET_F_HOST_TSO4] Requires VIRTIO_NET_F_CSUM.
>>   \item[VIRTIO_NET_F_HOST_TSO6] Requires VIRTIO_NET_F_CSUM.
>>   \item[VIRTIO_NET_F_HOST_ECN] Requires VIRTIO_NET_F_HOST_TSO4 or VIRTIO_NET_F_HOST_TSO6.
>>   \item[VIRTIO_NET_F_HOST_UFO] Requires VIRTIO_NET_F_CSUM.
>>   \item[VIRTIO_NET_F_HOST_USO] Requires VIRTIO_NET_F_CSUM.
>> +\item[VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO] Requires both VIRTIO_NET_F_HOST_TSO4 and VIRTIO_NET_F_HOST_TSO6, or
>> +   both VIRTIO_NET_F_GUEST_USO4 or VIRTIO_NET_F_GUEST_USO6.
>>
>>   \item[VIRTIO_NET_F_CTRL_RX] Requires VIRTIO_NET_F_CTRL_VQ.
>>   \item[VIRTIO_NET_F_CTRL_VLAN] Requires VIRTIO_NET_F_CTRL_VQ.
>> @@ -375,6 +385,12 @@ \subsection{Device Initialization}\label{sec:Device Types / Network Device / Dev
>>     TCP), VIRTIO_NET_F_HOST_TSO6 (IPv6 TCP), VIRTIO_NET_F_HOST_UFO
>>     (UDP fragmentation) and VIRTIO_NET_F_HOST_USO (UDP segmentation) features.
>>
>> +\item If any TCP or UDP segmentation feature is negotiated,
> 
> Not a native speaker but this seems to conflict with the above. For
> example, does "only VIRTIO_NET_F_GUEST_TSO4 is negotiated" mean any
> TCP segmentation feature?

As per the previois comment, I'll change this in the next revision to:

"""
If the VIRTIO_NET_F_HOST_TSO6, VIRTIO_NET_F_HOST_TSO4 and 
VIRTIO_NET_HDR_GSO_UDP_L4 features are negotiated
""">> @@ -516,6 +539,26 @@ \subsubsection{Packet 
Transmission}\label{sec:Device Types / Network Device / De
>>   specifically in the protocol.}.
>>      \end{itemize}
>>
>> +\item If the driver negotiated the VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO feature,
>> +  the VIRTIO_NET_HDR_GSO_UDP_TUNNEL bit in \field{gso_type} indicates that
>> +  the GSO protocol is encapsulated in a UDP tunnel.
>> +  The outer UDP header must not require checksumming, e.g. the outer UDP checksum
>> +  must be zero.
>> +  The other tunnel-related fields indicate how to replicate the packet
>> +  headers to cut it into smaller packets:
>> +
>> +  \begin{itemize}
>> +  \item \field{outer_th_offset} field indicates the outer transport header within
>> +      the packet. Note that such field differs from \field{csum_start} as the latter
>> +      points to the inner transport header within the packet
> 
> I wonder if it's better to stick to outer_csum_start, and say it's
> offset within the packet to begin checksumming for outer packet.  This
> follows the existing name and could be reused for encapsulation other
> than UDP tunnel.

I thought we agreed on this name in the previous iteration?!? It feels 
strange to introduce a csum related field in a change NOT introducing a 
checksum related feature. And we need this filed here, for plain 
segmentation.

>> @@ -557,6 +600,18 @@ \subsubsection{Packet Transmission}\label{sec:Device Types / Network Device / De
>>   driver MUST set the VIRTIO_NET_HDR_GSO_ECN bit in
>>   \field{gso_type}.
>>
>> +The driver MUST NOT send to the device TCP or UDP GSO packets over UDP tunnel
>> +requiring segmentation offload, unless the VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO is
>> +negotiated, in which case the driver MUST set the VIRTIO_NET_HDR_GSO_UDP_TUNNEL
>> +bit in \field{gso_type}.
> 
> Is this better to follow the existing style of the driver normative?
> For example, we had
> 
> "If VIRTIO_NET_F_HOST_TSO6 is negotiated, the driver MAY set gso_type
> to VIRTIO_NET_HDR_GSO_TCPV6 to request TCPv6 segmentation, otherwise
> the driver MUST NOT set gso_type to VIRTIO_NET_HDR_GSO_TCPV6."
> 
> We could do the same
> 
> "If VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO is negotiated, the driver MAY ...
> otherwise the driver MUST NOT ..."

ok, I'll do that in the next revision

>> @@ -598,6 +653,28 @@ \subsubsection{Packet Transmission}\label{sec:Device Types / Network Device / De
>>          header.
>>   \end{itemize}
>>
>> +If the VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO option has been negotiated, the
>> +driver MAY set the VIRTIO_NET_HDR_GSO_UDP_TUNNEL bit in \field{gso_type}, if so:
>> +\begin{itemize}
>> +\item the driver MUST set \field{outer_th_offset} to the outer UDP header
>> +  offset, \field{inner_protocol_type} to the ethernet type of the inner
>> +  protocol, and \field{inner_nh_offset} to the inner network header offset.
>> +\end{itemize}
> 
> Do we need to say then csum_start and csum_offset is for the inner packet here?

I thought such statement was already everywhere, but no objection to add 
it here, too.

Somewhat related, I have a doubt WRT the csum_start/csum_offset fields 
"presence" with plain GSO packets. AFAICS, the specs require the csum 
offload feature as a pre-req for the plain segmentation feature, but 
also allow the reception of a GSO packet without the 
VIRTIO_NET_HDR_F_NEEDS_CSUM bit set, see the recent linux patch from Willem:

https://lore.kernel.org/netdev/20240729201108.1615114-1-willemdebruijn.kernel@gmail.com/

I think it will make the thing simpler if we always require such bit to 
be set for GSO packets. I guess we can't change the current status for 
the existing GSO types, but for UDP tunnel we could add that requirement 
from the start, both for the driver -> device direction and for the 
device -> driver direction.

Even when the device sets the VIRTIO_NET_HDR_F_DATA_VALID, so can always 
expect a valid csum_start/csum_offset pair, and make things hopefully 
more robust. When VIRTIO_NET_HDR_F_DATA_VALID is set, the pair will NOT 
be used to validated the csum, just to point to the inner transport 
header and avoid parsing. WDYT?

Thanks,

Paolo


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v6 1/2] virtio-net: define UDP tunnel segmentation offload feature
  2024-07-31  4:10       ` Jason Wang
@ 2024-07-31  9:02         ` Paolo Abeni
  2024-07-31 11:17           ` Paolo Abeni
  2024-08-01  3:14           ` Jason Wang
  2024-07-31 14:25         ` Willem de Bruijn
  1 sibling, 2 replies; 27+ messages in thread
From: Paolo Abeni @ 2024-07-31  9:02 UTC (permalink / raw)
  To: Jason Wang
  Cc: Willem de Bruijn, virtio-comment, maxime.coquelin, Eelco Chaudron,
	Stefano Garzarella

On 7/31/24 06:10, Jason Wang wrote:
> On Tue, Jul 30, 2024 at 3:33 PM Paolo Abeni <pabeni@redhat.com> wrote:
>> On 7/29/24 23:04, Willem de Bruijn wrote:
>>>> @@ -423,6 +441,9 @@ \subsection{Device Operation}\label{sec:Device Types / Network Device / Device O
>>>>            le32 hash_value;        (Only if VIRTIO_NET_F_HASH_REPORT negotiated)
>>>>            le16 hash_report;       (Only if VIRTIO_NET_F_HASH_REPORT negotiated)
>>>>            le16 padding_reserved;  (Only if VIRTIO_NET_F_HASH_REPORT negotiated)
>>>> +        le16 outer_th_offset    (Only if VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO negotiated)
>>>> +        le16 inner_protocol_type;(Only if VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO negotiated)
>>>> +        le16 inner_nh_offset;   (Only if VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO negotiated)
>>>>    };

[...]

>> If we would have VIRTIO_NET_HDR_GSO_UDPv4_L4 and
>> VIRTIO_NET_HDR_GSO_UDPv6_L4 variants, we could avoid entirely
>> 'inner_protocol_type', as the network protocol will be implied by the
>> GSO type itself.
> 
> I may miss something, but can it be done by just introducing a new
> gso_type without vnet header extension?

Yep, that is what I mean above: we could replace the
'inner_protocol_type' field - which carries a single bit information, 
inner ipv4 vs ipv6 - with some (inner) GSO-type related info.

Unfortunately that last step does not look 110% 'clean' to me, as the 
inner GSO type is already specified by VIRTIO_NET_HDR_GSO_TCPV4, 
VIRTIO_NET_HDR_GSO_TCPV6 or
VIRTIO_NET_HDR_GSO_UDP_L4. In case of inner TCP, the inner network 
protocol is already specified, but it's not in case of UDP.

I'm wondering: why the single VIRTIO_NET_HDR_GSO_UDP_L4 value has been 
preferred to a pair of ipv4/ipv6 variants? Could we add such variants 
now? Note that I would like such option very much, as it will make this 
change even more complex, but looks IMHO quite clean.

Otherwise another option would be to allocate another bit in the GSO 
type field, set such field when VIRTIO_NET_HDR_GSO_UDP_TUNNEL is set, 
too, and use such field specify the inner header network protocol. I 
don't like much even this option because the new bit will carry 
duplicate information in case of TCP, and it's implementation looks bug 
prone.

I don't see other options, any suggestions/preferences welcome!

Thanks,

Paolo





> 
>>
>>>> +\item If the driver negotiated the VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO feature,
>>>> +  the VIRTIO_NET_HDR_GSO_UDP_TUNNEL bit in \field{gso_type} indicates that
>>>> +  the GSO protocol is encapsulated in a UDP tunnel.
>>>> +  The outer UDP header must not require checksumming, e.g. the outer UDP checksum
>>>> +  must be zero.
>>>
>>> nit: "e.g." should be "i.e.," here
>>>
>>> and maybe must be a positive zero (0x0) checksum as defined in UDP RFC 768.
>>>
>>>> +If the bit VIRTIO_NET_HDR_GSO_UDP_TUNNEL in \field{gso_type} is set, the outer
>>>> +UDP header MUST NOT require checksumming. That is the, the outer UDP checksum
>>>> +value must be 0, or the valid complete checksum for such header.
>>>
>>> MUST NOT require checksum validation. [..] or the validated complete checksum
>>
>> Ack to all the above.
>>
>> Thanks!
>>
>> Paolo
>>
> 
> Thanks
> 


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v6 1/2] virtio-net: define UDP tunnel segmentation offload feature
  2024-07-31  8:43     ` Paolo Abeni
@ 2024-07-31  9:16       ` Paolo Abeni
  2024-08-01  3:01       ` Jason Wang
  1 sibling, 0 replies; 27+ messages in thread
From: Paolo Abeni @ 2024-07-31  9:16 UTC (permalink / raw)
  To: Jason Wang
  Cc: virtio-comment, maxime.coquelin, Eelco Chaudron,
	Stefano Garzarella, Willem de Bruijn

On 7/31/24 10:43, Paolo Abeni wrote:
> On 7/31/24 03:57, Jason Wang wrote:
>>>    \item[VIRTIO_NET_F_HOST_TSO4] Requires VIRTIO_NET_F_CSUM.
>>>    \item[VIRTIO_NET_F_HOST_TSO6] Requires VIRTIO_NET_F_CSUM.
>>>    \item[VIRTIO_NET_F_HOST_ECN] Requires VIRTIO_NET_F_HOST_TSO4 or VIRTIO_NET_F_HOST_TSO6.
>>>    \item[VIRTIO_NET_F_HOST_UFO] Requires VIRTIO_NET_F_CSUM.
>>>    \item[VIRTIO_NET_F_HOST_USO] Requires VIRTIO_NET_F_CSUM.
>>> +\item[VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO] Requires both VIRTIO_NET_F_HOST_TSO4 and VIRTIO_NET_F_HOST_TSO6, or
>>> +   both VIRTIO_NET_F_GUEST_USO4 or VIRTIO_NET_F_GUEST_USO6.
>>>
>>>    \item[VIRTIO_NET_F_CTRL_RX] Requires VIRTIO_NET_F_CTRL_VQ.
>>>    \item[VIRTIO_NET_F_CTRL_VLAN] Requires VIRTIO_NET_F_CTRL_VQ.
>>> @@ -375,6 +385,12 @@ \subsection{Device Initialization}\label{sec:Device Types / Network Device / Dev
>>>      TCP), VIRTIO_NET_F_HOST_TSO6 (IPv6 TCP), VIRTIO_NET_F_HOST_UFO
>>>      (UDP fragmentation) and VIRTIO_NET_F_HOST_USO (UDP segmentation) features.
>>>
>>> +\item If any TCP or UDP segmentation feature is negotiated,
>>
>> Not a native speaker but this seems to conflict with the above. For
>> example, does "only VIRTIO_NET_F_GUEST_TSO4 is negotiated" mean any
>> TCP segmentation feature?
> 
> As per the previois comment, I'll change this in the next revision to:
> 
> """
> If the VIRTIO_NET_F_HOST_TSO6, VIRTIO_NET_F_HOST_TSO4 and
> VIRTIO_NET_HDR_GSO_UDP_L4 features are negotiated
> """

Typos above, should be:

"""
If the VIRTIO_NET_F_HOST_TSO6, VIRTIO_NET_F_HOST_TSO4, 
VIRTIO_NET_F_HOST_USO features are negotiated
"""

Cheers,

Paolo


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v6 1/2] virtio-net: define UDP tunnel segmentation offload feature
  2024-07-31  9:02         ` Paolo Abeni
@ 2024-07-31 11:17           ` Paolo Abeni
  2024-08-01  3:14           ` Jason Wang
  1 sibling, 0 replies; 27+ messages in thread
From: Paolo Abeni @ 2024-07-31 11:17 UTC (permalink / raw)
  To: Jason Wang
  Cc: Willem de Bruijn, virtio-comment, maxime.coquelin, Eelco Chaudron,
	Stefano Garzarella

On 7/31/24 11:02, Paolo Abeni wrote:
> On 7/31/24 06:10, Jason Wang wrote:
>> On Tue, Jul 30, 2024 at 3:33 PM Paolo Abeni <pabeni@redhat.com> wrote:
>>> On 7/29/24 23:04, Willem de Bruijn wrote:
>>>>> @@ -423,6 +441,9 @@ \subsection{Device Operation}\label{sec:Device Types / Network Device / Device O
>>>>>             le32 hash_value;        (Only if VIRTIO_NET_F_HASH_REPORT negotiated)
>>>>>             le16 hash_report;       (Only if VIRTIO_NET_F_HASH_REPORT negotiated)
>>>>>             le16 padding_reserved;  (Only if VIRTIO_NET_F_HASH_REPORT negotiated)
>>>>> +        le16 outer_th_offset    (Only if VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO negotiated)
>>>>> +        le16 inner_protocol_type;(Only if VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO negotiated)
>>>>> +        le16 inner_nh_offset;   (Only if VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO negotiated)
>>>>>     };
> 
> [...]
> 
>>> If we would have VIRTIO_NET_HDR_GSO_UDPv4_L4 and
>>> VIRTIO_NET_HDR_GSO_UDPv6_L4 variants, we could avoid entirely
>>> 'inner_protocol_type', as the network protocol will be implied by the
>>> GSO type itself.
>>
>> I may miss something, but can it be done by just introducing a new
>> gso_type without vnet header extension?
> 
> Yep, that is what I mean above: we could replace the
> 'inner_protocol_type' field - which carries a single bit information,
> inner ipv4 vs ipv6 - with some (inner) GSO-type related info.
> 
> Unfortunately that last step does not look 110% 'clean' to me, as the
> inner GSO type is already specified by VIRTIO_NET_HDR_GSO_TCPV4,
> VIRTIO_NET_HDR_GSO_TCPV6 or
> VIRTIO_NET_HDR_GSO_UDP_L4. In case of inner TCP, the inner network
> protocol is already specified, but it's not in case of UDP.
> 
> I'm wondering: why the single VIRTIO_NET_HDR_GSO_UDP_L4 value has been
> preferred to a pair of ipv4/ipv6 variants? Could we add such variants
> now? Note that I would like such option very much, as it will make this
> change even more complex, but looks IMHO quite clean.

After more pondering, I think implementing this option would be quite a 
mess. E.g. we would need a feature bit for 
VIRTIO_NET_HDR_GSO_UDP{4,6}_L4 different from 
VIRTIO_NET_F_{GUEST,HOST}_USO{4,6,}...

I guess we are better off excluding such option.

> Otherwise another option would be to allocate another bit in the GSO
> type field, set such field when VIRTIO_NET_HDR_GSO_UDP_TUNNEL is set,
> too, and use such field specify the inner header network protocol. I
> don't like much even this option because the new bit will carry
> duplicate information in case of TCP, and it's implementation looks bug
> prone.

In practice it would replace 'inner_protocol_type' with:

#define VIRTIO_NET_HDR_GSO_INNER_IPv6 0x20

And set/clearing such bit depending on the inner network protocol type, 
for VIRTIO_NET_HDR_GSO_UDP_TUNNEL packets.

Processing the virtio net hdr will require validating such bit value 
with WRT the carried GSO type, i.e. must be set when

gso_type & (VIRTIO_NET_HDR_GSO_TCPV6 | VIRTIO_NET_HDR_GSO_UDP_TUNNEL) ==
VIRTIO_NET_HDR_GSO_TCPV6 | VIRTIO_NET_HDR_GSO_UDP_TUNNEL

must be cleared in all the other cases

Cheers,

Paolo


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v6 1/2] virtio-net: define UDP tunnel segmentation offload feature
  2024-07-31  4:10       ` Jason Wang
  2024-07-31  9:02         ` Paolo Abeni
@ 2024-07-31 14:25         ` Willem de Bruijn
  2024-07-31 17:32           ` Paolo Abeni
  2024-08-01  2:24           ` Jason Wang
  1 sibling, 2 replies; 27+ messages in thread
From: Willem de Bruijn @ 2024-07-31 14:25 UTC (permalink / raw)
  To: Jason Wang
  Cc: Paolo Abeni, virtio-comment, maxime.coquelin, Eelco Chaudron,
	Stefano Garzarella

On Wed, Jul 31, 2024 at 12:11 AM Jason Wang <jasowang@redhat.com> wrote:
>
> On Tue, Jul 30, 2024 at 3:33 PM Paolo Abeni <pabeni@redhat.com> wrote:
> >
> > On 7/29/24 23:04, Willem de Bruijn wrote:
> > > You have a working software implementation, right? Will this be sent
> > > (as RFC) to the public netdev or so mailing list before this becomes a
> > > standard? I assume that there is a standard process for virtio that
> > > you are following. I'd just be more confident in a thumbs up if we
> > > also review the code that implements the text.
> >
> > My plan is to write the S/W virtio_net implementation, too, but it's
> > currently not ready. I had a working PoC for the first draft of this
> > changeset, long time ago. Things are changed enough that I'll need a
> > rewrite. I can try to shared such a thing on netdev ASAP (where "soon"
> > here is a very relative terms ;)
> >
> > Also I don't know the standardization process. I hope for some guidance
> > from Jason and/or Michael.
>
> It should be something like this. When there's no future comment, open
> an issue https://github.com/oasis-tcs/virtio-spec/issues and ask for a
> vote. When it passes the voting, it would be merged.

So changes to the spec can be merged, because we know that they can be
implemented reasonably?

IETF always requires (two, because interoperable protocol definitions)
working implementations.

It sounds a bit risky to me to skip this, as you come across the interesting
non-obvious details during implementation.

Since implementation is planned anyway, doing at least a rough proof of
concept first does not introduce extra work, but reduces risk that the spec
contains errors.

Unless it is easy to update the spec before it is numbered and finalized.
Then issues can be fixed up.

Anyway, you have gone through this many times, so whatever the process
is, I trust that it works. Just sharing my thought as a somewhat uniformed
reader.

>
> >
> > >
> > >> @@ -423,6 +441,9 @@ \subsection{Device Operation}\label{sec:Device Types / Network Device / Device O
> > >>           le32 hash_value;        (Only if VIRTIO_NET_F_HASH_REPORT negotiated)
> > >>           le16 hash_report;       (Only if VIRTIO_NET_F_HASH_REPORT negotiated)
> > >>           le16 padding_reserved;  (Only if VIRTIO_NET_F_HASH_REPORT negotiated)
> > >> +        le16 outer_th_offset    (Only if VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO negotiated)
> > >> +        le16 inner_protocol_type;(Only if VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO negotiated)
> > >> +        le16 inner_nh_offset;   (Only if VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO negotiated)
> > >>   };
> > >
> > > In relation to the above: I recall that expanding the virtio_net_hdr
> > > was non-trivial when I last tried to add fields ("[rfc,2/3]
> > > virtio-net: support receive timestamp"). If that is the experience
> > > again, it might be worthwhile to expand with enough space to also
> > > accommodate a few subsequent extensions. Headroom is not that
> > > expensive.
> >
>
> Yes, but it looks like an independent issue for this.

If this change requires new headroom, then it depends on headroom expansion.

My suggestion is to then expand it significantly immediately, to minimize the
number of times we have to change the virtio_net code to support new lengths.

We already have structs virtio_net_hdr, virtio_net_hdr_mrg_rxbuf,
virtio_net_hdr_v1, virtio_net_hdr_v1_hash.

Adding a new struct with every new field added just adds lots of churn in
the code.

Just a thought, feel free to skip if the maintainers disagree.


>
> > Do you mean, by adding 'padding' or 'unused' fields?

Yes.

> >
> > Side note: the above is not 32 bit aligned, do we need such aligment?
>
> Maybe, but it should be an independent topic.
>
> > If we would have VIRTIO_NET_HDR_GSO_UDPv4_L4 and
> > VIRTIO_NET_HDR_GSO_UDPv6_L4 variants, we could avoid entirely
> > 'inner_protocol_type', as the network protocol will be implied by the
> > GSO type itself.

But that's too late now? Probably don't want to make this work
depending on such a change.

I don't entirely recall why we did not do that immediately. Perhaps
it is related to also only having SKB_GSO_UDP_L4.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v6 1/2] virtio-net: define UDP tunnel segmentation offload feature
  2024-07-31 14:25         ` Willem de Bruijn
@ 2024-07-31 17:32           ` Paolo Abeni
  2024-07-31 18:21             ` Willem de Bruijn
  2024-08-01  2:24           ` Jason Wang
  1 sibling, 1 reply; 27+ messages in thread
From: Paolo Abeni @ 2024-07-31 17:32 UTC (permalink / raw)
  To: Willem de Bruijn, Jason Wang
  Cc: virtio-comment, maxime.coquelin, Eelco Chaudron,
	Stefano Garzarella

On 7/31/24 16:25, Willem de Bruijn wrote:
> On Wed, Jul 31, 2024 at 12:11 AM Jason Wang <jasowang@redhat.com> wrote:
>> On Tue, Jul 30, 2024 at 3:33 PM Paolo Abeni <pabeni@redhat.com> wrote:
>>> On 7/29/24 23:04, Willem de Bruijn wrote:
>>>>> @@ -423,6 +441,9 @@ \subsection{Device Operation}\label{sec:Device Types / Network Device / Device O
>>>>>            le32 hash_value;        (Only if VIRTIO_NET_F_HASH_REPORT negotiated)
>>>>>            le16 hash_report;       (Only if VIRTIO_NET_F_HASH_REPORT negotiated)
>>>>>            le16 padding_reserved;  (Only if VIRTIO_NET_F_HASH_REPORT negotiated)
>>>>> +        le16 outer_th_offset    (Only if VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO negotiated)
>>>>> +        le16 inner_protocol_type;(Only if VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO negotiated)
>>>>> +        le16 inner_nh_offset;   (Only if VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO negotiated)
>>>>>    };
>>>>
>>>> In relation to the above: I recall that expanding the virtio_net_hdr
>>>> was non-trivial when I last tried to add fields ("[rfc,2/3]
>>>> virtio-net: support receive timestamp"). If that is the experience
>>>> again, it might be worthwhile to expand with enough space to also
>>>> accommodate a few subsequent extensions. Headroom is not that
>>>> expensive.
>>
>> Yes, but it looks like an independent issue for this.
> 
> If this change requires new headroom, then it depends on headroom expansion.
> 
> My suggestion is to then expand it significantly immediately, to minimize the
> number of times we have to change the virtio_net code to support new lengths.
> 
> We already have structs virtio_net_hdr, virtio_net_hdr_mrg_rxbuf,
> virtio_net_hdr_v1, virtio_net_hdr_v1_hash.
> 
> Adding a new struct with every new field added just adds lots of churn in
> the code.
> 
> Just a thought, feel free to skip if the maintainers disagree.

I just read the old thread WRT virtio net timestamping.

If we reserve additional headroom space for timestamping here, such room 
will be available only if UDP_TUNNEL is negotiated, which sounds a bit 
counterintuitive to me.

I *think* ideally we want a separate virtio net feature for that?

Regarding the virtio_net_hdr parsing complexity increase, I think we 
should add to the uAPI a struct containing only the new fields, e.g. for 
UDP tunnel:

struct virtio_net_udp_tunnel {
	__virtio16 outer_th_offset;
	// eventually __virtio16 inner_protocol_type;
	__virtio16 inner_nh_offset;
};

At feature negotiation time both the drivers and the device compute the 
offset inside the virtio buffer where such struct is placed:

if VIRTIO_NET_F_HASH_REPORT is negotiated such offset is sizeof(struct 
virtio_net_hdr_v1_hash) elsewhere is sizeof(struct virtio_net_hdr_v1)

At rx time, struct virtio_net_udp_tunnel() is parsed from the above offset.

With all the above in place, for every additional later feature we add 
just a struct definition and the virtio_net header parsing complexity 
scales linearly.

>>> If we would have VIRTIO_NET_HDR_GSO_UDPv4_L4 and
>>> VIRTIO_NET_HDR_GSO_UDPv6_L4 variants, we could avoid entirely
>>> 'inner_protocol_type', as the network protocol will be implied by the
>>> GSO type itself.
> 
> But that's too late now? Probably don't want to make this work
> depending on such a change.
> 
> I don't entirely recall why we did not do that immediately. Perhaps
> it is related to also only having SKB_GSO_UDP_L4.

I reached the same conclusion here:

https://lore.kernel.org/virtio-comment/cover.1722252302.git.pabeni@redhat.com/T/#mac694da6e907451d5529116a97510c940bf2edd9

What about reserving an additional bit in gso_type to specify the inner 
network header protocol?

Thanks,

Paolo


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v6 1/2] virtio-net: define UDP tunnel segmentation offload feature
  2024-07-31 17:32           ` Paolo Abeni
@ 2024-07-31 18:21             ` Willem de Bruijn
  2024-08-01  3:19               ` Jason Wang
  2024-08-01 15:34               ` Paolo Abeni
  0 siblings, 2 replies; 27+ messages in thread
From: Willem de Bruijn @ 2024-07-31 18:21 UTC (permalink / raw)
  To: Paolo Abeni
  Cc: Jason Wang, virtio-comment, maxime.coquelin, Eelco Chaudron,
	Stefano Garzarella

On Wed, Jul 31, 2024 at 1:32 PM Paolo Abeni <pabeni@redhat.com> wrote:
>
> On 7/31/24 16:25, Willem de Bruijn wrote:
> > On Wed, Jul 31, 2024 at 12:11 AM Jason Wang <jasowang@redhat.com> wrote:
> >> On Tue, Jul 30, 2024 at 3:33 PM Paolo Abeni <pabeni@redhat.com> wrote:
> >>> On 7/29/24 23:04, Willem de Bruijn wrote:
> >>>>> @@ -423,6 +441,9 @@ \subsection{Device Operation}\label{sec:Device Types / Network Device / Device O
> >>>>>            le32 hash_value;        (Only if VIRTIO_NET_F_HASH_REPORT negotiated)
> >>>>>            le16 hash_report;       (Only if VIRTIO_NET_F_HASH_REPORT negotiated)
> >>>>>            le16 padding_reserved;  (Only if VIRTIO_NET_F_HASH_REPORT negotiated)
> >>>>> +        le16 outer_th_offset    (Only if VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO negotiated)
> >>>>> +        le16 inner_protocol_type;(Only if VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO negotiated)
> >>>>> +        le16 inner_nh_offset;   (Only if VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO negotiated)
> >>>>>    };
> >>>>
> >>>> In relation to the above: I recall that expanding the virtio_net_hdr
> >>>> was non-trivial when I last tried to add fields ("[rfc,2/3]
> >>>> virtio-net: support receive timestamp"). If that is the experience
> >>>> again, it might be worthwhile to expand with enough space to also
> >>>> accommodate a few subsequent extensions. Headroom is not that
> >>>> expensive.
> >>
> >> Yes, but it looks like an independent issue for this.
> >
> > If this change requires new headroom, then it depends on headroom expansion.
> >
> > My suggestion is to then expand it significantly immediately, to minimize the
> > number of times we have to change the virtio_net code to support new lengths.
> >
> > We already have structs virtio_net_hdr, virtio_net_hdr_mrg_rxbuf,
> > virtio_net_hdr_v1, virtio_net_hdr_v1_hash.
> >
> > Adding a new struct with every new field added just adds lots of churn in
> > the code.
> >
> > Just a thought, feel free to skip if the maintainers disagree.
>
> I just read the old thread WRT virtio net timestamping.
>
> If we reserve additional headroom space for timestamping here, such room
> will be available only if UDP_TUNNEL is negotiated, which sounds a bit
> counterintuitive to me.
>
> I *think* ideally we want a separate virtio net feature for that?

Fair point. Agreed that extra headroom should not be dependent on this feature.

> Regarding the virtio_net_hdr parsing complexity increase, I think we
> should add to the uAPI a struct containing only the new fields, e.g. for
> UDP tunnel:
>
> struct virtio_net_udp_tunnel {
>         __virtio16 outer_th_offset;
>         // eventually __virtio16 inner_protocol_type;
>         __virtio16 inner_nh_offset;
> };
>
> At feature negotiation time both the drivers and the device compute the
> offset inside the virtio buffer where such struct is placed:
>
> if VIRTIO_NET_F_HASH_REPORT is negotiated such offset is sizeof(struct
> virtio_net_hdr_v1_hash) elsewhere is sizeof(struct virtio_net_hdr_v1)
>
> At rx time, struct virtio_net_udp_tunnel() is parsed from the above offset.
>
> With all the above in place, for every additional later feature we add
> just a struct definition and the virtio_net header parsing complexity
> scales linearly.

Sounds good to me.

> >>> If we would have VIRTIO_NET_HDR_GSO_UDPv4_L4 and
> >>> VIRTIO_NET_HDR_GSO_UDPv6_L4 variants, we could avoid entirely
> >>> 'inner_protocol_type', as the network protocol will be implied by the
> >>> GSO type itself.
> >
> > But that's too late now? Probably don't want to make this work
> > depending on such a change.
> >
> > I don't entirely recall why we did not do that immediately. Perhaps
> > it is related to also only having SKB_GSO_UDP_L4.
>
> I reached the same conclusion here:
>
> https://lore.kernel.org/virtio-comment/cover.1722252302.git.pabeni@redhat.com/T/#mac694da6e907451d5529116a97510c940bf2edd9
>
> What about reserving an additional bit in gso_type to specify the inner
> network header protocol?

Similar to VIRTIO_NET_HDR_GSO_ECN?

Is that preferable over VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IP4 and
.._UDP_TUNNEL_IP6 variants?

And to parsing just this one byte (or nibble) from packet data using
inner_nh_offset?

If it's the best of those options, no objections from me. Definitely
no need for a u16 if the only options are IPPROTO_IP and IPPROTO_IPV6.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v6 1/2] virtio-net: define UDP tunnel segmentation offload feature
  2024-07-31 14:25         ` Willem de Bruijn
  2024-07-31 17:32           ` Paolo Abeni
@ 2024-08-01  2:24           ` Jason Wang
  1 sibling, 0 replies; 27+ messages in thread
From: Jason Wang @ 2024-08-01  2:24 UTC (permalink / raw)
  To: Willem de Bruijn
  Cc: Paolo Abeni, virtio-comment, maxime.coquelin, Eelco Chaudron,
	Stefano Garzarella

On Wed, Jul 31, 2024 at 10:26 PM Willem de Bruijn <willemb@google.com> wrote:
>
> On Wed, Jul 31, 2024 at 12:11 AM Jason Wang <jasowang@redhat.com> wrote:
> >
> > On Tue, Jul 30, 2024 at 3:33 PM Paolo Abeni <pabeni@redhat.com> wrote:
> > >
> > > On 7/29/24 23:04, Willem de Bruijn wrote:
> > > > You have a working software implementation, right? Will this be sent
> > > > (as RFC) to the public netdev or so mailing list before this becomes a
> > > > standard? I assume that there is a standard process for virtio that
> > > > you are following. I'd just be more confident in a thumbs up if we
> > > > also review the code that implements the text.
> > >
> > > My plan is to write the S/W virtio_net implementation, too, but it's
> > > currently not ready. I had a working PoC for the first draft of this
> > > changeset, long time ago. Things are changed enough that I'll need a
> > > rewrite. I can try to shared such a thing on netdev ASAP (where "soon"
> > > here is a very relative terms ;)
> > >
> > > Also I don't know the standardization process. I hope for some guidance
> > > from Jason and/or Michael.
> >
> > It should be something like this. When there's no future comment, open
> > an issue https://github.com/oasis-tcs/virtio-spec/issues and ask for a
> > vote. When it passes the voting, it would be merged.
>
> So changes to the spec can be merged, because we know that they can be
> implemented reasonably?

This seems to be how it works now.

>
> IETF always requires (two, because interoperable protocol definitions)
> working implementations.
>
> It sounds a bit risky to me to skip this, as you come across the interesting
> non-obvious details during implementation.
>
> Since implementation is planned anyway, doing at least a rough proof of
> concept first does not introduce extra work, but reduces risk that the spec
> contains errors.

Exactly and we have already had qemu which fits for that purpose. A
lot of hardware was prototyped in that way (for example, rocker fbnic
etc).

>
> Unless it is easy to update the spec before it is numbered and finalized.
> Then issues can be fixed up.
>
> Anyway, you have gone through this many times, so whatever the process
> is, I trust that it works. Just sharing my thought as a somewhat uniformed
> reader.

I fully agree with you. In the long debate of the live migration
series, I point out we need a prototype for features that requires
huge changes. But I don't get sufficient feedback for that.

>
> >
> > >
> > > >
> > > >> @@ -423,6 +441,9 @@ \subsection{Device Operation}\label{sec:Device Types / Network Device / Device O
> > > >>           le32 hash_value;        (Only if VIRTIO_NET_F_HASH_REPORT negotiated)
> > > >>           le16 hash_report;       (Only if VIRTIO_NET_F_HASH_REPORT negotiated)
> > > >>           le16 padding_reserved;  (Only if VIRTIO_NET_F_HASH_REPORT negotiated)
> > > >> +        le16 outer_th_offset    (Only if VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO negotiated)
> > > >> +        le16 inner_protocol_type;(Only if VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO negotiated)
> > > >> +        le16 inner_nh_offset;   (Only if VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO negotiated)
> > > >>   };
> > > >
> > > > In relation to the above: I recall that expanding the virtio_net_hdr
> > > > was non-trivial when I last tried to add fields ("[rfc,2/3]
> > > > virtio-net: support receive timestamp"). If that is the experience
> > > > again, it might be worthwhile to expand with enough space to also
> > > > accommodate a few subsequent extensions. Headroom is not that
> > > > expensive.
> > >
> >
> > Yes, but it looks like an independent issue for this.
>
> If this change requires new headroom, then it depends on headroom expansion.
>
> My suggestion is to then expand it significantly immediately, to minimize the
> number of times we have to change the virtio_net code to support new lengths.
>
> We already have structs virtio_net_hdr, virtio_net_hdr_mrg_rxbuf,
> virtio_net_hdr_v1, virtio_net_hdr_v1_hash.
>
> Adding a new struct with every new field added just adds lots of churn in
> the code.
>
> Just a thought, feel free to skip if the maintainers disagree.
>

I think I agree with that, but I meant it's better to not make UDP
tunnel series depending on that.

Thanks


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v6 1/2] virtio-net: define UDP tunnel segmentation offload feature
  2024-07-31  8:43     ` Paolo Abeni
  2024-07-31  9:16       ` Paolo Abeni
@ 2024-08-01  3:01       ` Jason Wang
  1 sibling, 0 replies; 27+ messages in thread
From: Jason Wang @ 2024-08-01  3:01 UTC (permalink / raw)
  To: Paolo Abeni
  Cc: virtio-comment, maxime.coquelin, Eelco Chaudron,
	Stefano Garzarella, Willem de Bruijn

On Wed, Jul 31, 2024 at 4:43 PM Paolo Abeni <pabeni@redhat.com> wrote:
>
> On 7/31/24 03:57, Jason Wang wrote:
> >> @@ -133,12 +139,16 @@ \subsubsection{Feature bit requirements}\label{sec:Device Types / Network Device
> >>   \item[VIRTIO_NET_F_GUEST_UFO] Requires VIRTIO_NET_F_GUEST_CSUM.
> >>   \item[VIRTIO_NET_F_GUEST_USO4] Requires VIRTIO_NET_F_GUEST_CSUM.
> >>   \item[VIRTIO_NET_F_GUEST_USO6] Requires VIRTIO_NET_F_GUEST_CSUM.
> >> +\item[VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO] Requires both VIRTIO_NET_F_GUEST_TSO4 and VIRTIO_NET_F_GUEST_TSO6, or
> >> +   both VIRTIO_NET_F_GUEST_USO4 and VIRTIO_NET_F_GUEST_USO6.
> >
> > I think VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO requires all of the four?
>
> Functionally speaking is not strictly necessary: the UDP tunnel feature
> needs just a single inner GSO type to assemble and process an UDP
> tunneled packet.
>
> On the flip side I think is reasonable that modern implementations
> support all the mentioned inner GSO type and requiring all of them would
> possibly make the implementation simpler and/or less bug prone.

Yes.

>
> So I guess I can change the requirement as you say above.

That would be great.

>
> >>   \item[VIRTIO_NET_F_HOST_TSO4] Requires VIRTIO_NET_F_CSUM.
> >>   \item[VIRTIO_NET_F_HOST_TSO6] Requires VIRTIO_NET_F_CSUM.
> >>   \item[VIRTIO_NET_F_HOST_ECN] Requires VIRTIO_NET_F_HOST_TSO4 or VIRTIO_NET_F_HOST_TSO6.
> >>   \item[VIRTIO_NET_F_HOST_UFO] Requires VIRTIO_NET_F_CSUM.
> >>   \item[VIRTIO_NET_F_HOST_USO] Requires VIRTIO_NET_F_CSUM.
> >> +\item[VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO] Requires both VIRTIO_NET_F_HOST_TSO4 and VIRTIO_NET_F_HOST_TSO6, or
> >> +   both VIRTIO_NET_F_GUEST_USO4 or VIRTIO_NET_F_GUEST_USO6.
> >>
> >>   \item[VIRTIO_NET_F_CTRL_RX] Requires VIRTIO_NET_F_CTRL_VQ.
> >>   \item[VIRTIO_NET_F_CTRL_VLAN] Requires VIRTIO_NET_F_CTRL_VQ.
> >> @@ -375,6 +385,12 @@ \subsection{Device Initialization}\label{sec:Device Types / Network Device / Dev
> >>     TCP), VIRTIO_NET_F_HOST_TSO6 (IPv6 TCP), VIRTIO_NET_F_HOST_UFO
> >>     (UDP fragmentation) and VIRTIO_NET_F_HOST_USO (UDP segmentation) features.
> >>
> >> +\item If any TCP or UDP segmentation feature is negotiated,
> >
> > Not a native speaker but this seems to conflict with the above. For
> > example, does "only VIRTIO_NET_F_GUEST_TSO4 is negotiated" mean any
> > TCP segmentation feature?
>
> As per the previois comment, I'll change this in the next revision to:
>
> """
> If the VIRTIO_NET_F_HOST_TSO6, VIRTIO_NET_F_HOST_TSO4 and
> VIRTIO_NET_HDR_GSO_UDP_L4 features are negotiated
> """

Ok.

>> @@ -516,6 +539,26 @@ \subsubsection{Packet
> Transmission}\label{sec:Device Types / Network Device / De
> >>   specifically in the protocol.}.
> >>      \end{itemize}
> >>
> >> +\item If the driver negotiated the VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO feature,
> >> +  the VIRTIO_NET_HDR_GSO_UDP_TUNNEL bit in \field{gso_type} indicates that
> >> +  the GSO protocol is encapsulated in a UDP tunnel.
> >> +  The outer UDP header must not require checksumming, e.g. the outer UDP checksum
> >> +  must be zero.
> >> +  The other tunnel-related fields indicate how to replicate the packet
> >> +  headers to cut it into smaller packets:
> >> +
> >> +  \begin{itemize}
> >> +  \item \field{outer_th_offset} field indicates the outer transport header within
> >> +      the packet. Note that such field differs from \field{csum_start} as the latter
> >> +      points to the inner transport header within the packet
> >
> > I wonder if it's better to stick to outer_csum_start, and say it's
> > offset within the packet to begin checksumming for outer packet.  This
> > follows the existing name and could be reused for encapsulation other
> > than UDP tunnel.
>
> I thought we agreed on this name in the previous iteration?!?

Sorry if I do that, the switch loses some context :(

> It feels
> strange to introduce a csum related field in a change NOT introducing a
> checksum related feature. And we need this filed here, for plain
> segmentation.

Ok, fine.

>
> >> @@ -557,6 +600,18 @@ \subsubsection{Packet Transmission}\label{sec:Device Types / Network Device / De
> >>   driver MUST set the VIRTIO_NET_HDR_GSO_ECN bit in
> >>   \field{gso_type}.
> >>
> >> +The driver MUST NOT send to the device TCP or UDP GSO packets over UDP tunnel
> >> +requiring segmentation offload, unless the VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO is
> >> +negotiated, in which case the driver MUST set the VIRTIO_NET_HDR_GSO_UDP_TUNNEL
> >> +bit in \field{gso_type}.
> >
> > Is this better to follow the existing style of the driver normative?
> > For example, we had
> >
> > "If VIRTIO_NET_F_HOST_TSO6 is negotiated, the driver MAY set gso_type
> > to VIRTIO_NET_HDR_GSO_TCPV6 to request TCPv6 segmentation, otherwise
> > the driver MUST NOT set gso_type to VIRTIO_NET_HDR_GSO_TCPV6."
> >
> > We could do the same
> >
> > "If VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO is negotiated, the driver MAY ...
> > otherwise the driver MUST NOT ..."
>
> ok, I'll do that in the next revision
>
> >> @@ -598,6 +653,28 @@ \subsubsection{Packet Transmission}\label{sec:Device Types / Network Device / De
> >>          header.
> >>   \end{itemize}
> >>
> >> +If the VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO option has been negotiated, the
> >> +driver MAY set the VIRTIO_NET_HDR_GSO_UDP_TUNNEL bit in \field{gso_type}, if so:
> >> +\begin{itemize}
> >> +\item the driver MUST set \field{outer_th_offset} to the outer UDP header
> >> +  offset, \field{inner_protocol_type} to the ethernet type of the inner
> >> +  protocol, and \field{inner_nh_offset} to the inner network header offset.
> >> +\end{itemize}
> >
> > Do we need to say then csum_start and csum_offset is for the inner packet here?
>
> I thought such statement was already everywhere, but no objection to add
> it here, too.

I think we need some tweaks. E.g with this patch applied, in "Packet
Transmission" subsection, we still have something which is not clear:

"""
  \item \field{csum_start} is set to the offset within the packet to
begin checksumming,
    and

  \item \field{csum_offset} indicates how many bytes after the csum_start the
    new (16 bit ones' complement) checksum is placed by the device.
"""

I wonder if it's more clear to just describe the outer and inner metadata there?

And in the end of this description

"""
\item If the driver negotiated the VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO feature,
  the VIRTIO_NET_HDR_GSO_UDP_TUNNEL bit in \field{gso_type} indicates that
  the GSO protocol is encapsulated in a UDP tunnel.
  The outer UDP header must not require checksumming, e.g. the outer
UDP checksum
  must be zero.
  The other tunnel-related fields indicate how to replicate the packet
  headers to cut it into smaller packets:

  \begin{itemize}
  \item \field{outer_th_offset} field indicates the outer transport
header within
      the packet. Note that such field differs from \field{csum_start}
as the latter
      points to the inner transport header within the packet.

  \item \field{inner_protocol_type} field indicates the ethernet type
of the inner
      protocol.

  \item \field{inner_nh_offset} field indicates the inner network header within
      the packet.
  \end{itemize}
"""

And it would be even nicer if an example could be give there, as we
have a non tunnel example for setting metatdata:

"""
For example, consider a partially checksummed TCP (IPv4) packet.
It will have a 14 byte ethernet header and 20 byte IP header
followed by the TCP header (with the TCP checksum field 16 bytes
into that header). \field{csum_start} will be 14+20 = 34 (the TCP
checksum includes the header), and \field{csum_offset} will be 16.
"""

>
> Somewhat related, I have a doubt WRT the csum_start/csum_offset fields
> "presence" with plain GSO packets. AFAICS, the specs require the csum
> offload feature as a pre-req for the plain segmentation feature, but
> also allow the reception of a GSO packet without the
> VIRTIO_NET_HDR_F_NEEDS_CSUM bit set, see the recent linux patch from Willem:
>
> https://lore.kernel.org/netdev/20240729201108.1615114-1-willemdebruijn.kernel@gmail.com/
>
> I think it will make the thing simpler if we always require such bit to
> be set for GSO packets.

Did you mean forbid VIRTIO_NET_HDR_F_DATA_VALID? I think it means we
will prevent vendors from implementing GRO_HW for UDP tunnels.

> I guess we can't change the current status for
> the existing GSO types, but for UDP tunnel we could add that requirement
> from the start, both for the driver -> device direction and for the
> device -> driver direction.
>
> Even when the device sets the VIRTIO_NET_HDR_F_DATA_VALID, so can always
> expect a valid csum_start/csum_offset pair, and make things hopefully
> more robust.

Ok, so you meant you want to mandate csum_start/csum_offset even in
the case of VIRTIO_NET_HDR_F_DATA_VALID.

> When VIRTIO_NET_HDR_F_DATA_VALID is set, the pair will NOT
> be used to validated the csum, just to point to the inner transport
> header and avoid parsing. WDYT?

It should be fine, but I think it's better to be an independent
feature (e.g a new feature flag as it is not specific to tunnels).

If we add too many dependencies for UDP tunnel, it would take too long
to be converged.

Thanks

>
> Thanks,
>
> Paolo
>


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v6 1/2] virtio-net: define UDP tunnel segmentation offload feature
  2024-07-31  9:02         ` Paolo Abeni
  2024-07-31 11:17           ` Paolo Abeni
@ 2024-08-01  3:14           ` Jason Wang
  1 sibling, 0 replies; 27+ messages in thread
From: Jason Wang @ 2024-08-01  3:14 UTC (permalink / raw)
  To: Paolo Abeni
  Cc: Willem de Bruijn, virtio-comment, maxime.coquelin, Eelco Chaudron,
	Stefano Garzarella

On Wed, Jul 31, 2024 at 5:02 PM Paolo Abeni <pabeni@redhat.com> wrote:
>
> On 7/31/24 06:10, Jason Wang wrote:
> > On Tue, Jul 30, 2024 at 3:33 PM Paolo Abeni <pabeni@redhat.com> wrote:
> >> On 7/29/24 23:04, Willem de Bruijn wrote:
> >>>> @@ -423,6 +441,9 @@ \subsection{Device Operation}\label{sec:Device Types / Network Device / Device O
> >>>>            le32 hash_value;        (Only if VIRTIO_NET_F_HASH_REPORT negotiated)
> >>>>            le16 hash_report;       (Only if VIRTIO_NET_F_HASH_REPORT negotiated)
> >>>>            le16 padding_reserved;  (Only if VIRTIO_NET_F_HASH_REPORT negotiated)
> >>>> +        le16 outer_th_offset    (Only if VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO negotiated)
> >>>> +        le16 inner_protocol_type;(Only if VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO negotiated)
> >>>> +        le16 inner_nh_offset;   (Only if VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO negotiated)
> >>>>    };
>
> [...]
>
> >> If we would have VIRTIO_NET_HDR_GSO_UDPv4_L4 and
> >> VIRTIO_NET_HDR_GSO_UDPv6_L4 variants, we could avoid entirely
> >> 'inner_protocol_type', as the network protocol will be implied by the
> >> GSO type itself.
> >
> > I may miss something, but can it be done by just introducing a new
> > gso_type without vnet header extension?
>
> Yep, that is what I mean above: we could replace the
> 'inner_protocol_type' field - which carries a single bit information,
> inner ipv4 vs ipv6 - with some (inner) GSO-type related info.
>
> Unfortunately that last step does not look 110% 'clean' to me, as the
> inner GSO type is already specified by VIRTIO_NET_HDR_GSO_TCPV4,
> VIRTIO_NET_HDR_GSO_TCPV6 or
> VIRTIO_NET_HDR_GSO_UDP_L4. In case of inner TCP, the inner network
> protocol is already specified, but it's not in case of UDP.
>
> I'm wondering: why the single VIRTIO_NET_HDR_GSO_UDP_L4 value has been
> preferred to a pair of ipv4/ipv6 variants? Could we add such variants
> now? Note that I would like such option very much, as it will make this
> change even more complex, but looks IMHO quite clean.

Actually, I point out that when it is proposed. And I was told that it
is because Linux doesn't differ that.

>
> Otherwise another option would be to allocate another bit in the GSO
> type field, set such field when VIRTIO_NET_HDR_GSO_UDP_TUNNEL is set,
> too, and use such field specify the inner header network protocol. I
> don't like much even this option because the new bit will carry
> duplicate information in case of TCP, and it's implementation looks bug
> prone.
>
> I don't see other options, any suggestions/preferences welcome!
>
> Thanks,
>
> Paolo

Thanks


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v6 1/2] virtio-net: define UDP tunnel segmentation offload feature
  2024-07-31 18:21             ` Willem de Bruijn
@ 2024-08-01  3:19               ` Jason Wang
  2024-08-01 15:34               ` Paolo Abeni
  1 sibling, 0 replies; 27+ messages in thread
From: Jason Wang @ 2024-08-01  3:19 UTC (permalink / raw)
  To: Willem de Bruijn
  Cc: Paolo Abeni, virtio-comment, maxime.coquelin, Eelco Chaudron,
	Stefano Garzarella

On Thu, Aug 1, 2024 at 2:22 AM Willem de Bruijn <willemb@google.com> wrote:
>
> On Wed, Jul 31, 2024 at 1:32 PM Paolo Abeni <pabeni@redhat.com> wrote:
> >
> > On 7/31/24 16:25, Willem de Bruijn wrote:
> > > On Wed, Jul 31, 2024 at 12:11 AM Jason Wang <jasowang@redhat.com> wrote:
> > >> On Tue, Jul 30, 2024 at 3:33 PM Paolo Abeni <pabeni@redhat.com> wrote:
> > >>> On 7/29/24 23:04, Willem de Bruijn wrote:
> > >>>>> @@ -423,6 +441,9 @@ \subsection{Device Operation}\label{sec:Device Types / Network Device / Device O
> > >>>>>            le32 hash_value;        (Only if VIRTIO_NET_F_HASH_REPORT negotiated)
> > >>>>>            le16 hash_report;       (Only if VIRTIO_NET_F_HASH_REPORT negotiated)
> > >>>>>            le16 padding_reserved;  (Only if VIRTIO_NET_F_HASH_REPORT negotiated)
> > >>>>> +        le16 outer_th_offset    (Only if VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO negotiated)
> > >>>>> +        le16 inner_protocol_type;(Only if VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO negotiated)
> > >>>>> +        le16 inner_nh_offset;   (Only if VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO negotiated)
> > >>>>>    };
> > >>>>
> > >>>> In relation to the above: I recall that expanding the virtio_net_hdr
> > >>>> was non-trivial when I last tried to add fields ("[rfc,2/3]
> > >>>> virtio-net: support receive timestamp"). If that is the experience
> > >>>> again, it might be worthwhile to expand with enough space to also
> > >>>> accommodate a few subsequent extensions. Headroom is not that
> > >>>> expensive.
> > >>
> > >> Yes, but it looks like an independent issue for this.
> > >
> > > If this change requires new headroom, then it depends on headroom expansion.
> > >
> > > My suggestion is to then expand it significantly immediately, to minimize the
> > > number of times we have to change the virtio_net code to support new lengths.
> > >
> > > We already have structs virtio_net_hdr, virtio_net_hdr_mrg_rxbuf,
> > > virtio_net_hdr_v1, virtio_net_hdr_v1_hash.
> > >
> > > Adding a new struct with every new field added just adds lots of churn in
> > > the code.
> > >
> > > Just a thought, feel free to skip if the maintainers disagree.
> >
> > I just read the old thread WRT virtio net timestamping.
> >
> > If we reserve additional headroom space for timestamping here, such room
> > will be available only if UDP_TUNNEL is negotiated, which sounds a bit
> > counterintuitive to me.
> >
> > I *think* ideally we want a separate virtio net feature for that?
>
> Fair point. Agreed that extra headroom should not be dependent on this feature.
>
> > Regarding the virtio_net_hdr parsing complexity increase, I think we
> > should add to the uAPI a struct containing only the new fields, e.g. for
> > UDP tunnel:
> >
> > struct virtio_net_udp_tunnel {
> >         __virtio16 outer_th_offset;
> >         // eventually __virtio16 inner_protocol_type;
> >         __virtio16 inner_nh_offset;
> > };
> >
> > At feature negotiation time both the drivers and the device compute the
> > offset inside the virtio buffer where such struct is placed:
> >
> > if VIRTIO_NET_F_HASH_REPORT is negotiated such offset is sizeof(struct
> > virtio_net_hdr_v1_hash) elsewhere is sizeof(struct virtio_net_hdr_v1)
> >
> > At rx time, struct virtio_net_udp_tunnel() is parsed from the above offset.
> >
> > With all the above in place, for every additional later feature we add
> > just a struct definition and the virtio_net header parsing complexity
> > scales linearly.
>
> Sounds good to me.
>
> > >>> If we would have VIRTIO_NET_HDR_GSO_UDPv4_L4 and
> > >>> VIRTIO_NET_HDR_GSO_UDPv6_L4 variants, we could avoid entirely
> > >>> 'inner_protocol_type', as the network protocol will be implied by the
> > >>> GSO type itself.
> > >
> > > But that's too late now? Probably don't want to make this work
> > > depending on such a change.
> > >
> > > I don't entirely recall why we did not do that immediately. Perhaps
> > > it is related to also only having SKB_GSO_UDP_L4.
> >
> > I reached the same conclusion here:
> >
> > https://lore.kernel.org/virtio-comment/cover.1722252302.git.pabeni@redhat.com/T/#mac694da6e907451d5529116a97510c940bf2edd9
> >
> > What about reserving an additional bit in gso_type to specify the inner
> > network header protocol?
>
> Similar to VIRTIO_NET_HDR_GSO_ECN?

I think so.

>
> Is that preferable over VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IP4 and
> .._UDP_TUNNEL_IP6 variants?
>
> And to parsing just this one byte (or nibble) from packet data using
> inner_nh_offset?
>
> If it's the best of those options, no objections from me. Definitely
> no need for a u16 if the only options are IPPROTO_IP and IPPROTO_IPV6.
>

Exactly.

Thanks


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v6 1/2] virtio-net: define UDP tunnel segmentation offload feature
  2024-07-31 18:21             ` Willem de Bruijn
  2024-08-01  3:19               ` Jason Wang
@ 2024-08-01 15:34               ` Paolo Abeni
  2024-08-01 16:04                 ` Willem de Bruijn
  1 sibling, 1 reply; 27+ messages in thread
From: Paolo Abeni @ 2024-08-01 15:34 UTC (permalink / raw)
  To: Willem de Bruijn
  Cc: Jason Wang, virtio-comment, maxime.coquelin, Eelco Chaudron,
	Stefano Garzarella

On 7/31/24 20:21, Willem de Bruijn wrote:
> On Wed, Jul 31, 2024 at 1:32 PM Paolo Abeni <pabeni@redhat.com> wrote:
>> What about reserving an additional bit in gso_type to specify the inner
>> network header protocol?
> 
> Similar to VIRTIO_NET_HDR_GSO_ECN?
> 
> Is that preferable over VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IP4 and
> .._UDP_TUNNEL_IP6 variants?

Should be basically the same, as VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IP4 
VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IP6 will be 2 separate bits in gso_type, 
right?

> And to parsing just this one byte (or nibble) from packet data using
> inner_nh_offset?

I think one of the state goal is to avoid parsing. More importantly, 
accessing the inner network header content this early will cause 
possibly avoidable (or at least mitigable via prefetch) cache misses.

> If it's the best of those options, no objections from me. Definitely
> no need for a u16 if the only options are IPPROTO_IP and IPPROTO_IPV6.

Wrapping all the above I'll go for the additional bit for the inner 
network header type. If I misread something, please LMK:)

Thanks,

Paolo


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v6 1/2] virtio-net: define UDP tunnel segmentation offload feature
  2024-08-01 15:34               ` Paolo Abeni
@ 2024-08-01 16:04                 ` Willem de Bruijn
  2024-08-01 16:26                   ` Paolo Abeni
  0 siblings, 1 reply; 27+ messages in thread
From: Willem de Bruijn @ 2024-08-01 16:04 UTC (permalink / raw)
  To: Paolo Abeni
  Cc: Jason Wang, virtio-comment, maxime.coquelin, Eelco Chaudron,
	Stefano Garzarella

On Thu, Aug 1, 2024 at 11:34 AM Paolo Abeni <pabeni@redhat.com> wrote:
>
> On 7/31/24 20:21, Willem de Bruijn wrote:
> > On Wed, Jul 31, 2024 at 1:32 PM Paolo Abeni <pabeni@redhat.com> wrote:
> >> What about reserving an additional bit in gso_type to specify the inner
> >> network header protocol?
> >
> > Similar to VIRTIO_NET_HDR_GSO_ECN?
> >
> > Is that preferable over VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IP4 and
> > .._UDP_TUNNEL_IP6 variants?
>
> Should be basically the same, as VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IP4
> VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IP6 will be 2 separate bits in gso_type,
> right?

One difference is that we have to update other code to use a bitmask.

I'd prefer the two types for that reason.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v6 1/2] virtio-net: define UDP tunnel segmentation offload feature
  2024-08-01 16:04                 ` Willem de Bruijn
@ 2024-08-01 16:26                   ` Paolo Abeni
  2024-08-01 18:04                     ` Willem de Bruijn
  2024-08-02  3:30                     ` Jason Wang
  0 siblings, 2 replies; 27+ messages in thread
From: Paolo Abeni @ 2024-08-01 16:26 UTC (permalink / raw)
  To: Willem de Bruijn
  Cc: Jason Wang, virtio-comment, maxime.coquelin, Eelco Chaudron,
	Stefano Garzarella

On 8/1/24 18:04, Willem de Bruijn wrote:
> On Thu, Aug 1, 2024 at 11:34 AM Paolo Abeni <pabeni@redhat.com> wrote:
>>
>> On 7/31/24 20:21, Willem de Bruijn wrote:
>>> On Wed, Jul 31, 2024 at 1:32 PM Paolo Abeni <pabeni@redhat.com> wrote:
>>>> What about reserving an additional bit in gso_type to specify the inner
>>>> network header protocol?
>>>
>>> Similar to VIRTIO_NET_HDR_GSO_ECN?
>>>
>>> Is that preferable over VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IP4 and
>>> .._UDP_TUNNEL_IP6 variants?
>>
>> Should be basically the same, as VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IP4
>> VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IP6 will be 2 separate bits in gso_type,
>> right?
> 
> One difference is that we have to update other code to use a bitmask.
> 
> I'd prefer the two types for that reason.

Please pardon me for the high-frequency spamming. I'd love to have this 
topic well defined.

I'm unsure how to read your suggestion exactly.

VIRTIO_NET_HDR_GSO_UDP_TUNNEL needs to be "combined" (set together) with 
VIRTIO_NET_HDR_GSO_TCPV4, VIRTIO_NET_HDR_GSO_TCPV6 or 
VIRTIO_NET_HDR_GSO_UDP_L4, so it's currently defined as a bit flag:

#define VIRTIO_NET_HDR_GSO_UDP_TUNNEL 0x40

 From an implementation perspective this allow us reusing the existing 
code for to fill inner headers related field, masking such bit in the 
gso_type before invoking the existing virtio_hdr parsing helper - it 
will be used unmodified.

Due to all the above, I thought the IPv4 and IPv6 variant would be 
defined as:

#define VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IP4 0x20
#define VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IP6 0x40

How would you define VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IP4 and 
VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IP6?

Thanks,

Paolo


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v6 1/2] virtio-net: define UDP tunnel segmentation offload feature
  2024-08-01 16:26                   ` Paolo Abeni
@ 2024-08-01 18:04                     ` Willem de Bruijn
  2024-08-02  3:30                     ` Jason Wang
  1 sibling, 0 replies; 27+ messages in thread
From: Willem de Bruijn @ 2024-08-01 18:04 UTC (permalink / raw)
  To: Paolo Abeni
  Cc: Jason Wang, virtio-comment, maxime.coquelin, Eelco Chaudron,
	Stefano Garzarella

On Thu, Aug 1, 2024 at 12:26 PM Paolo Abeni <pabeni@redhat.com> wrote:
>
> On 8/1/24 18:04, Willem de Bruijn wrote:
> > On Thu, Aug 1, 2024 at 11:34 AM Paolo Abeni <pabeni@redhat.com> wrote:
> >>
> >> On 7/31/24 20:21, Willem de Bruijn wrote:
> >>> On Wed, Jul 31, 2024 at 1:32 PM Paolo Abeni <pabeni@redhat.com> wrote:
> >>>> What about reserving an additional bit in gso_type to specify the inner
> >>>> network header protocol?
> >>>
> >>> Similar to VIRTIO_NET_HDR_GSO_ECN?
> >>>
> >>> Is that preferable over VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IP4 and
> >>> .._UDP_TUNNEL_IP6 variants?
> >>
> >> Should be basically the same, as VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IP4
> >> VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IP6 will be 2 separate bits in gso_type,
> >> right?
> >
> > One difference is that we have to update other code to use a bitmask.
> >
> > I'd prefer the two types for that reason.
>
> Please pardon me for the high-frequency spamming. I'd love to have this
> topic well defined.
>
> I'm unsure how to read your suggestion exactly.
>
> VIRTIO_NET_HDR_GSO_UDP_TUNNEL needs to be "combined" (set together) with
> VIRTIO_NET_HDR_GSO_TCPV4, VIRTIO_NET_HDR_GSO_TCPV6 or
> VIRTIO_NET_HDR_GSO_UDP_L4, so it's currently defined as a bit flag:
>
> #define VIRTIO_NET_HDR_GSO_UDP_TUNNEL 0x40
>
>  From an implementation perspective this allow us reusing the existing
> code for to fill inner headers related field, masking such bit in the
> gso_type before invoking the existing virtio_hdr parsing helper - it
> will be used unmodified.
>
> Due to all the above, I thought the IPv4 and IPv6 variant would be
> defined as:
>
> #define VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IP4 0x20
> #define VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IP6 0x40
>
> How would you define VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IP4 and
> VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IP6?

Sorry for the confusion. I agree with your proposal.

I had forgotten that it already adds a bit that will have to be masked
out in any branch that only deals with the inner type.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v6 1/2] virtio-net: define UDP tunnel segmentation offload feature
  2024-08-01 16:26                   ` Paolo Abeni
  2024-08-01 18:04                     ` Willem de Bruijn
@ 2024-08-02  3:30                     ` Jason Wang
  1 sibling, 0 replies; 27+ messages in thread
From: Jason Wang @ 2024-08-02  3:30 UTC (permalink / raw)
  To: Paolo Abeni
  Cc: Willem de Bruijn, virtio-comment, maxime.coquelin, Eelco Chaudron,
	Stefano Garzarella

On Fri, Aug 2, 2024 at 12:26 AM Paolo Abeni <pabeni@redhat.com> wrote:
>
> On 8/1/24 18:04, Willem de Bruijn wrote:
> > On Thu, Aug 1, 2024 at 11:34 AM Paolo Abeni <pabeni@redhat.com> wrote:
> >>
> >> On 7/31/24 20:21, Willem de Bruijn wrote:
> >>> On Wed, Jul 31, 2024 at 1:32 PM Paolo Abeni <pabeni@redhat.com> wrote:
> >>>> What about reserving an additional bit in gso_type to specify the inner
> >>>> network header protocol?
> >>>
> >>> Similar to VIRTIO_NET_HDR_GSO_ECN?
> >>>
> >>> Is that preferable over VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IP4 and
> >>> .._UDP_TUNNEL_IP6 variants?
> >>
> >> Should be basically the same, as VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IP4
> >> VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IP6 will be 2 separate bits in gso_type,
> >> right?
> >
> > One difference is that we have to update other code to use a bitmask.
> >
> > I'd prefer the two types for that reason.
>
> Please pardon me for the high-frequency spamming. I'd love to have this
> topic well defined.
>
> I'm unsure how to read your suggestion exactly.
>
> VIRTIO_NET_HDR_GSO_UDP_TUNNEL needs to be "combined" (set together) with
> VIRTIO_NET_HDR_GSO_TCPV4, VIRTIO_NET_HDR_GSO_TCPV6 or
> VIRTIO_NET_HDR_GSO_UDP_L4, so it's currently defined as a bit flag:
>
> #define VIRTIO_NET_HDR_GSO_UDP_TUNNEL 0x40
>
>  From an implementation perspective this allow us reusing the existing
> code for to fill inner headers related field, masking such bit in the
> gso_type before invoking the existing virtio_hdr parsing helper - it
> will be used unmodified.
>
> Due to all the above, I thought the IPv4 and IPv6 variant would be
> defined as:
>
> #define VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IP4 0x20
> #define VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IP6 0x40

Yes, I think this makes sense.

>
> How would you define VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IP4 and
> VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IP6?
>
> Thanks,
>
> Paolo
>

Thanks


^ permalink raw reply	[flat|nested] 27+ messages in thread

end of thread, other threads:[~2024-08-02  3:30 UTC | newest]

Thread overview: 27+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-07-29 11:30 [PATCH v6 0/2] virtio-net: define UDP tunnel offload Paolo Abeni
2024-07-29 11:30 ` [PATCH v6 1/2] virtio-net: define UDP tunnel segmentation offload feature Paolo Abeni
2024-07-29 21:04   ` Willem de Bruijn
2024-07-29 21:36     ` Willem de Bruijn
2024-07-30  7:38       ` Paolo Abeni
2024-07-30  7:33     ` Paolo Abeni
2024-07-31  4:10       ` Jason Wang
2024-07-31  9:02         ` Paolo Abeni
2024-07-31 11:17           ` Paolo Abeni
2024-08-01  3:14           ` Jason Wang
2024-07-31 14:25         ` Willem de Bruijn
2024-07-31 17:32           ` Paolo Abeni
2024-07-31 18:21             ` Willem de Bruijn
2024-08-01  3:19               ` Jason Wang
2024-08-01 15:34               ` Paolo Abeni
2024-08-01 16:04                 ` Willem de Bruijn
2024-08-01 16:26                   ` Paolo Abeni
2024-08-01 18:04                     ` Willem de Bruijn
2024-08-02  3:30                     ` Jason Wang
2024-08-01  2:24           ` Jason Wang
2024-07-31  1:57   ` Jason Wang
2024-07-31  8:43     ` Paolo Abeni
2024-07-31  9:16       ` Paolo Abeni
2024-08-01  3:01       ` Jason Wang
2024-07-29 11:30 ` [PATCH v6 2/2] virtio-net: define UDP tunnel checksum " Paolo Abeni
2024-07-29 21:35   ` Willem de Bruijn
2024-07-30  7:52     ` Paolo Abeni

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox