* [PATCH v7 1/2] virtio-net: Fix ECN feature description
2025-04-17 21:57 [PATCH v7 0/2] Update ECN and Add AccECN feature chia-yu.chang
@ 2025-04-17 21:57 ` chia-yu.chang
2025-04-17 22:08 ` Michael S. Tsirkin
2025-04-17 21:57 ` [PATCH v7 2/2] virtio-net: define Accurate ECN feature in virtio-spec chia-yu.chang
1 sibling, 1 reply; 4+ messages in thread
From: chia-yu.chang @ 2025-04-17 21:57 UTC (permalink / raw)
To: virtio-comment, mst, cohuck, mvaralar, jasowang, xuanzhuo,
eperezma, ij, ncardwell, koen.de_schepper, g.white, vidhi_goel,
ingemar.s.johansson, mirja.kuehlewind
Cc: Chia-Yu Chang
From: Chia-Yu Chang <chia-yu.chang@nokia-bell-labs.com>
Clarify that the VIRTIO_NET_HDR_GSO_ECN gso_type flag does not mean that
the TCP packet has IP-ECN set; instead, it indicates that the TCP CWR flag
is set and will be cleared from the second segment onwards. This offloads
the TCP CWR flag in a way that is compatible with RFC 3168 ECN but is
problematic for non-RFC 3168 uses of the CWR flag.
Signed-off-by: Chia-Yu Chang <chia-yu.chang@nokia-bell-labs.com>
---
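For illustration only (not part of the patch): a minimal C sketch of the
RFC 3168 style CWR handling described above, where the CWR flag stays on
the first segment and is cleared from the second segment onwards. The
helper name rfc3168_split_flags and the TCP_FLAG_CWR macro are
hypothetical; only the CWR bit position (0x80 in the TCP flags byte) is
taken from the TCP header layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TCP_FLAG_CWR 0x80u /* CWR bit in the TCP flags byte */

/*
 * Hypothetical helper: compute the TCP flags byte of each segment when
 * one large TCP packet is split into nsegs segments RFC 3168 style.
 * Segment 0 keeps the original flags; later segments have CWR cleared.
 * Other per-segment flag handling (e.g. FIN/PSH) is deliberately omitted.
 */
static void rfc3168_split_flags(uint8_t flags, uint8_t *out, size_t nsegs)
{
    assert(out != NULL && nsegs > 0);

    out[0] = flags;
    for (size_t i = 1; i < nsegs; i++)
        out[i] = (uint8_t)(flags & ~TCP_FLAG_CWR);
}
```

A sender that relies on CWR as part of a counter (rather than as a
one-shot signal) would see that counter altered on every segment after
the first, which is why this behaviour is tied to the
VIRTIO_NET_HDR_GSO_ECN bit.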
device-types/net/description.tex | 33 +++++++++++++++++---------------
introduction.tex | 3 +++
2 files changed, 21 insertions(+), 15 deletions(-)
diff --git a/device-types/net/description.tex b/device-types/net/description.tex
index 1b6b54d..a2c9de8 100644
--- a/device-types/net/description.tex
+++ b/device-types/net/description.tex
@@ -54,7 +54,9 @@ \subsection{Feature bits}\label{sec:Device Types / Network Device / Feature bits
\item[VIRTIO_NET_F_GUEST_TSO6 (8)] Driver can receive TSOv6.
-\item[VIRTIO_NET_F_GUEST_ECN (9)] Driver can receive TSO with ECN.
+\item[VIRTIO_NET_F_GUEST_ECN (9)] Driver can receive TSO with TCP CWR flag set
+ and follow the ACE bits handling approach mentioned in
+ \hyperref[intro:rfc3168]{[RFC3168]}.
\item[VIRTIO_NET_F_GUEST_UFO (10)] Driver can receive UFO.
@@ -62,7 +64,9 @@ \subsection{Feature bits}\label{sec:Device Types / Network Device / Feature bits
\item[VIRTIO_NET_F_HOST_TSO6 (12)] Device can receive TSOv6.
-\item[VIRTIO_NET_F_HOST_ECN (13)] Device can receive TSO with ECN.
+\item[VIRTIO_NET_F_HOST_ECN (13)] Device can receive TSO with TCP CWR flag set
+ and follow the ACE bits handling approach mentioned in
+ \hyperref[intro:rfc3168]{[RFC3168]}.
\item[VIRTIO_NET_F_HOST_UFO (14)] Device can receive UFO.
@@ -695,8 +699,9 @@ \subsubsection{Packet Transmission}\label{sec:Device Types / Network Device / De
\item If the driver negotiated the VIRTIO_NET_F_HOST_ECN feature,
the VIRTIO_NET_HDR_GSO_ECN bit in \field{gso_type}
- indicates that the TCP packet has the ECN bit set\footnote{This case is not handled by some older hardware, so is called out
-specifically in the protocol.}.
+ indicates that the TCP packet has TCP CWR flag set and the flag will be handled differently to all segments of
+ an aggregated segment, as mentioned in \hyperref[intro:rfc3168]{[RFC3168]}
+ \footnote{This case is not handled by some older hardware, so is called out specifically in the protocol.}.
\end{itemize}
\item If the driver negotiated the VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO feature and the
@@ -788,10 +793,9 @@ \subsubsection{Packet Transmission}\label{sec:Device Types / Network Device / De
\field{gso_type} to VIRTIO_NET_HDR_GSO_UDP_L4.
The driver SHOULD NOT send to the device TCP packets requiring segmentation offload
-which have the Explicit Congestion Notification bit set, unless the
-VIRTIO_NET_F_HOST_ECN feature is negotiated, in which case the
-driver MUST set the VIRTIO_NET_HDR_GSO_ECN bit in
-\field{gso_type}.
+which have the TCP CWR flag set and require the flag be handled as mentioned in
+\hyperref[intro:rfc3168]{[RFC3168]}, unless the VIRTIO_NET_F_HOST_ECN feature is
+negotiated, in which case the driver MUST set the VIRTIO_NET_HDR_GSO_ECN bit in \field{gso_type}.
If VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO is negotiated, the driver MAY set
VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV4 bit or the VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV6 bit
@@ -1105,9 +1109,9 @@ \subsubsection{Processing of Incoming Packets}\label{sec:Device Types / Network
\end{enumerate}
Additionally, VIRTIO_NET_F_GUEST_CSUM, TSO4, TSO6, UDP, UDP_TUNNEL
-and ECN features enable receive checksum, large receive offload and ECN
-support which are the input equivalents of the transmit checksum,
-transmit segmentation offloading and ECN features, as described
+and ECN features enable receive checksum, large receive offload and RFC3168
+ECN support which are the input equivalents of the transmit checksum,
+transmit segmentation offloading and RFC3168 ECN features, as described
in \ref{sec:Device Types / Network Device / Device Operation /
Packet Transmission}:
\begin{enumerate}
@@ -1210,10 +1214,9 @@ \subsubsection{Processing of Incoming Packets}\label{sec:Device Types / Network
the VIRTIO_NET_HDR_F_UDP_TUNNEL_CSUM bit in \field{flags}.
The device SHOULD NOT send to the driver TCP packets requiring segmentation offload
-which have the Explicit Congestion Notification bit set, unless the
-VIRTIO_NET_F_GUEST_ECN feature is negotiated, in which case the
-device MUST set the VIRTIO_NET_HDR_GSO_ECN bit in
-\field{gso_type}.
+which have the TCP CWR flag set and require the flag be handled as mentioned in
+\hyperref[intro:rfc3168]{[RFC3168]}, unless the VIRTIO_NET_F_GUEST_ECN feature is
+negotiated, in which case the device MUST set the VIRTIO_NET_HDR_GSO_ECN bit in \field{gso_type}.
If the VIRTIO_NET_F_GUEST_CSUM feature has been negotiated, the
device MAY set the VIRTIO_NET_HDR_F_NEEDS_CSUM bit in
diff --git a/introduction.tex b/introduction.tex
index e60298a..d52622e 100644
--- a/introduction.tex
+++ b/introduction.tex
@@ -168,6 +168,9 @@ \section{Normative References}\label{sec:Normative References}
Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC 2119 Key Words", BCP
14, RFC 8174, DOI 10.17487/RFC8174, May 2017
\newline\url{http://www.ietf.org/rfc/rfc8174.txt}\\
+ \phantomsection\label{intro:rfc3168}\textbf{[RFC3168]} &
+ Ramakrishnan, K., Floyd, S., and D. Black, ``The Addition of Explicit Congestion Notification (ECN) to IP'', RFC 3168, September 2001.
+ \newline\url{http://www.ietf.org/rfc/rfc3168.txt}\\
\end{longtable}
\section{Non-Normative References}
--
2.34.1
* [PATCH v7 2/2] virtio-net: define Accurate ECN feature in virtio-spec
2025-04-17 21:57 [PATCH v7 0/2] Update ECN and Add AccECN feature chia-yu.chang
2025-04-17 21:57 ` [PATCH v7 1/2] virtio-net: Fix ECN feature description chia-yu.chang
@ 2025-04-17 21:57 ` chia-yu.chang
1 sibling, 0 replies; 4+ messages in thread
From: chia-yu.chang @ 2025-04-17 21:57 UTC (permalink / raw)
To: virtio-comment, mst, cohuck, mvaralar, jasowang, xuanzhuo,
eperezma, ij, ncardwell, koen.de_schepper, g.white, vidhi_goel,
ingemar.s.johansson, mirja.kuehlewind
Cc: Chia-Yu Chang
From: Chia-Yu Chang <chia-yu.chang@nokia-bell-labs.com>
This change implements Accurate ECN based on AccECN specifications:
https://tools.ietf.org/id/draft-ietf-tcpm-accurate-ecn-28.txt
Unlike RFC 3168 ECN, Accurate ECN uses the CWR flag as part of the ACE
field to count new packets with a CE mark; however, RFC 3168 ECN-aware TSO
clears the CWR flag from the second segment of an aggregated segment
onwards. Therefore, a fallback is applied by setting NETIF_F_GSO_ACCECN to
ensure that the CWR flag is not changed within an aggregated segment
(e.g., a super-skb in Linux).
To apply this in virtio-spec, new feature bits for host and guest are added
for feature negotiation between driver and device. The translation of the
Accurate ECN GSO flag between virtio_net_hdr and the skb header for
NETIF_F_GSO_ACCECN is also added to avoid CWR flag corruption caused by
RFC 3168 ECN TSO.
Signed-off-by: Chia-Yu Chang <chia-yu.chang@nokia-bell-labs.com>
---
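For illustration only (not part of the patch): a minimal C sketch of how
a driver might pick gso_type for a TCPv4 packet on transmit under the
rules this patch proposes. The enum cwr_mode, the function
tx_gso_type_tcpv4 and the -1 "do not offload" return convention are
hypothetical. VIRTIO_NET_HDR_GSO_ACCECN uses the value proposed in this
patch; the two bool arguments stand in for the negotiated
VIRTIO_NET_F_HOST_ECN and VIRTIO_NET_F_HOST_ACCECN features.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define VIRTIO_NET_HDR_GSO_TCPV4  1
#define VIRTIO_NET_HDR_GSO_ECN    0x80
#define VIRTIO_NET_HDR_GSO_ACCECN 0x10 /* value proposed by this patch */

/* Hypothetical: how the CWR flag of a large TCP packet must be treated. */
enum cwr_mode {
    CWR_NONE,    /* CWR not set, no special handling needed */
    CWR_RFC3168, /* CWR kept on the first segment only */
    CWR_ACCECN   /* CWR preserved on every segment */
};

/*
 * Hypothetical helper: return the gso_type for a TCPv4 packet, or -1 if
 * the packet should not be handed to the device for segmentation
 * (the SHOULD NOT case), e.g. so the caller can segment in software.
 */
static int tx_gso_type_tcpv4(enum cwr_mode mode, bool host_ecn, bool host_accecn)
{
    assert(mode == CWR_NONE || mode == CWR_RFC3168 || mode == CWR_ACCECN);

    switch (mode) {
    case CWR_NONE:
        return VIRTIO_NET_HDR_GSO_TCPV4;
    case CWR_RFC3168:
        return host_ecn ? (VIRTIO_NET_HDR_GSO_TCPV4 | VIRTIO_NET_HDR_GSO_ECN) : -1;
    case CWR_ACCECN:
        return host_accecn ? (VIRTIO_NET_HDR_GSO_TCPV4 | VIRTIO_NET_HDR_GSO_ACCECN) : -1;
    }
    return -1;
}
```

The receive direction would mirror this with the GUEST_ECN and
GUEST_ACCECN features deciding which bit the device sets.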
device-types/net/description.tex | 50 +++++++++++++++++++++++++-------
introduction.tex | 3 ++
2 files changed, 42 insertions(+), 11 deletions(-)
diff --git a/device-types/net/description.tex b/device-types/net/description.tex
index a2c9de8..61e2aac 100644
--- a/device-types/net/description.tex
+++ b/device-types/net/description.tex
@@ -140,6 +140,14 @@ \subsection{Feature bits}\label{sec:Device Types / Network Device / Feature bits
\item[VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO_CSUM (68)] Device handles packets
carried by a UDP tunnel with partial csum for the outer header.
+
+\item[VIRTIO_NET_F_HOST_ACCECN (69)] Device can receive TSO with TCP CWR flag set
+ and follow the ACE bits handling approach mentioned in
+ \hyperref[intro:accecn]{[AccECN]}.
+
+\item[VIRTIO_NET_F_GUEST_ACCECN (70)] Driver can receive TSO with TCP CWR flag set
+ and follow the ACE bits handling approach mentioned in
+ \hyperref[intro:accecn]{[AccECN]}.
\end{description}
\subsubsection{Feature bit requirements}\label{sec:Device Types / Network Device / Feature bits / Feature bit requirements}
@@ -151,6 +159,7 @@ \subsubsection{Feature bit requirements}\label{sec:Device Types / Network Device
\item[VIRTIO_NET_F_GUEST_TSO4] Requires VIRTIO_NET_F_GUEST_CSUM.
\item[VIRTIO_NET_F_GUEST_TSO6] Requires VIRTIO_NET_F_GUEST_CSUM.
\item[VIRTIO_NET_F_GUEST_ECN] Requires VIRTIO_NET_F_GUEST_TSO4 or VIRTIO_NET_F_GUEST_TSO6.
+\item[VIRTIO_NET_F_GUEST_ACCECN] Requires VIRTIO_NET_F_GUEST_TSO4 or VIRTIO_NET_F_GUEST_TSO6.
\item[VIRTIO_NET_F_GUEST_UFO] Requires VIRTIO_NET_F_GUEST_CSUM.
\item[VIRTIO_NET_F_GUEST_USO4] Requires VIRTIO_NET_F_GUEST_CSUM.
\item[VIRTIO_NET_F_GUEST_USO6] Requires VIRTIO_NET_F_GUEST_CSUM.
@@ -161,6 +170,7 @@ \subsubsection{Feature bit requirements}\label{sec:Device Types / Network Device
\item[VIRTIO_NET_F_HOST_TSO4] Requires VIRTIO_NET_F_CSUM.
\item[VIRTIO_NET_F_HOST_TSO6] Requires VIRTIO_NET_F_CSUM.
\item[VIRTIO_NET_F_HOST_ECN] Requires VIRTIO_NET_F_HOST_TSO4 or VIRTIO_NET_F_HOST_TSO6.
+\item[VIRTIO_NET_F_HOST_ACCECN] Requires VIRTIO_NET_F_HOST_TSO4 or VIRTIO_NET_F_HOST_TSO6.
\item[VIRTIO_NET_F_HOST_UFO] Requires VIRTIO_NET_F_CSUM.
\item[VIRTIO_NET_F_HOST_USO] Requires VIRTIO_NET_F_CSUM.
\item[VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO] Requires VIRTIO_NET_F_HOST_TSO4, VIRTIO_NET_F_HOST_TSO6
@@ -284,11 +294,11 @@ \subsection{Device configuration layout}\label{sec:Device Types / Network Device
The device MUST NOT modify \field{mtu} once it has been set.
The device MUST NOT pass received packets that exceed \field{mtu} (plus low
-level ethernet header length) size with \field{gso_type} NONE or ECN
+level ethernet header length) size with \field{gso_type} NONE, ECN or ACCECN
after VIRTIO_NET_F_MTU has been successfully negotiated.
The device MUST forward transmitted packets of up to \field{mtu} (plus low
-level ethernet header length) size with \field{gso_type} NONE or ECN, and do
+level ethernet header length) size with \field{gso_type} NONE, ECN or ACCECN, and do
so without fragmentation, after VIRTIO_NET_F_MTU has been successfully
negotiated.
@@ -338,11 +348,11 @@ \subsection{Device configuration layout}\label{sec:Device Types / Network Device
If the driver negotiates VIRTIO_NET_F_MTU, it MUST supply enough receive
buffers to receive at least one receive packet of size \field{mtu} (plus low
-level ethernet header length) with \field{gso_type} NONE or ECN.
+level ethernet header length) with \field{gso_type} NONE, ECN or ACCECN.
If the driver negotiates VIRTIO_NET_F_MTU, it MUST NOT transmit packets of
size exceeding the value of \field{mtu} (plus low level ethernet header length)
-with \field{gso_type} NONE or ECN.
+with \field{gso_type} NONE, ECN or ACCECN.
A driver SHOULD negotiate the VIRTIO_NET_F_STANDBY feature if the device offers it.
@@ -433,7 +443,7 @@ \subsection{Device Initialization}\label{sec:Device Types / Network Device / Dev
The VIRTIO_NET_F_GUEST_CSUM feature indicates that partially
checksummed packets can be received, and if it can do that then
the VIRTIO_NET_F_GUEST_TSO4, VIRTIO_NET_F_GUEST_TSO6,
- VIRTIO_NET_F_GUEST_UFO, VIRTIO_NET_F_GUEST_ECN, VIRTIO_NET_F_GUEST_USO4,
+ VIRTIO_NET_F_GUEST_UFO, VIRTIO_NET_F_GUEST_ECN, VIRTIO_NET_F_GUEST_ACCECN, VIRTIO_NET_F_GUEST_USO4,
VIRTIO_NET_F_GUEST_USO6 VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO and
VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO_CSUM are the input equivalents of
the features described above.
@@ -592,6 +602,7 @@ \subsection{Device Operation}\label{sec:Device Types / Network Device / Device O
#define VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV4 0x20
#define VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV6 0x40
#define VIRTIO_NET_HDR_GSO_ECN 0x80
+#define VIRTIO_NET_HDR_GSO_ACCECN 0x10
u8 gso_type;
le16 hdr_len;
le16 gso_size;
@@ -702,6 +713,12 @@ \subsubsection{Packet Transmission}\label{sec:Device Types / Network Device / De
indicates that the TCP packet has TCP CWR flag set and the flag will be handled differently to all segments of
an aggregated segment, as mentioned in \hyperref[intro:rfc3168]{[RFC3168]}
\footnote{This case is not handled by some older hardware, so is called out specifically in the protocol.}.
+
+ \item If the driver negotiated the VIRTIO_NET_F_HOST_ACCECN feature,
+ the VIRTIO_NET_HDR_GSO_ACCECN bit in \field{gso_type}
+ indicates that the TCP packet has TCP CWR flag set and the flag will be applied to all segments of an aggregated
+ segment, as mentioned in \hyperref[intro:accecn]{[AccECN]}
+ \footnote{This case is not handled by some older hardware, so is called out specifically in the protocol.}.
\end{itemize}
\item If the driver negotiated the VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO feature and the
@@ -797,6 +814,11 @@ \subsubsection{Packet Transmission}\label{sec:Device Types / Network Device / De
\hyperref[intro:rfc3168]{[RFC3168]}, unless the VIRTIO_NET_F_HOST_ECN feature is
negotiated, in which case the driver MUST set the VIRTIO_NET_HDR_GSO_ECN bit in \field{gso_type}.
+The driver SHOULD NOT send to the device TCP packets requiring segmentation offload
+which have the TCP CWR flag set and require the flag be applied as mentioned in
+\hyperref[intro:accecn]{[AccECN]}, unless the VIRTIO_NET_F_HOST_ACCECN feature is
+negotiated, in which case the driver MUST set the VIRTIO_NET_HDR_GSO_ACCECN bit in \field{gso_type}.
+
If VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO is negotiated, the driver MAY set
VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV4 bit or the VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV6 bit
in \field{gso_type} according to the inner network header protocol type
@@ -1108,12 +1130,12 @@ \subsubsection{Processing of Incoming Packets}\label{sec:Device Types / Network
in case of tunnels) has been validated.
\end{enumerate}
-Additionally, VIRTIO_NET_F_GUEST_CSUM, TSO4, TSO6, UDP, UDP_TUNNEL
-and ECN features enable receive checksum, large receive offload and RFC3168
-ECN support which are the input equivalents of the transmit checksum,
-transmit segmentation offloading and RFC3168 ECN features, as described
-in \ref{sec:Device Types / Network Device / Device Operation /
-Packet Transmission}:
+Additionally, VIRTIO_NET_F_GUEST_CSUM, TSO4, TSO6, UDP, UDP_TUNNEL, ECN and
+ACCECN features enable receive checksum, large receive offload, RFC3168 ECN
+and Accurate ECN support which are the input equivalents of the transmit
+checksum, transmit segmentation offloading, RFC3168 ECN and Accurate ECN
+features, as described in \ref{sec:Device Types / Network Device /
+Device Operation / Packet Transmission}:
\begin{enumerate}
\item If the VIRTIO_NET_F_GUEST_TSO4, TSO6, UFO, USO4 or USO6 options were
negotiated, then \field{gso_type} MAY be something other than
@@ -1218,6 +1240,11 @@ \subsubsection{Processing of Incoming Packets}\label{sec:Device Types / Network
\hyperref[intro:rfc3168]{[RFC3168]}, unless the VIRTIO_NET_F_GUEST_ECN feature is
negotiated, in which case the device MUST set the VIRTIO_NET_HDR_GSO_ECN bit in \field{gso_type}.
+The device SHOULD NOT send to the driver TCP packets requiring segmentation offload
+which have the TCP CWR flag set and require the flag be handled as mentioned in
+\hyperref[intro:accecn]{[AccECN]}, unless the VIRTIO_NET_F_GUEST_ACCECN feature is
+negotiated, in which case the device MUST set the VIRTIO_NET_HDR_GSO_ACCECN bit in \field{gso_type}.
+
If the VIRTIO_NET_F_GUEST_CSUM feature has been negotiated, the
device MAY set the VIRTIO_NET_HDR_F_NEEDS_CSUM bit in
\field{flags}, if so:
@@ -2193,6 +2220,7 @@ \subsubsection{Control Virtqueue}\label{sec:Device Types / Network Device / Devi
#define VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO_CSUM 47
#define VIRTIO_NET_F_GUEST_USO4 54
#define VIRTIO_NET_F_GUEST_USO6 55
+#define VIRTIO_NET_F_GUEST_ACCECN 70
#define VIRTIO_NET_CTRL_GUEST_OFFLOADS 5
#define VIRTIO_NET_CTRL_GUEST_OFFLOADS_SET 0
diff --git a/introduction.tex b/introduction.tex
index d52622e..9320ca1 100644
--- a/introduction.tex
+++ b/introduction.tex
@@ -171,6 +171,9 @@ \section{Normative References}\label{sec:Normative References}
\phantomsection\label{intro:rfc3168}\textbf{[RFC3168]} &
Ramakrishnan, K., Floyd, S., and D. Black, ``The Addition of Explicit Congestion Notification (ECN) to IP'', RFC 3168, September 2001.
\newline\url{http://www.ietf.org/rfc/rfc3168.txt}\\
+ \phantomsection\label{intro:accecn}\textbf{[AccECN]} &
+ B. Briscoe., ``More Accurate Explicit Congestion Notification (AccECN) Feedback in TCP'', February 2025.
+ \newline\url{https://www.ietf.org/archive/id/draft-ietf-tcpm-accurate-ecn-33.txt}\\
\end{longtable}
\section{Non-Normative References}
--
2.34.1