All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH RESEND 0/4] virtio-net: add tx-hash, rx/tx-tstamp and tx-time
@ 2024-06-24 11:09 Steffen Trumtrar
  2024-06-24 11:09 ` [PATCH RESEND 1/4] virtio-net: support transmit hash report Steffen Trumtrar
                   ` (3 more replies)
  0 siblings, 4 replies; 14+ messages in thread
From: Steffen Trumtrar @ 2024-06-24 11:09 UTC (permalink / raw)
  To: virtio-comment, kernel; +Cc: Steffen Trumtrar

The linux patch series [1] adds support for transmit hash reports and
receive and transmit timestamps.

This series describes the added flags and how the new features are
implemented.

Signed-off-by: Steffen Trumtrar <s.trumtrar@pengutronix.de>

[1] https://lore.kernel.org/all/20231218-v6-7-topic-virtio-net-ptp-v1-0-cac92b2d8532@pengutronix.de/

---
Steffen Trumtrar (4):
      virtio-net: support transmit hash report
      virtio-net: support receive ptp timestamps
      virtio-net: support transmit ptp timestamps
      virtio-net: support future packet transmit time

 device-types/net/description.tex | 65 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 65 insertions(+)
---
base-commit: 37c6a406678a5ee891fdf5671298cb4fcfa517f2
change-id: 20240618-v1-4-topic-virtio-net-timestamping-0ac98fc8bc42

Best regards,
-- 
Steffen Trumtrar <s.trumtrar@pengutronix.de>


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH RESEND 1/4] virtio-net: support transmit hash report
  2024-06-24 11:09 [PATCH RESEND 0/4] virtio-net: add tx-hash, rx/tx-tstamp and tx-time Steffen Trumtrar
@ 2024-06-24 11:09 ` Steffen Trumtrar
  2024-06-24 11:57   ` Michael S. Tsirkin
  2024-06-25  5:07   ` Parav Pandit
  2024-06-24 11:09 ` [PATCH RESEND 2/4] virtio-net: support receive ptp timestamps Steffen Trumtrar
                   ` (2 subsequent siblings)
  3 siblings, 2 replies; 14+ messages in thread
From: Steffen Trumtrar @ 2024-06-24 11:09 UTC (permalink / raw)
  To: virtio-comment, kernel; +Cc: Steffen Trumtrar

Virtio-net supports sharing the flow hash from device to driver on rx.
Do the same in the other direction for robust routing and telemetry.

Experimental results mirror what the theory suggests: where IPv6
FlowLabel is included in path selection (e.g., LAG/ECMP), flowlabel
rotation on TCP timeout avoids the vast majority of TCP disconnects
that would otherwise have occurred during link failures in long-haul
backbones, when an alternative path is available.

Rotation can be applied to various bad connection signals, such as
timeouts and spurious retransmissions. In aggregate, such flow level
signals can help locate network issues.

Signed-off-by: Steffen Trumtrar <s.trumtrar@pengutronix.de>
---
 device-types/net/description.tex | 30 ++++++++++++++++++++++++++++++
 1 file changed, 30 insertions(+)

diff --git a/device-types/net/description.tex b/device-types/net/description.tex
index 61cce1f..eb2c08e 100644
--- a/device-types/net/description.tex
+++ b/device-types/net/description.tex
@@ -88,6 +88,8 @@ \subsection{Feature bits}\label{sec:Device Types / Network Device / Feature bits
 \item[VIRTIO_NET_F_CTRL_MAC_ADDR(23)] Set MAC address through control
     channel.
 
+\item[VIRTIO_NET_F_TX_HASH(49)] Driver sends hash report
+
 \item[VIRTIO_NET_F_DEVICE_STATS(50)] Device can provide device-level statistics
     to the driver through the control virtqueue.
 
@@ -641,6 +643,34 @@ \subsubsection{Packet Transmission}\label{sec:Device Types / Network Device / De
 
 If VIRTIO_NET_HDR_F_NEEDS_CSUM is not set, the device MUST NOT
 rely on the packet checksum being correct.
+
+\paragraph{Hash calculation for outgoing packets}
+\label{sec:Device Types / Network Device / Device Operation / Packet Transmission / Hash calculation for outgoing packets }
+
+If the VIRTIO_NET_F_TX_HASH was negotiated and the packet includes a hash, the driver uses
+the structure virtio_net_hdr_hash_ts instead of virtio_net_hdr and increases the \field{hdr_len} accordingly.
+
+\begin{lstlisting}
+  struct virtio_net_hdr_hash_ts {
+    struct {
+      struct virtio_net_hdr hdr;
+      __le33 value;
+      __le16 report;
+      __le16 flow_state;
+    } hash;
+    __u32 reserved;
+  };
+\end{lstlisting}
+
+The driver fills \field{hash_value} with the value of the calculated hash and \field{hash_report} with the report type of the calculated hash.
+
+Possible values that the driver supports in \field{hash_report} are defined below.
+
+\begin{lstlisting}
+#define VIRTIO_NET_HASH_REPORT_L4             (1 << 10)
+#define VIRTIO_NET_HASH_REPORT_OTHER          (1 << 11)
+\end{lstlisting}
+
 \paragraph{Packet Transmission Interrupt}\label{sec:Device Types / Network Device / Device Operation / Packet Transmission / Packet Transmission Interrupt}
 
 Often a driver will suppress transmission virtqueue interrupts

-- 
2.42.0


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH RESEND 2/4] virtio-net: support receive ptp timestamps
  2024-06-24 11:09 [PATCH RESEND 0/4] virtio-net: add tx-hash, rx/tx-tstamp and tx-time Steffen Trumtrar
  2024-06-24 11:09 ` [PATCH RESEND 1/4] virtio-net: support transmit hash report Steffen Trumtrar
@ 2024-06-24 11:09 ` Steffen Trumtrar
  2024-11-06  8:51   ` Xuan Zhuo
  2024-06-24 11:09 ` [PATCH RESEND 3/4] virtio-net: support transmit " Steffen Trumtrar
  2024-06-24 11:09 ` [PATCH RESEND 4/4] virtio-net: support future packet transmit time Steffen Trumtrar
  3 siblings, 1 reply; 14+ messages in thread
From: Steffen Trumtrar @ 2024-06-24 11:09 UTC (permalink / raw)
  To: virtio-comment, kernel; +Cc: Steffen Trumtrar

Accurate RTT measurement requires timestamps close to the wire.
Introduce virtio feature VIRTIO_NET_F_RX_TSTAMP. If negotiated, the
virtio-net header is expanded with room for a timestamp.

A device may pass receive timestamps for all or some packets. Flag
VIRTIO_NET_HDR_F_TSTAMP signals whether a timestamp is recorded.

Signed-off-by: Steffen Trumtrar <s.trumtrar@pengutronix.de>
---
 device-types/net/description.tex | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/device-types/net/description.tex b/device-types/net/description.tex
index eb2c08e..09d9c28 100644
--- a/device-types/net/description.tex
+++ b/device-types/net/description.tex
@@ -88,6 +88,8 @@ \subsection{Feature bits}\label{sec:Device Types / Network Device / Feature bits
 \item[VIRTIO_NET_F_CTRL_MAC_ADDR(23)] Set MAC address through control
     channel.
 
+\item[VIRTIO_NET_F_RX_TSTAMP(48)] Device sends TAI receive time
+
 \item[VIRTIO_NET_F_TX_HASH(49)] Driver sends hash report
 
 \item[VIRTIO_NET_F_DEVICE_STATS(50)] Device can provide device-level statistics
@@ -415,6 +417,7 @@ \subsection{Device Operation}\label{sec:Device Types / Network Device / Device O
 #define VIRTIO_NET_HDR_F_NEEDS_CSUM    1
 #define VIRTIO_NET_HDR_F_DATA_VALID    2
 #define VIRTIO_NET_HDR_F_RSC_INFO      4
+#define VIRTIO_NET_HDR_F_TSTAMP        8
         u8 flags;
 #define VIRTIO_NET_HDR_GSO_NONE        0
 #define VIRTIO_NET_HDR_GSO_TCPV4       1
@@ -659,6 +662,7 @@ \subsubsection{Packet Transmission}\label{sec:Device Types / Network Device / De
       __le16 flow_state;
     } hash;
     __u32 reserved;
+    __le64 tstamp;
   };
 \end{lstlisting}
 
@@ -1177,6 +1181,13 @@ \subsubsection{Processing of Incoming Packets}\label{sec:Device Types / Network
 #define VIRTIO_NET_HASH_REPORT_UDPv6_EX        9
 \end{lstlisting}
 
+\paragraph{Timestamping for incoming packets}
+\label{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Timestamping for incoming packets}
+
+If the feature VIRTIO_NET_F_RX_TSTAMP is negotiated, the \field{hdr_len} is set to the size of the virtio_net_hdr_hash_ts struct.
+The device may pass receive timestamps for incoming packets via \field{tstamp} and signals this with the
+VIRTIO_NET_HDR_F_TSTAMP flag.
+
 \subsubsection{Control Virtqueue}\label{sec:Device Types / Network Device / Device Operation / Control Virtqueue}
 
 The driver uses the control virtqueue (if VIRTIO_NET_F_CTRL_VQ is

-- 
2.42.0


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH RESEND 3/4] virtio-net: support transmit ptp timestamps
  2024-06-24 11:09 [PATCH RESEND 0/4] virtio-net: add tx-hash, rx/tx-tstamp and tx-time Steffen Trumtrar
  2024-06-24 11:09 ` [PATCH RESEND 1/4] virtio-net: support transmit hash report Steffen Trumtrar
  2024-06-24 11:09 ` [PATCH RESEND 2/4] virtio-net: support receive ptp timestamps Steffen Trumtrar
@ 2024-06-24 11:09 ` Steffen Trumtrar
  2024-06-25  5:25   ` Parav Pandit
  2024-06-24 11:09 ` [PATCH RESEND 4/4] virtio-net: support future packet transmit time Steffen Trumtrar
  3 siblings, 1 reply; 14+ messages in thread
From: Steffen Trumtrar @ 2024-06-24 11:09 UTC (permalink / raw)
  To: virtio-comment, kernel; +Cc: Steffen Trumtrar

Add optional PTP hardware tx timestamp offload for virtio-net.

Accurate RTT measurement requires timestamps close to the wire.
Introduce virtio feature VIRTIO_NET_F_TX_TSTAMP, the transmit
equivalent to VIRTIO_NET_F_RX_TSTAMP.

The driver sets VIRTIO_NET_HDR_F_TSTAMP to request a timestamp
returned on completion. If the feature is negotiated, the device
either places the timestamp or clears the feature bit.

Signed-off-by: Steffen Trumtrar <s.trumtrar@pengutronix.de>
---
 device-types/net/description.tex | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

diff --git a/device-types/net/description.tex b/device-types/net/description.tex
index 09d9c28..c438b0b 100644
--- a/device-types/net/description.tex
+++ b/device-types/net/description.tex
@@ -88,6 +88,8 @@ \subsection{Feature bits}\label{sec:Device Types / Network Device / Feature bits
 \item[VIRTIO_NET_F_CTRL_MAC_ADDR(23)] Set MAC address through control
     channel.
 
+\item[VIRTIO_NET_F_TX_TSTAMP(47)] Device sends TAI transmit time
+
 \item[VIRTIO_NET_F_RX_TSTAMP(48)] Device sends TAI receive time
 
 \item[VIRTIO_NET_F_TX_HASH(49)] Driver sends hash report
@@ -532,6 +534,12 @@ \subsubsection{Packet Transmission}\label{sec:Device Types / Network Device / De
 \item The header and packet are added as one output descriptor to the
   transmitq, and the device is notified of the new entry
   (see \ref{sec:Device Types / Network Device / Device Initialization}~\nameref{sec:Device Types / Network Device / Device Initialization}).
+
+\item If the driver negotiated the VIRTIO_NET_F_TX_TSTAMP and hardware
+  timestamping is supported on the device, the VIRTIO_NET_HDR_F_TSTAMP
+  flag is set. One writable input descriptor with the size of a timestamp,
+  i.e. \field{tstamp} in virtio_net_hdr_hash_ts,
+  is appended to the output descriptor.
 \end{enumerate}
 
 \drivernormative{\paragraph}{Packet Transmission}{Device Types / Network Device / Device Operation / Packet Transmission}
@@ -675,6 +683,17 @@ \subsubsection{Packet Transmission}\label{sec:Device Types / Network Device / De
 #define VIRTIO_NET_HASH_REPORT_OTHER          (1 << 11)
 \end{lstlisting}
 
+\paragraph{Timestamping for outgoing packets}
+\label{sec:Device Types / Network Device / Device Operation / Packet Transmission / Timestamping for outgoing packets}
+
+If the feature VIRTIO_NET_F_TX_TSTAMP is negotiated, the driver sets \field{hdr_len} to the virtio_net_hdr_hash_ts.
+
+The driver sets VIRTIO_NET_HDR_F_STAMP to request a timestamp returned on completion in \field{tstamp}.
+
+The device gets the second, writable descriptor, generates a timestamp and writes it to the \field{tstamp}.
+
+Finally, after transmitting the packet, the driver gets the \field{tstamp} from the packet.
+
 \paragraph{Packet Transmission Interrupt}\label{sec:Device Types / Network Device / Device Operation / Packet Transmission / Packet Transmission Interrupt}
 
 Often a driver will suppress transmission virtqueue interrupts

-- 
2.42.0


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH RESEND 4/4] virtio-net: support future packet transmit time
  2024-06-24 11:09 [PATCH RESEND 0/4] virtio-net: add tx-hash, rx/tx-tstamp and tx-time Steffen Trumtrar
                   ` (2 preceding siblings ...)
  2024-06-24 11:09 ` [PATCH RESEND 3/4] virtio-net: support transmit " Steffen Trumtrar
@ 2024-06-24 11:09 ` Steffen Trumtrar
  3 siblings, 0 replies; 14+ messages in thread
From: Steffen Trumtrar @ 2024-06-24 11:09 UTC (permalink / raw)
  To: virtio-comment, kernel; +Cc: Steffen Trumtrar

Add optional transmit time (SO_TXTIME) offload for virtio-net.

The Linux TCP/IP stack tries to avoid bursty transmission and network
congestion through pacing: computing an skb delivery time based on
congestion information. Userspace protocol implementations can achieve
the same with SO_TXTIME. This may also reduce scheduling jitter and
improve RTT estimation.

Pacing can be implemented in ETF or FQ qdiscs or offloaded to NIC
hardware. Allow virtio-net driver to offload for the same reasons.

Signed-off-by: Steffen Trumtrar <s.trumtrar@pengutronix.de>
---
 device-types/net/description.tex | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/device-types/net/description.tex b/device-types/net/description.tex
index c438b0b..bae1a59 100644
--- a/device-types/net/description.tex
+++ b/device-types/net/description.tex
@@ -88,6 +88,8 @@ \subsection{Feature bits}\label{sec:Device Types / Network Device / Feature bits
 \item[VIRTIO_NET_F_CTRL_MAC_ADDR(23)] Set MAC address through control
     channel.
 
+\item[VIRTIO_NET_F_TX_TIME(46)] Driver sets TAI delivery time
+
 \item[VIRTIO_NET_F_TX_TSTAMP(47)] Device sends TAI transmit time
 
 \item[VIRTIO_NET_F_RX_TSTAMP(48)] Device sends TAI receive time
@@ -540,6 +542,9 @@ \subsubsection{Packet Transmission}\label{sec:Device Types / Network Device / De
   flag is set. One writable input descriptor with the size of a timestamp,
   i.e. \field{tstamp} in virtio_net_hdr_hash_ts,
   is appended to the output descriptor.
+
+\item If the driver negotiated the VIRTIO_NET_F_TX_TIME the transmit time
+of the packet is written to the \field{tstamp}.
 \end{enumerate}
 
 \drivernormative{\paragraph}{Packet Transmission}{Device Types / Network Device / Device Operation / Packet Transmission}

-- 
2.42.0


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH RESEND 1/4] virtio-net: support transmit hash report
  2024-06-24 11:09 ` [PATCH RESEND 1/4] virtio-net: support transmit hash report Steffen Trumtrar
@ 2024-06-24 11:57   ` Michael S. Tsirkin
  2024-06-25  5:07   ` Parav Pandit
  1 sibling, 0 replies; 14+ messages in thread
From: Michael S. Tsirkin @ 2024-06-24 11:57 UTC (permalink / raw)
  To: Steffen Trumtrar; +Cc: virtio-comment, kernel

On Mon, Jun 24, 2024 at 01:09:04PM +0200, Steffen Trumtrar wrote:
> Virtio-net supports sharing the flow hash from device to driver on rx.
> Do the same in the other direction for robust routing and telemetry.
> 
> Experimental results mirror what the theory suggests: where IPv6
> FlowLabel is included in path selection (e.g., LAG/ECMP), flowlabel
> rotation on TCP timeout avoids the vast majority of TCP disconnects
> that would otherwise have occurred during link failures in long-haul
> backbones, when an alternative path is available.
> 
> Rotation can be applied to various bad connection signals, such as
> timeouts and spurious retransmissions. In aggregate, such flow level
> signals can help locate network issues.
> 
> Signed-off-by: Steffen Trumtrar <s.trumtrar@pengutronix.de>
> ---
>  device-types/net/description.tex | 30 ++++++++++++++++++++++++++++++
>  1 file changed, 30 insertions(+)
> 
> diff --git a/device-types/net/description.tex b/device-types/net/description.tex
> index 61cce1f..eb2c08e 100644
> --- a/device-types/net/description.tex
> +++ b/device-types/net/description.tex
> @@ -88,6 +88,8 @@ \subsection{Feature bits}\label{sec:Device Types / Network Device / Feature bits
>  \item[VIRTIO_NET_F_CTRL_MAC_ADDR(23)] Set MAC address through control
>      channel.
>  
> +\item[VIRTIO_NET_F_TX_HASH(49)] Driver sends hash report
> +
>  \item[VIRTIO_NET_F_DEVICE_STATS(50)] Device can provide device-level statistics
>      to the driver through the control virtqueue.
>  
> @@ -641,6 +643,34 @@ \subsubsection{Packet Transmission}\label{sec:Device Types / Network Device / De
>  
>  If VIRTIO_NET_HDR_F_NEEDS_CSUM is not set, the device MUST NOT
>  rely on the packet checksum being correct.
> +
> +\paragraph{Hash calculation for outgoing packets}
> +\label{sec:Device Types / Network Device / Device Operation / Packet Transmission / Hash calculation for outgoing packets }
> +
> +If the VIRTIO_NET_F_TX_HASH was negotiated and the packet includes a hash, the driver uses
> +the structure virtio_net_hdr_hash_ts instead of virtio_net_hdr and increases the \field{hdr_len} accordingly.

I don't get what does hdr_len have to do with it. It does not normally
include virtio net header and it does not look like your
patch changed it, either.


> +
> +\begin{lstlisting}
> +  struct virtio_net_hdr_hash_ts {
> +    struct {
> +      struct virtio_net_hdr hdr;

hdr is not part of hash, is it? why include it here?

> +      __le33 value;

le33?

> +      __le16 report;
> +      __le16 flow_state;

fields are left undocumented.

> +    } hash;
> +    __u32 reserved;

why do we need this? at least explain what it is and how is this
supposed to be handled, please.

> +  };
> +\end{lstlisting}
> +
> +The driver fills \field{hash_value} with the value of the calculated hash and \field{hash_report} with the report type of the calculated hash.

what are hash_value and hash_report?

> +
> +Possible values that the driver supports in \field{hash_report} are defined below.

considering how we have an extensive infrastructure for supported
hashes for rx, don't we want something like this for tx?

e.g. isn't there overhead associated with filling out this hash,
and doesn't driver benefit from knowing which ones can the
device use, exactly?

> +
> +\begin{lstlisting}
> +#define VIRTIO_NET_HASH_REPORT_L4             (1 << 10)
> +#define VIRTIO_NET_HASH_REPORT_OTHER          (1 << 11)
> +\end{lstlisting}
> +
>  \paragraph{Packet Transmission Interrupt}\label{sec:Device Types / Network Device / Device Operation / Packet Transmission / Packet Transmission Interrupt}
>  
>  Often a driver will suppress transmission virtqueue interrupts
> 
> -- 
> 2.42.0
> 


^ permalink raw reply	[flat|nested] 14+ messages in thread

* RE: [PATCH RESEND 1/4] virtio-net: support transmit hash report
  2024-06-24 11:09 ` [PATCH RESEND 1/4] virtio-net: support transmit hash report Steffen Trumtrar
  2024-06-24 11:57   ` Michael S. Tsirkin
@ 2024-06-25  5:07   ` Parav Pandit
  2024-07-11 12:11     ` Steffen Trumtrar
  1 sibling, 1 reply; 14+ messages in thread
From: Parav Pandit @ 2024-06-25  5:07 UTC (permalink / raw)
  To: Steffen Trumtrar, virtio-comment@lists.linux.dev,
	kernel@pengutronix.de



> From: Steffen Trumtrar <s.trumtrar@pengutronix.de>
> Sent: Monday, June 24, 2024 4:39 PM
> 
> Virtio-net supports sharing the flow hash from device to driver on rx.
> Do the same in the other direction for robust routing and telemetry.
> 
> Experimental results mirror what the theory suggests: where IPv6 FlowLabel
> is included in path selection (e.g., LAG/ECMP), flowlabel rotation on TCP
> timeout avoids the vast majority of TCP disconnects that would otherwise
> have occurred during link failures in long-haul backbones, when an
> alternative path is available.
>
Can you please explain how does this related to the virtio-net device on transmit path?
The value proposed here is only between the driver and device. It wont be part of the packet. Did I understand right?
If so, which link do you refer to in above description? Virtio-net device link?

 
> Rotation can be applied to various bad connection signals, such as timeouts
> and spurious retransmissions. In aggregate, such flow level signals can help
> locate network issues.
>
 
> Signed-off-by: Steffen Trumtrar <s.trumtrar@pengutronix.de>
> ---
>  device-types/net/description.tex | 30 ++++++++++++++++++++++++++++++
>  1 file changed, 30 insertions(+)
> 
> diff --git a/device-types/net/description.tex b/device-
> types/net/description.tex
> index 61cce1f..eb2c08e 100644
> --- a/device-types/net/description.tex
> +++ b/device-types/net/description.tex
> @@ -88,6 +88,8 @@ \subsection{Feature bits}\label{sec:Device Types /
> Network Device / Feature bits  \item[VIRTIO_NET_F_CTRL_MAC_ADDR(23)]
> Set MAC address through control
>      channel.
> 
> +\item[VIRTIO_NET_F_TX_HASH(49)] Driver sends hash report
> +
>  \item[VIRTIO_NET_F_DEVICE_STATS(50)] Device can provide device-level
> statistics
>      to the driver through the control virtqueue.
> 
> @@ -641,6 +643,34 @@ \subsubsection{Packet
> Transmission}\label{sec:Device Types / Network Device / De
> 
>  If VIRTIO_NET_HDR_F_NEEDS_CSUM is not set, the device MUST NOT  rely
> on the packet checksum being correct.
> +
> +\paragraph{Hash calculation for outgoing packets} \label{sec:Device
> +Types / Network Device / Device Operation / Packet Transmission / Hash
> +calculation for outgoing packets }
> +
> +If the VIRTIO_NET_F_TX_HASH was negotiated and the packet includes a
> +hash, the driver uses the structure virtio_net_hdr_hash_ts instead of
> virtio_net_hdr and increases the \field{hdr_len} accordingly.
> +
> +\begin{lstlisting}
> +  struct virtio_net_hdr_hash_ts {
> +    struct {
> +      struct virtio_net_hdr hdr;
> +      __le33 value;
> +      __le16 report;
> +      __le16 flow_state;
> +    } hash;
> +    __u32 reserved;
> +  };
> +\end{lstlisting}
> +
> +The driver fills \field{hash_value} with the value of the calculated hash and
> \field{hash_report} with the report type of the calculated hash.
> +
It would be better to write it as \field{hash.value} because there is no hash_value field.

Instead of defining new structure, proposed feature bit can reuse the existing hash_value and hash_report fields.
I guess that’s what you meant in above?

> +Possible values that the driver supports in \field{hash_report} are defined
> below.
> +
> +\begin{lstlisting}
> +#define VIRTIO_NET_HASH_REPORT_L4             (1 << 10)
> +#define VIRTIO_NET_HASH_REPORT_OTHER          (1 << 11)
> +\end{lstlisting}
> +
Please described _OTHER type of hash.

>  \paragraph{Packet Transmission Interrupt}\label{sec:Device Types / Network
> Device / Device Operation / Packet Transmission / Packet Transmission
> Interrupt}
> 
>  Often a driver will suppress transmission virtqueue interrupts
> 
> --
> 2.42.0
> 


^ permalink raw reply	[flat|nested] 14+ messages in thread

* RE: [PATCH RESEND 3/4] virtio-net: support transmit ptp timestamps
  2024-06-24 11:09 ` [PATCH RESEND 3/4] virtio-net: support transmit " Steffen Trumtrar
@ 2024-06-25  5:25   ` Parav Pandit
  2024-08-22  9:30     ` Steffen Trumtrar
  0 siblings, 1 reply; 14+ messages in thread
From: Parav Pandit @ 2024-06-25  5:25 UTC (permalink / raw)
  To: Steffen Trumtrar, virtio-comment@lists.linux.dev,
	kernel@pengutronix.de

Hi Steffen,

> From: Steffen Trumtrar <s.trumtrar@pengutronix.de>
> Sent: Monday, June 24, 2024 4:39 PM
> To: virtio-comment@lists.linux.dev; kernel@pengutronix.de
> Cc: Steffen Trumtrar <s.trumtrar@pengutronix.de>
> Subject: [PATCH RESEND 3/4] virtio-net: support transmit ptp timestamps
> 
> Add optional PTP hardware tx timestamp offload for virtio-net.
> 
Very good initiate. Thanks for bringing timestamping to virtio.

> Accurate RTT measurement requires timestamps close to the wire.
> Introduce virtio feature VIRTIO_NET_F_TX_TSTAMP, the transmit equivalent
> to VIRTIO_NET_F_RX_TSTAMP.
> 
> The driver sets VIRTIO_NET_HDR_F_TSTAMP to request a timestamp
> returned on completion. If the feature is negotiated, the device either places
> the timestamp or clears the feature bit.
> 
> Signed-off-by: Steffen Trumtrar <s.trumtrar@pengutronix.de>
> ---
>  device-types/net/description.tex | 19 +++++++++++++++++++
>  1 file changed, 19 insertions(+)
> 
> diff --git a/device-types/net/description.tex b/device-
> types/net/description.tex
> index 09d9c28..c438b0b 100644
> --- a/device-types/net/description.tex
> +++ b/device-types/net/description.tex
> @@ -88,6 +88,8 @@ \subsection{Feature bits}\label{sec:Device Types /
> Network Device / Feature bits  \item[VIRTIO_NET_F_CTRL_MAC_ADDR(23)]
> Set MAC address through control
>      channel.
> 
> +\item[VIRTIO_NET_F_TX_TSTAMP(47)] Device sends TAI transmit time
> +
The feature bit is not helpful.
The reason is, the user needs to enable timestamp much later using ethtool (Linux) SIOCSHWTSTAMP.
And it is desired to not reset the full device when changing this feature.

And when the feature is offered, it is also not desire to always performing timestamping as it has trade-offs.

Therefore, enablement of this functionality should be done using capabilities or other means dynamically.
Dynamic means = without re-initializing the device and features.

>  \item[VIRTIO_NET_F_RX_TSTAMP(48)] Device sends TAI receive time
> 
>  \item[VIRTIO_NET_F_TX_HASH(49)] Driver sends hash report @@ -532,6
> +534,12 @@ \subsubsection{Packet Transmission}\label{sec:Device Types /
> Network Device / De  \item The header and packet are added as one output
> descriptor to the
>    transmitq, and the device is notified of the new entry
>    (see \ref{sec:Device Types / Network Device / Device
> Initialization}~\nameref{sec:Device Types / Network Device / Device
> Initialization}).
> +
> +\item If the driver negotiated the VIRTIO_NET_F_TX_TSTAMP and hardware
> +  timestamping is supported on the device, the
> VIRTIO_NET_HDR_F_TSTAMP
> +  flag is set. One writable input descriptor with the size of a
> +timestamp,
> +  i.e. \field{tstamp} in virtio_net_hdr_hash_ts,
> +  is appended to the output descriptor.
>  \end{enumerate}
> 
This does not work any effectively for following reasons.
1. a TS is 8B, it is pointless to pass another 16B descriptor for 8B of data. This is 200% overhead.
2. a TS write of 8B also creates a cache line bounding due to partial writes even in modern cpus
3. Apart from the short write, because it is a dedicated address, the DMA cannot be merged by the device for it with something else.

In short, a dedicated descriptor for TS timestamp or including this TS in vnet_hdr is not good idea.

Many of us already discuss the new descriptor format that solves many of these issues.
i.e. to have a transmit completion which contains all needed fields including TS that also overcomes above 3 issues.

We captured these requirements at [1]. Please refer to 3.5.1 point 3.
It would be good if you can extend the descriptors this way that can work for hw.

[1] https://lore.kernel.org/virtio-comment/20230818043557.496964-7-parav@nvidia.com/

>  \drivernormative{\paragraph}{Packet Transmission}{Device Types / Network
> Device / Device Operation / Packet Transmission} @@ -675,6 +683,17 @@
> \subsubsection{Packet Transmission}\label{sec:Device Types / Network
> Device / De
>  #define VIRTIO_NET_HASH_REPORT_OTHER          (1 << 11)
>  \end{lstlisting}
> 
> +\paragraph{Timestamping for outgoing packets} \label{sec:Device Types /
> +Network Device / Device Operation / Packet Transmission / Timestamping
> +for outgoing packets}
> +
> +If the feature VIRTIO_NET_F_TX_TSTAMP is negotiated, the driver sets
> \field{hdr_len} to the virtio_net_hdr_hash_ts.
> +
> +The driver sets VIRTIO_NET_HDR_F_STAMP to request a timestamp
> returned on completion in \field{tstamp}.
> +
> +The device gets the second, writable descriptor, generates a timestamp and
> writes it to the \field{tstamp}.
> +
> +Finally, after transmitting the packet, the driver gets the \field{tstamp} from
> the packet.
> +
>  \paragraph{Packet Transmission Interrupt}\label{sec:Device Types / Network
> Device / Device Operation / Packet Transmission / Packet Transmission
> Interrupt}
> 
>  Often a driver will suppress transmission virtqueue interrupts
> 
> --
> 2.42.0
> 


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH RESEND 1/4] virtio-net: support transmit hash report
  2024-06-25  5:07   ` Parav Pandit
@ 2024-07-11 12:11     ` Steffen Trumtrar
  2024-07-29 10:20       ` Parav Pandit
  0 siblings, 1 reply; 14+ messages in thread
From: Steffen Trumtrar @ 2024-07-11 12:11 UTC (permalink / raw)
  To: Parav Pandit; +Cc: virtio-comment@lists.linux.dev, kernel@pengutronix.de


Hi Parav,

On 2024-06-25 at 05:07 GMT, Parav Pandit <parav@nvidia.com> wrote:

> > From: Steffen Trumtrar <s.trumtrar@pengutronix.de>
> > Sent: Monday, June 24, 2024 4:39 PM
> > 
> > Virtio-net supports sharing the flow hash from device to driver on rx.
> > Do the same in the other direction for robust routing and telemetry.
> > 
> > Experimental results mirror what the theory suggests: where IPv6 FlowLabel
> > is included in path selection (e.g., LAG/ECMP), flowlabel rotation on TCP
> > timeout avoids the vast majority of TCP disconnects that would otherwise
> > have occurred during link failures in long-haul backbones, when an
> > alternative path is available.
> >
> Can you please explain how does this related to the virtio-net device on transmit path?
> The value proposed here is only between the driver and device. It wont be part of the packet. Did I understand right?
> If so, which link do you refer to in above description? Virtio-net device link?
>

As I didn't develop that patch myself and I'm a little slow on the virtio-lingo I'm not sure
if *I* understand it right O:-)

So, in the kernel patch the TX hash report is set in the skb that is send with xmit_skb.
Does Qemu change that packet? Qemu is the "device", right?

>  
> > Rotation can be applied to various bad connection signals, such as timeouts
> > and spurious retransmissions. In aggregate, such flow level signals can help
> > locate network issues.
> >
>  
> > Signed-off-by: Steffen Trumtrar <s.trumtrar@pengutronix.de>
> > ---
> >  device-types/net/description.tex | 30 ++++++++++++++++++++++++++++++
> >  1 file changed, 30 insertions(+)
> > 
> > diff --git a/device-types/net/description.tex b/device-
> > types/net/description.tex
> > index 61cce1f..eb2c08e 100644
> > --- a/device-types/net/description.tex
> > +++ b/device-types/net/description.tex
> > @@ -88,6 +88,8 @@ \subsection{Feature bits}\label{sec:Device Types /
> > Network Device / Feature bits  \item[VIRTIO_NET_F_CTRL_MAC_ADDR(23)]
> > Set MAC address through control
> >      channel.
> > 
> > +\item[VIRTIO_NET_F_TX_HASH(49)] Driver sends hash report
> > +
> >  \item[VIRTIO_NET_F_DEVICE_STATS(50)] Device can provide device-level
> > statistics
> >      to the driver through the control virtqueue.
> > 
> > @@ -641,6 +643,34 @@ \subsubsection{Packet
> > Transmission}\label{sec:Device Types / Network Device / De
> > 
> >  If VIRTIO_NET_HDR_F_NEEDS_CSUM is not set, the device MUST NOT  rely
> > on the packet checksum being correct.
> > +
> > +\paragraph{Hash calculation for outgoing packets} \label{sec:Device
> > +Types / Network Device / Device Operation / Packet Transmission / Hash
> > +calculation for outgoing packets }
> > +
> > +If the VIRTIO_NET_F_TX_HASH was negotiated and the packet includes a
> > +hash, the driver uses the structure virtio_net_hdr_hash_ts instead of
> > virtio_net_hdr and increases the \field{hdr_len} accordingly.
> > +
> > +\begin{lstlisting}
> > +  struct virtio_net_hdr_hash_ts {
> > +    struct {
> > +      struct virtio_net_hdr hdr;
> > +      __le33 value;
> > +      __le16 report;
> > +      __le16 flow_state;
> > +    } hash;
> > +    __u32 reserved;
> > +  };
> > +\end{lstlisting}
> > +
> > +The driver fills \field{hash_value} with the value of the calculated hash and
> > \field{hash_report} with the report type of the calculated hash.
> > +
> It would be better to write it as \field{hash.value} because there is no hash_value field.
> 
> Instead of defining new structure, proposed feature bit can reuse the existing hash_value and hash_report fields.
> I guess that’s what you meant in above?
>

I mixed up the linux implementation with the spec.
Thanks, I updated the documentation completely according to your suggestions as I now
understand what is actually happening. 

> 
> > +Possible values that the driver supports in \field{hash_report} are defined
> > below.
> > +
> > +\begin{lstlisting}
> > +#define VIRTIO_NET_HASH_REPORT_L4             (1 << 10)
> > +#define VIRTIO_NET_HASH_REPORT_OTHER          (1 << 11)
> > +\end{lstlisting}
> > +
> Please described _OTHER type of hash.
> 

Yes, will do.


Thanks,
Steffen

-- 
Pengutronix e.K.                | Dipl.-Inform. Steffen Trumtrar |
Steuerwalder Str. 21            | https://www.pengutronix.de/    |
31137 Hildesheim, Germany       | Phone: +49-5121-206917-0       |
Amtsgericht Hildesheim, HRA 2686| Fax:   +49-5121-206917-5555    |

^ permalink raw reply	[flat|nested] 14+ messages in thread

* RE: [PATCH RESEND 1/4] virtio-net: support transmit hash report
  2024-07-11 12:11     ` Steffen Trumtrar
@ 2024-07-29 10:20       ` Parav Pandit
  0 siblings, 0 replies; 14+ messages in thread
From: Parav Pandit @ 2024-07-29 10:20 UTC (permalink / raw)
  To: Steffen Trumtrar; +Cc: virtio-comment@lists.linux.dev, kernel@pengutronix.de


> From: Steffen Trumtrar <s.trumtrar@pengutronix.de>
> Sent: Thursday, July 11, 2024 5:41 PM
> To: Parav Pandit <parav@nvidia.com>
> Cc: virtio-comment@lists.linux.dev; kernel@pengutronix.de
> Subject: Re: [PATCH RESEND 1/4] virtio-net: support transmit hash report
> 
> 
> Hi Parav,
> 
> On 2024-06-25 at 05:07 GMT, Parav Pandit <parav@nvidia.com> wrote:
> 
> > > From: Steffen Trumtrar <s.trumtrar@pengutronix.de>
> > > Sent: Monday, June 24, 2024 4:39 PM
> > >
> > > Virtio-net supports sharing the flow hash from device to driver on rx.
> > > Do the same in the other direction for robust routing and telemetry.
> > >
> > > Experimental results mirror what the theory suggests: where IPv6
> > > FlowLabel is included in path selection (e.g., LAG/ECMP), flowlabel
> > > rotation on TCP timeout avoids the vast majority of TCP disconnects
> > > that would otherwise have occurred during link failures in long-haul
> > > backbones, when an alternative path is available.
> > >
> > Can you please explain how does this related to the virtio-net device on
> transmit path?
> > The value proposed here is only between the driver and device. It wont be
> part of the packet. Did I understand right?
> > If so, which link do you refer to in above description? Virtio-net device link?
> >
> 
> As I didn't develop that patch myself and I'm a little slow on the virtio-lingo I'm
> not sure if *I* understand it right O:-)
> 
> So, in the kernel patch the TX hash report is set in the skb that is send with
> xmit_skb.
> Does Qemu change that packet? Qemu is the "device", right?
>
Not sure how is this question of "changing packet" relevant here.
Or device either..

You initially explained that a flow label present in the ipv6 header used by ECMP.
In your example if was for backbones (and not in a end point NIC)...

So that is already present, why is the hash of a packet needed or relevant to virtio-net?
Is this hash always calculated/present?

I have a disconnect between the problem statement and the proposed solution.
Can you please describe both of them?

> >
> > > Rotation can be applied to various bad connection signals, such as
> > > timeouts and spurious retransmissions. In aggregate, such flow level
> > > signals can help locate network issues.
> > >
> >
> > > Signed-off-by: Steffen Trumtrar <s.trumtrar@pengutronix.de>
> > > ---
> > >  device-types/net/description.tex | 30
> > > ++++++++++++++++++++++++++++++
> > >  1 file changed, 30 insertions(+)
> > >
> > > diff --git a/device-types/net/description.tex b/device-
> > > types/net/description.tex index 61cce1f..eb2c08e 100644
> > > --- a/device-types/net/description.tex
> > > +++ b/device-types/net/description.tex
> > > @@ -88,6 +88,8 @@ \subsection{Feature bits}\label{sec:Device Types /
> > > Network Device / Feature bits
> \item[VIRTIO_NET_F_CTRL_MAC_ADDR(23)]
> > > Set MAC address through control
> > >      channel.
> > >
> > > +\item[VIRTIO_NET_F_TX_HASH(49)] Driver sends hash report
> > > +
> > >  \item[VIRTIO_NET_F_DEVICE_STATS(50)] Device can provide
> > > device-level statistics
> > >      to the driver through the control virtqueue.
> > >
> > > @@ -641,6 +643,34 @@ \subsubsection{Packet
> > > Transmission}\label{sec:Device Types / Network Device / De
> > >
> > >  If VIRTIO_NET_HDR_F_NEEDS_CSUM is not set, the device MUST NOT
> > > rely on the packet checksum being correct.
> > > +
> > > +\paragraph{Hash calculation for outgoing packets} \label{sec:Device
> > > +Types / Network Device / Device Operation / Packet Transmission /
> > > +Hash calculation for outgoing packets }
> > > +
> > > +If the VIRTIO_NET_F_TX_HASH was negotiated and the packet includes
> > > +a hash, the driver uses the structure virtio_net_hdr_hash_ts
> > > +instead of
> > > virtio_net_hdr and increases the \field{hdr_len} accordingly.
> > > +
> > > +\begin{lstlisting}
> > > +  struct virtio_net_hdr_hash_ts {
> > > +    struct {
> > > +      struct virtio_net_hdr hdr;
> > > +      __le33 value;
> > > +      __le16 report;
> > > +      __le16 flow_state;
> > > +    } hash;
> > > +    __u32 reserved;
> > > +  };
> > > +\end{lstlisting}
> > > +
> > > +The driver fills \field{hash_value} with the value of the
> > > +calculated hash and
> > > \field{hash_report} with the report type of the calculated hash.
> > > +
> > It would be better to write it as \field{hash.value} because there is no
> hash_value field.
> >
> > Instead of defining new structure, proposed feature bit can reuse the
> existing hash_value and hash_report fields.
> > I guess that's what you meant in above?
> >
> 
> I mixed up the linux implementation with the spec.
> Thanks, I updated the documentation completely according to your
> suggestions as I now understand what is actually happening.
> 
> >
> > > +Possible values that the driver supports in \field{hash_report} are
> > > +defined
> > > below.
> > > +
> > > +\begin{lstlisting}
> > > +#define VIRTIO_NET_HASH_REPORT_L4             (1 << 10)
> > > +#define VIRTIO_NET_HASH_REPORT_OTHER          (1 << 11)
> > > +\end{lstlisting}
> > > +
> > Please described _OTHER type of hash.
> >
> 
> Yes, will do.
> 
> 
> Thanks,
> Steffen
> 
> --
> Pengutronix e.K.                | Dipl.-Inform. Steffen Trumtrar |
> Steuerwalder Str. 21            |
> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fww
> w.pengutronix.de%2F&data=05%7C02%7Cparav%40nvidia.com%7C4ead5f4
> 482924b64791408dca1a29460%7C43083d15727340c1b7db39efd9ccc17a
> %7C0%7C0%7C638562966877257668%7CUnknown%7CTWFpbGZsb3d8ey
> JWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D
> %7C0%7C%7C%7C&sdata=dTdeYxHxwR%2F82aCvUFh0wQKZQ7P30qXtKD0Z
> hBUYnAA%3D&reserved=0    |
> 31137 Hildesheim, Germany       | Phone: +49-5121-206917-0       |
> Amtsgericht Hildesheim, HRA 2686| Fax:   +49-5121-206917-5555    |

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH RESEND 3/4] virtio-net: support transmit ptp timestamps
  2024-06-25  5:25   ` Parav Pandit
@ 2024-08-22  9:30     ` Steffen Trumtrar
  2024-08-24 13:19       ` Parav Pandit
  0 siblings, 1 reply; 14+ messages in thread
From: Steffen Trumtrar @ 2024-08-22  9:30 UTC (permalink / raw)
  To: Parav Pandit; +Cc: virtio-comment@lists.linux.dev, kernel@pengutronix.de


Hi,

On 2024-06-25 at 05:25 GMT, Parav Pandit <parav@nvidia.com> wrote:

> Hi Steffen,
> 
> > From: Steffen Trumtrar <s.trumtrar@pengutronix.de>
> > Sent: Monday, June 24, 2024 4:39 PM
> > To: virtio-comment@lists.linux.dev; kernel@pengutronix.de
> > Cc: Steffen Trumtrar <s.trumtrar@pengutronix.de>
> > Subject: [PATCH RESEND 3/4] virtio-net: support transmit ptp timestamps
> > 
> > Add optional PTP hardware tx timestamp offload for virtio-net.
> > 
> Very good initiate. Thanks for bringing timestamping to virtio.
> 
> > Accurate RTT measurement requires timestamps close to the wire.
> > Introduce virtio feature VIRTIO_NET_F_TX_TSTAMP, the transmit equivalent
> > to VIRTIO_NET_F_RX_TSTAMP.
> > 
> > The driver sets VIRTIO_NET_HDR_F_TSTAMP to request a timestamp
> > returned on completion. If the feature is negotiated, the device either places
> > the timestamp or clears the feature bit.
> > 
> > Signed-off-by: Steffen Trumtrar <s.trumtrar@pengutronix.de>
> > ---
> >  device-types/net/description.tex | 19 +++++++++++++++++++
> >  1 file changed, 19 insertions(+)
> > 
> > diff --git a/device-types/net/description.tex b/device-
> > types/net/description.tex
> > index 09d9c28..c438b0b 100644
> > --- a/device-types/net/description.tex
> > +++ b/device-types/net/description.tex
> > @@ -88,6 +88,8 @@ \subsection{Feature bits}\label{sec:Device Types /
> > Network Device / Feature bits  \item[VIRTIO_NET_F_CTRL_MAC_ADDR(23)]
> > Set MAC address through control
> >      channel.
> > 
> > +\item[VIRTIO_NET_F_TX_TSTAMP(47)] Device sends TAI transmit time
> > +
> The feature bit is not helpful.
> The reason is, the user needs to enable timestamp much later using ethtool (Linux) SIOCSHWTSTAMP.
> And it is desired to not reset the full device when changing this feature.
> 
> And when the feature is offered, it is also not desire to always performing timestamping as it has trade-offs.
> 
> Therefore, enablement of this functionality should be done using capabilities or other means dynamically.
> Dynamic means = without re-initializing the device and features.

okay, I understand that.

> 
> >  \item[VIRTIO_NET_F_RX_TSTAMP(48)] Device sends TAI receive time
> > 
> >  \item[VIRTIO_NET_F_TX_HASH(49)] Driver sends hash report @@ -532,6
> > +534,12 @@ \subsubsection{Packet Transmission}\label{sec:Device Types /
> > Network Device / De  \item The header and packet are added as one output
> > descriptor to the
> >    transmitq, and the device is notified of the new entry
> >    (see \ref{sec:Device Types / Network Device / Device
> > Initialization}~\nameref{sec:Device Types / Network Device / Device
> > Initialization}).
> > +
> > +\item If the driver negotiated the VIRTIO_NET_F_TX_TSTAMP and hardware
> > +  timestamping is supported on the device, the
> > VIRTIO_NET_HDR_F_TSTAMP
> > +  flag is set. One writable input descriptor with the size of a
> > +timestamp,
> > +  i.e. \field{tstamp} in virtio_net_hdr_hash_ts,
> > +  is appended to the output descriptor.
> >  \end{enumerate}
> > 
> This does not work any effectively for following reasons.
> 1. a TS is 8B, it is pointless to pass another 16B descriptor for 8B of data. This is 200% overhead.
> 2. a TS write of 8B also creates a cache line bounding due to partial writes even in modern cpus
> 3. Apart from the short write, because it is a dedicated address, the DMA cannot be merged by the device for it with something else.
> 
> In short, a dedicated descriptor for TS timestamp or including this TS in vnet_hdr is not good idea.

And this sounds reasonable, too. Understood.
> 
> Many of us already discuss the new descriptor format that solves many of these issues.
> i.e. to have a transmit completion which contains all needed fields including TS that also overcomes above 3 issues.
> 
> We captured these requirements at [1]. Please refer to 3.5.1 point 3.
> It would be good if you can extend the descriptors this way that can work for hw.
> 
> [1] https://lore.kernel.org/virtio-comment/20230818043557.496964-7-parav@nvidia.com/
> 

Here I'm a little bit confused. So you want me to describe the feature in terms of the vnet_rx_completion or vnet_tx_completion that is, which is not described yet? For the spec part this seems "easy" but the actual implementation sounds a "little bit" more involved.

Do you know, if there are already patches floating around for virtio-net that implement this new specification. I'd like to test it in code before writing the documentation/spec. 


Best regards,
Steffen

-- 
Pengutronix e.K.                | Dipl.-Inform. Steffen Trumtrar |
Steuerwalder Str. 21            | https://www.pengutronix.de/    |
31137 Hildesheim, Germany       | Phone: +49-5121-206917-0       |
Amtsgericht Hildesheim, HRA 2686| Fax:   +49-5121-206917-5555    |

^ permalink raw reply	[flat|nested] 14+ messages in thread

* RE: [PATCH RESEND 3/4] virtio-net: support transmit ptp timestamps
  2024-08-22  9:30     ` Steffen Trumtrar
@ 2024-08-24 13:19       ` Parav Pandit
  0 siblings, 0 replies; 14+ messages in thread
From: Parav Pandit @ 2024-08-24 13:19 UTC (permalink / raw)
  To: Steffen Trumtrar; +Cc: virtio-comment@lists.linux.dev, kernel@pengutronix.de



> From: Steffen Trumtrar <s.trumtrar@pengutronix.de>
> Sent: Thursday, August 22, 2024 3:00 PM
> 
> Hi,
> 
> On 2024-06-25 at 05:25 GMT, Parav Pandit <parav@nvidia.com> wrote:
> 
> > Hi Steffen,
> >
> > > From: Steffen Trumtrar <s.trumtrar@pengutronix.de>
> > > Sent: Monday, June 24, 2024 4:39 PM
> > > To: virtio-comment@lists.linux.dev; kernel@pengutronix.de
> > > Cc: Steffen Trumtrar <s.trumtrar@pengutronix.de>
> > > Subject: [PATCH RESEND 3/4] virtio-net: support transmit ptp
> > > timestamps
> > >
> > > Add optional PTP hardware tx timestamp offload for virtio-net.
> > >
> > Very good initiate. Thanks for bringing timestamping to virtio.
> >
> > > Accurate RTT measurement requires timestamps close to the wire.
> > > Introduce virtio feature VIRTIO_NET_F_TX_TSTAMP, the transmit
> > > equivalent to VIRTIO_NET_F_RX_TSTAMP.
> > >
> > > The driver sets VIRTIO_NET_HDR_F_TSTAMP to request a timestamp
> > > returned on completion. If the feature is negotiated, the device
> > > either places the timestamp or clears the feature bit.
> > >
> > > Signed-off-by: Steffen Trumtrar <s.trumtrar@pengutronix.de>
> > > ---
> > >  device-types/net/description.tex | 19 +++++++++++++++++++
> > >  1 file changed, 19 insertions(+)
> > >
> > > diff --git a/device-types/net/description.tex b/device-
> > > types/net/description.tex index 09d9c28..c438b0b 100644
> > > --- a/device-types/net/description.tex
> > > +++ b/device-types/net/description.tex
> > > @@ -88,6 +88,8 @@ \subsection{Feature bits}\label{sec:Device Types /
> > > Network Device / Feature bits
> \item[VIRTIO_NET_F_CTRL_MAC_ADDR(23)]
> > > Set MAC address through control
> > >      channel.
> > >
> > > +\item[VIRTIO_NET_F_TX_TSTAMP(47)] Device sends TAI transmit time
> > > +
> > The feature bit is not helpful.
> > The reason is, the user needs to enable timestamp much later using ethtool
> (Linux) SIOCSHWTSTAMP.
> > And it is desired to not reset the full device when changing this feature.
> >
> > And when the feature is offered, it is also not desire to always performing
> timestamping as it has trade-offs.
> >
> > Therefore, enablement of this functionality should be done using
> capabilities or other means dynamically.
> > Dynamic means = without re-initializing the device and features.
> 
> okay, I understand that.
> 
> >
> > >  \item[VIRTIO_NET_F_RX_TSTAMP(48)] Device sends TAI receive time
> > >
> > >  \item[VIRTIO_NET_F_TX_HASH(49)] Driver sends hash report @@ -532,6
> > > +534,12 @@ \subsubsection{Packet Transmission}\label{sec:Device
> > > +Types /
> > > Network Device / De  \item The header and packet are added as one
> > > output descriptor to the
> > >    transmitq, and the device is notified of the new entry
> > >    (see \ref{sec:Device Types / Network Device / Device
> > > Initialization}~\nameref{sec:Device Types / Network Device / Device
> > > Initialization}).
> > > +
> > > +\item If the driver negotiated the VIRTIO_NET_F_TX_TSTAMP and
> > > +hardware
> > > +  timestamping is supported on the device, the
> > > VIRTIO_NET_HDR_F_TSTAMP
> > > +  flag is set. One writable input descriptor with the size of a
> > > +timestamp,
> > > +  i.e. \field{tstamp} in virtio_net_hdr_hash_ts,
> > > +  is appended to the output descriptor.
> > >  \end{enumerate}
> > >
> > This does not work any effectively for following reasons.
> > 1. a TS is 8B, it is pointless to pass another 16B descriptor for 8B of data.
> This is 200% overhead.
> > 2. a TS write of 8B also creates a cache line bounding due to partial
> > writes even in modern cpus 3. Apart from the short write, because it is a
> dedicated address, the DMA cannot be merged by the device for it with
> something else.
> >
> > In short, a dedicated descriptor for TS timestamp or including this TS in
> vnet_hdr is not good idea.
> 
> And this sounds reasonable, too. Understood.
> >
> > Many of us already discuss the new descriptor format that solves many of
> these issues.
> > i.e. to have a transmit completion which contains all needed fields
> including TS that also overcomes above 3 issues.
> >
> > We captured these requirements at [1]. Please refer to 3.5.1 point 3.
> > It would be good if you can extend the descriptors this way that can work
> for hw.
> >
> > [1] https://lore.kernel.org/virtio-comment/20230818043557.496964-7-parav@nvidia.com/
> >
> 
> Here I'm a little bit confused. So you want me to describe the feature in
> terms of the vnet_rx_completion or vnet_tx_completion that is, which is not
> described yet? For the spec part this seems "easy" but the actual
> implementation sounds a "little bit" more involved.
> 
Yes. 

> Do you know, if there are already patches floating around for virtio-net that
> implement this new specification. I'd like to test it in code before writing the
> documentation/spec.
>
We are currently implementing this on a programmable device internally and witness around 2.5x to 3x packet rate nearly 50% less compute.
You probably will have to do this on some other device or sw platform.
 
> 
> Best regards,
> Steffen
> 
> --
> Pengutronix e.K.                | Dipl.-Inform. Steffen Trumtrar |
> Steuerwalder Str. 21            |
> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fwww
> .pengutronix.de%2F&data=05%7C02%7Cparav%40nvidia.com%7C7d14a44b6
> 71a4cb63f0c08dcc28d08e9%7C43083d15727340c1b7db39efd9ccc17a%7C0%
> 7C0%7C638599158238183520%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4
> wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C0%7C%
> 7C%7C&sdata=kN9ykIXcxHnmrhtXcrsOsPr2itmu0BQ2KpSyYOKRlok%3D&reser
> ved=0    |
> 31137 Hildesheim, Germany       | Phone: +49-5121-206917-0       |
> Amtsgericht Hildesheim, HRA 2686| Fax:   +49-5121-206917-5555    |

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH RESEND 2/4] virtio-net: support receive ptp timestamps
  2024-06-24 11:09 ` [PATCH RESEND 2/4] virtio-net: support receive ptp timestamps Steffen Trumtrar
@ 2024-11-06  8:51   ` Xuan Zhuo
  2024-11-06  9:00     ` Parav Pandit
  0 siblings, 1 reply; 14+ messages in thread
From: Xuan Zhuo @ 2024-11-06  8:51 UTC (permalink / raw)
  To: Steffen Trumtrar; +Cc: Steffen Trumtrar, virtio-comment, kernel

On Mon, 24 Jun 2024 13:09:05 +0200, Steffen Trumtrar <s.trumtrar@pengutronix.de> wrote:
> Accurate RTT measurement requires timestamps close to the wire.
> Introduce virtio feature VIRTIO_NET_F_RX_TSTAMP. If negotiated, the
> virtio-net header is expanded with room for a timestamp.
>
> A device may pass receive timestamps for all or some packets. Flag
> VIRTIO_NET_HDR_F_TSTAMP signals whether a timestamp is recorded.
>
> Signed-off-by: Steffen Trumtrar <s.trumtrar@pengutronix.de>
> ---
>  device-types/net/description.tex | 11 +++++++++++
>  1 file changed, 11 insertions(+)
>
> diff --git a/device-types/net/description.tex b/device-types/net/description.tex
> index eb2c08e..09d9c28 100644
> --- a/device-types/net/description.tex
> +++ b/device-types/net/description.tex
> @@ -88,6 +88,8 @@ \subsection{Feature bits}\label{sec:Device Types / Network Device / Feature bits
>  \item[VIRTIO_NET_F_CTRL_MAC_ADDR(23)] Set MAC address through control
>      channel.
>
> +\item[VIRTIO_NET_F_RX_TSTAMP(48)] Device sends TAI receive time
> +
>  \item[VIRTIO_NET_F_TX_HASH(49)] Driver sends hash report
>
>  \item[VIRTIO_NET_F_DEVICE_STATS(50)] Device can provide device-level statistics
> @@ -415,6 +417,7 @@ \subsection{Device Operation}\label{sec:Device Types / Network Device / Device O
>  #define VIRTIO_NET_HDR_F_NEEDS_CSUM    1
>  #define VIRTIO_NET_HDR_F_DATA_VALID    2
>  #define VIRTIO_NET_HDR_F_RSC_INFO      4
> +#define VIRTIO_NET_HDR_F_TSTAMP        8
>          u8 flags;
>  #define VIRTIO_NET_HDR_GSO_NONE        0
>  #define VIRTIO_NET_HDR_GSO_TCPV4       1
> @@ -659,6 +662,7 @@ \subsubsection{Packet Transmission}\label{sec:Device Types / Network Device / De
>        __le16 flow_state;
>      } hash;
>      __u32 reserved;
> +    __le64 tstamp;
>    };
>  \end{lstlisting}
>
> @@ -1177,6 +1181,13 @@ \subsubsection{Processing of Incoming Packets}\label{sec:Device Types / Network
>  #define VIRTIO_NET_HASH_REPORT_UDPv6_EX        9
>  \end{lstlisting}
>
> +\paragraph{Timestamping for incoming packets}
> +\label{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Timestamping for incoming packets}
> +
> +If the feature VIRTIO_NET_F_RX_TSTAMP is negotiated, the \field{hdr_len} is set to the size of the virtio_net_hdr_hash_ts struct.
> +The device may pass receive timestamps for incoming packets via \field{tstamp} and signals this with the
> +VIRTIO_NET_HDR_F_TSTAMP flag.
> +
>  \subsubsection{Control Virtqueue}\label{sec:Device Types / Network Device / Device Operation / Control Virtqueue}
>
>  The driver uses the control virtqueue (if VIRTIO_NET_F_CTRL_VQ is


This has not been updated for quite some time, but for us, the RX HW timestamp
feature has always been crucial. The TX HW timestamp feature, however, has not
progressed due to reasons we all know about.

Could we make a compromise in this area? First, let's complete the RX HW
timestamp feature. For us, the RX HW timestamp is much more important than the
TX HW timestamp.

@Jason @Michael @Parav

Thanks.


>
> --
> 2.42.0
>
>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* RE: [PATCH RESEND 2/4] virtio-net: support receive ptp timestamps
  2024-11-06  8:51   ` Xuan Zhuo
@ 2024-11-06  9:00     ` Parav Pandit
  0 siblings, 0 replies; 14+ messages in thread
From: Parav Pandit @ 2024-11-06  9:00 UTC (permalink / raw)
  To: Xuan Zhuo, Steffen Trumtrar
  Cc: Steffen Trumtrar, virtio-comment@lists.linux.dev,
	kernel@pengutronix.de



> From: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> Sent: Wednesday, November 6, 2024 2:22 PM
> 
> On Mon, 24 Jun 2024 13:09:05 +0200, Steffen Trumtrar
> <s.trumtrar@pengutronix.de> wrote:
> > Accurate RTT measurement requires timestamps close to the wire.
> > Introduce virtio feature VIRTIO_NET_F_RX_TSTAMP. If negotiated, the
> > virtio-net header is expanded with room for a timestamp.
> >
> > A device may pass receive timestamps for all or some packets. Flag
> > VIRTIO_NET_HDR_F_TSTAMP signals whether a timestamp is recorded.
> >
> > Signed-off-by: Steffen Trumtrar <s.trumtrar@pengutronix.de>
> > ---
> >  device-types/net/description.tex | 11 +++++++++++
> >  1 file changed, 11 insertions(+)
> >
> > diff --git a/device-types/net/description.tex
> > b/device-types/net/description.tex
> > index eb2c08e..09d9c28 100644
> > --- a/device-types/net/description.tex
> > +++ b/device-types/net/description.tex
> > @@ -88,6 +88,8 @@ \subsection{Feature bits}\label{sec:Device Types /
> > Network Device / Feature bits  \item[VIRTIO_NET_F_CTRL_MAC_ADDR(23)]
> Set MAC address through control
> >      channel.
> >
> > +\item[VIRTIO_NET_F_RX_TSTAMP(48)] Device sends TAI receive time
> > +
> >  \item[VIRTIO_NET_F_TX_HASH(49)] Driver sends hash report
> >
> >  \item[VIRTIO_NET_F_DEVICE_STATS(50)] Device can provide device-level
> > statistics @@ -415,6 +417,7 @@ \subsection{Device
> Operation}\label{sec:Device Types / Network Device / Device O
> >  #define VIRTIO_NET_HDR_F_NEEDS_CSUM    1
> >  #define VIRTIO_NET_HDR_F_DATA_VALID    2
> >  #define VIRTIO_NET_HDR_F_RSC_INFO      4
> > +#define VIRTIO_NET_HDR_F_TSTAMP        8
> >          u8 flags;
> >  #define VIRTIO_NET_HDR_GSO_NONE        0
> >  #define VIRTIO_NET_HDR_GSO_TCPV4       1
> > @@ -659,6 +662,7 @@ \subsubsection{Packet
> Transmission}\label{sec:Device Types / Network Device / De
> >        __le16 flow_state;
> >      } hash;
> >      __u32 reserved;
> > +    __le64 tstamp;
> >    };
> >  \end{lstlisting}
> >
> > @@ -1177,6 +1181,13 @@ \subsubsection{Processing of Incoming
> Packets}\label{sec:Device Types / Network
> >  #define VIRTIO_NET_HASH_REPORT_UDPv6_EX        9
> >  \end{lstlisting}
> >
> > +\paragraph{Timestamping for incoming packets} \label{sec:Device Types
> > +/ Network Device / Device Operation / Processing of Incoming Packets
> > +/ Timestamping for incoming packets}
> > +
> > +If the feature VIRTIO_NET_F_RX_TSTAMP is negotiated, the \field{hdr_len}
> is set to the size of the virtio_net_hdr_hash_ts struct.
> > +The device may pass receive timestamps for incoming packets via
> > +\field{tstamp} and signals this with the VIRTIO_NET_HDR_F_TSTAMP flag.
> > +
> >  \subsubsection{Control Virtqueue}\label{sec:Device Types / Network
> > Device / Device Operation / Control Virtqueue}
> >
> >  The driver uses the control virtqueue (if VIRTIO_NET_F_CTRL_VQ is
> 
> 
> This has not been updated for quite some time, but for us, the RX HW
> timestamp feature has always been crucial. The TX HW timestamp feature,
> however, has not progressed due to reasons we all know about.
> 
> Could we make a compromise in this area? First, let's complete the RX HW
> timestamp feature. For us, the RX HW timestamp is much more important
> than the TX HW timestamp.
> 
> @Jason @Michael @Parav
> 
> Thanks.
> 
Having rx timestamp in vnet header can be a good enough.

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2024-11-06  9:00 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-06-24 11:09 [PATCH RESEND 0/4] virtio-net: add tx-hash, rx/tx-tstamp and tx-time Steffen Trumtrar
2024-06-24 11:09 ` [PATCH RESEND 1/4] virtio-net: support transmit hash report Steffen Trumtrar
2024-06-24 11:57   ` Michael S. Tsirkin
2024-06-25  5:07   ` Parav Pandit
2024-07-11 12:11     ` Steffen Trumtrar
2024-07-29 10:20       ` Parav Pandit
2024-06-24 11:09 ` [PATCH RESEND 2/4] virtio-net: support receive ptp timestamps Steffen Trumtrar
2024-11-06  8:51   ` Xuan Zhuo
2024-11-06  9:00     ` Parav Pandit
2024-06-24 11:09 ` [PATCH RESEND 3/4] virtio-net: support transmit " Steffen Trumtrar
2024-06-25  5:25   ` Parav Pandit
2024-08-22  9:30     ` Steffen Trumtrar
2024-08-24 13:19       ` Parav Pandit
2024-06-24 11:09 ` [PATCH RESEND 4/4] virtio-net: support future packet transmit time Steffen Trumtrar

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.