netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH RFC 0/4] virtio-net: add tx-hash, rx-tstamp, tx-tstamp and tx-time
@ 2023-12-18 11:37 Steffen Trumtrar
  2023-12-18 11:37 ` [PATCH RFC 1/4] virtio-net: support transmit hash report Steffen Trumtrar
                   ` (4 more replies)
  0 siblings, 5 replies; 9+ messages in thread
From: Steffen Trumtrar @ 2023-12-18 11:37 UTC (permalink / raw)
  To: Michael S. Tsirkin, Jason Wang, Xuan Zhuo, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Richard Cochran
  Cc: virtualization, netdev, linux-kernel, Willem de Bruijn

This series tries to pick up the work on the virtio-net timestamping
feature from Willem de Bruijn.

Original series
    Message-Id: 20210208185558.995292-1-willemdebruijn.kernel@gmail.com
    Subject: [PATCH RFC v2 0/4] virtio-net: add tx-hash, rx-tstamp,
    tx-tstamp and tx-time
    From: Willem de Bruijn <willemb@google.com>

    RFC for four new features to the virtio network device:

    1. pass tx flow state to host, for routing + telemetry
    2. pass rx tstamp to guest, for better RTT estimation
    3. pass tx tstamp to guest, idem
    3. pass tx delivery time to host, for accurate pacing

    All would introduce an extension to the virtio spec.

The original series consisted of a hack around the DMA API, which should
be fixed in this series.

The changes in this series are to the driver side. For the changes to qemu see:
    https://github.com/strumtrar/qemu/tree/v8.1.1/virtio-net-ptp

Currently only virtio-net is supported. The original series used
vhost-net as backend. However, the path through tun via sendmsg doesn't
allow us to write data back to the driver side without any hacks.
Therefore use the way via plain virtio-net without vhost albeit better
performance.

Signed-off-by: Steffen Trumtrar <s.trumtrar@pengutronix.de>
---
Willem de Bruijn (4):
      virtio-net: support transmit hash report
      virtio-net: support receive timestamp
      virtio-net: support transmit timestamp
      virtio-net: support future packet transmit time

 drivers/net/virtio_net.c        | 211 ++++++++++++++++++++++++++++++++++++++--
 include/uapi/linux/virtio_net.h |  24 ++++-
 2 files changed, 228 insertions(+), 7 deletions(-)
---
base-commit: b85ea95d086471afb4ad062012a4d73cd328fa86
change-id: 20231218-v6-7-topic-virtio-net-ptp-3df023bc4f4d

Best regards,
-- 
Steffen Trumtrar <s.trumtrar@pengutronix.de>


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH RFC 1/4] virtio-net: support transmit hash report
  2023-12-18 11:37 [PATCH RFC 0/4] virtio-net: add tx-hash, rx-tstamp, tx-tstamp and tx-time Steffen Trumtrar
@ 2023-12-18 11:37 ` Steffen Trumtrar
  2023-12-19 18:36   ` Michael S. Tsirkin
  2023-12-18 11:37 ` [PATCH RFC 2/4] virtio-net: support receive timestamp Steffen Trumtrar
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 9+ messages in thread
From: Steffen Trumtrar @ 2023-12-18 11:37 UTC (permalink / raw)
  To: Michael S. Tsirkin, Jason Wang, Xuan Zhuo, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Richard Cochran
  Cc: virtualization, netdev, linux-kernel, Willem de Bruijn

From: Willem de Bruijn <willemb@google.com>

Virtio-net supports sharing the flow hash from device to driver on rx.
Do the same in the other direction for robust routing and telemetry.

Linux derives ipv6 flowlabel and ECMP multipath from sk->sk_txhash,
and updates this field on error with sk_rethink_txhash. Allow the host
stack to do the same.

Concrete examples of error conditions that are resolved are
mentioned in the commits that add sk_rethink_txhash calls. Such as
commit 7788174e8726 ("tcp: change IPv6 flow-label upon receiving
spurious retransmission").

Experimental results mirror what the theory suggests: where IPv6
FlowLabel is included in path selection (e.g., LAG/ECMP), flowlabel
rotation on TCP timeout avoids the vast majority of TCP disconnects
that would otherwise have occurred during link failures in long-haul
backbones, when an alternative path is available.

Rotation can be applied to various bad connection signals, such as
timeouts and spurious retransmissions. In aggregate, such flow level
signals can help locate network issues. Reserve field hash_state to
share this info. For now, always set VIRTIO_NET_HASH_STATE_DEFAULT.
Passing information between TCP stack and driver is future work.

Changes RFC->RFCv2
  - drop unused VIRTIO_NET_HASH_STATE_TIMEOUT_BIT
  - convert from cpu_to_virtioXX to cpu_to_leXX

Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Steffen Trumtrar <s.trumtrar@pengutronix.de>
---
 drivers/net/virtio_net.c        | 26 +++++++++++++++++++++++---
 include/uapi/linux/virtio_net.h |  9 ++++++++-
 2 files changed, 31 insertions(+), 4 deletions(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index d16f592c2061f..8c9de31af8942 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -271,6 +271,9 @@ struct virtnet_info {
 	u32 rss_hash_types_supported;
 	u32 rss_hash_types_saved;
 
+	/* Driver will pass tx path info to the device */
+	bool has_tx_hash;
+
 	/* Has control virtqueue */
 	bool has_cvq;
 
@@ -512,9 +515,9 @@ static struct sk_buff *page_to_skb(struct virtnet_info *vi,
 
 	hdr_len = vi->hdr_len;
 	if (vi->mergeable_rx_bufs)
-		hdr_padded_len = hdr_len;
+		hdr_padded_len = max_t(unsigned int, hdr_len, sizeof(*hdr));
 	else
-		hdr_padded_len = sizeof(struct padded_vnet_hdr);
+		hdr_padded_len = ALIGN(hdr_len, 16);
 
 	buf = p - headroom;
 	len -= hdr_len;
@@ -2282,6 +2285,7 @@ static int xmit_skb(struct send_queue *sq, struct sk_buff *skb)
 	struct virtio_net_hdr_mrg_rxbuf *hdr;
 	const unsigned char *dest = ((struct ethhdr *)skb->data)->h_dest;
 	struct virtnet_info *vi = sq->vq->vdev->priv;
+	struct virtio_net_hdr_v1_hash *ht;
 	int num_sg;
 	unsigned hdr_len = vi->hdr_len;
 	bool can_push;
@@ -2306,6 +2310,16 @@ static int xmit_skb(struct send_queue *sq, struct sk_buff *skb)
 	if (vi->mergeable_rx_bufs)
 		hdr->num_buffers = 0;
 
+	ht = (void *)hdr;
+	if (vi->has_tx_hash) {
+		u16 report = skb->l4_hash ? VIRTIO_NET_HASH_REPORT_L4 :
+					    VIRTIO_NET_HASH_REPORT_OTHER;
+
+		ht->hash_value = cpu_to_le32(skb->hash);
+		ht->hash_report = cpu_to_le16(report);
+		ht->hash_state = cpu_to_le16(VIRTIO_NET_HASH_STATE_DEFAULT);
+	}
+
 	sg_init_table(sq->sg, skb_shinfo(skb)->nr_frags + (can_push ? 1 : 2));
 	if (can_push) {
 		__skb_push(skb, hdr_len);
@@ -4511,6 +4525,11 @@ static int virtnet_probe(struct virtio_device *vdev)
 	else
 		vi->hdr_len = sizeof(struct virtio_net_hdr);
 
+	if (virtio_has_feature(vi->vdev, VIRTIO_NET_F_TX_HASH)) {
+		vi->has_tx_hash = true;
+		vi->hdr_len = sizeof(struct virtio_net_hdr_v1_hash);
+	}
+
 	if (virtio_has_feature(vdev, VIRTIO_F_ANY_LAYOUT) ||
 	    virtio_has_feature(vdev, VIRTIO_F_VERSION_1))
 		vi->any_header_sg = true;
@@ -4753,7 +4772,8 @@ static struct virtio_device_id id_table[] = {
 	VIRTIO_NET_F_SPEED_DUPLEX, VIRTIO_NET_F_STANDBY, \
 	VIRTIO_NET_F_RSS, VIRTIO_NET_F_HASH_REPORT, VIRTIO_NET_F_NOTF_COAL, \
 	VIRTIO_NET_F_VQ_NOTF_COAL, \
-	VIRTIO_NET_F_GUEST_HDRLEN
+	VIRTIO_NET_F_GUEST_HDRLEN, \
+	VIRTIO_NET_F_TX_HASH
 
 static unsigned int features[] = {
 	VIRTNET_FEATURES,
diff --git a/include/uapi/linux/virtio_net.h b/include/uapi/linux/virtio_net.h
index cc65ef0f3c3e2..698a11f8c6ab9 100644
--- a/include/uapi/linux/virtio_net.h
+++ b/include/uapi/linux/virtio_net.h
@@ -56,6 +56,7 @@
 #define VIRTIO_NET_F_MQ	22	/* Device supports Receive Flow
 					 * Steering */
 #define VIRTIO_NET_F_CTRL_MAC_ADDR 23	/* Set MAC address */
+#define VIRTIO_NET_F_TX_HASH	  51	/* Driver sends hash report */
 #define VIRTIO_NET_F_VQ_NOTF_COAL 52	/* Device supports virtqueue notification coalescing */
 #define VIRTIO_NET_F_NOTF_COAL	53	/* Device supports notifications coalescing */
 #define VIRTIO_NET_F_GUEST_USO4	54	/* Guest can handle USOv4 in. */
@@ -176,8 +177,14 @@ struct virtio_net_hdr_v1_hash {
 #define VIRTIO_NET_HASH_REPORT_IPv6_EX         7
 #define VIRTIO_NET_HASH_REPORT_TCPv6_EX        8
 #define VIRTIO_NET_HASH_REPORT_UDPv6_EX        9
+#define VIRTIO_NET_HASH_REPORT_L4              10
+#define VIRTIO_NET_HASH_REPORT_OTHER           11
 	__le16 hash_report;
-	__le16 padding;
+	union {
+		__le16 padding;
+#define VIRTIO_NET_HASH_STATE_DEFAULT          0
+		__le16 hash_state;
+	};
 };
 
 #ifndef VIRTIO_NET_NO_LEGACY

-- 
2.42.0


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH RFC 2/4] virtio-net: support receive timestamp
  2023-12-18 11:37 [PATCH RFC 0/4] virtio-net: add tx-hash, rx-tstamp, tx-tstamp and tx-time Steffen Trumtrar
  2023-12-18 11:37 ` [PATCH RFC 1/4] virtio-net: support transmit hash report Steffen Trumtrar
@ 2023-12-18 11:37 ` Steffen Trumtrar
  2023-12-18 11:37 ` [PATCH RFC 3/4] virtio-net: support transmit timestamp Steffen Trumtrar
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 9+ messages in thread
From: Steffen Trumtrar @ 2023-12-18 11:37 UTC (permalink / raw)
  To: Michael S. Tsirkin, Jason Wang, Xuan Zhuo, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Richard Cochran
  Cc: virtualization, netdev, linux-kernel, Willem de Bruijn

From: Willem de Bruijn <willemb@google.com>

Add optional PTP hardware rx timestamp offload for virtio-net.

Accurate RTT measurement requires timestamps close to the wire.
Introduce virtio feature VIRTIO_NET_F_RX_TSTAMP. If negotiated, the
virtio-net header is expanded with room for a timestamp.

A device may pass receive timestamps for all or some packets. Flag
VIRTIO_NET_HDR_F_TSTAMP signals whether a timestamp is recorded.

A driver that supports hardware timestamping must also support
ioctl SIOCSHWTSTAMP. Implement that, as well as information getters
ioctl SIOCGHWTSTAMP and ethtool get_ts_info (`ethtool -T $DEV`).

The timestamp straddles (virtual) hardware domains. Like PTP, use
international atomic time (CLOCK_TAI) as global clock base. The driver
must sync with the device, e.g., through kvm-clock.

Tested:
  guest: ./timestamping eth0 \
          SOF_TIMESTAMPING_RAW_HARDWARE \
          SOF_TIMESTAMPING_RX_HARDWARE
  host: nc -4 -u 192.168.1.1 319

Changes RFC -> RFCv2
  - rename virtio_net_hdr_v12 to virtio_net_hdr_hash_ts
  - add ethtool .get_ts_info to query capabilities
  - add ioctl SIOC[GS]HWTSTAMP to configure feature
  - add vi->enable_rx_tstamp to store configuration
  - convert virtioXX_to_cpu to leXX_to_cpu
  - convert reserved to __u32

Signed-off-by: Willem de Bruijn <willemb@google.com>
Co-developed-by: Steffen Trumtrar <s.trumtrar@pengutronix.de>
Signed-off-by: Steffen Trumtrar <s.trumtrar@pengutronix.de>

--
  Changes to original RFCv2
  - move virtio_net_hdr_v1 into hash-struct
---
 drivers/net/virtio_net.c        | 113 +++++++++++++++++++++++++++++++++++++++-
 include/uapi/linux/virtio_net.h |  13 +++++
 2 files changed, 124 insertions(+), 2 deletions(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 8c9de31af8942..9bb6cdaf49cc9 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -274,6 +274,12 @@ struct virtnet_info {
 	/* Driver will pass tx path info to the device */
 	bool has_tx_hash;
 
+	/* Device can pass CLOCK_TAI receive time to the driver */
+	bool has_rx_tstamp;
+
+	/* Device will pass rx timestamp. Requires has_rx_tstamp */
+	bool enable_rx_tstamp;
+
 	/* Has control virtqueue */
 	bool has_cvq;
 
@@ -384,6 +390,13 @@ skb_vnet_common_hdr(struct sk_buff *skb)
 	return (struct virtio_net_common_hdr *)skb->cb;
 }
 
+static inline struct virtio_net_hdr_hash_ts *skb_vnet_hdr_ht(struct sk_buff *skb)
+{
+	BUILD_BUG_ON(sizeof(struct virtio_net_hdr_hash_ts) > sizeof(skb->cb));
+
+	return (void *)skb->cb;
+}
+
 /*
  * private is used to chain pages for big packets, put the whole
  * most recent used list in the beginning for reuse
@@ -1755,6 +1768,19 @@ static void virtio_skb_set_hash(const struct virtio_net_hdr_v1_hash *hdr_hash,
 	skb_set_hash(skb, __le32_to_cpu(hdr_hash->hash_value), rss_hash_type);
 }
 
+static inline void virtnet_record_rx_tstamp(const struct virtnet_info *vi,
+					    struct sk_buff *skb)
+{
+	const struct virtio_net_hdr_hash_ts *h = skb_vnet_hdr_ht(skb);
+
+	if (h->hash.hdr.flags & VIRTIO_NET_HDR_F_TSTAMP &&
+	    vi->enable_rx_tstamp) {
+		u64 ts = le64_to_cpu(h->tstamp);
+
+		skb_hwtstamps(skb)->hwtstamp = ns_to_ktime(ts);
+	}
+}
+
 static void receive_buf(struct virtnet_info *vi, struct receive_queue *rq,
 			void *buf, unsigned int len, void **ctx,
 			unsigned int *xdp_xmit,
@@ -1797,6 +1823,7 @@ static void receive_buf(struct virtnet_info *vi, struct receive_queue *rq,
 		goto frame_err;
 	}
 
+	virtnet_record_rx_tstamp(vi, skb);
 	skb_record_rx_queue(skb, vq2rxq(rq->vq));
 	skb->protocol = eth_type_trans(skb, dev);
 	pr_debug("Receiving skb proto 0x%04x len %i type %i\n",
@@ -3512,6 +3539,28 @@ static int virtnet_get_per_queue_coalesce(struct net_device *dev,
 	return 0;
 }
 
+static int virtnet_get_ts_info(struct net_device *dev,
+			       struct ethtool_ts_info *info)
+{
+	struct virtnet_info *vi = netdev_priv(dev);
+
+	/* setup default software timestamp */
+	ethtool_op_get_ts_info(dev, info);
+
+	/* return rx capabilities (which may differ from current enable) */
+	if (vi->has_rx_tstamp) {
+		info->so_timestamping |= SOF_TIMESTAMPING_RX_HARDWARE |
+					 SOF_TIMESTAMPING_RAW_HARDWARE;
+		info->rx_filters = HWTSTAMP_FILTER_ALL;
+	} else {
+		info->rx_filters = HWTSTAMP_FILTER_NONE;
+	}
+
+	info->tx_types = HWTSTAMP_TX_OFF;
+
+	return 0;
+}
+
 static void virtnet_init_settings(struct net_device *dev)
 {
 	struct virtnet_info *vi = netdev_priv(dev);
@@ -3637,7 +3686,7 @@ static const struct ethtool_ops virtnet_ethtool_ops = {
 	.get_ethtool_stats = virtnet_get_ethtool_stats,
 	.set_channels = virtnet_set_channels,
 	.get_channels = virtnet_get_channels,
-	.get_ts_info = ethtool_op_get_ts_info,
+	.get_ts_info = virtnet_get_ts_info,
 	.get_link_ksettings = virtnet_get_link_ksettings,
 	.set_link_ksettings = virtnet_set_link_ksettings,
 	.set_coalesce = virtnet_set_coalesce,
@@ -3926,6 +3975,60 @@ static void virtnet_tx_timeout(struct net_device *dev, unsigned int txqueue)
 		   jiffies_to_usecs(jiffies - READ_ONCE(txq->trans_start)));
 }
 
+static int virtnet_ioctl_set_hwtstamp(struct net_device *dev, struct ifreq *ifr)
+{
+	struct virtnet_info *vi = netdev_priv(dev);
+	struct hwtstamp_config tsconf;
+
+	if (copy_from_user(&tsconf, ifr->ifr_data, sizeof(tsconf)))
+		return -EFAULT;
+	if (tsconf.flags)
+		return -EINVAL;
+	if (tsconf.tx_type != HWTSTAMP_TX_OFF)
+		return -ERANGE;
+	if (tsconf.rx_filter != HWTSTAMP_FILTER_NONE &&
+	    tsconf.rx_filter != HWTSTAMP_FILTER_ALL)
+		tsconf.rx_filter = HWTSTAMP_FILTER_ALL;
+
+	if (!vi->has_rx_tstamp)
+		tsconf.rx_filter = HWTSTAMP_FILTER_NONE;
+	else
+		vi->enable_rx_tstamp = tsconf.rx_filter == HWTSTAMP_FILTER_ALL;
+
+	if (copy_to_user(ifr->ifr_data, &tsconf, sizeof(tsconf)))
+		return -EFAULT;
+
+	return 0;
+}
+
+static int virtnet_ioctl_get_hwtstamp(struct net_device *dev, struct ifreq *ifr)
+{
+	struct virtnet_info *vi = netdev_priv(dev);
+	struct hwtstamp_config tsconf;
+
+	tsconf.flags = 0;
+	tsconf.rx_filter = vi->enable_rx_tstamp ? HWTSTAMP_FILTER_ALL :
+						  HWTSTAMP_FILTER_NONE;
+	tsconf.tx_type = HWTSTAMP_TX_OFF;
+
+	if (copy_to_user(ifr->ifr_data, &tsconf, sizeof(tsconf)))
+		return -EFAULT;
+
+	return 0;
+}
+
+static int virtnet_ioctl(struct net_device *dev, struct ifreq *ifr, int cmd)
+{
+	switch (cmd) {
+	case SIOCSHWTSTAMP:
+		return virtnet_ioctl_set_hwtstamp(dev, ifr);
+
+	case SIOCGHWTSTAMP:
+		return virtnet_ioctl_get_hwtstamp(dev, ifr);
+	}
+	return -EOPNOTSUPP;
+}
+
 static const struct net_device_ops virtnet_netdev = {
 	.ndo_open            = virtnet_open,
 	.ndo_stop   	     = virtnet_close,
@@ -3942,6 +4045,7 @@ static const struct net_device_ops virtnet_netdev = {
 	.ndo_get_phys_port_name	= virtnet_get_phys_port_name,
 	.ndo_set_features	= virtnet_set_features,
 	.ndo_tx_timeout		= virtnet_tx_timeout,
+	.ndo_eth_ioctl		= virtnet_ioctl,
 };
 
 static void virtnet_config_changed_work(struct work_struct *work)
@@ -4530,6 +4634,11 @@ static int virtnet_probe(struct virtio_device *vdev)
 		vi->hdr_len = sizeof(struct virtio_net_hdr_v1_hash);
 	}
 
+	if (virtio_has_feature(vdev, VIRTIO_NET_F_RX_TSTAMP)) {
+		vi->has_rx_tstamp = true;
+		vi->hdr_len = sizeof(struct virtio_net_hdr_hash_ts);
+	}
+
 	if (virtio_has_feature(vdev, VIRTIO_F_ANY_LAYOUT) ||
 	    virtio_has_feature(vdev, VIRTIO_F_VERSION_1))
 		vi->any_header_sg = true;
@@ -4773,7 +4882,7 @@ static struct virtio_device_id id_table[] = {
 	VIRTIO_NET_F_RSS, VIRTIO_NET_F_HASH_REPORT, VIRTIO_NET_F_NOTF_COAL, \
 	VIRTIO_NET_F_VQ_NOTF_COAL, \
 	VIRTIO_NET_F_GUEST_HDRLEN, \
-	VIRTIO_NET_F_TX_HASH
+	VIRTIO_NET_F_TX_HASH, VIRTIO_NET_F_RX_TSTAMP
 
 static unsigned int features[] = {
 	VIRTNET_FEATURES,
diff --git a/include/uapi/linux/virtio_net.h b/include/uapi/linux/virtio_net.h
index 698a11f8c6ab9..6a0a6f299a13b 100644
--- a/include/uapi/linux/virtio_net.h
+++ b/include/uapi/linux/virtio_net.h
@@ -56,6 +56,7 @@
 #define VIRTIO_NET_F_MQ	22	/* Device supports Receive Flow
 					 * Steering */
 #define VIRTIO_NET_F_CTRL_MAC_ADDR 23	/* Set MAC address */
+#define VIRTIO_NET_F_RX_TSTAMP	  50	/* Device sends TAI receive time */
 #define VIRTIO_NET_F_TX_HASH	  51	/* Driver sends hash report */
 #define VIRTIO_NET_F_VQ_NOTF_COAL 52	/* Device supports virtqueue notification coalescing */
 #define VIRTIO_NET_F_NOTF_COAL	53	/* Device supports notifications coalescing */
@@ -131,6 +132,7 @@ struct virtio_net_hdr_v1 {
 #define VIRTIO_NET_HDR_F_NEEDS_CSUM	1	/* Use csum_start, csum_offset */
 #define VIRTIO_NET_HDR_F_DATA_VALID	2	/* Csum is valid */
 #define VIRTIO_NET_HDR_F_RSC_INFO	4	/* rsc info in csum_ fields */
+#define VIRTIO_NET_HDR_F_TSTAMP		8	/* timestamp is recorded */
 	__u8 flags;
 #define VIRTIO_NET_HDR_GSO_NONE		0	/* Not a GSO frame */
 #define VIRTIO_NET_HDR_GSO_TCPV4	1	/* GSO frame, IPv4 TCP (TSO) */
@@ -187,6 +189,17 @@ struct virtio_net_hdr_v1_hash {
 	};
 };
 
+struct virtio_net_hdr_hash_ts {
+	struct {
+		struct virtio_net_hdr_v1 hdr;
+		__le32 value;
+		__le16 report;
+		__le16 flow_state;
+	} hash;
+	__u32 reserved;
+	__le64 tstamp;
+};
+
 #ifndef VIRTIO_NET_NO_LEGACY
 /* This header comes first in the scatter-gather list.
  * For legacy virtio, if VIRTIO_F_ANY_LAYOUT is not negotiated, it must

-- 
2.42.0


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH RFC 3/4] virtio-net: support transmit timestamp
  2023-12-18 11:37 [PATCH RFC 0/4] virtio-net: add tx-hash, rx-tstamp, tx-tstamp and tx-time Steffen Trumtrar
  2023-12-18 11:37 ` [PATCH RFC 1/4] virtio-net: support transmit hash report Steffen Trumtrar
  2023-12-18 11:37 ` [PATCH RFC 2/4] virtio-net: support receive timestamp Steffen Trumtrar
@ 2023-12-18 11:37 ` Steffen Trumtrar
  2023-12-18 11:37 ` [PATCH RFC 4/4] virtio-net: support future packet transmit time Steffen Trumtrar
  2023-12-18 16:36 ` [PATCH RFC 0/4] virtio-net: add tx-hash, rx-tstamp, tx-tstamp and tx-time Willem de Bruijn
  4 siblings, 0 replies; 9+ messages in thread
From: Steffen Trumtrar @ 2023-12-18 11:37 UTC (permalink / raw)
  To: Michael S. Tsirkin, Jason Wang, Xuan Zhuo, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Richard Cochran
  Cc: virtualization, netdev, linux-kernel, Willem de Bruijn

From: Willem de Bruijn <willemb@google.com>

Add optional PTP hardware tx timestamp offload for virtio-net.

Accurate RTT measurement requires timestamps close to the wire.
Introduce virtio feature VIRTIO_NET_F_TX_TSTAMP, the transmit
equivalent to VIRTIO_NET_F_RX_TSTAMP.

The driver sets VIRTIO_NET_HDR_F_TSTAMP to request a timestamp
returned on completion. If the feature is negotiated, the device
either places the timestamp or clears the feature bit.

The timestamp straddles (virtual) hardware domains. Like PTP, use
international atomic time (CLOCK_TAI) as global clock base. The driver
must sync with the device, e.g., through kvm-clock.

Modify can_push to ensure that on tx completion the header, and thus
timestamp, is in a predicatable location at skb_vnet_hdr.

Tested: modified txtimestamp.c to with h/w timestamping:
  -       sock_opt = SOF_TIMESTAMPING_SOFTWARE |
  +       sock_opt = SOF_TIMESTAMPING_RAW_HARDWARE |
  + do_test(family, SOF_TIMESTAMPING_TX_HARDWARE);

Signed-off-by: Willem de Bruijn <willemb@google.com>
Co-developed-by: Steffen Trumtrar <s.trumtrar@pengutronix.de>
Signed-off-by: Steffen Trumtrar <s.trumtrar@pengutronix.de>

--
  Changes to original RFC v2:
  - append write-able descriptor for TX timestamp

Signed-off-by: Steffen Trumtrar <s.trumtrar@pengutronix.de>
---
 drivers/net/virtio_net.c        | 79 +++++++++++++++++++++++++++++++++++------
 include/uapi/linux/virtio_net.h |  1 +
 2 files changed, 70 insertions(+), 10 deletions(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 9bb6cdaf49cc9..4065834957fbd 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -280,6 +280,12 @@ struct virtnet_info {
 	/* Device will pass rx timestamp. Requires has_rx_tstamp */
 	bool enable_rx_tstamp;
 
+	/* Device can pass CLOCK_TAI transmit time to the driver */
+	bool has_tx_tstamp;
+
+	/* Device will pass tx timestamp. Requires has_tx_tstamp */
+	bool enable_tx_tstamp;
+
 	/* Has control virtqueue */
 	bool has_cvq;
 
@@ -760,6 +766,21 @@ static void virtnet_rq_set_premapped(struct virtnet_info *vi)
 	}
 }
 
+static void virtnet_record_tx_tstamp(const struct send_queue *sq,
+				     struct sk_buff *skb)
+{
+	const struct virtio_net_hdr_hash_ts *h = skb_vnet_hdr_ht(skb);
+	const struct virtnet_info *vi = sq->vq->vdev->priv;
+	struct skb_shared_hwtstamps ts;
+
+	if (h->hash.hdr.flags & VIRTIO_NET_HDR_F_TSTAMP &&
+	    vi->enable_tx_tstamp) {
+		/* Write the timestamp from the userspace to the skb */
+		ts.hwtstamp = ns_to_ktime(le64_to_cpu(h->tstamp));
+		skb_tstamp_tx(skb, &ts);
+	}
+}
+
 static void free_old_xmit_skbs(struct send_queue *sq, bool in_napi)
 {
 	unsigned int len;
@@ -771,6 +792,7 @@ static void free_old_xmit_skbs(struct send_queue *sq, bool in_napi)
 		if (likely(!is_xdp_frame(ptr))) {
 			struct sk_buff *skb = ptr;
 
+			virtnet_record_tx_tstamp(sq, skb);
 			pr_debug("Sent skb %p\n", skb);
 
 			bytes += skb->len;
@@ -2312,7 +2334,7 @@ static int xmit_skb(struct send_queue *sq, struct sk_buff *skb)
 	struct virtio_net_hdr_mrg_rxbuf *hdr;
 	const unsigned char *dest = ((struct ethhdr *)skb->data)->h_dest;
 	struct virtnet_info *vi = sq->vq->vdev->priv;
-	struct virtio_net_hdr_v1_hash *ht;
+	struct virtio_net_hdr_hash_ts *ht;
 	int num_sg;
 	unsigned hdr_len = vi->hdr_len;
 	bool can_push;
@@ -2321,7 +2343,8 @@ static int xmit_skb(struct send_queue *sq, struct sk_buff *skb)
 
 	can_push = vi->any_header_sg &&
 		!((unsigned long)skb->data & (__alignof__(*hdr) - 1)) &&
-		!skb_header_cloned(skb) && skb_headroom(skb) >= hdr_len;
+		!skb_header_cloned(skb) && skb_headroom(skb) >= hdr_len &&
+		!vi->enable_tx_tstamp;
 	/* Even if we can, don't push here yet as this would skew
 	 * csum_start offset below. */
 	if (can_push)
@@ -2342,10 +2365,12 @@ static int xmit_skb(struct send_queue *sq, struct sk_buff *skb)
 		u16 report = skb->l4_hash ? VIRTIO_NET_HASH_REPORT_L4 :
 					    VIRTIO_NET_HASH_REPORT_OTHER;
 
-		ht->hash_value = cpu_to_le32(skb->hash);
-		ht->hash_report = cpu_to_le16(report);
-		ht->hash_state = cpu_to_le16(VIRTIO_NET_HASH_STATE_DEFAULT);
+		ht->hash.value = cpu_to_le32(skb->hash);
+		ht->hash.report = cpu_to_le16(report);
+		ht->hash.flow_state = cpu_to_le16(VIRTIO_NET_HASH_STATE_DEFAULT);
 	}
+	if (vi->enable_tx_tstamp && skb_shinfo(skb)->tx_flags & SKBTX_HW_TSTAMP)
+		ht->hash.hdr.flags |= VIRTIO_NET_HDR_F_TSTAMP;
 
 	sg_init_table(sq->sg, skb_shinfo(skb)->nr_frags + (can_push ? 1 : 2));
 	if (can_push) {
@@ -2362,7 +2387,23 @@ static int xmit_skb(struct send_queue *sq, struct sk_buff *skb)
 			return num_sg;
 		num_sg++;
 	}
-	return virtqueue_add_outbuf(sq->vq, sq->sg, num_sg, skb, GFP_ATOMIC);
+
+	/* Append a writeable descriptor after the RO ones. TX timestamp is
+	 * received via this descriptor
+	 */
+	if (ht->hash.hdr.flags & VIRTIO_NET_HDR_F_TSTAMP) {
+		struct virtio_net_hdr_hash_ts *h = skb_vnet_hdr_ht(skb);
+		struct scatterlist *sgs[2];
+		struct scatterlist sg_in;
+
+		sg_init_one(&sg_in, &h->tstamp, sizeof(h->tstamp));
+		sgs[0] = sq->sg;
+		sgs[1] = &sg_in;
+
+		return virtqueue_add_sgs(sq->vq, sgs, 1, 1, skb, GFP_ATOMIC);
+	} else {
+		return virtqueue_add_outbuf(sq->vq, sq->sg, num_sg, skb, GFP_ATOMIC);
+	}
 }
 
 static netdev_tx_t start_xmit(struct sk_buff *skb, struct net_device *dev)
@@ -3556,7 +3597,13 @@ static int virtnet_get_ts_info(struct net_device *dev,
 		info->rx_filters = HWTSTAMP_FILTER_NONE;
 	}
 
-	info->tx_types = HWTSTAMP_TX_OFF;
+	if (vi->has_tx_tstamp) {
+		info->so_timestamping |= SOF_TIMESTAMPING_TX_HARDWARE |
+					 SOF_TIMESTAMPING_RAW_HARDWARE;
+		info->tx_types = HWTSTAMP_TX_ON;
+	} else {
+		info->tx_types = HWTSTAMP_TX_OFF;
+	}
 
 	return 0;
 }
@@ -3984,7 +4031,8 @@ static int virtnet_ioctl_set_hwtstamp(struct net_device *dev, struct ifreq *ifr)
 		return -EFAULT;
 	if (tsconf.flags)
 		return -EINVAL;
-	if (tsconf.tx_type != HWTSTAMP_TX_OFF)
+	if (tsconf.tx_type != HWTSTAMP_TX_OFF &&
+	    tsconf.tx_type != HWTSTAMP_TX_ON)
 		return -ERANGE;
 	if (tsconf.rx_filter != HWTSTAMP_FILTER_NONE &&
 	    tsconf.rx_filter != HWTSTAMP_FILTER_ALL)
@@ -3995,6 +4043,11 @@ static int virtnet_ioctl_set_hwtstamp(struct net_device *dev, struct ifreq *ifr)
 	else
 		vi->enable_rx_tstamp = tsconf.rx_filter == HWTSTAMP_FILTER_ALL;
 
+	if (!vi->has_tx_tstamp)
+		tsconf.tx_type = HWTSTAMP_TX_OFF;
+	else
+		vi->enable_tx_tstamp = tsconf.tx_type == HWTSTAMP_TX_ON;
+
 	if (copy_to_user(ifr->ifr_data, &tsconf, sizeof(tsconf)))
 		return -EFAULT;
 
@@ -4009,7 +4062,8 @@ static int virtnet_ioctl_get_hwtstamp(struct net_device *dev, struct ifreq *ifr)
 	tsconf.flags = 0;
 	tsconf.rx_filter = vi->enable_rx_tstamp ? HWTSTAMP_FILTER_ALL :
 						  HWTSTAMP_FILTER_NONE;
-	tsconf.tx_type = HWTSTAMP_TX_OFF;
+	tsconf.tx_type = vi->enable_tx_tstamp ? HWTSTAMP_TX_ON :
+						HWTSTAMP_TX_OFF;
 
 	if (copy_to_user(ifr->ifr_data, &tsconf, sizeof(tsconf)))
 		return -EFAULT;
@@ -4639,6 +4693,11 @@ static int virtnet_probe(struct virtio_device *vdev)
 		vi->hdr_len = sizeof(struct virtio_net_hdr_hash_ts);
 	}
 
+	if (virtio_has_feature(vdev, VIRTIO_NET_F_TX_TSTAMP)) {
+		vi->has_tx_tstamp = true;
+		vi->hdr_len = sizeof(struct virtio_net_hdr_hash_ts);
+	}
+
 	if (virtio_has_feature(vdev, VIRTIO_F_ANY_LAYOUT) ||
 	    virtio_has_feature(vdev, VIRTIO_F_VERSION_1))
 		vi->any_header_sg = true;
@@ -4882,7 +4941,7 @@ static struct virtio_device_id id_table[] = {
 	VIRTIO_NET_F_RSS, VIRTIO_NET_F_HASH_REPORT, VIRTIO_NET_F_NOTF_COAL, \
 	VIRTIO_NET_F_VQ_NOTF_COAL, \
 	VIRTIO_NET_F_GUEST_HDRLEN, \
-	VIRTIO_NET_F_TX_HASH, VIRTIO_NET_F_RX_TSTAMP
+	VIRTIO_NET_F_TX_HASH, VIRTIO_NET_F_RX_TSTAMP, VIRTIO_NET_F_TX_TSTAMP
 
 static unsigned int features[] = {
 	VIRTNET_FEATURES,
diff --git a/include/uapi/linux/virtio_net.h b/include/uapi/linux/virtio_net.h
index 6a0a6f299a13b..03a7ef45372d7 100644
--- a/include/uapi/linux/virtio_net.h
+++ b/include/uapi/linux/virtio_net.h
@@ -56,6 +56,7 @@
 #define VIRTIO_NET_F_MQ	22	/* Device supports Receive Flow
 					 * Steering */
 #define VIRTIO_NET_F_CTRL_MAC_ADDR 23	/* Set MAC address */
+#define VIRTIO_NET_F_TX_TSTAMP	  49	/* Device sends TAI transmit time */
 #define VIRTIO_NET_F_RX_TSTAMP	  50	/* Device sends TAI receive time */
 #define VIRTIO_NET_F_TX_HASH	  51	/* Driver sends hash report */
 #define VIRTIO_NET_F_VQ_NOTF_COAL 52	/* Device supports virtqueue notification coalescing */

-- 
2.42.0


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH RFC 4/4] virtio-net: support future packet transmit time
  2023-12-18 11:37 [PATCH RFC 0/4] virtio-net: add tx-hash, rx-tstamp, tx-tstamp and tx-time Steffen Trumtrar
                   ` (2 preceding siblings ...)
  2023-12-18 11:37 ` [PATCH RFC 3/4] virtio-net: support transmit timestamp Steffen Trumtrar
@ 2023-12-18 11:37 ` Steffen Trumtrar
  2023-12-18 16:36 ` [PATCH RFC 0/4] virtio-net: add tx-hash, rx-tstamp, tx-tstamp and tx-time Willem de Bruijn
  4 siblings, 0 replies; 9+ messages in thread
From: Steffen Trumtrar @ 2023-12-18 11:37 UTC (permalink / raw)
  To: Michael S. Tsirkin, Jason Wang, Xuan Zhuo, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Richard Cochran
  Cc: virtualization, netdev, linux-kernel, Willem de Bruijn

From: Willem de Bruijn <willemb@google.com>

Add optional transmit time (SO_TXTIME) offload for virtio-net.

The Linux TCP/IP stack tries to avoid bursty transmission and network
congestion through pacing: computing an skb delivery time based on
congestion information. Userspace protocol implementations can achieve
the same with SO_TXTIME. This may also reduce scheduling jitter and
improve RTT estimation.

Pacing can be implemented in ETF or FQ qdiscs or offloaded to NIC
hardware. Allow virtio-net driver to offload for the same reasons.

The timestamp straddles (virtual) hardware domains. Like PTP, use
international atomic time (CLOCK_TAI) as global clock base. The driver
must sync with the device, e.g., through kvm-clock.

Changes RFC - RFCv2
  - rename from transmit timestamp to future packet transmit time
  - convert cpu_to_virtioXX to cpu_to_leXX

Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Steffen Trumtrar <s.trumtrar@pengutronix.de>
---
 drivers/net/virtio_net.c        | 13 ++++++++++++-
 include/uapi/linux/virtio_net.h |  1 +
 2 files changed, 13 insertions(+), 1 deletion(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 4065834957fbd..fb3572074e8b8 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -286,6 +286,9 @@ struct virtnet_info {
 	/* Device will pass tx timestamp. Requires has_tx_tstamp */
 	bool enable_tx_tstamp;
 
+	/* Driver will pass CLOCK_TAI delivery time to the device */
+	bool has_tx_time;
+
 	/* Has control virtqueue */
 	bool has_cvq;
 
@@ -2371,6 +2374,8 @@ static int xmit_skb(struct send_queue *sq, struct sk_buff *skb)
 	}
 	if (vi->enable_tx_tstamp && skb_shinfo(skb)->tx_flags & SKBTX_HW_TSTAMP)
 		ht->hash.hdr.flags |= VIRTIO_NET_HDR_F_TSTAMP;
+	if (vi->has_tx_time && skb->tstamp)
+		ht->tstamp = cpu_to_le64(skb->tstamp);
 
 	sg_init_table(sq->sg, skb_shinfo(skb)->nr_frags + (can_push ? 1 : 2));
 	if (can_push) {
@@ -4698,6 +4703,11 @@ static int virtnet_probe(struct virtio_device *vdev)
 		vi->hdr_len = sizeof(struct virtio_net_hdr_hash_ts);
 	}
 
+	if (virtio_has_feature(vdev, VIRTIO_NET_F_TX_TIME)) {
+		vi->has_tx_time = true;
+		vi->hdr_len = sizeof(struct virtio_net_hdr_hash_ts);
+	}
+
 	if (virtio_has_feature(vdev, VIRTIO_F_ANY_LAYOUT) ||
 	    virtio_has_feature(vdev, VIRTIO_F_VERSION_1))
 		vi->any_header_sg = true;
@@ -4941,7 +4951,8 @@ static struct virtio_device_id id_table[] = {
 	VIRTIO_NET_F_RSS, VIRTIO_NET_F_HASH_REPORT, VIRTIO_NET_F_NOTF_COAL, \
 	VIRTIO_NET_F_VQ_NOTF_COAL, \
 	VIRTIO_NET_F_GUEST_HDRLEN, \
-	VIRTIO_NET_F_TX_HASH, VIRTIO_NET_F_RX_TSTAMP, VIRTIO_NET_F_TX_TSTAMP
+	VIRTIO_NET_F_TX_HASH, VIRTIO_NET_F_RX_TSTAMP, VIRTIO_NET_F_TX_TSTAMP, \
+	VIRTIO_NET_F_TX_TIME
 
 static unsigned int features[] = {
 	VIRTNET_FEATURES,
diff --git a/include/uapi/linux/virtio_net.h b/include/uapi/linux/virtio_net.h
index 03a7ef45372d7..18db1e1130d92 100644
--- a/include/uapi/linux/virtio_net.h
+++ b/include/uapi/linux/virtio_net.h
@@ -56,6 +56,7 @@
 #define VIRTIO_NET_F_MQ	22	/* Device supports Receive Flow
 					 * Steering */
 #define VIRTIO_NET_F_CTRL_MAC_ADDR 23	/* Set MAC address */
+#define VIRTIO_NET_F_TX_TIME	  48	/* Driver sets TAI delivery time */
 #define VIRTIO_NET_F_TX_TSTAMP	  49	/* Device sends TAI transmit time */
 #define VIRTIO_NET_F_RX_TSTAMP	  50	/* Device sends TAI receive time */
 #define VIRTIO_NET_F_TX_HASH	  51	/* Driver sends hash report */

-- 
2.42.0


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH RFC 0/4] virtio-net: add tx-hash, rx-tstamp, tx-tstamp and tx-time
  2023-12-18 11:37 [PATCH RFC 0/4] virtio-net: add tx-hash, rx-tstamp, tx-tstamp and tx-time Steffen Trumtrar
                   ` (3 preceding siblings ...)
  2023-12-18 11:37 ` [PATCH RFC 4/4] virtio-net: support future packet transmit time Steffen Trumtrar
@ 2023-12-18 16:36 ` Willem de Bruijn
  2023-12-20  3:48   ` Jason Wang
  4 siblings, 1 reply; 9+ messages in thread
From: Willem de Bruijn @ 2023-12-18 16:36 UTC (permalink / raw)
  To: Steffen Trumtrar, Michael S. Tsirkin, Jason Wang, Xuan Zhuo,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Richard Cochran
  Cc: virtualization, netdev, linux-kernel, Willem de Bruijn

Steffen Trumtrar wrote:
> This series tries to pick up the work on the virtio-net timestamping
> feature from Willem de Bruijn.
> 
> Original series
>     Message-Id: 20210208185558.995292-1-willemdebruijn.kernel@gmail.com
>     Subject: [PATCH RFC v2 0/4] virtio-net: add tx-hash, rx-tstamp,
>     tx-tstamp and tx-time
>     From: Willem de Bruijn <willemb@google.com>
> 
>     RFC for four new features to the virtio network device:
> 
>     1. pass tx flow state to host, for routing + telemetry
>     2. pass rx tstamp to guest, for better RTT estimation
>     3. pass tx tstamp to guest, idem
>     3. pass tx delivery time to host, for accurate pacing
> 
>     All would introduce an extension to the virtio spec.
> 
> The original series consisted of a hack around the DMA API, which should
> be fixed in this series.
> 
> The changes in this series are to the driver side. For the changes to qemu see:
>     https://github.com/strumtrar/qemu/tree/v8.1.1/virtio-net-ptp
> 
> Currently only virtio-net is supported. The original series used
> vhost-net as backend. However, the path through tun via sendmsg doesn't
> allow us to write data back to the driver side without any hacks.
> Therefore use the way via plain virtio-net without vhost albeit better
> performance.
> 
> Signed-off-by: Steffen Trumtrar <s.trumtrar@pengutronix.de>

Thanks for picking this back up, Steffen. Nice to see that the code still
applies mostly cleanly.

For context: I dropped the work only because I had no real device
implementation. The referenced patch series to qemu changes that.

I suppose the main issue is the virtio API changes that this introduces,
which will have to be accepted to the spec.

One small comment to patch 4: there I just assumed the virtual device
time is CLOCK_TAI. There is a concurrent feature under review for HW
pacing offload with AF_XDP sockets. The clock issue comes up a bit. In
general, for hardware we cannot assume a clock. For virtio, perhaps
assuming the same monotonic hardware clock in guest and host can be
assumed. But this clock alignment needs some thought.


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH RFC 1/4] virtio-net: support transmit hash report
  2023-12-18 11:37 ` [PATCH RFC 1/4] virtio-net: support transmit hash report Steffen Trumtrar
@ 2023-12-19 18:36   ` Michael S. Tsirkin
  0 siblings, 0 replies; 9+ messages in thread
From: Michael S. Tsirkin @ 2023-12-19 18:36 UTC (permalink / raw)
  To: Steffen Trumtrar
  Cc: Jason Wang, Xuan Zhuo, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Richard Cochran, virtualization,
	netdev, linux-kernel, Willem de Bruijn

On Mon, Dec 18, 2023 at 12:37:08PM +0100, Steffen Trumtrar wrote:
> diff --git a/include/uapi/linux/virtio_net.h b/include/uapi/linux/virtio_net.h
> index cc65ef0f3c3e2..698a11f8c6ab9 100644
> --- a/include/uapi/linux/virtio_net.h
> +++ b/include/uapi/linux/virtio_net.h
> @@ -56,6 +56,7 @@
>  #define VIRTIO_NET_F_MQ	22	/* Device supports Receive Flow
>  					 * Steering */
>  #define VIRTIO_NET_F_CTRL_MAC_ADDR 23	/* Set MAC address */
> +#define VIRTIO_NET_F_TX_HASH	  51	/* Driver sends hash report */
>  #define VIRTIO_NET_F_VQ_NOTF_COAL 52	/* Device supports virtqueue notification coalescing */
>  #define VIRTIO_NET_F_NOTF_COAL	53	/* Device supports notifications coalescing */
>  #define VIRTIO_NET_F_GUEST_USO4	54	/* Guest can handle USOv4 in. */


Please make sure to register the feature bit with virtio tc
to avoid collisions.

-- 
MST


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH RFC 0/4] virtio-net: add tx-hash, rx-tstamp, tx-tstamp and tx-time
  2023-12-18 16:36 ` [PATCH RFC 0/4] virtio-net: add tx-hash, rx-tstamp, tx-tstamp and tx-time Willem de Bruijn
@ 2023-12-20  3:48   ` Jason Wang
  2023-12-20  4:25     ` Willem de Bruijn
  0 siblings, 1 reply; 9+ messages in thread
From: Jason Wang @ 2023-12-20  3:48 UTC (permalink / raw)
  To: Willem de Bruijn
  Cc: Steffen Trumtrar, Michael S. Tsirkin, Xuan Zhuo, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Richard Cochran,
	virtualization, netdev, linux-kernel, Willem de Bruijn

On Tue, Dec 19, 2023 at 12:36 AM Willem de Bruijn
<willemdebruijn.kernel@gmail.com> wrote:
>
> Steffen Trumtrar wrote:
> > This series tries to pick up the work on the virtio-net timestamping
> > feature from Willem de Bruijn.
> >
> > Original series
> >     Message-Id: 20210208185558.995292-1-willemdebruijn.kernel@gmail.com
> >     Subject: [PATCH RFC v2 0/4] virtio-net: add tx-hash, rx-tstamp,
> >     tx-tstamp and tx-time
> >     From: Willem de Bruijn <willemb@google.com>
> >
> >     RFC for four new features to the virtio network device:
> >
> >     1. pass tx flow state to host, for routing + telemetry
> >     2. pass rx tstamp to guest, for better RTT estimation
> >     3. pass tx tstamp to guest, idem
> >     3. pass tx delivery time to host, for accurate pacing
> >
> >     All would introduce an extension to the virtio spec.
> >
> > The original series consisted of a hack around the DMA API, which should
> > be fixed in this series.
> >
> > The changes in this series are to the driver side. For the changes to qemu see:
> >     https://github.com/strumtrar/qemu/tree/v8.1.1/virtio-net-ptp
> >
> > Currently only virtio-net is supported. The original series used
> > vhost-net as backend. However, the path through tun via sendmsg doesn't
> > allow us to write data back to the driver side without any hacks.
> > Therefore use the way via plain virtio-net without vhost albeit better
> > performance.
> >
> > Signed-off-by: Steffen Trumtrar <s.trumtrar@pengutronix.de>
>
> Thanks for picking this back up, Steffen. Nice to see that the code still
> applies mostly cleanly.
>
> For context: I dropped the work only because I had no real device
> implementation. The referenced patch series to qemu changes that.
>
> I suppose the main issue is the virtio API changes that this introduces,
> which will have to be accepted to the spec.
>
> One small comment to patch 4: there I just assumed the virtual device
> time is CLOCK_TAI. There is a concurrent feature under review for HW
> pacing offload with AF_XDP sockets. The clock issue comes up a bit. In
> general, for hardware we cannot assume a clock.

Any reason for this? E.g some modern NIC have PTP support.

> For virtio, perhaps
> assuming the same monotonic hardware clock in guest and host can be
> assumed.

Note that virtio can be implemented in hardware now. So we can assume
things like the kvm ptp clock.

> But this clock alignment needs some thought.
>

Thanks


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH RFC 0/4] virtio-net: add tx-hash, rx-tstamp, tx-tstamp and tx-time
  2023-12-20  3:48   ` Jason Wang
@ 2023-12-20  4:25     ` Willem de Bruijn
  0 siblings, 0 replies; 9+ messages in thread
From: Willem de Bruijn @ 2023-12-20  4:25 UTC (permalink / raw)
  To: Jason Wang, Willem de Bruijn
  Cc: Steffen Trumtrar, Michael S. Tsirkin, Xuan Zhuo, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Richard Cochran,
	virtualization, netdev, linux-kernel, Willem de Bruijn

Jason Wang wrote:
> On Tue, Dec 19, 2023 at 12:36 AM Willem de Bruijn
> <willemdebruijn.kernel@gmail.com> wrote:
> >
> > Steffen Trumtrar wrote:
> > > This series tries to pick up the work on the virtio-net timestamping
> > > feature from Willem de Bruijn.
> > >
> > > Original series
> > >     Message-Id: 20210208185558.995292-1-willemdebruijn.kernel@gmail.com
> > >     Subject: [PATCH RFC v2 0/4] virtio-net: add tx-hash, rx-tstamp,
> > >     tx-tstamp and tx-time
> > >     From: Willem de Bruijn <willemb@google.com>
> > >
> > >     RFC for four new features to the virtio network device:
> > >
> > >     1. pass tx flow state to host, for routing + telemetry
> > >     2. pass rx tstamp to guest, for better RTT estimation
> > >     3. pass tx tstamp to guest, idem
> > >     3. pass tx delivery time to host, for accurate pacing
> > >
> > >     All would introduce an extension to the virtio spec.
> > >
> > > The original series consisted of a hack around the DMA API, which should
> > > be fixed in this series.
> > >
> > > The changes in this series are to the driver side. For the changes to qemu see:
> > >     https://github.com/strumtrar/qemu/tree/v8.1.1/virtio-net-ptp
> > >
> > > Currently only virtio-net is supported. The original series used
> > > vhost-net as backend. However, the path through tun via sendmsg doesn't
> > > allow us to write data back to the driver side without any hacks.
> > > Therefore use the way via plain virtio-net without vhost albeit better
> > > performance.
> > >
> > > Signed-off-by: Steffen Trumtrar <s.trumtrar@pengutronix.de>
> >
> > Thanks for picking this back up, Steffen. Nice to see that the code still
> > applies mostly cleanly.
> >
> > For context: I dropped the work only because I had no real device
> > implementation. The referenced patch series to qemu changes that.
> >
> > I suppose the main issue is the virtio API changes that this introduces,
> > which will have to be accepted to the spec.
> >
> > One small comment to patch 4: there I just assumed the virtual device
> > time is CLOCK_TAI. There is a concurrent feature under review for HW
> > pacing offload with AF_XDP sockets. The clock issue comes up a bit. In
> > general, for hardware we cannot assume a clock.
> 
> Any reason for this? E.g some modern NIC have PTP support.

I meant that we cannot assume a specific clock, if aiming to offload
existing pacing (or "launch time") methods.

The issue discussed in the AF_XDP thread is whether to use CLOCK_TAI
or CLOCK_MONOTONIC. Both of which are already in use in software
pacing offload, in the ETF and FQ qdiscs, respectively.

But for virtio it may be acceptable to restrict to one clock, such as
CLOCK_REALTIME or CLOCK_TAI.

CLOCK_MONOTONIC being boottime is almost certain to have an offset.
Even if the clocks' rates are synchronized with phc2sys.

> > For virtio, perhaps
> > assuming the same monotonic hardware clock in guest and host can be
> > assumed.
> 
> Note that virtio can be implemented in hardware now. So we can assume
> things like the kvm ptp clock.
> 
> > But this clock alignment needs some thought.
> >
> 
> Thanks
> 



^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2023-12-20  4:25 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-12-18 11:37 [PATCH RFC 0/4] virtio-net: add tx-hash, rx-tstamp, tx-tstamp and tx-time Steffen Trumtrar
2023-12-18 11:37 ` [PATCH RFC 1/4] virtio-net: support transmit hash report Steffen Trumtrar
2023-12-19 18:36   ` Michael S. Tsirkin
2023-12-18 11:37 ` [PATCH RFC 2/4] virtio-net: support receive timestamp Steffen Trumtrar
2023-12-18 11:37 ` [PATCH RFC 3/4] virtio-net: support transmit timestamp Steffen Trumtrar
2023-12-18 11:37 ` [PATCH RFC 4/4] virtio-net: support future packet transmit time Steffen Trumtrar
2023-12-18 16:36 ` [PATCH RFC 0/4] virtio-net: add tx-hash, rx-tstamp, tx-tstamp and tx-time Willem de Bruijn
2023-12-20  3:48   ` Jason Wang
2023-12-20  4:25     ` Willem de Bruijn

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).