* [virtio-comment] [PATCH requirements v5 0/7] virtio net requirements for 1.4
@ 2023-08-18  4:35 Parav Pandit
  2023-08-18  4:35 ` [virtio-comment] [PATCH requirements v5 1/7] net-features: Add requirements document for release 1.4 Parav Pandit
                   ` (6 more replies)
  0 siblings, 7 replies; 17+ messages in thread
From: Parav Pandit @ 2023-08-18  4:35 UTC (permalink / raw)
  To: virtio-comment, hengqi, david.edmondson, xuanzhuo, sburla
  Cc: shahafs, virtio, Parav Pandit

Hi All,

This document captures the virtio net device requirements for the upcoming
release 1.4 that some of us are currently working on.

This is a living document that will be updated over time as we work
towards a design that can result in a draft specification.

The objectives are:
1. To consider these requirements when introducing the new features listed in
the document, and to work towards the interface design followed by drafting
the specification changes.

2. To define a list of requirements that are practical to achieve incrementally
in the 1.4 timeframe and that can actually be implemented.

Please review patch 5 as the priority.

Receive flow filters is the first item, apart from counters, to complete in
this iteration so that drafting of the design spec can start.
The rest of the requirements are largely untouched other than addressing
Stefan's comment.

TODO:
1. Some more refinement needed for rx low latency and header data split
   requirements.

changelog:
v4->v5:
- Refined receive flow filter requirements to be ready for spec draft
- updated timestamp requirement to feedback from Willem
- Fixed counters requirements based on comments from David
v3->v4:
- receive flow filters requirements underwent major updates to bring them
  to spec draft level.
- Addressed comments from Xuan, Heng, David, Satananda.
- Refined wordings in rest of the requirements
v2->v3:
- addressed comments from Stefan for tx low latency and notification
- redrafted the requirements to use rearm term and avoid queue enable
  confusion for notification
- addressed all comments and refined receive flow filters requirements to
  take to design level
v1->v2:
- major update of receive flow filter requirements based on the last
  two design discussions in the community and offline research
- examples added
- link to use case and design goal added
- control and operation side requirements split
- more verbose
v0->v1:
- addressed comments from Heng Li
- addressed few (not all) comments from Michael
- per patch changelog 

Parav Pandit (7):
  net-features: Add requirements document for release 1.4
  net-features: Add low latency transmit queue requirements
  net-features: Add low latency receive queue requirements
  net-features: Add notification coalescing requirements
  net-features: Add n-tuple receive flow filters requirements
  net-features: Add packet timestamp requirements
  net-features: Add header data split requirements

 net-workstream/features-1.4.md | 383 +++++++++++++++++++++++++++++++++
 1 file changed, 383 insertions(+)
 create mode 100644 net-workstream/features-1.4.md

-- 
2.26.2


This publicly archived list offers a means to provide input to the
OASIS Virtual I/O Device (VIRTIO) TC.

In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.

Subscribe: virtio-comment-subscribe@lists.oasis-open.org
Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
List help: virtio-comment-help@lists.oasis-open.org
List archive: https://lists.oasis-open.org/archives/virtio-comment/
Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
Committee: https://www.oasis-open.org/committees/virtio/
Join OASIS: https://www.oasis-open.org/join/



* [virtio-comment] [PATCH requirements v5 1/7] net-features: Add requirements document for release 1.4
  2023-08-18  4:35 [virtio-comment] [PATCH requirements v5 0/7] virtio net requirements for 1.4 Parav Pandit
@ 2023-08-18  4:35 ` Parav Pandit
  2023-08-21 10:44   ` David Edmondson
  2023-08-18  4:35 ` [virtio-comment] [PATCH requirements v5 2/7] net-features: Add low latency transmit queue requirements Parav Pandit
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 17+ messages in thread
From: Parav Pandit @ 2023-08-18  4:35 UTC (permalink / raw)
  To: virtio-comment, hengqi, david.edmondson, xuanzhuo, sburla
  Cc: shahafs, virtio, Parav Pandit

Add requirements document template for the virtio net features.

Add virtio net device counters visible to driver.

Signed-off-by: Parav Pandit <parav@nvidia.com>
---
changelog:
v4->v5:
- Fixed attributes query and counters query
v3->v4:
- Addressed comment from David
- Added link to more counters that we are already discussing
v0->v1:
- removed tx dropped counter
- updated requirements to mention about virtqueue interface for counters
  query
---
 net-workstream/features-1.4.md | 41 ++++++++++++++++++++++++++++++++++
 1 file changed, 41 insertions(+)
 create mode 100644 net-workstream/features-1.4.md

diff --git a/net-workstream/features-1.4.md b/net-workstream/features-1.4.md
new file mode 100644
index 0000000..ea36f09
--- /dev/null
+++ b/net-workstream/features-1.4.md
@@ -0,0 +1,41 @@
+# 1. Introduction
+
+This document describes the overall requirements for virtio net device
+improvements for the upcoming release 1.4. Some of these requirements are
+interrelated and influence the interface design, hence reviewing them
+together is desired while updating the virtio net interface.
+
+# 2. Summary
+1. Device counters visible to the driver
+
+# 3. Requirements
+## 3.1 Device counters
+1. The driver should be able to query the device and/or per vq counters for
+   debugging purposes using a virtqueue directly from the driver to the
+   device, for example using a control vq.
+2. The driver should be able to query which counters are supported using a
+   virtqueue command, for example using an existing control vq.
+3. If this device is migrated between two hosts, the driver should be able
+   to get the counter values in the destination host, continuing from where
+   they left off in the source host.
+4. If a virtio device is a group member device, it must be possible to query
+   all of the group member counters via the group owner device.
+5. If a virtio device is a group member device, it must be possible to query
+   all of the group member counter attributes via the group owner device.
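+
+   An illustrative sketch of such a counters query over a virtqueue (the
+   structure and field names are hypothetical, not part of any draft yet):
+
+```
+/* cvq command to query device and/or per vq counters */
+struct vnet_counters_query {
+   le16 vq_index; /* vq to query; a reserved value may select
+                   * device level counters
+                   */
+   le16 reserved;
+};
+
+/* device reply: values for each counter reported as supported */
+struct vnet_counters_reply {
+   le64 counters[];
+};
+```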
+
+### 3.1.1 Per receive queue counters
+1. le64 rx_oversize_pkt_errors: Packets dropped because the received packet
+    was larger than the buffer size
+2. le64 rx_no_buffer_pkt_errors: Packets dropped due to unavailability of a
+    buffer in the receive queue
+3. le64 rx_gso_pkts: Packets treated as receive GSO sequence by the device
+4. le64 rx_pkts: Total packets received by the device
+
+### 3.1.2 Per transmit queue counters
+1. le64 tx_gso_pkts: Packets sent as a transmit GSO sequence
+2. le64 tx_pkts: Total packets sent by the device
+
+### 3.1.3 More counters
+More counters discussed in [1].
+
+[1] https://lists.oasis-open.org/archives/virtio-comment/202308/msg00176.html
-- 
2.26.2



* [virtio-comment] [PATCH requirements v5 2/7] net-features: Add low latency transmit queue requirements
  2023-08-18  4:35 [virtio-comment] [PATCH requirements v5 0/7] virtio net requirements for 1.4 Parav Pandit
  2023-08-18  4:35 ` [virtio-comment] [PATCH requirements v5 1/7] net-features: Add requirements document for release 1.4 Parav Pandit
@ 2023-08-18  4:35 ` Parav Pandit
  2023-08-18  4:35 ` [virtio-comment] [PATCH requirements v5 3/7] net-features: Add low latency receive " Parav Pandit
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 17+ messages in thread
From: Parav Pandit @ 2023-08-18  4:35 UTC (permalink / raw)
  To: virtio-comment, hengqi, david.edmondson, xuanzhuo, sburla
  Cc: shahafs, virtio, Parav Pandit

Add requirements for the low latency transmit queue.

Signed-off-by: Parav Pandit <parav@nvidia.com>
Acked-by: David Edmondson <david.edmondson@oracle.com>
---
changelog:
v3->v4:
- Addressed comments from David
- rewrote timestamp and completions PCIe transaction requirement
v1->v2:
- added generic requirement to inline the request content
  along with the descriptor for non virtio-net devices
- added requirement to inline the header content along
  with the descriptor for virtio flow filter queue as two
  features are similar
v0->v1:
- added design goals for which requirements are added
---
 net-workstream/features-1.4.md | 88 ++++++++++++++++++++++++++++++++++
 1 file changed, 88 insertions(+)

diff --git a/net-workstream/features-1.4.md b/net-workstream/features-1.4.md
index ea36f09..1167ce2 100644
--- a/net-workstream/features-1.4.md
+++ b/net-workstream/features-1.4.md
@@ -7,6 +7,7 @@ together is desired while updating the virtio net interface.
 
 # 2. Summary
 1. Device counters visible to the driver
+2. Low latency tx virtqueue for PCI transport
 
 # 3. Requirements
 ## 3.1 Device counters
@@ -39,3 +40,90 @@ together is desired while updating the virtio net interface.
 More counters discussed in [1].
 
 [1] https://lists.oasis-open.org/archives/virtio-comment/202308/msg00176.html
+
+## 3.2 Low PCI latency virtqueues
+### 3.2.1 Low PCI latency tx virtqueue
+0. Design goal
+   a. Reduce PCI access latency in the packet transmit flow
+   b. Avoid O(N) descriptor parsing to detect a packet stream, to simplify
+      device logic
+   c. Reduce the number of PCI transmit completion transactions and have a
+      unified completion flow with/without transmit timestamping
+   d. Avoid partial cache line writes on transmit completions
+
+1. A packet transmit descriptor should contain the data descriptor count
+   without any indirection and without any O(N) search to find the end of a
+   packet stream. For example, a packet transmit descriptor (called
+   vnet_tx_hdr_desc subsequently) contains a field num_next_desc indicating
+   that the packet is located in N data descriptors.
+
+2. A packet transmit descriptor should contain segmentation offload related
+   fields without any indirection. For example, the packet transmit
+   descriptor contains gso_type, gso_size/mss, header length, csum placement
+   byte offset, and csum start.
+
+3. A packet transmit descriptor should be able to place a small packet that
+   does not have any L4 data after the vnet_tx_hdr_desc in the virtqueue
+   memory. For example, a TCP ACK-only packet can fit in the descriptor
+   memory; describing it with separate data descriptors would otherwise add
+   more than 25% metadata overhead.
+
+4. A packet transmit descriptor should be able to place a full GSO header
+   (L2 to L4) after the header descriptor and before the data descriptors.
+   For example, the GSO header is placed after struct vnet_tx_hdr_desc in
+   the virtqueue memory. When such a GSO header is positioned adjacent to
+   the packet transmit descriptor, and when the GSO header is not aligned
+   to 16B, the following data descriptor starts on an 8B aligned boundary.
+
+5. An example of the above requirements at a high level is:
+
+```
+struct virtio_packed_q_desc {
+   /* current desc for reference */
+   u64 address;
+   u32 len;
+   u16 id;
+   u16 flags;
+};
+
+/* Constant size header descriptor for tx packets */
+struct vnet_tx_hdr_desc {
+   u16 flags; /* indicate how to parse next fields */
+   u16 id; /* desc id to come back in completion */
+   u8 num_next_desc; /* indicates the number of the next 16B data desc for this
+		      * buffer.
+		      */
+   u8 gso_type;
+   le16 gso_hdr_len;
+   le16 gso_size;
+   le16 csum_start;
+   le16 csum_offset;
+   u8 inline_pkt_len; /* indicates the length of the inline packet after this
+		       * desc
+		       */
+   u8 reserved;
+   u8 padding[];
+};
+
+/* Example of a short packet or GSO header placed in the desc section of the vq
+ */
+struct vnet_tx_small_pkt_desc {
+   u8 raw_pkt[128];
+};
+
+/* Example of header followed by data descriptor */
+struct vnet_tx_hdr_desc hdr_desc;
+struct vnet_data_desc desc[2];
+
+```
+
+6. Ability to zero pad the transmit completion when the transmit completion is
+   shorter than the CPU cache line size.
+
+7. Ability to write a per packet timestamp and also write multiple
+   transmit completions using a single PCIe transaction.
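+
+   An illustrative example of such a transmit completion (field names are
+   hypothetical):
+
+```
+/* Transmit completion zero padded to the CPU cache line size; multiple
+ * such completions may be written using a single PCIe transaction.
+ */
+struct vnet_tx_completion {
+   u16 flags;
+   u16 id; /* desc id from vnet_tx_hdr_desc */
+   u32 reserved;
+   u64 timestamp; /* optional per packet transmit timestamp */
+   u8 padding[]; /* zero pad up to the cache line size */
+};
+```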
+
+8. A generic virtqueue feature to contain such header data inline, for
+   virtio devices other than virtio-net.
+
+9. A flow filter virtqueue similarly needs the ability to inline the short
+   flow command header.
-- 
2.26.2



* [virtio-comment] [PATCH requirements v5 3/7] net-features: Add low latency receive queue requirements
  2023-08-18  4:35 [virtio-comment] [PATCH requirements v5 0/7] virtio net requirements for 1.4 Parav Pandit
  2023-08-18  4:35 ` [virtio-comment] [PATCH requirements v5 1/7] net-features: Add requirements document for release 1.4 Parav Pandit
  2023-08-18  4:35 ` [virtio-comment] [PATCH requirements v5 2/7] net-features: Add low latency transmit queue requirements Parav Pandit
@ 2023-08-18  4:35 ` Parav Pandit
  2023-08-21 10:47   ` [virtio-comment] " David Edmondson
  2023-09-11 13:47   ` [virtio-comment] Re: [virtio] " Stefan Hajnoczi
  2023-08-18  4:35 ` [virtio-comment] [PATCH requirements v5 4/7] net-features: Add notification coalescing requirements Parav Pandit
                   ` (3 subsequent siblings)
  6 siblings, 2 replies; 17+ messages in thread
From: Parav Pandit @ 2023-08-18  4:35 UTC (permalink / raw)
  To: virtio-comment, hengqi, david.edmondson, xuanzhuo, sburla
  Cc: shahafs, virtio, Parav Pandit

Add requirements for the low latency receive queue.

Signed-off-by: Parav Pandit <parav@nvidia.com>
---
changelog:
v0->v1:
- clarified the requirements further
- added line for the gro case
- added design goals as the motivation for the requirements
---
 net-workstream/features-1.4.md | 45 +++++++++++++++++++++++++++++++++-
 1 file changed, 44 insertions(+), 1 deletion(-)

diff --git a/net-workstream/features-1.4.md b/net-workstream/features-1.4.md
index 1167ce2..bc9e971 100644
--- a/net-workstream/features-1.4.md
+++ b/net-workstream/features-1.4.md
@@ -7,7 +7,7 @@ together is desired while updating the virtio net interface.
 
 # 2. Summary
 1. Device counters visible to the driver
-2. Low latency tx virtqueue for PCI transport
+2. Low latency tx and rx virtqueues for PCI transport
 
 # 3. Requirements
 ## 3.1 Device counters
@@ -127,3 +127,46 @@ struct vnet_data_desc desc[2];
 
 9. A flow filter virtqueue also similarly need the ability to inline the short flow
    command header.
+
+### 3.2.2 Low latency rx virtqueue
+0. Design goal:
+   a. Keep the packet metadata and buffer data, which are consumed by the
+      driver layer, together and available in a single CPU cache line
+   b. Instead of per packet descriptors, which are complex to scale for the
+      device, supply pages directly to the device to consume based on the
+      packet size
+1. The device should be able to write a packet receive completion that consists
+   of struct virtio_net_hdr (or similar) and a buffer id using a single DMA write
+   PCIe TLP.
+2. The device should be able to perform DMA writes of multiple packet
+   completions in a single DMA transaction, up to the PCIe maximum write
+   size of a transaction.
+3. The device should be able to zero pad the packet completion write to
+   align it to 64B or the CPU cache line size whenever possible.
+4. An example of the above DMA completion structure:
+
+```
+/* Constant size receive packet completion */
+struct vnet_rx_completion {
+   u16 flags;
+   u16 id; /* buffer id */
+   u8 gso_type;
+   u8 reserved[3];
+   le16 gso_hdr_len;
+   le16 gso_size;
+   le16 csum_start;
+   le16 csum_offset;
+   u16 reserved2;
+   u64 timestamp; /* explained later */
+   u8 padding[];
+};
+```
+5. The driver should be able to post constant-size buffer pages on a
+   receive queue that can be consumed by the device for an incoming packet
+   of any size from 64B to 9KB.
+6. The device should be able to know the constant buffer size at the
+   receive virtqueue level instead of at the per buffer level.
+7. The device should be able to indicate when a full page buffer is
+   consumed, so that it can be recycled by the driver once the packets from
+   the completed page are fully consumed.
+8. The device should be able to consume multiple pages for a receive GSO stream.
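+
+An illustrative sketch (names hypothetical) of how the constant buffer size
+could be conveyed at the receive virtqueue level:
+
+```
+/* Per receive virtqueue configuration */
+struct vnet_rx_vq_config {
+   le16 buffer_size; /* constant size of every posted buffer, e.g. 4K */
+   le16 reserved[3];
+};
+```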
-- 
2.26.2



* [virtio-comment] [PATCH requirements v5 4/7] net-features: Add notification coalescing requirements
  2023-08-18  4:35 [virtio-comment] [PATCH requirements v5 0/7] virtio net requirements for 1.4 Parav Pandit
                   ` (2 preceding siblings ...)
  2023-08-18  4:35 ` [virtio-comment] [PATCH requirements v5 3/7] net-features: Add low latency receive " Parav Pandit
@ 2023-08-18  4:35 ` Parav Pandit
  2023-08-18  4:35 ` [virtio-comment] [PATCH requirements v5 5/7] net-features: Add n-tuple receive flow filters requirements Parav Pandit
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 17+ messages in thread
From: Parav Pandit @ 2023-08-18  4:35 UTC (permalink / raw)
  To: virtio-comment, hengqi, david.edmondson, xuanzhuo, sburla
  Cc: shahafs, virtio, Parav Pandit

Add virtio net device notification coalescing improvements requirements.

Signed-off-by: Parav Pandit <parav@nvidia.com>
Acked-by: David Edmondson <david.edmondson@oracle.com>

---
changelog:
v3->v4:
- no change

v1->v2:
- addressed comments from Stefan
- redrafted the requirements to use rearm term and avoid queue enable
  confusion
v0->v1:
- updated the description
---
 net-workstream/features-1.4.md | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/net-workstream/features-1.4.md b/net-workstream/features-1.4.md
index bc9e971..72b4132 100644
--- a/net-workstream/features-1.4.md
+++ b/net-workstream/features-1.4.md
@@ -8,6 +8,7 @@ together is desired while updating the virtio net interface.
 # 2. Summary
 1. Device counters visible to the driver
 2. Low latency tx and rx virtqueues for PCI transport
+3. Virtqueue notification coalescing re-arming support
 
 # 3. Requirements
 ## 3.1 Device counters
@@ -170,3 +171,13 @@ struct vnet_rx_completion {
    which can be recycled by the driver when the packets from the completed
    page is fully consumed.
 8. The device should be able to consume multiple pages for a receive GSO stream.
+
+## 3.3 Virtqueue notification coalescing re-arming support
+0. Design goal:
+   a. Avoid continuous notifications from the device when the driver has
+      not yet acted on the previous pending notification.
+1. When Tx and Rx virtqueue notification coalescing is enabled, and when such
+   a notification is reported by the device, the device stops sending further
+   notifications until the driver rearms the notifications of the virtqueue.
+2. When the driver rearms the notification of the virtqueue, the device
+   notifies again if the notification coalescing conditions are met.
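+
+   An illustrative sketch of the rearm flow (function names hypothetical):
+
+```
+/* driver side, on receiving a coalesced notification */
+process_completions(vq);  /* drain what the device reported */
+rearm_notification(vq);   /* re-enable device notifications */
+process_completions(vq);  /* re-check to close the race between the
+                           * drain and the rearm
+                           */
+```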
-- 
2.26.2



* [virtio-comment] [PATCH requirements v5 5/7] net-features: Add n-tuple receive flow filters requirements
  2023-08-18  4:35 [virtio-comment] [PATCH requirements v5 0/7] virtio net requirements for 1.4 Parav Pandit
                   ` (3 preceding siblings ...)
  2023-08-18  4:35 ` [virtio-comment] [PATCH requirements v5 4/7] net-features: Add notification coalescing requirements Parav Pandit
@ 2023-08-18  4:35 ` Parav Pandit
  2023-08-21  5:06   ` [virtio-comment] " Heng Qi
  2023-08-22  7:41   ` Parav Pandit
  2023-08-18  4:35 ` [virtio-comment] [PATCH requirements v5 6/7] net-features: Add packet timestamp requirements Parav Pandit
  2023-08-18  4:35 ` [virtio-comment] [PATCH requirements v5 7/7] net-features: Add header data split requirements Parav Pandit
  6 siblings, 2 replies; 17+ messages in thread
From: Parav Pandit @ 2023-08-18  4:35 UTC (permalink / raw)
  To: virtio-comment, hengqi, david.edmondson, xuanzhuo, sburla
  Cc: shahafs, virtio, Parav Pandit

Add virtio net device requirements for receive flow filters.

Signed-off-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Satananda Burla <sburla@marvell.com>
Signed-off-by: Heng Qi <hengqi@linux.alibaba.com>
---
changelog:
v4->v5:
- Rewrote cvq and flow filter vq mutual exclusive text
- added cvq command to enable flow filters on cvq
- made commands more refined for priority, opcode and more
- Addressed comments from Heng
- restructured interface commands
v3->v4:
- Addressed comments from Satananda, Heng, David
- removed context specific wording, replaced with destination
- added group create/delete examples and updated requirements
- added optional support to use cvq for flow filter commands
- added example of transporting flow filter commands over cvq
- made group size to be 16-bit
- added concept of 0->n max flow filter entries based on max count
- added concept of 0->n max flow group based on max count
- split field bitmask to separate command from other filter capabilities
- rewrote rx filter processing chain order with respect to existing
  filter commands and rss
- made flow_id flat across all groups
v1->v2:
- split setup and operations requirements
- added design goal
- worded requirements more precisely
v0->v1:
- fixed comments from Heng Li
- renamed receive flow steering to receive flow filters
- clarified byte offset in match criteria
---
 net-workstream/features-1.4.md | 163 +++++++++++++++++++++++++++++++++
 1 file changed, 163 insertions(+)

diff --git a/net-workstream/features-1.4.md b/net-workstream/features-1.4.md
index 72b4132..330949c 100644
--- a/net-workstream/features-1.4.md
+++ b/net-workstream/features-1.4.md
@@ -9,6 +9,7 @@ together is desired while updating the virtio net interface.
 1. Device counters visible to the driver
 2. Low latency tx and rx virtqueues for PCI transport
 3. Virtqueue notification coalescing re-arming support
+4. Virtqueue receive flow filters (RFF)
 
 # 3. Requirements
 ## 3.1 Device counters
@@ -181,3 +182,165 @@ struct vnet_rx_completion {
    notifications until the driver rearms the notifications of the virtqueue.
 2. When the driver rearms the notification of the virtqueue, the device
    to notify again if notification coalescing conditions are met.
+
+## 3.4 Virtqueue receive flow filters (RFF)
+0. Design goal:
+   To filter and/or steer packets to a specific destination based on a
+   specific pattern match, to support application/networking stack driven
+   receive processing.
+1. Two use cases are: to support Linux netdev set_rxnfc() for ETHTOOL_SRXCLSRLINS
+   and to support netdev feature NETIF_F_NTUPLE aka ARFS.
+
+### 3.4.1 control path
+1. The number of flow filter operations/sec can range from 100k/sec to 1M/sec
+   or even more. Hence flow filter operations must be done over a queueing
+   interface using one or more queues.
+2. The device should be able to expose the supported flow filter queue
+   count (one or more) and the start vq index to the driver.
+3. As each device may operate with different performance characteristics,
+   the start vq index and count may differ for each device. Secondly, it is
+   inefficient for the device to provide flow filter capabilities via a
+   config space region. Hence, the device should be able to share these
+   attributes using a DMA interface, instead of transport registers.
+4. Since flow filters are enabled much later in the driver life cycle, the
+   driver will likely create these queues when flow filters are enabled.
+5. Flow filter operations are often accelerated by the device in hardware.
+   The ability to handle them on a queue other than the control vq is
+   desired. This achieves near zero modifications to existing
+   implementations when adding new operations on new purpose built queues
+   (similar to transmit and receive queues). Some devices may not support
+   flow filter queues and may want to support flow filter operations over
+   the existing cvq; this gives the ability to utilize an existing cvq.
+   Therefore,
+   a. Flow filter queues and flow filter commands on the cvq are mutually
+      exclusive.
+   b. When flow filter queues are supported, the driver should use the flow
+      filter queues for flow filter operations.
+      (Since the cvq is not enabled for flow filters, any flow filter
+      command arriving on the cvq must fail).
+   c. If the driver wants to use flow filters over the cvq, it must
+      explicitly enable flow filters on the cvq via a command; once enabled
+      on the cvq, the driver cannot use the flow filter queues. This
+      eliminates any synchronization needed by the device among different
+      types of queues.
+6. The filter masks are optional; the device should be able to expose
+   whether it supports filter masks.
+7. The driver may want priorities among groups of flow entries; to
+   facilitate this, the device supports grouping flow filter entries by the
+   notion of a flow group. Each flow group defines a priority for flow
+   processing.
+8. The driver and group owner driver should be able to query supported device
+   limits for the receive flow filters.
+9. The owner device should be able to query the flow filter capabilities
+   of the member device using an administrative command.
+
+### 3.4.2 flow operations path
+1. The driver should be able to define a receive packet match criteria, an
+   action and a destination for a packet. For example, an ipv4 packet with a
+   multicast address to be steered to the receive vq 0. The second example is
+   ipv4, tcp packet matching a specified IP address and tcp port tuple to
+   be steered to receive vq 10.
+2. The match criteria should include well-defined exact match tuple fields
+   such as MAC address, IP addresses, TCP/UDP ports, etc.
+3. The match criteria should also optionally include the field mask.
+4. Action includes (a) dropping or (b) forwarding the packet.
+5. Destination is a receive virtqueue index.
+6. Receive packet processing chain is:
+   a. filters programmed using cvq commands VIRTIO_NET_CTRL_RX,
+      VIRTIO_NET_CTRL_MAC and VIRTIO_NET_CTRL_VLAN.
+   b. filters programmed using RFF functionality.
+   c. filters programmed using RSS VIRTIO_NET_CTRL_MQ_RSS_CONFIG command.
+   Whichever filtering and steering functionality is enabled, they are applied
+   in the above order.
+7. If multiple entries are programmed which have overlapping filtering
+   attributes for a received packet, the driver defines the
+   location/priority of each entry.
+8. The filter entries are usually short, a few tens of bytes (for example,
+   an IPv6 + TCP tuple would be 36 bytes), and the ops/sec rate is high;
+   hence supplying the fields inside the queue descriptor is preferred for
+   up to a certain fixed size, say 96 bytes.
+9. A flow filter entry consists of (a) match criteria, (b) action,
+    (c) destination and (d) a unique 32 bit flow id, all supplied by the
+    driver.
+10. The driver should be able to query and delete flow filter entry
+    by the flow id.
+
+### 3.4.3 interface example
+
+1. Flow filter capabilities are queried using a DMA interface such as the
+cvq, using two different commands.
+
+```
+struct virtio_net_rff_cmd {
+	u8 class; /* RFF class */
+	u8 commands; /* 0 = query cap
+		      * 1 = query packet fields mask
+		      * 2 = enable flow filter operations over cvq
+		      * 3 = add flow group
+		      * 4 = del flow group
+		      * 5 = flow filter op.
+		      */
+	u8 command-specific-data[];
+};
+
+/* command 1 (query) */
+struct flow_filter_capabilities {
+	le16 start_vq_index;
+	le16 num_flow_filter_vqs;
+	le16 max_flow_groups; /* valid group id = max_flow_groups - 1 */
+	le16 max_group_priorities; /* max priorities of the group */
+	le32 max_flow_filters_per_group;
+	le32 max_flow_filters; /* max flow_id in add/del
+				* is equal to max_flow_filters - 1.
+				*/
+	u8 max_priorities_per_group;
+	u8 cvq_supports_flow_filters_ops;
+};
+
+/* command 2 (query packet field masks) */
+struct flow_filter_fields_support_mask {
+	le64 supported_packet_field_mask_bmap[1];
+};
+
+```
+
+2. Group add/delete cvq commands:
+
+```
+/* command 3 */
+struct virtio_net_rff_group_add {
+	le16 priority;	/* higher the value, higher priority */
+	le16 group_id;
+};
+
+
+/* command 4 */
+struct virtio_net_rff_group_delete {
+	le16 group_id;
+};
+```
+
+3. Flow filter entry add/modify, delete over flow vq:
+
+```
+struct virtio_net_rff_add_modify {
+	u8 flow_op;
+	u8 priority;	/* higher the value, higher priority */
+	le16 group_id;
+	le32 flow_id;
+	struct match_criteria mc;
+	struct destination dest;
+	struct action action;
+
+	struct match_criteria mask;	/* optional */
+};
+
+struct virtio_net_rff_delete {
+	u8 flow_op;
+	u8 padding[3];
+	le32 flow_id;
+};
+
+```
+
+### 3.4.4 For incremental future
+a. Driver should be able to specify a specific packet byte offset, number
+   of bytes and mask as match criteria.
+b. Support RSS context, in addition to a specific RQ.
+c. If/when virtio switch object is implemented, support ingress/egress flow
+   filters at the switch port level.
-- 
2.26.2


This publicly archived list offers a means to provide input to the
OASIS Virtual I/O Device (VIRTIO) TC.

In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.

Subscribe: virtio-comment-subscribe@lists.oasis-open.org
Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
List help: virtio-comment-help@lists.oasis-open.org
List archive: https://lists.oasis-open.org/archives/virtio-comment/
Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
Committee: https://www.oasis-open.org/committees/virtio/
Join OASIS: https://www.oasis-open.org/join/


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [virtio-comment] [PATCH requirements v5 6/7] net-features: Add packet timestamp requirements
  2023-08-18  4:35 [virtio-comment] [PATCH requirements v5 0/7] virtio net requirements for 1.4 Parav Pandit
                   ` (4 preceding siblings ...)
  2023-08-18  4:35 ` [virtio-comment] [PATCH requirements v5 5/7] net-features: Add n-tuple receive flow filters requirements Parav Pandit
@ 2023-08-18  4:35 ` Parav Pandit
  2023-08-18  4:35 ` [virtio-comment] [PATCH requirements v5 7/7] net-features: Add header data split requirements Parav Pandit
  6 siblings, 0 replies; 17+ messages in thread
From: Parav Pandit @ 2023-08-18  4:35 UTC (permalink / raw)
  To: virtio-comment, hengqi, david.edmondson, xuanzhuo, sburla
  Cc: shahafs, virtio, Parav Pandit

Add tx and rx packet timestamp requirements.

Signed-off-by: Parav Pandit <parav@nvidia.com>
Acked-by: David Edmondson <david.edmondson@oracle.com>

---
changelog:
v4->v5:
- relaxed mmio requirement on feedback from Willem
v3->v4:
- no change
---
 net-workstream/features-1.4.md | 24 ++++++++++++++++++++++++
 1 file changed, 24 insertions(+)

diff --git a/net-workstream/features-1.4.md b/net-workstream/features-1.4.md
index 330949c..31aa587 100644
--- a/net-workstream/features-1.4.md
+++ b/net-workstream/features-1.4.md
@@ -10,6 +10,7 @@ together is desired while updating the virtio net interface.
 2. Low latency tx and rx virtqueues for PCI transport
 3. Virtqueue notification coalescing re-arming support
 4. Virtqueue receive flow filters (RFF)
+5. Device timestamp for tx and rx packets
 
 # 3. Requirements
 ## 3.1 Device counters
@@ -344,3 +345,26 @@ a. Driver should be able to specify a specific packet byte offset, number
 b. Support RSS context, in addition to a specific RQ.
 c. If/when virtio switch object is implemented, support ingress/egress flow
    filters at the switch port level.
+
+## 3.5 Packet timestamp
+1. Device should provide transmit timestamp and receive timestamp of the packets
+   at per packet level when the timestamping is enabled in the device.
+2. Device should provide the current frequency and the frequency unit for the
+   software to synchronize the reference point of software and the device using
+   a control vq command.
+
+### 3.5.1 Transmit timestamp
+1. Transmit completion must contain a packet transmission timestamp when the
+   device is enabled for it.
+2. The device should record the packet transmit timestamp in the completion at
+   the farthest egress point towards the network.
+3. The device must provide a transmit packet timestamp in a single DMA
+   transaction along with the rest of the transmit completion fields.
+
+### 3.5.2 Receive timestamp
+1. Receive completion must contain a packet reception timestamp when the device
+   is enabled for it.
+2. The device should record the received packet timestamp at the closest ingress
+   point of reception from the network.
+3. The device should provide a receive packet timestamp in a single DMA
+   transaction along with the rest of the receive completion fields.
-- 
2.26.2




^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [virtio-comment] [PATCH requirements v5 7/7] net-features: Add header data split requirements
  2023-08-18  4:35 [virtio-comment] [PATCH requirements v5 0/7] virtio net requirements for 1.4 Parav Pandit
                   ` (5 preceding siblings ...)
  2023-08-18  4:35 ` [virtio-comment] [PATCH requirements v5 6/7] net-features: Add packet timestamp requirements Parav Pandit
@ 2023-08-18  4:35 ` Parav Pandit
  2023-08-21 10:45   ` [virtio-comment] " David Edmondson
  6 siblings, 1 reply; 17+ messages in thread
From: Parav Pandit @ 2023-08-18  4:35 UTC (permalink / raw)
  To: virtio-comment, hengqi, david.edmondson, xuanzhuo, sburla
  Cc: shahafs, virtio, Parav Pandit

Add header data split requirements for the receive packets.

Signed-off-by: Parav Pandit <parav@nvidia.com>
---
 net-workstream/features-1.4.md | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/net-workstream/features-1.4.md b/net-workstream/features-1.4.md
index 31aa587..7a56fa8 100644
--- a/net-workstream/features-1.4.md
+++ b/net-workstream/features-1.4.md
@@ -11,6 +11,7 @@ together is desired while updating the virtio net interface.
 3. Virtqueue notification coalescing re-arming support
 4. Virtqueue receive flow filters (RFF)
 5. Device timestamp for tx and rx packets
+6. Header data split for the receive virtqueue
 
 # 3. Requirements
 ## 3.1 Device counters
@@ -368,3 +369,15 @@ c. If/when virtio switch object is implemented, support ingress/egress flow
    point of reception from the network.
 3. The device should provide a receive packet timestamp in a single DMA
    transaction along with the rest of the receive completion fields.
+
+## 3.6 Header data split for the receive virtqueue
+1. The device should be able to DMA the packet header and data to two different
+   memory locations; this enables the driver and networking stack to perform
+   zero copy to application buffer(s).
+2. The driver should be able to configure the maximum header buffer size per
+   virtqueue.
+3. The header buffer is to be in physically contiguous memory per virtqueue.
+4. The device should be able to indicate header data split in the receive
+   completion.
+5. The device should be able to zero pad the header buffer when the received
+   header is shorter than the CPU cache line size.
-- 
2.26.2




^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [virtio-comment] Re: [PATCH requirements v5 5/7] net-features: Add n-tuple receive flow filters requirements
  2023-08-18  4:35 ` [virtio-comment] [PATCH requirements v5 5/7] net-features: Add n-tuple receive flow filters requirements Parav Pandit
@ 2023-08-21  5:06   ` Heng Qi
  2023-08-21  5:14     ` [virtio-comment] " Parav Pandit
  2023-08-22  7:41   ` Parav Pandit
  1 sibling, 1 reply; 17+ messages in thread
From: Heng Qi @ 2023-08-21  5:06 UTC (permalink / raw)
  To: Parav Pandit, virtio-comment, david.edmondson, xuanzhuo, sburla
  Cc: shahafs, virtio



在 2023/8/18 下午12:35, Parav Pandit 写道:
> Add virtio net device requirements for receive flow filters.
>
> Signed-off-by: Parav Pandit <parav@nvidia.com>
> Signed-off-by: Satananda Burla <sburla@marvell.com>
> Signed-off-by: Heng Qi <hengqi@linux.alibaba.com>
> ---
> changelog:
> v4->v5:
> - Rewrote cvq and flow filter vq mutual exclusive text
> - added cvq command to enable flow filters on cvq
> - made commands more refined for priority, opcode and more
> - Addressed comments from Heng
> - restructured interface commands
> v3->v4:
> - Addressed comments from Satananda, Heng, David
> - removed context specific wording, replaced with destination
> - added group create/delete examples and updated requirements
> - added optional support to use cvq for flow filter commands
> - added example of transporting flow filter commands over cvq
> - made group size to be 16-bit
> - added concept of 0->n max flow filter entries based on max count
> - added concept of 0->n max flow group based on max count
> - split field bitmask to separate command from other filter capabilities
> - rewrote rx filter processing chain order with respect to existing
>    filter commands and rss
> - made flow_id flat across all groups
> v1->v2:
> - split setup and operations requirements
> - added design goal
> - worded requirements more precisely
> v0->v1:
> - fixed comments from Heng Li
> - renamed receive flow steering to receive flow filters
> - clarified byte offset in match criteria
> ---
>   net-workstream/features-1.4.md | 163 +++++++++++++++++++++++++++++++++
>   1 file changed, 163 insertions(+)
>
> diff --git a/net-workstream/features-1.4.md b/net-workstream/features-1.4.md
> index 72b4132..330949c 100644
> --- a/net-workstream/features-1.4.md
> +++ b/net-workstream/features-1.4.md
> @@ -9,6 +9,7 @@ together is desired while updating the virtio net interface.
>   1. Device counters visible to the driver
>   2. Low latency tx and rx virtqueues for PCI transport
>   3. Virtqueue notification coalescing re-arming support
> +4. Virtqueue receive flow filters (RFF)
>   
>   # 3. Requirements
>   ## 3.1 Device counters
> @@ -181,3 +182,165 @@ struct vnet_rx_completion {
>      notifications until the driver rearms the notifications of the virtqueue.
>   2. When the driver rearms the notification of the virtqueue, the device
>      to notify again if notification coalescing conditions are met.
> +
> +## 3.4 Virtqueue receive flow filters (RFF)
> +0. Design goal:
> +   To filter and/or to steer packet based on specific pattern match to a
> +   specific destination to support application/networking stack driven receive
> +   processing.
> +1. Two use cases are: to support Linux netdev set_rxnfc() for ETHTOOL_SRXCLSRLINS
> +   and to support netdev feature NETIF_F_NTUPLE aka ARFS.
> +
> +### 3.4.1 control path
> +1. The number of flow filter operations/sec can range from 100k/sec to 1M/sec
> +   or even more. Hence flow filter operations must be done over a queueing
> +   interface using one or more queues.
> +2. The device should be able to expose one or more supported flow filter queue
> +   count and its start vq index to the driver.
> +3. As each device may be operating for different performance characteristic,
> +   start vq index and count may be different for each device. Secondly, it is
> +   inefficient for device to provide flow filters capabilities via a config space
> +   region. Hence, the device should be able to share these attributes using
> +   dma interface, instead of transport registers.
> +4. Since flow filters are enabled much later in the driver life cycle, driver
> +   will likely create these queues when flow filters are enabled.

I remember here that the creation phase of flow vq is in the probe 
phase, and a feature is used to indicate dynamic creation.
Or do we want to revert to a workaround similar to vq reset here?

Thanks!





^ permalink raw reply	[flat|nested] 17+ messages in thread

* [virtio-comment] RE: [PATCH requirements v5 5/7] net-features: Add n-tuple receive flow filters requirements
  2023-08-21  5:06   ` [virtio-comment] " Heng Qi
@ 2023-08-21  5:14     ` Parav Pandit
  0 siblings, 0 replies; 17+ messages in thread
From: Parav Pandit @ 2023-08-21  5:14 UTC (permalink / raw)
  To: Heng Qi, virtio-comment@lists.oasis-open.org,
	david.edmondson@oracle.com, xuanzhuo@linux.alibaba.com,
	sburla@marvell.com
  Cc: Shahaf Shuler, virtio@lists.oasis-open.org



> From: Heng Qi <hengqi@linux.alibaba.com>
> Sent: Monday, August 21, 2023 10:36 AM

> > +4. Since flow filters are enabled much later in the driver life cycle, driver
> > +   will likely create these queues when flow filters are enabled.
> 
> I remember here that the creation phase of flow vq is in the probe phase, and a
> feature is used to indicate dynamic creation.

> Or do we want to revert to a workaround similar to vq reset here?

As we discussed during v4, dynamic creation using a new feature bit.
Say F_VQ_DYNAMIC_CREATE.
Will introduce it as part of this series.
Let's not build new features using workarounds.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [virtio-comment] [PATCH requirements v5 1/7] net-features: Add requirements document for release 1.4
  2023-08-18  4:35 ` [virtio-comment] [PATCH requirements v5 1/7] net-features: Add requirements document for release 1.4 Parav Pandit
@ 2023-08-21 10:44   ` David Edmondson
  0 siblings, 0 replies; 17+ messages in thread
From: David Edmondson @ 2023-08-21 10:44 UTC (permalink / raw)
  To: Parav Pandit; +Cc: hengqi, xuanzhuo, sburla, shahafs, virtio, virtio-comment


On Friday, 2023-08-18 at 07:35:51 +03, Parav Pandit wrote:
> Add requirements document template for the virtio net features.
>
> Add virtio net device counters visible to driver.
>
> Signed-off-by: Parav Pandit <parav@nvidia.com>

Acked-by: David Edmondson <david.edmondson@oracle.com>

> ---
> changelog:
> v4->v5:
> - Fixed attributes query and counters query
> v3->v4:
> - Addressed comment from David
> - Added link to more counters that we are already discussing
> v0->v1:
> - removed tx dropped counter
> - updated requirements to mention about virtqueue interface for counters
>   query
> ---
>  net-workstream/features-1.4.md | 41 ++++++++++++++++++++++++++++++++++
>  1 file changed, 41 insertions(+)
>  create mode 100644 net-workstream/features-1.4.md
>
> diff --git a/net-workstream/features-1.4.md b/net-workstream/features-1.4.md
> new file mode 100644
> index 0000000..ea36f09
> --- /dev/null
> +++ b/net-workstream/features-1.4.md
> @@ -0,0 +1,41 @@
> +# 1. Introduction
> +
> +This document describes the overall requirements for virtio net device
> +improvements for upcoming release 1.4. Some of these requirements are
> +interrelated and influence the interface design, hence reviewing them
> +together is desired while updating the virtio net interface.
> +
> +# 2. Summary
> +1. Device counters visible to the driver
> +
> +# 3. Requirements
> +## 3.1 Device counters
> +1. The driver should be able to query the device and/or per vq counters for
> +   debugging purpose using a virtqueue directly from driver to device for
> +   example using a control vq.
> +2. The driver should be able to query which counters are supported using a
> +   virtqueue command, for example using an existing control vq.
> +3. If this device is migrated between two hosts, the driver should be able
> +   get the counter values in the destination host from where it was left
> +   off in the source host.
> +4. If a virtio device is a group member device, it must be possible to query
> +   all of the group member counters via the group owner device.
> +5. If a virtio device is a group member device, it must be possible to query
> +   all of the group member counter attributes via the group owner device.
> +
> +### 3.1.1 Per receive queue counters
> +1. le64 rx_oversize_pkt_errors: Packet dropped due to receive packet being
> +    oversize than the buffer size
> +2. le64 rx_no_buffer_pkt_errors: Packet dropped due to unavailability of the
> +    buffer in the receive queue
> +3. le64 rx_gso_pkts: Packets treated as receive GSO sequence by the device
> +4. le64 rx_pkts: Total packets received by the device
> +
> +### 3.1.2 Per transmit queue counters
> +1. le64 tx_gso_pkts: Packets send as transmit GSO sequence
> +2. le64 tx_pkts: Total packets send by the device
> +
> +### 3.1.3 More counters
> +More counters discussed in [1].
> +
> +[1] https://lists.oasis-open.org/archives/virtio-comment/202308/msg00176.html
-- 
I know a man called Sylvester, him have to wear a bullet proof vest y'all.



^ permalink raw reply	[flat|nested] 17+ messages in thread

* [virtio-comment] Re: [PATCH requirements v5 7/7] net-features: Add header data split requirements
  2023-08-18  4:35 ` [virtio-comment] [PATCH requirements v5 7/7] net-features: Add header data split requirements Parav Pandit
@ 2023-08-21 10:45   ` David Edmondson
  0 siblings, 0 replies; 17+ messages in thread
From: David Edmondson @ 2023-08-21 10:45 UTC (permalink / raw)
  To: Parav Pandit; +Cc: virtio-comment, hengqi, xuanzhuo, sburla, shahafs, virtio


On Friday, 2023-08-18 at 07:35:57 +03, Parav Pandit wrote:
> Add header data split requirements for the receive packets.
>
> Signed-off-by: Parav Pandit <parav@nvidia.com>

Acked-by: David Edmondson <david.edmondson@oracle.com>

> ---
>  net-workstream/features-1.4.md | 13 +++++++++++++
>  1 file changed, 13 insertions(+)
>
> diff --git a/net-workstream/features-1.4.md b/net-workstream/features-1.4.md
> index 31aa587..7a56fa8 100644
> --- a/net-workstream/features-1.4.md
> +++ b/net-workstream/features-1.4.md
> @@ -11,6 +11,7 @@ together is desired while updating the virtio net interface.
>  3. Virtqueue notification coalescing re-arming support
>  4. Virtqueue receive flow filters (RFF)
>  5. Device timestamp for tx and rx packets
> +6. Header data split for the receive virtqueue
>  
>  # 3. Requirements
>  ## 3.1 Device counters
> @@ -368,3 +369,15 @@ c. If/when virtio switch object is implemented, support ingress/egress flow
>     point of reception from the network.
>  3. The device should provide a receive packet timestamp in a single DMA
>     transaction along with the rest of the receive completion fields.
> +
> +## 3.6 Header data split for the receive virtqueue
> +1. The device should be able to DMA the packet header and data to two different
> +   memory locations; this enables the driver and networking stack to perform
> +   zero copy to application buffer(s).
> +2. The driver should be able to configure the maximum header buffer size per
> +   virtqueue.
> +3. The header buffer is to be in physically contiguous memory per virtqueue.
> +4. The device should be able to indicate header data split in the receive
> +   completion.
> +5. The device should be able to zero pad the header buffer when the received
> +   header is shorter than the CPU cache line size.
-- 
Do I have to tell the story, of a thousand rainy days since we first met?



^ permalink raw reply	[flat|nested] 17+ messages in thread

* [virtio-comment] Re: [PATCH requirements v5 3/7] net-features: Add low latency receive queue requirements
  2023-08-18  4:35 ` [virtio-comment] [PATCH requirements v5 3/7] net-features: Add low latency receive " Parav Pandit
@ 2023-08-21 10:47   ` David Edmondson
  2023-08-22  6:12     ` [virtio-comment] " Parav Pandit
  2023-09-11 13:47   ` [virtio-comment] Re: [virtio] " Stefan Hajnoczi
  1 sibling, 1 reply; 17+ messages in thread
From: David Edmondson @ 2023-08-21 10:47 UTC (permalink / raw)
  To: Parav Pandit; +Cc: virtio-comment, hengqi, xuanzhuo, sburla, shahafs, virtio


On Friday, 2023-08-18 at 07:35:53 +03, Parav Pandit wrote:
> Add requirements for the low latency receive queue.
>
> Signed-off-by: Parav Pandit <parav@nvidia.com>
> ---
> changelog:
> v0->v1:
> - clarified the requirements further
> - added line for the gro case
> - added design goals as the motivation for the requirements
> ---
>  net-workstream/features-1.4.md | 45 +++++++++++++++++++++++++++++++++-
>  1 file changed, 44 insertions(+), 1 deletion(-)
>
> diff --git a/net-workstream/features-1.4.md b/net-workstream/features-1.4.md
> index 1167ce2..bc9e971 100644
> --- a/net-workstream/features-1.4.md
> +++ b/net-workstream/features-1.4.md
> @@ -7,7 +7,7 @@ together is desired while updating the virtio net interface.
>  
>  # 2. Summary
>  1. Device counters visible to the driver
> -2. Low latency tx virtqueue for PCI transport
> +2. Low latency tx and rx virtqueues for PCI transport
>  
>  # 3. Requirements
>  ## 3.1 Device counters
> @@ -127,3 +127,46 @@ struct vnet_data_desc desc[2];
>  
>  9. A flow filter virtqueue also similarly need the ability to inline the short flow
>     command header.
> +
> +### 3.2.2 Low latency rx virtqueue
> +0. Design goal:
> +   a. Keep packet metadata and buffer data together which is consumed by driver
> +      layer and make it available in a single cache line of cpu

Phrased like this, it seems to run counter to the "header data split"
requirement.

Is there an implicit guard that this only applies for very small payloads?

> +   b. Instead of having per packet descriptors which is complex to scale for
> +      the device, supply the page directly to the device to consume it based
> +      on packet size
> +1. The device should be able to write a packet receive completion that consists
> +   of struct virtio_net_hdr (or similar) and a buffer id using a single DMA write
> +   PCIe TLP.
> +2. The device should be able to perform DMA writes of multiple packets
> +   completions in a single DMA transaction up to the PCIe maximum write limit
> +   in a transaction.
> +3. The device should be able to zero pad packet write completion to align it to
> +   64B or CPU cache line size whenever possible.
> +4. An example of the above DMA completion structure:
> +
> +```
> +/* Constant size receive packet completion */
> +struct vnet_rx_completion {
> +   u16 flags;
> +   u16 id; /* buffer id */
> +   u8 gso_type;
> +   u8 reserved[3];
> +   le16 gso_hdr_len;
> +   le16 gso_size;
> +   le16 csum_start;
> +   le16 csum_offset;
> +   u16 reserved2;
> +   u64 timestamp; /* explained later */
> +   u8 padding[];
> +};
> +```
> +5. The driver should be able to post constant-size buffer pages on a receive
> +   queue which can be consumed by the device for an incoming packet of any size
> +   from 64B to 9K bytes.
> +6. The device should be able to know the constant buffer size at receive
> +   virtqueue level instead of per buffer level.
> +7. The device should be able to indicate when a full page buffer is consumed,
> +   which can be recycled by the driver when the packets from the completed
> +   page is fully consumed.
> +8. The device should be able to consume multiple pages for a receive GSO stream.
-- 
Modern people tend to dance.




* [virtio-comment] RE: [PATCH requirements v5 3/7] net-features: Add low latency receive queue requirements
  2023-08-21 10:47   ` [virtio-comment] " David Edmondson
@ 2023-08-22  6:12     ` Parav Pandit
  0 siblings, 0 replies; 17+ messages in thread
From: Parav Pandit @ 2023-08-22  6:12 UTC (permalink / raw)
  To: David Edmondson
  Cc: virtio-comment@lists.oasis-open.org, hengqi@linux.alibaba.com,
	xuanzhuo@linux.alibaba.com, sburla@marvell.com, Shahaf Shuler,
	virtio@lists.oasis-open.org


> From: David Edmondson <david.edmondson@oracle.com>
> Sent: Monday, August 21, 2023 4:17 PM

> > +### 3.2.2 Low latency rx virtqueue
> > +0. Design goal:
> > +   a. Keep packet metadata and buffer data together which is consumed by
> > +      the driver layer and make it available in a single CPU cache line
> 
> Phrased like this, it seems to run counter to the "header data split"
> requirement.
> 
Mostly not. Currently, the packet metadata consumed by the driver is spread across two different DMAs at two different addresses.
For split q: virtio_net_hdr + used ring.
For packed q: virtio_net_hdr + desc.

Instead, both should complete in a single PCIe DMA and be read in a single CPU cache line while processing.

> Is there an implicit guard that this only applies for very small payloads?
> 
No.
All packet sizes benefit from it.

> > +   b. Instead of having per packet descriptors which is complex to scale for
> > +      the device, supply the page directly to the device to consume it based
> > +      on packet size
> > +1. The device should be able to write a packet receive completion that
> consists
> > +   of struct virtio_net_hdr (or similar) and a buffer id using a single DMA
> write
> > +   PCIe TLP.
> > +2. The device should be able to perform DMA writes of multiple packets
> > +   completions in a single DMA transaction up to the PCIe maximum write
> limit
> > +   in a transaction.
> > +3. The device should be able to zero pad packet write completion to align it
> to
> > +   64B or CPU cache line size whenever possible.
> > +4. An example of the above DMA completion structure:
> > +
> > +```
> > +/* Constant size receive packet completion */
> > +struct vnet_rx_completion {
> > +   u16 flags;
> > +   u16 id; /* buffer id */
> > +   u8 gso_type;
> > +   u8 reserved[3];
> > +   le16 gso_hdr_len;
> > +   le16 gso_size;
> > +   le16 csum_start;
> > +   le16 csum_offset;
> > +   u16 reserved2;
> > +   u64 timestamp; /* explained later */
> > +   u8 padding[];
> > +};
> > +```
> > +5. The driver should be able to post constant-size buffer pages on a receive
> > +   queue which can be consumed by the device for an incoming packet of any
> size
> > +   from 64B to 9K bytes.
> > +6. The device should be able to know the constant buffer size at receive
> > +   virtqueue level instead of per buffer level.
> > +7. The device should be able to indicate when a full page buffer is consumed,
> > +   which can be recycled by the driver when the packets from the completed
> > +   page is fully consumed.
> > +8. The device should be able to consume multiple pages for a receive GSO
> stream.
> --
> Modern people tend to dance.


* [virtio-comment] RE: [PATCH requirements v5 5/7] net-features: Add n-tuple receive flow filters requirements
  2023-08-18  4:35 ` [virtio-comment] [PATCH requirements v5 5/7] net-features: Add n-tuple receive flow filters requirements Parav Pandit
  2023-08-21  5:06   ` [virtio-comment] " Heng Qi
@ 2023-08-22  7:41   ` Parav Pandit
  1 sibling, 0 replies; 17+ messages in thread
From: Parav Pandit @ 2023-08-22  7:41 UTC (permalink / raw)
  To: virtio-comment@lists.oasis-open.org, hengqi@linux.alibaba.com,
	david.edmondson@oracle.com, xuanzhuo@linux.alibaba.com,
	sburla@marvell.com
  Cc: Shahaf Shuler, virtio@lists.oasis-open.org

Hi Michael, Jason,

> From: Parav Pandit <parav@nvidia.com>
> Sent: Friday, August 18, 2023 10:06 AM
> 
> Add virtio net device requirements for receive flow filters.
> 
> Signed-off-by: Parav Pandit <parav@nvidia.com>
> Signed-off-by: Satananda Burla <sburla@marvell.com>
> Signed-off-by: Heng Qi <hengqi@linux.alibaba.com>
> ---
> changelog:
> v4->v5:
> - Rewrote cvq and flow filter vq mutual exclusive text
> - added cvq command to enable flow filters on cvq
> - made commands more refined for priority, opcode and more
> - Addressed comments from Heng
> - restructured interface commands

[..]

We will be drafting the spec part for this patch, which is far more mature than the other requirements.
It has undergone many rounds of reviews and discussions.
Do you have any more comments?
We do not want to revisit the requirements during the spec review, so if you have comments, please raise them now.


> ---
>  net-workstream/features-1.4.md | 163
> +++++++++++++++++++++++++++++++++
>  1 file changed, 163 insertions(+)
> 
> diff --git a/net-workstream/features-1.4.md b/net-workstream/features-1.4.md
> index 72b4132..330949c 100644
> --- a/net-workstream/features-1.4.md
> +++ b/net-workstream/features-1.4.md
> @@ -9,6 +9,7 @@ together is desired while updating the virtio net interface.
>  1. Device counters visible to the driver  2. Low latency tx and rx virtqueues for
> PCI transport  3. Virtqueue notification coalescing re-arming support
> +4. Virtqueue receive flow filters (RFF)
> 
>  # 3. Requirements
>  ## 3.1 Device counters
> @@ -181,3 +182,165 @@ struct vnet_rx_completion {
>     notifications until the driver rearms the notifications of the virtqueue.
>  2. When the driver rearms the notification of the virtqueue, the device
>     to notify again if notification coalescing conditions are met.
> +
> +## 3.4 Virtqueue receive flow filters (RFF)
> +0. Design goal:
> +   To filter and/or to steer packet based on specific pattern match to a
> +   specific destination to support application/networking stack driven receive
> +   processing.
> +1. Two use cases are: to support Linux netdev set_rxnfc() for
> +   ETHTOOL_SRXCLSRLINS and to support netdev feature NETIF_F_NTUPLE aka ARFS.
> +
> +### 3.4.1 control path
> +1. The number of flow filter operations/sec can range from 100k/sec to
> +   1M/sec or even more. Hence flow filter operations must be done over a
> +   queueing interface using one or more queues.
> +2. The device should be able to expose one or more supported flow filter
> +   queue count and its start vq index to the driver.
> +3. As each device may be operating with different performance characteristics,
> +   the start vq index and count may be different for each device. Secondly,
> +   it is inefficient for the device to provide flow filter capabilities via a
> +   config space region. Hence, the device should be able to share these
> +   attributes using a DMA interface instead of transport registers.
> +4. Since flow filters are enabled much later in the driver life cycle, the
> +   driver will likely create these queues when flow filters are enabled.
> +5. Flow filter operations are often accelerated by the device in hardware.
> +   The ability to handle them on a queue other than the control vq is
> +   desired. This achieves near zero modifications to existing implementations
> +   to add new operations on new purpose-built queues (similar to transmit and
> +   receive queues). Some devices may not support flow filter queues and may
> +   want to support flow filter operations over the existing cvq; this gives
> +   the ability to utilize an existing cvq.
> +   Therefore,
> +   a. Flow filter queues and flow filter commands on cvq are mutually exclusive.
> +   b. When flow filter queues are supported, the driver should use the flow
> +      filter queues for flow filter operations.
> +      (Since cvq is not enabled for flow filters, any flow filter command coming
> +      on cvq must fail).
> +   c. If the driver wants to use flow filters over cvq, the driver must
> +      explicitly enable flow filters on cvq via a command; when they are
> +      enabled on the cvq, the driver cannot use flow filter queues. This
> +      eliminates any synchronization needed by the device among different
> +      types of queues.
> +6. The filter masks are optional; the device should be able to expose whether
> +   it supports filter masks.
> +7. The driver may want to have priority among groups of flow entries; to
> +   facilitate this, the device supports grouping flow filter entries by a
> +   notion of a flow group. Each flow group defines a priority in processing
> +   flows.
> +8. The driver and group owner driver should be able to query supported device
> +   limits for the receive flow filters.
> +9. Query the flow filter capabilities of the member device by the owner device
> +   using an administrative command.
> +
> +### 3.4.2 flow operations path
> +1. The driver should be able to define a receive packet match criteria, an
> +   action and a destination for a packet. For example, an ipv4 packet with a
> +   multicast address to be steered to the receive vq 0. The second example is
> +   ipv4, tcp packet matching a specified IP address and tcp port tuple to
> +   be steered to receive vq 10.
> +2. The match criteria should include well-defined exact tuple fields such as
> +   mac address, IP addresses, tcp/udp ports, etc.
> +3. The match criteria should also optionally include the field mask.
> +4. Action includes (a) dropping or (b) forwarding the packet.
> +5. Destination is a receive virtqueue index.
> +6. Receive packet processing chain is:
> +   a. filters programmed using cvq commands VIRTIO_NET_CTRL_RX,
> +      VIRTIO_NET_CTRL_MAC and VIRTIO_NET_CTRL_VLAN.
> +   b. filters programmed using RFF functionality.
> +   c. filters programmed using RSS VIRTIO_NET_CTRL_MQ_RSS_CONFIG command.
> +   Whichever filtering and steering functionality is enabled, they are applied
> +   in the above order.
> +7. If multiple entries are programmed which have overlapping filtering
> +   attributes for a received packet, the driver defines the location/priority
> +   of the entry.
> +8. The filter entries are usually small, a few tens of bytes (for example an
> +   IPv6 + TCP tuple is 36 bytes), and the ops/sec rate is high; hence
> +   supplying fields inside the queue descriptor is preferred for up to a
> +   certain fixed size, say 96 bytes.
> +9. A flow filter entry consists of (a) match criteria, (b) action,
> +    (c) destination and (d) a unique 32 bit flow id, all supplied by the
> +    driver.
> +10. The driver should be able to query and delete flow filter entry
> +    by the flow id.
> +
> +### 3.4.3 interface example
> +
> +1. Flow filter capabilities are queried using a DMA interface such as the
> +   cvq, using two different commands.
> +
> +```
> +struct virtio_net_rff_cmd {
> +	u8 class; /* RFF class */
> +	u8 commands; /* 0 = query cap
> +		      * 1 = query packet fields mask
> +		      * 2 = enable flow filter operations over cvq
> +		      * 3 = add flow group
> +		      * 4 = del flow group
> +		      * 5 = flow filter op.
> +		      */
> +	u8 command-specific-data[];
> +};
> +
> +/* command 1 (query) */
> +struct flow_filter_capabilities {
> +	le16 start_vq_index;
> +	le16 num_flow_filter_vqs;
> +	le16 max_flow_groups; /* valid group id = max_flow_groups - 1 */
> +	le16 max_group_priorities; /* max priorities of the group */
> +	le32 max_flow_filters_per_group;
> +	le32 max_flow_filters; /* max flow_id in add/del
> +				* equals max_flow_filters - 1.
> +				*/
> +	u8 max_priorities_per_group;
> +	u8 cvq_supports_flow_filters_ops;
> +};
> +
> +/* command 2 (query packet field masks) */
> +struct flow_filter_fields_support_mask {
> +	le64 supported_packet_field_mask_bmap[1];
> +};
> +
> +```
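As a driver-side illustration of the query interface above, the following sketch builds the capability query command. The class value is a placeholder assumption (no class id is defined in the requirements), and only the command opcodes listed in the comment above are mirrored.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical on-the-wire mirror of the example virtio_net_rff_cmd;
 * 'class' is renamed 'cls' here, and the fixed data size is illustrative. */
struct virtio_net_rff_cmd {
    uint8_t cls;      /* RFF class; value below is an assumed placeholder */
    uint8_t command;  /* 0 = query cap, 1 = query field masks, 2 = enable
                       * flow filters on cvq, 3/4 = add/del flow group,
                       * 5 = flow filter op */
    uint8_t data[64]; /* command-specific data */
};

#define RFF_CLASS_EXAMPLE  0x40  /* assumed, not spec-defined */
#define RFF_CMD_QUERY_CAP  0

/* Fill a capability query; the device replies with the
 * flow_filter_capabilities structure shown above. */
static void rff_build_query_cap(struct virtio_net_rff_cmd *cmd)
{
    memset(cmd, 0, sizeof(*cmd));
    cmd->cls = RFF_CLASS_EXAMPLE;
    cmd->command = RFF_CMD_QUERY_CAP;
}
```

The same builder pattern extends to command 1 (field masks) by switching the opcode; both replies arrive via DMA rather than config space reads, per requirement 3 of the control path.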
> +
> +2. Group add/delete cvq commands:
> +
> +```
> +/* command 3 */
> +struct virtio_net_rff_group_add {
> +	le16 priority;	/* higher the value, higher priority */
> +	le16 group_id;
> +};
> +
> +
> +/* command 4 */
> +struct virtio_net_rff_group_delete {
> +	le16 group_id;
> +};
> +```
> +
> +3. Flow filter entry add/modify, delete over flow vq:
> +
> +```
> +struct virtio_net_rff_add_modify {
> +	u8 flow_op;
> +	u8 priority;	/* higher the value, higher priority */
> +	u16 group_id;
> +	le32 flow_id;
> +	struct match_criteria mc;
> +	struct destination dest;
> +	struct action action;
> +
> +	struct match_criteria mask;	/* optional */
> +};
> +
> +struct virtio_net_rff_delete {
> +	u8 flow_op;
> +	u8 padding[3];
> +	le32 flow_id;
> +};
> +
> +```
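The add/delete pair above keys entries by the driver-supplied flow id. A minimal lifecycle sketch, assuming illustrative flow_op opcode values and reducing the match criteria to a byte blob (the real structs carry match/destination/action sub-structures):

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumed opcode values; the requirements do not define the encoding. */
#define RFF_FLOW_OP_ADD 1
#define RFF_FLOW_OP_DEL 2

/* Simplified mirrors of the example structs above. */
struct rff_add_modify {
    uint8_t  flow_op;
    uint8_t  priority;   /* higher the value, higher the priority */
    uint16_t group_id;
    uint32_t flow_id;    /* unique, driver-supplied */
    uint8_t  match[36];  /* e.g. an IPv6 + TCP tuple is 36 bytes */
};

struct rff_delete {
    uint8_t  flow_op;
    uint8_t  padding[3];
    uint32_t flow_id;
};

/* Driver adds an entry on the flow vq... */
static void rff_fill_add(struct rff_add_modify *a, uint32_t flow_id,
                         uint16_t group_id, uint8_t priority)
{
    memset(a, 0, sizeof(*a));
    a->flow_op  = RFF_FLOW_OP_ADD;
    a->priority = priority;
    a->group_id = group_id;
    a->flow_id  = flow_id;
}

/* ...and later queries or deletes it by the same flow id. */
static void rff_fill_delete(struct rff_delete *d, uint32_t flow_id)
{
    memset(d, 0, sizeof(*d));
    d->flow_op = RFF_FLOW_OP_DEL;
    d->flow_id = flow_id;
}
```

Because the entry is small and the op rate is high, both descriptors fit inline in a flow vq descriptor, per requirement 8 of the flow operations path.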
> +
> +### 3.4.4 For incremental future
> +a. Driver should be able to specify a specific packet byte offset, number
> +   of bytes and mask as match criteria.
> +b. Support RSS context, in addition to a specific RQ.
> +c. If/when virtio switch object is implemented, support ingress/egress flow
> +   filters at the switch port level.
> --
> 2.26.2



* [virtio-comment] Re: [virtio] [PATCH requirements v5 3/7] net-features: Add low latency receive queue requirements
  2023-08-18  4:35 ` [virtio-comment] [PATCH requirements v5 3/7] net-features: Add low latency receive " Parav Pandit
  2023-08-21 10:47   ` [virtio-comment] " David Edmondson
@ 2023-09-11 13:47   ` Stefan Hajnoczi
  2023-09-11 16:03     ` [virtio-comment] " Parav Pandit
  1 sibling, 1 reply; 17+ messages in thread
From: Stefan Hajnoczi @ 2023-09-11 13:47 UTC (permalink / raw)
  To: Parav Pandit
  Cc: virtio-comment, hengqi, david.edmondson, xuanzhuo, sburla,
	shahafs, virtio


On Fri, Aug 18, 2023 at 07:35:53AM +0300, Parav Pandit wrote:
> +### 3.2.2 Low latency rx virtqueue
> +0. Design goal:
> +   a. Keep packet metadata and buffer data together which is consumed by driver
> +      layer and make it available in a single cache line of cpu
> +   b. Instead of having per packet descriptors which is complex to scale for
> +      the device, supply the page directly to the device to consume it based
> +      on packet size
> +1. The device should be able to write a packet receive completion that consists
> +   of struct virtio_net_hdr (or similar) and a buffer id using a single DMA write
> +   PCIe TLP.
> +2. The device should be able to perform DMA writes of multiple packets
> +   completions in a single DMA transaction up to the PCIe maximum write limit
> +   in a transaction.
> +3. The device should be able to zero pad packet write completion to align it to
> +   64B or CPU cache line size whenever possible.
> +4. An example of the above DMA completion structure:
> +
> +```
> +/* Constant size receive packet completion */
> +struct vnet_rx_completion {
> +   u16 flags;
> +   u16 id; /* buffer id */
> +   u8 gso_type;
> +   u8 reserved[3];
> +   le16 gso_hdr_len;
> +   le16 gso_size;
> +   le16 csum_start;
> +   le16 csum_offset;
> +   u16 reserved2;
> +   u64 timestamp; /* explained later */
> +   u8 padding[];
> +};
> +```
> +5. The driver should be able to post constant-size buffer pages on a receive
> +   queue which can be consumed by the device for an incoming packet of any size
> +   from 64B to 9K bytes.
> +6. The device should be able to know the constant buffer size at receive
> +   virtqueue level instead of per buffer level.
> +7. The device should be able to indicate when a full page buffer is consumed,
> +   which can be recycled by the driver when the packets from the completed
> +   page is fully consumed.
> +8. The device should be able to consume multiple pages for a receive GSO stream.

If I understand correctly there is no longer a 1:1 correspondence
between driver-supplied rx pages (available buffers) and received
packets (used buffers). Instead, the device consumes portions of
driver-supplied rx pages as needed and notifies the driver, and the
entire rx page is marked used later when it has been fully consumed.

The virtqueue model is based on submitting available buffers and
completing used buffers, not individual DMA transfers. It's not possible
to do DMA piecewise in this model. If you think about a VIRTIO over TCP
transport that uses message-passing for available and used buffers, then
it's clear the rx page approach breaks the model because only entire
virtqueue buffers can be marked used (there is a 1:1 correspondence
between available buffers and used buffers).

Two options:
1. Extend the virtqueue model to support this.
2. Document this violation of the virtqueue model clearly but treat it
   as an exception that may lead to complications in the future (e.g.
   incompatibility with VIRTIO over TCP).

I think it's worth investigating #1 to see whether the virtqueue model
can be extended cleanly.

Stefan



* [virtio-comment] RE: [virtio] [PATCH requirements v5 3/7] net-features: Add low latency receive queue requirements
  2023-09-11 13:47   ` [virtio-comment] Re: [virtio] " Stefan Hajnoczi
@ 2023-09-11 16:03     ` Parav Pandit
  0 siblings, 0 replies; 17+ messages in thread
From: Parav Pandit @ 2023-09-11 16:03 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: virtio-comment@lists.oasis-open.org, hengqi@linux.alibaba.com,
	david.edmondson@oracle.com, xuanzhuo@linux.alibaba.com,
	sburla@marvell.com, Shahaf Shuler, virtio@lists.oasis-open.org



> From: Stefan Hajnoczi <stefanha@redhat.com>
> Sent: Monday, September 11, 2023 7:17 PM

> 
> If I understand correctly there is no longer a 1:1 correspondence between
> driver-supplied rx pages (available buffers) and received packets (used buffers).
> Instead, the device consumes portions of driver-supplied rx pages as needed
> and notifies the driver, and the entire rx page is marked used later when it has
> been fully consumed.
> 
> The virtqueue model is based on submitting available buffers and completing
> used buffers, not individual DMA transfers. It's not possible to do DMA
> piecewise in this model. If you think about a VIRTIO over TCP transport that
> uses message-passing for available and used buffers, then it's clear the rx page
> approach breaks the model because only entire virtqueues buffers can be
> marked used (there is a 1:1 correspondence between available buffers and used
> buffers).
> 
> Two options:
> 1. Extend the virtqueue model to support this.
> 2. Document this violation of the virtqueue model clearly but treat it
>    as an exception that may lead to complications in the future (e.g.
>    incompatibility with VIRTIO over TCP).
> 
I don't think it is a violation. It is an extension, a new model. PCI and MMIO will support it.
A TCP transport may not be able to support everything that exists today in PCI.
But I am not fully sure at present that this is a limitation.

I will consider #1 further later this month.
This week is occupied with the LM and flow filters that we want to review in the Wednesday meeting.

> I think it's worth investigating #1 to see whether the virtqueue model can be
> extended cleanly.


