* [virtio-comment] [PATCH requirements 1/7] net-features: Add requirements document for release 1.4
2023-06-01 22:02 [virtio-comment] [PATCH requirements 0/7] virtio net new features requirements Parav Pandit
@ 2023-06-01 22:02 ` Parav Pandit
2023-06-06 22:15 ` Michael S. Tsirkin
2023-06-07 9:31 ` Xuan Zhuo
0 siblings, 2 replies; 73+ messages in thread
From: Parav Pandit @ 2023-06-01 22:02 UTC (permalink / raw)
To: virtio-comment; +Cc: shahafs, virtio, Parav Pandit
Add requirements document template for the virtio net features.
Add virtio net device counters visible to driver.
Signed-off-by: Parav Pandit <parav@nvidia.com>
---
net-workstream/features-1.4.md | 36 ++++++++++++++++++++++++++++++++++
1 file changed, 36 insertions(+)
create mode 100644 net-workstream/features-1.4.md
diff --git a/net-workstream/features-1.4.md b/net-workstream/features-1.4.md
new file mode 100644
index 0000000..03b4eb3
--- /dev/null
+++ b/net-workstream/features-1.4.md
@@ -0,0 +1,36 @@
+# 1. Introduction
+
+This document describes the overall requirements for virtio net device
+improvements for upcoming release 1.4. Some of these requirements are
+interrelated and influence the interface design, hence reviewing them
+together is desired while updating the virtio net interface.
+
+# 2. Summary
+1. Device counters visible to the driver
+
+# 3. Requirements
+## 3.1 Device counters
+1. The driver should be able to query the device and/or per vq counters for
+ debugging purpose using a control vq command.
+2. The driver should be able to query which counters are supported using a
+ control vq command.
+3. If this device is migrated between two hosts, the driver should be able
+ get the counter values in the destination host from where it was left
+ off in the source host.
+4. If a virtio device is group member device, a group owner should be able
+ to query all the counter attributes using the admin queue command which
+ a virtio device will expose via a control vq to the driver.
+
+### 3.1.1 Per receive queue counters
+1. le64 rx_oversize_pkt_errors: Packet dropped due to receive packet being
+ oversize than the buffer size
+2. le64 rx_no_buffer_pkt_errors: Packet dropped due to unavailability of the
+ buffer in the receive queue
+3. le64 rx_gro_pkts: Packets treated as guest GSO sequence by the device
+4. le64 rx_pkts: Total packets received by the device
+
+### 3.1.2 Per transmit queue counters
+1. le64 tx_bad_desc_errors: Descriptors dropped by the device due to errors in
+ descriptors
+2. le64 tx_gso_pkts: Packets send as host GSO sequence
+3. le64 tx_pkts: Total packets send by the device
--
2.26.2
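For illustration, the per-queue counters listed in sections 3.1.1 and 3.1.2 of the patch could be laid out as consecutive le64 fields. This is a minimal sketch only: the field order, the absence of padding, and the helper names `pack_counters` and `unpack_counters` are assumptions, not something the patch defines.

```python
import struct
from dataclasses import dataclass, astuple


@dataclass
class RxQueueCounters:
    """Per receive queue counters from section 3.1.1 (field order assumed)."""
    rx_oversize_pkt_errors: int = 0
    rx_no_buffer_pkt_errors: int = 0
    rx_gro_pkts: int = 0
    rx_pkts: int = 0


@dataclass
class TxQueueCounters:
    """Per transmit queue counters from section 3.1.2 (field order assumed)."""
    tx_bad_desc_errors: int = 0
    tx_gso_pkts: int = 0
    tx_pkts: int = 0


def pack_counters(counters) -> bytes:
    """Serialize every field as an unsigned little-endian 64-bit value (le64)."""
    values = astuple(counters)
    return struct.pack(f"<{len(values)}Q", *values)


def unpack_counters(cls, buf: bytes):
    """Rebuild a counters object from its le64 wire form."""
    count = len(buf) // 8
    return cls(*struct.unpack(f"<{count}Q", buf))
```

Under these assumptions a receive queue reply is 32 bytes and a transmit queue reply is 24 bytes, regardless of which transport (control vq or otherwise) carries it.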
This publicly archived list offers a means to provide input to the
OASIS Virtual I/O Device (VIRTIO) TC.
In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.
Subscribe: virtio-comment-subscribe@lists.oasis-open.org
Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
List help: virtio-comment-help@lists.oasis-open.org
List archive: https://lists.oasis-open.org/archives/virtio-comment/
Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
Committee: https://www.oasis-open.org/committees/virtio/
Join OASIS: https://www.oasis-open.org/join/
* Re: [virtio-comment] [PATCH requirements 1/7] net-features: Add requirements document for release 1.4
2023-06-01 22:02 ` [virtio-comment] [PATCH requirements 1/7] net-features: Add requirements document for release 1.4 Parav Pandit
@ 2023-06-06 22:15 ` Michael S. Tsirkin
2023-06-06 22:28 ` Parav Pandit
2023-06-07 9:31 ` Xuan Zhuo
1 sibling, 1 reply; 73+ messages in thread
From: Michael S. Tsirkin @ 2023-06-06 22:15 UTC (permalink / raw)
To: Parav Pandit; +Cc: virtio-comment, shahafs, virtio
On Fri, Jun 02, 2023 at 01:02:59AM +0300, Parav Pandit wrote:
> Add requirements document template for the virtio net features.
>
> Add virtio net device counters visible to driver.
>
> Signed-off-by: Parav Pandit <parav@nvidia.com>
> ---
> net-workstream/features-1.4.md | 36 ++++++++++++++++++++++++++++++++++
> 1 file changed, 36 insertions(+)
> create mode 100644 net-workstream/features-1.4.md
>
> diff --git a/net-workstream/features-1.4.md b/net-workstream/features-1.4.md
> new file mode 100644
> index 0000000..03b4eb3
> --- /dev/null
> +++ b/net-workstream/features-1.4.md
> @@ -0,0 +1,36 @@
> +# 1. Introduction
> +
> +This document describes the overall requirements for virtio net device
> +improvements for upcoming release 1.4. Some of these requirements are
> +interrelated and influence the interface design, hence reviewing them
> +together is desired while updating the virtio net interface.
> +
> +# 2. Summary
> +1. Device counters visible to the driver
> +
> +# 3. Requirements
> +## 3.1 Device counters
> +1. The driver should be able to query the device and/or per vq counters for
> + debugging purpose using a control vq command.
> +2. The driver should be able to query which counters are supported using a
> + control vq command.
why does it matter for requirements whether it's a control vq?
what matters is whether they have to be synchronized
with a given queue - I get it they don't have to.
> +3. If this device is migrated between two hosts, the driver should be able
> + get the counter values in the destination host from where it was left
> + off in the source host.
we don't cover migration currently, so I don't see how this is a spec
requirement. unless maybe it's justification for 4?
so maybe it means there needs to be a way to set counters?
so there's no need to mention migration - just that it
should be possible to move counters between devices.
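A minimal sketch of the "move counters between devices" idea, assuming hypothetical get and set operations on a device (neither is defined by the patch or the spec, and the class and method names are made up for illustration):

```python
U64_MASK = (1 << 64) - 1


class DeviceCounters:
    """Toy model of a device's counters, keyed by counter name."""

    def __init__(self):
        self._values = {}

    def add(self, name: str, amount: int = 1) -> None:
        # Counters are modelled as le64 values that wrap at 2**64.
        self._values[name] = (self._values.get(name, 0) + amount) & U64_MASK

    def get_counters(self) -> dict:
        """Hypothetical 'get' operation: snapshot of all counter values."""
        return dict(self._values)

    def set_counters(self, snapshot: dict) -> None:
        """Hypothetical 'set' operation: load values taken from another device."""
        self._values = {name: value & U64_MASK
                        for name, value in snapshot.items()}
```

With such a pair of operations, a second device that loads the first device's snapshot keeps counting from where the first one stopped, which is what requirement 3 asks for without naming migration.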
> +4. If a virtio device is group member device, a group owner should be able
> + to query all the counter attributes using the admin queue command which
> + a virtio device will expose via a control vq to the driver.
this seems weirdly specific.
what is the actual requirement?
> +
> +### 3.1.1 Per receive queue counters
> +1. le64 rx_oversize_pkt_errors: Packet dropped due to receive packet being
> + oversize than the buffer size
with mergeable buffers how does this differ from 2?
> +2. le64 rx_no_buffer_pkt_errors: Packet dropped due to unavailability of the
> + buffer in the receive queue
> +3. le64 rx_gro_pkts: Packets treated as guest GSO sequence by the device
what does this mean exactly? packets before or after they are combined?
pls stick to device/driver terminology not guest/host
> +4. le64 rx_pkts: Total packets received by the device
including dropped ones or not?
> +
> +### 3.1.2 Per transmit queue counters
> +1. le64 tx_bad_desc_errors: Descriptors dropped by the device due to errors in
> + descriptors
since when do we drop packets on error in descriptor?
we just as likely stall ...
> +2. le64 tx_gso_pkts: Packets send as host GSO sequence
same questions as gro
> +3. le64 tx_pkts: Total packets send by the device
sent
> --
> 2.26.2
* RE: [virtio-comment] [PATCH requirements 1/7] net-features: Add requirements document for release 1.4
2023-06-06 22:15 ` Michael S. Tsirkin
@ 2023-06-06 22:28 ` Parav Pandit
2023-06-06 22:56 ` Michael S. Tsirkin
0 siblings, 1 reply; 73+ messages in thread
From: Parav Pandit @ 2023-06-06 22:28 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: virtio-comment@lists.oasis-open.org, Shahaf Shuler,
virtio@lists.oasis-open.org
> From: Michael S. Tsirkin <mst@redhat.com>
> Sent: Tuesday, June 6, 2023 6:15 PM
> > +# 3. Requirements
> > +## 3.1 Device counters
> > +1. The driver should be able to query the device and/or per vq counters for
> > + debugging purpose using a control vq command.
> > +2. The driver should be able to query which counters are supported using a
> > + control vq command.
>
> why does it matter for requirements whether it's a control vq?
>
It matters for requirements, so we produce design that addresses it.
We don't want to add an ever-growing bitmap to config space, which may differ between devices.
>
> what matters is whether they have to be synchronized with a given queue - I get
> it they don't have to.
They don't have to be.
I don't see how it can ever be synchronized.
>
> > +3. If this device is migrated between two hosts, the driver should be able
> > + get the counter values in the destination host from where it was left
> > + off in the source host.
>
> we don't cover migration currently don't see how this is a spec rquirement.
> unless maybe it's justification for 4?
True, but the design needs to keep this in mind if it has touch points such as #4, so that when migration arrives the building block is in place.
> so maybe it means there needs to be a way to set counters?
> so there's no need to mention migration - just that it should be possible to
> move counters between devices.
>
> > +4. If a virtio device is group member device, a group owner should be able
> > + to query all the counter attributes using the admin queue command which
> > + a virtio device will expose via a control vq to the driver.
>
>
> this seems weirdly specific.
> what is the actual requirement?
>
I don't follow the question.
When a device migrates from src to dst, the group owner needs to know whether the underlying member devices on both sides have the same counter attributes.
>
> > +
> > +### 3.1.1 Per receive queue counters
> > +1. le64 rx_oversize_pkt_errors: Packet dropped due to receive packet being
> > + oversize than the buffer size
>
> with mergeable buffers how does this differ from 2?
>
> > +2. le64 rx_no_buffer_pkt_errors: Packet dropped due to unavailability of the
> > + buffer in the receive queue
> > +3. le64 rx_gro_pkts: Packets treated as guest GSO sequence by the device
>
> what does this mean exactly? packets before or after they are combined?
>
Before.
> pls stick to device/driver terminology not guest/host
>
Yes, will change. The GUEST wording came from the current F_GUEST terminology of the net device.
> > +4. le64 rx_pkts: Total packets received by the device
>
> including dropped ones or not?
>
Not including. Will add this clarification further in v1.
> > +
> > +### 3.1.2 Per transmit queue counters
> > +1. le64 tx_bad_desc_errors: Descriptors dropped by the device due to errors in
> > + descriptors
>
> since when do we drop packets on error in descriptor?
> we just as likely stall ...
>
What to do on a bad descriptor is left to the device implementation.
> > +2. le64 tx_gso_pkts: Packets send as host GSO sequence
>
> same questions as gro
>
> > +3. le64 tx_pkts: Total packets send by the device
>
> sent
>
Ack.
> >
* Re: [virtio-comment] [PATCH requirements 1/7] net-features: Add requirements document for release 1.4
2023-06-06 22:28 ` Parav Pandit
@ 2023-06-06 22:56 ` Michael S. Tsirkin
2023-06-06 23:08 ` Parav Pandit
0 siblings, 1 reply; 73+ messages in thread
From: Michael S. Tsirkin @ 2023-06-06 22:56 UTC (permalink / raw)
To: Parav Pandit
Cc: virtio-comment@lists.oasis-open.org, Shahaf Shuler,
virtio@lists.oasis-open.org
On Tue, Jun 06, 2023 at 10:28:46PM +0000, Parav Pandit wrote:
>
>
> > From: Michael S. Tsirkin <mst@redhat.com>
> > Sent: Tuesday, June 6, 2023 6:15 PM
>
> > > +# 3. Requirements
> > > +## 3.1 Device counters
> > > +1. The driver should be able to query the device and/or per vq counters for
> > > + debugging purpose using a control vq command.
> > > +2. The driver should be able to query which counters are supported using a
> > > + control vq command.
> >
> > why does it matter for requirements whether it's a control vq?
> >
> It matters for requirements, so we produce design that addresses it.
> We don't want to add config space every growing bit map which may be different between different devices.
then say that you want to conserve config space, that is
the requirement. not cvq specifically.
> >
> > what matters is whether they have to be synchronized with a given queue - I get
> > it they don't have to.
> They don't have to be.
Then say that.
> I don't see how it can be every synchronized.
you could program them through the vq itself.
> >
> > > +3. If this device is migrated between two hosts, the driver should be able
> > > + get the counter values in the destination host from where it was left
> > > + off in the source host.
> >
> > we don't cover migration currently don't see how this is a spec rquirement.
> > unless maybe it's justification for 4?
> True, but design need to keep this in mind if it has some touch points like of #4 so when migration arrives, it has the building block in place.
so again, let's make sure we capture the actual spec requirement.
> > so maybe it means there needs to be a way to set counters?
> > so there's no need to mention migration - just that it should be possible to
> > move counters between devices.
> >
> > > +4. If a virtio device is group member device, a group owner should be able
> > > + to query all the counter attributes using the admin queue command which
> > > + a virtio device will expose via a control vq to the driver.
> >
> >
> > this seems weirdly specific.
> > what is the actual requirement?
> >
> I don't follow the question.
> When a device migrates from src to dst, group owner needs to know if both side underlying member device has same counter attributes or not.
whether it's through a command or not is not a requirement
and I still do not know what the requirement is.
what does "same counter attributes" mean? you never mentioned
attributes before.
> >
> > > +
> > > +### 3.1.1 Per receive queue counters
> > > +1. le64 rx_oversize_pkt_errors: Packet dropped due to receive packet being
> > > + oversize than the buffer size
> >
> > with mergeable buffers how does this differ from 2?
> >
> > > +2. le64 rx_no_buffer_pkt_errors: Packet dropped due to unavailability of the
> > > + buffer in the receive queue
> > > +3. le64 rx_gro_pkts: Packets treated as guest GSO sequence by the device
> >
> > what does this mean exactly? packets before or after they are combined?
> >
> Before.
>
> > pls stick to device/driver terminology not guest/host
> >
> Yes, will change the GUEST surfaced from the current F_GUEST terminology of the net device.
So? this predates 1.x spec we never bothered changing them.
> > > +4. le64 rx_pkts: Total packets received by the device
> >
> > including dropped ones or not?
> >
> Not including. Will add this clarification further in v1.
>
> > > +
> > > +### 3.1.2 Per transmit queue counters
> > > +1. le64 tx_bad_desc_errors: Descriptors dropped by the device due to errors in
> > > + descriptors
> >
> > since when do we drop packets on error in descriptor?
> > we just as likely stall ...
> >
> It is left to the device to implement on what to do on bad desc.
then how can you count drops? why does this even matter?
and why on tx specifically?
I feel addressing descriptor errors is a completely separate project.
> > > +2. le64 tx_gso_pkts: Packets send as host GSO sequence
> >
> > same questions as gro
> >
> > > +3. le64 tx_pkts: Total packets send by the device
> >
> > sent
> >
> Ack.
> > >
* RE: [virtio-comment] [PATCH requirements 1/7] net-features: Add requirements document for release 1.4
2023-06-06 22:56 ` Michael S. Tsirkin
@ 2023-06-06 23:08 ` Parav Pandit
2023-06-06 23:18 ` Michael S. Tsirkin
2023-06-07 20:35 ` Michael S. Tsirkin
0 siblings, 2 replies; 73+ messages in thread
From: Parav Pandit @ 2023-06-06 23:08 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: virtio-comment@lists.oasis-open.org, Shahaf Shuler,
virtio@lists.oasis-open.org
> From: virtio-comment@lists.oasis-open.org <virtio-comment@lists.oasis-open.org> On Behalf Of Michael S. Tsirkin
> Sent: Tuesday, June 6, 2023 6:57 PM
> > It matters for requirements, so we produce design that addresses it.
> > We don't want to add config space every growing bit map which may be different between different devices.
>
> then say that you want to conserve config space, that is the requirement. not
> cvq specifically.
>
Well, in one meeting you specifically said that requirements and design should be combined, so it is drafted this way, instead of something very abstract like "conserve config space".
We can debate and change from cvq to new cfgvq as part of the journey.
It is fine.
> > >
> > > what matters is whether they have to be synchronized with a given
> > > queue - I get it they don't have to.
> > They don't have to be.
>
> Then say that.
>
I thought this was obvious, as querying a counter is a hundred times slower than packet processing.
> > I don't see how it can be every synchronized.
>
> you could program them through the vq itself.
>
Do you mean in the packet transmit and receive completions themselves?
Mixing control fields into the data path would be too heavy.
> > > we don't cover migration currently don't see how this is a spec rquirement.
> > > unless maybe it's justification for 4?
> > True, but design need to keep this in mind if it has some touch points like of #4 so when migration arrives, it has the building block in place.
>
> so again, let's make sure we capture the actual spec requirement.
>
It is an actual spec requirement to be done and drafted in the spec.
We may not do everything in the first phase; these are the broad requirements.
And in the design, we will say that requirement #4 is phase 2.
>
> > > so maybe it means there needs to be a way to set counters?
> > > so there's no need to mention migration - just that it should be
> > > possible to move counters between devices.
> > >
> > > > +4. If a virtio device is group member device, a group owner should be able
> > > > + to query all the counter attributes using the admin queue command which
> > > > + a virtio device will expose via a control vq to the driver.
> > >
> > >
> > > this seems weirdly specific.
> > > what is the actual requirement?
> > >
> > I don't follow the question.
> > When a device migrates from src to dst, group owner needs to know if both side underlying member device has same counter attributes or not.
>
> whether it's through a command or not is not a requirement and I still do not
> know what the requirement is.
> what does "same counter attributes" mean? you never mentioned attributes
> before.
>
I will refine this further and drop the word "attribute".
If a device supports X counters, the group owner should be able to learn this bitmap.
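The supported-counter bitmap mentioned here can be sketched as follows. The bit positions and the names `CounterBit`, `missing_on_destination`, and `same_counters` are hypothetical; the thread does not assign bits or define how the group owner obtains the bitmap.

```python
from enum import IntFlag


class CounterBit(IntFlag):
    """Hypothetical bit positions; the thread does not assign any."""
    RX_OVERSIZE_PKT_ERRORS = 1 << 0
    RX_NO_BUFFER_PKT_ERRORS = 1 << 1
    RX_GRO_PKTS = 1 << 2
    RX_PKTS = 1 << 3
    TX_BAD_DESC_ERRORS = 1 << 4
    TX_GSO_PKTS = 1 << 5
    TX_PKTS = 1 << 6


def missing_on_destination(src_bitmap: int, dst_bitmap: int) -> CounterBit:
    """Counters the source member device supports but the destination lacks."""
    return CounterBit(int(src_bitmap) & ~int(dst_bitmap))


def same_counters(src_bitmap: int, dst_bitmap: int) -> bool:
    """True when both member devices report exactly the same counter set."""
    return int(src_bitmap) == int(dst_bitmap)
```

A group owner holding both bitmaps could use a check like this before moving a member device from src to dst.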
> > Yes, will change the GUEST surfaced from the current F_GUEST terminology of the net device.
>
> So? this predates 1.x spec we never bothered changing them.
>
Will remove the guest wording and will change to transmit and receive.
> > > > +4. le64 rx_pkts: Total packets received by the device
> > >
> > > including dropped ones or not?
> > >
> > Not including. Will add this clarification further in v1.
> >
> > > > +
> > > > +### 3.1.2 Per transmit queue counters
> > > > +1. le64 tx_bad_desc_errors: Descriptors dropped by the device due to errors in
> > > > + descriptors
> > >
> > > since when do we drop packets on error in descriptor?
> > > we just as likely stall ...
> > >
> > It is left to the device to implement on what to do on bad desc.
>
> then how can you count drops?
A device may count the drops based on errors.
It is counting drops based on the error it got.
> why does this even matter?
To debug.
> and why on tx specifically?
Missed the rx, will add.
> I feel addressing descriptor errors is a completely separate project.
Not sure it is that big a project.
* Re: [virtio-comment] [PATCH requirements 1/7] net-features: Add requirements document for release 1.4
2023-06-06 23:08 ` Parav Pandit
@ 2023-06-06 23:18 ` Michael S. Tsirkin
2023-06-07 20:35 ` Michael S. Tsirkin
1 sibling, 0 replies; 73+ messages in thread
From: Michael S. Tsirkin @ 2023-06-06 23:18 UTC (permalink / raw)
To: Parav Pandit
Cc: virtio-comment@lists.oasis-open.org, Shahaf Shuler,
virtio@lists.oasis-open.org
On Tue, Jun 06, 2023 at 11:08:12PM +0000, Parav Pandit wrote:
>
>
> > From: virtio-comment@lists.oasis-open.org <virtio-comment@lists.oasis-open.org> On Behalf Of Michael S. Tsirkin
> > Sent: Tuesday, June 6, 2023 6:57 PM
>
> > > It matters for requirements, so we produce design that addresses it.
> > > We don't want to add config space every growing bit map which may be different between different devices.
> >
> > then say that you want to conserve config space, that is the requirement. not
> > cvq specifically.
> >
> Well in one meeting you specifically told that requirements and design to be combined together, so it is drafted this way.
> Instead of very abstract like "conserve config space".
>
> We can debate and change from cvq to new cfgvq as part of the journey.
> It is fine.
just don't keep listing this in each feature. my plan is to work
on cfgvq so we can keep using config space without these
limitations.
> > > >
> > > > what matters is whether they have to be synchronized with a given
> > > > queue - I get it they don't have to.
> > > They don't have to be.
> >
> > Then say that.
> >
> I thought this is very obvious as querying counter is hundred-time slower operation than packet processing.
>
> > > I don't see how it can be every synchronized.
> >
> > you could program them through the vq itself.
> >
> Do you mean in the packet transmit and receive completions itself?
> It would be too heavy to do so to mix the control fields in the data path.
maybe. but notice how you have a specific design in mind so you are jumping ahead.
you asked how we could make these synch, i told you how.
the actual point is you want to say "we don't need this synchronous",
design idea: use cvq
> > > > we don't cover migration currently don't see how this is a spec rquirement.
> > > > unless maybe it's justification for 4?
> > > True, but design need to keep this in mind if it has some touch points like of #4 so when migration arrives, it has the building block in place.
> >
> > so again, let's make sure we capture the actual spec requirement.
> >
> It is an actual spec requirement to be done and drafted in the spec.
> We may not do everything in the first phase, this are the broad requirements.
> And in design, we will say, requirement #4 is phase 2.
yes but this one I don't know what it means, spec wise.
some kind of block? what for?
> >
> > > > so maybe it means there needs to be a way to set counters?
> > > > so there's no need to mention migration - just that it should be
> > > > possible to move counters between devices.
> > > >
> > > > > +4. If a virtio device is group member device, a group owner should be able
> > > > > + to query all the counter attributes using the admin queue command which
> > > > > + a virtio device will expose via a control vq to the driver.
> > > >
> > > >
> > > > this seems weirdly specific.
> > > > what is the actual requirement?
> > > >
> > > I don't follow the question.
> > > When a device migrates from src to dst, group owner needs to know if both side underlying member device has same counter attributes or not.
> >
> > whether it's through a command or not is not a requirement and I still do not
> > know what the requirement is.
> > what does "same counter attributes" mean? you never mentioned attributes
> > before.
> >
> I will refine this further and drop the word "attribute".
> If device support X counters, group owner should be able to know this bitmap.
i guess device might only support part of counters then?
> > > Yes, will change the GUEST surfaced from the current F_GUEST terminology of the net device.
> >
> > So? this predates 1.x spec we never bothered changing them.
> >
> Will remove the guest wording and will change to transmit and receive.
>
>
> > > > > +4. le64 rx_pkts: Total packets received by the device
> > > >
> > > > including dropped ones or not?
> > > >
> > > Not including. Will add this clarification further in v1.
> > >
> > > > > +
> > > > > +### 3.1.2 Per transmit queue counters
> > > > > +1. le64 tx_bad_desc_errors: Descriptors dropped by the device due to errors in
> > > > > + descriptors
> > > >
> > > > since when do we drop packets on error in descriptor?
> > > > we just as likely stall ...
> > > >
> > > It is left to the device to implement on what to do on bad desc.
> >
> > then how can you count drops?
> A device may count the drops based on errors.
> It is counting drops based on the error it got.
>
> > why does this even matter?
> To debug.
for debugging it's easiest if you just stop the vq, instead of drops.
> > and why on tx specifically?
> Missed the rx, will add.
>
> > I feel addressing descriptor errors is a completely separate project.
>
> Not sure if it is that big project.
I would have a separate debug section.
--
MST
* Re: [virtio-comment] [PATCH requirements 1/7] net-features: Add requirements document for release 1.4
2023-06-01 22:02 ` [virtio-comment] [PATCH requirements 1/7] net-features: Add requirements document for release 1.4 Parav Pandit
2023-06-06 22:15 ` Michael S. Tsirkin
@ 2023-06-07 9:31 ` Xuan Zhuo
1 sibling, 0 replies; 73+ messages in thread
From: Xuan Zhuo @ 2023-06-07 9:31 UTC (permalink / raw)
To: Parav Pandit; +Cc: shahafs, virtio, Parav Pandit, virtio-comment
On Fri, 2 Jun 2023 01:02:59 +0300, Parav Pandit <parav@nvidia.com> wrote:
> Add requirements document template for the virtio net features.
>
> Add virtio net device counters visible to driver.
>
> Signed-off-by: Parav Pandit <parav@nvidia.com>
> ---
> net-workstream/features-1.4.md | 36 ++++++++++++++++++++++++++++++++++
> 1 file changed, 36 insertions(+)
> create mode 100644 net-workstream/features-1.4.md
>
> diff --git a/net-workstream/features-1.4.md b/net-workstream/features-1.4.md
> new file mode 100644
> index 0000000..03b4eb3
> --- /dev/null
> +++ b/net-workstream/features-1.4.md
> @@ -0,0 +1,36 @@
> +# 1. Introduction
> +
> +This document describes the overall requirements for virtio net device
> +improvements for upcoming release 1.4. Some of these requirements are
> +interrelated and influence the interface design, hence reviewing them
> +together is desired while updating the virtio net interface.
> +
> +# 2. Summary
> +1. Device counters visible to the driver
> +
> +# 3. Requirements
> +## 3.1 Device counters
> +1. The driver should be able to query the device and/or per vq counters for
> + debugging purpose using a control vq command.
> +2. The driver should be able to query which counters are supported using a
> + control vq command.
> +3. If this device is migrated between two hosts, the driver should be able
> + get the counter values in the destination host from where it was left
> + off in the source host.
> +4. If a virtio device is group member device, a group owner should be able
> + to query all the counter attributes using the admin queue command which
> + a virtio device will expose via a control vq to the driver.
> +
> +### 3.1.1 Per receive queue counters
> +1. le64 rx_oversize_pkt_errors: Packet dropped due to receive packet being
> + oversize than the buffer size
> +2. le64 rx_no_buffer_pkt_errors: Packet dropped due to unavailability of the
> + buffer in the receive queue
> +3. le64 rx_gro_pkts: Packets treated as guest GSO sequence by the device
> +4. le64 rx_pkts: Total packets received by the device
> +
> +### 3.1.2 Per transmit queue counters
> +1. le64 tx_bad_desc_errors: Descriptors dropped by the device due to errors in
> + descriptors
> +2. le64 tx_gso_pkts: Packets send as host GSO sequence
> +3. le64 tx_pkts: Total packets send by the device
We hope to support device custom counters. That is, the virtio spec provides a
channel between driver and device, and both the key and the value of each
counter are provided by the device.
We discussed this issue earlier, and after some internal practice, I think it is
still necessary to discuss it again.
This is very important: each cloud vendor will always have some special counters
that may not exist at another vendor. At the same time, if we have
to discuss it in the spec every time we add a counter or a feature, I
think that is very inconvenient. Manufacturers may add new counters at any
time based on new requirements. Some counters may also be removed at any
time.
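To make the key/value idea concrete, here is a minimal sketch in C. It assumes
a hypothetical fixed-size entry in which the device supplies both the counter
name and its value. None of the names used here (vnet_custom_counter,
VNET_CUSTOM_CTR_NAME_LEN, vnet_custom_counter_find) come from the virtio spec
or from this thread; they are for illustration only.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layout; not defined by the virtio spec. */
#define VNET_CUSTOM_CTR_NAME_LEN 32

struct vnet_custom_counter {
    char name[VNET_CUSTOM_CTR_NAME_LEN]; /* key chosen by the device, NUL padded */
    uint64_t value;                      /* le64 on the wire, CPU order here */
};

static_assert(sizeof(struct vnet_custom_counter) == 40, "unexpected padding");

/*
 * Look up a counter by name in a table of n entries returned by the device.
 * Returns 0 and stores the value in *out on success, -1 if the device did
 * not report a counter with that name.
 */
static int vnet_custom_counter_find(const struct vnet_custom_counter *tbl,
                                    size_t n, const char *name, uint64_t *out)
{
    for (size_t i = 0; i < n; i++) {
        if (strncmp(tbl[i].name, name, VNET_CUSTOM_CTR_NAME_LEN) == 0) {
            *out = tbl[i].value;
            return 0;
        }
    }
    return -1;
}
```

With a layout like this the driver never needs to know the counter names in
advance; it can pass whatever the device reports straight through to ethtool.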
Of course I know that doing this might hurt migration. But what I want to ask is,
why does it affect live migration? We plan to give these counters to users
through ethtool. If the ethtool counters that users see change after a
migration, does this have any practical impact? Or, if we expose them through
some other output, we can clearly tell the user that these counters
may change during the migration process. For example, the driver is migrated
to some new devices that support some new counters. I think users
should be able to see these new counters. These new counters may even be the
purpose of the migration.
We don't need to support live migration at this point.
Thanks.
> --
> 2.26.2
>
>
> This publicly archived list offers a means to provide input to the
> OASIS Virtual I/O Device (VIRTIO) TC.
>
> In order to verify user consent to the Feedback License terms and
> to minimize spam in the list archive, subscription is required
> before posting.
>
> Subscribe: virtio-comment-subscribe@lists.oasis-open.org
> Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
> List help: virtio-comment-help@lists.oasis-open.org
> List archive: https://lists.oasis-open.org/archives/virtio-comment/
> Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
> List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
> Committee: https://www.oasis-open.org/committees/virtio/
> Join OASIS: https://www.oasis-open.org/join/
>
^ permalink raw reply [flat|nested] 73+ messages in thread
* Re: [virtio-comment] [PATCH requirements 1/7] net-features: Add requirements document for release 1.4
2023-06-06 23:08 ` Parav Pandit
2023-06-06 23:18 ` Michael S. Tsirkin
@ 2023-06-07 20:35 ` Michael S. Tsirkin
2023-06-07 20:39 ` Parav Pandit
1 sibling, 1 reply; 73+ messages in thread
From: Michael S. Tsirkin @ 2023-06-07 20:35 UTC (permalink / raw)
To: Parav Pandit
Cc: virtio-comment@lists.oasis-open.org, Shahaf Shuler,
virtio@lists.oasis-open.org
On Tue, Jun 06, 2023 at 11:08:12PM +0000, Parav Pandit wrote:
> We can debate and change from cvq to new cfgvq as part of the journey.
> It is fine.
Just to stress, cvq has a bunch of drawbacks. For example it is
hard to access from the hypervisor since it's DMA from VF.
This is why I think our direction should be to add a vq
transport that does all config accesses over admin commands.
Or maybe I'm wrong.
But, let's distinguish between requirements, and between
requirements and design. Getting supported counters is a requirement.
Doing this over cvq is a design. Reducing MMIO registers is
a requirement. OK?
--
MST
^ permalink raw reply [flat|nested] 73+ messages in thread
* RE: [virtio-comment] [PATCH requirements 1/7] net-features: Add requirements document for release 1.4
2023-06-07 20:35 ` Michael S. Tsirkin
@ 2023-06-07 20:39 ` Parav Pandit
2023-06-07 20:50 ` Michael S. Tsirkin
0 siblings, 1 reply; 73+ messages in thread
From: Parav Pandit @ 2023-06-07 20:39 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: virtio-comment@lists.oasis-open.org, Shahaf Shuler,
virtio@lists.oasis-open.org
> From: Michael S. Tsirkin <mst@redhat.com>
> Sent: Wednesday, June 7, 2023 4:36 PM
>
> On Tue, Jun 06, 2023 at 11:08:12PM +0000, Parav Pandit wrote:
> > We can debate and change from cvq to new cfgvq as part of the journey.
> > It is fine.
>
> Just to stress, cvq has a bunch of drawbacks. For example it is hard to access
> from the hypervisor since it's DMA from VF.
> This is why I think our direction should be to add a vq transport that does all
> config accesses over admin commands.
We debated this before, so I will write it below.
> Or maybe I'm wrong.
>
> But, let's distinguish between requirements, and between requirements and
> design. Getting supported counters is a requirement.
> Doing this over cvq is a design. Reducing MMIO registers is a requirement. OK?
Yes, I would write it as: reducing MMIO and accessing it over a vq of the device itself is a requirement.
^ permalink raw reply [flat|nested] 73+ messages in thread
* Re: [virtio-comment] [PATCH requirements 1/7] net-features: Add requirements document for release 1.4
2023-06-07 20:39 ` Parav Pandit
@ 2023-06-07 20:50 ` Michael S. Tsirkin
2023-06-07 20:53 ` Parav Pandit
0 siblings, 1 reply; 73+ messages in thread
From: Michael S. Tsirkin @ 2023-06-07 20:50 UTC (permalink / raw)
To: Parav Pandit
Cc: virtio-comment@lists.oasis-open.org, Shahaf Shuler,
virtio@lists.oasis-open.org
On Wed, Jun 07, 2023 at 08:39:11PM +0000, Parav Pandit wrote:
>
>
> > From: Michael S. Tsirkin <mst@redhat.com>
> > Sent: Wednesday, June 7, 2023 4:36 PM
> >
> > On Tue, Jun 06, 2023 at 11:08:12PM +0000, Parav Pandit wrote:
> > > We can debate and change from cvq to new cfgvq as part of the journey.
> > > It is fine.
> >
> > Just to stress, cvq has a bunch of drawbacks. For example it is hard to access
> > from the hypervisor since it's DMA from VF.
> > This is why I think our direction should be to add a vq transport that does all
> > config accesses over admin commands.
> We debate this before, so will write is below.
> > Or maybe I'm wrong.
> >
>
> > But, let's distinguish between requirements, and between requirements and
> > design. Getting supported counters is a requirement.
> > Doing this over cvq is a design. Reducing MMIO registers is a requirement. OK?
>
> Yes, I would write it as reducing MMIO and accessing it over vq of the own device is a requirement.
reducing mmio is a requirement (unrelated to counters btw, it's
general). access over vq is a design point I think.
--
MST
^ permalink raw reply [flat|nested] 73+ messages in thread
* RE: [virtio-comment] [PATCH requirements 1/7] net-features: Add requirements document for release 1.4
2023-06-07 20:50 ` Michael S. Tsirkin
@ 2023-06-07 20:53 ` Parav Pandit
0 siblings, 0 replies; 73+ messages in thread
From: Parav Pandit @ 2023-06-07 20:53 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: virtio-comment@lists.oasis-open.org, Shahaf Shuler,
virtio@lists.oasis-open.org
> From: Michael S. Tsirkin <mst@redhat.com>
> Sent: Wednesday, June 7, 2023 4:50 PM
>
> On Wed, Jun 07, 2023 at 08:39:11PM +0000, Parav Pandit wrote:
> >
> >
> > > From: Michael S. Tsirkin <mst@redhat.com>
> > > Sent: Wednesday, June 7, 2023 4:36 PM
> > >
> > > On Tue, Jun 06, 2023 at 11:08:12PM +0000, Parav Pandit wrote:
> > > > We can debate and change from cvq to new cfgvq as part of the journey.
> > > > It is fine.
> > >
> > > Just to stress, cvq has a bunch of drawbacks. For example it is hard
> > > to access from the hypervisor since it's DMA from VF.
> > > This is why I think our direction should be to add a vq transport
> > > that does all config accesses over admin commands.
> > We debate this before, so will write is below.
> > > Or maybe I'm wrong.
> > >
> >
> > > But, let's distinguish between requirements, and between
> > > requirements and design. Getting supported counters is a requirement.
> > > Doing this over cvq is a design. Reducing MMIO registers is a requirement.
> OK?
> >
> > Yes, I would write it as reducing MMIO and accessing it over vq of the own
> device is a requirement.
>
> reducing mmio is a requirement (unrelated to counters btw, it's general). access
> over vq is a design point I think.
Didn't you ask to combine requirements and high-level design together in the first draft? What changed?
Whether we should create a new queue, reuse cvq, generalize over a cfg q, etc. are detailed design discussions.
^ permalink raw reply [flat|nested] 73+ messages in thread
* [virtio-comment] [PATCH requirements 0/7] virtio net new features requirements
@ 2023-07-24 3:34 Parav Pandit
2023-07-24 3:34 ` [virtio-comment] [PATCH requirements 1/7] net-features: Add requirements document for release 1.4 Parav Pandit
` (6 more replies)
0 siblings, 7 replies; 73+ messages in thread
From: Parav Pandit @ 2023-07-24 3:34 UTC (permalink / raw)
To: virtio-comment; +Cc: shahafs, hengqi, virtio, Parav Pandit
Hi All,
This document captures the virtio net device requirements for the upcoming
release 1.4 that some of us are currently working on.
This is a live document that will be updated over time; the aim is to work from
it towards a design, which can then result in a draft specification.
The objectives are:
1. To consider these requirements when introducing the new features
listed in the document (and others), and to work towards the interface
design followed by drafting the specification changes.
2. To define a practical list of requirements that can be achieved in the 1.4
timeframe incrementally and that can also be implemented.
Please review mainly patch 5 as a priority.
Receive flow filters is the first item, apart from counters, to complete
in this iteration so that drafting of the design spec can start.
The rest of the requirements are largely untouched other than addressing
Stefan's comments.
TODO:
1. Some more refinement needed for rx low latency and header data split
requirements.
2. counters requirements not yet up to date to match the discussion
---
changelog:
v2->v3:
- addressed comments from Stefan for tx low latency and notification
- redrafted the requirements to use rearm term and avoid queue enable
confusion for notification
- addressed all comments and refined receive flow filters requirements to
take to design level
v1->v2:
- major update of receive flow filter requirements updated based on last
two design discussions in community and offline research
- examples added
- link to use case and design goal added
- control and operation side requirements split
- more verbose
v0->v1:
- addressed comments from Heng Li
- addressed few (not all) comments from Michael
- per patch changelog
Parav Pandit (7):
net-features: Add requirements document for release 1.4
net-features: Add low latency transmit queue requirements
net-features: Add low latency receive queue requirements
net-features: Add notification coalescing requirements
net-features: Add n-tuple receive flow filters requirements
net-features: Add packet timestamp requirements
net-features: Add header data split requirements
net-workstream/features-1.4.md | 321 +++++++++++++++++++++++++++++++++
1 file changed, 321 insertions(+)
create mode 100644 net-workstream/features-1.4.md
--
2.26.2
^ permalink raw reply [flat|nested] 73+ messages in thread
* [virtio-comment] [PATCH requirements 1/7] net-features: Add requirements document for release 1.4
2023-07-24 3:34 [virtio-comment] [PATCH requirements 0/7] virtio net new features requirements Parav Pandit
@ 2023-07-24 3:34 ` Parav Pandit
2023-08-08 8:16 ` David Edmondson
2023-08-14 11:56 ` David Edmondson
2023-07-24 3:34 ` [virtio-comment] [PATCH requirements 2/7] net-features: Add low latency transmit queue requirements Parav Pandit
` (5 subsequent siblings)
6 siblings, 2 replies; 73+ messages in thread
From: Parav Pandit @ 2023-07-24 3:34 UTC (permalink / raw)
To: virtio-comment; +Cc: shahafs, hengqi, virtio, Parav Pandit
Add requirements document template for the virtio net features.
Add virtio net device counters visible to driver.
Signed-off-by: Parav Pandit <parav@nvidia.com>
---
changelog:
v0->v1:
- removed tx dropped counter
- updated requirements to mention about virtqueue interface for counters
query
---
net-workstream/features-1.4.md | 35 ++++++++++++++++++++++++++++++++++
1 file changed, 35 insertions(+)
create mode 100644 net-workstream/features-1.4.md
diff --git a/net-workstream/features-1.4.md b/net-workstream/features-1.4.md
new file mode 100644
index 0000000..4c3797b
--- /dev/null
+++ b/net-workstream/features-1.4.md
@@ -0,0 +1,35 @@
+# 1. Introduction
+
+This document describes the overall requirements for virtio net device
+improvements for upcoming release 1.4. Some of these requirements are
+interrelated and influence the interface design, hence reviewing them
+together is desired while updating the virtio net interface.
+
+# 2. Summary
+1. Device counters visible to the driver
+
+# 3. Requirements
+## 3.1 Device counters
+1. The driver should be able to query the device and/or per vq counters for
+ debugging purpose using a virtqueue directly from driver to device for
+ example using a control vq.
+2. The driver should be able to query which counters are supported using a
+ virtqueue command, for example using an existing control vq.
+3. If this device is migrated between two hosts, the driver should be able
+ get the counter values in the destination host from where it was left
+ off in the source host.
+4. If a virtio device is group member device, a group owner should be able
+ to query all the counter attributes using the administration command which
+ a virtio member device will expose via a virtqueue to the driver.
+
+### 3.1.1 Per receive queue counters
+1. le64 rx_oversize_pkt_errors: Packet dropped due to receive packet being
+ oversize than the buffer size
+2. le64 rx_no_buffer_pkt_errors: Packet dropped due to unavailability of the
+ buffer in the receive queue
+3. le64 rx_gro_pkts: Packets treated as receive GSO sequence by the device
+4. le64 rx_pkts: Total packets received by the device
+
+### 3.1.2 Per transmit queue counters
+1. le64 tx_gso_pkts: Packets send as transmit GSO sequence
+2. le64 tx_pkts: Total packets send by the device
--
2.26.2
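As a rough illustration of sections 3.1.1 and 3.1.2 in the patch above, the
per-queue counters could be carried as fixed groups of le64 values. Only the
field names come from the requirements; the struct names and the decode helper
below are assumptions made for this sketch and are not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical in-memory view of the per receive queue counters (3.1.1). */
struct vnet_rxq_counters {
    uint64_t rx_oversize_pkt_errors;
    uint64_t rx_no_buffer_pkt_errors;
    uint64_t rx_gro_pkts;
    uint64_t rx_pkts;
};

/* Hypothetical in-memory view of the per transmit queue counters (3.1.2). */
struct vnet_txq_counters {
    uint64_t tx_gso_pkts;
    uint64_t tx_pkts;
};

/* All fields are 64 bit, so no implicit padding is expected. */
static_assert(sizeof(struct vnet_rxq_counters) == 32, "rx layout changed");
static_assert(sizeof(struct vnet_txq_counters) == 16, "tx layout changed");

/*
 * Decode one le64 counter from the raw reply bytes. Working byte by byte
 * keeps the helper correct on both little and big endian hosts.
 */
static uint64_t vnet_le64_to_cpu(const uint8_t b[8])
{
    uint64_t v = 0;

    for (int i = 7; i >= 0; i--)
        v = (v << 8) | b[i];
    return v;
}
```

A driver would run each le64 field of the device reply through a helper such
as vnet_le64_to_cpu() before reporting it.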
^ permalink raw reply related [flat|nested] 73+ messages in thread
* [virtio-comment] [PATCH requirements 2/7] net-features: Add low latency transmit queue requirements
2023-07-24 3:34 [virtio-comment] [PATCH requirements 0/7] virtio net new features requirements Parav Pandit
2023-07-24 3:34 ` [virtio-comment] [PATCH requirements 1/7] net-features: Add requirements document for release 1.4 Parav Pandit
@ 2023-07-24 3:34 ` Parav Pandit
2023-08-08 8:24 ` David Edmondson
` (2 more replies)
2023-07-24 3:34 ` [virtio-comment] [PATCH requirements 3/7] net-features: Add low latency receive " Parav Pandit
` (4 subsequent siblings)
6 siblings, 3 replies; 73+ messages in thread
From: Parav Pandit @ 2023-07-24 3:34 UTC (permalink / raw)
To: virtio-comment; +Cc: shahafs, hengqi, virtio, Parav Pandit
Add requirements for the low latency transmit queue.
Signed-off-by: Parav Pandit <parav@nvidia.com>
---
changelog:
v1->v2:
- added generic requirement to inline the request content
along with the descriptor for non virtio-net devices
- added requirement to inline the request content along
with the descriptor for virtio flow filter queue as two
features are similar
v0->v1:
- added design goals for which requirements are added
---
net-workstream/features-1.4.md | 88 ++++++++++++++++++++++++++++++++++
1 file changed, 88 insertions(+)
diff --git a/net-workstream/features-1.4.md b/net-workstream/features-1.4.md
index 4c3797b..eb95592 100644
--- a/net-workstream/features-1.4.md
+++ b/net-workstream/features-1.4.md
@@ -7,6 +7,7 @@ together is desired while updating the virtio net interface.
# 2. Summary
1. Device counters visible to the driver
+2. Low latency tx virtqueue for PCI transport
# 3. Requirements
## 3.1 Device counters
@@ -33,3 +34,90 @@ together is desired while updating the virtio net interface.
### 3.1.2 Per transmit queue counters
1. le64 tx_gso_pkts: Packets send as transmit GSO sequence
2. le64 tx_pkts: Total packets send by the device
+
+## 3.2 Low PCI latency virtqueues
+### 3.2.1 Low PCI latency tx virtqueue
+0. Design goal
+ a. Reduce PCI access latency in packet transmit flow
+ b. Avoid O(N) descriptor parser to detect a packet stream to simplify device
+ logic
+ c. Reduce number of PCI transmit completion transactions and have unified
+ completion flow with/without transmit timestamping
+ d. Avoid partial cache line writes on transmit completions
+
+1. Packet transmit descriptor should contain data descriptors count without any
+ indirection and without any O(N) search to find the end of a packet stream.
+ For example, a packet transmit descriptor (called vnet_tx_hdr_desc
+ subsequently) to contain a field num_next_desc for the packet stream
+ indicating that a packet is located in N data descriptors.
+
+2. Packet transmit descriptor should contain segmentation offload-related fields
+ without any indirection. For example, packet transmit descriptor to contain
+ gso_type, gso_size/mss, header length, csum placement byte offset, and
+ csum start.
+
+3. Packet transmit descriptor should be able to place a small size packet that
+ does not have any L4 data after the vnet_tx_hdr_desc in the virtqueue memory.
+ For example a TCP ack only packet can fit in a descriptor memory which
+ otherwise consume more than 25% of metadata to describe the packet.
+
+4. Packet transmit descriptor should be able to place a full GSO header (L2 to
+ L4) after header descriptor and before data descriptors. For example, the
+ GSO header is placed after struct vnet_tx_hdr_desc in the virtqueue memory.
+ When such a GSO header is positioned adjacent to the packet transmit
+ descriptor, and when the GSO header is not aligned to 16B, the following
+ data descriptor to start on the 8B aligned boundary.
+
+5. An example of the above requirements at high level is:
+
+```
+struct virtio_packed_q_desc {
+ /* current desc for reference */
+ u64 address;
+ u32 len;
+ u16 id;
+ u16 flags;
+};
+
+/* Constant size header descriptor for tx packets */
+struct vnet_tx_hdr_desc {
+ u16 flags; /* indicate how to parse next fields */
+ u16 id; /* desc id to come back in completion */
+ u8 num_next_desc; /* indicates the number of the next 16B data desc for this
+ * buffer.
+ */
+ u8 gso_type;
+ le16 gso_hdr_len;
+ le16 gso_size;
+ le16 csum_start;
+ le16 csum_offset;
+ u8 inline_pkt_len; /* indicates the length of the inline packet after this
+ * desc
+ */
+ u8 reserved;
+ u8 padding[];
+};
+
+/* Example of a short packet or GSO header placed in the desc section of the vq
+ */
+struct vnet_tx_small_pkt_desc {
+ u8 raw_pkt[128];
+};
+
+/* Example of header followed by data descriptor */
+struct vnet_tx_hdr_desc hdr_desc;
+struct vnet_data_desc desc[2];
+
+```
+
+6. Ability to zero pad the transmit completion when the transmit completion is
+ shorter than the CPU cache line size.
+
+7. Ability to place all transmit completions together with their per packet stream
+ transmit timestamp using a single PCIe transaction.
+
+8. A generic feature of the virtqueue, to contain such header data inline for virtio
+ devices other than virtio-net.
+
+9. A flow filter virtqueue also similarly need the ability to inline the short flow
+ command header.
--
2.26.2
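Requirements 1, 3 and 4 of section 3.2.1 above aim to let the device find the
end of a packet without walking a descriptor chain. A minimal sketch of that
O(1) computation follows. It assumes the header descriptor is 16 bytes (the sum
of the fixed fields in the example struct) and that the first data descriptor
starts at the next 8B boundary after any inline bytes. The helper names are
hypothetical and not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Fixed fields of the example vnet_tx_hdr_desc, without the flexible pad. */
struct vnet_tx_hdr_desc_fixed {
    uint16_t flags;
    uint16_t id;
    uint8_t  num_next_desc;
    uint8_t  gso_type;
    uint16_t gso_hdr_len;
    uint16_t gso_size;
    uint16_t csum_start;
    uint16_t csum_offset;
    uint8_t  inline_pkt_len;
    uint8_t  reserved;
};

static_assert(sizeof(struct vnet_tx_hdr_desc_fixed) == 16, "expected 16B hdr");

#define VNET_TX_HDR_DESC_SIZE 16u
#define VNET_DATA_DESC_SIZE   16u /* "next 16B data desc" in the example */

/* Round v up to a multiple of a; a must be a power of two. */
static uint32_t vnet_align_up(uint32_t v, uint32_t a)
{
    return (v + a - 1) & ~(a - 1);
}

/* Byte offset of the first data descriptor, counted from the header start. */
static uint32_t vnet_tx_first_data_desc_off(uint8_t inline_pkt_len)
{
    return vnet_align_up(VNET_TX_HDR_DESC_SIZE + inline_pkt_len, 8);
}

/* Total bytes of virtqueue memory one packet occupies, computed in O(1). */
static uint32_t vnet_tx_pkt_total_len(uint8_t inline_pkt_len,
                                      uint8_t num_next_desc)
{
    return vnet_tx_first_data_desc_off(inline_pkt_len) +
           num_next_desc * VNET_DATA_DESC_SIZE;
}
```

For example, a 54 byte inline GSO header ends at byte 70, so the first data
descriptor starts at byte 72; with two data descriptors the packet occupies
104 bytes of virtqueue memory.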
^ permalink raw reply related [flat|nested] 73+ messages in thread
* [virtio-comment] [PATCH requirements 3/7] net-features: Add low latency receive queue requirements
2023-07-24 3:34 [virtio-comment] [PATCH requirements 0/7] virtio net new features requirements Parav Pandit
2023-07-24 3:34 ` [virtio-comment] [PATCH requirements 1/7] net-features: Add requirements document for release 1.4 Parav Pandit
2023-07-24 3:34 ` [virtio-comment] [PATCH requirements 2/7] net-features: Add low latency transmit queue requirements Parav Pandit
@ 2023-07-24 3:34 ` Parav Pandit
2023-08-08 8:32 ` David Edmondson
2023-08-14 11:54 ` David Edmondson
2023-07-24 3:34 ` [virtio-comment] [PATCH requirements 4/7] net-features: Add notification coalescing requirements Parav Pandit
` (3 subsequent siblings)
6 siblings, 2 replies; 73+ messages in thread
From: Parav Pandit @ 2023-07-24 3:34 UTC (permalink / raw)
To: virtio-comment; +Cc: shahafs, hengqi, virtio, Parav Pandit
Add requirements for the low latency receive queue.
Signed-off-by: Parav Pandit <parav@nvidia.com>
---
changelog:
v0->v1:
- clarified the requirements further
- added line for the gro case
- added design goals as the motivation for the requirements
---
net-workstream/features-1.4.md | 45 +++++++++++++++++++++++++++++++++-
1 file changed, 44 insertions(+), 1 deletion(-)
diff --git a/net-workstream/features-1.4.md b/net-workstream/features-1.4.md
index eb95592..e04727a 100644
--- a/net-workstream/features-1.4.md
+++ b/net-workstream/features-1.4.md
@@ -7,7 +7,7 @@ together is desired while updating the virtio net interface.
# 2. Summary
1. Device counters visible to the driver
-2. Low latency tx virtqueue for PCI transport
+2. Low latency tx and rx virtqueues for PCI transport
# 3. Requirements
## 3.1 Device counters
@@ -121,3 +121,46 @@ struct vnet_data_desc desc[2];
9. A flow filter virtqueue also similarly need the ability to inline the short flow
command header.
+
+### 3.2.2 Low latency rx virtqueue
+0. Design goal:
+ a. Keep packet metadata and buffer data together which is consumed by driver
+ layer and make it available in a single cache line of cpu
+ b. Instead of having per packet descriptors which is complex to scale for
+ the device, supply the page directly to the device to consume it based
+ on packet size
+1. The device should be able to write a packet receive completion that consists
+ of struct virtio_net_hdr (or similar) and a buffer id using a single DMA write
+ PCIe TLP.
+2. The device should be able to perform DMA writes of multiple packets
+ completions in a single DMA transaction up to the PCIe maximum write limit
+ in a transaction.
+3. The device should be able to zero pad packet write completion to align it to
+ 64B or CPU cache line size whenever possible.
+4. An example of the above DMA completion structure:
+
+```
+/* Constant size receive packet completion */
+struct vnet_rx_completion {
+ u16 flags;
+ u16 id; /* buffer id */
+ u8 gso_type;
+ u8 reserved[3];
+ le16 gso_hdr_len;
+ le16 gso_size;
+ le16 csum_start;
+ le16 csum_offset;
+ u16 reserved2;
+ u64 timestamp; /* explained later */
+ u8 padding[];
+};
+```
+5. The driver should be able to post constant-size buffer pages on a receive
+ queue which can be consumed by the device for an incoming packet of any size
+ from 64B to 9K bytes.
+6. The device should be able to know the constant buffer size at receive
+ virtqueue level instead of per buffer level.
+7. The device should be able to indicate when a full page buffer is consumed,
+ which can be recycled by the driver when the packets from the completed
+ page is fully consumed.
+8. The device should be able to consume multiple pages for a receive GSO stream.
--
2.26.2
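Requirements 2 and 3 of section 3.2.2 above (batched completions, zero padded
to a cache line) can be sketched as follows. This is an assumption-laden
illustration: it fixes the cache line at 64B, replaces the flexible padding[]
of the example with an explicit pad, and adds a reserved3 field so that the
timestamp is naturally aligned. The names vnet_rx_completion_padded and
vnet_rx_completions_per_write are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define VNET_CACHE_LINE 64u /* assumed CPU cache line size */

/* Example completion from the patch, padded out to one full cache line. */
struct vnet_rx_completion_padded {
    uint16_t flags;
    uint16_t id;            /* buffer id */
    uint8_t  gso_type;
    uint8_t  reserved[3];
    uint16_t gso_hdr_len;
    uint16_t gso_size;
    uint16_t csum_start;
    uint16_t csum_offset;
    uint16_t reserved2;
    uint8_t  reserved3[6];  /* explicit pad: keeps timestamp 8B aligned */
    uint64_t timestamp;
    uint8_t  padding[VNET_CACHE_LINE - 32]; /* zero filled by the device */
};

static_assert(sizeof(struct vnet_rx_completion_padded) == VNET_CACHE_LINE,
              "completion must fill exactly one cache line");

/*
 * Number of whole completions that fit into one DMA write of max_payload
 * bytes (for PCIe this would be the negotiated maximum payload size).
 */
static uint32_t vnet_rx_completions_per_write(uint32_t max_payload)
{
    return max_payload / (uint32_t)sizeof(struct vnet_rx_completion_padded);
}
```

With a 256 byte maximum payload the device could report four received packets
in a single write, and no write would touch only part of a cache line.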
^ permalink raw reply related [flat|nested] 73+ messages in thread
* [virtio-comment] [PATCH requirements 4/7] net-features: Add notification coalescing requirements
2023-07-24 3:34 [virtio-comment] [PATCH requirements 0/7] virtio net new features requirements Parav Pandit
` (2 preceding siblings ...)
2023-07-24 3:34 ` [virtio-comment] [PATCH requirements 3/7] net-features: Add low latency receive " Parav Pandit
@ 2023-07-24 3:34 ` Parav Pandit
2023-08-14 11:57 ` David Edmondson
2023-07-24 3:34 ` [virtio-comment] [PATCH requirements 5/7] net-features: Add n-tuple receive flow filters requirements Parav Pandit
` (2 subsequent siblings)
6 siblings, 1 reply; 73+ messages in thread
From: Parav Pandit @ 2023-07-24 3:34 UTC (permalink / raw)
To: virtio-comment; +Cc: shahafs, hengqi, virtio, Parav Pandit
Add virtio net device notification coalescing improvements requirements.
Signed-off-by: Parav Pandit <parav@nvidia.com>
---
changelog:
v1->v2:
- addressed comments from Stefan
- redrafted the requirements to use rearm term and avoid queue enable
confusion
v0->v1:
- updated the description
---
net-workstream/features-1.4.md | 11 +++++++++++
1 file changed, 11 insertions(+)
diff --git a/net-workstream/features-1.4.md b/net-workstream/features-1.4.md
index e04727a..27a7886 100644
--- a/net-workstream/features-1.4.md
+++ b/net-workstream/features-1.4.md
@@ -8,6 +8,7 @@ together is desired while updating the virtio net interface.
# 2. Summary
1. Device counters visible to the driver
2. Low latency tx and rx virtqueues for PCI transport
+3. Virtqueue notification coalescing re-arming support
# 3. Requirements
## 3.1 Device counters
@@ -164,3 +165,13 @@ struct vnet_rx_completion {
which can be recycled by the driver when the packets from the completed
page is fully consumed.
8. The device should be able to consume multiple pages for a receive GSO stream.
+
+## 3.3 Virtqueue notification coalescing re-arming support
+0. Design goal:
+ a. Avoid constant notifications from the device even in conditions when
+ the driver may not have acted on the previous pending notification.
+1. When Tx and Rx virtqueue notification coalescing is enabled, and when such
+ a notification is reported by the device, the device stops sending further
+ notifications until the driver rearms the notifications of the virtqueue.
+2. When the driver rearms the notification of the virtqueue, the device
+ should notify again if notification coalescing conditions are met.
--
2.26.2
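The re-arming behaviour in section 3.3 can be pictured as a small per-virtqueue state machine. The following is only a sketch under stated assumptions: `vq_coal_state`, `device_complete_one()` and `driver_rearm()` are hypothetical names that do not appear in the patch, and the coalescing condition is reduced to a plain completion-count threshold for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-virtqueue coalescing state (not part of the patch). */
struct vq_coal_state {
    uint32_t max_packets;   /* coalescing threshold, illustrative only */
    uint32_t pending;       /* completions since the last notification */
    bool     armed;         /* true when the device is allowed to notify */
};

/* Device side: account one completion; returns true if a notification is sent. */
static bool device_complete_one(struct vq_coal_state *s)
{
    assert(s != NULL);
    s->pending++;
    if (!s->armed || s->pending < s->max_packets)
        return false;       /* suppressed: not rearmed, or threshold not met */
    s->armed = false;       /* stop notifying until the driver rearms */
    s->pending = 0;
    return true;
}

/* Driver side: rearm; returns true if the device notifies right away because
 * the coalescing condition was already met while notifications were off. */
static bool driver_rearm(struct vq_coal_state *s)
{
    assert(s != NULL);
    s->armed = true;
    if (s->pending < s->max_packets)
        return false;
    s->armed = false;
    s->pending = 0;
    return true;
}
```

In this sketch a completion that arrives while the queue is not armed is only counted, so the driver sees at most one notification per rearm.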
This publicly archived list offers a means to provide input to the
OASIS Virtual I/O Device (VIRTIO) TC.
In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.
Subscribe: virtio-comment-subscribe@lists.oasis-open.org
Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
List help: virtio-comment-help@lists.oasis-open.org
List archive: https://lists.oasis-open.org/archives/virtio-comment/
Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
Committee: https://www.oasis-open.org/committees/virtio/
Join OASIS: https://www.oasis-open.org/join/
^ permalink raw reply related [flat|nested] 73+ messages in thread
* [virtio-comment] [PATCH requirements 5/7] net-features: Add n-tuple receive flow filters requirements
2023-07-24 3:34 [virtio-comment] [PATCH requirements 0/7] virtio net new features requirements Parav Pandit
` (3 preceding siblings ...)
2023-07-24 3:34 ` [virtio-comment] [PATCH requirements 4/7] net-features: Add notification coalescing requirements Parav Pandit
@ 2023-07-24 3:34 ` Parav Pandit
2023-08-01 8:33 ` [virtio-comment] " Parav Pandit
` (2 more replies)
2023-07-24 3:34 ` [virtio-comment] [PATCH requirements 6/7] net-features: Add packet timestamp requirements Parav Pandit
2023-07-24 3:34 ` [virtio-comment] [PATCH requirements 7/7] net-features: Add header data split requirements Parav Pandit
6 siblings, 3 replies; 73+ messages in thread
From: Parav Pandit @ 2023-07-24 3:34 UTC (permalink / raw)
To: virtio-comment; +Cc: shahafs, hengqi, virtio, Parav Pandit
Add virtio net device requirements for receive flow filters.
Signed-off-by: Parav Pandit <parav@nvidia.com>
---
changelog:
v1->v2:
- split setup and operations requirements
- added design goal
- worded requirements more precisely
v0->v1:
- fixed comments from Heng Qi
- renamed receive flow steering to receive flow filters
- clarified byte offset in match criteria
---
net-workstream/features-1.4.md | 105 +++++++++++++++++++++++++++++++++
1 file changed, 105 insertions(+)
diff --git a/net-workstream/features-1.4.md b/net-workstream/features-1.4.md
index 27a7886..d228462 100644
--- a/net-workstream/features-1.4.md
+++ b/net-workstream/features-1.4.md
@@ -9,6 +9,7 @@ together is desired while updating the virtio net interface.
1. Device counters visible to the driver
2. Low latency tx and rx virtqueues for PCI transport
3. Virtqueue notification coalescing re-arming support
+4. Virtqueue receive flow filters (RFF)
# 3. Requirements
## 3.1 Device counters
@@ -175,3 +176,107 @@ struct vnet_rx_completion {
notifications until the driver rearms the notifications of the virtqueue.
2. When the driver rearms the notification of the virtqueue, the device
should notify again if notification coalescing conditions are met.
+
+## 3.4 Virtqueue receive flow filters (RFF)
+0. Design goal:
+ To filter and/or steer packets, based on a specific pattern match, to a
+ specific context to support application/networking stack driven receive
+ processing.
+1. Two use cases are: to support Linux netdev set_rxnfc() for ETHTOOL_SRXCLSRLINS
+ and to support netdev feature NETIF_F_NTUPLE aka ARFS.
+
+### 3.4.1 control path
+1. The number of flow filter operations/sec can range from 100k/sec to 1M/sec
+ or even more. Hence flow filter operations must be done over a queueing
+ interface using one or more queues.
+2. The device should be able to expose the number of supported flow filter
+ queues (one or more) and their start vq index to the driver.
+3. As each device may be operating with different performance characteristics,
+ start vq index and count may be different for each device. Secondly, it is
+ inefficient for the device to provide flow filter capabilities via a config
+ space region. Hence, the device should be able to share these attributes
+ using a dma interface, instead of transport registers.
+4. Since flow filters are enabled much later in the driver life cycle, the driver
+ will likely create these queues when flow filters are enabled.
+5. Flow filter operations are often accelerated by the device in hardware. The
+ ability to handle them on a queue other than the control vq is desired. This
+ allows adding new operations on new purpose-built queues (similar to transmit
+ and receive queues) with near zero modifications to existing implementations.
+6. The filter masks are optional; the device should be able to expose whether
+ it supports filter masks.
+7. The driver may want to have priority among groups of flow entries; to
+ facilitate this, the device should support grouping flow filter entries by a
+ notion of a group. Each group defines a priority in processing the flow.
+8. The driver and group owner driver should be able to query supported device
+ limits for the flow filter entries.
+
+### 3.4.2 flow operations path
+1. The driver should be able to define a receive packet match criteria, an
+ action and a destination for a packet. For example, an ipv4 packet with a
+ multicast address to be steered to the receive vq 0. The second example is
+ ipv4, tcp packet matching a specified IP address and tcp port tuple to
+ be steered to receive vq 10.
+2. The match criteria should include exact-match tuple fields that are well
+ defined, such as mac address, IP addresses, tcp/udp ports, etc.
+3. The match criteria should also optionally include the field mask.
+4. The match criteria may optionally also include a specific packet byte
+ offset, match length, matching pattern, and mask, instead of RFC defined
+ fields; such a pattern may not be defined in any standard RFC.
+5. Action includes (a) dropping or (b) forwarding the packet.
+6. Destination is a receive virtqueue index.
+7. The device should process packet receive filters programmed via control vq
+ commands first in the processing chain.
+8. The device should process RFF entries before RSS configuration, i.e.,
+ when there is a miss on the RFF entry, RSS configuration applies if it exists.
+9. To summarize, the processing chain on a rx packet is:
+ {mac,vlan,promisc rx filters} -> {receive flow filters} -> {rss/hash config}.
+10. If multiple entries are programmed that have overlapping attributes for a
+ received packet, the driver should define the location/priority of the entry.
+11. The filter entries are usually short, a few tens of bytes in size,
+ for example an IPv6 + TCP tuple would be 36 bytes, and the ops/sec rate is
+ high; hence supplying fields inside the queue descriptor is preferred for
+ up to a certain fixed size, say 56 bytes.
+12. A flow filter entry consists of (a) match criteria, (b) action,
+ (c) destination and (d) a unique 32 bit flow id, all supplied by the
+ driver.
+13. The driver should be able to query and delete a flow filter entry in
+ the device by the flow id.
+
+### 3.4.3 interface example
+
+Flow filter capabilities to query using a DMA interface:
+
+```
+struct flow_filter_capabilities {
+ u8 flow_groups;
+ u16 num_flow_filter_vqs;
+ u16 start_vq_index;
+ u32 max_flow_filters_per_group;
+ u32 max_flow_filters;
+ u64 supported_packet_field_mask_bmap[4];
+};
+
+
+```
+
+1. Flow filter entry add/modify:
+
+struct virtio_net_rff_add_modify {
+ u8 flow_op;
+ u8 group_id;
+ u8 padding[2];
+ le32 flow_id;
+ struct match_criteria mc;
+ struct destination dest;
+ struct action action;
+
+ struct match_criteria mask; /* optional */
+};
+
+2. Flow filter entry delete:
+struct virtio_net_rff_delete {
+ u8 flow_op;
+ u8 group_id;
+ u8 padding[2];
+ le32 flow_id;
+};
--
2.26.2
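The 36-byte figure quoted for an IPv6 + TCP tuple in section 3.4.2 follows from two 16-byte addresses plus two 16-bit ports. A minimal sketch that checks this at compile time is shown below; the `rff_ipv6_tcp_key` layout and the `RFF_INLINE_MAX` constant are hypothetical (the patch leaves `struct match_criteria` undefined and only suggests "say 56 bytes").

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative inline limit taken from the "say 56 bytes" wording. */
#define RFF_INLINE_MAX 56

/* Hypothetical match key for an IPv6 + TCP 4-tuple; not defined by the patch. */
struct rff_ipv6_tcp_key {
    uint8_t  src_ip[16];
    uint8_t  dst_ip[16];
    uint16_t src_port;
    uint16_t dst_port;
};

/* 16 + 16 + 2 + 2 = 36 bytes, with no padding needed for 2-byte alignment. */
_Static_assert(sizeof(struct rff_ipv6_tcp_key) == 36,
               "IPv6 + TCP tuple is expected to be 36 bytes");

/* Returns nonzero if a key of key_len bytes could be carried inline. */
static int rff_key_fits_inline(size_t key_len)
{
    return key_len <= RFF_INLINE_MAX;
}
```

With this layout the key leaves 20 bytes of the assumed 56-byte budget for the remaining fields of an entry.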
^ permalink raw reply related [flat|nested] 73+ messages in thread
* [virtio-comment] [PATCH requirements 6/7] net-features: Add packet timestamp requirements
2023-07-24 3:34 [virtio-comment] [PATCH requirements 0/7] virtio net new features requirements Parav Pandit
` (4 preceding siblings ...)
2023-07-24 3:34 ` [virtio-comment] [PATCH requirements 5/7] net-features: Add n-tuple receive flow filters requirements Parav Pandit
@ 2023-07-24 3:34 ` Parav Pandit
2023-08-09 8:35 ` [virtio-comment] Re: [virtio] " Xuan Zhuo
2023-08-14 11:59 ` [virtio-comment] " David Edmondson
2023-07-24 3:34 ` [virtio-comment] [PATCH requirements 7/7] net-features: Add header data split requirements Parav Pandit
6 siblings, 2 replies; 73+ messages in thread
From: Parav Pandit @ 2023-07-24 3:34 UTC (permalink / raw)
To: virtio-comment; +Cc: shahafs, hengqi, virtio, Parav Pandit
Add tx and rx packet timestamp requirements.
Signed-off-by: Parav Pandit <parav@nvidia.com>
---
net-workstream/features-1.4.md | 26 ++++++++++++++++++++++++++
1 file changed, 26 insertions(+)
diff --git a/net-workstream/features-1.4.md b/net-workstream/features-1.4.md
index d228462..37820b6 100644
--- a/net-workstream/features-1.4.md
+++ b/net-workstream/features-1.4.md
@@ -10,6 +10,7 @@ together is desired while updating the virtio net interface.
2. Low latency tx and rx virtqueues for PCI transport
3. Virtqueue notification coalescing re-arming support
4. Virtqueue receive flow filters (RFF)
+5. Device timestamp for tx and rx packets
# 3. Requirements
## 3.1 Device counters
@@ -280,3 +281,28 @@ struct virtio_net_rff_delete {
u8 padding[2];
le32 flow_id;
};
+
+## 3.5 Packet timestamp
+1. Device should provide transmit timestamp and receive timestamp of the packets
+ at per packet level when the device is enabled for it.
+2. Device should provide the current free running clock with the least latency
+ possible, using a 64-bit MMIO register read, to have the least jitter.
+3. Device should provide the current frequency and the frequency unit for the
+ software to synchronize the reference point of software and the device using
+ a control vq command.
+
+### 3.5.1 Transmit timestamp
+1. Transmit completion must contain a packet transmission timestamp when the
+ device is enabled for it.
+2. The device should record the packet transmit timestamp in the completion at
+ the farthest egress point towards the network.
+3. The device must provide a transmit packet timestamp in a single DMA
+ transaction along with the rest of the transmit completion fields.
+
+### 3.5.2 Receive timestamp
+1. Receive completion must contain a packet reception timestamp when the device
+ is enabled for it.
+2. The device should record the received packet timestamp at the closest ingress
+ point of reception from the network.
+3. The device should provide a receive packet timestamp in a single DMA
+ transaction along with the rest of the receive completion fields.
--
2.26.2
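Section 3.5 asks for a free running clock plus its frequency so that software can relate device time to its own reference. One way a driver might convert a raw counter value to nanoseconds is sketched below. This is an illustration only: `ticks_to_ns()` is a hypothetical helper, and it assumes the frequency is reported in Hz and is below 2^34 Hz so that the intermediate product fits in 64 bits.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/* Convert free running clock ticks to nanoseconds.
 * Splitting into whole seconds and a remainder avoids overflowing 64 bits:
 * rem < freq_hz < 2^34, so rem * 1e9 stays below 2^64. */
static uint64_t ticks_to_ns(uint64_t ticks, uint64_t freq_hz)
{
    assert(freq_hz != 0 && freq_hz < (1ULL << 34));
    uint64_t sec = ticks / freq_hz;
    uint64_t rem = ticks % freq_hz;
    return sec * NSEC_PER_SEC + rem * NSEC_PER_SEC / freq_hz;
}
```

The result truncates toward zero, so a 3-tick reading at 2 Hz yields 1.5 seconds expressed in nanoseconds.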
^ permalink raw reply related [flat|nested] 73+ messages in thread
* [virtio-comment] [PATCH requirements 7/7] net-features: Add header data split requirements
2023-07-24 3:34 [virtio-comment] [PATCH requirements 0/7] virtio net new features requirements Parav Pandit
` (5 preceding siblings ...)
2023-07-24 3:34 ` [virtio-comment] [PATCH requirements 6/7] net-features: Add packet timestamp requirements Parav Pandit
@ 2023-07-24 3:34 ` Parav Pandit
2023-08-10 19:19 ` [virtio-comment] RE: [EXT] [virtio] " Satananda Burla
2023-08-14 12:00 ` [virtio-comment] " David Edmondson
6 siblings, 2 replies; 73+ messages in thread
From: Parav Pandit @ 2023-07-24 3:34 UTC (permalink / raw)
To: virtio-comment; +Cc: shahafs, hengqi, virtio, Parav Pandit
Add header data split requirements for the receive packets.
Signed-off-by: Parav Pandit <parav@nvidia.com>
---
net-workstream/features-1.4.md | 13 +++++++++++++
1 file changed, 13 insertions(+)
diff --git a/net-workstream/features-1.4.md b/net-workstream/features-1.4.md
index 37820b6..a64e356 100644
--- a/net-workstream/features-1.4.md
+++ b/net-workstream/features-1.4.md
@@ -11,6 +11,7 @@ together is desired while updating the virtio net interface.
3. Virtqueue notification coalescing re-arming support
4. Virtqueue receive flow filters (RFF)
5. Device timestamp for tx and rx packets
+6. Header data split for the receive virtqueue
# 3. Requirements
## 3.1 Device counters
@@ -306,3 +307,15 @@ struct virtio_net_rff_delete {
point of reception from the network.
3. The device should provide a receive packet timestamp in a single DMA
transaction along with the rest of the receive completion fields.
+
+## 3.6 Header data split for the receive virtqueue
+1. The device should be able to DMA the packet header and data to two different
+ memory locations; this enables the driver and networking stack to perform zero
+ copy to application buffer(s).
+2. The driver should be able to configure maximum header buffer size per
+ virtqueue.
+3. The header buffer should be in physically contiguous memory per virtqueue.
+4. The device should be able to indicate header data split in the receive
+ completion.
+5. The device should be able to zero pad the header buffer when the received
+ header is shorter than the cpu cache line size.
--
2.26.2
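The zero padding requirement in section 3.6 can be illustrated from the device's point of view with a short sketch. The helper name `hds_write_header()` and its parameters are hypothetical, and a real device would perform this with DMA rather than `memcpy()`; the code only shows the intended byte layout of one header buffer slot.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Place a received header into its header buffer slot.
 * Returns the number of bytes written, including zero padding, or 0 when
 * the header does not fit in the slot (no split is done in that case). */
static size_t hds_write_header(uint8_t *slot, size_t slot_size,
                               const uint8_t *hdr, size_t hdr_len,
                               size_t cache_line)
{
    assert(slot != NULL && hdr != NULL);
    if (hdr_len > slot_size)
        return 0;
    memcpy(slot, hdr, hdr_len);
    /* Pad up to one cache line only when the header is shorter than it
     * and the slot is large enough to hold a full cache line. */
    if (hdr_len < cache_line && cache_line <= slot_size) {
        memset(slot + hdr_len, 0, cache_line - hdr_len);
        return cache_line;
    }
    return hdr_len;
}
```

Padding to a full cache line means the write covers the whole line, which is the presumed motivation for the requirement.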
^ permalink raw reply related [flat|nested] 73+ messages in thread
* [virtio-comment] RE: [PATCH requirements 5/7] net-features: Add n-tuple receive flow filters requirements
2023-07-24 3:34 ` [virtio-comment] [PATCH requirements 5/7] net-features: Add n-tuple receive flow filters requirements Parav Pandit
@ 2023-08-01 8:33 ` Parav Pandit
2023-08-02 6:44 ` Parav Pandit
2023-08-02 7:17 ` [virtio-comment] RE: [EXT] [virtio] " Satananda Burla
2023-08-02 15:25 ` [virtio-comment] " Heng Qi
2 siblings, 1 reply; 73+ messages in thread
From: Parav Pandit @ 2023-08-01 8:33 UTC (permalink / raw)
To: virtio-comment@lists.oasis-open.org, Michael S. Tsirkin
Cc: Shahaf Shuler, hengqi@linux.alibaba.com,
virtio@lists.oasis-open.org
Hi Michael and all,
> From: Parav Pandit <parav@nvidia.com>
> Sent: Monday, July 24, 2023 9:04 AM
>
> Add virtio net device requirements for receive flow filters.
>
> Signed-off-by: Parav Pandit <parav@nvidia.com>
Do you have any further comments on it?
Heng and I want to progress now and start the design/spec draft of it from 3rd Aug.
The flow filter command requires a packed VQ descriptor extension.
So, we need to wrap up now to begin the pre-work for this infrastructure extension, which takes significant effort.
For now, we are taking the design to the next stage for these steering requirements.
The packed vq extension is also going to benefit the low latency tx part, as reviewed by Stefan.
^ permalink raw reply [flat|nested] 73+ messages in thread
* [virtio-comment] RE: [PATCH requirements 5/7] net-features: Add n-tuple receive flow filters requirements
2023-08-01 8:33 ` [virtio-comment] " Parav Pandit
@ 2023-08-02 6:44 ` Parav Pandit
2023-08-02 15:32 ` Heng Qi
0 siblings, 1 reply; 73+ messages in thread
From: Parav Pandit @ 2023-08-02 6:44 UTC (permalink / raw)
To: Parav Pandit, virtio-comment@lists.oasis-open.org,
Michael S. Tsirkin
Cc: Shahaf Shuler, hengqi@linux.alibaba.com,
virtio@lists.oasis-open.org
> From: virtio@lists.oasis-open.org <virtio@lists.oasis-open.org> On Behalf Of
> Parav Pandit
> Sent: Tuesday, August 1, 2023 2:04 PM
>
> Hi Michael and all,
>
> > From: Parav Pandit <parav@nvidia.com>
> > Sent: Monday, July 24, 2023 9:04 AM
> >
> > Add virtio net device requirements for receive flow filters.
> >
> > Signed-off-by: Parav Pandit <parav@nvidia.com>
>
> Do you have any further comments on it?
>
> Heng and I want to progress now to start the design/spec draft of it starting 3rd
> Aug.
> Flow filter command requires packed VQ descriptor extension.
> So, we need to wrap up now for this infrastructure extension pre-work which
> has significant effort.
>
> For now, we are taking design to next stage for these steering requirements.
> The packed vq extension is going to benefit low latency tx part as well as
> reviewed by Stephan.
In today's meeting we further discussed that inline is an optional feature for the txq and for steering filters.
Hence, based on the feedback, we can progress now without the packed vq extension.
We need to close on the config space access via dma interface.
So let's finish that blocker now so counters and filters can progress.
^ permalink raw reply [flat|nested] 73+ messages in thread
* [virtio-comment] RE: [EXT] [virtio] [PATCH requirements 5/7] net-features: Add n-tuple receive flow filters requirements
2023-07-24 3:34 ` [virtio-comment] [PATCH requirements 5/7] net-features: Add n-tuple receive flow filters requirements Parav Pandit
2023-08-01 8:33 ` [virtio-comment] " Parav Pandit
@ 2023-08-02 7:17 ` Satananda Burla
2023-08-02 8:14 ` Parav Pandit
2023-08-02 15:25 ` [virtio-comment] " Heng Qi
2 siblings, 1 reply; 73+ messages in thread
From: Satananda Burla @ 2023-08-02 7:17 UTC (permalink / raw)
To: Parav Pandit, virtio-comment@lists.oasis-open.org
Cc: shahafs@nvidia.com, hengqi@linux.alibaba.com,
virtio@lists.oasis-open.org
Hi Parav
> -----Original Message-----
> From: virtio@lists.oasis-open.org <virtio@lists.oasis-open.org> On
> Behalf Of Parav Pandit
> Sent: Sunday, July 23, 2023 8:34 PM
> To: virtio-comment@lists.oasis-open.org
> Cc: shahafs@nvidia.com; hengqi@linux.alibaba.com; virtio@lists.oasis-
> open.org; Parav Pandit <parav@nvidia.com>
> Subject: [EXT] [virtio] [PATCH requirements 5/7] net-features: Add n-
> tuple receive flow filters requirements
>
> External Email
>
> ----------------------------------------------------------------------
> Add virtio net device requirements for receive flow filters.
>
> Signed-off-by: Parav Pandit <parav@nvidia.com>
> ---
> changelog:
> v1->v2:
> - split setup and operations requirements
> - added design goal
> - worded requirements more precisely
> v0->v1:
> - fixed comments from Heng Li
> - renamed receive flow steering to receive flow filters
> - clarified byte offset in match criteria
> ---
> net-workstream/features-1.4.md | 105 +++++++++++++++++++++++++++++++++
> 1 file changed, 105 insertions(+)
>
> diff --git a/net-workstream/features-1.4.md b/net-workstream/features-
> 1.4.md
> index 27a7886..d228462 100644
> --- a/net-workstream/features-1.4.md
> +++ b/net-workstream/features-1.4.md
> @@ -9,6 +9,7 @@ together is desired while updating the virtio net
> interface.
> 1. Device counters visible to the driver
> 2. Low latency tx and rx virtqueues for PCI transport
> 3. Virtqueue notification coalescing re-arming support
> +4 Virtqueue receive flow filters (RFF)
>
> # 3. Requirements
> ## 3.1 Device counters
> @@ -175,3 +176,107 @@ struct vnet_rx_completion {
> notifications until the driver rearms the notifications of the
> virtqueue.
> 2. When the driver rearms the notification of the virtqueue, the device
> to notify again if notification coalescing conditions are met.
> +
> +## 3.4 Virtqueue receive flow filters (RFF)
> +0. Design goal:
> + To filter and/or to steer packet based on specific pattern match to
> a
> + specific context to support application/networking stack driven
> receive
> + processing.
> +1. Two use cases are: to support Linux netdev set_rxnfc() for
> ETHTOOL_SRXCLSRLINS
> + and to support netdev feature NETIF_F_NTUPLE aka ARFS.
> +
> +### 3.4.1 control path
> +1. The number of flow filter operations/sec can range from 100k/sec to
> 1M/sec
> + or even more. Hence flow filter operations must be done over a
> queueing
> + interface using one or more queues.
> +2. The device should be able to expose one or more supported flow
> filter queue
If there is more than 1 queue, I assume there is no ordering requirement
between operations submitted on multiple queues.
> + count and its start vq index to the driver.
> +3. As each device may be operating for different performance
> characteristic,
> + start vq index and count may be different for each device. Secondly,
> it is
> + inefficient for device to provide flow filters capabilities via a
> config space
> + region. Hence, the device should be able to share these attributes
> using
> + dma interface, instead of transport registers.
> +4. Since flow filters are enabled much later in the driver life cycle,
> driver
> + will likely create these queues when flow filters are enabled.
> +5. Flow filter operations are often accelerated by device in a
> hardware. Ability
> + to handle them on a queue other than control vq is desired. This
> achieves near
> + zero modifications to existing implementations to add new operations
> on new
> + purpose built queues (similar to transmit and receive queue).
> +6. The filter masks are optional; the device should be able to expose
> if it
> + support filter masks.
> +7. The driver may want to have priority among group of flow entries; to
> facilitate
> + the device support grouping flow filter entries by a notion of a
> group. Each
> + group defines priority in processing flow.
> +8. The driver and group owner driver should be able to query supported
> device
> + limits for the flow filter entries.
> +
> +### 3.4.2 flow operations path
> +1. The driver should be able to define a receive packet match criteria,
> an
> + action and a destination for a packet. For example, an ipv4 packet
> with a
> + multicast address to be steered to the receive vq 0. The second
> example is
> + ipv4, tcp packet matching a specified IP address and tcp port tuple
> to
> + be steered to receive vq 10.
> +2. The match criteria should include exact tuple fields well-defined
> such as mac
> + address, IP addresses, tcp/udp ports, etc.
> +3. The match criteria should also optionally include the field mask.
> +4. The match criteria may optionally also include specific packet byte
> offset
> + pattern, match length, mask instead of RFC defined fields.
> + length, and matching pattern, which may not be defined in the
> standard RFC.
> +5. Action includes (a) dropping or (b) forwarding the packet.
> +6. Destination is a receive virtqueue index.
> +7. The device should process packet receive filters programmed via
> control vq
> + commands first in the processing chain.
> +7. The device should process RFF entries before RSS configuration,
> i.e.,
> + when there is a miss on the RFF entry, RSS configuration applies if
> it exists.
> +8. To summarize the processing chain on a rx packet is:
> + {mac,vlan,promisc rx filters} -> {receive flow filters} -> {rss/hash
> config}.
Shouldn't this be
                                                         |-match-> {RFF processing}
{mac,vlan,promisc rx filters} -> {receive flow filters} -|
                                                         |-no match-> {rss/hash config}.
The above looks like RSS processing will always happen.
> +9. If multiple entries are programmed which has overlapping attributes
> for a
> + received packet, the driver to define the location/priority of the
> entry.
If the driver does not provide a group, or provides the same group with 2 rules
that have the same match part, does the new entry overwrite the old one for exact
matches (no mask)? And for rules with masks, does the rule with the longest match
take precedence, or does the latest added rule take precedence?
> +10. The filter entries are usually short in size of few tens of bytes,
> + for example IPv6 + TCP tuple would be 36 bytes, and ops/sec rate is
> + high, hence supplying fields inside the queue descriptor is
> preferred for
> + up to a certain fixed size, say 56 bytes.
> +11. A flow filter entry consists of (a) match criteria, (b) action,
> + (c) destination and (d) a unique 32 bit flow id, all supplied by
> the
> + driver.
> +12. The driver should be able to query and delete flow filter entry by
> the
> + the device by the flow id.
The flowid here seems to be used for rule index. Can this be returned by
the device instead of being sent by driver? A 32 bit value to store might
impose undue restrictions on devices that have lesser capacity. Or could
there be a restriction that flowid cannot exceed the value returned by
device as the capacity.
> +
> +### 3.4.3 interface example
> +
> +Flow filter capabilities to query using a DMA interface:
> +
> +```
> +struct flow_filter_capabilities {
> + u8 flow_groups;
> + u16 num_flow_filter_vqs;
> + u16 start_vq_index;
> + u32 max_flow_filters_per_group;
> + u32 max_flow_filters;
> + u64 supported_packet_field_mask_bmap[4];
> +};
> +
> +
> +```
> +
> +1. Flow filter entry add/modify, delete:
> +
> +struct virtio_net_rff_add_modify {
> + u8 flow_op;
> + u8 group_id;
> + u8 padding[2];
> + le32 flow_id;
> + struct match_criteria mc;
> + struct destination dest;
> + struct action action;
> +
> + struct match_criteria mask; /* optional */
> +};
> +
> +2. Flow filter entry delete:
> +struct virtio_net_rff_delete {
> + u8 flow_op;
> + u8 group_id;
> + u8 padding[2];
> + le32 flow_id;
> +};
> --
> 2.26.2
>
>
^ permalink raw reply [flat|nested] 73+ messages in thread
* [virtio-comment] RE: [EXT] [virtio] [PATCH requirements 5/7] net-features: Add n-tuple receive flow filters requirements
2023-08-02 7:17 ` [virtio-comment] RE: [EXT] [virtio] " Satananda Burla
@ 2023-08-02 8:14 ` Parav Pandit
2023-08-02 18:32 ` Satananda Burla
0 siblings, 1 reply; 73+ messages in thread
From: Parav Pandit @ 2023-08-02 8:14 UTC (permalink / raw)
To: Satananda Burla, virtio-comment@lists.oasis-open.org
Cc: Shahaf Shuler, hengqi@linux.alibaba.com,
virtio@lists.oasis-open.org
> From: Satananda Burla <sburla@marvell.com>
> Sent: Wednesday, August 2, 2023 12:48 PM
[..]
> > +7. The device should process packet receive filters programmed via
> > control vq
> > + commands first in the processing chain.
> > +7. The device should process RFF entries before RSS configuration,
> > i.e.,
> > + when there is a miss on the RFF entry, RSS configuration applies
> > + if
> > it exists.
> > +8. To summarize the processing chain on a rx packet is:
> > + {mac,vlan,promisc rx filters} -> {receive flow filters} ->
> > +{rss/hash
> > config}.
> Shouldn't this be
>                                                          |-match-> {RFF processing}
> {mac,vlan,promisc rx filters} -> {receive flow filters} -|
>                                                          |-no match-> {rss/hash config}.
I likely didn't understand your suggestion.
In the filter chain, the first filters are promiscuous, mac, vlan filters which are programmed through the cvq.
If mac filter of cvq drops the packet, packet does not reach the newly introduced RFF filter.
This is because RFF are steering rules (like RSS).
They do not override the existing mac/vlan filters from OS/driver pov.
> The above looks like RSS processing will always happen.
Oh my bad.
Rss is only on the no_match, didn't clarify enough.
Fixing it.
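To make the clarified chain concrete, here is a rough sketch of the per-packet decision, with RSS applied only on an RFF miss. All names (`rx_lookup`, `rx_classify()`, `default_vq`) are hypothetical, the lookups are assumed to be precomputed, and the fallback to a default vq when RSS is not configured is an assumption rather than something stated in the requirements.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum rx_verdict { RX_DROP, RX_DELIVER };

struct rx_result {
    enum rx_verdict verdict;
    uint16_t vq;            /* valid only when verdict == RX_DELIVER */
};

/* Hypothetical, precomputed per-packet lookup outcomes. */
struct rx_lookup {
    bool     l2_pass;       /* passed mac/vlan/promisc filters set via cvq */
    bool     rff_hit;       /* an RFF entry matched */
    bool     rff_drop;      /* the matched entry's action is "drop" */
    uint16_t rff_vq;        /* the matched entry's destination vq */
    bool     rss_enabled;
    uint16_t rss_vq;        /* vq selected by the RSS/hash configuration */
    uint16_t default_vq;    /* assumed fallback when RSS is not configured */
};

static struct rx_result rx_classify(const struct rx_lookup *l)
{
    assert(l != NULL);
    struct rx_result r = { RX_DROP, 0 };

    if (!l->l2_pass)        /* cvq filters run first; RFF never sees the packet */
        return r;
    if (l->rff_hit) {       /* RFF match: apply its action, skip RSS */
        if (!l->rff_drop) {
            r.verdict = RX_DELIVER;
            r.vq = l->rff_vq;
        }
        return r;
    }
    r.verdict = RX_DELIVER; /* RFF miss: RSS applies if it exists */
    r.vq = l->rss_enabled ? l->rss_vq : l->default_vq;
    return r;
}
```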
> > +9. If multiple entries are programmed which has overlapping
> > +attributes
> > for a
> > + received packet, the driver to define the location/priority of the
> > entry.
> If driver does not provide group or provides same group with 2 rules that have a
> same match part, does the new entry overwrite the old one for exact
> matches(no mask) ? and for rules with masks, does the rule with longest match
> take precedence? or the latest added rule takes precedence?
> > +10. The filter entries are usually short in size of few tens of bytes,
> > + for example IPv6 + TCP tuple would be 36 bytes, and ops/sec rate is
> > + high, hence supplying fields inside the queue descriptor is
> > preferred for
> > + up to a certain fixed size, say 56 bytes.
> > +11. A flow filter entry consists of (a) match criteria, (b) action,
> > + (c) destination and (d) a unique 32 bit flow id, all supplied by
> > the
> > + driver.
> > +12. The driver should be able to query and delete flow filter entry
> > +by
> > the
> > + the device by the flow id.
> The flowid here seems to be used for rule index. Can this be returned by the
> device instead of being sent by driver? A 32 bit value to store might impose
> undue restrictions on devices that have lesser capacity. Or could there be a
> restriction that flowid cannot exceed the value returned by device as the
> capacity.
> > +
I had thought about it as well.
The main reason for the driver to choose the value is the live migration scenario.
If the device chooses _any_ id, then one vendor may choose A and another vendor may choose B, and it may not work.
So one way is to keep the driver-supplied id, in which case we need to keep it detached from the capacity.
Alternatively,
we can add a max_value in the device provisioning flow to ease the device implementation,
like how we have max_vqs per device (num_queues in virtio_pci_common_cfg).
(max capacity is already present in below flow_filter_capabilities struct).
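A minimal C sketch of the second option, assuming a provisioned upper bound on the flow id that is independent of the filter capacity. The struct rff_limits, its max_flow_id field and rff_add_allowed() are hypothetical names for illustration; only max_flow_filters corresponds to a field in the quoted flow_filter_capabilities struct.

```c
#include <stdbool.h>
#include <stdint.h>

struct rff_limits {
    uint32_t max_flow_id;       /* hypothetical: provisioned upper bound on ids */
    uint32_t max_flow_filters;  /* capacity, as in flow_filter_capabilities */
};

/*
 * The driver picks the flow id, so the id is checked against the
 * provisioned bound, while the capacity is checked separately against
 * the number of filters already in use.
 */
static bool rff_add_allowed(const struct rff_limits *lim,
                            uint32_t flow_id, uint32_t filters_in_use)
{
    if (flow_id > lim->max_flow_id)
        return false;
    return filters_in_use < lim->max_flow_filters;
}
```

Keeping the two checks separate means a device with room for 64 filters can still accept a driver-chosen id such as 1000, which is what keeps ids stable across a migration between devices of different capacity.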
> > +### 3.4.3 interface example
> > +
> > +Flow filter capabilities to query using a DMA interface:
> > +
> > +```
> > +struct flow_filter_capabilities {
> > + u8 flow_groups;
> > + u16 num_flow_filter_vqs;
> > + u16 start_vq_index;
> > + u32 max_flow_filters_per_group;
> > + u32 max_flow_filters;
> > + u64 supported_packet_field_mask_bmap[4];
> > +};
> > +
> > +
> > +```
> > +
> > +1. Flow filter entry add/modify, delete:
> > +
> > +struct virtio_net_rff_add_modify {
> > + u8 flow_op;
> > + u8 group_id;
> > + u8 padding[2];
> > + le32 flow_id;
> > + struct match_criteria mc;
> > + struct destination dest;
> > + struct action action;
> > +
> > + struct match_criteria mask; /* optional */
> > +};
> > +
> > +2. Flow filter entry delete:
> > +struct virtio_net_rff_delete {
> > + u8 flow_op;
> > + u8 group_id;
> > + u8 padding[2];
> > + le32 flow_id;
> > +};
> > --
> > 2.26.2
> >
This publicly archived list offers a means to provide input to the
OASIS Virtual I/O Device (VIRTIO) TC.
In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.
Subscribe: virtio-comment-subscribe@lists.oasis-open.org
Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
List help: virtio-comment-help@lists.oasis-open.org
List archive: https://lists.oasis-open.org/archives/virtio-comment/
Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
Committee: https://www.oasis-open.org/committees/virtio/
Join OASIS: https://www.oasis-open.org/join/
^ permalink raw reply [flat|nested] 73+ messages in thread
* [virtio-comment] Re: [PATCH requirements 5/7] net-features: Add n-tuple receive flow filters requirements
2023-07-24 3:34 ` [virtio-comment] [PATCH requirements 5/7] net-features: Add n-tuple receive flow filters requirements Parav Pandit
2023-08-01 8:33 ` [virtio-comment] " Parav Pandit
2023-08-02 7:17 ` [virtio-comment] RE: [EXT] [virtio] " Satananda Burla
@ 2023-08-02 15:25 ` Heng Qi
2023-08-03 9:59 ` [virtio-comment] " Parav Pandit
2 siblings, 1 reply; 73+ messages in thread
From: Heng Qi @ 2023-08-02 15:25 UTC (permalink / raw)
To: Parav Pandit
Cc: Satananda Burla, virtio-comment@lists.oasis-open.org,
Shahaf Shuler, virtio@lists.oasis-open.org
On Mon, Jul 24, 2023 at 06:34:19AM +0300, Parav Pandit wrote:
> Add virtio net device requirements for receive flow filters.
>
> Signed-off-by: Parav Pandit <parav@nvidia.com>
> ---
> changelog:
> v1->v2:
> - split setup and operations requirements
> - added design goal
> - worded requirements more precisely
> v0->v1:
> - fixed comments from Heng Li
> - renamed receive flow steering to receive flow filters
> - clarified byte offset in match criteria
> ---
> net-workstream/features-1.4.md | 105 +++++++++++++++++++++++++++++++++
> 1 file changed, 105 insertions(+)
>
> diff --git a/net-workstream/features-1.4.md b/net-workstream/features-1.4.md
> index 27a7886..d228462 100644
> --- a/net-workstream/features-1.4.md
> +++ b/net-workstream/features-1.4.md
> @@ -9,6 +9,7 @@ together is desired while updating the virtio net interface.
> 1. Device counters visible to the driver
> 2. Low latency tx and rx virtqueues for PCI transport
> 3. Virtqueue notification coalescing re-arming support
> +4 Virtqueue receive flow filters (RFF)
>
> # 3. Requirements
> ## 3.1 Device counters
> @@ -175,3 +176,107 @@ struct vnet_rx_completion {
> notifications until the driver rearms the notifications of the virtqueue.
> 2. When the driver rearms the notification of the virtqueue, the device
> to notify again if notification coalescing conditions are met.
> +
> +## 3.4 Virtqueue receive flow filters (RFF)
> +0. Design goal:
> + To filter and/or to steer packet based on specific pattern match to a
> + specific context to support application/networking stack driven receive
> + processing.
> +1. Two use cases are: to support Linux netdev set_rxnfc() for ETHTOOL_SRXCLSRLINS
> + and to support netdev feature NETIF_F_NTUPLE aka ARFS.
Hi, Parav. Sorry for not responding to this in time due to other things recently.
Yes, RFF has two scenarios, set_rxnfc and ARFS, both of which will affect the packet steering on the device side.
I think manually configured rules should have higher priority than ARFS automatic configuration.
This behavior is intuitive and consistent with other drivers. Therefore, the processing chain on a rx packet is:
{mac,vlan,promisc rx filters} -> {set_rxnfc} -> {ARFS} -> {rss/hash config}.
There are also priorities within set_rxnfc and ARFS respectively.
1. For set_rxnfc, which has the exact match and the mask match. Exact matches should have higher priority.
Suppose there are two rules,
rule1: {"tcpv4", "src-ip: 1.1.1.1"} -> rxq1
rule2: {"tcpv4", "src-ip: 1.1.1.1", "dst-port: 8989"} -> rxq2
For received rx packets whose src-ip is 1.1.1.1, rule2 should match instead of rule1.
The rules of set_rxnfc come from manual configuration, the number of these rules is small and
we may not need grouping for this. And ctrlq can meet the configuration rate.
2. For ARFS, which only has the exact match.
For ARFS, since there is only one matching rule for a certain flow, is there no need for a group?
We may need different types of tables, such as a UDPv4 flow table and a TCPv4 flow table, to speed up the lookup for different flow types.
Besides, the high rate and large number of configuration rules mean that we need a flow vq.
Therefore, although set_rxnfc and ARFS share a set of infrastructure, there are still some differences,
such as configuration rate and quantity. So do we need to add two features (VIRTIO_NET_F_RXNFC and VIRTIO_NET_F_ARFS)
for set_rxnfc and ARFS respectively, and ARFS can choose flow vq?
In this way, is it more conducive to advancing the work of RFF (such as accelerating the advancement of set_rxnfc)?
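To make the rule1/rule2 example concrete, here is a small C sketch of one possible reading of "the more specific rule wins": among all matching rules, pick the one that matches on the most fields. This is an assumption for illustration only; the thread also discusses driver-defined priority, and the names (struct rule, pick_rule, F_SRC_IP, ...) are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define F_SRC_IP   (1u << 0)
#define F_DST_PORT (1u << 1)

struct pkt {
    uint32_t src_ip;
    uint16_t dst_port;
};

struct rule {
    uint32_t fields;    /* bitmap of F_* fields this rule matches on */
    uint32_t src_ip;
    uint16_t dst_port;
    uint16_t rxq;
};

static int rule_matches(const struct rule *r, const struct pkt *p)
{
    if ((r->fields & F_SRC_IP) && r->src_ip != p->src_ip)
        return 0;
    if ((r->fields & F_DST_PORT) && r->dst_port != p->dst_port)
        return 0;
    return 1;
}

static unsigned count_fields(uint32_t v)
{
    unsigned n = 0;
    while (v) {
        n += v & 1u;
        v >>= 1;
    }
    return n;
}

/* Returns the index of the most specific matching rule, or -1 on a miss. */
static int pick_rule(const struct rule *rules, size_t n, const struct pkt *p)
{
    int best = -1;
    unsigned best_fields = 0;

    for (size_t i = 0; i < n; i++) {
        unsigned c;

        if (!rule_matches(&rules[i], p))
            continue;
        c = count_fields(rules[i].fields);
        if (best < 0 || c > best_fields) {
            best = (int)i;
            best_fields = c;
        }
    }
    return best;
}
```

With rule1 = {src-ip 1.1.1.1} and rule2 = {src-ip 1.1.1.1, dst-port 8989}, a packet from 1.1.1.1 to port 8989 selects rule2, while a packet from 1.1.1.1 to any other port still selects rule1.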
> +
> +### 3.4.1 control path
> +1. The number of flow filter operations/sec can range from 100k/sec to 1M/sec
> + or even more. Hence flow filter operations must be done over a queueing
> + interface using one or more queues.
This is only for ARFS. Devices that only want to support set_rxnfc do not need
to provide VIRTIO_NET_F_ARFS or consider implementing flow vq.
> +2. The device should be able to expose one or more supported flow filter queue
> + count and its start vq index to the driver.
> +3. As each device may be operating for different performance characteristic,
> + start vq index and count may be different for each device. Secondly, it is
> + inefficient for device to provide flow filters capabilities via a config space
> + region. Hence, the device should be able to share these attributes using
> + dma interface, instead of transport registers.
> +4. Since flow filters are enabled much later in the driver life cycle, driver
> + will likely create these queues when flow filters are enabled.
I understand that the number of flow vqs is not reflected in
max_virtqueue_pairs. And a new vq is created at runtime, is this
supported in the existing virtio spec?
> +5. Flow filter operations are often accelerated by device in a hardware. Ability
> + to handle them on a queue other than control vq is desired. This achieves near
> + zero modifications to existing implementations to add new operations on new
> + purpose built queues (similar to transmit and receive queue).
> +6. The filter masks are optional; the device should be able to expose if it
> + support filter masks.
> +7. The driver may want to have priority among group of flow entries; to facilitate
> + the device support grouping flow filter entries by a notion of a group. Each
> + group defines priority in processing flow.
> +8. The driver and group owner driver should be able to query supported device
> + limits for the flow filter entries.
> +
> +### 3.4.2 flow operations path
> +1. The driver should be able to define a receive packet match criteria, an
> + action and a destination for a packet.
When the user does not specify a destination when configuring a rule, do
we need a default destination?
> For example, an ipv4 packet with a
> + multicast address to be steered to the receive vq 0. The second example is
> + ipv4, tcp packet matching a specified IP address and tcp port tuple to
> + be steered to receive vq 10.
> +2. The match criteria should include exact tuple fields well-defined such as mac
> + address, IP addresses, tcp/udp ports, etc.
> +3. The match criteria should also optionally include the field mask.
> +4. The match criteria may optionally also include specific packet byte offset
> + pattern, match length, mask instead of RFC defined fields.
> + length, and matching pattern, which may not be defined in the standard RFC.
Is there a description error here?
> +5. Action includes (a) dropping or (b) forwarding the packet.
> +6. Destination is a receive virtqueue index.
The concept of RSS context does not yet exist in the virtio spec.
Did we say that we also support carrying RSS context information when
negotiating the RFF feature? For example, RSS context configuration
commands and structures, etc.
Or do we support RSS context functionality as a separate feature in another thread?
A related point to consider is that once a user inserts a rule with an
RSS context, that RSS context must not be deleted, otherwise the device
behavior is undefined.
Thanks!
> +7. The device should process packet receive filters programmed via control vq
> + commands first in the processing chain.
> +7. The device should process RFF entries before RSS configuration, i.e.,
> + when there is a miss on the RFF entry, RSS configuration applies if it exists.
> +8. To summarize the processing chain on a rx packet is:
> + {mac,vlan,promisc rx filters} -> {receive flow filters} -> {rss/hash config}.
> +9. If multiple entries are programmed which has overlapping attributes for a
> + received packet, the driver to define the location/priority of the entry.
> +10. The filter entries are usually short in size of few tens of bytes,
> + for example IPv6 + TCP tuple would be 36 bytes, and ops/sec rate is
> + high, hence supplying fields inside the queue descriptor is preferred for
> + up to a certain fixed size, say 56 bytes.
> +11. A flow filter entry consists of (a) match criteria, (b) action,
> + (c) destination and (d) a unique 32 bit flow id, all supplied by the
> + driver.
> +12. The driver should be able to query and delete flow filter entry by the
> + the device by the flow id.
> +
> +### 3.4.3 interface example
> +
> +Flow filter capabilities to query using a DMA interface:
> +
> +```
> +struct flow_filter_capabilities {
> + u8 flow_groups;
> + u16 num_flow_filter_vqs;
> + u16 start_vq_index;
> + u32 max_flow_filters_per_group;
> + u32 max_flow_filters;
> + u64 supported_packet_field_mask_bmap[4];
> +};
> +
> +
> +```
> +
> +1. Flow filter entry add/modify, delete:
> +
> +struct virtio_net_rff_add_modify {
> + u8 flow_op;
> + u8 group_id;
> + u8 padding[2];
> + le32 flow_id;
> + struct match_criteria mc;
> + struct destination dest;
> + struct action action;
> +
> + struct match_criteria mask; /* optional */
> +};
> +
> +2. Flow filter entry delete:
> +struct virtio_net_rff_delete {
> + u8 flow_op;
> + u8 group_id;
> + u8 padding[2];
> + le32 flow_id;
> +};
> --
> 2.26.2
^ permalink raw reply [flat|nested] 73+ messages in thread
* Re: [virtio-comment] RE: [PATCH requirements 5/7] net-features: Add n-tuple receive flow filters requirements
2023-08-02 6:44 ` Parav Pandit
@ 2023-08-02 15:32 ` Heng Qi
2023-08-03 10:01 ` Parav Pandit
0 siblings, 1 reply; 73+ messages in thread
From: Heng Qi @ 2023-08-02 15:32 UTC (permalink / raw)
To: Parav Pandit, virtio-comment@lists.oasis-open.org,
Michael S. Tsirkin
Cc: Shahaf Shuler, virtio@lists.oasis-open.org
On 2023/8/2 2:44 PM, Parav Pandit wrote:
>
>> From: virtio@lists.oasis-open.org <virtio@lists.oasis-open.org> On Behalf Of
>> Parav Pandit
>> Sent: Tuesday, August 1, 2023 2:04 PM
>>
>> Hi Michael and all,
>>
>>> From: Parav Pandit <parav@nvidia.com>
>>> Sent: Monday, July 24, 2023 9:04 AM
>>>
>>> Add virtio net device requirements for receive flow filters.
>>>
>>> Signed-off-by: Parav Pandit <parav@nvidia.com>
>> Do you have any further comments on it?
>>
>> Heng and I want to progress now to start the design/spec draft of it starting 3rd
>> Aug.
>> Flow filter command requires packed VQ descriptor extension.
>> So, we need to wrap up now for this infrastructure extension pre-work which
>> has significant effort.
>>
>> For now, we are taking design to next stage for these steering requirements.
>> The packed vq extension is going to benefit low latency tx part as well as
>> reviewed by Stephan.
> In today's meeting we further discussed that inline is an optional feature for the txq and for steering filters.
> Hence, based on the feedback, we can progress now without the packed vq extension.
Hi, Parav.
I remember that the last meeting (7.26) was canceled and the meeting was
postponed until today (8.02)?
If yes, unfortunately we missed this meeting :(.
Is the next meeting still two weeks away (8.16) ?
Thanks!
>
> We need to close on the config space access via dma interface.
> So lets finish that blocker now so counters and filters can progress.
>
^ permalink raw reply [flat|nested] 73+ messages in thread
* [virtio-comment] RE: [EXT] [virtio] [PATCH requirements 5/7] net-features: Add n-tuple receive flow filters requirements
2023-08-02 8:14 ` Parav Pandit
@ 2023-08-02 18:32 ` Satananda Burla
2023-08-04 7:32 ` Parav Pandit
0 siblings, 1 reply; 73+ messages in thread
From: Satananda Burla @ 2023-08-02 18:32 UTC (permalink / raw)
To: Parav Pandit, virtio-comment@lists.oasis-open.org
Cc: Shahaf Shuler, hengqi@linux.alibaba.com,
virtio@lists.oasis-open.org
> -----Original Message-----
> From: Parav Pandit <parav@nvidia.com>
> Sent: Wednesday, August 2, 2023 1:15 AM
> To: Satananda Burla <sburla@marvell.com>; virtio-comment@lists.oasis-
> open.org
> Cc: Shahaf Shuler <shahafs@nvidia.com>; hengqi@linux.alibaba.com;
> virtio@lists.oasis-open.org
> Subject: RE: [EXT] [virtio] [PATCH requirements 5/7] net-features: Add
> n-tuple receive flow filters requirements
>
>
> > From: Satananda Burla <sburla@marvell.com>
> > Sent: Wednesday, August 2, 2023 12:48 PM
>
> [..]
> > > +7. The device should process packet receive filters programmed via
> > > control vq
> > > + commands first in the processing chain.
> > > +7. The device should process RFF entries before RSS configuration,
> > > i.e.,
> > > + when there is a miss on the RFF entry, RSS configuration applies
> > > + if
> > > it exists.
> > > +8. To summarize the processing chain on a rx packet is:
> > > + {mac,vlan,promisc rx filters} -> {receive flow filters} ->
> > > +{rss/hash
> > > config}.
> > Shouldn't this be
> >                                                          |-match-> {RFF processing}
> > {mac,vlan,promisc rx filters} -> {receive flow filters} -|
> >                                                          |-no match-> {rss/hash config}.
> I likely didn't understand your suggestion.
>
> In the filter chain, the first filters are promiscuous, mac, vlan
> filters which are programmed through the cvq.
> If mac filter of cvq drops the packet, packet does not reach the newly
> introduced RFF filter.
>
> This is because RFF are steering rules (like RSS).
> They do not override the existing mac/vlan filters from OS/driver pov.
>
> > The above looks like RSS processing will always happen.
>
> Oh my bad.
> Rss is only on the no_match, didn't clarify enough.
>
> Fixing it.
>
> > > +9. If multiple entries are programmed which has overlapping
> > > +attributes
> > > for a
> > > + received packet, the driver to define the location/priority of
> the
> > > entry.
> > If driver does not provide group or provides same group with 2 rules
> that have a
> > same match part, does the new entry overwrite the old one for exact
> > matches(no mask) ? and for rules with masks, does the rule with
> longest match
> > take precedence? or the latest added rule takes precedence?
> > > +10. The filter entries are usually short in size of few tens of
> bytes,
> > > + for example IPv6 + TCP tuple would be 36 bytes, and ops/sec rate
> is
> > > + high, hence supplying fields inside the queue descriptor is
> > > preferred for
> > > + up to a certain fixed size, say 56 bytes.
> > > +11. A flow filter entry consists of (a) match criteria, (b) action,
> > > + (c) destination and (d) a unique 32 bit flow id, all supplied
> by
> > > the
> > > + driver.
> > > +12. The driver should be able to query and delete flow filter entry
> > > +by
> > > the
> > > + the device by the flow id.
> > The flowid here seems to be used for rule index. Can this be returned
> by the
> > device instead of being sent by driver? A 32 bit value to store might
> impose
> > undue restrictions on devices that have lesser capacity. Or could
> there be a
> > restriction that flowid cannot exceed the value returned by device as
> the
> > capacity.
> > > +
> I had thought about it as well.
> The main reason for driver to choose the value is, live migration
> scenario.
> If the device chooses _any_ id, then one vendor may choose A and other
> vendor may choose B and it may not work
>
> So one way is to keep driver supplied id we need to keep it detached
> from the capacity.
Ok. I was proposing that everybody agrees to use index value in 0-n per
group. I am fine with the size limitation described below.
>
> Alternatively,
> we can add max_value in the device provisioning flow to ease the device
> implementation.
> Like how we have max_vqs per device (num_queues in
> virtio_pci_common_cfg).
>
> (max capacity is already present in below flow_filter_capabilities
> struct).
Yes, this was my alternate suggestion as well. We could have a max value
in the provisioning flow.
>
> > > +### 3.4.3 interface example
> > > +
> > > +Flow filter capabilities to query using a DMA interface:
> > > +
> > > +```
> > > +struct flow_filter_capabilities {
> > > + u8 flow_groups;
> > > + u16 num_flow_filter_vqs;
> > > + u16 start_vq_index;
> > > + u32 max_flow_filters_per_group;
> > > + u32 max_flow_filters;
> > > + u64 supported_packet_field_mask_bmap[4];
> > > +};
> > > +
> > > +
> > > +```
> > > +
> > > +1. Flow filter entry add/modify, delete:
> > > +
> > > +struct virtio_net_rff_add_modify {
> > > + u8 flow_op;
> > > + u8 group_id;
> > > + u8 padding[2];
> > > + le32 flow_id;
> > > + struct match_criteria mc;
> > > + struct destination dest;
> > > + struct action action;
> > > +
> > > + struct match_criteria mask; /* optional */
> > > +};
> > > +
> > > +2. Flow filter entry delete:
> > > +struct virtio_net_rff_delete {
> > > + u8 flow_op;
> > > + u8 group_id;
> > > + u8 padding[2];
> > > + le32 flow_id;
> > > +};
> > > --
> > > 2.26.2
> > >
^ permalink raw reply [flat|nested] 73+ messages in thread
* [virtio-comment] RE: [PATCH requirements 5/7] net-features: Add n-tuple receive flow filters requirements
2023-08-02 15:25 ` [virtio-comment] " Heng Qi
@ 2023-08-03 9:59 ` Parav Pandit
2023-08-03 13:07 ` [virtio-comment] " Heng Qi
2023-08-08 8:21 ` [virtio-comment] " Heng Qi
0 siblings, 2 replies; 73+ messages in thread
From: Parav Pandit @ 2023-08-03 9:59 UTC (permalink / raw)
To: Heng Qi
Cc: Satananda Burla, virtio-comment@lists.oasis-open.org,
Shahaf Shuler, virtio@lists.oasis-open.org
> From: Heng Qi <hengqi@linux.alibaba.com>
> Sent: Wednesday, August 2, 2023 8:55 PM
> Hi, Parav. Sorry for not responding to this in time due to other things recently.
>
> Yes, RFF has two scenarios, set_rxnfc and ARFS, both of which will affect the
> packet steering on the device side.
> I think manually configured rules should have higher priority than ARFS
> automatic configuration.
> This behavior is intuitive and consistent with other drivers. Therefore, the
> processing chain on a rx packet is:
> {mac,vlan,promisc rx filters} -> {set_rxnfc} -> {ARFS} -> {rss/hash config}.
>
Correct.
Within the RFF context, the priority among multiple RFF entries is governed by the concept of a group.
So the above two users of RFF will create two groups, assign a priority to each, and thereby achieve the desired processing order.
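As a rough C sketch of this group idea: each user (set_rxnfc, ARFS) owns one group, and a hit in the higher-priority group wins over a hit in a lower-priority one. The struct layouts, the name rff_group_lookup and the convention that a lower priority value is evaluated first are assumptions made for illustration, not part of the requirements text.

```c
#include <stddef.h>
#include <stdint.h>

struct rff_entry {
    uint32_t flow_id;
    uint32_t src_ip;    /* single-field match keeps the sketch short */
    uint16_t rxq;
};

struct rff_group {
    uint8_t group_id;
    uint8_t priority;   /* assumption: a lower value is evaluated first */
    const struct rff_entry *entries;
    size_t num_entries;
};

/* Returns the destination rxq, or -1 on a miss in every group. */
static int rff_group_lookup(const struct rff_group *groups, size_t num_groups,
                            uint32_t src_ip)
{
    int rxq = -1;
    unsigned best_prio = 256;   /* above any uint8_t priority */

    for (size_t g = 0; g < num_groups; g++) {
        if (groups[g].priority >= best_prio)
            continue;           /* a better group already matched */
        for (size_t e = 0; e < groups[g].num_entries; e++) {
            if (groups[g].entries[e].src_ip == src_ip) {
                rxq = groups[g].entries[e].rxq;
                best_prio = groups[g].priority;
                break;
            }
        }
    }
    return rxq;
}
```

If the set_rxnfc group has priority 0 and the ARFS group has priority 1, a flow present in both groups is steered by the set_rxnfc entry, and a miss in every group (return value -1) is where the rss/hash configuration would take over.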
> There are also priorities within set_rxnfc and ARFS respectively.
> 1. For set_rxnfc, which has the exact match and the mask match. Exact
> matches should have higher priority.
> Suppose there are two rules,
> rule1: {"tcpv4", "src-ip: 1.1.1.1"} -> rxq1
> rule2: {"tcpv4", "src-ip: 1.1.1.1", "dst-port: 8989"} -> rxq2 .For recieved
> rx packets whose src-ip is 1.1.1.1, should match rule2 instead of rule1.
>
Yes. The driver should also be able to set the priority within the group for the above scenario.
> The rules of set_rxnfc come from manual configuration, the number of these
> rules is small and we may not need group grouping for this. And ctrlq can meet
> the configuration rate,
>
Yes, but having a single interface for both use cases spares the device implementation from building infrastructure specific to each driver interface.
Both can be handled by a unified interface.
> 2. For ARFS, which only has the exact match.
> For ARFS, since there is only one matching rule for a certain flow, so there is no
> need for group?
Groups define the priority between the two types of rules.
Within the ARFS domain we don't need a group.
However, instead of starting with only two limiting groups, it is better to have some flexibility to support multiple groups.
A device can support one, two or more groups.
So if a use case arises in the future, the interface won't be a limitation.
> We may need different types of tables, such as UDPv4 flow table, TCPv4 flow
> table to speed up the lookup for differect flow types.
> Besides, the high rate and large number of configuration rules means that we
> need flow vq.
>
Yes, though I am not sure those tables should be exposed to the driver.
I am thinking that a device can decide how many tables it is able to create.
> Therefore, although set_rxnfc and ARFS share a set of infrastructure, there are
> still some differences, such as configuration rate and quantity. So do we need
> add two features (VIRTIO_NET_F_RXNFC and VIRTIO_NET_F_ARFS) for
> set_rxnfc and ARFS respectively, and ARFS can choose flow vq?
Not really, as one interface can fulfill both needs without attaching it to a specific OS interface.
> In this way, is it more conducive to advancing the work of RFF (such as
> accelerating the advancement of set_rxnfc)?
>
Both use cases are equally and immediately usable, so we can advance them easily using a single interface now.
> > +
> > +### 3.4.1 control path
> > +1. The number of flow filter operations/sec can range from 100k/sec to
> 1M/sec
> > + or even more. Hence flow filter operations must be done over a queueing
> > + interface using one or more queues.
>
> This is only for ARFS, for devices that only want to support set_rxnfc, they don't
> provide VIRTIO_NET_F_ARFS and consider implementing flow vq.
>
Well, once the device implements the flow vq, it will service both cases.
A simple device implementation that only cares about RXNFC can implement the flow vq in semi-software, serving a very small number of req/sec.
> > +2. The device should be able to expose one or more supported flow filter
> queue
> > + count and its start vq index to the driver.
> > +3. As each device may be operating for different performance characteristic,
> > + start vq index and count may be different for each device. Secondly, it is
> > + inefficient for device to provide flow filters capabilities via a config space
> > + region. Hence, the device should be able to share these attributes using
> > + dma interface, instead of transport registers.
> > +4. Since flow filters are enabled much later in the driver life cycle, driver
> > + will likely create these queues when flow filters are enabled.
>
> I understand that the number of flow vqs is not reflected in
> max_virtqueue_pairs. And a new vq is created at runtime, is this supported in
> the existing virtio spec?
>
We will extend the virtio spec now if it is not supported.
But yes, it is supported, because max_virtqueue_pairs will not expose the count of flow vqs
(similar to how we did the AQ).
And a flow vq is not a _pair_ anyway, so we cannot expose it there.
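One way a driver could sanity-check the advertised flow vq range is sketched below in C. It assumes the existing virtio-net layout, where rx/tx vqs use indexes 0..2N-1 and the control vq uses index 2N for N = max_virtqueue_pairs; where flow vqs actually land, and whether other queues (such as an AQ) sit in between, is not decided in this thread. The struct flow_vq_caps and flow_vq_range_ok() are hypothetical names that reuse two fields from the quoted flow_filter_capabilities.

```c
#include <stdbool.h>
#include <stdint.h>

/* Subset of the flow_filter_capabilities fields quoted in this thread. */
struct flow_vq_caps {
    uint16_t num_flow_filter_vqs;
    uint16_t start_vq_index;
};

/*
 * Assumption: data vqs use indexes 0..2N-1 and the control vq uses
 * index 2N, where N is max_virtqueue_pairs.
 */
static bool flow_vq_range_ok(uint16_t max_virtqueue_pairs,
                             const struct flow_vq_caps *caps)
{
    uint32_t cvq_index = 2u * max_virtqueue_pairs;
    uint32_t first = caps->start_vq_index;
    uint32_t end = first + caps->num_flow_filter_vqs;   /* one past the last */

    if (caps->num_flow_filter_vqs == 0)
        return false;           /* nothing to create */
    if (first <= cvq_index)
        return false;           /* would collide with rx/tx/control vqs */
    return end <= 65536u;       /* vq indexes are 16 bit */
}
```

For a device with 4 queue pairs the control vq is at index 8, so a flow vq range starting at index 9 passes this check while one starting at index 8 does not.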
> > +5. Flow filter operations are often accelerated by device in a hardware.
> Ability
> > + to handle them on a queue other than control vq is desired. This achieves
> near
> > + zero modifications to existing implementations to add new operations on
> new
> > + purpose built queues (similar to transmit and receive queue).
> > +6. The filter masks are optional; the device should be able to expose if it
> > + support filter masks.
> > +7. The driver may want to have priority among group of flow entries; to
> facilitate
> > + the device support grouping flow filter entries by a notion of a group. Each
> > + group defines priority in processing flow.
> > +8. The driver and group owner driver should be able to query supported
> device
> > + limits for the flow filter entries.
> > +
> > +### 3.4.2 flow operations path
> > +1. The driver should be able to define a receive packet match criteria, an
> > + action and a destination for a packet.
>
> When the user does not specify a destination when configuring a rule, do we
> need a default destination?
>
I think we should not give such an option to the driver.
A human/end user may not have a destination in mind, but the driver should be able to decide on a predictable destination.
> > For example, an ipv4 packet with a
> > + multicast address to be steered to the receive vq 0. The second example is
> > + ipv4, tcp packet matching a specified IP address and tcp port tuple to
> > + be steered to receive vq 10.
> > +2. The match criteria should include exact tuple fields well-defined such as
> mac
> > + address, IP addresses, tcp/udp ports, etc.
> > +3. The match criteria should also optionally include the field mask.
> > +4. The match criteria may optionally also include specific packet byte offset
> > + pattern, match length, mask instead of RFC defined fields.
> > + length, and matching pattern, which may not be defined in the standard
> RFC.
>
> Is there a description error here?
>
Didn't follow your comment. Do you mean there is an error in above description?
> > +5. Action includes (a) dropping or (b) forwarding the packet.
> > +6. Destination is a receive virtqueue index.
>
> Since the concept of RSS context does not yet exist in the virtio spec.
> Did we say that we also support carrying RSS context information when
> negotiating the RFF feature? For example, RSS context configuration commands
> and structures, etc.
>
> Or support RSS context functionality as a separate feature in another thread?
>
Support RSS context as separate feature.
> A related point to consider is that when a user inserts a rule with an rss context,
> the RSS context cannot be deleted, otherwise the device will cause undefined
> behavior.
>
Yes, for now we can keep rss context as separate feature.
^ permalink raw reply [flat|nested] 73+ messages in thread
* RE: [virtio-comment] RE: [PATCH requirements 5/7] net-features: Add n-tuple receive flow filters requirements
2023-08-02 15:32 ` Heng Qi
@ 2023-08-03 10:01 ` Parav Pandit
2023-08-03 13:11 ` [virtio-comment] Re: [virtio] " Heng Qi
0 siblings, 1 reply; 73+ messages in thread
From: Parav Pandit @ 2023-08-03 10:01 UTC (permalink / raw)
To: Heng Qi, virtio-comment@lists.oasis-open.org, Michael S. Tsirkin
Cc: Shahaf Shuler, virtio@lists.oasis-open.org
> From: Heng Qi <hengqi@linux.alibaba.com>
> Sent: Wednesday, August 2, 2023 9:03 PM
> Hi, Parav.
> I remember that the last meeting (7.26) was canceled and the meeting was
> postponed until today (8.02)?
> If yes, unfortunately we missed this meeting :(.
>
> Is the next meeting still two weeks away (8.16) ?
>
Yes. I was severely ill during the last two weeks, so I cancelled it.
I am sorry for the disruption.
We will continue the regular schedule of 8/16.
^ permalink raw reply [flat|nested] 73+ messages in thread
* [virtio-comment] Re: [PATCH requirements 5/7] net-features: Add n-tuple receive flow filters requirements
2023-08-03 9:59 ` [virtio-comment] " Parav Pandit
@ 2023-08-03 13:07 ` Heng Qi
2023-08-04 6:20 ` [virtio-comment] " Parav Pandit
2023-08-08 8:21 ` [virtio-comment] " Heng Qi
1 sibling, 1 reply; 73+ messages in thread
From: Heng Qi @ 2023-08-03 13:07 UTC (permalink / raw)
To: Parav Pandit
Cc: Satananda Burla, virtio-comment@lists.oasis-open.org,
Shahaf Shuler, virtio@lists.oasis-open.org
On Thu, Aug 03, 2023 at 09:59:54AM +0000, Parav Pandit wrote:
>
> > From: Heng Qi <hengqi@linux.alibaba.com>
> > Sent: Wednesday, August 2, 2023 8:55 PM
>
> > Hi, Parav. Sorry for not responding to this in time due to other things recently.
> >
> > Yes, RFF has two scenarios, set_rxnfc and ARFS, both of which will affect the
> > packet steering on the device side.
> > I think manually configured rules should have higher priority than ARFS
> > automatic configuration.
> > This behavior is intuitive and consistent with other drivers. Therefore, the
> > processing chain on a rx packet is:
> > {mac,vlan,promisc rx filters} -> {set_rxnfc} -> {ARFS} -> {rss/hash config}.
> >
> Correct.
> Within the RFF context, the priority among multiple RFF entries is governed by the concept of group.
> So above two users of the RFF will create two groups and assign priority to it and achieve the desired processing order.
OK, we intend to use the group as the unit of rule storage. Therefore, we
should have two priorities:
1. One is the priority of the group. This field does not appear in
the structure virtio_net_rff_add_modify; or does the group id imply the priority
(for example, the smaller the group id, the higher the priority)?
2. The other is the priority of the rule. The current structure
virtio_net_rff_add_modify is still missing this.
I think we should add some more text in the next version describing how
matching rules are prioritized and how groups work. This is important
for RFF.
I also want to confirm that, for the interaction between the driver and
the device, the driver only needs to tell the device the priority of the group and the
priority of the rule, and that we should not reflect how the device stores
and queries rules (such as TCAM or some ACL acceleration solutions)?
>
> > There are also priorities within set_rxnfc and ARFS respectively.
> > 1. For set_rxnfc, which has the exact match and the mask match. Exact
> > matches should have higher priority.
> > Suppose there are two rules,
> > rule1: {"tcpv4", "src-ip: 1.1.1.1"} -> rxq1
> > rule2: {"tcpv4", "src-ip: 1.1.1.1", "dst-port: 8989"} -> rxq2 .For recieved
> > rx packets whose src-ip is 1.1.1.1, should match rule2 instead of rule1.
> >
> Yes. Driver should be able to set the priority within the group as well for above scenario.
But here I am wrong, it should be:
rx packets whose src-ip is 1.1.1.1 and dst-port is 8989, should match rule2 instead of rule1.
>
> > The rules of set_rxnfc come from manual configuration, the number of these
> > rules is small and we may not need group grouping for this. And ctrlq can meet
> > the configuration rate,
> >
> Yes, but having single interface for two use cases enables the device implementation to not build driver interface specific infra.
> Both can be handled by unified interface.
I agree:) Is ctrlq an option when the num_flow_filter_vqs
exposed by the device is 0?
>
> > 2. For ARFS, which only has the exact match.
> > For ARFS, since there is only one matching rule for a certain flow, so there is no
> > need for group?
> Groups are defining the priority between two types of rules.
> Within ARFS domain we don't need group.
Yes, ARFS doesn't need group.
>
> However instead of starting with only two limiting groups, it is better to have some flexibility for supporting multiple groups.
> A device can device one/two or more groups.
> So in future if a use case arise, interface wont be limiting to it.
Ok. This works.
>
> > We may need different types of tables, such as UDPv4 flow table, TCPv4 flow
> > table to speed up the lookup for differect flow types.
> > Besides, the high rate and large number of configuration rules means that we
> > need flow vq.
> >
> Yes, I am not sure if those tables should be exposed to the driver.
> Thinking that a device may be able to decide on table count which it may be able to create.
So how to store rules, and what method to use to store them, is determined by
the device, just like my question above? If yes, I think this is a good
way to work, because it allows for increased flexibility in device
implementation.
>
> > Therefore, although set_rxnfc and ARFS share a set of infrastructure, there are
> > still some differences, such as configuration rate and quantity. So do we need
> > add two features (VIRTIO_NET_F_RXNFC and VIRTIO_NET_F_ARFS) for
> > set_rxnfc and ARFS respectively, and ARFS can choose flow vq?
> Not really, as one interface can fullfil both the needs without attaching it to a specific OS interface.
>
Ok!
> > In this way, is it more conducive to advancing the work of RFF (such as
> > accelerating the advancement of set_rxnfc)?
> >
> Both the use cases are equally immediately usable so we can advance it easily using single interface now.
>
> > > +
> > > +### 3.4.1 control path
> > > +1. The number of flow filter operations/sec can range from 100k/sec to
> > 1M/sec
> > > + or even more. Hence flow filter operations must be done over a queueing
> > > + interface using one or more queues.
> >
> > This is only for ARFS, for devices that only want to support set_rxnfc, they don't
> > provide VIRTIO_NET_F_ARFS and consider implementing flow vq.
> >
> Well once the device implements flow vq, it will service both cases.
> A simple device implementation who only case for RXNFC, can implement flowvq in semi-software serving very small number of req/sec.
>
When the device does not provide a flow vq, can the driver use
ctrlq to configure the device?
> > > +2. The device should be able to expose one or more supported flow filter
> > queue
> > > + count and its start vq index to the driver.
> > > +3. As each device may be operating for different performance characteristic,
> > > + start vq index and count may be different for each device. Secondly, it is
> > > + inefficient for device to provide flow filters capabilities via a config space
> > > + region. Hence, the device should be able to share these attributes using
> > > + dma interface, instead of transport registers.
> > > +4. Since flow filters are enabled much later in the driver life cycle, driver
> > > + will likely create these queues when flow filters are enabled.
> >
> > I understand that the number of flow vqs is not reflected in
> > max_virtqueue_pairs. And a new vq is created at runtime, is this supported in
> > the existing virtio spec?
> >
> We are extending the virtio-spec now if it is not supported.
> But yes, it is supported because max_virtqueue_pairs will not expose the count of flow_vq.
> (similar to how we did the AQ).
> And flowvq anyway is not _pair_ so we cannot expose there anyway.
Absolutely.
>
> > > +5. Flow filter operations are often accelerated by device in a hardware.
> > Ability
> > > + to handle them on a queue other than control vq is desired. This achieves
> > near
> > > + zero modifications to existing implementations to add new operations on
> > new
> > > + purpose built queues (similar to transmit and receive queue).
> > > +6. The filter masks are optional; the device should be able to expose if it
> > > + support filter masks.
> > > +7. The driver may want to have priority among group of flow entries; to
> > facilitate
> > > + the device support grouping flow filter entries by a notion of a group. Each
> > > + group defines priority in processing flow.
> > > +8. The driver and group owner driver should be able to query supported
> > device
> > > + limits for the flow filter entries.
> > > +
> > > +### 3.4.2 flow operations path
> > > +1. The driver should be able to define a receive packet match criteria, an
> > > + action and a destination for a packet.
> >
> > When the user does not specify a destination when configuring a rule, do we
> > need a default destination?
> >
> I think we should not give such option to driver.
> A human/end user may not have the destination, but driver should be able to decide a predictable destination.
Yes, that's what I mean:), and I said "we" for "the driver."
>
> > > For example, an ipv4 packet with a
> > > + multicast address to be steered to the receive vq 0. The second example is
> > > + ipv4, tcp packet matching a specified IP address and tcp port tuple to
> > > + be steered to receive vq 10.
> > > +2. The match criteria should include exact tuple fields well-defined such as
> > mac
> > > + address, IP addresses, tcp/udp ports, etc.
> > > +3. The match criteria should also optionally include the field mask.
> > > +4. The match criteria may optionally also include specific packet byte offset
> > > + pattern, match length, mask instead of RFC defined fields.
> > > + length, and matching pattern, which may not be defined in the standard
> > RFC.
> >
> > Is there a description error here?
> >
> Didn't follow your comment. Do you mean there is an error in above description?
I don't quite understand what "specific packet byte offset pattern" means :(
>
> > > +5. Action includes (a) dropping or (b) forwarding the packet.
> > > +6. Destination is a receive virtqueue index.
> >
> > Since the concept of RSS context does not yet exist in the virtio spec.
> > Did we say that we also support carrying RSS context information when
> > negotiating the RFF feature? For example, RSS context configuration commands
> > and structures, etc.
> >
> > Or support RSS context functionality as a separate feature in another thread?
> >
> Support RSS context as separate feature.
Ok. Humbly asking whether your work plan includes this part: do you need me
to take on some of your work, such as the rss context?
Thanks a lot!
>
> > A related point to consider is that when a user inserts a rule with an rss context,
> > the RSS context cannot be deleted, otherwise the device will cause undefined
> > behavior.
> >
> Yes, for now we can keep rss context as separate feature.
^ permalink raw reply [flat|nested] 73+ messages in thread
* [virtio-comment] Re: [virtio] RE: [virtio-comment] RE: [PATCH requirements 5/7] net-features: Add n-tuple receive flow filters requirements
2023-08-03 10:01 ` Parav Pandit
@ 2023-08-03 13:11 ` Heng Qi
0 siblings, 0 replies; 73+ messages in thread
From: Heng Qi @ 2023-08-03 13:11 UTC (permalink / raw)
To: Parav Pandit
Cc: Shahaf Shuler, virtio@lists.oasis-open.org, Michael S. Tsirkin,
virtio-comment@lists.oasis-open.org
On 2023/8/3 at 6:01 PM, Parav Pandit wrote:
>> From: Heng Qi <hengqi@linux.alibaba.com>
>> Sent: Wednesday, August 2, 2023 9:03 PM
>> Hi, Parav.
>> I remember that the last meeting (7.26) was canceled and the meeting was
>> postponed until today (8.02)?
>> If yes, unfortunately we missed this meeting :(.
>>
>> Is the next meeting still two weeks away (8.16) ?
>>
> Yes. I was severely ill during last two weeks, so cancelled it.
I'm sorry to hear that. Please take good care!
> I am sorry for the disruption.
> We will continue the regular schedule of 8/16.
Ok, we'll be there on time!
Thanks.
^ permalink raw reply [flat|nested] 73+ messages in thread
* [virtio-comment] RE: [PATCH requirements 5/7] net-features: Add n-tuple receive flow filters requirements
2023-08-03 13:07 ` [virtio-comment] " Heng Qi
@ 2023-08-04 6:20 ` Parav Pandit
2023-08-04 7:17 ` [virtio-comment] " Heng Qi
0 siblings, 1 reply; 73+ messages in thread
From: Parav Pandit @ 2023-08-04 6:20 UTC (permalink / raw)
To: Heng Qi
Cc: Satananda Burla, virtio-comment@lists.oasis-open.org,
Shahaf Shuler, virtio@lists.oasis-open.org
> From: Heng Qi <hengqi@linux.alibaba.com>
> Sent: Thursday, August 3, 2023 6:37 PM
>
> On Thu, Aug 03, 2023 at 09:59:54AM +0000, Parav Pandit wrote:
> >
> > > From: Heng Qi <hengqi@linux.alibaba.com>
> > > Sent: Wednesday, August 2, 2023 8:55 PM
> >
> > > Hi, Parav. Sorry for not responding to this in time due to other things
> recently.
> > >
> > > Yes, RFF has two scenarios, set_rxnfc and ARFS, both of which will
> > > affect the packet steering on the device side.
> > > I think manually configured rules should have higher priority than
> > > ARFS automatic configuration.
> > > This behavior is intuitive and consistent with other drivers.
> > > Therefore, the processing chain on a rx packet is:
> > > {mac,vlan,promisc rx filters} -> {set_rxnfc} -> {ARFS} -> {rss/hash config}.
> > >
> > Correct.
> > Within the RFF context, the priority among multiple RFF entries is governed by
> the concept of group.
> > So above two users of the RFF will create two groups and assign priority to it
> and achieve the desired processing order.
>
> OK, we intend to use group as the concept of rule storage. Therefore, we
> should have two priorities:
> 1. one is the priority of the group, this field is not seen in the structure
> virtio_net_rff_add_modify, or the group id implies the priority (for example, the
> smaller the priority, the higher the priority?)?
Good catch, yes, we need priority assignment to the group.
Hence, we need group add/delete command as well.
> 2. the other is the priority of the rule, the current structure
> virtio_net_rff_add_modify is still missing this.
>
Adding it.
> I think we should add some more texts in the next version describing how
> matching rules are prioritized and how groups work. This is important for RFF.
>
> I also want to confirm that for the interaction between the driver and the
> device, the driver only needs to tell the priority of the device group and the
> priority of the rule, and we should not reflect how the device stores and
> queries rules (such as tcam or some acl acceleration solutions) ?
>
Correct, we try to keep things as abstract as possible.
> >
> > > There are also priorities within set_rxnfc and ARFS respectively.
> > > 1. For set_rxnfc, which has the exact match and the mask match.
> > > Exact matches should have higher priority.
> > > Suppose there are two rules,
> > > rule1: {"tcpv4", "src-ip: 1.1.1.1"} -> rxq1
> > > rule2: {"tcpv4", "src-ip: 1.1.1.1", "dst-port: 8989"} -> rxq2 .For
> > > recieved rx packets whose src-ip is 1.1.1.1, should match rule2 instead of
> rule1.
> > >
> > Yes. Driver should be able to set the priority within the group as well for above
> scenario.
>
> But here I am wrong, it should be:
> rx packets whose src-ip is 1.1.1.1 and dst-port is 8989, should match rule2
> instead of rule1.
>
That is what you wrote, here both the rules are within one group.
And rule2 should have higher priority than rule1.
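A minimal Python sketch of that ordering within one group follows. The Rule class, the dictionary-based match, and the convention that a larger priority value is evaluated first are assumptions made for illustration; they are not taken from the proposed virtio_net_rff_add_modify structure.

```python
from dataclasses import dataclass


@dataclass
class Rule:
    name: str
    match: dict    # field name -> required value; absent fields are wildcards
    priority: int  # assumption: a larger value is evaluated first
    rxq: int


def steer(rules, packet):
    """Return (rule name, rxq) of the highest-priority matching rule, or None."""
    for rule in sorted(rules, key=lambda r: r.priority, reverse=True):
        if all(packet.get(k) == v for k, v in rule.match.items()):
            return rule.name, rule.rxq
    return None


rule1 = Rule("rule1", {"proto": "tcpv4", "src-ip": "1.1.1.1"}, priority=1, rxq=1)
rule2 = Rule("rule2", {"proto": "tcpv4", "src-ip": "1.1.1.1", "dst-port": 8989},
             priority=2, rxq=2)
rules = [rule1, rule2]

# src-ip 1.1.1.1 and dst-port 8989: both rules match, rule2 wins on priority.
pkt_a = {"proto": "tcpv4", "src-ip": "1.1.1.1", "dst-port": 8989}
# src-ip 1.1.1.1 and another port: only rule1 matches.
pkt_b = {"proto": "tcpv4", "src-ip": "1.1.1.1", "dst-port": 80}

print(steer(rules, pkt_a))
print(steer(rules, pkt_b))
```

If the two rules had the same priority, the result for pkt_a would depend on insertion order in this sketch, which is why the driver needs to set the priority explicitly.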
> >
> > > The rules of set_rxnfc come from manual configuration, the number of
> > > these rules is small and we may not need group grouping for this.
> > > And ctrlq can meet the configuration rate,
> > >
> > Yes, but having single interface for two use cases enables the device
> implementation to not build driver interface specific infra.
> > Both can be handled by unified interface.
>
> I agree:) Is ctrlq an option when the num_flow_filter_vqs exposed by the device
> is 0?
>
I think ctrlq in the above scenario offers a quick start to the feature by making flowvq optional.
It comes with the tradeoff of performance and a dual code implementation.
That is fine; the problem occurs when flowvq is also supported and created: if the driver issues commands on both cvq and flowvq, synchronizing them on the device is a nightmare.
So we can draft it as:
If flowvq is created, RFF must be done only on flowvq by the driver.
If flowvq is supported, but not created, cvq can be used.
In that case it is flexible enough for the device to implement with a reasonable trade-off.
> >
> > > 2. For ARFS, which only has the exact match.
> > > For ARFS, since there is only one matching rule for a certain flow,
> > > so there is no need for group?
> > Groups are defining the priority between two types of rules.
> > Within ARFS domain we don't need group.
>
> Yes, ARFS doesn't need group.
>
> >
> > However instead of starting with only two limiting groups, it is better to have
> some flexibility for supporting multiple groups.
> > A device can device one/two or more groups.
> > So in future if a use case arise, interface wont be limiting to it.
>
> Ok. This works.
>
> >
> > > We may need different types of tables, such as UDPv4 flow table,
> > > TCPv4 flow table to speed up the lookup for differect flow types.
> > > Besides, the high rate and large number of configuration rules means
> > > that we need flow vq.
> > >
> > Yes, I am not sure if those tables should be exposed to the driver.
> > Thinking that a device may be able to decide on table count which it may be
> able to create.
>
> o how to store and what method to use to store rules is determined by the
> device, just like my question above. If yes, I think this is a good way to work,
> because it allows for increased flexibility in device implementation.
>
Right, we can possibly avoid the concept of a table in the spec.
So far I see the objects below:
1. group with priority (priority applies among groups)
2. flow entries with priority (priority applies to entries within the group)
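To make these two objects concrete, here is a minimal Python sketch of a two-level lookup: groups are visited in priority order, and within each group the flow entries are visited in priority order. The class names, the dictionary-based match, the group names, and the convention that a larger number means higher priority are all assumptions for illustration; none of this is spec text.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class FlowEntry:
    match: dict                # field name -> required value
    priority: int              # orders entries within one group
    action: str                # "forward" or "drop"
    rxq: Optional[int] = None  # destination receive vq index when forwarding


@dataclass
class Group:
    name: str
    priority: int              # orders groups against each other
    entries: list = field(default_factory=list)


def lookup(groups, packet):
    """Return (group name, action, rxq) of the first match, or None."""
    for group in sorted(groups, key=lambda g: g.priority, reverse=True):
        for entry in sorted(group.entries, key=lambda e: e.priority, reverse=True):
            if all(packet.get(k) == v for k, v in entry.match.items()):
                return group.name, entry.action, entry.rxq
    return None  # no flow filter matched: fall through to rss/hash config


# Manually configured rules sit in a higher-priority group than ARFS rules.
rxnfc = Group("set_rxnfc", priority=2, entries=[
    FlowEntry({"src-ip": "1.1.1.1"}, priority=1, action="forward", rxq=1),
])
arfs = Group("arfs", priority=1, entries=[
    FlowEntry({"src-ip": "1.1.1.1", "dst-port": 8989}, 1, "forward", 5),
    FlowEntry({"src-ip": "2.2.2.2", "dst-port": 443}, 1, "forward", 7),
])
groups = [arfs, rxnfc]

print(lookup(groups, {"src-ip": "1.1.1.1", "dst-port": 8989}))
print(lookup(groups, {"src-ip": "2.2.2.2", "dst-port": 443}))
print(lookup(groups, {"src-ip": "3.3.3.3", "dst-port": 22}))
```

The sketch deliberately says nothing about how a device stores or searches the entries; only the ordering visible to the driver is modeled.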
> >
> > > Therefore, although set_rxnfc and ARFS share a set of
> > > infrastructure, there are still some differences, such as
> > > configuration rate and quantity. So do we need add two features
> > > (VIRTIO_NET_F_RXNFC and VIRTIO_NET_F_ARFS) for set_rxnfc and ARFS
> respectively, and ARFS can choose flow vq?
> > Not really, as one interface can fullfil both the needs without attaching it to a
> specific OS interface.
> >
>
> Ok!
>
> > > In this way, is it more conducive to advancing the work of RFF (such
> > > as accelerating the advancement of set_rxnfc)?
> > >
> > Both the use cases are equally immediately usable so we can advance it easily
> using single interface now.
> >
> > > > +
> > > > +### 3.4.1 control path
> > > > +1. The number of flow filter operations/sec can range from
> > > > +100k/sec to
> > > 1M/sec
> > > > + or even more. Hence flow filter operations must be done over a
> queueing
> > > > + interface using one or more queues.
> > >
> > > This is only for ARFS, for devices that only want to support
> > > set_rxnfc, they don't provide VIRTIO_NET_F_ARFS and consider
> implementing flow vq.
> > >
> > Well once the device implements flow vq, it will service both cases.
> > A simple device implementation who only case for RXNFC, can implement
> flowvq in semi-software serving very small number of req/sec.
> >
>
> When the device does not provide flow vq, whether the driver can use ctrlq to
> the device.
>
Please see above.
> > > > +2. The device should be able to expose one or more supported flow
> > > > +filter
> > > queue
> > > > + count and its start vq index to the driver.
> > > > +3. As each device may be operating for different performance
> characteristic,
> > > > + start vq index and count may be different for each device. Secondly, it is
> > > > + inefficient for device to provide flow filters capabilities via a config
> space
> > > > + region. Hence, the device should be able to share these attributes using
> > > > + dma interface, instead of transport registers.
> > > > +4. Since flow filters are enabled much later in the driver life cycle, driver
> > > > + will likely create these queues when flow filters are enabled.
> > >
> > > I understand that the number of flow vqs is not reflected in
> > > max_virtqueue_pairs. And a new vq is created at runtime, is this
> > > supported in the existing virtio spec?
> > >
> > We are extending the virtio-spec now if it is not supported.
> > But yes, it is supported because max_virtqueue_pairs will not expose the
> count of flow_vq.
> > (similar to how we did the AQ).
> > And flowvq anyway is not _pair_ so we cannot expose there anyway.
>
> Absolutely.
>
> >
> > > > +5. Flow filter operations are often accelerated by device in a hardware.
> > > Ability
> > > > + to handle them on a queue other than control vq is desired.
> > > > + This achieves
> > > near
> > > > + zero modifications to existing implementations to add new
> > > > + operations on
> > > new
> > > > + purpose built queues (similar to transmit and receive queue).
> > > > +6. The filter masks are optional; the device should be able to expose if it
> > > > + support filter masks.
> > > > +7. The driver may want to have priority among group of flow
> > > > +entries; to
> > > facilitate
> > > > + the device support grouping flow filter entries by a notion of a group.
> Each
> > > > + group defines priority in processing flow.
> > > > +8. The driver and group owner driver should be able to query
> > > > +supported
> > > device
> > > > + limits for the flow filter entries.
> > > > +
> > > > +### 3.4.2 flow operations path
> > > > +1. The driver should be able to define a receive packet match criteria, an
> > > > + action and a destination for a packet.
> > >
> > > When the user does not specify a destination when configuring a
> > > rule, do we need a default destination?
> > >
> > I think we should not give such option to driver.
> > A human/end user may not have the destination, but driver should be able to
> decide a predictable destination.
>
> Yes, that's what I mean:), and I said "we" for "the driver."
>
Ok. got it.
> >
> > > > For example, an ipv4 packet with a
> > > > + multicast address to be steered to the receive vq 0. The second
> example is
> > > > + ipv4, tcp packet matching a specified IP address and tcp port tuple to
> > > > + be steered to receive vq 10.
> > > > +2. The match criteria should include exact tuple fields
> > > > +well-defined such as
> > > mac
> > > > + address, IP addresses, tcp/udp ports, etc.
> > > > +3. The match criteria should also optionally include the field mask.
> > > > +4. The match criteria may optionally also include specific packet byte
> offset
> > > > + pattern, match length, mask instead of RFC defined fields.
> > > > + length, and matching pattern, which may not be defined in the
> > > > +standard
> > > RFC.
> > >
> > > Is there a description error here?
> > >
> > Didn't follow your comment. Do you mean there is an error in above
> description?
>
> I don't quite understand what "specific packet byte offset pattern" means :(
>
Time to make it verbose. :)
For any new/undefined protocol, the user may want to say:
In a packet at byte offset A, if you find pattern == 0x800, drop the packet.
In a packet at byte offset B, if you find pattern == 0x8100, forward to rq 10.
I didn't consider multiple matching patterns for now, though they would be very useful.
I am inclined to defer the _any_match option to later and, for now, do only well-defined matches.
WDYT?
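As an illustration of the two examples above, a byte-offset match could look like the following Python sketch. The function names, the optional mask argument, and the choice of offset 12 (where the EtherType sits in an untagged Ethernet frame) for both A and B are assumptions made for the example, not part of the proposal.

```python
from typing import Optional


def match_at_offset(packet: bytes, offset: int, pattern: bytes,
                    mask: Optional[bytes] = None) -> bool:
    """True if packet[offset:] starts with pattern, compared under mask if given."""
    if mask is not None and len(mask) != len(pattern):
        raise ValueError("mask and pattern must have the same length")
    window = packet[offset:offset + len(pattern)]
    if len(window) < len(pattern):
        return False  # packet too short to contain the pattern
    if mask is None:
        return window == pattern
    return all((w & m) == (p & m) for w, p, m in zip(window, pattern, mask))


# One pattern per rule, as in the examples above: (offset, pattern, action).
RULES = [
    (12, b"\x08\x00", ("drop", None)),   # offset A, pattern 0x800 -> drop
    (12, b"\x81\x00", ("forward", 10)),  # offset B, pattern 0x8100 -> rq 10
]


def classify(packet: bytes):
    """Return the action of the first matching rule, or None."""
    for offset, pattern, action in RULES:
        if match_at_offset(packet, offset, pattern):
            return action
    return None


ipv4_frame = bytes(12) + b"\x08\x00" + bytes(20)
vlan_frame = bytes(12) + b"\x81\x00" + bytes(20)
arp_frame = bytes(12) + b"\x08\x06" + bytes(20)

print(classify(ipv4_frame))
print(classify(vlan_frame))
print(classify(arp_frame))
```

Supporting several patterns per rule would mean carrying a list of (offset, pattern, mask) tuples per entry, which is the part proposed to be taken up later.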
> >
> > > > +5. Action includes (a) dropping or (b) forwarding the packet.
> > > > +6. Destination is a receive virtqueue index.
> > >
> > > Since the concept of RSS context does not yet exist in the virtio spec.
> > > Did we say that we also support carrying RSS context information
> > > when negotiating the RFF feature? For example, RSS context
> > > configuration commands and structures, etc.
> > >
> > > Or support RSS context functionality as a separate feature in another
> thread?
> > >
> > Support RSS context as separate feature.
>
> Ok, humbly asking if your work plan includes this part, do you need me to share
> your work, such as rss context.
>
Let's keep rss context as future work, as it's orthogonal to RFF.
Yes, your help with rss context would be good.
Let's first finish RFF, as it's at a more advanced stage.
> Thanks a lot!
>
> >
> > > A related point to consider is that when a user inserts a rule with
> > > an rss context, the RSS context cannot be deleted, otherwise the
> > > device will cause undefined behavior.
> > >
> > Yes, for now we can keep rss context as separate feature.
^ permalink raw reply [flat|nested] 73+ messages in thread
* [virtio-comment] Re: [PATCH requirements 5/7] net-features: Add n-tuple receive flow filters requirements
2023-08-04 6:20 ` [virtio-comment] " Parav Pandit
@ 2023-08-04 7:17 ` Heng Qi
2023-08-04 7:30 ` [virtio-comment] " Parav Pandit
0 siblings, 1 reply; 73+ messages in thread
From: Heng Qi @ 2023-08-04 7:17 UTC (permalink / raw)
To: Parav Pandit
Cc: Satananda Burla, virtio-comment@lists.oasis-open.org,
Shahaf Shuler, virtio@lists.oasis-open.org
On Fri, Aug 04, 2023 at 06:20:53AM +0000, Parav Pandit wrote:
>
> > From: Heng Qi <hengqi@linux.alibaba.com>
> > Sent: Thursday, August 3, 2023 6:37 PM
> >
> > On Thu, Aug 03, 2023 at 09:59:54AM +0000, Parav Pandit wrote:
> > >
> > > > From: Heng Qi <hengqi@linux.alibaba.com>
> > > > Sent: Wednesday, August 2, 2023 8:55 PM
> > >
> > > > Hi, Parav. Sorry for not responding to this in time due to other things
> > recently.
> > > >
> > > > Yes, RFF has two scenarios, set_rxnfc and ARFS, both of which will
> > > > affect the packet steering on the device side.
> > > > I think manually configured rules should have higher priority than
> > > > ARFS automatic configuration.
> > > > This behavior is intuitive and consistent with other drivers.
> > > > Therefore, the processing chain on a rx packet is:
> > > > {mac,vlan,promisc rx filters} -> {set_rxnfc} -> {ARFS} -> {rss/hash config}.
> > > >
> > > Correct.
> > > Within the RFF context, the priority among multiple RFF entries is governed by
> > the concept of group.
> > > So above two users of the RFF will create two groups and assign priority to it
> > and achieve the desired processing order.
> >
> > OK, we intend to use group as the concept of rule storage. Therefore, we
> > should have two priorities:
> > 1. one is the priority of the group, this field is not seen in the structure
> > virtio_net_rff_add_modify, or the group id implies the priority (for example, the
> > smaller the priority, the higher the priority?)?
> Good catch, yes, we need priority assignment to the group.
> Hence, we need group add/delete command as well.
Yes, then group priority can be in group add command.
>
> > 2. the other is the priority of the rule, the current structure
> > virtio_net_rff_add_modify is still missing this.
> >
> Adding it.
Ok.
> > I think we should add some more texts in the next version describing how
> > matching rules are prioritized and how groups work. This is important for RFF.
> >
> > I also want to confirm that for the interaction between the driver and the
> > device, the driver only needs to tell the priority of the device group and the
> > priority of the rule, and we should not reflect how the device stores and
> > queries rules (such as tcam or some acl acceleration solutions) ?
> >
> Correct, we try to keep thing as abstract as possible.
Good! The implementation of the priority of the rules and groups can be reserved in
the driver implementation.
> > >
> > > > There are also priorities within set_rxnfc and ARFS respectively.
> > > > 1. For set_rxnfc, which has the exact match and the mask match.
> > > > Exact matches should have higher priority.
> > > > Suppose there are two rules,
> > > > rule1: {"tcpv4", "src-ip: 1.1.1.1"} -> rxq1
> > > > rule2: {"tcpv4", "src-ip: 1.1.1.1", "dst-port: 8989"} -> rxq2 .For
> > > > recieved rx packets whose src-ip is 1.1.1.1, should match rule2 instead of
> > rule1.
> > > >
> > > Yes. Driver should be able to set the priority within the group as well for above
> > scenario.
> >
> > But here I am wrong, it should be:
> > rx packets whose src-ip is 1.1.1.1 and dst-port is 8989, should match rule2
> > instead of rule1.
> >
> That is what you wrote, here both the rules are within one group.
> And rule2 should have higher priority than rule1.
>
Yes.
> > >
> > > > The rules of set_rxnfc come from manual configuration, the number of
> > > > these rules is small and we may not need group grouping for this.
> > > > And ctrlq can meet the configuration rate,
> > > >
> > > Yes, but having single interface for two use cases enables the device
> > implementation to not build driver interface specific infra.
> > > Both can be handled by unified interface.
> >
> > I agree:) Is ctrlq an option when the num_flow_filter_vqs exposed by the device
> > is 0?
> >
> I think ctrlq would in above scenario offers, a quick start to the feature by making flowvq optional.
> It comes with the tradeoff of perf, and dual code implementation.
> Which is fine, the problem occurs is when flowvq is also supported, and flowvq is created, if driver issues command on cvq and flowvq both, synchronizing these on the device is nightmare.
>
> So if we can draft it as:
> If flowvq is created, RFF must be done only on flowvq by the driver.
> If flowvq is supported, but not created, cvq can be used.
>
> In that case it is flexible enough for device to implement with reasonable trade off.
Yes, but a small update:
If flowvq is created, RFF must be done only on flowvq by the driver.
If flowvq is supported, but not created, cvq can be used.
If flowvq is not supported, cvq is used.
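The three cases can be written as a small decision function. This is only a sketch of the rule as drafted in this thread; the function and enum names are hypothetical, and how "supported" and "created" are discovered by the driver is left out.

```python
from enum import Enum


class Queue(Enum):
    FLOWVQ = "flowvq"
    CVQ = "cvq"


def rff_command_queue(flowvq_supported: bool, flowvq_created: bool) -> Queue:
    """Pick the queue the driver uses for RFF commands."""
    if flowvq_created and not flowvq_supported:
        raise ValueError("flowvq cannot be created when it is not supported")
    if flowvq_created:
        # Once a flowvq exists, RFF must be done only on flowvq.
        return Queue.FLOWVQ
    # Supported but not created, or not supported at all: use cvq.
    return Queue.CVQ


print(rff_command_queue(flowvq_supported=True, flowvq_created=True))
print(rff_command_queue(flowvq_supported=True, flowvq_created=False))
print(rff_command_queue(flowvq_supported=False, flowvq_created=False))
```

Because the choice depends only on whether a flowvq has been created, the device never has to order commands arriving on both queues at once.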
>
> > >
> > > > 2. For ARFS, which only has the exact match.
> > > > For ARFS, since there is only one matching rule for a certain flow,
> > > > so there is no need for group?
> > > Groups are defining the priority between two types of rules.
> > > Within ARFS domain we don't need group.
> >
> > Yes, ARFS doesn't need group.
> >
> > >
> > > However, instead of starting with only two limiting groups, it is better to have some flexibility for supporting multiple groups.
> > > A device can define one, two or more groups.
> > > So if a use case arises in the future, the interface won't limit it.
> >
> > Ok. This works.
> >
> > >
> > > > We may need different types of tables, such as UDPv4 flow table,
> > > > TCPv4 flow table to speed up the lookup for different flow types.
> > > > Besides, the high rate and large number of configuration rules means
> > > > that we need flow vq.
> > > >
> > > Yes, I am not sure if those tables should be exposed to the driver.
> > > I am thinking that a device may be able to decide how many tables it can create.
> >
> > So how to store rules, and which method to use to store them, is determined by the
> > device, just like my question above? If yes, I think this is a good way to work,
> > because it allows for increased flexibility in device implementation.
> >
> Right, we can possibly avoid concept of table in spec.
> So far I see below objects:
>
> 1. group with priority (priority applies among the group)
> 2. flow entries with priority (priority applies to entries within the group)
Yes!
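
A rough sketch in C of how the two objects could interact on lookup: the group priority is compared first, then the entry priority within the group. The struct layout, the field names, and the convention that a larger number means higher priority are all assumptions made for illustration; the thread has not defined them.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical rule layout; none of these names come from the patch. */
struct ff_rule {
	uint8_t  group_prio;  /* priority of the group this rule belongs to */
	uint8_t  rule_prio;   /* priority of the rule within its group */
	uint32_t src_ip;      /* always matched in this sketch */
	bool     match_dport; /* false: destination port is a wildcard */
	uint16_t dst_port;
	uint16_t rxq;         /* destination receive vq index */
};

static bool ff_rule_hits(const struct ff_rule *r, uint32_t src_ip,
			 uint16_t dst_port)
{
	if (r->src_ip != src_ip)
		return false;
	return !r->match_dport || r->dst_port == dst_port;
}

/*
 * Return the rxq of the winning rule, or -1 if no rule matches.
 * Assumption: a numerically larger value means higher priority.
 */
static int ff_lookup(const struct ff_rule *rules, size_t n, uint32_t src_ip,
		     uint16_t dst_port)
{
	const struct ff_rule *best = NULL;

	assert(rules != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		const struct ff_rule *r = &rules[i];

		if (!ff_rule_hits(r, src_ip, dst_port))
			continue;
		if (best == NULL || r->group_prio > best->group_prio ||
		    (r->group_prio == best->group_prio &&
		     r->rule_prio > best->rule_prio))
			best = r;
	}
	return best ? best->rxq : -1;
}
```

With rule1 (src-ip 1.1.1.1 -> rxq1) and rule2 (src-ip 1.1.1.1, dst-port 8989 -> rxq2) in the same group and rule2 given the higher entry priority, a packet from 1.1.1.1 to port 8989 lands on rxq2, while other packets from 1.1.1.1 still land on rxq1.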
> > >
> > > > Therefore, although set_rxnfc and ARFS share a set of
> > > > infrastructure, there are still some differences, such as
> > > > configuration rate and quantity. So do we need add two features
> > > > (VIRTIO_NET_F_RXNFC and VIRTIO_NET_F_ARFS) for set_rxnfc and ARFS
> > respectively, and ARFS can choose flow vq?
> > > Not really, as one interface can fulfill both needs without attaching it to a specific OS interface.
> > >
> >
> > Ok!
> >
> > > > In this way, is it more conducive to advancing the work of RFF (such
> > > > as accelerating the advancement of set_rxnfc)?
> > > >
> > > Both use cases are equally and immediately usable, so we can advance them easily using a single interface now.
> > >
> > > > > +
> > > > > +### 3.4.1 control path
> > > > > +1. The number of flow filter operations/sec can range from
> > > > > +100k/sec to
> > > > 1M/sec
> > > > > + or even more. Hence flow filter operations must be done over a
> > queueing
> > > > > + interface using one or more queues.
> > > >
> > > > This is only for ARFS, for devices that only want to support
> > > > set_rxnfc, they don't provide VIRTIO_NET_F_ARFS and consider
> > implementing flow vq.
> > > >
> > > Well once the device implements flow vq, it will service both cases.
> > > A simple device implementation that only cares about RXNFC can implement flowvq in semi-software, serving a very small number of req/sec.
> > >
> >
> > When the device does not provide flow vq, can the driver use ctrlq to
> > the device?
> >
> Please see above.
>
> > > > > +2. The device should be able to expose one or more supported flow
> > > > > +filter
> > > > queue
> > > > > + count and its start vq index to the driver.
> > > > > +3. As each device may be operating for different performance
> > characteristic,
> > > > > + start vq index and count may be different for each device. Secondly, it is
> > > > > + inefficient for device to provide flow filters capabilities via a config
> > space
> > > > > + region. Hence, the device should be able to share these attributes using
> > > > > + dma interface, instead of transport registers.
> > > > > +4. Since flow filters are enabled much later in the driver life cycle, driver
> > > > > + will likely create these queues when flow filters are enabled.
> > > >
> > > > I understand that the number of flow vqs is not reflected in
> > > > max_virtqueue_pairs. And a new vq is created at runtime, is this
> > > > supported in the existing virtio spec?
> > > >
> > > We are extending the virtio-spec now if it is not supported.
> > > But yes, it is supported because max_virtqueue_pairs will not expose the
> > count of flow_vq.
> > > (similar to how we did the AQ).
> > > And flowvq anyway is not _pair_ so we cannot expose there anyway.
> >
> > Absolutely.
> >
> > >
> > > > > +5. Flow filter operations are often accelerated by device in a hardware.
> > > > Ability
> > > > > + to handle them on a queue other than control vq is desired.
> > > > > + This achieves
> > > > near
> > > > > + zero modifications to existing implementations to add new
> > > > > + operations on
> > > > new
> > > > > + purpose built queues (similar to transmit and receive queue).
> > > > > +6. The filter masks are optional; the device should be able to expose if it
> > > > > + support filter masks.
> > > > > +7. The driver may want to have priority among group of flow
> > > > > +entries; to
> > > > facilitate
> > > > > + the device support grouping flow filter entries by a notion of a group.
> > Each
> > > > > + group defines priority in processing flow.
> > > > > +8. The driver and group owner driver should be able to query
> > > > > +supported
> > > > device
> > > > > + limits for the flow filter entries.
> > > > > +
> > > > > +### 3.4.2 flow operations path
> > > > > +1. The driver should be able to define a receive packet match criteria, an
> > > > > + action and a destination for a packet.
> > > >
> > > > When the user does not specify a destination when configuring a
> > > > rule, do we need a default destination?
> > > >
> > > I think we should not give such option to driver.
> > > A human/end user may not have the destination, but driver should be able to
> > decide a predictable destination.
> >
> > Yes, that's what I mean:), and I said "we" for "the driver."
> >
> Ok. got it.
>
> > >
> > > > > For example, an ipv4 packet with a
> > > > > + multicast address to be steered to the receive vq 0. The second
> > example is
> > > > > + ipv4, tcp packet matching a specified IP address and tcp port tuple to
> > > > > + be steered to receive vq 10.
> > > > > +2. The match criteria should include exact tuple fields
> > > > > +well-defined such as
> > > > mac
> > > > > + address, IP addresses, tcp/udp ports, etc.
> > > > > +3. The match criteria should also optionally include the field mask.
> > > > > +4. The match criteria may optionally also include specific packet byte
> > offset
> > > > > + pattern, match length, mask instead of RFC defined fields.
> > > > > + length, and matching pattern, which may not be defined in the
> > > > > +standard
> > > > RFC.
> > > >
> > > > Is there a description error here?
> > > >
> > > Didn't follow your comment. Do you mean there is an error in above
> > description?
> >
> > I don't quite understand what "specific packet byte offset pattern" means :(
> >
> Time to make it verbose. :)
Oh, I got it, I think we should let people see more details in the next version.:)
> For any new/undefined protocol, if user wants to say,
> In a packet at byte offset A, if you find pattern == 0x800, drop the packet.
> In a packet at byte offset B, if you find pattern == 0x8100, forward to rq 10.
>
> I didn't consider multiple matching patterns for now, though it is very useful.
> I am inclined to keep the option of _any_match to take up later, for now to do only well defined match?
> WDYT?
I think it's ok, but we don't seem to need the length here. But in any
case, I don't think it's that important to have or not ^^
>
>
> > >
> > > > > +5. Action includes (a) dropping or (b) forwarding the packet.
> > > > > +6. Destination is a receive virtqueue index.
> > > >
> > > > Since the concept of RSS context does not yet exist in the virtio spec.
> > > > Did we say that we also support carrying RSS context information
> > > > when negotiating the RFF feature? For example, RSS context
> > > > configuration commands and structures, etc.
> > > >
> > > > Or support RSS context functionality as a separate feature in another
> > thread?
> > > >
> > > Support RSS context as separate feature.
> >
> > Ok, humbly asking if your work plan includes this part, do you need me to share
> > your work, such as rss context.
> >
> Let's keep RSS context as future work, since it is orthogonal to this.
> Yes, your help with RSS context will be good.
> Let's first finish RFF, as it is at a somewhat advanced stage.
Yes! But I worry that a spec that references a concept that doesn't
exist in the existing spec may be blocked, so if you have no objections,
I will push this work forward in the near future so that RFF is not
blocked by it.
Thanks.
>
> > Thanks a lot!
> >
> > >
> > > > A related point to consider is that when a user inserts a rule with
> > > > an rss context, the RSS context cannot be deleted, otherwise the
> > > > device will cause undefined behavior.
> > > >
> > > Yes, for now we can keep rss context as separate feature.
This publicly archived list offers a means to provide input to the
OASIS Virtual I/O Device (VIRTIO) TC.
In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.
Subscribe: virtio-comment-subscribe@lists.oasis-open.org
Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
List help: virtio-comment-help@lists.oasis-open.org
List archive: https://lists.oasis-open.org/archives/virtio-comment/
Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
Committee: https://www.oasis-open.org/committees/virtio/
Join OASIS: https://www.oasis-open.org/join/
^ permalink raw reply [flat|nested] 73+ messages in thread
* [virtio-comment] RE: [PATCH requirements 5/7] net-features: Add n-tuple receive flow filters requirements
2023-08-04 7:17 ` [virtio-comment] " Heng Qi
@ 2023-08-04 7:30 ` Parav Pandit
2023-08-04 7:51 ` [virtio-comment] Re: [virtio] " Heng Qi
0 siblings, 1 reply; 73+ messages in thread
From: Parav Pandit @ 2023-08-04 7:30 UTC (permalink / raw)
To: Heng Qi
Cc: Satananda Burla, virtio-comment@lists.oasis-open.org,
Shahaf Shuler, virtio@lists.oasis-open.org
> From: Heng Qi <hengqi@linux.alibaba.com>
> Sent: Friday, August 4, 2023 12:47 PM
> Yes, but a small update:
> If flowvq is created, RFF must be done only on flowvq by the driver.
> If flowvq is supported, but not created, cvq can be used.
> If flowvq is not supported, cvq is used.
>
Looks fine. I want to think a little more about the last point to make sure we are not missing something.
I will respond on it by Monday.
[..]
> Oh, I got it, I think we should let people see more details in the next version.:)
>
> > For any new/undefined protocol, if user wants to say, In a packet at
> > byte offset A, if you find pattern == 0x800, drop the packet.
> > In a packet at byte offset B, if you find pattern == 0x8100, forward to rq 10.
> >
> > I didn't consider multiple matching patterns for now, though it is very useful.
> > I am inclined to keep the option of _any_match to take up later, for now to do
> only well defined match?
> > WDYT?
>
> I think it's ok, but we don't seem to need the length here. But in any case, I
> don't think it's that important to have or not ^^
>
The length will be needed to indicate how many bytes to match.
Let's add this rule incrementally after we get the baseline done for the known RFC-defined fields.
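
To make the role of the length concrete, here is a small sketch in C of a raw byte-offset match. The struct, its 8-byte pattern limit, and the function name are hypothetical; the thread has not defined any such layout. The length tells the matcher how many bytes at the offset to compare, and the mask selects which bits take part.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical raw-match descriptor, not taken from the patch. */
struct raw_match {
	uint16_t offset;      /* byte offset into the packet */
	uint8_t  len;         /* number of bytes to compare, at most 8 */
	uint8_t  pattern[8];  /* expected bytes, in packet order */
	uint8_t  mask[8];     /* set bits are compared, clear bits ignored */
};

static bool raw_match_hits(const struct raw_match *m, const uint8_t *pkt,
			   size_t pkt_len)
{
	assert(m != NULL && pkt != NULL);

	/* Reject descriptors that are too long or run past the packet. */
	if (m->len > sizeof(m->pattern) ||
	    (size_t)m->offset + m->len > pkt_len)
		return false;

	for (size_t i = 0; i < m->len; i++) {
		if ((pkt[m->offset + i] & m->mask[i]) !=
		    (m->pattern[i] & m->mask[i]))
			return false;
	}
	return true;
}
```

For instance, the "pattern == 0x8100" case above would be offset 12, length 2, pattern {0x81, 0x00} and mask {0xff, 0xff} on an untagged Ethernet frame; without the length, the device could not tell a 2-byte pattern from a longer one.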
> > Lets keep rss context in future work as its orthogonal to it.
> > Yes, your help for rss context will be good.
> > Lets first finish RFF as its bit in the advance stage.
>
> Yes! But I worry that a spec that references a concept that doesn't exist in the
> existing spec may be blocked, so if you have no objections, I will push this work
> forward in the near future to help RFF possible blocking.
I was probably not clear enough. I propose that we remove the RSS context from RFF for now.
Once the RSS context is done, we can enhance RFF at that point.
^ permalink raw reply [flat|nested] 73+ messages in thread
* [virtio-comment] RE: [EXT] [virtio] [PATCH requirements 5/7] net-features: Add n-tuple receive flow filters requirements
2023-08-02 18:32 ` Satananda Burla
@ 2023-08-04 7:32 ` Parav Pandit
0 siblings, 0 replies; 73+ messages in thread
From: Parav Pandit @ 2023-08-04 7:32 UTC (permalink / raw)
To: Satananda Burla, virtio-comment@lists.oasis-open.org
Cc: Shahaf Shuler, hengqi@linux.alibaba.com,
virtio@lists.oasis-open.org
> From: Satananda Burla <sburla@marvell.com>
> Sent: Thursday, August 3, 2023 12:02 AM
> Ok. I was proposing that everybody agrees to use index value in 0-n per group. I
> am fine with the size limitation described below.
Yes, n is configurable and its value is picked by the driver, so things will work with the LM flow.
I missed replying yesterday...
^ permalink raw reply [flat|nested] 73+ messages in thread
* [virtio-comment] Re: [virtio] RE: [PATCH requirements 5/7] net-features: Add n-tuple receive flow filters requirements
2023-08-04 7:30 ` [virtio-comment] " Parav Pandit
@ 2023-08-04 7:51 ` Heng Qi
2023-08-07 7:22 ` Heng Qi
0 siblings, 1 reply; 73+ messages in thread
From: Heng Qi @ 2023-08-04 7:51 UTC (permalink / raw)
To: Parav Pandit
Cc: Satananda Burla, virtio-comment@lists.oasis-open.org,
Shahaf Shuler, virtio@lists.oasis-open.org
On 2023/8/4 at 3:30 PM, Parav Pandit wrote:
>> From: Heng Qi <hengqi@linux.alibaba.com>
>> Sent: Friday, August 4, 2023 12:47 PM
>> Yes, but a small update:
>> If flowvq is created, RFF must be done only on flowvq by the driver.
>> If flowvq is supported, but not created, cvq can be used.
>> If flowvq is not supported, cvq is used.
>>
> Looks fine, I want to think little more on the last point to make sure we are not missing something.
> Will respond by Monday on it.
Ok.
> [..]
>
>> Oh, I got it, I think we should let people see more details in the next version.:)
>>
>>> For any new/undefined protocol, if user wants to say, In a packet at
>>> byte offset A, if you find pattern == 0x800, drop the packet.
>>> In a packet at byte offset B, if you find pattern == 0x8100, forward to rq 10.
>>>
>>> I didn't consider multiple matching patterns for now, though it is very useful.
>>> I am inclined to keep the option of _any_match to take up later, for now to do
>> only well defined match?
>>> WDYT?
>> I think it's ok, but we don't seem to need the length here. But in any case, I
>> don't think it's that important to have or not ^^
>>
> Length will be needed to indicate how many bytes to match to.
> Lets do this rule incrementally after we get the base line done for known RFC defined fields.
>
>>> Lets keep rss context in future work as its orthogonal to it.
>>> Yes, your help for rss context will be good.
>>> Lets first finish RFF as its bit in the advance stage.
>> Yes! But I worry that a spec that references a concept that doesn't exist in the
>> existing spec may be blocked, so if you have no objections, I will push this work
>> forward in the near future to help RFF possible blocking.
> I was probably not clear enough, I propose that lets remove the RSS context for now in the RFF.
> Once RSS context is done, at that point in future to enhance RFF.
I need some time to see if the rss context has an effect on our
scenario, so please hold it for now. I'll sync it up as soon as possible.
Thanks!
>
^ permalink raw reply [flat|nested] 73+ messages in thread
* Re: [virtio-comment] Re: [virtio] RE: [PATCH requirements 5/7] net-features: Add n-tuple receive flow filters requirements
2023-08-04 7:51 ` [virtio-comment] Re: [virtio] " Heng Qi
@ 2023-08-07 7:22 ` Heng Qi
2023-08-08 7:13 ` Parav Pandit
0 siblings, 1 reply; 73+ messages in thread
From: Heng Qi @ 2023-08-07 7:22 UTC (permalink / raw)
To: Parav Pandit
Cc: Satananda Burla, virtio-comment@lists.oasis-open.org,
Shahaf Shuler, virtio@lists.oasis-open.org
On Fri, Aug 04, 2023 at 03:51:16PM +0800, Heng Qi wrote:
>
>
> On 2023/8/4 at 3:30 PM, Parav Pandit wrote:
> >>From: Heng Qi <hengqi@linux.alibaba.com>
> >>Sent: Friday, August 4, 2023 12:47 PM
> >>Yes, but a small update:
> >>If flowvq is created, RFF must be done only on flowvq by the driver.
> >>If flowvq is supported, but not created, cvq can be used.
> >>If flowvq is not supported, cvq is used.
> >>
> >Looks fine, I want to think little more on the last point to make sure we are not missing something.
> >Will respond by Monday on it.
>
> Ok.
>
> >[..]
> >
> >>Oh, I got it, I think we should let people see more details in the next version.:)
> >>
> >>>For any new/undefined protocol, if user wants to say, In a packet at
> >>>byte offset A, if you find pattern == 0x800, drop the packet.
> >>>In a packet at byte offset B, if you find pattern == 0x8100, forward to rq 10.
> >>>
> >>>I didn't consider multiple matching patterns for now, though it is very useful.
> >>>I am inclined to keep the option of _any_match to take up later, for now to do
> >>only well defined match?
> >>>WDYT?
> >>I think it's ok, but we don't seem to need the length here. But in any case, I
> >>don't think it's that important to have or not ^^
> >>
> >Length will be needed to indicate how many bytes to match to.
> >Lets do this rule incrementally after we get the base line done for known RFC defined fields.
> >
> >>>Lets keep rss context in future work as its orthogonal to it.
> >>>Yes, your help for rss context will be good.
> >>>Lets first finish RFF as its bit in the advance stage.
> >>Yes! But I worry that a spec that references a concept that doesn't exist in the
> >>existing spec may be blocked, so if you have no objections, I will push this work
> >>forward in the near future to help RFF possible blocking.
> >I was probably not clear enough, I propose that lets remove the RSS context for now in the RFF.
> >Once RSS context is done, at that point in future to enhance RFF.
Hi, Parav.
We need the RSS context, which can be combined with 'ethtool -X .. equal'
to achieve the purpose of traffic isolation, so please keep the
RSS context. To allay your concerns that an RSS context might be
blocking the work of n-tuples, I'll be issuing a spec this week (or next
week) for virtio support for the RSS context. For n-tuple RFF context,
we can support it after RFF, as a point to enhance RFF.
> >[..]
> +9. The filter rule add/delete entries are usually short in size of few tens of
> + bytes, for example IPv6 + TCP tuple would be 36 bytes, and ops/sec rate is
> + high, hence supplying fields inside the queue descriptor is preferred for
> + up to a certain fixed size, say 56 bytes.
'56B' does not seem to be enough. For example,
src mac + dst mac + src-ip6 + dst-ip6 + src-port + dst-port + user-define
(packet byte offset pattern, length and mask) + flow id + rule priority + destination + action =
6 + 6 + 16 + 16 + 2 + 2 + 8 + 4 + 1 + 2 + 1 = 64 B,
we also have structure alignment and quintuple mask, etc.
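
The arithmetic above can be checked with a short C sketch. The constant names are made up for illustration; the sizes are the ones listed in this mail, and the 56-byte figure is the inline limit suggested in the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Field sizes in bytes, in the order listed above (names are hypothetical). */
enum {
	SZ_SRC_MAC = 6, SZ_DST_MAC = 6,
	SZ_SRC_IP6 = 16, SZ_DST_IP6 = 16,
	SZ_SRC_PORT = 2, SZ_DST_PORT = 2,
	SZ_USER_DEFINED = 8, /* byte offset pattern, length and mask */
	SZ_FLOW_ID = 4, SZ_RULE_PRIO = 1,
	SZ_DESTINATION = 2, SZ_ACTION = 1,
};

enum {
	RFF_RULE_BYTES = SZ_SRC_MAC + SZ_DST_MAC + SZ_SRC_IP6 + SZ_DST_IP6 +
			 SZ_SRC_PORT + SZ_DST_PORT + SZ_USER_DEFINED +
			 SZ_FLOW_ID + SZ_RULE_PRIO + SZ_DESTINATION + SZ_ACTION,
	RFF_INLINE_LIMIT = 56, /* fixed size suggested in the patch */
};

/* Compile-time checks: the sum is 64 bytes and exceeds the 56-byte limit. */
static_assert(RFF_RULE_BYTES == 64, "field sizes should add up to 64 bytes");
static_assert(RFF_RULE_BYTES > RFF_INLINE_LIMIT, "rule does not fit inline");

static size_t rff_rule_bytes(void)
{
	return RFF_RULE_BYTES;
}
```

This counts only the raw fields; padding for alignment and any masks would push the total higher, as noted above.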
> [..]
> +5. The device should be able to expose if it support filter masks.
> [..]
> +7. Flow filter capabilities to query using a DMA interface:
> +
> +```
> +struct flow_filter_capabilities {
> + u8 flow_groups;
> + u16 num_flow_filter_vqs;
> + u16 start_vq_index;
> + u32 max_flow_filters_per_group;
> + u32 max_flow_filters;
> + u64 supported_packet_field_mask_bmap[4];
I think the function here is that after the user sends the mask, the
driver should do an 'AND' operation with the mask supported by the device first?
Meanwhile, is 32B enough:) :
src-port + dst-port + src mac + dst mac + src-ip6 + dst-ip6 =
2 + 2 + 6 + 6 + 16 + 16 = 48B
Thanks!
> +};
> +
> +```
^ permalink raw reply [flat|nested] 73+ messages in thread
* RE: [virtio-comment] Re: [virtio] RE: [PATCH requirements 5/7] net-features: Add n-tuple receive flow filters requirements
2023-08-07 7:22 ` Heng Qi
@ 2023-08-08 7:13 ` Parav Pandit
2023-08-08 8:18 ` [virtio-comment] Re: [virtio] " Heng Qi
0 siblings, 1 reply; 73+ messages in thread
From: Parav Pandit @ 2023-08-08 7:13 UTC (permalink / raw)
To: Heng Qi
Cc: Satananda Burla, virtio-comment@lists.oasis-open.org,
Shahaf Shuler, virtio@lists.oasis-open.org
> From: Heng Qi <hengqi@linux.alibaba.com>
> Sent: Monday, August 7, 2023 12:53 PM
> > >Once RSS context is done, at that point in future to enhance RFF.
> Hi, Parav.
>
> We need the RSS context, which can be combined with 'ethtool -X .. equal'
> to achieve the purpose of traffic isolation, so please keep the RSS context. To
> allay your concerns that an RSS context might be blocking the work of n-tuples,
> I'll be issuing a spec this week (or next
> week) for virtio support for the RSS context. For n-tuple RFF context, we can
> support it after RFF, as a point to enhance RFF.
>
Ok. sounds good.
> > >[..]
>
> > +9. The filter rule add/delete entries are usually short in size of few tens of
> > + bytes, for example IPv6 + TCP tuple would be 36 bytes, and ops/sec rate is
> > + high, hence supplying fields inside the queue descriptor is preferred for
> > + up to a certain fixed size, say 56 bytes.
>
> '56B' does not seem to be enough. For example, src mac + dst mac + src-ip6 +
> dst-ip6 + src-port + dst-port + user-define (packet byte offset pattern, length
> and mask) + flow id + rule priority + destination + action =
> 6 + 6 + 16 + 16 + 2 + 2 + 8 + 4 + 1 + 2 + 1 = 64 B, we also have structure alignment
> and quintuple mask, etc.
>
Yes, and the group id needs to be added on top of the 64B.
A practical finite limit would be fine; I took the 56B example based on the ipv4 case.
I think 96B as the upper limit looks reasonable without considering the mask.
ARFS-like use cases are usually exact match, hence the mask is optional and doesn't need to be inline.
> > [..]
> > +5. The device should be able to expose if it support filter masks.
> > [..]
> > +7. Flow filter capabilities to query using a DMA interface:
> > +
> > +```
> > +struct flow_filter_capabilities {
> > + u8 flow_groups;
> > + u16 num_flow_filter_vqs;
> > + u16 start_vq_index;
> > + u32 max_flow_filters_per_group;
> > + u32 max_flow_filters;
> > + u64 supported_packet_field_mask_bmap[4];
>
> I think the function here is that after the user sends the mask, the driver should
> do an 'AND' operation with the mask supported by the device first?
> Meanwhile, is 32B enough:) :
> src-port + dst-port + src mac + dst mac + src-ip6 + dst-ip6 =
> 2 + 2 + 6 + 6 + 16 + 16 = 48B
>
Oh, I didn't document it well.
The field supported_packet_field_mask_bmap is a bitmap of well-defined fields, one bit for each field.
Such as,
Bit_0 = dst_mac
Bit_1 = src_mac
Bit_2 = eth_type
Bit_3 = vlan_tag
Bit_4 = dst_ip
And so on.
So yes, the actual content like src-port does not need to be masked, but
when the filter rule arrives from the ARFS or FC side, the metadata info coming in ethtool_rx_flow_spec is to be masked.
It would make more sense to have a dedicated command for the bitmap for two reasons.
1. It doesn't get sandwiched in the future when new fields are added after the bitmap.
2. When bitmap needs to extend, it is not fragmented at two or more places.
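
A minimal sketch in C of the one-bit-per-field idea, using the bit positions listed above. The enum and helper names are hypothetical, and a single 64-bit word stands in for the bitmap here as an assumption; the patch declares it as u64[4].

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit positions as listed above; the identifiers are hypothetical. */
enum ff_field_bit {
	FF_FIELD_DST_MAC  = 0,
	FF_FIELD_SRC_MAC  = 1,
	FF_FIELD_ETH_TYPE = 2,
	FF_FIELD_VLAN_TAG = 3,
	FF_FIELD_DST_IP   = 4,
};

#define FF_BIT(f) (UINT64_C(1) << (f))

static_assert(FF_FIELD_DST_IP < 64, "field bit must fit in one 64-bit word");

/*
 * True if every field the driver wants to mask is advertised by the device.
 * dev_bmap: bitmap reported by the device; wanted: fields used by the rule.
 */
static bool ff_fields_supported(uint64_t dev_bmap, uint64_t wanted)
{
	return (wanted & ~dev_bmap) == 0;
}
```

A driver would run such a check before sending a masked rule, so the bitmap describes which fields may carry a mask and not the mask bytes themselves.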
^ permalink raw reply [flat|nested] 73+ messages in thread
* Re: [virtio-comment] [PATCH requirements 1/7] net-features: Add requirements document for release 1.4
2023-07-24 3:34 ` [virtio-comment] [PATCH requirements 1/7] net-features: Add requirements document for release 1.4 Parav Pandit
@ 2023-08-08 8:16 ` David Edmondson
2023-08-14 5:17 ` Parav Pandit
2023-08-14 11:56 ` David Edmondson
1 sibling, 1 reply; 73+ messages in thread
From: David Edmondson @ 2023-08-08 8:16 UTC (permalink / raw)
To: Parav Pandit; +Cc: shahafs, hengqi, virtio, virtio-comment
^ permalink raw reply [flat|nested] 73+ messages in thread
* [virtio-comment] Re: [virtio] RE: [virtio-comment] Re: [virtio] RE: [PATCH requirements 5/7] net-features: Add n-tuple receive flow filters requirements
2023-08-08 7:13 ` Parav Pandit
@ 2023-08-08 8:18 ` Heng Qi
0 siblings, 0 replies; 73+ messages in thread
From: Heng Qi @ 2023-08-08 8:18 UTC (permalink / raw)
To: Parav Pandit
Cc: Satananda Burla, virtio-comment@lists.oasis-open.org,
Shahaf Shuler, virtio@lists.oasis-open.org
On Tue, Aug 08, 2023 at 07:13:13AM +0000, Parav Pandit wrote:
>
>
> > From: Heng Qi <hengqi@linux.alibaba.com>
> > Sent: Monday, August 7, 2023 12:53 PM
>
> > > >Once RSS context is done, at that point in future to enhance RFF.
> > Hi, Parav.
> >
> > We need the RSS context, which can be combined with 'ethtool -X .. equal'
> > to achieve the purpose of traffic isolation, so please keep the RSS context. To
> > allay your concerns that an RSS context might be blocking the work of n-tuples,
> > I'll be issuing a spec this week (or next
> > week) for virtio support for the RSS context. For n-tuple RFF context, we can
> > support it after RFF, as a point to enhance RFF.
> >
> Ok. sounds good.
>
> > > >[..]
> >
> > > +9. The filter rule add/delete entries are usually short in size of few tens of
> > > + bytes, for example IPv6 + TCP tuple would be 36 bytes, and ops/sec rate is
> > > + high, hence supplying fields inside the queue descriptor is preferred for
> > > + up to a certain fixed size, say 56 bytes.
> >
> > '56B' does not seem to be enough. For example, src mac + dst mac + src-ip6 +
> > dst-ip6 + src-port + dst-port + user-define (packet byte offset pattern, length
> > and mask) + flow id + rule priority + destination + action =
> > 6 + 6 + 16 + 16 + 2 + 2 + 8 + 4 + 1 + 2 + 1 = 64 B, we also have structure alignment
> > and quintuple mask, etc.
> >
> Yes, 64B also need to add group id to it.
Yes, there are also fields such as vlan, I just did a rough
calculation:)
> A practical finite limit would be fine, I took 56B example based on ipv4 case.
> I think 96B at higher upper limit looks reasonable without considering mask.
96B is enough without mask.
> As ARFS kind of use cases usually are exact match, hence mask is optional that doest need to be inline.
Yes, but RFF will carry the mask, so we need to take the mask field into account when
deciding the maximum fixed length.
>
> > > [..]
> > > +5. The device should be able to expose if it support filter masks.
> > > [..]
> > > +7. Flow filter capabilities to query using a DMA interface:
> > > +
> > > +```
> > > +struct flow_filter_capabilities {
> > > + u8 flow_groups;
> > > + u16 num_flow_filter_vqs;
> > > + u16 start_vq_index;
> > > + u32 max_flow_filters_per_group;
> > > + u32 max_flow_filters;
> > > + u64 supported_packet_field_mask_bmap[4];
> >
> > I think the function here is that after the user sends the mask, the driver should
> > do an 'AND' operation with the mask supported by the device first?
> > Meanwhile, is 32B enough:) :
> > src-port + dst-port + src mac + dst mac + src-ip6 + dst-ip6 =
> > 2 + 2 + 6 + 6 + 16 + 16 = 48B
> >
> Oh I didn’t document well.
> The field supported_packet_field_mask_bmap is bitmap of well defined fields, one bit for each field.
> Such as,
> Bit_0 = dst_mac
> Bit_1 = src_mac
> Bit_2 = eth_type
> Bit_3 = vlan_tag
> Bit_4 = dst_ip
> And so on.
Ok, I got it. Then we don't need to reserve 256 bits (32B) for
supported_packet_field_mask_bmap. 64 bits represent 64 different fields,
which I think is enough, or should it be 128 bits?
Considering the alignment of the structure, it should look like this:
struct flow_filter_capabilities {
    u8 flow_groups;
    u8 padding1;
    u16 num_flow_filter_vqs;
    u16 start_vq_index;
    u16 padding2;
    u32 max_flow_filters_per_group;
    u32 max_flow_filters;
    u64 supported_packet_field_mask_bmap;
};
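
The padded layout above can be checked at compile time. This is only a sketch: the u8/u16/u32/u64 typedefs are assumed to map to the fixed-width C types, and the expected offsets hold on common ABIs where each member is naturally aligned.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed mapping of the spec-style type names to fixed-width C types. */
typedef uint8_t  u8;
typedef uint16_t u16;
typedef uint32_t u32;
typedef uint64_t u64;

struct flow_filter_capabilities {
	u8  flow_groups;
	u8  padding1;
	u16 num_flow_filter_vqs;
	u16 start_vq_index;
	u16 padding2;
	u32 max_flow_filters_per_group;
	u32 max_flow_filters;
	u64 supported_packet_field_mask_bmap;
};

/* With explicit padding, the compiler has no holes left to insert. */
static_assert(offsetof(struct flow_filter_capabilities, num_flow_filter_vqs) == 2,
	      "num_flow_filter_vqs should be 2-byte aligned");
static_assert(offsetof(struct flow_filter_capabilities, max_flow_filters_per_group) == 8,
	      "max_flow_filters_per_group should be 4-byte aligned");
static_assert(offsetof(struct flow_filter_capabilities, supported_packet_field_mask_bmap) == 16,
	      "bitmap should be 8-byte aligned");
static_assert(sizeof(struct flow_filter_capabilities) == 24,
	      "structure should be 24 bytes with no implicit padding");
```

Spelling out the padding keeps the layout identical for driver and device regardless of compiler, which matters for a structure exchanged over DMA.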
>
> So yes, the actual content like src-port does not need to mask, but yes,
> when the filter rule arrives from ARFS or FC side, that metdata info coming in ethtool_rx_flow_spec to be masked.
>
Yes.
> It would make more sense to have a dedicated command for the bitmap for two reasons.
> 1. It doesn't get sandwiched in the future when new fields are added after the bitmap.
> 2. When bitmap needs to extend, it is not fragmented at two or more places.
Agree! Allowing field extensions is good, especially the supported mask fields.
Thanks!
^ permalink raw reply [flat|nested] 73+ messages in thread
* [virtio-comment] Re: [PATCH requirements 5/7] net-features: Add n-tuple receive flow filters requirements
2023-08-03 9:59 ` [virtio-comment] " Parav Pandit
2023-08-03 13:07 ` [virtio-comment] " Heng Qi
@ 2023-08-08 8:21 ` Heng Qi
2023-08-14 5:15 ` [virtio-comment] " Parav Pandit
1 sibling, 1 reply; 73+ messages in thread
From: Heng Qi @ 2023-08-08 8:21 UTC (permalink / raw)
To: Parav Pandit
Cc: Satananda Burla, virtio-comment@lists.oasis-open.org,
Shahaf Shuler, virtio@lists.oasis-open.org
On Thu, Aug 03, 2023 at 09:59:54AM +0000, Parav Pandit wrote:
>
> > From: Heng Qi <hengqi@linux.alibaba.com>
> > Sent: Wednesday, August 2, 2023 8:55 PM
>
> > Hi, Parav. Sorry for not responding to this in time due to other things recently.
> >
> > Yes, RFF has two scenarios, set_rxnfc and ARFS, both of which will affect the
> > packet steering on the device side.
> > I think manually configured rules should have higher priority than ARFS
> > automatic configuration.
> > This behavior is intuitive and consistent with other drivers. Therefore, the
> > processing chain on a rx packet is:
> > {mac,vlan,promisc rx filters} -> {set_rxnfc} -> {ARFS} -> {rss/hash config}.
> >
> Correct.
> Within the RFF context, the priority among multiple RFF entries is governed by the concept of group.
> So above two users of the RFF will create two groups and assign priority to it and achieve the desired processing order.
>
> > There are also priorities within set_rxnfc and ARFS respectively.
> > 1. For set_rxnfc, which has the exact match and the mask match. Exact
> > matches should have higher priority.
> > Suppose there are two rules,
> > rule1: {"tcpv4", "src-ip: 1.1.1.1"} -> rxq1
> > rule2: {"tcpv4", "src-ip: 1.1.1.1", "dst-port: 8989"} -> rxq2. Received
> > rx packets whose src-ip is 1.1.1.1 should match rule2 instead of rule1.
> >
> Yes. Driver should be able to set the priority within the group as well for above scenario.
>
> > The rules of set_rxnfc come from manual configuration, the number of these
> > rules is small and we may not need grouping for this. And ctrlq can meet
> > the configuration rate.
> >
> Yes, but having single interface for two use cases enables the device implementation to not build driver interface specific infra.
> Both can be handled by unified interface.
I reconsidered the number of groups, we don't necessarily have only
two groups for the time being (one is RFF, the other is ARFS). For
example, the driver may maintain groups with different priorities for
RFF itself (for example, according to the number of fields contained in
ntuple, etc.), and the driver may also maintain different groups with
the same priority for different flow types of ARFS, etc.
Thanks!
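The priority discussion above (group priority first, then priority within a
group, otherwise fall through to the rss/hash config) can be sketched as
follows. This is only an illustration under assumptions: the struct names,
the two-field key and the "lower value wins" convention are hypothetical and
not taken from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified match key: only the fields used in the example. */
struct rff_key {
	uint32_t src_ip;
	uint16_t dst_port;
};

struct rff_rule {
	uint8_t group_prio;   /* priority of the owning group; lower value wins */
	uint8_t rule_prio;    /* priority inside the group; lower value wins */
	struct rff_key value; /* field values to match */
	struct rff_key mask;  /* an all-zero field mask means "do not care" */
	uint16_t dest_vq;     /* destination receive vq index */
};

static inline int rff_rule_matches(const struct rff_rule *r,
				   const struct rff_key *pkt)
{
	return ((pkt->src_ip ^ r->value.src_ip) & r->mask.src_ip) == 0 &&
	       ((pkt->dst_port ^ r->value.dst_port) & r->mask.dst_port) == 0;
}

/* Return the destination vq of the best matching rule, or -1 when no rule matches. */
static inline int rff_lookup(const struct rff_rule *rules, size_t n,
			     const struct rff_key *pkt)
{
	const struct rff_rule *best = NULL;

	assert(rules != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		const struct rff_rule *r = &rules[i];

		if (!rff_rule_matches(r, pkt))
			continue;
		if (!best || r->group_prio < best->group_prio ||
		    (r->group_prio == best->group_prio &&
		     r->rule_prio < best->rule_prio))
			best = r;
	}
	return best ? (int)best->dest_vq : -1;
}
```

With rule1 and rule2 from the example placed in the same group and rule2
given the better in-group priority, a packet from 1.1.1.1 to port 8989 lands
on rxq2, while other packets from 1.1.1.1 still land on rxq1. A linear scan is
used only for clarity; a device would more likely use per-group tables.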
>
> > 2. For ARFS, which only has the exact match.
> > For ARFS, since there is only one matching rule for a certain flow, so there is no
> > need for group?
> Groups are defining the priority between two types of rules.
> Within ARFS domain we don't need group.
>
> However instead of starting with only two limiting groups, it is better to have some flexibility for supporting multiple groups.
> A device can define one, two, or more groups.
> So in the future, if a use case arises, the interface won't be a limit for it.
>
> > We may need different types of tables, such as UDPv4 flow table, TCPv4 flow
> > table to speed up the lookup for different flow types.
> > Besides, the high rate and large number of configuration rules means that we
> > need flow vq.
> >
> Yes, I am not sure if those tables should be exposed to the driver.
> Thinking that a device may be able to decide on table count which it may be able to create.
>
> > Therefore, although set_rxnfc and ARFS share a set of infrastructure, there are
> > still some differences, such as configuration rate and quantity. So do we need
> > add two features (VIRTIO_NET_F_RXNFC and VIRTIO_NET_F_ARFS) for
> > set_rxnfc and ARFS respectively, and ARFS can choose flow vq?
> Not really, as one interface can fulfill both needs without attaching it to a specific OS interface.
>
> > In this way, is it more conducive to advancing the work of RFF (such as
> > accelerating the advancement of set_rxnfc)?
> >
> Both the use cases are equally immediately usable so we can advance it easily using single interface now.
>
> > > +
> > > +### 3.4.1 control path
> > > +1. The number of flow filter operations/sec can range from 100k/sec to
> > 1M/sec
> > > + or even more. Hence flow filter operations must be done over a queueing
> > > + interface using one or more queues.
> >
> > This is only for ARFS, for devices that only want to support set_rxnfc, they don't
> > provide VIRTIO_NET_F_ARFS and consider implementing flow vq.
> >
> Well once the device implements flow vq, it will service both cases.
> A simple device implementation that only cares about RXNFC can implement flowvq in semi-software, serving a very small number of req/sec.
>
> > > +2. The device should be able to expose one or more supported flow filter
> > queue
> > > + count and its start vq index to the driver.
> > > +3. As each device may be operating for different performance characteristic,
> > > + start vq index and count may be different for each device. Secondly, it is
> > > + inefficient for device to provide flow filters capabilities via a config space
> > > + region. Hence, the device should be able to share these attributes using
> > > + dma interface, instead of transport registers.
> > > +4. Since flow filters are enabled much later in the driver life cycle, driver
> > > + will likely create these queues when flow filters are enabled.
> >
> > I understand that the number of flow vqs is not reflected in
> > max_virtqueue_pairs. And a new vq is created at runtime, is this supported in
> > the existing virtio spec?
> >
> We are extending the virtio-spec now if it is not supported.
> But yes, it is supported because max_virtqueue_pairs will not expose the count of flow_vq.
> (similar to how we did the AQ).
> And flowvq anyway is not _pair_ so we cannot expose there anyway.
>
> > > +5. Flow filter operations are often accelerated by device in a hardware.
> > Ability
> > > + to handle them on a queue other than control vq is desired. This achieves
> > near
> > > + zero modifications to existing implementations to add new operations on
> > new
> > > + purpose built queues (similar to transmit and receive queue).
> > > +6. The filter masks are optional; the device should be able to expose if it
> > > + support filter masks.
> > > +7. The driver may want to have priority among group of flow entries; to
> > facilitate
> > > + the device support grouping flow filter entries by a notion of a group. Each
> > > + group defines priority in processing flow.
> > > +8. The driver and group owner driver should be able to query supported
> > device
> > > + limits for the flow filter entries.
> > > +
> > > +### 3.4.2 flow operations path
> > > +1. The driver should be able to define a receive packet match criteria, an
> > > + action and a destination for a packet.
> >
> > When the user does not specify a destination when configuring a rule, do we
> > need a default destination?
> >
> I think we should not give such option to driver.
> A human/end user may not have the destination, but driver should be able to decide a predictable destination.
>
> > > For example, an ipv4 packet with a
> > > + multicast address to be steered to the receive vq 0. The second example is
> > > + ipv4, tcp packet matching a specified IP address and tcp port tuple to
> > > + be steered to receive vq 10.
> > > +2. The match criteria should include exact tuple fields well-defined such as
> > mac
> > > + address, IP addresses, tcp/udp ports, etc.
> > > +3. The match criteria should also optionally include the field mask.
> > > +4. The match criteria may optionally also include specific packet byte offset
> > > + pattern, match length, mask instead of RFC defined fields.
> > > + length, and matching pattern, which may not be defined in the standard
> > RFC.
> >
> > Is there a description error here?
> >
> Didn't follow your comment. Do you mean there is an error in above description?
>
> > > +5. Action includes (a) dropping or (b) forwarding the packet.
> > > +6. Destination is a receive virtqueue index.
> >
> > Since the concept of RSS context does not yet exist in the virtio spec.
> > Did we say that we also support carrying RSS context information when
> > negotiating the RFF feature? For example, RSS context configuration commands
> > and structures, etc.
> >
> > Or support RSS context functionality as a separate feature in another thread?
> >
> Support RSS context as separate feature.
>
> > A related point to consider is that when a user inserts a rule with an rss context,
> > the RSS context cannot be deleted, otherwise the device will cause undefined
> > behavior.
> >
> Yes, for now we can keep rss context as separate feature.
^ permalink raw reply [flat|nested] 73+ messages in thread
* Re: [virtio-comment] [PATCH requirements 2/7] net-features: Add low latency transmit queue requirements
2023-07-24 3:34 ` [virtio-comment] [PATCH requirements 2/7] net-features: Add low latency transmit queue requirements Parav Pandit
@ 2023-08-08 8:24 ` David Edmondson
2023-08-10 19:05 ` [virtio-comment] RE: [EXT] [virtio] " Satananda Burla
2023-08-14 11:55 ` [virtio-comment] " David Edmondson
2 siblings, 0 replies; 73+ messages in thread
From: David Edmondson @ 2023-08-08 8:24 UTC (permalink / raw)
To: Parav Pandit; +Cc: shahafs, hengqi, virtio, virtio-comment
^ permalink raw reply [flat|nested] 73+ messages in thread
* Re: [virtio-comment] [PATCH requirements 3/7] net-features: Add low latency receive queue requirements
2023-07-24 3:34 ` [virtio-comment] [PATCH requirements 3/7] net-features: Add low latency receive " Parav Pandit
@ 2023-08-08 8:32 ` David Edmondson
2023-08-14 11:54 ` David Edmondson
1 sibling, 0 replies; 73+ messages in thread
From: David Edmondson @ 2023-08-08 8:32 UTC (permalink / raw)
To: Parav Pandit; +Cc: shahafs, hengqi, virtio, virtio-comment
^ permalink raw reply [flat|nested] 73+ messages in thread
* [virtio-comment] Re: [virtio] [PATCH requirements 6/7] net-features: Add packet timestamp requirements
2023-07-24 3:34 ` [virtio-comment] [PATCH requirements 6/7] net-features: Add packet timestamp requirements Parav Pandit
@ 2023-08-09 8:35 ` Xuan Zhuo
2023-08-10 6:56 ` Jason Wang
2023-08-14 13:06 ` [virtio-comment] " Parav Pandit
2023-08-14 11:59 ` [virtio-comment] " David Edmondson
1 sibling, 2 replies; 73+ messages in thread
From: Xuan Zhuo @ 2023-08-09 8:35 UTC (permalink / raw)
To: Parav Pandit; +Cc: shahafs, hengqi, virtio, Parav Pandit, virtio-comment
On Mon, 24 Jul 2023 06:34:20 +0300, Parav Pandit <parav@nvidia.com> wrote:
> Add tx and rx packet timestamp requirements.
>
> Signed-off-by: Parav Pandit <parav@nvidia.com>
> ---
> net-workstream/features-1.4.md | 26 ++++++++++++++++++++++++++
> 1 file changed, 26 insertions(+)
>
> diff --git a/net-workstream/features-1.4.md b/net-workstream/features-1.4.md
> index d228462..37820b6 100644
> --- a/net-workstream/features-1.4.md
> +++ b/net-workstream/features-1.4.md
> @@ -10,6 +10,7 @@ together is desired while updating the virtio net interface.
> 2. Low latency tx and rx virtqueues for PCI transport
> 3. Virtqueue notification coalescing re-arming support
> 4 Virtqueue receive flow filters (RFF)
> +5. Device timestamp for tx and rx packets
>
> # 3. Requirements
> ## 3.1 Device counters
> @@ -280,3 +281,28 @@ struct virtio_net_rff_delete {
> u8 padding[2];
> le32 flow_id;
> };
> +
> +## 3.5 Packet timestamp
> +1. Device should provide transmit timestamp and receive timestamp of the packets
> + at per packet level when the device is enabled.
> +2. Device should provide the current free running clock in the least latency
> + possible using an MMIO register read of 64-bit to have the least jitter.
> +3. Device should provide the current frequency and the frequency unit for the
> + software to synchronize the reference point of software and the device using
> + a control vq command.
> +
> +### 3.5.1 Transmit timestamp
> +1. Transmit completion must contain a packet transmission timestamp when the
> + device is enabled for it.
> +2. The device should record the packet transmit timestamp in the completion at
> + the farthest egress point towards the network.
> +3. The device must provide a transmit packet timestamp in a single DMA
> + transaction along with the rest of the transmit completion fields.
> +
> +### 3.5.2 Receive timestamp
> +1. Receive completion must contain a packet reception timestamp when the device
> + is enabled for it.
> +2. The device should record the received packet timestamp at the closest ingress
> + point of reception from the network.
> +3. The device should provide a receive packet timestamp in a single DMA
> + transaction along with the rest of the receive completion fields.
According to the last discussion, the feature will depend on the new desc
structure.
I would like to know, can we introduce this to the current spec with a simple change?
struct vring_used_elem {
/* Index of start of used descriptor chain. */
__virtio32 id;
/* Total length of the descriptor chain which was used (written to) */
__virtio32 len;
+ __virtio64 timestamp;
};
Then the existing devices can support this easily. If we introduce this through the
new desc structure, we can foresee that this function will not be implemented by
many existing machines. But this function is useful, so we want to support it in
a simple way.
Thanks.
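As a rough illustration of the proposal above, the following C sketch shows
the extended used element and how a driver might turn the raw value into
nanoseconds. It assumes the timestamp is a tick count of the device's free
running clock and that the frequency is reported in Hz, as requirements 3.5
items 2 and 3 suggest. The struct name vring_used_elem_ts and the helper
ts_ticks_to_ns() are hypothetical, and fixed-width host types stand in for
__virtio32/__virtio64.

```c
#include <assert.h>
#include <stdint.h>

/* Used ring element with the proposed extra field. */
struct vring_used_elem_ts {
	uint32_t id;        /* index of start of used descriptor chain */
	uint32_t len;       /* total length written to the descriptor chain */
	uint64_t timestamp; /* assumed: device free running clock ticks */
};

static_assert(sizeof(struct vring_used_elem_ts) == 16,
	      "the 8B used element grows to 16B");

/*
 * Convert device ticks to nanoseconds for a clock frequency given in Hz.
 * Handling whole seconds and the remainder separately keeps the remainder
 * term within 64 bits for frequencies below roughly 18 GHz.
 */
static inline uint64_t ts_ticks_to_ns(uint64_t ticks, uint64_t freq_hz)
{
	assert(freq_hz != 0);
	return (ticks / freq_hz) * 1000000000ull +
	       (ticks % freq_hz) * 1000000000ull / freq_hz;
}
```

Note the cost of this simple change: every used element doubles in size,
whether or not a timestamp was requested for that buffer.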
> --
> 2.26.2
>
>
> ---------------------------------------------------------------------
> To unsubscribe from this mail list, you must leave the OASIS TC that
> generates this mail. Follow this link to all your TCs in OASIS at:
> https://www.oasis-open.org/apps/org/workgroup/portal/my_workgroups.php
>
^ permalink raw reply [flat|nested] 73+ messages in thread
* Re: [virtio-comment] Re: [virtio] [PATCH requirements 6/7] net-features: Add packet timestamp requirements
2023-08-09 8:35 ` [virtio-comment] Re: [virtio] " Xuan Zhuo
@ 2023-08-10 6:56 ` Jason Wang
2023-08-15 6:13 ` Parav Pandit
[not found] ` <CAF=yD-+LMY3yE3qtd4vHc8CGOz6UAf4njM2QiZcajrQgL=KZRQ@mail.gmail.com>
2023-08-14 13:06 ` [virtio-comment] " Parav Pandit
1 sibling, 2 replies; 73+ messages in thread
From: Jason Wang @ 2023-08-10 6:56 UTC (permalink / raw)
To: Xuan Zhuo
Cc: Parav Pandit, shahafs, hengqi, virtio, virtio-comment,
Willem de Bruijn
On Wed, Aug 9, 2023 at 4:47 PM Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote:
>
> On Mon, 24 Jul 2023 06:34:20 +0300, Parav Pandit <parav@nvidia.com> wrote:
> > Add tx and rx packet timestamp requirements.
> >
> > Signed-off-by: Parav Pandit <parav@nvidia.com>
> > ---
> > net-workstream/features-1.4.md | 26 ++++++++++++++++++++++++++
> > 1 file changed, 26 insertions(+)
> >
> > diff --git a/net-workstream/features-1.4.md b/net-workstream/features-1.4.md
> > index d228462..37820b6 100644
> > --- a/net-workstream/features-1.4.md
> > +++ b/net-workstream/features-1.4.md
> > @@ -10,6 +10,7 @@ together is desired while updating the virtio net interface.
> > 2. Low latency tx and rx virtqueues for PCI transport
> > 3. Virtqueue notification coalescing re-arming support
> > 4 Virtqueue receive flow filters (RFF)
> > +5. Device timestamp for tx and rx packets
> >
> > # 3. Requirements
> > ## 3.1 Device counters
> > @@ -280,3 +281,28 @@ struct virtio_net_rff_delete {
> > u8 padding[2];
> > le32 flow_id;
> > };
> > +
> > +## 3.5 Packet timestamp
> > +1. Device should provide transmit timestamp and receive timestamp of the packets
> > + at per packet level when the device is enabled.
> > +2. Device should provide the current free running clock in the least latency
> > + possible using an MMIO register read of 64-bit to have the least jitter.
> > +3. Device should provide the current frequency and the frequency unit for the
> > + software to synchronize the reference point of software and the device using
> > + a control vq command.
> > +
> > +### 3.5.1 Transmit timestamp
> > +1. Transmit completion must contain a packet transmission timestamp when the
> > + device is enabled for it.
> > +2. The device should record the packet transmit timestamp in the completion at
> > + the farthest egress point towards the network.
> > +3. The device must provide a transmit packet timestamp in a single DMA
> > + transaction along with the rest of the transmit completion fields.
> > +
> > +### 3.5.2 Receive timestamp
> > +1. Receive completion must contain a packet reception timestamp when the device
> > + is enabled for it.
> > +2. The device should record the received packet timestamp at the closest ingress
> > + point of reception from the network.
> > +3. The device should provide a receive packet timestamp in a single DMA
> > + transaction along with the rest of the receive completion fields.
>
>
> According to the last discussion, the feature will depend on the new desc
> structure.
>
> I would like to know, can we introduce this to the current spec with a simple change?
>
> struct vring_used_elem {
> /* Index of start of used descriptor chain. */
> __virtio32 id;
> /* Total length of the descriptor chain which was used (written to) */
> __virtio32 len;
>
> + __virtio64 timestamp;
> };
I think this could be one way and another proposal from Willem is:
https://lists.linuxfoundation.org/pipermail/virtualization/2021-February/052422.html
which might be tricky for TX but it's more flexible since it allows
timestamps to be done per buffer.
>
>
> Then the existing devices can support this easily. If we introduce this through the
> new desc structure, we can foresee that this function will not be implemented by
> many existing machines. But this function is useful, so we want to support it in
> a simple way.
Makes sense.
Thanks
>
>
> Thanks.
>
> > --
> > 2.26.2
> >
> >
^ permalink raw reply [flat|nested] 73+ messages in thread
* [virtio-comment] RE: [EXT] [virtio] [PATCH requirements 2/7] net-features: Add low latency transmit queue requirements
2023-07-24 3:34 ` [virtio-comment] [PATCH requirements 2/7] net-features: Add low latency transmit queue requirements Parav Pandit
2023-08-08 8:24 ` David Edmondson
@ 2023-08-10 19:05 ` Satananda Burla
2023-08-15 5:51 ` Parav Pandit
2023-08-14 11:55 ` [virtio-comment] " David Edmondson
2 siblings, 1 reply; 73+ messages in thread
From: Satananda Burla @ 2023-08-10 19:05 UTC (permalink / raw)
To: Parav Pandit, virtio-comment@lists.oasis-open.org
Cc: shahafs@nvidia.com, hengqi@linux.alibaba.com,
virtio@lists.oasis-open.org
Hi Parav
> -----Original Message-----
> From: virtio@lists.oasis-open.org <virtio@lists.oasis-open.org> On
> Behalf Of Parav Pandit
> Sent: Sunday, July 23, 2023 8:34 PM
> To: virtio-comment@lists.oasis-open.org
> Cc: shahafs@nvidia.com; hengqi@linux.alibaba.com; virtio@lists.oasis-
> open.org; Parav Pandit <parav@nvidia.com>
> Subject: [EXT] [virtio] [PATCH requirements 2/7] net-features: Add low
> latency transmit queue requirements
>
> Add requirements for the low latency transmit queue.
>
> Signed-off-by: Parav Pandit <parav@nvidia.com>
> ---
> changelog:
> v1->v2:
> - added generic requirement to inline the request content
> along with the descriptor for non virtio-net devices
> - added requirement to inline the request content along
> with the descriptor for virtio flow filter queue as two
> features are similar
> v0->v1:
> - added design goals for which requirements are added
> ---
> net-workstream/features-1.4.md | 88 ++++++++++++++++++++++++++++++++++
> 1 file changed, 88 insertions(+)
>
> diff --git a/net-workstream/features-1.4.md b/net-workstream/features-
> 1.4.md
> index 4c3797b..eb95592 100644
> --- a/net-workstream/features-1.4.md
> +++ b/net-workstream/features-1.4.md
> @@ -7,6 +7,7 @@ together is desired while updating the virtio net
> interface.
>
> # 2. Summary
> 1. Device counters visible to the driver
> +2. Low latency tx virtqueue for PCI transport
>
> # 3. Requirements
> ## 3.1 Device counters
> @@ -33,3 +34,90 @@ together is desired while updating the virtio net
> interface.
> ### 3.1.2 Per transmit queue counters
> 1. le64 tx_gso_pkts: Packets send as transmit GSO sequence
> 2. le64 tx_pkts: Total packets send by the device
> +
> +## 3.2 Low PCI latency virtqueues
> +### 3.2.1 Low PCI latency tx virtqueue
> +0. Design goal
> + a. Reduce PCI access latency in packet transmit flow
> + b. Avoid O(N) descriptor parser to detect a packet stream to
> simplify device
> + logic
> + c. Reduce number of PCI transmit completion transactions and have
> unified
> + completion flow with/without transmit timestamping
> + d. Avoid partial cache line writes on transmit completions
> +
> +1. Packet transmit descriptor should contain data descriptors count
> without any
> + indirection and without any O(N) search to find the end of a packet
> stream.
> + For example, a packet transmit descriptor (called vnet_tx_hdr_desc
> + subsequently) to contain a field num_next_desc for the packet stream
> + indicating that a packet is located in N data descriptors.
> +
> +2. Packet transmit descriptor should contain segmentation offload-
> related fields
> + without any indirection. For example, packet transmit descriptor to
> contain
> + gso_type, gso_size/mss, header length, csum placement byte offset,
> and
> + csum start.
> +
> +3. Packet transmit descriptor should be able to place a small size
> packet that
> + does not have any L4 data after the vnet_tx_hdr_desc in the
> virtqueue memory.
> + For example a TCP ack only packet can fit in a descriptor memory
> which
> + otherwise consume more than 25% of metadata to describe the packet.
> +
> +4. Packet transmit descriptor should be able to place a full GSO header
> (L2 to
> + L4) after header descriptor and before data descriptors. For
> example, the
> + GSO header is placed after struct vnet_tx_hdr_desc in the virtqueue
> memory.
> + When such a GSO header is positioned adjacent to the packet transmit
> + descriptor, and when the GSO header is not aligned to 16B, the
> following
> + data descriptor to start on the 8B aligned boundary.
> +
> +5. An example of the above requirements at high level is:
> +
> +```
> +struct virtio_packed_q_desc {
> + /* current desc for reference */
> + u64 address;
> + u32 len;
> + u16 id;
> + u16 flags;
> +};
> +
> +/* Constant size header descriptor for tx packets */
> +struct vnet_tx_hdr_desc {
> + u16 flags; /* indicate how to parse next fields */
> + u16 id; /* desc id to come back in completion */
> + u8 num_next_desc; /* indicates the number of the next 16B data desc
> for this
> + * buffer.
> + */
> + u8 gso_type;
> + le16 gso_hdr_len;
> + le16 gso_size;
> + le16 csum_start;
> + le16 csum_offset;
> + u8 inline_pkt_len; /* indicates the length of the inline packet
> after this
> + * desc
> + */
> + u8 reserved;
> + u8 padding[];
> +};
> +
> +/* Example of a short packet or GSO header placed in the desc section
> of the vq
> + */
> +struct vnet_tx_small_pkt_desc {
> + u8 raw_pkt[128];
> +};
> +
> +/* Example of header followed by data descriptor */
> +struct vnet_tx_hdr_desc hdr_desc;
> +struct vnet_data_desc desc[2];
> +
> +```
> +
> +6. Ability to zero pad the transmit completion when the transmit
> completion is
> + shorter than the CPU cache line size.
Did you mean that only the last completion in a cache line will be padded?
For example, if a transmit completion is 16 bytes, four of them fit into
a cache line. If the device has only 3 completions to post, it would post
3 and pad the last 16 bytes. I hope you did not mean that every transmit
completion is padded to the cache line size.
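The per-batch reading described above can be written down as a small C
helper. This is a sketch of that interpretation only, not of what the
requirement mandates: the function name cqe_batch_pad_bytes() is hypothetical,
and it assumes the batch starts on a cache line boundary.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Bytes of zero padding to append after a batch of completions so that the
 * device write ends on a cache line boundary. The batch is assumed to start
 * on a cache line boundary.
 */
static inline size_t cqe_batch_pad_bytes(size_t n_completions, size_t cqe_size,
					 size_t cache_line)
{
	size_t tail;

	assert(cache_line != 0);
	tail = (n_completions * cqe_size) % cache_line;
	return tail ? cache_line - tail : 0;
}
```

For 16-byte completions and a 64-byte cache line, three completions need 16
bytes of padding and four need none, which matches the example in the
question.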
> +
> +7. Ability to place all transmit completion together with it per packet
> stream
> + transmit timestamp using single PCIe transaction.
> +
> +8. A generic feature of the virtqueue, to contain such header data
> inline for virtio
> + devices other than virtio-net.
> +
> +9. A flow filter virtqueue also similarly need the ability to inline
> the short flow
> + command header.
> --
> 2.26.2
>
>
> ---------------------------------------------------------------------
> To unsubscribe from this mail list, you must leave the OASIS TC that
> generates this mail. Follow this link to all your TCs in OASIS at:
> https://www.oasis-open.org/apps/org/workgroup/portal/my_workgroups.php
^ permalink raw reply [flat|nested] 73+ messages in thread
* [virtio-comment] RE: [EXT] [virtio] [PATCH requirements 7/7] net-features: Add header data split requirements
2023-07-24 3:34 ` [virtio-comment] [PATCH requirements 7/7] net-features: Add header data split requirements Parav Pandit
@ 2023-08-10 19:19 ` Satananda Burla
2023-08-14 12:00 ` [virtio-comment] " David Edmondson
1 sibling, 0 replies; 73+ messages in thread
From: Satananda Burla @ 2023-08-10 19:19 UTC (permalink / raw)
To: Parav Pandit, virtio-comment@lists.oasis-open.org
Cc: shahafs@nvidia.com, hengqi@linux.alibaba.com,
virtio@lists.oasis-open.org
Hi Parav
> -----Original Message-----
> From: virtio@lists.oasis-open.org <virtio@lists.oasis-open.org> On
> Behalf Of Parav Pandit
> Sent: Sunday, July 23, 2023 8:34 PM
> To: virtio-comment@lists.oasis-open.org
> Cc: shahafs@nvidia.com; hengqi@linux.alibaba.com; virtio@lists.oasis-
> open.org; Parav Pandit <parav@nvidia.com>
> Subject: [EXT] [virtio] [PATCH requirements 7/7] net-features: Add
> header data split requirements
>
> Add header data split requirements for the receive packets.
>
> Signed-off-by: Parav Pandit <parav@nvidia.com>
> ---
> net-workstream/features-1.4.md | 13 +++++++++++++
> 1 file changed, 13 insertions(+)
>
> diff --git a/net-workstream/features-1.4.md b/net-workstream/features-
> 1.4.md
> index 37820b6..a64e356 100644
> --- a/net-workstream/features-1.4.md
> +++ b/net-workstream/features-1.4.md
> @@ -11,6 +11,7 @@ together is desired while updating the virtio net
> interface.
> 3. Virtqueue notification coalescing re-arming support
> 4 Virtqueue receive flow filters (RFF)
> 5. Device timestamp for tx and rx packets
> +6. Header data split for the receive virtqueue
>
> # 3. Requirements
> ## 3.1 Device counters
> @@ -306,3 +307,15 @@ struct virtio_net_rff_delete {
> point of reception from the network.
> 3. The device should provide a receive packet timestamp in a single DMA
> transaction along with the rest of the receive completion fields.
> +
> +## 3.6 Header data split for the receive virtqueue
> +1. The device should be able to DMA the packet header and data to two different
> + memory locations, this enables driver and networking stack to perform zero
> + copy to application buffer(s).
> +2. The driver should be able to configure maximum header buffer size per
> + virtqueue.
> +3. The header buffer to be in a physically contiguous memory per virtqueue
> +4. The device should be able to indicate header data split in the receive
> + completion.
The device should also be able to indicate the size of the header that
has been placed in the header buffer.
> +5. The device should be able to zero pad the header buffer when the received
> + header is shorter than cpu cache line size.
I am not sure why this is needed. The driver would anyway need the size
of the header that was placed in the header buffer to be indicated by the
device, and would not consume anything beyond that.
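
To illustrate the point, here is a minimal sketch in which the driver reads only
the reported header length from the header buffer. The structure
hds_rx_completion, the flag HDS_F_SPLIT and the helper hds_copy_header() are
hypothetical names made up for this example; none of them comes from the patch
or from the virtio specification.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical receive completion fields; not part of any virtio spec. */
#define HDS_F_SPLIT 0x1u

struct hds_rx_completion {
    uint16_t flags;   /* HDS_F_SPLIT set when header and data were split */
    uint16_t hdr_len; /* bytes the device wrote into the header buffer */
};

/*
 * Copy the received header out of the per-queue header buffer.
 * Only hdr_len bytes are read, so whatever follows them in the buffer
 * (zero padded or not) never influences the result.
 * Returns the number of bytes copied, or 0 if the packet was not split
 * or the reported length does not fit.
 */
static size_t hds_copy_header(const struct hds_rx_completion *c,
                              const uint8_t *hdr_buf, size_t hdr_buf_size,
                              uint8_t *out, size_t out_size)
{
    assert(c && hdr_buf && out);
    if (!(c->flags & HDS_F_SPLIT))
        return 0;
    if (c->hdr_len > hdr_buf_size || c->hdr_len > out_size)
        return 0;
    memcpy(out, hdr_buf, c->hdr_len);
    return c->hdr_len;
}
```

With a reported length available, padding would matter only for how the device
writes the buffer (for example to avoid a partial cache line write), not for
what the driver reads.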
> --
> 2.26.2
>
>
^ permalink raw reply [flat|nested] 73+ messages in thread
* Re: [virtio-comment] Re: [virtio] [PATCH requirements 6/7] net-features: Add packet timestamp requirements
[not found] ` <CAF=yD-+LMY3yE3qtd4vHc8CGOz6UAf4njM2QiZcajrQgL=KZRQ@mail.gmail.com>
@ 2023-08-14 2:54 ` Jason Wang
2023-08-15 6:26 ` Parav Pandit
1 sibling, 0 replies; 73+ messages in thread
From: Jason Wang @ 2023-08-14 2:54 UTC (permalink / raw)
To: Willem de Bruijn
Cc: Xuan Zhuo, Parav Pandit, shahafs, hengqi, virtio, virtio-comment,
Willem de Bruijn
On Fri, Aug 11, 2023 at 5:28 AM Willem de Bruijn
<willemdebruijn.kernel@gmail.com> wrote:
>
> On Thu, Aug 10, 2023 at 2:56 AM Jason Wang <jasowang@redhat.com> wrote:
> >
> > On Wed, Aug 9, 2023 at 4:47 PM Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote:
> > >
> > > On Mon, 24 Jul 2023 06:34:20 +0300, Parav Pandit <parav@nvidia.com> wrote:
> > > > Add tx and rx packet timestamp requirements.
> > > >
> > > > Signed-off-by: Parav Pandit <parav@nvidia.com>
> > > > ---
> > > > net-workstream/features-1.4.md | 26 ++++++++++++++++++++++++++
> > > > 1 file changed, 26 insertions(+)
> > > >
> > > > diff --git a/net-workstream/features-1.4.md b/net-workstream/features-1.4.md
> > > > index d228462..37820b6 100644
> > > > --- a/net-workstream/features-1.4.md
> > > > +++ b/net-workstream/features-1.4.md
> > > > @@ -10,6 +10,7 @@ together is desired while updating the virtio net interface.
> > > > 2. Low latency tx and rx virtqueues for PCI transport
> > > > 3. Virtqueue notification coalescing re-arming support
> > > > 4 Virtqueue receive flow filters (RFF)
> > > > +5. Device timestamp for tx and rx packets
> > > >
> > > > # 3. Requirements
> > > > ## 3.1 Device counters
> > > > @@ -280,3 +281,28 @@ struct virtio_net_rff_delete {
> > > > u8 padding[2];
> > > > le32 flow_id;
> > > > };
> > > > +
> > > > +## 3.5 Packet timestamp
> > > > +1. Device should provide transmit timestamp and receive timestamp of the packets
> > > > + at per packet level when the device is enabled.
> > > > +2. Device should provide the current free running clock in the least latency
> > > > + possible using an MMIO register read of 64-bit to have the least jitter.
Btw, an MMIO register is not offered via all transports.
> > > > +3. Device should provide the current frequency and the frequency unit for the
> > > > + software to synchronize the reference point of software and the device using
> > > > + a control vq command.
> > > > +
> > > > +### 3.5.1 Transmit timestamp
> > > > +1. Transmit completion must contain a packet transmission timestamp when the
> > > > + device is enabled for it.
> > > > +2. The device should record the packet transmit timestamp in the completion at
> > > > + the farthest egress point towards the network.
> > > > +3. The device must provide a transmit packet timestamp in a single DMA
> > > > + transaction along with the rest of the transmit completion fields.
> > > > +
> > > > +### 3.5.2 Receive timestamp
> > > > +1. Receive completion must contain a packet reception timestamp when the device
> > > > + is enabled for it.
> > > > > +2. The device should record the received packet timestamp at the closest ingress
> > > > + point of reception from the network.
> > > > +3. The device should provide a receive packet timestamp in a single DMA
> > > > + transaction along with the rest of the receive completion fields.
> > >
> > >
> > > According to the last discussion, the feature will depend on the new desc
> > > structure.
> > >
> > > I would like to know, can we introduce this to the current spec with a simple change?
> > >
> > > struct vring_used_elem {
> > > /* Index of start of used descriptor chain. */
> > > __virtio32 id;
> > > /* Total length of the descriptor chain which was used (written to) */
> > > __virtio32 len;
> > >
> > > + __virtio64 timestamp;
> > > };
> >
> > I think this could be one way and another proposal from Willem is:
> >
> > https://lists.linuxfoundation.org/pipermail/virtualization/2021-February/052422.html
> >
> > which might be tricky for TX but it's more flexible since it allows
> > timestamps to be done per buffer.
>
> Thanks for looping me in, Jason.
>
> I would separate timestamping data from clock synchronization and
> timestamping control path.
>
>
> For synchronization, a single 64-bit MMIO read is not necessarily the
> fastest when traversing real hardware boundaries, or sufficient. PCIe
> has Precision Time Measurement (PTM) hardware logic to capture
> clock measurement with less uncertainty and possibly latency, for instance.
> Indeed, uncertainty is more important than raw latency.
>
> FWIW, we're also trying to capture similar requirements for physical devices
> through the Open Compute Project NIC Core Offloads effort:
> https://www.opencompute.org/w/index.php?title=Core_Offloads#Timestamping
This is interesting and the whole idea of "NIC Core Offloads" looks
great as well.
>
> Virtio defines both a virtual interface and one that can be supported
> by hardware, so it's not a 1:1 match, but many subtle points probably
> apply to both.
Yeah, I think virtio-net may try to align with that.
>
>
> For timestamping, the difficult part is the transmit completion stamp,
> asynchronous with data flow. If this is taken on the device before or
> at the time of queuing the completion to the host, then it can be stored
> in the completion descriptor.
Yes, this is the proposal from Xuan actually.
> But if taken at the PHY, say, it is not
> uncommon for this to happen after the completion is written.
This seems to be the requirement of this patch, which says:
"
The device should record the packet transmit timestamp in the completion at
the farthest egress point towards the network.
"
It seems to imply the PHY but with a DMA interface:
"
The device must provide a transmit packet timestamp in a single DMA
transaction along with the rest of the transmit completion fields.
"
Parav, can you clarify?
> And
> instead a block of dedicated registers is used and the host must poll
> a register. There are examples of these in the Linux device driver
> directory. For virtio, it is probably sufficient to only support the
> first kind.
Right, for the second, it could be tricky for a virtual environment.
>
>
> A third aspect is the control path: querying and configuring hardware
> timestamps (siocgstamp/siosgstamp). If hardware is capable, easiest is
> to just timestamp every packet. Much hardware out there targets low
> rate PTP messages. But a virtualized virtio device could timestamp
> all. In which case this control API can maybe be punted on.
>
That could be one way.
Thanks
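
For reference, the quoted requirement that the device report its clock
frequency lets software convert raw device timestamps into nanoseconds. The
following is a minimal sketch only: it assumes the device counter is a plain
tick count and that the frequency has already been normalized to Hz, and the
name dev_ticks_to_ns() is hypothetical. It does not address the
synchronization uncertainty that Willem raises above.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ull

/*
 * Convert a free-running device tick count to nanoseconds.
 * Splitting into whole seconds and a remainder avoids overflowing the
 * 64-bit intermediate product that ticks * NSEC_PER_SEC would produce
 * for large tick values. Assumes freq_hz is at most about 18.4 GHz so
 * that rem * NSEC_PER_SEC still fits in 64 bits.
 */
static uint64_t dev_ticks_to_ns(uint64_t ticks, uint64_t freq_hz)
{
    assert(freq_hz > 0 && freq_hz <= UINT64_MAX / NSEC_PER_SEC);
    uint64_t sec = ticks / freq_hz;
    uint64_t rem = ticks % freq_hz;
    return sec * NSEC_PER_SEC + rem * NSEC_PER_SEC / freq_hz;
}
```

For a 25 MHz clock, for example, each tick is 40 ns, so 50000001 ticks map to
2 s plus 40 ns.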
^ permalink raw reply [flat|nested] 73+ messages in thread
* [virtio-comment] RE: [PATCH requirements 5/7] net-features: Add n-tuple receive flow filters requirements
2023-08-08 8:21 ` [virtio-comment] " Heng Qi
@ 2023-08-14 5:15 ` Parav Pandit
2023-08-14 6:18 ` [virtio-comment] " Heng Qi
0 siblings, 1 reply; 73+ messages in thread
From: Parav Pandit @ 2023-08-14 5:15 UTC (permalink / raw)
To: Heng Qi
Cc: Satananda Burla, virtio-comment@lists.oasis-open.org,
Shahaf Shuler, virtio@lists.oasis-open.org
> From: Heng Qi <hengqi@linux.alibaba.com>
> Sent: Tuesday, August 8, 2023 1:52 PM
> > Yes, but having single interface for two use cases enables the device
> implementation to not build driver interface specific infra.
> > Both can be handled by unified interface.
>
> I reconsidered the number of groups, we don't necessarily have only two groups
> for the time being (one is RFF, the other is ARFS). For example, the driver may
> maintain groups with different priorities for RFF itself (for example, according
> to the number of fields contained in ntuple, etc.), and the driver may also
> maintain different groups with the same priority for different flow types of
> ARFS, etc.
This is fine and covered by the interface.
The maximum number of supported groups is a device capability exposed by the device.
How many groups to use and which priority to assign to each is the driver's decision.
So more than 2 groups is fine and supported by the requirements.
In the Linux net device example, 2 groups seem enough, but the spec is not limited to that.
When/if there is a switch, it can also create a group and filter or prioritize messages before they reach further NIC processing.
But we can keep this aside for now to not complicate the discussion further.
So in a nutshell, a single interface is able to serve the needs of both ARFS and ethtool programming.
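
As a rough sketch of the group and priority idea, assuming that a lower
priority value is evaluated first and that a rule matches on a single opaque
key (both are assumptions made only for this example, and the names rff_rule,
rff_group and rff_lookup() are hypothetical):

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flow filter rule: an opaque match key and a target vq. */
struct rff_rule {
    uint32_t match_key;
    uint16_t dest_vq;
};

/* Hypothetical group; the driver picks the priority of each group. */
struct rff_group {
    uint16_t priority; /* assumption: lower value is evaluated first */
    const struct rff_rule *rules;
    size_t num_rules;
};

/*
 * Return the destination vq of the matching rule in the highest
 * priority group, or -1 when no group has a match. Groups do not need
 * to be sorted; among groups of equal priority the first one wins.
 */
static int rff_lookup(const struct rff_group *groups, size_t num_groups,
                      uint32_t key)
{
    int best_vq = -1;
    uint32_t best_prio = UINT32_MAX;

    for (size_t g = 0; g < num_groups; g++) {
        if (groups[g].priority >= best_prio)
            continue; /* cannot beat the match already found */
        for (size_t r = 0; r < groups[g].num_rules; r++) {
            if (groups[g].rules[r].match_key == key) {
                best_vq = groups[g].rules[r].dest_vq;
                best_prio = groups[g].priority;
                break;
            }
        }
    }
    return best_vq;
}
```

In this sketch an ethtool-programmed group could sit at priority 0 and an ARFS
group at priority 1, and nothing limits the driver to two groups.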
^ permalink raw reply [flat|nested] 73+ messages in thread
* RE: [virtio-comment] [PATCH requirements 1/7] net-features: Add requirements document for release 1.4
2023-08-08 8:16 ` David Edmondson
@ 2023-08-14 5:17 ` Parav Pandit
2023-08-14 11:53 ` David Edmondson
0 siblings, 1 reply; 73+ messages in thread
From: Parav Pandit @ 2023-08-14 5:17 UTC (permalink / raw)
To: David Edmondson
Cc: Shahaf Shuler, hengqi@linux.alibaba.com,
virtio@lists.oasis-open.org, virtio-comment@lists.oasis-open.org
Hi David,
> From: David Edmondson <david.edmondson@oracle.com>
> Sent: Tuesday, August 8, 2023 1:46 PM
Something is wrong with all 3 replies from you.
There is no message body in them.
I thought it was my mailbox, but looking at the mailing list [1], the body is missing there as well.
Can you please resend your comments?
[1] https://lists.oasis-open.org/archives/virtio-comment/202308/msg00125.html
^ permalink raw reply [flat|nested] 73+ messages in thread
* [virtio-comment] Re: [PATCH requirements 5/7] net-features: Add n-tuple receive flow filters requirements
2023-08-14 5:15 ` [virtio-comment] " Parav Pandit
@ 2023-08-14 6:18 ` Heng Qi
2023-08-14 6:35 ` [virtio-comment] " Parav Pandit
0 siblings, 1 reply; 73+ messages in thread
From: Heng Qi @ 2023-08-14 6:18 UTC (permalink / raw)
To: Parav Pandit
Cc: Satananda Burla, virtio-comment@lists.oasis-open.org,
Shahaf Shuler, virtio@lists.oasis-open.org
On Mon, Aug 14, 2023 at 05:15:06AM +0000, Parav Pandit wrote:
> > From: Heng Qi <hengqi@linux.alibaba.com>
> > Sent: Tuesday, August 8, 2023 1:52 PM
>
> > > Yes, but having single interface for two use cases enables the device
> > implementation to not build driver interface specific infra.
> > > Both can be handled by unified interface.
> >
> > I reconsidered the number of groups, we don't necessarily have only two groups
> > for the time being (one is RFF, the other is ARFS). For example, the driver may
> > maintain groups with different priorities for RFF itself (for example, according
> > to the number of fields contained in ntuple, etc.), and the driver may also
> > maintain different groups with the same priority for different flow types of
> > ARFS, etc.
>
> This is fine and covered with the interface.
> Number of supported max groups is device capability that is exposed by the device.
>
> How many groups to use and which priority assign to each is driver's decision.
> So more than 2 groups is fine and supported by the requirements.
Yes, that's what I want to stress too.
>
> In Linux net device example, 2 groups seem enough,
Sorry, I didn't understand this. Are you referring to net device
documentation or to a driver implementation?
> but spec is not limited to it.
>
> When/if there is switch, it can also create a group and filter prioritize message before it reaches further nic processing.
> But we can keep this aside for now to not complicate the discussion more.
Yes, I noticed Xuan's thread, so this can be discussed in his thread.
Thanks!
^ permalink raw reply [flat|nested] 73+ messages in thread
* [virtio-comment] RE: [PATCH requirements 5/7] net-features: Add n-tuple receive flow filters requirements
2023-08-14 6:18 ` [virtio-comment] " Heng Qi
@ 2023-08-14 6:35 ` Parav Pandit
0 siblings, 0 replies; 73+ messages in thread
From: Parav Pandit @ 2023-08-14 6:35 UTC (permalink / raw)
To: Heng Qi
Cc: Satananda Burla, virtio-comment@lists.oasis-open.org,
Shahaf Shuler, virtio@lists.oasis-open.org
> From: Heng Qi <hengqi@linux.alibaba.com>
> Sent: Monday, August 14, 2023 11:49 AM
> > How many groups to use and which priority assign to each is driver's decision.
> > So more than 2 groups is fine and supported by the requirements.
>
> Yes, that's what I want to stress too.
>
Ok.
> >
> > In Linux net device example, 2 groups seem enough,
>
> Sorry I didn't understand this, are you referring to a net device documentation
> or a driver implementation?
>
Driver implementation.
^ permalink raw reply [flat|nested] 73+ messages in thread
* Re: [virtio-comment] [PATCH requirements 1/7] net-features: Add requirements document for release 1.4
2023-08-14 5:17 ` Parav Pandit
@ 2023-08-14 11:53 ` David Edmondson
0 siblings, 0 replies; 73+ messages in thread
From: David Edmondson @ 2023-08-14 11:53 UTC (permalink / raw)
To: Parav Pandit
Cc: Shahaf Shuler, hengqi@linux.alibaba.com,
virtio@lists.oasis-open.org, virtio-comment
On Monday, 2023-08-14 at 05:17:34 UTC, Parav Pandit wrote:
> Hi David,
>
>> From: David Edmondson <david.edmondson@oracle.com>
>> Sent: Tuesday, August 8, 2023 1:46 PM
>
> Something is wrong with all 3 replies from you.
> There is no message body in them.
> I thought it is my mailbox, but looking at the mailing list [1], it is also missing.
>
> Can you please reply your comments again?
Apologies, I will resend. This will teach me to fiddle with the
configuration of my mail client...
> [1] https://lists.oasis-open.org/archives/virtio-comment/202308/msg00125.html
--
Seems I'm not alone at being alone.
^ permalink raw reply [flat|nested] 73+ messages in thread
* Re: [virtio-comment] [PATCH requirements 3/7] net-features: Add low latency receive queue requirements
2023-07-24 3:34 ` [virtio-comment] [PATCH requirements 3/7] net-features: Add low latency receive " Parav Pandit
2023-08-08 8:32 ` David Edmondson
@ 2023-08-14 11:54 ` David Edmondson
2023-08-15 4:45 ` Parav Pandit
1 sibling, 1 reply; 73+ messages in thread
From: David Edmondson @ 2023-08-14 11:54 UTC (permalink / raw)
To: Parav Pandit; +Cc: shahafs, hengqi, virtio, virtio-comment
On Monday, 2023-07-24 at 06:34:17 +03, Parav Pandit wrote:
> Add requirements for the low latency receive queue.
>
> Signed-off-by: Parav Pandit <parav@nvidia.com>
> ---
> changelog:
> v0->v1:
> - clarified the requirements further
> - added line for the gro case
> - added design goals as the motivation for the requirements
> ---
> net-workstream/features-1.4.md | 45 +++++++++++++++++++++++++++++++++-
> 1 file changed, 44 insertions(+), 1 deletion(-)
>
> diff --git a/net-workstream/features-1.4.md b/net-workstream/features-1.4.md
> index eb95592..e04727a 100644
> --- a/net-workstream/features-1.4.md
> +++ b/net-workstream/features-1.4.md
> @@ -7,7 +7,7 @@ together is desired while updating the virtio net interface.
>
> # 2. Summary
> 1. Device counters visible to the driver
> -2. Low latency tx virtqueue for PCI transport
> +2. Low latency tx and rx virtqueues for PCI transport
>
> # 3. Requirements
> ## 3.1 Device counters
> @@ -121,3 +121,46 @@ struct vnet_data_desc desc[2];
>
> 9. A flow filter virtqueue also similarly need the ability to inline the short flow
> command header.
> +
> +### 3.2.2 Low latency rx virtqueue
> +0. Design goal:
> + a. Keep packet metadata and buffer data together which is consumed by driver
> + layer and make it available in a single cache line of cpu
> + b. Instead of having per packet descriptors which is complex to scale for
> + the device, supply the page directly to the device to consume it based
> + on packet size
Really "per packet descriptor buffers"?
> +1. The device should be able to write a packet receive completion that consists
> + of struct virtio_net_hdr (or similar) and a buffer id using a single DMA write
> + PCIe TLP.
> +2. The device should be able to perform DMA writes of multiple packets
> + completions in a single DMA transaction up to the PCIe maximum write limit
> + in a transaction.
> +3. The device should be able to zero pad packet write completion to align it to
> + 64B or CPU cache line size whenever possible.
> +4. An example of the above DMA completion structure:
> +
> +```
> +/* Constant size receive packet completion */
> +struct vnet_rx_completion {
> + u16 flags;
> + u16 id; /* buffer id */
> + u8 gso_type;
> + u8 reserved[3];
> + le16 gso_hdr_len;
> + le16 gso_size;
> + le16 csum_start;
> + le16 csum_offset;
> + u16 reserved2;
> + u64 timestamp; /* explained later */
> + u8 padding[];
> +};
> +```
> +5. The driver should be able to post constant-size buffer pages on a receive
> + queue which can be consumed by the device for an incoming packet of any size
> + from 64B to 9K bytes.
> +6. The device should be able to know the constant buffer size at receive
> + virtqueue level instead of per buffer level.
> +7. The device should be able to indicate when a full page buffer is consumed,
> + which can be recycled by the driver when the packets from the completed
> + page is fully consumed.
s/is fully consumed/are fully consumed/
> +8. The device should be able to consume multiple pages for a receive GSO stream.
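
A minimal sketch of requirements 2 and 3 above, assuming a 64B cache line and
natural C alignment. The struct below mirrors the example from the patch with
fixed-width types; the helper names and the CACHE_LINE constant are
hypothetical and chosen only for illustration.

```c
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE 64u /* assumed CPU cache line size */

/* Mirror of the example completion from the patch, fixed-width types. */
struct vnet_rx_completion_sketch {
    uint16_t flags;
    uint16_t id; /* buffer id */
    uint8_t  gso_type;
    uint8_t  reserved[3];
    uint16_t gso_hdr_len;
    uint16_t gso_size;
    uint16_t csum_start;
    uint16_t csum_offset;
    uint16_t reserved2;
    uint64_t timestamp;
};

/* Round a completion size up to the next cache line multiple. */
static size_t rx_cpl_padded_size(size_t size)
{
    return (size + CACHE_LINE - 1) & ~(size_t)(CACHE_LINE - 1);
}

/* How many padded completions fit in one DMA write of max_write bytes. */
static size_t rx_cpl_per_write(size_t max_write, size_t padded_size)
{
    return padded_size ? max_write / padded_size : 0;
}
```

With these assumptions each completion is zero padded to 64B, and a device
limited to 256B per write transaction could batch four completions in one
write.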
--
So tap at my window, maybe I might let you in.
^ permalink raw reply [flat|nested] 73+ messages in thread
* Re: [virtio-comment] [PATCH requirements 2/7] net-features: Add low latency transmit queue requirements
2023-07-24 3:34 ` [virtio-comment] [PATCH requirements 2/7] net-features: Add low latency transmit queue requirements Parav Pandit
2023-08-08 8:24 ` David Edmondson
2023-08-10 19:05 ` [virtio-comment] RE: [EXT] [virtio] " Satananda Burla
@ 2023-08-14 11:55 ` David Edmondson
2 siblings, 0 replies; 73+ messages in thread
From: David Edmondson @ 2023-08-14 11:55 UTC (permalink / raw)
To: Parav Pandit; +Cc: shahafs, hengqi, virtio, virtio-comment
On Monday, 2023-07-24 at 06:34:16 +03, Parav Pandit wrote:
> Add requirements for the low latency transmit queue.
>
> Signed-off-by: Parav Pandit <parav@nvidia.com>
> ---
> changelog:
> v1->v2:
> - added generic requirement to inline the request content
> along with the descriptor for non virtio-net devices
> - added requirement to inline the request content along
> with the descriptor for virtio flow filter queue as two
> features are similar
> v0->v1:
> - added design goals for which requirements are added
> ---
> net-workstream/features-1.4.md | 88 ++++++++++++++++++++++++++++++++++
> 1 file changed, 88 insertions(+)
>
> diff --git a/net-workstream/features-1.4.md b/net-workstream/features-1.4.md
> index 4c3797b..eb95592 100644
> --- a/net-workstream/features-1.4.md
> +++ b/net-workstream/features-1.4.md
> @@ -7,6 +7,7 @@ together is desired while updating the virtio net interface.
>
> # 2. Summary
> 1. Device counters visible to the driver
> +2. Low latency tx virtqueue for PCI transport
>
> # 3. Requirements
> ## 3.1 Device counters
> @@ -33,3 +34,90 @@ together is desired while updating the virtio net interface.
> ### 3.1.2 Per transmit queue counters
> 1. le64 tx_gso_pkts: Packets send as transmit GSO sequence
> 2. le64 tx_pkts: Total packets send by the device
> +
> +## 3.2 Low PCI latency virtqueues
> +### 3.2.1 Low PCI latency tx virtqueue
> +0. Design goal
> + a. Reduce PCI access latency in packet transmit flow
> + b. Avoid O(N) descriptor parser to detect a packet stream to simplify device
> + logic
> + c. Reduce number of PCI transmit completion transactions and have unified
> + completion flow with/without transmit timestamping
> + d. Avoid partial cache line writes on transmit completions
> +
> +1. Packet transmit descriptor should contain data descriptors count without any
> + indirection and without any O(N) search to find the end of a packet stream.
> + For example, a packet transmit descriptor (called vnet_tx_hdr_desc
> + subsequently) to contain a field num_next_desc for the packet stream
> + indicating that a packet is located in N data descriptors.
> +
> +2. Packet transmit descriptor should contain segmentation offload-related fields
> + without any indirection. For example, packet transmit descriptor to contain
> + gso_type, gso_size/mss, header length, csum placement byte offset, and
> + csum start.
> +
> +3. Packet transmit descriptor should be able to place a small size packet that
> + does not have any L4 data after the vnet_tx_hdr_desc in the virtqueue memory.
> + For example a TCP ack only packet can fit in a descriptor memory which
> + otherwise consume more than 25% of metadata to describe the packet.
> +
> +4. Packet transmit descriptor should be able to place a full GSO header (L2 to
> + L4) after header descriptor and before data descriptors. For example, the
> + GSO header is placed after struct vnet_tx_hdr_desc in the virtqueue memory.
> + When such a GSO header is positioned adjacent to the packet transmit
> + descriptor, and when the GSO header is not aligned to 16B, the following
> + data descriptor to start on the 8B aligned boundary.
> +
> +5. An example of the above requirements at high level is:
> +
> +```
> +struct vitio_packed_q_desc {
"virtio_packed_q_desc"
> + /* current desc for reference */
> + u64 address;
> + u32 len;
> + u16 id;
> + u16 flags;
> +};
> +
> +/* Constant size header descriptor for tx packets */
> +struct vnet_tx_hdr_desc {
> + u16 flags; /* indicate how to parse next fields */
> + u16 id; /* desc id to come back in completion */
> + u8 num_next_desc; /* indicates the number of the next 16B data desc for this
> + * buffer.
> + */
> + u8 gso_type;
> + le16 gso_hdr_len;
> + le16 gso_size;
> + le16 csum_start;
> + le16 csum_offset;
> + u8 inline_pkt_len; /* indicates the length of the inline packet after this
> + * desc
> + */
> + u8 reserved;
> + u8 padding[];
> +};
> +
> +/* Example of a short packet or GSO header placed in the desc section of the vq
> + */
> +struct vnet_tx_small_pkt_desc {
> + u8 raw_pkt[128];
> +};
> +
> +/* Example of header followed by data descriptor */
> +struct vnet_tx_hdr_desc hdr_desc;
> +struct vnet_data_desc desc[2];
> +
> +```
> +
> +6. Ability to zero pad the transmit completion when the transmit completion is
> + shorter than the CPU cache line size.
> +
> +7. Ability to place all transmit completion together with it per packet stream
> + transmit timestamp using single PCIe transaction.
The meaning of this is unclear to me. Is it:
The ability to place all transmit completions with a per-packet stream
transmit timestamp using a single PCIe transaction.
?
> +
> +8. A generic feature of the virtqueue, to contain such header data inline for virtio
> + devices other than virtio-net.
Given that this feature is used by this patch (for TX), the following
patch (for RX) and flow filter manipulation, perhaps pull it out as a
distinct requirement.
> +
> +9. A flow filter virtqueue also similarly need the ability to inline the short flow
> + command header.
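
A minimal sketch of how requirements 1 and 4 above avoid an O(N) descriptor
walk: with num_next_desc and inline_pkt_len in the header descriptor, the
space one packet occupies in the virtqueue can be computed in constant time.
The struct mirrors the example from the patch with fixed-width types; the 16B
data descriptor size follows the comment in the patch, while the helper name
and the 8B rounding of the inline region (taken from the alignment rule in
requirement 4) are assumptions of this sketch.

```c
#include <stddef.h>
#include <stdint.h>

#define VNET_DATA_DESC_SIZE 16u /* size of one data descriptor */

/* Mirror of the example tx header descriptor, fixed-width types. */
struct vnet_tx_hdr_desc_sketch {
    uint16_t flags;
    uint16_t id;
    uint8_t  num_next_desc;  /* number of 16B data descriptors */
    uint8_t  gso_type;
    uint16_t gso_hdr_len;
    uint16_t gso_size;
    uint16_t csum_start;
    uint16_t csum_offset;
    uint8_t  inline_pkt_len; /* inline bytes following this descriptor */
    uint8_t  reserved;
};

_Static_assert(sizeof(struct vnet_tx_hdr_desc_sketch) == 16,
               "header descriptor sketch is expected to be 16 bytes");

/*
 * Bytes of virtqueue memory used by one packet: the header descriptor,
 * the inline region rounded up to 8B, then the data descriptors.
 * No descriptor chain is walked to find the end of the packet.
 */
static size_t vnet_tx_pkt_bytes(const struct vnet_tx_hdr_desc_sketch *h)
{
    size_t inline_bytes = ((size_t)h->inline_pkt_len + 7u) & ~(size_t)7u;

    return sizeof(*h) + inline_bytes +
           (size_t)h->num_next_desc * VNET_DATA_DESC_SIZE;
}
```

For example, a 54-byte TCP ACK carried fully inline with no data descriptors
would take 16 + 56 = 72 bytes under these assumptions.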
--
So tap at my window, maybe I might let you in.
^ permalink raw reply [flat|nested] 73+ messages in thread
* Re: [virtio-comment] [PATCH requirements 1/7] net-features: Add requirements document for release 1.4
2023-07-24 3:34 ` [virtio-comment] [PATCH requirements 1/7] net-features: Add requirements document for release 1.4 Parav Pandit
2023-08-08 8:16 ` David Edmondson
@ 2023-08-14 11:56 ` David Edmondson
1 sibling, 0 replies; 73+ messages in thread
From: David Edmondson @ 2023-08-14 11:56 UTC (permalink / raw)
To: Parav Pandit; +Cc: shahafs, hengqi, virtio, virtio-comment
On Monday, 2023-07-24 at 06:34:15 +03, Parav Pandit wrote:
> Add requirements document template for the virtio net features.
>
> Add virtio net device counters visible to driver.
Minor, but perhaps separate the introduction and the statistics into
distinct changes.
> Signed-off-by: Parav Pandit <parav@nvidia.com>
> ---
> changelog:
> v0->v1:
> - removed tx dropped counter
> - updated requirements to mention about virtqueue interface for counters
> query
> ---
> net-workstream/features-1.4.md | 35 ++++++++++++++++++++++++++++++++++
> 1 file changed, 35 insertions(+)
> create mode 100644 net-workstream/features-1.4.md
>
> diff --git a/net-workstream/features-1.4.md b/net-workstream/features-1.4.md
> new file mode 100644
> index 0000000..4c3797b
> --- /dev/null
> +++ b/net-workstream/features-1.4.md
> @@ -0,0 +1,35 @@
> +# 1. Introduction
> +
> +This document describes the overall requirements for virtio net device
> +improvements for upcoming release 1.4. Some of these requirements are
> +interrelated and influence the interface design, hence reviewing them
> +together is desired while updating the virtio net interface.
> +
> +# 2. Summary
> +1. Device counters visible to the driver
> +
> +# 3. Requirements
> +## 3.1 Device counters
> +1. The driver should be able to query the device and/or per vq counters for
> + debugging purpose using a virtqueue directly from driver to device for
> + example using a control vq.
> +2. The driver should be able to query which counters are supported using a
> + virtqueue command, for example using an existing control vq.
> +3. If this device is migrated between two hosts, the driver should be able
> + get the counter values in the destination host from where it was left
> + off in the source host.
Isn't this really "if the driver is migrated"?
I'm not sure of an obvious term for "the abstracted device that
represents an actual device to which this driver is attached".
> +4. If a virtio device is group member device, a group owner should be able
> + to query all the counter attributes using the administration command which
> + a virtio member device will expose via a virtqueue to the driver.
Suggest:
If a virtio device is a group member device, a group owner should be
able to query all of the member device counter attributes and counters
via the group owner device.
> +
> +### 3.1.1 Per receive queue counters
> +1. le64 rx_oversize_pkt_errors: Packet dropped due to receive packet being
> + oversize than the buffer size
> +2. le64 rx_no_buffer_pkt_errors: Packet dropped due to unavailability of the
> + buffer in the receive queue
> +3. le64 rx_gro_pkts: Packets treated as receive GSO sequence by the device
> +4. le64 rx_pkts: Total packets received by the device
> +
> +### 3.1.2 Per transmit queue counters
> +1. le64 tx_gso_pkts: Packets send as transmit GSO sequence
> +2. le64 tx_pkts: Total packets send by the device
The patch from Xuan includes more than this - perhaps include them here
so that we can debate the specifics?
--
Walking upside down in the sky, between the satellites passing by, I'm looking.
^ permalink raw reply [flat|nested] 73+ messages in thread
* Re: [virtio-comment] [PATCH requirements 4/7] net-features: Add notification coalescing requirements
2023-07-24 3:34 ` [virtio-comment] [PATCH requirements 4/7] net-features: Add notification coalescing requirements Parav Pandit
@ 2023-08-14 11:57 ` David Edmondson
0 siblings, 0 replies; 73+ messages in thread
From: David Edmondson @ 2023-08-14 11:57 UTC (permalink / raw)
To: Parav Pandit; +Cc: shahafs, hengqi, virtio, virtio-comment
On Monday, 2023-07-24 at 06:34:18 +03, Parav Pandit wrote:
> Add virtio net device notification coalescing improvements requirements.
>
> Signed-off-by: Parav Pandit <parav@nvidia.com>
Acked-by: David Edmondson <david.edmondson@oracle.com>
> ---
> changelog:
> v1->v2:
> - addressed comments from Stefan
> - redrafted the requirements to use rearm term and avoid queue enable
> confusion
> v0->v1:
> - updated the description
> ---
> net-workstream/features-1.4.md | 11 +++++++++++
> 1 file changed, 11 insertions(+)
>
> diff --git a/net-workstream/features-1.4.md b/net-workstream/features-1.4.md
> index e04727a..27a7886 100644
> --- a/net-workstream/features-1.4.md
> +++ b/net-workstream/features-1.4.md
> @@ -8,6 +8,7 @@ together is desired while updating the virtio net interface.
> # 2. Summary
> 1. Device counters visible to the driver
> 2. Low latency tx and rx virtqueues for PCI transport
> +3. Virtqueue notification coalescing re-arming support
>
> # 3. Requirements
> ## 3.1 Device counters
> @@ -164,3 +165,13 @@ struct vnet_rx_completion {
> which can be recycled by the driver when the packets from the completed
> page is fully consumed.
> 8. The device should be able to consume multiple pages for a receive GSO stream.
> +
> +## 3.3 Virtqueue notification coalescing re-arming support
> +0. Design goal:
> + a. Avoid constant notifications from the device even in conditions when
> + the driver may not have acted on the previous pending notification.
> +1. When Tx and Rx virtqueue notification coalescing is enabled, and when such
> + a notification is reported by the device, the device stops sending further
> + notifications until the driver rearms the notifications of the virtqueue.
> +2. When the driver rearms the notification of the virtqueue, the device
> + to notify again if notification coalescing conditions are met.
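The rearm behaviour in items 1 and 2 above can be sketched as a small state
machine. This is only an illustration under assumptions that are not in the
patch: the struct, its fields and both function names are hypothetical, and the
coalescing condition is reduced to a simple completion-count threshold.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-virtqueue notification state; not defined by the patch. */
struct vq_notify_state {
    bool armed;           /* the driver has (re)armed notifications */
    uint32_t pending;     /* completions since the last notification */
    uint32_t max_pending; /* assumed coalescing threshold, in completions */
};

/*
 * Device side: account for one completion.
 * Returns true if the device should send a notification now.
 */
static inline bool vq_on_completion(struct vq_notify_state *s)
{
    assert(s->max_pending > 0);
    s->pending++;
    if (!s->armed || s->pending < s->max_pending)
        return false;
    s->armed = false; /* item 1: stay silent until the driver rearms */
    s->pending = 0;
    return true;
}

/*
 * Driver side: rearm the virtqueue notifications.
 * Returns true if the coalescing condition is already met, in which case
 * the device notifies again right away (item 2).
 */
static inline bool vq_rearm(struct vq_notify_state *s)
{
    assert(s->max_pending > 0);
    s->armed = true;
    if (s->pending < s->max_pending)
        return false;
    s->armed = false;
    s->pending = 0;
    return true;
}
```

With a threshold of 2, the second completion triggers one notification; further
completions stay silent until vq_rearm() is called, which notifies immediately
if enough completions have accumulated in the meantime.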
--
You know your green from your red.
^ permalink raw reply [flat|nested] 73+ messages in thread
* Re: [virtio-comment] [PATCH requirements 6/7] net-features: Add packet timestamp requirements
2023-07-24 3:34 ` [virtio-comment] [PATCH requirements 6/7] net-features: Add packet timestamp requirements Parav Pandit
2023-08-09 8:35 ` [virtio-comment] Re: [virtio] " Xuan Zhuo
@ 2023-08-14 11:59 ` David Edmondson
1 sibling, 0 replies; 73+ messages in thread
From: David Edmondson @ 2023-08-14 11:59 UTC (permalink / raw)
To: Parav Pandit; +Cc: shahafs, hengqi, virtio, virtio-comment
On Monday, 2023-07-24 at 06:34:20 +03, Parav Pandit wrote:
> Add tx and rx packet timestamp requirements.
>
> Signed-off-by: Parav Pandit <parav@nvidia.com>
Acked-by: David Edmondson <david.edmondson@oracle.com>
> ---
> net-workstream/features-1.4.md | 26 ++++++++++++++++++++++++++
> 1 file changed, 26 insertions(+)
>
> diff --git a/net-workstream/features-1.4.md b/net-workstream/features-1.4.md
> index d228462..37820b6 100644
> --- a/net-workstream/features-1.4.md
> +++ b/net-workstream/features-1.4.md
> @@ -10,6 +10,7 @@ together is desired while updating the virtio net interface.
> 2. Low latency tx and rx virtqueues for PCI transport
> 3. Virtqueue notification coalescing re-arming support
> 4 Virtqueue receive flow filters (RFF)
> +5. Device timestamp for tx and rx packets
>
> # 3. Requirements
> ## 3.1 Device counters
> @@ -280,3 +281,28 @@ struct virtio_net_rff_delete {
> u8 padding[2];
> le32 flow_id;
> };
> +
> +## 3.5 Packet timestamp
> +1. Device should provide transmit timestamp and receive timestamp of the packets
> + at per packet level when the device is enabled.
> +2. Device should provide the current free running clock in the least latency
> + possible using an MMIO register read of 64-bit to have the least jitter.
> +3. Device should provide the current frequency and the frequency unit for the
> + software to synchronize the reference point of software and the device using
> + a control vq command.
> +
> +### 3.5.1 Transmit timestamp
> +1. Transmit completion must contain a packet transmission timestamp when the
> + device is enabled for it.
> +2. The device should record the packet transmit timestamp in the completion at
> + the farthest egress point towards the network.
> +3. The device must provide a transmit packet timestamp in a single DMA
> + transaction along with the rest of the transmit completion fields.
> +
> +### 3.5.2 Receive timestamp
> +1. Receive completion must contain a packet reception timestamp when the device
> + is enabled for it.
> +2. The device should record the received packet timestamp at the closet ingress
> + point of reception from the network.
> +3. The device should provide a receive packet timestamp in a single DMA
> + transaction along with the rest of the receive completion fields.
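For item 3 of section 3.5, software needs the device frequency to turn
free-running clock ticks into time. A minimal sketch, assuming the frequency is
reported in Hz (the patch leaves the unit open) and using a hypothetical helper
name:

```c
#include <assert.h>
#include <stdint.h>

#define VNET_NSEC_PER_SEC 1000000000ull

/*
 * Convert free-running device clock ticks to nanoseconds.
 * freq_hz is assumed to be the device clock frequency in Hz. It must be
 * nonzero and at most UINT64_MAX / 1e9 (about 18.4 GHz) so that the
 * remainder term below cannot overflow. The seconds term can still wrap
 * for tick counts that represent more than roughly 584 years.
 */
static inline uint64_t vnet_ticks_to_ns(uint64_t ticks, uint64_t freq_hz)
{
    assert(freq_hz > 0 && freq_hz <= UINT64_MAX / VNET_NSEC_PER_SEC);
    uint64_t sec = ticks / freq_hz;
    uint64_t rem = ticks % freq_hz;
    return sec * VNET_NSEC_PER_SEC + (rem * VNET_NSEC_PER_SEC) / freq_hz;
}
```

Splitting the tick count into whole seconds and a remainder avoids the overflow
that a direct ticks * 1e9 / freq_hz would hit after about 18 seconds at 1 GHz.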
--
Do not leave the building.
^ permalink raw reply [flat|nested] 73+ messages in thread
* Re: [virtio-comment] [PATCH requirements 7/7] net-features: Add header data split requirements
2023-07-24 3:34 ` [virtio-comment] [PATCH requirements 7/7] net-features: Add header data split requirements Parav Pandit
2023-08-10 19:19 ` [virtio-comment] RE: [EXT] [virtio] " Satananda Burla
@ 2023-08-14 12:00 ` David Edmondson
[not found] ` <CA+FuTSeguCKk4zxZ0=Ebr1phZhF9kssHeGPn2eZj6SRNv2ewsA@mail.gmail.com>
1 sibling, 1 reply; 73+ messages in thread
From: David Edmondson @ 2023-08-14 12:00 UTC (permalink / raw)
To: Parav Pandit; +Cc: shahafs, hengqi, virtio, virtio-comment
On Monday, 2023-07-24 at 06:34:21 +03, Parav Pandit wrote:
> Add header data split requirements for the receive packets.
>
> Signed-off-by: Parav Pandit <parav@nvidia.com>
> ---
> net-workstream/features-1.4.md | 13 +++++++++++++
> 1 file changed, 13 insertions(+)
>
> diff --git a/net-workstream/features-1.4.md b/net-workstream/features-1.4.md
> index 37820b6..a64e356 100644
> --- a/net-workstream/features-1.4.md
> +++ b/net-workstream/features-1.4.md
> @@ -11,6 +11,7 @@ together is desired while updating the virtio net interface.
> 3. Virtqueue notification coalescing re-arming support
> 4 Virtqueue receive flow filters (RFF)
> 5. Device timestamp for tx and rx packets
> +6. Header data split for the receive virtqueue
>
> # 3. Requirements
> ## 3.1 Device counters
> @@ -306,3 +307,15 @@ struct virtio_net_rff_delete {
> point of reception from the network.
> 3. The device should provide a receive packet timestamp in a single DMA
> transaction along with the rest of the receive completion fields.
> +
> +## 3.6 Header data split for the receive virtqueue
> +1. The device should be able to DMA the packet header and data to two different
> + memory locations, this enables driver and networking stack to perform zero
> + copy to application buffer(s).
> +2. The driver should be able to configure maximum header buffer size per
> + virtqueue.
> +3. The header buffer to be in a physically contiguous memory per virtqueue
> +4. The device should be able to indicate header data split in the receive
> + completion.
> +5. The device should be able to zero pad the header buffer when the received
> + header is shorter than cpu cache line size.
What's the use case for this (item 5)?
--
And now I know what every step is for.
^ permalink raw reply [flat|nested] 73+ messages in thread
* [virtio-comment] RE: [virtio] [PATCH requirements 6/7] net-features: Add packet timestamp requirements
2023-08-09 8:35 ` [virtio-comment] Re: [virtio] " Xuan Zhuo
2023-08-10 6:56 ` Jason Wang
@ 2023-08-14 13:06 ` Parav Pandit
2023-08-15 2:47 ` [virtio-comment] " Xuan Zhuo
1 sibling, 1 reply; 73+ messages in thread
From: Parav Pandit @ 2023-08-14 13:06 UTC (permalink / raw)
To: Xuan Zhuo
Cc: Shahaf Shuler, hengqi@linux.alibaba.com,
virtio@lists.oasis-open.org, virtio-comment@lists.oasis-open.org
> From: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> Sent: Wednesday, August 9, 2023 2:06 PM
> struct vring_used_elem {
> /* Index of start of used descriptor chain. */
> __virtio32 id;
> /* Total length of the descriptor chain which was used (written to) */
> __virtio32 len;
>
> + __virtio64 timestamp;
> };
>
>
> Then, the existing devices can support this easily. If we introduce this by the
> new desc structure, we can foresee that this function will not be implemented
> by many existing machines. But this function is useful. So we want support this
> by a simple way.
This only works for split q.
Packed q needs yet another format.
And even after that, we still have to live with the other limitations listed in the other requirements.
Therefore it's better to do the new desc definition one time, with a single descriptor format that optionally supports the timestamp.
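As a rough sketch of what "a single descriptor format that optionally supports
the timestamp" could look like: the struct, the flag and the helper below are
hypothetical names for illustration only, not a proposed layout, and fields are
kept in host byte order for simplicity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag: the completion carries a valid timestamp. */
#define VNET_CMPL_F_TIMESTAMP 0x1u

/* Hypothetical fixed-size completion, independent of split or packed layout. */
struct vnet_cmpl_sketch {
    uint16_t flags;     /* tells the driver how to parse the rest */
    uint16_t id;        /* buffer id being completed */
    uint32_t len;       /* bytes written by the device */
    uint64_t timestamp; /* valid only if VNET_CMPL_F_TIMESTAMP is set */
};

static_assert(sizeof(struct vnet_cmpl_sketch) == 16,
              "completion sketch is expected to be 16 bytes");

/* Returns true and stores the timestamp in *ts only when one is present. */
static inline bool vnet_cmpl_get_timestamp(const struct vnet_cmpl_sketch *c,
                                           uint64_t *ts)
{
    if (!(c->flags & VNET_CMPL_F_TIMESTAMP))
        return false;
    *ts = c->timestamp;
    return true;
}
```

The point of the flag is that one completion layout serves both the
timestamp-enabled and the timestamp-disabled case.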
^ permalink raw reply [flat|nested] 73+ messages in thread
* [virtio-comment] Re: [virtio] Re: [virtio-comment] [PATCH requirements 7/7] net-features: Add header data split requirements
[not found] ` <CA+FuTSeguCKk4zxZ0=Ebr1phZhF9kssHeGPn2eZj6SRNv2ewsA@mail.gmail.com>
@ 2023-08-14 13:09 ` David Edmondson
2023-08-14 13:28 ` [virtio-comment] " Parav Pandit
0 siblings, 1 reply; 73+ messages in thread
From: David Edmondson @ 2023-08-14 13:09 UTC (permalink / raw)
To: Willem de Bruijn; +Cc: Parav Pandit, shahafs, hengqi, virtio, virtio-comment
On Monday, 2023-08-14 at 08:44:11 -04, Willem de Bruijn wrote:
> On Mon, Aug 14, 2023 at 8:01 AM David Edmondson
> <david.edmondson@oracle.com> wrote:
>>
>>
>> On Monday, 2023-07-24 at 06:34:21 +03, Parav Pandit wrote:
>> > Add header data split requirements for the receive packets.
>> >
>> > Signed-off-by: Parav Pandit <parav@nvidia.com>
>> > ---
>> > net-workstream/features-1.4.md | 13 +++++++++++++
>> > 1 file changed, 13 insertions(+)
>> >
>> > diff --git a/net-workstream/features-1.4.md b/net-workstream/features-1.4.md
>> > index 37820b6..a64e356 100644
>> > --- a/net-workstream/features-1.4.md
>> > +++ b/net-workstream/features-1.4.md
>> > @@ -11,6 +11,7 @@ together is desired while updating the virtio net interface.
>> > 3. Virtqueue notification coalescing re-arming support
>> > 4 Virtqueue receive flow filters (RFF)
>> > 5. Device timestamp for tx and rx packets
>> > +6. Header data split for the receive virtqueue
>> >
>> > # 3. Requirements
>> > ## 3.1 Device counters
>> > @@ -306,3 +307,15 @@ struct virtio_net_rff_delete {
>> > point of reception from the network.
>> > 3. The device should provide a receive packet timestamp in a single DMA
>> > transaction along with the rest of the receive completion fields.
>> > +
>> > +## 3.6 Header data split for the receive virtqueue
>> > +1. The device should be able to DMA the packet header and data to two different
>> > + memory locations, this enables driver and networking stack to perform zero
>> > + copy to application buffer(s).
>> > +2. The driver should be able to configure maximum header buffer size per
>> > + virtqueue.
>> > +3. The header buffer to be in a physically contiguous memory per virtqueue
>> > +4. The device should be able to indicate header data split in the receive
>> > + completion.
>> > +5. The device should be able to zero pad the header buffer when the received
>> > + header is shorter than cpu cache line size.
>>
>> What's the use case for this (item 5)?
>
> Without zero padding, each header write results in a
> read-modify-write, possibly over PCIe. That can significantly depress
> throughput.
Understood. So the padding could be anything; we just want to write a
full cache line.
--
Woke up in my clothes again this morning, don't know exactly where I am.
^ permalink raw reply [flat|nested] 73+ messages in thread
* [virtio-comment] RE: [virtio] Re: [virtio-comment] [PATCH requirements 7/7] net-features: Add header data split requirements
2023-08-14 13:09 ` [virtio-comment] Re: [virtio] " David Edmondson
@ 2023-08-14 13:28 ` Parav Pandit
2023-08-14 13:56 ` [virtio-comment] " David Edmondson
0 siblings, 1 reply; 73+ messages in thread
From: Parav Pandit @ 2023-08-14 13:28 UTC (permalink / raw)
To: David Edmondson, Willem de Bruijn
Cc: Shahaf Shuler, hengqi@linux.alibaba.com,
virtio@lists.oasis-open.org, virtio-comment@lists.oasis-open.org
> From: David Edmondson <david.edmondson@oracle.com>
> Sent: Monday, August 14, 2023 6:40 PM
> >> > +5. The device should be able to zero pad the header buffer when the received
> >> > + header is shorter than cpu cache line size.
> >>
> >> What's the use case for this (item 5)?
> >
> > Without zero padding, each header write results in a
> > read-modify-write, possibly over PCIe. That can significantly depress
> > throughput.
>
> Understood. So it could be anything padding, we just want to write a full cache
> line.
Yes. If the data descriptor is partial, it needs to be zero-filled too.
I will double-check whether I have already covered this.
Still catching up after a break.
^ permalink raw reply [flat|nested] 73+ messages in thread
* [virtio-comment] Re: [virtio] Re: [virtio-comment] [PATCH requirements 7/7] net-features: Add header data split requirements
2023-08-14 13:28 ` [virtio-comment] " Parav Pandit
@ 2023-08-14 13:56 ` David Edmondson
2023-08-15 4:41 ` [virtio-comment] " Parav Pandit
0 siblings, 1 reply; 73+ messages in thread
From: David Edmondson @ 2023-08-14 13:56 UTC (permalink / raw)
To: Parav Pandit
Cc: Willem de Bruijn, Shahaf Shuler, hengqi@linux.alibaba.com,
virtio@lists.oasis-open.org, virtio-comment@lists.oasis-open.org
On Monday, 2023-08-14 at 13:28:52 UTC, Parav Pandit wrote:
>> From: David Edmondson <david.edmondson@oracle.com>
>> Sent: Monday, August 14, 2023 6:40 PM
>
>> >> > +5. The device should be able to zero pad the header buffer when the received
>> >> > + header is shorter than cpu cache line size.
>> >>
>> >> What's the use case for this (item 5)?
>> >
>> > Without zero padding, each header write results in a
>> > read-modify-write, possibly over PCIe. That can significantly depress
>> > throughput.
>>
>> Understood. So it could be anything padding, we just want to write a full cache
>> line.
> Yes. if the data descriptor is partial, need to zero fill too.
> I will double check if I covered already or not.
Perhaps the requirement, then, should be that the device is permitted
(and encouraged) to write full cache lines, with an aside that zero
padding should be used to achieve this.
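A minimal sketch of that "write full cache lines, zero padded" behaviour, as a
device model might do it in software. The helper names are hypothetical, and
the cache line size is passed in as a parameter because nothing in this thread
fixes how it would be negotiated.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Round len up to a multiple of line; line must be a power of two. */
static inline size_t vnet_align_up(size_t len, size_t line)
{
    assert(line != 0 && (line & (line - 1)) == 0);
    return (len + line - 1) & ~(line - 1);
}

/*
 * Zero-fill the tail of buf so that a write of *padded_len bytes covers
 * whole cache lines. buf holds len valid bytes and has room for cap bytes.
 * Returns false, leaving buf untouched, if the padded size exceeds cap.
 */
static inline bool vnet_zero_pad(uint8_t *buf, size_t len, size_t cap,
                                 size_t line, size_t *padded_len)
{
    size_t padded = vnet_align_up(len, line);

    if (padded > cap)
        return false;
    memset(buf + len, 0, padded - len);
    *padded_len = padded;
    return true;
}
```

For a 54-byte header and a 64-byte line this fills bytes 54..63 with zeros, so
the resulting 64-byte write needs no read-modify-write.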
--
No proper time of day.
^ permalink raw reply [flat|nested] 73+ messages in thread
* [virtio-comment] Re: RE: [virtio] [PATCH requirements 6/7] net-features: Add packet timestamp requirements
2023-08-14 13:06 ` [virtio-comment] " Parav Pandit
@ 2023-08-15 2:47 ` Xuan Zhuo
2023-08-15 4:01 ` [virtio-comment] " Parav Pandit
0 siblings, 1 reply; 73+ messages in thread
From: Xuan Zhuo @ 2023-08-15 2:47 UTC (permalink / raw)
To: Parav Pandit
Cc: Shahaf Shuler, hengqi@linux.alibaba.com,
virtio@lists.oasis-open.org, virtio-comment@lists.oasis-open.org
On Mon, 14 Aug 2023 13:06:03 +0000, Parav Pandit <parav@nvidia.com> wrote:
>
>
> > From: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> > Sent: Wednesday, August 9, 2023 2:06 PM
> > struct vring_used_elem {
> > /* Index of start of used descriptor chain. */
> > __virtio32 id;
> > /* Total length of the descriptor chain which was used (written to) */
> > __virtio32 len;
> >
> > + __virtio64 timestamp;
> > };
> >
> >
> > Then, the existing devices can support this easily. If we introduce this by the
> > new desc structure, we can foresee that this function will not be implemented
> > by many existing machines. But this function is useful. So we want support this
> > by a simple way.
>
> This only works for split q.
YES.
> Packed q needs yet another format.
> And even after that we still have to live with other limitations listed in other requirements.
I would like some simple changes for the tx timestamp.
>
> Therefore its better to do the new desc definition one time which enables to optionally support timestamp, using single descriptor format.
I agree.
But we can have both. In addition to your plans, I would like to introduce a
simple method that can be used with existing machines.
Thanks.
^ permalink raw reply [flat|nested] 73+ messages in thread
* [virtio-comment] RE: RE: [virtio] [PATCH requirements 6/7] net-features: Add packet timestamp requirements
2023-08-15 2:47 ` [virtio-comment] " Xuan Zhuo
@ 2023-08-15 4:01 ` Parav Pandit
2023-08-15 6:01 ` [virtio-comment] " Xuan Zhuo
0 siblings, 1 reply; 73+ messages in thread
From: Parav Pandit @ 2023-08-15 4:01 UTC (permalink / raw)
To: Xuan Zhuo
Cc: Shahaf Shuler, hengqi@linux.alibaba.com,
virtio@lists.oasis-open.org, virtio-comment@lists.oasis-open.org
> From: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> Sent: Tuesday, August 15, 2023 8:17 AM
>
> > Packed q needs yet another format.
> > And even after that we still have to live with other limitations listed in other requirements.
>
> I would like some simple changes for the tx timestamp.
>
> >
> > Therefore its better to do the new desc definition one time which enables to optionally support timestamp, using single descriptor format.
>
>
> I agree.
>
> But we can have too. In addition to your plans, I would like to introduce a simple
> method that can be used with existing machines.
>
Can you please explain what an "existing machine" is?
This is a change in the driver-device interface,
so one needs a new driver and a device extension anyway.
^ permalink raw reply [flat|nested] 73+ messages in thread
* [virtio-comment] RE: [virtio] Re: [virtio-comment] [PATCH requirements 7/7] net-features: Add header data split requirements
2023-08-14 13:56 ` [virtio-comment] " David Edmondson
@ 2023-08-15 4:41 ` Parav Pandit
0 siblings, 0 replies; 73+ messages in thread
From: Parav Pandit @ 2023-08-15 4:41 UTC (permalink / raw)
To: David Edmondson
Cc: Willem de Bruijn, Shahaf Shuler, hengqi@linux.alibaba.com,
virtio@lists.oasis-open.org, virtio-comment@lists.oasis-open.org
> From: David Edmondson <david.edmondson@oracle.com>
> Sent: Monday, August 14, 2023 7:26 PM
> Perhaps the requirement, then, should be that the device is permitted (and
> encouraged) to write full cache lines, with an aside that zero padding should be
> used to achieve this.
Yes, we can relax this.
I think the point, from the driver->device interface point of view, is to communicate these details and to use an aligned offset within the header buffer for subsequent packet entries,
so that the driver knows where to expect the next packet's header (at a cache-aligned address).
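To illustrate, the driver could locate the next header like this. It is a
sketch only: the helper name is hypothetical, and it assumes headers are packed
back to back in one header buffer with each entry starting on a cache line
boundary.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Given the offset of the current header in the header buffer and its
 * length, return the offset at which the next packet's header starts.
 * cur_off must already be aligned to line, hdr_len must be nonzero,
 * and line must be a power of two.
 */
static inline size_t vnet_next_hdr_off(size_t cur_off, size_t hdr_len,
                                       size_t line)
{
    assert(line != 0 && (line & (line - 1)) == 0);
    assert(cur_off % line == 0);
    assert(hdr_len > 0);
    return (cur_off + hdr_len + line - 1) & ~(line - 1);
}
```

With 64-byte lines, a 54-byte header at offset 0 puts the next header at offset
64, and a 66-byte header at offset 128 puts the next one at offset 256.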
^ permalink raw reply [flat|nested] 73+ messages in thread
* RE: [virtio-comment] [PATCH requirements 3/7] net-features: Add low latency receive queue requirements
2023-08-14 11:54 ` David Edmondson
@ 2023-08-15 4:45 ` Parav Pandit
0 siblings, 0 replies; 73+ messages in thread
From: Parav Pandit @ 2023-08-15 4:45 UTC (permalink / raw)
To: David Edmondson
Cc: Shahaf Shuler, hengqi@linux.alibaba.com,
virtio@lists.oasis-open.org, virtio-comment@lists.oasis-open.org
> From: David Edmondson <david.edmondson@oracle.com>
> Sent: Monday, August 14, 2023 5:25 PM
>
> On Monday, 2023-07-24 at 06:34:17 +03, Parav Pandit wrote:
> > Add requirements for the low latency receive queue.
> >
> > Signed-off-by: Parav Pandit <parav@nvidia.com>
> > ---
> > changelog:
> > v0->v1:
> > - clarified the requirements further
> > - added line for the gro case
> > - added design goals as the motivation for the requirements
> > ---
> > net-workstream/features-1.4.md | 45 +++++++++++++++++++++++++++++++++-
> > 1 file changed, 44 insertions(+), 1 deletion(-)
> >
> > diff --git a/net-workstream/features-1.4.md b/net-workstream/features-1.4.md
> > index eb95592..e04727a 100644
> > --- a/net-workstream/features-1.4.md
> > +++ b/net-workstream/features-1.4.md
> > @@ -7,7 +7,7 @@ together is desired while updating the virtio net interface.
> >
> > # 2. Summary
> > 1. Device counters visible to the driver
> > -2. Low latency tx virtqueue for PCI transport
> > +2. Low latency tx and rx virtqueues for PCI transport
> >
> > # 3. Requirements
> > ## 3.1 Device counters
> > @@ -121,3 +121,46 @@ struct vnet_data_desc desc[2];
> >
> > 9. A flow filter virtqueue also similarly need the ability to inline the short flow
> > command header.
> > +
> > +### 3.2.2 Low latency rx virtqueue
> > +0. Design goal:
> > + a. Keep packet metadata and buffer data together which is consumed by
> driver
> > + layer and make it available in a single cache line of cpu
> > + b. Instead of having per packet descriptors which is complex to scale for
> > + the device, supply the page directly to the device to consume it based
> > + on packet size
>
> Really "per packet descriptor buffers"?
>
Yes, this is what is done today with both packed and split virtqueues, with mergeable buffers or without.
Every 64B to 1500B packet consumes one descriptor (one descriptor per packet).
Today the driver takes a page and splits it into descriptors, and the device ends up assembling them back in a tedious process.
_Instead_ of doing that, a better scheme is to supply the page directly and let the device tell how much was written and where it was placed.
So there is no segmentation and reassembly.
Only segmentation by the device.
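To make "only segmentation by the device" concrete, here is a rough sketch of how a device could carve packets out of one posted page. Everything in it is hypothetical and not part of the patch: the names `rx_page_cursor` and `rx_page_place`, the 4096-byte page, and the 64-byte placement alignment. It also assumes a packet fits in a single page; packets spanning pages are not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define RX_PAGE_SIZE 4096u /* assumed constant buffer size for the rx queue */
#define RX_PKT_ALIGN 64u   /* assumed placement alignment inside the page */

/* Hypothetical device-side state for one page posted by the driver. */
struct rx_page_cursor {
    uint16_t buf_id;   /* buffer id reported in each completion */
    uint32_t next_off; /* next free byte in the page */
};

/*
 * Try to place a packet of pkt_len bytes in the page.
 * On success, store the placement offset in *off and return true.
 * Return false when the page lacks room; the device would then report
 * the page as fully consumed and continue with the next posted page.
 */
static bool rx_page_place(struct rx_page_cursor *c, uint32_t pkt_len,
                          uint32_t *off)
{
    assert(c->next_off <= RX_PAGE_SIZE);

    if (pkt_len == 0 || pkt_len > RX_PAGE_SIZE - c->next_off)
        return false;

    *off = c->next_off;
    /* Round the next free offset up to the placement alignment. */
    c->next_off = (c->next_off + pkt_len + RX_PKT_ALIGN - 1) &
                  ~(RX_PKT_ALIGN - 1);
    return true;
}
```

The driver posts one page and never splits it; each completion then only needs the buffer id plus the offset and length chosen by the device.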
> > +1. The device should be able to write a packet receive completion that
> consists
> > + of struct virtio_net_hdr (or similar) and a buffer id using a single DMA write
> > + PCIe TLP.
> > +2. The device should be able to perform DMA writes of multiple packets
> > + completions in a single DMA transaction up to the PCIe maximum write
> limit
> > + in a transaction.
> > +3. The device should be able to zero pad packet write completion to align it
> to
> > + 64B or CPU cache line size whenever possible.
> > +4. An example of the above DMA completion structure:
> > +
> > +```
> > +/* Constant size receive packet completion */
> > +struct vnet_rx_completion {
> > + u16 flags;
> > + u16 id; /* buffer id */
> > + u8 gso_type;
> > + u8 reserved[3];
> > + le16 gso_hdr_len;
> > + le16 gso_size;
> > + le16 csum_start;
> > + le16 csum_offset;
> > + u16 reserved2;
> > + u64 timestamp; /* explained later */
> > + u8 padding[];
> > +};
> > +```
> > +5. The driver should be able to post constant-size buffer pages on a receive
> > + queue which can be consumed by the device for an incoming packet of any
> size
> > + from 64B to 9K bytes.
> > +6. The device should be able to know the constant buffer size at receive
> > + virtqueue level instead of per buffer level.
> > +7. The device should be able to indicate when a full page buffer is consumed,
> > + which can be recycled by the driver when the packets from the completed
> > + page is fully consumed.
>
> s/is full consumed/are fully consumed/
>
Ack. Will fix.
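For illustration, the example completion from item 4 can be written as a fixed 64-byte C structure, matching the zero padding described in item 3. This is only a sketch: the 64-byte size, the explicit `reserved3` field, and the fixed-length `padding` are assumptions made here and are not part of the patch. The le16 fields are shown as plain `uint16_t`, so endianness conversion is omitted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define VNET_RX_COMPLETION_SIZE 64u /* assumed CPU cache line size */

/* Constant size receive packet completion, padded to one cache line. */
struct vnet_rx_completion {
    uint16_t flags;
    uint16_t id;           /* buffer id */
    uint8_t  gso_type;
    uint8_t  reserved[3];
    uint16_t gso_hdr_len;  /* le16 in the proposal */
    uint16_t gso_size;     /* le16 in the proposal */
    uint16_t csum_start;   /* le16 in the proposal */
    uint16_t csum_offset;  /* le16 in the proposal */
    uint16_t reserved2;
    uint8_t  reserved3[6]; /* hypothetical: explicit hole so timestamp is 8-byte aligned */
    uint64_t timestamp;
    uint8_t  padding[32];  /* hypothetical: zero pad up to 64 bytes */
};

static_assert(offsetof(struct vnet_rx_completion, timestamp) == 24,
              "timestamp should start at byte 24");
static_assert(sizeof(struct vnet_rx_completion) == VNET_RX_COMPLETION_SIZE,
              "completion should fill exactly one cache line");
```

Note that the fields before `timestamp` occupy 18 bytes, so on common 64-bit ABIs a compiler would silently insert 6 bytes before the 8-byte aligned `timestamp`. Making that hole explicit keeps the wire layout independent of the compiler.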
^ permalink raw reply [flat|nested] 73+ messages in thread
* [virtio-comment] RE: [EXT] [virtio] [PATCH requirements 2/7] net-features: Add low latency transmit queue requirements
2023-08-10 19:05 ` [virtio-comment] RE: [EXT] [virtio] " Satananda Burla
@ 2023-08-15 5:51 ` Parav Pandit
0 siblings, 0 replies; 73+ messages in thread
From: Parav Pandit @ 2023-08-15 5:51 UTC (permalink / raw)
To: Satananda Burla, virtio-comment@lists.oasis-open.org
Cc: Shahaf Shuler, hengqi@linux.alibaba.com,
virtio@lists.oasis-open.org
> From: Satananda Burla <sburla@marvell.com>
> Sent: Friday, August 11, 2023 12:35 AM
> > +6. Ability to zero pad the transmit completion when the transmit
> > completion is
> > + shorter than the CPU cache line size.
> Did you mean that the last completion in a cache line will be padded ?
> For ex if a transmit completion is 16 byte, you could fit 4 of them into a
> cacheline. But if device has got only 3 completions to post, it would post
> 3 and pad the last 16 bytes.
Right. The device can choose to pad the last 16 bytes in the above example.
If only 2 completions are written, it can still choose to pad the remaining 2 slots.
> I hope you did not mean that every transmit
> completion is padded to cacheline size.
Right, I didn't mean that.
David had better wording to describe this, so I will use his suggestion.
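The padding rule discussed above can be sketched as a small helper. The 64-byte cache line and the 16-byte completion size are taken from the example in this mail and are otherwise assumptions; the function name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Values from the example in this thread; assumed, not defined by the spec. */
#define CACHE_LINE_SIZE    64u
#define TX_COMPLETION_SIZE 16u

static_assert(CACHE_LINE_SIZE % TX_COMPLETION_SIZE == 0,
              "completions must pack evenly into a cache line");

/*
 * Return the number of zero bytes the device may append after writing
 * `count` back-to-back completions so that the write ends on a cache
 * line boundary. Returns 0 when the batch already ends on a boundary.
 */
static size_t tx_completion_pad_bytes(size_t count)
{
    size_t used = (count * TX_COMPLETION_SIZE) % CACHE_LINE_SIZE;

    return used ? CACHE_LINE_SIZE - used : 0;
}
```

With 3 completions the helper returns 16 and with 2 completions it returns 32, matching the two cases above. Individual completions are never padded to a full cache line.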
^ permalink raw reply [flat|nested] 73+ messages in thread
* [virtio-comment] Re: RE: RE: [virtio] [PATCH requirements 6/7] net-features: Add packet timestamp requirements
2023-08-15 4:01 ` [virtio-comment] " Parav Pandit
@ 2023-08-15 6:01 ` Xuan Zhuo
2023-08-15 6:09 ` [virtio-comment] " Parav Pandit
0 siblings, 1 reply; 73+ messages in thread
From: Xuan Zhuo @ 2023-08-15 6:01 UTC (permalink / raw)
To: Parav Pandit
Cc: Shahaf Shuler, hengqi@linux.alibaba.com,
virtio@lists.oasis-open.org, virtio-comment@lists.oasis-open.org
On Tue, 15 Aug 2023 04:01:25 +0000, Parav Pandit <parav@nvidia.com> wrote:
>
>
> > From: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> > Sent: Tuesday, August 15, 2023 8:17 AM
> >
> > > Packed q needs yet another format.
> > > And even after that we still have to live with other limitations listed in other
> > requirements.
> >
> > I would like some simple changes for the tx timestamp.
> >
> > >
> > > Therefore its better to do the new desc definition one time which enables to
> > optionally support timestamp, using single descriptor format.
> >
> >
> > I agree.
> >
> > But we can have too. In addition to your plans, I would like to introduce a simple
> > method that can be used with existing machines.
> >
> Can you please explain what is a "existing machine"?
> This is the change in driver device interface.
> So one needs new driver and device extension anyway.
The devices that are running now.
If the change for the timestamp is small, it will be easy for these devices
to update.
The timestamp is so useful that we want to introduce this feature to these devices.
I have seen your new descriptor format; for the existing devices, it will be high risk to
update, and they may not need the new descriptor.
Thanks.
^ permalink raw reply [flat|nested] 73+ messages in thread
* [virtio-comment] RE: RE: RE: [virtio] [PATCH requirements 6/7] net-features: Add packet timestamp requirements
2023-08-15 6:01 ` [virtio-comment] " Xuan Zhuo
@ 2023-08-15 6:09 ` Parav Pandit
2023-08-15 9:44 ` [virtio-comment] " Xuan Zhuo
0 siblings, 1 reply; 73+ messages in thread
From: Parav Pandit @ 2023-08-15 6:09 UTC (permalink / raw)
To: Xuan Zhuo
Cc: Shahaf Shuler, hengqi@linux.alibaba.com,
virtio@lists.oasis-open.org, virtio-comment@lists.oasis-open.org
> From: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> Sent: Tuesday, August 15, 2023 11:31 AM
> > Can you please explain what is a "existing machine"?
> > This is the change in driver device interface.
> > So one needs new driver and device extension anyway.
>
>
> The devices that are running now.
>
>
> If the change for timestamp is small, then that will be easy for these devices to
> update.
>
> But the timestamp is so usefull, we want introduce this feature to these devices.
>
So let's work towards getting the new descriptors done more quickly and prioritize that.
> I see your new description, for the existing devices, that will be high risk to
> update. And the new description maybe not be needed for they.
Let's mitigate the risk by working towards new descriptors that solve multiple problems.
Otherwise it is all double work for every layer: spec, device, driver and more.
I agree that I was slowed down in the last 1.5 months due to personal issues, but I am now in a better position to speed up the new format.
^ permalink raw reply [flat|nested] 73+ messages in thread
* RE: [virtio-comment] Re: [virtio] [PATCH requirements 6/7] net-features: Add packet timestamp requirements
2023-08-10 6:56 ` Jason Wang
@ 2023-08-15 6:13 ` Parav Pandit
[not found] ` <CAF=yD-+LMY3yE3qtd4vHc8CGOz6UAf4njM2QiZcajrQgL=KZRQ@mail.gmail.com>
1 sibling, 0 replies; 73+ messages in thread
From: Parav Pandit @ 2023-08-15 6:13 UTC (permalink / raw)
To: Jason Wang, Xuan Zhuo
Cc: Shahaf Shuler, hengqi@linux.alibaba.com,
virtio@lists.oasis-open.org, virtio-comment@lists.oasis-open.org,
Willem de Bruijn
> From: Jason Wang <jasowang@redhat.com>
> Sent: Thursday, August 10, 2023 12:26 PM
> I think this could be one way and another proposal from Willem is:
>
> https://lists.linuxfoundation.org/pipermail/virtualization/2021-
> February/052422.html
I reviewed this a while ago already.
It does not fulfill one of the requirements.
Mainly it incurs yet additional DMA for timestamp that must be avoided.
Xuan's idea of combining it into the used element is better than the above; however, it is not enough.
So let's progress on the new descriptor and completion format that addresses several of these requirements.
This is the main reason to look at all the requirements together rather than doing piecemeal work on a single feature.
^ permalink raw reply [flat|nested] 73+ messages in thread
* RE: [virtio-comment] Re: [virtio] [PATCH requirements 6/7] net-features: Add packet timestamp requirements
[not found] ` <CAF=yD-+LMY3yE3qtd4vHc8CGOz6UAf4njM2QiZcajrQgL=KZRQ@mail.gmail.com>
2023-08-14 2:54 ` Jason Wang
@ 2023-08-15 6:26 ` Parav Pandit
[not found] ` <CAF=yD-LXtrQeW0GnTR0BeDuExN5aBLC4dGEfdWbjtxmhNY2G6g@mail.gmail.com>
1 sibling, 1 reply; 73+ messages in thread
From: Parav Pandit @ 2023-08-15 6:26 UTC (permalink / raw)
To: Willem de Bruijn, Jason Wang
Cc: Xuan Zhuo, Shahaf Shuler, hengqi@linux.alibaba.com,
virtio@lists.oasis-open.org, virtio-comment@lists.oasis-open.org,
Willem de Bruijn
Hi Willem,
> From: Willem de Bruijn <willemdebruijn.kernel@gmail.com>
> Sent: Friday, August 11, 2023 2:58 AM
>
> For synchronization, a single 64-bit MMIO read is not necessarily the fastest
> when traversing real hardware boundaries, or sufficient. PCIe has Precision Time
> Measurement (PTM) hardware logic to capture clock measurement with less
> uncertainty and possibly latency, for instance.
> Indeed, uncertainty is more important than raw latency.
>
PTM would be next in line for us to support on virtio, mainly for PCIe PFs.
> FWIW, we're also trying to capture similar requires for physical devices through
> the Open Compute Platform NIC Core Offloads effort:
> https://www.opencompute.org/w/index.php?title=Core_Offloads#Timestampin
> g
>
Thanks for the above pointer and your inputs; we are considering many of them in the 1.4 timeframe.
Many are post 1.4, so that one can also implement them in a practical time frame :)
> Virtio defines both a virtual interface and one that can be supported by
> hardware, so it's not a 1:1 match, but many subtle points probably apply to
> both.
>
Yes, it does. The proposal here is for the first phase to support both, so the precision may not be at nsec level but at usec level for semi-virtual interfaces.
We would progress towards supporting PTM after that.
>
> For timestamping, the difficult part is the transmit completion stamp,
> asynchronous with data flow. If this is taken on the device before or at the time
> of queuing the completion to the host, then it can be stored in the completion
> descriptor.
At this point we are considering taking the timestamp before the PHY and after the queuing, as listed in this doc,
so that the completion can hold it.
> But if taken at the PHY, say, it is not uncommon for this to happen
> after the completion is written. And instead a block of dedicated registers is
> used and the host must poll a register. There are examples of these in the Linux
> device driver directory. For virtio, it is probably sufficient to only support the
> first kind.
>
We do not have a good abstraction of the PHY yet, and it is a bit hard to adapt to the PHY when it is a mix of virtual and physical.
So the current plan is to place the timestamps in the completions for tx and rx.
>
> A third aspect is the control path: querying and configuring hardware
> timestamps (siocgstamp/siosgstamp). If hardware is capable, easiest is to just
> timestamp every packet. Much hardware out there targets low rate PTP
> messages. But a virtualized virtio device could timestamp all. In which case this
> control API can maybe be punted on.
A virtualized virtio device will be able to do most of the control and data path timestamping that hw can do,
likely at lower precision or accuracy... :)
^ permalink raw reply [flat|nested] 73+ messages in thread
* [virtio-comment] Re: RE: RE: RE: [virtio] [PATCH requirements 6/7] net-features: Add packet timestamp requirements
2023-08-15 6:09 ` [virtio-comment] " Parav Pandit
@ 2023-08-15 9:44 ` Xuan Zhuo
0 siblings, 0 replies; 73+ messages in thread
From: Xuan Zhuo @ 2023-08-15 9:44 UTC (permalink / raw)
To: Parav Pandit
Cc: Shahaf Shuler, hengqi@linux.alibaba.com,
virtio@lists.oasis-open.org, virtio-comment@lists.oasis-open.org
On Tue, 15 Aug 2023 06:09:48 +0000, Parav Pandit <parav@nvidia.com> wrote:
> > From: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> > Sent: Tuesday, August 15, 2023 11:31 AM
>
> > > Can you please explain what is a "existing machine"?
> > > This is the change in driver device interface.
> > > So one needs new driver and device extension anyway.
> >
> >
> > The devices that are running now.
> >
> >
> > If the change for timestamp is small, then that will be easy for these devices to
> > update.
> >
> > But the timestamp is so usefull, we want introduce this feature to these devices.
> >
> So lets work towards getting the new descriptors more quickly and prioritize it.
>
> > I see your new description, for the existing devices, that will be high risk to
> > update. And the new description maybe not be needed for they.
>
> Lets mitigate the risk by working towards making the new descriptors that solves multiple problems.
> Otherwise its all double work for all the layers from spec, device, driver and more.
>
> I agree that I got slowed down in last 1.5 months due to personal issues, but I am better now to speed up the new format.
I am happy to hear that you are healthy.
Thanks.
^ permalink raw reply [flat|nested] 73+ messages in thread
* RE: [virtio-comment] Re: [virtio] [PATCH requirements 6/7] net-features: Add packet timestamp requirements
[not found] ` <CAF=yD-LXtrQeW0GnTR0BeDuExN5aBLC4dGEfdWbjtxmhNY2G6g@mail.gmail.com>
@ 2023-08-16 4:10 ` Parav Pandit
0 siblings, 0 replies; 73+ messages in thread
From: Parav Pandit @ 2023-08-16 4:10 UTC (permalink / raw)
To: Willem de Bruijn
Cc: Jason Wang, Xuan Zhuo, Shahaf Shuler, hengqi@linux.alibaba.com,
virtio@lists.oasis-open.org, virtio-comment@lists.oasis-open.org,
Willem de Bruijn, Stanislav Fomichev
> From: Willem de Bruijn <willemdebruijn.kernel@gmail.com>
> Sent: Tuesday, August 15, 2023 8:21 PM
> > Thanks for the above pointer and your inputs, we are considering many of
> them in 1.4 timeframe.
> > Many are post 1.4 so that one can implement also in practical time
> > frame :)
>
> Just be careful about precise use of terminology around this topic:
> uncertainty, precision and the like.
>
> For example, "a single MMIO read of a 64-bit register gives the lowest latency"
> perhaps can use some clarification.
>
I meant lower latency compared to accessing it via the cvq interface.
> In clock synchronization, goal is not to minimize latency of the synchronization
> operation itself, but of the capture of the two timestamps.
>
Right.
> Which is why PCIe PTM and others use a sandwich of host_ts - device_ts
> - host_ts. The latency between these timestamps is what matters.
>
Will update to follow this as well. I was hoping to merge this with the ongoing rtc work in virtio so that it becomes part of the net device.
> A single register read is certainly simplest. And it may be correct.
> But be careful if it is strictly worse than returning a pair of timestamps, like
> ptp_clock_info.getcrosststamp. The initial partial implementation should not
> become obsolete as we fill in the details later.
>
I guess it won't become obsolete, because the sandwich mode is still possible with or without mmio.
We can move to reading via the queue itself.
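As a rough sketch of the host_ts - device_ts - host_ts sandwich mentioned above: the uncertainty of one sample is bounded by the gap between the two host timestamps, regardless of how the device timestamp is fetched (mmio or queue). The struct and function names are hypothetical, and placing the device sample at the midpoint of the window is a simplifying assumption made here, not something this thread specifies.

```c
#include <assert.h>
#include <stdint.h>

/* One sandwich sample; all values in nanoseconds (hypothetical layout). */
struct cross_ts {
    uint64_t host_before_ns; /* host clock read just before the device read */
    uint64_t device_ns;      /* device clock value */
    uint64_t host_after_ns;  /* host clock read just after the device read */
};

/* Width of the window in which the device timestamp was captured. */
static uint64_t cross_ts_window_ns(const struct cross_ts *s)
{
    assert(s->host_after_ns >= s->host_before_ns);
    return s->host_after_ns - s->host_before_ns;
}

/*
 * Estimated offset (device clock minus host clock), assuming the device
 * sample was taken at the midpoint of the host window.
 */
static int64_t cross_ts_offset_ns(const struct cross_ts *s)
{
    uint64_t mid = s->host_before_ns + cross_ts_window_ns(s) / 2;

    return (int64_t)(s->device_ns - mid);
}
```

A narrower window gives a tighter bound on the offset estimate, which is why the latency between the two host timestamps matters more than the latency of the whole operation.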
> > > Virtio defines both a virtual interface and one that can be
> > > supported by hardware, so it's not a 1:1 match, but many subtle
> > > points probably apply to both.
> > >
> > Yes, it does and the proposal here in first phase is to support both so that
> precision may not be nsec level but at usec for semi virtual interfaces.
> > And progress towards supporting PTM after that.
> >
> > >
> > > For timestamping, the difficult part is the transmit completion
> > > stamp, asynchronous with data flow. If this is taken on the device
> > > before or at the time of queuing the completion to the host, then it
> > > can be stored in the completion descriptor.
> > At this point we are considering before the PHY and after the queuing as
> listed in this doc.
> > So that completion can hold this.
>
> Ack. Sounds good.
>
> This Tx timestamp is the one where you pointed out that my series was worse
> because "Mainly it incurs yet additional DMA for timestamp that must be
> avoided.", right?
>
Right.
> >
> > > But if taken at the PHY, say, it is not uncommon for this to happen
> > > after the completion is written. And instead a block of dedicated
> > > registers is used and the host must poll a register. There are
> > > examples of these in the Linux device driver directory. For virtio,
> > > it is probably sufficient to only support the first kind.
> > >
> > We do not have good abstraction of the PHY yet and it is bit hard to adopt to
> phy when its mix of virtual and physical.
> > So current plan is to place them in the completions for tx and rx.
>
> Same as above. Sounds entirely reasonable to me. Just wanted to point out this
> limitation / possible future extension.
True.
^ permalink raw reply [flat|nested] 73+ messages in thread
end of thread, other threads:[~2023-08-16 4:10 UTC | newest]
Thread overview: 73+ messages
2023-07-24 3:34 [virtio-comment] [PATCH requirements 0/7] virtio net new features requirements Parav Pandit
2023-07-24 3:34 ` [virtio-comment] [PATCH requirements 1/7] net-features: Add requirements document for release 1.4 Parav Pandit
2023-08-08 8:16 ` David Edmondson
2023-08-14 5:17 ` Parav Pandit
2023-08-14 11:53 ` David Edmondson
2023-08-14 11:56 ` David Edmondson
2023-07-24 3:34 ` [virtio-comment] [PATCH requirements 2/7] net-features: Add low latency transmit queue requirements Parav Pandit
2023-08-08 8:24 ` David Edmondson
2023-08-10 19:05 ` [virtio-comment] RE: [EXT] [virtio] " Satananda Burla
2023-08-15 5:51 ` Parav Pandit
2023-08-14 11:55 ` [virtio-comment] " David Edmondson
2023-07-24 3:34 ` [virtio-comment] [PATCH requirements 3/7] net-features: Add low latency receive " Parav Pandit
2023-08-08 8:32 ` David Edmondson
2023-08-14 11:54 ` David Edmondson
2023-08-15 4:45 ` Parav Pandit
2023-07-24 3:34 ` [virtio-comment] [PATCH requirements 4/7] net-features: Add notification coalescing requirements Parav Pandit
2023-08-14 11:57 ` David Edmondson
2023-07-24 3:34 ` [virtio-comment] [PATCH requirements 5/7] net-features: Add n-tuple receive flow filters requirements Parav Pandit
2023-08-01 8:33 ` [virtio-comment] " Parav Pandit
2023-08-02 6:44 ` Parav Pandit
2023-08-02 15:32 ` Heng Qi
2023-08-03 10:01 ` Parav Pandit
2023-08-03 13:11 ` [virtio-comment] Re: [virtio] " Heng Qi
2023-08-02 7:17 ` [virtio-comment] RE: [EXT] [virtio] " Satananda Burla
2023-08-02 8:14 ` Parav Pandit
2023-08-02 18:32 ` Satananda Burla
2023-08-04 7:32 ` Parav Pandit
2023-08-02 15:25 ` [virtio-comment] " Heng Qi
2023-08-03 9:59 ` [virtio-comment] " Parav Pandit
2023-08-03 13:07 ` [virtio-comment] " Heng Qi
2023-08-04 6:20 ` [virtio-comment] " Parav Pandit
2023-08-04 7:17 ` [virtio-comment] " Heng Qi
2023-08-04 7:30 ` [virtio-comment] " Parav Pandit
2023-08-04 7:51 ` [virtio-comment] Re: [virtio] " Heng Qi
2023-08-07 7:22 ` Heng Qi
2023-08-08 7:13 ` Parav Pandit
2023-08-08 8:18 ` [virtio-comment] Re: [virtio] " Heng Qi
2023-08-08 8:21 ` [virtio-comment] " Heng Qi
2023-08-14 5:15 ` [virtio-comment] " Parav Pandit
2023-08-14 6:18 ` [virtio-comment] " Heng Qi
2023-08-14 6:35 ` [virtio-comment] " Parav Pandit
2023-07-24 3:34 ` [virtio-comment] [PATCH requirements 6/7] net-features: Add packet timestamp requirements Parav Pandit
2023-08-09 8:35 ` [virtio-comment] Re: [virtio] " Xuan Zhuo
2023-08-10 6:56 ` Jason Wang
2023-08-15 6:13 ` Parav Pandit
[not found] ` <CAF=yD-+LMY3yE3qtd4vHc8CGOz6UAf4njM2QiZcajrQgL=KZRQ@mail.gmail.com>
2023-08-14 2:54 ` Jason Wang
2023-08-15 6:26 ` Parav Pandit
[not found] ` <CAF=yD-LXtrQeW0GnTR0BeDuExN5aBLC4dGEfdWbjtxmhNY2G6g@mail.gmail.com>
2023-08-16 4:10 ` Parav Pandit
2023-08-14 13:06 ` [virtio-comment] " Parav Pandit
2023-08-15 2:47 ` [virtio-comment] " Xuan Zhuo
2023-08-15 4:01 ` [virtio-comment] " Parav Pandit
2023-08-15 6:01 ` [virtio-comment] " Xuan Zhuo
2023-08-15 6:09 ` [virtio-comment] " Parav Pandit
2023-08-15 9:44 ` [virtio-comment] " Xuan Zhuo
2023-08-14 11:59 ` [virtio-comment] " David Edmondson
2023-07-24 3:34 ` [virtio-comment] [PATCH requirements 7/7] net-features: Add header data split requirements Parav Pandit
2023-08-10 19:19 ` [virtio-comment] RE: [EXT] [virtio] " Satananda Burla
2023-08-14 12:00 ` [virtio-comment] " David Edmondson
[not found] ` <CA+FuTSeguCKk4zxZ0=Ebr1phZhF9kssHeGPn2eZj6SRNv2ewsA@mail.gmail.com>
2023-08-14 13:09 ` [virtio-comment] Re: [virtio] " David Edmondson
2023-08-14 13:28 ` [virtio-comment] " Parav Pandit
2023-08-14 13:56 ` [virtio-comment] " David Edmondson
2023-08-15 4:41 ` [virtio-comment] " Parav Pandit
-- strict thread matches above, loose matches on Subject: below --
2023-06-01 22:02 [virtio-comment] [PATCH requirements 0/7] virtio net new features requirements Parav Pandit
2023-06-01 22:02 ` [virtio-comment] [PATCH requirements 1/7] net-features: Add requirements document for release 1.4 Parav Pandit
2023-06-06 22:15 ` Michael S. Tsirkin
2023-06-06 22:28 ` Parav Pandit
2023-06-06 22:56 ` Michael S. Tsirkin
2023-06-06 23:08 ` Parav Pandit
2023-06-06 23:18 ` Michael S. Tsirkin
2023-06-07 20:35 ` Michael S. Tsirkin
2023-06-07 20:39 ` Parav Pandit
2023-06-07 20:50 ` Michael S. Tsirkin
2023-06-07 20:53 ` Parav Pandit
2023-06-07 9:31 ` Xuan Zhuo