netdev.vger.kernel.org archive mirror
* [PATCHv4] virtio-spec: virtio network device multiqueue support
@ 2012-09-09 13:03 Michael S. Tsirkin
  2012-09-10  2:12 ` Rusty Russell
  0 siblings, 1 reply; 15+ messages in thread
From: Michael S. Tsirkin @ 2012-09-09 13:03 UTC (permalink / raw)
  To: kvm, virtualization, netdev; +Cc: rick.jones2, pbonzini, levinsasha928

Add multiqueue support to virtio network device.  Add a new feature flag
VIRTIO_NET_F_MULTIQUEUE for this feature, a new configuration field
max_virtqueue_pairs to detect the supported number of virtqueues, as well
as a new command VIRTIO_NET_CTRL_STEERING to program packet steering.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

Changes from v3:
Address Sasha's comments
- drop debug fields - fewer fields, less to debug :)
- clarify max_virtqueue_pairs field and steering param field
- misc typos
Address Paolo's comments
- Fixed old rule name left over from v2
Address Rick's comment
- Tweaked wording

Changes from v2:
Address Jason's comments on v2:
- Changed STEERING_HOST to STEERING_RX_FOLLOWS_TX:
  this is both clearer and easier to support.
  It does not look like we need a separate steering command
  since host can just watch tx packets as they go.
- Moved RX and TX steering sections near each other.
- Add motivation for other changes in v2

Changes from Jason's rfc:
- reserved vq 3: this makes all rx vqs even and tx vqs odd, which
  looks nicer to me.
- documented packet steering, added a generalized steering programming
  command. Current modes are single queue and host driven multiqueue,
  but I envision support for guest driven multiqueue in the future.
- make default vqs unused when in mq mode - this wastes some memory
  but makes it more efficient to switch between modes, since the
  switch itself then cannot cause packet reordering.

If this looks OK to everyone, we can proceed with finalizing the
implementation.  This patch is against
eb9fc84d0d3c46438aaab190e2401a9e5409a052 in virtio-spec git tree.
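
For reviewers who prefer C to LyX, here are the definitions this patch
adds, collected in one place (the LyX diff below remains the normative
text):

/* New feature bit. */
#define VIRTIO_NET_F_MULTIQUEUE  22

struct virtio_net_config {
	u8  mac[6];
	u16 status;
	/* Read-only; only exists if VIRTIO_NET_F_MULTIQUEUE is set.
	 * Maximum number of receive/transmit virtqueue pairs usable
	 * for multiqueue, excluding the default receiveq(0) and
	 * transmitq(1). */
	u16 max_virtqueue_pairs;
};

/* New control command and its argument. */
#define VIRTIO_NET_CTRL_STEERING               4

struct virtio_net_ctrl_steering {
	u8  current_steering_rule;
	u8  reserved;
	u16 current_steering_param;
};

#define VIRTIO_NET_CTRL_STEERING_SINGLE        0
#define VIRTIO_NET_CTRL_STEERING_RX_FOLLOWS_TX 1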

---
 virtio-spec.lyx | 453 +++++++++++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 446 insertions(+), 7 deletions(-)

diff --git a/virtio-spec.lyx b/virtio-spec.lyx
index fb6a4e3..2c2490e 100644
--- a/virtio-spec.lyx
+++ b/virtio-spec.lyx
@@ -58,6 +58,7 @@
 \html_be_strict false
 \author -608949062 "Rusty Russell,,," 
 \author 1531152142 "Paolo Bonzini,,," 
+\author 1986246365 "Michael S. Tsirkin" 
 \end_header
 
 \begin_body
@@ -3896,6 +3897,61 @@ Only if VIRTIO_NET_F_CTRL_VQ set
 \end_inset
 
 
+\change_inserted 1986246365 1346663522
+ 3: reserved
+\end_layout
+
+\begin_layout Description
+
+\change_inserted 1986246365 1346663550
+4: receiveq1.
+ 5: transmitq1.
+ 6: receiveq2.
+ 7:
+ transmitq2.
+ ...
+ 2
+\emph on
+N
+\emph default
++2:receiveq
+\emph on
+N
+\emph default
+, 2
+\emph on
+N
+\emph default
++3:transmitq
+\emph on
+N
+\emph default
+
+\begin_inset Foot
+status open
+
+\begin_layout Plain Layout
+
+\change_inserted 1986246365 1346663558
+Only if VIRTIO_NET_F_CTRL_VQ set.
+ 
+\emph on
+N
+\emph default
+ is indicated by 
+\emph on
+max_virtqueue_pairs
+\emph default
+ field.
+\change_unchanged
+
+\end_layout
+
+\end_inset
+
+
+\change_unchanged
+
 \end_layout
 
 \begin_layout Description
@@ -4056,6 +4112,17 @@ VIRTIO_NET_F_CTRL_VLAN
 
 \begin_layout Description
 VIRTIO_NET_F_GUEST_ANNOUNCE(21) Guest can send gratuitous packets.
+\change_inserted 1986246365 1346617842
+
+\end_layout
+
+\begin_layout Description
+
+\change_inserted 1986246365 1346618103
+VIRTIO_NET_F_MULTIQUEUE(22) Device has multiple receive and transmission
+ queues.
+\change_unchanged
+
 \end_layout
 
 \end_deeper
@@ -4068,11 +4135,45 @@ configuration
 \begin_inset space ~
 \end_inset
 
-layout Two configuration fields are currently defined.
+layout 
+\change_deleted 1986246365 1346671560
+Two
+\change_inserted 1986246365 1346671647
+Six
+\change_unchanged
+ configuration fields are currently defined.
  The mac address field always exists (though is only valid if VIRTIO_NET_F_MAC
  is set), and the status field only exists if VIRTIO_NET_F_STATUS is set.
  Two read-only bits are currently defined for the status field: VIRTIO_NET_S_LIN
 K_UP and VIRTIO_NET_S_ANNOUNCE.
+
+\change_inserted 1986246365 1347194909
+ The following read-only field, 
+\emph on
+max_virtqueue_pairs
+\emph default
+ only exists if VIRTIO_NET_F_MULTIQUEUE is set.
+ This field specifies the maximum number of each of transmit and receive
+ virtqueues (receiveq1..receiveq
+\emph on
+N
+\emph default
+ and transmitq1..transmitq
+\emph on
+N
+\emph default
+ respectively; 
+\emph on
+N
+\emph default
+=
+\emph on
+max_virtqueue_pairs
+\emph default
+) that can be used for multiqueue operation, excluding the default receiveq(0)
+ and transmitq(1) virtqueues.
+
+\change_unchanged
  
 \begin_inset listings
 inline false
@@ -4105,6 +4206,15 @@ struct virtio_net_config {
 \begin_layout Plain Layout
 
     u16 status;
+\change_inserted 1986246365 1346671221
+
+\end_layout
+
+\begin_layout Plain Layout
+
+\change_inserted 1986246365 1346671532
+
+    u16 max_virtqueue_pairs;
 \end_layout
 
 \begin_layout Plain Layout
@@ -4151,6 +4261,18 @@ physical
 \begin_layout Enumerate
 If the VIRTIO_NET_F_CTRL_VQ feature bit is negotiated, identify the control
  virtqueue.
+\change_inserted 1986246365 1346618052
+
+\end_layout
+
+\begin_layout Enumerate
+
+\change_inserted 1986246365 1346618175
+If VIRTIO_NET_F_MULTIQUEUE feature bit is negotiated, identify the receive
+ and transmission queues that are going to be used in multiqueue mode.
+ Only queues that are going to be used need to be initialized.
+\change_unchanged
+
 \end_layout
 
 \begin_layout Enumerate
@@ -4168,7 +4290,11 @@ status
 \end_layout
 
 \begin_layout Enumerate
-The receive virtqueue should be filled with receive buffers.
+The receive virtqueue
+\change_inserted 1986246365 1346618180
+(s)
+\change_unchanged
+ should be filled with receive buffers.
  This is described in detail below in 
 \begin_inset Quotes eld
 \end_inset
@@ -4513,6 +4639,8 @@ Note that the header will be two bytes longer for the VIRTIO_NET_F_MRG_RXBUF
 \end_inset
 
 
+\change_deleted 1986246365 1346932640
+
 \end_layout
 
 \begin_layout Subsection*
@@ -4988,8 +5116,24 @@ status open
 The Guest needs to check VIRTIO_NET_S_ANNOUNCE bit in status field when
  it notices the changes of device configuration.
  The command VIRTIO_NET_CTRL_ANNOUNCE_ACK is used to indicate that driver
- has recevied the notification and device would clear the VIRTIO_NET_S_ANNOUNCE
- bit in the status filed after it received this command.
+ has rece
+\change_inserted 1986246365 1346663932
+i
+\change_unchanged
+v
+\change_deleted 1986246365 1346663934
+i
+\change_unchanged
+ed the notification and device would clear the VIRTIO_NET_S_ANNOUNCE bit
+ in the status fi
+\change_inserted 1986246365 1346663942
+e
+\change_unchanged
+l
+\change_deleted 1986246365 1346663943
+e
+\change_unchanged
+d after it received this command.
 \end_layout
 
 \begin_layout Standard
@@ -5004,10 +5148,306 @@ Sending the gratuitous packets or marking there are pending gratuitous packets
 \begin_layout Enumerate
 Sending VIRTIO_NET_CTRL_ANNOUNCE_ACK command through control vq.
  
+\change_deleted 1986246365 1346662247
+
 \end_layout
 
-\begin_layout Enumerate
+\begin_layout Subsection*
+
+\change_inserted 1986246365 1346932658
+\begin_inset CommandInset label
+LatexCommand label
+name "sub:Transmit-Packet-Steering"
+
+\end_inset
+
+Transmit Packet Steering
+\end_layout
+
+\begin_layout Standard
+
+\change_inserted 1986246365 1346932658
+When VIRTIO_NET_F_MULTIQUEUE feature bit is negotiated, guest can use any
+ of multiple configured transmit queues to transmit a given packet.
+ To avoid packet reordering by device (which generally leads to performance
+ degradation) driver should attempt to utilize the same transmit virtqueue
+ for all packets of a given transmit flow.
+ For bi-directional protocols (in practice, TCP), a given network connection
+ can utilize both transmit and receive queues.
+ For best performance, packets from a single connection should utilize the
+ paired transmit and receive queues from the same virtqueue pair; for example
+ both transmitqN and receiveqN.
+ This rule makes it possible to optimize processing on the device side,
+ but this is not a hard requirement: devices should function correctly even
+ when this rule is not followed.
+\end_layout
+
+\begin_layout Standard
+
+\change_inserted 1986246365 1346932658
+Driver selects an active steering rule using VIRTIO_NET_CTRL_STEERING command
+ (this controls both which virtqueue is selected for a given packet for
+ receive and notifies the device which virtqueues are about to be used for
+ transmit).
+\end_layout
+
+\begin_layout Standard
+
+\change_inserted 1986246365 1346932658
+This command accepts a single out argument in the following format:
+\end_layout
+
+\begin_layout Standard
+
+\change_inserted 1986246365 1346932658
+\begin_inset listings
+inline false
+status open
+
+\begin_layout Plain Layout
+
+\change_inserted 1986246365 1347192845
+
+#define VIRTIO_NET_CTRL_STEERING               4
+\end_layout
+
+\begin_layout Plain Layout
+
+\change_inserted 1986246365 1346932658
+
+\end_layout
+
+\begin_layout Plain Layout
+
+\change_inserted 1986246365 1346932658
+
+struct virtio_net_ctrl_steering {
+\end_layout
+
+\begin_layout Plain Layout
+
+\change_inserted 1986246365 1346932658
+
+	u8 current_steering_rule;
+\end_layout
+
+\begin_layout Plain Layout
+
+\change_inserted 1986246365 1346932658
+
+    u8 reserved;
+\end_layout
+
+\begin_layout Plain Layout
+
+\change_inserted 1986246365 1346932658
+
+	u16 current_steering_param;
+\end_layout
+
+\begin_layout Plain Layout
+
+\change_inserted 1986246365 1346932658
+
+};
+\end_layout
+
+\begin_layout Plain Layout
+
+\change_inserted 1986246365 1347192841
+
+#define VIRTIO_NET_CTRL_STEERING_SINGLE        0
+\end_layout
+
+\begin_layout Plain Layout
+
+\change_inserted 1986246365 1347192840
+
+#define VIRTIO_NET_CTRL_STEERING_RX_FOLLOWS_TX 1
+\end_layout
+
+\end_inset
+
+
+\end_layout
+
+\begin_layout Standard
+
+\change_inserted 1986246365 1347193028
+The field 
+\emph on
+rule
+\emph default
+ specifies the function used to select transmit virtqueue for a given packet;
+ the field 
+\emph on
+param
+\emph default
+ makes it possible to pass an extra parameter if appropriate.
+ When 
+\emph on
+rule
+\emph default
+ is set to VIRTIO_NET_CTRL_STEERING_SINGLE (this is the default) all packets
+ are steered to the default virtqueue transmitq (1); param is unused.
+ When 
+\emph on
+rule
+\emph default
+ is set to VIRTIO_NET_CTRL_STEERING_RX_FOLLOWS_TX packets are steered by
+ driver to the first 
+\emph on
+N
+\emph default
+=(
+\emph on
+param
+\emph default
++1) multiqueue virtqueues transmitq1...transmitqN; the default transmitq is
+ unused.
+ Driver must have configured all these (
+\emph on
+param
+\emph default
++1) virtqueues beforehand.
+\end_layout
+
+\begin_layout Standard
+
+\change_inserted 1986246365 1347193114
+Supported steering rules can be added and removed in the future.
+ Driver should check that the request to change the steering rule was successful
+ by checking ack values of the command.
+ As selecting a specific steering is an optimization feature, drivers should
+ avoid hard failure and fall back on using a supported steering rule if
+ this command fails.
+ The default steering rule is VIRTIO_NET_CTRL_STEERING_SINGLE.
+ It will not be removed.
+\end_layout
+
+\begin_layout Standard
+
+\change_inserted 1986246365 1346932658
+When the steering rule is modified, some packets can still be outstanding
+ in one or more of the transmit virtqueues.
+ Since drivers might choose to modify the current steering rule at a high
+ rate (e.g.
+ adaptively in response to changes in the workload) to avoid reordering
+ packets, device is recommended to complete processing of the transmit queue(s)
+ utilized by the original steering before processing any packets delivered
+ by the modified steering rule.
+\end_layout
+
+\begin_layout Standard
+
+\change_inserted 1986246365 1346932658
+For debugging, the current steering rule can also be read from the configuration
+ space.
+\end_layout
+
+\begin_layout Subsection*
+
+\change_inserted 1986246365 1346670357
+\begin_inset CommandInset label
+LatexCommand label
+name "sub:Receive-Packet-Steering"
+
+\end_inset
+
+Receive Packet Steering
+\end_layout
+
+\begin_layout Standard
+
+\change_inserted 1986246365 1346671046
+When VIRTIO_NET_F_MULTIQUEUE feature bit is negotiated, device can use any
+ of multiple configured receive queues to pass a given packet to driver.
+ Driver controls which virtqueue is selected in practice by configuring
+ packet steering rule using VIRTIO_NET_CTRL_STEERING command, as described
+ above
+\begin_inset CommandInset ref
+LatexCommand ref
+reference "sub:Transmit-Packet-Steering"
+
+\end_inset
+
 .
+\end_layout
+
+\begin_layout Standard
+
+\change_inserted 1986246365 1347193175
+The field 
+\emph on
+rule
+\emph default
+ specifies the function used to select receive virtqueue for a given packet;
+ the field 
+\emph on
+param
+\emph default
+ makes it possible to pass an extra parameter if appropriate.
+ When 
+\emph on
+rule
+\emph default
+ is set to VIRTIO_NET_CTRL_STEERING_SINGLE all packets are steered to the
+ default virtqueue receiveq (0); param is unused; this is the default.
+ When 
+\emph on
+rule
+\emph default
+ is set to VIRTIO_NET_CTRL_STEERING_RX_FOLLOWS_TX packets are steered by
+ host to the first 
+\emph on
+N
+\emph default
+=(
+\emph on
+param
+\emph default
++1) multiqueue virtqueues receiveq1...receiveqN; the default receiveq is unused.
+ Driver must have configured all these (
+\emph on
+param
+\emph default
++1) virtqueues beforehand.
+ For best performance for bi-directional flows (such as TCP) device should
+ detect the flow to virtqueue pair mapping on transmit and select the receive
+ virtqueue from the same virtqueue pair.
+ For uni-directional flows, or when this mapping information is missing,
+ a device-specific steering function is used.
+\change_unchanged
+
+\end_layout
+
+\begin_layout Standard
+
+\change_inserted 1986246365 1346669564
+Supported steering rules can be added and removed in the future.
+ Driver should probe for supported rules by checking ack values of the command.
+\end_layout
+
+\begin_layout Standard
+
+\change_inserted 1986246365 1346932135
+When the steering rule is modified, some packets can still be outstanding
+ in one or more of the virtqueues.
+ Device is not required to wait for these packets to be consumed before
+ delivering packets using the new steering rule.
+ Drivers modifying the steering rule at a high rate (e.g.
+ adaptively in response to changes in the workload) are recommended to complete
+ processing of the receive queue(s) utilized by the original steering before
+ processing any packets delivered by the modified steering rule.
+\end_layout
+
+\begin_layout Standard
+
+\change_deleted 1986246365 1346664095
+.
+
+\change_unchanged
  
 \end_layout
 
@@ -5973,8 +6413,7 @@ If the VIRTIO_CONSOLE_F_MULTIPORT feature is negotiated, the driver can
  spawn multiple ports, not all of which may be attached to a console.
  Some could be generic ports.
  In this case, the control virtqueues are enabled and according to the max_nr_po
-rts configuration-space value, an appropriate number of virtqueues are
- created.
+rts configuration-space value, an appropriate number of virtqueues are created.
  A control message indicating the driver is ready is sent to the host.
  The host can then send control messages for adding new ports to the device.
  After creating and initializing each port, a VIRTIO_CONSOLE_PORT_READY
-- 
MST


* Re: [PATCHv4] virtio-spec: virtio network device multiqueue support
  2012-09-09 13:03 [PATCHv4] virtio-spec: virtio network device multiqueue support Michael S. Tsirkin
@ 2012-09-10  2:12 ` Rusty Russell
  2012-09-10  6:16   ` Michael S. Tsirkin
  2012-09-10 18:39   ` Rick Jones
  0 siblings, 2 replies; 15+ messages in thread
From: Rusty Russell @ 2012-09-10  2:12 UTC (permalink / raw)
  To: Michael S. Tsirkin, kvm, virtualization, netdev
  Cc: pbonzini, rick.jones2, levinsasha928, Tom Herbert

OK, I read the spec (pasted below for ease of reading), but I'm still
confused over how this will work.

I thought normal net drivers have the hardware provide an rxhash for
each packet, and we map that to CPU to queue the packet on[1].  We hope
that the receiving process migrates to that CPU, so xmit queue
matches.
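
[A minimal sketch of that rxhash-to-CPU step, along the lines of RPS;
the rps_map structure here is illustrative, not the exact kernel one:]

#include <linux/types.h>

struct rps_map {
	unsigned int len;
	u16 cpus[];              /* CPUs eligible for this rx queue */
};

static u16 pick_cpu(const struct rps_map *map, u32 rxhash)
{
	/* Scale the 32-bit hash into the map instead of taking a
	 * modulo, as the kernel's RPS code does. */
	return map->cpus[((u64)rxhash * map->len) >> 32];
}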

For virtio this would mean a new per-packet rxhash value, right?

Why are we doing something different?  What am I missing?

Thanks,
Rusty.
[1] Everything I Know About Networking I Learned From LWN:
    https://lwn.net/Articles/362339/

---
Transmit Packet Steering

When VIRTIO_NET_F_MULTIQUEUE feature bit is negotiated, guest can use any of multiple configured transmit queues to transmit a given packet. To avoid packet reordering by device (which generally leads to performance degradation) driver should attempt to utilize the same transmit virtqueue for all packets of a given transmit flow. For bi-directional protocols (in practice, TCP), a given network connection can utilize both transmit and receive queues. For best performance, packets from a single connection should utilize the paired transmit and receive queues from the same virtqueue pair; for example both transmitqN and receiveqN. This rule makes it possible to optimize processing on the device side, but this is not a hard requirement: devices should function correctly even when this rule is not followed.

Driver selects an active steering rule using VIRTIO_NET_CTRL_STEERING command (this controls both which virtqueue is selected for a given packet for receive and notifies the device which virtqueues are about to be used for transmit).

This command accepts a single out argument in the following format:

#define VIRTIO_NET_CTRL_STEERING               4

struct virtio_net_ctrl_steering {
	u8 current_steering_rule;
	u8 reserved;
	u16 current_steering_param;
};

#define VIRTIO_NET_CTRL_STEERING_SINGLE        0
#define VIRTIO_NET_CTRL_STEERING_RX_FOLLOWS_TX 1

The field rule specifies the function used to select transmit virtqueue for a given packet; the field param makes it possible to pass an extra parameter if appropriate. When rule is set to VIRTIO_NET_CTRL_STEERING_SINGLE (this is the default) all packets are steered to the default virtqueue transmitq (1); param is unused. When rule is set to VIRTIO_NET_CTRL_STEERING_RX_FOLLOWS_TX packets are steered by driver to the first N=(param+1) multiqueue virtqueues transmitq1...transmitqN; the default transmitq is unused. Driver must have configured all these (param+1) virtqueues beforehand.

Supported steering rules can be added and removed in the future. Driver should check that the request to change the steering rule was successful by checking ack values of the command. As selecting a specific steering is an optimization feature, drivers should avoid hard failure and fall back on using a supported steering rule if this command fails. The default steering rule is VIRTIO_NET_CTRL_STEERING_SINGLE. It will not be removed.

When the steering rule is modified, some packets can still be outstanding in one or more of the transmit virtqueues. Since drivers might choose to modify the current steering rule at a high rate (e.g. adaptively in response to changes in the workload) to avoid reordering packets, device is recommended to complete processing of the transmit queue(s) utilized by the original steering before processing any packets delivered by the modified steering rule.

For debugging, the current steering rule can also be read from the configuration space.

Receive Packet Steering

When VIRTIO_NET_F_MULTIQUEUE feature bit is negotiated, device can use any of multiple configured receive queues to pass a given packet to driver. Driver controls which virtqueue is selected in practice by configuring packet steering rule using VIRTIO_NET_CTRL_STEERING command, as described above in Transmit Packet Steering.

The field rule specifies the function used to select receive virtqueue for a given packet; the field param makes it possible to pass an extra parameter if appropriate. When rule is set to VIRTIO_NET_CTRL_STEERING_SINGLE all packets are steered to the default virtqueue receiveq (0); param is unused; this is the default. When rule is set to VIRTIO_NET_CTRL_STEERING_RX_FOLLOWS_TX packets are steered by host to the first N=(param+1) multiqueue virtqueues receiveq1...receiveqN; the default receiveq is unused. Driver must have configured all these (param+1) virtqueues beforehand. For best performance for bi-directional flows (such as TCP) device should detect the flow to virtqueue pair mapping on transmit and select the receive virtqueue from the same virtqueue pair. For uni-directional flows, or when this mapping information is missing, a device-specific steering function is used.

Supported steering rules can be added and removed in the future. Driver should probe for supported rules by checking ack values of the command.

When the steering rule is modified, some packets can still be outstanding in one or more of the virtqueues. Device is not required to wait for these packets to be consumed before delivering packets using the new steering rule. Drivers modifying the steering rule at a high rate (e.g. adaptively in response to changes in the workload) are recommended to complete processing of the receive queue(s) utilized by the original steering before processing any packets delivered by the modified steering rule.
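
[For concreteness, one way a driver might issue this command;
struct virtnet_info and virtnet_send_command() are stand-ins for the
real driver's control-vq plumbing, not actual API:]

#include <linux/types.h>

struct virtnet_info;                     /* driver-private, assumed */
/* Assumed helper: sends cmd + payload on the control virtqueue and
 * returns true if the device acked with VIRTIO_NET_OK. */
bool virtnet_send_command(struct virtnet_info *vi, u8 cmd,
			  void *data, size_t len);

static int virtnet_set_steering(struct virtnet_info *vi, u16 npairs)
{
	struct virtio_net_ctrl_steering s = {
		.current_steering_rule  = VIRTIO_NET_CTRL_STEERING_RX_FOLLOWS_TX,
		.current_steering_param = npairs - 1,	/* (param+1) pairs used */
	};

	/* All npairs virtqueue pairs must already be initialized. */
	if (virtnet_send_command(vi, VIRTIO_NET_CTRL_STEERING, &s, sizeof(s)))
		return 0;

	/* The spec treats steering as an optimization: on a failed ack,
	 * fall back to a supported rule instead of hard-failing. */
	s.current_steering_rule  = VIRTIO_NET_CTRL_STEERING_SINGLE;
	s.current_steering_param = 0;
	virtnet_send_command(vi, VIRTIO_NET_CTRL_STEERING, &s, sizeof(s));
	return -EOPNOTSUPP;
}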


* Re: [PATCHv4] virtio-spec: virtio network device multiqueue support
  2012-09-10  2:12 ` Rusty Russell
@ 2012-09-10  6:16   ` Michael S. Tsirkin
  2012-09-10  6:27     ` Michael S. Tsirkin
  2012-09-12  0:29     ` Rusty Russell
  2012-09-10 18:39   ` Rick Jones
  1 sibling, 2 replies; 15+ messages in thread
From: Michael S. Tsirkin @ 2012-09-10  6:16 UTC (permalink / raw)
  To: Rusty Russell
  Cc: kvm, netdev, rick.jones2, virtualization, levinsasha928, pbonzini,
	Tom Herbert

On Mon, Sep 10, 2012 at 11:42:25AM +0930, Rusty Russell wrote:
> OK, I read the spec (pasted below for ease of reading), but I'm still
> confused over how this will work.
> 
> I thought normal net drivers have the hardware provide an rxhash for
> each packet, and we map that to CPU to queue the packet on[1].  We hope
> that the receiving process migrates to that CPU, so xmit queue
> matches.

This only works sometimes.  For example it's common to pin netperf to a
cpu to get consistent performance.  Proper hardware must obey what
applications want it to do, not the other way around.

> For virtio this would mean a new per-packet rxhash value, right?
> 
> Why are we doing something different?  What am I missing?
> 
> Thanks,
> Rusty.
> [1] Everything I Know About Networking I Learned From LWN:
>     https://lwn.net/Articles/362339/

I think you missed this:

	Some network interfaces can help with the distribution of incoming
	packets; they have multiple receive queues and multiple interrupt lines.
	Others, though, are equipped with a single queue, meaning that the
	driver for that hardware must deal with all incoming packets in a
	single, serialized stream. Parallelizing such a stream requires some
	intelligence on the part of the host operating system. 

In other words RPS is a hack to speed up networking on cheapo
hardware; this is one of the reasons it is off by default.
Good hardware has multiple receive queues.
We can implement a good one so we do not need RPS.

Also not all guest OS-es support RPS.

Does this clarify?

> [spec text snipped]


* Re: [PATCHv4] virtio-spec: virtio network device multiqueue support
  2012-09-10  6:16   ` Michael S. Tsirkin
@ 2012-09-10  6:27     ` Michael S. Tsirkin
  2012-09-10  6:33       ` Michael S. Tsirkin
  2012-09-12  0:29     ` Rusty Russell
  1 sibling, 1 reply; 15+ messages in thread
From: Michael S. Tsirkin @ 2012-09-10  6:27 UTC (permalink / raw)
  To: Rusty Russell
  Cc: kvm, netdev, rick.jones2, virtualization, levinsasha928, pbonzini,
	Tom Herbert

On Mon, Sep 10, 2012 at 09:16:29AM +0300, Michael S. Tsirkin wrote:
> On Mon, Sep 10, 2012 at 11:42:25AM +0930, Rusty Russell wrote:
> > OK, I read the spec (pasted below for ease of reading), but I'm still
> > confused over how this will work.
> > 
> > I thought normal net drivers have the hardware provide an rxhash for
> > each packet, and we map that to CPU to queue the packet on[1].  We hope
> > that the receiving process migrates to that CPU, so xmit queue
> > matches.
> 
> This only works sometimes.  For example it's common to pin netperf to a
> cpu to get consistent performance.  Proper hardware must obey what
> applications want it to do, not the other way around.
> 
> > For virtio this would mean a new per-packet rxhash value, right?
> > 
> > Why are we doing something different?  What am I missing?
> > 
> > Thanks,
> > Rusty.
> > [1] Everything I Know About Networking I Learned From LWN:
> >     https://lwn.net/Articles/362339/
> 
> I think you missed this:
> 
> 	Some network interfaces can help with the distribution of incoming
> 	packets; they have multiple receive queues and multiple interrupt lines.
> 	Others, though, are equipped with a single queue, meaning that the
> 	driver for that hardware must deal with all incoming packets in a
> 	single, serialized stream. Parallelizing such a stream requires some
> 	intelligence on the part of the host operating system. 
> 
> In other words RPS is a hack to speed up networking on cheapo
> hardware; this is one of the reasons it is off by default.
> Good hardware has multiple receive queues.
> We can implement a good one so we do not need RPS.
> 
> Also not all guest OS-es support RPS.
> 
> Does this clarify?

I would like to add that on many processors, sending
IPIs between guest CPUs requires exits on both the sending *and*
receiving paths, making it very expensive.

> > [spec text snipped]


* Re: [PATCHv4] virtio-spec: virtio network device multiqueue support
  2012-09-10  6:27     ` Michael S. Tsirkin
@ 2012-09-10  6:33       ` Michael S. Tsirkin
  2012-09-10 11:00         ` Jason Wang
  0 siblings, 1 reply; 15+ messages in thread
From: Michael S. Tsirkin @ 2012-09-10  6:33 UTC (permalink / raw)
  To: Rusty Russell
  Cc: kvm, netdev, rick.jones2, virtualization, levinsasha928, pbonzini,
	Tom Herbert

On Mon, Sep 10, 2012 at 09:27:38AM +0300, Michael S. Tsirkin wrote:
> On Mon, Sep 10, 2012 at 09:16:29AM +0300, Michael S. Tsirkin wrote:
> > On Mon, Sep 10, 2012 at 11:42:25AM +0930, Rusty Russell wrote:
> > > OK, I read the spec (pasted below for ease of reading), but I'm still
> > > confused over how this will work.
> > > 
> > > I thought normal net drivers have the hardware provide an rxhash for
> > > each packet, and we map that to CPU to queue the packet on[1].  We hope
> > > that the receiving process migrates to that CPU, so xmit queue
> > > matches.
> > 
> > This only works sometimes.  For example it's common to pin netperf to a
> > cpu to get consistent performance.  Proper hardware must obey what
> > applications want it to do, not the other way around.
> > 
> > > For virtio this would mean a new per-packet rxhash value, right?
> > > 
> > > Why are we doing something different?  What am I missing?
> > > 
> > > Thanks,
> > > Rusty.
> > > [1] Everything I Know About Networking I Learned From LWN:
> > >     https://lwn.net/Articles/362339/
> > 
> > I think you missed this:
> > 
> > 	Some network interfaces can help with the distribution of incoming
> > 	packets; they have multiple receive queues and multiple interrupt lines.
> > 	Others, though, are equipped with a single queue, meaning that the
> > 	driver for that hardware must deal with all incoming packets in a
> > 	single, serialized stream. Parallelizing such a stream requires some
> > 	intelligence on the part of the host operating system. 
> > 
> > In other words RPS is a hack to speed up networking on cheapo
> > hardware; this is one of the reasons it is off by default.
> > Good hardware has multiple receive queues.
> > We can implement a good one so we do not need RPS.
> > 
> > Also not all guest OS-es support RPS.
> > 
> > Does this clarify?
> 
> I would like to add that on many processors, sending
> IPIs between guest CPUs requires exits on both the sending *and*
> receiving paths, making it very expensive.

A final addition: what you suggest above would be
"TX follows RX", right?
It is in anticipation of something like that that I made
steering programming so generic.
I think TX follows RX is more immediately useful for reasons above
but we can add both to the spec and let drivers and devices
decide what they want to support.
Pls let me know.

> > > [spec text snipped]


* Re: [PATCHv4] virtio-spec: virtio network device multiqueue support
  2012-09-10  6:33       ` Michael S. Tsirkin
@ 2012-09-10 11:00         ` Jason Wang
  2012-09-12  5:49           ` Rusty Russell
  0 siblings, 1 reply; 15+ messages in thread
From: Jason Wang @ 2012-09-10 11:00 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: rick.jones2, kvm, netdev, virtualization, levinsasha928, pbonzini,
	Tom Herbert

On 09/10/2012 02:33 PM, Michael S. Tsirkin wrote:
> On Mon, Sep 10, 2012 at 09:27:38AM +0300, Michael S. Tsirkin wrote:
>> On Mon, Sep 10, 2012 at 09:16:29AM +0300, Michael S. Tsirkin wrote:
>>> On Mon, Sep 10, 2012 at 11:42:25AM +0930, Rusty Russell wrote:
>>>> OK, I read the spec (pasted below for ease of reading), but I'm still
>>>> confused over how this will work.
>>>>
>>>> I thought normal net drivers have the hardware provide an rxhash for
>>>> each packet, and we map that to CPU to queue the packet on[1].  We hope
>>>> that the receiving process migrates to that CPU, so xmit queue
>>>> matches.
>>> This only works sometimes.  For example it's common to pin netperf to a
>>> cpu to get consistent performance.  Proper hardware must obey what
>>> applications want it to do, not the other way around.
>>>
>>>> For virtio this would mean a new per-packet rxhash value, right?
>>>>
>>>> Why are we doing something different?  What am I missing?
>>>>
>>>> Thanks,
>>>> Rusty.
>>>> [1] Everything I Know About Networking I Learned From LWN:
>>>>      https://lwn.net/Articles/362339/
>>> I think you missed this:
>>>
>>> 	Some network interfaces can help with the distribution of incoming
>>> 	packets; they have multiple receive queues and multiple interrupt lines.
>>> 	Others, though, are equipped with a single queue, meaning that the
>>> 	driver for that hardware must deal with all incoming packets in a
>>> 	single, serialized stream. Parallelizing such a stream requires some
>>> 	intelligence on the part of the host operating system.
>>>
>>> In other words RPS is a hack to speed up networking on cheapo
>>> hardware; this is one of the reasons it is off by default.
>>> Good hardware has multiple receive queues.
>>> We can implement a good one so we do not need RPS.
>>>
>>> Also not all guest OS-es support RPS.
>>>
>>> Does this clarify?
>> I would like to add that on many processors, sending
>> IPIs between guest CPUs requires exits on both the sending *and*
>> receiving paths, making it very expensive.
> A final addition: what you suggest above would be
> "TX follows RX", right?
> It is in anticipation of something like that that I made
> steering programming so generic.
> I think TX follows RX is more immediately useful for reasons above
> but we can add both to the spec and let drivers and devices
> decide what they want to support.
> Pls let me know.

AFAIK, ixgbe does "rx follows tx". The only difference between ixgbe 
and virtio-net is that the ixgbe driver programs the flow director during 
packet transmission, while we suggest doing it silently in the device for 
simplicity. Even with this, more co-operation is still needed from the 
driver (e.g. ixgbe tries to use a per-cpu queue by setting an affinity hint 
and using the cpu id to choose the txq, which could be reused in the 
virtio-net driver).
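
[Roughly, the device-side learning described above; the table size and
hash function are whatever the device implements:]

#include <linux/types.h>

static u16 flow_to_pair[4096];		/* flow hash -> virtqueue pair */

/* Learn the mapping whenever the guest transmits on a pair. */
static void device_saw_tx(u32 flow_hash, u16 tx_pair)
{
	flow_to_pair[flow_hash % 4096] = tx_pair;
}

/* Deliver received packets of the flow on the same pair; a
 * device-specific function would cover flows never seen on tx. */
static u16 device_pick_rx_pair(u32 flow_hash)
{
	return flow_to_pair[flow_hash % 4096];
}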
>
>>>> [spec text snipped]


* Re: [PATCHv4] virtio-spec: virtio network device multiqueue support
  2012-09-10  2:12 ` Rusty Russell
  2012-09-10  6:16   ` Michael S. Tsirkin
@ 2012-09-10 18:39   ` Rick Jones
  1 sibling, 0 replies; 15+ messages in thread
From: Rick Jones @ 2012-09-10 18:39 UTC (permalink / raw)
  To: Rusty Russell
  Cc: kvm, Michael S. Tsirkin, netdev, virtualization, levinsasha928,
	pbonzini, Tom Herbert

On 09/09/2012 07:12 PM, Rusty Russell wrote:
> OK, I read the spec (pasted below for ease of reading), but I'm still
> confused over how this will work.
>
> I thought normal net drivers have the hardware provide an rxhash for
> each packet, and we map that to CPU to queue the packet on[1].  We hope
> that the receiving process migrates to that CPU, so xmit queue
> matches.
>
> For virtio this would mean a new per-packet rxhash value, right?
>
> Why are we doing something different?  What am I missing?
>
> Thanks,
> Rusty.
> [1] Everything I Know About Networking I Learned From LWN:
>      https://lwn.net/Articles/362339/

In my taxonomy at least, "multi-queue" predates RPS and RFS and is 
simply where the NIC via some means, perhaps a headers hash, separates 
incoming frames to different queues.

RPS can be thought of as doing something similar inside the host.  That 
could be used to get a spread from an otherwise "dumb" NIC (certainly 
that is what one of its predecessors - Inbound Packet Scheduling - used 
it for in HP-UX 10.20), or it could be used to augment the multi-queue 
support of a not-so-dumb NIC - say if said NIC had a limit of queues 
that was rather lower than the number of cores/threads in the host. 
Indeed some driver/NIC combinations provide a hash value to the host for 
the host to use as it sees fit.

However, there is still the matter of a single thread of an application 
servicing multiple connections, each of which would hash to different 
locations.

RFS (Receive Flow Steering) then goes one step further, and looks up 
where the flow endpoint was last accessed and steers the traffic there. 
The idea being that a thread of execution servicing multiple flows 
will have the traffic of those flows sent to the same place.  It then 
allows the scheduler to decide where things should be run rather than 
the networking code.
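
[The RFS idea in miniature, assuming a global flow table updated from
the socket receive path; names are illustrative:]

#include <linux/types.h>

static u16 flow_last_cpu[32768];

/* Called where the application consumes data, i.e. on recvmsg(). */
static void rfs_record(u32 flow_hash, u16 this_cpu)
{
	flow_last_cpu[flow_hash % 32768] = this_cpu;
}

/* Called on packet arrival to pick the target CPU. */
static u16 rfs_steer(u32 flow_hash)
{
	return flow_last_cpu[flow_hash % 32768];
}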

rick jones


* Re: [PATCHv4] virtio-spec: virtio network device multiqueue support
  2012-09-10  6:16   ` Michael S. Tsirkin
  2012-09-10  6:27     ` Michael S. Tsirkin
@ 2012-09-12  0:29     ` Rusty Russell
  1 sibling, 0 replies; 15+ messages in thread
From: Rusty Russell @ 2012-09-12  0:29 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: kvm, netdev, rick.jones2, virtualization, levinsasha928, pbonzini,
	Tom Herbert

"Michael S. Tsirkin" <mst@redhat.com> writes:
> In other words RPS is a hack to speed up networking on cheapo
> hardware, this is one of the reasons it is off by default.
> Good hardware has multiple receive queues.
> We can implement a good one so we do not need RPS.
>
> Also not all guest OS-es support RPS.
>
> Does this clarify?

Ok, thanks.

BTW, I found a better description by Tom Herbert:
        https://code.google.com/p/kernel/wiki/NetScalingGuide

Now, I find the description of VIRTIO_NET_CTRL_STEERING_RX_FOLLOWS_TX
confusing:

1) AFAICT it turns on multiqueue rx, with no semantics attached.
   I have no idea why it's called what it is.  Why?

2) We've said we can remove steering methods, but we haven't actually
   defined any, as we've left it completely open.

If I were a driver author, it leaves me completely baffled on how to
implement the spec :(

What are we actually planning to implement at the moment?

>For best performance, packets from a single connection should utilize
>the paired transmit and receive queues from the same virtqueue pair;
>for example both transmitqN and receiveqN. This rule makes it possible
>to optimize processing on the device side, but this is not a hard
>requirement: devices should function correctly even when this rule is
>not followed.

Why is this true?  I don't actually see why the queues are in pairs at
all; are tx and rx not completely independent?  So why does it matter?

>> When the steering rule is modified, some packets can still be
>> outstanding in one or more of the virtqueues. Device is not required
>> to wait for these packets to be consumed before delivering packets
>> using the new streering rule. Drivers modifying the steering rule at
>> a high rate (e.g. adaptively in response to changes in the workload)
>> are recommended to complete processing of the receive queue(s)
>> utilized by the original steering before processing any packets
>> delivered by the modified steering rule.

How can this be done?  This isn't actually possible without taking the
queue down, since more packets are incoming.

Cheers,
Rusty.


* Re: [PATCHv4] virtio-spec: virtio network device multiqueue support
  2012-09-10 11:00         ` Jason Wang
@ 2012-09-12  5:49           ` Rusty Russell
  2012-09-12  7:57             ` Michael S. Tsirkin
  2012-09-12 14:38             ` Tom Herbert
  0 siblings, 2 replies; 15+ messages in thread
From: Rusty Russell @ 2012-09-12  5:49 UTC (permalink / raw)
  To: Jason Wang, Michael S. Tsirkin
  Cc: kvm, netdev, rick.jones2, virtualization, levinsasha928, pbonzini,
	Tom Herbert

Jason Wang <jasowang@redhat.com> writes:
> On 09/10/2012 02:33 PM, Michael S. Tsirkin wrote:
>> A final addition: what you suggest above would be
>> "TX follows RX", right?

BTW, yes.  But it's a weird way to express what the nic is doing.

>> It is in anticipation of something like that, that I made
>> steering programming so generic.

>> I think TX follows RX is more immediately useful for reasons above
>> but we can add both to spec and let drivers and devices
>> decide what they want to support.

You mean "RX follows TX"?  ie. accelerated RFS.  I agree.

Perhaps Tom can explain how we avoid out-of-order receive for the
accelerated RFS case?  It's not clear to me, but we need to be able to
do that for virtio-net if it implements accelerated RFS.

> AFAIK, ixgbe does "rx follows tx". The only difference between ixgbe
> and virtio-net is that ixgbe driver programs the flow director during 
> packet transmission but we suggest to do it silently in the device for 
> simplicity.

Implying the receive queue by xmit will be slightly laggy.  Don't know
if that's a problem.

Cheers,
Rusty.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCHv4] virtio-spec: virtio network device multiqueue support
  2012-09-12  5:49           ` Rusty Russell
@ 2012-09-12  7:57             ` Michael S. Tsirkin
  2012-09-12 14:40               ` Tom Herbert
  2012-09-12 14:38             ` Tom Herbert
  1 sibling, 1 reply; 15+ messages in thread
From: Michael S. Tsirkin @ 2012-09-12  7:57 UTC (permalink / raw)
  To: Rusty Russell
  Cc: kvm, netdev, rick.jones2, virtualization, levinsasha928, pbonzini,
	Tom Herbert

On Wed, Sep 12, 2012 at 03:19:11PM +0930, Rusty Russell wrote:
> Jason Wang <jasowang@redhat.com> writes:
> > On 09/10/2012 02:33 PM, Michael S. Tsirkin wrote:
> >> A final addition: what you suggest above would be
> >> "TX follows RX", right?
> 
> BTW, yes.  But it's a weird way to express what the nic is doing.

It explains what the system is doing.
TX is done by the driver, RX by the nic.
We document both driver and device in the spec
so I thought it's fine.  Any suggestions welcome.

> >> It is in anticipation of something like that, that I made
> >> steering programming so generic.
> 
> >> I think TX follows RX is more immediately useful for reasons above
> >> but we can add both to spec and let drivers and devices
> >> decide what they want to support.
> 
> You mean "RX follows TX"?  ie. accelerated RFS.  I agree.


Yes that's what I meant. Thanks for the correction.

> Perhaps Tom can explain how we avoid out-of-order receive for the
> accelerated RFS case?  It's not clear to me, but we need to be able to
> do that for virtio-net if it implements accelerated RFS.

Basically this has a tx vq per cpu and relies on the scheduler not
bouncing threads between cpus too aggressively.  Appears to be what
ixgbe does.
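
As a userspace model of that selection (pick_txq is a hypothetical
name; the driver-side code differs), the tx queue simply follows the
CPU the thread happens to run on:

#define _GNU_SOURCE
#include <sched.h>

static unsigned int pick_txq(unsigned int num_txq)
{
        int cpu = sched_getcpu();       /* CPU this thread runs on now */

        if (cpu < 0)                    /* call unsupported: fall back */
                cpu = 0;
        return (unsigned int)cpu % num_txq;
}

If the scheduler keeps the thread on one cpu, the flow sticks to one tx
queue; bouncing threads is exactly what breaks the scheme.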

> > AFAIK, ixgbe does "rx follows tx". The only difference between ixgbe
> > and virtio-net is that ixgbe driver programs the flow director during 
> > packet transmission but we suggest to do it silently in the device for 
> > simplicity.
> 
> Implying the receive queue by xmit will be slightly laggy.  Don't know
> if that's a problem.
> 
> Cheers,
> Rusty.

Doesn't seem to be a problem in Jason's testing so far.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCHv4] virtio-spec: virtio network device multiqueue support
  2012-09-12  5:49           ` Rusty Russell
  2012-09-12  7:57             ` Michael S. Tsirkin
@ 2012-09-12 14:38             ` Tom Herbert
  2012-09-19  1:40               ` Rusty Russell
  1 sibling, 1 reply; 15+ messages in thread
From: Tom Herbert @ 2012-09-12 14:38 UTC (permalink / raw)
  To: Rusty Russell
  Cc: kvm, Michael S. Tsirkin, netdev, rick.jones2, virtualization,
	levinsasha928, pbonzini


On Tue, Sep 11, 2012 at 10:49 PM, Rusty Russell <rusty@rustcorp.com.au> wrote:

> Jason Wang <jasowang@redhat.com> writes:
> > On 09/10/2012 02:33 PM, Michael S. Tsirkin wrote:
> >> A final addition: what you suggest above would be
> >> "TX follows RX", right?
>
> BTW, yes.  But it's a weird way to express what the nic is doing.
>
> >> It is in anticipation of something like that, that I made
> >> steering programming so generic.
>
> >> I think TX follows RX is more immediately useful for reasons above
> >> but we can add both to spec and let drivers and devices
> >> decide what they want to support.
>
> You mean "RX follows TX"?  ie. accelerated RFS.  I agree.
>

RX following TX is the logic of the flow director, I believe.  {a}RFS has
RX follow the CPU where application receive is done on the socket.  So in
RFS there is no requirement for a 1-1 correspondence between TX and RX
queues, and in fact this allows different numbers of queues for TX and RX.
We found this necessary when using priority HW queues, so that there are
more TX queues than RX.

> Perhaps Tom can explain how we avoid out-of-order receive for the
> accelerated RFS case?  It's not clear to me, but we need to be able to
> do that for virtio-net if it implements accelerated RFS.

AFAIK ooo RX is possible with accelerated RFS.  We have an algorithm
that prevents this for the RFS case by deferring a migration to a new
queue as long as it's possible that a flow might have outstanding
packets on the old queue.  I suppose this could be implemented in the
device for the HW queues, but I don't think it would be easy to cover
all cases where packets were already in transit to the host or other
cases where host and device queues are out of sync.


> > AFAIK, ixgbe does "rx follows tx". The only difference between ixgbe
> > and virtio-net is that ixgbe driver programs the flow director during
> > packet transmission but we suggest to do it silently in the device for
> > simplicity.
>
> Implying the receive queue by xmit will be slightly laggy.  Don't know
> if that's a problem.
>
> Cheers,
> Rusty.
>

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCHv4] virtio-spec: virtio network device multiqueue support
  2012-09-12  7:57             ` Michael S. Tsirkin
@ 2012-09-12 14:40               ` Tom Herbert
  2012-09-12 19:11                 ` Ben Hutchings
  0 siblings, 1 reply; 15+ messages in thread
From: Tom Herbert @ 2012-09-12 14:40 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Rusty Russell, Jason Wang, kvm, virtualization, netdev, pbonzini,
	levinsasha928, rick.jones2

On Wed, Sep 12, 2012 at 12:57 AM, Michael S. Tsirkin <mst@redhat.com> wrote:
> On Wed, Sep 12, 2012 at 03:19:11PM +0930, Rusty Russell wrote:
>> Jason Wang <jasowang@redhat.com> writes:
>> > On 09/10/2012 02:33 PM, Michael S. Tsirkin wrote:
>> >> A final addition: what you suggest above would be
>> >> "TX follows RX", right?
>>
>> BTW, yes.  But it's a weird way to express what the nic is doing.
>
> It explains what the system is doing.
> TX is done by the driver, RX by the nic.
> We document both driver and device in the spec
> so I thought it's fine.  Any suggestions welcome.
>
>> >> It is in anticipation of something like that, that I made
>> >> steering programming so generic.
>>
>> >> I think TX follows RX is more immediately useful for reasons above
>> >> but we can add both to spec and let drivers and devices
>> >> decide what they want to support.
>>
>> You mean "RX follows TX"?  ie. accelerated RFS.  I agree.
>
RX following TX is the logic of the flow director, I believe.  {a}RFS
has RX follow the CPU where application receive is done on the socket.
So in RFS there is no requirement for a 1-1 correspondence between TX
and RX queues, and in fact this allows different numbers of queues for
TX and RX.  We found this necessary when using priority HW queues, so
that there are more TX queues than RX.
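
A toy contrast of the two policies (all names hypothetical; a model,
not driver code):

#include <stdint.h>

#define FLOWS 4096
static uint16_t last_tx_queue[FLOWS];   /* updated on transmit   */
static uint16_t last_rx_cpu[FLOWS];     /* updated on socket read */

/* "RX follows TX" (flow-director style): rx queue = last tx queue,
 * which ties the rx choice to the tx queue layout. */
static uint16_t steer_rx_follows_tx(uint32_t hash)
{
        return last_tx_queue[hash % FLOWS];
}

/* RFS style: rx follows the consuming CPU through a separate
 * cpu->rxq map (assumed to cover every recorded cpu), so the tx and
 * rx queue counts are independent. */
static uint16_t steer_rfs(uint32_t hash, const uint16_t *cpu_to_rxq)
{
        return cpu_to_rxq[last_rx_cpu[hash % FLOWS]];
}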

>
> Yes that's what I meant. Thanks for the correction.
>
>> Perhaps Tom can explain how we avoid out-of-order receive for the
>> accelerated RFS case?  It's not clear to me, but we need to be able to
>> do that for virtio-net if it implements accelerated RFS.
>
AFAIK ooo RX is still possible with accelerated RFS.  We have an
algorithm that prevents this for RFS by deferring a migration to a new
queue as long as it's possible that a flow might have outstanding
packets on the old queue.  I suppose this could be implemented in the
device for the HW queues, but I don't think it would be easy to cover
all cases where packets were already in transit to the host or other
cases where host and device queues are out of sync.
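
A toy model of that deferral, with hypothetical names (software RFS
does something along these lines with per-queue counters; this is a
model, not the kernel code): each queue counts packets enqueued (tail)
and processed (head), each flow remembers the tail at its last
enqueue, and the flow may move only once the old queue's head has
passed that tail.

#include <stdint.h>
#include <stdbool.h>

struct queue { uint32_t head, tail; };  /* processed / enqueued counters */
struct flow  { struct queue *q; uint32_t last_qtail; };

static void flow_enqueue(struct flow *f)
{
        f->q->tail++;
        f->last_qtail = f->q->tail;     /* remember tail at last enqueue */
}

static bool flow_may_migrate(const struct flow *f)
{
        /* Old queue has consumed everything this flow enqueued;
         * signed difference tolerates counter wraparound. */
        return (int32_t)(f->q->head - f->last_qtail) >= 0;
}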

> Basically this has a tx vq per cpu and relies on the scheduler not
> bouncing threads between cpus too aggressively.  Appears to be what
> ixgbe does.
>
>> > AFAIK, ixgbe does "rx follows tx". The only difference between ixgbe
>> > and virtio-net is that ixgbe driver programs the flow director during
>> > packet transmission but we suggest to do it silently in the device for
>> > simplicity.
>>
>> Implying the receive queue by xmit will be slightly laggy.  Don't know
>> if that's a problem.
>>
>> Cheers,
>> Rusty.
>
> Doesn't seem to be a problem in Jason's testing so far.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCHv4] virtio-spec: virtio network device multiqueue support
  2012-09-12 14:40               ` Tom Herbert
@ 2012-09-12 19:11                 ` Ben Hutchings
  0 siblings, 0 replies; 15+ messages in thread
From: Ben Hutchings @ 2012-09-12 19:11 UTC (permalink / raw)
  To: Tom Herbert
  Cc: rick.jones2, kvm, Michael S. Tsirkin, netdev, virtualization,
	levinsasha928, pbonzini

On Wed, 2012-09-12 at 07:40 -0700, Tom Herbert wrote:
> On Wed, Sep 12, 2012 at 12:57 AM, Michael S. Tsirkin <mst@redhat.com> wrote:
> > On Wed, Sep 12, 2012 at 03:19:11PM +0930, Rusty Russell wrote:
[...]
> >> Perhaps Tom can explain how we avoid out-of-order receive for the
> >> accelerated RFS case?  It's not clear to me, but we need to be able to
> >> do that for virtio-net if it implements accelerated RFS.
> >
> AFAIK ooo RX is still possible with accelerated RFS.  We have an
> algorithm that prevents this for RFS by deferring a migration to a new
> queue as long as it's possible that a flow might have outstanding
> packets on the old queue.  I suppose this could be implemented in the
> device for the HW queues, but I don't think it would be easy to cover
> all cases where packets were already in transit to the host or other
> cases where host and device queues are out of sync.
[...]

Yes, I couldn't see any way to eliminate the possibility of OOO.  The
software queue check in RFS should redirect the flow only when it is new
or has had an idle period, at which point I hope only a few packets will
be received before we send some kind of response (transport or
application layer ACK).  So I think OOO is not that likely in practice,
but I don't have the evidence to back that up.

If the filter update latency is high enough that a response can overtake
the filter update, there may be more of a problem.

Ben.

-- 
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCHv4] virtio-spec: virtio network device multiqueue support
  2012-09-12 14:38             ` Tom Herbert
@ 2012-09-19  1:40               ` Rusty Russell
  2012-09-19  6:12                 ` Michael S. Tsirkin
  0 siblings, 1 reply; 15+ messages in thread
From: Rusty Russell @ 2012-09-19  1:40 UTC (permalink / raw)
  To: Tom Herbert
  Cc: kvm, Michael S. Tsirkin, netdev, rick.jones2, virtualization,
	levinsasha928, pbonzini

Tom Herbert <therbert@google.com> writes:
> On Tue, Sep 11, 2012 at 10:49 PM, Rusty Russell <rusty@rustcorp.com.au> wrote:
>> Perhaps Tom can explain how we avoid out-of-order receive for the
>> accelerated RFS case?  It's not clear to me, but we need to be able to
>> do that for virtio-net if it implements accelerated RFS.
>
> AFAIK ooo RX is possible with accelerated RFS.  We have an algorithm that
> prevents this for RFS case by deferring a migration to a new queue as long
> as it's possible that a flow might have outstanding packets on the old
> queue.  I suppose this could be implemented in the device for the HW
> queues, but I don't think it would be easy to cover all cases where packets
> were already in transit to the host or other cases where host and device
> queues are out of sync.

Having gone to such great lengths to avoid ooo for RFS, I don't think
DaveM would be happy if we allow it for virtio_net.

So, how *would* we implement such a thing for a "hardware" device?  What
if the device will only change the receive queue if the old receive
queue is empty?
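
Sketched with hypothetical names, that device-side check could look
like this (a model of the proposal, not an existing implementation):

#include <stdint.h>

struct devq { uint32_t pending; };      /* packets delivered, not consumed */

struct dev_flow {
        struct devq *cur, *want;        /* current and requested rx queue */
};

static struct devq *dev_pick_rxq(struct dev_flow *f)
{
        if (f->want != f->cur && f->cur->pending == 0)
                f->cur = f->want;       /* old queue drained: safe to move */
        return f->cur;
}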

Cheers,
Rusty.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCHv4] virtio-spec: virtio network device multiqueue support
  2012-09-19  1:40               ` Rusty Russell
@ 2012-09-19  6:12                 ` Michael S. Tsirkin
  0 siblings, 0 replies; 15+ messages in thread
From: Michael S. Tsirkin @ 2012-09-19  6:12 UTC (permalink / raw)
  To: Rusty Russell
  Cc: kvm, netdev, rick.jones2, virtualization, levinsasha928, pbonzini,
	Tom Herbert

On Wed, Sep 19, 2012 at 11:10:10AM +0930, Rusty Russell wrote:
> Tom Herbert <therbert@google.com> writes:
> > On Tue, Sep 11, 2012 at 10:49 PM, Rusty Russell <rusty@rustcorp.com.au> wrote:
> >> Perhaps Tom can explain how we avoid out-of-order receive for the
> >> accelerated RFS case?  It's not clear to me, but we need to be able to
> >> do that for virtio-net if it implements accelerated RFS.
> >
> > AFAIK ooo RX is possible with accelerated RFS.  We have an algorithm that
> > prevents this for RFS case by deferring a migration to a new queue as long
> > as it's possible that a flow might have outstanding packets on the old
> > queue.  I suppose this could be implemented in the device for the HW
> > queues, but I don't think it would be easy to cover all cases where packets
> > were already in transit to the host or other cases where host and device
> > queues are out of sync.
> 
> Having gone to such great lengths to avoid ooo for RFS, I don't think
> DaveM would be happy if we allow it for virtio_net.
> 
> So, how *would* we implement such a thing for a "hardware" device?  What
> if the device will only change the receive queue if the old receive
> queue is empty?
> 
> Cheers,
> Rusty.
> 

I think that would do it in most cases.  Or, if we want to be more
exact, we could delay switching a specific flow until there are no
outstanding rx packets for that flow.  Not sure it's worth the
hassle.
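
The per-flow variant, as a sketch with hypothetical names: count
outstanding rx packets per flow and switch only when the count reaches
zero, instead of waiting for the whole queue to drain.

#include <stdint.h>

struct flow_state {
        uint32_t outstanding;           /* rx packets in flight for this flow */
        uint16_t cur_q, want_q;
};

static uint16_t flow_rxq(struct flow_state *f)
{
        if (f->want_q != f->cur_q && f->outstanding == 0)
                f->cur_q = f->want_q;   /* nothing left to be overtaken */
        return f->cur_q;
}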

-- 
MST

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2012-09-19  6:12 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-09-09 13:03 [PATCHv4] virtio-spec: virtio network device multiqueue support Michael S. Tsirkin
2012-09-10  2:12 ` Rusty Russell
2012-09-10  6:16   ` Michael S. Tsirkin
2012-09-10  6:27     ` Michael S. Tsirkin
2012-09-10  6:33       ` Michael S. Tsirkin
2012-09-10 11:00         ` Jason Wang
2012-09-12  5:49           ` Rusty Russell
2012-09-12  7:57             ` Michael S. Tsirkin
2012-09-12 14:40               ` Tom Herbert
2012-09-12 19:11                 ` Ben Hutchings
2012-09-12 14:38             ` Tom Herbert
2012-09-19  1:40               ` Rusty Russell
2012-09-19  6:12                 ` Michael S. Tsirkin
2012-09-12  0:29     ` Rusty Russell
2012-09-10 18:39   ` Rick Jones
