* [virtio-dev] [PATCH v2] virtio-net: support setting coalescing params for multiple vqs
@ 2024-01-15 13:06 Heng Qi
2024-01-15 13:21 ` [virtio-dev] RE: [virtio-comment] " Parav Pandit
` (2 more replies)
0 siblings, 3 replies; 20+ messages in thread
From: Heng Qi @ 2024-01-15 13:06 UTC (permalink / raw)
To: virtio-comment, virtio-dev; +Cc: Jason Wang, Michael S. Tsirkin, Xuan Zhuo
Currently, when each time the driver attempts to update the coalescing parameters
for a vq, it needs to kick the device and wait for the ctrlq response to return.
The following path is observed: 1. Driver kicks the device; 2. After the device
receives the kick, CPU scheduling occurs and DMA multiple buffers multiple times;
3. The device completes processing and replies with a response.
When large-queue devices issue multiple requests and kick the device frequently,
this often interrupt the work of the device-side CPU. In addition, each vq request
is processed separately, causing more delays for the CPU to wait for the DMA
request to complete.
These interruptions and overhead will strain the CPU responsible for controlling
the path of the DPU, especially in multi-device and large-queue scenarios.
To solve the above problems, we internally tried batch request, which merges
requests from multiple queues and sends them at once. We conservatively tested
8 queue commands and sent them together. The DPU processing efficiency can be
improved by 8 times, which greatly eases the DPU's support for multi-device
and multi-queue DIM.
Maintainers may be concerned about whether the batch command method can optimize
the above problems: accumulate multiple request commands to kick the device once,
and obtain the processing results of the corresponding commands asynchronously.
The batch command method is used by us to optimize the CPU overhead of the DIM
worker caused by the guest being busy waiting for the command response result.
This is a different focus than batch request to solve the problem.
Suggested-by: Xiaoming Zhao <zxm377917@alibaba-inc.com>
Signed-off-by: Heng Qi <hengqi@linux.alibaba.com>
---
v1->v2: Updated commit log. Due to sensitivity, sorry that can not give the
absolute value directly. @Michael
device-types/net/description.tex | 26 ++++++++++++++++++++------
1 file changed, 20 insertions(+), 6 deletions(-)
diff --git a/device-types/net/description.tex b/device-types/net/description.tex
index aff5e08..b3766c4 100644
--- a/device-types/net/description.tex
+++ b/device-types/net/description.tex
@@ -1667,8 +1667,8 @@ \subsubsection{Control Virtqueue}\label{sec:Device Types / Network Device / Devi
for notification coalescing.
If the VIRTIO_NET_F_VQ_NOTF_COAL feature is negotiated, the driver can
-send commands VIRTIO_NET_CTRL_NOTF_COAL_VQ_SET and VIRTIO_NET_CTRL_NOTF_COAL_VQ_GET
-for virtqueue notification coalescing.
+send commands VIRTIO_NET_CTRL_NOTF_COAL_VQ_SET, VIRTIO_NET_CTRL_NOTF_COAL_VQS_SET and
+VIRTIO_NET_CTRL_NOTF_COAL_VQ_GET for virtqueue notification coalescing.
\begin{lstlisting}
struct virtio_net_ctrl_coal {
@@ -1682,11 +1682,17 @@ \subsubsection{Control Virtqueue}\label{sec:Device Types / Network Device / Devi
struct virtio_net_ctrl_coal coal;
};
+struct virtio_net_ctrl_mrg_coal_vq {
+ le16 num_entries; /* indicates number of valid entries */
+ struct virtio_net_ctrl_coal_vq entries[];
+};
+
#define VIRTIO_NET_CTRL_NOTF_COAL 6
#define VIRTIO_NET_CTRL_NOTF_COAL_TX_SET 0
#define VIRTIO_NET_CTRL_NOTF_COAL_RX_SET 1
#define VIRTIO_NET_CTRL_NOTF_COAL_VQ_SET 2
#define VIRTIO_NET_CTRL_NOTF_COAL_VQ_GET 3
+ #define VIRTIO_NET_CTRL_NOTF_COAL_VQS_SET 4
\end{lstlisting}
Coalescing parameters:
@@ -1706,6 +1712,7 @@ \subsubsection{Control Virtqueue}\label{sec:Device Types / Network Device / Devi
\item For the command VIRTIO_NET_CTRL_NOTF_COAL_VQ_SET, the structure virtio_net_ctrl_coal_vq is write-only for the driver.
\item For the command VIRTIO_NET_CTRL_NOTF_COAL_VQ_GET, \field{vq_index} and \field{reserved} are write-only
for the driver, and the structure virtio_net_ctrl_coal is read-only for the driver.
+\item For the command VIRTIO_NET_CTRL_NOTF_COAL_VQS_SET, the structure virtio_net_ctrl_mrg_coal_vq is write-only for the driver.
\end{itemize}
The class VIRTIO_NET_CTRL_NOTF_COAL has the following commands:
@@ -1716,6 +1723,9 @@ \subsubsection{Control Virtqueue}\label{sec:Device Types / Network Device / Devi
for an enabled transmit/receive virtqueue whose index is \field{vq_index}.
\item VIRTIO_NET_CTRL_NOTF_COAL_VQ_GET: use the structure virtio_net_ctrl_coal_vq to get the \field{max_usecs} and \field{max_packets} parameters
for an enabled transmit/receive virtqueue whose index is \field{vq_index}.
+\item VIRTIO_NET_CTRL_NOTF_COAL_VQS_SET: use the structure virtio_net_ctrl_mrg_coal_vq to set the \field{max_usecs} and \field{max_packets} parameters
+ for \field{num_entries} enabled transmit/receive virtqueues. The corresponding index value
+ of each configured virtqueue is \field{vq_index}.
\end{enumerate}
The device may generate notifications more or less frequently than specified by set commands of the VIRTIO_NET_CTRL_NOTF_COAL class.
@@ -1782,9 +1792,13 @@ \subsubsection{Control Virtqueue}\label{sec:Device Types / Network Device / Devi
The driver MUST set \field{vq_index} to the virtqueue index of an enabled transmit or receive virtqueue.
+The driver MUST set \field{num_entries} to a non-zero value and MUST NOT set \field{num_entries} to
+a value greater than the number of enabled transmit and receive virtqueues.
+
The driver MUST have negotiated the VIRTIO_NET_F_NOTF_COAL feature when issuing commands VIRTIO_NET_CTRL_NOTF_COAL_TX_SET and VIRTIO_NET_CTRL_NOTF_COAL_RX_SET.
-The driver MUST have negotiated the VIRTIO_NET_F_VQ_NOTF_COAL feature when issuing commands VIRTIO_NET_CTRL_NOTF_COAL_VQ_SET and VIRTIO_NET_CTRL_NOTF_COAL_VQ_GET.
+The driver MUST have negotiated the VIRTIO_NET_F_VQ_NOTF_COAL feature when issuing commands
+VIRTIO_NET_CTRL_NOTF_COAL_VQ_SET, VIRTIO_NET_CTRL_NOTF_COAL_VQS_SET and VIRTIO_NET_CTRL_NOTF_COAL_VQ_GET.
The driver MUST ignore the values of coalescing parameters received from the VIRTIO_NET_CTRL_NOTF_COAL_VQ_GET command if the device responds with VIRTIO_NET_ERR.
@@ -1794,10 +1808,10 @@ \subsubsection{Control Virtqueue}\label{sec:Device Types / Network Device / Devi
The device SHOULD respond to VIRTIO_NET_CTRL_NOTF_COAL_TX_SET and VIRTIO_NET_CTRL_NOTF_COAL_RX_SET commands with VIRTIO_NET_ERR if it was not able to change the parameters.
-The device MUST respond to the VIRTIO_NET_CTRL_NOTF_COAL_VQ_SET command with VIRTIO_NET_ERR if it was not able to change the parameters.
+The device MUST respond to VIRTIO_NET_CTRL_NOTF_COAL_VQ_SET and VIRTIO_NET_CTRL_NOTF_COAL_VQS_SET commands with VIRTIO_NET_ERR if it was not able to change the parameters.
-The device MUST respond to VIRTIO_NET_CTRL_NOTF_COAL_VQ_SET and VIRTIO_NET_CTRL_NOTF_COAL_VQ_GET commands with
-VIRTIO_NET_ERR if the designated virtqueue is not an enabled transmit or receive virtqueue.
+The device MUST respond to VIRTIO_NET_CTRL_NOTF_COAL_VQ_SET, VIRTIO_NET_CTRL_NOTF_COAL_VQS_SET and VIRTIO_NET_CTRL_NOTF_COAL_VQ_GET
+commands with VIRTIO_NET_ERR if the designated virtqueue is not an enabled transmit or receive virtqueue.
Upon disabling and re-enabling a transmit virtqueue, the device MUST set the coalescing parameters of the virtqueue
to those configured through the VIRTIO_NET_CTRL_NOTF_COAL_TX_SET command, or, if the driver did not set any TX coalescing parameters, to 0.
--
1.8.3.1
---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org
^ permalink raw reply related [flat|nested] 20+ messages in thread
* [virtio-dev] RE: [virtio-comment] [PATCH v2] virtio-net: support setting coalescing params for multiple vqs
2024-01-15 13:06 [virtio-dev] [PATCH v2] virtio-net: support setting coalescing params for multiple vqs Heng Qi
@ 2024-01-15 13:21 ` Parav Pandit
2024-01-17 4:52 ` Heng Qi
2024-01-15 22:58 ` [virtio-dev] " Michael S. Tsirkin
2024-01-24 10:50 ` Michael S. Tsirkin
2 siblings, 1 reply; 20+ messages in thread
From: Parav Pandit @ 2024-01-15 13:21 UTC (permalink / raw)
To: Heng Qi, virtio-comment@lists.oasis-open.org,
virtio-dev@lists.oasis-open.org
Cc: Jason Wang, Michael S. Tsirkin, Xuan Zhuo
> From: virtio-comment@lists.oasis-open.org <virtio-comment@lists.oasis-
> open.org> On Behalf Of Heng Qi
> Sent: Monday, January 15, 2024 6:36 PM
>
> Currently, when each time the driver attempts to update the coalescing
> parameters for a vq, it needs to kick the device and wait for the ctrlq
> response to return.
It does not need to wait. This is some driver limitation that does not use the queue as "queue".
Such driver limitation should be removed in the driver. It does not qualify as limitation.
This will enable driver to enqueue multiple cvq commands without waiting for previous one.
This will also enable device to find natural coalescing done on multiple commands.
> The following path is observed: 1. Driver kicks the device; 2. After the device
> receives the kick, CPU scheduling occurs and DMA multiple buffers multiple
> times; 3. The device completes processing and replies with a response.
>
> When large-queue devices issue multiple requests and kick the device
> frequently, this often interrupt the work of the device-side CPU.
When there is large devices and multiple driver notifications by a cpu that is N times faster than the device side cpu,
the device may find natural coalescing on the commands of a given cvq.
For multiple DMAs, we need to way to send 8 bytes of data without 16 bytes of indirection via a descriptor.
This is what we discussed a while back to do in txq and Stefan suggested to generalize for more queues, which is also a good idea.
> In addition,
> each vq request is processed separately, causing more delays for the CPU to
> wait for the DMA request to complete.
>
> These interruptions and overhead will strain the CPU responsible for
> controlling the path of the DPU, especially in multi-device and large-queue
> scenarios.
>
> To solve the above problems, we internally tried batch request, which merges
> requests from multiple queues and sends them at once. We conservatively
> tested
The batching done may end up modifying the given VQ's parameters multiple times.
The right test on Linux to do without rtnl lock which is anyway ugly and wrong semantic to use blocking the whole netdev stack.
(in case if you used that).
> 8 queue commands and sent them together. The DPU processing efficiency
> can be improved by 8 times, which greatly eases the DPU's support for multi-
> device and multi-queue DIM.
>
This is good.
> Maintainers may be concerned about whether the batch command method
> can optimize the above problems: accumulate multiple request commands to
> kick the device once, and obtain the processing results of the corresponding
> commands asynchronously.
This is unlikely to improve, rather it will have negative impact as it only means that moderation parameters are just delayed by the driver.
> The batch command method is used by us to optimize the CPU overhead of
> the DIM worker caused by the guest being busy waiting for the command
> response result.
In that case fixing the guest driver which is not yet written is the right fix.
> This is a different focus than batch request to solve the problem.
>
> Suggested-by: Xiaoming Zhao <zxm377917@alibaba-inc.com>
> Signed-off-by: Heng Qi <hengqi@linux.alibaba.com>
> ---
> v1->v2: Updated commit log. Due to sensitivity, sorry that can not give
> v1->the
> absolute value directly. @Michael
>
> device-types/net/description.tex | 26 ++++++++++++++++++++------
> 1 file changed, 20 insertions(+), 6 deletions(-)
>
> diff --git a/device-types/net/description.tex b/device-
> types/net/description.tex
> index aff5e08..b3766c4 100644
> --- a/device-types/net/description.tex
> +++ b/device-types/net/description.tex
> @@ -1667,8 +1667,8 @@ \subsubsection{Control
> Virtqueue}\label{sec:Device Types / Network Device / Devi for notification
> coalescing.
>
> If the VIRTIO_NET_F_VQ_NOTF_COAL feature is negotiated, the driver can -
> send commands VIRTIO_NET_CTRL_NOTF_COAL_VQ_SET and
> VIRTIO_NET_CTRL_NOTF_COAL_VQ_GET -for virtqueue notification
> coalescing.
> +send commands VIRTIO_NET_CTRL_NOTF_COAL_VQ_SET,
> +VIRTIO_NET_CTRL_NOTF_COAL_VQS_SET and
> VIRTIO_NET_CTRL_NOTF_COAL_VQ_GET for virtqueue notification
> coalescing.
>
A new feature bit is needed for this extra functionality.
> \begin{lstlisting}
> struct virtio_net_ctrl_coal {
> @@ -1682,11 +1682,17 @@ \subsubsection{Control
> Virtqueue}\label{sec:Device Types / Network Device / Devi
> struct virtio_net_ctrl_coal coal;
> };
>
> +struct virtio_net_ctrl_mrg_coal_vq {
> + le16 num_entries; /* indicates number of valid entries */
> + struct virtio_net_ctrl_coal_vq entries[]; };
> +
> #define VIRTIO_NET_CTRL_NOTF_COAL 6
> #define VIRTIO_NET_CTRL_NOTF_COAL_TX_SET 0
> #define VIRTIO_NET_CTRL_NOTF_COAL_RX_SET 1
> #define VIRTIO_NET_CTRL_NOTF_COAL_VQ_SET 2
> #define VIRTIO_NET_CTRL_NOTF_COAL_VQ_GET 3
> + #define VIRTIO_NET_CTRL_NOTF_COAL_VQS_SET 4
> \end{lstlisting}
>
> Coalescing parameters:
> @@ -1706,6 +1712,7 @@ \subsubsection{Control
> Virtqueue}\label{sec:Device Types / Network Device / Devi \item For the
> command VIRTIO_NET_CTRL_NOTF_COAL_VQ_SET, the structure
> virtio_net_ctrl_coal_vq is write-only for the driver.
> \item For the command VIRTIO_NET_CTRL_NOTF_COAL_VQ_GET,
> \field{vq_index} and \field{reserved} are write-only
> for the driver, and the structure virtio_net_ctrl_coal is read-only for the
> driver.
> +\item For the command VIRTIO_NET_CTRL_NOTF_COAL_VQS_SET, the
> structure virtio_net_ctrl_mrg_coal_vq is write-only for the driver.
> \end{itemize}
>
> The class VIRTIO_NET_CTRL_NOTF_COAL has the following commands:
> @@ -1716,6 +1723,9 @@ \subsubsection{Control
> Virtqueue}\label{sec:Device Types / Network Device / Devi
> for an enabled transmit/receive virtqueue whose
> index is \field{vq_index}.
> \item VIRTIO_NET_CTRL_NOTF_COAL_VQ_GET: use the structure
> virtio_net_ctrl_coal_vq to get the \field{max_usecs} and \field{max_packets}
> parameters
> for an enabled transmit/receive virtqueue whose
> index is \field{vq_index}.
> +\item VIRTIO_NET_CTRL_NOTF_COAL_VQS_SET: use the structure
> virtio_net_ctrl_mrg_coal_vq to set the \field{max_usecs} and
> \field{max_packets} parameters
> + for \field{num_entries} enabled transmit/receive
> virtqueues. The corresponding index value
> + of each configured virtqueue is \field{vq_index}.
> \end{enumerate}
>
> The device may generate notifications more or less frequently than specified
> by set commands of the VIRTIO_NET_CTRL_NOTF_COAL class.
> @@ -1782,9 +1792,13 @@ \subsubsection{Control
> Virtqueue}\label{sec:Device Types / Network Device / Devi
>
> The driver MUST set \field{vq_index} to the virtqueue index of an enabled
> transmit or receive virtqueue.
>
> +The driver MUST set \field{num_entries} to a non-zero value and MUST
> +NOT set \field{num_entries} to a value greater than the number of enabled
> transmit and receive virtqueues.
> +
> The driver MUST have negotiated the VIRTIO_NET_F_NOTF_COAL feature
> when issuing commands VIRTIO_NET_CTRL_NOTF_COAL_TX_SET and
> VIRTIO_NET_CTRL_NOTF_COAL_RX_SET.
>
> -The driver MUST have negotiated the VIRTIO_NET_F_VQ_NOTF_COAL
> feature when issuing commands VIRTIO_NET_CTRL_NOTF_COAL_VQ_SET
> and VIRTIO_NET_CTRL_NOTF_COAL_VQ_GET.
> +The driver MUST have negotiated the VIRTIO_NET_F_VQ_NOTF_COAL
> feature
> +when issuing commands VIRTIO_NET_CTRL_NOTF_COAL_VQ_SET,
> VIRTIO_NET_CTRL_NOTF_COAL_VQS_SET and
> VIRTIO_NET_CTRL_NOTF_COAL_VQ_GET.
>
> The driver MUST ignore the values of coalescing parameters received from
> the VIRTIO_NET_CTRL_NOTF_COAL_VQ_GET command if the device
> responds with VIRTIO_NET_ERR.
>
> @@ -1794,10 +1808,10 @@ \subsubsection{Control
> Virtqueue}\label{sec:Device Types / Network Device / Devi
>
> The device SHOULD respond to VIRTIO_NET_CTRL_NOTF_COAL_TX_SET and
> VIRTIO_NET_CTRL_NOTF_COAL_RX_SET commands with VIRTIO_NET_ERR if
> it was not able to change the parameters.
>
> -The device MUST respond to the VIRTIO_NET_CTRL_NOTF_COAL_VQ_SET
> command with VIRTIO_NET_ERR if it was not able to change the parameters.
> +The device MUST respond to VIRTIO_NET_CTRL_NOTF_COAL_VQ_SET and
> VIRTIO_NET_CTRL_NOTF_COAL_VQS_SET commands with VIRTIO_NET_ERR
> if it was not able to change the parameters.
>
> -The device MUST respond to VIRTIO_NET_CTRL_NOTF_COAL_VQ_SET and
> VIRTIO_NET_CTRL_NOTF_COAL_VQ_GET commands with -VIRTIO_NET_ERR
> if the designated virtqueue is not an enabled transmit or receive virtqueue.
> +The device MUST respond to VIRTIO_NET_CTRL_NOTF_COAL_VQ_SET,
> +VIRTIO_NET_CTRL_NOTF_COAL_VQS_SET and
> VIRTIO_NET_CTRL_NOTF_COAL_VQ_GET commands with VIRTIO_NET_ERR if
> the designated virtqueue is not an enabled transmit or receive virtqueue.
>
> Upon disabling and re-enabling a transmit virtqueue, the device MUST set
> the coalescing parameters of the virtqueue to those configured through the
> VIRTIO_NET_CTRL_NOTF_COAL_TX_SET command, or, if the driver did not
> set any TX coalescing parameters, to 0.
> --
> 1.8.3.1
>
>
> This publicly archived list offers a means to provide input to the
> OASIS Virtual I/O Device (VIRTIO) TC.
>
> In order to verify user consent to the Feedback License terms and
> to minimize spam in the list archive, subscription is required
> before posting.
>
> Subscribe: virtio-comment-subscribe@lists.oasis-open.org
> Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
> List help: virtio-comment-help@lists.oasis-open.org
> List archive: https://lists.oasis-open.org/archives/virtio-comment/
> Feedback License: https://www.oasis-
> open.org/who/ipr/feedback_license.pdf
> List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
> Committee: https://www.oasis-open.org/committees/virtio/
> Join OASIS: https://www.oasis-open.org/join/
---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org
^ permalink raw reply [flat|nested] 20+ messages in thread
* [virtio-dev] Re: [PATCH v2] virtio-net: support setting coalescing params for multiple vqs
2024-01-15 13:06 [virtio-dev] [PATCH v2] virtio-net: support setting coalescing params for multiple vqs Heng Qi
2024-01-15 13:21 ` [virtio-dev] RE: [virtio-comment] " Parav Pandit
@ 2024-01-15 22:58 ` Michael S. Tsirkin
2024-01-17 8:53 ` Heng Qi
2024-01-24 10:50 ` Michael S. Tsirkin
2 siblings, 1 reply; 20+ messages in thread
From: Michael S. Tsirkin @ 2024-01-15 22:58 UTC (permalink / raw)
To: Heng Qi; +Cc: virtio-comment, virtio-dev, Jason Wang, Xuan Zhuo
On Mon, Jan 15, 2024 at 09:06:02PM +0800, Heng Qi wrote:
> Currently, when each time the driver attempts to update the coalescing parameters
> for a vq, it needs to kick the device and wait for the ctrlq response to return.
> The following path is observed: 1. Driver kicks the device; 2. After the device
> receives the kick, CPU scheduling occurs and DMA multiple buffers multiple times;
> 3. The device completes processing and replies with a response.
>
> When large-queue devices issue multiple requests and kick the device frequently,
> this often interrupt the work of the device-side CPU. In addition, each vq request
> is processed separately, causing more delays for the CPU to wait for the DMA
> request to complete.
>
> These interruptions and overhead will strain the CPU responsible for controlling
> the path of the DPU, especially in multi-device and large-queue scenarios.
>
> To solve the above problems, we internally tried batch request, which merges
> requests from multiple queues and sends them at once. We conservatively tested
> 8 queue commands and sent them together. The DPU processing efficiency can be
> improved by 8 times, which greatly eases the DPU's support for multi-device
> and multi-queue DIM.
>
> Maintainers may be concerned about whether the batch command method can optimize
> the above problems: accumulate multiple request commands to kick the device once,
> and obtain the processing results of the corresponding commands asynchronously.
> The batch command method is used by us to optimize the CPU overhead of the DIM
> worker caused by the guest being busy waiting for the command response result.
> This is a different focus than batch request to solve the problem.
Hmm. Could the issue be actually that
1. you are using split vqs so processing is complex
2. Linux driver uses indirect to pass coalescing commands around
so even more complex
?
Is packed an option for you at all? Guest should also be changed to
avoid indirect with ctrl vq. As a quick test, disable indirect maybe
that is fast enough?
> Suggested-by: Xiaoming Zhao <zxm377917@alibaba-inc.com>
> Signed-off-by: Heng Qi <hengqi@linux.alibaba.com>
> ---
> v1->v2: Updated commit log. Due to sensitivity, sorry that can not give the
> absolute value directly. @Michael
>
> device-types/net/description.tex | 26 ++++++++++++++++++++------
> 1 file changed, 20 insertions(+), 6 deletions(-)
>
> diff --git a/device-types/net/description.tex b/device-types/net/description.tex
> index aff5e08..b3766c4 100644
> --- a/device-types/net/description.tex
> +++ b/device-types/net/description.tex
> @@ -1667,8 +1667,8 @@ \subsubsection{Control Virtqueue}\label{sec:Device Types / Network Device / Devi
> for notification coalescing.
>
> If the VIRTIO_NET_F_VQ_NOTF_COAL feature is negotiated, the driver can
> -send commands VIRTIO_NET_CTRL_NOTF_COAL_VQ_SET and VIRTIO_NET_CTRL_NOTF_COAL_VQ_GET
> -for virtqueue notification coalescing.
> +send commands VIRTIO_NET_CTRL_NOTF_COAL_VQ_SET, VIRTIO_NET_CTRL_NOTF_COAL_VQS_SET and
> +VIRTIO_NET_CTRL_NOTF_COAL_VQ_GET for virtqueue notification coalescing.
>
> \begin{lstlisting}
> struct virtio_net_ctrl_coal {
> @@ -1682,11 +1682,17 @@ \subsubsection{Control Virtqueue}\label{sec:Device Types / Network Device / Devi
> struct virtio_net_ctrl_coal coal;
> };
>
> +struct virtio_net_ctrl_mrg_coal_vq {
> + le16 num_entries; /* indicates number of valid entries */
> + struct virtio_net_ctrl_coal_vq entries[];
> +};
> +
> #define VIRTIO_NET_CTRL_NOTF_COAL 6
> #define VIRTIO_NET_CTRL_NOTF_COAL_TX_SET 0
> #define VIRTIO_NET_CTRL_NOTF_COAL_RX_SET 1
> #define VIRTIO_NET_CTRL_NOTF_COAL_VQ_SET 2
> #define VIRTIO_NET_CTRL_NOTF_COAL_VQ_GET 3
> + #define VIRTIO_NET_CTRL_NOTF_COAL_VQS_SET 4
> \end{lstlisting}
>
> Coalescing parameters:
> @@ -1706,6 +1712,7 @@ \subsubsection{Control Virtqueue}\label{sec:Device Types / Network Device / Devi
> \item For the command VIRTIO_NET_CTRL_NOTF_COAL_VQ_SET, the structure virtio_net_ctrl_coal_vq is write-only for the driver.
> \item For the command VIRTIO_NET_CTRL_NOTF_COAL_VQ_GET, \field{vq_index} and \field{reserved} are write-only
> for the driver, and the structure virtio_net_ctrl_coal is read-only for the driver.
> +\item For the command VIRTIO_NET_CTRL_NOTF_COAL_VQS_SET, the structure virtio_net_ctrl_mrg_coal_vq is write-only for the driver.
> \end{itemize}
>
> The class VIRTIO_NET_CTRL_NOTF_COAL has the following commands:
> @@ -1716,6 +1723,9 @@ \subsubsection{Control Virtqueue}\label{sec:Device Types / Network Device / Devi
> for an enabled transmit/receive virtqueue whose index is \field{vq_index}.
> \item VIRTIO_NET_CTRL_NOTF_COAL_VQ_GET: use the structure virtio_net_ctrl_coal_vq to get the \field{max_usecs} and \field{max_packets} parameters
> for an enabled transmit/receive virtqueue whose index is \field{vq_index}.
> +\item VIRTIO_NET_CTRL_NOTF_COAL_VQS_SET: use the structure virtio_net_ctrl_mrg_coal_vq to set the \field{max_usecs} and \field{max_packets} parameters
> + for \field{num_entries} enabled transmit/receive virtqueues. The corresponding index value
> + of each configured virtqueue is \field{vq_index}.
> \end{enumerate}
>
> The device may generate notifications more or less frequently than specified by set commands of the VIRTIO_NET_CTRL_NOTF_COAL class.
> @@ -1782,9 +1792,13 @@ \subsubsection{Control Virtqueue}\label{sec:Device Types / Network Device / Devi
>
> The driver MUST set \field{vq_index} to the virtqueue index of an enabled transmit or receive virtqueue.
>
> +The driver MUST set \field{num_entries} to a non-zero value and MUST NOT set \field{num_entries} to
> +a value greater than the number of enabled transmit and receive virtqueues.
> +
> The driver MUST have negotiated the VIRTIO_NET_F_NOTF_COAL feature when issuing commands VIRTIO_NET_CTRL_NOTF_COAL_TX_SET and VIRTIO_NET_CTRL_NOTF_COAL_RX_SET.
>
> -The driver MUST have negotiated the VIRTIO_NET_F_VQ_NOTF_COAL feature when issuing commands VIRTIO_NET_CTRL_NOTF_COAL_VQ_SET and VIRTIO_NET_CTRL_NOTF_COAL_VQ_GET.
> +The driver MUST have negotiated the VIRTIO_NET_F_VQ_NOTF_COAL feature when issuing commands
> +VIRTIO_NET_CTRL_NOTF_COAL_VQ_SET, VIRTIO_NET_CTRL_NOTF_COAL_VQS_SET and VIRTIO_NET_CTRL_NOTF_COAL_VQ_GET.
>
> The driver MUST ignore the values of coalescing parameters received from the VIRTIO_NET_CTRL_NOTF_COAL_VQ_GET command if the device responds with VIRTIO_NET_ERR.
>
> @@ -1794,10 +1808,10 @@ \subsubsection{Control Virtqueue}\label{sec:Device Types / Network Device / Devi
>
> The device SHOULD respond to VIRTIO_NET_CTRL_NOTF_COAL_TX_SET and VIRTIO_NET_CTRL_NOTF_COAL_RX_SET commands with VIRTIO_NET_ERR if it was not able to change the parameters.
>
> -The device MUST respond to the VIRTIO_NET_CTRL_NOTF_COAL_VQ_SET command with VIRTIO_NET_ERR if it was not able to change the parameters.
> +The device MUST respond to VIRTIO_NET_CTRL_NOTF_COAL_VQ_SET and VIRTIO_NET_CTRL_NOTF_COAL_VQS_SET commands with VIRTIO_NET_ERR if it was not able to change the parameters.
>
> -The device MUST respond to VIRTIO_NET_CTRL_NOTF_COAL_VQ_SET and VIRTIO_NET_CTRL_NOTF_COAL_VQ_GET commands with
> -VIRTIO_NET_ERR if the designated virtqueue is not an enabled transmit or receive virtqueue.
> +The device MUST respond to VIRTIO_NET_CTRL_NOTF_COAL_VQ_SET, VIRTIO_NET_CTRL_NOTF_COAL_VQS_SET and VIRTIO_NET_CTRL_NOTF_COAL_VQ_GET
> +commands with VIRTIO_NET_ERR if the designated virtqueue is not an enabled transmit or receive virtqueue.
>
> Upon disabling and re-enabling a transmit virtqueue, the device MUST set the coalescing parameters of the virtqueue
> to those configured through the VIRTIO_NET_CTRL_NOTF_COAL_TX_SET command, or, if the driver did not set any TX coalescing parameters, to 0.
> --
> 1.8.3.1
---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [virtio-dev] RE: [virtio-comment] [PATCH v2] virtio-net: support setting coalescing params for multiple vqs
2024-01-15 13:21 ` [virtio-dev] RE: [virtio-comment] " Parav Pandit
@ 2024-01-17 4:52 ` Heng Qi
2024-01-20 9:59 ` Parav Pandit
0 siblings, 1 reply; 20+ messages in thread
From: Heng Qi @ 2024-01-17 4:52 UTC (permalink / raw)
To: Parav Pandit, virtio-comment@lists.oasis-open.org,
virtio-dev@lists.oasis-open.org
Cc: Jason Wang, Michael S. Tsirkin, Xuan Zhuo
在 2024/1/15 下午9:21, Parav Pandit 写道:
>
>> From: virtio-comment@lists.oasis-open.org <virtio-comment@lists.oasis-
>> open.org> On Behalf Of Heng Qi
>> Sent: Monday, January 15, 2024 6:36 PM
>>
>> Currently, when each time the driver attempts to update the coalescing
>> parameters for a vq, it needs to kick the device and wait for the ctrlq
>> response to return.
> It does not need to wait. This is some driver limitation that does not use the queue as "queue".
> Such driver limitation should be removed in the driver. It does not qualify as limitation.
Yes, we don't have to wait.
But in general, for user commands, it is necessary to obtain the final
results synchronously.
The user command cannot return before the final result is obtained.
And wait is not the problem this patch solves.
>
> This will enable driver to enqueue multiple cvq commands without waiting for previous one.
> This will also enable device to find natural coalescing done on multiple commands.
When batch user commands occur, ensuring synchronization is a concern.
>
>> The following path is observed: 1. Driver kicks the device; 2. After the device
>> receives the kick, CPU scheduling occurs and DMA multiple buffers multiple
>> times; 3. The device completes processing and replies with a response.
>>
>> When large-queue devices issue multiple requests and kick the device
>> frequently, this often interrupt the work of the device-side CPU.
> When there is large devices and multiple driver notifications by a cpu that is N times faster than the device side cpu,
> the device may find natural coalescing on the commands of a given cvq.
First we have to solve the ctrlq batch adding user (ethtool) command.
Even if processed in a batch way on device side, the number of kicks and
the number of backend DMAs has not been reduced.
>
> For multiple DMAs, we need to way to send 8 bytes of data without 16 bytes of indirection via a descriptor.
> This is what we discussed a while back to do in txq and Stefan suggested to generalize for more queues, which is also a good idea.
Yes, this sounds good.
>
>> In addition,
>> each vq request is processed separately, causing more delays for the CPU to
>> wait for the DMA request to complete.
>>
>> These interruptions and overhead will strain the CPU responsible for
>> controlling the path of the DPU, especially in multi-device and large-queue
>> scenarios.
>>
>> To solve the above problems, we internally tried batch request, which merges
>> requests from multiple queues and sends them at once. We conservatively
>> tested
> The batching done may end up modifying the given VQ's parameters multiple times.
In practice, we do not try to accumulate multiple parameter
modifications for a specific vqn.
> The right test on Linux to do without rtnl lock which is anyway ugly and wrong semantic to use blocking the whole netdev stack.
> (in case if you used that).
Do you have any good directions and attempts to remove rtnl_lock?
>
>> 8 queue commands and sent them together. The DPU processing efficiency
>> can be improved by 8 times, which greatly eases the DPU's support for multi-
>> device and multi-queue DIM.
>>
> This is good.
YES. Makes sense for our DPUs.
>
>> Maintainers may be concerned about whether the batch command method
>> can optimize the above problems: accumulate multiple request commands to
>> kick the device once, and obtain the processing results of the corresponding
>> commands asynchronously.
> This is unlikely to improve, rather it will have negative impact as it only means that moderation parameters are just delayed by the driver.
>
Why is it delayed by the driver? It is not delayed by the driver, the
kick still happens for every command.
In theory and practice, it will not affect DIM performance, but it will
significantly reduce CPU consumption caused by waiting.
>> The batch command method is used by us to optimize the CPU overhead of
>> the DIM worker caused by the guest being busy waiting for the command
>> response result.
> In that case fixing the guest driver which is not yet written is the right fix.
>
>> This is a different focus than batch request to solve the problem.
>>
>> Suggested-by: Xiaoming Zhao <zxm377917@alibaba-inc.com>
>> Signed-off-by: Heng Qi <hengqi@linux.alibaba.com>
>> ---
>> v1->v2: Updated commit log. Due to sensitivity, sorry that can not give
>> v1->the
>> absolute value directly. @Michael
>>
>> device-types/net/description.tex | 26 ++++++++++++++++++++------
>> 1 file changed, 20 insertions(+), 6 deletions(-)
>>
>> diff --git a/device-types/net/description.tex b/device-
>> types/net/description.tex
>> index aff5e08..b3766c4 100644
>> --- a/device-types/net/description.tex
>> +++ b/device-types/net/description.tex
>> @@ -1667,8 +1667,8 @@ \subsubsection{Control
>> Virtqueue}\label{sec:Device Types / Network Device / Devi for notification
>> coalescing.
>>
>> If the VIRTIO_NET_F_VQ_NOTF_COAL feature is negotiated, the driver can -
>> send commands VIRTIO_NET_CTRL_NOTF_COAL_VQ_SET and
>> VIRTIO_NET_CTRL_NOTF_COAL_VQ_GET -for virtqueue notification
>> coalescing.
>> +send commands VIRTIO_NET_CTRL_NOTF_COAL_VQ_SET,
>> +VIRTIO_NET_CTRL_NOTF_COAL_VQS_SET and
>> VIRTIO_NET_CTRL_NOTF_COAL_VQ_GET for virtqueue notification
>> coalescing.
>>
> A new feature bit is needed for this extra functionality.
I tried to extend it to the command VIRTIO_NET_F_VQ_NOTF_COAL, is it too
late?
Thanks!
>
>> \begin{lstlisting}
>> struct virtio_net_ctrl_coal {
>> @@ -1682,11 +1682,17 @@ \subsubsection{Control
>> Virtqueue}\label{sec:Device Types / Network Device / Devi
>> struct virtio_net_ctrl_coal coal;
>> };
>>
>> +struct virtio_net_ctrl_mrg_coal_vq {
>> + le16 num_entries; /* indicates number of valid entries */
>> + struct virtio_net_ctrl_coal_vq entries[]; };
>> +
>> #define VIRTIO_NET_CTRL_NOTF_COAL 6
>> #define VIRTIO_NET_CTRL_NOTF_COAL_TX_SET 0
>> #define VIRTIO_NET_CTRL_NOTF_COAL_RX_SET 1
>> #define VIRTIO_NET_CTRL_NOTF_COAL_VQ_SET 2
>> #define VIRTIO_NET_CTRL_NOTF_COAL_VQ_GET 3
>> + #define VIRTIO_NET_CTRL_NOTF_COAL_VQS_SET 4
>> \end{lstlisting}
>>
>> Coalescing parameters:
>> @@ -1706,6 +1712,7 @@ \subsubsection{Control
>> Virtqueue}\label{sec:Device Types / Network Device / Devi \item For the
>> command VIRTIO_NET_CTRL_NOTF_COAL_VQ_SET, the structure
>> virtio_net_ctrl_coal_vq is write-only for the driver.
>> \item For the command VIRTIO_NET_CTRL_NOTF_COAL_VQ_GET,
>> \field{vq_index} and \field{reserved} are write-only
>> for the driver, and the structure virtio_net_ctrl_coal is read-only for the
>> driver.
>> +\item For the command VIRTIO_NET_CTRL_NOTF_COAL_VQS_SET, the
>> structure virtio_net_ctrl_mrg_coal_vq is write-only for the driver.
>> \end{itemize}
>>
>> The class VIRTIO_NET_CTRL_NOTF_COAL has the following commands:
>> @@ -1716,6 +1723,9 @@ \subsubsection{Control
>> Virtqueue}\label{sec:Device Types / Network Device / Devi
>> for an enabled transmit/receive virtqueue whose
>> index is \field{vq_index}.
>> \item VIRTIO_NET_CTRL_NOTF_COAL_VQ_GET: use the structure
>> virtio_net_ctrl_coal_vq to get the \field{max_usecs} and \field{max_packets}
>> parameters
>> for an enabled transmit/receive virtqueue whose
>> index is \field{vq_index}.
>> +\item VIRTIO_NET_CTRL_NOTF_COAL_VQS_SET: use the structure
>> virtio_net_ctrl_mrg_coal_vq to set the \field{max_usecs} and
>> \field{max_packets} parameters
>> + for \field{num_entries} enabled transmit/receive
>> virtqueues. The corresponding index value
>> + of each configured virtqueue is \field{vq_index}.
>> \end{enumerate}
>>
>> The device may generate notifications more or less frequently than specified
>> by set commands of the VIRTIO_NET_CTRL_NOTF_COAL class.
>> @@ -1782,9 +1792,13 @@ \subsubsection{Control
>> Virtqueue}\label{sec:Device Types / Network Device / Devi
>>
>> The driver MUST set \field{vq_index} to the virtqueue index of an enabled
>> transmit or receive virtqueue.
>>
>> +The driver MUST set \field{num_entries} to a non-zero value and MUST
>> +NOT set \field{num_entries} to a value greater than the number of enabled
>> transmit and receive virtqueues.
>> +
>> The driver MUST have negotiated the VIRTIO_NET_F_NOTF_COAL feature
>> when issuing commands VIRTIO_NET_CTRL_NOTF_COAL_TX_SET and
>> VIRTIO_NET_CTRL_NOTF_COAL_RX_SET.
>>
>> -The driver MUST have negotiated the VIRTIO_NET_F_VQ_NOTF_COAL
>> feature when issuing commands VIRTIO_NET_CTRL_NOTF_COAL_VQ_SET
>> and VIRTIO_NET_CTRL_NOTF_COAL_VQ_GET.
>> +The driver MUST have negotiated the VIRTIO_NET_F_VQ_NOTF_COAL
>> feature
>> +when issuing commands VIRTIO_NET_CTRL_NOTF_COAL_VQ_SET,
>> VIRTIO_NET_CTRL_NOTF_COAL_VQS_SET and
>> VIRTIO_NET_CTRL_NOTF_COAL_VQ_GET.
>>
>> The driver MUST ignore the values of coalescing parameters received from
>> the VIRTIO_NET_CTRL_NOTF_COAL_VQ_GET command if the device
>> responds with VIRTIO_NET_ERR.
>>
>> @@ -1794,10 +1808,10 @@ \subsubsection{Control
>> Virtqueue}\label{sec:Device Types / Network Device / Devi
>>
>> The device SHOULD respond to VIRTIO_NET_CTRL_NOTF_COAL_TX_SET and
>> VIRTIO_NET_CTRL_NOTF_COAL_RX_SET commands with VIRTIO_NET_ERR if
>> it was not able to change the parameters.
>>
>> -The device MUST respond to the VIRTIO_NET_CTRL_NOTF_COAL_VQ_SET
>> command with VIRTIO_NET_ERR if it was not able to change the parameters.
>> +The device MUST respond to VIRTIO_NET_CTRL_NOTF_COAL_VQ_SET and
>> VIRTIO_NET_CTRL_NOTF_COAL_VQS_SET commands with VIRTIO_NET_ERR
>> if it was not able to change the parameters.
>>
>> -The device MUST respond to VIRTIO_NET_CTRL_NOTF_COAL_VQ_SET and
>> VIRTIO_NET_CTRL_NOTF_COAL_VQ_GET commands with -VIRTIO_NET_ERR
>> if the designated virtqueue is not an enabled transmit or receive virtqueue.
>> +The device MUST respond to VIRTIO_NET_CTRL_NOTF_COAL_VQ_SET,
>> +VIRTIO_NET_CTRL_NOTF_COAL_VQS_SET and
>> VIRTIO_NET_CTRL_NOTF_COAL_VQ_GET commands with VIRTIO_NET_ERR if
>> the designated virtqueue is not an enabled transmit or receive virtqueue.
>>
>> Upon disabling and re-enabling a transmit virtqueue, the device MUST set
>> the coalescing parameters of the virtqueue to those configured through the
>> VIRTIO_NET_CTRL_NOTF_COAL_TX_SET command, or, if the driver did not
>> set any TX coalescing parameters, to 0.
>> --
>> 1.8.3.1
>>
>>
>> This publicly archived list offers a means to provide input to the
>> OASIS Virtual I/O Device (VIRTIO) TC.
>>
>> In order to verify user consent to the Feedback License terms and
>> to minimize spam in the list archive, subscription is required
>> before posting.
>>
>> Subscribe: virtio-comment-subscribe@lists.oasis-open.org
>> Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
>> List help: virtio-comment-help@lists.oasis-open.org
>> List archive: https://lists.oasis-open.org/archives/virtio-comment/
>> Feedback License: https://www.oasis-
>> open.org/who/ipr/feedback_license.pdf
>> List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
>> Committee: https://www.oasis-open.org/committees/virtio/
>> Join OASIS: https://www.oasis-open.org/join/
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
> For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org
---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [virtio-dev] Re: [PATCH v2] virtio-net: support setting coalescing params for multiple vqs
2024-01-15 22:58 ` [virtio-dev] " Michael S. Tsirkin
@ 2024-01-17 8:53 ` Heng Qi
0 siblings, 0 replies; 20+ messages in thread
From: Heng Qi @ 2024-01-17 8:53 UTC (permalink / raw)
To: Michael S. Tsirkin; +Cc: virtio-comment, virtio-dev, Jason Wang, Xuan Zhuo
在 2024/1/16 上午6:58, Michael S. Tsirkin 写道:
> On Mon, Jan 15, 2024 at 09:06:02PM +0800, Heng Qi wrote:
>> Currently, when each time the driver attempts to update the coalescing parameters
>> for a vq, it needs to kick the device and wait for the ctrlq response to return.
>> The following path is observed: 1. Driver kicks the device; 2. After the device
>> receives the kick, CPU scheduling occurs and DMA multiple buffers multiple times;
>> 3. The device completes processing and replies with a response.
>>
>> When large-queue devices issue multiple requests and kick the device frequently,
>> this often interrupt the work of the device-side CPU. In addition, each vq request
>> is processed separately, causing more delays for the CPU to wait for the DMA
>> request to complete.
>>
>> These interruptions and overhead will strain the CPU responsible for controlling
>> the path of the DPU, especially in multi-device and large-queue scenarios.
>>
>> To solve the above problems, we internally tried batch request, which merges
>> requests from multiple queues and sends them at once. We conservatively tested
>> 8 queue commands and sent them together. The DPU processing efficiency can be
>> improved by 8 times, which greatly eases the DPU's support for multi-device
>> and multi-queue DIM.
>>
>> Maintainers may be concerned about whether the batch command method can optimize
>> the above problems: accumulate multiple request commands to kick the device once,
>> and obtain the processing results of the corresponding commands asynchronously.
>> The batch command method is used by us to optimize the CPU overhead of the DIM
>> worker caused by the guest being busy waiting for the command response result.
>> This is a different focus than batch request to solve the problem.
>
> Hmm. Could the issue be actually that
> 1. you are using split vqs so processing is complex
This is part of the problem.
> 2. Linux driver uses indirect to pass coalescing commands around
> so even more complex
> ?
Theoretically, when ctrlq cmd is sent,
the number of indirect DMA times will be 3 times less than
the number of chain DMA times. After the actual test, the indirect
method of ctrlq
is about 10% more efficient than the chain method.
Merging requests can both reduce the number of kicks and greatly reduce
the number of DMAs.
For example, after 8 queues requests are merged, only 1 kick + 8 DMAs
are required.
Without merging, 8 kicks + 64 DMAs are required.
>
> Is packed an option for you at all?
On our existing hardware, migrating from splitq to packedq would take a
long time.
Thanks.
> Guest should also be changed to
> avoid indirect with ctrl vq. As a quick test, disable indirect maybe
> that is fast enough?
>
>
>
>> Suggested-by: Xiaoming Zhao <zxm377917@alibaba-inc.com>
>> Signed-off-by: Heng Qi <hengqi@linux.alibaba.com>
>> ---
>> v1->v2: Updated commit log. Due to sensitivity, sorry that can not give the
>> absolute value directly. @Michael
>>
>> device-types/net/description.tex | 26 ++++++++++++++++++++------
>> 1 file changed, 20 insertions(+), 6 deletions(-)
>>
>> diff --git a/device-types/net/description.tex b/device-types/net/description.tex
>> index aff5e08..b3766c4 100644
>> --- a/device-types/net/description.tex
>> +++ b/device-types/net/description.tex
>> @@ -1667,8 +1667,8 @@ \subsubsection{Control Virtqueue}\label{sec:Device Types / Network Device / Devi
>> for notification coalescing.
>>
>> If the VIRTIO_NET_F_VQ_NOTF_COAL feature is negotiated, the driver can
>> -send commands VIRTIO_NET_CTRL_NOTF_COAL_VQ_SET and VIRTIO_NET_CTRL_NOTF_COAL_VQ_GET
>> -for virtqueue notification coalescing.
>> +send commands VIRTIO_NET_CTRL_NOTF_COAL_VQ_SET, VIRTIO_NET_CTRL_NOTF_COAL_VQS_SET and
>> +VIRTIO_NET_CTRL_NOTF_COAL_VQ_GET for virtqueue notification coalescing.
>>
>> \begin{lstlisting}
>> struct virtio_net_ctrl_coal {
>> @@ -1682,11 +1682,17 @@ \subsubsection{Control Virtqueue}\label{sec:Device Types / Network Device / Devi
>> struct virtio_net_ctrl_coal coal;
>> };
>>
>> +struct virtio_net_ctrl_mrg_coal_vq {
>> + le16 num_entries; /* indicates number of valid entries */
>> + struct virtio_net_ctrl_coal_vq entries[];
>> +};
>> +
>> #define VIRTIO_NET_CTRL_NOTF_COAL 6
>> #define VIRTIO_NET_CTRL_NOTF_COAL_TX_SET 0
>> #define VIRTIO_NET_CTRL_NOTF_COAL_RX_SET 1
>> #define VIRTIO_NET_CTRL_NOTF_COAL_VQ_SET 2
>> #define VIRTIO_NET_CTRL_NOTF_COAL_VQ_GET 3
>> + #define VIRTIO_NET_CTRL_NOTF_COAL_VQS_SET 4
>> \end{lstlisting}
>>
>> Coalescing parameters:
>> @@ -1706,6 +1712,7 @@ \subsubsection{Control Virtqueue}\label{sec:Device Types / Network Device / Devi
>> \item For the command VIRTIO_NET_CTRL_NOTF_COAL_VQ_SET, the structure virtio_net_ctrl_coal_vq is write-only for the driver.
>> \item For the command VIRTIO_NET_CTRL_NOTF_COAL_VQ_GET, \field{vq_index} and \field{reserved} are write-only
>> for the driver, and the structure virtio_net_ctrl_coal is read-only for the driver.
>> +\item For the command VIRTIO_NET_CTRL_NOTF_COAL_VQS_SET, the structure virtio_net_ctrl_mrg_coal_vq is write-only for the driver.
>> \end{itemize}
>>
>> The class VIRTIO_NET_CTRL_NOTF_COAL has the following commands:
>> @@ -1716,6 +1723,9 @@ \subsubsection{Control Virtqueue}\label{sec:Device Types / Network Device / Devi
>> for an enabled transmit/receive virtqueue whose index is \field{vq_index}.
>> \item VIRTIO_NET_CTRL_NOTF_COAL_VQ_GET: use the structure virtio_net_ctrl_coal_vq to get the \field{max_usecs} and \field{max_packets} parameters
>> for an enabled transmit/receive virtqueue whose index is \field{vq_index}.
>> +\item VIRTIO_NET_CTRL_NOTF_COAL_VQS_SET: use the structure virtio_net_ctrl_mrg_coal_vq to set the \field{max_usecs} and \field{max_packets} parameters
>> + for \field{num_entries} enabled transmit/receive virtqueues. The corresponding index value
>> + of each configured virtqueue is \field{vq_index}.
>> \end{enumerate}
>>
>> The device may generate notifications more or less frequently than specified by set commands of the VIRTIO_NET_CTRL_NOTF_COAL class.
>> @@ -1782,9 +1792,13 @@ \subsubsection{Control Virtqueue}\label{sec:Device Types / Network Device / Devi
>>
>> The driver MUST set \field{vq_index} to the virtqueue index of an enabled transmit or receive virtqueue.
>>
>> +The driver MUST set \field{num_entries} to a non-zero value and MUST NOT set \field{num_entries} to
>> +a value greater than the number of enabled transmit and receive virtqueues.
>> +
>> The driver MUST have negotiated the VIRTIO_NET_F_NOTF_COAL feature when issuing commands VIRTIO_NET_CTRL_NOTF_COAL_TX_SET and VIRTIO_NET_CTRL_NOTF_COAL_RX_SET.
>>
>> -The driver MUST have negotiated the VIRTIO_NET_F_VQ_NOTF_COAL feature when issuing commands VIRTIO_NET_CTRL_NOTF_COAL_VQ_SET and VIRTIO_NET_CTRL_NOTF_COAL_VQ_GET.
>> +The driver MUST have negotiated the VIRTIO_NET_F_VQ_NOTF_COAL feature when issuing commands
>> +VIRTIO_NET_CTRL_NOTF_COAL_VQ_SET, VIRTIO_NET_CTRL_NOTF_COAL_VQS_SET and VIRTIO_NET_CTRL_NOTF_COAL_VQ_GET.
>>
>> The driver MUST ignore the values of coalescing parameters received from the VIRTIO_NET_CTRL_NOTF_COAL_VQ_GET command if the device responds with VIRTIO_NET_ERR.
>>
>> @@ -1794,10 +1808,10 @@ \subsubsection{Control Virtqueue}\label{sec:Device Types / Network Device / Devi
>>
>> The device SHOULD respond to VIRTIO_NET_CTRL_NOTF_COAL_TX_SET and VIRTIO_NET_CTRL_NOTF_COAL_RX_SET commands with VIRTIO_NET_ERR if it was not able to change the parameters.
>>
>> -The device MUST respond to the VIRTIO_NET_CTRL_NOTF_COAL_VQ_SET command with VIRTIO_NET_ERR if it was not able to change the parameters.
>> +The device MUST respond to VIRTIO_NET_CTRL_NOTF_COAL_VQ_SET and VIRTIO_NET_CTRL_NOTF_COAL_VQS_SET commands with VIRTIO_NET_ERR if it was not able to change the parameters.
>>
>> -The device MUST respond to VIRTIO_NET_CTRL_NOTF_COAL_VQ_SET and VIRTIO_NET_CTRL_NOTF_COAL_VQ_GET commands with
>> -VIRTIO_NET_ERR if the designated virtqueue is not an enabled transmit or receive virtqueue.
>> +The device MUST respond to VIRTIO_NET_CTRL_NOTF_COAL_VQ_SET, VIRTIO_NET_CTRL_NOTF_COAL_VQS_SET and VIRTIO_NET_CTRL_NOTF_COAL_VQ_GET
>> +commands with VIRTIO_NET_ERR if the designated virtqueue is not an enabled transmit or receive virtqueue.
>>
>> Upon disabling and re-enabling a transmit virtqueue, the device MUST set the coalescing parameters of the virtqueue
>> to those configured through the VIRTIO_NET_CTRL_NOTF_COAL_TX_SET command, or, if the driver did not set any TX coalescing parameters, to 0.
>> --
>> 1.8.3.1
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
> For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org
---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org
^ permalink raw reply [flat|nested] 20+ messages in thread
* RE: [virtio-dev] RE: [virtio-comment] [PATCH v2] virtio-net: support setting coalescing params for multiple vqs
2024-01-17 4:52 ` Heng Qi
@ 2024-01-20 9:59 ` Parav Pandit
2024-01-22 2:57 ` [virtio-dev] Re: [virtio-comment] " Heng Qi
0 siblings, 1 reply; 20+ messages in thread
From: Parav Pandit @ 2024-01-20 9:59 UTC (permalink / raw)
To: Heng Qi, virtio-comment@lists.oasis-open.org,
virtio-dev@lists.oasis-open.org
Cc: Jason Wang, Michael S. Tsirkin, Xuan Zhuo
> From: Heng Qi <hengqi@linux.alibaba.com>
> Sent: Wednesday, January 17, 2024 10:22 AM
>
> 在 2024/1/15 下午9:21, Parav Pandit 写道:
> >
> >> From: virtio-comment@lists.oasis-open.org
> >> <virtio-comment@lists.oasis- open.org> On Behalf Of Heng Qi
> >> Sent: Monday, January 15, 2024 6:36 PM
> >>
> >> Currently, when each time the driver attempts to update the
> >> coalescing parameters for a vq, it needs to kick the device and wait
> >> for the ctrlq response to return.
> > It does not need to wait. This is some driver limitation that does not use
> the queue as "queue".
> > Such driver limitation should be removed in the driver. It does not qualify
> as limitation.
>
> Yes, we don't have to wait.
>
> But in general, for user commands, it is necessary to obtain the final results
> synchronously.
Yes. Use initiated command can enqueue the request to cvq. Go to sleep for several micro to milliseconds.
> The user command cannot return before the final result is obtained.
> And wait is not the problem this patch solves.
>
By not holding the rtnl lock, rest of the context that needs to enqueue the request can progress such as that of netdim.
> >
> > This will enable driver to enqueue multiple cvq commands without waiting
> for previous one.
> > This will also enable device to find natural coalescing done on multiple
> commands.
>
> When batch user commands occur, ensuring synchronization is a concern.
>
> >
> >> The following path is observed: 1. Driver kicks the device; 2. After
> >> the device receives the kick, CPU scheduling occurs and DMA multiple
> >> buffers multiple times; 3. The device completes processing and replies
> with a response.
> >>
> >> When large-queue devices issue multiple requests and kick the device
> >> frequently, this often interrupt the work of the device-side CPU.
> > When there is large devices and multiple driver notifications by a cpu
> > that is N times faster than the device side cpu, the device may find natural
> coalescing on the commands of a given cvq.
>
> First we have to solve the ctrlq batch adding user (ethtool) command.
> Even if processed in a batch way on device side, the number of kicks and the
> number of backend DMAs has not been reduced.
Driver notifications are PCI writes so it should not hamper device side, which can ignore them when they do not bother about it.
Backend DMAs should be reduced by avoiding the LIFO pattern followed by the splitq driver.
Placing the descriptors contiguously like packedq reduces amount of DMA naturally.
The second predicable DMA to avoid is having 8Bytes of data inline in the descriptor, instead of 16B indirection and extra dma.
>
> >
> > For multiple DMAs, we need to way to send 8 bytes of data without 16
> bytes of indirection via a descriptor.
> > This is what we discussed a while back to do in txq and Stefan suggested to
> generalize for more queues, which is also a good idea.
>
> Yes, this sounds good.
>
This the next item to focus as soon as flow filters are stable/merged.
> >
> >> In addition,
> >> each vq request is processed separately, causing more delays for the
> >> CPU to wait for the DMA request to complete.
> >>
> >> These interruptions and overhead will strain the CPU responsible for
> >> controlling the path of the DPU, especially in multi-device and
> >> large-queue scenarios.
> >>
> >> To solve the above problems, we internally tried batch request, which
> >> merges requests from multiple queues and sends them at once. We
> >> conservatively tested
> > The batching done may end up modifying the given VQ's parameters
> multiple times.
>
> In practice, we do not try to accumulate multiple parameter modifications for
> a specific vqn.
I meant to accumulate parameters of multiple VQs to batch in single command.
>
> > The right test on Linux to do without rtnl lock which is anyway ugly and
> wrong semantic to use blocking the whole netdev stack.
> > (in case if you used that).
>
> Do you have any good directions and attempts to remove rtnl_lock?
>
I think per device lock instead of rtnl is first step that we can start with.
> >
> >> 8 queue commands and sent them together. The DPU processing
> >> efficiency can be improved by 8 times, which greatly eases the DPU's
> >> support for multi- device and multi-queue DIM.
> >>
> > This is good.
>
> YES. Makes sense for our DPUs.
>
> >
> >> Maintainers may be concerned about whether the batch command
> method
> >> can optimize the above problems: accumulate multiple request
> commands
> >> to kick the device once, and obtain the processing results of the
> >> corresponding commands asynchronously.
> > This is unlikely to improve, rather it will have negative impact as it only
> means that moderation parameters are just delayed by the driver.
> >
>
> Why is it delayed by the driver? It is not delayed by the driver, the kick still
> happens for every command.
Since the kick is delayed, the driver lost the amount of time and device reaction time is driven by this wait time.
The guest driver is not aware how much a shared DPU is busy.
So if it waits too much, the VQs moderation is slow.
> In theory and practice, it will not affect DIM performance, but it will
> significantly reduce CPU consumption caused by waiting.
>
Once the guest driver sleeps after enqueuing the request, it should be fine.
The DIM calls are in highly parallel worker threads, where/if the thread sleep, a new worker thread is invoked.
But for sure I agree that vq notification moderation should happen quickly enough under < 100usec or so.
May be now that we bring the aq resources, we can start migrating on it.
> >> The batch command method is used by us to optimize the CPU overhead
> >> of the DIM worker caused by the guest being busy waiting for the
> >> command response result.
> > In that case fixing the guest driver which is not yet written is the right fix.
> >
> >> This is a different focus than batch request to solve the problem.
> >>
> >> Suggested-by: Xiaoming Zhao <zxm377917@alibaba-inc.com>
> >> Signed-off-by: Heng Qi <hengqi@linux.alibaba.com>
> >> ---
> >> v1->v2: Updated commit log. Due to sensitivity, sorry that can not
> >> v1->give the
> >> absolute value directly. @Michael
> >>
> >> device-types/net/description.tex | 26 ++++++++++++++++++++------
> >> 1 file changed, 20 insertions(+), 6 deletions(-)
> >>
> >> diff --git a/device-types/net/description.tex b/device-
> >> types/net/description.tex index aff5e08..b3766c4 100644
> >> --- a/device-types/net/description.tex
> >> +++ b/device-types/net/description.tex
> >> @@ -1667,8 +1667,8 @@ \subsubsection{Control
> >> Virtqueue}\label{sec:Device Types / Network Device / Devi for
> >> notification coalescing.
> >>
> >> If the VIRTIO_NET_F_VQ_NOTF_COAL feature is negotiated, the driver
> >> can - send commands VIRTIO_NET_CTRL_NOTF_COAL_VQ_SET and
> >> VIRTIO_NET_CTRL_NOTF_COAL_VQ_GET -for virtqueue notification
> >> coalescing.
> >> +send commands VIRTIO_NET_CTRL_NOTF_COAL_VQ_SET,
> >> +VIRTIO_NET_CTRL_NOTF_COAL_VQS_SET and
> >> VIRTIO_NET_CTRL_NOTF_COAL_VQ_GET for virtqueue notification
> >> coalescing.
> >>
> > A new feature bit is needed for this extra functionality.
>
> I tried to extend it to the command VIRTIO_NET_F_VQ_NOTF_COAL, is it too
> late?
>
For existing command it is not.
But there may be a device which may not support coalesced command.
Better to have the feature bit for this coalesce command, instead of random NOSUPP failure.
> Thanks!
>
> >
> >> \begin{lstlisting}
> >> struct virtio_net_ctrl_coal {
> >> @@ -1682,11 +1682,17 @@ \subsubsection{Control
> >> Virtqueue}\label{sec:Device Types / Network Device / Devi
> >> struct virtio_net_ctrl_coal coal;
> >> };
> >>
> >> +struct virtio_net_ctrl_mrg_coal_vq {
> >> + le16 num_entries; /* indicates number of valid entries */
> >> + struct virtio_net_ctrl_coal_vq entries[]; };
> >> +
> >> #define VIRTIO_NET_CTRL_NOTF_COAL 6
> >> #define VIRTIO_NET_CTRL_NOTF_COAL_TX_SET 0
> >> #define VIRTIO_NET_CTRL_NOTF_COAL_RX_SET 1
> >> #define VIRTIO_NET_CTRL_NOTF_COAL_VQ_SET 2
> >> #define VIRTIO_NET_CTRL_NOTF_COAL_VQ_GET 3
> >> + #define VIRTIO_NET_CTRL_NOTF_COAL_VQS_SET 4
> >> \end{lstlisting}
> >>
> >> Coalescing parameters:
> >> @@ -1706,6 +1712,7 @@ \subsubsection{Control
> >> Virtqueue}\label{sec:Device Types / Network Device / Devi \item For
> >> the command VIRTIO_NET_CTRL_NOTF_COAL_VQ_SET, the structure
> >> virtio_net_ctrl_coal_vq is write-only for the driver.
> >> \item For the command VIRTIO_NET_CTRL_NOTF_COAL_VQ_GET,
> >> \field{vq_index} and \field{reserved} are write-only
> >> for the driver, and the structure virtio_net_ctrl_coal is
> >> read-only for the driver.
> >> +\item For the command VIRTIO_NET_CTRL_NOTF_COAL_VQS_SET, the
> >> structure virtio_net_ctrl_mrg_coal_vq is write-only for the driver.
> >> \end{itemize}
> >>
> >> The class VIRTIO_NET_CTRL_NOTF_COAL has the following commands:
> >> @@ -1716,6 +1723,9 @@ \subsubsection{Control
> >> Virtqueue}\label{sec:Device Types / Network Device / Devi
> >> for an enabled
> >> transmit/receive virtqueue whose index is \field{vq_index}.
> >> \item VIRTIO_NET_CTRL_NOTF_COAL_VQ_GET: use the structure
> >> virtio_net_ctrl_coal_vq to get the \field{max_usecs} and
> >> \field{max_packets} parameters
> >> for an enabled
> >> transmit/receive virtqueue whose index is \field{vq_index}.
> >> +\item VIRTIO_NET_CTRL_NOTF_COAL_VQS_SET: use the structure
> >> virtio_net_ctrl_mrg_coal_vq to set the \field{max_usecs} and
> >> \field{max_packets} parameters
> >> + for \field{num_entries}
> >> + enabled transmit/receive
> >> virtqueues. The corresponding index value
> >> + of each configured virtqueue is \field{vq_index}.
> >> \end{enumerate}
> >>
> >> The device may generate notifications more or less frequently than
> >> specified by set commands of the VIRTIO_NET_CTRL_NOTF_COAL class.
> >> @@ -1782,9 +1792,13 @@ \subsubsection{Control
> >> Virtqueue}\label{sec:Device Types / Network Device / Devi
> >>
> >> The driver MUST set \field{vq_index} to the virtqueue index of an
> >> enabled transmit or receive virtqueue.
> >>
> >> +The driver MUST set \field{num_entries} to a non-zero value and MUST
> >> +NOT set \field{num_entries} to a value greater than the number of
> >> +enabled
> >> transmit and receive virtqueues.
> >> +
> >> The driver MUST have negotiated the VIRTIO_NET_F_NOTF_COAL
> feature
> >> when issuing commands VIRTIO_NET_CTRL_NOTF_COAL_TX_SET and
> >> VIRTIO_NET_CTRL_NOTF_COAL_RX_SET.
> >>
> >> -The driver MUST have negotiated the VIRTIO_NET_F_VQ_NOTF_COAL
> >> feature when issuing commands VIRTIO_NET_CTRL_NOTF_COAL_VQ_SET
> and
> >> VIRTIO_NET_CTRL_NOTF_COAL_VQ_GET.
> >> +The driver MUST have negotiated the VIRTIO_NET_F_VQ_NOTF_COAL
> >> feature
> >> +when issuing commands VIRTIO_NET_CTRL_NOTF_COAL_VQ_SET,
> >> VIRTIO_NET_CTRL_NOTF_COAL_VQS_SET and
> >> VIRTIO_NET_CTRL_NOTF_COAL_VQ_GET.
> >>
> >> The driver MUST ignore the values of coalescing parameters received
> >> from the VIRTIO_NET_CTRL_NOTF_COAL_VQ_GET command if the device
> >> responds with VIRTIO_NET_ERR.
> >>
> >> @@ -1794,10 +1808,10 @@ \subsubsection{Control
> >> Virtqueue}\label{sec:Device Types / Network Device / Devi
> >>
> >> The device SHOULD respond to VIRTIO_NET_CTRL_NOTF_COAL_TX_SET
> and
> >> VIRTIO_NET_CTRL_NOTF_COAL_RX_SET commands with
> VIRTIO_NET_ERR if it
> >> was not able to change the parameters.
> >>
> >> -The device MUST respond to the
> VIRTIO_NET_CTRL_NOTF_COAL_VQ_SET
> >> command with VIRTIO_NET_ERR if it was not able to change the
> parameters.
> >> +The device MUST respond to VIRTIO_NET_CTRL_NOTF_COAL_VQ_SET
> and
> >> VIRTIO_NET_CTRL_NOTF_COAL_VQS_SET commands with
> VIRTIO_NET_ERR if it
> >> was not able to change the parameters.
> >>
> >> -The device MUST respond to VIRTIO_NET_CTRL_NOTF_COAL_VQ_SET
> and
> >> VIRTIO_NET_CTRL_NOTF_COAL_VQ_GET commands with -
> VIRTIO_NET_ERR if the
> >> designated virtqueue is not an enabled transmit or receive virtqueue.
> >> +The device MUST respond to VIRTIO_NET_CTRL_NOTF_COAL_VQ_SET,
> >> +VIRTIO_NET_CTRL_NOTF_COAL_VQS_SET and
> >> VIRTIO_NET_CTRL_NOTF_COAL_VQ_GET commands with
> VIRTIO_NET_ERR if the
> >> designated virtqueue is not an enabled transmit or receive virtqueue.
> >>
> >> Upon disabling and re-enabling a transmit virtqueue, the device
> >> MUST set the coalescing parameters of the virtqueue to those
> >> configured through the VIRTIO_NET_CTRL_NOTF_COAL_TX_SET
> command, or,
> >> if the driver did not set any TX coalescing parameters, to 0.
> >> --
> >> 1.8.3.1
> >>
> >>
> >> This publicly archived list offers a means to provide input to the
> >> OASIS Virtual I/O Device (VIRTIO) TC.
> >>
> >> In order to verify user consent to the Feedback License terms and to
> >> minimize spam in the list archive, subscription is required before
> >> posting.
> >>
> >> Subscribe: virtio-comment-subscribe@lists.oasis-open.org
> >> Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
> >> List help: virtio-comment-help@lists.oasis-open.org
> >> List archive: https://lists.oasis-open.org/archives/virtio-comment/
> >> Feedback License: https://www.oasis-
> >> open.org/who/ipr/feedback_license.pdf
> >> List Guidelines:
> >> https://www.oasis-open.org/policies-guidelines/mailing-lists
> >> Committee: https://www.oasis-open.org/committees/virtio/
> >> Join OASIS: https://www.oasis-open.org/join/
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
> > For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org
^ permalink raw reply [flat|nested] 20+ messages in thread
* [virtio-dev] Re: [virtio-comment] RE: [virtio-dev] RE: [virtio-comment] [PATCH v2] virtio-net: support setting coalescing params for multiple vqs
2024-01-20 9:59 ` Parav Pandit
@ 2024-01-22 2:57 ` Heng Qi
2024-01-22 5:03 ` [virtio-dev] " Parav Pandit
0 siblings, 1 reply; 20+ messages in thread
From: Heng Qi @ 2024-01-22 2:57 UTC (permalink / raw)
To: Parav Pandit, virtio-comment@lists.oasis-open.org,
virtio-dev@lists.oasis-open.org
Cc: Jason Wang, Michael S. Tsirkin, Xuan Zhuo
在 2024/1/20 下午5:59, Parav Pandit 写道:
>> From: Heng Qi <hengqi@linux.alibaba.com>
>> Sent: Wednesday, January 17, 2024 10:22 AM
>>
>> 在 2024/1/15 下午9:21, Parav Pandit 写道:
>>>> From: virtio-comment@lists.oasis-open.org
>>>> <virtio-comment@lists.oasis- open.org> On Behalf Of Heng Qi
>>>> Sent: Monday, January 15, 2024 6:36 PM
>>>>
>>>> Currently, when each time the driver attempts to update the
>>>> coalescing parameters for a vq, it needs to kick the device and wait
>>>> for the ctrlq response to return.
>>> It does not need to wait. This is some driver limitation that does not use
>> the queue as "queue".
>>> Such driver limitation should be removed in the driver. It does not qualify
>> as limitation.
>>
>> Yes, we don't have to wait.
>>
>> But in general, for user commands, it is necessary to obtain the final results
>> synchronously.
> Yes. Use initiated command can enqueue the request to cvq. Go to sleep for several micro to milliseconds.
>
>> The user command cannot return before the final result is obtained.
>> And wait is not the problem this patch solves.
>>
> By not holding the rtnl lock, rest of the context that needs to enqueue the request can progress such as that of netdim.
Would like to see the using of rtnl lock changed.
In addition, I have made batching and asynchronousization of the netdim
command, you can refer to this patch:
https://lore.kernel.org/all/1705410693-118895-4-git-send-email-hengqi@linux.alibaba.com/
>
>>> This will enable driver to enqueue multiple cvq commands without waiting
>> for previous one.
>>> This will also enable device to find natural coalescing done on multiple
>> commands.
>>
>> When batch user commands occur, ensuring synchronization is a concern.
>>
>>>> The following path is observed: 1. Driver kicks the device; 2. After
>>>> the device receives the kick, CPU scheduling occurs and DMA multiple
>>>> buffers multiple times; 3. The device completes processing and replies
>> with a response.
>>>> When large-queue devices issue multiple requests and kick the device
>>>> frequently, this often interrupt the work of the device-side CPU.
>>> When there is large devices and multiple driver notifications by a cpu
>>> that is N times faster than the device side cpu, the device may find natural
>> coalescing on the commands of a given cvq.
>>
>> First we have to solve the ctrlq batch adding user (ethtool) command.
>> Even if processed in a batch way on device side, the number of kicks and the
>> number of backend DMAs has not been reduced.
> Driver notifications are PCI writes so it should not hamper device side, which can ignore them when they do not bother about it.
Driver notifications need to be processed by the DPU, which interferes
with the CPU on the DPU.
> Backend DMAs should be reduced by avoiding the LIFO pattern followed by the splitq driver.
> Placing the descriptors contiguously like packedq reduces amount of DMA naturally.
splitq is widely used, migrating to packedq is not that easy, especially
when there are many components and hardware involved.
>
> The second predicable DMA to avoid is having 8Bytes of data inline in the descriptor, instead of 16B indirection and extra dma.
Looking forward to working inline!
But I think this does not conflict with batch work, and combining the
two will be more beneficial.
>
>>> For multiple DMAs, we need to way to send 8 bytes of data without 16
>> bytes of indirection via a descriptor.
>>> This is what we discussed a while back to do in txq and Stefan suggested to
>> generalize for more queues, which is also a good idea.
>>
>> Yes, this sounds good.
>>
> This the next item to focus as soon as flow filters are stable/merged.
Greate!
>
>>>> In addition,
>>>> each vq request is processed separately, causing more delays for the
>>>> CPU to wait for the DMA request to complete.
>>>>
>>>> These interruptions and overhead will strain the CPU responsible for
>>>> controlling the path of the DPU, especially in multi-device and
>>>> large-queue scenarios.
>>>>
>>>> To solve the above problems, we internally tried batch request, which
>>>> merges requests from multiple queues and sends them at once. We
>>>> conservatively tested
>>> The batching done may end up modifying the given VQ's parameters
>> multiple times.
>>
>> In practice, we do not try to accumulate multiple parameter modifications for
>> a specific vqn.
> I meant to accumulate parameters of multiple VQs to batch in single command.
Yes, this is what is done now, practical example:
https://lore.kernel.org/all/1705410693-118895-3-git-send-email-hengqi@linux.alibaba.com/
In our scenario, only benefits are seen, no negative effects.
>
>>> The right test on Linux to do without rtnl lock which is anyway ugly and
>> wrong semantic to use blocking the whole netdev stack.
>>> (in case if you used that).
>> Do you have any good directions and attempts to remove rtnl_lock?
>>
> I think per device lock instead of rtnl is first step that we can start with.
Totally looking forward to it.
Thanks,
Heng
>
>>>> 8 queue commands and sent them together. The DPU processing
>>>> efficiency can be improved by 8 times, which greatly eases the DPU's
>>>> support for multi- device and multi-queue DIM.
>>>>
>>> This is good.
>> YES. Makes sense for our DPUs.
>>
>>>> Maintainers may be concerned about whether the batch command
>> method
>>>> can optimize the above problems: accumulate multiple request
>> commands
>>>> to kick the device once, and obtain the processing results of the
>>>> corresponding commands asynchronously.
>>> This is unlikely to improve, rather it will have negative impact as it only
>> means that moderation parameters are just delayed by the driver.
>> Why is it delayed by the driver? It is not delayed by the driver, the kick still
>> happens for every command.
> Since the kick is delayed, the driver lost the amount of time and device reaction time is driven by this wait time.
> The guest driver is not aware how much a shared DPU is busy.
> So if it waits too much, the VQs moderation is slow.
>
>> In theory and practice, it will not affect DIM performance, but it will
>> significantly reduce CPU consumption caused by waiting.
>>
> Once the guest driver sleeps after enqueuing the request, it should be fine.
> The DIM calls are in highly parallel worker threads, where/if the thread sleep, a new worker thread is invoked.
>
> But for sure I agree that vq notification moderation should happen quickly enough under < 100usec or so.
> May be now that we bring the aq resources, we can start migrating on it.
>
>>>> The batch command method is used by us to optimize the CPU overhead
>>>> of the DIM worker caused by the guest being busy waiting for the
>>>> command response result.
>>> In that case fixing the guest driver which is not yet written is the right fix.
>>>
>>>> This is a different focus than batch request to solve the problem.
>>>>
>>>> Suggested-by: Xiaoming Zhao <zxm377917@alibaba-inc.com>
>>>> Signed-off-by: Heng Qi <hengqi@linux.alibaba.com>
>>>> ---
>>>> v1->v2: Updated commit log. Due to sensitivity, sorry that can not
>>>> v1->give the
>>>> absolute value directly. @Michael
>>>>
>>>> device-types/net/description.tex | 26 ++++++++++++++++++++------
>>>> 1 file changed, 20 insertions(+), 6 deletions(-)
>>>>
>>>> diff --git a/device-types/net/description.tex b/device-
>>>> types/net/description.tex index aff5e08..b3766c4 100644
>>>> --- a/device-types/net/description.tex
>>>> +++ b/device-types/net/description.tex
>>>> @@ -1667,8 +1667,8 @@ \subsubsection{Control
>>>> Virtqueue}\label{sec:Device Types / Network Device / Devi for
>>>> notification coalescing.
>>>>
>>>> If the VIRTIO_NET_F_VQ_NOTF_COAL feature is negotiated, the driver
>>>> can - send commands VIRTIO_NET_CTRL_NOTF_COAL_VQ_SET and
>>>> VIRTIO_NET_CTRL_NOTF_COAL_VQ_GET -for virtqueue notification
>>>> coalescing.
>>>> +send commands VIRTIO_NET_CTRL_NOTF_COAL_VQ_SET,
>>>> +VIRTIO_NET_CTRL_NOTF_COAL_VQS_SET and
>>>> VIRTIO_NET_CTRL_NOTF_COAL_VQ_GET for virtqueue notification
>>>> coalescing.
>>>>
>>> A new feature bit is needed for this extra functionality.
>> I tried to extend it to the command VIRTIO_NET_F_VQ_NOTF_COAL, is it too
>> late?
>>
> For existing command it is not.
> But there may be a device which may not support coalesced command.
> Better to have the feature bit for this coalesce command, instead of random NOSUPP failure.
>
>
>> Thanks!
>>
>>>> \begin{lstlisting}
>>>> struct virtio_net_ctrl_coal {
>>>> @@ -1682,11 +1682,17 @@ \subsubsection{Control
>>>> Virtqueue}\label{sec:Device Types / Network Device / Devi
>>>> struct virtio_net_ctrl_coal coal;
>>>> };
>>>>
>>>> +struct virtio_net_ctrl_mrg_coal_vq {
>>>> + le16 num_entries; /* indicates number of valid entries */
>>>> + struct virtio_net_ctrl_coal_vq entries[]; };
>>>> +
>>>> #define VIRTIO_NET_CTRL_NOTF_COAL 6
>>>> #define VIRTIO_NET_CTRL_NOTF_COAL_TX_SET 0
>>>> #define VIRTIO_NET_CTRL_NOTF_COAL_RX_SET 1
>>>> #define VIRTIO_NET_CTRL_NOTF_COAL_VQ_SET 2
>>>> #define VIRTIO_NET_CTRL_NOTF_COAL_VQ_GET 3
>>>> + #define VIRTIO_NET_CTRL_NOTF_COAL_VQS_SET 4
>>>> \end{lstlisting}
>>>>
>>>> Coalescing parameters:
>>>> @@ -1706,6 +1712,7 @@ \subsubsection{Control
>>>> Virtqueue}\label{sec:Device Types / Network Device / Devi \item For
>>>> the command VIRTIO_NET_CTRL_NOTF_COAL_VQ_SET, the structure
>>>> virtio_net_ctrl_coal_vq is write-only for the driver.
>>>> \item For the command VIRTIO_NET_CTRL_NOTF_COAL_VQ_GET,
>>>> \field{vq_index} and \field{reserved} are write-only
>>>> for the driver, and the structure virtio_net_ctrl_coal is
>>>> read-only for the driver.
>>>> +\item For the command VIRTIO_NET_CTRL_NOTF_COAL_VQS_SET, the
>>>> structure virtio_net_ctrl_mrg_coal_vq is write-only for the driver.
>>>> \end{itemize}
>>>>
>>>> The class VIRTIO_NET_CTRL_NOTF_COAL has the following commands:
>>>> @@ -1716,6 +1723,9 @@ \subsubsection{Control
>>>> Virtqueue}\label{sec:Device Types / Network Device / Devi
>>>> for an enabled
>>>> transmit/receive virtqueue whose index is \field{vq_index}.
>>>> \item VIRTIO_NET_CTRL_NOTF_COAL_VQ_GET: use the structure
>>>> virtio_net_ctrl_coal_vq to get the \field{max_usecs} and
>>>> \field{max_packets} parameters
>>>> for an enabled
>>>> transmit/receive virtqueue whose index is \field{vq_index}.
>>>> +\item VIRTIO_NET_CTRL_NOTF_COAL_VQS_SET: use the structure
>>>> virtio_net_ctrl_mrg_coal_vq to set the \field{max_usecs} and
>>>> \field{max_packets} parameters
>>>> + for \field{num_entries}
>>>> + enabled transmit/receive
>>>> virtqueues. The corresponding index value
>>>> + of each configured virtqueue is \field{vq_index}.
>>>> \end{enumerate}
>>>>
>>>> The device may generate notifications more or less frequently than
>>>> specified by set commands of the VIRTIO_NET_CTRL_NOTF_COAL class.
>>>> @@ -1782,9 +1792,13 @@ \subsubsection{Control
>>>> Virtqueue}\label{sec:Device Types / Network Device / Devi
>>>>
>>>> The driver MUST set \field{vq_index} to the virtqueue index of an
>>>> enabled transmit or receive virtqueue.
>>>>
>>>> +The driver MUST set \field{num_entries} to a non-zero value and MUST
>>>> +NOT set \field{num_entries} to a value greater than the number of
>>>> +enabled
>>>> transmit and receive virtqueues.
>>>> +
>>>> The driver MUST have negotiated the VIRTIO_NET_F_NOTF_COAL
>> feature
>>>> when issuing commands VIRTIO_NET_CTRL_NOTF_COAL_TX_SET and
>>>> VIRTIO_NET_CTRL_NOTF_COAL_RX_SET.
>>>>
>>>> -The driver MUST have negotiated the VIRTIO_NET_F_VQ_NOTF_COAL
>>>> feature when issuing commands VIRTIO_NET_CTRL_NOTF_COAL_VQ_SET
>> and
>>>> VIRTIO_NET_CTRL_NOTF_COAL_VQ_GET.
>>>> +The driver MUST have negotiated the VIRTIO_NET_F_VQ_NOTF_COAL
>>>> feature
>>>> +when issuing commands VIRTIO_NET_CTRL_NOTF_COAL_VQ_SET,
>>>> VIRTIO_NET_CTRL_NOTF_COAL_VQS_SET and
>>>> VIRTIO_NET_CTRL_NOTF_COAL_VQ_GET.
>>>>
>>>> The driver MUST ignore the values of coalescing parameters received
>>>> from the VIRTIO_NET_CTRL_NOTF_COAL_VQ_GET command if the device
>>>> responds with VIRTIO_NET_ERR.
>>>>
>>>> @@ -1794,10 +1808,10 @@ \subsubsection{Control
>>>> Virtqueue}\label{sec:Device Types / Network Device / Devi
>>>>
>>>> The device SHOULD respond to VIRTIO_NET_CTRL_NOTF_COAL_TX_SET
>> and
>>>> VIRTIO_NET_CTRL_NOTF_COAL_RX_SET commands with
>> VIRTIO_NET_ERR if it
>>>> was not able to change the parameters.
>>>>
>>>> -The device MUST respond to the
>> VIRTIO_NET_CTRL_NOTF_COAL_VQ_SET
>>>> command with VIRTIO_NET_ERR if it was not able to change the
>> parameters.
>>>> +The device MUST respond to VIRTIO_NET_CTRL_NOTF_COAL_VQ_SET
>> and
>>>> VIRTIO_NET_CTRL_NOTF_COAL_VQS_SET commands with
>> VIRTIO_NET_ERR if it
>>>> was not able to change the parameters.
>>>>
>>>> -The device MUST respond to VIRTIO_NET_CTRL_NOTF_COAL_VQ_SET
>> and
>>>> VIRTIO_NET_CTRL_NOTF_COAL_VQ_GET commands with -
>> VIRTIO_NET_ERR if the
>>>> designated virtqueue is not an enabled transmit or receive virtqueue.
>>>> +The device MUST respond to VIRTIO_NET_CTRL_NOTF_COAL_VQ_SET,
>>>> +VIRTIO_NET_CTRL_NOTF_COAL_VQS_SET and
>>>> VIRTIO_NET_CTRL_NOTF_COAL_VQ_GET commands with
>> VIRTIO_NET_ERR if the
>>>> designated virtqueue is not an enabled transmit or receive virtqueue.
>>>>
>>>> Upon disabling and re-enabling a transmit virtqueue, the device
>>>> MUST set the coalescing parameters of the virtqueue to those
>>>> configured through the VIRTIO_NET_CTRL_NOTF_COAL_TX_SET
>> command, or,
>>>> if the driver did not set any TX coalescing parameters, to 0.
>>>> --
>>>> 1.8.3.1
>>>>
>>>>
>>>> This publicly archived list offers a means to provide input to the
>>>> OASIS Virtual I/O Device (VIRTIO) TC.
>>>>
>>>> In order to verify user consent to the Feedback License terms and to
>>>> minimize spam in the list archive, subscription is required before
>>>> posting.
>>>>
>>>> Subscribe: virtio-comment-subscribe@lists.oasis-open.org
>>>> Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
>>>> List help: virtio-comment-help@lists.oasis-open.org
>>>> List archive: https://lists.oasis-open.org/archives/virtio-comment/
>>>> Feedback License: https://www.oasis-
>>>> open.org/who/ipr/feedback_license.pdf
>>>> List Guidelines:
>>>> https://www.oasis-open.org/policies-guidelines/mailing-lists
>>>> Committee: https://www.oasis-open.org/committees/virtio/
>>>> Join OASIS: https://www.oasis-open.org/join/
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
>>> For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org
---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org
^ permalink raw reply [flat|nested] 20+ messages in thread
* [virtio-dev] RE: [virtio-comment] RE: [virtio-dev] RE: [virtio-comment] [PATCH v2] virtio-net: support setting coalescing params for multiple vqs
2024-01-22 2:57 ` [virtio-dev] Re: [virtio-comment] " Heng Qi
@ 2024-01-22 5:03 ` Parav Pandit
2024-01-22 7:36 ` [virtio-dev] " Michael S. Tsirkin
2024-01-24 13:01 ` [virtio-dev] " Heng Qi
0 siblings, 2 replies; 20+ messages in thread
From: Parav Pandit @ 2024-01-22 5:03 UTC (permalink / raw)
To: Heng Qi, virtio-comment@lists.oasis-open.org,
virtio-dev@lists.oasis-open.org
Cc: Jason Wang, Michael S. Tsirkin, Xuan Zhuo
> From: Heng Qi <hengqi@linux.alibaba.com>
> Sent: Monday, January 22, 2024 8:27 AM
>
> 在 2024/1/20 下午5:59, Parav Pandit 写道:
> >> From: Heng Qi <hengqi@linux.alibaba.com>
> >> Sent: Wednesday, January 17, 2024 10:22 AM
> >>
> >> 在 2024/1/15 下午9:21, Parav Pandit 写道:
> >>>> From: virtio-comment@lists.oasis-open.org
> >>>> <virtio-comment@lists.oasis- open.org> On Behalf Of Heng Qi
> >>>> Sent: Monday, January 15, 2024 6:36 PM
> >>>>
> >>>> Currently, when each time the driver attempts to update the
> >>>> coalescing parameters for a vq, it needs to kick the device and
> >>>> wait for the ctrlq response to return.
> >>> It does not need to wait. This is some driver limitation that does
> >>> not use
> >> the queue as "queue".
> >>> Such driver limitation should be removed in the driver. It does not
> >>> qualify
> >> as limitation.
> >>
> >> Yes, we don't have to wait.
> >>
> >> But in general, for user commands, it is necessary to obtain the
> >> final results synchronously.
> > Yes. Use initiated command can enqueue the request to cvq. Go to sleep
> for several micro to milliseconds.
> >
> >> The user command cannot return before the final result is obtained.
> >> And wait is not the problem this patch solves.
> >>
> > By not holding the rtnl lock, rest of the context that needs to enqueue the
> request can progress such as that of netdim.
>
> Would like to see the using of rtnl lock changed.
>
Inside the virtnet_rx_dim_work() there should be rtnl lock call.
A virtio_device level lock to be used for cvq. :)
> In addition, I have made batching and asynchronousization of the netdim
> command, you can refer to this patch:
> https://lore.kernel.org/all/1705410693-118895-4-git-send-email-
> hengqi@linux.alibaba.com/
>
In the listed above driver patch the motivation "to optimize the
CPU overhead of the DIM worker caused by the guest being busy
waiting for the command response result."
Is not right.
Because guest is still busy waiting.
Without batching, due to rtnl lock every VQ command is serialized as one outstanding command at a time in virtnet_rx_dim_work().
Due to this device is unable to take benefit of DMA batching at large scale.
> >
> >>> This will enable driver to enqueue multiple cvq commands without
> >>> waiting
> >> for previous one.
> >>> This will also enable device to find natural coalescing done on
> >>> multiple
> >> commands.
> >>
> >> When batch user commands occur, ensuring synchronization is a concern.
> >>
> >>>> The following path is observed: 1. Driver kicks the device; 2.
> >>>> After the device receives the kick, CPU scheduling occurs and DMA
> >>>> multiple buffers multiple times; 3. The device completes processing
> >>>> and replies
> >> with a response.
> >>>> When large-queue devices issue multiple requests and kick the
> >>>> device frequently, this often interrupt the work of the device-side CPU.
> >>> When there is large devices and multiple driver notifications by a
> >>> cpu that is N times faster than the device side cpu, the device may
> >>> find natural
> >> coalescing on the commands of a given cvq.
> >>
> >> First we have to solve the ctrlq batch adding user (ethtool) command.
> >> Even if processed in a batch way on device side, the number of kicks
> >> and the number of backend DMAs has not been reduced.
> > Driver notifications are PCI writes so it should not hamper device side,
> which can ignore them when they do not bother about it.
>
> Driver notifications need to be processed by the DPU, which interferes with
> the CPU on the DPU.
>
I was asking, if there is anyway to disable for your DPU to ignore these notifications while previous one is pending?
From your above description, it seems there isn’t.
> > Backend DMAs should be reduced by avoiding the LIFO pattern followed by
> the splitq driver.
> > Placing the descriptors contiguously like packedq reduces amount of DMA
> naturally.
>
> splitq is widely used, migrating to packedq is not that easy, especially when
> there are many components and hardware involved.
I am not suggesting to packed_VQ.
I am suggesting fixing the driver to not do LIFO on descriptors for splitq.
In other words, using contiguous set of descriptors on splitq will improve the splitq for DMA.
This will allow using splitq more efficiently for dma as short-term solution for DMA until more efficient queues are defined.
>
> >
> > The second predicable DMA to avoid is having 8Bytes of data inline in the
> descriptor, instead of 16B indirection and extra dma.
>
> Looking forward to working inline!
> But I think this does not conflict with batch work, and combining the two will
> be more beneficial.
>
It does not conflict. However, batching for large number of queues may not use the inline as the data bytes may not fit in the inline.
> >
> >>> For multiple DMAs, we need to way to send 8 bytes of data without 16
> >> bytes of indirection via a descriptor.
> >>> This is what we discussed a while back to do in txq and Stefan
> >>> suggested to
> >> generalize for more queues, which is also a good idea.
> >>
> >> Yes, this sounds good.
> >>
> > This the next item to focus as soon as flow filters are stable/merged.
>
> Greate!
>
> >
> >>>> In addition,
> >>>> each vq request is processed separately, causing more delays for
> >>>> the CPU to wait for the DMA request to complete.
> >>>>
> >>>> These interruptions and overhead will strain the CPU responsible
> >>>> for controlling the path of the DPU, especially in multi-device and
> >>>> large-queue scenarios.
> >>>>
> >>>> To solve the above problems, we internally tried batch request,
> >>>> which merges requests from multiple queues and sends them at once.
> >>>> We conservatively tested
> >>> The batching done may end up modifying the given VQ's parameters
> >> multiple times.
> >>
> >> In practice, we do not try to accumulate multiple parameter
> >> modifications for a specific vqn.
> > I meant to accumulate parameters of multiple VQs to batch in single
> command.
>
> Yes, this is what is done now, practical example:
> https://lore.kernel.org/all/1705410693-118895-3-git-send-email-
> hengqi@linux.alibaba.com/
> In our scenario, only benefits are seen, no negative effects.
>
The right way to test is without the rtnl lock and enqueue the command to cvq without being blocked.
The new optional batch command with new feature bit may be useful for devices to which CVQ kicks are interruptions.
> >
> >>> The right test on Linux to do without rtnl lock which is anyway ugly
> >>> and
> >> wrong semantic to use blocking the whole netdev stack.
> >>> (in case if you used that).
> >> Do you have any good directions and attempts to remove rtnl_lock?
> >>
> > I think per device lock instead of rtnl is first step that we can start with.
>
Wil check internally who if someone already started working on it.
> Totally looking forward to it.
>
> Thanks,
> Heng
>
> >
> >>>> 8 queue commands and sent them together. The DPU processing
> >>>> efficiency can be improved by 8 times, which greatly eases the
> >>>> DPU's support for multi- device and multi-queue DIM.
> >>>>
> >>> This is good.
> >> YES. Makes sense for our DPUs.
> >>
> >>>> Maintainers may be concerned about whether the batch command
> >> method
> >>>> can optimize the above problems: accumulate multiple request
> >> commands
> >>>> to kick the device once, and obtain the processing results of the
> >>>> corresponding commands asynchronously.
> >>> This is unlikely to improve, rather it will have negative impact as
> >>> it only
> >> means that moderation parameters are just delayed by the driver.
> >> Why is it delayed by the driver? It is not delayed by the driver, the
> >> kick still happens for every command.
> > Since the kick is delayed, the driver lost the amount of time and device
> reaction time is driven by this wait time.
> > The guest driver is not aware how much a shared DPU is busy.
> > So if it waits too much, the VQs moderation is slow.
> >
> >> In theory and practice, it will not affect DIM performance, but it
> >> will significantly reduce CPU consumption caused by waiting.
> >>
> > Once the guest driver sleeps after enqueuing the request, it should be fine.
> > The DIM calls are in highly parallel worker threads, where/if the thread
> sleep, a new worker thread is invoked.
> >
> > But for sure I agree that vq notification moderation should happen quickly
> enough under < 100usec or so.
> > May be now that we bring the aq resources, we can start migrating on it.
> >
> >>>> The batch command method is used by us to optimize the CPU
> overhead
> >>>> of the DIM worker caused by the guest being busy waiting for the
> >>>> command response result.
> >>> In that case fixing the guest driver which is not yet written is the right fix.
> >>>
> >>>> This is a different focus than batch request to solve the problem.
> >>>>
> >>>> Suggested-by: Xiaoming Zhao <zxm377917@alibaba-inc.com>
> >>>> Signed-off-by: Heng Qi <hengqi@linux.alibaba.com>
> >>>> ---
> >>>> v1->v2: Updated commit log. Due to sensitivity, sorry that can not
> >>>> v1->give the
> >>>> absolute value directly. @Michael
> >>>>
> >>>> device-types/net/description.tex | 26 ++++++++++++++++++++------
> >>>> 1 file changed, 20 insertions(+), 6 deletions(-)
> >>>>
> >>>> diff --git a/device-types/net/description.tex b/device-
> >>>> types/net/description.tex index aff5e08..b3766c4 100644
> >>>> --- a/device-types/net/description.tex
> >>>> +++ b/device-types/net/description.tex
> >>>> @@ -1667,8 +1667,8 @@ \subsubsection{Control
> >>>> Virtqueue}\label{sec:Device Types / Network Device / Devi for
> >>>> notification coalescing.
> >>>>
> >>>> If the VIRTIO_NET_F_VQ_NOTF_COAL feature is negotiated, the
> >>>> driver can - send commands VIRTIO_NET_CTRL_NOTF_COAL_VQ_SET
> and
> >>>> VIRTIO_NET_CTRL_NOTF_COAL_VQ_GET -for virtqueue notification
> >>>> coalescing.
> >>>> +send commands VIRTIO_NET_CTRL_NOTF_COAL_VQ_SET,
> >>>> +VIRTIO_NET_CTRL_NOTF_COAL_VQS_SET and
> >>>> VIRTIO_NET_CTRL_NOTF_COAL_VQ_GET for virtqueue notification
> >>>> coalescing.
> >>>>
> >>> A new feature bit is needed for this extra functionality.
> >> I tried to extend it to the command VIRTIO_NET_F_VQ_NOTF_COAL, is it
> >> too late?
> >>
> > For existing command it is not.
> > But there may be a device which may not support coalesced command.
> > Better to have the feature bit for this coalesce command, instead of
> random NOSUPP failure.
> >
> >
> >> Thanks!
> >>
> >>>> \begin{lstlisting}
> >>>> struct virtio_net_ctrl_coal {
> >>>> @@ -1682,11 +1682,17 @@ \subsubsection{Control
> >>>> Virtqueue}\label{sec:Device Types / Network Device / Devi
> >>>> struct virtio_net_ctrl_coal coal;
> >>>> };
> >>>>
> >>>> +struct virtio_net_ctrl_mrg_coal_vq {
> >>>> + le16 num_entries; /* indicates number of valid entries */
> >>>> + struct virtio_net_ctrl_coal_vq entries[]; };
> >>>> +
> >>>> #define VIRTIO_NET_CTRL_NOTF_COAL 6
> >>>> #define VIRTIO_NET_CTRL_NOTF_COAL_TX_SET 0
> >>>> #define VIRTIO_NET_CTRL_NOTF_COAL_RX_SET 1
> >>>> #define VIRTIO_NET_CTRL_NOTF_COAL_VQ_SET 2
> >>>> #define VIRTIO_NET_CTRL_NOTF_COAL_VQ_GET 3
> >>>> + #define VIRTIO_NET_CTRL_NOTF_COAL_VQS_SET 4
> >>>> \end{lstlisting}
> >>>>
> >>>> Coalescing parameters:
> >>>> @@ -1706,6 +1712,7 @@ \subsubsection{Control
> >>>> Virtqueue}\label{sec:Device Types / Network Device / Devi \item
> >>>> For the command VIRTIO_NET_CTRL_NOTF_COAL_VQ_SET, the
> structure
> >>>> virtio_net_ctrl_coal_vq is write-only for the driver.
> >>>> \item For the command VIRTIO_NET_CTRL_NOTF_COAL_VQ_GET,
> >>>> \field{vq_index} and \field{reserved} are write-only
> >>>> for the driver, and the structure virtio_net_ctrl_coal is
> >>>> read-only for the driver.
> >>>> +\item For the command VIRTIO_NET_CTRL_NOTF_COAL_VQS_SET, the
> >>>> structure virtio_net_ctrl_mrg_coal_vq is write-only for the driver.
> >>>> \end{itemize}
> >>>>
> >>>> The class VIRTIO_NET_CTRL_NOTF_COAL has the following
> commands:
> >>>> @@ -1716,6 +1723,9 @@ \subsubsection{Control
> >>>> Virtqueue}\label{sec:Device Types / Network Device / Devi
> >>>> for an enabled
> >>>> transmit/receive virtqueue whose index is \field{vq_index}.
> >>>> \item VIRTIO_NET_CTRL_NOTF_COAL_VQ_GET: use the structure
> >>>> virtio_net_ctrl_coal_vq to get the \field{max_usecs} and
> >>>> \field{max_packets} parameters
> >>>> for an enabled
> >>>> transmit/receive virtqueue whose index is \field{vq_index}.
> >>>> +\item VIRTIO_NET_CTRL_NOTF_COAL_VQS_SET: use the structure
> >>>> virtio_net_ctrl_mrg_coal_vq to set the \field{max_usecs} and
> >>>> \field{max_packets} parameters
> >>>> + for \field{num_entries}
> >>>> + enabled transmit/receive
> >>>> virtqueues. The corresponding index value
> >>>> + of each configured virtqueue is
> \field{vq_index}.
> >>>> \end{enumerate}
> >>>>
> >>>> The device may generate notifications more or less frequently
> >>>> than specified by set commands of the VIRTIO_NET_CTRL_NOTF_COAL
> class.
> >>>> @@ -1782,9 +1792,13 @@ \subsubsection{Control
> >>>> Virtqueue}\label{sec:Device Types / Network Device / Devi
> >>>>
> >>>> The driver MUST set \field{vq_index} to the virtqueue index of
> >>>> an enabled transmit or receive virtqueue.
> >>>>
> >>>> +The driver MUST set \field{num_entries} to a non-zero value and
> >>>> +MUST NOT set \field{num_entries} to a value greater than the
> >>>> +number of enabled
> >>>> transmit and receive virtqueues.
> >>>> +
> >>>> The driver MUST have negotiated the VIRTIO_NET_F_NOTF_COAL
> >> feature
> >>>> when issuing commands VIRTIO_NET_CTRL_NOTF_COAL_TX_SET and
> >>>> VIRTIO_NET_CTRL_NOTF_COAL_RX_SET.
> >>>>
> >>>> -The driver MUST have negotiated the VIRTIO_NET_F_VQ_NOTF_COAL
> >>>> feature when issuing commands
> VIRTIO_NET_CTRL_NOTF_COAL_VQ_SET
> >> and
> >>>> VIRTIO_NET_CTRL_NOTF_COAL_VQ_GET.
> >>>> +The driver MUST have negotiated the VIRTIO_NET_F_VQ_NOTF_COAL
> >>>> feature
> >>>> +when issuing commands VIRTIO_NET_CTRL_NOTF_COAL_VQ_SET,
> >>>> VIRTIO_NET_CTRL_NOTF_COAL_VQS_SET and
> >>>> VIRTIO_NET_CTRL_NOTF_COAL_VQ_GET.
> >>>>
> >>>> The driver MUST ignore the values of coalescing parameters
> >>>> received from the VIRTIO_NET_CTRL_NOTF_COAL_VQ_GET command if
> the
> >>>> device responds with VIRTIO_NET_ERR.
> >>>>
> >>>> @@ -1794,10 +1808,10 @@ \subsubsection{Control
> >>>> Virtqueue}\label{sec:Device Types / Network Device / Devi
> >>>>
> >>>> The device SHOULD respond to
> VIRTIO_NET_CTRL_NOTF_COAL_TX_SET
> >> and
> >>>> VIRTIO_NET_CTRL_NOTF_COAL_RX_SET commands with
> >> VIRTIO_NET_ERR if it
> >>>> was not able to change the parameters.
> >>>>
> >>>> -The device MUST respond to the
> >> VIRTIO_NET_CTRL_NOTF_COAL_VQ_SET
> >>>> command with VIRTIO_NET_ERR if it was not able to change the
> >> parameters.
> >>>> +The device MUST respond to VIRTIO_NET_CTRL_NOTF_COAL_VQ_SET
> >> and
> >>>> VIRTIO_NET_CTRL_NOTF_COAL_VQS_SET commands with
> >> VIRTIO_NET_ERR if it
> >>>> was not able to change the parameters.
> >>>>
> >>>> -The device MUST respond to VIRTIO_NET_CTRL_NOTF_COAL_VQ_SET
> >> and
> >>>> VIRTIO_NET_CTRL_NOTF_COAL_VQ_GET commands with -
> >> VIRTIO_NET_ERR if the
> >>>> designated virtqueue is not an enabled transmit or receive virtqueue.
> >>>> +The device MUST respond to VIRTIO_NET_CTRL_NOTF_COAL_VQ_SET,
> >>>> +VIRTIO_NET_CTRL_NOTF_COAL_VQS_SET and
> >>>> VIRTIO_NET_CTRL_NOTF_COAL_VQ_GET commands with
> >> VIRTIO_NET_ERR if the
> >>>> designated virtqueue is not an enabled transmit or receive virtqueue.
> >>>>
> >>>> Upon disabling and re-enabling a transmit virtqueue, the device
> >>>> MUST set the coalescing parameters of the virtqueue to those
> >>>> configured through the VIRTIO_NET_CTRL_NOTF_COAL_TX_SET
> >> command, or,
> >>>> if the driver did not set any TX coalescing parameters, to 0.
> >>>> --
> >>>> 1.8.3.1
> >>>>
> >>>>
> >>>> This publicly archived list offers a means to provide input to the
> >>>> OASIS Virtual I/O Device (VIRTIO) TC.
> >>>>
> >>>> In order to verify user consent to the Feedback License terms and
> >>>> to minimize spam in the list archive, subscription is required
> >>>> before posting.
> >>>>
> >>>> Subscribe: virtio-comment-subscribe@lists.oasis-open.org
> >>>> Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
> >>>> List help: virtio-comment-help@lists.oasis-open.org
> >>>> List archive: https://lists.oasis-open.org/archives/virtio-comment/
> >>>> Feedback License: https://www.oasis-
> >>>> open.org/who/ipr/feedback_license.pdf
> >>>> List Guidelines:
> >>>> https://www.oasis-open.org/policies-guidelines/mailing-lists
> >>>> Committee: https://www.oasis-open.org/committees/virtio/
> >>>> Join OASIS: https://www.oasis-open.org/join/
> >>> --------------------------------------------------------------------
> >>> - To unsubscribe, e-mail:
> >>> virtio-dev-unsubscribe@lists.oasis-open.org
> >>> For additional commands, e-mail:
> >>> virtio-dev-help@lists.oasis-open.org
^ permalink raw reply [flat|nested] 20+ messages in thread
* [virtio-dev] Re: [virtio-comment] RE: [virtio-dev] RE: [virtio-comment] [PATCH v2] virtio-net: support setting coalescing params for multiple vqs
2024-01-22 5:03 ` [virtio-dev] " Parav Pandit
@ 2024-01-22 7:36 ` Michael S. Tsirkin
2024-01-23 5:55 ` [virtio-dev] " Parav Pandit
2024-01-24 13:01 ` [virtio-dev] " Heng Qi
1 sibling, 1 reply; 20+ messages in thread
From: Michael S. Tsirkin @ 2024-01-22 7:36 UTC (permalink / raw)
To: Parav Pandit
Cc: Heng Qi, virtio-comment@lists.oasis-open.org,
virtio-dev@lists.oasis-open.org, Jason Wang, Xuan Zhuo
On Mon, Jan 22, 2024 at 05:03:38AM +0000, Parav Pandit wrote:
> > >>> The right test on Linux to do without rtnl lock which is anyway ugly
> > >>> and
> > >> wrong semantic to use blocking the whole netdev stack.
> > >>> (in case if you used that).
> > >> Do you have any good directions and attempts to remove rtnl_lock?
> > >>
> > > I think per device lock instead of rtnl is first step that we can start with.
> >
> Wil check internally who if someone already started working on it.
I feel the issue is at the conceptual level. Yes some drivers will take
a command and just queue it for execution later, but this means that
errors can not be propagated back at all. Imagine device with mac
0x123 in promisc mode. Now commands:
1- program MAC 0xabcdef
2- disable promisc mode
If command 1 fails but 2 proceeds then packets with MAC 0xabc
will be dropped.
Any attempts to batch arbitrary commands will have this issue -
be it at driver or device level.
So, here's my question: what exactly is the guest behaviour
that is driving this work? Is it with a linux guest? which
commands does userspace issue that we need to send multiple
vq coalescing commands? If all you want is to send
same config to all VQs then why not just use
VIRTIO_NET_CTRL_NOTF_COAL_RX_SET as opposed to
VIRTIO_NET_CTRL_NOTF_COAL_VQ_SET ?
--
MST
---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org
^ permalink raw reply [flat|nested] 20+ messages in thread
* [virtio-dev] RE: [virtio-comment] RE: [virtio-dev] RE: [virtio-comment] [PATCH v2] virtio-net: support setting coalescing params for multiple vqs
2024-01-22 7:36 ` [virtio-dev] " Michael S. Tsirkin
@ 2024-01-23 5:55 ` Parav Pandit
2024-01-23 7:15 ` [virtio-dev] " Michael S. Tsirkin
0 siblings, 1 reply; 20+ messages in thread
From: Parav Pandit @ 2024-01-23 5:55 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: Heng Qi, virtio-comment@lists.oasis-open.org,
virtio-dev@lists.oasis-open.org, Jason Wang, Xuan Zhuo
> From: Michael S. Tsirkin <mst@redhat.com>
> Sent: Monday, January 22, 2024 1:06 PM
>
> On Mon, Jan 22, 2024 at 05:03:38AM +0000, Parav Pandit wrote:
> > > >>> The right test on Linux to do without rtnl lock which is anyway
> > > >>> ugly and
> > > >> wrong semantic to use blocking the whole netdev stack.
> > > >>> (in case if you used that).
> > > >> Do you have any good directions and attempts to remove rtnl_lock?
> > > >>
> > > > I think per device lock instead of rtnl is first step that we can start with.
> > >
> > Wil check internally who if someone already started working on it.
>
> I feel the issue is at the conceptual level.
Not for requests which are initiated by the kernel stack (non user initiated).
> Yes some drivers will take a command
> and just queue it for execution later, but this means that errors can not be
> propagated back at all. Imagine device with mac
> 0x123 in promisc mode. Now commands:
>
> 1- program MAC 0xabcdef
> 2- disable promisc mode
>
User initiated commands error can be propagated when the command completes.
Enqueuing command is at the different bottom level in the driver.
> If command 1 fails but 2 proceeds then packets with MAC 0xabc will be
> dropped.
>
> Any attempts to batch arbitrary commands will have this issue - be it at driver
> or device level.
>
There is no suggestion to batch arbitrary commands from the driver side.
The suggestion is to batch VQs notification coalescing from the driver side.
> So, here's my question: what exactly is the guest behaviour that is driving this
> work? Is it with a linux guest?
At least looks to me yes based on the partial patches which are taking rtnl lock on netdim's worker callbacks.
> which commands does userspace issue that we
> need to send multiple vq coalescing commands?
None.
> If all you want is to send
> same config to all VQs then why not just use
> VIRTIO_NET_CTRL_NOTF_COAL_RX_SET as opposed to
> VIRTIO_NET_CTRL_NOTF_COAL_VQ_SET ?
Only kernel stack initiated VQ notification coalescing changes.
Since every VQ has different values, VIRTIO_NET_CTRL_NOTF_COAL_RX_SET is not sufficient.
---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org
^ permalink raw reply [flat|nested] 20+ messages in thread
* [virtio-dev] Re: [virtio-comment] RE: [virtio-dev] RE: [virtio-comment] [PATCH v2] virtio-net: support setting coalescing params for multiple vqs
2024-01-23 5:55 ` [virtio-dev] " Parav Pandit
@ 2024-01-23 7:15 ` Michael S. Tsirkin
2024-01-23 7:28 ` [virtio-dev] " Parav Pandit
2024-01-24 2:43 ` [virtio-dev] " Heng Qi
0 siblings, 2 replies; 20+ messages in thread
From: Michael S. Tsirkin @ 2024-01-23 7:15 UTC (permalink / raw)
To: Parav Pandit
Cc: Heng Qi, virtio-comment@lists.oasis-open.org,
virtio-dev@lists.oasis-open.org, Jason Wang, Xuan Zhuo
On Tue, Jan 23, 2024 at 05:55:02AM +0000, Parav Pandit wrote:
> > From: Michael S. Tsirkin <mst@redhat.com>
> > Sent: Monday, January 22, 2024 1:06 PM
> >
> > On Mon, Jan 22, 2024 at 05:03:38AM +0000, Parav Pandit wrote:
> > > > >>> The right test on Linux to do without rtnl lock which is anyway
> > > > >>> ugly and
> > > > >> wrong semantic to use blocking the whole netdev stack.
> > > > >>> (in case if you used that).
> > > > >> Do you have any good directions and attempts to remove rtnl_lock?
> > > > >>
> > > > > I think per device lock instead of rtnl is first step that we can start with.
> > > >
> > > Wil check internally who if someone already started working on it.
> >
> > I feel the issue is at the conceptual level.
> Not for requests which are initiated by the kernel stack (non user initiated).
So how is this different? Is it basically just because
tweaking coalescing in unexpected ways is considered mostly
harmless?
> > Yes some drivers will take a command
> > and just queue it for execution later, but this means that errors can not be
> > propagated back at all. Imagine device with mac
> > 0x123 in promisc mode. Now commands:
> >
> > 1- program MAC 0xabcdef
> > 2- disable promisc mode
> >
> User initiated commands error can be propagated when the command completes.
> Enqueuing command is at the different bottom level in the driver.
>
> > If command 1 fails but 2 proceeds then packets with MAC 0xabc will be
> > dropped.
> >
> > Any attempts to batch arbitrary commands will have this issue - be it at driver
> > or device level.
> >
> There is no suggestion to batch arbitrary commands from the driver side.
> The suggestion is to batch VQs notification coalescing from the driver side.
>
> > So, here's my question: what exactly is the guest behaviour that is driving this
> > work? Is it with a linux guest?
> At least looks to me yes based on the partial patches which are taking rtnl lock on netdim's worker callbacks.
>
> > which commands does userspace issue that we
> > need to send multiple vq coalescing commands?
> None.
>
> > If all you want is to send
> > same config to all VQs then why not just use
> > VIRTIO_NET_CTRL_NOTF_COAL_RX_SET as opposed to
> > VIRTIO_NET_CTRL_NOTF_COAL_VQ_SET ?
> Only kernel stack initiated VQ notification coalescing changes.
> Since every VQ has different values, VIRTIO_NET_CTRL_NOTF_COAL_RX_SET is not sufficient.
---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org
^ permalink raw reply [flat|nested] 20+ messages in thread
* [virtio-dev] RE: [virtio-comment] RE: [virtio-dev] RE: [virtio-comment] [PATCH v2] virtio-net: support setting coalescing params for multiple vqs
2024-01-23 7:15 ` [virtio-dev] " Michael S. Tsirkin
@ 2024-01-23 7:28 ` Parav Pandit
2024-01-24 2:43 ` [virtio-dev] " Heng Qi
1 sibling, 0 replies; 20+ messages in thread
From: Parav Pandit @ 2024-01-23 7:28 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: Heng Qi, virtio-comment@lists.oasis-open.org,
virtio-dev@lists.oasis-open.org, Jason Wang, Xuan Zhuo
> From: Michael S. Tsirkin <mst@redhat.com>
> Sent: Tuesday, January 23, 2024 12:45 PM
>
> On Tue, Jan 23, 2024 at 05:55:02AM +0000, Parav Pandit wrote:
> > > From: Michael S. Tsirkin <mst@redhat.com>
> > > Sent: Monday, January 22, 2024 1:06 PM
> > >
> > > On Mon, Jan 22, 2024 at 05:03:38AM +0000, Parav Pandit wrote:
> > > > > >>> The right test on Linux to do without rtnl lock which is
> > > > > >>> anyway ugly and
> > > > > >> wrong semantic to use blocking the whole netdev stack.
> > > > > >>> (in case if you used that).
> > > > > >> Do you have any good directions and attempts to remove rtnl_lock?
> > > > > >>
> > > > > > I think per device lock instead of rtnl is first step that we can start
> with.
> > > > >
> > > > Wil check internally who if someone already started working on it.
> > >
> > > I feel the issue is at the conceptual level.
> > Not for requests which are initiated by the kernel stack (non user initiated).
>
> So how is this different? Is it basically just because tweaking coalescing in
> unexpected ways is considered mostly harmless?
>
It is different in two ways.
1. requests are unrelated to each other (unlike the promisc/mac example).
For example, changing vq1.params have no effect on vq2.params.
2. Therefore they don't need to be blocked.
3. There is no user expecting a success/failure on this command.
If it fails, possibly the queues can be marked in error for recovery/re-enablement.
> > > Yes some drivers will take a command and just queue it for execution
> > > later, but this means that errors can not be propagated back at all.
> > > Imagine device with mac
> > > 0x123 in promisc mode. Now commands:
> > >
> > > 1- program MAC 0xabcdef
> > > 2- disable promisc mode
> > >
> > User initiated commands error can be propagated when the command
> completes.
> > Enqueuing command is at the different bottom level in the driver.
> >
> > > If command 1 fails but 2 proceeds then packets with MAC 0xabc will
> > > be dropped.
> > >
> > > Any attempts to batch arbitrary commands will have this issue - be
> > > it at driver or device level.
> > >
> > There is no suggestion to batch arbitrary commands from the driver side.
> > The suggestion is to batch VQs notification coalescing from the driver side.
> >
> > > So, here's my question: what exactly is the guest behaviour that is
> > > driving this work? Is it with a linux guest?
> > At least looks to me yes based on the partial patches which are taking rtnl
> lock on netdim's worker callbacks.
> >
> > > which commands does userspace issue that we need to send multiple vq
> > > coalescing commands?
> > None.
> >
> > > If all you want is to send
> > > same config to all VQs then why not just use
> > > VIRTIO_NET_CTRL_NOTF_COAL_RX_SET as opposed to
> > > VIRTIO_NET_CTRL_NOTF_COAL_VQ_SET ?
> > Only kernel stack initiated VQ notification coalescing changes.
> > Since every VQ has different values, VIRTIO_NET_CTRL_NOTF_COAL_RX_SET
> is not sufficient.
---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [virtio-dev] Re: [virtio-comment] RE: [virtio-dev] RE: [virtio-comment] [PATCH v2] virtio-net: support setting coalescing params for multiple vqs
2024-01-23 7:15 ` [virtio-dev] " Michael S. Tsirkin
2024-01-23 7:28 ` [virtio-dev] " Parav Pandit
@ 2024-01-24 2:43 ` Heng Qi
1 sibling, 0 replies; 20+ messages in thread
From: Heng Qi @ 2024-01-24 2:43 UTC (permalink / raw)
To: Michael S. Tsirkin, Parav Pandit
Cc: virtio-comment@lists.oasis-open.org,
virtio-dev@lists.oasis-open.org, Jason Wang, Xuan Zhuo
在 2024/1/23 下午3:15, Michael S. Tsirkin 写道:
> On Tue, Jan 23, 2024 at 05:55:02AM +0000, Parav Pandit wrote:
>>> From: Michael S. Tsirkin <mst@redhat.com>
>>> Sent: Monday, January 22, 2024 1:06 PM
>>>
>>> On Mon, Jan 22, 2024 at 05:03:38AM +0000, Parav Pandit wrote:
>>>>>>>> The right test on Linux to do without rtnl lock which is anyway
>>>>>>>> ugly and
>>>>>>> wrong semantic to use blocking the whole netdev stack.
>>>>>>>> (in case if you used that).
>>>>>>> Do you have any good directions and attempts to remove rtnl_lock?
>>>>>>>
>>>>>> I think per device lock instead of rtnl is first step that we can start with.
>>>> Wil check internally who if someone already started working on it.
>>> I feel the issue is at the conceptual level.
>> Not for requests which are initiated by the kernel stack (non user initiated).
> So how is this different? Is it basically just because
> tweaking coalescing in unexpected ways is considered mostly
> harmless?
DIM sends configurations frequently, which is try best.
>
>>> Yes some drivers will take a command
>>> and just queue it for execution later, but this means that errors can not be
>>> propagated back at all. Imagine device with mac
>>> 0x123 in promisc mode. Now commands:
>>>
>>> 1- program MAC 0xabcdef
>>> 2- disable promisc mode
>>>
>> User initiated commands error can be propagated when the command completes.
>> Enqueuing command is at the different bottom level in the driver.
>>
>>> If command 1 fails but 2 proceeds then packets with MAC 0xabc will be
>>> dropped.
>>>
>>> Any attempts to batch arbitrary commands will have this issue - be it at driver
>>> or device level.
>>>
>> There is no suggestion to batch arbitrary commands from the driver side.
>> The suggestion is to batch VQs notification coalescing from the driver side.
>>
>>> So, here's my question: what exactly is the guest behaviour that is driving this
>>> work? Is it with a linux guest?
>> At least looks to me yes based on the partial patches which are taking rtnl lock on netdim's worker callbacks.
>>
>>> which commands does userspace issue that we
>>> need to send multiple vq coalescing commands?
>> None.
>>
>>> If all you want is to send
>>> same config to all VQs then why not just use
>>> VIRTIO_NET_CTRL_NOTF_COAL_RX_SET as opposed to
>>> VIRTIO_NET_CTRL_NOTF_COAL_VQ_SET ?
>> Only kernel stack initiated VQ notification coalescing changes.
>> Since every VQ has different values, VIRTIO_NET_CTRL_NOTF_COAL_RX_SET is not sufficient.
---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org
^ permalink raw reply [flat|nested] 20+ messages in thread
* [virtio-dev] Re: [PATCH v2] virtio-net: support setting coalescing params for multiple vqs
2024-01-15 13:06 [virtio-dev] [PATCH v2] virtio-net: support setting coalescing params for multiple vqs Heng Qi
2024-01-15 13:21 ` [virtio-dev] RE: [virtio-comment] " Parav Pandit
2024-01-15 22:58 ` [virtio-dev] " Michael S. Tsirkin
@ 2024-01-24 10:50 ` Michael S. Tsirkin
2024-01-24 10:52 ` Michael S. Tsirkin
2 siblings, 1 reply; 20+ messages in thread
From: Michael S. Tsirkin @ 2024-01-24 10:50 UTC (permalink / raw)
To: Heng Qi; +Cc: virtio-comment, virtio-dev, Jason Wang, Xuan Zhuo
On Mon, Jan 15, 2024 at 09:06:02PM +0800, Heng Qi wrote:
> Currently, when each time the driver attempts to update the coalescing parameters
> for a vq, it needs to kick the device and wait for the ctrlq response to return.
> The following path is observed: 1. Driver kicks the device; 2. After the device
> receives the kick, CPU scheduling occurs and DMA multiple buffers multiple times;
> 3. The device completes processing and replies with a response.
>
> When large-queue devices issue multiple requests and kick the device frequently,
> this often interrupt the work of the device-side CPU. In addition, each vq request
> is processed separately, causing more delays for the CPU to wait for the DMA
> request to complete.
>
> These interruptions and overhead will strain the CPU responsible for controlling
> the path of the DPU, especially in multi-device and large-queue scenarios.
>
> To solve the above problems, we internally tried batch request, which merges
> requests from multiple queues and sends them at once. We conservatively tested
> 8 queue commands and sent them together. The DPU processing efficiency can be
> improved by 8 times, which greatly eases the DPU's support for multi-device
> and multi-queue DIM.
>
> Maintainers may be concerned about whether the batch command method can optimize
> the above problems: accumulate multiple request commands to kick the device once,
> and obtain the processing results of the corresponding commands asynchronously.
> The batch command method is used by us to optimize the CPU overhead of the DIM
> worker caused by the guest being busy waiting for the command response result.
> This is a different focus than batch request to solve the problem.
>
> Suggested-by: Xiaoming Zhao <zxm377917@alibaba-inc.com>
> Signed-off-by: Heng Qi <hengqi@linux.alibaba.com>
> ---
> v1->v2: Updated commit log. Due to sensitivity, sorry that can not give the
> absolute value directly. @Michael
>
> device-types/net/description.tex | 26 ++++++++++++++++++++------
> 1 file changed, 20 insertions(+), 6 deletions(-)
>
> diff --git a/device-types/net/description.tex b/device-types/net/description.tex
> index aff5e08..b3766c4 100644
> --- a/device-types/net/description.tex
> +++ b/device-types/net/description.tex
> @@ -1667,8 +1667,8 @@ \subsubsection{Control Virtqueue}\label{sec:Device Types / Network Device / Devi
> for notification coalescing.
>
> If the VIRTIO_NET_F_VQ_NOTF_COAL feature is negotiated, the driver can
> -send commands VIRTIO_NET_CTRL_NOTF_COAL_VQ_SET and VIRTIO_NET_CTRL_NOTF_COAL_VQ_GET
> -for virtqueue notification coalescing.
> +send commands VIRTIO_NET_CTRL_NOTF_COAL_VQ_SET, VIRTIO_NET_CTRL_NOTF_COAL_VQS_SET and
> +VIRTIO_NET_CTRL_NOTF_COAL_VQ_GET for virtqueue notification coalescing.
>
> \begin{lstlisting}
> struct virtio_net_ctrl_coal {
> @@ -1682,11 +1682,17 @@ \subsubsection{Control Virtqueue}\label{sec:Device Types / Network Device / Devi
> struct virtio_net_ctrl_coal coal;
> };
>
> +struct virtio_net_ctrl_mrg_coal_vq {
> + le16 num_entries; /* indicates number of valid entries */
> + struct virtio_net_ctrl_coal_vq entries[];
> +};
> +
> #define VIRTIO_NET_CTRL_NOTF_COAL 6
> #define VIRTIO_NET_CTRL_NOTF_COAL_TX_SET 0
> #define VIRTIO_NET_CTRL_NOTF_COAL_RX_SET 1
> #define VIRTIO_NET_CTRL_NOTF_COAL_VQ_SET 2
> #define VIRTIO_NET_CTRL_NOTF_COAL_VQ_GET 3
> + #define VIRTIO_NET_CTRL_NOTF_COAL_VQS_SET 4
> \end{lstlisting}
>
> Coalescing parameters:
> @@ -1706,6 +1712,7 @@ \subsubsection{Control Virtqueue}\label{sec:Device Types / Network Device / Devi
> \item For the command VIRTIO_NET_CTRL_NOTF_COAL_VQ_SET, the structure virtio_net_ctrl_coal_vq is write-only for the driver.
> \item For the command VIRTIO_NET_CTRL_NOTF_COAL_VQ_GET, \field{vq_index} and \field{reserved} are write-only
> for the driver, and the structure virtio_net_ctrl_coal is read-only for the driver.
> +\item For the command VIRTIO_NET_CTRL_NOTF_COAL_VQS_SET, the structure virtio_net_ctrl_mrg_coal_vq is write-only for the driver.
> \end{itemize}
>
> The class VIRTIO_NET_CTRL_NOTF_COAL has the following commands:
> @@ -1716,6 +1723,9 @@ \subsubsection{Control Virtqueue}\label{sec:Device Types / Network Device / Devi
> for an enabled transmit/receive virtqueue whose index is \field{vq_index}.
> \item VIRTIO_NET_CTRL_NOTF_COAL_VQ_GET: use the structure virtio_net_ctrl_coal_vq to get the \field{max_usecs} and \field{max_packets} parameters
> for an enabled transmit/receive virtqueue whose index is \field{vq_index}.
> +\item VIRTIO_NET_CTRL_NOTF_COAL_VQS_SET: use the structure virtio_net_ctrl_mrg_coal_vq to set the \field{max_usecs} and \field{max_packets} parameters
> + for \field{num_entries} enabled transmit/receive virtqueues. The corresponding index value
> + of each configured virtqueue is \field{vq_index}.
> \end{enumerate}
>
> The device may generate notifications more or less frequently than specified by set commands of the VIRTIO_NET_CTRL_NOTF_COAL class.
> @@ -1782,9 +1792,13 @@ \subsubsection{Control Virtqueue}\label{sec:Device Types / Network Device / Devi
>
> The driver MUST set \field{vq_index} to the virtqueue index of an enabled transmit or receive virtqueue.
>
> +The driver MUST set \field{num_entries} to a non-zero value and MUST NOT set \field{num_entries} to
> +a value greater than the number of enabled transmit and receive virtqueues.
> +
> The driver MUST have negotiated the VIRTIO_NET_F_NOTF_COAL feature when issuing commands VIRTIO_NET_CTRL_NOTF_COAL_TX_SET and VIRTIO_NET_CTRL_NOTF_COAL_RX_SET.
>
> -The driver MUST have negotiated the VIRTIO_NET_F_VQ_NOTF_COAL feature when issuing commands VIRTIO_NET_CTRL_NOTF_COAL_VQ_SET and VIRTIO_NET_CTRL_NOTF_COAL_VQ_GET.
> +The driver MUST have negotiated the VIRTIO_NET_F_VQ_NOTF_COAL feature when issuing commands
> +VIRTIO_NET_CTRL_NOTF_COAL_VQ_SET, VIRTIO_NET_CTRL_NOTF_COAL_VQS_SET and VIRTIO_NET_CTRL_NOTF_COAL_VQ_GET.
>
> The driver MUST ignore the values of coalescing parameters received from the VIRTIO_NET_CTRL_NOTF_COAL_VQ_GET command if the device responds with VIRTIO_NET_ERR.
>
> @@ -1794,10 +1808,10 @@ \subsubsection{Control Virtqueue}\label{sec:Device Types / Network Device / Devi
>
> The device SHOULD respond to VIRTIO_NET_CTRL_NOTF_COAL_TX_SET and VIRTIO_NET_CTRL_NOTF_COAL_RX_SET commands with VIRTIO_NET_ERR if it was not able to change the parameters.
>
> -The device MUST respond to the VIRTIO_NET_CTRL_NOTF_COAL_VQ_SET command with VIRTIO_NET_ERR if it was not able to change the parameters.
> +The device MUST respond to VIRTIO_NET_CTRL_NOTF_COAL_VQ_SET and VIRTIO_NET_CTRL_NOTF_COAL_VQS_SET commands with VIRTIO_NET_ERR if it was not able to change the parameters.
This one however is vague.
In fact the addition of VIRTIO_NET_CTRL_NOTF_COAL_VQ_SET made
VIRTIO_NET_CTRL_NOTF_COAL_TX_SET and VIRTIO_NET_CTRL_NOTF_COAL_RX_SET
also vague we just missed this.
So, it is not clear whether if the device responds with VIRTIO_NET_ERR - it can change
parameters for some VQs but not others, or does it have to either
change parameters for all VQs or none at all.
I think a better sematic is all or nothing.
So I suggest combining this statement with one of
VIRTIO_NET_CTRL_NOTF_COAL_TX_SET and VIRTIO_NET_CTRL_NOTF_COAL_RX_SET,
and adding something like
"in this case, coalescing parameters MUST remain unchanged,
for all VQs".
This bug bothers me. Maybe we should fix it before release?
I also don't know why we repeat same text for VIRTIO_NET_CTRL_NOTF_COAL_VQ_SET,
VIRTIO_NET_CTRL_NOTF_COAL_TX_SET and VIRTIO_NET_CTRL_NOTF_COAL_RX_SET.
>
> -The device MUST respond to VIRTIO_NET_CTRL_NOTF_COAL_VQ_SET and VIRTIO_NET_CTRL_NOTF_COAL_VQ_GET commands with
> -VIRTIO_NET_ERR if the designated virtqueue is not an enabled transmit or receive virtqueue.
> +The device MUST respond to VIRTIO_NET_CTRL_NOTF_COAL_VQ_SET, VIRTIO_NET_CTRL_NOTF_COAL_VQS_SET and VIRTIO_NET_CTRL_NOTF_COAL_VQ_GET
> +commands with VIRTIO_NET_ERR if the designated virtqueue is not an enabled transmit or receive virtqueue.
>
> Upon disabling and re-enabling a transmit virtqueue, the device MUST set the coalescing parameters of the virtqueue
> to those configured through the VIRTIO_NET_CTRL_NOTF_COAL_TX_SET command, or, if the driver did not set any TX coalescing parameters, to 0.
> --
> 1.8.3.1
---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org
^ permalink raw reply [flat|nested] 20+ messages in thread
* [virtio-dev] Re: [PATCH v2] virtio-net: support setting coalescing params for multiple vqs
2024-01-24 10:50 ` Michael S. Tsirkin
@ 2024-01-24 10:52 ` Michael S. Tsirkin
2024-01-25 2:33 ` [virtio-dev] Re: [virtio-comment] " Heng Qi
0 siblings, 1 reply; 20+ messages in thread
From: Michael S. Tsirkin @ 2024-01-24 10:52 UTC (permalink / raw)
To: Heng Qi; +Cc: virtio-comment, virtio-dev, Jason Wang, Xuan Zhuo
On Wed, Jan 24, 2024 at 05:50:28AM -0500, Michael S. Tsirkin wrote:
> On Mon, Jan 15, 2024 at 09:06:02PM +0800, Heng Qi wrote:
> > Currently, when each time the driver attempts to update the coalescing parameters
> > for a vq, it needs to kick the device and wait for the ctrlq response to return.
> > The following path is observed: 1. Driver kicks the device; 2. After the device
> > receives the kick, CPU scheduling occurs and DMA multiple buffers multiple times;
> > 3. The device completes processing and replies with a response.
> >
> > When large-queue devices issue multiple requests and kick the device frequently,
> > this often interrupt the work of the device-side CPU. In addition, each vq request
> > is processed separately, causing more delays for the CPU to wait for the DMA
> > request to complete.
> >
> > These interruptions and overhead will strain the CPU responsible for controlling
> > the path of the DPU, especially in multi-device and large-queue scenarios.
> >
> > To solve the above problems, we internally tried batch request, which merges
> > requests from multiple queues and sends them at once. We conservatively tested
> > 8 queue commands and sent them together. The DPU processing efficiency can be
> > improved by 8 times, which greatly eases the DPU's support for multi-device
> > and multi-queue DIM.
> >
> > Maintainers may be concerned about whether the batch command method can optimize
> > the above problems: accumulate multiple request commands to kick the device once,
> > and obtain the processing results of the corresponding commands asynchronously.
> > The batch command method is used by us to optimize the CPU overhead of the DIM
> > worker caused by the guest being busy waiting for the command response result.
> > This is a different focus than batch request to solve the problem.
> >
> > Suggested-by: Xiaoming Zhao <zxm377917@alibaba-inc.com>
> > Signed-off-by: Heng Qi <hengqi@linux.alibaba.com>
> > ---
> > v1->v2: Updated commit log. Due to sensitivity, sorry that can not give the
> > absolute value directly. @Michael
> >
> > device-types/net/description.tex | 26 ++++++++++++++++++++------
> > 1 file changed, 20 insertions(+), 6 deletions(-)
> >
> > diff --git a/device-types/net/description.tex b/device-types/net/description.tex
> > index aff5e08..b3766c4 100644
> > --- a/device-types/net/description.tex
> > +++ b/device-types/net/description.tex
> > @@ -1667,8 +1667,8 @@ \subsubsection{Control Virtqueue}\label{sec:Device Types / Network Device / Devi
> > for notification coalescing.
> >
> > If the VIRTIO_NET_F_VQ_NOTF_COAL feature is negotiated, the driver can
> > -send commands VIRTIO_NET_CTRL_NOTF_COAL_VQ_SET and VIRTIO_NET_CTRL_NOTF_COAL_VQ_GET
> > -for virtqueue notification coalescing.
> > +send commands VIRTIO_NET_CTRL_NOTF_COAL_VQ_SET, VIRTIO_NET_CTRL_NOTF_COAL_VQS_SET and
> > +VIRTIO_NET_CTRL_NOTF_COAL_VQ_GET for virtqueue notification coalescing.
> >
> > \begin{lstlisting}
> > struct virtio_net_ctrl_coal {
> > @@ -1682,11 +1682,17 @@ \subsubsection{Control Virtqueue}\label{sec:Device Types / Network Device / Devi
> > struct virtio_net_ctrl_coal coal;
> > };
> >
> > +struct virtio_net_ctrl_mrg_coal_vq {
> > + le16 num_entries; /* indicates number of valid entries */
> > + struct virtio_net_ctrl_coal_vq entries[];
> > +};
> > +
> > #define VIRTIO_NET_CTRL_NOTF_COAL 6
> > #define VIRTIO_NET_CTRL_NOTF_COAL_TX_SET 0
> > #define VIRTIO_NET_CTRL_NOTF_COAL_RX_SET 1
> > #define VIRTIO_NET_CTRL_NOTF_COAL_VQ_SET 2
> > #define VIRTIO_NET_CTRL_NOTF_COAL_VQ_GET 3
> > + #define VIRTIO_NET_CTRL_NOTF_COAL_VQS_SET 4
> > \end{lstlisting}
> >
> > Coalescing parameters:
> > @@ -1706,6 +1712,7 @@ \subsubsection{Control Virtqueue}\label{sec:Device Types / Network Device / Devi
> > \item For the command VIRTIO_NET_CTRL_NOTF_COAL_VQ_SET, the structure virtio_net_ctrl_coal_vq is write-only for the driver.
> > \item For the command VIRTIO_NET_CTRL_NOTF_COAL_VQ_GET, \field{vq_index} and \field{reserved} are write-only
> > for the driver, and the structure virtio_net_ctrl_coal is read-only for the driver.
> > +\item For the command VIRTIO_NET_CTRL_NOTF_COAL_VQS_SET, the structure virtio_net_ctrl_mrg_coal_vq is write-only for the driver.
> > \end{itemize}
> >
> > The class VIRTIO_NET_CTRL_NOTF_COAL has the following commands:
> > @@ -1716,6 +1723,9 @@ \subsubsection{Control Virtqueue}\label{sec:Device Types / Network Device / Devi
> > for an enabled transmit/receive virtqueue whose index is \field{vq_index}.
> > \item VIRTIO_NET_CTRL_NOTF_COAL_VQ_GET: use the structure virtio_net_ctrl_coal_vq to get the \field{max_usecs} and \field{max_packets} parameters
> > for an enabled transmit/receive virtqueue whose index is \field{vq_index}.
> > +\item VIRTIO_NET_CTRL_NOTF_COAL_VQS_SET: use the structure virtio_net_ctrl_mrg_coal_vq to set the \field{max_usecs} and \field{max_packets} parameters
> > + for \field{num_entries} enabled transmit/receive virtqueues. The corresponding index value
> > + of each configured virtqueue is \field{vq_index}.
> > \end{enumerate}
> >
> > The device may generate notifications more or less frequently than specified by set commands of the VIRTIO_NET_CTRL_NOTF_COAL class.
> > @@ -1782,9 +1792,13 @@ \subsubsection{Control Virtqueue}\label{sec:Device Types / Network Device / Devi
> >
> > The driver MUST set \field{vq_index} to the virtqueue index of an enabled transmit or receive virtqueue.
> >
> > +The driver MUST set \field{num_entries} to a non-zero value and MUST NOT set \field{num_entries} to
> > +a value greater than the number of enabled transmit and receive virtqueues.
> > +
> > The driver MUST have negotiated the VIRTIO_NET_F_NOTF_COAL feature when issuing commands VIRTIO_NET_CTRL_NOTF_COAL_TX_SET and VIRTIO_NET_CTRL_NOTF_COAL_RX_SET.
> >
> > -The driver MUST have negotiated the VIRTIO_NET_F_VQ_NOTF_COAL feature when issuing commands VIRTIO_NET_CTRL_NOTF_COAL_VQ_SET and VIRTIO_NET_CTRL_NOTF_COAL_VQ_GET.
> > +The driver MUST have negotiated the VIRTIO_NET_F_VQ_NOTF_COAL feature when issuing commands
> > +VIRTIO_NET_CTRL_NOTF_COAL_VQ_SET, VIRTIO_NET_CTRL_NOTF_COAL_VQS_SET and VIRTIO_NET_CTRL_NOTF_COAL_VQ_GET.
> >
> > The driver MUST ignore the values of coalescing parameters received from the VIRTIO_NET_CTRL_NOTF_COAL_VQ_GET command if the device responds with VIRTIO_NET_ERR.
> >
> > @@ -1794,10 +1808,10 @@ \subsubsection{Control Virtqueue}\label{sec:Device Types / Network Device / Devi
> >
> > The device SHOULD respond to VIRTIO_NET_CTRL_NOTF_COAL_TX_SET and VIRTIO_NET_CTRL_NOTF_COAL_RX_SET commands with VIRTIO_NET_ERR if it was not able to change the parameters.
> >
> > -The device MUST respond to the VIRTIO_NET_CTRL_NOTF_COAL_VQ_SET command with VIRTIO_NET_ERR if it was not able to change the parameters.
> > +The device MUST respond to VIRTIO_NET_CTRL_NOTF_COAL_VQ_SET and VIRTIO_NET_CTRL_NOTF_COAL_VQS_SET commands with VIRTIO_NET_ERR if it was not able to change the parameters.
>
> This one however is vague.
> In fact the addition of VIRTIO_NET_CTRL_NOTF_COAL_VQ_SET made
> VIRTIO_NET_CTRL_NOTF_COAL_TX_SET and VIRTIO_NET_CTRL_NOTF_COAL_RX_SET
> also vague we just missed this.
>
>
>
> So, it is not clear whether if the device responds with VIRTIO_NET_ERR - it can change
> parameters for some VQs but not others, or does it have to either
> change parameters for all VQs or none at all.
>
> I think a better sematic is all or nothing.
>
> So I suggest combining this statement with one of
> VIRTIO_NET_CTRL_NOTF_COAL_TX_SET and VIRTIO_NET_CTRL_NOTF_COAL_RX_SET,
> and adding something like
> "in this case, coalescing parameters MUST remain unchanged,
> for all VQs".
>
> This bug bothers me. Maybe we should fix it before release?
>
> I also don't know why we repeat same text for VIRTIO_NET_CTRL_NOTF_COAL_VQ_SET,
> VIRTIO_NET_CTRL_NOTF_COAL_TX_SET and VIRTIO_NET_CTRL_NOTF_COAL_RX_SET.
I figured it out, TX/RX are SHOULD.
I will fix TX/RX and you can add the text to this one.
> >
> > -The device MUST respond to VIRTIO_NET_CTRL_NOTF_COAL_VQ_SET and VIRTIO_NET_CTRL_NOTF_COAL_VQ_GET commands with
> > -VIRTIO_NET_ERR if the designated virtqueue is not an enabled transmit or receive virtqueue.
> > +The device MUST respond to VIRTIO_NET_CTRL_NOTF_COAL_VQ_SET, VIRTIO_NET_CTRL_NOTF_COAL_VQS_SET and VIRTIO_NET_CTRL_NOTF_COAL_VQ_GET
> > +commands with VIRTIO_NET_ERR if the designated virtqueue is not an enabled transmit or receive virtqueue.
> >
> > Upon disabling and re-enabling a transmit virtqueue, the device MUST set the coalescing parameters of the virtqueue
> > to those configured through the VIRTIO_NET_CTRL_NOTF_COAL_TX_SET command, or, if the driver did not set any TX coalescing parameters, to 0.
> > --
> > 1.8.3.1
---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [virtio-dev] RE: [virtio-comment] RE: [virtio-dev] RE: [virtio-comment] [PATCH v2] virtio-net: support setting coalescing params for multiple vqs
2024-01-22 5:03 ` [virtio-dev] " Parav Pandit
2024-01-22 7:36 ` [virtio-dev] " Michael S. Tsirkin
@ 2024-01-24 13:01 ` Heng Qi
2024-01-24 13:18 ` Parav Pandit
1 sibling, 1 reply; 20+ messages in thread
From: Heng Qi @ 2024-01-24 13:01 UTC (permalink / raw)
To: Parav Pandit, virtio-comment@lists.oasis-open.org,
virtio-dev@lists.oasis-open.org
Cc: Jason Wang, Michael S. Tsirkin, Xuan Zhuo
在 2024/1/22 下午1:03, Parav Pandit 写道:
>
>> From: Heng Qi <hengqi@linux.alibaba.com>
>> Sent: Monday, January 22, 2024 8:27 AM
>>
>> 在 2024/1/20 下午5:59, Parav Pandit 写道:
>>>> From: Heng Qi <hengqi@linux.alibaba.com>
>>>> Sent: Wednesday, January 17, 2024 10:22 AM
>>>>
>>>> 在 2024/1/15 下午9:21, Parav Pandit 写道:
>>>>>> From: virtio-comment@lists.oasis-open.org
>>>>>> <virtio-comment@lists.oasis- open.org> On Behalf Of Heng Qi
>>>>>> Sent: Monday, January 15, 2024 6:36 PM
>>>>>>
>>>>>> Currently, when each time the driver attempts to update the
>>>>>> coalescing parameters for a vq, it needs to kick the device and
>>>>>> wait for the ctrlq response to return.
>>>>> It does not need to wait. This is some driver limitation that does
>>>>> not use
>>>> the queue as "queue".
>>>>> Such driver limitation should be removed in the driver. It does not
>>>>> qualify
>>>> as limitation.
>>>>
>>>> Yes, we don't have to wait.
>>>>
>>>> But in general, for user commands, it is necessary to obtain the
>>>> final results synchronously.
>>> Yes. Use initiated command can enqueue the request to cvq. Go to sleep
>> for several micro to milliseconds.
>>>> The user command cannot return before the final result is obtained.
>>>> And wait is not the problem this patch solves.
>>>>
>>> By not holding the rtnl lock, rest of the context that needs to enqueue the
>> request can progress such as that of netdim.
>>
>> Would like to see the using of rtnl lock changed.
>>
> Inside the virtnet_rx_dim_work() there should be rtnl lock call.
> A virtio_device level lock to be used for cvq. :)
>
>> In addition, I have made batching and asynchronousization of the netdim
>> command, you can refer to this patch:
>> https://lore.kernel.org/all/1705410693-118895-4-git-send-email-
>> hengqi@linux.alibaba.com/
>>
> In the listed above driver patch the motivation "to optimize the
> CPU overhead of the DIM worker caused by the guest being busy
> waiting for the command response result."
>
> Is not right.
> Because guest is still busy waiting.
There is no busy wait for guests, see get_cvq_work().
> Without batching, due to rtnl lock every VQ command is serialized as one outstanding command at a time in virtnet_rx_dim_work().
> Due to this device is unable to take benefit of DMA batching at large scale.
Adding dim commands is now asynchronous, and the device will receive
batches of commands.
>
>>>>> This will enable driver to enqueue multiple cvq commands without
>>>>> waiting
>>>> for previous one.
>>>>> This will also enable device to find natural coalescing done on
>>>>> multiple
>>>> commands.
>>>>
>>>> When batch user commands occur, ensuring synchronization is a concern.
>>>>
>>>>>> The following path is observed: 1. Driver kicks the device; 2.
>>>>>> After the device receives the kick, CPU scheduling occurs and DMA
>>>>>> multiple buffers multiple times; 3. The device completes processing
>>>>>> and replies
>>>> with a response.
>>>>>> When large-queue devices issue multiple requests and kick the
>>>>>> device frequently, this often interrupt the work of the device-side CPU.
>>>>> When there is large devices and multiple driver notifications by a
>>>>> cpu that is N times faster than the device side cpu, the device may
>>>>> find natural
>>>> coalescing on the commands of a given cvq.
>>>>
>>>> First we have to solve the ctrlq batch adding user (ethtool) command.
>>>> Even if processed in a batch way on device side, the number of kicks
>>>> and the number of backend DMAs has not been reduced.
>>> Driver notifications are PCI writes so it should not hamper device side,
>> which can ignore them when they do not bother about it.
>>
>> Driver notifications need to be processed by the DPU, which interferes with
>> the CPU on the DPU.
>>
> I was asking, if there is anyway to disable for your DPU to ignore these notifications while previous one is pending?
> From your above description, it seems there isn’t.
While the device is processing the request, additional kicks are ignored.
Thanks.
>
>>> Backend DMAs should be reduced by avoiding the LIFO pattern followed by
>> the splitq driver.
>>> Placing the descriptors contiguously like packedq reduces amount of DMA
>> naturally.
>>
>> splitq is widely used, migrating to packedq is not that easy, especially when
>> there are many components and hardware involved.
> I am not suggesting to packed_VQ.
> I am suggesting fixing the driver to not do LIFO on descriptors for splitq.
> In other words, using contiguous set of descriptors on splitq will improve the splitq for DMA.
> This will allow using splitq more efficiently for dma as short-term solution for DMA until more efficient queues are defined.
>
>>> The second predicable DMA to avoid is having 8Bytes of data inline in the
>> descriptor, instead of 16B indirection and extra dma.
>>
>> Looking forward to working inline!
>> But I think this does not conflict with batch work, and combining the two will
>> be more beneficial.
>>
> It does not conflict. However, batching for large number of queues may not use the inline as the data bytes may not fit in the inline.
>
>>>>> For multiple DMAs, we need to way to send 8 bytes of data without 16
>>>> bytes of indirection via a descriptor.
>>>>> This is what we discussed a while back to do in txq and Stefan
>>>>> suggested to
>>>> generalize for more queues, which is also a good idea.
>>>>
>>>> Yes, this sounds good.
>>>>
>>> This the next item to focus as soon as flow filters are stable/merged.
>> Greate!
>>
>>>>>> In addition,
>>>>>> each vq request is processed separately, causing more delays for
>>>>>> the CPU to wait for the DMA request to complete.
>>>>>>
>>>>>> These interruptions and overhead will strain the CPU responsible
>>>>>> for controlling the path of the DPU, especially in multi-device and
>>>>>> large-queue scenarios.
>>>>>>
>>>>>> To solve the above problems, we internally tried batch request,
>>>>>> which merges requests from multiple queues and sends them at once.
>>>>>> We conservatively tested
>>>>> The batching done may end up modifying the given VQ's parameters
>>>> multiple times.
>>>>
>>>> In practice, we do not try to accumulate multiple parameter
>>>> modifications for a specific vqn.
>>> I meant to accumulate parameters of multiple VQs to batch in single
>> command.
>>
>> Yes, this is what is done now, practical example:
>> https://lore.kernel.org/all/1705410693-118895-3-git-send-email-
>> hengqi@linux.alibaba.com/
>> In our scenario, only benefits are seen, no negative effects.
>>
> The right way to test is without the rtnl lock and enqueue the command to cvq without being blocked.
>
> The new optional batch command with new feature bit may be useful for devices to which CVQ kicks are interruptions.
>
>>>>> The right test on Linux to do without rtnl lock which is anyway ugly
>>>>> and
>>>> wrong semantic to use blocking the whole netdev stack.
>>>>> (in case if you used that).
>>>> Do you have any good directions and attempts to remove rtnl_lock?
>>>>
>>> I think per device lock instead of rtnl is first step that we can start with.
> Wil check internally who if someone already started working on it.
>
>> Totally looking forward to it.
>>
>> Thanks,
>> Heng
>>
>>>>>> 8 queue commands and sent them together. The DPU processing
>>>>>> efficiency can be improved by 8 times, which greatly eases the
>>>>>> DPU's support for multi- device and multi-queue DIM.
>>>>>>
>>>>> This is good.
>>>> YES. Makes sense for our DPUs.
>>>>
>>>>>> Maintainers may be concerned about whether the batch command
>>>> method
>>>>>> can optimize the above problems: accumulate multiple request
>>>> commands
>>>>>> to kick the device once, and obtain the processing results of the
>>>>>> corresponding commands asynchronously.
>>>>> This is unlikely to improve, rather it will have negative impact as
>>>>> it only
>>>> means that moderation parameters are just delayed by the driver.
>>>> Why is it delayed by the driver? It is not delayed by the driver, the
>>>> kick still happens for every command.
>>> Since the kick is delayed, the driver lost the amount of time and device
>> reaction time is driven by this wait time.
>>> The guest driver is not aware how much a shared DPU is busy.
>>> So if it waits too much, the VQs moderation is slow.
>>>
>>>> In theory and practice, it will not affect DIM performance, but it
>>>> will significantly reduce CPU consumption caused by waiting.
>>>>
>>> Once the guest driver sleeps after enqueuing the request, it should be fine.
>>> The DIM calls are in highly parallel worker threads, where/if the thread
>> sleep, a new worker thread is invoked.
>>> But for sure I agree that vq notification moderation should happen quickly
>> enough under < 100usec or so.
>>> May be now that we bring the aq resources, we can start migrating on it.
>>>
>>>>>> The batch command method is used by us to optimize the CPU
>> overhead
>>>>>> of the DIM worker caused by the guest being busy waiting for the
>>>>>> command response result.
>>>>> In that case fixing the guest driver which is not yet written is the right fix.
>>>>>
>>>>>> This is a different focus than batch request to solve the problem.
>>>>>>
>>>>>> Suggested-by: Xiaoming Zhao <zxm377917@alibaba-inc.com>
>>>>>> Signed-off-by: Heng Qi <hengqi@linux.alibaba.com>
>>>>>> ---
>>>>>> v1->v2: Updated commit log. Due to sensitivity, sorry that can not
>>>>>> v1->give the
>>>>>> absolute value directly. @Michael
>>>>>>
>>>>>> device-types/net/description.tex | 26 ++++++++++++++++++++------
>>>>>> 1 file changed, 20 insertions(+), 6 deletions(-)
>>>>>>
>>>>>> diff --git a/device-types/net/description.tex b/device-
>>>>>> types/net/description.tex index aff5e08..b3766c4 100644
>>>>>> --- a/device-types/net/description.tex
>>>>>> +++ b/device-types/net/description.tex
>>>>>> @@ -1667,8 +1667,8 @@ \subsubsection{Control
>>>>>> Virtqueue}\label{sec:Device Types / Network Device / Devi for
>>>>>> notification coalescing.
>>>>>>
>>>>>> If the VIRTIO_NET_F_VQ_NOTF_COAL feature is negotiated, the
>>>>>> driver can - send commands VIRTIO_NET_CTRL_NOTF_COAL_VQ_SET
>> and
>>>>>> VIRTIO_NET_CTRL_NOTF_COAL_VQ_GET -for virtqueue notification
>>>>>> coalescing.
>>>>>> +send commands VIRTIO_NET_CTRL_NOTF_COAL_VQ_SET,
>>>>>> +VIRTIO_NET_CTRL_NOTF_COAL_VQS_SET and
>>>>>> VIRTIO_NET_CTRL_NOTF_COAL_VQ_GET for virtqueue notification
>>>>>> coalescing.
>>>>>>
>>>>> A new feature bit is needed for this extra functionality.
>>>> I tried to extend it to the command VIRTIO_NET_F_VQ_NOTF_COAL, is it
>>>> too late?
>>>>
>>> For existing command it is not.
>>> But there may be a device which may not support coalesced command.
>>> Better to have the feature bit for this coalesce command, instead of
>> random NOSUPP failure.
>>>
>>>> Thanks!
>>>>
>>>>>> \begin{lstlisting}
>>>>>> struct virtio_net_ctrl_coal {
>>>>>> @@ -1682,11 +1682,17 @@ \subsubsection{Control
>>>>>> Virtqueue}\label{sec:Device Types / Network Device / Devi
>>>>>> struct virtio_net_ctrl_coal coal;
>>>>>> };
>>>>>>
>>>>>> +struct virtio_net_ctrl_mrg_coal_vq {
>>>>>> + le16 num_entries; /* indicates number of valid entries */
>>>>>> + struct virtio_net_ctrl_coal_vq entries[]; };
>>>>>> +
>>>>>> #define VIRTIO_NET_CTRL_NOTF_COAL 6
>>>>>> #define VIRTIO_NET_CTRL_NOTF_COAL_TX_SET 0
>>>>>> #define VIRTIO_NET_CTRL_NOTF_COAL_RX_SET 1
>>>>>> #define VIRTIO_NET_CTRL_NOTF_COAL_VQ_SET 2
>>>>>> #define VIRTIO_NET_CTRL_NOTF_COAL_VQ_GET 3
>>>>>> + #define VIRTIO_NET_CTRL_NOTF_COAL_VQS_SET 4
>>>>>> \end{lstlisting}
>>>>>>
>>>>>> Coalescing parameters:
>>>>>> @@ -1706,6 +1712,7 @@ \subsubsection{Control
>>>>>> Virtqueue}\label{sec:Device Types / Network Device / Devi \item
>>>>>> For the command VIRTIO_NET_CTRL_NOTF_COAL_VQ_SET, the
>> structure
>>>>>> virtio_net_ctrl_coal_vq is write-only for the driver.
>>>>>> \item For the command VIRTIO_NET_CTRL_NOTF_COAL_VQ_GET,
>>>>>> \field{vq_index} and \field{reserved} are write-only
>>>>>> for the driver, and the structure virtio_net_ctrl_coal is
>>>>>> read-only for the driver.
>>>>>> +\item For the command VIRTIO_NET_CTRL_NOTF_COAL_VQS_SET, the
>>>>>> structure virtio_net_ctrl_mrg_coal_vq is write-only for the driver.
>>>>>> \end{itemize}
>>>>>>
>>>>>> The class VIRTIO_NET_CTRL_NOTF_COAL has the following
>> commands:
>>>>>> @@ -1716,6 +1723,9 @@ \subsubsection{Control
>>>>>> Virtqueue}\label{sec:Device Types / Network Device / Devi
>>>>>> for an enabled
>>>>>> transmit/receive virtqueue whose index is \field{vq_index}.
>>>>>> \item VIRTIO_NET_CTRL_NOTF_COAL_VQ_GET: use the structure
>>>>>> virtio_net_ctrl_coal_vq to get the \field{max_usecs} and
>>>>>> \field{max_packets} parameters
>>>>>> for an enabled
>>>>>> transmit/receive virtqueue whose index is \field{vq_index}.
>>>>>> +\item VIRTIO_NET_CTRL_NOTF_COAL_VQS_SET: use the structure
>>>>>> virtio_net_ctrl_mrg_coal_vq to set the \field{max_usecs} and
>>>>>> \field{max_packets} parameters
>>>>>> + for \field{num_entries}
>>>>>> + enabled transmit/receive
>>>>>> virtqueues. The corresponding index value
>>>>>> + of each configured virtqueue is
>> \field{vq_index}.
>>>>>> \end{enumerate}
>>>>>>
>>>>>> The device may generate notifications more or less frequently
>>>>>> than specified by set commands of the VIRTIO_NET_CTRL_NOTF_COAL
>> class.
>>>>>> @@ -1782,9 +1792,13 @@ \subsubsection{Control
>>>>>> Virtqueue}\label{sec:Device Types / Network Device / Devi
>>>>>>
>>>>>> The driver MUST set \field{vq_index} to the virtqueue index of
>>>>>> an enabled transmit or receive virtqueue.
>>>>>>
>>>>>> +The driver MUST set \field{num_entries} to a non-zero value and
>>>>>> +MUST NOT set \field{num_entries} to a value greater than the
>>>>>> +number of enabled
>>>>>> transmit and receive virtqueues.
>>>>>> +
>>>>>> The driver MUST have negotiated the VIRTIO_NET_F_NOTF_COAL
>>>> feature
>>>>>> when issuing commands VIRTIO_NET_CTRL_NOTF_COAL_TX_SET and
>>>>>> VIRTIO_NET_CTRL_NOTF_COAL_RX_SET.
>>>>>>
>>>>>> -The driver MUST have negotiated the VIRTIO_NET_F_VQ_NOTF_COAL
>>>>>> feature when issuing commands
>> VIRTIO_NET_CTRL_NOTF_COAL_VQ_SET
>>>> and
>>>>>> VIRTIO_NET_CTRL_NOTF_COAL_VQ_GET.
>>>>>> +The driver MUST have negotiated the VIRTIO_NET_F_VQ_NOTF_COAL
>>>>>> feature
>>>>>> +when issuing commands VIRTIO_NET_CTRL_NOTF_COAL_VQ_SET,
>>>>>> VIRTIO_NET_CTRL_NOTF_COAL_VQS_SET and
>>>>>> VIRTIO_NET_CTRL_NOTF_COAL_VQ_GET.
>>>>>>
>>>>>> The driver MUST ignore the values of coalescing parameters
>>>>>> received from the VIRTIO_NET_CTRL_NOTF_COAL_VQ_GET command if
>> the
>>>>>> device responds with VIRTIO_NET_ERR.
>>>>>>
>>>>>> @@ -1794,10 +1808,10 @@ \subsubsection{Control
>>>>>> Virtqueue}\label{sec:Device Types / Network Device / Devi
>>>>>>
>>>>>> The device SHOULD respond to
>> VIRTIO_NET_CTRL_NOTF_COAL_TX_SET
>>>> and
>>>>>> VIRTIO_NET_CTRL_NOTF_COAL_RX_SET commands with
>>>> VIRTIO_NET_ERR if it
>>>>>> was not able to change the parameters.
>>>>>>
>>>>>> -The device MUST respond to the
>>>> VIRTIO_NET_CTRL_NOTF_COAL_VQ_SET
>>>>>> command with VIRTIO_NET_ERR if it was not able to change the
>>>> parameters.
>>>>>> +The device MUST respond to VIRTIO_NET_CTRL_NOTF_COAL_VQ_SET
>>>> and
>>>>>> VIRTIO_NET_CTRL_NOTF_COAL_VQS_SET commands with
>>>> VIRTIO_NET_ERR if it
>>>>>> was not able to change the parameters.
>>>>>>
>>>>>> -The device MUST respond to VIRTIO_NET_CTRL_NOTF_COAL_VQ_SET
>>>> and
>>>>>> VIRTIO_NET_CTRL_NOTF_COAL_VQ_GET commands with -
>>>> VIRTIO_NET_ERR if the
>>>>>> designated virtqueue is not an enabled transmit or receive virtqueue.
>>>>>> +The device MUST respond to VIRTIO_NET_CTRL_NOTF_COAL_VQ_SET,
>>>>>> +VIRTIO_NET_CTRL_NOTF_COAL_VQS_SET and
>>>>>> VIRTIO_NET_CTRL_NOTF_COAL_VQ_GET commands with
>>>> VIRTIO_NET_ERR if the
>>>>>> designated virtqueue is not an enabled transmit or receive virtqueue.
>>>>>>
>>>>>> Upon disabling and re-enabling a transmit virtqueue, the device
>>>>>> MUST set the coalescing parameters of the virtqueue to those
>>>>>> configured through the VIRTIO_NET_CTRL_NOTF_COAL_TX_SET
>>>> command, or,
>>>>>> if the driver did not set any TX coalescing parameters, to 0.
>>>>>> --
>>>>>> 1.8.3.1
>>>>>>
>>>>>>
>>>>>> This publicly archived list offers a means to provide input to the
>>>>>> OASIS Virtual I/O Device (VIRTIO) TC.
>>>>>>
>>>>>> In order to verify user consent to the Feedback License terms and
>>>>>> to minimize spam in the list archive, subscription is required
>>>>>> before posting.
>>>>>>
>>>>>> Subscribe: virtio-comment-subscribe@lists.oasis-open.org
>>>>>> Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
>>>>>> List help: virtio-comment-help@lists.oasis-open.org
>>>>>> List archive: https://lists.oasis-open.org/archives/virtio-comment/
>>>>>> Feedback License: https://www.oasis-
>>>>>> open.org/who/ipr/feedback_license.pdf
>>>>>> List Guidelines:
>>>>>> https://www.oasis-open.org/policies-guidelines/mailing-lists
>>>>>> Committee: https://www.oasis-open.org/committees/virtio/
>>>>>> Join OASIS: https://www.oasis-open.org/join/
>>>>> --------------------------------------------------------------------
>>>>> - To unsubscribe, e-mail:
>>>>> virtio-dev-unsubscribe@lists.oasis-open.org
>>>>> For additional commands, e-mail:
>>>>> virtio-dev-help@lists.oasis-open.org
---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org
^ permalink raw reply [flat|nested] 20+ messages in thread
* RE: [virtio-dev] RE: [virtio-comment] RE: [virtio-dev] RE: [virtio-comment] [PATCH v2] virtio-net: support setting coalescing params for multiple vqs
2024-01-24 13:01 ` [virtio-dev] " Heng Qi
@ 2024-01-24 13:18 ` Parav Pandit
2024-01-25 3:05 ` [virtio-dev] Re: [virtio-comment] " Heng Qi
0 siblings, 1 reply; 20+ messages in thread
From: Parav Pandit @ 2024-01-24 13:18 UTC (permalink / raw)
To: Heng Qi, virtio-comment@lists.oasis-open.org,
virtio-dev@lists.oasis-open.org
Cc: Jason Wang, Michael S. Tsirkin, Xuan Zhuo
> From: Heng Qi <hengqi@linux.alibaba.com>
> Sent: Wednesday, January 24, 2024 6:31 PM
>
>
> 在 2024/1/22 下午1:03, Parav Pandit 写道:
> >
> >> From: Heng Qi <hengqi@linux.alibaba.com>
> >> Sent: Monday, January 22, 2024 8:27 AM
> >>
> >> 在 2024/1/20 下午5:59, Parav Pandit 写道:
> >>>> From: Heng Qi <hengqi@linux.alibaba.com>
> >>>> Sent: Wednesday, January 17, 2024 10:22 AM
> >>>>
> >>>> 在 2024/1/15 下午9:21, Parav Pandit 写道:
> >>>>>> From: virtio-comment@lists.oasis-open.org
> >>>>>> <virtio-comment@lists.oasis- open.org> On Behalf Of Heng Qi
> >>>>>> Sent: Monday, January 15, 2024 6:36 PM
> >>>>>>
> >>>>>> Currently, when each time the driver attempts to update the
> >>>>>> coalescing parameters for a vq, it needs to kick the device and
> >>>>>> wait for the ctrlq response to return.
> >>>>> It does not need to wait. This is some driver limitation that does
> >>>>> not use
> >>>> the queue as "queue".
> >>>>> Such driver limitation should be removed in the driver. It does
> >>>>> not qualify
> >>>> as limitation.
> >>>>
> >>>> Yes, we don't have to wait.
> >>>>
> >>>> But in general, for user commands, it is necessary to obtain the
> >>>> final results synchronously.
> >>> Yes. Use initiated command can enqueue the request to cvq. Go to
> >>> sleep
> >> for several micro to milliseconds.
> >>>> The user command cannot return before the final result is obtained.
> >>>> And wait is not the problem this patch solves.
> >>>>
> >>> By not holding the rtnl lock, rest of the context that needs to
> >>> enqueue the
> >> request can progress such as that of netdim.
> >>
> >> Would like to see the using of rtnl lock changed.
> >>
> > Inside the virtnet_rx_dim_work() there should be rtnl lock call.
> > A virtio_device level lock to be used for cvq. :)
> >
> >> In addition, I have made batching and asynchronousization of the
> >> netdim command, you can refer to this patch:
> >> https://lore.kernel.org/all/1705410693-118895-4-git-send-email-
> >> hengqi@linux.alibaba.com/
> >>
> > In the listed above driver patch the motivation "to optimize the CPU
> > overhead of the DIM worker caused by the guest being busy waiting for
> > the command response result."
> >
> > Is not right.
> > Because guest is still busy waiting.
>
> There is no busy wait for guests, see get_cvq_work().
>
Ok. not always busy waiting, sometimes it does. virtnet_cvq_response() should not have flag..
Who ever gets the OS global rtnl lock is calling virtnet_cvq_response() and checking and releasing.
Shouldn’t be done this way with try lock etc.
Rtnl lock is not supposed to protect low level driver some ctrlvq response flag.
> > Without batching, due to rtnl lock every VQ command is serialized as one
> outstanding command at a time in virtnet_rx_dim_work().
> > Due to this device is unable to take benefit of DMA batching at large scale.
>
> Adding dim commands is now asynchronous, and the device will receive
> batches of commands.
>
I am comparing the code where there is no rtnl lock with the code you posted.
The reference code of patch [1] needs to begin without OS global rtnl lock as base line.
[1] https://lore.kernel.org/all/1705410693-118895-4-git-send-email-hengqi@linux.alibaba.com/
> >
> >>>>> This will enable driver to enqueue multiple cvq commands without
> >>>>> waiting
> >>>> for previous one.
> >>>>> This will also enable device to find natural coalescing done on
> >>>>> multiple
> >>>> commands.
> >>>>
> >>>> When batch user commands occur, ensuring synchronization is a
> concern.
> >>>>
> >>>>>> The following path is observed: 1. Driver kicks the device; 2.
> >>>>>> After the device receives the kick, CPU scheduling occurs and DMA
> >>>>>> multiple buffers multiple times; 3. The device completes
> >>>>>> processing and replies
> >>>> with a response.
> >>>>>> When large-queue devices issue multiple requests and kick the
> >>>>>> device frequently, this often interrupt the work of the device-side CPU.
> >>>>> When there is large devices and multiple driver notifications by a
> >>>>> cpu that is N times faster than the device side cpu, the device
> >>>>> may find natural
> >>>> coalescing on the commands of a given cvq.
> >>>>
> >>>> First we have to solve the ctrlq batch adding user (ethtool) command.
> >>>> Even if processed in a batch way on device side, the number of
> >>>> kicks and the number of backend DMAs has not been reduced.
> >>> Driver notifications are PCI writes so it should not hamper device
> >>> side,
> >> which can ignore them when they do not bother about it.
> >>
> >> Driver notifications need to be processed by the DPU, which
> >> interferes with the CPU on the DPU.
> >>
> > I was asking, if there is anyway to disable for your DPU to ignore these
> notifications while previous one is pending?
> > From your above description, it seems there isn’t.
>
> While the device is processing the request, additional kicks are ignored.
>
Ok. that is good. In that case the kicks are not interfering, right?
^ permalink raw reply [flat|nested] 20+ messages in thread
* [virtio-dev] Re: [virtio-comment] Re: [PATCH v2] virtio-net: support setting coalescing params for multiple vqs
2024-01-24 10:52 ` Michael S. Tsirkin
@ 2024-01-25 2:33 ` Heng Qi
0 siblings, 0 replies; 20+ messages in thread
From: Heng Qi @ 2024-01-25 2:33 UTC (permalink / raw)
To: Michael S. Tsirkin; +Cc: virtio-comment, virtio-dev, Jason Wang, Xuan Zhuo
在 2024/1/24 下午6:52, Michael S. Tsirkin 写道:
> On Wed, Jan 24, 2024 at 05:50:28AM -0500, Michael S. Tsirkin wrote:
>> On Mon, Jan 15, 2024 at 09:06:02PM +0800, Heng Qi wrote:
>>> Currently, when each time the driver attempts to update the coalescing parameters
>>> for a vq, it needs to kick the device and wait for the ctrlq response to return.
>>> The following path is observed: 1. Driver kicks the device; 2. After the device
>>> receives the kick, CPU scheduling occurs and DMA multiple buffers multiple times;
>>> 3. The device completes processing and replies with a response.
>>>
>>> When large-queue devices issue multiple requests and kick the device frequently,
>>> this often interrupt the work of the device-side CPU. In addition, each vq request
>>> is processed separately, causing more delays for the CPU to wait for the DMA
>>> request to complete.
>>>
>>> These interruptions and overhead will strain the CPU responsible for controlling
>>> the path of the DPU, especially in multi-device and large-queue scenarios.
>>>
>>> To solve the above problems, we internally tried batch request, which merges
>>> requests from multiple queues and sends them at once. We conservatively tested
>>> 8 queue commands and sent them together. The DPU processing efficiency can be
>>> improved by 8 times, which greatly eases the DPU's support for multi-device
>>> and multi-queue DIM.
>>>
>>> Maintainers may be concerned about whether the batch command method can optimize
>>> the above problems: accumulate multiple request commands to kick the device once,
>>> and obtain the processing results of the corresponding commands asynchronously.
>>> The batch command method is used by us to optimize the CPU overhead of the DIM
>>> worker caused by the guest being busy waiting for the command response result.
>>> This is a different focus than batch request to solve the problem.
>>>
>>> Suggested-by: Xiaoming Zhao <zxm377917@alibaba-inc.com>
>>> Signed-off-by: Heng Qi <hengqi@linux.alibaba.com>
>>> ---
>>> v1->v2: Updated commit log. Due to sensitivity, sorry that can not give the
>>> absolute value directly. @Michael
>>>
>>> device-types/net/description.tex | 26 ++++++++++++++++++++------
>>> 1 file changed, 20 insertions(+), 6 deletions(-)
>>>
>>> diff --git a/device-types/net/description.tex b/device-types/net/description.tex
>>> index aff5e08..b3766c4 100644
>>> --- a/device-types/net/description.tex
>>> +++ b/device-types/net/description.tex
>>> @@ -1667,8 +1667,8 @@ \subsubsection{Control Virtqueue}\label{sec:Device Types / Network Device / Devi
>>> for notification coalescing.
>>>
>>> If the VIRTIO_NET_F_VQ_NOTF_COAL feature is negotiated, the driver can
>>> -send commands VIRTIO_NET_CTRL_NOTF_COAL_VQ_SET and VIRTIO_NET_CTRL_NOTF_COAL_VQ_GET
>>> -for virtqueue notification coalescing.
>>> +send commands VIRTIO_NET_CTRL_NOTF_COAL_VQ_SET, VIRTIO_NET_CTRL_NOTF_COAL_VQS_SET and
>>> +VIRTIO_NET_CTRL_NOTF_COAL_VQ_GET for virtqueue notification coalescing.
>>>
>>> \begin{lstlisting}
>>> struct virtio_net_ctrl_coal {
>>> @@ -1682,11 +1682,17 @@ \subsubsection{Control Virtqueue}\label{sec:Device Types / Network Device / Devi
>>> struct virtio_net_ctrl_coal coal;
>>> };
>>>
>>> +struct virtio_net_ctrl_mrg_coal_vq {
>>> + le16 num_entries; /* indicates number of valid entries */
>>> + struct virtio_net_ctrl_coal_vq entries[];
>>> +};
>>> +
>>> #define VIRTIO_NET_CTRL_NOTF_COAL 6
>>> #define VIRTIO_NET_CTRL_NOTF_COAL_TX_SET 0
>>> #define VIRTIO_NET_CTRL_NOTF_COAL_RX_SET 1
>>> #define VIRTIO_NET_CTRL_NOTF_COAL_VQ_SET 2
>>> #define VIRTIO_NET_CTRL_NOTF_COAL_VQ_GET 3
>>> + #define VIRTIO_NET_CTRL_NOTF_COAL_VQS_SET 4
>>> \end{lstlisting}
>>>
>>> Coalescing parameters:
>>> @@ -1706,6 +1712,7 @@ \subsubsection{Control Virtqueue}\label{sec:Device Types / Network Device / Devi
>>> \item For the command VIRTIO_NET_CTRL_NOTF_COAL_VQ_SET, the structure virtio_net_ctrl_coal_vq is write-only for the driver.
>>> \item For the command VIRTIO_NET_CTRL_NOTF_COAL_VQ_GET, \field{vq_index} and \field{reserved} are write-only
>>> for the driver, and the structure virtio_net_ctrl_coal is read-only for the driver.
>>> +\item For the command VIRTIO_NET_CTRL_NOTF_COAL_VQS_SET, the structure virtio_net_ctrl_mrg_coal_vq is write-only for the driver.
>>> \end{itemize}
>>>
>>> The class VIRTIO_NET_CTRL_NOTF_COAL has the following commands:
>>> @@ -1716,6 +1723,9 @@ \subsubsection{Control Virtqueue}\label{sec:Device Types / Network Device / Devi
>>> for an enabled transmit/receive virtqueue whose index is \field{vq_index}.
>>> \item VIRTIO_NET_CTRL_NOTF_COAL_VQ_GET: use the structure virtio_net_ctrl_coal_vq to get the \field{max_usecs} and \field{max_packets} parameters
>>> for an enabled transmit/receive virtqueue whose index is \field{vq_index}.
>>> +\item VIRTIO_NET_CTRL_NOTF_COAL_VQS_SET: use the structure virtio_net_ctrl_mrg_coal_vq to set the \field{max_usecs} and \field{max_packets} parameters
>>> + for \field{num_entries} enabled transmit/receive virtqueues. The corresponding index value
>>> + of each configured virtqueue is \field{vq_index}.
>>> \end{enumerate}
>>>
>>> The device may generate notifications more or less frequently than specified by set commands of the VIRTIO_NET_CTRL_NOTF_COAL class.
>>> @@ -1782,9 +1792,13 @@ \subsubsection{Control Virtqueue}\label{sec:Device Types / Network Device / Devi
>>>
>>> The driver MUST set \field{vq_index} to the virtqueue index of an enabled transmit or receive virtqueue.
>>>
>>> +The driver MUST set \field{num_entries} to a non-zero value and MUST NOT set \field{num_entries} to
>>> +a value greater than the number of enabled transmit and receive virtqueues.
>>> +
>>> The driver MUST have negotiated the VIRTIO_NET_F_NOTF_COAL feature when issuing commands VIRTIO_NET_CTRL_NOTF_COAL_TX_SET and VIRTIO_NET_CTRL_NOTF_COAL_RX_SET.
>>>
>>> -The driver MUST have negotiated the VIRTIO_NET_F_VQ_NOTF_COAL feature when issuing commands VIRTIO_NET_CTRL_NOTF_COAL_VQ_SET and VIRTIO_NET_CTRL_NOTF_COAL_VQ_GET.
>>> +The driver MUST have negotiated the VIRTIO_NET_F_VQ_NOTF_COAL feature when issuing commands
>>> +VIRTIO_NET_CTRL_NOTF_COAL_VQ_SET, VIRTIO_NET_CTRL_NOTF_COAL_VQS_SET and VIRTIO_NET_CTRL_NOTF_COAL_VQ_GET.
>>>
>>> The driver MUST ignore the values of coalescing parameters received from the VIRTIO_NET_CTRL_NOTF_COAL_VQ_GET command if the device responds with VIRTIO_NET_ERR.
>>>
>>> @@ -1794,10 +1808,10 @@ \subsubsection{Control Virtqueue}\label{sec:Device Types / Network Device / Devi
>>>
>>> The device SHOULD respond to VIRTIO_NET_CTRL_NOTF_COAL_TX_SET and VIRTIO_NET_CTRL_NOTF_COAL_RX_SET commands with VIRTIO_NET_ERR if it was not able to change the parameters.
>>>
>>> -The device MUST respond to the VIRTIO_NET_CTRL_NOTF_COAL_VQ_SET command with VIRTIO_NET_ERR if it was not able to change the parameters.
>>> +The device MUST respond to VIRTIO_NET_CTRL_NOTF_COAL_VQ_SET and VIRTIO_NET_CTRL_NOTF_COAL_VQS_SET commands with VIRTIO_NET_ERR if it was not able to change the parameters.
>> This one however is vague.
>> In fact the addition of VIRTIO_NET_CTRL_NOTF_COAL_VQ_SET made
>> VIRTIO_NET_CTRL_NOTF_COAL_TX_SET and VIRTIO_NET_CTRL_NOTF_COAL_RX_SET
>> also vague we just missed this.
>>
>>
>>
>> So, it is not clear whether if the device responds with VIRTIO_NET_ERR - it can change
>> parameters for some VQs but not others, or does it have to either
>> change parameters for all VQs or none at all.
>>
>> I think a better sematic is all or nothing.
>>
>> So I suggest combining this statement with one of
>> VIRTIO_NET_CTRL_NOTF_COAL_TX_SET and VIRTIO_NET_CTRL_NOTF_COAL_RX_SET,
>> and adding something like
>> "in this case, coalescing parameters MUST remain unchanged,
>> for all VQs".
>>
>> This bug bothers me. Maybe we should fix it before release?
>>
>> I also don't know why we repeat same text for VIRTIO_NET_CTRL_NOTF_COAL_VQ_SET,
>> VIRTIO_NET_CTRL_NOTF_COAL_TX_SET and VIRTIO_NET_CTRL_NOTF_COAL_RX_SET.
> I figured it out, TX/RX are SHOULD.
> I will fix TX/RX and you can add the text to this one.
Sorry Michael, it seems I missed this email yesterday.
I will add this.
Thanks,
Heng
>
>
>>>
>>> -The device MUST respond to VIRTIO_NET_CTRL_NOTF_COAL_VQ_SET and VIRTIO_NET_CTRL_NOTF_COAL_VQ_GET commands with
>>> -VIRTIO_NET_ERR if the designated virtqueue is not an enabled transmit or receive virtqueue.
>>> +The device MUST respond to VIRTIO_NET_CTRL_NOTF_COAL_VQ_SET, VIRTIO_NET_CTRL_NOTF_COAL_VQS_SET and VIRTIO_NET_CTRL_NOTF_COAL_VQ_GET
>>> +commands with VIRTIO_NET_ERR if the designated virtqueue is not an enabled transmit or receive virtqueue.
>>>
>>> Upon disabling and re-enabling a transmit virtqueue, the device MUST set the coalescing parameters of the virtqueue
>>> to those configured through the VIRTIO_NET_CTRL_NOTF_COAL_TX_SET command, or, if the driver did not set any TX coalescing parameters, to 0.
>>> --
>>> 1.8.3.1
>
> This publicly archived list offers a means to provide input to the
> OASIS Virtual I/O Device (VIRTIO) TC.
>
> In order to verify user consent to the Feedback License terms and
> to minimize spam in the list archive, subscription is required
> before posting.
>
> Subscribe: virtio-comment-subscribe@lists.oasis-open.org
> Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
> List help: virtio-comment-help@lists.oasis-open.org
> List archive: https://lists.oasis-open.org/archives/virtio-comment/
> Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
> List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
> Committee: https://www.oasis-open.org/committees/virtio/
> Join OASIS: https://www.oasis-open.org/join/
---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org
^ permalink raw reply [flat|nested] 20+ messages in thread
* [virtio-dev] Re: [virtio-comment] RE: [virtio-dev] RE: [virtio-comment] RE: [virtio-dev] RE: [virtio-comment] [PATCH v2] virtio-net: support setting coalescing params for multiple vqs
2024-01-24 13:18 ` Parav Pandit
@ 2024-01-25 3:05 ` Heng Qi
2024-01-25 3:51 ` Jason Wang
0 siblings, 1 reply; 20+ messages in thread
From: Heng Qi @ 2024-01-25 3:05 UTC (permalink / raw)
To: Parav Pandit
Cc: Jason Wang, Michael S. Tsirkin, Xuan Zhuo,
virtio-comment@lists.oasis-open.org,
virtio-dev@lists.oasis-open.org
在 2024/1/24 下午9:18, Parav Pandit 写道:
>> From: Heng Qi <hengqi@linux.alibaba.com>
>> Sent: Wednesday, January 24, 2024 6:31 PM
>>
>>
>> 在 2024/1/22 下午1:03, Parav Pandit 写道:
>>>> From: Heng Qi <hengqi@linux.alibaba.com>
>>>> Sent: Monday, January 22, 2024 8:27 AM
>>>>
>>>> 在 2024/1/20 下午5:59, Parav Pandit 写道:
>>>>>> From: Heng Qi <hengqi@linux.alibaba.com>
>>>>>> Sent: Wednesday, January 17, 2024 10:22 AM
>>>>>>
>>>>>> 在 2024/1/15 下午9:21, Parav Pandit 写道:
>>>>>>>> From: virtio-comment@lists.oasis-open.org
>>>>>>>> <virtio-comment@lists.oasis- open.org> On Behalf Of Heng Qi
>>>>>>>> Sent: Monday, January 15, 2024 6:36 PM
>>>>>>>>
>>>>>>>> Currently, when each time the driver attempts to update the
>>>>>>>> coalescing parameters for a vq, it needs to kick the device and
>>>>>>>> wait for the ctrlq response to return.
>>>>>>> It does not need to wait. This is some driver limitation that does
>>>>>>> not use
>>>>>> the queue as "queue".
>>>>>>> Such driver limitation should be removed in the driver. It does
>>>>>>> not qualify
>>>>>> as limitation.
>>>>>>
>>>>>> Yes, we don't have to wait.
>>>>>>
>>>>>> But in general, for user commands, it is necessary to obtain the
>>>>>> final results synchronously.
>>>>> Yes. Use initiated command can enqueue the request to cvq. Go to
>>>>> sleep
>>>> for several micro to milliseconds.
>>>>>> The user command cannot return before the final result is obtained.
>>>>>> And wait is not the problem this patch solves.
>>>>>>
>>>>> By not holding the rtnl lock, rest of the context that needs to
>>>>> enqueue the
>>>> request can progress such as that of netdim.
>>>>
>>>> Would like to see the using of rtnl lock changed.
>>>>
>>> Inside the virtnet_rx_dim_work() there should be rtnl lock call.
>>> A virtio_device level lock to be used for cvq. :)
>>>
>>>> In addition, I have made batching and asynchronousization of the
>>>> netdim command, you can refer to this patch:
>>>> https://lore.kernel.org/all/1705410693-118895-4-git-send-email-
>>>> hengqi@linux.alibaba.com/
>>>>
>>> In the listed above driver patch the motivation "to optimize the CPU
>>> overhead of the DIM worker caused by the guest being busy waiting for
>>> the command response result."
>>>
>>> Is not right.
>>> Because guest is still busy waiting.
>> There is no busy wait for guests, see get_cvq_work().
>>
> Ok. not always busy waiting, sometimes it does.
Busy waiting will only occur when the user command or dim command cannot
find the available buffer for cvq.
The user command is still in polling mode for now, I have not tried to
optimize this. Now it's about improving dim performance.
> virtnet_cvq_response() should not have flag..
The flag is mainly used to identify whether it is a user command. If so,
the previous polling mode will still be maintained.
>
> Who ever gets the OS global rtnl lock is calling virtnet_cvq_response() and checking and releasing.
> Shouldn’t be done this way with try lock etc.
> Rtnl lock is not supposed to protect low level driver some ctrlvq response flag.
To summarize, we now want to make some improvements to cvq:
1. Reasonable timeout in busy waiting mode or interrupt-based etc.
2. Batch processing (the core problem is how to get results from user
commands synchronously)
3. Remove rtnl_lock’s protection for ctrlq.
Yes, these improve the processing efficiency of cvq and avoid the
problem of OS hang.
I think 1, 3 need short term long term work and we need to try.
I don’t see any demand for batching user commands.
So for now, let's move forward with optimizing dim.
>
>
>>> Without batching, due to rtnl lock every VQ command is serialized as one
>> outstanding command at a time in virtnet_rx_dim_work().
>>> Due to this device is unable to take benefit of DMA batching at large scale.
>> Adding dim commands is now asynchronous, and the device will receive
>> batches of commands.
>>
> I am comparing the code where there is no rtnl lock with the code you posted.
Okay, I can post a patch later based on the code without rtnl_lock.
> The reference code of patch [1] needs to begin without OS global rtnl lock as base line.
>
> [1] https://lore.kernel.org/all/1705410693-118895-4-git-send-email-hengqi@linux.alibaba.com/
>
>>>>>>> This will enable driver to enqueue multiple cvq commands without
>>>>>>> waiting
>>>>>> for previous one.
>>>>>>> This will also enable device to find natural coalescing done on
>>>>>>> multiple
>>>>>> commands.
>>>>>>
>>>>>> When batch user commands occur, ensuring synchronization is a
>> concern.
>>>>>>>> The following path is observed: 1. Driver kicks the device; 2.
>>>>>>>> After the device receives the kick, CPU scheduling occurs and DMA
>>>>>>>> multiple buffers multiple times; 3. The device completes
>>>>>>>> processing and replies
>>>>>> with a response.
>>>>>>>> When large-queue devices issue multiple requests and kick the
>>>>>>>> device frequently, this often interrupt the work of the device-side CPU.
>>>>>>> When there is large devices and multiple driver notifications by a
>>>>>>> cpu that is N times faster than the device side cpu, the device
>>>>>>> may find natural
>>>>>> coalescing on the commands of a given cvq.
>>>>>>
>>>>>> First we have to solve the ctrlq batch adding user (ethtool) command.
>>>>>> Even if processed in a batch way on device side, the number of
>>>>>> kicks and the number of backend DMAs has not been reduced.
>>>>> Driver notifications are PCI writes so it should not hamper device
>>>>> side,
>>>> which can ignore them when they do not bother about it.
>>>>
>>>> Driver notifications need to be processed by the DPU, which
>>>> interferes with the CPU on the DPU.
>>>>
>>> I was asking, if there is anyway to disable for your DPU to ignore these
>> notifications while previous one is pending?
>>> From your above description, it seems there isn’t.
>> While the device is processing the request, additional kicks are ignored.
>>
> Ok. that is good. In that case the kicks are not interfering, right?
YES.
When a VM sends many DIM requests, the requests are serialized and
processed one at a time, and the number of kicks increases.
Even when we add commands asynchronously (such as this patch
https://lore.kernel.org/all/1705410693-118895-4-git-send-email-hengqi@linux.alibaba.com/),
the device needs to process each command present in the ctrlq individually.
Each command requires the device CPU to wait for the DMA operation to
complete before replying.
Although kicks are reduced, the device's DMA operations for batched
commands are serialized, which affects efficiency.
In addition, I would like to extend the VIRTIO_NET_F_VQ_NOTF_COAL
feature to support the new VIRTIO_NET_CTRL_NOTF_COAL_VQS_SET command.
I think this is ok.
Thanks!
---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org
^ permalink raw reply [flat|nested] 20+ messages in thread
* [virtio-dev] Re: [virtio-comment] RE: [virtio-dev] RE: [virtio-comment] RE: [virtio-dev] RE: [virtio-comment] [PATCH v2] virtio-net: support setting coalescing params for multiple vqs
2024-01-25 3:05 ` [virtio-dev] Re: [virtio-comment] " Heng Qi
@ 2024-01-25 3:51 ` Jason Wang
0 siblings, 0 replies; 20+ messages in thread
From: Jason Wang @ 2024-01-25 3:51 UTC (permalink / raw)
To: Heng Qi
Cc: Parav Pandit, Michael S. Tsirkin, Xuan Zhuo,
virtio-comment@lists.oasis-open.org,
virtio-dev@lists.oasis-open.org
On Thu, Jan 25, 2024 at 11:05 AM Heng Qi <hengqi@linux.alibaba.com> wrote:
>
>
>
> 在 2024/1/24 下午9:18, Parav Pandit 写道:
> >> From: Heng Qi <hengqi@linux.alibaba.com>
> >> Sent: Wednesday, January 24, 2024 6:31 PM
> >>
> >>
> >> 在 2024/1/22 下午1:03, Parav Pandit 写道:
> >>>> From: Heng Qi <hengqi@linux.alibaba.com>
> >>>> Sent: Monday, January 22, 2024 8:27 AM
> >>>>
> >>>> 在 2024/1/20 下午5:59, Parav Pandit 写道:
> >>>>>> From: Heng Qi <hengqi@linux.alibaba.com>
> >>>>>> Sent: Wednesday, January 17, 2024 10:22 AM
> >>>>>>
> >>>>>> 在 2024/1/15 下午9:21, Parav Pandit 写道:
> >>>>>>>> From: virtio-comment@lists.oasis-open.org
> >>>>>>>> <virtio-comment@lists.oasis- open.org> On Behalf Of Heng Qi
> >>>>>>>> Sent: Monday, January 15, 2024 6:36 PM
> >>>>>>>>
> >>>>>>>> Currently, when each time the driver attempts to update the
> >>>>>>>> coalescing parameters for a vq, it needs to kick the device and
> >>>>>>>> wait for the ctrlq response to return.
> >>>>>>> It does not need to wait. This is some driver limitation that does
> >>>>>>> not use
> >>>>>> the queue as "queue".
> >>>>>>> Such driver limitation should be removed in the driver. It does
> >>>>>>> not qualify
> >>>>>> as limitation.
> >>>>>>
> >>>>>> Yes, we don't have to wait.
> >>>>>>
> >>>>>> But in general, for user commands, it is necessary to obtain the
> >>>>>> final results synchronously.
> >>>>> Yes. Use initiated command can enqueue the request to cvq. Go to
> >>>>> sleep
> >>>> for several micro to milliseconds.
> >>>>>> The user command cannot return before the final result is obtained.
> >>>>>> And wait is not the problem this patch solves.
> >>>>>>
> >>>>> By not holding the rtnl lock, rest of the context that needs to
> >>>>> enqueue the
> >>>> request can progress such as that of netdim.
> >>>>
> >>>> Would like to see the using of rtnl lock changed.
> >>>>
> >>> Inside the virtnet_rx_dim_work() there should be rtnl lock call.
> >>> A virtio_device level lock to be used for cvq. :)
> >>>
> >>>> In addition, I have made batching and asynchronousization of the
> >>>> netdim command, you can refer to this patch:
> >>>> https://lore.kernel.org/all/1705410693-118895-4-git-send-email-
> >>>> hengqi@linux.alibaba.com/
> >>>>
> >>> In the listed above driver patch the motivation "to optimize the CPU
> >>> overhead of the DIM worker caused by the guest being busy waiting for
> >>> the command response result."
> >>>
> >>> Is not right.
> >>> Because guest is still busy waiting.
> >> There is no busy wait for guests, see get_cvq_work().
> >>
> > Ok. not always busy waiting, sometimes it does.
>
> Busy waiting will only occur when the user command or dim command cannot
> find the available buffer for cvq.
>
> The user command is still in polling mode for now, I have not tried to
> optimize this. Now it's about improving dim performance.
>
> > virtnet_cvq_response() should not have flag..
>
> The flag is mainly used to identify whether it is a user command. If so,
> the previous polling mode will still be maintained.
>
> >
> > Who ever gets the OS global rtnl lock is calling virtnet_cvq_response() and checking and releasing.
> > Shouldn’t be done this way with try lock etc.
> > Rtnl lock is not supposed to protect low level driver some ctrlvq response flag.
>
> To summarize, we now want to make some improvements to cvq:
> 1. Reasonable timeout in busy waiting mode or interrupt-based etc.
Let's use interrupt to avoid tricky code.
> 2. Batch processing (the core problem is how to get results from user
> commands synchronously)
> 3. Remove rtnl_lock’s protection for ctrlq.
Virtio-comment is not the right place to discuss these. Let's move it to netdev.
Thanks
---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org
^ permalink raw reply [flat|nested] 20+ messages in thread
end of thread, other threads:[~2024-01-25 3:52 UTC | newest]
Thread overview: 20+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-01-15 13:06 [virtio-dev] [PATCH v2] virtio-net: support setting coalescing params for multiple vqs Heng Qi
2024-01-15 13:21 ` [virtio-dev] RE: [virtio-comment] " Parav Pandit
2024-01-17 4:52 ` Heng Qi
2024-01-20 9:59 ` Parav Pandit
2024-01-22 2:57 ` [virtio-dev] Re: [virtio-comment] " Heng Qi
2024-01-22 5:03 ` [virtio-dev] " Parav Pandit
2024-01-22 7:36 ` [virtio-dev] " Michael S. Tsirkin
2024-01-23 5:55 ` [virtio-dev] " Parav Pandit
2024-01-23 7:15 ` [virtio-dev] " Michael S. Tsirkin
2024-01-23 7:28 ` [virtio-dev] " Parav Pandit
2024-01-24 2:43 ` [virtio-dev] " Heng Qi
2024-01-24 13:01 ` [virtio-dev] " Heng Qi
2024-01-24 13:18 ` Parav Pandit
2024-01-25 3:05 ` [virtio-dev] Re: [virtio-comment] " Heng Qi
2024-01-25 3:51 ` Jason Wang
2024-01-15 22:58 ` [virtio-dev] " Michael S. Tsirkin
2024-01-17 8:53 ` Heng Qi
2024-01-24 10:50 ` Michael S. Tsirkin
2024-01-24 10:52 ` Michael S. Tsirkin
2024-01-25 2:33 ` [virtio-dev] Re: [virtio-comment] " Heng Qi
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).