From: Heng Qi <hengqi@linux.alibaba.com>
To: "Michael S. Tsirkin" <mst@redhat.com>
Cc: virtio-dev@lists.oasis-open.org,
virtio-comment@lists.oasis-open.org,
Parav Pandit <parav@nvidia.com>, Jason Wang <jasowang@redhat.com>,
Yuri Benditovich <yuri.benditovich@daynix.com>,
Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Subject: [virtio-dev] Re: [virtio-comment] [PATCH v13] virtio-net: support inner header hash
Date: Wed, 26 Apr 2023 22:14:30 +0800
Message-ID: <19e6d4e6-e3d8-7eca-4d54-d113b4cc5504@linux.alibaba.com>
In-Reply-To: <20230425165659-mutt-send-email-mst@kernel.org>
On 2023/4/26 5:03 AM, Michael S. Tsirkin wrote:
> On Sun, Apr 23, 2023 at 03:35:32PM +0800, Heng Qi wrote:
>> 1. Currently, a received encapsulated packet has an outer and an inner header, but
>> the virtio device is unable to calculate the hash for the inner header. The same
>> flow can traverse through different tunnels, resulting in the encapsulated
>> packets being spread across multiple receive queues (refer to the figure below).
>> However, in certain scenarios, we may need to direct these encapsulated packets of
>> the same flow to a single receive queue. This facilitates the processing
>> of the flow by the same CPU to improve performance (warm caches, less locking, etc.).
>>
>> client1 client2
>> | +-------+ |
>> +------->|tunnels|<--------+
>> +-------+
>> | |
>> v v
>> +-----------------+
>> | monitoring host |
>> +-----------------+
>>
>> To achieve this, the device can calculate a symmetric hash based on the inner headers
>> of the same flow.
>>
>> 2. For legacy systems, they may lack entropy fields which modern protocols have in
>> the outer header, resulting in multiple flows with the same outer header but
>> different inner headers being directed to the same receive queue. This results in
>> poor receive performance.
>>
>> To address this limitation, inner header hash can be used to enable the device to advertise
>> the capability to calculate the hash for the inner packet, regaining better receive performance.
>>
>> Signed-off-by: Heng Qi <hengqi@linux.alibaba.com>
>> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> grammar in new text is still pretty bad, lots of typos too.
> Don't have time to fix it for you right now sorry, it's
> a holiday here.
I used a grammar checker, but it does not seem to have done a good job.
I'll go through the text more carefully. :)
>
>> ---
>> v12->v13:
>> 1. Add a GET command for hash_tunnel_types. @Parav Pandit
>> 2. Add tunneling protocol explanation. @Jason Wang
>> 3. Add comments on some usage scenarios for inner hash.
>>
>> v11->v12:
>> 1. Add a command VIRTIO_NET_CTRL_MQ_TUNNEL_CONFIG.
>> 2. Refine the commit log. @Michael S . Tsirkin
>> 3. Add some tunnel types.
>>
>> v10->v11:
>> 1. Revise commit log for clarity for readers.
>> 2. Some modifications to avoid undefined terms. @Parav Pandit
>> 3. Change VIRTIO_NET_F_HASH_TUNNEL dependency. @Parav Pandit
>> 4. Add the normative statements. @Parav Pandit
>>
>> v9->v10:
>> 1. Removed hash_report_tunnel related information. @Parav Pandit
>> 2. Re-describe the limitations of QoS for tunneling.
>> 3. Some clarification.
>>
>> v8->v9:
>> 1. Merge hash_report_tunnel_types into hash_report. @Parav Pandit
>> 2. Add tunnel security section. @Michael S . Tsirkin
>> 3. Add VIRTIO_NET_F_HASH_REPORT_TUNNEL.
>> 4. Fix some typos.
>> 5. Add more tunnel types. @Michael S . Tsirkin
>>
>> v7->v8:
>> 1. Add supported_hash_tunnel_types. @Jason Wang, @Parav Pandit
>> 2. Change hash_report_tunnel to hash_report_tunnel_types. @Parav Pandit
>> 3. Removed re-definition for inner packet hashing. @Parav Pandit
>> 4. Fix some typos. @Michael S . Tsirkin
>> 5. Clarify some sentences. @Michael S . Tsirkin
>>
>> v6->v7:
>> 1. Modify the wording of some sentences for clarity. @Michael S. Tsirkin
>> 2. Fix some syntax issues. @Michael S. Tsirkin
>>
>> v5->v6:
>> 1. Fix some syntax and capitalization issues. @Michael S. Tsirkin
>> 2. Use encapsulated/encapsulation uniformly. @Michael S. Tsirkin
>> 3. Move the links to introduction section. @Michael S. Tsirkin
>> 4. Clarify some sentences. @Michael S. Tsirkin
>>
>> v4->v5:
>> 1. Clarify some paragraphs. @Cornelia Huck
>> 2. Fix the u8 type. @Cornelia Huck
>>
>> v3->v4:
>> 1. Rename VIRTIO_NET_F_HASH_GRE_VXLAN_GENEVE_INNER to VIRTIO_NET_F_HASH_TUNNEL. @Jason Wang
>> 2. Make things clearer. @Jason Wang @Michael S. Tsirkin
>> 3. Keep the possibility to use inner hash for automatic receive steering. @Jason Wang
>> 4. Add the "Tunnel packet" paragraph to avoid repeating the GRE etc. many times. @Michael S. Tsirkin
>>
>> v2->v3:
>> 1. Add a feature bit for GRE/VXLAN/GENEVE inner hash. @Jason Wang
>> 2. Change \field{hash_tunnel} to \field{hash_report_tunnel}. @Jason Wang, @Michael S. Tsirkin
>>
>> v1->v2:
>> 1. Remove the patch for the bitmask fix. @Michael S. Tsirkin
>> 2. Clarify some paragraphs. @Jason Wang
>> 3. Add \field{hash_tunnel} and VIRTIO_NET_HASH_REPORT_GRE. @Yuri Benditovich
>>
>> device-types/net/description.tex | 159 ++++++++++++++++++++++++
>> device-types/net/device-conformance.tex | 1 +
>> device-types/net/driver-conformance.tex | 1 +
>> introduction.tex | 44 +++++++
>> 4 files changed, 205 insertions(+)
>>
>> diff --git a/device-types/net/description.tex b/device-types/net/description.tex
>> index 0500bb6..48e41f1 100644
>> --- a/device-types/net/description.tex
>> +++ b/device-types/net/description.tex
>> @@ -83,6 +83,9 @@ \subsection{Feature bits}\label{sec:Device Types / Network Device / Feature bits
>> \item[VIRTIO_NET_F_CTRL_MAC_ADDR(23)] Set MAC address through control
>> channel.
>>
>> +\item[VIRTIO_NET_F_HASH_TUNNEL(52)] Device supports inner header hash
>> + for tunnel-encapsulated packets.
>> +
>> \item[VIRTIO_NET_F_NOTF_COAL(53)] Device supports notifications coalescing.
>>
>> \item[VIRTIO_NET_F_GUEST_USO4 (54)] Driver can receive USOv4 packets.
>> @@ -139,6 +142,7 @@ \subsubsection{Feature bit requirements}\label{sec:Device Types / Network Device
>> \item[VIRTIO_NET_F_NOTF_COAL] Requires VIRTIO_NET_F_CTRL_VQ.
>> \item[VIRTIO_NET_F_RSC_EXT] Requires VIRTIO_NET_F_HOST_TSO4 or VIRTIO_NET_F_HOST_TSO6.
>> \item[VIRTIO_NET_F_RSS] Requires VIRTIO_NET_F_CTRL_VQ.
>> +\item[VIRTIO_NET_F_HASH_TUNNEL] Requires VIRTIO_NET_F_CTRL_VQ along with VIRTIO_NET_F_RSS and/or VIRTIO_NET_F_HASH_REPORT.
>> \end{description}
>>
>> \subsubsection{Legacy Interface: Feature bits}\label{sec:Device Types / Network Device / Feature bits / Legacy Interface: Feature bits}
>> @@ -198,6 +202,7 @@ \subsection{Device configuration layout}\label{sec:Device Types / Network Device
>> u8 rss_max_key_size;
>> le16 rss_max_indirection_table_length;
>> le32 supported_hash_types;
>> + le32 supported_tunnel_hash_types;
>> };
>> \end{lstlisting}
>> The following field, \field{rss_max_key_size} only exists if VIRTIO_NET_F_RSS or VIRTIO_NET_F_HASH_REPORT is set.
>> @@ -212,6 +217,11 @@ \subsection{Device configuration layout}\label{sec:Device Types / Network Device
>> Field \field{supported_hash_types} contains the bitmask of supported hash types.
>> See \ref{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Hash calculation for incoming packets / Supported/enabled hash types} for details of supported hash types.
>>
>> +Field \field{supported_tunnel_hash_types} only exists if the device supports inner header hash, i.e. if VIRTIO_NET_F_HASH_TUNNEL is set.
>> +
>> +Field \field{supported_tunnel_hash_types} contains the bitmask of supported tunnel hash types.
>> +See \ref{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Hash calculation for incoming packets / Supported/enabled encapsulation hash types} for details of supported tunnel hash types.
>> +
>> \devicenormative{\subsubsection}{Device configuration layout}{Device Types / Network Device / Device configuration layout}
>>
>> The device MUST set \field{max_virtqueue_pairs} to between 1 and 0x8000 inclusive,
>> @@ -848,6 +858,7 @@ \subsubsection{Processing of Incoming Packets}\label{sec:Device Types / Network
>> If the feature VIRTIO_NET_F_RSS was negotiated:
>> \begin{itemize}
>> \item The device uses \field{hash_types} of the virtio_net_rss_config structure as 'Enabled hash types' bitmask.
>> +\item The device uses \field{hash_tunnel_types} of the virtio_net_hash_tunnel_config structure as 'Enabled hash tunnel types' bitmask if VIRTIO_NET_F_HASH_TUNNEL was negotiated.
>> \item The device uses a key as defined in \field{hash_key_data} and \field{hash_key_length} of the virtio_net_rss_config structure (see
>> \ref{sec:Device Types / Network Device / Device Operation / Control Virtqueue / Receive-side scaling (RSS) / Setting RSS parameters}).
>> \end{itemize}
>> @@ -855,6 +866,7 @@ \subsubsection{Processing of Incoming Packets}\label{sec:Device Types / Network
>> If the feature VIRTIO_NET_F_RSS was not negotiated:
>> \begin{itemize}
>> \item The device uses \field{hash_types} of the virtio_net_hash_config structure as 'Enabled hash types' bitmask.
>> +\item The device uses \field{hash_tunnel_types} of the virtio_net_hash_tunnel_config structure as 'Enabled hash tunnel types' bitmask if VIRTIO_NET_F_HASH_TUNNEL was negotiated.
>> \item The device uses a key as defined in \field{hash_key_data} and \field{hash_key_length} of the virtio_net_hash_config structure (see
>> \ref{sec:Device Types / Network Device / Device Operation / Control Virtqueue / Automatic receive steering in multiqueue mode / Hash calculation}).
>> \end{itemize}
>> @@ -870,6 +882,7 @@ \subsubsection{Processing of Incoming Packets}\label{sec:Device Types / Network
>>
>> \subparagraph{Supported/enabled hash types}
>> \label{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Hash calculation for incoming packets / Supported/enabled hash types}
>> +This paragraph relies on definitions from \hyperref[intro:IP]{[IP]}, \hyperref[intro:UDP]{[UDP]} and \hyperref[intro:TCP]{[TCP]}.
>> Hash types applicable for IPv4 packets:
>> \begin{lstlisting}
>> #define VIRTIO_NET_HASH_TYPE_IPv4 (1 << 0)
>> @@ -980,6 +993,152 @@ \subsubsection{Processing of Incoming Packets}\label{sec:Device Types / Network
>> (see \ref{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Hash calculation for incoming packets / IPv6 packets without extension header}).
>> \end{itemize}
>>
>> +\paragraph{Inner Header Hash}
>> +\label{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Inner Header Hash}
>> +
>> +If VIRTIO_NET_F_HASH_TUNNEL has been negotiated, the device supports inner header hash and the driver can send
>> +commands VIRTIO_NET_CTRL_TUNNEL_HASH_SET and VIRTIO_NET_CTRL_TUNNEL_HASH_GET for the inner header hash configuration.
>> +
>> +struct virtio_net_hash_tunnel_config {
>> + le32 hash_tunnel_types;
>> +};
>> +
>> +#define VIRTIO_NET_CTRL_TUNNEL_HASH 7
>> + #define VIRTIO_NET_CTRL_TUNNEL_HASH_SET 0
>> + #define VIRTIO_NET_CTRL_TUNNEL_HASH_GET 1
>> +
>> +Field \field{hash_tunnel_types} contains a bitmask of configured hash tunnel types as
>> +defined in \ref{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Hash calculation for incoming packets / Supported/enabled encapsulation hash types}.
>> +
>> +The class VIRTIO_NET_CTRL_TUNNEL_HASH has the following commands:
>> +\begin{itemize}
>> +\item VIRTIO_NET_CTRL_TUNNEL_HASH_SET: set the \field{hash_tunnel_types} to configure the inner header hash calculation for the device.
>> +\item VIRTIO_NET_CTRL_TUNNEL_HASH_GET: get the \field{hash_tunnel_types} from the device.
>> +\end{itemize}
>> +
>> +For the command VIRTIO_NET_CTRL_TUNNEL_HASH_SET, the structure virtio_net_hash_tunnel_config is write-only for the driver.
>> +For the command VIRTIO_NET_CTRL_TUNNEL_HASH_GET, the structure virtio_net_hash_tunnel_config is read-only for the driver.
>> +
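To make the SET command above concrete, here is a minimal sketch of how a driver might lay out the request bytes. The class and command values are taken from the patch; the struct name tunnel_hash_cmd and the helpers put_le32(), get_le32() and make_tunnel_hash_set() are hypothetical names used only for illustration. The sketch assumes the usual control virtqueue layout (class byte, command byte, then command-specific data) and leaves out the ack byte and the virtqueue submission itself.

```c
#include <stdint.h>

/* Values copied from the quoted patch. */
#define VIRTIO_NET_CTRL_TUNNEL_HASH      7
#define VIRTIO_NET_CTRL_TUNNEL_HASH_SET  0
#define VIRTIO_NET_CTRL_TUNNEL_HASH_GET  1

#define VIRTIO_NET_HASH_TUNNEL_TYPE_VXLAN   (1u << 5)
#define VIRTIO_NET_HASH_TUNNEL_TYPE_GENEVE  (1u << 7)

/* Hypothetical wire image: class, command, then the le32 field
 * of struct virtio_net_hash_tunnel_config, stored byte by byte. */
struct tunnel_hash_cmd {
    uint8_t class_id;
    uint8_t cmd;
    uint8_t hash_tunnel_types[4];
};

/* Store a 32-bit value in little-endian byte order. */
static void put_le32(uint8_t out[4], uint32_t v)
{
    out[0] = (uint8_t)(v & 0xff);
    out[1] = (uint8_t)((v >> 8) & 0xff);
    out[2] = (uint8_t)((v >> 16) & 0xff);
    out[3] = (uint8_t)((v >> 24) & 0xff);
}

/* Read back a little-endian 32-bit value. */
static uint32_t get_le32(const uint8_t in[4])
{
    return (uint32_t)in[0] | ((uint32_t)in[1] << 8) |
           ((uint32_t)in[2] << 16) | ((uint32_t)in[3] << 24);
}

/* Build a VIRTIO_NET_CTRL_TUNNEL_HASH_SET request for the given bitmask. */
static struct tunnel_hash_cmd make_tunnel_hash_set(uint32_t types)
{
    struct tunnel_hash_cmd c;

    c.class_id = VIRTIO_NET_CTRL_TUNNEL_HASH;
    c.cmd = VIRTIO_NET_CTRL_TUNNEL_HASH_SET;
    put_le32(c.hash_tunnel_types, types);
    return c;
}
```

A GET request would carry the same class with VIRTIO_NET_CTRL_TUNNEL_HASH_GET, and the driver would decode the returned field with get_le32().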
>> +\subparagraph{Tunnel/Encapsulated packet}
>> +\label{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Hash calculation for incoming packets / Tunnel/Encapsulated packet}
>> +
>> +A tunnel packet is encapsulated from the original packet based on the tunneling protocol (only a single level of
>> +encapsulation is currently supported). The encapsulated packet contains an outer header and an inner header, and
>> +the device calculates the hash over either the inner header or the outer header.
>> +
>> +If VIRTIO_NET_F_HASH_TUNNEL is negotiated and a received encapsulated packet's outer header matches one of the
>> +configured \field{hash_tunnel_types}, the hash of the inner header is calculated.
>> +
>> +Supported encapsulated packet types:
>> +\begin{itemize}
>> +\item \hyperref[intro:gre_rfc2784]{[GRE_rfc2784]}: the outer header is over IPv4 and the inner header is over IPv4. The outer header does not contain the transport protocol.
>> +\item \hyperref[intro:gre_rfc2890]{[GRE_rfc2890]}: the outer header is over IPv4 and the inner header is over IPv4. The outer header does not contain the transport protocol.
>> +\item \hyperref[intro:gre_rfc7676]{[GRE_rfc7676]}: the outer header is over IPv4/IPv6 and the inner header is over IPv4/IPv6. The outer header does not contain the transport protocol.
>> +\item \hyperref[intro:gre_in_udp_rfc8086]{[GRE-in-UDP]}: the outer header is over IPv4/IPv6 and the inner header is over IPv4/IPv6. The outer header uses UDP as the transport protocol.
>> +\item \hyperref[intro:vxlan]{[VXLAN]}: the outer header is over IPv4/IPv6 and the inner header is over IPv4/IPv6. The outer header uses UDP as the transport protocol.
>> +\item \hyperref[intro:vxlan_gpe]{[VXLAN-GPE]}: the outer header is over IPv4/IPv6 and the inner header is over IPv4/IPv6. The outer header uses UDP as the transport protocol.
>> +\item \hyperref[intro:geneve]{[GENEVE]}: the outer header is over IPv4/IPv6 and the inner header is over IPv4/IPv6. The outer header uses UDP as the transport protocol.
>> +\item \hyperref[intro:ipip]{[IPIP]}: the outer header is over IPv4 and the inner header is over IPv4. The outer header does not contain the transport protocol.
>> +\item \hyperref[intro:nvgre]{[NVGRE]}: the outer header is over IPv4/IPv6 and the inner header is over IPv4/IPv6. The outer header does not contain the transport protocol.
>> +\item \hyperref[intro:stt]{[STT]}: the outer header is over IPv4/IPv6 and the inner header is over IPv4/IPv6. The outer header uses a TCP-like transport protocol.
>> +\end{itemize}
>> +
>> +If VIRTIO_NET_HASH_TUNNEL_TYPE_NONE is set or the encapsulation type is not included in \field{hash_tunnel_types},
>> +the hash of the outer header is calculated for the received encapsulated packet.
>> +
>> +The hash is calculated for the received non-encapsulated packet as if VIRTIO_NET_F_HASH_TUNNEL was not negotiated.
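The selection rule in the paragraphs above can be sketched roughly as follows. This is only an illustration of the wording in the patch, not device code: select_hash_header() and enum hash_header are hypothetical names, and pkt_tunnel_type is assumed to be the single VIRTIO_NET_HASH_TUNNEL_TYPE_* bit that the device's parser matched for the packet, or 0 for a non-encapsulated packet.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit values copied from the quoted patch. */
#define VIRTIO_NET_HASH_TUNNEL_TYPE_NONE    (1u << 0)
#define VIRTIO_NET_HASH_TUNNEL_TYPE_VXLAN   (1u << 5)
#define VIRTIO_NET_HASH_TUNNEL_TYPE_GENEVE  (1u << 7)

/* For a non-encapsulated packet, "outer" simply means its only header. */
enum hash_header { HASH_OUTER_HEADER, HASH_INNER_HEADER };

static enum hash_header select_hash_header(bool hash_tunnel_negotiated,
                                           uint32_t hash_tunnel_types,
                                           uint32_t pkt_tunnel_type)
{
    /* Feature not negotiated, or packet not encapsulated:
     * behave as if VIRTIO_NET_F_HASH_TUNNEL did not exist. */
    if (!hash_tunnel_negotiated || pkt_tunnel_type == 0)
        return HASH_OUTER_HEADER;

    /* NONE set: the outer header is hashed. */
    if (hash_tunnel_types & VIRTIO_NET_HASH_TUNNEL_TYPE_NONE)
        return HASH_OUTER_HEADER;

    /* Encapsulation type enabled: hash the inner header. */
    if (hash_tunnel_types & pkt_tunnel_type)
        return HASH_INNER_HEADER;

    /* Encapsulation type not enabled: fall back to the outer header. */
    return HASH_OUTER_HEADER;
}
```

For example, with only VXLAN enabled, a VXLAN packet is hashed over its inner header while a GENEVE packet is still hashed over its outer header.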
>> +
>> +\subparagraph{Supported/enabled encapsulation hash types}
>> +\label{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Hash calculation for incoming packets / Supported/enabled encapsulation hash types}
>> +
>> +\begin{lstlisting}
>> +#define VIRTIO_NET_HASH_TUNNEL_TYPE_NONE (1 << 0)
>> +\end{lstlisting}
>> +
>> +Supported encapsulation hash types:
>> +Hash type applicable for inner payload of the \hyperref[intro:gre_rfc2784]{[GRE_rfc2784]} packet:
>> +\begin{lstlisting}
>> +#define VIRTIO_NET_HASH_TUNNEL_TYPE_GRE_2784 (1 << 1)
>> +\end{lstlisting}
>> +Hash type applicable for inner payload of the \hyperref[intro:gre_rfc2890]{[GRE_rfc2890]} packet:
>> +\begin{lstlisting}
>> +#define VIRTIO_NET_HASH_TUNNEL_TYPE_GRE_2890 (1 << 2)
>> +\end{lstlisting}
>> +Hash type applicable for inner payload of the \hyperref[intro:gre_rfc7676]{[GRE_rfc7676]} packet:
>> +\begin{lstlisting}
>> +#define VIRTIO_NET_HASH_TUNNEL_TYPE_GRE_7676 (1 << 3)
>> +\end{lstlisting}
>> +Hash type applicable for inner payload of the \hyperref[intro:gre_in_udp_rfc8086]{[GRE-in-UDP]} packet:
>> +\begin{lstlisting}
>> +#define VIRTIO_NET_HASH_TUNNEL_TYPE_GRE_UDP (1 << 4)
>> +\end{lstlisting}
>> +Hash type applicable for inner payload of the \hyperref[intro:vxlan]{[VXLAN]} packet:
>> +\begin{lstlisting}
>> +#define VIRTIO_NET_HASH_TUNNEL_TYPE_VXLAN (1 << 5)
>> +\end{lstlisting}
>> +Hash type applicable for inner payload of the \hyperref[intro:vxlan_gpe]{[VXLAN-GPE]} packet:
>> +\begin{lstlisting}
>> +#define VIRTIO_NET_HASH_TUNNEL_TYPE_VXLAN_GPE (1 << 6)
>> +\end{lstlisting}
>> +Hash type applicable for inner payload of the \hyperref[intro:geneve]{[GENEVE]} packet:
>> +\begin{lstlisting}
>> +#define VIRTIO_NET_HASH_TUNNEL_TYPE_GENEVE (1 << 7)
>> +\end{lstlisting}
>> +Hash type applicable for inner payload of the \hyperref[intro:ipip]{[IPIP]} packet:
>> +\begin{lstlisting}
>> +#define VIRTIO_NET_HASH_TUNNEL_TYPE_IPIP (1 << 8)
>> +\end{lstlisting}
>> +Hash type applicable for inner payload of the \hyperref[intro:nvgre]{[NVGRE]} packet:
>> +\begin{lstlisting}
>> +#define VIRTIO_NET_HASH_TUNNEL_TYPE_NVGRE (1 << 9)
>> +\end{lstlisting}
>> +Hash type applicable for inner payload of the \hyperref[intro:stt]{[STT]} packet:
>> +\begin{lstlisting}
>> +#define VIRTIO_NET_HASH_TUNNEL_TYPE_STT (1 << 10)
>> +\end{lstlisting}
> Too many protocols to support. Can we start with just one or two?
This does not mean that every device needs to implement and support all
of these; a device can choose to support only the protocols it wants.
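As a rough sketch of what this looks like from the driver side: the device advertises whatever subset it implements in supported_tunnel_hash_types, and the driver only enables types inside that subset. The helper name usable_tunnel_types() and its parameter names are hypothetical; the bit values come from the patch.

```c
#include <stdint.h>

/* Bit values copied from the quoted patch. */
#define VIRTIO_NET_HASH_TUNNEL_TYPE_VXLAN_GPE  (1u << 6)
#define VIRTIO_NET_HASH_TUNNEL_TYPE_GENEVE     (1u << 7)
#define VIRTIO_NET_HASH_TUNNEL_TYPE_STT        (1u << 10)

/* Keep only the tunnel types the device advertises as supported.
 * The result is what the driver would put into hash_tunnel_types. */
static uint32_t usable_tunnel_types(uint32_t supported_tunnel_hash_types,
                                    uint32_t wanted_types)
{
    return supported_tunnel_hash_types & wanted_types;
}
```

A device that implements only VXLAN-GPE and GENEVE would then simply never see STT enabled, even if the driver would have liked to use it.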
I added these because we have large-scale application scenarios for the
modern protocols VXLAN-GPE/GENEVE:
+\item In scenarios where the same flow passing through different tunnels is expected to be received in the same queue,
+ warm caches, less locking, etc. are optimized to obtain receiving performance.
Maybe start with legacy GRE, VXLAN-GPE and GENEVE? There is a little overlap among them, though.
Thanks.