From: Jason Wang
Date: Tue, 22 Mar 2022 15:57:48 +0800
Subject: [virtio-dev] Re: [PATCH v12] virtio-net: support device stats
To: Xuan Zhuo, virtio-dev@lists.oasis-open.org
Cc: "Michael S. Tsirkin"
In-Reply-To: <20220315032402.6088-1-xuanzhuo@linux.alibaba.com>

On 2022/3/15 11:24 AM, Xuan Zhuo wrote:
> This patch allows the driver to obtain some statistics from the device.
>
> In the back-end implementation, we can count a lot of such information,
> which can be used for debugging and judging the running status of the
> back-end. We hope to directly display it to the user through ethtool.
>
> To get stats atomically, try to get stats for all queue pairs in one
> command.
>
> Signed-off-by: Xuan Zhuo
> Suggested-by: Michael S. Tsirkin
> ---
>  conformance.tex |   2 +
>  content.tex     | 406 +++++++++++++++++++++++++++++++++++++++++++++++-
>  2 files changed, 405 insertions(+), 3 deletions(-)
>
> diff --git a/conformance.tex b/conformance.tex
> index 42f8537..c67f877 100644
> --- a/conformance.tex
> +++ b/conformance.tex
> @@ -142,6 +142,7 @@ \section{Conformance Targets}\label{sec:Conformance / Conformance Targets}
>  \item \ref{drivernormative:Device Types / Network Device / Device Operation / Control Virtqueue / Automatic receive steering in multiqueue mode}
>  \item \ref{drivernormative:Device Types / Network Device / Device Operation / Control Virtqueue / Offloads State Configuration / Setting Offloads State}
>  \item \ref{drivernormative:Device Types / Network Device / Device Operation / Control Virtqueue / Receive-side scaling (RSS) }
> +\item \ref{drivernormative:Device Types / Network Device / Device Operation / Control Virtqueue / Device Stats}
>  \end{itemize}
>
>  \conformance{\subsection}{Block Driver Conformance}\label{sec:Conformance / Driver Conformance / Block Driver Conformance}
> @@ -401,6 +402,7 @@ \section{Conformance Targets}\label{sec:Conformance / Conformance Targets}
>  \item \ref{devicenormative:Device Types / Network Device / Device Operation / Control Virtqueue / Gratuitous Packet Sending}
>  \item \ref{devicenormative:Device Types / Network Device / Device Operation / Control Virtqueue / Automatic receive steering in multiqueue mode}
>  \item \ref{devicenormative:Device Types / Network Device / Device Operation / Control Virtqueue / Receive-side scaling (RSS) / RSS processing}
> +\item \ref{devicenormative:Device Types / Network Device / Device Operation / Control Virtqueue / Device Stats}
>  \end{itemize}
>
>  \conformance{\subsection}{Block Device Conformance}\label{sec:Conformance / Device Conformance / Block Device Conformance}
> diff --git a/content.tex b/content.tex
> index c6f116c..81f325d 100644
> --- a/content.tex
> +++ b/content.tex
> @@ -3092,6 +3092,9 @@ \subsection{Feature bits}\label{sec:Device Types / Network Device / Feature bits
>  \item[VIRTIO_NET_F_CTRL_MAC_ADDR(23)] Set MAC address through control
>      channel.
>
> +\item[VIRTIO_NET_F_CTRL_STATS(55)] Device can provide device-level statistics
> +    to the driver through the control channel.
> +
>  \item[VIRTIO_NET_F_HOST_USO (56)] Device can receive USO packets. Unlike UFO
>   (fragmenting the packet) the USO splits large UDP packet
>   to several segments when each of these smaller packets has UDP header.
> @@ -3137,6 +3140,7 @@ \subsubsection{Feature bit requirements}\label{sec:Device Types / Network Device
>  \item[VIRTIO_NET_F_GUEST_ANNOUNCE] Requires VIRTIO_NET_F_CTRL_VQ.
>  \item[VIRTIO_NET_F_MQ] Requires VIRTIO_NET_F_CTRL_VQ.
>  \item[VIRTIO_NET_F_CTRL_MAC_ADDR] Requires VIRTIO_NET_F_CTRL_VQ.
> +\item[VIRTIO_NET_F_CTRL_STATS] Requires VIRTIO_NET_F_CTRL_VQ.
>  \item[VIRTIO_NET_F_RSC_EXT] Requires VIRTIO_NET_F_HOST_TSO4 or VIRTIO_NET_F_HOST_TSO6.
>  \item[VIRTIO_NET_F_RSS] Requires VIRTIO_NET_F_CTRL_VQ.
>  \end{description}
> @@ -4015,6 +4019,7 @@ \subsubsection{Control Virtqueue}\label{sec:Device Types / Network Device / Devi
>          u8 command;
>          u8 command-specific-data[];
>          u8 ack;
> +        u8 command-specific-data-reply[];
>  };
>
>  /* ack values */
> @@ -4023,9 +4028,11 @@ \subsubsection{Control Virtqueue}\label{sec:Device Types / Network Device / Devi
>  \end{lstlisting}
>
>  The \field{class}, \field{command} and command-specific-data are set by the
> -driver, and the device sets the \field{ack} byte. There is little it can
> -do except issue a diagnostic if \field{ack} is not
> -VIRTIO_NET_OK.
> +driver, and the device sets the \field{ack} byte and optionally
> +\field{command-specific-data-reply}. There is little the driver can
> +do except issue a diagnostic if \field{ack} is not VIRTIO_NET_OK.
> +
> +The command VIRTIO_NET_CTRL_STATS_GET contains \field{command-specific-data-reply}.
>
>  \paragraph{Packet Receive Filtering}\label{sec:Device Types / Network Device / Device Operation / Control Virtqueue / Packet Receive Filtering}
>  \label{sec:Device Types / Network Device / Device Operation / Control Virtqueue / Setting Promiscuous Mode}%old label for latexdiff
> @@ -4471,6 +4478,399 @@ \subsubsection{Control Virtqueue}\label{sec:Device Types / Network Device / Devi
>  according to the native endian of the guest rather than
>  (necessarily when not using the legacy interface) little-endian.
>
> +\paragraph{Device Stats}\label{sec:Device Types / Network Device / Device Operation / Control Virtqueue / Device Stats}
> +
> +If the VIRTIO_NET_F_CTRL_STATS feature is negotiated, the driver can
> +get the device stats from the device in \field{command-specific-data-reply}.
> +
> +To get the stats, the following definitions are used:
> +\begin{lstlisting}
> +#define VIRTIO_NET_CTRL_STATS         6
> +#define VIRTIO_NET_CTRL_STATS_GET     0
> +
> +#define VIRTIO_NET_STATS_TYPE_CVQ       0
> +#define VIRTIO_NET_STATS_TYPE_RX_BASIC  1
> +#define VIRTIO_NET_STATS_TYPE_RX_CSUM   2
> +#define VIRTIO_NET_STATS_TYPE_RX_GSO    3
> +#define VIRTIO_NET_STATS_TYPE_RX_RESET  4
> +#define VIRTIO_NET_STATS_TYPE_TX_BASIC  5
> +#define VIRTIO_NET_STATS_TYPE_TX_CSUM   6
> +#define VIRTIO_NET_STATS_TYPE_TX_GSO    7
> +#define VIRTIO_NET_STATS_TYPE_TX_RESET  8
> +
> +\end{lstlisting}
> +
> +Use the command VIRTIO_NET_CTRL_STATS_GET and \field{command-specific-data}
> +containing struct virtio_net_ctrl_queue_stats to get the device stats.
> +The result is returned by \field{command-specific-data-reply}.
> +The stats ware returned in the order of the type specified in the

s/ware/were/ and s/type/types/ ?

> +\field{virtio_net_ctrl_queue_stats}.
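
To check my reading of the request side, here is a minimal sketch of how a driver might build \field{command-specific-data} for VIRTIO_NET_CTRL_STATS_GET. The struct mirrors the virtio_net_ctrl_queue_stats layout proposed in this patch; `struct stats_entry` and `stats_req_build()` are hypothetical names of mine, and since the patch declares the fields as plain u16, byte-order conversion is deliberately left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define VIRTIO_NET_STATS_TYPE_RX_BASIC  1
#define VIRTIO_NET_STATS_TYPE_TX_BASIC  5

/* One (queue_num, type) pair; hypothetical name for the anonymous
 * inner struct of virtio_net_ctrl_queue_stats. */
struct stats_entry {
    uint16_t queue_num;
    uint16_t type;
};

/* Request layout as proposed in the patch. */
struct virtio_net_ctrl_queue_stats {
    uint16_t nstats;
    struct stats_entry stats[];
};

/* Allocate and fill a request holding n entries.
 * On success, *len receives the number of bytes to place in
 * command-specific-data. Returns NULL on allocation failure. */
static struct virtio_net_ctrl_queue_stats *
stats_req_build(const struct stats_entry *entries, uint16_t n, size_t *len)
{
    size_t sz = sizeof(struct virtio_net_ctrl_queue_stats) +
                (size_t)n * sizeof(struct stats_entry);
    struct virtio_net_ctrl_queue_stats *req;

    assert(entries != NULL || n == 0);
    assert(len != NULL);

    req = malloc(sz);
    if (!req)
        return NULL;

    req->nstats = n;
    for (uint16_t i = 0; i < n; i++)
        req->stats[i] = entries[i];   /* order here is the reply order */

    *len = sz;
    return req;
}
```

The device is expected to answer in the same order as the entries, so the driver only has to remember the list it passed in. The caller frees the request once the command completes.
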
> +
> +The following layout structures are used:
> +
> +\field{command-specific-data}
> +\begin{lstlisting}
> +struct virtio_net_ctrl_queue_stats {
> +    u16 nstats;
> +    struct {
> +        u16 queue_num;
> +        u16 type;
> +    } stats[];
> +};
> +\end{lstlisting}
> +
> +\field{command-specific-data-reply}
> +\begin{lstlisting}
> +struct virtio_net_stats_cvq {
> +    le64 command_num;
> +    le64 ok_num;
> +};
> +
> +struct virtio_net_stats_rx_basic {
> +    le64 rx_packets;
> +    le64 rx_bytes;
> +
> +    le64 rx_notification;
> +    le64 rx_interrupt;
> +
> +    le64 rx_drop;
> +    le64 rx_drop_overruns;
> +};
> +
> +struct virtio_net_stats_rx_csum {
> +    le64 rx_csum_valid;
> +    le64 rx_needs_csum;
> +    le64 rx_csum_bad;
> +    le64 rx_csum_none;
> +};
> +
> +struct virtio_net_stats_rx_gso {
> +    le64 rx_gso_packets;
> +    le64 rx_gso_bytes;
> +    le64 rx_gso_packets_coalesced;
> +    le64 rx_gso_bytes_coalesced;
> +    le64 rx_gso_segments;
> +    le64 rx_gso_segments_bytes;
> +};
> +
> +struct virtio_net_stats_rx_reset {
> +    le64 rx_reset;
> +};
> +
> +struct virtio_net_stats_tx_basic {
> +    le64 tx_packets;
> +    le64 tx_bytes;
> +
> +    le64 tx_notification;
> +    le64 tx_interrupt;
> +
> +    le64 tx_drop;
> +    le64 tx_drop_malformed;
> +};
> +
> +struct virtio_net_stats_tx_csum {
> +    le64 tx_csum_none;
> +    le64 tx_needs_csum;
> +};
> +
> +struct virtio_net_stats_tx_gso {
> +    le64 tx_gso_packets;
> +    le64 tx_gso_bytes;
> +    le64 tx_gso_packets_split;
> +    le64 tx_gso_bytes_split;
> +    le64 tx_gso_segments;
> +    le64 tx_gso_segments_bytes;
> +};
> +
> +struct virtio_net_stats_tx_reset {
> +    le64 tx_reset;
> +};
> +
> +\end{lstlisting}
> +
> +\begin{description}
> +    \item [nstats]
> +        The number of \field{stats}.

This looks unnecessary, since it can be deduced from the buffer length.

> +
> +    \item [queue_num]
> +        The number of the virtqueue to obtain the statistics.
> +
> +    \item [type]
> +        The type of the stats to be obtained.
> +
> +\end{description}
> +
> +Correspondence between the vq type, the stats type, the stats structure and the
> +required features.
> +\begin{tabular}{ |l|l|l|l| }
> +    \hline
> +    VQ Type                  & Stats Type                      & Stats Structure           & Features \\ \hline
> +
> +    controlq                 & VIRTIO_NET_STATS_TYPE_CVQ       & virtio_net_stats_cvq      & \\ \hline
> +
> +    \multirow{4}*{receiveq}  & VIRTIO_NET_STATS_TYPE_RX_BASIC  & virtio_net_stats_rx_basic & \\ \cline{2-4}
> +                             & VIRTIO_NET_STATS_TYPE_RX_CSUM   & virtio_net_stats_rx_csum  & VIRTIO_NET_F_GUEST_CSUM \\ \cline{2-4}
> +                             & VIRTIO_NET_STATS_TYPE_RX_GSO    & virtio_net_stats_rx_gso   & VIRTIO_NET_F_GUEST_TSO4 or\newline
> +                                                                                             VIRTIO_NET_F_GUEST_TSO6 or\newline
> +                                                                                             VIRTIO_NET_F_GUEST_UFO \\ \cline{2-4}
> +                             & VIRTIO_NET_STATS_TYPE_RX_RESET  & virtio_net_stats_rx_reset & VIRTIO_F_RING_RESET \\ \hline
> +
> +    \multirow{4}*{transmitq} & VIRTIO_NET_STATS_TYPE_TX_BASIC  & virtio_net_stats_tx_basic & \\ \cline{2-4}
> +                             & VIRTIO_NET_STATS_TYPE_TX_CSUM   & virtio_net_stats_tx_csum  & VIRTIO_NET_F_CSUM \\ \cline{2-4}
> +                             & VIRTIO_NET_STATS_TYPE_TX_GSO    & virtio_net_stats_tx_gso   & VIRTIO_NET_F_HOST_TSO4 or\newline
> +                                                                                             VIRTIO_NET_F_HOST_TSO6 or\newline
> +                                                                                             VIRTIO_NET_F_HOST_USO or\newline
> +                                                                                             VIRTIO_NET_F_HOST_UFO \\ \cline{2-4}
> +                             & VIRTIO_NET_STATS_TYPE_TX_RESET  & virtio_net_stats_tx_reset & VIRTIO_F_RING_RESET \\
> +    \hline
> +\end{tabular}
> +
> +
> +\subparagraph{Controlq Stats}\label{sec:Device Types / Network Device / Device Operation / Control Virtqueue / Device Stats / Controlq Stats}
> +
> +The structure corresponding to the controlq stats is virtio_net_stats_cvq.
> +
> +\begin{description}
> +    \item [command_num]
> +        The number of commands, including the current command.
> +
> +    \item [ok_num]
> +        The number of commands (including the current command) where the ack was VIRTIO_NET_OK.
> +\end{description}
> +
> +
> +\subparagraph{Receiveq Basic Stats}\label{sec:Device Types / Network Device / Device Operation / Control Virtqueue / Device Stats / Receiveq Basic Stats}
> +
> +The structure corresponding to the receiveq basic stats is virtio_net_stats_rx_basic.
> +
> +Receiveq basic stats doesn't need any features, as long as the device supports
> +VIRTIO_NET_F_CTRL_STATS. The following are the receiveq basic stats.
> +
> +\begin{description}
> +    \item [rx_packets]
> +        The number of packets received by device (not the packets passed to the
> +        guest), including the dropped packets by device.
> +
> +    \item [rx_bytes]
> +        The number of bytes received by device (not the packets passed to the
> +        guest), including the dropped packets by device.
> +
> +    \item [rx_notification]
> +        The number of driver notifications.
> +
> +    \item [rx_interrupt]
> +        The number of device interrupts.
> +
> +    \item [rx_drop]
> +        The number of packets dropped by the receiveq. Contains all kinds of
> +        packet drop.
> +
> +    \item [rx_drop_overruns]
> +        The number of packets dropped by the receiveq when no more descriptors
> +        were available.
> +
> +\end{description}
> +
> +\subparagraph{Transmitq Basic Stats}\label{sec:Device Types / Network Device / Device Operation / Control Virtqueue / Device Stats / Transmitq Basic Stats}
> +
> +The structure corresponding to the transmitq basic stats is virtio_net_stats_tx_basic.
> +
> +Transmitq basic stats doesn't need any features, as long as the device supports
> +VIRTIO_NET_F_CTRL_STATS. The following are the transmitq basic stats.
> +
> +\begin{description}
> +    \item [tx_packets]
> +        The number of packets sent by device (not the packets got from the
> +        guest), excluding the dropped packets by device.

"packets dropped by device"?

> +
> +    \item [tx_bytes]
> +        The number of bytes sent by device (not the packets got from the
> +        guest), excluding the dropped packets by device.
> +
> +    \item [tx_notification]
> +        The number of driver notifications.
> +
> +    \item [tx_interrupt]
> +        The number of device interrupts.
> +
> +    \item [tx_drop]
> +        The number of packets dropped by the transmitq. Contains all kinds of
> +        packet drop.
> +
> +    \item [tx_drop_malformed]
> +        The number of packets dropped when the descriptor is in an error state.
> +        For example, the buffer is too short.

I wonder if "tx_errors" is better (this is what I see from at least two
other NIC vendors).

> +
> +\end{description}
> +
> +\subparagraph{Receiveq CSUM Stats}\label{sec:Device Types / Network Device / Device Operation / Control Virtqueue / Device Stats / Receiveq CSUM Stats}
> +
> +The structure corresponding to the receiveq csum stats is virtio_net_stats_rx_csum.
> +
> +Only after the VIRTIO_NET_F_GUEST_CSUM negotiation is successful, the receiveq
> +csum stats can be obtained.
> +
> +The following are the receiveq csum stats:
> +
> +\begin{description}
> +    \item [rx_csum_valid]
> +        The number of packets with VIRTIO_NET_HDR_F_DATA_VALID.

A question: it looks to me like the stats refer to the phy -> device counters,
not the device -> driver counters (which could be counted by the driver).

If this is true, technically, the device can't receive a packet with
VIRTIO_NET_HDR_F_DATA_VALID.

FYI, e1000e had:

ethtool -S enp0s31f6 | grep csum
     rx_csum_offload_good: 601959
     rx_csum_offload_errors: 0

> +
> +    \item [rx_needs_csum]
> +        The number of packets with VIRTIO_NET_HDR_F_NEEDS_CSUM.
> +
> +    \item [rx_csum_bad]
> +        The number of packets with abnormal csum.
> +
> +    \item [rx_csum_none]
> +        The number of packets without hardware csum. The packet here refers to
> +        the non-TCP/UDP packet that the backend cannot recognize.

This is probably not correct. We may have a guest without
VIRTIO_NET_F_GUEST_CSUM support.

> +
> +\end{description}
> +
> +\subparagraph{Transmitq CSUM Stats}\label{sec:Device Types / Network Device / Device Operation / Control Virtqueue / Device Stats / Transmitq CSUM Stats}
> +
> +The structure corresponding to the transmitq csum stats is virtio_net_stats_tx_csum.
> +
> +Only after the VIRTIO_NET_F_CSUM negotiation is successful, the transmitq csum
> +stats can be obtained.
> +
> +The following are the transmitq csum stats:
> +
> +\begin{description}
> +    \item [tx_csum_none]
> +        The number of packets that didn't require hardware csum.
> +
> +    \item [tx_needs_csum]
> +        The number of packets that required hardware csum.
> +
> +\end{description}
> +
> +\subparagraph{Receiveq GSO Stats}\label{sec:Device Types / Network Device / Device Operation / Control Virtqueue / Device Stats / Receiveq GSO Stats}
> +
> +The structure corresponding to the receiveq gso stats is virtio_net_stats_rx_gso.
> +
> +If one of the VIRTIO_NET_F_GUEST_TSO4, TSO6, or UFO options have
> +been negotiated, the receiveq gso stats can be obtained.

We probably need to use "GSO" instead of "gso" for the whole patch.

> +
> +Rx gso packets refer to packets passed by the device to the driver where
> +\field{gso_type} is not VIRTIO_NET_HDR_GSO_NONE.
> +
> +\begin{description}
> +    \item [rx_gso_packets]
> +        The number of the rx gso packets.
> +
> +    \item [rx_gso_bytes]
> +        The number of bytes(excluding the virtio net header) of the rx gso packets.
> +
> +    \item [rx_gso_packets_coalesced]
> +        The number of the rx gso packages generated by coalescing.

For "packages", did you mean "packets" actually? Does it work only if
RSC is negotiated?

> +
> +    \item [rx_gso_bytes_coalesced]
> +        The number of bytes(excluding the virtio net header) of the rx gso packets generated by coalescing.
> +
> +    \item [rx_gso_segments]
> +        The number of coalesced segments.

If we do VM2VM traffic, we can receive GSO packets directly without
coalescing. Do we count this packet here?

> +
> +    \item [rx_gso_segments_bytes]
> +        The number of bytes of coalesced segments.
> +
> +\end{description}
> +
> +\subparagraph{Transmitq GSO Stats}\label{sec:Device Types / Network Device / Device Operation / Control Virtqueue / Device Stats / Transmitq GSO Stats}
> +
> +The structure corresponding to the transmitq gso stats is virtio_net_stats_tx_gso.
> +
> +If one of the VIRTIO_NET_F_HOST_TSO4, TSO6, USO or UFO options have
> +been negotiated, the transmitq gso stats can be obtained.
> +
> +Tx gso packets refer to packets passed by the driver to the device where
> +\field{gso_type} is not VIRTIO_NET_HDR_GSO_NONE.
> +
> +\begin{description}
> +    \item [tx_gso_packets]
> +        The number of the tx gso packets.
> +
> +    \item [tx_gso_bytes]
> +        The number of bytes(excluding the virtio net header) of the tx gso packets.
> +
> +    \item [tx_gso_packets_split]
> +        The number of the tx gso packets that been split.
> +
> +    \item [tx_gso_bytes_split]
> +        The number of bytes(excluding the virtio net header) of the tx gso packets that been split.

s/packet/bytes/? And does it include the L2/L3/L4 headers of the
segmented packets?

> +
> +    \item [tx_gso_segments]
> +        The number of segments split from the gso package.

For "package", I think you meant "packet" actually.

> +
> +    \item [tx_gso_segments_bytes]
> +        The number of bytes(excluding the virtio net header) of segments split from the gso package.
> +\end{description}
> +
> +\subparagraph{Receiveq Reset Stats}\label{sec:Device Types / Network Device / Device Operation / Control Virtqueue / Device Stats / Receiveq Reset Stats}
> +
> +The structure corresponding to the receiveq reset stats is virtio_net_stats_rx_reset.
> +
> +Only when VIRTIO_F_RING_RESET is successfully negotiated, the receiveq reset stats
> +can be obtained.

Not a native speaker, but I think we can remove "successfully" here.

> +
> +See \ref{sec:Basic Facilities of a Virtio Device / Virtqueues / Virtqueue Reset}
> +for more about \field{rx_reset}.
> +
> +\begin{description}
> +    \item [rx_reset]
> +        The number of receiveq resets.
> +\end{description}
> +
> +\subparagraph{Transmitq Reset Stats}\label{sec:Device Types / Network Device / Device Operation / Control Virtqueue / Device Stats / Transmitq Reset Stats}
> +
> +The structure corresponding to the transmitq reset stats is virtio_net_stats_tx_reset.
> +
> +Only when VIRTIO_F_RING_RESET is successfully negotiated, the transmitq reset stats
> +can be obtained.
> +
> +See \ref{sec:Basic Facilities of a Virtio Device / Virtqueues / Virtqueue Reset}
> +for more about \field{tx_reset}.
> +
> +\begin{description}
> +    \item [tx_reset]
> +        The number of transmitq resets.
> +\end{description}
> +
> +\devicenormative{\subparagraph}{Device Stats}{Device Types / Network Device / Device Operation / Control Virtqueue / Device Stats}
> +
> +If virtio_net_ctrl_queue_stats is incorrect (such as the following), the device
> +MUST set \field{ack} to VIRTIO_NET_ERR. Even if there is only one error,
> +the device MUST abort the entire command.

I guess "fail" is better than "abort".

> +\begin{itemize}
> +    \item \field{queue_num} exceeds the queue range.
> +    \item \field{type} is not a known value.
> +    \item The type of vq does not match \field{type}. E.g. the driver tries to query
> +        RX stats through a TX index.
> +    \item The feature corresponding to the specified \field{type} was not successfully
> +        negotiated.
> +    \item The size of the buffer allocated by the driver for \field{command-specific-data-reply}
> +        is less than the total size of the stats specialed by
> +        \field{virtio_net_ctrl_queue_stats}.
> +\end{itemize}
> +
> +The device MUST write the requested stats structures in
> +\field{command-specific-data-reply} in the order specified by the structure
> +virtio_net_ctrl_queue_stats.
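
For what it's worth, the validation list and the ordering rule above could be sketched as below. This is only a rough illustration and not text from the patch. The helper names are hypothetical. The per-type sizes are just the count of le64 fields in each proposed struct. I am also assuming the usual virtio-net queue numbering (receiveq at even indexes, transmitq at odd indexes, controlq at 2 * max_virtqueue_pairs). The feature-negotiation check is left out for brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stats type values as proposed in this patch. */
#define VIRTIO_NET_STATS_TYPE_CVQ       0
#define VIRTIO_NET_STATS_TYPE_RX_BASIC  1
#define VIRTIO_NET_STATS_TYPE_RX_CSUM   2
#define VIRTIO_NET_STATS_TYPE_RX_GSO    3
#define VIRTIO_NET_STATS_TYPE_RX_RESET  4
#define VIRTIO_NET_STATS_TYPE_TX_BASIC  5
#define VIRTIO_NET_STATS_TYPE_TX_CSUM   6
#define VIRTIO_NET_STATS_TYPE_TX_GSO    7
#define VIRTIO_NET_STATS_TYPE_TX_RESET  8

/* One (queue_num, type) pair of the request; hypothetical name. */
struct stats_entry {
    uint16_t queue_num;
    uint16_t type;
};

/* Reply bytes for one type: number of le64 fields in its struct, times 8.
 * Returns 0 for an unknown type. */
static size_t stats_type_size(uint16_t type)
{
    switch (type) {
    case VIRTIO_NET_STATS_TYPE_CVQ:      return 2 * 8;
    case VIRTIO_NET_STATS_TYPE_RX_BASIC: return 6 * 8;
    case VIRTIO_NET_STATS_TYPE_RX_CSUM:  return 4 * 8;
    case VIRTIO_NET_STATS_TYPE_RX_GSO:   return 6 * 8;
    case VIRTIO_NET_STATS_TYPE_RX_RESET: return 1 * 8;
    case VIRTIO_NET_STATS_TYPE_TX_BASIC: return 6 * 8;
    case VIRTIO_NET_STATS_TYPE_TX_CSUM:  return 2 * 8;
    case VIRTIO_NET_STATS_TYPE_TX_GSO:   return 6 * 8;
    case VIRTIO_NET_STATS_TYPE_TX_RESET: return 1 * 8;
    default:                             return 0;
    }
}

/* Assumed numbering: rx queues even, tx queues odd, controlq last. */
static int stats_type_matches_vq(uint16_t type, uint16_t queue_num,
                                 unsigned int max_pairs)
{
    unsigned int cvq = 2 * max_pairs;

    if (type == VIRTIO_NET_STATS_TYPE_CVQ)
        return queue_num == cvq;
    if (queue_num >= cvq)
        return 0;                         /* exceeds the queue range */
    if (type >= VIRTIO_NET_STATS_TYPE_RX_BASIC &&
        type <= VIRTIO_NET_STATS_TYPE_RX_RESET)
        return (queue_num % 2) == 0;
    if (type >= VIRTIO_NET_STATS_TYPE_TX_BASIC &&
        type <= VIRTIO_NET_STATS_TYPE_TX_RESET)
        return (queue_num % 2) == 1;
    return 0;                             /* unknown type */
}

/* Total reply bytes for the request, or -1 if any single entry is
 * invalid, in which case the whole command fails. */
static long stats_reply_size(const struct stats_entry *e, size_t n,
                             unsigned int max_pairs)
{
    long total = 0;

    assert(e != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (!stats_type_matches_vq(e[i].type, e[i].queue_num, max_pairs))
            return -1;
        total += (long)stats_type_size(e[i].type);
    }
    return total;
}
```

A driver could use the same total to size the \field{command-specific-data-reply} buffer, and the device could compare it with the buffer length it was given before writing anything.
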

Are the counters reset during device reset?

Thanks

> +
> +\drivernormative{\subparagraph}{Device Stats}{Device Types / Network Device / Device Operation / Control Virtqueue / Device Stats}
> +
> +When a driver tries to obtain a certain stats, it MUST confirm that the relevant
> +feature negotiation is successful.
> +
> +\field{type} in struct virtio_net_ctrl_queue_stats MUST correspond to the vq
> +specified by \field{queue_num}.
> +
> +The \field{command-specific-data-reply} buffer allocated by the driver MUST be
> +able to hold all the stats specified by virtio_net_ctrl_queue_stats.
> +
> +When the driver reads the response, it MUST read
> +\field{command-specific-data-reply} one by one based on the \field{type}.
>
>  \subsubsection{Legacy Interface: Framing Requirements}\label{sec:Device Types / Network Device / Legacy Interface: Framing Requirements}