* [PATCH v3 05/20] virtio-entropy: Maintain entropy device spec in separate directory
From: Parav Pandit @ 2023-01-10 23:03 UTC
To: mst, virtio-dev, cohuck; +Cc: virtio-comment, Parav Pandit
In-Reply-To: <20230110230358.528098-1-parav@nvidia.com>
Move the virtio entropy device specification into its own file, as is
already done for recently added virtio devices.
While at it, place the device specification and its driver and device
conformance sections in a dedicated directory so that the device
specification is self-contained.
Fixes: https://github.com/oasis-tcs/virtio-spec/issues/153
Signed-off-by: Parav Pandit <parav@nvidia.com>
---
changelog:
v2->v3:
- file name changed from device.tex to description.tex
- use input instead of import to insert a file
v0->v1:
- moved to device specific directory
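note (illustration only, not part of the patch):
The driver requirement that moves unchanged into description.tex, that the
driver MUST examine the length written by the device, might look like the
C sketch below. entropy_consume() and its parameters are hypothetical names
chosen for this note; they are not defined by the specification or taken
from any existing driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: copy only the random bytes the device actually
 * wrote. buf/buf_len describe the device-writable buffer the driver
 * queued; used_len is the length the device reported for that buffer.
 * Returns the number of bytes copied into out.
 */
static size_t entropy_consume(const uint8_t *buf, size_t buf_len,
                              uint32_t used_len,
                              uint8_t *out, size_t out_len)
{
    size_t n = used_len;

    assert(buf != NULL && out != NULL);

    /* The device MAY use less than the whole buffer, so the reported
     * length, not buf_len, bounds the valid data. */
    if (n > buf_len)
        n = buf_len;    /* defensive clamp against a bogus length */
    if (n > out_len)
        n = out_len;    /* never overrun the caller's buffer */

    memcpy(out, buf, n);
    return n;
}
```

A caller that needs more bytes than were returned would queue another
device-writable buffer rather than assume the first one was filled.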
---
conformance.tex | 18 +-------
content.tex | 44 +------------------
device-types/virtio-entropy/description.tex | 42 ++++++++++++++++++
.../virtio-entropy/device-conformance.tex | 7 +++
.../virtio-entropy/driver-conformance.tex | 7 +++
5 files changed, 59 insertions(+), 59 deletions(-)
create mode 100644 device-types/virtio-entropy/description.tex
create mode 100644 device-types/virtio-entropy/device-conformance.tex
create mode 100644 device-types/virtio-entropy/driver-conformance.tex
diff --git a/conformance.tex b/conformance.tex
index ddc2a8e..758bec7 100644
--- a/conformance.tex
+++ b/conformance.tex
@@ -137,14 +137,7 @@ \section{Conformance Targets}\label{sec:Conformance / Conformance Targets}
\input{device-types/virtio-network/driver-conformance.tex}
\input{device-types/virtio-block/driver-conformance.tex}
\input{device-types/virtio-console/driver-conformance.tex}
-
-\conformance{\subsection}{Entropy Driver Conformance}\label{sec:Conformance / Driver Conformance / Entropy Driver Conformance}
-
-An entropy driver MUST conform to the following normative statements:
-
-\begin{itemize}
-\item \ref{drivernormative:Device Types / Entropy Device / Device Operation}
-\end{itemize}
+\input{device-types/virtio-entropy/driver-conformance.tex}
\conformance{\subsection}{Traditional Memory Balloon Driver Conformance}\label{sec:Conformance / Driver Conformance / Traditional Memory Balloon Driver Conformance}
@@ -372,14 +365,7 @@ \section{Conformance Targets}\label{sec:Conformance / Conformance Targets}
\input{device-types/virtio-network/device-conformance.tex}
\input{device-types/virtio-block/device-conformance.tex}
\input{device-types/virtio-console/device-conformance.tex}
-
-\conformance{\subsection}{Entropy Device Conformance}\label{sec:Conformance / Device Conformance / Entropy Device Conformance}
-
-An entropy device MUST conform to the following normative statements:
-
-\begin{itemize}
-\item \ref{devicenormative:Device Types / Entropy Device / Device Operation}
-\end{itemize}
+\input{device-types/virtio-entropy/device-conformance.tex}
\conformance{\subsection}{Traditional Memory Balloon Device Conformance}\label{sec:Conformance / Device Conformance / Traditional Memory Balloon Device Conformance}
diff --git a/content.tex b/content.tex
index 03b64a5..4928a13 100644
--- a/content.tex
+++ b/content.tex
@@ -3006,49 +3006,7 @@ \chapter{Device Types}\label{sec:Device Types}
\input{device-types/virtio-network/description.tex}
\input{device-types/virtio-block/description.tex}
\input{device-types/virtio-console/description.tex}
-
-\section{Entropy Device}\label{sec:Device Types / Entropy Device}
-
-The virtio entropy device supplies high-quality randomness for
-guest use.
-
-\subsection{Device ID}\label{sec:Device Types / Entropy Device / Device ID}
- 4
-
-\subsection{Virtqueues}\label{sec:Device Types / Entropy Device / Virtqueues}
-\begin{description}
-\item[0] requestq
-\end{description}
-
-\subsection{Feature bits}\label{sec:Device Types / Entropy Device / Feature bits}
- None currently defined
-
-\subsection{Device configuration layout}\label{sec:Device Types / Entropy Device / Device configuration layout}
- None currently defined.
-
-\subsection{Device Initialization}\label{sec:Device Types / Entropy Device / Device Initialization}
-
-\begin{enumerate}
-\item The virtqueue is initialized
-\end{enumerate}
-
-\subsection{Device Operation}\label{sec:Device Types / Entropy Device / Device Operation}
-
-When the driver requires random bytes, it places the descriptor
-of one or more buffers in the queue. It will be completely filled
-by random data by the device.
-
-\drivernormative{\subsubsection}{Device Operation}{Device Types / Entropy Device / Device Operation}
-
-The driver MUST NOT place device-readable buffers into the queue.
-
-The driver MUST examine the length written by the device to determine
-how many random bytes were received.
-
-\devicenormative{\subsubsection}{Device Operation}{Device Types / Entropy Device / Device Operation}
-
-The device MUST place one or more random bytes into the buffer, but it
-MAY use less than the entire buffer length.
+\input{device-types/virtio-entropy/description.tex}
\section{Traditional Memory Balloon Device}\label{sec:Device Types / Memory Balloon Device}
diff --git a/device-types/virtio-entropy/description.tex b/device-types/virtio-entropy/description.tex
new file mode 100644
index 0000000..c26f589
--- /dev/null
+++ b/device-types/virtio-entropy/description.tex
@@ -0,0 +1,42 @@
+\section{Entropy Device}\label{sec:Device Types / Entropy Device}
+
+The virtio entropy device supplies high-quality randomness for
+guest use.
+
+\subsection{Device ID}\label{sec:Device Types / Entropy Device / Device ID}
+ 4
+
+\subsection{Virtqueues}\label{sec:Device Types / Entropy Device / Virtqueues}
+\begin{description}
+\item[0] requestq
+\end{description}
+
+\subsection{Feature bits}\label{sec:Device Types / Entropy Device / Feature bits}
+ None currently defined
+
+\subsection{Device configuration layout}\label{sec:Device Types / Entropy Device / Device configuration layout}
+ None currently defined.
+
+\subsection{Device Initialization}\label{sec:Device Types / Entropy Device / Device Initialization}
+
+\begin{enumerate}
+\item The virtqueue is initialized
+\end{enumerate}
+
+\subsection{Device Operation}\label{sec:Device Types / Entropy Device / Device Operation}
+
+When the driver requires random bytes, it places the descriptor
+of one or more buffers in the queue. It will be completely filled
+by random data by the device.
+
+\drivernormative{\subsubsection}{Device Operation}{Device Types / Entropy Device / Device Operation}
+
+The driver MUST NOT place device-readable buffers into the queue.
+
+The driver MUST examine the length written by the device to determine
+how many random bytes were received.
+
+\devicenormative{\subsubsection}{Device Operation}{Device Types / Entropy Device / Device Operation}
+
+The device MUST place one or more random bytes into the buffer, but it
+MAY use less than the entire buffer length.
diff --git a/device-types/virtio-entropy/device-conformance.tex b/device-types/virtio-entropy/device-conformance.tex
new file mode 100644
index 0000000..2789fda
--- /dev/null
+++ b/device-types/virtio-entropy/device-conformance.tex
@@ -0,0 +1,7 @@
+\conformance{\subsection}{Entropy Device Conformance}\label{sec:Conformance / Device Conformance / Entropy Device Conformance}
+
+An entropy device MUST conform to the following normative statements:
+
+\begin{itemize}
+\item \ref{devicenormative:Device Types / Entropy Device / Device Operation}
+\end{itemize}
diff --git a/device-types/virtio-entropy/driver-conformance.tex b/device-types/virtio-entropy/driver-conformance.tex
new file mode 100644
index 0000000..175c453
--- /dev/null
+++ b/device-types/virtio-entropy/driver-conformance.tex
@@ -0,0 +1,7 @@
+\conformance{\subsection}{Entropy Driver Conformance}\label{sec:Conformance / Driver Conformance / Entropy Driver Conformance}
+
+An entropy driver MUST conform to the following normative statements:
+
+\begin{itemize}
+\item \ref{drivernormative:Device Types / Entropy Device / Device Operation}
+\end{itemize}
--
2.26.2
* [PATCH v3 04/20] virtio-console: Maintain console device spec in separate directory
From: Parav Pandit @ 2023-01-10 23:03 UTC
To: mst, virtio-dev, cohuck; +Cc: virtio-comment, Parav Pandit
In-Reply-To: <20230110230358.528098-1-parav@nvidia.com>
Move the virtio console device specification into its own file, as is
already done for recently added virtio devices.
While at it, place the device specification and its driver and device
conformance sections in a dedicated directory so that the device
specification is self-contained.
Fixes: https://github.com/oasis-tcs/virtio-spec/issues/153
Signed-off-by: Parav Pandit <parav@nvidia.com>
---
changelog:
v2->v3:
- file name changed from device.tex to description.tex
- use input instead of import to insert a file
v0->v1:
- moved to device specific directory
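note (illustration only, not part of the patch):
Two details of the moved console text can be sketched in C: the virtqueue
numbering from the Virtqueues list, and the little-endian layout of
struct virtio_console_control. All function names below are hypothetical.
The formula for ports beyond port 1 assumes the pattern the list implies
(the list itself ends with an ellipsis), and callers are assumed to pass a
port number below max_nr_ports.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Receive queue index for a port: port 0 uses queue 0, queues 2 and 3
 * are the control queues, and port N (N >= 1) is assumed to use
 * queue 2N + 2. */
static uint32_t console_port_rxq(uint32_t port)
{
    return port == 0 ? 0 : 2 * port + 2;
}

/* The transmit queue directly follows the port's receive queue. */
static uint32_t console_port_txq(uint32_t port)
{
    return console_port_rxq(port) + 1;
}

/* Serialize a control message in little-endian byte order:
 * le32 id, le16 event, le16 value (8 bytes in total). */
static void console_ctrl_encode(uint8_t out[8], uint32_t id,
                                uint16_t event, uint16_t value)
{
    assert(out != NULL);

    out[0] = (uint8_t)(id & 0xff);
    out[1] = (uint8_t)((id >> 8) & 0xff);
    out[2] = (uint8_t)((id >> 16) & 0xff);
    out[3] = (uint8_t)((id >> 24) & 0xff);
    out[4] = (uint8_t)(event & 0xff);
    out[5] = (uint8_t)((event >> 8) & 0xff);
    out[6] = (uint8_t)(value & 0xff);
    out[7] = (uint8_t)((value >> 8) & 0xff);
}
```

For example, a VIRTIO_CONSOLE_PORT_OPEN (6) message for port 1 with value 1
would be built with console_ctrl_encode(msg, 1, 6, 1). Writing the bytes
explicitly keeps the encoding independent of the host's endianness; the
legacy interface, which uses guest-native endianness, is not covered here.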
---
conformance.tex | 20 +-
content.tex | 233 +-----------------
device-types/virtio-console/description.tex | 231 +++++++++++++++++
.../virtio-console/device-conformance.tex | 8 +
.../virtio-console/driver-conformance.tex | 8 +
5 files changed, 250 insertions(+), 250 deletions(-)
create mode 100644 device-types/virtio-console/description.tex
create mode 100644 device-types/virtio-console/device-conformance.tex
create mode 100644 device-types/virtio-console/driver-conformance.tex
diff --git a/conformance.tex b/conformance.tex
index 096ca3d..ddc2a8e 100644
--- a/conformance.tex
+++ b/conformance.tex
@@ -136,15 +136,7 @@ \section{Conformance Targets}\label{sec:Conformance / Conformance Targets}
\input{device-types/virtio-network/driver-conformance.tex}
\input{device-types/virtio-block/driver-conformance.tex}
-
-\conformance{\subsection}{Console Driver Conformance}\label{sec:Conformance / Driver Conformance / Console Driver Conformance}
-
-A console driver MUST conform to the following normative statements:
-
-\begin{itemize}
-\item \ref{drivernormative:Device Types / Console Device / Device Operation}
-\item \ref{drivernormative:Device Types / Console Device / Device Operation / Multiport Device Operation}
-\end{itemize}
+\input{device-types/virtio-console/driver-conformance.tex}
\conformance{\subsection}{Entropy Driver Conformance}\label{sec:Conformance / Driver Conformance / Entropy Driver Conformance}
@@ -379,15 +371,7 @@ \section{Conformance Targets}\label{sec:Conformance / Conformance Targets}
\input{device-types/virtio-network/device-conformance.tex}
\input{device-types/virtio-block/device-conformance.tex}
-
-\conformance{\subsection}{Console Device Conformance}\label{sec:Conformance / Device Conformance / Console Device Conformance}
-
-A console device MUST conform to the following normative statements:
-
-\begin{itemize}
-\item \ref{devicenormative:Device Types / Console Device / Device Initialization}
-\item \ref{devicenormative:Device Types / Console Device / Device Operation / Multiport Device Operation}
-\end{itemize}
+\input{device-types/virtio-console/device-conformance.tex}
\conformance{\subsection}{Entropy Device Conformance}\label{sec:Conformance / Device Conformance / Entropy Device Conformance}
diff --git a/content.tex b/content.tex
index 8c091a9..03b64a5 100644
--- a/content.tex
+++ b/content.tex
@@ -3005,238 +3005,7 @@ \chapter{Device Types}\label{sec:Device Types}
\input{device-types/virtio-network/description.tex}
\input{device-types/virtio-block/description.tex}
-
-\section{Console Device}\label{sec:Device Types / Console Device}
-
-The virtio console device is a simple device for data input and
-output. A device MAY have one or more ports. Each port has a pair
-of input and output virtqueues. Moreover, a device has a pair of
-control IO virtqueues. The control virtqueues are used to
-communicate information between the device and the driver about
-ports being opened and closed on either side of the connection,
-indication from the device about whether a particular port is a
-console port, adding new ports, port hot-plug/unplug, etc., and
-indication from the driver about whether a port or a device was
-successfully added, port open/close, etc. For data IO, one or
-more empty buffers are placed in the receive queue for incoming
-data and outgoing characters are placed in the transmit queue.
-
-\subsection{Device ID}\label{sec:Device Types / Console Device / Device ID}
-
- 3
-
-\subsection{Virtqueues}\label{sec:Device Types / Console Device / Virtqueues}
-
-\begin{description}
-\item[0] receiveq(port0)
-\item[1] transmitq(port0)
-\item[2] control receiveq
-\item[3] control transmitq
-\item[4] receiveq(port1)
-\item[5] transmitq(port1)
-\item[\ldots]
-\end{description}
-
-The port 0 receive and transmit queues always exist: other queues
-only exist if VIRTIO_CONSOLE_F_MULTIPORT is set.
-
-\subsection{Feature bits}\label{sec:Device Types / Console Device / Feature bits}
-
-\begin{description}
-\item[VIRTIO_CONSOLE_F_SIZE (0)] Configuration \field{cols} and \field{rows}
- are valid.
-
-\item[VIRTIO_CONSOLE_F_MULTIPORT (1)] Device has support for multiple
- ports; \field{max_nr_ports} is valid and control virtqueues will be used.
-
-\item[VIRTIO_CONSOLE_F_EMERG_WRITE (2)] Device has support for emergency write.
- Configuration field emerg_wr is valid.
-\end{description}
-
-\subsection{Device configuration layout}\label{sec:Device Types / Console Device / Device configuration layout}
-
- The size of the console is supplied
- in the configuration space if the VIRTIO_CONSOLE_F_SIZE feature
- is set. Furthermore, if the VIRTIO_CONSOLE_F_MULTIPORT feature
- is set, the maximum number of ports supported by the device can
- be fetched.
-
- If VIRTIO_CONSOLE_F_EMERG_WRITE is set then the driver can use emergency write
- to output a single character without initializing virtio queues, or even
- acknowledging the feature.
-
-\begin{lstlisting}
-struct virtio_console_config {
- le16 cols;
- le16 rows;
- le32 max_nr_ports;
- le32 emerg_wr;
-};
-\end{lstlisting}
-
-\subsubsection{Legacy Interface: Device configuration layout}\label{sec:Device Types / Console Device / Device configuration layout / Legacy Interface: Device configuration layout}
-When using the legacy interface, transitional devices and drivers
-MUST format the fields in struct virtio_console_config
-according to the native endian of the guest rather than
-(necessarily when not using the legacy interface) little-endian.
-
-\subsection{Device Initialization}\label{sec:Device Types / Console Device / Device Initialization}
-
-\begin{enumerate}
-\item If the VIRTIO_CONSOLE_F_EMERG_WRITE feature is offered,
- \field{emerg_wr} field of the configuration can be written at any time.
- Thus it works for very early boot debugging output as well as
- catastophic OS failures (eg. virtio ring corruption).
-
-\item If the VIRTIO_CONSOLE_F_SIZE feature is negotiated, the driver
- can read the console dimensions from \field{cols} and \field{rows}.
-
-\item If the VIRTIO_CONSOLE_F_MULTIPORT feature is negotiated, the
- driver can spawn multiple ports, not all of which are necessarily
- attached to a console. Some could be generic ports. In this
- case, the control virtqueues are enabled and according to
- \field{max_nr_ports}, the appropriate number
- of virtqueues are created. A control message indicating the
- driver is ready is sent to the device. The device can then send
- control messages for adding new ports to the device. After
- creating and initializing each port, a
- VIRTIO_CONSOLE_PORT_READY control message is sent to the device
- for that port so the device can let the driver know of any additional
- configuration options set for that port.
-
-\item The receiveq for each port is populated with one or more
- receive buffers.
-\end{enumerate}
-
-\devicenormative{\subsubsection}{Device Initialization}{Device Types / Console Device / Device Initialization}
-
-The device MUST allow a write to \field{emerg_wr}, even on an
-unconfigured device.
-
-The device SHOULD transmit the lower byte written to \field{emerg_wr} to
-an appropriate log or output method.
-
-\subsection{Device Operation}\label{sec:Device Types / Console Device / Device Operation}
-
-\begin{enumerate}
-\item For output, a buffer containing the characters is placed in
- the port's transmitq\footnote{Because this is high importance and low bandwidth, the current
-Linux implementation polls for the buffer to become used, rather than
-waiting for a used buffer notification, simplifying the implementation
-significantly. However, for generic serial ports with the
-O_NONBLOCK flag set, the polling limitation is relaxed and the
-consumed buffers are freed upon the next write or poll call or
-when a port is closed or hot-unplugged.
-}.
-
-\item When a buffer is used in the receiveq (signalled by a
- used buffer notification), the contents is the input to the port associated
- with the virtqueue for which the notification was received.
-
-\item If the driver negotiated the VIRTIO_CONSOLE_F_SIZE feature, a
- configuration change notification indicates that the updated size can
- be read from the configuration fields. This size applies to port 0 only.
-
-\item If the driver negotiated the VIRTIO_CONSOLE_F_MULTIPORT
- feature, active ports are announced by the device using the
- VIRTIO_CONSOLE_PORT_ADD control message. The same message is
- used for port hot-plug as well.
-\end{enumerate}
-
-\drivernormative{\subsubsection}{Device Operation}{Device Types / Console Device / Device Operation}
-
-The driver MUST NOT put a device-readable buffer in a receiveq. The driver
-MUST NOT put a device-writable buffer in a transmitq.
-
-\subsubsection{Multiport Device Operation}\label{sec:Device Types / Console Device / Device Operation / Multiport Device Operation}
-
-If the driver negotiated the VIRTIO_CONSOLE_F_MULTIPORT, the two
-control queues are used to manipulate the different console ports: the
-control receiveq for messages from the device to the driver, and the
-control sendq for driver-to-device messages. The layout of the
-control messages is:
-
-\begin{lstlisting}
-struct virtio_console_control {
- le32 id; /* Port number */
- le16 event; /* The kind of control event */
- le16 value; /* Extra information for the event */
-};
-\end{lstlisting}
-
-The values for \field{event} are:
-\begin{description}
-\item [VIRTIO_CONSOLE_DEVICE_READY (0)] Sent by the driver at initialization
- to indicate that it is ready to receive control messages. A value of
- 1 indicates success, and 0 indicates failure. The port number \field{id} is unused.
-\item [VIRTIO_CONSOLE_DEVICE_ADD (1)] Sent by the device, to create a new
- port. \field{value} is unused.
-\item [VIRTIO_CONSOLE_DEVICE_REMOVE (2)] Sent by the device, to remove an
- existing port. \field{value} is unused.
-\item [VIRTIO_CONSOLE_PORT_READY (3)] Sent by the driver in response
- to the device's VIRTIO_CONSOLE_PORT_ADD message, to indicate that
- the port is ready to be used. A \field{value} of 1 indicates success, and 0
- indicates failure.
-\item [VIRTIO_CONSOLE_CONSOLE_PORT (4)] Sent by the device to nominate
- a port as a console port. There MAY be more than one console port.
-\item [VIRTIO_CONSOLE_RESIZE (5)] Sent by the device to indicate
- a console size change. \field{value} is unused. The buffer is followed by the number of columns and rows:
-\begin{lstlisting}
-struct virtio_console_resize {
- le16 cols;
- le16 rows;
-};
-\end{lstlisting}
-\item [VIRTIO_CONSOLE_PORT_OPEN (6)] This message is sent by both the
- device and the driver. \field{value} indicates the state: 0 (port
- closed) or 1 (port open). This allows for ports to be used directly
- by guest and host processes to communicate in an application-defined
- manner.
-\item [VIRTIO_CONSOLE_PORT_NAME (7)] Sent by the device to give a tag
- to the port. This control command is immediately
- followed by the UTF-8 name of the port for identification
- within the guest (without a NUL terminator).
-\end{description}
-
-\devicenormative{\paragraph}{Multiport Device Operation}{Device Types / Console Device / Device Operation / Multiport Device Operation}
-
-The device MUST NOT specify a port which exists in a
-VIRTIO_CONSOLE_DEVICE_ADD message, nor a port which is equal or
-greater than \field{max_nr_ports}.
-
-The device MUST NOT specify a port in VIRTIO_CONSOLE_DEVICE_REMOVE
-which has not been created with a previous VIRTIO_CONSOLE_DEVICE_ADD.
-
-\drivernormative{\paragraph}{Multiport Device Operation}{Device Types / Console Device / Device Operation / Multiport Device Operation}
-
-The driver MUST send a VIRTIO_CONSOLE_DEVICE_READY message if
-VIRTIO_CONSOLE_F_MULTIPORT is negotiated.
-
-Upon receipt of a VIRTIO_CONSOLE_CONSOLE_PORT message, the driver
-SHOULD treat the port in a manner suitable for text console access
-and MUST respond with a VIRTIO_CONSOLE_PORT_OPEN message, which MUST
-have \field{value} set to 1.
-
-\subsubsection{Legacy Interface: Device Operation}\label{sec:Device Types / Console Device / Device Operation / Legacy Interface: Device Operation}
-When using the legacy interface, transitional devices and drivers
-MUST format the fields in struct virtio_console_control
-according to the native endian of the guest rather than
-(necessarily when not using the legacy interface) little-endian.
-
-When using the legacy interface, the driver SHOULD ignore the
-used length values for the transmit queues
-and the control transmitq.
-\begin{note}
-Historically, some devices put the total descriptor length there,
-even though no data was actually written.
-\end{note}
-
-\subsubsection{Legacy Interface: Framing Requirements}\label{sec:Device
-Types / Console Device / Legacy Interface: Framing Requirements}
-
-When using legacy interfaces, transitional drivers which have not
-negotiated VIRTIO_F_ANY_LAYOUT MUST use only a single
-descriptor for all buffers in the control receiveq and control transmitq.
+\input{device-types/virtio-console/description.tex}
\section{Entropy Device}\label{sec:Device Types / Entropy Device}
diff --git a/device-types/virtio-console/description.tex b/device-types/virtio-console/description.tex
new file mode 100644
index 0000000..40a2ba4
--- /dev/null
+++ b/device-types/virtio-console/description.tex
@@ -0,0 +1,231 @@
+\section{Console Device}\label{sec:Device Types / Console Device}
+
+The virtio console device is a simple device for data input and
+output. A device MAY have one or more ports. Each port has a pair
+of input and output virtqueues. Moreover, a device has a pair of
+control IO virtqueues. The control virtqueues are used to
+communicate information between the device and the driver about
+ports being opened and closed on either side of the connection,
+indication from the device about whether a particular port is a
+console port, adding new ports, port hot-plug/unplug, etc., and
+indication from the driver about whether a port or a device was
+successfully added, port open/close, etc. For data IO, one or
+more empty buffers are placed in the receive queue for incoming
+data and outgoing characters are placed in the transmit queue.
+
+\subsection{Device ID}\label{sec:Device Types / Console Device / Device ID}
+
+ 3
+
+\subsection{Virtqueues}\label{sec:Device Types / Console Device / Virtqueues}
+
+\begin{description}
+\item[0] receiveq(port0)
+\item[1] transmitq(port0)
+\item[2] control receiveq
+\item[3] control transmitq
+\item[4] receiveq(port1)
+\item[5] transmitq(port1)
+\item[\ldots]
+\end{description}
+
+The port 0 receive and transmit queues always exist: other queues
+only exist if VIRTIO_CONSOLE_F_MULTIPORT is set.
+
+\subsection{Feature bits}\label{sec:Device Types / Console Device / Feature bits}
+
+\begin{description}
+\item[VIRTIO_CONSOLE_F_SIZE (0)] Configuration \field{cols} and \field{rows}
+ are valid.
+
+\item[VIRTIO_CONSOLE_F_MULTIPORT (1)] Device has support for multiple
+ ports; \field{max_nr_ports} is valid and control virtqueues will be used.
+
+\item[VIRTIO_CONSOLE_F_EMERG_WRITE (2)] Device has support for emergency write.
+ Configuration field emerg_wr is valid.
+\end{description}
+
+\subsection{Device configuration layout}\label{sec:Device Types / Console Device / Device configuration layout}
+
+ The size of the console is supplied
+ in the configuration space if the VIRTIO_CONSOLE_F_SIZE feature
+ is set. Furthermore, if the VIRTIO_CONSOLE_F_MULTIPORT feature
+ is set, the maximum number of ports supported by the device can
+ be fetched.
+
+ If VIRTIO_CONSOLE_F_EMERG_WRITE is set then the driver can use emergency write
+ to output a single character without initializing virtio queues, or even
+ acknowledging the feature.
+
+\begin{lstlisting}
+struct virtio_console_config {
+ le16 cols;
+ le16 rows;
+ le32 max_nr_ports;
+ le32 emerg_wr;
+};
+\end{lstlisting}
+
+\subsubsection{Legacy Interface: Device configuration layout}\label{sec:Device Types / Console Device / Device configuration layout / Legacy Interface: Device configuration layout}
+When using the legacy interface, transitional devices and drivers
+MUST format the fields in struct virtio_console_config
+according to the native endian of the guest rather than
+(necessarily when not using the legacy interface) little-endian.
+
+\subsection{Device Initialization}\label{sec:Device Types / Console Device / Device Initialization}
+
+\begin{enumerate}
+\item If the VIRTIO_CONSOLE_F_EMERG_WRITE feature is offered,
+ \field{emerg_wr} field of the configuration can be written at any time.
+ Thus it works for very early boot debugging output as well as
+ catastrophic OS failures (eg. virtio ring corruption).
+
+\item If the VIRTIO_CONSOLE_F_SIZE feature is negotiated, the driver
+ can read the console dimensions from \field{cols} and \field{rows}.
+
+\item If the VIRTIO_CONSOLE_F_MULTIPORT feature is negotiated, the
+ driver can spawn multiple ports, not all of which are necessarily
+ attached to a console. Some could be generic ports. In this
+ case, the control virtqueues are enabled and according to
+ \field{max_nr_ports}, the appropriate number
+ of virtqueues are created. A control message indicating the
+ driver is ready is sent to the device. The device can then send
+ control messages for adding new ports to the device. After
+ creating and initializing each port, a
+ VIRTIO_CONSOLE_PORT_READY control message is sent to the device
+ for that port so the device can let the driver know of any additional
+ configuration options set for that port.
+
+\item The receiveq for each port is populated with one or more
+ receive buffers.
+\end{enumerate}
+
+\devicenormative{\subsubsection}{Device Initialization}{Device Types / Console Device / Device Initialization}
+
+The device MUST allow a write to \field{emerg_wr}, even on an
+unconfigured device.
+
+The device SHOULD transmit the lower byte written to \field{emerg_wr} to
+an appropriate log or output method.
+
+\subsection{Device Operation}\label{sec:Device Types / Console Device / Device Operation}
+
+\begin{enumerate}
+\item For output, a buffer containing the characters is placed in
+ the port's transmitq\footnote{Because this is high importance and low bandwidth, the current
+Linux implementation polls for the buffer to become used, rather than
+waiting for a used buffer notification, simplifying the implementation
+significantly. However, for generic serial ports with the
+O_NONBLOCK flag set, the polling limitation is relaxed and the
+consumed buffers are freed upon the next write or poll call or
+when a port is closed or hot-unplugged.
+}.
+
+\item When a buffer is used in the receiveq (signalled by a
+ used buffer notification), the contents is the input to the port associated
+ with the virtqueue for which the notification was received.
+
+\item If the driver negotiated the VIRTIO_CONSOLE_F_SIZE feature, a
+ configuration change notification indicates that the updated size can
+ be read from the configuration fields. This size applies to port 0 only.
+
+\item If the driver negotiated the VIRTIO_CONSOLE_F_MULTIPORT
+ feature, active ports are announced by the device using the
+ VIRTIO_CONSOLE_PORT_ADD control message. The same message is
+ used for port hot-plug as well.
+\end{enumerate}
+
+\drivernormative{\subsubsection}{Device Operation}{Device Types / Console Device / Device Operation}
+
+The driver MUST NOT put a device-readable buffer in a receiveq. The driver
+MUST NOT put a device-writable buffer in a transmitq.
+
+\subsubsection{Multiport Device Operation}\label{sec:Device Types / Console Device / Device Operation / Multiport Device Operation}
+
+If the driver negotiated the VIRTIO_CONSOLE_F_MULTIPORT, the two
+control queues are used to manipulate the different console ports: the
+control receiveq for messages from the device to the driver, and the
+control sendq for driver-to-device messages. The layout of the
+control messages is:
+
+\begin{lstlisting}
+struct virtio_console_control {
+ le32 id; /* Port number */
+ le16 event; /* The kind of control event */
+ le16 value; /* Extra information for the event */
+};
+\end{lstlisting}
+
+The values for \field{event} are:
+\begin{description}
+\item [VIRTIO_CONSOLE_DEVICE_READY (0)] Sent by the driver at initialization
+ to indicate that it is ready to receive control messages. A value of
+ 1 indicates success, and 0 indicates failure. The port number \field{id} is unused.
+\item [VIRTIO_CONSOLE_DEVICE_ADD (1)] Sent by the device, to create a new
+ port. \field{value} is unused.
+\item [VIRTIO_CONSOLE_DEVICE_REMOVE (2)] Sent by the device, to remove an
+ existing port. \field{value} is unused.
+\item [VIRTIO_CONSOLE_PORT_READY (3)] Sent by the driver in response
+ to the device's VIRTIO_CONSOLE_PORT_ADD message, to indicate that
+ the port is ready to be used. A \field{value} of 1 indicates success, and 0
+ indicates failure.
+\item [VIRTIO_CONSOLE_CONSOLE_PORT (4)] Sent by the device to nominate
+ a port as a console port. There MAY be more than one console port.
+\item [VIRTIO_CONSOLE_RESIZE (5)] Sent by the device to indicate
+ a console size change. \field{value} is unused. The buffer is followed by the number of columns and rows:
+\begin{lstlisting}
+struct virtio_console_resize {
+ le16 cols;
+ le16 rows;
+};
+\end{lstlisting}
+\item [VIRTIO_CONSOLE_PORT_OPEN (6)] This message is sent by both the
+ device and the driver. \field{value} indicates the state: 0 (port
+ closed) or 1 (port open). This allows for ports to be used directly
+ by guest and host processes to communicate in an application-defined
+ manner.
+\item [VIRTIO_CONSOLE_PORT_NAME (7)] Sent by the device to give a tag
+ to the port. This control command is immediately
+ followed by the UTF-8 name of the port for identification
+ within the guest (without a NUL terminator).
+\end{description}
+
+\devicenormative{\paragraph}{Multiport Device Operation}{Device Types / Console Device / Device Operation / Multiport Device Operation}
+
+The device MUST NOT specify a port which exists in a
+VIRTIO_CONSOLE_DEVICE_ADD message, nor a port which is equal or
+greater than \field{max_nr_ports}.
+
+The device MUST NOT specify a port in VIRTIO_CONSOLE_DEVICE_REMOVE
+which has not been created with a previous VIRTIO_CONSOLE_DEVICE_ADD.
+
+\drivernormative{\paragraph}{Multiport Device Operation}{Device Types / Console Device / Device Operation / Multiport Device Operation}
+
+The driver MUST send a VIRTIO_CONSOLE_DEVICE_READY message if
+VIRTIO_CONSOLE_F_MULTIPORT is negotiated.
+
+Upon receipt of a VIRTIO_CONSOLE_CONSOLE_PORT message, the driver
+SHOULD treat the port in a manner suitable for text console access
+and MUST respond with a VIRTIO_CONSOLE_PORT_OPEN message, which MUST
+have \field{value} set to 1.
+
+\subsubsection{Legacy Interface: Device Operation}\label{sec:Device Types / Console Device / Device Operation / Legacy Interface: Device Operation}
+When using the legacy interface, transitional devices and drivers
+MUST format the fields in struct virtio_console_control
+according to the native endian of the guest rather than
+(necessarily when not using the legacy interface) little-endian.
+
+When using the legacy interface, the driver SHOULD ignore the
+used length values for the transmit queues
+and the control transmitq.
+\begin{note}
+Historically, some devices put the total descriptor length there,
+even though no data was actually written.
+\end{note}
+
+\subsubsection{Legacy Interface: Framing Requirements}\label{sec:Device
+Types / Console Device / Legacy Interface: Framing Requirements}
+
+When using legacy interfaces, transitional drivers which have not
+negotiated VIRTIO_F_ANY_LAYOUT MUST use only a single
+descriptor for all buffers in the control receiveq and control transmitq.
diff --git a/device-types/virtio-console/device-conformance.tex b/device-types/virtio-console/device-conformance.tex
new file mode 100644
index 0000000..c61c3f7
--- /dev/null
+++ b/device-types/virtio-console/device-conformance.tex
@@ -0,0 +1,8 @@
+\conformance{\subsection}{Console Device Conformance}\label{sec:Conformance / Device Conformance / Console Device Conformance}
+
+A console device MUST conform to the following normative statements:
+
+\begin{itemize}
+\item \ref{devicenormative:Device Types / Console Device / Device Initialization}
+\item \ref{devicenormative:Device Types / Console Device / Device Operation / Multiport Device Operation}
+\end{itemize}
diff --git a/device-types/virtio-console/driver-conformance.tex b/device-types/virtio-console/driver-conformance.tex
new file mode 100644
index 0000000..1460f4a
--- /dev/null
+++ b/device-types/virtio-console/driver-conformance.tex
@@ -0,0 +1,8 @@
+\conformance{\subsection}{Console Driver Conformance}\label{sec:Conformance / Driver Conformance / Console Driver Conformance}
+
+A console driver MUST conform to the following normative statements:
+
+\begin{itemize}
+\item \ref{drivernormative:Device Types / Console Device / Device Operation}
+\item \ref{drivernormative:Device Types / Console Device / Device Operation / Multiport Device Operation}
+\end{itemize}
--
2.26.2
* [PATCH v3 03/20] virtio-block: Maintain block device spec in separate directory
From: Parav Pandit @ 2023-01-10 23:03 UTC (permalink / raw)
To: mst, virtio-dev, cohuck; +Cc: virtio-comment, Parav Pandit
In-Reply-To: <20230110230358.528098-1-parav@nvidia.com>
Move virtio block device specification to its own file similar to
recent virtio devices.
While at it, place device specification, its driver and device
conformance into its own directory to have self contained device
specification.
Fixes: https://github.com/oasis-tcs/virtio-spec/issues/153
Signed-off-by: Parav Pandit <parav@nvidia.com>
---
changelog:
v2->v3:
- file name changed from device.tex to description.tex
- use input instead of import to insert a file
v1->v2:
- removed extra blank lines at end of file
v0->v1:
- moved to device specific directory
---
conformance.tex | 20 +-
content.tex | 1315 +----------------
device-types/virtio-block/description.tex | 1313 ++++++++++++++++
.../virtio-block/device-conformance.tex | 8 +
.../virtio-block/driver-conformance.tex | 8 +
5 files changed, 1332 insertions(+), 1332 deletions(-)
create mode 100644 device-types/virtio-block/description.tex
create mode 100644 device-types/virtio-block/device-conformance.tex
create mode 100644 device-types/virtio-block/driver-conformance.tex
diff --git a/conformance.tex b/conformance.tex
index 956c808..096ca3d 100644
--- a/conformance.tex
+++ b/conformance.tex
@@ -135,15 +135,7 @@ \section{Conformance Targets}\label{sec:Conformance / Conformance Targets}
\end{itemize}
\input{device-types/virtio-network/driver-conformance.tex}
-
-\conformance{\subsection}{Block Driver Conformance}\label{sec:Conformance / Driver Conformance / Block Driver Conformance}
-
-A block driver MUST conform to the following normative statements:
-
-\begin{itemize}
-\item \ref{drivernormative:Device Types / Block Device / Device Initialization}
-\item \ref{drivernormative:Device Types / Block Device / Device Operation}
-\end{itemize}
+\input{device-types/virtio-block/driver-conformance.tex}
\conformance{\subsection}{Console Driver Conformance}\label{sec:Conformance / Driver Conformance / Console Driver Conformance}
@@ -386,15 +378,7 @@ \section{Conformance Targets}\label{sec:Conformance / Conformance Targets}
\end{itemize}
\input{device-types/virtio-network/device-conformance.tex}
-
-\conformance{\subsection}{Block Device Conformance}\label{sec:Conformance / Device Conformance / Block Device Conformance}
-
-A block device MUST conform to the following normative statements:
-
-\begin{itemize}
-\item \ref{devicenormative:Device Types / Block Device / Device Initialization}
-\item \ref{devicenormative:Device Types / Block Device / Device Operation}
-\end{itemize}
+\input{device-types/virtio-block/device-conformance.tex}
\conformance{\subsection}{Console Device Conformance}\label{sec:Conformance / Device Conformance / Console Device Conformance}
diff --git a/content.tex b/content.tex
index 90e042c..8c091a9 100644
--- a/content.tex
+++ b/content.tex
@@ -3004,1320 +3004,7 @@ \chapter{Device Types}\label{sec:Device Types}
them no further.
\input{device-types/virtio-network/description.tex}
-
-\section{Block Device}\label{sec:Device Types / Block Device}
-
-The virtio block device is a simple virtual block device (i.e.
-disk). Read and write requests (and other exotic requests) are
-placed in one of its queues, and serviced (probably out of order) by the
-device except where noted.
-
-\subsection{Device ID}\label{sec:Device Types / Block Device / Device ID}
- 2
-
-\subsection{Virtqueues}\label{sec:Device Types / Block Device / Virtqueues}
-\begin{description}
-\item[0] requestq1
-\item[\ldots]
-\item[N-1] requestqN
-\end{description}
-
- N=1 if VIRTIO_BLK_F_MQ is not negotiated, otherwise N is set by
- \field{num_queues}.
-
-\subsection{Feature bits}\label{sec:Device Types / Block Device / Feature bits}
-
-\begin{description}
-\item[VIRTIO_BLK_F_SIZE_MAX (1)] Maximum size of any single segment is
- in \field{size_max}.
-
-\item[VIRTIO_BLK_F_SEG_MAX (2)] Maximum number of segments in a
- request is in \field{seg_max}.
-
-\item[VIRTIO_BLK_F_GEOMETRY (4)] Disk-style geometry specified in
- \field{geometry}.
-
-\item[VIRTIO_BLK_F_RO (5)] Device is read-only.
-
-\item[VIRTIO_BLK_F_BLK_SIZE (6)] Block size of disk is in \field{blk_size}.
-
-\item[VIRTIO_BLK_F_FLUSH (9)] Cache flush command support.
-
-\item[VIRTIO_BLK_F_TOPOLOGY (10)] Device exports information on optimal I/O
- alignment.
-
-\item[VIRTIO_BLK_F_CONFIG_WCE (11)] Device can toggle its cache between writeback
- and writethrough modes.
-
-\item[VIRTIO_BLK_F_MQ (12)] Device supports multiqueue.
-
-\item[VIRTIO_BLK_F_DISCARD (13)] Device can support discard command, maximum
- discard sectors size in \field{max_discard_sectors} and maximum discard
- segment number in \field{max_discard_seg}.
-
-\item[VIRTIO_BLK_F_WRITE_ZEROES (14)] Device can support write zeroes command,
- maximum write zeroes sectors size in \field{max_write_zeroes_sectors} and
- maximum write zeroes segment number in \field{max_write_zeroes_seg}.
-
-\item[VIRTIO_BLK_F_LIFETIME (15)] Device supports providing storage lifetime
- information.
-
-\item[VIRTIO_BLK_F_SECURE_ERASE (16)] Device supports secure erase command,
- maximum erase sectors count in \field{max_secure_erase_sectors} and
- maximum erase segment number in \field{max_secure_erase_seg}.
-
-\item[VIRTIO_BLK_F_ZONED(17)] Device is a Zoned Block Device, that is, a device
- that follows the zoned storage device behavior that is also supported by
- industry standards such as the T10 Zoned Block Command standard (ZBC r05) or
- the NVMe(TM) NVM Express Zoned Namespace Command Set Specification 1.1b
- (ZNS). For brevity, these standard documents are referred to as "ZBD standards"
- from this point on in the text.
-
-\end{description}
-
-\subsubsection{Legacy Interface: Feature bits}\label{sec:Device Types / Block Device / Feature bits / Legacy Interface: Feature bits}
-
-\begin{description}
-\item[VIRTIO_BLK_F_BARRIER (0)] Device supports request barriers.
-
-\item[VIRTIO_BLK_F_SCSI (7)] Device supports scsi packet commands.
-\end{description}
-
-\begin{note}
- In the legacy interface, VIRTIO_BLK_F_FLUSH was also
- called VIRTIO_BLK_F_WCE.
-\end{note}
-
-\subsection{Device configuration layout}\label{sec:Device Types / Block Device / Device configuration layout}
-
-The \field{capacity} of the device (expressed in 512-byte sectors) is always
-present. The availability of the others all depend on various feature
-bits as indicated above.
-
-The field \field{num_queues} only exists if VIRTIO_BLK_F_MQ is set. This field specifies
-the number of queues.
-
-The parameters in the configuration space of the device \field{max_discard_sectors}
-and \field{discard_sector_alignment} are expressed in 512-byte units if the
-VIRTIO_BLK_F_DISCARD feature bit is negotiated. The \field{max_write_zeroes_sectors}
-is expressed in 512-byte units if the VIRTIO_BLK_F_WRITE_ZEROES feature
-bit is negotiated. The parameters in the configuration space of the device
-\field{max_secure_erase_sectors} and \field{secure_erase_sector_alignment} are expressed
-in 512-byte units if the VIRTIO_BLK_F_SECURE_ERASE feature bit is negotiated.
-
-If the VIRTIO_BLK_F_ZONED feature is negotiated, then in
-\field{virtio_blk_zoned_characteristics},
-\begin{itemize}
-\item \field{zone_sectors} value is expressed in 512-byte sectors.
-\item \field{max_append_sectors} value is expressed in 512-byte sectors.
-\item \field{write_granularity} value is expressed in bytes.
-\end{itemize}
-
-The \field{model} field in \field{zoned} may have the following values:
-
-\begin{lstlisting}
-#define VIRTIO_BLK_Z_NONE 0
-#define VIRTIO_BLK_Z_HM 1
-#define VIRTIO_BLK_Z_HA 2
-\end{lstlisting}
-
-Depending on their design, zoned block devices may follow several possible
-models of operation. The three models that are standardized for ZBDs are
-drive-managed, host-managed and host-aware.
-
-While being zoned internally, drive-managed ZBDs behave exactly like regular,
-non-zoned block devices. For the purposes of virtio standardization,
-drive-managed ZBDs can always be treated as non-zoned devices. These devices
-have the VIRTIO_BLK_Z_NONE model value set in the \field{model} field in
-\field{zoned}.
-
-Devices that offer the VIRTIO_BLK_F_ZONED feature while reporting the
-VIRTIO_BLK_Z_NONE zoned model are drive-managed zoned block devices. In this
-case, the driver treats the device as a regular non-zoned block device.
-
-Host-managed zoned block devices have their LBA range divided into Sequential
-Write Required (SWR) zones that require some additional handling by the host
-for correct operation. All write requests to SWR zones are required to be
-sequential and zones containing some written data need to be reset before that
-data can be rewritten. Host-managed devices support a set of ZBD-specific I/O
-requests that can be used by the host to manage device zones. Host-managed
-devices report VIRTIO_BLK_Z_HM in the \field{model} field in \field{zoned}.
-
-Host-aware zoned block devices have their LBA range divided into Sequential
-Write Preferred (SWP) zones that support random write access, similar to
-regular non-zoned devices. However, the device I/O performance might not be
-optimal if SWP zones are used in a random I/O pattern. SWP zones also support
-the same set of ZBD-specific I/O requests as host-managed devices that allow
-host-aware devices to be managed by any host that supports zoned block devices
-to achieve its optimum performance. Host-aware devices report VIRTIO_BLK_Z_HA
-in the \field{model} field in \field{zoned}.
-
-Both SWR zones and SWP zones are sometimes referred to as sequential zones.
-
-During device operation, sequential zones can be in one of the following states:
-empty, implicitly-open, explicitly-open, closed and full. The state machine that
-governs the transitions between these states is described later in this document.
-
-SWR and SWP zones consume volatile device resources while being in certain
-states and the device may set limits on the number of zones that can be in these
-states simultaneously.
-
-Zoned block devices use two internal counters to account for the device
-resources in use, the number of currently open zones and the number of currently
-active zones.
-
-Any zone state transition from a state that doesn't consume a zone resource to a
-state that consumes the same resource increments the internal device counter for
-that resource. Any zone transition out of a state that consumes a zone resource
-to a state that doesn't consume the same resource decrements the counter. Any
-request that causes the device to exceed the reported zone resource limits is
-terminated by the device with a "zone resources exceeded" error as defined for
-specific commands later.
-
-\begin{lstlisting}
-struct virtio_blk_config {
- le64 capacity;
- le32 size_max;
- le32 seg_max;
- struct virtio_blk_geometry {
- le16 cylinders;
- u8 heads;
- u8 sectors;
- } geometry;
- le32 blk_size;
- struct virtio_blk_topology {
- // # of logical blocks per physical block (log2)
- u8 physical_block_exp;
- // offset of first aligned logical block
- u8 alignment_offset;
- // suggested minimum I/O size in blocks
- le16 min_io_size;
- // optimal (suggested maximum) I/O size in blocks
- le32 opt_io_size;
- } topology;
- u8 writeback;
- u8 unused0;
- u16 num_queues;
- le32 max_discard_sectors;
- le32 max_discard_seg;
- le32 discard_sector_alignment;
- le32 max_write_zeroes_sectors;
- le32 max_write_zeroes_seg;
- u8 write_zeroes_may_unmap;
- u8 unused1[3];
- le32 max_secure_erase_sectors;
- le32 max_secure_erase_seg;
- le32 secure_erase_sector_alignment;
- struct virtio_blk_zoned_characteristics {
- le32 zone_sectors;
- le32 max_open_zones;
- le32 max_active_zones;
- le32 max_append_sectors;
- le32 write_granularity;
- u8 model;
- u8 unused2[3];
- } zoned;
-};
-\end{lstlisting}
-
-
-\subsubsection{Legacy Interface: Device configuration layout}\label{sec:Device Types / Block Device / Device configuration layout / Legacy Interface: Device configuration layout}
-When using the legacy interface, transitional devices and drivers
-MUST format the fields in struct virtio_blk_config
-according to the native endian of the guest rather than
-(necessarily when not using the legacy interface) little-endian.
-
-
-\subsection{Device Initialization}\label{sec:Device Types / Block Device / Device Initialization}
-
-\begin{enumerate}
-\item The device size can be read from \field{capacity}.
-
-\item If the VIRTIO_BLK_F_BLK_SIZE feature is negotiated,
- \field{blk_size} can be read to determine the optimal sector size
- for the driver to use. This does not affect the units used in
- the protocol (always 512 bytes), but awareness of the correct
- value can affect performance.
-
-\item If the VIRTIO_BLK_F_RO feature is set by the device, any write
- requests will fail.
-
-\item If the VIRTIO_BLK_F_TOPOLOGY feature is negotiated, the fields in the
- \field{topology} struct can be read to determine the physical block size and optimal
- I/O lengths for the driver to use. This also does not affect the units
- in the protocol, only performance.
-
-\item If the VIRTIO_BLK_F_CONFIG_WCE feature is negotiated, the cache
- mode can be read or set through the \field{writeback} field. 0 corresponds
- to a writethrough cache, 1 to a writeback cache\footnote{Consistent with
- \ref{devicenormative:Device Types / Block Device / Device Operation},
- a writethrough cache can be defined broadly as a cache that commits
- writes to persistent device backend storage before reporting their
- completion. For example, a battery-backed writeback cache actually
- counts as writethrough according to this definition.}. The cache mode
- after reset can be either writeback or writethrough. The actual
- mode can be determined by reading \field{writeback} after feature
- negotiation.
-
-\item If the VIRTIO_BLK_F_DISCARD feature is negotiated,
- \field{max_discard_sectors} and \field{max_discard_seg} can be read
- to determine the maximum discard sectors and maximum number of discard
- segments for the block driver to use. \field{discard_sector_alignment}
- can be used by OS when splitting a request based on alignment.
-
-\item If the VIRTIO_BLK_F_WRITE_ZEROES feature is negotiated,
- \field{max_write_zeroes_sectors} and \field{max_write_zeroes_seg} can
- be read to determine the maximum write zeroes sectors and maximum
- number of write zeroes segments for the block driver to use.
-
-\item If the VIRTIO_BLK_F_MQ feature is negotiated, \field{num_queues} field
- can be read to determine the number of queues.
-
-\item If the VIRTIO_BLK_F_SECURE_ERASE feature is negotiated,
- \field{max_secure_erase_sectors} and \field{max_secure_erase_seg} can be read
- to determine the maximum secure erase sectors and maximum number of
- secure erase segments for the block driver to use.
- \field{secure_erase_sector_alignment} can be used by OS when splitting a
- request based on alignment.
-
-\item If the VIRTIO_BLK_F_ZONED feature is negotiated, the fields in
- \field{zoned} can be read by the driver to determine the zone
- characteristics of the device. All \field{zoned} fields are read-only.
-
-\end{enumerate}
-
-\drivernormative{\subsubsection}{Device Initialization}{Device Types / Block Device / Device Initialization}
-
-Drivers SHOULD NOT negotiate VIRTIO_BLK_F_FLUSH if they are incapable of
-sending VIRTIO_BLK_T_FLUSH commands.
-
-If neither VIRTIO_BLK_F_CONFIG_WCE nor VIRTIO_BLK_F_FLUSH are
-negotiated, the driver MAY deduce the presence of a writethrough cache.
-If VIRTIO_BLK_F_CONFIG_WCE was not negotiated but VIRTIO_BLK_F_FLUSH was,
-the driver SHOULD assume presence of a writeback cache.
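
The deduction rules above can be sketched as a small C function. This is only an illustration: the enum and the function name blk_guess_cache are hypothetical, and the sketch assumes the negotiated feature set is available as a 64-bit mask indexed by feature bit number.

```c
#include <stdint.h>

/* Feature bit numbers from the block device feature list. */
#define VIRTIO_BLK_F_FLUSH       9
#define VIRTIO_BLK_F_CONFIG_WCE 11

/* Hypothetical result type for this sketch. */
enum blk_cache_guess {
    BLK_CACHE_WRITETHROUGH, /* neither CONFIG_WCE nor FLUSH negotiated */
    BLK_CACHE_WRITEBACK,    /* FLUSH negotiated without CONFIG_WCE */
    BLK_CACHE_READ_CONFIG   /* CONFIG_WCE negotiated: read 'writeback' */
};

/* 'negotiated' is assumed to be a bitmask of negotiated feature bits. */
static enum blk_cache_guess blk_guess_cache(uint64_t negotiated)
{
    int wce   = (int)((negotiated >> VIRTIO_BLK_F_CONFIG_WCE) & 1);
    int flush = (int)((negotiated >> VIRTIO_BLK_F_FLUSH) & 1);

    if (wce)
        return BLK_CACHE_READ_CONFIG; /* mode is reported by the device */
    return flush ? BLK_CACHE_WRITEBACK : BLK_CACHE_WRITETHROUGH;
}
```

When BLK_CACHE_READ_CONFIG is returned, the driver would read \field{writeback} only after setting FEATURES_OK, as required by the statement that follows.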
-
-The driver MUST NOT read \field{writeback} before setting
-the FEATURES_OK \field{device status} bit.
-
-Drivers MUST NOT negotiate the VIRTIO_BLK_F_ZONED feature if they are incapable
-of supporting devices with the VIRTIO_BLK_Z_HM, VIRTIO_BLK_Z_HA or
-VIRTIO_BLK_Z_NONE zoned model.
-
-If the VIRTIO_BLK_F_ZONED feature is offered by the device with the
-VIRTIO_BLK_Z_HM zone model, then the VIRTIO_BLK_F_DISCARD feature MUST NOT be
-offered by the driver.
-
-If the VIRTIO_BLK_F_ZONED feature and VIRTIO_BLK_F_DISCARD feature are both
-offered by the device with the VIRTIO_BLK_Z_HA or VIRTIO_BLK_Z_NONE zone model,
-then the driver MAY negotiate these two bits independently.
-
-If the VIRTIO_BLK_F_ZONED feature is negotiated, then
-\begin{itemize}
-\item if the driver that can not support host-managed zoned devices
- reads VIRTIO_BLK_Z_HM from the \field{model} field of \field{zoned}, the
- driver MUST NOT set FEATURES_OK flag and instead set the FAILED bit.
-
-\item if the driver that can not support zoned devices reads VIRTIO_BLK_Z_HA
- from the \field{model} field of \field{zoned}, the driver
- MAY handle the device as a non-zoned device. In this case, the
- driver SHOULD ignore all other fields in \field{zoned}.
-\end{itemize}
-
-\devicenormative{\subsubsection}{Device Initialization}{Device Types / Block Device / Device Initialization}
-
-Devices SHOULD always offer VIRTIO_BLK_F_FLUSH, and MUST offer it
-if they offer VIRTIO_BLK_F_CONFIG_WCE.
-
-If VIRTIO_BLK_F_CONFIG_WCE is negotiated but VIRTIO_BLK_F_FLUSH
-is not, the device MUST initialize \field{writeback} to 0.
-
-The device MUST initialize padding bytes \field{unused0} and
-\field{unused1} to 0.
-
-If the device that is being initialized is not a zoned device, the device
-SHOULD NOT offer the VIRTIO_BLK_F_ZONED feature.
-
-The VIRTIO_BLK_F_ZONED feature cannot be properly negotiated without
-FEATURES_OK bit. Legacy devices MUST NOT offer VIRTIO_BLK_F_ZONED feature bit.
-
-If the VIRTIO_BLK_F_ZONED feature is not accepted by the driver,
-\begin{itemize}
-\item the device with the VIRTIO_BLK_Z_HA or VIRTIO_BLK_Z_NONE zone model SHOULD
- proceed with the initialization while setting all zoned characteristics
- fields to zero.
-
-\item the device with the VIRTIO_BLK_Z_HM zone model MUST fail to set the
- FEATURES_OK device status bit when the driver writes the Device Status
- field.
-\end{itemize}
-
-If the VIRTIO_BLK_F_ZONED feature is negotiated, then the \field{model} field in
-\field{zoned} struct in the configuration space MUST be set by the device
-\begin{itemize}
-\item to the value of VIRTIO_BLK_Z_NONE if it operates as a drive-managed
- zoned block device or a non-zoned block device.
-
-\item to the value of VIRTIO_BLK_Z_HM if it operates as a host-managed zoned
- block device.
-
-\item to the value of VIRTIO_BLK_Z_HA if it operates as a host-aware zoned
- block device.
-\end{itemize}
-
-If the VIRTIO_BLK_F_ZONED feature is negotiated and the device \field{model}
-field in \field{zoned} struct is VIRTIO_BLK_Z_HM or VIRTIO_BLK_Z_HA,
-
-\begin{itemize}
-\item the \field{zone_sectors} field of \field{zoned} MUST be set by the device
- to the size of a single zone on the device. All zones of the device have the
- same size indicated by \field{zone_sectors} except for the last zone that
- MAY be smaller than all other zones. The driver can calculate the number of
- zones on the device as
- \begin{lstlisting}
- nr_zones = (capacity + zone_sectors - 1) / zone_sectors;
- \end{lstlisting}
- and the size of the last zone as
- \begin{lstlisting}
- zs_last = capacity - (nr_zones - 1) * zone_sectors;
- \end{lstlisting}
-
-\item The \field{max_open_zones} field of the \field{zoned} structure MUST be
- set by the device to the maximum number of zones that can be open on the
- device (zones in the implicit open or explicit open state). A value
- of zero indicates that the device does not have any limit on the number of
- open zones.
-
-\item The \field{max_active_zones} field of the \field{zoned} structure MUST
- be set by the device to the maximum number of zones that can be active on the
- device (zones in the implicit open, explicit open or closed state). A value
- of zero indicates that the device does not have any limit on the number of
- active zones.
-
-\item the \field{max_append_sectors} field of \field{zoned} MUST be set by
- the device to the maximum data size of a VIRTIO_BLK_T_ZONE_APPEND request
- that can be successfully issued to the device. The value of this field MUST
- NOT exceed the \field{seg_max} * \field{size_max} value. A device MAY set
- the \field{max_append_sectors} to zero if it doesn't support
- VIRTIO_BLK_T_ZONE_APPEND requests.
-
-\item the \field{write_granularity} field of \field{zoned} MUST be set by the
- device to the offset and size alignment constraint for VIRTIO_BLK_T_OUT
- and VIRTIO_BLK_T_ZONE_APPEND requests issued to a sequential zone of the
- device.
-
-\item the device MUST initialize padding bytes \field{unused2} to 0.
-\end{itemize}
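
As a worked example of the zone count and last-zone size formulas above, here is a minimal C sketch. The struct and function names are hypothetical; capacity and zone_sectors are taken in 512-byte sectors and in CPU byte order. The ceiling division is written with a remainder test, which gives the same result as (capacity + zone_sectors - 1) / zone_sectors without risking overflow for very large capacities.

```c
#include <stdint.h>

/* Hypothetical container for the two derived values. */
struct blk_zone_layout {
    uint64_t nr_zones; /* number of zones on the device */
    uint64_t zs_last;  /* size of the last zone, in 512-byte sectors */
};

/* Returns {0, 0} if either input is zero (no zones to describe). */
static struct blk_zone_layout blk_zone_layout(uint64_t capacity,
                                              uint32_t zone_sectors)
{
    struct blk_zone_layout l = { 0, 0 };

    if (capacity == 0 || zone_sectors == 0)
        return l;

    /* Ceiling division: a partial last zone still counts as a zone. */
    l.nr_zones = capacity / zone_sectors + (capacity % zone_sectors != 0);
    /* Whatever is left after (nr_zones - 1) full-size zones. */
    l.zs_last = capacity - (l.nr_zones - 1) * zone_sectors;
    return l;
}
```

For a capacity of 1000 sectors and 256-sector zones this yields 4 zones, the last one holding 232 sectors.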
-
-\subsubsection{Legacy Interface: Device Initialization}\label{sec:Device Types / Block Device / Device Initialization / Legacy Interface: Device Initialization}
-
-Because legacy devices do not have FEATURES_OK, transitional devices
-MUST implement slightly different behavior around feature negotiation
-when used through the legacy interface. In particular, when using the
-legacy interface:
-
-\begin{itemize}
-\item the driver MAY read or write \field{writeback} before setting
- the DRIVER or DRIVER_OK \field{device status} bit
-
-\item the device MUST NOT modify the cache mode (and \field{writeback})
- as a result of a driver setting a status bit, unless
- the DRIVER_OK bit is being set and the driver has not set the
- VIRTIO_BLK_F_CONFIG_WCE driver feature bit.
-
-\item the device MUST NOT modify the cache mode (and \field{writeback})
- as a result of a driver modifying the driver feature bits, for example
- if the driver sets the VIRTIO_BLK_F_CONFIG_WCE driver feature bit but
- does not set the VIRTIO_BLK_F_FLUSH bit.
-\end{itemize}
-
-
-\subsection{Device Operation}\label{sec:Device Types / Block Device / Device Operation}
-
-The driver queues requests to the virtqueues, and they are used by
-the device (not necessarily in order). Each request except
-VIRTIO_BLK_T_ZONE_APPEND is of form:
-
-\begin{lstlisting}
-struct virtio_blk_req {
- le32 type;
- le32 reserved;
- le64 sector;
- u8 data[];
- u8 status;
-};
-\end{lstlisting}
-
-The type of the request is either a read (VIRTIO_BLK_T_IN), a write
-(VIRTIO_BLK_T_OUT), a discard (VIRTIO_BLK_T_DISCARD), a write zeroes
-(VIRTIO_BLK_T_WRITE_ZEROES), a flush (VIRTIO_BLK_T_FLUSH), a get device ID
-string command (VIRTIO_BLK_T_GET_ID), a secure erase
-(VIRTIO_BLK_T_SECURE_ERASE), or a get device lifetime command
-(VIRTIO_BLK_T_GET_LIFETIME).
-
-\begin{lstlisting}
-#define VIRTIO_BLK_T_IN 0
-#define VIRTIO_BLK_T_OUT 1
-#define VIRTIO_BLK_T_FLUSH 4
-#define VIRTIO_BLK_T_GET_ID 8
-#define VIRTIO_BLK_T_GET_LIFETIME 10
-#define VIRTIO_BLK_T_DISCARD 11
-#define VIRTIO_BLK_T_WRITE_ZEROES 13
-#define VIRTIO_BLK_T_SECURE_ERASE 14
-\end{lstlisting}
-
-The \field{sector} number indicates the offset (multiplied by 512) where
-the read or write is to occur. This field is unused and set to 0 for
-commands other than read, write and some zone operations.
-
-VIRTIO_BLK_T_IN requests populate \field{data} with the contents of sectors
-read from the block device (in multiples of 512 bytes). VIRTIO_BLK_T_OUT
-requests write the contents of \field{data} to the block device (in multiples
-of 512 bytes).
-
-The \field{data} used for discard, secure erase or write zeroes commands
-consists of one or more segments. The maximum number of segments is
-\field{max_discard_seg} for discard commands, \field{max_secure_erase_seg} for
-secure erase commands and \field{max_write_zeroes_seg} for write zeroes
-commands.
-Each segment is of form:
-
-\begin{lstlisting}
-struct virtio_blk_discard_write_zeroes {
- le64 sector;
- le32 num_sectors;
- struct {
- le32 unmap:1;
- le32 reserved:31;
- } flags;
-};
-\end{lstlisting}
-
-\field{sector} indicates the starting offset (in 512-byte units) of the
-segment, while \field{num_sectors} indicates the number of sectors in each
-discarded range. \field{unmap} is only used in write zeroes commands and allows
-the device to discard the specified range, provided that following reads return
-zeroes.
-
-VIRTIO_BLK_T_GET_ID requests fetch the device ID string from the device into
-\field{data}. The device ID string is a NUL-padded ASCII string up to 20 bytes
-long. If the string is 20 bytes long then there is no NUL terminator.
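
Because a 20-byte ID carries no NUL terminator, a driver cannot treat the returned buffer as a C string directly. The following C sketch shows one way to copy it safely; the macro BLK_ID_LEN and the function blk_id_to_cstr are hypothetical names used only for this illustration.

```c
#include <stddef.h>
#include <string.h>

/* Maximum length of the device ID string, per the text above. */
#define BLK_ID_LEN 20

/* Copies the NUL-padded ID from 'id' into 'out' and always terminates it.
 * 'out' must have room for BLK_ID_LEN + 1 bytes.
 * Returns the length of the ID, not counting the terminator. */
static size_t blk_id_to_cstr(const unsigned char id[BLK_ID_LEN],
                             char out[BLK_ID_LEN + 1])
{
    size_t len = 0;

    /* Stop at the first padding NUL or at the 20-byte limit. */
    while (len < BLK_ID_LEN && id[len] != '\0')
        len++;

    memcpy(out, id, len);
    out[len] = '\0';
    return len;
}
```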
-
-The \field{data} used for VIRTIO_BLK_T_GET_LIFETIME requests is populated
-by the device, and is of the form
-
-\begin{lstlisting}
-struct virtio_blk_lifetime {
- le16 pre_eol_info;
- le16 device_lifetime_est_typ_a;
- le16 device_lifetime_est_typ_b;
-};
-\end{lstlisting}
-
-The \field{pre_eol_info} specifies the percentage of reserved blocks
-that are consumed and will have one of these values:
-
-\begin{lstlisting}
-/* Value not available */
-#define VIRTIO_BLK_PRE_EOL_INFO_UNDEFINED 0
-/* < 80% of reserved blocks are consumed */
-#define VIRTIO_BLK_PRE_EOL_INFO_NORMAL 1
-/* 80% of reserved blocks are consumed */
-#define VIRTIO_BLK_PRE_EOL_INFO_WARNING 2
-/* 90% of reserved blocks are consumed */
-#define VIRTIO_BLK_PRE_EOL_INFO_URGENT 3
-/* All others values are reserved */
-\end{lstlisting}
-
-The \field{device_lifetime_est_typ_a} refers to wear of SLC cells and is provided
-in increments of 10\%, with 0 meaning undefined, 1 meaning up to 10\% of lifetime
-used, and so on, through to 11 meaning estimated lifetime exceeded.
-All values above 11 are reserved.
-
-The \field{device_lifetime_est_typ_b} refers to wear of MLC cells and is provided
-with the same semantics as \field{device_lifetime_est_typ_a}.
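
The encoding of the two lifetime estimate fields can be illustrated with a short C sketch. The function names are hypothetical, and the argument is assumed to have already been converted from le16 to CPU byte order.

```c
#include <stdint.h>

/* Returns the upper bound of lifetime used, in percent (10, 20, ..., 100),
 * for estimate values 1..10. Returns -1 for 0 (undefined), for 11
 * (lifetime exceeded, see blk_lifetime_exceeded) and for reserved values. */
static int blk_lifetime_used_upper_pct(uint16_t est)
{
    if (est >= 1 && est <= 10)
        return est * 10;
    return -1;
}

/* Returns 1 if the estimate reports that the lifetime has been exceeded. */
static int blk_lifetime_exceeded(uint16_t est)
{
    return est == 11;
}
```

A value of 3, for example, means that up to 30 percent of the estimated lifetime has been used.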
-
-The final \field{status} byte is written by the device: either
-VIRTIO_BLK_S_OK for success, VIRTIO_BLK_S_IOERR for device or driver
-error or VIRTIO_BLK_S_UNSUPP for a request unsupported by device:
-
-\begin{lstlisting}
-#define VIRTIO_BLK_S_OK 0
-#define VIRTIO_BLK_S_IOERR 1
-#define VIRTIO_BLK_S_UNSUPP 2
-\end{lstlisting}
-
-The status of individual segments is indeterminate when a discard or write zeroes
-command produces VIRTIO_BLK_S_IOERR. A segment may have completed
-successfully, failed, or not been processed by the device.
-
-The following requirements only apply if the VIRTIO_BLK_F_ZONED feature is
-negotiated.
-
-In addition to the request types defined for non-zoned devices, the type of the
-request can be a zone report (VIRTIO_BLK_T_ZONE_REPORT), an explicit zone open
-(VIRTIO_BLK_T_ZONE_OPEN), a zone close (VIRTIO_BLK_T_ZONE_CLOSE), a zone finish
-(VIRTIO_BLK_T_ZONE_FINISH), a zone append (VIRTIO_BLK_T_ZONE_APPEND), a zone
-reset (VIRTIO_BLK_T_ZONE_RESET) or a zone reset all
-(VIRTIO_BLK_T_ZONE_RESET_ALL).
-
-\begin{lstlisting}
-#define VIRTIO_BLK_T_ZONE_APPEND 15
-#define VIRTIO_BLK_T_ZONE_REPORT 16
-#define VIRTIO_BLK_T_ZONE_OPEN 18
-#define VIRTIO_BLK_T_ZONE_CLOSE 20
-#define VIRTIO_BLK_T_ZONE_FINISH 22
-#define VIRTIO_BLK_T_ZONE_RESET 24
-#define VIRTIO_BLK_T_ZONE_RESET_ALL 26
-\end{lstlisting}
-
-Requests of type VIRTIO_BLK_T_OUT, VIRTIO_BLK_T_ZONE_OPEN,
-VIRTIO_BLK_T_ZONE_CLOSE, VIRTIO_BLK_T_ZONE_FINISH, VIRTIO_BLK_T_ZONE_APPEND,
-VIRTIO_BLK_T_ZONE_RESET or VIRTIO_BLK_T_ZONE_RESET_ALL may be completed by the
-device with VIRTIO_BLK_S_OK, VIRTIO_BLK_S_IOERR or VIRTIO_BLK_S_UNSUPP
-\field{status}, or, additionally, with VIRTIO_BLK_S_ZONE_INVALID_CMD,
-VIRTIO_BLK_S_ZONE_UNALIGNED_WP, VIRTIO_BLK_S_ZONE_OPEN_RESOURCE or
-VIRTIO_BLK_S_ZONE_ACTIVE_RESOURCE ZBD-specific status codes.
-
-Besides the request status, VIRTIO_BLK_T_ZONE_APPEND requests return the
-starting sector of the appended data back to the driver. For this reason,
-the VIRTIO_BLK_T_ZONE_APPEND request has the layout that is extended to have
-the \field{append_sector} field to carry this value:
-
-\begin{lstlisting}
-struct virtio_blk_req_za {
- le32 type;
- le32 reserved;
- le64 sector;
- u8 data[];
- le64 append_sector;
- u8 status;
-};
-\end{lstlisting}
-
-\begin{lstlisting}
-#define VIRTIO_BLK_S_ZONE_INVALID_CMD 3
-#define VIRTIO_BLK_S_ZONE_UNALIGNED_WP 4
-#define VIRTIO_BLK_S_ZONE_OPEN_RESOURCE 5
-#define VIRTIO_BLK_S_ZONE_ACTIVE_RESOURCE 6
-\end{lstlisting}
-
-Requests of the type VIRTIO_BLK_T_ZONE_REPORT are reads and requests of the type
-VIRTIO_BLK_T_ZONE_APPEND are writes. VIRTIO_BLK_T_ZONE_OPEN,
-VIRTIO_BLK_T_ZONE_CLOSE, VIRTIO_BLK_T_ZONE_FINISH, VIRTIO_BLK_T_ZONE_RESET and
-VIRTIO_BLK_T_ZONE_RESET_ALL are non-data requests.
-
-Zone sector address is a 64-bit address of the first 512-byte sector of the
-zone.
-
-VIRTIO_BLK_T_ZONE_OPEN, VIRTIO_BLK_T_ZONE_CLOSE, VIRTIO_BLK_T_ZONE_FINISH and
-VIRTIO_BLK_T_ZONE_RESET requests cause the zone operation to act on a particular
-zone specified by the zone sector address in the \field{sector} of the request.
-
-VIRTIO_BLK_T_ZONE_RESET_ALL request acts upon all applicable zones of the
-device. The \field{sector} value is not used for this request.
-
-In ZBD standards, the VIRTIO_BLK_T_ZONE_REPORT request belongs to "Zone
-Management Receive" command category and VIRTIO_BLK_T_ZONE_OPEN,
-VIRTIO_BLK_T_ZONE_CLOSE, VIRTIO_BLK_T_ZONE_FINISH and
-VIRTIO_BLK_T_ZONE_RESET/VIRTIO_BLK_T_ZONE_RESET_ALL requests are categorized as
-"Zone Management Send" commands. VIRTIO_BLK_T_ZONE_APPEND is categorized
-separately from zone management commands and is the only request that uses
-the \field{append_sector} field of \field{virtio_blk_req_za} to return
-to the driver the sector at which the data has been appended to the zone.
-
-VIRTIO_BLK_T_ZONE_REPORT is a read request that returns the information about
-the current state of zones on the device starting from the zone containing the
-\field{sector} of the request. The report consists of a header followed by zero
-or more zone descriptors.
-
-A zone report reply has the following structure:
-
-\begin{lstlisting}
-struct virtio_blk_zone_report {
- le64 nr_zones;
- u8 reserved[56];
- struct virtio_blk_zone_descriptor zones[];
-};
-\end{lstlisting}
-
-The device sets the \field{nr_zones} field in the report header to the number of
-fully transferred zone descriptors in the data buffer.
-
-A zone descriptor has the following structure:
-
-\begin{lstlisting}
-struct virtio_blk_zone_descriptor {
- le64 z_cap;
- le64 z_start;
- le64 z_wp;
- u8 z_type;
- u8 z_state;
- u8 reserved[38];
-};
-\end{lstlisting}
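As a rough illustration of the two layouts above, the following Python sketch decodes a zone report buffer. It assumes only what the structures state: a 64-byte header (le64 nr_zones plus 56 reserved bytes) followed by 64-byte descriptors. The names REPORT_HEADER, ZONE_DESCRIPTOR and parse_zone_report are hypothetical and not part of the specification.

```python
import struct

# Layouts taken from the two structures above; all fields are little-endian.
# "x" is a pad byte, used here to skip the reserved arrays.
REPORT_HEADER = struct.Struct("<Q56x")        # le64 nr_zones; u8 reserved[56]
ZONE_DESCRIPTOR = struct.Struct("<QQQBB38x")  # z_cap, z_start, z_wp, z_type, z_state


def parse_zone_report(buf):
    """Decode a zone report reply into a list of descriptor dicts (hypothetical helper)."""
    (nr_zones,) = REPORT_HEADER.unpack_from(buf, 0)
    zones = []
    offset = REPORT_HEADER.size
    # nr_zones counts only fully transferred descriptors, so every one is readable.
    for _ in range(nr_zones):
        z_cap, z_start, z_wp, z_type, z_state = ZONE_DESCRIPTOR.unpack_from(buf, offset)
        zones.append({"z_cap": z_cap, "z_start": z_start, "z_wp": z_wp,
                      "z_type": z_type, "z_state": z_state})
        offset += ZONE_DESCRIPTOR.size
    return zones
```

Both structures come out at 64 bytes, so descriptor i starts at byte offset 64 * (i + 1) in the reply.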
-
-The zone descriptor field \field{z_type} of \field{virtio_blk_zone_descriptor}
-indicates the type of the zone.
-
-The following zone types are available:
-
-\begin{lstlisting}
-#define VIRTIO_BLK_ZT_CONV 1
-#define VIRTIO_BLK_ZT_SWR 2
-#define VIRTIO_BLK_ZT_SWP 3
-\end{lstlisting}
-
-Read and write operations into zones with the VIRTIO_BLK_ZT_CONV (Conventional)
-type have the same behavior as read and write operations on a regular block
-device. Any block in a conventional zone can be read or written at any time and
-in any order.
-
-Zones with VIRTIO_BLK_ZT_SWR can be read randomly, but must be written
-sequentially at a certain point in the zone called the Write Pointer (WP). With
-every write, the Write Pointer is incremented by the number of sectors written.
-
-Zones with VIRTIO_BLK_ZT_SWP can be read randomly and should be written
-sequentially, similarly to SWR zones. However, SWP zones can accept random write
-operations, that is, VIRTIO_BLK_T_OUT requests with a start sector different
-from the zone write pointer position.
-
-The field \field{z_state} of \field{virtio_blk_zone_descriptor} indicates the
-state of the device zone.
-
-The following zone states are available:
-
-\begin{lstlisting}
-#define VIRTIO_BLK_ZS_NOT_WP 0
-#define VIRTIO_BLK_ZS_EMPTY 1
-#define VIRTIO_BLK_ZS_IOPEN 2
-#define VIRTIO_BLK_ZS_EOPEN 3
-#define VIRTIO_BLK_ZS_CLOSED 4
-#define VIRTIO_BLK_ZS_RDONLY 13
-#define VIRTIO_BLK_ZS_FULL 14
-#define VIRTIO_BLK_ZS_OFFLINE 15
-\end{lstlisting}
-
-Zones of the type VIRTIO_BLK_ZT_CONV are always reported by the device to be in
-the VIRTIO_BLK_ZS_NOT_WP state. Zones of the types VIRTIO_BLK_ZT_SWR and
-VIRTIO_BLK_ZT_SWP can not transition to the VIRTIO_BLK_ZS_NOT_WP state.
-
-Zones in VIRTIO_BLK_ZS_EMPTY (Empty), VIRTIO_BLK_ZS_IOPEN (Implicitly Open),
-VIRTIO_BLK_ZS_EOPEN (Explicitly Open) and VIRTIO_BLK_ZS_CLOSED (Closed) state
-are writable, but zones in VIRTIO_BLK_ZS_RDONLY (Read-Only), VIRTIO_BLK_ZS_FULL
-(Full) and VIRTIO_BLK_ZS_OFFLINE (Offline) state are not. The write pointer
-value (\field{z_wp}) is not valid for Read-Only, Full and Offline zones.
-
-The zone descriptor field \field{z_cap} contains the maximum number of 512-byte
-sectors that are available to be written with user data when the zone is in the
-Empty state. This value shall be less than or equal to the \field{zone_sectors}
-value in \field{virtio_blk_zoned_characteristics} structure in the device
-configuration space.
-
-The zone descriptor field \field{z_start} contains the zone sector address.
-
-The zone descriptor field \field{z_wp} contains the sector address where the
-next write operation for this zone should be issued. This value is undefined
-for conventional zones and for zones in VIRTIO_BLK_ZS_RDONLY,
-VIRTIO_BLK_ZS_FULL and VIRTIO_BLK_ZS_OFFLINE state.
-
-Depending on their state, zones consume resources as follows:
-\begin{itemize}
-\item a zone in VIRTIO_BLK_ZS_IOPEN or VIRTIO_BLK_ZS_EOPEN state consumes one
- open zone resource and, additionally,
-
-\item a zone in VIRTIO_BLK_ZS_IOPEN, VIRTIO_BLK_ZS_EOPEN or
- VIRTIO_BLK_ZS_CLOSED state consumes one active resource.
-\end{itemize}
-
-Attempts for zone transitions that violate zone resource limits must fail with
-VIRTIO_BLK_S_ZONE_OPEN_RESOURCE or VIRTIO_BLK_S_ZONE_ACTIVE_RESOURCE
-\field{status}.
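The resource accounting described above can be sketched as a simple count over zone states. This is only an illustration: the state values are copied from the VIRTIO_BLK_ZS_* definitions, while count_zone_resources is a hypothetical helper name.

```python
# State values copied from the VIRTIO_BLK_ZS_* definitions above.
ZS_IOPEN, ZS_EOPEN, ZS_CLOSED = 2, 3, 4


def count_zone_resources(zone_states):
    """Return (open_count, active_count) for a collection of z_state values."""
    states = list(zone_states)  # allow generators to be passed in
    # Implicitly and explicitly open zones each hold one open zone resource.
    open_count = sum(1 for s in states if s in (ZS_IOPEN, ZS_EOPEN))
    # Open zones and closed zones each hold one active zone resource.
    active_count = sum(1 for s in states if s in (ZS_IOPEN, ZS_EOPEN, ZS_CLOSED))
    return open_count, active_count
```

An open zone is counted in both totals, which matches the "and, additionally" wording of the list.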
-
-Zones in the VIRTIO_BLK_ZS_EMPTY (Empty) state have the write pointer value
-equal to the sector address of the zone. In this state, the entire capacity of
-the zone is available for writing. A zone can transition from this state to
-\begin{itemize}
-\item VIRTIO_BLK_ZS_IOPEN when a successful VIRTIO_BLK_T_OUT request or
- VIRTIO_BLK_T_ZONE_APPEND with a non-zero data size is received for the zone.
-
-\item VIRTIO_BLK_ZS_EOPEN when a successful VIRTIO_BLK_T_ZONE_OPEN request is
- received for the zone
-\end{itemize}
-
-When a VIRTIO_BLK_T_ZONE_RESET request is issued to an Empty zone, the request
-is completed successfully and the zone stays in the VIRTIO_BLK_ZS_EMPTY state.
-
-Zones in the VIRTIO_BLK_ZS_IOPEN (Implicitly Open) state transition from
-this state to
-\begin{itemize}
-\item VIRTIO_BLK_ZS_EMPTY when a successful VIRTIO_BLK_T_ZONE_RESET request is
- received for the zone,
-
-\item VIRTIO_BLK_ZS_EMPTY when a successful VIRTIO_BLK_T_ZONE_RESET_ALL request
- is received by the device,
-
-\item VIRTIO_BLK_ZS_EOPEN when a successful VIRTIO_BLK_T_ZONE_OPEN request is
- received for the zone,
-
-\item VIRTIO_BLK_ZS_CLOSED when a successful VIRTIO_BLK_T_ZONE_CLOSE request is
- received for the zone,
-
-\item VIRTIO_BLK_ZS_CLOSED implicitly by the device when another zone is
- entering the VIRTIO_BLK_ZS_IOPEN or VIRTIO_BLK_ZS_EOPEN state and the number
- of currently open zones is at \field{max_open_zones} limit,
-
-\item VIRTIO_BLK_ZS_FULL when a successful VIRTIO_BLK_T_ZONE_FINISH request is
- received for the zone.
-
-\item VIRTIO_BLK_ZS_FULL when a successful VIRTIO_BLK_T_OUT or
- VIRTIO_BLK_T_ZONE_APPEND request that causes the zone to reach its writable
- capacity is received for the zone.
-\end{itemize}
-
-Zones in the VIRTIO_BLK_ZS_EOPEN (Explicitly Open) state transition from
-this state to
-\begin{itemize}
-\item VIRTIO_BLK_ZS_EMPTY when a successful VIRTIO_BLK_T_ZONE_RESET request is
- received for the zone,
-
-\item VIRTIO_BLK_ZS_EMPTY when a successful VIRTIO_BLK_T_ZONE_RESET_ALL request
- is received by the device,
-
-\item VIRTIO_BLK_ZS_EMPTY when a successful VIRTIO_BLK_T_ZONE_CLOSE request is
- received for the zone and the write pointer of the zone has the value equal
- to the start sector of the zone,
-
-\item VIRTIO_BLK_ZS_CLOSED when a successful VIRTIO_BLK_T_ZONE_CLOSE request is
- received for the zone and the zone write pointer is larger than the start
- sector of the zone,
-
-\item VIRTIO_BLK_ZS_FULL when a successful VIRTIO_BLK_T_ZONE_FINISH request is
- received for the zone,
-
-\item VIRTIO_BLK_ZS_FULL when a successful VIRTIO_BLK_T_OUT or
- VIRTIO_BLK_T_ZONE_APPEND request that causes the zone to reach its writable
- capacity is received for the zone.
-\end{itemize}
-
-When a VIRTIO_BLK_T_ZONE_OPEN request is issued to an Explicitly Open zone, the
-request is completed successfully and the zone stays in the VIRTIO_BLK_ZS_EOPEN
-state.
-
-Zones in the VIRTIO_BLK_ZS_CLOSED (Closed) state transition from this state
-to
-\begin{itemize}
-\item VIRTIO_BLK_ZS_EMPTY when a successful VIRTIO_BLK_T_ZONE_RESET request is
- received for the zone,
-
-\item VIRTIO_BLK_ZS_EMPTY when a successful VIRTIO_BLK_T_ZONE_RESET_ALL request
- is received by the device,
-
-\item VIRTIO_BLK_ZS_IOPEN when a successful VIRTIO_BLK_T_OUT request or
- VIRTIO_BLK_T_ZONE_APPEND with a non-zero data size is received for the zone.
-
-\item VIRTIO_BLK_ZS_EOPEN when a successful VIRTIO_BLK_T_ZONE_OPEN request is
- received for the zone,
-\end{itemize}
-
-When a VIRTIO_BLK_T_ZONE_CLOSE request is issued to a Closed zone, the request
-is completed successfully and the zone stays in the VIRTIO_BLK_ZS_CLOSED state.
-
-Zones in the VIRTIO_BLK_ZS_FULL (Full) state transition from this state to
-VIRTIO_BLK_ZS_EMPTY when a successful VIRTIO_BLK_T_ZONE_RESET request is
-received for the zone or a successful VIRTIO_BLK_T_ZONE_RESET_ALL request is
-received by the device.
-
-When a VIRTIO_BLK_T_ZONE_FINISH request is issued to a Full zone, the request
-is completed successfully and the zone stays in the VIRTIO_BLK_ZS_FULL state.
-
-The device may automatically transition zones to VIRTIO_BLK_ZS_RDONLY
-(Read-Only) or VIRTIO_BLK_ZS_OFFLINE (Offline) state from any other state. The
-device may also automatically transition zones in the Read-Only state to the
-Offline state. Zones in the Offline state may not transition to any other state.
-Such automatic transitions usually indicate hardware failures. The previously
-written data may only be read from zones in the Read-Only state. Zones in the
-Offline state can not be read or written.
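The explicit zone management transitions listed above can be summarized as a lookup table. The sketch below is a reading aid under stated assumptions, not a device implementation: it covers only the OPEN, CLOSE, FINISH and RESET requests, ignores resource limits, write-driven transitions and the Read-Only/Offline states, and returns None for any pair the text does not list. The names TRANSITIONS and next_state are hypothetical.

```python
EMPTY, IOPEN, EOPEN, CLOSED, FULL = "EMPTY", "IOPEN", "EOPEN", "CLOSED", "FULL"

# (current state, request) -> next state, for successful requests only.
TRANSITIONS = {
    (EMPTY, "OPEN"): EOPEN,
    (EMPTY, "RESET"): EMPTY,
    (IOPEN, "RESET"): EMPTY,
    (IOPEN, "OPEN"): EOPEN,
    (IOPEN, "CLOSE"): CLOSED,
    (IOPEN, "FINISH"): FULL,
    (EOPEN, "RESET"): EMPTY,
    (EOPEN, "OPEN"): EOPEN,
    (EOPEN, "CLOSE"): CLOSED,   # EMPTY instead if nothing was written, see below
    (EOPEN, "FINISH"): FULL,
    (CLOSED, "RESET"): EMPTY,
    (CLOSED, "OPEN"): EOPEN,
    (CLOSED, "CLOSE"): CLOSED,
    (FULL, "RESET"): EMPTY,
    (FULL, "FINISH"): FULL,
}


def next_state(state, request, wp_at_start=False):
    """Return the next zone state, or None if the pair is not listed in the text."""
    # Closing an explicitly open zone whose write pointer still equals the
    # zone start sector returns it to Empty rather than Closed.
    if state == EOPEN and request == "CLOSE" and wp_at_start:
        return EMPTY
    return TRANSITIONS.get((state, request))
```

A None result only means this section does not describe the pair; it makes no claim about the status a device would return.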
-
-VIRTIO_BLK_S_ZONE_UNALIGNED_WP is set by the device when the request received
-from the driver attempts to perform a write to an SWR zone and at least one of
-the following conditions is met:
-
-\begin{itemize}
-\item the starting sector of the request is not equal to the current value of
- the zone write pointer.
-
-\item the ending sector of the request data multiplied by 512 is not a multiple
- of the value reported by the device in the field \field{write_granularity}
- in the device configuration space.
-\end{itemize}
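A minimal sketch of the two conditions above, assuming that "ending sector" means the first sector past the written data (start sector plus sector count) and that write_granularity is expressed in bytes. The function name is_unaligned_write is hypothetical.

```python
SECTOR_SIZE = 512  # bytes per sector, as used throughout this section


def is_unaligned_write(start_sector, num_sectors, write_pointer, write_granularity):
    """Return True if a write to an SWR zone meets either unaligned-WP condition."""
    # Condition 1: the write does not start at the zone write pointer.
    if start_sector != write_pointer:
        return True
    # Condition 2: the end of the data, in bytes, is not a multiple of
    # write_granularity. The exclusive end sector is an assumption of this sketch.
    end_sector = start_sector + num_sectors
    return (end_sector * SECTOR_SIZE) % write_granularity != 0
```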
-
-VIRTIO_BLK_S_ZONE_OPEN_RESOURCE is set by the device when a zone operation or
-write request received from the driver can not be handled without exceeding the
-\field{max_open_zones} limit value reported by the device in the configuration
-space.
-
-VIRTIO_BLK_S_ZONE_ACTIVE_RESOURCE is set by the device when a zone operation or
-write request received from the driver can not be handled without exceeding the
-\field{max_active_zones} limit value reported by the device in the configuration
-space.
-
-A zone transition request that would cause both the \field{max_open_zones} and
-the \field{max_active_zones} limits to be exceeded is terminated by the device
-with VIRTIO_BLK_S_ZONE_ACTIVE_RESOURCE \field{status} value.
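The precedence between the two resource statuses can be sketched as follows. The helper resource_status and its parameters are hypothetical; open_after and active_after stand for the resource counts that would result if the request were carried out. The sketch deliberately ignores any special meaning a limit value of zero may have in the configuration space.

```python
def resource_status(open_after, active_after, max_open_zones, max_active_zones):
    """Return the resource error status name for a request, or None if within limits."""
    exceeds_open = open_after > max_open_zones
    exceeds_active = active_after > max_active_zones
    # When both limits would be exceeded, the active-resource status is reported.
    if exceeds_active:
        return "VIRTIO_BLK_S_ZONE_ACTIVE_RESOURCE"
    if exceeds_open:
        return "VIRTIO_BLK_S_ZONE_OPEN_RESOURCE"
    return None
```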
-
-The device reports all other error conditions related to zoned block model
-operation by setting the VIRTIO_BLK_S_ZONE_INVALID_CMD value in
-\field{status} of \field{virtio_blk_req} structure.
-
-\drivernormative{\subsubsection}{Device Operation}{Device Types / Block Device / Device Operation}
-
-The driver SHOULD check if the content of the \field{capacity} field has
-changed upon receiving a configuration change notification.
-
-A driver MUST NOT submit a request which would cause a read or write
-beyond \field{capacity}.
-
-A driver SHOULD accept the VIRTIO_BLK_F_RO feature if offered.
-
-A driver MUST set \field{sector} to 0 for a VIRTIO_BLK_T_FLUSH request.
-A driver SHOULD NOT include any data in a VIRTIO_BLK_T_FLUSH request.
-
-The length of \field{data} MUST be a multiple of 512 bytes for VIRTIO_BLK_T_IN
-and VIRTIO_BLK_T_OUT requests.
-
-The length of \field{data} MUST be a multiple of the size of struct
-virtio_blk_discard_write_zeroes for VIRTIO_BLK_T_DISCARD,
-VIRTIO_BLK_T_SECURE_ERASE and VIRTIO_BLK_T_WRITE_ZEROES requests.
-
-The length of \field{data} MUST be 20 bytes for VIRTIO_BLK_T_GET_ID requests.
-
-VIRTIO_BLK_T_DISCARD requests MUST NOT contain more than
-\field{max_discard_seg} struct virtio_blk_discard_write_zeroes segments in
-\field{data}.
-
-VIRTIO_BLK_T_SECURE_ERASE requests MUST NOT contain more than
-\field{max_secure_erase_seg} struct virtio_blk_discard_write_zeroes segments in
-\field{data}.
-
-VIRTIO_BLK_T_WRITE_ZEROES requests MUST NOT contain more than
-\field{max_write_zeroes_seg} struct virtio_blk_discard_write_zeroes segments in
-\field{data}.
-
-If the VIRTIO_BLK_F_CONFIG_WCE feature is negotiated, the driver MAY
-switch to writethrough or writeback mode by writing respectively 0 and
-1 to the \field{writeback} field. After writing a 0 to \field{writeback},
-the driver MUST NOT assume that any volatile writes have been committed
-to persistent device backend storage.
-
-The \field{unmap} bit MUST be zero for discard commands. The driver
-MUST NOT assume anything about the data returned by read requests after
-a range of sectors has been discarded.
-
-A driver MUST NOT assume that individual segments in a multi-segment
-VIRTIO_BLK_T_DISCARD or VIRTIO_BLK_T_WRITE_ZEROES request completed
-successfully, failed, or were processed by the device at all if the request
-failed with VIRTIO_BLK_S_IOERR.
-
-The following requirements only apply if the VIRTIO_BLK_F_ZONED feature is
-negotiated.
-
-A zone sector address provided by the driver MUST be a multiple of 512 bytes.
-
-When forming a VIRTIO_BLK_T_ZONE_REPORT request, the driver MUST set a sector
-within the sector range of the starting zone to report to \field{sector} field.
-It MAY be a sector that is different from the zone sector address.
-
-In VIRTIO_BLK_T_ZONE_OPEN, VIRTIO_BLK_T_ZONE_CLOSE, VIRTIO_BLK_T_ZONE_FINISH and
-VIRTIO_BLK_T_ZONE_RESET requests, the driver MUST set \field{sector} field to
-point at the first sector in the target zone.
-
-In VIRTIO_BLK_T_ZONE_RESET_ALL request, the driver MUST set the field
-\field{sector} to zero value.
-
-The \field{sector} field of the VIRTIO_BLK_T_ZONE_APPEND request MUST specify
-the zone sector address of the zone to which data is to be appended at the
-position of the write pointer. The size of the data that is appended MUST be a
-multiple of \field{write_granularity} bytes and MUST NOT exceed the
-\field{max_append_sectors} value provided by the device in
-\field{virtio_blk_zoned_characteristics} configuration space structure.
-
-Upon a successful completion of a VIRTIO_BLK_T_ZONE_APPEND request, the driver
-MAY read the starting sector location of the written data from the request
-field \field{append_sector}.
-
-All VIRTIO_BLK_T_OUT requests issued by the driver to sequential zones and
-VIRTIO_BLK_T_ZONE_APPEND requests MUST have:
-
-\begin{enumerate}
-\item the data size that is a multiple of the number of bytes reported
- by the device in the field \field{write_granularity} in the
- \field{virtio_blk_zoned_characteristics} configuration space structure.
-
-\item the value of the field \field{sector} that is a multiple of the number of
- bytes reported by the device in the field \field{write_granularity} in the
- \field{virtio_blk_zoned_characteristics} configuration space structure.
-
-\item the data size that will not exceed the writable zone capacity when its
- value is added to the current value of the write pointer of the zone.
-
-\end{enumerate}
-
-\devicenormative{\subsubsection}{Device Operation}{Device Types / Block Device / Device Operation}
-
-The device MAY change the content of the \field{capacity} field during
-operation of the device. When this happens, the device SHOULD trigger a
-configuration change notification.
-
-A device MUST set the \field{status} byte to VIRTIO_BLK_S_IOERR
-for a write request if the VIRTIO_BLK_F_RO feature is offered, and MUST NOT
-write any data.
-
-The device MUST set the \field{status} byte to VIRTIO_BLK_S_UNSUPP for
-discard, secure erase and write zeroes commands if any unknown flag is set.
-Furthermore, the device MUST set the \field{status} byte to
-VIRTIO_BLK_S_UNSUPP for discard commands if the \field{unmap} flag is set.
-
-For discard commands, the device MAY deallocate the specified range of
-sectors in the device backend storage.
-
-For write zeroes commands, if the \field{unmap} is set, the device MAY
-deallocate the specified range of sectors in the device backend storage,
-as if the discard command had been sent. After a write zeroes command
-is completed, reads of the specified ranges of sectors MUST return
-zeroes. This is true independent of whether \field{unmap} was set or clear.
-
-The device SHOULD clear the \field{write_zeroes_may_unmap} field of the
-virtio configuration space if and only if a write zeroes request cannot
-result in deallocating one or more sectors. The device MAY change the
-content of the field during operation of the device; when this happens,
-the device SHOULD trigger a configuration change notification.
-
-A write is considered volatile when it is submitted; the contents of
-sectors covered by a volatile write are undefined in persistent device
-backend storage until the write becomes stable. A write becomes stable
-once it is completed and one or more of the following conditions is true:
-
-\begin{enumerate}
-\item\label{item:flush1} neither VIRTIO_BLK_F_CONFIG_WCE nor
- VIRTIO_BLK_F_FLUSH feature were negotiated, but VIRTIO_BLK_F_FLUSH was
- offered by the device;
-
-\item\label{item:flush2} the VIRTIO_BLK_F_CONFIG_WCE feature was negotiated and the
- \field{writeback} field in configuration space was 0 \textbf{all the time between
- the submission of the write and its completion};
-
-\item\label{item:flush3} a VIRTIO_BLK_T_FLUSH request is sent \textbf{after the write is
- completed} and is completed itself.
-\end{enumerate}
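The three stability conditions above reduce to a boolean expression. The sketch below assumes the write has already completed and takes each fact as a flag supplied by the caller; write_is_stable and its parameter names are hypothetical.

```python
def write_is_stable(flush_offered, flush_negotiated, wce_negotiated,
                    writeback_zero_throughout, flushed_after_completion):
    """Return True if a completed write is stable under any of the three cases."""
    # Case 1: FLUSH was offered, but neither CONFIG_WCE nor FLUSH was negotiated.
    case1 = flush_offered and not flush_negotiated and not wce_negotiated
    # Case 2: CONFIG_WCE was negotiated and writeback stayed 0 from the
    # submission of the write until its completion.
    case2 = wce_negotiated and writeback_zero_throughout
    # Case 3: a flush was sent after the write completed and has itself completed.
    case3 = flushed_after_completion
    return case1 or case2 or case3
```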
-
-If the device is backed by persistent storage, the device MUST ensure that
-stable writes are committed to it, before reporting completion of the write
-(cases~\ref{item:flush1} and~\ref{item:flush2}) or the flush
-(case~\ref{item:flush3}). Failure to do so can cause data loss
-in case of a crash.
-
-If the driver changes \field{writeback} between the submission of the write
-and its completion, the write could be either volatile or stable when
-its completion is reported; in other words, the exact behavior is undefined.
-
-% According to the device requirements for device initialization:
-% Offer(CONFIG_WCE) => Offer(FLUSH).
-%
-% After reversing the implication:
-% not Offer(FLUSH) => not Offer(CONFIG_WCE).
-
-If VIRTIO_BLK_F_FLUSH was not offered by the
- device\footnote{Note that in this case, according to
- \ref{devicenormative:Device Types / Block Device / Device Initialization},
- the device will not have offered VIRTIO_BLK_F_CONFIG_WCE either.}, the
-device MAY also commit writes to persistent device backend storage before
-reporting their completion. Unlike case~\ref{item:flush1}, however, this
-is not an absolute requirement of the specification.
-
-\begin{note}
- An implementation that does not offer VIRTIO_BLK_F_FLUSH and does not commit
- completed writes will not be resilient to data loss in case of crashes.
- Not offering VIRTIO_BLK_F_FLUSH is an absolute requirement
- for implementations that do not wish to be safe against such data losses.
-\end{note}
-
-If the device is backed by storage providing lifetime metrics (such as eMMC
-or UFS persistent storage), the device SHOULD offer the VIRTIO_BLK_F_LIFETIME
-flag. The flag MUST NOT be offered if the device is backed by storage for which
-the lifetime metrics described in this document cannot be obtained or for which
-such metrics have no useful meaning. If the metrics are offered, the device MUST NOT
-send any reserved values, as defined in this specification.
-
-\begin{note}
- The device lifetime metrics \field{pre_eol_info}, \field{device_lifetime_est_a}
- and \field{device_lifetime_est_b} are discussed in the JESD84-B50 specification.
-
- The complete JESD84-B50 is available at the JEDEC website (https://www.jedec.org)
- pursuant to JEDEC's licensing terms and conditions. This information is provided to
- simplify passthrough implementations from eMMC devices.
-\end{note}
-
-If the VIRTIO_BLK_F_ZONED feature is not negotiated, the device MUST reject
-VIRTIO_BLK_T_ZONE_REPORT, VIRTIO_BLK_T_ZONE_OPEN, VIRTIO_BLK_T_ZONE_CLOSE,
-VIRTIO_BLK_T_ZONE_FINISH, VIRTIO_BLK_T_ZONE_APPEND, VIRTIO_BLK_T_ZONE_RESET and
-VIRTIO_BLK_T_ZONE_RESET_ALL requests with VIRTIO_BLK_S_UNSUPP status.
-
-The following device requirements only apply if the VIRTIO_BLK_F_ZONED feature
-is negotiated.
-
-If a request of type VIRTIO_BLK_T_ZONE_OPEN, VIRTIO_BLK_T_ZONE_CLOSE,
-VIRTIO_BLK_T_ZONE_FINISH or VIRTIO_BLK_T_ZONE_RESET is issued for a Conventional
-zone (type VIRTIO_BLK_ZT_CONV), the device MUST complete the request with
-VIRTIO_BLK_S_ZONE_INVALID_CMD \field{status}.
-
-If the zone specified by the VIRTIO_BLK_T_ZONE_APPEND request is not a SWR zone,
-then the request SHALL be completed with VIRTIO_BLK_S_ZONE_INVALID_CMD
-\field{status}.
-
-The device handles a VIRTIO_BLK_T_ZONE_OPEN request by attempting to change the
-state of the zone with the \field{sector} address to VIRTIO_BLK_ZS_EOPEN. If the
-transition to this state can not be performed, the request MUST be completed
-with VIRTIO_BLK_S_ZONE_INVALID_CMD \field{status}. If, while processing this
-request, the available zone resources are insufficient, then the zone state does
-not change and the request MUST be completed with
-VIRTIO_BLK_S_ZONE_OPEN_RESOURCE or VIRTIO_BLK_S_ZONE_ACTIVE_RESOURCE value in
-the field \field{status}.
-
-The device handles a VIRTIO_BLK_T_ZONE_CLOSE request by attempting to change the
-state of the zone with the \field{sector} address to VIRTIO_BLK_ZS_CLOSED. If
-the transition to this state can not be performed, the request MUST be completed
-with VIRTIO_BLK_S_ZONE_INVALID_CMD value in the field \field{status}.
-
-The device handles a VIRTIO_BLK_T_ZONE_FINISH request by attempting to change
-the state of the zone with the \field{sector} address to VIRTIO_BLK_ZS_FULL. If
-the transition to this state can not be performed, the zone state does not
-change and the request MUST be completed with VIRTIO_BLK_S_ZONE_INVALID_CMD
-value in the field \field{status}.
-
-The device handles a VIRTIO_BLK_T_ZONE_RESET request by attempting to change the
-state of the zone with the \field{sector} address to VIRTIO_BLK_ZS_EMPTY state.
-If the transition to this state can not be performed, the zone state does not
-change and the request MUST be completed with VIRTIO_BLK_S_ZONE_INVALID_CMD
-value in the field \field{status}.
-
-The device handles a VIRTIO_BLK_T_ZONE_RESET_ALL request by transitioning all
-sequential device zones in VIRTIO_BLK_ZS_IOPEN, VIRTIO_BLK_ZS_EOPEN,
-VIRTIO_BLK_ZS_CLOSED and VIRTIO_BLK_ZS_FULL state to VIRTIO_BLK_ZS_EMPTY state.
-
-Upon receiving a VIRTIO_BLK_T_ZONE_APPEND request or a VIRTIO_BLK_T_OUT
-request issued to a SWR zone in VIRTIO_BLK_ZS_EMPTY or VIRTIO_BLK_ZS_CLOSED
-state, the device attempts to perform the transition of the zone to
-VIRTIO_BLK_ZS_IOPEN state before writing data. This transition may fail due to
-insufficient open and/or active zone resources available on the device. In this
-case, the request MUST be completed with VIRTIO_BLK_S_ZONE_OPEN_RESOURCE or
-VIRTIO_BLK_S_ZONE_ACTIVE_RESOURCE value in the \field{status}.
-
-If the \field{sector} field in the VIRTIO_BLK_T_ZONE_APPEND request does not
-specify the lowest sector for a zone, then the request SHALL be completed with
-VIRTIO_BLK_S_ZONE_INVALID_CMD value in \field{status}.
-
-If a VIRTIO_BLK_T_ZONE_APPEND request or a VIRTIO_BLK_T_OUT request has a
-data range that exceeds the remaining writable capacity for the zone, then the
-request SHALL be completed with VIRTIO_BLK_S_ZONE_INVALID_CMD value in
-\field{status}.
-
-If a request of the type VIRTIO_BLK_T_ZONE_APPEND is completed with
-VIRTIO_BLK_S_OK status, the field \field{append_sector} in
-\field{virtio_blk_req_za} MUST be set by the device to contain the first sector
-of the data written to the zone.
-
-If a request of the type VIRTIO_BLK_T_ZONE_APPEND is completed with a status
-other than VIRTIO_BLK_S_OK, the value of \field{append_sector} field in
-\field{virtio_blk_req_za} is undefined.
-
-If a VIRTIO_BLK_T_ZONE_APPEND request has a data size that exceeds the
-\field{max_append_sectors} configuration space value, then
-\begin{itemize}
-\item if \field{max_append_sectors} configuration space value is reported as
- zero by the device, the request SHALL be completed with VIRTIO_BLK_S_UNSUPP
- \field{status}.
-
-\item if \field{max_append_sectors} configuration space value is reported as
- a non-zero value by the device, the request SHALL be completed with
- VIRTIO_BLK_S_ZONE_INVALID_CMD \field{status}.
-\end{itemize}
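The two cases above can be sketched as a small status selector. The helper append_size_status is hypothetical, and it assumes the request size is already expressed in sectors so that it can be compared directly with max_append_sectors.

```python
def append_size_status(data_sectors, max_append_sectors):
    """Return the status name for an oversized zone append, or None if the size is allowed."""
    if data_sectors <= max_append_sectors:
        return None  # size is within the limit; other checks still apply
    if max_append_sectors == 0:
        # The device reported a zero limit.
        return "VIRTIO_BLK_S_UNSUPP"
    # A non-zero limit was reported and the request exceeds it.
    return "VIRTIO_BLK_S_ZONE_INVALID_CMD"
```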
-
-If a VIRTIO_BLK_T_ZONE_APPEND request, a VIRTIO_BLK_T_IN request or a
-VIRTIO_BLK_T_OUT request issued to a SWR zone has the range that has sectors in
-more than one zone, then the request SHALL be completed with
-VIRTIO_BLK_S_ZONE_INVALID_CMD value in the field \field{status}.
-
-If a VIRTIO_BLK_T_OUT request has a \field{sector} value that is not aligned
-with the write pointer for the zone, then the request SHALL be completed with
-VIRTIO_BLK_S_ZONE_UNALIGNED_WP value in the field \field{status}.
-
-In order to avoid resource-related errors while opening zones implicitly, the
-device MAY automatically transition zones in VIRTIO_BLK_ZS_IOPEN state to
-VIRTIO_BLK_ZS_CLOSED state.
-
-All VIRTIO_BLK_T_OUT requests or VIRTIO_BLK_T_ZONE_APPEND requests issued
-to a zone in the VIRTIO_BLK_ZS_RDONLY state SHALL be completed with
-VIRTIO_BLK_S_ZONE_INVALID_CMD \field{status}.
-
-All requests issued to a zone in the VIRTIO_BLK_ZS_OFFLINE state SHALL be
-completed with VIRTIO_BLK_S_ZONE_INVALID_CMD value in the field \field{status}.
-
-The device MUST consider the sectors that are read between the write pointer
-position of a zone and the end of the last sector of the zone as unwritten data.
-The sectors between the write pointer position and the end of the last sector
-within the zone capacity during VIRTIO_BLK_T_ZONE_FINISH request processing are
-also considered unwritten data.
-
-When unwritten data is present in the sector range of a read request, the device
-MUST process this data in one of the following ways:
-
-\begin{enumerate}
-\item Fill the unwritten data with a device-specific byte pattern. The
-configuration, control and reporting of this byte pattern is beyond the scope
-of this standard. This is the preferred approach.
-
-\item Fail the request. Depending on the driver implementation, this may prevent
-the device from becoming operational.
-\end{enumerate}
-
-If both the VIRTIO_BLK_F_ZONED and VIRTIO_BLK_F_SECURE_ERASE features are
-negotiated, then
-
-\begin{enumerate}
-\item the field \field{secure_erase_sector_alignment} in the configuration space
-of the device MUST be a multiple of \field{zone_sectors} value reported in the
-device configuration space.
-
-\item the data size in VIRTIO_BLK_T_SECURE_ERASE requests MUST be a multiple of
-\field{zone_sectors} value in the device configuration space.
-\end{enumerate}
-
-The device MUST handle a VIRTIO_BLK_T_SECURE_ERASE request in the same way it
-handles VIRTIO_BLK_T_ZONE_RESET request for the zone range specified in the
-VIRTIO_BLK_T_SECURE_ERASE request.
-
-\subsubsection{Legacy Interface: Device Operation}\label{sec:Device Types / Block Device / Device Operation / Legacy Interface: Device Operation}
-When using the legacy interface, transitional devices and drivers
-MUST format the fields in struct virtio_blk_req
-according to the native endian of the guest rather than
-(necessarily when not using the legacy interface) little-endian.
-
-When using the legacy interface, transitional drivers
-SHOULD ignore the used length values.
-\begin{note}
-Historically, some devices put the total descriptor length,
-or the total length of device-writable buffers there,
-even when only the status byte was actually written.
-\end{note}
-
-The \field{reserved} field was previously called \field{ioprio}. \field{ioprio}
-is a hint about the relative priorities of requests to the device:
-higher numbers indicate more important requests.
-
-\begin{lstlisting}
-#define VIRTIO_BLK_T_FLUSH_OUT 5
-\end{lstlisting}
-
-The command VIRTIO_BLK_T_FLUSH_OUT was a synonym for VIRTIO_BLK_T_FLUSH;
-a driver MUST treat it as a VIRTIO_BLK_T_FLUSH command.
-
-\begin{lstlisting}
-#define VIRTIO_BLK_T_BARRIER 0x80000000
-\end{lstlisting}
-
-If the device has the VIRTIO_BLK_F_BARRIER
-feature, the high bit (VIRTIO_BLK_T_BARRIER) indicates that this
-request acts as a barrier and that all preceding requests SHOULD be
-complete before this one, and all following requests SHOULD NOT be
-started until this is complete.
-
-\begin{note} A barrier does not flush
-caches in the underlying backend device in host, and thus does not
-serve as data consistency guarantee. Only a VIRTIO_BLK_T_FLUSH request
-does that.
-\end{note}
-
-Some older legacy devices did not commit completed writes to persistent
-device backend storage when VIRTIO_BLK_F_FLUSH was offered but not
-negotiated. In order to work around this, the driver MAY set the
-\field{writeback} to 0 (if available) or it MAY send an explicit flush
-request after every completed write.
-
-If the device has the VIRTIO_BLK_F_SCSI feature, it can also support
-scsi packet command requests. Each of these requests has the form:
-
-\begin{lstlisting}
-/* All fields are in guest's native endian. */
-struct virtio_scsi_pc_req {
- u32 type;
- u32 ioprio;
- u64 sector;
- u8 cmd[];
- u8 data[][512];
-#define SCSI_SENSE_BUFFERSIZE 96
- u8 sense[SCSI_SENSE_BUFFERSIZE];
- u32 errors;
- u32 data_len;
- u32 sense_len;
- u32 residual;
- u8 status;
-};
-\end{lstlisting}
-
-A request type can also be a scsi packet command (VIRTIO_BLK_T_SCSI_CMD or
-VIRTIO_BLK_T_SCSI_CMD_OUT). The two types are equivalent, the device
-does not distinguish between them:
-
-\begin{lstlisting}
-#define VIRTIO_BLK_T_SCSI_CMD 2
-#define VIRTIO_BLK_T_SCSI_CMD_OUT 3
-\end{lstlisting}
-
-The \field{cmd} field is only present for scsi packet command requests,
-and indicates the command to perform. This field MUST reside in a
-single, separate device-readable buffer; command length can be derived
-from the length of this buffer.
-
-Note that these first three (four for scsi packet commands)
-fields are always device-readable: \field{data} is either device-readable
-or device-writable, depending on the request. The size of the read or
-write can be derived from the total size of the request buffers.
-
-\field{sense} is only present for scsi packet command requests,
-and indicates the buffer for scsi sense data.
-
-\field{data_len} is only present for scsi packet command
-requests, this field is deprecated, and SHOULD be ignored by the
-driver. Historically, devices copied data length there.
-
-\field{sense_len} is only present for scsi packet command
-requests and indicates the number of bytes actually written to
-the \field{sense} buffer.
-
-\field{residual} field is only present for scsi packet command
-requests and indicates the residual size, calculated as data
-length - number of bytes actually transferred.
-
-\subsubsection{Legacy Interface: Framing Requirements}\label{sec:Device Types / Block Device / Legacy Interface: Framing Requirements}
-
-When using legacy interfaces, transitional drivers which have not
-negotiated VIRTIO_F_ANY_LAYOUT:
-
-\begin{itemize}
-\item MUST use a single 8-byte descriptor containing \field{type},
- \field{reserved} and \field{sector}, followed by descriptors
- for \field{data}, then finally a separate 1-byte descriptor
- for \field{status}.
-
-\item For SCSI commands there are additional constraints.
- \field{sense} MUST reside in a
- single separate device-writable descriptor of size 96 bytes,
- and \field{errors}, \field{data_len}, \field{sense_len} and
- \field{residual} MUST reside in a single separate
- device-writable descriptor.
-\end{itemize}
-
-See \ref{sec:Basic Facilities of a Virtio Device / Virtqueues / Message Framing}.
+\input{device-types/virtio-block/description.tex}
\section{Console Device}\label{sec:Device Types / Console Device}
diff --git a/device-types/virtio-block/description.tex b/device-types/virtio-block/description.tex
new file mode 100644
index 0000000..20007e3
--- /dev/null
+++ b/device-types/virtio-block/description.tex
@@ -0,0 +1,1313 @@
+\section{Block Device}\label{sec:Device Types / Block Device}
+
+The virtio block device is a simple virtual block device (i.e.,
+disk). Read and write requests (and other exotic requests) are
+placed in one of its queues, and serviced (probably out of order) by the
+device except where noted.
+
+\subsection{Device ID}\label{sec:Device Types / Block Device / Device ID}
+ 2
+
+\subsection{Virtqueues}\label{sec:Device Types / Block Device / Virtqueues}
+\begin{description}
+\item[0] requestq1
+\item[\ldots]
+\item[N-1] requestqN
+\end{description}
+
+ N=1 if VIRTIO_BLK_F_MQ is not negotiated, otherwise N is set by
+ \field{num_queues}.
+
+\subsection{Feature bits}\label{sec:Device Types / Block Device / Feature bits}
+
+\begin{description}
+\item[VIRTIO_BLK_F_SIZE_MAX (1)] Maximum size of any single segment is
+ in \field{size_max}.
+
+\item[VIRTIO_BLK_F_SEG_MAX (2)] Maximum number of segments in a
+ request is in \field{seg_max}.
+
+\item[VIRTIO_BLK_F_GEOMETRY (4)] Disk-style geometry specified in
+ \field{geometry}.
+
+\item[VIRTIO_BLK_F_RO (5)] Device is read-only.
+
+\item[VIRTIO_BLK_F_BLK_SIZE (6)] Block size of disk is in \field{blk_size}.
+
+\item[VIRTIO_BLK_F_FLUSH (9)] Cache flush command support.
+
+\item[VIRTIO_BLK_F_TOPOLOGY (10)] Device exports information on optimal I/O
+ alignment.
+
+\item[VIRTIO_BLK_F_CONFIG_WCE (11)] Device can toggle its cache between writeback
+ and writethrough modes.
+
+\item[VIRTIO_BLK_F_MQ (12)] Device supports multiqueue.
+
+\item[VIRTIO_BLK_F_DISCARD (13)] Device can support discard command, maximum
+ discard sectors size in \field{max_discard_sectors} and maximum discard
+ segment number in \field{max_discard_seg}.
+
+\item[VIRTIO_BLK_F_WRITE_ZEROES (14)] Device can support write zeroes command,
+ maximum write zeroes sectors size in \field{max_write_zeroes_sectors} and
+ maximum write zeroes segment number in \field{max_write_zeroes_seg}.
+
+\item[VIRTIO_BLK_F_LIFETIME (15)] Device supports providing storage lifetime
+ information.
+
+\item[VIRTIO_BLK_F_SECURE_ERASE (16)] Device supports secure erase command,
+ maximum erase sectors count in \field{max_secure_erase_sectors} and
+ maximum erase segment number in \field{max_secure_erase_seg}.
+
+\item[VIRTIO_BLK_F_ZONED (17)] Device is a Zoned Block Device, that is, a device
+ that follows the zoned storage device behavior that is also supported by
+ industry standards such as the T10 Zoned Block Command standard (ZBC r05) or
+ the NVMe(TM) NVM Express Zoned Namespace Command Set Specification 1.1b
+ (ZNS). For brevity, these standard documents are referred to as "ZBD standards"
+ from this point on in the text.
+
+\end{description}
+
+\subsubsection{Legacy Interface: Feature bits}\label{sec:Device Types / Block Device / Feature bits / Legacy Interface: Feature bits}
+
+\begin{description}
+\item[VIRTIO_BLK_F_BARRIER (0)] Device supports request barriers.
+
+\item[VIRTIO_BLK_F_SCSI (7)] Device supports scsi packet commands.
+\end{description}
+
+\begin{note}
+ In the legacy interface, VIRTIO_BLK_F_FLUSH was also
+ called VIRTIO_BLK_F_WCE.
+\end{note}
+
+\subsection{Device configuration layout}\label{sec:Device Types / Block Device / Device configuration layout}
+
+The \field{capacity} of the device (expressed in 512-byte sectors) is always
+present. The availability of the other fields depends on various feature
+bits as indicated above.
+
+The field \field{num_queues} only exists if VIRTIO_BLK_F_MQ is set. This field specifies
+the number of queues.
+
+The parameters in the configuration space of the device \field{max_discard_sectors} and
+\field{discard_sector_alignment} are expressed in 512-byte units if the
+VIRTIO_BLK_F_DISCARD feature bit is negotiated. The \field{max_write_zeroes_sectors}
+is expressed in 512-byte units if the VIRTIO_BLK_F_WRITE_ZEROES feature
+bit is negotiated. The parameters in the configuration space of the device
+\field{max_secure_erase_sectors} and \field{secure_erase_sector_alignment} are expressed
+in 512-byte units if the VIRTIO_BLK_F_SECURE_ERASE feature bit is negotiated.
+
+If the VIRTIO_BLK_F_ZONED feature is negotiated, then in
+\field{virtio_blk_zoned_characteristics},
+\begin{itemize}
+\item \field{zone_sectors} value is expressed in 512-byte sectors.
+\item \field{max_append_sectors} value is expressed in 512-byte sectors.
+\item \field{write_granularity} value is expressed in bytes.
+\end{itemize}
+
+The \field{model} field in \field{zoned} may have the following values:
+
+\begin{lstlisting}
+#define VIRTIO_BLK_Z_NONE 0
+#define VIRTIO_BLK_Z_HM 1
+#define VIRTIO_BLK_Z_HA 2
+\end{lstlisting}
+
+Depending on their design, zoned block devices may follow several possible
+models of operation. The three models that are standardized for ZBDs are
+drive-managed, host-managed and host-aware.
+
+While being zoned internally, drive-managed ZBDs behave exactly like regular,
+non-zoned block devices. For the purposes of virtio standardization,
+drive-managed ZBDs can always be treated as non-zoned devices. These devices
+have the VIRTIO_BLK_Z_NONE model value set in the \field{model} field in
+\field{zoned}.
+
+Devices that offer the VIRTIO_BLK_F_ZONED feature while reporting the
+VIRTIO_BLK_Z_NONE zoned model are drive-managed zoned block devices. In this
+case, the driver treats the device as a regular non-zoned block device.
+
+Host-managed zoned block devices have their LBA range divided into Sequential
+Write Required (SWR) zones that require some additional handling by the host
+for correct operation. All write requests to SWR zones are required to be
+sequential and zones containing some written data need to be reset before that
+data can be rewritten. Host-managed devices support a set of ZBD-specific I/O
+requests that can be used by the host to manage device zones. Host-managed
+devices report VIRTIO_BLK_Z_HM in the \field{model} field in \field{zoned}.
+
+Host-aware zoned block devices have their LBA range divided into Sequential
+Write Preferred (SWP) zones that support random write access, similar to
+regular non-zoned devices. However, the device I/O performance might not be
+optimal if SWP zones are used in a random I/O pattern. SWP zones also support
+the same set of ZBD-specific I/O requests as host-managed devices that allow
+host-aware devices to be managed by any host that supports zoned block devices
+to achieve its optimum performance. Host-aware devices report VIRTIO_BLK_Z_HA
+in the \field{model} field in \field{zoned}.
+
+Both SWR zones and SWP zones are sometimes referred to as sequential zones.
+
+During device operation, sequential zones can be in one of the following states:
+empty, implicitly-open, explicitly-open, closed and full. The state machine that
+governs the transitions between these states is described later in this document.
+
+SWR and SWP zones consume volatile device resources while being in certain
+states and the device may set limits on the number of zones that can be in these
+states simultaneously.
+
+Zoned block devices use two internal counters to account for the device
+resources in use, the number of currently open zones and the number of currently
+active zones.
+
+Any zone state transition from a state that doesn't consume a zone resource to a
+state that consumes the same resource increments the internal device counter for
+that resource. Any zone transition out of a state that consumes a zone resource
+to a state that doesn't consume the same resource decrements the counter. Any
+request that causes the device to exceed the reported zone resource limits is
+terminated by the device with a "zone resources exceeded" error as defined for
+specific commands later.
+
+\begin{lstlisting}
+struct virtio_blk_config {
+ le64 capacity;
+ le32 size_max;
+ le32 seg_max;
+ struct virtio_blk_geometry {
+ le16 cylinders;
+ u8 heads;
+ u8 sectors;
+ } geometry;
+ le32 blk_size;
+ struct virtio_blk_topology {
+ // # of logical blocks per physical block (log2)
+ u8 physical_block_exp;
+ // offset of first aligned logical block
+ u8 alignment_offset;
+ // suggested minimum I/O size in blocks
+ le16 min_io_size;
+ // optimal (suggested maximum) I/O size in blocks
+ le32 opt_io_size;
+ } topology;
+ u8 writeback;
+ u8 unused0;
+ u16 num_queues;
+ le32 max_discard_sectors;
+ le32 max_discard_seg;
+ le32 discard_sector_alignment;
+ le32 max_write_zeroes_sectors;
+ le32 max_write_zeroes_seg;
+ u8 write_zeroes_may_unmap;
+ u8 unused1[3];
+ le32 max_secure_erase_sectors;
+ le32 max_secure_erase_seg;
+ le32 secure_erase_sector_alignment;
+ struct virtio_blk_zoned_characteristics {
+ le32 zone_sectors;
+ le32 max_open_zones;
+ le32 max_active_zones;
+ le32 max_append_sectors;
+ le32 write_granularity;
+ u8 model;
+ u8 unused2[3];
+ } zoned;
+};
+\end{lstlisting}
+
+
+\subsubsection{Legacy Interface: Device configuration layout}\label{sec:Device Types / Block Device / Device configuration layout / Legacy Interface: Device configuration layout}
+When using the legacy interface, transitional devices and drivers
+MUST format the fields in struct virtio_blk_config
+according to the native endian of the guest rather than
+(necessarily when not using the legacy interface) little-endian.
+
+
+\subsection{Device Initialization}\label{sec:Device Types / Block Device / Device Initialization}
+
+\begin{enumerate}
+\item The device size can be read from \field{capacity}.
+
+\item If the VIRTIO_BLK_F_BLK_SIZE feature is negotiated,
+ \field{blk_size} can be read to determine the optimal sector size
+ for the driver to use. This does not affect the units used in
+ the protocol (always 512 bytes), but awareness of the correct
+ value can affect performance.
+
+\item If the VIRTIO_BLK_F_RO feature is set by the device, any write
+ requests will fail.
+
+\item If the VIRTIO_BLK_F_TOPOLOGY feature is negotiated, the fields in the
+ \field{topology} struct can be read to determine the physical block size and optimal
+ I/O lengths for the driver to use. This also does not affect the units
+ in the protocol, only performance.
+
+\item If the VIRTIO_BLK_F_CONFIG_WCE feature is negotiated, the cache
+ mode can be read or set through the \field{writeback} field. 0 corresponds
+ to a writethrough cache, 1 to a writeback cache\footnote{Consistent with
+ \ref{devicenormative:Device Types / Block Device / Device Operation},
+ a writethrough cache can be defined broadly as a cache that commits
+ writes to persistent device backend storage before reporting their
+ completion. For example, a battery-backed writeback cache actually
+ counts as writethrough according to this definition.}. The cache mode
+ after reset can be either writeback or writethrough. The actual
+ mode can be determined by reading \field{writeback} after feature
+ negotiation.
+
+\item If the VIRTIO_BLK_F_DISCARD feature is negotiated,
+ \field{max_discard_sectors} and \field{max_discard_seg} can be read
+ to determine the maximum discard sectors and maximum number of discard
+ segments for the block driver to use. \field{discard_sector_alignment}
+ can be used by the OS when splitting a request based on alignment.
+
+\item If the VIRTIO_BLK_F_WRITE_ZEROES feature is negotiated,
+ \field{max_write_zeroes_sectors} and \field{max_write_zeroes_seg} can
+ be read to determine the maximum write zeroes sectors and maximum
+ number of write zeroes segments for the block driver to use.
+
+\item If the VIRTIO_BLK_F_MQ feature is negotiated, \field{num_queues} field
+ can be read to determine the number of queues.
+
+\item If the VIRTIO_BLK_F_SECURE_ERASE feature is negotiated,
+ \field{max_secure_erase_sectors} and \field{max_secure_erase_seg} can be read
+ to determine the maximum secure erase sectors and maximum number of
+ secure erase segments for the block driver to use.
+ \field{secure_erase_sector_alignment} can be used by the OS when splitting a
+ request based on alignment.
+
+\item If the VIRTIO_BLK_F_ZONED feature is negotiated, the fields in
+ \field{zoned} can be read by the driver to determine the zone
+ characteristics of the device. All \field{zoned} fields are read-only.
+
+\end{enumerate}
+
+\drivernormative{\subsubsection}{Device Initialization}{Device Types / Block Device / Device Initialization}
+
+Drivers SHOULD NOT negotiate VIRTIO_BLK_F_FLUSH if they are incapable of
+sending VIRTIO_BLK_T_FLUSH commands.
+
+If neither VIRTIO_BLK_F_CONFIG_WCE nor VIRTIO_BLK_F_FLUSH is
+negotiated, the driver MAY deduce the presence of a writethrough cache.
+If VIRTIO_BLK_F_CONFIG_WCE was not negotiated but VIRTIO_BLK_F_FLUSH was,
+the driver SHOULD assume presence of a writeback cache.
+
+The driver MUST NOT read \field{writeback} before setting
+the FEATURES_OK \field{device status} bit.
+
+Drivers MUST NOT negotiate the VIRTIO_BLK_F_ZONED feature if they are incapable
+of supporting devices with the VIRTIO_BLK_Z_HM, VIRTIO_BLK_Z_HA or
+VIRTIO_BLK_Z_NONE zoned model.
+
+If the VIRTIO_BLK_F_ZONED feature is offered by the device with the
+VIRTIO_BLK_Z_HM zone model, then the VIRTIO_BLK_F_DISCARD feature MUST NOT be
+offered by the driver.
+
+If the VIRTIO_BLK_F_ZONED feature and VIRTIO_BLK_F_DISCARD feature are both
+offered by the device with the VIRTIO_BLK_Z_HA or VIRTIO_BLK_Z_NONE zone model,
+then the driver MAY negotiate these two bits independently.
+
+If the VIRTIO_BLK_F_ZONED feature is negotiated, then
+\begin{itemize}
+\item if a driver that cannot support host-managed zoned devices
+ reads VIRTIO_BLK_Z_HM from the \field{model} field of \field{zoned}, the
+ driver MUST NOT set the FEATURES_OK flag and instead set the FAILED bit.
+
+\item if a driver that cannot support zoned devices reads VIRTIO_BLK_Z_HA
+ from the \field{model} field of \field{zoned}, the driver
+ MAY handle the device as a non-zoned device. In this case, the
+ driver SHOULD ignore all other fields in \field{zoned}.
+\end{itemize}
+
+\devicenormative{\subsubsection}{Device Initialization}{Device Types / Block Device / Device Initialization}
+
+Devices SHOULD always offer VIRTIO_BLK_F_FLUSH, and MUST offer it
+if they offer VIRTIO_BLK_F_CONFIG_WCE.
+
+If VIRTIO_BLK_F_CONFIG_WCE is negotiated but VIRTIO_BLK_F_FLUSH
+is not, the device MUST initialize \field{writeback} to 0.
+
+The device MUST initialize padding bytes \field{unused0} and
+\field{unused1} to 0.
+
+If the device that is being initialized is not a zoned device, the device
+SHOULD NOT offer the VIRTIO_BLK_F_ZONED feature.
+
+The VIRTIO_BLK_F_ZONED feature cannot be properly negotiated without
+the FEATURES_OK bit. Legacy devices MUST NOT offer the VIRTIO_BLK_F_ZONED feature bit.
+
+If the VIRTIO_BLK_F_ZONED feature is not accepted by the driver,
+\begin{itemize}
+\item the device with the VIRTIO_BLK_Z_HA or VIRTIO_BLK_Z_NONE zone model SHOULD
+ proceed with the initialization while setting all zoned characteristics
+ fields to zero.
+
+\item the device with the VIRTIO_BLK_Z_HM zone model MUST fail to set the
+ FEATURES_OK device status bit when the driver writes the Device Status
+ field.
+\end{itemize}
+
+If the VIRTIO_BLK_F_ZONED feature is negotiated, then the \field{model} field in
+\field{zoned} struct in the configuration space MUST be set by the device
+\begin{itemize}
+\item to the value of VIRTIO_BLK_Z_NONE if it operates as a drive-managed
+ zoned block device or a non-zoned block device.
+
+\item to the value of VIRTIO_BLK_Z_HM if it operates as a host-managed zoned
+ block device.
+
+\item to the value of VIRTIO_BLK_Z_HA if it operates as a host-aware zoned
+ block device.
+\end{itemize}
+
+If the VIRTIO_BLK_F_ZONED feature is negotiated and the device \field{model}
+field in \field{zoned} struct is VIRTIO_BLK_Z_HM or VIRTIO_BLK_Z_HA,
+
+\begin{itemize}
+\item the \field{zone_sectors} field of \field{zoned} MUST be set by the device
+ to the size of a single zone on the device. All zones of the device have the
+ same size indicated by \field{zone_sectors} except for the last zone that
+ MAY be smaller than all other zones. The driver can calculate the number of
+ zones on the device as
+ \begin{lstlisting}
+ nr_zones = (capacity + zone_sectors - 1) / zone_sectors;
+ \end{lstlisting}
+ and the size of the last zone as
+ \begin{lstlisting}
+ zs_last = capacity - (nr_zones - 1) * zone_sectors;
+ \end{lstlisting}
+
+\item The \field{max_open_zones} field of the \field{zoned} structure MUST be
+ set by the device to the maximum number of zones that can be open on the
+ device (zones in the implicit open or explicit open state). A value
+ of zero indicates that the device does not have any limit on the number of
+ open zones.
+
+\item The \field{max_active_zones} field of the \field{zoned} structure MUST
+ be set by the device to the maximum number of zones that can be active on the
+ device (zones in the implicit open, explicit open or closed state). A value
+ of zero indicates that the device does not have any limit on the number of
+ active zones.
+
+\item the \field{max_append_sectors} field of \field{zoned} MUST be set by
+ the device to the maximum data size of a VIRTIO_BLK_T_ZONE_APPEND request
+ that can be successfully issued to the device. The value of this field MUST
+ NOT exceed the \field{seg_max} * \field{size_max} value. A device MAY set
+ the \field{max_append_sectors} to zero if it doesn't support
+ VIRTIO_BLK_T_ZONE_APPEND requests.
+
+\item the \field{write_granularity} field of \field{zoned} MUST be set by the
+ device to the offset and size alignment constraint for VIRTIO_BLK_T_OUT
+ and VIRTIO_BLK_T_ZONE_APPEND requests issued to a sequential zone of the
+ device.
+
+\item the device MUST initialize padding bytes \field{unused2} to 0.
+\end{itemize}
+
+\subsubsection{Legacy Interface: Device Initialization}\label{sec:Device Types / Block Device / Device Initialization / Legacy Interface: Device Initialization}
+
+Because legacy devices do not have FEATURES_OK, transitional devices
+MUST implement slightly different behavior around feature negotiation
+when used through the legacy interface. In particular, when using the
+legacy interface:
+
+\begin{itemize}
+\item the driver MAY read or write \field{writeback} before setting
+ the DRIVER or DRIVER_OK \field{device status} bit
+
+\item the device MUST NOT modify the cache mode (and \field{writeback})
+ as a result of a driver setting a status bit, unless
+ the DRIVER_OK bit is being set and the driver has not set the
+ VIRTIO_BLK_F_CONFIG_WCE driver feature bit.
+
+\item the device MUST NOT modify the cache mode (and \field{writeback})
+ as a result of a driver modifying the driver feature bits, for example
+ if the driver sets the VIRTIO_BLK_F_CONFIG_WCE driver feature bit but
+ does not set the VIRTIO_BLK_F_FLUSH bit.
+\end{itemize}
+
+
+\subsection{Device Operation}\label{sec:Device Types / Block Device / Device Operation}
+
+The driver queues requests to the virtqueues, and they are used by
+the device (not necessarily in order). Each request except
+VIRTIO_BLK_T_ZONE_APPEND is of form:
+
+\begin{lstlisting}
+struct virtio_blk_req {
+ le32 type;
+ le32 reserved;
+ le64 sector;
+ u8 data[];
+ u8 status;
+};
+\end{lstlisting}
+
+The type of the request is either a read (VIRTIO_BLK_T_IN), a write
+(VIRTIO_BLK_T_OUT), a discard (VIRTIO_BLK_T_DISCARD), a write zeroes
+(VIRTIO_BLK_T_WRITE_ZEROES), a flush (VIRTIO_BLK_T_FLUSH), a get device ID
+string command (VIRTIO_BLK_T_GET_ID), a secure erase
+(VIRTIO_BLK_T_SECURE_ERASE), or a get device lifetime command
+(VIRTIO_BLK_T_GET_LIFETIME).
+
+\begin{lstlisting}
+#define VIRTIO_BLK_T_IN 0
+#define VIRTIO_BLK_T_OUT 1
+#define VIRTIO_BLK_T_FLUSH 4
+#define VIRTIO_BLK_T_GET_ID 8
+#define VIRTIO_BLK_T_GET_LIFETIME 10
+#define VIRTIO_BLK_T_DISCARD 11
+#define VIRTIO_BLK_T_WRITE_ZEROES 13
+#define VIRTIO_BLK_T_SECURE_ERASE 14
+\end{lstlisting}
+
+The \field{sector} number indicates the offset (multiplied by 512) where
+the read or write is to occur. This field is unused and set to 0 for
+commands other than read, write and some zone operations.
+
+VIRTIO_BLK_T_IN requests populate \field{data} with the contents of sectors
+read from the block device (in multiples of 512 bytes). VIRTIO_BLK_T_OUT
+requests write the contents of \field{data} to the block device (in multiples
+of 512 bytes).
+
+The \field{data} used for discard, secure erase or write zeroes commands
+consists of one or more segments. The maximum number of segments is
+\field{max_discard_seg} for discard commands, \field{max_secure_erase_seg} for
+secure erase commands and \field{max_write_zeroes_seg} for write zeroes
+commands.
+Each segment is of form:
+
+\begin{lstlisting}
+struct virtio_blk_discard_write_zeroes {
+ le64 sector;
+ le32 num_sectors;
+ struct {
+ le32 unmap:1;
+ le32 reserved:31;
+ } flags;
+};
+\end{lstlisting}
+
+\field{sector} indicates the starting offset (in 512-byte units) of the
+segment, while \field{num_sectors} indicates the number of sectors in each
+discarded range. \field{unmap} is only used in write zeroes commands and allows
+the device to discard the specified range, provided that following reads return
+zeroes.
+
+VIRTIO_BLK_T_GET_ID requests fetch the device ID string from the device into
+\field{data}. The device ID string is a NUL-padded ASCII string up to 20 bytes
+long. If the string is 20 bytes long then there is no NUL terminator.
+
+The \field{data} used for VIRTIO_BLK_T_GET_LIFETIME requests is populated
+by the device, and is of the form
+
+\begin{lstlisting}
+struct virtio_blk_lifetime {
+ le16 pre_eol_info;
+ le16 device_lifetime_est_typ_a;
+ le16 device_lifetime_est_typ_b;
+};
+\end{lstlisting}
+
+The \field{pre_eol_info} specifies the percentage of reserved blocks
+that are consumed and will have one of these values:
+
+\begin{lstlisting}
+/* Value not available */
+#define VIRTIO_BLK_PRE_EOL_INFO_UNDEFINED 0
+/* < 80% of reserved blocks are consumed */
+#define VIRTIO_BLK_PRE_EOL_INFO_NORMAL 1
+/* 80% of reserved blocks are consumed */
+#define VIRTIO_BLK_PRE_EOL_INFO_WARNING 2
+/* 90% of reserved blocks are consumed */
+#define VIRTIO_BLK_PRE_EOL_INFO_URGENT 3
+/* All other values are reserved */
+\end{lstlisting}
+
+The \field{device_lifetime_est_typ_a} refers to wear of SLC cells and is provided
+in increments of 10\%, with 0 meaning undefined, 1 meaning up to 10\% of lifetime
+used, and so on, through to 11 meaning estimated lifetime exceeded.
+All values above 11 are reserved.
+
+The \field{device_lifetime_est_typ_b} refers to wear of MLC cells and is provided
+with the same semantics as \field{device_lifetime_est_typ_a}.
+
+The final \field{status} byte is written by the device: either
+VIRTIO_BLK_S_OK for success, VIRTIO_BLK_S_IOERR for device or driver
+error or VIRTIO_BLK_S_UNSUPP for a request unsupported by device:
+
+\begin{lstlisting}
+#define VIRTIO_BLK_S_OK 0
+#define VIRTIO_BLK_S_IOERR 1
+#define VIRTIO_BLK_S_UNSUPP 2
+\end{lstlisting}
+
+The status of individual segments is indeterminate when a discard or write zeroes
+command produces VIRTIO_BLK_S_IOERR. A segment may have completed
+successfully, failed, or not been processed by the device.
+
+The following requirements only apply if the VIRTIO_BLK_F_ZONED feature is
+negotiated.
+
+In addition to the request types defined for non-zoned devices, the type of the
+request can be a zone report (VIRTIO_BLK_T_ZONE_REPORT), an explicit zone open
+(VIRTIO_BLK_T_ZONE_OPEN), a zone close (VIRTIO_BLK_T_ZONE_CLOSE), a zone finish
+(VIRTIO_BLK_T_ZONE_FINISH), a zone append (VIRTIO_BLK_T_ZONE_APPEND), a zone
+reset (VIRTIO_BLK_T_ZONE_RESET) or a zone reset all
+(VIRTIO_BLK_T_ZONE_RESET_ALL).
+
+\begin{lstlisting}
+#define VIRTIO_BLK_T_ZONE_APPEND 15
+#define VIRTIO_BLK_T_ZONE_REPORT 16
+#define VIRTIO_BLK_T_ZONE_OPEN 18
+#define VIRTIO_BLK_T_ZONE_CLOSE 20
+#define VIRTIO_BLK_T_ZONE_FINISH 22
+#define VIRTIO_BLK_T_ZONE_RESET 24
+#define VIRTIO_BLK_T_ZONE_RESET_ALL 26
+\end{lstlisting}
+
+Requests of type VIRTIO_BLK_T_OUT, VIRTIO_BLK_T_ZONE_OPEN,
+VIRTIO_BLK_T_ZONE_CLOSE, VIRTIO_BLK_T_ZONE_FINISH, VIRTIO_BLK_T_ZONE_APPEND,
+VIRTIO_BLK_T_ZONE_RESET or VIRTIO_BLK_T_ZONE_RESET_ALL may be completed by the
+device with VIRTIO_BLK_S_OK, VIRTIO_BLK_S_IOERR or VIRTIO_BLK_S_UNSUPP
+\field{status}, or, additionally, with VIRTIO_BLK_S_ZONE_INVALID_CMD,
+VIRTIO_BLK_S_ZONE_UNALIGNED_WP, VIRTIO_BLK_S_ZONE_OPEN_RESOURCE or
+VIRTIO_BLK_S_ZONE_ACTIVE_RESOURCE ZBD-specific status codes.
+
+Besides the request status, VIRTIO_BLK_T_ZONE_APPEND requests return the
+starting sector of the appended data back to the driver. For this reason,
+the VIRTIO_BLK_T_ZONE_APPEND request uses an extended layout that adds
+the \field{append_sector} field to carry this value:
+
+\begin{lstlisting}
+struct virtio_blk_req_za {
+ le32 type;
+ le32 reserved;
+ le64 sector;
+ u8 data[];
+ le64 append_sector;
+ u8 status;
+};
+\end{lstlisting}
+
+\begin{lstlisting}
+#define VIRTIO_BLK_S_ZONE_INVALID_CMD 3
+#define VIRTIO_BLK_S_ZONE_UNALIGNED_WP 4
+#define VIRTIO_BLK_S_ZONE_OPEN_RESOURCE 5
+#define VIRTIO_BLK_S_ZONE_ACTIVE_RESOURCE 6
+\end{lstlisting}
+
+Requests of the type VIRTIO_BLK_T_ZONE_REPORT are reads and requests of the type
+VIRTIO_BLK_T_ZONE_APPEND are writes. VIRTIO_BLK_T_ZONE_OPEN,
+VIRTIO_BLK_T_ZONE_CLOSE, VIRTIO_BLK_T_ZONE_FINISH, VIRTIO_BLK_T_ZONE_RESET and
+VIRTIO_BLK_T_ZONE_RESET_ALL are non-data requests.
+
+Zone sector address is a 64-bit address of the first 512-byte sector of the
+zone.
+
+VIRTIO_BLK_T_ZONE_OPEN, VIRTIO_BLK_T_ZONE_CLOSE, VIRTIO_BLK_T_ZONE_FINISH and
+VIRTIO_BLK_T_ZONE_RESET requests act on the particular
+zone specified by the zone sector address in the \field{sector} of the request.
+
+A VIRTIO_BLK_T_ZONE_RESET_ALL request acts upon all applicable zones of the
+device. The \field{sector} value is not used for this request.
+
+In ZBD standards, the VIRTIO_BLK_T_ZONE_REPORT request belongs to the "Zone
+Management Receive" command category and VIRTIO_BLK_T_ZONE_OPEN,
+VIRTIO_BLK_T_ZONE_CLOSE, VIRTIO_BLK_T_ZONE_FINISH and
+VIRTIO_BLK_T_ZONE_RESET/VIRTIO_BLK_T_ZONE_RESET_ALL requests are categorized as
+"Zone Management Send" commands. VIRTIO_BLK_T_ZONE_APPEND is categorized
+separately from zone management commands and is the only request that uses
+the \field{append_sector} field of \field{virtio_blk_req_za} to return
+to the driver the sector at which the data has been appended to the zone.
+
+VIRTIO_BLK_T_ZONE_REPORT is a read request that returns the information about
+the current state of zones on the device starting from the zone containing the
+\field{sector} of the request. The report consists of a header followed by zero
+or more zone descriptors.
+
+A zone report reply has the following structure:
+
+\begin{lstlisting}
+struct virtio_blk_zone_report {
+ le64 nr_zones;
+ u8 reserved[56];
+ struct virtio_blk_zone_descriptor zones[];
+};
+\end{lstlisting}
+
+The device sets the \field{nr_zones} field in the report header to the number of
+fully transferred zone descriptors in the data buffer.
+
+A zone descriptor has the following structure:
+
+\begin{lstlisting}
+struct virtio_blk_zone_descriptor {
+ le64 z_cap;
+ le64 z_start;
+ le64 z_wp;
+ u8 z_type;
+ u8 z_state;
+ u8 reserved[38];
+};
+\end{lstlisting}
+
+The zone descriptor field \field{z_type} of \field{virtio_blk_zone_descriptor}
+indicates the type of the zone.
+
+The following zone types are available:
+
+\begin{lstlisting}
+#define VIRTIO_BLK_ZT_CONV 1
+#define VIRTIO_BLK_ZT_SWR 2
+#define VIRTIO_BLK_ZT_SWP 3
+\end{lstlisting}
+
+Read and write operations into zones with the VIRTIO_BLK_ZT_CONV (Conventional)
+type have the same behavior as read and write operations on a regular block
+device. Any block in a conventional zone can be read or written at any time and
+in any order.
+
+Zones with VIRTIO_BLK_ZT_SWR can be read randomly, but must be written
+sequentially at a certain point in the zone called the Write Pointer (WP). With
+every write, the Write Pointer is incremented by the number of sectors written.
+
+Zones with VIRTIO_BLK_ZT_SWP can be read randomly and should be written
+sequentially, similarly to SWR zones. However, SWP zones can accept random write
+operations, that is, VIRTIO_BLK_T_OUT requests with a start sector different
+from the zone write pointer position.
+
+The field \field{z_state} of \field{virtio_blk_zone_descriptor} indicates the
+state of the device zone.
+
+The following zone states are available:
+
+\begin{lstlisting}
+#define VIRTIO_BLK_ZS_NOT_WP 0
+#define VIRTIO_BLK_ZS_EMPTY 1
+#define VIRTIO_BLK_ZS_IOPEN 2
+#define VIRTIO_BLK_ZS_EOPEN 3
+#define VIRTIO_BLK_ZS_CLOSED 4
+#define VIRTIO_BLK_ZS_RDONLY 13
+#define VIRTIO_BLK_ZS_FULL 14
+#define VIRTIO_BLK_ZS_OFFLINE 15
+\end{lstlisting}
+
+Zones of the type VIRTIO_BLK_ZT_CONV are always reported by the device to be in
+the VIRTIO_BLK_ZS_NOT_WP state. Zones of the types VIRTIO_BLK_ZT_SWR and
+VIRTIO_BLK_ZT_SWP can not transition to the VIRTIO_BLK_ZS_NOT_WP state.
+
+Zones in VIRTIO_BLK_ZS_EMPTY (Empty), VIRTIO_BLK_ZS_IOPEN (Implicitly Open),
+VIRTIO_BLK_ZS_EOPEN (Explicitly Open) and VIRTIO_BLK_ZS_CLOSED (Closed) state
+are writable, but zones in VIRTIO_BLK_ZS_RDONLY (Read-Only), VIRTIO_BLK_ZS_FULL
+(Full) and VIRTIO_BLK_ZS_OFFLINE (Offline) state are not. The write pointer
+value (\field{z_wp}) is not valid for Read-Only, Full and Offline zones.
+
+The zone descriptor field \field{z_cap} contains the maximum number of 512-byte
+sectors that are available to be written with user data when the zone is in the
+Empty state. This value shall be less than or equal to the \field{zone_sectors}
+value in \field{virtio_blk_zoned_characteristics} structure in the device
+configuration space.
+
+The zone descriptor field \field{z_start} contains the zone sector address.
+
+The zone descriptor field \field{z_wp} contains the sector address where the
+next write operation for this zone should be issued. This value is undefined
+for conventional zones and for zones in VIRTIO_BLK_ZS_RDONLY,
+VIRTIO_BLK_ZS_FULL and VIRTIO_BLK_ZS_OFFLINE state.
+
+Depending on their state, zones consume resources as follows:
+\begin{itemize}
+\item a zone in VIRTIO_BLK_ZS_IOPEN or VIRTIO_BLK_ZS_EOPEN state consumes one
+ open zone resource and, additionally,
+
+\item a zone in VIRTIO_BLK_ZS_IOPEN, VIRTIO_BLK_ZS_EOPEN or
+ VIRTIO_BLK_ZS_CLOSED state consumes one active zone resource.
+\end{itemize}
+
+Attempts for zone transitions that violate zone resource limits must fail with
+VIRTIO_BLK_S_ZONE_OPEN_RESOURCE or VIRTIO_BLK_S_ZONE_ACTIVE_RESOURCE
+\field{status}.
+
+Zones in the VIRTIO_BLK_ZS_EMPTY (Empty) state have the write pointer value
+equal to the sector address of the zone. In this state, the entire capacity of
+the zone is available for writing. A zone can transition from this state to
+\begin{itemize}
+\item VIRTIO_BLK_ZS_IOPEN when a successful VIRTIO_BLK_T_OUT request or
+ VIRTIO_BLK_T_ZONE_APPEND with a non-zero data size is received for the zone.
+
+\item VIRTIO_BLK_ZS_EOPEN when a successful VIRTIO_BLK_T_ZONE_OPEN request is
+ received for the zone.
+\end{itemize}
+
+When a VIRTIO_BLK_T_ZONE_RESET request is issued to an Empty zone, the request
+is completed successfully and the zone stays in the VIRTIO_BLK_ZS_EMPTY state.
+
+Zones in the VIRTIO_BLK_ZS_IOPEN (Implicitly Open) state transition from
+this state to
+\begin{itemize}
+\item VIRTIO_BLK_ZS_EMPTY when a successful VIRTIO_BLK_T_ZONE_RESET request is
+ received for the zone,
+
+\item VIRTIO_BLK_ZS_EMPTY when a successful VIRTIO_BLK_T_ZONE_RESET_ALL request
+ is received by the device,
+
+\item VIRTIO_BLK_ZS_EOPEN when a successful VIRTIO_BLK_T_ZONE_OPEN request is
+ received for the zone,
+
+\item VIRTIO_BLK_ZS_CLOSED when a successful VIRTIO_BLK_T_ZONE_CLOSE request is
+ received for the zone,
+
+\item VIRTIO_BLK_ZS_CLOSED implicitly by the device when another zone is
+ entering the VIRTIO_BLK_ZS_IOPEN or VIRTIO_BLK_ZS_EOPEN state and the number
+ of currently open zones is at \field{max_open_zones} limit,
+
+\item VIRTIO_BLK_ZS_FULL when a successful VIRTIO_BLK_T_ZONE_FINISH request is
+ received for the zone,
+
+\item VIRTIO_BLK_ZS_FULL when a successful VIRTIO_BLK_T_OUT or
+ VIRTIO_BLK_T_ZONE_APPEND request that causes the zone to reach its writable
+ capacity is received for the zone.
+\end{itemize}
+
+Zones in the VIRTIO_BLK_ZS_EOPEN (Explicitly Open) state transition from
+this state to
+\begin{itemize}
+\item VIRTIO_BLK_ZS_EMPTY when a successful VIRTIO_BLK_T_ZONE_RESET request is
+ received for the zone,
+
+\item VIRTIO_BLK_ZS_EMPTY when a successful VIRTIO_BLK_T_ZONE_RESET_ALL request
+ is received by the device,
+
+\item VIRTIO_BLK_ZS_EMPTY when a successful VIRTIO_BLK_T_ZONE_CLOSE request is
+ received for the zone and the write pointer of the zone has the value equal
+ to the start sector of the zone,
+
+\item VIRTIO_BLK_ZS_CLOSED when a successful VIRTIO_BLK_T_ZONE_CLOSE request is
+ received for the zone and the zone write pointer is larger than the start
+ sector of the zone,
+
+\item VIRTIO_BLK_ZS_FULL when a successful VIRTIO_BLK_T_ZONE_FINISH request is
+ received for the zone,
+
+\item VIRTIO_BLK_ZS_FULL when a successful VIRTIO_BLK_T_OUT or
+ VIRTIO_BLK_T_ZONE_APPEND request that causes the zone to reach its writable
+ capacity is received for the zone.
+\end{itemize}
+
+When a VIRTIO_BLK_T_ZONE_OPEN request is issued to an Explicitly Open zone, the
+request is completed successfully and the zone stays in the VIRTIO_BLK_ZS_EOPEN
+state.
+
+Zones in the VIRTIO_BLK_ZS_CLOSED (Closed) state transition from this state
+to
+\begin{itemize}
+\item VIRTIO_BLK_ZS_EMPTY when a successful VIRTIO_BLK_T_ZONE_RESET request is
+ received for the zone,
+
+\item VIRTIO_BLK_ZS_EMPTY when a successful VIRTIO_BLK_T_ZONE_RESET_ALL request
+ is received by the device,
+
+\item VIRTIO_BLK_ZS_IOPEN when a successful VIRTIO_BLK_T_OUT request or
+ VIRTIO_BLK_T_ZONE_APPEND with a non-zero data size is received for the zone,
+
+\item VIRTIO_BLK_ZS_EOPEN when a successful VIRTIO_BLK_T_ZONE_OPEN request is
+ received for the zone.
+\end{itemize}
+
+When a VIRTIO_BLK_T_ZONE_CLOSE request is issued to a Closed zone, the request
+is completed successfully and the zone stays in the VIRTIO_BLK_ZS_CLOSED state.
+
+Zones in the VIRTIO_BLK_ZS_FULL (Full) state transition from this state to
+VIRTIO_BLK_ZS_EMPTY when a successful VIRTIO_BLK_T_ZONE_RESET request is
+received for the zone or a successful VIRTIO_BLK_T_ZONE_RESET_ALL request is
+received by the device.
+
+When a VIRTIO_BLK_T_ZONE_FINISH request is issued to a Full zone, the request
+is completed successfully and the zone stays in the VIRTIO_BLK_ZS_FULL state.
+
+The device may automatically transition zones to VIRTIO_BLK_ZS_RDONLY
+(Read-Only) or VIRTIO_BLK_ZS_OFFLINE (Offline) state from any other state. The
+device may also automatically transition zones in the Read-Only state to the
+Offline state. Zones in the Offline state may not transition to any other state.
+Such automatic transitions usually indicate hardware failures. The previously
+written data may only be read from zones in the Read-Only state. Zones in the
+Offline state can not be read or written.
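The transitions listed above for the Implicitly Open, Explicitly Open, Closed and Full states can be sketched as a small lookup function. This is only an illustration: the enum names, the event grouping and the function `zone_next_state` are hypothetical, the numeric enum values carry no meaning, and any state/event pair not listed in the text above is simply left unchanged, which says nothing about whether a device accepts such a request. Device-initiated transitions (implicit close, Read-Only, Offline) are not modeled.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names mirroring the zone states in the text. */
enum zone_state {
    ZS_EMPTY, ZS_IOPEN, ZS_EOPEN, ZS_CLOSED, ZS_FULL, ZS_RDONLY, ZS_OFFLINE
};

/* Successful driver requests, grouped for this sketch. */
enum zone_event {
    EV_RESET,      /* ZONE_RESET for the zone, or ZONE_RESET_ALL */
    EV_OPEN,       /* ZONE_OPEN */
    EV_CLOSE,      /* ZONE_CLOSE */
    EV_FINISH,     /* ZONE_FINISH */
    EV_WRITE,      /* T_OUT or ZONE_APPEND with non-zero data size */
    EV_WRITE_FULL  /* a write that reaches the writable capacity */
};

/* wp_at_start: true when the write pointer equals the zone start sector. */
enum zone_state zone_next_state(enum zone_state cur, enum zone_event ev,
                                bool wp_at_start)
{
    switch (cur) {
    case ZS_IOPEN:
        switch (ev) {
        case EV_RESET:      return ZS_EMPTY;
        case EV_OPEN:       return ZS_EOPEN;
        case EV_CLOSE:      return ZS_CLOSED;
        case EV_FINISH:
        case EV_WRITE_FULL: return ZS_FULL;
        default:            return cur;
        }
    case ZS_EOPEN:
        switch (ev) {
        case EV_RESET:      return ZS_EMPTY;
        case EV_CLOSE:      return wp_at_start ? ZS_EMPTY : ZS_CLOSED;
        case EV_FINISH:
        case EV_WRITE_FULL: return ZS_FULL;
        default:            return cur; /* ZONE_OPEN keeps the zone in EOPEN */
        }
    case ZS_CLOSED:
        switch (ev) {
        case EV_RESET:      return ZS_EMPTY;
        case EV_OPEN:       return ZS_EOPEN;
        case EV_WRITE:
        case EV_WRITE_FULL: return ZS_IOPEN; /* the listed transition for a write */
        default:            return cur; /* ZONE_CLOSE keeps the zone in CLOSED */
        }
    case ZS_FULL:
        return ev == EV_RESET ? ZS_EMPTY : cur; /* ZONE_FINISH keeps FULL */
    default:
        return cur; /* states not covered by this sketch */
    }
}
```

For example, closing an Explicitly Open zone whose write pointer has not moved yields `ZS_EMPTY`, while closing one that has been written yields `ZS_CLOSED`.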
+
+VIRTIO_BLK_S_ZONE_UNALIGNED_WP is set by the device when the request received
+from the driver attempts to perform a write to an SWR zone and at least one of
+the following conditions is met:
+
+\begin{itemize}
+\item the starting sector of the request is not equal to the current value of
+ the zone write pointer.
+
+\item the ending sector of the request data multiplied by 512 is not a multiple
+ of the value reported by the device in the field \field{write_granularity}
+ in the device configuration space.
+\end{itemize}
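The two conditions above can be sketched as a single predicate. The function name `write_is_unaligned` and its parameters are hypothetical, and the sketch assumes that the "ending sector" is the first sector past the request data (start sector plus the number of 512-byte sectors written); a device implementation may define it differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Returns true when a write to an SWR zone would be reported with
 * VIRTIO_BLK_S_ZONE_UNALIGNED_WP under the two conditions in the text.
 * Precondition: write_granularity (in bytes) is non-zero.
 */
bool write_is_unaligned(uint64_t start_sector, uint64_t num_sectors,
                        uint64_t write_pointer, uint32_t write_granularity)
{
    assert(write_granularity != 0);

    /* Condition 1: the request does not start at the write pointer. */
    if (start_sector != write_pointer)
        return true;

    /* Condition 2: the end of the data, in bytes, is not a multiple of
     * write_granularity. "End" is assumed to be start + length. */
    uint64_t end_bytes = (start_sector + num_sectors) * 512;
    return end_bytes % write_granularity != 0;
}
```

With a 4096-byte granularity, an 8-sector write starting at a write pointer of 8 passes both checks, while a 1-sector write at the same position fails the second one.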
+
+VIRTIO_BLK_S_ZONE_OPEN_RESOURCE is set by the device when a zone operation or
+write request received from the driver can not be handled without exceeding the
+\field{max_open_zones} limit value reported by the device in the configuration
+space.
+
+VIRTIO_BLK_S_ZONE_ACTIVE_RESOURCE is set by the device when a zone operation or
+write request received from the driver can not be handled without exceeding the
+\field{max_active_zones} limit value reported by the device in the configuration
+space.
+
+A zone transition request that would cause both the \field{max_open_zones} and
+the \field{max_active_zones} limits to be exceeded is terminated by the device
+with VIRTIO_BLK_S_ZONE_ACTIVE_RESOURCE \field{status} value.
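The precedence between the two resource statuses can be sketched as follows. The enum and the function `resource_status` are hypothetical, the arguments are the open and active zone counts that would result from handling the request, and treating a limit of 0 as "no limit" is an assumption of this sketch rather than something stated in this section.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the status values named in the text. */
enum resource_result {
    RES_OK,
    RES_ZONE_OPEN_RESOURCE,
    RES_ZONE_ACTIVE_RESOURCE
};

/*
 * open_after / active_after: zone counts if the request were carried out.
 * A limit of 0 is treated as unlimited (assumption of this sketch).
 */
enum resource_result resource_status(uint32_t open_after, uint32_t active_after,
                                     uint32_t max_open_zones,
                                     uint32_t max_active_zones)
{
    bool over_open = max_open_zones != 0 && open_after > max_open_zones;
    bool over_active = max_active_zones != 0 && active_after > max_active_zones;

    /* When both limits would be exceeded, the active-resource status wins. */
    if (over_active)
        return RES_ZONE_ACTIVE_RESOURCE;
    if (over_open)
        return RES_ZONE_OPEN_RESOURCE;
    return RES_OK;
}
```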
+
+The device reports all other error conditions related to zoned block model
+operation by setting the VIRTIO_BLK_S_ZONE_INVALID_CMD value in
+\field{status} of \field{virtio_blk_req} structure.
+
+\drivernormative{\subsubsection}{Device Operation}{Device Types / Block Device / Device Operation}
+
+The driver SHOULD check if the content of the \field{capacity} field has
+changed upon receiving a configuration change notification.
+
+A driver MUST NOT submit a request which would cause a read or write
+beyond \field{capacity}.
+
+A driver SHOULD accept the VIRTIO_BLK_F_RO feature if offered.
+
+A driver MUST set \field{sector} to 0 for a VIRTIO_BLK_T_FLUSH request.
+A driver SHOULD NOT include any data in a VIRTIO_BLK_T_FLUSH request.
+
+The length of \field{data} MUST be a multiple of 512 bytes for VIRTIO_BLK_T_IN
+and VIRTIO_BLK_T_OUT requests.
+
+The length of \field{data} MUST be a multiple of the size of struct
+virtio_blk_discard_write_zeroes for VIRTIO_BLK_T_DISCARD,
+VIRTIO_BLK_T_SECURE_ERASE and VIRTIO_BLK_T_WRITE_ZEROES requests.
+
+The length of \field{data} MUST be 20 bytes for VIRTIO_BLK_T_GET_ID requests.
+
+VIRTIO_BLK_T_DISCARD requests MUST NOT contain more than
+\field{max_discard_seg} struct virtio_blk_discard_write_zeroes segments in
+\field{data}.
+
+VIRTIO_BLK_T_SECURE_ERASE requests MUST NOT contain more than
+\field{max_secure_erase_seg} struct virtio_blk_discard_write_zeroes segments in
+\field{data}.
+
+VIRTIO_BLK_T_WRITE_ZEROES requests MUST NOT contain more than
+\field{max_write_zeroes_seg} struct virtio_blk_discard_write_zeroes segments in
+\field{data}.
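A driver-side check for the length and segment-count rules above might look like the sketch below. The helper `segments_payload_ok` is hypothetical, and the struct shown is a simplified rendering of struct virtio_blk_discard_write_zeroes (its definition is not part of this section), with the flags collapsed into a single 32-bit word.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified layout of one segment; see the specification for the
 * authoritative definition. */
struct virtio_blk_discard_write_zeroes {
    uint64_t sector;
    uint32_t num_sectors;
    uint32_t flags;
};

/*
 * data_len: length in bytes of the request's data field.
 * max_seg:  max_discard_seg, max_secure_erase_seg or max_write_zeroes_seg,
 *           depending on the request type.
 */
bool segments_payload_ok(size_t data_len, uint32_t max_seg)
{
    const size_t seg_size = sizeof(struct virtio_blk_discard_write_zeroes);

    /* The data length must be a whole number of segments. */
    if (data_len % seg_size != 0)
        return false;

    /* The number of segments must not exceed the advertised limit. */
    return data_len / seg_size <= max_seg;
}
```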
+
+If the VIRTIO_BLK_F_CONFIG_WCE feature is negotiated, the driver MAY
+switch to writethrough or writeback mode by writing respectively 0 and
+1 to the \field{writeback} field. After writing a 0 to \field{writeback},
+the driver MUST NOT assume that any volatile writes have been committed
+to persistent device backend storage.
+
+The \field{unmap} bit MUST be zero for discard commands. The driver
+MUST NOT assume anything about the data returned by read requests after
+a range of sectors has been discarded.
+
+A driver MUST NOT assume that individual segments in a multi-segment
+VIRTIO_BLK_T_DISCARD or VIRTIO_BLK_T_WRITE_ZEROES request completed
+successfully, failed, or were processed by the device at all if the request
+failed with VIRTIO_BLK_S_IOERR.
+
+The following requirements only apply if the VIRTIO_BLK_F_ZONED feature is
+negotiated.
+
+A zone sector address provided by the driver MUST be a multiple of 512 bytes.
+
+When forming a VIRTIO_BLK_T_ZONE_REPORT request, the driver MUST set
+\field{sector} to a sector within the sector range of the first zone to report.
+This sector MAY differ from the zone sector address.
+
+In VIRTIO_BLK_T_ZONE_OPEN, VIRTIO_BLK_T_ZONE_CLOSE, VIRTIO_BLK_T_ZONE_FINISH and
+VIRTIO_BLK_T_ZONE_RESET requests, the driver MUST set \field{sector} field to
+point at the first sector in the target zone.
+
+In VIRTIO_BLK_T_ZONE_RESET_ALL request, the driver MUST set the field
+\field{sector} to zero value.
+
+The \field{sector} field of the VIRTIO_BLK_T_ZONE_APPEND request MUST specify
+the zone sector address of the zone to which data is to be appended at the
+position of the write pointer. The size of the data that is appended MUST be a
+multiple of \field{write_granularity} bytes and MUST NOT exceed the
+\field{max_append_sectors} value provided by the device in
+\field{virtio_blk_zoned_characteristics} configuration space structure.
+
+Upon a successful completion of a VIRTIO_BLK_T_ZONE_APPEND request, the driver
+MAY read the starting sector location of the written data from the request
+field \field{append_sector}.
+
+All VIRTIO_BLK_T_OUT requests issued by the driver to sequential zones and
+VIRTIO_BLK_T_ZONE_APPEND requests MUST have:
+
+\begin{enumerate}
+\item the data size that is a multiple of the number of bytes reported
+ by the device in the field \field{write_granularity} in the
+ \field{virtio_blk_zoned_characteristics} configuration space structure.
+
+\item the value of the field \field{sector} that is a multiple of the number of
+ bytes reported by the device in the field \field{write_granularity} in the
+ \field{virtio_blk_zoned_characteristics} configuration space structure.
+
+\item the data size that will not exceed the writable zone capacity when its
+ value is added to the current value of the write pointer of the zone.
+
+\end{enumerate}
+
+\devicenormative{\subsubsection}{Device Operation}{Device Types / Block Device / Device Operation}
+
+The device MAY change the content of the \field{capacity} field during
+operation of the device. When this happens, the device SHOULD trigger a
+configuration change notification.
+
+A device MUST set the \field{status} byte to VIRTIO_BLK_S_IOERR
+for a write request if the VIRTIO_BLK_F_RO feature is offered, and MUST NOT
+write any data.
+
+The device MUST set the \field{status} byte to VIRTIO_BLK_S_UNSUPP for
+discard, secure erase and write zeroes commands if any unknown flag is set.
+Furthermore, the device MUST set the \field{status} byte to
+VIRTIO_BLK_S_UNSUPP for discard commands if the \field{unmap} flag is set.
+
+For discard commands, the device MAY deallocate the specified range of
+sectors in the device backend storage.
+
+For write zeroes commands, if the \field{unmap} is set, the device MAY
+deallocate the specified range of sectors in the device backend storage,
+as if the discard command had been sent. After a write zeroes command
+is completed, reads of the specified ranges of sectors MUST return
+zeroes. This is true independent of whether \field{unmap} was set or clear.
+
+The device SHOULD clear the \field{write_zeroes_may_unmap} field of the
+virtio configuration space if and only if a write zeroes request cannot
+result in deallocating one or more sectors. The device MAY change the
+content of the field during operation of the device; when this happens,
+the device SHOULD trigger a configuration change notification.
+
+A write is considered volatile when it is submitted; the contents of
+sectors covered by a volatile write are undefined in persistent device
+backend storage until the write becomes stable. A write becomes stable
+once it is completed and one or more of the following conditions is true:
+
+\begin{enumerate}
+\item\label{item:flush1} neither VIRTIO_BLK_F_CONFIG_WCE nor
+ VIRTIO_BLK_F_FLUSH feature were negotiated, but VIRTIO_BLK_F_FLUSH was
+ offered by the device;
+
+\item\label{item:flush2} the VIRTIO_BLK_F_CONFIG_WCE feature was negotiated and the
+ \field{writeback} field in configuration space was 0 \textbf{all the time between
+ the submission of the write and its completion};
+
+\item\label{item:flush3} a VIRTIO_BLK_T_FLUSH request is sent \textbf{after the write is
+ completed} and is completed itself.
+\end{enumerate}
+
+If the device is backed by persistent storage, the device MUST ensure that
+stable writes are committed to it, before reporting completion of the write
+(cases~\ref{item:flush1} and~\ref{item:flush2}) or the flush
+(case~\ref{item:flush3}). Failure to do so can cause data loss
+in case of a crash.
+
+If the driver changes \field{writeback} between the submission of the write
+and its completion, the write could be either volatile or stable when
+its completion is reported; in other words, the exact behavior is undefined.
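The three stability conditions can be read as a boolean predicate over the state of a single write. The struct `write_ctx`, its field names and the function `write_is_stable` are hypothetical and exist only to restate the enumerated cases; the undefined case where \field{writeback} changes during the write is represented by the caller setting `writeback_zero_throughout` to false.

```c
#include <stdbool.h>

/* Hypothetical snapshot of one write and the negotiated features. */
struct write_ctx {
    bool flush_offered;             /* device offered VIRTIO_BLK_F_FLUSH */
    bool flush_negotiated;          /* VIRTIO_BLK_F_FLUSH negotiated */
    bool config_wce_negotiated;     /* VIRTIO_BLK_F_CONFIG_WCE negotiated */
    bool writeback_zero_throughout; /* writeback was 0 from submit to completion */
    bool completed;                 /* the write itself has completed */
    bool flush_completed_after;     /* a flush sent after completion has completed */
};

bool write_is_stable(const struct write_ctx *w)
{
    if (!w->completed)
        return false;

    /* Case 1: neither feature negotiated, but FLUSH was offered. */
    bool case1 = w->flush_offered && !w->flush_negotiated &&
                 !w->config_wce_negotiated;
    /* Case 2: CONFIG_WCE negotiated and writeback stayed 0 throughout. */
    bool case2 = w->config_wce_negotiated && w->writeback_zero_throughout;
    /* Case 3: a later flush request has completed. */
    bool case3 = w->flush_completed_after;

    return case1 || case2 || case3;
}
```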
+
+% According to the device requirements for device initialization:
+% Offer(CONFIG_WCE) => Offer(FLUSH).
+%
+% After reversing the implication:
+% not Offer(FLUSH) => not Offer(CONFIG_WCE).
+
+If VIRTIO_BLK_F_FLUSH was not offered by the
+ device\footnote{Note that in this case, according to
+ \ref{devicenormative:Device Types / Block Device / Device Initialization},
+ the device will not have offered VIRTIO_BLK_F_CONFIG_WCE either.}, the
+device MAY also commit writes to persistent device backend storage before
+reporting their completion. Unlike case~\ref{item:flush1}, however, this
+is not an absolute requirement of the specification.
+
+\begin{note}
+ An implementation that does not offer VIRTIO_BLK_F_FLUSH and does not commit
+ completed writes will not be resilient to data loss in case of crashes.
+ Not offering VIRTIO_BLK_F_FLUSH is an absolute requirement
+ for implementations that do not wish to be safe against such data losses.
+\end{note}
+
+If the device is backed by storage providing lifetime metrics (such as eMMC
+or UFS persistent storage), the device SHOULD offer the VIRTIO_BLK_F_LIFETIME
+flag. The flag MUST NOT be offered if the device is backed by storage for which
+the lifetime metrics described in this document cannot be obtained or for which
+such metrics have no useful meaning. If the metrics are offered, the device MUST NOT
+send any reserved values, as defined in this specification.
+
+\begin{note}
+ The device lifetime metrics \field{pre_eol_info}, \field{device_lifetime_est_a}
+ and \field{device_lifetime_est_b} are discussed in the JESD84-B50 specification.
+
+ The complete JESD84-B50 is available at the JEDEC website (https://www.jedec.org)
+ pursuant to JEDEC's licensing terms and conditions. This information is provided to
+ simplify passthrough implementations from eMMC devices.
+\end{note}
+
+If the VIRTIO_BLK_F_ZONED feature is not negotiated, the device MUST reject
+VIRTIO_BLK_T_ZONE_REPORT, VIRTIO_BLK_T_ZONE_OPEN, VIRTIO_BLK_T_ZONE_CLOSE,
+VIRTIO_BLK_T_ZONE_FINISH, VIRTIO_BLK_T_ZONE_APPEND, VIRTIO_BLK_T_ZONE_RESET and
+VIRTIO_BLK_T_ZONE_RESET_ALL requests with VIRTIO_BLK_S_UNSUPP status.
+
+The following device requirements only apply if the VIRTIO_BLK_F_ZONED feature
+is negotiated.
+
+If a request of type VIRTIO_BLK_T_ZONE_OPEN, VIRTIO_BLK_T_ZONE_CLOSE,
+VIRTIO_BLK_T_ZONE_FINISH or VIRTIO_BLK_T_ZONE_RESET is issued for a Conventional
+zone (type VIRTIO_BLK_ZT_CONV), the device MUST complete the request with
+VIRTIO_BLK_S_ZONE_INVALID_CMD \field{status}.
+
+If the zone specified by the VIRTIO_BLK_T_ZONE_APPEND request is not an SWR zone,
+then the request SHALL be completed with VIRTIO_BLK_S_ZONE_INVALID_CMD
+\field{status}.
+
+The device handles a VIRTIO_BLK_T_ZONE_OPEN request by attempting to change the
+state of the zone with the \field{sector} address to VIRTIO_BLK_ZS_EOPEN. If the
+transition to this state can not be performed, the request MUST be completed
+with VIRTIO_BLK_S_ZONE_INVALID_CMD \field{status}. If, while processing this
+request, the available zone resources are insufficient, then the zone state does
+not change and the request MUST be completed with
+VIRTIO_BLK_S_ZONE_OPEN_RESOURCE or VIRTIO_BLK_S_ZONE_ACTIVE_RESOURCE value in
+the field \field{status}.
+
+The device handles a VIRTIO_BLK_T_ZONE_CLOSE request by attempting to change the
+state of the zone with the \field{sector} address to VIRTIO_BLK_ZS_CLOSED. If
+the transition to this state can not be performed, the request MUST be completed
+with VIRTIO_BLK_S_ZONE_INVALID_CMD value in the field \field{status}.
+
+The device handles a VIRTIO_BLK_T_ZONE_FINISH request by attempting to change
+the state of the zone with the \field{sector} address to VIRTIO_BLK_ZS_FULL. If
+the transition to this state can not be performed, the zone state does not
+change and the request MUST be completed with VIRTIO_BLK_S_ZONE_INVALID_CMD
+value in the field \field{status}.
+
+The device handles a VIRTIO_BLK_T_ZONE_RESET request by attempting to change the
+state of the zone with the \field{sector} address to VIRTIO_BLK_ZS_EMPTY state.
+If the transition to this state can not be performed, the zone state does not
+change and the request MUST be completed with VIRTIO_BLK_S_ZONE_INVALID_CMD
+value in the field \field{status}.
+
+The device handles a VIRTIO_BLK_T_ZONE_RESET_ALL request by transitioning all
+sequential device zones in VIRTIO_BLK_ZS_IOPEN, VIRTIO_BLK_ZS_EOPEN,
+VIRTIO_BLK_ZS_CLOSED and VIRTIO_BLK_ZS_FULL state to VIRTIO_BLK_ZS_EMPTY state.
+
+Upon receiving a VIRTIO_BLK_T_ZONE_APPEND request or a VIRTIO_BLK_T_OUT
+request issued to an SWR zone in VIRTIO_BLK_ZS_EMPTY or VIRTIO_BLK_ZS_CLOSED
+state, the device attempts to perform the transition of the zone to
+VIRTIO_BLK_ZS_IOPEN state before writing data. This transition may fail due to
+insufficient open and/or active zone resources available on the device. In this
+case, the request MUST be completed with VIRTIO_BLK_S_ZONE_OPEN_RESOURCE or
+VIRTIO_BLK_S_ZONE_ACTIVE_RESOURCE value in the \field{status}.
+
+If the \field{sector} field in the VIRTIO_BLK_T_ZONE_APPEND request does not
+specify the lowest sector for a zone, then the request SHALL be completed with
+VIRTIO_BLK_S_ZONE_INVALID_CMD value in \field{status}.
+
+If a VIRTIO_BLK_T_ZONE_APPEND request or a VIRTIO_BLK_T_OUT request has a data
+range that exceeds the remaining writable capacity for the zone, then the
+request SHALL be completed with VIRTIO_BLK_S_ZONE_INVALID_CMD value in
+\field{status}.
+
+If a request of the type VIRTIO_BLK_T_ZONE_APPEND is completed with
+VIRTIO_BLK_S_OK status, the field \field{append_sector} in
+\field{virtio_blk_req_za} MUST be set by the device to contain the first sector
+of the data written to the zone.
+
+If a request of the type VIRTIO_BLK_T_ZONE_APPEND is completed with a status
+other than VIRTIO_BLK_S_OK, the value of \field{append_sector} field in
+\field{virtio_blk_req_za} is undefined.
+
+If a VIRTIO_BLK_T_ZONE_APPEND request has a data size that exceeds the
+\field{max_append_sectors} configuration space value, then,
+\begin{itemize}
+\item if \field{max_append_sectors} configuration space value is reported as
+ zero by the device, the request SHALL be completed with VIRTIO_BLK_S_UNSUPP
+ \field{status}.
+
+\item if \field{max_append_sectors} configuration space value is reported as
+ a non-zero value by the device, the request SHALL be completed with
+ VIRTIO_BLK_S_ZONE_INVALID_CMD \field{status}.
+\end{itemize}
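The two cases above reduce to a short device-side check. The enum and the function `append_size_status` are hypothetical, and the sketch assumes the request's data size has already been converted to 512-byte sectors.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the status values named in the text. */
enum append_result {
    APPEND_OK,
    APPEND_UNSUPP,           /* VIRTIO_BLK_S_UNSUPP */
    APPEND_ZONE_INVALID_CMD  /* VIRTIO_BLK_S_ZONE_INVALID_CMD */
};

enum append_result append_size_status(uint64_t data_sectors,
                                      uint32_t max_append_sectors)
{
    /* Within the limit: the size check passes. */
    if (data_sectors <= max_append_sectors)
        return APPEND_OK;

    /* Over the limit: the status depends on whether the device
     * reported a zero or a non-zero max_append_sectors. */
    return max_append_sectors == 0 ? APPEND_UNSUPP : APPEND_ZONE_INVALID_CMD;
}
```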
+
+If a VIRTIO_BLK_T_ZONE_APPEND request, a VIRTIO_BLK_T_IN request or a
+VIRTIO_BLK_T_OUT request issued to an SWR zone has a range that spans sectors in
+more than one zone, then the request SHALL be completed with
+VIRTIO_BLK_S_ZONE_INVALID_CMD value in the field \field{status}.
+
+If a VIRTIO_BLK_T_OUT request has a \field{sector} value that is not aligned
+with the write pointer for the zone, then the request SHALL be completed with
+VIRTIO_BLK_S_ZONE_UNALIGNED_WP value in the field \field{status}.
+
+In order to avoid resource-related errors while opening zones implicitly, the
+device MAY automatically transition zones in VIRTIO_BLK_ZS_IOPEN state to
+VIRTIO_BLK_ZS_CLOSED state.
+
+All VIRTIO_BLK_T_OUT requests or VIRTIO_BLK_T_ZONE_APPEND requests issued
+to a zone in the VIRTIO_BLK_ZS_RDONLY state SHALL be completed with
+VIRTIO_BLK_S_ZONE_INVALID_CMD \field{status}.
+
+All requests issued to a zone in the VIRTIO_BLK_ZS_OFFLINE state SHALL be
+completed with VIRTIO_BLK_S_ZONE_INVALID_CMD value in the field \field{status}.
+
+The device MUST consider the sectors that are read between the write pointer
+position of a zone and the end of the last sector of the zone as unwritten data.
+The sectors between the write pointer position and the end of the last sector
+within the zone capacity during VIRTIO_BLK_T_ZONE_FINISH request processing are
+also considered unwritten data.
+
+When unwritten data is present in the sector range of a read request, the device
+MUST process this data in one of the following ways:
+
+\begin{enumerate}
+\item Fill the unwritten data with a device-specific byte pattern. The
+configuration, control and reporting of this byte pattern is beyond the scope
+of this standard. This is the preferred approach.
+
+\item Fail the request. Depending on the driver implementation, this may prevent
+the device from becoming operational.
+\end{enumerate}
+
+If both the VIRTIO_BLK_F_ZONED and VIRTIO_BLK_F_SECURE_ERASE features are
+negotiated, then
+
+\begin{enumerate}
+\item the field \field{secure_erase_sector_alignment} in the configuration space
+of the device MUST be a multiple of \field{zone_sectors} value reported in the
+device configuration space.
+
+\item the data size in VIRTIO_BLK_T_SECURE_ERASE requests MUST be a multiple of
+\field{zone_sectors} value in the device configuration space.
+\end{enumerate}
+
+The device MUST handle a VIRTIO_BLK_T_SECURE_ERASE request in the same way it
+handles VIRTIO_BLK_T_ZONE_RESET request for the zone range specified in the
+VIRTIO_BLK_T_SECURE_ERASE request.
+
+\subsubsection{Legacy Interface: Device Operation}\label{sec:Device Types / Block Device / Device Operation / Legacy Interface: Device Operation}
+When using the legacy interface, transitional devices and drivers
+MUST format the fields in struct virtio_blk_req
+according to the native endian of the guest rather than
+(necessarily when not using the legacy interface) little-endian.
+
+When using the legacy interface, transitional drivers
+SHOULD ignore the used length values.
+\begin{note}
+Historically, some devices put the total descriptor length,
+or the total length of device-writable buffers there,
+even when only the status byte was actually written.
+\end{note}
+
+The \field{reserved} field was previously called \field{ioprio}. \field{ioprio}
+is a hint about the relative priorities of requests to the device:
+higher numbers indicate more important requests.
+
+\begin{lstlisting}
+#define VIRTIO_BLK_T_FLUSH_OUT 5
+\end{lstlisting}
+
+The command VIRTIO_BLK_T_FLUSH_OUT was a synonym for VIRTIO_BLK_T_FLUSH;
+a driver MUST treat it as a VIRTIO_BLK_T_FLUSH command.
+
+\begin{lstlisting}
+#define VIRTIO_BLK_T_BARRIER 0x80000000
+\end{lstlisting}
+
+If the device has VIRTIO_BLK_F_BARRIER
+feature, the high bit (VIRTIO_BLK_T_BARRIER) indicates that this
+request acts as a barrier and that all preceding requests SHOULD be
+complete before this one, and all following requests SHOULD NOT be
+started until this is complete.
+
+\begin{note} A barrier does not flush
+caches in the underlying backend device in host, and thus does not
+serve as data consistency guarantee. Only a VIRTIO_BLK_T_FLUSH request
+does that.
+\end{note}
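How a legacy request type might be decoded from the definitions above is sketched below. The struct `legacy_type` and the function `decode_legacy_type` are hypothetical; the value 4 for VIRTIO_BLK_T_FLUSH is not defined in this section and is taken here as an assumption from the rest of the block device specification. The sketch assumes \field{type} has already been read in the guest's native endianness, and the barrier flag is only meaningful when VIRTIO_BLK_F_BARRIER is in use.

```c
#include <stdbool.h>
#include <stdint.h>

#define VIRTIO_BLK_T_FLUSH     4u          /* assumed; defined outside this section */
#define VIRTIO_BLK_T_FLUSH_OUT 5u
#define VIRTIO_BLK_T_BARRIER   0x80000000u

/* Hypothetical decoded form of a legacy request type. */
struct legacy_type {
    uint32_t base;    /* request type with the barrier bit removed */
    bool     barrier; /* high bit was set */
};

struct legacy_type decode_legacy_type(uint32_t type)
{
    struct legacy_type t;

    /* The high bit marks the request as a barrier. */
    t.barrier = (type & VIRTIO_BLK_T_BARRIER) != 0;
    t.base = type & ~VIRTIO_BLK_T_BARRIER;

    /* FLUSH_OUT is a synonym for FLUSH. */
    if (t.base == VIRTIO_BLK_T_FLUSH_OUT)
        t.base = VIRTIO_BLK_T_FLUSH;

    return t;
}
```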
+
+Some older legacy devices did not commit completed writes to persistent
+device backend storage when VIRTIO_BLK_F_FLUSH was offered but not
+negotiated. In order to work around this, the driver MAY set the
+\field{writeback} to 0 (if available) or it MAY send an explicit flush
+request after every completed write.
+
+If the device has VIRTIO_BLK_F_SCSI feature, it can also support
+scsi packet command requests. Each of these requests is of the form:
+
+\begin{lstlisting}
+/* All fields are in guest's native endian. */
+struct virtio_scsi_pc_req {
+ u32 type;
+ u32 ioprio;
+ u64 sector;
+ u8 cmd[];
+ u8 data[][512];
+#define SCSI_SENSE_BUFFERSIZE 96
+ u8 sense[SCSI_SENSE_BUFFERSIZE];
+ u32 errors;
+ u32 data_len;
+ u32 sense_len;
+ u32 residual;
+ u8 status;
+};
+\end{lstlisting}
+
+A request type can also be a scsi packet command (VIRTIO_BLK_T_SCSI_CMD or
+VIRTIO_BLK_T_SCSI_CMD_OUT). The two types are equivalent; the device
+does not distinguish between them:
+
+\begin{lstlisting}
+#define VIRTIO_BLK_T_SCSI_CMD 2
+#define VIRTIO_BLK_T_SCSI_CMD_OUT 3
+\end{lstlisting}
+
+The \field{cmd} field is only present for scsi packet command requests,
+and indicates the command to perform. This field MUST reside in a
+single, separate device-readable buffer; command length can be derived
+from the length of this buffer.
+
+Note that these first three (four for scsi packet commands)
+fields are always device-readable: \field{data} is either device-readable
+or device-writable, depending on the request. The size of the read or
+write can be derived from the total size of the request buffers.
+
+\field{sense} is only present for scsi packet command requests,
+and indicates the buffer for scsi sense data.
+
+\field{data_len} is only present for scsi packet command
+requests. This field is deprecated and SHOULD be ignored by the
+driver. Historically, devices copied data length there.
+
+\field{sense_len} is only present for scsi packet command
+requests and indicates the number of bytes actually written to
+the \field{sense} buffer.
+
+\field{residual} field is only present for scsi packet command
+requests and indicates the residual size, calculated as data
+length - number of bytes actually transferred.
+
+\subsubsection{Legacy Interface: Framing Requirements}\label{sec:Device Types / Block Device / Legacy Interface: Framing Requirements}
+
+When using legacy interfaces, transitional drivers which have not
+negotiated VIRTIO_F_ANY_LAYOUT:
+
+\begin{itemize}
+\item MUST use a single 16-byte descriptor containing \field{type},
+ \field{reserved} and \field{sector}, followed by descriptors
+ for \field{data}, then finally a separate 1-byte descriptor
+ for \field{status}.
+
+\item For SCSI commands there are additional constraints.
+ \field{sense} MUST reside in a
+ single separate device-writable descriptor of size 96 bytes,
+ and \field{errors}, \field{data_len}, \field{sense_len} and
+ \field{residual} MUST reside in a single separate
+ device-writable descriptor.
+\end{itemize}
+
+See \ref{sec:Basic Facilities of a Virtio Device / Virtqueues / Message Framing}.
diff --git a/device-types/virtio-block/device-conformance.tex b/device-types/virtio-block/device-conformance.tex
new file mode 100644
index 0000000..b4fbc8b
--- /dev/null
+++ b/device-types/virtio-block/device-conformance.tex
@@ -0,0 +1,8 @@
+\conformance{\subsection}{Block Device Conformance}\label{sec:Conformance / Device Conformance / Block Device Conformance}
+
+A block device MUST conform to the following normative statements:
+
+\begin{itemize}
+\item \ref{devicenormative:Device Types / Block Device / Device Initialization}
+\item \ref{devicenormative:Device Types / Block Device / Device Operation}
+\end{itemize}
diff --git a/device-types/virtio-block/driver-conformance.tex b/device-types/virtio-block/driver-conformance.tex
new file mode 100644
index 0000000..0f69866
--- /dev/null
+++ b/device-types/virtio-block/driver-conformance.tex
@@ -0,0 +1,8 @@
+\conformance{\subsection}{Block Driver Conformance}\label{sec:Conformance / Driver Conformance / Block Driver Conformance}
+
+A block driver MUST conform to the following normative statements:
+
+\begin{itemize}
+\item \ref{drivernormative:Device Types / Block Device / Device Initialization}
+\item \ref{drivernormative:Device Types / Block Device / Device Operation}
+\end{itemize}
--
2.26.2
* [PATCH v3 02/20] virtio-network: Fix spelling errors
From: Parav Pandit @ 2023-01-10 23:03 UTC (permalink / raw)
To: mst, virtio-dev, cohuck; +Cc: virtio-comment, Parav Pandit
In-Reply-To: <20230110230358.528098-1-parav@nvidia.com>
Fix two spelling errors in the virtio network device specification.
Signed-off-by: Parav Pandit <parav@nvidia.com>
---
device-types/virtio-network/description.tex | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/device-types/virtio-network/description.tex b/device-types/virtio-network/description.tex
index 367681d..82d7374 100644
--- a/device-types/virtio-network/description.tex
+++ b/device-types/virtio-network/description.tex
@@ -331,7 +331,7 @@ \subsection{Device Initialization}\label{sec:Device Types / Network Device / Dev
Otherwise, the driver assumes it's active.
\item A performant driver would indicate that it will generate checksumless
- packets by negotating the VIRTIO_NET_F_CSUM feature.
+ packets by negotiating the VIRTIO_NET_F_CSUM feature.
\item If that feature is negotiated, a driver can use TCP segmentation or UDP
segmentation/fragmentation offload by negotiating the VIRTIO_NET_F_HOST_TSO4 (IPv4
@@ -1062,7 +1062,7 @@ \subsubsection{Control Virtqueue}\label{sec:Device Types / Network Device / Devi
\begin{itemize}
\item VIRTIO_NET_CTRL_RX_PROMISC turns promiscuous mode on and
off. The command-specific-data is one byte containing 0 (off) or
-1 (on). If promiscous mode is on, the device SHOULD receive all
+1 (on). If promiscuous mode is on, the device SHOULD receive all
incoming packets.
This SHOULD take effect even if one of the other modes set by
a VIRTIO_NET_CTRL_RX class command is on.
--
2.26.2
* [PATCH v3 01/20] virtio-network: Maintain network device spec in separate directory
From: Parav Pandit @ 2023-01-10 23:03 UTC (permalink / raw)
To: mst, virtio-dev, cohuck; +Cc: virtio-comment, Parav Pandit
In-Reply-To: <20230110230358.528098-1-parav@nvidia.com>
Move virtio network device specification to its own file similar to
recent virtio devices.
While at it, place device specification, its driver and device
conformance into its own directory to have self contained device
specification.
Fixes: https://github.com/oasis-tcs/virtio-spec/issues/153
Signed-off-by: Parav Pandit <parav@nvidia.com>
---
changelog:
v2->v3:
- file name changed from device.tex to description.tex
- use input instead of import to insert a file
v1->v2:
- removed extra blank lines at end of file
v0->v1:
- moved to device specific directory
---
conformance.tex | 35 +-
content.tex | 1595 +----------------
device-types/virtio-network/description.tex | 1594 ++++++++++++++++
.../virtio-network/device-conformance.tex | 16 +
.../virtio-network/driver-conformance.tex | 17 +
5 files changed, 1630 insertions(+), 1627 deletions(-)
create mode 100644 device-types/virtio-network/description.tex
create mode 100644 device-types/virtio-network/device-conformance.tex
create mode 100644 device-types/virtio-network/driver-conformance.tex
diff --git a/conformance.tex b/conformance.tex
index c3c1d3e..956c808 100644
--- a/conformance.tex
+++ b/conformance.tex
@@ -134,23 +134,7 @@ \section{Conformance Targets}\label{sec:Conformance / Conformance Targets}
\item \ref{drivernormative:Virtio Transport Options / Virtio over channel I/O / Device Operation / Resetting Devices}
\end{itemize}
-\conformance{\subsection}{Network Driver Conformance}\label{sec:Conformance / Driver Conformance / Network Driver Conformance}
-
-A network driver MUST conform to the following normative statements:
-
-\begin{itemize}
-\item \ref{drivernormative:Device Types / Network Device / Device configuration layout}
-\item \ref{drivernormative:Device Types / Network Device / Device Operation / Packet Transmission}
-\item \ref{drivernormative:Device Types / Network Device / Device Operation / Setting Up Receive Buffers}
-\item \ref{drivernormative:Device Types / Network Device / Device Operation / Processing of Incoming Packets}
-\item \ref{drivernormative:Device Types / Network Device / Device Operation / Control Virtqueue / Packet Receive Filtering}
-\item \ref{drivernormative:Device Types / Network Device / Device Operation / Control Virtqueue / Setting MAC Address Filtering}
-\item \ref{drivernormative:Device Types / Network Device / Device Operation / Control Virtqueue / Gratuitous Packet Sending}
-\item \ref{drivernormative:Device Types / Network Device / Device Operation / Control Virtqueue / Automatic receive steering in multiqueue mode}
-\item \ref{drivernormative:Device Types / Network Device / Device Operation / Control Virtqueue / Offloads State Configuration / Setting Offloads State}
-\item \ref{drivernormative:Device Types / Network Device / Device Operation / Control Virtqueue / Receive-side scaling (RSS) }
-\item \ref{drivernormative:Device Types / Network Device / Device Operation / Control Virtqueue / Notifications Coalescing}
-\end{itemize}
+\input{device-types/virtio-network/driver-conformance.tex}
\conformance{\subsection}{Block Driver Conformance}\label{sec:Conformance / Driver Conformance / Block Driver Conformance}
@@ -401,22 +385,7 @@ \section{Conformance Targets}\label{sec:Conformance / Conformance Targets}
\item \ref{devicenormative:Virtio Transport Options / Virtio over channel I/O / Device Operation / Resetting Devices}
\end{itemize}
-\conformance{\subsection}{Network Device Conformance}\label{sec:Conformance / Device Conformance / Network Device Conformance}
-
-A network device MUST conform to the following normative statements:
-
-\begin{itemize}
-\item \ref{devicenormative:Device Types / Network Device / Device configuration layout}
-\item \ref{devicenormative:Device Types / Network Device / Device Operation / Packet Transmission}
-\item \ref{devicenormative:Device Types / Network Device / Device Operation / Setting Up Receive Buffers}
-\item \ref{devicenormative:Device Types / Network Device / Device Operation / Processing of Incoming Packets}
-\item \ref{devicenormative:Device Types / Network Device / Device Operation / Control Virtqueue / Packet Receive Filtering}
-\item \ref{devicenormative:Device Types / Network Device / Device Operation / Control Virtqueue / Setting MAC Address Filtering}
-\item \ref{devicenormative:Device Types / Network Device / Device Operation / Control Virtqueue / Gratuitous Packet Sending}
-\item \ref{devicenormative:Device Types / Network Device / Device Operation / Control Virtqueue / Automatic receive steering in multiqueue mode}
-\item \ref{devicenormative:Device Types / Network Device / Device Operation / Control Virtqueue / Receive-side scaling (RSS) / RSS processing}
-\item \ref{devicenormative:Device Types / Network Device / Device Operation / Control Virtqueue / Notifications Coalescing}
-\end{itemize}
+\input{device-types/virtio-network/device-conformance.tex}
\conformance{\subsection}{Block Device Conformance}\label{sec:Conformance / Device Conformance / Block Device Conformance}
diff --git a/content.tex b/content.tex
index 96f4723..90e042c 100644
--- a/content.tex
+++ b/content.tex
@@ -3003,1600 +3003,7 @@ \chapter{Device Types}\label{sec:Device Types}
entirely, or live on outside this standard. We shall speak of
them no further.
-\section{Network Device}\label{sec:Device Types / Network Device}
-
-The virtio network device is a virtual ethernet card, and is the
-most complex of the devices supported so far by virtio. It has
-been enhanced rapidly and demonstrates clearly how support for new
-features is added to an existing device. Empty buffers are
-placed in one virtqueue for receiving packets, and outgoing
-packets are enqueued into another for transmission in that order.
-A third command queue is used to control advanced filtering
-features.
-
-\subsection{Device ID}\label{sec:Device Types / Network Device / Device ID}
-
- 1
-
-\subsection{Virtqueues}\label{sec:Device Types / Network Device / Virtqueues}
-
-\begin{description}
-\item[0] receiveq1
-\item[1] transmitq1
-\item[\ldots]
-\item[2(N-1)] receiveqN
-\item[2(N-1)+1] transmitqN
-\item[2N] controlq
-\end{description}
-
- N=1 if neither VIRTIO_NET_F_MQ nor VIRTIO_NET_F_RSS is negotiated; otherwise N is set by
- \field{max_virtqueue_pairs}.
-
- controlq only exists if VIRTIO_NET_F_CTRL_VQ is set.
-
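The virtqueue numbering above can be sketched as a few index helpers. This is only an illustration: the function names are hypothetical and do not come from the specification, and pairs are counted from 1 as in receiveq1\ldots receiveqN.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers; pair numbers are 1-based (receiveq1, transmitq1, ...). */

/* receiveqK lives at virtqueue index 2(K-1). */
static uint32_t receiveq_index(uint32_t pair)
{
    assert(pair >= 1);
    return 2 * (pair - 1);
}

/* transmitqK lives at virtqueue index 2(K-1)+1. */
static uint32_t transmitq_index(uint32_t pair)
{
    assert(pair >= 1);
    return 2 * (pair - 1) + 1;
}

/* controlq follows all N pairs, at index 2N.
 * It is only meaningful when VIRTIO_NET_F_CTRL_VQ is set. */
static uint32_t controlq_index(uint32_t n_pairs)
{
    return 2 * n_pairs;
}
```

With N=1 this gives receiveq1 at index 0, transmitq1 at index 1 and controlq at index 2.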
-\subsection{Feature bits}\label{sec:Device Types / Network Device / Feature bits}
-
-\begin{description}
-\item[VIRTIO_NET_F_CSUM (0)] Device handles packets with partial checksum. This
- ``checksum offload'' is a common feature on modern network cards.
-
-\item[VIRTIO_NET_F_GUEST_CSUM (1)] Driver handles packets with partial checksum.
-
-\item[VIRTIO_NET_F_CTRL_GUEST_OFFLOADS (2)] Control channel offloads
- reconfiguration support.
-
-\item[VIRTIO_NET_F_MTU(3)] Device maximum MTU reporting is supported. If
- offered by the device, device advises driver about the value of
- its maximum MTU. If negotiated, the driver uses \field{mtu} as
- the maximum MTU value.
-
-\item[VIRTIO_NET_F_MAC (5)] Device has given MAC address.
-
-\item[VIRTIO_NET_F_GUEST_TSO4 (7)] Driver can receive TSOv4.
-
-\item[VIRTIO_NET_F_GUEST_TSO6 (8)] Driver can receive TSOv6.
-
-\item[VIRTIO_NET_F_GUEST_ECN (9)] Driver can receive TSO with ECN.
-
-\item[VIRTIO_NET_F_GUEST_UFO (10)] Driver can receive UFO.
-
-\item[VIRTIO_NET_F_HOST_TSO4 (11)] Device can receive TSOv4.
-
-\item[VIRTIO_NET_F_HOST_TSO6 (12)] Device can receive TSOv6.
-
-\item[VIRTIO_NET_F_HOST_ECN (13)] Device can receive TSO with ECN.
-
-\item[VIRTIO_NET_F_HOST_UFO (14)] Device can receive UFO.
-
-\item[VIRTIO_NET_F_MRG_RXBUF (15)] Driver can merge receive buffers.
-
-\item[VIRTIO_NET_F_STATUS (16)] Configuration status field is
- available.
-
-\item[VIRTIO_NET_F_CTRL_VQ (17)] Control channel is available.
-
-\item[VIRTIO_NET_F_CTRL_RX (18)] Control channel RX mode support.
-
-\item[VIRTIO_NET_F_CTRL_VLAN (19)] Control channel VLAN filtering.
-
-\item[VIRTIO_NET_F_GUEST_ANNOUNCE(21)] Driver can send gratuitous
- packets.
-
-\item[VIRTIO_NET_F_MQ(22)] Device supports multiqueue with automatic
- receive steering.
-
-\item[VIRTIO_NET_F_CTRL_MAC_ADDR(23)] Set MAC address through control
- channel.
-
-\item[VIRTIO_NET_F_NOTF_COAL(53)] Device supports notifications coalescing.
-
-\item[VIRTIO_NET_F_GUEST_USO4 (54)] Driver can receive USOv4 packets.
-
-\item[VIRTIO_NET_F_GUEST_USO6 (55)] Driver can receive USOv6 packets.
-
-\item[VIRTIO_NET_F_HOST_USO (56)] Device can receive USO packets. Unlike UFO
- (which fragments the packet), USO splits a large UDP packet
- into several segments, each of which has its own UDP header.
-
-\item[VIRTIO_NET_F_HASH_REPORT(57)] Device can report per-packet hash
- value and a type of calculated hash.
-
-\item[VIRTIO_NET_F_GUEST_HDRLEN(59)] Driver can provide the exact \field{hdr_len}
- value. Device benefits from knowing the exact header length.
-
-\item[VIRTIO_NET_F_RSS(60)] Device supports RSS (receive-side scaling)
- with Toeplitz hash calculation and configurable hash
- parameters for receive steering.
-
-\item[VIRTIO_NET_F_RSC_EXT(61)] Device can process duplicated ACKs
- and report number of coalesced segments and duplicated ACKs.
-
-\item[VIRTIO_NET_F_STANDBY(62)] Device may act as a standby for a primary
- device with the same MAC address.
-
-\item[VIRTIO_NET_F_SPEED_DUPLEX(63)] Device reports speed and duplex.
-\end{description}
-
-\subsubsection{Feature bit requirements}\label{sec:Device Types / Network Device / Feature bits / Feature bit requirements}
-
-Some networking feature bits require other networking feature bits
-(see \ref{drivernormative:Basic Facilities of a Virtio Device / Feature Bits}):
-
-\begin{description}
-\item[VIRTIO_NET_F_GUEST_TSO4] Requires VIRTIO_NET_F_GUEST_CSUM.
-\item[VIRTIO_NET_F_GUEST_TSO6] Requires VIRTIO_NET_F_GUEST_CSUM.
-\item[VIRTIO_NET_F_GUEST_ECN] Requires VIRTIO_NET_F_GUEST_TSO4 or VIRTIO_NET_F_GUEST_TSO6.
-\item[VIRTIO_NET_F_GUEST_UFO] Requires VIRTIO_NET_F_GUEST_CSUM.
-\item[VIRTIO_NET_F_GUEST_USO4] Requires VIRTIO_NET_F_GUEST_CSUM.
-\item[VIRTIO_NET_F_GUEST_USO6] Requires VIRTIO_NET_F_GUEST_CSUM.
-
-\item[VIRTIO_NET_F_HOST_TSO4] Requires VIRTIO_NET_F_CSUM.
-\item[VIRTIO_NET_F_HOST_TSO6] Requires VIRTIO_NET_F_CSUM.
-\item[VIRTIO_NET_F_HOST_ECN] Requires VIRTIO_NET_F_HOST_TSO4 or VIRTIO_NET_F_HOST_TSO6.
-\item[VIRTIO_NET_F_HOST_UFO] Requires VIRTIO_NET_F_CSUM.
-\item[VIRTIO_NET_F_HOST_USO] Requires VIRTIO_NET_F_CSUM.
-
-\item[VIRTIO_NET_F_CTRL_RX] Requires VIRTIO_NET_F_CTRL_VQ.
-\item[VIRTIO_NET_F_CTRL_VLAN] Requires VIRTIO_NET_F_CTRL_VQ.
-\item[VIRTIO_NET_F_GUEST_ANNOUNCE] Requires VIRTIO_NET_F_CTRL_VQ.
-\item[VIRTIO_NET_F_MQ] Requires VIRTIO_NET_F_CTRL_VQ.
-\item[VIRTIO_NET_F_CTRL_MAC_ADDR] Requires VIRTIO_NET_F_CTRL_VQ.
-\item[VIRTIO_NET_F_NOTF_COAL] Requires VIRTIO_NET_F_CTRL_VQ.
-\item[VIRTIO_NET_F_RSC_EXT] Requires VIRTIO_NET_F_HOST_TSO4 or VIRTIO_NET_F_HOST_TSO6.
-\item[VIRTIO_NET_F_RSS] Requires VIRTIO_NET_F_CTRL_VQ.
-\end{description}
-
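The dependency list above lends itself to a table-driven check. The sketch below is one possible way to validate a feature set; the shortened macro names, the table layout and the function name are assumptions made for illustration, while the bit numbers are taken from the feature-bit list.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define BIT(n) (UINT64_C(1) << (n))

/* Shortened, hypothetical names; bit numbers as listed under "Feature bits". */
#define F_CSUM           BIT(0)
#define F_GUEST_CSUM     BIT(1)
#define F_GUEST_TSO4     BIT(7)
#define F_GUEST_TSO6     BIT(8)
#define F_GUEST_ECN      BIT(9)
#define F_GUEST_UFO      BIT(10)
#define F_HOST_TSO4      BIT(11)
#define F_HOST_TSO6      BIT(12)
#define F_HOST_ECN       BIT(13)
#define F_HOST_UFO       BIT(14)
#define F_CTRL_VQ        BIT(17)
#define F_CTRL_RX        BIT(18)
#define F_CTRL_VLAN      BIT(19)
#define F_GUEST_ANNOUNCE BIT(21)
#define F_MQ             BIT(22)
#define F_CTRL_MAC_ADDR  BIT(23)
#define F_NOTF_COAL      BIT(53)
#define F_GUEST_USO4     BIT(54)
#define F_GUEST_USO6     BIT(55)
#define F_HOST_USO       BIT(56)
#define F_RSS            BIT(60)
#define F_RSC_EXT        BIT(61)

/* A feature requires at least one of the bits in needs_any. */
struct net_feature_dep {
    uint64_t feature;
    uint64_t needs_any;
};

static const struct net_feature_dep net_feature_deps[] = {
    { F_GUEST_TSO4,     F_GUEST_CSUM },
    { F_GUEST_TSO6,     F_GUEST_CSUM },
    { F_GUEST_ECN,      F_GUEST_TSO4 | F_GUEST_TSO6 },
    { F_GUEST_UFO,      F_GUEST_CSUM },
    { F_GUEST_USO4,     F_GUEST_CSUM },
    { F_GUEST_USO6,     F_GUEST_CSUM },
    { F_HOST_TSO4,      F_CSUM },
    { F_HOST_TSO6,      F_CSUM },
    { F_HOST_ECN,       F_HOST_TSO4 | F_HOST_TSO6 },
    { F_HOST_UFO,       F_CSUM },
    { F_HOST_USO,       F_CSUM },
    { F_CTRL_RX,        F_CTRL_VQ },
    { F_CTRL_VLAN,      F_CTRL_VQ },
    { F_GUEST_ANNOUNCE, F_CTRL_VQ },
    { F_MQ,             F_CTRL_VQ },
    { F_CTRL_MAC_ADDR,  F_CTRL_VQ },
    { F_NOTF_COAL,      F_CTRL_VQ },
    { F_RSC_EXT,        F_HOST_TSO4 | F_HOST_TSO6 },
    { F_RSS,            F_CTRL_VQ },
};

/* Returns true if every feature in the set has one of its prerequisites. */
static bool net_features_consistent(uint64_t features)
{
    size_t n = sizeof(net_feature_deps) / sizeof(net_feature_deps[0]);

    for (size_t i = 0; i < n; i++) {
        const struct net_feature_dep *d = &net_feature_deps[i];

        if ((features & d->feature) && !(features & d->needs_any))
            return false;
    }
    return true;
}
```

Keeping the rules in a table means a new dependency is a one-line addition rather than another branch in the check.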
-\subsubsection{Legacy Interface: Feature bits}\label{sec:Device Types / Network Device / Feature bits / Legacy Interface: Feature bits}
-\begin{description}
-\item[VIRTIO_NET_F_GSO (6)] Device handles packets with any GSO type. This was supposed to indicate segmentation offload support, but
-upon further investigation it became clear that multiple bits were needed.
-\item[VIRTIO_NET_F_GUEST_RSC4 (41)] Device coalesces TCPIP v4 packets. This was implemented by hypervisor patch for certification
-purposes and current Windows driver depends on it. It will not function if virtio-net device reports this feature.
-\item[VIRTIO_NET_F_GUEST_RSC6 (42)] Device coalesces TCPIP v6 packets. Similar to VIRTIO_NET_F_GUEST_RSC4.
-\end{description}
-
-\subsection{Device configuration layout}\label{sec:Device Types / Network Device / Device configuration layout}
-\label{sec:Device Types / Block Device / Feature bits / Device configuration layout}
-
-Device configuration fields are listed below; they are read-only for a driver. The \field{mac} address field
-always exists (though is only valid if VIRTIO_NET_F_MAC is set), and
-\field{status} only exists if VIRTIO_NET_F_STATUS is set. Two
-read-only bits (for the driver) are currently defined for the status field:
-VIRTIO_NET_S_LINK_UP and VIRTIO_NET_S_ANNOUNCE.
-
-\begin{lstlisting}
-#define VIRTIO_NET_S_LINK_UP 1
-#define VIRTIO_NET_S_ANNOUNCE 2
-\end{lstlisting}
-
-The following driver-read-only field, \field{max_virtqueue_pairs} only exists if
-VIRTIO_NET_F_MQ or VIRTIO_NET_F_RSS is set. This field specifies the maximum number
-of each of transmit and receive virtqueues (receiveq1\ldots receiveqN
-and transmitq1\ldots transmitqN respectively) that can be configured once at least one of these features
-is negotiated.
-
-The following driver-read-only field, \field{mtu} only exists if
-VIRTIO_NET_F_MTU is set. This field specifies the maximum MTU for the driver to
-use.
-
-The following two fields, \field{speed} and \field{duplex}, only
-exist if VIRTIO_NET_F_SPEED_DUPLEX is set.
-
-\field{speed} contains the device speed, in units of 1 MBit per
-second, 0 to 0x7fffffff, or 0xffffffff for unknown speed.
-
-\field{duplex} has the values of 0x01 for full duplex, 0x00 for
-half duplex and 0xff for unknown duplex state.
-
-Both \field{speed} and \field{duplex} can change, thus the driver
-is expected to re-read these values after receiving a
-configuration change notification.
-
-\begin{lstlisting}
-struct virtio_net_config {
- u8 mac[6];
- le16 status;
- le16 max_virtqueue_pairs;
- le16 mtu;
- le32 speed;
- u8 duplex;
- u8 rss_max_key_size;
- le16 rss_max_indirection_table_length;
- le32 supported_hash_types;
-};
-\end{lstlisting}
-The following field, \field{rss_max_key_size} only exists if VIRTIO_NET_F_RSS or VIRTIO_NET_F_HASH_REPORT is set.
-It specifies the maximum supported length of RSS key in bytes.
-
-The following field, \field{rss_max_indirection_table_length} only exists if VIRTIO_NET_F_RSS is set.
-It specifies the maximum number of 16-bit entries in RSS indirection table.
-
-The next field, \field{supported_hash_types} only exists if the device supports hash calculation,
-i.e. if VIRTIO_NET_F_RSS or VIRTIO_NET_F_HASH_REPORT is set.
-
-Field \field{supported_hash_types} contains the bitmask of supported hash types.
-See \ref{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Hash calculation for incoming packets / Supported/enabled hash types} for details of supported hash types.
-
-\devicenormative{\subsubsection}{Device configuration layout}{Device Types / Network Device / Device configuration layout}
-
-The device MUST set \field{max_virtqueue_pairs} to between 1 and 0x8000 inclusive,
-if it offers VIRTIO_NET_F_MQ.
-
-The device MUST set \field{mtu} to between 68 and 65535 inclusive,
-if it offers VIRTIO_NET_F_MTU.
-
-The device SHOULD set \field{mtu} to at least 1280, if it offers
-VIRTIO_NET_F_MTU.
-
-The device MUST NOT modify \field{mtu} once it has been set.
-
-The device MUST NOT pass received packets that exceed \field{mtu} (plus low
-level ethernet header length) size with \field{gso_type} NONE or ECN
-after VIRTIO_NET_F_MTU has been successfully negotiated.
-
-The device MUST forward transmitted packets of up to \field{mtu} (plus low
-level ethernet header length) size with \field{gso_type} NONE or ECN, and do
-so without fragmentation, after VIRTIO_NET_F_MTU has been successfully
-negotiated.
-
-The device MUST set \field{rss_max_key_size} to at least 40, if it offers
-VIRTIO_NET_F_RSS or VIRTIO_NET_F_HASH_REPORT.
-
-The device MUST set \field{rss_max_indirection_table_length} to at least 128, if it offers
-VIRTIO_NET_F_RSS.
-
-If the driver negotiates the VIRTIO_NET_F_STANDBY feature, the device MAY act
-as a standby device for a primary device with the same MAC address.
-
-If VIRTIO_NET_F_SPEED_DUPLEX has been negotiated, \field{speed}
-MUST contain the device speed, in units of 1 MBit per second, 0 to
-0x7fffffff, or 0xffffffff for unknown.
-
-If VIRTIO_NET_F_SPEED_DUPLEX has been negotiated, \field{duplex}
-MUST have the values of 0x01 for full duplex, 0x00 for half
-duplex, or 0xff for unknown.
-
-If VIRTIO_NET_F_SPEED_DUPLEX and VIRTIO_NET_F_STATUS have both
-been negotiated, the device SHOULD NOT change the \field{speed} and
-\field{duplex} fields as long as VIRTIO_NET_S_LINK_UP is set in
-the \field{status}.
-
-\drivernormative{\subsubsection}{Device configuration layout}{Device Types / Network Device / Device configuration layout}
-
-A driver SHOULD negotiate VIRTIO_NET_F_MAC if the device offers it.
-If the driver negotiates the VIRTIO_NET_F_MAC feature, the driver MUST set
-the physical address of the NIC to \field{mac}. Otherwise, it SHOULD
-use a locally-administered MAC address (see \hyperref[intro:IEEE 802]{IEEE 802},
-``9.2 48-bit universal LAN MAC addresses'').
-
-If the driver does not negotiate the VIRTIO_NET_F_STATUS feature, it SHOULD
-assume the link is active, otherwise it SHOULD read the link status from
-the bottom bit of \field{status}.
-
-A driver SHOULD negotiate VIRTIO_NET_F_MTU if the device offers it.
-
-If the driver negotiates VIRTIO_NET_F_MTU, it MUST supply enough receive
-buffers to receive at least one receive packet of size \field{mtu} (plus low
-level ethernet header length) with \field{gso_type} NONE or ECN.
-
-If the driver negotiates VIRTIO_NET_F_MTU, it MUST NOT transmit packets of
-size exceeding the value of \field{mtu} (plus low level ethernet header length)
-with \field{gso_type} NONE or ECN.
-
-A driver SHOULD negotiate the VIRTIO_NET_F_STANDBY feature if the device offers it.
-
-If VIRTIO_NET_F_SPEED_DUPLEX has been negotiated,
-the driver MUST treat any value of \field{speed} above
-0x7fffffff as well as any value of \field{duplex} not
-matching 0x00 or 0x01 as an unknown value.
-
-If VIRTIO_NET_F_SPEED_DUPLEX has been negotiated, the driver
-SHOULD re-read \field{speed} and \field{duplex} after a
-configuration change notification.
-
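The driver-side rule for \field{speed} and \field{duplex} can be sketched as two small predicates. The function names are hypothetical, and the sketch assumes both values have already been read from configuration space and converted from little-endian.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true and *mbps set when the speed is known.
 * Any value above 0x7fffffff is treated as unknown. */
static bool net_speed_known(uint32_t speed, uint32_t *mbps)
{
    if (speed > 0x7fffffffu)
        return false;
    *mbps = speed; /* units of 1 MBit per second */
    return true;
}

/* Hypothetical helper: any duplex value other than 0x00 or 0x01
 * is treated as unknown. */
static bool net_duplex_known(uint8_t duplex)
{
    return duplex == 0x00 || duplex == 0x01;
}
```

A driver would typically call both again after each configuration change notification, since either field can change.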
-\subsubsection{Legacy Interface: Device configuration layout}\label{sec:Device Types / Network Device / Device configuration layout / Legacy Interface: Device configuration layout}
-\label{sec:Device Types / Block Device / Feature bits / Device configuration layout / Legacy Interface: Device configuration layout}
-When using the legacy interface, transitional devices and drivers
-MUST format \field{status} and
-\field{max_virtqueue_pairs} in struct virtio_net_config
-according to the native endian of the guest rather than
-(necessarily when not using the legacy interface) little-endian.
-
-When using the legacy interface, \field{mac} is driver-writable
-which provided a way for drivers to update the MAC without
-negotiating VIRTIO_NET_F_CTRL_MAC_ADDR.
-
-\subsection{Device Initialization}\label{sec:Device Types / Network Device / Device Initialization}
-
-A driver would perform a typical initialization routine like so:
-
-\begin{enumerate}
-\item Identify and initialize the receive and
- transmission virtqueues, up to N of each kind. If
- VIRTIO_NET_F_MQ feature bit is negotiated,
- N=\field{max_virtqueue_pairs}, otherwise identify N=1.
-
-\item If the VIRTIO_NET_F_CTRL_VQ feature bit is negotiated,
- identify the control virtqueue.
-
-\item Fill the receive queues with buffers: see \ref{sec:Device Types / Network Device / Device Operation / Setting Up Receive Buffers}.
-
-\item Even with VIRTIO_NET_F_MQ, only receiveq1, transmitq1 and
- controlq are used by default. The driver would send the
- VIRTIO_NET_CTRL_MQ_VQ_PAIRS_SET command specifying the
- number of the transmit and receive queues to use.
-
-\item If the VIRTIO_NET_F_MAC feature bit is set, the configuration
- space \field{mac} entry indicates the ``physical'' address of the
- network card, otherwise the driver would typically generate a random
- local MAC address.
-
-\item If the VIRTIO_NET_F_STATUS feature bit is negotiated, the link
- status comes from the bottom bit of \field{status}.
- Otherwise, the driver assumes it's active.
-
-\item A performant driver would indicate that it will generate checksumless
- packets by negotiating the VIRTIO_NET_F_CSUM feature.
-
-\item If that feature is negotiated, a driver can use TCP segmentation or UDP
- segmentation/fragmentation offload by negotiating the VIRTIO_NET_F_HOST_TSO4 (IPv4
- TCP), VIRTIO_NET_F_HOST_TSO6 (IPv6 TCP), VIRTIO_NET_F_HOST_UFO
- (UDP fragmentation) and VIRTIO_NET_F_HOST_USO (UDP segmentation) features.
-
-\item The converse features are also available: a driver can save
- the virtual device some work by negotiating these features.\note{For example, a network packet transported between two guests on
-the same system might not need checksumming at all, nor segmentation,
-if both guests are amenable.}
- The VIRTIO_NET_F_GUEST_CSUM feature indicates that partially
- checksummed packets can be received, and if it can do that then
- the VIRTIO_NET_F_GUEST_TSO4, VIRTIO_NET_F_GUEST_TSO6,
- VIRTIO_NET_F_GUEST_UFO, VIRTIO_NET_F_GUEST_ECN, VIRTIO_NET_F_GUEST_USO4
- and VIRTIO_NET_F_GUEST_USO6 are the input equivalents of the features described above.
- See \ref{sec:Device Types / Network Device / Device Operation /
-Setting Up Receive Buffers}~\nameref{sec:Device Types / Network
-Device / Device Operation / Setting Up Receive Buffers} and
-\ref{sec:Device Types / Network Device / Device Operation /
-Processing of Incoming Packets}~\nameref{sec:Device Types /
-Network Device / Device Operation / Processing of Incoming Packets} below.
-\end{enumerate}
-
-A truly minimal driver would only accept VIRTIO_NET_F_MAC and ignore
-everything else.
-
-\subsection{Device Operation}\label{sec:Device Types / Network Device / Device Operation}
-
-Packets are transmitted by placing them in the
-transmitq1\ldots transmitqN, and buffers for incoming packets are
-placed in the receiveq1\ldots receiveqN. In each case, the packet
-itself is preceded by a header:
-
-\begin{lstlisting}
-struct virtio_net_hdr {
-#define VIRTIO_NET_HDR_F_NEEDS_CSUM 1
-#define VIRTIO_NET_HDR_F_DATA_VALID 2
-#define VIRTIO_NET_HDR_F_RSC_INFO 4
- u8 flags;
-#define VIRTIO_NET_HDR_GSO_NONE 0
-#define VIRTIO_NET_HDR_GSO_TCPV4 1
-#define VIRTIO_NET_HDR_GSO_UDP 3
-#define VIRTIO_NET_HDR_GSO_TCPV6 4
-#define VIRTIO_NET_HDR_GSO_UDP_L4 5
-#define VIRTIO_NET_HDR_GSO_ECN 0x80
- u8 gso_type;
- le16 hdr_len;
- le16 gso_size;
- le16 csum_start;
- le16 csum_offset;
- le16 num_buffers;
- le32 hash_value; (Only if VIRTIO_NET_F_HASH_REPORT negotiated)
- le16 hash_report; (Only if VIRTIO_NET_F_HASH_REPORT negotiated)
- le16 padding_reserved; (Only if VIRTIO_NET_F_HASH_REPORT negotiated)
-};
-\end{lstlisting}
-
-The controlq is used to control device features such as
-filtering.
-
-\subsubsection{Legacy Interface: Device Operation}\label{sec:Device Types / Network Device / Device Operation / Legacy Interface: Device Operation}
-When using the legacy interface, transitional devices and drivers
-MUST format the fields in struct virtio_net_hdr
-according to the native endian of the guest rather than
-(necessarily when not using the legacy interface) little-endian.
-
-The legacy driver only presented \field{num_buffers} in the struct virtio_net_hdr
-when VIRTIO_NET_F_MRG_RXBUF was negotiated; without that feature the
-structure was 2 bytes shorter.
-
-When using the legacy interface, the driver SHOULD ignore the
-used length for the transmit queues
-and the controlq queue.
-\begin{note}
-Historically, some devices put
-the total descriptor length there, even though no data was
-actually written.
-\end{note}
-
-\subsubsection{Packet Transmission}\label{sec:Device Types / Network Device / Device Operation / Packet Transmission}
-
-Transmitting a single packet is simple, but varies depending on
-the different features the driver negotiated.
-
-\begin{enumerate}
-\item The driver can send a completely checksummed packet. In this case,
- \field{flags} will be zero, and \field{gso_type} will be VIRTIO_NET_HDR_GSO_NONE.
-
-\item If the driver negotiated VIRTIO_NET_F_CSUM, it can skip
- checksumming the packet:
- \begin{itemize}
- \item \field{flags} has the VIRTIO_NET_HDR_F_NEEDS_CSUM set,
-
- \item \field{csum_start} is set to the offset within the packet to begin checksumming,
- and
-
- \item \field{csum_offset} indicates how many bytes after the csum_start the
- new (16 bit ones' complement) checksum is placed by the device.
-
- \item The TCP checksum field in the packet is set to the sum
- of the TCP pseudo header, so that replacing it by the ones'
- complement checksum of the TCP header and body will give the
- correct result.
- \end{itemize}
-
-\begin{note}
-For example, consider a partially checksummed TCP (IPv4) packet.
-It will have a 14 byte ethernet header and 20 byte IP header
-followed by the TCP header (with the TCP checksum field 16 bytes
-into that header). \field{csum_start} will be 14+20 = 34 (the TCP
-checksum includes the header), and \field{csum_offset} will be 16.
-\end{note}
-
-\item If the driver negotiated
- VIRTIO_NET_F_HOST_TSO4, TSO6, USO or UFO, and the packet requires
- TCP segmentation, UDP segmentation or fragmentation, then \field{gso_type}
- is set to VIRTIO_NET_HDR_GSO_TCPV4, TCPV6, UDP_L4 or UDP.
- (Otherwise, it is set to VIRTIO_NET_HDR_GSO_NONE). In this
- case, packets larger than 1514 bytes can be transmitted: the
- metadata indicates how to replicate the packet header to cut it
- into smaller packets. The other gso fields are set:
-
- \begin{itemize}
- \item If the VIRTIO_NET_F_GUEST_HDRLEN feature has been negotiated,
- \field{hdr_len} indicates the header length that needs to be replicated
- for each packet. It's the number of bytes from the beginning of the packet
- to the beginning of the transport payload.
- Otherwise, if the VIRTIO_NET_F_GUEST_HDRLEN feature has not been negotiated,
- \field{hdr_len} is a hint to the device as to how much of the header
- needs to be kept to copy into each packet, usually set to the
- length of the headers, including the transport header\footnote{Due to various bugs in implementations, this field is not useful
-as a guarantee of the transport header size.
-}.
-
- \begin{note}
- Some devices benefit from knowledge of the exact header length.
- \end{note}
-
- \item \field{gso_size} is the maximum size of each packet beyond that
- header (ie. MSS).
-
- \item If the driver negotiated the VIRTIO_NET_F_HOST_ECN feature,
- the VIRTIO_NET_HDR_GSO_ECN bit in \field{gso_type}
- indicates that the TCP packet has the ECN bit set\footnote{This case is not handled by some older hardware, so is called out
-specifically in the protocol.}.
- \end{itemize}
-
-\item \field{num_buffers} is set to zero. This field is unused on transmitted packets.
-
-\item The header and packet are added as one output descriptor to the
- transmitq, and the device is notified of the new entry
- (see \ref{sec:Device Types / Network Device / Device Initialization}~\nameref{sec:Device Types / Network Device / Device Initialization}).
-\end{enumerate}
-
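The partial-checksum step above can be illustrated from the device's side: sum everything from \field{csum_start} to the end of the packet (including the pseudo-header sum the driver left in the checksum field) and store the ones' complement at \field{csum_start} + \field{csum_offset}. This is a minimal sketch with hypothetical function names, not code from any particular implementation.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* 16-bit ones' complement checksum over buf[0..len), big-endian words.
 * The 32-bit accumulator is sufficient for packets up to 64 KiB. */
static uint16_t ones_complement_csum(const uint8_t *buf, size_t len)
{
    uint32_t sum = 0;
    size_t i;

    for (i = 0; i + 1 < len; i += 2)
        sum += ((uint32_t)buf[i] << 8) | buf[i + 1];
    if (i < len)                      /* odd trailing byte, padded with zero */
        sum += (uint32_t)buf[i] << 8;
    while (sum >> 16)                 /* fold carries back into 16 bits */
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)~sum;
}

/* Completes a packet sent with VIRTIO_NET_HDR_F_NEEDS_CSUM.
 * Returns false if the checksum field would fall outside the packet. */
static bool complete_partial_csum(uint8_t *pkt, size_t len,
                                  uint16_t csum_start, uint16_t csum_offset)
{
    size_t field = (size_t)csum_start + csum_offset;
    uint16_t csum;

    if (field + 2 > len)
        return false;
    csum = ones_complement_csum(pkt + csum_start, len - csum_start);
    pkt[field] = (uint8_t)(csum >> 8);
    pkt[field + 1] = (uint8_t)(csum & 0xff);
    return true;
}
```

For the TCP over IPv4 example in the note, the call would use csum_start 34 and csum_offset 16, so the result lands at byte 50 of the frame.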
-\drivernormative{\paragraph}{Packet Transmission}{Device Types / Network Device / Device Operation / Packet Transmission}
-
-The driver MUST set \field{num_buffers} to zero.
-
-If VIRTIO_NET_F_CSUM is not negotiated, the driver MUST set
-\field{flags} to zero and SHOULD supply a fully checksummed
-packet to the device.
-
-If VIRTIO_NET_F_HOST_TSO4 is negotiated, the driver MAY set
-\field{gso_type} to VIRTIO_NET_HDR_GSO_TCPV4 to request TCPv4
-segmentation, otherwise the driver MUST NOT set
-\field{gso_type} to VIRTIO_NET_HDR_GSO_TCPV4.
-
-If VIRTIO_NET_F_HOST_TSO6 is negotiated, the driver MAY set
-\field{gso_type} to VIRTIO_NET_HDR_GSO_TCPV6 to request TCPv6
-segmentation, otherwise the driver MUST NOT set
-\field{gso_type} to VIRTIO_NET_HDR_GSO_TCPV6.
-
-If VIRTIO_NET_F_HOST_UFO is negotiated, the driver MAY set
-\field{gso_type} to VIRTIO_NET_HDR_GSO_UDP to request UDP
-fragmentation, otherwise the driver MUST NOT set
-\field{gso_type} to VIRTIO_NET_HDR_GSO_UDP.
-
-If VIRTIO_NET_F_HOST_USO is negotiated, the driver MAY set
-\field{gso_type} to VIRTIO_NET_HDR_GSO_UDP_L4 to request UDP
-segmentation, otherwise the driver MUST NOT set
-\field{gso_type} to VIRTIO_NET_HDR_GSO_UDP_L4.
-
-The driver SHOULD NOT send to the device TCP packets requiring segmentation offload
-which have the Explicit Congestion Notification bit set, unless the
-VIRTIO_NET_F_HOST_ECN feature is negotiated, in which case the
-driver MUST set the VIRTIO_NET_HDR_GSO_ECN bit in
-\field{gso_type}.
-
-If the VIRTIO_NET_F_CSUM feature has been negotiated, the
-driver MAY set the VIRTIO_NET_HDR_F_NEEDS_CSUM bit in
-\field{flags}, if so:
-\begin{enumerate}
-\item the driver MUST validate the packet checksum at
- offset \field{csum_offset} from \field{csum_start} as well as all
- preceding offsets;
-\item the driver MUST set the packet checksum stored in the
- buffer to the TCP/UDP pseudo header;
-\item the driver MUST set \field{csum_start} and
- \field{csum_offset} such that calculating a ones'
- complement checksum from \field{csum_start} up until the end of
- the packet and storing the result at offset \field{csum_offset}
- from \field{csum_start} will result in a fully checksummed
- packet;
-\end{enumerate}
-
-If none of the VIRTIO_NET_F_HOST_TSO4, TSO6, USO or UFO options have
-been negotiated, the driver MUST set \field{gso_type} to
-VIRTIO_NET_HDR_GSO_NONE.
-
-If \field{gso_type} differs from VIRTIO_NET_HDR_GSO_NONE, then
-the driver MUST also set the VIRTIO_NET_HDR_F_NEEDS_CSUM bit in
-\field{flags} and MUST set \field{gso_size} to indicate the
-desired MSS.
-
-If one of the VIRTIO_NET_F_HOST_TSO4, TSO6, USO or UFO options has
-been negotiated:
-\begin{itemize}
-\item If the VIRTIO_NET_F_GUEST_HDRLEN feature has been negotiated,
- and \field{gso_type} differs from VIRTIO_NET_HDR_GSO_NONE,
- the driver MUST set \field{hdr_len} to a value equal to the length
- of the headers, including the transport header.
-
-\item If the VIRTIO_NET_F_GUEST_HDRLEN feature has not been negotiated,
- or \field{gso_type} is VIRTIO_NET_HDR_GSO_NONE,
- the driver SHOULD set \field{hdr_len} to a value
- not less than the length of the headers, including the transport
- header.
-\end{itemize}
-
-The driver SHOULD accept the VIRTIO_NET_F_GUEST_HDRLEN feature if it has
-been offered, and if it's able to provide the exact header length.
-
-The driver MUST NOT set the VIRTIO_NET_HDR_F_DATA_VALID and
-VIRTIO_NET_HDR_F_RSC_INFO bits in \field{flags}.
-
-\devicenormative{\paragraph}{Packet Transmission}{Device Types / Network Device / Device Operation / Packet Transmission}
-The device MUST ignore \field{flags} bits that it does not recognize.
-
-If VIRTIO_NET_HDR_F_NEEDS_CSUM bit in \field{flags} is not set, the
-device MUST NOT use the \field{csum_start} and \field{csum_offset}.
-
-If one of the VIRTIO_NET_F_HOST_TSO4, TSO6, USO or UFO options has
-been negotiated:
-\begin{itemize}
-\item If the VIRTIO_NET_F_GUEST_HDRLEN feature has been negotiated,
- and \field{gso_type} differs from VIRTIO_NET_HDR_GSO_NONE,
- the device MAY use \field{hdr_len} as the transport header size.
-
- \begin{note}
- Caution should be taken by the implementation so as to prevent
- a malicious driver from attacking the device by setting an incorrect hdr_len.
- \end{note}
-
-\item If the VIRTIO_NET_F_GUEST_HDRLEN feature has not been negotiated,
- or \field{gso_type} is VIRTIO_NET_HDR_GSO_NONE,
- the device MAY use \field{hdr_len} only as a hint about the
- transport header size.
- The device MUST NOT rely on \field{hdr_len} to be correct.
-
- \begin{note}
- This is due to various bugs in implementations.
- \end{note}
-\end{itemize}
-
-If VIRTIO_NET_HDR_F_NEEDS_CSUM is not set, the device MUST NOT
-rely on the packet checksum being correct.
-\paragraph{Packet Transmission Interrupt}\label{sec:Device Types / Network Device / Device Operation / Packet Transmission / Packet Transmission Interrupt}
-
-Often a driver will suppress transmission virtqueue interrupts
-and check for used packets in the transmit path of following
-packets.
-
-The normal behavior in this interrupt handler is to retrieve
-used buffers from the virtqueue and free the corresponding
-headers and packets.
-
-\subsubsection{Setting Up Receive Buffers}\label{sec:Device Types / Network Device / Device Operation / Setting Up Receive Buffers}
-
-It is generally a good idea to keep the receive virtqueue as
-fully populated as possible: if it runs out, network performance
-will suffer.
-
-If the VIRTIO_NET_F_GUEST_TSO4, VIRTIO_NET_F_GUEST_TSO6,
-VIRTIO_NET_F_GUEST_UFO, VIRTIO_NET_F_GUEST_USO4 or VIRTIO_NET_F_GUEST_USO6
-features are used, the maximum incoming packet
-will be up to 65550 bytes long (the maximum size of a
-TCP or UDP packet, plus the 14 byte ethernet header), otherwise
-1514 bytes. The 12-byte struct virtio_net_hdr is prepended to this,
-making for 65562 or 1526 bytes.
-
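The buffer-size arithmetic above reduces to a single helper. The names are hypothetical, and the sketch assumes the 12-byte struct virtio_net_hdr, that is, without the fields that only exist when VIRTIO_NET_F_HASH_REPORT is negotiated.

```c
#include <stdbool.h>
#include <stddef.h>

/* Values taken from the text above; names are illustrative only. */
enum {
    NET_HDR_LEN       = 12,    /* struct virtio_net_hdr without hash fields */
    MAX_FRAME_LEN     = 1514,  /* largest frame without receive offloads */
    MAX_BIG_FRAME_LEN = 65550  /* largest frame with any GUEST_TSO4/TSO6/UFO/USO4/USO6 */
};

/* Smallest single receive buffer that holds the header plus the largest
 * packet, for the case where VIRTIO_NET_F_MRG_RXBUF is not negotiated. */
static size_t min_rx_buffer_len(bool guest_offloads)
{
    return NET_HDR_LEN + (guest_offloads ? MAX_BIG_FRAME_LEN : MAX_FRAME_LEN);
}
```

This yields 65562 bytes when any of the guest offload features is in use and 1526 bytes otherwise.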
-\drivernormative{\paragraph}{Setting Up Receive Buffers}{Device Types / Network Device / Device Operation / Setting Up Receive Buffers}
-
-\begin{itemize}
-\item If VIRTIO_NET_F_MRG_RXBUF is not negotiated:
- \begin{itemize}
- \item If VIRTIO_NET_F_GUEST_TSO4, VIRTIO_NET_F_GUEST_TSO6, VIRTIO_NET_F_GUEST_UFO,
- VIRTIO_NET_F_GUEST_USO4 or VIRTIO_NET_F_GUEST_USO6 are negotiated, the driver SHOULD populate
- the receive queue(s) with buffers of at least 65562 bytes.
- \item Otherwise, the driver SHOULD populate the receive queue(s)
- with buffers of at least 1526 bytes.
- \end{itemize}
-\item If VIRTIO_NET_F_MRG_RXBUF is negotiated, each buffer MUST be at
-least the size of the struct virtio_net_hdr.
-\end{itemize}
-
-\begin{note}
-Obviously each buffer can be split across multiple descriptor elements.
-\end{note}
-
-If VIRTIO_NET_F_MQ is negotiated, each of receiveq1\ldots receiveqN
-that will be used SHOULD be populated with receive buffers.
-
-\devicenormative{\paragraph}{Setting Up Receive Buffers}{Device Types / Network Device / Device Operation / Setting Up Receive Buffers}
-
-The device MUST set \field{num_buffers} to the number of descriptors used to
-hold the incoming packet.
-
-The device MUST use only a single descriptor if VIRTIO_NET_F_MRG_RXBUF
-was not negotiated.
-\begin{note}
-{This means that \field{num_buffers} will always be 1
-if VIRTIO_NET_F_MRG_RXBUF is not negotiated.}
-\end{note}
-
-\subsubsection{Processing of Incoming Packets}\label{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets}
-\label{sec:Device Types / Network Device / Device Operation / Processing of Packets}%old label for latexdiff
-
-When a packet is copied into a buffer in the receiveq, the
-optimal path is to disable further used buffer notifications for the
-receiveq and process packets until no more are found, then re-enable
-them.
-
-Processing incoming packets involves:
-
-\begin{enumerate}
-\item \field{num_buffers} indicates how many descriptors
- this packet is spread over (including this one): this will
- always be 1 if VIRTIO_NET_F_MRG_RXBUF was not negotiated.
- This allows receipt of large packets without having to allocate large
- buffers: a packet that does not fit in a single buffer can flow
- over to the next buffer, and so on. In this case, there will be
- at least \field{num_buffers} used buffers in the virtqueue, and the device
- chains them together to form a single packet in a way similar to
- how it would store it in a single buffer spread over multiple
- descriptors.
- The other buffers will not begin with a struct virtio_net_hdr.
-
-\item If
- \field{num_buffers} is one, then the entire packet will be
- contained within this buffer, immediately following the struct
- virtio_net_hdr.
-\item If the VIRTIO_NET_F_GUEST_CSUM feature was negotiated, the
- VIRTIO_NET_HDR_F_DATA_VALID bit in \field{flags} can be
- set: if so, the device has validated the packet checksum.
- In case of multiple encapsulated protocols, one level of checksums
- has been validated.
-\end{enumerate}
-
-Additionally, VIRTIO_NET_F_GUEST_CSUM, TSO4, TSO6, UFO and ECN
-features enable receive checksum, large receive offload and ECN
-support which are the input equivalents of the transmit checksum,
-transmit segmentation offloading and ECN features, as described
-in \ref{sec:Device Types / Network Device / Device Operation /
-Packet Transmission}:
-\begin{enumerate}
-\item If the VIRTIO_NET_F_GUEST_TSO4, TSO6, UFO, USO4 or USO6 options were
- negotiated, then \field{gso_type} MAY be something other than
- VIRTIO_NET_HDR_GSO_NONE, and \field{gso_size} field indicates the
- desired MSS (see Packet Transmission point 2).
-\item If the VIRTIO_NET_F_RSC_EXT option was negotiated (this
- implies one of VIRTIO_NET_F_GUEST_TSO4, TSO6), the
- device also processes duplicated ACK segments, reports the
- number of coalesced TCP segments in the \field{csum_start} field and the
- number of duplicated ACK segments in the \field{csum_offset} field,
- and sets the VIRTIO_NET_HDR_F_RSC_INFO bit in \field{flags}.
-\item If the VIRTIO_NET_F_GUEST_CSUM feature was negotiated, the
- VIRTIO_NET_HDR_F_NEEDS_CSUM bit in \field{flags} can be
- set: if so, the packet checksum at offset \field{csum_offset}
- from \field{csum_start} and any preceding checksums
- have been validated. The checksum on the packet is incomplete and
- if bit VIRTIO_NET_HDR_F_RSC_INFO is not set in \field{flags},
- then \field{csum_start} and \field{csum_offset} indicate how to calculate it
- (see Packet Transmission point 1).
-
-\end{enumerate}
-
-If applicable, the device calculates per-packet hash for incoming packets as
-defined in \ref{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Hash calculation for incoming packets}.
-
-If applicable, the device reports hash information for incoming packets as
-defined in \ref{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Hash reporting for incoming packets}.
-
-\devicenormative{\paragraph}{Processing of Incoming Packets}{Device Types / Network Device / Device Operation / Processing of Incoming Packets}
-\label{devicenormative:Device Types / Network Device / Device Operation / Processing of Packets}%old label for latexdiff
-
-If VIRTIO_NET_F_MRG_RXBUF has not been negotiated, the device MUST set
-\field{num_buffers} to 1.
-
-If VIRTIO_NET_F_MRG_RXBUF has been negotiated, the device MUST set
-\field{num_buffers} to indicate the number of buffers
-the packet (including the header) is spread over.
-
-If a receive packet is spread over multiple buffers, the device
-MUST use all buffers but the last (i.e. the first \field{num_buffers} -
-1 buffers) completely up to the full length of each buffer
-supplied by the driver.
-
-The device MUST use all buffers used by a single receive
-packet together, such that at least \field{num_buffers} are
-observed by the driver as used.
-
-If VIRTIO_NET_F_GUEST_CSUM is not negotiated, the device MUST set
-\field{flags} to zero and SHOULD supply a fully checksummed
-packet to the driver.
-
-If VIRTIO_NET_F_GUEST_TSO4 is not negotiated, the device MUST NOT set
-\field{gso_type} to VIRTIO_NET_HDR_GSO_TCPV4.
-
-If VIRTIO_NET_F_GUEST_UFO is not negotiated, the device MUST NOT set
-\field{gso_type} to VIRTIO_NET_HDR_GSO_UDP.
-
-If VIRTIO_NET_F_GUEST_TSO6 is not negotiated, the device MUST NOT set
-\field{gso_type} to VIRTIO_NET_HDR_GSO_TCPV6.
-
-If neither VIRTIO_NET_F_GUEST_USO4 nor VIRTIO_NET_F_GUEST_USO6 has been negotiated,
-the device MUST NOT set \field{gso_type} to VIRTIO_NET_HDR_GSO_UDP_L4.
-
-The device SHOULD NOT send to the driver TCP packets requiring segmentation offload
-which have the Explicit Congestion Notification bit set, unless the
-VIRTIO_NET_F_GUEST_ECN feature is negotiated, in which case the
-device MUST set the VIRTIO_NET_HDR_GSO_ECN bit in
-\field{gso_type}.
-
-If the VIRTIO_NET_F_GUEST_CSUM feature has been negotiated, the
-device MAY set the VIRTIO_NET_HDR_F_NEEDS_CSUM bit in
-\field{flags}; if so:
-\begin{enumerate}
-\item the device MUST validate the packet checksum at
- offset \field{csum_offset} from \field{csum_start} as well as all
- preceding offsets;
-\item the device MUST set the packet checksum stored in the
- receive buffer to the TCP/UDP pseudo header;
-\item the device MUST set \field{csum_start} and
- \field{csum_offset} such that calculating a ones'
- complement checksum from \field{csum_start} up until the
- end of the packet and storing the result at offset
- \field{csum_offset} from \field{csum_start} will result in a
- fully checksummed packet;
-\end{enumerate}
-
-If none of the VIRTIO_NET_F_GUEST_TSO4, TSO6, UFO, USO4 or USO6 options have
-been negotiated, the device MUST set \field{gso_type} to
-VIRTIO_NET_HDR_GSO_NONE.
-
-If \field{gso_type} differs from VIRTIO_NET_HDR_GSO_NONE, then
-the device MUST also set the VIRTIO_NET_HDR_F_NEEDS_CSUM bit in
-\field{flags} and MUST set \field{gso_size} to indicate the desired MSS.
-If VIRTIO_NET_F_RSC_EXT was negotiated, the device MUST also
-set VIRTIO_NET_HDR_F_RSC_INFO bit in \field{flags},
-set \field{csum_start} to number of coalesced TCP segments and
-set \field{csum_offset} to number of received duplicated ACK segments.
-
-If VIRTIO_NET_F_RSC_EXT was not negotiated, the device MUST NOT
-set the VIRTIO_NET_HDR_F_RSC_INFO bit in \field{flags}.
-
-If one of the VIRTIO_NET_F_GUEST_TSO4, TSO6, UFO, USO4 or USO6 options has
-been negotiated, the device SHOULD set \field{hdr_len} to a value
-not less than the length of the headers, including the transport
-header.
-
-If the VIRTIO_NET_F_GUEST_CSUM feature has been negotiated, the
-device MAY set the VIRTIO_NET_HDR_F_DATA_VALID bit in
-\field{flags}; if so, the device MUST validate the packet
-checksum (in case of multiple encapsulated protocols, one level
-of checksums is validated).
-
-\drivernormative{\paragraph}{Processing of Incoming
-Packets}{Device Types / Network Device / Device Operation /
-Processing of Incoming Packets}
-
-The driver MUST ignore \field{flags} bits that it does not recognize.
-
-If the VIRTIO_NET_HDR_F_NEEDS_CSUM bit in \field{flags} is not set or
-if the VIRTIO_NET_HDR_F_RSC_INFO bit in \field{flags} is set, the
-driver MUST NOT use the \field{csum_start} and \field{csum_offset}.
-
-If one of the VIRTIO_NET_F_GUEST_TSO4, TSO6, UFO, USO4 or USO6 options has
-been negotiated, the driver MAY use \field{hdr_len} only as a hint about the
-transport header size.
-The driver MUST NOT rely on \field{hdr_len} to be correct.
-\begin{note}
-This is due to various bugs in implementations.
-\end{note}
-
-If neither VIRTIO_NET_HDR_F_NEEDS_CSUM nor
-VIRTIO_NET_HDR_F_DATA_VALID is set, the driver MUST NOT
-rely on the packet checksum being correct.
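The driver-side rules above reduce to two checks on \field{flags}. The following is a minimal sketch, not normative text: the helper names are hypothetical, and the numeric flag values are assumed to be those given where struct virtio_net_hdr is defined, repeated here only to keep the example self-contained.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed flag values, taken from the struct virtio_net_hdr definition. */
#define VIRTIO_NET_HDR_F_NEEDS_CSUM 1
#define VIRTIO_NET_HDR_F_DATA_VALID 2
#define VIRTIO_NET_HDR_F_RSC_INFO   4

/* Hypothetical helper: true if csum_start and csum_offset may be read as
 * checksum offsets. They must not be used when NEEDS_CSUM is clear, or when
 * RSC_INFO is set (the fields then carry segment counts instead). */
static bool rx_csum_fields_usable(uint8_t flags)
{
    return (flags & VIRTIO_NET_HDR_F_NEEDS_CSUM) != 0 &&
           (flags & VIRTIO_NET_HDR_F_RSC_INFO) == 0;
}

/* Hypothetical helper: true if the driver must not rely on the packet
 * checksum being correct. Unrecognized bits are ignored by masking. */
static bool rx_csum_unverified(uint8_t flags)
{
    return (flags & (VIRTIO_NET_HDR_F_NEEDS_CSUM |
                     VIRTIO_NET_HDR_F_DATA_VALID)) == 0;
}
```

A receive path would typically call rx_csum_unverified() first and fall back to verifying the checksum in software when it returns true.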
-
-\paragraph{Hash calculation for incoming packets}
-\label{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Hash calculation for incoming packets}
-
-A device attempts to calculate a per-packet hash in the following cases:
-\begin{itemize}
-\item The feature VIRTIO_NET_F_RSS was negotiated. The device uses the hash to determine the receive virtqueue to place incoming packets.
-\item The feature VIRTIO_NET_F_HASH_REPORT was negotiated. The device reports the hash value and the hash type with the packet.
-\end{itemize}
-
-If the feature VIRTIO_NET_F_RSS was negotiated:
-\begin{itemize}
-\item The device uses \field{hash_types} of the virtio_net_rss_config structure as 'Enabled hash types' bitmask.
-\item The device uses a key as defined in \field{hash_key_data} and \field{hash_key_length} of the virtio_net_rss_config structure (see
-\ref{sec:Device Types / Network Device / Device Operation / Control Virtqueue / Receive-side scaling (RSS) / Setting RSS parameters}).
-\end{itemize}
-
-If the feature VIRTIO_NET_F_RSS was not negotiated:
-\begin{itemize}
-\item The device uses \field{hash_types} of the virtio_net_hash_config structure as 'Enabled hash types' bitmask.
-\item The device uses a key as defined in \field{hash_key_data} and \field{hash_key_length} of the virtio_net_hash_config structure (see
-\ref{sec:Device Types / Network Device / Device Operation / Control Virtqueue / Automatic receive steering in multiqueue mode / Hash calculation}).
-\end{itemize}
-
-Note that if the device offers VIRTIO_NET_F_HASH_REPORT, even if it supports only one pair of virtqueues, it MUST support
-at least one of the commands of the VIRTIO_NET_CTRL_MQ class to configure reported hash parameters:
-\begin{itemize}
-\item If the device offers VIRTIO_NET_F_RSS, it MUST support VIRTIO_NET_CTRL_MQ_RSS_CONFIG command per
- \ref{sec:Device Types / Network Device / Device Operation / Control Virtqueue / Receive-side scaling (RSS) / Setting RSS parameters}.
-\item Otherwise the device MUST support VIRTIO_NET_CTRL_MQ_HASH_CONFIG command per
- \ref{sec:Device Types / Network Device / Device Operation / Control Virtqueue / Automatic receive steering in multiqueue mode / Hash calculation}.
-\end{itemize}
-
-\subparagraph{Supported/enabled hash types}
-\label{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Hash calculation for incoming packets / Supported/enabled hash types}
-Hash types applicable for IPv4 packets:
-\begin{lstlisting}
-#define VIRTIO_NET_HASH_TYPE_IPv4 (1 << 0)
-#define VIRTIO_NET_HASH_TYPE_TCPv4 (1 << 1)
-#define VIRTIO_NET_HASH_TYPE_UDPv4 (1 << 2)
-\end{lstlisting}
-Hash types applicable for IPv6 packets without extension headers:
-\begin{lstlisting}
-#define VIRTIO_NET_HASH_TYPE_IPv6 (1 << 3)
-#define VIRTIO_NET_HASH_TYPE_TCPv6 (1 << 4)
-#define VIRTIO_NET_HASH_TYPE_UDPv6 (1 << 5)
-\end{lstlisting}
-Hash types applicable for IPv6 packets with extension headers:
-\begin{lstlisting}
-#define VIRTIO_NET_HASH_TYPE_IP_EX (1 << 6)
-#define VIRTIO_NET_HASH_TYPE_TCP_EX (1 << 7)
-#define VIRTIO_NET_HASH_TYPE_UDP_EX (1 << 8)
-\end{lstlisting}
-
-\subparagraph{IPv4 packets}
-\label{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Hash calculation for incoming packets / IPv4 packets}
-The device calculates the hash on IPv4 packets according to 'Enabled hash types' bitmask as follows:
-\begin{itemize}
-\item If VIRTIO_NET_HASH_TYPE_TCPv4 is set and the packet has
-a TCP header, the hash is calculated over the following fields:
-\begin{itemize}
-\item Source IP address
-\item Destination IP address
-\item Source TCP port
-\item Destination TCP port
-\end{itemize}
-\item Else if VIRTIO_NET_HASH_TYPE_UDPv4 is set and the
-packet has a UDP header, the hash is calculated over the following fields:
-\begin{itemize}
-\item Source IP address
-\item Destination IP address
-\item Source UDP port
-\item Destination UDP port
-\end{itemize}
-\item Else if VIRTIO_NET_HASH_TYPE_IPv4 is set, the hash is
-calculated over the following fields:
-\begin{itemize}
-\item Source IP address
-\item Destination IP address
-\end{itemize}
-\item Else the device does not calculate the hash
-\end{itemize}
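The IPv4 selection order above (TCP first, then UDP, then plain IP) can be sketched as a small C function. The function name and the two boolean parameters are hypothetical; a real device would derive them from parsing the packet. The sketch only picks the hash type and does not compute the hash itself.

```c
#include <stdbool.h>
#include <stdint.h>

#define VIRTIO_NET_HASH_TYPE_IPv4  (1 << 0)
#define VIRTIO_NET_HASH_TYPE_TCPv4 (1 << 1)
#define VIRTIO_NET_HASH_TYPE_UDPv4 (1 << 2)

/* Hypothetical helper: returns the one hash type used for an IPv4 packet,
 * or 0 when the device does not calculate a hash.
 * 'enabled' is the 'Enabled hash types' bitmask. */
static uint32_t select_ipv4_hash_type(uint32_t enabled,
                                      bool has_tcp_header,
                                      bool has_udp_header)
{
    if ((enabled & VIRTIO_NET_HASH_TYPE_TCPv4) && has_tcp_header)
        return VIRTIO_NET_HASH_TYPE_TCPv4; /* addresses + TCP ports */
    if ((enabled & VIRTIO_NET_HASH_TYPE_UDPv4) && has_udp_header)
        return VIRTIO_NET_HASH_TYPE_UDPv4; /* addresses + UDP ports */
    if (enabled & VIRTIO_NET_HASH_TYPE_IPv4)
        return VIRTIO_NET_HASH_TYPE_IPv4;  /* addresses only */
    return 0;                              /* no hash calculated */
}
```

Note the fall-through: a TCP packet received while only VIRTIO_NET_HASH_TYPE_IPv4 and VIRTIO_NET_HASH_TYPE_UDPv4 are enabled is hashed over the two IP addresses only.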
-
-\subparagraph{IPv6 packets without extension header}
-\label{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Hash calculation for incoming packets / IPv6 packets without extension header}
-The device calculates the hash on IPv6 packets without extension
-headers according to 'Enabled hash types' bitmask as follows:
-\begin{itemize}
-\item If VIRTIO_NET_HASH_TYPE_TCPv6 is set and the packet has
-a TCPv6 header, the hash is calculated over the following fields:
-\begin{itemize}
-\item Source IPv6 address
-\item Destination IPv6 address
-\item Source TCP port
-\item Destination TCP port
-\end{itemize}
-\item Else if VIRTIO_NET_HASH_TYPE_UDPv6 is set and the
-packet has a UDPv6 header, the hash is calculated over the following fields:
-\begin{itemize}
-\item Source IPv6 address
-\item Destination IPv6 address
-\item Source UDP port
-\item Destination UDP port
-\end{itemize}
-\item Else if VIRTIO_NET_HASH_TYPE_IPv6 is set, the hash is
-calculated over the following fields:
-\begin{itemize}
-\item Source IPv6 address
-\item Destination IPv6 address
-\end{itemize}
-\item Else the device does not calculate the hash
-\end{itemize}
-
-\subparagraph{IPv6 packets with extension header}
-\label{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Hash calculation for incoming packets / IPv6 packets with extension header}
-The device calculates the hash on IPv6 packets with extension
-headers according to 'Enabled hash types' bitmask as follows:
-\begin{itemize}
-\item If VIRTIO_NET_HASH_TYPE_TCP_EX is set and the packet
-has a TCPv6 header, the hash is calculated over the following fields:
-\begin{itemize}
-\item Home address from the home address option in the IPv6 destination options header. If the extension header is not present, use the Source IPv6 address.
-\item IPv6 address that is contained in the Routing-Header-Type-2 from the associated extension header. If the extension header is not present, use the Destination IPv6 address.
-\item Source TCP port
-\item Destination TCP port
-\end{itemize}
-\item Else if VIRTIO_NET_HASH_TYPE_UDP_EX is set and the
-packet has a UDPv6 header, the hash is calculated over the following fields:
-\begin{itemize}
-\item Home address from the home address option in the IPv6 destination options header. If the extension header is not present, use the Source IPv6 address.
-\item IPv6 address that is contained in the Routing-Header-Type-2 from the associated extension header. If the extension header is not present, use the Destination IPv6 address.
-\item Source UDP port
-\item Destination UDP port
-\end{itemize}
-\item Else if VIRTIO_NET_HASH_TYPE_IP_EX is set, the hash is
-calculated over the following fields:
-\begin{itemize}
-\item Home address from the home address option in the IPv6 destination options header. If the extension header is not present, use the Source IPv6 address.
-\item IPv6 address that is contained in the Routing-Header-Type-2 from the associated extension header. If the extension header is not present, use the Destination IPv6 address.
-\end{itemize}
-\item Else skip IPv6 extension headers and calculate the hash as
-defined for an IPv6 packet without extension headers
-(see \ref{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Hash calculation for incoming packets / IPv6 packets without extension header}).
-\end{itemize}
-
-\paragraph{Hash reporting for incoming packets}
-\label{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Hash reporting for incoming packets}
-
-If VIRTIO_NET_F_HASH_REPORT was negotiated and
- the device has calculated the hash for the packet, the device fills \field{hash_report} with the report type of calculated hash
-and \field{hash_value} with the value of calculated hash.
-
-If VIRTIO_NET_F_HASH_REPORT was negotiated but due to any reason the
-hash was not calculated, the device sets \field{hash_report} to VIRTIO_NET_HASH_REPORT_NONE.
-
-Possible values that the device can report in \field{hash_report} are defined below.
-They correspond to supported hash types defined in
-\ref{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Hash calculation for incoming packets / Supported/enabled hash types}
-as follows:
-
-VIRTIO_NET_HASH_TYPE_XXX = 1 << (VIRTIO_NET_HASH_REPORT_XXX - 1)
-
-\begin{lstlisting}
-#define VIRTIO_NET_HASH_REPORT_NONE 0
-#define VIRTIO_NET_HASH_REPORT_IPv4 1
-#define VIRTIO_NET_HASH_REPORT_TCPv4 2
-#define VIRTIO_NET_HASH_REPORT_UDPv4 3
-#define VIRTIO_NET_HASH_REPORT_IPv6 4
-#define VIRTIO_NET_HASH_REPORT_TCPv6 5
-#define VIRTIO_NET_HASH_REPORT_UDPv6 6
-#define VIRTIO_NET_HASH_REPORT_IPv6_EX 7
-#define VIRTIO_NET_HASH_REPORT_TCPv6_EX 8
-#define VIRTIO_NET_HASH_REPORT_UDPv6_EX 9
-\end{lstlisting}
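The relation VIRTIO_NET_HASH_TYPE_XXX = 1 << (VIRTIO_NET_HASH_REPORT_XXX - 1) can be checked with a short C sketch. The function name is hypothetical, and returning 0 for VIRTIO_NET_HASH_REPORT_NONE and for out-of-range values is a choice made for this example rather than something the text requires.

```c
#include <stdint.h>

#define VIRTIO_NET_HASH_REPORT_NONE     0
#define VIRTIO_NET_HASH_REPORT_UDPv6_EX 9

/* Hypothetical helper: map a hash_report value to the matching
 * VIRTIO_NET_HASH_TYPE_* bit. Returns 0 for NONE and for values
 * above the last defined report type. */
static uint32_t hash_type_from_report(uint8_t report)
{
    if (report == VIRTIO_NET_HASH_REPORT_NONE ||
        report > VIRTIO_NET_HASH_REPORT_UDPv6_EX)
        return 0;
    return UINT32_C(1) << (report - 1);
}
```

For example, VIRTIO_NET_HASH_REPORT_TCPv4 (2) maps to bit 1, which is VIRTIO_NET_HASH_TYPE_TCPv4.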
-
-\subsubsection{Control Virtqueue}\label{sec:Device Types / Network Device / Device Operation / Control Virtqueue}
-
-The driver uses the control virtqueue (if VIRTIO_NET_F_CTRL_VQ is
-negotiated) to send commands to manipulate various features of
-the device which would not easily map into the configuration
-space.
-
-All commands are of the following form:
-
-\begin{lstlisting}
-struct virtio_net_ctrl {
- u8 class;
- u8 command;
- u8 command-specific-data[];
- u8 ack;
-};
-
-/* ack values */
-#define VIRTIO_NET_OK 0
-#define VIRTIO_NET_ERR 1
-\end{lstlisting}
-
-The \field{class}, \field{command} and command-specific-data are set by the
-driver, and the device sets the \field{ack} byte. There is little the driver can
-do except issue a diagnostic if \field{ack} is not
-VIRTIO_NET_OK.
-
-\paragraph{Packet Receive Filtering}\label{sec:Device Types / Network Device / Device Operation / Control Virtqueue / Packet Receive Filtering}
-\label{sec:Device Types / Network Device / Device Operation / Control Virtqueue / Setting Promiscuous Mode}%old label for latexdiff
-
-If the VIRTIO_NET_F_CTRL_RX and VIRTIO_NET_F_CTRL_RX_EXTRA
-features are negotiated, the driver can send control commands for
-promiscuous mode, multicast, unicast and broadcast receiving.
-
-\begin{note}
-In general, these commands are best-effort: unwanted
-packets could still arrive.
-\end{note}
-
-\begin{lstlisting}
-#define VIRTIO_NET_CTRL_RX 0
- #define VIRTIO_NET_CTRL_RX_PROMISC 0
- #define VIRTIO_NET_CTRL_RX_ALLMULTI 1
- #define VIRTIO_NET_CTRL_RX_ALLUNI 2
- #define VIRTIO_NET_CTRL_RX_NOMULTI 3
- #define VIRTIO_NET_CTRL_RX_NOUNI 4
- #define VIRTIO_NET_CTRL_RX_NOBCAST 5
-\end{lstlisting}
-
-
-\devicenormative{\subparagraph}{Packet Receive Filtering}{Device Types / Network Device / Device Operation / Control Virtqueue / Packet Receive Filtering}
-
-If the VIRTIO_NET_F_CTRL_RX feature has been negotiated,
-the device MUST support the following VIRTIO_NET_CTRL_RX class
-commands:
-\begin{itemize}
-\item VIRTIO_NET_CTRL_RX_PROMISC turns promiscuous mode on and
-off. The command-specific-data is one byte containing 0 (off) or
-1 (on). If promiscuous mode is on, the device SHOULD receive all
-incoming packets.
-This SHOULD take effect even if one of the other modes set by
-a VIRTIO_NET_CTRL_RX class command is on.
-\item VIRTIO_NET_CTRL_RX_ALLMULTI turns all-multicast receive on and
-off. The command-specific-data is one byte containing 0 (off) or
-1 (on). When all-multicast receive is on the device SHOULD allow
-all incoming multicast packets.
-\end{itemize}
-
-If the VIRTIO_NET_F_CTRL_RX_EXTRA feature has been negotiated,
-the device MUST support the following VIRTIO_NET_CTRL_RX class
-commands:
-\begin{itemize}
-\item VIRTIO_NET_CTRL_RX_ALLUNI turns all-unicast receive on and
-off. The command-specific-data is one byte containing 0 (off) or
-1 (on). When all-unicast receive is on the device SHOULD allow
-all incoming unicast packets.
-\item VIRTIO_NET_CTRL_RX_NOMULTI suppresses multicast receive.
-The command-specific-data is one byte containing 0 (multicast
-receive allowed) or 1 (multicast receive suppressed).
-When multicast receive is suppressed, the device SHOULD NOT
-send multicast packets to the driver.
-This SHOULD take effect even if VIRTIO_NET_CTRL_RX_ALLMULTI is on.
-This filter SHOULD NOT apply to broadcast packets.
-\item VIRTIO_NET_CTRL_RX_NOUNI suppresses unicast receive.
-The command-specific-data is one byte containing 0 (unicast
-receive allowed) or 1 (unicast receive suppressed).
-When unicast receive is suppressed, the device SHOULD NOT
-send unicast packets to the driver.
-This SHOULD take effect even if VIRTIO_NET_CTRL_RX_ALLUNI is on.
-\item VIRTIO_NET_CTRL_RX_NOBCAST suppresses broadcast receive.
-The command-specific-data is one byte containing 0 (broadcast
-receive allowed) or 1 (broadcast receive suppressed).
-When broadcast receive is suppressed, the device SHOULD NOT
-send broadcast packets to the driver.
-This SHOULD take effect even if VIRTIO_NET_CTRL_RX_ALLMULTI is on.
-\end{itemize}
-
-\drivernormative{\subparagraph}{Packet Receive Filtering}{Device Types / Network Device / Device Operation / Control Virtqueue / Packet Receive Filtering}
-
-If the VIRTIO_NET_F_CTRL_RX feature has not been negotiated,
-the driver MUST NOT issue commands VIRTIO_NET_CTRL_RX_PROMISC or
-VIRTIO_NET_CTRL_RX_ALLMULTI.
-
-If the VIRTIO_NET_F_CTRL_RX_EXTRA feature has not been negotiated,
-the driver MUST NOT issue commands
- VIRTIO_NET_CTRL_RX_ALLUNI,
- VIRTIO_NET_CTRL_RX_NOMULTI,
- VIRTIO_NET_CTRL_RX_NOUNI or
- VIRTIO_NET_CTRL_RX_NOBCAST.
-
-\paragraph{Setting MAC Address Filtering}\label{sec:Device Types / Network Device / Device Operation / Control Virtqueue / Setting MAC Address Filtering}
-
-If the VIRTIO_NET_F_CTRL_RX feature is negotiated, the driver can
-send control commands for MAC address filtering.
-
-\begin{lstlisting}
-struct virtio_net_ctrl_mac {
- le32 entries;
- u8 macs[entries][6];
-};
-
-#define VIRTIO_NET_CTRL_MAC 1
- #define VIRTIO_NET_CTRL_MAC_TABLE_SET 0
- #define VIRTIO_NET_CTRL_MAC_ADDR_SET 1
-\end{lstlisting}
-
-The device can filter incoming packets by any number of destination
-MAC addresses\footnote{Since there are no guarantees, it can use a hash filter or
-silently switch to allmulti or promiscuous mode if it is given too
-many addresses.
-}. This table is set using the class
-VIRTIO_NET_CTRL_MAC and the command VIRTIO_NET_CTRL_MAC_TABLE_SET. The
-command-specific-data is two variable length tables of 6-byte MAC
-addresses (as described in struct virtio_net_ctrl_mac). The first table contains unicast addresses, and the second
-contains multicast addresses.
-
-The VIRTIO_NET_CTRL_MAC_ADDR_SET command is used to set the
-default MAC address which rx filtering
-accepts (and if VIRTIO_NET_F_MAC has been negotiated,
-this will be reflected in \field{mac} in config space).
-
-The command-specific-data for VIRTIO_NET_CTRL_MAC_ADDR_SET is
-the 6-byte MAC address.
-
-\devicenormative{\subparagraph}{Setting MAC Address Filtering}{Device Types / Network Device / Device Operation / Control Virtqueue / Setting MAC Address Filtering}
-
-The device MUST have an empty MAC filtering table on reset.
-
-The device MUST update the MAC filtering table before it consumes
-the VIRTIO_NET_CTRL_MAC_TABLE_SET command.
-
-The device MUST update \field{mac} in config space before it consumes
-the VIRTIO_NET_CTRL_MAC_ADDR_SET command, if VIRTIO_NET_F_MAC has
-been negotiated.
-
-The device SHOULD drop incoming packets which have a destination MAC which
-matches neither the \field{mac} (or that set with VIRTIO_NET_CTRL_MAC_ADDR_SET)
-nor the MAC filtering table.
-
-\drivernormative{\subparagraph}{Setting MAC Address Filtering}{Device Types / Network Device / Device Operation / Control Virtqueue / Setting MAC Address Filtering}
-
-If VIRTIO_NET_F_CTRL_RX has not been negotiated,
-the driver MUST NOT issue VIRTIO_NET_CTRL_MAC class commands.
-
-If VIRTIO_NET_F_CTRL_RX has been negotiated,
-the driver SHOULD issue VIRTIO_NET_CTRL_MAC_ADDR_SET
-to set the default MAC address if it is different from \field{mac}.
-
-The driver MUST follow the VIRTIO_NET_CTRL_MAC_TABLE_SET command
-by a le32 number, followed by that number of non-multicast
-MAC addresses, followed by another le32 number, followed by
-that number of multicast addresses. Either number MAY be 0.
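The command-specific-data layout for VIRTIO_NET_CTRL_MAC_TABLE_SET described above (a le32 count plus unicast addresses, then a le32 count plus multicast addresses) can be sketched as a serializer. All names here are hypothetical, the addresses are passed as flat byte arrays of 6 bytes each, and the \field{class}, \field{command} and \field{ack} bytes are outside the scope of the sketch.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAC_LEN 6

/* Store a 32-bit value in little-endian byte order. */
static void put_le32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v & 0xff);
    p[1] = (uint8_t)((v >> 8) & 0xff);
    p[2] = (uint8_t)((v >> 16) & 0xff);
    p[3] = (uint8_t)((v >> 24) & 0xff);
}

/* Hypothetical helper: write both MAC tables into buf.
 * uni/multi point to n_uni/n_multi consecutive 6-byte addresses.
 * Returns the number of bytes written, or 0 if buf is too small. */
static size_t build_mac_table_set(uint8_t *buf, size_t cap,
                                  const uint8_t *uni, uint32_t n_uni,
                                  const uint8_t *multi, uint32_t n_multi)
{
    size_t uni_bytes = (size_t)n_uni * MAC_LEN;
    size_t multi_bytes = (size_t)n_multi * MAC_LEN;
    size_t off = 0;

    if (4 + uni_bytes + 4 + multi_bytes > cap)
        return 0;

    put_le32(buf + off, n_uni);          /* first table: unicast */
    off += 4;
    if (n_uni)
        memcpy(buf + off, uni, uni_bytes);
    off += uni_bytes;

    put_le32(buf + off, n_multi);        /* second table: multicast */
    off += 4;
    if (n_multi)
        memcpy(buf + off, multi, multi_bytes);
    off += multi_bytes;

    return off;
}
```

Either count may be 0, in which case only the 4-byte count is emitted for that table.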
-
-\subparagraph{Legacy Interface: Setting MAC Address Filtering}\label{sec:Device Types / Network Device / Device Operation / Control Virtqueue / Setting MAC Address Filtering / Legacy Interface: Setting MAC Address Filtering}
-When using the legacy interface, transitional devices and drivers
-MUST format \field{entries} in struct virtio_net_ctrl_mac
-according to the native endian of the guest rather than
-(necessarily when not using the legacy interface) little-endian.
-
-Legacy drivers that didn't negotiate VIRTIO_NET_F_CTRL_MAC_ADDR
-changed \field{mac} in config space while the NIC was accepting
-incoming packets. These drivers always wrote the mac value from
-first to last byte, therefore after detecting such drivers,
-a transitional device MAY defer MAC update, or MAY defer
-processing incoming packets until the driver writes the last byte
-of \field{mac} in the config space.
-
-\paragraph{VLAN Filtering}\label{sec:Device Types / Network Device / Device Operation / Control Virtqueue / VLAN Filtering}
-
-If the driver negotiates the VIRTIO_NET_F_CTRL_VLAN feature, it
-can control a VLAN filter table in the device.
-
-\begin{note}
-Similar to the MAC address based filtering, the VLAN filtering
-is also best-effort: unwanted packets could still arrive.
-\end{note}
-
-\begin{lstlisting}
-#define VIRTIO_NET_CTRL_VLAN 2
- #define VIRTIO_NET_CTRL_VLAN_ADD 0
- #define VIRTIO_NET_CTRL_VLAN_DEL 1
-\end{lstlisting}
-
-Both the VIRTIO_NET_CTRL_VLAN_ADD and VIRTIO_NET_CTRL_VLAN_DEL
-commands take a little-endian 16-bit VLAN id as the command-specific-data.
-
-\subparagraph{Legacy Interface: VLAN Filtering}\label{sec:Device Types / Network Device / Device Operation / Control Virtqueue / VLAN Filtering / Legacy Interface: VLAN Filtering}
-When using the legacy interface, transitional devices and drivers
-MUST format the VLAN id
-according to the native endian of the guest rather than
-(necessarily when not using the legacy interface) little-endian.
-
-\paragraph{Gratuitous Packet Sending}\label{sec:Device Types / Network Device / Device Operation / Control Virtqueue / Gratuitous Packet Sending}
-
-If the driver negotiates the VIRTIO_NET_F_GUEST_ANNOUNCE feature (which depends
-on VIRTIO_NET_F_CTRL_VQ), the device can ask the driver to send gratuitous
-packets; this is usually done after the guest has been physically
-migrated, and needs to announce its presence on the new network
-links. (As the hypervisor does not know the guest's
-network configuration (e.g. tagged VLAN), it is simplest to prod
-the guest in this way.)
-
-\begin{lstlisting}
-#define VIRTIO_NET_CTRL_ANNOUNCE 3
- #define VIRTIO_NET_CTRL_ANNOUNCE_ACK 0
-\end{lstlisting}
-
-The driver checks the VIRTIO_NET_S_ANNOUNCE bit in the device configuration \field{status} field
-when it notices a change of device configuration. The
-command VIRTIO_NET_CTRL_ANNOUNCE_ACK is used to indicate that
-the driver has received the notification; the device then clears the
-VIRTIO_NET_S_ANNOUNCE bit in \field{status}.
-
-Processing this notification involves:
-
-\begin{enumerate}
-\item Sending the gratuitous packets (e.g. ARP), or marking that there are pending
- gratuitous packets to be sent and letting a deferred routine
- send them.
-
-\item Sending the VIRTIO_NET_CTRL_ANNOUNCE_ACK command through the control
- vq.
-\end{enumerate}
-
-\drivernormative{\subparagraph}{Gratuitous Packet Sending}{Device Types / Network Device / Device Operation / Control Virtqueue / Gratuitous Packet Sending}
-
-If the driver negotiates VIRTIO_NET_F_GUEST_ANNOUNCE, it SHOULD notify
-network peers of its new location after it sees the VIRTIO_NET_S_ANNOUNCE bit
-in \field{status}. The driver MUST send a command on the command queue
-with class VIRTIO_NET_CTRL_ANNOUNCE and command VIRTIO_NET_CTRL_ANNOUNCE_ACK.
-
-\devicenormative{\subparagraph}{Gratuitous Packet Sending}{Device Types / Network Device / Device Operation / Control Virtqueue / Gratuitous Packet Sending}
-
-If VIRTIO_NET_F_GUEST_ANNOUNCE is negotiated, the device MUST clear the
-VIRTIO_NET_S_ANNOUNCE bit in \field{status} upon receipt of a command buffer
-with class VIRTIO_NET_CTRL_ANNOUNCE and command VIRTIO_NET_CTRL_ANNOUNCE_ACK
-before marking the buffer as used.
-
-\paragraph{Device operation in multiqueue mode}\label{sec:Device Types / Network Device / Device Operation / Control Virtqueue / Device operation in multiqueue mode}
-
-This specification defines the following modes that a device MAY implement for operation with multiple transmit/receive virtqueues:
-\begin{itemize}
-\item Automatic receive steering as defined in \ref{sec:Device Types / Network Device / Device Operation / Control Virtqueue / Automatic receive steering in multiqueue mode}.
- If a device supports this mode, it offers the VIRTIO_NET_F_MQ feature bit.
-\item Receive-side scaling as defined in \ref{devicenormative:Device Types / Network Device / Device Operation / Control Virtqueue / Receive-side scaling (RSS) / RSS processing}.
- If a device supports this mode, it offers the VIRTIO_NET_F_RSS feature bit.
-\end{itemize}
-
-A device MAY support one of these features or both. The driver MAY negotiate any set of these features that the device supports.
-
-Multiqueue is disabled by default.
-
-The driver enables multiqueue by sending a command using \field{class} VIRTIO_NET_CTRL_MQ. The \field{command} selects the mode of multiqueue operation, as follows:
-\begin{lstlisting}
-#define VIRTIO_NET_CTRL_MQ 4
- #define VIRTIO_NET_CTRL_MQ_VQ_PAIRS_SET 0 /* for automatic receive steering */
- #define VIRTIO_NET_CTRL_MQ_RSS_CONFIG 1 /* for configurable receive steering */
- #define VIRTIO_NET_CTRL_MQ_HASH_CONFIG 2 /* for configurable hash calculation */
-\end{lstlisting}
-
-If more than one multiqueue mode is negotiated, the resulting device configuration is defined by the last command sent by the driver.
-
-\paragraph{Automatic receive steering in multiqueue mode}\label{sec:Device Types / Network Device / Device Operation / Control Virtqueue / Automatic receive steering in multiqueue mode}
-
-If the driver negotiates the VIRTIO_NET_F_MQ feature bit (depends on VIRTIO_NET_F_CTRL_VQ), it MAY transmit outgoing packets on one
-of the multiple transmitq1\ldots transmitqN and ask the device to
-queue incoming packets into one of the multiple receiveq1\ldots receiveqN
-depending on the packet flow.
-
-The driver enables multiqueue by
-sending the VIRTIO_NET_CTRL_MQ_VQ_PAIRS_SET command, specifying
-the number of the transmit and receive queues to be used up to
-\field{max_virtqueue_pairs}; subsequently,
-transmitq1\ldots transmitqn and receiveq1\ldots receiveqn where
-n=\field{virtqueue_pairs} MAY be used.
-\begin{lstlisting}
-struct virtio_net_ctrl_mq_pairs_set {
- le16 virtqueue_pairs;
-};
-#define VIRTIO_NET_CTRL_MQ_VQ_PAIRS_MIN 1
-#define VIRTIO_NET_CTRL_MQ_VQ_PAIRS_MAX 0x8000
-
-\end{lstlisting}
-
-When multiqueue is enabled by the VIRTIO_NET_CTRL_MQ_VQ_PAIRS_SET command, the device MUST use automatic receive steering
-based on packet flow. Programming of the receive steering
-classifier is implicit. After the driver has transmitted a packet of a
-flow on transmitqX, the device SHOULD cause incoming packets for that flow to
-be steered to receiveqX. For uni-directional protocols, or where
-no packets have been transmitted yet, the device MAY steer a packet
-to a random queue out of the specified receiveq1\ldots receiveqn.
-
-Multiqueue is disabled by sending VIRTIO_NET_CTRL_MQ_VQ_PAIRS_SET with \field{virtqueue_pairs} set to 1 (this is
-the default) and waiting for the device to use the command buffer.
-
-\drivernormative{\subparagraph}{Automatic receive steering in multiqueue mode}{Device Types / Network Device / Device Operation / Control Virtqueue / Automatic receive steering in multiqueue mode}
-
-The driver MUST configure the virtqueues before enabling them with the
-VIRTIO_NET_CTRL_MQ_VQ_PAIRS_SET command.
-
-The driver MUST NOT request a \field{virtqueue_pairs} of 0 or
-greater than \field{max_virtqueue_pairs} in the device configuration space.
-
-The driver MUST queue packets only on transmitq1 before the
-VIRTIO_NET_CTRL_MQ_VQ_PAIRS_SET command.
-
-The driver MUST NOT queue packets on transmit queues greater than
-\field{virtqueue_pairs} once it has placed the VIRTIO_NET_CTRL_MQ_VQ_PAIRS_SET command in the available ring.
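The driver-side constraints on \field{virtqueue_pairs} above can be expressed as two predicates. This is a sketch under assumptions: both function names are hypothetical, queue numbers are taken to be 1-based as in transmitq1\ldots transmitqN, and the MIN/MAX bounds are the ones defined alongside struct virtio_net_ctrl_mq_pairs_set.

```c
#include <stdbool.h>
#include <stdint.h>

#define VIRTIO_NET_CTRL_MQ_VQ_PAIRS_MIN 1
#define VIRTIO_NET_CTRL_MQ_VQ_PAIRS_MAX 0x8000

/* Hypothetical helper: may the driver request this many queue pairs?
 * The request must be non-zero and must not exceed max_virtqueue_pairs
 * read from the device configuration space. */
static bool vq_pairs_request_valid(uint16_t pairs,
                                   uint16_t max_virtqueue_pairs)
{
    return pairs >= VIRTIO_NET_CTRL_MQ_VQ_PAIRS_MIN &&
           pairs <= VIRTIO_NET_CTRL_MQ_VQ_PAIRS_MAX &&
           pairs <= max_virtqueue_pairs;
}

/* Hypothetical helper: may the driver queue packets on transmitq<n>
 * once the command with this virtqueue_pairs value has been placed
 * in the available ring? Before any command, virtqueue_pairs is 1. */
static bool transmitq_allowed(uint16_t n, uint16_t virtqueue_pairs)
{
    return n >= 1 && n <= virtqueue_pairs;
}
```

With the default value of 1 for the second argument, transmitq_allowed() admits only transmitq1, which matches the rule for the period before the first VIRTIO_NET_CTRL_MQ_VQ_PAIRS_SET command.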
-
-\devicenormative{\subparagraph}{Automatic receive steering in multiqueue mode}{Device Types / Network Device / Device Operation / Control Virtqueue / Automatic receive steering in multiqueue mode}
-
-After initialization or reset, the device MUST queue packets only on receiveq1.
-
-The device MUST NOT queue packets on receive queues greater than
-\field{virtqueue_pairs} once it has placed the
-VIRTIO_NET_CTRL_MQ_VQ_PAIRS_SET command in a used buffer.
-
-If the destination receive queue is being reset (see \ref{sec:Basic Facilities of a Virtio Device / Virtqueues / Virtqueue Reset}),
-the device SHOULD re-select another random queue. If all receive queues are
-being reset, the device MUST drop the packet.
-
-\subparagraph{Legacy Interface: Automatic receive steering in multiqueue mode}\label{sec:Device Types / Network Device / Device Operation / Control Virtqueue / Automatic receive steering in multiqueue mode / Legacy Interface: Automatic receive steering in multiqueue mode}
-When using the legacy interface, transitional devices and drivers
-MUST format \field{virtqueue_pairs}
-according to the native endian of the guest rather than
-(necessarily when not using the legacy interface) little-endian.
-
-\subparagraph{Hash calculation}\label{sec:Device Types / Network Device / Device Operation / Control Virtqueue / Automatic receive steering in multiqueue mode / Hash calculation}
-If VIRTIO_NET_F_HASH_REPORT was negotiated and the device uses automatic receive steering,
-the device MUST support a command to configure hash calculation parameters.
-
-The driver provides parameters for hash calculation as follows:
-
-\field{class} VIRTIO_NET_CTRL_MQ, \field{command} VIRTIO_NET_CTRL_MQ_HASH_CONFIG.
-
-The \field{command-specific-data} has following format:
-\begin{lstlisting}
-struct virtio_net_hash_config {
- le32 hash_types;
- le16 reserved[4];
- u8 hash_key_length;
- u8 hash_key_data[hash_key_length];
-};
-\end{lstlisting}
-Field \field{hash_types} contains a bitmask of allowed hash types as
-defined in
-\ref{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Hash calculation for incoming packets / Supported/enabled hash types}.
-Initially the device has all hash types disabled and reports only VIRTIO_NET_HASH_REPORT_NONE.
-
-Field \field{reserved} MUST contain zeroes. It is defined to make the structure to match the layout of virtio_net_rss_config structure,
-defined in \ref{sec:Device Types / Network Device / Device Operation / Control Virtqueue / Receive-side scaling (RSS)}.
-
-Fields \field{hash_key_length} and \field{hash_key_data} define the key to be used in hash calculation.
-
-\paragraph{Receive-side scaling (RSS)}\label{sec:Device Types / Network Device / Device Operation / Control Virtqueue / Receive-side scaling (RSS)}
-A device offers the feature VIRTIO_NET_F_RSS if it supports RSS receive steering with Toeplitz hash calculation and configurable parameters.
-
-A driver queries RSS capabilities of the device by reading device configuration as defined in \ref{sec:Device Types / Network Device / Device configuration layout}
-
-\subparagraph{Setting RSS parameters}\label{sec:Device Types / Network Device / Device Operation / Control Virtqueue / Receive-side scaling (RSS) / Setting RSS parameters}
-
-Driver sends a VIRTIO_NET_CTRL_MQ_RSS_CONFIG command using the following format for \field{command-specific-data}:
-\begin{lstlisting}
-struct virtio_net_rss_config {
- le32 hash_types;
- le16 indirection_table_mask;
- le16 unclassified_queue;
- le16 indirection_table[indirection_table_length];
- le16 max_tx_vq;
- u8 hash_key_length;
- u8 hash_key_data[hash_key_length];
-};
-\end{lstlisting}
-Field \field{hash_types} contains a bitmask of allowed hash types as
-defined in
-\ref{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Hash calculation for incoming packets / Supported/enabled hash types}.
-
-Field \field{indirection_table_mask} is a mask to be applied to
-the calculated hash to produce an index in the
-\field{indirection_table} array.
-Number of entries in \field{indirection_table} is (\field{indirection_table_mask} + 1).
-
-Field \field{unclassified_queue} contains the 0-based index of
-the receive virtqueue to place unclassified packets in. Index 0 corresponds to receiveq1.
-
-Field \field{indirection_table} contains an array of 0-based indices of receive virtqueus. Index 0 corresponds to receiveq1.
-
-A driver sets \field{max_tx_vq} to inform a device how many transmit virtqueues it may use (transmitq1\ldots transmitq \field{max_tx_vq}).
-
-Fields \field{hash_key_length} and \field{hash_key_data} define the key to be used in hash calculation.
-
-\drivernormative{\subparagraph}{Setting RSS parameters}{Device Types / Network Device / Device Operation / Control Virtqueue / Receive-side scaling (RSS) }
-
-A driver MUST NOT send the VIRTIO_NET_CTRL_MQ_RSS_CONFIG command if the feature VIRTIO_NET_F_RSS has not been negotiated.
-
-A driver MUST fill the \field{indirection_table} array only with indices of enabled queues. Index 0 corresponds to receiveq1.
-
-The number of entries in \field{indirection_table} (\field{indirection_table_mask} + 1) MUST be a power of two.
-
-A driver MUST use \field{indirection_table_mask} values that are less than \field{rss_max_indirection_table_length} reported by a device.
-
-A driver MUST NOT set any VIRTIO_NET_HASH_TYPE_ flags that are not supported by a device.
-
-\devicenormative{\subparagraph}{RSS processing}{Device Types / Network Device / Device Operation / Control Virtqueue / Receive-side scaling (RSS) / RSS processing}
-The device MUST determine the destination queue for a network packet as follows:
-\begin{itemize}
-\item Calculate the hash of the packet as defined in \ref{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Hash calculation for incoming packets}.
-\item If the device did not calculate the hash for the specific packet, the device directs the packet to the receiveq specified by \field{unclassified_queue} of virtio_net_rss_config structure (value of 0 corresponds to receiveq1).
-\item Apply \field{indirection_table_mask} to the calculated hash and use the result as the index in the indirection table to get 0-based number of destination receiveq (value of 0 corresponds to receiveq1).
-\item If the destination receive queue is being reset (See \ref{sec:Basic Facilities of a Virtio Device / Virtqueues / Virtqueue Reset}), the device MUST drop the packet.
-\end{itemize}
-
-\paragraph{Offloads State Configuration}\label{sec:Device Types / Network Device / Device Operation / Control Virtqueue / Offloads State Configuration}
-
-If the VIRTIO_NET_F_CTRL_GUEST_OFFLOADS feature is negotiated, the driver can
-send control commands for dynamic offloads state configuration.
-
-\subparagraph{Setting Offloads State}\label{sec:Device Types / Network Device / Device Operation / Control Virtqueue / Offloads State Configuration / Setting Offloads State}
-
-To configure the offloads, the following layout structure and
-definitions are used:
-
-\begin{lstlisting}
-le64 offloads;
-
-#define VIRTIO_NET_F_GUEST_CSUM 1
-#define VIRTIO_NET_F_GUEST_TSO4 7
-#define VIRTIO_NET_F_GUEST_TSO6 8
-#define VIRTIO_NET_F_GUEST_ECN 9
-#define VIRTIO_NET_F_GUEST_UFO 10
-#define VIRTIO_NET_F_GUEST_USO4 54
-#define VIRTIO_NET_F_GUEST_USO6 55
-
-#define VIRTIO_NET_CTRL_GUEST_OFFLOADS 5
- #define VIRTIO_NET_CTRL_GUEST_OFFLOADS_SET 0
-\end{lstlisting}
-
-The class VIRTIO_NET_CTRL_GUEST_OFFLOADS has one command:
-VIRTIO_NET_CTRL_GUEST_OFFLOADS_SET applies the new offloads configuration.
-
-le64 value passed as command data is a bitmask, bits set define
-offloads to be enabled, bits cleared - offloads to be disabled.
-
-There is a corresponding device feature for each offload. Upon feature
-negotiation corresponding offload gets enabled to preserve backward
-compatibility.
-
-\drivernormative{\subparagraph}{Setting Offloads State}{Device Types / Network Device / Device Operation / Control Virtqueue / Offloads State Configuration / Setting Offloads State}
-
-A driver MUST NOT enable an offload for which the appropriate feature
-has not been negotiated.
-
-\subparagraph{Legacy Interface: Setting Offloads State}\label{sec:Device Types / Network Device / Device Operation / Control Virtqueue / Offloads State Configuration / Setting Offloads State / Legacy Interface: Setting Offloads State}
-When using the legacy interface, transitional devices and drivers
-MUST format \field{offloads}
-according to the native endian of the guest rather than
-(necessarily when not using the legacy interface) little-endian.
-
-
-\paragraph{Notifications Coalescing}\label{sec:Device Types / Network Device / Device Operation / Control Virtqueue / Notifications Coalescing}
-
-If the VIRTIO_NET_F_NOTF_COAL feature is negotiated, the driver can
-send control commands for dynamically changing the coalescing parameters.
-
-\begin{lstlisting}
-struct virtio_net_ctrl_coal_rx {
- le32 rx_max_packets;
- le32 rx_usecs;
-};
-
-struct virtio_net_ctrl_coal_tx {
- le32 tx_max_packets;
- le32 tx_usecs;
-};
-
-#define VIRTIO_NET_CTRL_NOTF_COAL 6
- #define VIRTIO_NET_CTRL_NOTF_COAL_TX_SET 0
- #define VIRTIO_NET_CTRL_NOTF_COAL_RX_SET 1
-\end{lstlisting}
-
-Coalescing parameters:
-\begin{itemize}
-\item \field{rx_usecs}: Maximum number of usecs to delay a RX notification.
-\item \field{tx_usecs}: Maximum number of usecs to delay a TX notification.
-\item \field{rx_max_packets}: Maximum number of packets to receive before a RX notification.
-\item \field{tx_max_packets}: Maximum number of packets to send before a TX notification.
-\end{itemize}
-
-
-The class VIRTIO_NET_CTRL_NOTF_COAL has 2 commands:
-\begin{enumerate}
-\item VIRTIO_NET_CTRL_NOTF_COAL_TX_SET: set the \field{tx_usecs} and \field{tx_max_packets} parameters.
-\item VIRTIO_NET_CTRL_NOTF_COAL_RX_SET: set the \field{rx_usecs} and \field{rx_max_packets} parameters.
-\end{enumerate}
-
-\subparagraph{RX Notifications}\label{sec:Device Types / Network Device / Device Operation / Control Virtqueue / Notifications Coalescing / RX Notifications}
-
-If, for example:
-\begin{itemize}
-\item \field{rx_usecs} = 10.
-\item \field{rx_max_packets} = 15.
-\end{itemize}
-
-The device will operate as follows:
-
-\begin{itemize}
-\item The device will count received packets until it accumulates 15, or until 10 usecs elapsed since the first one was received.
-\item If the notifications are not suppressed by the driver, the device will send an used buffer notification, otherwise, the device will not send an used buffer notification as long as the notifications are suppressed.
-\end{itemize}
-
-\subparagraph{TX Notifications}\label{sec:Device Types / Network Device / Device Operation / Control Virtqueue / Notifications Coalescing / TX Notifications}
-
-If, for example:
-\begin{itemize}
-\item \field{tx_usecs} = 10.
-\item \field{tx_max_packets} = 15.
-\end{itemize}
-
-The device will operate as follows:
-
-\begin{itemize}
-\item The device will count sent packets until it accumulates 15, or until 10 usecs elapsed since the first one was sent.
-\item If the notifications are not suppressed by the driver, the device will send an used buffer notification, otherwise, the device will not send an used buffer notification as long as the notifications are suppressed.
-\end{itemize}
-
-\drivernormative{\subparagraph}{Notifications Coalescing}{Device Types / Network Device / Device Operation / Control Virtqueue / Notifications Coalescing}
-
-If the VIRTIO_NET_F_NOTF_COAL feature has not been negotiated, the driver MUST NOT issue VIRTIO_NET_CTRL_NOTF_COAL commands.
-
-\devicenormative{\subparagraph}{Notifications Coalescing}{Device Types / Network Device / Device Operation / Control Virtqueue / Notifications Coalescing}
-
-A device SHOULD respond to the VIRTIO_NET_CTRL_NOTF_COAL commands with VIRTIO_NET_ERR if it was not able to change the parameters.
-
-A device SHOULD NOT send used buffer notifications to the driver, if the notifications are suppressed as explained in \ref{sec:Basic Facilities of a Virtio Device / Virtqueues / Used Buffer Notification Suppression}, even if the coalescing counters expired.
-
-Upon reset, a device MUST initialize all coalescing parameters to 0.
-
-\subsubsection{Legacy Interface: Framing Requirements}\label{sec:Device
-Types / Network Device / Legacy Interface: Framing Requirements}
-
-When using legacy interfaces, transitional drivers which have not
-negotiated VIRTIO_F_ANY_LAYOUT MUST use a single descriptor for the
-struct virtio_net_hdr on both transmit and receive, with the
-network data in the following descriptors.
-
-Additionally, when using the control virtqueue (see \ref{sec:Device
-Types / Network Device / Device Operation / Control Virtqueue})
-, transitional drivers which have not
-negotiated VIRTIO_F_ANY_LAYOUT MUST:
-\begin{itemize}
-\item for all commands, use a single 2-byte descriptor including the first two
-fields: \field{class} and \field{command}
-\item for all commands except VIRTIO_NET_CTRL_MAC_TABLE_SET
-use a single descriptor including command-specific-data
-with no padding.
-\item for the VIRTIO_NET_CTRL_MAC_TABLE_SET command use exactly
-two descriptors including command-specific-data with no padding:
-the first of these descriptors MUST include the
-virtio_net_ctrl_mac table structure for the unicast addresses with no padding,
-the second of these descriptors MUST include the
-virtio_net_ctrl_mac table structure for the multicast addresses
-with no padding.
-\item for all commands, use a single 1-byte descriptor for the
-\field{ack} field
-\end{itemize}
-
-See \ref{sec:Basic
-Facilities of a Virtio Device / Virtqueues / Message Framing}.
+\input{device-types/virtio-network/description.tex}
\section{Block Device}\label{sec:Device Types / Block Device}
diff --git a/device-types/virtio-network/description.tex b/device-types/virtio-network/description.tex
new file mode 100644
index 0000000..367681d
--- /dev/null
+++ b/device-types/virtio-network/description.tex
@@ -0,0 +1,1594 @@
+\section{Network Device}\label{sec:Device Types / Network Device}
+
+The virtio network device is a virtual ethernet card, and is the
+most complex of the devices supported so far by virtio. It has been
+enhanced rapidly and demonstrates clearly how support for new
+features is added to an existing device. Empty buffers are
+placed in one virtqueue for receiving packets, and outgoing
+packets are enqueued into another for transmission in that order.
+A third command queue is used to control advanced filtering
+features.
+
+\subsection{Device ID}\label{sec:Device Types / Network Device / Device ID}
+
+ 1
+
+\subsection{Virtqueues}\label{sec:Device Types / Network Device / Virtqueues}
+
+\begin{description}
+\item[0] receiveq1
+\item[1] transmitq1
+\item[\ldots]
+\item[2(N-1)] receiveqN
+\item[2(N-1)+1] transmitqN
+\item[2N] controlq
+\end{description}
+
+ N=1 if neither VIRTIO_NET_F_MQ nor VIRTIO_NET_F_RSS is negotiated, otherwise N is set by
+ \field{max_virtqueue_pairs}.
+
+ controlq only exists if VIRTIO_NET_F_CTRL_VQ is set.
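+
+As a non-normative sketch, the numbering above can be expressed in C. The
+helper names are hypothetical and are not defined by this specification; the
+argument n is the 1-based queue pair number and max_pairs stands for N.
+
+\begin{lstlisting}
+#include <stdint.h>
+
+/* Hypothetical helpers; n is 1-based (receiveq1 is n = 1). */
+static inline uint32_t vnet_rxq_index(uint32_t n) { return 2 * (n - 1); }
+static inline uint32_t vnet_txq_index(uint32_t n) { return 2 * (n - 1) + 1; }
+
+/* controlq follows the last transmitq (only with VIRTIO_NET_F_CTRL_VQ). */
+static inline uint32_t vnet_ctrlq_index(uint32_t max_pairs)
+{
+    return 2 * max_pairs;
+}
+\end{lstlisting}
+
+For example, with N = 4, receiveq3 is virtqueue 4, transmitq3 is virtqueue 5
+and controlq is virtqueue 8.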
+
+\subsection{Feature bits}\label{sec:Device Types / Network Device / Feature bits}
+
+\begin{description}
+\item[VIRTIO_NET_F_CSUM (0)] Device handles packets with partial checksum. This
+ ``checksum offload'' is a common feature on modern network cards.
+
+\item[VIRTIO_NET_F_GUEST_CSUM (1)] Driver handles packets with partial checksum.
+
+\item[VIRTIO_NET_F_CTRL_GUEST_OFFLOADS (2)] Control channel offloads
+ reconfiguration support.
+
+\item[VIRTIO_NET_F_MTU(3)] Device maximum MTU reporting is supported. If
+ offered by the device, device advises driver about the value of
+ its maximum MTU. If negotiated, the driver uses \field{mtu} as
+ the maximum MTU value.
+
+\item[VIRTIO_NET_F_MAC (5)] Device has given MAC address.
+
+\item[VIRTIO_NET_F_GUEST_TSO4 (7)] Driver can receive TSOv4.
+
+\item[VIRTIO_NET_F_GUEST_TSO6 (8)] Driver can receive TSOv6.
+
+\item[VIRTIO_NET_F_GUEST_ECN (9)] Driver can receive TSO with ECN.
+
+\item[VIRTIO_NET_F_GUEST_UFO (10)] Driver can receive UFO.
+
+\item[VIRTIO_NET_F_HOST_TSO4 (11)] Device can receive TSOv4.
+
+\item[VIRTIO_NET_F_HOST_TSO6 (12)] Device can receive TSOv6.
+
+\item[VIRTIO_NET_F_HOST_ECN (13)] Device can receive TSO with ECN.
+
+\item[VIRTIO_NET_F_HOST_UFO (14)] Device can receive UFO.
+
+\item[VIRTIO_NET_F_MRG_RXBUF (15)] Driver can merge receive buffers.
+
+\item[VIRTIO_NET_F_STATUS (16)] Configuration status field is
+ available.
+
+\item[VIRTIO_NET_F_CTRL_VQ (17)] Control channel is available.
+
+\item[VIRTIO_NET_F_CTRL_RX (18)] Control channel RX mode support.
+
+\item[VIRTIO_NET_F_CTRL_VLAN (19)] Control channel VLAN filtering.
+
+\item[VIRTIO_NET_F_GUEST_ANNOUNCE(21)] Driver can send gratuitous
+ packets.
+
+\item[VIRTIO_NET_F_MQ(22)] Device supports multiqueue with automatic
+ receive steering.
+
+\item[VIRTIO_NET_F_CTRL_MAC_ADDR(23)] Set MAC address through control
+ channel.
+
+\item[VIRTIO_NET_F_NOTF_COAL(53)] Device supports notifications coalescing.
+
+\item[VIRTIO_NET_F_GUEST_USO4 (54)] Driver can receive USOv4 packets.
+
+\item[VIRTIO_NET_F_GUEST_USO6 (55)] Driver can receive USOv6 packets.
+
+\item[VIRTIO_NET_F_HOST_USO (56)] Device can receive USO packets. Unlike UFO
+ (which fragments the packet), USO splits a large UDP packet
+ into several segments, each of which has its own UDP header.
+
+\item[VIRTIO_NET_F_HASH_REPORT(57)] Device can report per-packet hash
+ value and a type of calculated hash.
+
+\item[VIRTIO_NET_F_GUEST_HDRLEN(59)] Driver can provide the exact \field{hdr_len}
+ value. Device benefits from knowing the exact header length.
+
+\item[VIRTIO_NET_F_RSS(60)] Device supports RSS (receive-side scaling)
+ with Toeplitz hash calculation and configurable hash
+ parameters for receive steering.
+
+\item[VIRTIO_NET_F_RSC_EXT(61)] Device can process duplicated ACKs
+ and report number of coalesced segments and duplicated ACKs.
+
+\item[VIRTIO_NET_F_STANDBY(62)] Device may act as a standby for a primary
+ device with the same MAC address.
+
+\item[VIRTIO_NET_F_SPEED_DUPLEX(63)] Device reports speed and duplex.
+\end{description}
+
+\subsubsection{Feature bit requirements}\label{sec:Device Types / Network Device / Feature bits / Feature bit requirements}
+
+Some networking feature bits require other networking feature bits
+(see \ref{drivernormative:Basic Facilities of a Virtio Device / Feature Bits}):
+
+\begin{description}
+\item[VIRTIO_NET_F_GUEST_TSO4] Requires VIRTIO_NET_F_GUEST_CSUM.
+\item[VIRTIO_NET_F_GUEST_TSO6] Requires VIRTIO_NET_F_GUEST_CSUM.
+\item[VIRTIO_NET_F_GUEST_ECN] Requires VIRTIO_NET_F_GUEST_TSO4 or VIRTIO_NET_F_GUEST_TSO6.
+\item[VIRTIO_NET_F_GUEST_UFO] Requires VIRTIO_NET_F_GUEST_CSUM.
+\item[VIRTIO_NET_F_GUEST_USO4] Requires VIRTIO_NET_F_GUEST_CSUM.
+\item[VIRTIO_NET_F_GUEST_USO6] Requires VIRTIO_NET_F_GUEST_CSUM.
+
+\item[VIRTIO_NET_F_HOST_TSO4] Requires VIRTIO_NET_F_CSUM.
+\item[VIRTIO_NET_F_HOST_TSO6] Requires VIRTIO_NET_F_CSUM.
+\item[VIRTIO_NET_F_HOST_ECN] Requires VIRTIO_NET_F_HOST_TSO4 or VIRTIO_NET_F_HOST_TSO6.
+\item[VIRTIO_NET_F_HOST_UFO] Requires VIRTIO_NET_F_CSUM.
+\item[VIRTIO_NET_F_HOST_USO] Requires VIRTIO_NET_F_CSUM.
+
+\item[VIRTIO_NET_F_CTRL_RX] Requires VIRTIO_NET_F_CTRL_VQ.
+\item[VIRTIO_NET_F_CTRL_VLAN] Requires VIRTIO_NET_F_CTRL_VQ.
+\item[VIRTIO_NET_F_GUEST_ANNOUNCE] Requires VIRTIO_NET_F_CTRL_VQ.
+\item[VIRTIO_NET_F_MQ] Requires VIRTIO_NET_F_CTRL_VQ.
+\item[VIRTIO_NET_F_CTRL_MAC_ADDR] Requires VIRTIO_NET_F_CTRL_VQ.
+\item[VIRTIO_NET_F_NOTF_COAL] Requires VIRTIO_NET_F_CTRL_VQ.
+\item[VIRTIO_NET_F_RSC_EXT] Requires VIRTIO_NET_F_HOST_TSO4 or VIRTIO_NET_F_HOST_TSO6.
+\item[VIRTIO_NET_F_RSS] Requires VIRTIO_NET_F_CTRL_VQ.
+\end{description}
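+
+As a non-normative sketch, the requirements listed above could be checked
+against a 64-bit feature mask as follows. The macro and function names are
+hypothetical; the bit numbers are those given in the feature bit list.
+
+\begin{lstlisting}
+#include <stdbool.h>
+#include <stdint.h>
+
+#define VNET_F(bit) (1ULL << (bit))   /* hypothetical helper macro */
+
+/* Hypothetical check: true if feature set f meets the requirements above. */
+static bool vnet_features_consistent(uint64_t f)
+{
+    /* CTRL_RX, CTRL_VLAN, GUEST_ANNOUNCE, MQ, CTRL_MAC_ADDR, NOTF_COAL, RSS */
+    const uint64_t need_ctrl_vq = VNET_F(18) | VNET_F(19) | VNET_F(21) |
+                                  VNET_F(22) | VNET_F(23) | VNET_F(53) |
+                                  VNET_F(60);
+    /* GUEST_TSO4, GUEST_TSO6, GUEST_UFO, GUEST_USO4, GUEST_USO6 */
+    const uint64_t need_guest_csum = VNET_F(7) | VNET_F(8) | VNET_F(10) |
+                                     VNET_F(54) | VNET_F(55);
+    /* HOST_TSO4, HOST_TSO6, HOST_UFO, HOST_USO */
+    const uint64_t need_csum = VNET_F(11) | VNET_F(12) | VNET_F(14) |
+                               VNET_F(56);
+    const uint64_t guest_tso = VNET_F(7) | VNET_F(8);
+    const uint64_t host_tso = VNET_F(11) | VNET_F(12);
+
+    if ((f & need_ctrl_vq) && !(f & VNET_F(17)))   /* needs CTRL_VQ */
+        return false;
+    if ((f & need_guest_csum) && !(f & VNET_F(1))) /* needs GUEST_CSUM */
+        return false;
+    if ((f & need_csum) && !(f & VNET_F(0)))       /* needs CSUM */
+        return false;
+    if ((f & VNET_F(9)) && !(f & guest_tso))       /* GUEST_ECN */
+        return false;
+    if ((f & VNET_F(13)) && !(f & host_tso))       /* HOST_ECN */
+        return false;
+    if ((f & VNET_F(61)) && !(f & host_tso))       /* RSC_EXT */
+        return false;
+    return true;
+}
+\end{lstlisting}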
+
+\subsubsection{Legacy Interface: Feature bits}\label{sec:Device Types / Network Device / Feature bits / Legacy Interface: Feature bits}
+\begin{description}
+\item[VIRTIO_NET_F_GSO (6)] Device handles packets with any GSO type. This was supposed to indicate segmentation offload support, but
+upon further investigation it became clear that multiple bits were needed.
+\item[VIRTIO_NET_F_GUEST_RSC4 (41)] Device coalesces TCPIP v4 packets. This was implemented by a hypervisor patch for certification
+purposes and the current Windows driver depends on it. It will not function if the virtio-net device reports this feature.
+\item[VIRTIO_NET_F_GUEST_RSC6 (42)] Device coalesces TCPIP v6 packets. Similar to VIRTIO_NET_F_GUEST_RSC4.
+\end{description}
+
+\subsection{Device configuration layout}\label{sec:Device Types / Network Device / Device configuration layout}
+\label{sec:Device Types / Block Device / Feature bits / Device configuration layout}
+
+Device configuration fields are listed below; they are read-only for a driver. The \field{mac} address field
+always exists (though is only valid if VIRTIO_NET_F_MAC is set), and
+\field{status} only exists if VIRTIO_NET_F_STATUS is set. Two
+read-only bits (for the driver) are currently defined for the status field:
+VIRTIO_NET_S_LINK_UP and VIRTIO_NET_S_ANNOUNCE.
+
+\begin{lstlisting}
+#define VIRTIO_NET_S_LINK_UP 1
+#define VIRTIO_NET_S_ANNOUNCE 2
+\end{lstlisting}
+
+The following driver-read-only field, \field{max_virtqueue_pairs} only exists if
+VIRTIO_NET_F_MQ or VIRTIO_NET_F_RSS is set. This field specifies the maximum number
+of each of transmit and receive virtqueues (receiveq1\ldots receiveqN
+and transmitq1\ldots transmitqN respectively) that can be configured once at least one of these features
+is negotiated.
+
+The following driver-read-only field, \field{mtu} only exists if
+VIRTIO_NET_F_MTU is set. This field specifies the maximum MTU for the driver to
+use.
+
+The following two fields, \field{speed} and \field{duplex}, only
+exist if VIRTIO_NET_F_SPEED_DUPLEX is set.
+
+\field{speed} contains the device speed, in units of 1 MBit per
+second, 0 to 0x7fffffff, or 0xffffffff for unknown speed.
+
+\field{duplex} has the values of 0x01 for full duplex, 0x00 for
+half duplex and 0xff for unknown duplex state.
+
+Both \field{speed} and \field{duplex} can change, thus the driver
+is expected to re-read these values after receiving a
+configuration change notification.
+
+\begin{lstlisting}
+struct virtio_net_config {
+ u8 mac[6];
+ le16 status;
+ le16 max_virtqueue_pairs;
+ le16 mtu;
+ le32 speed;
+ u8 duplex;
+ u8 rss_max_key_size;
+ le16 rss_max_indirection_table_length;
+ le32 supported_hash_types;
+};
+\end{lstlisting}
+The following field, \field{rss_max_key_size} only exists if VIRTIO_NET_F_RSS or VIRTIO_NET_F_HASH_REPORT is set.
+It specifies the maximum supported length of RSS key in bytes.
+
+The following field, \field{rss_max_indirection_table_length} only exists if VIRTIO_NET_F_RSS is set.
+It specifies the maximum number of 16-bit entries in RSS indirection table.
+
+The next field, \field{supported_hash_types} only exists if the device supports hash calculation,
+i.e. if VIRTIO_NET_F_RSS or VIRTIO_NET_F_HASH_REPORT is set.
+
+Field \field{supported_hash_types} contains the bitmask of supported hash types.
+See \ref{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Hash calculation for incoming packets / Supported/enabled hash types} for details of supported hash types.
+
+\devicenormative{\subsubsection}{Device configuration layout}{Device Types / Network Device / Device configuration layout}
+
+The device MUST set \field{max_virtqueue_pairs} to between 1 and 0x8000 inclusive,
+if it offers VIRTIO_NET_F_MQ.
+
+The device MUST set \field{mtu} to between 68 and 65535 inclusive,
+if it offers VIRTIO_NET_F_MTU.
+
+The device SHOULD set \field{mtu} to at least 1280, if it offers
+VIRTIO_NET_F_MTU.
+
+The device MUST NOT modify \field{mtu} once it has been set.
+
+The device MUST NOT pass received packets that exceed \field{mtu} (plus low
+level ethernet header length) size with \field{gso_type} NONE or ECN
+after VIRTIO_NET_F_MTU has been successfully negotiated.
+
+The device MUST forward transmitted packets of up to \field{mtu} (plus low
+level ethernet header length) size with \field{gso_type} NONE or ECN, and do
+so without fragmentation, after VIRTIO_NET_F_MTU has been successfully
+negotiated.
+
+The device MUST set \field{rss_max_key_size} to at least 40, if it offers
+VIRTIO_NET_F_RSS or VIRTIO_NET_F_HASH_REPORT.
+
+The device MUST set \field{rss_max_indirection_table_length} to at least 128, if it offers
+VIRTIO_NET_F_RSS.
+
+If the driver negotiates the VIRTIO_NET_F_STANDBY feature, the device MAY act
+as a standby device for a primary device with the same MAC address.
+
+If VIRTIO_NET_F_SPEED_DUPLEX has been negotiated, \field{speed}
+MUST contain the device speed, in units of 1 MBit per second, 0 to
+0x7fffffff, or 0xffffffff for unknown.
+
+If VIRTIO_NET_F_SPEED_DUPLEX has been negotiated, \field{duplex}
+MUST have the values of 0x01 for full duplex, 0x00 for half
+duplex, or 0xff for unknown.
+
+If VIRTIO_NET_F_SPEED_DUPLEX and VIRTIO_NET_F_STATUS have both
+been negotiated, the device SHOULD NOT change the \field{speed} and
+\field{duplex} fields as long as VIRTIO_NET_S_LINK_UP is set in
+the \field{status}.
+
+\drivernormative{\subsubsection}{Device configuration layout}{Device Types / Network Device / Device configuration layout}
+
+A driver SHOULD negotiate VIRTIO_NET_F_MAC if the device offers it.
+If the driver negotiates the VIRTIO_NET_F_MAC feature, the driver MUST set
+the physical address of the NIC to \field{mac}. Otherwise, it SHOULD
+use a locally-administered MAC address (see \hyperref[intro:IEEE 802]{IEEE 802},
+``9.2 48-bit universal LAN MAC addresses'').
+
+If the driver does not negotiate the VIRTIO_NET_F_STATUS feature, it SHOULD
+assume the link is active, otherwise it SHOULD read the link status from
+the bottom bit of \field{status}.
+
+A driver SHOULD negotiate VIRTIO_NET_F_MTU if the device offers it.
+
+If the driver negotiates VIRTIO_NET_F_MTU, it MUST supply enough receive
+buffers to receive at least one receive packet of size \field{mtu} (plus low
+level ethernet header length) with \field{gso_type} NONE or ECN.
+
+If the driver negotiates VIRTIO_NET_F_MTU, it MUST NOT transmit packets of
+size exceeding the value of \field{mtu} (plus low level ethernet header length)
+with \field{gso_type} NONE or ECN.
+
+A driver SHOULD negotiate the VIRTIO_NET_F_STANDBY feature if the device offers it.
+
+If VIRTIO_NET_F_SPEED_DUPLEX has been negotiated,
+the driver MUST treat any value of \field{speed} above
+0x7fffffff as well as any value of \field{duplex} not
+matching 0x00 or 0x01 as an unknown value.
+
+If VIRTIO_NET_F_SPEED_DUPLEX has been negotiated, the driver
+SHOULD re-read \field{speed} and \field{duplex} after a
+configuration change notification.
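+
+As a non-normative sketch, a driver could classify the values it reads as
+follows. The function names are hypothetical, and \field{speed} is assumed
+to have already been converted from little-endian to CPU byte order.
+
+\begin{lstlisting}
+#include <stdbool.h>
+#include <stdint.h>
+
+/* Hypothetical helper: any speed above 0x7fffffff is treated as unknown. */
+static bool vnet_speed_known(uint32_t speed)
+{
+    return speed <= 0x7fffffffu;
+}
+
+/* Hypothetical helper: any duplex other than 0x00 or 0x01 is unknown. */
+static bool vnet_duplex_known(uint8_t duplex)
+{
+    return duplex == 0x00 || duplex == 0x01;
+}
+\end{lstlisting}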
+
+\subsubsection{Legacy Interface: Device configuration layout}\label{sec:Device Types / Network Device / Device configuration layout / Legacy Interface: Device configuration layout}
+\label{sec:Device Types / Block Device / Feature bits / Device configuration layout / Legacy Interface: Device configuration layout}
+When using the legacy interface, transitional devices and drivers
+MUST format \field{status} and
+\field{max_virtqueue_pairs} in struct virtio_net_config
+according to the native endian of the guest rather than
+(necessarily when not using the legacy interface) little-endian.
+
+When using the legacy interface, \field{mac} is driver-writable
+which provided a way for drivers to update the MAC without
+negotiating VIRTIO_NET_F_CTRL_MAC_ADDR.
+
+\subsection{Device Initialization}\label{sec:Device Types / Network Device / Device Initialization}
+
+A driver would perform a typical initialization routine like so:
+
+\begin{enumerate}
+\item Identify and initialize the receive and
+ transmission virtqueues, up to N of each kind. If
+ VIRTIO_NET_F_MQ feature bit is negotiated,
+ N=\field{max_virtqueue_pairs}, otherwise identify N=1.
+
+\item If the VIRTIO_NET_F_CTRL_VQ feature bit is negotiated,
+ identify the control virtqueue.
+
+\item Fill the receive queues with buffers: see \ref{sec:Device Types / Network Device / Device Operation / Setting Up Receive Buffers}.
+
+\item Even with VIRTIO_NET_F_MQ, only receiveq1, transmitq1 and
+ controlq are used by default. The driver would send the
+ VIRTIO_NET_CTRL_MQ_VQ_PAIRS_SET command specifying the
+ number of the transmit and receive queues to use.
+
+\item If the VIRTIO_NET_F_MAC feature bit is set, the configuration
+ space \field{mac} entry indicates the ``physical'' address of the
+ network card, otherwise the driver would typically generate a random
+ local MAC address.
+
+\item If the VIRTIO_NET_F_STATUS feature bit is negotiated, the link
+ status comes from the bottom bit of \field{status}.
+ Otherwise, the driver assumes it's active.
+
+\item A performant driver would indicate that it will generate checksumless
+ packets by negotiating the VIRTIO_NET_F_CSUM feature.
+
+\item If that feature is negotiated, a driver can use TCP segmentation or UDP
+ segmentation/fragmentation offload by negotiating the VIRTIO_NET_F_HOST_TSO4 (IPv4
+ TCP), VIRTIO_NET_F_HOST_TSO6 (IPv6 TCP), VIRTIO_NET_F_HOST_UFO
+ (UDP fragmentation) and VIRTIO_NET_F_HOST_USO (UDP segmentation) features.
+
+\item The converse features are also available: a driver can save
+ the virtual device some work by negotiating these features.\note{For example, a network packet transported between two guests on
+the same system might not need checksumming at all, nor segmentation,
+if both guests are amenable.}
+ The VIRTIO_NET_F_GUEST_CSUM feature indicates that partially
+ checksummed packets can be received, and if it can do that then
+ the VIRTIO_NET_F_GUEST_TSO4, VIRTIO_NET_F_GUEST_TSO6,
+ VIRTIO_NET_F_GUEST_UFO, VIRTIO_NET_F_GUEST_ECN, VIRTIO_NET_F_GUEST_USO4
+ and VIRTIO_NET_F_GUEST_USO6 are the input equivalents of the features described above.
+ See \ref{sec:Device Types / Network Device / Device Operation /
+Setting Up Receive Buffers}~\nameref{sec:Device Types / Network
+Device / Device Operation / Setting Up Receive Buffers} and
+\ref{sec:Device Types / Network Device / Device Operation /
+Processing of Incoming Packets}~\nameref{sec:Device Types /
+Network Device / Device Operation / Processing of Incoming Packets} below.
+\end{enumerate}
+
+A truly minimal driver would only accept VIRTIO_NET_F_MAC and ignore
+everything else.
+
+\subsection{Device Operation}\label{sec:Device Types / Network Device / Device Operation}
+
+Packets are transmitted by placing them in the
+transmitq1\ldots transmitqN, and buffers for incoming packets are
+placed in the receiveq1\ldots receiveqN. In each case, the packet
+itself is preceded by a header:
+
+\begin{lstlisting}
+struct virtio_net_hdr {
+#define VIRTIO_NET_HDR_F_NEEDS_CSUM 1
+#define VIRTIO_NET_HDR_F_DATA_VALID 2
+#define VIRTIO_NET_HDR_F_RSC_INFO 4
+ u8 flags;
+#define VIRTIO_NET_HDR_GSO_NONE 0
+#define VIRTIO_NET_HDR_GSO_TCPV4 1
+#define VIRTIO_NET_HDR_GSO_UDP 3
+#define VIRTIO_NET_HDR_GSO_TCPV6 4
+#define VIRTIO_NET_HDR_GSO_UDP_L4 5
+#define VIRTIO_NET_HDR_GSO_ECN 0x80
+ u8 gso_type;
+ le16 hdr_len;
+ le16 gso_size;
+ le16 csum_start;
+ le16 csum_offset;
+ le16 num_buffers;
+ le32 hash_value; (Only if VIRTIO_NET_F_HASH_REPORT negotiated)
+ le16 hash_report; (Only if VIRTIO_NET_F_HASH_REPORT negotiated)
+ le16 padding_reserved; (Only if VIRTIO_NET_F_HASH_REPORT negotiated)
+};
+\end{lstlisting}
+
+The controlq is used to control device features such as
+filtering.
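+
+As a non-normative sketch, the size of the header above when not using the
+legacy interface follows from its field widths. The function name is
+hypothetical.
+
+\begin{lstlisting}
+#include <stdbool.h>
+#include <stddef.h>
+
+/*
+ * Hypothetical helper: header length in bytes (non-legacy interface).
+ * flags (1) + gso_type (1) + five le16 fields (10) = 12 bytes;
+ * hash_value (4) + hash_report (2) + padding_reserved (2) add 8 more.
+ */
+static size_t vnet_hdr_len(bool hash_report_negotiated)
+{
+    return hash_report_negotiated ? 20 : 12;
+}
+\end{lstlisting}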
+
+\subsubsection{Legacy Interface: Device Operation}\label{sec:Device Types / Network Device / Device Operation / Legacy Interface: Device Operation}
+When using the legacy interface, transitional devices and drivers
+MUST format the fields in struct virtio_net_hdr
+according to the native endian of the guest rather than
+(necessarily when not using the legacy interface) little-endian.
+
+The legacy driver only presented \field{num_buffers} in the struct virtio_net_hdr
+when VIRTIO_NET_F_MRG_RXBUF was negotiated; without that feature the
+structure was 2 bytes shorter.
+
+When using the legacy interface, the driver SHOULD ignore the
+used length for the transmit queues
+and the controlq.
+\begin{note}
+Historically, some devices put
+the total descriptor length there, even though no data was
+actually written.
+\end{note}
+
+\subsubsection{Packet Transmission}\label{sec:Device Types / Network Device / Device Operation / Packet Transmission}
+
+Transmitting a single packet is simple, but varies depending on
+the different features the driver negotiated.
+
+\begin{enumerate}
+\item The driver can send a completely checksummed packet. In this case,
+ \field{flags} will be zero, and \field{gso_type} will be VIRTIO_NET_HDR_GSO_NONE.
+
+\item If the driver negotiated VIRTIO_NET_F_CSUM, it can skip
+ checksumming the packet:
+ \begin{itemize}
+ \item \field{flags} has the VIRTIO_NET_HDR_F_NEEDS_CSUM set,
+
+ \item \field{csum_start} is set to the offset within the packet to begin checksumming,
+ and
+
+ \item \field{csum_offset} indicates how many bytes after the csum_start the
+ new (16 bit ones' complement) checksum is placed by the device.
+
+ \item The TCP checksum field in the packet is set to the sum
+ of the TCP pseudo header, so that replacing it by the ones'
+ complement checksum of the TCP header and body will give the
+ correct result.
+ \end{itemize}
+
+\begin{note}
+For example, consider a partially checksummed TCP (IPv4) packet.
+It will have a 14 byte ethernet header and 20 byte IP header
+followed by the TCP header (with the TCP checksum field 16 bytes
+into that header). \field{csum_start} will be 14+20 = 34 (the TCP
+checksum includes the header), and \field{csum_offset} will be 16.
+\end{note}
+
+\item If the driver negotiated
+ VIRTIO_NET_F_HOST_TSO4, TSO6, USO or UFO, and the packet requires
+ TCP segmentation, UDP segmentation or fragmentation, then \field{gso_type}
+ is set to VIRTIO_NET_HDR_GSO_TCPV4, TCPV6, UDP_L4 or UDP.
+ (Otherwise, it is set to VIRTIO_NET_HDR_GSO_NONE). In this
+ case, packets larger than 1514 bytes can be transmitted: the
+ metadata indicates how to replicate the packet header to cut it
+ into smaller packets. The other gso fields are set:
+
+ \begin{itemize}
+ \item If the VIRTIO_NET_F_GUEST_HDRLEN feature has been negotiated,
+ \field{hdr_len} indicates the header length that needs to be replicated
+ for each packet. It's the number of bytes from the beginning of the packet
+ to the beginning of the transport payload.
+ Otherwise, if the VIRTIO_NET_F_GUEST_HDRLEN feature has not been negotiated,
+ \field{hdr_len} is a hint to the device as to how much of the header
+ needs to be kept to copy into each packet, usually set to the
+ length of the headers, including the transport header\footnote{Due to various bugs in implementations, this field is not useful
+as a guarantee of the transport header size.
+}.
+
+ \begin{note}
+ Some devices benefit from knowledge of the exact header length.
+ \end{note}
+
+ \item \field{gso_size} is the maximum size of each packet beyond that
+ header (i.e. MSS).
+
+ \item If the driver negotiated the VIRTIO_NET_F_HOST_ECN feature,
+ the VIRTIO_NET_HDR_GSO_ECN bit in \field{gso_type}
+ indicates that the TCP packet has the ECN bit set\footnote{This case is not handled by some older hardware, so is called out
+specifically in the protocol.}.
+ \end{itemize}
+
+\item \field{num_buffers} is set to zero. This field is unused on transmitted packets.
+
+\item The header and packet are added as one output descriptor to the
+ transmitq, and the device is notified of the new entry
+ (see \ref{sec:Device Types / Network Device / Device Initialization}~\nameref{sec:Device Types / Network Device / Device Initialization}).
+\end{enumerate}
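The partial-checksum case in the list above can be sketched in C. This is only an illustration under stated assumptions: `struct tx_hdr` and `tx_hdr_tcp4_partial_csum()` are hypothetical names, the struct holds host-order copies of the header fields (little-endian conversion is left out), and the flag and GSO constants use the values defined for struct virtio_net_hdr.

```c
#include <stdint.h>
#include <string.h>

#define VIRTIO_NET_HDR_F_NEEDS_CSUM 1
#define VIRTIO_NET_HDR_GSO_NONE     0

/* Hypothetical host-order mirror of the struct virtio_net_hdr fields. */
struct tx_hdr {
    uint8_t  flags;
    uint8_t  gso_type;
    uint16_t hdr_len;
    uint16_t gso_size;
    uint16_t csum_start;
    uint16_t csum_offset;
    uint16_t num_buffers;
};

/* Fill the header for a TCP/IPv4 packet whose checksum the device completes.
 * eth_len and ip_len are the link and IP header lengths in bytes. */
void tx_hdr_tcp4_partial_csum(struct tx_hdr *h, uint16_t eth_len, uint16_t ip_len)
{
    memset(h, 0, sizeof(*h));            /* num_buffers stays zero on transmit */
    h->flags = VIRTIO_NET_HDR_F_NEEDS_CSUM;
    h->gso_type = VIRTIO_NET_HDR_GSO_NONE;
    h->csum_start = (uint16_t)(eth_len + ip_len); /* start of the TCP header */
    h->csum_offset = 16;                 /* TCP checksum field within that header */
}
```

With a 14 byte ethernet header and a 20 byte IP header this yields the values from the note: `csum_start` 34 and `csum_offset` 16.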
+
+\drivernormative{\paragraph}{Packet Transmission}{Device Types / Network Device / Device Operation / Packet Transmission}
+
+The driver MUST set \field{num_buffers} to zero.
+
+If VIRTIO_NET_F_CSUM is not negotiated, the driver MUST set
+\field{flags} to zero and SHOULD supply a fully checksummed
+packet to the device.
+
+If VIRTIO_NET_F_HOST_TSO4 is negotiated, the driver MAY set
+\field{gso_type} to VIRTIO_NET_HDR_GSO_TCPV4 to request TCPv4
+segmentation, otherwise the driver MUST NOT set
+\field{gso_type} to VIRTIO_NET_HDR_GSO_TCPV4.
+
+If VIRTIO_NET_F_HOST_TSO6 is negotiated, the driver MAY set
+\field{gso_type} to VIRTIO_NET_HDR_GSO_TCPV6 to request TCPv6
+segmentation, otherwise the driver MUST NOT set
+\field{gso_type} to VIRTIO_NET_HDR_GSO_TCPV6.
+
+If VIRTIO_NET_F_HOST_UFO is negotiated, the driver MAY set
+\field{gso_type} to VIRTIO_NET_HDR_GSO_UDP to request UDP
+fragmentation, otherwise the driver MUST NOT set
+\field{gso_type} to VIRTIO_NET_HDR_GSO_UDP.
+
+If VIRTIO_NET_F_HOST_USO is negotiated, the driver MAY set
+\field{gso_type} to VIRTIO_NET_HDR_GSO_UDP_L4 to request UDP
+segmentation, otherwise the driver MUST NOT set
+\field{gso_type} to VIRTIO_NET_HDR_GSO_UDP_L4.
+
+The driver SHOULD NOT send to the device TCP packets requiring segmentation offload
+which have the Explicit Congestion Notification bit set, unless the
+VIRTIO_NET_F_HOST_ECN feature is negotiated, in which case the
+driver MUST set the VIRTIO_NET_HDR_GSO_ECN bit in
+\field{gso_type}.
+
+If the VIRTIO_NET_F_CSUM feature has been negotiated, the
+driver MAY set the VIRTIO_NET_HDR_F_NEEDS_CSUM bit in
+\field{flags}, if so:
+\begin{enumerate}
+\item the driver MUST validate the packet checksum at
+ offset \field{csum_offset} from \field{csum_start} as well as all
+ preceding offsets;
+\item the driver MUST set the packet checksum stored in the
+ buffer to the TCP/UDP pseudo header;
+\item the driver MUST set \field{csum_start} and
+ \field{csum_offset} such that calculating a ones'
+ complement checksum from \field{csum_start} up until the end of
+ the packet and storing the result at offset \field{csum_offset}
+ from \field{csum_start} will result in a fully checksummed
+ packet;
+\end{enumerate}
+
+If none of the VIRTIO_NET_F_HOST_TSO4, TSO6, USO or UFO options have
+been negotiated, the driver MUST set \field{gso_type} to
+VIRTIO_NET_HDR_GSO_NONE.
+
+If \field{gso_type} differs from VIRTIO_NET_HDR_GSO_NONE, then
+the driver MUST also set the VIRTIO_NET_HDR_F_NEEDS_CSUM bit in
+\field{flags} and MUST set \field{gso_size} to indicate the
+desired MSS.
+
+If one of the VIRTIO_NET_F_HOST_TSO4, TSO6, USO or UFO options has
+been negotiated:
+\begin{itemize}
+\item If the VIRTIO_NET_F_GUEST_HDRLEN feature has been negotiated,
+ and \field{gso_type} differs from VIRTIO_NET_HDR_GSO_NONE,
+ the driver MUST set \field{hdr_len} to a value equal to the length
+ of the headers, including the transport header.
+
+\item If the VIRTIO_NET_F_GUEST_HDRLEN feature has not been negotiated,
+ or \field{gso_type} is VIRTIO_NET_HDR_GSO_NONE,
+ the driver SHOULD set \field{hdr_len} to a value
+ not less than the length of the headers, including the transport
+ header.
+\end{itemize}
+
+The driver SHOULD accept the VIRTIO_NET_F_GUEST_HDRLEN feature if it has
+been offered, and if it's able to provide the exact header length.
+
+The driver MUST NOT set the VIRTIO_NET_HDR_F_DATA_VALID and
+VIRTIO_NET_HDR_F_RSC_INFO bits in \field{flags}.
+
+\devicenormative{\paragraph}{Packet Transmission}{Device Types / Network Device / Device Operation / Packet Transmission}
+The device MUST ignore \field{flags} bits that it does not recognize.
+
+If VIRTIO_NET_HDR_F_NEEDS_CSUM bit in \field{flags} is not set, the
+device MUST NOT use the \field{csum_start} and \field{csum_offset}.
+
+If one of the VIRTIO_NET_F_HOST_TSO4, TSO6, USO or UFO options has
+been negotiated:
+\begin{itemize}
+\item If the VIRTIO_NET_F_GUEST_HDRLEN feature has been negotiated,
+ and \field{gso_type} differs from VIRTIO_NET_HDR_GSO_NONE,
+ the device MAY use \field{hdr_len} as the transport header size.
+
+ \begin{note}
+ Caution should be taken by the implementation so as to prevent
+ a malicious driver from attacking the device by setting an incorrect \field{hdr_len}.
+ \end{note}
+
+\item If the VIRTIO_NET_F_GUEST_HDRLEN feature has not been negotiated,
+ or \field{gso_type} is VIRTIO_NET_HDR_GSO_NONE,
+ the device MAY use \field{hdr_len} only as a hint about the
+ transport header size.
+ The device MUST NOT rely on \field{hdr_len} to be correct.
+
+ \begin{note}
+ This is due to various bugs in implementations.
+ \end{note}
+\end{itemize}
+
+If VIRTIO_NET_HDR_F_NEEDS_CSUM is not set, the device MUST NOT
+rely on the packet checksum being correct.
+\paragraph{Packet Transmission Interrupt}\label{sec:Device Types / Network Device / Device Operation / Packet Transmission / Packet Transmission Interrupt}
+
+Often a driver will suppress transmission virtqueue interrupts
+and check for used packets in the transmit path of following
+packets.
+
+The normal behavior in this interrupt handler is to retrieve
+used buffers from the virtqueue and free the corresponding
+headers and packets.
+
+\subsubsection{Setting Up Receive Buffers}\label{sec:Device Types / Network Device / Device Operation / Setting Up Receive Buffers}
+
+It is generally a good idea to keep the receive virtqueue as
+fully populated as possible: if it runs out, network performance
+will suffer.
+
+If the VIRTIO_NET_F_GUEST_TSO4, VIRTIO_NET_F_GUEST_TSO6,
+VIRTIO_NET_F_GUEST_UFO, VIRTIO_NET_F_GUEST_USO4 or VIRTIO_NET_F_GUEST_USO6
+features are used, the maximum incoming packet
+will be 65550 bytes long (the maximum size of a
+TCP or UDP packet, plus the 14 byte ethernet header), otherwise
+1514 bytes. The 12-byte struct virtio_net_hdr is prepended to this,
+making for 65562 or 1526 bytes.
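The buffer-size arithmetic above can be written out as a small C helper. This is a sketch only: `rx_min_buffer_len()` is a hypothetical name, and it assumes the 12-byte struct virtio_net_hdr mentioned in the text (no hash-report fields).

```c
#include <stdbool.h>
#include <stddef.h>

#define VIRTIO_NET_HDR_LEN 12u     /* 12-byte struct virtio_net_hdr, per the text */
#define MAX_FRAME_OFFLOAD  65550u  /* largest TCP/UDP packet plus 14 byte ethernet header */
#define MAX_FRAME_PLAIN    1514u   /* largest frame without receive offloads */

/* Smallest single receive buffer that holds any incoming packet plus header.
 * guest_offload is true if any of GUEST_TSO4, TSO6, UFO, USO4 or USO6 is in use. */
size_t rx_min_buffer_len(bool guest_offload)
{
    size_t frame = guest_offload ? MAX_FRAME_OFFLOAD : MAX_FRAME_PLAIN;
    return frame + VIRTIO_NET_HDR_LEN;
}
```

The two results are the 65562 and 1526 byte figures used by the driver requirements that follow.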
+
+\drivernormative{\paragraph}{Setting Up Receive Buffers}{Device Types / Network Device / Device Operation / Setting Up Receive Buffers}
+
+\begin{itemize}
+\item If VIRTIO_NET_F_MRG_RXBUF is not negotiated:
+ \begin{itemize}
+ \item If VIRTIO_NET_F_GUEST_TSO4, VIRTIO_NET_F_GUEST_TSO6, VIRTIO_NET_F_GUEST_UFO,
+ VIRTIO_NET_F_GUEST_USO4 or VIRTIO_NET_F_GUEST_USO6 are negotiated, the driver SHOULD populate
+ the receive queue(s) with buffers of at least 65562 bytes.
+ \item Otherwise, the driver SHOULD populate the receive queue(s)
+ with buffers of at least 1526 bytes.
+ \end{itemize}
+\item If VIRTIO_NET_F_MRG_RXBUF is negotiated, each buffer MUST be at
+least the size of the struct virtio_net_hdr.
+\end{itemize}
+
+\begin{note}
+Obviously each buffer can be split across multiple descriptor elements.
+\end{note}
+
+If VIRTIO_NET_F_MQ is negotiated, each of receiveq1\ldots receiveqN
+that will be used SHOULD be populated with receive buffers.
+
+\devicenormative{\paragraph}{Setting Up Receive Buffers}{Device Types / Network Device / Device Operation / Setting Up Receive Buffers}
+
+The device MUST set \field{num_buffers} to the number of descriptors used to
+hold the incoming packet.
+
+The device MUST use only a single descriptor if VIRTIO_NET_F_MRG_RXBUF
+was not negotiated.
+\begin{note}
+This means that \field{num_buffers} will always be 1
+if VIRTIO_NET_F_MRG_RXBUF is not negotiated.
+\end{note}
+
+\subsubsection{Processing of Incoming Packets}\label{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets}
+\label{sec:Device Types / Network Device / Device Operation / Processing of Packets}%old label for latexdiff
+
+When a packet is copied into a buffer in the receiveq, the
+optimal path is to disable further used buffer notifications for the
+receiveq and process packets until no more are found, then re-enable
+them.
+
+Processing incoming packets involves:
+
+\begin{enumerate}
+\item \field{num_buffers} indicates how many descriptors
+ this packet is spread over (including this one): this will
+ always be 1 if VIRTIO_NET_F_MRG_RXBUF was not negotiated.
+ This allows receipt of large packets without having to allocate large
+ buffers: a packet that does not fit in a single buffer can flow
+ over to the next buffer, and so on. In this case, there will be
+ at least \field{num_buffers} used buffers in the virtqueue, and the device
+ chains them together to form a single packet in a way similar to
+ how it would store it in a single buffer spread over multiple
+ descriptors.
+ The other buffers will not begin with a struct virtio_net_hdr.
+
+\item If
+ \field{num_buffers} is one, then the entire packet will be
+ contained within this buffer, immediately following the struct
+ virtio_net_hdr.
+\item If the VIRTIO_NET_F_GUEST_CSUM feature was negotiated, the
+ VIRTIO_NET_HDR_F_DATA_VALID bit in \field{flags} can be
+ set: if so, the device has validated the packet checksum.
+ In case of multiple encapsulated protocols, one level of checksums
+ has been validated.
+\end{enumerate}
+
+Additionally, VIRTIO_NET_F_GUEST_CSUM, TSO4, TSO6, UFO and ECN
+features enable receive checksum, large receive offload and ECN
+support which are the input equivalents of the transmit checksum,
+transmit segmentation offloading and ECN features, as described
+in \ref{sec:Device Types / Network Device / Device Operation /
+Packet Transmission}:
+\begin{enumerate}
+\item If the VIRTIO_NET_F_GUEST_TSO4, TSO6, UFO, USO4 or USO6 options were
+ negotiated, then \field{gso_type} MAY be something other than
+ VIRTIO_NET_HDR_GSO_NONE, and \field{gso_size} field indicates the
+ desired MSS (see Packet Transmission point 3).
+\item If the VIRTIO_NET_F_RSC_EXT option was negotiated (this
+ implies one of VIRTIO_NET_F_GUEST_TSO4, TSO6), the
+ device also processes duplicated ACK segments, reports the
+ number of coalesced TCP segments in the \field{csum_start} field and
+ the number of duplicated ACK segments in the \field{csum_offset} field,
+ and sets the VIRTIO_NET_HDR_F_RSC_INFO bit in \field{flags}.
+\item If the VIRTIO_NET_F_GUEST_CSUM feature was negotiated, the
+ VIRTIO_NET_HDR_F_NEEDS_CSUM bit in \field{flags} can be
+ set: if so, the packet checksum at offset \field{csum_offset}
+ from \field{csum_start} and any preceding checksums
+ have been validated. The checksum on the packet is incomplete and
+ if bit VIRTIO_NET_HDR_F_RSC_INFO is not set in \field{flags},
+ then \field{csum_start} and \field{csum_offset} indicate how to calculate it
+ (see Packet Transmission point 2).
+
+\end{enumerate}
+
+If applicable, the device calculates per-packet hash for incoming packets as
+defined in \ref{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Hash calculation for incoming packets}.
+
+If applicable, the device reports hash information for incoming packets as
+defined in \ref{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Hash reporting for incoming packets}.
+
+\devicenormative{\paragraph}{Processing of Incoming Packets}{Device Types / Network Device / Device Operation / Processing of Incoming Packets}
+\label{devicenormative:Device Types / Network Device / Device Operation / Processing of Packets}%old label for latexdiff
+
+If VIRTIO_NET_F_MRG_RXBUF has not been negotiated, the device MUST set
+\field{num_buffers} to 1.
+
+If VIRTIO_NET_F_MRG_RXBUF has been negotiated, the device MUST set
+\field{num_buffers} to indicate the number of buffers
+the packet (including the header) is spread over.
+
+If a receive packet is spread over multiple buffers, the device
+MUST use all buffers but the last (i.e. the first \field{num_buffers} -
+1 buffers) completely up to the full length of each buffer
+supplied by the driver.
+
+The device MUST use all buffers used by a single receive
+packet together, such that at least \field{num_buffers} are
+observed by driver as used.
+
+If VIRTIO_NET_F_GUEST_CSUM is not negotiated, the device MUST set
+\field{flags} to zero and SHOULD supply a fully checksummed
+packet to the driver.
+
+If VIRTIO_NET_F_GUEST_TSO4 is not negotiated, the device MUST NOT set
+\field{gso_type} to VIRTIO_NET_HDR_GSO_TCPV4.
+
+If VIRTIO_NET_F_GUEST_UFO is not negotiated, the device MUST NOT set
+\field{gso_type} to VIRTIO_NET_HDR_GSO_UDP.
+
+If VIRTIO_NET_F_GUEST_TSO6 is not negotiated, the device MUST NOT set
+\field{gso_type} to VIRTIO_NET_HDR_GSO_TCPV6.
+
+If neither VIRTIO_NET_F_GUEST_USO4 nor VIRTIO_NET_F_GUEST_USO6 has been negotiated,
+the device MUST NOT set \field{gso_type} to VIRTIO_NET_HDR_GSO_UDP_L4.
+
+The device SHOULD NOT send to the driver TCP packets requiring segmentation offload
+which have the Explicit Congestion Notification bit set, unless the
+VIRTIO_NET_F_GUEST_ECN feature is negotiated, in which case the
+device MUST set the VIRTIO_NET_HDR_GSO_ECN bit in
+\field{gso_type}.
+
+If the VIRTIO_NET_F_GUEST_CSUM feature has been negotiated, the
+device MAY set the VIRTIO_NET_HDR_F_NEEDS_CSUM bit in
+\field{flags}, if so:
+\begin{enumerate}
+\item the device MUST validate the packet checksum at
+ offset \field{csum_offset} from \field{csum_start} as well as all
+ preceding offsets;
+\item the device MUST set the packet checksum stored in the
+ receive buffer to the TCP/UDP pseudo header;
+\item the device MUST set \field{csum_start} and
+ \field{csum_offset} such that calculating a ones'
+ complement checksum from \field{csum_start} up until the
+ end of the packet and storing the result at offset
+ \field{csum_offset} from \field{csum_start} will result in a
+ fully checksummed packet;
+\end{enumerate}
+
+If none of the VIRTIO_NET_F_GUEST_TSO4, TSO6, UFO, USO4 or USO6 options have
+been negotiated, the device MUST set \field{gso_type} to
+VIRTIO_NET_HDR_GSO_NONE.
+
+If \field{gso_type} differs from VIRTIO_NET_HDR_GSO_NONE, then
+the device MUST also set the VIRTIO_NET_HDR_F_NEEDS_CSUM bit in
+\field{flags} and MUST set \field{gso_size} to indicate the desired MSS.
+If VIRTIO_NET_F_RSC_EXT was negotiated, the device MUST also
+set the VIRTIO_NET_HDR_F_RSC_INFO bit in \field{flags},
+set \field{csum_start} to the number of coalesced TCP segments and
+set \field{csum_offset} to the number of received duplicated ACK segments.
+
+If VIRTIO_NET_F_RSC_EXT was not negotiated, the device MUST NOT
+set the VIRTIO_NET_HDR_F_RSC_INFO bit in \field{flags}.
+
+If one of the VIRTIO_NET_F_GUEST_TSO4, TSO6, UFO, USO4 or USO6 options has
+been negotiated, the device SHOULD set \field{hdr_len} to a value
+not less than the length of the headers, including the transport
+header.
+
+If the VIRTIO_NET_F_GUEST_CSUM feature has been negotiated, the
+device MAY set the VIRTIO_NET_HDR_F_DATA_VALID bit in
+\field{flags}, if so, the device MUST validate the packet
+checksum (in case of multiple encapsulated protocols, one level
+of checksums is validated).
+
+\drivernormative{\paragraph}{Processing of Incoming
+Packets}{Device Types / Network Device / Device Operation /
+Processing of Incoming Packets}
+
+The driver MUST ignore \field{flags} bits that it does not recognize.
+
+If the VIRTIO_NET_HDR_F_NEEDS_CSUM bit in \field{flags} is not set or
+if the VIRTIO_NET_HDR_F_RSC_INFO bit in \field{flags} is set, the
+driver MUST NOT use the \field{csum_start} and \field{csum_offset}.
+
+If one of the VIRTIO_NET_F_GUEST_TSO4, TSO6, UFO, USO4 or USO6 options has
+been negotiated, the driver MAY use \field{hdr_len} only as a hint about the
+transport header size.
+The driver MUST NOT rely on \field{hdr_len} to be correct.
+\begin{note}
+This is due to various bugs in implementations.
+\end{note}
+
+If neither VIRTIO_NET_HDR_F_NEEDS_CSUM nor
+VIRTIO_NET_HDR_F_DATA_VALID is set, the driver MUST NOT
+rely on the packet checksum being correct.
+
+\paragraph{Hash calculation for incoming packets}
+\label{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Hash calculation for incoming packets}
+
+A device attempts to calculate a per-packet hash in the following cases:
+\begin{itemize}
+\item The feature VIRTIO_NET_F_RSS was negotiated. The device uses the hash to determine the receive virtqueue to place incoming packets.
+\item The feature VIRTIO_NET_F_HASH_REPORT was negotiated. The device reports the hash value and the hash type with the packet.
+\end{itemize}
+
+If the feature VIRTIO_NET_F_RSS was negotiated:
+\begin{itemize}
+\item The device uses \field{hash_types} of the virtio_net_rss_config structure as 'Enabled hash types' bitmask.
+\item The device uses a key as defined in \field{hash_key_data} and \field{hash_key_length} of the virtio_net_rss_config structure (see
+\ref{sec:Device Types / Network Device / Device Operation / Control Virtqueue / Receive-side scaling (RSS) / Setting RSS parameters}).
+\end{itemize}
+
+If the feature VIRTIO_NET_F_RSS was not negotiated:
+\begin{itemize}
+\item The device uses \field{hash_types} of the virtio_net_hash_config structure as 'Enabled hash types' bitmask.
+\item The device uses a key as defined in \field{hash_key_data} and \field{hash_key_length} of the virtio_net_hash_config structure (see
+\ref{sec:Device Types / Network Device / Device Operation / Control Virtqueue / Automatic receive steering in multiqueue mode / Hash calculation}).
+\end{itemize}
+
+Note that if the device offers VIRTIO_NET_F_HASH_REPORT, even if it supports only one pair of virtqueues, it MUST support
+at least one of the commands of the VIRTIO_NET_CTRL_MQ class to configure reported hash parameters:
+\begin{itemize}
+\item If the device offers VIRTIO_NET_F_RSS, it MUST support VIRTIO_NET_CTRL_MQ_RSS_CONFIG command per
+ \ref{sec:Device Types / Network Device / Device Operation / Control Virtqueue / Receive-side scaling (RSS) / Setting RSS parameters}.
+\item Otherwise the device MUST support VIRTIO_NET_CTRL_MQ_HASH_CONFIG command per
+ \ref{sec:Device Types / Network Device / Device Operation / Control Virtqueue / Automatic receive steering in multiqueue mode / Hash calculation}.
+\end{itemize}
+
+\subparagraph{Supported/enabled hash types}
+\label{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Hash calculation for incoming packets / Supported/enabled hash types}
+Hash types applicable for IPv4 packets:
+\begin{lstlisting}
+#define VIRTIO_NET_HASH_TYPE_IPv4 (1 << 0)
+#define VIRTIO_NET_HASH_TYPE_TCPv4 (1 << 1)
+#define VIRTIO_NET_HASH_TYPE_UDPv4 (1 << 2)
+\end{lstlisting}
+Hash types applicable for IPv6 packets without extension headers:
+\begin{lstlisting}
+#define VIRTIO_NET_HASH_TYPE_IPv6 (1 << 3)
+#define VIRTIO_NET_HASH_TYPE_TCPv6 (1 << 4)
+#define VIRTIO_NET_HASH_TYPE_UDPv6 (1 << 5)
+\end{lstlisting}
+Hash types applicable for IPv6 packets with extension headers:
+\begin{lstlisting}
+#define VIRTIO_NET_HASH_TYPE_IP_EX (1 << 6)
+#define VIRTIO_NET_HASH_TYPE_TCP_EX (1 << 7)
+#define VIRTIO_NET_HASH_TYPE_UDP_EX (1 << 8)
+\end{lstlisting}
+
+\subparagraph{IPv4 packets}
+\label{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Hash calculation for incoming packets / IPv4 packets}
+The device calculates the hash on IPv4 packets according to 'Enabled hash types' bitmask as follows:
+\begin{itemize}
+\item If VIRTIO_NET_HASH_TYPE_TCPv4 is set and the packet has
+a TCP header, the hash is calculated over the following fields:
+\begin{itemize}
+\item Source IP address
+\item Destination IP address
+\item Source TCP port
+\item Destination TCP port
+\end{itemize}
+\item Else if VIRTIO_NET_HASH_TYPE_UDPv4 is set and the
+packet has a UDP header, the hash is calculated over the following fields:
+\begin{itemize}
+\item Source IP address
+\item Destination IP address
+\item Source UDP port
+\item Destination UDP port
+\end{itemize}
+\item Else if VIRTIO_NET_HASH_TYPE_IPv4 is set, the hash is
+calculated over the following fields:
+\begin{itemize}
+\item Source IP address
+\item Destination IP address
+\end{itemize}
+\item Else the device does not calculate the hash
+\end{itemize}
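The IPv4 selection order above (TCP first, then UDP, then plain IP, otherwise no hash) can be sketched as follows. `select_ipv4_hash_type()` and its boolean parameters are hypothetical; the function only picks which hash type applies and does not compute the hash itself.

```c
#include <stdbool.h>
#include <stdint.h>

#define VIRTIO_NET_HASH_TYPE_IPv4  (1u << 0)
#define VIRTIO_NET_HASH_TYPE_TCPv4 (1u << 1)
#define VIRTIO_NET_HASH_TYPE_UDPv4 (1u << 2)

/* Return the single hash type to apply to an IPv4 packet, or 0 if the
 * device does not calculate a hash. 'enabled' is the 'Enabled hash types'
 * bitmask; has_tcp/has_udp say which transport header the packet carries. */
uint32_t select_ipv4_hash_type(uint32_t enabled, bool has_tcp, bool has_udp)
{
    if ((enabled & VIRTIO_NET_HASH_TYPE_TCPv4) && has_tcp)
        return VIRTIO_NET_HASH_TYPE_TCPv4;   /* addresses + TCP ports */
    if ((enabled & VIRTIO_NET_HASH_TYPE_UDPv4) && has_udp)
        return VIRTIO_NET_HASH_TYPE_UDPv4;   /* addresses + UDP ports */
    if (enabled & VIRTIO_NET_HASH_TYPE_IPv4)
        return VIRTIO_NET_HASH_TYPE_IPv4;    /* addresses only */
    return 0;                                /* no hash calculated */
}
```

A TCP packet with only the IPv4 and UDPv4 types enabled falls through to the address-only hash, which is the "Else if" chain in the list.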
+
+\subparagraph{IPv6 packets without extension header}
+\label{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Hash calculation for incoming packets / IPv6 packets without extension header}
+The device calculates the hash on IPv6 packets without extension
+headers according to 'Enabled hash types' bitmask as follows:
+\begin{itemize}
+\item If VIRTIO_NET_HASH_TYPE_TCPv6 is set and the packet has
+a TCPv6 header, the hash is calculated over the following fields:
+\begin{itemize}
+\item Source IPv6 address
+\item Destination IPv6 address
+\item Source TCP port
+\item Destination TCP port
+\end{itemize}
+\item Else if VIRTIO_NET_HASH_TYPE_UDPv6 is set and the
+packet has a UDPv6 header, the hash is calculated over the following fields:
+\begin{itemize}
+\item Source IPv6 address
+\item Destination IPv6 address
+\item Source UDP port
+\item Destination UDP port
+\end{itemize}
+\item Else if VIRTIO_NET_HASH_TYPE_IPv6 is set, the hash is
+calculated over the following fields:
+\begin{itemize}
+\item Source IPv6 address
+\item Destination IPv6 address
+\end{itemize}
+\item Else the device does not calculate the hash
+\end{itemize}
+
+\subparagraph{IPv6 packets with extension header}
+\label{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Hash calculation for incoming packets / IPv6 packets with extension header}
+The device calculates the hash on IPv6 packets with extension
+headers according to 'Enabled hash types' bitmask as follows:
+\begin{itemize}
+\item If VIRTIO_NET_HASH_TYPE_TCP_EX is set and the packet
+has a TCPv6 header, the hash is calculated over the following fields:
+\begin{itemize}
+\item Home address from the home address option in the IPv6 destination options header. If the extension header is not present, use the Source IPv6 address.
+\item IPv6 address that is contained in the Routing-Header-Type-2 from the associated extension header. If the extension header is not present, use the Destination IPv6 address.
+\item Source TCP port
+\item Destination TCP port
+\end{itemize}
+\item Else if VIRTIO_NET_HASH_TYPE_UDP_EX is set and the
+packet has a UDPv6 header, the hash is calculated over the following fields:
+\begin{itemize}
+\item Home address from the home address option in the IPv6 destination options header. If the extension header is not present, use the Source IPv6 address.
+\item IPv6 address that is contained in the Routing-Header-Type-2 from the associated extension header. If the extension header is not present, use the Destination IPv6 address.
+\item Source UDP port
+\item Destination UDP port
+\end{itemize}
+\item Else if VIRTIO_NET_HASH_TYPE_IP_EX is set, the hash is
+calculated over the following fields:
+\begin{itemize}
+\item Home address from the home address option in the IPv6 destination options header. If the extension header is not present, use the Source IPv6 address.
+\item IPv6 address that is contained in the Routing-Header-Type-2 from the associated extension header. If the extension header is not present, use the Destination IPv6 address.
+\end{itemize}
+\item Else skip IPv6 extension headers and calculate the hash as
+defined for an IPv6 packet without extension headers
+(see \ref{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Hash calculation for incoming packets / IPv6 packets without extension header}).
+\end{itemize}
+
+\paragraph{Hash reporting for incoming packets}
+\label{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Hash reporting for incoming packets}
+
+If VIRTIO_NET_F_HASH_REPORT was negotiated and
+the device has calculated the hash for the packet, the device fills \field{hash_report} with the report type of the calculated hash
+and \field{hash_value} with the value of the calculated hash.
+
+If VIRTIO_NET_F_HASH_REPORT was negotiated but for any reason the
+hash was not calculated, the device sets \field{hash_report} to VIRTIO_NET_HASH_REPORT_NONE.
+
+Possible values that the device can report in \field{hash_report} are defined below.
+They correspond to supported hash types defined in
+\ref{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Hash calculation for incoming packets / Supported/enabled hash types}
+as follows:
+
+VIRTIO_NET_HASH_TYPE_XXX = 1 << (VIRTIO_NET_HASH_REPORT_XXX - 1)
+
+\begin{lstlisting}
+#define VIRTIO_NET_HASH_REPORT_NONE 0
+#define VIRTIO_NET_HASH_REPORT_IPv4 1
+#define VIRTIO_NET_HASH_REPORT_TCPv4 2
+#define VIRTIO_NET_HASH_REPORT_UDPv4 3
+#define VIRTIO_NET_HASH_REPORT_IPv6 4
+#define VIRTIO_NET_HASH_REPORT_TCPv6 5
+#define VIRTIO_NET_HASH_REPORT_UDPv6 6
+#define VIRTIO_NET_HASH_REPORT_IPv6_EX 7
+#define VIRTIO_NET_HASH_REPORT_TCPv6_EX 8
+#define VIRTIO_NET_HASH_REPORT_UDPv6_EX 9
+\end{lstlisting}
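The relation VIRTIO_NET_HASH_TYPE_XXX = 1 << (VIRTIO_NET_HASH_REPORT_XXX - 1) can be checked with a short C helper. `hash_report_to_type()` is a hypothetical name; treating values above 9 as "no type" is an assumption of this sketch, not something the text specifies.

```c
#include <stdint.h>

#define VIRTIO_NET_HASH_REPORT_NONE     0
#define VIRTIO_NET_HASH_REPORT_IPv4     1
#define VIRTIO_NET_HASH_REPORT_UDPv6_EX 9

/* Map a hash_report value to the matching hash type bit.
 * Returns 0 for VIRTIO_NET_HASH_REPORT_NONE and for values this sketch
 * does not know about. */
uint32_t hash_report_to_type(uint16_t report)
{
    if (report == VIRTIO_NET_HASH_REPORT_NONE ||
        report > VIRTIO_NET_HASH_REPORT_UDPv6_EX)
        return 0;
    return UINT32_C(1) << (report - 1);
}
```

For example, VIRTIO_NET_HASH_REPORT_UDPv6_EX (9) maps to bit 8, the position of VIRTIO_NET_HASH_TYPE_UDP_EX.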
+
+\subsubsection{Control Virtqueue}\label{sec:Device Types / Network Device / Device Operation / Control Virtqueue}
+
+The driver uses the control virtqueue (if VIRTIO_NET_F_CTRL_VQ is
+negotiated) to send commands to manipulate various features of
+the device which would not easily map into the configuration
+space.
+
+All commands are of the following form:
+
+\begin{lstlisting}
+struct virtio_net_ctrl {
+ u8 class;
+ u8 command;
+ u8 command-specific-data[];
+ u8 ack;
+};
+
+/* ack values */
+#define VIRTIO_NET_OK 0
+#define VIRTIO_NET_ERR 1
+\end{lstlisting}
+
+The \field{class}, \field{command} and command-specific-data are set by the
+driver, and the device sets the \field{ack} byte. There is little the driver can
+do except issue a diagnostic if \field{ack} is not
+VIRTIO_NET_OK.
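A possible way to lay out the driver-written part of a control command is sketched below. `ctrl_cmd_pack()` is a hypothetical helper; it assumes the driver places \field{class}, \field{command} and the command-specific-data contiguously and leaves the \field{ack} byte in separate storage for the device to fill in.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define VIRTIO_NET_OK  0
#define VIRTIO_NET_ERR 1

/* Serialize class, command and command-specific-data into 'out'.
 * Returns the number of bytes written (2 + data_len), or 0 if 'out'
 * is too small. The ack byte is not part of this buffer. */
size_t ctrl_cmd_pack(uint8_t *out, size_t out_len,
                     uint8_t class_id, uint8_t command,
                     const void *data, size_t data_len)
{
    if (out_len < 2 || data_len > out_len - 2)
        return 0;
    out[0] = class_id;
    out[1] = command;
    if (data_len > 0)
        memcpy(out + 2, data, data_len);
    return 2 + data_len;
}
```

After the device consumes the command, the driver would compare the ack byte against VIRTIO_NET_OK and log a diagnostic otherwise.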
+
+\paragraph{Packet Receive Filtering}\label{sec:Device Types / Network Device / Device Operation / Control Virtqueue / Packet Receive Filtering}
+\label{sec:Device Types / Network Device / Device Operation / Control Virtqueue / Setting Promiscuous Mode}%old label for latexdiff
+
+If the VIRTIO_NET_F_CTRL_RX and VIRTIO_NET_F_CTRL_RX_EXTRA
+features are negotiated, the driver can send control commands for
+promiscuous mode, multicast, unicast and broadcast receiving.
+
+\begin{note}
+In general, these commands are best-effort: unwanted
+packets could still arrive.
+\end{note}
+
+\begin{lstlisting}
+#define VIRTIO_NET_CTRL_RX 0
+ #define VIRTIO_NET_CTRL_RX_PROMISC 0
+ #define VIRTIO_NET_CTRL_RX_ALLMULTI 1
+ #define VIRTIO_NET_CTRL_RX_ALLUNI 2
+ #define VIRTIO_NET_CTRL_RX_NOMULTI 3
+ #define VIRTIO_NET_CTRL_RX_NOUNI 4
+ #define VIRTIO_NET_CTRL_RX_NOBCAST 5
+\end{lstlisting}
+
+
+\devicenormative{\subparagraph}{Packet Receive Filtering}{Device Types / Network Device / Device Operation / Control Virtqueue / Packet Receive Filtering}
+
+If the VIRTIO_NET_F_CTRL_RX feature has been negotiated,
+the device MUST support the following VIRTIO_NET_CTRL_RX class
+commands:
+\begin{itemize}
+\item VIRTIO_NET_CTRL_RX_PROMISC turns promiscuous mode on and
+off. The command-specific-data is one byte containing 0 (off) or
+1 (on). If promiscuous mode is on, the device SHOULD receive all
+incoming packets.
+This SHOULD take effect even if one of the other modes set by
+a VIRTIO_NET_CTRL_RX class command is on.
+\item VIRTIO_NET_CTRL_RX_ALLMULTI turns all-multicast receive on and
+off. The command-specific-data is one byte containing 0 (off) or
+1 (on). When all-multicast receive is on the device SHOULD allow
+all incoming multicast packets.
+\end{itemize}
+
+If the VIRTIO_NET_F_CTRL_RX_EXTRA feature has been negotiated,
+the device MUST support the following VIRTIO_NET_CTRL_RX class
+commands:
+\begin{itemize}
+\item VIRTIO_NET_CTRL_RX_ALLUNI turns all-unicast receive on and
+off. The command-specific-data is one byte containing 0 (off) or
+1 (on). When all-unicast receive is on the device SHOULD allow
+all incoming unicast packets.
+\item VIRTIO_NET_CTRL_RX_NOMULTI suppresses multicast receive.
+The command-specific-data is one byte containing 0 (multicast
+receive allowed) or 1 (multicast receive suppressed).
+When multicast receive is suppressed, the device SHOULD NOT
+send multicast packets to the driver.
+This SHOULD take effect even if VIRTIO_NET_CTRL_RX_ALLMULTI is on.
+This filter SHOULD NOT apply to broadcast packets.
+\item VIRTIO_NET_CTRL_RX_NOUNI suppresses unicast receive.
+The command-specific-data is one byte containing 0 (unicast
+receive allowed) or 1 (unicast receive suppressed).
+When unicast receive is suppressed, the device SHOULD NOT
+send unicast packets to the driver.
+This SHOULD take effect even if VIRTIO_NET_CTRL_RX_ALLUNI is on.
+\item VIRTIO_NET_CTRL_RX_NOBCAST suppresses broadcast receive.
+The command-specific-data is one byte containing 0 (broadcast
+receive allowed) or 1 (broadcast receive suppressed).
+When broadcast receive is suppressed, the device SHOULD NOT
+send broadcast packets to the driver.
+This SHOULD take effect even if VIRTIO_NET_CTRL_RX_ALLMULTI is on.
+\end{itemize}
+
+\drivernormative{\subparagraph}{Packet Receive Filtering}{Device Types / Network Device / Device Operation / Control Virtqueue / Packet Receive Filtering}
+
+If the VIRTIO_NET_F_CTRL_RX feature has not been negotiated,
+the driver MUST NOT issue commands VIRTIO_NET_CTRL_RX_PROMISC or
+VIRTIO_NET_CTRL_RX_ALLMULTI.
+
+If the VIRTIO_NET_F_CTRL_RX_EXTRA feature has not been negotiated,
+the driver MUST NOT issue commands
+ VIRTIO_NET_CTRL_RX_ALLUNI,
+ VIRTIO_NET_CTRL_RX_NOMULTI,
+ VIRTIO_NET_CTRL_RX_NOUNI or
+ VIRTIO_NET_CTRL_RX_NOBCAST.
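The two driver rules above amount to a lookup from command to the feature that gates it. A minimal sketch, with `rx_cmd_allowed()` as a hypothetical helper taking the negotiation results as booleans:

```c
#include <stdbool.h>
#include <stdint.h>

#define VIRTIO_NET_CTRL_RX_PROMISC  0
#define VIRTIO_NET_CTRL_RX_ALLMULTI 1
#define VIRTIO_NET_CTRL_RX_ALLUNI   2
#define VIRTIO_NET_CTRL_RX_NOMULTI  3
#define VIRTIO_NET_CTRL_RX_NOUNI    4
#define VIRTIO_NET_CTRL_RX_NOBCAST  5

/* True if the driver may issue this VIRTIO_NET_CTRL_RX class command.
 * ctrl_rx / ctrl_rx_extra say whether VIRTIO_NET_F_CTRL_RX and
 * VIRTIO_NET_F_CTRL_RX_EXTRA were negotiated. */
bool rx_cmd_allowed(uint8_t command, bool ctrl_rx, bool ctrl_rx_extra)
{
    switch (command) {
    case VIRTIO_NET_CTRL_RX_PROMISC:
    case VIRTIO_NET_CTRL_RX_ALLMULTI:
        return ctrl_rx;
    case VIRTIO_NET_CTRL_RX_ALLUNI:
    case VIRTIO_NET_CTRL_RX_NOMULTI:
    case VIRTIO_NET_CTRL_RX_NOUNI:
    case VIRTIO_NET_CTRL_RX_NOBCAST:
        return ctrl_rx_extra;
    default:
        return false;   /* unknown command: do not send */
    }
}
```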
+
+\paragraph{Setting MAC Address Filtering}\label{sec:Device Types / Network Device / Device Operation / Control Virtqueue / Setting MAC Address Filtering}
+
+If the VIRTIO_NET_F_CTRL_RX feature is negotiated, the driver can
+send control commands for MAC address filtering.
+
+\begin{lstlisting}
+struct virtio_net_ctrl_mac {
+ le32 entries;
+ u8 macs[entries][6];
+};
+
+#define VIRTIO_NET_CTRL_MAC 1
+ #define VIRTIO_NET_CTRL_MAC_TABLE_SET 0
+ #define VIRTIO_NET_CTRL_MAC_ADDR_SET 1
+\end{lstlisting}
+
+The device can filter incoming packets by any number of destination
+MAC addresses\footnote{Since there are no guarantees, it can use a hash filter or
+silently switch to allmulti or promiscuous mode if it is given too
+many addresses.
+}. This table is set using the class
+VIRTIO_NET_CTRL_MAC and the command VIRTIO_NET_CTRL_MAC_TABLE_SET. The
+command-specific-data is two variable length tables of 6-byte MAC
+addresses (as described in struct virtio_net_ctrl_mac). The first table contains unicast addresses, and the second
+contains multicast addresses.
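
As a rough sketch of the command-specific-data layout described above (two back-to-back tables, each a le32 count followed by that many 6-byte addresses), a driver could serialize it like this. The helper names are hypothetical, and the MAC arrays are assumed to be flat byte buffers of 6 bytes per entry.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Store a 32-bit value in little-endian byte order. */
static void put_le32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)v;
    p[1] = (uint8_t)(v >> 8);
    p[2] = (uint8_t)(v >> 16);
    p[3] = (uint8_t)(v >> 24);
}

/* Write one table: le32 entry count, then n * 6 address bytes. */
static size_t put_mac_table(uint8_t *out, const uint8_t *macs, uint32_t n)
{
    put_le32(out, n);
    if (n)
        memcpy(out + 4, macs, (size_t)n * 6);
    return 4 + (size_t)n * 6;
}

/* Unicast table first, multicast table second.
 * Returns the number of bytes written, or 0 if cap is too small. */
static size_t build_mac_table_set(uint8_t *out, size_t cap,
                                  const uint8_t *uni, uint32_t n_uni,
                                  const uint8_t *multi, uint32_t n_multi)
{
    size_t need = 8 + ((size_t)n_uni + n_multi) * 6;
    size_t off;

    if (cap < need)
        return 0;
    off = put_mac_table(out, uni, n_uni);
    off += put_mac_table(out + off, multi, n_multi);
    return off;
}
```

Either table may be empty; an empty table still contributes its 4-byte count of zero.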
+
+The VIRTIO_NET_CTRL_MAC_ADDR_SET command is used to set the
+default MAC address which rx filtering
+accepts (and if VIRTIO_NET_F_MAC has been negotiated,
+this will be reflected in \field{mac} in config space).
+
+The command-specific-data for VIRTIO_NET_CTRL_MAC_ADDR_SET is
+the 6-byte MAC address.
+
+\devicenormative{\subparagraph}{Setting MAC Address Filtering}{Device Types / Network Device / Device Operation / Control Virtqueue / Setting MAC Address Filtering}
+
+The device MUST have an empty MAC filtering table on reset.
+
+The device MUST update the MAC filtering table before it consumes
+the VIRTIO_NET_CTRL_MAC_TABLE_SET command.
+
+The device MUST update \field{mac} in config space before it consumes
+the VIRTIO_NET_CTRL_MAC_ADDR_SET command, if VIRTIO_NET_F_MAC has
+been negotiated.
+
+The device SHOULD drop incoming packets which have a destination MAC which
+matches neither the \field{mac} (or that set with VIRTIO_NET_CTRL_MAC_ADDR_SET)
+nor the MAC filtering table.
+
+\drivernormative{\subparagraph}{Setting MAC Address Filtering}{Device Types / Network Device / Device Operation / Control Virtqueue / Setting MAC Address Filtering}
+
+If VIRTIO_NET_F_CTRL_RX has not been negotiated,
+the driver MUST NOT issue VIRTIO_NET_CTRL_MAC class commands.
+
+If VIRTIO_NET_F_CTRL_RX has been negotiated,
+the driver SHOULD issue VIRTIO_NET_CTRL_MAC_ADDR_SET
+to set the default mac if it is different from \field{mac}.
+
+The driver MUST follow the VIRTIO_NET_CTRL_MAC_TABLE_SET command
+by a le32 number, followed by that number of non-multicast
+MAC addresses, followed by another le32 number, followed by
+that number of multicast addresses. Either number MAY be 0.
+
+\subparagraph{Legacy Interface: Setting MAC Address Filtering}\label{sec:Device Types / Network Device / Device Operation / Control Virtqueue / Setting MAC Address Filtering / Legacy Interface: Setting MAC Address Filtering}
+When using the legacy interface, transitional devices and drivers
+MUST format \field{entries} in struct virtio_net_ctrl_mac
+according to the native endian of the guest rather than
+(necessarily when not using the legacy interface) little-endian.
+
+Legacy drivers that didn't negotiate VIRTIO_NET_F_CTRL_MAC_ADDR
+changed \field{mac} in config space when NIC is accepting
+incoming packets. These drivers always wrote the mac value from
+first to last byte, therefore after detecting such drivers,
+a transitional device MAY defer MAC update, or MAY defer
+processing incoming packets until driver writes the last byte
+of \field{mac} in the config space.
+
+\paragraph{VLAN Filtering}\label{sec:Device Types / Network Device / Device Operation / Control Virtqueue / VLAN Filtering}
+
+If the driver negotiates the VIRTIO_NET_F_CTRL_VLAN feature, it
+can control a VLAN filter table in the device.
+
+\begin{note}
+Similar to the MAC address based filtering, the VLAN filtering
+is also best-effort: unwanted packets could still arrive.
+\end{note}
+
+\begin{lstlisting}
+#define VIRTIO_NET_CTRL_VLAN 2
+ #define VIRTIO_NET_CTRL_VLAN_ADD 0
+ #define VIRTIO_NET_CTRL_VLAN_DEL 1
+\end{lstlisting}
+
+Both the VIRTIO_NET_CTRL_VLAN_ADD and VIRTIO_NET_CTRL_VLAN_DEL
+commands take a little-endian 16-bit VLAN id as the command-specific-data.
+
+\subparagraph{Legacy Interface: VLAN Filtering}\label{sec:Device Types / Network Device / Device Operation / Control Virtqueue / VLAN Filtering / Legacy Interface: VLAN Filtering}
+When using the legacy interface, transitional devices and drivers
+MUST format the VLAN id
+according to the native endian of the guest rather than
+(necessarily when not using the legacy interface) little-endian.
+
+\paragraph{Gratuitous Packet Sending}\label{sec:Device Types / Network Device / Device Operation / Control Virtqueue / Gratuitous Packet Sending}
+
+If the driver negotiates the VIRTIO_NET_F_GUEST_ANNOUNCE feature (which depends
+on VIRTIO_NET_F_CTRL_VQ), the device can ask the driver to send gratuitous
+packets; this is usually done after the guest has been physically
+migrated, and needs to announce its presence on the new network
+links. (As the hypervisor does not know the guest's
+network configuration (e.g. tagged VLANs), it is simplest to prod
+the guest in this way.)
+
+\begin{lstlisting}
+#define VIRTIO_NET_CTRL_ANNOUNCE 3
+ #define VIRTIO_NET_CTRL_ANNOUNCE_ACK 0
+\end{lstlisting}
+
+The driver checks the VIRTIO_NET_S_ANNOUNCE bit in the device configuration \field{status} field
+when it notices a device configuration change. The
+command VIRTIO_NET_CTRL_ANNOUNCE_ACK indicates that the
+driver has received the notification; the device then clears the
+VIRTIO_NET_S_ANNOUNCE bit in \field{status}.
+
+Processing this notification involves:
+
+\begin{enumerate}
+\item Sending the gratuitous packets (e.g. ARP), or marking that gratuitous
+ packets are pending and letting a deferred routine
+ send them.
+
+\item Sending VIRTIO_NET_CTRL_ANNOUNCE_ACK command through control
+ vq.
+\end{enumerate}
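
The two steps above could look roughly like this on the driver side, using the deferred variant. The state struct and function name are hypothetical; the value 2 for VIRTIO_NET_S_ANNOUNCE comes from the network device configuration layout, which is not part of this hunk.

```c
#include <stdbool.h>
#include <stdint.h>

#define VIRTIO_NET_S_ANNOUNCE 2  /* status bit, per the config layout section */

/* Hypothetical deferred-work flags kept by a driver. */
struct announce_state {
    bool gratuitous_pending;  /* gratuitous packets (e.g. ARP) still to send */
    bool ack_pending;         /* VIRTIO_NET_CTRL_ANNOUNCE_ACK still to send  */
};

/* Called when the driver notices a configuration change. */
static void on_config_change(uint16_t status, struct announce_state *st)
{
    if (!(status & VIRTIO_NET_S_ANNOUNCE))
        return;
    st->gratuitous_pending = true;  /* step 1, left to a deferred routine */
    st->ack_pending = true;         /* step 2, sent via the control vq    */
}
```

A deferred routine would then send the packets, issue the ACK on the control virtqueue and clear both flags.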
+
+\drivernormative{\subparagraph}{Gratuitous Packet Sending}{Device Types / Network Device / Device Operation / Control Virtqueue / Gratuitous Packet Sending}
+
+If the driver negotiates VIRTIO_NET_F_GUEST_ANNOUNCE, it SHOULD notify
+network peers of its new location after it sees the VIRTIO_NET_S_ANNOUNCE bit
+in \field{status}. The driver MUST send a command on the command queue
+with class VIRTIO_NET_CTRL_ANNOUNCE and command VIRTIO_NET_CTRL_ANNOUNCE_ACK.
+
+\devicenormative{\subparagraph}{Gratuitous Packet Sending}{Device Types / Network Device / Device Operation / Control Virtqueue / Gratuitous Packet Sending}
+
+If VIRTIO_NET_F_GUEST_ANNOUNCE is negotiated, the device MUST clear the
+VIRTIO_NET_S_ANNOUNCE bit in \field{status} upon receipt of a command buffer
+with class VIRTIO_NET_CTRL_ANNOUNCE and command VIRTIO_NET_CTRL_ANNOUNCE_ACK
+before marking the buffer as used.
+
+\paragraph{Device operation in multiqueue mode}\label{sec:Device Types / Network Device / Device Operation / Control Virtqueue / Device operation in multiqueue mode}
+
+This specification defines the following modes that a device MAY implement for operation with multiple transmit/receive virtqueues:
+\begin{itemize}
+\item Automatic receive steering as defined in \ref{sec:Device Types / Network Device / Device Operation / Control Virtqueue / Automatic receive steering in multiqueue mode}.
+ If a device supports this mode, it offers the VIRTIO_NET_F_MQ feature bit.
+\item Receive-side scaling as defined in \ref{devicenormative:Device Types / Network Device / Device Operation / Control Virtqueue / Receive-side scaling (RSS) / RSS processing}.
+ If a device supports this mode, it offers the VIRTIO_NET_F_RSS feature bit.
+\end{itemize}
+
+A device MAY support one of these features or both. The driver MAY negotiate any set of these features that the device supports.
+
+Multiqueue is disabled by default.
+
+The driver enables multiqueue by sending a command using \field{class} VIRTIO_NET_CTRL_MQ. The \field{command} selects the mode of multiqueue operation, as follows:
+\begin{lstlisting}
+#define VIRTIO_NET_CTRL_MQ 4
+ #define VIRTIO_NET_CTRL_MQ_VQ_PAIRS_SET 0 (for automatic receive steering)
+ #define VIRTIO_NET_CTRL_MQ_RSS_CONFIG 1 (for configurable receive steering)
+ #define VIRTIO_NET_CTRL_MQ_HASH_CONFIG 2 (for configurable hash calculation)
+\end{lstlisting}
+
+If more than one multiqueue mode is negotiated, the resulting device configuration is defined by the last command sent by the driver.
+
+\paragraph{Automatic receive steering in multiqueue mode}\label{sec:Device Types / Network Device / Device Operation / Control Virtqueue / Automatic receive steering in multiqueue mode}
+
+If the driver negotiates the VIRTIO_NET_F_MQ feature bit (depends on VIRTIO_NET_F_CTRL_VQ), it MAY transmit outgoing packets on one
+of the multiple transmitq1\ldots transmitqN and ask the device to
+queue incoming packets into one of the multiple receiveq1\ldots receiveqN
+depending on the packet flow.
+
+The driver enables multiqueue by
+sending the VIRTIO_NET_CTRL_MQ_VQ_PAIRS_SET command, specifying
+the number of the transmit and receive queues to be used up to
+\field{max_virtqueue_pairs}; subsequently,
+transmitq1\ldots transmitqn and receiveq1\ldots receiveqn where
+n=\field{virtqueue_pairs} MAY be used.
+\begin{lstlisting}
+struct virtio_net_ctrl_mq_pairs_set {
+ le16 virtqueue_pairs;
+};
+#define VIRTIO_NET_CTRL_MQ_VQ_PAIRS_MIN 1
+#define VIRTIO_NET_CTRL_MQ_VQ_PAIRS_MAX 0x8000
+
+\end{lstlisting}
+
+When multiqueue is enabled by VIRTIO_NET_CTRL_MQ_VQ_PAIRS_SET command, the device MUST use automatic receive steering
+based on packet flow. Programming of the receive steering
+classifier is implicit. After the driver has transmitted a packet of a
+flow on transmitqX, the device SHOULD cause incoming packets for that flow to
+be steered to receiveqX. For uni-directional protocols, or where
+no packets have been transmitted yet, the device MAY steer a packet
+to a random queue out of the specified receiveq1\ldots receiveqn.
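
One possible way for a device to implement this implicit steering is a small flow table that remembers the last transmit queue per flow. The sketch below is an assumption-laden illustration, not the spec's method: the table size, the caller-supplied flow hash and random value, and all names are made up, and hash collisions are simply ignored.

```c
#include <stdint.h>

#define FLOW_SLOTS 256  /* arbitrary table size for the sketch */

struct steer {
    uint16_t n;                    /* virtqueue_pairs currently enabled, > 0 */
    uint16_t last_tx[FLOW_SLOTS];  /* 1-based transmitq index, 0 = unseen    */
};

/* Record that the driver sent a packet of this flow on transmitq<txq>. */
static void steer_on_tx(struct steer *s, uint32_t flow_hash, uint16_t txq)
{
    s->last_tx[flow_hash % FLOW_SLOTS] = txq;
}

/* Pick the 1-based receiveq index for an incoming packet. */
static uint16_t steer_pick_rx(const struct steer *s, uint32_t flow_hash,
                              uint32_t rnd)
{
    uint16_t q = s->last_tx[flow_hash % FLOW_SLOTS];

    if (q >= 1 && q <= s->n)
        return q;                       /* follow the transmit queue */
    return (uint16_t)(rnd % s->n) + 1;  /* unseen flow: random queue */
}
```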
+
+Multiqueue is disabled by sending VIRTIO_NET_CTRL_MQ_VQ_PAIRS_SET with \field{virtqueue_pairs} set to 1 (this is
+the default) and waiting for the device to use the command buffer.
+
+\drivernormative{\subparagraph}{Automatic receive steering in multiqueue mode}{Device Types / Network Device / Device Operation / Control Virtqueue / Automatic receive steering in multiqueue mode}
+
+The driver MUST configure the virtqueues before enabling them with the
+VIRTIO_NET_CTRL_MQ_VQ_PAIRS_SET command.
+
+The driver MUST NOT request a \field{virtqueue_pairs} of 0 or
+greater than \field{max_virtqueue_pairs} in the device configuration space.
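
A driver-side check for this requirement might look like the following minimal sketch. The function name is hypothetical; the two limits are the VIRTIO_NET_CTRL_MQ_VQ_PAIRS_MIN and VIRTIO_NET_CTRL_MQ_VQ_PAIRS_MAX values from the listing above.

```c
#include <stdbool.h>
#include <stdint.h>

#define VIRTIO_NET_CTRL_MQ_VQ_PAIRS_MIN 1
#define VIRTIO_NET_CTRL_MQ_VQ_PAIRS_MAX 0x8000

/* True if 'requested' may be sent in VIRTIO_NET_CTRL_MQ_VQ_PAIRS_SET. */
static bool vq_pairs_valid(uint16_t requested, uint16_t max_virtqueue_pairs)
{
    return requested >= VIRTIO_NET_CTRL_MQ_VQ_PAIRS_MIN &&
           requested <= VIRTIO_NET_CTRL_MQ_VQ_PAIRS_MAX &&
           requested <= max_virtqueue_pairs;
}
```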
+
+The driver MUST queue packets only on transmitq1 before the
+VIRTIO_NET_CTRL_MQ_VQ_PAIRS_SET command.
+
+The driver MUST NOT queue packets on transmit queues greater than
+\field{virtqueue_pairs} once it has placed the VIRTIO_NET_CTRL_MQ_VQ_PAIRS_SET command in the available ring.
+
+\devicenormative{\subparagraph}{Automatic receive steering in multiqueue mode}{Device Types / Network Device / Device Operation / Control Virtqueue / Automatic receive steering in multiqueue mode}
+
+After initialization or reset, the device MUST queue packets only on receiveq1.
+
+The device MUST NOT queue packets on receive queues greater than
+\field{virtqueue_pairs} once it has placed the
+VIRTIO_NET_CTRL_MQ_VQ_PAIRS_SET command in a used buffer.
+
+If the destination receive queue is being reset (See \ref{sec:Basic Facilities of a Virtio Device / Virtqueues / Virtqueue Reset}),
+the device SHOULD re-select another random queue. If all receive queues are
+being reset, the device MUST drop the packet.
+
+\subparagraph{Legacy Interface: Automatic receive steering in multiqueue mode}\label{sec:Device Types / Network Device / Device Operation / Control Virtqueue / Automatic receive steering in multiqueue mode / Legacy Interface: Automatic receive steering in multiqueue mode}
+When using the legacy interface, transitional devices and drivers
+MUST format \field{virtqueue_pairs}
+according to the native endian of the guest rather than
+(necessarily when not using the legacy interface) little-endian.
+
+\subparagraph{Hash calculation}\label{sec:Device Types / Network Device / Device Operation / Control Virtqueue / Automatic receive steering in multiqueue mode / Hash calculation}
+If VIRTIO_NET_F_HASH_REPORT was negotiated and the device uses automatic receive steering,
+the device MUST support a command to configure hash calculation parameters.
+
+The driver provides parameters for hash calculation as follows:
+
+\field{class} VIRTIO_NET_CTRL_MQ, \field{command} VIRTIO_NET_CTRL_MQ_HASH_CONFIG.
+
+The \field{command-specific-data} has the following format:
+\begin{lstlisting}
+struct virtio_net_hash_config {
+ le32 hash_types;
+ le16 reserved[4];
+ u8 hash_key_length;
+ u8 hash_key_data[hash_key_length];
+};
+\end{lstlisting}
+Field \field{hash_types} contains a bitmask of allowed hash types as
+defined in
+\ref{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Hash calculation for incoming packets / Supported/enabled hash types}.
+Initially the device has all hash types disabled and reports only VIRTIO_NET_HASH_REPORT_NONE.
+
+Field \field{reserved} MUST contain zeroes. It is defined so that the structure matches the layout of the virtio_net_rss_config structure,
+defined in \ref{sec:Device Types / Network Device / Device Operation / Control Virtqueue / Receive-side scaling (RSS)}.
+
+Fields \field{hash_key_length} and \field{hash_key_data} define the key to be used in hash calculation.
+
+\paragraph{Receive-side scaling (RSS)}\label{sec:Device Types / Network Device / Device Operation / Control Virtqueue / Receive-side scaling (RSS)}
+A device offers the feature VIRTIO_NET_F_RSS if it supports RSS receive steering with Toeplitz hash calculation and configurable parameters.
+
+A driver queries RSS capabilities of the device by reading device configuration as defined in \ref{sec:Device Types / Network Device / Device configuration layout}.
+
+\subparagraph{Setting RSS parameters}\label{sec:Device Types / Network Device / Device Operation / Control Virtqueue / Receive-side scaling (RSS) / Setting RSS parameters}
+
+The driver sends a VIRTIO_NET_CTRL_MQ_RSS_CONFIG command using the following format for \field{command-specific-data}:
+\begin{lstlisting}
+struct virtio_net_rss_config {
+ le32 hash_types;
+ le16 indirection_table_mask;
+ le16 unclassified_queue;
+ le16 indirection_table[indirection_table_length];
+ le16 max_tx_vq;
+ u8 hash_key_length;
+ u8 hash_key_data[hash_key_length];
+};
+\end{lstlisting}
+Field \field{hash_types} contains a bitmask of allowed hash types as
+defined in
+\ref{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Hash calculation for incoming packets / Supported/enabled hash types}.
+
+Field \field{indirection_table_mask} is a mask to be applied to
+the calculated hash to produce an index in the
+\field{indirection_table} array.
+Number of entries in \field{indirection_table} is (\field{indirection_table_mask} + 1).
+
+Field \field{unclassified_queue} contains the 0-based index of
+the receive virtqueue to place unclassified packets in. Index 0 corresponds to receiveq1.
+
+Field \field{indirection_table} contains an array of 0-based indices of receive virtqueues. Index 0 corresponds to receiveq1.
+
+A driver sets \field{max_tx_vq} to inform a device how many transmit virtqueues it may use (transmitq1\ldots transmitq \field{max_tx_vq}).
+
+Fields \field{hash_key_length} and \field{hash_key_data} define the key to be used in hash calculation.
+
+\drivernormative{\subparagraph}{Setting RSS parameters}{Device Types / Network Device / Device Operation / Control Virtqueue / Receive-side scaling (RSS) }
+
+A driver MUST NOT send the VIRTIO_NET_CTRL_MQ_RSS_CONFIG command if the feature VIRTIO_NET_F_RSS has not been negotiated.
+
+A driver MUST fill the \field{indirection_table} array only with indices of enabled queues. Index 0 corresponds to receiveq1.
+
+The number of entries in \field{indirection_table} (\field{indirection_table_mask} + 1) MUST be a power of two.
+
+A driver MUST use \field{indirection_table_mask} values that are less than \field{rss_max_indirection_table_length} reported by a device.
+
+A driver MUST NOT set any VIRTIO_NET_HASH_TYPE_ flags that are not supported by a device.
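
The indirection-table requirements above can be checked before sending VIRTIO_NET_CTRL_MQ_RSS_CONFIG. The sketch below is illustrative only: the function name and the `queue_enabled` array (one flag per 0-based receive queue) are hypothetical bookkeeping a driver might keep.

```c
#include <stdbool.h>
#include <stdint.h>

/* Validate indirection_table_mask and the table contents. */
static bool rss_table_ok(uint16_t mask,
                         uint16_t rss_max_indirection_table_length,
                         const uint16_t *table,
                         const bool *queue_enabled, uint16_t n_queues)
{
    uint32_t entries = (uint32_t)mask + 1;
    uint32_t i;

    /* entries is a power of two iff entries & (entries - 1) == 0. */
    if (entries & mask)
        return false;
    if (mask >= rss_max_indirection_table_length)
        return false;
    for (i = 0; i < entries; i++)
        if (table[i] >= n_queues || !queue_enabled[table[i]])
            return false;
    return true;
}
```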
+
+\devicenormative{\subparagraph}{RSS processing}{Device Types / Network Device / Device Operation / Control Virtqueue / Receive-side scaling (RSS) / RSS processing}
+The device MUST determine the destination queue for a network packet as follows:
+\begin{itemize}
+\item Calculate the hash of the packet as defined in \ref{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Hash calculation for incoming packets}.
+\item If the device did not calculate the hash for the specific packet, the device directs the packet to the receiveq specified by \field{unclassified_queue} of virtio_net_rss_config structure (value of 0 corresponds to receiveq1).
+\item Apply \field{indirection_table_mask} to the calculated hash and use the result as the index in the indirection table to get 0-based number of destination receiveq (value of 0 corresponds to receiveq1).
+\item If the destination receive queue is being reset (See \ref{sec:Basic Facilities of a Virtio Device / Virtqueues / Virtqueue Reset}), the device MUST drop the packet.
+\end{itemize}
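
The queue selection steps above reduce to a mask and a table lookup. A minimal sketch, assuming the hash has already been computed elsewhere and leaving out the virtqueue-reset drop case; the struct and function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Device-side copy of the relevant virtio_net_rss_config fields. */
struct rss_state {
    uint16_t indirection_table_mask;
    uint16_t unclassified_queue;
    const uint16_t *indirection_table;  /* mask + 1 entries */
};

/* Returns the 0-based receive queue index (0 corresponds to receiveq1). */
static uint16_t rss_select(const struct rss_state *s, bool have_hash,
                           uint32_t hash)
{
    if (!have_hash)
        return s->unclassified_queue;
    return s->indirection_table[hash & s->indirection_table_mask];
}
```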
+
+\paragraph{Offloads State Configuration}\label{sec:Device Types / Network Device / Device Operation / Control Virtqueue / Offloads State Configuration}
+
+If the VIRTIO_NET_F_CTRL_GUEST_OFFLOADS feature is negotiated, the driver can
+send control commands for dynamic offloads state configuration.
+
+\subparagraph{Setting Offloads State}\label{sec:Device Types / Network Device / Device Operation / Control Virtqueue / Offloads State Configuration / Setting Offloads State}
+
+To configure the offloads, the following layout structure and
+definitions are used:
+
+\begin{lstlisting}
+le64 offloads;
+
+#define VIRTIO_NET_F_GUEST_CSUM 1
+#define VIRTIO_NET_F_GUEST_TSO4 7
+#define VIRTIO_NET_F_GUEST_TSO6 8
+#define VIRTIO_NET_F_GUEST_ECN 9
+#define VIRTIO_NET_F_GUEST_UFO 10
+#define VIRTIO_NET_F_GUEST_USO4 54
+#define VIRTIO_NET_F_GUEST_USO6 55
+
+#define VIRTIO_NET_CTRL_GUEST_OFFLOADS 5
+ #define VIRTIO_NET_CTRL_GUEST_OFFLOADS_SET 0
+\end{lstlisting}
+
+The class VIRTIO_NET_CTRL_GUEST_OFFLOADS has one command:
+VIRTIO_NET_CTRL_GUEST_OFFLOADS_SET applies the new offloads configuration.
+
+The le64 value passed as command data is a bitmask: set bits define
+offloads to be enabled, cleared bits define offloads to be disabled.
+
+There is a corresponding device feature for each offload. Upon feature
+negotiation, the corresponding offload is enabled to preserve backward
+compatibility.
+
+\drivernormative{\subparagraph}{Setting Offloads State}{Device Types / Network Device / Device Operation / Control Virtqueue / Offloads State Configuration / Setting Offloads State}
+
+A driver MUST NOT enable an offload for which the appropriate feature
+has not been negotiated.
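
Building the offloads bitmask and checking it against the negotiated features could be sketched as below. It assumes that each offload occupies the bit position given by its feature number in the listing above, and that the driver keeps its negotiated features as a 64-bit mask; the macro and function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define VIRTIO_NET_F_GUEST_CSUM 1
#define VIRTIO_NET_F_GUEST_TSO4 7
#define VIRTIO_NET_F_GUEST_TSO6 8

/* Bit for feature number f in a 64-bit mask. */
#define OFFLOAD_BIT(f) (UINT64_C(1) << (f))

/* True if every offload to be enabled has its feature negotiated. */
static bool offloads_allowed(uint64_t offloads, uint64_t negotiated_features)
{
    return (offloads & ~negotiated_features) == 0;
}
```

An all-zero mask is always allowed, since it only disables offloads.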
+
+\subparagraph{Legacy Interface: Setting Offloads State}\label{sec:Device Types / Network Device / Device Operation / Control Virtqueue / Offloads State Configuration / Setting Offloads State / Legacy Interface: Setting Offloads State}
+When using the legacy interface, transitional devices and drivers
+MUST format \field{offloads}
+according to the native endian of the guest rather than
+(necessarily when not using the legacy interface) little-endian.
+
+
+\paragraph{Notifications Coalescing}\label{sec:Device Types / Network Device / Device Operation / Control Virtqueue / Notifications Coalescing}
+
+If the VIRTIO_NET_F_NOTF_COAL feature is negotiated, the driver can
+send control commands for dynamically changing the coalescing parameters.
+
+\begin{lstlisting}
+struct virtio_net_ctrl_coal_rx {
+ le32 rx_max_packets;
+ le32 rx_usecs;
+};
+
+struct virtio_net_ctrl_coal_tx {
+ le32 tx_max_packets;
+ le32 tx_usecs;
+};
+
+#define VIRTIO_NET_CTRL_NOTF_COAL 6
+ #define VIRTIO_NET_CTRL_NOTF_COAL_TX_SET 0
+ #define VIRTIO_NET_CTRL_NOTF_COAL_RX_SET 1
+\end{lstlisting}
+
+Coalescing parameters:
+\begin{itemize}
+\item \field{rx_usecs}: Maximum number of usecs to delay a RX notification.
+\item \field{tx_usecs}: Maximum number of usecs to delay a TX notification.
+\item \field{rx_max_packets}: Maximum number of packets to receive before a RX notification.
+\item \field{tx_max_packets}: Maximum number of packets to send before a TX notification.
+\end{itemize}
+
+
+The class VIRTIO_NET_CTRL_NOTF_COAL has 2 commands:
+\begin{enumerate}
+\item VIRTIO_NET_CTRL_NOTF_COAL_TX_SET: set the \field{tx_usecs} and \field{tx_max_packets} parameters.
+\item VIRTIO_NET_CTRL_NOTF_COAL_RX_SET: set the \field{rx_usecs} and \field{rx_max_packets} parameters.
+\end{enumerate}
+
+\subparagraph{RX Notifications}\label{sec:Device Types / Network Device / Device Operation / Control Virtqueue / Notifications Coalescing / RX Notifications}
+
+If, for example:
+\begin{itemize}
+\item \field{rx_usecs} = 10.
+\item \field{rx_max_packets} = 15.
+\end{itemize}
+
+The device will operate as follows:
+
+\begin{itemize}
+\item The device will count received packets until it accumulates 15, or until 10 usecs have elapsed since the first one was received.
+\item If the notifications are not suppressed by the driver, the device will send a used buffer notification; otherwise, the device will not send a used buffer notification as long as the notifications are suppressed.
+\end{itemize}
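
The counting behaviour in the example above can be modelled with a small state machine. This is a simplified sketch: the names are hypothetical, the time limit is only evaluated when a packet arrives (a real device would presumably also arm a timer), and notification suppression by the driver is not modelled.

```c
#include <stdbool.h>
#include <stdint.h>

struct coal {
    uint32_t max_packets;  /* rx_max_packets */
    uint32_t usecs;        /* rx_usecs */
    uint32_t count;        /* packets counted since the last notification */
    uint64_t first_us;     /* arrival time of the first counted packet */
};

/* Call once per received packet; returns true when a notification is due. */
static bool coal_on_packet(struct coal *c, uint64_t now_us)
{
    if (c->count == 0)
        c->first_us = now_us;
    c->count++;
    if (c->count >= c->max_packets || now_us - c->first_us >= c->usecs) {
        c->count = 0;  /* start a new coalescing window */
        return true;
    }
    return false;
}
```

With both parameters at 0, the reset default, every packet triggers a notification, so coalescing is effectively off.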
+
+\subparagraph{TX Notifications}\label{sec:Device Types / Network Device / Device Operation / Control Virtqueue / Notifications Coalescing / TX Notifications}
+
+If, for example:
+\begin{itemize}
+\item \field{tx_usecs} = 10.
+\item \field{tx_max_packets} = 15.
+\end{itemize}
+
+The device will operate as follows:
+
+\begin{itemize}
+\item The device will count sent packets until it accumulates 15, or until 10 usecs have elapsed since the first one was sent.
+\item If the notifications are not suppressed by the driver, the device will send a used buffer notification; otherwise, the device will not send a used buffer notification as long as the notifications are suppressed.
+\end{itemize}
+
+\drivernormative{\subparagraph}{Notifications Coalescing}{Device Types / Network Device / Device Operation / Control Virtqueue / Notifications Coalescing}
+
+If the VIRTIO_NET_F_NOTF_COAL feature has not been negotiated, the driver MUST NOT issue VIRTIO_NET_CTRL_NOTF_COAL commands.
+
+\devicenormative{\subparagraph}{Notifications Coalescing}{Device Types / Network Device / Device Operation / Control Virtqueue / Notifications Coalescing}
+
+A device SHOULD respond to the VIRTIO_NET_CTRL_NOTF_COAL commands with VIRTIO_NET_ERR if it was not able to change the parameters.
+
+A device SHOULD NOT send used buffer notifications to the driver, if the notifications are suppressed as explained in \ref{sec:Basic Facilities of a Virtio Device / Virtqueues / Used Buffer Notification Suppression}, even if the coalescing counters expired.
+
+Upon reset, a device MUST initialize all coalescing parameters to 0.
+
+\subsubsection{Legacy Interface: Framing Requirements}\label{sec:Device
+Types / Network Device / Legacy Interface: Framing Requirements}
+
+When using legacy interfaces, transitional drivers which have not
+negotiated VIRTIO_F_ANY_LAYOUT MUST use a single descriptor for the
+struct virtio_net_hdr on both transmit and receive, with the
+network data in the following descriptors.
+
+Additionally, when using the control virtqueue (see \ref{sec:Device
+Types / Network Device / Device Operation / Control Virtqueue}),
+transitional drivers which have not
+negotiated VIRTIO_F_ANY_LAYOUT MUST:
+\begin{itemize}
+\item for all commands, use a single 2-byte descriptor including the first two
+fields: \field{class} and \field{command}
+\item for all commands except VIRTIO_NET_CTRL_MAC_TABLE_SET
+use a single descriptor including command-specific-data
+with no padding.
+\item for the VIRTIO_NET_CTRL_MAC_TABLE_SET command use exactly
+two descriptors including command-specific-data with no padding:
+the first of these descriptors MUST include the
+virtio_net_ctrl_mac table structure for the unicast addresses with no padding,
+the second of these descriptors MUST include the
+virtio_net_ctrl_mac table structure for the multicast addresses
+with no padding.
+\item for all commands, use a single 1-byte descriptor for the
+\field{ack} field
+\end{itemize}
+
+See \ref{sec:Basic
+Facilities of a Virtio Device / Virtqueues / Message Framing}.
diff --git a/device-types/virtio-network/device-conformance.tex b/device-types/virtio-network/device-conformance.tex
new file mode 100644
index 0000000..c686377
--- /dev/null
+++ b/device-types/virtio-network/device-conformance.tex
@@ -0,0 +1,16 @@
+\conformance{\subsection}{Network Device Conformance}\label{sec:Conformance / Device Conformance / Network Device Conformance}
+
+A network device MUST conform to the following normative statements:
+
+\begin{itemize}
+\item \ref{devicenormative:Device Types / Network Device / Device configuration layout}
+\item \ref{devicenormative:Device Types / Network Device / Device Operation / Packet Transmission}
+\item \ref{devicenormative:Device Types / Network Device / Device Operation / Setting Up Receive Buffers}
+\item \ref{devicenormative:Device Types / Network Device / Device Operation / Processing of Incoming Packets}
+\item \ref{devicenormative:Device Types / Network Device / Device Operation / Control Virtqueue / Packet Receive Filtering}
+\item \ref{devicenormative:Device Types / Network Device / Device Operation / Control Virtqueue / Setting MAC Address Filtering}
+\item \ref{devicenormative:Device Types / Network Device / Device Operation / Control Virtqueue / Gratuitous Packet Sending}
+\item \ref{devicenormative:Device Types / Network Device / Device Operation / Control Virtqueue / Automatic receive steering in multiqueue mode}
+\item \ref{devicenormative:Device Types / Network Device / Device Operation / Control Virtqueue / Receive-side scaling (RSS) / RSS processing}
+\item \ref{devicenormative:Device Types / Network Device / Device Operation / Control Virtqueue / Notifications Coalescing}
+\end{itemize}
diff --git a/device-types/virtio-network/driver-conformance.tex b/device-types/virtio-network/driver-conformance.tex
new file mode 100644
index 0000000..97d0cc1
--- /dev/null
+++ b/device-types/virtio-network/driver-conformance.tex
@@ -0,0 +1,17 @@
+\conformance{\subsection}{Network Driver Conformance}\label{sec:Conformance / Driver Conformance / Network Driver Conformance}
+
+A network driver MUST conform to the following normative statements:
+
+\begin{itemize}
+\item \ref{drivernormative:Device Types / Network Device / Device configuration layout}
+\item \ref{drivernormative:Device Types / Network Device / Device Operation / Packet Transmission}
+\item \ref{drivernormative:Device Types / Network Device / Device Operation / Setting Up Receive Buffers}
+\item \ref{drivernormative:Device Types / Network Device / Device Operation / Processing of Incoming Packets}
+\item \ref{drivernormative:Device Types / Network Device / Device Operation / Control Virtqueue / Packet Receive Filtering}
+\item \ref{drivernormative:Device Types / Network Device / Device Operation / Control Virtqueue / Setting MAC Address Filtering}
+\item \ref{drivernormative:Device Types / Network Device / Device Operation / Control Virtqueue / Gratuitous Packet Sending}
+\item \ref{drivernormative:Device Types / Network Device / Device Operation / Control Virtqueue / Automatic receive steering in multiqueue mode}
+\item \ref{drivernormative:Device Types / Network Device / Device Operation / Control Virtqueue / Offloads State Configuration / Setting Offloads State}
+\item \ref{drivernormative:Device Types / Network Device / Device Operation / Control Virtqueue / Receive-side scaling (RSS) }
+\item \ref{drivernormative:Device Types / Network Device / Device Operation / Control Virtqueue / Notifications Coalescing}
+\end{itemize}
--
2.26.2
* [PATCH v3 00/20] Split device spec to its individual files
From: Parav Pandit @ 2023-01-10 23:03 UTC (permalink / raw)
To: mst, virtio-dev, cohuck; +Cc: virtio-comment, Parav Pandit
Several of the more recent device specifications are maintained
in their own specification files. Such separate files enable better
maintenance of the specification overall.
However, several of the initial virtio device specifications
are located in a single file.
Hence, split them into their individual files.
Additionally, the driver and device conformance sections of all
devices are kept together in one giant conformance file.
As Michael suggested, move each device's driver and device conformance
sections next to its specification in a device-specific
directory. This makes each device specification more self-contained.
Added a patch to fix spelling errors in the network device
specification, which were inherited from its previous file
location.
The patches do not change any part of the resulting specification
except for fixing the spelling errors.
They only change how the specification is maintained.
patch summary:
-------------
Patches 1 to 7 create new files for moving device specs out of the
content and conformance files.
Patches 8 to 20 move existing dedicated spec files to new directories
and create per-device device and driver conformance files.
changelog:
----------
v2->v3:
- file name changed from device.tex to description.tex
- use input instead of import to insert a file
v1->v2:
- removed extra blank lines in network and block device files
- added missing device conformance link for rpmb, sound, i2c and
gpio devices
v0->v1:
- move device spec to their own directory
- added split files for conformance and placed them adjacent to
device spec
- added patch to fix spelling errors in network device
Parav Pandit (20):
virtio-network: Maintain network device spec in separate directory
virtio-network: Fix spelling errors
virtio-block: Maintain block device spec in separate directory
virtio-console: Maintain console device spec in separate directory
virtio-entropy: Maintain entropy device spec in separate directory
virtio-mem-balloon: Maintain mem balloon device spec in separate
directory
virtio-scsi: Maintain scsi host device spec in separate directory
virtio-gpu: Maintain gpu device spec in separate directory
virtio-input: Maintain input device spec in separate directory
virtio-crypto: Maintain crypto device spec in separate directory
virtio-vsock: Maintain socket device spec in separate directory
virtio-fs: Maintain file system device spec in separate directory
virtio-rpmb: Maintain rpmb device spec in separate directory
virtio-iommu: Maintain iommu device spec in separate directory
virtio-sound: Maintain sound device spec in separate directory
virtio-mem: Maintain memory device spec in separate directory
virtio-i2c: Maintain i2c device spec in separate directory
virtio-scmi: Maintain scmi device spec in separate directory
virtio-gpio: Maintain gpio device spec in separate directory
virtio-pmem: Maintain pmem device spec in separate directory
conformance.tex | 456 +-
content.tex | 4561 +----------------
device-types/virtio-block/description.tex | 1313 +++++
.../virtio-block/device-conformance.tex | 8 +
.../virtio-block/driver-conformance.tex | 8 +
device-types/virtio-console/description.tex | 231 +
.../virtio-console/device-conformance.tex | 8 +
.../virtio-console/driver-conformance.tex | 8 +
.../virtio-crypto/description.tex | 0
.../virtio-crypto/device-conformance.tex | 13 +
.../virtio-crypto/driver-conformance.tex | 14 +
device-types/virtio-entropy/description.tex | 42 +
.../virtio-entropy/device-conformance.tex | 7 +
.../virtio-entropy/driver-conformance.tex | 7 +
.../virtio-fs/description.tex | 0
device-types/virtio-fs/device-conformance.tex | 9 +
device-types/virtio-fs/driver-conformance.tex | 10 +
.../virtio-gpio/description.tex | 0
.../virtio-gpio/device-conformance.tex | 9 +
.../virtio-gpio/driver-conformance.tex | 9 +
.../virtio-gpu/description.tex | 0
.../virtio-gpu/device-conformance.tex | 8 +
.../virtio-i2c/description.tex | 0
.../virtio-i2c/device-conformance.tex | 7 +
.../virtio-i2c/driver-conformance.tex | 7 +
.../virtio-input/description.tex | 0
.../virtio-input/device-conformance.tex | 8 +
.../virtio-input/driver-conformance.tex | 8 +
.../virtio-iommu/description.tex | 0
.../virtio-iommu/device-conformance.tex | 16 +
.../virtio-iommu/driver-conformance.tex | 17 +
.../virtio-mem-balloon/description.tex | 634 +++
.../virtio-mem-balloon/device-conformance.tex | 12 +
.../virtio-mem-balloon/driver-conformance.tex | 12 +
.../virtio-mem/description.tex | 0
.../virtio-mem/device-conformance.tex | 13 +
.../virtio-mem/driver-conformance.tex | 13 +
device-types/virtio-network/description.tex | 1594 ++++++
.../virtio-network/device-conformance.tex | 16 +
.../virtio-network/driver-conformance.tex | 17 +
.../virtio-pmem/description.tex | 0
.../virtio-pmem/device-conformance.tex | 9 +
.../virtio-pmem/driver-conformance.tex | 7 +
.../virtio-rpmb/description.tex | 0
.../virtio-rpmb/device-conformance.tex | 13 +
.../virtio-rpmb/driver-conformance.tex | 7 +
.../virtio-scmi/description.tex | 0
.../virtio-scmi/device-conformance.tex | 10 +
.../virtio-scmi/driver-conformance.tex | 8 +
device-types/virtio-scsi/description.tex | 709 +++
.../virtio-scsi/device-conformance.tex | 10 +
.../virtio-scsi/driver-conformance.tex | 9 +
.../virtio-sound/description.tex | 0
.../virtio-sound/device-conformance.tex | 16 +
.../virtio-sound/driver-conformance.tex | 13 +
.../virtio-vsock/description.tex | 0
.../virtio-vsock/device-conformance.tex | 9 +
.../virtio-vsock/driver-conformance.tex | 10 +
58 files changed, 4964 insertions(+), 4961 deletions(-)
create mode 100644 device-types/virtio-block/description.tex
create mode 100644 device-types/virtio-block/device-conformance.tex
create mode 100644 device-types/virtio-block/driver-conformance.tex
create mode 100644 device-types/virtio-console/description.tex
create mode 100644 device-types/virtio-console/device-conformance.tex
create mode 100644 device-types/virtio-console/driver-conformance.tex
rename virtio-crypto.tex => device-types/virtio-crypto/description.tex (100%)
create mode 100644 device-types/virtio-crypto/device-conformance.tex
create mode 100644 device-types/virtio-crypto/driver-conformance.tex
create mode 100644 device-types/virtio-entropy/description.tex
create mode 100644 device-types/virtio-entropy/device-conformance.tex
create mode 100644 device-types/virtio-entropy/driver-conformance.tex
rename virtio-fs.tex => device-types/virtio-fs/description.tex (100%)
create mode 100644 device-types/virtio-fs/device-conformance.tex
create mode 100644 device-types/virtio-fs/driver-conformance.tex
rename virtio-gpio.tex => device-types/virtio-gpio/description.tex (100%)
create mode 100644 device-types/virtio-gpio/device-conformance.tex
create mode 100644 device-types/virtio-gpio/driver-conformance.tex
rename virtio-gpu.tex => device-types/virtio-gpu/description.tex (100%)
create mode 100644 device-types/virtio-gpu/device-conformance.tex
rename virtio-i2c.tex => device-types/virtio-i2c/description.tex (100%)
create mode 100644 device-types/virtio-i2c/device-conformance.tex
create mode 100644 device-types/virtio-i2c/driver-conformance.tex
rename virtio-input.tex => device-types/virtio-input/description.tex (100%)
create mode 100644 device-types/virtio-input/device-conformance.tex
create mode 100644 device-types/virtio-input/driver-conformance.tex
rename virtio-iommu.tex => device-types/virtio-iommu/description.tex (100%)
create mode 100644 device-types/virtio-iommu/device-conformance.tex
create mode 100644 device-types/virtio-iommu/driver-conformance.tex
create mode 100644 device-types/virtio-mem-balloon/description.tex
create mode 100644 device-types/virtio-mem-balloon/device-conformance.tex
create mode 100644 device-types/virtio-mem-balloon/driver-conformance.tex
rename virtio-mem.tex => device-types/virtio-mem/description.tex (100%)
create mode 100644 device-types/virtio-mem/device-conformance.tex
create mode 100644 device-types/virtio-mem/driver-conformance.tex
create mode 100644 device-types/virtio-network/description.tex
create mode 100644 device-types/virtio-network/device-conformance.tex
create mode 100644 device-types/virtio-network/driver-conformance.tex
rename virtio-pmem.tex => device-types/virtio-pmem/description.tex (100%)
create mode 100644 device-types/virtio-pmem/device-conformance.tex
create mode 100644 device-types/virtio-pmem/driver-conformance.tex
rename virtio-rpmb.tex => device-types/virtio-rpmb/description.tex (100%)
create mode 100644 device-types/virtio-rpmb/device-conformance.tex
create mode 100644 device-types/virtio-rpmb/driver-conformance.tex
rename virtio-scmi.tex => device-types/virtio-scmi/description.tex (100%)
create mode 100644 device-types/virtio-scmi/device-conformance.tex
create mode 100644 device-types/virtio-scmi/driver-conformance.tex
create mode 100644 device-types/virtio-scsi/description.tex
create mode 100644 device-types/virtio-scsi/device-conformance.tex
create mode 100644 device-types/virtio-scsi/driver-conformance.tex
rename virtio-sound.tex => device-types/virtio-sound/description.tex (100%)
create mode 100644 device-types/virtio-sound/device-conformance.tex
create mode 100644 device-types/virtio-sound/driver-conformance.tex
rename virtio-vsock.tex => device-types/virtio-vsock/description.tex (100%)
create mode 100644 device-types/virtio-vsock/device-conformance.tex
create mode 100644 device-types/virtio-vsock/driver-conformance.tex
--
2.26.2
^ permalink raw reply
* Re: [virtio-dev] [PATCH v2 1/1] virtio-ism: introduce new device virtio-ism
From: Halil Pasic @ 2023-01-10 22:34 UTC (permalink / raw)
To: Xuan Zhuo
Cc: virtio-dev, hans, herongguang, zmlcc, dust.li, tonylu, zhenzao,
helinguo, gerry, mst, cohuck, jasowang, Jan Kiszka, wintera,
kgraul, wenjia, jaka, hca, twinkler, raspl, Halil Pasic
In-Reply-To: <20221223081354.15026-2-xuanzhuo@linux.alibaba.com>
On Fri, 23 Dec 2022 16:13:54 +0800
Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote:
> The virtio ism device provides and manages many memory ism regions in
> host. These ism regions can be alloc/attach/detach by driver. Every
[..]
Hi Xuan Zhuo!
Some words in advance. While I'm supportive of the general idea, I find
the proposed specification quite difficult to read, and I have the
feeling I did not get the full picture. So my review will be a mix of
feedback on the tactical level and questions on the strategic level.
Please bear with me.
>
> diff --git a/virtio-ism.tex b/virtio-ism.tex
> new file mode 100644
> index 0000000..7f09c43
> --- /dev/null
> +++ b/virtio-ism.tex
> @@ -0,0 +1,472 @@
> +\section{ISM Device}\label{sec:Device Types / ISM Device}
> +
> +\begin{lstlisting}
> +|-------------------------------------------------------------------------------------------------------------|
> +| |------------------------------------------------| |------------------------------------------------| |
> +| | VM [M1] [M2] [M3] | | VM [M2] [M3] | |
> +| | | | | | | | | | |
> +| | -----------------------|------|------|--- | | ------------------------------|------|--- | |
> +| | | driver | | | | | | | driver | | | | |
> +| | -----------------------|------|------|--- | | ------------------------------|------|--- | |
> +| | |cq| |map |map |map | | |cq| |map |map | |
> +| | | | | | | | | | | | | | |
> +| | | | ------------------- | | | | ------------------- | |
> +| |----|--|----------------| device memory |-----| |----|--|----------------| device memory |-----| |
> +| | | | ------------------- | | | | ------------------- | |
> +| | | | | | | |
> +| | | | | | | |
> +| | | | | | | |
> +| |--------------------------------+---------------| |--------------------------------+---------------| |
> +| | | |
> +| | | |
> +| |------------------------------+------------------------| |
> +| | |
> +| | |
> +| -------------------------- |
> +| | M1 | | M2 | | M3 | |
> +| -------------------------- |
> +| |
> +| |
> +|-------------------------------------------------------------------------------------------------------------|
> +\end{lstlisting}
> +
> +ISM(Internal Shared Memory) device provides the ability to share memory between
> +different VMs launched from the same entity.
Launched by instead of from? Maybe introduce a catchy name for the
"entity that launched the VMs" and prevent oversimplification by
explaining any shortcomings of the name if any in one place. Host would
be one candidate, VMM another.
> A vm's memory got from ISM device
> +can be shared with multiple peers at the same time. This shared relationship can
s/at the same time/simultaneously/ ?
> +be dynamically created and released.
"This shared relationship" does not sound right, but I have no proposal
off the top of my head.
> +
> +The contiguous shared memory obtained from the device is divided into multiple
> +ism regions for share.
What does "for share" mean here? I don't quite understand this sentence.
> +
> +ISM device provides a mechanism to notify other ism region referrers of events.
> +
> +
> +\subsection{Device ID}\label{sec:Device Types / ISM Device / Device ID}
> + 44
> +
> +\subsection{Virtqueues}\label{sec:Device Types / ISM Device / Virtqueues}
> +\begin{description}
> +\item[0] controlq
> +\item[1] eventq
> +\end{description}
> +
> +\subsection{Device configuration layout}\label{sec:Device Types / ISM Device / Device configuration layout}
> +
> +\begin{lstlisting}
> +struct virtio_ism_config {
> + le64 gid;
> + le64 devid;
> + le64 chunk_size;
> + le64 notify_size;
> +};
> +\end{lstlisting}
> +
> +\begin{description}
> + \item[\field{gid}] \field{gid} is used to identify different entity that
> + launches the VMs.
What does gid stand for?
s/different/the/ ?
I can't figure out what is "different" supposed to mean here.
>Only the ism devices with the same \field{gid} can
s/the ism/ism/ ?
> + share the ism regions. Therefore, this value is unique in the
s/share the/share/ ?
> + world-wide.
Makes no sense to me.
We could call gid host_id. We could also say that a host_id is
world-wide unique in the sense that there must not be another host
with the same host_id.
That of course raises the question of how the different implementations
ensure that the gid (or host_id) remains unique. I guess this
specification therefore needs to specify how such unique gids are generated.
> +
> + \item[\field{devid}] the device id is used to identify different ism devices
Suggested wording: "uniquely identifies an ism device within the scope
of the host". I.e. devices attached to VMs on the same host must have
different \field{devid} values, while devices attached to VMs that are
hosted by different hosts may have the same \field{devid} value.
> + on the same entity.
> +
> + \item[\field{chunk_size}] the size of the every ism chunk.
What is a chunk? First introduced here. Please use consistent wording.
> + Large shared memories are divided into multiple chunks, and one time
"Memories" sounds wrong here. I guess what you used to call "regions"
you now call "chunks". But I may be wrong.
> + will take up at least one chunk.
> +
> + \item[\field{notify_size}] the size of the notify address.
The term "notify address" ain't properly introduced.
> +\end{description}
> +
> +\devicenormative{\subsubsection}{Device configuration layout}{Device Types / ISM Device / Device configuration layout}
> +
> +The device MUST ensure that the \field{gid} on the same entity i
s/i$/is/
> +same and different from the \field{gid} on other entity.
How is the device supposed to know what is "the \field{gid} on other
entity"? See my previous comment.
> +
> +On the same entity, the device MUST ensure that the \field{devid} is unique.
> +
> +\field{chunk_size} MUST be a power of two.
> +
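The chunk-related requirements quoted above can be made concrete with a minimal sketch in C. The helper names are hypothetical and not part of the proposal. The sketch assumes that a \field{chunk_size} of zero is invalid and that an allocation of zero bytes still occupies one chunk, which is one possible reading of "will take up at least one chunk".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true if chunk_size satisfies the quoted
 * "MUST be a power of two" requirement. Zero is rejected (assumption). */
static bool ism_chunk_size_valid(uint64_t chunk_size)
{
    return chunk_size != 0 && (chunk_size & (chunk_size - 1)) == 0;
}

/* Hypothetical helper: number of chunks needed to cover len bytes.
 * Returns 0 for an invalid chunk_size. A zero-length request is
 * counted as one chunk (assumption, see the lead-in). */
static uint64_t ism_chunks_needed(uint64_t len, uint64_t chunk_size)
{
    if (!ism_chunk_size_valid(chunk_size))
        return 0;
    if (len == 0)
        return 1;
    /* Round up without risking overflow in len + chunk_size - 1. */
    return len / chunk_size + (len % chunk_size != 0 ? 1 : 0);
}
```

With a 4096-byte chunk, a request of 4097 bytes would occupy two chunks under these assumptions.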
> +\subsection{Event}\label{sec:Device Types / Network Device / Device Operation / Event}
> +
> +\begin{lstlisting}
> +#define VIRTIO_ISM_EVENT_UPDATE (1 << 0)
> +#define VIRTIO_ISM_EVENT_ATTACH (1 << 1)
> +#define VIRTIO_ISM_EVENT_DETACH (1 << 2)
> +\end{lstlisting}
> +
> +\begin{description}
> + \item[VIRTIO_ISM_EVENT_UPDATE]
> + The driver kick the notify area to notify other peers of the update
> + event of the ism region content.
> +
> + \item[VIRTIO_ISM_EVENT_ATTACH] A new device attaches the ism region.
> + \item[VIRTIO_ISM_EVENT_DETACH] A device detaches the ism region.
> +\end{description}
> +
> +The ism device supports event notification of the ism region. When a device
> +kick/attach/detach a region, other ism region referrers may receive related
> +events.
> +
Is "may" what we want to use here? This sounds like the referrers cannot
rely on receiving these events, because they have no guarantee that
they will receive any.
> +A buffer received from eventq can contain multiple event structures.
> +
> +\begin{lstlisting}
> +struct virtio_ism_event_update {
> + le64 ev_type;
> + le64 offset;
> + le64 devid;
> +};
> +
> +struct virtio_ism_event_attach_detach {
> + le64 ev_type;
> + le64 offset;
> + le64 devid;
> + le64 peers;
> +};
> +
> +\end{lstlisting}
> +
> +\begin{description}
> +\item[\field{ev_type}] The type of event, the driver can get the size of the
> + structure based on this.
> +
> +\item[\field{offset}] The offset of ism regions with the event.
Offset with respect to what?
> +
> +\item[\field{devid}] \field{devid} of the device that generated events.
> +\item[\field{peers}] The number of the ism region referres (does not include the
> + device that receiving this event)
> +
> +\end{description}
> +
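To illustrate "the driver can get the size of the structure based on this", here is a minimal sketch in C of how a driver might walk an eventq buffer that holds several event structures. The function names are hypothetical. The sketch assumes that each event carries exactly one of the three type bits and that the structures are packed back to back with no padding; the quoted text does not state either.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define VIRTIO_ISM_EVENT_UPDATE (1 << 0)
#define VIRTIO_ISM_EVENT_ATTACH (1 << 1)
#define VIRTIO_ISM_EVENT_DETACH (1 << 2)

/* Decode a little-endian 64-bit value (le64 in the draft) portably. */
static uint64_t ism_get_le64(const uint8_t *p)
{
    uint64_t v = 0;
    for (int i = 7; i >= 0; i--)
        v = (v << 8) | p[i];
    return v;
}

/* Size in bytes of the event structure for ev_type, or 0 if unknown.
 * update: 3 x le64 = 24 bytes; attach/detach: 4 x le64 = 32 bytes. */
static size_t ism_event_size(uint64_t ev_type)
{
    switch (ev_type) {
    case VIRTIO_ISM_EVENT_UPDATE:
        return 24;
    case VIRTIO_ISM_EVENT_ATTACH:
    case VIRTIO_ISM_EVENT_DETACH:
        return 32;
    default:
        return 0;
    }
}

/* Count the events in buf. Returns -1 if an event type is unknown
 * or the buffer ends in the middle of a structure. */
static int ism_count_events(const uint8_t *buf, size_t len)
{
    int n = 0;
    size_t pos = 0;

    while (pos < len) {
        if (len - pos < 8)
            return -1; /* not even a complete ev_type field */
        size_t sz = ism_event_size(ism_get_le64(buf + pos));
        if (sz == 0 || len - pos < sz)
            return -1;
        pos += sz;
        n++;
    }
    return n;
}
```

A real driver would dispatch on the event type inside the loop instead of only counting; the walk itself shows why an unambiguous mapping from \field{ev_type} to structure size matters.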
> +
> +\subsection{Permissions}\label{sec:Device Types / Network Device / Device Operation / Permission}
> +
> +The permissions of a ism region determine whether this ism region can be
> +attached and the read and write permissions after attach.
> +
> +The driver can set the default permissions, or set permissions for some certain
> +devices.
What do "default permissions" and "some certain devices" mean here?
> +
> +When a driver has the management permission of the ism region, then it can
> +modify the permissions of this ism region. By default, only the device that
> +allocated the ism region has this permission.
> +
> +\subsection{Device Initialization}\label{sec:Device Types / ISM Device / Device Initialization}
> +
> +\devicenormative{\subsubsection}{Device Initialization}{Device Types / ISM Device / Device Initialization}
> +
> +The device MUST generate a \field{devid}. \field{devid} remains unchanged
> +during reset. \field{devid} MUST NOT be 0.
> +
> +The device shares memory to the driver based on shared memory regions
> +\ref{sec:Basic Facilities of a Virtio Device / Shared Memory Regions}.
> +However, it does not need to allocate physical memory during initialization.
> +
> +The \field{shmid} of a region MUST be one of the following
> +\begin{lstlisting}
> +enum virtio_ism_shm_id {
> + VIRTIO_ISM_SHM_ID_UNDEFINED = 0,
> + VIRTIO_ISM_SHM_ID_REGIONS = 1,
> + VIRTIO_ISM_SHM_ID_NOTIFY = 2,
> +};
> +\end{lstlisting}
> +
> +The shared memory whose shmid is VIRTIO_ISM_SHM_ID_REGIONS is used to implement
> +ism regions.
Hm. AFAIU, these are supposed to be used for exchanging data between the
VMs that are supposed to communicate with each other via ism.
How does this relate to the following normative section:
"""
2.10.2 Device Requirements: Shared Memory Regions
Shared memory regions MUST NOT expose shared memory regions which are used to control the operation of the device, nor to stream data.
"""
> If there are multiple shared memories whose shmid is
> +VIRTIO_ISM_SHM_ID_REGIONS, they are used as contiguous memory in the order of
> +acquisition.
Hm, I used to think that the shmid is a unique identifier for a shared
memory region. How can multiple virtio shared memory regions have the
same shmid?
Can you have a look at the MMIO interface for virtio shared memory
regions?
How far is what ISM tries to do consistent with virtio shared memory? I
mean, AFAIU once the shared memory is advertised, the shared memory is
supposed to be there and accessible by the driver for both writes and
reads. But for ISM there is this allocation and attach/detach and
permissions.
What about "Memory consistency rules vary depending on the region and
the device and they will be specified as required by each device." from
"2.10 Shared Memory Regions"?
> +
> +The device MUST also provides a shared memory with VIRTIO_ISM_SHM_ID_NOTIFY to
> +the driver. This memory area is used for notify, and each ism region MUST have a
> +corresponding notify address inside this area, and the size of the notify
> +address is \field{notify_size};
> +
> +\drivernormative{\subsubsection}{Device Initialization}{Device Types / ISM Device / Device Initialization}
> +
> +The driver MUST query all shared memory regions supported by the device.
> +(see \ref{sec:Basic Facilities of a Virtio Device / Shared Memory Regions})
> +
> +
> +\subsection{Control Virtqueue}\label{sec:Device Types / Network Device / Device Operation / Control Virtqueue}
> +
> +The driver uses the control virtqueue send commands to implement operations on
> +the ism region and some global configurations.
> +
> +All commands are of the following form:
> +\begin{lstlisting}
> +struct virtio_ism_ctrl {
> + u8 class;
> + u8 command;
> + u8 command_specific_data[];
> + u8 ack;
> + u8 command_specific_data_reply[];
> +};
> +
> +/* ack values */
> +#define VIRTIO_ISM_OK 0
> +#define VIRTIO_ISM_ERR 255
> +
> +#define VIRTIO_ISM_ENOENT 2
> +#define VIRTIO_ISM_E2BIG 7
> +#define VIRTIO_ISM_ENOMEM 12
> +#define VIRTIO_ISM_ENOSPEC 28
> +
> +#define VIRTIO_ISM_PERM_EATTACH 100
> +#define VIRTIO_ISM_PERM_EREAD 101
> +#define VIRTIO_ISM_PERM_EWRITE 102
> +\end{lstlisting}
> +
> +The \field{class}, \field{command} and command-specific-data are set by the
> +driver, and the device sets the \field{ack} byte and optionally
> +\field{command-specific-data-reply}.
Where are the values of \field{command} and their semantics specified? I
see no more references to \field{command}. Similarly, I see no further
references to the fields \field{command-specific-data} and
\field{command-specific-data-reply}.
[..]
I will continue with the review from here. I didn't have the opportunity
to look at the PoC implementation. Maybe it will be easier to get through
the rest of this text once I have a better understanding of the part up
till now.
Regards,
Halil
^ permalink raw reply
* Re: [virtio-dev] Re: [PATCH v2 00/20] Split device spec to its individual files
From: Michael S. Tsirkin @ 2023-01-10 17:47 UTC (permalink / raw)
To: Parav Pandit
Cc: Cornelia Huck, virtio-dev@lists.oasis-open.org,
virtio-comment@lists.oasis-open.org
In-Reply-To: <PH0PR12MB5481DC75C7AD401B9CE43CCEDCFF9@PH0PR12MB5481.namprd12.prod.outlook.com>
On Tue, Jan 10, 2023 at 03:36:28PM +0000, Parav Pandit wrote:
>
> > From: Cornelia Huck <cohuck@redhat.com>
> > Sent: Tuesday, January 10, 2023 10:34 AM
>
> [..]
> > >> Apologies if I sound like a process stickler, but the main problem is
> > >> that the ballot is currently about v1 of the patches (and we
> > >> obviously can't change that while it is open, as that would
> > >> invalidate the votes that already have been cast.)
> > > If it auto invalidates, is there any withdrawal process needed?
> > > If no, lets withdraw an re-vote on v3.
> >
> > No, there's no such thing as auto-invalidation, as we cannot modify the ballot
> > (sorry if I was unclear.) Let's just withdraw the ballot.
> >
> No problem. Its clear to me now.
>
> > >
> > >> If your v3 looks good, we need a new ballot to vote on that. As it
> > >> stands now, the TC would have voted to include v1 with its known
> > >> problems... that's why I think a withdrawal would be best.
> > >
> > > We will use the same github issue for v3, and new ballot yes?
> >
> > Yes, exactly. (This is not the first time this has happened.)
>
> Ok. Please proceed to withdraw.
Cornelia, will you do this? Parav, just a note: please update the
link in the description (not in comments, as these are ignored)
before requesting a new ballot.
Thanks!
--
MST
^ permalink raw reply
* Re: [virtio-comment] Re: [PATCH v2 00/20] Split device spec to its individual files
From: Halil Pasic @ 2023-01-10 15:46 UTC (permalink / raw)
To: Cornelia Huck
Cc: Michael S. Tsirkin, Parav Pandit, virtio-dev, virtio-comment,
Halil Pasic
In-Reply-To: <87358i4ow0.fsf@redhat.com>
On Tue, 10 Jan 2023 12:20:31 +0100
Cornelia Huck <cohuck@redhat.com> wrote:
> > Previously it looked like a cosmetic issue, but now it looks
> > like it's important.
>
> I agree, and we need to decide quickly what to do with the ballot. We
> don't want to merge v1, but the current votes still have a majority of
> 'yes'. My preference would be to withdraw the ballot, which needs to be
> done before 22:00 UTC today, if I'm not confused.
FYI: I've changed from Yes to No as well, so currently it is 4:4.
Nevertheless I agree withdrawing the current ballot is the best way to
go about this.
Regards,
Halil
This publicly archived list offers a means to provide input to the
OASIS Virtual I/O Device (VIRTIO) TC.
In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.
Subscribe: virtio-comment-subscribe@lists.oasis-open.org
Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
List help: virtio-comment-help@lists.oasis-open.org
List archive: https://lists.oasis-open.org/archives/virtio-comment/
Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
Committee: https://www.oasis-open.org/committees/virtio/
Join OASIS: https://www.oasis-open.org/join/
^ permalink raw reply
* RE: [virtio-dev] Re: [PATCH v2 00/20] Split device spec to its individual files
From: Parav Pandit @ 2023-01-10 15:36 UTC (permalink / raw)
To: Cornelia Huck, Michael S. Tsirkin
Cc: virtio-dev@lists.oasis-open.org,
virtio-comment@lists.oasis-open.org
In-Reply-To: <87wn5u2yls.fsf@redhat.com>
> From: Cornelia Huck <cohuck@redhat.com>
> Sent: Tuesday, January 10, 2023 10:34 AM
[..]
> >> Apologies if I sound like a process stickler, but the main problem is
> >> that the ballot is currently about v1 of the patches (and we
> >> obviously can't change that while it is open, as that would
> >> invalidate the votes that already have been cast.)
> > If it auto invalidates, is there any withdrawal process needed?
> > If no, lets withdraw an re-vote on v3.
>
> No, there's no such thing as auto-invalidation, as we cannot modify the ballot
> (sorry if I was unclear.) Let's just withdraw the ballot.
>
No problem. It's clear to me now.
> >
> >> If your v3 looks good, we need a new ballot to vote on that. As it
> >> stands now, the TC would have voted to include v1 with its known
> >> problems... that's why I think a withdrawal would be best.
> >
> > We will use the same github issue for v3, and new ballot yes?
>
> Yes, exactly. (This is not the first time this has happened.)
Ok. Please proceed to withdraw.
^ permalink raw reply
* RE: [virtio-dev] Re: [PATCH v2 00/20] Split device spec to its individual files
From: Cornelia Huck @ 2023-01-10 15:33 UTC (permalink / raw)
To: Parav Pandit, Michael S. Tsirkin
Cc: virtio-dev@lists.oasis-open.org,
virtio-comment@lists.oasis-open.org
In-Reply-To: <PH0PR12MB5481324F6E61B352850E8FB5DCFF9@PH0PR12MB5481.namprd12.prod.outlook.com>
On Tue, Jan 10 2023, Parav Pandit <parav@nvidia.com> wrote:
>> From: Cornelia Huck <cohuck@redhat.com>
>> Sent: Tuesday, January 10, 2023 10:21 AM
>>
>> On Tue, Jan 10 2023, Parav Pandit <parav@nvidia.com> wrote:
>>
>> >> From: virtio-dev@lists.oasis-open.org
>> >> <virtio-dev@lists.oasis-open.org> On Behalf Of Cornelia Huck
>> >>
>> >> On Mon, Jan 09 2023, "Michael S. Tsirkin" <mst@redhat.com> wrote:
>> >> > Does makediff still work? Documentation says latexpand does not
>> >> > support import. without latexdiff generating redlined versions
>> >> > would be very difficult.
>> >> >
>> >> >
>> >> > I am also worried about consistency since we already use \\input.
>> >> > If using \\input means putting everything in a single directory,
>> >> > that's a small price to pay:
>> >> >
>> >> > virtio-sound.tex + virtio-sound-conformance.tex
>> >> >
>> >> > is not fundamentally worse than
>> >> > device-types/virtio-sound/device.tex
>> >> > and device-types/virtio-sound/device-conformance.tex
>> >> >
>> >> > and it avoids the duplicated "device" in the name.
>> >> >
>> >> > Previously it looked like a cosmetic issue, but now it looks like
>> >> > it's important.
>> >>
>> >> I agree, and we need to decide quickly what to do with the ballot. We
>> >> don't want to merge v1, but the current votes still have a majority
>> >> of 'yes'. My preference would be to withdraw the ballot, which needs
>> >> to be done before
>> >> 22:00 UTC today, if I'm not confused.
>> >>
>> >> Parav, what do you think? If you request to withdraw the ballot,
>> >> that's easy to do; we'll just open a new one once we've agreed on a version.
>> >
>> > I am revising the v2 and should be available in 7 pm UTC time.
>> > This will include,
>> > a. white space removal at end of the net and blk files b. fix missing
>> > device conformance links for 4 devices c. import to input d. continue
>> > with directories e. rename device-types/<name>/device.tex to
>> > device-types/<name>/description.tex
>>
>> Apologies if I sound like a process stickler, but the main problem is that the
>> ballot is currently about v1 of the patches (and we obviously can't change that
>> while it is open, as that would invalidate the votes that already have been cast.)
> If it auto invalidates, is there any withdrawal process needed?
> If no, lets withdraw an re-vote on v3.
No, there's no such thing as auto-invalidation, as we cannot modify the
ballot (sorry if I was unclear.) Let's just withdraw the ballot.
>
>> If your v3 looks good, we need a new ballot to vote on that. As it stands now,
>> the TC would have voted to include v1 with its known problems... that's why I
>> think a withdrawal would be best.
>
> We will use the same github issue for v3, and new ballot yes?
Yes, exactly. (This is not the first time this has happened.)
---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org
^ permalink raw reply
* RE: [virtio-dev] Re: [PATCH v2 00/20] Split device spec to its individual files
From: Parav Pandit @ 2023-01-10 15:26 UTC (permalink / raw)
To: Cornelia Huck, Michael S. Tsirkin
Cc: virtio-dev@lists.oasis-open.org,
virtio-comment@lists.oasis-open.org
In-Reply-To: <87zgaq2z6e.fsf@redhat.com>
> From: Cornelia Huck <cohuck@redhat.com>
> Sent: Tuesday, January 10, 2023 10:21 AM
>
> On Tue, Jan 10 2023, Parav Pandit <parav@nvidia.com> wrote:
>
> >> From: virtio-dev@lists.oasis-open.org
> >> <virtio-dev@lists.oasis-open.org> On Behalf Of Cornelia Huck
> >>
> >> On Mon, Jan 09 2023, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> >> > Does makediff still work? Documentation says latexpand does not
> >> > support import. without latexdiff generating redlined versions
> >> > would be very difficult.
> >> >
> >> >
> >> > I am also worried about consistency since we already use \\input.
> >> > If using \\input means putting everything in a single directory,
> >> > that's a small price to pay:
> >> >
> >> > virtio-sound.tex + virtio-sound-conformance.tex
> >> >
> >> > is not fundamentally worse than
> >> > device-types/virtio-sound/device.tex
> >> > and device-types/virtio-sound/device-conformance.tex
> >> >
> >> > and it avoids the duplicated "device" in the name.
> >> >
> >> > Previously it looked like a cosmetic issue, but now it looks like
> >> > it's important.
> >>
> >> I agree, and we need to decide quickly what to do with the ballot. We
> >> don't want to merge v1, but the current votes still have a majority
> >> of 'yes'. My preference would be to withdraw the ballot, which needs
> >> to be done before
> >> 22:00 UTC today, if I'm not confused.
> >>
> >> Parav, what do you think? If you request to withdraw the ballot,
> >> that's easy to do; we'll just open a new one once we've agreed on a version.
> >
> > I am revising the v2 and should be available in 7 pm UTC time.
> > This will include,
> > a. white space removal at end of the net and blk files b. fix missing
> > device conformance links for 4 devices c. import to input d. continue
> > with directories e. rename device-types/<name>/device.tex to
> > device-types/<name>/description.tex
>
> Apologies if I sound like a process stickler, but the main problem is that the
> ballot is currently about v1 of the patches (and we obviously can't change that
> while it is open, as that would invalidate the votes that already have been cast.)
If it auto-invalidates, is any withdrawal process needed?
If not, let's withdraw and re-vote on v3.
> If your v3 looks good, we need a new ballot to vote on that. As it stands now,
> the TC would have voted to include v1 with its known problems... that's why I
> think a withdrawal would be best.
We will use the same GitHub issue for v3, with a new ballot, yes?
^ permalink raw reply
* [virtio-comment] RE: [virtio-dev] Re: [PATCH v2 00/20] Split device spec to its individual files
From: Cornelia Huck @ 2023-01-10 15:21 UTC (permalink / raw)
To: Parav Pandit, Michael S. Tsirkin
Cc: virtio-dev@lists.oasis-open.org,
virtio-comment@lists.oasis-open.org
In-Reply-To: <PH0PR12MB548116EBA8343DA37A8F345DDCFF9@PH0PR12MB5481.namprd12.prod.outlook.com>
On Tue, Jan 10 2023, Parav Pandit <parav@nvidia.com> wrote:
>> From: virtio-dev@lists.oasis-open.org <virtio-dev@lists.oasis-open.org> On
>> Behalf Of Cornelia Huck
>>
>> On Mon, Jan 09 2023, "Michael S. Tsirkin" <mst@redhat.com> wrote:
>> > Does makediff still work? Documentation says latexpand does not
>> > support import. without latexdiff generating redlined versions would
>> > be very difficult.
>> >
>> >
>> > I am also worried about consistency since we already use \\input.
>> > If using \\input means putting everything in a single directory,
>> > that's a small price to pay:
>> >
>> > virtio-sound.tex + virtio-sound-conformance.tex
>> >
>> > is not fundamentally worse than device-types/virtio-sound/device.tex
>> > and device-types/virtio-sound/device-conformance.tex
>> >
>> > and it avoids the duplicated "device" in the name.
>> >
>> > Previously it looked like a cosmetic issue, but now it looks like it's
>> > important.
>>
>> I agree, and we need to decide quickly what to do with the ballot. We don't
>> want to merge v1, but the current votes still have a majority of 'yes'. My
>> preference would be to withdraw the ballot, which needs to be done before
>> 22:00 UTC today, if I'm not confused.
>>
>> Parav, what do you think? If you request to withdraw the ballot, that's easy to
>> do; we'll just open a new one once we've agreed on a version.
>
> I am revising the v2 and should be available in 7 pm UTC time.
> This will include,
> a. white space removal at end of the net and blk files
> b. fix missing device conformance links for 4 devices
> c. import to input
> d. continue with directories
> e. rename device-types/<name>/device.tex to device-types/<name>/description.tex
Apologies if I sound like a process stickler, but the main problem is
that the ballot is currently about v1 of the patches (and we obviously
can't change that while it is open, as that would invalidate the votes
that already have been cast.) If your v3 looks good, we need a new
ballot to vote on that. As it stands now, the TC would have voted to
include v1 with its known problems... that's why I think a withdrawal
would be best.
^ permalink raw reply
* RE: [virtio-dev] Re: [PATCH v2 00/20] Split device spec to its individual files
From: Parav Pandit @ 2023-01-10 15:05 UTC (permalink / raw)
To: Cornelia Huck, Michael S. Tsirkin
Cc: virtio-dev@lists.oasis-open.org,
virtio-comment@lists.oasis-open.org
In-Reply-To: <87358i4ow0.fsf@redhat.com>
> From: virtio-dev@lists.oasis-open.org <virtio-dev@lists.oasis-open.org> On
> Behalf Of Cornelia Huck
>
> On Mon, Jan 09 2023, "Michael S. Tsirkin" <mst@redhat.com> wrote:
>
> > On Mon, Jan 09, 2023 at 06:28:29PM +0200, Parav Pandit wrote:
> >> Several of the more recent device specifications are maintained
> >> in their own specification files. Such separate files enable better
> >> maintenance of the specification overall.
> >> However, several of the initial virtio device specifications are
> >> located in a single file.
> >>
> >> Hence, split them into their individual files.
> >>
> >> Additionally, each device's driver and device conformance is present
> >> in one giant conformance file all together.
> >>
> >> As Michael suggests, move the device and driver conformance sections
> >> adjacent to the device specification in each device-specific directory.
> >> This further makes the device specification self-contained.
> >>
> >> Added patch to fix spelling errors in network device specification
> >> which was inherited from its previous file location.
> >>
> >> Patches do not change any part of the specification outcome except
> >> fixing the spelling errors.
> >> It only changes how the specification is maintained.
> >>
> >> patch summary:
> >> -------------
> >> patch 1 to 7 creates new files for moving devices spec out of content
> >> and conformance files.
> >> patch 8 to 20 move existing dedicated file spec to new directory and
> >> creates per device,driver conformance file for each device.
> >>
> >> changelog:
> >> ----------
> >> v1->v2:
> >> - removed extra blank lines in network and block device files
> >> - added missing device conformance link for rpmb, sound, i2c and
> >> gpio devices
> >> v0->v1:
> >> - move device spec to their own directory
> >> - added split files for conformance and placed them adjacent to
> >> device spec
> >> - added patch to fix spelling errors in network device
> >>
> >> Parav Pandit (20):
> >> virtio-network: Maintain network device spec in separate directory
> >> virtio-network: Fix spelling errors
> >> virtio-block: Maintain block device spec in separate directory
> >> virtio-console: Maintain console device spec in separate directory
> >> virtio-entropy: Maintain entropy device spec in separate directory
> >> virtio-mem-balloon: Maintain mem balloon device spec in separate
> >> directory
> >> virtio-scsi: Maintain scsi host device spec in separate directory
> >> virtio-gpu: Maintain gpu device spec in separate directory
> >> virtio-input: Maintain input device spec in separate directory
> >> virtio-crypto: Maintain crypto device spec in separate directory
> >> virtio-vsock: Maintain socket device spec in separate directory
> >> virtio-fs: Maintain file system device spec in separate directory
> >> virtio-rpmb: Maintain rpmb device spec in separate directory
> >> virtio-iommu: Maintain iommu device spec in separate directory
> >> virtio-sound: Maintain sound device spec in separate directory
> >> virtio-mem: Maintain memory device spec in separate directory
> >> virtio-i2c: Maintain i2c device spec in separate directory
> >> virtio-scmi: Maintain scmi device spec in separate directory
> >> virtio-gpio: Maintain gpio device spec in separate directory
> >> virtio-pmem: Maintain pmem device spec in separate directory
> >>
> >> conformance.tex | 456 +-
> >> content.tex | 4561 +----------------
> >> .../virtio-block/device-conformance.tex | 8 +
> >> device-types/virtio-block/device.tex | 1313 +++++
> >> .../virtio-block/driver-conformance.tex | 8 +
> >> .../virtio-console/device-conformance.tex | 8 +
> >> device-types/virtio-console/device.tex | 231 +
> >> .../virtio-console/driver-conformance.tex | 8 +
> >> .../virtio-crypto/device-conformance.tex | 13 +
> >> .../virtio-crypto/device.tex | 0
> >> .../virtio-crypto/driver-conformance.tex | 14 +
> >> .../virtio-entropy/device-conformance.tex | 7 +
> >> device-types/virtio-entropy/device.tex | 42 +
> >> .../virtio-entropy/driver-conformance.tex | 7 +
> >> device-types/virtio-fs/device-conformance.tex | 9 +
> >> .../virtio-fs/device.tex | 0
> >> device-types/virtio-fs/driver-conformance.tex | 10 +
> >> .../virtio-gpio/device-conformance.tex | 9 +
> >> .../virtio-gpio/device.tex | 0
> >> .../virtio-gpio/driver-conformance.tex | 9 +
> >> .../virtio-gpu/device-conformance.tex | 8 +
> >> .../virtio-gpu/device.tex | 0
> >> .../virtio-i2c/device-conformance.tex | 7 +
> >> .../virtio-i2c/device.tex | 0
> >> .../virtio-i2c/driver-conformance.tex | 7 +
> >> .../virtio-input/device-conformance.tex | 8 +
> >> .../virtio-input/device.tex | 0
> >> .../virtio-input/driver-conformance.tex | 8 +
> >> .../virtio-iommu/device-conformance.tex | 16 +
> >> .../virtio-iommu/device.tex | 0
> >> .../virtio-iommu/driver-conformance.tex | 17 +
> >> .../virtio-mem-balloon/device-conformance.tex | 12 +
> >> device-types/virtio-mem-balloon/device.tex | 634 +++
> >> .../virtio-mem-balloon/driver-conformance.tex | 12 +
> >> .../virtio-mem/device-conformance.tex | 13 +
> >> .../virtio-mem/device.tex | 0
> >> .../virtio-mem/driver-conformance.tex | 13 +
> >> .../virtio-network/device-conformance.tex | 16 +
> >> device-types/virtio-network/device.tex | 1594 ++++++
> >> .../virtio-network/driver-conformance.tex | 17 +
> >> .../virtio-pmem/device-conformance.tex | 9 +
> >> .../virtio-pmem/device.tex | 0
> >> .../virtio-pmem/driver-conformance.tex | 7 +
> >> .../virtio-rpmb/device-conformance.tex | 13 +
> >> .../virtio-rpmb/device.tex | 0
> >> .../virtio-rpmb/driver-conformance.tex | 7 +
> >> .../virtio-scmi/device-conformance.tex | 10 +
> >> .../virtio-scmi/device.tex | 0
> >> .../virtio-scmi/driver-conformance.tex | 8 +
> >> .../virtio-scsi/device-conformance.tex | 10 +
> >> device-types/virtio-scsi/device.tex | 709 +++
> >> .../virtio-scsi/driver-conformance.tex | 9 +
> >> .../virtio-sound/device-conformance.tex | 16 +
> >> .../virtio-sound/device.tex | 0
> >> .../virtio-sound/driver-conformance.tex | 13 +
> >> .../virtio-vsock/device-conformance.tex | 9 +
> >> .../virtio-vsock/device.tex | 0
> >> .../virtio-vsock/driver-conformance.tex | 10 +
> >> virtio.tex | 1 +
> >> 59 files changed, 4965 insertions(+), 4961 deletions(-)
> >
> > Does makediff still work? Documentation says latexpand does not
> > support import. Without latexdiff, generating redlined versions would
> > be very difficult.
> >
> >
> > I am also worried about consistency since we already use \input.
> > If using \input means putting everything in a single directory,
> > that's a small price to pay:
> >
> > virtio-sound.tex + virtio-sound-conformance.tex
> >
> > is not fundamentally worse than device-types/virtio-sound/device.tex
> > and device-types/virtio-sound/device-conformance.tex
> >
> > and it avoids the duplicated "device" in the name.
> >
> > Previously it looked like a cosmetic issue, but now it looks like it's
> > important.
>
> I agree, and we need to decide quickly what to do with the ballot. We don't
> want to merge v1, but the current votes still have a majority of 'yes'. My
> preference would be to withdraw the ballot, which needs to be done before
> 22:00 UTC today, if I'm not confused.
>
> Parav, what do you think? If you request to withdraw the ballot, that's easy to
> do; we'll just open a new one once we've agreed on a version.
I am revising v2; the new version should be available at 7 pm UTC.
This will include,
a. white space removal at end of the net and blk files
b. fix missing device conformance links for 4 devices
c. import to input
d. continue with directories
e. rename device-types/<name>/device.tex to device-types/<name>/description.tex
* [virtio-dev] Re: [PATCH v2 00/20] Split device spec to its individual files
From: Cornelia Huck @ 2023-01-10 11:20 UTC (permalink / raw)
To: Michael S. Tsirkin, Parav Pandit; +Cc: virtio-dev, virtio-comment
In-Reply-To: <20230109121857-mutt-send-email-mst@kernel.org>
On Mon, Jan 09 2023, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> On Mon, Jan 09, 2023 at 06:28:29PM +0200, Parav Pandit wrote:
>> Several of the more recent device specifications are maintained
>> in their own specification files. Such separate files enable better
>> maintenance of the specification overall.
>> However, several of the initial virtio device specifications
>> are located in a single file.
>>
>> Hence, split them into their individual files.
>>
>> Additionally, each device's driver and device conformance is
>> present in one giant conformance file all together.
>>
>> As Michael suggests, move the device and driver conformance
>> sections adjacent to the device specification in each device-specific
>> directory. This further makes the device specification self-contained.
>>
>> Added patch to fix spelling errors in network device
>> specification which was inherited from its previous file
>> location.
>>
>> Patches do not change any part of the specification outcome
>> except fixing the spelling errors.
>> It only changes how the specification is maintained.
>>
>> patch summary:
>> -------------
>> patch 1 to 7 creates new files for moving devices spec out of
>> content and conformance files.
>> patch 8 to 20 move existing dedicated file spec to new directory
>> and creates per device,driver conformance file for each device.
>>
>> changelog:
>> ----------
>> v1->v2:
>> - removed extra blank lines in network and block device files
>> - added missing device conformance link for rpmb, sound, i2c and
>> gpio devices
>> v0->v1:
>> - move device spec to their own directory
>> - added split files for conformance and placed them adjacent to
>> device spec
>> - added patch to fix spelling errors in network device
>>
>> Parav Pandit (20):
>> virtio-network: Maintain network device spec in separate directory
>> virtio-network: Fix spelling errors
>> virtio-block: Maintain block device spec in separate directory
>> virtio-console: Maintain console device spec in separate directory
>> virtio-entropy: Maintain entropy device spec in separate directory
>> virtio-mem-balloon: Maintain mem balloon device spec in separate
>> directory
>> virtio-scsi: Maintain scsi host device spec in separate directory
>> virtio-gpu: Maintain gpu device spec in separate directory
>> virtio-input: Maintain input device spec in separate directory
>> virtio-crypto: Maintain crypto device spec in separate directory
>> virtio-vsock: Maintain socket device spec in separate directory
>> virtio-fs: Maintain file system device spec in separate directory
>> virtio-rpmb: Maintain rpmb device spec in separate directory
>> virtio-iommu: Maintain iommu device spec in separate directory
>> virtio-sound: Maintain sound device spec in separate directory
>> virtio-mem: Maintain memory device spec in separate directory
>> virtio-i2c: Maintain i2c device spec in separate directory
>> virtio-scmi: Maintain scmi device spec in separate directory
>> virtio-gpio: Maintain gpio device spec in separate directory
>> virtio-pmem: Maintain pmem device spec in separate directory
>>
>> conformance.tex | 456 +-
>> content.tex | 4561 +----------------
>> .../virtio-block/device-conformance.tex | 8 +
>> device-types/virtio-block/device.tex | 1313 +++++
>> .../virtio-block/driver-conformance.tex | 8 +
>> .../virtio-console/device-conformance.tex | 8 +
>> device-types/virtio-console/device.tex | 231 +
>> .../virtio-console/driver-conformance.tex | 8 +
>> .../virtio-crypto/device-conformance.tex | 13 +
>> .../virtio-crypto/device.tex | 0
>> .../virtio-crypto/driver-conformance.tex | 14 +
>> .../virtio-entropy/device-conformance.tex | 7 +
>> device-types/virtio-entropy/device.tex | 42 +
>> .../virtio-entropy/driver-conformance.tex | 7 +
>> device-types/virtio-fs/device-conformance.tex | 9 +
>> .../virtio-fs/device.tex | 0
>> device-types/virtio-fs/driver-conformance.tex | 10 +
>> .../virtio-gpio/device-conformance.tex | 9 +
>> .../virtio-gpio/device.tex | 0
>> .../virtio-gpio/driver-conformance.tex | 9 +
>> .../virtio-gpu/device-conformance.tex | 8 +
>> .../virtio-gpu/device.tex | 0
>> .../virtio-i2c/device-conformance.tex | 7 +
>> .../virtio-i2c/device.tex | 0
>> .../virtio-i2c/driver-conformance.tex | 7 +
>> .../virtio-input/device-conformance.tex | 8 +
>> .../virtio-input/device.tex | 0
>> .../virtio-input/driver-conformance.tex | 8 +
>> .../virtio-iommu/device-conformance.tex | 16 +
>> .../virtio-iommu/device.tex | 0
>> .../virtio-iommu/driver-conformance.tex | 17 +
>> .../virtio-mem-balloon/device-conformance.tex | 12 +
>> device-types/virtio-mem-balloon/device.tex | 634 +++
>> .../virtio-mem-balloon/driver-conformance.tex | 12 +
>> .../virtio-mem/device-conformance.tex | 13 +
>> .../virtio-mem/device.tex | 0
>> .../virtio-mem/driver-conformance.tex | 13 +
>> .../virtio-network/device-conformance.tex | 16 +
>> device-types/virtio-network/device.tex | 1594 ++++++
>> .../virtio-network/driver-conformance.tex | 17 +
>> .../virtio-pmem/device-conformance.tex | 9 +
>> .../virtio-pmem/device.tex | 0
>> .../virtio-pmem/driver-conformance.tex | 7 +
>> .../virtio-rpmb/device-conformance.tex | 13 +
>> .../virtio-rpmb/device.tex | 0
>> .../virtio-rpmb/driver-conformance.tex | 7 +
>> .../virtio-scmi/device-conformance.tex | 10 +
>> .../virtio-scmi/device.tex | 0
>> .../virtio-scmi/driver-conformance.tex | 8 +
>> .../virtio-scsi/device-conformance.tex | 10 +
>> device-types/virtio-scsi/device.tex | 709 +++
>> .../virtio-scsi/driver-conformance.tex | 9 +
>> .../virtio-sound/device-conformance.tex | 16 +
>> .../virtio-sound/device.tex | 0
>> .../virtio-sound/driver-conformance.tex | 13 +
>> .../virtio-vsock/device-conformance.tex | 9 +
>> .../virtio-vsock/device.tex | 0
>> .../virtio-vsock/driver-conformance.tex | 10 +
>> virtio.tex | 1 +
>> 59 files changed, 4965 insertions(+), 4961 deletions(-)
>
> Does makediff still work? Documentation says latexpand does not
> support import. Without latexdiff, generating redlined versions would
> be very difficult.
>
>
> I am also worried about consistency since we
> already use \input.
> If using \input means putting everything in a single directory,
> that's a small price to pay:
>
> virtio-sound.tex + virtio-sound-conformance.tex
>
> is not fundamentally worse than device-types/virtio-sound/device.tex
> and device-types/virtio-sound/device-conformance.tex
>
> and it avoids the duplicated "device" in the name.
>
> Previously it looked like a cosmetic issue, but now it looks
> like it's important.
I agree, and we need to decide quickly what to do with the ballot. We
don't want to merge v1, but the current votes still have a majority of
'yes'. My preference would be to withdraw the ballot, which needs to be
done before 22:00 UTC today, if I'm not confused.
Parav, what do you think? If you request to withdraw the ballot, that's
easy to do; we'll just open a new one once we've agreed on a version.
* Re: [virtio-comment] Re: [PATCH v7] virtio-net: support inner header hash
From: Heng Qi @ 2023-01-10 7:47 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: virtio-comment, virtio-dev, Jason Wang, Yuri Benditovich,
Cornelia Huck, Xuan Zhuo
In-Reply-To: <20230109063654-mutt-send-email-mst@kernel.org>
On 2023/1/9 at 7:39 PM, Michael S. Tsirkin wrote:
> Btw this "are defined below" all over the place is just contributing
> to making the spec unnecessarily verbose. Simple "are:" will do.
Sure. I'll fix it in the next version.
Thanks.
* Re: [PATCH v7] virtio-net: support inner header hash
From: Heng Qi @ 2023-01-10 7:46 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: virtio-comment, virtio-dev, Jason Wang, Yuri Benditovich,
Cornelia Huck, Xuan Zhuo
In-Reply-To: <20230109063413-mutt-send-email-mst@kernel.org>
On 2023/1/9 at 7:36 PM, Michael S. Tsirkin wrote:
> On Wed, Jan 04, 2023 at 03:14:01PM +0800, Heng Qi wrote:
>> If the tunnel is used to encapsulate the packets, the hash calculated
>> using the outer header of the receive packets is always fixed for the
>> same flow packets, i.e. they will be steered to the same receive queue.
>>
>> We add a feature bit VIRTIO_NET_F_HASH_TUNNEL and related bitmasks
>> in \field{hash_types}, which instructs the device to calculate the
>> hash using the inner headers of tunnel-encapsulated packets. Besides,
>> values in \field{hash_report_tunnel} are added to report tunnel types.
>>
>> Fixes: https://github.com/oasis-tcs/virtio-spec/issues/151
>>
>> Reviewed-by: Jason Wang <jasowang@redhat.com>
>> Signed-off-by: Heng Qi <hengqi@linux.alibaba.com>
>> Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
>> ---
>> v6:
>> 1. Modify the wording of some sentences for clarity. @Michael S. Tsirkin
>> 2. Fix some syntax issues. @Michael S. Tsirkin
>>
>> v5:
>> 1. Fix some syntax and capitalization issues. @Michael S. Tsirkin
>> 2. Use encapsulated/encapsulation uniformly. @Michael S. Tsirkin
>> 3. Move the links to introduction section. @Michael S. Tsirkin
>> 4. Clarify some sentences. @Michael S. Tsirkin
>>
>> v4:
>> 1. Clarify some paragraphs. @Cornelia Huck
>> 2. Fix the u8 type. @Cornelia Huck
>>
>> v3:
>> 1. Rename VIRTIO_NET_F_HASH_GRE_VXLAN_GENEVE_INNER to VIRTIO_NET_F_HASH_TUNNEL. @Jason Wang
>> 2. Make things clearer. @Jason Wang @Michael S. Tsirkin
>> 3. Keep the possibility to use inner hash for automatic receive steering. @Jason Wang
>> 4. Add the "Tunnel packet" paragraph to avoid repeating the GRE etc. many times. @Michael S. Tsirkin
>>
>> v2:
>> 1. Add a feature bit for GRE/VXLAN/GENEVE inner hash. @Jason Wang
>> 2. Change \field{hash_tunnel} to \field{hash_report_tunnel}. @Jason Wang, @Michael S. Tsirkin
>>
>> v1:
>> 1. Remove the patch for the bitmask fix. @Michael S. Tsirkin
>> 2. Clarify some paragraphs. @Jason Wang
>> 3. Add \field{hash_tunnel} and VIRTIO_NET_HASH_REPORT_GRE. @Yuri Benditovich
>>
>> content.tex | 191 +++++++++++++++++++++++++++++++++++++++++++++--
>> introduction.tex | 19 +++++
>> 2 files changed, 203 insertions(+), 7 deletions(-)
>>
>> diff --git a/content.tex b/content.tex
>> index e863709..7845f6c 100644
>> --- a/content.tex
>> +++ b/content.tex
>> @@ -3084,6 +3084,9 @@ \subsection{Feature bits}\label{sec:Device Types / Network Device / Feature bits
>> \item[VIRTIO_NET_F_CTRL_MAC_ADDR(23)] Set MAC address through control
>> channel.
>>
>> +\item[VIRTIO_NET_F_HASH_TUNNEL(52)] Device supports inner
>> + header hash for GRE, VXLAN and GENEVE tunnel-encapsulated packets.
>> +
>> \item[VIRTIO_NET_F_NOTF_COAL(53)] Device supports notifications coalescing.
>>
>> \item[VIRTIO_NET_F_GUEST_USO4 (54)] Driver can receive USOv4 packets.
>> @@ -3095,7 +3098,8 @@ \subsection{Feature bits}\label{sec:Device Types / Network Device / Feature bits
>> to several segments when each of these smaller packets has UDP header.
>>
>> \item[VIRTIO_NET_F_HASH_REPORT(57)] Device can report per-packet hash
>> - value and a type of calculated hash.
>> + value, a type of calculated hash, and, if VIRTIO_NET_F_HASH_TUNNEL
>> + is negotiated, an encapsulation packet type.
>>
>> \item[VIRTIO_NET_F_GUEST_HDRLEN(59)] Driver can provide the exact \field{hdr_len}
>> value. Device benefits from knowing the exact header length.
>> @@ -3140,6 +3144,7 @@ \subsubsection{Feature bit requirements}\label{sec:Device Types / Network Device
>> \item[VIRTIO_NET_F_NOTF_COAL] Requires VIRTIO_NET_F_CTRL_VQ.
>> \item[VIRTIO_NET_F_RSC_EXT] Requires VIRTIO_NET_F_HOST_TSO4 or VIRTIO_NET_F_HOST_TSO6.
>> \item[VIRTIO_NET_F_RSS] Requires VIRTIO_NET_F_CTRL_VQ.
>> +\item[VIRTIO_NET_F_HASH_TUNNEL] Requires VIRTIO_NET_F_CTRL_VQ.
>> \end{description}
>>
>> \subsubsection{Legacy Interface: Feature bits}\label{sec:Device Types / Network Device / Feature bits / Legacy Interface: Feature bits}
>> @@ -3386,7 +3391,8 @@ \subsection{Device Operation}\label{sec:Device Types / Network Device / Device O
>> le16 num_buffers;
>> le32 hash_value; (Only if VIRTIO_NET_F_HASH_REPORT negotiated)
>> le16 hash_report; (Only if VIRTIO_NET_F_HASH_REPORT negotiated)
>> - le16 padding_reserved; (Only if VIRTIO_NET_F_HASH_REPORT negotiated)
>> + u8 hash_report_tunnel; (Only if VIRTIO_NET_F_HASH_REPORT negotiated, only valid if VIRTIO_NET_F_HASH_TUNNEL negotiated)
>> + u8 padding_reserved; (Only if VIRTIO_NET_F_HASH_REPORT negotiated)
>> };
>> \end{lstlisting}
>>
>> @@ -3837,7 +3843,8 @@ \subsubsection{Processing of Incoming Packets}\label{sec:Device Types / Network
>> A device attempts to calculate a per-packet hash in the following cases:
>> \begin{itemize}
>> \item The feature VIRTIO_NET_F_RSS was negotiated. The device uses the hash to determine the receive virtqueue to place incoming packets.
>> -\item The feature VIRTIO_NET_F_HASH_REPORT was negotiated. The device reports the hash value and the hash type with the packet.
>> +\item The feature VIRTIO_NET_F_HASH_REPORT was negotiated. The device reports the hash value and the hash type. If additionally
>> +VIRTIO_NET_F_HASH_TUNNEL was negotiated, the device reports the encapsulation type as well.
>> \end{itemize}
>>
>> If the feature VIRTIO_NET_F_RSS was negotiated:
>> @@ -3863,8 +3870,36 @@ \subsubsection{Processing of Incoming Packets}\label{sec:Device Types / Network
>> \ref{sec:Device Types / Network Device / Device Operation / Control Virtqueue / Automatic receive steering in multiqueue mode / Hash calculation}.
>> \end{itemize}
>>
>> +\subparagraph{Tunnel/Encapsulated packet}
>> +\label{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Hash calculation for incoming packets / Tunnel/Encapsulated packet}
>> +A tunnel packet is encapsulated from the original packet based on the tunneling
>> +protocol (only a single level of encapsulation is currently supported). The
>> +encapsulated packet contains an outer header and an inner header, and the device
>> +calculates the hash over either the inner header or the outer header.
>> +
>> +When the feature VIRTIO_NET_F_HASH_TUNNEL is negotiated and the corresponding
>> +encapsulation type is set in \field{hash_types}, the hash for a specific type of
>> +encapsulated packet is calculated over the inner as opposed to outer header.
>> +Supported encapsulation types are listed in \ref{sec:Device Types / Network Device /
>> +Device Operation / Processing of Incoming Packets / Hash calculation for incoming packets /
>> +Supported/enabled hash types}.
>> +
>> +If both VIRTIO_NET_F_HASH_REPORT and VIRTIO_NET_F_HASH_TUNNEL are negotiated,
>> +the device can support inner hash calculation for \hyperref[intro:GRE]{[GRE]},
>> +\hyperref[intro:VXLAN]{[VXLAN]} and \hyperref[intro:GENEVE]{[GENEVE]}
>> +encapsulated packets, and the corresponding encapsulation type in \field{hash_types}
>> +is VIRTIO_NET_HASH_TYPE_{GRE, VXLAN, GENEVE}_INNER respectively. The value in
>> +\field{hash_report_tunnel} is VIRTIO_NET_HASH_REPORT_{NONE, GRE, VXLAN, GENEVE} respectively.
>> +
>> +If VIRTIO_NET_F_HASH_REPORT is negotiated but VIRTIO_NET_F_HASH_TUNNEL is not
>> +negotiated, the device calculates the hash over the outer header, and
>> +\field{hash_report} reports the hash type. \field{hash_report_tunnel}
>> +is no longer valid.
>> +
>> \subparagraph{Supported/enabled hash types}
>> \label{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Hash calculation for incoming packets / Supported/enabled hash types}
>> +This paragraph relies on definitions from \hyperref[intro:IP]{[IP]},
>> +\hyperref[intro:UDP]{[UDP]} and \hyperref[intro:TCP]{[TCP]}.
>> Hash types applicable for IPv4 packets:
>> \begin{lstlisting}
>> #define VIRTIO_NET_HASH_TYPE_IPv4 (1 << 0)
>> @@ -3884,6 +3919,22 @@ \subsubsection{Processing of Incoming Packets}\label{sec:Device Types / Network
>> #define VIRTIO_NET_HASH_TYPE_UDP_EX (1 << 8)
>> \end{lstlisting}
>>
>> +If the feature VIRTIO_NET_F_HASH_TUNNEL is negotiated, the encapsulation
>> +hash type below indicates that the hash is calculated over the inner
>> +header of the encapsulated packet:
>> +Hash type applicable for inner payload of the gre-encapsulated packet
>> +\begin{lstlisting}
>> +#define VIRTIO_NET_HASH_TYPE_GRE_INNER (1 << 9)
>> +\end{lstlisting}
>> +Hash type applicable for inner payload of the vxlan-encapsulated packet
>> +\begin{lstlisting}
>> +#define VIRTIO_NET_HASH_TYPE_VXLAN_INNER (1 << 10)
>> +\end{lstlisting}
>> +Hash type applicable for inner payload of the geneve-encapsulated packet
>> +\begin{lstlisting}
>> +#define VIRTIO_NET_HASH_TYPE_GENEVE_INNER (1 << 11)
>> +\end{lstlisting}
>> +
>> \subparagraph{IPv4 packets}
>> \label{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Hash calculation for incoming packets / IPv4 packets}
>> The device calculates the hash on IPv4 packets according to 'Enabled hash types' bitmask as follows:
>> @@ -3975,15 +4026,125 @@ \subsubsection{Processing of Incoming Packets}\label{sec:Device Types / Network
>> (see \ref{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Hash calculation for incoming packets / IPv6 packets without extension header}).
>> \end{itemize}
>>
>> +\subparagraph{Inner payload of an encapsulated packet}
>> +\label{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Hash calculation for incoming packets / Inner payload of the encapsulated packet}
>> +If the feature VIRTIO_NET_F_HASH_TUNNEL is negotiated and the corresponding
>> +encapsulation hash type is set in \field{hash_types}, the device calculates the
>> +inner header hash on an encapsulated packet (See \ref{sec:Device Types
>> +/ Network Device / Device Operation / Processing of Incoming Packets /
>> +Hash calculation for incoming packets / Tunnel/Encapsulated packet}) as follows:
>> +
>> +The device calculates the hash on the inner IPv4 packet of an encapsulated packet
>> +according to 'Enabled hash types' bitmask as follows:
>> +\begin{itemize}
>> + \item If VIRTIO_NET_HASH_TYPE_TCPv4 is set and the encapsulated packet has an inner
>> + TCPv4 header, the hash is calculated over the following fields:
>> + \begin{itemize}
>> + \item inner Source IP address
>> + \item inner Destination IP address
>> + \item inner Source TCP port
>> + \item inner Destination TCP port
>> + \end{itemize}
>> + \item Else if VIRTIO_NET_HASH_TYPE_UDPv4 is set and the encapsulated packet has an
>> + inner UDPv4 header, the hash is calculated over the following fields:
>> + \begin{itemize}
>> + \item inner Source IP address
>> + \item inner Destination IP address
>> + \item inner Source UDP port
>> + \item inner Destination UDP port
>> + \end{itemize}
>> + \item Else if VIRTIO_NET_HASH_TYPE_IPv4 is set, the hash is calculated over the
>> + following fields:
>> + \begin{itemize}
>> + \item inner Source IP address
>> + \item inner Destination IP address
>> + \end{itemize}
>> + \item Else the device does not calculate the hash
>> +\end{itemize}
>> +
>> +The device calculates the hash on the inner IPv6 packet without an extension header
>> +of an encapsulated packet according to 'Enabled hash types' bitmask as follows:
>> +\begin{itemize}
>> + \item If VIRTIO_NET_HASH_TYPE_TCPv6 is set and the encapsulated packet has an inner
>> + TCPv6 header, the hash is calculated over the following fields:
>> + \begin{itemize}
>> + \item inner Source IPv6 address
>> + \item inner Destination IPv6 address
>> + \item inner Source TCP port
>> + \item inner Destination TCP port
>> + \end{itemize}
>> + \item Else if VIRTIO_NET_HASH_TYPE_UDPv6 is set and the encapsulated packet has an
>> + inner UDPv6 header, the hash is calculated over the following fields:
>> + \begin{itemize}
>> + \item inner Source IPv6 address
>> + \item inner Destination IPv6 address
>> + \item inner Source UDP port
>> + \item inner Destination UDP port
>> + \end{itemize}
>> + \item Else if VIRTIO_NET_HASH_TYPE_IPv6 is set, the hash is calculated over the
>> + following fields:
>> + \begin{itemize}
>> + \item inner Source IPv6 address
>> + \item inner Destination IPv6 address
>> + \end{itemize}
>> + \item Else the device does not calculate the hash
>> +\end{itemize}
>> +
>> +The device calculates the hash on the inner IPv6 packet with an extension header
>> +of an encapsulated packet according to 'Enabled hash types' bitmask as follows:
>> +\begin{itemize}
>> + \item If VIRTIO_NET_HASH_TYPE_TCP_EX is set and the encapsulated packet has an inner
>> + TCPv6 header, the hash is calculated over the following fields:
>> + \begin{itemize}
>> + \item Home address from the home address option in the inner IPv6 destination
>> + options header. If the inner extension header is not present, use the
>> + inner Source IPv6 address.
>> + \item IPv6 address that is contained in the Routing-Header-Type-2 from the
>> + associated inner extension header. If the inner extension header is not
>> + present, use the inner Destination IPv6 address.
>> + \item inner Source TCP port
>> + \item inner Destination TCP port
>> + \end{itemize}
>> + \item Else if VIRTIO_NET_HASH_TYPE_UDP_EX is set and the encapsulated packet has an inner
>> + UDPv6 header, the hash is calculated over the following fields:
>> + \begin{itemize}
>> + \item Home address from the home address option in the inner IPv6 destination
>> + options header. If the inner extension header is not present, use the
>> + inner Source IPv6 address.
>> + \item IPv6 address that is contained in the Routing-Header-Type-2 from the
>> + associated inner extension header. If the inner extension header is not
>> + present, use the inner Destination IPv6 address.
>> + \item inner Source UDP port
>> + \item inner Destination UDP port
>> + \end{itemize}
>> + \item Else if VIRTIO_NET_HASH_TYPE_IP_EX is set, the hash is calculated over the
>> + following fields:
>> + \begin{itemize}
>> + \item Home address from the home address option in the inner IPv6 destination
>> + options header. If the inner extension header is not present, use the
>> + inner Source IPv6 address.
>> + \item IPv6 address that is contained in the Routing-Header-Type-2 from the
>> + associated inner extension header. If the inner extension header is not
>> + present, use the inner Destination IPv6 address.
>> + \end{itemize}
>> + \item Else skip the inner IPv6 extension header and calculate the inner header hash as
>> + defined for an encapsulated packet whose inner payload is an IPv6 packet without
>> + an extension header.
>> +\end{itemize}
>> +
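The precedence in the lists above (TCP first, then UDP, then plain IP, else no hash) can be sketched in C for the inner IPv4 case. This is only an illustrative sketch, not text from the patch: the enum names and the function select_inner_ipv4_fields() are hypothetical, and the TCPv4/UDPv4 bit values are taken from the existing hash type definitions in the spec rather than from the hunks quoted here. It also assumes the caller has already checked that VIRTIO_NET_F_HASH_TUNNEL is negotiated and that the matching *_INNER bit is set in hash_types.

```c
#include <assert.h>
#include <stdint.h>

/* Hash type bits; IPv4 is quoted in the patch, TCPv4/UDPv4 are assumed
 * from the existing spec definitions. */
#define VIRTIO_NET_HASH_TYPE_IPv4  (1 << 0)
#define VIRTIO_NET_HASH_TYPE_TCPv4 (1 << 1)
#define VIRTIO_NET_HASH_TYPE_UDPv4 (1 << 2)

/* Hypothetical: L4 protocol found behind the inner IPv4 header. */
enum inner_l4 { INNER_L4_NONE, INNER_L4_TCP, INNER_L4_UDP };

/* Hypothetical: which inner fields feed the hash. */
enum inner_hash_fields {
    HASH_FIELDS_NONE,   /* device does not calculate the hash */
    HASH_FIELDS_IP,     /* inner source/destination IP address */
    HASH_FIELDS_IP_TCP, /* inner addresses plus inner TCP ports */
    HASH_FIELDS_IP_UDP  /* inner addresses plus inner UDP ports */
};

/* Pick the hash input for an inner IPv4 packet, following the
 * "If ... Else if ... Else" order of the list above. */
static enum inner_hash_fields
select_inner_ipv4_fields(uint32_t hash_types, enum inner_l4 l4)
{
    if ((hash_types & VIRTIO_NET_HASH_TYPE_TCPv4) && l4 == INNER_L4_TCP)
        return HASH_FIELDS_IP_TCP;
    if ((hash_types & VIRTIO_NET_HASH_TYPE_UDPv4) && l4 == INNER_L4_UDP)
        return HASH_FIELDS_IP_UDP;
    if (hash_types & VIRTIO_NET_HASH_TYPE_IPv4)
        return HASH_FIELDS_IP;
    return HASH_FIELDS_NONE;
}

/* Usage: a UDP inner packet with only TCPv4 and IPv4 enabled
 * falls through to the address-only hash. */
static void select_inner_ipv4_fields_example(void)
{
    uint32_t types = VIRTIO_NET_HASH_TYPE_TCPv4 | VIRTIO_NET_HASH_TYPE_IPv4;
    assert(select_inner_ipv4_fields(types, INNER_L4_UDP) == HASH_FIELDS_IP);
}
```

The inner IPv6 cases would follow the same shape with the corresponding TCPv6/UDPv6/IPv6 and *_EX bits.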
>> \paragraph{Hash reporting for incoming packets}
>> \label{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Hash reporting for incoming packets}
>>
>> -If VIRTIO_NET_F_HASH_REPORT was negotiated and
>> - the device has calculated the hash for the packet, the device fills \field{hash_report} with the report type of calculated hash
>> -and \field{hash_value} with the value of calculated hash.
>> +If VIRTIO_NET_F_HASH_REPORT was negotiated and the device has calculated the
>> +hash for the packet, the device fills \field{hash_report} with the report type
>> +of calculated hash, and \field{hash_value} with the value of calculated hash.
>> +Also, if VIRTIO_NET_F_HASH_TUNNEL was negotiated, the device needs to fill
>> +\field{hash_report_tunnel} with the report type of the encapsulated packet, and
>> +it is set to VIRTIO_NET_HASH_REPORT_TUNNEL_NONE for the unencapsulated packet.
>>
>> If VIRTIO_NET_F_HASH_REPORT was negotiated but due to any reason the
>> -hash was not calculated, the device sets \field{hash_report} to VIRTIO_NET_HASH_REPORT_NONE.
>> +hash was not calculated, the device sets \field{hash_report} to VIRTIO_NET_HASH_REPORT_NONE,
>> +and sets \field{hash_report_tunnel} to VIRTIO_NET_HASH_REPORT_TUNNEL_NONE.
>>
>> Possible values that the device can report in \field{hash_report} are defined below.
>> They correspond to supported hash types defined in
>> @@ -4005,6 +4166,22 @@ \subsubsection{Processing of Incoming Packets}\label{sec:Device Types / Network
>> #define VIRTIO_NET_HASH_REPORT_UDPv6_EX 9
>> \end{lstlisting}
>>
>> +If \field{hash_report} differs from VIRTIO_NET_HASH_REPORT_NONE,
>> +\field{hash_report_tunnel} can report the type of the encapsulated
>> +packet to the driver over the inner header hash calculation.
> this sentence is a bit confused btw. so this reports
> the type of the encapsulated packet. but what does
> "over the inner header hash calculation." mean?
> what happens over what?
It just means "after the hash is computed, the encapsulation type can be
reported to the driver".
That phrase is redundant here, so I will remove "over the inner header hash
calculation".
Thanks.
>
>
>> +Possible values that the device can report in \field{hash_report_tunnel}
>> +are defined below:
>> +
>> +\begin{lstlisting}
>> +#define VIRTIO_NET_HASH_REPORT_TUNNEL_NONE 0
>> +#define VIRTIO_NET_HASH_REPORT_GRE 1
>> +#define VIRTIO_NET_HASH_REPORT_VXLAN 2
>> +#define VIRTIO_NET_HASH_REPORT_GENEVE 3
>> +\end{lstlisting}
>> +
>> +They correspond to supported hash types defined in
>> +\ref{sec:Device Types / Network Device / Device Operation / Processing of Incoming Packets / Hash calculation for incoming packets / Supported/enabled hash types}.
>> +
>> \subsubsection{Control Virtqueue}\label{sec:Device Types / Network Device / Device Operation / Control Virtqueue}
>>
>> The driver uses the control virtqueue (if VIRTIO_NET_F_CTRL_VQ is
>> diff --git a/introduction.tex b/introduction.tex
>> index 287c5fc..ff01a9b 100644
>> --- a/introduction.tex
>> +++ b/introduction.tex
>> @@ -98,6 +98,25 @@ \section{Normative References}\label{sec:Normative References}
>> \phantomsection\label{intro:SEC1}\textbf{[SEC1]} &
>> Standards for Efficient Cryptography Group(SECG), ``SEC1: Elliptic Curve Cryptography'', Version 1.0, September 2000.
>> \newline\url{https://www.secg.org/sec1-v2.pdf}\\
>> + \phantomsection\label{intro:GRE}\textbf{[GRE]} &
>> + Generic Routing Encapsulation
>> + \newline\url{https://datatracker.ietf.org/doc/rfc2784/}\\
>> + \phantomsection\label{intro:VXLAN}\textbf{[VXLAN]} &
>> + Virtual eXtensible Local Area Network
>> + \newline\url{https://datatracker.ietf.org/doc/rfc7348/}\\
>> + \phantomsection\label{intro:GENEVE}\textbf{[GENEVE]} &
>> + Generic Network Virtualization Encapsulation
>> + \newline\url{https://datatracker.ietf.org/doc/rfc8926/}\\
>> + \phantomsection\label{intro:IP}\textbf{[IP]} &
>> + INTERNET PROTOCOL
>> + \newline\url{https://www.rfc-editor.org/rfc/rfc791}\\
>> + \phantomsection\label{intro:UDP}\textbf{[UDP]} &
>> + User Datagram Protocol
>> + \newline\url{https://www.rfc-editor.org/rfc/rfc768}\\
>> + \phantomsection\label{intro:TCP}\textbf{[TCP]} &
>> + TRANSMISSION CONTROL PROTOCOL
>> + \newline\url{https://www.rfc-editor.org/rfc/rfc793}\\
>> +
>>
>> \end{longtable}
>>
>> --
>> 2.19.1.6.gb485710b
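
For readers following the thread, here is a minimal C sketch of how a driver
might interpret \field{hash_report_tunnel} as described in the quoted patch.
The report values are copied from the patch; the helper name
tunnel_report_name() and the parameter widths are assumptions made only for
this illustration and are not part of the proposal.

```c
#include <stdint.h>

/* Values as listed in the quoted v7 patch. */
#define VIRTIO_NET_HASH_REPORT_NONE         0
#define VIRTIO_NET_HASH_REPORT_TUNNEL_NONE  0
#define VIRTIO_NET_HASH_REPORT_GRE          1
#define VIRTIO_NET_HASH_REPORT_VXLAN        2
#define VIRTIO_NET_HASH_REPORT_GENEVE       3

/*
 * Hypothetical driver-side helper (not part of the patch).
 * The 16-bit parameter widths are an assumption of this sketch.
 */
static const char *tunnel_report_name(uint16_t hash_report,
                                      uint16_t hash_report_tunnel)
{
    /* No hash was calculated: the patch has the device report TUNNEL_NONE. */
    if (hash_report == VIRTIO_NET_HASH_REPORT_NONE)
        return "none";

    switch (hash_report_tunnel) {
    case VIRTIO_NET_HASH_REPORT_TUNNEL_NONE:
        return "none";      /* unencapsulated packet */
    case VIRTIO_NET_HASH_REPORT_GRE:
        return "GRE";
    case VIRTIO_NET_HASH_REPORT_VXLAN:
        return "VXLAN";
    case VIRTIO_NET_HASH_REPORT_GENEVE:
        return "GENEVE";
    default:
        return "unknown";   /* value not defined by the patch */
    }
}
```

The helper treats a missing hash as "no tunnel information", which mirrors the
patch text that sets \field{hash_report_tunnel} to
VIRTIO_NET_HASH_REPORT_TUNNEL_NONE whenever the hash was not calculated.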
^ permalink raw reply
* Re: [virtio-dev] Re: [virtio-comment] Re: [PATCH v7] virtio-net: support inner header hash
From: Heng Qi @ 2023-01-10 7:26 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: Jason Wang, virtio-comment, virtio-dev, Yuri Benditovich,
Cornelia Huck, Xuan Zhuo
In-Reply-To: <20230110005158-mutt-send-email-mst@kernel.org>
On Tue, Jan 10, 2023 at 12:57:38AM -0500, Michael S. Tsirkin wrote:
> On Tue, Jan 10, 2023 at 12:25:02AM -0500, Michael S. Tsirkin wrote:
> > > This will put extra pressure on the management stack, e.g. it requires
> > > the device to have an out-of-spec way for introspection.
> > >
> > > Thanks
> >
> > As I tried to explain this is already the case. Feature bits do not
> > describe device capabilities fully, some of them are in config space.
>
> To be precise, this does not necessarily require introspection, but
> it does require management control over config space
> such as supported hash types just like it has control over feature bits.
> E.g. QEMU currently seems to hard-code these to
> #define VIRTIO_NET_RSS_SUPPORTED_HASHES (VIRTIO_NET_RSS_HASH_TYPE_IPv4 | \
> VIRTIO_NET_RSS_HASH_TYPE_TCPv4 | \
> VIRTIO_NET_RSS_HASH_TYPE_UDPv4 | \
> VIRTIO_NET_RSS_HASH_TYPE_IPv6 | \
> VIRTIO_NET_RSS_HASH_TYPE_TCPv6 | \
> VIRTIO_NET_RSS_HASH_TYPE_UDPv6 | \
> VIRTIO_NET_RSS_HASH_TYPE_IP_EX | \
> VIRTIO_NET_RSS_HASH_TYPE_TCP_EX | \
> VIRTIO_NET_RSS_HASH_TYPE_UDP_EX)
>
> but there's no reason not to give management control over these.
Yes, QEMU has requirements for live migration: the PCI config space is
checked in get_pci_config_device(), and if src and dst are inconsistent, it
reports that the live migration failed.
We do the same within our group: live migration requires that the two VMs
have the same RSS configuration, otherwise the migration fails.
Therefore, it seems that we can simplify the description of VIRTIO_NET_F_HASH_TUNNEL to
"[VIRTIO_NET_F_HASH_TUNNEL(52)] Device supports inner header hash for tunnel-encapsulated packets.",
and use the supported hash_types to determine whether a migration can succeed.
Thanks.
>
> --
> MST
^ permalink raw reply
* Re: [virtio-dev] Re: [virtio-comment] Re: [PATCH v7] virtio-net: support inner header hash
From: Michael S. Tsirkin @ 2023-01-10 5:57 UTC (permalink / raw)
To: Jason Wang
Cc: Heng Qi, virtio-comment, virtio-dev, Yuri Benditovich,
Cornelia Huck, Xuan Zhuo
In-Reply-To: <20230110002017-mutt-send-email-mst@kernel.org>
On Tue, Jan 10, 2023 at 12:25:02AM -0500, Michael S. Tsirkin wrote:
> > This will put extra pressure on the management stack, e.g. it requires
> > the device to have an out-of-spec way for introspection.
> >
> > Thanks
>
> As I tried to explain this is already the case. Feature bits do not
> describe device capabilities fully, some of them are in config space.
To be precise, this does not necessarily require introspection, but
it does require management control over config space
such as supported hash types just like it has control over feature bits.
E.g. QEMU currently seems to hard-code these to
#define VIRTIO_NET_RSS_SUPPORTED_HASHES (VIRTIO_NET_RSS_HASH_TYPE_IPv4 | \
VIRTIO_NET_RSS_HASH_TYPE_TCPv4 | \
VIRTIO_NET_RSS_HASH_TYPE_UDPv4 | \
VIRTIO_NET_RSS_HASH_TYPE_IPv6 | \
VIRTIO_NET_RSS_HASH_TYPE_TCPv6 | \
VIRTIO_NET_RSS_HASH_TYPE_UDPv6 | \
VIRTIO_NET_RSS_HASH_TYPE_IP_EX | \
VIRTIO_NET_RSS_HASH_TYPE_TCP_EX | \
VIRTIO_NET_RSS_HASH_TYPE_UDP_EX)
but there's no reason not to give management control over these.
--
MST
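
To illustrate the point about management control, here is a minimal C sketch
in which the advertised hash types are the hard-coded set above narrowed by a
mask from the management layer. The bit positions follow the
VIRTIO_NET_HASH_TYPE_* definitions in the virtio-net spec; the function
advertised_hash_types() and its mgmt_mask parameter are hypothetical and do
not exist in QEMU.

```c
#include <stdint.h>

/* Bit positions as defined for the virtio-net hash types. */
#define VIRTIO_NET_RSS_HASH_TYPE_IPv4    (1u << 0)
#define VIRTIO_NET_RSS_HASH_TYPE_TCPv4   (1u << 1)
#define VIRTIO_NET_RSS_HASH_TYPE_UDPv4   (1u << 2)
#define VIRTIO_NET_RSS_HASH_TYPE_IPv6    (1u << 3)
#define VIRTIO_NET_RSS_HASH_TYPE_TCPv6   (1u << 4)
#define VIRTIO_NET_RSS_HASH_TYPE_UDPv6   (1u << 5)
#define VIRTIO_NET_RSS_HASH_TYPE_IP_EX   (1u << 6)
#define VIRTIO_NET_RSS_HASH_TYPE_TCP_EX  (1u << 7)
#define VIRTIO_NET_RSS_HASH_TYPE_UDP_EX  (1u << 8)

#define VIRTIO_NET_RSS_SUPPORTED_HASHES (VIRTIO_NET_RSS_HASH_TYPE_IPv4 | \
                                         VIRTIO_NET_RSS_HASH_TYPE_TCPv4 | \
                                         VIRTIO_NET_RSS_HASH_TYPE_UDPv4 | \
                                         VIRTIO_NET_RSS_HASH_TYPE_IPv6 | \
                                         VIRTIO_NET_RSS_HASH_TYPE_TCPv6 | \
                                         VIRTIO_NET_RSS_HASH_TYPE_UDPv6 | \
                                         VIRTIO_NET_RSS_HASH_TYPE_IP_EX | \
                                         VIRTIO_NET_RSS_HASH_TYPE_TCP_EX | \
                                         VIRTIO_NET_RSS_HASH_TYPE_UDP_EX)

/*
 * Hypothetical: mgmt_mask is chosen by the management layer, in the same
 * way it already chooses which feature bits to offer. Bits the device
 * cannot support are silently dropped.
 */
static uint32_t advertised_hash_types(uint32_t mgmt_mask)
{
    return VIRTIO_NET_RSS_SUPPORTED_HASHES & mgmt_mask;
}
```

With such a mask, management could force source and destination to expose
the same set of hash types in config space before attempting a migration.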
^ permalink raw reply
* Re: [virtio-dev] Re: [virtio-comment] Re: [PATCH v7] virtio-net: support inner header hash
From: Michael S. Tsirkin @ 2023-01-10 5:24 UTC (permalink / raw)
To: Jason Wang
Cc: Heng Qi, virtio-comment, virtio-dev, Yuri Benditovich,
Cornelia Huck, Xuan Zhuo
In-Reply-To: <CACGkMEvjo8t_MhEXvn_u3R827105E1nHDRWku9j1ARWfH7yUPA@mail.gmail.com>
On Tue, Jan 10, 2023 at 10:06:53AM +0800, Jason Wang wrote:
> On Mon, Jan 9, 2023 at 7:34 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> >
> > On Mon, Jan 09, 2023 at 04:59:26PM +0800, Jason Wang wrote:
> > > On Mon, Jan 9, 2023 at 10:43 AM Heng Qi <hengqi@linux.alibaba.com> wrote:
> > > >
> > > > On Fri, Jan 06, 2023 at 01:59:38AM -0500, Michael S. Tsirkin wrote:
> > > > > On Fri, Jan 06, 2023 at 02:42:21PM +0800, Heng Qi wrote:
> > > > > > On Fri, Jan 06, 2023 at 12:27:04AM -0500, Michael S. Tsirkin wrote:
> > > > > > > On Wed, Jan 04, 2023 at 03:14:01PM +0800, Heng Qi wrote:
> > > > > > > > If the tunnel is used to encapsulate the packets, the hash calculated
> > > > > > > > using the outer header of the receive packets is always fixed for the
> > > > > > > > same flow packets, i.e. they will be steered to the same receive queue.
> > > > > > > >
> > > > > > > > We add a feature bit VIRTIO_NET_F_HASH_TUNNEL and related bitmasks
> > > > > > > > in \field{hash_types}, which instructs the device to calculate the
> > > > > > > > hash using the inner headers of tunnel-encapsulated packets. Besides,
> > > > > > > > values in \field{hash_report_tunnel} are added to report tunnel types.
> > > > > > > >
> > > > > > > > Fixes: https://github.com/oasis-tcs/virtio-spec/issues/151
> > > > > > > >
> > > > > > > > Reviewed-by: Jason Wang <jasowang@redhat.com>
> > > > > > > > Signed-off-by: Heng Qi <hengqi@linux.alibaba.com>
> > > > > > > > Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> > > > > > >
> > > > > > >
> > > > > > > ok close to being ready. a couple of minor comments.
> > > > > > >
> > > > > > > > ---
> > > > > > > > v6:
> > > > > > > > 1. Modify the wording of some sentences for clarity. @Michael S. Tsirkin
> > > > > > > > 2. Fix some syntax issues. @Michael S. Tsirkin
> > > > > > > >
> > > > > > > > v5:
> > > > > > > > 1. Fix some syntax and capitalization issues. @Michael S. Tsirkin
> > > > > > > > > 2. Use encapsulated/encapsulation uniformly. @Michael S. Tsirkin
> > > > > > > > 3. Move the links to introduction section. @Michael S. Tsirkin
> > > > > > > > 4. Clarify some sentences. @Michael S. Tsirkin
> > > > > > > >
> > > > > > > > v4:
> > > > > > > > 1. Clarify some paragraphs. @Cornelia Huck
> > > > > > > > 2. Fix the u8 type. @Cornelia Huck
> > > > > > > >
> > > > > > > > v3:
> > > > > > > > 1. Rename VIRTIO_NET_F_HASH_GRE_VXLAN_GENEVE_INNER to VIRTIO_NET_F_HASH_TUNNEL. @Jason Wang
> > > > > > > > 2. Make things clearer. @Jason Wang @Michael S. Tsirkin
> > > > > > > > 3. Keep the possibility to use inner hash for automatic receive steering. @Jason Wang
> > > > > > > > 4. Add the "Tunnel packet" paragraph to avoid repeating the GRE etc. many times. @Michael S. Tsirkin
> > > > > > > >
> > > > > > > > v2:
> > > > > > > > 1. Add a feature bit for GRE/VXLAN/GENEVE inner hash. @Jason Wang
> > > > > > > > > 2. Change \field{hash_tunnel} to \field{hash_report_tunnel}. @Jason Wang, @Michael S. Tsirkin
> > > > > > > >
> > > > > > > > v1:
> > > > > > > > 1. Remove the patch for the bitmask fix. @Michael S. Tsirkin
> > > > > > > > 2. Clarify some paragraphs. @Jason Wang
> > > > > > > > 3. Add \field{hash_tunnel} and VIRTIO_NET_HASH_REPORT_GRE. @Yuri Benditovich
> > > > > > > >
> > > > > > > > content.tex | 191 +++++++++++++++++++++++++++++++++++++++++++++--
> > > > > > > > introduction.tex | 19 +++++
> > > > > > > > 2 files changed, 203 insertions(+), 7 deletions(-)
> > > > > > > >
> > > > > > > > diff --git a/content.tex b/content.tex
> > > > > > > > index e863709..7845f6c 100644
> > > > > > > > --- a/content.tex
> > > > > > > > +++ b/content.tex
> > > > > > > > @@ -3084,6 +3084,9 @@ \subsection{Feature bits}\label{sec:Device Types / Network Device / Feature bits
> > > > > > > > \item[VIRTIO_NET_F_CTRL_MAC_ADDR(23)] Set MAC address through control
> > > > > > > > channel.
> > > > > > > >
> > > > > > > > +\item[VIRTIO_NET_F_HASH_TUNNEL(52)] Device supports inner
> > > > > > > > + header hash for GRE, VXLAN and GENEVE tunnel-encapsulated packets.
> > > > > > >
> > > > > > > I would probably drop the list of tunnel types here.
> > > > > >
> > > > > > Do you mean to use "Device supports inner header hash for
> > > > > > tunnel-encapsulated packets." instead? Why? We do want to use this
> > > > > > feature bit to indicate that the device supports inner hashing of
> > > > > > GRE, VXLAN and GENEVE encapsulated packets. As in the v3 discussion
> > > > > > https://lists.oasis-open.org/archives/virtio-dev/202212/msg00024.html ,
> > > > > > we discussed using VIRTIO_NET_F_HASH_TUNNEL to replace
> > > > > > VIRTIO_NET_F_HASH_GRE_VXLAN_GENEVE_INNER and plan to use
> > > > > > VIRTIO_NET_F_HASH_TUNNEL_XYZ for future extensions.
> > > > >
> > > > > So imagine we add a new tunnel type. Let's say there's VXLAN v2.
> > > > > why would we need a new feature bit? I think a new hash type
> > > > > will be sufficient. No?
> > > >
> > > > If the description for VIRTIO_NET_F_HASH_TUNNEL is as follows:
> > > > "[VIRTIO_NET_F_HASH_TUNNEL(52)] Device supports inner header hash for tunnel-encapsulated packets.".
> > > > Then the following may happen
> > > > 1. For VXLANv2, if both device src and device dst have negotiated this feature, it is assumed that
> > > > device src supports VXLAN and VXLANv2, but device dst may only support VXLAN, not VXLANv2.
> > > > 2. For other encapsulation protocols such as ip in ip, after device src and device dst have
> > > > negotiated this feature, it is assumed that device src supports GRE, VXLAN, GENEVE and ip in ip,
> > > > but it is not clear that device dst also supports ip in ip. Especially when migrating, this can
> > > > lead to inconsistencies in live migrations.
> >
> > it does not matter. e.g. if device supports GRE but not VXLAN then
> > feature bit is set but hash types are different.
>
> But the hash will be used for RSS steering. If steering on ipip works
> on src but not on dst, it would break application logic.
Exactly. And for this reason just making sure feature bits match
is not sufficient for migration. It follows that hash types
must also match, and therefore there is no need to invent
new feature bits for migration if all we are doing is adding new hash types.
> > and this is not different from any other offload we have.
>
> But we have different feature bits for TCPv4/v6.
For segmentation but not for hash. Basically because we
don't have segmentation types analogous to hash types.
> >
> > Generally the only reason we even use a new feature bit for this
> > tunneling is because the header structure is a bit different
> > so this kind of makes sense.
> >
> > >
> > > Yes, this looks like the only way that I can think of to keep
> > > migration compatibility in an easy way.
> > >
> > > Thanks
> >
> > Jason migration compatibility must also check hash types.
> > There's no chance with the current hash offload scheme to
> > only rely on feature bits for migration compatibility.
>
> This will put extra pressure on the management stack, e.g. it requires
> the device to have an out-of-spec way for introspection.
>
> Thanks
As I tried to explain this is already the case. Feature bits do not
describe device capabilities fully, some of them are in config space.
> >
> >
> > > > So, I think it's better to keep the original description:
> > > > "[VIRTIO_NET_F_HASH_TUNNEL(52)] Device supports inner header hash for GRE, VXLAN and GENEVE tunnel-encapsulated packets."
> > > >
> > > > Thanks.
> > > >
> > > > >
> > > > > --
> > > > > MST
> > > > >
> > > > >
> > > > > ---------------------------------------------------------------------
> > > > > To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
> > > > > For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org
> > > >
> >
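
The argument above (matching feature bits alone is not sufficient; the
supported hash types in config space must match as well) can be sketched as
a compatibility check. This is only an illustration under stated assumptions:
struct net_dev_caps and migration_compatible() are hypothetical names, and
real management stacks compare far more state than these two fields.

```c
#include <stdbool.h>
#include <stdint.h>

/* Feature bit number proposed in the patch under discussion. */
#define VIRTIO_NET_F_HASH_TUNNEL 52

/* Hypothetical summary of what a device exposes to the driver. */
struct net_dev_caps {
    uint64_t features;              /* offered feature bits */
    uint32_t supported_hash_types;  /* from device config space */
};

/*
 * Hypothetical check: source and destination are treated as compatible
 * only if both the feature bits and the supported hash types match.
 */
static bool migration_compatible(const struct net_dev_caps *src,
                                 const struct net_dev_caps *dst)
{
    if (src->features != dst->features)
        return false;

    /* Same feature bits, but the hash types may still differ. */
    return src->supported_hash_types == dst->supported_hash_types;
}
```

Under this model, adding a new tunnel hash type changes
supported_hash_types and is caught by the second comparison, so no
additional feature bit is needed for the migration check itself.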
^ permalink raw reply
* Re: [virtio-dev] Re: [virtio-comment] Re: [PATCH v7] virtio-net: support inner header hash
From: Jason Wang @ 2023-01-10 2:06 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: Heng Qi, virtio-comment, virtio-dev, Yuri Benditovich,
Cornelia Huck, Xuan Zhuo
In-Reply-To: <20230109062943-mutt-send-email-mst@kernel.org>
On Mon, Jan 9, 2023 at 7:34 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Mon, Jan 09, 2023 at 04:59:26PM +0800, Jason Wang wrote:
> > On Mon, Jan 9, 2023 at 10:43 AM Heng Qi <hengqi@linux.alibaba.com> wrote:
> > >
> > > On Fri, Jan 06, 2023 at 01:59:38AM -0500, Michael S. Tsirkin wrote:
> > > > On Fri, Jan 06, 2023 at 02:42:21PM +0800, Heng Qi wrote:
> > > > > On Fri, Jan 06, 2023 at 12:27:04AM -0500, Michael S. Tsirkin wrote:
> > > > > > On Wed, Jan 04, 2023 at 03:14:01PM +0800, Heng Qi wrote:
> > > > > > > If the tunnel is used to encapsulate the packets, the hash calculated
> > > > > > > using the outer header of the receive packets is always fixed for the
> > > > > > > same flow packets, i.e. they will be steered to the same receive queue.
> > > > > > >
> > > > > > > We add a feature bit VIRTIO_NET_F_HASH_TUNNEL and related bitmasks
> > > > > > > in \field{hash_types}, which instructs the device to calculate the
> > > > > > > hash using the inner headers of tunnel-encapsulated packets. Besides,
> > > > > > > values in \field{hash_report_tunnel} are added to report tunnel types.
> > > > > > >
> > > > > > > Fixes: https://github.com/oasis-tcs/virtio-spec/issues/151
> > > > > > >
> > > > > > > Reviewed-by: Jason Wang <jasowang@redhat.com>
> > > > > > > Signed-off-by: Heng Qi <hengqi@linux.alibaba.com>
> > > > > > > Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> > > > > >
> > > > > >
> > > > > > ok close to being ready. a couple of minor comments.
> > > > > >
> > > > > > > ---
> > > > > > > v6:
> > > > > > > 1. Modify the wording of some sentences for clarity. @Michael S. Tsirkin
> > > > > > > 2. Fix some syntax issues. @Michael S. Tsirkin
> > > > > > >
> > > > > > > v5:
> > > > > > > 1. Fix some syntax and capitalization issues. @Michael S. Tsirkin
> > > > > > > > 2. Use encapsulated/encapsulation uniformly. @Michael S. Tsirkin
> > > > > > > 3. Move the links to introduction section. @Michael S. Tsirkin
> > > > > > > 4. Clarify some sentences. @Michael S. Tsirkin
> > > > > > >
> > > > > > > v4:
> > > > > > > 1. Clarify some paragraphs. @Cornelia Huck
> > > > > > > 2. Fix the u8 type. @Cornelia Huck
> > > > > > >
> > > > > > > v3:
> > > > > > > 1. Rename VIRTIO_NET_F_HASH_GRE_VXLAN_GENEVE_INNER to VIRTIO_NET_F_HASH_TUNNEL. @Jason Wang
> > > > > > > 2. Make things clearer. @Jason Wang @Michael S. Tsirkin
> > > > > > > 3. Keep the possibility to use inner hash for automatic receive steering. @Jason Wang
> > > > > > > 4. Add the "Tunnel packet" paragraph to avoid repeating the GRE etc. many times. @Michael S. Tsirkin
> > > > > > >
> > > > > > > v2:
> > > > > > > 1. Add a feature bit for GRE/VXLAN/GENEVE inner hash. @Jason Wang
> > > > > > > > 2. Change \field{hash_tunnel} to \field{hash_report_tunnel}. @Jason Wang, @Michael S. Tsirkin
> > > > > > >
> > > > > > > v1:
> > > > > > > 1. Remove the patch for the bitmask fix. @Michael S. Tsirkin
> > > > > > > 2. Clarify some paragraphs. @Jason Wang
> > > > > > > 3. Add \field{hash_tunnel} and VIRTIO_NET_HASH_REPORT_GRE. @Yuri Benditovich
> > > > > > >
> > > > > > > content.tex | 191 +++++++++++++++++++++++++++++++++++++++++++++--
> > > > > > > introduction.tex | 19 +++++
> > > > > > > 2 files changed, 203 insertions(+), 7 deletions(-)
> > > > > > >
> > > > > > > diff --git a/content.tex b/content.tex
> > > > > > > index e863709..7845f6c 100644
> > > > > > > --- a/content.tex
> > > > > > > +++ b/content.tex
> > > > > > > @@ -3084,6 +3084,9 @@ \subsection{Feature bits}\label{sec:Device Types / Network Device / Feature bits
> > > > > > > \item[VIRTIO_NET_F_CTRL_MAC_ADDR(23)] Set MAC address through control
> > > > > > > channel.
> > > > > > >
> > > > > > > +\item[VIRTIO_NET_F_HASH_TUNNEL(52)] Device supports inner
> > > > > > > + header hash for GRE, VXLAN and GENEVE tunnel-encapsulated packets.
> > > > > >
> > > > > > I would probably drop the list of tunnel types here.
> > > > >
> > > > > Do you mean to use "Device supports inner header hash for
> > > > > tunnel-encapsulated packets." instead? Why? We do want to use this
> > > > > feature bit to indicate that the device supports inner hashing of
> > > > > GRE, VXLAN and GENEVE encapsulated packets. As in the v3 discussion
> > > > > https://lists.oasis-open.org/archives/virtio-dev/202212/msg00024.html ,
> > > > > we discussed using VIRTIO_NET_F_HASH_TUNNEL to replace
> > > > > VIRTIO_NET_F_HASH_GRE_VXLAN_GENEVE_INNER and plan to use
> > > > > VIRTIO_NET_F_HASH_TUNNEL_XYZ for future extensions.
> > > >
> > > > So imagine we add a new tunnel type. Let's say there's VXLAN v2.
> > > > why would we need a new feature bit? I think a new hash type
> > > > will be sufficient. No?
> > >
> > > If the description for VIRTIO_NET_F_HASH_TUNNEL is as follows:
> > > "[VIRTIO_NET_F_HASH_TUNNEL(52)] Device supports inner header hash for tunnel-encapsulated packets.".
> > > Then the following may happen
> > > 1. For VXLANv2, if both device src and device dst have negotiated this feature, it is assumed that
> > > device src supports VXLAN and VXLANv2, but device dst may only support VXLAN, not VXLANv2.
> > > 2. For other encapsulation protocols such as ip in ip, after device src and device dst have
> > > negotiated this feature, it is assumed that device src supports GRE, VXLAN, GENEVE and ip in ip,
> > > but it is not clear that device dst also supports ip in ip. Especially when migrating, this can
> > > lead to inconsistencies in live migrations.
>
> it does not matter. e.g. if device supports GRE but not VXLAN then
> feature bit is set but hash types are different.
But the hash will be used for RSS steering. If steering on ipip works
on src but not on dst, it would break application logic.
> and this is not different from any other offload we have.
But we have different feature bits for TCPv4/v6.
>
> Generally the only reason we even use a new feature bit for this
> tunneling is because the header structure is a bit different
> so this kind of makes sense.
>
> >
> > Yes, this looks like the only way that I can think of to keep
> > migration compatibility in an easy way.
> >
> > Thanks
>
> Jason migration compatibility must also check hash types.
> There's no chance with the current hash offload scheme to
> only rely on feature bits for migration compatibility.
This will put extra pressure on the management stack, e.g. it requires
the device to have an out-of-spec way for introspection.
Thanks
>
>
> > > So, I think it's better to keep the original description:
> > > "[VIRTIO_NET_F_HASH_TUNNEL(52)] Device supports inner header hash for GRE, VXLAN and GENEVE tunnel-encapsulated packets."
> > >
> > > Thanks.
> > >
> > > >
> > > > --
> > > > MST
> > > >
> > > >
> > > > ---------------------------------------------------------------------
> > > > To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
> > > > For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org
> > >
>
^ permalink raw reply
* [virtio-comment] RE: [PATCH v1 01/20] virtio-network: Maintain network device spec in separate directory
From: Parav Pandit @ 2023-01-09 22:41 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: virtio-dev@lists.oasis-open.org,
virtio-comment@lists.oasis-open.org
In-Reply-To: <20230109140825-mutt-send-email-mst@kernel.org>
> From: Michael S. Tsirkin <mst@redhat.com>
> Sent: Monday, January 9, 2023 2:15 PM
>
> > > So my preference is \\input for now.
> > \\input doesn't support reading from the directory.
>
> I just tested this and it seems to work fine:
>
> \input{sub/file.tex}
>
> what is the issue that you see?
I was using the file name file_conformance.tex, and it doesn't like names containing underscores.
I lost track when I moved away from underscores and forgot to switch back to \input.
>
> I think I prefer \\input as it's easier to understand.
Yes. I will respin with \input and directories.
>
> --
> MST
This publicly archived list offers a means to provide input to the
OASIS Virtual I/O Device (VIRTIO) TC.
In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.
Subscribe: virtio-comment-subscribe@lists.oasis-open.org
Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
List help: virtio-comment-help@lists.oasis-open.org
List archive: https://lists.oasis-open.org/archives/virtio-comment/
Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
Committee: https://www.oasis-open.org/committees/virtio/
Join OASIS: https://www.oasis-open.org/join/
^ permalink raw reply
* [virtio-dev] Re: [PATCH v8] virtio-network: Clarify VLAN filter table configuration
From: Si-Wei Liu @ 2023-01-09 21:04 UTC (permalink / raw)
To: Parav Pandit, mst, virtio-dev; +Cc: virtio-comment
In-Reply-To: <20230109163301.463208-1-parav@nvidia.com>
On 1/9/2023 8:33 AM, Parav Pandit wrote:
> The filtering behavior of the VLAN filter commands is not very clear as
> discussed in thread [1].
>
> Hence, add the command description and device requirements for it.
>
> [1] https://www.mail-archive.com/qemu-devel@nongnu.org/msg912392.html
> Fixes: https://github.com/oasis-tcs/virtio-spec/issues/147
> Suggested-by: Si-Wei Liu <si-wei.liu@oracle.com>
> Signed-off-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Si-Wei Liu <si-wei.liu@oracle.com>
-Siwei
> ---
> changelog:
> v7->v8:
> - Fixed grammar
> v6->v7:
> - Moved VLAN filter table description from requirements to device
> description section
> - Added MUST and SHOULD to device requirements
> v5->v6:
> - removed unwanted article
> v4->v5:
> - reworded 'vlan filtering table' to 'vlan filter table' to match
> to the existing description about vlan filtering
> - remove confusing text around VLAN_DEL command description
> - added missing article
> - reword device match configuration to device configuration
> - changed 'found' to 'present' and 'not found' to 'absent' to
> consider vlan filter table as config table rather
> than search table
> v3->v4:
> - added description for accepting vlan tagged packets when vlan
> filter is not negotiated
> v2->v3:
> - corrected grammar
> - simplified description for untagged packets
> v1->v2:
> - adapt to new file path
> v0->v1:
> - added missing conformance section link
> ---
> .../virtio-network/device-conformance.tex | 1 +
> device-types/virtio-network/device.tex | 22 ++++++++++++++++++-
> 2 files changed, 22 insertions(+), 1 deletion(-)
>
> diff --git a/device-types/virtio-network/device-conformance.tex b/device-types/virtio-network/device-conformance.tex
> index c686377..54f6783 100644
> --- a/device-types/virtio-network/device-conformance.tex
> +++ b/device-types/virtio-network/device-conformance.tex
> @@ -9,6 +9,7 @@
> \item \ref{devicenormative:Device Types / Network Device / Device Operation / Processing of Incoming Packets}
> \item \ref{devicenormative:Device Types / Network Device / Device Operation / Control Virtqueue / Packet Receive Filtering}
> \item \ref{devicenormative:Device Types / Network Device / Device Operation / Control Virtqueue / Setting MAC Address Filtering}
> +\item \ref{devicenormative:Device Types / Network Device / Device Operation / Control Virtqueue / VLAN Filtering}
> \item \ref{devicenormative:Device Types / Network Device / Device Operation / Control Virtqueue / Gratuitous Packet Sending}
> \item \ref{devicenormative:Device Types / Network Device / Device Operation / Control Virtqueue / Automatic receive steering in multiqueue mode}
> \item \ref{devicenormative:Device Types / Network Device / Device Operation / Control Virtqueue / Receive-side scaling (RSS) / RSS processing}
> diff --git a/device-types/virtio-network/device.tex b/device-types/virtio-network/device.tex
> index e0637c5..7c89f30 100644
> --- a/device-types/virtio-network/device.tex
> +++ b/device-types/virtio-network/device.tex
> @@ -1194,7 +1194,11 @@ \subsubsection{Control Virtqueue}\label{sec:Device Types / Network Device / Devi
> \paragraph{VLAN Filtering}\label{sec:Device Types / Network Device / Device Operation / Control Virtqueue / VLAN Filtering}
>
> If the driver negotiates the VIRTIO_NET_F_CTRL_VLAN feature, it
> -can control a VLAN filter table in the device.
> +can control a VLAN filter table in the device. The VLAN filter
> +table applies only to VLAN tagged packets.
> +
> +When VIRTIO_NET_F_CTRL_VLAN is negotiated, the device starts with
> +an empty VLAN filter table.
>
> \begin{note}
> Similar to the MAC address based filtering, the VLAN filtering
> @@ -1210,6 +1214,22 @@ \subsubsection{Control Virtqueue}\label{sec:Device Types / Network Device / Devi
> Both the VIRTIO_NET_CTRL_VLAN_ADD and VIRTIO_NET_CTRL_VLAN_DEL
> command take a little-endian 16-bit VLAN id as the command-specific-data.
>
> +VIRTIO_NET_CTRL_VLAN_ADD command adds the specified VLAN to the
> +VLAN filter table.
> +
> +VIRTIO_NET_CTRL_VLAN_DEL command removes the specified VLAN from
> +the VLAN filter table.
> +
> +\devicenormative{\subparagraph}{VLAN Filtering}{Device Types / Network Device / Device Operation / Control Virtqueue / VLAN Filtering}
> +
> +When VIRTIO_NET_F_CTRL_VLAN is not negotiated, the device MUST
> +accept all VLAN tagged packets as per the device configuration.
> +
> +When VIRTIO_NET_F_CTRL_VLAN is negotiated, the device MUST
> +accept all VLAN tagged packets whose VLAN tag is present in
> +the VLAN filter table and SHOULD drop all VLAN tagged packets
> +whose VLAN tag is absent in the VLAN filter table.
> +
> \subparagraph{Legacy Interface: VLAN Filtering}\label{sec:Device Types / Network Device / Device Operation / Control Virtqueue / VLAN Filtering / Legacy Interface: VLAN Filtering}
> When using the legacy interface, transitional devices and drivers
> MUST format the VLAN id
---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org
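
As a reading aid for the reviewed patch, here is a minimal C sketch of the
VLAN filter table behaviour it describes: the table starts empty, ADD and DEL
change its contents, it applies only to VLAN tagged packets, and all tagged
packets pass when VIRTIO_NET_F_CTRL_VLAN is not negotiated. The structure and
function names are hypothetical, the bitmap representation is an assumption,
and ignoring VLAN ids above 4095 is a choice made for this sketch only. Other
receive filters (for example MAC filtering) are not modelled.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define VLAN_ID_COUNT 4096  /* 12-bit VLAN id space */

/* Hypothetical device-side state for this sketch. */
struct vlan_filter {
    bool ctrl_vlan_negotiated;          /* VIRTIO_NET_F_CTRL_VLAN */
    uint8_t bitmap[VLAN_ID_COUNT / 8];  /* one bit per VLAN id */
};

/* The table starts empty when the feature is negotiated. */
static void vlan_filter_init(struct vlan_filter *f, bool negotiated)
{
    f->ctrl_vlan_negotiated = negotiated;
    memset(f->bitmap, 0, sizeof(f->bitmap));
}

/* Models VIRTIO_NET_CTRL_VLAN_ADD. */
static void vlan_filter_add(struct vlan_filter *f, uint16_t vid)
{
    if (vid >= VLAN_ID_COUNT)
        return;  /* sketch-only choice: ignore out-of-range ids */
    f->bitmap[vid / 8] |= (uint8_t)(1u << (vid % 8));
}

/* Models VIRTIO_NET_CTRL_VLAN_DEL. */
static void vlan_filter_del(struct vlan_filter *f, uint16_t vid)
{
    if (vid >= VLAN_ID_COUNT)
        return;
    f->bitmap[vid / 8] &= (uint8_t)~(1u << (vid % 8));
}

/* VLAN part of the receive decision only. */
static bool vlan_filter_accepts(const struct vlan_filter *f,
                                bool tagged, uint16_t vid)
{
    if (!tagged)
        return true;   /* the table applies only to tagged packets */
    if (!f->ctrl_vlan_negotiated)
        return true;   /* not negotiated: accept all tagged packets */
    if (vid >= VLAN_ID_COUNT)
        return false;
    /* Present: MUST accept. Absent: SHOULD drop (dropped here). */
    return (f->bitmap[vid / 8] >> (vid % 8)) & 1u;
}
```

Because dropping absent VLANs is only a SHOULD in the patch, a device that
accepts them would still conform; the sketch picks the recommended behaviour.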
^ permalink raw reply
* [virtio-comment] Re: [PATCH v1 01/20] virtio-network: Maintain network device spec in separate directory
From: Michael S. Tsirkin @ 2023-01-09 19:14 UTC (permalink / raw)
To: Parav Pandit
Cc: virtio-dev@lists.oasis-open.org,
virtio-comment@lists.oasis-open.org
In-Reply-To: <PH0PR12MB54814FCB63E60BE7F5985A15DCFE9@PH0PR12MB5481.namprd12.prod.outlook.com>
On Mon, Jan 09, 2023 at 02:12:36PM +0000, Parav Pandit wrote:
>
> > From: Michael S. Tsirkin <mst@redhat.com>
> > Sent: Monday, January 9, 2023 8:43 AM
> >
> > I found another issue with this. Currently for redline diff generation
> > we use latexpand. Using a flat expanded file has lots of benefits, in
> > particular latexdiff is sometimes fragile as it is, with a flat file
> > one can at least see the input it gets.
> >
> > We should either stick with \\input or more work is needed on
> > these scripts. Besides, we are already using \\input and I like
> > consistency.
> >
> > So my preference is \\input for now.
> \\input doesn't support reading from the directory.
I just tested this and it seems to work fine:
\input{sub/file.tex}
what is the issue that you see?
I think I prefer \\input as it's easier to understand.
--
MST
^ permalink raw reply
* Re: [PATCH v2 00/20] Split device spec to its individual files
From: Michael S. Tsirkin @ 2023-01-09 17:25 UTC (permalink / raw)
To: Parav Pandit; +Cc: virtio-dev, cohuck, virtio-comment
In-Reply-To: <20230109162849.463101-1-parav@nvidia.com>
On Mon, Jan 09, 2023 at 06:28:29PM +0200, Parav Pandit wrote:
> Several of the more recent device specifications are maintained
> in their own specification files. Such separate files enable better
> maintenance of the specification overall.
> However, several of the initial virtio device specifications
> are located in a single file.
>
> Hence, split them into their individual files.
>
> Additionally, each device's driver and device conformance is
> present in one giant conformance file all together.
>
> As Michael suggest's move this device and driver conformance
> section adjacent to device specification in each device specific
> directory. This further makes device specification self-contained.
>
> Added patch to fix spelling errors in network device
> specification which was inherited from its previous file
> location.
>
> Patches do not change any part of the specification outcome
> except fixing the spelling errors.
> It only changes how the specification is maintained.
>
> patch summary:
> -------------
> patches 1 to 7 create new files for moving device specs out of
> the content and conformance files.
> patches 8 to 20 move existing dedicated spec files to new directories
> and create per-device driver and device conformance files.
>
> changelog:
> ----------
> v1->v2:
> - removed extra blank lines in network and block device files
> - added missing device conformance link for rpmb, sound, i2c and
> gpio devices
> v0->v1:
> - move device spec to their own directory
> - added split files for conformance and placed them adjacent to
> device spec
> - added patch to fix spelling errors in network device
>
> Parav Pandit (20):
> virtio-network: Maintain network device spec in separate directory
> virtio-network: Fix spelling errors
> virtio-block: Maintain block device spec in separate directory
> virtio-console: Maintain console device spec in separate directory
> virtio-entropy: Maintain entropy device spec in separate directory
> virtio-mem-balloon: Maintain mem balloon device spec in separate
> directory
> virtio-scsi: Maintain scsi host device spec in separate directory
> virtio-gpu: Maintain gpu device spec in separate directory
> virtio-input: Maintain input device spec in separate directory
> virtio-crypto: Maintain crypto device spec in separate directory
> virtio-vsock: Maintain socket device spec in separate directory
> virtio-fs: Maintain file system device spec in separate directory
> virtio-rpmb: Maintain rpmb device spec in separate directory
> virtio-iommu: Maintain iommu device spec in separate directory
> virtio-sound: Maintain sound device spec in separate directory
> virtio-mem: Maintain memory device spec in separate directory
> virtio-i2c: Maintain i2c device spec in separate directory
> virtio-scmi: Maintain scmi device spec in separate directory
> virtio-gpio: Maintain gpio device spec in separate directory
> virtio-pmem: Maintain pmem device spec in separate directory
>
> conformance.tex | 456 +-
> content.tex | 4561 +----------------
> .../virtio-block/device-conformance.tex | 8 +
> device-types/virtio-block/device.tex | 1313 +++++
> .../virtio-block/driver-conformance.tex | 8 +
> .../virtio-console/device-conformance.tex | 8 +
> device-types/virtio-console/device.tex | 231 +
> .../virtio-console/driver-conformance.tex | 8 +
> .../virtio-crypto/device-conformance.tex | 13 +
> .../virtio-crypto/device.tex | 0
> .../virtio-crypto/driver-conformance.tex | 14 +
> .../virtio-entropy/device-conformance.tex | 7 +
> device-types/virtio-entropy/device.tex | 42 +
> .../virtio-entropy/driver-conformance.tex | 7 +
> device-types/virtio-fs/device-conformance.tex | 9 +
> .../virtio-fs/device.tex | 0
> device-types/virtio-fs/driver-conformance.tex | 10 +
> .../virtio-gpio/device-conformance.tex | 9 +
> .../virtio-gpio/device.tex | 0
> .../virtio-gpio/driver-conformance.tex | 9 +
> .../virtio-gpu/device-conformance.tex | 8 +
> .../virtio-gpu/device.tex | 0
> .../virtio-i2c/device-conformance.tex | 7 +
> .../virtio-i2c/device.tex | 0
> .../virtio-i2c/driver-conformance.tex | 7 +
> .../virtio-input/device-conformance.tex | 8 +
> .../virtio-input/device.tex | 0
> .../virtio-input/driver-conformance.tex | 8 +
> .../virtio-iommu/device-conformance.tex | 16 +
> .../virtio-iommu/device.tex | 0
> .../virtio-iommu/driver-conformance.tex | 17 +
> .../virtio-mem-balloon/device-conformance.tex | 12 +
> device-types/virtio-mem-balloon/device.tex | 634 +++
> .../virtio-mem-balloon/driver-conformance.tex | 12 +
> .../virtio-mem/device-conformance.tex | 13 +
> .../virtio-mem/device.tex | 0
> .../virtio-mem/driver-conformance.tex | 13 +
> .../virtio-network/device-conformance.tex | 16 +
> device-types/virtio-network/device.tex | 1594 ++++++
> .../virtio-network/driver-conformance.tex | 17 +
> .../virtio-pmem/device-conformance.tex | 9 +
> .../virtio-pmem/device.tex | 0
> .../virtio-pmem/driver-conformance.tex | 7 +
> .../virtio-rpmb/device-conformance.tex | 13 +
> .../virtio-rpmb/device.tex | 0
> .../virtio-rpmb/driver-conformance.tex | 7 +
> .../virtio-scmi/device-conformance.tex | 10 +
> .../virtio-scmi/device.tex | 0
> .../virtio-scmi/driver-conformance.tex | 8 +
> .../virtio-scsi/device-conformance.tex | 10 +
> device-types/virtio-scsi/device.tex | 709 +++
> .../virtio-scsi/driver-conformance.tex | 9 +
> .../virtio-sound/device-conformance.tex | 16 +
> .../virtio-sound/device.tex | 0
> .../virtio-sound/driver-conformance.tex | 13 +
> .../virtio-vsock/device-conformance.tex | 9 +
> .../virtio-vsock/device.tex | 0
> .../virtio-vsock/driver-conformance.tex | 10 +
> virtio.tex | 1 +
> 59 files changed, 4965 insertions(+), 4961 deletions(-)
Does makediff still work? Documentation says latexpand does not
support import. Without latexdiff, generating redlined versions would
be very difficult.
I am also worried about consistency since we
already use \input.
If using \input means putting everything in a single directory,
that's a small price to pay:
virtio-sound.tex + virtio-sound-conformance.tex
is not fundamentally worse than device-types/virtio-sound/device.tex
and device-types/virtio-sound/device-conformance.tex
and it avoids the duplicated "device" in the name.
Previously it looked like a cosmetic issue, but now it looks
like it's important.
> create mode 100644 device-types/virtio-block/device-conformance.tex
> create mode 100644 device-types/virtio-block/device.tex
> create mode 100644 device-types/virtio-block/driver-conformance.tex
> create mode 100644 device-types/virtio-console/device-conformance.tex
> create mode 100644 device-types/virtio-console/device.tex
> create mode 100644 device-types/virtio-console/driver-conformance.tex
> create mode 100644 device-types/virtio-crypto/device-conformance.tex
> rename virtio-crypto.tex => device-types/virtio-crypto/device.tex (100%)
> create mode 100644 device-types/virtio-crypto/driver-conformance.tex
> create mode 100644 device-types/virtio-entropy/device-conformance.tex
> create mode 100644 device-types/virtio-entropy/device.tex
> create mode 100644 device-types/virtio-entropy/driver-conformance.tex
> create mode 100644 device-types/virtio-fs/device-conformance.tex
> rename virtio-fs.tex => device-types/virtio-fs/device.tex (100%)
> create mode 100644 device-types/virtio-fs/driver-conformance.tex
> create mode 100644 device-types/virtio-gpio/device-conformance.tex
> rename virtio-gpio.tex => device-types/virtio-gpio/device.tex (100%)
> create mode 100644 device-types/virtio-gpio/driver-conformance.tex
> create mode 100644 device-types/virtio-gpu/device-conformance.tex
> rename virtio-gpu.tex => device-types/virtio-gpu/device.tex (100%)
> create mode 100644 device-types/virtio-i2c/device-conformance.tex
> rename virtio-i2c.tex => device-types/virtio-i2c/device.tex (100%)
> create mode 100644 device-types/virtio-i2c/driver-conformance.tex
> create mode 100644 device-types/virtio-input/device-conformance.tex
> rename virtio-input.tex => device-types/virtio-input/device.tex (100%)
> create mode 100644 device-types/virtio-input/driver-conformance.tex
> create mode 100644 device-types/virtio-iommu/device-conformance.tex
> rename virtio-iommu.tex => device-types/virtio-iommu/device.tex (100%)
> create mode 100644 device-types/virtio-iommu/driver-conformance.tex
> create mode 100644 device-types/virtio-mem-balloon/device-conformance.tex
> create mode 100644 device-types/virtio-mem-balloon/device.tex
> create mode 100644 device-types/virtio-mem-balloon/driver-conformance.tex
> create mode 100644 device-types/virtio-mem/device-conformance.tex
> rename virtio-mem.tex => device-types/virtio-mem/device.tex (100%)
> create mode 100644 device-types/virtio-mem/driver-conformance.tex
> create mode 100644 device-types/virtio-network/device-conformance.tex
> create mode 100644 device-types/virtio-network/device.tex
> create mode 100644 device-types/virtio-network/driver-conformance.tex
> create mode 100644 device-types/virtio-pmem/device-conformance.tex
> rename virtio-pmem.tex => device-types/virtio-pmem/device.tex (100%)
> create mode 100644 device-types/virtio-pmem/driver-conformance.tex
> create mode 100644 device-types/virtio-rpmb/device-conformance.tex
> rename virtio-rpmb.tex => device-types/virtio-rpmb/device.tex (100%)
> create mode 100644 device-types/virtio-rpmb/driver-conformance.tex
> create mode 100644 device-types/virtio-scmi/device-conformance.tex
> rename virtio-scmi.tex => device-types/virtio-scmi/device.tex (100%)
> create mode 100644 device-types/virtio-scmi/driver-conformance.tex
> create mode 100644 device-types/virtio-scsi/device-conformance.tex
> create mode 100644 device-types/virtio-scsi/device.tex
> create mode 100644 device-types/virtio-scsi/driver-conformance.tex
> create mode 100644 device-types/virtio-sound/device-conformance.tex
> rename virtio-sound.tex => device-types/virtio-sound/device.tex (100%)
> create mode 100644 device-types/virtio-sound/driver-conformance.tex
> create mode 100644 device-types/virtio-vsock/device-conformance.tex
> rename virtio-vsock.tex => device-types/virtio-vsock/device.tex (100%)
> create mode 100644 device-types/virtio-vsock/driver-conformance.tex
>
> --
> 2.26.2