From: "Michael S. Tsirkin" <mst@redhat.com>
To: Alberto Faria <afaria@redhat.com>
Cc: virtio-comment@lists.linux.dev, Daniel Verkamp <dverkamp@chromium.org>
Subject: Re: [PATCH v2] virtio-blk: Add a VIRTIO_BLK_T_OUT_FUA request type
Date: Thu, 8 May 2025 02:35:47 -0400 [thread overview]
Message-ID: <20250508023438-mutt-send-email-mst@kernel.org> (raw)
In-Reply-To: <20250508003409.426807-1-afaria@redhat.com>
On Thu, May 08, 2025 at 01:34:09AM +0100, Alberto Faria wrote:
> This is a variant of the VIRTIO_BLK_T_OUT request type that further
> ensures the write becomes stable once the request completes, commonly
> known as a Force Unit Access (FUA) request.
>
> Also add a VIRTIO_BLK_F_OUT_FUA feature bit signaling support for this
> request type.
>
> Signed-off-by: Alberto Faria <afaria@redhat.com>
Thanks for the patch!
Could you pls expand on the motivation for the feature?
It's a good idea to add this info in qemu and linux patches, too.
Thanks!
> ---
> Matching kernel and qemu RFCs:
> - https://lore.kernel.org/linux-block/20250508001951.421467-1-afaria@redhat.com/
> - https://lore.kernel.org/qemu-devel/20250508002440.423776-1-afaria@redhat.com/
>
> v2:
> - Redefine VIRTIO_BLK_T_OUT_FUA to 27 since 15 is already in use.
> - Clarify that the cache mode has no impact on VIRTIO_BLK_T_OUT_FUA
> semantics.
> - Allow drivers to negotiate VIRTIO_BLK_F_OUT_FUA even if they are
> incapable of sending VIRTIO_BLK_T_OUT_FUA commands.
>
> device-types/blk/description.tex | 114 +++++++++++++++++--------------
> 1 file changed, 64 insertions(+), 50 deletions(-)
>
> diff --git a/device-types/blk/description.tex b/device-types/blk/description.tex
> index 2712ada..d3f4f37 100644
> --- a/device-types/blk/description.tex
> +++ b/device-types/blk/description.tex
> @@ -66,6 +66,8 @@ \subsection{Feature bits}\label{sec:Device Types / Block Device / Feature bits}
> (ZNS). For brevity, these standard documents are referred as "ZBD standards"
> from this point on in the text.
>
> +\item[VIRTIO_BLK_F_OUT_FUA (18)] Force Unit Access (FUA) command support.
> +
> \end{description}
>
> \subsubsection{Legacy Interface: Feature bits}\label{sec:Device Types / Block Device / Feature bits / Legacy Interface: Feature bits}
> @@ -395,9 +397,9 @@ \subsection{Device Initialization}\label{sec:Device Types / Block Device / Devic
> VIRTIO_BLK_T_ZONE_APPEND requests.
>
> \item the \field{write_granularity} field of \field{zoned} MUST be set by the
> - device to the offset and size alignment constraint for VIRTIO_BLK_T_OUT
> - and VIRTIO_BLK_T_ZONE_APPEND requests issued to a sequential zone of the
> - device.
> + device to the offset and size alignment constraint for VIRTIO_BLK_T_OUT,
> + VIRTIO_BLK_T_ZONE_APPEND, and VIRTIO_BLK_T_OUT_FUA requests issued to a
> + sequential zone of the device.
>
> \item the device MUST initialize padding bytes \field{unused2} to 0.
> \end{itemize}
> @@ -445,8 +447,9 @@ \subsection{Device Operation}\label{sec:Device Types / Block Device / Device Ope
> (VIRTIO_BLK_T_OUT), a discard (VIRTIO_BLK_T_DISCARD), a write zeroes
> (VIRTIO_BLK_T_WRITE_ZEROES), a flush (VIRTIO_BLK_T_FLUSH), a get device ID
> string command (VIRTIO_BLK_T_GET_ID), a secure erase
> -(VIRTIO_BLK_T_SECURE_ERASE), or a get device lifetime command
> -(VIRTIO_BLK_T_GET_LIFETIME).
> +(VIRTIO_BLK_T_SECURE_ERASE), a get device lifetime command
> +(VIRTIO_BLK_T_GET_LIFETIME), or a Force Unit Access write
> +(VIRTIO_BLK_T_OUT_FUA).
>
> \begin{lstlisting}
> #define VIRTIO_BLK_T_IN 0
> @@ -457,6 +460,7 @@ \subsection{Device Operation}\label{sec:Device Types / Block Device / Device Ope
> #define VIRTIO_BLK_T_DISCARD 11
> #define VIRTIO_BLK_T_WRITE_ZEROES 13
> #define VIRTIO_BLK_T_SECURE_ERASE 14
> +#define VIRTIO_BLK_T_OUT_FUA 27
> \end{lstlisting}
>
> The \field{sector} number indicates the offset (multiplied by 512) where
> @@ -464,9 +468,9 @@ \subsection{Device Operation}\label{sec:Device Types / Block Device / Device Ope
> commands other than read, write and some zone operations.
>
> VIRTIO_BLK_T_IN requests populate \field{data} with the contents of sectors
> -read from the block device (in multiples of 512 bytes). VIRTIO_BLK_T_OUT
> -requests write the contents of \field{data} to the block device (in multiples
> -of 512 bytes).
> +read from the block device (in multiples of 512 bytes). VIRTIO_BLK_T_OUT and
> +VIRTIO_BLK_T_OUT_FUA requests write the contents of \field{data} to the block
> +device (in multiples of 512 bytes).
>
> The \field{data} used for discard, secure erase or write zeroes commands
> consists of one or more segments. The maximum number of segments is
> @@ -566,11 +570,12 @@ \subsection{Device Operation}\label{sec:Device Types / Block Device / Device Ope
>
> Requests of type VIRTIO_BLK_T_OUT, VIRTIO_BLK_T_ZONE_OPEN,
> VIRTIO_BLK_T_ZONE_CLOSE, VIRTIO_BLK_T_ZONE_FINISH, VIRTIO_BLK_T_ZONE_APPEND,
> -VIRTIO_BLK_T_ZONE_RESET or VIRTIO_BLK_T_ZONE_RESET_ALL may be completed by the
> -device with VIRTIO_BLK_S_OK, VIRTIO_BLK_S_IOERR or VIRTIO_BLK_S_UNSUPP
> -\field{status}, or, additionally, with VIRTIO_BLK_S_ZONE_INVALID_CMD,
> -VIRTIO_BLK_S_ZONE_UNALIGNED_WP, VIRTIO_BLK_S_ZONE_OPEN_RESOURCE or
> -VIRTIO_BLK_S_ZONE_ACTIVE_RESOURCE ZBD-specific status codes.
> +VIRTIO_BLK_T_ZONE_RESET, VIRTIO_BLK_T_ZONE_RESET_ALL or VIRTIO_BLK_T_OUT_FUA may
> +be completed by the device with VIRTIO_BLK_S_OK, VIRTIO_BLK_S_IOERR or
> +VIRTIO_BLK_S_UNSUPP \field{status}, or, additionally, with
> +VIRTIO_BLK_S_ZONE_INVALID_CMD, VIRTIO_BLK_S_ZONE_UNALIGNED_WP,
> +VIRTIO_BLK_S_ZONE_OPEN_RESOURCE or VIRTIO_BLK_S_ZONE_ACTIVE_RESOURCE
> +ZBD-specific status codes.
>
> Besides the request status, VIRTIO_BLK_T_ZONE_APPEND requests return the
> starting sector of the appended data back to the driver. For this reason,
> @@ -672,8 +677,8 @@ \subsection{Device Operation}\label{sec:Device Types / Block Device / Device Ope
>
> Zones with VIRTIO_BLK_ZT_SWP can be read randomly and should be written
> sequentially, similarly to SWR zones. However, SWP zones can accept random write
> -operations, that is, VIRTIO_BLK_T_OUT requests with a start sector different
> -from the zone write pointer position.
> +operations, that is, VIRTIO_BLK_T_OUT and VIRTIO_BLK_T_OUT_FUA requests with a
> +start sector different from the zone write pointer position.
>
> The field \field{z_state} of \field{virtio_blk_zone_descriptor} indicates the
> state of the device zone.
> @@ -731,8 +736,9 @@ \subsection{Device Operation}\label{sec:Device Types / Block Device / Device Ope
> equal to the sector address of the zone. In this state, the entire capacity of
> the zone is available for writing. A zone can transition from this state to
> \begin{itemize}
> -\item VIRTIO_BLK_ZS_IOPEN when a successful VIRTIO_BLK_T_OUT request or
> - VIRTIO_BLK_T_ZONE_APPEND with a non-zero data size is received for the zone.
> +\item VIRTIO_BLK_ZS_IOPEN when a successful VIRTIO_BLK_T_OUT or
> + VIRTIO_BLK_T_OUT_FUA request or VIRTIO_BLK_T_ZONE_APPEND with a non-zero
> + data size is received for the zone.
>
> \item VIRTIO_BLK_ZS_EOPEN when a successful VIRTIO_BLK_T_ZONE_OPEN request is
> received for the zone
> @@ -763,9 +769,9 @@ \subsection{Device Operation}\label{sec:Device Types / Block Device / Device Ope
> \item VIRTIO_BLK_ZS_FULL when a successful VIRTIO_BLK_T_ZONE_FINISH request is
> received for the zone.
>
> -\item VIRTIO_BLK_ZS_FULL when a successful VIRTIO_BLK_T_OUT or
> - VIRTIO_BLK_T_ZONE_APPEND request that causes the zone to reach its writable
> - capacity is received for the zone.
> +\item VIRTIO_BLK_ZS_FULL when a successful VIRTIO_BLK_T_OUT,
> + VIRTIO_BLK_T_ZONE_APPEND, or VIRTIO_BLK_T_OUT_FUA request that causes the
> + zone to reach its writable capacity is received for the zone.
> \end{itemize}
>
> Zones in the VIRTIO_BLK_ZS_EOPEN (Explicitly Open) state transition from
> @@ -788,9 +794,9 @@ \subsection{Device Operation}\label{sec:Device Types / Block Device / Device Ope
> \item VIRTIO_BLK_ZS_FULL when a successful VIRTIO_BLK_T_ZONE_FINISH request is
> received for the zone,
>
> -\item VIRTIO_BLK_ZS_FULL when a successful VIRTIO_BLK_T_OUT or
> - VIRTIO_BLK_T_ZONE_APPEND request that causes the zone to reach its writable
> - capacity is received for the zone.
> +\item VIRTIO_BLK_ZS_FULL when a successful VIRTIO_BLK_T_OUT,
> + VIRTIO_BLK_T_ZONE_APPEND, or VIRTIO_BLK_T_OUT_FUA request that causes the
> + zone to reach its writable capacity is received for the zone.
> \end{itemize}
>
> When a VIRTIO_BLK_T_ZONE_EOPEN request is issued to an Explicitly Open zone, the
> @@ -806,8 +812,9 @@ \subsection{Device Operation}\label{sec:Device Types / Block Device / Device Ope
> \item VIRTIO_BLK_ZS_EMPTY when a successful VIRTIO_BLK_T_ZONE_RESET_ALL request
> is received by the device,
>
> -\item VIRTIO_BLK_ZS_IOPEN when a successful VIRTIO_BLK_T_OUT request or
> - VIRTIO_BLK_T_ZONE_APPEND with a non-zero data size is received for the zone.
> +\item VIRTIO_BLK_ZS_IOPEN when a successful VIRTIO_BLK_T_OUT or
> + VIRTIO_BLK_T_OUT_FUA request or VIRTIO_BLK_T_ZONE_APPEND with a non-zero
> + data size is received for the zone.
>
> \item VIRTIO_BLK_ZS_EOPEN when a successful VIRTIO_BLK_T_ZONE_OPEN request is
> received for the zone,
> @@ -876,8 +883,8 @@ \subsection{Device Operation}\label{sec:Device Types / Block Device / Device Ope
> A driver MUST set \field{sector} to 0 for a VIRTIO_BLK_T_FLUSH request.
> A driver SHOULD NOT include any data in a VIRTIO_BLK_T_FLUSH request.
>
> -The length of \field{data} MUST be a multiple of 512 bytes for VIRTIO_BLK_T_IN
> -and VIRTIO_BLK_T_OUT requests.
> +The length of \field{data} MUST be a multiple of 512 bytes for VIRTIO_BLK_T_IN,
> +VIRTIO_BLK_T_OUT, and VIRTIO_BLK_T_OUT_FUA requests.
>
> The length of \field{data} MUST be a multiple of the size of struct
> virtio_blk_discard_write_zeroes for VIRTIO_BLK_T_DISCARD,
> @@ -939,8 +946,8 @@ \subsection{Device Operation}\label{sec:Device Types / Block Device / Device Ope
> MAY read the starting sector location of the written data from the request
> field \field{append_sector}.
>
> -All VIRTIO_BLK_T_OUT requests issued by the driver to sequential zones and
> -VIRTIO_BLK_T_ZONE_APPEND requests MUST have:
> +All VIRTIO_BLK_T_OUT and VIRTIO_BLK_T_OUT_FUA requests issued by the driver to
> +sequential zones and VIRTIO_BLK_T_ZONE_APPEND requests MUST have:
>
> \begin{enumerate}
> \item the data size that is a multiple of the number of bytes reported
> @@ -1002,11 +1009,17 @@ \subsection{Device Operation}\label{sec:Device Types / Block Device / Device Ope
>
> \item\label{item:flush3} a VIRTIO_BLK_T_FLUSH request is sent \textbf{after the write is
> completed} and is completed itself.
> +
> +\item\label{item:flush4} the write was performed using a VIRTIO_BLK_T_OUT_FUA
> + request, regardless of whether the VIRTIO_BLK_F_FLUSH or
> + VIRTIO_BLK_F_CONFIG_WCE features were negotiated and regardless of the current
> + cache mode (as expressed by the value of the \field{writeback} field in
> + configuration space).
> \end{enumerate}
>
> If the device is backed by persistent storage, the device MUST ensure that
> stable writes are committed to it, before reporting completion of the write
> -(cases~\ref{item:flush1} and~\ref{item:flush2}) or the flush
> +(cases~\ref{item:flush1}, \ref{item:flush2}, and~\ref{item:flush4}) or the flush
> (case~\ref{item:flush3}). Failure to do so can cause data loss
> in case of a crash.
>
> @@ -1098,22 +1111,22 @@ \subsection{Device Operation}\label{sec:Device Types / Block Device / Device Ope
> sequential device zones in VIRTIO_BLK_ZS_IOPEN, VIRTIO_BLK_ZS_EOPEN,
> VIRTIO_BLK_ZS_CLOSED and VIRTIO_BLK_ZS_FULL state to VIRTIO_BLK_ZS_EMPTY state.
>
> -Upon receiving a VIRTIO_BLK_T_ZONE_APPEND request or a VIRTIO_BLK_T_OUT
> -request issued to a SWR zone in VIRTIO_BLK_ZS_EMPTY or VIRTIO_BLK_ZS_CLOSED
> -state, the device attempts to perform the transition of the zone to
> -VIRTIO_BLK_ZS_IOPEN state before writing data. This transition may fail due to
> -insufficient open and/or active zone resources available on the device. In this
> -case, the request MUST be completed with VIRTIO_BLK_S_ZONE_OPEN_RESOURCE or
> -VIRTIO_BLK_S_ZONE_ACTIVE_RESOURCE value in the \field{status}.
> +Upon receiving a VIRTIO_BLK_T_ZONE_APPEND request or a VIRTIO_BLK_T_OUT or
> +VIRTIO_BLK_T_OUT_FUA request issued to a SWR zone in VIRTIO_BLK_ZS_EMPTY or
> +VIRTIO_BLK_ZS_CLOSED state, the device attempts to perform the transition of the
> +zone to VIRTIO_BLK_ZS_IOPEN state before writing data. This transition may fail
> +due to insufficient open and/or active zone resources available on the device.
> +In this case, the request MUST be completed with VIRTIO_BLK_S_ZONE_OPEN_RESOURCE
> +or VIRTIO_BLK_S_ZONE_ACTIVE_RESOURCE value in the \field{status}.
>
> If the \field{sector} field in the VIRTIO_BLK_T_ZONE_APPEND request does not
> specify the lowest sector for a zone, then the request SHALL be completed with
> VIRTIO_BLK_S_ZONE_INVALID_CMD value in \field{status}.
>
> -A VIRTIO_BLK_T_ZONE_APPEND request or a VIRTIO_BLK_T_OUT request that has the
> -data range that exceeds the remaining writable capacity for the zone, then the
> -request SHALL be completed with VIRTIO_BLK_S_ZONE_INVALID_CMD value in
> -\field{status}.
> +A VIRTIO_BLK_T_ZONE_APPEND request or a VIRTIO_BLK_T_OUT or VIRTIO_BLK_T_OUT_FUA
> +request that has the data range that exceeds the remaining writable capacity for
> +the zone, then the request SHALL be completed with VIRTIO_BLK_S_ZONE_INVALID_CMD
> +value in \field{status}.
>
> If a request of the type VIRTIO_BLK_T_ZONE_APPEND is completed with
> VIRTIO_BLK_S_OK status, the field \field{append_sector} in
> @@ -1136,22 +1149,23 @@ \subsection{Device Operation}\label{sec:Device Types / Block Device / Device Ope
> VIRTIO_BLK_S_ZONE_INVALID_CMD \field{status}.
> \end{itemize}
>
> -If a VIRTIO_BLK_T_ZONE_APPEND request, a VIRTIO_BLK_T_IN request or a
> -VIRTIO_BLK_T_OUT request issued to a SWR zone has the range that has sectors in
> -more than one zone, then the request SHALL be completed with
> +If a VIRTIO_BLK_T_ZONE_APPEND, VIRTIO_BLK_T_IN, VIRTIO_BLK_T_OUT, or
> +VIRTIO_BLK_T_OUT_FUA request issued to a SWR zone has the range that has sectors
> +in more than one zone, then the request SHALL be completed with
> VIRTIO_BLK_S_ZONE_INVALID_CMD value in the field \field{status}.
>
> -A VIRTIO_BLK_T_OUT request that has the \field{sector} value that is not aligned
> -with the write pointer for the zone, then the request SHALL be completed with
> -VIRTIO_BLK_S_ZONE_UNALIGNED_WP value in the field \field{status}.
> +A VIRTIO_BLK_T_OUT or VIRTIO_BLK_T_OUT_FUA request that has the \field{sector}
> +value that is not aligned with the write pointer for the zone, then the request
> +SHALL be completed with VIRTIO_BLK_S_ZONE_UNALIGNED_WP value in the field
> +\field{status}.
>
> In order to avoid resource-related errors while opening zones implicitly, the
> device MAY automatically transition zones in VIRTIO_BLK_ZS_IOPEN state to
> VIRTIO_BLK_ZS_CLOSED state.
>
> -All VIRTIO_BLK_T_OUT requests or VIRTIO_BLK_T_ZONE_APPEND requests issued
> -to a zone in the VIRTIO_BLK_ZS_RDONLY state SHALL be completed with
> -VIRTIO_BLK_S_ZONE_INVALID_CMD \field{status}.
> +All VIRTIO_BLK_T_OUT, VIRTIO_BLK_T_OUT_FUA, and VIRTIO_BLK_T_ZONE_APPEND
> +requests issued to a zone in the VIRTIO_BLK_ZS_RDONLY state SHALL be completed
> +with VIRTIO_BLK_S_ZONE_INVALID_CMD \field{status}.
>
> All requests issued to a zone in the VIRTIO_BLK_ZS_OFFLINE state SHALL be
> completed with VIRTIO_BLK_S_ZONE_INVALID_CMD value in the field \field{status}.
> --
> 2.49.0
>
next prev parent reply other threads:[~2025-05-08 6:35 UTC|newest]
Thread overview: 3+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-05-08 0:34 [PATCH v2] virtio-blk: Add a VIRTIO_BLK_T_OUT_FUA request type Alberto Faria
2025-05-08 6:35 ` Michael S. Tsirkin [this message]
2025-05-08 20:30 ` Stefan Hajnoczi
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20250508023438-mutt-send-email-mst@kernel.org \
--to=mst@redhat.com \
--cc=afaria@redhat.com \
--cc=dverkamp@chromium.org \
--cc=virtio-comment@lists.linux.dev \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox