public inbox for virtio-comment@lists.linux.dev
 help / color / mirror / Atom feed
From: "Michael S. Tsirkin" <mst@redhat.com>
To: Alberto Faria <afaria@redhat.com>
Cc: virtio-comment@lists.linux.dev, Daniel Verkamp <dverkamp@chromium.org>
Subject: Re: [PATCH v2] virtio-blk: Add a VIRTIO_BLK_T_OUT_FUA request type
Date: Thu, 8 May 2025 02:35:47 -0400	[thread overview]
Message-ID: <20250508023438-mutt-send-email-mst@kernel.org> (raw)
In-Reply-To: <20250508003409.426807-1-afaria@redhat.com>

On Thu, May 08, 2025 at 01:34:09AM +0100, Alberto Faria wrote:
> This is a variant of the VIRTIO_BLK_T_OUT request type that further
> ensures the write becomes stable once the request completes, commonly
> known as a Force Unit Access (FUA) request.
> 
> Also add a VIRTIO_BLK_F_OUT_FUA feature bit signaling support for this
> request type.
> 
> Signed-off-by: Alberto Faria <afaria@redhat.com>

Thanks for the patch!
Could you pls expand on the motivation for the feature?
It's a good idea to add this info in qemu and linux patches, too.
Thanks!


> ---
> Matching kernel and qemu RFCs:
> - https://lore.kernel.org/linux-block/20250508001951.421467-1-afaria@redhat.com/
> - https://lore.kernel.org/qemu-devel/20250508002440.423776-1-afaria@redhat.com/
> 
> v2:
> - Redefine VIRTIO_BLK_T_OUT_FUA to 27 since 15 is already in use.
> - Clarify that the cache mode has no impact on VIRTIO_BLK_T_OUT_FUA
>   semantics.
> - Allow drivers to negotiate VIRTIO_BLK_F_OUT_FUA even if they are
>   incapable of sending VIRTIO_BLK_T_OUT_FUA commands.
> 
>  device-types/blk/description.tex | 114 +++++++++++++++++--------------
>  1 file changed, 64 insertions(+), 50 deletions(-)
> 
> diff --git a/device-types/blk/description.tex b/device-types/blk/description.tex
> index 2712ada..d3f4f37 100644
> --- a/device-types/blk/description.tex
> +++ b/device-types/blk/description.tex
> @@ -66,6 +66,8 @@ \subsection{Feature bits}\label{sec:Device Types / Block Device / Feature bits}
>  	(ZNS). For brevity, these standard documents are referred as "ZBD standards"
>  	from this point on in the text.
>  
> +\item[VIRTIO_BLK_F_OUT_FUA (18)] Force Unit Access (FUA) command support.
> +
>  \end{description}
>  
>  \subsubsection{Legacy Interface: Feature bits}\label{sec:Device Types / Block Device / Feature bits / Legacy Interface: Feature bits}
> @@ -395,9 +397,9 @@ \subsection{Device Initialization}\label{sec:Device Types / Block Device / Devic
>      VIRTIO_BLK_T_ZONE_APPEND requests.
>  
>  \item the \field{write_granularity} field of \field{zoned} MUST be set by the
> -    device to the offset and size alignment constraint for VIRTIO_BLK_T_OUT
> -    and VIRTIO_BLK_T_ZONE_APPEND requests issued to a sequential zone of the
> -    device.
> +    device to the offset and size alignment constraint for VIRTIO_BLK_T_OUT,
> +    VIRTIO_BLK_T_ZONE_APPEND, and VIRTIO_BLK_T_OUT_FUA requests issued to a
> +    sequential zone of the device.
>  
>  \item the device MUST initialize padding bytes \field{unused2} to 0.
>  \end{itemize}
> @@ -445,8 +447,9 @@ \subsection{Device Operation}\label{sec:Device Types / Block Device / Device Ope
>  (VIRTIO_BLK_T_OUT), a discard (VIRTIO_BLK_T_DISCARD), a write zeroes
>  (VIRTIO_BLK_T_WRITE_ZEROES), a flush (VIRTIO_BLK_T_FLUSH), a get device ID
>  string command (VIRTIO_BLK_T_GET_ID), a secure erase
> -(VIRTIO_BLK_T_SECURE_ERASE), or a get device lifetime command
> -(VIRTIO_BLK_T_GET_LIFETIME).
> +(VIRTIO_BLK_T_SECURE_ERASE), a get device lifetime command
> +(VIRTIO_BLK_T_GET_LIFETIME), or a Force Unit Access write
> +(VIRTIO_BLK_T_OUT_FUA).
>  
>  \begin{lstlisting}
>  #define VIRTIO_BLK_T_IN           0
> @@ -457,6 +460,7 @@ \subsection{Device Operation}\label{sec:Device Types / Block Device / Device Ope
>  #define VIRTIO_BLK_T_DISCARD      11
>  #define VIRTIO_BLK_T_WRITE_ZEROES 13
>  #define VIRTIO_BLK_T_SECURE_ERASE   14
> +#define VIRTIO_BLK_T_OUT_FUA      27
>  \end{lstlisting}
>  
>  The \field{sector} number indicates the offset (multiplied by 512) where
> @@ -464,9 +468,9 @@ \subsection{Device Operation}\label{sec:Device Types / Block Device / Device Ope
>  commands other than read, write and some zone operations.
>  
>  VIRTIO_BLK_T_IN requests populate \field{data} with the contents of sectors
> -read from the block device (in multiples of 512 bytes).  VIRTIO_BLK_T_OUT
> -requests write the contents of \field{data} to the block device (in multiples
> -of 512 bytes).
> +read from the block device (in multiples of 512 bytes).  VIRTIO_BLK_T_OUT and
> +VIRTIO_BLK_T_OUT_FUA requests write the contents of \field{data} to the block
> +device (in multiples of 512 bytes).
>  
>  The \field{data} used for discard, secure erase or write zeroes commands
>  consists of one or more segments. The maximum number of segments is
> @@ -566,11 +570,12 @@ \subsection{Device Operation}\label{sec:Device Types / Block Device / Device Ope
>  
>  Requests of type VIRTIO_BLK_T_OUT, VIRTIO_BLK_T_ZONE_OPEN,
>  VIRTIO_BLK_T_ZONE_CLOSE, VIRTIO_BLK_T_ZONE_FINISH, VIRTIO_BLK_T_ZONE_APPEND,
> -VIRTIO_BLK_T_ZONE_RESET or VIRTIO_BLK_T_ZONE_RESET_ALL may be completed by the
> -device with VIRTIO_BLK_S_OK, VIRTIO_BLK_S_IOERR or VIRTIO_BLK_S_UNSUPP
> -\field{status}, or, additionally, with  VIRTIO_BLK_S_ZONE_INVALID_CMD,
> -VIRTIO_BLK_S_ZONE_UNALIGNED_WP, VIRTIO_BLK_S_ZONE_OPEN_RESOURCE or
> -VIRTIO_BLK_S_ZONE_ACTIVE_RESOURCE ZBD-specific status codes.
> +VIRTIO_BLK_T_ZONE_RESET, VIRTIO_BLK_T_ZONE_RESET_ALL or VIRTIO_BLK_T_OUT_FUA may
> +be completed by the device with VIRTIO_BLK_S_OK, VIRTIO_BLK_S_IOERR or
> +VIRTIO_BLK_S_UNSUPP \field{status}, or, additionally, with
> +VIRTIO_BLK_S_ZONE_INVALID_CMD, VIRTIO_BLK_S_ZONE_UNALIGNED_WP,
> +VIRTIO_BLK_S_ZONE_OPEN_RESOURCE or VIRTIO_BLK_S_ZONE_ACTIVE_RESOURCE
> +ZBD-specific status codes.
>  
>  Besides the request status, VIRTIO_BLK_T_ZONE_APPEND requests return the
>  starting sector of the appended data back to the driver. For this reason,
> @@ -672,8 +677,8 @@ \subsection{Device Operation}\label{sec:Device Types / Block Device / Device Ope
>  
>  Zones with VIRTIO_BLK_ZT_SWP can be read randomly and should be written
>  sequentially, similarly to SWR zones. However, SWP zones can accept random write
> -operations, that is, VIRTIO_BLK_T_OUT requests with a start sector different
> -from the zone write pointer position.
> +operations, that is, VIRTIO_BLK_T_OUT and VIRTIO_BLK_T_OUT_FUA requests with a
> +start sector different from the zone write pointer position.
>  
>  The field \field{z_state} of \field{virtio_blk_zone_descriptor} indicates the
>  state of the device zone.
> @@ -731,8 +736,9 @@ \subsection{Device Operation}\label{sec:Device Types / Block Device / Device Ope
>  equal to the sector address of the zone. In this state, the entire capacity of
>  the zone is available for writing. A zone can transition from this state to
>  \begin{itemize}
> -\item VIRTIO_BLK_ZS_IOPEN when a successful VIRTIO_BLK_T_OUT request or
> -    VIRTIO_BLK_T_ZONE_APPEND with a non-zero data size is received for the zone.
> +\item VIRTIO_BLK_ZS_IOPEN when a successful VIRTIO_BLK_T_OUT or
> +    VIRTIO_BLK_T_OUT_FUA request or VIRTIO_BLK_T_ZONE_APPEND with a non-zero
> +    data size is received for the zone.
>  
>  \item VIRTIO_BLK_ZS_EOPEN when a successful VIRTIO_BLK_T_ZONE_OPEN request is
>      received for the zone
> @@ -763,9 +769,9 @@ \subsection{Device Operation}\label{sec:Device Types / Block Device / Device Ope
>  \item VIRTIO_BLK_ZS_FULL when a successful VIRTIO_BLK_T_ZONE_FINISH request is
>      received for the zone.
>  
> -\item VIRTIO_BLK_ZS_FULL when a successful VIRTIO_BLK_T_OUT or
> -    VIRTIO_BLK_T_ZONE_APPEND request that causes the zone to reach its writable
> -    capacity is received for the zone.
> +\item VIRTIO_BLK_ZS_FULL when a successful VIRTIO_BLK_T_OUT,
> +    VIRTIO_BLK_T_ZONE_APPEND, or VIRTIO_BLK_T_OUT_FUA request that causes the
> +    zone to reach its writable capacity is received for the zone.
>  \end{itemize}
>  
>  Zones in the VIRTIO_BLK_ZS_EOPEN (Explicitly Open) state transition from
> @@ -788,9 +794,9 @@ \subsection{Device Operation}\label{sec:Device Types / Block Device / Device Ope
>  \item VIRTIO_BLK_ZS_FULL when a successful VIRTIO_BLK_T_ZONE_FINISH request is
>      received for the zone,
>  
> -\item VIRTIO_BLK_ZS_FULL when a successful VIRTIO_BLK_T_OUT or
> -    VIRTIO_BLK_T_ZONE_APPEND request that causes the zone to reach its writable
> -    capacity is received for the zone.
> +\item VIRTIO_BLK_ZS_FULL when a successful VIRTIO_BLK_T_OUT,
> +    VIRTIO_BLK_T_ZONE_APPEND, or VIRTIO_BLK_T_OUT_FUA request that causes the
> +    zone to reach its writable capacity is received for the zone.
>  \end{itemize}
>  
>  When a VIRTIO_BLK_T_ZONE_EOPEN request is issued to an Explicitly Open zone, the
> @@ -806,8 +812,9 @@ \subsection{Device Operation}\label{sec:Device Types / Block Device / Device Ope
>  \item VIRTIO_BLK_ZS_EMPTY when a successful VIRTIO_BLK_T_ZONE_RESET_ALL request
>      is received by the device,
>  
> -\item VIRTIO_BLK_ZS_IOPEN when a successful VIRTIO_BLK_T_OUT request or
> -    VIRTIO_BLK_T_ZONE_APPEND with a non-zero data size is received for the zone.
> +\item VIRTIO_BLK_ZS_IOPEN when a successful VIRTIO_BLK_T_OUT or
> +    VIRTIO_BLK_T_OUT_FUA request or VIRTIO_BLK_T_ZONE_APPEND with a non-zero
> +    data size is received for the zone.
>  
>  \item VIRTIO_BLK_ZS_EOPEN when a successful VIRTIO_BLK_T_ZONE_OPEN request is
>      received for the zone,
> @@ -876,8 +883,8 @@ \subsection{Device Operation}\label{sec:Device Types / Block Device / Device Ope
>  A driver MUST set \field{sector} to 0 for a VIRTIO_BLK_T_FLUSH request.
>  A driver SHOULD NOT include any data in a VIRTIO_BLK_T_FLUSH request.
>  
> -The length of \field{data} MUST be a multiple of 512 bytes for VIRTIO_BLK_T_IN
> -and VIRTIO_BLK_T_OUT requests.
> +The length of \field{data} MUST be a multiple of 512 bytes for VIRTIO_BLK_T_IN,
> +VIRTIO_BLK_T_OUT, and VIRTIO_BLK_T_OUT_FUA requests.
>  
>  The length of \field{data} MUST be a multiple of the size of struct
>  virtio_blk_discard_write_zeroes for VIRTIO_BLK_T_DISCARD,
> @@ -939,8 +946,8 @@ \subsection{Device Operation}\label{sec:Device Types / Block Device / Device Ope
>  MAY read the starting sector location of the written data from the request
>  field \field{append_sector}.
>  
> -All VIRTIO_BLK_T_OUT requests issued by the driver to sequential zones and
> -VIRTIO_BLK_T_ZONE_APPEND requests MUST have:
> +All VIRTIO_BLK_T_OUT and VIRTIO_BLK_T_OUT_FUA requests issued by the driver to
> +sequential zones and VIRTIO_BLK_T_ZONE_APPEND requests MUST have:
>  
>  \begin{enumerate}
>  \item the data size that is a multiple of the number of bytes reported
> @@ -1002,11 +1009,17 @@ \subsection{Device Operation}\label{sec:Device Types / Block Device / Device Ope
>  
>  \item\label{item:flush3} a VIRTIO_BLK_T_FLUSH request is sent \textbf{after the write is
>    completed} and is completed itself.
> +
> +\item\label{item:flush4} the write was performed using a VIRTIO_BLK_T_OUT_FUA
> +  request, regardless of whether the VIRTIO_BLK_F_FLUSH or
> +  VIRTIO_BLK_F_CONFIG_WCE features were negotiated and regardless of the current
> +  cache mode (as expressed by the value of the \field{writeback} field in
> +  configuration space).
>  \end{enumerate}
>  
>  If the device is backed by persistent storage, the device MUST ensure that
>  stable writes are committed to it, before reporting completion of the write
> -(cases~\ref{item:flush1} and~\ref{item:flush2}) or the flush
> +(cases~\ref{item:flush1}, \ref{item:flush2}, and~\ref{item:flush4}) or the flush
>  (case~\ref{item:flush3}).  Failure to do so can cause data loss
>  in case of a crash.
>  
> @@ -1098,22 +1111,22 @@ \subsection{Device Operation}\label{sec:Device Types / Block Device / Device Ope
>  sequential device zones in VIRTIO_BLK_ZS_IOPEN, VIRTIO_BLK_ZS_EOPEN,
>  VIRTIO_BLK_ZS_CLOSED and VIRTIO_BLK_ZS_FULL state to VIRTIO_BLK_ZS_EMPTY state.
>  
> -Upon receiving a VIRTIO_BLK_T_ZONE_APPEND request or a VIRTIO_BLK_T_OUT
> -request issued to a SWR zone in VIRTIO_BLK_ZS_EMPTY or VIRTIO_BLK_ZS_CLOSED
> -state, the device attempts to perform the transition of the zone to
> -VIRTIO_BLK_ZS_IOPEN state before writing data. This transition may fail due to
> -insufficient open and/or active zone resources available on the device. In this
> -case, the request MUST be completed with VIRTIO_BLK_S_ZONE_OPEN_RESOURCE or
> -VIRTIO_BLK_S_ZONE_ACTIVE_RESOURCE value in the \field{status}.
> +Upon receiving a VIRTIO_BLK_T_ZONE_APPEND request or a VIRTIO_BLK_T_OUT or
> +VIRTIO_BLK_T_OUT_FUA request issued to a SWR zone in VIRTIO_BLK_ZS_EMPTY or
> +VIRTIO_BLK_ZS_CLOSED state, the device attempts to perform the transition of the
> +zone to VIRTIO_BLK_ZS_IOPEN state before writing data. This transition may fail
> +due to insufficient open and/or active zone resources available on the device.
> +In this case, the request MUST be completed with VIRTIO_BLK_S_ZONE_OPEN_RESOURCE
> +or VIRTIO_BLK_S_ZONE_ACTIVE_RESOURCE value in the \field{status}.
>  
>  If the \field{sector} field in the VIRTIO_BLK_T_ZONE_APPEND request does not
>  specify the lowest sector for a zone, then the request SHALL be completed with
>  VIRTIO_BLK_S_ZONE_INVALID_CMD value in \field{status}.
>  
> -A VIRTIO_BLK_T_ZONE_APPEND request or a VIRTIO_BLK_T_OUT request that has the
> -data range that exceeds the remaining writable capacity for the zone, then the
> -request SHALL be completed with VIRTIO_BLK_S_ZONE_INVALID_CMD value in
> -\field{status}.
> +A VIRTIO_BLK_T_ZONE_APPEND request or a VIRTIO_BLK_T_OUT or VIRTIO_BLK_T_OUT_FUA
> +request that has the data range that exceeds the remaining writable capacity for
> +the zone, then the request SHALL be completed with VIRTIO_BLK_S_ZONE_INVALID_CMD
> +value in \field{status}.
>  
>  If a request of the type VIRTIO_BLK_T_ZONE_APPEND is completed with
>  VIRTIO_BLK_S_OK status, the field \field{append_sector} in
> @@ -1136,22 +1149,23 @@ \subsection{Device Operation}\label{sec:Device Types / Block Device / Device Ope
>      VIRTIO_BLK_S_ZONE_INVALID_CMD \field{status}.
>  \end{itemize}
>  
> -If a VIRTIO_BLK_T_ZONE_APPEND request, a VIRTIO_BLK_T_IN request or a
> -VIRTIO_BLK_T_OUT request issued to a SWR zone has the range that has sectors in
> -more than one zone, then the request SHALL be completed with
> +If a VIRTIO_BLK_T_ZONE_APPEND, VIRTIO_BLK_T_IN, VIRTIO_BLK_T_OUT, or
> +VIRTIO_BLK_T_OUT_FUA request issued to a SWR zone has the range that has sectors
> +in more than one zone, then the request SHALL be completed with
>  VIRTIO_BLK_S_ZONE_INVALID_CMD value in the field \field{status}.
>  
> -A VIRTIO_BLK_T_OUT request that has the \field{sector} value that is not aligned
> -with the write pointer for the zone, then the request SHALL be completed with
> -VIRTIO_BLK_S_ZONE_UNALIGNED_WP value in the field \field{status}.
> +A VIRTIO_BLK_T_OUT or VIRTIO_BLK_T_OUT_FUA request that has the \field{sector}
> +value that is not aligned with the write pointer for the zone, then the request
> +SHALL be completed with VIRTIO_BLK_S_ZONE_UNALIGNED_WP value in the field
> +\field{status}.
>  
>  In order to avoid resource-related errors while opening zones implicitly, the
>  device MAY automatically transition zones in VIRTIO_BLK_ZS_IOPEN state to
>  VIRTIO_BLK_ZS_CLOSED state.
>  
> -All VIRTIO_BLK_T_OUT requests or VIRTIO_BLK_T_ZONE_APPEND requests issued
> -to a zone in the VIRTIO_BLK_ZS_RDONLY state SHALL be completed with
> -VIRTIO_BLK_S_ZONE_INVALID_CMD \field{status}.
> +All VIRTIO_BLK_T_OUT, VIRTIO_BLK_T_OUT_FUA, and VIRTIO_BLK_T_ZONE_APPEND
> +requests issued to a zone in the VIRTIO_BLK_ZS_RDONLY state SHALL be completed
> +with VIRTIO_BLK_S_ZONE_INVALID_CMD \field{status}.
>  
>  All requests issued to a zone in the VIRTIO_BLK_ZS_OFFLINE state SHALL be
>  completed with VIRTIO_BLK_S_ZONE_INVALID_CMD value in the field \field{status}.
> -- 
> 2.49.0
> 


  reply	other threads:[~2025-05-08  6:35 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-05-08  0:34 [PATCH v2] virtio-blk: Add a VIRTIO_BLK_T_OUT_FUA request type Alberto Faria
2025-05-08  6:35 ` Michael S. Tsirkin [this message]
2025-05-08 20:30 ` Stefan Hajnoczi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20250508023438-mutt-send-email-mst@kernel.org \
    --to=mst@redhat.com \
    --cc=afaria@redhat.com \
    --cc=dverkamp@chromium.org \
    --cc=virtio-comment@lists.linux.dev \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox