public inbox for virtio-comment@lists.linux.dev
 help / color / mirror / Atom feed
From: Alberto Faria <afaria@redhat.com>
To: virtio-comment@lists.linux.dev
Cc: Daniel Verkamp <dverkamp@chromium.org>,
	Alberto Faria <afaria@redhat.com>
Subject: [PATCH v2] virtio-blk: Add a VIRTIO_BLK_T_OUT_FUA request type
Date: Thu,  8 May 2025 01:34:09 +0100	[thread overview]
Message-ID: <20250508003409.426807-1-afaria@redhat.com> (raw)

This is a variant of the VIRTIO_BLK_T_OUT request type that further
ensures the write becomes stable once the request completes, commonly
known as a Force Unit Access (FUA) request.

Also add a VIRTIO_BLK_F_OUT_FUA feature bit signaling support for this
request type.

Signed-off-by: Alberto Faria <afaria@redhat.com>
---
Matching kernel and qemu RFCs:
- https://lore.kernel.org/linux-block/20250508001951.421467-1-afaria@redhat.com/
- https://lore.kernel.org/qemu-devel/20250508002440.423776-1-afaria@redhat.com/

v2:
- Redefine VIRTIO_BLK_T_OUT_FUA to 27 since 15 is already in use.
- Clarify that the cache mode has no impact on VIRTIO_BLK_T_OUT_FUA
  semantics.
- Allow drivers to negotiate VIRTIO_BLK_F_OUT_FUA even if they are
  incapable of sending VIRTIO_BLK_T_OUT_FUA commands.

 device-types/blk/description.tex | 114 +++++++++++++++++--------------
 1 file changed, 64 insertions(+), 50 deletions(-)

diff --git a/device-types/blk/description.tex b/device-types/blk/description.tex
index 2712ada..d3f4f37 100644
--- a/device-types/blk/description.tex
+++ b/device-types/blk/description.tex
@@ -66,6 +66,8 @@ \subsection{Feature bits}\label{sec:Device Types / Block Device / Feature bits}
 	(ZNS). For brevity, these standard documents are referred as "ZBD standards"
 	from this point on in the text.
 
+\item[VIRTIO_BLK_F_OUT_FUA (18)] Force Unit Access (FUA) command support.
+
 \end{description}
 
 \subsubsection{Legacy Interface: Feature bits}\label{sec:Device Types / Block Device / Feature bits / Legacy Interface: Feature bits}
@@ -395,9 +397,9 @@ \subsection{Device Initialization}\label{sec:Device Types / Block Device / Devic
     VIRTIO_BLK_T_ZONE_APPEND requests.
 
 \item the \field{write_granularity} field of \field{zoned} MUST be set by the
-    device to the offset and size alignment constraint for VIRTIO_BLK_T_OUT
-    and VIRTIO_BLK_T_ZONE_APPEND requests issued to a sequential zone of the
-    device.
+    device to the offset and size alignment constraint for VIRTIO_BLK_T_OUT,
+    VIRTIO_BLK_T_ZONE_APPEND, and VIRTIO_BLK_T_OUT_FUA requests issued to a
+    sequential zone of the device.
 
 \item the device MUST initialize padding bytes \field{unused2} to 0.
 \end{itemize}
@@ -445,8 +447,9 @@ \subsection{Device Operation}\label{sec:Device Types / Block Device / Device Ope
 (VIRTIO_BLK_T_OUT), a discard (VIRTIO_BLK_T_DISCARD), a write zeroes
 (VIRTIO_BLK_T_WRITE_ZEROES), a flush (VIRTIO_BLK_T_FLUSH), a get device ID
 string command (VIRTIO_BLK_T_GET_ID), a secure erase
-(VIRTIO_BLK_T_SECURE_ERASE), or a get device lifetime command
-(VIRTIO_BLK_T_GET_LIFETIME).
+(VIRTIO_BLK_T_SECURE_ERASE), a get device lifetime command
+(VIRTIO_BLK_T_GET_LIFETIME), or a Force Unit Access write
+(VIRTIO_BLK_T_OUT_FUA).
 
 \begin{lstlisting}
 #define VIRTIO_BLK_T_IN           0
@@ -457,6 +460,7 @@ \subsection{Device Operation}\label{sec:Device Types / Block Device / Device Ope
 #define VIRTIO_BLK_T_DISCARD      11
 #define VIRTIO_BLK_T_WRITE_ZEROES 13
 #define VIRTIO_BLK_T_SECURE_ERASE   14
+#define VIRTIO_BLK_T_OUT_FUA      27
 \end{lstlisting}
 
 The \field{sector} number indicates the offset (multiplied by 512) where
@@ -464,9 +468,9 @@ \subsection{Device Operation}\label{sec:Device Types / Block Device / Device Ope
 commands other than read, write and some zone operations.
 
 VIRTIO_BLK_T_IN requests populate \field{data} with the contents of sectors
-read from the block device (in multiples of 512 bytes).  VIRTIO_BLK_T_OUT
-requests write the contents of \field{data} to the block device (in multiples
-of 512 bytes).
+read from the block device (in multiples of 512 bytes).  VIRTIO_BLK_T_OUT and
+VIRTIO_BLK_T_OUT_FUA requests write the contents of \field{data} to the block
+device (in multiples of 512 bytes).
 
 The \field{data} used for discard, secure erase or write zeroes commands
 consists of one or more segments. The maximum number of segments is
@@ -566,11 +570,12 @@ \subsection{Device Operation}\label{sec:Device Types / Block Device / Device Ope
 
 Requests of type VIRTIO_BLK_T_OUT, VIRTIO_BLK_T_ZONE_OPEN,
 VIRTIO_BLK_T_ZONE_CLOSE, VIRTIO_BLK_T_ZONE_FINISH, VIRTIO_BLK_T_ZONE_APPEND,
-VIRTIO_BLK_T_ZONE_RESET or VIRTIO_BLK_T_ZONE_RESET_ALL may be completed by the
-device with VIRTIO_BLK_S_OK, VIRTIO_BLK_S_IOERR or VIRTIO_BLK_S_UNSUPP
-\field{status}, or, additionally, with  VIRTIO_BLK_S_ZONE_INVALID_CMD,
-VIRTIO_BLK_S_ZONE_UNALIGNED_WP, VIRTIO_BLK_S_ZONE_OPEN_RESOURCE or
-VIRTIO_BLK_S_ZONE_ACTIVE_RESOURCE ZBD-specific status codes.
+VIRTIO_BLK_T_ZONE_RESET, VIRTIO_BLK_T_ZONE_RESET_ALL or VIRTIO_BLK_T_OUT_FUA may
+be completed by the device with VIRTIO_BLK_S_OK, VIRTIO_BLK_S_IOERR or
+VIRTIO_BLK_S_UNSUPP \field{status}, or, additionally, with
+VIRTIO_BLK_S_ZONE_INVALID_CMD, VIRTIO_BLK_S_ZONE_UNALIGNED_WP,
+VIRTIO_BLK_S_ZONE_OPEN_RESOURCE or VIRTIO_BLK_S_ZONE_ACTIVE_RESOURCE
+ZBD-specific status codes.
 
 Besides the request status, VIRTIO_BLK_T_ZONE_APPEND requests return the
 starting sector of the appended data back to the driver. For this reason,
@@ -672,8 +677,8 @@ \subsection{Device Operation}\label{sec:Device Types / Block Device / Device Ope
 
 Zones with VIRTIO_BLK_ZT_SWP can be read randomly and should be written
 sequentially, similarly to SWR zones. However, SWP zones can accept random write
-operations, that is, VIRTIO_BLK_T_OUT requests with a start sector different
-from the zone write pointer position.
+operations, that is, VIRTIO_BLK_T_OUT and VIRTIO_BLK_T_OUT_FUA requests with a
+start sector different from the zone write pointer position.
 
 The field \field{z_state} of \field{virtio_blk_zone_descriptor} indicates the
 state of the device zone.
@@ -731,8 +736,9 @@ \subsection{Device Operation}\label{sec:Device Types / Block Device / Device Ope
 equal to the sector address of the zone. In this state, the entire capacity of
 the zone is available for writing. A zone can transition from this state to
 \begin{itemize}
-\item VIRTIO_BLK_ZS_IOPEN when a successful VIRTIO_BLK_T_OUT request or
-    VIRTIO_BLK_T_ZONE_APPEND with a non-zero data size is received for the zone.
+\item VIRTIO_BLK_ZS_IOPEN when a successful VIRTIO_BLK_T_OUT or
+    VIRTIO_BLK_T_OUT_FUA request or VIRTIO_BLK_T_ZONE_APPEND with a non-zero
+    data size is received for the zone.
 
 \item VIRTIO_BLK_ZS_EOPEN when a successful VIRTIO_BLK_T_ZONE_OPEN request is
     received for the zone
@@ -763,9 +769,9 @@ \subsection{Device Operation}\label{sec:Device Types / Block Device / Device Ope
 \item VIRTIO_BLK_ZS_FULL when a successful VIRTIO_BLK_T_ZONE_FINISH request is
     received for the zone.
 
-\item VIRTIO_BLK_ZS_FULL when a successful VIRTIO_BLK_T_OUT or
-    VIRTIO_BLK_T_ZONE_APPEND request that causes the zone to reach its writable
-    capacity is received for the zone.
+\item VIRTIO_BLK_ZS_FULL when a successful VIRTIO_BLK_T_OUT,
+    VIRTIO_BLK_T_ZONE_APPEND, or VIRTIO_BLK_T_OUT_FUA request that causes the
+    zone to reach its writable capacity is received for the zone.
 \end{itemize}
 
 Zones in the VIRTIO_BLK_ZS_EOPEN (Explicitly Open) state transition from
@@ -788,9 +794,9 @@ \subsection{Device Operation}\label{sec:Device Types / Block Device / Device Ope
 \item VIRTIO_BLK_ZS_FULL when a successful VIRTIO_BLK_T_ZONE_FINISH request is
     received for the zone,
 
-\item VIRTIO_BLK_ZS_FULL when a successful VIRTIO_BLK_T_OUT or
-    VIRTIO_BLK_T_ZONE_APPEND request that causes the zone to reach its writable
-    capacity is received for the zone.
+\item VIRTIO_BLK_ZS_FULL when a successful VIRTIO_BLK_T_OUT,
+    VIRTIO_BLK_T_ZONE_APPEND, or VIRTIO_BLK_T_OUT_FUA request that causes the
+    zone to reach its writable capacity is received for the zone.
 \end{itemize}
 
 When a VIRTIO_BLK_T_ZONE_EOPEN request is issued to an Explicitly Open zone, the
@@ -806,8 +812,9 @@ \subsection{Device Operation}\label{sec:Device Types / Block Device / Device Ope
 \item VIRTIO_BLK_ZS_EMPTY when a successful VIRTIO_BLK_T_ZONE_RESET_ALL request
     is received by the device,
 
-\item VIRTIO_BLK_ZS_IOPEN when a successful VIRTIO_BLK_T_OUT request or
-    VIRTIO_BLK_T_ZONE_APPEND with a non-zero data size is received for the zone.
+\item VIRTIO_BLK_ZS_IOPEN when a successful VIRTIO_BLK_T_OUT or
+    VIRTIO_BLK_T_OUT_FUA request or VIRTIO_BLK_T_ZONE_APPEND with a non-zero
+    data size is received for the zone.
 
 \item VIRTIO_BLK_ZS_EOPEN when a successful VIRTIO_BLK_T_ZONE_OPEN request is
     received for the zone,
@@ -876,8 +883,8 @@ \subsection{Device Operation}\label{sec:Device Types / Block Device / Device Ope
 A driver MUST set \field{sector} to 0 for a VIRTIO_BLK_T_FLUSH request.
 A driver SHOULD NOT include any data in a VIRTIO_BLK_T_FLUSH request.
 
-The length of \field{data} MUST be a multiple of 512 bytes for VIRTIO_BLK_T_IN
-and VIRTIO_BLK_T_OUT requests.
+The length of \field{data} MUST be a multiple of 512 bytes for VIRTIO_BLK_T_IN,
+VIRTIO_BLK_T_OUT, and VIRTIO_BLK_T_OUT_FUA requests.
 
 The length of \field{data} MUST be a multiple of the size of struct
 virtio_blk_discard_write_zeroes for VIRTIO_BLK_T_DISCARD,
@@ -939,8 +946,8 @@ \subsection{Device Operation}\label{sec:Device Types / Block Device / Device Ope
 MAY read the starting sector location of the written data from the request
 field \field{append_sector}.
 
-All VIRTIO_BLK_T_OUT requests issued by the driver to sequential zones and
-VIRTIO_BLK_T_ZONE_APPEND requests MUST have:
+All VIRTIO_BLK_T_OUT and VIRTIO_BLK_T_OUT_FUA requests issued by the driver to
+sequential zones and VIRTIO_BLK_T_ZONE_APPEND requests MUST have:
 
 \begin{enumerate}
 \item the data size that is a multiple of the number of bytes reported
@@ -1002,11 +1009,17 @@ \subsection{Device Operation}\label{sec:Device Types / Block Device / Device Ope
 
 \item\label{item:flush3} a VIRTIO_BLK_T_FLUSH request is sent \textbf{after the write is
   completed} and is completed itself.
+
+\item\label{item:flush4} the write was performed using a VIRTIO_BLK_T_OUT_FUA
+  request, regardless of whether the VIRTIO_BLK_F_FLUSH or
+  VIRTIO_BLK_F_CONFIG_WCE features were negotiated and regardless of the current
+  cache mode (as expressed by the value of the \field{writeback} field in
+  configuration space).
 \end{enumerate}
 
 If the device is backed by persistent storage, the device MUST ensure that
 stable writes are committed to it, before reporting completion of the write
-(cases~\ref{item:flush1} and~\ref{item:flush2}) or the flush
+(cases~\ref{item:flush1}, \ref{item:flush2}, and~\ref{item:flush4}) or the flush
 (case~\ref{item:flush3}).  Failure to do so can cause data loss
 in case of a crash.
 
@@ -1098,22 +1111,22 @@ \subsection{Device Operation}\label{sec:Device Types / Block Device / Device Ope
 sequential device zones in VIRTIO_BLK_ZS_IOPEN, VIRTIO_BLK_ZS_EOPEN,
 VIRTIO_BLK_ZS_CLOSED and VIRTIO_BLK_ZS_FULL state to VIRTIO_BLK_ZS_EMPTY state.
 
-Upon receiving a VIRTIO_BLK_T_ZONE_APPEND request or a VIRTIO_BLK_T_OUT
-request issued to a SWR zone in VIRTIO_BLK_ZS_EMPTY or VIRTIO_BLK_ZS_CLOSED
-state, the device attempts to perform the transition of the zone to
-VIRTIO_BLK_ZS_IOPEN state before writing data. This transition may fail due to
-insufficient open and/or active zone resources available on the device. In this
-case, the request MUST be completed with VIRTIO_BLK_S_ZONE_OPEN_RESOURCE or
-VIRTIO_BLK_S_ZONE_ACTIVE_RESOURCE value in the \field{status}.
+Upon receiving a VIRTIO_BLK_T_ZONE_APPEND request or a VIRTIO_BLK_T_OUT or
+VIRTIO_BLK_T_OUT_FUA request issued to a SWR zone in VIRTIO_BLK_ZS_EMPTY or
+VIRTIO_BLK_ZS_CLOSED state, the device attempts to perform the transition of the
+zone to VIRTIO_BLK_ZS_IOPEN state before writing data. This transition may fail
+due to insufficient open and/or active zone resources available on the device.
+In this case, the request MUST be completed with VIRTIO_BLK_S_ZONE_OPEN_RESOURCE
+or VIRTIO_BLK_S_ZONE_ACTIVE_RESOURCE value in the \field{status}.
 
 If the \field{sector} field in the VIRTIO_BLK_T_ZONE_APPEND request does not
 specify the lowest sector for a zone, then the request SHALL be completed with
 VIRTIO_BLK_S_ZONE_INVALID_CMD value in \field{status}.
 
-A VIRTIO_BLK_T_ZONE_APPEND request or a VIRTIO_BLK_T_OUT request that has the
-data range that exceeds the remaining writable capacity for the zone, then the
-request SHALL be completed with VIRTIO_BLK_S_ZONE_INVALID_CMD value in
-\field{status}.
+A VIRTIO_BLK_T_ZONE_APPEND request or a VIRTIO_BLK_T_OUT or VIRTIO_BLK_T_OUT_FUA
+request that has the data range that exceeds the remaining writable capacity for
+the zone, then the request SHALL be completed with VIRTIO_BLK_S_ZONE_INVALID_CMD
+value in \field{status}.
 
 If a request of the type VIRTIO_BLK_T_ZONE_APPEND is completed with
 VIRTIO_BLK_S_OK status, the field \field{append_sector} in
@@ -1136,22 +1149,23 @@ \subsection{Device Operation}\label{sec:Device Types / Block Device / Device Ope
     VIRTIO_BLK_S_ZONE_INVALID_CMD \field{status}.
 \end{itemize}
 
-If a VIRTIO_BLK_T_ZONE_APPEND request, a VIRTIO_BLK_T_IN request or a
-VIRTIO_BLK_T_OUT request issued to a SWR zone has the range that has sectors in
-more than one zone, then the request SHALL be completed with
+If a VIRTIO_BLK_T_ZONE_APPEND, VIRTIO_BLK_T_IN, VIRTIO_BLK_T_OUT, or
+VIRTIO_BLK_T_OUT_FUA request issued to a SWR zone has the range that has sectors
+in more than one zone, then the request SHALL be completed with
 VIRTIO_BLK_S_ZONE_INVALID_CMD value in the field \field{status}.
 
-A VIRTIO_BLK_T_OUT request that has the \field{sector} value that is not aligned
-with the write pointer for the zone, then the request SHALL be completed with
-VIRTIO_BLK_S_ZONE_UNALIGNED_WP value in the field \field{status}.
+A VIRTIO_BLK_T_OUT or VIRTIO_BLK_T_OUT_FUA request that has the \field{sector}
+value that is not aligned with the write pointer for the zone, then the request
+SHALL be completed with VIRTIO_BLK_S_ZONE_UNALIGNED_WP value in the field
+\field{status}.
 
 In order to avoid resource-related errors while opening zones implicitly, the
 device MAY automatically transition zones in VIRTIO_BLK_ZS_IOPEN state to
 VIRTIO_BLK_ZS_CLOSED state.
 
-All VIRTIO_BLK_T_OUT requests or VIRTIO_BLK_T_ZONE_APPEND requests issued
-to a zone in the VIRTIO_BLK_ZS_RDONLY state SHALL be completed with
-VIRTIO_BLK_S_ZONE_INVALID_CMD \field{status}.
+All VIRTIO_BLK_T_OUT, VIRTIO_BLK_T_OUT_FUA, and VIRTIO_BLK_T_ZONE_APPEND
+requests issued to a zone in the VIRTIO_BLK_ZS_RDONLY state SHALL be completed
+with VIRTIO_BLK_S_ZONE_INVALID_CMD \field{status}.
 
 All requests issued to a zone in the VIRTIO_BLK_ZS_OFFLINE state SHALL be
 completed with VIRTIO_BLK_S_ZONE_INVALID_CMD value in the field \field{status}.
-- 
2.49.0


             reply	other threads:[~2025-05-08  0:34 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-05-08  0:34 Alberto Faria [this message]
2025-05-08  6:35 ` [PATCH v2] virtio-blk: Add a VIRTIO_BLK_T_OUT_FUA request type Michael S. Tsirkin
2025-05-08 20:30 ` Stefan Hajnoczi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20250508003409.426807-1-afaria@redhat.com \
    --to=afaria@redhat.com \
    --cc=dverkamp@chromium.org \
    --cc=virtio-comment@lists.linux.dev \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox