From: Parav Pandit <parav@nvidia.com>
To: <virtio-comment@lists.oasis-open.org>, <mst@redhat.com>,
<cohuck@redhat.com>
Cc: <sburla@marvell.com>, <shahafs@nvidia.com>, <maorg@nvidia.com>,
<yishaih@nvidia.com>, <lingshan.zhu@intel.com>,
<jasowang@redhat.com>, "Parav Pandit" <parav@nvidia.com>
Subject: [virtio-comment] [PATCH v2 7/8] admin: Add write recording commands
Date: Tue, 17 Oct 2023 23:06:44 +0300
Message-ID: <20231017200645.779222-8-parav@nvidia.com>
In-Reply-To: <20231017200645.779222-1-parav@nvidia.com>
When migrating a virtual machine with passthrough
virtio devices, the virtio device may write into the guest
memory. Some systems may not be able to keep track of these
pages efficiently.

To facilitate such a system, a device provides a record
of the pages which are written by the device.

The owner driver configures the member device with a list of address
ranges for which it expects write recording and reporting by the device.

The owner driver periodically queries the written page address records,
which are cleared from the device upon reading.

Once the number of write records has reduced over time, the device mode
is set to FREEZE and write recording is then stopped.

Fixes: https://github.com/oasis-tcs/virtio-spec/issues/176
Signed-off-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Satananda Burla <sburla@marvell.com>
---
changelog:
v1->v2:
- addressed comments from Michael
- merged theory of operation changes to previous patch
- replaced iova with physical address
- renamed iova range to page
- reworded and simplified wording using page
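
For readers of the capabilities query below, here is a minimal sketch of how a driver might decode \field{supported_page_size_bitmap}, assuming bit n stands for a page size of 4KB << n as the patch text describes. The macro and function names are hypothetical and are not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Bit 0 of supported_page_size_bitmap stands for 4KB, i.e. 2^12 bytes.
 * The macro name is hypothetical, not taken from the patch. */
#define WRITE_RECORD_MIN_PAGE_SHIFT 12u

/* Page size in bytes represented by a bitmap bit (0..31). */
static uint64_t write_record_page_size(unsigned int bit)
{
    assert(bit < 32);
    return 1ULL << (WRITE_RECORD_MIN_PAGE_SHIFT + bit);
}

/* Returns 1 if page_size matches a bit that is set in the bitmap, else 0. */
static int write_record_page_size_supported(uint32_t bitmap, uint64_t page_size)
{
    unsigned int bit;

    for (bit = 0; bit < 32; bit++) {
        if (write_record_page_size(bit) == page_size)
            return (int)((bitmap >> bit) & 1u);
    }
    return 0; /* not a representable page size at all */
}

/* Smallest page size the device reports, or 0 if the bitmap is empty. */
static uint64_t write_record_smallest_page_size(uint32_t bitmap)
{
    unsigned int bit;

    for (bit = 0; bit < 32; bit++) {
        if ((bitmap >> bit) & 1u)
            return write_record_page_size(bit);
    }
    return 0;
}
```

A driver would typically pick one supported size from the bitmap and pass it as \field{page_size} when starting write recording; which size to prefer is a policy choice this sketch does not address.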
---
admin-cmds-device-migration.tex | 129 +++++++++++++++++++++++++++++++-
admin.tex | 10 ++-
2 files changed, 135 insertions(+), 4 deletions(-)
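
As an illustration of the start command payload defined in the diff below, the following sketch mirrors \field{struct virtio_admin_cmd_write_record_start_data} in plain C and checks that the entries do not overlap before they are handed to the device. It assumes a little-endian host (so `uint64_t` and `uint32_t` stand in for `le64` and `le32`), ignores address arithmetic overflow, and uses hypothetical helper names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Host-side mirrors of the structures in the patch; little-endian host assumed. */
struct write_record_start_entry {
    uint64_t page_address; /* start physical address */
    uint64_t page_count;   /* number of pages of page_size bytes */
};

struct write_record_start_data {
    uint64_t page_size;
    uint32_t count;        /* number of valid entries */
    uint8_t reserved[4];
    struct write_record_start_entry entries[];
};

/* Bytes of command specific data needed for `count` entries. */
static size_t write_record_start_data_len(uint32_t count)
{
    return sizeof(struct write_record_start_data) +
           (size_t)count * sizeof(struct write_record_start_entry);
}

/* Returns 1 if the two entries share at least one byte.
 * Overflow of the end address is not handled in this sketch. */
static int write_record_entries_overlap(const struct write_record_start_entry *a,
                                        const struct write_record_start_entry *b,
                                        uint64_t page_size)
{
    uint64_t a_end = a->page_address + a->page_count * page_size;
    uint64_t b_end = b->page_address + b->page_count * page_size;

    return a->page_address < b_end && b->page_address < a_end;
}

/* Returns 1 if no pair of entries overlaps, else 0 (simple O(n^2) check). */
static int write_record_entries_unique(const struct write_record_start_entry *e,
                                       uint32_t count, uint64_t page_size)
{
    uint32_t i, j;

    for (i = 0; i < count; i++) {
        for (j = i + 1; j < count; j++) {
            if (write_record_entries_overlap(&e[i], &e[j], page_size))
                return 0;
        }
    }
    return 1;
}
```

Adjacent ranges, where one entry ends exactly where the next begins, are treated as non-overlapping here. The quadratic check is only meant for short lists; a driver with many ranges would more likely sort them first.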
diff --git a/admin-cmds-device-migration.tex b/admin-cmds-device-migration.tex
index fba3a6b..992d6ec 100644
--- a/admin-cmds-device-migration.tex
+++ b/admin-cmds-device-migration.tex
@@ -106,9 +106,8 @@ \subsubsection{Device Migration}\label{sec:Basic Facilities of a Virtio Device /
address records from the device. As the driver reads the written address records,
the device clears those records from the device.
Once the device reports zero or small number of written address records, the device
-mode is set to \field{Stop} or \field{Freeze}. Once the device is set to \field{Stop}
-or \field{Freeze} mode, and once all the IOVA records are read, the driver stops
-the write recording in the device.
+mode is set to \field{Stop} or \field{Freeze}. Once all the physical address records
+are read, the driver stops the write recording in the device.
The owner driver uses following device migration group administration commands.
@@ -120,6 +119,9 @@ \subsubsection{Device Migration}\label{sec:Basic Facilities of a Virtio Device /
\item Device Context Write Command
\item Device Context Supported Fields Query Command
\item Device Context Discard Command
+\item Device Write Records Start Command
+\item Device Write Records Stop Command
+\item Device Write Records Read Command
\end{enumerate}
These commands are currently only defined for the SR-IOV group type.
@@ -340,6 +342,127 @@ \subsubsection{Device Migration}\label{sec:Basic Facilities of a Virtio Device /
discarded, subsequent VIRTIO_ADMIN_CMD_DEV_CTX_WRITE command writes a new device
context.
+\paragraph{Device Write Record Capabilities Query Command}
+\label{par:Basic Facilities of a Virtio Device / Device groups / Group administration commands / Device Migration / Device Write Record Capabilities Query Command}
+
+This command reads the device write record capabilities.
+For the command VIRTIO_ADMIN_CMD_DEV_WRITE_RECORD_CAP_QUERY, \field{opcode}
+is set to 0xf.
+The \field{group_member_id} refers to the member device to be accessed.
+
+\begin{lstlisting}
+struct virtio_admin_cmd_dev_write_record_cap_result {
+ le32 supported_page_size_bitmap;
+ le32 supported_ranges;
+};
+\end{lstlisting}
+
+When the command completes successfully, \field{command_specific_result}
+is in the format \field{struct virtio_admin_cmd_dev_write_record_cap_result}
+returned by the device. The \field{supported_page_size_bitmap} indicates
+the page sizes, i.e. the physical address range granularities, at which the device can record.
+The minimum page size granularity is 4KB. Each bit represents a
+supported page size. Bit 0 corresponds to 4KB, bit 1 corresponds to 8KB,
+bit 31 corresponds to 8TB. The device supports one or more page sizes.
+For each supported page size, the device sets the corresponding bit in the
+\field{supported_page_size_bitmap}. The \field{supported_ranges}
+indicates the number of unique (non-overlapping) physical address ranges, in page
+granularity, that can be recorded by the device.
+
+\paragraph{Device Write Records Start Command}
+\label{par:Basic Facilities of a Virtio Device / Device groups / Group administration commands / Device Migration / Device Write Records Start Command}
+
+This command starts the write recording in the device for the specified
+physical address ranges.
+
+For the command VIRTIO_ADMIN_CMD_DEV_WRITE_RECORDS_START, \field{opcode}
+is set to 0x10.
+The \field{group_member_id} refers to the member device to be accessed.
+
+The \field{command_specific_data} is in the format
+\field{struct virtio_admin_cmd_write_record_start_data}.
+
+\begin{lstlisting}
+struct virtio_admin_cmd_write_record_start_entry {
+ le64 page_address;
+ le64 page_count;
+};
+
+struct virtio_admin_cmd_write_record_start_data {
+ le64 page_size;
+ le32 count;
+ u8 reserved[4];
+ struct virtio_admin_cmd_write_record_start_entry entries[];
+};
+
+\end{lstlisting}
+
+The \field{count} is set to the number of valid \field{entries}.
+The \field{page_address} indicates the start physical address of an entry.
+The \field{page_count} indicates the number of pages of size \field{page_size},
+starting from \field{page_address}, to record. All the \field{entries}
+are unique, non-overlapping page entries.
+Whenever the device writes to memory in a supplied address range, the
+device records the physical address of the page in which the write
+occurred. These write records can be read by the driver using the
+VIRTIO_ADMIN_CMD_DEV_WRITE_RECORDS_READ command.
+
+This command has no command specific result.
+
+\paragraph{Device Write Records Stop Command}
+\label{par:Basic Facilities of a Virtio Device / Device groups / Group administration commands / Device Migration / Device Write Record Stop Command}
+
+This command stops the write recording in the device which was
+previously started using the VIRTIO_ADMIN_CMD_DEV_WRITE_RECORDS_START command.
+
+For the command VIRTIO_ADMIN_CMD_DEV_WRITE_RECORDS_STOP, \field{opcode}
+is set to 0x11.
+The \field{group_member_id} refers to the member device to be accessed.
+
+This command does not have any command specific data.
+This command has no command specific result.
+
+\paragraph{Device Write Records Read Command}
+\label{par:Basic Facilities of a Virtio Device / Device groups / Group administration commands / Device Migration / Device Write Records Read Command}
+
+This command reads the device write records for which write recording was
+previously started using the VIRTIO_ADMIN_CMD_DEV_WRITE_RECORDS_START command.
+
+For the command VIRTIO_ADMIN_CMD_DEV_WRITE_RECORDS_READ, \field{opcode}
+is set to 0x12.
+The \field{group_member_id} refers to the member device to be accessed.
+
+\begin{lstlisting}
+struct virtio_admin_cmd_write_records_read_data {
+ le64 page_address;
+ le64 length;
+};
+
+struct virtio_admin_cmd_dev_write_records_cnt {
+ le32 count;
+};
+
+struct virtio_admin_cmd_dev_write_records_result {
+ le64 address_entries[];
+};
+\end{lstlisting}
+
+The \field{command_specific_data} is in the format
+\field{struct virtio_admin_cmd_write_records_read_data}. The driver
+sets \field{page_address} to the start page address of a range of up to
+\field{length} bytes. The supplied physical address range can be the
+same as or smaller than the range supplied by the driver when write recording was
+started using the VIRTIO_ADMIN_CMD_DEV_WRITE_RECORDS_START command. The \field{length}
+must be equal to, or a multiple of, one of the page sizes reported by the device in the
+\field{supported_page_size_bitmap}.
+
+When the command completes successfully, \field{command_specific_result} is in
+the format of \field{struct virtio_admin_cmd_dev_write_records_cnt}, containing the number
+of write records returned by the device, and \field{command_specific_result} is
+in the format of \field{struct virtio_admin_cmd_dev_write_records_result}.
+When the command completes successfully, the write records which are returned
+in the result are cleared from the device.
+
\devicenormative{\paragraph}{Device Migration}{Basic Facilities of a Virtio Device / Device groups / Group administration commands / Device Migration}
A device MUST either support all of, or none of
diff --git a/admin.tex b/admin.tex
index 142692c..41cabfe 100644
--- a/admin.tex
+++ b/admin.tex
@@ -140,7 +140,15 @@ \subsection{Group administration commands}\label{sec:Basic Facilities of a Virti
\hline
0x000d & VIRTIO_ADMIN_CMD_DEV_CTX_DISCARD & Clear the device context data \\
\hline
-0x000e - 0x7FFF & - & Commands using \field{struct virtio_admin_cmd} \\
+0x000f & VIRTIO_ADMIN_CMD_DEV_WRITE_RECORD_CAP_QUERY & Query write recording capabilities \\
+\hline
+0x0010 & VIRTIO_ADMIN_CMD_DEV_WRITE_RECORDS_START & Start write recording in the device \\
+\hline
+0x0011 & VIRTIO_ADMIN_CMD_DEV_WRITE_RECORDS_STOP & Stop write recording in the device \\
+\hline
+0x0012 & VIRTIO_ADMIN_CMD_DEV_WRITE_RECORDS_READ & Read and clear write records from the device \\
+\hline
+0x0013 - 0x7FFF & - & Commands using \field{struct virtio_admin_cmd} \\
\hline
0x8000 - 0xFFFF & - & Reserved for future commands (possibly using a different structure) \\
\hline
--
2.34.1
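
The read-and-clear flow described in the patch (the owner driver reads write records periodically until the device reports zero or a small number of them) can be sketched as a polling loop. This is only an illustration under stated assumptions: the patch does not define a C API, so the `write_records_read_fn` hook, `poll_write_records`, the threshold and round limit, and the `fake_dev` stand-in device are all hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical transport hook: issues one VIRTIO_ADMIN_CMD_DEV_WRITE_RECORDS_READ
 * for `length` bytes starting at page_address, stores up to `max` returned page
 * addresses in out, and returns the record count (negative on failure). */
typedef int64_t (*write_records_read_fn)(void *ctx, uint64_t page_address,
                                         uint64_t length, uint64_t *out,
                                         uint32_t max);

/* Reads records until one read returns at most `threshold` records.
 * Returns 1 when that happens, 0 if max_rounds was exhausted, -1 on error.
 * *total receives the number of records read (and so cleared in the device). */
static int poll_write_records(write_records_read_fn read_fn, void *ctx,
                              uint64_t page_address, uint64_t length,
                              uint64_t *buf, uint32_t buf_entries,
                              uint32_t threshold, unsigned int max_rounds,
                              uint64_t *total)
{
    unsigned int round;

    *total = 0;
    for (round = 0; round < max_rounds; round++) {
        int64_t n = read_fn(ctx, page_address, length, buf, buf_entries);

        if (n < 0)
            return -1;
        *total += (uint64_t)n;
        /* A real driver would re-copy the pages listed in buf[0..n) here. */
        if ((uint64_t)n <= threshold)
            return 1; /* few enough writes left to stop or freeze the device */
    }
    return 0;
}

/* Stand-in device for illustration only: returns a scripted record count per read. */
struct fake_dev {
    int64_t counts[4];
    unsigned int ncounts;
    unsigned int next;
};

static int64_t fake_read(void *ctx, uint64_t page_address, uint64_t length,
                         uint64_t *out, uint32_t max)
{
    struct fake_dev *dev = ctx;
    int64_t n = dev->next < dev->ncounts ? dev->counts[dev->next++] : 0;
    int64_t i;

    (void)length;
    for (i = 0; i < n && i < (int64_t)max; i++)
        out[i] = page_address + (uint64_t)i * 4096u;
    return n;
}
```

With a scripted device that returns 8, 4 and then 1 records, a threshold of 1 makes the loop finish on the third read having collected 13 records in total. What happens after the loop (setting the device mode, a final read, then the stop command) is left to the caller.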