* [virtio-comment] [PATCH] Introduce VIRTIO_F_ISOLATE_INDIRECT_DESC feature
@ 2022-10-13  7:45 Baptiste Afsa
From: Baptiste Afsa @ 2022-10-13  7:45 UTC (permalink / raw)
  To: virtio-comment; +Cc: Baptiste Afsa

When negotiated, this feature bit serves two purposes:

  - It instructs the driver to allocate every indirect descriptor on its
    own dedicated memory pages. A single memory page must only hold data
    for exactly one indirect descriptor.

  - It allows the host to unmap these pages from the guest address space
    while the indirect descriptor is available to the device.

This feature may cause extra memory consumption on the guest side but is
particularly helpful to support the memory isolation scheme described
below. However, note that this feature is not limited to this specific
use case and other applications may be considered.
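As an illustration of the driver-side allocation rule and of the memory
cost mentioned above, here is a minimal sketch in C. It assumes 4 KiB guest
pages and the 16-byte split-virtqueue descriptor layout; PAGE_SIZE and both
helper names are hypothetical and are not part of this patch or of any
existing driver. Plain uint64_t/uint32_t/uint16_t fields stand in for the
little-endian types used in the specification.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define PAGE_SIZE 4096u /* assumption: 4 KiB guest pages */

/* Split-virtqueue descriptor layout (16 bytes per entry). */
struct virtq_desc {
    uint64_t addr;
    uint32_t len;
    uint16_t flags;
    uint16_t next;
};

/* Bytes reserved for a table of n descriptors, rounded up to whole pages. */
static size_t indirect_table_alloc_size(size_t n)
{
    size_t bytes = n * sizeof(struct virtq_desc);

    return (bytes + PAGE_SIZE - 1) / PAGE_SIZE * PAGE_SIZE;
}

/*
 * Hypothetical helper: returns a zeroed, page-aligned table that occupies
 * whole pages, so that no other data can share those pages.
 * The caller releases it with free().
 */
static struct virtq_desc *alloc_isolated_indirect_table(size_t n)
{
    size_t size;
    void *p;

    /* Reject empty tables and sizes that would overflow the rounding. */
    if (n == 0 || n > (SIZE_MAX - PAGE_SIZE) / sizeof(struct virtq_desc))
        return NULL;

    size = indirect_table_alloc_size(n);
    assert(size % PAGE_SIZE == 0); /* required by aligned_alloc() */

    p = aligned_alloc(PAGE_SIZE, size); /* C11 */
    if (p)
        memset(p, 0, size);
    return p;
}
```

Under these assumptions, a table of 3 descriptors (48 bytes) consumes a
full 4096-byte page, which is the extra memory consumption referred to
above.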

The main idea is to keep the memory of the VM that runs the driver
isolated from the memory of the VM that runs the device, while still
allowing zero-copy transfers between the two domains. The hypervisor
maps the virtio buffers dynamically into the device VM address space.

In this model, the virtqueues shared with the device are not the
original virtqueues allocated by the driver but a copy maintained by the
hypervisor. The hypervisor copies the descriptors from the driver
virtqueues to these second virtqueues when the descriptors are made
available to the device, along with mapping the I/O buffers to the
device VM address space.

The use of this second set of virtqueues spares the device the need to
verify that the buffers are actually accessible to it, since the driver
cannot update these copies.

Dealing with indirect descriptors in this model brings additional issues
because the hypervisor needs to grant the device access to both the
indirect descriptor table and all the I/O buffers pointed to by this
table. However, a compromised guest can modify the table while the
device is using it, which may lead to faults in the device.

This problem could be solved by creating a copy of the indirect
descriptor table, as is done with other descriptors, but this approach
requires some sort of dynamic memory allocation in the hypervisor, which
might be problematic depending on the situation.

Using the VIRTIO_F_ISOLATE_INDIRECT_DESC feature allows the hypervisor
to unmap the indirect descriptor table from the guest address space
while the indirect descriptor is on the device side and guarantees that
it will not be modified while the device is using it.
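
To make the unmap step concrete, the sketch below shows one way a
hypervisor could compute which guest pages back an indirect descriptor
table before unmapping them. This is only an illustration under stated
assumptions: 4 KiB pages, the 16-byte split-virtqueue descriptor size, and
hypothetical names (struct page_range, indirect_table_pages) that do not
come from this patch or from any existing hypervisor. Rejecting a length
that is not a multiple of 16 is a choice made for this sketch.

```c
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SIZE 4096ull /* assumption: 4 KiB pages */
#define DESC_SIZE 16u     /* size of one split-virtqueue descriptor */

/* Hypothetical result type: guest-physical page range to unmap. */
struct page_range {
    uint64_t first_gpa; /* address of the first page */
    uint64_t npages;    /* number of consecutive pages */
};

/*
 * Compute the pages covering the indirect table [addr, addr + len).
 * Returns false if the table is empty, not a whole number of
 * descriptors, or wraps around the address space.
 */
static bool indirect_table_pages(uint64_t addr, uint32_t len,
                                 struct page_range *out)
{
    uint64_t first, last;

    if (len == 0 || len % DESC_SIZE != 0)
        return false;
    if (addr > UINT64_MAX - (len - 1))
        return false; /* addr + len - 1 would overflow */

    first = addr & ~(PAGE_SIZE - 1);
    last = (addr + len - 1) & ~(PAGE_SIZE - 1);

    out->first_gpa = first;
    out->npages = (last - first) / PAGE_SIZE + 1;
    return true;
}
```

Since the driver guarantees that these pages hold nothing but the table,
the whole range can be unmapped from the guest while the descriptor is on
the device side, and mapped back once it is returned as used.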
---
 content.tex | 26 ++++++++++++++++++++++++++
 1 file changed, 26 insertions(+)

diff --git a/content.tex b/content.tex
index e863709..20e17b7 100644
--- a/content.tex
+++ b/content.tex
@@ -6944,6 +6944,23 @@ \chapter{Reserved Feature Bits}\label{sec:Reserved Feature Bits}
   that the driver can reset a queue individually.
   See \ref{sec:Basic Facilities of a Virtio Device / Virtqueues / Virtqueue Reset}.
 
+  \item[VIRTIO_F_ISOLATE_INDIRECT_DESC(41)] This feature indicates that the
+  device requires the driver to allocate each indirect descriptor on its own
+  dedicated memory pages which MUST NOT hold any other data than this indirect
+  descriptor.
+
+  This allows the host to unmap these pages from the guest address space while
+  the indirect descriptors are available to the device. The device
+  implementation is therefore guaranteed that the driver cannot tamper with an
+  indirect descriptor table while the device is using it.
+
+  This mechanism notably enables a memory isolation scheme where the
+  virtio buffers are shared dynamically between the host and the guest as they
+  are exchanged through the virtqueues. It permits the host to share indirect
+  descriptor tables and all the associated buffers as is, without the need for
+  the device to verify that the buffers referenced in an indirect descriptor
+  table are actually accessible to the device.
+
 \end{description}
 
 \drivernormative{\section}{Reserved Feature Bits}{Reserved Feature Bits}
@@ -6980,6 +6997,12 @@ \chapter{Reserved Feature Bits}\label{sec:Reserved Feature Bits}
 
 A driver SHOULD accept VIRTIO_F_NOTIF_CONFIG_DATA if it is offered.
 
+A driver SHOULD accept VIRTIO_F_ISOLATE_INDIRECT_DESC if it is offered. If
+VIRTIO_F_ISOLATE_INDIRECT_DESC has been negotiated, a driver MUST NOT access the
+memory pages that contain an indirect descriptor after the indirect descriptor
+has been made available to the device and before it is returned as used;
+otherwise the resulting behavior is undefined.
+
 \devicenormative{\section}{Reserved Feature Bits}{Reserved Feature Bits}
 
 A device MUST offer VIRTIO_F_VERSION_1.  A device MAY fail to operate further
@@ -7009,6 +7032,9 @@ \chapter{Reserved Feature Bits}\label{sec:Reserved Feature Bits}
 and presents a PCI SR-IOV capability structure, otherwise
 it MUST NOT offer VIRTIO_F_SR_IOV.
 
+A device MAY fail to operate further if VIRTIO_F_ISOLATE_INDIRECT_DESC is not
+accepted.
+
 \section{Legacy Interface: Reserved Feature Bits}\label{sec:Reserved Feature Bits / Legacy Interface: Reserved Feature Bits}
 
 Transitional devices MAY offer the following:
-- 
2.38.0


This publicly archived list offers a means to provide input to the
OASIS Virtual I/O Device (VIRTIO) TC.

In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.

Subscribe: virtio-comment-subscribe@lists.oasis-open.org
Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
List help: virtio-comment-help@lists.oasis-open.org
List archive: https://lists.oasis-open.org/archives/virtio-comment/
Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
Committee: https://www.oasis-open.org/committees/virtio/
Join OASIS: https://www.oasis-open.org/join/

