All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2 0/1] introduce virtio-ism: internal shared memory device
@ 2022-12-23  8:13 Xuan Zhuo
  2022-12-23  8:13 ` [PATCH v2 1/1] virtio-ism: introduce new device virtio-ism Xuan Zhuo
  2023-01-25 12:55 ` [PATCH v2 0/1] introduce virtio-ism: internal shared memory device Wenjia Zhang
  0 siblings, 2 replies; 37+ messages in thread
From: Xuan Zhuo @ 2022-12-23  8:13 UTC (permalink / raw)
  To: virtio-dev
  Cc: hans, herongguang, zmlcc, dust.li, tonylu, zhenzao, helinguo,
	gerry, xuanzhuo, mst, cohuck, jasowang, Jan Kiszka, wintera,
	kgraul, wenjia, jaka, hca, twinkler, raspl

Hello everyone,

# Background

    Nowadays, there is a common scenario to accelerate communication between
    different VMs and containers, including light weight virtual machine based
    containers. One way to achieve this is to colocate them on the same host.
    However, the performance of inter-VM communication through network stack is
    not optimal and may also waste extra CPU cycles. This scenario has been
    discussed many times, but still no generic solution available [1] [2] [3].

    With pci-ivshmem + SMC(Shared Memory Communications: [4]) based PoC[5],
    We found that by changing the communication channel between VMs from TCP to
    SMC with shared memory, we can achieve superior performance for a common
    socket-based application[5]:
      - latency reduced by about 50%
      - throughput increased by about 300%
      - CPU consumption reduced by about 50%

    Since there is no particularly suitable shared memory management solution
    matches the need for SMC(See ## Comparison with existing technology), and
    virtio is the standard for communication in the virtualization world, we
    want to implement a virtio-ism device based on virtio, which can support
    on-demand memory sharing across VMs, containers or VM-container. To match
    the needs of SMC, the virtio-ism device need to support:

    1. Dynamic provision: shared memory regions are dynamically allocated and
       provisioned.
    2. Multi-region management: the shared memory is divided into regions,
       and a peer may allocate one or more regions from the same shared memory
       device.
    3. Permission control: the permission of each region can be set seperately.
    4. Dynamic connection: each ism region of a device can be shared with
       different devices, eventually a device can be shared with thousands of
       devices

# Virtio ISM device

    ISM devices provide the ability to share memory between different guests on
    a host. A guest's memory got from ism device can be shared with multiple
    peers at the same time. This shared relationship can be dynamically created
    and released.

    The shared memory obtained from the device is divided into multiple ism
    regions for share. ISM device provides a mechanism to notify other ism
    region referrers of content update events.

## Design

    This is a structure diagram based on ism sharing between two vms.

    |-------------------------------------------------------------------------------------------------------------|
    | |------------------------------------------------|       |------------------------------------------------| |
    | | Guest                                          |       | Guest                                          | |
    | |                                                |       |                                                | |
    | |   ----------------                             |       |   ----------------                             | |
    | |   |    driver    |     [M1]   [M2]   [M3]      |       |   |    driver    |             [M2]   [M3]     | |
    | |   ----------------       |      |      |       |       |   ----------------               |      |      | |
    | |    |cq|                  |map   |map   |map    |       |    |cq|                          |map   |map   | |
    | |    |  |                  |      |      |       |       |    |  |                          |      |      | |
    | |    |  |                -------------------     |       |    |  |                --------------------    | |
    | |----|--|----------------|  device memory  |-----|       |----|--|----------------|  device memory   |----| |
    | |    |  |                -------------------     |       |    |  |                --------------------    | |
    | |                                |               |       |                               |                | |
    | |                                |               |       |                               |                | |
    | | Qemu                           |               |       | Qemu                          |                | |
    | |--------------------------------+---------------|       |-------------------------------+----------------| |
    |                                  |                                                       |                  |
    |                                  |                                                       |                  |
    |                                  |------------------------------+------------------------|                  |
    |                                                                 |                                           |
    |                                                                 |                                           |
    |                                                   --------------------------                                |
    |                                                    | M1 |   | M2 |   | M3 |                                 |
    |                                                   --------------------------                                |
    |                                                                                                             |
    | HOST                                                                                                        |
    ---------------------------------------------------------------------------------------------------------------

## Inspiration

    Our design idea for virtio-ism comes from IBM's ISM device, to pay tribute,
    we directly name this device "ism".

    Information about IBM ism device and SMC:
      1. SMC reference: https://www.ibm.com/docs/en/zos/2.5.0?topic=system-shared-memory-communications
      2. SMC-Dv2 and ISMv2 introduction: https://www.newera.com/INFO/SMCv2_Introduction_10-15-2020.pdf
      3. ISM device: https://www.ibm.com/docs/en/linux-on-systems?topic=n-ism-device-driver-1
      4. SMC protocol (including SMC-D): https://www.ibm.com/support/pages/system/files/inline-files/IBM%20Shared%20Memory%20Communications%20Version%202_2.pdf
      5. SMC-D FAQ: https://www.ibm.com/support/pages/system/files/inline-files/2021-02-09-SMC-D-FAQ.pdf

## ISM VLAN

    Since SMC uses TCP to handshake with IP facilities, virtio-ism device is not
    bound to existing IP device, and the latest ISMv2 device doesn't require
    VLAN. So it is not necessary for virtio-ism to support VLAN attributes.

## Live Migration

    Currently SMC-D doesn't support migration to another device or fallback. And
    SMC-R supports migration to another link, no fallback.

    So we may not support live migration for the time being.

## About hot plugging of the ism device

    Hot plugging of devices is a heavier, possibly failed, time-consuming, and
    less scalable operation. So, we don't plan to support it for now.


# Usage (SMC as example)

    There is one of possible use cases:

    1. SMC calls the interface ism_alloc_region() of the ism driver to return
       the location of a memory region in the PCI space and a token.
    2. The ism driver mmap the memory region and return to SMC with the token
    3. SMC passes the token to the connected peer
    4. the peer calls the ism driver interface ism_attach_region(token) to
       get the location of the PCI space of the shared memory
    5. The connected pair communicating through the shared memory

# Comparison with existing technology

## ivshmem or ivshmem 2.0 of Qemu

   1. ivshmem 1.0 is a large piece of memory that can be seen by all devices
      that use this VM, so the security is not enough.

   2. ivshmem 2.0 is a shared memory belonging to a VM that can be read-only by
      all other VMs that use the ivshmem 2.0 shared memory device, which also
      does not meet our needs in terms of security.

## vhost-pci and virtiovhostuser

    1. does not support dynamic allocation
    2. one device just support connect to one vm


# POC CODE

There are no functions related to eventq and perm yet.

## Qemu (virtio ism device):

     https://github.com/fengidri/qemu/compare/7d66b74c4dd0d74d12c1d3d6de366242b13ed76d...ism-upstream-1216?expand=1

    Start qemu with option "--device virtio-ism-pci,disable-legacy=on, disable-modern=off".

##  Kernel (virtio ism driver and smc support):

     https://github.com/fengidri/linux-kernel-virtio-ism/compare/6f8101eb21bab480537027e62c4b17021fb7ea5d...ism-upstream-1223


### SMC

    Support SMC-D works with virtio-ism.

    Use SMC with virtio-ism to accelerate inter-VM communication.

    1. insmod virtio-ism and smc module.
    2. use smc-tools [1] to get the device name of SMC-D based on virtio-ism.

      $ smcd d # here is _virtio2_
      FID  Type  PCI-ID        PCHID  InUse  #LGs  PNET-ID
      0000 0     virtio2       0000   Yes       1  *C1

    3. add the nic and SMC-D device to the same pnet, do it in both client and server.

      $ smc_pnet -a -I eth1 c1 # use eth1 to setup SMC connection
      $ smc_pnet -a -D virtio2 c1 # virtio2 is the virtio-ism device

    4. use SMC to accelerate your application, smc_run in [1] can do this.

      # smc_run use LD_PRELOAD to hijack socket syscall with AF_SMC
      $ smc_run sockperf server --tcp # run in server
      $ smc_run sockperf tp --tcp -i a.b.c.d # run in client

    [1] https://github.com/ibm-s390-linux/smc-tools

    Notice: The current POC state, we only tested some basic functions.

### App inside user space

    The ism driver provide /dev/vismX interface, allow users to use Virtio-ISM
    device in user space directly.

    Try tools/virtio/virtio-ism/virtio-ism-mmap

    Usage:
         cd tools/virtio/virtio-ism/; make
         insmode virtio-ism.ko

    case1: communicate

       vm1: ./virtio-ism-mmap alloc -> token
       vm2: ./virtio-ism-mmap attach -t <token> --write-msg AAAA --commit

       vm2 will write msg to shared memory, then notify vm1. After vm1 receive
       notify, then read from shared memory.

    case2: ping-pong test.

        vm1: ./virtio-ism-mmap server
        vm2: ./virtio-ism-mmap -i 192.168.122.101 pp

        1. server alloc one ism region
        2. client get the token by tcp

        3. client commit(kick) to server, server recv notify, commit(kick) to client
        4. loop #3

    case3: throughput test.

        vm1: ./virtio-ism-mmap server
        vm2: ./virtio-ism-mmap -i 192.168.122.101 tp

        1. server alloc one ism region
        2. client get the token by tcp

        3. client write 1M data to ism region
        4. client commit(kick) to server
        5. server recv notify, copy the data, the commit(kick) back to client
        6. loop #3-#5

    case4: throughput test with protocol defined by user.

        vm1: ./virtio-ism-mmap server
        vm2: ./virtio-ism-mmap -i 192.168.122.101 tp --polling --tp-chunks 15 --msg-size 64k -n 50000

        Used the ism region as a ring.

        In this scene, client and server are in the polling mode. Test it on
        my machine, the maximum can reach 12GBps

# References

    [1] https://projectacrn.github.io/latest/tutorials/enable_ivshmem.html
    [2] https://dl.acm.org/doi/10.1145/2847562
    [3] https://hal.archives-ouvertes.fr/hal-00368622/document
    [4] https://lwn.net/Articles/711071/
    [5] https://lore.kernel.org/netdev/20220720170048.20806-1-tonylu@linux.alibaba.com/T/


If there are any problems, please point them out.
Hope to hear from you, thank you.

v2:
   1. add Attach/Detach event
   2. add Events Filter
   3. allow Alloc/Attach huge region
   4. remove host/guest terms

v1:
   1. cover letter adding explanation of ism vlan
   2. spec add gid
   3. explain the source of ideas about ism
   4. POC support virtio-ism-smc.ko virtio-ism-dev.ko and support virtio-ism-mmap


Xuan Zhuo (1):
  virtio-ism: introduce new device virtio-ism

 conformance.tex |  24 +++
 content.tex     |   1 +
 virtio-ism.tex  | 472 ++++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 497 insertions(+)
 create mode 100644 virtio-ism.tex

-- 
2.32.0.3.g01195cf9f


^ permalink raw reply	[flat|nested] 37+ messages in thread

* [PATCH v2 1/1] virtio-ism: introduce new device virtio-ism
  2022-12-23  8:13 [PATCH v2 0/1] introduce virtio-ism: internal shared memory device Xuan Zhuo
@ 2022-12-23  8:13 ` Xuan Zhuo
  2023-01-10 22:34   ` [virtio-dev] " Halil Pasic
  2023-01-25 12:55 ` [PATCH v2 0/1] introduce virtio-ism: internal shared memory device Wenjia Zhang
  1 sibling, 1 reply; 37+ messages in thread
From: Xuan Zhuo @ 2022-12-23  8:13 UTC (permalink / raw)
  To: virtio-dev
  Cc: hans, herongguang, zmlcc, dust.li, tonylu, zhenzao, helinguo,
	gerry, xuanzhuo, mst, cohuck, jasowang, Jan Kiszka, wintera,
	kgraul, wenjia, jaka, hca, twinkler, raspl

The virtio ism device provides and manages many memory ism regions in
host. These ism regions can be alloc/attach/detach by driver. Every
ism region can be shared by token with other VM after allocation.
The driver obtains the memory region on the host through the memory on
the device.

|-------------------------------------------------------------------------------------------------------------|
| |------------------------------------------------|       |------------------------------------------------| |
| | Guest                                          |       | Guest                                          | |
| |                                                |       |                                                | |
| |   ----------------                             |       |   ----------------                             | |
| |   |    driver    |     [M1]   [M2]   [M3]      |       |   |    driver    |             [M2]   [M3]     | |
| |   ----------------       |      |      |       |       |   ----------------               |      |      | |
| |    |cq|                  |map   |map   |map    |       |    |cq|                          |map   |map   | |
| |    |  |                  |      |      |       |       |    |  |                          |      |      | |
| |    |  |                -------------------     |       |    |  |                --------------------    | |
| |----|--|----------------|  device memory  |-----|       |----|--|----------------|  device memory   |----| |
| |    |  |                -------------------     |       |    |  |                --------------------    | |
| |                                |               |       |                               |                | |
| |                                |               |       |                               |                | |
| | Qemu                           |               |       | Qemu                          |                | |
| |--------------------------------+---------------|       |-------------------------------+----------------| |
|                                  |                                                       |                  |
|                                  |                                                       |                  |
|                                  |------------------------------+------------------------|                  |
|                                                                 |                                           |
|                                                                 |                                           |
|                                                   --------------------------                                |
|                                                    | M1 |   | M2 |   | M3 |                                 |
|                                                   --------------------------                                |
|                                                                                                             |
| HOST                                                                                                        |
---------------------------------------------------------------------------------------------------------------

Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>
Signed-off-by: Dust Li <dust.li@linux.alibaba.com>
Signed-off-by: Tony Lu <tonylu@linux.alibaba.com>
Signed-off-by: Helin Guo <helinguo@linux.alibaba.com>
Signed-off-by: Hans Zhang <hans@linux.alibaba.com>
Signed-off-by: He Rongguang <herongguang@linux.alibaba.com>
---
 conformance.tex |  24 +++
 content.tex     |   1 +
 virtio-ism.tex  | 472 ++++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 497 insertions(+)
 create mode 100644 virtio-ism.tex

diff --git a/conformance.tex b/conformance.tex
index c3c1d3e..7c3cbc3 100644
--- a/conformance.tex
+++ b/conformance.tex
@@ -335,6 +335,15 @@ \section{Conformance Targets}\label{sec:Conformance / Conformance Targets}
 \item \ref{drivernormative:Device Types / PMEM Device / Device Initialization}
 \end{itemize}
 
+\conformance{\subsection}{ISM Driver Conformance}\label{sec:Conformance / Driver Conformance / ISM Driver Conformance}
+
+A ISM driver MUST conform to the following normative statements:
+
+\begin{itemize}
+\item \ref{drivernormative:Device Types / ISM Device / Device Initialization}
+\item \ref{drivernormative:Device Types / ISM Device / Device Operation / Alloc ISM Region}
+\end{itemize}
+
 \conformance{\section}{Device Conformance}\label{sec:Conformance / Device Conformance}
 
 A device MUST conform to the following normative statements:
@@ -621,6 +630,21 @@ \section{Conformance Targets}\label{sec:Conformance / Conformance Targets}
 \item \ref{devicenormative:Device Types / PMEM Device / Device Operation / Virtqueue return}
 \end{itemize}
 
+\conformance{\subsection}{ISM Device Conformance}\label{sec:Conformance / Device Conformance / ISM Device Conformance}
+
+A ISM device MUST conform to the following normative statements:
+
+\begin{itemize}
+\item \ref{devicenormative:Device Types / ISM Device / Device configuration layout}
+\item \ref{devicenormative:Device Types / ISM Device / Device Initialization}
+\item \ref{devicenormative:Device Types / ISM Device / Device Operation / Alloc ISM Region}
+\item \ref{devicenormative:Device Types / ISM Device / Device Operation / Attach ISM Region}
+\item \ref{devicenormative:Device Types / ISM Device / Device Operation / Detach ISM Region}
+\item \ref{devicenormative:Device Types / ISM Device / Device Operation / Grant ISM Region}
+\item \ref{devicenormative:Device Types / ISM Device / Device Operation / Inform Event IRQ Vector}
+\item \ref{devicenormative:Device Types / ISM Device / Device Operation / Event Filter}
+\end{itemize}
+
 \conformance{\section}{Legacy Interface: Transitional Device and Transitional Driver Conformance}\label{sec:Conformance / Legacy Interface: Transitional Device and Transitional Driver Conformance}
 A conformant implementation MUST be either transitional or
 non-transitional, see \ref{intro:Legacy
diff --git a/content.tex b/content.tex
index 96f4723..fe02aec 100644
--- a/content.tex
+++ b/content.tex
@@ -7545,6 +7545,7 @@ \subsubsection{Legacy Interface: Framing Requirements}\label{sec:Device
 \input{virtio-scmi.tex}
 \input{virtio-gpio.tex}
 \input{virtio-pmem.tex}
+\input{virtio-ism.tex}
 
 \chapter{Reserved Feature Bits}\label{sec:Reserved Feature Bits}
 
diff --git a/virtio-ism.tex b/virtio-ism.tex
new file mode 100644
index 0000000..7f09c43
--- /dev/null
+++ b/virtio-ism.tex
@@ -0,0 +1,472 @@
+\section{ISM Device}\label{sec:Device Types / ISM Device}
+
+\begin{lstlisting}
+|-------------------------------------------------------------------------------------------------------------|
+| |------------------------------------------------|    |------------------------------------------------|    |
+| | VM                      [M1]   [M2]   [M3]     |    | VM                             [M2]   [M3]     |    |
+| |                          |      |      |       |    |                                 |      |       |    |
+| |   -----------------------|------|------|---    |    |   ------------------------------|------|---    |    |
+| |   |             driver   |      |      |  |    |    |   |             driver          |      |  |    |    |
+| |   -----------------------|------|------|---    |    |   ------------------------------|------|---    |    |
+| |    |cq|                  |map   |map   |map    |    |    |cq|                         |map   |map    |    |
+| |    |  |                  |      |      |       |    |    |  |                         |      |       |    |
+| |    |  |                -------------------     |    |    |  |                -------------------     |    |
+| |----|--|----------------|  device memory  |-----|    |----|--|----------------|  device memory  |-----|    |
+| |    |  |                -------------------     |    |    |  |                -------------------     |    |
+| |                                |               |    |                                |               |    |
+| |                                |               |    |                                |               |    |
+| |                                |               |    |                                |               |    |
+| |--------------------------------+---------------|    |--------------------------------+---------------|    |
+|                                  |                                                       |                  |
+|                                  |                                                       |                  |
+|                                  |------------------------------+------------------------|                  |
+|                                                                 |                                           |
+|                                                                 |                                           |
+|                                                   --------------------------                                |
+|                                                    | M1 |   | M2 |   | M3 |                                 |
+|                                                   --------------------------                                |
+|                                                                                                             |
+|                                                                                                             |
+|-------------------------------------------------------------------------------------------------------------|
+\end{lstlisting}
+
+ISM(Internal Shared Memory) device provides the ability to share memory between
+different VMs launched from the same entity. A vm's memory got from ISM device
+can be shared with multiple peers at the same time. This shared relationship can
+be dynamically created and released.
+
+The contiguous shared memory obtained from the device is divided into multiple
+ism regions for share.
+
+ISM device provides a mechanism to notify other ism region referrers of events.
+
+
+\subsection{Device ID}\label{sec:Device Types / ISM Device / Device ID}
+	44
+
+\subsection{Virtqueues}\label{sec:Device Types / ISM Device / Virtqueues}
+\begin{description}
+\item[0] controlq
+\item[1] eventq
+\end{description}
+
+\subsection{Device configuration layout}\label{sec:Device Types / ISM Device / Device configuration layout}
+
+\begin{lstlisting}
+struct virtio_ism_config {
+	le64 gid;
+	le64 devid;
+	le64 chunk_size;
+	le64 notify_size;
+};
+\end{lstlisting}
+
+\begin{description}
+    \item[\field{gid}] \field{gid} is used to identify different entity that
+        launches the VMs. Only the ism devices with the same \field{gid} can
+        share the ism regions. Therefore, this value is unique in the
+        world-wide.
+
+    \item[\field{devid}] the device id is used to identify different ism devices
+        on the same entity.
+
+    \item[\field{chunk_size}] the size of the every ism chunk.
+        Large shared memories are divided into multiple chunks, and one time
+        will take up at least one chunk.
+
+    \item[\field{notify_size}] the size of the notify address.
+\end{description}
+
+\devicenormative{\subsubsection}{Device configuration layout}{Device Types / ISM Device / Device configuration layout}
+
+The device MUST ensure that the \field{gid} on the same entity i
+same and different from the \field{gid} on other entity.
+
+On the same entity, the device MUST ensure that the \field{devid} is unique.
+
+\field{chunk_size} MUST be a power of two.
+
+\subsection{Event}\label{sec:Device Types / Network Device / Device Operation / Event}
+
+\begin{lstlisting}
+#define   VIRTIO_ISM_EVENT_UPDATE (1 << 0)
+#define   VIRTIO_ISM_EVENT_ATTACH (1 << 1)
+#define   VIRTIO_ISM_EVENT_DETACH (1 << 2)
+\end{lstlisting}
+
+\begin{description}
+    \item[VIRTIO_ISM_EVENT_UPDATE]
+        The driver kick the notify area to notify other peers of the update
+        event of the ism region content.
+
+    \item[VIRTIO_ISM_EVENT_ATTACH] A new device attaches the ism region.
+    \item[VIRTIO_ISM_EVENT_DETACH] A device detaches the ism region.
+\end{description}
+
+The ism device supports event notification of the ism region. When a device
+kick/attach/detach a region, other ism region referrers may receive related
+events.
+
+A buffer received from eventq can contain multiple event structures.
+
+\begin{lstlisting}
+struct virtio_ism_event_update {
+	le64 ev_type;
+	le64 offset;
+	le64 devid;
+};
+
+struct virtio_ism_event_attach_detach {
+	le64 ev_type;
+	le64 offset;
+	le64 devid;
+	le64 peers;
+};
+
+\end{lstlisting}
+
+\begin{description}
+\item[\field{ev_type}] The type of event, the driver can get the size of the
+    structure based on this.
+
+\item[\field{offset}] The offset of ism regions with the event.
+
+\item[\field{devid}] \field{devid} of the device that generated events.
+\item[\field{peers}] The number of the ism region referres (does not include the
+    device that receiving this event)
+
+\end{description}
+
+
+\subsection{Permissions}\label{sec:Device Types / Network Device / Device Operation / Permission}
+
+The permissions of a ism region determine whether this ism region can be
+attached and the read and write permissions after attach.
+
+The driver can set the default permissions, or set permissions for some certain
+devices.
+
+When a driver has the management permission of the ism region, then it can
+modify the permissions of this ism region. By default, only the device that
+allocated the ism region has this permission.
+
+\subsection{Device Initialization}\label{sec:Device Types / ISM Device / Device Initialization}
+
+\devicenormative{\subsubsection}{Device Initialization}{Device Types / ISM Device / Device Initialization}
+
+The device MUST generate a \field{devid}. \field{devid} remains unchanged
+during reset. \field{devid} MUST NOT be 0.
+
+The device shares memory to the driver based on shared memory regions
+\ref{sec:Basic Facilities of a Virtio Device / Shared Memory Regions}.
+However, it does not need to allocate physical memory during initialization.
+
+The \field{shmid} of a region MUST be one of the following
+\begin{lstlisting}
+enum virtio_ism_shm_id {
+	VIRTIO_ISM_SHM_ID_UNDEFINED = 0,
+	VIRTIO_ISM_SHM_ID_REGIONS   = 1,
+	VIRTIO_ISM_SHM_ID_NOTIFY    = 2,
+};
+\end{lstlisting}
+
+The shared memory whose shmid is VIRTIO_ISM_SHM_ID_REGIONS is used to implement
+ism regions. If there are multiple shared memories whose shmid is
+VIRTIO_ISM_SHM_ID_REGIONS, they are used as contiguous memory in the order of
+acquisition.
+
+The device MUST also provides a shared memory with VIRTIO_ISM_SHM_ID_NOTIFY to
+the driver. This memory area is used for notify, and each ism region MUST have a
+corresponding notify address inside this area, and the size of the notify
+address is \field{notify_size};
+
+\drivernormative{\subsubsection}{Device Initialization}{Device Types / ISM Device / Device Initialization}
+
+The driver MUST query all shared memory regions supported by the device.
+(see \ref{sec:Basic Facilities of a Virtio Device / Shared Memory Regions})
+
+
+\subsection{Control Virtqueue}\label{sec:Device Types / Network Device / Device Operation / Control Virtqueue}
+
+The driver uses the control virtqueue send commands to implement operations on
+the ism region and some global configurations.
+
+All commands are of the following form:
+\begin{lstlisting}
+struct virtio_ism_ctrl {
+	u8 class;
+	u8 command;
+	u8 command_specific_data[];
+	u8 ack;
+	u8 command_specific_data_reply[];
+};
+
+/* ack values */
+#define VIRTIO_ISM_OK      0
+#define VIRTIO_ISM_ERR     255
+
+#define VIRTIO_ISM_ENOENT  2
+#define VIRTIO_ISM_E2BIG   7
+#define VIRTIO_ISM_ENOMEM  12
+#define VIRTIO_ISM_ENOSPEC 28
+
+#define VIRTIO_ISM_PERM_EATTACH 100
+#define VIRTIO_ISM_PERM_EREAD   101
+#define VIRTIO_ISM_PERM_EWRITE  102
+\end{lstlisting}
+
+The \field{class}, \field{command} and command-specific-data are set by the
+driver, and the device sets the \field{ack} byte and optionally
+\field{command-specific-data-reply}.
+
+\subsection{Device Operation}  \label{sec:Device Types / ISM Driver / Device Operation}
+
+\subsubsection{Alloc ISM Region}\label{sec:Device Types / ISM Device / Device Operation / Alloc ISM Region}
+
+Based on controlq, the driver can request an ism region to be allocated.
+
+The ism region obtained from the device will carry a token, which can be passed
+to other VMs for attaching to this ism region.
+
+\begin{lstlisting}
+struct virtio_ism_area {
+	le64 offset;
+	le64 size;
+}
+
+struct virtio_ism_ctrl_alloc {
+	le64 size;
+};
+
+struct virtio_ism_ctrl_alloc_reply {
+	le64 token;
+	le64 num;
+    struct virtio_ism_area area[];
+};
+
+#define VIRTIO_ISM_CTRL_ALLOC  0
+	#define VIRTIO_ISM_CTRL_ALLOC_REGION 0
+\end{lstlisting}
+
+
+\devicenormative{\subparagraph}{Alloc ISM Region}{Device Types / ISM Device / Device Operation / Alloc ISM Region}
+
+The device sets \field{ack} to VIRTIO_ISM_OK after successfully assigning the
+physical ism region. At the same time, a new token MUST be dynamically created
+for this ism region. \field{offset} is the location of this ism region in shared
+memory.
+
+If there is no free chunk, the device MUST set \field{ack} to VIRTIO_ISM_NOSPEC.
+
+If new physical memory cannot be allocated, the device MUST set
+\field{ack} to VIRTIO_ISM_NOMEM.
+
+The device MUST clear the new ism region before committing to the driver.
+
+If \field{size} is greater than \field{chunk_size}, this time the distribution
+may occupy multiple chunks, then the device fills the offset and size of each
+chunk into \field {offset} \field {size}.
+
+If \field{size} is smaller than \field{chunk_size}, the ism region also
+occupies one chunk in the shared memory space.
+
+\field{num} is the number of the array \field{offset} and \field{size}.
+
+If virtio_ism_ctrl_alloc_reply is not enough, the device MUST set the \field{ack}
+to VIRTIO_ISM_E2BIG.
+
+\drivernormative{\subparagraph}{Alloc ISM Region}{Device Types / ISM Device / Device Operation / Alloc ISM Region}
+
+After the alloc request is successful, the driver MUST only use the range
+\field{offset} to \field{offset} + \field{size} - 1.
+
+\subsubsection{Attach ISM Region}\label{sec:Device Types / ISM Device / Device Operation / Attach ISM Region}
+
+Based on controlq, the driver can request to attach an ism region with a
+specified token.
+
+\begin{lstlisting}
+struct virtio_ism_ctrl_attach {
+	le64 token;
+	le64 rw_perm;
+};
+
+struct virtio_ism_ctrl_attach_reply {
+	le64 num;
+    struct virtio_ism_area area[];
+};
+
+#define VIRTIO_ISM_CTRL_ATTACH  1
+	#define VIRTIO_ISM_CTRL_ATTACH_REGION 0
+\end{lstlisting}
+\devicenormative{\subparagraph}{Attach ISM Region}{Device Types / ISM Device / Device Operation / Attach ISM Region}
+
+The device sets \field{ack} to VIRTIO_ISM_OK after successfully finding and
+assigning the physical ism region. \field{offset} is the location of this ism
+region in shared memory.
+
+If there is no free chunk, the device MUST set \field{ack} to VIRTIO_ISM_NOSPEC.
+
+If the ism region specified by \field{token} does not exist, the device MUST set
+\field{ack} to VIRTIO_ISM_ENOENT.
+
+If the device does not has the VIRTIO_ISM_PERM_ATTACH, the device MUST set
+\field{ack} to VIRTIO_ISM_PERM_EATTACH.
+
+If \field{rw_perm} include VIRTIO_ISM_PERM_READ, but the device does not
+ has the VIRTIO_ISM_PERM_READ for this region, the device MUST set \field{ack}
+ to VIRTIO_ISM_PERM_EREAD.
+
+If \field{rw_perm} include VIRTIO_ISM_PERM_WRITE, but the device does not
+ has the VIRTIO_ISM_PERM_WRITE for this region, the device MUST set \field{ack}
+ to VIRTIO_ISM_PERM_EWRITE.
+
+The device MUST set the read and write permissions of the physical memory based on
+\field{rw_perm}.
+
+If \field{size} is greater than \field{chunk_size}, this time the distribution
+may occupy multiple chunks, and the offset and size of each chunk fill in
+\field {offset} \field {size}.
+
+If \field{size} is smaller than \field{chunk_size}, the ism region also
+occupies \field{chunk_size} in the shared memory space.
+
+\field{num} is the number of the array \field{offset} and \field{size}.
+
+If virtio_ism_ctrl_alloc_reply is not enough, the device MUST set the \field{ack}
+to VIRTIO_ISM_E2BIG and set the \field{num} for next request.
+
+\subsubsection{Detach ISM Region}\label{sec:Device Types / ISM Device / Device Operation / Detach ISM Region}
+Based on controlq, the device can release references to the ism region.
+
+\begin{lstlisting}
+struct virtio_ism_ctrl_detach {
+	le64 token;
+};
+
+#define VIRTIO_ISM_CTRL_DETACH  2
+	#define VIRTIO_ISM_CTRL_DETACH_REGION 0
+\end{lstlisting}
+
+\devicenormative{\subparagraph}{Detach ISM Region}{Device Types / ISM Device / Device Operation / Detach ISM Region}
+
+If the ism region specified by \field{token} is not attached, the device MUST set
+\field{ack} to VIRTIO_ISM_ENOENT.
+
+The device MUST release the physical memory of this ism region.
+
+\subsubsection{Grant ISM Region}\label{sec:Device Types / ISM Device / Device Operation / Grant ISM Region}
+Based on controlq, the driver can set the permissions for each ism
+region.
+
+\begin{lstlisting}
+struct virtio_ism_ctrl_grant_default {
+	le64 token;
+	le64 permissions;
+};
+
+struct virtio_ism_ctrl_grant {
+	le64 token;
+	le64 permissions;
+	le64 peer_devid;
+};
+
+#define VIRTIO_ISM_CTRL_GRANT  3
+	#define VIRTIO_ISM_CTRL_GRANT_SET_DEFAULT    0
+	#define VIRTIO_ISM_CTRL_GRANT_SET_FOR_DEVICE 1
+
+#define VIRTIO_ISM_PERM_READ       (1 << 0)
+#define VIRTIO_ISM_PERM_WRITE      (1 << 1)
+
+#define VIRTIO_ISM_PERM_ATTACH     (1 << 2)
+
+#define VIRTIO_ISM_PERM_MANAGE     (1 << 3)
+#define VIRTIO_ISM_PERM_CLEAN_DEFAULT     (1 << 4)
+
+\end{lstlisting}
+
+\begin{description}
+\item[VIRTIO_ISM_PERM_READ] read permission
+\item[VIRTIO_ISM_PERM_WRITE] write permission
+\item[VIRTIO_ISM_PERM_ATTACH] attach permission
+
+\item[VIRTIO_ISM_PERM_MANAGE] Management permission, the device with this
+	permission can modify the permission of this ism region. By default, only
+	the device that allocated the region has this permission.
+
+\item[VIRTIO_ISM_PERM_DENY_DEFAULT] Clean all permissions of default, just used
+    with VIRTIO_ISM_CTRL_GRANT_SET_FOR_DEVICE.
+
+\end{description}
+
+Permission control is divided into two categories, one is the permission for the
+specified device, and the other is the default permission.
+
+\devicenormative{\subparagraph}{Grant ISM Region}{Device Types / ISM Device / Device Operation / Grant ISM Region}
+
+If the ism region specified by \field{token} is not attached, the device MUST set
+\field{ack} to VIRTIO_ISM_ENOENT.
+
+The device MUST respond to the driver's request based on the permissions the
+device has.
+
+The default permissions of the newly allocated ism region MUST be VIRTIO_ISM_PERM_READ | VIRTIO_ISM_PERM_WRITE | VIRTIO_ISM_PERM_ATTACH.
+The device that allocated the ism region MUST has the perm (VIRTIO_ISM_PERM_READ | VIRTIO_ISM_PERM_WRITE | VIRTIO_ISM_PERM_ATTACH | VIRTIO_ISM_PERM_MANAGE) for this region.
+
+\subsubsection{Inform Update Event IRQ Vector}\label{sec:Device Types / ISM Device / Device Operation / Inform Update Event IRQ Vector}
+
+For the region update event, the driver can choose to bind a interrupt for
+region. If successful, then this region's update event will no longer pass
+through eventq, but the interrupt of the binding is directly triggered.
+
+\begin{lstlisting}
+struct virtio_ism_ctrl_irq_vector {
+	le64 token;
+	le64 vector;
+};
+
+#define VIRTIO_ISM_CTRL_EVENT_VECTOR  4
+	#define VIRTIO_ISM_CTRL_EVENT_VECTOR_SET 0
+\end{lstlisting}
+
+
+\devicenormative{\subparagraph}{Inform Event IRQ Vector}{Device Types / ISM Device / Device Operation / Inform Event IRQ Vector}
+
+The device MUST record the relationship between the ism region and the vector
+notified by the driver, and notify the update event of this region to driver
+by the corresponding vector.
+
+
+\subsubsection{Events Filter}\label{sec:Device Types / ISM Device / Device Operation / Event Filter}
+
+The driver can filter the event to choose which events to receive. The driver
+can set the default filter for all regions, or set filter for some specified ism
+regions.
+
+\begin{lstlisting}
+struct virtio_ism_ctrl_events_filter_def {
+	le64 ev_type;
+};
+
+struct virtio_ism_ctrl_events_filter {
+	le64 ev_type;
+ 	le64 token;
+};
+
+#define VIRTIO_ISM_CTRL_EVENTS_FILTER  5
+	#define VIRTIO_ISM_CTRL_EVENTS_FILTER_SET_DEFAULT 0
+	#define VIRTIO_ISM_CTRL_EVENTS_FILTER_SET_REGION  1
+\end{lstlisting}
+
+The \field{ev_type} is a bitmask. See \ref{sec:Device Types / Network Device / Device Operation / Event}
+
+\devicenormative{\subparagraph}{Events Filter}{Device Types / ISM Device / Device Operation / Event Filter}
+
+After initialization of reset, the default \field{ev_type} MUST be VIRTIO_ISM_EVENT_UPDATE.
+
+The modification of the driver for the default events filter MUST NOT affect the region
+with a separate events filter.
+
+
+
+
-- 
2.32.0.3.g01195cf9f


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* Re: [virtio-dev] [PATCH v2 1/1] virtio-ism: introduce new device virtio-ism
  2022-12-23  8:13 ` [PATCH v2 1/1] virtio-ism: introduce new device virtio-ism Xuan Zhuo
@ 2023-01-10 22:34   ` Halil Pasic
  2023-01-11 11:08     ` Xuan Zhuo
  0 siblings, 1 reply; 37+ messages in thread
From: Halil Pasic @ 2023-01-10 22:34 UTC (permalink / raw)
  To: Xuan Zhuo
  Cc: virtio-dev, hans, herongguang, zmlcc, dust.li, tonylu, zhenzao,
	helinguo, gerry, mst, cohuck, jasowang, Jan Kiszka, wintera,
	kgraul, wenjia, jaka, hca, twinkler, raspl, Halil Pasic

On Fri, 23 Dec 2022 16:13:54 +0800
Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote:

> The virtio ism device provides and manages many memory ism regions in
> host. These ism regions can be alloc/attach/detach by driver. Every
[..]

Hi Xuan Zhou!

Some words in advance. While I'm supportive of the general idea, I find
the proposed specification quite difficult to read, and I have the
feeling I did not get the full picture. So my review will be a mix of
feedback on the tactical and of questions on the strategic level. Please
bear with me.

>  
> diff --git a/virtio-ism.tex b/virtio-ism.tex
> new file mode 100644
> index 0000000..7f09c43
> --- /dev/null
> +++ b/virtio-ism.tex
> @@ -0,0 +1,472 @@
> +\section{ISM Device}\label{sec:Device Types / ISM Device}
> +
> +\begin{lstlisting}
> +|-------------------------------------------------------------------------------------------------------------|
> +| |------------------------------------------------|    |------------------------------------------------|    |
> +| | VM                      [M1]   [M2]   [M3]     |    | VM                             [M2]   [M3]     |    |
> +| |                          |      |      |       |    |                                 |      |       |    |
> +| |   -----------------------|------|------|---    |    |   ------------------------------|------|---    |    |
> +| |   |             driver   |      |      |  |    |    |   |             driver          |      |  |    |    |
> +| |   -----------------------|------|------|---    |    |   ------------------------------|------|---    |    |
> +| |    |cq|                  |map   |map   |map    |    |    |cq|                         |map   |map    |    |
> +| |    |  |                  |      |      |       |    |    |  |                         |      |       |    |
> +| |    |  |                -------------------     |    |    |  |                -------------------     |    |
> +| |----|--|----------------|  device memory  |-----|    |----|--|----------------|  device memory  |-----|    |
> +| |    |  |                -------------------     |    |    |  |                -------------------     |    |
> +| |                                |               |    |                                |               |    |
> +| |                                |               |    |                                |               |    |
> +| |                                |               |    |                                |               |    |
> +| |--------------------------------+---------------|    |--------------------------------+---------------|    |
> +|                                  |                                                       |                  |
> +|                                  |                                                       |                  |
> +|                                  |------------------------------+------------------------|                  |
> +|                                                                 |                                           |
> +|                                                                 |                                           |
> +|                                                   --------------------------                                |
> +|                                                    | M1 |   | M2 |   | M3 |                                 |
> +|                                                   --------------------------                                |
> +|                                                                                                             |
> +|                                                                                                             |
> +|-------------------------------------------------------------------------------------------------------------|
> +\end{lstlisting}
> +
> +ISM(Internal Shared Memory) device provides the ability to share memory between
> +different VMs launched from the same entity.

Launched by instead of from? Maybe introduce a catchy name for the
"entity that launched the VMs" and prevent oversimplification by
explaining any shortcomings of the name if any in one place. Host would
be one candidate, VMM another.

> A vm's memory got from ISM device
> +can be shared with multiple peers at the same time. This shared relationship can

s/at the same time/simultaneously/ ?

> +be dynamically created and released.

"This shared relationship" does not sound right, but I have no proposal
out of the top of my head.

> +
> +The contiguous shared memory obtained from the device is divided into multiple
> +ism regions for share.

What does "for share" mean here? I don't quite understand this sentence.

> +
> +ISM device provides a mechanism to notify other ism region referrers of events.
> +
> +
> +\subsection{Device ID}\label{sec:Device Types / ISM Device / Device ID}
> +	44
> +
> +\subsection{Virtqueues}\label{sec:Device Types / ISM Device / Virtqueues}
> +\begin{description}
> +\item[0] controlq
> +\item[1] eventq
> +\end{description}
> +
> +\subsection{Device configuration layout}\label{sec:Device Types / ISM Device / Device configuration layout}
> +
> +\begin{lstlisting}
> +struct virtio_ism_config {
> +	le64 gid;
> +	le64 devid;
> +	le64 chunk_size;
> +	le64 notify_size;
> +};
> +\end{lstlisting}
> +
> +\begin{description}
> +    \item[\field{gid}] \field{gid} is used to identify different entity that
> +        launches the VMs.

What does gid stand for? 

s/different/the/ ?

I can't figure out what is "different" supposed to mean here.

>Only the ism devices with the same \field{gid} can

s/the ism/ism/ ?
> +        share the ism regions. Therefore, this value is unique in the

/s/share the/share/ ?

> +        world-wide.

Makes no sense to me. 

We could call gid host_id. We could also say that a host_id is
world-wide unique in a sense that there must not be another host
with the same host_id.

That of course raises the question, how do the different implementations
ensure that the gid (or hostid) remains unique. I guess therefore this
specification needs to specify how such unique gids are generated.

> +
> +    \item[\field{devid}] the device id is used to identify different ism devices

s/uniquely identify an ism device within the scope of the host. I.e.
devices attached to VMs on the same host must have different
\filed{devid} values, while devices attached to VMs that are hosted by
different hosts may have the same \field{devid} values.

> +        on the same entity.
> +
> +    \item[\field{chunk_size}] the size of the every ism chunk.

What is a chunk? Fist introduced here. Please use consistent wording.

> +        Large shared memories are divided into multiple chunks, and one time

"Memories" sounds wrong here. I guess what you used to call "regions"
you now call "chunks". But I may be wrong.

> +        will take up at least one chunk.
> +
> +    \item[\field{notify_size}] the size of the notify address.

The term "notify address" ain't properly introduced.

> +\end{description}
> +
> +\devicenormative{\subsubsection}{Device configuration layout}{Device Types / ISM Device / Device configuration layout}
> +
> +The device MUST ensure that the \field{gid} on the same entity i

s/i$/is$/

> +same and different from the \field{gid} on other entity.

How is the device supposed to know what is "the \field{gid} on other
entity"? See my previous comment.

> +
> +On the same entity, the device MUST ensure that the \field{devid} is unique.
> +
> +\field{chunk_size} MUST be a power of two.
> +
> +\subsection{Event}\label{sec:Device Types / Network Device / Device Operation / Event}
> +
> +\begin{lstlisting}
> +#define   VIRTIO_ISM_EVENT_UPDATE (1 << 0)
> +#define   VIRTIO_ISM_EVENT_ATTACH (1 << 1)
> +#define   VIRTIO_ISM_EVENT_DETACH (1 << 2)
> +\end{lstlisting}
> +
> +\begin{description}
> +    \item[VIRTIO_ISM_EVENT_UPDATE]
> +        The driver kick the notify area to notify other peers of the update
> +        event of the ism region content.
> +
> +    \item[VIRTIO_ISM_EVENT_ATTACH] A new device attaches the ism region.
> +    \item[VIRTIO_ISM_EVENT_DETACH] A device detaches the ism region.
> +\end{description}
> +
> +The ism device supports event notification of the ism region. When a device
> +kick/attach/detach a region, other ism region referrers may receive related
> +events.
> +

Is "may" what we want to use here? This sounds like the referrers may
not rely on receiving these events, because they don't have the
guarantee they will receive any.

> +A buffer received from eventq can contain multiple event structures.
> +
> +\begin{lstlisting}
> +struct virtio_ism_event_update {
> +	le64 ev_type;
> +	le64 offset;
> +	le64 devid;
> +};
> +
> +struct virtio_ism_event_attach_detach {
> +	le64 ev_type;
> +	le64 offset;
> +	le64 devid;
> +	le64 peers;
> +};
> +
> +\end{lstlisting}
> +
> +\begin{description}
> +\item[\field{ev_type}] The type of event, the driver can get the size of the
> +    structure based on this.
> +
> +\item[\field{offset}] The offset of ism regions with the event.

Offset with respect to what?

> +
> +\item[\field{devid}] \field{devid} of the device that generated events.
> +\item[\field{peers}] The number of the ism region referres (does not include the
> +    device that receiving this event)
> +
> +\end{description}
> +
> +
> +\subsection{Permissions}\label{sec:Device Types / Network Device / Device Operation / Permission}
> +
> +The permissions of a ism region determine whether this ism region can be
> +attached and the read and write permissions after attach.
> +
> +The driver can set the default permissions, or set permissions for some certain
> +devices.

What does "default permissions" and "some certain devices" mean here?

> +
> +When a driver has the management permission of the ism region, then it can
> +modify the permissions of this ism region. By default, only the device that
> +allocated the ism region has this permission.
> +
> +\subsection{Device Initialization}\label{sec:Device Types / ISM Device / Device Initialization}
> +
> +\devicenormative{\subsubsection}{Device Initialization}{Device Types / ISM Device / Device Initialization}
> +
> +The device MUST generate a \field{devid}. \field{devid} remains unchanged
> +during reset. \field{devid} MUST NOT be 0.
> +
> +The device shares memory to the driver based on shared memory regions
> +\ref{sec:Basic Facilities of a Virtio Device / Shared Memory Regions}.
> +However, it does not need to allocate physical memory during initialization.
> +
> +The \field{shmid} of a region MUST be one of the following
> +\begin{lstlisting}
> +enum virtio_ism_shm_id {
> +	VIRTIO_ISM_SHM_ID_UNDEFINED = 0,
> +	VIRTIO_ISM_SHM_ID_REGIONS   = 1,
> +	VIRTIO_ISM_SHM_ID_NOTIFY    = 2,
> +};
> +\end{lstlisting}
> +
> +The shared memory whose shmid is VIRTIO_ISM_SHM_ID_REGIONS is used to implement
> +ism regions. 

Hm. AFAIU, these are supposed to be used for exchanging data between the
VMs that are supposed to communicate with each other via ism. 

How does this relate to the following normative section:
"""
2.10.2 Device Requirements: Shared Memory Regions

Shared memory regions MUST NOT expose shared memory regions which are used to control the operation of the device, nor to stream data.
"""

> If there are multiple shared memories whose shmid is
> +VIRTIO_ISM_SHM_ID_REGIONS, they are used as contiguous memory in the order of
> +acquisition.

Hm, I used to think that the shmid is an unique identifier for a shared
memory region. How can multiple virtio shared memory regions have the
same shmid?

Can you have a look at the MMIO interface for virtio shared memory
regions.

How far what ISM tries to do consistent with virtio shared memory. I
mean, AFAIU once the shared memory is advertised, the  shared memory is
supposed to be there and accessible by the driver for both writes and
reads. But for ISM there is this allocation and attach/detach and
permissions.

What about "Memory consistency rules vary depending on the region and
the device and they will be specified as required by each device." form
"2.10 Shared Memory Regions".

> +
> +The device MUST also provides a shared memory with VIRTIO_ISM_SHM_ID_NOTIFY to
> +the driver. This memory area is used for notify, and each ism region MUST have a
> +corresponding notify address inside this area, and the size of the notify
> +address is \field{notify_size};
> +
> +\drivernormative{\subsubsection}{Device Initialization}{Device Types / ISM Device / Device Initialization}
> +
> +The driver MUST query all shared memory regions supported by the device.
> +(see \ref{sec:Basic Facilities of a Virtio Device / Shared Memory Regions})
> +
> +
> +\subsection{Control Virtqueue}\label{sec:Device Types / Network Device / Device Operation / Control Virtqueue}
> +
> +The driver uses the control virtqueue send commands to implement operations on
> +the ism region and some global configurations.
> +
> +All commands are of the following form:
> +\begin{lstlisting}
> +struct virtio_ism_ctrl {
> +	u8 class;
> +	u8 command;
> +	u8 command_specific_data[];
> +	u8 ack;
> +	u8 command_specific_data_reply[];
> +};
> +
> +/* ack values */
> +#define VIRTIO_ISM_OK      0
> +#define VIRTIO_ISM_ERR     255
> +
> +#define VIRTIO_ISM_ENOENT  2
> +#define VIRTIO_ISM_E2BIG   7
> +#define VIRTIO_ISM_ENOMEM  12
> +#define VIRTIO_ISM_ENOSPEC 28
> +
> +#define VIRTIO_ISM_PERM_EATTACH 100
> +#define VIRTIO_ISM_PERM_EREAD   101
> +#define VIRTIO_ISM_PERM_EWRITE  102
> +\end{lstlisting}
> +
> +The \field{class}, \field{command} and command-specific-data are set by the
> +driver, and the device sets the \field{ack} byte and optionally
> +\field{command-specific-data-reply}.

Where are values and their semantic for \field{command} specified? I see
no more references to \field{command}. Similarly I see no further
references the fields \field{command-specific-data} and
\field{command-specific-data-reply}.

[..]

I will continue with the review from here. I didn't have the opportunity
to look at the PoC implementation. Maybe it will be easier to get through
the rest of this text once I have a better understanding of the part up
till now.

Regards,
Halil


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [virtio-dev] [PATCH v2 1/1] virtio-ism: introduce new device virtio-ism
  2023-01-10 22:34   ` [virtio-dev] " Halil Pasic
@ 2023-01-11 11:08     ` Xuan Zhuo
  2023-01-11 15:11       ` Halil Pasic
                         ` (4 more replies)
  0 siblings, 5 replies; 37+ messages in thread
From: Xuan Zhuo @ 2023-01-11 11:08 UTC (permalink / raw)
  To: Halil Pasic
  Cc: virtio-dev, hans, herongguang, zmlcc, dust.li, tonylu, zhenzao,
	helinguo, gerry, mst, cohuck, jasowang, Jan Kiszka, wintera,
	kgraul, wenjia, jaka, hca, twinkler, raspl, Halil Pasic

On Tue, 10 Jan 2023 23:34:01 +0100, Halil Pasic <pasic@linux.ibm.com> wrote:
> On Fri, 23 Dec 2022 16:13:54 +0800
> Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote:
>
> > The virtio ism device provides and manages many memory ism regions in
> > host. These ism regions can be alloc/attach/detach by driver. Every
> [..]
>
> Hi Xuan Zhou!
>
> Some words in advance. While I'm supportive of the general idea, I find
> the proposed specification quite difficult to read, and I have the
> feeling I did not get the full picture. So my review will be a mix of
> feedback on the tactical and of questions on the strategic level. Please
> bear with me.

Any opinion is welcome. ^_^

>
> >
> > diff --git a/virtio-ism.tex b/virtio-ism.tex
> > new file mode 100644
> > index 0000000..7f09c43
> > --- /dev/null
> > +++ b/virtio-ism.tex
> > @@ -0,0 +1,472 @@
> > +\section{ISM Device}\label{sec:Device Types / ISM Device}
> > +
> > +\begin{lstlisting}
> > +|-------------------------------------------------------------------------------------------------------------|
> > +| |------------------------------------------------|    |------------------------------------------------|    |
> > +| | VM                      [M1]   [M2]   [M3]     |    | VM                             [M2]   [M3]     |    |
> > +| |                          |      |      |       |    |                                 |      |       |    |
> > +| |   -----------------------|------|------|---    |    |   ------------------------------|------|---    |    |
> > +| |   |             driver   |      |      |  |    |    |   |             driver          |      |  |    |    |
> > +| |   -----------------------|------|------|---    |    |   ------------------------------|------|---    |    |
> > +| |    |cq|                  |map   |map   |map    |    |    |cq|                         |map   |map    |    |
> > +| |    |  |                  |      |      |       |    |    |  |                         |      |       |    |
> > +| |    |  |                -------------------     |    |    |  |                -------------------     |    |
> > +| |----|--|----------------|  device memory  |-----|    |----|--|----------------|  device memory  |-----|    |
> > +| |    |  |                -------------------     |    |    |  |                -------------------     |    |
> > +| |                                |               |    |                                |               |    |
> > +| |                                |               |    |                                |               |    |
> > +| |                                |               |    |                                |               |    |
> > +| |--------------------------------+---------------|    |--------------------------------+---------------|    |
> > +|                                  |                                                       |                  |
> > +|                                  |                                                       |                  |
> > +|                                  |------------------------------+------------------------|                  |
> > +|                                                                 |                                           |
> > +|                                                                 |                                           |
> > +|                                                   --------------------------                                |
> > +|                                                    | M1 |   | M2 |   | M3 |                                 |
> > +|                                                   --------------------------                                |
> > +|                                                                                                             |
> > +|                                                                                                             |
> > +|-------------------------------------------------------------------------------------------------------------|
> > +\end{lstlisting}
> > +
> > +ISM(Internal Shared Memory) device provides the ability to share memory between
> > +different VMs launched from the same entity.
>
> Launched by instead of from? Maybe introduce a catchy name for the
> "entity that launched the VMs" and prevent oversimplification by
> explaining any shortcomings of the name if any in one place. Host would
> be one candidate, VMM another.

Cornelia Huck wrote:

	Is there a way to avoid the term "host" (throughout this document)?
	IIUC, you need the uniqueness within the scope of the entity that
	launches the different instances that get shared access to the regions
	(which could conceivably a unit of hardware?)

And I think she is right, so I am trying to remove the term HOST.

Do you have better opinions? I think VMM is not particularly suitable.

>
> > A vm's memory got from ISM device
> > +can be shared with multiple peers at the same time. This shared relationship can
>
> s/at the same time/simultaneously/ ?

Yes

>
> > +be dynamically created and released.
>
> "This shared relationship" does not sound right, but I have no proposal
> out of the top of my head.
>
> > +
> > +The contiguous shared memory obtained from the device is divided into multiple
> > +ism regions for share.
>
> What does "for share" mean here? I don't quite understand this sentence.


Maybe s/for share//


>
> > +
> > +ISM device provides a mechanism to notify other ism region referrers of events.
> > +
> > +
> > +\subsection{Device ID}\label{sec:Device Types / ISM Device / Device ID}
> > +	44
> > +
> > +\subsection{Virtqueues}\label{sec:Device Types / ISM Device / Virtqueues}
> > +\begin{description}
> > +\item[0] controlq
> > +\item[1] eventq
> > +\end{description}
> > +
> > +\subsection{Device configuration layout}\label{sec:Device Types / ISM Device / Device configuration layout}
> > +
> > +\begin{lstlisting}
> > +struct virtio_ism_config {
> > +	le64 gid;
> > +	le64 devid;
> > +	le64 chunk_size;
> > +	le64 notify_size;
> > +};
> > +\end{lstlisting}
> > +
> > +\begin{description}
> > +    \item[\field{gid}] \field{gid} is used to identify different entity that
> > +        launches the VMs.
>
> What does gid stand for?
>
> s/different/the/ ?

Will fix.

>
> I can't figure out what is "different" supposed to mean here.
>
> >Only the ism devices with the same \field{gid} can
>
> s/the ism/ism/ ?

Will fix.

> > +        share the ism regions. Therefore, this value is unique in the
>
> /s/share the/share/ ?

Will fix.

>
> > +        world-wide.
>
> Makes no sense to me.
>
> We could call gid host_id.

Yes, this has been discussed in the discussion with Alexandra.

> We could also say that a host_id is
> world-wide unique in a sense that there must not be another host
> with the same host_id.
>
> That of course raises the question, how do the different implementations
> ensure that the gid (or hostid) remains unique. I guess therefore this
> specification needs to specify how such unique gids are generated.

Yes, we currently have several solutions. The most suitable may be to use UUID
and record it. In this way, all VMMs can use this UUID. But this requires that
the host_id is 128-bit.


>
> > +
> > +    \item[\field{devid}] the device id is used to identify different ism devices
>
> s/uniquely identify an ism device within the scope of the host. I.e.
> devices attached to VMs on the same host must have different
> \filed{devid} values, while devices attached to VMs that are hosted by
> different hosts may have the same \field{devid} values.

Will fix.

>
> > +        on the same entity.
> > +
> > +    \item[\field{chunk_size}] the size of the every ism chunk.
>
> What is a chunk? Fist introduced here. Please use consistent wording.

Will fix.

>
> > +        Large shared memories are divided into multiple chunks, and one time
>
> "Memories" sounds wrong here. I guess what you used to call "regions"
> you now call "chunks". But I may be wrong.

A ism region consists of multiple chunks. The size of each chunk is fixed.
The size of ism region is not fixed.

>
> > +        will take up at least one chunk.
> > +
> > +    \item[\field{notify_size}] the size of the notify address.
>
> The term "notify address" ain't properly introduced.

Will fix.

>
> > +\end{description}
> > +
> > +\devicenormative{\subsubsection}{Device configuration layout}{Device Types / ISM Device / Device configuration layout}
> > +
> > +The device MUST ensure that the \field{gid} on the same entity i
>
> s/i$/is$/

Will fix.


>
> > +same and different from the \field{gid} on other entity.
>
> How is the device supposed to know what is "the \field{gid} on other
> entity"? See my previous comment.

It is guaranteed by the algorithm. The use of a unified algorithm on the
implementation of VMM can be implemented. Considering that the VM of different
manufacturers may exchange Host-Id, we should define an algorithm that generates
the host-id in this standard.


>
> > +
> > +On the same entity, the device MUST ensure that the \field{devid} is unique.
> > +
> > +\field{chunk_size} MUST be a power of two.
> > +
> > +\subsection{Event}\label{sec:Device Types / Network Device / Device Operation / Event}
> > +
> > +\begin{lstlisting}
> > +#define   VIRTIO_ISM_EVENT_UPDATE (1 << 0)
> > +#define   VIRTIO_ISM_EVENT_ATTACH (1 << 1)
> > +#define   VIRTIO_ISM_EVENT_DETACH (1 << 2)
> > +\end{lstlisting}
> > +
> > +\begin{description}
> > +    \item[VIRTIO_ISM_EVENT_UPDATE]
> > +        The driver kick the notify area to notify other peers of the update
> > +        event of the ism region content.
> > +
> > +    \item[VIRTIO_ISM_EVENT_ATTACH] A new device attaches the ism region.
> > +    \item[VIRTIO_ISM_EVENT_DETACH] A device detaches the ism region.
> > +\end{description}
> > +
> > +The ism device supports event notification of the ism region. When a device
> > +kick/attach/detach a region, other ism region referrers may receive related
> > +events.
> > +
>
> Is "may" what we want to use here? This sounds like the referrers may
> not rely on receiving these events, because they don't have the
> guarantee they will receive any.


Will fix.

>
> > +A buffer received from eventq can contain multiple event structures.
> > +
> > +\begin{lstlisting}
> > +struct virtio_ism_event_update {
> > +	le64 ev_type;
> > +	le64 offset;
> > +	le64 devid;
> > +};
> > +
> > +struct virtio_ism_event_attach_detach {
> > +	le64 ev_type;
> > +	le64 offset;
> > +	le64 devid;
> > +	le64 peers;
> > +};
> > +
> > +\end{lstlisting}
> > +
> > +\begin{description}
> > +\item[\field{ev_type}] The type of event, the driver can get the size of the
> > +    structure based on this.
> > +
> > +\item[\field{offset}] The offset of ism regions with the event.
>
> Offset with respect to what?

Used to specify a region. Offset is the position of this ISM Region inside the
memory of Device.


>
> > +
> > +\item[\field{devid}] \field{devid} of the device that generated events.
> > +\item[\field{peers}] The number of the ism region referres (does not include the
> > +    device that receiving this event)
> > +
> > +\end{description}
> > +
> > +
> > +\subsection{Permissions}\label{sec:Device Types / Network Device / Device Operation / Permission}
> > +
> > +The permissions of a ism region determine whether this ism region can be
> > +attached and the read and write permissions after attach.
> > +
> > +The driver can set the default permissions, or set permissions for some certain
> > +devices.
>
> What does "default permissions" and "some certain devices" mean here?

"default permissions": This can be understood as the permissions for all devices.
"some certain devices": This is the permissions set for a specific device.

>
> > +
> > +When a driver has the management permission of the ism region, then it can
> > +modify the permissions of this ism region. By default, only the device that
> > +allocated the ism region has this permission.
> > +
> > +\subsection{Device Initialization}\label{sec:Device Types / ISM Device / Device Initialization}
> > +
> > +\devicenormative{\subsubsection}{Device Initialization}{Device Types / ISM Device / Device Initialization}
> > +
> > +The device MUST generate a \field{devid}. \field{devid} remains unchanged
> > +during reset. \field{devid} MUST NOT be 0.
> > +
> > +The device shares memory to the driver based on shared memory regions
> > +\ref{sec:Basic Facilities of a Virtio Device / Shared Memory Regions}.
> > +However, it does not need to allocate physical memory during initialization.
> > +
> > +The \field{shmid} of a region MUST be one of the following
> > +\begin{lstlisting}
> > +enum virtio_ism_shm_id {
> > +	VIRTIO_ISM_SHM_ID_UNDEFINED = 0,
> > +	VIRTIO_ISM_SHM_ID_REGIONS   = 1,
> > +	VIRTIO_ISM_SHM_ID_NOTIFY    = 2,
> > +};
> > +\end{lstlisting}
> > +
> > +The shared memory whose shmid is VIRTIO_ISM_SHM_ID_REGIONS is used to implement
> > +ism regions.
>
> Hm. AFAIU, these are supposed to be used for exchanging data between the
> VMs that are supposed to communicate with each other via ism.
>
> How does this relate to the following normative section:
> """
> 2.10.2 Device Requirements: Shared Memory Regions
>
> Shared memory regions MUST NOT expose shared memory regions which are used to control the operation of the device, nor to stream data.
> """

From https://lists.oasis-open.org/archives/virtio-comment/201906/msg00010.html

	I'm trying to find a way to say that people shouldn't side-step virtio
	queues by putting everything in a big blob of shared memory.

So I think this should not be conflict.

>
> > If there are multiple shared memories whose shmid is
> > +VIRTIO_ISM_SHM_ID_REGIONS, they are used as contiguous memory in the order of
> > +acquisition.
>
> Hm, I used to think that the shmid is an unique identifier for a shared
> memory region. How can multiple virtio shared memory regions have the
> same shmid?

Shmid is used to define what this sharing memory is used to do, not its unique
identifier.

This is the point that device provides multiple shared memory areas.
These memory areas are used to provide ISM Region.

>
> Can you have a look at the MMIO interface for virtio shared memory
> regions.

Is there any special attention?


>
> How far what ISM tries to do consistent with virtio shared memory. I
> mean, AFAIU once the shared memory is advertised, the  shared memory is
> supposed to be there and accessible by the driver for both writes and
> reads.

Before allocation, VMM did not allocate physical memory for it. Therefore, it
was illegal to read and write this memory.

> But for ISM there is this allocation and attach/detach and
> permissions.

Alloc/Attach is to let the device allocate or bind physical memory.

>
> What about "Memory consistency rules vary depending on the region and
> the device and they will be specified as required by each device." form
> "2.10 Shared Memory Regions".
>
> > +
> > +The device MUST also provides a shared memory with VIRTIO_ISM_SHM_ID_NOTIFY to
> > +the driver. This memory area is used for notify, and each ism region MUST have a
> > +corresponding notify address inside this area, and the size of the notify
> > +address is \field{notify_size};
> > +
> > +\drivernormative{\subsubsection}{Device Initialization}{Device Types / ISM Device / Device Initialization}
> > +
> > +The driver MUST query all shared memory regions supported by the device.
> > +(see \ref{sec:Basic Facilities of a Virtio Device / Shared Memory Regions})
> > +
> > +
> > +\subsection{Control Virtqueue}\label{sec:Device Types / Network Device / Device Operation / Control Virtqueue}
> > +
> > +The driver uses the control virtqueue send commands to implement operations on
> > +the ism region and some global configurations.
> > +
> > +All commands are of the following form:
> > +\begin{lstlisting}
> > +struct virtio_ism_ctrl {
> > +	u8 class;
> > +	u8 command;
> > +	u8 command_specific_data[];
> > +	u8 ack;
> > +	u8 command_specific_data_reply[];
> > +};
> > +
> > +/* ack values */
> > +#define VIRTIO_ISM_OK      0
> > +#define VIRTIO_ISM_ERR     255
> > +
> > +#define VIRTIO_ISM_ENOENT  2
> > +#define VIRTIO_ISM_E2BIG   7
> > +#define VIRTIO_ISM_ENOMEM  12
> > +#define VIRTIO_ISM_ENOSPEC 28
> > +
> > +#define VIRTIO_ISM_PERM_EATTACH 100
> > +#define VIRTIO_ISM_PERM_EREAD   101
> > +#define VIRTIO_ISM_PERM_EWRITE  102
> > +\end{lstlisting}
> > +
> > +The \field{class}, \field{command} and command-specific-data are set by the
> > +driver, and the device sets the \field{ack} byte and optionally
> > +\field{command-specific-data-reply}.
>
> Where are values and their semantic for \field{command} specified? I see
> no more references to \field{command}. Similarly I see no further
> references the fields \field{command-specific-data} and
> \field{command-specific-data-reply}.


Sorry, this is a common method for Virtio CVQ. I didn't write it particularly
clearly. I will try to write a little more in next version.

As far as "Alloc ISM Region" is concerned, there are these definitions.

	+struct virtio_ism_area {
	+	le64 offset;
	+	le64 size;
	+}
	+
	+struct virtio_ism_ctrl_alloc {
	+	le64 size;
	+};
	+
	+struct virtio_ism_ctrl_alloc_reply {
	+	le64 token;
	+	le64 num;
	+    struct virtio_ism_area area[];
	+};
	+
	+#define VIRTIO_ISM_CTRL_ALLOC  0
	+	#define VIRTIO_ISM_CTRL_ALLOC_REGION 0


class:                       VIRTIO_ISM_CTRL_ALLOC
command:                     VIRTIO_ISM_CTRL_ALLOC_REGION
command-specific-data:       virtio_ism_ctrl_alloc
command-specific-data-reply: virtio_ism_ctrl_alloc_reply



Look forward to your reply.

Regards,
Xuan


>
> [..]
>
> I will continue with the review from here. I didn't have the opportunity
> to look at the PoC implementation. Maybe it will be easier to get through
> the rest of this text once I have a better understanding of the part up
> till now.
>
> Regards,
> Halil


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [virtio-dev] [PATCH v2 1/1] virtio-ism: introduce new device virtio-ism
  2023-01-11 11:08     ` Xuan Zhuo
@ 2023-01-11 15:11       ` Halil Pasic
  2023-01-12  2:01         ` Jason Wang
  2023-01-11 15:22       ` Halil Pasic
                         ` (3 subsequent siblings)
  4 siblings, 1 reply; 37+ messages in thread
From: Halil Pasic @ 2023-01-11 15:11 UTC (permalink / raw)
  To: Xuan Zhuo
  Cc: virtio-dev, hans, herongguang, zmlcc, dust.li, tonylu, zhenzao,
	helinguo, gerry, mst, cohuck, jasowang, Jan Kiszka, wintera,
	kgraul, wenjia, jaka, hca, twinkler, raspl, Halil Pasic

On Wed, 11 Jan 2023 19:08:53 +0800
Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote:

> > > +ISM(Internal Shared Memory) device provides the ability to share memory between
> > > +different VMs launched from the same entity.  
> >
> > Launched by instead of from? Maybe introduce a catchy name for the
> > "entity that launched the VMs" and prevent oversimplification by
> > explaining any shortcomings of the name if any in one place. Host would
> > be one candidate, VMM another.  
> 
> Cornelia Huck wrote:
> 
> 	Is there a way to avoid the term "host" (throughout this document)?
> 	IIUC, you need the uniqueness within the scope of the entity that
> 	launches the different instances that get shared access to the regions
> 	(which could conceivably a unit of hardware?)
> 
> And I think she is right, so I am trying to remove the term HOST.
> 
> Do you have better opinions? I think VMM is not particularly suitable.

Well I'm don't know what Connie's objective was with that. Maybe she can
explain. My concern is calling it "the entity" is awfully vague and
calling it "the entity that launched the VMs" is awfully long.

Please notice that by being explicit about the launched entities being
VMs, you already know that the entity launching the VMs is a (host)
machine (which does not have to be a bare metal machine, but can be
a VM, a container).

From Wikipedia (https://en.wikipedia.org/wiki/Hypervisor): "A hypervisor
(also known as a virtual machine monitor, VMM, or virtualizer) is a type
of computer software, firmware or hardware that creates and runs virtual
machines. A computer on which a hypervisor runs one or more virtual
machines is called a host machine, and each virtual machine is called a
guest machine."

So I guess we could call it a host machine, a hypervisor instance.

On the other hand if we want to not mention (and kind of scope down to)
VMs either, we could try with something like originator or creator.

Either way, I would really like to get some more opinions on this,
especially Connie's.

Regards,
Halil
 
 


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [virtio-dev] [PATCH v2 1/1] virtio-ism: introduce new device virtio-ism
  2023-01-11 11:08     ` Xuan Zhuo
  2023-01-11 15:11       ` Halil Pasic
@ 2023-01-11 15:22       ` Halil Pasic
  2023-01-12 11:57         ` Xuan Zhuo
  2023-01-11 15:30       ` Halil Pasic
                         ` (2 subsequent siblings)
  4 siblings, 1 reply; 37+ messages in thread
From: Halil Pasic @ 2023-01-11 15:22 UTC (permalink / raw)
  To: Xuan Zhuo
  Cc: virtio-dev, hans, herongguang, zmlcc, dust.li, tonylu, zhenzao,
	helinguo, gerry, mst, cohuck, jasowang, Jan Kiszka, wintera,
	kgraul, wenjia, jaka, hca, twinkler, raspl, Halil Pasic

On Wed, 11 Jan 2023 19:08:53 +0800
Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote:

> >  
> > > +        Large shared memories are divided into multiple chunks, and one time  
> >
> > "Memories" sounds wrong here. I guess what you used to call "regions"
> > you now call "chunks". But I may be wrong.  
> 
> A ism region consists of multiple chunks. The size of each chunk is fixed.
> The size of ism region is not fixed.

Hm. I'm still a little confused. Does this mean you have something like

virtio shared memory region (with shmid) >= ism shared memory region >=
ism shared memory chunk

Than is multiple ism shared memory regions may live within a single
virtio shared memory region, and multiple ism shared memory chunks may
compose an ism shared memory region?

Regards,
Halil


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [virtio-dev] [PATCH v2 1/1] virtio-ism: introduce new device virtio-ism
  2023-01-11 11:08     ` Xuan Zhuo
  2023-01-11 15:11       ` Halil Pasic
  2023-01-11 15:22       ` Halil Pasic
@ 2023-01-11 15:30       ` Halil Pasic
  2023-01-12 12:03         ` Xuan Zhuo
  2023-01-11 20:46       ` Halil Pasic
  2023-01-11 21:12       ` Halil Pasic
  4 siblings, 1 reply; 37+ messages in thread
From: Halil Pasic @ 2023-01-11 15:30 UTC (permalink / raw)
  To: Xuan Zhuo
  Cc: virtio-dev, hans, herongguang, zmlcc, dust.li, tonylu, zhenzao,
	helinguo, gerry, mst, cohuck, jasowang, Jan Kiszka, wintera,
	kgraul, wenjia, jaka, hca, twinkler, raspl, Halil Pasic

On Wed, 11 Jan 2023 19:08:53 +0800
Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote:

> > > +\begin{description}
> > > +\item[\field{ev_type}] The type of event, the driver can get the size of the
> > > +    structure based on this.
> > > +
> > > +\item[\field{offset}] The offset of ism regions with the event.  
> >
> > Offset with respect to what?  
> 
> Used to specify a region. Offset is the position of this ISM Region inside the
> memory of Device.

An offset is per definition always relative to something. I would have
thought this is an offset relative to the beginning of *the* virtio
shared memory region identified by the ismid 1. But since you claim that
there may be multiple virtio shared memory regions with the shmid 1 I'm
heavily confused. What is here "the memory of Device"?


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [virtio-dev] [PATCH v2 1/1] virtio-ism: introduce new device virtio-ism
  2023-01-11 11:08     ` Xuan Zhuo
                         ` (2 preceding siblings ...)
  2023-01-11 15:30       ` Halil Pasic
@ 2023-01-11 20:46       ` Halil Pasic
  2023-01-12 12:23         ` Xuan Zhuo
  2023-01-11 21:12       ` Halil Pasic
  4 siblings, 1 reply; 37+ messages in thread
From: Halil Pasic @ 2023-01-11 20:46 UTC (permalink / raw)
  To: Xuan Zhuo
  Cc: virtio-dev, hans, herongguang, zmlcc, dust.li, tonylu, zhenzao,
	helinguo, gerry, mst, cohuck, jasowang, Jan Kiszka, wintera,
	kgraul, wenjia, jaka, hca, twinkler, raspl, Halil Pasic

On Wed, 11 Jan 2023 19:08:53 +0800
Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote:

> > > +\subsection{Permissions}\label{sec:Device Types / Network Device / Device Operation / Permission}
> > > +
> > > +The permissions of a ism region determine whether this ism region can be
> > > +attached and the read and write permissions after attach.
> > > +
> > > +The driver can set the default permissions, or set permissions for some certain
> > > +devices.  
> >
> > What does "default permissions" and "some certain devices" mean here?  
> 
> "default permissions": This can be understood as the permissions for all devices.
> "some certain devices": This is the permissions set for a specific device.

This is IMHO to far where you discuss the permission model and make some
normative statements about it (in 5.20.8.4 Grant ISM Region).

BTW in my opinion that part needs some more work as well. For example if
the "default" permission is permissive, but the permission specifically
set for the device attempting the operation it ain't clear what happens.
I may end up commenting some more on these at when reviewing that part.



^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [virtio-dev] [PATCH v2 1/1] virtio-ism: introduce new device virtio-ism
  2023-01-11 11:08     ` Xuan Zhuo
                         ` (3 preceding siblings ...)
  2023-01-11 20:46       ` Halil Pasic
@ 2023-01-11 21:12       ` Halil Pasic
  2023-01-12  7:01         ` Michael S. Tsirkin
                           ` (3 more replies)
  4 siblings, 4 replies; 37+ messages in thread
From: Halil Pasic @ 2023-01-11 21:12 UTC (permalink / raw)
  To: Xuan Zhuo
  Cc: virtio-dev, hans, herongguang, zmlcc, dust.li, tonylu, zhenzao,
	helinguo, gerry, mst, cohuck, jasowang, Jan Kiszka, wintera,
	kgraul, wenjia, jaka, hca, twinkler, raspl, Halil Pasic

On Wed, 11 Jan 2023 19:08:53 +0800
Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote:

> > > +The device shares memory to the driver based on shared memory regions
> > > +\ref{sec:Basic Facilities of a Virtio Device / Shared Memory Regions}.
> > > +However, it does not need to allocate physical memory during initialization.
> > > +
> > > +The \field{shmid} of a region MUST be one of the following
> > > +\begin{lstlisting}
> > > +enum virtio_ism_shm_id {
> > > +	VIRTIO_ISM_SHM_ID_UNDEFINED = 0,
> > > +	VIRTIO_ISM_SHM_ID_REGIONS   = 1,
> > > +	VIRTIO_ISM_SHM_ID_NOTIFY    = 2,
> > > +};
> > > +\end{lstlisting}
> > > +
> > > +The shared memory whose shmid is VIRTIO_ISM_SHM_ID_REGIONS is used to implement
> > > +ism regions.  
> >
> > Hm. AFAIU, these are supposed to be used for exchanging data between the
> > VMs that are supposed to communicate with each other via ism.
> >
> > How does this relate to the following normative section:
> > """
> > 2.10.2 Device Requirements: Shared Memory Regions
> >
> > Shared memory regions MUST NOT expose shared memory regions which are used to control the operation of the device, nor to stream data.
> > """  
> 
> From https://lists.oasis-open.org/archives/virtio-comment/201906/msg00010.html
> 
> 	I'm trying to find a way to say that people shouldn't side-step virtio
> 	queues by putting everything in a big blob of shared memory.
> 
> So I think this should not be conflict.

AFAIU putting everything in a big blob of shared memory is what ISM
does. I'm not saying that this is bad. But I'm for sure confused.

In my opinion we should either clarify what is meant by that sentence
and maybe also why, or get rid of it.

@MST: What is your take on this?

> 
> >  
> > > If there are multiple shared memories whose shmid is
> > > +VIRTIO_ISM_SHM_ID_REGIONS, they are used as contiguous memory in the order of
> > > +acquisition.  
> >
> > Hm, I used to think that the shmid is an unique identifier for a shared
> > memory region. How can multiple virtio shared memory regions have the
> > same shmid?  
> 
> Shmid is used to define what this sharing memory is used to do, not its unique
> identifier.

As far as I can tell the shmid is supposed to uniquely identify a single
shared memory region!

" A device may have multiple shared memory regions associated with it.
Each region has a shmid to identify it, the meaning of which is device-specific." 

@MST: Can you please help us to get clarity on this?

BTW, this is why I pointed you to the corresponding MMIO interface:
"""
SHMSel	             Shared memory id
0x0ac
W

Writing to this register selects the shared memory region 2.10 following
operations on SHMLenLow, SHMLenHigh, SHMBaseLow and SHMBaseHigh apply
to.

SHMLenLow            Shared memory region 64 bit long length 
0x0b0
SHMLenHigh
0x0b4
R
	

These registers return the length of the shared memory region in bytes,
as defined by the device for the region selected by the SHMSel register.
The lower 32 bits of the length are read from SHMLenLow and the higher
32 bits from SHMLenHigh. Reading from a non-existent region (i.e. where
the ID written to SHMSel is unused) results in a length of -1.
	

SHMBaseLow         Shared memory region 64 bit long physical address 
0x0b8
SHMBaseHigh
0x0bc
R  

The driver reads these registers to discover the base address of the
region in physical address space. This address is chosen by the device
(or other part of the VMM). The lower 32 bits of the address are read
from SHMBaseLow with the higher 32 bits from SHMBaseHigh. Reading from a
non-existent region (i.e. where the ID written to SHMSel is unused)
results in a base address of 0xffffffffffffffff. 
"""

I.e. after you write the shmid into SHMSel you can obtain the base
address and the length of that region, provided there is a region
with that id.

> 
> This is the point that device provides multiple shared memory areas.
> These memory areas are used to provide ISM Region.
> 

Thus I don't think this is possible without violating the currently
standing rules on shared memory regions.

> >
> > Can you have a look at the MMIO interface for virtio shared memory
> > regions.  
> 
> Is there any special attention?
> 

Please see above.

> 
> >
> > How far what ISM tries to do consistent with virtio shared memory. I
> > mean, AFAIU once the shared memory is advertised, the  shared memory is
> > supposed to be there and accessible by the driver for both writes and
> > reads.  
> 
> Before allocation, VMM did not allocate physical memory for it. Therefore, it
> was illegal to read and write this memory.

I'm not sure whether generating addressing exceptions on accesses to the
advertised virtio shared memory regions is acceptable or not. Maybe
Michael, Connie and the TC members more versed in shared memory regions
can help us get clarity on this.

I don't think it is per-se prohibited, but on the other hand, to me
virtio shared memory regions read more like memory that is supposed to
work , and less like an address space that may or may not be backed by
memory. So if this is allowed, maybe adding a sentence to the part that
describes virtio shared memory regions in the basic facilities portion of
the spec would be beneficial.

Regards,
Halil


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [virtio-dev] [PATCH v2 1/1] virtio-ism: introduce new device virtio-ism
  2023-01-11 15:11       ` Halil Pasic
@ 2023-01-12  2:01         ` Jason Wang
  2023-01-12  6:56           ` Michael S. Tsirkin
  0 siblings, 1 reply; 37+ messages in thread
From: Jason Wang @ 2023-01-12  2:01 UTC (permalink / raw)
  To: Halil Pasic
  Cc: Xuan Zhuo, virtio-dev, hans, herongguang, zmlcc, dust.li, tonylu,
	zhenzao, helinguo, gerry, mst, cohuck, Jan Kiszka, wintera,
	kgraul, wenjia, jaka, hca, twinkler, raspl

On Wed, Jan 11, 2023 at 11:11 PM Halil Pasic <pasic@linux.ibm.com> wrote:
>
> On Wed, 11 Jan 2023 19:08:53 +0800
> Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote:
>
> > > > +ISM(Internal Shared Memory) device provides the ability to share memory between
> > > > +different VMs launched from the same entity.
> > >
> > > Launched by instead of from? Maybe introduce a catchy name for the
> > > "entity that launched the VMs" and prevent oversimplification by
> > > explaining any shortcomings of the name if any in one place. Host would
> > > be one candidate, VMM another.
> >
> > Cornelia Huck wrote:
> >
> >       Is there a way to avoid the term "host" (throughout this document)?
> >       IIUC, you need the uniqueness within the scope of the entity that
> >       launches the different instances that get shared access to the regions
> >       (which could conceivably a unit of hardware?)
> >
> > And I think she is right, so I am trying to remove the term HOST.
> >
> > Do you have better opinions? I think VMM is not particularly suitable.
>
> Well I'm don't know what Connie's objective was with that. Maybe she can
> explain. My concern is calling it "the entity" is awfully vague and
> calling it "the entity that launched the VMs" is awfully long.
>
> Please notice that by being explicit about the launched entities being
> VMs, you already know that the entity launching the VMs is a (host)
> machine (which does not have to be a bare metal machine, but can be
> a VM, a container).
>
> From Wikipedia (https://en.wikipedia.org/wiki/Hypervisor): "A hypervisor
> (also known as a virtual machine monitor, VMM, or virtualizer) is a type
> of computer software, firmware or hardware that creates and runs virtual
> machines. A computer on which a hypervisor runs one or more virtual
> machines is called a host machine, and each virtual machine is called a
> guest machine."
>
> So I guess we could call it a host machine, a hypervisor instance.
>
> On the other hand if we want to not mention (and kind of scope down to)
> VMs either, we could try with something like originator or creator.
>
> Either way, I would really like to get some more opinions on this,
> especially Connie's.

My understanding is, since virtio is not necessarily used for the case
of virtualization, it might be better to use "driver/device" instead
of "guest/host" or "VM/VMM".

Thanks

>
> Regards,
> Halil
>
>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
> For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org
>


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [virtio-dev] [PATCH v2 1/1] virtio-ism: introduce new device virtio-ism
  2023-01-12  2:01         ` Jason Wang
@ 2023-01-12  6:56           ` Michael S. Tsirkin
  2023-01-12  8:42             ` Cornelia Huck
                               ` (2 more replies)
  0 siblings, 3 replies; 37+ messages in thread
From: Michael S. Tsirkin @ 2023-01-12  6:56 UTC (permalink / raw)
  To: Jason Wang
  Cc: Halil Pasic, Xuan Zhuo, virtio-dev, hans, herongguang, zmlcc,
	dust.li, tonylu, zhenzao, helinguo, gerry, cohuck, Jan Kiszka,
	wintera, kgraul, wenjia, jaka, hca, twinkler, raspl

On Thu, Jan 12, 2023 at 10:01:25AM +0800, Jason Wang wrote:
> On Wed, Jan 11, 2023 at 11:11 PM Halil Pasic <pasic@linux.ibm.com> wrote:
> >
> > On Wed, 11 Jan 2023 19:08:53 +0800
> > Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote:
> >
> > > > > +ISM(Internal Shared Memory) device provides the ability to share memory between
> > > > > +different VMs launched from the same entity.
> > > >
> > > > Launched by instead of from? Maybe introduce a catchy name for the
> > > > "entity that launched the VMs" and prevent oversimplification by
> > > > explaining any shortcomings of the name if any in one place. Host would
> > > > be one candidate, VMM another.
> > >
> > > Cornelia Huck wrote:
> > >
> > >       Is there a way to avoid the term "host" (throughout this document)?
> > >       IIUC, you need the uniqueness within the scope of the entity that
> > >       launches the different instances that get shared access to the regions
> > >       (which could conceivably a unit of hardware?)
> > >
> > > And I think she is right, so I am trying to remove the term HOST.
> > >
> > > Do you have better opinions? I think VMM is not particularly suitable.

I think fundamentally from spec POV memory is shared between devices.
How sharing is accomplished guest does not care so neither should the
spec. Can some RDMA tricks be used for synchronisation behind the
scenes? Maybe, the spec does not care. But we can give an example.

So something like:

	An ISM(Internal Shared Memory) device provides the ability to
	access memory shared between multiple devices. This allows low-overhead
	communication in presence of such memory. For example, memory can be
	shared with guests of multiple virtual machines running on the same
	host, with each virtual machine including an ISM device and with
	the guests using the ISM devices to access the shared memory.

what do others think?


-- 
MST


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [virtio-dev] [PATCH v2 1/1] virtio-ism: introduce new device virtio-ism
  2023-01-11 21:12       ` Halil Pasic
@ 2023-01-12  7:01         ` Michael S. Tsirkin
  2023-01-12 12:31         ` Xuan Zhuo
                           ` (2 subsequent siblings)
  3 siblings, 0 replies; 37+ messages in thread
From: Michael S. Tsirkin @ 2023-01-12  7:01 UTC (permalink / raw)
  To: Halil Pasic
  Cc: Xuan Zhuo, virtio-dev, hans, herongguang, zmlcc, dust.li, tonylu,
	zhenzao, helinguo, gerry, cohuck, jasowang, Jan Kiszka, wintera,
	kgraul, wenjia, jaka, hca, twinkler, raspl

On Wed, Jan 11, 2023 at 10:12:38PM +0100, Halil Pasic wrote:
> On Wed, 11 Jan 2023 19:08:53 +0800
> Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote:
> 
> > > > +The device shares memory to the driver based on shared memory regions
> > > > +\ref{sec:Basic Facilities of a Virtio Device / Shared Memory Regions}.
> > > > +However, it does not need to allocate physical memory during initialization.
> > > > +
> > > > +The \field{shmid} of a region MUST be one of the following
> > > > +\begin{lstlisting}
> > > > +enum virtio_ism_shm_id {
> > > > +	VIRTIO_ISM_SHM_ID_UNDEFINED = 0,
> > > > +	VIRTIO_ISM_SHM_ID_REGIONS   = 1,
> > > > +	VIRTIO_ISM_SHM_ID_NOTIFY    = 2,
> > > > +};
> > > > +\end{lstlisting}
> > > > +
> > > > +The shared memory whose shmid is VIRTIO_ISM_SHM_ID_REGIONS is used to implement
> > > > +ism regions.  
> > >
> > > Hm. AFAIU, these are supposed to be used for exchanging data between the
> > > VMs that are supposed to communicate with each other via ism.
> > >
> > > How does this relate to the following normative section:
> > > """
> > > 2.10.2 Device Requirements: Shared Memory Regions
> > >
> > > Shared memory regions MUST NOT expose shared memory regions which are used to control the operation of the device, nor to stream data.
> > > """  
> > 
> > From https://lists.oasis-open.org/archives/virtio-comment/201906/msg00010.html
> > 
> > 	I'm trying to find a way to say that people shouldn't side-step virtio
> > 	queues by putting everything in a big blob of shared memory.
> > 
> > So I think this should not be conflict.
> 
> AFAIU putting everything in a big blob of shared memory is what ISM
> does. I'm not saying that this is bad. But I'm for sure confused.
> 
> In my opinion we should either clarify what is meant by that sentence
> and maybe also why, or get rid of it.
> 
> @MST: What is your take on this?

I think what this is trying to prevent is people building proprietary
extensions into devices themselves. This as opposed to proprietary
communication between devices.

If we want to do this, maybe just say that device itself should not
access or modify the data inside the region. But we need to
clarify that as multiple devices refer to the same shared
memory driver must not assume it will not be read or written since
another driver can access it through another device.




> > 
> > >  
> > > > If there are multiple shared memories whose shmid is
> > > > +VIRTIO_ISM_SHM_ID_REGIONS, they are used as contiguous memory in the order of
> > > > +acquisition.  
> > >
> > > Hm, I used to think that the shmid is an unique identifier for a shared
> > > memory region. How can multiple virtio shared memory regions have the
> > > same shmid?  
> > 
> > Shmid is used to define what this sharing memory is used to do, not its unique
> > identifier.
> 
> As far as I can tell the shmid is supposed to uniquely identify a single
> shared memory region!
> 
> " A device may have multiple shared memory regions associated with it.
> Each region has a shmid to identify it, the meaning of which is device-specific." 
> 
> @MST: Can you please help us to get clarity on this?
> 
> BTW, this is why I pointed you to the corresponding MMIO interface:
> """
> SHMSel	             Shared memory id
> 0x0ac
> W
> 
> Writing to this register selects the shared memory region 2.10 following
> operations on SHMLenLow, SHMLenHigh, SHMBaseLow and SHMBaseHigh apply
> to.
> 
> SHMLenLow            Shared memory region 64 bit long length 
> 0x0b0
> SHMLenHigh
> 0x0b4
> R
> 	
> 
> These registers return the length of the shared memory region in bytes,
> as defined by the device for the region selected by the SHMSel register.
> The lower 32 bits of the length are read from SHMLenLow and the higher
> 32 bits from SHMLenHigh. Reading from a non-existent region (i.e. where
> the ID written to SHMSel is unused) results in a length of -1.
> 	
> 
> SHMBaseLow         Shared memory region 64 bit long physical address 
> 0x0b8
> SHMBaseHigh
> 0x0bc
> R  
> 
> The driver reads these registers to discover the base address of the
> region in physical address space. This address is chosen by the device
> (or other part of the VMM). The lower 32 bits of the address are read
> from SHMBaseLow with the higher 32 bits from SHMBaseHigh. Reading from a
> non-existent region (i.e. where the ID written to SHMSel is unused)
> results in a base address of 0xffffffffffffffff. 
> """
> 
> I.e. after you write the shmid into SHMSel you can obtain the base
> address and the length of that region, provided there is a region
> with that id.
> 
> > 
> > This is the point that device provides multiple shared memory areas.
> > These memory areas are used to provide ISM Region.
> > 
> 
> Thus I don't think this is possible without violating the currently
> standing rules on shared memory regions.
> 
> > >
> > > Can you have a look at the MMIO interface for virtio shared memory
> > > regions.  
> > 
> > Is there any special attention?
> > 
> 
> Please see above.
> 
> > 
> > >
> > > How far what ISM tries to do consistent with virtio shared memory. I
> > > mean, AFAIU once the shared memory is advertised, the  shared memory is
> > > supposed to be there and accessible by the driver for both writes and
> > > reads.  
> > 
> > Before allocation, VMM did not allocate physical memory for it. Therefore, it
> > was illegal to read and write this memory.
> 
> I'm not sure whether generating addressing exceptions on accesses to the
> advertised virtio shared memory regions is acceptable or not. Maybe
> Michael, Connie and the TC members more versed in shared memory regions
> can help us get clarity on this.
> 
> I don't think it is per-se prohibited, but on the other hand, to me
> virtio shared memory regions read more like memory that is supposed to
> work , and less like an address space that may or may not be backed by
> memory. So if this is allowed, maybe adding a sentence to the part that
> describes virtio shared memory regions in the basic facilities portion of
> the spec would be beneficial.
> 
> Regards,
> Halil


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [virtio-dev] [PATCH v2 1/1] virtio-ism: introduce new device virtio-ism
  2023-01-12  6:56           ` Michael S. Tsirkin
@ 2023-01-12  8:42             ` Cornelia Huck
  2023-01-12 11:48               ` Xuan Zhuo
  2023-01-12 11:47             ` Xuan Zhuo
  2023-01-12 12:15             ` Halil Pasic
  2 siblings, 1 reply; 37+ messages in thread
From: Cornelia Huck @ 2023-01-12  8:42 UTC (permalink / raw)
  To: Michael S. Tsirkin, Jason Wang
  Cc: Halil Pasic, Xuan Zhuo, virtio-dev, hans, herongguang, zmlcc,
	dust.li, tonylu, zhenzao, helinguo, gerry, Jan Kiszka, wintera,
	kgraul, wenjia, jaka, hca, twinkler, raspl

On Thu, Jan 12 2023, "Michael S. Tsirkin" <mst@redhat.com> wrote:

> On Thu, Jan 12, 2023 at 10:01:25AM +0800, Jason Wang wrote:
>> On Wed, Jan 11, 2023 at 11:11 PM Halil Pasic <pasic@linux.ibm.com> wrote:
>> >
>> > On Wed, 11 Jan 2023 19:08:53 +0800
>> > Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote:
>> >
>> > > > > +ISM(Internal Shared Memory) device provides the ability to share memory between
>> > > > > +different VMs launched from the same entity.
>> > > >
>> > > > Launched by instead of from? Maybe introduce a catchy name for the
>> > > > "entity that launched the VMs" and prevent oversimplification by
>> > > > explaining any shortcomings of the name if any in one place. Host would
>> > > > be one candidate, VMM another.
>> > >
>> > > Cornelia Huck wrote:
>> > >
>> > >       Is there a way to avoid the term "host" (throughout this document)?
>> > >       IIUC, you need the uniqueness within the scope of the entity that
>> > >       launches the different instances that get shared access to the regions
>> > >       (which could conceivably a unit of hardware?)
>> > >
>> > > And I think she is right, so I am trying to remove the term HOST.
>> > >
>> > > Do you have better opinions? I think VMM is not particularly suitable.
>
> I think fundamentally from spec POV memory is shared between devices.
> How sharing is accomplished guest does not care so neither should the
> spec. Can some RDMA tricks be used for synchronisation behind the
> scenes? Maybe, the spec does not care. But we can give an example.
>
> So something like:
>
> 	An ISM(Internal Shared Memory) device provides the ability to
> 	access memory shared between multiple devices. This allows low-overhead
> 	communication in presence of such memory. For example, memory can be
> 	shared with guests of multiple virtual machines running on the same
> 	host, with each virtual machine including an ISM device and with
> 	the guests using the ISM devices to access the shared memory.
>
> what do others think?

I like that: we don't want to talk about hosts/VMMs/etc. as we
fundamentally deal with devices and drivers, but sharing between guests
is of course the obvious use case.

I'm just wondering how best to express the uniqueness scope, is it per
(ISM) device?


---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [virtio-dev] [PATCH v2 1/1] virtio-ism: introduce new device virtio-ism
  2023-01-12  6:56           ` Michael S. Tsirkin
  2023-01-12  8:42             ` Cornelia Huck
@ 2023-01-12 11:47             ` Xuan Zhuo
  2023-01-12 12:15             ` Halil Pasic
  2 siblings, 0 replies; 37+ messages in thread
From: Xuan Zhuo @ 2023-01-12 11:47 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Halil Pasic, virtio-dev, hans, herongguang, zmlcc, dust.li,
	tonylu, zhenzao, helinguo, gerry, cohuck, Jan Kiszka, wintera,
	kgraul, wenjia, jaka, hca, twinkler, raspl, Jason Wang

On Thu, 12 Jan 2023 01:56:14 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> On Thu, Jan 12, 2023 at 10:01:25AM +0800, Jason Wang wrote:
> > On Wed, Jan 11, 2023 at 11:11 PM Halil Pasic <pasic@linux.ibm.com> wrote:
> > >
> > > On Wed, 11 Jan 2023 19:08:53 +0800
> > > Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote:
> > >
> > > > > > +ISM(Internal Shared Memory) device provides the ability to share memory between
> > > > > > +different VMs launched from the same entity.
> > > > >
> > > > > Launched by instead of from? Maybe introduce a catchy name for the
> > > > > "entity that launched the VMs" and prevent oversimplification by
> > > > > explaining any shortcomings of the name if any in one place. Host would
> > > > > be one candidate, VMM another.
> > > >
> > > > Cornelia Huck wrote:
> > > >
> > > >       Is there a way to avoid the term "host" (throughout this document)?
> > > >       IIUC, you need the uniqueness within the scope of the entity that
> > > >       launches the different instances that get shared access to the regions
> > > >       (which could conceivably a unit of hardware?)
> > > >
> > > > And I think she is right, so I am trying to remove the term HOST.
> > > >
> > > > Do you have better opinions? I think VMM is not particularly suitable.
>
> I think fundamentally from spec POV memory is shared between devices.
> How sharing is accomplished guest does not care so neither should the
> spec. Can some RDMA tricks be used for synchronisation behind the
> scenes? Maybe, the spec does not care. But we can give an example.
>
> So something like:
>
> 	An ISM(Internal Shared Memory) device provides the ability to
> 	access memory shared between multiple devices. This allows low-overhead
> 	communication in presence of such memory. For example, memory can be
> 	shared with guests of multiple virtual machines running on the same
> 	host, with each virtual machine including an ISM device and with
> 	the guests using the ISM devices to access the shared memory.
>
> what do others think?

I think it's great.

>
>
> --
> MST
>


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [virtio-dev] [PATCH v2 1/1] virtio-ism: introduce new device virtio-ism
  2023-01-12  8:42             ` Cornelia Huck
@ 2023-01-12 11:48               ` Xuan Zhuo
  2023-01-12 14:30                 ` Cornelia Huck
  0 siblings, 1 reply; 37+ messages in thread
From: Xuan Zhuo @ 2023-01-12 11:48 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: Halil Pasic, virtio-dev, hans, herongguang, zmlcc, dust.li,
	tonylu, zhenzao, helinguo, gerry, Jan Kiszka, wintera, kgraul,
	wenjia, jaka, hca, twinkler, raspl, Michael S. Tsirkin,
	Jason Wang

On Thu, 12 Jan 2023 09:42:05 +0100, Cornelia Huck <cohuck@redhat.com> wrote:
> On Thu, Jan 12 2023, "Michael S. Tsirkin" <mst@redhat.com> wrote:
>
> > On Thu, Jan 12, 2023 at 10:01:25AM +0800, Jason Wang wrote:
> >> On Wed, Jan 11, 2023 at 11:11 PM Halil Pasic <pasic@linux.ibm.com> wrote:
> >> >
> >> > On Wed, 11 Jan 2023 19:08:53 +0800
> >> > Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote:
> >> >
> >> > > > > +ISM(Internal Shared Memory) device provides the ability to share memory between
> >> > > > > +different VMs launched from the same entity.
> >> > > >
> >> > > > Launched by instead of from? Maybe introduce a catchy name for the
> >> > > > "entity that launched the VMs" and prevent oversimplification by
> >> > > > explaining any shortcomings of the name if any in one place. Host would
> >> > > > be one candidate, VMM another.
> >> > >
> >> > > Cornelia Huck wrote:
> >> > >
> >> > >       Is there a way to avoid the term "host" (throughout this document)?
> >> > >       IIUC, you need the uniqueness within the scope of the entity that
> >> > >       launches the different instances that get shared access to the regions
> >> > >       (which could conceivably a unit of hardware?)
> >> > >
> >> > > And I think she is right, so I am trying to remove the term HOST.
> >> > >
> >> > > Do you have better opinions? I think VMM is not particularly suitable.
> >
> > I think fundamentally from spec POV memory is shared between devices.
> > How sharing is accomplished guest does not care so neither should the
> > spec. Can some RDMA tricks be used for synchronisation behind the
> > scenes? Maybe, the spec does not care. But we can give an example.
> >
> > So something like:
> >
> > 	An ISM(Internal Shared Memory) device provides the ability to
> > 	access memory shared between multiple devices. This allows low-overhead
> > 	communication in presence of such memory. For example, memory can be
> > 	shared with guests of multiple virtual machines running on the same
> > 	host, with each virtual machine including an ISM device and with
> > 	the guests using the ISM devices to access the shared memory.
> >
> > what do others think?
>
> I like that: we don't want to talk about hosts/VMMs/etc. as we
> fundamentally deal with devices and drivers, but sharing between guests
> is of course the obvious use case.
>
> I'm just wondering how best to express the uniqueness scope, is it per
> (ISM) device?

No, each vm has at least one separate device. The devices in a host form
an uniqueness scope.

Thanks.


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [virtio-dev] [PATCH v2 1/1] virtio-ism: introduce new device virtio-ism
  2023-01-11 15:22       ` Halil Pasic
@ 2023-01-12 11:57         ` Xuan Zhuo
  0 siblings, 0 replies; 37+ messages in thread
From: Xuan Zhuo @ 2023-01-12 11:57 UTC (permalink / raw)
  To: Halil Pasic
  Cc: virtio-dev, hans, herongguang, zmlcc, dust.li, tonylu, zhenzao,
	helinguo, gerry, mst, cohuck, jasowang, Jan Kiszka, wintera,
	kgraul, wenjia, jaka, hca, twinkler, raspl, Halil Pasic

On Wed, 11 Jan 2023 16:22:10 +0100, Halil Pasic <pasic@linux.ibm.com> wrote:
> On Wed, 11 Jan 2023 19:08:53 +0800
> Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote:
>
> > >
> > > > +        Large shared memories are divided into multiple chunks, and one time
> > >
> > > "Memories" sounds wrong here. I guess what you used to call "regions"
> > > you now call "chunks". But I may be wrong.
> >
> > A ism region consists of multiple chunks. The size of each chunk is fixed.
> > The size of ism region is not fixed.
>
> Hm. I'm still a little confused. Does this mean you have something like
>
> virtio shared memory region (with shmid) >= ism shared memory region >=
> ism shared memory chunk
>
> Than is multiple ism shared memory regions may live within a single
> virtio shared memory region, and multiple ism shared memory chunks may
> compose an ism shared memory region?

Yes

Thanks.

>
> Regards,
> Halil


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [virtio-dev] [PATCH v2 1/1] virtio-ism: introduce new device virtio-ism
  2023-01-11 15:30       ` Halil Pasic
@ 2023-01-12 12:03         ` Xuan Zhuo
  0 siblings, 0 replies; 37+ messages in thread
From: Xuan Zhuo @ 2023-01-12 12:03 UTC (permalink / raw)
  To: Halil Pasic
  Cc: virtio-dev, hans, herongguang, zmlcc, dust.li, tonylu, zhenzao,
	helinguo, gerry, mst, cohuck, jasowang, Jan Kiszka, wintera,
	kgraul, wenjia, jaka, hca, twinkler, raspl, Halil Pasic

On Wed, 11 Jan 2023 16:30:34 +0100, Halil Pasic <pasic@linux.ibm.com> wrote:
> On Wed, 11 Jan 2023 19:08:53 +0800
> Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote:
>
> > > > +\begin{description}
> > > > +\item[\field{ev_type}] The type of event, the driver can get the size of the
> > > > +    structure based on this.
> > > > +
> > > > +\item[\field{offset}] The offset of ism regions with the event.
> > >
> > > Offset with respect to what?
> >
> > Used to specify a region. Offset is the position of this ISM Region inside the
> > memory of Device.
>
> An offset is per definition always relative to something. I would have
> thought this is an offset relative to the beginning of *the* virtio
> shared memory region identified by the ismid 1. But since you claim that
> there may be multiple virtio shared memory regions with the shmid 1 I'm
> heavily confused. What is here "the memory of Device"?

A device may indeed have multiple virtio shared memory regions. Although the
shmid of these virtio shared memory regions are all 1, there will be an order
when we initialize the device. For example, for PCI device we will check the PCI
cap one by one, so there is an order between these virtio shared memory regions.

This is described in the spec:
	+The shared memory whose shmid is VIRTIO_ISM_SHM_ID_REGIONS is used to implement
	+ism regions. If there are multiple shared memories whose shmid is
	+VIRTIO_ISM_SHM_ID_REGIONS, they are used as contiguous memory in the order of
	+acquisition.

Example:

	offset: 0                                                               1G - 1
	        |  virtio shared memory region (shmid: 1) (size: 1G) (order: 0) |

	offset: 1G                                                              3G - 1
	        |  virtio shared memory region (shmid: 1) (size: 2G) (order: 1) |

	offset: 3G                                                              7G - 1
	        |  virtio shared memory region (shmid: 1) (size: 4G) (order: 2) |

Thanks.



>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
> For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org
>


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [virtio-dev] [PATCH v2 1/1] virtio-ism: introduce new device virtio-ism
  2023-01-12  6:56           ` Michael S. Tsirkin
  2023-01-12  8:42             ` Cornelia Huck
  2023-01-12 11:47             ` Xuan Zhuo
@ 2023-01-12 12:15             ` Halil Pasic
  2 siblings, 0 replies; 37+ messages in thread
From: Halil Pasic @ 2023-01-12 12:15 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Jason Wang, Xuan Zhuo, virtio-dev, hans, herongguang, zmlcc,
	dust.li, tonylu, zhenzao, helinguo, gerry, cohuck, Jan Kiszka,
	wintera, kgraul, wenjia, jaka, hca, twinkler, raspl, Halil Pasic

On Thu, 12 Jan 2023 01:56:14 -0500
"Michael S. Tsirkin" <mst@redhat.com> wrote:

> I think fundamentally from spec POV memory is shared between devices.

Right, but with the gid stuff and the corresponding rules shared between
two or more arbitrary virtio-ism devices won't do. We need to find a way
to express the what device can communicate with what device relationship.

> How sharing is accomplished guest does not care so neither should the
> spec.

One of the goals of the spec is to foster interoperability. I wonder
how far that goes. For example one could imagine a shared memory on
the same host implementation by one vendor, and an RDMA based
implementation of an other vendor both implementing the very same
interface on the driver-device level. Two entities would not be
able to talk to each other via virtio-ism devices that use different
ways to accomplish the sharing. Is that out of scope for this spec?

> Can some RDMA tricks be used for synchronisation behind the
> scenes? 

I'm not familiar enough with RDMA. But I guess it may also depend on
the "memory consistency" and coherency properties. Which are not
specified for now for the ISM shared memory regions AFAIU.

> Maybe, the spec does not care. But we can give an example.
> 

At this point I'm not sure, whether the spec should care or not.

> So something like:
> 
> 	An ISM(Internal Shared Memory) device provides the ability to
> 	access memory shared between multiple devices. This allows low-overhead
> 	communication in presence of such memory. For example, memory can be
> 	shared with guests of multiple virtual machines running on the same
> 	host, with each virtual machine including an ISM device and with
> 	the guests using the ISM devices to access the shared memory.
> 
> what do others think?

I agree, the spec should be as abstract as possible. As stated above,
I don't have clarity on the interoperability goals. Is multiple flavors
of virtio-ism devices that are not mutually interoperable a good outcome?

Regards,
Halil


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [virtio-dev] [PATCH v2 1/1] virtio-ism: introduce new device virtio-ism
  2023-01-11 20:46       ` Halil Pasic
@ 2023-01-12 12:23         ` Xuan Zhuo
  0 siblings, 0 replies; 37+ messages in thread
From: Xuan Zhuo @ 2023-01-12 12:23 UTC (permalink / raw)
  To: Halil Pasic
  Cc: virtio-dev, hans, herongguang, zmlcc, dust.li, tonylu, zhenzao,
	helinguo, gerry, mst, cohuck, jasowang, Jan Kiszka, wintera,
	kgraul, wenjia, jaka, hca, twinkler, raspl, Halil Pasic

On Wed, 11 Jan 2023 21:46:37 +0100, Halil Pasic <pasic@linux.ibm.com> wrote:
> On Wed, 11 Jan 2023 19:08:53 +0800
> Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote:
>
> > > > +\subsection{Permissions}\label{sec:Device Types / Network Device / Device Operation / Permission}
> > > > +
> > > > +The permissions of a ism region determine whether this ism region can be
> > > > +attached and the read and write permissions after attach.
> > > > +
> > > > +The driver can set the default permissions, or set permissions for some certain
> > > > +devices.
> > >
> > > What does "default permissions" and "some certain devices" mean here?
> >
> > "default permissions": This can be understood as the permissions for all devices.
> > "some certain devices": This is the permissions set for a specific device.
>
> This is IMHO to far where you discuss the permission model and make some
> normative statements about it (in 5.20.8.4 Grant ISM Region).

Will fix.

>
> BTW in my opinion that part needs some more work as well. For example if
> the "default" permission is permissive, but the permission specifically
> set for the device attempting the operation it ain't clear what happens.
> I may end up commenting some more on these at when reviewing that part.
>
>

Look forward to your reply.

Thanks.


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [virtio-dev] [PATCH v2 1/1] virtio-ism: introduce new device virtio-ism
  2023-01-11 21:12       ` Halil Pasic
  2023-01-12  7:01         ` Michael S. Tsirkin
@ 2023-01-12 12:31         ` Xuan Zhuo
  2023-01-20 13:06           ` Halil Pasic
  2023-01-12 12:40         ` Xuan Zhuo
  2023-02-05 12:30         ` Michael S. Tsirkin
  3 siblings, 1 reply; 37+ messages in thread
From: Xuan Zhuo @ 2023-01-12 12:31 UTC (permalink / raw)
  To: Halil Pasic
  Cc: virtio-dev, hans, herongguang, zmlcc, dust.li, tonylu, zhenzao,
	helinguo, gerry, mst, cohuck, jasowang, Jan Kiszka, wintera,
	kgraul, wenjia, jaka, hca, twinkler, raspl, Halil Pasic

On Wed, 11 Jan 2023 22:12:38 +0100, Halil Pasic <pasic@linux.ibm.com> wrote:
> On Wed, 11 Jan 2023 19:08:53 +0800
> Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote:
>
> > > > +The device shares memory to the driver based on shared memory regions
> > > > +\ref{sec:Basic Facilities of a Virtio Device / Shared Memory Regions}.
> > > > +However, it does not need to allocate physical memory during initialization.
> > > > +
> > > > +The \field{shmid} of a region MUST be one of the following
> > > > +\begin{lstlisting}
> > > > +enum virtio_ism_shm_id {
> > > > +	VIRTIO_ISM_SHM_ID_UNDEFINED = 0,
> > > > +	VIRTIO_ISM_SHM_ID_REGIONS   = 1,
> > > > +	VIRTIO_ISM_SHM_ID_NOTIFY    = 2,
> > > > +};
> > > > +\end{lstlisting}
> > > > +
> > > > +The shared memory whose shmid is VIRTIO_ISM_SHM_ID_REGIONS is used to implement
> > > > +ism regions.
> > >
> > > Hm. AFAIU, these are supposed to be used for exchanging data between the
> > > VMs that are supposed to communicate with each other via ism.
> > >
> > > How does this relate to the following normative section:
> > > """
> > > 2.10.2 Device Requirements: Shared Memory Regions
> > >
> > > Shared memory regions MUST NOT expose shared memory regions which are used to control the operation of the device, nor to stream data.
> > > """
> >
> > From https://lists.oasis-open.org/archives/virtio-comment/201906/msg00010.html
> >
> > 	I'm trying to find a way to say that people shouldn't side-step virtio
> > 	queues by putting everything in a big blob of shared memory.
> >
> > So I think this should not be conflict.
>
> AFAIU putting everything in a big blob of shared memory is what ISM
> does. I'm not saying that this is bad. But I'm for sure confused.
>
> In my opinion we should either clarify what is meant by that sentence
> and maybe also why, or get rid of it.
>
> @MST: What is your take on this?
>
> >
> > >
> > > > If there are multiple shared memories whose shmid is
> > > > +VIRTIO_ISM_SHM_ID_REGIONS, they are used as contiguous memory in the order of
> > > > +acquisition.
> > >
> > > Hm, I used to think that the shmid is an unique identifier for a shared
> > > memory region. How can multiple virtio shared memory regions have the
> > > same shmid?
> >
> > Shmid is used to define what this sharing memory is used to do, not its unique
> > identifier.
>
> As far as I can tell the shmid is supposed to uniquely identify a single
> shared memory region!
>
> " A device may have multiple shared memory regions associated with it.
> Each region has a shmid to identify it, the meaning of which is device-specific."
>
> @MST: Can you please help us to get clarity on this?
>
> BTW, this is why I pointed you to the corresponding MMIO interface:
> """
> SHMSel	             Shared memory id
> 0x0ac
> W
>
> Writing to this register selects the shared memory region 2.10 following
> operations on SHMLenLow, SHMLenHigh, SHMBaseLow and SHMBaseHigh apply
> to.
>
> SHMLenLow            Shared memory region 64 bit long length
> 0x0b0
> SHMLenHigh
> 0x0b4
> R
>
>
> These registers return the length of the shared memory region in bytes,
> as defined by the device for the region selected by the SHMSel register.
> The lower 32 bits of the length are read from SHMLenLow and the higher
> 32 bits from SHMLenHigh. Reading from a non-existent region (i.e. where
> the ID written to SHMSel is unused) results in a length of -1.
>
>
> SHMBaseLow         Shared memory region 64 bit long physical address
> 0x0b8
> SHMBaseHigh
> 0x0bc
> R
>
> The driver reads these registers to discover the base address of the
> region in physical address space. This address is chosen by the device
> (or other part of the VMM). The lower 32 bits of the address are read
> from SHMBaseLow with the higher 32 bits from SHMBaseHigh. Reading from a
> non-existent region (i.e. where the ID written to SHMSel is unused)
> results in a base address of 0xffffffffffffffff.
> """
>
> I.e. after you write the shmid into SHMSel you can obtain the base
> address and the length of that region, provided there is a region
> with that id.


Oh, I may really be wrong.

@MST: we need your help.

The good news is that if my understanding is wrong, then this will be
simpler.

Thanks.


>
> >
> > This is the point that device provides multiple shared memory areas.
> > These memory areas are used to provide ISM Region.
> >
>
> Thus I don't think this is possible without violating the currently
> standing rules on shared memory regions.
>
> > >
> > > Can you have a look at the MMIO interface for virtio shared memory
> > > regions.
> >
> > Is there any special attention?
> >
>
> Please see above.
>
> >
> > >
> > > How far what ISM tries to do consistent with virtio shared memory. I
> > > mean, AFAIU once the shared memory is advertised, the  shared memory is
> > > supposed to be there and accessible by the driver for both writes and
> > > reads.
> >
> > Before allocation, VMM did not allocate physical memory for it. Therefore, it
> > was illegal to read and write this memory.
>
> I'm not sure whether generating addressing exceptions on accesses to the
> advertised virtio shared memory regions is acceptable or not. Maybe
> Michael, Connie and the TC members more versed in shared memory regions
> can help us get clarity on this.
>
> I don't think it is per-se prohibited, but on the other hand, to me
> virtio shared memory regions read more like memory that is supposed to
> work , and less like an address space that may or may not be backed by
> memory. So if this is allowed, maybe adding a sentence to the part that
> describes virtio shared memory regions in the basic facilities portion of
> the spec would be beneficial.
>
> Regards,
> Halil


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [virtio-dev] [PATCH v2 1/1] virtio-ism: introduce new device virtio-ism
  2023-01-11 21:12       ` Halil Pasic
  2023-01-12  7:01         ` Michael S. Tsirkin
  2023-01-12 12:31         ` Xuan Zhuo
@ 2023-01-12 12:40         ` Xuan Zhuo
  2023-02-05 12:30         ` Michael S. Tsirkin
  3 siblings, 0 replies; 37+ messages in thread
From: Xuan Zhuo @ 2023-01-12 12:40 UTC (permalink / raw)
  To: Halil Pasic
  Cc: virtio-dev, hans, herongguang, zmlcc, dust.li, tonylu, zhenzao,
	helinguo, gerry, mst, cohuck, jasowang, Jan Kiszka, wintera,
	kgraul, wenjia, jaka, hca, twinkler, raspl, Halil Pasic

On Wed, 11 Jan 2023 22:12:38 +0100, Halil Pasic <pasic@linux.ibm.com> wrote:
> On Wed, 11 Jan 2023 19:08:53 +0800
> Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote:
>
> > > > +The device shares memory to the driver based on shared memory regions
> > > > +\ref{sec:Basic Facilities of a Virtio Device / Shared Memory Regions}.
> > > > +However, it does not need to allocate physical memory during initialization.
> > > > +
> > > > +The \field{shmid} of a region MUST be one of the following
> > > > +\begin{lstlisting}
> > > > +enum virtio_ism_shm_id {
> > > > +	VIRTIO_ISM_SHM_ID_UNDEFINED = 0,
> > > > +	VIRTIO_ISM_SHM_ID_REGIONS   = 1,
> > > > +	VIRTIO_ISM_SHM_ID_NOTIFY    = 2,
> > > > +};
> > > > +\end{lstlisting}
> > > > +
> > > > +The shared memory whose shmid is VIRTIO_ISM_SHM_ID_REGIONS is used to implement
> > > > +ism regions.
> > >
> > > Hm. AFAIU, these are supposed to be used for exchanging data between the
> > > VMs that are supposed to communicate with each other via ism.
> > >
> > > How does this relate to the following normative section:
> > > """
> > > 2.10.2 Device Requirements: Shared Memory Regions
> > >
> > > Shared memory regions MUST NOT expose shared memory regions which are used to control the operation of the device, nor to stream data.
> > > """
> >
> > From https://lists.oasis-open.org/archives/virtio-comment/201906/msg00010.html
> >
> > 	I'm trying to find a way to say that people shouldn't side-step virtio
> > 	queues by putting everything in a big blob of shared memory.
> >
> > So I think this should not be conflict.
>
> AFAIU putting everything in a big blob of shared memory is what ISM
> does. I'm not saying that this is bad. But I'm for sure confused.
>
> In my opinion we should either clarify what is meant by that sentence
> and maybe also why, or get rid of it.
>
> @MST: What is your take on this?
>
> >
> > >
> > > > If there are multiple shared memories whose shmid is
> > > > +VIRTIO_ISM_SHM_ID_REGIONS, they are used as contiguous memory in the order of
> > > > +acquisition.
> > >
> > > Hm, I used to think that the shmid is an unique identifier for a shared
> > > memory region. How can multiple virtio shared memory regions have the
> > > same shmid?
> >
> > Shmid is used to define what this sharing memory is used to do, not its unique
> > identifier.
>
> As far as I can tell the shmid is supposed to uniquely identify a single
> shared memory region!
>
> " A device may have multiple shared memory regions associated with it.
> Each region has a shmid to identify it, the meaning of which is device-specific."
>
> @MST: Can you please help us to get clarity on this?
>
> BTW, this is why I pointed you to the corresponding MMIO interface:
> """
> SHMSel	             Shared memory id
> 0x0ac
> W
>
> Writing to this register selects the shared memory region 2.10 following
> operations on SHMLenLow, SHMLenHigh, SHMBaseLow and SHMBaseHigh apply
> to.
>
> SHMLenLow            Shared memory region 64 bit long length
> 0x0b0
> SHMLenHigh
> 0x0b4
> R
>
>
> These registers return the length of the shared memory region in bytes,
> as defined by the device for the region selected by the SHMSel register.
> The lower 32 bits of the length are read from SHMLenLow and the higher
> 32 bits from SHMLenHigh. Reading from a non-existent region (i.e. where
> the ID written to SHMSel is unused) results in a length of -1.
>
>
> SHMBaseLow         Shared memory region 64 bit long physical address
> 0x0b8
> SHMBaseHigh
> 0x0bc
> R
>
> The driver reads these registers to discover the base address of the
> region in physical address space. This address is chosen by the device
> (or other part of the VMM). The lower 32 bits of the address are read
> from SHMBaseLow with the higher 32 bits from SHMBaseHigh. Reading from a
> non-existent region (i.e. where the ID written to SHMSel is unused)
> results in a base address of 0xffffffffffffffff.
> """
>
> I.e. after you write the shmid into SHMSel you can obtain the base
> address and the length of that region, provided there is a region
> with that id.
>
> >
> > This is the point that device provides multiple shared memory areas.
> > These memory areas are used to provide ISM Region.
> >
>
> Thus I don't think this is possible without violating the currently
> standing rules on shared memory regions.
>
> > >
> > > Can you have a look at the MMIO interface for virtio shared memory
> > > regions.
> >
> > Is there any special attention?
> >
>
> Please see above.
>
> >
> > >
> > > How far what ISM tries to do consistent with virtio shared memory. I
> > > mean, AFAIU once the shared memory is advertised, the  shared memory is
> > > supposed to be there and accessible by the driver for both writes and
> > > reads.
> >
> > Before allocation, VMM did not allocate physical memory for it. Therefore, it
> > was illegal to read and write this memory.
>
> I'm not sure whether generating addressing exceptions on accesses to the
> advertised virtio shared memory regions is acceptable or not. Maybe
> Michael, Connie and the TC members more versed in shared memory regions
> can help us get clarity on this.


First of all, the virtio shared memory region is only a capability of the
hardware, and we don't necessarily map all of it to the OS when the device is
initialized. We can only map the allocated ism region to the os.

If we map all of it into the OS when the device is initialized, these memories
only exist in the driver, and no one will access it. Only when the upper layer
calls alloc/attach, we will provide an ism region to the upper layer for use,
and the premise is that we have completed the mapping of physical memory.

>
> I don't think it is per-se prohibited, but on the other hand, to me
> virtio shared memory regions read more like memory that is supposed to
> work , and less like an address space that may or may not be backed by
> memory. So if this is allowed, maybe adding a sentence to the part that
> describes virtio shared memory regions in the basic facilities portion of
> the spec would be beneficial.

I agree.

Thanks.


>
> Regards,
> Halil


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [virtio-dev] [PATCH v2 1/1] virtio-ism: introduce new device virtio-ism
  2023-01-12 11:48               ` Xuan Zhuo
@ 2023-01-12 14:30                 ` Cornelia Huck
  2023-01-12 15:41                   ` Halil Pasic
  0 siblings, 1 reply; 37+ messages in thread
From: Cornelia Huck @ 2023-01-12 14:30 UTC (permalink / raw)
  To: Xuan Zhuo
  Cc: Halil Pasic, virtio-dev, hans, herongguang, zmlcc, dust.li,
	tonylu, zhenzao, helinguo, gerry, Jan Kiszka, wintera, kgraul,
	wenjia, jaka, hca, twinkler, raspl, Michael S. Tsirkin,
	Jason Wang

On Thu, Jan 12 2023, Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote:

> On Thu, 12 Jan 2023 09:42:05 +0100, Cornelia Huck <cohuck@redhat.com> wrote:
>> On Thu, Jan 12 2023, "Michael S. Tsirkin" <mst@redhat.com> wrote:
>>
>> > On Thu, Jan 12, 2023 at 10:01:25AM +0800, Jason Wang wrote:
>> >> On Wed, Jan 11, 2023 at 11:11 PM Halil Pasic <pasic@linux.ibm.com> wrote:
>> >> >
>> >> > On Wed, 11 Jan 2023 19:08:53 +0800
>> >> > Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote:
>> >> >
>> >> > > > > +ISM(Internal Shared Memory) device provides the ability to share memory between
>> >> > > > > +different VMs launched from the same entity.
>> >> > > >
>> >> > > > Launched by instead of from? Maybe introduce a catchy name for the
>> >> > > > "entity that launched the VMs" and prevent oversimplification by
>> >> > > > explaining any shortcomings of the name if any in one place. Host would
>> >> > > > be one candidate, VMM another.
>> >> > >
>> >> > > Cornelia Huck wrote:
>> >> > >
>> >> > >       Is there a way to avoid the term "host" (throughout this document)?
>> >> > >       IIUC, you need the uniqueness within the scope of the entity that
>> >> > >       launches the different instances that get shared access to the regions
>> >> > >       (which could conceivably a unit of hardware?)
>> >> > >
>> >> > > And I think she is right, so I am trying to remove the term HOST.
>> >> > >
>> >> > > Do you have better opinions? I think VMM is not particularly suitable.
>> >
>> > I think fundamentally from spec POV memory is shared between devices.
>> > How sharing is accomplished guest does not care so neither should the
>> > spec. Can some RDMA tricks be used for synchronisation behind the
>> > scenes? Maybe, the spec does not care. But we can give an example.
>> >
>> > So something like:
>> >
>> > 	An ISM(Internal Shared Memory) device provides the ability to
>> > 	access memory shared between multiple devices. This allows low-overhead
>> > 	communication in presence of such memory. For example, memory can be
>> > 	shared with guests of multiple virtual machines running on the same
>> > 	host, with each virtual machine including an ISM device and with
>> > 	the guests using the ISM devices to access the shared memory.
>> >
>> > what do others think?
>>
>> I like that: we don't want to talk about hosts/VMMs/etc. as we
>> fundamentally deal with devices and drivers, but sharing between guests
>> is of course the obvious use case.
>>
>> I'm just wondering how best to express the uniqueness scope, is it per
>> (ISM) device?
>
> No, each vm has at least one separate device. The devices in a host form
> an uniqueness scope.

Should we call it a 'group', then? A host would be an example of such a
group.


---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [virtio-dev] [PATCH v2 1/1] virtio-ism: introduce new device virtio-ism
  2023-01-12 14:30                 ` Cornelia Huck
@ 2023-01-12 15:41                   ` Halil Pasic
  2023-01-12 16:07                     ` Cornelia Huck
  2023-01-13  1:58                     ` Xuan Zhuo
  0 siblings, 2 replies; 37+ messages in thread
From: Halil Pasic @ 2023-01-12 15:41 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: Xuan Zhuo, virtio-dev, hans, herongguang, zmlcc, dust.li, tonylu,
	zhenzao, helinguo, gerry, Jan Kiszka, wintera, kgraul, wenjia,
	jaka, hca, twinkler, raspl, Michael S. Tsirkin, Jason Wang,
	Halil Pasic

On Thu, 12 Jan 2023 15:30:58 +0100
Cornelia Huck <cohuck@redhat.com> wrote:

> >>
> >> I like that: we don't want to talk about hosts/VMMs/etc. as we
> >> fundamentally deal with devices and drivers, but sharing between guests
> >> is of course the obvious use case.
> >>
> >> I'm just wondering how best to express the uniqueness scope, is it per
> >> (ISM) device?  
> >
> > No, each vm has at least one separate device. The devices in a host form
> > an uniqueness scope.  
> 
> Should we call it a 'group', then? A host would be an example of such a
> group.

How about 'communication domain'? Devices within a single communication
domain may be able to speak to each other via SMC and may not have the
same device_id. Two devices from different communication domains can't
communicate via ISM, but may have the same device_id.

I don't like group because it is very generic, and may sound like
the grouping can be done arbitrarily. E.g. with a shared memory based
implementation akin to the PoC putting devices on different hosts into
the same 'group' should be illegal.

On the other hand there is also the following question. If we move away
form the one ID per host model ("The device MUST ensure that the gid on
the same entity i same and different from the gid on other entity.") then
we could also allow having more than one communication domains on a
single host (to limit what entities can use ISM to communicate).

Regards,
Halil


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [virtio-dev] [PATCH v2 1/1] virtio-ism: introduce new device virtio-ism
  2023-01-12 15:41                   ` Halil Pasic
@ 2023-01-12 16:07                     ` Cornelia Huck
  2023-01-13  1:58                     ` Xuan Zhuo
  1 sibling, 0 replies; 37+ messages in thread
From: Cornelia Huck @ 2023-01-12 16:07 UTC (permalink / raw)
  To: Halil Pasic
  Cc: Xuan Zhuo, virtio-dev, hans, herongguang, zmlcc, dust.li, tonylu,
	zhenzao, helinguo, gerry, Jan Kiszka, wintera, kgraul, wenjia,
	jaka, hca, twinkler, raspl, Michael S. Tsirkin, Jason Wang,
	Halil Pasic

On Thu, Jan 12 2023, Halil Pasic <pasic@linux.ibm.com> wrote:

> On Thu, 12 Jan 2023 15:30:58 +0100
> Cornelia Huck <cohuck@redhat.com> wrote:
>
>> >>
>> >> I like that: we don't want to talk about hosts/VMMs/etc. as we
>> >> fundamentally deal with devices and drivers, but sharing between guests
>> >> is of course the obvious use case.
>> >>
>> >> I'm just wondering how best to express the uniqueness scope, is it per
>> >> (ISM) device?  
>> >
>> > No, each vm has at least one separate device. The devices in a host form
>> > an uniqueness scope.  
>> 
>> Should we call it a 'group', then? A host would be an example of such a
>> group.
>
> How about 'communication domain'? Devices within a single communication
> domain may be able to speak to each other via SMC and may not have the
> same device_id. Two devices from different communication domains can't
> communicate via ISM, but may have the same device_id.
>
> I don't like group because it is very generic, and may sound like
> the grouping can be done arbitrarily. E.g. with a shared memory based
> implementation akin to the PoC putting devices on different hosts into
> the same 'group' should be illegal.

Yes, 'communication domain' sounds better.

>
> On the other hand there is also the following question. If we move away
> form the one ID per host model ("The device MUST ensure that the gid on
> the same entity i same and different from the gid on other entity.") then
> we could also allow having more than one communication domains on a
> single host (to limit what entities can use ISM to communicate).

Makes sense, I guess. But I haven't looked too much into the details of
ism yet.


---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [virtio-dev] [PATCH v2 1/1] virtio-ism: introduce new device virtio-ism
  2023-01-12 15:41                   ` Halil Pasic
  2023-01-12 16:07                     ` Cornelia Huck
@ 2023-01-13  1:58                     ` Xuan Zhuo
  2023-01-13  2:29                       ` Jason Wang
  1 sibling, 1 reply; 37+ messages in thread
From: Xuan Zhuo @ 2023-01-13  1:58 UTC (permalink / raw)
  To: Halil Pasic
  Cc: virtio-dev, hans, herongguang, zmlcc, dust.li, tonylu, zhenzao,
	helinguo, gerry, Jan Kiszka, wintera, kgraul, wenjia, jaka, hca,
	twinkler, raspl, Michael S. Tsirkin, Jason Wang, Halil Pasic,
	Cornelia Huck

On Thu, 12 Jan 2023 16:41:32 +0100, Halil Pasic <pasic@linux.ibm.com> wrote:
> On Thu, 12 Jan 2023 15:30:58 +0100
> Cornelia Huck <cohuck@redhat.com> wrote:
>
> > >>
> > >> I like that: we don't want to talk about hosts/VMMs/etc. as we
> > >> fundamentally deal with devices and drivers, but sharing between guests
> > >> is of course the obvious use case.
> > >>
> > >> I'm just wondering how best to express the uniqueness scope, is it per
> > >> (ISM) device?
> > >
> > > No, each vm has at least one separate device. The devices in a host form
> > > an uniqueness scope.
> >
> > Should we call it a 'group', then? A host would be an example of such a
> > group.
>
> How about 'communication domain'? Devices within a single communication
> domain may be able to speak to each other via SMC and may not have the
> same device_id. Two devices from different communication domains can't
> communicate via ISM, but may have the same device_id.

I agree.

>
> I don't like group because it is very generic, and may sound like
> the grouping can be done arbitrarily. E.g. with a shared memory based
> implementation akin to the PoC putting devices on different hosts into
> the same 'group' should be illegal.
>
> On the other hand there is also the following question. If we move away
> form the one ID per host model ("The device MUST ensure that the gid on
> the same entity i same and different from the gid on other entity.") then
> we could also allow having more than one communication domains on a
> single host (to limit what entities can use ISM to communicate).

Yes, this is a good idea.

Thanks.

>
> Regards,
> Halil
>


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [virtio-dev] [PATCH v2 1/1] virtio-ism: introduce new device virtio-ism
  2023-01-13  1:58                     ` Xuan Zhuo
@ 2023-01-13  2:29                       ` Jason Wang
  2023-01-13  6:24                         ` Xuan Zhuo
  0 siblings, 1 reply; 37+ messages in thread
From: Jason Wang @ 2023-01-13  2:29 UTC (permalink / raw)
  To: Xuan Zhuo
  Cc: Halil Pasic, virtio-dev, hans, herongguang, zmlcc, dust.li,
	tonylu, zhenzao, helinguo, gerry, Jan Kiszka, wintera, kgraul,
	wenjia, jaka, hca, twinkler, raspl, Michael S. Tsirkin,
	Cornelia Huck

On Fri, Jan 13, 2023 at 9:59 AM Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote:
>
> On Thu, 12 Jan 2023 16:41:32 +0100, Halil Pasic <pasic@linux.ibm.com> wrote:
> > On Thu, 12 Jan 2023 15:30:58 +0100
> > Cornelia Huck <cohuck@redhat.com> wrote:
> >
> > > >>
> > > >> I like that: we don't want to talk about hosts/VMMs/etc. as we
> > > >> fundamentally deal with devices and drivers, but sharing between guests
> > > >> is of course the obvious use case.
> > > >>
> > > >> I'm just wondering how best to express the uniqueness scope, is it per
> > > >> (ISM) device?
> > > >
> > > > No, each vm has at least one separate device. The devices in a host form
> > > > an uniqueness scope.
> > >
> > > Should we call it a 'group', then? A host would be an example of such a
> > > group.
> >
> > How about 'communication domain'? Devices within a single communication
> > domain may be able to speak to each other via SMC and may not have the
> > same device_id. Two devices from different communication domains can't
> > communicate via ISM, but may have the same device_id.
>
> I agree.
>
> >
> > I don't like group because it is very generic, and may sound like
> > the grouping can be done arbitrarily. E.g. with a shared memory based
> > implementation akin to the PoC putting devices on different hosts into
> > the same 'group' should be illegal.

Any reason why this is illegal?

> >
> > On the other hand there is also the following question. If we move away
> > form the one ID per host model ("The device MUST ensure that the gid on
> > the same entity i same and different from the gid on other entity.") then
> > we could also allow having more than one communication domains on a
> > single host (to limit what entities can use ISM to communicate).

Yes, but I think it might not be necessary to say how gid is actually
implemented, I can think most of the time it should be provisioned by
the the management stack which is probably out of the scope of the
spec.

Thanks

>
> Yes, this is a good idea.
>
> Thanks.
>
> >
> > Regards,
> > Halil
> >
>


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [virtio-dev] [PATCH v2 1/1] virtio-ism: introduce new device virtio-ism
  2023-01-13  2:29                       ` Jason Wang
@ 2023-01-13  6:24                         ` Xuan Zhuo
  2023-01-13 12:00                           ` Michael S. Tsirkin
  0 siblings, 1 reply; 37+ messages in thread
From: Xuan Zhuo @ 2023-01-13  6:24 UTC (permalink / raw)
  To: Jason Wang
  Cc: Halil Pasic, virtio-dev, hans, herongguang, zmlcc, dust.li,
	tonylu, zhenzao, helinguo, gerry, Jan Kiszka, wintera, kgraul,
	wenjia, jaka, hca, twinkler, raspl, Michael S. Tsirkin,
	Cornelia Huck

On Fri, 13 Jan 2023 10:29:49 +0800, Jason Wang <jasowang@redhat.com> wrote:
> On Fri, Jan 13, 2023 at 9:59 AM Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote:
> >
> > On Thu, 12 Jan 2023 16:41:32 +0100, Halil Pasic <pasic@linux.ibm.com> wrote:
> > > On Thu, 12 Jan 2023 15:30:58 +0100
> > > Cornelia Huck <cohuck@redhat.com> wrote:
> > >
> > > > >>
> > > > >> I like that: we don't want to talk about hosts/VMMs/etc. as we
> > > > >> fundamentally deal with devices and drivers, but sharing between guests
> > > > >> is of course the obvious use case.
> > > > >>
> > > > >> I'm just wondering how best to express the uniqueness scope, is it per
> > > > >> (ISM) device?
> > > > >
> > > > > No, each vm has at least one separate device. The devices in a host form
> > > > > an uniqueness scope.
> > > >
> > > > Should we call it a 'group', then? A host would be an example of such a
> > > > group.
> > >
> > > How about 'communication domain'? Devices within a single communication
> > > domain may be able to speak to each other via SMC and may not have the
> > > same device_id. Two devices from different communication domains can't
> > > communicate via ISM, but may have the same device_id.
> >
> > I agree.
> >
> > >
> > > I don't like group because it is very generic, and may sound like
> > > the grouping can be done arbitrarily. E.g. with a shared memory based
> > > implementation akin to the PoC putting devices on different hosts into
> > > the same 'group' should be illegal.
>
> Any reason why this is illegal?

The ism device must on the same host.

>
> > >
> > > On the other hand there is also the following question. If we move away
> > > form the one ID per host model ("The device MUST ensure that the gid on
> > > the same entity i same and different from the gid on other entity.") then
> > > we could also allow having more than one communication domains on a
> > > single host (to limit what entities can use ISM to communicate).
>
> Yes, but I think it might not be necessary to say how gid is actually
> implemented, I can think most of the time it should be provisioned by
> the the management stack which is probably out of the scope of the
> spec.

Imagine that the VMs from two different cloud manufacturers may have the same
GID (Host-Id). They believed that they can communicate based on ISM Device. This
is wrong.

Thanks.


>
> Thanks
>
> >
> > Yes, this is a good idea.
> >
> > Thanks.
> >
> > >
> > > Regards,
> > > Halil
> > >
> >
>


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [virtio-dev] [PATCH v2 1/1] virtio-ism: introduce new device virtio-ism
  2023-01-13  6:24                         ` Xuan Zhuo
@ 2023-01-13 12:00                           ` Michael S. Tsirkin
  2023-01-16  2:10                             ` Xuan Zhuo
  0 siblings, 1 reply; 37+ messages in thread
From: Michael S. Tsirkin @ 2023-01-13 12:00 UTC (permalink / raw)
  To: Xuan Zhuo
  Cc: Jason Wang, Halil Pasic, virtio-dev, hans, herongguang, zmlcc,
	dust.li, tonylu, zhenzao, helinguo, gerry, Jan Kiszka, wintera,
	kgraul, wenjia, jaka, hca, twinkler, raspl, Cornelia Huck

On Fri, Jan 13, 2023 at 02:24:14PM +0800, Xuan Zhuo wrote:
> On Fri, 13 Jan 2023 10:29:49 +0800, Jason Wang <jasowang@redhat.com> wrote:
> > On Fri, Jan 13, 2023 at 9:59 AM Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote:
> > >
> > > On Thu, 12 Jan 2023 16:41:32 +0100, Halil Pasic <pasic@linux.ibm.com> wrote:
> > > > On Thu, 12 Jan 2023 15:30:58 +0100
> > > > Cornelia Huck <cohuck@redhat.com> wrote:
> > > >
> > > > > >>
> > > > > >> I like that: we don't want to talk about hosts/VMMs/etc. as we
> > > > > >> fundamentally deal with devices and drivers, but sharing between guests
> > > > > >> is of course the obvious use case.
> > > > > >>
> > > > > >> I'm just wondering how best to express the uniqueness scope, is it per
> > > > > >> (ISM) device?
> > > > > >
> > > > > > No, each vm has at least one separate device. The devices in a host form
> > > > > > an uniqueness scope.
> > > > >
> > > > > Should we call it a 'group', then? A host would be an example of such a
> > > > > group.
> > > >
> > > > How about 'communication domain'? Devices within a single communication
> > > > domain may be able to speak to each other via SMC and may not have the
> > > > same device_id. Two devices from different communication domains can't
> > > > communicate via ISM, but may have the same device_id.
> > >
> > > I agree.
> > >
> > > >
> > > > I don't like group because it is very generic, and may sound like
> > > > the grouping can be done arbitrarily. E.g. with a shared memory based
> > > > implementation akin to the PoC putting devices on different hosts into
> > > > the same 'group' should be illegal.
> >
> > Any reason why this is illegal?
> 
> The ism device must on the same host.

Fundamentally the limitation is that
the devices must have access to the same memory.
This is what we care about not who runs the VMs there's
no need to mention that at all.


But I feel a bigger question is whether we can avoid making ISM
a migration blocker? E.g.:
- a lone vm is migrated, it's disconnected from memory on source
- a lone vm is migrated, it's connected to memory on destination
- a group of vms is migrated, the memory is migrated with them
  and they remain connected to it

I feel the switch to virtio is a good time to address these
issues, if we don't address this straight away then users
using virtio-ism will have no way to know whether their
VM is migrateable or not.


> >
> > > >
> > > > On the other hand there is also the following question. If we move away
> > > > form the one ID per host model ("The device MUST ensure that the gid on
> > > > the same entity i same and different from the gid on other entity.") then
> > > > we could also allow having more than one communication domains on a
> > > > single host (to limit what entities can use ISM to communicate).
> >
> > Yes, but I think it might not be necessary to say how gid is actually
> > implemented, I can think most of the time it should be provisioned by
> > the the management stack which is probably out of the scope of the
> > spec.
> 
> Imagine that the VMs from two different cloud manufacturers may have the same
> GID (Host-Id). They believed that they can communicate based on ISM Device. This
> is wrong.
> 
> Thanks.

Let's leave all this talk about entities out it just serves to
confuse. Same as with previous discussion, explain the
limitation: two devices can access the same shared memory if and
only if they have the same gid. And give an example of a host
running multiple VMs.


> 
> >
> > Thanks
> >
> > >
> > > Yes, this is a good idea.
> > >
> > > Thanks.
> > >
> > > >
> > > > Regards,
> > > > Halil
> > > >
> > >
> >


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [virtio-dev] [PATCH v2 1/1] virtio-ism: introduce new device virtio-ism
  2023-01-13 12:00                           ` Michael S. Tsirkin
@ 2023-01-16  2:10                             ` Xuan Zhuo
  2023-01-19 12:30                               ` Alexandra Winter
  0 siblings, 1 reply; 37+ messages in thread
From: Xuan Zhuo @ 2023-01-16  2:10 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Jason Wang, Halil Pasic, virtio-dev, hans, herongguang, zmlcc,
	dust.li, tonylu, zhenzao, helinguo, gerry, Jan Kiszka, wintera,
	kgraul, wenjia, jaka, hca, twinkler, raspl, Cornelia Huck

On Fri, 13 Jan 2023 07:00:51 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> On Fri, Jan 13, 2023 at 02:24:14PM +0800, Xuan Zhuo wrote:
> > On Fri, 13 Jan 2023 10:29:49 +0800, Jason Wang <jasowang@redhat.com> wrote:
> > > On Fri, Jan 13, 2023 at 9:59 AM Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote:
> > > >
> > > > On Thu, 12 Jan 2023 16:41:32 +0100, Halil Pasic <pasic@linux.ibm.com> wrote:
> > > > > On Thu, 12 Jan 2023 15:30:58 +0100
> > > > > Cornelia Huck <cohuck@redhat.com> wrote:
> > > > >
> > > > > > >>
> > > > > > >> I like that: we don't want to talk about hosts/VMMs/etc. as we
> > > > > > >> fundamentally deal with devices and drivers, but sharing between guests
> > > > > > >> is of course the obvious use case.
> > > > > > >>
> > > > > > >> I'm just wondering how best to express the uniqueness scope, is it per
> > > > > > >> (ISM) device?
> > > > > > >
> > > > > > > No, each vm has at least one separate device. The devices in a host form
> > > > > > > an uniqueness scope.
> > > > > >
> > > > > > Should we call it a 'group', then? A host would be an example of such a
> > > > > > group.
> > > > >
> > > > > How about 'communication domain'? Devices within a single communication
> > > > > domain may be able to speak to each other via SMC and may not have the
> > > > > same device_id. Two devices from different communication domains can't
> > > > > communicate via ISM, but may have the same device_id.
> > > >
> > > > I agree.
> > > >
> > > > >
> > > > > I don't like group because it is very generic, and may sound like
> > > > > the grouping can be done arbitrarily. E.g. with a shared memory based
> > > > > implementation akin to the PoC putting devices on different hosts into
> > > > > the same 'group' should be illegal.
> > >
> > > Any reason why this is illegal?
> >
> > The ism device must on the same host.
>
> Fundamentally the limitation is that
> the devices must have access to the same memory.
> This is what we care about not who runs the VMs there's
> no need to mention that at all.
>
>
> But I feel a bigger question is whether we can avoid making ISM
> a migration blocker? E.g.:
> - a lone vm is migrated, it's disconnected from memory on source
> - a lone vm is migrated, it's connected to memory on destination
> - a group of vms is migrated, the memory is migrated with them
>   and they remain connected to it
>
> I feel the switch to virtio is a good time to address these
> issues, if we don't address this straight away then users
> using virtio-ism will have no way to know whether their
> VM is migrateable or not.


Yes, realizing live migration is more troublesome. But if it is only used for
SMC, I think theoretically, SMC can recognize this situation and keep
connecting.

If app uses ism device directly, it is difficult to make application not feel in
the process of migration.



>
>
> > >
> > > > >
> > > > > On the other hand there is also the following question. If we move away
> > > > > form the one ID per host model ("The device MUST ensure that the gid on
> > > > > the same entity i same and different from the gid on other entity.") then
> > > > > we could also allow having more than one communication domains on a
> > > > > single host (to limit what entities can use ISM to communicate).
> > >
> > > Yes, but I think it might not be necessary to say how gid is actually
> > > implemented, I can think most of the time it should be provisioned by
> > > the the management stack which is probably out of the scope of the
> > > spec.
> >
> > Imagine that the VMs from two different cloud manufacturers may have the same
> > GID (Host-Id). They believed that they can communicate based on ISM Device. This
> > is wrong.
> >
> > Thanks.
>
> Let's leave all this talk about entities out it just serves to
> confuse. Same as with previous discussion, explain the
> limitation: two devices can access the same shared memory if and
> only if they have the same gid. And give an example of a host
> running multiple VMs.


I agree.

Thanks.



>
>
> >
> > >
> > > Thanks
> > >
> > > >
> > > > Yes, this is a good idea.
> > > >
> > > > Thanks.
> > > >
> > > > >
> > > > > Regards,
> > > > > Halil
> > > > >
> > > >
> > >
>

---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [virtio-dev] [PATCH v2 1/1] virtio-ism: introduce new device virtio-ism
  2023-01-16  2:10                             ` Xuan Zhuo
@ 2023-01-19 12:30                               ` Alexandra Winter
  2023-01-28  7:42                                 ` Xuan Zhuo
  0 siblings, 1 reply; 37+ messages in thread
From: Alexandra Winter @ 2023-01-19 12:30 UTC (permalink / raw)
  To: Xuan Zhuo, Michael S. Tsirkin, virtio-comment
  Cc: Jason Wang, Halil Pasic, virtio-dev, hans, herongguang, zmlcc,
	dust.li, tonylu, zhenzao, helinguo, gerry, Jan Kiszka, kgraul,
	wenjia, jaka, hca, twinkler, raspl, Cornelia Huck



On 16.01.23 03:10, Xuan Zhuo wrote:
> On Fri, 13 Jan 2023 07:00:51 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote:
>> On Fri, Jan 13, 2023 at 02:24:14PM +0800, Xuan Zhuo wrote:
>>> On Fri, 13 Jan 2023 10:29:49 +0800, Jason Wang <jasowang@redhat.com> wrote:
>>>> On Fri, Jan 13, 2023 at 9:59 AM Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote:
>>>>>
>>>>> On Thu, 12 Jan 2023 16:41:32 +0100, Halil Pasic <pasic@linux.ibm.com> wrote:
>>>>>> On Thu, 12 Jan 2023 15:30:58 +0100
>>>>>> Cornelia Huck <cohuck@redhat.com> wrote:
>>>>>>
[...]
>>>>>>>>>
>>>>>>>>> I'm just wondering how best to express the uniqueness scope, is it per
>>>>>>>>> (ISM) device?
>>>>>>>>
>>>>>>>> No, each vm has at least one separate device. The devices in a host form
>>>>>>>> an uniqueness scope.
>>>>>>>
>>>>>>> Should we call it a 'group', then? A host would be an example of such a
>>>>>>> group.
>>>>>>
>>>>>> How about 'communication domain'? Devices within a single communication
>>>>>> domain may be able to speak to each other via SMC and may not have the
>>>>>> same device_id. Two devices from different communication domains can't
>>>>>> communicate via ISM, but may have the same device_id.
>>>>>
>>>>> I agree.
>>>>>
>>>>>>
>>>>>> I don't like group because it is very generic, and may sound like
>>>>>> the grouping can be done arbitrarily. E.g. with a shared memory based
>>>>>> implementation akin to the PoC putting devices on different hosts into
>>>>>> the same 'group' should be illegal.
>>>>
>>>> Any reason why this is illegal?
>>>
>>> The ism device must on the same host.
>>
>> Fundamentally the limitation is that
>> the devices must have access to the same memory.
>> This is what we care about not who runs the VMs there's
>> no need to mention that at all.
>>
>>
[...]
>>>>
>>>>>>
>>>>>> On the other hand there is also the following question. If we move away
>>>>>> form the one ID per host model ("The device MUST ensure that the gid on
>>>>>> the same entity i same and different from the gid on other entity.") then
>>>>>> we could also allow having more than one communication domains on a
>>>>>> single host (to limit what entities can use ISM to communicate).
>>>>
>>>> Yes, but I think it might not be necessary to say how gid is actually
>>>> implemented, I can think most of the time it should be provisioned by
>>>> the the management stack which is probably out of the scope of the
>>>> spec.
>>>
>>> Imagine that the VMs from two different cloud manufacturers may have the same
>>> GID (Host-Id). They believed that they can communicate based on ISM Device. This
>>> is wrong.
>>>
>>> Thanks.
>>
>> Let's leave all this talk about entities out it just serves to
>> confuse. Same as with previous discussion, explain the
>> limitation: two devices can access the same shared memory if and
>> only if they have the same gid. And give an example of a host
>> running multiple VMs.
> 
> 
[...]
One purpose of the proposed virtio device is to work with the SMC-D(irect) protocol
(a variant of AF_SMC sockets). I understand that this virtio device should be generic
and usable for other purposes, but it also should be usable for SMC-D.
SMC-D today can be used with s390's ISM devices. There we had to solve the same questions
- How to find 2 ISM devices that can communicate?
Let me summarize the values we use in the SMC rendevous for this purpose, maybe this
helps this discussion. 

+ UEID(s) – (32 byte) Defined per system (OS instance). User defined group(s) of systems that should
     use SMC between each other.
     If the peer has only different UEIDs: Don’t try to use SMC with this peer.

+ SEID – (32 byte) Defined per system. Maximum space of systems that are able to use SMC-D between each other.
   Equals to the machine hardware / first level hypervisor. Is unique per machine, derived from unique machine ID.
    If the peer has a different SEID: Don’t try to use SMC-D with this peer.

+ CHID – (2 Bytes) Defined per SMC-D interface. Fabric ID of this interface. Unique against other fabrics on this machine. 
    Try to find a pair of SMC-D interfaces on the same fabric for the 2 peers for this SMC-D connection.

+ GID – (8 bytes) Defined per SMC-D interface. ID of this interface.
   For s390 ISM is actually globally unique. But for SMC-D protocol purpose just needs to be unique within the CHID (fabric).
   Use it to identify the chosen interfaces to the other peer.


You see the term GID is used for the interface, not for the fabric here. 

With this abstraction: Hardware, Fabric, Interface, we can say that 2 peers that are on the same
Hardware (SEID) and can find a common Fabric (CHID) can use SMC-D to communicate and identify each
other by Interface ID (GID).
So for the KVM example, the fabric would be defined in the KVM host, and needs to be unique
against other KVM hosts on the same hardware (in s390 KVM is at least the second level hipervisor)
and against s390 ISM fabrics.

HTH



^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [virtio-dev] [PATCH v2 1/1] virtio-ism: introduce new device virtio-ism
  2023-01-12 12:31         ` Xuan Zhuo
@ 2023-01-20 13:06           ` Halil Pasic
  0 siblings, 0 replies; 37+ messages in thread
From: Halil Pasic @ 2023-01-20 13:06 UTC (permalink / raw)
  To: Xuan Zhuo
  Cc: virtio-dev, hans, herongguang, zmlcc, dust.li, tonylu, zhenzao,
	helinguo, gerry, mst, cohuck, jasowang, Jan Kiszka, wintera,
	kgraul, wenjia, jaka, hca, twinkler, raspl, Halil Pasic

On Thu, 12 Jan 2023 20:31:24 +0800
Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote:

> > > Shmid is used to define what this sharing memory is used to do, not its unique
> > > identifier.  
> >
> > As far as I can tell the shmid is supposed to uniquely identify a single
> > shared memory region!
> >
> > " A device may have multiple shared memory regions associated with it.
> > Each region has a shmid to identify it, the meaning of which is device-specific."
> >
> > @MST: Can you please help us to get clarity on this?
> >
[..]
> 
> 
> Oh, I may really be wrong.
> 
> @MST: we need your help.

@Michael: Ping! Can you help us out on this one?


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v2 0/1] introduce virtio-ism: internal shared memory device
  2022-12-23  8:13 [PATCH v2 0/1] introduce virtio-ism: internal shared memory device Xuan Zhuo
  2022-12-23  8:13 ` [PATCH v2 1/1] virtio-ism: introduce new device virtio-ism Xuan Zhuo
@ 2023-01-25 12:55 ` Wenjia Zhang
  2023-03-01  9:34     ` [virtio-dev] " Tony Lu
  1 sibling, 1 reply; 37+ messages in thread
From: Wenjia Zhang @ 2023-01-25 12:55 UTC (permalink / raw)
  To: Xuan Zhuo
  Cc: virtio-dev, hans, herongguang, zmlcc, dust.li, tonylu, zhenzao,
	helinguo, gerry, mst, cohuck, jasowang, Jan Kiszka, wintera,
	kgraul, jaka, hca, twinkler, raspl



On 23.12.22 09:13, Xuan Zhuo wrote:
> Hello everyone,
> 
> # Background
> 
>      Nowadays, there is a common scenario to accelerate communication between
>      different VMs and containers, including light weight virtual machine based
>      containers. One way to achieve this is to colocate them on the same host.
>      However, the performance of inter-VM communication through network stack is
>      not optimal and may also waste extra CPU cycles. This scenario has been
>      discussed many times, but still no generic solution available [1] [2] [3].
> 
>      With pci-ivshmem + SMC(Shared Memory Communications: [4]) based PoC[5],
>      We found that by changing the communication channel between VMs from TCP to
>      SMC with shared memory, we can achieve superior performance for a common
>      socket-based application[5]:
>        - latency reduced by about 50%
>        - throughput increased by about 300%
>        - CPU consumption reduced by about 50%
> 
>      Since there is no particularly suitable shared memory management solution
>      matches the need for SMC(See ## Comparison with existing technology), and
>      virtio is the standard for communication in the virtualization world, we
>      want to implement a virtio-ism device based on virtio, which can support
>      on-demand memory sharing across VMs, containers or VM-container. To match
>      the needs of SMC, the virtio-ism device need to support:
> 
>      1. Dynamic provision: shared memory regions are dynamically allocated and
>         provisioned.
>      2. Multi-region management: the shared memory is divided into regions,
>         and a peer may allocate one or more regions from the same shared memory
>         device.
>      3. Permission control: the permission of each region can be set seperately.
>      4. Dynamic connection: each ism region of a device can be shared with
>         different devices, eventually a device can be shared with thousands of
>         devices
> 
> # Virtio ISM device
> 
>      ISM devices provide the ability to share memory between different guests on
>      a host. A guest's memory got from ism device can be shared with multiple
>      peers at the same time. This shared relationship can be dynamically created
>      and released.
> 
>      The shared memory obtained from the device is divided into multiple ism
>      regions for share. ISM device provides a mechanism to notify other ism
>      region referrers of content update events.
> 
> ## Design
> 
>      This is a structure diagram based on ism sharing between two vms.
> 
>      |-------------------------------------------------------------------------------------------------------------|
>      | |------------------------------------------------|       |------------------------------------------------| |
>      | | Guest                                          |       | Guest                                          | |
>      | |                                                |       |                                                | |
>      | |   ----------------                             |       |   ----------------                             | |
>      | |   |    driver    |     [M1]   [M2]   [M3]      |       |   |    driver    |             [M2]   [M3]     | |
>      | |   ----------------       |      |      |       |       |   ----------------               |      |      | |
>      | |    |cq|                  |map   |map   |map    |       |    |cq|                          |map   |map   | |
>      | |    |  |                  |      |      |       |       |    |  |                          |      |      | |
>      | |    |  |                -------------------     |       |    |  |                --------------------    | |
>      | |----|--|----------------|  device memory  |-----|       |----|--|----------------|  device memory   |----| |
>      | |    |  |                -------------------     |       |    |  |                --------------------    | |
>      | |                                |               |       |                               |                | |
>      | |                                |               |       |                               |                | |
>      | | Qemu                           |               |       | Qemu                          |                | |
>      | |--------------------------------+---------------|       |-------------------------------+----------------| |
>      |                                  |                                                       |                  |
>      |                                  |                                                       |                  |
>      |                                  |------------------------------+------------------------|                  |
>      |                                                                 |                                           |
>      |                                                                 |                                           |
>      |                                                   --------------------------                                |
>      |                                                    | M1 |   | M2 |   | M3 |                                 |
>      |                                                   --------------------------                                |
>      |                                                                                                             |
>      | HOST                                                                                                        |
>      ---------------------------------------------------------------------------------------------------------------
> 
> ## Inspiration
> 
>      Our design idea for virtio-ism comes from IBM's ISM device, to pay tribute,
>      we directly name this device "ism".
> 
>      Information about IBM ism device and SMC:
>        1. SMC reference: https://www.ibm.com/docs/en/zos/2.5.0?topic=system-shared-memory-communications
>        2. SMC-Dv2 and ISMv2 introduction: https://www.newera.com/INFO/SMCv2_Introduction_10-15-2020.pdf
>        3. ISM device: https://www.ibm.com/docs/en/linux-on-systems?topic=n-ism-device-driver-1
>        4. SMC protocol (including SMC-D): https://www.ibm.com/support/pages/system/files/inline-files/IBM%20Shared%20Memory%20Communications%20Version%202_2.pdf
>        5. SMC-D FAQ: https://www.ibm.com/support/pages/system/files/inline-files/2021-02-09-SMC-D-FAQ.pdf
> 
> ## ISM VLAN
> 
>      Since SMC uses TCP to handshake with IP facilities, virtio-ism device is not
>      bound to existing IP device, and the latest ISMv2 device doesn't require
>      VLAN. So it is not necessary for virtio-ism to support VLAN attributes.
> 
> ## Live Migration
> 
>      Currently SMC-D doesn't support migration to another device or fallback. And
>      SMC-R supports migration to another link, no fallback.
> 
>      So we may not support live migration for the time being.
> 
> ## About hot plugging of the ism device
> 
>      Hot plugging of devices is a heavier, possibly failed, time-consuming, and
>      less scalable operation. So, we don't plan to support it for now.
> 
> 
> # Usage (SMC as example)
> 
>      There is one of possible use cases:
> 
>      1. SMC calls the interface ism_alloc_region() of the ism driver to return
>         the location of a memory region in the PCI space and a token.
>      2. The ism driver mmap the memory region and return to SMC with the token
>      3. SMC passes the token to the connected peer
>      4. the peer calls the ism driver interface ism_attach_region(token) to
>         get the location of the PCI space of the shared memory
>      5. The connected pair communicating through the shared memory
> 
> # Comparison with existing technology
> 
> ## ivshmem or ivshmem 2.0 of Qemu
> 
>     1. ivshmem 1.0 is a large piece of memory that can be seen by all devices
>        that use this VM, so the security is not enough.
> 
>     2. ivshmem 2.0 is a shared memory belonging to a VM that can be read-only by
>        all other VMs that use the ivshmem 2.0 shared memory device, which also
>        does not meet our needs in terms of security.
> 
> ## vhost-pci and virtiovhostuser
> 
>      1. does not support dynamic allocation
>      2. one device just support connect to one vm
> 
> 
> # POC CODE
> 
> There are no functions related to eventq and perm yet.
> 
> ## Qemu (virtio ism device):
> 
>       https://github.com/fengidri/qemu/compare/7d66b74c4dd0d74d12c1d3d6de366242b13ed76d...ism-upstream-1216?expand=1
> 
>      Start qemu with option "--device virtio-ism-pci,disable-legacy=on, disable-modern=off".
> 
> ##  Kernel (virtio ism driver and smc support):
> 
>       https://github.com/fengidri/linux-kernel-virtio-ism/compare/6f8101eb21bab480537027e62c4b17021fb7ea5d...ism-upstream-1223
> 
I tried to run the PoC kernel on our platform and wanted to test the 
main path of SMC-D and SMC-R flows as we used to firstly. Unfortunately, 
most of our basic tests are failed and even ftrace could trigger a crash 
which points that "Unable to handle kernel pointer dereference in 
virtual kernel address space". Thus, I'm wondering:

- What aspects of the PoC would you like to get feedback on? Which parts 
are known to be not as good as they need to become? Or are known to need 
to be significantly expanded upon for reaching non-RFC maturity? What is 
suposed to stick around?
- Is there a version that does not break the existing scenarios in the 
pipe? Or maybe even already published somewhere else?
- Are you aware of the recent refactoring in SMC-D: 
https://git.kernel.org/netdev/net-next/c/8c81ba20349d ? Could you base 
your next version on this code?



> 
> ### SMC
> 
>      Support SMC-D works with virtio-ism.
> 
>      Use SMC with virtio-ism to accelerate inter-VM communication.
> 
>      1. insmod virtio-ism and smc module.
>      2. use smc-tools [1] to get the device name of SMC-D based on virtio-ism.
> 
>        $ smcd d # here is _virtio2_
>        FID  Type  PCI-ID        PCHID  InUse  #LGs  PNET-ID
>        0000 0     virtio2       0000   Yes       1  *C1
> 
>      3. add the nic and SMC-D device to the same pnet, do it in both client and server.
> 
>        $ smc_pnet -a -I eth1 c1 # use eth1 to setup SMC connection
>        $ smc_pnet -a -D virtio2 c1 # virtio2 is the virtio-ism device
> 
>      4. use SMC to accelerate your application, smc_run in [1] can do this.
> 
>        # smc_run use LD_PRELOAD to hijack socket syscall with AF_SMC
>        $ smc_run sockperf server --tcp # run in server
>        $ smc_run sockperf tp --tcp -i a.b.c.d # run in client
> 
>      [1] https://github.com/ibm-s390-linux/smc-tools
> 
>      Notice: The current POC state, we only tested some basic functions.
> 
> ### App inside user space
> 
>      The ism driver provide /dev/vismX interface, allow users to use Virtio-ISM
>      device in user space directly.
> 
>      Try tools/virtio/virtio-ism/virtio-ism-mmap
> 
>      Usage:
>           cd tools/virtio/virtio-ism/; make
>           insmode virtio-ism.ko
> 
>      case1: communicate
> 
>         vm1: ./virtio-ism-mmap alloc -> token
>         vm2: ./virtio-ism-mmap attach -t <token> --write-msg AAAA --commit
> 
>         vm2 will write msg to shared memory, then notify vm1. After vm1 receive
>         notify, then read from shared memory.
> 
>      case2: ping-pong test.
> 
>          vm1: ./virtio-ism-mmap server
>          vm2: ./virtio-ism-mmap -i 192.168.122.101 pp
> 
>          1. server alloc one ism region
>          2. client get the token by tcp
> 
>          3. client commit(kick) to server, server recv notify, commit(kick) to client
>          4. loop #3
> 
>      case3: throughput test.
> 
>          vm1: ./virtio-ism-mmap server
>          vm2: ./virtio-ism-mmap -i 192.168.122.101 tp
> 
>          1. server alloc one ism region
>          2. client get the token by tcp
> 
>          3. client write 1M data to ism region
>          4. client commit(kick) to server
>          5. server recv notify, copy the data, the commit(kick) back to client
>          6. loop #3-#5
> 
>      case4: throughput test with protocol defined by user.
> 
>          vm1: ./virtio-ism-mmap server
>          vm2: ./virtio-ism-mmap -i 192.168.122.101 tp --polling --tp-chunks 15 --msg-size 64k -n 50000
> 
>          Used the ism region as a ring.
> 
>          In this scene, client and server are in the polling mode. Test it on
>          my machine, the maximum can reach 12GBps
> 
> # References
> 
>      [1] https://projectacrn.github.io/latest/tutorials/enable_ivshmem.html
>      [2] https://dl.acm.org/doi/10.1145/2847562
>      [3] https://hal.archives-ouvertes.fr/hal-00368622/document
>      [4] https://lwn.net/Articles/711071/
>      [5] https://lore.kernel.org/netdev/20220720170048.20806-1-tonylu@linux.alibaba.com/T/
> 
> 
> If there are any problems, please point them out.
> Hope to hear from you, thank you.
> 
> v2:
>     1. add Attach/Detach event
>     2. add Events Filter
>     3. allow Alloc/Attach huge region
>     4. remove host/guest terms
> 
> v1:
>     1. cover letter adding explanation of ism vlan
>     2. spec add gid
>     3. explain the source of ideas about ism
>     4. POC support virtio-ism-smc.ko virtio-ism-dev.ko and support virtio-ism-mmap
> 
> 
> Xuan Zhuo (1):
>    virtio-ism: introduce new device virtio-ism
> 
>   conformance.tex |  24 +++
>   content.tex     |   1 +
>   virtio-ism.tex  | 472 ++++++++++++++++++++++++++++++++++++++++++++++++
>   3 files changed, 497 insertions(+)
>   create mode 100644 virtio-ism.tex
> 


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [virtio-dev] [PATCH v2 1/1] virtio-ism: introduce new device virtio-ism
  2023-01-19 12:30                               ` Alexandra Winter
@ 2023-01-28  7:42                                 ` Xuan Zhuo
  0 siblings, 0 replies; 37+ messages in thread
From: Xuan Zhuo @ 2023-01-28  7:42 UTC (permalink / raw)
  To: Alexandra Winter
  Cc: Jason Wang, Halil Pasic, virtio-dev, hans, herongguang, zmlcc,
	dust.li, tonylu, zhenzao, helinguo, gerry, Jan Kiszka, kgraul,
	wenjia, jaka, hca, twinkler, raspl, Cornelia Huck,
	Michael S. Tsirkin, virtio-comment

On Thu, 19 Jan 2023 13:30:09 +0100, Alexandra Winter <wintera@linux.ibm.com> wrote:
>
>
> On 16.01.23 03:10, Xuan Zhuo wrote:
> > On Fri, 13 Jan 2023 07:00:51 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> >> On Fri, Jan 13, 2023 at 02:24:14PM +0800, Xuan Zhuo wrote:
> >>> On Fri, 13 Jan 2023 10:29:49 +0800, Jason Wang <jasowang@redhat.com> wrote:
> >>>> On Fri, Jan 13, 2023 at 9:59 AM Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote:
> >>>>>
> >>>>> On Thu, 12 Jan 2023 16:41:32 +0100, Halil Pasic <pasic@linux.ibm.com> wrote:
> >>>>>> On Thu, 12 Jan 2023 15:30:58 +0100
> >>>>>> Cornelia Huck <cohuck@redhat.com> wrote:
> >>>>>>
> [...]
> >>>>>>>>>
> >>>>>>>>> I'm just wondering how best to express the uniqueness scope, is it per
> >>>>>>>>> (ISM) device?
> >>>>>>>>
> >>>>>>>> No, each vm has at least one separate device. The devices in a host form
> >>>>>>>> an uniqueness scope.
> >>>>>>>
> >>>>>>> Should we call it a 'group', then? A host would be an example of such a
> >>>>>>> group.
> >>>>>>
> >>>>>> How about 'communication domain'? Devices within a single communication
> >>>>>> domain may be able to speak to each other via SMC and may not have the
> >>>>>> same device_id. Two devices from different communication domains can't
> >>>>>> communicate via ISM, but may have the same device_id.
> >>>>>
> >>>>> I agree.
> >>>>>
> >>>>>>
> >>>>>> I don't like group because it is very generic, and may sound like
> >>>>>> the grouping can be done arbitrarily. E.g. with a shared memory based
> >>>>>> implementation akin to the PoC putting devices on different hosts into
> >>>>>> the same 'group' should be illegal.
> >>>>
> >>>> Any reason why this is illegal?
> >>>
> >>> The ism device must on the same host.
> >>
> >> Fundamentally the limitation is that
> >> the devices must have access to the same memory.
> >> This is what we care about not who runs the VMs there's
> >> no need to mention that at all.
> >>
> >>
> [...]
> >>>>
> >>>>>>
> >>>>>> On the other hand there is also the following question. If we move away
> >>>>>> form the one ID per host model ("The device MUST ensure that the gid on
> >>>>>> the same entity i same and different from the gid on other entity.") then
> >>>>>> we could also allow having more than one communication domains on a
> >>>>>> single host (to limit what entities can use ISM to communicate).
> >>>>
> >>>> Yes, but I think it might not be necessary to say how gid is actually
> >>>> implemented, I can think most of the time it should be provisioned by
> >>>> the the management stack which is probably out of the scope of the
> >>>> spec.
> >>>
> >>> Imagine that the VMs from two different cloud manufacturers may have the same
> >>> GID (Host-Id). They believed that they can communicate based on ISM Device. This
> >>> is wrong.
> >>>
> >>> Thanks.
> >>
> >> Let's leave all this talk about entities out it just serves to
> >> confuse. Same as with previous discussion, explain the
> >> limitation: two devices can access the same shared memory if and
> >> only if they have the same gid. And give an example of a host
> >> running multiple VMs.
> >
> >
> [...]
> One purpose of the proposed virtio device is to work with the SMC-D(irect) protocol
> (a variant of AF_SMC sockets). I understand that this virtio device should be generic
> and usable for other purposes, but it also should be usable for SMC-D.
> SMC-D today can be used with s390's ISM devices. There we had to solve the same questions
> - How to find 2 ISM devices that can communicate?
> Let me summarize the values we use in the SMC rendevous for this purpose, maybe this
> helps this discussion.
>
> + UEID(s) – (32 byte) Defined per system (OS instance). User defined group(s) of systems that should
>      use SMC between each other.
>      If the peer has only different UEIDs: Don’t try to use SMC with this peer.
>
> + SEID – (32 byte) Defined per system. Maximum space of systems that are able to use SMC-D between each other.
>    Equals to the machine hardware / first level hypervisor. Is unique per machine, derived from unique machine ID.
>     If the peer has a different SEID: Don’t try to use SMC-D with this peer.
>
> + CHID – (2 Bytes) Defined per SMC-D interface. Fabric ID of this interface. Unique against other fabrics on this machine.
>     Try to find a pair of SMC-D interfaces on the same fabric for the 2 peers for this SMC-D connection.
>
> + GID – (8 bytes) Defined per SMC-D interface. ID of this interface.
>    For s390 ISM is actually globally unique. But for SMC-D protocol purpose just needs to be unique within the CHID (fabric).
>    Use it to identify the chosen interfaces to the other peer.
>
>
> You see the term GID is used for the interface, not for the fabric here.
>
> With this abstraction: Hardware, Fabric, Interface, we can say that 2 peers that are on the same
> Hardware (SEID) and can find a common Fabric (CHID) can use SMC-D to communicate and identify each
> other by Interface ID (GID).
> So for the KVM example, the fabric would be defined in the KVM host, and needs to be unique
> against other KVM hosts on the same hardware (in s390 KVM is at least the second level hipervisor)
> and against s390 ISM fabrics.

Sorry for late, most of us are on Chinese New Year vacation.

Thanks for your explanation, I think I understand.

I agree.

So the problem we have to solve now is how to unify ISM device and virtio-ism
device, which is beneficial to SMC Protocol.

For Virtio-ISM, we hope to simply return to SMC (or other applications) a ID (8 or 16 bytes)
through device, and ensure that it is unique against other hosts.

Thanks.



>
> HTH
>
>


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [virtio-dev] [PATCH v2 1/1] virtio-ism: introduce new device virtio-ism
  2023-01-11 21:12       ` Halil Pasic
                           ` (2 preceding siblings ...)
  2023-01-12 12:40         ` Xuan Zhuo
@ 2023-02-05 12:30         ` Michael S. Tsirkin
  2023-02-06  2:15           ` Xuan Zhuo
  3 siblings, 1 reply; 37+ messages in thread
From: Michael S. Tsirkin @ 2023-02-05 12:30 UTC (permalink / raw)
  To: Halil Pasic
  Cc: Xuan Zhuo, virtio-dev, hans, herongguang, zmlcc, dust.li, tonylu,
	zhenzao, helinguo, gerry, cohuck, jasowang, Jan Kiszka, wintera,
	kgraul, wenjia, jaka, hca, twinkler, raspl

On Wed, Jan 11, 2023 at 10:12:38PM +0100, Halil Pasic wrote:
> On Wed, 11 Jan 2023 19:08:53 +0800
> Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote:
> 
> > > > +The device shares memory to the driver based on shared memory regions

exposes memory to the driver btw. "share" is already overused in this
text.

> > > > +\ref{sec:Basic Facilities of a Virtio Device / Shared Memory Regions}.
> > > > +However, it does not need to allocate physical memory during initialization.
> > > > +
> > > > +The \field{shmid} of a region MUST be one of the following
> > > > +\begin{lstlisting}
> > > > +enum virtio_ism_shm_id {
> > > > +	VIRTIO_ISM_SHM_ID_UNDEFINED = 0,
> > > > +	VIRTIO_ISM_SHM_ID_REGIONS   = 1,
> > > > +	VIRTIO_ISM_SHM_ID_NOTIFY    = 2,
> > > > +};
> > > > +\end{lstlisting}
> > > > +
> > > > +The shared memory whose shmid is VIRTIO_ISM_SHM_ID_REGIONS is used to implement
> > > > +ism regions.  
> > >
> > > Hm. AFAIU, these are supposed to be used for exchanging data between the
> > > VMs that are supposed to communicate with each other via ism.
> > >
> > > How does this relate to the following normative section:
> > > """
> > > 2.10.2 Device Requirements: Shared Memory Regions
> > >
> > > Shared memory regions MUST NOT expose shared memory regions which are used to control the operation of the device, nor to stream data.
> > > """  
> > 
> > From https://lists.oasis-open.org/archives/virtio-comment/201906/msg00010.html
> > 
> > 	I'm trying to find a way to say that people shouldn't side-step virtio
> > 	queues by putting everything in a big blob of shared memory.
> > 
> > So I think this should not be conflict.
> 
> AFAIU putting everything in a big blob of shared memory is what ISM
> does. I'm not saying that this is bad. But I'm for sure confused.
> 
> In my opinion we should either clarify what is meant by that sentence
> and maybe also why, or get rid of it.
> 
> @MST: What is your take on this?

My take is that I am equally confused. It will help readability
a lot if the text
- stops overusing the words "share" and "region" - just use them
  for one specific thing, anything else - call it something else.
  E.g. expose, area, etc.
- eschew abbreviation. E.g. don't say "ism" everywhere.
- stops mixing interface and implementation. focus on the interface and
  document that. then talk about what happens on the backend as an
  example. So there's device memory exposed to guest and device resources
  can be allocated later. For example this could be host memory
  and physical memory is not allocated. something like this.

> > 
> > >  
> > > > If there are multiple shared memories whose shmid is
> > > > +VIRTIO_ISM_SHM_ID_REGIONS, they are used as contiguous memory in the order of
> > > > +acquisition.  
> > >
> > > Hm, I used to think that the shmid is an unique identifier for a shared
> > > memory region. How can multiple virtio shared memory regions have the
> > > same shmid?  
> > 
> > Shmid is used to define what this sharing memory is used to do, not its unique
> > identifier.
> 
> As far as I can tell the shmid is supposed to uniquely identify a single
> shared memory region!
> 
> " A device may have multiple shared memory regions associated with it.
> Each region has a shmid to identify it, the meaning of which is device-specific." 
> 
> @MST: Can you please help us to get clarity on this?

My understanding would be that it uniquely identifies the region
within the device.
It is unfortunate that we don't have a conformance clause like this.
Let's add one?


> BTW, this is why I pointed you to the corresponding MMIO interface:
> """
> SHMSel	             Shared memory id
> 0x0ac
> W
> 
> Writing to this register selects the shared memory region 2.10 following
> operations on SHMLenLow, SHMLenHigh, SHMBaseLow and SHMBaseHigh apply
> to.
> 
> SHMLenLow            Shared memory region 64 bit long length 
> 0x0b0
> SHMLenHigh
> 0x0b4
> R
> 	
> 
> These registers return the length of the shared memory region in bytes,
> as defined by the device for the region selected by the SHMSel register.
> The lower 32 bits of the length are read from SHMLenLow and the higher
> 32 bits from SHMLenHigh. Reading from a non-existent region (i.e. where
> the ID written to SHMSel is unused) results in a length of -1.
> 	
> 
> SHMBaseLow         Shared memory region 64 bit long physical address 
> 0x0b8
> SHMBaseHigh
> 0x0bc
> R  
> 
> The driver reads these registers to discover the base address of the
> region in physical address space. This address is chosen by the device
> (or other part of the VMM). The lower 32 bits of the address are read
> from SHMBaseLow with the higher 32 bits from SHMBaseHigh. Reading from a
> non-existent region (i.e. where the ID written to SHMSel is unused)
> results in a base address of 0xffffffffffffffff. 
> """
> 
> I.e. after you write the shmid into SHMSel you can obtain the base
> address and the length of that region, provided there is a region
> with that id.
> 
> > 
> > This is the point that device provides multiple shared memory areas.
> > These memory areas are used to provide ISM Region.
> > 
> 
> Thus I don't think this is possible without violating the currently
> standing rules on shared memory regions.
> 
> > >
> > > Can you have a look at the MMIO interface for virtio shared memory
> > > regions.  
> > 
> > Is there any special attention?
> > 
> 
> Please see above.
> 
> > 
> > >
> > > How far what ISM tries to do consistent with virtio shared memory. I
> > > mean, AFAIU once the shared memory is advertised, the  shared memory is
> > > supposed to be there and accessible by the driver for both writes and
> > > reads.  
> > 
> > Before allocation, VMM did not allocate physical memory for it. Therefore, it
> > was illegal to read and write this memory.
> 
> I'm not sure whether generating addressing exceptions on accesses to the
> advertised virtio shared memory regions is acceptable or not. Maybe
> Michael, Connie and the TC members more versed in shared memory regions
> can help us get clarity on this.

At the moment spec does not say anything about it.

> I don't think it is per-se prohibited, but on the other hand, to me
> virtio shared memory regions read more like memory that is supposed to
> work , and less like an address space that may or may not be backed by
> memory. So if this is allowed, maybe adding a sentence to the part that
> describes virtio shared memory regions in the basic facilities portion of
> the spec would be beneficial.
> 
> Regards,
> Halil


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [virtio-dev] [PATCH v2 1/1] virtio-ism: introduce new device virtio-ism
  2023-02-05 12:30         ` Michael S. Tsirkin
@ 2023-02-06  2:15           ` Xuan Zhuo
  0 siblings, 0 replies; 37+ messages in thread
From: Xuan Zhuo @ 2023-02-06  2:15 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: virtio-dev, hans, herongguang, zmlcc, dust.li, tonylu, zhenzao,
	helinguo, gerry, cohuck, jasowang, Jan Kiszka, wintera, kgraul,
	wenjia, jaka, hca, twinkler, raspl, Halil Pasic

On Sun, 5 Feb 2023 07:30:08 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> On Wed, Jan 11, 2023 at 10:12:38PM +0100, Halil Pasic wrote:
> > On Wed, 11 Jan 2023 19:08:53 +0800
> > Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote:
> >
> > > > > +The device shares memory to the driver based on shared memory regions
>
> exposes memory to the driver btw. "share" is already overused in this
> text.
>
> > > > > +\ref{sec:Basic Facilities of a Virtio Device / Shared Memory Regions}.
> > > > > +However, it does not need to allocate physical memory during initialization.
> > > > > +
> > > > > +The \field{shmid} of a region MUST be one of the following
> > > > > +\begin{lstlisting}
> > > > > +enum virtio_ism_shm_id {
> > > > > +	VIRTIO_ISM_SHM_ID_UNDEFINED = 0,
> > > > > +	VIRTIO_ISM_SHM_ID_REGIONS   = 1,
> > > > > +	VIRTIO_ISM_SHM_ID_NOTIFY    = 2,
> > > > > +};
> > > > > +\end{lstlisting}
> > > > > +
> > > > > +The shared memory whose shmid is VIRTIO_ISM_SHM_ID_REGIONS is used to implement
> > > > > +ism regions.
> > > >
> > > > Hm. AFAIU, these are supposed to be used for exchanging data between the
> > > > VMs that are supposed to communicate with each other via ism.
> > > >
> > > > How does this relate to the following normative section:
> > > > """
> > > > 2.10.2 Device Requirements: Shared Memory Regions
> > > >
> > > > Shared memory regions MUST NOT expose shared memory regions which are used to control the operation of the device, nor to stream data.
> > > > """
> > >
> > > From https://lists.oasis-open.org/archives/virtio-comment/201906/msg00010.html
> > >
> > > 	I'm trying to find a way to say that people shouldn't side-step virtio
> > > 	queues by putting everything in a big blob of shared memory.
> > >
> > > So I think this should not be conflict.
> >
> > AFAIU putting everything in a big blob of shared memory is what ISM
> > does. I'm not saying that this is bad. But I'm for sure confused.
> >
> > In my opinion we should either clarify what is meant by that sentence
> > and maybe also why, or get rid of it.
> >
> > @MST: What is your take on this?
>
> My take is that I am equally confused. It will help readability
> a lot if the text
> - stops overusing the words "share" and "region" - just use them
>   for one specific thing, anything else - call it something else.
>   E.g. expose, area, etc.
> - eschew abbreviation. E.g. don't say "ism" everywhere.
> - stops mixing interface and implementation. focus on the interface and
>   document that. then talk about what happens on the backend as an
>   example. So there's device memory exposed to guest and device resources
>   can be allocated later. For example this could be host memory
>   and physical memory is not allocated. something like this.

Will fix.

But "ism" is the name of this device, it's not an abbreviation.

Thanks.

>
> > >
> > > >
> > > > > If there are multiple shared memories whose shmid is
> > > > > +VIRTIO_ISM_SHM_ID_REGIONS, they are used as contiguous memory in the order of
> > > > > +acquisition.
> > > >
> > > > Hm, I used to think that the shmid is an unique identifier for a shared
> > > > memory region. How can multiple virtio shared memory regions have the
> > > > same shmid?
> > >
> > > Shmid is used to define what this sharing memory is used to do, not its unique
> > > identifier.
> >
> > As far as I can tell the shmid is supposed to uniquely identify a single
> > shared memory region!
> >
> > " A device may have multiple shared memory regions associated with it.
> > Each region has a shmid to identify it, the meaning of which is device-specific."
> >
> > @MST: Can you please help us to get clarity on this?
>
> My understanding would be that it uniquely identifies the region
> within the device.
> It is unfortunate that we don't have a conformance clause like this.
> Let's add one?
>
>
> > BTW, this is why I pointed you to the corresponding MMIO interface:
> > """
> > SHMSel	             Shared memory id
> > 0x0ac
> > W
> >
> > Writing to this register selects the shared memory region 2.10 following
> > operations on SHMLenLow, SHMLenHigh, SHMBaseLow and SHMBaseHigh apply
> > to.
> >
> > SHMLenLow            Shared memory region 64 bit long length
> > 0x0b0
> > SHMLenHigh
> > 0x0b4
> > R
> >
> >
> > These registers return the length of the shared memory region in bytes,
> > as defined by the device for the region selected by the SHMSel register.
> > The lower 32 bits of the length are read from SHMLenLow and the higher
> > 32 bits from SHMLenHigh. Reading from a non-existent region (i.e. where
> > the ID written to SHMSel is unused) results in a length of -1.
> >
> >
> > SHMBaseLow         Shared memory region 64 bit long physical address
> > 0x0b8
> > SHMBaseHigh
> > 0x0bc
> > R
> >
> > The driver reads these registers to discover the base address of the
> > region in physical address space. This address is chosen by the device
> > (or other part of the VMM). The lower 32 bits of the address are read
> > from SHMBaseLow with the higher 32 bits from SHMBaseHigh. Reading from a
> > non-existent region (i.e. where the ID written to SHMSel is unused)
> > results in a base address of 0xffffffffffffffff.
> > """
> >
> > I.e. after you write the shmid into SHMSel you can obtain the base
> > address and the length of that region, provided there is a region
> > with that id.
> >
> > >
> > > This is the point that device provides multiple shared memory areas.
> > > These memory areas are used to provide ISM Region.
> > >
> >
> > Thus I don't think this is possible without violating the currently
> > standing rules on shared memory regions.
> >
> > > >
> > > > Can you have a look at the MMIO interface for virtio shared memory
> > > > regions.
> > >
> > > Is there any special attention?
> > >
> >
> > Please see above.
> >
> > >
> > > >
> > > > How far what ISM tries to do consistent with virtio shared memory. I
> > > > mean, AFAIU once the shared memory is advertised, the  shared memory is
> > > > supposed to be there and accessible by the driver for both writes and
> > > > reads.
> > >
> > > Before allocation, VMM did not allocate physical memory for it. Therefore, it
> > > was illegal to read and write this memory.
> >
> > I'm not sure whether generating addressing exceptions on accesses to the
> > advertised virtio shared memory regions is acceptable or not. Maybe
> > Michael, Connie and the TC members more versed in shared memory regions
> > can help us get clarity on this.
>
> At the moment spec does not say anything about it.
>
> > I don't think it is per-se prohibited, but on the other hand, to me
> > virtio shared memory regions read more like memory that is supposed to
> > work , and less like an address space that may or may not be backed by
> > memory. So if this is allowed, maybe adding a sentence to the part that
> > describes virtio shared memory regions in the basic facilities portion of
> > the spec would be beneficial.
> >
> > Regards,
> > Halil
>


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v2 0/1] introduce virtio-ism: internal shared memory device
@ 2023-03-01  9:34     ` Tony Lu
  0 siblings, 0 replies; 37+ messages in thread
From: Tony Lu @ 2023-03-01  9:34 UTC (permalink / raw)
  To: Wenjia Zhang
  Cc: Xuan Zhuo, virtio-dev, hans, herongguang, zmlcc, dust.li, zhenzao,
	helinguo, gerry, mst, cohuck, jasowang, Jan Kiszka, wintera,
	kgraul, jaka, hca, twinkler, raspl

On Wed, Jan 25, 2023 at 01:55:51PM +0100, Wenjia Zhang wrote:
> 
> 
> On 23.12.22 09:13, Xuan Zhuo wrote:
> > Hello everyone,
> > 
> > # Background
> > 
> >      Nowadays, there is a common scenario to accelerate communication between
> >      different VMs and containers, including light weight virtual machine based
> >      containers. One way to achieve this is to colocate them on the same host.
> >      However, the performance of inter-VM communication through network stack is
> >      not optimal and may also waste extra CPU cycles. This scenario has been
> >      discussed many times, but still no generic solution available [1] [2] [3].
> > 
> >      With pci-ivshmem + SMC(Shared Memory Communications: [4]) based PoC[5],
> >      We found that by changing the communication channel between VMs from TCP to
> >      SMC with shared memory, we can achieve superior performance for a common
> >      socket-based application[5]:
> >        - latency reduced by about 50%
> >        - throughput increased by about 300%
> >        - CPU consumption reduced by about 50%
> > 
> >      Since there is no particularly suitable shared memory management solution
> >      matches the need for SMC(See ## Comparison with existing technology), and
> >      virtio is the standard for communication in the virtualization world, we
> >      want to implement a virtio-ism device based on virtio, which can support
> >      on-demand memory sharing across VMs, containers or VM-container. To match
> >      the needs of SMC, the virtio-ism device need to support:
> > 
> >      1. Dynamic provision: shared memory regions are dynamically allocated and
> >         provisioned.
> >      2. Multi-region management: the shared memory is divided into regions,
> >         and a peer may allocate one or more regions from the same shared memory
> >         device.
> >      3. Permission control: the permission of each region can be set seperately.
> >      4. Dynamic connection: each ism region of a device can be shared with
> >         different devices, eventually a device can be shared with thousands of
> >         devices
> > 
> > # Virtio ISM device
> > 
> >      ISM devices provide the ability to share memory between different guests on
> >      a host. A guest's memory got from ism device can be shared with multiple
> >      peers at the same time. This shared relationship can be dynamically created
> >      and released.
> > 
> >      The shared memory obtained from the device is divided into multiple ism
> >      regions for share. ISM device provides a mechanism to notify other ism
> >      region referrers of content update events.
> > 
> > ## Design
> > 
> >      This is a structure diagram based on ism sharing between two vms.
> > 
> >      |-------------------------------------------------------------------------------------------------------------|
> >      | |------------------------------------------------|       |------------------------------------------------| |
> >      | | Guest                                          |       | Guest                                          | |
> >      | |                                                |       |                                                | |
> >      | |   ----------------                             |       |   ----------------                             | |
> >      | |   |    driver    |     [M1]   [M2]   [M3]      |       |   |    driver    |             [M2]   [M3]     | |
> >      | |   ----------------       |      |      |       |       |   ----------------               |      |      | |
> >      | |    |cq|                  |map   |map   |map    |       |    |cq|                          |map   |map   | |
> >      | |    |  |                  |      |      |       |       |    |  |                          |      |      | |
> >      | |    |  |                -------------------     |       |    |  |                --------------------    | |
> >      | |----|--|----------------|  device memory  |-----|       |----|--|----------------|  device memory   |----| |
> >      | |    |  |                -------------------     |       |    |  |                --------------------    | |
> >      | |                                |               |       |                               |                | |
> >      | |                                |               |       |                               |                | |
> >      | | Qemu                           |               |       | Qemu                          |                | |
> >      | |--------------------------------+---------------|       |-------------------------------+----------------| |
> >      |                                  |                                                       |                  |
> >      |                                  |                                                       |                  |
> >      |                                  |------------------------------+------------------------|                  |
> >      |                                                                 |                                           |
> >      |                                                                 |                                           |
> >      |                                                   --------------------------                                |
> >      |                                                    | M1 |   | M2 |   | M3 |                                 |
> >      |                                                   --------------------------                                |
> >      |                                                                                                             |
> >      | HOST                                                                                                        |
> >      ---------------------------------------------------------------------------------------------------------------
> > 
> > ## Inspiration
> > 
> >      Our design idea for virtio-ism comes from IBM's ISM device, to pay tribute,
> >      we directly name this device "ism".
> > 
> >      Information about IBM ism device and SMC:
> >        1. SMC reference: https://www.ibm.com/docs/en/zos/2.5.0?topic=system-shared-memory-communications
> >        2. SMC-Dv2 and ISMv2 introduction: https://www.newera.com/INFO/SMCv2_Introduction_10-15-2020.pdf
> >        3. ISM device: https://www.ibm.com/docs/en/linux-on-systems?topic=n-ism-device-driver-1
> >        4. SMC protocol (including SMC-D): https://www.ibm.com/support/pages/system/files/inline-files/IBM%20Shared%20Memory%20Communications%20Version%202_2.pdf
> >        5. SMC-D FAQ: https://www.ibm.com/support/pages/system/files/inline-files/2021-02-09-SMC-D-FAQ.pdf
> > 
> > ## ISM VLAN
> > 
> >      Since SMC uses TCP to handshake with IP facilities, virtio-ism device is not
> >      bound to existing IP device, and the latest ISMv2 device doesn't require
> >      VLAN. So it is not necessary for virtio-ism to support VLAN attributes.
> > 
> > ## Live Migration
> > 
> >      Currently SMC-D doesn't support migration to another device or fallback. And
> >      SMC-R supports migration to another link, no fallback.
> > 
> >      So we may not support live migration for the time being.
> > 
> > ## About hot plugging of the ism device
> > 
> >      Hot plugging of devices is a heavier, possibly failed, time-consuming, and
> >      less scalable operation. So, we don't plan to support it for now.
> > 
> > 
> > # Usage (SMC as example)
> > 
> >      There is one of possible use cases:
> > 
> >      1. SMC calls the interface ism_alloc_region() of the ism driver to return
> >         the location of a memory region in the PCI space and a token.
> >      2. The ism driver mmap the memory region and return to SMC with the token
> >      3. SMC passes the token to the connected peer
> >      4. the peer calls the ism driver interface ism_attach_region(token) to
> >         get the location of the PCI space of the shared memory
> >      5. The connected pair communicating through the shared memory
> > 
> > # Comparison with existing technology
> > 
> > ## ivshmem or ivshmem 2.0 of Qemu
> > 
> >     1. ivshmem 1.0 is a large piece of memory that can be seen by all devices
> >        that use this VM, so the security is not enough.
> > 
> >     2. ivshmem 2.0 is a shared memory belonging to a VM that can be read-only by
> >        all other VMs that use the ivshmem 2.0 shared memory device, which also
> >        does not meet our needs in terms of security.
> > 
> > ## vhost-pci and virtiovhostuser
> > 
> >      1. does not support dynamic allocation
> >      2. one device just support connect to one vm
> > 
> > 
> > # POC CODE
> > 
> > There are no functions related to eventq and perm yet.
> > 
> > ## Qemu (virtio ism device):
> > 
> >       https://github.com/fengidri/qemu/compare/7d66b74c4dd0d74d12c1d3d6de366242b13ed76d...ism-upstream-1216?expand=1
> > 
> >      Start qemu with option "--device virtio-ism-pci,disable-legacy=on, disable-modern=off".
> > 
> > ##  Kernel (virtio ism driver and smc support):
> > 
> >       https://github.com/fengidri/linux-kernel-virtio-ism/compare/6f8101eb21bab480537027e62c4b17021fb7ea5d...ism-upstream-1223
> > 
> I tried to run the PoC kernel on our platform and wanted to test the main
> path of SMC-D and SMC-R flows as we used to firstly. Unfortunately, most of
> our basic tests are failed and even ftrace could trigger a crash which
> points that "Unable to handle kernel pointer dereference in virtual kernel
> address space". Thus, I'm wondering:

Thanks Wenjia for your patient about this early version PoC.

Emm :-( there have some issues on different platform, and this version
isn't fully tested. We are going to refine this patch set, and refactor
with new APIs of SMC.

> 
> - What aspects of the PoC would you like to get feedback on? Which parts are
> known to be not as good as they need to become? Or are known to need to be
> significantly expanded upon for reaching non-RFC maturity? What is suposed
> to stick around?
> - Is there a version that does not break the existing scenarios in the pipe?
> Or maybe even already published somewhere else?

SMC with virtio-ism is inspired by IBM ISM device. virtio-ism can serves
SMC and gives SMC an ability to support inter-VM connection. To get this
done, we have two parts of work to do:
- propose virtio-ism device,
- and let SMC support it.

So we are going to invite you to discuss the proposal about virtio-ism
and how to do in SMC. We're very glad to hear your advice. The code is
in very early version, and try to help people understand our approach,
also it only published here and wait for your comments and advice.

When virtio-ism is accepted and implemented, we will send formal RFC to
SMC mail list. 

> - Are you aware of the recent refactoring in SMC-D:
> https://git.kernel.org/netdev/net-next/c/8c81ba20349d ? Could you base your
> next version on this code?

Yes, we are going to do it in next version. The refactoring work really
helps us to support virtio-ism.

Cheers,
Tony Lu

> 
> 
> 
> > 
> > ### SMC
> > 
> >      Support SMC-D works with virtio-ism.
> > 
> >      Use SMC with virtio-ism to accelerate inter-VM communication.
> > 
> >      1. insmod virtio-ism and smc module.
> >      2. use smc-tools [1] to get the device name of SMC-D based on virtio-ism.
> > 
> >        $ smcd d # here is _virtio2_
> >        FID  Type  PCI-ID        PCHID  InUse  #LGs  PNET-ID
> >        0000 0     virtio2       0000   Yes       1  *C1
> > 
> >      3. add the nic and SMC-D device to the same pnet, do it in both client and server.
> > 
> >        $ smc_pnet -a -I eth1 c1 # use eth1 to setup SMC connection
> >        $ smc_pnet -a -D virtio2 c1 # virtio2 is the virtio-ism device
> > 
> >      4. use SMC to accelerate your application, smc_run in [1] can do this.
> > 
> >        # smc_run use LD_PRELOAD to hijack socket syscall with AF_SMC
> >        $ smc_run sockperf server --tcp # run in server
> >        $ smc_run sockperf tp --tcp -i a.b.c.d # run in client
> > 
> >      [1] https://github.com/ibm-s390-linux/smc-tools
> > 
> >      Notice: The current POC state, we only tested some basic functions.
> > 
> > ### App inside user space
> > 
> >      The ism driver provide /dev/vismX interface, allow users to use Virtio-ISM
> >      device in user space directly.
> > 
> >      Try tools/virtio/virtio-ism/virtio-ism-mmap
> > 
> >      Usage:
> >           cd tools/virtio/virtio-ism/; make
> >           insmode virtio-ism.ko
> > 
> >      case1: communicate
> > 
> >         vm1: ./virtio-ism-mmap alloc -> token
> >         vm2: ./virtio-ism-mmap attach -t <token> --write-msg AAAA --commit
> > 
> >         vm2 will write msg to shared memory, then notify vm1. After vm1 receive
> >         notify, then read from shared memory.
> > 
> >      case2: ping-pong test.
> > 
> >          vm1: ./virtio-ism-mmap server
> >          vm2: ./virtio-ism-mmap -i 192.168.122.101 pp
> > 
> >          1. server alloc one ism region
> >          2. client get the token by tcp
> > 
> >          3. client commit(kick) to server, server recv notify, commit(kick) to client
> >          4. loop #3
> > 
> >      case3: throughput test.
> > 
> >          vm1: ./virtio-ism-mmap server
> >          vm2: ./virtio-ism-mmap -i 192.168.122.101 tp
> > 
> >          1. server alloc one ism region
> >          2. client get the token by tcp
> > 
> >          3. client write 1M data to ism region
> >          4. client commit(kick) to server
> >          5. server recv notify, copy the data, the commit(kick) back to client
> >          6. loop #3-#5
> > 
> >      case4: throughput test with protocol defined by user.
> > 
> >          vm1: ./virtio-ism-mmap server
> >          vm2: ./virtio-ism-mmap -i 192.168.122.101 tp --polling --tp-chunks 15 --msg-size 64k -n 50000
> > 
> >          Used the ism region as a ring.
> > 
> >          In this scene, client and server are in the polling mode. Test it on
> >          my machine, the maximum can reach 12GBps
> > 
> > # References
> > 
> >      [1] https://projectacrn.github.io/latest/tutorials/enable_ivshmem.html
> >      [2] https://dl.acm.org/doi/10.1145/2847562
> >      [3] https://hal.archives-ouvertes.fr/hal-00368622/document
> >      [4] https://lwn.net/Articles/711071/
> >      [5] https://lore.kernel.org/netdev/20220720170048.20806-1-tonylu@linux.alibaba.com/T/
> > 
> > 
> > If there are any problems, please point them out.
> > Hope to hear from you, thank you.
> > 
> > v2:
> >     1. add Attach/Detach event
> >     2. add Events Filter
> >     3. allow Alloc/Attach huge region
> >     4. remove host/guest terms
> > 
> > v1:
> >     1. cover letter adding explanation of ism vlan
> >     2. spec add gid
> >     3. explain the source of ideas about ism
> >     4. POC support virtio-ism-smc.ko virtio-ism-dev.ko and support virtio-ism-mmap
> > 
> > 
> > Xuan Zhuo (1):
> >    virtio-ism: introduce new device virtio-ism
> > 
> >   conformance.tex |  24 +++
> >   content.tex     |   1 +
> >   virtio-ism.tex  | 472 ++++++++++++++++++++++++++++++++++++++++++++++++
> >   3 files changed, 497 insertions(+)
> >   create mode 100644 virtio-ism.tex
> > 


^ permalink raw reply	[flat|nested] 37+ messages in thread

* [virtio-dev] Re: [PATCH v2 0/1] introduce virtio-ism: internal shared memory device
@ 2023-03-01  9:34     ` Tony Lu
  0 siblings, 0 replies; 37+ messages in thread
From: Tony Lu @ 2023-03-01  9:34 UTC (permalink / raw)
  To: Wenjia Zhang
  Cc: Xuan Zhuo, virtio-dev, hans, herongguang, zmlcc, dust.li, zhenzao,
	helinguo, gerry, mst, cohuck, jasowang, Jan Kiszka, wintera,
	kgraul, jaka, hca, twinkler, raspl

On Wed, Jan 25, 2023 at 01:55:51PM +0100, Wenjia Zhang wrote:
> 
> 
> On 23.12.22 09:13, Xuan Zhuo wrote:
> > Hello everyone,
> > 
> > # Background
> > 
> >      Nowadays, there is a common scenario to accelerate communication between
> >      different VMs and containers, including light weight virtual machine based
> >      containers. One way to achieve this is to colocate them on the same host.
> >      However, the performance of inter-VM communication through network stack is
> >      not optimal and may also waste extra CPU cycles. This scenario has been
> >      discussed many times, but still no generic solution available [1] [2] [3].
> > 
> >      With pci-ivshmem + SMC(Shared Memory Communications: [4]) based PoC[5],
> >      We found that by changing the communication channel between VMs from TCP to
> >      SMC with shared memory, we can achieve superior performance for a common
> >      socket-based application[5]:
> >        - latency reduced by about 50%
> >        - throughput increased by about 300%
> >        - CPU consumption reduced by about 50%
> > 
> >      Since there is no particularly suitable shared memory management solution
> >      matches the need for SMC(See ## Comparison with existing technology), and
> >      virtio is the standard for communication in the virtualization world, we
> >      want to implement a virtio-ism device based on virtio, which can support
> >      on-demand memory sharing across VMs, containers or VM-container. To match
> >      the needs of SMC, the virtio-ism device need to support:
> > 
> >      1. Dynamic provision: shared memory regions are dynamically allocated and
> >         provisioned.
> >      2. Multi-region management: the shared memory is divided into regions,
> >         and a peer may allocate one or more regions from the same shared memory
> >         device.
> >      3. Permission control: the permission of each region can be set seperately.
> >      4. Dynamic connection: each ism region of a device can be shared with
> >         different devices, eventually a device can be shared with thousands of
> >         devices
> > 
> > # Virtio ISM device
> > 
> >      ISM devices provide the ability to share memory between different guests on
> >      a host. A guest's memory got from ism device can be shared with multiple
> >      peers at the same time. This shared relationship can be dynamically created
> >      and released.
> > 
> >      The shared memory obtained from the device is divided into multiple ism
> >      regions for share. ISM device provides a mechanism to notify other ism
> >      region referrers of content update events.
> > 
> > ## Design
> > 
> >      This is a structure diagram based on ism sharing between two vms.
> > 
> >      |-------------------------------------------------------------------------------------------------------------|
> >      | |------------------------------------------------|       |------------------------------------------------| |
> >      | | Guest                                          |       | Guest                                          | |
> >      | |                                                |       |                                                | |
> >      | |   ----------------                             |       |   ----------------                             | |
> >      | |   |    driver    |     [M1]   [M2]   [M3]      |       |   |    driver    |             [M2]   [M3]     | |
> >      | |   ----------------       |      |      |       |       |   ----------------               |      |      | |
> >      | |    |cq|                  |map   |map   |map    |       |    |cq|                          |map   |map   | |
> >      | |    |  |                  |      |      |       |       |    |  |                          |      |      | |
> >      | |    |  |                -------------------     |       |    |  |                --------------------    | |
> >      | |----|--|----------------|  device memory  |-----|       |----|--|----------------|  device memory   |----| |
> >      | |    |  |                -------------------     |       |    |  |                --------------------    | |
> >      | |                                |               |       |                               |                | |
> >      | |                                |               |       |                               |                | |
> >      | | Qemu                           |               |       | Qemu                          |                | |
> >      | |--------------------------------+---------------|       |-------------------------------+----------------| |
> >      |                                  |                                                       |                  |
> >      |                                  |                                                       |                  |
> >      |                                  |------------------------------+------------------------|                  |
> >      |                                                                 |                                           |
> >      |                                                                 |                                           |
> >      |                                                   --------------------------                                |
> >      |                                                    | M1 |   | M2 |   | M3 |                                 |
> >      |                                                   --------------------------                                |
> >      |                                                                                                             |
> >      | HOST                                                                                                        |
> >      ---------------------------------------------------------------------------------------------------------------
> > 
> > ## Inspiration
> > 
> >      Our design idea for virtio-ism comes from IBM's ISM device, to pay tribute,
> >      we directly name this device "ism".
> > 
> >      Information about IBM ism device and SMC:
> >        1. SMC reference: https://www.ibm.com/docs/en/zos/2.5.0?topic=system-shared-memory-communications
> >        2. SMC-Dv2 and ISMv2 introduction: https://www.newera.com/INFO/SMCv2_Introduction_10-15-2020.pdf
> >        3. ISM device: https://www.ibm.com/docs/en/linux-on-systems?topic=n-ism-device-driver-1
> >        4. SMC protocol (including SMC-D): https://www.ibm.com/support/pages/system/files/inline-files/IBM%20Shared%20Memory%20Communications%20Version%202_2.pdf
> >        5. SMC-D FAQ: https://www.ibm.com/support/pages/system/files/inline-files/2021-02-09-SMC-D-FAQ.pdf
> > 
> > ## ISM VLAN
> > 
> >      Since SMC uses TCP to handshake with IP facilities, virtio-ism device is not
> >      bound to existing IP device, and the latest ISMv2 device doesn't require
> >      VLAN. So it is not necessary for virtio-ism to support VLAN attributes.
> > 
> > ## Live Migration
> > 
> >      Currently SMC-D doesn't support migration to another device or fallback. And
> >      SMC-R supports migration to another link, no fallback.
> > 
> >      So we may not support live migration for the time being.
> > 
> > ## About hot plugging of the ism device
> > 
> >      Hot plugging of devices is a heavier, possibly failed, time-consuming, and
> >      less scalable operation. So, we don't plan to support it for now.
> > 
> > 
> > # Usage (SMC as example)
> > 
> >      There is one of possible use cases:
> > 
> >      1. SMC calls the interface ism_alloc_region() of the ism driver to return
> >         the location of a memory region in the PCI space and a token.
> >      2. The ism driver mmap the memory region and return to SMC with the token
> >      3. SMC passes the token to the connected peer
> >      4. the peer calls the ism driver interface ism_attach_region(token) to
> >         get the location of the PCI space of the shared memory
> >      5. The connected pair communicating through the shared memory
> > 
> > # Comparison with existing technology
> > 
> > ## ivshmem or ivshmem 2.0 of Qemu
> > 
> >     1. ivshmem 1.0 is a large piece of memory that can be seen by all devices
> >        that use this VM, so the security is not enough.
> > 
> >     2. ivshmem 2.0 is a shared memory belonging to a VM that can be read-only by
> >        all other VMs that use the ivshmem 2.0 shared memory device, which also
> >        does not meet our needs in terms of security.
> > 
> > ## vhost-pci and virtiovhostuser
> > 
> >      1. does not support dynamic allocation
> >      2. one device just support connect to one vm
> > 
> > 
> > # POC CODE
> > 
> > There are no functions related to eventq and perm yet.
> > 
> > ## Qemu (virtio ism device):
> > 
> >       https://github.com/fengidri/qemu/compare/7d66b74c4dd0d74d12c1d3d6de366242b13ed76d...ism-upstream-1216?expand=1
> > 
> >      Start qemu with option "--device virtio-ism-pci,disable-legacy=on, disable-modern=off".
> > 
> > ##  Kernel (virtio ism driver and smc support):
> > 
> >       https://github.com/fengidri/linux-kernel-virtio-ism/compare/6f8101eb21bab480537027e62c4b17021fb7ea5d...ism-upstream-1223
> > 
> I tried to run the PoC kernel on our platform and wanted to test the main
> path of SMC-D and SMC-R flows as we used to firstly. Unfortunately, most of
> our basic tests are failed and even ftrace could trigger a crash which
> points that "Unable to handle kernel pointer dereference in virtual kernel
> address space". Thus, I'm wondering:

Thanks Wenjia for your patient about this early version PoC.

Emm :-( there have some issues on different platform, and this version
isn't fully tested. We are going to refine this patch set, and refactor
with new APIs of SMC.

> 
> - What aspects of the PoC would you like to get feedback on? Which parts are
> known to be not as good as they need to become? Or are known to need to be
> significantly expanded upon for reaching non-RFC maturity? What is suposed
> to stick around?
> - Is there a version that does not break the existing scenarios in the pipe?
> Or maybe even already published somewhere else?

SMC with virtio-ism is inspired by IBM ISM device. virtio-ism can serves
SMC and gives SMC an ability to support inter-VM connection. To get this
done, we have two parts of work to do:
- propose virtio-ism device,
- and let SMC support it.

So we are going to invite you to discuss the proposal about virtio-ism
and how to do in SMC. We're very glad to hear your advice. The code is
in very early version, and try to help people understand our approach,
also it only published here and wait for your comments and advice.

When virtio-ism is accepted and implemented, we will send formal RFC to
SMC mail list. 

> - Are you aware of the recent refactoring in SMC-D:
> https://git.kernel.org/netdev/net-next/c/8c81ba20349d ? Could you base your
> next version on this code?

Yes, we are going to do it in next version. The refactoring work really
helps us to support virtio-ism.

Cheers,
Tony Lu

> 
> 
> 
> > 
> > ### SMC
> > 
> >      Support SMC-D works with virtio-ism.
> > 
> >      Use SMC with virtio-ism to accelerate inter-VM communication.
> > 
> >      1. insmod virtio-ism and smc module.
> >      2. use smc-tools [1] to get the device name of SMC-D based on virtio-ism.
> > 
> >        $ smcd d # here is _virtio2_
> >        FID  Type  PCI-ID        PCHID  InUse  #LGs  PNET-ID
> >        0000 0     virtio2       0000   Yes       1  *C1
> > 
> >      3. add the nic and SMC-D device to the same pnet, do it in both client and server.
> > 
> >        $ smc_pnet -a -I eth1 c1 # use eth1 to setup SMC connection
> >        $ smc_pnet -a -D virtio2 c1 # virtio2 is the virtio-ism device
> > 
> >      4. use SMC to accelerate your application, smc_run in [1] can do this.
> > 
> >        # smc_run use LD_PRELOAD to hijack socket syscall with AF_SMC
> >        $ smc_run sockperf server --tcp # run in server
> >        $ smc_run sockperf tp --tcp -i a.b.c.d # run in client
> > 
> >      [1] https://github.com/ibm-s390-linux/smc-tools
> > 
> >      Notice: The current POC state, we only tested some basic functions.
> > 
> > ### App inside user space
> > 
> >      The ism driver provide /dev/vismX interface, allow users to use Virtio-ISM
> >      device in user space directly.
> > 
> >      Try tools/virtio/virtio-ism/virtio-ism-mmap
> > 
> >      Usage:
> >           cd tools/virtio/virtio-ism/; make
> >           insmode virtio-ism.ko
> > 
> >      case1: communicate
> > 
> >         vm1: ./virtio-ism-mmap alloc -> token
> >         vm2: ./virtio-ism-mmap attach -t <token> --write-msg AAAA --commit
> > 
> >         vm2 will write msg to shared memory, then notify vm1. After vm1 receive
> >         notify, then read from shared memory.
> > 
> >      case2: ping-pong test.
> > 
> >          vm1: ./virtio-ism-mmap server
> >          vm2: ./virtio-ism-mmap -i 192.168.122.101 pp
> > 
> >          1. server alloc one ism region
> >          2. client get the token by tcp
> > 
> >          3. client commit(kick) to server, server recv notify, commit(kick) to client
> >          4. loop #3
> > 
> >      case3: throughput test.
> > 
> >          vm1: ./virtio-ism-mmap server
> >          vm2: ./virtio-ism-mmap -i 192.168.122.101 tp
> > 
> >          1. server alloc one ism region
> >          2. client get the token by tcp
> > 
> >          3. client write 1M data to ism region
> >          4. client commit(kick) to server
> >          5. server recv notify, copy the data, the commit(kick) back to client
> >          6. loop #3-#5
> > 
> >      case4: throughput test with protocol defined by user.
> > 
> >          vm1: ./virtio-ism-mmap server
> >          vm2: ./virtio-ism-mmap -i 192.168.122.101 tp --polling --tp-chunks 15 --msg-size 64k -n 50000
> > 
> >          Used the ism region as a ring.
> > 
> >          In this scene, client and server are in the polling mode. Test it on
> >          my machine, the maximum can reach 12GBps
> > 
> > # References
> > 
> >      [1] https://projectacrn.github.io/latest/tutorials/enable_ivshmem.html
> >      [2] https://dl.acm.org/doi/10.1145/2847562
> >      [3] https://hal.archives-ouvertes.fr/hal-00368622/document
> >      [4] https://lwn.net/Articles/711071/
> >      [5] https://lore.kernel.org/netdev/20220720170048.20806-1-tonylu@linux.alibaba.com/T/
> > 
> > 
> > If there are any problems, please point them out.
> > Hope to hear from you, thank you.
> > 
> > v2:
> >     1. add Attach/Detach event
> >     2. add Events Filter
> >     3. allow Alloc/Attach huge region
> >     4. remove host/guest terms
> > 
> > v1:
> >     1. cover letter adding explanation of ism vlan
> >     2. spec add gid
> >     3. explain the source of ideas about ism
> >     4. POC support virtio-ism-smc.ko virtio-ism-dev.ko and support virtio-ism-mmap
> > 
> > 
> > Xuan Zhuo (1):
> >    virtio-ism: introduce new device virtio-ism
> > 
> >   conformance.tex |  24 +++
> >   content.tex     |   1 +
> >   virtio-ism.tex  | 472 ++++++++++++++++++++++++++++++++++++++++++++++++
> >   3 files changed, 497 insertions(+)
> >   create mode 100644 virtio-ism.tex
> > 

---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 37+ messages in thread

end of thread, other threads:[~2023-03-01  9:35 UTC | newest]

Thread overview: 37+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2022-12-23  8:13 [PATCH v2 0/1] introduce virtio-ism: internal shared memory device Xuan Zhuo
2022-12-23  8:13 ` [PATCH v2 1/1] virtio-ism: introduce new device virtio-ism Xuan Zhuo
2023-01-10 22:34   ` [virtio-dev] " Halil Pasic
2023-01-11 11:08     ` Xuan Zhuo
2023-01-11 15:11       ` Halil Pasic
2023-01-12  2:01         ` Jason Wang
2023-01-12  6:56           ` Michael S. Tsirkin
2023-01-12  8:42             ` Cornelia Huck
2023-01-12 11:48               ` Xuan Zhuo
2023-01-12 14:30                 ` Cornelia Huck
2023-01-12 15:41                   ` Halil Pasic
2023-01-12 16:07                     ` Cornelia Huck
2023-01-13  1:58                     ` Xuan Zhuo
2023-01-13  2:29                       ` Jason Wang
2023-01-13  6:24                         ` Xuan Zhuo
2023-01-13 12:00                           ` Michael S. Tsirkin
2023-01-16  2:10                             ` Xuan Zhuo
2023-01-19 12:30                               ` Alexandra Winter
2023-01-28  7:42                                 ` Xuan Zhuo
2023-01-12 11:47             ` Xuan Zhuo
2023-01-12 12:15             ` Halil Pasic
2023-01-11 15:22       ` Halil Pasic
2023-01-12 11:57         ` Xuan Zhuo
2023-01-11 15:30       ` Halil Pasic
2023-01-12 12:03         ` Xuan Zhuo
2023-01-11 20:46       ` Halil Pasic
2023-01-12 12:23         ` Xuan Zhuo
2023-01-11 21:12       ` Halil Pasic
2023-01-12  7:01         ` Michael S. Tsirkin
2023-01-12 12:31         ` Xuan Zhuo
2023-01-20 13:06           ` Halil Pasic
2023-01-12 12:40         ` Xuan Zhuo
2023-02-05 12:30         ` Michael S. Tsirkin
2023-02-06  2:15           ` Xuan Zhuo
2023-01-25 12:55 ` [PATCH v2 0/1] introduce virtio-ism: internal shared memory device Wenjia Zhang
2023-03-01  9:34   ` Tony Lu
2023-03-01  9:34     ` [virtio-dev] " Tony Lu

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.