From: marcandre.lureau@redhat.com
To: qemu-devel@nongnu.org
Cc: peter.maydell@linaro.org, drjones@redhat.com,
claudio.fontana@huawei.com,
"David Marchand" <david.marchand@6wind.com>,
stefanha@redhat.com,
"Marc-André Lureau" <marcandre.lureau@redhat.com>,
pbonzini@redhat.com, cam@cs.ualberta.ca
Subject: [Qemu-devel] [PULL v2 36/50] docs: update ivshmem device spec
Date: Mon, 12 Oct 2015 18:41:30 +0200
Message-ID: <1444668104-22955-37-git-send-email-marcandre.lureau@redhat.com>
In-Reply-To: <1444668104-22955-1-git-send-email-marcandre.lureau@redhat.com>
From: David Marchand <david.marchand@6wind.com>
Add some notes on the parts needed to use ivshmem devices: more specifically,
explain the purpose of an ivshmem server and the basic concepts for using
ivshmem devices in guests.
Move some parts of the documentation and reorganise it.
Signed-off-by: David Marchand <david.marchand@6wind.com>
Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
---
docs/specs/ivshmem_device_spec.txt | 124 +++++++++++++++++++++++++++----------
1 file changed, 93 insertions(+), 31 deletions(-)
diff --git a/docs/specs/ivshmem_device_spec.txt b/docs/specs/ivshmem_device_spec.txt
index 667a862..12f338e 100644
--- a/docs/specs/ivshmem_device_spec.txt
+++ b/docs/specs/ivshmem_device_spec.txt
@@ -2,30 +2,103 @@
Device Specification for Inter-VM shared memory device
------------------------------------------------------
-The Inter-VM shared memory device is designed to share a region of memory to
-userspace in multiple virtual guests. The memory region does not belong to any
-guest, but is a POSIX memory object on the host. Optionally, the device may
-support sending interrupts to other guests sharing the same memory region.
+The Inter-VM shared memory device is designed to share a memory region (created
+on the host via the POSIX shared memory API) between multiple QEMU processes
+running different guests. In order for all guests to be able to pick up the
+shared memory area, it is modeled by QEMU as a PCI device exposing said memory
+to the guest as a PCI BAR.
+The memory region does not belong to any guest, but is a POSIX memory object on
+the host. The host can access this shared memory if needed.
+
+The device also provides an optional communication mechanism between guests
+sharing the same memory object. For details, see the 'Guest to guest
+communication' section.
The Inter-VM PCI device
-----------------------
-*BARs*
+From the VM point of view, the ivshmem PCI device supports three BARs.
+
+- BAR0 is a 1 Kbyte MMIO region to support registers and interrupts when MSI is
+ not used.
+- BAR1 is used for MSI-X when it is enabled in the device.
+- BAR2 is used to access the shared memory object.
+
+How you use the device is up to you, but you must choose between two
+behaviors:
+
+- If you only need the shared memory part, map BAR2. The guest then has access
+ to the shared memory and can use it as it sees fit (memnic, for example, uses
+ it in userland:
+ http://dpdk.org/browse/memnic).
+
+- BAR0 and BAR1 are used to implement an optional communication mechanism
+ through interrupts in the guests. If you need an event mechanism between the
+ guests accessing the shared memory, you will most likely want to write a
+ kernel driver that will handle interrupts. See the 'Guest to guest
+ communication' section for details.
+
+The behavior is chosen when starting your QEMU processes:
+- no communication mechanism needed: the first QEMU to start creates the shared
+ memory on the host, and subsequent QEMU processes use it.
+
+- communication mechanism needed: an ivshmem server must be started before any
+ QEMU process; each QEMU process then connects to the server's unix socket.
+
+For more details on the QEMU ivshmem parameters, see the qemu-doc documentation.
+
+
+Guest to guest communication
+----------------------------
+
+This section details the communication mechanism between the guests accessing
+the ivshmem shared memory.
-The device supports three BARs. BAR0 is a 1 Kbyte MMIO region to support
-registers. BAR1 is used for MSI-X when it is enabled in the device. BAR2 is
-used to map the shared memory object from the host. The size of BAR2 is
-specified when the guest is started and must be a power of 2 in size.
+*ivshmem server*
-*Registers*
+This server code is available in qemu.git/contrib/ivshmem-server.
-The device currently supports 4 registers of 32-bits each. Registers
-are used for synchronization between guests sharing the same memory object when
-interrupts are supported (this requires using the shared memory server).
+The server must be started on the host before any guest.
+It creates a shared memory object then waits for clients to connect on a unix
+socket.
-The server assigns each VM an ID number and sends this ID number to the QEMU
-process when the guest starts.
+For each client (QEMU process) that connects to the server:
+- the server assigns an ID to this client and sends this ID to it as the first
+ message,
+- the server sends an fd to the shared memory object to this client,
+- the server creates a new set of host eventfds associated with the new client
+ and sends this set to all already connected clients,
+- finally, the server sends the eventfd sets of all clients to the new
+ client.
+
+The server signals all clients when one of them disconnects.
+
+The client IDs are limited to 16 bits because of the current implementation (see
+Doorbell register in 'PCI device registers' subsection). Hence only 65536
+clients are supported.
+
+All the file descriptors (fd to the shared memory, eventfds for each client)
+are passed to clients using SCM_RIGHTS over the server unix socket.
+
+Apart from the current ivshmem implementation in QEMU, an ivshmem client is
+provided in qemu.git/contrib/ivshmem-client for debugging.
+
+*QEMU as an ivshmem client*
+
+At initialisation, when creating the ivshmem device, QEMU gets its ID from the
+server, then makes it available to the VM through the BAR0 IVPosition register
+(see 'PCI device registers' subsection).
+QEMU then uses the fd to the shared memory to map it to BAR2.
+The eventfds for all other clients received from the server are stored to
+implement the BAR0 Doorbell register (see 'PCI device registers' subsection).
+Finally, the eventfds assigned to this QEMU process are used to send interrupts
+in this VM.
+
+*PCI device registers*
+
+From the VM point of view, the ivshmem PCI device supports 4 registers of
+32-bits each.
enum ivshmem_registers {
IntrMask = 0,
@@ -49,8 +122,8 @@ bit to 0 and unmasked by setting the first bit to 1.
IVPosition Register: The IVPosition register is read-only and reports the
guest's ID number. The guest IDs are non-negative integers. When using the
server, since the server is a separate process, the VM ID will only be set when
-the device is ready (shared memory is received from the server and accessible via
-the device). If the device is not ready, the IVPosition will return -1.
+the device is ready (shared memory is received from the server and accessible
+via the device). If the device is not ready, the IVPosition will return -1.
Applications should ensure that they have a valid VM ID before accessing the
shared memory.
@@ -59,8 +132,8 @@ Doorbell register. The doorbell register is 32-bits, logically divided into
two 16-bit fields. The high 16-bits are the guest ID to interrupt and the low
16-bits are the interrupt vector to trigger. The semantics of the value
written to the doorbell depends on whether the device is using MSI or a regular
-pin-based interrupt. In short, MSI uses vectors while regular interrupts set the
-status register.
+pin-based interrupt. In short, MSI uses vectors while regular interrupts set
+the status register.
Regular Interrupts
@@ -71,7 +144,7 @@ interrupt in the destination guest.
Message Signalled Interrupts
-A ivshmem device may support multiple MSI vectors. If so, the lower 16-bits
+An ivshmem device may support multiple MSI vectors. If so, the lower 16-bits
written to the Doorbell register must be between 0 and the maximum number of
vectors the guest supports. The lower 16 bits written to the doorbell is the
MSI vector that will be raised in the destination guest. The number of MSI
@@ -83,14 +156,3 @@ interrupt itself should be communicated via the shared memory region. Devices
supporting multiple MSI vectors can use different vectors to indicate different
events have occurred. The semantics of interrupt vectors are left to the
user's discretion.
-
-
-Usage in the Guest
-------------------
-
-The shared memory device is intended to be used with the provided UIO driver.
-Very little configuration is needed. The guest should map BAR0 to access the
-registers (an array of 32-bit ints allows simple writing) and map BAR2 to
-access the shared memory region itself. The size of the shared memory region
-is specified when the guest (or shared memory server) is started. A guest may
-map the whole shared memory region or only part of it.
--
2.4.3
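
As an illustration of the Doorbell register layout described in the spec text
above (high 16 bits: ID of the guest to interrupt, low 16 bits: interrupt
vector), here is a minimal sketch in C. It is not part of the patch. The names
ivshmem_doorbell_value(), ivshmem_doorbell_peer(), ivshmem_doorbell_vector()
and ivshmem_ring() are hypothetical helpers made up for this example, and the
caller is assumed to already hold a pointer to the Doorbell register inside a
mapped BAR0. A real kernel driver would use the platform's MMIO accessors
rather than a plain volatile store.

```c
#include <assert.h>
#include <stdint.h>

/* Compose a Doorbell value: peer ID in the high 16 bits, vector in the low 16. */
static inline uint32_t ivshmem_doorbell_value(uint16_t peer_id, uint16_t vector)
{
    return ((uint32_t)peer_id << 16) | vector;
}

/* Extract the destination peer ID (high 16 bits) from a Doorbell value. */
static inline uint16_t ivshmem_doorbell_peer(uint32_t value)
{
    return (uint16_t)(value >> 16);
}

/* Extract the interrupt vector (low 16 bits) from a Doorbell value. */
static inline uint16_t ivshmem_doorbell_vector(uint32_t value)
{
    return (uint16_t)(value & 0xffffu);
}

/*
 * Hypothetical helper: write a Doorbell value through a pointer that the
 * caller obtained by mapping BAR0 (assumption: 'doorbell' points at the
 * Doorbell register). 'nvectors' is the number of vectors the caller
 * believes the destination supports; the check is a debugging aid only.
 */
static inline void ivshmem_ring(volatile uint32_t *doorbell,
                                uint16_t peer_id, uint16_t vector,
                                uint16_t nvectors)
{
    assert(vector < nvectors);
    *doorbell = ivshmem_doorbell_value(peer_id, vector);
}
```

For example, ringing vector 1 of the peer with ID 3 writes the value 0x00030001.
Keeping the peer ID in a uint16_t mirrors the 16-bit limit on client IDs
mentioned in the 'ivshmem server' subsection.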