From: Claudio Fontana <claudio.fontana@huawei.com>
To: David Marchand <david.marchand@6wind.com>, qemu-devel@nongnu.org
Cc: pbonzini@redhat.com, jani.kokkonen@huawei.com,
cam@cs.ualberta.ca, kvm@vger.kernel.org
Subject: Re: [Qemu-devel] [PATCH v2 2/2] docs: update ivshmem device spec
Date: Mon, 21 Jul 2014 10:19:03 +0200 [thread overview]
Message-ID: <53CCCCF7.5030104@huawei.com> (raw)
In-Reply-To: <1405849119-13569-3-git-send-email-david.marchand@6wind.com>
On 20.07.2014 11:38, David Marchand wrote:
> Add some notes on the parts needed to use ivshmem devices: more specifically,
> explain the purpose of an ivshmem server and the basic concepts for using
> ivshmem devices in guests.
> Move some parts of the documentation and re-organise it.
>
> Signed-off-by: David Marchand <david.marchand@6wind.com>
Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com>
> ---
> docs/specs/ivshmem_device_spec.txt | 124 +++++++++++++++++++++++++++---------
> 1 file changed, 93 insertions(+), 31 deletions(-)
>
> diff --git a/docs/specs/ivshmem_device_spec.txt b/docs/specs/ivshmem_device_spec.txt
> index 667a862..f5f2b95 100644
> --- a/docs/specs/ivshmem_device_spec.txt
> +++ b/docs/specs/ivshmem_device_spec.txt
> @@ -2,30 +2,103 @@
> Device Specification for Inter-VM shared memory device
> ------------------------------------------------------
>
> -The Inter-VM shared memory device is designed to share a region of memory to
> -userspace in multiple virtual guests. The memory region does not belong to any
> -guest, but is a POSIX memory object on the host. Optionally, the device may
> -support sending interrupts to other guests sharing the same memory region.
> +The Inter-VM shared memory device is designed to share a memory region (created
> +on the host via the POSIX shared memory API) between multiple QEMU processes
> +running different guests. In order for all guests to be able to pick up the
> +shared memory area, it is modeled by QEMU as a PCI device exposing said memory
> +to the guest as a PCI BAR.
> +The memory region does not belong to any guest, but is a POSIX memory object on
> +the host. The host can access this shared memory if needed.
> +
> +The device also provides an optional communication mechanism between guests
> +sharing the same memory object. More details about this can be found in the
> +'Guest to guest communication' section.
>
>
> The Inter-VM PCI device
> -----------------------
>
> -*BARs*
> +From the VM point of view, the ivshmem PCI device supports three BARs.
> +
> +- BAR0 is a 1 Kbyte MMIO region to support registers and interrupts when MSI is
> + not used.
> +- BAR1 is used for MSI-X when it is enabled in the device.
> +- BAR2 is used to access the shared memory object.
> +
> +It is your choice how to use the device, but you must choose between two
> +behaviors:
> +
> +- if you only need the shared memory part, you will map BAR2. This way, you
> + have access to the shared memory in the guest and can use it as you see fit
> + (memnic, for example, uses it in userland: http://dpdk.org/browse/memnic).
> +
> +- BAR0 and BAR1 are used to implement an optional communication mechanism
> + through interrupts in the guests. If you need an event mechanism between the
> + guests accessing the shared memory, you will most likely want to write a
> + kernel driver that will handle interrupts. See details in the 'Guest to
> + guest communication' section.
> +
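
Just to illustrate the first case above (only the shared memory part, mapped
from guest userland), a minimal sketch could look like the code below; the PCI
sysfs path is only an example, the actual slot depends on the guest, and error
handling is kept to a minimum:

    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/mman.h>
    #include <sys/stat.h>
    #include <unistd.h>

    int main(void)
    {
        /* example path, the actual PCI address depends on the guest */
        const char *res = "/sys/bus/pci/devices/0000:00:04.0/resource2";
        struct stat st;
        void *shmem;
        int fd;

        fd = open(res, O_RDWR);
        if (fd < 0 || fstat(fd, &st) < 0) {
            perror(res);
            return 1;
        }
        /* BAR2 is the shared memory object exposed by the device */
        shmem = mmap(NULL, st.st_size, PROT_READ | PROT_WRITE, MAP_SHARED,
                     fd, 0);
        if (shmem == MAP_FAILED) {
            perror("mmap");
            return 1;
        }
        printf("BAR2 mapped at %p, size %ld\n", shmem, (long)st.st_size);
        /* ... use the shared memory as seen fit ... */
        munmap(shmem, st.st_size);
        close(fd);
        return 0;
    }
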
> +The behavior is chosen when starting your QEMU processes:
> +- if no communication mechanism is needed, the first QEMU to start creates
> + the shared memory on the host, and subsequent QEMU processes will use it.
> +
> +- if the communication mechanism is needed, an ivshmem server must be started
> + before any QEMU process; each QEMU process then connects to the server unix
> + socket.
> +
> +For more details on the QEMU ivshmem parameters, see qemu-doc documentation.
> +
> +
> +Guest to guest communication
> +----------------------------
> +
> +This section details the communication mechanism between the guests accessing
> +the ivshmem shared memory.
>
> -The device supports three BARs. BAR0 is a 1 Kbyte MMIO region to support
> -registers. BAR1 is used for MSI-X when it is enabled in the device. BAR2 is
> -used to map the shared memory object from the host. The size of BAR2 is
> -specified when the guest is started and must be a power of 2 in size.
> +*ivshmem server*
>
> -*Registers*
> +This server code is available in qemu.git/contrib/ivshmem-server.
>
> -The device currently supports 4 registers of 32-bits each. Registers
> -are used for synchronization between guests sharing the same memory object when
> -interrupts are supported (this requires using the shared memory server).
> +The server must be started on the host before any guest.
> +It creates a shared memory object, then waits for clients to connect on a
> +unix socket.
>
> -The server assigns each VM an ID number and sends this ID number to the QEMU
> -process when the guest starts.
> +For each client (QEMU process) that connects to the server:
> +- the server assigns an ID to this client and sends this ID to it as the
> + first message,
> +- the server sends this client an fd referring to the shared memory object,
> +- the server creates a new set of host eventfds associated with the new client
> + and sends this set to all already connected clients,
> +- finally, the server sends the eventfd sets of all clients to the new
> + client.
> +
> +The server signals all clients when one of them disconnects.
> +
> +The client IDs are limited to 16 bits because of the current implementation
> +(see the Doorbell register in the 'PCI device registers' subsection). Hence, at
> +most 65536 clients are supported.
> +
> +All the file descriptors (fd to the shared memory, eventfds for each client)
> +are passed to clients using SCM_RIGHTS over the server unix socket.
> +
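
For readers not familiar with SCM_RIGHTS, the client side of this fd passing
looks roughly like the sketch below; the helper name and the one-fd-per-message
layout with a 64-bit payload are assumptions of the sketch, not necessarily the
exact wire protocol of the ivshmem server:

    #include <stdint.h>
    #include <string.h>
    #include <sys/socket.h>
    #include <sys/uio.h>

    /*
     * Receive one message carrying a 64-bit value and, optionally, one
     * file descriptor attached with SCM_RIGHTS.
     */
    static int recv_one_msg(int sock, int64_t *value, int *fd)
    {
        char ctrl[CMSG_SPACE(sizeof(int))];
        struct iovec iov = { .iov_base = value, .iov_len = sizeof(*value) };
        struct msghdr msg = {
            .msg_iov = &iov, .msg_iovlen = 1,
            .msg_control = ctrl, .msg_controllen = sizeof(ctrl),
        };
        struct cmsghdr *cmsg;

        *fd = -1;
        if (recvmsg(sock, &msg, 0) != sizeof(*value)) {
            return -1;
        }
        /* look for an attached file descriptor */
        for (cmsg = CMSG_FIRSTHDR(&msg); cmsg; cmsg = CMSG_NXTHDR(&msg, cmsg)) {
            if (cmsg->cmsg_level == SOL_SOCKET &&
                cmsg->cmsg_type == SCM_RIGHTS) {
                memcpy(fd, CMSG_DATA(cmsg), sizeof(*fd));
            }
        }
        return 0;
    }
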
> +Apart from the current ivshmem implementation in QEMU, an ivshmem client has
> +been provided in qemu.git/contrib/ivshmem-client for debugging purposes.
> +
> +*QEMU as an ivshmem client*
> +
> +At initialisation, when creating the ivshmem device, QEMU gets its ID from the
> +server, then makes it available through the BAR0 IVPosition register for the VM
> +to use (see the 'PCI device registers' subsection).
> +QEMU then maps the shared memory object to BAR2 using the fd received from the
> +server.
> +The eventfds received for all other clients are stored to implement the BAR0
> +Doorbell register (see the 'PCI device registers' subsection).
> +Finally, the eventfds assigned to this QEMU process are used to trigger
> +interrupts in this VM.
> +
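
Maybe worth spelling out for readers how the pieces fit together: on the
sending side, a Doorbell write conceptually ends up as an eventfd_write() on
the eventfd received from the server for the targeted peer and vector. This is
only a simplification, not how the QEMU code is actually structured, and the
array name and limits below are made up for the example:

    #include <stdint.h>
    #include <sys/eventfd.h>

    #define MAX_PEERS   16   /* made-up limits, for the example only */
    #define MAX_VECTORS 4

    /* eventfds received from the server, one set per connected client */
    static int peer_eventfds[MAX_PEERS][MAX_VECTORS];

    /* conceptual handling of a guest write to the Doorbell register */
    static void doorbell_write(uint32_t value)
    {
        uint16_t peer   = value >> 16;      /* high 16 bits: destination ID */
        uint16_t vector = value & 0xffff;   /* low 16 bits: vector to raise */

        if (peer < MAX_PEERS && vector < MAX_VECTORS) {
            /* signalling the eventfd raises the interrupt in the peer VM */
            eventfd_write(peer_eventfds[peer][vector], 1);
        }
    }
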
> +*PCI device registers*
> +
> +From the VM point of view, the ivshmem PCI device supports 4 registers of
> +32 bits each.
>
> enum ivshmem_registers {
> IntrMask = 0,
> @@ -49,8 +122,8 @@ bit to 0 and unmasked by setting the first bit to 1.
> IVPosition Register: The IVPosition register is read-only and reports the
> guest's ID number. The guest IDs are non-negative integers. When using the
> server, since the server is a separate process, the VM ID will only be set when
> -the device is ready (shared memory is received from the server and accessible via
> -the device). If the device is not ready, the IVPosition will return -1.
> +the device is ready (shared memory is received from the server and accessible
> +via the device). If the device is not ready, the IVPosition will return -1.
> Applications should ensure that they have a valid VM ID before accessing the
> shared memory.
>
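
As a small usage note, a guest application relying on the server would wait
for a valid ID before touching the shared memory, along the lines of the
sketch below; bar0 is assumed to be a mapping of BAR0 obtained elsewhere (UIO
driver, sysfs resource file, ...), and the offset comes from the registers
enum of this spec:

    #include <stdint.h>

    #define IVPOSITION_OFF 8   /* IVPosition offset in BAR0 */

    static int32_t ivshmem_wait_for_id(volatile uint32_t *bar0)
    {
        int32_t id;

        /* -1 means the device is not ready yet (no ID from the server);
         * a real driver would not busy-wait like this. */
        do {
            id = (int32_t)bar0[IVPOSITION_OFF / 4];
        } while (id < 0);

        return id;
    }
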
> @@ -59,8 +132,8 @@ Doorbell register. The doorbell register is 32-bits, logically divided into
> two 16-bit fields. The high 16-bits are the guest ID to interrupt and the low
> 16-bits are the interrupt vector to trigger. The semantics of the value
> written to the doorbell depends on whether the device is using MSI or a regular
> -pin-based interrupt. In short, MSI uses vectors while regular interrupts set the
> -status register.
> +pin-based interrupt. In short, MSI uses vectors while regular interrupts set
> +the status register.
>
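
So, for instance, raising vector 0 in the peer with ID 3 means writing
(3 << 16) | 0 to the Doorbell. As a sketch, with bar0 again being a mapping of
BAR0 and the offset taken from the registers enum of this spec:

    #include <stdint.h>

    #define DOORBELL_OFF 12   /* Doorbell offset in BAR0 */

    /* notify peer 'dest_id' on interrupt vector 'vector' */
    static void ivshmem_notify(volatile uint32_t *bar0,
                               uint16_t dest_id, uint16_t vector)
    {
        bar0[DOORBELL_OFF / 4] = ((uint32_t)dest_id << 16) | vector;
    }
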
> Regular Interrupts
>
> @@ -71,7 +144,7 @@ interrupt in the destination guest.
>
> Message Signalled Interrupts
>
> -A ivshmem device may support multiple MSI vectors. If so, the lower 16-bits
> +An ivshmem device may support multiple MSI vectors. If so, the lower 16-bits
> written to the Doorbell register must be between 0 and the maximum number of
> vectors the guest supports. The lower 16 bits written to the doorbell is the
> MSI vector that will be raised in the destination guest. The number of MSI
> @@ -83,14 +156,3 @@ interrupt itself should be communicated via the shared memory region. Devices
> supporting multiple MSI vectors can use different vectors to indicate different
> events have occurred. The semantics of interrupt vectors are left to the
> user's discretion.
> -
> -
> -Usage in the Guest
> -------------------
> -
> -The shared memory device is intended to be used with the provided UIO driver.
> -Very little configuration is needed. The guest should map BAR0 to access the
> -registers (an array of 32-bit ints allows simple writing) and map BAR2 to
> -access the shared memory region itself. The size of the shared memory region
> -is specified when the guest (or shared memory server) is started. A guest may
> -map the whole shared memory region or only part of it.
>
--
Claudio Fontana
Server Virtualization Architect
Huawei Technologies Duesseldorf GmbH
Riesstraße 25 - 80992 München
office: +49 89 158834 4135
mobile: +49 15253060158