From: Avi Kivity <avi@redhat.com>
To: Cam Macdonell <cam@cs.ualberta.ca>
Cc: qemu-devel@nongnu.org, kvm@vger.kernel.org
Subject: [Qemu-devel] Re: [PATCH v3 0/2] Inter-VM shared memory PCI device
Date: Thu, 25 Mar 2010 11:04:54 +0200
Message-ID: <4BAB2736.7020202@redhat.com>
In-Reply-To: <1269497310-21858-1-git-send-email-cam@cs.ualberta.ca>
On 03/25/2010 08:08 AM, Cam Macdonell wrote:
> Support an inter-vm shared memory device that maps a shared-memory object
> as a PCI device in the guest. This patch also supports interrupts between
> guests by communicating over a unix domain socket. This patch applies to the
> qemu-kvm repository.
>
> Changes in this version are using the qdev format and optional use of MSI and
> ioeventfd/irqfd.
>
> The non-interrupt version is supported by passing the shm parameter
>
> -device ivshmem,size=<size in MB>,[shm=<shm_name>]
>
> which will simply map the shm object into a BAR.
>
> Interrupts are supported between multiple VMs by using a shared memory server
> that each VM connects to through a socket character device
>
> -device ivshmem,size=<size in MB>[,chardev=<chardev name>][,irqfd=on]
> [,msi=on][,nvectors=n]
> -chardev socket,path=<path>,id=<chardev name>
>
> The server passes file descriptors for the shared memory object and eventfds (our
> interrupt mechanism) to the respective qemu instances.
>
> When using interrupts, VMs communicate with a shared memory server that passes
> the shared memory object file descriptor using SCM_RIGHTS. The server assigns
> each VM an ID number and sends this ID number to the Qemu process along with a
> series of eventfd file descriptors, one per guest using the shared memory
> server. These eventfds will be used to send interrupts between guests. Each
> guest listens on the eventfd corresponding to their ID and may use the others
> for sending interrupts to other guests.
>
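To illustrate the descriptor passing described in the quoted text, here is a minimal sketch of sending and receiving one file descriptor with SCM_RIGHTS over a unix domain socket. The helper names send_fd and recv_fd are hypothetical, and the 4-byte integer payload (standing in for a guest ID) is an assumption; the patch's actual wire format is not stated in this message.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Control buffer sized and aligned for exactly one file descriptor. */
union fd_ctrl {
    char buf[CMSG_SPACE(sizeof(int))];
    struct cmsghdr align;
};

/* Send fd plus an integer payload (assumed here to be a guest ID).
 * Returns 0 on success, -1 on error. */
static int send_fd(int sock, int fd, int payload)
{
    struct iovec iov = { .iov_base = &payload, .iov_len = sizeof(payload) };
    union fd_ctrl ctrl;
    struct msghdr msg;
    struct cmsghdr *cmsg;

    memset(&ctrl, 0, sizeof(ctrl));
    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctrl.buf;
    msg.msg_controllen = sizeof(ctrl.buf);

    /* The descriptor travels as ancillary data, not in the byte stream. */
    cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

    return sendmsg(sock, &msg, 0) == (ssize_t)sizeof(payload) ? 0 : -1;
}

/* Receive one fd and its payload. Returns the new descriptor, or -1. */
static int recv_fd(int sock, int *payload)
{
    struct iovec iov;
    union fd_ctrl ctrl;
    struct msghdr msg;
    struct cmsghdr *cmsg;
    int fd;

    assert(payload != NULL);
    iov.iov_base = payload;
    iov.iov_len = sizeof(*payload);
    memset(&ctrl, 0, sizeof(ctrl));
    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctrl.buf;
    msg.msg_controllen = sizeof(ctrl.buf);

    if (recvmsg(sock, &msg, 0) != (ssize_t)sizeof(*payload))
        return -1;

    cmsg = CMSG_FIRSTHDR(&msg);
    if (cmsg == NULL || cmsg->cmsg_level != SOL_SOCKET ||
        cmsg->cmsg_type != SCM_RIGHTS)
        return -1;

    memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
    return fd;
}
```

The received descriptor is a new number in the receiving process that refers to the same open file, which is what lets each qemu instance map the same shared memory object and signal the same eventfds.
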
Please put the spec somewhere publicly accessible with a permanent URL.
I suggest a new qemu.git directory specs/. It's more important than the
code IMO.
> enum ivshmem_registers {
> IntrMask = 0,
> IntrStatus = 4,
> IVPosition = 8,
> Doorbell = 12
> };
>
> The first two registers are the interrupt mask and status registers. Mask and
> status are only used with pin-based interrupts. They are unused with MSI
> interrupts. The IVPosition register is read-only and reports the guest's ID
> number. Interrupts are triggered when a message is received on the guest's
> eventfd from another VM. To trigger an event, a guest must write to another
> guest's Doorbell. The "Doorbells" begin at offset 12. A particular guest's
> doorbell offset in the MMIO region is equal to
>
> guest_id * 32 + Doorbell
>
> The doorbell register for each guest is 32 bits wide. The doorbell-per-guest
> design was motivated by use with ioeventfd.
>
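For reference, the quoted offset formula can be written out as a small helper. This is only a sketch that takes the stride of 32 literally as written in the quoted text; the function name doorbell_offset is hypothetical, and the enum is copied from the quote.

```c
#include <assert.h>
#include <stdint.h>

/* Register offsets as given in the quoted description. */
enum ivshmem_registers {
    IntrMask = 0,
    IntrStatus = 4,
    IVPosition = 8,
    Doorbell = 12
};

/* MMIO offset of a given guest's doorbell: guest_id * 32 + Doorbell. */
static uint32_t doorbell_offset(uint32_t guest_id)
{
    /* Guard against wrapping the 32-bit offset. */
    assert(guest_id <= (UINT32_MAX - Doorbell) / 32);
    return guest_id * 32 + Doorbell;
}
```

With this reading, guest 0 rings at offset 12 and guest 1 at offset 44.
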
You can also use a single doorbell register with ioeventfd, as it can
match against the data written. If you go this route, you'd have two
doorbells, one where you write a guest ID to send an interrupt to that
guest, and one where any write generates a multicast.
Possible later extensions:
- multiple doorbells that trigger different vectors
- multicast doorbells
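
A rough model of the two-doorbell layout suggested above, written as plain C with counters standing in for eventfds. The register offsets DB_UNICAST and DB_MULTICAST, the struct, and the choice to skip the sender on a multicast are all assumptions made for illustration; none of them come from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_GUESTS 8

/* Hypothetical register offsets for the two doorbells. */
enum { DB_UNICAST = 12, DB_MULTICAST = 16 };

struct ivshmem_model {
    uint32_t nguests;
    uint32_t pending[MAX_GUESTS]; /* stand-in for one eventfd per guest */
};

/* Model of an MMIO write by guest `sender`. Returns 0 if the write
 * signalled someone, -1 if it was ignored. */
static int doorbell_write(struct ivshmem_model *m, uint32_t sender,
                          uint32_t offset, uint32_t value)
{
    assert(m != NULL && m->nguests <= MAX_GUESTS);

    switch (offset) {
    case DB_UNICAST:
        /* The value written selects the destination guest. */
        if (value >= m->nguests)
            return -1;
        m->pending[value]++;
        return 0;
    case DB_MULTICAST:
        /* Any write signals every guest except the sender (assumption). */
        for (uint32_t i = 0; i < m->nguests; i++) {
            if (i != sender)
                m->pending[i]++;
        }
        return 0;
    default:
        return -1;
    }
}
```

In a real implementation the unicast case would correspond to one ioeventfd registered per guest ID with data matching on the same address, so the register layout stays fixed no matter how many guests exist.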
> The semantics of the value written to the doorbell depend on whether the
> device is using MSI or a regular pin-based interrupt.
>
I recommend against making the semantics interrupt-style dependent. It
means the application needs to know whether MSI is in use or not, while
it is generally the OS that is in control of that.
> Regular Interrupts
> ------------------
>
> If regular interrupts are used (due to either a guest not supporting MSI or the
> user specifying not to use them on the command-line) then the value written to
> a guest's doorbell is what the guest's status register will be set to.
>
> A status of (2^32 - 1) indicates that a new guest has joined. Guests
> should not send a message of this value for any other reason.
>
> Message Signalled Interrupts
> ----------------------------
>
> The important thing to remember with MSI is that it is only a signal, no
> status is set (since MSI interrupts are not shared). All information other
> than the interrupt itself should be communicated via the shared memory region.
> MSI is on by default. It can be turned off by passing msi=off as a device
> parameter.
>
> If the device uses MSI then the value written to the doorbell is the MSI vector
> that will be raised. Vector 0 is used to notify that a new guest has joined.
> Vector 0 cannot be triggered by another guest since a value of 0 does not
> trigger an eventfd.
>
Ah, looks like we approached the vector/guest matrix from different
directions.
> ioeventfd/irqfd
> ---------------
>
> ioeventfd/irqfd is turned on by passing irqfd=on as a device parameter (it is
> off by default). When using ioeventfd/irqfd the only interrupt value that can
> be passed to another guest is 1, regardless of the value written to a guest's
> Doorbell.
>
ioeventfd/irqfd are an implementation detail. The spec should not
depend on them. It needs to be written as if qemu and kvm do not exist.
Again, I recommend Rusty's virtio-pci for inspiration.
Applications should see exactly the same thing whether ioeventfd is
enabled or not.
> Sample programs, init scripts and the shared memory server are available in a
> git repo here:
>
> www.gitorious.org/nahanni
>
> Cam Macdonell (2):
> Support adding a file to qemu's ram allocation
> Inter-VM shared memory PCI device
>
Do you plan to maintain the server indefinitely in that repository? If
not, we can put it in qemu.git, perhaps under contrib/.
--
error compiling committee.c: too many arguments to function