From: Avi Kivity <avi@redhat.com>
To: Cam Macdonell <cam@cs.ualberta.ca>
Cc: qemu-devel@nongnu.org, kvm@vger.kernel.org
Subject: [Qemu-devel] Re: [PATCH v3 0/2] Inter-VM shared memory PCI device
Date: Thu, 25 Mar 2010 11:04:54 +0200
Message-ID: <4BAB2736.7020202@redhat.com>
In-Reply-To: <1269497310-21858-1-git-send-email-cam@cs.ualberta.ca>

On 03/25/2010 08:08 AM, Cam Macdonell wrote:
> Support an inter-vm shared memory device that maps a shared-memory object
> as a PCI device in the guest.  This patch also supports interrupts between
> guests by communicating over a unix domain socket.  This patch applies to the
> qemu-kvm repository.
>
> Changes in this version are using the qdev format and optional use of MSI and
> ioeventfd/irqfd.
>
> The non-interrupt version is supported by passing the shm parameter
>
>      -device ivshmem,size=<size in MB>[,shm=<shm_name>]
>
> which will simply map the shm object into a BAR.
>
> Interrupts are supported between multiple VMs by using a shared memory server
> that each VM connects to through a socket character device
>
>      -device ivshmem,size=<size in MB>[,chardev=<chardev name>][,irqfd=on]
>              [,msi=on][,nvectors=n]
>      -chardev socket,path=<path>,id=<chardev name>
>
> The server passes file descriptors for the shared memory object and eventfds (our
> interrupt mechanism) to the respective qemu instances.
>
> When using interrupts, VMs communicate with a shared memory server that passes
> the shared memory object file descriptor using SCM_RIGHTS.  The server assigns
> each VM an ID number and sends this ID number to the Qemu process along with a
> series of eventfd file descriptors, one per guest using the shared memory
> server.  These eventfds will be used to send interrupts between guests.  Each
> guest listens on the eventfd corresponding to its ID and may use the others
> for sending interrupts to other guests.
>    
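
For readers unfamiliar with SCM_RIGHTS, here is a minimal sketch of how a
file descriptor travels over a unix domain socket.  The helper names
send_fd() and recv_fd() and the single dummy payload byte are assumptions
made for illustration; the actual message layout used by the shared memory
server is not described in this mail.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <unistd.h>

/* Control buffer large enough for exactly one fd, suitably aligned. */
union fd_ctrl {
    struct cmsghdr align;
    char buf[CMSG_SPACE(sizeof(int))];
};

/* Hypothetical helper: send one fd over a connected unix domain socket.
 * Returns 0 on success, -1 on error. */
static int send_fd(int sock, int fd)
{
    char payload = 0;  /* at least one data byte must accompany the fd */
    struct iovec iov = { .iov_base = &payload, .iov_len = 1 };
    union fd_ctrl ctrl;
    struct msghdr msg;
    struct cmsghdr *cmsg;

    assert(fd >= 0);
    memset(&ctrl, 0, sizeof(ctrl));
    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctrl.buf;
    msg.msg_controllen = sizeof(ctrl.buf);

    cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

    return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Hypothetical helper: receive one fd sent by send_fd().
 * Returns the new descriptor, or -1 on error. */
static int recv_fd(int sock)
{
    char payload;
    struct iovec iov = { .iov_base = &payload, .iov_len = 1 };
    union fd_ctrl ctrl;
    struct msghdr msg;
    struct cmsghdr *cmsg;
    int fd;

    memset(&ctrl, 0, sizeof(ctrl));
    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctrl.buf;
    msg.msg_controllen = sizeof(ctrl.buf);

    if (recvmsg(sock, &msg, 0) <= 0)
        return -1;

    cmsg = CMSG_FIRSTHDR(&msg);
    if (!cmsg || cmsg->cmsg_level != SOL_SOCKET ||
        cmsg->cmsg_type != SCM_RIGHTS ||
        cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
        return -1;

    memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
    return fd;
}
```

The received descriptor is a new number in the receiving process that
refers to the same open file, which is what lets each qemu instance map
the same shared memory object and signal the same eventfds.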

Please put the spec somewhere publicly accessible with a permanent URL.  
I suggest a new qemu.git directory specs/.  It's more important than the 
code IMO.

> enum ivshmem_registers {
>      IntrMask = 0,
>      IntrStatus = 4,
>      IVPosition = 8,
>      Doorbell = 12
> };
>
> The first two registers are the interrupt mask and status registers.  Mask and
> status are only used with pin-based interrupts.  They are unused with MSI
> interrupts.  The IVPosition register is read-only and reports the guest's ID
> number.  Interrupts are triggered when a message is received on the guest's
> eventfd from another VM.  To trigger an event, a guest must write to another
> guest's Doorbell.  The "Doorbells" begin at offset 12.  A particular guest's
> doorbell offset in the MMIO region is equal to
>
> guest_id * 32 + Doorbell
>
> The doorbell register for each guest is 32 bits wide.  The doorbell-per-guest
> design was motivated by ioeventfd.
>    
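
To make the quoted layout concrete, here is a small sketch of the offset
calculation exactly as the formula above states it.  The helper names
doorbell_offset() and ring_doorbell(), and the idea of passing a pointer to
the mapped MMIO region, are assumptions for illustration and are not taken
from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Register offsets as given in the quoted description. */
enum ivshmem_registers {
    IntrMask = 0,
    IntrStatus = 4,
    IVPosition = 8,
    Doorbell = 12
};

/* Offset of a guest's doorbell in the MMIO region, following the quoted
 * formula literally: guest_id * 32 + Doorbell. */
static inline uint32_t doorbell_offset(uint32_t guest_id)
{
    return guest_id * 32 + Doorbell;
}

/* Hypothetical helper: write a 32-bit value to a peer's doorbell through
 * a pointer to the start of the mapped MMIO region. */
static inline void ring_doorbell(volatile uint8_t *mmio, uint32_t guest_id,
                                 uint32_t value)
{
    volatile uint32_t *db;

    assert(mmio);
    db = (volatile uint32_t *)(mmio + doorbell_offset(guest_id));
    *db = value;
}
```

With this layout guest 0's doorbell sits at offset 12, guest 1's at 44,
guest 2's at 76, and so on.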

You can also use a single doorbell register with ioeventfd, as it can 
match against the data written.  If you go this route, you'd have two 
doorbells, one where you write a guest ID to send an interrupt to that 
guest, and one where any write generates a multicast.

Possible later extensions:
- multiple doorbells that trigger different vectors
- multicast doorbells

> The semantics of the value written to the doorbell depends on whether the
> device is using MSI or a regular pin-based interrupt.
>    

I recommend against making the semantics interrupt-style dependent.  It 
means the application needs to know whether MSI is in use or not, while 
it is generally the OS that is in control of that.

> Regular Interrupts
> ------------------
>
> If regular interrupts are used (due to either a guest not supporting MSI or the
> user specifying not to use them on the command-line) then the value written to
> a guest's doorbell is what the guest's status register will be set to.
>
> A status of (2^32 - 1) indicates that a new guest has joined.  Guests
> should not send a message of this value for any other reason.
>
> Message Signalled Interrupts
> ----------------------------
>
> The important thing to remember with MSI is that it is only a signal, no
> status is set (since MSI interrupts are not shared).  All information other
> than the interrupt itself should be communicated via the shared memory region.
> MSI is on by default.  It can be turned off by passing msi=off as a device
> parameter.
>    

> If the device uses MSI then the value written to the doorbell is the MSI vector
> that will be raised.  Vector 0 is used to notify that a new guest has joined.
> Vector 0 cannot be triggered by another guest since a value of 0 does not
> trigger an eventfd.
>    

Ah, looks like we approached the vector/guest matrix from different 
directions.

> ioeventfd/irqfd
> ---------------
>
> ioeventfd/irqfd is turned on by passing irqfd=on as a device parameter (it is
> off by default).  When using ioeventfd/irqfd the only interrupt value that can
> be passed to another guest is 1, regardless of the value written to a guest's
> Doorbell.
>    

ioeventfd/irqfd are an implementation detail.  The spec should not
depend on them.  It needs to be written as if qemu and kvm do not exist.
Again, I recommend Rusty's virtio-pci for inspiration.

Applications should see exactly the same thing whether ioeventfd is 
enabled or not.

> Sample programs, init scripts and the shared memory server are available in a
> git repo here:
>
>      www.gitorious.org/nahanni
>
> Cam Macdonell (2):
>    Support adding a file to qemu's ram allocation
>    Inter-VM shared memory PCI device
>    

Do you plan to maintain the server indefinitely in that repository?  If
not, we can put it in qemu.git, perhaps under contrib/.

-- 
error compiling committee.c: too many arguments to function
