From: Avi Kivity <avi@redhat.com>
To: Cam Macdonell <cam@cs.ualberta.ca>
Cc: kvm@vger.kernel.org, qemu-devel@nongnu.org
Subject: Re: [PATCH v3 0/2] Inter-VM shared memory PCI device
Date: Thu, 25 Mar 2010 11:04:54 +0200
Message-ID: <4BAB2736.7020202@redhat.com>
In-Reply-To: <1269497310-21858-1-git-send-email-cam@cs.ualberta.ca>

On 03/25/2010 08:08 AM, Cam Macdonell wrote:
> Support an inter-vm shared memory device that maps a shared-memory object
> as a PCI device in the guest.  This patch also supports interrupts between
> guests by communicating over a Unix domain socket.  This patch applies to the
> qemu-kvm repository.
>
> Changes in this version: use of the qdev format, and optional use of MSI and
> ioeventfd/irqfd.
>
> The non-interrupt version is supported by passing the shm parameter
>
>      -device ivshmem,size=<size in MB>[,shm=<shm_name>]
>
> which will simply map the shm object into a BAR.
>
> Interrupts are supported between multiple VMs by using a shared memory server
> that each VM connects to through a socket character device
>
>      -device ivshmem,size=<size in MB>[,chardev=<chardev name>][,irqfd=on]
>              [,msi=on][,nvectors=n]
>      -chardev socket,path=<path>,id=<chardev name>
>
> The server passes file descriptors for the shared memory object and eventfds (our
> interrupt mechanism) to the respective qemu instances.
>
> When using interrupts, VMs communicate with a shared memory server that passes
> the shared memory object file descriptor using SCM_RIGHTS.  The server assigns
> each VM an ID number and sends this ID number to the Qemu process along with a
> series of eventfd file descriptors, one per guest using the shared memory
> server.  These eventfds will be used to send interrupts between guests.  Each
> guest listens on the eventfd corresponding to their ID and may use the others
> for sending interrupts to other guests.
>    
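
For readers unfamiliar with SCM_RIGHTS, the fd-passing step described above
can be sketched roughly as follows.  This is only an illustration, not the
actual server protocol: the helper names send_fd() and recv_fd() are
hypothetical, and it passes a single descriptor over an already connected
Unix domain socket, leaving out the guest ID and the per-guest eventfds.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical helper: send one file descriptor over a connected
 * Unix domain socket.  Returns 0 on success, -1 on error. */
int send_fd(int sock, int fd)
{
    char payload = 0;   /* one dummy data byte carries the ancillary data */
    struct iovec iov = { .iov_base = &payload, .iov_len = 1 };
    union {
        char buf[CMSG_SPACE(sizeof(int))];
        struct cmsghdr align;   /* forces correct alignment of buf */
    } ctrl;
    struct msghdr msg;
    struct cmsghdr *cmsg;

    assert(fd >= 0);
    memset(&msg, 0, sizeof(msg));
    memset(&ctrl, 0, sizeof(ctrl));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctrl.buf;
    msg.msg_controllen = sizeof(ctrl.buf);

    /* The descriptor travels in a control message of type SCM_RIGHTS. */
    cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

    return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Hypothetical helper: receive one file descriptor sent by send_fd().
 * Returns the new descriptor, or -1 on error. */
int recv_fd(int sock)
{
    char payload;
    struct iovec iov = { .iov_base = &payload, .iov_len = 1 };
    union {
        char buf[CMSG_SPACE(sizeof(int))];
        struct cmsghdr align;
    } ctrl;
    struct msghdr msg;
    struct cmsghdr *cmsg;
    int fd;

    memset(&msg, 0, sizeof(msg));
    memset(&ctrl, 0, sizeof(ctrl));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctrl.buf;
    msg.msg_controllen = sizeof(ctrl.buf);

    if (recvmsg(sock, &msg, 0) <= 0)
        return -1;

    /* Accept only a control message that carries exactly one descriptor. */
    cmsg = CMSG_FIRSTHDR(&msg);
    if (cmsg == NULL || cmsg->cmsg_level != SOL_SOCKET ||
        cmsg->cmsg_type != SCM_RIGHTS ||
        cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
        return -1;

    memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
    return fd;
}
```

The received descriptor is a new number in the receiving process that refers
to the same open file, which is what lets every qemu instance map the same
shared memory object and signal the same eventfds.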

Please put the spec somewhere publicly accessible with a permanent URL.  
I suggest a new qemu.git directory specs/.  It's more important than the 
code IMO.

> enum ivshmem_registers {
>      IntrMask = 0,
>      IntrStatus = 4,
>      IVPosition = 8,
>      Doorbell = 12
> };
>
> The first two registers are the interrupt mask and status registers.  Mask and
> status are only used with pin-based interrupts.  They are unused with MSI
> interrupts.  The IVPosition register is read-only and reports the guest's ID
> number.  Interrupts are triggered when a message is received on the guest's
> eventfd from another VM.  To trigger an event, a guest must write to another
> guest's Doorbell.  The "Doorbells" begin at offset 12.  A particular guest's
> doorbell offset in the MMIO region is equal to
>
> guest_id * 32 + Doorbell
>
> The doorbell register for each guest is 32 bits wide.  The doorbell-per-guest
> design was motivated by use with ioeventfd.
>    
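
As a sanity check on the layout as proposed, the offset arithmetic can be
written out as below.  This is a sketch under assumptions: it takes the
formula literally (a multiplier of 32 applied to the guest ID), and the names
DOORBELL_STRIDE, doorbell_offset() and ring_doorbell() are made up for
illustration.  The mmio pointer stands for wherever the guest has mapped the
register BAR.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum ivshmem_registers {
    IntrMask = 0,
    IntrStatus = 4,
    IVPosition = 8,
    Doorbell = 12
};

/* Multiplier taken literally from the formula guest_id * 32 + Doorbell. */
#define DOORBELL_STRIDE 32u

/* Byte offset of a given guest's doorbell within the MMIO region. */
static uint32_t doorbell_offset(uint32_t guest_id)
{
    return guest_id * DOORBELL_STRIDE + Doorbell;
}

/* Hypothetical helper: write a 32-bit value to another guest's doorbell.
 * mmio must point at the start of the mapped, 4-byte-aligned register region. */
static void ring_doorbell(volatile uint8_t *mmio, uint32_t guest_id,
                          uint32_t value)
{
    assert(mmio != NULL);
    *(volatile uint32_t *)(mmio + doorbell_offset(guest_id)) = value;
}
```

With these definitions guest 0's doorbell sits at offset 12 and guest 2's at
offset 76, so the MMIO region grows with the number of guests.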

You can also use a single doorbell register with ioeventfd, as it can 
match against the data written.  If you go this route, you'd have two 
doorbells, one where you write a guest ID to send an interrupt to that 
guest, and one where any write generates a multicast.
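
To make the two-doorbell idea concrete, here is a small software model of how
the device could dispatch writes.  Everything specific in it is an
assumption for illustration only: the register offsets, the MAX_GUESTS limit,
the names, and the choice that a multicast skips the sender.  It models
delivery with per-guest counters and does not touch ioeventfd itself.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_GUESTS 8            /* arbitrary limit for this model */

/* Hypothetical register offsets for the two doorbells. */
enum {
    UNICAST_DOORBELL = 12,      /* value written = destination guest ID */
    MULTICAST_DOORBELL = 16     /* any write signals every other guest */
};

struct doorbell_model {
    uint32_t nguests;               /* guests currently attached */
    uint32_t pending[MAX_GUESTS];   /* interrupts delivered to each guest */
};

/* Handle a 32-bit write by guest `sender` to register `offset`.
 * Returns the number of guests that were signalled. */
static int doorbell_write(struct doorbell_model *m, uint32_t sender,
                          uint32_t offset, uint32_t value)
{
    int signalled = 0;

    assert(m->nguests <= MAX_GUESTS);
    switch (offset) {
    case UNICAST_DOORBELL:
        /* The written data selects the destination; unknown IDs are ignored. */
        if (value < m->nguests) {
            m->pending[value]++;
            signalled = 1;
        }
        break;
    case MULTICAST_DOORBELL:
        /* The written value is ignored; everyone but the sender is signalled. */
        for (uint32_t id = 0; id < m->nguests; id++) {
            if (id != sender) {
                m->pending[id]++;
                signalled++;
            }
        }
        break;
    default:
        break;
    }
    return signalled;
}
```

The register footprint stays fixed no matter how many guests attach, which is
the main difference from the doorbell-per-guest layout.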

Possible later extensions:
- multiple doorbells that trigger different vectors
- multicast doorbells

> The semantics of the value written to the doorbell depends on whether the
> device is using MSI or a regular pin-based interrupt.
>    

I recommend against making the semantics interrupt-style dependent.  It 
means the application needs to know whether MSI is in use or not, while 
it is generally the OS that is in control of that.

> Regular Interrupts
> ------------------
>
> If regular interrupts are used (due to either a guest not supporting MSI or the
> user specifying not to use them on the command-line) then the value written to
> a guest's doorbell is what the guest's status register will be set to.
>
> A status of (2^32 - 1) indicates that a new guest has joined.  Guests
> should not send a message of this value for any other reason.
>
> Message Signalled Interrupts
> ----------------------------
>
> The important thing to remember with MSI is that it is only a signal, no
> status is set (since MSI interrupts are not shared).  All information other
> than the interrupt itself should be communicated via the shared memory region.
> MSI is on by default.  It can be turned off by passing msi=off as a device parameter.
>    

> If the device uses MSI then the value written to the doorbell is the MSI vector
> that will be raised.  Vector 0 is used to notify that a new guest has joined.
> Vector 0 cannot be triggered by another guest since a value of 0 does not
> trigger an eventfd.
>    

Ah, looks like we approached the vector/guest matrix from different 
directions.

> ioeventfd/irqfd
> ---------------
>
> ioeventfd/irqfd is turned on by passing irqfd=on as a device parameter (it is
> off by default).  When using ioeventfd/irqfd, the only interrupt value that can
> be passed to another guest is 1, regardless of the value written to a guest's
> Doorbell.
>    

ioeventfd and irqfd are implementation details.  The spec should not 
depend on them.  It needs to be written as if qemu and kvm do not exist.  
Again, I recommend Rusty's virtio-pci for inspiration.

Applications should see exactly the same thing whether ioeventfd is 
enabled or not.

> Sample programs, init scripts and the shared memory server are available in a
> git repo here:
>
>      www.gitorious.org/nahanni
>
> Cam Macdonell (2):
>    Support adding a file to qemu's ram allocation
>    Inter-VM shared memory PCI device
>    

Do you plan to maintain the server indefinitely in that repository?  If 
not, we can put it in qemu.git, perhaps under contrib/.

-- 
error compiling committee.c: too many arguments to function

