From: Cam Macdonell <cam@cs.ualberta.ca>
To: Avi Kivity <avi@redhat.com>
Cc: qemu-devel@nongnu.org, kvm@vger.kernel.org
Subject: [Qemu-devel] Re: [PATCH v3 0/2] Inter-VM shared memory PCI device
Date: Thu, 25 Mar 2010 11:35:39 -0600
Message-ID: <8286e4ee1003251035o75fed405j45b60d496afa66b5@mail.gmail.com>
In-Reply-To: <4BAB9718.3030808@redhat.com>
On Thu, Mar 25, 2010 at 11:02 AM, Avi Kivity <avi@redhat.com> wrote:
> On 03/25/2010 06:50 PM, Cam Macdonell wrote:
>>
>>> Please put the spec somewhere publicly accessible with a permanent URL.
>>> I
>>> suggest a new qemu.git directory specs/. It's more important than the
>>> code
>>> IMO.
>>>
>>
>> Sorry to be pedantic, do you want a URL or the spec as part of a patch
>> that adds it as a file in qemu.git/docs/specs/
>>
>
> I leave it up to you. If you are up to hosting it independently, then just
> post a URL as part of the patch. Otherwise, I'm sure qemu.git will be more
> than happy to be the official repository for the memory sharing device
> specification. In that case, make the spec the first patch in the
> series.
OK, I'll send it as part of the series; that way people can comment
inline easily.
>
>>> Possible later extensions:
>>> - multiple doorbells that trigger different vectors
>>> - multicast doorbells
>>>
>>
>> Since the doorbells are exposed, the multicast could be done by the
>> driver. If multicast is handled by qemu, then we have different
>> behaviour when using ioeventfd/irqfd since only one eventfd can be
>> triggered by a write.
>>
>
> Multicast by the driver would require one exit per guest signalled.
> Multicast by the shared memory server needs one exit to signal an eventfd,
> then the shared memory server signals the irqfds of all members of the
> multicast group.
>
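To make the exit counts concrete, here is a minimal sketch of the
server-side fan-out described above, assuming the doorbell and each
guest's interrupt are backed by plain eventfds. `multicast_once` and its
parameters are hypothetical names for illustration; they are not part of
the patch or of any existing shared memory server.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

/*
 * Wait for one event on the doorbell eventfd (the single guest exit),
 * then signal the eventfd of every member of the multicast group.
 * Returns the number of members signalled, or -1 if the read fails.
 */
static int multicast_once(int doorbell_fd, const int *member_fds,
                          size_t n_members)
{
    uint64_t count;
    int signalled = 0;

    assert(member_fds != NULL || n_members == 0);

    /* Blocks until the sending guest has rung the doorbell. */
    if (read(doorbell_fd, &count, sizeof(count)) != (ssize_t)sizeof(count))
        return -1;

    for (size_t i = 0; i < n_members; i++) {
        uint64_t one = 1;

        /* In the real device this fd would be the member's irqfd. */
        if (write(member_fds[i], &one, sizeof(one)) == (ssize_t)sizeof(one))
            signalled++;
    }
    return signalled;
}
```

With driver-side multicast the loop would instead run in the sending
guest, with one doorbell write (and so one exit) per member.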
>>>> The semantics of the value written to the doorbell depend on whether
>>>> the
>>>> device is using MSI or a regular pin-based interrupt.
>>>>
>>>>
>>>
>>> I recommend against making the semantics interrupt-style dependent. It
>>> means the application needs to know whether MSI is in use or not, while
>>> it
>>> is generally the OS that is in control of that.
>>>
>>
>> It is basically the use of the status register that is the difference.
>> The application view of what is happening doesn't need to change,
>> especially with UIO: write to doorbells, block on read until interrupt
>> arrives. In the MSI case I could set the status register to the
>> vector that is received and then the two would be equivalent from the view
>> of the application. But, if future MSI support in UIO allows MSI
>> information (such as vector number) to be accessible in userspace,
>> then applications would become MSI dependent anyway.
>>
>
> Ah, I see. You adjusted for the different behaviours in the driver.
>
> Still I recommend dropping the status register: this allows single-msi and
> PIRQ to behave the same way. Also it is racy, if two guests signal a third,
> they will overwrite each other's status.
With shared pin-based interrupts (PIRQ) and no status register, how does
a device know it generated the interrupt?
>
>>> ioeventfd/irqfd are an implementation detail. The spec should not depend
>>> on
>>> it. It needs to be written as if qemu and kvm do not exist. Again, I
>>> recommend Rusty's virtio-pci for inspiration.
>>>
>>> Applications should see exactly the same thing whether ioeventfd is
>>> enabled
>>> or not.
>>>
>>
>> The challenge I recently encountered with this is one line in the
>> eventfd implementation
>>
>> from kvm/virt/kvm/eventfd.c
>>
>> /* MMIO/PIO writes trigger an event if the addr/val match */
>> static int
>> ioeventfd_write(struct kvm_io_device *this, gpa_t addr, int len,
>>                 const void *val)
>> {
>>         struct _ioeventfd *p = to_ioeventfd(this);
>>
>>         if (!ioeventfd_in_range(p, addr, len, val))
>>                 return -EOPNOTSUPP;
>>
>>         eventfd_signal(p->eventfd, 1);
>>         return 0;
>> }
>>
>> IIUC, no matter what value a guest writes to an ioeventfd, the eventfd
>> is always signalled with a value of 1. So ioeventfds work differently
>> from eventfds. Can we add a "multivalue" flag to ioeventfds so that the
>> value the guest writes is what gets written to the eventfd?
>>
>
> Eventfd values are a counter, not a register. A read() on the other side
> returns the sum of all write()s (or eventfd_signal()s). In the context of
> irqfd it just means the number of interrupts we coalesced.
>
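The counter behaviour is easy to see from userspace. Below is a small
sketch using the standard Linux eventfd(2) interface; the helper names
(`signal_eventfd`, `drain_eventfd`, `coalesce_demo`) are made up for this
illustration. Three writes are observed by the reader as one sum, so the
individual values are not recoverable.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Add `value` to the eventfd counter; returns 0 on success, -1 on error. */
static int signal_eventfd(int fd, uint64_t value)
{
    return write(fd, &value, sizeof(value)) == (ssize_t)sizeof(value) ? 0 : -1;
}

/* Read the counter and reset it to zero; returns the sum (0 on error). */
static uint64_t drain_eventfd(int fd)
{
    uint64_t sum = 0;

    if (read(fd, &sum, sizeof(sum)) != (ssize_t)sizeof(sum))
        return 0;
    return sum;
}

/* Writes of 1, 1 and 5 are seen by the reader as a single value. */
static uint64_t coalesce_demo(void)
{
    int fd = eventfd(0, 0);     /* initial counter 0, default flags */
    uint64_t sum;

    assert(fd >= 0);
    signal_eventfd(fd, 1);
    signal_eventfd(fd, 1);
    signal_eventfd(fd, 5);
    sum = drain_eventfd(fd);    /* 1 + 1 + 5 */
    close(fd);
    return sum;
}
```

So even if the guest's value were passed through, two senders writing
vectors 1 and 5 would look the same to the receiver as one sender
writing 6.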
> Multivalue was considered at one time for a different need and rejected.
> Really, to solve the race you need a queue, and that can only be done in
> the shared memory segment using locked instructions.
I had a hunch it was probably considered. That explains why irqfd
doesn't have a datamatch field. I guess supporting multiple MSI
vectors with one doorbell per guest isn't possible if only 1 bit of
information can be communicated.
So, ioeventfd/irqfd restricts MSI to 1 vector between guests. Should
multi-MSI even be supported then in the non-ioeventfd/irqfd case?
Otherwise ioeventfd/irqfd become more than an implementation detail.
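For reference, the kind of queue in the shared memory segment suggested
above could look roughly like the sketch below. It is only an
illustration under stated assumptions, not part of the patch: it uses C11
atomics (which compile to locked instructions on x86), follows the
well-known bounded-queue design with per-slot sequence numbers, and
assumes every guest maps the same `struct notify_queue` layout. All
names and the slot count are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define QUEUE_SLOTS 8u  /* hypothetical capacity; must be a power of two */
static_assert((QUEUE_SLOTS & (QUEUE_SLOTS - 1)) == 0, "power of two");

struct slot {
    atomic_uint seq;    /* says whether the slot is free or filled */
    uint32_t vector;    /* payload, e.g. a vector number or sender ID */
};

struct notify_queue {   /* would live in the shared memory region */
    struct slot slots[QUEUE_SLOTS];
    atomic_uint head;   /* next position to fill */
    atomic_uint tail;   /* next position to drain */
};

static void queue_init(struct notify_queue *q)
{
    for (unsigned i = 0; i < QUEUE_SLOTS; i++)
        atomic_init(&q->slots[i].seq, i);
    atomic_init(&q->head, 0);
    atomic_init(&q->tail, 0);
}

/* Called by any sender before ringing the doorbell; false if full. */
static bool queue_push(struct notify_queue *q, uint32_t vector)
{
    unsigned pos = atomic_load_explicit(&q->head, memory_order_relaxed);
    struct slot *s;

    for (;;) {
        s = &q->slots[pos & (QUEUE_SLOTS - 1)];
        unsigned seq = atomic_load_explicit(&s->seq, memory_order_acquire);
        int diff = (int)(seq - pos);

        if (diff == 0) {
            /* Slot is free: claim it by advancing head. */
            if (atomic_compare_exchange_weak_explicit(&q->head, &pos, pos + 1,
                    memory_order_relaxed, memory_order_relaxed))
                break;
        } else if (diff < 0) {
            return false;       /* queue is full */
        } else {
            pos = atomic_load_explicit(&q->head, memory_order_relaxed);
        }
    }
    s->vector = vector;
    atomic_store_explicit(&s->seq, pos + 1, memory_order_release);
    return true;
}

/* Called by the receiver after an interrupt; false if empty. */
static bool queue_pop(struct notify_queue *q, uint32_t *vector)
{
    unsigned pos = atomic_load_explicit(&q->tail, memory_order_relaxed);
    struct slot *s;

    for (;;) {
        s = &q->slots[pos & (QUEUE_SLOTS - 1)];
        unsigned seq = atomic_load_explicit(&s->seq, memory_order_acquire);
        int diff = (int)(seq - (pos + 1));

        if (diff == 0) {
            /* Slot is filled: claim it by advancing tail. */
            if (atomic_compare_exchange_weak_explicit(&q->tail, &pos, pos + 1,
                    memory_order_relaxed, memory_order_relaxed))
                break;
        } else if (diff < 0) {
            return false;       /* queue is empty */
        } else {
            pos = atomic_load_explicit(&q->tail, memory_order_relaxed);
        }
    }
    *vector = s->vector;
    /* Mark the slot free for the producer one lap ahead. */
    atomic_store_explicit(&s->seq, pos + QUEUE_SLOTS, memory_order_release);
    return true;
}
```

With something like this, the doorbell only has to carry "look at the
queue", which fits the 1 bit that ioeventfd/irqfd can deliver, and two
senders no longer overwrite each other.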
>
> --
> error compiling committee.c: too many arguments to function
>
>